Hi, All!

I'm quite new to Hadoop and a little confused by the framework.
The version of Hadoop I'm using is 0.14.0.

From the "JobConf" Javadoc page
(http://lucene.apache.org/hadoop/api/org/apache/hadoop/mapred/JobConf.html),
I know that the following two methods are deprecated,

 void setObject(String name, Object value)
 Object getObject(String name)

and that a "side map" should be used instead for similar purposes. I don't
quite understand what a side map is; below is my attempt.

To make use of a map, I defined my MapReduce driver class like:

public class KMeansClustering extends JobConf {
  private HashMap<Integer, String> map;

  public KMeansClustering() {
    super();
    this.map = new HashMap<Integer, String>();
  }

  public static void main(String[] args) {
    JobConf conf = new KMeansClustering();
    JobClient client = new JobClient();
    conf.setJarByClass(KMeansClustering.class);
    conf.setJobName("K-Means Clustering - Data Preparation");
    // a few more settings
    JobClient.runJob(conf);
  }
}

The reason I wrote a subclass of JobConf is that I thought the added
HashMap object could be used by the mapper and reducer classes, since
the JobConf object is passed to them.

Below is the mapper I wrote:

public class DataPreparationMapper extends MapReduceBase implements Mapper {
  private KMeansClustering conf;

  @Override
  public void configure(JobConf job) {
    System.out.println(job.toString());
    this.conf = (KMeansClustering)job;
  }

  public void map(WritableComparable key, Writable value,
      OutputCollector output, Reporter reporter) throws IOException {
    // do something
  }
}


I expected the KMeansClustering object (a subclass of JobConf) to be
passed to DataPreparationMapper's configure() method, so that it could be
stored in the private field conf. However, when the program runs, I get a
ClassCastException, meaning that the JobConf object passed in cannot be
cast to KMeansClustering. The error message is:

java.lang.ClassCastException: org.apache.hadoop.mapred.JobConf
        at DataPreparationMapper.configure(DataPreparationMapper.java:22)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:58)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:82)
        at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:58)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:82)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:182)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:131)
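
A possible explanation, sketched in plain Java without Hadoop on the
classpath: the exception message names org.apache.hadoop.mapred.JobConf,
which suggests that the task side receives a plain JobConf rebuilt from
the job's string properties rather than the KMeansClustering instance
created in main(). If that is the case, neither the subclass type nor its
HashMap field would survive. Everything in the sketch is a hypothetical
stand-in and not Hadoop API: SimpleConf plays the role of JobConf,
ConfWithMap the role of KMeansClustering, rebuild() the role of the
framework, and the "kmeans.centroid." prefix is invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class SideDataSketch {

  // Hypothetical stand-in for JobConf: it only carries string properties.
  static class SimpleConf {
    final Properties props = new Properties();
    void set(String name, String value) { props.setProperty(name, value); }
    String get(String name) { return props.getProperty(name); }
  }

  // Mirrors the KMeansClustering idea: a subclass with an extra in-memory field.
  static class ConfWithMap extends SimpleConf {
    final Map<Integer, String> map = new HashMap<Integer, String>();
  }

  // Assumption: the task side builds a fresh base-class configuration from
  // the string properties only, so subclass fields are not carried over.
  static SimpleConf rebuild(SimpleConf submitted) {
    SimpleConf copy = new SimpleConf();
    for (String name : submitted.props.stringPropertyNames()) {
      copy.set(name, submitted.get(name));
    }
    return copy;
  }

  // Workaround sketch: store each map entry as a string property.
  static void putEntries(SimpleConf conf, String prefix,
      Map<Integer, String> entries) {
    for (Map.Entry<Integer, String> e : entries.entrySet()) {
      conf.set(prefix + e.getKey(), e.getValue());
    }
  }

  // Rebuild the map on the task side from the prefixed properties.
  static Map<Integer, String> readEntries(SimpleConf conf, String prefix) {
    Map<Integer, String> result = new HashMap<Integer, String>();
    for (String name : conf.props.stringPropertyNames()) {
      if (name.startsWith(prefix)) {
        Integer key = Integer.valueOf(name.substring(prefix.length()));
        result.put(key, conf.get(name));
      }
    }
    return result;
  }

  public static void main(String[] args) {
    ConfWithMap driverConf = new ConfWithMap();
    driverConf.map.put(0, "1.0,2.0");
    putEntries(driverConf, "kmeans.centroid.", driverConf.map);

    SimpleConf taskConf = rebuild(driverConf);
    // The rebuilt object is not a ConfWithMap, so a cast would fail.
    System.out.println(taskConf instanceof ConfWithMap); // prints false
    // The entries are still reachable through the string properties.
    System.out.println(readEntries(taskConf, "kmeans.centroid."));
  }
}
```

With a real JobConf, the same idea would presumably use its string
set(String, String) and get(String) methods in the driver and in
configure(), instead of casting the JobConf to a subclass.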


Can anyone tell me what is wrong with my approach? Any ideas are appreciated.
Thank you!

-- 
LIN, Shuang
Undergraduate Student
Dept. Computer Science and Technology,
Tsinghua University, Beijing, P.R.China
