Try job.setMapOutputKeyClass(JoinKey.class).

-- Alex K

On Thu, Feb 11, 2010 at 8:25 AM, E. Sammer <[email protected]> wrote:
> It looks like you're using the local job runner which does everything in a
> single thread. In this case, yes, I think the mappers are run sequentially.
> The local job runner is a different code path in Hadoop and is a known
> issue. Have you tried your code in pseudo-distributed mode?
>
> HTH.
>
>
> On 2/11/10 11:14 AM, Martin Häger wrote:
>
>> Hello,
>>
>> We're trying to do a reduce-side join by applying two different
>> mappers (TransformationSessionMapper and TransformationActionMapper)
>> to two different input files and joining them using
>> TransformationReducer. See attached Classify.java for complete source.
>>
>> When running it, we get the following error. JoinKey is our own
>> implementation that is used for performing secondary sort. Somehow
>> TransformationActionMapper gets passed a JoinKey when it expects a
>> LongWritable (TextInputFormat). Is Hadoop actually applying the
>> mappers in sequence?
>>
>> $ hadoop jar /tmp/classify.jar Classify
>> 10/02/11 16:40:16 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
>> 10/02/11 16:40:16 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
>> 10/02/11 16:40:16 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
>> 10/02/11 16:40:16 INFO input.FileInputFormat: Total input paths to process : 1
>> 10/02/11 16:40:16 INFO input.FileInputFormat: Total input paths to process : 1
>> 10/02/11 16:40:16 INFO mapred.JobClient: Running job: job_local_0001
>> 10/02/11 16:40:16 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
>> 10/02/11 16:40:16 INFO input.FileInputFormat: Total input paths to process : 1
>> 10/02/11 16:40:16 INFO input.FileInputFormat: Total input paths to process : 1
>> 10/02/11 16:40:16 INFO mapred.MapTask: io.sort.mb = 100
>> 10/02/11 16:40:16 INFO mapred.MapTask: data buffer = 79691776/99614720
>> 10/02/11 16:40:16 INFO mapred.MapTask: record buffer = 262144/327680
>> 10/02/11 16:40:16 WARN mapred.LocalJobRunner: job_local_0001
>> java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.LongWritable, recieved Classify$JoinKey
>>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:807)
>>         at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:504)
>>         at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>>         at Classify$TransformationActionMapper.map(Classify.java:161)
>>         at Classify$TransformationActionMapper.map(Classify.java:1)
>>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>>         at org.apache.hadoop.mapreduce.lib.input.DelegatingMapper.run(DelegatingMapper.java:51)
>>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:583)
>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:176)
>> 10/02/11 16:40:17 INFO mapred.JobClient: map 0% reduce 0%
>> 10/02/11 16:40:17 INFO mapred.JobClient: Job complete: job_local_0001
>> 10/02/11 16:40:17 INFO mapred.JobClient: Counters: 0
>>
>
>
> --
> Eric Sammer
> [email protected]
> http://esammer.blogspot.com
>
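Why setMapOutputKeyClass should help, as far as I understand it: when no map output key class is set, Hadoop falls back to the job's output key class, which defaults to LongWritable, and the map-side collector rejects any key whose class differs from that. The sketch below is a simplified, Hadoop-free model of that check in plain Java, not Hadoop's actual code. JobSettings, collect, and the empty LongWritable and JoinKey stand-ins are hypothetical names used only for illustration.

```java
import java.io.IOException;

/** Simplified, Hadoop-free model of the map output key type check. */
final class MapOutputKeyCheck {

    // Empty stand-ins for the real Hadoop and user types (illustration only).
    static final class LongWritable {}
    static final class JoinKey {}

    /** Models the two job settings involved in the check. */
    static final class JobSettings {
        private Class<?> outputKeyClass = LongWritable.class; // default when nothing is set
        private Class<?> mapOutputKeyClass;                   // unset by default

        void setOutputKeyClass(Class<?> c) { outputKeyClass = c; }
        void setMapOutputKeyClass(Class<?> c) { mapOutputKeyClass = c; }

        // Falls back to the job output key class when no map output key class is set.
        Class<?> getMapOutputKeyClass() {
            return mapOutputKeyClass != null ? mapOutputKeyClass : outputKeyClass;
        }
    }

    /** Models the collector: the key's class must equal the configured class exactly. */
    static void collect(JobSettings job, Object key) throws IOException {
        Class<?> expected = job.getMapOutputKeyClass();
        if (key.getClass() != expected) {
            throw new IOException("Type mismatch in key from map: expected "
                    + expected.getName() + ", received " + key.getClass().getName());
        }
    }

    public static void main(String[] args) throws IOException {
        JobSettings job = new JobSettings();
        try {
            collect(job, new JoinKey()); // no map output key class set yet
        } catch (IOException e) {
            System.out.println("before fix: " + e.getMessage());
        }
        job.setMapOutputKeyClass(JoinKey.class); // the suggested fix
        collect(job, new JoinKey());             // no exception this time
        System.out.println("after fix: key accepted");
    }
}
```

In the real driver the call would sit with the rest of the job setup. If the map output value type also differs from the job's output value type, job.setMapOutputValueClass(...) is probably needed as well.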
