Please ignore my stupidity. I had an error in my cleanup method: it shut everything down too early, so the threads could never write to the context.
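
For the archives, here is roughly the pattern that works for me now. This is a trimmed sketch, not the actual UpdateHostDb code; the class name, pool size and lookup() method are illustrative. The two points that matter: context.write() called from multiple threads needs to be synchronized on the context, and cleanup() must shut the pool down and await termination so the framework doesn't close the output writer underneath the still-running workers.

import java.io.IOException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class LookupReducer extends Reducer<Text, Text, Text, Text> {

  private ThreadPoolExecutor pool;

  @Override
  protected void setup(Context context) {
    // SynchronousQueue plus CallerRunsPolicy: when all workers are busy,
    // reduce() runs the task itself, which gives natural backpressure.
    pool = new ThreadPoolExecutor(16, 16, 60L, TimeUnit.SECONDS,
        new SynchronousQueue<Runnable>(),
        new ThreadPoolExecutor.CallerRunsPolicy());
  }

  @Override
  protected void reduce(Text key, Iterable<Text> values, final Context context)
      throws IOException, InterruptedException {
    // Hadoop reuses the key object between reduce() calls, so take a copy
    // before handing it to a worker thread.
    final Text host = new Text(key);
    pool.execute(new Runnable() {
      public void run() {
        try {
          Text result = lookup(host);  // stand-in for the external lookup
          // context.write() is not thread-safe; serialize access to it.
          synchronized (context) {
            context.write(host, result);
          }
        } catch (Exception e) {
          throw new RuntimeException(e);
        }
      }
    });
  }

  @Override
  protected void cleanup(Context context) throws InterruptedException {
    // This is where my bug was: tearing the pool down too early lets the
    // framework close the SequenceFile writer while workers are still
    // calling context.write(), which surfaces as the NPE quoted below.
    pool.shutdown();
    pool.awaitTermination(1, TimeUnit.HOURS);
  }

  private Text lookup(Text key) {
    return new Text("resolved:" + key.toString());  // illustrative only
  }
}

Note that cleanup() runs after the last reduce() call but before the framework closes the RecordWriter, so blocking in awaitTermination() there is exactly what keeps the writer alive until the last worker has written.
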
On Monday 16 January 2012 16:23:48 Markus Jelsma wrote:
> Hi,
>
> We have a job that is IO bound. The mapper aggregates the keys and the
> reducer has to look up the incoming keys externally. If this runs serially
> with 15 reducers it takes many days, so we are using threads to look them
> up.
>
> We offer the keys to a SynchronousQueue and use a ThreadPoolExecutor for
> handling the worker threads. In those threads we need to write the k/v
> pair using the new MapReduce API in Hadoop 1.0.0, but we get an NPE:
>
> java.lang.NullPointerException
>   at org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:975)
>   at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1017)
>   at org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat$1.write(SequenceFileOutputFormat.java:74)
>   at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:587)
>   at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>   at org.apache.nutch.util.hostdb.UpdateHostDb$UpdateHostDbReducer$ResolverThread.run(UpdateHostDb.java:188)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
>
> My question: what is the recommended method of writing from a thread pool
> inside a reducer? I've looked up the NPE but only see references to HBase
> issues which do not seem to apply to this situation.
>
> Any hints to offer?
> Thanks!

-- 
Markus Jelsma - CTO - Openindex