For the timeout problem, you can start a background thread that calls context.progress() periodically, which acts as a "keep-alive" for the forked child task (mapper/combiner/reducer). It is a bit of a hack, but it works.
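Something along these lines (a minimal sketch, not tested; the key/value types, the AdjacencyListReducer name, and the 30-second ping interval are just placeholders for your own job, but context.progress() itself is the real Hadoop API call that resets the task timeout):

import java.io.IOException;
import java.util.concurrent.atomic.AtomicBoolean;
import org.apache.hadoop.io.ArrayWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class AdjacencyListReducer
        extends Reducer<Text, Text, Text, ArrayWritable> {

    @Override
    protected void reduce(Text key, Iterable<Text> values,
                          final Context context)
            throws IOException, InterruptedException {

        // Daemon thread that pings the framework while the long
        // write runs, so the task is not killed as unresponsive.
        final AtomicBoolean done = new AtomicBoolean(false);
        Thread keepAlive = new Thread(new Runnable() {
            public void run() {
                while (!done.get()) {
                    context.progress();        // resets the timeout clock
                    try {
                        Thread.sleep(30 * 1000L);  // ping interval, tune as needed
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            }
        });
        keepAlive.setDaemon(true);
        keepAlive.start();

        try {
            // ... build the (possibly huge) ArrayWritable here ...
            // context.write(outKey, outValue);
        } finally {
            done.set(true);
            keepAlive.interrupt();
        }
    }
}

The finally block makes sure the keep-alive thread stops even if the write throws, so it cannot keep a failed task looking alive.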
On Sat, May 5, 2012 at 10:05 PM, Zuhair Khayyat <zuhair.khay...@kaust.edu.sa> wrote:

> Hi,
>
> I am building a MapReduce application that constructs the adjacency list
> of a graph from an input edge list. I noticed that my Reduce phase always
> hangs (and eventually times out) as it calls the function
> context.write(Key_x, Value_x) when the Value_x is a very large ArrayWritable
> (around 4M elements). I have increased both "mapred.task.timeout" and the
> reducers' memory, but no luck; the reducer does not finish the job. Is there
> any other data format that supports a large amount of data, or should I use
> my own "OutputFormat" class to optimize writing the large amount of data?
>
>
> Thank you.
> Zuhair Khayyat