Thanks for the fast response. I think it is a good idea; however, the application becomes too slow with large output arrays. I would be more interested in a solution that helps speed up the "context.write()" call itself.
On Sat, May 5, 2012 at 5:36 PM, Zizon Qiu <zzd...@gmail.com> wrote:
> for the timeout problem, you can use a background thread that invokes
> context.progress() periodically, which acts as a "keep-alive" for the forked
> Child (mapper/combiner/reducer)...
> it is tricky but works.
>
>
> On Sat, May 5, 2012 at 10:05 PM, Zuhair Khayyat <
> zuhair.khay...@kaust.edu.sa> wrote:
>
>> Hi,
>>
>> I am building a MapReduce application that constructs the adjacency list
>> of a graph from an input edge list. I noticed that my Reduce phase always
>> hangs (and eventually times out) as it calls the function
>> context.write(Key_x, Value_x) when Value_x is a very large ArrayWritable
>> (around 4M elements). I have increased both "mapred.task.timeout" and the
>> reducers' memory, but no luck; the reducer does not finish the job. Is there
>> any other data format that supports a large amount of data, or should I use
>> my own "OutputFormat" class to optimize writing the large amount of data?
>>
>>
>> Thank you.
>> Zuhair Khayyat
>>
>
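
For reference, a minimal sketch of the keep-alive thread Zizon describes might look like the following. It assumes the new org.apache.hadoop.mapreduce API; the AdjacencyListReducer name, the Text/IntWritable key and value types, and the 30-second progress interval are illustrative choices, not taken from the original thread.

import java.io.IOException;
import java.util.ArrayList;
import org.apache.hadoop.io.ArrayWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class AdjacencyListReducer
    extends Reducer<Text, IntWritable, Text, ArrayWritable> {

  @Override
  protected void reduce(Text key, Iterable<IntWritable> values, Context context)
      throws IOException, InterruptedException {

    final Context ctx = context;

    // Background thread that reports progress every 30 seconds so the
    // framework does not kill the task while the long write is in flight.
    Thread keepAlive = new Thread(new Runnable() {
      @Override
      public void run() {
        try {
          while (!Thread.currentThread().isInterrupted()) {
            ctx.progress();      // tells the framework the task is still alive
            Thread.sleep(30000L);
          }
        } catch (InterruptedException e) {
          // main thread finished the write; exit quietly
        }
      }
    });
    keepAlive.setDaemon(true);
    keepAlive.start();

    try {
      // Build the (potentially huge) adjacency list and emit it in one call.
      ArrayList<IntWritable> neighbours = new ArrayList<IntWritable>();
      for (IntWritable v : values) {
        neighbours.add(new IntWritable(v.get()));
      }
      ArrayWritable adjacency = new ArrayWritable(
          IntWritable.class, neighbours.toArray(new IntWritable[0]));
      context.write(key, adjacency);   // the slow call the thread protects
    } finally {
      keepAlive.interrupt();           // always stop the keep-alive thread
    }
  }
}

Note that this only prevents the timeout; it does not make the single large context.write() any faster, which matches the concern raised in the reply above.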