Hi,

I used the setStatus method and now my mappers are not getting killed anymore.
Thanks a lot!

Warm regards
Arko

On Thu, Oct 27, 2011 at 4:31 AM, Lucian Iordache <lucian.george.iorda...@gmail.com> wrote:
> Hi,
>
> Probably your map method takes too long to process the data. You could add
> some context.progress() or context.setStatus("status") calls in your map
> method from time to time (at least once every 600 seconds, so as not to hit
> the timeout).
>
> Regards,
> Lucian
>
> On Thu, Oct 27, 2011 at 11:22 AM, Arko Provo Mukherjee <arkoprovomukher...@gmail.com> wrote:
>> Hi,
>>
>> I have a situation where I have to read a large file into every mapper.
>>
>> Since it is a large HDFS file that is needed to process each input to the
>> mapper, it takes a lot of time to read the data into memory from HDFS.
>>
>> Thus the system is killing all my mappers with the following message:
>>
>> 11/10/26 22:54:52 INFO mapred.JobClient: Task Id :
>> attempt_201106271322_12504_m_000000_0, Status : FAILED
>> Task attempt_201106271322_12504_m_000000_0 failed to report status for
>> 601 seconds. Killing!
>>
>> The cluster is not entirely owned by me, so I cannot change
>> mapred.task.timeout so as to be able to read the entire file.
>>
>> Any suggestions?
>>
>> Also, is there a way for a Mapper instance to read the file once for all
>> the inputs that it receives?
>> Currently, since the file-reading code is in the map method, I guess it is
>> reading the entire file for each and every input, leading to a lot of
>> overhead.
>>
>> Please help!
>>
>> Many thanks in advance!!
>>
>> Warm regards
>> Arko
>
> --
> All the best,
> Lucian
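For reference, a minimal sketch of the pattern Lucian suggests, using the new
org.apache.hadoop.mapreduce API. The class name, key/value types, and the
10000-record interval are hypothetical; only context.setStatus() and
context.progress() come from the thread:

    import java.io.IOException;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class HeartbeatMapper extends Mapper<LongWritable, Text, Text, NullWritable> {

        private long records = 0;

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // ... expensive per-record work would go here ...

            // Tell the framework the task is still alive, well inside the
            // 600-second mapred.task.timeout window.
            if (++records % 10000 == 0) {
                context.setStatus("processed " + records + " records");
                context.progress();
            }
            context.write(value, NullWritable.get());
        }
    }

Pinging every few thousand records keeps the reporting overhead negligible
while staying far inside the timeout window.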
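On the second question (reading the file once per Mapper instance), one common
approach is to move the read into setup(), which the framework calls once per
task before the first map(). A minimal sketch, assuming a hypothetical
side.file.path job property that points at the large HDFS file, and calling
context.progress() during the read so the task survives the long load:

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class SideFileMapper extends Mapper<LongWritable, Text, Text, NullWritable> {

        private final List<String> sideData = new ArrayList<String>();

        @Override
        protected void setup(Context context) throws IOException, InterruptedException {
            Configuration conf = context.getConfiguration();
            // "side.file.path" is a hypothetical job property naming the large HDFS file.
            Path sidePath = new Path(conf.get("side.file.path"));
            FileSystem fs = sidePath.getFileSystem(conf);
            BufferedReader reader =
                    new BufferedReader(new InputStreamReader(fs.open(sidePath)));
            try {
                String line;
                long lines = 0;
                while ((line = reader.readLine()) != null) {
                    sideData.add(line);
                    // Keep reporting progress during the long read so the
                    // task is not killed before the first map() call.
                    if (++lines % 100000 == 0) {
                        context.progress();
                    }
                }
            } finally {
                reader.close();
            }
        }

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // sideData is now in memory for every record this mapper processes.
            context.write(value, NullWritable.get());
        }
    }

If the file also fits on local disk, the DistributedCache may be worth a look,
since it ships the file to each node once per job instead of reading it from
HDFS in every task.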