The data is still the same. I will check the logs and see if I can find something.

H

Morpheus: Do you believe in fate, Neo?
Neo: No.
Morpheus: Why Not?
Neo: Because I don't like the idea that I'm not in control of my life.

________________________________
From: Rekha Joshi <rekha...@yahoo-inc.com>
To: "mapreduce-user@hadoop.apache.org" <mapreduce-user@hadoop.apache.org>
Sent: Tue, November 24, 2009 4:11:01 AM
Subject: Re: Maps getting stuck at 100%

Even if the code is the same, a change in behavior can occur if the data it
processes has changed (e.g. date-related data) or the parameters are
different (e.g. sort/spill on the map side). It seems to me to be related to
buffering. The detailed logs can point out exactly what is happening.

Thanks & Regards,
/R

On 11/24/09 2:18 PM, "himanshu chandola" <himanshu_cool...@yahoo.com> wrote:

> Hi Todd,
>
> It was definitely working fine a week before and the code hasn't changed
> much. On my laptop a pseudo distributed installation for the same code
> finishes successive map reduce iterations quickly enough.
>
> As far as I can see it, it is probably due to reformatting the fs. But I
> can't understand why it occurs this way.
>
> tx
>
> Himanshu
>
> ________________________________
> From: Todd Lipcon <t...@cloudera.com>
> To: mapreduce-user@hadoop.apache.org
> Sent: Tue, November 24, 2009 2:52:51 AM
> Subject: Re: Maps getting stuck at 100%
>
>> Hi Himanshu,
>>
>> The map progress percentage is calculated based on the input read, rather
>> than the processing actually done. So, if you're doing a lot of work in your
>> mapper, or reading ahead of what you've processed, you'll see this behavior
>> reasonably often. It also can show up sometimes in streaming jobs if you are
>> doing a lot of work per row, since you have more buffering going on between
>> the counters and your actual mapper work.
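
Todd's point that map progress tracks input read rather than work done can be illustrated with a toy model. This is only a sketch under stated assumptions, not Hadoop's actual progress code: the names simulate, stalled_ticks and read_ahead are hypothetical, and the "reader" and "mapper" are reduced to two counters.

```python
def simulate(total_records, read_ahead, work_per_tick=1):
    """Toy model (not Hadoop code): progress is reported from records READ,
    while the mapper lags behind by up to `read_ahead` buffered records."""
    read = 0        # records pulled from the input so far
    processed = 0   # records the mapper has actually finished
    timeline = []   # one (reported_percent, processed) pair per tick
    while processed < total_records:
        # Reader: fill the buffer until it holds `read_ahead` records.
        while read < total_records and read - processed < read_ahead:
            read += 1
        # Mapper: finish at most `work_per_tick` buffered records.
        processed += min(work_per_tick, read - processed)
        # Reported progress only looks at how much input was read.
        timeline.append((100 * read // total_records, processed))
    return timeline


def stalled_ticks(timeline):
    """Count ticks where progress shows 100% but records are still pending."""
    total = timeline[-1][1]
    return sum(1 for pct, done in timeline if pct == 100 and done < total)
```

In this model, stalled_ticks(simulate(10, 4)) counts the ticks in which the task reports 100% while buffered records are still being worked on; with read_ahead=1 there is no such gap. The more read-ahead and the slower the per-record work, the longer a task sits at 100%.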
>>
>> The easiest way to see what the tasks are doing is to drill down to the logs
>> for an individual task that's stuck at 100%. If you add some logging output
>> to your program, that can be helpful. Another trick, if you have the right
>> access, is to ssh into your tasktracker node and send the SIGQUIT signal to
>> one of your task pids - this will make it dump stack to its stdout log, which
>> you can then inspect to understand what's going on.
>>
>> Hope that helps
>> -Todd
>>
>> On Mon, Nov 23, 2009 at 11:48 PM, himanshu chandola
>> <himanshu_cool...@yahoo.com> wrote:
>>
>>> Hi,
>>>
>>> I use cloudera's distribution for hadoop. What I see is that a small
>>> fraction of maps get stuck at 100%. They show up as 100% but continue
>>> running. After a lot of delay, they succeed finally but it takes a while,
>>> like 10 mins from the time when they show up as 100%.
>>>
>>> We recently reformatted our hadoop fs. Could it be related to that ?
>>>
>>> Thanks
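
Todd's suggestion to add logging output could look roughly like the sketch below. It assumes the job is a Hadoop Streaming job with a Python mapper, which the thread does not confirm; run_mapper, log_every and the word-count body are hypothetical placeholders for the real per-row work. The only Hadoop-specific part is the reporter:status: prefix, which Hadoop Streaming reads from a task's stderr as a status update.

```python
import sys
import time


def run_mapper(lines, out=sys.stdout, err=sys.stderr, log_every=1000):
    """Hypothetical streaming mapper that logs how far it has actually got."""
    start = time.time()
    count = 0
    for count, line in enumerate(lines, start=1):
        # Placeholder for the real per-row work: emit "<word>\t1" pairs.
        for word in line.split():
            out.write(f"{word}\t1\n")
        if count % log_every == 0:
            elapsed = time.time() - start
            # Ordinary stderr output ends up in the task's stderr log.
            err.write(f"processed {count} rows in {elapsed:.1f}s\n")
            # Hadoop Streaming treats this stderr line as a task status update.
            err.write(f"reporter:status:processed {count} rows\n")
            err.flush()
    err.write(f"mapper finished after {count} rows\n")
    return count
```

Used as the mapper script of a streaming job, the file would end with a call to run_mapper(sys.stdin). If a task shows 100% while its stderr log is still printing "processed N rows" lines, the mapper is simply still working through buffered input.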