Found it: HADOOP-5210

"Reduce Task Progress shows > 100% when the total size of map outputs (for a
single reducer) is high"

https://issues.apache.org/jira/browse/HADOOP-5210
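
For anyone hitting the timeout Aaron describes below, the fix is to report progress from inside the long-running loop. Here is a minimal, self-contained sketch of that pattern; the `Reporter` interface below is a simplified stand-in for Hadoop's `org.apache.hadoop.mapred.Reporter` (`incrCounter()` is the real method name, everything else here is illustrative):

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

public class ProgressDemo {
  // Simplified stand-in for Hadoop's org.apache.hadoop.mapred.Reporter;
  // incrCounter() is the real method name, the rest is illustrative.
  interface Reporter {
    void incrCounter(String group, String counter, long amount);
  }

  // Processes all values, incrementing a counter every 1000 records.
  // In a real reduce task, each incrCounter() call resets the task's
  // liveness timer, so the framework won't kill the attempt for
  // "failing to report status" during a long stretch of work.
  static long reduceWithHeartbeat(Iterator<String> values, Reporter reporter) {
    long processed = 0;
    while (values.hasNext()) {
      values.next(); // ... expensive indexing work would go here ...
      if (++processed % 1000 == 0) {
        reporter.incrCounter("Indexing", "RECORDS_PROCESSED", 1000);
      }
    }
    return processed;
  }

  public static void main(String[] args) {
    List<String> records = Collections.nCopies(2500, "doc");
    final long[] counted = {0};
    long n = reduceWithHeartbeat(records.iterator(),
        (group, counter, amount) -> counted[0] += amount);
    System.out.println(n + " records processed, " + counted[0] + " counted");
    // prints "2500 records processed, 2000 counted"
  }
}
```

Emitting output through the `OutputCollector` resets the same timer, so either approach works; the counter is just the cheaper option when results only appear at the end of the loop. Raising `mapred.task.timeout` is the other lever, but reporting progress is the more robust fix.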

On Thu, Jul 9, 2009 at 5:42 PM, Peter Skomoroch
<[email protected]>wrote:

> I've seen this behavior before with reduces going over 100% on big jobs.
> What version of Hadoop are you using?  I think there are some old bugs filed
> for this if you search the Jira.
>
>
> On Thu, Jul 9, 2009 at 5:31 PM, Aaron Kimball <[email protected]> wrote:
>
>> Reduce tasks which require more than twenty minutes are not a problem. But
>> you must emit some data periodically to inform the rest of the system that
>> each reducer is still alive. Emitting a (k, v) output pair to the
>> collector
>> will reset the timer. Similarly, calling Reporter.incrCounter() will also
>> reset the clock. So if you're doing a large amount of processing in a loop
>> before you emit your final key-value pairs, you should periodically
>> increment a counter to allow the rest of the system to confirm that you're
>> not deadlocked.
>>
>> I'm not sure why your progress went so high. I know that Hadoop has some
>> quirks related to compression. If you've got compressed data, then
>> percentages might be inaccurate since the completed/available_input data
>> ratio will be partially based on compressed sizes.
>> - Aaron
>>
>> On Thu, Jul 9, 2009 at 12:24 PM, Prashant Ullegaddi <
>> [email protected]> wrote:
>>
>> > Hi Jothi,
>> >
>> > We are trying to index around 245GB compressed data (~1TB uncompressed)
>> > on a 9 node Hadoop cluster with 8 slaves and 1 master. In Map, we are
>> > just parsing the files and passing them to the reduce. In the reduce,
>> > we are indexing the parsed data, much like Nutch does.
>> >
>> > When we ran the job, the maps finished in less than 4 hours. But something
>> > strange happened with the reduces: they went past 100% progress, showing
>> > 200+% before getting killed! Is this some kind of bug in Hadoop?
>> >
>> > All of them eventually got killed with "Task
>> > attempt_200907091637_0004_r_000000_0 failed to report status for 1201
>> > seconds. Killing!" But I suspect indexing in the reduce takes more than
>> > 1200 seconds. How do we get around this?
>> >
>> >
>> > Thanks in advance,
>> > Prashant,
>> > Search and Information Extraction Lab,
>> > IIIT-Hyderabad,
>> > INDIA.
>> >
>> >
>>
>
>
>
> --
> Peter N. Skomoroch
> 617.285.8348
> http://www.datawrangling.com
> http://delicious.com/pskomoroch
> http://twitter.com/peteskomoroch
>



-- 
Peter N. Skomoroch
617.285.8348
http://www.datawrangling.com
http://delicious.com/pskomoroch
http://twitter.com/peteskomoroch
