I changed the maximum split size to 30000, and now most tasks actually succeed. However, some tasks still fail (with a job I was running yesterday, I got a failure after 1900 tasks). The problem is that these very few failures can bring down the entire job, as they sometimes seem to just keep failing. I looked through mapred-default.xml, but I didn't find a config option that allows ignoring tasks that fail. Is there a way to do this? It seems like the only alternative I have, since I can't make the failures stop.
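(If the goal is just to keep a handful of stubborn task failures from killing the whole job, a sketch of what might help, assuming a 0.19/0.20-era Hadoop where `mapred.max.map.failures.percent` and `mapred.map.max.attempts` are the relevant knobs; the exact names should be verified against your version's JobConf:)

```xml
<!-- Sketch, not verified against this exact Hadoop version:
     tolerate up to 5% of map tasks failing without failing the job,
     and retry each failing task a few extra times first. -->
<property>
  <name>mapred.max.map.failures.percent</name>
  <value>5</value>
</property>
<property>
  <name>mapred.map.max.attempts</name>
  <value>6</value>
</property>
```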
Mathias

2009/8/5 Mathias De Maré <[email protected]>
>
> On Wed, Aug 5, 2009 at 9:38 AM, Jothi Padmanabhan <[email protected]> wrote:
>> Hi,
>>
>> Could you please try setting this parameter
>> mapred.merge.recordsBeforeProgress to a lower number?
>> See https://issues.apache.org/jira/browse/HADOOP-4714
>>
>> Cheers
>> Jothi
>
> Hm, that bug looks like it's applicable during the merge, but my case is a
> block right before the merge (though seemingly right after all of the map
> tasks finish). I tried setting mapred.merge.recordsBeforeProgress to 100,
> and it didn't make a difference.
>
> On Wed, Aug 5, 2009 at 10:32 AM, Amogh Vasekar <[email protected]> wrote:
>> 10 minutes reminds me of the parameter mapred.task.timeout. This is
>> configurable. Alternatively, you might just do a sysout to let the tracker
>> know of the task's existence (not an ideal solution, though).
>>
>> Thanks,
>> Amogh
>
> Well, the map tasks take around 30 minutes to run. Letting the task idle
> for a large number of minutes after that is a lot of useless time, imho. I
> tried with 20 minutes now, but I still get timeouts.
>
> I don't know if it's useful, but here are the current settings of the map
> tasks:
>
> <configuration>
>   <property>
>     <name>mapred.job.tracker</name>
>     <value>localhost:9001</value>
>   </property>
>
>   <property>
>     <name>io.sort.mb</name>
>     <value>3</value>
>     <description>The total amount of buffer memory to use while sorting
>     files, in megabytes. By default, gives each merge stream 1MB, which
>     should minimize seeks.</description>
>   </property>
>
>   <property>
>     <name>mapred.tasktracker.map.tasks.maximum</name>
>     <value>4</value>
>     <description>The maximum number of map tasks that will be run
>     simultaneously by a task tracker.</description>
>   </property>
>
>   <property>
>     <name>mapred.tasktracker.reduce.tasks.maximum</name>
>     <value>4</value>
>     <description>The maximum number of reduce tasks that will be run
>     simultaneously by a task tracker.</description>
>   </property>
>
>   <property>
>     <name>mapred.max.split.size</name>
>     <value>1000000</value>
>   </property>
>
>   <property>
>     <name>mapred.child.java.opts</name>
>     <value>-Xmx400m</value>
>   </property>
>
>   <property>
>     <name>mapred.merge.recordsBeforeProgress</name>
>     <value>100</value>
>   </property>
>
>   <property>
>     <name>mapred.task.timeout</name>
>     <value>1200000</value>
>   </property>
> </configuration>
>
> Ideally, I would want to get rid of the delay that causes the timeouts, yet
> also increase the split size somewhat (though I think a larger split size
> would increase the delay even more?).
> The map tasks take around 8,000-11,000 records as input and can produce up
> to 1,000,000 records as output (in case this is relevant).
>
> Mathias
>
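(The "do a sysout" suggestion above can be done more cleanly by reporting progress from inside the map loop, so a long-running task is not killed once mapred.task.timeout elapses. A minimal sketch of the pattern; the Progressable interface here is a local stand-in, and in a real mapper you would call reporter.progress() on the Reporter that Hadoop passes in:)

```java
// Sketch of the "keep the task tracker happy" pattern: signal liveness
// periodically during long per-record work so the task is not declared
// timed out. Progressable below is a hypothetical local stand-in for the
// callback a real mapper gets from the framework.
public class ProgressDemo {
    interface Progressable {
        void progress();
    }

    // Process `records` items, signalling liveness every `interval` records.
    // Returns the number of progress() calls made.
    static int processWithHeartbeat(int records, int interval, Progressable reporter) {
        int pings = 0;
        for (int i = 1; i <= records; i++) {
            // ... expensive per-record work would happen here ...
            if (i % interval == 0) {
                reporter.progress(); // resets the task's timeout clock
                pings++;
            }
        }
        return pings;
    }

    public static void main(String[] args) {
        int pings = processWithHeartbeat(3500, 1000, () -> {});
        System.out.println("progress() called " + pings + " times");
    }
}
```

(With 3500 records and an interval of 1000, progress() fires three times; the interval just needs to keep the gap between calls comfortably below the configured timeout.)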
