Here is a somewhat different but related issue:

It would be useful to make the framework distinguish between deterministic and non-deterministic failures and react differently to them.

E.g.:

-- In streaming, a Perl script has a syntax error. There is no need to check for this 4*300 times.
-- The same exception (with the same stack) is thrown while processing the same record. (Google's MapReduce is reportedly capable of skipping the offending record on the next attempt, but short of that, why keep trying?)
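
To make the idea concrete, here is a minimal sketch of the kind of check I mean - hypothetical code, nothing like it exists in the framework today: fail a task outright once two consecutive attempts die with an identical exception and stack.

    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical helper (not Hadoop code): remembers the last failure
    // trace per task and flags a repeat as deterministic.
    public class DeterministicFailureDetector {
        private final Map<String, String> lastTrace =
            new HashMap<String, String>();

        // Returns true if the task should be failed without further retries.
        public boolean isDeterministic(String taskId, Throwable failure) {
            String trace = render(failure);
            String previous = lastTrace.put(taskId, trace);
            return trace.equals(previous);
        }

        private static String render(Throwable t) {
            StringBuilder sb = new StringBuilder(t.toString());
            for (StackTraceElement e : t.getStackTrace()) {
                sb.append('\n').append(e.toString());
            }
            return sb.toString();
        }
    }

A syntax error in a streaming script would trip this on the second attempt rather than the 4*300th.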

(Of course, this is just an optimization, while HADOOP-1304 is functionality one cannot do without....)

-- ab

On Apr 30, 2007, at 12:34 PM, Arun C Murthy (JIRA) wrote:


[ https://issues.apache.org/jira/browse/HADOOP-1304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12492766 ]

Arun C Murthy commented on HADOOP-1304:
---------------------------------------

One concern with this 'feature' is that we want a reasonable cap on what the user can set max attempts to; otherwise we could have a situation where a user unknowingly, not maliciously, sets it to a very large value - thus the framework would be vulnerable to one wrongly configured job hogging the cluster...
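
For illustration, such a cap could be enforced where the framework reads the job's configuration - a sketch only, with both property names invented for the example:

    import org.apache.hadoop.mapred.JobConf;

    // Sketch: clamp the user's requested attempt count to an
    // administrator-set ceiling, so one misconfigured job cannot
    // hog the cluster with retries. Property names are illustrative.
    public class AttemptCap {
        static int effectiveAttempts(JobConf conf) {
            int requested = conf.getInt("mapred.map.max.attempts", 4);
            int cap = conf.getInt("mapred.cluster.max.attempts.cap", 16);
            return Math.min(requested, cap);
        }
    }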

Also, as per a discussion with Doug, we could follow Lucene's convention of classifying this knob as 'Expert' so as to clearly elucidate its importance...

MAX_TASK_FAILURES should be configurable
----------------------------------------

                Key: HADOOP-1304
                URL: https://issues.apache.org/jira/browse/HADOOP-1304
            Project: Hadoop
         Issue Type: Improvement
         Components: mapred
   Affects Versions: 0.12.3
           Reporter: Christian Kunz
        Assigned To: Devaraj Das
        Attachments: 1304.patch, 1304.patch


After a couple of weeks of failed attempts, I was able to finish a large job only after I changed MAX_TASK_FAILURES to a higher value. In light of HADOOP-1144 (allowing a certain number of task failures without failing the job), it would be even better if this value could be configured separately for mappers and reducers, because the success of a job often requires the success of all reducers but not of all mappers.
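
For illustration, separate knobs might be set like this (the property names are made up for the example; the attached patch defines the real ones):

    // Sketch: separate retry caps for maps and reduces. Maps tolerate
    // more retries here than reduces, matching the observation that a
    // job can often survive failed map attempts but not failed reducers.
    JobConf job = new JobConf();   // org.apache.hadoop.mapred.JobConf
    job.setInt("mapred.map.max.attempts", 8);
    job.setInt("mapred.reduce.max.attempts", 4);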

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

