[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated MAPREDUCE-2529:
-------------------------------------

    Attachment: M2529-1.patch
                M2529-1-20s.patch

Minor nits:
* When no regex is defined, always incrementing the metric probably makes 
more sense as the default
* {{null}} is probably a better default than the empty string
* There's a possible NPE if the exception message is {{null}}
* The unit test sets combinations of the stack/message regex, but it 
calls {{checkStackException}} in a few places, which doesn't exercise that 
logic (I think it's covered, but that could be clearer)
* Although this will be useful while we work around bugs emerging from Jetty, 
we should probably keep it as an undocumented config setting.
* The trunk patch updates {{MRJobConfig}}, which is for user jobs. Moved to 
{{JTConfig}}

This slight modification treats exceptions with {{null}} messages as matching 
no regexp. Let me know if it looks OK to you.
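
For reference, the intended matching behavior can be sketched roughly as 
follows. This is a minimal illustration, not the code in the attached patches: 
the class name {{ExceptionMatcher}} and its method are hypothetical, and it 
assumes that when both regexes are set, both must match.

```java
import java.util.regex.Pattern;

/** Hypothetical helper; names are illustrative and not taken from the patch. */
final class ExceptionMatcher {
  private final Pattern messagePattern; // null means no message regex is configured
  private final Pattern stackPattern;   // null means no stack regex is configured

  ExceptionMatcher(String messageRegex, String stackRegex) {
    this.messagePattern = messageRegex == null ? null : Pattern.compile(messageRegex);
    this.stackPattern = stackRegex == null ? null : Pattern.compile(stackRegex);
  }

  /** Returns true if the metric should be incremented for this exception. */
  boolean matches(Throwable t) {
    if (messagePattern != null) {
      String msg = t.getMessage();
      // A null message matches no regexp; this check also avoids the NPE.
      if (msg == null || !messagePattern.matcher(msg).matches()) {
        return false;
      }
    }
    if (stackPattern != null && !stackMatches(t)) {
      return false;
    }
    // Reached when every configured regex matched, or when none is configured.
    return true;
  }

  /** True if any stack frame, rendered as a string, matches the stack regex. */
  private boolean stackMatches(Throwable t) {
    for (StackTraceElement frame : t.getStackTrace()) {
      if (stackPattern.matcher(frame.toString()).matches()) {
        return true;
      }
    }
    return false;
  }
}
```

With both regexes left as {{null}}, every exception counts toward the metric; 
with a message regex set, an exception whose message is {{null}} is never 
counted.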

> Recognize Jetty bug 1342 and handle it
> --------------------------------------
>
>                 Key: MAPREDUCE-2529
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2529
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>    Affects Versions: 0.20.204.0, 0.23.0
>            Reporter: Thomas Graves
>            Assignee: Thomas Graves
>             Fix For: 0.20.205.0, 0.23.0
>
>         Attachments: M2529-1-20s.patch, M2529-1.patch, 
> jetty1342-20security.patch, mapred2529-trunk.patch
>
>
> We are seeing many instances of Jetty bug 1342 
> (http://jira.codehaus.org/browse/JETTY-1342). The bug doesn't cause Jetty to 
> stop responding altogether; some fetches go through, but a lot of them throw 
> exceptions and eventually fail. The only way we have found to get the TT out 
> of this state is to restart the TT. This jira is to catch this particular 
> exception (or perhaps a configurable regex) and handle it in an automated way 
> by either blacklisting or shutting down the TT after seeing the exception a 
> configurable number of times.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
