[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13282115#comment-13282115
 ] 

Ahmed Radwan commented on MAPREDUCE-4284:
-----------------------------------------

Just to elaborate: the default for this property is 0, so these container 
dirs are deleted as soon as the job finishes. In a test cluster we can set 
the property to a relatively high value to be able to inspect container 
logs/local dirs. But how can we do that in a production cluster? The problem is 
that any change to this property affects all jobs, and the change requires 
restarting every NodeManager in the cluster. Both consequences are bad: 
keeping all dirs for all jobs is expensive from a storage perspective, and 
restarting the NMs is expensive from an operations perspective.

So one possible solution is to make the scope of this property per-job (or to 
add another per-job property), so that the user can set this value as a hint to 
the NM to keep the dirs for that individual job. We can still keep a 
NodeManager property to override or cap the delay time.
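
To make the "per-job hint, capped by the NodeManager" idea concrete, here is a 
minimal sketch of how the effective delay could be resolved. Everything in it 
is an assumption for illustration: the class DebugDelayPolicy, the method 
effectiveDelaySec, and the notion of a separate NM-side maximum are 
hypothetical and are not existing YARN APIs or properties. Only the existing 
NM-wide default (yarn.nodemanager.delete.debug-delay-sec, default 0) comes from 
the current behavior.

```java
/**
 * Hypothetical sketch, not part of YARN: resolves how long a NodeManager
 * would keep a finished job's container dirs before deleting them.
 */
public final class DebugDelayPolicy {

    private DebugDelayPolicy() {
        // Static helper only; not meant to be instantiated.
    }

    /**
     * @param nmDefaultSec NM-wide delay, i.e. the current value of
     *                     yarn.nodemanager.delete.debug-delay-sec (default 0)
     * @param nmMaxSec     assumed NM-side cap on any per-job request
     * @param jobHintSec   assumed per-job hint; null when the job sets none
     * @return the delay in seconds to apply for this job
     */
    public static long effectiveDelaySec(long nmDefaultSec, long nmMaxSec, Long jobHintSec) {
        if (nmDefaultSec < 0 || nmMaxSec < 0) {
            throw new IllegalArgumentException("NM delay settings must be non-negative");
        }
        if (jobHintSec == null) {
            // No per-job hint: behave exactly as the NM does today.
            return nmDefaultSec;
        }
        // Treat a negative hint as "no delay", then apply the NM cap.
        long requested = Math.max(0L, jobHintSec);
        return Math.min(requested, nmMaxSec);
    }
}
```

With a scheme like this, jobs that set no hint keep today's behavior, while a 
single job being debugged can ask for a longer delay without a NodeManager 
restart, and the cap bounds how much storage any one request can hold on to.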

Arun, what do you think?
                
> Allow setting yarn.nodemanager.delete.debug-delay-sec on a per-job basis
> ------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4284
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4284
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>            Reporter: Ahmed Radwan
>            Assignee: Ahmed Radwan
>
> The yarn.nodemanager.delete.debug-delay-sec property is helpful in debugging 
> jobs (inspecting container logs/local dirs after the job finishes). Currently 
> it is a nodemanager property and changing it requires restarting the 
> nodemanager. In a production cluster this can be a real problem. It is better 
> to have this property set on a per-job basis and not requiring the restart of 
> nodemanagers. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
