[jira] [Commented] (YARN-1055) Handle app recovery differently for AM failures and RM restart

2015-05-01 Thread Xuan Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14523653#comment-14523653 ] Xuan Gong commented on YARN-1055: - Close this ticket since we already have work-preserving

[jira] [Commented] (YARN-1055) Handle app recovery differently for AM failures and RM restart

2015-01-14 Thread Jian He (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14277951#comment-14277951 ] Jian He commented on YARN-1055: --- With work-preserving RM restart, the max-attempts is not req

[jira] [Commented] (YARN-1055) Handle app recovery differently for AM failures and RM restart

2013-08-15 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741165#comment-13741165 ] Bikas Saha commented on YARN-1055: -- First of all, like folks have already agreed. This is

[jira] [Commented] (YARN-1055) Handle app recovery differently for AM failures and RM restart

2013-08-14 Thread Alejandro Abdelnur (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13740650#comment-13740650 ] Alejandro Abdelnur commented on YARN-1055: -- [~bikassaha], [~vinodkv], in Hadoop 1

[jira] [Commented] (YARN-1055) Handle app recovery differently for AM failures and RM restart

2013-08-14 Thread Vinod Kumar Vavilapalli (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13740576#comment-13740576 ] Vinod Kumar Vavilapalli commented on YARN-1055: --- Same here :) We do really un

[jira] [Commented] (YARN-1055) Handle app recovery differently for AM failures and RM restart

2013-08-14 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13740500#comment-13740500 ] Bikas Saha commented on YARN-1055: -- That's exactly what I was trying to say earlier. That R

[jira] [Commented] (YARN-1055) Handle app recovery differently for AM failures and RM restart

2013-08-14 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13740495#comment-13740495 ] Karthik Kambatla commented on YARN-1055: Thinking more about it, the issue is not l

[jira] [Commented] (YARN-1055) Handle app recovery differently for AM failures and RM restart

2013-08-14 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13740474#comment-13740474 ] Bikas Saha commented on YARN-1055: -- Why does the launcher not retry the action? Is there a

[jira] [Commented] (YARN-1055) Handle app recovery differently for AM failures and RM restart

2013-08-14 Thread Robert Kanter (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13740446#comment-13740446 ] Robert Kanter commented on YARN-1055: - Another way of phrasing this: when the action's

[jira] [Commented] (YARN-1055) Handle app recovery differently for AM failures and RM restart

2013-08-14 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13740436#comment-13740436 ] Karthik Kambatla commented on YARN-1055: This problem doesn't exist in Hadoop-1 bec

[jira] [Commented] (YARN-1055) Handle app recovery differently for AM failures and RM restart

2013-08-14 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13740435#comment-13740435 ] Karthik Kambatla commented on YARN-1055: In Hadoop 1, we set the job.recovery.enabl

[jira] [Commented] (YARN-1055) Handle app recovery differently for AM failures and RM restart

2013-08-14 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13740367#comment-13740367 ] Bikas Saha commented on YARN-1055: -- How does it work in Hadoop 1 then? From what I see the

[jira] [Commented] (YARN-1055) Handle app recovery differently for AM failures and RM restart

2013-08-14 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13740362#comment-13740362 ] Karthik Kambatla commented on YARN-1055: As in my comment from above (https://issu

[jira] [Commented] (YARN-1055) Handle app recovery differently for AM failures and RM restart

2013-08-14 Thread Alejandro Abdelnur (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13740249#comment-13740249 ] Alejandro Abdelnur commented on YARN-1055: -- [~rkanter], [~kkambatl], can you pleas

[jira] [Commented] (YARN-1055) Handle app recovery differently for AM failures and RM restart

2013-08-14 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13740242#comment-13740242 ] Bikas Saha commented on YARN-1055: -- First of all, whatever needs to be set must be set in

[jira] [Commented] (YARN-1055) Handle app recovery differently for AM failures and RM restart

2013-08-14 Thread Alejandro Abdelnur (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13740204#comment-13740204 ] Alejandro Abdelnur commented on YARN-1055: -- [~bikassaha], bq. Restart on am fail

[jira] [Commented] (YARN-1055) Handle app recovery differently for AM failures and RM restart

2013-08-14 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13740162#comment-13740162 ] Karthik Kambatla commented on YARN-1055: [~hitesh], you are right - we should be ca

[jira] [Commented] (YARN-1055) Handle app recovery differently for AM failures and RM restart

2013-08-14 Thread Hitesh Shah (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739829#comment-13739829 ] Hitesh Shah commented on YARN-1055: --- [~kkambatl] Based on the discussion, I was trying to

[jira] [Commented] (YARN-1055) Handle app recovery differently for AM failures and RM restart

2013-08-14 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739372#comment-13739372 ] Karthik Kambatla commented on YARN-1055: bq. In case of a network issue where the A

[jira] [Commented] (YARN-1055) Handle app recovery differently for AM failures and RM restart

2013-08-13 Thread Hitesh Shah (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739303#comment-13739303 ] Hitesh Shah commented on YARN-1055: --- [~kkambatl] In case of a network issue where the AM

[jira] [Commented] (YARN-1055) Handle app recovery differently for AM failures and RM restart

2013-08-13 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739109#comment-13739109 ] Karthik Kambatla commented on YARN-1055: From a YARN-user POV, I see it differentl

[jira] [Commented] (YARN-1055) Handle app recovery differently for AM failures and RM restart

2013-08-13 Thread Bikas Saha (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739073#comment-13739073 ] Bikas Saha commented on YARN-1055: -- Restart on AM failure is already determined by the def

[jira] [Commented] (YARN-1055) Handle app recovery differently for AM failures and RM restart

2013-08-13 Thread Vinod Kumar Vavilapalli (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739040#comment-13739040 ] Vinod Kumar Vavilapalli commented on YARN-1055: --- This is a new issue with Had

[jira] [Commented] (YARN-1055) Handle app recovery differently for AM failures and RM restart

2013-08-13 Thread Alejandro Abdelnur (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739027#comment-13739027 ] Alejandro Abdelnur commented on YARN-1055: -- [~vinodkv], in theory I agree with you

[jira] [Commented] (YARN-1055) Handle app recovery differently for AM failures and RM restart

2013-08-12 Thread Vinod Kumar Vavilapalli (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13737651#comment-13737651 ] Vinod Kumar Vavilapalli commented on YARN-1055: --- Irrespective of RM restart,