[jira] [Commented] (YARN-4924) NM recovery race can lead to container not cleaned up

2016-04-14 Thread sandflee (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242113#comment-15242113 ] sandflee commented on YARN-4924: Thanks [~jlowe] for reviewing and for the suggestions, thanks [~nroberts] for reporting

[jira] [Commented] (YARN-4924) NM recovery race can lead to container not cleaned up

2016-04-14 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241810#comment-15241810 ] Jason Lowe commented on YARN-4924: -- Filed YARN-4960 for the LeveldbIterator constructor issue and

[jira] [Commented] (YARN-4924) NM recovery race can lead to container not cleaned up

2016-04-14 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241779#comment-15241779 ] Hudson commented on YARN-4924: -- FAILURE: Integrated in Hadoop-trunk-Commit #9615 (See

[jira] [Commented] (YARN-4924) NM recovery race can lead to container not cleaned up

2016-04-14 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241734#comment-15241734 ] Jason Lowe commented on YARN-4924: -- bq. leveldbIterator may also throw DBException, yes? Yes, if the

[jira] [Commented] (YARN-4924) NM recovery race can lead to container not cleaned up

2016-04-14 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241585#comment-15241585 ] Hadoop QA commented on YARN-4924: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Commented] (YARN-4924) NM recovery race can lead to container not cleaned up

2016-04-14 Thread sandflee (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241476#comment-15241476 ] sandflee commented on YARN-4924: leveldbIterator may also throw DBException, yes? > NM recovery race can

[jira] [Commented] (YARN-4924) NM recovery race can lead to container not cleaned up

2016-04-14 Thread sandflee (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241470#comment-15241470 ] sandflee commented on YARN-4924: Thanks [~jlowe], I now catch all exceptions in

[jira] [Commented] (YARN-4924) NM recovery race can lead to container not cleaned up

2016-04-14 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241431#comment-15241431 ] Jason Lowe commented on YARN-4924: -- Thanks for updating the patch! If createWriteBatch does ever throw

[jira] [Commented] (YARN-4924) NM recovery race can lead to container not cleaned up

2016-04-11 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15236182#comment-15236182 ] Hadoop QA commented on YARN-4924: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Commented] (YARN-4924) NM recovery race can lead to container not cleaned up

2016-04-11 Thread sandflee (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15236128#comment-15236128 ] sandflee commented on YARN-4924: Thanks [~jlowe], I hadn't noticed that DBException is a runtime exception,

[jira] [Commented] (YARN-4924) NM recovery race can lead to container not cleaned up

2016-04-11 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15236117#comment-15236117 ] Jason Lowe commented on YARN-4924: -- org.iq80.leveldb.DBException (the one we're interested in catching) is

[jira] [Commented] (YARN-4924) NM recovery race can lead to container not cleaned up

2016-04-11 Thread sandflee (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15236115#comment-15236115 ] sandflee commented on YARN-4924: In case createWriteBatch throws a runtime exception, it seems safer to

[jira] [Commented] (YARN-4924) NM recovery race can lead to container not cleaned up

2016-04-11 Thread sandflee (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15236072#comment-15236072 ] sandflee commented on YARN-4924: According to the DB interface, createWriteBatch does not throw an exception.

[jira] [Commented] (YARN-4924) NM recovery race can lead to container not cleaned up

2016-04-11 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15236015#comment-15236015 ] Jason Lowe commented on YARN-4924: -- Thanks for updating the patch! cleanupKeysWithPrefix can now let the

[jira] [Commented] (YARN-4924) NM recovery race can lead to container not cleaned up

2016-04-11 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15235452#comment-15235452 ] Hadoop QA commented on YARN-4924: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Commented] (YARN-4924) NM recovery race can lead to container not cleaned up

2016-04-11 Thread sandflee (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15235369#comment-15235369 ] sandflee commented on YARN-4924: thanks [~jlowe], I added @Deprecated to FINISHED_APP_KEY_PREFIX, but

[jira] [Commented] (YARN-4924) NM recovery race can lead to container not cleaned up

2016-04-11 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15235212#comment-15235212 ] Jason Lowe commented on YARN-4924: -- Thanks for updating the patch! It may not be clear to others reading

[jira] [Commented] (YARN-4924) NM recovery race can lead to container not cleaned up

2016-04-08 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233153#comment-15233153 ] Hadoop QA commented on YARN-4924: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Commented] (YARN-4924) NM recovery race can lead to container not cleaned up

2016-04-08 Thread sandflee (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233085#comment-15233085 ] sandflee commented on YARN-4924: {quote} I don't think removeDeprecatedKeys is an appropriate API in the

[jira] [Commented] (YARN-4924) NM recovery race can lead to container not cleaned up

2016-04-08 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232430#comment-15232430 ] Jason Lowe commented on YARN-4924: -- Thanks for the patch! I don't think removeDeprecatedKeys is an

[jira] [Commented] (YARN-4924) NM recovery race can lead to container not cleaned up

2016-04-08 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231884#comment-15231884 ] Hadoop QA commented on YARN-4924: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Commented] (YARN-4924) NM recovery race can lead to container not cleaned up

2016-04-07 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230248#comment-15230248 ] Jason Lowe commented on YARN-4924: -- Yeah, now that the NM registers with the list of apps it thinks are

[jira] [Commented] (YARN-4924) NM recovery race can lead to container not cleaned up

2016-04-06 Thread sandflee (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15229486#comment-15229486 ] sandflee commented on YARN-4924: Thanks [~nroberts]. Another thought: it seems it's not necessary for the NM to

[jira] [Commented] (YARN-4924) NM recovery race can lead to container not cleaned up

2016-04-06 Thread Nathan Roberts (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15228258#comment-15228258 ] Nathan Roberts commented on YARN-4924: -- Sorry [~sandflee]. I missed your comment about updating

[jira] [Commented] (YARN-4924) NM recovery race can lead to container not cleaned up

2016-04-06 Thread Nathan Roberts (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15228243#comment-15228243 ] Nathan Roberts commented on YARN-4924: -- Thanks [~sandflee], [~jlowe] for the suggestion. I'll work up

[jira] [Commented] (YARN-4924) NM recovery race can lead to container not cleaned up

2016-04-06 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15228198#comment-15228198 ] Jason Lowe commented on YARN-4924: -- I agree with [~sandflee] that postponing the finish app event dispatch

[jira] [Commented] (YARN-4924) NM recovery race can lead to container not cleaned up

2016-04-05 Thread sandflee (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15227703#comment-15227703 ] sandflee commented on YARN-4924: In YARN-4051, we also had a container leak from the NEW to DONE transition

[jira] [Commented] (YARN-4924) NM recovery race can lead to container not cleaned up

2016-04-05 Thread Nathan Roberts (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15226956#comment-15226956 ] Nathan Roberts commented on YARN-4924: -- Observed the following race with NM recovery. 1)