[jira] [Commented] (YARN-9193) NullPointerException happens in RM while shutdown a NM

2019-01-12 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16741476#comment-16741476 ] lujie commented on YARN-9193: - According to the RM log, I have found the reason the NM is shuttin

[jira] [Updated] (YARN-9194) Invalid event: REGISTERED at FAILED

2019-01-13 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9194: Attachment: YARN-9194_3.patch > Invalid event: REGISTERED at FAILED > --- > >

[jira] [Commented] (YARN-9194) Invalid event: REGISTERED at FAILED

2019-01-14 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16742175#comment-16742175 ] lujie commented on YARN-9194: - Hi [~wilfreds], thanks for your kind review! {code:java} That

[jira] [Updated] (YARN-9194) Invalid event: REGISTERED at FAILED

2019-01-14 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9194: Attachment: YARN-9194_4.patch > Invalid event: REGISTERED at FAILED > --- > >

[jira] [Comment Edited] (YARN-9193) NullPointerException happens in RM while shutdown a NM

2019-01-14 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16742197#comment-16742197 ] lujie edited comment on YARN-9193 at 1/14/19 3:20 PM: -- it will be fix

[jira] [Resolved] (YARN-9193) NullPointerException happens in RM while shutdown a NM

2019-01-14 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie resolved YARN-9193. - Resolution: Duplicate It will be fixed in [YARN-9194|https://issues.apache.org/jira/browse/YARN-9194]

[jira] [Commented] (YARN-9194) Invalid event: REGISTERED at FAILED, and NullPointerException happens in RM while shutdown a NM

2019-01-15 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16743585#comment-16743585 ] lujie commented on YARN-9194: - Hi:[~wilfreds] {code:java} Should the check not be done before

[jira] [Updated] (YARN-9194) Invalid event: REGISTERED at FAILED, and NullPointerException happens in RM while shutdown a NM

2019-01-16 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9194: Attachment: YARN-9194_5.patch > Invalid event: REGISTERED at FAILED, and NullPointerException happens in RM > while

[jira] [Commented] (YARN-9194) Invalid event: REGISTERED at FAILED, and NullPointerException happens in RM while shutdown a NM

2019-01-16 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16743812#comment-16743812 ] lujie commented on YARN-9194: - Submitting the new patch that can prevent  SchedulerApplication

[jira] [Created] (YARN-9201) RMAppAttemptImpl: Invalid event: LAUNCH_FAILED at FAILED

2019-01-16 Thread lujie (JIRA)
lujie created YARN-9201: --- Summary: RMAppAttemptImpl: Invalid event: LAUNCH_FAILED at FAILED Key: YARN-9201 URL: https://issues.apache.org/jira/browse/YARN-9201 Project: Hadoop YARN Issue Type: Bug

[jira] [Updated] (YARN-9201) RMAppAttemptImpl: Invalid event: LAUNCH_FAILED at FAILED

2019-01-16 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9201: Attachment: hadoop-hires-resourcemanager-hadoop11.log > RMAppAttemptImpl: Invalid event: LAUNCH_FAILED at FAILED > -

[jira] [Comment Edited] (YARN-9194) Invalid event: REGISTERED at FAILED, and NullPointerException happens in RM while shutdown a NM

2019-01-16 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16743964#comment-16743964 ] lujie edited comment on YARN-9194 at 1/16/19 12:20 PM: --- Our cluster

[jira] [Commented] (YARN-9194) Invalid event: REGISTERED at FAILED, and NullPointerException happens in RM while shutdown a NM

2019-01-16 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16743964#comment-16743964 ] lujie commented on YARN-9194: - Our cluster hit another error: [YARN-9201|https://issues

[jira] [Updated] (YARN-9201) RMAppAttemptImpl: Invalid event: LAUNCH_FAILED at FAILED

2019-01-16 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9201: Description: When a node is removed, the RM kills the application and changes its state to failed. AMLauncher can't not

[jira] [Commented] (YARN-9201) RMAppAttemptImpl: Invalid event: LAUNCH_FAILED at FAILED

2019-01-16 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16743965#comment-16743965 ] lujie commented on YARN-9201: - I will fix it in YARN-9201 > RMAppAttemptImpl: Invalid event:

[jira] [Resolved] (YARN-9201) RMAppAttemptImpl: Invalid event: LAUNCH_FAILED at FAILED

2019-01-16 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie resolved YARN-9201. - Resolution: Duplicate Assignee: lujie > RMAppAttemptImpl: Invalid event: LAUNCH_FAILED at FAILED > -

[jira] [Comment Edited] (YARN-9194) Invalid event: REGISTERED at FAILED, and NullPointerException happens in RM while shutdown a NM

2019-01-16 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16743964#comment-16743964 ] lujie edited comment on YARN-9194 at 1/16/19 12:26 PM: --- Our cluster

[jira] [Updated] (YARN-9194) Invalid event: REGISTERED at FAILED, and NullPointerException happens in RM while shutdown a NM

2019-01-16 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9194: Attachment: YARN-9194_6.patch > Invalid event: REGISTERED at FAILED, and NullPointerException happens in RM > while

[jira] [Updated] (YARN-9194) Invalid event: REGISTERED and LAUNCH_FAILED at FAILED, and NullPointerException happens in RM while shutdown a NM

2019-01-16 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9194: Summary: Invalid event: REGISTERED and LAUNCH_FAILED at FAILED, and NullPointerException happens in RM while shutdow

[jira] [Comment Edited] (YARN-9194) Invalid event: REGISTERED and LAUNCH_FAILED at FAILED, and NullPointerException happens in RM while shutdown a NM

2019-01-16 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16743964#comment-16743964 ] lujie edited comment on YARN-9194 at 1/17/19 2:45 AM: -- Our cluster oc

[jira] [Comment Edited] (YARN-9194) Invalid event: REGISTERED and LAUNCH_FAILED at FAILED, and NullPointerException happens in RM while shutdown a NM

2019-01-16 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16743964#comment-16743964 ] lujie edited comment on YARN-9194 at 1/17/19 2:46 AM: -- Our cluster oc

[jira] [Commented] (YARN-9194) Invalid event: REGISTERED and LAUNCH_FAILED at FAILED, and NullPointerException happens in RM while shutdown a NM

2019-01-16 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16744630#comment-16744630 ] lujie commented on YARN-9194: - Hi: [~leftnoteasy] [~wilfreds]  The UT failure seems to not be

[jira] [Comment Edited] (YARN-9201) RMAppAttemptImpl: Invalid event: LAUNCH_FAILED at FAILED

2019-01-16 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16743965#comment-16743965 ] lujie edited comment on YARN-9201 at 1/17/19 3:00 AM: -- I will fix it

[jira] [Updated] (YARN-9201) RMAppAttemptImpl: Invalid event: LAUNCH_FAILED at FAILED

2019-01-16 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9201: Description: When a node is removed, the RM kills the application and changes its state to failed. AMLauncher can't not

[jira] [Updated] (YARN-9201) RMAppAttemptImpl: Invalid event: LAUNCH_FAILED at FAILED

2019-01-16 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9201: Description: When a node is removed, the RM kills the application and changes its state to failed. AMLauncher can't not

[jira] [Created] (YARN-9223) NPE happens in NM while loading recovery fails

2019-01-22 Thread lujie (JIRA)
lujie created YARN-9223: --- Summary: NPE happens in NM while loading recovery fails Key: YARN-9223 URL: https://issues.apache.org/jira/browse/YARN-9223 Project: Hadoop YARN Issue Type: Bug Re

[jira] [Updated] (YARN-9223) NPE happens in NM while loading recovery fails

2019-01-22 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9223: Attachment: hadoop-hd-nodemanager-slave1.log > NPE happens in NM while loading recovery fails >

[jira] [Updated] (YARN-9223) NPE happens in NM while loading recovery fails

2019-01-23 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9223: Description: In org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit: {code:java} try { initAndStartR

[jira] [Created] (YARN-9226) NPE while YarnChild shudown

2019-01-23 Thread lujie (JIRA)
lujie created YARN-9226: --- Summary: NPE while YarnChild shudown Key: YARN-9226 URL: https://issues.apache.org/jira/browse/YARN-9226 Project: Hadoop YARN Issue Type: Bug Reporter: lujie In

[jira] [Assigned] (YARN-9226) NPE while YarnChild shudown

2019-01-23 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie reassigned YARN-9226: --- Assignee: lujie > NPE while YarnChild shudown > --- > > Key: YARN-922

[jira] [Updated] (YARN-9226) NPE while YarnChild shudown

2019-01-23 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9226: Attachment: syslog > NPE while YarnChild shudown > --- > > Key: YARN-9226 >

[jira] [Comment Edited] (YARN-9226) NPE while YarnChild shudown

2019-01-23 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16749921#comment-16749921 ] lujie edited comment on YARN-9226 at 1/23/19 12:50 PM: --- Attach the 

[jira] [Updated] (YARN-9226) NPE while YarnChild shudown

2019-01-23 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9226: Priority: Minor (was: Major) > NPE while YarnChild shudown > --- > > Key: Y

[jira] [Commented] (YARN-9226) NPE while YarnChild shudown

2019-01-23 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16749921#comment-16749921 ] lujie commented on YARN-9226: - Attach the corresponding log. This bug will be triggered while

[jira] [Updated] (YARN-9238) An huge Data Race can make we get a wrong attempt by an appAttemptId

2019-01-25 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9238: Description: We have found a data race that can make an odd situation. See  org.apache.hadoop.yarn.server.resourcema

[jira] [Updated] (YARN-9238) An huge Data Race can make we get a wrong attempt by an appAttemptId

2019-01-25 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9238: Description: We have found a data race that can make an odd situation. See  org.apache.hadoop.yarn.server.resourcema

[jira] [Updated] (YARN-9238) An huge Data Race can make we get a wrong attempt by an appAttemptId

2019-01-25 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9238: Description: We have found a data race that can make an odd situation. See  org.apache.hadoop.yarn.server.resourcema

[jira] [Created] (YARN-9238) An huge Data Race can make we get a wrong attempt by an appAttemptId

2019-01-25 Thread lujie (JIRA)
lujie created YARN-9238: --- Summary: An huge Data Race can make we get a wrong attempt by an appAttemptId Key: YARN-9238 URL: https://issues.apache.org/jira/browse/YARN-9238 Project: Hadoop YARN Issue

[jira] [Updated] (YARN-9238) An Data Race can make we get a wrong attempt by an appAttemptId

2019-01-25 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9238: Summary: An Data Race can make we get a wrong attempt by an appAttemptId (was: An huge Data Race can make we get a

[jira] [Updated] (YARN-9238) We get a wrong attempt by an appAttemptId when AM crash at some point

2019-01-25 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9238: Summary: We get a wrong attempt by an appAttemptId when AM crash at some point (was: An Data Race can make we get

[jira] [Assigned] (YARN-9238) We get a wrong attempt by an appAttemptId when AM crash at some point

2019-01-25 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie reassigned YARN-9238: --- Assignee: lujie > We get a wrong attempt by an appAttemptId when AM crash at some point > --

[jira] [Updated] (YARN-9238) We get a wrong attempt by an appAttemptId when AM crash at some point

2019-01-25 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9238: Description: We have found a data race that can make an odd situation. See  org.apache.hadoop.yarn.server.resourcema

[jira] [Updated] (YARN-9238) We get a wrong attempt by an appAttemptId when AM crash at some point

2019-01-25 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9238: Attachment: hadoop-test-resourcemanager-hadoop11.log > We get a wrong attempt by an appAttemptId when AM crash at s

[jira] [Updated] (YARN-9238) We get a wrong attempt by an appAttemptId when AM crash at some point

2019-01-25 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9238: Description: We have found a data race that can make an odd situation. See  org.apache.hadoop.yarn.server.resourcema

[jira] [Updated] (YARN-9238) We get a wrong attempt by an appAttemptId when AM crash at some point

2019-01-25 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9238: Description: We have found a data race that can make an odd situation. See  org.apache.hadoop.yarn.server.resourcema

[jira] [Updated] (YARN-9238) We get a wrong attempt by an appAttemptId when AM crash at some point

2019-01-25 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9238: Description: We have found a data race that can make an odd situation. See  org.apache.hadoop.yarn.server.resourcema

[jira] [Updated] (YARN-9238) We get a wrong attempt by an appAttemptId when AM crash at some point

2019-01-25 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9238: Description: We have found a data race that can make an odd situation. See  org.apache.hadoop.yarn.server.resourcema

[jira] [Updated] (YARN-9238) We get a wrong attempt by an appAttemptId when AM crash at some point

2019-01-25 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9238: Description: We have found a data race that can make an odd situation. See  org.apache.hadoop.yarn.server.resourcem

[jira] [Updated] (YARN-9238) We get a wrong attempt by an appAttemptId when AM crash at some point

2019-01-25 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9238: Description: We have found a data race that can make an odd situation. See  org.apache.hadoop.yarn.server.resourcema

[jira] [Updated] (YARN-9238) We get a wrong attempt by an appAttemptId when AM crash at some point

2019-01-25 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9238: Description: We have found a data race that can make an odd situation. See  org.apache.hadoop.yarn.server.resourcem

[jira] [Updated] (YARN-9238) We get a wrong attempt by an appAttemptId when AM crash at some point

2019-01-25 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9238: Description: We have found a data race that can make an odd situation. See  org.apache.hadoop.yarn.server.resourcem

[jira] [Updated] (YARN-9238) We get a wrong attempt by an appAttemptId when AM crash at some point

2019-01-25 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9238: Description: We have found a data race that can make an odd situation. See  org.apache.hadoop.yarn.server.resourcem

[jira] [Updated] (YARN-9238) We get a wrong attempt by an appAttemptId when AM crash at some point

2019-01-25 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9238: Description: We have found a data race that can make an odd situation. See  org.apache.hadoop.yarn.server.resourcem

[jira] [Updated] (YARN-9238) We get a wrong attempt by an appAttemptId when AM crash at some point

2019-01-25 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9238: Description: We have found a data race that can make an odd situation. See  org.apache.hadoop.yarn.server.resourcem

[jira] [Updated] (YARN-9238) We get a wrong attempt by an appAttemptId when AM crash at some point

2019-01-25 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9238: Description: We have found a data race that can make an odd situation. See  org.apache.hadoop.yarn.server.resourcem

[jira] [Updated] (YARN-9238) We get a wrong attempt by an appAttemptId when AM crash at some point

2019-01-25 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9238: Description: We have found a data race that can make an odd situation. See  org.apache.hadoop.yarn.server.resourcem

[jira] [Updated] (YARN-9238) We get a wrong attempt by an appAttemptId when AM crash at some point

2019-01-25 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9238: Description: We have found a data race that can make an odd situation. See  org.apache.hadoop.yarn.server.resourcem

[jira] [Updated] (YARN-9238) We get a wrong attempt by an appAttemptId when AM crash at some point

2019-01-25 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9238: Description: We have found a data race that can make an odd situation. See  org.apache.hadoop.yarn.server.resourcem

[jira] [Updated] (YARN-9238) We get a wrong attempt by an appAttemptId when AM crash at some point

2019-01-25 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9238: Description: We have found a data race that can make an odd situation. See  org.apache.hadoop.yarn.server.resourcem

[jira] [Updated] (YARN-9238) We get a wrong attempt by an appAttemptId when AM crash at some point

2019-01-25 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9238: Description: We have found a data race that can make an odd situation. See  org.apache.hadoop.yarn.server.resourcem

[jira] [Updated] (YARN-9238) We get a wrong attempt by an appAttemptId when AM crash at some point

2019-01-25 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9238: Description: We have found a data race that can make an odd situation. See  org.apache.hadoop.yarn.server.resourcem

[jira] [Updated] (YARN-9238) We get a wrong attempt by an appAttemptId when AM crash at some point

2019-01-25 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9238: Description: We have found a data race that can make an odd situation. See  org.apache.hadoop.yarn.server.resourcem

[jira] [Comment Edited] (YARN-9226) NPE while YarnChild shudown

2019-01-25 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16752936#comment-16752936 ] lujie edited comment on YARN-9226 at 1/26/19 5:01 AM: -- I think we sho

[jira] [Comment Edited] (YARN-9226) NPE while YarnChild shudown

2019-01-25 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16752936#comment-16752936 ] lujie edited comment on YARN-9226 at 1/26/19 5:00 AM: -- I think we sho

[jira] [Updated] (YARN-9238) We get a wrong attempt by an appAttemptId when AM crash at some point

2019-01-29 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9238: Attachment: YARN-9238_2.patch > We get a wrong attempt by an appAttemptId when AM crash at some point > ---

[jira] [Commented] (YARN-9238) We get a wrong attempt by an appAttemptId when AM crash at some point

2019-01-29 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754946#comment-16754946 ] lujie commented on YARN-9238: - Hi [~cheersyang], I have deleted the unused code and added the

[jira] [Created] (YARN-9248) RMContainerImpl:Invalid event: ACQUIRED at KILLED

2019-01-29 Thread lujie (JIRA)
lujie created YARN-9248: --- Summary: RMContainerImpl:Invalid event: ACQUIRED at KILLED Key: YARN-9248 URL: https://issues.apache.org/jira/browse/YARN-9248 Project: Hadoop YARN Issue Type: Bug

[jira] [Updated] (YARN-9248) RMContainerImpl:Invalid event: ACQUIRED at KILLED

2019-01-29 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9248: Attachment: YARN-9248_2.patch > RMContainerImpl:Invalid event: ACQUIRED at KILLED >

[jira] [Updated] (YARN-9248) RMContainerImpl:Invalid event: ACQUIRED at KILLED

2019-01-29 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9248: Attachment: YARN-9248_3.patch > RMContainerImpl:Invalid event: ACQUIRED at KILLED >

[jira] [Updated] (YARN-9248) RMContainerImpl:Invalid event: ACQUIRED at KILLED

2019-01-29 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9248: Attachment: YARN-9248_4.patch > RMContainerImpl:Invalid event: ACQUIRED at KILLED >

[jira] [Commented] (YARN-9248) RMContainerImpl:Invalid event: ACQUIRED at KILLED

2019-01-30 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16756138#comment-16756138 ] lujie commented on YARN-9248: - The UT failures are due to TestRMAppAttemptTransitions#testCont

[jira] [Comment Edited] (YARN-9248) RMContainerImpl:Invalid event: ACQUIRED at KILLED

2019-01-30 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16756138#comment-16756138 ] lujie edited comment on YARN-9248 at 1/30/19 2:01 PM: -- The UT failure

[jira] [Issue Comment Deleted] (YARN-9248) RMContainerImpl:Invalid event: ACQUIRED at KILLED

2019-01-30 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9248: Comment: was deleted (was: The UT failure are due to  TestRMAppAttemptTransitions#testContainerRemovedBeforeAllocate

[jira] [Updated] (YARN-9248) RMContainerImpl:Invalid event: ACQUIRED at KILLED

2019-01-30 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9248: Attachment: YARN-9248_5.patch > RMContainerImpl:Invalid event: ACQUIRED at KILLED >

[jira] [Commented] (YARN-9248) RMContainerImpl:Invalid event: ACQUIRED at KILLED

2019-01-30 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16756809#comment-16756809 ] lujie commented on YARN-9248: - I have found the reason why TestRMAppAttemptTransitions#testCo

[jira] [Commented] (YARN-9194) Invalid event: REGISTERED and LAUNCH_FAILED at FAILED, and NullPointerException happens in RM while shutdown a NM

2019-01-31 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757913#comment-16757913 ] lujie commented on YARN-9194: - Hi: [~leftnoteasy] and [~sunilg] Yeah I have found the error,

[jira] [Assigned] (YARN-9262) TestRMAppAttemptTransitions is failing with an NPE

2019-01-31 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie reassigned YARN-9262: --- Assignee: lujie > TestRMAppAttemptTransitions is failing with an NPE > --

[jira] [Commented] (YARN-9262) TestRMAppAttemptTransitions is failing with an NPE

2019-01-31 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757916#comment-16757916 ] lujie commented on YARN-9262: - I am at the airport now; I will provide the patch tonight. > Test

[jira] [Updated] (YARN-9262) TestRMAppAttemptTransitions is failing with an NPE

2019-02-01 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9262: Attachment: YARN-9262_1.patch > TestRMAppAttemptTransitions is failing with an NPE > ---

[jira] [Commented] (YARN-9248) RMContainerImpl:Invalid event: ACQUIRED at KILLED

2019-02-01 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16758226#comment-16758226 ] lujie commented on YARN-9248: - TestRMAppAttemptTransitions#testContainerRemovedBeforeAllocate

[jira] [Updated] (YARN-9248) RMContainerImpl:Invalid event: ACQUIRED at KILLED

2019-02-16 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9248: Attachment: YARN-9248_6.patch > RMContainerImpl:Invalid event: ACQUIRED at KILLED >

[jira] [Commented] (YARN-9248) RMContainerImpl:Invalid event: ACQUIRED at KILLED

2019-02-16 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16770314#comment-16770314 ] lujie commented on YARN-9248: - Ping: I have uploaded a simplified patch and hope for a review > RMCo

[jira] [Commented] (YARN-9238) We get a wrong attempt by an appAttemptId when AM crash at some point

2019-02-16 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16770319#comment-16770319 ] lujie commented on YARN-9238: - Ping: further simplified the unit test in the latest patch a

[jira] [Updated] (YARN-9238) We get a wrong attempt by an appAttemptId when AM crash at some point

2019-02-16 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9238: Attachment: YARN-9238_3.patch > We get a wrong attempt by an appAttemptId when AM crash at some point > ---

[jira] [Updated] (YARN-9238) Allocate on previous or removed or non existent application attempt

2019-02-19 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9238: Summary: Allocate on previous or removed or non existent application attempt (was: We get a wrong attempt by an ap

[jira] [Updated] (YARN-9238) Allocate on previous or removed or non existent application attempt

2019-02-21 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9238: Description: See  org.apache.hadoop.yarn.server.resourcemanager.OpportunisticContainerAllocatorAMService.Opportunist

[jira] [Updated] (YARN-9238) Allocate on previous or removed or non existent application attempt

2019-02-21 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9238: Description: See  org.apache.hadoop.yarn.server.resourcemanager.OpportunisticContainerAllocatorAMService.Opportunist

[jira] [Commented] (YARN-9238) Allocate on previous or removed or non existent application attempt

2019-02-21 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774871#comment-16774871 ] lujie commented on YARN-9238: - Change the title and simplify the description > Allocate on pr

[jira] [Commented] (YARN-9238) Avoid allocating opportunistic containers to previous/removed/non-exist application attempt

2019-02-22 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16775139#comment-16775139 ] lujie commented on YARN-9238: - Hi [~cheersyang], one more thing: could you please review the patch

[jira] [Comment Edited] (YARN-9238) Avoid allocating opportunistic containers to previous/removed/non-exist application attempt

2019-02-22 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16775139#comment-16775139 ] lujie edited comment on YARN-9238 at 2/22/19 1:32 PM: -- Hi:[~cheersyan

[jira] [Commented] (YARN-9248) RMContainerImpl:Invalid event: ACQUIRED at KILLED

2019-02-24 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16776306#comment-16776306 ] lujie commented on YARN-9248: - Hi [~cheersyang] [^YARN-9248_5.patch] is the latest patch with

[jira] [Created] (YARN-9588) InvalidToken: appattempt_XXX not found in AMRMTokenSecretManager while RM reboot

2019-05-29 Thread lujie (JIRA)
lujie created YARN-9588: --- Summary: InvalidToken: appattempt_XXX not found in AMRMTokenSecretManager while RM reboot Key: YARN-9588 URL: https://issues.apache.org/jira/browse/YARN-9588 Project: Hadoop YARN

[jira] [Updated] (YARN-9588) InvalidToken: appattempt_XXX not found in AMRMTokenSecretManager while RM reboot

2019-05-29 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9588: Description: HI: while application is success, but before AM unregistered, RM reboot, then one error happens: {cod

[jira] [Updated] (YARN-9588) InvalidToken: appattempt_XXX not found in AMRMTokenSecretManager while RM reboot

2019-05-29 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9588: Description: HI: while application is success, but before AM unregistered, RM reboot, then one error happens: {cod

[jira] [Updated] (YARN-9588) InvalidToken: appattempt_XXX not found in AMRMTokenSecretManager while RM reboot

2019-05-29 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9588: Description: HI:  RM reboot during AM unregistered,then one error happens: {code:java} 2019-05-29 18:55:11,112 ERRO

[jira] [Updated] (YARN-9588) InvalidToken: appattempt_XXX not found in AMRMTokenSecretManager while RM reboot

2019-05-30 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9588: Attachment: bug.zip > InvalidToken: appattempt_XXX not found in AMRMTokenSecretManager while RM > reboot >

[jira] [Commented] (YARN-9588) InvalidToken: appattempt_XXX not found in AMRMTokenSecretManager while RM reboot

2019-05-30 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16851598#comment-16851598 ] lujie commented on YARN-9588: - Attach the log for analysis > InvalidToken: appattempt_XXX not

[jira] [Updated] (YARN-9588) Resource leak due to InvalidToken: appattempt_XXX not found in AMRMTokenSecretManager while RM reboot

2019-05-30 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9588: Summary: Resource leak due to InvalidToken: appattempt_XXX not found in AMRMTokenSecretManager while RM reboot (was

[jira] [Updated] (YARN-9588) Resource leak due to InvalidToken: appattempt_XXX not found in AMRMTokenSecretManager while RM reboot

2019-05-30 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-9588: Attachment: bugwithNPE.zip > Resource leak due to InvalidToken: appattempt_XXX not found in > AMRMTokenSecretManage

[jira] [Commented] (YARN-9588) Resource leak due to InvalidToken: appattempt_XXX not found in AMRMTokenSecretManager while RM reboot

2019-05-30 Thread lujie (JIRA)
[ https://issues.apache.org/jira/browse/YARN-9588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16851669#comment-16851669 ] lujie commented on YARN-9588: - While reproducing this bug, we also hit an NPE in RM recovery, a
