[jira] [Commented] (YARN-9568) NPE in MiniYarnCluster during FileSystemNodeAttributeStore.recover

2023-11-07 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17783707#comment-17783707
 ] 

Ayush Saxena commented on YARN-9568:


I believe this fix breaks the Windows test (HDFS-17246).

> NPE in MiniYarnCluster during FileSystemNodeAttributeStore.recover
> --
>
> Key: YARN-9568
> URL: https://issues.apache.org/jira/browse/YARN-9568
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, test
>Affects Versions: 3.3.0
> Environment: macos
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: YARN-9568.001.patch, YARN-9568.002.patch, npe.log
>
>
> This seems new in trunk. As in "wasn't happening a couple of weeks ago". It's 
> surfacing in the S3A committer tests which are trying to create 
> MiniYarnClusters: all such tests are failing as the mini yarn cluster won't 
> come up with an NPE in {{FileSystemNodeAttributeStore.recover}}
> I'm not sure why node labels are needed on test clusters; the default implies 
> they should be off anyway.
> At the same time, I can't seem to find one specific change in the git log to 
> say "this is causing the problem".



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9568) NPE in MiniYarnCluster during FileSystemNodeAttributeStore.recover

2019-07-18 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16888140#comment-16888140
 ] 

Hudson commented on YARN-9568:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16951 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16951/])
YARN-9568. Fixed NPE in MiniYarnCluster during (eyang: rev 
c34ceb5fde9f6d3d692640eb2a27d97990f17350)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java





[jira] [Commented] (YARN-9568) NPE in MiniYarnCluster during FileSystemNodeAttributeStore.recover

2019-07-18 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16888132#comment-16888132
 ] 

Eric Yang commented on YARN-9568:
-

+1 for Pull Request #839.  Thank you [~ste...@apache.org] for the patch.
Thank you [~bibinchundatt] for the review.

> NPE in MiniYarnCluster during FileSystemNodeAttributeStore.recover
> --
>
> Key: YARN-9568
> URL: https://issues.apache.org/jira/browse/YARN-9568
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, test
>Affects Versions: 3.3.0
> Environment: macos
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Attachments: YARN-9568.001.patch, YARN-9568.002.patch, npe.log
>
>
> This seems new in trunk. As in "wasn't happening a couple of weeks ago". It's 
> surfacing in the S3A committer tests which are trying to create 
> MiniYarnClusters: all such tests are failing as the mini yarn cluster won't 
> come up with an NPE in {{FileSystemNodeAttributeStore.recover}}
> I'm not sure why node labels are needed on test clusters; the default implies 
> they should be off anyway.
> At the same time, I can't seem to find one specific change in the git log to 
> say "this is causing the problem".






[jira] [Commented] (YARN-9568) NPE in MiniYarnCluster during FileSystemNodeAttributeStore.recover

2019-05-21 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844782#comment-16844782
 ] 

Hadoop QA commented on YARN-9568:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  1s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 41s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m  0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 20s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 14s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m  0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 12s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  2m 59s{color} | {color:red} hadoop-yarn-server-tests in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 46m 18s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9568 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12969268/YARN-9568.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 698f0136f5b7 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / c1d7d68 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| unit | https://builds.apa

[jira] [Commented] (YARN-9568) NPE in MiniYarnCluster during FileSystemNodeAttributeStore.recover

2019-05-21 Thread Bibin A Chundatt (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844733#comment-16844733
 ] 

Bibin A Chundatt commented on YARN-9568:


Updated the patch as per your comment.

> NPE in MiniYarnCluster during FileSystemNodeAttributeStore.recover
> --
>
> Key: YARN-9568
> URL: https://issues.apache.org/jira/browse/YARN-9568
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, test
>Affects Versions: 3.3.0
> Environment: macos
>Reporter: Steve Loughran
>Priority: Minor
> Attachments: YARN-9568.001.patch, YARN-9568.002.patch, npe.log
>
>
> This seems new in trunk. As in "wasn't happening a couple of weeks ago". It's 
> surfacing in the S3A committer tests which are trying to create 
> MiniYarnClusters: all such tests are failing as the mini yarn cluster won't 
> come up with an NPE in {{FileSystemNodeAttributeStore.recover}}
> I'm not sure why node labels are needed on test clusters; the default implies 
> they should be off anyway.
> At the same time, I can't seem to find one specific change in the git log to 
> say "this is causing the problem".






[jira] [Commented] (YARN-9568) NPE in MiniYarnCluster during FileSystemNodeAttributeStore.recover

2019-05-21 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844711#comment-16844711
 ] 

Steve Loughran commented on YARN-9568:
--

bq. For MiniYarnCluster we could configure the node attribute store path to a 
unique folder inside targetWorkDir. 
This could solve the issue, right?

That's what I was thinking; something like:
{code}
// to ensure that any FileSystemNodeAttributeStore started by the RM always
// uses a unique path, if unset, force it under the test dir.
if (conf.get(YarnConfiguration.FS_NODE_ATTRIBUTE_STORE_ROOT_DIR) == null) {
  File nodeAttrDir = new File(getTestWorkDir(), "nodeattributes");
  conf.set(YarnConfiguration.FS_NODE_ATTRIBUTE_STORE_ROOT_DIR,
      nodeAttrDir.getCanonicalPath());
}
{code}

The patch as submitted doesn't work, as 
{{NodeAttributeTestUtils.getRandomDirConf}} creates a new configuration object; 
it is the shared configuration of the RM which needs to be patched.
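
A minimal sketch of why it matters which object gets patched, using {{java.util.Properties}} as a stand-in for the Hadoop configuration. The class name, the key name and both helper methods are hypothetical and exist only for illustration; this is not the YARN API and not the submitted patch.

```java
import java.util.Properties;

/** Illustrative only: Properties stands in for the RM's shared configuration. */
final class SharedConfSketch {

  // Hypothetical key name, chosen for this sketch only.
  static final String STORE_DIR_KEY = "example.node-attribute.store.root-dir";

  /** Sets the directory on a fresh copy; the caller's object is left untouched. */
  static Properties withDirOnCopy(Properties conf, String dir) {
    Properties copy = new Properties();
    copy.putAll(conf);
    copy.setProperty(STORE_DIR_KEY, dir);
    return copy;
  }

  /** Sets the directory on the shared object itself, but only if it is unset. */
  static void setIfUnset(Properties shared, String dir) {
    if (shared.getProperty(STORE_DIR_KEY) == null) {
      shared.setProperty(STORE_DIR_KEY, dir);
    }
  }

  public static void main(String[] args) {
    Properties shared = new Properties();

    // Patching a copy: whoever reads 'shared' afterwards still sees no value.
    withDirOnCopy(shared, "/tmp/copy-dir");
    System.out.println("after copy:   " + shared.getProperty(STORE_DIR_KEY));

    // Patching the shared object: later readers of 'shared' see the value.
    setIfUnset(shared, "/tmp/shared-dir");
    System.out.println("after shared: " + shared.getProperty(STORE_DIR_KEY));
  }
}
```

Any component that was handed the original object only observes the second kind of change, which is the point being made about the RM's configuration.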

> NPE in MiniYarnCluster during FileSystemNodeAttributeStore.recover
> --
>
> Key: YARN-9568
> URL: https://issues.apache.org/jira/browse/YARN-9568
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, test
>Affects Versions: 3.3.0
> Environment: macos
>Reporter: Steve Loughran
>Priority: Minor
> Attachments: YARN-9568.001.patch, npe.log
>
>
> This seems new in trunk. As in "wasn't happening a couple of weeks ago". It's 
> surfacing in the S3A committer tests which are trying to create 
> MiniYarnClusters: all such tests are failing as the mini yarn cluster won't 
> come up with an NPE in {{FileSystemNodeAttributeStore.recover}}
> I'm not sure why node labels are needed on test clusters; the default implies 
> they should be off anyway.
> At the same time, I can't seem to find one specific change in the git log to 
> say "this is causing the problem".






[jira] [Commented] (YARN-9568) NPE in MiniYarnCluster during FileSystemNodeAttributeStore.recover

2019-05-21 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844629#comment-16844629
 ] 

Hadoop QA commented on YARN-9568:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 59s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m  0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 20s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m  6s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m  0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 13s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  3m 11s{color} | {color:red} hadoop-yarn-server-tests in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 28s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 48m 45s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9568 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12969235/YARN-9568.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux ba41f3bdfc1c 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 1cb2eb0 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| unit | https://builds.apa

[jira] [Commented] (YARN-9568) NPE in MiniYarnCluster during FileSystemNodeAttributeStore.recover

2019-05-20 Thread Bibin A Chundatt (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844556#comment-16844556
 ] 

Bibin A Chundatt commented on YARN-9568:


[~ste...@apache.org]

Nodelabel had a configuration to enable or disable the store. The NodeAttributes 
store is enabled by default [~sunilg]/[~cheersyang]


{quote}
Any error in loading should be treated as no data to recover
{quote}
The old nodelabel store had the same behaviour: any invalid file used to fail RM 
startup. The logs really need improvement.

For MiniYarnCluster we could configure the node attribute store path to a 
unique folder inside targetWorkDir. 
This could solve the issue, right?

> NPE in MiniYarnCluster during FileSystemNodeAttributeStore.recover
> --
>
> Key: YARN-9568
> URL: https://issues.apache.org/jira/browse/YARN-9568
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, test
>Affects Versions: 3.3.0
> Environment: macos
>Reporter: Steve Loughran
>Priority: Minor
> Attachments: npe.log
>
>
> This seems new in trunk. As in "wasn't happening a couple of weeks ago". It's 
> surfacing in the S3A committer tests which are trying to create 
> MiniYarnClusters: all such tests are failing as the mini yarn cluster won't 
> come up with an NPE in {{FileSystemNodeAttributeStore.recover}}
> I'm not sure why node labels are needed on test clusters; the default implies 
> they should be off anyway.
> At the same time, I can't seem to find one specific change in the git log to 
> say "this is causing the problem".






[jira] [Commented] (YARN-9568) NPE in MiniYarnCluster during FileSystemNodeAttributeStore.recover

2019-05-20 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844227#comment-16844227
 ] 

Steve Loughran commented on YARN-9568:
--

Looking at this more, there looks to be a bad assumption in the whole recovery 
logic: that the data in the files can always be recovered. Any error in loading 
should be treated as no data to recover.
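
A rough sketch of that policy, assuming a store whose state is just lines in a file. {{TolerantRecoverySketch}}, {{recoverOrEmpty}} and the file names are hypothetical, and this is not the {{FileSystemNodeAttributeStore}} code. Runtime exceptions are caught as well as I/O errors because the failure reported in this issue is a NullPointerException.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustrative only: a recovery step that never fails startup on bad data. */
final class TolerantRecoverySketch {

  /** Returns the saved entries, or an empty list if the file is missing or unreadable. */
  static List<String> recoverOrEmpty(Path storeFile) {
    if (!Files.isRegularFile(storeFile)) {
      return Collections.emptyList();
    }
    try {
      // readAllLines decodes as UTF-8 and throws on malformed input.
      return Files.readAllLines(storeFile);
    } catch (IOException | RuntimeException e) {
      // Log with the path so the cause is visible, then carry on with no data.
      System.err.println("Ignoring unreadable store file " + storeFile + ": " + e);
      return Collections.emptyList();
    }
  }

  /** Recovers from a valid file, a corrupt file and a missing file, in that order. */
  static List<List<String>> runDemo() {
    try {
      Path dir = Files.createTempDirectory("recovery-sketch");
      Path good = dir.resolve("good.mirror");
      Files.write(good, Collections.singletonList("node1=attrA"));
      Path bad = dir.resolve("bad.mirror");
      // 0xC3 followed by 0x28 is not a valid UTF-8 sequence.
      Files.write(bad, new byte[] {(byte) 0xC3, (byte) 0x28});

      List<List<String>> results = new ArrayList<>();
      results.add(recoverOrEmpty(good));
      results.add(recoverOrEmpty(bad));
      results.add(recoverOrEmpty(dir.resolve("missing.mirror")));
      return results;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(runDemo());
  }
}
```

Whether silently discarding state is acceptable outside of tests is a separate question; the sketch only shows the "treat as empty, but log it" shape.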




[jira] [Commented] (YARN-9568) NPE in MiniYarnCluster during FileSystemNodeAttributeStore.recover

2019-05-20 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844223#comment-16844223
 ] 

Steve Loughran commented on YARN-9568:
--

I can make this "go away" by rm'ing everything in /tmp/hadoop-yarn-stevel/*

That is: the state of a single shared path can break all unit tests running 
locally. And presumably, in production, it can cause RM startup to fail with 
not very meaningful error text.


Proposed:
* init code handles unreadable files somehow
* for the minicluster we don't use a fixed location for the files, as with 
parallel test runs it's inevitable that eventually they will end up in a 
corrupted state
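
For the second proposal, a minimal sketch of handing each cluster instance its own directory rather than one fixed path. {{UniqueStoreDirSketch}}, {{newStoreDir}} and the directory names are hypothetical; only {{java.nio.file}} calls are used, nothing from MiniYARNCluster.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Illustrative only: one fresh store directory per cluster instance. */
final class UniqueStoreDirSketch {

  /** Creates and returns a new, uniquely named directory under testWorkDir. */
  static Path newStoreDir(Path testWorkDir) {
    try {
      Files.createDirectories(testWorkDir);
      // createTempDirectory picks a name that does not exist yet, so
      // concurrent callers never share a directory.
      return Files.createTempDirectory(testWorkDir, "nodeattributes-");
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // Assumed work dir for the demo; a real test would use its own target dir.
    Path workDir = Paths.get(System.getProperty("java.io.tmpdir"), "minicluster-sketch");
    Path first = newStoreDir(workDir);
    Path second = newStoreDir(workDir);
    System.out.println(first);
    System.out.println(second);
    System.out.println("distinct: " + !first.equals(second));
  }
}
```

With a layout like this, leftover or corrupted state from one run cannot be picked up by the next, at the cost of needing cleanup of the work dir.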

> NPE in MiniYarnCluster during FileSystemNodeAttributeStore.recover
> --
>
> Key: YARN-9568
> URL: https://issues.apache.org/jira/browse/YARN-9568
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, test
>Affects Versions: 3.3.0
> Environment: macos
>Reporter: Steve Loughran
>Priority: Major
> Attachments: npe.log
>
>
> This seems new in trunk. As in "wasn't happening a couple of weeks ago". It's 
> surfacing in the S3A committer tests which are trying to create 
> MiniYarnClusters: all such tests are failing as the mini yarn cluster won't 
> come up with an NPE in {{FileSystemNodeAttributeStore.recover}}
> I'm not sure why node labels are needed on test clusters; the default implies 
> they should be off anyway.
> At the same time, I can't seem to find one specific change in the git log to 
> say "this is causing the problem".






[jira] [Commented] (YARN-9568) NPE in MiniYarnCluster during FileSystemNodeAttributeStore.recover

2019-05-20 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844201#comment-16844201
 ] 

Steve Loughran commented on YARN-9568:
--

attached full log




[jira] [Commented] (YARN-9568) NPE in MiniYarnCluster during FileSystemNodeAttributeStore.recover

2019-05-20 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844138#comment-16844138
 ] 

Steve Loughran commented on YARN-9568:
--

{code}

org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.NullPointerException
    at org.apache.hadoop.yarn.server.MiniYARNCluster.startResourceManager(MiniYARNCluster.java:373)
    at org.apache.hadoop.yarn.server.MiniYARNCluster.access$300(MiniYARNCluster.java:128)
    at org.apache.hadoop.yarn.server.MiniYARNCluster$ResourceManagerWrapper.serviceStart(MiniYARNCluster.java:503)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
    at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
    at org.apache.hadoop.yarn.server.MiniYARNCluster.serviceStart(MiniYARNCluster.java:322)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
    at org.apache.hadoop.fs.s3a.yarn.ITestS3AMiniYarnCluster.setup(ITestS3AMiniYarnCluster.java:84)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
    at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
    at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.yarn.server.api.protocolrecords.impl.pb.NodesToAttributesMappingRequestPBImpl.initNodeAttributesMapping(NodesToAttributesMappingRequestPBImpl.java:102)
    at org.apache.hadoop.yarn.server.api.protocolrecords.impl.pb.NodesToAttributesMappingRequestPBImpl.getNodesToAttributes(NodesToAttributesMappingRequestPBImpl.java:117)
    at org.apache.hadoop.yarn.nodelabels.store.op.FSNodeStoreLogOp.getNodeToAttributesMap(FSNodeStoreLogOp.java:46)
    at org.apache.hadoop.yarn.nodelabels.store.op.NodeAttributeMirrorOp.recover(NodeAttributeMirrorOp.java:57)
    at org.apache.hadoop.yarn.nodelabels.store.op.NodeAttributeMirrorOp.recover(NodeAttributeMirrorOp.java:35)
    at org.apache.hadoop.yarn.nodelabels.store.AbstractFSNodeStore.loadFromMirror(AbstractFSNodeStore.java:121)
    at org.apache.hadoop.yarn.nodelabels.store.AbstractFSNodeStore.recoverFromStore(AbstractFSNodeStore.java:150)
    at org.apache.hadoop.yarn.server.resourcemanager.nodelabels.FileSystemNodeAttributeStore.recover(FileSystemNodeAttributeStore.java:95)
    at org.apache.hadoop.yarn.server.resourcemanager.nodelabels.NodeAttributesManagerImpl.initNodeAttributeStore(NodeAttributesManagerImpl.java:140)
    at org.apache.hadoop.yarn.server.resourcemanager.nodelabels.NodeAttributesManagerImpl.serviceStart(NodeAttributesManagerImpl.java:123)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
    at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:918)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1285)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1326)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1322)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1322)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1373)
    at org.apache.hadoop.service