[jira] [Commented] (YARN-10011) Catch all exception during init app in LogAggregationService

2020-02-04 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17029684#comment-17029684
 ] 

Prabhu Joseph commented on YARN-10011:
--

[~cane] Can you include a testcase as well or Please justify why no new tests 
are needed for this patch. Thanks.

> Catch all exception  during init app in LogAggregationService 
> --
>
> Key: YARN-10011
> URL: https://issues.apache.org/jira/browse/YARN-10011
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: zhoukang
>Assignee: zhoukang
>Priority: Major
> Attachments: YARN-10011-001.patch
>
>
> we should catch all exception during init app in LogAggregationService in 
> case of nm exit 
> {code:java}
> 2019-06-12,09:36:03,652 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Error in dispatcher thread
> java.lang.IllegalStateException
> at 
> com.google.common.base.Preconditions.checkState(Preconditions.java:129)
> at 
> org.apache.hadoop.ipc.Client.setCallIdAndRetryCount(Client.java:118)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
> at com.sun.proxy.$Proxy22.getFileInfo(Unknown Source)
> at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2115)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:1300)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:1296)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1312)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.verifyAndCreateRemoteLogDir(LogAggregationService.java:193)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initApp(LogAggregationService.java:319)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:443)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:67)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:116)
> at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10011) Catch all exception during init app in LogAggregationService

2020-02-04 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17029643#comment-17029643
 ] 

Hadoop QA commented on YARN-10011:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
34s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 41s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 15s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 21m 55s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 77m 51s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.nodemanager.amrmproxy.TestFederationInterceptor |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | YARN-10011 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12992441/YARN-10011-001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux fbf8f22f2bd4 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 
08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 1e3a0b0 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/25494/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/25494/testReport/ |
| Max. process+thread count | 310 (vs. ulimit of 5500) |
| modules | C: 

[jira] [Commented] (YARN-10011) Catch all exception during init app in LogAggregationService

2020-02-03 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17029614#comment-17029614
 ] 

Prabhu Joseph commented on YARN-10011:
--

The patch looks good, will change the state to Patch Available.

> Catch all exception  during init app in LogAggregationService 
> --
>
> Key: YARN-10011
> URL: https://issues.apache.org/jira/browse/YARN-10011
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: zhoukang
>Assignee: zhoukang
>Priority: Major
> Attachments: YARN-10011-001.patch
>
>
> we should catch all exception during init app in LogAggregationService in 
> case of nm exit 
> {code:java}
> 2019-06-12,09:36:03,652 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Error in dispatcher thread
> java.lang.IllegalStateException
> at 
> com.google.common.base.Preconditions.checkState(Preconditions.java:129)
> at 
> org.apache.hadoop.ipc.Client.setCallIdAndRetryCount(Client.java:118)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
> at com.sun.proxy.$Proxy22.getFileInfo(Unknown Source)
> at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2115)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:1300)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:1296)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1312)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.verifyAndCreateRemoteLogDir(LogAggregationService.java:193)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initApp(LogAggregationService.java:319)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:443)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:67)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:116)
> at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10011) Catch all exception during init app in LogAggregationService

2020-02-03 Thread Adam Antal (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17028930#comment-17028930
 ] 

Adam Antal commented on YARN-10011:
---

Thanks for the patch [~cane].

I agree with the modification (better not let this exception slip to the 
dispatcher thread).

I was wondering, whether catching the {{IllegalStateException}} somewhere 
around {{LogAggregationFileController#verifyAndCreateRemoteLogDir}} would make 
sense, but I think it is well protected (catches IOExceptions as expected), so 
I think we did our best here.

+1 (non-binding).

Could you please take a look at this [~prabhujoseph]? 

> Catch all exception  during init app in LogAggregationService 
> --
>
> Key: YARN-10011
> URL: https://issues.apache.org/jira/browse/YARN-10011
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: zhoukang
>Assignee: zhoukang
>Priority: Major
> Attachments: YARN-10011-001.patch
>
>
> we should catch all exception during init app in LogAggregationService in 
> case of nm exit 
> {code:java}
> 2019-06-12,09:36:03,652 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Error in dispatcher thread
> java.lang.IllegalStateException
> at 
> com.google.common.base.Preconditions.checkState(Preconditions.java:129)
> at 
> org.apache.hadoop.ipc.Client.setCallIdAndRetryCount(Client.java:118)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
> at com.sun.proxy.$Proxy22.getFileInfo(Unknown Source)
> at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2115)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:1300)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:1296)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1312)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.verifyAndCreateRemoteLogDir(LogAggregationService.java:193)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initApp(LogAggregationService.java:319)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:443)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:67)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:116)
> at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10011) Catch all exception during init app in LogAggregationService

2020-02-03 Thread Adam Antal (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17028931#comment-17028931
 ] 

Adam Antal commented on YARN-10011:
---

Oh before I forget it, [~cane], please move the jira to patch available state 
so that jenkins check can run on this.

> Catch all exception  during init app in LogAggregationService 
> --
>
> Key: YARN-10011
> URL: https://issues.apache.org/jira/browse/YARN-10011
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: zhoukang
>Assignee: zhoukang
>Priority: Major
> Attachments: YARN-10011-001.patch
>
>
> we should catch all exception during init app in LogAggregationService in 
> case of nm exit 
> {code:java}
> 2019-06-12,09:36:03,652 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Error in dispatcher thread
> java.lang.IllegalStateException
> at 
> com.google.common.base.Preconditions.checkState(Preconditions.java:129)
> at 
> org.apache.hadoop.ipc.Client.setCallIdAndRetryCount(Client.java:118)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
> at com.sun.proxy.$Proxy22.getFileInfo(Unknown Source)
> at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2115)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:1300)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:1296)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1312)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.verifyAndCreateRemoteLogDir(LogAggregationService.java:193)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initApp(LogAggregationService.java:319)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:443)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:67)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:116)
> at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10011) Catch all exception during init app in LogAggregationService

2020-02-02 Thread zhoukang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17028457#comment-17028457
 ] 

zhoukang commented on YARN-10011:
-

[~adam.antal]Could you help review this?

> Catch all exception  during init app in LogAggregationService 
> --
>
> Key: YARN-10011
> URL: https://issues.apache.org/jira/browse/YARN-10011
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: zhoukang
>Assignee: zhoukang
>Priority: Major
> Attachments: YARN-10011-001.patch
>
>
> we should catch all exception during init app in LogAggregationService in 
> case of nm exit 
> {code:java}
> 2019-06-12,09:36:03,652 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: 
> Error in dispatcher thread
> java.lang.IllegalStateException
> at 
> com.google.common.base.Preconditions.checkState(Preconditions.java:129)
> at 
> org.apache.hadoop.ipc.Client.setCallIdAndRetryCount(Client.java:118)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
> at com.sun.proxy.$Proxy22.getFileInfo(Unknown Source)
> at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2115)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:1300)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:1296)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1312)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.verifyAndCreateRemoteLogDir(LogAggregationService.java:193)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initApp(LogAggregationService.java:319)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:443)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:67)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:116)
> at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org