[jira] [Commented] (YARN-10340) HsWebServices getContainerReport uses loginUser instead of remoteUser to access ApplicationClientProtocol
[ https://issues.apache.org/jira/browse/YARN-10340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17152501#comment-17152501 ] Tarun Parimi commented on YARN-10340:
--------------------------------------

[~prabhujoseph], [~brahmareddy] WebServices#getContainer works properly when called by RMWebServices or AHSWebServices. This could be because they use their own ClientRMService and ApplicationHistoryClientService respectively. HsWebServices, however, now calls ClientRMService remotely, so doAs does not work here as expected.

> HsWebServices getContainerReport uses loginUser instead of remoteUser to
> access ApplicationClientProtocol
> ------------------------------------------------------------------------
>
>                 Key: YARN-10340
>                 URL: https://issues.apache.org/jira/browse/YARN-10340
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Prabhu Joseph
>            Assignee: Tarun Parimi
>            Priority: Major
>
> HsWebServices getContainerReport uses loginUser instead of remoteUser to
> access ApplicationClientProtocol.
>
> [http://:19888/ws/v1/history/containers/container_e03_1594030808801_0002_01_03/logs|http://pjoseph-secure-1.pjoseph-secure.root.hwx.site:19888/ws/v1/history/containers/container_e03_1594030808801_0002_01_03/logs]
>
> While accessing the above link as the systest user, the request fails saying
> the mapred user does not have access to the job:
>
> {code:java}
> 2020-07-06 14:02:59,178 WARN org.apache.hadoop.yarn.server.webapp.LogServlet: Could not obtain node HTTP address from provider.
> javax.ws.rs.WebApplicationException: org.apache.hadoop.yarn.exceptions.YarnException: User mapred does not have privilege to see this application application_1593997842459_0214
>         at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getContainerReport(ClientRMService.java:516)
>         at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getContainerReport(ApplicationClientProtocolPBServiceImpl.java:466)
>         at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:639)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
>         at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:985)
>         at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:913)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2882)
>         at org.apache.hadoop.yarn.server.webapp.WebServices.rewrapAndThrowThrowable(WebServices.java:544)
>         at org.apache.hadoop.yarn.server.webapp.WebServices.rewrapAndThrowException(WebServices.java:530)
>         at org.apache.hadoop.yarn.server.webapp.WebServices.getContainer(WebServices.java:405)
>         at org.apache.hadoop.yarn.server.webapp.WebServices.getNodeHttpAddress(WebServices.java:373)
>         at org.apache.hadoop.yarn.server.webapp.LogServlet.getContainerLogsInfo(LogServlet.java:268)
>         at org.apache.hadoop.mapreduce.v2.hs.webapp.HsWebServices.getContainerLogs(HsWebServices.java:461)
> {code}
>
> On analyzing, found that WebServices#getContainer does a doAs with a UGI created by
> createRemoteUser(end user) to access RM#ApplicationClientProtocol, which does
> not work. Need to use createProxyUser to do the same.
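The createRemoteUser-vs-createProxyUser distinction described above can be sketched roughly as follows. This is an illustrative fragment, not the actual YARN-10340 patch: `remoteUser`, `clientRMProxy`, and `request` are hypothetical stand-ins for the HsWebServices code path, and the proxy-user approach additionally requires `hadoop.proxyuser.*` configuration to be accepted by the RM.

```java
import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.yarn.api.ApplicationClientProtocol;
import org.apache.hadoop.yarn.api.protocolrecords.GetContainerReportRequest;
import org.apache.hadoop.yarn.api.protocolrecords.GetContainerReportResponse;

// Hedged sketch of the fix direction, not the actual patch.
class ProxyUserSketch {
  GetContainerReportResponse getReport(String remoteUser,
      ApplicationClientProtocol clientRMProxy,
      GetContainerReportRequest request) throws Exception {

    // Original behaviour: a remote-user UGI carries no credentials, so the
    // RPC issued inside doAs is still authenticated as the login user
    // (e.g. mapred), and the RM's ACL check rejects it:
    // UserGroupInformation callerUGI =
    //     UserGroupInformation.createRemoteUser(remoteUser);

    // Proposed direction: a proxy UGI wraps the login user's credentials and
    // asserts the end user's identity, so the ACL check sees the real caller.
    UserGroupInformation proxyUGI = UserGroupInformation.createProxyUser(
        remoteUser, UserGroupInformation.getLoginUser());

    return proxyUGI.doAs(
        (PrivilegedExceptionAction<GetContainerReportResponse>) () ->
            clientRMProxy.getContainerReport(request));
  }
}
```

The sketch only contrasts the two UGI factory methods; where exactly the proxy UGI should be created in WebServices#getContainer is the subject of the patch itself.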
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10341) Yarn Service Container Completed event doesn't get processed
[ https://issues.apache.org/jira/browse/YARN-10341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17152338#comment-17152338 ] Eric Yang commented on YARN-10341:
-----------------------------------

cc [~billie] [~jianhe]

> Yarn Service Container Completed event doesn't get processed
> ------------------------------------------------------------
>
>                 Key: YARN-10341
>                 URL: https://issues.apache.org/jira/browse/YARN-10341
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Critical
>         Attachments: YARN-10341.001.patch
>
> If there are 10 workers running and containers get killed, after a while we
> see that there are just 9 workers running. This is because the CONTAINER
> COMPLETED event is not processed on the AM side.
> The issue is in the code below:
> {code:java}
> public void onContainersCompleted(List<ContainerStatus> statuses) {
>   for (ContainerStatus status : statuses) {
>     ContainerId containerId = status.getContainerId();
>     ComponentInstance instance = liveInstances.get(status.getContainerId());
>     if (instance == null) {
>       LOG.warn(
>           "Container {} Completed. No component instance exists. exitStatus={}. diagnostics={} ",
>           containerId, status.getExitStatus(), status.getDiagnostics());
>       return;
>     }
>     ComponentEvent event =
>         new ComponentEvent(instance.getCompName(), CONTAINER_COMPLETED)
>             .setStatus(status).setInstance(instance)
>             .setContainerId(containerId);
>     dispatcher.getEventHandler().handle(event);
>   }
> {code}
> If a component instance doesn't exist for a container, the loop doesn't
> iterate over the other containers, because the method returns.
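The control-flow bug the description points at (a `return` that aborts processing of every later ContainerStatus) can be reproduced with a minimal self-contained sketch. The types below are hypothetical stand-ins, not the YARN service AM classes:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal stand-in for the onContainersCompleted loop: containers are plain
// strings, liveInstances is a plain map, and "dispatching" an event is just
// recording the container id.
public class CompletedEventDemo {
  static List<String> dispatch(List<String> statuses,
      Map<String, String> liveInstances, boolean useContinue) {
    List<String> dispatched = new ArrayList<>();
    for (String containerId : statuses) {
      String instance = liveInstances.get(containerId);
      if (instance == null) {
        if (useContinue) {
          continue;           // proposed fix: skip this status, keep iterating
        }
        return dispatched;    // original code: aborts the whole loop
      }
      dispatched.add(containerId);
    }
    return dispatched;
  }

  public static void main(String[] args) {
    Map<String, String> live = new HashMap<>();
    live.put("c1", "worker-0");
    live.put("c3", "worker-2");   // "c2" has no live component instance
    List<String> statuses = List.of("c1", "c2", "c3");

    // Original behaviour: c3's COMPLETED event is silently dropped.
    System.out.println(dispatch(statuses, live, false)); // [c1]
    // Fixed behaviour: c3 is still processed.
    System.out.println(dispatch(statuses, live, true));  // [c1, c3]
  }
}
```

With `return`, one unknown container hides the completion of every container after it in the same callback, which matches the "10 workers become 9" symptom.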
[jira] [Commented] (YARN-10341) Yarn Service Container Completed event doesn't get processed
[ https://issues.apache.org/jira/browse/YARN-10341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17152325#comment-17152325 ] Hadoop QA commented on YARN-10341:
-----------------------------------

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 2m 35s | Docker mode activated. |
|| Prechecks ||
| +1 | dupname | 0m 0s | No case conflicting files found. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| trunk Compile Tests ||
| +1 | mvninstall | 26m 50s | trunk passed |
| +1 | compile | 0m 33s | trunk passed |
| +1 | checkstyle | 0m 27s | trunk passed |
| +1 | mvnsite | 0m 34s | trunk passed |
| +1 | shadedclient | 19m 0s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 25s | trunk passed |
| 0 | spotbugs | 1m 7s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 1m 4s | trunk passed |
|| Patch Compile Tests ||
| +1 | mvninstall | 0m 38s | the patch passed |
| +1 | compile | 0m 29s | the patch passed |
| +1 | javac | 0m 29s | the patch passed |
| +1 | checkstyle | 0m 17s | the patch passed |
| +1 | mvnsite | 0m 30s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 18m 15s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 18s | the patch passed |
| +1 | findbugs | 1m 12s | the patch passed |
|| Other Tests ||
| +1 | unit | 20m 17s | hadoop-yarn-services-core in the patch passed. |
| +1 | asflicense | 0m 28s | The patch does not generate ASF License warnings. |
| | | 94m 34s | |

|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/PreCommit-YARN-Build/26253/artifact/out/Dockerfile |
| JIRA Issue | YARN-10341 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13007176/YARN-10341.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 9f5b79b49796 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 834372f4040 |
| Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/26253/testReport/ |
| Max. process+thread count | 778 (vs. ulimit of 55
[jira] [Commented] (YARN-10341) Yarn Service Container Completed event doesn't get processed
[ https://issues.apache.org/jira/browse/YARN-10341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17152271#comment-17152271 ] Bilwa S T commented on YARN-10341:
-----------------------------------

Okay, I will take care of that from next time. Thanks [~eyang]
[jira] [Comment Edited] (YARN-10341) Yarn Service Container Completed event doesn't get processed
[ https://issues.apache.org/jira/browse/YARN-10341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17152267#comment-17152267 ] Eric Yang edited comment on YARN-10341 at 7/6/20, 7:34 PM:
------------------------------------------------------------

[~BilwaST] I see that you changed the code from break to continue. This change looks better. Please use a new version of the patch instead of replacing the existing patch 001; this will help the precommit build report correctly for the new patch. Thanks

was (Author: eyang):
[~BilwaST] I see that you'd changed the code from break to continue. This change looks better. Please use a new version of the patch instead of replacing existing patch 001, this will help the recommit build to report correctly for the new patch. Thanks
[jira] [Commented] (YARN-10341) Yarn Service Container Completed event doesn't get processed
[ https://issues.apache.org/jira/browse/YARN-10341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17152267#comment-17152267 ] Eric Yang commented on YARN-10341:
-----------------------------------

[~BilwaST] I see that you changed the code from break to continue. This change looks better. Please use a new version of the patch instead of replacing the existing patch 001; this will help the precommit build report correctly for the new patch. Thanks
[jira] [Updated] (YARN-10341) Yarn Service Container Completed event doesn't get processed
[ https://issues.apache.org/jira/browse/YARN-10341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10341:
------------------------------

Attachment: YARN-10341.001.patch
[jira] [Updated] (YARN-10341) Yarn Service Container Completed event doesn't get processed
[ https://issues.apache.org/jira/browse/YARN-10341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10341:
------------------------------

Attachment: (was: YARN-10341.001.patch)
[jira] [Commented] (YARN-10341) Yarn Service Container Completed event doesn't get processed
[ https://issues.apache.org/jira/browse/YARN-10341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17152246#comment-17152246 ] Hadoop QA commented on YARN-10341:
-----------------------------------

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 1m 24s | Docker mode activated. |
|| Prechecks ||
| +1 | dupname | 0m 0s | No case conflicting files found. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| trunk Compile Tests ||
| +1 | mvninstall | 23m 56s | trunk passed |
| +1 | compile | 0m 32s | trunk passed |
| +1 | checkstyle | 0m 22s | trunk passed |
| +1 | mvnsite | 0m 35s | trunk passed |
| +1 | shadedclient | 17m 29s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 19s | trunk passed |
| 0 | spotbugs | 1m 5s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 1m 6s | trunk passed |
|| Patch Compile Tests ||
| +1 | mvninstall | 0m 33s | the patch passed |
| +1 | compile | 0m 30s | the patch passed |
| +1 | javac | 0m 30s | the patch passed |
| +1 | checkstyle | 0m 17s | the patch passed |
| +1 | mvnsite | 0m 28s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 15m 51s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 16s | the patch passed |
| +1 | findbugs | 1m 2s | the patch passed |
|| Other Tests ||
| +1 | unit | 19m 43s | hadoop-yarn-services-core in the patch passed. |
| +1 | asflicense | 0m 33s | The patch does not generate ASF License warnings. |
| | | 85m 22s | |

|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/PreCommit-YARN-Build/26252/artifact/out/Dockerfile |
| JIRA Issue | YARN-10341 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13007162/YARN-10341.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux b6a1c9c50022 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 834372f4040 |
| Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/26252/testReport/ |
| Max. process+thread count | 777 (vs. ulimit of 55
[jira] [Commented] (YARN-10341) Yarn Service Container Completed event doesn't get processed
[ https://issues.apache.org/jira/browse/YARN-10341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17152241#comment-17152241 ] Eric Yang commented on YARN-10341:
-----------------------------------

[~BilwaST] Sorry, I am confused by this ticket and the proposed patch fix for the described problem. The container's "restart_policy" controls whether the container should be restarted in the event of a failure/kill. If it is not set, the container will always restart; if it is set to "NEVER", it will not restart. The completion events are secondary information that assists in deciding whether or not to restart the containers. Using return or break in the onContainersCompleted method doesn't make any difference. Maybe I am missing something; could you give more information on how this patch addresses the observed issue?
[jira] [Commented] (YARN-10341) Yarn Service Container Completed event doesn't get processed
[ https://issues.apache.org/jira/browse/YARN-10341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17152182#comment-17152182 ] Bilwa S T commented on YARN-10341:
-----------------------------------

cc [~eyang]
[jira] [Updated] (YARN-10341) Yarn Service Container Completed event doesn't get processed
[ https://issues.apache.org/jira/browse/YARN-10341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10341:
------------------------------

Attachment: YARN-10341.001.patch
[jira] [Created] (YARN-10341) Yarn Service Container Completed event doesn't get processed
Bilwa S T created YARN-10341:
-----------------------------

             Summary: Yarn Service Container Completed event doesn't get processed
                 Key: YARN-10341
                 URL: https://issues.apache.org/jira/browse/YARN-10341
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Bilwa S T
            Assignee: Bilwa S T

If there are 10 workers running and containers get killed, after a while we see that there are just 9 workers running. This is because the CONTAINER COMPLETED event is not processed on the AM side.

The issue is in the code below:

{code:java}
public void onContainersCompleted(List<ContainerStatus> statuses) {
  for (ContainerStatus status : statuses) {
    ContainerId containerId = status.getContainerId();
    ComponentInstance instance = liveInstances.get(status.getContainerId());
    if (instance == null) {
      LOG.warn(
          "Container {} Completed. No component instance exists. exitStatus={}. diagnostics={} ",
          containerId, status.getExitStatus(), status.getDiagnostics());
      return;
    }
    ComponentEvent event =
        new ComponentEvent(instance.getCompName(), CONTAINER_COMPLETED)
            .setStatus(status).setInstance(instance)
            .setContainerId(containerId);
    dispatcher.getEventHandler().handle(event);
  }
{code}

If a component instance doesn't exist for a container, the loop doesn't iterate over the other containers, because the method returns.
[jira] [Commented] (YARN-10340) HsWebServices getContainerReport uses loginUser instead of remoteUser to access ApplicationClientProtocol
[ https://issues.apache.org/jira/browse/YARN-10340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17152173#comment-17152173 ]

Prabhu Joseph commented on YARN-10340:
--------------------------------------

[~brahmareddy] This issue happens irrespective of the HADOOP-16095 change. It looks like this issue has been present for a long time.

*Repro:*

Setup: Secure cluster + HistoryServer running as the mapred user + yarn.admin.acl=yarn, with the ACLs for queues set to " ".

1. Run a MapReduce sleep job as userA.
2. Access http://:19888/ws/v1/history/containers/container_e03_1594030808801_0002_01_03/logs as userA after kinit.
3. The request fails with the below error in the HistoryServer logs:

{code}
2020-07-06 14:02:59,178 WARN org.apache.hadoop.yarn.server.webapp.LogServlet: Could not obtain node HTTP address from provider.
javax.ws.rs.WebApplicationException: org.apache.hadoop.yarn.exceptions.YarnException: User mapred does not have privilege to see this application application_1593997842459_0214
        at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getContainerReport(ClientRMService.java:516)
{code}
[jira] [Commented] (YARN-10340) HsWebServices getContainerReport uses loginUser instead of remoteUser to access ApplicationClientProtocol
[ https://issues.apache.org/jira/browse/YARN-10340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17152142#comment-17152142 ]

Brahma Reddy Battula commented on YARN-10340:
---------------------------------------------

Is this related to HADOOP-16095?
[jira] [Created] (YARN-10340) HsWebServices getContainerReport uses loginUser instead of remoteUser to access ApplicationClientProtocol
Prabhu Joseph created YARN-10340:
------------------------------------
             Summary: HsWebServices getContainerReport uses loginUser instead of remoteUser to access ApplicationClientProtocol
                 Key: YARN-10340
                 URL: https://issues.apache.org/jira/browse/YARN-10340
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Prabhu Joseph
            Assignee: Tarun Parimi

HsWebServices getContainerReport uses loginUser instead of remoteUser to access ApplicationClientProtocol.

[http://:19888/ws/v1/history/containers/container_e03_1594030808801_0002_01_03/logs|http://pjoseph-secure-1.pjoseph-secure.root.hwx.site:19888/ws/v1/history/containers/container_e03_1594030808801_0002_01_03/logs]

While accessing the above link as the systest user, the request fails, saying that the mapred user does not have access to the job:

{code:java}
2020-07-06 14:02:59,178 WARN org.apache.hadoop.yarn.server.webapp.LogServlet: Could not obtain node HTTP address from provider.
javax.ws.rs.WebApplicationException: org.apache.hadoop.yarn.exceptions.YarnException: User mapred does not have privilege to see this application application_1593997842459_0214
        at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getContainerReport(ClientRMService.java:516)
        at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getContainerReport(ApplicationClientProtocolPBServiceImpl.java:466)
        at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:639)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:985)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:913)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2882)
        at org.apache.hadoop.yarn.server.webapp.WebServices.rewrapAndThrowThrowable(WebServices.java:544)
        at org.apache.hadoop.yarn.server.webapp.WebServices.rewrapAndThrowException(WebServices.java:530)
        at org.apache.hadoop.yarn.server.webapp.WebServices.getContainer(WebServices.java:405)
        at org.apache.hadoop.yarn.server.webapp.WebServices.getNodeHttpAddress(WebServices.java:373)
        at org.apache.hadoop.yarn.server.webapp.LogServlet.getContainerLogsInfo(LogServlet.java:268)
        at org.apache.hadoop.mapreduce.v2.hs.webapp.HsWebServices.getContainerLogs(HsWebServices.java:461)
{code}

On analyzing, we found that WebServices#getContainer performs doAs with a UGI created by createRemoteUser(end user) to access RM#ApplicationClientProtocol, which does not work. We need to use createProxyUser instead.
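Why doAs with a createRemoteUser UGI fails here can be illustrated with a toy simulation. The `Ugi` class and `effectiveCaller` method below are hypothetical stand-ins, not the Hadoop `UserGroupInformation` API: the point is only that a secure RPC server identifies the caller by the credentials on the connection, so a credential-less client-side doAs falls back to the process login user (mapred), while an explicit proxy-user declaration carries the end user through.

```java
public class ProxyUserDemo {

    // Hypothetical stand-in for a UGI; NOT the real Hadoop API.
    public static class Ugi {
        public final String name;
        public final boolean hasCredentials;
        public final String realUser; // non-null when this is a proxy-user UGI

        public Ugi(String name, boolean hasCredentials, String realUser) {
            this.name = name;
            this.hasCredentials = hasCredentials;
            this.realUser = realUser;
        }
    }

    // The process login user (the HistoryServer runs as mapred).
    public static final Ugi LOGIN_USER = new Ugi("mapred", true, null);

    // Mirrors the failing path: a UGI with a name but no credentials.
    public static Ugi createRemoteUser(String user) {
        return new Ugi(user, false, null);
    }

    // Mirrors the proposed fix: the login user's credentials authenticate
    // the connection, with an explicit on-behalf-of declaration for `user`.
    public static Ugi createProxyUser(String user, Ugi realUser) {
        return new Ugi(user, true, realUser.name);
    }

    // Simulated secure RPC server: trusts only authenticated credentials,
    // never a client-side doAs name by itself.
    public static String effectiveCaller(Ugi caller) {
        if (!caller.hasCredentials) {
            // No credentials -> the connection authenticates as the login
            // user, which is why the RM saw "mapred" and denied access.
            return LOGIN_USER.name;
        }
        return caller.realUser != null
            ? caller.name + " (proxied by " + caller.realUser + ")"
            : caller.name;
    }

    public static void main(String[] args) {
        System.out.println(effectiveCaller(createRemoteUser("systest")));            // prints mapred
        System.out.println(effectiveCaller(createProxyUser("systest", LOGIN_USER))); // prints systest (proxied by mapred)
    }
}
```

In real Hadoop terms, proxying additionally requires the `hadoop.proxyuser.*` configuration to authorize the service user to impersonate end users; the simulation above skips that check.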
[jira] [Commented] (YARN-10339) Timeline Client in Nodemanager gets 403 errors when simple auth is used in kerberos environments
[ https://issues.apache.org/jira/browse/YARN-10339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17152073#comment-17152073 ] Hadoop QA commented on YARN-10339: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 37s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 14s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 19m 43s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 37s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 0m 39s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 39s{color} | {color:blue} branch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests no findbugs output file (findbugsXml.xml) {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 7s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 7s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 29s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 6 new + 212 unchanged - 0 fixed = 218 total (was 212) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 55s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 28s{color} | {color:green} the patch passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 37s{color} | {color:blue} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests has no data from findbugs {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 4s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 4m 4s{color} | {color:red} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 53s{color} | {color:green} hadoop-yarn-server-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 56s{color} | {color:green} hadoop-yarn-server-applicationhistoryservice in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 20s{color} | {color:green} hadoop-yarn-server-tests in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflice
[jira] [Updated] (YARN-10339) Timeline Client in Nodemanager gets 403 errors when simple auth is used in kerberos environments
[ https://issues.apache.org/jira/browse/YARN-10339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tarun Parimi updated YARN-10339: Attachment: YARN-10339.001.patch > Timeline Client in Nodemanager gets 403 errors when simple auth is used in > kerberos environments > > > Key: YARN-10339 > URL: https://issues.apache.org/jira/browse/YARN-10339 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineclient >Affects Versions: 3.1.0 >Reporter: Tarun Parimi >Assignee: Tarun Parimi >Priority: Major > Attachments: YARN-10339.001.patch > > > We get below errors in NodeManager logs whenever we set > yarn.timeline-service.http-authentication.type=simple in a cluster which has > kerberos enabled. There are use cases where simple auth is used only in > timeline server for convenience although kerberos is enabled. > {code:java} > 2020-05-20 20:06:30,181 ERROR impl.TimelineV2ClientImpl > (TimelineV2ClientImpl.java:putObjects(321)) - Response from the timeline > server is not successful, HTTP error code: 403, Server response: > {"exception":"ForbiddenException","message":"java.lang.Exception: The owner > of the posted timeline entities is not > set","javaClassName":"org.apache.hadoop.yarn.webapp.ForbiddenException"} > {code} > This seems to affect the NM timeline publisher which uses > TimelineV2ClientImpl. Doing a simple auth directly to timeline service via > curl works fine. So this issue is in the authenticator configuration in > timeline client. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10339) Timeline Client in Nodemanager gets 403 errors when simple auth is used in kerberos environments
Tarun Parimi created YARN-10339: --- Summary: Timeline Client in Nodemanager gets 403 errors when simple auth is used in kerberos environments Key: YARN-10339 URL: https://issues.apache.org/jira/browse/YARN-10339 Project: Hadoop YARN Issue Type: Bug Components: timelineclient Affects Versions: 3.1.0 Reporter: Tarun Parimi Assignee: Tarun Parimi We get below errors in NodeManager logs whenever we set yarn.timeline-service.http-authentication.type=simple in a cluster which has kerberos enabled. There are use cases where simple auth is used only in timeline server for convenience although kerberos is enabled. {code:java} 2020-05-20 20:06:30,181 ERROR impl.TimelineV2ClientImpl (TimelineV2ClientImpl.java:putObjects(321)) - Response from the timeline server is not successful, HTTP error code: 403, Server response: {"exception":"ForbiddenException","message":"java.lang.Exception: The owner of the posted timeline entities is not set","javaClassName":"org.apache.hadoop.yarn.webapp.ForbiddenException"} {code} This seems to affect the NM timeline publisher which uses TimelineV2ClientImpl. Doing a simple auth directly to timeline service via curl works fine. So this issue is in the authenticator configuration in timeline client. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10106) Yarn logs CLI filtering by application attempt
[ https://issues.apache.org/jira/browse/YARN-10106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17151964#comment-17151964 ] Hadoop QA commented on YARN-10106: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 54s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 54s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 0m 53s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 51s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 19s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client: The patch generated 18 new + 126 unchanged - 0 fixed = 144 total (was 126) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 27s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 27m 28s{color} | {color:red} hadoop-yarn-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 91m 4s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.client.api.impl.TestAMRMClient | \\ \\ || Subsystem || Report/Notes || | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/PreCommit-YARN-Build/26250/artifact/out/Dockerfile | | JIRA Issue | YARN-10106 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13007120/YARN-10106.005.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 7630d2fbacb6 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | personality/hadoop.sh | | git revision | trunk / 639acb6d892 | | Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/26250/art
[jira] [Commented] (YARN-10106) Yarn logs CLI filtering by application attempt
[ https://issues.apache.org/jira/browse/YARN-10106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17151919#comment-17151919 ] Hudáky Márton Gyula commented on YARN-10106: Rebase was needed due to YARN-10327 > Yarn logs CLI filtering by application attempt > -- > > Key: YARN-10106 > URL: https://issues.apache.org/jira/browse/YARN-10106 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Adam Antal >Assignee: Hudáky Márton Gyula >Priority: Trivial > Attachments: YARN-10106.001.patch, YARN-10106.002.patch, > YARN-10106.003.patch, YARN-10106.004.patch, YARN-10106.005.patch > > > {{ContainerLogsRequest}} got a new parameter in YARN-10101, which is the > {{applicationAttempt}} - we can use this new parameter in Yarn logs CLI as > well to filter by application attempt. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10106) Yarn logs CLI filtering by application attempt
[ https://issues.apache.org/jira/browse/YARN-10106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hudáky Márton Gyula updated YARN-10106: --- Attachment: YARN-10106.005.patch > Yarn logs CLI filtering by application attempt > -- > > Key: YARN-10106 > URL: https://issues.apache.org/jira/browse/YARN-10106 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Adam Antal >Assignee: Hudáky Márton Gyula >Priority: Trivial > Attachments: YARN-10106.001.patch, YARN-10106.002.patch, > YARN-10106.003.patch, YARN-10106.004.patch, YARN-10106.005.patch > > > {{ContainerLogsRequest}} got a new parameter in YARN-10101, which is the > {{applicationAttempt}} - we can use this new parameter in Yarn logs CLI as > well to filter by application attempt. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10335) Improve scheduling of containers based on node health
[ https://issues.apache.org/jira/browse/YARN-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17151863#comment-17151863 ]

Cyrus Jackson commented on YARN-10335:
--------------------------------------

Thanks [~bibinchundatt] for your inputs. I have put together a working patch with the changes. Please share your thoughts on the approach.

The current approach is to have a new NodeHealthDetail which holds additional information such as SSD, HDD, SKU, etc. The overallScore is generated from these values by the NodeHealthCheckerService and is then stored in RMNodeImpl. The overallScore can be used for scheduling nodes on the RM side. A NodeHealthService is added to make the NodeHealthCheckerService pluggable, to support custom implementations.
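The pluggable health-service idea described above might look roughly like the following sketch. All names here (`NodeHealthService`, `WeightedHealthService`, `overallScore`) are illustrative guesses based on the comment, not the actual patch; the scoring rule (a weighted average of per-component health values) is likewise just one plausible default.

```java
import java.util.*;

public class NodeHealthScoreDemo {

    // Hypothetical shape of the pluggable interface; the real API may differ.
    public interface NodeHealthService {
        float overallScore(Map<String, Float> healthDetails);
    }

    // One possible default implementation: a weighted average of
    // per-component health values (e.g. SSD, HDD, SKU), each in [0, 1].
    public static class WeightedHealthService implements NodeHealthService {
        private final Map<String, Float> weights;

        public WeightedHealthService(Map<String, Float> weights) {
            this.weights = weights;
        }

        @Override
        public float overallScore(Map<String, Float> healthDetails) {
            float weighted = 0f;
            float totalWeight = 0f;
            for (Map.Entry<String, Float> e : healthDetails.entrySet()) {
                // Metrics without a configured weight count with weight 1.
                float w = weights.getOrDefault(e.getKey(), 1f);
                weighted += w * e.getValue();
                totalWeight += w;
            }
            return totalWeight == 0f ? 0f : weighted / totalWeight;
        }
    }

    public static void main(String[] args) {
        NodeHealthService svc =
            new WeightedHealthService(Map.of("SSD", 2f, "HDD", 1f));
        // SSD health 0.9 (weight 2), HDD health 0.6 (weight 1) -> ~0.8 overall
        System.out.println(svc.overallScore(Map.of("SSD", 0.9f, "HDD", 0.6f)));
    }
}
```

The RM-side scheduler could then rank candidate nodes by this score, which is the allocation hook YARN-7494's node-set interface would provide.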
[jira] [Updated] (YARN-10335) Improve scheduling of containers based on node health
[ https://issues.apache.org/jira/browse/YARN-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cyrus Jackson updated YARN-10335: - Attachment: YARN-10335.001.patch > Improve scheduling of containers based on node health > - > > Key: YARN-10335 > URL: https://issues.apache.org/jira/browse/YARN-10335 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Bibin Chundatt >Assignee: Cyrus Jackson >Priority: Major > Attachments: YARN-10335.001.patch > > > YARN-7494 supports providing interface to choose nodeset for scheduler > allocation. > We could leverage the same to support allocation of containers based on node > health value send from nodemanagers -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10335) Improve scheduling of containers based on node health
[ https://issues.apache.org/jira/browse/YARN-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cyrus Jackson updated YARN-10335: - Attachment: (was: yarn_DE.patch) > Improve scheduling of containers based on node health > - > > Key: YARN-10335 > URL: https://issues.apache.org/jira/browse/YARN-10335 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Bibin Chundatt >Assignee: Cyrus Jackson >Priority: Major > > YARN-7494 supports providing interface to choose nodeset for scheduler > allocation. > We could leverage the same to support allocation of containers based on node > health value send from nodemanagers -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10335) Improve scheduling of containers based on node health
[ https://issues.apache.org/jira/browse/YARN-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cyrus Jackson updated YARN-10335: - Attachment: yarn_DE.patch > Improve scheduling of containers based on node health > - > > Key: YARN-10335 > URL: https://issues.apache.org/jira/browse/YARN-10335 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Bibin Chundatt >Assignee: Cyrus Jackson >Priority: Major > Attachments: yarn_DE.patch > > > YARN-7494 supports providing interface to choose nodeset for scheduler > allocation. > We could leverage the same to support allocation of containers based on node > health value send from nodemanagers -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org