[jira] [Commented] (YARN-4676) Automatic and Asynchronous Decommissioning Nodes Status Tracking
[ https://issues.apache.org/jira/browse/YARN-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15244034#comment-15244034 ] Hadoop QA commented on YARN-4676:

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 8 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s {color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 45s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 40s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 29s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 22s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 43s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 2m 0s {color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s {color} | {color:blue} Skipped branch modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 7m 21s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 53s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 6m 27s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s {color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 35s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 42s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 6m 42s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 42s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 42s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 7m 42s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 42s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 21s {color} | {color:green} root: patch generated 0 new + 519 unchanged - 6 fixed = 519 total (was 525) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 42s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 2m 9s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s {color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s {color} | {color:blue} Skipped patch modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 9m 29s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 24s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 11m 24s {color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.7.0_95 with JDK v1.7.0_95 generated 1 new + 2 unchanged - 0
[jira] [Updated] (YARN-4965) Distributed shell AM failed due to ClientHandlerException thrown by jersey
[ https://issues.apache.org/jira/browse/YARN-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-4965:

Attachment: YARN-4965.patch

> Distributed shell AM failed due to ClientHandlerException thrown by jersey
> --
>
> Key: YARN-4965
> URL: https://issues.apache.org/jira/browse/YARN-4965
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.7.2
> Reporter: Sumana Sathish
> Assignee: Junping Du
> Priority: Critical
> Attachments: YARN-4965.patch
>
> Distributed shell AM failed with RuntimeException: ClientHandlerException
> {code:title=app logs}
> Exception in thread "AMRM Callback Handler Thread" org.apache.hadoop.yarn.exceptions.YarnRuntimeException: com.sun.jersey.api.client.ClientHandlerException: java.io.IOException: Stream closed.
>     at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:312)
> Caused by: com.sun.jersey.api.client.ClientHandlerException: java.io.IOException: Stream closed.
>     at com.sun.jersey.api.client.ClientResponse.getEntity(ClientResponse.java:563)
>     at com.sun.jersey.api.client.ClientResponse.getEntity(ClientResponse.java:506)
>     at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:446)
>     at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.publishContainerEndEvent(ApplicationMaster.java:1144)
>     at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.access$400(ApplicationMaster.java:169)
>     at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$RMCallbackHandler.onContainersCompleted(ApplicationMaster.java:779)
>     at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:300)
> Caused by: java.io.IOException: Stream closed.
>     at java.net.AbstractPlainSocketImpl.available(AbstractPlainSocketImpl.java:458)
>     at java.net.SocketInputStream.available(SocketInputStream.java:245)
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:342)
>     at sun.net.www.http.ChunkedInputStream.readAheadBlocking(ChunkedInputStream.java:552)
>     at sun.net.www.http.ChunkedInputStream.readAhead(ChunkedInputStream.java:609)
>     at sun.net.www.http.ChunkedInputStream.read(ChunkedInputStream.java:696)
>     at java.io.FilterInputStream.read(FilterInputStream.java:133)
>     at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3067)
>     at org.codehaus.jackson.impl.ByteSourceBootstrapper.ensureLoaded(ByteSourceBootstrapper.java:507)
>     at org.codehaus.jackson.impl.ByteSourceBootstrapper.detectEncoding(ByteSourceBootstrapper.java:129)
>     at org.codehaus.jackson.impl.ByteSourceBootstrapper.constructParser(ByteSourceBootstrapper.java:224)
>     at org.codehaus.jackson.JsonFactory._createJsonParser(JsonFactory.java:785)
>     at org.codehaus.jackson.JsonFactory.createJsonParser(JsonFactory.java:561)
>     at org.codehaus.jackson.jaxrs.JacksonJsonProvider.readFrom(JacksonJsonProvider.java:414)
>     at com.sun.jersey.api.client.ClientResponse.getEntity(ClientResponse.java:553)
>     ... 6 more
> {code}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
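The root cause visible in the trace above is that Jersey's `ClientResponse.getEntity` parses the HTTP entity lazily, so when the underlying connection stream has already been closed, the read surfaces as `java.io.IOException: Stream closed` wrapped in a `ClientHandlerException`. A minimal, stdlib-only sketch of that failure mode (illustrative only; this is not the YARN-4965 patch, and the class name is made up):

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class ClosedStreamDemo {
    public static void main(String[] args) {
        // Stand-in for the HTTP response stream that Jackson reads lazily.
        BufferedInputStream in = new BufferedInputStream(
                new ByteArrayInputStream("{\"entities\":[]}".getBytes(StandardCharsets.UTF_8)));
        try {
            in.close();  // simulate the connection being torn down early
            in.read();   // lazy entity parsing would trigger this read
            throw new AssertionError("expected IOException");
        } catch (IOException e) {
            // BufferedInputStream reports "Stream closed", just like the trace
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

Running this prints `caught: Stream closed`, mirroring the innermost cause in the AM log; in the real stack the same `IOException` is then wrapped by Jersey into a `ClientHandlerException`.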
[jira] [Updated] (YARN-4965) Distributed shell AM failed due to ClientHandlerException thrown by jersey
[ https://issues.apache.org/jira/browse/YARN-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-4965:

Fix Version/s: (was: 2.8.0)
[jira] [Updated] (YARN-4965) Distributed shell AM failed due to ClientHandlerException thrown by jersey
[ https://issues.apache.org/jira/browse/YARN-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-4965:

Target Version/s: 2.8.0
[jira] [Updated] (YARN-4965) Distributed shell AM failed due to ClientHandlerException thrown by jersey
[ https://issues.apache.org/jira/browse/YARN-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-4965:

Description:
Distributed shell AM failed with RuntimeException: ClientHandlerException
{code:title=app logs}
Exception in thread "AMRM Callback Handler Thread" org.apache.hadoop.yarn.exceptions.YarnRuntimeException: com.sun.jersey.api.client.ClientHandlerException: java.io.IOException: Stream closed.
    at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:312)
Caused by: com.sun.jersey.api.client.ClientHandlerException: java.io.IOException: Stream closed.
    at com.sun.jersey.api.client.ClientResponse.getEntity(ClientResponse.java:563)
    at com.sun.jersey.api.client.ClientResponse.getEntity(ClientResponse.java:506)
    at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:446)
    at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.publishContainerEndEvent(ApplicationMaster.java:1144)
    at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.access$400(ApplicationMaster.java:169)
    at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$RMCallbackHandler.onContainersCompleted(ApplicationMaster.java:779)
    at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:300)
Caused by: java.io.IOException: Stream closed.
    at java.net.AbstractPlainSocketImpl.available(AbstractPlainSocketImpl.java:458)
    at java.net.SocketInputStream.available(SocketInputStream.java:245)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:342)
    at sun.net.www.http.ChunkedInputStream.readAheadBlocking(ChunkedInputStream.java:552)
    at sun.net.www.http.ChunkedInputStream.readAhead(ChunkedInputStream.java:609)
    at sun.net.www.http.ChunkedInputStream.read(ChunkedInputStream.java:696)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3067)
    at org.codehaus.jackson.impl.ByteSourceBootstrapper.ensureLoaded(ByteSourceBootstrapper.java:507)
    at org.codehaus.jackson.impl.ByteSourceBootstrapper.detectEncoding(ByteSourceBootstrapper.java:129)
    at org.codehaus.jackson.impl.ByteSourceBootstrapper.constructParser(ByteSourceBootstrapper.java:224)
    at org.codehaus.jackson.JsonFactory._createJsonParser(JsonFactory.java:785)
    at org.codehaus.jackson.JsonFactory.createJsonParser(JsonFactory.java:561)
    at org.codehaus.jackson.jaxrs.JacksonJsonProvider.readFrom(JacksonJsonProvider.java:414)
    at com.sun.jersey.api.client.ClientResponse.getEntity(ClientResponse.java:553)
    ... 6 more
{code}

was:
Distributed shell AM failed with java.io.IOException
{code:title=app logs}
Exception in thread "AMRM Callback Handler Thread" org.apache.hadoop.yarn.exceptions.YarnRuntimeException: com.sun.jersey.api.client.ClientHandlerException: java.io.IOException: Stream closed.
    at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:312)
Caused by: com.sun.jersey.api.client.ClientHandlerException: java.io.IOException: Stream closed.
    at com.sun.jersey.api.client.ClientResponse.getEntity(ClientResponse.java:563)
    at com.sun.jersey.api.client.ClientResponse.getEntity(ClientResponse.java:506)
    at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:446)
    at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.publishContainerEndEvent(ApplicationMaster.java:1144)
    at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.access$400(ApplicationMaster.java:169)
    at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$RMCallbackHandler.onContainersCompleted(ApplicationMaster.java:779)
    at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:300)
Caused by: java.io.IOException: Stream closed.
    at java.net.AbstractPlainSocketImpl.available(AbstractPlainSocketImpl.java:458)
    at java.net.SocketInputStream.available(SocketInputStream.java:245)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:342)
    at sun.net.www.http.ChunkedInputStream.readAheadBlocking(ChunkedInputStream.java:552)
    at sun.net.www.http.ChunkedInputStream.readAhead(ChunkedInputStream.java:609)
    at sun.net.www.http.ChunkedInputStream.read(ChunkedInputStream.java:696)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at
[jira] [Commented] (YARN-3215) Respect labels in CapacityScheduler when computing headroom
[ https://issues.apache.org/jira/browse/YARN-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15244015#comment-15244015 ] Wangda Tan commented on YARN-3215:

[~Naganarasimha], it seems the failures are related to the changes. Could you take a look at them?

> Respect labels in CapacityScheduler when computing headroom
> ---
>
> Key: YARN-3215
> URL: https://issues.apache.org/jira/browse/YARN-3215
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: capacityscheduler
> Reporter: Wangda Tan
> Assignee: Naganarasimha G R
> Attachments: YARN-3215.v1.001.patch, YARN-3215.v2.001.patch, YARN-3215.v2.002.patch, YARN-3215.v2.003.patch, YARN-3215.v2.branch-2.8.patch
>
> In the existing CapacityScheduler, when computing the headroom of an application, only the "non-labeled" nodes of this application are considered.
> But it is possible that the application is asking for labeled resources, so headroom-by-label (like 5G resource available under node-label=red) is required to get better resource allocation and avoid deadlocks such as MAPREDUCE-5928.
> This JIRA could involve both API changes (such as adding a label-to-available-resource map in AllocateResponse) and also internal changes in CapacityScheduler.
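The per-label headroom idea described in this issue can be sketched as follows. This is a hypothetical illustration, not the CapacityScheduler API: class, method, and map names are made up, and real headroom also accounts for queue limits and user limits. It only shows the core bookkeeping, headroom per label = label capacity minus label usage, clamped at zero:

```java
import java.util.HashMap;
import java.util.Map;

public class LabelHeadroomSketch {
    // Headroom per node label: capacity minus used, never negative.
    static Map<String, Long> headroomByLabel(Map<String, Long> capacityByLabel,
                                             Map<String, Long> usedByLabel) {
        Map<String, Long> headroom = new HashMap<>();
        for (Map.Entry<String, Long> e : capacityByLabel.entrySet()) {
            long used = usedByLabel.getOrDefault(e.getKey(), 0L);
            headroom.put(e.getKey(), Math.max(0L, e.getValue() - used));
        }
        return headroom;
    }

    public static void main(String[] args) {
        Map<String, Long> capacity = new HashMap<>();
        capacity.put("", 32L);    // non-labeled (default) partition, in GB
        capacity.put("red", 8L);  // node-label=red partition
        Map<String, Long> used = new HashMap<>();
        used.put("", 20L);
        used.put("red", 3L);
        Map<String, Long> hr = headroomByLabel(capacity, used);
        // Matches the issue's example: 5G available under node-label=red.
        if (hr.get("red") != 5L || hr.get("") != 12L) {
            throw new AssertionError(hr);
        }
        System.out.println(hr);
    }
}
```

Returning a label-to-headroom map like this, rather than a single number over non-labeled nodes, is what would let an AM asking for labeled resources avoid the deadlock pattern referenced in MAPREDUCE-5928.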
[jira] [Commented] (YARN-4966) More improvement to get Container logs without specify nodeId
[ https://issues.apache.org/jira/browse/YARN-4966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15244008#comment-15244008 ] Hadoop QA commented on YARN-4966:

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 9s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 34s {color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 39s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 59s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 5s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 33s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 26s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 38s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s {color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 43s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 48s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 48s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 3s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 3s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 30s {color} | {color:red} hadoop-yarn-project/hadoop-yarn: patch generated 30 new + 43 unchanged - 8 fixed = 73 total (was 51) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 22s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 17 line(s) with tabs. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 22s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 53s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 66m 9s {color} | {color:red} hadoop-yarn-client in the patch failed with JDK v1.8.0_77. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 8s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.7.0_95. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 66m 25s {color} | {color:red} hadoop-yarn-client in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
[jira] [Commented] (YARN-3215) Respect labels in CapacityScheduler when computing headroom
[ https://issues.apache.org/jira/browse/YARN-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243990#comment-15243990 ] Hadoop QA commented on YARN-3215:

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 9m 30s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 5 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 0s {color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s {color} | {color:green} branch-2.8 passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s {color} | {color:green} branch-2.8 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s {color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 38s {color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 21s {color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 21s {color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s {color} | {color:green} branch-2.8 passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s {color} | {color:green} branch-2.8 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 34s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 36s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 18s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: patch generated 1 new + 217 unchanged - 6 fixed = 218 total (was 223) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 90m 37s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 90m 34s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 212m 6s {color} | {color:black} {color} |

|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps |
| | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
| | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestWorkPreservingRMRestartForNodeLabel |
| | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| | hadoop.yarn.server.resourcemanager.scheduler.fifo.TestFifoScheduler |
| |
[jira] [Commented] (YARN-4934) Reserved Resource for QueueMetrics needs to be handled correctly in few cases
[ https://issues.apache.org/jira/browse/YARN-4934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243971#comment-15243971 ] Sunil G commented on YARN-4934:

YARN-4947 and YARN-4890 are tracking the test failures.

> Reserved Resource for QueueMetrics needs to be handled correctly in few cases
> --
>
> Key: YARN-4934
> URL: https://issues.apache.org/jira/browse/YARN-4934
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacity scheduler
> Affects Versions: 2.9.0
> Reporter: Sunil G
> Assignee: Sunil G
> Attachments: 0001-YARN-4934.patch
>
> Reserved Resource for QueueMetrics needs to be decremented correctly in cases like the below:
> - when a reserved container is allocated
> - when a node is lost/disconnected.
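The invariant this issue targets can be illustrated with a small sketch. This is not the actual QueueMetrics API; the class and method names are hypothetical. It just shows the bookkeeping rule from the description: the reserved counter must be decremented both when a reserved container is finally allocated and when the node holding the reservation is lost:

```java
public class ReservedMetricsSketch {
    private long reservedMB;  // stand-in for the queue's "reserved" metric

    void reserve(long mb)             { reservedMB += mb; }
    void onReservedAllocated(long mb) { reservedMB -= mb; }  // case 1: reservation fulfilled
    void onNodeLost(long mb)          { reservedMB -= mb; }  // case 2: node lost/disconnected

    long reservedMB() { return reservedMB; }

    public static void main(String[] args) {
        ReservedMetricsSketch m = new ReservedMetricsSketch();
        m.reserve(4096);
        m.reserve(2048);
        m.onReservedAllocated(4096);  // reserved container became a real allocation
        m.onNodeLost(2048);           // node carrying the other reservation disappeared
        // If either decrement path is missed, reserved resource leaks upward.
        if (m.reservedMB() != 0) {
            throw new AssertionError("reserved metric leaked: " + m.reservedMB());
        }
        System.out.println("reservedMB=" + m.reservedMB());
    }
}
```

Missing either decrement path leaves the metric permanently inflated, which is exactly the kind of leak the listed cases describe.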
[jira] [Commented] (YARN-4934) Reserved Resource for QueueMetrics needs to be handled correctly in few cases
[ https://issues.apache.org/jira/browse/YARN-4934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243969#comment-15243969 ] Hadoop QA commented on YARN-4934: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 40s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 5s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | 
{color:green} mvninstall {color} | {color:green} 0m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 76m 50s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_77. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 76m 13s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 169m 28s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_77 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation | | | hadoop.yarn.server.resourcemanager.TestAMAuthorization | | JDK v1.8.0_77 Timed out junit tests | org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes | | JDK v1.7.0_95 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation | | | hadoop.yarn.server.resourcemanager.TestAMAuthorization | |
[jira] [Commented] (YARN-4955) Add retry for SocketTimeoutException in TimelineClient
[ https://issues.apache.org/jira/browse/YARN-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243959#comment-15243959 ] Junping Du commented on YARN-4955: -- I don't think it is a good idea to add an inner/nested class just for unit-test purposes - it makes the code less readable, which is worse than having no UT for a simple exception catch. If we must have a unit test, I would suggest going with the v3 patch plus a mocked exception - that keeps our main code cleaner. > Add retry for SocketTimeoutException in TimelineClient > -- > > Key: YARN-4955 > URL: https://issues.apache.org/jira/browse/YARN-4955 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Xuan Gong >Assignee: Xuan Gong >Priority: Critical > Attachments: YARN-4955.1.patch, YARN-4955.2.patch, YARN-4955.3.patch, > YARN-4955.4-1.patch, YARN-4955.4.patch, YARN-4955.5.patch > > > We saw this exception several times when we tried to getDelegationToken from > ATS. > java.io.IOException: > org.apache.hadoop.security.authentication.client.AuthenticationException: > java.net.SocketTimeoutException: Read timed out > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$5.run(TimelineClientImpl.java:569) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineClientConnectionRetry.retryOn(TimelineClientImpl.java:234) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.operateDelegationToken(TimelineClientImpl.java:582) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.getDelegationToken(TimelineClientImpl.java:479) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getTimelineDelegationToken(YarnClientImpl.java:349) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.addTimelineDelegationToken(YarnClientImpl.java:330) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:250) > at > org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:291) > at 
org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:290) > at > org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:240) > at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290) > at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287) > at > org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.pig.backend.hadoop23.PigJobControl.submit(PigJobControl.java:128) > at > org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:194) > at java.lang.Thread.run(Thread.java:745) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:276) > Caused by: > org.apache.hadoop.security.authentication.client.AuthenticationException: > java.net.SocketTimeoutException: Read timed out > at > org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:332) > at > org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:205) > at > org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:128) > at > org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:215) > at > 
org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.doDelegationTokenOperation(DelegationTokenAuthenticator.java:285) > at > org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.getDelegationToken(DelegationTokenAuthenticator.java:166) > at > org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.getDelegationToken(DelegationTokenAuthenticatedURL.java:371) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$2.run(TimelineClientImpl.java:475) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$2.run(TimelineClientImpl.java:467) > at
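The retry behavior under discussion - rerunning the delegation-token operation when it fails with a SocketTimeoutException, up to a bounded number of attempts - can be sketched as follows. {{RetryOp}} and {{retryOn}} here are illustrative stand-ins, not the real TimelineClientImpl internals:

```java
import java.net.SocketTimeoutException;

// Sketch of retrying an operation on SocketTimeoutException. The interface
// and method names are hypothetical, simplified from the shape of the code
// discussed in this JIRA; they are not the actual TimelineClientImpl API.
public class TimelineRetrySketch {

    public interface RetryOp {
        void run() throws Exception;
        boolean shouldRetryOn(Exception e);
    }

    // Returns the number of attempts it took to succeed.
    public static int retryOn(RetryOp op, int maxRetries) {
        int attempts = 0;
        while (true) {
            attempts++;
            try {
                op.run();
                return attempts;
            } catch (Exception e) {
                // give up when out of retries or the exception is not retriable
                if (attempts > maxRetries || !op.shouldRetryOn(e)) {
                    throw new RuntimeException(
                        "giving up after " + attempts + " attempt(s)", e);
                }
            }
        }
    }
}
```

With {{shouldRetryOn}} returning true only for SocketTimeoutException, a transient read timeout during the SPNEGO exchange is retried instead of failing the whole job submission.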
[jira] [Updated] (YARN-4957) Add getNewReservation in ApplicationClientProtocol
[ https://issues.apache.org/jira/browse/YARN-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Po updated YARN-4957: -- Attachment: YARN-4957.v0.patch > Add getNewReservation in ApplicationClientProtocol > -- > > Key: YARN-4957 > URL: https://issues.apache.org/jira/browse/YARN-4957 > Project: Hadoop YARN > Issue Type: Sub-task > Components: applications, client, resourcemanager >Affects Versions: 2.8.0 >Reporter: Subru Krishnan >Assignee: Sean Po > Attachments: YARN-4957.v0.patch > > > Currently submitReservation returns a ReservationId if successful. This JIRA > proposes adding a getNewReservation in ApplicationClientProtocol for the > following reasons: > * Prevent zombie reservations in the face of client and/or network failures > post submitReservation > * Align reservation submission with application submission
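The zombie-reservation argument can be made concrete with a small sketch: if the client obtains the id first, retrying the submit after a client or network failure is idempotent. {{ReservationIdSketch}} and its methods are hypothetical stand-ins, not the actual ResourceManager code:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative two-step submission, mirroring how application submission
// works (getNewApplication, then submitApplication). Hypothetical class.
public class ReservationIdSketch {
    private final AtomicLong nextId = new AtomicLong();
    private final ConcurrentHashMap<Long, String> reservations =
        new ConcurrentHashMap<>();

    // Step 1: hand out a fresh id; no server-side reservation state yet.
    public long getNewReservation() {
        return nextId.incrementAndGet();
    }

    // Step 2: submit under that id; a duplicate submit is a no-op, so a
    // client retry after a timeout cannot leak a second reservation.
    public void submitReservation(long id, String definition) {
        reservations.putIfAbsent(id, definition);
    }

    public int count() {
        return reservations.size();
    }
}
```

By contrast, when submitReservation itself mints the id, a retried submit whose first attempt actually reached the server creates a second, orphaned reservation.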
[jira] [Commented] (YARN-4468) Document the general ReservationSystem functionality, and the REST API
[ https://issues.apache.org/jira/browse/YARN-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243933#comment-15243933 ] Subru Krishnan commented on YARN-4468: -- Thanks [~asuresh] for reviewing/committing. > Document the general ReservationSystem functionality, and the REST API > -- > > Key: YARN-4468 > URL: https://issues.apache.org/jira/browse/YARN-4468 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, fairscheduler, resourcemanager >Reporter: Carlo Curino >Assignee: Carlo Curino > Fix For: 2.8.0 > > Attachments: YARN-4468-v4-branch2.patch, YARN-4468.1.patch, > YARN-4468.rest-only.patch, YARN-4486-v3.patch, YARN-4486-v4.patch, > yarn_reservation_system.png > > > This JIRA tracks effort to document the ReservationSystem functionality, and > the REST API access to it.
[jira] [Updated] (YARN-4468) Document the general ReservationSystem functionality, and the REST API
[ https://issues.apache.org/jira/browse/YARN-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subru Krishnan updated YARN-4468: - Target Version/s: (was: 2.8.0)
[jira] [Updated] (YARN-4468) Document the general ReservationSystem functionality, and the REST API
[ https://issues.apache.org/jira/browse/YARN-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subru Krishnan updated YARN-4468: - Fix Version/s: 2.8.0
[jira] [Commented] (YARN-4955) Add retry for SocketTimeoutException in TimelineClient
[ https://issues.apache.org/jira/browse/YARN-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243917#comment-15243917 ] Vinod Kumar Vavilapalli commented on YARN-4955: --- I think I understand what you guys are saying. [~xgong], you can do this by creating an inner class inside TimelineClientRetryOp, say {{TimelineClientRetryOpForOperateDelegationToken}} (instead of {{createTimelineClientRetryOpForOperateDelegationToken()}}), and then overriding *only* the run() method of this class inside the test case to throw a SocketTimeoutException.
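The test approach suggested above - a named operation class whose run() the test subclasses to inject the timeout - can be sketched as follows. The class names mirror the ones proposed in the comment but are simplified, self-contained stand-ins, not the real TimelineClientImpl code:

```java
import java.net.SocketTimeoutException;

// Hypothetical sketch: the delegation-token operation as a named class so a
// test can override only run(), keeping the production code path untouched.
public class RetryOpOverrideSketch {

    public static class TimelineClientRetryOpForOperateDelegationToken {
        public void run() throws Exception {
            // production code would perform the real delegation-token
            // operation here; the default succeeds
        }
    }

    // Returns true when the operation fails specifically with a socket timeout.
    public static boolean failsWithTimeout(
            TimelineClientRetryOpForOperateDelegationToken op) {
        try {
            op.run();
            return false;
        } catch (SocketTimeoutException e) {
            return true; // this is the case the retry logic should handle
        } catch (Exception e) {
            return false;
        }
    }
}
```

A test then builds an anonymous subclass whose run() throws SocketTimeoutException, which exercises the retry path without any mock framework.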
[jira] [Updated] (YARN-4966) More improvement to get Container logs without specify nodeId
[ https://issues.apache.org/jira/browse/YARN-4966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-4966: Attachment: YARN-4966.1.patch > More improvement to get Container logs without specify nodeId > - > > Key: YARN-4966 > URL: https://issues.apache.org/jira/browse/YARN-4966 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-4966.1.patch > > > Currently, for a finished application, we can get the container logs > without specifying the node id, but we need to enable > yarn.timeline-service.generic-application-history.enabled.
[jira] [Commented] (YARN-4966) More improvement to get Container logs without specify nodeId
[ https://issues.apache.org/jira/browse/YARN-4966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243910#comment-15243910 ] Xuan Gong commented on YARN-4966: - What we could improve: if yarn.timeline-service.generic-application-history.enabled is disabled, then for a finished application we could read through all the aggregated log files in HDFS and try to find the specific container's log.
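The proposed fallback - when application history is disabled and no nodeId is supplied, scan every node's aggregated log file for the requested container id - reduces to a linear search. In this sketch a map stands in for listing the per-node aggregated log files under the HDFS remote-app-log directory; the class and method names are illustrative, not the yarn CLI internals:

```java
// Hypothetical sketch of the scan-all-nodes fallback for locating a
// container's log when the node id is unknown.
public class ContainerLogScanSketch {
    public static String findNodeForContainer(
            java.util.Map<String, java.util.Set<String>> containersByNodeLog,
            String containerId) {
        for (java.util.Map.Entry<String, java.util.Set<String>> entry
                : containersByNodeLog.entrySet()) {
            if (entry.getValue().contains(containerId)) {
                return entry.getKey(); // node whose aggregated log holds it
            }
        }
        return null; // no log found for this container on any node
    }
}
```

The cost is one pass over every node's aggregated log, which is why this is only a fallback for when the application-history lookup is unavailable.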
[jira] [Created] (YARN-4966) More improvement to get Container logs without specify nodeId
Xuan Gong created YARN-4966: --- Summary: More improvement to get Container logs without specify nodeId Key: YARN-4966 URL: https://issues.apache.org/jira/browse/YARN-4966 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Currently, for a finished application, we can get the container logs without specifying the node id, but we need to enable yarn.timeline-service.generic-application-history.enabled.
[jira] [Commented] (YARN-4955) Add retry for SocketTimeoutException in TimelineClient
[ https://issues.apache.org/jira/browse/YARN-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243903#comment-15243903 ] Xuan Gong commented on YARN-4955: - [~gtCarrera9] In order to test the retry on SocketTimeoutException, I have to override TimelineClientRetryOp#run() to throw a SocketTimeoutException.
[jira] [Commented] (YARN-4955) Add retry for SocketTimeoutException in TimelineClient
[ https://issues.apache.org/jira/browse/YARN-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243894#comment-15243894 ] Li Lu commented on YARN-4955: - Sorry I didn't get back earlier. One point of confusion: are we testing the behavior of {{clientFake}}'s retry instead of the client's retry here?
[jira] [Commented] (YARN-4955) Add retry for SocketTimeoutException in TimelineClient
[ https://issues.apache.org/jira/browse/YARN-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243891#comment-15243891 ] Vinod Kumar Vavilapalli commented on YARN-4955: --- [~djp], I was reviewing the previous patch and gave some review comments, so I'll review this update and commit it once they are addressed; no need to rush.
[jira] [Commented] (YARN-4514) [YARN-3368] Cleanup hardcoded configurations, such as RM/ATS addresses
[ https://issues.apache.org/jira/browse/YARN-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243889#comment-15243889 ] Hadoop QA commented on YARN-4514: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 49s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 44s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 4m 15s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:4bf84af | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12798975/YARN-4514-YARN-3368.8.patch | | JIRA Issue | YARN-4514 | | Optional Tests | asflicense | | uname | Linux 3ccea15df82d 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | YARN-3368 / 4bf84af | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui . U: . 
| | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/11108/console | | Powered by | Apache Yetus 0.2.0 http://yetus.apache.org | This message was automatically generated. > [YARN-3368] Cleanup hardcoded configurations, such as RM/ATS addresses > -- > > Key: YARN-4514 > URL: https://issues.apache.org/jira/browse/YARN-4514 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Sunil G > Attachments: YARN-4514-YARN-3368.1.patch, > YARN-4514-YARN-3368.2.patch, YARN-4514-YARN-3368.3.patch, > YARN-4514-YARN-3368.4.patch, YARN-4514-YARN-3368.5.patch, > YARN-4514-YARN-3368.6.patch, YARN-4514-YARN-3368.7.patch, > YARN-4514-YARN-3368.8.patch > > > We have several configurations that are hard-coded, for example the RM/ATS > addresses; we should make them configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4468) Document the general ReservationSystem functionality, and the REST API
[ https://issues.apache.org/jira/browse/YARN-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243886#comment-15243886 ] Hudson commented on YARN-4468: -- FAILURE: Integrated in Hadoop-trunk-Commit #9621 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9621/]) YARN-4468. Document the general ReservationSystem functionality, and the (arun suresh: rev cab9cbaa0a6d92dd6473545da0ea1e6a22fd09e1) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ReservationSystem.md * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YARN.md * hadoop-project/src/site/site.xml * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ResourceManagerRest.md * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/resources/images/yarn_reservation_system.png > Document the general ReservationSystem functionality, and the REST API > -- > > Key: YARN-4468 > URL: https://issues.apache.org/jira/browse/YARN-4468 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, fairscheduler, resourcemanager >Reporter: Carlo Curino >Assignee: Carlo Curino > Attachments: YARN-4468-v4-branch2.patch, YARN-4468.1.patch, > YARN-4468.rest-only.patch, YARN-4486-v3.patch, YARN-4486-v4.patch, > yarn_reservation_system.png > > > This JIRA tracks effort to document the ReservationSystem functionality, and > the REST API access to it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4468) Document the general ReservationSystem functionality, and the REST API
[ https://issues.apache.org/jira/browse/YARN-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subru Krishnan updated YARN-4468: - Attachment: YARN-4468-v4-branch2.patch Attaching a patch file rebased against branch-2. > Document the general ReservationSystem functionality, and the REST API > -- > > Key: YARN-4468 > URL: https://issues.apache.org/jira/browse/YARN-4468 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, fairscheduler, resourcemanager >Reporter: Carlo Curino >Assignee: Carlo Curino > Attachments: YARN-4468-v4-branch2.patch, YARN-4468.1.patch, > YARN-4468.rest-only.patch, YARN-4486-v3.patch, YARN-4486-v4.patch, > yarn_reservation_system.png > > > This JIRA tracks effort to document the ReservationSystem functionality, and > the REST API access to it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4955) Add retry for SocketTimeoutException in TimelineClient
[ https://issues.apache.org/jira/browse/YARN-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243880#comment-15243880 ] Junping Du commented on YARN-4955: -- The checkstyle issue seems to be noisy only. whitespace issue can be fixed in commit. Patch LGTM. [~vinodkv] and [~gtCarrera9], do you have more comments? If not, I will go ahead to commit this... > Add retry for SocketTimeoutException in TimelineClient > -- > > Key: YARN-4955 > URL: https://issues.apache.org/jira/browse/YARN-4955 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Xuan Gong >Assignee: Xuan Gong >Priority: Critical > Attachments: YARN-4955.1.patch, YARN-4955.2.patch, YARN-4955.3.patch, > YARN-4955.4-1.patch, YARN-4955.4.patch, YARN-4955.5.patch > > > We saw this exception several times when we tried to getDelegationToken from > ATS. > java.io.IOException: > org.apache.hadoop.security.authentication.client.AuthenticationException: > java.net.SocketTimeoutException: Read timed out > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$5.run(TimelineClientImpl.java:569) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineClientConnectionRetry.retryOn(TimelineClientImpl.java:234) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.operateDelegationToken(TimelineClientImpl.java:582) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.getDelegationToken(TimelineClientImpl.java:479) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getTimelineDelegationToken(YarnClientImpl.java:349) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.addTimelineDelegationToken(YarnClientImpl.java:330) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:250) > at > org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:291) > at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:290) > at > 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:240) > at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290) > at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287) > at > org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.pig.backend.hadoop23.PigJobControl.submit(PigJobControl.java:128) > at > org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:194) > at java.lang.Thread.run(Thread.java:745) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:276) > Caused by: > org.apache.hadoop.security.authentication.client.AuthenticationException: > java.net.SocketTimeoutException: Read timed out > at > org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:332) > at > org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:205) > at > org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:128) > at > org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:215) > at > org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.doDelegationTokenOperation(DelegationTokenAuthenticator.java:285) > at > 
org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.getDelegationToken(DelegationTokenAuthenticator.java:166) > at > org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.getDelegationToken(DelegationTokenAuthenticatedURL.java:371) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$2.run(TimelineClientImpl.java:475) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$2.run(TimelineClientImpl.java:467) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) >
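The fix discussed in YARN-4955 above amounts to wrapping timeline operations in a retry loop that treats SocketTimeoutException as retriable. A minimal sketch of that idea, assuming hypothetical names (the real logic lives in TimelineClientImpl$TimelineClientConnectionRetry, whose API may differ):

```java
import java.io.IOException;
import java.net.SocketTimeoutException;
import java.util.concurrent.Callable;

// Hypothetical sketch of retrying an operation on SocketTimeoutException.
// Names are illustrative, not the actual TimelineClientImpl API.
public class TimelineRetrySketch {

    /** Run op, retrying up to maxRetries times on SocketTimeoutException. */
    static <T> T callWithRetries(Callable<T> op, int maxRetries, long backoffMs)
            throws Exception {
        for (int attempt = 0; ; attempt++) {
            try {
                return op.call();
            } catch (SocketTimeoutException e) {
                if (attempt >= maxRetries) {
                    throw new IOException("giving up after " + attempt + " retries", e);
                }
                Thread.sleep(backoffMs); // simple fixed backoff between attempts
            }
        }
    }

    /** Tiny demo: an operation that times out twice, then succeeds.
     *  Returns the number of attempts it took, or -1 on failure. */
    static int demoAttemptsNeeded() {
        final int[] calls = {0};
        try {
            callWithRetries(() -> {
                calls[0]++;
                if (calls[0] < 3) {
                    throw new SocketTimeoutException("Read timed out");
                }
                return "token";
            }, 5, 1L);
        } catch (Exception e) {
            return -1;
        }
        return calls[0];
    }
}
```

With this shape, a transient read timeout during getDelegationToken would be absorbed by the loop instead of failing the job submission outright.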
[jira] [Updated] (YARN-4849) [YARN-3368] cleanup code base, integrate web UI related build to mvn, and fix licenses.
[ https://issues.apache.org/jira/browse/YARN-4849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-4849: - Attachment: YARN-4849-YARN-3368.8.patch > [YARN-3368] cleanup code base, integrate web UI related build to mvn, and fix > licenses. > --- > > Key: YARN-4849 > URL: https://issues.apache.org/jira/browse/YARN-4849 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Fix For: YARN-3368 > > Attachments: YARN-4849-YARN-3368.1.patch, > YARN-4849-YARN-3368.2.patch, YARN-4849-YARN-3368.3.patch, > YARN-4849-YARN-3368.4.patch, YARN-4849-YARN-3368.5.patch, > YARN-4849-YARN-3368.6.patch, YARN-4849-YARN-3368.7.patch, > YARN-4849-YARN-3368.8.patch, YARN-4849-YARN-3368.addendum.1.patch, > YARN-4849-YARN-3368.addendum.2.patch, YARN-4849-YARN-3368.addendum.3.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-4965) Distributed shell AM failed due to ClientHandlerException thrown by jersey
[ https://issues.apache.org/jira/browse/YARN-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du reassigned YARN-4965: Assignee: Junping Du > Distributed shell AM failed due to ClientHandlerException thrown by jersey > -- > > Key: YARN-4965 > URL: https://issues.apache.org/jira/browse/YARN-4965 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.7.2 >Reporter: Sumana Sathish >Assignee: Junping Du >Priority: Critical > Fix For: 2.8.0 > > > Distributed shell AM failed with java.io.IOException > {code:title= app logs} > Exception in thread "AMRM Callback Handler Thread" > org.apache.hadoop.yarn.exceptions.YarnRuntimeException: > com.sun.jersey.api.client.ClientHandlerException: java.io.IOException: Stream > closed. > at > org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:312) > Caused by: com.sun.jersey.api.client.ClientHandlerException: > java.io.IOException: Stream closed. > at > com.sun.jersey.api.client.ClientResponse.getEntity(ClientResponse.java:563) > at > com.sun.jersey.api.client.ClientResponse.getEntity(ClientResponse.java:506) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:446) > at > org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.publishContainerEndEvent(ApplicationMaster.java:1144) > at > org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.access$400(ApplicationMaster.java:169) > at > org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$RMCallbackHandler.onContainersCompleted(ApplicationMaster.java:779) > at > org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:300) > Caused by: java.io.IOException: Stream closed. 
> at > java.net.AbstractPlainSocketImpl.available(AbstractPlainSocketImpl.java:458) > at java.net.SocketInputStream.available(SocketInputStream.java:245) > at java.io.BufferedInputStream.read(BufferedInputStream.java:342) > at > sun.net.www.http.ChunkedInputStream.readAheadBlocking(ChunkedInputStream.java:552) > at > sun.net.www.http.ChunkedInputStream.readAhead(ChunkedInputStream.java:609) > at sun.net.www.http.ChunkedInputStream.read(ChunkedInputStream.java:696) > at java.io.FilterInputStream.read(FilterInputStream.java:133) > at > sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3067) > at > org.codehaus.jackson.impl.ByteSourceBootstrapper.ensureLoaded(ByteSourceBootstrapper.java:507) > at > org.codehaus.jackson.impl.ByteSourceBootstrapper.detectEncoding(ByteSourceBootstrapper.java:129) > at > org.codehaus.jackson.impl.ByteSourceBootstrapper.constructParser(ByteSourceBootstrapper.java:224) > at > org.codehaus.jackson.JsonFactory._createJsonParser(JsonFactory.java:785) > at > org.codehaus.jackson.JsonFactory.createJsonParser(JsonFactory.java:561) > at > org.codehaus.jackson.jaxrs.JacksonJsonProvider.readFrom(JacksonJsonProvider.java:414) > at > com.sun.jersey.api.client.ClientResponse.getEntity(ClientResponse.java:553) > ... 6 more > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-4965) Distributed shell AM failed due to ClientHandlerException thrown by jersey
Sumana Sathish created YARN-4965: Summary: Distributed shell AM failed due to ClientHandlerException thrown by jersey Key: YARN-4965 URL: https://issues.apache.org/jira/browse/YARN-4965 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.2 Reporter: Sumana Sathish Priority: Critical Fix For: 2.8.0 Distributed shell AM failed with java.io.IOException {code:title= app logs} Exception in thread "AMRM Callback Handler Thread" org.apache.hadoop.yarn.exceptions.YarnRuntimeException: com.sun.jersey.api.client.ClientHandlerException: java.io.IOException: Stream closed. at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:312) Caused by: com.sun.jersey.api.client.ClientHandlerException: java.io.IOException: Stream closed. at com.sun.jersey.api.client.ClientResponse.getEntity(ClientResponse.java:563) at com.sun.jersey.api.client.ClientResponse.getEntity(ClientResponse.java:506) at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:446) at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.publishContainerEndEvent(ApplicationMaster.java:1144) at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.access$400(ApplicationMaster.java:169) at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$RMCallbackHandler.onContainersCompleted(ApplicationMaster.java:779) at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:300) Caused by: java.io.IOException: Stream closed. 
at java.net.AbstractPlainSocketImpl.available(AbstractPlainSocketImpl.java:458) at java.net.SocketInputStream.available(SocketInputStream.java:245) at java.io.BufferedInputStream.read(BufferedInputStream.java:342) at sun.net.www.http.ChunkedInputStream.readAheadBlocking(ChunkedInputStream.java:552) at sun.net.www.http.ChunkedInputStream.readAhead(ChunkedInputStream.java:609) at sun.net.www.http.ChunkedInputStream.read(ChunkedInputStream.java:696) at java.io.FilterInputStream.read(FilterInputStream.java:133) at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3067) at org.codehaus.jackson.impl.ByteSourceBootstrapper.ensureLoaded(ByteSourceBootstrapper.java:507) at org.codehaus.jackson.impl.ByteSourceBootstrapper.detectEncoding(ByteSourceBootstrapper.java:129) at org.codehaus.jackson.impl.ByteSourceBootstrapper.constructParser(ByteSourceBootstrapper.java:224) at org.codehaus.jackson.JsonFactory._createJsonParser(JsonFactory.java:785) at org.codehaus.jackson.JsonFactory.createJsonParser(JsonFactory.java:561) at org.codehaus.jackson.jaxrs.JacksonJsonProvider.readFrom(JacksonJsonProvider.java:414) at com.sun.jersey.api.client.ClientResponse.getEntity(ClientResponse.java:553) ... 6 more {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4468) Document the general ReservationSystem functionality, and the REST API
[ https://issues.apache.org/jira/browse/YARN-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243877#comment-15243877 ] Arun Suresh commented on YARN-4468: --- +1, Committing this shortly > Document the general ReservationSystem functionality, and the REST API > -- > > Key: YARN-4468 > URL: https://issues.apache.org/jira/browse/YARN-4468 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, fairscheduler, resourcemanager >Reporter: Carlo Curino >Assignee: Carlo Curino > Attachments: YARN-4468.1.patch, YARN-4468.rest-only.patch, > YARN-4486-v3.patch, YARN-4486-v4.patch, yarn_reservation_system.png > > > This JIRA tracks effort to document the ReservationSystem functionality, and > the REST API access to it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4468) Document the general ReservationSystem functionality, and the REST API
[ https://issues.apache.org/jira/browse/YARN-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subru Krishnan updated YARN-4468: - Attachment: yarn_reservation_system.png Attaching the reservation system architecture diagram > Document the general ReservationSystem functionality, and the REST API > -- > > Key: YARN-4468 > URL: https://issues.apache.org/jira/browse/YARN-4468 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, fairscheduler, resourcemanager >Reporter: Carlo Curino >Assignee: Carlo Curino > Attachments: YARN-4468.1.patch, YARN-4468.rest-only.patch, > YARN-4486-v3.patch, YARN-4486-v4.patch, yarn_reservation_system.png > > > This JIRA tracks effort to document the ReservationSystem functionality, and > the REST API access to it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4676) Automatic and Asynchronous Decommissioning Nodes Status Tracking
[ https://issues.apache.org/jira/browse/YARN-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Zhi updated YARN-4676: - Attachment: YARN-4676.012.patch Fixed TestPBImplRecords and TestRMAdminCLI (related to patch 011). I don't expect the other QA test failures to be related; I ran about a dozen of them locally and saw 9 pass and 3 fail both with and without my patch. > Automatic and Asynchronous Decommissioning Nodes Status Tracking > > > Key: YARN-4676 > URL: https://issues.apache.org/jira/browse/YARN-4676 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 2.8.0 >Reporter: Daniel Zhi >Assignee: Daniel Zhi > Labels: features > Attachments: GracefulDecommissionYarnNode.pdf, YARN-4676.004.patch, > YARN-4676.005.patch, YARN-4676.006.patch, YARN-4676.007.patch, > YARN-4676.008.patch, YARN-4676.009.patch, YARN-4676.010.patch, > YARN-4676.011.patch, YARN-4676.012.patch > > > DecommissioningNodeWatcher inside ResourceTrackingService tracks > DECOMMISSIONING nodes' status automatically and asynchronously after the > client/admin makes a graceful decommission request. It tracks the status of > DECOMMISSIONING nodes to decide when, after all running containers on > the node have completed, the node will be transitioned into the DECOMMISSIONED > state. NodesListManager detects and handles include and exclude list changes to > kick off decommission or recommission as necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
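The state tracking the YARN-4676 description outlines boils down to: after a graceful decommission request, hold the node in DECOMMISSIONING until its running-container count reaches zero. A minimal sketch, with hypothetical names (the real tracking is done by DecommissioningNodeWatcher):

```java
// Hypothetical sketch of the DECOMMISSIONING -> DECOMMISSIONED transition
// described in YARN-4676; names are illustrative only.
public class DecommissionSketch {
    enum State { RUNNING, DECOMMISSIONING, DECOMMISSIONED }

    State state = State.RUNNING;
    private int runningContainers;

    DecommissionSketch(int runningContainers) {
        this.runningContainers = runningContainers;
    }

    /** Graceful decommission request from the client/admin. */
    void startGracefulDecommission() {
        if (state == State.RUNNING) {
            state = State.DECOMMISSIONING;
        }
        maybeFinish();
    }

    /** Called as each container on the node completes. */
    void containerFinished() {
        runningContainers--;
        maybeFinish();
    }

    private void maybeFinish() {
        // Transition only once every running container has completed.
        if (state == State.DECOMMISSIONING && runningContainers == 0) {
            state = State.DECOMMISSIONED;
        }
    }
}
```

A recommission (node removed from the exclude list) would simply move the node back to RUNNING before the final transition fires; that path is omitted here.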
[jira] [Commented] (YARN-2883) Queuing of container requests in the NM
[ https://issues.apache.org/jira/browse/YARN-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243722#comment-15243722 ] Arun Suresh commented on YARN-2883: --- Looks like you have to add an additional {noformat} if (!(event instanceof ApplicationContainerFinishedEvent)) { throw new RuntimeException("...") } {noformat} prior to casting it. This is what's done in other parts of the code: https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java#L1204 > Queuing of container requests in the NM > --- > > Key: YARN-2883 > URL: https://issues.apache.org/jira/browse/YARN-2883 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Konstantinos Karanasos >Assignee: Konstantinos Karanasos > Attachments: YARN-2883-trunk.004.patch, YARN-2883-trunk.005.patch, > YARN-2883-trunk.006.patch, YARN-2883-trunk.007.patch, > YARN-2883-trunk.008.patch, YARN-2883-trunk.009.patch, > YARN-2883-trunk.010.patch, YARN-2883-trunk.011.patch, > YARN-2883-trunk.012.patch, YARN-2883-trunk.013.patch, > YARN-2883-yarn-2877.001.patch, YARN-2883-yarn-2877.002.patch, > YARN-2883-yarn-2877.003.patch, YARN-2883-yarn-2877.004.patch, > YARN-2883.013.patch > > > We propose to add a queue in each NM, where queueable container requests can > be held. > Based on the available resources in the node and the containers in the queue, > the NM will decide when to allow the execution of a queued container. > In order to ensure the instantaneous start of a guaranteed-start container, > the NM may decide to pre-empt/kill running queueable containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
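The queuing behavior proposed in the YARN-2883 description can be sketched as a small admission routine: a container either starts, waits in the queue, or, for a guaranteed-start request, triggers preemption of running queueable containers. This is a minimal illustration with hypothetical names, tracking memory only (the actual NM patch is far more involved):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch of NM-side container queuing (YARN-2883 idea).
public class NmQueueSketch {
    static final class Container {
        final String id;
        final int memMb;
        final boolean queueable; // queueable requests may wait or be killed
        Container(String id, int memMb, boolean queueable) {
            this.id = id; this.memMb = memMb; this.queueable = queueable;
        }
    }

    private int freeMemMb;
    private final List<Container> running = new ArrayList<>();
    private final Deque<Container> waitQueue = new ArrayDeque<>();

    NmQueueSketch(int capacityMb) { this.freeMemMb = capacityMb; }

    /** Start c if resources allow; otherwise queue it (queueable) or
     *  kill running queueable containers to make room (guaranteed-start).
     *  Returns the ids of any containers killed to admit c. */
    List<String> admit(Container c) {
        List<String> killed = new ArrayList<>();
        if (!c.queueable) {
            Iterator<Container> it = running.iterator();
            while (freeMemMb < c.memMb && it.hasNext()) {
                Container r = it.next();
                if (r.queueable) {
                    it.remove();
                    freeMemMb += r.memMb;
                    killed.add(r.id);
                }
            }
        }
        if (freeMemMb >= c.memMb) {
            freeMemMb -= c.memMb;
            running.add(c);
        } else {
            waitQueue.add(c); // held until resources free up
        }
        return killed;
    }
}
```

Draining waitQueue as capacity frees up is omitted; the point is only the guaranteed-vs-queueable admission decision.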
[jira] [Commented] (YARN-4963) capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat configurable
[ https://issues.apache.org/jira/browse/YARN-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243662#comment-15243662 ] Nathan Roberts commented on YARN-4963: -- Thanks [~leftnoteasy] for the feedback. I agree that it would be a useful feature to be able to give some applications better spread, regardless of allocation type. Now we just have to figure out how to get there. My concern is that I don't think we'd want to implement it using the same simple approach if it's going to apply to all container types. For example, in our case we almost always want NODE_LOCAL and RACK_LOCAL to get scheduled as quickly as possible so I'd want the limit to be high, as opposed to OFF_SWITCH where I want the limit to be 3-5 to keep a nice balance between scheduling performance and clustering. The reason this check was introduced in the first place (iirc) was to prevent network-heavy applications from loading up on specific nodes. The OFF_SWITCH check was a simple way of achieving this at a global level. The feature I think you're asking for (please correct me if I misunderstood) is that applications should be able to request that container spread be prioritized over timely scheduling (kind of like locality delay does today). I completely agree this would be a useful knob for applications to have. It is a trade-off though. An application that wants really good spread would be sacrificing scheduling opportunities that would probably be given to applications behind them in the queue (like locality delay). So maybe there are two things to do: 1) Have the global OFF_SWITCH check to handle the simple case of avoiding too many network-heavy applications on a node. 2) A feature where applications can specify a max_containers_assigned_per_node_per_heartbeat. I think this would be checked down in LeafQueue.assignContainers(). 
Even with #2 in place, I don't think #1 could immediately go away because the network-heavy applications would need to start properly specifying this limit. The other approach to get rid of #1 would be when network is a resource. Such applications could then request lots of network resource, which should prevent clustering. Does that make any sort of sense? > capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat > configurable > > > Key: YARN-4963 > URL: https://issues.apache.org/jira/browse/YARN-4963 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 3.0.0, 2.7.2 >Reporter: Nathan Roberts >Assignee: Nathan Roberts > Attachments: YARN-4963.001.patch > > > Currently the capacity scheduler will allow exactly 1 OFF_SWITCH assignment > per heartbeat. With more and more non MapReduce workloads coming along, the > degree of locality is declining, causing scheduling to be significantly > slower. It's still important to limit the number of OFF_SWITCH assignments to > avoid densely packing OFF_SWITCH containers onto nodes. > Proposal is to add a simple config that makes the number of OFF_SWITCH > assignments configurable. > Will upload candidate patch shortly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
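The global check being made configurable in YARN-4963 can be pictured as a counter inside one node-heartbeat scheduling pass: local assignments are unconstrained while OFF_SWITCH assignments stop at the limit. A rough sketch, assuming hypothetical names (the real check sits inside the capacity scheduler's allocation path):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a per-heartbeat OFF_SWITCH assignment cap.
// The config name and its exact placement in CapacityScheduler are not shown.
public class OffSwitchLimitSketch {
    enum Locality { NODE_LOCAL, RACK_LOCAL, OFF_SWITCH }

    /** One node heartbeat: NODE_LOCAL/RACK_LOCAL requests are granted freely,
     *  while OFF_SWITCH grants stop once offSwitchLimit is reached. */
    static List<Locality> assignOneHeartbeat(List<Locality> requests,
                                             int offSwitchLimit) {
        List<Locality> granted = new ArrayList<>();
        int offSwitchAssigned = 0;
        for (Locality l : requests) {
            if (l == Locality.OFF_SWITCH) {
                if (offSwitchAssigned >= offSwitchLimit) {
                    continue; // deferred to a later heartbeat
                }
                offSwitchAssigned++;
            }
            granted.add(l);
        }
        return granted;
    }
}
```

With the limit at the current hard-coded value of 1, only the first OFF_SWITCH request in a heartbeat is granted; raising it to 3-5, as the comment suggests, trades a little container spread for faster scheduling.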
[jira] [Commented] (YARN-2306) leak of reservation metrics (fair scheduler)
[ https://issues.apache.org/jira/browse/YARN-2306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243651#comment-15243651 ] Yufei Gu commented on YARN-2306: Hi [~zhiguohong], Thank you for working on this. I am glad you are still working on it. Have you addressed all the comments and the test failures from Hadoop QA? > leak of reservation metrics (fair scheduler) > > > Key: YARN-2306 > URL: https://issues.apache.org/jira/browse/YARN-2306 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: Hong Zhiguo >Assignee: Hong Zhiguo >Priority: Minor > Attachments: YARN-2306-2.patch, YARN-2306-3.patch, YARN-2306.patch > > > This only applies to the fair scheduler; the capacity scheduler is OK. > When an appAttempt or node is removed, the metrics for > reservations (reservedContainers, reservedMB, reservedVCores) are not reduced > back. > These are important metrics for administrators, and the wrong values may > confuse them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
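The YARN-2306 leak is a bookkeeping imbalance: every reservation increment must be matched by a decrement when the reserved container, its app attempt, or its node goes away. A minimal sketch of the invariant, with hypothetical names (the real counters live in the scheduler's QueueMetrics):

```java
// Hypothetical sketch of the reservation-metrics invariant behind YARN-2306.
public class ReservationMetricsSketch {
    long reservedContainers, reservedMB, reservedVCores;

    void reserveResource(long mb, long vcores) {
        reservedContainers++;
        reservedMB += mb;
        reservedVCores += vcores;
    }

    /** The missing half in the bug above: this must run not only on a normal
     *  unreserve, but also when an app attempt or node that still holds a
     *  reservation is removed; otherwise the counters drift upward forever. */
    void unreserveResource(long mb, long vcores) {
        reservedContainers--;
        reservedMB -= mb;
        reservedVCores -= vcores;
    }
}
```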
[jira] [Commented] (YARN-4955) Add retry for SocketTimeoutException in TimelineClient
[ https://issues.apache.org/jira/browse/YARN-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243636#comment-15243636 ] Hadoop QA commented on YARN-4955: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 32s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 9s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | 
{color:green} mvninstall {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 19s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 18s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common: patch generated 1 new + 19 unchanged - 0 fixed = 20 total (was 19) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 58s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.8.0_77. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 12s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 20m 34s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:fbe3e86 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12799008/YARN-4955.5.patch | | JIRA Issue | YARN-4955 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 0553c18b369c 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk /
[jira] [Commented] (YARN-4940) yarn node -list -all failed if RM start with decommissioned node
[ https://issues.apache.org/jira/browse/YARN-4940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243627#comment-15243627 ] Hudson commented on YARN-4940: -- FAILURE: Integrated in Hadoop-trunk-Commit #9620 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9620/]) YARN-4940. yarn node -list -all failed if RM start with decommissioned (jlowe: rev 69f3d428d5c3ab0c79cacffc22b1f59408622ae7) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/NodesListManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java > yarn node -list -all failed if RM start with decommissioned node > > > Key: YARN-4940 > URL: https://issues.apache.org/jira/browse/YARN-4940 > Project: Hadoop YARN > Issue Type: Bug >Reporter: sandflee >Assignee: sandflee > Fix For: 2.8.0, 2.7.3 > > Attachments: YARN-4940.01.patch, YARN-4940.02.patch, > YARN-4940.03.patch, YARN-4940.04.patch, YARN-4940.05.patch > > > 1, add a node to exclude file > 2, start RM > 3, run yarn node -list -all , see the following exception > {quote} > Exception in thread "main" java.lang.ClassCastException: > org.apache.hadoop.yarn.server.resourcemanager.NodesListManager$UnknownNodeId > cannot be cast to org.apache.hadoop.yarn.api.records.impl.pb.NodeIdPBImpl > at > org.apache.hadoop.yarn.api.records.impl.pb.NodeReportPBImpl.mergeLocalToBuilder(NodeReportPBImpl.java:251) > at > 
org.apache.hadoop.yarn.api.records.impl.pb.NodeReportPBImpl.mergeLocalToProto(NodeReportPBImpl.java:287) > at > org.apache.hadoop.yarn.api.records.impl.pb.NodeReportPBImpl.getProto(NodeReportPBImpl.java:224) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetClusterNodesResponsePBImpl.convertToProtoFormat(GetClusterNodesResponsePBImpl.java:172) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetClusterNodesResponsePBImpl.access$000(GetClusterNodesResponsePBImpl.java:38) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetClusterNodesResponsePBImpl$1$1.next(GetClusterNodesResponsePBImpl.java:152) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetClusterNodesResponsePBImpl$1$1.next(GetClusterNodesResponsePBImpl.java:141) > at > com.google.protobuf.AbstractMessageLite$Builder.checkForNullValues(AbstractMessageLite.java:336) > at > com.google.protobuf.AbstractMessageLite$Builder.addAll(AbstractMessageLite.java:323) > at > org.apache.hadoop.yarn.proto.YarnServiceProtos$GetClusterNodesResponseProto$Builder.addAllNodeReports(YarnServiceProtos.java:21485) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetClusterNodesResponsePBImpl.addLocalNodeManagerInfosToProto(GetClusterNodesResponsePBImpl.java:164) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetClusterNodesResponsePBImpl.mergeLocalToBuilder(GetClusterNodesResponsePBImpl.java:99) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetClusterNodesResponsePBImpl.mergeLocalToProto(GetClusterNodesResponsePBImpl.java:106) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetClusterNodesResponsePBImpl.getProto(GetClusterNodesResponsePBImpl.java:71) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getClusterNodes(ApplicationClientProtocolPBServiceImpl.java:284) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:493) 
> at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:637) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2422) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2418) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1742) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2416) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at >
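The cast failure above can be reproduced in miniature: the PB merge path assumes every NodeId is the protobuf implementation, while the RM's exclude-file handling hands it a plain NodeId subclass (NodesListManager$UnknownNodeId). The sketch below uses simplified stand-in classes, not the actual YARN types, to show the buggy cast and a copy-based conversion of the kind the fix needs:

```java
// Simplified stand-ins for the YARN types involved (illustrative only).
abstract class NodeId {
    abstract String getHost();
    abstract int getPort();
}

// Stand-in for the protobuf-backed implementation.
class NodeIdPBImpl extends NodeId {
    private final String host;
    private final int port;
    NodeIdPBImpl(String host, int port) { this.host = host; this.port = port; }
    String getHost() { return host; }
    int getPort() { return port; }
}

// Stand-in for NodesListManager$UnknownNodeId: a NodeId that is NOT a PB impl.
class UnknownNodeId extends NodeId {
    private final String host;
    UnknownNodeId(String host) { this.host = host; }
    String getHost() { return host; }
    int getPort() { return -1; }
}

class CastDemo {
    // Buggy path: blindly assumes every NodeId is the PB implementation,
    // which throws ClassCastException for UnknownNodeId.
    static NodeIdPBImpl unsafeConvert(NodeId id) {
        return (NodeIdPBImpl) id;
    }

    // Safe path: check the runtime type and copy fields instead of casting.
    static NodeIdPBImpl safeConvert(NodeId id) {
        if (id instanceof NodeIdPBImpl) {
            return (NodeIdPBImpl) id;
        }
        return new NodeIdPBImpl(id.getHost(), id.getPort());
    }
}
```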
[jira] [Commented] (YARN-4577) Enable aux services to have their own custom classpath/jar file
[ https://issues.apache.org/jira/browse/YARN-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243588#comment-15243588 ] Vinod Kumar Vavilapalli commented on YARN-4577: --- bq. We simply do not pass in any configuration anywhere as part of the AuxService APIs - so this entire thread of reasoning about getClass() is no longer a problem? [~xgong] reminded me offline that we do pass a shared configuration as part of serviceInit(). In that case, the solution is simply to pass a private cloned Configuration for each of the aux-services? > Enable aux services to have their own custom classpath/jar file > --- > > Key: YARN-4577 > URL: https://issues.apache.org/jira/browse/YARN-4577 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-4577.1.patch, YARN-4577.2.patch, > YARN-4577.20160119.1.patch, YARN-4577.20160204.patch, YARN-4577.3.patch, > YARN-4577.3.rebase.patch, YARN-4577.4.patch > > > Right now, users have to add their jars to the NM classpath directly, thus > put them on the system classloader. But if multiple versions of the plugin > are present on the classpath, there is no control over which version actually > gets loaded. Or if there are any conflicts between the dependencies > introduced by the auxiliary service and the NM itself, they can break the NM, > the auxiliary service, or both. > The solution could be: to instantiate aux services using a classloader that > is different from the system classloader. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
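The private-clone suggestion above can be sketched with a map-backed stand-in for Hadoop's Configuration (the real class offers an equivalent copy constructor, new Configuration(conf)). The class and method names below are illustrative, not the actual NodeManager code: each aux service gets its own clone, so a service mutating its configuration cannot affect its siblings or the shared copy.

```java
import java.util.HashMap;
import java.util.Map;

// Map-backed stand-in for Hadoop's Configuration (illustrative only);
// the real class supports the same clone pattern via new Configuration(conf).
class Conf {
    private final Map<String, String> props;
    Conf() { this.props = new HashMap<>(); }
    Conf(Conf other) { this.props = new HashMap<>(other.props); } // deep-enough copy
    void set(String key, String value) { props.put(key, value); }
    String get(String key) { return props.get(key); }
}

class AuxServiceConfDemo {
    // Hand each aux service a private clone of the shared configuration,
    // so service-local changes never leak back into the shared instance.
    static Conf initService(Conf shared) {
        Conf privateConf = new Conf(shared);
        privateConf.set("aux.service.tweak", "true"); // service-local mutation
        return privateConf;
    }
}
```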
[jira] [Commented] (YARN-4955) Add retry for SocketTimeoutException in TimelineClient
[ https://issues.apache.org/jira/browse/YARN-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243545#comment-15243545 ] Xuan Gong commented on YARN-4955: - Fixed the checkstyle and whitespace warning. > Add retry for SocketTimeoutException in TimelineClient > -- > > Key: YARN-4955 > URL: https://issues.apache.org/jira/browse/YARN-4955 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Xuan Gong >Assignee: Xuan Gong >Priority: Critical > Attachments: YARN-4955.1.patch, YARN-4955.2.patch, YARN-4955.3.patch, > YARN-4955.4-1.patch, YARN-4955.4.patch, YARN-4955.5.patch > > > We saw this exception several times when we tried to getDelegationToken from > ATS. > java.io.IOException: > org.apache.hadoop.security.authentication.client.AuthenticationException: > java.net.SocketTimeoutException: Read timed out > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$5.run(TimelineClientImpl.java:569) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineClientConnectionRetry.retryOn(TimelineClientImpl.java:234) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.operateDelegationToken(TimelineClientImpl.java:582) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.getDelegationToken(TimelineClientImpl.java:479) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getTimelineDelegationToken(YarnClientImpl.java:349) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.addTimelineDelegationToken(YarnClientImpl.java:330) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:250) > at > org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:291) > at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:290) > at > org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:240) > at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290) > at 
org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287) > at > org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.pig.backend.hadoop23.PigJobControl.submit(PigJobControl.java:128) > at > org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:194) > at java.lang.Thread.run(Thread.java:745) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:276) > Caused by: > org.apache.hadoop.security.authentication.client.AuthenticationException: > java.net.SocketTimeoutException: Read timed out > at > org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:332) > at > org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:205) > at > org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:128) > at > org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:215) > at > org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.doDelegationTokenOperation(DelegationTokenAuthenticator.java:285) > at > org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.getDelegationToken(DelegationTokenAuthenticator.java:166) > at > 
org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.getDelegationToken(DelegationTokenAuthenticatedURL.java:371) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$2.run(TimelineClientImpl.java:475) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$2.run(TimelineClientImpl.java:467) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at >
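The retry the patch proposes can be sketched as a small loop that treats SocketTimeoutException as retriable instead of failing the operation on the first timeout. The names below are illustrative, not the actual TimelineClientImpl / TimelineClientConnectionRetry code:

```java
import java.net.SocketTimeoutException;
import java.util.concurrent.Callable;

// Illustrative retry helper: retries an operation when it fails with
// SocketTimeoutException, rethrowing the last timeout once retries run out.
class RetryOnTimeout {
    static <T> T callWithRetries(Callable<T> op, int maxRetries) throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return op.call();
            } catch (SocketTimeoutException e) {
                last = e; // retriable: remember the failure and try again
            }
        }
        throw last; // retries exhausted; surface the last timeout
    }
}
```

A production version would also back off between attempts (the real TimelineClient retry logic is configurable); the loop above only shows the classification of SocketTimeoutException as retriable.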
[jira] [Updated] (YARN-4955) Add retry for SocketTimeoutException in TimelineClient
[ https://issues.apache.org/jira/browse/YARN-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-4955: Attachment: YARN-4955.5.patch
[jira] [Commented] (YARN-4955) Add retry for SocketTimeoutException in TimelineClient
[ https://issues.apache.org/jira/browse/YARN-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243530#comment-15243530 ] Hadoop QA commented on YARN-4955: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 50s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 30s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 6s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | 
{color:green} mvninstall {color} | {color:green} 0m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 21s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 21s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 18s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common: patch generated 2 new + 19 unchanged - 0 fixed = 21 total (was 19) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 2 line(s) with tabs. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 19s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 58s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.8.0_77. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 13s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 20m 57s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:fbe3e86 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12799005/YARN-4955.4-1.patch | | JIRA Issue | YARN-4955 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux fe2eac08a01c 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / fdbafbc
[jira] [Commented] (YARN-4940) yarn node -list -all failed if RM start with decommissioned node
[ https://issues.apache.org/jira/browse/YARN-4940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243500#comment-15243500 ] Jason Lowe commented on YARN-4940: -- +1 lgtm. The test failures appear to be unrelated. Committing this.
[jira] [Commented] (YARN-4955) Add retry for SocketTimeoutException in TimelineClient
[ https://issues.apache.org/jira/browse/YARN-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243494#comment-15243494 ] Xuan Gong commented on YARN-4955: - Thanks for the review. New patch added a testcase for this.
[jira] [Updated] (YARN-4955) Add retry for SocketTimeoutException in TimelineClient
[ https://issues.apache.org/jira/browse/YARN-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-4955: Attachment: YARN-4955.4-1.patch > Add retry for SocketTimeoutException in TimelineClient > -- > > Key: YARN-4955 > URL: https://issues.apache.org/jira/browse/YARN-4955 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Xuan Gong >Assignee: Xuan Gong >Priority: Critical > Attachments: YARN-4955.1.patch, YARN-4955.2.patch, YARN-4955.3.patch, > YARN-4955.4-1.patch, YARN-4955.4.patch > > > We saw this exception several times when we tried to getDelegationToken from > ATS. > java.io.IOException: > org.apache.hadoop.security.authentication.client.AuthenticationException: > java.net.SocketTimeoutException: Read timed out > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$5.run(TimelineClientImpl.java:569) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineClientConnectionRetry.retryOn(TimelineClientImpl.java:234) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.operateDelegationToken(TimelineClientImpl.java:582) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.getDelegationToken(TimelineClientImpl.java:479) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getTimelineDelegationToken(YarnClientImpl.java:349) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.addTimelineDelegationToken(YarnClientImpl.java:330) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:250) > at > org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:291) > at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:290) > at > org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:240) > at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290) > at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287) > at java.security.AccessController.doPrivileged(Native 
Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287) > at > org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.pig.backend.hadoop23.PigJobControl.submit(PigJobControl.java:128) > at > org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:194) > at java.lang.Thread.run(Thread.java:745) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:276) > Caused by: > org.apache.hadoop.security.authentication.client.AuthenticationException: > java.net.SocketTimeoutException: Read timed out > at > org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:332) > at > org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:205) > at > org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:128) > at > org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:215) > at > org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.doDelegationTokenOperation(DelegationTokenAuthenticator.java:285) > at > org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.getDelegationToken(DelegationTokenAuthenticator.java:166) > at > 
org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.getDelegationToken(DelegationTokenAuthenticatedURL.java:371) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$2.run(TimelineClientImpl.java:475) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$2.run(TimelineClientImpl.java:467) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$5.run(TimelineClientImpl.java:567) > ... 24 more >
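The patch under review adds SocketTimeoutException to the exceptions that TimelineClient's connection-retry loop treats as retriable. A minimal, self-contained sketch of that retry pattern (the class and method names below are hypothetical illustrations of the general shape, not the actual TimelineClientImpl code):

```java
import java.io.IOException;
import java.net.SocketTimeoutException;

// Hypothetical sketch: a retry loop that treats SocketTimeoutException as
// retriable while letting other IOExceptions propagate immediately.
public class RetrySketch {
    interface Op<T> { T run() throws IOException; }

    static <T> T retryOn(Op<T> op, int maxRetries) throws IOException {
        IOException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return op.run();
            } catch (SocketTimeoutException e) {
                last = e;  // retriable: "Read timed out", try again
            }
            // any other IOException is not caught here and propagates at once
        }
        throw last;  // retries exhausted: surface the last timeout
    }
}
```

A real implementation would also add a backoff sleep between attempts, which is omitted here for brevity.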
[jira] [Updated] (YARN-4955) Add retry for SocketTimeoutException in TimelineClient
[ https://issues.apache.org/jira/browse/YARN-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-4955: Attachment: YARN-4955.4.patch > Add retry for SocketTimeoutException in TimelineClient > -- > > Key: YARN-4955 > URL: https://issues.apache.org/jira/browse/YARN-4955 > Project: Hadoop YARN > Issue Type: Bug > Reporter: Xuan Gong > Assignee: Xuan Gong > Priority: Critical > Attachments: YARN-4955.1.patch, YARN-4955.2.patch, YARN-4955.3.patch, YARN-4955.4.patch >
[jira] [Commented] (YARN-4909) Fix intermittent failures of TestRMWebServices And TestRMWithCSRFFilter
[ https://issues.apache.org/jira/browse/YARN-4909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243400#comment-15243400 ] Hudson commented on YARN-4909: -- FAILURE: Integrated in Hadoop-trunk-Commit #9619 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9619/]) YARN-4909. Fix intermittent failures of TestRMWebServices And (naganarasimha_gr: rev fdbafbc9e59314d9f9f75e615de9d2dfdced017b) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/webapp/JerseyTestBase.java > Fix intermittent failures of TestRMWebServices And TestRMWithCSRFFilter > --- > > Key: YARN-4909 > URL: https://issues.apache.org/jira/browse/YARN-4909 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Brahma Reddy Battula >Assignee: Bibin A Chundatt >Priority: Blocker > Fix For: 2.8.0 > > Attachments: 0001-YARN-4909.patch, 0002-YARN-4909.patch, > 0003-YARN-4909.patch, 0004-YARN-4909.patch, 0005-YARN-4909.patch, > 0006-YARN-4909.patch > > > *Precommit link* > https://builds.apache.org/job/PreCommit-YARN-Build/10908/testReport/ > *Trace* > {noformat} > com.sun.jersey.test.framework.spi.container.TestContainerException: > java.net.BindException: Address already in use > at sun.nio.ch.Net.bind0(Native Method) > at sun.nio.ch.Net.bind(Net.java:463) > at sun.nio.ch.Net.bind(Net.java:455) > at > sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) > at > org.glassfish.grizzly.nio.transport.TCPNIOTransport.bind(TCPNIOTransport.java:413) > at > org.glassfish.grizzly.nio.transport.TCPNIOTransport.bind(TCPNIOTransport.java:384) > at > org.glassfish.grizzly.nio.transport.TCPNIOTransport.bind(TCPNIOTransport.java:375) > at > org.glassfish.grizzly.http.server.NetworkListener.start(NetworkListener.java:549) > at > org.glassfish.grizzly.http.server.HttpServer.start(HttpServer.java:255) > at > 
com.sun.jersey.api.container.grizzly2.GrizzlyServerFactory.createHttpServer(GrizzlyServerFactory.java:326) > at > com.sun.jersey.api.container.grizzly2.GrizzlyServerFactory.createHttpServer(GrizzlyServerFactory.java:343) > at > com.sun.jersey.test.framework.spi.container.grizzly2.web.GrizzlyWebTestContainerFactory$GrizzlyWebTestContainer.instantiateGrizzlyWebServer(GrizzlyWebTestContainerFactory.java:219) > at > com.sun.jersey.test.framework.spi.container.grizzly2.web.GrizzlyWebTestContainerFactory$GrizzlyWebTestContainer.(GrizzlyWebTestContainerFactory.java:129) > at > com.sun.jersey.test.framework.spi.container.grizzly2.web.GrizzlyWebTestContainerFactory$GrizzlyWebTestContainer.(GrizzlyWebTestContainerFactory.java:86) > at > com.sun.jersey.test.framework.spi.container.grizzly2.web.GrizzlyWebTestContainerFactory.create(GrizzlyWebTestContainerFactory.java:79) > at > com.sun.jersey.test.framework.JerseyTest.getContainer(JerseyTest.java:342) > at com.sun.jersey.test.framework.JerseyTest.(JerseyTest.java:217) > at > org.apache.hadoop.yarn.webapp.JerseyTestBase.(JerseyTestBase.java:30) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServices.(TestRMWebServices.java:125) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
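The BindException above is the classic symptom of two test containers racing for the same fixed port. One common mitigation (shown here as a generic sketch, not necessarily what the committed JerseyTestBase change does) is to ask the OS for a currently free ephemeral port before starting the server:

```java
import java.io.IOException;
import java.net.ServerSocket;

// Sketch: obtain a port the OS currently considers free, to avoid
// "Address already in use" when concurrent test JVMs pick the same port.
public class FreePort {
    static int findFreePort() throws IOException {
        // Binding to port 0 lets the OS choose an unused ephemeral port.
        try (ServerSocket s = new ServerSocket(0)) {
            return s.getLocalPort();
        }
    }
}
```

Note that the port can still be taken by another process between this probe and the real bind, so this is usually combined with a retry on bind failure.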
[jira] [Commented] (YARN-4964) Allow ShuffleHandler readahead without drop-behind
[ https://issues.apache.org/jira/browse/YARN-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243364#comment-15243364 ] Wangda Tan commented on YARN-4964: -- Should this be moved to MAPREDUCE project? > Allow ShuffleHandler readahead without drop-behind > -- > > Key: YARN-4964 > URL: https://issues.apache.org/jira/browse/YARN-4964 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager > Affects Versions: 3.0.0, 2.7.2 > Reporter: Nathan Roberts > Assignee: Nathan Roberts > Attachments: YARN-4964.001.patch > > > Currently mapreduce.shuffle.manage.os.cache enables/disables both readahead (POSIX_FADV_WILLNEED) and drop-behind (POSIX_FADV_DONTNEED) logic within the ShuffleHandler. > It would be beneficial if these were separately configurable. > - Running without readahead can lead to significant seek storms caused by large numbers of sendfiles() competing with one another. > - However, running with drop-behind can also lead to seek storms because there are cases where the server can successfully write the shuffle bytes to the network, BUT the client doesn't want the bytes right now (MergeManager wants to WAIT is an example) so it ignores them and asks for them again a bit later. This causes repeated reads of the same data from disk. > I'll attach a simple patch that enables/disables readahead based on mapreduce.shuffle.readahead.bytes==0, leaving mapreduce.shuffle.manage.os.cache controlling only the drop-behind.
[jira] [Commented] (YARN-3362) Add node label usage in RM CapacityScheduler web UI
[ https://issues.apache.org/jira/browse/YARN-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243363#comment-15243363 ] Naganarasimha G R commented on YARN-3362: - Thanks [~eepayne], [~wangda], & [~sunilg] for waiting; taking a look at it shortly. > Add node label usage in RM CapacityScheduler web UI > --- > > Key: YARN-3362 > URL: https://issues.apache.org/jira/browse/YARN-3362 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, resourcemanager, webapp > Reporter: Wangda Tan > Assignee: Naganarasimha G R > Fix For: 2.8.0 > > Attachments: 2015.05.06 Folded Queues.png, 2015.05.06 Queue Expanded.png, 2015.05.07_3362_Queue_Hierarchy.png, 2015.05.10_3362_Queue_Hierarchy.png, 2015.05.12_3362_Queue_Hierarchy.png, CSWithLabelsView.png, No-space-between-Active_user_info-and-next-queues.png, Screen Shot 2015-04-29 at 11.42.17 AM.png, YARN-3362-branch-2.7.002.patch, YARN-3362.20150428-3-modified.patch, YARN-3362.20150428-3.patch, YARN-3362.20150506-1.patch, YARN-3362.20150507-1.patch, YARN-3362.20150510-1.patch, YARN-3362.20150511-1.patch, YARN-3362.20150512-1.patch, capacity-scheduler.xml > > > We don't have node label usage in the RM CapacityScheduler web UI now; without this, it is hard for users to understand what happened to nodes that have labels assigned to them.
[jira] [Updated] (YARN-4964) Allow ShuffleHandler readahead without drop-behind
[ https://issues.apache.org/jira/browse/YARN-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nathan Roberts updated YARN-4964: - Attachment: YARN-4964.001.patch > Allow ShuffleHandler readahead without drop-behind > -- > > Key: YARN-4964 > URL: https://issues.apache.org/jira/browse/YARN-4964 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager > Affects Versions: 3.0.0, 2.7.2 > Reporter: Nathan Roberts > Assignee: Nathan Roberts > Attachments: YARN-4964.001.patch >
[jira] [Created] (YARN-4964) Allow ShuffleHandler readahead without drop-behind
Nathan Roberts created YARN-4964: Summary: Allow ShuffleHandler readahead without drop-behind Key: YARN-4964 URL: https://issues.apache.org/jira/browse/YARN-4964 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.7.2, 3.0.0 Reporter: Nathan Roberts Assignee: Nathan Roberts Currently mapreduce.shuffle.manage.os.cache enables/disables both readahead (POSIX_FADV_WILLNEED) and drop-behind (POSIX_FADV_DONTNEED) logic within the ShuffleHandler. It would be beneficial if these were separately configurable. - Running without readahead can lead to significant seek storms caused by large numbers of sendfiles() competing with one another. - However, running with drop-behind can also lead to seek storms because there are cases where the server can successfully write the shuffle bytes to the network, BUT the client doesn't want the bytes right now (MergeManager wants to WAIT is an example) so it ignores them and asks for them again a bit later. This causes repeated reads of the same data from disk. I'll attach a simple patch that enables/disables readahead based on mapreduce.shuffle.readahead.bytes==0, leaving mapreduce.shuffle.manage.os.cache controlling only the drop-behind.
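Under this proposal the two posix_fadvise hints become independent switches: readahead is issued only when mapreduce.shuffle.readahead.bytes is greater than zero, while mapreduce.shuffle.manage.os.cache governs only drop-behind. The decision logic can be sketched in plain Java (a hypothetical helper, not the actual ShuffleHandler code):

```java
// Sketch of the proposed decoupling: two independent switches instead of a
// single config value controlling both posix_fadvise hints.
public class ShuffleCacheHints {
    final boolean manageOsCache;   // mapreduce.shuffle.manage.os.cache
    final long readaheadBytes;     // mapreduce.shuffle.readahead.bytes

    ShuffleCacheHints(boolean manageOsCache, long readaheadBytes) {
        this.manageOsCache = manageOsCache;
        this.readaheadBytes = readaheadBytes;
    }

    // POSIX_FADV_WILLNEED: issue readahead only when a byte budget is set.
    boolean useReadahead() { return readaheadBytes > 0; }

    // POSIX_FADV_DONTNEED: drop pages behind only when cache management is on.
    boolean useDropBehind() { return manageOsCache; }
}
```

With this split, setting readahead.bytes to 0 disables readahead without also losing drop-behind, which addresses both seek-storm scenarios described above.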
[jira] [Commented] (YARN-4963) capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat configurable
[ https://issues.apache.org/jira/browse/YARN-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243343#comment-15243343 ] Hadoop QA commented on YARN-4963: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 1s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 3s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | 
{color:green} mvninstall {color} | {color:green} 0m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 17s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: patch generated 1 new + 111 unchanged - 0 fixed = 112 total (was 111) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 0s {color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 76m 39s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_77. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 55m 24s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 149m 4s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_77 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens | | | hadoop.yarn.server.resourcemanager.TestAMAuthorization | | | hadoop.yarn.webapp.TestRMWithCSRFFilter | | | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesAppsModification | | |
[jira] [Commented] (YARN-4955) Add retry for SocketTimeoutException in TimelineClient
[ https://issues.apache.org/jira/browse/YARN-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243280#comment-15243280 ] Vinod Kumar Vavilapalli commented on YARN-4955: --- This looks good to me too. [~xgong], can you please add a simple test which validates that the client retries on socket-timeout now? > Add retry for SocketTimeoutException in TimelineClient > -- > > Key: YARN-4955 > URL: https://issues.apache.org/jira/browse/YARN-4955 > Project: Hadoop YARN > Issue Type: Bug > Reporter: Xuan Gong > Assignee: Xuan Gong > Priority: Critical > Attachments: YARN-4955.1.patch, YARN-4955.2.patch, YARN-4955.3.patch >
[jira] [Commented] (YARN-4955) Add retry for SocketTimeoutException in TimelineClient
[ https://issues.apache.org/jira/browse/YARN-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243237#comment-15243237 ] Li Lu commented on YARN-4955: - New fix looks fine. Thanks! > Add retry for SocketTimeoutException in TimelineClient > -- > > Key: YARN-4955 > URL: https://issues.apache.org/jira/browse/YARN-4955 > Project: Hadoop YARN > Issue Type: Bug > Reporter: Xuan Gong > Assignee: Xuan Gong > Priority: Critical > Attachments: YARN-4955.1.patch, YARN-4955.2.patch, YARN-4955.3.patch >
[jira] [Commented] (YARN-4963) capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat configurable
[ https://issues.apache.org/jira/browse/YARN-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243219#comment-15243219 ] Wangda Tan commented on YARN-4963: -- [~nroberts], +1 to this feature, this is going to be very useful. Instead of only limiting the number of OFF_SWITCH containers allocated per heartbeat, can we limit the number of container allocations regardless of locality? I can see some value in limiting containers for rack/node locality as well, for example if users want their containers spread across the whole cluster. > capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat configurable > > > Key: YARN-4963 > URL: https://issues.apache.org/jira/browse/YARN-4963 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler > Affects Versions: 3.0.0, 2.7.2 > Reporter: Nathan Roberts > Assignee: Nathan Roberts > Attachments: YARN-4963.001.patch > > > Currently the capacity scheduler will allow exactly 1 OFF_SWITCH assignment per heartbeat. With more and more non-MapReduce workloads coming along, the degree of locality is declining, causing scheduling to be significantly slower. It's still important to limit the number of OFF_SWITCH assignments to avoid densely packing OFF_SWITCH containers onto nodes. > Proposal is to add a simple config that makes the number of OFF_SWITCH assignments configurable. > Will upload candidate patch shortly.
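The proposed cap (and its possible generalization to other locality levels, per the comment above) boils down to a counter that resets at the start of each node heartbeat. A hypothetical sketch of that mechanism, not the actual CapacityScheduler patch:

```java
// Sketch: cap the number of OFF_SWITCH container assignments handed out in a
// single node heartbeat; the counter resets when the next heartbeat starts.
// The config key backing maxPerHeartbeat is left unnamed here on purpose.
public class OffSwitchLimiter {
    private final int maxPerHeartbeat;
    private int assignedThisHeartbeat = 0;

    OffSwitchLimiter(int maxPerHeartbeat) {
        this.maxPerHeartbeat = maxPerHeartbeat;
    }

    // Called when a node heartbeat begins.
    void onHeartbeatStart() { assignedThisHeartbeat = 0; }

    // Returns true if one more OFF_SWITCH assignment is allowed right now.
    boolean tryAssignOffSwitch() {
        if (assignedThisHeartbeat >= maxPerHeartbeat) {
            return false;  // budget for this heartbeat exhausted
        }
        assignedThisHeartbeat++;
        return true;
    }
}
```

Setting maxPerHeartbeat to 1 reproduces today's behavior; larger values trade locality packing for scheduling throughput.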
[jira] [Commented] (YARN-4940) yarn node -list -all failed if RM start with decommissioned node
[ https://issues.apache.org/jira/browse/YARN-4940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243194#comment-15243194 ]

Kuhu Shukla commented on YARN-4940:
-----------------------------------

+1 lgtm (non-binding).

> yarn node -list -all failed if RM start with decommissioned node
>
>         Key: YARN-4940
>         URL: https://issues.apache.org/jira/browse/YARN-4940
>     Project: Hadoop YARN
>  Issue Type: Bug
>    Reporter: sandflee
>    Assignee: sandflee
> Attachments: YARN-4940.01.patch, YARN-4940.02.patch, YARN-4940.03.patch, YARN-4940.04.patch, YARN-4940.05.patch
>
> 1. Add a node to the exclude file.
> 2. Start the RM.
> 3. Run {{yarn node -list -all}} and see the following exception:
> {quote}
> Exception in thread "main" java.lang.ClassCastException: org.apache.hadoop.yarn.server.resourcemanager.NodesListManager$UnknownNodeId cannot be cast to org.apache.hadoop.yarn.api.records.impl.pb.NodeIdPBImpl
>   at org.apache.hadoop.yarn.api.records.impl.pb.NodeReportPBImpl.mergeLocalToBuilder(NodeReportPBImpl.java:251)
>   at org.apache.hadoop.yarn.api.records.impl.pb.NodeReportPBImpl.mergeLocalToProto(NodeReportPBImpl.java:287)
>   at org.apache.hadoop.yarn.api.records.impl.pb.NodeReportPBImpl.getProto(NodeReportPBImpl.java:224)
>   at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetClusterNodesResponsePBImpl.convertToProtoFormat(GetClusterNodesResponsePBImpl.java:172)
>   at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetClusterNodesResponsePBImpl.access$000(GetClusterNodesResponsePBImpl.java:38)
>   at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetClusterNodesResponsePBImpl$1$1.next(GetClusterNodesResponsePBImpl.java:152)
>   at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetClusterNodesResponsePBImpl$1$1.next(GetClusterNodesResponsePBImpl.java:141)
>   at com.google.protobuf.AbstractMessageLite$Builder.checkForNullValues(AbstractMessageLite.java:336)
>   at com.google.protobuf.AbstractMessageLite$Builder.addAll(AbstractMessageLite.java:323)
>   at org.apache.hadoop.yarn.proto.YarnServiceProtos$GetClusterNodesResponseProto$Builder.addAllNodeReports(YarnServiceProtos.java:21485)
>   at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetClusterNodesResponsePBImpl.addLocalNodeManagerInfosToProto(GetClusterNodesResponsePBImpl.java:164)
>   at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetClusterNodesResponsePBImpl.mergeLocalToBuilder(GetClusterNodesResponsePBImpl.java:99)
>   at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetClusterNodesResponsePBImpl.mergeLocalToProto(GetClusterNodesResponsePBImpl.java:106)
>   at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetClusterNodesResponsePBImpl.getProto(GetClusterNodesResponsePBImpl.java:71)
>   at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getClusterNodes(ApplicationClientProtocolPBServiceImpl.java:284)
>   at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:493)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:637)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2422)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2418)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1742)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2416)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>   at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
>   at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateRuntimeException(RPCUtil.java:85)
>   at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:122)
>   at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterNodes(ApplicationClientProtocolPBClientImpl.java:302)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
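The ClassCastException above happens because the report-merging code assumes every NodeId is the protobuf-backed NodeIdPBImpl, while the RM's NodesListManager registers excluded hosts it has never heard from under an UnknownNodeId placeholder. A minimal, self-contained sketch of the failure mode and the usual defensive fix (an instanceof check with a conversion fallback); the types below are stand-ins for illustration, not the actual Hadoop classes or the committed patch:

```java
// Stand-in types illustrating the cast problem; not the real Hadoop classes.
interface NodeId {
    String getHost();
    int getPort();
}

// Protobuf-backed implementation the merge code expects.
class NodeIdPBImpl implements NodeId {
    private final String host;
    private final int port;
    NodeIdPBImpl(String host, int port) { this.host = host; this.port = port; }
    public String getHost() { return host; }
    public int getPort() { return port; }
}

// Placeholder the RM uses for excluded hosts it has never heard from.
class UnknownNodeId implements NodeId {
    private final String host;
    UnknownNodeId(String host) { this.host = host; }
    public String getHost() { return host; }
    public int getPort() { return -1; }
}

public class CastGuard {
    // Unsafe version: blind cast, throws ClassCastException for UnknownNodeId.
    static NodeIdPBImpl mergeUnsafe(NodeId id) {
        return (NodeIdPBImpl) id;
    }

    // Defensive version: convert when the concrete type is not the PB impl.
    static NodeIdPBImpl mergeSafe(NodeId id) {
        if (id instanceof NodeIdPBImpl) {
            return (NodeIdPBImpl) id;
        }
        return new NodeIdPBImpl(id.getHost(), id.getPort());
    }

    public static void main(String[] args) {
        NodeId unknown = new UnknownNodeId("decommissioned-host");
        boolean threw = false;
        try {
            mergeUnsafe(unknown);
        } catch (ClassCastException e) {
            threw = true;
        }
        System.out.println("unsafe cast threw: " + threw);
        System.out.println("safe merge host: " + mergeSafe(unknown).getHost());
    }
}
```

Whether the actual patch converts, skips, or specially renders unknown nodes is up to the attached patches; the sketch only shows why the blind cast cannot survive an RM started with decommissioned nodes.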
[jira] [Commented] (YARN-3362) Add node label usage in RM CapacityScheduler web UI
[ https://issues.apache.org/jira/browse/YARN-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243214#comment-15243214 ]

Sunil G commented on YARN-3362:
-------------------------------

Thanks [~eepayne] for sharing the patch here; much appreciated. Overall the patch looks fine to me. Will wait for [~Naganarasimha Garla] as well.

> Add node label usage in RM CapacityScheduler web UI
> ---
>
>         Key: YARN-3362
>         URL: https://issues.apache.org/jira/browse/YARN-3362
>     Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, resourcemanager, webapp
>    Reporter: Wangda Tan
>    Assignee: Naganarasimha G R
>     Fix For: 2.8.0
>
> Attachments: 2015.05.06 Folded Queues.png, 2015.05.06 Queue Expanded.png, 2015.05.07_3362_Queue_Hierarchy.png, 2015.05.10_3362_Queue_Hierarchy.png, 2015.05.12_3362_Queue_Hierarchy.png, CSWithLabelsView.png, No-space-between-Active_user_info-and-next-queues.png, Screen Shot 2015-04-29 at 11.42.17 AM.png, YARN-3362-branch-2.7.002.patch, YARN-3362.20150428-3-modified.patch, YARN-3362.20150428-3.patch, YARN-3362.20150506-1.patch, YARN-3362.20150507-1.patch, YARN-3362.20150510-1.patch, YARN-3362.20150511-1.patch, YARN-3362.20150512-1.patch, capacity-scheduler.xml
>
> The RM CapacityScheduler web UI does not currently show node label usage; without it, users have a hard time understanding what is happening on nodes that have labels assigned to them.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4955) Add retry for SocketTimeoutException in TimelineClient
[ https://issues.apache.org/jira/browse/YARN-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated YARN-4955:
-----------------------------
    Priority: Critical  (was: Major)

> Add retry for SocketTimeoutException in TimelineClient
> --
>
>         Key: YARN-4955
>         URL: https://issues.apache.org/jira/browse/YARN-4955
>     Project: Hadoop YARN
>  Issue Type: Bug
>    Reporter: Xuan Gong
>    Assignee: Xuan Gong
>    Priority: Critical
> Attachments: YARN-4955.1.patch, YARN-4955.2.patch, YARN-4955.3.patch
>
> We saw this exception several times when we tried to getDelegationToken from ATS.
> java.io.IOException: org.apache.hadoop.security.authentication.client.AuthenticationException: java.net.SocketTimeoutException: Read timed out
>   at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$5.run(TimelineClientImpl.java:569)
>   at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineClientConnectionRetry.retryOn(TimelineClientImpl.java:234)
>   at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.operateDelegationToken(TimelineClientImpl.java:582)
>   at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.getDelegationToken(TimelineClientImpl.java:479)
>   at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getTimelineDelegationToken(YarnClientImpl.java:349)
>   at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.addTimelineDelegationToken(YarnClientImpl.java:330)
>   at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:250)
>   at org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:291)
>   at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:290)
>   at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:240)
>   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
>   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
>   at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.pig.backend.hadoop23.PigJobControl.submit(PigJobControl.java:128)
>   at org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:194)
>   at java.lang.Thread.run(Thread.java:745)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:276)
> Caused by: org.apache.hadoop.security.authentication.client.AuthenticationException: java.net.SocketTimeoutException: Read timed out
>   at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:332)
>   at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:205)
>   at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:128)
>   at org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:215)
>   at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.doDelegationTokenOperation(DelegationTokenAuthenticator.java:285)
>   at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.getDelegationToken(DelegationTokenAuthenticator.java:166)
>   at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.getDelegationToken(DelegationTokenAuthenticatedURL.java:371)
>   at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$2.run(TimelineClientImpl.java:475)
>   at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$2.run(TimelineClientImpl.java:467)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$5.run(TimelineClientImpl.java:567)
> ... 24 more
> Caused by:
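The issue asks for SocketTimeoutException to be treated as retriable by the TimelineClient's connection-retry logic. A minimal, self-contained sketch of the general pattern (bounded retries gated by a retry-on predicate that walks the cause chain, since the timeout arrives wrapped inside an AuthenticationException as in the trace above). This assumes nothing about the actual patch; Hadoop's TimelineClientConnectionRetry is more elaborate, with configurable retry counts and intervals:

```java
import java.io.IOException;
import java.net.SocketTimeoutException;
import java.util.concurrent.Callable;

public class RetrySketch {
    // Retry the operation up to maxRetries additional times when the failure
    // is one we consider transient (a SocketTimeoutException anywhere in the
    // cause chain); rethrow anything else immediately.
    static <T> T withRetries(Callable<T> op, int maxRetries) throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                if (!isSocketTimeout(e)) {
                    throw e;          // non-retriable: fail fast
                }
                last = e;             // retriable: loop and try again
            }
        }
        throw last;                   // retries exhausted
    }

    // Walk the cause chain, since the timeout is usually wrapped.
    static boolean isSocketTimeout(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof SocketTimeoutException) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // Fails twice with a wrapped timeout, then succeeds on the third call.
        String token = withRetries(() -> {
            if (++calls[0] < 3) {
                throw new IOException(new SocketTimeoutException("Read timed out"));
            }
            return "delegation-token";
        }, 5);
        System.out.println(token + " after " + calls[0] + " calls");
    }
}
```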
[jira] [Commented] (YARN-4962) support filling up containers on node one by one
[ https://issues.apache.org/jira/browse/YARN-4962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243136#comment-15243136 ]

Daniel Templeton commented on YARN-4962:
----------------------------------------

This is a common issue in mixed-node clusters. The typical solution is to load the cluster "from different ends." If the cluster has nodes of type A and type B (say, regular and big-memory) and jobs of type a and b (where type b jobs need type B machines), the scheduler should place type a jobs on type A nodes, spilling onto type B nodes only when no type A nodes remain available. Type b jobs run only on type B nodes. In Grid Engine, this is implemented with node labels: all type B nodes are labeled as such, all type b jobs are submitted with a hard request for type B nodes (hard = required), and all other jobs are submitted with a soft request for !type B nodes (soft = preferred).

> support filling up containers on node one by one
> -
>
>         Key: YARN-4962
>         URL: https://issues.apache.org/jira/browse/YARN-4962
>     Project: Hadoop YARN
>  Issue Type: Improvement
>    Reporter: sandflee
>
> We have a GPU cluster where jobs with larger resource requests cannot be satisfied because the nodes are occupied by jobs with smaller requests. We did not enable the reservation system because GPU jobs may run for days or weeks. We would like the scheduler to fill up a node's containers first, so that whole nodes free up with enough resource to run the jobs with large requests.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
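The "load from different ends" policy described in the comment can be sketched as a tiny placement function: hard requests only match the required node type, while soft requests prefer the other type and spill over only as a last resort. This is an illustration of the idea only; names like NodeType and pickNode are hypothetical, not YARN or Grid Engine APIs:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class PlacementSketch {
    enum NodeType { A, B }   // A = regular nodes, B = big-memory/GPU nodes

    static class Node {
        final String name;
        final NodeType type;
        Node(String name, NodeType type) { this.name = name; this.type = type; }
    }

    // needsB: a "type b" job with a hard request for a type B node.
    // Other jobs carry a soft request for !B: prefer A, spill onto B only
    // when no A node is free.
    static Optional<Node> pickNode(List<Node> free, boolean needsB) {
        if (needsB) {
            return free.stream().filter(n -> n.type == NodeType.B).findFirst();
        }
        Optional<Node> a = free.stream().filter(n -> n.type == NodeType.A).findFirst();
        return a.isPresent() ? a
             : free.stream().filter(n -> n.type == NodeType.B).findFirst();
    }

    public static void main(String[] args) {
        List<Node> free = Arrays.asList(
            new Node("bigmem-1", NodeType.B), new Node("regular-1", NodeType.A));
        System.out.println("regular job -> " + pickNode(free, false).get().name);
        System.out.println("bigmem job  -> " + pickNode(free, true).get().name);
    }
}
```

With both node types free, a regular job lands on the A node even though the B node was listed first; once every A node is busy, regular jobs spill onto B, and a hard-request job either gets a B node or waits.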
[jira] [Commented] (YARN-4514) [YARN-3368] Cleanup hardcoded configurations, such as RM/ATS addresses
[ https://issues.apache.org/jira/browse/YARN-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243122#comment-15243122 ] Hadoop QA commented on YARN-4514: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 15s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 50s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 2m 54s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:e35bf0f | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12798975/YARN-4514-YARN-3368.8.patch | | JIRA Issue | YARN-4514 | | Optional Tests | asflicense | | uname | Linux ab19464933d2 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | YARN-3368 / e35bf0f | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui . U: . 
| | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/11098/console | | Powered by | Apache Yetus 0.2.0 http://yetus.apache.org | This message was automatically generated. > [YARN-3368] Cleanup hardcoded configurations, such as RM/ATS addresses > -- > > Key: YARN-4514 > URL: https://issues.apache.org/jira/browse/YARN-4514 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Sunil G > Attachments: YARN-4514-YARN-3368.1.patch, > YARN-4514-YARN-3368.2.patch, YARN-4514-YARN-3368.3.patch, > YARN-4514-YARN-3368.4.patch, YARN-4514-YARN-3368.5.patch, > YARN-4514-YARN-3368.6.patch, YARN-4514-YARN-3368.7.patch, > YARN-4514-YARN-3368.8.patch > > > We have several configurations are hard-coded, for example, RM/ATS addresses, > we should make them configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4514) [YARN-3368] Cleanup hardcoded configurations, such as RM/ATS addresses
[ https://issues.apache.org/jira/browse/YARN-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-4514: -- Attachment: YARN-4514-YARN-3368.8.patch Thank you [~leftnoteasy]. Please help to check the update message in default-config.js. Attaching an updated patch > [YARN-3368] Cleanup hardcoded configurations, such as RM/ATS addresses > -- > > Key: YARN-4514 > URL: https://issues.apache.org/jira/browse/YARN-4514 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Sunil G > Attachments: YARN-4514-YARN-3368.1.patch, > YARN-4514-YARN-3368.2.patch, YARN-4514-YARN-3368.3.patch, > YARN-4514-YARN-3368.4.patch, YARN-4514-YARN-3368.5.patch, > YARN-4514-YARN-3368.6.patch, YARN-4514-YARN-3368.7.patch, > YARN-4514-YARN-3368.8.patch > > > We have several configurations are hard-coded, for example, RM/ATS addresses, > we should make them configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4963) capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat configurable
[ https://issues.apache.org/jira/browse/YARN-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nathan Roberts updated YARN-4963: - Attachment: YARN-4963.001.patch > capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat > configurable > > > Key: YARN-4963 > URL: https://issues.apache.org/jira/browse/YARN-4963 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 3.0.0, 2.7.2 >Reporter: Nathan Roberts >Assignee: Nathan Roberts > Attachments: YARN-4963.001.patch > > > Currently the capacity scheduler will allow exactly 1 OFF_SWITCH assignment > per heartbeat. With more and more non MapReduce workloads coming along, the > degree of locality is declining, causing scheduling to be significantly > slower. It's still important to limit the number of OFF_SWITCH assignments to > avoid densely packing OFF_SWITCH containers onto nodes. > Proposal is to add a simple config that makes the number of OFF_SWITCH > assignments configurable. > Will upload candidate patch shortly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-4963) capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat configurable
Nathan Roberts created YARN-4963: Summary: capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat configurable Key: YARN-4963 URL: https://issues.apache.org/jira/browse/YARN-4963 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler Affects Versions: 2.7.2, 3.0.0 Reporter: Nathan Roberts Assignee: Nathan Roberts Currently the capacity scheduler will allow exactly 1 OFF_SWITCH assignment per heartbeat. With more and more non MapReduce workloads coming along, the degree of locality is declining, causing scheduling to be significantly slower. It's still important to limit the number of OFF_SWITCH assignments to avoid densely packing OFF_SWITCH containers onto nodes. Proposal is to add a simple config that makes the number of OFF_SWITCH assignments configurable. Will upload candidate patch shortly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
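The proposal above amounts to turning the hard-coded one-OFF_SWITCH-assignment-per-heartbeat limit into a CapacityScheduler setting. A hedged sketch of what such a knob could look like in capacity-scheduler.xml; the property name below is a plausible guess for illustration, and the authoritative name is whatever the attached YARN-4963 patch defines:

```xml
<!-- Hypothetical property name; the real key comes from the YARN-4963 patch. -->
<property>
  <name>yarn.scheduler.capacity.per-node-heartbeat.maximum-offswitch-assignments</name>
  <value>4</value>
  <description>
    Maximum number of OFF_SWITCH container assignments the CapacityScheduler
    may make in a single node heartbeat (previously fixed at 1). Keeping this
    bounded avoids densely packing OFF_SWITCH containers onto a node.
  </description>
</property>
```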
[jira] [Commented] (YARN-2883) Queuing of container requests in the NM
[ https://issues.apache.org/jira/browse/YARN-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243083#comment-15243083 ] Konstantinos Karanasos commented on YARN-2883: -- Yep, I will address the visibility comment -- I just wanted to first do the refactoring fast, so that I get your feedback. Regarding the findbug, I tried to fix the problem by adding a @SuppressWarnings("unchecked"), but it does not seem to work. Any ideas about what is wrong? > Queuing of container requests in the NM > --- > > Key: YARN-2883 > URL: https://issues.apache.org/jira/browse/YARN-2883 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Konstantinos Karanasos >Assignee: Konstantinos Karanasos > Attachments: YARN-2883-trunk.004.patch, YARN-2883-trunk.005.patch, > YARN-2883-trunk.006.patch, YARN-2883-trunk.007.patch, > YARN-2883-trunk.008.patch, YARN-2883-trunk.009.patch, > YARN-2883-trunk.010.patch, YARN-2883-trunk.011.patch, > YARN-2883-trunk.012.patch, YARN-2883-trunk.013.patch, > YARN-2883-yarn-2877.001.patch, YARN-2883-yarn-2877.002.patch, > YARN-2883-yarn-2877.003.patch, YARN-2883-yarn-2877.004.patch, > YARN-2883.013.patch > > > We propose to add a queue in each NM, where queueable container requests can > be held. > Based on the available resources in the node and the containers in the queue, > the NM will decide when to allow the execution of a queued container. > In order to ensure the instantaneous start of a guaranteed-start container, > the NM may decide to pre-empt/kill running queueable containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
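On the findbugs question in the comment above: java.lang.SuppressWarnings only affects javac diagnostics and is discarded before FindBugs (which inspects bytecode) ever runs, so it cannot silence a FindBugs warning. The usual alternatives are FindBugs' own @SuppressFBWarnings annotation or an entry in the module's findbugs-exclude.xml filter, which Hadoop modules commonly use. A sketch of the filter approach; the class name and bug pattern here are placeholders, not the actual warning from the QA run:

```xml
<!-- findbugs-exclude.xml fragment; class name and bug pattern are placeholders. -->
<FindBugsFilter>
  <Match>
    <Class name="org.apache.hadoop.yarn.server.nodemanager.SomeQueuingClass"/>
    <Bug pattern="BC_UNCONFIRMED_CAST"/>
  </Match>
</FindBugsFilter>
```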
[jira] [Commented] (YARN-2883) Queuing of container requests in the NM
[ https://issues.apache.org/jira/browse/YARN-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243066#comment-15243066 ] Karthik Kambatla commented on YARN-2883: The latest patch (YARN-2883.013.patch) looks mostly good to me, barring some unaddressed comments from my previous review - visibility of methods in ContainersMonitorImpl and the config itself. As I said, for the config, I am comfortable filing a follow-up. And, yeah, the findbugs. > Queuing of container requests in the NM > --- > > Key: YARN-2883 > URL: https://issues.apache.org/jira/browse/YARN-2883 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Konstantinos Karanasos >Assignee: Konstantinos Karanasos > Attachments: YARN-2883-trunk.004.patch, YARN-2883-trunk.005.patch, > YARN-2883-trunk.006.patch, YARN-2883-trunk.007.patch, > YARN-2883-trunk.008.patch, YARN-2883-trunk.009.patch, > YARN-2883-trunk.010.patch, YARN-2883-trunk.011.patch, > YARN-2883-trunk.012.patch, YARN-2883-trunk.013.patch, > YARN-2883-yarn-2877.001.patch, YARN-2883-yarn-2877.002.patch, > YARN-2883-yarn-2877.003.patch, YARN-2883-yarn-2877.004.patch, > YARN-2883.013.patch > > > We propose to add a queue in each NM, where queueable container requests can > be held. > Based on the available resources in the node and the containers in the queue, > the NM will decide when to allow the execution of a queued container. > In order to ensure the instantaneous start of a guaranteed-start container, > the NM may decide to pre-empt/kill running queueable containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4955) Add retry for SocketTimeoutException in TimelineClient
[ https://issues.apache.org/jira/browse/YARN-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243048#comment-15243048 ]

Junping Du commented on YARN-4955:
----------------------------------

03 patch LGTM. If there are no further comments from others, I will commit it shortly after HADOOP-13026 gets committed.

> Add retry for SocketTimeoutException in TimelineClient
> --
>
>         Key: YARN-4955
>         URL: https://issues.apache.org/jira/browse/YARN-4955
>     Project: Hadoop YARN
>  Issue Type: Bug
>    Reporter: Xuan Gong
>    Assignee: Xuan Gong
> Attachments: YARN-4955.1.patch, YARN-4955.2.patch, YARN-4955.3.patch
>
> We saw this exception several times when we tried to getDelegationToken from ATS.
> java.io.IOException: org.apache.hadoop.security.authentication.client.AuthenticationException: java.net.SocketTimeoutException: Read timed out
>   at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$5.run(TimelineClientImpl.java:569)
>   at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineClientConnectionRetry.retryOn(TimelineClientImpl.java:234)
>   at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.operateDelegationToken(TimelineClientImpl.java:582)
>   at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.getDelegationToken(TimelineClientImpl.java:479)
>   at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getTimelineDelegationToken(YarnClientImpl.java:349)
>   at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.addTimelineDelegationToken(YarnClientImpl.java:330)
>   at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:250)
>   at org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:291)
>   at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:290)
>   at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:240)
>   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
>   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
>   at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.pig.backend.hadoop23.PigJobControl.submit(PigJobControl.java:128)
>   at org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:194)
>   at java.lang.Thread.run(Thread.java:745)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:276)
> Caused by: org.apache.hadoop.security.authentication.client.AuthenticationException: java.net.SocketTimeoutException: Read timed out
>   at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:332)
>   at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:205)
>   at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:128)
>   at org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:215)
>   at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.doDelegationTokenOperation(DelegationTokenAuthenticator.java:285)
>   at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.getDelegationToken(DelegationTokenAuthenticator.java:166)
>   at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.getDelegationToken(DelegationTokenAuthenticatedURL.java:371)
>   at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$2.run(TimelineClientImpl.java:475)
>   at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$2.run(TimelineClientImpl.java:467)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
[jira] [Commented] (YARN-4751) In 2.7, Labeled queue usage not shown properly in capacity scheduler UI
[ https://issues.apache.org/jira/browse/YARN-4751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242992#comment-15242992 ] Eric Payne commented on YARN-4751: -- bq. YARN-4751 does not apply to branch-2.7. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {{YARN-4751-branch-2.7.004.patch}} won't apply until the 2.7 patch for YARN-3362 is committed. > In 2.7, Labeled queue usage not shown properly in capacity scheduler UI > --- > > Key: YARN-4751 > URL: https://issues.apache.org/jira/browse/YARN-4751 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, yarn >Affects Versions: 2.7.3 >Reporter: Eric Payne >Assignee: Eric Payne > Attachments: 2.7 CS UI No BarGraph.jpg, > YARH-4752-branch-2.7.001.patch, YARH-4752-branch-2.7.002.patch, > YARN-4751-branch-2.7.003.patch, YARN-4751-branch-2.7.004.patch > > > In 2.6 and 2.7, the capacity scheduler UI does not have the queue graphs > separated by partition. When applications are running on a labeled queue, no > color is shown in the bar graph, and several of the "Used" metrics are zero. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4940) yarn node -list -all failed if RM start with decommissioned node
[ https://issues.apache.org/jira/browse/YARN-4940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242910#comment-15242910 ] Hadoop QA commented on YARN-4940: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 8s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 36s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 5s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | 
{color:green} mvninstall {color} | {color:green} 0m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 23s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 13s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_77. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 49m 36s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 132m 52s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_77 Failed junit tests | hadoop.yarn.server.resourcemanager.TestAMAuthorization | | | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokens | | | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps | | | hadoop.yarn.webapp.TestRMWithCSRFFilter | | | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServices | | | hadoop.yarn.server.resourcemanager.TestClientRMTokens | | | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesReservation | | JDK v1.8.0_77 Timed out junit tests | org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes | |
[jira] [Commented] (YARN-3215) Respect labels in CapacityScheduler when computing headroom
[ https://issues.apache.org/jira/browse/YARN-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242874#comment-15242874 ] Naganarasimha G R commented on YARN-3215: - Thanks [~wangda], Missed to see that you had uploaded a patch for 2.8. Will check why some test cases are failing. > Respect labels in CapacityScheduler when computing headroom > --- > > Key: YARN-3215 > URL: https://issues.apache.org/jira/browse/YARN-3215 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler >Reporter: Wangda Tan >Assignee: Naganarasimha G R > Attachments: YARN-3215.v1.001.patch, YARN-3215.v2.001.patch, > YARN-3215.v2.002.patch, YARN-3215.v2.003.patch, YARN-3215.v2.branch-2.8.patch > > > In existing CapacityScheduler, when computing headroom of an application, it > will only consider "non-labeled" nodes of this application. > But it is possible the application is asking for labeled resources, so > headroom-by-label (like 5G resource available under node-label=red) is > required to get better resource allocation and avoid deadlocks such as > MAPREDUCE-5928. > This JIRA could involve both API changes (such as adding a > label-to-available-resource map in AllocateResponse) and also internal > changes in CapacityScheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
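The per-label headroom described in YARN-3215 can be sketched as a map from node label to remaining room, where each label's headroom is bounded by both the queue's unused limit under that label and the label's currently available cluster resource. This is a simplification of what CapacityScheduler actually needs (real headroom is multi-dimensional and also applies user limits); the names below are illustrative, not the Hadoop API:

```java
import java.util.HashMap;
import java.util.Map;

public class LabelHeadroomSketch {
    // headroom(label) = min(queueLimit(label) - queueUsed(label), available(label)),
    // clamped at zero. Resources are modeled as a single long (e.g. memory in
    // MB) for brevity; real YARN headroom uses a multi-dimensional Resource.
    static Map<String, Long> headroomByLabel(Map<String, Long> queueLimit,
                                             Map<String, Long> queueUsed,
                                             Map<String, Long> available) {
        Map<String, Long> headroom = new HashMap<>();
        for (Map.Entry<String, Long> e : queueLimit.entrySet()) {
            String label = e.getKey();
            long unused = e.getValue() - queueUsed.getOrDefault(label, 0L);
            long avail = available.getOrDefault(label, 0L);
            headroom.put(label, Math.max(0L, Math.min(unused, avail)));
        }
        return headroom;
    }
}
```

Reporting such a map back to the AM (e.g. a label-to-available-resource map in AllocateResponse, as the issue suggests) is what would let an application asking for node-label=red resources avoid the deadlock of seeing only the unlabeled headroom.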
[jira] [Commented] (YARN-4940) yarn node -list -all failed if RM start with decommissioned node
[ https://issues.apache.org/jira/browse/YARN-4940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242785#comment-15242785 ] Hadoop QA commented on YARN-4940: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 8s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 22s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 3s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | 
{color:green} mvninstall {color} | {color:green} 0m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 35s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_77. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 49m 41s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 133m 6s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_77 Failed junit tests | hadoop.yarn.server.resourcemanager.TestAMAuthorization | | | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokens | | | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps | | | hadoop.yarn.webapp.TestRMWithCSRFFilter | | | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServices | | | hadoop.yarn.server.resourcemanager.TestClientRMTokens | | | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesReservation | | JDK v1.8.0_77 Timed out junit tests | org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes | | JDK
[jira] [Commented] (YARN-4940) yarn node -list -all failed if RM start with decommissioned node
[ https://issues.apache.org/jira/browse/YARN-4940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242768#comment-15242768 ] sandflee commented on YARN-4940: Thanks [~templedf] [~kshukla], I have updated the patch, and I don't think the test failures are related to this issue. > yarn node -list -all failed if RM start with decommissioned node > > > Key: YARN-4940 > URL: https://issues.apache.org/jira/browse/YARN-4940 > Project: Hadoop YARN > Issue Type: Bug >Reporter: sandflee >Assignee: sandflee > Attachments: YARN-4940.01.patch, YARN-4940.02.patch, > YARN-4940.03.patch, YARN-4940.04.patch, YARN-4940.05.patch > > > 1, add a node to exclude file > 2, start RM > 3, run yarn node -list -all , see the following exception > {quote} > Exception in thread "main" java.lang.ClassCastException: > org.apache.hadoop.yarn.server.resourcemanager.NodesListManager$UnknownNodeId > cannot be cast to org.apache.hadoop.yarn.api.records.impl.pb.NodeIdPBImpl > at > org.apache.hadoop.yarn.api.records.impl.pb.NodeReportPBImpl.mergeLocalToBuilder(NodeReportPBImpl.java:251) > at > org.apache.hadoop.yarn.api.records.impl.pb.NodeReportPBImpl.mergeLocalToProto(NodeReportPBImpl.java:287) > at > org.apache.hadoop.yarn.api.records.impl.pb.NodeReportPBImpl.getProto(NodeReportPBImpl.java:224) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetClusterNodesResponsePBImpl.convertToProtoFormat(GetClusterNodesResponsePBImpl.java:172) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetClusterNodesResponsePBImpl.access$000(GetClusterNodesResponsePBImpl.java:38) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetClusterNodesResponsePBImpl$1$1.next(GetClusterNodesResponsePBImpl.java:152) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetClusterNodesResponsePBImpl$1$1.next(GetClusterNodesResponsePBImpl.java:141) > at > com.google.protobuf.AbstractMessageLite$Builder.checkForNullValues(AbstractMessageLite.java:336) > at > 
com.google.protobuf.AbstractMessageLite$Builder.addAll(AbstractMessageLite.java:323) > at > org.apache.hadoop.yarn.proto.YarnServiceProtos$GetClusterNodesResponseProto$Builder.addAllNodeReports(YarnServiceProtos.java:21485) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetClusterNodesResponsePBImpl.addLocalNodeManagerInfosToProto(GetClusterNodesResponsePBImpl.java:164) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetClusterNodesResponsePBImpl.mergeLocalToBuilder(GetClusterNodesResponsePBImpl.java:99) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetClusterNodesResponsePBImpl.mergeLocalToProto(GetClusterNodesResponsePBImpl.java:106) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetClusterNodesResponsePBImpl.getProto(GetClusterNodesResponsePBImpl.java:71) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getClusterNodes(ApplicationClientProtocolPBServiceImpl.java:284) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:493) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:637) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2422) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2418) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1742) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2416) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at 
java.lang.reflect.Constructor.newInstance(Constructor.java:526) > at > org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53) > at > org.apache.hadoop.yarn.ipc.RPCUtil.instantiateRuntimeException(RPCUtil.java:85) > at > org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:122) > at > org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterNodes(ApplicationClientProtocolPBClientImpl.java:302) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) >
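The root cause is visible in the first frame of the trace: NodeReportPBImpl.mergeLocalToBuilder downcasts every NodeId to NodeIdPBImpl, but the RM registers hosts from the exclude file under the placeholder NodesListManager$UnknownNodeId. A minimal self-contained sketch of the failure mode follows; the classes below are stand-ins, not the actual Hadoop types, and the defensive variant is only one possible shape of a fix:

```java
// Stand-in type hierarchy mirroring the names from the stack trace.
class NodeId { }                       // abstract record type
class NodeIdPBImpl extends NodeId { }  // protobuf-backed implementation
class UnknownNodeId extends NodeId { } // placeholder for excluded hosts

public class CastSketch {
    // Mirrors the failing pattern: assumes every NodeId is protobuf-backed.
    static NodeIdPBImpl unsafeMerge(NodeId id) {
        return (NodeIdPBImpl) id; // throws ClassCastException for UnknownNodeId
    }

    // Defensive variant: convert foreign implementations instead of casting.
    static NodeIdPBImpl safeMerge(NodeId id) {
        if (id instanceof NodeIdPBImpl) {
            return (NodeIdPBImpl) id;
        }
        return new NodeIdPBImpl(); // real code would rebuild from host/port
    }

    public static void main(String[] args) {
        boolean threw = false;
        try {
            unsafeMerge(new UnknownNodeId());
        } catch (ClassCastException e) {
            threw = true;
        }
        // true: the blind cast fails while the defensive path succeeds.
        System.out.println(threw && safeMerge(new UnknownNodeId()) != null);
    }
}
```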
[jira] [Updated] (YARN-4940) yarn node -list -all failed if RM start with decommissioned node
[ https://issues.apache.org/jira/browse/YARN-4940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sandflee updated YARN-4940: --- Attachment: YARN-4940.05.patch
[jira] [Commented] (YARN-4940) yarn node -list -all failed if RM start with decommissioned node
[ https://issues.apache.org/jira/browse/YARN-4940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242756#comment-15242756 ] Kuhu Shukla commented on YARN-4940: --- The latest patch looks good, with the minor nit of an extra line before {{createExcludeFile}}. Most test failures seem unrelated.
[jira] [Commented] (YARN-4909) Fix intermittent failures of TestRMWebServices And TestRMWithCSRFFilter
[ https://issues.apache.org/jira/browse/YARN-4909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242712#comment-15242712 ] Naganarasimha G R commented on YARN-4909: - ok, Committing the patch shortly > Fix intermittent failures of TestRMWebServices And TestRMWithCSRFFilter > --- > > Key: YARN-4909 > URL: https://issues.apache.org/jira/browse/YARN-4909 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Brahma Reddy Battula >Assignee: Bibin A Chundatt >Priority: Blocker > Attachments: 0001-YARN-4909.patch, 0002-YARN-4909.patch, > 0003-YARN-4909.patch, 0004-YARN-4909.patch, 0005-YARN-4909.patch, > 0006-YARN-4909.patch > > > *Precommit link* > https://builds.apache.org/job/PreCommit-YARN-Build/10908/testReport/ > *Trace* > {noformat} > com.sun.jersey.test.framework.spi.container.TestContainerException: > java.net.BindException: Address already in use > at sun.nio.ch.Net.bind0(Native Method) > at sun.nio.ch.Net.bind(Net.java:463) > at sun.nio.ch.Net.bind(Net.java:455) > at > sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) > at > org.glassfish.grizzly.nio.transport.TCPNIOTransport.bind(TCPNIOTransport.java:413) > at > org.glassfish.grizzly.nio.transport.TCPNIOTransport.bind(TCPNIOTransport.java:384) > at > org.glassfish.grizzly.nio.transport.TCPNIOTransport.bind(TCPNIOTransport.java:375) > at > org.glassfish.grizzly.http.server.NetworkListener.start(NetworkListener.java:549) > at > org.glassfish.grizzly.http.server.HttpServer.start(HttpServer.java:255) > at > com.sun.jersey.api.container.grizzly2.GrizzlyServerFactory.createHttpServer(GrizzlyServerFactory.java:326) > at > com.sun.jersey.api.container.grizzly2.GrizzlyServerFactory.createHttpServer(GrizzlyServerFactory.java:343) > at > 
com.sun.jersey.test.framework.spi.container.grizzly2.web.GrizzlyWebTestContainerFactory$GrizzlyWebTestContainer.instantiateGrizzlyWebServer(GrizzlyWebTestContainerFactory.java:219) > at > com.sun.jersey.test.framework.spi.container.grizzly2.web.GrizzlyWebTestContainerFactory$GrizzlyWebTestContainer.(GrizzlyWebTestContainerFactory.java:129) > at > com.sun.jersey.test.framework.spi.container.grizzly2.web.GrizzlyWebTestContainerFactory$GrizzlyWebTestContainer.(GrizzlyWebTestContainerFactory.java:86) > at > com.sun.jersey.test.framework.spi.container.grizzly2.web.GrizzlyWebTestContainerFactory.create(GrizzlyWebTestContainerFactory.java:79) > at > com.sun.jersey.test.framework.JerseyTest.getContainer(JerseyTest.java:342) > at com.sun.jersey.test.framework.JerseyTest.(JerseyTest.java:217) > at > org.apache.hadoop.yarn.webapp.JerseyTestBase.(JerseyTestBase.java:30) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServices.(TestRMWebServices.java:125) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
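For context on why this class of test fails intermittently: a hard-coded listen port collides when builds run concurrently on the same host, producing the "Address already in use" BindException above. A common remedy, sketched here with plain JDK sockets rather than the Grizzly/Jersey test harness itself, is to bind to port 0 and let the OS assign a free ephemeral port:

```java
import java.net.ServerSocket;

// Illustrative sketch: binding to port 0 asks the OS for any free ephemeral
// port, so two listeners can start concurrently without colliding. Tests
// then read back the assigned port instead of hard-coding one.
public class EphemeralPortSketch {
    public static void main(String[] args) throws Exception {
        try (ServerSocket a = new ServerSocket(0);
             ServerSocket b = new ServerSocket(0)) {
            // Both sockets are bound; each received a distinct free port.
            System.out.println(a.getLocalPort() > 0
                && b.getLocalPort() > 0
                && a.getLocalPort() != b.getLocalPort());
        }
    }
}
```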
[jira] [Commented] (YARN-4909) Fix intermittent failures of TestRMWebServices And TestRMWithCSRFFilter
[ https://issues.apache.org/jira/browse/YARN-4909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242707#comment-15242707 ] Sunil G commented on YARN-4909: --- Looks fine to me too.
[jira] [Updated] (YARN-4940) yarn node -list -all failed if RM start with decommissioned node
[ https://issues.apache.org/jira/browse/YARN-4940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sandflee updated YARN-4940: --- Attachment: YARN-4940.04.patch
[jira] [Commented] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242591#comment-15242591 ] Hadoop QA commented on YARN-4948:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 10s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| 0 | mvndep | 0m 28s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 30s | trunk passed |
| +1 | compile | 1m 56s | trunk passed with JDK v1.8.0_77 |
| +1 | compile | 2m 7s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 35s | trunk passed |
| +1 | mvnsite | 0m 54s | trunk passed |
| +1 | mvneclipse | 0m 22s | trunk passed |
| +1 | findbugs | 2m 20s | trunk passed |
| +1 | javadoc | 1m 8s | trunk passed with JDK v1.8.0_77 |
| +1 | javadoc | 3m 22s | trunk passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 10s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 47s | the patch passed |
| +1 | compile | 2m 1s | the patch passed with JDK v1.8.0_77 |
| +1 | javac | 2m 1s | the patch passed |
| +1 | compile | 2m 6s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 2m 6s | the patch passed |
| -1 | checkstyle | 0m 32s | hadoop-yarn-project/hadoop-yarn: patch generated 13 new + 212 unchanged - 0 fixed = 225 total (was 212) |
| +1 | mvnsite | 0m 54s | the patch passed |
| +1 | mvneclipse | 0m 19s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 14 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| -1 | findbugs | 1m 26s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common generated 4 new + 0 unchanged - 0 fixed = 4 total (was 0) |
| +1 | javadoc | 1m 10s | the patch passed with JDK v1.8.0_77 |
| +1 | javadoc | 3m 24s | the patch passed with JDK v1.7.0_95 |
| -1 | unit | 0m 25s | hadoop-yarn-api in the patch failed with JDK v1.8.0_77. |
| -1 | unit | 2m 29s | hadoop-yarn-common in the patch failed with JDK v1.8.0_77. |
| -1 | unit | 0m 24s | hadoop-yarn-api in the patch failed with JDK v1.7.0_95. |
| -1 | unit | 2m 38s | hadoop-yarn-common in the patch failed with JDK v1.7.0_95. |
| +1 | asflicense | 0m 18s | Patch does not generate ASF License warnings. |
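The whitespace -1 above already names git's own remedy. A minimal, self-contained demo of what `git apply --whitespace=fix` does (the repository, file names, and patch here are made up purely for illustration):

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q repo
cd repo
git config user.email "qa@example.com"
git config user.name "qa"
printf 'hello\n' > a.txt
git add a.txt
git commit -qm init
# introduce a change whose added line ends in trailing whitespace
printf 'hello\nworld   \n' > a.txt
git diff > ../ws.patch
git checkout -- a.txt
# --whitespace=fix applies the patch and strips the trailing blanks,
# instead of merely warning about them
git apply --whitespace=fix ../ws.patch
cat a.txt
```

After this, `a.txt` contains the added `world` line with its trailing blanks removed, which is why Jenkins suggests it for patches that trip the whitespace check.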
[jira] [Commented] (YARN-4962) support filling up containers on node one by one
[ https://issues.apache.org/jira/browse/YARN-4962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242589#comment-15242589 ] sandflee commented on YARN-4962: One simple way is to enable continuous scheduling and allocate containers to the node with the least free resources rather than the most. > support filling up containers on node one by one > - > > Key: YARN-4962 > URL: https://issues.apache.org/jira/browse/YARN-4962 > Project: Hadoop YARN > Issue Type: Improvement > Reporter: sandflee > > We have a GPU cluster where jobs with bigger resource requests can't be satisfied, because the nodes are occupied by jobs with smaller resource requests. We didn't enable the reservation system, because GPU jobs may run for days or weeks. We expect the scheduler to fill up one node with containers before moving to the next, so that free resources remain for jobs with big resource requests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
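sandflee's suggestion — among nodes that can still hold the request, pick the one with the *least* free resources — is essentially a best-fit bin-packing rule. A minimal sketch of that node-selection policy (the `Node` class and `pick_node` helper are illustrative, not YARN's actual scheduler API):

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Node:
    name: str
    free_mem_mb: int
    free_vcores: int

def fits(node: Node, mem_mb: int, vcores: int) -> bool:
    return node.free_mem_mb >= mem_mb and node.free_vcores >= vcores

def pick_node(nodes: List[Node], mem_mb: int, vcores: int) -> Optional[Node]:
    """Best-fit: among nodes that can hold the request, choose the one
    with the LEAST free resources, so small jobs pack onto partly-used
    nodes and whole nodes stay free for big (e.g. GPU) requests."""
    candidates = [n for n in nodes if fits(n, mem_mb, vcores)]
    if not candidates:
        return None
    return min(candidates, key=lambda n: (n.free_mem_mb, n.free_vcores))

nodes = [Node("n1", 4096, 4), Node("n2", 1024, 2), Node("n3", 8192, 8)]
# a 1 GB / 1 core request lands on the most-used node that still fits (n2),
# leaving n3 fully free for a later large request
chosen = pick_node(nodes, 1024, 1)
```

The usual "spread" policy would instead take the node with the *most* free resources, which is exactly what fragments a GPU cluster in the way the issue describes.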
[jira] [Created] (YARN-4962) support filling up containers on node one by one
sandflee created YARN-4962: -- Summary: support filling up containers on node one by one Key: YARN-4962 URL: https://issues.apache.org/jira/browse/YARN-4962 Project: Hadoop YARN Issue Type: Improvement Reporter: sandflee We have a GPU cluster where jobs with bigger resource requests can't be satisfied, because the nodes are occupied by jobs with smaller resource requests. We didn't enable the reservation system, because GPU jobs may run for days or weeks. We expect the scheduler to fill up one node with containers before moving to the next, so that free resources remain for jobs with big resource requests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242555#comment-15242555 ] Naganarasimha G R commented on YARN-4948: - I have triggered the build in the backend. If you want to do it on your own, one way is to reattach the patch. > Support node labels store in zookeeper > -- > > Key: YARN-4948 > URL: https://issues.apache.org/jira/browse/YARN-4948 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: jialei weng >Assignee: jialei weng > Attachments: YARN-4948.001.patch, YARN-4948.002.patch > > > Support node labels store in zookeeper -- This message was sent by Atlassian JIRA (v6.3.4#6332)
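For readers skimming the thread, one common way to lay out a ZooKeeper-backed label store is one znode per label with member hosts as children. The paths and helpers below are assumptions for illustration only, not the layout the YARN-4948 patch actually implements; the znode tree is modeled as a plain dict so the sketch runs without a ZooKeeper server:

```python
# Toy model of a ZooKeeper-backed node-label store. The znode tree is a
# dict {path: data}; a real store would issue create/delete calls against
# the same paths. Hypothetical layout: /nodelabels/<label> holds the
# label, /nodelabels/<label>/<host> marks membership.
ROOT = "/nodelabels"

def label_path(label: str) -> str:
    return f"{ROOT}/{label}"

def member_path(label: str, host: str) -> str:
    return f"{ROOT}/{label}/{host}"

def sync_to_store(store: dict, labels_to_nodes: dict) -> None:
    """Mirror the in-memory label->hosts mapping into the znode tree,
    creating missing paths and removing stale ones (a full rewrite,
    as one might do on recovery)."""
    wanted = {ROOT: b""}
    for label, hosts in labels_to_nodes.items():
        wanted[label_path(label)] = b""
        for host in hosts:
            wanted[member_path(label, host)] = b""
    # delete stale znodes, children before parents (ZooKeeper requires
    # a znode to be empty before it can be deleted)
    for path in sorted(set(store) - set(wanted), key=len, reverse=True):
        del store[path]
    # create missing znodes, parents before children
    for path in sorted(set(wanted) - set(store), key=len):
        store[path] = wanted[path]

store: dict = {}
sync_to_store(store, {"gpu": {"host1", "host2"}, "ssd": {"host1"}})
```

The ordering comments matter for a real ZooKeeper client: deletes must go leaf-first and creates parent-first, or the server rejects the operation.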
[jira] [Commented] (YARN-4948) Support node labels store in zookeeper
[ https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242556#comment-15242556 ] Naganarasimha G R commented on YARN-4948: - Thanks [~wangda] > Support node labels store in zookeeper > -- > > Key: YARN-4948 > URL: https://issues.apache.org/jira/browse/YARN-4948 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: jialei weng >Assignee: jialei weng > Attachments: YARN-4948.001.patch, YARN-4948.002.patch > > > Support node labels store in zookeeper -- This message was sent by Atlassian JIRA (v6.3.4#6332)