[jira] [Commented] (YARN-6367) YARN logs CLI needs alway check containerLogsInfo/containerLogInfo before parse the JSON object from NMWebService
[ https://issues.apache.org/jira/browse/YARN-6367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15934090#comment-15934090 ] Junping Du commented on YARN-6367: -- Patch LGTM. +1. Will commit it shortly. > YARN logs CLI needs alway check containerLogsInfo/containerLogInfo before > parse the JSON object from NMWebService > - > > Key: YARN-6367 > URL: https://issues.apache.org/jira/browse/YARN-6367 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Siddharth Seth >Assignee: Xuan Gong > Attachments: YARN-6367.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6302) Fail the node, if Linux Container Executor is not configured properly
[ https://issues.apache.org/jira/browse/YARN-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933947#comment-15933947 ] ASF GitHub Bot commented on YARN-6302: -- Github user szegedim commented on a diff in the pull request: https://github.com/apache/hadoop/pull/200#discussion_r107055947 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.h --- @@ -37,8 +37,8 @@ enum command { enum errorcodes { INVALID_ARGUMENT_NUMBER = 1, - INVALID_USER_NAME, //2 - INVALID_COMMAND_PROVIDED, //3 + //INVALID_USER_NAME 2 --- End diff -- INVALID_USER_NAME was forgotten earlier, so I removed it, and I just followed the pattern that is in the code right now, keeping the original value commented. If we want to refactor this right now, I would generate a large pseudorandom number to be able to check the difference and be able to search for the error code like a GUID in a search engine. > Fail the node, if Linux Container Executor is not configured properly > - > > Key: YARN-6302 > URL: https://issues.apache.org/jira/browse/YARN-6302 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > > We have a cluster that has one node with misconfigured Linux Container > Executor. Every time an AM or regular container is launched on the cluster, > it will fail. The node will still have resources available, so it keeps > failing apps until the administrator notices the issue and decommissions the > node. AM Blacklisting only helps, if the application is already running. > As a possible improvement, when the LCE is used on the cluster and a NM gets > certain errors back from the LCE, like error 24 configuration not found, we > should not try to allocate anything on the node anymore or shut down the node > entirely. That kind of problem normally does not fix itself and it means that > nothing can really run on that node. > {code} > Application application_1488920587909_0010 failed 2 times due to AM Container > for appattempt_1488920587909_0010_02 exited with exitCode: -1000 > Failing this attempt.Diagnostics: Application application_1488920587909_0010 > initialization failed (exitCode=24) with output: > For more detailed output, check the application tracking page: > http://node-1.domain.com:8088/cluster/app/application_1488920587909_0010 Then > click on links to logs of each attempt. > . Failing the application. > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6302) Fail the node, if Linux Container Executor is not configured properly
[ https://issues.apache.org/jira/browse/YARN-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933943#comment-15933943 ] ASF GitHub Bot commented on YARN-6302: -- Github user szegedim commented on a diff in the pull request: https://github.com/apache/hadoop/pull/200#discussion_r107055710 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java --- @@ -294,6 +295,14 @@ public Integer call() { .setUserLocalDirs(userLocalDirs) .setContainerLocalDirs(containerLocalDirs) .setContainerLogDirs(containerLogDirs).build()); +} catch (ConfigurationException e) { + LOG.error("Failed to launch container.", e); --- End diff -- It will be redundant, since the exception type is usually visible, but I fixed it. > Fail the node, if Linux Container Executor is not configured properly > - > > Key: YARN-6302 > URL: https://issues.apache.org/jira/browse/YARN-6302 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > > We have a cluster that has one node with misconfigured Linux Container > Executor. Every time an AM or regular container is launched on the cluster, > it will fail. The node will still have resources available, so it keeps > failing apps until the administrator notices the issue and decommissions the > node. AM Blacklisting only helps, if the application is already running. > As a possible improvement, when the LCE is used on the cluster and a NM gets > certain errors back from the LCE, like error 24 configuration not found, we > should not try to allocate anything on the node anymore or shut down the node > entirely. That kind of problem normally does not fix itself and it means that > nothing can really run on that node. > {code} > Application application_1488920587909_0010 failed 2 times due to AM Container > for appattempt_1488920587909_0010_02 exited with exitCode: -1000 > Failing this attempt.Diagnostics: Application application_1488920587909_0010 > initialization failed (exitCode=24) with output: > For more detailed output, check the application tracking page: > http://node-1.domain.com:8088/cluster/app/application_1488920587909_0010 Then > click on links to logs of each attempt. > . Failing the application. > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6302) Fail the node, if Linux Container Executor is not configured properly
[ https://issues.apache.org/jira/browse/YARN-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933944#comment-15933944 ] ASF GitHub Bot commented on YARN-6302: -- Github user szegedim commented on a diff in the pull request: https://github.com/apache/hadoop/pull/200#discussion_r107055721 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerRelaunch.java --- @@ -115,6 +116,14 @@ public Integer call() { .setContainerLocalDirs(containerLocalDirs) .setContainerLogDirs(containerLogDirs) .build()); +} catch (ConfigurationException e) { + LOG.error("Failed to relaunch container.", e); --- End diff -- It will be redundant, since the exception type is usually visible, but I fixed it. > Fail the node, if Linux Container Executor is not configured properly > - > > Key: YARN-6302 > URL: https://issues.apache.org/jira/browse/YARN-6302 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > > We have a cluster that has one node with misconfigured Linux Container > Executor. Every time an AM or regular container is launched on the cluster, > it will fail. The node will still have resources available, so it keeps > failing apps until the administrator notices the issue and decommissions the > node. AM Blacklisting only helps, if the application is already running. > As a possible improvement, when the LCE is used on the cluster and a NM gets > certain errors back from the LCE, like error 24 configuration not found, we > should not try to allocate anything on the node anymore or shut down the node > entirely. That kind of problem normally does not fix itself and it means that > nothing can really run on that node. > {code} > Application application_1488920587909_0010 failed 2 times due to AM Container > for appattempt_1488920587909_0010_02 exited with exitCode: -1000 > Failing this attempt.Diagnostics: Application application_1488920587909_0010 > initialization failed (exitCode=24) with output: > For more detailed output, check the application tracking page: > http://node-1.domain.com:8088/cluster/app/application_1488920587909_0010 Then > click on links to logs of each attempt. > . Failing the application. > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6302) Fail the node, if Linux Container Executor is not configured properly
[ https://issues.apache.org/jira/browse/YARN-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933941#comment-15933941 ] ASF GitHub Bot commented on YARN-6302: -- Github user szegedim commented on a diff in the pull request: https://github.com/apache/hadoop/pull/200#discussion_r107055288 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeHealthCheckerService.java --- @@ -80,6 +97,7 @@ long getLastHealthReportTime() { long lastReportTime = (nodeHealthScriptRunner == null) --- End diff -- Done. > Fail the node, if Linux Container Executor is not configured properly > - > > Key: YARN-6302 > URL: https://issues.apache.org/jira/browse/YARN-6302 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > > We have a cluster that has one node with misconfigured Linux Container > Executor. Every time an AM or regular container is launched on the cluster, > it will fail. The node will still have resources available, so it keeps > failing apps until the administrator notices the issue and decommissions the > node. AM Blacklisting only helps, if the application is already running. > As a possible improvement, when the LCE is used on the cluster and a NM gets > certain errors back from the LCE, like error 24 configuration not found, we > should not try to allocate anything on the node anymore or shut down the node > entirely. That kind of problem normally does not fix itself and it means that > nothing can really run on that node. > {code} > Application application_1488920587909_0010 failed 2 times due to AM Container > for appattempt_1488920587909_0010_02 exited with exitCode: -1000 > Failing this attempt.Diagnostics: Application application_1488920587909_0010 > initialization failed (exitCode=24) with output: > For more detailed output, check the application tracking page: > http://node-1.domain.com:8088/cluster/app/application_1488920587909_0010 Then > click on links to logs of each attempt. > . Failing the application. > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6302) Fail the node, if Linux Container Executor is not configured properly
[ https://issues.apache.org/jira/browse/YARN-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933924#comment-15933924 ] ASF GitHub Bot commented on YARN-6302: -- Github user szegedim commented on a diff in the pull request: https://github.com/apache/hadoop/pull/200#discussion_r107053798 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeHealthCheckerService.java --- @@ -54,22 +58,35 @@ protected void serviceInit(Configuration conf) throws Exception { * @return the reporting string of health of the node */ String getHealthReport() { +String healthReport = ""; --- End diff -- Done. > Fail the node, if Linux Container Executor is not configured properly > - > > Key: YARN-6302 > URL: https://issues.apache.org/jira/browse/YARN-6302 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > > We have a cluster that has one node with misconfigured Linux Container > Executor. Every time an AM or regular container is launched on the cluster, > it will fail. The node will still have resources available, so it keeps > failing apps until the administrator notices the issue and decommissions the > node. AM Blacklisting only helps, if the application is already running. > As a possible improvement, when the LCE is used on the cluster and a NM gets > certain errors back from the LCE, like error 24 configuration not found, we > should not try to allocate anything on the node anymore or shut down the node > entirely. That kind of problem normally does not fix itself and it means that > nothing can really run on that node. > {code} > Application application_1488920587909_0010 failed 2 times due to AM Container > for appattempt_1488920587909_0010_02 exited with exitCode: -1000 > Failing this attempt.Diagnostics: Application application_1488920587909_0010 > initialization failed (exitCode=24) with output: > For more detailed output, check the application tracking page: > http://node-1.domain.com:8088/cluster/app/application_1488920587909_0010 Then > click on links to logs of each attempt. > . Failing the application. > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6302) Fail the node, if Linux Container Executor is not configured properly
[ https://issues.apache.org/jira/browse/YARN-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933925#comment-15933925 ] ASF GitHub Bot commented on YARN-6302: -- Github user szegedim commented on a diff in the pull request: https://github.com/apache/hadoop/pull/200#discussion_r107053877 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeHealthCheckerService.java --- @@ -54,22 +58,35 @@ protected void serviceInit(Configuration conf) throws Exception { * @return the reporting string of health of the node */ String getHealthReport() { +String healthReport = ""; String scriptReport = (nodeHealthScriptRunner == null) ? "" : nodeHealthScriptRunner.getHealthReport(); -if (scriptReport.equals("")) { - return dirsHandler.getDisksHealthReport(false); -} else { - return scriptReport.concat(SEPARATOR + dirsHandler.getDisksHealthReport(false)); +String discReport = dirsHandler.getDisksHealthReport(false); +String exceptionReport = nodeHealthException != null ? +nodeHealthException.getMessage() : ""; + +if (!scriptReport.equals("")) { + healthReport = scriptReport; +} +if (!discReport.equals("")) { + healthReport = healthReport.equals("") ? discReport : + healthReport.concat(SEPARATOR + discReport); } +if (!exceptionReport.equals("")) { + healthReport = healthReport.equals("") ? exceptionReport : + healthReport.concat(SEPARATOR + exceptionReport); +} +return healthReport; } /** * @return true if the node is healthy */ boolean isHealthy() { -boolean scriptHealthStatus = (nodeHealthScriptRunner == null) ? true -: nodeHealthScriptRunner.isHealthy(); -return scriptHealthStatus && dirsHandler.areDisksHealthy(); +boolean scriptHealthStatus = nodeHealthScriptRunner == null || --- End diff -- Done. > Fail the node, if Linux Container Executor is not configured properly > - > > Key: YARN-6302 > URL: https://issues.apache.org/jira/browse/YARN-6302 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > > We have a cluster that has one node with misconfigured Linux Container > Executor. Every time an AM or regular container is launched on the cluster, > it will fail. The node will still have resources available, so it keeps > failing apps until the administrator notices the issue and decommissions the > node. AM Blacklisting only helps, if the application is already running. > As a possible improvement, when the LCE is used on the cluster and a NM gets > certain errors back from the LCE, like error 24 configuration not found, we > should not try to allocate anything on the node anymore or shut down the node > entirely. That kind of problem normally does not fix itself and it means that > nothing can really run on that node. > {code} > Application application_1488920587909_0010 failed 2 times due to AM Container > for appattempt_1488920587909_0010_02 exited with exitCode: -1000 > Failing this attempt.Diagnostics: Application application_1488920587909_0010 > initialization failed (exitCode=24) with output: > For more detailed output, check the application tracking page: > http://node-1.domain.com:8088/cluster/app/application_1488920587909_0010 Then > click on links to logs of each attempt. > . Failing the application. > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6302) Fail the node, if Linux Container Executor is not configured properly
[ https://issues.apache.org/jira/browse/YARN-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933894#comment-15933894 ] ASF GitHub Bot commented on YARN-6302: -- Github user szegedim commented on a diff in the pull request: https://github.com/apache/hadoop/pull/200#discussion_r107051786 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeHealthCheckerService.java --- @@ -31,6 +31,8 @@ private NodeHealthScriptRunner nodeHealthScriptRunner; private LocalDirsHandlerService dirsHandler; + private Exception nodeHealthException; + long nodeHealthExceptionReportTime; --- End diff -- My mistake. Fixed. > Fail the node, if Linux Container Executor is not configured properly > - > > Key: YARN-6302 > URL: https://issues.apache.org/jira/browse/YARN-6302 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > > We have a cluster that has one node with misconfigured Linux Container > Executor. Every time an AM or regular container is launched on the cluster, > it will fail. The node will still have resources available, so it keeps > failing apps until the administrator notices the issue and decommissions the > node. AM Blacklisting only helps, if the application is already running. > As a possible improvement, when the LCE is used on the cluster and a NM gets > certain errors back from the LCE, like error 24 configuration not found, we > should not try to allocate anything on the node anymore or shut down the node > entirely. That kind of problem normally does not fix itself and it means that > nothing can really run on that node. > {code} > Application application_1488920587909_0010 failed 2 times due to AM Container > for appattempt_1488920587909_0010_02 exited with exitCode: -1000 > Failing this attempt.Diagnostics: Application application_1488920587909_0010 > initialization failed (exitCode=24) with output: > For more detailed output, check the application tracking page: > http://node-1.domain.com:8088/cluster/app/application_1488920587909_0010 Then > click on links to logs of each attempt. > . Failing the application. > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6302) Fail the node, if Linux Container Executor is not configured properly
[ https://issues.apache.org/jira/browse/YARN-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933892#comment-15933892 ] ASF GitHub Bot commented on YARN-6302: -- Github user szegedim commented on a diff in the pull request: https://github.com/apache/hadoop/pull/200#discussion_r107051720 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LinuxContainerExecutor.java --- @@ -525,6 +580,23 @@ public int launchContainer(ContainerStartContext ctx) throws IOException { logOutput(diagnostics); container.handle(new ContainerDiagnosticsUpdateEvent(containerId, diagnostics)); +if (exitCode == LinuxContainerExecutorExitCode. --- End diff -- I get enum types cannot be instantiated. I could create a function that returns the appropriate enum for an int value, but would not that be an overkill here? > Fail the node, if Linux Container Executor is not configured properly > - > > Key: YARN-6302 > URL: https://issues.apache.org/jira/browse/YARN-6302 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > > We have a cluster that has one node with misconfigured Linux Container > Executor. Every time an AM or regular container is launched on the cluster, > it will fail. The node will still have resources available, so it keeps > failing apps until the administrator notices the issue and decommissions the > node. AM Blacklisting only helps, if the application is already running. > As a possible improvement, when the LCE is used on the cluster and a NM gets > certain errors back from the LCE, like error 24 configuration not found, we > should not try to allocate anything on the node anymore or shut down the node > entirely. That kind of problem normally does not fix itself and it means that > nothing can really run on that node. > {code} > Application application_1488920587909_0010 failed 2 times due to AM Container > for appattempt_1488920587909_0010_02 exited with exitCode: -1000 > Failing this attempt.Diagnostics: Application application_1488920587909_0010 > initialization failed (exitCode=24) with output: > For more detailed output, check the application tracking page: > http://node-1.domain.com:8088/cluster/app/application_1488920587909_0010 Then > click on links to logs of each attempt. > . Failing the application. > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
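A minimal sketch of the lookup-function idea from the comment above: a static map from the native exit code back to the enum constant, so callers can compare enums instead of raw ints. The fromExitCode() helper and the INVALID_CONFIG_FILE name are assumptions, not code from the YARN-6302 patch; only INVALID_ARGUMENT_NUMBER = 1 (from the container-executor.h diff) and exit code 24 meaning "configuration not found" (from the issue description) come from this thread.

{code}
import java.util.HashMap;
import java.util.Map;

public enum LinuxContainerExecutorExitCode {
  INVALID_ARGUMENT_NUMBER(1),
  INVALID_CONFIG_FILE(24); // assumed name for "configuration not found"

  private static final Map<Integer, LinuxContainerExecutorExitCode> BY_CODE =
      new HashMap<>();
  static {
    for (LinuxContainerExecutorExitCode c : values()) {
      BY_CODE.put(c.getExitCode(), c);
    }
  }

  private final int code;

  LinuxContainerExecutorExitCode(int code) {
    this.code = code;
  }

  public int getExitCode() {
    return code;
  }

  /** Returns the matching constant, or null if the exit code is unknown. */
  public static LinuxContainerExecutorExitCode fromExitCode(int exitCode) {
    return BY_CODE.get(exitCode);
  }
}
{code}

With such a helper the launchContainer() check could compare fromExitCode(exitCode) against INVALID_CONFIG_FILE via ==, at the cost of the extra indirection the commenter considers overkill here.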
[jira] [Commented] (YARN-6302) Fail the node, if Linux Container Executor is not configured properly
[ https://issues.apache.org/jira/browse/YARN-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933876#comment-15933876 ] ASF GitHub Bot commented on YARN-6302: -- Github user szegedim commented on a diff in the pull request: https://github.com/apache/hadoop/pull/200#discussion_r107051128 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/exceptions/ConfigurationException.java --- @@ -0,0 +1,44 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.exceptions; + +import org.apache.hadoop.classification.InterfaceAudience.Public; +import org.apache.hadoop.classification.InterfaceStability.Unstable; + +/** + * This exception is thrown on unrecoverable container launch errors. --- End diff -- Agreed. Fixed the code. > Fail the node, if Linux Container Executor is not configured properly > - > > Key: YARN-6302 > URL: https://issues.apache.org/jira/browse/YARN-6302 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > > We have a cluster that has one node with misconfigured Linux Container > Executor. Every time an AM or regular container is launched on the cluster, > it will fail. The node will still have resources available, so it keeps > failing apps until the administrator notices the issue and decommissions the > node. AM Blacklisting only helps, if the application is already running. > As a possible improvement, when the LCE is used on the cluster and a NM gets > certain errors back from the LCE, like error 24 configuration not found, we > should not try to allocate anything on the node anymore or shut down the node > entirely. That kind of problem normally does not fix itself and it means that > nothing can really run on that node. > {code} > Application application_1488920587909_0010 failed 2 times due to AM Container > for appattempt_1488920587909_0010_02 exited with exitCode: -1000 > Failing this attempt.Diagnostics: Application application_1488920587909_0010 > initialization failed (exitCode=24) with output: > For more detailed output, check the application tracking page: > http://node-1.domain.com:8088/cluster/app/application_1488920587909_0010 Then > click on links to logs of each attempt. > . Failing the application. > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6302) Fail the node, if Linux Container Executor is not configured properly
[ https://issues.apache.org/jira/browse/YARN-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933827#comment-15933827 ] ASF GitHub Bot commented on YARN-6302: -- Github user templedf commented on a diff in the pull request: https://github.com/apache/hadoop/pull/200#discussion_r107045163 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeHealthCheckerService.java --- @@ -54,22 +58,35 @@ protected void serviceInit(Configuration conf) throws Exception { * @return the reporting string of health of the node */ String getHealthReport() { +String healthReport = ""; --- End diff -- This would be a bit cleaner with a Joiner: String scriptReport = (nodeHealthScriptRunner == null) ? null : nodeHealthScriptRunner.getHealthReport(); String discReport = dirsHandler.getDisksHealthReport(false); String exceptionReport = nodeHealthException == null ? null : nodeHealthException.getMessage(); String healthReport = Joiner.on(SEPARATOR).skipNulls().join(scriptReport, discReport.equals("") ? null : discReport, exceptionReport); The discReport throws a monkey wrench in the works because it's returning "" instead of null. There's probably a more elegant solution than what I did above... > Fail the node, if Linux Container Executor is not configured properly > - > > Key: YARN-6302 > URL: https://issues.apache.org/jira/browse/YARN-6302 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > > We have a cluster that has one node with misconfigured Linux Container > Executor. Every time an AM or regular container is launched on the cluster, > it will fail. The node will still have resources available, so it keeps > failing apps until the administrator notices the issue and decommissions the > node. AM Blacklisting only helps, if the application is already running. > As a possible improvement, when the LCE is used on the cluster and a NM gets > certain errors back from the LCE, like error 24 configuration not found, we > should not try to allocate anything on the node anymore or shut down the node > entirely. That kind of problem normally does not fix itself and it means that > nothing can really run on that node. > {code} > Application application_1488920587909_0010 failed 2 times due to AM Container > for appattempt_1488920587909_0010_02 exited with exitCode: -1000 > Failing this attempt.Diagnostics: Application application_1488920587909_0010 > initialization failed (exitCode=24) with output: > For more detailed output, check the application tracking page: > http://node-1.domain.com:8088/cluster/app/application_1488920587909_0010 Then > click on links to logs of each attempt. > . Failing the application. > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
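A self-contained version of the Joiner variant suggested in the comment above; Guava's Strings.emptyToNull() is one way to absorb the empty string returned by getDisksHealthReport(false), the "monkey wrench" noted in the review. The helper class and the SEPARATOR value are illustrative, not the committed NodeHealthCheckerService code.

{code}
import com.google.common.base.Joiner;
import com.google.common.base.Strings;

public final class HealthReportJoiner {
  // Assumed separator value; NodeHealthCheckerService defines its own SEPARATOR.
  private static final String SEPARATOR = ";";

  /**
   * Joins the script, disk and exception reports, skipping null or empty
   * parts so the separator only appears between non-empty reports.
   */
  static String join(String scriptReport, String discReport,
      String exceptionReport) {
    return Joiner.on(SEPARATOR).skipNulls().join(
        Strings.emptyToNull(scriptReport),
        Strings.emptyToNull(discReport),
        Strings.emptyToNull(exceptionReport));
  }

  public static void main(String[] args) {
    // The empty disk report is dropped instead of producing a dangling ";".
    System.out.println(join("health script failed", "", null));
  }
}
{code}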
[jira] [Commented] (YARN-6302) Fail the node, if Linux Container Executor is not configured properly
[ https://issues.apache.org/jira/browse/YARN-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933823#comment-15933823 ] ASF GitHub Bot commented on YARN-6302: -- Github user templedf commented on a diff in the pull request: https://github.com/apache/hadoop/pull/200#discussion_r107029835 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java --- @@ -294,6 +295,14 @@ public Integer call() { .setUserLocalDirs(userLocalDirs) .setContainerLocalDirs(containerLocalDirs) .setContainerLogDirs(containerLogDirs).build()); +} catch (ConfigurationException e) { + LOG.error("Failed to launch container.", e); --- End diff -- Since you know it was a configuration error, you may as well say so in the error message. > Fail the node, if Linux Container Executor is not configured properly > - > > Key: YARN-6302 > URL: https://issues.apache.org/jira/browse/YARN-6302 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > > We have a cluster that has one node with misconfigured Linux Container > Executor. Every time an AM or regular container is launched on the cluster, > it will fail. The node will still have resources available, so it keeps > failing apps until the administrator notices the issue and decommissions the > node. AM Blacklisting only helps, if the application is already running. > As a possible improvement, when the LCE is used on the cluster and a NM gets > certain errors back from the LCE, like error 24 configuration not found, we > should not try to allocate anything on the node anymore or shut down the node > entirely. That kind of problem normally does not fix itself and it means that > nothing can really run on that node. > {code} > Application application_1488920587909_0010 failed 2 times due to AM Container > for appattempt_1488920587909_0010_02 exited with exitCode: -1000 > Failing this attempt.Diagnostics: Application application_1488920587909_0010 > initialization failed (exitCode=24) with output: > For more detailed output, check the application tracking page: > http://node-1.domain.com:8088/cluster/app/application_1488920587909_0010 Then > click on links to logs of each attempt. > . Failing the application. > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
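A hedged sketch of the message change requested here: the catch block already knows the failure is configuration-related, so the log line can say so. The executor stub, method names, and return value are placeholders, not the ContainerLaunch code from the patch; only the ConfigurationException type comes from the diff under review.

{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.yarn.exceptions.ConfigurationException;

public class ContainerLaunchLoggingSketch {
  private static final Log LOG =
      LogFactory.getLog(ContainerLaunchLoggingSketch.class);

  /** Hypothetical stand-in for the ContainerExecutor call made in call(). */
  interface ExecutorStub {
    int launchContainer() throws ConfigurationException;
  }

  int launch(ExecutorStub exec) {
    try {
      return exec.launchContainer();
    } catch (ConfigurationException e) {
      // Name the root cause explicitly instead of a generic failure message.
      LOG.error("Failed to launch container due to a configuration error.", e);
      return -1; // placeholder; the real patch also reports the failure
    }
  }
}
{code}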
[jira] [Commented] (YARN-6302) Fail the node, if Linux Container Executor is not configured properly
[ https://issues.apache.org/jira/browse/YARN-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933819#comment-15933819 ] ASF GitHub Bot commented on YARN-6302: -- Github user templedf commented on a diff in the pull request: https://github.com/apache/hadoop/pull/200#discussion_r107028842 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/exceptions/ConfigurationException.java --- @@ -0,0 +1,44 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.yarn.exceptions; + +import org.apache.hadoop.classification.InterfaceAudience.Public; +import org.apache.hadoop.classification.InterfaceStability.Unstable; + +/** + * This exception is thrown on unrecoverable container launch errors. --- End diff -- No reason to constrain the use of the exception. Maybe offer the launch errors as an example or suggested use? > Fail the node, if Linux Container Executor is not configured properly > - > > Key: YARN-6302 > URL: https://issues.apache.org/jira/browse/YARN-6302 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > > We have a cluster that has one node with misconfigured Linux Container > Executor. Every time an AM or regular container is launched on the cluster, > it will fail. The node will still have resources available, so it keeps > failing apps until the administrator notices the issue and decommissions the > node. AM Blacklisting only helps, if the application is already running. > As a possible improvement, when the LCE is used on the cluster and a NM gets > certain errors back from the LCE, like error 24 configuration not found, we > should not try to allocate anything on the node anymore or shut down the node > entirely. That kind of problem normally does not fix itself and it means that > nothing can really run on that node. > {code} > Application application_1488920587909_0010 failed 2 times due to AM Container > for appattempt_1488920587909_0010_02 exited with exitCode: -1000 > Failing this attempt.Diagnostics: Application application_1488920587909_0010 > initialization failed (exitCode=24) with output: > For more detailed output, check the application tracking page: > http://node-1.domain.com:8088/cluster/app/application_1488920587909_0010 Then > click on links to logs of each attempt. > . Failing the application. > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6302) Fail the node, if Linux Container Executor is not configured properly
[ https://issues.apache.org/jira/browse/YARN-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933825#comment-15933825 ] ASF GitHub Bot commented on YARN-6302: -- Github user templedf commented on a diff in the pull request: https://github.com/apache/hadoop/pull/200#discussion_r107029968 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerRelaunch.java --- @@ -115,6 +116,14 @@ public Integer call() { .setContainerLocalDirs(containerLocalDirs) .setContainerLogDirs(containerLogDirs) .build()); +} catch (ConfigurationException e) { + LOG.error("Failed to relaunch container.", e); --- End diff -- Since you know it was a configuration error, you may as well say so in the error message. > Fail the node, if Linux Container Executor is not configured properly > - > > Key: YARN-6302 > URL: https://issues.apache.org/jira/browse/YARN-6302 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > > We have a cluster that has one node with misconfigured Linux Container > Executor. Every time an AM or regular container is launched on the cluster, > it will fail. The node will still have resources available, so it keeps > failing apps until the administrator notices the issue and decommissions the > node. AM Blacklisting only helps, if the application is already running. > As a possible improvement, when the LCE is used on the cluster and a NM gets > certain errors back from the LCE, like error 24 configuration not found, we > should not try to allocate anything on the node anymore or shut down the node > entirely. That kind of problem normally does not fix itself and it means that > nothing can really run on that node. > {code} > Application application_1488920587909_0010 failed 2 times due to AM Container > for appattempt_1488920587909_0010_02 exited with exitCode: -1000 > Failing this attempt.Diagnostics: Application application_1488920587909_0010 > initialization failed (exitCode=24) with output: > For more detailed output, check the application tracking page: > http://node-1.domain.com:8088/cluster/app/application_1488920587909_0010 Then > click on links to logs of each attempt. > . Failing the application. > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6302) Fail the node, if Linux Container Executor is not configured properly
[ https://issues.apache.org/jira/browse/YARN-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933821#comment-15933821 ] ASF GitHub Bot commented on YARN-6302: -- Github user templedf commented on a diff in the pull request: https://github.com/apache/hadoop/pull/200#discussion_r107029030 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeHealthCheckerService.java --- @@ -31,6 +31,8 @@ private NodeHealthScriptRunner nodeHealthScriptRunner; private LocalDirsHandlerService dirsHandler; + private Exception nodeHealthException; + long nodeHealthExceptionReportTime; --- End diff -- My rule of thumb is that If it's not private, it should have javadocs. > Fail the node, if Linux Container Executor is not configured properly > - > > Key: YARN-6302 > URL: https://issues.apache.org/jira/browse/YARN-6302 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > > We have a cluster that has one node with misconfigured Linux Container > Executor. Every time an AM or regular container is launched on the cluster, > it will fail. The node will still have resources available, so it keeps > failing apps until the administrator notices the issue and decommissions the > node. AM Blacklisting only helps, if the application is already running. > As a possible improvement, when the LCE is used on the cluster and a NM gets > certain errors back from the LCE, like error 24 configuration not found, we > should not try to allocate anything on the node anymore or shut down the node > entirely. That kind of problem normally does not fix itself and it means that > nothing can really run on that node. > {code} > Application application_1488920587909_0010 failed 2 times due to AM Container > for appattempt_1488920587909_0010_02 exited with exitCode: -1000 > Failing this attempt.Diagnostics: Application application_1488920587909_0010 > initialization failed (exitCode=24) with output: > For more detailed output, check the application tracking page: > http://node-1.domain.com:8088/cluster/app/application_1488920587909_0010 Then > click on links to logs of each attempt. > . Failing the application. > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6302) Fail the node, if Linux Container Executor is not configured properly
[ https://issues.apache.org/jira/browse/YARN-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933828#comment-15933828 ] ASF GitHub Bot commented on YARN-6302: -- Github user templedf commented on a diff in the pull request: https://github.com/apache/hadoop/pull/200#discussion_r107029448 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeHealthCheckerService.java --- @@ -80,6 +97,7 @@ long getLastHealthReportTime() { long lastReportTime = (nodeHealthScriptRunner == null) --- End diff -- This isn't your code, but it's hideous. Wanna clean it up, too? :) > Fail the node, if Linux Container Executor is not configured properly > - > > Key: YARN-6302 > URL: https://issues.apache.org/jira/browse/YARN-6302 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > > We have a cluster that has one node with misconfigured Linux Container > Executor. Every time an AM or regular container is launched on the cluster, > it will fail. The node will still have resources available, so it keeps > failing apps until the administrator notices the issue and decommissions the > node. AM Blacklisting only helps, if the application is already running. > As a possible improvement, when the LCE is used on the cluster and a NM gets > certain errors back from the LCE, like error 24 configuration not found, we > should not try to allocate anything on the node anymore or shut down the node > entirely. That kind of problem normally does not fix itself and it means that > nothing can really run on that node. > {code} > Application application_1488920587909_0010 failed 2 times due to AM Container > for appattempt_1488920587909_0010_02 exited with exitCode: -1000 > Failing this attempt.Diagnostics: Application application_1488920587909_0010 > initialization failed (exitCode=24) with output: > For more detailed output, check the application tracking page: > http://node-1.domain.com:8088/cluster/app/application_1488920587909_0010 Then > click on links to logs of each attempt. > . Failing the application. > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6302) Fail the node, if Linux Container Executor is not configured properly
[ https://issues.apache.org/jira/browse/YARN-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933824#comment-15933824 ] ASF GitHub Bot commented on YARN-6302: -- Github user templedf commented on a diff in the pull request: https://github.com/apache/hadoop/pull/200#discussion_r107029264 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeHealthCheckerService.java --- @@ -54,22 +58,35 @@ protected void serviceInit(Configuration conf) throws Exception { * @return the reporting string of health of the node */ String getHealthReport() { +String healthReport = ""; String scriptReport = (nodeHealthScriptRunner == null) ? "" : nodeHealthScriptRunner.getHealthReport(); -if (scriptReport.equals("")) { - return dirsHandler.getDisksHealthReport(false); -} else { - return scriptReport.concat(SEPARATOR + dirsHandler.getDisksHealthReport(false)); +String discReport = dirsHandler.getDisksHealthReport(false); +String exceptionReport = nodeHealthException != null ? +nodeHealthException.getMessage() : ""; + +if (!scriptReport.equals("")) { + healthReport = scriptReport; +} +if (!discReport.equals("")) { + healthReport = healthReport.equals("") ? discReport : + healthReport.concat(SEPARATOR + discReport); } +if (!exceptionReport.equals("")) { + healthReport = healthReport.equals("") ? exceptionReport : + healthReport.concat(SEPARATOR + exceptionReport); +} +return healthReport; } /** * @return true if the node is healthy */ boolean isHealthy() { -boolean scriptHealthStatus = (nodeHealthScriptRunner == null) ? true -: nodeHealthScriptRunner.isHealthy(); -return scriptHealthStatus && dirsHandler.areDisksHealthy(); +boolean scriptHealthStatus = nodeHealthScriptRunner == null || --- End diff -- Maybe rename this one scriptHealthy > Fail the node, if Linux Container Executor is not configured properly > - > > Key: YARN-6302 > URL: https://issues.apache.org/jira/browse/YARN-6302 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > > We have a cluster that has one node with misconfigured Linux Container > Executor. Every time an AM or regular container is launched on the cluster, > it will fail. The node will still have resources available, so it keeps > failing apps until the administrator notices the issue and decommissions the > node. AM Blacklisting only helps, if the application is already running. > As a possible improvement, when the LCE is used on the cluster and a NM gets > certain errors back from the LCE, like error 24 configuration not found, we > should not try to allocate anything on the node anymore or shut down the node > entirely. That kind of problem normally does not fix itself and it means that > nothing can really run on that node. > {code} > Application application_1488920587909_0010 failed 2 times due to AM Container > for appattempt_1488920587909_0010_02 exited with exitCode: -1000 > Failing this attempt.Diagnostics: Application application_1488920587909_0010 > initialization failed (exitCode=24) with output: > For more detailed output, check the application tracking page: > http://node-1.domain.com:8088/cluster/app/application_1488920587909_0010 Then > click on links to logs of each attempt. > . Failing the application. 
> {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6302) Fail the node, if Linux Container Executor is not configured properly
[ https://issues.apache.org/jira/browse/YARN-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933822#comment-15933822 ] ASF GitHub Bot commented on YARN-6302: -- Github user templedf commented on a diff in the pull request: https://github.com/apache/hadoop/pull/200#discussion_r107028894 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LinuxContainerExecutor.java --- @@ -111,6 +113,58 @@ private LinuxContainerRuntime linuxContainerRuntime; /** + * The container exit code. + */ + public enum LinuxContainerExecutorExitCode { --- End diff -- Love it! > Fail the node, if Linux Container Executor is not configured properly > - > > Key: YARN-6302 > URL: https://issues.apache.org/jira/browse/YARN-6302 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > > We have a cluster that has one node with misconfigured Linux Container > Executor. Every time an AM or regular container is launched on the cluster, > it will fail. The node will still have resources available, so it keeps > failing apps until the administrator notices the issue and decommissions the > node. AM Blacklisting only helps, if the application is already running. > As a possible improvement, when the LCE is used on the cluster and a NM gets > certain errors back from the LCE, like error 24 configuration not found, we > should not try to allocate anything on the node anymore or shut down the node > entirely. That kind of problem normally does not fix itself and it means that > nothing can really run on that node. > {code} > Application application_1488920587909_0010 failed 2 times due to AM Container > for appattempt_1488920587909_0010_02 exited with exitCode: -1000 > Failing this attempt.Diagnostics: Application application_1488920587909_0010 > initialization failed (exitCode=24) with output: > For more detailed output, check the application tracking page: > http://node-1.domain.com:8088/cluster/app/application_1488920587909_0010 Then > click on links to logs of each attempt. > . Failing the application. > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6302) Fail the node, if Linux Container Executor is not configured properly
[ https://issues.apache.org/jira/browse/YARN-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933826#comment-15933826 ] ASF GitHub Bot commented on YARN-6302: -- Github user templedf commented on a diff in the pull request: https://github.com/apache/hadoop/pull/200#discussion_r107028658 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LinuxContainerExecutor.java --- @@ -525,6 +580,23 @@ public int launchContainer(ContainerStartContext ctx) throws IOException { logOutput(diagnostics); container.handle(new ContainerDiagnosticsUpdateEvent(containerId, diagnostics)); +if (exitCode == LinuxContainerExecutorExitCode. --- End diff -- Would it be cleaner to create a new LinuxContainerExecutorExitCode from your exitCode and then test via ==? > Fail the node, if Linux Container Executor is not configured properly > - > > Key: YARN-6302 > URL: https://issues.apache.org/jira/browse/YARN-6302 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > > We have a cluster that has one node with misconfigured Linux Container > Executor. Every time an AM or regular container is launched on the cluster, > it will fail. The node will still have resources available, so it keeps > failing apps until the administrator notices the issue and decommissions the > node. AM Blacklisting only helps, if the application is already running. > As a possible improvement, when the LCE is used on the cluster and a NM gets > certain errors back from the LCE, like error 24 configuration not found, we > should not try to allocate anything on the node anymore or shut down the node > entirely. That kind of problem normally does not fix itself and it means that > nothing can really run on that node. > {code} > Application application_1488920587909_0010 failed 2 times due to AM Container > for appattempt_1488920587909_0010_02 exited with exitCode: -1000 > Failing this attempt.Diagnostics: Application application_1488920587909_0010 > initialization failed (exitCode=24) with output: > For more detailed output, check the application tracking page: > http://node-1.domain.com:8088/cluster/app/application_1488920587909_0010 Then > click on links to logs of each attempt. > . Failing the application. > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6302) Fail the node, if Linux Container Executor is not configured properly
[ https://issues.apache.org/jira/browse/YARN-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933820#comment-15933820 ] ASF GitHub Bot commented on YARN-6302: -- Github user templedf commented on a diff in the pull request: https://github.com/apache/hadoop/pull/200#discussion_r107033241 --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.h --- @@ -37,8 +37,8 @@ enum command { enum errorcodes { INVALID_ARGUMENT_NUMBER = 1, - INVALID_USER_NAME, //2 - INVALID_COMMAND_PROVIDED, //3 + //INVALID_USER_NAME 2 --- End diff -- This section of code makes me want to weep. > Fail the node, if Linux Container Executor is not configured properly > - > > Key: YARN-6302 > URL: https://issues.apache.org/jira/browse/YARN-6302 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > > We have a cluster that has one node with misconfigured Linux Container > Executor. Every time an AM or regular container is launched on the cluster, > it will fail. The node will still have resources available, so it keeps > failing apps until the administrator notices the issue and decommissions the > node. AM Blacklisting only helps, if the application is already running. > As a possible improvement, when the LCE is used on the cluster and a NM gets > certain errors back from the LCE, like error 24 configuration not found, we > should not try to allocate anything on the node anymore or shut down the node > entirely. That kind of problem normally does not fix itself and it means that > nothing can really run on that node. > {code} > Application application_1488920587909_0010 failed 2 times due to AM Container > for appattempt_1488920587909_0010_02 exited with exitCode: -1000 > Failing this attempt.Diagnostics: Application application_1488920587909_0010 > initialization failed (exitCode=24) with output: > For more detailed output, check the application tracking page: > http://node-1.domain.com:8088/cluster/app/application_1488920587909_0010 Then > click on links to logs of each attempt. > . Failing the application. > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6285) Add option to set max limit on ResourceManager for ApplicationClientProtocol.getApplications
[ https://issues.apache.org/jira/browse/YARN-6285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933790#comment-15933790 ] yunjiong zhao commented on YARN-6285: - YARN-6339 is not applied to our cluster yet. When I created YARN-6285, what I wanted was a simple patch that would allow us to control the GC ASAP. With YARN-6339, I believe we can set yarn.resourcemanager.max-limit-get-applications to a bigger value, or not need to set a limit any more. Will let you know after YARN-6339 has passed review and been applied in our cluster. > Add option to set max limit on ResourceManager for > ApplicationClientProtocol.getApplications > > > Key: YARN-6285 > URL: https://issues.apache.org/jira/browse/YARN-6285 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: yunjiong zhao >Assignee: yunjiong zhao > Attachments: YARN-6285.001.patch, YARN-6285.002.patch, > YARN-6285.003.patch > > > When users called ApplicationClientProtocol.getApplications, it will return > lots of data, and generate lots of garbage on ResourceManager which caused > long time GC. > For example, on one of our RM, when called rest API " http:// address:port>/ws/v1/cluster/apps" it can return 150MB data which have 944 > applications. > getApplications have limit parameter, but some user might not set it, and > then the limit will be Long.MAX_VALUE. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
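As an illustration of the cap being discussed, a sketch under assumptions: the property name comes from the comment above, while the default value and the clamp helper are hypothetical and not the YARN-6285 patch itself. A ResourceManager-side limit for getApplications() could be applied roughly like this.

{code}
import org.apache.hadoop.conf.Configuration;

public final class GetApplicationsLimitSketch {
  static final String MAX_LIMIT_KEY =
      "yarn.resourcemanager.max-limit-get-applications";
  // Assumed default: no cap, matching the behaviour when the property is unset.
  static final long DEFAULT_MAX_LIMIT = Long.MAX_VALUE;

  /**
   * Clamps the client-supplied limit. Requests that omit the limit default to
   * Long.MAX_VALUE, which is what produces the huge responses and the GC
   * pressure described in the issue.
   */
  static long effectiveLimit(Configuration conf, long requestedLimit) {
    return Math.min(requestedLimit,
        conf.getLong(MAX_LIMIT_KEY, DEFAULT_MAX_LIMIT));
  }
}
{code}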
[jira] [Commented] (YARN-6339) Improve performance for createAndGetApplicationReport
[ https://issues.apache.org/jira/browse/YARN-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933782#comment-15933782 ] yunjiong zhao commented on YARN-6339: - {quote}Why changes of createAndGetApplicationReport required? {quote} The purpose is to avoid calling getLogAggregationStatus() unnecessarily inside getLogAggregationReportsForApp() after the application's LogAggregationStatus has changed to TIME_OUT. I think we should add LogAggregationStatus.TIME_OUT to isLogAggregationFinished(), like LogAggregationStatus.SUCCEEDED and LogAggregationStatus.FAILED. If we ignore future risks, we could even update logAggregationStatusForAppReport inside getLogAggregationStatusForAppReport() while holding only the readLock. To avoid confusion, since createAndGetApplicationReport() calls getLogAggregationStatusForAppReport() while holding the readLock, I think updating logAggregationStatusForAppReport inside createAndGetApplicationReport() while holding the writeLock is the right thing to do. {code} } else if (logTimeOutCount > 0) { + logAggregationStatusForAppReport = LogAggregationStatus.TIME_OUT; return LogAggregationStatus.TIME_OUT; } {code} > Improve performance for createAndGetApplicationReport > - > > Key: YARN-6339 > URL: https://issues.apache.org/jira/browse/YARN-6339 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: yunjiong zhao >Assignee: yunjiong zhao > Attachments: YARN-6339.001.patch, YARN-6339.002.patch > > > There are two performance issue when calling createAndGetApplicationReport: > One is inside ProtoUtils.convertFromProtoFormat, replace is too slow for > clusters which have more than 3000 nodes. Use substring is much better: > https://issues.apache.org/jira/browse/YARN-6285?focusedCommentId=15923241=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15923241 > Another one is inside getLogAggregationReportsForApp, if some application's > LogAggregationStatus is TIME_OUT, every time it was called it will create an > HashMap which will produce lots of garbage. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
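A small sketch of the first change proposed above: treat TIME_OUT as a terminal state in isLogAggregationFinished(), alongside SUCCEEDED and FAILED. This is a standalone illustration, not the RMAppImpl method itself.

{code}
import org.apache.hadoop.yarn.api.records.LogAggregationStatus;

public final class LogAggregationStateSketch {
  /**
   * Once aggregation has succeeded, failed or timed out, there is no point in
   * recomputing the per-node reports on every createAndGetApplicationReport()
   * call, which is where the wasted HashMap allocations come from.
   */
  static boolean isLogAggregationFinished(LogAggregationStatus status) {
    return status == LogAggregationStatus.SUCCEEDED
        || status == LogAggregationStatus.FAILED
        || status == LogAggregationStatus.TIME_OUT;
  }
}
{code}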
[jira] [Commented] (YARN-6334) TestRMFailover#testAutomaticFailover always passes even when it should fail
[ https://issues.apache.org/jira/browse/YARN-6334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933768#comment-15933768 ] Hadoop QA commented on YARN-6334: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client: The patch generated 0 new + 10 unchanged - 1 fixed = 10 total (was 11) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 20m 32s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 43m 59s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Issue | YARN-6334 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12859656/YARN-6334.005.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 59563687d45c 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 49efd5d | | Default Java | 1.8.0_121 | | findbugs | v3.0.0 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/15341/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/15341/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > TestRMFailover#testAutomaticFailover always passes even when it should fail > --- > > Key: YARN-6334 > URL: https://issues.apache.org/jira/browse/YARN-6334 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: YARN-6334.001.patch, YARN-6334.002.patch, > YARN-6334.003.patch, YARN-6334.004.patch,
[jira] [Commented] (YARN-3427) Remove deprecated methods from ResourceCalculatorProcessTree
[ https://issues.apache.org/jira/browse/YARN-3427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933761#comment-15933761 ] Hadoop QA commented on YARN-3427: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 25s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 6s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 19s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common: The patch generated 1 new + 126 unchanged - 21 fixed = 127 total (was 147) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 20s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 25m 29s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Issue | YARN-3427 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12859657/YARN-3427.000.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 13dfd88aa020 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 49efd5d | | Default Java | 1.8.0_121 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/15342/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/15342/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/15342/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Remove deprecated methods from ResourceCalculatorProcessTree > > > Key: YARN-3427 > URL: https://issues.apache.org/jira/browse/YARN-3427 > Project: Hadoop YARN > Issue Type: Improvement >Affects
[jira] [Commented] (YARN-6368) Decommissioning an NM results in a -1 exit code
[ https://issues.apache.org/jira/browse/YARN-6368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933734#comment-15933734 ] Haibo Chen commented on YARN-6368: -- Thanks [~miklos.szeg...@cloudera.com] for the patch! It looks like resyncWithRM() can also call shutdown(), in which case the exit code should be -1. Maybe we should pass the exit code in to shutdown()? > Decommissioning an NM results in a -1 exit code > --- > > Key: YARN-6368 > URL: https://issues.apache.org/jira/browse/YARN-6368 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > Attachments: YARN-6368.000.patch > > > In NodeManager.java we should exit normally in case the RM shuts down the > node: > {code} > } finally { > if (shouldExitOnShutdownEvent > && !ShutdownHookManager.get().isShutdownInProgress()) { > ExitUtil.terminate(-1); > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
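A minimal, self-contained sketch of that suggestion, using hypothetical names (the real change would live in NodeManager and go through ExitUtil.terminate): the caller picks the exit code, so an RM-initiated decommission can exit 0 while resync and failure paths keep exiting -1.
{code}
// Hypothetical sketch only; not the actual NodeManager code.
final class NodeShutdownSketch {
  static final int EXIT_OK = 0;       // clean decommission requested by the RM
  static final int EXIT_FAILURE = -1; // resync or error paths

  static void shutdown(int exitCode, boolean shutdownHookInProgress) {
    try {
      // stop services here
    } finally {
      if (!shutdownHookInProgress) {
        System.exit(exitCode); // stand-in for ExitUtil.terminate(exitCode)
      }
    }
  }
}
{code}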
[jira] [Commented] (YARN-6342) Issues in async API of TimelineClient
[ https://issues.apache.org/jira/browse/YARN-6342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933724#comment-15933724 ] Haibo Chen commented on YARN-6342: -- publishWithoutBlockingOnQueue() will only throw InterruptedExceptions from calling queue.poll(), which the current code already takes care of. Based on the FutureTask javadoc and my testing, EntitiesHolder.run() never throws an exception. Therefore, I think the publishing thread never exits because of unexpected exceptions while publishing entities. > Issues in async API of TimelineClient > - > > Key: YARN-6342 > URL: https://issues.apache.org/jira/browse/YARN-6342 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jian He >Assignee: Haibo Chen > Labels: yarn-5355-merge-blocker > > Found these with [~rohithsharma] while browsing the code > - In stop: it calls shutdownNow which doesn't wait for pending tasks, should > it use shutdown instead ? > {code} > public void stop() { > LOG.info("Stopping TimelineClient."); > executor.shutdownNow(); > try { > executor.awaitTermination(DRAIN_TIME_PERIOD, TimeUnit.MILLISECONDS); > } catch (InterruptedException e) { > {code} > - In TimelineClientImpl#createRunnable: > If any exception happens when publishing one entity > (publishWithoutBlockingOnQueue), the thread exits. I think it should try > best effort to continue publishing the timeline entities, one failure should > not cause all followup entities not published. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
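For context on the stop() question quoted above, here is a hedged, self-contained sketch of the usual graceful-shutdown pattern: shutdown() lets queued entities drain before the timeout, while shutdownNow() discards them. The class is hypothetical and DRAIN_TIME_PERIOD is an assumed value; this is not the TimelineClientImpl code.
{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a graceful stop(); not the actual TimelineClientImpl code.
final class GracefulStopSketch {
  private static final long DRAIN_TIME_PERIOD = 2000L; // millis, assumed value

  static void stop(ExecutorService executor) {
    executor.shutdown(); // stop accepting new work, but let queued tasks drain
    try {
      if (!executor.awaitTermination(DRAIN_TIME_PERIOD, TimeUnit.MILLISECONDS)) {
        executor.shutdownNow(); // give up after the drain period
      }
    } catch (InterruptedException e) {
      executor.shutdownNow();
      Thread.currentThread().interrupt();
    }
  }

  public static void main(String[] args) {
    stop(Executors.newSingleThreadExecutor());
  }
}
{code}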
[jira] [Commented] (YARN-6367) YARN logs CLI needs alway check containerLogsInfo/containerLogInfo before parse the JSON object from NMWebService
[ https://issues.apache.org/jira/browse/YARN-6367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933722#comment-15933722 ] Hadoop QA commented on YARN-6367: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 26s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 48s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 40m 50s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Issue | YARN-6367 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12859644/YARN-6367.1.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 4ef90326031e 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 49efd5d | | Default Java | 1.8.0_121 | | findbugs | v3.0.0 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/15339/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/15339/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > YARN logs CLI needs alway check containerLogsInfo/containerLogInfo before > parse the JSON object from NMWebService > - > > Key: YARN-6367 > URL: https://issues.apache.org/jira/browse/YARN-6367 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Siddharth Seth >Assignee: Xuan Gong >
[jira] [Commented] (YARN-6334) TestRMFailover#testAutomaticFailover always passes even when it should fail
[ https://issues.apache.org/jira/browse/YARN-6334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933716#comment-15933716 ] Daniel Templeton commented on YARN-6334: Works for me. +1 I'll commit soon. > TestRMFailover#testAutomaticFailover always passes even when it should fail > --- > > Key: YARN-6334 > URL: https://issues.apache.org/jira/browse/YARN-6334 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: YARN-6334.001.patch, YARN-6334.002.patch, > YARN-6334.003.patch, YARN-6334.004.patch, YARN-6334.005.patch > > > Due to a bug in {{while}} loop. > {code} > int maxWaitingAttempts = 2000; > while (maxWaitingAttempts-- > 0 ) { > if (rm.getRMContext().getHAServiceState() == HAServiceState.STANDBY) { > break; > } > Thread.sleep(1); > } > Assert.assertFalse("RM didn't transition to Standby ", > maxWaitingAttempts == 0); > {code} > maxWaitingAttempts is -1 if RM didn't transition to Standby. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6368) Decommissioning an NM results in a -1 exit code
[ https://issues.apache.org/jira/browse/YARN-6368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933710#comment-15933710 ] Hadoop QA commented on YARN-6368: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 59s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 35m 17s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Issue | YARN-6368 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12859641/YARN-6368.000.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 5a11c986db12 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 49efd5d | | Default Java | 1.8.0_121 | | findbugs | v3.0.0 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/15340/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/15340/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Decommissioning an NM results in a -1 exit code > --- > > Key: YARN-6368 > URL: https://issues.apache.org/jira/browse/YARN-6368 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > Attachments:
[jira] [Updated] (YARN-3427) Remove deprecated methods from ResourceCalculatorProcessTree
[ https://issues.apache.org/jira/browse/YARN-3427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Szegedi updated YARN-3427: - Attachment: YARN-3427.000.patch Attached patch > Remove deprecated methods from ResourceCalculatorProcessTree > > > Key: YARN-3427 > URL: https://issues.apache.org/jira/browse/YARN-3427 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.7.0 >Reporter: Karthik Kambatla >Assignee: Miklos Szegedi >Priority: Blocker > Attachments: YARN-3427.000.patch > > > In 2.7, we made ResourceCalculatorProcessTree Public and exposed some > existing ill-formed methods as deprecated ones for use by Tez. > We should remove it in 3.0.0, considering that the methods have been > deprecated for the all 2.x.y releases that it is marked Public in. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6334) TestRMFailover#testAutomaticFailover always passes even when it should fail
[ https://issues.apache.org/jira/browse/YARN-6334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933702#comment-15933702 ] Yufei Gu commented on YARN-6334: Thanks [~templedf] for the review. Uploaded patch v5 for your comment. > TestRMFailover#testAutomaticFailover always passes even when it should fail > --- > > Key: YARN-6334 > URL: https://issues.apache.org/jira/browse/YARN-6334 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: YARN-6334.001.patch, YARN-6334.002.patch, > YARN-6334.003.patch, YARN-6334.004.patch, YARN-6334.005.patch > > > Due to a bug in {{while}} loop. > {code} > int maxWaitingAttempts = 2000; > while (maxWaitingAttempts-- > 0 ) { > if (rm.getRMContext().getHAServiceState() == HAServiceState.STANDBY) { > break; > } > Thread.sleep(1); > } > Assert.assertFalse("RM didn't transition to Standby ", > maxWaitingAttempts == 0); > {code} > maxWaitingAttempts is -1 if RM didn't transition to Standby. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6334) TestRMFailover#testAutomaticFailover always passes even when it should fail
[ https://issues.apache.org/jira/browse/YARN-6334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu updated YARN-6334: --- Attachment: YARN-6334.005.patch > TestRMFailover#testAutomaticFailover always passes even when it should fail > --- > > Key: YARN-6334 > URL: https://issues.apache.org/jira/browse/YARN-6334 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: YARN-6334.001.patch, YARN-6334.002.patch, > YARN-6334.003.patch, YARN-6334.004.patch, YARN-6334.005.patch > > > Due to a bug in {{while}} loop. > {code} > int maxWaitingAttempts = 2000; > while (maxWaitingAttempts-- > 0 ) { > if (rm.getRMContext().getHAServiceState() == HAServiceState.STANDBY) { > break; > } > Thread.sleep(1); > } > Assert.assertFalse("RM didn't transition to Standby ", > maxWaitingAttempts == 0); > {code} > maxWaitingAttempts is -1 if RM didn't transition to Standby. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6285) Add option to set max limit on ResourceManager for ApplicationClientProtocol.getApplications
[ https://issues.apache.org/jira/browse/YARN-6285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933646#comment-15933646 ] Wangda Tan commented on YARN-6285: -- Thanks [~zhaoyunjiong] for sharing the experimental results. Now I understand why YARN-6339 is required, and I will check it in once it is ready. For your last comment: bq. On the cluster we applied this patch and set yarn.resourcemanager.max-limit-get-applications to 400 I'm actually not sure which part improves performance, since there are two variables (the YARN-6339 patch and the change in #apps). It would be more helpful if you could get some benchmarks from a config with only one variable. > Add option to set max limit on ResourceManager for > ApplicationClientProtocol.getApplications > > > Key: YARN-6285 > URL: https://issues.apache.org/jira/browse/YARN-6285 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: yunjiong zhao >Assignee: yunjiong zhao > Attachments: YARN-6285.001.patch, YARN-6285.002.patch, > YARN-6285.003.patch > > > When users called ApplicationClientProtocol.getApplications, it will return > lots of data, and generate lots of garbage on ResourceManager which caused > long time GC. > For example, on one of our RM, when called rest API " http:// address:port>/ws/v1/cluster/apps" it can return 150MB data which have 944 > applications. > getApplications have limit parameter, but some user might not set it, and > then the limit will be Long.MAX_VALUE. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6367) YARN logs CLI needs alway check containerLogsInfo/containerLogInfo before parse the JSON object from NMWebService
[ https://issues.apache.org/jira/browse/YARN-6367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-6367: Attachment: YARN-6367.1.patch Trivial patch; no test case needed. > YARN logs CLI needs alway check containerLogsInfo/containerLogInfo before > parse the JSON object from NMWebService > - > > Key: YARN-6367 > URL: https://issues.apache.org/jira/browse/YARN-6367 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Siddharth Seth >Assignee: Xuan Gong > Attachments: YARN-6367.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
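Since the YARN-6367 patch is described as trivial, here is a hedged illustration of the defensive check the issue title describes, written against a plain Map rather than the actual NM web service response types; the helper name and shape are made up for illustration, not taken from the patch.
{code}
import java.util.Map;

// Hypothetical sketch of the defensive check; not the actual YARN-6367 patch.
final class ContainerLogInfoCheckSketch {
  @SuppressWarnings("unchecked")
  static Object containerLogInfoOrNull(Map<String, Object> response) {
    Object logsInfo = response.get("containerLogsInfo");
    if (!(logsInfo instanceof Map)) {
      // Field missing or of an unexpected shape: do not try to parse further.
      return null;
    }
    return ((Map<String, Object>) logsInfo).get("containerLogInfo");
  }
}
{code}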
[jira] [Commented] (YARN-6339) Improve performance for createAndGetApplicationReport
[ https://issues.apache.org/jira/browse/YARN-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933638#comment-15933638 ] Wangda Tan commented on YARN-6339: -- Thanks [~zhaoyunjiong]. For your latest patch, there's only one thing I'm not sure about: why are the changes to {{createAndGetApplicationReport}} required? It would be helpful if you could share more details about this part. The other changes in the patch look good. > Improve performance for createAndGetApplicationReport > - > > Key: YARN-6339 > URL: https://issues.apache.org/jira/browse/YARN-6339 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: yunjiong zhao >Assignee: yunjiong zhao > Attachments: YARN-6339.001.patch, YARN-6339.002.patch > > > There are two performance issue when calling createAndGetApplicationReport: > One is inside ProtoUtils.convertFromProtoFormat, replace is too slow for > clusters which have more than 3000 nodes. Use substring is much better: > https://issues.apache.org/jira/browse/YARN-6285?focusedCommentId=15923241=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15923241 > Another one is inside getLogAggregationReportsForApp, if some application's > LogAggregationStatus is TIME_OUT, every time it was called it will create an > HashMap which will produce lots of garbage. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6334) TestRMFailover#testAutomaticFailover always passes even when it should fail
[ https://issues.apache.org/jira/browse/YARN-6334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933627#comment-15933627 ] Daniel Templeton commented on YARN-6334: I'm not a fan of wonky _for_ statements. Wanna replace that _for-if_ with a _while_ and a decrement? > TestRMFailover#testAutomaticFailover always passes even when it should fail > --- > > Key: YARN-6334 > URL: https://issues.apache.org/jira/browse/YARN-6334 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: YARN-6334.001.patch, YARN-6334.002.patch, > YARN-6334.003.patch, YARN-6334.004.patch > > > Due to a bug in {{while}} loop. > {code} > int maxWaitingAttempts = 2000; > while (maxWaitingAttempts-- > 0 ) { > if (rm.getRMContext().getHAServiceState() == HAServiceState.STANDBY) { > break; > } > Thread.sleep(1); > } > Assert.assertFalse("RM didn't transition to Standby ", > maxWaitingAttempts == 0); > {code} > maxWaitingAttempts is -1 if RM didn't transition to Standby. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
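To make the shape of that suggestion concrete, here is a hedged sketch of a while loop with an explicit decrement, so the counter is 0 only when every attempt was used up and the final assertion becomes meaningful. rmIsStandby is a hypothetical stand-in for the real HAServiceState check; this is not the committed YARN-6334 patch.
{code}
import java.util.function.BooleanSupplier;

// Hypothetical sketch; not the committed YARN-6334 patch.
final class FailoverWaitSketch {
  static boolean waitForStandby(BooleanSupplier rmIsStandby) throws InterruptedException {
    int maxWaitingAttempts = 2000;
    while (maxWaitingAttempts > 0) {
      if (rmIsStandby.getAsBoolean()) {
        return true; // transitioned to standby in time
      }
      Thread.sleep(1);
      maxWaitingAttempts--; // decrement in the body, so 0 means the attempts ran out
    }
    return false; // attempts exhausted: the caller's assertion should fail the test
  }
}
{code}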
[jira] [Updated] (YARN-6368) Decommissioning an NM results in a -1 exit code
[ https://issues.apache.org/jira/browse/YARN-6368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Szegedi updated YARN-6368: - Attachment: YARN-6368.000.patch Attaching patch > Decommissioning an NM results in a -1 exit code > --- > > Key: YARN-6368 > URL: https://issues.apache.org/jira/browse/YARN-6368 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > Attachments: YARN-6368.000.patch > > > In NodeManager.java we should exit normally in case the RM shuts down the > node: > {code} > } finally { > if (shouldExitOnShutdownEvent > && !ShutdownHookManager.get().isShutdownInProgress()) { > ExitUtil.terminate(-1); > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-6368) Decommissioning an NM results in a -1 exit code
Miklos Szegedi created YARN-6368: Summary: Decommissioning an NM results in a -1 exit code Key: YARN-6368 URL: https://issues.apache.org/jira/browse/YARN-6368 Project: Hadoop YARN Issue Type: Bug Reporter: Miklos Szegedi Assignee: Miklos Szegedi Priority: Minor In NodeManager.java we should exit normally in case the RM shuts down the node: {code} } finally { if (shouldExitOnShutdownEvent && !ShutdownHookManager.get().isShutdownInProgress()) { ExitUtil.terminate(-1); } } {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6309) Fair scheduler docs should have the queue and queuePlacementPolicy elements listed in bold so that they're easier to see
[ https://issues.apache.org/jira/browse/YARN-6309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933609#comment-15933609 ] Hudson commented on YARN-6309: -- FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #11429 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/11429/]) YARN-6309. Fair scheduler docs should have the queue and (templedf: rev 49efd5d204524f49a8b91ece84c4131b2d49cf00) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/FairScheduler.md > Fair scheduler docs should have the queue and queuePlacementPolicy elements > listed in bold so that they're easier to see > > > Key: YARN-6309 > URL: https://issues.apache.org/jira/browse/YARN-6309 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 3.0.0-alpha2 >Reporter: Daniel Templeton >Assignee: esmaeil mirzaee >Priority: Minor > Labels: docs, newbie > Attachments: YARN_6309.001.patch, YARN-6309.patch > > > Under {{Allocation file format : Queue elements}}, all of the element names > should be bold, e.g. {{minResources}}, {{maxResources}}, etc. Same for > {{Allocation file format : A queuePlacementPolicy element}} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6357) Implement TimelineCollector#putEntitiesAsync
[ https://issues.apache.org/jira/browse/YARN-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933594#comment-15933594 ] Hadoop QA commented on YARN-6357: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 45s{color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 22m 37s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Issue | YARN-6357 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12859632/YARN-6357.02.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 23176df02cfb 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 6c399a8 | | Default Java | 1.8.0_121 | | findbugs | v3.0.0 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/15338/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/15338/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Implement TimelineCollector#putEntitiesAsync > > > Key: YARN-6357 > URL: https://issues.apache.org/jira/browse/YARN-6357 > Project: Hadoop YARN > Issue Type: Sub-task > Components: ATSv2, timelineserver >Affects Versions: YARN-2928 >Reporter: Joep Rottinghuis >Assignee: Haibo Chen > Labels: yarn-5355-merge-blocker > Attachments: YARN-6357.01.patch,
[jira] [Created] (YARN-6367) YARN logs CLI needs alway check containerLogsInfo/containerLogInfo before parse the JSON object from NMWebService
Xuan Gong created YARN-6367: --- Summary: YARN logs CLI needs alway check containerLogsInfo/containerLogInfo before parse the JSON object from NMWebService Key: YARN-6367 URL: https://issues.apache.org/jira/browse/YARN-6367 Project: Hadoop YARN Issue Type: Sub-task Reporter: Siddharth Seth Assignee: Xuan Gong -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6357) Implement TimelineCollector#putEntitiesAsync
[ https://issues.apache.org/jira/browse/YARN-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933545#comment-15933545 ] Haibo Chen commented on YARN-6357: -- One thing I noticed is that TimelineWriter.write() is effectively an async method, i.e. writeAsync(), for the moment. If we need to ensure entities are written to the backend, we need to call TimelineWriter.flush() after TimelineWriter.write(). In addition, both TimelineWriter implementations return new TimelineWriteResponse() in all situations, so the response carries no value. I wonder if TimelineWriter.write() should be renamed to writeAsync() and a new writeSync() added to make it cleaner. > Implement TimelineCollector#putEntitiesAsync > > > Key: YARN-6357 > URL: https://issues.apache.org/jira/browse/YARN-6357 > Project: Hadoop YARN > Issue Type: Sub-task > Components: ATSv2, timelineserver >Affects Versions: YARN-2928 >Reporter: Joep Rottinghuis >Assignee: Haibo Chen > Labels: yarn-5355-merge-blocker > Attachments: YARN-6357.01.patch, YARN-6357.02.patch > > > As discovered and discussed in YARN-5269 the > TimelineCollector#putEntitiesAsync method is currently not implemented and > TimelineCollector#putEntities is asynchronous. > TimelineV2ClientImpl#putEntities vs TimelineV2ClientImpl#putEntitiesAsync > correctly call TimelineEntityDispatcher#dispatchEntities(boolean sync,... > with the correct argument. This argument does seem to make it into the > params, and on the server side TimelineCollectorWebService#putEntities > correctly pulls the async parameter from the rest call. See line 156: > {code} > boolean isAsync = async != null && async.trim().equalsIgnoreCase("true"); > {code} > However, this is where the problem starts. It simply calls > TimelineCollector#putEntities and ignores the value of isAsync. It should > instead have called TimelineCollector#putEntitiesAsync, which is currently > not implemented. > putEntities should call putEntitiesAsync and then after that call > writer.flush() > The fact that we flush on close and we flush periodically should be more of a > concern of avoiding data loss; close in case sync is never called and the > periodic flush to guard against having data from slow writers get buffered > for a long time and expose us to risk of loss in case the collector crashes > with data in its buffers. Size-based flush is a different concern to avoid > blowing up memory footprint. > The spooling behavior is also somewhat separate. > We have two separate methods on our API putEntities and putEntitiesAsync and > they should have different behavior beyond waiting for the request to be sent. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
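To illustrate the write-then-flush point, here is a hedged sketch with hypothetical interfaces (not the actual ATSv2 TimelineWriter or TimelineCollector types): the synchronous put is simply the asynchronous put followed by a flush.
{code}
// Hypothetical sketch; not the actual ATSv2 classes.
interface WriterSketch {
  void write(Object entities); // buffered; data may not have reached the backend yet
  void flush();                // forces buffered entities out to the backend
}

final class CollectorSketch {
  private final WriterSketch writer;

  CollectorSketch(WriterSketch writer) {
    this.writer = writer;
  }

  void putEntitiesAsync(Object entities) {
    writer.write(entities);
  }

  void putEntities(Object entities) {
    putEntitiesAsync(entities); // same write path as the async call
    writer.flush();             // only now are the entities guaranteed persisted
  }
}
{code}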
[jira] [Commented] (YARN-6357) Implement TimelineCollector#putEntitiesAsync
[ https://issues.apache.org/jira/browse/YARN-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933517#comment-15933517 ] Haibo Chen commented on YARN-6357: -- Thanks [~varun_saxena] for the review! I updated the patch accordingly with your comments. > Implement TimelineCollector#putEntitiesAsync > > > Key: YARN-6357 > URL: https://issues.apache.org/jira/browse/YARN-6357 > Project: Hadoop YARN > Issue Type: Sub-task > Components: ATSv2, timelineserver >Affects Versions: YARN-2928 >Reporter: Joep Rottinghuis >Assignee: Haibo Chen > Labels: yarn-5355-merge-blocker > Attachments: YARN-6357.01.patch, YARN-6357.02.patch > > > As discovered and discussed in YARN-5269 the > TimelineCollector#putEntitiesAsync method is currently not implemented and > TimelineCollector#putEntities is asynchronous. > TimelineV2ClientImpl#putEntities vs TimelineV2ClientImpl#putEntitiesAsync > correctly call TimelineEntityDispatcher#dispatchEntities(boolean sync,... > with the correct argument. This argument does seem to make it into the > params, and on the server side TimelineCollectorWebService#putEntities > correctly pulls the async parameter from the rest call. See line 156: > {code} > boolean isAsync = async != null && async.trim().equalsIgnoreCase("true"); > {code} > However, this is where the problem starts. It simply calls > TimelineCollector#putEntities and ignores the value of isAsync. It should > instead have called TimelineCollector#putEntitiesAsync, which is currently > not implemented. > putEntities should call putEntitiesAsync and then after that call > writer.flush() > The fact that we flush on close and we flush periodically should be more of a > concern of avoiding data loss; close in case sync is never called and the > periodic flush to guard against having data from slow writers get buffered > for a long time and expose us to risk of loss in case the collector crashes > with data in its buffers. Size-based flush is a different concern to avoid > blowing up memory footprint. > The spooling behavior is also somewhat separate. > We have two separate methods on our API putEntities and putEntitiesAsync and > they should have different behavior beyond waiting for the request to be sent. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6357) Implement TimelineCollector#putEntitiesAsync
[ https://issues.apache.org/jira/browse/YARN-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-6357: - Attachment: YARN-6357.02.patch > Implement TimelineCollector#putEntitiesAsync > > > Key: YARN-6357 > URL: https://issues.apache.org/jira/browse/YARN-6357 > Project: Hadoop YARN > Issue Type: Sub-task > Components: ATSv2, timelineserver >Affects Versions: YARN-2928 >Reporter: Joep Rottinghuis >Assignee: Haibo Chen > Labels: yarn-5355-merge-blocker > Attachments: YARN-6357.01.patch, YARN-6357.02.patch > > > As discovered and discussed in YARN-5269 the > TimelineCollector#putEntitiesAsync method is currently not implemented and > TimelineCollector#putEntities is asynchronous. > TimelineV2ClientImpl#putEntities vs TimelineV2ClientImpl#putEntitiesAsync > correctly call TimelineEntityDispatcher#dispatchEntities(boolean sync,... > with the correct argument. This argument does seem to make it into the > params, and on the server side TimelineCollectorWebService#putEntities > correctly pulls the async parameter from the rest call. See line 156: > {code} > boolean isAsync = async != null && async.trim().equalsIgnoreCase("true"); > {code} > However, this is where the problem starts. It simply calls > TimelineCollector#putEntities and ignores the value of isAsync. It should > instead have called TimelineCollector#putEntitiesAsync, which is currently > not implemented. > putEntities should call putEntitiesAsync and then after that call > writer.flush() > The fact that we flush on close and we flush periodically should be more of a > concern of avoiding data loss; close in case sync is never called and the > periodic flush to guard against having data from slow writers get buffered > for a long time and expose us to risk of loss in case the collector crashes > with data in its buffers. Size-based flush is a different concern to avoid > blowing up memory footprint. > The spooling behavior is also somewhat separate. > We have two separate methods on our API putEntities and putEntitiesAsync and > they should have different behavior beyond waiting for the request to be sent. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6285) Add option to set max limit on ResourceManager for ApplicationClientProtocol.getApplications
[ https://issues.apache.org/jira/browse/YARN-6285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933508#comment-15933508 ] yunjiong zhao commented on YARN-6285: - [~wangda], I'd appreciate it if you have time to double-check LogAggregationReportPBImpl.getLogAggregationStatus() and take a look at YARN-6339. > Add option to set max limit on ResourceManager for > ApplicationClientProtocol.getApplications > > > Key: YARN-6285 > URL: https://issues.apache.org/jira/browse/YARN-6285 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: yunjiong zhao >Assignee: yunjiong zhao > Attachments: YARN-6285.001.patch, YARN-6285.002.patch, > YARN-6285.003.patch > > > When users called ApplicationClientProtocol.getApplications, it will return > lots of data, and generate lots of garbage on ResourceManager which caused > long time GC. > For example, on one of our RM, when called rest API " http:// address:port>/ws/v1/cluster/apps" it can return 150MB data which have 944 > applications. > getApplications have limit parameter, but some user might not set it, and > then the limit will be Long.MAX_VALUE. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-2962) ZKRMStateStore: Limit the number of znodes under a znode
[ https://issues.apache.org/jira/browse/YARN-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933461#comment-15933461 ] Daniel Templeton commented on YARN-2962: Thanks for updating the patch, [~varun_saxena]. Looks like you addressed most of my comments. I took a look again at the ones you didn't address, and I'm fine with leaving them unaddressed. One minor quibble in the new patch: "deduct" is misspelled as "dedcut". Otherwise, LGTM. > ZKRMStateStore: Limit the number of znodes under a znode > > > Key: YARN-2962 > URL: https://issues.apache.org/jira/browse/YARN-2962 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Karthik Kambatla >Assignee: Varun Saxena >Priority: Critical > Attachments: YARN-2962.006.patch, YARN-2962.007.patch, > YARN-2962.008.patch, YARN-2962.008.patch, YARN-2962.009.patch, > YARN-2962.01.patch, YARN-2962.04.patch, YARN-2962.05.patch, > YARN-2962.2.patch, YARN-2962.3.patch > > > We ran into this issue where we were hitting the default ZK server message > size configs, primarily because the message had too many znodes even though > they individually they were all small. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-6366) Refactor the NodeManager DeletionService to support additional DeletionTask types.
[ https://issues.apache.org/jira/browse/YARN-6366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shane Kumpf reassigned YARN-6366: - Assignee: Shane Kumpf > Refactor the NodeManager DeletionService to support additional DeletionTask > types. > -- > > Key: YARN-6366 > URL: https://issues.apache.org/jira/browse/YARN-6366 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager, yarn >Reporter: Shane Kumpf >Assignee: Shane Kumpf > > The NodeManager's DeletionService only supports file based DeletionTask's. > This makes sense as files (and directories) have been the primary concern for > clean up to date. With the addition of the Docker container runtime, addition > types of DeletionTask are likely to be required, such as deletion of docker > container and images. See YARN-5366 and YARN-5670. This issue is to refactor > the DeletionService to support additional DeletionTask's. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6366) Refactor the NodeManager DeletionService to support additional DeletionTask types.
[ https://issues.apache.org/jira/browse/YARN-6366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933293#comment-15933293 ] Shane Kumpf commented on YARN-6366: --- I've been looking into this a bit as part of YARN-5366. I'll take this one. > Refactor the NodeManager DeletionService to support additional DeletionTask > types. > -- > > Key: YARN-6366 > URL: https://issues.apache.org/jira/browse/YARN-6366 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager, yarn >Reporter: Shane Kumpf > > The NodeManager's DeletionService only supports file based DeletionTask's. > This makes sense as files (and directories) have been the primary concern for > clean up to date. With the addition of the Docker container runtime, addition > types of DeletionTask are likely to be required, such as deletion of docker > container and images. See YARN-5366 and YARN-5670. This issue is to refactor > the DeletionService to support additional DeletionTask's. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-6366) Refactor the NodeManager DeletionService to support additional DeletionTask types.
Shane Kumpf created YARN-6366: - Summary: Refactor the NodeManager DeletionService to support additional DeletionTask types. Key: YARN-6366 URL: https://issues.apache.org/jira/browse/YARN-6366 Project: Hadoop YARN Issue Type: Bug Components: nodemanager, yarn Reporter: Shane Kumpf The NodeManager's DeletionService only supports file-based DeletionTasks. This makes sense, as files (and directories) have been the primary cleanup concern to date. With the addition of the Docker container runtime, additional types of DeletionTask are likely to be required, such as deletion of Docker containers and images. See YARN-5366 and YARN-5670. This issue is to refactor the DeletionService to support additional DeletionTask types. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
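As a rough illustration of the refactoring direction described above, here is a hedged sketch with hypothetical names (not the actual NodeManager classes): a generic deletion task type the DeletionService could schedule, with file- and Docker-specific variants.
{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of the direction; not the actual NodeManager DeletionService code.
interface DeletionTaskSketch extends Runnable {
}

final class FileDeletionTaskSketch implements DeletionTaskSketch {
  private final Path path;

  FileDeletionTaskSketch(Path path) {
    this.path = path;
  }

  @Override
  public void run() {
    try {
      Files.deleteIfExists(path); // a real task would recurse into directories and retry
    } catch (IOException e) {
      // a real service would log and reschedule the task
    }
  }
}

final class DockerContainerDeletionTaskSketch implements DeletionTaskSketch {
  private final String containerId;

  DockerContainerDeletionTaskSketch(String containerId) {
    this.containerId = containerId;
  }

  @Override
  public void run() {
    // a real implementation would ask the container runtime to remove containerId
    System.out.println("would remove docker container " + containerId);
  }
}
{code}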
[jira] [Commented] (YARN-5924) Resource Manager fails to load state with InvalidProtocolBufferException
[ https://issues.apache.org/jira/browse/YARN-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933265#comment-15933265 ] Hadoop QA commented on YARN-5924: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 39m 20s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 60m 3s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Issue | YARN-5924 | | GITHUB PR | https://github.com/apache/hadoop/pull/164 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 5ca05ecadfc1 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 34a931c | | Default Java | 1.8.0_121 | | findbugs | v3.0.0 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/15337/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/15337/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/15337/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Resource Manager fails to load state with InvalidProtocolBufferException > > > Key: YARN-5924 > URL: https://issues.apache.org/jira/browse/YARN-5924 >
[jira] [Commented] (YARN-6353) Clean up OrderingPolicy javadoc
[ https://issues.apache.org/jira/browse/YARN-6353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933257#comment-15933257 ] Hudson commented on YARN-6353: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11427 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/11427/]) YARN-6353. Clean up OrderingPolicy javadoc (Daniel Templeton via Varun (varunsaxena: rev 35034653d02ac8156338d7267e5975d2d66272d5) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/policy/OrderingPolicy.java > Clean up OrderingPolicy javadoc > --- > > Key: YARN-6353 > URL: https://issues.apache.org/jira/browse/YARN-6353 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.8.0 >Reporter: Daniel Templeton >Assignee: Daniel Templeton >Priority: Minor > Labels: javadoc > Fix For: 2.9.0, 3.0.0-alpha3 > > Attachments: YARN-6353.001.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6242) [Umbrella] Miscellaneous Scheduler Performance Improvements
[ https://issues.apache.org/jira/browse/YARN-6242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933237#comment-15933237 ] Yufei Gu commented on YARN-6242: I added two FairScheduler performance improvement JIRAs here. But I just realized there is already a FairScheduler performance improvement umbrella, YARN-5479. We may combine the two later. > [Umbrella] Miscellaneous Scheduler Performance Improvements > --- > > Key: YARN-6242 > URL: https://issues.apache.org/jira/browse/YARN-6242 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan > > There are some performance issues in the scheduler. YARN-3091 is mainly targeted > at solving the scheduler's locking issues; let's use this JIRA to track > non-locking issues. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-5288) Resource Localization fails due to leftover files
[ https://issues.apache.org/jira/browse/YARN-5288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu reassigned YARN-5288: -- Assignee: (was: Yufei Gu) > Resource Localization fails due to leftover files > - > > Key: YARN-5288 > URL: https://issues.apache.org/jira/browse/YARN-5288 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.9.0 >Reporter: Yufei Gu > > NM restart didn't clean up all user cache. The leftover files can cause > resource localization failure. > {code} > 2016-06-14 23:09:12,717 WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: > > java.io.IOException: Rename cannot overwrite non empty destination directory > /data/5/yarn/nm/usercache/xxx/filecache/4567 > at > org.apache.hadoop.fs.AbstractFileSystem.renameInternal(AbstractFileSystem.java:716) > at org.apache.hadoop.fs.FilterFs.renameInternal(FilterFs.java:236) > at > org.apache.hadoop.fs.AbstractFileSystem.rename(AbstractFileSystem.java:659) > at org.apache.hadoop.fs.FileContext.rename(FileContext.java:912) > at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:364) > at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6361) FairScheduler: FSLeafQueue.fetchAppsWithDemand CPU usage is high with big queues
[ https://issues.apache.org/jira/browse/YARN-6361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu updated YARN-6361: --- Issue Type: Sub-task (was: Bug) Parent: YARN-6242 > FairScheduler: FSLeafQueue.fetchAppsWithDemand CPU usage is high with big > queues > > > Key: YARN-6361 > URL: https://issues.apache.org/jira/browse/YARN-6361 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Miklos Szegedi >Assignee: Yufei Gu >Priority: Minor > Attachments: dispatcherthread.png, threads.png > > > FSLeafQueue.fetchAppsWithDemand sorts the applications by the current policy. > Most of the time is spent in FairShareComparator.compare. We could improve > this by doing the calculations outside the sort {{(O\(n\))}} and sorting by the > precomputed, fixed values inside the {{O(n*log\(n\))}} sort instead. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
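For reference, the optimisation described above boils down to computing the comparator's expensive inputs once per application (O(n)) and then sorting on the cached values, so each comparison inside the O(n log n) sort stays cheap. The sketch below uses stand-in types; it is not FairScheduler code.
{code}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class PrecomputedSortKeyExample {

  // Stand-in for an application attempt; the real code would use FSAppAttempt.
  static class App {
    final long demand;
    final long usage;
    App(long demand, long usage) { this.demand = demand; this.usage = usage; }
  }

  // Hypothetical stand-in for the expensive work done in FairShareComparator.compare.
  static double sortKey(App app) {
    return (double) app.usage / Math.max(1, app.demand);
  }

  public static void main(String[] args) {
    List<App> apps = new ArrayList<>();
    apps.add(new App(100, 40));
    apps.add(new App(50, 45));

    Map<App, Double> cached = new IdentityHashMap<>();
    for (App app : apps) {              // O(n): compute each key exactly once
      cached.put(app, sortKey(app));
    }
    // O(n log n) sort with a cheap comparator that only reads the cached key.
    apps.sort(Comparator.comparingDouble(cached::get));
  }
}
{code}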
[jira] [Updated] (YARN-4090) Make Collections.sort() more efficient in FSParentQueue.java
[ https://issues.apache.org/jira/browse/YARN-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu updated YARN-4090: --- Issue Type: Sub-task (was: Improvement) Parent: YARN-6242 > Make Collections.sort() more efficient in FSParentQueue.java > > > Key: YARN-4090 > URL: https://issues.apache.org/jira/browse/YARN-4090 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Xianyin Xin >Assignee: zhangshilong > Attachments: sampling1.jpg, sampling2.jpg, YARN-4090.001.patch, > YARN-4090.002.patch, YARN-4090.003.patch, YARN-4090.004.patch, > YARN-4090.005.patch, YARN-4090.006.patch, YARN-4090-preview.patch, > YARN-4090-TestResult.pdf > > > Collections.sort() consumes too much time in a scheduling round. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6362) Investigate correct version of frontend-maven-plugin for yarn-ui
[ https://issues.apache.org/jira/browse/YARN-6362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933194#comment-15933194 ] Wangda Tan commented on YARN-6362: -- +1, thanks [~lewuathe]/[~sunilg]! > Investigate correct version of frontend-maven-plugin for yarn-ui > > > Key: YARN-6362 > URL: https://issues.apache.org/jira/browse/YARN-6362 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Kai Sasaki >Assignee: Kai Sasaki > Attachments: YARN-6362.01.patch, YARN-6362.02.patch > > > Building yarn-ui module fails due to invalid npm-cli.js path. > {code} > $ mvn clean install -DskipTests -Dtar -Pdist -Pyarn-ui > {code} > Failure of {{exec-maven-plugin}} in yarn-ui profile. > {code} > [INFO] --- exec-maven-plugin:1.3.1:exec (ember build) @ hadoop-yarn-ui --- > module.js:327 > throw err; > ^ > Error: Cannot find module > '/Users/sasakikai/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/target/src/main/webapp/node/npm/bin/npm-cli' > at Function.Module._resolveFilename (module.js:325:15) > at Function.Module._load (module.js:276:25) > at Function.Module.runMain (module.js:441:10) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6242) [Umbrella] Miscellaneous Scheduler Performance Improvements
[ https://issues.apache.org/jira/browse/YARN-6242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933164#comment-15933164 ] Wangda Tan commented on YARN-6242: -- [~miklos.szeg...@cloudera.com], sure, go ahead :) > [Umbrella] Miscellaneous Scheduler Performance Improvements > --- > > Key: YARN-6242 > URL: https://issues.apache.org/jira/browse/YARN-6242 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan > > There are some performance issues in the scheduler. YARN-3091 is mainly targeted > at solving the scheduler's locking issues; let's use this JIRA to track > non-locking issues. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6357) Implement TimelineCollector#putEntitiesAsync
[ https://issues.apache.org/jira/browse/YARN-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933141#comment-15933141 ] Varun Saxena commented on YARN-6357: Thanks [~haibochen] for the patch. # TODO comment in TimelineCollector#putEntitiesAsync should be removed. # Not related but in TimelineCollector#putEntities we can probably remove the log {{LOG.debug("SUCCESS - TIMELINE V2 PROTOTYPE");}} # Duplicate code in putEntities and putEntitiesAsync, can probably be moved to a private method and we can then call this code from respective methods? # Checkstyle and javadoc issues seem fixable. > Implement TimelineCollector#putEntitiesAsync > > > Key: YARN-6357 > URL: https://issues.apache.org/jira/browse/YARN-6357 > Project: Hadoop YARN > Issue Type: Sub-task > Components: ATSv2, timelineserver >Affects Versions: YARN-2928 >Reporter: Joep Rottinghuis >Assignee: Haibo Chen > Labels: yarn-5355-merge-blocker > Attachments: YARN-6357.01.patch > > > As discovered and discussed in YARN-5269 the > TimelineCollector#putEntitiesAsync method is currently not implemented and > TimelineCollector#putEntities is asynchronous. > TimelineV2ClientImpl#putEntities vs TimelineV2ClientImpl#putEntitiesAsync > correctly call TimelineEntityDispatcher#dispatchEntities(boolean sync,... > with the correct argument. This argument does seem to make it into the > params, and on the server side TimelineCollectorWebService#putEntities > correctly pulls the async parameter from the rest call. See line 156: > {code} > boolean isAsync = async != null && async.trim().equalsIgnoreCase("true"); > {code} > However, this is where the problem starts. It simply calls > TimelineCollector#putEntities and ignores the value of isAsync. It should > instead have called TimelineCollector#putEntitiesAsync, which is currently > not implemented. > putEntities should call putEntitiesAsync and then after that call > writer.flush() > The fact that we flush on close and we flush periodically should be more of a > concern of avoiding data loss; close in case sync is never called and the > periodic flush to guard against having data from slow writers get buffered > for a long time and expose us to risk of loss in case the collector crashes > with data in its buffers. Size-based flush is a different concern to avoid > blowing up memory footprint. > The spooling behavior is also somewhat separate. > We have two separate methods on our API putEntities and putEntitiesAsync and > they should have different behavior beyond waiting for the request to be sent. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
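Point 3 above amounts to a shape like the following, where both public methods delegate to one private helper and only the synchronous variant flushes. The names and the simplified writer interface are stand-ins for illustration, not the actual TimelineCollector code.
{code}
// Illustrative sketch of the suggested refactoring; not the actual patch.
class CollectorSketch {

  interface Writer {
    void write(String entities);   // stand-in for the timeline writer
    void flush();
  }

  private final Writer writer;

  CollectorSketch(Writer writer) {
    this.writer = writer;
  }

  // Synchronous variant: write, then flush so the caller knows the data went out.
  public void putEntities(String entities) {
    writeEntities(entities);
    writer.flush();
  }

  // Asynchronous variant: write only; periodic and close-time flushes cover the rest.
  public void putEntitiesAsync(String entities) {
    writeEntities(entities);
  }

  // Shared code that would otherwise be duplicated in both public methods.
  private void writeEntities(String entities) {
    writer.write(entities);
  }
}
{code}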
[jira] [Commented] (YARN-6166) Unnecessary INFO logs in AMRMClientAsyncImpl$CallbackHandlerThread.run
[ https://issues.apache.org/jira/browse/YARN-6166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933108#comment-15933108 ] Daniel Templeton commented on YARN-6166: [~Naganarasimha], given that the arg is a fixed string, does it really make that much difference? I would assume logging a message at a disabled level should be as cheap as checking the log level. I'm basing my assumption off the JDK's logging, where I know that was an explicit goal. I haven't looked at the code for the logging we use. Looking online, though, I see that Apache commons logging only recommends the guard clause to avoid expensive parameter evaluation:{quote}Code guards are typically used to guard code that only needs to execute in support of logging, that otherwise introduces undesirable runtime overhead in the general case (logging disabled). Examples are multiple parameters, or expressions (e.g. string + " more") for parameters. Use the guard methods of the form {{log.is() }}to verify that logging should be performed, before incurring the overhead of the logging method call. Yes, the logging methods will perform the same check, but only after resolving parameters.{quote} I also see this link on Stack Overflow about Log4j: http://stackoverflow.com/questions/963492/in-log4j-does-checking-isdebugenabled-before-logging-improve-performance that says in the case of this JIRA it's better to leave out the guard clause. > Unnecessary INFO logs in AMRMClientAsyncImpl$CallbackHandlerThread.run > -- > > Key: YARN-6166 > URL: https://issues.apache.org/jira/browse/YARN-6166 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.7.3 >Reporter: Grant W >Assignee: Grant W >Priority: Trivial > Labels: patch > Fix For: 2.9.0, 3.0.0-alpha3 > > Attachments: YARN-6166.patch > > > Logs like the following should be debug or else every legitimate stop causes > unnecessary exception traces in the logs. > {noformat} > 2013-08-03 20:01:34,460 INFO [AMRM Callback Handler Thread] > org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl: > Interrupted while waiting for queue > java.lang.InterruptedException >at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer. > java:1961) >at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1996) >at > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) >at > org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:275) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
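To make the distinction concrete, the two cases discussed above look like this, assuming a commons-logging style {{Log}}; {{buildExpensiveDiagnostics()}} is a made-up stand-in for expensive parameter evaluation and is not part of the patch.
{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class LogGuardExample {
  private static final Log LOG = LogFactory.getLog(LogGuardExample.class);

  // Hypothetical helper standing in for "expensive parameter evaluation".
  static String buildExpensiveDiagnostics() {
    return "lots of string building and toString() calls";
  }

  public static void main(String[] args) {
    // Fixed string: debug() checks the level internally, so an explicit guard
    // buys essentially nothing here.
    LOG.debug("Interrupted while waiting for queue");

    // Expensive argument: the concatenation runs before debug() is even called,
    // so the guard genuinely saves work when DEBUG is disabled.
    if (LOG.isDebugEnabled()) {
      LOG.debug("Interrupted while waiting for queue: " + buildExpensiveDiagnostics());
    }
  }
}
{code}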
[jira] [Commented] (YARN-5924) Resource Manager fails to load state with InvalidProtocolBufferException
[ https://issues.apache.org/jira/browse/YARN-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933105#comment-15933105 ] Hadoop QA commented on YARN-5924: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 42m 3s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 67m 58s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Issue | YARN-5924 | | GITHUB PR | https://github.com/apache/hadoop/pull/164 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux d11ba1b40127 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 34a931c | | Default Java | 1.8.0_121 | | findbugs | v3.0.0 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/15334/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/15334/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/15334/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Resource Manager fails to load state with InvalidProtocolBufferException > > > Key: YARN-5924 > URL:
[jira] [Commented] (YARN-6309) Fair scheduler docs should have the queue and queuePlacementPolicy elements listed in bold so that they're easier to see
[ https://issues.apache.org/jira/browse/YARN-6309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933097#comment-15933097 ] Hadoop QA commented on YARN-6309: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 15m 14s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Issue | YARN-6309 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12859111/YARN-6309.patch | | Optional Tests | asflicense mvnsite | | uname | Linux 8233445a42c6 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 34a931c | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/15335/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Fair scheduler docs should have the queue and queuePlacementPolicy elements > listed in bold so that they're easier to see > > > Key: YARN-6309 > URL: https://issues.apache.org/jira/browse/YARN-6309 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 3.0.0-alpha2 >Reporter: Daniel Templeton >Assignee: esmaeil mirzaee >Priority: Minor > Labels: docs, newbie > Attachments: YARN_6309.001.patch, YARN-6309.patch > > > Under {{Allocation file format : Queue elements}}, all of the element names > should be bold, e.g. {{minResources}}, {{maxResources}}, etc. Same for > {{Allocation file format : A queuePlacementPolicy element}} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6309) Fair scheduler docs should have the queue and queuePlacementPolicy elements listed in bold so that they're easier to see
[ https://issues.apache.org/jira/browse/YARN-6309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933067#comment-15933067 ] Daniel Templeton commented on YARN-6309: Kicking off the pre-commit build again to see if the MVN issue goes away. > Fair scheduler docs should have the queue and queuePlacementPolicy elements > listed in bold so that they're easier to see > > > Key: YARN-6309 > URL: https://issues.apache.org/jira/browse/YARN-6309 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 3.0.0-alpha2 >Reporter: Daniel Templeton >Assignee: esmaeil mirzaee >Priority: Minor > Labels: docs, newbie > Attachments: YARN_6309.001.patch, YARN-6309.patch > > > Under {{Allocation file format : Queue elements}}, all of the element names > should be bold, e.g. {{minResources}}, {{maxResources}}, etc. Same for > {{Allocation file format : A queuePlacementPolicy element}} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6353) Clean up OrderingPolicy javadoc
[ https://issues.apache.org/jira/browse/YARN-6353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933045#comment-15933045 ] Daniel Templeton commented on YARN-6353: Thanks, [~varun_saxena]! > Clean up OrderingPolicy javadoc > --- > > Key: YARN-6353 > URL: https://issues.apache.org/jira/browse/YARN-6353 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.8.0 >Reporter: Daniel Templeton >Assignee: Daniel Templeton >Priority: Minor > Labels: javadoc > Attachments: YARN-6353.001.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-6365) slsrun.sh creating random html directories
Allen Wittenauer created YARN-6365: -- Summary: slsrun.sh creating random html directories Key: YARN-6365 URL: https://issues.apache.org/jira/browse/YARN-6365 Project: Hadoop YARN Issue Type: Bug Components: scheduler-load-simulator Affects Versions: 3.0.0-alpha3 Reporter: Allen Wittenauer Priority: Blocker YARN-6275 causes slsrun.sh to randomly create or overwrite html directories wherever it is run. {code} # copy 'html' directory to current directory to make sure web server can access cp -r "${bin}/../html" "$(pwd)" {code} Instead, the Java code should be changed to take a system property that slsrun can populate at run time. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
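On the Java side, the suggested change could look something like the sketch below; the property name {{sls.web.root}} is invented for illustration and is not an existing SLS option.
{code}
import java.io.File;

public class SlsWebRootExample {
  public static void main(String[] args) {
    // slsrun.sh could pass e.g. -Dsls.web.root="${bin}/../html" instead of
    // copying the html directory into the current working directory.
    String webRoot = System.getProperty(
        "sls.web.root", new File("html").getAbsolutePath());
    System.out.println("Serving SLS web assets from: " + webRoot);
  }
}
{code}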
[jira] [Commented] (YARN-6344) Rethinking OFF_SWITCH locality in CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-6344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15932783#comment-15932783 ] Nathan Roberts commented on YARN-6344: -- +1 on improving localityWaitFactor. It definitely won't behave well for applications that ask for resources in small batches. > Rethinking OFF_SWITCH locality in CapacityScheduler > --- > > Key: YARN-6344 > URL: https://issues.apache.org/jira/browse/YARN-6344 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Reporter: Konstantinos Karanasos > > When relaxing locality from node to rack, the {{node-locality-parameter}} is > used: when scheduling opportunities for a scheduler key are more than the > value of this parameter, we relax locality and try to assign the container to > a node in the corresponding rack. > On the other hand, when relaxing locality to off-switch (i.e., assign the > container anywhere in the cluster), we are using a {{localityWaitFactor}}, > which is computed based on the number of outstanding requests for a specific > scheduler key, which is divided by the size of the cluster. > In case of applications that request containers in big batches (e.g., > traditional MR jobs), and for relatively small clusters, the > localityWaitFactor does not affect relaxing locality much. > However, in case of applications that request containers in small batches, > this load factor takes a very small value, which leads to assigning > off-switch containers too soon. This situation is even more pronounced in big > clusters. > For example, if an application requests only one container per request, the > locality will be relaxed after a single missed scheduling opportunity. > The purpose of this JIRA is to rethink the way we are relaxing locality for > off-switch assignments. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
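For readers following along, a simplified sketch of the behaviour described in the issue (not the scheduler's exact code) shows why single-container requests relax to OFF_SWITCH almost immediately while large batches wait much longer.
{code}
public class LocalityWaitFactorExample {

  // Simplified model: localityWaitFactor ~ outstanding requests / cluster size,
  // capped at 1.0; OFF_SWITCH is allowed once missed opportunities reach
  // clusterNodes * localityWaitFactor.
  static boolean canRelaxToOffSwitch(int outstandingRequests, int clusterNodes,
      long missedOpportunities) {
    float localityWaitFactor =
        Math.min((float) outstandingRequests / clusterNodes, 1.0f);
    return missedOpportunities >= clusterNodes * localityWaitFactor;
  }

  public static void main(String[] args) {
    // Big batch on a small cluster: many missed opportunities are tolerated.
    System.out.println(canRelaxToOffSwitch(1000, 100, 50));  // false
    // Single-container request on a big cluster: relaxes after one miss.
    System.out.println(canRelaxToOffSwitch(1, 1000, 1));     // true
  }
}
{code}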
[jira] [Commented] (YARN-6255) Refactor yarn-native-services framework
[ https://issues.apache.org/jira/browse/YARN-6255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15932784#comment-15932784 ] Hadoop QA commented on YARN-6255: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 53s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 32s{color} | {color:green} yarn-native-services passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} yarn-native-services passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 43s{color} | {color:green} yarn-native-services passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 43s{color} | {color:green} yarn-native-services passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 30s{color} | {color:green} yarn-native-services passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 40s{color} | {color:green} yarn-native-services passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} yarn-native-services passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 33s{color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications generated 9 new + 20 unchanged - 14 fixed = 29 total (was 34) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 35s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications: The patch generated 172 new + 1557 unchanged - 604 fixed = 1729 total (was 2161) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 1s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-slider/hadoop-yarn-slider-core generated 3 new + 0 unchanged - 0 fixed = 3 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 39s{color} | {color:green} hadoop-yarn-slider-core in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 18s{color} | {color:green} hadoop-yarn-services-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 31m 39s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-slider/hadoop-yarn-slider-core | | | Unread public/protected field:At ActionUpgradeArgs.java:[line 66] | | | Unread public/protected field:At ActionUpgradeArgs.java:[line 62] | | | Dead store to roleStatus in org.apache.slider.server.appmaster.state.AppState.innerOnNodeManagerContainerStarted(ContainerId) At
[jira] [Commented] (YARN-6255) Refactor yarn-native-services framework
[ https://issues.apache.org/jira/browse/YARN-6255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15932781#comment-15932781 ] Hadoop QA commented on YARN-6255: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 21s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 55s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 28s{color} | {color:green} yarn-native-services passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s{color} | {color:green} yarn-native-services passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 47s{color} | {color:green} yarn-native-services passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s{color} | {color:green} yarn-native-services passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 32s{color} | {color:green} yarn-native-services passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 23s{color} | {color:green} yarn-native-services passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s{color} | {color:green} yarn-native-services passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 8s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 34s{color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications generated 9 new + 20 unchanged - 14 fixed = 29 total (was 34) {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 42s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications: The patch generated 171 new + 1556 unchanged - 603 fixed = 1727 total (was 2159) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 4s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-slider/hadoop-yarn-slider-core generated 3 new + 0 unchanged - 0 fixed = 3 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 38s{color} | {color:green} hadoop-yarn-slider-core in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 15s{color} | {color:green} hadoop-yarn-services-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 32m 58s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-slider/hadoop-yarn-slider-core | | | Unread public/protected field:At ActionUpgradeArgs.java:[line 66] | | | Unread public/protected field:At ActionUpgradeArgs.java:[line 62] | | | Dead store to roleStatus in org.apache.slider.server.appmaster.state.AppState.innerOnNodeManagerContainerStarted(ContainerId) At
[jira] [Updated] (YARN-6255) Refactor yarn-native-services framework
[ https://issues.apache.org/jira/browse/YARN-6255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-6255: -- Attachment: YARN-6255.yarn-native-services.04.patch > Refactor yarn-native-services framework > > > Key: YARN-6255 > URL: https://issues.apache.org/jira/browse/YARN-6255 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jian He >Assignee: Jian He > Attachments: YARN-6255.yarn-native-services.01.patch, > YARN-6255.yarn-native-services.02.patch, > YARN-6255.yarn-native-services.03.patch, > YARN-6255.yarn-native-services.04.patch > > > YARN-4692 provides a good abstraction of services on YARN. We could use this > as a building block in yarn-native-services framework code base as well. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4970) Difficult to trace "Connection Refused" in AM Proxying
[ https://issues.apache.org/jira/browse/YARN-4970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15932529#comment-15932529 ] tuoyu commented on YARN-4970: - Below are the properties in my yarn-site.xml; I met the same problem. When the job's status is "UNASSIGNED" I can access the Spark UI correctly through the driver host, but once the job is scheduled by Yarn, both $driver_host:$port and Yarn's ApplicationMaster proxy return 500 errors.
{code}
<property><name>yarn.resourcemanager.ha.rm-ids</name><value>rm1,rm2</value></property>
<property><name>yarn.resourcemanager.hostname.rm1</name><value>h107710041.cluster.ds.weibo.com</value></property>
<property><name>yarn.resourcemanager.hostname.rm2</name><value>h107710042.cluster.ds.weibo.com</value></property>
<property><name>yarn.resourcemanager.webapp.address</name><value>h107710041.cluster.ds.weibo.com:8088</value></property>
<property><name>yarn.resourcemanager.webapp.address.rm1</name><value>h107710041.cluster.ds.weibo.com:8088</value></property>
<property><name>yarn.resourcemanager.webapp.address.rm2</name><value>h107710042.cluster.ds.weibo.com:8088</value></property>
{code}
I am not sure how this problem can be worked around. Any help would be appreciated, thanks. > Difficult to trace "Connection Refused" in AM Proxying > -- > > Key: YARN-4970 > URL: https://issues.apache.org/jira/browse/YARN-4970 > Project: Hadoop YARN > Issue Type: Improvement > Components: webapp >Affects Versions: 2.6.0 > Environment: Hadoop-2.6.0-CDH5.5.1, linux >Reporter: Matthew Byng-Maddick >Priority: Minor > Labels: applicationmaster, proxy, webapp, yarn > > In generating an HA YARN config (effectively using my own tools for > generation), I missed out the multiple specification of > {{yarn.resourcemanager.webapp.address}}, which produced a "Connection > Refused" similar to that in YARN-800 (because of a misunderstanding about > what the proxy address was) from the AM webapp. > It occurs to me, though, in tracing the code, that the behaviour of > {{hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/RMHAUtils.java}}: > {{getRMHAWebappAddresses(YarnConfiguration)}} should probably mirror the > effect of the startup of the resourcemanager itself, such that if the > id-suffixed key doesn't exist, you end up using the > {{yarn.resourcemanager.address}} with the port number replaced by the one > taken from {{yarn.resourcemanager.webapp.address}} (or its https equivalent). > That is, if you had a config like:
> {code}
> <configuration>
>   <property><name>yarn.resourcemanager.ha.enabled</name><value>true</value></property>
>   <property><name>yarn.resourcemanager.ha.rm-ids</name><value>hosta,hostb</value></property>
>   <property><name>yarn.resourcemanager.webapp.address</name><value>0.0.0.0:8088</value></property>
>   <property><name>yarn.resourcemanager.address.hosta</name><value>hosta:8032</value></property>
>   <property><name>yarn.resourcemanager.address.hostb</name><value>hostb:8032</value></property>
> </configuration>
> {code}
> you would end up with ("{{hosta:8088}}", "{{hostb:8088}}") as the > {{List}} result of the function above, rather than the current empty > list result. > This would certainly give a better principle of least-surprise for operators > of yarn (especially those like myself who ended up configuring it wrongly). > Thoughts? Reasons why this isn't a good idea? (I'm afraid my Java's a bit > read-only which is why I haven't suggested a patch) -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
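A rough sketch of the fallback proposed in the description (not the current RMHAUtils behaviour): if the id-suffixed webapp address is missing, reuse the host from the id-suffixed RPC address with the port from the generic webapp address. The helper name and the plain string map standing in for YarnConfiguration are illustrative assumptions.
{code}
import java.util.HashMap;
import java.util.Map;

public class WebappAddressFallbackSketch {

  static String resolveWebappAddress(Map<String, String> conf, String rmId) {
    // Prefer the id-suffixed webapp address when it is configured.
    String perId = conf.get("yarn.resourcemanager.webapp.address." + rmId);
    if (perId != null) {
      return perId;
    }
    // Otherwise take the host from the id-suffixed RPC address and the port
    // from the generic webapp address.
    String rpc = conf.get("yarn.resourcemanager.address." + rmId);          // host:port
    String genericWebapp = conf.get("yarn.resourcemanager.webapp.address"); // host:port
    if (rpc == null || genericWebapp == null) {
      return null;
    }
    String host = rpc.substring(0, rpc.lastIndexOf(':'));
    String webPort = genericWebapp.substring(genericWebapp.lastIndexOf(':') + 1);
    return host + ":" + webPort;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put("yarn.resourcemanager.webapp.address", "0.0.0.0:8088");
    conf.put("yarn.resourcemanager.address.hosta", "hosta:8032");
    conf.put("yarn.resourcemanager.address.hostb", "hostb:8032");
    System.out.println(resolveWebappAddress(conf, "hosta")); // hosta:8088
    System.out.println(resolveWebappAddress(conf, "hostb")); // hostb:8088
  }
}
{code}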
[jira] [Commented] (YARN-6353) Clean up OrderingPolicy javadoc
[ https://issues.apache.org/jira/browse/YARN-6353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15932492#comment-15932492 ] Varun Saxena commented on YARN-6353: +1. Will commit it later today > Clean up OrderingPolicy javadoc > --- > > Key: YARN-6353 > URL: https://issues.apache.org/jira/browse/YARN-6353 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.8.0 >Reporter: Daniel Templeton >Assignee: Daniel Templeton >Priority: Minor > Labels: javadoc > Attachments: YARN-6353.001.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-2113) Add cross-user preemption within CapacityScheduler's leaf-queue
[ https://issues.apache.org/jira/browse/YARN-2113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15932475#comment-15932475 ] Sunil G commented on YARN-2113: --- Nice catch. Thanks [~eepayne]. There is a potential over-preemption; I will upload a fix now. > Add cross-user preemption within CapacityScheduler's leaf-queue > --- > > Key: YARN-2113 > URL: https://issues.apache.org/jira/browse/YARN-2113 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Reporter: Vinod Kumar Vavilapalli >Assignee: Sunil G > Attachments: YARN-2113.v0.patch > > > Preemption today only works across queues and moves around resources across > queues per demand and usage. We should also have user-level preemption within > a queue, to balance capacity across users in a predictable manner. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6166) Unnecessary INFO logs in AMRMClientAsyncImpl$CallbackHandlerThread.run
[ https://issues.apache.org/jira/browse/YARN-6166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15932422#comment-15932422 ] Naganarasimha G R commented on YARN-6166: - Thanks for the contribution [~genericuser], just a small nit always ensure that *LOG.isDebugEnabled()* is checked before invoking *LOG.debug* > Unnecessary INFO logs in AMRMClientAsyncImpl$CallbackHandlerThread.run > -- > > Key: YARN-6166 > URL: https://issues.apache.org/jira/browse/YARN-6166 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.7.3 >Reporter: Grant W >Assignee: Grant W >Priority: Trivial > Labels: patch > Fix For: 2.9.0, 3.0.0-alpha3 > > Attachments: YARN-6166.patch > > > Logs like the following should be debug or else every legitimate stop causes > unnecessary exception traces in the logs. > {noformat} > 2013-08-03 20:01:34,460 INFO [AMRM Callback Handler Thread] > org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl: > Interrupted while waiting for queue > java.lang.InterruptedException >at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer. > java:1961) >at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1996) >at > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) >at > org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:275) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6364) How to set the resource queue when start spark job running on yarn
[ https://issues.apache.org/jira/browse/YARN-6364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15932404#comment-15932404 ] Jeff Zhang commented on YARN-6364: -- Set spark.yarn.queue in the Zeppelin interpreter settings. > How to set the resource queue when start spark job running on yarn > --- > > Key: YARN-6364 > URL: https://issues.apache.org/jira/browse/YARN-6364 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: sydt > > As we all know, YARN takes charge of resource management for Hadoop. When > Zeppelin starts a Spark job in yarn-client mode, how do we set the designated > resource queue on YARN so that Spark applications belonging to different users > run in their respective YARN resource queues? -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Resolved] (YARN-6364) How to set the resource queue when start spark job running on yarn
[ https://issues.apache.org/jira/browse/YARN-6364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang resolved YARN-6364. -- Resolution: Invalid > How to set the resource queue when start spark job running on yarn > --- > > Key: YARN-6364 > URL: https://issues.apache.org/jira/browse/YARN-6364 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: sydt > > As we all know, YARN takes charge of resource management for Hadoop. When > Zeppelin starts a Spark job in yarn-client mode, how do we set the designated > resource queue on YARN so that Spark applications belonging to different users > run in their respective YARN resource queues? -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5153) [YARN-3368] Add a toggle to switch timeline view / table view for containers information inside application-attempt page
[ https://issues.apache.org/jira/browse/YARN-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15932381#comment-15932381 ] Hadoop QA commented on YARN-5153: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 29s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 1m 3s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Issue | YARN-5153 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12859524/YARN-5153.002.patch | | Optional Tests | asflicense | | uname | Linux 399f1e74346e 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 34a931c | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/15331/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > [YARN-3368] Add a toggle to switch timeline view / table view for containers > information inside application-attempt page > > > Key: YARN-5153 > URL: https://issues.apache.org/jira/browse/YARN-5153 > Project: Hadoop YARN > Issue Type: Sub-task > Components: webapp >Reporter: Wangda Tan >Assignee: Akhil PB > Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, > screenshot-4.png, screenshot-5.png, YARN-5153.001.patch, YARN-5153.002.patch, > YARN-5153.preliminary.1.patch, YARN-5153-YARN-3368.1.patch > > > Now we only support timeline view for containers on app-attempt page, it will > be also very useful to show table of containers in some cases. For example, > user can short containers based on priority, etc. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4518) [YARN-3368] Support rendering statistic-by-node-label for queues/apps page
[ https://issues.apache.org/jira/browse/YARN-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15932379#comment-15932379 ] Hadoop QA commented on YARN-4518: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch 1 line(s) with tabs. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 0m 51s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Issue | YARN-4518 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12859523/YARN-4518.0005.patch | | Optional Tests | asflicense | | uname | Linux 9b18f2ff693f 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 34a931c | | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/15330/artifact/patchprocess/whitespace-tabs.txt | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/15330/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > [YARN-3368] Support rendering statistic-by-node-label for queues/apps page > -- > > Key: YARN-4518 > URL: https://issues.apache.org/jira/browse/YARN-4518 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Akhil PB > Attachments: YARN-4518.0001.patch, YARN-4518.0002.patch, > YARN-4518.0003.patch, YARN-4518.0004.patch, YARN-4518.0005.patch, > YARN-4518-YARN-3368.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5153) [YARN-3368] Add a toggle to switch timeline view / table view for containers information inside application-attempt page
[ https://issues.apache.org/jira/browse/YARN-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akhil PB updated YARN-5153: --- Attachment: YARN-5153.002.patch v2 patch > [YARN-3368] Add a toggle to switch timeline view / table view for containers > information inside application-attempt page > > > Key: YARN-5153 > URL: https://issues.apache.org/jira/browse/YARN-5153 > Project: Hadoop YARN > Issue Type: Sub-task > Components: webapp >Reporter: Wangda Tan >Assignee: Akhil PB > Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, > screenshot-4.png, screenshot-5.png, YARN-5153.001.patch, YARN-5153.002.patch, > YARN-5153.preliminary.1.patch, YARN-5153-YARN-3368.1.patch > > > Now we only support the timeline view for containers on the app-attempt page; it will > also be very useful to show a table of containers in some cases. For example, the > user can sort containers based on priority, etc. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-4518) [YARN-3368] Support rendering statistic-by-node-label for queues/apps page
[ https://issues.apache.org/jira/browse/YARN-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akhil PB updated YARN-4518: --- Attachment: YARN-4518.0005.patch v5 patch > [YARN-3368] Support rendering statistic-by-node-label for queues/apps page > -- > > Key: YARN-4518 > URL: https://issues.apache.org/jira/browse/YARN-4518 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Akhil PB > Attachments: YARN-4518.0001.patch, YARN-4518.0002.patch, > YARN-4518.0003.patch, YARN-4518.0004.patch, YARN-4518.0005.patch, > YARN-4518-YARN-3368.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6364) How to set the resource queue when start spark job running on yarn
[ https://issues.apache.org/jira/browse/YARN-6364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sydt updated YARN-6364: --- Description: As we all know, YARN takes charge of resource management for Hadoop. When Zeppelin starts a Spark job in yarn-client mode, how do we set the designated resource queue on YARN so that Spark applications belonging to different users run in their respective YARN resource queues? (was: As we all know, YARN takes charge of resource management for Hadoop. When Zeppelin starts a Spark job in yarn-client mode, how do we set the designated resource queue on YARN?) > How to set the resource queue when start spark job running on yarn > --- > > Key: YARN-6364 > URL: https://issues.apache.org/jira/browse/YARN-6364 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: sydt > > As we all know, YARN takes charge of resource management for Hadoop. When > Zeppelin starts a Spark job in yarn-client mode, how do we set the designated > resource queue on YARN so that Spark applications belonging to different users > run in their respective YARN resource queues? -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-6364) How to set the resource queue when starting a spark job running on yarn
sydt created YARN-6364: -- Summary: How to set the resource queue when starting a spark job running on yarn Key: YARN-6364 URL: https://issues.apache.org/jira/browse/YARN-6364 Project: Hadoop YARN Issue Type: Improvement Reporter: sydt As we all know, yarn takes charge of resource management for hadoop. When zeppelin starts a spark job in yarn-client mode, how do we set the designated resource queue on yarn? -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6255) Refactor yarn-native-services framework
[ https://issues.apache.org/jira/browse/YARN-6255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15932279#comment-15932279 ] Hadoop QA commented on YARN-6255: -
| (x) *{color:red}-1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 53s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 31s{color} | {color:green} yarn-native-services passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s{color} | {color:green} yarn-native-services passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 47s{color} | {color:green} yarn-native-services passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s{color} | {color:green} yarn-native-services passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 30s{color} | {color:green} yarn-native-services passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 16s{color} | {color:green} yarn-native-services passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} yarn-native-services passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 31s{color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications generated 9 new + 20 unchanged - 14 fixed = 29 total (was 34) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 39s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications: The patch generated 172 new + 1568 unchanged - 594 fixed = 1740 total (was 2162) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 55s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-slider/hadoop-yarn-slider-core generated 8 new + 0 unchanged - 0 fixed = 8 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 36s{color} | {color:green} hadoop-yarn-slider-core in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 14s{color} | {color:green} hadoop-yarn-services-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 30m 59s{color} | {color:black} {color} |
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-slider/hadoop-yarn-slider-core |
| | Redundant nullcheck of template, which is known to be non-null in org.apache.slider.client.SliderClient.actionUpgrade(String, ActionUpgradeArgs) Redundant null check at SliderClient.java:is known to be non-null in org.apache.slider.client.SliderClient.actionUpgrade(String, ActionUpgradeArgs) Redundant
[jira] [Commented] (YARN-6255) Refactor yarn-native-services framework
[ https://issues.apache.org/jira/browse/YARN-6255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15932260#comment-15932260 ] Hadoop QA commented on YARN-6255: -
| (x) *{color:red}-1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 24s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 44s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 11s{color} | {color:green} yarn-native-services passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s{color} | {color:green} yarn-native-services passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 48s{color} | {color:green} yarn-native-services passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s{color} | {color:green} yarn-native-services passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 36s{color} | {color:green} yarn-native-services passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 30s{color} | {color:green} yarn-native-services passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green} yarn-native-services passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 36s{color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications generated 9 new + 20 unchanged - 14 fixed = 29 total (was 34) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 39s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications: The patch generated 171 new + 1564 unchanged - 593 fixed = 1735 total (was 2157) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 18s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-slider/hadoop-yarn-slider-core generated 8 new + 0 unchanged - 0 fixed = 8 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 47s{color} | {color:green} hadoop-yarn-slider-core in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 21s{color} | {color:green} hadoop-yarn-services-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 30m 41s{color} | {color:black} {color} |
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-slider/hadoop-yarn-slider-core |
| | Redundant nullcheck of template, which is known to be non-null in org.apache.slider.client.SliderClient.actionUpgrade(String, ActionUpgradeArgs) Redundant null check at SliderClient.java:is known to be non-null in org.apache.slider.client.SliderClient.actionUpgrade(String, ActionUpgradeArgs) Redundant
[jira] [Updated] (YARN-6255) Refactor yarn-native-services framework
[ https://issues.apache.org/jira/browse/YARN-6255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-6255: -- Attachment: YARN-6255.yarn-native-services.03.patch > Refactor yarn-native-services framework > > > Key: YARN-6255 > URL: https://issues.apache.org/jira/browse/YARN-6255 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jian He >Assignee: Jian He > Attachments: YARN-6255.yarn-native-services.01.patch, > YARN-6255.yarn-native-services.02.patch, > YARN-6255.yarn-native-services.03.patch > > > YARN-4692 provides a good abstraction of services on YARN. We could use this > as a building block in yarn-native-services framework code base as well. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org