[jira] [Commented] (YARN-8644) Improve unit test for RMAppImpl.FinalTransition
[ https://issues.apache.org/jira/browse/YARN-8644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639259#comment-16639259 ] Hadoop QA commented on YARN-8644:
-

(x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 22s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 1s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 16m 41s | trunk passed |
| +1 | compile | 0m 40s | trunk passed |
| +1 | checkstyle | 0m 32s | trunk passed |
| +1 | mvnsite | 0m 42s | trunk passed |
| +1 | shadedclient | 10m 42s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 5s | trunk passed |
| +1 | javadoc | 0m 25s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 43s | the patch passed |
| +1 | compile | 0m 39s | the patch passed |
| +1 | javac | 0m 39s | the patch passed |
| -0 | checkstyle | 0m 28s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 3 new + 132 unchanged - 4 fixed = 135 total (was 136) |
| +1 | mvnsite | 0m 43s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | shadedclient | 10m 35s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 13s | the patch passed |
| +1 | javadoc | 0m 25s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 72m 23s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | asflicense | 0m 21s | The patch does not generate ASF License warnings. |
| | | 118m 25s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMEmbeddedElector |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8644 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12942460/YARN-8644.010.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 63a71b2081b1 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 619e490 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/22066/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| whitespace |
[jira] [Commented] (YARN-8788) mvn package -Pyarn-ui fails on JDK9
[ https://issues.apache.org/jira/browse/YARN-8788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639234#comment-16639234 ] Akira Ajisaka commented on YARN-8788:
-

Attached the patch in the pull request to trigger the precommit Jenkins job.

> mvn package -Pyarn-ui fails on JDK9
> ---
>
> Key: YARN-8788
> URL: https://issues.apache.org/jira/browse/YARN-8788
> Project: Hadoop YARN
> Issue Type: Bug
> Environment: Java 9.0.4, CentOS 7.5
> Reporter: Akira Ajisaka
> Assignee: Vidura Bhathiya Mudalige
> Priority: Major
> Labels: newbie
> Attachments: 421.patch
>
> {{mvn package -Pdist,native,yarn-ui -Dtar -DskipTests}} failed on trunk.
> {noformat}
> [ERROR] Failed to execute goal ro.isdc.wro4j:wro4j-maven-plugin:1.7.9:run (default) on project hadoop-yarn-ui: Execution default of goal ro.isdc.wro4j:wro4j-maven-plugin:1.7.9:run failed: An API incompatibility was encountered while executing ro.isdc.wro4j:wro4j-maven-plugin:1.7.9:run: java.lang.ExceptionInInitializerError: null
> [ERROR] -
> [ERROR] realm = plugin>ro.isdc.wro4j:wro4j-maven-plugin:1.7.9
> [ERROR] strategy = org.codehaus.plexus.classworlds.strategy.SelfFirstStrategy
> [ERROR] urls[0] = file:/home/aajisaka/.m2/repository/ro/isdc/wro4j/wro4j-maven-plugin/1.7.9/wro4j-maven-plugin-1.7.9.jar
> [ERROR] urls[1] = file:/home/aajisaka/.m2/repository/ro/isdc/wro4j/wro4j-core/1.7.9/wro4j-core-1.7.9.jar
> [ERROR] urls[2] = file:/home/aajisaka/.m2/repository/org/apache/commons/commons-lang3/3.4/commons-lang3-3.4.jar
> (snip)
> {noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8788) mvn package -Pyarn-ui fails on JDK9
[ https://issues.apache.org/jira/browse/YARN-8788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka updated YARN-8788:
Attachment: 421.patch

> mvn package -Pyarn-ui fails on JDK9
> ---
>
> Key: YARN-8788
> URL: https://issues.apache.org/jira/browse/YARN-8788
> Project: Hadoop YARN
> Issue Type: Bug
> Environment: Java 9.0.4, CentOS 7.5
> Reporter: Akira Ajisaka
> Assignee: Vidura Bhathiya Mudalige
> Priority: Major
> Labels: newbie
> Attachments: 421.patch
>
> {{mvn package -Pdist,native,yarn-ui -Dtar -DskipTests}} failed on trunk.
> {noformat}
> [ERROR] Failed to execute goal ro.isdc.wro4j:wro4j-maven-plugin:1.7.9:run (default) on project hadoop-yarn-ui: Execution default of goal ro.isdc.wro4j:wro4j-maven-plugin:1.7.9:run failed: An API incompatibility was encountered while executing ro.isdc.wro4j:wro4j-maven-plugin:1.7.9:run: java.lang.ExceptionInInitializerError: null
> [ERROR] -
> [ERROR] realm = plugin>ro.isdc.wro4j:wro4j-maven-plugin:1.7.9
> [ERROR] strategy = org.codehaus.plexus.classworlds.strategy.SelfFirstStrategy
> [ERROR] urls[0] = file:/home/aajisaka/.m2/repository/ro/isdc/wro4j/wro4j-maven-plugin/1.7.9/wro4j-maven-plugin-1.7.9.jar
> [ERROR] urls[1] = file:/home/aajisaka/.m2/repository/ro/isdc/wro4j/wro4j-core/1.7.9/wro4j-core-1.7.9.jar
> [ERROR] urls[2] = file:/home/aajisaka/.m2/repository/org/apache/commons/commons-lang3/3.4/commons-lang3-3.4.jar
> (snip)
> {noformat}
[jira] [Assigned] (YARN-8849) DynoYARN: A simulation and testing infrastructure for YARN clusters
[ https://issues.apache.org/jira/browse/YARN-8849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Keqiu Hu reassigned YARN-8849:
--
Assignee: Keqiu Hu

> DynoYARN: A simulation and testing infrastructure for YARN clusters
> ---
>
> Key: YARN-8849
> URL: https://issues.apache.org/jira/browse/YARN-8849
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Arun Suresh
> Assignee: Keqiu Hu
> Priority: Major
>
> Traditionally, YARN workload simulation is performed using the SLS (Scheduler Load Simulator), which is packaged with YARN. It essentially starts a full-fledged *ResourceManager* but runs simulators for the *NodeManager* and the *ApplicationMaster* containers. These simulators are lightweight and run in a threadpool. The NM simulators do not open any external ports and send (in-process) heartbeats to the ResourceManager.
> There are a couple of drawbacks to using the SLS:
> * It can be difficult to simulate really large clusters without access to a very beefy box, since the NMs are launched as tasks in a threadpool and each NM has to send periodic heartbeats to the RM.
> * Certain features (like YARN-1011) require changes to the NodeManager; aspects such as queuing and selectively killing containers have to be incorporated into the existing NM simulator, which might make the simulator heavyweight - there is a need for locking and synchronization.
> * Since the NM and AM are simulations, only the Scheduler is faithfully tested - it does not really perform an end-to-end test of a cluster.
> Therefore, drawing inspiration from [Dynamometer|https://github.com/linkedin/dynamometer], we propose a framework for a YARN-deployable YARN cluster - *DynoYARN* - for testing, with the following features:
> * The NM already has hooks to plug in a custom *ContainerExecutor* and *NodeResourceMonitor*. If we can plug in a custom *ContainersMonitorImpl* monitoring thread (and other modules like the LocalizationService), we can probably inject an Executor that does not actually launch containers, and a Node and Container resource monitor that reports synthetic, pre-specified utilization metrics back to the RM.
> * Since we are launching fake containers, we cannot run normal AM containers. We can therefore use *Unmanaged AMs* to launch synthetic jobs.
> Essentially, a test workflow would look like this:
> * Launch a DynoYARN cluster.
> * Use the Unmanaged AM feature to directly negotiate with the DynoYARN ResourceManager for container tokens.
> * Use the container tokens from the RM to directly ask the DynoYARN NodeManagers to start fake containers.
> * The DynoYARN NodeManagers will start the fake containers and report to the DynoYARN ResourceManager synthetically generated resource utilization for the containers (which will be injected via the *ContainerLaunchContext* and parsed by the plugged-in Container Executor).
> * The Scheduler will use the utilization report to schedule containers - we will be able to test allocation of *Opportunistic* containers based on resource utilization.
> * Since the DynoYARN NodeManagers run the actual code paths, all preemption and queuing logic will be faithfully executed.
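The pluggable executor and monitor described above can be sketched in plain Java. This is an illustrative model only: `NoOpContainerExecutor` and `SyntheticResourceMonitor` are hypothetical stand-ins, not the real Hadoop `ContainerExecutor` or `NodeResourceMonitor` interfaces.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model of the DynoYARN idea: an executor that "launches"
// containers without starting real processes, and a monitor that reports
// pre-specified synthetic utilization instead of sampling the host.
public class DynoSketch {

    /** Pretends to launch a container: records it instead of exec-ing. */
    static class NoOpContainerExecutor {
        private final Map<String, Boolean> running = new HashMap<>();

        int launchContainer(String containerId) {
            running.put(containerId, true); // no real process is started
            return 0;                       // synthetic "success" exit code
        }

        boolean isRunning(String containerId) {
            return running.getOrDefault(containerId, false);
        }
    }

    /** Reports fixed, pre-specified utilization back to the RM. */
    static class SyntheticResourceMonitor {
        private final double cpuFraction;
        private final long memMb;

        SyntheticResourceMonitor(double cpuFraction, long memMb) {
            this.cpuFraction = cpuFraction;
            this.memMb = memMb;
        }

        /** What the fake NM would heartbeat back to the RM. */
        String utilizationReport() {
            return "cpu=" + cpuFraction + ",memMB=" + memMb;
        }
    }
}
```

Because only the executor and monitor are faked, the surrounding NM code paths (queuing, preemption) would still run unmodified, which is the point of the proposal.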
[jira] [Updated] (YARN-8849) DynoYARN: A simulation and testing infrastructure for YARN clusters
[ https://issues.apache.org/jira/browse/YARN-8849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-8849:
--
Description:
Traditionally, YARN workload simulation is performed using the SLS (Scheduler Load Simulator), which is packaged with YARN. It essentially starts a full-fledged *ResourceManager* but runs simulators for the *NodeManager* and the *ApplicationMaster* containers. These simulators are lightweight and run in a threadpool. The NM simulators do not open any external ports and send (in-process) heartbeats to the ResourceManager.
There are a couple of drawbacks to using the SLS:
* It can be difficult to simulate really large clusters without access to a very beefy box, since the NMs are launched as tasks in a threadpool and each NM has to send periodic heartbeats to the RM.
* Certain features (like YARN-1011) require changes to the NodeManager; aspects such as queuing and selectively killing containers have to be incorporated into the existing NM simulator, which might make the simulator heavyweight - there is a need for locking and synchronization.
* Since the NM and AM are simulations, only the Scheduler is faithfully tested - it does not really perform an end-to-end test of a cluster.
Therefore, drawing inspiration from [Dynamometer|https://github.com/linkedin/dynamometer], we propose a framework for a YARN-deployable YARN cluster - *DynoYARN* - for testing, with the following features:
* The NM already has hooks to plug in a custom *ContainerExecutor* and *NodeResourceMonitor*. If we can plug in a custom *ContainersMonitorImpl* monitoring thread (and other modules like the LocalizationService), we can probably inject an Executor that does not actually launch containers, and a Node and Container resource monitor that reports synthetic, pre-specified utilization metrics back to the RM.
* Since we are launching fake containers, we cannot run normal AM containers. We can therefore use *Unmanaged AMs* to launch synthetic jobs.
Essentially, a test workflow would look like this:
* Launch a DynoYARN cluster.
* Use the Unmanaged AM feature to directly negotiate with the DynoYARN ResourceManager for container tokens.
* Use the container tokens from the RM to directly ask the DynoYARN NodeManagers to start fake containers.
* The DynoYARN NodeManagers will start the fake containers and report to the DynoYARN ResourceManager synthetically generated resource utilization for the containers (which will be injected via the *ContainerLaunchContext* and parsed by the plugged-in Container Executor).
* The Scheduler will use the utilization report to schedule containers - we will be able to test allocation of *Opportunistic* containers based on resource utilization.
* Since the DynoYARN NodeManagers run the actual code paths, all preemption and queuing logic will be faithfully executed.

was:
Traditionally, YARN workload simulation is performed using the SLS (Scheduler Load Simulator), which is packaged with YARN. It essentially starts a full-fledged *ResourceManager* but runs simulators for the *NodeManager* and the *ApplicationMaster* containers. These simulators are lightweight and run in a threadpool. The NM simulators do not open any external ports and send (in-process) heartbeats to the ResourceManager.
There are a couple of drawbacks to using the SLS:
* It can be difficult to simulate really large clusters without access to a very beefy box, since the NMs are launched as tasks in a threadpool and each NM has to send periodic heartbeats to the RM.
* Certain features (like YARN-1011) require changes to the NodeManager; aspects such as queuing and selectively killing containers have to be incorporated into the existing NM simulator, which might make the simulator heavyweight - there is a need for locking and synchronization.
* Since the NM and AM are simulations, only the Scheduler is faithfully tested - it does not really perform an end-to-end test of a cluster.
Therefore, drawing inspiration from [Dynamometer|https://github.com/linkedin/dynamometer], we propose a framework for a YARN-deployable YARN cluster - *DynoYARN* - for testing, with the following features:
* The NM already has hooks to plug in a custom *ContainerExecutor* and *NodeResourceMonitor*. If we can plug in a custom *ContainersMonitorImpl* monitoring thread (and other modules like the LocalizationService), we can probably inject an Executor that does not actually launch containers, and a Node and Container resource monitor that reports synthetic, pre-specified utilization metrics back to the RM.
* Since we are launching fake containers, we cannot run normal AM containers. We can therefore use *Unmanaged AMs* to launch synthetic jobs.
Essentially, a test workflow would look like this:
* Launch a DynoYARN cluster.
* Use the Unmanaged AM feature to directly negotiate
[jira] [Created] (YARN-8850) Make certain aspects of the NM pluggable to support a DynoYARN cluster
Arun Suresh created YARN-8850: - Summary: Make certain aspects of the NM pluggable to support a DynoYARN cluster Key: YARN-8850 URL: https://issues.apache.org/jira/browse/YARN-8850 Project: Hadoop YARN Issue Type: Sub-task Reporter: Arun Suresh
[jira] [Created] (YARN-8849) DynoYARN: A simulation and testing infrastructure for YARN clusters
Arun Suresh created YARN-8849:
-
Summary: DynoYARN: A simulation and testing infrastructure for YARN clusters
Key: YARN-8849
URL: https://issues.apache.org/jira/browse/YARN-8849
Project: Hadoop YARN
Issue Type: New Feature
Reporter: Arun Suresh

Traditionally, YARN workload simulation is performed using the SLS (Scheduler Load Simulator), which is packaged with YARN. It essentially starts a full-fledged *ResourceManager* but runs simulators for the *NodeManager* and the *ApplicationMaster* containers. These simulators are lightweight and run in a threadpool. The NM simulators do not open any external ports and send (in-process) heartbeats to the ResourceManager.
There are a couple of drawbacks to using the SLS:
* It can be difficult to simulate really large clusters without access to a very beefy box, since the NMs are launched as tasks in a threadpool and each NM has to send periodic heartbeats to the RM.
* Certain features (like YARN-1011) require changes to the NodeManager; aspects such as queuing and selectively killing containers have to be incorporated into the existing NM simulator, which might make the simulator heavyweight - there is a need for locking and synchronization.
* Since the NM and AM are simulations, only the Scheduler is faithfully tested - it does not really perform an end-to-end test of a cluster.
Therefore, drawing inspiration from [Dynamometer|https://github.com/linkedin/dynamometer], we propose a framework for a YARN-deployable YARN cluster - *DynoYARN* - for testing, with the following features:
* The NM already has hooks to plug in a custom *ContainerExecutor* and *NodeResourceMonitor*. If we can plug in a custom *ContainersMonitorImpl* monitoring thread (and other modules like the LocalizationService), we can probably inject an Executor that does not actually launch containers, and a Node and Container resource monitor that reports synthetic, pre-specified utilization metrics back to the RM.
* Since we are launching fake containers, we cannot run normal AM containers. We can therefore use *Unmanaged AMs* to launch synthetic jobs.
Essentially, a test workflow would look like this:
* Launch a DynoYARN cluster.
* Use the Unmanaged AM feature to directly negotiate with the DynoYARN ResourceManager for container tokens.
* Use the container tokens from the RM to directly ask the DynoYARN NodeManagers to start fake containers.
* The DynoYARN NodeManagers will start the fake containers and report to the DynoYARN ResourceManager synthetically generated resource utilization for the containers (which will be injected via the *ContainerLaunchContext* and parsed by the plugged-in Container Executor).
* The Scheduler will use the utilization report to schedule containers - we will be able to test allocation of {{Opportunistic}} containers based on resource utilization.
* Since the DynoYARN NodeManagers run the actual code paths, all preemption and queuing logic will be faithfully executed.
[jira] [Commented] (YARN-8842) Update QueueMetrics with custom resource values
[ https://issues.apache.org/jira/browse/YARN-8842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639218#comment-16639218 ] Szilard Nemeth commented on YARN-8842: -- About your second comment: I think it does make sense to move that method with YARN-8059. > Update QueueMetrics with custom resource values > > > Key: YARN-8842 > URL: https://issues.apache.org/jira/browse/YARN-8842 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: YARN-8842.001.patch, YARN-8842.002.patch > > > This is the 2nd dependent jira of YARN-8059. > As updating the metrics is an independent step from handling preemption, this > jira only deals with the queue metrics update of custom resources. > The following metrics should be updated: > * allocated resources > * available resources > * pending resources > * reserved resources > * aggregate seconds preempted
[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application
[ https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639178#comment-16639178 ] Eric Yang commented on YARN-8569:
-

[~suma.shivaprasad] Thanks for the review. appinfo.json is probably a better choice than AMInfo.json because the API can be used by code other than the Application Master, even though that might be rare. The "info" in appinfo.json adds no information beyond the file name, though. How about app.json for short, to keep the naming simple? Good catch on the file close; I will update it in the next patch.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Eric Yang
> Assignee: Eric Yang
> Priority: Major
> Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch
>
> Some programs require container hostnames to be known for the application to run. For example, distributed TensorFlow requires a launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
> --ps_hosts=ps0.example.com:,ps1.example.com: \
> --worker_hosts=worker0.example.com:,worker1.example.com: \
> --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
> --ps_hosts=ps0.example.com:,ps1.example.com: \
> --worker_hosts=worker0.example.com:,worker1.example.com: \
> --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
> --ps_hosts=ps0.example.com:,ps1.example.com: \
> --worker_hosts=worker0.example.com:,worker1.example.com: \
> --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
> --ps_hosts=ps0.example.com:,ps1.example.com: \
> --worker_hosts=worker0.example.com:,worker1.example.com: \
> --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell or the YARN services launch_command. In addition, the dynamic parameters do not work with the YARN flex command. This is the classic pain point for application developers attempting to automate system environment settings as parameters to the end-user application.
> It would be great if the YARN Docker integration could provide a simple option to expose the hostnames of the YARN service via a mounted file. The file content gets updated when a flex command is performed. This allows application developers to consume system environment settings via a standard interface. It is like /proc/devices for Linux, but for Hadoop. This may involve updating a file in the distributed cache and allowing mounting of the file via container-executor.
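The kind of per-task command assembly described above can be sketched as a small helper. This is an assumption-laden illustration: the helper name and the idea that the mounted file has already been parsed into host lists are hypothetical; the actual file name and format are still under discussion in this JIRA.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical application-side helper: once YARN exposes cluster membership
// through a mounted file, an application could read the host lists from it
// and assemble the distributed-TensorFlow style flags shown in the issue
// description, instead of hard-coding hostnames per task.
public class LaunchCommandBuilder {

    /** Assembles a trainer command line from pre-parsed host lists. */
    static String trainerCommand(List<String> psHosts, List<String> workerHosts,
                                 String jobName, int taskIndex) {
        return "python trainer.py"
                + " --ps_hosts=" + String.join(",", psHosts)
                + " --worker_hosts=" + String.join(",", workerHosts)
                + " --job_name=" + jobName
                + " --task_index=" + taskIndex;
    }
}
```

With such a helper, a flex operation that rewrites the mounted file would automatically be reflected the next time the command is built, which is the standard-interface benefit argued for in the description.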
[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application
[ https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639205#comment-16639205 ] Suma Shivaprasad commented on YARN-8569:
-

[~eyang] app.json sounds good. Thanks

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Eric Yang
> Assignee: Eric Yang
> Priority: Major
> Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch
>
> Some programs require container hostnames to be known for the application to run. For example, distributed TensorFlow requires a launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
> --ps_hosts=ps0.example.com:,ps1.example.com: \
> --worker_hosts=worker0.example.com:,worker1.example.com: \
> --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
> --ps_hosts=ps0.example.com:,ps1.example.com: \
> --worker_hosts=worker0.example.com:,worker1.example.com: \
> --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
> --ps_hosts=ps0.example.com:,ps1.example.com: \
> --worker_hosts=worker0.example.com:,worker1.example.com: \
> --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
> --ps_hosts=ps0.example.com:,ps1.example.com: \
> --worker_hosts=worker0.example.com:,worker1.example.com: \
> --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell or the YARN services launch_command. In addition, the dynamic parameters do not work with the YARN flex command. This is the classic pain point for application developers attempting to automate system environment settings as parameters to the end-user application.
> It would be great if the YARN Docker integration could provide a simple option to expose the hostnames of the YARN service via a mounted file. The file content gets updated when a flex command is performed. This allows application developers to consume system environment settings via a standard interface. It is like /proc/devices for Linux, but for Hadoop. This may involve updating a file in the distributed cache and allowing mounting of the file via container-executor.
[jira] [Commented] (YARN-8842) Update QueueMetrics with custom resource values
[ https://issues.apache.org/jira/browse/YARN-8842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639171#comment-16639171 ] Szilard Nemeth commented on YARN-8842:
--

Hi [~wilfreds]! Thanks for the comments.
1. Good idea to decide on the flag. However, it's not immediately clear to me whether we only need to treat memory / cores as default resources. See https://github.com/apache/hadoop/blob/12a095a496dd59066d73a7a6c24129b5b6a9d650/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ResourceInformation.java#L56 - here we also have GPU / FPGA in the mandatory resources list. How exactly would you check in the constructor whether custom resources are configured? I can think of the following: as we receive the configuration object in the constructor, we could check how many keys there are for {{YarnConfiguration.RESOURCE_TYPES}}. If it's 2 (4 in case we treat GPU / FPGA as standard resources), then we do not create the {{QueueMetricsForCustomResources}}; otherwise we create it. Can you think of a better way to do this?
2. It's a good catch, thanks!
3. Good point, fixed.
4. Refactored {{QueueMetricsTestcase}} to handle more than 2 queues, so that I can specify not only root and leaf queues but multiple queues in a hierarchy.

> Update QueueMetrics with custom resource values
> -
>
> Key: YARN-8842
> URL: https://issues.apache.org/jira/browse/YARN-8842
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Szilard Nemeth
> Assignee: Szilard Nemeth
> Priority: Major
> Attachments: YARN-8842.001.patch, YARN-8842.002.patch
>
> This is the 2nd dependent jira of YARN-8059.
> As updating the metrics is an independent step from handling preemption, this jira only deals with the queue metrics update of custom resources.
> The following metrics should be updated:
> * allocated resources
> * available resources
> * pending resources
> * reserved resources
> * aggregate seconds preempted
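The check discussed in the comment above can be sketched as follows. This is illustrative only: the mandatory set below mirrors the ResourceInformation list referenced in the comment (memory, vcores, GPU, FPGA) but is hard-coded here; a real implementation would inspect the resource types read from {{YarnConfiguration.RESOURCE_TYPES}}, and whether GPU / FPGA count as "standard" is exactly the open question in the thread.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Sketch of "do we have any custom resource types configured?": true only
// if something beyond the assumed mandatory resource types is present.
public class CustomResourceCheck {

    // Assumed mandatory types; the real list lives in ResourceInformation.
    static final Set<String> MANDATORY = new HashSet<>(Arrays.asList(
            "memory-mb", "vcores", "yarn.io/gpu", "yarn.io/fpga"));

    /** True if configuredTypes contains anything beyond the mandatory set. */
    static boolean hasCustomResources(Set<String> configuredTypes) {
        for (String type : configuredTypes) {
            if (!MANDATORY.contains(type)) {
                return true;
            }
        }
        return false;
    }
}
```

A set-difference check like this avoids relying on the raw key count (the "2 vs 4" question above), since it stays correct regardless of how many types end up in the mandatory list.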
[jira] [Updated] (YARN-8842) Update QueueMetrics with custom resource values
[ https://issues.apache.org/jira/browse/YARN-8842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated YARN-8842: - Attachment: YARN-8842.002.patch > Update QueueMetrics with custom resource values > > > Key: YARN-8842 > URL: https://issues.apache.org/jira/browse/YARN-8842 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Attachments: YARN-8842.001.patch, YARN-8842.002.patch > > > This is the 2nd dependent jira of YARN-8059. > As updating the metrics is an independent step from handling preemption, this > jira only deals with the queue metrics update of custom resources. > The following metrics should be updated: > * allocated resources > * available resources > * pending resources > * reserved resources > * aggregate seconds preempted
[jira] [Commented] (YARN-7994) Add support for network-alias in docker run for user defined networks
[ https://issues.apache.org/jira/browse/YARN-7994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639164#comment-16639164 ] Eric Yang commented on YARN-7994: - [~suma.shivaprasad] Sorry, I digressed from the topic. Please ignore the previous comment. > Add support for network-alias in docker run for user defined networks > -- > > Key: YARN-7994 > URL: https://issues.apache.org/jira/browse/YARN-7994 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Suma Shivaprasad >Assignee: Suma Shivaprasad >Priority: Major > Labels: Docker > > Docker Embedded DNS supports DNS resolution for containers by one or more of > its configured {{--network-alias}} within a user-defined network. > DockerRunCommand should support this option for DNS resolution to work > through docker embedded DNS
[jira] [Created] (YARN-8848) Improvements to YARN over-allocation (YARN-1011)
Arun Suresh created YARN-8848: - Summary: Improvements to YARN over-allocation (YARN-1011) Key: YARN-8848 URL: https://issues.apache.org/jira/browse/YARN-8848 Project: Hadoop YARN Issue Type: Bug Reporter: Arun Suresh Consolidating work to be done in the next phase of YARN over-allocation (YARN-1011). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6723) NM overallocation based on over-time rather than snapshot utilization
[ https://issues.apache.org/jira/browse/YARN-6723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-6723: -- Parent Issue: YARN-8848 (was: YARN-1011) > NM overallocation based on over-time rather than snapshot utilization > - > > Key: YARN-6723 > URL: https://issues.apache.org/jira/browse/YARN-6723 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Affects Versions: 3.0.0-alpha3 >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > > To continue discussion on Miklos's idea in YARN-6670 of > "Usually the CPU usage fluctuates quite a bit. Do not we need a time period > for NM_OVERALLOCATION_GENERAL_THRESHOLD, etc. to avoid allocating on small > glitches, even worse preempting in those cases?" -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6690) Consolidate NM overallocation thresholds with ResourceTypes
[ https://issues.apache.org/jira/browse/YARN-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-6690: -- Parent Issue: YARN-8848 (was: YARN-1011) > Consolidate NM overallocation thresholds with ResourceTypes > > > Key: YARN-6690 > URL: https://issues.apache.org/jira/browse/YARN-6690 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 3.0.0-alpha3 >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > > YARN-3926 (ResourceTypes) introduces a new class ResourceInformation to > encapsulate all information about a given resource type (e.g. type, value, > unit). We could add the overallocation thresholds to it as well. > Another thing to look at, as suggested by Wangda in YARN-4511 is whether we > could just use ResourceThresholds to replace OverallocationInfo. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8845) hadoop.registry.rm.enabled is not used
[ https://issues.apache.org/jira/browse/YARN-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639141#comment-16639141 ] Hadoop QA commented on YARN-8845: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 22s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 52s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 18m 18s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 52s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 34s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 57s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 51s{color} | {color:red} hadoop-common in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 4s{color} | {color:green} hadoop-yarn-registry in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 23s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 40s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}117m 38s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests |
[jira] [Resolved] (YARN-8847) Add resource-types.xml to configuration when refreshing maximum allocation
[ https://issues.apache.org/jira/browse/YARN-8847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hung resolved YARN-8847. - Resolution: Invalid Actually, upon closer inspection it's handled in AdminService. Seems this was missed when porting internally. Sorry for the noise. > Add resource-types.xml to configuration when refreshing maximum allocation > -- > > Key: YARN-8847 > URL: https://issues.apache.org/jira/browse/YARN-8847 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jonathan Hung >Assignee: Jonathan Hung >Priority: Major > Attachments: YARN-8847.001.patch > > > YARN-7738 adds functionality for refreshing maximum allocation on scheduler > refresh. But it seems when resource types are configured in > resource-types.xml, the relevant configurations: {noformat} > yarn.resource-types > yarn.io/gpu > > > yarn.resource-types.yarn.io/gpu.maximum-allocation > 4 > > {noformat} > are not picked up. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8847) Add resource-types.xml to configuration when refreshing maximum allocation
[ https://issues.apache.org/jira/browse/YARN-8847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hung updated YARN-8847: Attachment: YARN-8847.001.patch > Add resource-types.xml to configuration when refreshing maximum allocation > -- > > Key: YARN-8847 > URL: https://issues.apache.org/jira/browse/YARN-8847 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jonathan Hung >Assignee: Jonathan Hung >Priority: Major > Attachments: YARN-8847.001.patch > > > YARN-7738 adds functionality for refreshing maximum allocation on scheduler > refresh. But it seems when resource types are configured in > resource-types.xml, the relevant configurations: {noformat} > yarn.resource-types > yarn.io/gpu > > > yarn.resource-types.yarn.io/gpu.maximum-allocation > 4 > > {noformat} > are not picked up. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8847) Add resource-types.xml to configuration when refreshing maximum allocation
[ https://issues.apache.org/jira/browse/YARN-8847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639115#comment-16639115 ] Jonathan Hung commented on YARN-8847: - 001 adds resource-types.xml to conf prior to fetching max allocations > Add resource-types.xml to configuration when refreshing maximum allocation > -- > > Key: YARN-8847 > URL: https://issues.apache.org/jira/browse/YARN-8847 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jonathan Hung >Assignee: Jonathan Hung >Priority: Major > Attachments: YARN-8847.001.patch > > > YARN-7738 adds functionality for refreshing maximum allocation on scheduler > refresh. But it seems when resource types are configured in > resource-types.xml, the relevant configurations: {noformat} > yarn.resource-types > yarn.io/gpu > > > yarn.resource-types.yarn.io/gpu.maximum-allocation > 4 > > {noformat} > are not picked up. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8847) Add resource-types.xml to configuration when refreshing maximum allocation
[ https://issues.apache.org/jira/browse/YARN-8847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hung updated YARN-8847: Issue Type: Sub-task (was: Improvement) Parent: YARN-7069 > Add resource-types.xml to configuration when refreshing maximum allocation > -- > > Key: YARN-8847 > URL: https://issues.apache.org/jira/browse/YARN-8847 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jonathan Hung >Assignee: Jonathan Hung >Priority: Major > > YARN-7738 adds functionality for refreshing maximum allocation on scheduler > refresh. But it seems when resource types are configured in > resource-types.xml, the relevant configurations: {noformat} > yarn.resource-types > yarn.io/gpu > > > yarn.resource-types.yarn.io/gpu.maximum-allocation > 4 > > {noformat} > are not picked up. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8847) Add resource-types.xml to configuration when refreshing maximum allocation
Jonathan Hung created YARN-8847: --- Summary: Add resource-types.xml to configuration when refreshing maximum allocation Key: YARN-8847 URL: https://issues.apache.org/jira/browse/YARN-8847 Project: Hadoop YARN Issue Type: Improvement Reporter: Jonathan Hung Assignee: Jonathan Hung YARN-7738 adds functionality for refreshing maximum allocation on scheduler refresh. But it seems when resource types are configured in resource-types.xml, the relevant configurations: {noformat} yarn.resource-types yarn.io/gpu yarn.resource-types.yarn.io/gpu.maximum-allocation 4 {noformat} are not picked up. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
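The {noformat} block quoted in YARN-8847 lost its XML markup in the archive. Based on the property names and values that survive ("yarn.resource-types" = "yarn.io/gpu", "yarn.resource-types.yarn.io/gpu.maximum-allocation" = "4") and the standard Hadoop configuration layout, the resource-types.xml in question would look roughly like this (a reconstruction, not the verbatim file from the issue):

```xml
<configuration>
  <!-- Declare the custom resource type the scheduler should track. -->
  <property>
    <name>yarn.resource-types</name>
    <value>yarn.io/gpu</value>
  </property>
  <!-- Cap any single allocation at 4 units of that resource. -->
  <property>
    <name>yarn.resource-types.yarn.io/gpu.maximum-allocation</name>
    <value>4</value>
  </property>
</configuration>
```

The bug reported here is that on a scheduler refresh these properties were not re-read, so the maximum allocation fell back to defaults.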
[jira] [Created] (YARN-8846) Allow Applications to demand Guaranteed Containers
Arun Suresh created YARN-8846: - Summary: Allow Applications to demand Guaranteed Containers Key: YARN-8846 URL: https://issues.apache.org/jira/browse/YARN-8846 Project: Hadoop YARN Issue Type: Sub-task Components: capacity scheduler Reporter: Arun Suresh The Capacity Scheduler should ensure that if the {{enforceExecutionType}} flag in the resource request is {{true}} and the requested Container is of {{GUARANTEED}} type, the Capacity scheduler should not return over-allocated containers. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-8846) Allow Applications to demand Guaranteed Containers
[ https://issues.apache.org/jira/browse/YARN-8846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh reassigned YARN-8846: - Assignee: Arun Suresh > Allow Applications to demand Guaranteed Containers > -- > > Key: YARN-8846 > URL: https://issues.apache.org/jira/browse/YARN-8846 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Arun Suresh >Assignee: Arun Suresh >Priority: Major > > The Capacity Scheduler should ensure that if the {{enforceExecutionType}} > flag in the resource request is {{true}} and the requested Container is of > {{GUARANTEED}} type, the Capacity scheduler should not return over-allocated > containers. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
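The requirement in YARN-8846 reduces to a single guard in the allocation path. A minimal sketch of that decision logic, in illustrative Python rather than the actual Capacity Scheduler code (the class and method names below are hypothetical; only the {{enforceExecutionType}} flag and the GUARANTEED execution type come from the issue):

```python
from dataclasses import dataclass

GUARANTEED = "GUARANTEED"
OPPORTUNISTIC = "OPPORTUNISTIC"

@dataclass
class ResourceRequest:
    execution_type: str
    # Corresponds to the enforceExecutionType flag in the YARN resource request.
    enforce_execution_type: bool

def may_overallocate(request: ResourceRequest) -> bool:
    """Illustrative guard: a scheduler honoring the flag must not satisfy a
    strictly GUARANTEED request with an over-allocated container."""
    if request.enforce_execution_type and request.execution_type == GUARANTEED:
        return False
    return True
```

If the flag is false, or the request is already OPPORTUNISTIC, over-allocation remains permitted.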
[jira] [Updated] (YARN-8250) Create another implementation of ContainerScheduler to support NM overallocation
[ https://issues.apache.org/jira/browse/YARN-8250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-8250: -- Issue Type: Improvement (was: Sub-task) Parent: (was: YARN-1011) > Create another implementation of ContainerScheduler to support NM > overallocation > > > Key: YARN-8250 > URL: https://issues.apache.org/jira/browse/YARN-8250 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > Attachments: YARN-8250-YARN-1011.00.patch, > YARN-8250-YARN-1011.01.patch, YARN-8250-YARN-1011.02.patch > > > YARN-6675 adds NM over-allocation support by modifying the existing > ContainerScheduler and providing a utilizationBased resource tracker. > However, the implementation adds a lot of complexity to ContainerScheduler, > and future tweak of over-allocation strategy based on how much containers > have been launched is even more complicated. > As such, this Jira proposes a new ContainerScheduler that always launch > guaranteed containers immediately and queues opportunistic containers. It > relies on a periodical check to launch opportunistic containers. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8790) Authentication Filter change to force security check
[ https://issues.apache.org/jira/browse/YARN-8790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639039#comment-16639039 ] Eric Yang commented on YARN-8790: - Using curl as sanity test with YARN-8763 patch 004, and verified the container shell websocket is protected by AuthenticationFilter: {code} curl -i --negotiate -u : -H 'Upgrade: websocket' -H 'Connection: Upgrade' -H 'Sec-WebSocket-Version: 13' -H 'Sec-WebSocket-Key: x3JJHMbDL1EzLkh9GBhXDw==' http://hadoop.example.com:8042/container/v1 HTTP/1.1 401 Authentication required Date: Thu, 04 Oct 2018 21:02:22 GMT Date: Thu, 04 Oct 2018 21:02:22 GMT Pragma: no-cache X-Content-Type-Options: nosniff X-XSS-Protection: 1; mode=block WWW-Authenticate: Negotiate Set-Cookie: hadoop.auth=; Path=/; Domain=example.com; HttpOnly Cache-Control: must-revalidate,no-cache,no-store Content-Type: text/html;charset=iso-8859-1 Content-Length: 272 HTTP/1.1 101 Switching Protocols Date: Thu, 04 Oct 2018 21:02:22 GMT Cache-Control: no-cache Expires: Thu, 04 Oct 2018 21:02:22 GMT Date: Thu, 04 Oct 2018 21:02:22 GMT Pragma: no-cache Content-Type: text/plain;charset=utf-8 X-Content-Type-Options: nosniff X-XSS-Protection: 1; mode=block WWW-Authenticate: Negotiate YGoGCSqGSIb3EgECAgIAb1swWaADAgEFoQMCAQ+iTTBLoAMCARKiRARCP+d4BKPjrGJcC8EEDX5by19u6EetMvscxmkmImFrRFZCT+EdKYbaBIaNn9/Td/fmIW6EOQeXBy6T8UMmAP2588qi Set-Cookie: hadoop.auth="u=hbase=hbase/hadoop.example@example.com=kerberos=1538722942268=DPKQ5Q58BR7LqZTkw2EyhLNpFN3MggMRJzX49SipyYE="; Path=/; Domain=example.com; HttpOnly X-Frame-Options: SAMEORIGIN Vary: Accept-Encoding Connection: Upgrade Sec-WebSocket-Accept: HSmrc0sMlYUkAGmm5OPpG2HaGWk= Upgrade: WebSocket {code} > Authentication Filter change to force security check > - > > Key: YARN-8790 > URL: https://issues.apache.org/jira/browse/YARN-8790 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zian Chen >Priority: Major > Labels: Docker > > Hadoop node manager REST API is authenticated using 
AuthenticationFilter from > Hadoop-auth project. AuthenticationFilter is added to the new WebSocket URL > path spec. The requested remote user is verified to match the container owner > to allow WebSocket connection to be established. WebSocket servlet code > enforces the username match check. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
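The 101 response in the curl transcript above can be sanity-checked independently: per RFC 6455, the server derives Sec-WebSocket-Accept by appending a fixed GUID to the client's Sec-WebSocket-Key, SHA-1 hashing, and base64-encoding. Applying that to the key sent by the curl command reproduces the accept value the NodeManager returned:

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the WebSocket opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

def websocket_accept(client_key: str) -> str:
    """Compute the Sec-WebSocket-Accept header value for a given client key."""
    digest = hashlib.sha1((client_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")

# Key from the curl command above; matches the Sec-WebSocket-Accept
# header in the 101 response of the transcript.
print(websocket_accept("x3JJHMbDL1EzLkh9GBhXDw=="))
```

This confirms the upgrade succeeded only after the Negotiate round-trip, which is the behavior the patch is meant to enforce.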
[jira] [Assigned] (YARN-8790) Authentication Filter change to force security check
[ https://issues.apache.org/jira/browse/YARN-8790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang reassigned YARN-8790: --- Assignee: Eric Yang > Authentication Filter change to force security check > - > > Key: YARN-8790 > URL: https://issues.apache.org/jira/browse/YARN-8790 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zian Chen >Assignee: Eric Yang >Priority: Major > Labels: Docker > > Hadoop node manager REST API is authenticated using AuthenticationFilter from > Hadoop-auth project. AuthenticationFilter is added to the new WebSocket URL > path spec. The requested remote user is verified to match the container owner > to allow WebSocket connection to be established. WebSocket servlet code > enforces the username match check. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7994) Add support for network-alias in docker run for user defined networks
[ https://issues.apache.org/jira/browse/YARN-7994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639034#comment-16639034 ] Suma Shivaprasad commented on YARN-7994: [~eyang] The proposal here was to add support for --network-alias option while starting containers with "docker run" to provide a DNS Alias for Docker embedded DNS resolution to work. Not sure what you meant in the above comment? > Add support for network-alias in docker run for user defined networks > -- > > Key: YARN-7994 > URL: https://issues.apache.org/jira/browse/YARN-7994 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Suma Shivaprasad >Assignee: Suma Shivaprasad >Priority: Major > Labels: Docker > > Docker Embedded DNS supports DNS resolution for containers by one or more of > its configured {{--network-alias}} within a user-defined network. > DockerRunCommand should support this option for DNS resolution to work > through docker embedded DNS -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application
[ https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639012#comment-16639012 ] Suma Shivaprasad commented on YARN-8569: Thanks [~eyang] . I agree with [~leftnoteasy] that the local file name should be renamed to something like "appInfo.json" or "AMInfo.json" instead of service.json since it could be used by other custom AMs other than Yarn Service like Spark AM. Patch generally LGTM. A few minor comments ServiceClient.addFilesToCompression and DockerLInuxContainerRuntime.handleYarnSysFSUpdate are opening *Stream classes which could be closed in finally like in addYarnSysFs to prevent resource leaks. > Create an interface to provide cluster information to application > - > > Key: YARN-8569 > URL: https://issues.apache.org/jira/browse/YARN-8569 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Labels: Docker > Attachments: YARN-8569 YARN sysfs interface to provide cluster > information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, > YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, > YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch > > > Some program requires container hostnames to be known for application to run. 
> For example, distributed tensorflow requires launch_command that looks like: > {code} > # On ps0.example.com: > $ python trainer.py \ > --ps_hosts=ps0.example.com:,ps1.example.com: \ > --worker_hosts=worker0.example.com:,worker1.example.com: \ > --job_name=ps --task_index=0 > # On ps1.example.com: > $ python trainer.py \ > --ps_hosts=ps0.example.com:,ps1.example.com: \ > --worker_hosts=worker0.example.com:,worker1.example.com: \ > --job_name=ps --task_index=1 > # On worker0.example.com: > $ python trainer.py \ > --ps_hosts=ps0.example.com:,ps1.example.com: \ > --worker_hosts=worker0.example.com:,worker1.example.com: \ > --job_name=worker --task_index=0 > # On worker1.example.com: > $ python trainer.py \ > --ps_hosts=ps0.example.com:,ps1.example.com: \ > --worker_hosts=worker0.example.com:,worker1.example.com: \ > --job_name=worker --task_index=1 > {code} > This is a bit cumbersome to orchestrate via Distributed Shell, or YARN > services launch_command. In addition, the dynamic parameters do not work > with YARN flex command. This is the classic pain point for application > developer attempt to automate system environment settings as parameter to end > user application. > It would be great if YARN Docker integration can provide a simple option to > expose hostnames of the yarn service via a mounted file. The file content > gets updated when flex command is performed. This allows application > developer to consume system environment settings via a standard interface. > It is like /proc/devices for Linux, but for Hadoop. This may involve > updating a file in distributed cache, and allow mounting of the file via > container-executor. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
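To make the proposal concrete: once YARN mounts a cluster-info file into the container, an application can assemble the launch flags above from it instead of hard-coding hosts. The JSON layout and port numbers below are hypothetical placeholders (the issue elides the ports, and the real schema is whatever the YARN-8569 patch writes, e.g. the service spec as JSON); only the flag names come from the example in the issue:

```python
import json

# Hypothetical contents of the mounted cluster-info file; ports are
# placeholders, not values from the issue.
cluster_info = json.loads("""
{
  "components": {
    "ps":     ["ps0.example.com:2222", "ps1.example.com:2222"],
    "worker": ["worker0.example.com:2222", "worker1.example.com:2222"]
  }
}
""")

# Build the comma-separated host lists distributed TensorFlow expects.
ps_hosts = ",".join(cluster_info["components"]["ps"])
worker_hosts = ",".join(cluster_info["components"]["worker"])

cmd = (f"python trainer.py --ps_hosts={ps_hosts} "
       f"--worker_hosts={worker_hosts} --job_name=ps --task_index=0")
print(cmd)
```

Because the file is rewritten on a flex operation, re-reading it gives each container an up-to-date membership list without new launch parameters.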
[jira] [Commented] (YARN-7644) NM gets backed up deleting docker containers
[ https://issues.apache.org/jira/browse/YARN-7644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639025#comment-16639025 ] Eric Yang commented on YARN-7644: - Trunk build issues seem to have been resolved, triggering pre-commit test for patch 002. > NM gets backed up deleting docker containers > > > Key: YARN-7644 > URL: https://issues.apache.org/jira/browse/YARN-7644 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Reporter: Eric Badger >Assignee: Chandni Singh >Priority: Major > Labels: Docker > Attachments: YARN-7644.001.patch, YARN-7644.002.patch > > > We are sending a {{docker stop}} to the docker container with a timeout of 10 > seconds when we shut down a container. If the container does not stop after > 10 seconds then we force kill it. However, the {{docker stop}} command is a > blocking call. So in cases where lots of containers don't go down with the > initial SIGTERM, we have to wait 10+ seconds for the {{docker stop}} to > return. This ties up the ContainerLaunch handler and so these kill events > back up. It also appears to be backing up new container launches as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
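The core problem described in YARN-7644 is that each blocking {{docker stop}} (which waits up to its grace period) serializes on the single ContainerLaunch handler. One direction for a fix, sketched here in Python with a simulated stop call (this is an illustration of the concurrency issue, not the approach taken by the attached patches), is to dispatch the blocking waits to a pool so they overlap:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def docker_stop(container_id: str, grace_seconds: float) -> str:
    # Stand-in for `docker stop -t <grace> <id>`: blocks up to the grace
    # period waiting for the container to exit after SIGTERM.
    time.sleep(grace_seconds)
    return container_id

containers = [f"container_{i}" for i in range(8)]

# Running these sequentially on the event-handler thread would take
# 8 * 0.2s; a pool overlaps the waits so the handler is not tied up.
start = time.monotonic()
with ThreadPoolExecutor(max_workers=8) as pool:
    futures = [pool.submit(docker_stop, c, 0.2) for c in containers]
    stopped = [f.result() for f in futures]
elapsed = time.monotonic() - start
print(f"stopped {len(stopped)} containers in {elapsed:.2f}s")
```

With a 10-second grace period and many unresponsive containers, the sequential version is exactly the backup of kill events the issue reports.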
[jira] [Comment Edited] (YARN-7935) Expose container's hostname to applications running within the docker container
[ https://issues.apache.org/jira/browse/YARN-7935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639019#comment-16639019 ] Suma Shivaprasad edited comment on YARN-7935 at 10/4/18 11:14 PM: -- This is no longer needed since the container's hostname, IP and additional information could be obtained via YARN-8569 by Spark or other AMs which need this information. was (Author: suma.shivaprasad): This is no longer needed since the container's hostname, IP and additional information could be obtained via YARN-8659 by Spark or other AMs which need this information. > Expose container's hostname to applications running within the docker > container > --- > > Key: YARN-7935 > URL: https://issues.apache.org/jira/browse/YARN-7935 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Suma Shivaprasad >Assignee: Suma Shivaprasad >Priority: Major > Labels: Docker > Attachments: YARN-7935.1.patch, YARN-7935.2.patch, YARN-7935.3.patch, > YARN-7935.4.patch > > > Some applications have a need to bind to the container's hostname (like > Spark) which is different from the NodeManager's hostname(NM_HOST which is > available as an env during container launch) when launched through Docker > runtime. The container's hostname can be exposed to applications via an env > CONTAINER_HOSTNAME. Another potential candidate is the container's IP but > this can be addressed in a separate jira. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7935) Expose container's hostname to applications running within the docker container
[ https://issues.apache.org/jira/browse/YARN-7935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639019#comment-16639019 ] Suma Shivaprasad commented on YARN-7935: This is no longer needed since the container's hostname, IP and additional information could be obtained via YARN-8659 by Spark or other AMs which need this information. > Expose container's hostname to applications running within the docker > container > --- > > Key: YARN-7935 > URL: https://issues.apache.org/jira/browse/YARN-7935 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Suma Shivaprasad >Assignee: Suma Shivaprasad >Priority: Major > Labels: Docker > Attachments: YARN-7935.1.patch, YARN-7935.2.patch, YARN-7935.3.patch, > YARN-7935.4.patch > > > Some applications have a need to bind to the container's hostname (like > Spark) which is different from the NodeManager's hostname(NM_HOST which is > available as an env during container launch) when launched through Docker > runtime. The container's hostname can be exposed to applications via an env > CONTAINER_HOSTNAME. Another potential candidate is the container's IP but > this can be addressed in a separate jira. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8845) hadoop.registry.rm.enabled is not used
[ https://issues.apache.org/jira/browse/YARN-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated YARN-8845: -- Attachment: YARN-8845.000.patch > hadoop.registry.rm.enabled is not used > -- > > Key: YARN-8845 > URL: https://issues.apache.org/jira/browse/YARN-8845 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Íñigo Goiri >Priority: Major > Attachments: YARN-8845.000.patch > > > YARN-2652 introduced "hadoop.registry.rm.enabled" as YARN-2571 was supposed > to initialize the registry but that's now gone. We should remove all the > references to this configuration key. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8763) Add WebSocket logic to the Node Manager web server to establish servlet
[ https://issues.apache.org/jira/browse/YARN-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639006#comment-16639006 ] Eric Yang commented on YARN-8763: - [~Zian Chen] DefaultContainerExecutor execContainer method is causing TestContainerManager unit test to fail. If the method returns null instead of throwing ContainerExecutionException, then the test passed. Not sure why execContainer method is getting triggered in TestContainerManager test. > Add WebSocket logic to the Node Manager web server to establish servlet > --- > > Key: YARN-8763 > URL: https://issues.apache.org/jira/browse/YARN-8763 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Labels: Docker > Attachments: YARN-8763-001.patch, YARN-8763.002.patch, > YARN-8763.003.patch, YARN-8763.004.patch > > > The reason we want to use WebSocket servlet to serve the backend instead of > establishing the connection through HTTP is that WebSocket solves a few > issues with HTTP which needed for our scenario, > # In HTTP, the request is always initiated by the client and the response is > processed by the server — making HTTP a unidirectional protocol, while web > socket provides the Bi-directional protocol which means either client/server > can send a message to the other party. > # Full-duplex communication — client and server can talk to each other > independently at the same time > # Single TCP connection — After upgrading the HTTP connection in the > beginning, client and server communicate over that same TCP connection > throughout the lifecycle of WebSocket connection -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8845) hadoop.registry.rm.enabled is not used
[ https://issues.apache.org/jira/browse/YARN-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated YARN-8845: -- Affects Version/s: 3.1.1 > hadoop.registry.rm.enabled is not used > -- > > Key: YARN-8845 > URL: https://issues.apache.org/jira/browse/YARN-8845 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.1.1 >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri >Priority: Major > Attachments: YARN-8845.000.patch > > > YARN-2652 introduced "hadoop.registry.rm.enabled" as YARN-2571 was supposed > to initialize the registry but that's now gone. We should remove all the > references to this configuration key. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-8845) hadoop.registry.rm.enabled is not used
[ https://issues.apache.org/jira/browse/YARN-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri reassigned YARN-8845: - Assignee: Íñigo Goiri > hadoop.registry.rm.enabled is not used > -- > > Key: YARN-8845 > URL: https://issues.apache.org/jira/browse/YARN-8845 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri >Priority: Major > Attachments: YARN-8845.000.patch > > > YARN-2652 introduced "hadoop.registry.rm.enabled" as YARN-2571 was supposed > to initialize the registry but that's now gone. We should remove all the > references to this configuration key. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8644) Improve unit test for RMAppImpl.FinalTransition
[ https://issues.apache.org/jira/browse/YARN-8644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8644: - Summary: Improve unit test for RMAppImpl.FinalTransition (was: Add more test coverage for RMAppImpl.FinalTransition) > Improve unit test for RMAppImpl.FinalTransition > --- > > Key: YARN-8644 > URL: https://issues.apache.org/jira/browse/YARN-8644 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Minor > Attachments: YARN-8644.001.patch, YARN-8644.002.patch, > YARN-8644.003.patch, YARN-8644.004.patch, YARN-8644.005.patch, > YARN-8644.006.patch, YARN-8644.007.patch, YARN-8644.008.patch, > YARN-8644.009.patch, YARN-8644.010.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8644) Improve unit test for RMAppImpl.FinalTransition
[ https://issues.apache.org/jira/browse/YARN-8644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638959#comment-16638959 ] Haibo Chen commented on YARN-8644: -- +1 on the latest patch pending Jenkins. > Improve unit test for RMAppImpl.FinalTransition > --- > > Key: YARN-8644 > URL: https://issues.apache.org/jira/browse/YARN-8644 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Minor > Attachments: YARN-8644.001.patch, YARN-8644.002.patch, > YARN-8644.003.patch, YARN-8644.004.patch, YARN-8644.005.patch, > YARN-8644.006.patch, YARN-8644.007.patch, YARN-8644.008.patch, > YARN-8644.009.patch, YARN-8644.010.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8659) RMWebServices returns only RUNNING apps when filtered with queue
[ https://issues.apache.org/jira/browse/YARN-8659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638932#comment-16638932 ] Haibo Chen commented on YARN-8659: -- Thanks [~snemeth] for the analysis. I agree with your observation. When a queue is specified, the matching applications are currently retrieved from the scheduler, which knows only about the ones that are still actively involved in scheduling decisions, a subset of all the applications that RM knows about. An application is removed from the scheduler's knowledge as soon as its scheduling part is done, but RM can still retain the application in its memory until it is evicted based on the max # of applications RM is allowed to remember. IMO this is more of a bug than an API-compatibility-breaking change. To reproduce the bug and verify the behavioral change, I think we need two unit test methods: 1) Two applications, one running and the other finished, query with state=finished and a queue; the result should contain the finished app (the result is empty now). 2) Two applications, one running and the other finished, query with queue specified but no state; the result should include both applications (the result now contains only the running app). > RMWebServices returns only RUNNING apps when filtered with queue > > > Key: YARN-8659 > URL: https://issues.apache.org/jira/browse/YARN-8659 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.7.3 >Reporter: Prabhu Joseph >Assignee: Szilard Nemeth >Priority: Major > Attachments: Screen Shot 2018-08-13 at 8.01.29 PM.png, Screen Shot > 2018-08-13 at 8.01.52 PM.png, YARN-8659.001.patch > > > RMWebServices returns only RUNNING apps when filtered with queue and returns > empty apps > when filtered with both FINISHED states and queue. 
> http://pjoseph-script-llap3.openstacklocal:8088/ws/v1/cluster/apps?queue=default > http://pjoseph-script-llap3.openstacklocal:8088/ws/v1/cluster/apps?states=FINISHED&queue=default -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
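The buggy lookup path [~haibochen] describes can be modeled in a few lines. This is a hedged, language-neutral sketch (the dict-based app records and the get_apps/get_apps_fixed names are illustrative, not YARN APIs): when the queue filter consults only the scheduler's view, finished applications silently disappear from the result, while filtering the RM's full application list gives both expected outcomes.

```python
# Model of the reported bug: the scheduler forgets an app once its
# scheduling part is done, so a queue filter answered from the
# scheduler's view drops FINISHED applications.

def get_apps(rm_apps, scheduler_apps, queue=None, states=None):
    """Buggy lookup: a queue filter consults only the scheduler's view."""
    if queue is not None:
        candidates = [a for a in scheduler_apps if a["queue"] == queue]
    else:
        candidates = list(rm_apps)
    if states is not None:
        candidates = [a for a in candidates if a["state"] in states]
    return candidates

def get_apps_fixed(rm_apps, queue=None, states=None):
    """Fixed lookup: filter the RM's full application list instead."""
    apps = list(rm_apps)
    if queue is not None:
        apps = [a for a in apps if a["queue"] == queue]
    if states is not None:
        apps = [a for a in apps if a["state"] in states]
    return apps

running = {"id": "app_1", "queue": "default", "state": "RUNNING"}
finished = {"id": "app_2", "queue": "default", "state": "FINISHED"}
rm_apps = [running, finished]   # the RM still remembers both apps
scheduler_apps = [running]      # the scheduler only knows the active one

# Scenario 1: queue + states=FINISHED is empty today but should match app_2.
assert get_apps(rm_apps, scheduler_apps, queue="default", states={"FINISHED"}) == []
assert get_apps_fixed(rm_apps, queue="default", states={"FINISHED"}) == [finished]
# Scenario 2: queue with no state filter should include both applications.
assert get_apps_fixed(rm_apps, queue="default") == [running, finished]
```

The two assertions at the end mirror the two unit test methods proposed in the comment.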
[jira] [Commented] (YARN-8763) Add WebSocket logic to the Node Manager web server to establish servlet
[ https://issues.apache.org/jira/browse/YARN-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638918#comment-16638918 ] Hadoop QA commented on YARN-8763: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 33s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 5 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 10s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 37s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project hadoop-client-modules/hadoop-client-runtime hadoop-client-modules/hadoop-client-minicluster {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 21s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 28s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 14m 12s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 3m 27s{color} | {color:orange} root: The patch generated 6 new + 67 unchanged - 0 fixed = 73 total (was 67) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 5s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 25s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project hadoop-client-modules/hadoop-client-runtime hadoop-client-modules/hadoop-client-minicluster {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 44s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 22s{color} | {color:green} hadoop-project in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 20m 13s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 26s{color} | {color:green} hadoop-client-runtime in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 30s{color} | {color:green} hadoop-client-minicluster in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 36s{color} | {color:green}
[jira] [Commented] (YARN-8732) Add unit tests of min/max allocation for custom resource types in FairScheduler
[ https://issues.apache.org/jira/browse/YARN-8732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638916#comment-16638916 ] Szilard Nemeth commented on YARN-8732: -- Thanks [~haibochen]! Okay, will pay attention to these in the future. > Add unit tests of min/max allocation for custom resource types in > FairScheduler > --- > > Key: YARN-8732 > URL: https://issues.apache.org/jira/browse/YARN-8732 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 3.2.0 >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Minor > Labels: unittest > Fix For: 3.3.0 > > Attachments: YARN-8732.001.patch, YARN-8732.002.patch, > YARN-8732.003.patch, YARN-8732.004.patch, YARN-8732.005.patch, > YARN-8732.006.patch, YARN-8732.007.patch > > > Create testcase like this, but for FS: > org.apache.hadoop.yarn.server.resourcemanager.TestApplicationMasterService#testValidateRequestCapacityAgainstMinMaxAllocationFor3rdResourceTypes -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8644) Add more test coverage for RMAppImpl.FinalTransition
[ https://issues.apache.org/jira/browse/YARN-8644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated YARN-8644: - Attachment: YARN-8644.010.patch > Add more test coverage for RMAppImpl.FinalTransition > > > Key: YARN-8644 > URL: https://issues.apache.org/jira/browse/YARN-8644 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Minor > Attachments: YARN-8644.001.patch, YARN-8644.002.patch, > YARN-8644.003.patch, YARN-8644.004.patch, YARN-8644.005.patch, > YARN-8644.006.patch, YARN-8644.007.patch, YARN-8644.008.patch, > YARN-8644.009.patch, YARN-8644.010.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8644) Add more test coverage for RMAppImpl.FinalTransition
[ https://issues.apache.org/jira/browse/YARN-8644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638902#comment-16638902 ] Szilard Nemeth commented on YARN-8644: -- Hi [~haibochen]! Referring to your previous comment, adding the new testcase {{testFinalTransition}} was really unnecessary as its assertions were already covered by other testcases. I also removed {{verifyAppRemovedEvent}} and moved one assertion from it to {{verifyAppRemovedSchedulerEvent}} that verifies whether the applicationEvent's applicationId field corresponds to the actual application's id. Also removed a lot of unnecessary changes: method visibility restrictions and generic types from collections. > Add more test coverage for RMAppImpl.FinalTransition > > > Key: YARN-8644 > URL: https://issues.apache.org/jira/browse/YARN-8644 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Minor > Attachments: YARN-8644.001.patch, YARN-8644.002.patch, > YARN-8644.003.patch, YARN-8644.004.patch, YARN-8644.005.patch, > YARN-8644.006.patch, YARN-8644.007.patch, YARN-8644.008.patch, > YARN-8644.009.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8763) Add WebSocket logic to the Node Manager web server to establish servlet
[ https://issues.apache.org/jira/browse/YARN-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638901#comment-16638901 ] Eric Yang commented on YARN-8763: - [~Zian Chen] +1 on patch 4. One small improvement that we can do here or in YARN-8838: it would be great to accept the application ID in the URL for security validation, so we can locate the yarn local dir on the host system and match the remote user with the container directory owner, preventing other users from accessing the container. > Add WebSocket logic to the Node Manager web server to establish servlet > --- > > Key: YARN-8763 > URL: https://issues.apache.org/jira/browse/YARN-8763 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Labels: Docker > Attachments: YARN-8763-001.patch, YARN-8763.002.patch, > YARN-8763.003.patch, YARN-8763.004.patch > > > The reason we want to use WebSocket servlet to serve the backend instead of > establishing the connection through HTTP is that WebSocket solves a few > issues with HTTP that matter for our scenario: > # In HTTP, the request is always initiated by the client and the response is > processed by the server — making HTTP a unidirectional protocol, while > WebSocket provides a bi-directional protocol, which means either the client or > server can send a message to the other party. > # Full-duplex communication — client and server can talk to each other > independently at the same time. > # Single TCP connection — after upgrading the HTTP connection in the > beginning, client and server communicate over that same TCP connection > throughout the lifecycle of the WebSocket connection. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
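The validation [~eyang] suggests, taking the application ID from the URL and matching the remote user against the container directory owner, could look roughly like this (a hedged sketch; the regex, the lookup function, and the user names are assumptions for illustration, not the YARN implementation):

```python
# Model of the suggested check: parse the application ID out of the URL,
# resolve the owner of its container directory, and admit the remote user
# only when the two match.
import re

APP_ID_RE = re.compile(r"^application_\d+_\d+$")

def authorize(remote_user, app_id, dir_owner_lookup):
    """Return True only for a well-formed app ID whose directory the
    remote user owns."""
    if not APP_ID_RE.match(app_id):
        return False  # reject malformed or path-traversal IDs outright
    return dir_owner_lookup(app_id) == remote_user

# Hypothetical directory-owner table standing in for the yarn local dirs.
owners = {"application_1538680000000_0001": "alice"}
lookup = owners.get

assert authorize("alice", "application_1538680000000_0001", lookup) is True
assert authorize("bob", "application_1538680000000_0001", lookup) is False
assert authorize("alice", "../etc/passwd", lookup) is False
```

Validating the ID's shape before touching the filesystem is what prevents one user from steering the lookup at another user's container directory.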
[jira] [Comment Edited] (YARN-7644) NM gets backed up deleting docker containers
[ https://issues.apache.org/jira/browse/YARN-7644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638819#comment-16638819 ] Eric Yang edited comment on YARN-7644 at 10/4/18 9:06 PM: -- [~csingh] ContainerCleanup is a runnable, and it is a utility class to remove container. It would be good to keep this helper class generic and can be reused in deletion task in my opinion. Launcher package is all dealing with similar kind of events like launch, relaunch, or pause launch. Deletion task has a package prefix of it's own. Package names appear more organized, if ContainerCleanup is in deletion.task package, even if the utility class is called by launch event failure or completion. That is just my opinion. I will let others provide feedback. was (Author: eyang): [~csingh] ContainerCleanup a a runnable, and it is a utility class to remove container. It would be good to keep this helper class general and can be reused in deletion task in my opinion. Launcher package is all dealing with similar kind of events like launch, relaunch, or pause launch. Deletion task has a package prefix of it's own. Package names appear more organized, if ContainerCleanup is in deletion.task package, even if the utility class is called by launch event failure or completion. That is just my opinion. I will let others provide feedback. > NM gets backed up deleting docker containers > > > Key: YARN-7644 > URL: https://issues.apache.org/jira/browse/YARN-7644 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Reporter: Eric Badger >Assignee: Chandni Singh >Priority: Major > Labels: Docker > Attachments: YARN-7644.001.patch, YARN-7644.002.patch > > > We are sending a {{docker stop}} to the docker container with a timeout of 10 > seconds when we shut down a container. If the container does not stop after > 10 seconds then we force kill it. However, the {{docker stop}} command is a > blocking call. 
So in cases where lots of containers don't go down with the > initial SIGTERM, we have to wait 10+ seconds for the {{docker stop}} to > return. This ties up the ContainerLaunch handler and so these kill events > back up. It also appears to be backing up new container launches as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
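The remedy discussed in this thread, handing the blocking {{docker stop}} to a separate cleanup runnable so the ContainerLaunch handler is not tied up, can be sketched as follows (a minimal Python model; the executor-backed cleanup pool and all names are hypothetical, not the NodeManager implementation):

```python
# Model of the fix direction: the kill-event handler submits the blocking
# "docker stop" to a cleanup pool (a stand-in for a deletion task) and
# returns immediately, so stuck containers cannot back up other events.
from concurrent.futures import ThreadPoolExecutor

def blocking_docker_stop(container_id, stopped):
    # Stand-in for `docker stop <id>`, which may block for 10+ seconds.
    stopped.append(container_id)

def handle_kill_event(container_id, cleanup_pool, stopped):
    """Event handler: schedule cleanup asynchronously instead of waiting."""
    return cleanup_pool.submit(blocking_docker_stop, container_id, stopped)

stopped = []
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [handle_kill_event(f"container_{i}", pool, stopped)
               for i in range(8)]
    for f in futures:
        f.result()  # wait here only so the demo can verify the outcome

assert sorted(stopped) == sorted(f"container_{i}" for i in range(8))
```

In the real patch the point is that the handler thread never calls `result()`; the stops proceed in the background while new launch events keep flowing.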
[jira] [Commented] (YARN-8777) Container Executor C binary change to execute interactive docker command
[ https://issues.apache.org/jira/browse/YARN-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638874#comment-16638874 ] Hadoop QA commented on YARN-8777: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 24s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 8s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 33m 8s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 37s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 37s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 69m 31s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 | | JIRA Issue | YARN-8777 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12942440/YARN-8777.007.patch | | Optional Tests | dupname asflicense compile cc mvnsite javac unit | | uname | Linux 8ce8a43f287b 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / b6d5d84 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/22059/testReport/ | | Max. process+thread count | 340 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/22059/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > Container Executor C binary change to execute interactive docker command > > > Key: YARN-8777 > URL: https://issues.apache.org/jira/browse/YARN-8777 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zian Chen >Assignee: Eric Yang >Priority: Major > Labels: Docker > Attachments: YARN-8777.001.patch, YARN-8777.002.patch, > YARN-8777.003.patch, YARN-8777.004.patch, YARN-8777.005.patch, > YARN-8777.006.patch, YARN-8777.007.patch > > > Since Container Executor provides Container execution using the native > container-executor binary, we also need to make changes to accept new > “dockerExec” method to
[jira] [Commented] (YARN-7644) NM gets backed up deleting docker containers
[ https://issues.apache.org/jira/browse/YARN-7644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638840#comment-16638840 ] Eric Badger commented on YARN-7644: --- Personally, I think that {{org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher}} is the more appropriate location for the new class because it is related to the container launch cycle of events. Based on the name of the class ({{CleanupContainer}}), it probably should be in the deletion package. But based on the actual implementation of what it actually does, I think it belongs in launcher. I think there are pros and cons to each, and I agree that it gets a little messy since we have to involve a deletion task to actually remove the docker containers, but I think that is the deviation and that we should maintain course in this case. Overall, I think the patch looks good. +1 (non-binding) from me. [~jlowe], do you have any comments? > NM gets backed up deleting docker containers > > > Key: YARN-7644 > URL: https://issues.apache.org/jira/browse/YARN-7644 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Reporter: Eric Badger >Assignee: Chandni Singh >Priority: Major > Labels: Docker > Attachments: YARN-7644.001.patch, YARN-7644.002.patch > > > We are sending a {{docker stop}} to the docker container with a timeout of 10 > seconds when we shut down a container. If the container does not stop after > 10 seconds then we force kill it. However, the {{docker stop}} command is a > blocking call. So in cases where lots of containers don't go down with the > initial SIGTERM, we have to wait 10+ seconds for the {{docker stop}} to > return. This ties up the ContainerLaunch handler and so these kill events > back up. It also appears to be backing up new container launches as well. 
[jira] [Commented] (YARN-8750) Refactor TestQueueMetrics
[ https://issues.apache.org/jira/browse/YARN-8750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638835#comment-16638835 ] Hudson commented on YARN-8750: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15116 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/15116/]) YARN-8750. Refactor TestQueueMetrics. (Contributed by Szilard Nemeth) (haibochen: rev e60b797c88541f94cecc7fdbcaad010c4742cfdb) * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppMetricsChecker.java * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/ResourceMetricsChecker.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestQueueMetrics.java > Refactor TestQueueMetrics > - > > Key: YARN-8750 > URL: https://issues.apache.org/jira/browse/YARN-8750 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Minor > Fix For: 3.3.0 > > Attachments: YARN-8750.001.patch, YARN-8750.002.patch, > YARN-8750.003.patch, YARN-8750.004.patch, YARN-8750.005.patch, > YARN-8750.006.patch > > > {{TestQueueMetrics#checkApps}} and {{TestQueueMetrics#checkResources}} have 8 > and 14 parameters, respectively. > It is very hard to read the testcases that are using these methods. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
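The pattern behind the newly committed AppMetricsChecker/ResourceMetricsChecker classes, replacing check methods that took 8 and 14 positional parameters with a fluent checker object, can be sketched like this (a hedged Python model; the metric names and method signatures are illustrative, not the actual YARN test code):

```python
# Model of the checker-object refactor: each test states only the metrics
# it cares about, instead of passing 8-14 positional arguments.

class AppMetricsChecker:
    """Fluent checker built up per test, then verified in one call."""

    def __init__(self):
        self._expected = {}

    def apps_submitted(self, n):
        self._expected["appsSubmitted"] = n
        return self

    def apps_running(self, n):
        self._expected["appsRunning"] = n
        return self

    def apps_completed(self, n):
        self._expected["appsCompleted"] = n
        return self

    def check_against(self, metrics):
        for key, expected in self._expected.items():
            actual = metrics.get(key)
            assert actual == expected, \
                f"{key}: expected {expected}, got {actual}"
        return self

metrics = {"appsSubmitted": 1, "appsRunning": 1, "appsCompleted": 0}
AppMetricsChecker().apps_submitted(1).apps_running(1).check_against(metrics)
```

Each call site now reads as a list of named expectations, which is exactly the readability problem the 8/14-parameter methods had.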
[jira] [Commented] (YARN-7644) NM gets backed up deleting docker containers
[ https://issues.apache.org/jira/browse/YARN-7644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638819#comment-16638819 ] Eric Yang commented on YARN-7644: - [~csingh] ContainerCleanup is a runnable, and it is a utility class to remove container. It would be good to keep this helper class generic and can be reused in deletion task in my opinion. Launcher package is all dealing with similar kind of events like launch, relaunch, or pause launch. Deletion task has a package prefix of it's own. Package names appear more organized, if ContainerCleanup is in deletion.task package, even if the utility class is called by launch event failure or completion. That is just my opinion. I will let others provide feedback. > NM gets backed up deleting docker containers > > > Key: YARN-7644 > URL: https://issues.apache.org/jira/browse/YARN-7644 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Reporter: Eric Badger >Assignee: Chandni Singh >Priority: Major > Labels: Docker > Attachments: YARN-7644.001.patch, YARN-7644.002.patch > > > We are sending a {{docker stop}} to the docker container with a timeout of 10 > seconds when we shut down a container. If the container does not stop after > 10 seconds then we force kill it. However, the {{docker stop}} command is a > blocking call. So in cases where lots of containers don't go down with the > initial SIGTERM, we have to wait 10+ seconds for the {{docker stop}} to > return. This ties up the ContainerLaunch handler and so these kill events > back up. It also appears to be backing up new container launches as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8732) Add unit tests of min/max allocation for custom resource types in FairScheduler
[ https://issues.apache.org/jira/browse/YARN-8732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638802#comment-16638802 ] Hudson commented on YARN-8732: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15115 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/15115/]) YARN-8732. Add unit tests of min/max allocation for custom resource (haibochen: rev b6d5d84e0761a450acee103d87afcae26ca504b6) * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationMasterServiceFair.java * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationMasterServiceInterceptor.java * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterServiceTestBase.java * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationMasterServiceCapacity.java * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationMasterService.java > Add unit tests of min/max allocation for custom resource types in > FairScheduler > --- > > Key: YARN-8732 > URL: https://issues.apache.org/jira/browse/YARN-8732 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 3.2.0 >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Minor > Labels: unittest > Fix For: 3.3.0 > > Attachments: YARN-8732.001.patch, YARN-8732.002.patch, > YARN-8732.003.patch, YARN-8732.004.patch, YARN-8732.005.patch, > YARN-8732.006.patch, YARN-8732.007.patch > > > Create testcase like this, but for FS: > 
org.apache.hadoop.yarn.server.resourcemanager.TestApplicationMasterService#testValidateRequestCapacityAgainstMinMaxAllocationFor3rdResourceTypes
[jira] [Commented] (YARN-8750) Refactor TestQueueMetrics
[ https://issues.apache.org/jira/browse/YARN-8750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638786#comment-16638786 ] Haibo Chen commented on YARN-8750: -- Thanks [~snemeth] for the contribution. I have committed the patch to trunk, addressing the longer-than-80-character line issue. I also noticed inconsistent indentation (2, 4, and 6 spaces are used interchangeably). > Refactor TestQueueMetrics > - > > Key: YARN-8750 > URL: https://issues.apache.org/jira/browse/YARN-8750 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Minor > Fix For: 3.3.0 > > Attachments: YARN-8750.001.patch, YARN-8750.002.patch, > YARN-8750.003.patch, YARN-8750.004.patch, YARN-8750.005.patch, > YARN-8750.006.patch > > > {{TestQueueMetrics#checkApps}} and {{TestQueueMetrics#checkResources}} have 8 > and 14 parameters, respectively. > It is very hard to read the testcases that are using these methods.
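[Editor's note] The readability problem described in YARN-8750 (check methods taking 8 and 14 positional parameters) is typically addressed by introducing a parameter object with a fluent builder, so each expected value is named at the call site. A minimal sketch of that technique; the class and method names here are illustrative, not the actual YARN-8750 patch:

```java
// Hypothetical parameter object replacing a long positional argument list
// such as checkApps(metrics, 1, 1, 0, 0, 0, 0, true). Unset fields default
// to zero, so call sites only name the values they care about.
class AppMetricsExpectations {
    private int submitted;
    private int pending;
    private int running;
    private int completed;

    private AppMetricsExpectations() { }

    static AppMetricsExpectations expect() {
        return new AppMetricsExpectations();
    }

    AppMetricsExpectations submitted(int n) { this.submitted = n; return this; }
    AppMetricsExpectations pending(int n)   { this.pending = n; return this; }
    AppMetricsExpectations running(int n)   { this.running = n; return this; }
    AppMetricsExpectations completed(int n) { this.completed = n; return this; }

    int getSubmitted() { return submitted; }
    int getPending()   { return pending; }
    int getRunning()   { return running; }
    int getCompleted() { return completed; }
}
```

A call such as checkApps(metrics, expect().submitted(1).pending(1)) then reads as a sentence, while unchecked fields stay at their defaults.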
[jira] [Commented] (YARN-8777) Container Executor C binary change to execute interactive docker command
[ https://issues.apache.org/jira/browse/YARN-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638762#comment-16638762 ] Eric Yang commented on YARN-8777: - Unit test failure is addressed in YARN-8844. Patch 7 fixed the white spaces. > Container Executor C binary change to execute interactive docker command > > > Key: YARN-8777 > URL: https://issues.apache.org/jira/browse/YARN-8777 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zian Chen >Assignee: Eric Yang >Priority: Major > Labels: Docker > Attachments: YARN-8777.001.patch, YARN-8777.002.patch, > YARN-8777.003.patch, YARN-8777.004.patch, YARN-8777.005.patch, > YARN-8777.006.patch, YARN-8777.007.patch > > > Since Container Executor provides Container execution using the native > container-executor binary, we also need to make changes to accept new > “dockerExec” method to invoke the corresponding native function to execute > docker exec command to the running container. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8777) Container Executor C binary change to execute interactive docker command
[ https://issues.apache.org/jira/browse/YARN-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-8777: Attachment: YARN-8777.007.patch > Container Executor C binary change to execute interactive docker command > > > Key: YARN-8777 > URL: https://issues.apache.org/jira/browse/YARN-8777 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zian Chen >Assignee: Eric Yang >Priority: Major > Labels: Docker > Attachments: YARN-8777.001.patch, YARN-8777.002.patch, > YARN-8777.003.patch, YARN-8777.004.patch, YARN-8777.005.patch, > YARN-8777.006.patch, YARN-8777.007.patch > > > Since Container Executor provides Container execution using the native > container-executor binary, we also need to make changes to accept new > “dockerExec” method to invoke the corresponding native function to execute > docker exec command to the running container. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8732) Add unit tests of min/max allocation for custom resource types in FairScheduler
[ https://issues.apache.org/jira/browse/YARN-8732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638759#comment-16638759 ] Haibo Chen commented on YARN-8732: -- Thanks [~snemeth]. I have corrected two indentation issues out of the three reported along with my commit to trunk. Please do remember to address the indentation issues in the future and keep the line continuation style consistent (some are 4 spaces, some are 6). > Add unit tests of min/max allocation for custom resource types in > FairScheduler > --- > > Key: YARN-8732 > URL: https://issues.apache.org/jira/browse/YARN-8732 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 3.2.0 >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Minor > Labels: unittest > Fix For: 3.3.0 > > Attachments: YARN-8732.001.patch, YARN-8732.002.patch, > YARN-8732.003.patch, YARN-8732.004.patch, YARN-8732.005.patch, > YARN-8732.006.patch, YARN-8732.007.patch > > > Create testcase like this, but for FS: > org.apache.hadoop.yarn.server.resourcemanager.TestApplicationMasterService#testValidateRequestCapacityAgainstMinMaxAllocationFor3rdResourceTypes -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
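[Editor's note] The behavior under test in YARN-8732 — a scheduler rejecting allocation requests that fall outside the configured min/max bounds for a custom resource type — can be sketched without the YARN scheduler APIs. The class and method names below are illustrative only, not Hadoop's:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of min/max validation for custom resource types;
// this is NOT the actual YARN scheduler code, just the shape of the check
// the FairScheduler test cases exercise.
class ResourceLimits {
    private final Map<String, Long> min = new HashMap<>();
    private final Map<String, Long> max = new HashMap<>();

    void setLimits(String resourceType, long minValue, long maxValue) {
        min.put(resourceType, minValue);
        max.put(resourceType, maxValue);
    }

    /** Mirrors the check a scheduler performs before granting an allocation. */
    boolean isValidRequest(String resourceType, long requested) {
        Long lo = min.get(resourceType);
        Long hi = max.get(resourceType);
        if (lo == null || hi == null) {
            return false; // unknown resource type: reject
        }
        return requested >= lo && requested <= hi;
    }
}
```

A FairScheduler test analogous to the CapacityScheduler one referenced above would configure such limits for a third (custom) resource type and assert that a request above the maximum is rejected.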
[jira] [Commented] (YARN-8777) Container Executor C binary change to execute interactive docker command
[ https://issues.apache.org/jira/browse/YARN-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638734#comment-16638734 ] Zian Chen commented on YARN-8777: - Hi [~eyang], thanks for patch 006. It seems we still have whitespace errors in the latest Jenkins build; could you help fix them? Overall the patch looks good to me. > Container Executor C binary change to execute interactive docker command > > > Key: YARN-8777 > URL: https://issues.apache.org/jira/browse/YARN-8777 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zian Chen >Assignee: Eric Yang >Priority: Major > Labels: Docker > Attachments: YARN-8777.001.patch, YARN-8777.002.patch, > YARN-8777.003.patch, YARN-8777.004.patch, YARN-8777.005.patch, > YARN-8777.006.patch > > > Since Container Executor provides Container execution using the native > container-executor binary, we also need to make changes to accept new > “dockerExec” method to invoke the corresponding native function to execute > docker exec command to the running container.
[jira] [Commented] (YARN-4254) ApplicationAttempt stuck forever due to UnknownHostException
[ https://issues.apache.org/jira/browse/YARN-4254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638735#comment-16638735 ] Hadoop QA commented on YARN-4254: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 26s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 42s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 8s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 6s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 58s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 6s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 56s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 49s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 32s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 74m 48s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 38s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}157m 28s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 | | JIRA Issue | YARN-4254 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12942404/YARN-4254.003.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml | | uname | Linux f88f4643769b 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality |
[jira] [Commented] (YARN-8763) Add WebSocket logic to the Node Manager web server to establish servlet
[ https://issues.apache.org/jira/browse/YARN-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638730#comment-16638730 ] Zian Chen commented on YARN-8763: - [~eyang], just uploaded patch 004, please help review it. Thanks > Add WebSocket logic to the Node Manager web server to establish servlet > --- > > Key: YARN-8763 > URL: https://issues.apache.org/jira/browse/YARN-8763 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Labels: Docker > Attachments: YARN-8763-001.patch, YARN-8763.002.patch, > YARN-8763.003.patch, YARN-8763.004.patch > > > The reason we want to use WebSocket servlet to serve the backend instead of > establishing the connection through HTTP is that WebSocket solves a few > issues with HTTP which needed for our scenario, > # In HTTP, the request is always initiated by the client and the response is > processed by the server — making HTTP a unidirectional protocol, while web > socket provides the Bi-directional protocol which means either client/server > can send a message to the other party. > # Full-duplex communication — client and server can talk to each other > independently at the same time > # Single TCP connection — After upgrading the HTTP connection in the > beginning, client and server communicate over that same TCP connection > throughout the lifecycle of WebSocket connection -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8763) Add WebSocket logic to the Node Manager web server to establish servlet
[ https://issues.apache.org/jira/browse/YARN-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen updated YARN-8763: Attachment: YARN-8763.004.patch > Add WebSocket logic to the Node Manager web server to establish servlet > --- > > Key: YARN-8763 > URL: https://issues.apache.org/jira/browse/YARN-8763 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Labels: Docker > Attachments: YARN-8763-001.patch, YARN-8763.002.patch, > YARN-8763.003.patch, YARN-8763.004.patch > > > The reason we want to use WebSocket servlet to serve the backend instead of > establishing the connection through HTTP is that WebSocket solves a few > issues with HTTP which needed for our scenario, > # In HTTP, the request is always initiated by the client and the response is > processed by the server — making HTTP a unidirectional protocol, while web > socket provides the Bi-directional protocol which means either client/server > can send a message to the other party. > # Full-duplex communication — client and server can talk to each other > independently at the same time > # Single TCP connection — After upgrading the HTTP connection in the > beginning, client and server communicate over that same TCP connection > throughout the lifecycle of WebSocket connection -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
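[Editor's note] The "single TCP connection" point in the YARN-8763 description hinges on the HTTP Upgrade handshake defined in RFC 6455: the server proves it understood the upgrade request by hashing the client's Sec-WebSocket-Key together with a fixed GUID and echoing the result back, after which the same TCP connection carries framed messages in both directions. A self-contained sketch of that standard computation (this is the RFC 6455 algorithm, not the YARN-8763 servlet code):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Computes the Sec-WebSocket-Accept header value from the client's
// Sec-WebSocket-Key, per RFC 6455 section 4.2.2: SHA-1 over the key
// concatenated with a fixed GUID, then Base64-encoded.
class WebSocketHandshake {
    private static final String WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

    static String acceptKey(String clientKey) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] digest = sha1.digest(
                (clientKey + WS_GUID).getBytes(StandardCharsets.US_ASCII));
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 unavailable", e);
        }
    }
}
```

With the sample key from RFC 6455, "dGhlIHNhbXBsZSBub25jZQ==", this yields "s3pPLMBiTxaQ9kYGzzhZRbK+xOo=". Once the 101 Switching Protocols response is sent, client and server communicate full-duplex over that one connection, which is the behavior the servlet in this patch builds on.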
[jira] [Commented] (YARN-8844) TestNMProxy unit test is failing
[ https://issues.apache.org/jira/browse/YARN-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638711#comment-16638711 ] Hadoop QA commented on YARN-8844: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 1s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 0s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 10s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 6s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 20m 9s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 78m 18s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 | | JIRA Issue | YARN-8844 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12942421/YARN-8844.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 635d1fc42aeb 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 6926fd0 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/22058/testReport/ | | Max. process+thread count | 338 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/22058/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > TestNMProxy unit test is failing >
[jira] [Commented] (YARN-8845) hadoop.registry.rm.enabled is not used
[ https://issues.apache.org/jira/browse/YARN-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638640#comment-16638640 ] Íñigo Goiri commented on YARN-8845: --- [~steve_l], you were working on YARN-2571 to integrate the RM with the registry. However, it looks like this was stopped after YARN-6903. Should we remove the references in the constants and the documentation? > hadoop.registry.rm.enabled is not used > -- > > Key: YARN-8845 > URL: https://issues.apache.org/jira/browse/YARN-8845 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Íñigo Goiri >Priority: Major > > YARN-2652 introduced "hadoop.registry.rm.enabled" as YARN-2571 was supposed > to initialize the registry but that's now gone. We should remove all the > references to this configuration key. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8844) TestNMProxy unit test is failing
[ https://issues.apache.org/jira/browse/YARN-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638666#comment-16638666 ] Hudson commented on YARN-8844: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15114 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/15114/]) YARN-8844. TestNMProxy unit test is failing. (Eric Yang via wangda) (wangda: rev 2e9913caf2a0ba24c323c26dd978492bd398f2e5) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestNMProxy.java > TestNMProxy unit test is failing > > > Key: YARN-8844 > URL: https://issues.apache.org/jira/browse/YARN-8844 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.3.0 >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Fix For: 3.2.0, 3.1.2 > > Attachments: YARN-8844.001.patch > > > TestNMProxy has been failing in trunk for the last two or three weeks. > Investigating the failure. > {code} > [INFO] Running > org.apache.hadoop.yarn.server.nodemanager.containermanager.TestNMProxy > [ERROR] Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 4.806 > s <<< FAILURE! - in > org.apache.hadoop.yarn.server.nodemanager.containermanager.TestNMProxy > [ERROR] > testNMProxyRPCRetry(org.apache.hadoop.yarn.server.nodemanager.containermanager.TestNMProxy) > Time elapsed: 1.188 s <<< FAILURE! 
> java.lang.AssertionError > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.TestNMProxy.testNMProxyRPCRetry(TestNMProxy.java:171) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8763) Add WebSocket logic to the Node Manager web server to establish servlet
[ https://issues.apache.org/jira/browse/YARN-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638658#comment-16638658 ] Zian Chen commented on YARN-8763: - Thanks for the comments, Eric, I'll update the patch later today. > Add WebSocket logic to the Node Manager web server to establish servlet > --- > > Key: YARN-8763 > URL: https://issues.apache.org/jira/browse/YARN-8763 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Labels: Docker > Attachments: YARN-8763-001.patch, YARN-8763.002.patch, > YARN-8763.003.patch > > > The reason we want to use WebSocket servlet to serve the backend instead of > establishing the connection through HTTP is that WebSocket solves a few > issues with HTTP which needed for our scenario, > # In HTTP, the request is always initiated by the client and the response is > processed by the server — making HTTP a unidirectional protocol, while web > socket provides the Bi-directional protocol which means either client/server > can send a message to the other party. > # Full-duplex communication — client and server can talk to each other > independently at the same time > # Single TCP connection — After upgrading the HTTP connection in the > beginning, client and server communicate over that same TCP connection > throughout the lifecycle of WebSocket connection -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-8844) TestNMProxy unit test is failing
[ https://issues.apache.org/jira/browse/YARN-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang reassigned YARN-8844: --- Assignee: Eric Yang > TestNMProxy unit test is failing > > > Key: YARN-8844 > URL: https://issues.apache.org/jira/browse/YARN-8844 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.3.0 >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > > TestNMProxy has been failing in trunk for the last two or three weeks. > Investigating the failure. > {code} > [INFO] Running > org.apache.hadoop.yarn.server.nodemanager.containermanager.TestNMProxy > [ERROR] Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 4.806 > s <<< FAILURE! - in > org.apache.hadoop.yarn.server.nodemanager.containermanager.TestNMProxy > [ERROR] > testNMProxyRPCRetry(org.apache.hadoop.yarn.server.nodemanager.containermanager.TestNMProxy) > Time elapsed: 1.188 s <<< FAILURE! > java.lang.AssertionError > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.TestNMProxy.testNMProxyRPCRetry(TestNMProxy.java:171) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) > 
{code}
[jira] [Created] (YARN-8845) hadoop.registry.rm.enabled is not used
Íñigo Goiri created YARN-8845: - Summary: hadoop.registry.rm.enabled is not used Key: YARN-8845 URL: https://issues.apache.org/jira/browse/YARN-8845 Project: Hadoop YARN Issue Type: Bug Reporter: Íñigo Goiri YARN-2652 introduced "hadoop.registry.rm.enabled" as YARN-2571 was supposed to initialize the registry but that's now gone. We should remove all the references to this configuration key.
[jira] [Commented] (YARN-8758) Support getting PreemptionMessage when using AMRMClientAsync
[ https://issues.apache.org/jira/browse/YARN-8758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638632#comment-16638632 ] Hudson commented on YARN-8758: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15113 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/15113/]) YARN-8758. Support getting PreemptionMessage when using AMRMClientAsyn. (wangda: rev 6926fd0ec634df2576bbc9f45e9636b99260db72) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/async/impl/AMRMClientAsyncImpl.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/async/AMRMClientAsync.java > Support getting PreemptionMessage when using AMRMClientAsync > > > Key: YARN-8758 > URL: https://issues.apache.org/jira/browse/YARN-8758 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Affects Versions: 3.1.1 >Reporter: Krishna Kishore >Assignee: Zian Chen >Priority: Major > Fix For: 3.2.0, 3.1.2 > > Attachments: YARN-8758.001.patch > > > There's no way to get PreemptionMessage sent by RM from AMRMClientAsync, we > should add support for that. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8844) TestNMProxy unit test is failing
[ https://issues.apache.org/jira/browse/YARN-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-8844: Attachment: YARN-8844.001.patch > TestNMProxy unit test is failing > > > Key: YARN-8844 > URL: https://issues.apache.org/jira/browse/YARN-8844 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.3.0 >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-8844.001.patch > > > TestNMProxy has been failing in trunk for the last two or three weeks. > Investigating the failure. > {code} > [INFO] Running > org.apache.hadoop.yarn.server.nodemanager.containermanager.TestNMProxy > [ERROR] Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 4.806 > s <<< FAILURE! - in > org.apache.hadoop.yarn.server.nodemanager.containermanager.TestNMProxy > [ERROR] > testNMProxyRPCRetry(org.apache.hadoop.yarn.server.nodemanager.containermanager.TestNMProxy) > Time elapsed: 1.188 s <<< FAILURE! > java.lang.AssertionError > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.TestNMProxy.testNMProxyRPCRetry(TestNMProxy.java:171) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) > {code}
[jira] [Commented] (YARN-8777) Container Executor C binary change to execute interactive docker command
[ https://issues.apache.org/jira/browse/YARN-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638615#comment-16638615 ] Hadoop QA commented on YARN-8777: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 26s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 29m 16s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 34s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 18m 47s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 61m 52s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.TestNMProxy | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 | | JIRA Issue | YARN-8777 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12942415/YARN-8777.006.patch | | Optional Tests | dupname asflicense compile cc mvnsite javac unit | | uname | Linux e1c77744ae3d 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 81f635f | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/22055/artifact/out/whitespace-eol.txt | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/22055/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/22055/testReport/ | | Max. process+thread count | 453 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/22055/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > Container Executor C binary change to execute interactive docker command > > > Key: YARN-8777 > URL: https://issues.apache.org/jira/browse/YARN-8777 > Project: Hadoop YARN > Issue Type: Sub-task >
[jira] [Comment Edited] (YARN-8758) Support getting PreemptionMessage when using AMRMClientAsync
[ https://issues.apache.org/jira/browse/YARN-8758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638618#comment-16638618 ] Wangda Tan edited comment on YARN-8758 at 10/4/18 5:54 PM: --- Committed to branch-3.2/trunk/branch-3.1, thanks [~Zian Chen]. was (Author: leftnoteasy): Committed to branch-3.2/trunk/branch-3.1, thanks [~eyang]! > Support getting PreemptionMessage when using AMRMClientAsync > > > Key: YARN-8758 > URL: https://issues.apache.org/jira/browse/YARN-8758 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Affects Versions: 3.1.1 >Reporter: Krishna Kishore >Assignee: Zian Chen >Priority: Major > Fix For: 3.2.0, 3.1.2 > > Attachments: YARN-8758.001.patch > > > There's no way to get the PreemptionMessage sent by the RM from AMRMClientAsync; we > should add support for that.
[jira] [Commented] (YARN-8844) TestNMProxy unit test is failing
[ https://issues.apache.org/jira/browse/YARN-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638606#comment-16638606 ] Wangda Tan commented on YARN-8844: -- Fix LGTM, thanks [~eyang].
[jira] [Commented] (YARN-8844) TestNMProxy unit test is failing
[ https://issues.apache.org/jira/browse/YARN-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638596#comment-16638596 ] Eric Yang commented on YARN-8844: - The test is designed to pass when a SocketException is thrown.
{code}
150   @Test(timeout = 2)
151   public void testNMProxyRPCRetry() throws Exception {
152     conf.setLong(YarnConfiguration.CLIENT_NM_CONNECT_MAX_WAIT_MS, 1000);
153     conf.setLong(YarnConfiguration.CLIENT_NM_CONNECT_RETRY_INTERVAL_MS, 100);
154     StartContainersRequest allRequests =
155         Records.newRecord(StartContainersRequest.class);
156     Configuration newConf = new YarnConfiguration(conf);
157     newConf.setInt(
158         CommonConfigurationKeysPublic.IPC_CLIENT_CONNECT_MAX_RETRIES_KEY, 100);
159
160     newConf.setInt(CommonConfigurationKeysPublic.
161         IPC_CLIENT_CONNECT_MAX_RETRIES_ON_SOCKET_TIMEOUTS_KEY, 100);
162     // connect to some dummy address so that it can trigger
163     // connection failure and RPC level retires.
164     newConf.set(YarnConfiguration.NM_ADDRESS, "1234");
165     ContainerManagementProtocol proxy = getNMProxy(newConf);
166     try {
167       proxy.startContainers(allRequests);
168       Assert.fail("should get socket exception");
169     } catch (IOException e) {
170       // socket exception should be thrown immediately, without RPC retries.
171       Assert.assertTrue(e instanceof java.net.SocketException);
172     }
173   }
{code}
This test passes with Java 1.8.0_151, but it fails with Java 1.8.0_181: in 1.8.0_181 the exception thrown is java.net.UnknownHostException instead of SocketException. The YARN configuration for the test case sets NM_ADDRESS=1234 without a hostname, which causes the test case to fail. The solution is to change NM_ADDRESS to 0.0.0.0:1234 to get a SocketException instead of an UnknownHostException.
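The distinction described above can be reproduced outside YARN. A minimal sketch (illustrative only, not the actual TestNMProxy code): an address whose host part cannot be resolved surfaces as UnknownHostException, while a resolvable address fails at the socket layer, which is why switching NM_ADDRESS to 0.0.0.0:1234 keeps the test's SocketException assertion valid.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class ConnectFailureKind {
    // Attempts a TCP connect and reports which IOException subclass the
    // failure surfaced as; returns null if the connect unexpectedly succeeds.
    public static Class<? extends IOException> failureKind(String host, int port) {
        try (Socket s = new Socket()) {
            // An InetSocketAddress whose host cannot be resolved stays
            // "unresolved", and Socket.connect then throws
            // UnknownHostException rather than a SocketException.
            s.connect(new InetSocketAddress(host, port), 200);
            return null;
        } catch (IOException e) {
            return e.getClass();
        }
    }

    public static void main(String[] args) {
        // ".invalid" is reserved (RFC 6761) and never resolves.
        System.out.println(failureKind("no-such-host.invalid", 1234));
    }
}
```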
[jira] [Updated] (YARN-8468) Enable the use of queue based maximum container allocation limit and implement it in FairScheduler
[ https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8468: - Component/s: scheduler > Enable the use of queue based maximum container allocation limit and > implement it in FairScheduler > -- > > Key: YARN-8468 > URL: https://issues.apache.org/jira/browse/YARN-8468 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler, scheduler >Affects Versions: 3.1.0 >Reporter: Antal Bálint Steinbach >Assignee: Antal Bálint Steinbach >Priority: Critical > Fix For: 3.2.0 > > Attachments: YARN-8468.000.patch, YARN-8468.001.patch, > YARN-8468.002.patch, YARN-8468.003.patch, YARN-8468.004.patch, > YARN-8468.005.patch, YARN-8468.006.patch, YARN-8468.007.patch, > YARN-8468.008.patch, YARN-8468.009.patch, YARN-8468.010.patch, > YARN-8468.011.patch, YARN-8468.012.patch, YARN-8468.013.patch, > YARN-8468.014.patch, YARN-8468.015.patch, YARN-8468.016.patch, > YARN-8468.017.patch, YARN-8468.018.patch > > > When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" > to limit the overall size of a container. This applies globally to all > containers, cannot be limited per queue, and is not scheduler dependent. > The goal of this ticket is to allow this value to be set on a per-queue basis. > The use case: a user has two pools, one for ad hoc jobs and one for enterprise > apps, and wants to limit ad hoc jobs to small containers but allow > enterprise apps to request as many resources as needed. > yarn.scheduler.maximum-allocation-mb would set the default maximum > container size for all queues, while the per-queue maximum would be set with > a “maxContainerResources” queue config value. > Suggested solution: > All the infrastructure is already in the code. We need to do the following: > * add the setting to the queue properties for all queue types (parent and > leaf); this will cover dynamically created queues.
> * if we set it on the root we override the scheduler setting and we should > not allow that. > * make sure that queue resource cap can not be larger than scheduler max > resource cap in the config. > * implement getMaximumResourceCapability(String queueName) in the > FairScheduler > * implement getMaximumResourceCapability(String queueName) in both > FSParentQueue and FSLeafQueue as follows > * expose the setting in the queue information in the RM web UI. > * expose the setting in the metrics etc for the queue. > * Enforce the use of queue based maximum allocation limit if it is > available, if not use the general scheduler level setting > ** Use it during validation and normalization of requests in > scheduler.allocate, app submit and resource request
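The suggested solution above amounts to a per-queue override with a fallback to the scheduler-wide cap. A rough sketch of the intended lookup semantics (class and method names here are hypothetical, not the YARN-8468 patch itself):

```java
import java.util.HashMap;
import java.util.Map;

public class QueueMaxAllocation {
    private final long schedulerMaxMb;  // yarn.scheduler.maximum-allocation-mb
    private final Map<String, Long> perQueueMaxMb = new HashMap<>();

    public QueueMaxAllocation(long schedulerMaxMb) {
        this.schedulerMaxMb = schedulerMaxMb;
    }

    // Per-queue override; the root queue may not override the scheduler
    // setting, and a queue cap may not exceed the scheduler cap.
    public void setQueueMax(String queue, long maxMb) {
        if ("root".equals(queue)) {
            throw new IllegalArgumentException("root may not override the scheduler limit");
        }
        if (maxMb > schedulerMaxMb) {
            throw new IllegalArgumentException("queue cap exceeds scheduler cap");
        }
        perQueueMaxMb.put(queue, maxMb);
    }

    // Mirrors the proposed getMaximumResourceCapability(String queueName):
    // use the queue-level limit if set, otherwise the scheduler-wide limit.
    public long getMaximumAllocation(String queue) {
        return perQueueMaxMb.getOrDefault(queue, schedulerMaxMb);
    }
}
```

Validation and normalization in scheduler.allocate, app submission, and resource requests would then consult the queue-aware lookup instead of the global setting.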
[jira] [Updated] (YARN-8468) Enable the use of queue based maximum container allocation limit and implement it in FairScheduler
[ https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8468: - Fix Version/s: (was: 3.3.0)
[jira] [Commented] (YARN-8448) AM HTTPS Support
[ https://issues.apache.org/jira/browse/YARN-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638522#comment-16638522 ] Robert Kanter commented on YARN-8448: - Test failures seem unrelated - they all pass locally for me. > AM HTTPS Support > > > Key: YARN-8448 > URL: https://issues.apache.org/jira/browse/YARN-8448 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Robert Kanter >Assignee: Robert Kanter >Priority: Major > Attachments: YARN-8448.001.patch, YARN-8448.002.patch, > YARN-8448.003.patch, YARN-8448.004.patch, YARN-8448.005.patch, > YARN-8448.006.patch
[jira] [Commented] (YARN-7225) Add queue and partition info to RM audit log
[ https://issues.apache.org/jira/browse/YARN-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638556#comment-16638556 ] Eric Payne commented on YARN-7225: -- {quote} - In the capacity scheduler, the AM Released Container log entry contains an empty NODELABEL= KVP when there is no node label. - In the fair scheduler, the AM Released Container log entry does not contain the QUEUENAME KVP. - In the fair scheduler, if a queue is created at runtime, the Submit Application Request log entry contains QUEUENAME=Default instead of the name of the new queue. {quote} I attached patch 002. It addresses the first two. I think the third may be okay to leave as-is. > Add queue and partition info to RM audit log > > > Key: YARN-7225 > URL: https://issues.apache.org/jira/browse/YARN-7225 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Jonathan Hung >Assignee: Eric Payne >Priority: Major > Attachments: YARN-7225.001.patch, YARN-7225.002.patch > > > Right now the RM audit log has fields such as user, ip, resource, etc. Having > queue and partition is useful for resource tracking.
[jira] [Updated] (YARN-7225) Add queue and partition info to RM audit log
[ https://issues.apache.org/jira/browse/YARN-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Payne updated YARN-7225: - Attachment: YARN-7225.002.patch
[jira] [Commented] (YARN-8834) Provide Java client for fetching entities from TimelineReader
[ https://issues.apache.org/jira/browse/YARN-8834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638541#comment-16638541 ] Vrushali C commented on YARN-8834: -- Add fromid and limit. As suggested by [~rohithsharma], update the JIRA title and description to reflect that this is not exhaustive coverage of the REST APIs. > Provide Java client for fetching entities from TimelineReader > - > > Key: YARN-8834 > URL: https://issues.apache.org/jira/browse/YARN-8834 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelinereader >Reporter: Rohith Sharma K S >Assignee: Abhishek Modi >Priority: Critical > Attachments: YARN-8834.001.patch, YARN-8834.002.patch, > YARN-8834.003.patch > > > While reviewing YARN-8303, we felt that it is necessary to provide a > TimelineReaderClient which wraps all the REST calls in it so that users can > just provide EntityType and EntityId along with filters. Currently, fetching > entities from TimelineReader is only possible via REST calls, or somebody needs to write a > Java client to get entities. > It would be good to provide a TimelineReaderClient which fetches entities from > TimelineReaderServer. This would be more useful.
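Much of a TimelineReaderClient like the one proposed above is URL assembly over the ATSv2 reader's REST endpoints. A hedged sketch of the path building for entity queries, including the fromid and limit parameters mentioned above (class and method names are illustrative, not the YARN-8834 patch):

```java
public class TimelineReaderUrls {
    // Builds the ATSv2 reader URL for entities of a type within an app,
    // optionally narrowed to one entity id, with optional paging parameters.
    public static String entityUrl(String base, String clusterId, String appId,
                                   String entityType, String entityId,
                                   Integer limit, String fromId) {
        StringBuilder url = new StringBuilder(base)
            .append("/ws/v2/timeline/clusters/").append(clusterId)
            .append("/apps/").append(appId)
            .append("/entities/").append(entityType);
        if (entityId != null) {
            url.append("/").append(entityId);
        }
        String sep = "?";
        if (limit != null) {
            url.append(sep).append("limit=").append(limit);
            sep = "&";
        }
        if (fromId != null) {
            url.append(sep).append("fromid=").append(fromId);
        }
        return url.toString();
    }
}
```

A client wrapper would then issue a GET against such a URL and deserialize the JSON response into TimelineEntity objects; other filters would be appended the same way as limit and fromid.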
[jira] [Updated] (YARN-8844) TestNMProxy unit test is failing
[ https://issues.apache.org/jira/browse/YARN-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-8844: Description: TestNMProxy has been failing in trunk for the last two or three weeks. Investigating the failure. {code} [INFO] Running org.apache.hadoop.yarn.server.nodemanager.containermanager.TestNMProxy [ERROR] Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 4.806 s <<< FAILURE! - in org.apache.hadoop.yarn.server.nodemanager.containermanager.TestNMProxy [ERROR] testNMProxyRPCRetry(org.apache.hadoop.yarn.server.nodemanager.containermanager.TestNMProxy) Time elapsed: 1.188 s <<< FAILURE! java.lang.AssertionError at org.junit.Assert.fail(Assert.java:86) at org.junit.Assert.assertTrue(Assert.java:41) at org.junit.Assert.assertTrue(Assert.java:52) at org.apache.hadoop.yarn.server.nodemanager.containermanager.TestNMProxy.testNMProxyRPCRetry(TestNMProxy.java:171) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) {code} was:TestNMProxy has been failing in trunk for the last two or three weeks. Investigating the failure. 
[jira] [Commented] (YARN-8549) Adding a NoOp timeline writer and reader plugin classes for ATSv2
[ https://issues.apache.org/jira/browse/YARN-8549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638524#comment-16638524 ] Prabha Manepalli commented on YARN-8549: [~vrushalic] As [~suma.shivaprasad] suggested, I replaced the NULL with an empty TimelineWriterResponse object. I will also address your concerns on adding a debug log when the stubs are being used on a node. > Adding a NoOp timeline writer and reader plugin classes for ATSv2 > - > > Key: YARN-8549 > URL: https://issues.apache.org/jira/browse/YARN-8549 > Project: Hadoop YARN > Issue Type: Sub-task > Components: ATSv2, timelineclient, timelineserver >Reporter: Prabha Manepalli >Assignee: Prabha Manepalli >Priority: Minor > Attachments: YARN-8549-branch-2.03.patch, > YARN-8549-branch-2.04.patch, YARN-8549.v1.patch, YARN-8549.v2.patch, > YARN-8549.v4.patch > > > Stub implementation for TimeLineReader and TimeLineWriter classes. > These are useful for functional testing of writer and reader path for ATSv2
[jira] [Commented] (YARN-8777) Container Executor C binary change to execute interactive docker command
[ https://issues.apache.org/jira/browse/YARN-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638503#comment-16638503 ] Eric Yang commented on YARN-8777: - Patch 6 fixes whitespace and the cc return code check. The TestNMProxy unit test failure is not related to this JIRA; opened YARN-8844 to track that issue. > Container Executor C binary change to execute interactive docker command > > > Key: YARN-8777 > URL: https://issues.apache.org/jira/browse/YARN-8777 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zian Chen >Assignee: Eric Yang >Priority: Major > Labels: Docker > Attachments: YARN-8777.001.patch, YARN-8777.002.patch, > YARN-8777.003.patch, YARN-8777.004.patch, YARN-8777.005.patch, > YARN-8777.006.patch > > > Since Container Executor provides container execution using the native > container-executor binary, we also need to make changes to accept a new > “dockerExec” method to invoke the corresponding native function to execute > the docker exec command against the running container.
[jira] [Created] (YARN-8844) TestNMProxy unit test is failing
Eric Yang created YARN-8844: --- Summary: TestNMProxy unit test is failing Key: YARN-8844 URL: https://issues.apache.org/jira/browse/YARN-8844 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 3.3.0 Reporter: Eric Yang TestNMProxy has been failing in trunk for the last two or three weeks. Investigating the failure.
[jira] [Updated] (YARN-8777) Container Executor C binary change to execute interactive docker command
[ https://issues.apache.org/jira/browse/YARN-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-8777: Attachment: YARN-8777.006.patch
[jira] [Commented] (YARN-6989) Ensure timeline service v2 codebase gets UGI from HttpServletRequest in a consistent way
[ https://issues.apache.org/jira/browse/YARN-6989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638418#comment-16638418 ] Vrushali C commented on YARN-6989: -- +1 LGTM. Committing shortly. Thanks [~abmodi] for the patch! > Ensure timeline service v2 codebase gets UGI from HttpServletRequest in a > consistent way > > > Key: YARN-6989 > URL: https://issues.apache.org/jira/browse/YARN-6989 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Vrushali C >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-6989.001.patch, YARN-6989.002.patch > > > As noticed during discussions in YARN-6820, the webservices in timeline > service v2 get the UGI created from the user obtained by invoking > getRemoteUser on the HttpServletRequest. > It would be good to use getUserPrincipal instead of invoking getRemoteUser on > the HttpServletRequest. > Filing this JIRA to update the code. > Per the Java EE documentation for 6 and 7, the behavior around getRemoteUser and > getUserPrincipal is described at: > http://docs.oracle.com/javaee/6/tutorial/doc/gjiie.html#bncba > https://docs.oracle.com/javaee/7/tutorial/security-webtier003.htm
{code}
getRemoteUser, which determines the user name with which the client
authenticated. The getRemoteUser method returns the name of the remote user
(the caller) associated by the container with the request. If no user has
been authenticated, this method returns null.

getUserPrincipal, which determines the principal name of the current user and
returns a java.security.Principal object. If no user has been authenticated,
this method returns null. Calling the getName method on the Principal
returned by getUserPrincipal returns the name of the remote user.
{code}
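A sketch of the consistent accessor the JIRA asks for, deriving the caller name from getUserPrincipal rather than getRemoteUser. The Request interface below is a minimal stand-in for javax.servlet.http.HttpServletRequest so the sketch stays self-contained; it is not the actual ATSv2 code.

```java
import java.security.Principal;

public class CallerUser {
    // Minimal stand-in for the one HttpServletRequest method used here.
    public interface Request {
        Principal getUserPrincipal();
    }

    // Returns the remote user's name via getUserPrincipal, or null when no
    // user has been authenticated, matching getRemoteUser's null contract.
    public static String callerUserName(Request req) {
        Principal p = req.getUserPrincipal();
        return p == null ? null : p.getName();
    }
}
```

A UGI could then be created from this single code path in every webservice, rather than each endpoint calling getRemoteUser directly.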
[jira] [Commented] (YARN-4254) ApplicationAttempt stuck for ever due to UnknowHostexception
[ https://issues.apache.org/jira/browse/YARN-4254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638385#comment-16638385 ] Bibin A Chundatt commented on YARN-4254: Thank you [~jlowe] for the review. Attached a patch handling checkstyle and checkIpHostnameRegistration. > ApplicationAttempt stuck for ever due to UnknowHostexception > > > Key: YARN-4254 > URL: https://issues.apache.org/jira/browse/YARN-4254 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Major > Attachments: 0001-YARN-4254.patch, Logs.txt, Test.patch, > YARN-4254.002.patch, YARN-4254.003.patch > > > Scenario > === > 1. RM HA and 5 NMs are available in the cluster and working fine. > 2. Add one more NM to the same cluster, but the RM's /etc/hosts is not updated. > 3. Submit an application to the same cluster. > If the AM gets allocated to the newly added NM, the *application attempt will get > stuck for ever*. The user will not get to know why this happened. > Impact: > 1. RM logs get overloaded with exceptions. > 2. The application gets stuck for ever. > Handling suggestion: YARN-261 allows failing the application attempt. > If we fail it, the next attempt could get assigned to another NM.
[jira] [Updated] (YARN-4254) ApplicationAttempt stuck forever due to UnknownHostException
[ https://issues.apache.org/jira/browse/YARN-4254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-4254: --- Attachment: YARN-4254.003.patch > ApplicationAttempt stuck forever due to UnknownHostException > > > Key: YARN-4254 > URL: https://issues.apache.org/jira/browse/YARN-4254 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Major > Attachments: 0001-YARN-4254.patch, Logs.txt, Test.patch, > YARN-4254.002.patch, YARN-4254.003.patch > > > Scenario > === > 1. RM HA and 5 NMs are available in the cluster and working fine > 2. Add one more NM to the same cluster, but the RM /etc/hosts is not updated. > 3. Submit an application to the same cluster > If the AM gets allocated to the newly added NM, the *application attempt will get > stuck forever*. The user will not get to know why this happened. > Impact > 1. RM logs get overloaded with exceptions > 2. The application gets stuck forever. > Handling suggestion: YARN-261 allows failing an application attempt. > If we fail it, the next attempt could get assigned to another NM. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8834) Provide Java client for fetching entities from TimelineReader
[ https://issues.apache.org/jira/browse/YARN-8834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638339#comment-16638339 ] Hadoop QA commented on YARN-8834: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 24s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 56s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 43s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 25s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 44s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 28s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 25s{color} | {color:green} hadoop-yarn-common in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 26s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 52m 5s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common | | | org.apache.hadoop.yarn.client.api.impl.TimelineReaderClientImpl.mergeFilters(MultivaluedMap, Map) makes inefficient use of keySet iterator instead of entrySet iterator At TimelineReaderClientImpl.java:[line 204] | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 | | JIRA Issue | YARN-8834 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12942373/YARN-8834.003.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux e8b1bcdd5546 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 81f635f | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | |
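The FindBugs warning above (keySet iterator instead of entrySet iterator) refers to a common Java inefficiency: iterating `keySet()` and calling `get(key)` performs a second lookup for every entry. The sketch below illustrates the fix with a hypothetical `merge` stand-in; it is not the actual `TimelineReaderClientImpl.mergeFilters` code.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class FilterMerge {
    // Merge extra filters into a copy of the base map, keeping base
    // values on key collisions. Iterating entrySet() gives key and value
    // in one pass, avoiding the per-key get() that FindBugs flags.
    public static Map<String, String> merge(Map<String, String> base,
                                            Map<String, String> extra) {
        Map<String, String> out = new LinkedHashMap<>(base);
        for (Map.Entry<String, String> e : extra.entrySet()) {
            out.putIfAbsent(e.getKey(), e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> base = new HashMap<>();
        base.put("appid", "a1");
        Map<String, String> extra = new HashMap<>();
        extra.put("appid", "a2");
        extra.put("flowname", "f1");
        System.out.println(merge(base, extra)); // {appid=a1, flowname=f1}
    }
}
```

The behavior is identical either way; the entrySet form just does one hash lookup per entry instead of two, which is why FindBugs reports the keySet form as inefficient rather than incorrect.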
[jira] [Commented] (YARN-8843) updateNodeResource does not support units for memory
[ https://issues.apache.org/jira/browse/YARN-8843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638344#comment-16638344 ] Íñigo Goiri commented on YARN-8843: --- [~sunilg], the behavior is broken. If you try the unit test without the fix, you'll see that the resource is 1MB. I saw this very same behavior in my cluster; I set the memory to 1Gi and the memory for the NM becomes 1MB. I'm not sure if the fix I proposed is in the best place but the unit test is. > updateNodeResource does not support units for memory > > > Key: YARN-8843 > URL: https://issues.apache.org/jira/browse/YARN-8843 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri >Priority: Minor > Attachments: YARN-8843.000.patch > > > When doing -updateNodeResource memory-mb=1Gi, it assigns 1MB. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
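The bug described above is a unit-parsing one: `-updateNodeResource memory-mb=1Gi` drops the `Gi` suffix and assigns 1 MB. A hedged sketch of the intended conversion follows; `toMebibytes` is an invented helper for illustration, not the actual YARN-8843 patch code, and it assumes the CLI's default unit for bare numbers is mebibytes.

```java
public class MemArg {
    // Honour a binary-unit suffix on a memory argument instead of
    // silently treating "1Gi" as 1 (MB), which is the behavior
    // YARN-8843 reports.
    public static long toMebibytes(String arg) {
        String s = arg.trim();
        if (s.endsWith("Gi")) {
            return Long.parseLong(s.substring(0, s.length() - 2)) * 1024L;
        }
        if (s.endsWith("Mi")) {
            return Long.parseLong(s.substring(0, s.length() - 2));
        }
        // Bare numbers keep the CLI's default memory unit.
        return Long.parseLong(s);
    }

    public static void main(String[] args) {
        System.out.println(toMebibytes("1Gi")); // 1024, not 1
        System.out.println(toMebibytes("512")); // 512
    }
}
```

This matches the unit test's expectation in the comment above: setting 1Gi must yield 1024 MB of node memory, not 1 MB.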
[jira] [Commented] (YARN-8843) updateNodeResource does not support units for memory
[ https://issues.apache.org/jira/browse/YARN-8843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638349#comment-16638349 ] Íñigo Goiri commented on YARN-8843: --- I see YARN-7159 is in 3.2. I tested with 3.1.1 and that's where I saw the issue. However, I verified that trunk also has the issue. Do you mind running the unit test with my change to verify it is broken there? > updateNodeResource does not support units for memory > > > Key: YARN-8843 > URL: https://issues.apache.org/jira/browse/YARN-8843 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri >Priority: Minor > Attachments: YARN-8843.000.patch > > > When doing -updateNodeResource memory-mb=1Gi, it assigns 1MB. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8843) updateNodeResource does not support units for memory
[ https://issues.apache.org/jira/browse/YARN-8843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638350#comment-16638350 ] Sunil Govindan commented on YARN-8843: -- Thanks. As per my understanding, this should ideally not happen. So let me recheck again. > updateNodeResource does not support units for memory > > > Key: YARN-8843 > URL: https://issues.apache.org/jira/browse/YARN-8843 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri >Priority: Minor > Attachments: YARN-8843.000.patch > > > When doing -updateNodeResource memory-mb=1Gi, it assigns 1MB. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6668) Use cgroup to get container resource utilization
[ https://issues.apache.org/jira/browse/YARN-6668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-6668: -- Issue Type: Improvement (was: Sub-task) Parent: (was: YARN-1011) > Use cgroup to get container resource utilization > > > Key: YARN-6668 > URL: https://issues.apache.org/jira/browse/YARN-6668 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 3.0.0-alpha3 >Reporter: Haibo Chen >Assignee: Miklos Szegedi >Priority: Major > Attachments: YARN-6668.000.patch, YARN-6668.001.patch, > YARN-6668.002.patch, YARN-6668.003.patch, YARN-6668.004.patch, > YARN-6668.005.patch, YARN-6668.006.patch, YARN-6668.007.patch, > YARN-6668.008.patch, YARN-6668.009.patch > > > The Container Monitor relies on the proc file system to get container resource > utilization, which is not as efficient as reading cgroup accounting. When > cgroups are enabled, the NM should read cgroup stats instead. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
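The efficiency argument in YARN-6668 is that cgroup accounting files are single pre-aggregated counters, whereas /proc requires walking every process in the container. A hedged sketch of reading such a counter (cgroup v1 layout assumed; the helper names are invented for illustration, not the actual patch code):

```java
public class CgroupStats {
    // cgroup v1 accounting files such as cpuacct.usage (cumulative CPU
    // time in nanoseconds) and memory.usage_in_bytes each hold a single
    // ASCII number, so "reading utilization" is one parse per container
    // instead of a walk over /proc/<pid>/stat for every process.
    public static long parseCounter(String fileContents) {
        return Long.parseLong(fileContents.trim());
    }

    // Convert a cpuacct.usage delta over a sampling interval into
    // percent of one core.
    public static float cpuPercentOfOneCore(long deltaNanos, long intervalNanos) {
        return 100f * deltaNanos / intervalNanos;
    }

    public static void main(String[] args) {
        // 2.5s of CPU time consumed during a 10s sampling interval.
        long usage = parseCounter("2500000000\n");
        System.out.println(cpuPercentOfOneCore(usage, 10_000_000_000L)); // 25.0
    }
}
```

In a real NM the file contents would come from something like `/sys/fs/cgroup/cpuacct/<hierarchy>/<container>/cpuacct.usage`; the exact path depends on the cgroup mount and YARN's hierarchy configuration.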
[jira] [Commented] (YARN-8843) updateNodeResource does not support units for memory
[ https://issues.apache.org/jira/browse/YARN-8843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638329#comment-16638329 ] Sunil Govindan commented on YARN-8843: -- [~elgoiri] is this change needed? Internally, while we use the Resource object within the RM, the units will be the same. Over the wire (through proto), there is a chance of a different unit. YARN-7159 has fixed this problem. So I am a bit confused about why this is needed; did you see any problem? Could you please confirm? > updateNodeResource does not support units for memory > > > Key: YARN-8843 > URL: https://issues.apache.org/jira/browse/YARN-8843 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri >Priority: Minor > Attachments: YARN-8843.000.patch > > > When doing -updateNodeResource memory-mb=1Gi, it assigns 1MB. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6843) Add unit test for NM to preempt OPPORTUNISTIC containers under high utilization
[ https://issues.apache.org/jira/browse/YARN-6843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-6843: -- Issue Type: Bug (was: Sub-task) Parent: (was: YARN-1011) > Add unit test for NM to preempt OPPORTUNISTIC containers under high > utilization > --- > > Key: YARN-6843 > URL: https://issues.apache.org/jira/browse/YARN-6843 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.0.0-alpha4 >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6806) Scheduler-agnostic RM changes support oversubscription
[ https://issues.apache.org/jira/browse/YARN-6806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-6806: -- Issue Type: Bug (was: Sub-task) Parent: (was: YARN-1011) > Scheduler-agnostic RM changes support oversubscription > -- > > Key: YARN-6806 > URL: https://issues.apache.org/jira/browse/YARN-6806 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6800) Add opportunity to start containers while periodically checking for preemption
[ https://issues.apache.org/jira/browse/YARN-6800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-6800: -- Issue Type: Bug (was: Sub-task) Parent: (was: YARN-1011) > Add opportunity to start containers while periodically checking for preemption > -- > > Key: YARN-6800 > URL: https://issues.apache.org/jira/browse/YARN-6800 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > Fix For: YARN-1011 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6796) Add unit test for NM to launch OPPORTUNISTIC container for overallocation
[ https://issues.apache.org/jira/browse/YARN-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-6796: -- Issue Type: Bug (was: Sub-task) Parent: (was: YARN-1011) > Add unit test for NM to launch OPPORTUNISTIC container for overallocation > - > > Key: YARN-6796 > URL: https://issues.apache.org/jira/browse/YARN-6796 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7015) Handle Container ExecType update (Promotion/Demotion) in cgroups resource handlers
[ https://issues.apache.org/jira/browse/YARN-7015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-7015: -- Issue Type: Improvement (was: Sub-task) Parent: (was: YARN-1011) > Handle Container ExecType update (Promotion/Demotion) in cgroups resource > handlers > -- > > Key: YARN-7015 > URL: https://issues.apache.org/jira/browse/YARN-7015 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Arun Suresh >Assignee: Miklos Szegedi >Priority: Major > > YARN-5085 adds support for change of container execution type > (Promotion/Demotion). > Modifications to the ContainerManagementProtocol, ContainerManager and > ContainerScheduler to handle this change are now in trunk. Opening this JIRA > to track changes (if any) required in the cgroups resourcehandlers to > accommodate this in the context of YARN-1011. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6673) Add cpu cgroup configurations for opportunistic containers
[ https://issues.apache.org/jira/browse/YARN-6673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-6673: -- Issue Type: Bug (was: Sub-task) Parent: (was: YARN-1011) > Add cpu cgroup configurations for opportunistic containers > -- > > Key: YARN-6673 > URL: https://issues.apache.org/jira/browse/YARN-6673 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Haibo Chen >Assignee: Miklos Szegedi >Priority: Major > Fix For: 3.0.0-beta1 > > Attachments: YARN-6673.000.patch > > > In addition to setting cpu.cfs_period_us on a per-container basis, we could > also set cpu.shares to 2 for opportunistic containers so they are run on a > best-effort basis -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
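The `cpu.shares = 2` idea in the YARN-6673 description can be sketched as a tiny policy function. The values are taken from the JIRA text and general cgroup v1 conventions, not from the actual YARN implementation, and `sharesFor` is an invented name for illustration.

```java
public class CpuShares {
    // Pick a cgroup v1 cpu.shares value by container execution type:
    // guaranteed containers get weight proportional to their vcores
    // (1024 is the cgroup default weight for one CPU "unit"), while
    // opportunistic containers get the minimum of 2 so they only run
    // on a best-effort basis when spare cycles exist.
    public static int sharesFor(boolean opportunistic, int vcores) {
        return opportunistic ? 2 : 1024 * vcores;
    }

    public static void main(String[] args) {
        System.out.println(sharesFor(true, 4));  // 2
        System.out.println(sharesFor(false, 4)); // 4096
    }
}
```

Because cpu.shares is a relative weight, a 2-versus-4096 ratio effectively starves opportunistic containers only under contention; when the node is idle they can still use all available CPU, which is exactly the overallocation behavior the umbrella JIRA is after.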
[jira] [Updated] (YARN-6674) Add memory cgroup settings for opportunistic containers
[ https://issues.apache.org/jira/browse/YARN-6674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-6674: -- Issue Type: Improvement (was: Bug) > Add memory cgroup settings for opportunistic containers > --- > > Key: YARN-6674 > URL: https://issues.apache.org/jira/browse/YARN-6674 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 3.0.0-alpha3 >Reporter: Haibo Chen >Assignee: Miklos Szegedi >Priority: Major > Fix For: 3.0.0-beta1 > > Attachments: YARN-6674.000.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6674) Add memory cgroup settings for opportunistic containers
[ https://issues.apache.org/jira/browse/YARN-6674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-6674: -- Issue Type: Bug (was: Sub-task) Parent: (was: YARN-1011) > Add memory cgroup settings for opportunistic containers > --- > > Key: YARN-6674 > URL: https://issues.apache.org/jira/browse/YARN-6674 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.0.0-alpha3 >Reporter: Haibo Chen >Assignee: Miklos Szegedi >Priority: Major > Fix For: 3.0.0-beta1 > > Attachments: YARN-6674.000.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6671) Add container type awareness in ResourceHandlers.
[ https://issues.apache.org/jira/browse/YARN-6671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-6671: -- Issue Type: Improvement (was: Sub-task) Parent: (was: YARN-1011) > Add container type awareness in ResourceHandlers. > - > > Key: YARN-6671 > URL: https://issues.apache.org/jira/browse/YARN-6671 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 3.0.0-alpha3 >Reporter: Haibo Chen >Assignee: Miklos Szegedi >Priority: Major > > When using LCE, different cgroup settings for opportunistic and guaranteed > containers can be used to ensure isolation between them. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-4293) ResourceUtilization should be a part of yarn node CLI
[ https://issues.apache.org/jira/browse/YARN-4293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-4293: -- Issue Type: Improvement (was: Sub-task) Parent: (was: YARN-1011) > ResourceUtilization should be a part of yarn node CLI > - > > Key: YARN-4293 > URL: https://issues.apache.org/jira/browse/YARN-4293 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Wangda Tan >Assignee: Sunil Govindan >Priority: Major > Fix For: 2.8.0, 3.0.0-alpha1 > > Attachments: 0001-YARN-4293.patch, 0002-YARN-4293.patch, > 0003-YARN-4293.patch > > > In order to get resource utilization information easier, "yarn node" CLI > should include resource utilization on the node. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org