[jira] [Updated] (YARN-8456) Fix a bug when the user leaves the FPGA discovery executable path configuration at its default but sets the OpenCL SDK path environment variable

2018-06-22 Thread Zhankun Tang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhankun Tang updated YARN-8456:
---
Summary: Fix a bug when the user leaves the FPGA discovery executable path 
configuration at its default but sets the OpenCL SDK path environment variable  (was: Fix 
a bug when the user leaves the FPGA discovery executable path configuration empty but 
sets the OpenCL SDK path environment variable)

> Fix a bug when the user leaves the FPGA discovery executable path configuration 
> at its default but sets the OpenCL SDK path environment variable
> --
>
> Key: YARN-8456
> URL: https://issues.apache.org/jira/browse/YARN-8456
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Zhankun Tang
>Assignee: Zhankun Tang
>Priority: Major
>
> *Issue:*
> When the user doesn't configure 
> "yarn.nodemanager.resource-plugins.fpga.path-to-discovery-executables" in 
> yarn-site.xml but has the "ALTERAOCLSDKROOT" environment variable set, the FPGA 
> discoverer cannot find the correct executable path (with 
> IntelFPGAOpenclPlugin).
> *Reason:*
> In IntelFPGAOpenclPlugin, the current code builds an incorrect path string after 
> getting the environment variable value. It should append "/bin/<binary name>"; 
> otherwise FPGA resource discovery fails.
>  
> *Solution:*
> Fix the path construction code in IntelFPGAOpenclPlugin.
>  
> *MISC:*
> The patch also corrects some minor errors:
>  # Change _"yarn-io/gpu"_ and _"yarn-io/fpga"_ to _"yarn.io/gpu"_ and 
> _"yarn.io/fpga"_ in the documentation
>  # Use _"auto"_ as the default value for 
> "yarn.nodemanager.resource-plugins.fpga.allowed-fpga-devices". The original 
> "0,1" doesn't cause any problem, but "auto" is a better default
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8456) Fix a bug when the user leaves the FPGA discovery executable path configuration empty but sets the OpenCL SDK path environment variable

2018-06-22 Thread Zhankun Tang (JIRA)
Zhankun Tang created YARN-8456:
--

 Summary: Fix a bug when the user leaves the FPGA discovery executable path 
configuration empty but sets the OpenCL SDK path environment variable
 Key: YARN-8456
 URL: https://issues.apache.org/jira/browse/YARN-8456
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: yarn
Reporter: Zhankun Tang


*Issue:*
When the user doesn't configure 
"yarn.nodemanager.resource-plugins.fpga.path-to-discovery-executables" in 
yarn-site.xml but has the "ALTERAOCLSDKROOT" environment variable set, the FPGA 
discoverer cannot find the correct executable path (with IntelFPGAOpenclPlugin).

*Reason:*

In IntelFPGAOpenclPlugin, the current code builds an incorrect path string after 
getting the environment variable value. It should append "/bin/<binary name>"; 
otherwise FPGA resource discovery fails.

 

*Solution:*

Fix the path construction code in IntelFPGAOpenclPlugin.
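
A minimal sketch of what the corrected path construction could look like. This 
is an illustration under assumptions, not the actual IntelFPGAOpenclPlugin 
code; the binary name and variable names here are hypothetical:
{code:java}
// Sketch: derive the discovery executable path from the OpenCL SDK root
// when the discovery-executables configuration is left at its default.
String sdkRoot = System.getenv("ALTERAOCLSDKROOT");
String binaryName = "aocl"; // hypothetical name of the discovery binary
String discoveryPath;
if (sdkRoot != null && !sdkRoot.isEmpty()) {
  // The bug: the previous code omitted the "/bin/<binary name>" suffix.
  discoveryPath = sdkRoot + "/bin/" + binaryName;
} else {
  // Fall back to resolving the binary via PATH.
  discoveryPath = binaryName;
}
{code}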

 

*MISC:*

The patch also corrects some minor errors:
 # Change _"yarn-io/gpu"_ and _"yarn-io/fpga"_ to _"yarn.io/gpu"_ and 
_"yarn.io/fpga"_ in the documentation
 # Use _"auto"_ as the default value for 
"yarn.nodemanager.resource-plugins.fpga.allowed-fpga-devices". The original 
"0,1" doesn't cause any problem, but "auto" is a better default

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-8456) Fix a bug when the user leaves the FPGA discovery executable path configuration empty but sets the OpenCL SDK path environment variable

2018-06-22 Thread Zhankun Tang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhankun Tang reassigned YARN-8456:
--

Assignee: Zhankun Tang

> Fix a bug when the user leaves the FPGA discovery executable path configuration 
> empty but sets the OpenCL SDK path environment variable
> 
>
> Key: YARN-8456
> URL: https://issues.apache.org/jira/browse/YARN-8456
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Zhankun Tang
>Assignee: Zhankun Tang
>Priority: Major
>
> *Issue:*
> When the user doesn't configure 
> "yarn.nodemanager.resource-plugins.fpga.path-to-discovery-executables" in 
> yarn-site.xml but has the "ALTERAOCLSDKROOT" environment variable set, the FPGA 
> discoverer cannot find the correct executable path (with 
> IntelFPGAOpenclPlugin).
> *Reason:*
> In IntelFPGAOpenclPlugin, the current code builds an incorrect path string after 
> getting the environment variable value. It should append "/bin/<binary name>"; 
> otherwise FPGA resource discovery fails.
>  
> *Solution:*
> Fix the path construction code in IntelFPGAOpenclPlugin.
>  
> *MISC:*
> The patch also corrects some minor errors:
>  # Change _"yarn-io/gpu"_ and _"yarn-io/fpga"_ to _"yarn.io/gpu"_ and 
> _"yarn.io/fpga"_ in the documentation
>  # Use _"auto"_ as the default value for 
> "yarn.nodemanager.resource-plugins.fpga.allowed-fpga-devices". The original 
> "0,1" doesn't cause any problem, but "auto" is a better default
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6672) Add NM preemption of opportunistic containers when utilization goes high

2018-06-22 Thread Haibo Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-6672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-6672:
-
Attachment: YARN-6672-YARN-1011.02.patch

> Add NM preemption of opportunistic containers when utilization goes high
> 
>
> Key: YARN-6672
> URL: https://issues.apache.org/jira/browse/YARN-6672
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Major
> Attachments: YARN-6672-YARN-1011.00.patch, 
> YARN-6672-YARN-1011.01.patch, YARN-6672-YARN-1011.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6672) Add NM preemption of opportunistic containers when utilization goes high

2018-06-22 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520973#comment-16520973
 ] 

genericqa commented on YARN-6672:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
34s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} YARN-1011 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 
43s{color} | {color:green} YARN-1011 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} YARN-1011 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} YARN-1011 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} YARN-1011 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 16s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
53s{color} | {color:green} YARN-1011 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} YARN-1011 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
21s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
20s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 20s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 20s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 1 new + 5 unchanged - 0 fixed = 6 total (was 5) {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
22s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red}  4m  
9s{color} | {color:red} patch has errors when building and testing our client 
artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
22s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
21s{color} | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager
 generated 1 new + 9 unchanged - 0 fixed = 10 total (was 9) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 23s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 47m 51s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-6672 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12928852/YARN-6672-YARN-1011.01.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 1847eb92b37b 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | YARN-1011 / e0e6460 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-YARN-Build/21086/artifact/out/patch-mvninstall-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-serve

[jira] [Commented] (YARN-8423) GPU does not get released even though the application gets killed.

2018-06-22 Thread Wangda Tan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520968#comment-16520968
 ] 

Wangda Tan commented on YARN-8423:
--

Thanks [~sunilg], 

Overall looks good, except: 
{code:java}
266 if (container.isContainerInFinalStates()) {
267 releasingGpus++;
268 }{code}
Instead of ++, you should add the actual number of allocated GPUs of the 
container, as in the sketch below.

And even though the patch looks safe, could you add a test to make sure there's no 
regression in the future?
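
For illustration, a hedged sketch of the suggested change (the GPU-lookup 
helper below is hypothetical, not necessarily the real NodeManager container 
API):
{code:java}
// Count every GPU assigned to a container in its final states,
// instead of counting the containers themselves.
if (container.isContainerInFinalStates()) {
  // getAssignedGpusFor(...) is a hypothetical helper returning the
  // list of GPU devices currently allocated to this container.
  releasingGpus += getAssignedGpusFor(container).size();
}
{code}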

> GPU does not get released even though the application gets killed.
> --
>
> Key: YARN-8423
> URL: https://issues.apache.org/jira/browse/YARN-8423
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Sumana Sathish
>Assignee: Sunil Govindan
>Priority: Critical
> Attachments: YARN-8423.001.patch, YARN-8423.002.patch, 
> kill-container-nm.log
>
>
> Run a TensorFlow app requesting one GPU.
> Kill the application once the GPU is allocated.
> Query the NodeManager once the application is killed. We see that the GPU is not 
> being released.
> {code}
>  curl -i /ws/v1/node/resources/yarn.io%2Fgpu
> {"gpuDeviceInformation":{"gpus":[{"productName":"","uuid":"GPU-","minorNumber":0,"gpuUtilizations":{"overallGpuUtilization":0.0},"gpuMemoryUsage":{"usedMemoryMiB":73,"availMemoryMiB":12125,"totalMemoryMiB":12198},"temperature":{"currentGpuTemp":28.0,"maxGpuTemp":85.0,"slowThresholdGpuTemp":82.0}},{"productName":"","uuid":"GPU-","minorNumber":1,"gpuUtilizations":{"overallGpuUtilization":0.0},"gpuMemoryUsage":{"usedMemoryMiB":73,"availMemoryMiB":12125,"totalMemoryMiB":12198},"temperature":{"currentGpuTemp":28.0,"maxGpuTemp":85.0,"slowThresholdGpuTemp":82.0}}],"driverVersion":""},"totalGpuDevices":[{"index":0,"minorNumber":0},{"index":1,"minorNumber":1}],"assignedGpuDevices":[{"index":0,"minorNumber":0,"containerId":"container_"}]}
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6672) Add NM preemption of opportunistic containers when utilization goes high

2018-06-22 Thread Haibo Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-6672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-6672:
-
Attachment: YARN-6672-YARN-1011.01.patch

> Add NM preemption of opportunistic containers when utilization goes high
> 
>
> Key: YARN-6672
> URL: https://issues.apache.org/jira/browse/YARN-6672
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Major
> Attachments: YARN-6672-YARN-1011.00.patch, 
> YARN-6672-YARN-1011.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8326) Yarn 3.0 seems to run slower than Yarn 2.6

2018-06-22 Thread Shane Kumpf (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520939#comment-16520939
 ] 

Shane Kumpf commented on YARN-8326:
---

Thank you for reporting the issue, [~hlhu...@us.ibm.com], and thanks for the 
analysis and commit, [~eyang]!

> Yarn 3.0 seems to run slower than Yarn 2.6
> 
>
> Key: YARN-8326
> URL: https://issues.apache.org/jira/browse/YARN-8326
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0
> Environment: This is the yarn-site.xml for 3.0. 
> 
> hadoop.registry.dns.bind-port = 5353
> hadoop.registry.dns.domain-name = hwx.site
> hadoop.registry.dns.enabled = true
> hadoop.registry.dns.zone-mask = 255.255.255.0
> hadoop.registry.dns.zone-subnet = 172.17.0.0
> manage.include.files = false
> yarn.acl.enable = false
> yarn.admin.acl = yarn
> yarn.client.nodemanager-connect.max-wait-ms = 6
> yarn.client.nodemanager-connect.retry-interval-ms = 1
> yarn.http.policy = HTTP_ONLY
> yarn.log-aggregation-enable = false
> yarn.log-aggregation.retain-seconds = 2592000
> yarn.log.server.url = [http://xx:19888/jobhistory/logs|http://whiny2.fyre.ibm.com:19888/jobhistory/logs]
> yarn.log.server.web-service.url = [http://xx:8188/ws/v1/applicationhistory|http://whiny2.fyre.ibm.com:8188/ws/v1/applicationhistory]
> yarn.node-labels.enabled = false
> yarn.node-labels.fs-store.retry-policy-spec = 2000, 500
> yarn.node-labels.fs-store.root-dir = /system/yarn/node-labels
> yarn.nodemanager.address = 0.0.0.0:45454
> yarn.nodemanager.admin-env = MALLOC_ARENA_MAX=$MALLOC_ARENA_MAX
> yarn.nodemanager.aux-services = mapreduce_shuffle,spark2_shuffle,timeline_collector
> yarn.nodemanager.aux-services.mapreduce_shuffle.class = org.apache.hadoop.mapred.ShuffleHandler
> yarn.nodemanager.aux-services.spark2_shuffle.class = org.apache.spark.network.yarn.YarnShuffleService
> yarn.nodemanager.aux-services.spark2_shuffle.classpath = /usr/spark2/aux/*
> yarn.nodemanager.aux-services.spark_shuffle.class = org.apache.spark.network.yarn.YarnShuffleService
> yarn.nodemanager.aux-services.timeline_collector.class = org.apache.hadoop.yarn.server.timelineservice.collector.PerNodeTimelineCollectorsAuxService
> yarn.nodemanager.bind-host = 0.0.0.0
> yarn.nodemanager.container-executor.class = org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor
> yarn.nodemanager.container-metrics.unregister-delay-ms = 6
> yarn.nodemanager.container-monitor.interval-ms = 3000
> yarn.nodemanager.delete.debug-delay-sec = 0
> yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage = 90
> yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb = 1000
> yarn.nodemanager.disk-health-checker.min-healthy-disks = 0.25
> yarn.nodemanager.health-checker.interval-ms = 135000
> yarn.nodemanager.health-checker.script.timeout-ms = 6
> yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage = false
> yarn.nodemanager.linux-container-executor.group = hadoop
> yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users = false
> yarn.nodemanager.local-dirs = /hadoop/yarn/local
> yarn.nodemanager.log-aggregation.compression-type = gz
> yarn.nodemanager.log-aggregation.debug-enabled = false
> yarn.nodemanager.log-aggregation.num-log-files-per-app = 30
> yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds = 3600
> yarn.nodemanager.log-dirs = /hadoop/yarn/log
> yarn.nodemanager.log.retain-seconds = 604800
> yarn.nodemanager.pmem-check-enabled = false
> yarn.nodemanager.recovery.dir = /var/log/hadoop-yarn/nodemanager/recovery-state
> yarn.nodemanager.recovery.enabled = true
> yarn.nodemanager.recovery.supervised = true
> yarn.nodemanager.remote-app-log-dir = /app-logs
> yarn.nodemanager.remote-app-log-dir-suffix = logs
> yarn.nodemanager.resource-plugins =
> yarn.nodemanager.resource-plugins.gpu.allowed-gpu-devices = auto
> yarn.nodemanager.resource-plugins.gpu.docker-plugin = nvidia-docker-v1
> yarn.nodemanager.resource-plugins.gpu.docker-plugin.nvidiadocker-v1.endpoint = [http://localhost:3476/v1.0/docker/cli]
> yarn.nodemanager.resource-plugins.gpu.path-to-discovery-executables =
> yarn.nodemanager.resource.cpu-vcores = 6
> yarn.nodemanager.resource.memory-mb

[jira] [Commented] (YARN-8452) FairScheduler.update can take a long time if yarn.scheduler.fair.sizebasedweight is on

2018-06-22 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520933#comment-16520933
 ] 

genericqa commented on YARN-8452:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
45s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  6s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 32s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 66m 
56s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}128m 15s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8452 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12928827/YARN-8452.000.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux f77e4a8dd0cb 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 1cdce86 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/21085/testReport/ |
| Max. process+thread count | 943 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/21085/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> FairScheduler.update can take lon

[jira] [Commented] (YARN-6672) Add NM preemption of opportunistic containers when utilization goes high

2018-06-22 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520916#comment-16520916
 ] 

genericqa commented on YARN-6672:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
46s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} YARN-1011 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 31m 
 9s{color} | {color:green} YARN-1011 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} YARN-1011 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} YARN-1011 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
38s{color} | {color:green} YARN-1011 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 29s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
5s{color} | {color:green} YARN-1011 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} YARN-1011 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 22s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 5 new + 5 unchanged - 0 fixed = 10 total (was 5) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 39s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
21s{color} | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager
 generated 21 new + 9 unchanged - 0 fixed = 30 total (was 9) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 23s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 63m 55s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-6672 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12928839/YARN-6672-YARN-1011.00.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 4d380e87cf98 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | YARN-1011 / e0e6460 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/21084/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
| javadoc | 
https://builds.apache.org/job/PreCommit-YARN-Build/21084/artifact/out/diff-javadoc-javadoc-ha

[jira] [Commented] (YARN-8326) Yarn 3.0 seems to run slower than Yarn 2.6

2018-06-22 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520898#comment-16520898
 ] 

Hudson commented on YARN-8326:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14469 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14469/])
YARN-8326.  Removed exit code file check for launched container. 
(eyang: rev 8a32bc39eb210fca8052c472601e24c2446b4cc2)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java


> Yarn 3.0 seems to run slower than Yarn 2.6
> 
>
> Key: YARN-8326
> URL: https://issues.apache.org/jira/browse/YARN-8326
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0
> Environment: This is the yarn-site.xml for 3.0. 
> 
> hadoop.registry.dns.bind-port = 5353
> hadoop.registry.dns.domain-name = hwx.site
> hadoop.registry.dns.enabled = true
> hadoop.registry.dns.zone-mask = 255.255.255.0
> hadoop.registry.dns.zone-subnet = 172.17.0.0
> manage.include.files = false
> yarn.acl.enable = false
> yarn.admin.acl = yarn
> yarn.client.nodemanager-connect.max-wait-ms = 6
> yarn.client.nodemanager-connect.retry-interval-ms = 1
> yarn.http.policy = HTTP_ONLY
> yarn.log-aggregation-enable = false
> yarn.log-aggregation.retain-seconds = 2592000
> yarn.log.server.url = [http://xx:19888/jobhistory/logs|http://whiny2.fyre.ibm.com:19888/jobhistory/logs]
> yarn.log.server.web-service.url = [http://xx:8188/ws/v1/applicationhistory|http://whiny2.fyre.ibm.com:8188/ws/v1/applicationhistory]
> yarn.node-labels.enabled = false
> yarn.node-labels.fs-store.retry-policy-spec = 2000, 500
> yarn.node-labels.fs-store.root-dir = /system/yarn/node-labels
> yarn.nodemanager.address = 0.0.0.0:45454
> yarn.nodemanager.admin-env = MALLOC_ARENA_MAX=$MALLOC_ARENA_MAX
> yarn.nodemanager.aux-services = mapreduce_shuffle,spark2_shuffle,timeline_collector
> yarn.nodemanager.aux-services.mapreduce_shuffle.class = org.apache.hadoop.mapred.ShuffleHandler
> yarn.nodemanager.aux-services.spark2_shuffle.class = org.apache.spark.network.yarn.YarnShuffleService
> yarn.nodemanager.aux-services.spark2_shuffle.classpath = /usr/spark2/aux/*
> yarn.nodemanager.aux-services.spark_shuffle.class = org.apache.spark.network.yarn.YarnShuffleService
> yarn.nodemanager.aux-services.timeline_collector.class = org.apache.hadoop.yarn.server.timelineservice.collector.PerNodeTimelineCollectorsAuxService
> yarn.nodemanager.bind-host = 0.0.0.0
> yarn.nodemanager.container-executor.class = org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor
> yarn.nodemanager.container-metrics.unregister-delay-ms = 6
> yarn.nodemanager.container-monitor.interval-ms = 3000
> yarn.nodemanager.delete.debug-delay-sec = 0
> yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage = 90
> yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb = 1000
> yarn.nodemanager.disk-health-checker.min-healthy-disks = 0.25
> yarn.nodemanager.health-checker.interval-ms = 135000
> yarn.nodemanager.health-checker.script.timeout-ms = 6
> yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage = false
> yarn.nodemanager.linux-container-executor.group = hadoop
> yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users = false
> yarn.nodemanager.local-dirs = /hadoop/yarn/local
> yarn.nodemanager.log-aggregation.compression-type = gz
> yarn.nodemanager.log-aggregation.debug-enabled = false
> yarn.nodemanager.log-aggregation.num-log-files-per-app = 30
> yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds = 3600
> yarn.nodemanager.log-dirs = /hadoop/yarn/log
> yarn.nodemanager.log.retain-seconds = 604800
> yarn.nodemanager.pmem-check-enabled = false
> yarn.nodemanager.recovery.dir = /var/log/hadoop-yarn/nodemanager/recovery-state
> yarn.nodemanager.recovery.enabled = true
> yarn.nodemanager.recovery.supervised = true
> yarn.nodemanager.remote-app-log-dir = /app-logs
> yarn.nodemanager.remote-app-log-dir-suffix = logs
> yarn.nodemanager.resource-plugins =
> yarn.nodemanager.resource-plugins.gpu.allowed-gpu-devices = auto
> yarn.nodemanager.resource-plugins.gpu.docker-plugin

[jira] [Created] (YARN-8455) Add basic ACL check for all TS v2 REST APIs

2018-06-22 Thread Rohith Sharma K S (JIRA)
Rohith Sharma K S created YARN-8455:
---

 Summary: Add basic ACL check for all TS v2 REST APIs
 Key: YARN-8455
 URL: https://issues.apache.org/jira/browse/YARN-8455
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Rohith Sharma K S
Assignee: Rohith Sharma K S


YARN-8319 added a filter check for the flows pages. The same behavior needs to be 
added for all other REST APIs, as long as ATS provides support for ACLs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8103) Add CLI interface to query node attributes

2018-06-22 Thread Naganarasimha G R (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520880#comment-16520880
 ] 

Naganarasimha G R commented on YARN-8103:
-

Thanks for the patch [~bibinchundatt]. Sorry, my bad: I had done the validation 
wrong in "equals", and everything else is fine except for checkstyle issues 
from some unused imports. Also, one small suggestion: would it be good to sort 
the attribute listing based on the attribute prefix/name? (in 
all places: nodecli, cluster cli and nodeattributecli).

Hope others can also take a look, as it's almost ready to go in.

> Add CLI interface to query node attributes
> ---
>
> Key: YARN-8103
> URL: https://issues.apache.org/jira/browse/YARN-8103
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Major
> Attachments: YARN-8103-YARN-3409.001.patch, 
> YARN-8103-YARN-3409.002.patch, YARN-8103-YARN-3409.003.patch, 
> YARN-8103-YARN-3409.004.patch, YARN-8103-YARN-3409.005.patch, 
> YARN-8103-YARN-3409.WIP.patch
>
>
> YARN-8100 will add an API interface for querying the attributes. This adds a CLI 
> interface for querying node attributes for each node and for listing all attributes 
> in the cluster.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6672) Add NM preemption of opportunistic containers when utilization goes high

2018-06-22 Thread Haibo Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520853#comment-16520853
 ] 

Haibo Chen commented on YARN-6672:
--

Attached a patch to do the periodic NM preemption. In the meantime, I noticed 
that there is a disabled unit test in TestContainerSchedulerWithOverAllocation 
that was supposed to be removed in YARN-8427, so I did that too in this patch.

> Add NM preemption of opportunistic containers when utilization goes high
> 
>
> Key: YARN-6672
> URL: https://issues.apache.org/jira/browse/YARN-6672
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Major
> Attachments: YARN-6672-YARN-1011.00.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8434) Nodemanager not registering to active RM in federation

2018-06-22 Thread Subru Krishnan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520852#comment-16520852
 ] 

Subru Krishnan commented on YARN-8434:
--

[~bibinchundatt], thanks for reporting this. I would like to understand the 
context more: are you trying to use the {{FederationRMFailoverProxyProvider}} 
for NM - RM communication, as we use {{RequestHedgingRMFailoverProxyProvider}}? 
We currently use {{FederationRMFailoverProxyProvider}} for the AM - RM protocol.

> Nodemanager not registering to active RM in federation
> --
>
> Key: YARN-8434
> URL: https://issues.apache.org/jira/browse/YARN-8434
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Blocker
>
> FederationRMFailoverProxyProvider doesn't handle connecting to the active RM. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6672) Add NM preemption of opportunistic containers when utilization goes high

2018-06-22 Thread Haibo Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-6672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-6672:
-
Attachment: YARN-6672-YARN-1011.00.patch

> Add NM preemption of opportunistic containers when utilization goes high
> 
>
> Key: YARN-6672
> URL: https://issues.apache.org/jira/browse/YARN-6672
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Major
> Attachments: YARN-6672-YARN-1011.00.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4606) CapacityScheduler: applications could get starved because computation of #activeUsers considers pending apps

2018-06-22 Thread Eric Payne (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520822#comment-16520822
 ] 

Eric Payne commented on YARN-4606:
--

[~maniraj...@gmail.com], we can fix the queue application starvation problem by 
making most of the changes in the scheduler-specific users managers. For 
{{CapacityScheduler}}, all the changes can be done in the {{UsersManager}} 
class. For the other schedulers (Fifo, Fair, etc.), I think there needs to be 
some amount of change in the scheduler infrastructure classes to support 
retrieving information such as the number of pending and active apps per user, 
the amount of the queue's AM-limit resources, the amount of a user's used AM 
resources, etc. But I think that most of the changes can be done in 
{{ActiveUsersManager}} for the other schedulers as well.

I am attaching a POC patch that only modifies {{UsersManager}}. The 
{{UsersManager}} already keeps track of all users in the queue. Each user 
object keeps the number of active apps and the number of pending apps. Here is 
the sequence of events plus the proposed change:
 - When an application is submitted, the user object's pending apps count is 
incremented
 - If limits are not exceeded, {{LeafQueue}} activates the app
 -- {{LeafQueue#activateApplications}} already checks whether or not activation 
of an application will go over the queue's AM limit.
 -- If activating the application will not go over the queue's AM limit, 
{{LeafQueue#activateApplications}} will increment the user object's active app 
count and decrement the pending app count.
 -- However, if activating the application will go over the queue's AM limit, 
the user's pending app count remains the same.
 - The change made in {{YARN-4606.POC.3.patch}} is that 
{{UsersManager#activateApplication}} will check whether or not the user object 
has any active apps. If not, it will not continue (thus not putting the user in 
the {{activeUsers}} list).

I have not yet analyzed the problem you pointed out above regarding moving apps 
to different queues.
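
A hedged sketch of the {{UsersManager}} check described above (simplified, 
with hypothetical field and method names rather than the actual class):
{code:java}
// Only count a user as active once it actually has an activated app.
public synchronized void activateApplication(String user, ApplicationId appId) {
  User userObj = getUser(user); // hypothetical lookup into the user map
  if (userObj == null || userObj.getActiveApplications() <= 0) {
    // All of this user's apps are still pending (e.g. blocked by the
    // queue's AM limit), so don't add the user to activeUsers.
    return;
  }
  activeUsers.add(user);
}
{code}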

> CapacityScheduler: applications could get starved because computation of 
> #activeUsers considers pending apps 
> -
>
> Key: YARN-4606
> URL: https://issues.apache.org/jira/browse/YARN-4606
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, capacityscheduler
>Affects Versions: 2.8.0, 2.7.1
>Reporter: Karam Singh
>Assignee: Manikandan R
>Priority: Critical
> Attachments: YARN-4606.001.patch, YARN-4606.002.patch, 
> YARN-4606.003.patch, YARN-4606.004.patch, YARN-4606.1.poc.patch, 
> YARN-4606.POC.2.patch, YARN-4606.POC.3.patch, YARN-4606.POC.patch
>
>
> Currently, if all applications belonging to the same user in a LeafQueue are 
> pending (caused by max-am-percent, etc.), ActiveUsersManager still considers the 
> user an active user. This could lead to starvation of active applications, for 
> example:
> - App1 (belongs to user1) / app2 (belongs to user2) are active; app3 (belongs to 
> user3) / app4 (belongs to user4) are pending
> - ActiveUsersManager returns #active-users=4
> - However, only two users (user1/user2) are able to allocate new 
> resources, so the computed user-limit-resource could be lower than expected.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8392) Allow multiple tags for anti-affinity placement policy in service specification

2018-06-22 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520819#comment-16520819
 ] 

genericqa commented on YARN-8392:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
35s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 45s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 24s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 
29s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 69m  7s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8392 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12928811/YARN-8392.2.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 54b12f217a8c 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / ae05562 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/21083/testReport/ |
| Max. process+thread count | 746 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/21083/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Allow mu

[jira] [Updated] (YARN-4606) CapacityScheduler: applications could get starved because computation of #activeUsers considers pending apps

2018-06-22 Thread Eric Payne (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne updated YARN-4606:
-
Attachment: YARN-4606.POC.3.patch

> CapacityScheduler: applications could get starved because computation of 
> #activeUsers considers pending apps 
> -
>
> Key: YARN-4606
> URL: https://issues.apache.org/jira/browse/YARN-4606
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, capacityscheduler
>Affects Versions: 2.8.0, 2.7.1
>Reporter: Karam Singh
>Assignee: Manikandan R
>Priority: Critical
> Attachments: YARN-4606.001.patch, YARN-4606.002.patch, 
> YARN-4606.003.patch, YARN-4606.004.patch, YARN-4606.1.poc.patch, 
> YARN-4606.POC.2.patch, YARN-4606.POC.3.patch, YARN-4606.POC.patch
>
>
> Currently, if all applications belonging to the same user in a LeafQueue are 
> pending (caused by max-am-percent, etc.), ActiveUsersManager still considers the 
> user an active user. This could lead to starvation of active applications, for 
> example:
> - App1 (belongs to user1) / app2 (belongs to user2) are active; app3 (belongs to 
> user3) / app4 (belongs to user4) are pending
> - ActiveUsersManager returns #active-users=4
> - However, only two users (user1/user2) are able to allocate new 
> resources, so the computed user-limit-resource could be lower than expected.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8427) Don't start opportunistic containers at container scheduler/finish event with over-allocation

2018-06-22 Thread Haibo Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520817#comment-16520817
 ] 

Haibo Chen commented on YARN-8427:
--

Thanks [~szegedim] for your review!

> Don't start opportunistic containers at container scheduler/finish event with 
> over-allocation
> -
>
> Key: YARN-8427
> URL: https://issues.apache.org/jira/browse/YARN-8427
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Major
> Fix For: YARN-1011
>
> Attachments: YARN-8427-YARN-1011.00.patch, 
> YARN-8427-YARN-1011.01.patch
>
>
> As discussed in YARN-8250, we can stop opportunistic containers from being 
> launched at container scheduler/finish events if the node is already 
> over-allocating itself.  This can mitigate the issue that too many 
> opportunistic containers can be launched and then quickly killed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8438) TestContainer.testKillOnNew flaky on trunk

2018-06-22 Thread Miklos Szegedi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520815#comment-16520815
 ] 

Miklos Szegedi commented on YARN-8438:
--

I think it is legitimate for code to be fast. I would vote for the simple 
change of just using >= here, as sketched below.
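
Concretely, the assertion at TestContainer:594 (quoted below) would become, as 
a sketch of the simple change rather than a reviewed patch:
{code:java}
// A container may start and finish within the same clock tick, so
// allow the finish time to equal the start time.
Assert.assertTrue(containerMetrics.finishTime.value()
    >= containerMetrics.startTime.value());
{code}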

> TestContainer.testKillOnNew flaky on trunk
> --
>
> Key: YARN-8438
> URL: https://issues.apache.org/jira/browse/YARN-8438
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Major
> Attachments: YARN-8438.001.patch, YARN-8438.002.patch, 
> YARN-8438.003.patch, YARN-8438.004.patch, YARN-8438.005.patch
>
>
> Running this test several times (e.g. 30), it fails ~5-10 times.
> Stacktrace: 
> {code:java}
> java.lang.AssertionError at org.junit.Assert.fail(Assert.java:86) at 
> org.junit.Assert.assertTrue(Assert.java:41) at 
> org.junit.Assert.assertTrue(Assert.java:52) at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.TestContainer.testKillOnNew(TestContainer.java:594)
> {code}
> TestContainer:594 is the following code in trunk, currently:
> {code:java}
> Assert.assertTrue(containerMetrics.finishTime.value() > 
> containerMetrics.startTime.value());
> {code}
> So sometimes the finish time is not greater than the start time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8452) FairScheduler.update can take a long time if yarn.scheduler.fair.sizebasedweight is on

2018-06-22 Thread Miklos Szegedi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Szegedi updated YARN-8452:
-
Attachment: YARN-8452.000.patch

> FairScheduler.update can take a long time if 
> yarn.scheduler.fair.sizebasedweight is on
> 
>
> Key: YARN-8452
> URL: https://issues.apache.org/jira/browse/YARN-8452
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Miklos Szegedi
>Priority: Major
> Attachments: YARN-8452.000.patch
>
>
> Basically we recalculate the weight every time, even if the inputs did not 
> change. This causes high CPU usage if the cluster has lots of apps.
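
A hedged sketch of the kind of memoization the description implies (the field 
names below are assumptions, not the actual FairScheduler code; the log 
formula mirrors the size-based-weight computation):
{code:java}
// Cache the size-based weight and recompute only when its input changes.
private long cachedDemandMb = -1;
private double cachedWeight;

double getAppWeight(FSAppAttempt app) {
  long demandMb = app.getDemand().getMemorySize(); // the weight's only input
  if (demandMb != cachedDemandMb) {
    cachedWeight = Math.log1p(demandMb) / Math.log(2); // recompute on change
    cachedDemandMb = demandMb;
  }
  return cachedWeight; // otherwise reuse the cached value
}
{code}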



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-8452) FairScheduler.update can take a long time if yarn.scheduler.fair.sizebasedweight is on

2018-06-22 Thread Miklos Szegedi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Szegedi reassigned YARN-8452:


Assignee: Miklos Szegedi

> FairScheduler.update can take a long time if 
> yarn.scheduler.fair.sizebasedweight is on
> 
>
> Key: YARN-8452
> URL: https://issues.apache.org/jira/browse/YARN-8452
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Major
> Attachments: YARN-8452.000.patch
>
>
> Basically we recalculate the weight every time, even if the inputs did not 
> change. This causes high CPU usage if the cluster has lots of apps.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8454) WeightedLocalityPolicyManager should manage empty configurations

2018-06-22 Thread JIRA


[ 
https://issues.apache.org/jira/browse/YARN-8454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520806#comment-16520806
 ] 

Íñigo Goiri commented on YARN-8454:
---

The full trace is:
{code}
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
org.apache.hadoop.yarn.server.federation.policies.exceptions.FederationPolicyInitializationException:
 javax.xml.bind.UnmarshalException: Error creating JSON-based XMLStreamReader
- with linked exception:
[javax.xml.stream.XMLStreamException: JSON expression can not be empty!]
at 
org.apache.hadoop.yarn.server.router.webapp.FederationInterceptorREST.init(FederationInterceptorREST.java:139)
at 
org.apache.hadoop.yarn.server.router.webapp.RouterWebServices.initializePipeline(RouterWebServices.java:265)
at 
org.apache.hadoop.yarn.server.router.webapp.RouterWebServices.getInterceptorChain(RouterWebServices.java:175)
at 
org.apache.hadoop.yarn.server.router.webapp.RouterWebServices.getClusterMetricsInfo(RouterWebServices.java:333)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
at 
com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$TypeOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:185)
at 
com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
at 
com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)
at 
com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
at 
com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
at 
com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
at 
com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
at 
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1469)
at 
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1400)
at 
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1349)
at 
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1339)
at 
com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
at 
com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:537)
at 
com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:886)
at 
com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834)
at 
com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795)
at 
com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
at 
com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
at 
com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
at 
com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113)
at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at 
org.apache.hadoop.security.http.XFrameOptionsFilter.doFilter(XFrameOptionsFilter.java:57)
at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at 
org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:110)
at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at 
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644)
at 
org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationFilter.doFilter(DelegationTokenAuthenticationFilter.java:304)
at 
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:592)
at 
org.apache.hadoop.yarn.server.security.http.RMAuthenticationFilter.doFilter(RMAuthentic
{code}

[jira] [Updated] (YARN-8454) WeightedLocalityPolicyManager should manage empty configurations

2018-06-22 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/YARN-8454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated YARN-8454:
--
Affects Version/s: 2.9.1

> WeightedLocalityPolicyManager should manage empty configurations
> 
>
> Key: YARN-8454
> URL: https://issues.apache.org/jira/browse/YARN-8454
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.9.1
>Reporter: Íñigo Goiri
>Assignee: Giovanni Matteo Fumarola
>Priority: Major
>
> Currently, setting WeightedLocalityPolicyManager to use the default settings 
> in federation will trigger an exception trying to parse an empty string. This 
> error should be reported properly. In addition, the UI breaks when this 
> happens.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8454) WeightedLocalityPolicyManager should manage empty configurations

2018-06-22 Thread JIRA
Íñigo Goiri created YARN-8454:
-

 Summary: WeightedLocalityPolicyManager should manage empty 
configurations
 Key: YARN-8454
 URL: https://issues.apache.org/jira/browse/YARN-8454
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Íñigo Goiri
Assignee: Giovanni Matteo Fumarola


Currently, setting WeightedLocalityPolicyManager to use the default settings in 
federation will trigger an exception trying to parse an empty string. This 
error should be reported properly. In addition, the UI breaks when this happens.
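
A minimal sketch of the guard, using hypothetical helper names (the real fix belongs in the federation policy initialization path):
{code:java}
import java.util.function.Function;

// Hypothetical helper: treat an empty policy configuration as "use
// defaults" instead of handing it to the JSON unmarshaller, which throws
// "JSON expression can not be empty!" and breaks the Router UI.
public final class PolicyConfigGuard {
  public static <T> T parseOrDefault(String json, Function<String, T> parser,
      T defaults) {
    if (json == null || json.trim().isEmpty()) {
      return defaults;
    }
    return parser.apply(json);
  }
}
{code}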



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8184) Too many metrics if containerLocalizer/ResourceLocalizationService uses ReadWriteDiskValidator

2018-06-22 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520798#comment-16520798
 ] 

Hudson commented on YARN-8184:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14468 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14468/])
YARN-8184. Too many metrics if (yufei_gu: rev 
1cdce86d33d4b73ba6dd4136c966eb7e822b6f36)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ContainerLocalizer.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java


> Too many metrics if containerLocalizer/ResourceLocalizationService uses 
> ReadWriteDiskValidator
> --
>
> Key: YARN-8184
> URL: https://issues.apache.org/jira/browse/YARN-8184
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Yufei Gu
>Assignee: Yufei Gu
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-8184.001.patch, YARN-8184.002.patch
>
>
> ContainerLocalizer or ResourceLocalizationService will use
> ReadWriteDiskValidator as its disk validator when downloading files if we
> configure yarn.nodemanager.disk-validator to ReadWriteDiskValidator's name.
> In that case, ReadWriteDiskValidator creates a metric item for each
> localized directory, which results in far too many metrics. We should let
> ContainerLocalizer only use the basic disk validator.
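
A minimal sketch of the direction, assuming hadoop-common's DiskValidatorFactory API (not necessarily the exact patch):
{code:java}
import org.apache.hadoop.util.DiskValidator;
import org.apache.hadoop.util.DiskValidatorFactory;

// Hypothetical sketch: pin the localizer to the basic validator so that
// ReadWriteDiskValidator's per-directory metrics are never registered for
// every localized directory.
public class LocalizerDiskValidator {
  public static DiskValidator basic() throws Exception {
    return DiskValidatorFactory.getInstance("basic");
  }
}
{code}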



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8451) Multiple NM heartbeat thread created when a slow NM resync with RM

2018-06-22 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520790#comment-16520790
 ] 

genericqa commented on YARN-8451:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
33s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 52s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 17s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 
49s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 78m 32s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8451 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12928808/YARN-8451.v1.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux d4e529be25a6 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 55fad6a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/21082/testReport/ |
| Max. process+thread count | 333 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/21082/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Multiple NM heartbeat thread created when a slow NM resync with RM

[jira] [Commented] (YARN-8184) Too many metrics if containerLocalizer/ResourceLocalizationService uses ReadWriteDiskValidator

2018-06-22 Thread Yufei Gu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520773#comment-16520773
 ] 

Yufei Gu commented on YARN-8184:


Committed to trunk. Thanks for the review, [~haibochen].

> Too many metrics if containerLocalizer/ResourceLocalizationService uses 
> ReadWriteDiskValidator
> --
>
> Key: YARN-8184
> URL: https://issues.apache.org/jira/browse/YARN-8184
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Yufei Gu
>Assignee: Yufei Gu
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-8184.001.patch, YARN-8184.002.patch
>
>
> ContainerLocalizer or ResourceLocalizationService will use
> ReadWriteDiskValidator as its disk validator when downloading files if we
> configure yarn.nodemanager.disk-validator to ReadWriteDiskValidator's name.
> In that case, ReadWriteDiskValidator creates a metric item for each
> localized directory, which results in far too many metrics. We should let
> ContainerLocalizer only use the basic disk validator.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8438) TestContainer.testKillOnNew flaky on trunk

2018-06-22 Thread Szilard Nemeth (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520772#comment-16520772
 ] 

Szilard Nemeth commented on YARN-8438:
--

Hi [~miklos.szeg...@cloudera.com]!

To be honest, fixing the testcase this way was my first intention too.

Then I thought about it more and realized it would confuse developers who
try to understand what the test code does. In a real-world situation, I bet
a container is never started and finished within the same millisecond, or
even the same second, so I think the test only makes sense with a
greater-than relation between the time values.

It's just a property of this testcase that it runs so fast that we get the
same time value for the start and end time, so I think having a strictly
monotonic clock makes more sense.
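
For illustration, a minimal sketch of such a clock against YARN's Clock interface (an illustration only, not the attached patch):
{code:java}
import org.apache.hadoop.yarn.util.Clock;

// Strictly monotonic test clock: every call returns a larger value, so
// finishTime > startTime holds even when the container starts and finishes
// within the same millisecond of wall-clock time.
public class StrictlyMonotonicTestClock implements Clock {
  private long time = 0;

  @Override
  public synchronized long getTime() {
    return ++time;
  }
}
{code}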

Thanks!

> TestContainer.testKillOnNew flaky on trunk
> --
>
> Key: YARN-8438
> URL: https://issues.apache.org/jira/browse/YARN-8438
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Major
> Attachments: YARN-8438.001.patch, YARN-8438.002.patch, 
> YARN-8438.003.patch, YARN-8438.004.patch, YARN-8438.005.patch
>
>
> Running this test several times (e.g. 30), it fails ~5-10 times.
> Stacktrace: 
> {code:java}
> java.lang.AssertionError at org.junit.Assert.fail(Assert.java:86) at 
> org.junit.Assert.assertTrue(Assert.java:41) at 
> org.junit.Assert.assertTrue(Assert.java:52) at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.TestContainer.testKillOnNew(TestContainer.java:594)
> {code}
> TestContainer:594 is the following code in trunk, currently:
> {code:java}
> Assert.assertTrue(containerMetrics.finishTime.value() >
> containerMetrics.startTime.value());
> {code}
> So sometimes the finish time is not greater than the start time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-8452) FairScheduler.update can take long time if yarn.scheduler.fair.sizebasedweight is on

2018-06-22 Thread Miklos Szegedi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Szegedi reassigned YARN-8452:


Assignee: (was: Szilard Nemeth)

> FairScheduler.update can take long time if 
> yarn.scheduler.fair.sizebasedweight is on
> 
>
> Key: YARN-8452
> URL: https://issues.apache.org/jira/browse/YARN-8452
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Miklos Szegedi
>Priority: Major
>
> Basically, we recalculate the weight every time, even if the inputs did not
> change. This causes high CPU usage if the cluster has lots of apps.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8438) TestContainer.testKillOnNew flaky on trunk

2018-06-22 Thread Miklos Szegedi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520758#comment-16520758
 ] 

Miklos Szegedi commented on YARN-8438:
--

Thank you for the updated patch [~snemeth].

Why don't you just use >= to resolve this?
{code:java}
Assert.assertTrue(
containerMetrics.finishTime.value() >= containerMetrics.startTime
.value());{code}

> TestContainer.testKillOnNew flaky on trunk
> --
>
> Key: YARN-8438
> URL: https://issues.apache.org/jira/browse/YARN-8438
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Major
> Attachments: YARN-8438.001.patch, YARN-8438.002.patch, 
> YARN-8438.003.patch, YARN-8438.004.patch, YARN-8438.005.patch
>
>
> Running this test several times (e.g. 30), it fails ~5-10 times.
> Stacktrace: 
> {code:java}
> java.lang.AssertionError at org.junit.Assert.fail(Assert.java:86) at 
> org.junit.Assert.assertTrue(Assert.java:41) at 
> org.junit.Assert.assertTrue(Assert.java:52) at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.TestContainer.testKillOnNew(TestContainer.java:594)
> {code}
> TestContainer:594 is the following code in trunk, currently:
> {code:java}
> Assert.assertTrue(containerMetrics.finishTime.value() >
> containerMetrics.startTime.value());
> {code}
> So sometimes the finish time is not greater than the start time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8423) GPU does not get released even though the application gets killed.

2018-06-22 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520754#comment-16520754
 ] 

genericqa commented on YARN-8423:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
43s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 59s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 31s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 
46s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 80m  9s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8423 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12928800/YARN-8423.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux cb0b39a1bf92 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 55fad6a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/21081/testReport/ |
| Max. process+thread count | 291 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/21081/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> GPU does not get released even though the application gets killed.

[jira] [Updated] (YARN-8392) Allow multiple tags for anti-affinity placement policy in service specification

2018-06-22 Thread Billie Rinaldi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Billie Rinaldi updated YARN-8392:
-
Attachment: YARN-8392.2.patch

> Allow multiple tags for anti-affinity placement policy in service 
> specification
> ---
>
> Key: YARN-8392
> URL: https://issues.apache.org/jira/browse/YARN-8392
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Billie Rinaldi
>Assignee: Billie Rinaldi
>Priority: Critical
> Attachments: YARN-8392.1.patch, YARN-8392.2.patch
>
>
> Currently the service client code is restricting a component's target tags to 
> include only a single tag, the component name. I have a use case for two 
> components having anti-affinity with themselves and with each other. The YARN 
> placement policies support this, but the service framework isn't allowing it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8453) Allocation to a queue is dishonored if one resource is at the limit

2018-06-22 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520753#comment-16520753
 ] 

Sunil Govindan commented on YARN-8453:
--

cc / [~leftnoteasy]

> Allocation to a queue is dishonored if one resource is at the limit
> ---
>
> Key: YARN-8453
> URL: https://issues.apache.org/jira/browse/YARN-8453
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Affects Versions: 3.0.2
>Reporter: Sunil Govindan
>Assignee: Sunil Govindan
>Priority: Major
>
> With support for additional resource types beyond CPU and memory, it is
> possible that one such new resource has exhausted its quota on a given
> queue while other resources such as memory and CPU are still available
> beyond the guaranteed limit (under the max-limit). However, because the new
> resource is exhausted, containers will still fail to get the remaining
> delta of resources (CPU and memory).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8453) Allocation to a queue is dishonored if one resource is at the limit

2018-06-22 Thread Sunil Govindan (JIRA)
Sunil Govindan created YARN-8453:


 Summary: Allocation to a queue is dishonored if one resource is at 
the limit
 Key: YARN-8453
 URL: https://issues.apache.org/jira/browse/YARN-8453
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacity scheduler
Affects Versions: 3.0.2
Reporter: Sunil Govindan
Assignee: Sunil Govindan


With support for additional resource types beyond CPU and memory, it is
possible that one such new resource has exhausted its quota on a given queue
while other resources such as memory and CPU are still available beyond the
guaranteed limit (under the max-limit). However, because the new resource is
exhausted, containers will still fail to get the remaining delta of
resources (CPU and memory).
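
To illustrate the componentwise check in plain Java (an illustration of the symptom, not the scheduler's code):
{code:java}
// A request fits only if EVERY resource type fits, so one exhausted type
// (e.g. a new resource) blocks the allocation even when memory and vcores
// still have headroom under the max-limit.
public class FitsInIllustration {
  static boolean fitsIn(long[] request, long[] available) {
    for (int i = 0; i < request.length; i++) {
      if (request[i] > available[i]) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    long[] available = {4096, 8, 0};  // memory MB, vcores, new resource type
    long[] request   = {1024, 1, 1};  // asks for one unit of the new type
    System.out.println(fitsIn(request, available)); // false: one type at limit
  }
}
{code}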



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8452) FairScheduler.update can take long time if yarn.scheduler.fair.sizebasedweight is on

2018-06-22 Thread Miklos Szegedi (JIRA)
Miklos Szegedi created YARN-8452:


 Summary: FairScheduler.update can take long time if 
yarn.scheduler.fair.sizebasedweight is on
 Key: YARN-8452
 URL: https://issues.apache.org/jira/browse/YARN-8452
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Miklos Szegedi
Assignee: Szilard Nemeth


Basically, we recalculate the weight every time, even if the inputs did not
change. This causes high CPU usage if the cluster has lots of apps.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8451) Multiple NM heartbeat thread created when a slow NM resync with RM

2018-06-22 Thread Botong Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated YARN-8451:
---
Attachment: YARN-8451.v1.patch

> Multiple NM heartbeat thread created when a slow NM resync with RM
> --
>
> Key: YARN-8451
> URL: https://issues.apache.org/jira/browse/YARN-8451
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Major
> Attachments: YARN-8451.v1.patch
>
>
> During an NM resync with the RM (say the RM did a master-slave switch), if
> the NM is running slow, more than one RESYNC event may be put into the NM
> dispatcher by the existing heartbeat thread before they are processed. As a
> result, multiple new heartbeat threads are later created and start to
> heartbeat to the RM concurrently, each with its own responseId. If at some
> point one thread falls more than one step behind the others, the RM will
> send back a resync signal in its heartbeat response, killing all containers
> on this NM.
> See comments below for details on how this can happen.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8451) Multiple NM heartbeat thread created when a slow NM resync with RM

2018-06-22 Thread Botong Huang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520727#comment-16520727
 ] 

Botong Huang commented on YARN-8451:


Here’s an example where more than one heartbeat thread is created:
1. A YarnRM master-slave switch happens. When the new YarnRM comes up, it
notifies the NM to resync (without killing its containers) upon the first NM
heartbeat.
2. Every time the NM heartbeats into the RM and gets a resync signal, it
dispatches a NodeManagerEventType.RESYNC event and moves on.
3. NodeManager.resyncWithRM() is the listener for this event.
4. When the NM dispatcher is running slow, by the time the first event is
processed, the NM heartbeat thread has managed to heartbeat more and put
more NodeManagerEventType.RESYNC events into the dispatcher event queue.
5. Multiple threads are created inside NodeManager.resyncWithRM(), all of
which block at statusUpdater.join() inside
NodeStatusUpdaterImpl.rebootNodeStatusUpdaterAndRegisterWithRM().
6. When the previous heartbeat thread exits, every blocked thread gets
released and creates a new heartbeat thread.
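
A minimal sketch of one way to coalesce these events (hypothetical names, not the attached patch):
{code:java}
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical guard: queued RESYNC events collapse into a single resync,
// so only one replacement heartbeat thread can ever be created at a time.
public class ResyncGuard {
  private final AtomicBoolean resyncInProgress = new AtomicBoolean(false);

  public void resyncWithRM(Runnable rebootStatusUpdater) {
    if (!resyncInProgress.compareAndSet(false, true)) {
      return; // an earlier RESYNC event is already being handled
    }
    try {
      rebootStatusUpdater.run();
    } finally {
      resyncInProgress.set(false);
    }
  }
}
{code}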

> Multiple NM heartbeat thread created when a slow NM resync with RM
> --
>
> Key: YARN-8451
> URL: https://issues.apache.org/jira/browse/YARN-8451
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Major
>
> During an NM resync with the RM (say the RM did a master-slave switch), if
> the NM is running slow, more than one RESYNC event may be put into the NM
> dispatcher by the existing heartbeat thread before they are processed. As a
> result, multiple new heartbeat threads are later created and start to
> heartbeat to the RM concurrently, each with its own responseId. If at some
> point one thread falls more than one step behind the others, the RM will
> send back a resync signal in its heartbeat response, killing all containers
> on this NM.
> See comments below for details on how this can happen.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8451) Multiple NM heartbeat thread created when a slow NM resync with RM

2018-06-22 Thread Botong Huang (JIRA)
Botong Huang created YARN-8451:
--

 Summary: Multiple NM heartbeat thread created when a slow NM 
resync with RM
 Key: YARN-8451
 URL: https://issues.apache.org/jira/browse/YARN-8451
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Botong Huang
Assignee: Botong Huang


During an NM resync with the RM (say the RM did a master-slave switch), if
the NM is running slow, more than one RESYNC event may be put into the NM
dispatcher by the existing heartbeat thread before they are processed. As a
result, multiple new heartbeat threads are later created and start to
heartbeat to the RM concurrently, each with its own responseId. If at some
point one thread falls more than one step behind the others, the RM will send
back a resync signal in its heartbeat response, killing all containers on
this NM.

See comments below for details on how this can happen.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8392) Allow multiple tags for anti-affinity placement policy in service specification

2018-06-22 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520719#comment-16520719
 ] 

genericqa commented on YARN-8392:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
51s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 56s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
7s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 13s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 12m 58s{color} 
| {color:red} hadoop-yarn-services-core in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
31s{color} | {color:green} hadoop-yarn-services-api in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 66m 42s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.service.TestServiceApiUtil |
|   | hadoop.yarn.service.TestYarnNativeServices |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8392 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12928798/YARN-8392.1.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 59da42fe7235 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 
11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 55fad6a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| unit | 
https

[jira] [Commented] (YARN-8326) Yarn 3.0 seems runs slower than Yarn 2.6

2018-06-22 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520718#comment-16520718
 ] 

genericqa commented on YARN-8326:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 42s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m  9s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m  
0s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 68m  8s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8326 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12928790/YARN-8326.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux cd2f8bf00ab6 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 
11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 55fad6a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/21079/testReport/ |
| Max. process+thread count | 459 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/21079/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |

[jira] [Updated] (YARN-8450) Blocking resources such as GPU/FPGA etc tend to release actual device slowly even after RM identifies it as COMPLETED

2018-06-22 Thread Sunil Govindan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil Govindan updated YARN-8450:
-
Summary: Blocking resources such as GPU/FPGA etc tend to release actual 
device slowly even after RM identifies it as COMPLETED  (was: Blocking 
resources such as GPU/FPGA etc tend to release actual device even after RM 
identifies it as COMPLETED)

> Blocking resources such as GPU/FPGA etc tend to release actual device slowly 
> even after RM identifies it as COMPLETED
> -
>
> Key: YARN-8450
> URL: https://issues.apache.org/jira/browse/YARN-8450
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.0.2
>Reporter: Sunil Govindan
>Priority: Major
>
> For resources such as GPU/FPGA, we have sometimes seen that the device is
> not released from a container even after the container is in a completed
> state.
> In such cases, we need a common way of handling this at the NM level.
> YARN-8423 only handles this for GPU.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8423) GPU does not get released even though the application gets killed.

2018-06-22 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520682#comment-16520682
 ] 

Sunil Govindan commented on YARN-8423:
--

Thanks [~vinodkv]. Attaching a new patch after addressing all comments.

For a generic way of handling this, I have opened YARN-8450, since it needs
to be analyzed and refactored at the NM level for all similar resources that
may tend to block at the time a container is released. We will continue
discussing a global approach in that Jira; meanwhile, this issue can
immediately tackle the GPU problem. Thank you. cc [~leftnoteasy]

> GPU does not get released even though the application gets killed.
> --
>
> Key: YARN-8423
> URL: https://issues.apache.org/jira/browse/YARN-8423
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Sumana Sathish
>Assignee: Sunil Govindan
>Priority: Critical
> Attachments: YARN-8423.001.patch, YARN-8423.002.patch, 
> kill-container-nm.log
>
>
> Run a TensorFlow app requesting one GPU.
> Kill the application once the GPU is allocated.
> Query the NodeManager once the application is killed. We see that the GPU
> is not being released.
> {code}
>  curl -i /ws/v1/node/resources/yarn.io%2Fgpu
> {"gpuDeviceInformation":{"gpus":[{"productName":"","uuid":"GPU-","minorNumber":0,"gpuUtilizations":{"overallGpuUtilization":0.0},"gpuMemoryUsage":{"usedMemoryMiB":73,"availMemoryMiB":12125,"totalMemoryMiB":12198},"temperature":{"currentGpuTemp":28.0,"maxGpuTemp":85.0,"slowThresholdGpuTemp":82.0}},{"productName":"","uuid":"GPU-","minorNumber":1,"gpuUtilizations":{"overallGpuUtilization":0.0},"gpuMemoryUsage":{"usedMemoryMiB":73,"availMemoryMiB":12125,"totalMemoryMiB":12198},"temperature":{"currentGpuTemp":28.0,"maxGpuTemp":85.0,"slowThresholdGpuTemp":82.0}}],"driverVersion":""},"totalGpuDevices":[{"index":0,"minorNumber":0},{"index":1,"minorNumber":1}],"assignedGpuDevices":[{"index":0,"minorNumber":0,"containerId":"container_"}]}
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8423) GPU does not get released even though the application gets killed.

2018-06-22 Thread Sunil Govindan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil Govindan updated YARN-8423:
-
Attachment: YARN-8423.002.patch

> GPU does not get released even though the application gets killed.
> --
>
> Key: YARN-8423
> URL: https://issues.apache.org/jira/browse/YARN-8423
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Sumana Sathish
>Assignee: Sunil Govindan
>Priority: Critical
> Attachments: YARN-8423.001.patch, YARN-8423.002.patch, 
> kill-container-nm.log
>
>
> Run a TensorFlow app requesting one GPU.
> Kill the application once the GPU is allocated.
> Query the NodeManager once the application is killed. We see that the GPU
> is not being released.
> {code}
>  curl -i /ws/v1/node/resources/yarn.io%2Fgpu
> {"gpuDeviceInformation":{"gpus":[{"productName":"","uuid":"GPU-","minorNumber":0,"gpuUtilizations":{"overallGpuUtilization":0.0},"gpuMemoryUsage":{"usedMemoryMiB":73,"availMemoryMiB":12125,"totalMemoryMiB":12198},"temperature":{"currentGpuTemp":28.0,"maxGpuTemp":85.0,"slowThresholdGpuTemp":82.0}},{"productName":"","uuid":"GPU-","minorNumber":1,"gpuUtilizations":{"overallGpuUtilization":0.0},"gpuMemoryUsage":{"usedMemoryMiB":73,"availMemoryMiB":12125,"totalMemoryMiB":12198},"temperature":{"currentGpuTemp":28.0,"maxGpuTemp":85.0,"slowThresholdGpuTemp":82.0}}],"driverVersion":""},"totalGpuDevices":[{"index":0,"minorNumber":0},{"index":1,"minorNumber":1}],"assignedGpuDevices":[{"index":0,"minorNumber":0,"containerId":"container_"}]}
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8450) Blocking resources such as GPU/FPGA etc tend to release actual device even after RM identifies it as COMPLETED

2018-06-22 Thread Sunil Govindan (JIRA)
Sunil Govindan created YARN-8450:


 Summary: Blocking resources such as GPU/FPGA etc tend to release 
actual device even after RM identifies it as COMPLETED
 Key: YARN-8450
 URL: https://issues.apache.org/jira/browse/YARN-8450
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 3.0.2
Reporter: Sunil Govindan


For resources such as GPU/FPGA, we have sometimes seen that the device is not
released from a container even after the container is in a completed state.

In such cases, we need a common way of handling this at the NM level.
YARN-8423 only handles this for GPU.
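
One hypothetical shape for such a common hook (names invented for illustration, not a committed API):
{code:java}
import org.apache.hadoop.yarn.api.records.ContainerId;

// Hypothetical NM-level hook: every device-backed resource plugin
// (GPU, FPGA, ...) releases its assigned devices when the NM sees the
// container reach a completed state, instead of per-plugin special cases.
public interface DeviceReleasingResourcePlugin {
  void releaseAssignedDevices(ContainerId containerId);
}
{code}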



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8392) Allow multiple tags for anti-affinity placement policy in service specification

2018-06-22 Thread Billie Rinaldi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520660#comment-16520660
 ] 

Billie Rinaldi commented on YARN-8392:
--

This patch removes the validation code in ServiceApiUtil that restricts the
placement constraint tags and adds a test for multi-component anti-affinity.
I created a new test class for this test, and I also moved the existing
placement policy test from TestYarnNativeServices to the new class,
TestPlacementPolicy.

> Allow multiple tags for anti-affinity placement policy in service 
> specification
> ---
>
> Key: YARN-8392
> URL: https://issues.apache.org/jira/browse/YARN-8392
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Billie Rinaldi
>Assignee: Billie Rinaldi
>Priority: Critical
> Attachments: YARN-8392.1.patch
>
>
> Currently the service client code is restricting a component's target tags to 
> include only a single tag, the component name. I have a use case for two 
> components having anti-affinity with themselves and with each other. The YARN 
> placement policies support this, but the service framework isn't allowing it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8392) Allow multiple tags for anti-affinity placement policy in service specification

2018-06-22 Thread Billie Rinaldi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Billie Rinaldi updated YARN-8392:
-
Attachment: YARN-8392.1.patch

> Allow multiple tags for anti-affinity placement policy in service 
> specification
> ---
>
> Key: YARN-8392
> URL: https://issues.apache.org/jira/browse/YARN-8392
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Billie Rinaldi
>Assignee: Billie Rinaldi
>Priority: Critical
> Attachments: YARN-8392.1.patch
>
>
> Currently the service client code is restricting a component's target tags to 
> include only a single tag, the component name. I have a use case for two 
> components having anti-affinity with themselves and with each other. The YARN 
> placement policies support this, but the service framework isn't allowing it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8444) NodeResourceMonitor crashes on bad swapFree value

2018-06-22 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520616#comment-16520616
 ] 

Hudson commented on YARN-8444:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14463 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14463/])
YARN-8444: NodeResourceMonitor crashes on bad swapFree value. (ericp: rev 
6432128622d64f3f9dd638b9c254c77cdf5408aa)
* (edit) 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestSysInfoLinux.java
* (edit) 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/SysInfoLinux.java


> NodeResourceMonitor crashes on bad swapFree value
> -
>
> Key: YARN-8444
> URL: https://issues.apache.org/jira/browse/YARN-8444
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.3, 3.0.2
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-8444.001.patch
>
>
> Saw this on a node that was running out of memory. We can't have
> NodeResourceMonitor exiting. The system was above 99% memory used at the
> time, so this is not a common occurrence, but we should fix it since this
> is a critical monitor for the health of the node.
> {noformat}
> 2018-06-04 14:28:08,539 [Container Monitor] DEBUG 
> ContainersMonitorImpl.audit: Memory usage of ProcessTree 110564 for 
> container-id container_e24_1526662705797_129647_01_004791: 2.1 GB of 3.5 GB 
> physical memory used; 5.0 GB of 7.3 GB virtual memory used
> 2018-06-04 14:28:10,622 [Node Resource Monitor] ERROR 
> yarn.YarnUncaughtExceptionHandler: Thread Thread[Node Resource 
> Monitor,5,main] threw an Exception.
> java.lang.NumberFormatException: For input string: "18446744073709551596"
>  at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>  at java.lang.Long.parseLong(Long.java:592)
>  at java.lang.Long.parseLong(Long.java:631)
>  at 
> org.apache.hadoop.util.SysInfoLinux.readProcMemInfoFile(SysInfoLinux.java:257)
>  at 
> org.apache.hadoop.util.SysInfoLinux.getAvailablePhysicalMemorySize(SysInfoLinux.java:591)
>  at 
> org.apache.hadoop.util.SysInfoLinux.getAvailableVirtualMemorySize(SysInfoLinux.java:601)
>  at 
> org.apache.hadoop.yarn.util.ResourceCalculatorPlugin.getAvailableVirtualMemorySize(ResourceCalculatorPlugin.java:74)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.NodeResourceMonitorImpl$MonitoringThread.run(NodeResourceMonitorImpl.java:193)
> 2018-06-04 14:28:30,747 
> [org.apache.hadoop.util.JvmPauseMonitor$Monitor@226eba67] INFO 
> util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of 
> approximately 9330ms
> {noformat}
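
A minimal sketch of the defensive parse (not necessarily the committed fix): values in /proc/meminfo come from unsigned 64-bit kernel counters, and 18446744073709551596 is 2^64 - 20, which overflows Long.parseLong.
{code:java}
// Clamp bogus near-2^64 readings to 0 rather than letting the
// NodeResourceMonitor thread die on NumberFormatException.
public class MemInfoParse {
  static long parseMemValue(String raw) {
    try {
      return Long.parseLong(raw.trim());
    } catch (NumberFormatException e) {
      return 0L;
    }
  }

  public static void main(String[] args) {
    System.out.println(parseMemValue("18446744073709551596")); // 0, not a crash
  }
}
{code}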



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8326) Yarn 3.0 seems runs slower than Yarn 2.6

2018-06-22 Thread Shane Kumpf (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shane Kumpf updated YARN-8326:
--
Attachment: YARN-8326.001.patch

> Yarn 3.0 seems runs slower than Yarn 2.6
> 
>
> Key: YARN-8326
> URL: https://issues.apache.org/jira/browse/YARN-8326
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0
> Environment: This is the yarn-site.xml for 3.0. 
>  
> 
> 
>  hadoop.registry.dns.bind-port
>  5353
>  
> 
>  hadoop.registry.dns.domain-name
>  hwx.site
>  
> 
>  hadoop.registry.dns.enabled
>  true
>  
> 
>  hadoop.registry.dns.zone-mask
>  255.255.255.0
>  
> 
>  hadoop.registry.dns.zone-subnet
>  172.17.0.0
>  
> 
>  manage.include.files
>  false
>  
> 
>  yarn.acl.enable
>  false
>  
> 
>  yarn.admin.acl
>  yarn
>  
> 
>  yarn.client.nodemanager-connect.max-wait-ms
>  6
>  
> 
>  yarn.client.nodemanager-connect.retry-interval-ms
>  1
>  
> 
>  yarn.http.policy
>  HTTP_ONLY
>  
> 
>  yarn.log-aggregation-enable
>  false
>  
> 
>  yarn.log-aggregation.retain-seconds
>  2592000
>  
> 
>  yarn.log.server.url
>  
> [http://xx:19888/jobhistory/logs|http://whiny2.fyre.ibm.com:19888/jobhistory/logs]
>  
> 
>  yarn.log.server.web-service.url
>  
> [http://xx:8188/ws/v1/applicationhistory|http://whiny2.fyre.ibm.com:8188/ws/v1/applicationhistory]
>  
> 
>  yarn.node-labels.enabled
>  false
>  
> 
>  yarn.node-labels.fs-store.retry-policy-spec
>  2000, 500
>  
> 
>  yarn.node-labels.fs-store.root-dir
>  /system/yarn/node-labels
>  
> 
>  yarn.nodemanager.address
>  0.0.0.0:45454
>  
> 
>  yarn.nodemanager.admin-env
>  MALLOC_ARENA_MAX=$MALLOC_ARENA_MAX
>  
> 
>  yarn.nodemanager.aux-services
>  mapreduce_shuffle,spark2_shuffle,timeline_collector
>  
> 
>  yarn.nodemanager.aux-services.mapreduce_shuffle.class
>  org.apache.hadoop.mapred.ShuffleHandler
>  
> 
>  yarn.nodemanager.aux-services.spark2_shuffle.class
>  org.apache.spark.network.yarn.YarnShuffleService
>  
> 
>  yarn.nodemanager.aux-services.spark2_shuffle.classpath
>  /usr/spark2/aux/*
>  
> 
>  yarn.nodemanager.aux-services.spark_shuffle.class
>  org.apache.spark.network.yarn.YarnShuffleService
>  
> 
>  yarn.nodemanager.aux-services.timeline_collector.class
>  
> org.apache.hadoop.yarn.server.timelineservice.collector.PerNodeTimelineCollectorsAuxService
>  
> 
>  yarn.nodemanager.bind-host
>  0.0.0.0
>  
> 
>  yarn.nodemanager.container-executor.class
>  
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor
>  
> 
>  yarn.nodemanager.container-metrics.unregister-delay-ms
>  6
>  
> 
>  yarn.nodemanager.container-monitor.interval-ms
>  3000
>  
> 
>  yarn.nodemanager.delete.debug-delay-sec
>  0
>  
> 
>  
> yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage
>  90
>  
> 
>  yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb
>  1000
>  
> 
>  yarn.nodemanager.disk-health-checker.min-healthy-disks
>  0.25
>  
> 
>  yarn.nodemanager.health-checker.interval-ms
>  135000
>  
> 
>  yarn.nodemanager.health-checker.script.timeout-ms
>  6
>  
> 
>  
> yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage
>  false
>  
> 
>  yarn.nodemanager.linux-container-executor.group
>  hadoop
>  
> 
>  
> yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users
>  false
>  
> 
>  yarn.nodemanager.local-dirs
>  /hadoop/yarn/local
>  
> 
>  yarn.nodemanager.log-aggregation.compression-type
>  gz
>  
> 
>  yarn.nodemanager.log-aggregation.debug-enabled
>  false
>  
> 
>  yarn.nodemanager.log-aggregation.num-log-files-per-app
>  30
>  
> 
>  
> yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds
>  3600
>  
> 
>  yarn.nodemanager.log-dirs
>  /hadoop/yarn/log
>  
> 
>  yarn.nodemanager.log.retain-seconds
>  604800
>  
> 
>  yarn.nodemanager.pmem-check-enabled
>  false
>  
> 
>  yarn.nodemanager.recovery.dir
>  /var/log/hadoop-yarn/nodemanager/recovery-state
>  
> 
>  yarn.nodemanager.recovery.enabled
>  true
>  
> 
>  yarn.nodemanager.recovery.supervised
>  true
>  
> 
>  yarn.nodemanager.remote-app-log-dir
>  /app-logs
>  
> 
>  yarn.nodemanager.remote-app-log-dir-suffix
>  logs
>  
> 
>  yarn.nodemanager.resource-plugins
>  
>  
> 
>  yarn.nodemanager.resource-plugins.gpu.allowed-gpu-devices
>  auto
>  
> 
>  yarn.nodemanager.resource-plugins.gpu.docker-plugin
>  nvidia-docker-v1
>  
> 
>  yarn.nodemanager.resource-plugins.gpu.docker-plugin.nvidia-docker-v1.endpoint
>  [http://localhost:3476/v1.0/docker/cli]
>  
> 
>  
> yarn.nodemanager.resource-plugins.gpu.path-to-discovery-executables
>  
>  
> 
>  yarn.nodemanager.resource.cpu-vcores
>  6
>  
> 
>  yarn.nodemanager.resource.memory-mb
>  12288
>  
> 
>  yarn.nodemanager.resource.percentage-physical-cpu-limit
>  80
>  
> 
>  yarn.nodemanager.runtime.linux.allowed-ru

[jira] [Assigned] (YARN-8326) Yarn 3.0 seems to run slower than Yarn 2.6

2018-06-22 Thread Shane Kumpf (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shane Kumpf reassigned YARN-8326:
-

Assignee: Shane Kumpf

> Yarn 3.0 seems to run slower than Yarn 2.6
> 
>
> Key: YARN-8326
> URL: https://issues.apache.org/jira/browse/YARN-8326
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0
> Environment: This is the yarn-site.xml for 3.0. 
>  
> 
> 
>  hadoop.registry.dns.bind-port
>  5353
>  
> 
>  hadoop.registry.dns.domain-name
>  hwx.site
>  
> 
>  hadoop.registry.dns.enabled
>  true
>  
> 
>  hadoop.registry.dns.zone-mask
>  255.255.255.0
>  
> 
>  hadoop.registry.dns.zone-subnet
>  172.17.0.0
>  
> 
>  manage.include.files
>  false
>  
> 
>  yarn.acl.enable
>  false
>  
> 
>  yarn.admin.acl
>  yarn
>  
> 
>  yarn.client.nodemanager-connect.max-wait-ms
>  6
>  
> 
>  yarn.client.nodemanager-connect.retry-interval-ms
>  1
>  
> 
>  yarn.http.policy
>  HTTP_ONLY
>  
> 
>  yarn.log-aggregation-enable
>  false
>  
> 
>  yarn.log-aggregation.retain-seconds
>  2592000
>  
> 
>  yarn.log.server.url
>  
> [http://xx:19888/jobhistory/logs|http://whiny2.fyre.ibm.com:19888/jobhistory/logs]
>  
> 
>  yarn.log.server.web-service.url
>  
> [http://xx:8188/ws/v1/applicationhistory|http://whiny2.fyre.ibm.com:8188/ws/v1/applicationhistory]
>  
> 
>  yarn.node-labels.enabled
>  false
>  
> 
>  yarn.node-labels.fs-store.retry-policy-spec
>  2000, 500
>  
> 
>  yarn.node-labels.fs-store.root-dir
>  /system/yarn/node-labels
>  
> 
>  yarn.nodemanager.address
>  0.0.0.0:45454
>  
> 
>  yarn.nodemanager.admin-env
>  MALLOC_ARENA_MAX=$MALLOC_ARENA_MAX
>  
> 
>  yarn.nodemanager.aux-services
>  mapreduce_shuffle,spark2_shuffle,timeline_collector
>  
> 
>  yarn.nodemanager.aux-services.mapreduce_shuffle.class
>  org.apache.hadoop.mapred.ShuffleHandler
>  
> 
>  yarn.nodemanager.aux-services.spark2_shuffle.class
>  org.apache.spark.network.yarn.YarnShuffleService
>  
> 
>  yarn.nodemanager.aux-services.spark2_shuffle.classpath
>  /usr/spark2/aux/*
>  
> 
>  yarn.nodemanager.aux-services.spark_shuffle.class
>  org.apache.spark.network.yarn.YarnShuffleService
>  
> 
>  yarn.nodemanager.aux-services.timeline_collector.class
>  
> org.apache.hadoop.yarn.server.timelineservice.collector.PerNodeTimelineCollectorsAuxService
>  
> 
>  yarn.nodemanager.bind-host
>  0.0.0.0
>  
> 
>  yarn.nodemanager.container-executor.class
>  
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor
>  
> 
>  yarn.nodemanager.container-metrics.unregister-delay-ms
>  6
>  
> 
>  yarn.nodemanager.container-monitor.interval-ms
>  3000
>  
> 
>  yarn.nodemanager.delete.debug-delay-sec
>  0
>  
> 
>  
> yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage
>  90
>  
> 
>  yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb
>  1000
>  
> 
>  yarn.nodemanager.disk-health-checker.min-healthy-disks
>  0.25
>  
> 
>  yarn.nodemanager.health-checker.interval-ms
>  135000
>  
> 
>  yarn.nodemanager.health-checker.script.timeout-ms
>  6
>  
> 
>  
> yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage
>  false
>  
> 
>  yarn.nodemanager.linux-container-executor.group
>  hadoop
>  
> 
>  
> yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users
>  false
>  
> 
>  yarn.nodemanager.local-dirs
>  /hadoop/yarn/local
>  
> 
>  yarn.nodemanager.log-aggregation.compression-type
>  gz
>  
> 
>  yarn.nodemanager.log-aggregation.debug-enabled
>  false
>  
> 
>  yarn.nodemanager.log-aggregation.num-log-files-per-app
>  30
>  
> 
>  
> yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds
>  3600
>  
> 
>  yarn.nodemanager.log-dirs
>  /hadoop/yarn/log
>  
> 
>  yarn.nodemanager.log.retain-seconds
>  604800
>  
> 
>  yarn.nodemanager.pmem-check-enabled
>  false
>  
> 
>  yarn.nodemanager.recovery.dir
>  /var/log/hadoop-yarn/nodemanager/recovery-state
>  
> 
>  yarn.nodemanager.recovery.enabled
>  true
>  
> 
>  yarn.nodemanager.recovery.supervised
>  true
>  
> 
>  yarn.nodemanager.remote-app-log-dir
>  /app-logs
>  
> 
>  yarn.nodemanager.remote-app-log-dir-suffix
>  logs
>  
> 
>  yarn.nodemanager.resource-plugins
>  
>  
> 
>  yarn.nodemanager.resource-plugins.gpu.allowed-gpu-devices
>  auto
>  
> 
>  yarn.nodemanager.resource-plugins.gpu.docker-plugin
>  nvidia-docker-v1
>  
> 
>  yarn.nodemanager.resource-plugins.gpu.docker-plugin.nvidia-docker-v1.endpoint
>  [http://localhost:3476/v1.0/docker/cli]
>  
> 
>  
> yarn.nodemanager.resource-plugins.gpu.path-to-discovery-executables
>  
>  
> 
>  yarn.nodemanager.resource.cpu-vcores
>  6
>  
> 
>  yarn.nodemanager.resource.memory-mb
>  12288
>  
> 
>  yarn.nodemanager.resource.percentage-physical-cpu-limit
>  80
>  
> 
>  yarn.nodemanager.runtime.linux.allowed-runti

[jira] [Commented] (YARN-8427) Don't start opportunistic containers at container scheduler/finish event with over-allocation

2018-06-22 Thread Miklos Szegedi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520607#comment-16520607
 ] 

Miklos Szegedi commented on YARN-8427:
--

+1 LGTM. If there are no concerns, I will commit this shortly.

> Don't start opportunistic containers at container scheduler/finish event with 
> over-allocation
> -
>
> Key: YARN-8427
> URL: https://issues.apache.org/jira/browse/YARN-8427
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Major
> Attachments: YARN-8427-YARN-1011.00.patch, 
> YARN-8427-YARN-1011.01.patch
>
>
> As discussed in YARN-8250, we can stop opportunistic containers from being 
> launched at container scheduler/finish events if the node is already 
> over-allocating itself. This can mitigate the issue of too many 
> opportunistic containers being launched and then quickly killed.
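A minimal sketch of the kind of guard being described, with hypothetical names
(the actual patch works inside the NM container scheduler and its utilization
tracking may differ):
{code:java}
// Illustrative only: gate opportunistic launches on an over-allocation
// check so scheduler/finish events stop piling on doomed containers.
public class OpportunisticGateSketch {
  interface UtilizationTracker {
    // Assumed: true when the node has already committed more than its
    // advertised capacity (over-allocation, per YARN-1011).
    boolean isOverAllocated();
  }

  private final UtilizationTracker tracker;

  OpportunisticGateSketch(UtilizationTracker tracker) {
    this.tracker = tracker;
  }

  /** Called on container scheduler/finish events. */
  void maybeStartOpportunisticContainers(Runnable launchPending) {
    if (tracker.isOverAllocated()) {
      return; // launching more would likely get them killed quickly
    }
    launchPending.run();
  }
}
{code}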



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8444) NodeResourceMonitor crashes on bad swapFree value

2018-06-22 Thread Eric Payne (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520597#comment-16520597
 ] 

Eric Payne commented on YARN-8444:
--

Thanks [~Jim_Brennan] for the work on this JIRA.

+1. I will commit soon.

> NodeResourceMonitor crashes on bad swapFree value
> -
>
> Key: YARN-8444
> URL: https://issues.apache.org/jira/browse/YARN-8444
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.3, 3.0.2
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-8444.001.patch
>
>
> Saw this on a node that was running out of memory. We can't have 
> NodeResourceMonitor exiting. The system was above 99% memory used at the 
> time, so this is not a common occurrence, but we should fix it since this 
> monitor is critical to the health of the node.
>  
> {noformat}
> 2018-06-04 14:28:08,539 [Container Monitor] DEBUG 
> ContainersMonitorImpl.audit: Memory usage of ProcessTree 110564 for 
> container-id container_e24_1526662705797_129647_01_004791: 2.1 GB of 3.5 GB 
> physical memory used; 5.0 GB of 7.3 GB virtual memory used
> 2018-06-04 14:28:10,622 [Node Resource Monitor] ERROR 
> yarn.YarnUncaughtExceptionHandler: Thread Thread[Node Resource 
> Monitor,5,main] threw an Exception.
> java.lang.NumberFormatException: For input string: "18446744073709551596"
>  at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>  at java.lang.Long.parseLong(Long.java:592)
>  at java.lang.Long.parseLong(Long.java:631)
>  at 
> org.apache.hadoop.util.SysInfoLinux.readProcMemInfoFile(SysInfoLinux.java:257)
>  at 
> org.apache.hadoop.util.SysInfoLinux.getAvailablePhysicalMemorySize(SysInfoLinux.java:591)
>  at 
> org.apache.hadoop.util.SysInfoLinux.getAvailableVirtualMemorySize(SysInfoLinux.java:601)
>  at 
> org.apache.hadoop.yarn.util.ResourceCalculatorPlugin.getAvailableVirtualMemorySize(ResourceCalculatorPlugin.java:74)
>  at 
> org.apache.hadoop.yarn.server.nodemanager.NodeResourceMonitorImpl$MonitoringThread.run(NodeResourceMonitorImpl.java:193)
> 2018-06-04 14:28:30,747 
> [org.apache.hadoop.util.JvmPauseMonitor$Monitor@226eba67] INFO 
> util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of 
> approximately 9330ms
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8446) Support of managing multi-dimensional resources

2018-06-22 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520585#comment-16520585
 ] 

Sunil Govindan commented on YARN-8446:
--

This looks very interesting. Thanks [~cheersyang] for filing this. This will 
definitely help to support a wide variety of resources in YARN.
{code:java}
public enum ResourceTypes {
  COUNTABLE
}
{code}
A few questions:
 # So we have only the COUNTABLE resource type for now, and this proposal adds 
two more: *SET* and *MULTIDIMENSIONAL*. I have some comments on this. Suppose 
we mark a resource as SET, e.g. with *IPAddress* as the resource name in this 
context.
 ## This works well if all values in the set are unique. But in a larger 
context, are we looking to consider * or similar special characters for each 
resource type? To make it clearer: each resource type might need some 
specification to be fed in, and * might be one such character with a specific 
meaning; there could be more such specs for each resource.
 ## Such non-countable resources will also be consumed by each container, and 
after use they have to be added back to the resource set. I am thinking of the 
performance cost here, as we might need to treat the resource as a critical 
section.
 ## As of today, a resource is per-node, per-partition, or at the cluster 
level. We aggregate and use the same for various purposes, metrics, etc. I am 
wondering about the semantic changes to APIs such as *Resources.add*, 
*Resources.multiply*, or Resources.divide. It would be better to keep such 
non-countable types out of these APIs, but that said, we might need to 
aggregate them to a higher level for the reason above. Could you please share 
some insights on this?
 # Earlier, in another JIRA on multiple resources, I commented about the 
concept of shared resources. Does *MULTIDIMENSIONAL* also cover shared 
resources?
 ## Same question as 1.3: how do we do these operations on *MULTIDIMENSIONAL*?

> Support of managing multi-dimensional resources
> ---
>
> Key: YARN-8446
> URL: https://issues.apache.org/jira/browse/YARN-8446
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Weiwei Yang
>Priority: Major
>
> To better support long-running jobs and services, we need to extend YARN to 
> support other resources, such as disk, IP, and port. The current resource 
> type model is not flexible enough to make this work because it only supports 
> the COUNTABLE type, which holds a single value.
>  
> We propose to extend resource types by adding two more general types, SET 
> and MULTIDIMENSIONAL (naming TBD), with schemas like:
> *SET*: a set of values
> {noformat}
> ["10.100.0.1", "10.100.0.2"]
> ["9981", "9982", "9983"]
> {noformat}
> *MULTIDIMENSIONAL*: a set of values, where each value can be a resource 
> instance with multiple values.
> {noformat}
> [ disk1 : { attributes: { "type" : "SATA", "index" : 1 },  "size" : "500gb", 
> "iops" : "1000" },
>   disk2 : { attributes: { "type" : "SSD", "index" : 2 },  "size" : "100gb", 
> "iops" : "1000" } ]
> {noformat}
> This way, we could support better resource management and isolation. The 
> idea is to make this as general as possible so we can easily support other 
> complex resources.
>  
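A hedged sketch of what a SET-typed resource could look like on the NM side,
touching the critical-section concern raised above (all names are
hypothetical; SET/MULTIDIMENSIONAL are proposals, not an existing YARN API):
{code:java}
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

public class SetResourceSketch {
  enum ResourceTypes { COUNTABLE, SET, MULTIDIMENSIONAL } // proposed extension

  // A pool of discrete values such as IP addresses or ports: allocation
  // removes a value, release puts it back. The synchronized methods are
  // the critical section the comment above worries about.
  static final class SetResource {
    private final Set<String> available;

    SetResource(Set<String> initial) {
      this.available = new HashSet<>(initial);
    }

    synchronized String allocate() {
      Iterator<String> it = available.iterator();
      if (!it.hasNext()) {
        throw new IllegalStateException("resource set exhausted");
      }
      String value = it.next();
      it.remove();
      return value;
    }

    synchronized void release(String value) {
      available.add(value);
    }
  }
}
{code}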



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8047) RMWebApp make external class pluggable

2018-06-22 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520368#comment-16520368
 ] 

genericqa commented on YARN-8047:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
35s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
21s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 14m 
33s{color} | {color:red} hadoop-yarn in trunk failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 18s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
34s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
13s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 3 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 54s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
51s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
45s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
14s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 67m 
35s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}165m 46s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8047 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12928754/YARN-8047-002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findb

[jira] [Commented] (YARN-8438) TestContainer.testKillOnNew flaky on trunk

2018-06-22 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520316#comment-16520316
 ] 

genericqa commented on YARN-8438:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
11s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 30m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m  7s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
20s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 49s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
32s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
40s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 
53s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
38s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}114m 20s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8438 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12928753/YARN-8438.005.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 652b9375577b 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 30728ac |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/21078/testReport/ |
| Max. process+thread count | 290 (vs. ulimit of 1

[jira] [Updated] (YARN-8047) RMWebApp make external class pluggable

2018-06-22 Thread Bilwa S T (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-8047:

Attachment: YARN-8047-002.patch

> RMWebApp make external class pluggable
> --
>
> Key: YARN-8047
> URL: https://issues.apache.org/jira/browse/YARN-8047
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Bibin A Chundatt
>Assignee: Bilwa S T
>Priority: Minor
> Attachments: YARN-8047-001.patch, YARN-8047-002.patch
>
>
> This JIRA should make sure we are able to plug in the web services and web 
> pages of the scheduler in the ResourceManager:
> * RMWebApp should allow binding external classes
> * RMController should allow plugging in scheduler classes
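Hadoop usually wires this kind of pluggability through Configuration. A
minimal sketch of the idea, with a hypothetical property key (the real
RMWebApp/RMController wiring in the patch may differ):
{code:java}
import org.apache.hadoop.conf.Configuration;

public class PluggableWebAppSketch {
  // Hypothetical key; not an existing yarn-site.xml property.
  private static final String WEBAPP_CLASS_KEY =
      "yarn.resourcemanager.webapp.custom-class";

  // Configuration.getClass falls back to defaultClass when the key is
  // unset, so stock deployments keep the built-in web app.
  static Class<?> resolveWebAppClass(Configuration conf,
      Class<?> defaultClass) {
    return conf.getClass(WEBAPP_CLASS_KEY, defaultClass);
  }
}
{code}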



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8438) TestContainer.testKillOnNew flaky on trunk

2018-06-22 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-8438:
-
Attachment: YARN-8438.005.patch

> TestContainer.testKillOnNew flaky on trunk
> --
>
> Key: YARN-8438
> URL: https://issues.apache.org/jira/browse/YARN-8438
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Major
> Attachments: YARN-8438.001.patch, YARN-8438.002.patch, 
> YARN-8438.003.patch, YARN-8438.004.patch, YARN-8438.005.patch
>
>
> Running this test several times (e.g. 30), it fails ~5-10 times.
> Stacktrace: 
> {code:java}
> java.lang.AssertionError at org.junit.Assert.fail(Assert.java:86) at 
> org.junit.Assert.assertTrue(Assert.java:41) at 
> org.junit.Assert.assertTrue(Assert.java:52) at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.TestContainer.testKillOnNew(TestContainer.java:594)
> {code}
> TestContainer:594 is the following code in trunk, currently:
> {code:java}
> Assert.assertTrue(containerMetrics.finishTime.value() >
> containerMetrics.startTime.value());
> {code}
> So sometimes the finish time is not greater than the start time.
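With a millisecond clock, start and finish can land in the same tick, so the
strict comparison is inherently racy. One obvious relaxation (shown only to
illustrate the race; the attached patches instead make the clock
deterministic) is:
{code:java}
// Tolerate start == finish within one clock tick.
Assert.assertTrue(containerMetrics.finishTime.value()
    >= containerMetrics.startTime.value());
{code}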



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8438) TestContainer.testKillOnNew flaky on trunk

2018-06-22 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520182#comment-16520182
 ] 

genericqa commented on YARN-8438:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
32s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
1s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 55s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
19s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  4m 
13s{color} | {color:red} hadoop-yarn in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  4m 13s{color} 
| {color:red} hadoop-yarn in the patch failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
26s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 59s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
12s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 
40s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
34s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}102m 45s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8438 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12928726/YARN-8438.004.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 958d0a18d2d5 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 30728ac |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| compile | 
https://builds.apache.org/job/PreCom

[jira] [Commented] (YARN-8438) TestContainer.testKillOnNew flaky on trunk

2018-06-22 Thread Szilard Nemeth (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520096#comment-16520096
 ] 

Szilard Nemeth commented on YARN-8438:
--

Hi [~miklos.szeg...@cloudera.com]!

Indeed, the case you mentioned is valid; I fixed the implementation to handle 
it as well.

I also applied synchronized to the method header.

What did you mean about using inheritance instead of reflection?

ContainerImpl.clock is a private static field, so extending this class for 
the purpose of replacing the clock in tests does not make sense to me.

While thinking about your comment, I realized that I had forgotten to save 
and restore the original clock of ContainerImpl, so I extended my patch with 
that chunk of code.

Please see my updated patch.
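For reference, a hedged sketch of the save-and-restore approach described
above, assuming the private static field is named "clock" and is non-final
(details may differ from the actual patch):
{code:java}
import java.lang.reflect.Field;
import java.util.function.Consumer;

import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl;
import org.apache.hadoop.yarn.util.Clock;
import org.apache.hadoop.yarn.util.ControlledClock;

public class ClockSwapSketch {
  // Runs a test scenario with ContainerImpl's clock replaced by a
  // ControlledClock, restoring the original clock afterwards.
  static void withControlledClock(Consumer<ControlledClock> scenario)
      throws Exception {
    Field clockField = ContainerImpl.class.getDeclaredField("clock");
    clockField.setAccessible(true);
    Clock originalClock = (Clock) clockField.get(null);
    ControlledClock testClock = new ControlledClock();
    testClock.setTime(1000L);              // container start will read this
    try {
      clockField.set(null, testClock);
      scenario.accept(testClock);          // scenario advances the clock,
                                           // e.g. testClock.setTime(2000L)
    } finally {
      clockField.set(null, originalClock); // restore, as described above
    }
  }
}
{code}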

> TestContainer.testKillOnNew flaky on trunk
> --
>
> Key: YARN-8438
> URL: https://issues.apache.org/jira/browse/YARN-8438
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Major
> Attachments: YARN-8438.001.patch, YARN-8438.002.patch, 
> YARN-8438.003.patch, YARN-8438.004.patch
>
>
> Running this test several times (e.g. 30), it fails ~5-10 times.
> Stacktrace: 
> {code:java}
> java.lang.AssertionError at org.junit.Assert.fail(Assert.java:86) at 
> org.junit.Assert.assertTrue(Assert.java:41) at 
> org.junit.Assert.assertTrue(Assert.java:52) at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.TestContainer.testKillOnNew(TestContainer.java:594)
> {code}
> TestContainer:594 is the following code in trunk, currently:
> {code:java}
> Assert.assertTrue(containerMetrics.finishTime.value() >
> containerMetrics.startTime.value());
> {code}
> So sometimes the finish time is not greater than the start time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8438) TestContainer.testKillOnNew flaky on trunk

2018-06-22 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-8438:
-
Attachment: YARN-8438.004.patch

> TestContainer.testKillOnNew flaky on trunk
> --
>
> Key: YARN-8438
> URL: https://issues.apache.org/jira/browse/YARN-8438
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Major
> Attachments: YARN-8438.001.patch, YARN-8438.002.patch, 
> YARN-8438.003.patch, YARN-8438.004.patch
>
>
> Running this test several times (e.g. 30), it fails ~5-10 times.
> Stacktrace: 
> {code:java}
> java.lang.AssertionError at org.junit.Assert.fail(Assert.java:86) at 
> org.junit.Assert.assertTrue(Assert.java:41) at 
> org.junit.Assert.assertTrue(Assert.java:52) at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.TestContainer.testKillOnNew(TestContainer.java:594)
> {code}
> TestContainer:594 is the following code in trunk, currently:
> {code:java}
> Assert.assertTrue(containerMetrics.finishTime.value() >
> containerMetrics.startTime.value());
> {code}
> So sometimes the finish time is not greater than the start time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8438) TestContainer.testKillOnNew flaky on trunk

2018-06-22 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-8438:
-
Attachment: YARN-8438.003.patch

> TestContainer.testKillOnNew flaky on trunk
> --
>
> Key: YARN-8438
> URL: https://issues.apache.org/jira/browse/YARN-8438
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Major
> Attachments: YARN-8438.001.patch, YARN-8438.002.patch, 
> YARN-8438.003.patch
>
>
> Running this test several times (e.g. 30), it fails ~5-10 times.
> Stacktrace: 
> {code:java}
> java.lang.AssertionError at org.junit.Assert.fail(Assert.java:86) at 
> org.junit.Assert.assertTrue(Assert.java:41) at 
> org.junit.Assert.assertTrue(Assert.java:52) at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.TestContainer.testKillOnNew(TestContainer.java:594)
> {code}
> TestContainer:594 is the following code in trunk, currently:
> {code:java}
> Assert.assertTrue(containerMetrics.finishTime.value() >
> containerMetrics.startTime.value());
> {code}
> So sometimes the finish time is not greater than the start time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org