[jira] [Commented] (YARN-8122) Component health threshold monitor

2018-04-11 Thread Gour Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434975#comment-16434975
 ] 

Gour Saha commented on YARN-8122:
-

Uploaded 001 patch. [~billie.rinaldi], please review when you get a chance.

> Component health threshold monitor
> --
>
> Key: YARN-8122
> URL: https://issues.apache.org/jira/browse/YARN-8122
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Gour Saha
>Assignee: Gour Saha
>Priority: Major
> Attachments: YARN-8122.001.patch, YARN-8122.draft.patch
>
>
> Slider supported component health threshold monitoring with SLIDER-1246. It 
> would be good to have this feature for YARN Service too.
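For context on what such a monitor does, here is a minimal, hypothetical sketch of a health-threshold check (illustration only, not the 001 patch; the names HealthThresholdCheck, thresholdPercent and graceWindowMs are made up): a component is flagged unhealthy when the percentage of ready instances stays below a configured threshold for longer than a grace window.

{code:java}
// Hypothetical sketch, not the actual YARN-8122 patch: flag a component as
// failed when the fraction of READY instances stays below a configured
// threshold for longer than a grace window.
public final class HealthThresholdCheck {

  private final double thresholdPercent;   // e.g. 80.0
  private final long graceWindowMs;        // how long a breach is tolerated
  private long firstBreachTime = -1;

  public HealthThresholdCheck(double thresholdPercent, long graceWindowMs) {
    this.thresholdPercent = thresholdPercent;
    this.graceWindowMs = graceWindowMs;
  }

  /** Returns true if the component should be considered failed. */
  public boolean evaluate(int readyInstances, int desiredInstances, long nowMs) {
    double readyPercent =
        desiredInstances == 0 ? 100.0 : 100.0 * readyInstances / desiredInstances;
    if (readyPercent >= thresholdPercent) {
      firstBreachTime = -1;          // healthy again, reset the breach timer
      return false;
    }
    if (firstBreachTime < 0) {
      firstBreachTime = nowMs;       // first time below threshold
    }
    return nowMs - firstBreachTime >= graceWindowMs;
  }
}
{code}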






[jira] [Updated] (YARN-8122) Component health threshold monitor

2018-04-11 Thread Gour Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gour Saha updated YARN-8122:

Attachment: YARN-8122.001.patch

> Component health threshold monitor
> --
>
> Key: YARN-8122
> URL: https://issues.apache.org/jira/browse/YARN-8122
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Gour Saha
>Assignee: Gour Saha
>Priority: Major
> Attachments: YARN-8122.001.patch, YARN-8122.draft.patch
>
>
> Slider supported component health threshold monitoring with SLIDER-1246. It 
> would be good to have this feature for YARN Service too.






[jira] [Updated] (YARN-8153) Guaranteed containers always stay in SCHEDULED on NM after restart

2018-04-11 Thread Yang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Wang updated YARN-8153:

Attachment: YARN-8153.001.patch

> Guaranteed containers always stay in SCHEDULED on NM after restart
> --
>
> Key: YARN-8153
> URL: https://issues.apache.org/jira/browse/YARN-8153
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yang Wang
>Assignee: Yang Wang
>Priority: Major
> Attachments: YARN-8153.001.patch
>
>
> When NM recovery is enabled, some containers stay in SCHEDULED forever after an 
> NM restart because the node appears to have insufficient resources.
> The root cause is that utilizationTracker.addContainerResources is called twice 
> during restart.






[jira] [Assigned] (YARN-8153) Guaranteed containers always stay in SCHEDULED on NM after restart

2018-04-11 Thread Yang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Wang reassigned YARN-8153:
---

Assignee: Yang Wang

> Guaranteed containers always stay in SCHEDULED on NM after restart
> --
>
> Key: YARN-8153
> URL: https://issues.apache.org/jira/browse/YARN-8153
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yang Wang
>Assignee: Yang Wang
>Priority: Major
>
> When NM recovery is enabled, some containers stay in SCHEDULED forever after an 
> NM restart because the node appears to have insufficient resources.
> The root cause is that utilizationTracker.addContainerResources is called twice 
> during restart.






[jira] [Created] (YARN-8153) Guaranteed containers always stay in SCHEDULED on NM after restart

2018-04-11 Thread Yang Wang (JIRA)
Yang Wang created YARN-8153:
---

 Summary: Guaranteed containers always stay in SCHEDULED on NM 
after restart
 Key: YARN-8153
 URL: https://issues.apache.org/jira/browse/YARN-8153
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Yang Wang


When NM recovery is enabled, some containers stay in SCHEDULED forever after an 
NM restart because the node appears to have insufficient resources.

The root cause is that utilizationTracker.addContainerResources is called twice 
during restart.
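For illustration, a minimal sketch of the double-counting issue and an idempotent guard follows. This is not the NodeManager code; UtilizationTrackerSketch and its fields are hypothetical names used only to show the idea.

{code:java}
// Hypothetical illustration of the problem described above: if recovery
// re-adds a container's resources to the utilization tracker and the normal
// scheduling path adds them again, the node looks fuller than it really is
// and new containers stay in SCHEDULED.
import java.util.HashSet;
import java.util.Set;

class UtilizationTrackerSketch {
  private long usedMemMB;
  private final Set<String> tracked = new HashSet<>();

  // Guarding on the container id makes the add idempotent, so calling it from
  // both the recovery path and the start path counts the container only once.
  void addContainerResources(String containerId, long memMB) {
    if (tracked.add(containerId)) {
      usedMemMB += memMB;
    }
  }

  long getUsedMemMB() {
    return usedMemMB;
  }
}
{code}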






[jira] [Commented] (YARN-7931) [atsv2 read acls] Include domain table creation as part of schema creator

2018-04-11 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434922#comment-16434922
 ] 

Vrushali C commented on YARN-7931:
--

Uploading v004 that addresses Haibo's point about ensuring special characters 
in rowkeys can be stored & retrieved.

> [atsv2 read acls] Include domain table creation as part of schema creator
> -
>
> Key: YARN-7931
> URL: https://issues.apache.org/jira/browse/YARN-7931
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Vrushali C
>Assignee: Vrushali C
>Priority: Major
> Attachments: YARN-7391.0001.patch, YARN-7391.0002.patch, 
> YARN-7391.0003.patch, YARN-7391.0004.patch
>
>
>  
> Update the schema creator to create a domain table to store timeline entity 
> domain info. 






[jira] [Updated] (YARN-7931) [atsv2 read acls] Include domain table creation as part of schema creator

2018-04-11 Thread Vrushali C (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vrushali C updated YARN-7931:
-
Attachment: YARN-7391.0004.patch

> [atsv2 read acls] Include domain table creation as part of schema creator
> -
>
> Key: YARN-7931
> URL: https://issues.apache.org/jira/browse/YARN-7931
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Vrushali C
>Assignee: Vrushali C
>Priority: Major
> Attachments: YARN-7391.0001.patch, YARN-7391.0002.patch, 
> YARN-7391.0003.patch, YARN-7391.0004.patch
>
>
>  
> Update the schema creator to create a domain table to store timeline entity 
> domain info. 






[jira] [Updated] (YARN-7481) Gpu locality support for Better AI scheduling

2018-04-11 Thread Chen Qingcha (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Qingcha updated YARN-7481:
---
Attachment: hadoop-2.7.2.port-gpu.patch

> Gpu locality support for Better AI scheduling
> -
>
> Key: YARN-7481
> URL: https://issues.apache.org/jira/browse/YARN-7481
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: api, RM, yarn
>Affects Versions: 2.7.2
>Reporter: Chen Qingcha
>Priority: Major
> Fix For: 2.7.2
>
> Attachments: GPU locality support for Job scheduling.pdf, 
> hadoop-2.7.2-gpu-port.patch, hadoop-2.7.2-gpu.patch, 
> hadoop-2.7.2.port-gpu.patch
>
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> We enhance Hadoop with GPU support for better AI job scheduling.
> Currently, YARN-3926 also supports GPU scheduling, but it treats GPUs only as a 
> countable resource.
> However, GPU placement is also very important to deep learning jobs for better 
> efficiency.
>  For example, a 2-GPU job running on GPUs {0, 1} can be faster than one running 
> on GPUs {0, 7} if GPUs 0 and 1 are under the same PCI-E switch while 0 and 7 are 
> not.
>  We add support to Hadoop 2.7.2 for GPU locality scheduling with fine-grained 
> GPU placement.
> A 64-bit bitmap is added to the YARN Resource, indicating both GPU usage and 
> locality information on a node (up to 64 GPUs per node). A '1' in a bit position 
> means the corresponding GPU is available; a '0' means it is not.
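As a rough illustration of the bitmap idea described above (not the attached patch; the class name and the switch topology below are assumed), the following sketch encodes GPU availability in a long and prefers two free GPUs under the same PCI-E switch:

{code:java}
// Hypothetical sketch of a 64-bit GPU bitmap: bit i is 1 when GPU i on the
// node is available. Assumed topology: GPUs 0-3 share switch 0, GPUs 4-7
// share switch 1.
final class GpuBitmapSketch {

  /** Returns a bitmap of the chosen GPUs, or 0 if the request cannot be met. */
  static long allocateTwoLocalGpus(long availableBitmap) {
    long[] switchMasks = {0x0FL, 0xF0L};      // assumed switch topology
    for (long mask : switchMasks) {
      long free = availableBitmap & mask;     // free GPUs under this switch
      if (Long.bitCount(free) >= 2) {
        long first = Long.lowestOneBit(free);
        long second = Long.lowestOneBit(free & ~first);
        return first | second;                // e.g. GPUs {0,1} instead of {0,7}
      }
    }
    return 0L;
  }

  public static void main(String[] args) {
    long available = 0b1000_0011L;            // GPUs 0, 1 and 7 are free
    System.out.printf("allocated bitmap = 0x%02X%n", allocateTwoLocalGpus(available));
  }
}
{code}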






[jira] [Updated] (YARN-7481) Gpu locality support for Better AI scheduling

2018-04-11 Thread Chen Qingcha (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Qingcha updated YARN-7481:
---
Attachment: (was: hadoop-2.7.2.port-gpu.patch)

> Gpu locality support for Better AI scheduling
> -
>
> Key: YARN-7481
> URL: https://issues.apache.org/jira/browse/YARN-7481
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: api, RM, yarn
>Affects Versions: 2.7.2
>Reporter: Chen Qingcha
>Priority: Major
> Fix For: 2.7.2
>
> Attachments: GPU locality support for Job scheduling.pdf, 
> hadoop-2.7.2-gpu-port.patch, hadoop-2.7.2-gpu.patch
>
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> We enhance Hadoop with GPU support for better AI job scheduling.
> Currently, YARN-3926 also supports GPU scheduling, but it treats GPUs only as a 
> countable resource.
> However, GPU placement is also very important to deep learning jobs for better 
> efficiency.
>  For example, a 2-GPU job running on GPUs {0, 1} can be faster than one running 
> on GPUs {0, 7} if GPUs 0 and 1 are under the same PCI-E switch while 0 and 7 are 
> not.
>  We add support to Hadoop 2.7.2 for GPU locality scheduling with fine-grained 
> GPU placement.
> A 64-bit bitmap is added to the YARN Resource, indicating both GPU usage and 
> locality information on a node (up to 64 GPUs per node). A '1' in a bit position 
> means the corresponding GPU is available; a '0' means it is not.






[jira] [Commented] (YARN-6315) Improve LocalResourcesTrackerImpl#isResourcePresent to return false for corrupted files

2018-04-11 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434893#comment-16434893
 ] 

genericqa commented on YARN-6315:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
24s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
19s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
 3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 26m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m  1s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
48s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
40s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 25m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 25m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 25m 
17s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 29s{color} | {color:orange} root: The patch generated 1 new + 244 unchanged 
- 1 fixed = 245 total (was 245) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
8m 30s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
10s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
36s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
5s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m  
9s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 
54s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}132m 37s{color} 
| {color:red} hadoop-mapreduce-client-jobclient in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
47s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}285m 55s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b |
| JIRA Issue | YARN-6315 |
| JIRA Patch URL | 

[jira] [Commented] (YARN-8103) Add CLI interface to query node attributes

2018-04-11 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434880#comment-16434880
 ] 

Weiwei Yang commented on YARN-8103:
---

Hi [~Naganarasimha]/[~bibinchundatt]

Thanks for sharing the details. I like the idea of grouping them together in a 
"node-attributes" command, which simplifies usage. It is OK to add it as a 
top-level command as long as there are authentication checks for each op.

> Add CLI interface to  query node attributes
> ---
>
> Key: YARN-8103
> URL: https://issues.apache.org/jira/browse/YARN-8103
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Major
>
> YARN-8100 will add an API interface for querying the attributes. This adds a 
> CLI interface for querying the node attributes of each node and for listing all 
> attributes in the cluster.






[jira] [Assigned] (YARN-8152) Add chart in SLS to illustrate the throughput of the scheduler

2018-04-11 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang reassigned YARN-8152:
-

Assignee: Tao Yang

> Add chart in SLS to illustrate the throughput of the scheduler
> --
>
> Key: YARN-8152
> URL: https://issues.apache.org/jira/browse/YARN-8152
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler-load-simulator
>Reporter: Weiwei Yang
>Assignee: Tao Yang
>Priority: Major
>
> Throughput is one of the key metrics for evaluating scheduler performance. This 
> proposes adding a chart to the SLS board to help evaluate scheduler performance.






[jira] [Created] (YARN-8152) Add chart in SLS to illustrate the throughput of the scheduler

2018-04-11 Thread Weiwei Yang (JIRA)
Weiwei Yang created YARN-8152:
-

 Summary: Add chart in SLS to illustrate the throughput of the 
scheduler
 Key: YARN-8152
 URL: https://issues.apache.org/jira/browse/YARN-8152
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler-load-simulator
Reporter: Weiwei Yang


Throughput is one of the key metrics for evaluating scheduler performance. This 
proposes adding a chart to the SLS board to help evaluate scheduler performance.






[jira] [Commented] (YARN-7939) Yarn Service Upgrade: add support to upgrade a component instance

2018-04-11 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434839#comment-16434839
 ] 

genericqa commented on YARN-7939:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
40s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 12 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
53s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 51s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
15s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  7m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
20s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 25s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 35 new + 402 unchanged - 2 fixed = 437 total (was 404) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 11s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 27m 55s{color} 
| {color:red} hadoop-yarn-client in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  7m  
6s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
36s{color} | {color:green} hadoop-yarn-services-api in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
34s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}117m 42s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.client.api.impl.TestAMRMClient |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b |
| JIRA Issue | YARN-7939 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12918640/YARN-7939.005.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  cc  |
| uname | Linux ccbff32dc3b7 3.13.0-139-generic 

[jira] [Commented] (YARN-7527) Over-allocate node resource in async-scheduling mode of CapacityScheduler

2018-04-11 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434832#comment-16434832
 ] 

Weiwei Yang commented on YARN-7527:
---

LGTM, committing this to branch-2 now.

> Over-allocate node resource in async-scheduling mode of CapacityScheduler
> -
>
> Key: YARN-7527
> URL: https://issues.apache.org/jira/browse/YARN-7527
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.0.0-alpha4, 2.9.1
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Fix For: 3.1.0, 3.0.3
>
> Attachments: YARN-7527-branch-2.001.patch, 
> YARN-7527-branch-2.002.patch, YARN-7527.001.patch
>
>
> Currently, in the async-scheduling mode of CapacityScheduler, node resources may 
> be over-allocated because the node resource check is ignored.
> {{FiCaSchedulerApp#commonCheckContainerAllocation}} checks whether this node has 
> enough available resource for the proposal and returns the check result 
> (true/false), but this result is ignored in {{CapacityScheduler#accept}} as 
> below.
> {noformat}
> commonCheckContainerAllocation(allocation, schedulerContainer);
> {noformat}
> If {{FiCaSchedulerApp#commonCheckContainerAllocation}} returns false, 
> {{CapacityScheduler#accept}} should also return false as below:
> {noformat}
> if (!commonCheckContainerAllocation(allocation, schedulerContainer)) {
>   return false;
> }
> {noformat}






[jira] [Commented] (YARN-7142) Support placement policy in yarn native services

2018-04-11 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434830#comment-16434830
 ] 

Weiwei Yang commented on YARN-7142:
---

Thanks [~gsaha]/[~leftnoteasy], that's OK, let's keep it this way. I was just 
hesitating over which one is more descriptive. Thanks for your feedback.

> Support placement policy in yarn native services
> 
>
> Key: YARN-7142
> URL: https://issues.apache.org/jira/browse/YARN-7142
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Reporter: Billie Rinaldi
>Assignee: Gour Saha
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-7142-branch-3.1.004.patch, YARN-7142.001.patch, 
> YARN-7142.002.patch, YARN-7142.003.patch, YARN-7142.004.patch
>
>
> Placement policy exists in the API but is not implemented yet.
> I have filed YARN-8074 to move the composite constraints implementation out 
> of this phase-1 implementation of placement policy.






[jira] [Commented] (YARN-8057) Inadequate information for handling catch clauses

2018-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434813#comment-16434813
 ] 

ASF GitHub Bot commented on YARN-8057:
--

Github user leekyosek commented on the issue:

https://github.com/apache/hadoop/pull/362
  
Good


> Inadequate information for handling catch clauses
> -
>
> Key: YARN-8057
> URL: https://issues.apache.org/jira/browse/YARN-8057
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: api, yarn
>Affects Versions: 3.0.0
>Reporter: Zhenhao Li
>Priority: Major
>  Labels: easyfix
>
> There are some situations where different exception types are caught, but the 
> handling of those exceptions does not distinguish between the types. 
> Here are the code snippets we found which have this problem:
> *org/apache/hadoop/yarn/client/api/impl/NMClientImpl.java*
> [https://github.com/apache/hadoop/blob/c02d2ba50db8a355ea03081c3984b2ea0c375a3f/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/NMClientImpl.java]
> At Line *125* and Line *129*, two exception types are caught, but the logging 
> statements there do not show the exception type at all. This may confuse whoever 
> reads the log, since they cannot tell which exception actually occurred.
>  
> Adding stack trace information to these two logging statements may be a simple 
> way to improve this.
>  
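A minimal sketch of that suggestion, assuming an SLF4J logger (this is not the NMClientImpl code itself; StopContainerSketch is a made-up class): pass the caught exception to the logger so its type and stack trace appear in the log.

{code:java}
// Sketch only: log the caught exception object so the exception type and
// stack trace end up in the log, instead of a message that reads the same
// for every exception type.
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class StopContainerSketch {
  private static final Logger LOG = LoggerFactory.getLogger(StopContainerSketch.class);

  void stopContainerQuietly(Runnable stopAction, String containerId) {
    try {
      stopAction.run();
    } catch (RuntimeException e) {
      // Passing the exception as the last argument preserves its class name
      // and stack trace, so readers can tell which failure actually happened.
      LOG.error("Failed to stop container " + containerId, e);
    }
  }
}
{code}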






[jira] [Commented] (YARN-8142) yarn service application stops when AM is killed with SIGTERM

2018-04-11 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434800#comment-16434800
 ] 

genericqa commented on YARN-8142:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
28s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
8m 54s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
12s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 31s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
40s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 55m  2s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b |
| JIRA Issue | YARN-8142 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12918641/YARN-8142.1.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux fcf6b1f34e97 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 
11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 0d898b7 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20315/testReport/ |
| Max. process+thread count | 687 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/20315/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> yarn service application 

[jira] [Commented] (YARN-7939) Yarn Service Upgrade: add support to upgrade a component instance

2018-04-11 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434781#comment-16434781
 ] 

genericqa commented on YARN-7939:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
41s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 12 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 43s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
12s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  7m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
33s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 43s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 35 new + 402 unchanged - 2 fixed = 437 total (was 404) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 16s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 28m 45s{color} 
| {color:red} hadoop-yarn-client in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  7m 
14s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
36s{color} | {color:green} hadoop-yarn-services-api in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
41s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}120m  0s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.client.api.impl.TestAMRMClient |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b |
| JIRA Issue | YARN-7939 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12918639/YARN-7939.005.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  cc  |
| uname | Linux 960a107f8313 3.13.0-139-generic 

[jira] [Commented] (YARN-8104) Add API to fetch node to attribute mapping

2018-04-11 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434770#comment-16434770
 ] 

Naganarasimha G R commented on YARN-8104:
-

Thanks for the patches [~bibinchundatt],

_Currently IMHO don't make sense since NodeAttributeManagerImpl doesn't use 
NodeId. We can take up this point once we support in NodeAttributeManager._

I never meant that the comment should change the API in NodeAttributeManager, nor 
are there any plans in that direction. So far the APIs exposed are either internal 
or admin-related. The question here is whether it would make sense for users to 
have an API keyed by NodeId.

Also, I was wondering what the behavior would be if IPs are used instead of 
hostnames while mapping?

_GetNodesToAttributesResponseProto yarn_service.proto requires the same 
yarn_protos._

Agreed.

The hadoop-mapreduce-client-jobclient unit tests either failed or timed out; can 
you do a local run and share the results?

Apart from that, the patch LGTM!

 

> Add API to fetch node to attribute mapping
> --
>
> Key: YARN-8104
> URL: https://issues.apache.org/jira/browse/YARN-8104
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Major
> Attachments: YARN-8104-YARN-3409.001.patch, 
> YARN-8104-YARN-3409.002.patch, YARN-8104-YARN-3409.003.patch, 
> YARN-8104-YARN-3409.004.patch, YARN-8104-YARN-3409.005.patch
>
>
> Add node/host to attribute mapping in yarn client API.






[jira] [Updated] (YARN-7810) TestDockerContainerRuntime test failures due to UID lookup of a non-existent user

2018-04-11 Thread Eric Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-7810:

Fix Version/s: 3.0.2
   2.10.0

[~ebadger] I cherry-picked 59828be1978ec942dda38774a1d9f741efa96f71 to branch-3.0 
and branch-2 without backporting changes from other JIRAs.

> TestDockerContainerRuntime test failures due to UID lookup of a non-existent 
> user
> -
>
> Key: YARN-7810
> URL: https://issues.apache.org/jira/browse/YARN-7810
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>Priority: Major
> Fix For: 3.1.0, 2.10.0, 3.0.2
>
> Attachments: YARN-7810-branch-2.001.patch, 
> YARN-7810-branch-3.0.001.patch, YARN-7810.001.patch, YARN-7810.002.patch
>
>
> YARN-7782 enabled the Docker runtime feature to remap the username to uid:gid 
> form for launching Docker containers. The feature does an {{id -u}} and {{id 
> -G}} to get the UID and GIDs. This fails with the test user, as that user 
> doesn't actually exist on the host.
> {code:java}
> [ERROR] 
> testContainerLaunchWithCustomNetworks(org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.TestDockerContainerRuntime)
>   Time elapsed: 0.411 s  <<< ERROR!
> org.apache.hadoop.yarn.server.nodemanager.containermanager.runtime.ContainerExecutionException:
>  
> ExitCodeException exitCode=1: id: 'run_as_user': no such user
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.getUserIdInfo(DockerLinuxContainerRuntime.java:711)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(DockerLinuxContainerRuntime.java:757)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.TestDockerContainerRuntime.testContainerLaunchWithCustomNetworks(TestDockerContainerRuntime.java:599){code}






[jira] [Updated] (YARN-7810) TestDockerContainerRuntime test failures due to UID lookup of a non-existent user

2018-04-11 Thread Eric Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-7810:

Attachment: YARN-7810-branch-2.001.patch

> TestDockerContainerRuntime test failures due to UID lookup of a non-existent 
> user
> -
>
> Key: YARN-7810
> URL: https://issues.apache.org/jira/browse/YARN-7810
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: YARN-7810-branch-2.001.patch, 
> YARN-7810-branch-3.0.001.patch, YARN-7810.001.patch, YARN-7810.002.patch
>
>
> YARN-7782 enabled the Docker runtime feature to remap the username to uid:gid 
> form for launching Docker containers. The feature does an {{id -u}} and {{id 
> -G}} to get the UID and GIDs. This fails with the test user, as that user 
> doesn't actually exist on the host.
> {code:java}
> [ERROR] 
> testContainerLaunchWithCustomNetworks(org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.TestDockerContainerRuntime)
>   Time elapsed: 0.411 s  <<< ERROR!
> org.apache.hadoop.yarn.server.nodemanager.containermanager.runtime.ContainerExecutionException:
>  
> ExitCodeException exitCode=1: id: 'run_as_user': no such user
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.getUserIdInfo(DockerLinuxContainerRuntime.java:711)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(DockerLinuxContainerRuntime.java:757)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.TestDockerContainerRuntime.testContainerLaunchWithCustomNetworks(TestDockerContainerRuntime.java:599){code}






[jira] [Commented] (YARN-8104) Add API to fetch node to attribute mapping

2018-04-11 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434737#comment-16434737
 ] 

genericqa commented on YARN-8104:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
41s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
|| || || || {color:brown} YARN-3409 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
29s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
 4s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 28m 
49s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
14s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  4m 
57s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
19m  7s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
19s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in 
YARN-3409 has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
45s{color} | {color:green} YARN-3409 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
18s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 43m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 43m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 43m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  6m 
 7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  9m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 31s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  7m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
46s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
47s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
11s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
12s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 66m 
55s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 27m  
1s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
29s{color} | {color:green} hadoop-yarn-server-router in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}124m 56s{color} 
| {color:red} hadoop-mapreduce-client-jobclient in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
51s{color} | {color:green} The patch does not generate ASF License 

[jira] [Updated] (YARN-7810) TestDockerContainerRuntime test failures due to UID lookup of a non-existent user

2018-04-11 Thread Eric Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-7810:

Attachment: YARN-7810-branch-3.0.001.patch

> TestDockerContainerRuntime test failures due to UID lookup of a non-existent 
> user
> -
>
> Key: YARN-7810
> URL: https://issues.apache.org/jira/browse/YARN-7810
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: YARN-7810-branch-3.0.001.patch, YARN-7810.001.patch, 
> YARN-7810.002.patch
>
>
> YARN-7782 enabled the Docker runtime feature to remap the username to uid:gid 
> form for launching Docker containers. The feature does an {{id -u}} and {{id 
> -G}} to get the UID and GIDs. This fails with the test user, as that user 
> doesn't actually exist on the host.
> {code:java}
> [ERROR] 
> testContainerLaunchWithCustomNetworks(org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.TestDockerContainerRuntime)
>   Time elapsed: 0.411 s  <<< ERROR!
> org.apache.hadoop.yarn.server.nodemanager.containermanager.runtime.ContainerExecutionException:
>  
> ExitCodeException exitCode=1: id: 'run_as_user': no such user
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.getUserIdInfo(DockerLinuxContainerRuntime.java:711)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(DockerLinuxContainerRuntime.java:757)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.TestDockerContainerRuntime.testContainerLaunchWithCustomNetworks(TestDockerContainerRuntime.java:599){code}






[jira] [Commented] (YARN-7810) TestDockerContainerRuntime test failures due to UID lookup of a non-existent user

2018-04-11 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434706#comment-16434706
 ] 

Eric Yang commented on YARN-7810:
-

This patch updates test cases that do not yet exist in branch-2 or 
branch-3.0.  I think we would want a smaller patch that does:
{code}
private String runAsUser = System.getProperty("user.name");
{code}

instead of backporting the following JIRAs:

YARN-5534
YARN-7487
YARN-5366
YARN-7729
YARN-7810

> TestDockerContainerRuntime test failures due to UID lookup of a non-existent 
> user
> -
>
> Key: YARN-7810
> URL: https://issues.apache.org/jira/browse/YARN-7810
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: YARN-7810.001.patch, YARN-7810.002.patch
>
>
> YARN-7782 enabled the Docker runtime feature to remap the username to uid:gid 
> form for launching Docker containers. The feature does an {{id -u}} and {{id 
> -G}} to get the UID and GIDs. This fails with the test user, as that user 
> doesn't actually exist on the host.
> {code:java}
> [ERROR] 
> testContainerLaunchWithCustomNetworks(org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.TestDockerContainerRuntime)
>   Time elapsed: 0.411 s  <<< ERROR!
> org.apache.hadoop.yarn.server.nodemanager.containermanager.runtime.ContainerExecutionException:
>  
> ExitCodeException exitCode=1: id: 'run_as_user': no such user
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.getUserIdInfo(DockerLinuxContainerRuntime.java:711)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(DockerLinuxContainerRuntime.java:757)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.TestDockerContainerRuntime.testContainerLaunchWithCustomNetworks(TestDockerContainerRuntime.java:599){code}






[jira] [Commented] (YARN-8138) Add unit test to validate queue priority preemption works under node partition.

2018-04-11 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434707#comment-16434707
 ] 

genericqa commented on YARN-8138:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 31m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 27s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 33s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 10 new + 17 unchanged - 0 fixed = 27 total (was 17) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 30s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 65m 56s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}129m  2s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b |
| JIRA Issue | YARN-8138 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12918622/YARN-8138.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 779bd232a387 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 18de6f2 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/20311/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| unit | 

[jira] [Updated] (YARN-8142) yarn service application stops when AM is killed with SIGTERM

2018-04-11 Thread Billie Rinaldi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Billie Rinaldi updated YARN-8142:
-
Attachment: YARN-8142.1.patch

> yarn service application stops when AM is killed with SIGTERM
> -
>
> Key: YARN-8142
> URL: https://issues.apache.org/jira/browse/YARN-8142
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-native-services
>Reporter: Yesha Vora
>Assignee: Billie Rinaldi
>Priority: Major
> Attachments: YARN-8142.1.patch
>
>
> Steps:
> 1) Launch sleeper job ( non-docker yarn service)
> {code}
> RUNNING: /usr/hdp/current/hadoop-yarn-client/bin/yarn app -launch 
> fault-test-am-sleeper 
> /usr/hdp/current/hadoop-yarn-client/yarn-service-examples/sleeper/sleeper.json
> WARNING: YARN_LOG_DIR has been replaced by HADOOP_LOG_DIR. Using value of 
> YARN_LOG_DIR.
> WARNING: YARN_LOGFILE has been replaced by HADOOP_LOGFILE. Using value of 
> YARN_LOGFILE.
> WARNING: YARN_PID_DIR has been replaced by HADOOP_PID_DIR. Using value of 
> YARN_PID_DIR.
> WARNING: YARN_OPTS has been replaced by HADOOP_OPTS. Using value of YARN_OPTS.
> 18/04/06 22:24:24 WARN util.NativeCodeLoader: Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> 18/04/06 22:24:24 INFO client.AHSProxy: Connecting to Application History 
> server at xxx:10200
> 18/04/06 22:24:24 INFO client.AHSProxy: Connecting to Application History 
> server at xxx:10200
> 18/04/06 22:24:24 INFO client.ApiServiceClient: Loading service definition 
> from local FS: 
> /usr/hdp/current/hadoop-yarn-client/yarn-service-examples/sleeper/sleeper.json
> 18/04/06 22:24:26 INFO util.log: Logging initialized @3631ms
> 18/04/06 22:24:37 INFO client.ApiServiceClient: Application ID: 
> application_1522887500374_0010
> Exit Code: 0{code}
> 2) Wait for sleeper component to be up
> 3) Kill AM process PID
>  
> Expected behavior:
> New attempt of AM will be started. The pre-existing container will keep 
> running
>  
> Actual behavior:
> Application finishes with State : FINISHED and Final-State : ENDED
> New attempt was never launched
> Note: when the AM gets a SIGTERM and gracefully shuts itself down, it shuts
> the entire app down instead of letting it continue to run for another attempt.






[jira] [Commented] (YARN-7654) Support ENTRY_POINT for docker container

2018-04-11 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434686#comment-16434686
 ] 

genericqa commented on YARN-7654:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
43s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
7s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 53s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
24s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  6m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
48s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 14s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 10 new + 115 unchanged - 0 fixed = 125 total (was 115) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
52s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 8 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m  7s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
45s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 20m  7s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  7m 
15s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
34s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}108m 34s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.nodemanager.containermanager.scheduler.TestContainerSchedulerQueuing
 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b |
| JIRA Issue | YARN-7654 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12918621/YARN-7654.010.patch |
| Optional Tests |  asflicense  compile  

[jira] [Updated] (YARN-7939) Yarn Service Upgrade: add support to upgrade a component instance

2018-04-11 Thread Chandni Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandni Singh updated YARN-7939:

Attachment: YARN-7939.005.patch

> Yarn Service Upgrade: add support to upgrade a component instance 
> --
>
> Key: YARN-7939
> URL: https://issues.apache.org/jira/browse/YARN-7939
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Attachments: YARN-7939.001.patch, YARN-7939.002.patch, 
> YARN-7939.003.patch, YARN-7939.004.patch, YARN-7939.005.patch
>
>
> Yarn core supports in-place upgrade of containers. A yarn service can 
> leverage that to provide in-place upgrade of component instances. Please see 
> YARN-7512 for details.
> Will add support to upgrade a single component instance first and then 
> iteratively add other APIs and features.
>  






[jira] [Updated] (YARN-7939) Yarn Service Upgrade: add support to upgrade a component instance

2018-04-11 Thread Chandni Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandni Singh updated YARN-7939:

Attachment: (was: YARN-7939.005.patch)

> Yarn Service Upgrade: add support to upgrade a component instance 
> --
>
> Key: YARN-7939
> URL: https://issues.apache.org/jira/browse/YARN-7939
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Attachments: YARN-7939.001.patch, YARN-7939.002.patch, 
> YARN-7939.003.patch, YARN-7939.004.patch
>
>
> Yarn core supports in-place upgrade of containers. A yarn service can 
> leverage that to provide in-place upgrade of component instances. Please see 
> YARN-7512 for details.
> Will add support to upgrade a single component instance first and then 
> iteratively add other APIs and features.
>  






[jira] [Updated] (YARN-7939) Yarn Service Upgrade: add support to upgrade a component instance

2018-04-11 Thread Chandni Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandni Singh updated YARN-7939:

Attachment: YARN-7939.005.patch

> Yarn Service Upgrade: add support to upgrade a component instance 
> --
>
> Key: YARN-7939
> URL: https://issues.apache.org/jira/browse/YARN-7939
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Attachments: YARN-7939.001.patch, YARN-7939.002.patch, 
> YARN-7939.003.patch, YARN-7939.004.patch, YARN-7939.005.patch
>
>
> Yarn core supports in-place upgrade of containers. A yarn service can 
> leverage that to provide in-place upgrade of component instances. Please see 
> YARN-7512 for details.
> Will add support to upgrade a single component instance first and then 
> iteratively add other APIs and features.
>  






[jira] [Updated] (YARN-8138) Add unit test to validate queue priority preemption works under node partition.

2018-04-11 Thread Zian Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zian Chen updated YARN-8138:

Description: 
Add unit test to validate queue priority preemption works under node partition.

Test configuration:
 queue A (capacity=50, priority=1)
 queue B (capacity=50, priority=2)
 both have accessible-node-labels set to x
 A.accessible-node-labels.x.capacity = 50
 B.accessible-node-labels.x.capacity = 50
 Along with this pre-emption related properties have been set.

Test steps:
 - Submit an application A1 to B, with am-container = container = 4096, no. of 
containers = 4
 - Submit an application A2 to A, with am-container = 1024, container = 2048, 
no of containers = (NUM_NM-1)
 - Kill application A1
 - Submit an application A3 to B with am-container=container=5210, no. of 
containers=NUM_NM
 - Expectation is that containers are pre-empted from application A2 to A3

  was:
Add unit test to validate queue priority preemption works under node partition.

Test configuration:
queue A (capacity=50, priority=1)
queue B (capacity=50, priority=2)
both have accessible-node-labels set to x
A.accessible-node-labels.x.capacity = 50
B.accessible-node-labels.x.capacity = 50
Along with this pre-emption related properties have been set.

Test steps:
 - Submit an application A1 to B, with am-container = container = 4096, no. of 
containers = 4
 - Submit an application A2 to A, with am-container = 1024, container = 2048, 
no of containers = (NUM_NM-1)
 - Kill application A1
 - Submit an application A3 to B with am-container=container=5210, no. of 
containers=NUM_NM
 - Expectation is that containers are pre-empted from application A2 to A3 but 
there is no container pre-emption happening


> Add unit test to validate queue priority preemption works under node 
> partition.
> ---
>
> Key: YARN-8138
> URL: https://issues.apache.org/jira/browse/YARN-8138
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Charan Hebri
>Assignee: Zian Chen
>Priority: Minor
> Attachments: YARN-8138.001.patch, YARN-8138.002.patch
>
>
> Add unit test to validate queue priority preemption works under node 
> partition.
> Test configuration:
>  queue A (capacity=50, priority=1)
>  queue B (capacity=50, priority=2)
>  both have accessible-node-labels set to x
>  A.accessible-node-labels.x.capacity = 50
>  B.accessible-node-labels.x.capacity = 50
>  Along with this pre-emption related properties have been set.
> Test steps:
>  - Submit an application A1 to B, with am-container = container = 4096, no. 
> of containers = 4
>  - Submit an application A2 to A, with am-container = 1024, container = 2048, 
> no of containers = (NUM_NM-1)
>  - Kill application A1
>  - Submit an application A3 to B with am-container=container=5210, no. of 
> containers=NUM_NM
>  - Expectation is that containers are pre-empted from application A2 to A3
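A minimal sketch of how the queue setup described above might be expressed as CapacityScheduler properties in a unit test. This assumes two leaf queues a and b directly under root; the property keys are the documented CapacityScheduler ones, while the helper class name is purely illustrative:

{code:java}
import org.apache.hadoop.conf.Configuration;

// Hypothetical helper: builds the queue/label configuration described above.
public final class QueuePriorityPreemptionConfig {
  private static final String PREFIX = "yarn.scheduler.capacity.";

  public static Configuration build() {
    Configuration conf = new Configuration();
    conf.set(PREFIX + "root.queues", "a,b");

    // queue A: capacity=50, priority=1, label x with 50% label capacity
    conf.set(PREFIX + "root.a.capacity", "50");
    conf.set(PREFIX + "root.a.priority", "1");
    conf.set(PREFIX + "root.a.accessible-node-labels", "x");
    conf.set(PREFIX + "root.a.accessible-node-labels.x.capacity", "50");

    // queue B: capacity=50, priority=2, label x with 50% label capacity
    conf.set(PREFIX + "root.b.capacity", "50");
    conf.set(PREFIX + "root.b.priority", "2");
    conf.set(PREFIX + "root.b.accessible-node-labels", "x");
    conf.set(PREFIX + "root.b.accessible-node-labels.x.capacity", "50");
    return conf;
  }
}
{code}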






[jira] [Updated] (YARN-8138) Add unit test to validate queue priority preemption works under node partition.

2018-04-11 Thread Zian Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zian Chen updated YARN-8138:

Description: 
Add unit test to validate queue priority preemption works under node partition.

Test configuration:
queue A (capacity=50, priority=1)
queue B (capacity=50, priority=2)
both have accessible-node-labels set to x
A.accessible-node-labels.x.capacity = 50
B.accessible-node-labels.x.capacity = 50
Along with this pre-emption related properties have been set.

Test steps:
 - Submit an application A1 to B, with am-container = container = 4096, no. of 
containers = 4
 - Submit an application A2 to A, with am-container = 1024, container = 2048, 
no of containers = (NUM_NM-1)
 - Kill application A1
 - Submit an application A3 to B with am-container=container=5210, no. of 
containers=NUM_NM
 - Expectation is that containers are pre-empted from application A2 to A3 but 
there is no container pre-emption happening

  was:
There seems to be an issue with pre-emption when using node labels with queue 
priority.

Test configuration:
queue A (capacity=50, priority=1)
queue B (capacity=50, priority=2)
both have accessible-node-labels set to x
A.accessible-node-labels.x.capacity = 50
B.accessible-node-labels.x.capacity = 50
Along with this pre-emption related properties have been set.

Test steps:
 - Set NM memory = 6000MB and containerMemory = 750MB
 - Submit an application A1 to B, with am-container = container = 
(6000-750-1500), no. of containers = 2
 - Submit an application A2 to A, with am-container = 750, container = 1500, no 
of containers = (NUM_NM-1)
 - Kill application A1
 - Submit an application A3 to B with am-container=container=5000, no. of 
containers=3
 - Expectation is that containers are pre-empted from application A2 to A3 but 
there is no container pre-emption happening
Container pre-emption is stuck with the message in the RM log,
{noformat}
2018-02-02 11:41:36,974 INFO capacity.CapacityScheduler 
(CapacityScheduler.java:tryCommit(2673)) - Allocation proposal accepted
2018-02-02 11:41:36,984 INFO capacity.CapacityScheduler 
(CapacityScheduler.java:allocateContainerOnSingleNode(1391)) - Trying to 
fulfill reservation for application application_1517571510094_0003 on node: 
XX:25454
2018-02-02 11:41:36,984 INFO allocator.AbstractContainerAllocator 
(AbstractContainerAllocator.java:getCSAssignmentFromAllocateResult(97)) - 
Reserved container application=application_1517571510094_0003 
resource= 
queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@3f04848e
 cluster=
2018-02-02 11:41:36,984 INFO capacity.CapacityScheduler 
(CapacityScheduler.java:tryCommit(2673)) - Allocation proposal accepted
2018-02-02 11:41:36,984 INFO capacity.CapacityScheduler 
(CapacityScheduler.java:allocateContainerOnSingleNode(1391)) - Trying to 
fulfill reservation for application application_1517571510094_0003 on node: 
XX:25454
2018-02-02 11:41:36,984 INFO allocator.AbstractContainerAllocator 
(AbstractContainerAllocator.java:getCSAssignmentFromAllocateResult(97)) - 
Reserved container application=application_1517571510094_0003 
resource= 
queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@3f04848e
 cluster=
2018-02-02 11:41:36,984 INFO capacity.CapacityScheduler 
(CapacityScheduler.java:tryCommit(2673)) - Allocation proposal accepted
2018-02-02 11:41:36,994 INFO capacity.CapacityScheduler 
(CapacityScheduler.java:allocateContainerOnSingleNode(1391)) - Trying to 
fulfill reservation for application application_1517571510094_0003 on node: 
XX:25454
2018-02-02 11:41:36,995 INFO allocator.AbstractContainerAllocator 
(AbstractContainerAllocator.java:getCSAssignmentFromAllocateResult(97)) - 
Reserved container application=application_1517571510094_0003 
resource= 
queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@3f04848e
 cluster={noformat}


> Add unit test to validate queue priority preemption works under node 
> partition.
> ---
>
> Key: YARN-8138
> URL: https://issues.apache.org/jira/browse/YARN-8138
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Charan Hebri
>Assignee: Zian Chen
>Priority: Minor
> Attachments: YARN-8138.001.patch, YARN-8138.002.patch
>
>
> Add unit test to validate queue priority preemption works under node 
> partition.
> Test configuration:
> queue A (capacity=50, priority=1)
> queue B (capacity=50, priority=2)
> both have accessible-node-labels set to x
> A.accessible-node-labels.x.capacity = 50
> 

[jira] [Issue Comment Deleted] (YARN-8138) Add unit test to validate queue priority preemption works under node partition.

2018-04-11 Thread Zian Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zian Chen updated YARN-8138:

Comment: was deleted

(was: Investigated this issue and wrote a UT to reproduce it. According to the UT, preemption did happen after application 3 was submitted, but not in the way the test scenario expected. There are several issues we need to clarify here.
 # When we set memory sizes for containers, we need to set them as multiples of 1024 MB; otherwise the scheduler will round them up to the nearest multiple of 1024 MB larger than the requested size. For example, app3 requested a 750 MB AM container but will actually get a 1024 MB container.
 # According to the log, preemption seems not to have happened, but it actually did happen with a long delay (roughly a minute). The reason is that when we set the "yarn.scheduler.capacity.ordering-policy.priority-utilization.underutilized-preemption.reserved-container-delay-ms" property, the reserved container will not be allocated before that timeout is hit, which delays preemption further.
 # Even though preemption happens, we should not expect A3 to launch all of its requested containers, because the amount of resource A3 can get is limited by the minimum guaranteed resource of the queue the application is submitted to. In this case we only expect two containers to be preempted, since Queue B reaches its minimum guaranteed resource (50% of the cluster resource) after two containers are preempted from Queue A.

So my suggestion is to recheck the test scenario with the issues mentioned above, adjust the settings accordingly, and the test should pass.

[~leftnoteasy], could you share your opinions as well? Thanks)
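For reference, the rounding in point 1 above can be illustrated with a small standalone sketch, assuming a 1024 MB minimum allocation; the class and method names are hypothetical:

{code:java}
// Hypothetical illustration of scheduler request normalization: memory asks
// are rounded up to the next multiple of the minimum allocation.
public final class AllocationRounding {
  static int normalizeMemoryMb(int requestedMb, int minAllocationMb) {
    // round up to the next multiple of minAllocationMb (both assumed positive)
    return ((requestedMb + minAllocationMb - 1) / minAllocationMb) * minAllocationMb;
  }

  public static void main(String[] args) {
    System.out.println(normalizeMemoryMb(750, 1024));  // 1024
    System.out.println(normalizeMemoryMb(5000, 1024)); // 5120
  }
}
{code}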

> Add unit test to validate queue priority preemption works under node 
> partition.
> ---
>
> Key: YARN-8138
> URL: https://issues.apache.org/jira/browse/YARN-8138
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Charan Hebri
>Assignee: Zian Chen
>Priority: Minor
> Attachments: YARN-8138.001.patch, YARN-8138.002.patch
>
>
> There seems to be an issue with pre-emption when using node labels with queue 
> priority.
> Test configuration:
> queue A (capacity=50, priority=1)
> queue B (capacity=50, priority=2)
> both have accessible-node-labels set to x
> A.accessible-node-labels.x.capacity = 50
> B.accessible-node-labels.x.capacity = 50
> Along with this pre-emption related properties have been set.
> Test steps:
>  - Set NM memory = 6000MB and containerMemory = 750MB
>  - Submit an application A1 to B, with am-container = container = 
> (6000-750-1500), no. of containers = 2
>  - Submit an application A2 to A, with am-container = 750, container = 1500, 
> no of containers = (NUM_NM-1)
>  - Kill application A1
>  - Submit an application A3 to B with am-container=container=5000, no. of 
> containers=3
>  - Expectation is that containers are pre-empted from application A2 to A3 
> but there is no container pre-emption happening
> Container pre-emption is stuck with the message in the RM log,
> {noformat}
> 2018-02-02 11:41:36,974 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:tryCommit(2673)) - Allocation proposal accepted
> 2018-02-02 11:41:36,984 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:allocateContainerOnSingleNode(1391)) - Trying to 
> fulfill reservation for application application_1517571510094_0003 on node: 
> XX:25454
> 2018-02-02 11:41:36,984 INFO allocator.AbstractContainerAllocator 
> (AbstractContainerAllocator.java:getCSAssignmentFromAllocateResult(97)) - 
> Reserved container application=application_1517571510094_0003 
> resource= 
> queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@3f04848e
>  cluster=
> 2018-02-02 11:41:36,984 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:tryCommit(2673)) - Allocation proposal accepted
> 2018-02-02 11:41:36,984 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:allocateContainerOnSingleNode(1391)) - Trying to 
> fulfill reservation for application application_1517571510094_0003 on node: 
> XX:25454
> 2018-02-02 11:41:36,984 INFO allocator.AbstractContainerAllocator 
> (AbstractContainerAllocator.java:getCSAssignmentFromAllocateResult(97)) - 
> Reserved container application=application_1517571510094_0003 
> resource= 
> queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@3f04848e
>  cluster=
> 2018-02-02 11:41:36,984 INFO capacity.CapacityScheduler 
> 

[jira] [Commented] (YARN-8151) Yarn RM Epoch should wrap around

2018-04-11 Thread Young Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434662#comment-16434662
 ] 

Young Chen commented on YARN-8151:
--

Right now, RM epoch values in sub-clusters are seeded in different ranges: 0, 
1000, 2000, etc. If one RM restarts often enough, its epoch can keep incrementing 
until it clashes with a neighboring sub-cluster's range, e.g. 999 -> 1000. To fix 
this, we introduce a configurable range that bounds epoch generation.
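A minimal sketch of what a bounded epoch generator could look like; the class name, fields, and the way the range is configured are assumptions for illustration, not the patch's actual API:

{code:java}
// Hypothetical sketch of a bounded epoch generator: each sub-cluster gets a
// base seed (0, 1000, 2000, ...) and a configurable range, and the epoch wraps
// inside [baseEpoch, baseEpoch + epochRange) instead of growing into the
// neighboring sub-cluster's seed.
public final class BoundedEpochGenerator {
  private final long baseEpoch;
  private final long epochRange;
  private long epoch;

  public BoundedEpochGenerator(long baseEpoch, long epochRange) {
    this.baseEpoch = baseEpoch;
    this.epochRange = epochRange;
    this.epoch = baseEpoch;
  }

  public synchronized long getAndIncrementEpoch() {
    long current = epoch;
    epoch = baseEpoch + ((epoch - baseEpoch + 1) % epochRange);
    return current;
  }

  public static void main(String[] args) {
    BoundedEpochGenerator gen = new BoundedEpochGenerator(1000, 1000);
    for (int i = 0; i < 1001; i++) {
      gen.getAndIncrementEpoch();
    }
    // after 1001 increments the epoch has wrapped back to 1001, never reaching 2000
    System.out.println(gen.getAndIncrementEpoch());
  }
}
{code}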

> Yarn RM Epoch should wrap around
> 
>
> Key: YARN-8151
> URL: https://issues.apache.org/jira/browse/YARN-8151
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Young Chen
>Assignee: Young Chen
>Priority: Major
>







[jira] [Updated] (YARN-8151) Yarn RM Epoch should wrap around

2018-04-11 Thread Young Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Young Chen updated YARN-8151:
-
Issue Type: Sub-task  (was: Improvement)
Parent: YARN-5597

> Yarn RM Epoch should wrap around
> 
>
> Key: YARN-8151
> URL: https://issues.apache.org/jira/browse/YARN-8151
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Young Chen
>Assignee: Young Chen
>Priority: Major
>







[jira] [Created] (YARN-8151) Yarn RM Epoch should wrap around

2018-04-11 Thread Young Chen (JIRA)
Young Chen created YARN-8151:


 Summary: Yarn RM Epoch should wrap around
 Key: YARN-8151
 URL: https://issues.apache.org/jira/browse/YARN-8151
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Young Chen
Assignee: Young Chen









[jira] [Comment Edited] (YARN-8103) Add CLI interface to query node attributes

2018-04-11 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434648#comment-16434648
 ] 

Naganarasimha G R edited comment on YARN-8103 at 4/11/18 10:17 PM:
---

Thanks [~bibinchundatt], for detailing it out.

[~cheersyang], as part of YARN-6856 we decided to have feature-specific grouping; that is when we introduced the *node-attributes* option directly under the yarn command instead of placing it as a sub-option under *rmadmin*. This was done primarily for two reasons:
 # It is more natural to capture all related operations under one heading, so that it is easier to use (similar to kubernetes).
 # Fewer commands to type in the CLI.

So the plan is that whichever sub-option requires admin access will be validated, while the others will be available to all users. Here, replace, add and remove will use *ResourceManagerAdministrationProtocol*, and the other options [~bibinchundatt] mentioned will use *ApplicationClientProtocol*.

As part of last week's call we decided to go with the first listing which Bibin mentioned. Of the last 3 points mentioned by [~bibinchundatt], 1 & 3 are up for discussion. IMO it is better to group them together, because that was the reason we introduced the *node-attributes* option in the first place.


was (Author: naganarasimha):
Thanks [~bibinchundatt], for detailing it out.

[~cheersyang], as part of YARN-6856 we decided to have feature-specific grouping; that is when we introduced the *node-attributes* option directly under the yarn command instead of placing it as a sub-option under *rmadmin*. This was done primarily for two reasons:
 # It is more natural to capture all related operations under one heading, so that it is easier to use (similar to kubernetes).
 # Fewer commands to type in the CLI.

So the plan is that whichever sub-option requires admin access will be validated, while the others will be available to all users. Here, replace, add and remove will use *ResourceManagerAdministrationProtocol*, and the other options [~bibinchundatt] mentioned will use *ApplicationClientProtocol*.

As part of last week's call we decided to go with the first 3 lists which Bibin mentioned. Of the last 3 points mentioned by [~bibinchundatt], 1 & 3 are up for discussion. IMO it is better to group them together, because that was the reason we introduced the *node-attributes* option in the first place.

> Add CLI interface to  query node attributes
> ---
>
> Key: YARN-8103
> URL: https://issues.apache.org/jira/browse/YARN-8103
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Major
>
> YARN-8100 will add API interface for querying the attributes. CLI interface 
> for querying node attributes for each nodes and list all attributes in 
> cluster.






[jira] [Commented] (YARN-7189) Container-executor doesn't remove Docker containers that error out early

2018-04-11 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434650#comment-16434650
 ] 

genericqa commented on YARN-7189:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
41s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-3.0 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
28s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
50s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
33s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
27m 47s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 12s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 16m 58s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 59m 17s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.nodemanager.containermanager.linux.runtime.TestDockerContainerRuntime
 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5aaf88d |
| JIRA Issue | YARN-7189 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12918618/YARN-7189-branch-3.0.002.patch
 |
| Optional Tests |  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux 1757ad56d177 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-3.0 / 5fe2b97 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/20309/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20309/testReport/ |
| Max. process+thread count | 303 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/20309/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Container-executor doesn't remove Docker containers that error out early
> 
>
> Key: YARN-7189
> URL: https://issues.apache.org/jira/browse/YARN-7189
> Project: Hadoop YARN
>  Issue Type: Sub-task
>

[jira] [Commented] (YARN-6315) Improve LocalResourcesTrackerImpl#isResourcePresent to return false for corrupted files

2018-04-11 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434649#comment-16434649
 ] 

Jason Lowe commented on YARN-6315:
--

Thanks for updating the patch!  Sorry for the delay in reviewing.

Why is downloadSize being added as a field to the abstract LocalResource record 
class?  I think it should be treated just like all the other "fields" in this 
class -- access methods in the abstract class but implementation details 
delegated to the concrete classes.  LocalResourceRequest has a {{downloadSize}} 
as well which makes things a bit confusing. 

{{downloadSize}} is being set to the same value as {{size}} in the record 
creation?  That makes me think we don't need to put this in the protocol buffer 
record but instead track this separately in a non-protocol buffer record on the 
NM side.

Thinking out loud here: I'm not sure we really need the client to tell us what 
the size is in the protocol buffer record.  We're already doing a timestamp 
check which should be close enough when combined with the path to uniquely 
identify the resource being localized.  Once we have uniquely identified the 
resource the NM will download for the user then we can track how big the 
download was when the NM localized it in LocalizedResource.  Then we can check 
that size against the size found on disk to verify the resource still looks OK, 
at least at a high level.

Note that YARN-2185 and archives in general may be problematic for this 
approach since what we download becomes something very different on the local 
disk due to the archive unpack process.
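A minimal sketch of the existence-plus-size check being discussed; the class, method, and parameter names are illustrative, and, as noted above, a raw size comparison would not hold for archives that are unpacked after download:

{code:java}
import java.io.File;

// Illustrative only: compare what is on disk with the size the NM recorded
// when it localized the resource. expectedSize <= 0 means "size unknown",
// in which case we fall back to a plain existence check.
public final class ResourcePresenceCheck {
  public static boolean isResourcePresent(File localFile, long expectedSize) {
    if (!localFile.exists()) {
      return false;
    }
    return expectedSize <= 0 || localFile.length() == expectedSize;
  }
}
{code}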


> Improve LocalResourcesTrackerImpl#isResourcePresent to return false for 
> corrupted files
> ---
>
> Key: YARN-6315
> URL: https://issues.apache.org/jira/browse/YARN-6315
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.3, 2.8.1
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
>Priority: Major
> Attachments: YARN-6315.001.patch, YARN-6315.002.patch, 
> YARN-6315.003.patch, YARN-6315.004.patch, YARN-6315.005.patch, 
> YARN-6315.006.patch
>
>
> We currently check if a resource is present by making sure that the file 
> exists locally. There can be a case where the LocalizationTracker thinks that 
> it has the resource if the file exists but with size 0 or less than the 
> "expected" size of the LocalResource. This JIRA tracks the change to harden 
> the isResourcePresent call to address that case.






[jira] [Commented] (YARN-8103) Add CLI interface to query node attributes

2018-04-11 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434648#comment-16434648
 ] 

Naganarasimha G R commented on YARN-8103:
-

Thanks [~bibinchundatt], for detailing it out.

[~cheersyang], as part of YARN-6856 we decided to have feature-specific grouping; that is when we introduced the *node-attributes* option directly under the yarn command instead of placing it as a sub-option under *rmadmin*. This was done primarily for two reasons:
 # It is more natural to capture all related operations under one heading, so that it is easier to use (similar to kubernetes).
 # Fewer commands to type in the CLI.

So the plan is that whichever sub-option requires admin access will be validated, while the others will be available to all users. Here, replace, add and remove will use *ResourceManagerAdministrationProtocol*, and the other options [~bibinchundatt] mentioned will use *ApplicationClientProtocol*.

As part of last week's call we decided to go with the first 3 lists which Bibin mentioned. Of the last 3 points mentioned by [~bibinchundatt], 1 & 3 are up for discussion. IMO it is better to group them together, because that was the reason we introduced the *node-attributes* option in the first place.

> Add CLI interface to  query node attributes
> ---
>
> Key: YARN-8103
> URL: https://issues.apache.org/jira/browse/YARN-8103
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Major
>
> YARN-8100 will add API interface for querying the attributes. CLI interface 
> for querying node attributes for each nodes and list all attributes in 
> cluster.






[jira] [Commented] (YARN-7810) TestDockerContainerRuntime test failures due to UID lookup of a non-existent user

2018-04-11 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434630#comment-16434630
 ] 

Eric Badger commented on YARN-7810:
---

It's definitely failing in branch-3.0 and branch-2. I just tested it. Can we 
backport it there?

> TestDockerContainerRuntime test failures due to UID lookup of a non-existent 
> user
> -
>
> Key: YARN-7810
> URL: https://issues.apache.org/jira/browse/YARN-7810
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: YARN-7810.001.patch, YARN-7810.002.patch
>
>
> YARN-7782 enabled the Docker runtime feature to remap the username to uid:gid 
> form for launching Docker containers. The feature does an {{id -u}} and {{id 
> -G}} to get the UID and GIDs. This fails with the test user, as that user 
> doesn't actually exist on the host.
> {code:java}
> [ERROR] 
> testContainerLaunchWithCustomNetworks(org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.TestDockerContainerRuntime)
>   Time elapsed: 0.411 s  <<< ERROR!
> org.apache.hadoop.yarn.server.nodemanager.containermanager.runtime.ContainerExecutionException:
>  
> ExitCodeException exitCode=1: id: 'run_as_user': no such user
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.getUserIdInfo(DockerLinuxContainerRuntime.java:711)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(DockerLinuxContainerRuntime.java:757)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.TestDockerContainerRuntime.testContainerLaunchWithCustomNetworks(TestDockerContainerRuntime.java:599){code}






[jira] [Created] (YARN-8150) Promoting AMSimulator

2018-04-11 Thread Young Chen (JIRA)
Young Chen created YARN-8150:


 Summary: Promoting AMSimulator
 Key: YARN-8150
 URL: https://issues.apache.org/jira/browse/YARN-8150
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler-load-simulator
Reporter: Young Chen
Assignee: Young Chen


Add a PromotingAMSimulator that exercises Opportunistic/Promote/Demote






[jira] [Updated] (YARN-8150) Promoting AMSimulator

2018-04-11 Thread Young Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Young Chen updated YARN-8150:
-
Issue Type: Sub-task  (was: Improvement)
Parent: YARN-5065

> Promoting AMSimulator
> -
>
> Key: YARN-8150
> URL: https://issues.apache.org/jira/browse/YARN-8150
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler-load-simulator
>Reporter: Young Chen
>Assignee: Young Chen
>Priority: Minor
>
> Add a PromotingAMSimulator that exercises Opportunistic/Promote/Demote






[jira] [Assigned] (YARN-6828) [Umbrella] Container preemption using OPPORTUNISTIC containers

2018-04-11 Thread Young Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Young Chen reassigned YARN-6828:


Assignee: Young Chen  (was: Arun Suresh)

> [Umbrella] Container preemption using OPPORTUNISTIC containers
> --
>
> Key: YARN-6828
> URL: https://issues.apache.org/jira/browse/YARN-6828
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Arun Suresh
>Assignee: Young Chen
>Priority: Major
>
> This is based on discussions with [~kasha] and [~kkaranasos].
> Currently, the YARN schedulers selects containers for preemption only in 
> response to a starved queue / app's request. We propose to allow the 
> Schedulers to mark containers that are allocated over queue 
> capacity/fair-share as Opportunistic containers.
> This JIRA proposes to allow Schedulers to:
> # Allocate all containers over the configured queue capacity/weight as 
> OPPORTUNISTIC.
> # Auto-promote running OPPORTUNISTIC containers of apps as and when their 
> GUARANTEED containers complete.






[jira] [Commented] (YARN-4781) Support intra-queue preemption for fairness ordering policy.

2018-04-11 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434611#comment-16434611
 ] 

Eric Payne commented on YARN-4781:
--

[~sunilg], actually, it looks like I can access the {{FairComparator}} class 
from {{FairOrderingPolicy}}:
{code:java|title=IntraQueueCandidatesSelector#TAFairOrderingComparator}
  public int compare(TempAppPerPartition ta1, TempAppPerPartition ta2) {
AbstractComparatorOrderingPolicy acop =
(AbstractComparatorOrderingPolicy)
ta1.getFiCaSchedulerApp().getCSLeafQueue().getOrderingPolicy();
return acop.getComparator()
  .compare(ta1.getFiCaSchedulerApp(), ta2.getFiCaSchedulerApp());
  }
{code}
It's still a little messy, but less intrusive than the other suggestions.

I have manually tested this in my pseudo cluster and it works pretty well. I'm 
still trying to get the unit tests mocked up correctly, so I won't post a patch 
just yet.

> Support intra-queue preemption for fairness ordering policy.
> 
>
> Key: YARN-4781
> URL: https://issues.apache.org/jira/browse/YARN-4781
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Major
> Attachments: YARN-4781.001.patch, YARN-4781.002.patch, 
> YARN-4781.003.patch
>
>
> We introduced fairness queue policy since YARN-3319, which will let large 
> applications make progresses and not starve small applications. However, if a 
> large application takes the queue’s resources, and containers of the large 
> app has long lifespan, small applications could still wait for resources for 
> long time and SLAs cannot be guaranteed.
> Instead of wait for application release resources on their own, we need to 
> preempt resources of queue with fairness policy enabled.






[jira] [Commented] (YARN-7088) Fix application start time and add submit time to UIs

2018-04-11 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434598#comment-16434598
 ] 

Haibo Chen commented on YARN-7088:
--

Thanks [~kanwaljeets] for the patch! I have a few questions/comments around the 
core state machine changes you've made in this patch.

1)  
{code:java}
 private static final class AppRunningOnAppAttemptLaunchTransition
  extends  RMAppTransition {
    @Override
    public void transition(RMAppImpl app, RMAppEvent event) {
  LOG.info("update the launch time");
  if(app.launchTime == 0) {
    app.launchTime = System.currentTimeMillis();
  }
    }
  }{code}
We are generating the timestamp when RMAppImpl is notified of the launch, but 
the correct timestamp should come from the RMAppAttemptEventType.LAUNCHED event 
that is generated by AMLauncher. If the event dispatch thread is falling behind, 
there could be a large gap between when RMAppImpl is notified and when the attempt 
is actually launched. (A minimal sketch of carrying the timestamp on the event 
itself follows at the end of this comment.)

2) I see two transitions are added to the RMAppImpl state machine, namely ACCEPTED 
--> ACCEPTED and RUNNING --> RUNNING upon the new RMAppEventType.ATTEMPT_LAUNCHED. 
However, I could not think of a case where an ATTEMPT_LAUNCHED event is received 
while the state machine is in the RUNNING state. Can you please elaborate a little 
more on that?

3) Some minor comments: there is an unused variable added in AMLaunchedTransition, 
and the log message in AppRunningOnAppAttemptLaunchTransition is too generic. Can 
we add the application id, attempt id and the timestamp?

The change in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/server/application_history_server.proto
 is unnecessary, since you have decided not to make the change for ATS (all versions).

The schedulingWaitTime in AppInfo is misleading if an application has multiple 
attempts. Because the launch time is updated every time a new attempt is launched, 
we can rename it to the launch time of the latest attempt. If there are multiple 
attempts, the running time of previous attempts would be counted as scheduling 
wait time, which is not correct.
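A minimal sketch of what carrying the timestamp on the event could look like; the class and field names are hypothetical, not the patch's actual types:

{code:java}
// Hypothetical sketch: the launch timestamp is taken where the launch really
// happens (the AM launcher) and carried on the event, so the RMAppImpl
// transition records that value instead of the dispatch-time clock.
public class AttemptLaunchedEvent {
  private final String applicationId;
  private final long launchTime;

  public AttemptLaunchedEvent(String applicationId, long launchTime) {
    this.applicationId = applicationId;
    this.launchTime = launchTime; // System.currentTimeMillis() at the launch site
  }

  public String getApplicationId() {
    return applicationId;
  }

  public long getLaunchTime() {
    return launchTime;
  }
}

// In the transition, something along these lines would then be possible:
//   if (app.launchTime == 0) {
//     app.launchTime = event.getLaunchTime();
//   }
{code}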

> Fix application start time and add submit time to UIs
> -
>
> Key: YARN-7088
> URL: https://issues.apache.org/jira/browse/YARN-7088
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha4
>Reporter: Abdullah Yousufi
>Assignee: Kanwaljeet Sachdev
>Priority: Major
> Attachments: YARN-7088.001.patch, YARN-7088.002.patch, 
> YARN-7088.003.patch, YARN-7088.004.patch, YARN-7088.005.patch, 
> YARN-7088.006.patch, YARN-7088.007.patch, YARN-7088.008.patch, 
> YARN-7088.009.patch, YARN-7088.010.patch, YARN-7088.011.patch, 
> YARN-7088.012.patch, YARN-7088.013.patch, YARN-7088.014.patch
>
>
> Currently, the start time in the old and new UI actually shows the app 
> submission time. There should actually be two different fields; one for the 
> app's submission and one for its start, as well as the elapsed pending time 
> between the two.






[jira] [Updated] (YARN-7221) Add security check for privileged docker container

2018-04-11 Thread Billie Rinaldi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Billie Rinaldi updated YARN-7221:
-
Fix Version/s: 3.1.1

> Add security check for privileged docker container
> --
>
> Key: YARN-7221
> URL: https://issues.apache.org/jira/browse/YARN-7221
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-7221.001.patch, YARN-7221.002.patch, 
> YARN-7221.003.patch, YARN-7221.004.patch, YARN-7221.005.patch, 
> YARN-7221.006.patch, YARN-7221.007.patch, YARN-7221.008.patch, 
> YARN-7221.009.patch, YARN-7221.010.patch, YARN-7221.011.patch, 
> YARN-7221.012.patch, YARN-7221.013.patch, YARN-7221.014.patch, 
> YARN-7221.015.patch, YARN-7221.016.patch, YARN-7221.017.patch, 
> YARN-7221.018.patch, YARN-7221.019.patch, YARN-7221.020.patch, 
> YARN-7221.021.patch, YARN-7221.022.patch
>
>
> When a Docker container is running with privileges, the majority use case is to 
> have some program start as root and then drop privileges to another user, e.g. 
> httpd starting privileged to bind to port 80, then dropping privileges to the 
> www user.
> # We should add a security check for submitting users, to verify they have 
> "sudo" access to run privileged containers.
> # We should remove --user=uid:gid for privileged containers.
>  
> Docker can be launched with the --privileged=true and --user=uid:gid flags. With 
> this parameter combination, the user will not have access to become root; 
> all docker exec commands will be dropped to the uid:gid user instead of 
> being granted privileges. A user can gain root privileges if the container file 
> system contains files that give the user extra power, but that type of image is 
> considered dangerous. A non-privileged user can launch a container with 
> special bits to acquire the same level of root power. Hence, we lose control of 
> which images should be run with --privileged, and who has sudo rights to use 
> privileged container images. As a result, we should check for sudo access and 
> then decide whether to parameterize --privileged=true OR --user=uid:gid. This 
> will avoid leading developers down the wrong path.
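A minimal sketch of the decision described in the two points above; the helper class, its parameters, and the way the docker arguments are assembled are illustrative assumptions, not the patch's actual code:

{code:java}
import java.util.List;

// Illustrative only: privileged containers require the submitting user to pass
// a sudo/ACL check and are launched without --user; non-privileged containers
// always run as the remapped uid:gid.
public final class PrivilegedDockerCheck {
  public static void addUserAndPrivilegeArgs(List<String> dockerArgs,
      String user, String uidGid, boolean privilegedRequested,
      boolean userHasSudoAccess) {
    if (privilegedRequested) {
      if (!userHasSudoAccess) {
        throw new IllegalStateException(
            "User " + user + " is not allowed to run privileged containers");
      }
      dockerArgs.add("--privileged=true"); // no --user; the image drops privileges itself
    } else {
      dockerArgs.add("--user=" + uidGid); // remapped uid:gid for the container
    }
  }
}
{code}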






[jira] [Commented] (YARN-7810) TestDockerContainerRuntime test failures due to UID lookup of a non-existent user

2018-04-11 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434576#comment-16434576
 ] 

Eric Yang commented on YARN-7810:
-

[~ebadger] I don't see the branch 2.9 test failing without this patch. I tried to 
cherry-pick this JIRA to 2.8, but some tests have diverged. This will require an 
addendum patch to backport to branch 2.9.

> TestDockerContainerRuntime test failures due to UID lookup of a non-existent 
> user
> -
>
> Key: YARN-7810
> URL: https://issues.apache.org/jira/browse/YARN-7810
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: YARN-7810.001.patch, YARN-7810.002.patch
>
>
> YARN-7782 enabled the Docker runtime feature to remap the username to uid:gid 
> form for launching Docker containers. The feature does an {{id -u}} and {{id 
> -G}} to get the UID and GIDs. This fails with the test user, as that user 
> doesn't actually exist on the host.
> {code:java}
> [ERROR] 
> testContainerLaunchWithCustomNetworks(org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.TestDockerContainerRuntime)
>   Time elapsed: 0.411 s  <<< ERROR!
> org.apache.hadoop.yarn.server.nodemanager.containermanager.runtime.ContainerExecutionException:
>  
> ExitCodeException exitCode=1: id: 'run_as_user': no such user
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.getUserIdInfo(DockerLinuxContainerRuntime.java:711)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(DockerLinuxContainerRuntime.java:757)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.TestDockerContainerRuntime.testContainerLaunchWithCustomNetworks(TestDockerContainerRuntime.java:599){code}






[jira] [Comment Edited] (YARN-7810) TestDockerContainerRuntime test failures due to UID lookup of a non-existent user

2018-04-11 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434576#comment-16434576
 ] 

Eric Yang edited comment on YARN-7810 at 4/11/18 9:07 PM:
--

[~ebadger] I don't see the branch-2.9 tests failing without this patch.  I tried to 
cherry-pick this JIRA to 2.9, but some tests have diverged.  This will require an 
addendum patch to backport to branch-2.9.


was (Author: eyang):
[~ebadger] I don't see the branch-2.9 tests failing without this patch.  I tried to 
cherry-pick this JIRA to 2.8, but some tests have diverged.  This will require an 
addendum patch to backport to branch-2.9.

> TestDockerContainerRuntime test failures due to UID lookup of a non-existent 
> user
> -
>
> Key: YARN-7810
> URL: https://issues.apache.org/jira/browse/YARN-7810
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: YARN-7810.001.patch, YARN-7810.002.patch
>
>
> YARN-7782 enabled the Docker runtime feature to remap the username to uid:gid 
> form for launching Docker containers. The feature does an {{id -u}} and {{id 
> -G}} to get the UID and GIDs. This fails with the test user, as that user 
> doesn't actually exist on the host.
> {code:java}
> [ERROR] 
> testContainerLaunchWithCustomNetworks(org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.TestDockerContainerRuntime)
>   Time elapsed: 0.411 s  <<< ERROR!
> org.apache.hadoop.yarn.server.nodemanager.containermanager.runtime.ContainerExecutionException:
>  
> ExitCodeException exitCode=1: id: 'run_as_user': no such user
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.getUserIdInfo(DockerLinuxContainerRuntime.java:711)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(DockerLinuxContainerRuntime.java:757)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.TestDockerContainerRuntime.testContainerLaunchWithCustomNetworks(TestDockerContainerRuntime.java:599){code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8147) TestClientRMService#testGetApplications sporadically fails

2018-04-11 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434565#comment-16434565
 ] 

genericqa commented on YARN-8147:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 12s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 28s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 1 new + 43 unchanged - 0 fixed = 44 total (was 43) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 18s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 67m 
37s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}119m 14s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b |
| JIRA Issue | YARN-8147 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12918606/YARN-8147.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 597229568f64 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 
11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 933477e |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/20308/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20308/testReport/ |
| Max. process+thread count | 838 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 

[jira] [Updated] (YARN-8138) Add unit test to validate queue priority preemption works under node partition.

2018-04-11 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-8138:
-
Target Version/s: 3.2.0, 3.1.1

> Add unit test to validate queue priority preemption works under node 
> partition.
> ---
>
> Key: YARN-8138
> URL: https://issues.apache.org/jira/browse/YARN-8138
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Charan Hebri
>Assignee: Zian Chen
>Priority: Minor
> Attachments: YARN-8138.001.patch, YARN-8138.002.patch
>
>
> There seems to be an issue with pre-emption when using node labels with queue 
> priority.
> Test configuration:
> queue A (capacity=50, priority=1)
> queue B (capacity=50, priority=2)
> both have accessible-node-labels set to x
> A.accessible-node-labels.x.capacity = 50
> B.accessible-node-labels.x.capacity = 50
> Along with this, pre-emption-related properties have been set.
> Test steps:
>  - Set NM memory = 6000MB and containerMemory = 750MB
>  - Submit an application A1 to B, with am-container = container = 
> (6000-750-1500), no. of containers = 2
>  - Submit an application A2 to A, with am-container = 750, container = 1500, 
> no of containers = (NUM_NM-1)
>  - Kill application A1
>  - Submit an application A3 to B with am-container=container=5000, no. of 
> containers=3
>  - Expectation is that containers are pre-empted from application A2 to A3 
> but there is no container pre-emption happening
> Container pre-emption is stuck with the message in the RM log,
> {noformat}
> 2018-02-02 11:41:36,974 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:tryCommit(2673)) - Allocation proposal accepted
> 2018-02-02 11:41:36,984 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:allocateContainerOnSingleNode(1391)) - Trying to 
> fulfill reservation for application application_1517571510094_0003 on node: 
> XX:25454
> 2018-02-02 11:41:36,984 INFO allocator.AbstractContainerAllocator 
> (AbstractContainerAllocator.java:getCSAssignmentFromAllocateResult(97)) - 
> Reserved container application=application_1517571510094_0003 
> resource= 
> queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@3f04848e
>  cluster=
> 2018-02-02 11:41:36,984 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:tryCommit(2673)) - Allocation proposal accepted
> 2018-02-02 11:41:36,984 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:allocateContainerOnSingleNode(1391)) - Trying to 
> fulfill reservation for application application_1517571510094_0003 on node: 
> XX:25454
> 2018-02-02 11:41:36,984 INFO allocator.AbstractContainerAllocator 
> (AbstractContainerAllocator.java:getCSAssignmentFromAllocateResult(97)) - 
> Reserved container application=application_1517571510094_0003 
> resource= 
> queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@3f04848e
>  cluster=
> 2018-02-02 11:41:36,984 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:tryCommit(2673)) - Allocation proposal accepted
> 2018-02-02 11:41:36,994 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:allocateContainerOnSingleNode(1391)) - Trying to 
> fulfill reservation for application application_1517571510094_0003 on node: 
> XX:25454
> 2018-02-02 11:41:36,995 INFO allocator.AbstractContainerAllocator 
> (AbstractContainerAllocator.java:getCSAssignmentFromAllocateResult(97)) - 
> Reserved container application=application_1517571510094_0003 
> resource= 
> queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@3f04848e
>  cluster={noformat}
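
For reference, the queue setup in the description maps onto CapacityScheduler properties roughly as in the sketch below. The key names are written from memory and should be checked against the CapacityScheduler documentation; this is not taken from the attached test patch.

{code:java}
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the two-queue, node-label configuration described above.
public class QueuePriorityLabelConfigSketch {
  public static void main(String[] args) {
    Map<String, String> conf = new LinkedHashMap<>();
    conf.put("yarn.scheduler.capacity.root.queues", "A,B");
    // Capacities on the default partition.
    conf.put("yarn.scheduler.capacity.root.A.capacity", "50");
    conf.put("yarn.scheduler.capacity.root.B.capacity", "50");
    // Queue priorities: B (2) is expected to preempt A (1).
    conf.put("yarn.scheduler.capacity.root.A.priority", "1");
    conf.put("yarn.scheduler.capacity.root.B.priority", "2");
    // Both queues can access partition x with 50/50 capacity on it.
    conf.put("yarn.scheduler.capacity.root.A.accessible-node-labels", "x");
    conf.put("yarn.scheduler.capacity.root.B.accessible-node-labels", "x");
    conf.put("yarn.scheduler.capacity.root.A.accessible-node-labels.x.capacity", "50");
    conf.put("yarn.scheduler.capacity.root.B.accessible-node-labels.x.capacity", "50");
    conf.forEach((k, v) -> System.out.println(k + "=" + v));
  }
}
{code}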



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8138) Add unit test to validate queue priority preemption works under node partition.

2018-04-11 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-8138:
-
Summary: Add unit test to validate queue priority preemption works under 
node partition.  (was: No containers pre-empted from another queue when using 
node labels)

> Add unit test to validate queue priority preemption works under node 
> partition.
> ---
>
> Key: YARN-8138
> URL: https://issues.apache.org/jira/browse/YARN-8138
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Charan Hebri
>Assignee: Zian Chen
>Priority: Minor
> Attachments: YARN-8138.001.patch, YARN-8138.002.patch
>
>
> There seems to be an issue with pre-emption when using node labels with queue 
> priority.
> Test configuration:
> queue A (capacity=50, priority=1)
> queue B (capacity=50, priority=2)
> both have accessible-node-labels set to x
> A.accessible-node-labels.x.capacity = 50
> B.accessible-node-labels.x.capacity = 50
> Along with this, pre-emption-related properties have been set.
> Test steps:
>  - Set NM memory = 6000MB and containerMemory = 750MB
>  - Submit an application A1 to B, with am-container = container = 
> (6000-750-1500), no. of containers = 2
>  - Submit an application A2 to A, with am-container = 750, container = 1500, 
> no of containers = (NUM_NM-1)
>  - Kill application A1
>  - Submit an application A3 to B with am-container=container=5000, no. of 
> containers=3
>  - Expectation is that containers are pre-empted from application A2 to A3 
> but there is no container pre-emption happening
> Container pre-emption is stuck with the message in the RM log,
> {noformat}
> 2018-02-02 11:41:36,974 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:tryCommit(2673)) - Allocation proposal accepted
> 2018-02-02 11:41:36,984 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:allocateContainerOnSingleNode(1391)) - Trying to 
> fulfill reservation for application application_1517571510094_0003 on node: 
> XX:25454
> 2018-02-02 11:41:36,984 INFO allocator.AbstractContainerAllocator 
> (AbstractContainerAllocator.java:getCSAssignmentFromAllocateResult(97)) - 
> Reserved container application=application_1517571510094_0003 
> resource= 
> queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@3f04848e
>  cluster=
> 2018-02-02 11:41:36,984 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:tryCommit(2673)) - Allocation proposal accepted
> 2018-02-02 11:41:36,984 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:allocateContainerOnSingleNode(1391)) - Trying to 
> fulfill reservation for application application_1517571510094_0003 on node: 
> XX:25454
> 2018-02-02 11:41:36,984 INFO allocator.AbstractContainerAllocator 
> (AbstractContainerAllocator.java:getCSAssignmentFromAllocateResult(97)) - 
> Reserved container application=application_1517571510094_0003 
> resource= 
> queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@3f04848e
>  cluster=
> 2018-02-02 11:41:36,984 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:tryCommit(2673)) - Allocation proposal accepted
> 2018-02-02 11:41:36,994 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:allocateContainerOnSingleNode(1391)) - Trying to 
> fulfill reservation for application application_1517571510094_0003 on node: 
> XX:25454
> 2018-02-02 11:41:36,995 INFO allocator.AbstractContainerAllocator 
> (AbstractContainerAllocator.java:getCSAssignmentFromAllocateResult(97)) - 
> Reserved container application=application_1517571510094_0003 
> resource= 
> queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@3f04848e
>  cluster={noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8138) No containers pre-empted from another queue when using node labels

2018-04-11 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-8138:
-
Priority: Minor  (was: Blocker)

> No containers pre-empted from another queue when using node labels
> --
>
> Key: YARN-8138
> URL: https://issues.apache.org/jira/browse/YARN-8138
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Charan Hebri
>Assignee: Zian Chen
>Priority: Minor
> Attachments: YARN-8138.001.patch, YARN-8138.002.patch
>
>
> There seems to be an issue with pre-emption when using node labels with queue 
> priority.
> Test configuration:
> queue A (capacity=50, priority=1)
> queue B (capacity=50, priority=2)
> both have accessible-node-labels set to x
> A.accessible-node-labels.x.capacity = 50
> B.accessible-node-labels.x.capacity = 50
> Along with this, pre-emption-related properties have been set.
> Test steps:
>  - Set NM memory = 6000MB and containerMemory = 750MB
>  - Submit an application A1 to B, with am-container = container = 
> (6000-750-1500), no. of containers = 2
>  - Submit an application A2 to A, with am-container = 750, container = 1500, 
> no of containers = (NUM_NM-1)
>  - Kill application A1
>  - Submit an application A3 to B with am-container=container=5000, no. of 
> containers=3
>  - Expectation is that containers are pre-empted from application A2 to A3 
> but there is no container pre-emption happening
> Container pre-emption is stuck with the message in the RM log,
> {noformat}
> 2018-02-02 11:41:36,974 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:tryCommit(2673)) - Allocation proposal accepted
> 2018-02-02 11:41:36,984 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:allocateContainerOnSingleNode(1391)) - Trying to 
> fulfill reservation for application application_1517571510094_0003 on node: 
> XX:25454
> 2018-02-02 11:41:36,984 INFO allocator.AbstractContainerAllocator 
> (AbstractContainerAllocator.java:getCSAssignmentFromAllocateResult(97)) - 
> Reserved container application=application_1517571510094_0003 
> resource= 
> queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@3f04848e
>  cluster=
> 2018-02-02 11:41:36,984 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:tryCommit(2673)) - Allocation proposal accepted
> 2018-02-02 11:41:36,984 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:allocateContainerOnSingleNode(1391)) - Trying to 
> fulfill reservation for application application_1517571510094_0003 on node: 
> XX:25454
> 2018-02-02 11:41:36,984 INFO allocator.AbstractContainerAllocator 
> (AbstractContainerAllocator.java:getCSAssignmentFromAllocateResult(97)) - 
> Reserved container application=application_1517571510094_0003 
> resource= 
> queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@3f04848e
>  cluster=
> 2018-02-02 11:41:36,984 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:tryCommit(2673)) - Allocation proposal accepted
> 2018-02-02 11:41:36,994 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:allocateContainerOnSingleNode(1391)) - Trying to 
> fulfill reservation for application application_1517571510094_0003 on node: 
> XX:25454
> 2018-02-02 11:41:36,995 INFO allocator.AbstractContainerAllocator 
> (AbstractContainerAllocator.java:getCSAssignmentFromAllocateResult(97)) - 
> Reserved container application=application_1517571510094_0003 
> resource= 
> queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@3f04848e
>  cluster={noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8138) No containers pre-empted from another queue when using node labels

2018-04-11 Thread Zian Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zian Chen updated YARN-8138:

Attachment: YARN-8138.002.patch

> No containers pre-empted from another queue when using node labels
> --
>
> Key: YARN-8138
> URL: https://issues.apache.org/jira/browse/YARN-8138
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Charan Hebri
>Assignee: Zian Chen
>Priority: Blocker
> Attachments: YARN-8138.001.patch, YARN-8138.002.patch
>
>
> There seems to be an issue with pre-emption when using node labels with queue 
> priority.
> Test configuration:
> queue A (capacity=50, priority=1)
> queue B (capacity=50, priority=2)
> both have accessible-node-labels set to x
> A.accessible-node-labels.x.capacity = 50
> B.accessible-node-labels.x.capacity = 50
> Along with this, pre-emption-related properties have been set.
> Test steps:
>  - Set NM memory = 6000MB and containerMemory = 750MB
>  - Submit an application A1 to B, with am-container = container = 
> (6000-750-1500), no. of containers = 2
>  - Submit an application A2 to A, with am-container = 750, container = 1500, 
> no of containers = (NUM_NM-1)
>  - Kill application A1
>  - Submit an application A3 to B with am-container=container=5000, no. of 
> containers=3
>  - Expectation is that containers are pre-empted from application A2 to A3 
> but there is no container pre-emption happening
> Container pre-emption is stuck with the message in the RM log,
> {noformat}
> 2018-02-02 11:41:36,974 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:tryCommit(2673)) - Allocation proposal accepted
> 2018-02-02 11:41:36,984 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:allocateContainerOnSingleNode(1391)) - Trying to 
> fulfill reservation for application application_1517571510094_0003 on node: 
> XX:25454
> 2018-02-02 11:41:36,984 INFO allocator.AbstractContainerAllocator 
> (AbstractContainerAllocator.java:getCSAssignmentFromAllocateResult(97)) - 
> Reserved container application=application_1517571510094_0003 
> resource= 
> queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@3f04848e
>  cluster=
> 2018-02-02 11:41:36,984 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:tryCommit(2673)) - Allocation proposal accepted
> 2018-02-02 11:41:36,984 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:allocateContainerOnSingleNode(1391)) - Trying to 
> fulfill reservation for application application_1517571510094_0003 on node: 
> XX:25454
> 2018-02-02 11:41:36,984 INFO allocator.AbstractContainerAllocator 
> (AbstractContainerAllocator.java:getCSAssignmentFromAllocateResult(97)) - 
> Reserved container application=application_1517571510094_0003 
> resource= 
> queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@3f04848e
>  cluster=
> 2018-02-02 11:41:36,984 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:tryCommit(2673)) - Allocation proposal accepted
> 2018-02-02 11:41:36,994 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:allocateContainerOnSingleNode(1391)) - Trying to 
> fulfill reservation for application application_1517571510094_0003 on node: 
> XX:25454
> 2018-02-02 11:41:36,995 INFO allocator.AbstractContainerAllocator 
> (AbstractContainerAllocator.java:getCSAssignmentFromAllocateResult(97)) - 
> Reserved container application=application_1517571510094_0003 
> resource= 
> queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@3f04848e
>  cluster={noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8138) No containers pre-empted from another queue when using node labels

2018-04-11 Thread Zian Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434543#comment-16434543
 ] 

Zian Chen commented on YARN-8138:
-

Fixed the failed test cases.

> No containers pre-empted from another queue when using node labels
> --
>
> Key: YARN-8138
> URL: https://issues.apache.org/jira/browse/YARN-8138
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Charan Hebri
>Assignee: Zian Chen
>Priority: Blocker
> Attachments: YARN-8138.001.patch
>
>
> There seems to be an issue with pre-emption when using node labels with queue 
> priority.
> Test configuration:
> queue A (capacity=50, priority=1)
> queue B (capacity=50, priority=2)
> both have accessible-node-labels set to x
> A.accessible-node-labels.x.capacity = 50
> B.accessible-node-labels.x.capacity = 50
> Along with this, pre-emption-related properties have been set.
> Test steps:
>  - Set NM memory = 6000MB and containerMemory = 750MB
>  - Submit an application A1 to B, with am-container = container = 
> (6000-750-1500), no. of containers = 2
>  - Submit an application A2 to A, with am-container = 750, container = 1500, 
> no of containers = (NUM_NM-1)
>  - Kill application A1
>  - Submit an application A3 to B with am-container=container=5000, no. of 
> containers=3
>  - Expectation is that containers are pre-empted from application A2 to A3 
> but there is no container pre-emption happening
> Container pre-emption is stuck with the message in the RM log,
> {noformat}
> 2018-02-02 11:41:36,974 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:tryCommit(2673)) - Allocation proposal accepted
> 2018-02-02 11:41:36,984 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:allocateContainerOnSingleNode(1391)) - Trying to 
> fulfill reservation for application application_1517571510094_0003 on node: 
> XX:25454
> 2018-02-02 11:41:36,984 INFO allocator.AbstractContainerAllocator 
> (AbstractContainerAllocator.java:getCSAssignmentFromAllocateResult(97)) - 
> Reserved container application=application_1517571510094_0003 
> resource= 
> queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@3f04848e
>  cluster=
> 2018-02-02 11:41:36,984 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:tryCommit(2673)) - Allocation proposal accepted
> 2018-02-02 11:41:36,984 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:allocateContainerOnSingleNode(1391)) - Trying to 
> fulfill reservation for application application_1517571510094_0003 on node: 
> XX:25454
> 2018-02-02 11:41:36,984 INFO allocator.AbstractContainerAllocator 
> (AbstractContainerAllocator.java:getCSAssignmentFromAllocateResult(97)) - 
> Reserved container application=application_1517571510094_0003 
> resource= 
> queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@3f04848e
>  cluster=
> 2018-02-02 11:41:36,984 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:tryCommit(2673)) - Allocation proposal accepted
> 2018-02-02 11:41:36,994 INFO capacity.CapacityScheduler 
> (CapacityScheduler.java:allocateContainerOnSingleNode(1391)) - Trying to 
> fulfill reservation for application application_1517571510094_0003 on node: 
> XX:25454
> 2018-02-02 11:41:36,995 INFO allocator.AbstractContainerAllocator 
> (AbstractContainerAllocator.java:getCSAssignmentFromAllocateResult(97)) - 
> Reserved container application=application_1517571510094_0003 
> resource= 
> queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@3f04848e
>  cluster={noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8018) Yarn Service Upgrade: Add support for initiating service upgrade

2018-04-11 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434542#comment-16434542
 ] 

Eric Yang commented on YARN-8018:
-

[~leftnoteasy] I agree with your assessment to backport this to the 3.1.1 release.

> Yarn Service Upgrade: Add support for initiating service upgrade
> 
>
> Key: YARN-8018
> URL: https://issues.apache.org/jira/browse/YARN-8018
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-8018-branch-3.1.007.patch, YARN-8018.001.patch, 
> YARN-8018.002.patch, YARN-8018.003.patch, YARN-8018.004.patch, 
> YARN-8018.005.patch, YARN-8018.006.patch, YARN-8018.007.patch
>
>
> Add support for initiating service upgrade which includes the following main 
> changes:
>  # Service API to initiate upgrade
>  # Persist service version on hdfs
>  # Start the upgraded version of service



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7654) Support ENTRY_POINT for docker container

2018-04-11 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434527#comment-16434527
 ] 

Eric Yang commented on YARN-7654:
-

Rebased patch 10 onto the current trunk after the YARN-7221 and YARN-7973 changes.

> Support ENTRY_POINT for docker container
> 
>
> Key: YARN-7654
> URL: https://issues.apache.org/jira/browse/YARN-7654
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.1.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Blocker
> Attachments: YARN-7654.001.patch, YARN-7654.002.patch, 
> YARN-7654.003.patch, YARN-7654.004.patch, YARN-7654.005.patch, 
> YARN-7654.006.patch, YARN-7654.007.patch, YARN-7654.008.patch, 
> YARN-7654.009.patch, YARN-7654.010.patch
>
>
> A Docker image may have an ENTRY_POINT predefined, but this is not supported in 
> the current implementation.  It would be nice if we could detect the existence of 
> {{launch_command}} and, based on this variable, launch the docker container in 
> different ways:
> h3. Launch command exists
> {code}
> docker run [image]:[version]
> docker exec [container_id] [launch_command]
> {code}
> h3. Use ENTRY_POINT
> {code}
> docker run [image]:[version]
> {code}
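
A minimal sketch of the branching described above, building the command lines without executing anything. The extra {{-d}} and {{--name}} flags and the class name are illustrative assumptions, not the DockerLinuxContainerRuntime implementation; real handling would also tokenize the launch command properly.

{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: if launch_command exists, run then exec it; otherwise rely on ENTRY_POINT.
public class EntryPointBranchSketch {

  static List<List<String>> buildCommands(String image, String containerId,
      String launchCommand) {
    List<List<String>> cmds = new ArrayList<>();
    cmds.add(Arrays.asList("docker", "run", "-d", "--name", containerId, image));
    if (launchCommand != null && !launchCommand.isEmpty()) {
      // Launch command exists: exec it inside the running container.
      cmds.add(Arrays.asList("docker", "exec", containerId, launchCommand));
    }
    // Otherwise no exec is issued; the image's ENTRY_POINT drives the container.
    return cmds;
  }

  public static void main(String[] args) {
    System.out.println(buildCommands("centos:7", "container_01", "sleep 3600"));
    System.out.println(buildCommands("httpd:2.4", "container_02", null));
  }
}
{code}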



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7654) Support ENTRY_POINT for docker container

2018-04-11 Thread Eric Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-7654:

Attachment: YARN-7654.010.patch

> Support ENTRY_POINT for docker container
> 
>
> Key: YARN-7654
> URL: https://issues.apache.org/jira/browse/YARN-7654
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.1.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Blocker
> Attachments: YARN-7654.001.patch, YARN-7654.002.patch, 
> YARN-7654.003.patch, YARN-7654.004.patch, YARN-7654.005.patch, 
> YARN-7654.006.patch, YARN-7654.007.patch, YARN-7654.008.patch, 
> YARN-7654.009.patch, YARN-7654.010.patch
>
>
> A Docker image may have an ENTRY_POINT predefined, but this is not supported in 
> the current implementation.  It would be nice if we could detect the existence of 
> {{launch_command}} and, based on this variable, launch the docker container in 
> different ways:
> h3. Launch command exists
> {code}
> docker run [image]:[version]
> docker exec [container_id] [launch_command]
> {code}
> h3. Use ENTRY_POINT
> {code}
> docker run [image]:[version]
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7221) Add security check for privileged docker container

2018-04-11 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434525#comment-16434525
 ] 

Eric Yang commented on YARN-7221:
-

[~billie.rinaldi] Thank you for the review and commit.
[~shaneku...@gmail.com] [~ebadger] [~jlowe] Thank you for the reviews.

> Add security check for privileged docker container
> --
>
> Key: YARN-7221
> URL: https://issues.apache.org/jira/browse/YARN-7221
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-7221.001.patch, YARN-7221.002.patch, 
> YARN-7221.003.patch, YARN-7221.004.patch, YARN-7221.005.patch, 
> YARN-7221.006.patch, YARN-7221.007.patch, YARN-7221.008.patch, 
> YARN-7221.009.patch, YARN-7221.010.patch, YARN-7221.011.patch, 
> YARN-7221.012.patch, YARN-7221.013.patch, YARN-7221.014.patch, 
> YARN-7221.015.patch, YARN-7221.016.patch, YARN-7221.017.patch, 
> YARN-7221.018.patch, YARN-7221.019.patch, YARN-7221.020.patch, 
> YARN-7221.021.patch, YARN-7221.022.patch
>
>
> When a Docker container runs with privileges, the majority use case is to have 
> some program start as root and then drop privileges to another user, e.g. 
> httpd starts privileged so it can bind to port 80, then drops privileges to the 
> www user.  
> # We should add a security check for submitting users, to verify they have 
> "sudo" access to run privileged containers.  
> # We should remove --user=uid:gid for privileged containers.  
>  
> Docker can be launched with both the --privileged=true and --user=uid:gid flags.  With 
> this parameter combination, the user will not be able to become the root user; 
> every docker exec command is dropped to the uid:gid user instead of being 
> granted privileges.  The user can still gain root privileges if the container file system 
> contains files that grant extra power, but this type of image is 
> considered dangerous.  A non-privileged user can launch a container with 
> special bits set to acquire the same level of root power.  Hence, we lose control of 
> which images should be run with --privileged, and who has sudo rights to use 
> privileged container images.  As a result, we should check for sudo access 
> and then decide whether to pass --privileged=true OR --user=uid:gid.  This will 
> avoid leading developers down the wrong path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8018) Yarn Service Upgrade: Add support for initiating service upgrade

2018-04-11 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434524#comment-16434524
 ] 

Wangda Tan commented on YARN-8018:
--

Thanks [~eyang], I think all the service-related APIs are marked as unstable 
in the 3.1.0 release. 

It is fine to include incomplete fixes to the native service as long as they 
comply with the Hadoop compatibility policy. I would prefer to make some of 
these changes earlier rather than doing them two months from now and missing 
dependencies / fixes, etc. 

Thoughts?

> Yarn Service Upgrade: Add support for initiating service upgrade
> 
>
> Key: YARN-8018
> URL: https://issues.apache.org/jira/browse/YARN-8018
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-8018-branch-3.1.007.patch, YARN-8018.001.patch, 
> YARN-8018.002.patch, YARN-8018.003.patch, YARN-8018.004.patch, 
> YARN-8018.005.patch, YARN-8018.006.patch, YARN-8018.007.patch
>
>
> Add support for initiating service upgrade which includes the following main 
> changes:
>  # Service API to initiate upgrade
>  # Persist service version on hdfs
>  # Start the upgraded version of service



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8018) Yarn Service Upgrade: Add support for initiating service upgrade

2018-04-11 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434510#comment-16434510
 ] 

genericqa commented on YARN-8018:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 16m 
28s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 6 new or modified test 
files. {color} |
|| || || || {color:brown} branch-3.1 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
27s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
36s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m  
5s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
20s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
46s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 36s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
54s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
5s{color} | {color:green} branch-3.1 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  7m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m  
2s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 13s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 2 new + 138 unchanged - 3 fixed = 140 total (was 141) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 26s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 28m 
10s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  5m 
24s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
31s{color} | {color:green} hadoop-yarn-services-api in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}122m 18s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:d4cc50f |
| JIRA Issue | YARN-8018 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12918603/YARN-8018-branch-3.1.007.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  cc  |
| uname | Linux bec79fc3a578 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 
19:38:41 UTC 

[jira] [Commented] (YARN-8060) Create default readiness check for service components

2018-04-11 Thread Shane Kumpf (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434502#comment-16434502
 ] 

Shane Kumpf commented on YARN-8060:
---

Thanks for the updated patch [~billie.rinaldi]! The latest patch lgtm, +1 
non-binding.

> Create default readiness check for service components
> -
>
> Key: YARN-8060
> URL: https://issues.apache.org/jira/browse/YARN-8060
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Reporter: Billie Rinaldi
>Assignee: Billie Rinaldi
>Priority: Major
> Attachments: YARN-8060.1.patch, YARN-8060.2.patch, YARN-8060.3.patch
>
>
> It is currently possible for a component instance to have READY status before 
> the AM retrieves an IP for the container. We should make sure the IP has been 
> retrieved before marking the instance as READY.
> This default probe could also have an option to check for a DNS entry for the 
> instance's hostname if a DNS address is provided.
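
A rough sketch of the default readiness rule being described: an instance is READY only when an IP is known, and optionally only when its hostname resolves in DNS. Class and method names are hypothetical, and this is not the code in the attached patches.

{code:java}
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.List;

// Sketch of a default readiness check: IP must be known, DNS check optional.
public class DefaultReadinessCheckSketch {

  static boolean isReady(List<String> containerIps, String hostname,
      boolean checkDns) {
    // No IP retrieved yet: not READY (the gap this JIRA closes).
    if (containerIps == null || containerIps.isEmpty()) {
      return false;
    }
    if (checkDns && hostname != null) {
      try {
        InetAddress.getByName(hostname); // require a resolvable DNS entry
      } catch (UnknownHostException e) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(isReady(null, "comp-0.app.user.example.com", false)); // false
    System.out.println(isReady(List.of("10.0.0.12"), "localhost", true));    // true
  }
}
{code}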



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7810) TestDockerContainerRuntime test failures due to UID lookup of a non-existent user

2018-04-11 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434501#comment-16434501
 ] 

Eric Badger commented on YARN-7810:
---

[~eyang], [~shaneku...@gmail.com], can we backport this through 2.9? YARN-7782 
was committed to all of those branches, so the unit tests are broken for all of 
them. 

> TestDockerContainerRuntime test failures due to UID lookup of a non-existent 
> user
> -
>
> Key: YARN-7810
> URL: https://issues.apache.org/jira/browse/YARN-7810
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: YARN-7810.001.patch, YARN-7810.002.patch
>
>
> YARN-7782 enabled the Docker runtime feature to remap the username to uid:gid 
> form for launching Docker containers. The feature does an {{id -u}} and {{id 
> -G}} to get the UID and GIDs. This fails with the test user, as that user 
> doesn't actually exist on the host.
> {code:java}
> [ERROR] 
> testContainerLaunchWithCustomNetworks(org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.TestDockerContainerRuntime)
>   Time elapsed: 0.411 s  <<< ERROR!
> org.apache.hadoop.yarn.server.nodemanager.containermanager.runtime.ContainerExecutionException:
>  
> ExitCodeException exitCode=1: id: 'run_as_user': no such user
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.getUserIdInfo(DockerLinuxContainerRuntime.java:711)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(DockerLinuxContainerRuntime.java:757)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.TestDockerContainerRuntime.testContainerLaunchWithCustomNetworks(TestDockerContainerRuntime.java:599){code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7189) Container-executor doesn't remove Docker containers that error out early

2018-04-11 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434466#comment-16434466
 ] 

Eric Badger commented on YARN-7189:
---

[~jlowe], the new patch cleans some things up. Looping over pclose() makes things a 
little bit awkward because of the required popen(). Let me know if this is 
acceptable.

> Container-executor doesn't remove Docker containers that error out early
> 
>
> Key: YARN-7189
> URL: https://issues.apache.org/jira/browse/YARN-7189
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 2.9.0, 2.8.3, 3.0.1
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Attachments: YARN-7189-b3.0.001.patch, 
> YARN-7189-branch-3.0.001.patch, YARN-7189-branch-3.0.002.patch
>
>
> Once the docker run command is executed, the docker container is created 
> unless the return code is 125, meaning that the run command itself failed 
> (https://docs.docker.com/engine/reference/run/#exit-status). Any error that 
> happens after the docker run needs to remove the container during cleanup.
> {noformat:title=container-executor.c:launch_docker_container_as_user}
>   snprintf(docker_command_with_binary, command_size, "%s %s", docker_binary, 
> docker_command);
>   fprintf(LOGFILE, "Launching docker container...\n");
>   FILE* start_docker = popen(docker_command_with_binary, "r");
> {noformat}
> This is fixed by YARN-5366, which changes how we remove containers. However, 
> that was committed into 3.1.0, so 2.8, 2.9, and 3.0 are all affected.
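
The cleanup rule above can be summarized with the Java sketch below (the real fix is C code in container-executor): exit code 125 means no container was created, while any failure after a successful {{docker run}} must remove the container. The container name, image, and placeholder post-run step are illustrative assumptions.

{code:java}
import java.io.IOException;

// Sketch: remove the container whenever something fails after docker run.
public class DockerCleanupSketch {

  static int exec(String... cmd) throws IOException, InterruptedException {
    return new ProcessBuilder(cmd).inheritIO().start().waitFor();
  }

  static void postRunSteps(String containerName) throws IOException {
    // Placeholder for everything that follows docker run in the real launcher
    // (reading the cidfile, docker wait/inspect, etc.).
    throw new IOException("simulated failure after docker run");
  }

  static int launch(String containerName, String image)
      throws IOException, InterruptedException {
    int rc = exec("docker", "run", "-d", "--name", containerName, image);
    if (rc == 125) {
      return rc; // the run command itself failed; no container to clean up
    }
    try {
      postRunSteps(containerName);
      return 0;
    } catch (IOException e) {
      // Any error after docker run: remove the container during cleanup.
      exec("docker", "rm", "-f", containerName);
      return 1;
    }
  }

  public static void main(String[] args) throws Exception {
    System.exit(launch("sketch_container", "busybox:latest"));
  }
}
{code}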



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7189) Container-executor doesn't remove Docker containers that error out early

2018-04-11 Thread Eric Badger (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated YARN-7189:
--
Attachment: YARN-7189-branch-3.0.002.patch

> Container-executor doesn't remove Docker containers that error out early
> 
>
> Key: YARN-7189
> URL: https://issues.apache.org/jira/browse/YARN-7189
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 2.9.0, 2.8.3, 3.0.1
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Attachments: YARN-7189-b3.0.001.patch, 
> YARN-7189-branch-3.0.001.patch, YARN-7189-branch-3.0.002.patch
>
>
> Once the docker run command is executed, the docker container is created 
> unless the return code is 125, meaning that the run command itself failed 
> (https://docs.docker.com/engine/reference/run/#exit-status). Any error that 
> happens after the docker run needs to remove the container during cleanup.
> {noformat:title=container-executor.c:launch_docker_container_as_user}
>   snprintf(docker_command_with_binary, command_size, "%s %s", docker_binary, 
> docker_command);
>   fprintf(LOGFILE, "Launching docker container...\n");
>   FILE* start_docker = popen(docker_command_with_binary, "r");
> {noformat}
> This is fixed by YARN-5366, which changes how we remove containers. However, 
> that was committed into 3.1.0, so 2.8, 2.9, and 3.0 are all affected.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8142) yarn service application stops when AM is killed with SIGTERM

2018-04-11 Thread Billie Rinaldi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434414#comment-16434414
 ] 

Billie Rinaldi commented on YARN-8142:
--

[~eyang], that argument makes sense to me. It does seem unexpected for SIGTERM 
to be more destructive than SIGKILL. I will put up a patch to make SIGTERM 
match the SIGKILL behavior.
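
The asymmetry being discussed comes from the fact that a JVM shutdown hook runs on SIGTERM but not on SIGKILL. The sketch below is a generic illustration of keeping that hook to local cleanup unless a stop was explicitly requested, so the RM can launch a new attempt; the names are hypothetical and this is not the actual service AM code or the eventual patch.

{code:java}
// Generic illustration of SIGTERM handling in a long-running JVM process.
public class ShutdownHookSketch {

  // Set only when a stop/destroy is explicitly requested through the service API.
  private static volatile boolean destroyRequested = false;

  public static void main(String[] args) throws InterruptedException {
    Runtime.getRuntime().addShutdownHook(new Thread(() -> {
      if (destroyRequested) {
        System.out.println("explicit stop: finish the application (final state ENDED)");
      } else {
        // SIGTERM without an explicit stop: local cleanup only, so a new AM
        // attempt can take over the still-running containers.
        System.out.println("SIGTERM: local cleanup only, keep the application running");
      }
    }));
    Thread.sleep(Long.MAX_VALUE); // stand-in for the AM main loop
  }
}
{code}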

> yarn service application stops when AM is killed with SIGTERM
> -
>
> Key: YARN-8142
> URL: https://issues.apache.org/jira/browse/YARN-8142
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-native-services
>Reporter: Yesha Vora
>Assignee: Billie Rinaldi
>Priority: Major
>
> Steps:
> 1) Launch sleeper job ( non-docker yarn service)
> {code}
> RUNNING: /usr/hdp/current/hadoop-yarn-client/bin/yarn app -launch 
> fault-test-am-sleeper 
> /usr/hdp/current/hadoop-yarn-client/yarn-service-examples/sleeper/sleeper.json
> WARNING: YARN_LOG_DIR has been replaced by HADOOP_LOG_DIR. Using value of 
> YARN_LOG_DIR.
> WARNING: YARN_LOGFILE has been replaced by HADOOP_LOGFILE. Using value of 
> YARN_LOGFILE.
> WARNING: YARN_PID_DIR has been replaced by HADOOP_PID_DIR. Using value of 
> YARN_PID_DIR.
> WARNING: YARN_OPTS has been replaced by HADOOP_OPTS. Using value of YARN_OPTS.
> 18/04/06 22:24:24 WARN util.NativeCodeLoader: Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> 18/04/06 22:24:24 INFO client.AHSProxy: Connecting to Application History 
> server at xxx:10200
> 18/04/06 22:24:24 INFO client.AHSProxy: Connecting to Application History 
> server at xxx:10200
> 18/04/06 22:24:24 INFO client.ApiServiceClient: Loading service definition 
> from local FS: 
> /usr/hdp/current/hadoop-yarn-client/yarn-service-examples/sleeper/sleeper.json
> 18/04/06 22:24:26 INFO util.log: Logging initialized @3631ms
> 18/04/06 22:24:37 INFO client.ApiServiceClient: Application ID: 
> application_1522887500374_0010
> Exit Code: 0{code}
> 2) Wait for sleeper component to be up
> 3) Kill AM process PID
>  
> Expected behavior:
> New attempt of AM will be started. The pre-existing container will keep 
> running
>  
> Actual behavior:
> Application finishes with State : FINISHED and Final-State : ENDED
> New attempt was never launched
> Note: 
> when the AM gets a SIGTERM and gracefully shuts itself down, it shuts 
> the entire app down instead of letting it continue to run for another attempt.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7221) Add security check for privileged docker container

2018-04-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434403#comment-16434403
 ] 

Hudson commented on YARN-7221:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13973 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13973/])
YARN-7221. Add security check for privileged docker container. (billie: rev 
933477e9e0526e2ed81ea454f8806de31981822a)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/utils/docker-util.c
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/utils/test_docker_util.cc
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/TestDockerContainerRuntime.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/DockerLinuxContainerRuntime.java


> Add security check for privileged docker container
> --
>
> Key: YARN-7221
> URL: https://issues.apache.org/jira/browse/YARN-7221
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-7221.001.patch, YARN-7221.002.patch, 
> YARN-7221.003.patch, YARN-7221.004.patch, YARN-7221.005.patch, 
> YARN-7221.006.patch, YARN-7221.007.patch, YARN-7221.008.patch, 
> YARN-7221.009.patch, YARN-7221.010.patch, YARN-7221.011.patch, 
> YARN-7221.012.patch, YARN-7221.013.patch, YARN-7221.014.patch, 
> YARN-7221.015.patch, YARN-7221.016.patch, YARN-7221.017.patch, 
> YARN-7221.018.patch, YARN-7221.019.patch, YARN-7221.020.patch, 
> YARN-7221.021.patch, YARN-7221.022.patch
>
>
> When a Docker container runs with privileges, the majority use case is to have 
> some program start as root and then drop privileges to another user, e.g. 
> httpd starts privileged so it can bind to port 80, then drops privileges to the 
> www user.  
> # We should add a security check for submitting users, to verify they have 
> "sudo" access to run privileged containers.  
> # We should remove --user=uid:gid for privileged containers.  
>  
> Docker can be launched with both the --privileged=true and --user=uid:gid flags.  With 
> this parameter combination, the user will not be able to become the root user; 
> every docker exec command is dropped to the uid:gid user instead of being 
> granted privileges.  The user can still gain root privileges if the container file system 
> contains files that grant extra power, but this type of image is 
> considered dangerous.  A non-privileged user can launch a container with 
> special bits set to acquire the same level of root power.  Hence, we lose control of 
> which images should be run with --privileged, and who has sudo rights to use 
> privileged container images.  As a result, we should check for sudo access 
> and then decide whether to pass --privileged=true OR --user=uid:gid.  This will 
> avoid leading developers down the wrong path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8140) Improve log message when launch cmd is ran for stopped yarn service

2018-04-11 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434397#comment-16434397
 ] 

genericqa commented on YARN-8140:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 49s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 39s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
29s{color} | {color:green} hadoop-yarn-services-api in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 53m 13s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b |
| JIRA Issue | YARN-8140 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12918590/YARN-8140.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 9bdf476e8abd 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / f7d5bac |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20306/testReport/ |
| Max. process+thread count | 347 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services-api
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services-api
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/20306/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |

[jira] [Commented] (YARN-8127) Resource leak when async scheduling is enabled

2018-04-11 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434388#comment-16434388
 ] 

Wangda Tan commented on YARN-8127:
--

Nice catch! Thanks [~Tao Yang] / [~cheersyang]!

> Resource leak when async scheduling is enabled
> --
>
> Key: YARN-8127
> URL: https://issues.apache.org/jira/browse/YARN-8127
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Weiwei Yang
>Assignee: Tao Yang
>Priority: Critical
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-8127.001.patch, YARN-8127.002.patch, 
> YARN-8127.003.patch, YARN-8127.004.patch
>
>
> Brief steps to reproduce:
>  # Enable async scheduling with 5 threads
>  # Submit a lot of jobs to try to exhaust the cluster resources
>  # After a while, observe that the NM allocated resource is more than the 
> resource requested by the allocated containers
> It looks like the commit phase does not handle reserved containers in a 
> synchronized way, causing some proposals to be incorrectly accepted; 
> subsequently, resources were deducted multiple times for a container.
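
For illustration, a minimal sketch of the kind of guard the description above 
implies: re-validate the reservation under the commit lock before accepting a 
proposal, so a reservation can only be consumed once. The names are 
hypothetical and this is not the actual CapacityScheduler commit code.

{code:java}
// Hypothetical sketch: accept an allocation proposal only after re-checking,
// while serialized, that the reservation it is based on still exists.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CommitGuard {

  /** containerId -> reserved memory (MB), maintained by the scheduler. */
  private final Map<String, Long> reservedContainers = new ConcurrentHashMap<>();

  private long allocatedMb = 0;

  /** Called by async scheduling threads when they build a proposal. */
  public void reserve(String containerId, long mb) {
    reservedContainers.put(containerId, mb);
  }

  /** Commit phase: serialized so a reservation is only consumed once. */
  public synchronized boolean tryCommit(String containerId, long mb) {
    // If another thread already committed (and removed) this reservation,
    // reject the proposal instead of deducting the resource a second time.
    Long reserved = reservedContainers.remove(containerId);
    if (reserved == null || reserved != mb) {
      return false;
    }
    allocatedMb += mb;
    return true;
  }

  public synchronized long getAllocatedMb() {
    return allocatedMb;
  }
}
{code}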



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8147) TestClientRMService#testGetApplications sporadically fails

2018-04-11 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-8147:
-
Attachment: YARN-8147.001.patch

> TestClientRMService#testGetApplications sporadically fails
> --
>
> Key: YARN-8147
> URL: https://issues.apache.org/jira/browse/YARN-8147
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>Priority: Major
> Attachments: YARN-8147.001.patch
>
>
> testGetApplications can fail sporadically when testing start time filters on 
> the request, e.g.:
> {noformat}
> java.lang.AssertionError: Incorrect number of matching start range 
> expected:<0> but was:<1>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestClientRMService.testGetApplications(TestClientRMService.java:798)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-8018) Yarn Service Upgrade: Add support for initiating service upgrade

2018-04-11 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434343#comment-16434343
 ] 

Eric Yang edited comment on YARN-8018 at 4/11/18 6:10 PM:
--

[~billie.rinaldi] This patch is not the complete upgrade framework.  However, it 
does introduce an incompatible "version" field into the yarnfile.  If we update 
the yarn services documentation to include the version field and accept this 
incompatibility, it can be cherry-picked and remain inactive so that the 
dependent patch can be backported.


was (Author: eyang):
[~billie.rinaldi] This patch is not the complete upgrade framework.  It can be 
cherry-picked and remain inactive so that the dependent patch can be backported.

> Yarn Service Upgrade: Add support for initiating service upgrade
> 
>
> Key: YARN-8018
> URL: https://issues.apache.org/jira/browse/YARN-8018
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-8018-branch-3.1.007.patch, YARN-8018.001.patch, 
> YARN-8018.002.patch, YARN-8018.003.patch, YARN-8018.004.patch, 
> YARN-8018.005.patch, YARN-8018.006.patch, YARN-8018.007.patch
>
>
> Add support for initiating service upgrade which includes the following main 
> changes:
>  # Service API to initiate upgrade
>  # Persist service version on hdfs
>  # Start the upgraded version of service



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8018) Yarn Service Upgrade: Add support for initiating service upgrade

2018-04-11 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434343#comment-16434343
 ] 

Eric Yang commented on YARN-8018:
-

[~billie.rinaldi] This patch is not the complete upgrade framework.  It can be 
cherry-picked and remain inactive so that the dependent patch can be backported.

> Yarn Service Upgrade: Add support for initiating service upgrade
> 
>
> Key: YARN-8018
> URL: https://issues.apache.org/jira/browse/YARN-8018
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-8018-branch-3.1.007.patch, YARN-8018.001.patch, 
> YARN-8018.002.patch, YARN-8018.003.patch, YARN-8018.004.patch, 
> YARN-8018.005.patch, YARN-8018.006.patch, YARN-8018.007.patch
>
>
> Add support for initiating service upgrade which includes the following main 
> changes:
>  # Service API to initiate upgrade
>  # Persist service version on hdfs
>  # Start the upgraded version of service



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8149) Revisit behavior of Re-Reservation in Capacity Scheduler

2018-04-11 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434342#comment-16434342
 ] 

Wangda Tan commented on YARN-8149:
--

[~jlowe] / [~eepayne] / [~cheersyang] / [~Tao Yang] / [~sunilg]. 

Could you share your thoughts on this? If we can remove this, the reservation 
logic can be simplified a lot.

> Revisit behavior of Re-Reservation in Capacity Scheduler
> 
>
> Key: YARN-8149
> URL: https://issues.apache.org/jira/browse/YARN-8149
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Priority: Critical
>
> Frankly speaking, I'm not sure why we need the re-reservation. The formula is 
> not that easy to understand:
> Inside: 
> {{org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator#shouldAllocOrReserveNewContainer}}
> {code:java}
> starvation = re-reservation / (#reserved-container * 
>  (1 - min(requested-resource / max-alloc, 
>   max-alloc - min-alloc / max-alloc)))
> should_allocate = starvation + requiredContainers - reservedContainers > 
> 0{code}
> I think we should be able to remove the starvation computation; just checking 
> requiredContainers > reservedContainers should be enough.
> In a large cluster, we can easily overflow re-reservation to MAX_INT, see 
> YARN-7636. 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8149) Revisit behavior of Re-Reservation in Capacity Scheduler

2018-04-11 Thread Wangda Tan (JIRA)
Wangda Tan created YARN-8149:


 Summary: Revisit behavior of Re-Reservation in Capacity Scheduler
 Key: YARN-8149
 URL: https://issues.apache.org/jira/browse/YARN-8149
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Wangda Tan


Frankly speaking, I'm not sure why we need the re-reservation. The formula is 
not that easy to understand:

Inside: 
{{org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator#shouldAllocOrReserveNewContainer}}
{code:java}
starvation = re-reservation / (#reserved-container * 
 (1 - min(requested-resource / max-alloc, 
  max-alloc - min-alloc / max-alloc)))
should_allocate = starvation + requiredContainers - reservedContainers > 0{code}
I think we should be able to remove the starvation computation; just checking 
requiredContainers > reservedContainers should be enough.

In a large cluster, we can easily overflow re-reservation to MAX_INT, see 
YARN-7636. 
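
As a rough sketch of the proposed simplification (illustrative only, not the 
actual RegularContainerAllocator code; the grouping of 
(max-alloc - min-alloc) / max-alloc is assumed from the surrounding formula):

{code:java}
// Illustrative sketch of the current starvation-based check versus the
// proposed simplification. All names are hypothetical.
public class ReReservationCheck {

  /** Current idea, roughly: a "starvation" factor derived from how often the
   *  container has been re-reserved tips the balance toward allocating. */
  static boolean shouldAllocateWithStarvation(long reReservations,
      int reservedContainers, int requiredContainers,
      double requested, double maxAlloc, double minAlloc) {
    if (reservedContainers == 0) {
      return requiredContainers > 0;
    }
    double starvation = reReservations
        / (reservedContainers
            * (1 - Math.min(requested / maxAlloc,
                            (maxAlloc - minAlloc) / maxAlloc)));
    return starvation + requiredContainers - reservedContainers > 0;
  }

  /** Proposed simplification: drop the starvation term entirely. */
  static boolean shouldAllocateSimplified(int requiredContainers,
      int reservedContainers) {
    return requiredContainers > reservedContainers;
  }
}
{code}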

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Reopened] (YARN-8018) Yarn Service Upgrade: Add support for initiating service upgrade

2018-04-11 Thread Chandni Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandni Singh reopened YARN-8018:
-

Reopening to run Jenkins.

> Yarn Service Upgrade: Add support for initiating service upgrade
> 
>
> Key: YARN-8018
> URL: https://issues.apache.org/jira/browse/YARN-8018
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-8018-branch-3.1.007.patch, YARN-8018.001.patch, 
> YARN-8018.002.patch, YARN-8018.003.patch, YARN-8018.004.patch, 
> YARN-8018.005.patch, YARN-8018.006.patch, YARN-8018.007.patch
>
>
> Add support for initiating service upgrade which includes the following main 
> changes:
>  # Service API to initiate upgrade
>  # Persist service version on hdfs
>  # Start the upgraded version of service



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8018) Yarn Service Upgrade: Add support for initiating service upgrade

2018-04-11 Thread Chandni Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandni Singh updated YARN-8018:

Attachment: YARN-8018-branch-3.1.007.patch

> Yarn Service Upgrade: Add support for initiating service upgrade
> 
>
> Key: YARN-8018
> URL: https://issues.apache.org/jira/browse/YARN-8018
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-8018-branch-3.1.007.patch, YARN-8018.001.patch, 
> YARN-8018.002.patch, YARN-8018.003.patch, YARN-8018.004.patch, 
> YARN-8018.005.patch, YARN-8018.006.patch, YARN-8018.007.patch
>
>
> Add support for initiating service upgrade which includes the following main 
> changes:
>  # Service API to initiate upgrade
>  # Persist service version on hdfs
>  # Start the upgraded version of service



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8142) yarn service application stops when AM is killed with SIGTERM

2018-04-11 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434316#comment-16434316
 ] 

Eric Yang commented on YARN-8142:
-

[~billie.rinaldi] [~shaneku...@gmail.com] SIGTERM should not be more 
destructive than SIGKILL.  Hence, I am changing my mind about using SIGTERM to 
terminate the entire application.  It would be good to keep it consistent with 
existing YARN AM design.

> yarn service application stops when AM is killed with SIGTERM
> -
>
> Key: YARN-8142
> URL: https://issues.apache.org/jira/browse/YARN-8142
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-native-services
>Reporter: Yesha Vora
>Assignee: Billie Rinaldi
>Priority: Major
>
> Steps:
> 1) Launch sleeper job ( non-docker yarn service)
> {code}
> RUNNING: /usr/hdp/current/hadoop-yarn-client/bin/yarn app -launch 
> fault-test-am-sleeper 
> /usr/hdp/current/hadoop-yarn-client/yarn-service-examples/sleeper/sleeper.json
> WARNING: YARN_LOG_DIR has been replaced by HADOOP_LOG_DIR. Using value of 
> YARN_LOG_DIR.
> WARNING: YARN_LOGFILE has been replaced by HADOOP_LOGFILE. Using value of 
> YARN_LOGFILE.
> WARNING: YARN_PID_DIR has been replaced by HADOOP_PID_DIR. Using value of 
> YARN_PID_DIR.
> WARNING: YARN_OPTS has been replaced by HADOOP_OPTS. Using value of YARN_OPTS.
> 18/04/06 22:24:24 WARN util.NativeCodeLoader: Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> 18/04/06 22:24:24 INFO client.AHSProxy: Connecting to Application History 
> server at xxx:10200
> 18/04/06 22:24:24 INFO client.AHSProxy: Connecting to Application History 
> server at xxx:10200
> 18/04/06 22:24:24 INFO client.ApiServiceClient: Loading service definition 
> from local FS: 
> /usr/hdp/current/hadoop-yarn-client/yarn-service-examples/sleeper/sleeper.json
> 18/04/06 22:24:26 INFO util.log: Logging initialized @3631ms
> 18/04/06 22:24:37 INFO client.ApiServiceClient: Application ID: 
> application_1522887500374_0010
> Exit Code: 0{code}
> 2) Wait for sleeper component to be up
> 3) Kill AM process PID
>  
> Expected behavior:
> New attempt of AM will be started. The pre-existing container will keep 
> running
>  
> Actual behavior:
> Application finishes with State : FINISHED and Final-State : ENDED
> New attempt was never launched
> Note: 
> when the AM gets a SIGTERM, it gracefully shuts itself down. It shuts the 
> entire app down instead of letting it continue to run for another attempt.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8148) Update decimal values for queue capacities shown on queue status cli

2018-04-11 Thread Prabhu Joseph (JIRA)
Prabhu Joseph created YARN-8148:
---

 Summary: Update decimal values for queue capacities shown on queue 
status cli
 Key: YARN-8148
 URL: https://issues.apache.org/jira/browse/YARN-8148
 Project: Hadoop YARN
  Issue Type: Bug
  Components: client
Affects Versions: 3.0.0
Reporter: Prabhu Joseph
Assignee: Prabhu Joseph


Capacities are shown with two decimal values in the RM UI as part of YARN-6182. 
The queue status CLI is still showing one decimal value.

{code}
[root@bigdata3 yarn]# yarn queue -status default
Queue Information : 
Queue Name : default
State : RUNNING
Capacity : 69.9%
Current Capacity : .0%
Maximum Capacity : 70.0%
Default Node Label expression : 
Accessible Node Labels : *
Preemption : enabled
{code}
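
For illustration, the change being asked for is essentially a format-string 
tweak in the queue status output (a sketch, not the actual QueueCLI code):

{code:java}
// Illustrative only: print queue capacities with two decimal places instead
// of one, matching the RM UI behavior from YARN-6182.
public class CapacityFormat {
  public static void main(String[] args) {
    float capacity = 0.699f;  // fraction reported by the scheduler
    // Current CLI style: one decimal place -> "Capacity : 69.9%"
    System.out.println(String.format("Capacity : %.1f%%", capacity * 100));
    // Proposed style: two decimal places -> "Capacity : 69.90%"
    System.out.println(String.format("Capacity : %.2f%%", capacity * 100));
  }
}
{code}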



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7221) Add security check for privileged docker container

2018-04-11 Thread Billie Rinaldi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434288#comment-16434288
 ] 

Billie Rinaldi commented on YARN-7221:
--

+1 for patch 22 as well. I think we have all agreed, so I will commit this 
patch.

> Add security check for privileged docker container
> --
>
> Key: YARN-7221
> URL: https://issues.apache.org/jira/browse/YARN-7221
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-7221.001.patch, YARN-7221.002.patch, 
> YARN-7221.003.patch, YARN-7221.004.patch, YARN-7221.005.patch, 
> YARN-7221.006.patch, YARN-7221.007.patch, YARN-7221.008.patch, 
> YARN-7221.009.patch, YARN-7221.010.patch, YARN-7221.011.patch, 
> YARN-7221.012.patch, YARN-7221.013.patch, YARN-7221.014.patch, 
> YARN-7221.015.patch, YARN-7221.016.patch, YARN-7221.017.patch, 
> YARN-7221.018.patch, YARN-7221.019.patch, YARN-7221.020.patch, 
> YARN-7221.021.patch, YARN-7221.022.patch
>
>
> When a Docker container is running with privileges, the majority use case is 
> to have some program start as root and then drop privileges to another user, 
> e.g. httpd starting privileged to bind to port 80, then dropping privileges 
> to the www user.
> # We should add a security check for submitting users, to verify they have 
> "sudo" access to run privileged containers.
> # We should remove --user=uid:gid for privileged containers.
>  
> Docker can be launched with both the --privileged=true and --user=uid:gid 
> flags. With this parameter combination, the user will not have access to 
> become the root user. All docker exec commands will be dropped to the uid:gid 
> user instead of being granted privileges. A user can gain root privileges if 
> the container file system contains files that give the user extra power, but 
> this type of image is considered dangerous. A non-privileged user can launch 
> a container with special bits to acquire the same level of root power. Hence, 
> we lose control of which images should be run with --privileged, and who has 
> sudo rights to use privileged container images. As a result, we should check 
> for sudo access and then decide to parameterize --privileged=true OR 
> --user=uid:gid. This will avoid leading developers down the wrong path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7142) Support placement policy in yarn native services

2018-04-11 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434284#comment-16434284
 ] 

Wangda Tan commented on YARN-7142:
--

[~cheersyang], thanks for reviewing this Jira. I agree with [~gsaha]: unlike DS, 
which is mostly for dev testing, the placement spec of native services should 
be clearer, and the one proposed in this Jira is clearer than the DS spec for 
end users.

Currently we're planning to backport several dependencies to branch-3.1 so that 
YARN-7142 can be backported w/o modification, which makes the native service 
implementation diverge less between trunk and branch-3.1. Once YARN-8118 is 
backported, we can backport this one.

> Support placement policy in yarn native services
> 
>
> Key: YARN-7142
> URL: https://issues.apache.org/jira/browse/YARN-7142
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Reporter: Billie Rinaldi
>Assignee: Gour Saha
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-7142-branch-3.1.004.patch, YARN-7142.001.patch, 
> YARN-7142.002.patch, YARN-7142.003.patch, YARN-7142.004.patch
>
>
> Placement policy exists in the API but is not implemented yet.
> I have filed YARN-8074 to move the composite constraints implementation out 
> of this phase-1 implementation of placement policy.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8103) Add CLI interface to query node attributes

2018-04-11 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434279#comment-16434279
 ] 

Bibin A Chundatt commented on YARN-8103:


Hi [~cheersyang]

The following are planned as part of this jira:

# *yarn cluster -lna* (lists all the node attributes in the cluster)
# *yarn node* (node details list node attributes too)
Add details of the attributes
# *yarn node-attributes* (for a specific attribute, print the nodes too)

The above points were discussed in the last call.

*Naga* suggested also listing cluster attributes under *yarn node-attributes*:

# *yarn node-attributes  -list-cluster-attributes* (list all cluster node 
attributes)   -- suggestion from *Naga*
# *yarn node-attributes  -list-nodes  *  (list nodes of specific 
attributes)   -- discussed on call
# *yarn node-attributes  -list-attributes * (list attributes of host 
name)  -- *additional inclusion plan from my side*

[~Naganarasimha] please add if I have missed anything.

> Add CLI interface to  query node attributes
> ---
>
> Key: YARN-8103
> URL: https://issues.apache.org/jira/browse/YARN-8103
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Major
>
> YARN-8100 will add API interface for querying the attributes. CLI interface 
> for querying node attributes for each nodes and list all attributes in 
> cluster.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7142) Support placement policy in yarn native services

2018-04-11 Thread Gour Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434268#comment-16434268
 ] 

Gour Saha commented on YARN-7142:
-

{quote}I think we should be able to support a simple PC language, by specifying 
something like: notin,node,foo
{quote}
[~cheersyang] thank you for the good suggestions. In my opinion, YARN Service 
should be viewed as a higher-level abstraction for lay users who do not 
understand, and don't want to understand, YARN internals. They only understand 
their own application and their app deployment model. If the YARN Service API 
and spec are not simple and crisp, it will immediately become a barrier that 
keeps such application owners from coming to YARN. What do you think?

> Support placement policy in yarn native services
> 
>
> Key: YARN-7142
> URL: https://issues.apache.org/jira/browse/YARN-7142
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Reporter: Billie Rinaldi
>Assignee: Gour Saha
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-7142-branch-3.1.004.patch, YARN-7142.001.patch, 
> YARN-7142.002.patch, YARN-7142.003.patch, YARN-7142.004.patch
>
>
> Placement policy exists in the API but is not implemented yet.
> I have filed YARN-8074 to move the composite constraints implementation out 
> of this phase-1 implementation of placement policy.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8140) Improve log message when launch cmd is ran for stopped yarn service

2018-04-11 Thread Eric Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-8140:

Attachment: YARN-8140.001.patch

> Improve log message when launch cmd is ran for stopped yarn service
> ---
>
> Key: YARN-8140
> URL: https://issues.apache.org/jira/browse/YARN-8140
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn-native-services
>Affects Versions: 3.1.0
>Reporter: Yesha Vora
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-8140.001.patch
>
>
> Steps:
>  1) Launch sleeper app
> {code}
> RUNNING: /usr/hdp/current/hadoop-yarn-client/bin/yarn app -launch 
> sleeper2-duplicate-app-stopped 
> /usr/hdp/3.0.0.0-xxx/hadoop-yarn/yarn-service-examples/sleeper/sleeper.json
> WARNING: YARN_LOG_DIR has been replaced by HADOOP_LOG_DIR. Using value of 
> YARN_LOG_DIR.
> WARNING: YARN_LOGFILE has been replaced by HADOOP_LOGFILE. Using value of 
> YARN_LOGFILE.
> WARNING: YARN_PID_DIR has been replaced by HADOOP_PID_DIR. Using value of 
> YARN_PID_DIR.
> WARNING: YARN_OPTS has been replaced by HADOOP_OPTS. Using value of YARN_OPTS.
> 18/04/10 21:31:01 WARN util.NativeCodeLoader: Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> 18/04/10 21:31:01 INFO client.AHSProxy: Connecting to Application History 
> server at xx:10200
> 18/04/10 21:31:01 INFO client.AHSProxy: Connecting to Application History 
> server at xx:10200
> 18/04/10 21:31:01 INFO client.ApiServiceClient: Loading service definition 
> from local FS: 
> /usr/hdp/3.0.0.0-xxx/hadoop-yarn/yarn-service-examples/sleeper/sleeper.json
> 18/04/10 21:31:03 INFO util.log: Logging initialized @2818ms
> 18/04/10 21:31:10 INFO client.ApiServiceClient: Application ID: 
> application_1523387473707_0007
> Exit Code: 0{code}
> 2) Stop the application
> {code}
> RUNNING: /usr/hdp/current/hadoop-yarn-client/bin/yarn app -stop 
> sleeper2-duplicate-app-stopped
> WARNING: YARN_LOG_DIR has been replaced by HADOOP_LOG_DIR. Using value of 
> YARN_LOG_DIR.
> WARNING: YARN_LOGFILE has been replaced by HADOOP_LOGFILE. Using value of 
> YARN_LOGFILE.
> WARNING: YARN_PID_DIR has been replaced by HADOOP_PID_DIR. Using value of 
> YARN_PID_DIR.
> WARNING: YARN_OPTS has been replaced by HADOOP_OPTS. Using value of YARN_OPTS.
> 18/04/10 21:31:14 WARN util.NativeCodeLoader: Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> 18/04/10 21:31:15 INFO client.AHSProxy: Connecting to Application History 
> server at xx:10200
> 18/04/10 21:31:15 INFO client.AHSProxy: Connecting to Application History 
> server at xx:10200
> 18/04/10 21:31:16 INFO util.log: Logging initialized @3034ms
> 18/04/10 21:31:17 INFO client.ApiServiceClient: Successfully stopped service 
> sleeper2-duplicate-app-stopped
> Exit Code: 0{code}
> 3) Launch the application with same name
> {code}
> RUNNING: /usr/hdp/current/hadoop-yarn-client/bin/yarn app -launch 
> sleeper2-duplicate-app-stopped 
> /usr/hdp/3.0.0.0-xxx/hadoop-yarn/yarn-service-examples/sleeper/sleeper.json
> WARNING: YARN_LOG_DIR has been replaced by HADOOP_LOG_DIR. Using value of 
> YARN_LOG_DIR.
> WARNING: YARN_LOGFILE has been replaced by HADOOP_LOGFILE. Using value of 
> YARN_LOGFILE.
> WARNING: YARN_PID_DIR has been replaced by HADOOP_PID_DIR. Using value of 
> YARN_PID_DIR.
> WARNING: YARN_OPTS has been replaced by HADOOP_OPTS. Using value of YARN_OPTS.
> 18/04/10 21:31:19 WARN util.NativeCodeLoader: Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> 18/04/10 21:31:19 INFO client.AHSProxy: Connecting to Application History 
> server at xx:10200
> 18/04/10 21:31:19 INFO client.AHSProxy: Connecting to Application History 
> server at xx:10200
> 18/04/10 21:31:19 INFO client.ApiServiceClient: Loading service definition 
> from local FS: 
> /usr/hdp/3.0.0.0-xxx/hadoop-yarn/yarn-service-examples/sleeper/sleeper.json
> 18/04/10 21:31:22 INFO util.log: Logging initialized @4456ms
> 18/04/10 21:31:22 ERROR client.ApiServiceClient: Service Instance dir already 
> exists: 
> hdfs://mycluster/user/hrt_qa/.yarn/services/sleeper2-duplicate-app-stopped/sleeper2-duplicate-app-stopped.json
> Exit Code: 56
> {code}
>  
> Here, launch cmd fails with "Service Instance dir already exists: 
> hdfs://mycluster/user/hrt_qa/.yarn/services/sleeper2-duplicate-app-stopped/sleeper2-duplicate-app-stopped.json".
>  
> The log message should be more meaningful. It should return that 
> "sleeper2-duplicate-app-stopped is in stopped state".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-8142) yarn service application stops when AM is killed with SIGTERM

2018-04-11 Thread Shane Kumpf (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434202#comment-16434202
 ] 

Shane Kumpf commented on YARN-8142:
---

I'm leaning in favor of having SIGTERM result in a new AM being started, as 
SIGKILL does now. Erring on the side of "keep the service running" may outweigh 
matching the Unix philosophy here, but I can understand that reasoning.

> yarn service application stops when AM is killed with SIGTERM
> -
>
> Key: YARN-8142
> URL: https://issues.apache.org/jira/browse/YARN-8142
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-native-services
>Reporter: Yesha Vora
>Assignee: Billie Rinaldi
>Priority: Major
>
> Steps:
> 1) Launch sleeper job ( non-docker yarn service)
> {code}
> RUNNING: /usr/hdp/current/hadoop-yarn-client/bin/yarn app -launch 
> fault-test-am-sleeper 
> /usr/hdp/current/hadoop-yarn-client/yarn-service-examples/sleeper/sleeper.json
> WARNING: YARN_LOG_DIR has been replaced by HADOOP_LOG_DIR. Using value of 
> YARN_LOG_DIR.
> WARNING: YARN_LOGFILE has been replaced by HADOOP_LOGFILE. Using value of 
> YARN_LOGFILE.
> WARNING: YARN_PID_DIR has been replaced by HADOOP_PID_DIR. Using value of 
> YARN_PID_DIR.
> WARNING: YARN_OPTS has been replaced by HADOOP_OPTS. Using value of YARN_OPTS.
> 18/04/06 22:24:24 WARN util.NativeCodeLoader: Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> 18/04/06 22:24:24 INFO client.AHSProxy: Connecting to Application History 
> server at xxx:10200
> 18/04/06 22:24:24 INFO client.AHSProxy: Connecting to Application History 
> server at xxx:10200
> 18/04/06 22:24:24 INFO client.ApiServiceClient: Loading service definition 
> from local FS: 
> /usr/hdp/current/hadoop-yarn-client/yarn-service-examples/sleeper/sleeper.json
> 18/04/06 22:24:26 INFO util.log: Logging initialized @3631ms
> 18/04/06 22:24:37 INFO client.ApiServiceClient: Application ID: 
> application_1522887500374_0010
> Exit Code: 0{code}
> 2) Wait for sleeper component to be up
> 3) Kill AM process PID
>  
> Expected behavior:
> New attempt of AM will be started. The pre-existing container will keep 
> running
>  
> Actual behavior:
> Application finishes with State : FINISHED and Final-State : ENDED
> New attempt was never launched
> Note: 
> when the AM gets a SIGTERM, it gracefully shuts itself down. It shuts the 
> entire app down instead of letting it continue to run for another attempt.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8104) Add API to fetch node to attribute mapping

2018-04-11 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434183#comment-16434183
 ] 

Bibin A Chundatt commented on YARN-8104:


Attached patch to handle checkstyle issues too.

> Add API to fetch node to attribute mapping
> --
>
> Key: YARN-8104
> URL: https://issues.apache.org/jira/browse/YARN-8104
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Major
> Attachments: YARN-8104-YARN-3409.001.patch, 
> YARN-8104-YARN-3409.002.patch, YARN-8104-YARN-3409.003.patch, 
> YARN-8104-YARN-3409.004.patch, YARN-8104-YARN-3409.005.patch
>
>
> Add node/host to attribute mapping in yarn client API.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8104) Add API to fetch node to attribute mapping

2018-04-11 Thread Bibin A Chundatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated YARN-8104:
---
Attachment: YARN-8104-YARN-3409.005.patch

> Add API to fetch node to attribute mapping
> --
>
> Key: YARN-8104
> URL: https://issues.apache.org/jira/browse/YARN-8104
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Major
> Attachments: YARN-8104-YARN-3409.001.patch, 
> YARN-8104-YARN-3409.002.patch, YARN-8104-YARN-3409.003.patch, 
> YARN-8104-YARN-3409.004.patch, YARN-8104-YARN-3409.005.patch
>
>
> Add node/host to attribute mapping in yarn client API.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4781) Support intra-queue preemption for fairness ordering policy.

2018-04-11 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434143#comment-16434143
 ] 

Eric Payne commented on YARN-4781:
--

[~sunilg], I can think of a couple of ways to handle this, but none of them 
seem ideal to me.

The goal is to sort the apps ordered in the same way the {{FairOrderingPolicy}} 
would sort them, and to do that, we need to use the same logic that is used in 
{{FairOrderingPolicy#FairComparator}}.

- One way would be to just copy the compare logic to 
{{IntraQueueCandidatesSelector#TAFairOrderingComparator}}. This is not ideal 
because it introduces a maintenance concern of keeping the two in sync.
- We could split out fifo and fair {{...IntraQueuePreemptionPlugin}}s and the 
new {{FairIntraQueuePreemptionPlugin}} could subclass 
{{FairOrderingPolicy#FairComparator}}. This seems overly complicated and 
disruptive.
- Add an accessor and all the necessary support to allow the queue's 
orderingPolicy's comparator to be retrieved and used. This would affect fair, 
fifo, and fifo pending policies as well as the interface itself.
- Create a new {{LeafQueue#getAllOrderedApplications}} (to be used instead of 
{{getAllApplications}}) that creates a priority queue sorted according to the 
queue's {{orderingPolicy}}. Or, we could change {{getAllApplications}} to do 
the same thing. Since a HashSet isn't ordered, nothing that calls 
{{getAllApplications}} today would expect any order.

I like the last option the best. There would still need to be some 
modifications within {{FifoIntraQueuePreemptionPlugin}}, but at least then it's 
all contained within preemption, except for the minor mods to {{LeafQueue}}.
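
A rough sketch of that last option, assuming a LeafQueue-like class that holds 
the applications and the comparator of its configured ordering policy 
(illustrative, not the actual CapacityScheduler code):

{code:java}
// Illustrative sketch: return the queue's applications sorted by the queue's
// configured ordering policy comparator, for use by the preemption plugin.
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class LeafQueueSketch<A> {

  private final Set<A> applications = new HashSet<>();
  private final Comparator<A> orderingPolicyComparator;

  public LeafQueueSketch(Comparator<A> orderingPolicyComparator) {
    this.orderingPolicyComparator = orderingPolicyComparator;
  }

  public void addApplication(A app) {
    applications.add(app);
  }

  /** Existing style accessor: unordered. */
  public Collection<A> getAllApplications() {
    return applications;
  }

  /** Proposed accessor: same apps, ordered per the queue's ordering policy. */
  public List<A> getAllOrderedApplications() {
    List<A> ordered = new ArrayList<>(applications);
    ordered.sort(orderingPolicyComparator);
    return ordered;
  }
}
{code}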

> Support intra-queue preemption for fairness ordering policy.
> 
>
> Key: YARN-4781
> URL: https://issues.apache.org/jira/browse/YARN-4781
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Major
> Attachments: YARN-4781.001.patch, YARN-4781.002.patch, 
> YARN-4781.003.patch
>
>
> We introduced the fairness queue policy in YARN-3319, which lets large 
> applications make progress and does not starve small applications. However, if 
> a large application takes the queue's resources, and the containers of the 
> large app have a long lifespan, small applications could still wait for 
> resources for a long time and SLAs cannot be guaranteed.
> Instead of waiting for applications to release resources on their own, we need 
> to preempt resources of queues with the fairness policy enabled.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8142) yarn service application stops when AM is killed with SIGTERM

2018-04-11 Thread Billie Rinaldi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434136#comment-16434136
 ] 

Billie Rinaldi commented on YARN-8142:
--

Okay, I could put up a patch that would modify the service AM SIGTERM behavior 
to match the SIGKILL behavior, but we need to resolve what we would like the AM 
to do in this situation. Currently, the behavior is:
* SIGKILL only kills the AM, so a new AM will be started and the service will 
keep running
* SIGTERM stops the entire service
* a service client stop command RPC stops the entire service

So the question is whether the SIGTERM behavior should be the same as the 
SIGKILL or the service client stop command.
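
For illustration, a minimal sketch of how an AM could make SIGTERM behave like 
SIGKILL (exit without tearing the service down) while an explicit client stop 
still stops the whole service. The flag and method names are hypothetical, not 
the actual ServiceMaster code.

{code:java}
// Hypothetical sketch: only unregister/stop the whole service when a client
// explicitly requested it, not merely because the JVM received SIGTERM.
import java.util.concurrent.atomic.AtomicBoolean;

public class AmShutdownSketch {

  private final AtomicBoolean clientRequestedStop = new AtomicBoolean(false);

  /** RPC handler for the service client "stop" command. */
  public void handleClientStop() {
    clientRequestedStop.set(true);
    unregisterAndStopService();
  }

  public void installShutdownHook() {
    // Runs on SIGTERM (and on normal JVM exit).
    Runtime.getRuntime().addShutdownHook(new Thread(() -> {
      if (!clientRequestedStop.get()) {
        // SIGTERM without an explicit stop: exit without unregistering, so
        // the RM starts a new attempt and the containers keep running,
        // matching the current SIGKILL behavior.
        return;
      }
      unregisterAndStopService();
    }));
  }

  private void unregisterAndStopService() {
    // Stop components and finish the YARN application.
  }
}
{code}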

> yarn service application stops when AM is killed with SIGTERM
> -
>
> Key: YARN-8142
> URL: https://issues.apache.org/jira/browse/YARN-8142
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-native-services
>Reporter: Yesha Vora
>Assignee: Billie Rinaldi
>Priority: Major
>
> Steps:
> 1) Launch sleeper job ( non-docker yarn service)
> {code}
> RUNNING: /usr/hdp/current/hadoop-yarn-client/bin/yarn app -launch 
> fault-test-am-sleeper 
> /usr/hdp/current/hadoop-yarn-client/yarn-service-examples/sleeper/sleeper.json
> WARNING: YARN_LOG_DIR has been replaced by HADOOP_LOG_DIR. Using value of 
> YARN_LOG_DIR.
> WARNING: YARN_LOGFILE has been replaced by HADOOP_LOGFILE. Using value of 
> YARN_LOGFILE.
> WARNING: YARN_PID_DIR has been replaced by HADOOP_PID_DIR. Using value of 
> YARN_PID_DIR.
> WARNING: YARN_OPTS has been replaced by HADOOP_OPTS. Using value of YARN_OPTS.
> 18/04/06 22:24:24 WARN util.NativeCodeLoader: Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> 18/04/06 22:24:24 INFO client.AHSProxy: Connecting to Application History 
> server at xxx:10200
> 18/04/06 22:24:24 INFO client.AHSProxy: Connecting to Application History 
> server at xxx:10200
> 18/04/06 22:24:24 INFO client.ApiServiceClient: Loading service definition 
> from local FS: 
> /usr/hdp/current/hadoop-yarn-client/yarn-service-examples/sleeper/sleeper.json
> 18/04/06 22:24:26 INFO util.log: Logging initialized @3631ms
> 18/04/06 22:24:37 INFO client.ApiServiceClient: Application ID: 
> application_1522887500374_0010
> Exit Code: 0{code}
> 2) Wait for sleeper component to be up
> 3) Kill AM process PID
>  
> Expected behavior:
> New attempt of AM will be started. The pre-existing container will keep 
> running
>  
> Actual behavior:
> Application finishes with State : FINISHED and Final-State : ENDED
> New attempt was never launched
> Note: 
> when the AM gets a SIGTERM, it gracefully shuts itself down. It shuts the 
> entire app down instead of letting it continue to run for another attempt.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8147) TestClientRMService#testGetApplications sporadically fails

2018-04-11 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434125#comment-16434125
 ] 

Jason Lowe commented on YARN-8147:
--

The unit test is making the assumption that the machine is not fast enough for 
an application to be started and return to the client in the same millisecond.  
Occasionally this does happen in the unit test and the test fails.
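
For illustration, one way to avoid the same-millisecond race in such a test is 
to derive the "should not match" bound from the application's own start time 
rather than from a clock reading taken around the call (a sketch only, not 
necessarily the approach taken in the attached patch):

{code:java}
// Illustrative sketch of why a start-time filter assertion can race, and a
// race-free alternative.
public class StartRangeSketch {

  public static void main(String[] args) {
    long appStartTime = System.currentTimeMillis();  // assigned by the "RM"
    long now = System.currentTimeMillis();           // captured by the "test"

    // Fragile: assumes the app could not have started in the same millisecond
    // as "now"; on a fast machine appStartTime == now and the filter matches
    // one app instead of the expected zero.
    boolean fragileMatch = appStartTime >= now;

    // Robust: start the "expected empty" range strictly after the app's own
    // start time, so the result no longer depends on clock granularity.
    boolean robustMatch = appStartTime >= appStartTime + 1;  // always false

    System.out.println("fragile=" + fragileMatch + " robust=" + robustMatch);
  }
}
{code}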

> TestClientRMService#testGetApplications sporadically fails
> --
>
> Key: YARN-8147
> URL: https://issues.apache.org/jira/browse/YARN-8147
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>Priority: Major
>
> testGetApplications can fail sporadically when testing start time filters on 
> the request, e.g.:
> {noformat}
> java.lang.AssertionError: Incorrect number of matching start range 
> expected:<0> but was:<1>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestClientRMService.testGetApplications(TestClientRMService.java:798)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8147) TestClientRMService#testGetApplications sporadically fails

2018-04-11 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-8147:


 Summary: TestClientRMService#testGetApplications sporadically fails
 Key: YARN-8147
 URL: https://issues.apache.org/jira/browse/YARN-8147
 Project: Hadoop YARN
  Issue Type: Bug
  Components: test
Reporter: Jason Lowe
Assignee: Jason Lowe


testGetApplications can fail sporadically when testing start time filters on 
the request, e.g.:
{noformat}
java.lang.AssertionError: Incorrect number of matching start range expected:<0> 
but was:<1>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at 
org.apache.hadoop.yarn.server.resourcemanager.TestClientRMService.testGetApplications(TestClientRMService.java:798)
{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7221) Add security check for privileged docker container

2018-04-11 Thread Shane Kumpf (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434092#comment-16434092
 ] 

Shane Kumpf commented on YARN-7221:
---

Thanks for the updated patch, [~eyang]. The latest patch lgtm, +1 (non-binding).

> Add security check for privileged docker container
> --
>
> Key: YARN-7221
> URL: https://issues.apache.org/jira/browse/YARN-7221
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-7221.001.patch, YARN-7221.002.patch, 
> YARN-7221.003.patch, YARN-7221.004.patch, YARN-7221.005.patch, 
> YARN-7221.006.patch, YARN-7221.007.patch, YARN-7221.008.patch, 
> YARN-7221.009.patch, YARN-7221.010.patch, YARN-7221.011.patch, 
> YARN-7221.012.patch, YARN-7221.013.patch, YARN-7221.014.patch, 
> YARN-7221.015.patch, YARN-7221.016.patch, YARN-7221.017.patch, 
> YARN-7221.018.patch, YARN-7221.019.patch, YARN-7221.020.patch, 
> YARN-7221.021.patch, YARN-7221.022.patch
>
>
> When a Docker container is running with privileges, the majority use case is 
> to have some program start as root and then drop privileges to another user, 
> e.g. httpd starting privileged to bind to port 80, then dropping privileges 
> to the www user.
> # We should add a security check for submitting users, to verify they have 
> "sudo" access to run privileged containers.
> # We should remove --user=uid:gid for privileged containers.
>  
> Docker can be launched with both the --privileged=true and --user=uid:gid 
> flags. With this parameter combination, the user will not have access to 
> become the root user. All docker exec commands will be dropped to the uid:gid 
> user instead of being granted privileges. A user can gain root privileges if 
> the container file system contains files that give the user extra power, but 
> this type of image is considered dangerous. A non-privileged user can launch 
> a container with special bits to acquire the same level of root power. Hence, 
> we lose control of which images should be run with --privileged, and who has 
> sudo rights to use privileged container images. As a result, we should check 
> for sudo access and then decide to parameterize --privileged=true OR 
> --user=uid:gid. This will avoid leading developers down the wrong path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7527) Over-allocate node resource in async-scheduling mode of CapacityScheduler

2018-04-11 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434066#comment-16434066
 ] 

genericqa commented on YARN-7527:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
26s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
49s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
40s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} branch-2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 64m  
4s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 79m 56s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:f667ef1 |
| JIRA Issue | YARN-7527 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12918568/YARN-7527-branch-2.002.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 798111edbe14 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 
11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-2 / 2b2e2ac5 |
| maven | version: Apache Maven 3.3.9 
(bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-10T16:41:47+00:00) |
| Default Java | 1.7.0_171 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20304/testReport/ |
| Max. process+thread count | 878 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/20304/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Over-allocate node resource in async-scheduling mode of CapacityScheduler
> -
>
> Key: YARN-7527
> URL: https://issues.apache.org/jira/browse/YARN-7527
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.0.0-alpha4, 2.9.1
>Reporter: Tao Yang
>Assignee: Tao Yang

[jira] [Commented] (YARN-7189) Container-executor doesn't remove Docker containers that error out early

2018-04-11 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434063#comment-16434063
 ] 

Jason Lowe commented on YARN-7189:
--

Thanks for the patch!

The {{i < 5}} check is extraneous and would never be triggered because the body 
of the loop is checking it and will be the termination condition instead.  
Actually I think the loop would be simpler if written as a while loop, e.g.: 
while ((rc = pclose(..)) != 0).

Nit: The {{continue}} in the for loop is extraneous as is the {{goto}}.

It may be useful to log errors from pclose (i.e.: pclose returning -1) along 
with strerror(errno) when that happens.

Nit: "Could not remove container after 5 tries %s.\n" should be "Could not 
remove container after 5 tries: %s\n" so the command is clearly separated from 
the error description and we don't inject a trailing period into the cmdline 
printed.



> Container-executor doesn't remove Docker containers that error out early
> 
>
> Key: YARN-7189
> URL: https://issues.apache.org/jira/browse/YARN-7189
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 2.9.0, 2.8.3, 3.0.1
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Attachments: YARN-7189-b3.0.001.patch, YARN-7189-branch-3.0.001.patch
>
>
> Once the docker run command is executed, the docker container is created 
> unless the return code is 125 meaning that the run command itself failed 
> (https://docs.docker.com/engine/reference/run/#exit-status). After any error 
> that happens following the docker run, the container needs to be removed 
> during cleanup.
> {noformat:title=container-executor.c:launch_docker_container_as_user}
>   snprintf(docker_command_with_binary, command_size, "%s %s", docker_binary, 
> docker_command);
>   fprintf(LOGFILE, "Launching docker container...\n");
>   FILE* start_docker = popen(docker_command_with_binary, "r");
> {noformat}
> This is fixed by YARN-5366, which changes how we remove containers. However, 
> that was committed into 3.1.0. 2.8, 2.9, and 3.0 are all affected



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7189) Container-executor doesn't remove Docker containers that error out early

2018-04-11 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434057#comment-16434057
 ] 

genericqa commented on YARN-7189:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
56s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-3.0 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
56s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
41s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
37m 39s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  3s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 17m 56s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 71m 52s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.nodemanager.containermanager.linux.runtime.TestDockerContainerRuntime
 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5aaf88d |
| JIRA Issue | YARN-7189 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12918565/YARN-7189-branch-3.0.001.patch
 |
| Optional Tests |  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux ae80e40479f9 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-3.0 / 7cca348 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/20303/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20303/testReport/ |
| Max. process+thread count | 302 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/20303/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Container-executor doesn't remove Docker containers that error out early
> 
>
> Key: YARN-7189
> URL: https://issues.apache.org/jira/browse/YARN-7189
> Project: Hadoop YARN
>  Issue Type: Sub-task
>

[jira] [Commented] (YARN-8146) Remove LinkedList From resourcemanager.reservation.planning Package

2018-04-11 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434017#comment-16434017
 ] 

genericqa commented on YARN-8146:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
40s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 17s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 30s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 26s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 68m  
3s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}121m 37s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b |
| JIRA Issue | YARN-8146 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12918554/YARN-8146.1.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux dcc43b023112 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 
19:38:41 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 7eb783e |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/20301/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20301/testReport/ |
| Max. process+thread count | 869 (vs. ulimit of 

[jira] [Commented] (YARN-8142) yarn service application stops when AM is killed with SIGTERM

2018-04-11 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434013#comment-16434013
 ] 

Rushabh S Shah commented on YARN-8142:
--

My bad.
Please ignore my previous comments.
I missed that this issue affects {{yarn-native-services}}.

> yarn service application stops when AM is killed with SIGTERM
> -
>
> Key: YARN-8142
> URL: https://issues.apache.org/jira/browse/YARN-8142
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-native-services
>Reporter: Yesha Vora
>Assignee: Billie Rinaldi
>Priority: Major
>
> Steps:
> 1) Launch sleeper job ( non-docker yarn service)
> {code}
> RUNNING: /usr/hdp/current/hadoop-yarn-client/bin/yarn app -launch 
> fault-test-am-sleeper 
> /usr/hdp/current/hadoop-yarn-client/yarn-service-examples/sleeper/sleeper.json
> WARNING: YARN_LOG_DIR has been replaced by HADOOP_LOG_DIR. Using value of 
> YARN_LOG_DIR.
> WARNING: YARN_LOGFILE has been replaced by HADOOP_LOGFILE. Using value of 
> YARN_LOGFILE.
> WARNING: YARN_PID_DIR has been replaced by HADOOP_PID_DIR. Using value of 
> YARN_PID_DIR.
> WARNING: YARN_OPTS has been replaced by HADOOP_OPTS. Using value of YARN_OPTS.
> 18/04/06 22:24:24 WARN util.NativeCodeLoader: Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> 18/04/06 22:24:24 INFO client.AHSProxy: Connecting to Application History 
> server at xxx:10200
> 18/04/06 22:24:24 INFO client.AHSProxy: Connecting to Application History 
> server at xxx:10200
> 18/04/06 22:24:24 INFO client.ApiServiceClient: Loading service definition 
> from local FS: 
> /usr/hdp/current/hadoop-yarn-client/yarn-service-examples/sleeper/sleeper.json
> 18/04/06 22:24:26 INFO util.log: Logging initialized @3631ms
> 18/04/06 22:24:37 INFO client.ApiServiceClient: Application ID: 
> application_1522887500374_0010
> Exit Code: 0{code}
> 2) Wait for sleeper component to be up
> 3) Kill AM process PID
>  
> Expected behavior:
> A new AM attempt will be started. The pre-existing container will keep 
> running.
>  
> Actual behavior:
> The application finishes with State : FINISHED and Final-State : ENDED.
> A new attempt was never launched.
> Note: 
> When the AM gets a SIGTERM, it gracefully shuts itself down. It shuts the 
> entire app down instead of letting it continue to run for another attempt.
>  
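For step 3 above, a hedged sketch of one way to locate and signal the AM 
process on its NodeManager host is shown below; the grep pattern (the 
ServiceMaster class name plus the application id from the launch output) is an 
assumption about how the AM appears in the process list.
{code}
# Hedged sketch for step 3: find the service AM's PID on the NM host and send
# SIGTERM. Any reliable way of identifying the AM container's PID works.
AM_PID=$(ps -ef | grep ServiceMaster | grep application_1522887500374_0010 \
  | grep -v grep | awk '{print $2}')
kill -TERM "$AM_PID"
{code}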



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7941) Transitive dependencies for component are not resolved

2018-04-11 Thread Billie Rinaldi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434004#comment-16434004
 ] 

Billie Rinaldi commented on YARN-7941:
--

Thanks for the review, [~rohithsharma]!

> Transitive dependencies for component are not resolved 
> ---
>
> Key: YARN-7941
> URL: https://issues.apache.org/jira/browse/YARN-7941
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Rohith Sharma K S
>Assignee: Billie Rinaldi
>Priority: Major
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-7941.1.patch
>
>
> It is observed that transitive dependencies are not resolved; as a result, 
> one of the components is started earlier than it should be. 
> Ex: In an HBase app, 
> master is an independent component, 
> regionserver depends on master, and 
> hbaseclient depends on regionserver, 
> but I always see that HBaseClient is launched before regionserver.
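To make the dependency chain above concrete, here is a hedged, abridged 
service-definition fragment in the style of the YARN service examples; only the 
dependency-related fields are shown, so it is not a launchable spec.
{code}
# Abridged sketch only: resource, artifact, and launch_command fields are
# omitted, so this fragment is for illustrating the dependency chain, not for
# launching a real service.
cat > hbase-deps-fragment.json <<'EOF'
{
  "name": "hbase-app",
  "components": [
    {"name": "master"},
    {"name": "regionserver", "dependencies": ["master"]},
    {"name": "hbaseclient", "dependencies": ["regionserver"]}
  ]
}
EOF
{code}
With transitive resolution working, hbaseclient should not be launched until 
regionserver (and therefore master) is ready.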



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8018) Yarn Service Upgrade: Add support for initiating service upgrade

2018-04-11 Thread Billie Rinaldi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16433988#comment-16433988
 ] 

Billie Rinaldi commented on YARN-8018:
--

Thoughts on getting this patch into branch-3.1?

> Yarn Service Upgrade: Add support for initiating service upgrade
> 
>
> Key: YARN-8018
> URL: https://issues.apache.org/jira/browse/YARN-8018
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-8018.001.patch, YARN-8018.002.patch, 
> YARN-8018.003.patch, YARN-8018.004.patch, YARN-8018.005.patch, 
> YARN-8018.006.patch, YARN-8018.007.patch
>
>
> Add support for initiating a service upgrade, which includes the following 
> main changes:
>  # Service API to initiate the upgrade
>  # Persist the service version on HDFS
>  # Start the upgraded version of the service
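For illustration only, initiating an upgrade through the service REST API might 
look roughly like the sketch below; the endpoint path, port, service name, 
version, and payload fields are assumptions and may not match the patch.
{code}
# Hypothetical request; host, port, service name, and JSON fields are
# assumptions for illustration, not taken from the patch.
RM_HOST="rm-host.example.com"   # placeholder
curl -X PUT "http://${RM_HOST}:8088/app/v1/services/sleeper-service" \
  -H "Content-Type: application/json" \
  -d '{"version": "1.0.1", "state": "UPGRADING"}'
{code}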



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-8142) yarn service application stops when AM is killed with SIGTERM

2018-04-11 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16433959#comment-16433959
 ] 

Rushabh S Shah edited comment on YARN-8142 at 4/11/18 2:13 PM:
---

I ran a mapreduce sleep job.
{noformat}
hadoop jar hadoop-mapreduce-client-jobclient-2.8-tests.jar sleep -m 1 -mt 10 -r 1 -rt 1
{noformat}
I tried to kill the AM attempt.
 It didn't kill the whole application.
 In my case, it launched another AM attempt.
 I sent SIGKILL as well as SIGTERM. Both of them spawned a new AM attempt.
-So something that went into 3.* changed that behavior.-
I thought the affected versions were 3.2 and 3.1.
Yesha: Can you please mention the affected version?
 


was (Author: shahrs87):
I reproduced the same scenario that [~yeshavora] pointed out in the description 
on a cluster running hadoop 2.8.
 It didn't kill the whole application.
 In my case, it launched another AM attempt.
 I sent SIGKILL as well as SIGTERM. Both of them spawned a new AM attempt.
-So something that went into 3.* changed that behavior.-
I thought the affected versions were 3.2 and 3.1.
Yesha: Can you please mention the affected version?
 

> yarn service application stops when AM is killed with SIGTERM
> -
>
> Key: YARN-8142
> URL: https://issues.apache.org/jira/browse/YARN-8142
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-native-services
>Reporter: Yesha Vora
>Assignee: Billie Rinaldi
>Priority: Major
>
> Steps:
> 1) Launch sleeper job ( non-docker yarn service)
> {code}
> RUNNING: /usr/hdp/current/hadoop-yarn-client/bin/yarn app -launch 
> fault-test-am-sleeper 
> /usr/hdp/current/hadoop-yarn-client/yarn-service-examples/sleeper/sleeper.json
> WARNING: YARN_LOG_DIR has been replaced by HADOOP_LOG_DIR. Using value of 
> YARN_LOG_DIR.
> WARNING: YARN_LOGFILE has been replaced by HADOOP_LOGFILE. Using value of 
> YARN_LOGFILE.
> WARNING: YARN_PID_DIR has been replaced by HADOOP_PID_DIR. Using value of 
> YARN_PID_DIR.
> WARNING: YARN_OPTS has been replaced by HADOOP_OPTS. Using value of YARN_OPTS.
> 18/04/06 22:24:24 WARN util.NativeCodeLoader: Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> 18/04/06 22:24:24 INFO client.AHSProxy: Connecting to Application History 
> server at xxx:10200
> 18/04/06 22:24:24 INFO client.AHSProxy: Connecting to Application History 
> server at xxx:10200
> 18/04/06 22:24:24 INFO client.ApiServiceClient: Loading service definition 
> from local FS: 
> /usr/hdp/current/hadoop-yarn-client/yarn-service-examples/sleeper/sleeper.json
> 18/04/06 22:24:26 INFO util.log: Logging initialized @3631ms
> 18/04/06 22:24:37 INFO client.ApiServiceClient: Application ID: 
> application_1522887500374_0010
> Exit Code: 0{code}
> 2) Wait for sleeper component to be up
> 3) Kill AM process PID
>  
> Expected behavior:
> A new AM attempt will be started. The pre-existing container will keep 
> running.
>  
> Actual behavior:
> The application finishes with State : FINISHED and Final-State : ENDED.
> A new attempt was never launched.
> Note: 
> When the AM gets a SIGTERM, it gracefully shuts itself down. It shuts the 
> entire app down instead of letting it continue to run for another attempt.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org


