[jira] [Commented] (YARN-8002) Support NOT_SELF and ALL namespace types for allocation tag

2018-03-14 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399939#comment-16399939
 ] 

genericqa commented on YARN-8002:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
29s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 8 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 41s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
28s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in 
trunk has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
9s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  9m  
6s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 37s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 2 new + 61 unchanged - 1 fixed = 63 total (was 62) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 49s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
46s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 65m 
23s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}144m 27s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:d4cc50f |
| JIRA Issue | YARN-8002 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12914621/YARN-8002.004.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux e966e4731d38 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e840b4a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| 

[jira] [Commented] (YARN-7636) Re-reservation count may overflow when cluster resource exhausted for a long time

2018-03-14 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399919#comment-16399919
 ] 

genericqa commented on YARN-7636:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
 4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m  1s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 1s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 40s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 67m 
33s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}110m 38s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:d4cc50f |
| JIRA Issue | YARN-7636 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12914627/YARN-7636.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 53c10bbabac9 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 
11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 3a0f4bc |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/19985/testReport/ |
| Max. process+thread count | 896 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/19985/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   

[jira] [Commented] (YARN-7581) HBase filters are not constructed correctly in ATSv2

2018-03-14 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399916#comment-16399916
 ] 

Haibo Chen commented on YARN-7581:
--

{quote}Is it because of the assumption that the info field will be retrieved by
default all the time?
{quote}
Yes, there is always a FamilyFilter of 'i':
FilterList AND (5/5): [
  FamilyFilter (EQUAL, i),
  QualifierFilter (NOT_EQUAL, e!),
  QualifierFilter (NOT_EQUAL, i!),
  QualifierFilter (NOT_EQUAL, s!),
  QualifierFilter (NOT_EQUAL, r!)
]
so we don't need to create FamilyFilters for it like we do for metrics and
configs.
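
For illustration only, here is a rough sketch of how a filter list with that
shape could be assembled with the HBase 2.x client API (a FamilyFilter on the
'i' family plus NOT_EQUAL QualifierFilters for the e!/i!/s!/r! prefixes shown
above). This is a hand-written example, not the ATSv2 reader code, and the use
of BinaryPrefixComparator for the qualifier prefixes is an assumption.
{code}
import org.apache.hadoop.hbase.CompareOperator;
import org.apache.hadoop.hbase.filter.BinaryComparator;
import org.apache.hadoop.hbase.filter.BinaryPrefixComparator;
import org.apache.hadoop.hbase.filter.FamilyFilter;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.filter.QualifierFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class InfoFamilyFilterSketch {
  // Builds an AND (MUST_PASS_ALL) filter list similar to the one printed above.
  static Filter buildInfoFamilyFilter() {
    FilterList list = new FilterList(FilterList.Operator.MUST_PASS_ALL);
    // Always restrict the read to the 'i' (info) column family.
    list.addFilter(new FamilyFilter(CompareOperator.EQUAL,
        new BinaryComparator(Bytes.toBytes("i"))));
    // Exclude columns whose qualifiers start with the e!, i!, s!, r! prefixes.
    for (String prefix : new String[] {"e!", "i!", "s!", "r!"}) {
      list.addFilter(new QualifierFilter(CompareOperator.NOT_EQUAL,
          new BinaryPrefixComparator(Bytes.toBytes(prefix))));
    }
    return list;
  }
}
{code}
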
{quote}I have one doubt: how does the filter look for a combination of
metricstoretrieve and confstoretrieve?
Does it look like below?
{quote}
Yes.

 

> HBase filters are not constructed correctly in ATSv2
> 
>
> Key: YARN-7581
> URL: https://issues.apache.org/jira/browse/YARN-7581
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2
>Affects Versions: 3.0.0-beta1
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Major
> Attachments: YARN-7581.00.patch, YARN-7581.01.patch, 
> YARN-7581.02.patch
>
>
> Post YARN-7346,
> TestTimelineReaderWebServicesHBaseStorage.testGetEntitiesConfigFilters() and 
> TestTimelineReaderWebServicesHBaseStorage.testGetEntitiesMetricFilters() 
> start to fail when hbase.profile is set to 2.0.
> *Error Message*
>  [ERROR] Failures:
>  [ERROR] 
> TestTimelineReaderWebServicesHBaseStorage.testGetEntitiesConfigFilters:1266 
> expected:<2> but was:<0>
>  [ERROR] 
> TestTimelineReaderWebServicesHBaseStorage.testGetEntitiesMetricFilters:1523 
> expected:<1> but was:<0>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7461) DominantResourceCalculator#ratio calculation problem when right resource contains zero value

2018-03-14 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399911#comment-16399911
 ] 

genericqa commented on YARN-7461:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
28s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m  2s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 21s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
9s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 46m 53s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:d4cc50f |
| JIRA Issue | YARN-7461 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12897926/YARN-7461.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 2e66abd5f7f1 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 
19:38:41 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 3a0f4bc |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/19986/testReport/ |
| Max. process+thread count | 411 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/19986/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> DominantResourceCalculator#ratio calculation problem when right resource 
> contains zero value
> 

[jira] [Commented] (YARN-7461) DominantResourceCalculator#ratio calculation problem when right resource contains zero value

2018-03-14 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399907#comment-16399907
 ] 

Wangda Tan commented on YARN-7461:
--

Thanks [~Tao Yang], [~cheersyang], [~templedf]. 

I'm not sure if this is correct. IMHO, for ratio(a, b) we should get a result > 1
(when a > b) and < 1 (when a < b) under the context of DRF; otherwise it is
inconsistent to me. We recently saw some issues in CapacityScheduler preemption
caused by similar problems. Please see YARN-8020/YARN-6538 for more details.

Since this calculator lives in the common package, in order to run the RM unit
tests you may need to change something in resourcemanager as well.
 

> DominantResourceCalculator#ratio calculation problem when right resource 
> contains zero value
> 
>
> Key: YARN-7461
> URL: https://issues.apache.org/jira/browse/YARN-7461
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha4
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Minor
> Attachments: YARN-7461.001.patch, YARN-7461.002.patch, 
> YARN-7461.003.patch
>
>
> Currently DominantResourceCalculator#ratio may return a wrong result when the
> right resource contains a zero value. For example, with three resource types,
> leftResource=<5, 5, 0> and rightResource=<10, 10, 0>, we expect the result of
> DominantResourceCalculator#ratio(leftResource, rightResource) to be 0.5, but
> currently it is NaN.
> There should be a check before the division to ensure that the dividend is
> not zero.
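
As a minimal, hand-written sketch of the guard described above (not the
DominantResourceCalculator code, and assuming per-type values in plain arrays):
skipping zero-dividend entries means <5, 5, 0> against <10, 10, 0> evaluates to
0.5 instead of NaN, while a non-zero value over a zero capacity still yields
INFINITY, as discussed elsewhere in this thread.
{code}
public final class RatioSketch {

  // Dominant (maximum) per-type ratio of left/right, guarding against the
  // 0/0 = NaN case by skipping resource types whose dividend is zero.
  static float ratio(long[] left, long[] right) {
    float max = 0.0f;
    for (int i = 0; i < left.length; i++) {
      if (left[i] == 0) {
        continue; // 0/0 would otherwise yield NaN; 0/x contributes nothing
      }
      // Float division: a non-zero dividend over a zero divisor becomes
      // Infinity, which keeps the old behavior mentioned in the comments.
      max = Math.max(max, (float) left[i] / right[i]);
    }
    return max;
  }

  public static void main(String[] args) {
    // Prints 0.5 for the example in the description.
    System.out.println(ratio(new long[] {5, 5, 0}, new long[] {10, 10, 0}));
  }
}
{code}
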



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7999) Docker launch fails when user private filecache directory is missing

2018-03-14 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399890#comment-16399890
 ] 

Eric Yang commented on YARN-7999:
-

[~jlowe] The 3.1 release manager decided to rebase 3.1 from the current trunk.
Therefore, we don't need to commit it separately.

> Docker launch fails when user private filecache directory is missing
> 
>
> Key: YARN-7999
> URL: https://issues.apache.org/jira/browse/YARN-7999
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Eric Yang
>Assignee: Jason Lowe
>Priority: Major
> Attachments: YARN-7999.001.patch, YARN-7999.002.patch, q3.log
>
>
> Docker container is failing to launch in trunk.  The root cause is:
> {code}
> [COMPINSTANCE sleeper-1 : container_1520032931921_0001_01_20]: 
> [2018-03-02 23:26:09.196]Exception from container-launch.
> Container id: container_1520032931921_0001_01_20
> Exit code: 29
> Exception message: image: hadoop/centos:latest is trusted in hadoop registry.
> Could not determine real path of mount 
> '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache'
> Could not determine real path of mount 
> '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache'
> Invalid docker mount 
> '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache:/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache',
>  realpath=/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache
> Error constructing docker command, docker error code=12, error 
> message='Invalid docker mount'
> Shell output: main : command provided 4
> main : run as user is hbase
> main : requested yarn user is hbase
> Creating script paths...
> Creating local dirs...
> [2018-03-02 23:26:09.240]Diagnostic message from attempt 0 : [2018-03-02 
> 23:26:09.240]
> [2018-03-02 23:26:09.240]Container exited with a non-zero exit code 29.
> [2018-03-02 23:26:39.278]Could not find 
> nmPrivate/application_1520032931921_0001/container_1520032931921_0001_01_20//container_1520032931921_0001_01_20.pid
>  in any of the directories
> [COMPONENT sleeper]: Failed 11 times, exceeded the limit - 10. Shutting down 
> now...
> {code}
> The filecache cannot be mounted because it doesn't exist.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7461) DominantResourceCalculator#ratio calculation problem when right resource contains zero value

2018-03-14 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399871#comment-16399871
 ] 

Weiwei Yang commented on YARN-7461:
---

Hi [~Tao Yang]

Thanks for providing the fix. Since we are calculating the ratio of
portion_resource/all_resource in the scheduler, I don't think we expect any
case where the rhs is zero but the lhs is non-zero. Even if this happens,
returning an INFINITY value follows the old behavior, so I think it is a safe change.
+1 to the patch. [~templedf], do you want to take a look at the latest patch? If you
don't have any more comments, I will commit this by the end of tomorrow.

Thanks.

> DominantResourceCalculator#ratio calculation problem when right resource 
> contains zero value
> 
>
> Key: YARN-7461
> URL: https://issues.apache.org/jira/browse/YARN-7461
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha4
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Minor
> Attachments: YARN-7461.001.patch, YARN-7461.002.patch, 
> YARN-7461.003.patch
>
>
> Currently DominantResourceCalculator#ratio may return a wrong result when the
> right resource contains a zero value. For example, with three resource types,
> leftResource=<5, 5, 0> and rightResource=<10, 10, 0>, we expect the result of
> DominantResourceCalculator#ratio(leftResource, rightResource) to be 0.5, but
> currently it is NaN.
> There should be a check before the division to ensure that the dividend is
> not zero.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7952) RM should be able to recover log aggregation status after restart/fail-over

2018-03-14 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399858#comment-16399858
 ] 

genericqa commented on YARN-7952:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
42s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
21s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
 7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m  1s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
26s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in 
trunk has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
53s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  7m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
34s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 19s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 3 new + 381 unchanged - 0 fixed = 384 total (was 381) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 57s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
43s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
48s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
22s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
18s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 
44s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 73m 26s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
54s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}198m 30s{color} 

[jira] [Commented] (YARN-7636) Re-reservation count may overflow when cluster resource exhausted for a long time

2018-03-14 Thread Tao Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399855#comment-16399855
 ] 

Tao Yang commented on YARN-7636:


 

Attached v3 patch to fix the compile error (the v2 patch forgot to return in the
catch block of addMissedNonPartitionedRequestSchedulingOpportunity).

> Re-reservation count may overflow when cluster resource exhausted for a long 
> time 
> --
>
> Key: YARN-7636
> URL: https://issues.apache.org/jira/browse/YARN-7636
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.1.0, 2.9.1
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-7636.001.patch, YARN-7636.002.patch, 
> YARN-7636.003.patch
>
>
> This happened on our production cluster twice: when a request cannot be
> satisfied for a long time, it continually triggers re-reservation and
> eventually causes the overflow, which crashes the scheduler.
> Exception stack:
> {noformat}
> java.lang.IllegalArgumentException: Overflow adding 1 occurrences to a count 
> of 2147483647
>         at 
> com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:246)
>         at 
> com.google.common.collect.AbstractMultiset.add(AbstractMultiset.java:80)
>         at 
> com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:51)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.addReReservation(SchedulerApplicationAttempt.java:406)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.reserve(SchedulerApplicationAttempt.java:555)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.reserve(FiCaSchedulerApp.java:1076)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.apply(FiCaSchedulerApp.java:795)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2770)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:546)
> {noformat}
> Referring to the handling in
> SchedulerApplicationAttempt#addSchedulingOpportunity, we can ignore this
> exception to avoid the problem.
> The same problem may happen in
> SchedulerApplicationAttempt#addMissedNonPartitionedRequestSchedulingOpportunity;
> fix it in the same way.
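
A minimal sketch of the mitigation described above, assuming a Guava
ConcurrentHashMultiset counter as in the stack trace: swallow the overflow from
add() the same way addSchedulingOpportunity does, so the count simply stays
pinned at Integer.MAX_VALUE. This is illustrative only, not the actual patch.
{code}
import com.google.common.collect.ConcurrentHashMultiset;

public class ReReservationCounterSketch {
  private final ConcurrentHashMultiset<String> reReservations =
      ConcurrentHashMultiset.create();

  void addReReservation(String schedulerKey) {
    try {
      reReservations.add(schedulerKey);
    } catch (IllegalArgumentException e) {
      // Guava throws "Overflow adding 1 occurrences to a count of 2147483647"
      // once the count reaches Integer.MAX_VALUE; ignoring it keeps the counter
      // saturated instead of crashing the scheduler thread.
    }
  }
}
{code}
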



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7636) Re-reservation count may overflow when cluster resource exhausted for a long time

2018-03-14 Thread Tao Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-7636:
---
Attachment: YARN-7636.003.patch

> Re-reservation count may overflow when cluster resource exhausted for a long 
> time 
> --
>
> Key: YARN-7636
> URL: https://issues.apache.org/jira/browse/YARN-7636
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.1.0, 2.9.1
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-7636.001.patch, YARN-7636.002.patch, 
> YARN-7636.003.patch
>
>
> This happened on our production cluster twice: when a request cannot be
> satisfied for a long time, it continually triggers re-reservation and
> eventually causes the overflow, which crashes the scheduler.
> Exception stack:
> {noformat}
> java.lang.IllegalArgumentException: Overflow adding 1 occurrences to a count 
> of 2147483647
>         at 
> com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:246)
>         at 
> com.google.common.collect.AbstractMultiset.add(AbstractMultiset.java:80)
>         at 
> com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:51)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.addReReservation(SchedulerApplicationAttempt.java:406)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.reserve(SchedulerApplicationAttempt.java:555)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.reserve(FiCaSchedulerApp.java:1076)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.apply(FiCaSchedulerApp.java:795)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2770)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:546)
> {noformat}
> Referring to the handling in
> SchedulerApplicationAttempt#addSchedulingOpportunity, we can ignore this
> exception to avoid the problem.
> The same problem may happen in
> SchedulerApplicationAttempt#addMissedNonPartitionedRequestSchedulingOpportunity;
> fix it in the same way.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8002) Support NOT_SELF and ALL namespace types for allocation tag

2018-03-14 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-8002:
--
Attachment: YARN-8002.004.patch

> Support NOT_SELF and ALL namespace types for allocation tag
> ---
>
> Key: YARN-8002
> URL: https://issues.apache.org/jira/browse/YARN-8002
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: YARN-8002.001.patch, YARN-8002.002.patch, 
> YARN-8002.003.patch, YARN-8002.004.patch
>
>
> This is a continuation task after YARN-7972. YARN-7972 adds support for specifying
> tags with the namespaces SELF and APP_ID, like the following:
>  * self/
>  * app-id//
> This task is to track the work to support 2 of the remaining namespace types,
> *NOT_SELF* & *ALL* (we'll support app-label later):
>  * not-self/
>  * all/
> This will require a bit of refactoring in {{AllocationTagsManager}}, as it needs
> to do proper aggregation on tags for multiple apps.
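
For illustration only, a hypothetical sketch of what namespace-aware
aggregation across applications could look like; the enum and method names
below are made up and are not the AllocationTagsManager API. SELF counts only
the requesting app's tags, NOT_SELF counts every other app's tags, and ALL sums
tags over all apps.
{code}
import java.util.HashMap;
import java.util.Map;

public class TagAggregationSketch {
  enum Namespace { SELF, NOT_SELF, ALL }

  // tagsPerApp: appId -> (allocation tag -> cardinality on a node);
  // currentApp: the application that owns the placement constraint.
  static Map<String, Long> aggregate(Map<String, Map<String, Long>> tagsPerApp,
      String currentApp, Namespace ns) {
    Map<String, Long> result = new HashMap<>();
    for (Map.Entry<String, Map<String, Long>> e : tagsPerApp.entrySet()) {
      boolean sameApp = e.getKey().equals(currentApp);
      boolean include = (ns == Namespace.ALL)
          || (ns == Namespace.SELF && sameApp)
          || (ns == Namespace.NOT_SELF && !sameApp);
      if (include) {
        // Sum cardinalities of the same tag across all selected applications.
        e.getValue().forEach((tag, count) -> result.merge(tag, count, Long::sum));
      }
    }
    return result;
  }
}
{code}
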



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-8002) Support NOT_SELF and ALL namespace types for allocation tag

2018-03-14 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399777#comment-16399777
 ] 

Weiwei Yang edited comment on YARN-8002 at 3/15/18 2:37 AM:


Thanks [~leftnoteasy]

bq. Changes in SingleConstraintAppPlacementAllocato...
The allocator should be able to support all namespace types except app-label. I
have added the test case
{{TestSchedulingRequestContainerAllocation#testInterAppAntiAffinity}} to verify
simple inter-app anti-affinity constraints; comprehensive tests over the
different namespace types can be found in
{{TestPlacementConstraintsUtil}}. The ultimate goal is to support all of these
types ASAP, and I can write more tests in YARN-8015.

bq. TargetApplications: remove unused methods
Done

bq. aggregateAllocationTagsByApps/aggregateAllocationTagsByRac --- The two 
methods can be merged.
Good suggestion, I've merged them into one method now. As for the null check, I
prefer to keep it in case this method is used somewhere else in the future,
just to prevent a potential NPE.

bq. getNamespaceScope ... remove those
OK, I have removed those and replaced them with an explicit check on the enum value.

bq. I suggest to add a method like throwExceptionIfNamespaceTypeNotSupported so 
we don't need to change all the places in the future.
That might not be necessary. There are two checks, one in
{{PlacementConstraintsUtil}} and the other in
{{SingleConstraintAppPlacementAllocator}}. I will remove the first one once
YARN-8013 is done, and remove the second one once YARN-8015 is done. But please
let me know if you insist on this; I can add a method in
{{PlacementConstraintsUtil}} and call it in both places.

I will upload a new patch shortly. Thanks



was (Author: cheersyang):
Thanks [~leftnoteasy]

bq. Changes in SingleConstraintAppPlacementAllocato...
Makes sense, I have removed the change in this patch. Will track it in YARN-8015.

bq. TargetApplications: remove unused methods
Done

bq. aggregateAllocationTagsByApps/aggregateAllocationTagsByRac --- The two 
methods can be merged.
Good suggestion, I've merged them into one method now. As for the null check, I
prefer to keep it in case this method is used somewhere else in the future,
just to prevent a potential NPE.

bq. getNamespaceScope ... remove those
OK, I have removed those and replaced them with an explicit check on the enum value.

bq. I suggest to add a method like throwExceptionIfNamespaceTypeNotSupported so 
we don't need to change all the places in the future.
That might not be necessary. There are two checks, one in
{{PlacementConstraintsUtil}} and the other in
{{SingleConstraintAppPlacementAllocator}}. I will remove the first one once
YARN-8013 is done, and remove the second one once YARN-8015 is done. But please
let me know if you insist on this; I can add a method in
{{PlacementConstraintsUtil}} and call it in both places.

I will upload a new patch shortly. Thanks


> Support NOT_SELF and ALL namespace types for allocation tag
> ---
>
> Key: YARN-8002
> URL: https://issues.apache.org/jira/browse/YARN-8002
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: YARN-8002.001.patch, YARN-8002.002.patch, 
> YARN-8002.003.patch
>
>
> This is a continuation task after YARN-7972. YARN-7972 adds support for specifying
> tags with the namespaces SELF and APP_ID, like the following:
>  * self/
>  * app-id//
> This task is to track the work to support 2 of the remaining namespace types,
> *NOT_SELF* & *ALL* (we'll support app-label later):
>  * not-self/
>  * all/
> This will require a bit of refactoring in {{AllocationTagsManager}}, as it needs
> to do proper aggregation on tags for multiple apps.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7574) Add support for Node Labels on Auto Created Leaf Queue Template

2018-03-14 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399801#comment-16399801
 ] 

genericqa commented on YARN-7574:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
27s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m  9s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 25s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 62 new + 146 unchanged - 21 fixed = 208 total (was 167) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m  6s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
21s{color} | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager
 generated 1 new + 4 unchanged - 0 fixed = 5 total (was 4) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 80m 40s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}122m  5s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:d4cc50f |
| JIRA Issue | YARN-7574 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12914604/YARN-7574.5.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux a088cc836ada 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 
11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e840b4a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/19982/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| javadoc | 

[jira] [Comment Edited] (YARN-8002) Support NOT_SELF and ALL namespace types for allocation tag

2018-03-14 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399777#comment-16399777
 ] 

Weiwei Yang edited comment on YARN-8002 at 3/15/18 2:14 AM:


Thanks [~leftnoteasy]

bq. Changes in SingleConstraintAppPlacementAllocato...
Makes sense, I have removed the change in this patch. Will track it in YARN-8015.

bq. TargetApplications: remove unused methods
Done

bq. aggregateAllocationTagsByApps/aggregateAllocationTagsByRac --- The two 
methods can be merged.
Good suggestion, I've merged them into one method now. As for the null check, I
prefer to keep it in case this method is used somewhere else in the future,
just to prevent a potential NPE.

bq. getNamespaceScope ... remove those
OK, I have removed those and replaced them with an explicit check on the enum value.

bq. I suggest to add a method like throwExceptionIfNamespaceTypeNotSupported so 
we don't need to change all the places in the future.
That might not be necessary. There are two checks, one in
{{PlacementConstraintsUtil}} and the other in
{{SingleConstraintAppPlacementAllocator}}. I will remove the first one once
YARN-8013 is done, and remove the second one once YARN-8015 is done. But please
let me know if you insist on this; I can add a method in
{{PlacementConstraintsUtil}} and call it in both places.

I will upload a new patch shortly. Thanks



was (Author: cheersyang):
Thanks [~leftnoteasy]

bq. Changes in SingleConstraintAppPlacementAllocato...
Makes sense, I have removed the change in this patch. Will track it in YARN-8015.

bq. TargetApplications: remove unused methods
Done

bq. aggregateAllocationTagsByApps/aggregateAllocationTagsByRac --- The two 
methods can be merged.
Good suggestion, I've merged them into one method now. As for the null check, I
prefer to keep it in case this method is used somewhere else in the future,
just to prevent a potential NPE.

bq. getNamespaceScope ... remove those
OK, I have removed those and replaced them with an explicit check on the enum value.

bq. I suggest to add a method like throwExceptionIfNamespaceTypeNotSupported so 
we don't need to change all the places in the future.
That might not be necessary. There are two checks, one in
{{PlacementConstraintsUtil}} and the other in
{{SingleConstraintAppPlacementAllocator}}. I will remove the first one once
YARN-8013 is done, and remove the second one once YARN-8015 is done. But please
let me know if you insist on this; I can add a method in
{{PlacementConstraintsUtil}} and call it in both places.

Thanks


> Support NOT_SELF and ALL namespace types for allocation tag
> ---
>
> Key: YARN-8002
> URL: https://issues.apache.org/jira/browse/YARN-8002
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: YARN-8002.001.patch, YARN-8002.002.patch, 
> YARN-8002.003.patch
>
>
> This is a continuation task after YARN-7972. YARN-7972 adds support for specifying
> tags with the namespaces SELF and APP_ID, like the following:
>  * self/
>  * app-id//
> This task is to track the work to support 2 of the remaining namespace types,
> *NOT_SELF* & *ALL* (we'll support app-label later):
>  * not-self/
>  * all/
> This will require a bit of refactoring in {{AllocationTagsManager}}, as it needs
> to do proper aggregation on tags for multiple apps.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7636) Re-reservation count may overflow when cluster resource exhausted for a long time

2018-03-14 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399790#comment-16399790
 ] 

genericqa commented on YARN-7636:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
29s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m  8s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
21s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch 
failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
21s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch 
failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 21s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
23s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch 
failed. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red}  2m 
57s{color} | {color:red} patch has errors when building and testing our client 
artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
21s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch 
failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 22s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 34m 40s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:d4cc50f |
| JIRA Issue | YARN-7636 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12914440/YARN-7636.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 12ca1598694e 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 
11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e840b4a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-YARN-Build/19983/artifact/out/patch-mvninstall-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| compile | 

[jira] [Commented] (YARN-8002) Support NOT_SELF and ALL namespace types for allocation tag

2018-03-14 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399777#comment-16399777
 ] 

Weiwei Yang commented on YARN-8002:
---

Thanks [~leftnoteasy]

bq. Changes in SingleConstraintAppPlacementAllocato...
Makes sense, I have removed the change in this patch. Will track it in YARN-8015.

bq. TargetApplications: remove unused methods
Done

bq. aggregateAllocationTagsByApps/aggregateAllocationTagsByRac --- The two 
methods can be merged.
Good suggestion, I've merged them into one method now. As for the null check, I
prefer to keep it in case this method is used somewhere else in the future,
just to prevent a potential NPE.

bq. getNamespaceScope ... remove those
OK, I have removed those and replaced them with an explicit check on the enum value.

bq. I suggest to add a method like throwExceptionIfNamespaceTypeNotSupported so 
we don't need to change all the places in the future.
That might not be necessary. There are two checks, one in
{{PlacementConstraintsUtil}} and the other in
{{SingleConstraintAppPlacementAllocator}}. I will remove the first one once
YARN-8013 is done, and remove the second one once YARN-8015 is done. But please
let me know if you insist on this; I can add a method in
{{PlacementConstraintsUtil}} and call it in both places.

Thanks


> Support NOT_SELF and ALL namespace types for allocation tag
> ---
>
> Key: YARN-8002
> URL: https://issues.apache.org/jira/browse/YARN-8002
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: YARN-8002.001.patch, YARN-8002.002.patch, 
> YARN-8002.003.patch
>
>
> This is a continuation task after YARN-7972. YARN-7972 adds support for specifying
> tags with the namespaces SELF and APP_ID, like the following:
>  * self/
>  * app-id//
> This task is to track the work to support 2 of the remaining namespace types,
> *NOT_SELF* & *ALL* (we'll support app-label later):
>  * not-self/
>  * all/
> This will require a bit of refactoring in {{AllocationTagsManager}}, as it needs
> to do proper aggregation on tags for multiple apps.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7654) Support ENTRY_POINT for docker container

2018-03-14 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399772#comment-16399772
 ] 

Eric Yang commented on YARN-7654:
-

[~ebadger] [~jlowe] [~shaneku...@gmail.com] This is an early patch using execl
instead of a shell script to launch. I am currently running into a problem where,
after docker is started, the existing logic for detecting docker exit is not
working. Let me know what you think about this approach. I am also concerned that
we use popen for the docker launch in the non-ENTRY_POINT version; it actually
spawns a shell underneath. We might need to convert the output buffer from a
char array to a char pointer array to avoid chopping strings in the middle. Your
early feedback is appreciated. Thanks.

> Support ENTRY_POINT for docker container
> 
>
> Key: YARN-7654
> URL: https://issues.apache.org/jira/browse/YARN-7654
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.1.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Blocker
> Attachments: YARN-7654.001.patch, YARN-7654.002.patch
>
>
> A Docker image may have ENTRY_POINT predefined, but this is not supported in 
> the current implementation.  It would be nice if we could detect the existence of 
> {{launch_command}} and, based on this variable, launch the docker container in 
> different ways:
> h3. Launch command exists
> {code}
> docker run [image]:[version]
> docker exec [container_id] [launch_command]
> {code}
> h3. Use ENTRY_POINT
> {code}
> docker run [image]:[version]
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7654) Support ENTRY_POINT for docker container

2018-03-14 Thread Eric Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-7654:

Attachment: YARN-7654.002.patch

> Support ENTRY_POINT for docker container
> 
>
> Key: YARN-7654
> URL: https://issues.apache.org/jira/browse/YARN-7654
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.1.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Blocker
> Attachments: YARN-7654.001.patch, YARN-7654.002.patch
>
>
> A Docker image may have ENTRY_POINT predefined, but this is not supported in 
> the current implementation.  It would be nice if we could detect the existence of 
> {{launch_command}} and, based on this variable, launch the docker container in 
> different ways:
> h3. Launch command exists
> {code}
> docker run [image]:[version]
> docker exec [container_id] [launch_command]
> {code}
> h3. Use ENTRY_POINT
> {code}
> docker run [image]:[version]
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7636) Re-reservation count may overflow when cluster resource exhausted for a long time

2018-03-14 Thread Tao Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399739#comment-16399739
 ] 

Tao Yang commented on YARN-7636:


Thanks [~cheersyang] and [~leftnoteasy] for your reviews.

Triggering the QA run to verify this patch.

> Re-reservation count may overflow when cluster resource exhausted for a long 
> time 
> --
>
> Key: YARN-7636
> URL: https://issues.apache.org/jira/browse/YARN-7636
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.1.0, 2.9.1
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-7636.001.patch, YARN-7636.002.patch
>
>
> This has happened on our production cluster twice: when a request cannot be 
> satisfied for a long time, it continually triggers re-reservation and 
> eventually causes the overflow, which crashes the scheduler.
> Exception stack:
> {noformat}
> java.lang.IllegalArgumentException: Overflow adding 1 occurrences to a count 
> of 2147483647
>         at 
> com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:246)
>         at 
> com.google.common.collect.AbstractMultiset.add(AbstractMultiset.java:80)
>         at 
> com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:51)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.addReReservation(SchedulerApplicationAttempt.java:406)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.reserve(SchedulerApplicationAttempt.java:555)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.reserve(FiCaSchedulerApp.java:1076)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.apply(FiCaSchedulerApp.java:795)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2770)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:546)
> {noformat}
> Referring to the handling in SchedulerApplicationAttempt#addSchedulingOpportunity, we 
> can ignore this exception to avoid the problem.
> The same problem may happen in 
> SchedulerApplicationAttempt#addMissedNonPartitionedRequestSchedulingOpportunity;
> fix it in the same way.
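
A minimal sketch of the ignore-overflow approach described above (illustrative 
only; {{reReservations}} is the Guava ConcurrentHashMultiset from the stack 
trace, and {{schedulerKey}} is assumed to be the key being counted):
{code:java}
// Illustrative sketch only -- not the actual patch.
try {
  reReservations.add(schedulerKey);
} catch (IllegalArgumentException e) {
  // The count is already Integer.MAX_VALUE; the counter is informational,
  // so swallow the overflow instead of crashing the scheduler, mirroring
  // the handling in addSchedulingOpportunity.
}
{code}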



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8016) Refine PlacementRule interface and add a app-name queue mapping rule as an example

2018-03-14 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399696#comment-16399696
 ] 

Yufei Gu commented on YARN-8016:


Works for me. 

> Refine PlacementRule interface and add a app-name queue mapping rule as an 
> example
> --
>
> Key: YARN-8016
> URL: https://issues.apache.org/jira/browse/YARN-8016
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Zian Chen
>Assignee: Zian Chen
>Priority: Major
> Attachments: YARN-8016.001.patch
>
>
> After YARN-3635/YARN-6689, PlacementRule has become a common interface which can 
> be used by the scheduler and can be dynamically updated by the scheduler according to 
> its configs. Some work remains: 
> - There is no way to initialize a PlacementRule.
> - There is no example of a PlacementRule other than the user-group mapping one.
> This JIRA targets refining the PlacementRule interface and adding another 
> PlacementRule example.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8016) Refine PlacementRule interface and add a app-name queue mapping rule as an example

2018-03-14 Thread Zian Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399691#comment-16399691
 ] 

Zian Chen commented on YARN-8016:
-

Hi [~yufeigu], thank you for your suggestions. As Wangda gave the comments 
regarding the FairScheduler part, I'll focus on the CS part first. We could discuss 
further if the situation changes. Meanwhile, I will update the patch according 
to Wangda's comments. Please let me know if you have any other comments or 
suggestions. Thanks!

> Refine PlacementRule interface and add a app-name queue mapping rule as an 
> example
> --
>
> Key: YARN-8016
> URL: https://issues.apache.org/jira/browse/YARN-8016
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Zian Chen
>Assignee: Zian Chen
>Priority: Major
> Attachments: YARN-8016.001.patch
>
>
> After YARN-3635/YARN-6689, PlacementRule has become a common interface which can 
> be used by the scheduler and can be dynamically updated by the scheduler according to 
> its configs. Some work remains: 
> - There is no way to initialize a PlacementRule.
> - There is no example of a PlacementRule other than the user-group mapping one.
> This JIRA targets refining the PlacementRule interface and adding another 
> PlacementRule example.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8016) Refine PlacementRule interface and add a app-name queue mapping rule as an example

2018-03-14 Thread Zian Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399686#comment-16399686
 ] 

Zian Chen commented on YARN-8016:
-

Hi [~leftnoteasy], thank you so much for your comments. Here are my thoughts 
after further discussion.

1) Will remove the readlock here.

2) Initially I checked rule.getQueueMappingLists() instead of rule because we 
changed the initialize logic inside UserGroupMappingRule and the other rules to 
extend the initialize method from PlacementRule instead of calling the get() method, 
to normalize the logic. In the initialize method, rule will not become null; we only 
affect the mappings field of the rule variable. However, your suggestion also makes 
a lot of sense, so I propose we do this:

throw an Exception whenever the mappings are empty in the initialize method, 
which means the user did not actually specify any placement rules to initialize. 
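
A rough sketch of that proposal (illustrative only; the initialize signature, 
return type, and exception type below are assumptions, not the actual patch):
{code:java}
// Illustrative sketch only -- signature and exception type are assumptions.
@Override
public boolean initialize(ResourceScheduler scheduler) throws IOException {
  // ... populate this.mappings from the scheduler configuration ...
  if (mappings == null || mappings.isEmpty()) {
    throw new IOException(
        "No queue mappings configured for " + getClass().getSimpleName());
  }
  return true;
}
{code}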

 

Will remove List, as we only want to provide a general 
framework here; using List would restrict users from specifying a 
dynamic placement rule mapping strategy. 

 

5) public UserGroupMappingPlacementRule(){} is the default constructor, which 
will be hidden since we have a constructor with parameters inside the 
UserGroupMappingPlacementRule class. I could modify it to call 

public UserGroupMappingPlacementRule(boolean overrideWithQueueMappings,
List newMappings, Groups groups)

so that
{code:java}
public UserGroupMappingPlacementRule(){}
{code}
is not necessary.

6) {{validateParentQueue}}: rename it to something like 
{{validateQueueMappingUnderParentQueue}} and place it under 
{{QueuePlacementRuleUtils}}?

 

9) Agree. The only thing I want to suggest here is that we should probably 
check whether placementRuleStrs contains duplicate placement rules first; if so, 
we can simply fail the case, as this is more of a user input error. Then add 
UserGroupMappingPlacementRule if absent, and then loop over the rules.
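
Roughly, something like the following (illustrative sketch only; 
{{placementRuleStrs}} is the configured rule list mentioned above, and the 
exception handling is an assumption):
{code:java}
// Illustrative sketch only.
Set<String> distinctRules = new LinkedHashSet<>();
for (String ruleStr : placementRuleStrs) {
  if (!distinctRules.add(ruleStr)) {
    // A duplicate placement rule is a user configuration error: fail fast.
    throw new IOException("Duplicate placement rule configured: " + ruleStr);
  }
}
// Add the user-group mapping rule if absent, then loop over the rules.
distinctRules.add(UserGroupMappingPlacementRule.class.getCanonicalName());
for (String ruleStr : distinctRules) {
  // instantiate and initialize the corresponding PlacementRule here
}
{code}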

 

I agree with all the other comments and will change them according to the 
suggestions.

 

Thanks!

 

> Refine PlacementRule interface and add a app-name queue mapping rule as an 
> example
> --
>
> Key: YARN-8016
> URL: https://issues.apache.org/jira/browse/YARN-8016
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Zian Chen
>Assignee: Zian Chen
>Priority: Major
> Attachments: YARN-8016.001.patch
>
>
> After YARN-3635/YARN-6689, PlacementRule has become a common interface which can 
> be used by the scheduler and can be dynamically updated by the scheduler according to 
> its configs. Some work remains: 
> - There is no way to initialize a PlacementRule.
> - There is no example of a PlacementRule other than the user-group mapping one.
> This JIRA targets refining the PlacementRule interface and adding another 
> PlacementRule example.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8006) Make Hbase-2 profile as default for YARN-7055 branch

2018-03-14 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399671#comment-16399671
 ] 

Haibo Chen commented on YARN-8006:
--

It looks like we can run the Yetus checks on a Linux machine by doing 
"./start-build-env.sh" followed by "dev-support/bin/test-patch PATH_TO_PATCH".

That basically downloads Yetus and launches it to test the patch just as the 
Jenkins job does (with slight differences).

 

> Make Hbase-2 profile as default for YARN-7055 branch
> 
>
> Key: YARN-8006
> URL: https://issues.apache.org/jira/browse/YARN-8006
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Major
> Attachments: YARN-8006-YARN-7055.001.patch, 
> YARN-8006-YARN-8006.00.patch, YARN-8006-YARN-8006.01.patch
>
>
> In the last weekly call, folks discussed that we should have a separate branch with 
> hbase-2 as the default profile. Trunk's default profile is hbase-1, which runs 
> all the tests under the hbase-1 profile, but the tests for the hbase-2 profile are not 
> running.
> As per the discussion, let's keep the YARN-7055 branch with the hbase-2 profile as the 
> default. Any server-side patches can be given to this branch as well, which 
> runs tests for the hbase-2 profile. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7574) Add support for Node Labels on Auto Created Leaf Queue Template

2018-03-14 Thread Suma Shivaprasad (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suma Shivaprasad updated YARN-7574:
---
Attachment: YARN-7574.5.patch

> Add support for Node Labels on Auto Created Leaf Queue Template
> ---
>
> Key: YARN-7574
> URL: https://issues.apache.org/jira/browse/YARN-7574
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Reporter: Suma Shivaprasad
>Assignee: Suma Shivaprasad
>Priority: Major
> Attachments: YARN-7574.1.patch, YARN-7574.2.patch, YARN-7574.3.patch, 
> YARN-7574.4.patch, YARN-7574.5.patch
>
>
> YARN-7473 adds support for auto created leaf queues to inherit node label 
> capacities from parent queues. However, there is no support for the leaf queue 
> template to allow different configured capacities for different node labels. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7574) Add support for Node Labels on Auto Created Leaf Queue Template

2018-03-14 Thread Suma Shivaprasad (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399645#comment-16399645
 ] 

Suma Shivaprasad commented on YARN-7574:


Fixed UT and merged with YARN-7657 changes

> Add support for Node Labels on Auto Created Leaf Queue Template
> ---
>
> Key: YARN-7574
> URL: https://issues.apache.org/jira/browse/YARN-7574
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Reporter: Suma Shivaprasad
>Assignee: Suma Shivaprasad
>Priority: Major
> Attachments: YARN-7574.1.patch, YARN-7574.2.patch, YARN-7574.3.patch, 
> YARN-7574.4.patch, YARN-7574.5.patch
>
>
> YARN-7473 adds support for auto created leaf queues to inherit node label 
> capacities from parent queues. However, there is no support for the leaf queue 
> template to allow different configured capacities for different node labels. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7952) RM should be able to recover log aggregation status after restart/fail-over

2018-03-14 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399640#comment-16399640
 ] 

Robert Kanter commented on YARN-7952:
-

LGTM +1 pending Jenkins

> RM should be able to recover log aggregation status after restart/fail-over
> ---
>
> Key: YARN-7952
> URL: https://issues.apache.org/jira/browse/YARN-7952
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>Priority: Major
> Attachments: YARN-7952-poc.patch, YARN-7952.1.patch, 
> YARN-7952.2.patch, YARN-7952.3.patch, YARN-7952.3.patch, YARN-7952.5.patch, 
> YARN-7952.6.patch, YARN-7952.7.patch, YARN-7952.8.patch, YARN-7952.9.patch
>
>
> Right now, the NM sends its own log aggregation status to the RM 
> periodically, and the RM aggregates the status for each application, 
> but it will not generate the final status until a client call (from the web UI or 
> CLI) triggers it. However, the RM never persists the log aggregation status, so when the 
> RM restarts/fails over, the log aggregation status becomes “NOT_STARTED”. 
> This is confusing; maybe we should change it to “NOT_AVAILABLE” (will create 
> a separate ticket for this). Either way, we need to persist the log aggregation 
> status for future use.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7952) RM should be able to recover log aggregation status after restart/fail-over

2018-03-14 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399634#comment-16399634
 ] 

Xuan Gong commented on YARN-7952:
-

Uploaded a new patch to address all the comments

> RM should be able to recover log aggregation status after restart/fail-over
> ---
>
> Key: YARN-7952
> URL: https://issues.apache.org/jira/browse/YARN-7952
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>Priority: Major
> Attachments: YARN-7952-poc.patch, YARN-7952.1.patch, 
> YARN-7952.2.patch, YARN-7952.3.patch, YARN-7952.3.patch, YARN-7952.5.patch, 
> YARN-7952.6.patch, YARN-7952.7.patch, YARN-7952.8.patch, YARN-7952.9.patch
>
>
> Right now, the NM sends its own log aggregation status to the RM 
> periodically, and the RM aggregates the status for each application, 
> but it will not generate the final status until a client call (from the web UI or 
> CLI) triggers it. However, the RM never persists the log aggregation status, so when the 
> RM restarts/fails over, the log aggregation status becomes “NOT_STARTED”. 
> This is confusing; maybe we should change it to “NOT_AVAILABLE” (will create 
> a separate ticket for this). Either way, we need to persist the log aggregation 
> status for future use.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7952) RM should be able to recover log aggregation status after restart/fail-over

2018-03-14 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-7952:

Attachment: YARN-7952.9.patch

> RM should be able to recover log aggregation status after restart/fail-over
> ---
>
> Key: YARN-7952
> URL: https://issues.apache.org/jira/browse/YARN-7952
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>Priority: Major
> Attachments: YARN-7952-poc.patch, YARN-7952.1.patch, 
> YARN-7952.2.patch, YARN-7952.3.patch, YARN-7952.3.patch, YARN-7952.5.patch, 
> YARN-7952.6.patch, YARN-7952.7.patch, YARN-7952.8.patch, YARN-7952.9.patch
>
>
> Right now, the NM sends its own log aggregation status to the RM 
> periodically, and the RM aggregates the status for each application, 
> but it will not generate the final status until a client call (from the web UI or 
> CLI) triggers it. However, the RM never persists the log aggregation status, so when the 
> RM restarts/fails over, the log aggregation status becomes “NOT_STARTED”. 
> This is confusing; maybe we should change it to “NOT_AVAILABLE” (will create 
> a separate ticket for this). Either way, we need to persist the log aggregation 
> status for future use.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8028) Support authorizeUserAccessToQueue in RMWebServices

2018-03-14 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399585#comment-16399585
 ] 

genericqa commented on YARN-8028:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
 8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  1s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
15s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 49s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: The patch generated 8 new + 
26 unchanged - 3 fixed = 34 total (was 29) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  7s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 23s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
22s{color} | {color:green} hadoop-yarn-server-router in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}123m 44s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:d4cc50f |
| JIRA Issue | YARN-8028 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12914578/YARN-8028.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 27c388ad24c5 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 
11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 252c2b4 |
| maven | 

[jira] [Commented] (YARN-7221) Add security check for privileged docker container

2018-03-14 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399551#comment-16399551
 ] 

Eric Yang commented on YARN-7221:
-

[~ebadger] Are you running sudo -U ebadger -n -l docker as the root user or as yarn?  
When container-executor runs this check, it uses root privileges, 
so the password prompt is omitted.  The sudo session doesn't exist beyond the 
check.

> Add security check for privileged docker container
> --
>
> Key: YARN-7221
> URL: https://issues.apache.org/jira/browse/YARN-7221
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-7221.001.patch, YARN-7221.002.patch, 
> YARN-7221.003.patch, YARN-7221.004.patch, YARN-7221.005.patch, 
> YARN-7221.006.patch
>
>
> When a docker container is running with privileges, the majority of the use cases are to have 
> some program start as root and then drop privileges to another user, e.g. 
> httpd starting privileged to bind to port 80, then dropping privileges to the 
> www user.  
> # We should add a security check for submitting users, to verify that they have 
> "sudo" access to run a privileged container.  
> # We should remove --user=uid:gid for privileged containers.  
>  
> Docker can be launched with both the --privileged=true and --user=uid:gid flags.  With 
> this parameter combination, the user will not have access to become the root user.  
> All docker exec commands will drop to the uid:gid user instead of 
> being granted privileges.  A user can gain root privileges if the container file system 
> contains files that give the user extra power, but this type of image is 
> considered dangerous.  A non-privileged user can launch a container with 
> special bits to acquire the same level of root power.  Hence, we lose control of 
> which images should be run with --privileged, and who has sudo rights to use 
> privileged container images.  As a result, we should check for sudo access and 
> then decide whether to parameterize --privileged=true OR --user=uid:gid.  This will 
> avoid leading developers down the wrong path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5627) [Atsv2] Support streaming reader API to fetch entities

2018-03-14 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399549#comment-16399549
 ] 

Haibo Chen commented on YARN-5627:
--

Thanks [~rohithsharma] for the pointer. Trying to understand FROM_ID better. 
Say an application has stored 1000 custom entities of the same entity type in 
the EntityTable. Hence the table looks like this:

 
{code:java}
user1/clusterId/flowName/flowRunId/AppId/TYPE_A/e0001

user1/clusterId/flowName/flowRunId/AppId/TYPE_A/e0002

user1/clusterId/flowName/flowRunId/AppId/TYPE_A/e0003

...

user1/clusterId/flowName/flowRunId/AppId/TYPE_A/e1000

{code}
 

How would a REST client make use of FROM_ID to retrieve 100 entities at a time? 
My current understanding is that fromId is always used along with limit, and 
fromId is inclusive. That should support server-side pagination in theory. But 
the REST client needs to know what the first fromId is, right?
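
For example, the client-side loop I have in mind would look roughly like this 
({{fetchEntities}} below is a hypothetical helper wrapping the reader REST call 
with its {{limit}} and {{fromid}} query parameters, not a real API):
{code:java}
// Hypothetical client-side pagination loop -- illustrative only.
String fromId = null;                  // unknown for the very first page?
List<TimelineEntity> page;
do {
  // limit=100; fromid is inclusive, null meaning "start from the beginning".
  page = fetchEntities("TYPE_A", 100, fromId);
  process(page);
  if (page.size() < 100) {
    break;                             // last page reached
  }
  // Because fromid is inclusive, the next call repeats the last entity unless
  // the client skips it -- hence the question about the first/next fromId.
  fromId = page.get(page.size() - 1).getId();
} while (true);
{code}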

> [Atsv2] Support streaming reader API to fetch entities
> --
>
> Key: YARN-5627
> URL: https://issues.apache.org/jira/browse/YARN-5627
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Major
>
> There is no limit on the size of a TimelineEntity object; it can vary from 
> KBs to MBs. While reading an entity list, there is a potential issue that the 
> TimelineReader could run into an OOM situation depending on the entity size and limit. 
> The proposal is to support a streaming API to read the entity list. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8018) Yarn service: Add support for initiating service upgrade

2018-03-14 Thread Chandni Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandni Singh updated YARN-8018:

Attachment: YARN-8018.wip.patch

> Yarn service: Add support for initiating service upgrade
> 
>
> Key: YARN-8018
> URL: https://issues.apache.org/jira/browse/YARN-8018
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Attachments: YARN-8018.wip.patch
>
>
> Add support for initiating service upgrade which includes the following main 
> changes:
>  # Service API to initiate upgrade
>  # Persist service version on hdfs
>  # Start the upgraded version of service



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8018) Yarn service: Add support for initiating service upgrade

2018-03-14 Thread Chandni Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandni Singh updated YARN-8018:

Attachment: (was: YARN-8018.wip.patch)

> Yarn service: Add support for initiating service upgrade
> 
>
> Key: YARN-8018
> URL: https://issues.apache.org/jira/browse/YARN-8018
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
>
> Add support for initiating service upgrade which includes the following main 
> changes:
>  # Service API to initiate upgrade
>  # Persist service version on hdfs
>  # Start the upgraded version of service



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7221) Add security check for privileged docker container

2018-03-14 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399493#comment-16399493
 ] 

Eric Badger commented on YARN-7221:
---

Hi [~eyang], I just tried out patch 006. I'm getting some weird behavior. I run 
a job as my user "ebadger" with privileges and it succeeds. The containers are 
all run as privileged containers and are entered with the user root. However, 
immediately after running the container, I run {{sudo -U ebadger -n -l docker}} 
and it says {{sudo: a password is required}}. This doesn't seem consistent 
since I'm doing the exact same sudo check that the container-executor is doing. 

{noformat}
[ebadger@foobar ~]$ export 
vars="YARN_CONTAINER_RUNTIME_DOCKER_RUN_PRIVILEGED_CONTAINER=true,YARN_CONTAINER_RUNTIME_TYPE=docker,YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=foo/rhel7";
 $HADOOP_PREFIX/bin/hadoop jar 
$HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar pi 
-Dyarn.app.mapreduce.am.env=$vars -Dmapreduce.map.env=$vars 
-Dmapreduce.reduce.env=$vars 10 100
WARNING: HADOOP_PREFIX has been replaced by HADOOP_HOME. Using value of 
HADOOP_PREFIX.
Number of Maps  = 10
Samples per Map = 100
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Starting Job
2018-03-14 21:43:59,007 INFO  [main] client.RMProxy 
(RMProxy.java:newProxyInstance(133)) - Connecting to ResourceManager at 
/127.0.0.1:8040
2018-03-14 21:43:59,278 INFO  [main] client.AHSProxy 
(AHSProxy.java:createAHSProxy(42)) - Connecting to Application History server 
at /127.0.0.1:10200
2018-03-14 21:43:59,335 INFO  [main] mapreduce.JobSubmissionFiles 
(JobSubmissionFiles.java:getStagingDir(156)) - Permissions on staging directory 
/tmp/hadoop-yarn/staging/ebadger/.staging are incorrect: rwxrwxrwx. Fixing 
permissions to correct value rwx--
2018-03-14 21:43:59,465 INFO  [main] mapreduce.JobResourceUploader 
(JobResourceUploader.java:disableErasureCodingForPath(883)) - Disabling Erasure 
Coding for path: 
/tmp/hadoop-yarn/staging/ebadger/.staging/job_1521043593738_0007
2018-03-14 21:43:59,643 INFO  [main] input.FileInputFormat 
(FileInputFormat.java:listStatus(290)) - Total input files to process : 10
2018-03-14 21:43:59,698 INFO  [main] mapreduce.JobSubmitter 
(JobSubmitter.java:submitJobInternal(205)) - number of splits:10
2018-03-14 21:43:59,756 INFO  [main] Configuration.deprecation 
(Configuration.java:logDeprecation(1391)) - 
yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, 
use yarn.system-metrics-publisher.enabled
2018-03-14 21:43:59,904 INFO  [main] mapreduce.JobSubmitter 
(JobSubmitter.java:printTokens(301)) - Submitting tokens for job: 
job_1521043593738_0007
2018-03-14 21:43:59,907 INFO  [main] mapreduce.JobSubmitter 
(JobSubmitter.java:printTokens(302)) - Executing with tokens: []
2018-03-14 21:44:00,186 INFO  [main] conf.Configuration 
(Configuration.java:getConfResourceAsInputStream(2749)) - resource-types.xml 
not found
2018-03-14 21:44:00,187 INFO  [main] resource.ResourceUtils 
(ResourceUtils.java:addResourcesFileToConf(418)) - Unable to find 
'resource-types.xml'.
2018-03-14 21:44:00,640 INFO  [main] impl.YarnClientImpl 
(YarnClientImpl.java:submitApplication(306)) - Submitted application 
application_1521043593738_0007
2018-03-14 21:44:00,769 INFO  [main] mapreduce.Job (Job.java:submit(1574)) - 
The url to track the job: 
http://foo.bar.com:8088/proxy/application_1521043593738_0007/
2018-03-14 21:44:00,775 INFO  [main] mapreduce.Job 
(Job.java:monitorAndPrintJob(1619)) - Running job: job_1521043593738_0007
2018-03-14 21:44:15,054 INFO  [main] mapreduce.Job 
(Job.java:monitorAndPrintJob(1640)) - Job job_1521043593738_0007 running in 
uber mode : false
2018-03-14 21:44:15,056 INFO  [main] mapreduce.Job 
(Job.java:monitorAndPrintJob(1647)) -  map 0% reduce 0%
2018-03-14 21:44:27,209 INFO  [main] mapreduce.Job 
(Job.java:monitorAndPrintJob(1647)) -  map 30% reduce 0%
2018-03-14 21:44:46,435 INFO  [main] mapreduce.Job 
(Job.java:monitorAndPrintJob(1647)) -  map 40% reduce 0%
2018-03-14 21:44:48,452 INFO  [main] mapreduce.Job 
(Job.java:monitorAndPrintJob(1647)) -  map 60% reduce 0%
2018-03-14 21:45:12,849 INFO  [main] mapreduce.Job 
(Job.java:monitorAndPrintJob(1647)) -  map 80% reduce 0%
2018-03-14 21:45:21,940 INFO  [main] mapreduce.Job 
(Job.java:monitorAndPrintJob(1647)) -  map 80% reduce 27%
2018-03-14 21:45:26,979 INFO  [main] mapreduce.Job 
(Job.java:monitorAndPrintJob(1647)) -  map 100% reduce 27%
2018-03-14 21:45:27,988 INFO  [main] mapreduce.Job 
(Job.java:monitorAndPrintJob(1647)) -  map 100% reduce 30%
2018-03-14 21:45:28,999 INFO  [main] mapreduce.Job 
(Job.java:monitorAndPrintJob(1647)) -  map 100% reduce 100%
2018-03-14 21:45:29,020 INFO  [main] mapreduce.Job 

[jira] [Commented] (YARN-6830) Support quoted strings for environment variables

2018-03-14 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399463#comment-16399463
 ] 

genericqa commented on YARN-6830:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
28s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 27s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
23s{color} | {color:red} hadoop-yarn-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
25s{color} | {color:red} hadoop-yarn-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 25s{color} 
| {color:red} hadoop-yarn-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common: 
The patch generated 0 new + 5 unchanged - 2 fixed = 5 total (was 7) {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
26s{color} | {color:red} hadoop-yarn-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red}  2m 
36s{color} | {color:red} patch has errors when building and testing our client 
artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
16s{color} | {color:red} hadoop-yarn-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 24s{color} 
| {color:red} hadoop-yarn-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 34m 18s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:d4cc50f |
| JIRA Issue | YARN-6830 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12877856/YARN-6830.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux bc1eef437f52 4.4.0-116-generic #140-Ubuntu SMP Mon Feb 12 
21:23:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 252c2b4 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-YARN-Build/19979/artifact/out/patch-mvninstall-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common.txt
 |
| compile | 
https://builds.apache.org/job/PreCommit-YARN-Build/19979/artifact/out/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common.txt
 |
| javac | 

[jira] [Commented] (YARN-8002) Support NOT_SELF and ALL namespace types for allocation tag

2018-03-14 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399443#comment-16399443
 ] 

Wangda Tan commented on YARN-8002:
--

Thanks [~cheersyang], the latest patch looks much better now. 

Some additional minor comments:

1) Changes in SingleConstraintAppPlacementAllocator:
- Before the patch only SELF was supported; after this patch several ALLOCATION_TAG 
namespaces are supported. I'm not sure if we should do that, since we haven't 
validated that inter-app targets work for SingleConstraintAppPlacementAllocator.

2) TargetApplications:
- Remove unused methods.

3) aggregateAllocationTagsByApps/aggregateAllocationTagsByRack:
- The two methods can be merged.
- The appIds != null check is redundant (since 
{{AllocationTagNamespace#getNamespaceScope}} always returns non-null).

4) getNamespaceScope:
- There are several "scope"-related methods; I suggest renaming them to 
get/setApplicationIds. The name "scope" is confusing to me. 
- The following methods are used by tests only; I suggest removing them:
{code}
  /**
   * @return true if the namespace is effective in all applications
   * in this cluster. Specifically the namespace prefix should be
   * "all".
   */
  public boolean isGlobal() {
return AllocationTagNamespaceType.ALL.equals(getNamespaceType());
  }

  /**
   * @return true if the namespace is effective within a single application
   * by its application ID, the namespace prefix should be "app-id";
   * false otherwise.
   */
  public boolean isSingleInterApp() {
return AllocationTagNamespaceType.APP_ID.equals(getNamespaceType());
  }

  /**
   * @return true if the namespace is effective to the application itself,
   * the namespace prefix should be "self"; false otherwise.
   */
  public boolean isIntraApp() {
return AllocationTagNamespaceType.SELF.equals(getNamespaceType());
  }

  /**
   * @return true if the namespace is effective to all applications except
   * itself, the namespace prefix should be "not-self"; false otherwise.
   */
  public boolean isNotSelf() {
return AllocationTagNamespaceType.NOT_SELF.equals(getNamespaceType());
  }
{code}
And actually I suggest removing all the is... methods, since we can check 
{{AllocationTagNamespaceType.XYZ.equals(getNamespaceType())}}. This is optional 
to me; it depends on you.
- Several checks test for unsupported namespaces:
{code} 
// TODO Complete remove this check once we support app-label.
if (namespace.isAppLabel()) {
  throw new InvalidAllocationTagsQueryException(
  namespace.toString() + " is not supported yet!");
}
{code} 
I suggest adding a method like {{throwExceptionIfNamespaceTypeNotSupported}} so 
we don't need to change all these places in the future. 

> Support NOT_SELF and ALL namespace types for allocation tag
> ---
>
> Key: YARN-8002
> URL: https://issues.apache.org/jira/browse/YARN-8002
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: YARN-8002.001.patch, YARN-8002.002.patch, 
> YARN-8002.003.patch
>
>
> This is a continuation task after YARN-7972. YARN-7972 adds support for specifying 
> tags with the SELF and APP_ID namespaces, like the following:
>  * self/
>  * app-id//
> This task tracks the work to support two of the remaining namespace types, 
> *NOT_SELF* & *ALL* (we'll support app-label later):
>  * not-self/
>  * all/
> This will require a bit of refactoring in {{AllocationTagsManager}}, as it needs 
> to do some proper aggregation on tags for multiple apps.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7636) Re-reservation count may overflow when cluster resource exhausted for a long time

2018-03-14 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399416#comment-16399416
 ] 

Wangda Tan commented on YARN-7636:
--

Thanks [~Tao Yang] for the nice catch! Patch LGTM. 

> Re-reservation count may overflow when cluster resource exhausted for a long 
> time 
> --
>
> Key: YARN-7636
> URL: https://issues.apache.org/jira/browse/YARN-7636
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.1.0, 2.9.1
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-7636.001.patch, YARN-7636.002.patch
>
>
> This has happened on our production cluster twice: when a request cannot be 
> satisfied for a long time, it continually triggers re-reservation and 
> eventually causes the overflow, which crashes the scheduler.
> Exception stack:
> {noformat}
> java.lang.IllegalArgumentException: Overflow adding 1 occurrences to a count 
> of 2147483647
>         at 
> com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:246)
>         at 
> com.google.common.collect.AbstractMultiset.add(AbstractMultiset.java:80)
>         at 
> com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:51)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.addReReservation(SchedulerApplicationAttempt.java:406)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.reserve(SchedulerApplicationAttempt.java:555)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.reserve(FiCaSchedulerApp.java:1076)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.apply(FiCaSchedulerApp.java:795)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2770)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:546)
> {noformat}
> Referring to the handling in SchedulerApplicationAttempt#addSchedulingOpportunity, we 
> can ignore this exception to avoid the problem.
> The same problem may happen in 
> SchedulerApplicationAttempt#addMissedNonPartitionedRequestSchedulingOpportunity;
> fix it in the same way.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6434) When setting environment variables, can't use comma for a list of value in key = value pairs.

2018-03-14 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399409#comment-16399409
 ] 

genericqa commented on YARN-6434:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  8s{color} 
| {color:red} YARN-6434 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-6434 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12861802/YARN-6434.001.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/19980/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> When setting environment variables, can't use comma for a list of value in 
> key = value pairs.
> -
>
> Key: YARN-6434
> URL: https://issues.apache.org/jira/browse/YARN-6434
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jaeboo Jeong
>Priority: Major
> Attachments: YARN-6434.001.patch
>
>
> We can set environment variables using yarn.app.mapreduce.am.env, 
> mapreduce.map.env, mapreduce.reduce.env.
> There is no problem if we use key=value pairs like X=Y, X=$Y.
> However, if we want to set a key to a list of values (e.g. X=Y,Z), we can't.
> This is related to YARN-4595.
> The attached patch is based on YARN-3768.
> We can set environment variables like below.
> {code}
> mapreduce.map.env="YARN_CONTAINER_RUNTIME_TYPE=docker,YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=hadoop-docker,YARN_CONTAINER_RUNTIME_DOCKER_LOCAL_RESOURCE_MOUNTS=\"/dir1:/targetdir1,/dir2:/targetdir2\""
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8027) Setting hostname of docker container breaks for --net=host in docker 1.13

2018-03-14 Thread Suma Shivaprasad (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399388#comment-16399388
 ] 

Suma Shivaprasad commented on YARN-8027:


[~Jim_Brennan] Yes, +1 for current patch.  Thanks

> Setting hostname of docker container breaks for --net=host in docker 1.13
> -
>
> Key: YARN-8027
> URL: https://issues.apache.org/jira/browse/YARN-8027
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-8027.001.patch
>
>
> In DockerLinuxContainerRuntime:launchContainer, we are adding the --hostname 
> argument to the docker run command to set the hostname in the container to 
> something like:  ctr-e84-1520889172376-0001-01-01.
> This does not work when combined with the --net=host command line option in 
> Docker 1.13.1.  It causes multiple failures because the client fails to resolve 
> the hostname.
> We haven't seen this before because we were using docker 1.12.6 which seems 
> to ignore --hostname when you are using --net=host.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8027) Setting hostname of docker container breaks for --net=host in docker 1.13

2018-03-14 Thread Jim Brennan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399355#comment-16399355
 ] 

Jim Brennan commented on YARN-8027:
---

[~suma.shivaprasad], thanks for your comment.  It sounds like the current patch 
would be ok with you then?  It preserves the current behavior except in the 
case where network is host and Registry DNS is not enabled.
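
In other words, something like the following in launchContainer (illustrative 
sketch only; the variable and setter names below are placeholders, not the 
patch's actual code):
{code:java}
// Illustrative sketch of the condition described above.
boolean hostNetwork = "host".equals(network);
if (!hostNetwork || registryDNSEnabled) {
  // Keep the current behavior: pass --hostname to docker run.
  runCommand.setHostname(hostname);
}
// With --net=host and Registry DNS disabled, --hostname is omitted entirely.
{code}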

 

> Setting hostname of docker container breaks for --net=host in docker 1.13
> -
>
> Key: YARN-8027
> URL: https://issues.apache.org/jira/browse/YARN-8027
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-8027.001.patch
>
>
> In DockerLinuxContainerRuntime:launchContainer, we are adding the --hostname 
> argument to the docker run command to set the hostname in the container to 
> something like:  ctr-e84-1520889172376-0001-01-01.
> This does not work when combined with the --net=host command line option in 
> Docker 1.13.1.  It causes multiple failures because the client fails to resolve 
> the hostname.
> We haven't seen this before because we were using docker 1.12.6 which seems 
> to ignore --hostname when you are using --net=host.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7952) RM should be able to recover log aggregation status after restart/fail-over

2018-03-14 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399351#comment-16399351
 ] 

Robert Kanter commented on YARN-7952:
-

Thanks for the updated patch.  Two last things:
# Instead of throwing an Exception when the config is <= 0, which would crash 
the NM, I think it would be better to log a WARN message and use the default 
value.
# In the {{assertEquals}} statements, the first argument is actually the 
expected value and the second is the actual value, so we need to swap the 
argument order in those tests (a rough sketch of both points follows below).
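
Something along these lines (illustrative only; the config key, default 
constant, and report accessor below are placeholders, not the patch's names):
{code:java}
// 1) Log a WARN and fall back to the default instead of crashing the NM.
long interval = conf.getLong(LOG_AGG_STATUS_INTERVAL_MS,
    DEFAULT_LOG_AGG_STATUS_INTERVAL_MS);
if (interval <= 0) {
  LOG.warn("Invalid value " + interval + " for " + LOG_AGG_STATUS_INTERVAL_MS
      + "; using the default " + DEFAULT_LOG_AGG_STATUS_INTERVAL_MS);
  interval = DEFAULT_LOG_AGG_STATUS_INTERVAL_MS;
}

// 2) JUnit assertEquals: expected value first, actual value second.
assertEquals(LogAggregationStatus.SUCCEEDED, report.getLogAggregationStatus());
{code}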

> RM should be able to recover log aggregation status after restart/fail-over
> ---
>
> Key: YARN-7952
> URL: https://issues.apache.org/jira/browse/YARN-7952
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>Priority: Major
> Attachments: YARN-7952-poc.patch, YARN-7952.1.patch, 
> YARN-7952.2.patch, YARN-7952.3.patch, YARN-7952.3.patch, YARN-7952.5.patch, 
> YARN-7952.6.patch, YARN-7952.7.patch, YARN-7952.8.patch
>
>
> Right now, the NM sends its own log aggregation status to the RM 
> periodically, and the RM aggregates the status for each application, 
> but it will not generate the final status until a client call (from the web UI or 
> CLI) triggers it. However, the RM never persists the log aggregation status, so when the 
> RM restarts/fails over, the log aggregation status becomes “NOT_STARTED”. 
> This is confusing; maybe we should change it to “NOT_AVAILABLE” (will create 
> a separate ticket for this). Either way, we need to persist the log aggregation 
> status for future use.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4781) Support intra-queue preemption for fairness ordering policy.

2018-03-14 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399245#comment-16399245
 ] 

genericqa commented on YARN-4781:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
55s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 18s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 27s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 8 new + 53 unchanged - 0 fixed = 61 total (was 53) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 13s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
13s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 70m 
22s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}119m 24s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | 
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
|  |  Class 
org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.IntraQueueCandidatesSelector$TAFairOrderingComparator
 defines non-transient non-serializable instance field clusterRes  In 
IntraQueueCandidatesSelector.java:instance field clusterRes  In 
IntraQueueCandidatesSelector.java |
|  |  Class 
org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.IntraQueueCandidatesSelector$TAFairOrderingComparator
 defines non-transient non-serializable instance field rc  In 
IntraQueueCandidatesSelector.java:instance field rc  In 
IntraQueueCandidatesSelector.java |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:d4cc50f |
| JIRA Issue | YARN-4781 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12914540/YARN-4781.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  

[jira] [Commented] (YARN-8028) Support authorizeUserAccessToQueue in RMWebServices

2018-03-14 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399244#comment-16399244
 ] 

Wangda Tan commented on YARN-8028:
--

[~sunilg], could you help to review this patch?

> Support authorizeUserAccessToQueue in RMWebServices
> ---
>
> Key: YARN-8028
> URL: https://issues.apache.org/jira/browse/YARN-8028
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Major
> Attachments: YARN-8028.001.patch
>
>
> Currently we have {{QueueUserACLInfo}} in ApplicationClient, we should 
> support similar API in REST API.
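For illustration only, a rough JAX-RS sketch of the kind of endpoint being proposed; the path, parameter names, and response shape below are hypothetical and are not taken from the attached patch:
{code}
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.QueryParam;
import javax.ws.rs.core.MediaType;

@Path("/ws/v1/cluster")
public class QueueAccessSketch {

  /** Hypothetical REST counterpart of QueueUserACLInfo in ApplicationClient. */
  @GET
  @Path("/queues/{queue}/check-access")
  @Produces(MediaType.APPLICATION_JSON)
  public String checkAccess(@PathParam("queue") String queue,
      @QueryParam("user") String user,
      @QueryParam("acl-type") String aclType) {
    // A real implementation would delegate to the scheduler's queue ACLs;
    // this sketch only shows the request/response shape.
    boolean allowed = false; // placeholder decision
    return "{\"queue\":\"" + queue + "\",\"user\":\"" + user
        + "\",\"allowed\":" + allowed + "}";
  }
}
{code}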






[jira] [Updated] (YARN-8028) Support authorizeUserAccessToQueue in RMWebServices

2018-03-14 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-8028:
-
Attachment: YARN-8028.001.patch

> Support authorizeUserAccessToQueue in RMWebServices
> ---
>
> Key: YARN-8028
> URL: https://issues.apache.org/jira/browse/YARN-8028
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Major
> Attachments: YARN-8028.001.patch
>
>
> Currently we have {{QueueUserACLInfo}} in ApplicationClient, we should 
> support similar API in REST API.






[jira] [Commented] (YARN-8029) YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS should not use commas as separators

2018-03-14 Thread Jim Brennan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399240#comment-16399240
 ] 

Jim Brennan commented on YARN-8029:
---

Thanks [~shaneku...@gmail.com]. This does appear to be a duplicate of 
YARN-6830, with respect to the underlying problem, but proposes a different 
solution. Supporting the ability to quote the values does seem like a natural 
approach - it's the first thing I tried to do.

I proposed changing the delimiters in these docker runtime variables because it 
is a safe change - it can't break anything, since multiple mounts currently do 
not work with commas anyway. While a comma seems like a natural choice for the 
delimiter, I don't think changing it to something else would be much of a 
hardship as long as it is documented.

I'm willing to work on either this or YARN-6830, depending on which option is 
favored. cc: [~jlowe], [~templedf], [~aw]
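To make the failure mode concrete, a small self-contained example of the comma splitting described above (the mount values are made up for illustration):
{code}
import java.util.Arrays;

public class EnvDelimiterExample {
  public static void main(String[] args) {
    // Example value a user might pass via -Dmapreduce.map.env: the env var
    // itself contains a comma because it lists two mounts.
    String mapEnv = "YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS="
        + "/etc/passwd:/etc/passwd:ro,/etc/group:/etc/group:ro";

    // The env option is parsed as a comma-separated list of KEY=VALUE pairs,
    // so splitting on "," chops the mount list apart and only the first mount
    // stays attached to the variable.
    String[] pieces = mapEnv.split(",");
    System.out.println(Arrays.toString(pieces));
    // -> [YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS=/etc/passwd:/etc/passwd:ro,
    //     /etc/group:/etc/group:ro]
    // The second element is no longer a KEY=VALUE pair, so the second mount
    // is effectively dropped.
  }
}
{code}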

> YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS should not use commas as separators
> 
>
> Key: YARN-8029
> URL: https://issues.apache.org/jira/browse/YARN-8029
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0
>Reporter: Jim Brennan
>Priority: Major
>
> The following docker-related environment variables specify a comma-separated 
> list of mounts:
> YARN_CONTAINER_RUNTIME_DOCKER_LOCAL_RESOURCE_MOUNTS
> YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS
> This is a problem because hadoop -Dmapreduce.map.env and related options use  
> comma as a delimiter.   So if I put more than one mount in 
> YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS the comma in the variable will be 
> treated as a delimiter for the hadoop command line option and all but the 
> first mount will be ignored.






[jira] [Commented] (YARN-8029) YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS should not use commas as separators

2018-03-14 Thread Billie Rinaldi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399236#comment-16399236
 ] 

Billie Rinaldi commented on YARN-8029:
--

See also YARN-6434.

> YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS should not use commas as separators
> 
>
> Key: YARN-8029
> URL: https://issues.apache.org/jira/browse/YARN-8029
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0
>Reporter: Jim Brennan
>Priority: Major
>
> The following docker-related environment variables specify a comma-separated 
> list of mounts:
> YARN_CONTAINER_RUNTIME_DOCKER_LOCAL_RESOURCE_MOUNTS
> YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS
> This is a problem because hadoop -Dmapreduce.map.env and related options use  
> comma as a delimiter.   So if I put more than one mount in 
> YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS the comma in the variable will be 
> treated as a delimiter for the hadoop command line option and all but the 
> first mount will be ignored.






[jira] [Updated] (YARN-8018) Yarn service: Add support for initiating service upgrade

2018-03-14 Thread Chandni Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandni Singh updated YARN-8018:

Attachment: (was: YARN-8018.wip.patch)

> Yarn service: Add support for initiating service upgrade
> 
>
> Key: YARN-8018
> URL: https://issues.apache.org/jira/browse/YARN-8018
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Attachments: YARN-8018.wip.patch
>
>
> Add support for initiating service upgrade which includes the following main 
> changes:
>  # Service API to initiate upgrade
>  # Persist service version on hdfs
>  # Start the upgraded version of service






[jira] [Updated] (YARN-8018) Yarn service: Add support for initiating service upgrade

2018-03-14 Thread Chandni Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandni Singh updated YARN-8018:

Attachment: YARN-8018.wip.patch

> Yarn service: Add support for initiating service upgrade
> 
>
> Key: YARN-8018
> URL: https://issues.apache.org/jira/browse/YARN-8018
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Attachments: YARN-8018.wip.patch
>
>
> Add support for initiating service upgrade which includes the following main 
> changes:
>  # Service API to initiate upgrade
>  # Persist service version on hdfs
>  # Start the upgraded version of service






[jira] [Updated] (YARN-8018) Yarn service: Add support for initiating service upgrade

2018-03-14 Thread Chandni Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandni Singh updated YARN-8018:

Attachment: YARN-8018.wip.patch

> Yarn service: Add support for initiating service upgrade
> 
>
> Key: YARN-8018
> URL: https://issues.apache.org/jira/browse/YARN-8018
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Attachments: YARN-8018.wip.patch
>
>
> Add support for initiating service upgrade which includes the following main 
> changes:
>  # Service API to initiate upgrade
>  # Persist service version on hdfs
>  # Start the upgraded version of service






[jira] [Updated] (YARN-8018) Yarn service: Add support for initiating service upgrade

2018-03-14 Thread Chandni Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandni Singh updated YARN-8018:

Attachment: (was: YARN-8018.wip.patch)

> Yarn service: Add support for initiating service upgrade
> 
>
> Key: YARN-8018
> URL: https://issues.apache.org/jira/browse/YARN-8018
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Attachments: YARN-8018.wip.patch
>
>
> Add support for initiating service upgrade which includes the following main 
> changes:
>  # Service API to initiate upgrade
>  # Persist service version on hdfs
>  # Start the upgraded version of service






[jira] [Commented] (YARN-8010) add config in FederationRMFailoverProxy to not bypass facade cache when failing over

2018-03-14 Thread Botong Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399207#comment-16399207
 ] 

Botong Huang commented on YARN-8010:


Hi Subru, let's put it in trunk/branch-2. YARN-7402 will also pick it up after 
the next rebase. Thanks! 

> add config in FederationRMFailoverProxy to not bypass facade cache when 
> failing over
> 
>
> Key: YARN-8010
> URL: https://issues.apache.org/jira/browse/YARN-8010
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
> Attachments: YARN-8010.v1.patch, YARN-8010.v1.patch, 
> YARN-8010.v2.patch, YARN-8010.v3.patch
>
>
> Today when YarnRM is failing over, the FederationRMFailoverProxy running in 
> AMRMProxy will perform failover, try to get the latest subcluster info from 
> FederationStateStore, and then retry connecting to the latest YarnRM master. 
> When calling getSubCluster() on FederationStateStoreFacade, it bypasses the 
> cache with a flush flag. When YarnRM is failing over, every AM heartbeat 
> thread creates a different thread inside FederationInterceptor, each of which 
> keeps performing failover several times. This leads to a big spike of 
> getSubCluster calls to FederationStateStore. 
> Depending on the cluster setup (e.g. putting a VIP before all YarnRMs), a 
> YarnRM master-slave change might not result in an RM address change. In other 
> cases, a small delay in getting the latest subcluster information may be 
> acceptable. This patch thus adds a config option that makes it possible to 
> ask the FederationRMFailoverProxy not to flush the cache when calling 
> getSubCluster(). 
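As a rough sketch of the idea, assuming a hypothetical config key and a simplified facade interface (the actual names come from the patch, not from this example):
{code}
import org.apache.hadoop.conf.Configuration;

public class FailoverCacheSketch {
  // Hypothetical key; the real option name is defined by the patch.
  static final String FLUSH_CACHE_ON_FAILOVER =
      "yarn.example.federation.failover.flush-subcluster-cache";
  static final boolean DEFAULT_FLUSH_CACHE_ON_FAILOVER = true;

  /** Simplified stand-in for FederationStateStoreFacade. */
  interface SubClusterFacade {
    String getSubCluster(String subClusterId, boolean flushCache);
  }

  static String resolveOnFailover(Configuration conf, SubClusterFacade facade,
      String subClusterId) {
    // When the flag is false, failover reuses the cached subcluster record
    // instead of hammering FederationStateStore from every heartbeat thread.
    boolean flush = conf.getBoolean(FLUSH_CACHE_ON_FAILOVER,
        DEFAULT_FLUSH_CACHE_ON_FAILOVER);
    return facade.getSubCluster(subClusterId, flush);
  }
}
{code}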






[jira] [Comment Edited] (YARN-8010) add config in FederationRMFailoverProxy to not bypass facade cache when failing over

2018-03-14 Thread Botong Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399207#comment-16399207
 ] 

Botong Huang edited comment on YARN-8010 at 3/14/18 7:52 PM:
-

Hi [~subru], let's put it in trunk/branch-2. YARN-7402 will also pick it up 
after the next rebase. Thanks! 


was (Author: botong):
Hi Subru, let's put it in trunk/branch-2. YARN-7402 will also pick it up after 
the next rebase. Thanks! 

> add config in FederationRMFailoverProxy to not bypass facade cache when 
> failing over
> 
>
> Key: YARN-8010
> URL: https://issues.apache.org/jira/browse/YARN-8010
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
> Attachments: YARN-8010.v1.patch, YARN-8010.v1.patch, 
> YARN-8010.v2.patch, YARN-8010.v3.patch
>
>
> Today when YarnRM is failing over, the FederationRMFailoverProxy running in 
> AMRMProxy will perform failover, try to get the latest subcluster info from 
> FederationStateStore, and then retry connecting to the latest YarnRM master. 
> When calling getSubCluster() on FederationStateStoreFacade, it bypasses the 
> cache with a flush flag. When YarnRM is failing over, every AM heartbeat 
> thread creates a different thread inside FederationInterceptor, each of which 
> keeps performing failover several times. This leads to a big spike of 
> getSubCluster calls to FederationStateStore. 
> Depending on the cluster setup (e.g. putting a VIP before all YarnRMs), a 
> YarnRM master-slave change might not result in an RM address change. In other 
> cases, a small delay in getting the latest subcluster information may be 
> acceptable. This patch thus adds a config option that makes it possible to 
> ask the FederationRMFailoverProxy not to flush the cache when calling 
> getSubCluster(). 






[jira] [Commented] (YARN-8010) add config in FederationRMFailoverProxy to not bypass facade cache when failing over

2018-03-14 Thread Subru Krishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399177#comment-16399177
 ] 

Subru Krishnan commented on YARN-8010:
--

Thanks [~botong] for the patch and [~giovanni.fumarola] for reviewing it.

Do you want this in trunk/branch-2 or YARN-7402?

> add config in FederationRMFailoverProxy to not bypass facade cache when 
> failing over
> 
>
> Key: YARN-8010
> URL: https://issues.apache.org/jira/browse/YARN-8010
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
> Attachments: YARN-8010.v1.patch, YARN-8010.v1.patch, 
> YARN-8010.v2.patch, YARN-8010.v3.patch
>
>
> Today when YarnRM is failing over, the FederationRMFailoverProxy running in 
> AMRMProxy will perform failover, try to get the latest subcluster info from 
> FederationStateStore, and then retry connecting to the latest YarnRM master. 
> When calling getSubCluster() on FederationStateStoreFacade, it bypasses the 
> cache with a flush flag. When YarnRM is failing over, every AM heartbeat 
> thread creates a different thread inside FederationInterceptor, each of which 
> keeps performing failover several times. This leads to a big spike of 
> getSubCluster calls to FederationStateStore. 
> Depending on the cluster setup (e.g. putting a VIP before all YarnRMs), a 
> YarnRM master-slave change might not result in an RM address change. In other 
> cases, a small delay in getting the latest subcluster information may be 
> acceptable. This patch thus adds a config option that makes it possible to 
> ask the FederationRMFailoverProxy not to flush the cache when calling 
> getSubCluster(). 






[jira] [Commented] (YARN-8029) YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS should not use commas as separators

2018-03-14 Thread Shane Kumpf (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399131#comment-16399131
 ] 

Shane Kumpf commented on YARN-8029:
---

Thanks for reporting this [~Jim_Brennan]. I was aware of this limitation and I 
did consider just using other delimiters, but comma seems most appropriate for 
a list. I had opened YARN-6830 to address supporting commas within quoted 
strings for this use case. Unfortunately, it took a different direction and I 
haven't had a chance to revisit. If you have any interest in taking that up, I 
think it is likely a better fix than changing the delimiter, but I'm open to 
using a different delimiter if there is a strong opinion in favor of that.

> YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS should not use commas as separators
> 
>
> Key: YARN-8029
> URL: https://issues.apache.org/jira/browse/YARN-8029
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0
>Reporter: Jim Brennan
>Priority: Major
>
> The following docker-related environment variables specify a comma-separated 
> list of mounts:
> YARN_CONTAINER_RUNTIME_DOCKER_LOCAL_RESOURCE_MOUNTS
> YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS
> This is a problem because hadoop -Dmapreduce.map.env and related options use  
> comma as a delimiter.   So if I put more than one mount in 
> YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS the comma in the variable will be 
> treated as a delimiter for the hadoop command line option and all but the 
> first mount will be ignored.






[jira] [Commented] (YARN-8010) add config in FederationRMFailoverProxy to not bypass facade cache when failing over

2018-03-14 Thread Giovanni Matteo Fumarola (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399091#comment-16399091
 ] 

Giovanni Matteo Fumarola commented on YARN-8010:


Thanks [~botong] for the patch. LGTM +1.

The findbugs warnings seem unrelated to this patch.

> add config in FederationRMFailoverProxy to not bypass facade cache when 
> failing over
> 
>
> Key: YARN-8010
> URL: https://issues.apache.org/jira/browse/YARN-8010
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
> Attachments: YARN-8010.v1.patch, YARN-8010.v1.patch, 
> YARN-8010.v2.patch, YARN-8010.v3.patch
>
>
> Today when YarnRM is failing over, the FederationRMFailoverProxy running in 
> AMRMProxy will perform failover, try to get the latest subcluster info from 
> FederationStateStore, and then retry connecting to the latest YarnRM master. 
> When calling getSubCluster() on FederationStateStoreFacade, it bypasses the 
> cache with a flush flag. When YarnRM is failing over, every AM heartbeat 
> thread creates a different thread inside FederationInterceptor, each of which 
> keeps performing failover several times. This leads to a big spike of 
> getSubCluster calls to FederationStateStore. 
> Depending on the cluster setup (e.g. putting a VIP before all YarnRMs), a 
> YarnRM master-slave change might not result in an RM address change. In other 
> cases, a small delay in getting the latest subcluster information may be 
> acceptable. This patch thus adds a config option that makes it possible to 
> ask the FederationRMFailoverProxy not to flush the cache when calling 
> getSubCluster(). 






[jira] [Created] (YARN-8030) Use the common sliding window retry policy for AM restart

2018-03-14 Thread Chandni Singh (JIRA)
Chandni Singh created YARN-8030:
---

 Summary: Use the common sliding window retry policy for AM restart
 Key: YARN-8030
 URL: https://issues.apache.org/jira/browse/YARN-8030
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Chandni Singh
Assignee: Chandni Singh


In YARN-5015, we added a SlidingWindowRetryPolicy for container restart. 
We need to refactor AM restart to use the common SlidingWindowRetryPolicy.
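As a rough illustration of what a sliding-window retry policy does (a simplified sketch, not the actual YARN-5015 class): only failures that fall within the configured validity interval count toward the retry limit.
{code}
import java.util.ArrayDeque;
import java.util.Deque;

public class SlidingWindowRetrySketch {
  private final int maxRetries;
  private final long validityIntervalMs;
  private final Deque<Long> failureTimes = new ArrayDeque<>();

  public SlidingWindowRetrySketch(int maxRetries, long validityIntervalMs) {
    this.maxRetries = maxRetries;
    this.validityIntervalMs = validityIntervalMs;
  }

  /** Record a failure and decide whether another restart is allowed. */
  public synchronized boolean shouldRetry(long nowMs) {
    failureTimes.addLast(nowMs);
    // Drop failures that have fallen out of the validity window.
    while (!failureTimes.isEmpty()
        && nowMs - failureTimes.peekFirst() > validityIntervalMs) {
      failureTimes.removeFirst();
    }
    return failureTimes.size() <= maxRetries;
  }
}
{code}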






[jira] [Comment Edited] (YARN-8027) Setting hostname of docker container breaks for --net=host in docker 1.13

2018-03-14 Thread Suma Shivaprasad (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399021#comment-16399021
 ] 

Suma Shivaprasad edited comment on YARN-8027 at 3/14/18 6:14 PM:
-

IMO, for networks other than host mode, it would be better to set --hostname 
to a known, YARN-generated, DNS-friendly qualified name for both Yarn native 
service managed containers and other application containers (which is the 
current behaviour). When not set, docker sets it to the container's ID, which 
is a random string known only via docker inspect. This would avoid initial DNS 
server lookups, at least from within the container, for both user-defined 
networks and Yarn native services.
{noformat}
docker run --net=my-net --name test789 -it spark-r bash
[root@03e058a25680 /]# hostname -f
03e058a25680
[root@03e058a25680 /]# hostname
03e058a25680
[root@03e058a25680 /]# cat /etc/host
cat: /etc/host: No such file or directory
[root@03e058a25680 /]# cat /etc/hosts
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
172.18.0.4 03e058a25680

{noformat}
Setting the hostname doesn't affect the behaviour of Docker's embedded DNS 
resolution for user-defined networks, since it uses the container's 
name/network-alias.


was (Author: suma.shivaprasad):
IMO it would be better to set the --hostname to a known YARN generated DNS 
friendly qualified name for both Yarn native service managed and other 
application containers (which is the current behaviour). When not set, docker 
sets it to the container's ID which is a random string and would be known only 
on docker inspect. This would avoid initial DNS server lookups atleast from 
within the container in case of both user defined networks and Yarn native 
services.

{noformat}

docker run --net=my-net --name test789 -it spark-r bash
[root@03e058a25680 /]# hostname -f
03e058a25680
[root@03e058a25680 /]# hostname
03e058a25680
[root@03e058a25680 /]# cat /etc/host
cat: /etc/host: No such file or directory
[root@03e058a25680 /]# cat /etc/hosts
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
172.18.0.4 03e058a25680

{noformat}

Setting the hostname doesnt affect the behaviour of Docker Embedded DNS 
resolution in case of user defined networks since it uses the container's 
name/network-alias.

> Setting hostname of docker container breaks for --net=host in docker 1.13
> -
>
> Key: YARN-8027
> URL: https://issues.apache.org/jira/browse/YARN-8027
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-8027.001.patch
>
>
> In DockerLinuxContainerRuntime:launchContainer, we are adding the --hostname 
> argument to the docker run command to set the hostname in the container to 
> something like:  ctr-e84-1520889172376-0001-01-01.
> This does not work when combined with the --net=host command line option in 
> Docker 1.13.1.  It causes multiple failures when the client tries to resolve 
> the hostname and it fails.
> We haven't seen this before because we were using docker 1.12.6 which seems 
> to ignore --hostname when you are using --net=host.






[jira] [Commented] (YARN-8027) Setting hostname of docker container breaks for --net=host in docker 1.13

2018-03-14 Thread Suma Shivaprasad (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399021#comment-16399021
 ] 

Suma Shivaprasad commented on YARN-8027:


IMO it would be better to set --hostname to a known, YARN-generated, 
DNS-friendly qualified name for both Yarn native service managed containers 
and other application containers (which is the current behaviour). When not 
set, docker sets it to the container's ID, which is a random string known only 
via docker inspect. This would avoid initial DNS server lookups, at least from 
within the container, for both user-defined networks and Yarn native 
services.

{noformat}

docker run --net=my-net --name test789 -it spark-r bash
[root@03e058a25680 /]# hostname -f
03e058a25680
[root@03e058a25680 /]# hostname
03e058a25680
[root@03e058a25680 /]# cat /etc/host
cat: /etc/host: No such file or directory
[root@03e058a25680 /]# cat /etc/hosts
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
172.18.0.4 03e058a25680

{noformat}

Setting the hostname doesn't affect the behaviour of Docker's embedded DNS 
resolution for user-defined networks, since it uses the container's 
name/network-alias.

> Setting hostname of docker container breaks for --net=host in docker 1.13
> -
>
> Key: YARN-8027
> URL: https://issues.apache.org/jira/browse/YARN-8027
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-8027.001.patch
>
>
> In DockerLinuxContainerRuntime:launchContainer, we are adding the --hostname 
> argument to the docker run command to set the hostname in the container to 
> something like:  ctr-e84-1520889172376-0001-01-01.
> This does not work when combined with the --net=host command line option in 
> Docker 1.13.1.  It causes multiple failures when the client tries to resolve 
> the hostname and it fails.
> We haven't seen this before because we were using docker 1.12.6 which seems 
> to ignore --hostname when you are using --net=host.






[jira] [Commented] (YARN-8029) YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS should not use commas as separators

2018-03-14 Thread Jim Brennan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399006#comment-16399006
 ] 

Jim Brennan commented on YARN-8029:
---

There was a related discussion in [HADOOP-11640] about allowing the user to 
specify an alternate delimiter or adding an escaping mechanism.

I think in this case the better solution would be to change the docker runtime 
environment variables to use a different separator - semicolon, pipe, or 
something else.

> YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS should not use commas as separators
> 
>
> Key: YARN-8029
> URL: https://issues.apache.org/jira/browse/YARN-8029
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0
>Reporter: Jim Brennan
>Priority: Major
>
> The following docker-related environment variables specify a comma-separated 
> list of mounts:
> YARN_CONTAINER_RUNTIME_DOCKER_LOCAL_RESOURCE_MOUNTS
> YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS
> This is a problem because hadoop -Dmapreduce.map.env and related options use  
> comma as a delimiter.   So if I put more than one mount in 
> YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS the comma in the variable will be 
> treated as a delimiter for the hadoop command line option and all but the 
> first mount will be ignored.






[jira] [Created] (YARN-8029) YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS should not use commas as separators

2018-03-14 Thread Jim Brennan (JIRA)
Jim Brennan created YARN-8029:
-

 Summary: YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS should not use 
commas as separators
 Key: YARN-8029
 URL: https://issues.apache.org/jira/browse/YARN-8029
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 3.0.0
Reporter: Jim Brennan


The following docker-related environment variables specify a comma-separated 
list of mounts:

YARN_CONTAINER_RUNTIME_DOCKER_LOCAL_RESOURCE_MOUNTS
YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS

This is a problem because hadoop -Dmapreduce.map.env and related options use  
comma as a delimiter.   So if I put more than one mount in 
YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS the comma in the variable will be treated 
as a delimiter for the hadoop command line option and all but the first mount 
will be ignored.







[jira] [Updated] (YARN-4781) Support intra-queue preemption for fairness ordering policy.

2018-03-14 Thread Eric Payne (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne updated YARN-4781:
-
Attachment: YARN-4781.002.patch

> Support intra-queue preemption for fairness ordering policy.
> 
>
> Key: YARN-4781
> URL: https://issues.apache.org/jira/browse/YARN-4781
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Major
> Attachments: YARN-4781.001.patch, YARN-4781.002.patch
>
>
> We introduced fairness queue policy since YARN-3319, which will let large 
> applications make progresses and not starve small applications. However, if a 
> large application takes the queue’s resources, and containers of the large 
> app has long lifespan, small applications could still wait for resources for 
> long time and SLAs cannot be guaranteed.
> Instead of wait for application release resources on their own, we need to 
> preempt resources of queue with fairness policy enabled.






[jira] [Commented] (YARN-7999) Docker launch fails when user private filecache directory is missing

2018-03-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16398923#comment-16398923
 ] 

Hudson commented on YARN-7999:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13835 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13835/])
YARN-7999. Added file cache initializer for Linux container-executor.
(eyang: rev a82be7754d74f4d16b206427b91e700bb5f44d56)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.h
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c


> Docker launch fails when user private filecache directory is missing
> 
>
> Key: YARN-7999
> URL: https://issues.apache.org/jira/browse/YARN-7999
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Eric Yang
>Assignee: Jason Lowe
>Priority: Major
> Attachments: YARN-7999.001.patch, YARN-7999.002.patch, q3.log
>
>
> Docker container is failing to launch in trunk.  The root cause is:
> {code}
> [COMPINSTANCE sleeper-1 : container_1520032931921_0001_01_20]: 
> [2018-03-02 23:26:09.196]Exception from container-launch.
> Container id: container_1520032931921_0001_01_20
> Exit code: 29
> Exception message: image: hadoop/centos:latest is trusted in hadoop registry.
> Could not determine real path of mount 
> '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache'
> Could not determine real path of mount 
> '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache'
> Invalid docker mount 
> '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache:/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache',
>  realpath=/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache
> Error constructing docker command, docker error code=12, error 
> message='Invalid docker mount'
> Shell output: main : command provided 4
> main : run as user is hbase
> main : requested yarn user is hbase
> Creating script paths...
> Creating local dirs...
> [2018-03-02 23:26:09.240]Diagnostic message from attempt 0 : [2018-03-02 
> 23:26:09.240]
> [2018-03-02 23:26:09.240]Container exited with a non-zero exit code 29.
> [2018-03-02 23:26:39.278]Could not find 
> nmPrivate/application_1520032931921_0001/container_1520032931921_0001_01_20//container_1520032931921_0001_01_20.pid
>  in any of the directories
> [COMPONENT sleeper]: Failed 11 times, exceeded the limit - 10. Shutting down 
> now...
> {code}
> The filecache cannot be mounted because it doesn't exist.






[jira] [Updated] (YARN-8022) ResourceManager UI cluster/app/ page fails to render

2018-03-14 Thread Rohith Sharma K S (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-8022:

Fix Version/s: 3.0.1

> ResourceManager UI cluster/app/ page fails to render
> 
>
> Key: YARN-8022
> URL: https://issues.apache.org/jira/browse/YARN-8022
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: Tarun Parimi
>Assignee: Tarun Parimi
>Priority: Blocker
> Fix For: 3.1.0, 3.0.1, 3.2.0
>
> Attachments: Screen Shot 2018-03-12 at 1.45.05 PM.png, 
> YARN-8022.001.patch, YARN-8022.002.patch
>
>
> The page displays the message "Failed to read the attempts of the application"
>  
> The following stack trace is observed in RM log.
> org.apache.hadoop.yarn.server.webapp.AppBlock: Failed to read the attempts of 
> the application application_1520597233415_0002.
> java.lang.NullPointerException
>  at org.apache.hadoop.yarn.server.webapp.AppBlock$3.run(AppBlock.java:283)
>  at org.apache.hadoop.yarn.server.webapp.AppBlock$3.run(AppBlock.java:280)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
>  at org.apache.hadoop.yarn.server.webapp.AppBlock.render(AppBlock.java:279)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.RMAppBlock.render(RMAppBlock.java:71)
>  at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69)
>  at 
> org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79)
>  at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
>  at org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49)
>  at 
> org.apache.hadoop.yarn.webapp.hamlet2.HamletImpl$EImp._v(HamletImpl.java:117)
>  at org.apache.hadoop.yarn.webapp.hamlet2.Hamlet$TD.__(Hamlet.java:848)
>  at 
> org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:71)
>  at org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82)
>  at org.apache.hadoop.yarn.webapp.Controller.render(Controller.java:212)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.RmController.app(RmController.java:54)
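As a generic illustration of the defensive pattern this kind of render-path NPE usually calls for (not the actual YARN-8022 patch; the interface below is a hypothetical stand-in for the attempts lookup):
{code}
import java.util.Collection;
import java.util.Collections;

public class AppBlockNullGuardSketch {
  /** Hypothetical stand-in for the protocol/client call that fetches attempts. */
  interface AttemptsSource {
    Collection<String> getApplicationAttempts(String appId) throws Exception;
  }

  /**
   * Defensive pattern: treat a null or failed lookup as "no attempts" so the
   * page can still render, instead of letting the NPE escape from the render
   * path and break the whole cluster/app page.
   */
  static Collection<String> fetchAttempts(AttemptsSource src, String appId) {
    try {
      Collection<String> attempts = src.getApplicationAttempts(appId);
      return attempts == null ? Collections.emptyList() : attempts;
    } catch (Exception e) {
      return Collections.emptyList();
    }
  }
}
{code}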






[jira] [Commented] (YARN-7999) Docker launch fails when user private filecache directory is missing

2018-03-14 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16398840#comment-16398840
 ] 

Jason Lowe commented on YARN-7999:
--

Thanks for the commit!  Since this was introduced by YARN-7815 which was 
committed to branch-3.1, I think this needs to go into branch-3.1 as well.

> Docker launch fails when user private filecache directory is missing
> 
>
> Key: YARN-7999
> URL: https://issues.apache.org/jira/browse/YARN-7999
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Eric Yang
>Assignee: Jason Lowe
>Priority: Major
> Attachments: YARN-7999.001.patch, YARN-7999.002.patch, q3.log
>
>
> Docker container is failing to launch in trunk.  The root cause is:
> {code}
> [COMPINSTANCE sleeper-1 : container_1520032931921_0001_01_20]: 
> [2018-03-02 23:26:09.196]Exception from container-launch.
> Container id: container_1520032931921_0001_01_20
> Exit code: 29
> Exception message: image: hadoop/centos:latest is trusted in hadoop registry.
> Could not determine real path of mount 
> '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache'
> Could not determine real path of mount 
> '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache'
> Invalid docker mount 
> '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache:/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache',
>  realpath=/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache
> Error constructing docker command, docker error code=12, error 
> message='Invalid docker mount'
> Shell output: main : command provided 4
> main : run as user is hbase
> main : requested yarn user is hbase
> Creating script paths...
> Creating local dirs...
> [2018-03-02 23:26:09.240]Diagnostic message from attempt 0 : [2018-03-02 
> 23:26:09.240]
> [2018-03-02 23:26:09.240]Container exited with a non-zero exit code 29.
> [2018-03-02 23:26:39.278]Could not find 
> nmPrivate/application_1520032931921_0001/container_1520032931921_0001_01_20//container_1520032931921_0001_01_20.pid
>  in any of the directories
> [COMPONENT sleeper]: Failed 11 times, exceeded the limit - 10. Shutting down 
> now...
> {code}
> The filecache cannot be mounted because it doesn't exist.






[jira] [Commented] (YARN-7747) YARN UI is broken in the minicluster

2018-03-14 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16398800#comment-16398800
 ] 

genericqa commented on YARN-7747:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
28s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m  3s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
22s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 12m 
11s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 36s{color} | {color:orange} root: The patch generated 4 new + 150 unchanged 
- 2 fixed = 154 total (was 152) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
8m 50s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  7m 54s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
13s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 18m 49s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 62m 
43s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}185m 56s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.fs.shell.TestCopyPreserveFlag |
|   | hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:d4cc50f |
| JIRA Issue | YARN-7747 |
| JIRA Patch URL | 

[jira] [Commented] (YARN-6629) NPE occurred when container allocation proposal is applied but its resource requests are removed before

2018-03-14 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16398768#comment-16398768
 ] 

genericqa commented on YARN-6629:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
 1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 23s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 32s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 3 new + 224 unchanged - 0 fixed = 227 total (was 224) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 27s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 67m 
24s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}116m 30s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:d4cc50f |
| JIRA Issue | YARN-6629 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12914489/YARN-6629.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 56f0385ea973 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 
11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / ad1b988 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/19976/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/19976/testReport/ |
| Max. process+thread count | 856 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 

[jira] [Commented] (YARN-8027) Setting hostname of docker container breaks for --net=host in docker 1.13

2018-03-14 Thread Jim Brennan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16398752#comment-16398752
 ] 

Jim Brennan commented on YARN-8027:
---

My thinking was that the only known case where there is a problem is with 
--net=host, so I was keeping the change narrowed to that case.

With network set to bridge or none, the default hostname for the container is 
the container id, and it is not resolvable inside the container,  so changing 
it to a more useful name seems relatively harmless.   For user defined 
networks, I'm unsure if there is a case where we would want to set the 
container name without using Registry DNS.

I'm happy to simplify this to just check Registry DNS if 
[~shaneku...@gmail.com] and [~billie.rinaldi] agree that is the best solution.
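To make the two alternatives concrete, a hedged sketch of the decision being discussed; the method names are placeholders for illustration, not the actual DockerLinuxContainerRuntime code:
{code}
public class HostnameDecisionSketch {
  /**
   * Option A (current patch direction): skip --hostname only for host
   * networking, where docker 1.13 breaks when the name is not resolvable.
   */
  static boolean setHostnameOptionA(String network) {
    return !"host".equals(network);
  }

  /**
   * Option B (simplification being discussed): only set --hostname when
   * Registry DNS is enabled, since that is when the generated name is
   * actually resolvable.
   */
  static boolean setHostnameOptionB(boolean registryDnsEnabled) {
    return registryDnsEnabled;
  }
}
{code}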

> Setting hostname of docker container breaks for --net=host in docker 1.13
> -
>
> Key: YARN-8027
> URL: https://issues.apache.org/jira/browse/YARN-8027
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-8027.001.patch
>
>
> In DockerLinuxContainerRuntime:launchContainer, we are adding the --hostname 
> argument to the docker run command to set the hostname in the container to 
> something like:  ctr-e84-1520889172376-0001-01-01.
> This does not work when combined with the --net=host command line option in 
> Docker 1.13.1.  It causes multiple failures when the client tries to resolve 
> the hostname and it fails.
> We haven't seen this before because we were using docker 1.12.6 which seems 
> to ignore --hostname when you are using --net=host.






[jira] [Commented] (YARN-7952) RM should be able to recover log aggregation status after restart/fail-over

2018-03-14 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16398708#comment-16398708
 ] 

genericqa commented on YARN-7952:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
40s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
49s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
 4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 58s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
14s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in 
trunk has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
12s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  6m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
33s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m  4s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 2 new + 382 unchanged - 0 fixed = 384 total (was 382) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m  8s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
16s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
37s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
6s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
13s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 19m 21s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 72m 
54s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}178m 51s{color} 

[jira] [Commented] (YARN-8027) Setting hostname of docker container breaks for --net=host in docker 1.13

2018-03-14 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16398704#comment-16398704
 ] 

Jason Lowe commented on YARN-8027:
--

Thanks for the patch!  Out of curiosity, is there any reason for YARN to want 
to change the hostname of a container unless Registry DNS is enabled?  If not, 
then I'm wondering whether this simplifies to just checking for Registry DNS 
being enabled.  Otherwise it looks OK to me.  However, I'll defer to 
[~billie.rinaldi] and [~shaneku...@gmail.com] since they know far more about 
the docker container requirements for native services in YARN.
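
For illustration only, here is a minimal, self-contained sketch of the conditional being discussed: skip --hostname on the host network unless Registry DNS is enabled. It is an editorial example, not the actual DockerLinuxContainerRuntime code, and all class, method, and flag names below are made up.

{code}
// Editorial sketch, not the actual DockerLinuxContainerRuntime code: only pass
// --hostname when the container is off the host network, or when Registry DNS
// is enabled and expected to resolve the generated name. All names are made up.
import java.util.ArrayList;
import java.util.List;

public class DockerHostnameArgSketch {
  static List<String> buildRunArgs(String network, String hostname,
      boolean registryDnsEnabled) {
    List<String> args = new ArrayList<>();
    args.add("run");
    args.add("--net=" + network);
    if (!"host".equals(network) || registryDnsEnabled) {
      // Only override the hostname when it will actually be resolvable.
      args.add("--hostname=" + hostname);
    }
    return args;
  }

  public static void main(String[] unused) {
    // With --net=host and no Registry DNS, the --hostname flag is dropped.
    System.out.println(buildRunArgs("host",
        "ctr-e84-1520889172376-0001-01-01", false));
    // On the bridge network the generated hostname is still applied.
    System.out.println(buildRunArgs("bridge",
        "ctr-e84-1520889172376-0001-01-01", false));
  }
}
{code}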

> Setting hostname of docker container breaks for --net=host in docker 1.13
> -
>
> Key: YARN-8027
> URL: https://issues.apache.org/jira/browse/YARN-8027
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-8027.001.patch
>
>
> In DockerLinuxContainerRuntime:launchContainer, we add the --hostname 
> argument to the docker run command to set the container's hostname to 
> something like ctr-e84-1520889172376-0001-01-01.
> This does not work when combined with the --net=host command line option in 
> Docker 1.13.1: it causes multiple failures because clients cannot resolve 
> that hostname.
> We had not seen this before because we were using Docker 1.12.6, which seems 
> to ignore --hostname when --net=host is used.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8002) Support NOT_SELF and ALL namespace types for allocation tag

2018-03-14 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16398674#comment-16398674
 ] 

genericqa commented on YARN-8002:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 8 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m  5s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
19s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in 
trunk has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
5s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m 
42s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 44s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 2 new + 61 unchanged - 1 fixed = 63 total (was 62) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 16s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
49s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 25s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
44s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}147m 50s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.placement.TestSingleConstraintAppPlacementAllocator
 |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.constraint.TestPlacementConstraintsUtil
 |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.constraint.TestAllocationTagsManager
 |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestSchedulingRequestContainerAllocationAsync
 |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.constraint.algorithm.TestLocalAllocationTagsManager
 |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestSchedulingRequestContainerAllocation
 |
|   | 

[jira] [Commented] (YARN-8027) Setting hostname of docker container breaks for --net=host in docker 1.13

2018-03-14 Thread Jim Brennan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16398635#comment-16398635
 ] 

Jim Brennan commented on YARN-8027:
---

The unit test failure (testKillOpportunisticForGuaranteedContainer) does not 
appear to be related to my changes.

[~jlowe], [~shaneku...@gmail.com], [~billie.rinaldi], I believe this is ready 
for review.

 

> Setting hostname of docker container breaks for --net=host in docker 1.13
> -
>
> Key: YARN-8027
> URL: https://issues.apache.org/jira/browse/YARN-8027
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-8027.001.patch
>
>
> In DockerLinuxContainerRuntime:launchContainer, we add the --hostname 
> argument to the docker run command to set the container's hostname to 
> something like ctr-e84-1520889172376-0001-01-01.
> This does not work when combined with the --net=host command line option in 
> Docker 1.13.1: it causes multiple failures because clients cannot resolve 
> that hostname.
> We had not seen this before because we were using Docker 1.12.6, which seems 
> to ignore --hostname when --net=host is used.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6936) [Atsv2] Retrospect storing entities into sub application table from client perspective

2018-03-14 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16398593#comment-16398593
 ] 

genericqa commented on YARN-6936:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
26s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
54s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m  
9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 12s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests
 {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
16s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in 
trunk has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
37s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
48s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 13s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 3 new + 215 unchanged - 0 fixed = 218 total (was 215) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 23s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests
 {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
46s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
21s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m  
9s{color} | {color:green} hadoop-yarn-server-timelineservice in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
31s{color} | {color:green} hadoop-yarn-server-timelineservice-hbase-client in 
the patch passed. 

[jira] [Commented] (YARN-8020) when DRF is used, preemption does not trigger due to incorrect idealAssigned

2018-03-14 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16398575#comment-16398575
 ] 

Eric Payne commented on YARN-8020:
--

bq. explain why preemption doesn't happen for the case you mentioned
I don't know why yet. I'm still investigating.

> when DRF is used, preemption does not trigger due to incorrect idealAssigned
> 
>
> Key: YARN-8020
> URL: https://issues.apache.org/jira/browse/YARN-8020
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: kyungwan nam
>Priority: Major
>
> I’ve run into a case where inter-queue preemption does not work.
> It happens when DRF is used and an application is submitted with a large 
> number of vcores.
> IMHO, idealAssigned can be set incorrectly by the following code.
> {code}
> // This function "accepts" all the resources it can (pending) and return
> // the unused ones
> Resource offer(Resource avail, ResourceCalculator rc,
> Resource clusterResource, boolean considersReservedResource) {
>   Resource absMaxCapIdealAssignedDelta = Resources.componentwiseMax(
>   Resources.subtract(getMax(), idealAssigned),
>   Resource.newInstance(0, 0));
>   // accepted = min{avail,
>   //   max - assigned,
>   //   current + pending - assigned,
>   //   # Make sure a queue will not get more than max of its
>   //   # used/guaranteed, this is to make sure preemption won't
>   //   # happen if all active queues are beyond their guaranteed
>   //   # This is for leaf queue only.
>   //   max(guaranteed, used) - assigned}
>   // remain = avail - accepted
>   Resource accepted = Resources.min(rc, clusterResource,
>   absMaxCapIdealAssignedDelta,
>   Resources.min(rc, clusterResource, avail, Resources
>   /*
>* When we're using FifoPreemptionSelector (considerReservedResource
>* = false).
>*
>* We should deduct reserved resource from pending to avoid 
> excessive
>* preemption:
>*
>* For example, if an under-utilized queue has used = reserved = 20.
>* Preemption policy will try to preempt 20 containers (which is not
>* satisfied) from different hosts.
>*
>* In FifoPreemptionSelector, there's no guarantee that preempted
>* resource can be used by pending request, so policy will preempt
>* resources repeatedly.
>*/
>   .subtract(Resources.add(getUsed(),
>   (considersReservedResource ? pending : pendingDeductReserved)),
>   idealAssigned)));
> {code}
> let’s say,
> * cluster resource : 
> * idealAssigned(assigned): 
> * avail: 
> * current: 
> * pending: 
> current + pending - assigned: 
> min ( avail, (current + pending - assigned) ) : 
> accepted: 
> as a result, idealAssigned will be , which does not 
> trigger preemption.
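
To make the concern above concrete (the original resource values were stripped from this message), here is a small editorial sketch with made-up numbers. It only illustrates that Resources.min under the DominantResourceCalculator picks one whole Resource by dominant share rather than taking a per-component minimum, so the accepted amount can exceed what is available in one dimension; it is not a verified root-cause analysis.

{code}
// Editorial sketch with made-up numbers; mimics Resources.min under DRF,
// which returns whichever whole Resource has the smaller dominant share.
public class OfferSketch {
  // resources are {memory, vcores}; the cluster is 100 memory and 100 vcores
  static long[] drfMin(long[] a, long[] b, long clusterMem, long clusterCpu) {
    double shareA = Math.max((double) a[0] / clusterMem, (double) a[1] / clusterCpu);
    double shareB = Math.max((double) b[0] / clusterMem, (double) b[1] / clusterCpu);
    return shareA <= shareB ? a : b;   // whole-resource pick, not componentwise
  }

  public static void main(String[] unused) {
    long[] avail = {100, 5};                           // lots of memory, few vcores
    long[] currentPlusPendingMinusAssigned = {20, 50};
    long[] accepted = drfMin(avail, currentPlusPendingMinusAssigned, 100, 100);
    // Prints "accepted vcores = 50" even though only 5 vcores are available,
    // so idealAssigned can drift away from what the queue can really take.
    System.out.println("accepted vcores = " + accepted[1]);
  }
}
{code}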



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6629) NPE occurred when container allocation proposal is applied but its resource requests are removed before

2018-03-14 Thread Tao Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16398499#comment-16398499
 ] 

Tao Yang commented on YARN-6629:


Uploaded the v3 patch for trunk. [~sunilg], please help review it again. Thanks.

> NPE occurred when container allocation proposal is applied but its resource 
> requests are removed before
> ---
>
> Key: YARN-6629
> URL: https://issues.apache.org/jira/browse/YARN-6629
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0-alpha2
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-6629.001.patch, YARN-6629.002.patch, 
> YARN-6629.003.patch
>
>
> I wrote a test case to reproduce another problem on branch-2 and found a new 
> NPE error. Log: 
> {code}
> FATAL event.EventDispatcher (EventDispatcher.java:run(75)) - Error in 
> handling event type NODE_UPDATE to the Event Dispatcher
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:446)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.apply(FiCaSchedulerApp.java:516)
> at 
> org.apache.hadoop.yarn.client.TestNegativePendingResource$1.answer(TestNegativePendingResource.java:225)
> at 
> org.mockito.internal.stubbing.StubbedInvocationMatcher.answer(StubbedInvocationMatcher.java:31)
> at org.mockito.internal.MockHandler.handle(MockHandler.java:97)
> at 
> org.mockito.internal.creation.MethodInterceptorFilter.intercept(MethodInterceptorFilter.java:47)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp$$EnhancerByMockitoWithCGLIB$$29eb8afc.apply()
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2396)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.submitResourceCommitRequest(CapacityScheduler.java:2281)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateOrReserveNewContainers(CapacityScheduler.java:1247)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1236)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1325)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1112)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:987)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1367)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:143)
> at 
> org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> Reproduce this error in chronological order:
> 1. AM started and requested 1 container with schedulerRequestKey#1 : 
> ApplicationMasterService#allocate -->  CapacityScheduler#allocate --> 
> SchedulerApplicationAttempt#updateResourceRequests --> 
> AppSchedulingInfo#updateResourceRequests 
> Added schedulerRequestKey#1 into schedulerKeyToPlacementSets
> 2. Scheduler allocated 1 container for this request and accepted the proposal
> 3. AM removed this request
> ApplicationMasterService#allocate -->  CapacityScheduler#allocate --> 
> SchedulerApplicationAttempt#updateResourceRequests --> 
> AppSchedulingInfo#updateResourceRequests --> 
> AppSchedulingInfo#addToPlacementSets --> 
> AppSchedulingInfo#updatePendingResources
> Removed schedulerRequestKey#1 from schedulerKeyToPlacementSets
> 4. Scheduler applied this proposal
> CapacityScheduler#tryCommit --> FiCaSchedulerApp#apply --> 
> AppSchedulingInfo#allocate 
> Throw NPE when called 
> schedulerKeyToPlacementSets.get(schedulerRequestKey).allocate(schedulerKey, 
> type, node);
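
A minimal editorial sketch of the defensive check the reproduce steps point at: the placement set for a schedulerRequestKey may already be gone by the time the committed proposal is applied, so the allocate path should bail out instead of dereferencing null. The names below are simplified stand-ins for the AppSchedulingInfo internals, not the attached patch.

{code}
// Editorial sketch: skip the allocation when the request was removed between
// acceptance and commit, instead of throwing an NPE. Names are stand-ins.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class AllocateSketch {
  interface PlacementSet { void allocate(String schedulerKey); }

  private final Map<String, PlacementSet> schedulerKeyToPlacementSets =
      new ConcurrentHashMap<>();

  boolean allocate(String schedulerRequestKey) {
    PlacementSet ps = schedulerKeyToPlacementSets.get(schedulerRequestKey);
    if (ps == null) {
      // The request was cancelled before the proposal was applied; do nothing.
      return false;
    }
    ps.allocate(schedulerRequestKey);
    return true;
  }
}
{code}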



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6629) NPE occurred when container allocation proposal is applied but its resource requests are removed before

2018-03-14 Thread Tao Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-6629:
---
Attachment: YARN-6629.003.patch

> NPE occurred when container allocation proposal is applied but its resource 
> requests are removed before
> ---
>
> Key: YARN-6629
> URL: https://issues.apache.org/jira/browse/YARN-6629
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0-alpha2
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-6629.001.patch, YARN-6629.002.patch, 
> YARN-6629.003.patch
>
>
> I wrote a test case to reproduce another problem on branch-2 and found a new 
> NPE error. Log: 
> {code}
> FATAL event.EventDispatcher (EventDispatcher.java:run(75)) - Error in 
> handling event type NODE_UPDATE to the Event Dispatcher
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:446)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.apply(FiCaSchedulerApp.java:516)
> at 
> org.apache.hadoop.yarn.client.TestNegativePendingResource$1.answer(TestNegativePendingResource.java:225)
> at 
> org.mockito.internal.stubbing.StubbedInvocationMatcher.answer(StubbedInvocationMatcher.java:31)
> at org.mockito.internal.MockHandler.handle(MockHandler.java:97)
> at 
> org.mockito.internal.creation.MethodInterceptorFilter.intercept(MethodInterceptorFilter.java:47)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp$$EnhancerByMockitoWithCGLIB$$29eb8afc.apply()
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2396)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.submitResourceCommitRequest(CapacityScheduler.java:2281)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateOrReserveNewContainers(CapacityScheduler.java:1247)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1236)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1325)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1112)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:987)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1367)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:143)
> at 
> org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> Reproduce this error in chronological order:
> 1. AM started and requested 1 container with schedulerRequestKey#1 : 
> ApplicationMasterService#allocate -->  CapacityScheduler#allocate --> 
> SchedulerApplicationAttempt#updateResourceRequests --> 
> AppSchedulingInfo#updateResourceRequests 
> Added schedulerRequestKey#1 into schedulerKeyToPlacementSets
> 2. Scheduler allocated 1 container for this request and accepted the proposal
> 3. AM removed this request
> ApplicationMasterService#allocate -->  CapacityScheduler#allocate --> 
> SchedulerApplicationAttempt#updateResourceRequests --> 
> AppSchedulingInfo#updateResourceRequests --> 
> AppSchedulingInfo#addToPlacementSets --> 
> AppSchedulingInfo#updatePendingResources
> Removed schedulerRequestKey#1 from schedulerKeyToPlacementSets
> 4. Scheduler applied this proposal
> CapacityScheduler#tryCommit --> FiCaSchedulerApp#apply --> 
> AppSchedulingInfo#allocate 
> Throw NPE when called 
> schedulerKeyToPlacementSets.get(schedulerRequestKey).allocate(schedulerKey, 
> type, node);



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7708) [GPG] Load based policy generator

2018-03-14 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16398457#comment-16398457
 ] 

genericqa commented on YARN-7708:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  7s{color} 
| {color:red} YARN-7708 does not apply to YARN-7402. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-7708 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12914387/YARN-7708-YARN-7402.07.cumulative.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/19973/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> [GPG] Load based policy generator
> -
>
> Key: YARN-7708
> URL: https://issues.apache.org/jira/browse/YARN-7708
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Carlo Curino
>Assignee: Young Chen
>Priority: Major
> Attachments: YARN-7708-YARN-7402.01.cumulative.patch, 
> YARN-7708-YARN-7402.02.cumulative.patch, 
> YARN-7708-YARN-7402.03.cumulative.patch, 
> YARN-7708-YARN-7402.04.cumulative.patch, 
> YARN-7708-YARN-7402.05.cumulative.patch, 
> YARN-7708-YARN-7402.06.cumulative.patch, 
> YARN-7708-YARN-7402.07.cumulative.patch
>
>
> This policy reads load from the "pendingQueueLength" metrics and scales it 
> into a set of weights that influence the AMRMProxy and Router behaviors.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7747) YARN UI is broken in the minicluster

2018-03-14 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated YARN-7747:

Attachment: YARN-7747.002.patch

> YARN UI is broken in the minicluster 
> -
>
> Key: YARN-7747
> URL: https://issues.apache.org/jira/browse/YARN-7747
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Gera Shegalov
>Assignee: Gera Shegalov
>Priority: Major
> Attachments: YARN-7747.001.patch, YARN-7747.002.patch
>
>
> YARN web apps use non-injected instances of GuiceFilter, i.e. instances 
> created by Jetty as opposed to instances created by Guice itself. This triggers the [call 
> path|https://github.com/google/guice/blob/master/extensions/servlet/src/com/google/inject/servlet/GuiceFilter.java#L251]
>  where the static field {{pipeline}} is used instead of the instance field 
> {{injectedPipeline}}. However, besides the GuiceFilter instances created by 
> Jetty, each Guice module generates them as well, and on the injection call 
> path this static variable is updated by each instance. Thus, if there are 
> multiple modules, as is the case in the minicluster, the module loaded last 
> ends up defining the filter pipeline for all Jetty instances. In the 
> minicluster case this is the nodemanager UI.
>  
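
For readers unfamiliar with the two code paths, here is a minimal editorial sketch of the distinction the description relies on: handing Jetty the injector-created GuiceFilter (which carries its own injectedPipeline) instead of letting Jetty instantiate the class and fall back to the shared static pipeline. This is an illustration under those assumptions, not the attached patch.

{code}
// Editorial sketch: register the injector's GuiceFilter instance with Jetty so
// the per-injector injectedPipeline is used, instead of the static pipeline
// that the last-loaded module overwrites. Illustration only, not the patch.
import com.google.inject.Guice;
import com.google.inject.Injector;
import com.google.inject.servlet.GuiceFilter;
import com.google.inject.servlet.ServletModule;
import java.util.EnumSet;
import javax.servlet.DispatcherType;
import org.eclipse.jetty.servlet.FilterHolder;
import org.eclipse.jetty.servlet.ServletContextHandler;

public class InjectedGuiceFilterSketch {
  public static void main(String[] args) {
    Injector injector = Guice.createInjector(new ServletModule() {
      @Override protected void configureServlets() {
        // serve("/*").with(SomeServlet.class);  // web-app-specific bindings
      }
    });
    ServletContextHandler context = new ServletContextHandler();
    // Wrap the injected filter instead of addFilter(GuiceFilter.class, ...),
    // which would make Jetty construct a non-injected instance.
    context.addFilter(new FilterHolder(injector.getInstance(GuiceFilter.class)),
        "/*", EnumSet.allOf(DispatcherType.class));
  }
}
{code}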



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7636) Re-reservation count may overflow when cluster resource exhausted for a long time

2018-03-14 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16398349#comment-16398349
 ] 

Weiwei Yang commented on YARN-7636:
---

Thanks [~Tao Yang] for catching this bug and providing the fix; it looks good 
to me, pending Jenkins. The only question is whether we need to get this into 
3.1.0. Pinging [~leftnoteasy] for his opinion.

> Re-reservation count may overflow when cluster resource exhausted for a long 
> time 
> --
>
> Key: YARN-7636
> URL: https://issues.apache.org/jira/browse/YARN-7636
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.1.0, 2.9.1
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-7636.001.patch, YARN-7636.002.patch
>
>
> This has happened on our production cluster twice: when a request cannot be 
> satisfied for a long time, it continually triggers re-reservation and 
> eventually overflows the count, which crashes the scheduler.
> Exception stack:
> {noformat}
> java.lang.IllegalArgumentException: Overflow adding 1 occurrences to a count 
> of 2147483647
>         at 
> com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:246)
>         at 
> com.google.common.collect.AbstractMultiset.add(AbstractMultiset.java:80)
>         at 
> com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:51)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.addReReservation(SchedulerApplicationAttempt.java:406)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.reserve(SchedulerApplicationAttempt.java:555)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.reserve(FiCaSchedulerApp.java:1076)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.apply(FiCaSchedulerApp.java:795)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2770)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:546)
> {noformat}
> Referring to the handling in SchedulerApplicationAttempt#addSchedulingOpportunity, 
> we can ignore this exception to avoid the problem.
> The same problem may happen in 
> SchedulerApplicationAttempt#addMissedNonPartitionedRequestSchedulingOpportunity;
>  fix it in the same way.
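
A minimal editorial sketch of the mitigation described above, mirroring the addSchedulingOpportunity handling; the field and key types below are simplified stand-ins, not the attached patch.

{code}
// Editorial sketch: swallow the Guava multiset overflow instead of letting it
// escape and kill the scheduler thread. Field and key types are stand-ins.
import com.google.common.collect.ConcurrentHashMultiset;
import com.google.common.collect.Multiset;

public class ReReservationSketch {
  private final Multiset<String> reReservations = ConcurrentHashMultiset.create();

  void addReReservation(String schedulerKey) {
    try {
      reReservations.add(schedulerKey);
    } catch (IllegalArgumentException e) {
      // The count is already Integer.MAX_VALUE; one more re-reservation carries
      // no extra information, so ignore the overflow rather than crash.
    }
  }
}
{code}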



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7636) Re-reservation count may overflow when cluster resource exhausted for a long time

2018-03-14 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-7636:
--
Description: 
This has happened on our production cluster twice: when a request cannot be 
satisfied for a long time, it continually triggers re-reservation and 
eventually overflows the count, which crashes the scheduler.

Exception stack:
{noformat}
java.lang.IllegalArgumentException: Overflow adding 1 occurrences to a count of 
2147483647
        at 
com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:246)
        at 
com.google.common.collect.AbstractMultiset.add(AbstractMultiset.java:80)
        at 
com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:51)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.addReReservation(SchedulerApplicationAttempt.java:406)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.reserve(SchedulerApplicationAttempt.java:555)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.reserve(FiCaSchedulerApp.java:1076)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.apply(FiCaSchedulerApp.java:795)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2770)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:546)
{noformat}
Referring to the handling in SchedulerApplicationAttempt#addSchedulingOpportunity, 
we can ignore this exception to avoid the problem.

The same problem may happen in 
SchedulerApplicationAttempt#addMissedNonPartitionedRequestSchedulingOpportunity; 
fix it in the same way.

  was:
Exception stack:
{noformat}
java.lang.IllegalArgumentException: Overflow adding 1 occurrences to a count of 
2147483647
        at 
com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:246)
        at 
com.google.common.collect.AbstractMultiset.add(AbstractMultiset.java:80)
        at 
com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:51)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.addReReservation(SchedulerApplicationAttempt.java:406)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.reserve(SchedulerApplicationAttempt.java:555)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.reserve(FiCaSchedulerApp.java:1076)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.apply(FiCaSchedulerApp.java:795)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2770)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:546)
{noformat}
Referring to the handling in SchedulerApplicationAttempt#addSchedulingOpportunity, 
we can ignore this exception to avoid the problem.

The same problem may happen in 
SchedulerApplicationAttempt#addMissedNonPartitionedRequestSchedulingOpportunity; 
fix it in the same way.


> Re-reservation count may overflow when cluster resource exhausted for a long 
> time 
> --
>
> Key: YARN-7636
> URL: https://issues.apache.org/jira/browse/YARN-7636
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.1.0, 2.9.1
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-7636.001.patch, YARN-7636.002.patch
>
>
> This has happened on our production cluster twice: when a request cannot be 
> satisfied for a long time, it continually triggers re-reservation and 
> eventually overflows the count, which crashes the scheduler.
> Exception stack:
> {noformat}
> java.lang.IllegalArgumentException: Overflow adding 1 occurrences to a count 
> of 2147483647
>         at 
> com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:246)
>         at 
> com.google.common.collect.AbstractMultiset.add(AbstractMultiset.java:80)
>         at 
> com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:51)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.addReReservation(SchedulerApplicationAttempt.java:406)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.reserve(SchedulerApplicationAttempt.java:555)
>         at 
> 

[jira] [Commented] (YARN-8022) ResourceManager UI cluster/app/ page fails to render

2018-03-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16398323#comment-16398323
 ] 

Hudson commented on YARN-8022:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13833 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13833/])
YARN-8022. ResourceManager UI cluster/app/ page fails to render. 
(rohithsharmaks: rev e6de10d0a6363bdaf767a7bdac7ad908d7786718)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/AppBlock.java


> ResourceManager UI cluster/app/ page fails to render
> 
>
> Key: YARN-8022
> URL: https://issues.apache.org/jira/browse/YARN-8022
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: Tarun Parimi
>Assignee: Tarun Parimi
>Priority: Blocker
> Fix For: 3.1.0, 3.2.0
>
> Attachments: Screen Shot 2018-03-12 at 1.45.05 PM.png, 
> YARN-8022.001.patch, YARN-8022.002.patch
>
>
> The page displays the message "Failed to read the attempts of the application"
>  
> The following stack trace is observed in the RM log:
> org.apache.hadoop.yarn.server.webapp.AppBlock: Failed to read the attempts of 
> the application application_1520597233415_0002.
> java.lang.NullPointerException
>  at org.apache.hadoop.yarn.server.webapp.AppBlock$3.run(AppBlock.java:283)
>  at org.apache.hadoop.yarn.server.webapp.AppBlock$3.run(AppBlock.java:280)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
>  at org.apache.hadoop.yarn.server.webapp.AppBlock.render(AppBlock.java:279)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.RMAppBlock.render(RMAppBlock.java:71)
>  at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69)
>  at 
> org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79)
>  at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
>  at org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49)
>  at 
> org.apache.hadoop.yarn.webapp.hamlet2.HamletImpl$EImp._v(HamletImpl.java:117)
>  at org.apache.hadoop.yarn.webapp.hamlet2.Hamlet$TD.__(Hamlet.java:848)
>  at 
> org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:71)
>  at org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82)
>  at org.apache.hadoop.yarn.webapp.Controller.render(Controller.java:212)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.RmController.app(RmController.java:54)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7657) Queue Mapping could provide options to provide 'user' specific auto-created queues under a specified group parent queue

2018-03-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16398320#comment-16398320
 ] 

Hudson commented on YARN-7657:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13833 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13833/])
YARN-7657. Queue Mapping could provide options to provide 'user' (wangda: rev 
b167d60763f4b9ac69976df2da5d988e5faa85e0)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/placement/UserGroupMappingPlacementRule.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestQueueManagementDynamicEditPolicy.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/placement/TestUserGroupMappingPlacementRule.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerAutoQueueCreation.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAppManager.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerAutoCreatedQueueBase.java


> Queue Mapping could provide options to provide 'user' specific auto-created 
> queues under a specified group parent queue
> ---
>
> Key: YARN-7657
> URL: https://issues.apache.org/jira/browse/YARN-7657
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Reporter: Suma Shivaprasad
>Assignee: Suma Shivaprasad
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-7657.1.patch, YARN-7657.2.patch, YARN-7657.3.patch, 
> YARN-7657.4.patch
>
>
> The current queue mapping only provides %user as an option for user-specific 
> queues, as u:%user:%user. We can also support %user with a group, as 
> 'g:marketing-group:marketing.%user', so that user-specific queues are 
> automatically created under a group parent queue in this case.
> cc [~leftnoteasy]
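
A small editorial sketch of how such a mapping might be expressed, assuming the existing capacity-scheduler queue-mappings property; the g:&lt;group&gt;:&lt;parent&gt;.%user form shown is the syntax proposed here, used purely for illustration.

{code}
// Editorial sketch: queue mappings set on a plain Configuration object.
// The property name is the existing capacity-scheduler mapping key; the
// group-to-parent.%user form is the syntax this JIRA proposes.
import org.apache.hadoop.conf.Configuration;

public class QueueMappingSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.set("yarn.scheduler.capacity.queue-mappings",
        // every user keeps a personal queue, and members of marketing-group get
        // an auto-created per-user queue under the "marketing" parent queue
        "u:%user:%user,g:marketing-group:marketing.%user");
    System.out.println(conf.get("yarn.scheduler.capacity.queue-mappings"));
  }
}
{code}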



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5015) Support sliding window retry capability for container restart

2018-03-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16398319#comment-16398319
 ] 

Hudson commented on YARN-5015:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13833 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13833/])
YARN-5015. Support sliding window retry capability for container (wangda: rev 
a5b27b3c678ad2f5cb8dbfa1b60ef5cd365f8bde)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/recovery/NMStateStoreService.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/Client.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/TestContainer.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/recovery/NMNullStateStoreService.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/recovery/NMLeveldbStateStoreService.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/recovery/NMMemoryStateStoreService.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/recovery/TestNMLeveldbStateStoreService.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ContainerRetryContextPBImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ContainerRetryContext.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/SlidingWindowRetryPolicy.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/TestSlidingWindowRetryPolicy.java


> Support sliding window retry capability for container restart 
> --
>
> Key: YARN-5015
> URL: https://issues.apache.org/jira/browse/YARN-5015
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Varun Vasudev
>Assignee: Chandni Singh
>Priority: Major
>  Labels: oct16-medium
> Fix For: 3.2.0
>
> Attachments: YARN-5015.01.patch, YARN-5015.02.patch, 
> YARN-5015.03.patch, YARN-5015.04.patch, YARN-5015.05.patch, 
> YARN-5015.06.patch, YARN-5015.07.patch, YARN-5015.08.patch
>
>
> We support a sliding window retry policy for AM restarts (introduced in 
> YARN-611). A similar sliding window retry policy is needed for container 
> restarts.
> With this change, we can introduce a common SlidingWindowRetryPolicy class 
> (suggested by [~vvasudev] in the comments) and integrate it into container 
> restart. 
> In a subsequent JIRA, we can modify the AM code to use 
> SlidingWindowRetryPolicy, which will unify the AM and container restart code.
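
As an illustration of the sliding-window idea (an editorial sketch, not the committed SlidingWindowRetryPolicy): only failures that fall inside the validity interval count against the retry limit.

{code}
// Editorial sketch of a sliding-window retry decision: prune failures older
// than the validity interval, then retry only while the remaining count is
// below the limit. Names and structure are assumptions, not the real class.
import java.util.ArrayDeque;
import java.util.Deque;

public class SlidingWindowRetrySketch {
  private final Deque<Long> failureTimes = new ArrayDeque<>();
  private final int maxRetries;
  private final long failuresValidityIntervalMs;

  SlidingWindowRetrySketch(int maxRetries, long failuresValidityIntervalMs) {
    this.maxRetries = maxRetries;
    this.failuresValidityIntervalMs = failuresValidityIntervalMs;
  }

  synchronized boolean shouldRetry(long nowMs) {
    // Failures that fell out of the window no longer count.
    while (!failureTimes.isEmpty()
        && nowMs - failureTimes.peekFirst() > failuresValidityIntervalMs) {
      failureTimes.removeFirst();
    }
    if (failureTimes.size() < maxRetries) {
      failureTimes.addLast(nowMs);
      return true;
    }
    return false;
  }
}
{code}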



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7636) Re-reservation count may overflow when cluster resource exhausted for a long time

2018-03-14 Thread Tao Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-7636:
---
Affects Version/s: (was: 3.0.0-alpha4)
   3.1.0

> Re-reservation count may overflow when cluster resource exhausted for a long 
> time 
> --
>
> Key: YARN-7636
> URL: https://issues.apache.org/jira/browse/YARN-7636
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.1.0, 2.9.1
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-7636.001.patch, YARN-7636.002.patch
>
>
> Exception stack:
> {noformat}
> java.lang.IllegalArgumentException: Overflow adding 1 occurrences to a count 
> of 2147483647
>         at 
> com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:246)
>         at 
> com.google.common.collect.AbstractMultiset.add(AbstractMultiset.java:80)
>         at 
> com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:51)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.addReReservation(SchedulerApplicationAttempt.java:406)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.reserve(SchedulerApplicationAttempt.java:555)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.reserve(FiCaSchedulerApp.java:1076)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.apply(FiCaSchedulerApp.java:795)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2770)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:546)
> {noformat}
> Referring to the handling in SchedulerApplicationAttempt#addSchedulingOpportunity, 
> we can ignore this exception to avoid the problem.
> The same problem may happen in 
> SchedulerApplicationAttempt#addMissedNonPartitionedRequestSchedulingOpportunity;
>  fix it in the same way.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7636) Re-reservation count may overflow when cluster resource exhausted for a long time

2018-03-14 Thread Tao Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-7636:
---
Attachment: YARN-7636.002.patch

> Re-reservation count may overflow when cluster resource exhausted for a long 
> time 
> --
>
> Key: YARN-7636
> URL: https://issues.apache.org/jira/browse/YARN-7636
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.0.0-alpha4, 2.9.1
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-7636.001.patch, YARN-7636.002.patch
>
>
> Exception stack:
> {noformat}
> java.lang.IllegalArgumentException: Overflow adding 1 occurrences to a count 
> of 2147483647
>         at 
> com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:246)
>         at 
> com.google.common.collect.AbstractMultiset.add(AbstractMultiset.java:80)
>         at 
> com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:51)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.addReReservation(SchedulerApplicationAttempt.java:406)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.reserve(SchedulerApplicationAttempt.java:555)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.reserve(FiCaSchedulerApp.java:1076)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.apply(FiCaSchedulerApp.java:795)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2770)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:546)
> {noformat}
> Referring to the handling in SchedulerApplicationAttempt#addSchedulingOpportunity, 
> we can ignore this exception to avoid the problem.
> The same problem may happen in 
> SchedulerApplicationAttempt#addMissedNonPartitionedRequestSchedulingOpportunity;
>  fix it in the same way.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7636) Re-reservation count may overflow when cluster resource exhausted for a long time

2018-03-14 Thread Tao Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-7636:
---
Attachment: YARN-7636.002.patch

> Re-reservation count may overflow when cluster resource exhausted for a long 
> time 
> --
>
> Key: YARN-7636
> URL: https://issues.apache.org/jira/browse/YARN-7636
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.0.0-alpha4, 2.9.1
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-7636.001.patch
>
>
> Exception stack:
> {noformat}
> java.lang.IllegalArgumentException: Overflow adding 1 occurrences to a count 
> of 2147483647
>         at 
> com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:246)
>         at 
> com.google.common.collect.AbstractMultiset.add(AbstractMultiset.java:80)
>         at 
> com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:51)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.addReReservation(SchedulerApplicationAttempt.java:406)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.reserve(SchedulerApplicationAttempt.java:555)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.reserve(FiCaSchedulerApp.java:1076)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.apply(FiCaSchedulerApp.java:795)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2770)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:546)
> {noformat}
> Referring to the handling in SchedulerApplicationAttempt#addSchedulingOpportunity, 
> we can ignore this exception to avoid the problem.
> The same problem may happen in 
> SchedulerApplicationAttempt#addMissedNonPartitionedRequestSchedulingOpportunity;
>  fix it in the same way.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7636) Re-reservation count may overflow when cluster resource exhausted for a long time

2018-03-14 Thread Tao Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-7636:
---
Attachment: (was: YARN-7636.002.patch)

> Re-reservation count may overflow when cluster resource exhausted for a long 
> time 
> --
>
> Key: YARN-7636
> URL: https://issues.apache.org/jira/browse/YARN-7636
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.0.0-alpha4, 2.9.1
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-7636.001.patch
>
>
> Exception stack:
> {noformat}
> java.lang.IllegalArgumentException: Overflow adding 1 occurrences to a count 
> of 2147483647
>         at 
> com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:246)
>         at 
> com.google.common.collect.AbstractMultiset.add(AbstractMultiset.java:80)
>         at 
> com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:51)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.addReReservation(SchedulerApplicationAttempt.java:406)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.reserve(SchedulerApplicationAttempt.java:555)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.reserve(FiCaSchedulerApp.java:1076)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.apply(FiCaSchedulerApp.java:795)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2770)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:546)
> {noformat}
> Referring to the handling in SchedulerApplicationAttempt#addSchedulingOpportunity, 
> we can ignore this exception to avoid the problem.
> The same problem may happen in 
> SchedulerApplicationAttempt#addMissedNonPartitionedRequestSchedulingOpportunity;
>  fix it in the same way.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7636) Re-reservation count may overflow when cluster resource exhausted for a long time

2018-03-14 Thread Tao Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-7636:
---
Description: 
Exception stack:
{noformat}
java.lang.IllegalArgumentException: Overflow adding 1 occurrences to a count of 
2147483647
        at 
com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:246)
        at 
com.google.common.collect.AbstractMultiset.add(AbstractMultiset.java:80)
        at 
com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:51)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.addReReservation(SchedulerApplicationAttempt.java:406)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.reserve(SchedulerApplicationAttempt.java:555)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.reserve(FiCaSchedulerApp.java:1076)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.apply(FiCaSchedulerApp.java:795)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2770)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:546)
{noformat}
Referring to the handling in SchedulerApplicationAttempt#addSchedulingOpportunity, we 
can ignore this exception to avoid the problem.

The same problem may happen in 
SchedulerApplicationAttempt#addMissedNonPartitionedRequestSchedulingOpportunity;
 fix it in the same way.

  was:
Exception stack:
{noformat}
java.lang.IllegalArgumentException: Overflow adding 1 occurrences to a count of 
2147483647
        at 
com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:246)
        at 
com.google.common.collect.AbstractMultiset.add(AbstractMultiset.java:80)
        at 
com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:51)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.addReReservation(SchedulerApplicationAttempt.java:406)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.reserve(SchedulerApplicationAttempt.java:555)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.reserve(FiCaSchedulerApp.java:1076)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.apply(FiCaSchedulerApp.java:795)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2770)
        at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:546)
{noformat}
We can add a check condition {{getReReservations(schedulerKey) < 
Integer.MAX_VALUE}} before calling addReReservation to avoid this problem.
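
To make the two approaches above concrete, here is a minimal, self-contained sketch; 
the class, field, and key types are illustrative stand-ins, not the actual 
SchedulerApplicationAttempt code:

{code:java}
import com.google.common.collect.ConcurrentHashMultiset;
import com.google.common.collect.Multiset;

public class ReReservationCounterSketch {
  // Illustrative stand-in for SchedulerApplicationAttempt#reReservations;
  // the real field is keyed by SchedulerRequestKey, not String.
  private final Multiset<String> reReservations = ConcurrentHashMultiset.create();

  // Approach in the current description: swallow the overflow so the
  // ResourceCommitterService thread does not die on a saturated count.
  void addReReservationIgnoringOverflow(String schedulerKey) {
    try {
      reReservations.add(schedulerKey);
    } catch (IllegalArgumentException e) {
      // Count is already Integer.MAX_VALUE; leaving it saturated is harmless.
    }
  }

  // Approach in the earlier description: guard the increment instead.
  void addReReservationWithGuard(String schedulerKey) {
    if (reReservations.count(schedulerKey) < Integer.MAX_VALUE) {
      reReservations.add(schedulerKey);
    }
  }
}
{code}

Either variant keeps the multiset count from throwing once the cluster has been 
exhausted long enough for a single reservation to be retried Integer.MAX_VALUE times.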


> Re-reservation count may overflow when cluster resource exhausted for a long 
> time 
> --
>
> Key: YARN-7636
> URL: https://issues.apache.org/jira/browse/YARN-7636
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.0.0-alpha4, 2.9.1
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-7636.001.patch
>
>
> Exception stack:
> {noformat}
> java.lang.IllegalArgumentException: Overflow adding 1 occurrences to a count 
> of 2147483647
>         at 
> com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:246)
>         at 
> com.google.common.collect.AbstractMultiset.add(AbstractMultiset.java:80)
>         at 
> com.google.common.collect.ConcurrentHashMultiset.add(ConcurrentHashMultiset.java:51)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.addReReservation(SchedulerApplicationAttempt.java:406)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.reserve(SchedulerApplicationAttempt.java:555)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.reserve(FiCaSchedulerApp.java:1076)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.apply(FiCaSchedulerApp.java:795)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2770)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:546)
> {noformat}
> Refer to handling 

[jira] [Updated] (YARN-6936) [Atsv2] Retrospect storing entities into sub application table from client perspective

2018-03-14 Thread Rohith Sharma K S (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-6936:

Attachment: YARN-6936.000.patch

> [Atsv2] Retrospect storing entities into sub application table from client 
> perspective
> --
>
> Key: YARN-6936
> URL: https://issues.apache.org/jira/browse/YARN-6936
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Rohith Sharma K S
>Priority: Major
> Attachments: YARN-6936.000.patch
>
>
> Currently, YARN-6734 stores entities into the sub-application table only if the 
> doAs user and the submitting user are different. This holds good for Tez-like use 
> cases. But frameworks whose AM runs as the submitting user, such as MR, also need 
> to store entities in the sub-application table so that they can read entities 
> without an application id.
> This would be a point of concern at later stages when ATSv2 is deployed into 
> production. This JIRA is to revisit the decision of storing entities into the 
> sub-application table and make it driven by client-side configuration rather than 
> by user.
>  
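
For illustration only, the configuration-driven decision could look roughly like the 
sketch below; the property name and helper class are hypothetical, not actual 
ATSv2/Hadoop APIs:

{code:java}
import org.apache.hadoop.conf.Configuration;

public final class SubAppTableWriteDecision {
  // Hypothetical property name, used here only to illustrate the idea.
  static final String WRITE_SUB_APP_ENTITIES =
      "yarn.timeline-service.client.write-sub-application-entities";

  /** Config-driven instead of purely user-driven. */
  static boolean shouldWriteToSubAppTable(Configuration conf,
      String doAsUser, String submittedUser) {
    boolean optedIn = conf.getBoolean(WRITE_SUB_APP_ENTITIES, false);
    // Today (YARN-6734) only the user mismatch triggers the write; with a
    // client-side switch, frameworks like MR could opt in explicitly.
    return optedIn || !doAsUser.equals(submittedUser);
  }
}
{code}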



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8002) Support NOT_SELF and ALL namespace types for allocation tag

2018-03-14 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-8002:
--
Attachment: YARN-8002.003.patch

> Support NOT_SELF and ALL namespace types for allocation tag
> ---
>
> Key: YARN-8002
> URL: https://issues.apache.org/jira/browse/YARN-8002
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: YARN-8002.001.patch, YARN-8002.002.patch, 
> YARN-8002.003.patch
>
>
> This is a continuation task after YARN-7972. YARN-7972 adds support to specify 
> tags with the namespaces SELF and APP_ID, like the following:
>  * self/
>  * app-id//
> This task is to track the work to support 2 of the remaining namespace types 
> *NOT_SELF* & *ALL* (we'll support app-label later):
>  * not-self/
>  * all/
> This will require a bit of refactoring in {{AllocationTagsManager}} as it needs 
> to do some proper aggregation on tags for multiple apps.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8002) Support NOT_SELF and ALL namespace types for allocation tag

2018-03-14 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16398195#comment-16398195
 ] 

Weiwei Yang commented on YARN-8002:
---

Hi [~leftnoteasy]

I spent a few hours consolidating the patch based on your comments, 
{quote}I'm fine with keep the logics here, but here I just want to make sure 
that we don't do any unnecessary aggregations for single.
{quote}
Makes sense. In {{AllocationTagsManager}}, I have updated 
{{aggregateAllocationTagsByNode}} and {{aggregateAllocationTagsByRack}} to make 
sure there is no unnecessary aggregation for the single-app-ID case.
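
A rough, simplified sketch of what that short-circuit can look like (map shapes and 
method names are placeholders, not the actual AllocationTagsManager internals):

{code:java}
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public final class TagAggregationSketch {
  // tagsPerApp: appId -> (tag -> count) on one node or rack.
  static Map<String, Long> aggregate(
      Map<String, Map<String, Long>> tagsPerApp, Set<String> targetApps) {
    if (targetApps.size() == 1) {
      // Single-app case: return that app's counts directly, no merging.
      String only = targetApps.iterator().next();
      return tagsPerApp.getOrDefault(only, Collections.emptyMap());
    }
    // Multi-app namespaces (e.g. ALL, NOT_SELF): merge counts across apps.
    Map<String, Long> merged = new HashMap<>();
    for (String appId : targetApps) {
      tagsPerApp.getOrDefault(appId, Collections.emptyMap())
          .forEach((tag, count) -> merged.merge(tag, count, Long::sum));
    }
    return merged;
  }
}
{code}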
{quote}More specifically, following call need to be avoided
{quote}
I tried, but I could not find a neater way to handle this. This consistent model 
was suggested by [~asuresh] in an earlier discussion: no matter what namespace the 
parser returns to me, I call evaluate to ensure its target applications are 
properly interpreted. Otherwise we would end up with a lot of if-else.
However, I have made some improvements. If you take a look at 
{{AllocationTagNamespace}}, only the classes that need the evaluation step now 
override the {{evaluate()}} method. Hope that looks better now.
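
The pattern described here, sketched with placeholder class shapes (the real 
Evaluable/AllocationTagNamespace in the patch may differ):

{code:java}
import java.util.HashSet;
import java.util.Set;

// Placeholder shapes only; not the actual classes from the patch.
interface Evaluable<T> {
  // Default no-op: namespaces such as SELF or ALL need no extra interpretation,
  // so callers can invoke evaluate() unconditionally without if-else chains.
  default void evaluate(T target) { }
}

abstract class TagNamespaceSketch implements Evaluable<Set<String>> {
  Set<String> targetApps = new HashSet<>();
}

class NotSelfNamespaceSketch extends TagNamespaceSketch {
  private final String currentAppId;

  NotSelfNamespaceSketch(String currentAppId) {
    this.currentAppId = currentAppId;
  }

  @Override
  public void evaluate(Set<String> allAppIds) {
    // Only NOT_SELF needs the evaluation step: exclude the requesting app.
    targetApps = new HashSet<>(allAppIds);
    targetApps.remove(currentAppId);
  }
}
{code}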
{quote}For AllocationTags, I'm fine with keep the name, but I suggest to move 
it to scheduler.constraint package...
{quote}
Makes sense. I have moved
 * AllocationTagNamespace
 * AllocationTags
 * Evaluable

to the {{org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint}} 
package. Only {{AllocationTagNamespaceType}} is left in the api package.
{quote}AllocationTags#fromScope... I suggest to make explicit names to this.
{quote}
Makes sense. I've done so.
{quote}we should keep as less public interfaces as possible for ATM
{quote}
I have removed the following public methods since they are now only used by unit tests:
{code:java}
public long getNodeCardinalityByOp(NodeId nodeId, ApplicationId applicationId,
  Set<String> tags, LongBinaryOperator op);
public long getRackCardinalityByOp(String rack, ApplicationId applicationId,
  Set<String> tags, LongBinaryOperator op);
{code}
{quote}about not-self
{quote}
I discussed this with [~asuresh] offline; he mentioned it is required in their 
use case. I suggest we keep this namespace type for now; given the approach we 
are using, it is trivial to add or remove support for this namespace.

Please take a look at the v3 patch. Let me know your thoughts, thanks!

> Support NOT_SELF and ALL namespace types for allocation tag
> ---
>
> Key: YARN-8002
> URL: https://issues.apache.org/jira/browse/YARN-8002
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: YARN-8002.001.patch, YARN-8002.002.patch
>
>
> This is a continuation task after YARN-7972. YARN-7972 adds support to specify 
> tags with the namespaces SELF and APP_ID, like the following:
>  * self/
>  * app-id//
> This task is to track the work to support 2 of the remaining namespace types 
> *NOT_SELF* & *ALL* (we'll support app-label later):
>  * not-self/
>  * all/
> This will require a bit of refactoring in {{AllocationTagsManager}} as it needs 
> to do some proper aggregation on tags for multiple apps.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7952) RM should be able to recover log aggregation status after restart/fail-over

2018-03-14 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-7952:

Attachment: YARN-7952.8.patch

> RM should be able to recover log aggregation status after restart/fail-over
> ---
>
> Key: YARN-7952
> URL: https://issues.apache.org/jira/browse/YARN-7952
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>Priority: Major
> Attachments: YARN-7952-poc.patch, YARN-7952.1.patch, 
> YARN-7952.2.patch, YARN-7952.3.patch, YARN-7952.3.patch, YARN-7952.5.patch, 
> YARN-7952.6.patch, YARN-7952.7.patch, YARN-7952.8.patch
>
>
> Right now, the NM periodically sends its own log aggregation status to the RM. 
> The RM aggregates the status for each application, but it will not generate the 
> final status until a client call (from the web UI or CLI) triggers it. The RM 
> never persists the log aggregation status, so when the RM restarts/fails over, 
> the log aggregation status becomes “NOT_STARTED”. This is confusing; maybe we 
> should change it to “NOT_AVAILABLE” (will create a separate ticket for this). 
> In any case, we need to persist the log aggregation status for future use.
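
Purely as an illustration of "persist the aggregated status so recovery does not 
reset it", here is a sketch with a hypothetical store interface; it is not the 
actual RMStateStore API:

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical persistence hook, for illustration only.
interface LogAggregationStatusStore {
  void save(String appId, String status);
  Map<String, String> loadAll();
}

class LogAggregationTrackerSketch {
  private final Map<String, String> statusPerApp = new ConcurrentHashMap<>();
  private final LogAggregationStatusStore store;

  LogAggregationTrackerSketch(LogAggregationStatusStore store) {
    this.store = store;
    // On RM restart/fail-over, recover the last known status instead of
    // silently falling back to NOT_STARTED.
    statusPerApp.putAll(store.loadAll());
  }

  // Called whenever an NM report changes the aggregated status for an app.
  void onStatusUpdate(String appId, String newStatus) {
    statusPerApp.put(appId, newStatus);
    store.save(appId, newStatus); // persist, so a restart can recover it
  }
}
{code}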



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7952) RM should be able to recover log aggregation status after restart/fail-over

2018-03-14 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16398185#comment-16398185
 ] 

Xuan Gong commented on YARN-7952:
-

Thanks for the comments. [~rkanter]

Attached a new patch to address all your comments

> RM should be able to recover log aggregation status after restart/fail-over
> ---
>
> Key: YARN-7952
> URL: https://issues.apache.org/jira/browse/YARN-7952
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>Priority: Major
> Attachments: YARN-7952-poc.patch, YARN-7952.1.patch, 
> YARN-7952.2.patch, YARN-7952.3.patch, YARN-7952.3.patch, YARN-7952.5.patch, 
> YARN-7952.6.patch, YARN-7952.7.patch, YARN-7952.8.patch
>
>
> Right now, the NM periodically sends its own log aggregation status to the RM. 
> The RM aggregates the status for each application, but it will not generate the 
> final status until a client call (from the web UI or CLI) triggers it. The RM 
> never persists the log aggregation status, so when the RM restarts/fails over, 
> the log aggregation status becomes “NOT_STARTED”. This is confusing; maybe we 
> should change it to “NOT_AVAILABLE” (will create a separate ticket for this). 
> In any case, we need to persist the log aggregation status for future use.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8018) Yarn service: Add support for initiating service upgrade

2018-03-14 Thread Chandni Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16398146#comment-16398146
 ] 

Chandni Singh commented on YARN-8018:
-

The changes are getting larger than I anticipated, so I am providing an early 
patch for feedback. I will continue working on unit tests and other patches 
related to the upgrade.
cc. [~gsaha] [~billie.rinaldi] 

> Yarn service: Add support for initiating service upgrade
> 
>
> Key: YARN-8018
> URL: https://issues.apache.org/jira/browse/YARN-8018
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Attachments: YARN-8018.wip.patch
>
>
> Add support for initiating service upgrade which includes the following main 
> changes:
>  # Service API to initiate upgrade
>  # Persist service version on hdfs
>  # Start the upgraded version of service



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8018) Yarn service: Add support for initiating service upgrade

2018-03-14 Thread Chandni Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandni Singh updated YARN-8018:

Attachment: YARN-8018.wip.patch

> Yarn service: Add support for initiating service upgrade
> 
>
> Key: YARN-8018
> URL: https://issues.apache.org/jira/browse/YARN-8018
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Attachments: YARN-8018.wip.patch
>
>
> Add support for initiating service upgrade which includes the following main 
> changes:
>  # Service API to initiate upgrade
>  # Persist service version on hdfs
>  # Start the upgraded version of service



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7581) HBase filters are not constructed correctly in ATSv2

2018-03-14 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16398132#comment-16398132
 ] 

Rohith Sharma K S commented on YARN-7581:
-

Overall, the patch approach looks good. Some comments:

TimelineEntityReader.java
l.no 211,
- The method extractColumnFamiliesFromFiltersBasedOnFilters has duplicated code 
in its else-if conditions.
- Can this _new String(cf, Charset.forName("UTF-8"))_ be extracted to a util so 
that we don't miss the syntax anywhere?
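
A small helper of the kind the second bullet suggests; the class and method names 
below are placeholders, not existing timeline-reader code:

{code:java}
import java.nio.charset.StandardCharsets;

final class TimelineReaderBytesUtils {
  private TimelineReaderBytesUtils() { }

  /** Decode an HBase column-family byte[] as UTF-8 in one well-known place. */
  static String toUtf8(byte[] columnFamily) {
    return new String(columnFamily, StandardCharsets.UTF_8);
  }
}
{code}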

In all the *EntityReader implementations, cfsInFields is added for the INFO 
family by default. Is that because of the assumption that the info field will be 
retrieved by default all the time?


> HBase filters are not constructed correctly in ATSv2
> 
>
> Key: YARN-7581
> URL: https://issues.apache.org/jira/browse/YARN-7581
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2
>Affects Versions: 3.0.0-beta1
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Major
> Attachments: YARN-7581.00.patch, YARN-7581.01.patch, 
> YARN-7581.02.patch
>
>
> Post YARN-7346,
> TestTimelineReaderWebServicesHBaseStorage.testGetEntitiesConfigFilters() and 
> TestTimelineReaderWebServicesHBaseStorage.testGetEntitiesMetricFilters() 
> start to fail when hbase.profile is set to 2.0)
> *Error Message*
>  [ERROR] Failures:
>  [ERROR] 
> TestTimelineReaderWebServicesHBaseStorage.testGetEntitiesConfigFilters:1266 
> expected:<2> but was:<0>
>  [ERROR] 
> TestTimelineReaderWebServicesHBaseStorage.testGetEntitiesMetricFilters:1523 
> expected:<1> but was:<0>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org