[jira] [Commented] (YARN-8627) EntityGroupFSTimelineStore hdfs done directory keeps on accumulating

2018-08-06 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571082#comment-16571082
 ] 

genericqa commented on YARN-8627:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 45s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 23s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
20s{color} | {color:green} hadoop-yarn-server-timeline-pluginstorage in the 
patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 57m 51s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8627 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12934581/YARN-8627.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux c2d1528981e6 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 2e4e02b |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/21527/testReport/ |
| Max. process+thread count | 316 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timeline-pluginstorage
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timeline-pluginstorage
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/21527/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> EntityGroupFSTimelineStore 

[jira] [Comment Edited] (YARN-8609) NM oom because of large container statuses

2018-08-06 Thread Xianghao Lu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571033#comment-16571033
 ] 

Xianghao Lu edited comment on YARN-8609 at 8/7/18 2:36 AM:
---

Thank you for your patience!
 I understand what you are saying; my Hadoop version is 2.7.2, which does not 
contain the change in YARN-3998. Indeed, it would not take up too much memory 
when running with -YARN-3998.-
 However, if, for example, the raw diagnostic is largeExceptionMessage + 
fixedString + fixedString + ..., all of the meaningful fixedString entries are 
discarded when running with YARN-3998. If we instead do the truncation inside 
the for loop, every kind of diagnostic info is retained. That is the point I 
wanted to make; it is a small improvement.

Besides, diagnosticsMaxSize in -YARN-3998- is still necessary, and with 
appropriate truncation inside the for loop there is only a small chance of 
reaching diagnosticsMaxSize.
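
A minimal sketch of the per-fragment truncation being suggested (all names and 
limits here are illustrative assumptions, not the actual NodeManager code):

{code:java}
import java.util.Arrays;
import java.util.List;

// Hedged sketch: truncate each diagnostics fragment as it is appended inside
// the loop, instead of truncating only the final concatenated string, so the
// later fixed-size messages survive even when an early fragment is huge.
public class DiagnosticsTruncationSketch {
  private static final int PER_FRAGMENT_LIMIT = 4 * 1024;  // assumed per-fragment cap
  private static final int TOTAL_LIMIT = 64 * 1024;         // assumed overall cap

  static String buildDiagnostics(List<String> fragments) {
    StringBuilder sb = new StringBuilder();
    for (String fragment : fragments) {
      if (fragment.length() > PER_FRAGMENT_LIMIT) {
        fragment = fragment.substring(0, PER_FRAGMENT_LIMIT) + "...[truncated]";
      }
      sb.append(fragment).append('\n');
      if (sb.length() > TOTAL_LIMIT) {   // the overall cap is still enforced
        sb.setLength(TOTAL_LIMIT);
        break;
      }
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    String huge = new String(new char[100_000]).replace('\0', 'x');
    System.out.println(
        buildDiagnostics(Arrays.asList(huge, "Container killed", "Exit code 143")).length());
  }
}
{code}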


was (Author: luxianghao):
Thank you for your patience!
 I know what you say, and my hadoop version is 2.7.2 which don't contain the 
change in YARN-3998. Indeed, it would not take up too much memory if running 
with -YARN-3998.-
 However, for example, if the raw diagnostic is largeExceptionMessage + 
fixedString + fixedString + ... , all meaningful fixedString will be discarded 
when running with YARN-3998. So, if we do truncation in for loop, all kinds of 
diagnostic info will retain. This is what I want to say and it is a small 
improvement.

> NM oom because of large container statuses
> --
>
> Key: YARN-8609
> URL: https://issues.apache.org/jira/browse/YARN-8609
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Xianghao Lu
>Priority: Major
> Attachments: YARN-8609.001.patch, contain_status.jpg, oom.jpeg
>
>
> Sometimes the NodeManager sends very large container statuses to the 
> ResourceManager when it starts with recovery; as a result, the NodeManager 
> fails to start because of an OOM.
>  In my case, the container statuses totalled 135 MB across 11 container 
> statuses, and I found that the diagnostics of 5 of those containers were very 
> large (27 MB), so the patch truncates the container diagnostics.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8627) EntityGroupFSTimelineStore hdfs done directory keeps on accumulating

2018-08-06 Thread Tarun Parimi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571043#comment-16571043
 ] 

Tarun Parimi commented on YARN-8627:


Adding a test case in TestEntityGroupFSTimelineStore#testCleanLogs to check the 
cleaning of an app directory with multiple attempt dirs and an app dir within 
an app dir.
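
A rough, self-contained illustration of the directory shape that test case 
exercises (paths, ids, and the use of the local filesystem here are assumptions 
for illustration, not the actual TestEntityGroupFSTimelineStore code):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hedged sketch: build an app dir that contains multiple attempt dirs plus a
// nested app dir, i.e. the layout the cleaner must handle without failing.
public class CleanLogsLayoutSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.getLocal(new Configuration());
    Path doneDir = new Path("/tmp/ats-done-sketch/1499684568068/018");
    Path appDir = new Path(doneDir, "application_1499684568068_18270");
    fs.mkdirs(new Path(appDir, "appattempt_1499684568068_18270_000001"));
    fs.mkdirs(new Path(appDir, "appattempt_1499684568068_18270_000002"));
    // an app dir nested within an app dir
    fs.mkdirs(new Path(appDir, "application_1499684568068_18271"));
    // cleanLogs(...) is expected to remove appDir recursively once it is stale,
    // instead of stopping when it hits the nested directories.
  }
}
{code}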

> EntityGroupFSTimelineStore hdfs done directory keeps on accumulating
> 
>
> Key: YARN-8627
> URL: https://issues.apache.org/jira/browse/YARN-8627
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.8.0
>Reporter: Tarun Parimi
>Assignee: Tarun Parimi
>Priority: Major
> Attachments: YARN-8627.001.patch, YARN-8627.002.patch
>
>
> The EntityLogCleaner thread exits with the following ERROR every time it 
> runs.
> {code:java}
> 2018-07-18 19:59:39,837 INFO timeline.EntityGroupFSTimelineStore 
> (EntityGroupFSTimelineStore.java:cleanLogs(462)) - Deleting 
> hdfs://namenode/ats/done/1499684568068//018/application_1499684568068_18268
> 2018-07-18 19:59:39,844 INFO timeline.EntityGroupFSTimelineStore 
> (EntityGroupFSTimelineStore.java:cleanLogs(462)) - Deleting 
> hdfs://namenode/ats/done/1499684568068//018/application_1499684568068_18270
> 2018-07-18 19:59:39,848 ERROR timeline.EntityGroupFSTimelineStore 
> (EntityGroupFSTimelineStore.java:run(899)) - Error cleaning files  
> java.io.FileNotFoundException: File 
> hdfs://namenode/ats/done/1499684568068//018/application_1499684568068_18270
>  does not exist.  at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1062)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1069)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1040)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1019)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1015)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatusIterator(DistributedFileSystem.java:1015)
>   at 
> org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.shouldCleanAppLogDir(EntityGroupFSTimelineStore.java:480)
>  
> {code}
>  
>  Each time the thread gets scheduled, a different folder hits the error. As a 
> result, the thread never manages to clean all of the old done directories, 
> since it stops as soon as this error occurs. 
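
A minimal sketch of the kind of per-directory guard that lets the sweep 
continue past an entry that vanished between listing and inspection (this is 
illustrative only, not the actual YARN-8627 patch; names such as sweep and 
retainMillis are assumptions):

{code:java}
import java.io.FileNotFoundException;
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hedged sketch: tolerate app directories that disappear mid-scan so a single
// FileNotFoundException does not abort the entire clean-up run.
public class DoneDirCleanerSketch {
  static void sweep(FileSystem fs, Path doneDir, long retainMillis) throws IOException {
    for (FileStatus appDir : fs.listStatus(doneDir)) {
      try {
        boolean stale =
            System.currentTimeMillis() - appDir.getModificationTime() > retainMillis;
        if (stale) {
          fs.delete(appDir.getPath(), true);   // recursive delete of the app dir
        }
      } catch (FileNotFoundException e) {
        // Already removed by a previous pass or a concurrent cleaner; keep going.
      }
    }
  }
}
{code}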






[jira] [Updated] (YARN-8627) EntityGroupFSTimelineStore hdfs done directory keeps on accumulating

2018-08-06 Thread Tarun Parimi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tarun Parimi updated YARN-8627:
---
Attachment: YARN-8627.002.patch

> EntityGroupFSTimelineStore hdfs done directory keeps on accumulating
> 
>
> Key: YARN-8627
> URL: https://issues.apache.org/jira/browse/YARN-8627
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.8.0
>Reporter: Tarun Parimi
>Assignee: Tarun Parimi
>Priority: Major
> Attachments: YARN-8627.001.patch, YARN-8627.002.patch
>
>
> The EntityLogCleaner thread exits with the following ERROR every time it 
> runs.
> {code:java}
> 2018-07-18 19:59:39,837 INFO timeline.EntityGroupFSTimelineStore 
> (EntityGroupFSTimelineStore.java:cleanLogs(462)) - Deleting 
> hdfs://namenode/ats/done/1499684568068//018/application_1499684568068_18268
> 2018-07-18 19:59:39,844 INFO timeline.EntityGroupFSTimelineStore 
> (EntityGroupFSTimelineStore.java:cleanLogs(462)) - Deleting 
> hdfs://namenode/ats/done/1499684568068//018/application_1499684568068_18270
> 2018-07-18 19:59:39,848 ERROR timeline.EntityGroupFSTimelineStore 
> (EntityGroupFSTimelineStore.java:run(899)) - Error cleaning files  
> java.io.FileNotFoundException: File 
> hdfs://namenode/ats/done/1499684568068//018/application_1499684568068_18270
>  does not exist.  at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1062)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1069)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1040)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1019)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1015)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatusIterator(DistributedFileSystem.java:1015)
>   at 
> org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.shouldCleanAppLogDir(EntityGroupFSTimelineStore.java:480)
>  
> {code}
>  
>  Each time the thread gets scheduled, a different folder hits the error. As a 
> result, the thread never manages to clean all of the old done directories, 
> since it stops as soon as this error occurs. 






[jira] [Comment Edited] (YARN-8609) NM oom because of large container statuses

2018-08-06 Thread Xianghao Lu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571033#comment-16571033
 ] 

Xianghao Lu edited comment on YARN-8609 at 8/7/18 2:21 AM:
---

Thank you for your patience!
 I understand what you are saying; my Hadoop version is 2.7.2, which does not 
contain the change in YARN-3998. Indeed, it would not take up too much memory 
when running with -YARN-3998.-
 However, if, for example, the raw diagnostic is largeExceptionMessage + 
fixedString + fixedString + ..., all of the meaningful fixedString entries are 
discarded when running with YARN-3998. If we instead do the truncation inside 
the for loop, every kind of diagnostic info is retained. That is the point I 
wanted to make; it is a small improvement.


was (Author: luxianghao):
Thank you for your patience!
I know what you say, and my hadoop version is 2.7.2 which don't contain the 
change in YARN-3998. Indeed, it would not take up too many memory if running 
with -YARN-3998.-
However, for example, if the raw diagnostic is largeExceptionMessage + 
fixedString + fixedString + ... , all meaningful fixedString will be discarded. 
So, if we do truncation in for loop, all kinds of diagnostic info will retain. 
This is what I want to say and it is a small improvement.

> NM oom because of large container statuses
> --
>
> Key: YARN-8609
> URL: https://issues.apache.org/jira/browse/YARN-8609
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Xianghao Lu
>Priority: Major
> Attachments: YARN-8609.001.patch, contain_status.jpg, oom.jpeg
>
>
> Sometimes the NodeManager sends very large container statuses to the 
> ResourceManager when it starts with recovery; as a result, the NodeManager 
> fails to start because of an OOM.
>  In my case, the container statuses totalled 135 MB across 11 container 
> statuses, and I found that the diagnostics of 5 of those containers were very 
> large (27 MB), so the patch truncates the container diagnostics.






[jira] [Commented] (YARN-8609) NM oom because of large container statuses

2018-08-06 Thread Xianghao Lu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571033#comment-16571033
 ] 

Xianghao Lu commented on YARN-8609:
---

Thank you for your patience!
I understand what you are saying; my Hadoop version is 2.7.2, which does not 
contain the change in YARN-3998. Indeed, it would not take up too much memory 
when running with -YARN-3998.-
However, if, for example, the raw diagnostic is largeExceptionMessage + 
fixedString + fixedString + ..., all of the meaningful fixedString entries are 
discarded. If we instead do the truncation inside the for loop, every kind of 
diagnostic info is retained. That is the point I wanted to make; it is a small 
improvement.

> NM oom because of large container statuses
> --
>
> Key: YARN-8609
> URL: https://issues.apache.org/jira/browse/YARN-8609
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Xianghao Lu
>Priority: Major
> Attachments: YARN-8609.001.patch, contain_status.jpg, oom.jpeg
>
>
> Sometimes the NodeManager sends very large container statuses to the 
> ResourceManager when it starts with recovery; as a result, the NodeManager 
> fails to start because of an OOM.
>  In my case, the container statuses totalled 135 MB across 11 container 
> statuses, and I found that the diagnostics of 5 of those containers were very 
> large (27 MB), so the patch truncates the container diagnostics.






[jira] [Commented] (YARN-8448) AM HTTPS Support

2018-08-06 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571030#comment-16571030
 ] 

genericqa commented on YARN-8448:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
24s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 11 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
21s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 30m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 14m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
25m 59s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
hadoop-client-modules/hadoop-client-runtime 
hadoop-client-modules/hadoop-client-check-invariants 
hadoop-client-modules/hadoop-client-minicluster 
hadoop-client-modules/hadoop-client-check-test-invariants {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
42s{color} | {color:red} hadoop-hdds/server-scm in trunk has 1 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 11m  
6s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
19s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 29m 
55s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red} 29m 55s{color} | 
{color:red} root generated 1 new + 11 unchanged - 0 fixed = 12 total (was 11) 
{color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 29m 
55s{color} | {color:green} root generated 0 new + 1458 unchanged - 10 fixed = 
1458 total (was 1468) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 14m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m 
23s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 39s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
hadoop-client-modules/hadoop-client-runtime 
hadoop-client-modules/hadoop-client-check-invariants 
hadoop-client-modules/hadoop-client-minicluster 
hadoop-client-modules/hadoop-client-check-test-invariants {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 17m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 11m 
21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
23s{color} | {color:green} hadoop-project in the patch passed. 

[jira] [Commented] (YARN-8626) Create HomePolicyManager that sends all the requests to the home subcluster

2018-08-06 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571029#comment-16571029
 ] 

genericqa commented on YARN-8626:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 54s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 24s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
12s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 70m 12s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8626 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12934569/YARN-8626.008.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 2d12e5c88173 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / ca20e0d |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/21526/testReport/ |
| Max. process+thread count | 329 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/21526/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Create HomePolicyManager that sends all the requests to the home subcluster
> 

[jira] [Commented] (YARN-8626) Create HomePolicyManager that sends all the requests to the home subcluster

2018-08-06 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571022#comment-16571022
 ] 

genericqa commented on YARN-8626:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 32m 
 3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 45s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  7s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
16s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 70m  0s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8626 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12934568/YARN-8626.007.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 92d0fc3a8a2c 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / ca20e0d |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/21525/testReport/ |
| Max. process+thread count | 303 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/21525/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Create HomePolicyManager that sends all the requests to the home subcluster
> 

[jira] [Commented] (YARN-8626) Create HomePolicyManager that sends all the requests to the home subcluster

2018-08-06 Thread Subru Krishnan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570999#comment-16570999
 ] 

Subru Krishnan commented on YARN-8626:
--

Thanks [~elgoiri] for addressing my comments, +1 on the latest patch (v8) 
pending Yetus.

> Create HomePolicyManager that sends all the requests to the home subcluster
> ---
>
> Key: YARN-8626
> URL: https://issues.apache.org/jira/browse/YARN-8626
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Giovanni Matteo Fumarola
>Assignee: Íñigo Goiri
>Priority: Minor
> Fix For: 3.2.0
>
> Attachments: YARN-8626.000.patch, YARN-8626.001.patch, 
> YARN-8626.002.patch, YARN-8626.003.patch, YARN-8626.004.patch, 
> YARN-8626.005.patch, YARN-8626.006.patch, YARN-8626.007.patch, 
> YARN-8626.008.patch
>
>
> To have the same behavior as a regular non-federated deployment, one should 
> be able to submit jobs to the local RM and get the job constrained to that 
> subcluster.
> This JIRA creates an AMRMProxyPolicy that sends resources to the home 
> subcluster and mimics the behavior of a non-federated cluster.






[jira] [Updated] (YARN-8626) Create HomePolicyManager that sends all the requests to the home subcluster

2018-08-06 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/YARN-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated YARN-8626:
--
Attachment: YARN-8626.008.patch

> Create HomePolicyManager that sends all the requests to the home subcluster
> ---
>
> Key: YARN-8626
> URL: https://issues.apache.org/jira/browse/YARN-8626
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Giovanni Matteo Fumarola
>Assignee: Íñigo Goiri
>Priority: Minor
> Fix For: 3.2.0
>
> Attachments: YARN-8626.000.patch, YARN-8626.001.patch, 
> YARN-8626.002.patch, YARN-8626.003.patch, YARN-8626.004.patch, 
> YARN-8626.005.patch, YARN-8626.006.patch, YARN-8626.007.patch, 
> YARN-8626.008.patch
>
>
> To have the same behavior as a regular non-federated deployment, one should 
> be able to submit jobs to the local RM and get the job constrained to that 
> subcluster.
> This JIRA creates an AMRMProxyPolicy that sends resources to the home 
> subcluster and mimics the behavior of a non-federated cluster.






[jira] [Updated] (YARN-8626) Create HomePolicyManager that sends all the requests to the home subcluster

2018-08-06 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/YARN-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated YARN-8626:
--
Attachment: YARN-8626.007.patch

> Create HomePolicyManager that sends all the requests to the home subcluster
> ---
>
> Key: YARN-8626
> URL: https://issues.apache.org/jira/browse/YARN-8626
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Giovanni Matteo Fumarola
>Assignee: Íñigo Goiri
>Priority: Minor
> Fix For: 3.2.0
>
> Attachments: YARN-8626.000.patch, YARN-8626.001.patch, 
> YARN-8626.002.patch, YARN-8626.003.patch, YARN-8626.004.patch, 
> YARN-8626.005.patch, YARN-8626.006.patch, YARN-8626.007.patch
>
>
> To have the same behavior as a regular non-federated deployment, one should 
> be able to submit jobs to the local RM and get the job constrained to that 
> subcluster.
> This JIRA creates an AMRMProxyPolicy that sends resources to the home 
> subcluster and mimics the behavior of a non-federated cluster.






[jira] [Updated] (YARN-8629) Container cleanup fails while trying to delete Cgroups

2018-08-06 Thread Suma Shivaprasad (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suma Shivaprasad updated YARN-8629:
---
Summary: Container cleanup fails while trying to delete Cgroups  (was: 
Container cleanup failed)

> Container cleanup fails while trying to delete Cgroups
> --
>
> Key: YARN-8629
> URL: https://issues.apache.org/jira/browse/YARN-8629
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yesha Vora
>Assignee: Suma Shivaprasad
>Priority: Major
>
> When an application fails to launch a container successfully, the cleanup of 
> the container also fails with the message below.
> {code}
> 2018-08-06 03:28:20,351 WARN  resources.CGroupsHandlerImpl 
> (CGroupsHandlerImpl.java:checkAndDeleteCgroup(523)) - Failed to read cgroup 
> tasks file.
> java.io.FileNotFoundException: 
> /sys/fs/cgroup/cpu,cpuacct/hadoop-yarn-tmp-cxx/container_e02_156898541_0010_20_02/tasks
>  (No such file or directory)
> at java.io.FileInputStream.open0(Native Method)
> at java.io.FileInputStream.open(FileInputStream.java:195)
> at java.io.FileInputStream.<init>(FileInputStream.java:138)
> at java.io.FileInputStream.<init>(FileInputStream.java:93)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsHandlerImpl.checkAndDeleteCgroup(CGroupsHandlerImpl.java:507)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsHandlerImpl.deleteCGroup(CGroupsHandlerImpl.java:542)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsCpuResourceHandlerImpl.postComplete(CGroupsCpuResourceHandlerImpl.java:238)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerChain.postComplete(ResourceHandlerChain.java:111)
> at 
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.postComplete(LinuxContainerExecutor.java:964)
> at 
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.reapContainer(LinuxContainerExecutor.java:787)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.cleanupContainer(ContainerLaunch.java:821)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:161)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:57)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
> at java.lang.Thread.run(Thread.java:748)
> 2018-08-06 03:28:20,372 WARN  resources.CGroupsHandlerImpl 
> (CGroupsHandlerImpl.java:checkAndDeleteCgroup(523)) - Failed to read cgroup 
> tasks file.{code}
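
A minimal sketch of the kind of check that would treat a missing cgroup tasks 
file as an already-removed cgroup during cleanup (illustrative only; the class 
and method names are assumptions, not the actual CGroupsHandlerImpl code):

{code:java}
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hedged sketch: a missing "tasks" file means the cgroup is already gone and
// should not fail container cleanup; an empty one means no pids are attached.
public class CgroupTasksCheckSketch {
  static boolean cgroupIsDeletable(String cgroupPath) throws IOException {
    File tasks = new File(cgroupPath, "tasks");
    if (!tasks.exists()) {
      return true;   // cgroup directory was already cleaned up
    }
    String content =
        new String(Files.readAllBytes(tasks.toPath()), StandardCharsets.UTF_8);
    return content.trim().isEmpty();
  }

  public static void main(String[] args) throws IOException {
    System.out.println(cgroupIsDeletable(
        "/sys/fs/cgroup/cpu,cpuacct/hadoop-yarn/container_sketch"));
  }
}
{code}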






[jira] [Commented] (YARN-7089) Mark the log-aggregation-controller APIs as public

2018-08-06 Thread Wangda Tan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570961#comment-16570961
 ] 

Wangda Tan commented on YARN-7089:
--

+1, LGTM.

> Mark the log-aggregation-controller APIs as public
> --
>
> Key: YARN-7089
> URL: https://issues.apache.org/jira/browse/YARN-7089
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Xuan Gong
>Assignee: Zian Chen
>Priority: Major
> Attachments: YARN-7089.001.patch
>
>







[jira] [Assigned] (YARN-8629) Container cleanup failed

2018-08-06 Thread Suma Shivaprasad (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suma Shivaprasad reassigned YARN-8629:
--

Assignee: Suma Shivaprasad

> Container cleanup failed
> 
>
> Key: YARN-8629
> URL: https://issues.apache.org/jira/browse/YARN-8629
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yesha Vora
>Assignee: Suma Shivaprasad
>Priority: Major
>
> When an application fails to launch a container successfully, the cleanup of 
> the container also fails with the message below.
> {code}
> 2018-08-06 03:28:20,351 WARN  resources.CGroupsHandlerImpl 
> (CGroupsHandlerImpl.java:checkAndDeleteCgroup(523)) - Failed to read cgroup 
> tasks file.
> java.io.FileNotFoundException: 
> /sys/fs/cgroup/cpu,cpuacct/hadoop-yarn-tmp-cxx/container_e02_156898541_0010_20_02/tasks
>  (No such file or directory)
> at java.io.FileInputStream.open0(Native Method)
> at java.io.FileInputStream.open(FileInputStream.java:195)
> at java.io.FileInputStream.<init>(FileInputStream.java:138)
> at java.io.FileInputStream.<init>(FileInputStream.java:93)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsHandlerImpl.checkAndDeleteCgroup(CGroupsHandlerImpl.java:507)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsHandlerImpl.deleteCGroup(CGroupsHandlerImpl.java:542)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsCpuResourceHandlerImpl.postComplete(CGroupsCpuResourceHandlerImpl.java:238)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerChain.postComplete(ResourceHandlerChain.java:111)
> at 
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.postComplete(LinuxContainerExecutor.java:964)
> at 
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.reapContainer(LinuxContainerExecutor.java:787)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.cleanupContainer(ContainerLaunch.java:821)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:161)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:57)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
> at java.lang.Thread.run(Thread.java:748)
> 2018-08-06 03:28:20,372 WARN  resources.CGroupsHandlerImpl 
> (CGroupsHandlerImpl.java:checkAndDeleteCgroup(523)) - Failed to read cgroup 
> tasks file.{code}






[jira] [Created] (YARN-8629) Container cleanup failed

2018-08-06 Thread Yesha Vora (JIRA)
Yesha Vora created YARN-8629:


 Summary: Container cleanup failed
 Key: YARN-8629
 URL: https://issues.apache.org/jira/browse/YARN-8629
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Yesha Vora


When an application fails to launch a container successfully, the cleanup of 
the container also fails with the message below.
{code}
2018-08-06 03:28:20,351 WARN  resources.CGroupsHandlerImpl 
(CGroupsHandlerImpl.java:checkAndDeleteCgroup(523)) - Failed to read cgroup 
tasks file.
java.io.FileNotFoundException: 
/sys/fs/cgroup/cpu,cpuacct/hadoop-yarn-tmp-cxx/container_e02_156898541_0010_20_02/tasks
 (No such file or directory)
at java.io.FileInputStream.open0(Native Method)
at java.io.FileInputStream.open(FileInputStream.java:195)
at java.io.FileInputStream.<init>(FileInputStream.java:138)
at java.io.FileInputStream.<init>(FileInputStream.java:93)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsHandlerImpl.checkAndDeleteCgroup(CGroupsHandlerImpl.java:507)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsHandlerImpl.deleteCGroup(CGroupsHandlerImpl.java:542)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsCpuResourceHandlerImpl.postComplete(CGroupsCpuResourceHandlerImpl.java:238)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerChain.postComplete(ResourceHandlerChain.java:111)
at 
org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.postComplete(LinuxContainerExecutor.java:964)
at 
org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.reapContainer(LinuxContainerExecutor.java:787)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.cleanupContainer(ContainerLaunch.java:821)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:161)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:57)
at 
org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
at 
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
at java.lang.Thread.run(Thread.java:748)
2018-08-06 03:28:20,372 WARN  resources.CGroupsHandlerImpl 
(CGroupsHandlerImpl.java:checkAndDeleteCgroup(523)) - Failed to read cgroup 
tasks file.{code}






[jira] [Commented] (YARN-6972) Adding RM ClusterId in AppInfo

2018-08-06 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570944#comment-16570944
 ] 

genericqa commented on YARN-6972:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 28m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 30s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 26s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 71m  7s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}133m 16s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-6972 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12934549/YARN-6972.013.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 4596575f209d 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / ca20e0d |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/21523/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/21523/testReport/ |
| Max. process+thread count | 903 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 

[jira] [Commented] (YARN-8160) Yarn Service Upgrade: Support upgrade of service that use docker containers

2018-08-06 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570918#comment-16570918
 ] 

Eric Yang commented on YARN-8160:
-

[~csingh] Exit code 255 is coming from docker inspect 
container_e02_1533231998644_0009_01_03. There appears to be a race condition: 
the ContainerLaunch thread has issued the termination of the docker container 
pid, while LinuxContainerExecutor still has an independent child process 
checking the liveness of the docker container. The two code paths are not 
coordinated, which causes the container status to record an incorrect, 
transient result from container-executor. One possibility is to find the 
parent pid of the docker container and send it a SIGKILL, so that no 
additional status is written by container-executor, and then signal the docker 
container. This would prevent the NodeManager from processing the exit code 
from docker inspect container_e02_1533231998644_0009_01_03.


> Yarn Service Upgrade: Support upgrade of service that use docker containers 
> 
>
> Key: YARN-8160
> URL: https://issues.apache.org/jira/browse/YARN-8160
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
>  Labels: Docker
> Attachments: container_e02_1533231998644_0009_01_03.nm.log
>
>
> Ability to upgrade dockerized  yarn native services.
> Ref: YARN-5637
> *Background*
> Container upgrade is supported by the NM via {{reInitializeContainer}} api. 
> {{reInitializeContainer}} does *NOT* change the ContainerId of the upgraded 
> container.
> NM performs the following steps during {{reInitializeContainer}}:
> - kills the existing process
> - cleans up the container
> - launches another container with the new {{ContainerLaunchContext}}
> NOTE: {{ContainerLaunchContext}} holds all the information needed to upgrade 
> the container.
> With {{reInitializeContainer}}, the following does *NOT* change:
> - container ID. The NM does not create it; it is provided to the NM, and the 
> RM is not creating another container allocation here.
> - {{localizedResources}}: this stays the same if the upgrade does *NOT* 
> require additional resources, IIUC.
>  
> The following changes with {{reInitializeContainer}}:
> - the working directory of the upgraded container changes. It is *NOT* a 
> relaunch. 
> *Changes required in the case of docker containers*
> - {{reInitializeContainer}} does not seem to work with Docker containers. 
> Investigate and fix this.
> - [Future change] Add an additional API to the NM to pull the images, and 
> modify {{reInitializeContainer}} so that it can trigger the docker container 
> launch without pulling the image first, controlled by a flag.
> -- When the service upgrade is initialized, we can give the user an option to 
> just pull the images on the NMs.
> -- When a component instance is upgraded, it calls {{reInitializeContainer}} 
> with the pull-image flag set to false, since the NM will already have pulled 
> the images.
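
A minimal sketch of how an AM-side client could drive such an in-place upgrade 
through the NM re-initialize API (assuming the trunk NMClient#reInitializeContainer 
signature; how the upgraded {{ContainerLaunchContext}} is built is left out):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.ContainerId;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.client.api.NMClient;

// Hedged sketch: the same ContainerId is kept; the NM kills the old process,
// cleans up, and relaunches with the new launch context.
public class ReInitSketch {
  static void upgrade(ContainerId containerId, ContainerLaunchContext upgradedCtx)
      throws Exception {
    NMClient nmClient = NMClient.createNMClient();
    nmClient.init(new Configuration());
    nmClient.start();
    try {
      nmClient.reInitializeContainer(containerId, upgradedCtx, true /* autoCommit */);
    } finally {
      nmClient.stop();
    }
  }
}
{code}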






[jira] [Commented] (YARN-7417) re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to remove duplicate codes

2018-08-06 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570898#comment-16570898
 ] 

genericqa commented on YARN-7417:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  3s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 35s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
9s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 62m 47s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-7417 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12934552/YARN-7417.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 598e24bbeac5 3.13.0-144-generic #193-Ubuntu SMP Thu Mar 15 
17:03:53 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / ca20e0d |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/21524/testReport/ |
| Max. process+thread count | 301 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/21524/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> re-factory 

[jira] [Updated] (YARN-8626) Create HomePolicyManager that sends all the requests to the home subcluster

2018-08-06 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/YARN-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated YARN-8626:
--
Summary: Create HomePolicyManager that sends all the requests to the home 
subcluster  (was: Create LocalPolicyManager that sends all the requests to the 
home subcluster)

> Create HomePolicyManager that sends all the requests to the home subcluster
> ---
>
> Key: YARN-8626
> URL: https://issues.apache.org/jira/browse/YARN-8626
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Giovanni Matteo Fumarola
>Assignee: Íñigo Goiri
>Priority: Minor
> Fix For: 3.2.0
>
> Attachments: YARN-8626.000.patch, YARN-8626.001.patch, 
> YARN-8626.002.patch, YARN-8626.003.patch, YARN-8626.004.patch, 
> YARN-8626.005.patch, YARN-8626.006.patch
>
>
> To have the same behavior as a regular non-federated deployment, one should 
> be able to submit jobs to the local RM and get the job constrained to that 
> subcluster.
> This JIRA creates an AMRMProxyPolicy that sends resources to the home 
> subcluster and mimics the behavior of a non-federated cluster.
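A minimal sketch of the routing idea, using a simplified, hypothetical policy shape rather than the exact Federation interfaces: every outgoing ask is mapped to the home subcluster.

{code:java}
import java.util.Collections;
import java.util.List;
import java.util.Map;
import org.apache.hadoop.yarn.api.records.ResourceRequest;

/** Illustrative only: route every request to the home subcluster. */
public class HomePolicySketch {
  private final String homeSubClusterId;  // subcluster of the local RM

  public HomePolicySketch(String homeSubClusterId) {
    this.homeSubClusterId = homeSubClusterId;
  }

  /** No splitting or load balancing: all asks go to the home subcluster. */
  public Map<String, List<ResourceRequest>> route(List<ResourceRequest> asks) {
    return Collections.singletonMap(homeSubClusterId, asks);
  }
}
{code}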



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-8160) Yarn Service Upgrade: Support upgrade of service that use docker containers

2018-08-06 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570774#comment-16570774
 ] 

Chandni Singh edited comment on YARN-8160 at 8/6/18 10:06 PM:
--

Attached are the logs of container 3, which fails to re-initialize. When it is 
re-initialized, the container is stopped and cleaned up. This causes the 
container to exit, but here it exits with code {{255}} instead of 
{{FORCE_KILLED}} or {{TERMINATED}}.

Since the container exits with a failure code ({{255}}), the status of the 
container in the NM changes from {{REINITIALIZING_AWAITING_KILL}} to 
{{EXITED_WITH_FAILURE}}.

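For reference, a minimal sketch of why {{255}} takes the failure path, assuming the conventional kill exit codes (137 = 128 + SIGKILL for {{FORCE_KILLED}}, 143 = 128 + SIGTERM for {{TERMINATED}}); the names and constants below are illustrative, not the actual {{ContainerImpl}} transition code:

{code:java}
/** Illustrative only: classifies a container exit code the way the NM's
 *  kill path expects. The real logic lives in ContainerImpl's transitions. */
public class ExitCodeSketch {
  // Conventional "killed by signal" exit codes (128 + signal number).
  static final int FORCE_KILLED = 137;  // 128 + SIGKILL(9)
  static final int TERMINATED   = 143;  // 128 + SIGTERM(15)

  /** A container awaiting kill is only treated as killed for these codes. */
  static boolean isExpectedKillExit(int exitCode) {
    return exitCode == FORCE_KILLED || exitCode == TERMINATED;
  }

  public static void main(String[] args) {
    // The Docker container in the attached log exits with 255, so the NM treats
    // it as a failure: REINITIALIZING_AWAITING_KILL -> EXITED_WITH_FAILURE.
    System.out.println(isExpectedKillExit(255));  // false
    System.out.println(isExpectedKillExit(137));  // true
  }
}
{code}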
Below are the relevant log statements:

1. Reinit of the container is triggered
{code:java}
 ctr005.log:2018-08-02 22:30:41,100 DEBUG container.ContainerImpl 
(ContainerImpl.java:handle(2080)) - Processing 
container_e02_1533231998644_0009_01_03 of type REINITIALIZE_CONTAINER

ctr005.log:2018-08-02 22:30:41,101 INFO container.ContainerImpl 
(ContainerImpl.java:handle(2093)) - Container 
container_e02_1533231998644_0009_01_03 transitioned from RUNNING to 
REINITIALIZING_AWAITING_KIL
{code}
2. Reinit triggers cleanup of the container
{code:java}
ctr005.log:2018-08-02 22:30:41,102 INFO launcher.ContainerLaunch 
(ContainerLaunch.java:cleanupContainer(734)) - Cleaning up container 
container_e02_1533231998644_0009_01_03
ctr005.log:2018-08-02 22:30:41,102 DEBUG recovery.NMLeveldbStateStoreService 
(NMLeveldbStateStoreService.java:storeContainerKilled(555)) - 
storeContainerKilled: containerId=container_e02_1533231998644_0009_01_03
ctr005.log:2018-08-02 22:30:41,102 DEBUG launcher.ContainerLaunch 
(ContainerLaunch.java:cleanupContainer(752)) - Marking container 
container_e02_1533231998644_0009_01_03 as inactive
ctr005.log:2018-08-02 22:30:41,102 DEBUG launcher.ContainerLaunch 
(ContainerLaunch.java:cleanupContainer(759)) - Getting pid for container 
container_e02_1533231998644_0009_01_03 to kill from pid file 
/tmp/hadoop/yarn/local/nmPrivate/application_1533231998644_0009/container_e02_1533231998644_0009_01_03/container_e02_1533231998644_0009_01_03.pid
ctr005.log:2018-08-02 22:30:41,102 DEBUG launcher.ContainerLaunch 
(ContainerLaunch.java:getContainerPid(1084)) - Accessing pid for container 
container_e02_1533231998644_0009_01_03 from pid file 
/tmp/hadoop/yarn/local/nmPrivate/application_1533231998644_0009/container_e02_1533231998644_0009_01_03/container_e02_1533231998644_0009_01_03.pid
ctr005.log:2018-08-02 22:30:41,102 DEBUG util.ProcessIdFileReader 
(ProcessIdFileReader.java:getProcessId(53)) - Accessing pid from pid file 
/tmp/hadoop/yarn/local/nmPrivate/application_1533231998644_0009/container_e02_1533231998644_0009_01_03/container_e02_1533231998644_0009_01_03.pid
ctr005.log:2018-08-02 22:30:41,102 DEBUG util.ProcessIdFileReader 
(ProcessIdFileReader.java:getProcessId(103)) - Got pid 364708 from path 
/tmp/hadoop/yarn/local/nmPrivate/application_1533231998644_0009/container_e02_1533231998644_0009_01_03/container_e02_1533231998644_0009_01_03.pid
ctr005.log:2018-08-02 22:30:41,102 DEBUG launcher.ContainerLaunch 
(ContainerLaunch.java:getContainerPid(1096)) - Got pid 364708 for container 
container_e02_1533231998644_0009_01_03
ctr005.log:2018-08-02 22:30:41,102 DEBUG launcher.ContainerLaunch 
(ContainerLaunch.java:signalProcess(919)) - Sending signal to pid 364708 as 
user root for container container_e02_1533231998644_0009_01_03
ctr005.log:2018-08-02 22:30:41,102 DEBUG docker.DockerCommandExecutor 
(DockerCommandExecutor.java:executeDockerCommand(89)) - Running docker command: 
inspect docker-command=inspect format=\{{.State.Status}} 
name=container_e02_1533231998644_0009_01_03
ctr005.log:2018-08-02 22:30:41,103 DEBUG privileged.PrivilegedOperationExecutor 
(PrivilegedOperationExecutor.java:getPrivilegedOperationExecutionCommand(119)) 
- Privileged Execution Command Array: 
[/hadoop_dist/hadoop-yarn/bin/container-executor, --inspect-docker-container, 
--format=\{{.State.Status}}, container_e02_1533231998644_0009_01_03]
ctr005.log:2018-08-02 22:30:41,129 DEBUG privileged.PrivilegedOperationExecutor 
(PrivilegedOperationExecutor.java:executePrivilegedOperation(155)) - 
[/hadoop_dist/hadoop-yarn/bin/container-executor, --inspect-docker-container, 
--format=\{{.State.Status}}, container_e02_1533231998644_0009_01_03]
ctr005.log:2018-08-02 22:30:41,130 DEBUG docker.DockerCommandExecutor 
(DockerCommandExecutor.java:getContainerStatus(154)) - Container Status: 
running ContainerId: container_e02_1533231998644_0009_01_03
ctr005.log:2018-08-02 22:30:41,131 DEBUG docker.DockerCommandExecutor 
(DockerCommandExecutor.java:executeDockerCommand(89)) - Running docker command: 
stop docker-command=stop name=container_e02_1533231998644_0009_01_03
{code}
3. After 10 seconds, the stop command sent to the executor completes and the 
container is 

[jira] [Commented] (YARN-4946) RM should not consider an application as COMPLETED when log aggregation is not in a terminal state

2018-08-06 Thread Robert Kanter (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570843#comment-16570843
 ] 

Robert Kanter commented on YARN-4946:
-

{quote}Please also keep in mind that checkAppNumCompletedLimit is only invoked 
when an app becomes completed (APP_COMPLETED event dispatched from 
FinalTransition), so it can happen that we store more applications in the 
state-store and memory until another app finishes, but I can't think of any 
better solution currently.{quote}
I think that's fine.  The pressure to remove older completed apps comes from 
newer apps being completed, so if no new apps complete, we don't need to worry 
about this.  And when a newer app completes, this will trigger the check.

The 003 patch looks good overall.  Here are some more comments:
# Missing a newline at the end of {{RMAppImplTest}}.
# {{RMAppImplTest}} should be named {{TestRMAppImpl}} (we typically name tests 
{{TestFoo}} rather than {{FooTest}}).
# In {{RMAppImplTest}}, it would be best if we could avoid so much reflection 
to get access to internal state.  Instead, perhaps we can either get the App 
into that state by sending events (it looks like {{TestRMAppTransitions}} does 
this) or add a {{\@VisibleForTesting}} method.  
#- Similarly with {{completedAppsField}} in 
{{TestRMAppManager$TestRMAppManager}}.  It would be better to add a getter 
that's {{\@VisibleForTesting}} and package private (see the sketch after this 
list).  Or given that {{TestRMAppManager$TestRMAppManager}} is a subclass of 
{{RMAppManager}}, you could make it protected.
# I'm not sure we need so many tests in {{RMAppImplTest}}.  The updated code in 
{{RMAppImpl#FinalTransition}} doesn't really do much with log aggregation now, 
and does pretty much the same thing whether it's enabled or not.  So having 
tests that check the behavior with various states of log aggregation seems 
unnecessary.  
#- In fact, the changes in {{RMAppImpl#FinalTransition}} are all just 
re-organizing the code - the logic is still the same, right?  And it's not 
really even related to this JIRA.  Perhaps these changes should be moved to a 
new JIRA whose aim is to make the code more readable?
# {{RMAppManager}} changes and tests look good.  
# The line declaring 
{{TestAppManager#createRMAppsMapMixedLogAggregationStatus}} is too long.
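To illustrate the getter suggestion above, a minimal sketch (the field name and type here are assumptions for illustration, not the actual {{RMAppManager}} code):

{code:java}
import com.google.common.annotations.VisibleForTesting;
import java.util.LinkedList;
import org.apache.hadoop.yarn.api.records.ApplicationId;

public class RMAppManagerSketch {
  // Assumed shape of the internal state the test wants to inspect.
  private final LinkedList<ApplicationId> completedApps = new LinkedList<>();

  // Package-private, test-only accessor: avoids reflection in the test while
  // keeping the field itself private.
  @VisibleForTesting
  LinkedList<ApplicationId> getCompletedApps() {
    return completedApps;
  }
}
{code}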


> RM should not consider an application as COMPLETED when log aggregation is 
> not in a terminal state
> --
>
> Key: YARN-4946
> URL: https://issues.apache.org/jira/browse/YARN-4946
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: log-aggregation
>Affects Versions: 2.8.0
>Reporter: Robert Kanter
>Assignee: Szilard Nemeth
>Priority: Major
> Attachments: YARN-4946.001.patch, YARN-4946.002.patch, 
> YARN-4946.003.patch
>
>
> MAPREDUCE-6415 added a tool that combines the aggregated log files for each 
> Yarn App into a HAR file.  When run, it seeds the list by looking at the 
> aggregated logs directory, and then filters out ineligible apps.  One of the 
> criteria involves checking with the RM that an Application's log aggregation 
> status is not still running and has not failed.  When the RM "forgets" about 
> an older completed Application (e.g. RM failover, enough time has passed, 
> etc), the tool won't find the Application in the RM and will just assume that 
> its log aggregation succeeded, even if it actually failed or is still running.
> We can solve this problem by doing the following:
> The RM should not consider an app to be fully completed (and thus removed 
> from its history) until the aggregation status has reached a terminal state 
> (e.g. SUCCEEDED, FAILED, TIME_OUT).
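A minimal sketch of the terminal-state check implied by the description, using the {{LogAggregationStatus}} values named above (illustrative, not the actual RM change):

{code:java}
import org.apache.hadoop.yarn.api.records.LogAggregationStatus;

public class LogAggregationTerminalCheck {
  /** An app would only be dropped from RM history once aggregation is terminal. */
  static boolean isTerminal(LogAggregationStatus status) {
    switch (status) {
      case SUCCEEDED:
      case FAILED:
      case TIME_OUT:
        return true;
      default:
        // Anything non-terminal (e.g. still running) keeps the app around.
        return false;
    }
  }
}
{code}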



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7417) re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to remove duplicate codes

2018-08-06 Thread Zian Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570836#comment-16570836
 ] 

Zian Chen commented on YARN-7417:
-

Updated patch 003 to address the findbugs issue.

> re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to 
> remove duplicate codes
> 
>
> Key: YARN-7417
> URL: https://issues.apache.org/jira/browse/YARN-7417
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Xuan Gong
>Assignee: Zian Chen
>Priority: Major
> Attachments: YARN-7417.001.patch, YARN-7417.002.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7417) re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to remove duplicate codes

2018-08-06 Thread Zian Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570838#comment-16570838
 ] 

Zian Chen commented on YARN-7417:
-

[~sunilg], could you help review the patch? Thanks

> re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to 
> remove duplicate codes
> 
>
> Key: YARN-7417
> URL: https://issues.apache.org/jira/browse/YARN-7417
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Xuan Gong
>Assignee: Zian Chen
>Priority: Major
> Attachments: YARN-7417.001.patch, YARN-7417.002.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7417) re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to remove duplicate codes

2018-08-06 Thread Zian Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zian Chen updated YARN-7417:

Attachment: YARN-7417.002.patch

> re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to 
> remove duplicate codes
> 
>
> Key: YARN-7417
> URL: https://issues.apache.org/jira/browse/YARN-7417
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Xuan Gong
>Assignee: Zian Chen
>Priority: Major
> Attachments: YARN-7417.001.patch, YARN-7417.002.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6972) Adding RM ClusterId in AppInfo

2018-08-06 Thread Tanuj Nayak (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-6972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tanuj Nayak updated YARN-6972:
--
Attachment: YARN-6972.013.patch

> Adding RM ClusterId in AppInfo
> --
>
> Key: YARN-6972
> URL: https://issues.apache.org/jira/browse/YARN-6972
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Giovanni Matteo Fumarola
>Assignee: Tanuj Nayak
>Priority: Major
> Attachments: YARN-6972.001.patch, YARN-6972.002.patch, 
> YARN-6972.003.patch, YARN-6972.004.patch, YARN-6972.005.patch, 
> YARN-6972.006.patch, YARN-6972.007.patch, YARN-6972.008.patch, 
> YARN-6972.009.patch, YARN-6972.010.patch, YARN-6972.011.patch, 
> YARN-6972.012.patch, YARN-6972.013.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8609) NM oom because of large container statuses

2018-08-06 Thread Jason Lowe (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570805#comment-16570805
 ] 

Jason Lowe commented on YARN-8609:
--

bq. As far as I know, there are two kinds of diagnostics info, one is fixed 
string, such as "Container is killed before being launched.\n", the other is 
exception message which may be very large, so I think we should just truncate 
exception message rather than the entire string made by for loop.

There should be only one way to store/update a container's diagnostics for 
recovery, and that's NMStateStoreService#storeContainerDiagnostics.  That 
method does not append but replaces the diagnostics.  The only call to that 
method is ContainerImpl#addDiagnostics, which after YARN-3998 trims the 
diagnostics to the maximum configured length, keeping the most recently added 
characters.  The for loop is just for adding all the messages, since it's 
implemented with variable arguments.  The most memory this method could take is 
diagnosticsMaxSize + size_of_new_diagnostics, which is then truncated to 
diagnosticsMaxSize at the end.  It will not persist, either in memory or in the 
state store, diagnostics beyond diagnosticsMaxSize.
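For illustration, a minimal sketch of that append-and-truncate behavior, keeping only the most recently added characters up to a configured limit (a sketch only; the actual logic is in {{ContainerImpl#addDiagnostics}}):

{code:java}
/** Illustrative sketch of bounded diagnostics, as described above. */
public class DiagnosticsTruncationSketch {
  private final StringBuilder diagnostics = new StringBuilder();
  private final int diagnosticsMaxSize;   // configured maximum length in chars

  DiagnosticsTruncationSketch(int diagnosticsMaxSize) {
    this.diagnosticsMaxSize = diagnosticsMaxSize;
  }

  /** Appends all messages, then trims from the front so only the tail survives. */
  void addDiagnostics(String... messages) {
    for (String message : messages) {
      diagnostics.append(message);
    }
    int overflow = diagnostics.length() - diagnosticsMaxSize;
    if (overflow > 0) {
      diagnostics.delete(0, overflow);   // keep the most recently added characters
    }
    // At this point at most diagnosticsMaxSize characters would be stored.
  }
}
{code}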

If your build does not include YARN-3998, then it appears the necessary changes 
are already covered by that JIRA.  It certainly looks as if that JIRA should 
have addressed your issue if the limit is configured to the default or another 
reasonable value.  Are you running on a version that contains that change?  
If so, then I'm wondering how you were able to get a 27 MB diagnostic message 
into the state store.


> NM oom because of large container statuses
> --
>
> Key: YARN-8609
> URL: https://issues.apache.org/jira/browse/YARN-8609
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Xianghao Lu
>Priority: Major
> Attachments: YARN-8609.001.patch, contain_status.jpg, oom.jpeg
>
>
> Sometimes the NodeManager sends large container statuses to the ResourceManager 
> when it starts with recovery enabled; as a result, the NodeManager fails to 
> start because of an OOM.
>  In my case, the container statuses payload is 135 MB and contains 11 
> container statuses. I found that the diagnostics of 5 containers are very 
> large (27 MB), so I truncate the container diagnostics in the patch.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8626) Create LocalPolicyManager that sends all the requests to the home subcluster

2018-08-06 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570795#comment-16570795
 ] 

genericqa commented on YARN-8626:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 28m 
 9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  0s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 54s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
21s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 62m 28s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8626 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12934536/YARN-8626.006.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 9fe4ff0ff547 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / ca20e0d |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/21522/testReport/ |
| Max. process+thread count | 324 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/21522/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Create LocalPolicyManager that sends all the requests to the home subcluster

[jira] [Updated] (YARN-8160) Yarn Service Upgrade: Support upgrade of service that use docker containers

2018-08-06 Thread Chandni Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandni Singh updated YARN-8160:

Attachment: container_e02_1533231998644_0009_01_03.nm.log

> Yarn Service Upgrade: Support upgrade of service that use docker containers 
> 
>
> Key: YARN-8160
> URL: https://issues.apache.org/jira/browse/YARN-8160
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
>  Labels: Docker
> Attachments: container_e02_1533231998644_0009_01_03.nm.log
>
>
> Ability to upgrade dockerized YARN native services.
> Ref: YARN-5637
> *Background*
> Container upgrade is supported by the NM via the {{reInitializeContainer}} API. 
> {{reInitializeContainer}} does *NOT* change the ContainerId of the upgraded 
> container.
> The NM performs the following steps during {{reInitializeContainer}}:
> - kills the existing process
> - cleans up the container
> - launches another container with the new {{ContainerLaunchContext}}
> NOTE: {{ContainerLaunchContext}} holds all the information needed to 
> upgrade the container.
> With {{reInitializeContainer}}, the following does *NOT* change:
> - container ID. It is not created by the NM; it is provided to it, and the RM 
> does not create another container allocation.
> - {{localizedResources}}: these stay the same if the upgrade does *NOT* 
> require additional resources, IIUC.
>  
> The following changes with {{reInitializeContainer}}:
> - the working directory of the upgraded container changes. It is *NOT* a 
> relaunch. 
> *Changes required in the case of Docker containers*
> - {{reInitializeContainer}} does not seem to work with Docker containers. 
> Investigate and fix this.
> - [Future change] Add an additional API to the NM to pull the images, and 
> modify {{reInitializeContainer}} to trigger the Docker container launch 
> without pulling the image first, based on a flag.
> -- When the service upgrade is initialized, we can provide the user with 
> an option to just pull the images on the NMs.
> -- When a component instance is upgraded, it calls 
> {{reInitializeContainer}} with the pull-image flag set to false, since the NM 
> will have already pulled the images.
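For context, a minimal sketch of how an AM-side client could drive such an upgrade, assuming the {{NMClient}} re-initialization methods added under YARN-5637 ({{reInitializeContainer}}, {{commitLastReInitialization}}); the image pre-pull step is the hypothetical future NM API described above:

{code:java}
import java.util.List;
import java.util.Map;
import org.apache.hadoop.yarn.api.records.ContainerId;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.api.records.LocalResource;
import org.apache.hadoop.yarn.client.api.NMClient;

public class UpgradeSketch {
  /**
   * Re-initializes a running container in place. The ContainerId stays the
   * same; only the launch context (command, env, resources) is replaced.
   */
  static void upgrade(NMClient nmClient, ContainerId containerId,
      Map<String, LocalResource> localResources, Map<String, String> env,
      List<String> upgradedCommands) throws Exception {
    ContainerLaunchContext upgradedCtx = ContainerLaunchContext.newInstance(
        localResources, env, upgradedCommands, null, null, null);

    // Hypothetical future step proposed above: pre-pull the Docker image on
    // the NMs so re-initialization can skip the pull (pull-image flag = false).
    // pullImagesOnNodeManagers(...);

    // Kill + cleanup + relaunch happens inside the NM; autoCommit=false keeps
    // the old context so a rollback of the re-initialization stays possible.
    nmClient.reInitializeContainer(containerId, upgradedCtx, false);
    nmClient.commitLastReInitialization(containerId);
  }
}
{code}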



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-8160) Yarn Service Upgrade: Support upgrade of service that use docker containers

2018-08-06 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570774#comment-16570774
 ] 

Chandni Singh edited comment on YARN-8160 at 8/6/18 9:07 PM:
-

Attached are the logs of container 3, which fails to re-initialize. When it is 
re-initialized, the container is stopped and cleaned up. This causes the 
container to exit, but here it exits with code {{255}} instead of 
{{FORCE_KILLED}} or {{TERMINATED}}.

Since the container exits with a failure code ({{255}}), the status of the 
container in the NM changes from {{REINITIALIZING_AWAITING_KILL}} to 
{{EXITED_WITH_FAILURE}}.

Below are the relevant log statements:

1. Reinit of the container is triggered
{code:java}
 ctr005.log:2018-08-02 22:30:41,100 DEBUG container.ContainerImpl 
(ContainerImpl.java:handle(2080)) - Processing 
container_e02_1533231998644_0009_01_03 of type REINITIALIZE_CONTAINER

ctr005.log:2018-08-02 22:30:41,101 INFO container.ContainerImpl 
(ContainerImpl.java:handle(2093)) - Container 
container_e02_1533231998644_0009_01_03 transitioned from RUNNING to 
REINITIALIZING_AWAITING_KIL
{code}
2. Reinit triggers cleanup of the container
{code:java}
ctr005.log:2018-08-02 22:30:41,102 INFO launcher.ContainerLaunch 
(ContainerLaunch.java:cleanupContainer(734)) - Cleaning up container 
container_e02_1533231998644_0009_01_03
ctr005.log:2018-08-02 22:30:41,102 DEBUG recovery.NMLeveldbStateStoreService 
(NMLeveldbStateStoreService.java:storeContainerKilled(555)) - 
storeContainerKilled: containerId=container_e02_1533231998644_0009_01_03
ctr005.log:2018-08-02 22:30:41,102 DEBUG launcher.ContainerLaunch 
(ContainerLaunch.java:cleanupContainer(752)) - Marking container 
container_e02_1533231998644_0009_01_03 as inactive
ctr005.log:2018-08-02 22:30:41,102 DEBUG launcher.ContainerLaunch 
(ContainerLaunch.java:cleanupContainer(759)) - Getting pid for container 
container_e02_1533231998644_0009_01_03 to kill from pid file 
/tmp/hadoop/yarn/local/nmPrivate/application_1533231998644_0009/container_e02_1533231998644_0009_01_03/container_e02_1533231998644_0009_01_03.pid
ctr005.log:2018-08-02 22:30:41,102 DEBUG launcher.ContainerLaunch 
(ContainerLaunch.java:getContainerPid(1084)) - Accessing pid for container 
container_e02_1533231998644_0009_01_03 from pid file 
/tmp/hadoop/yarn/local/nmPrivate/application_1533231998644_0009/container_e02_1533231998644_0009_01_03/container_e02_1533231998644_0009_01_03.pid
ctr005.log:2018-08-02 22:30:41,102 DEBUG util.ProcessIdFileReader 
(ProcessIdFileReader.java:getProcessId(53)) - Accessing pid from pid file 
/tmp/hadoop/yarn/local/nmPrivate/application_1533231998644_0009/container_e02_1533231998644_0009_01_03/container_e02_1533231998644_0009_01_03.pid
ctr005.log:2018-08-02 22:30:41,102 DEBUG util.ProcessIdFileReader 
(ProcessIdFileReader.java:getProcessId(103)) - Got pid 364708 from path 
/tmp/hadoop/yarn/local/nmPrivate/application_1533231998644_0009/container_e02_1533231998644_0009_01_03/container_e02_1533231998644_0009_01_03.pid
ctr005.log:2018-08-02 22:30:41,102 DEBUG launcher.ContainerLaunch 
(ContainerLaunch.java:getContainerPid(1096)) - Got pid 364708 for container 
container_e02_1533231998644_0009_01_03
ctr005.log:2018-08-02 22:30:41,102 DEBUG launcher.ContainerLaunch 
(ContainerLaunch.java:signalProcess(919)) - Sending signal to pid 364708 as 
user root for container container_e02_1533231998644_0009_01_03
ctr005.log:2018-08-02 22:30:41,102 DEBUG docker.DockerCommandExecutor 
(DockerCommandExecutor.java:executeDockerCommand(89)) - Running docker command: 
inspect docker-command=inspect format=\{{.State.Status}} 
name=container_e02_1533231998644_0009_01_03
ctr005.log:2018-08-02 22:30:41,103 DEBUG privileged.PrivilegedOperationExecutor 
(PrivilegedOperationExecutor.java:getPrivilegedOperationExecutionCommand(119)) 
- Privileged Execution Command Array: 
[/hadoop_dist/hadoop-yarn/bin/container-executor, --inspect-docker-container, 
--format=\{{.State.Status}}, container_e02_1533231998644_0009_01_03]
ctr005.log:2018-08-02 22:30:41,129 DEBUG privileged.PrivilegedOperationExecutor 
(PrivilegedOperationExecutor.java:executePrivilegedOperation(155)) - 
[/hadoop_dist/hadoop-yarn/bin/container-executor, --inspect-docker-container, 
--format=\{{.State.Status}}, container_e02_1533231998644_0009_01_03]
ctr005.log:2018-08-02 22:30:41,130 DEBUG docker.DockerCommandExecutor 
(DockerCommandExecutor.java:getContainerStatus(154)) - Container Status: 
running ContainerId: container_e02_1533231998644_0009_01_03
ctr005.log:2018-08-02 22:30:41,131 DEBUG docker.DockerCommandExecutor 
(DockerCommandExecutor.java:executeDockerCommand(89)) - Running docker command: 
stop docker-command=stop name=container_e02_1533231998644_0009_01_03
{code}
3. After 10 seconds, the stop command sent to the executor completes and the 
container is 

[jira] [Commented] (YARN-7089) Mark the log-aggregation-controller APIs as public

2018-08-06 Thread Zian Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570780#comment-16570780
 ] 

Zian Chen commented on YARN-7089:
-

Hi [~leftnoteasy], could you help review this patch? Thanks

> Mark the log-aggregation-controller APIs as public
> --
>
> Key: YARN-7089
> URL: https://issues.apache.org/jira/browse/YARN-7089
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Xuan Gong
>Assignee: Zian Chen
>Priority: Major
> Attachments: YARN-7089.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8160) Yarn Service Upgrade: Support upgrade of service that use docker containers

2018-08-06 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570774#comment-16570774
 ] 

Chandni Singh commented on YARN-8160:
-

Attached are the logs of ctr005, which fails to re-initialize. When it is 
re-initialized, the container is stopped and cleaned up. This causes the 
container to exit, but here it exits with code {{255}} instead of 
{{FORCE_KILLED}} or {{TERMINATED}}.

Since the container exits with a failure code ({{255}}), the status of the 
container in the NM changes from {{REINITIALIZING_AWAITING_KILL}} to 
{{EXITED_WITH_FAILURE}}.

Below are the relevant log statements:

1. Reinit of the container is triggered
{code}
 ctr005.log:2018-08-02 22:30:41,100 DEBUG container.ContainerImpl 
(ContainerImpl.java:handle(2080)) - Processing 
container_e02_1533231998644_0009_01_03 of type REINITIALIZE_CONTAINER

ctr005.log:2018-08-02 22:30:41,101 INFO container.ContainerImpl 
(ContainerImpl.java:handle(2093)) - Container 
container_e02_1533231998644_0009_01_03 transitioned from RUNNING to 
REINITIALIZING_AWAITING_KIL
{code}

2. Reinit triggers cleanup of the container
{code}
ctr005.log:2018-08-02 22:30:41,102 INFO launcher.ContainerLaunch 
(ContainerLaunch.java:cleanupContainer(734)) - Cleaning up container 
container_e02_1533231998644_0009_01_03
ctr005.log:2018-08-02 22:30:41,102 DEBUG recovery.NMLeveldbStateStoreService 
(NMLeveldbStateStoreService.java:storeContainerKilled(555)) - 
storeContainerKilled: containerId=container_e02_1533231998644_0009_01_03
ctr005.log:2018-08-02 22:30:41,102 DEBUG launcher.ContainerLaunch 
(ContainerLaunch.java:cleanupContainer(752)) - Marking container 
container_e02_1533231998644_0009_01_03 as inactive
ctr005.log:2018-08-02 22:30:41,102 DEBUG launcher.ContainerLaunch 
(ContainerLaunch.java:cleanupContainer(759)) - Getting pid for container 
container_e02_1533231998644_0009_01_03 to kill from pid file 
/tmp/hadoop/yarn/local/nmPrivate/application_1533231998644_0009/container_e02_1533231998644_0009_01_03/container_e02_1533231998644_0009_01_03.pid
ctr005.log:2018-08-02 22:30:41,102 DEBUG launcher.ContainerLaunch 
(ContainerLaunch.java:getContainerPid(1084)) - Accessing pid for container 
container_e02_1533231998644_0009_01_03 from pid file 
/tmp/hadoop/yarn/local/nmPrivate/application_1533231998644_0009/container_e02_1533231998644_0009_01_03/container_e02_1533231998644_0009_01_03.pid
ctr005.log:2018-08-02 22:30:41,102 DEBUG util.ProcessIdFileReader 
(ProcessIdFileReader.java:getProcessId(53)) - Accessing pid from pid file 
/tmp/hadoop/yarn/local/nmPrivate/application_1533231998644_0009/container_e02_1533231998644_0009_01_03/container_e02_1533231998644_0009_01_03.pid
ctr005.log:2018-08-02 22:30:41,102 DEBUG util.ProcessIdFileReader 
(ProcessIdFileReader.java:getProcessId(103)) - Got pid 364708 from path 
/tmp/hadoop/yarn/local/nmPrivate/application_1533231998644_0009/container_e02_1533231998644_0009_01_03/container_e02_1533231998644_0009_01_03.pid
ctr005.log:2018-08-02 22:30:41,102 DEBUG launcher.ContainerLaunch 
(ContainerLaunch.java:getContainerPid(1096)) - Got pid 364708 for container 
container_e02_1533231998644_0009_01_03
ctr005.log:2018-08-02 22:30:41,102 DEBUG launcher.ContainerLaunch 
(ContainerLaunch.java:signalProcess(919)) - Sending signal to pid 364708 as 
user root for container container_e02_1533231998644_0009_01_03
ctr005.log:2018-08-02 22:30:41,102 DEBUG docker.DockerCommandExecutor 
(DockerCommandExecutor.java:executeDockerCommand(89)) - Running docker command: 
inspect docker-command=inspect format=\{{.State.Status}} 
name=container_e02_1533231998644_0009_01_03
ctr005.log:2018-08-02 22:30:41,103 DEBUG privileged.PrivilegedOperationExecutor 
(PrivilegedOperationExecutor.java:getPrivilegedOperationExecutionCommand(119)) 
- Privileged Execution Command Array: 
[/hadoop_dist/hadoop-yarn/bin/container-executor, --inspect-docker-container, 
--format=\{{.State.Status}}, container_e02_1533231998644_0009_01_03]
ctr005.log:2018-08-02 22:30:41,129 DEBUG privileged.PrivilegedOperationExecutor 
(PrivilegedOperationExecutor.java:executePrivilegedOperation(155)) - 
[/hadoop_dist/hadoop-yarn/bin/container-executor, --inspect-docker-container, 
--format=\{{.State.Status}}, container_e02_1533231998644_0009_01_03]
ctr005.log:2018-08-02 22:30:41,130 DEBUG docker.DockerCommandExecutor 
(DockerCommandExecutor.java:getContainerStatus(154)) - Container Status: 
running ContainerId: container_e02_1533231998644_0009_01_03
ctr005.log:2018-08-02 22:30:41,131 DEBUG docker.DockerCommandExecutor 
(DockerCommandExecutor.java:executeDockerCommand(89)) - Running docker command: 
stop docker-command=stop name=container_e02_1533231998644_0009_01_03
{code}

3. After 10 seconds, the stop command sent to the executor completes and the 
container is removed
{code}
ctr005.log:2018-08-02 22:30:51,251 DEBUG 

[jira] [Commented] (YARN-8617) Aggregated Application Logs accumulates for long running jobs

2018-08-06 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570773#comment-16570773
 ] 

Prabhu Joseph commented on YARN-8617:
-

Thanks [~bibinchundatt]. Still analyzing the issue in our cluster; will reopen 
this later if needed.

> Aggregated Application Logs accumulates for long running jobs
> -
>
> Key: YARN-8617
> URL: https://issues.apache.org/jira/browse/YARN-8617
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: log-aggregation
>Affects Versions: 2.7.4
>Reporter: Prabhu Joseph
>Priority: Major
>
> Currently, AggregationDeletionService deletes older aggregated log files only 
> once they are complete. This causes logs to accumulate for long-running jobs 
> like LLAP and Spark Streaming.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8448) AM HTTPS Support

2018-08-06 Thread Robert Kanter (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated YARN-8448:

Attachment: YARN-8448.005.patch

> AM HTTPS Support
> 
>
> Key: YARN-8448
> URL: https://issues.apache.org/jira/browse/YARN-8448
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Major
> Attachments: YARN-8448.001.patch, YARN-8448.002.patch, 
> YARN-8448.003.patch, YARN-8448.004.patch, YARN-8448.005.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8448) AM HTTPS Support

2018-08-06 Thread Robert Kanter (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570703#comment-16570703
 ] 

Robert Kanter commented on YARN-8448:
-

The 005 patch:
- Fixes the unit tests.  It turns out we do still need the test-scope 
bouncycastle dependencies.
- Adds some useful log messages to {{ProxyCA}}.

> AM HTTPS Support
> 
>
> Key: YARN-8448
> URL: https://issues.apache.org/jira/browse/YARN-8448
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Major
> Attachments: YARN-8448.001.patch, YARN-8448.002.patch, 
> YARN-8448.003.patch, YARN-8448.004.patch, YARN-8448.005.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7417) re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to remove duplicate codes

2018-08-06 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570687#comment-16570687
 ] 

genericqa commented on YARN-7417:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
24s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
 1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  3s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 29s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
27s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
5s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 63m 15s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common |
|  |  Found reliance on default encoding in 
org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.IndexedFileAggregatedLogsBlock.processContainerLog(HtmlBlock$Block,
 long[], InputStream, int, byte[]):in 
org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.IndexedFileAggregatedLogsBlock.processContainerLog(HtmlBlock$Block,
 long[], InputStream, int, byte[]): new String(byte[], int, int)  At 
IndexedFileAggregatedLogsBlock.java:[line 265] |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-7417 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12934526/YARN-7417.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux b9be7d36c7c0 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Updated] (YARN-8626) Create LocalPolicyManager that sends all the requests to the home subcluster

2018-08-06 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/YARN-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated YARN-8626:
--
Attachment: YARN-8626.006.patch

> Create LocalPolicyManager that sends all the requests to the home subcluster
> 
>
> Key: YARN-8626
> URL: https://issues.apache.org/jira/browse/YARN-8626
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Giovanni Matteo Fumarola
>Assignee: Íñigo Goiri
>Priority: Minor
> Fix For: 3.2.0
>
> Attachments: YARN-8626.000.patch, YARN-8626.001.patch, 
> YARN-8626.002.patch, YARN-8626.003.patch, YARN-8626.004.patch, 
> YARN-8626.005.patch, YARN-8626.006.patch
>
>
> To have the same behavior as a regular non-federated deployment, one should 
> be able to submit jobs to the local RM and get the job constrained to that 
> subcluster.
> This JIRA creates an AMRMProxyPolicy that sends resources to the home 
> subcluster and mimics the behavior of a non-federated cluster.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8626) Create LocalPolicyManager that sends all the requests to the home subcluster

2018-08-06 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570672#comment-16570672
 ] 

genericqa commented on YARN-8626:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
34s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 
 3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  7s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 16s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
21s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 56m 12s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8626 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12934523/YARN-8626.005.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 979afaaf279d 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 
08:53:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / ca20e0d |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/21519/testReport/ |
| Max. process+thread count | 395 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/21519/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Create LocalPolicyManager that sends all the requests to the home subcluster
> 

[jira] [Commented] (YARN-8626) Create LocalPolicyManager that sends all the requests to the home subcluster

2018-08-06 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570619#comment-16570619
 ] 

genericqa commented on YARN-8626:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
24s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 58s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 40s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
13s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 60m 55s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8626 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12934519/YARN-8626.004.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux d2de6e005859 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / ca20e0d |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/21518/testReport/ |
| Max. process+thread count | 303 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/21518/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Create LocalPolicyManager that sends all the requests to the home subcluster

[jira] [Commented] (YARN-8626) Create LocalPolicyManager that sends all the requests to the home subcluster

2018-08-06 Thread Subru Krishnan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570604#comment-16570604
 ] 

Subru Krishnan commented on YARN-8626:
--

Thanks [~elgoiri] for the patch. I looked at it; please find my comments below:
 * We generally use _home_ and not _local_. In this case, I suggest replacing 
_local_ with either _home_ or _reflexive_.
 * If possible, can you remove the empty *notifyOfResponse* impls from the 
stateless AMRMProxyPolicies, as those are now redundant?
 * In the {{LocalAMRMProxyPolicy}}, add a check to validate that the _home SC_ 
is indeed active (and a corresponding test); a possible shape of this check is 
sketched after this list.
 * The *FederationPolicyInitializationContext* will not have the _home SC_ set 
in {{LocalRouterPolicy}}, as it's the responsibility of the router to do so 
(chicken-or-egg situation :)). If you have capacity reserved, the ideal 
approach would be to query the _StateStore_ to figure out which SC has capacity 
and select that as the _home SC_. If you don't have capacity reserved, you 
should use *UniformRandomRouterPolicy* directly.
 * Add a test for {{LocalRouterPolicy}} if it's still required based on the 
above comment.
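
For illustration only, a minimal sketch of the kind of home-subcluster check 
suggested above; the class, method, and exception names are assumptions for the 
example and are not the actual {{LocalAMRMProxyPolicy}} or federation API:
{code:java}
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hedged sketch only: check that the configured home subcluster is among the
// currently active subclusters before routing any requests to it.
public final class HomeSubClusterCheckSketch {

  static void validateHomeSubCluster(String homeSubClusterId,
      Set<String> activeSubClusters) {
    if (homeSubClusterId == null
        || !activeSubClusters.contains(homeSubClusterId)) {
      throw new IllegalStateException("Home subcluster " + homeSubClusterId
          + " is not active; cannot route requests to it");
    }
  }

  public static void main(String[] args) {
    Set<String> active = new HashSet<>(Arrays.asList("SC-1", "SC-2"));
    validateHomeSubCluster("SC-1", active); // passes
    // validateHomeSubCluster("SC-3", active); // would throw
  }
}
{code}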

> Create LocalPolicyManager that sends all the requests to the home subcluster
> 
>
> Key: YARN-8626
> URL: https://issues.apache.org/jira/browse/YARN-8626
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Giovanni Matteo Fumarola
>Assignee: Íñigo Goiri
>Priority: Minor
> Fix For: 3.2.0
>
> Attachments: YARN-8626.000.patch, YARN-8626.001.patch, 
> YARN-8626.002.patch, YARN-8626.003.patch, YARN-8626.004.patch, 
> YARN-8626.005.patch
>
>
> To have the same behavior as a regular non-federated deployment, one should 
> be able to submit jobs to the local RM and get the job constrained to that 
> subcluster.
> This JIRA creates an AMRMProxyPolicy that sends resources to the home 
> subcluster and mimics the behavior of a non-federated cluster.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7417) re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to remove duplicate codes

2018-08-06 Thread Zian Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zian Chen updated YARN-7417:

Attachment: YARN-7417.001.patch

> re-factory IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to 
> remove duplicate codes
> 
>
> Key: YARN-7417
> URL: https://issues.apache.org/jira/browse/YARN-7417
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Xuan Gong
>Assignee: Zian Chen
>Priority: Major
> Attachments: YARN-7417.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8626) Create LocalPolicyManager that sends all the requests to the home subcluster

2018-08-06 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/YARN-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated YARN-8626:
--
Attachment: YARN-8626.005.patch

> Create LocalPolicyManager that sends all the requests to the home subcluster
> 
>
> Key: YARN-8626
> URL: https://issues.apache.org/jira/browse/YARN-8626
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Giovanni Matteo Fumarola
>Assignee: Íñigo Goiri
>Priority: Minor
> Fix For: 3.2.0
>
> Attachments: YARN-8626.000.patch, YARN-8626.001.patch, 
> YARN-8626.002.patch, YARN-8626.003.patch, YARN-8626.004.patch, 
> YARN-8626.005.patch
>
>
> To have the same behavior as a regular non-federated deployment, one should 
> be able to submit jobs to the local RM and get the job constrained to that 
> subcluster.
> This JIRA creates an AMRMProxyPolicy that sends resources to the home 
> subcluster and mimics the behavior of a non-federated cluster.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8520) Document best practice for user management

2018-08-06 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570568#comment-16570568
 ] 

genericqa commented on YARN-8520:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
24s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
37m 59s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 42s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 52m 13s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8520 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12934514/YARN-8520.003.patch |
| Optional Tests |  asflicense  mvnsite  |
| uname | Linux 3f4c38d2188c 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / ca20e0d |
| maven | version: Apache Maven 3.3.9 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/21517/artifact/out/whitespace-eol.txt
 |
| Max. process+thread count | 334 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/21517/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Document best practice for user management
> --
>
> Key: YARN-8520
> URL: https://issues.apache.org/jira/browse/YARN-8520
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: documentation, yarn
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8520.001.patch, YARN-8520.002.patch, 
> YARN-8520.003.patch
>
>
> A Docker container must have a username and groups consistent with the host 
> operating system when external mount points are exposed to the container. This 
> prevents malicious or unauthorized impersonation from occurring. This task is to 
> document the best practice to ensure user and group membership is consistent 
> across Docker containers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8626) Create LocalPolicyManager that sends all the requests to the home subcluster

2018-08-06 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/YARN-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated YARN-8626:
--
Attachment: YARN-8626.004.patch

> Create LocalPolicyManager that sends all the requests to the home subcluster
> 
>
> Key: YARN-8626
> URL: https://issues.apache.org/jira/browse/YARN-8626
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Giovanni Matteo Fumarola
>Assignee: Íñigo Goiri
>Priority: Minor
> Fix For: 3.2.0
>
> Attachments: YARN-8626.000.patch, YARN-8626.001.patch, 
> YARN-8626.002.patch, YARN-8626.003.patch, YARN-8626.004.patch
>
>
> To have the same behavior as a regular non-federated deployment, one should 
> be able to submit jobs to the local RM and get the job constrained to that 
> subcluster.
> This JIRA creates an AMRMProxyPolicy that sends resources to the home 
> subcluster and mimics the behavior of a non-federated cluster.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8520) Document best practice for user management

2018-08-06 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-8520:

Attachment: YARN-8520.003.patch

> Document best practice for user management
> --
>
> Key: YARN-8520
> URL: https://issues.apache.org/jira/browse/YARN-8520
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: documentation, yarn
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8520.001.patch, YARN-8520.002.patch, 
> YARN-8520.003.patch
>
>
> A Docker container must have a username and groups consistent with the host 
> operating system when external mount points are exposed to the container. This 
> prevents malicious or unauthorized impersonation from occurring. This task is to 
> document the best practice to ensure user and group membership is consistent 
> across Docker containers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8624) Cleanup ENTRYPOINT documentation

2018-08-06 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570491#comment-16570491
 ] 

Hudson commented on YARN-8624:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14709 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14709/])
YARN-8624. Updated verbiage around entry point support.(eyang: rev 
ca20e0d7e9767a7362dddfea8ec19548947d3fd7)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/DockerContainers.md


> Cleanup ENTRYPOINT documentation
> 
>
> Key: YARN-8624
> URL: https://issues.apache.org/jira/browse/YARN-8624
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Craig Condit
>Assignee: Craig Condit
>Priority: Minor
>  Labels: Docker
> Fix For: 3.2.0, 3.1.2
>
> Attachments: YARN-8624.001.patch
>
>
> With the changes to allow disabling the YARN launch script in favor of 
> running whatever is specified in the image, I think the following needs to be 
> removed/updated. There is already a section called Docker Container ENTRYPOINT 
> support. I think we can clean this up a bit to make it easier to understand 
> this feature. This will likely require some testing to fully describe the feature.
> {code:java}
> If a Docker image has a command set, the behavior will depend on whether the 
> YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE is set to true. If so, the 
> command will be overridden when LCE launches the image with YARN's container 
> launch script. 
>  If a Docker image has an entry point set, the entry point will be honored, 
> but the default command may be overridden, as just mentioned above. Unless 
> the entry point is something similar to sh -c or 
> YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE is set to true, the net 
> result will likely be undesirable. Because the YARN container launch script 
> is required to correctly launch the YARN task, use of entry points is 
> discouraged. {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8627) EntityGroupFSTimelineStore hdfs done directory keeps on accumulating

2018-08-06 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570274#comment-16570274
 ] 

genericqa commented on YARN-8627:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 48s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 21s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
17s{color} | {color:green} hadoop-yarn-server-timeline-pluginstorage in the 
patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 58m 33s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8627 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12934478/YARN-8627.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 1c103bfa9c2d 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / bcfc985 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/21516/testReport/ |
| Max. process+thread count | 301 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timeline-pluginstorage
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timeline-pluginstorage
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/21516/console |
| Powered by | Apache 

[jira] [Commented] (YARN-8617) Aggregated Application Logs accumulates for long running jobs

2018-08-06 Thread Bibin A Chundatt (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570205#comment-16570205
 ] 

Bibin A Chundatt commented on YARN-8617:


[~Prabhu Joseph]

For IndexFileFormat, the rolling is based on size:

{code}
  @Private
  @VisibleForTesting
  public long getRollOverLogMaxSize(Configuration conf) {
    return 1024L * 1024 * 1024 * conf.getInt(
        LOG_ROLL_OVER_MAX_FILE_SIZE_GB, 10);
  }
{code}
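
For illustration, a self-contained sketch of the size-based roll-over decision 
implied by the snippet above, using the 10 GB default; the class and method 
names are made up for the example and are not the Hadoop code:
{code:java}
// Hedged sketch: roll over to a new aggregated log file once the current one
// reaches the configured maximum size (default 10 GB, as shown above).
public final class RollOverSizeSketch {

  static final long DEFAULT_ROLL_OVER_MAX_GB = 10L;

  static boolean shouldRollOver(long currentAggregatedFileSizeBytes) {
    long maxBytes = 1024L * 1024 * 1024 * DEFAULT_ROLL_OVER_MAX_GB;
    return currentAggregatedFileSizeBytes >= maxBytes;
  }

  public static void main(String[] args) {
    System.out.println(shouldRollOver(11L * 1024 * 1024 * 1024)); // true
    System.out.println(shouldRollOver(1L * 1024 * 1024 * 1024));  // false
  }
}
{code}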

> Aggregated Application Logs accumulates for long running jobs
> -
>
> Key: YARN-8617
> URL: https://issues.apache.org/jira/browse/YARN-8617
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: log-aggregation
>Affects Versions: 2.7.4
>Reporter: Prabhu Joseph
>Priority: Major
>
> Currently AggregationDeletionService will delete older aggregated log files 
> once when they are complete. This will cause logs to accumulate for Long 
> Running Jobs like Llap, Spark Streaming.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8628) [UI2] Few duplicated or inconsistent information displayed in UI2

2018-08-06 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570187#comment-16570187
 ] 

genericqa commented on YARN-8628:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
35m 12s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 21s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 47m 39s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8628 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12934470/YARN-8628.001.patch |
| Optional Tests |  asflicense  shadedclient  |
| uname | Linux 80fde3837510 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 
08:53:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / bcfc985 |
| maven | version: Apache Maven 3.3.9 |
| Max. process+thread count | 408 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/21515/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> [UI2] Few duplicated or inconsistent information displayed in UI2
> -
>
> Key: YARN-8628
> URL: https://issues.apache.org/jira/browse/YARN-8628
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Akhil PB
>Assignee: Akhil PB
>Priority: Major
> Attachments: YARN-8628.001.patch
>
>
> 1. Irrespective of whichever component-instance that we click on, it always 
> lands on the component-instance detail page which has the first container. It 
> should take us to the component instance page of the corresponding container.
> 2. Exit Status Code in Component Instance Information detail page says 0, but 
> says N/A in Containers Grid View page.
> 3. Host URL and IP Address are N/A in Component Instance Information detail 
> page, but have valid values in Containers Grid View page.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8627) EntityGroupFSTimelineStore hdfs done directory keeps on accumulating

2018-08-06 Thread Tarun Parimi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tarun Parimi updated YARN-8627:
---
Attachment: YARN-8627.001.patch

> EntityGroupFSTimelineStore hdfs done directory keeps on accumulating
> 
>
> Key: YARN-8627
> URL: https://issues.apache.org/jira/browse/YARN-8627
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.8.0
>Reporter: Tarun Parimi
>Assignee: Tarun Parimi
>Priority: Major
> Attachments: YARN-8627.001.patch
>
>
> The EntityLogCleaner thread exits with the following ERROR every time it 
> runs.  
> {code:java}
> 2018-07-18 19:59:39,837 INFO timeline.EntityGroupFSTimelineStore 
> (EntityGroupFSTimelineStore.java:cleanLogs(462)) - Deleting 
> hdfs://namenode/ats/done/1499684568068//018/application_1499684568068_18268
> 2018-07-18 19:59:39,844 INFO timeline.EntityGroupFSTimelineStore 
> (EntityGroupFSTimelineStore.java:cleanLogs(462)) - Deleting 
> hdfs://namenode/ats/done/1499684568068//018/application_1499684568068_18270
> 2018-07-18 19:59:39,848 ERROR timeline.EntityGroupFSTimelineStore 
> (EntityGroupFSTimelineStore.java:run(899)) - Error cleaning files  
> java.io.FileNotFoundException: File 
> hdfs://namenode/ats/done/1499684568068//018/application_1499684568068_18270
>  does not exist.  at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1062)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1069)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1040)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1019)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1015)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatusIterator(DistributedFileSystem.java:1015)
>   at 
> org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.shouldCleanAppLogDir(EntityGroupFSTimelineStore.java:480)
>  
> {code}
>  
>  Each time the thread gets scheduled, it is a different folder encountering 
> the error. As a result, the thread is not able to clean all the old done 
> directories, since it stops after this error. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8628) [UI2] Few duplicated or inconsistent information displayed in UI2

2018-08-06 Thread Akhil PB (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akhil PB updated YARN-8628:
---
Description: 
1. Irrespective of whichever component-instance that we click on, it always 
lands on the component-instance detail page which has the first container. It 
should take us to the component instance page of the corresponding container.

2. Exit Status Code in Component Instance Information detail page says 0, but 
says N/A in Containers Grid View page.

3. Host URL and IP Address are N/A in Component Instance Information detail 
page, but have valid values in Containers Grid View page.

  was:
1. Irrespective of whichever component-instance (ping-0 in the Issue_1_* images 
in this case) that we click on, it always lands on the component-instance 
detail page which has the first container (note the id and timestamp) which is 
container_e19_1525121797508_0004_01_02 in the Issue_1_* images. It should 
take us to the component instance page of the corresponding container (which is 
container_e19_1525121797508_0004_01_11 in this case). Check the Issue_1_* 
images for details.

2. Exit Status Code in Component Instance Information detail page says 0, but 
says N/A in Containers Grid View page. Check the Issue_2_Exit_Status_* images 
for details.

3. Host URL and IP Address are N/A in Component Instance Information detail 
page, but has valid values in Containers Grid View page. Check the 
Issue_3_Host_IP_* images for details.


> [UI2] Few duplicated or inconsistent information displayed in UI2
> -
>
> Key: YARN-8628
> URL: https://issues.apache.org/jira/browse/YARN-8628
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Akhil PB
>Assignee: Akhil PB
>Priority: Major
>
> 1. Irrespective of whichever component-instance that we click on, it always 
> lands on the component-instance detail page which has the first container. It 
> should take us to the component instance page of the corresponding container.
> 2. Exit Status Code in Component Instance Information detail page says 0, but 
> says N/A in Containers Grid View page.
> 3. Host URL and IP Address are N/A in Component Instance Information detail 
> page, but have valid values in Containers Grid View page.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8628) [UI2] Few duplicated or inconsistent information displayed in UI2

2018-08-06 Thread Akhil PB (JIRA)
Akhil PB created YARN-8628:
--

 Summary: [UI2] Few duplicated or inconsistent information 
displayed in UI2
 Key: YARN-8628
 URL: https://issues.apache.org/jira/browse/YARN-8628
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn-ui-v2
Reporter: Akhil PB
Assignee: Akhil PB


1. Irrespective of whichever component-instance (ping-0 in the Issue_1_* images 
in this case) that we click on, it always lands on the component-instance 
detail page which has the first container (note the id and timestamp) which is 
container_e19_1525121797508_0004_01_02 in the Issue_1_* images. It should 
take us to the component instance page of the corresponding container (which is 
container_e19_1525121797508_0004_01_11 in this case). Check the Issue_1_* 
images for details.

2. Exit Status Code in Component Instance Information detail page says 0, but 
says N/A in Containers Grid View page. Check the Issue_2_Exit_Status_* images 
for details.

3. Host URL and IP Address are N/A in Component Instance Information detail 
page, but have valid values in Containers Grid View page. Check the 
Issue_3_Host_IP_* images for details.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4946) RM should not consider an application as COMPLETED when log aggregation is not in a terminal state

2018-08-06 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570060#comment-16570060
 ] 

genericqa commented on YARN-4946:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m  8s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 6 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 48s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 83m 33s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}136m  1s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMEmbeddedElector 
|
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-4946 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/1293/YARN-4946.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux fee50e32b6e5 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 
19:38:41 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / bcfc985 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/21514/artifact/out/whitespace-eol.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/21514/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/21514/testReport/ |
| Max. process+thread count | 930 

[jira] [Commented] (YARN-8617) Aggregated Application Logs accumulates for long running jobs

2018-08-06 Thread Bibin A Chundatt (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569983#comment-16569983
 ] 

Bibin A Chundatt commented on YARN-8617:


[~Prabhu Joseph]

IIUC, with LogAggregationService -> AggregatorImpl, every upload cycle produces 
a new file for each node, for example:

{code}
node3.openstack_45454_1533315091898
node3.openstack_45454_
node3.openstack_45454_
{code}

The {{node3.openstack_45454_1533315091898}} file will not be changed once 
uploaded, so AggregationDeletionService should delete the file after the log 
retention time.


{code}
    Set<Path> uploadedFilePathsInThisCycle =
        aggregator.doContainerLogAggregation(logAggregationFileController,
            appFinished, finishedContainers.contains(container));
...

    logControllerContext.setLogUploadTimeStamp(System.currentTimeMillis());

    try {
      this.logAggregationFileController.postWrite(logControllerContext);
      diagnosticMessage = "Log uploaded successfully for Application: "
          + appId + " in NodeManager: "
          + LogAggregationUtils.getNodeString(nodeId) + " at "
          + Times.format(logControllerContext.getLogUploadTimeStamp())
          + "\n";
{code}

{{this.logAggregationFileController.postWrite(logControllerContext);}} renames 
the file with the upload timestamp.
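
Given the per-cycle file naming above, a minimal sketch of the retention check 
implied by this comment, assuming the uploaded file name ends with the upload 
timestamp in milliseconds (as in node3.openstack_45454_1533315091898); the 
class and method names are illustrative and this is not the actual 
AggregationDeletionService code:
{code:java}
// Hedged sketch: a file is eligible for deletion once the upload timestamp
// encoded at the end of its name is older than the configured retention.
public final class LogRetentionCheckSketch {

  static boolean isOlderThanRetention(String fileName, long retainMillis,
      long nowMillis) {
    long uploadTime =
        Long.parseLong(fileName.substring(fileName.lastIndexOf('_') + 1));
    return nowMillis - uploadTime > retainMillis;
  }

  public static void main(String[] args) {
    // e.g. a file uploaded at 1533315091898 checked against a 30-day retention
    long retain = 30L * 24 * 60 * 60 * 1000;
    System.out.println(isOlderThanRetention(
        "node3.openstack_45454_1533315091898", retain,
        System.currentTimeMillis()));
  }
}
{code}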




> Aggregated Application Logs accumulates for long running jobs
> -
>
> Key: YARN-8617
> URL: https://issues.apache.org/jira/browse/YARN-8617
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: log-aggregation
>Affects Versions: 2.7.4
>Reporter: Prabhu Joseph
>Priority: Major
>
> Currently AggregationDeletionService will delete older aggregated log files 
> once when they are complete. This will cause logs to accumulate for Long 
> Running Jobs like Llap, Spark Streaming.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4946) RM should not consider an application as COMPLETED when log aggregation is not in a terminal state

2018-08-06 Thread Szilard Nemeth (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569917#comment-16569917
 ] 

Szilard Nemeth commented on YARN-4946:
--

Thanks [~rkanter] for your quick review!
1. Removed {{recordLogAggregationStartTime}}. For the other small methods, I 
think they definitely make {{FinalTransition#transition}} more readable, so I 
left them as is. Please tell me if you still have objections to this.
2. Good point; I didn't know log aggregation could delay the finish time of the 
application by minutes, so I removed the condition around the 
{{APP_COMPLETED}} event in {{FinalTransition}}.

So the majority of the code changes now are in 
{{RMAppManager#checkAppNumCompletedLimit}}.
Basically, for both the state store and the in-memory completed application 
checks, I modified the logic to compute the difference between the number of 
completed apps and the configured max, and then try to delete that many 
applications.
I skip removing an application only if the application has log aggregation 
enabled and the aggregation is not finished yet.

An example: let's suppose we have configured the max for the state store and 
in-memory as 2 and we have 10 apps completed.
{{RMAppManager#checkAppNumCompletedLimit}} will realize that it needs to remove 
8 apps, so it starts from the 0th index and tries to delete them sequentially.
If any of the apps has not finished its log aggregation then it won't be 
removed (the index is skipped), so from now on the configured max is not a hard 
limit. 
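
A minimal sketch of that removal loop, with a simplified App stand-in; this is 
only an illustration of the logic described above, not the actual RMAppManager 
code:
{code:java}
import java.util.Iterator;
import java.util.List;

// Hedged sketch: remove up to (completed - max) apps, oldest first, skipping
// any app whose log aggregation has not reached a terminal state yet, so the
// configured max becomes a soft limit.
public final class CompletedAppLimitSketch {

  interface App {
    boolean isLogAggregationEnabled();
    boolean isLogAggregationFinished();
  }

  static void checkAppNumCompletedLimit(List<App> completedApps, int max) {
    int toRemove = completedApps.size() - max;
    Iterator<App> it = completedApps.iterator();
    while (toRemove > 0 && it.hasNext()) {
      App app = it.next();
      if (app.isLogAggregationEnabled() && !app.isLogAggregationFinished()) {
        continue; // skip this index and try the next completed app
      }
      it.remove();
      toRemove--;
    }
  }
}
{code}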

Please check the testcase added specifically to test the above scenario: 
{{TestAppManager#testStateStoreAppLimitSomeAppsHaveNotFinishedLogAggregation}}
I also extended 2 testcases with checking the removed / completed application 
IDs, too.

Please also keep in mind that {{checkAppNumCompletedLimit}} is only invoked 
when an app becomes completed ({{APP_COMPLETED}} event dispatched from 
{{FinalTransition}}), so it can happen that we store more applications in the 
state-store and memory until another app finishes, but I can't think of any 
better solution currently.




> RM should not consider an application as COMPLETED when log aggregation is 
> not in a terminal state
> --
>
> Key: YARN-4946
> URL: https://issues.apache.org/jira/browse/YARN-4946
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: log-aggregation
>Affects Versions: 2.8.0
>Reporter: Robert Kanter
>Assignee: Szilard Nemeth
>Priority: Major
> Attachments: YARN-4946.001.patch, YARN-4946.002.patch, 
> YARN-4946.003.patch
>
>
> MAPREDUCE-6415 added a tool that combines the aggregated log files for each 
> Yarn App into a HAR file.  When run, it seeds the list by looking at the 
> aggregated logs directory, and then filters out ineligible apps.  One of the 
> criteria involves checking with the RM that an Application's log aggregation 
> status is not still running and has not failed.  When the RM "forgets" about 
> an older completed Application (e.g. RM failover, enough time has passed, 
> etc), the tool won't find the Application in the RM and will just assume that 
> its log aggregation succeeded, even if it actually failed or is still running.
> We can solve this problem by doing the following:
> The RM should not consider an app to be fully completed (and thus removed 
> from its history) until the aggregation status has reached a terminal state 
> (e.g. SUCCEEDED, FAILED, TIME_OUT).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4946) RM should not consider an application as COMPLETED when log aggregation is not in a terminal state

2018-08-06 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-4946:
-
Attachment: YARN-4946.003.patch

> RM should not consider an application as COMPLETED when log aggregation is 
> not in a terminal state
> --
>
> Key: YARN-4946
> URL: https://issues.apache.org/jira/browse/YARN-4946
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: log-aggregation
>Affects Versions: 2.8.0
>Reporter: Robert Kanter
>Assignee: Szilard Nemeth
>Priority: Major
> Attachments: YARN-4946.001.patch, YARN-4946.002.patch, 
> YARN-4946.003.patch
>
>
> MAPREDUCE-6415 added a tool that combines the aggregated log files for each 
> Yarn App into a HAR file.  When run, it seeds the list by looking at the 
> aggregated logs directory, and then filters out ineligible apps.  One of the 
> criteria involves checking with the RM that an Application's log aggregation 
> status is not still running and has not failed.  When the RM "forgets" about 
> an older completed Application (e.g. RM failover, enough time has passed, 
> etc), the tool won't find the Application in the RM and will just assume that 
> its log aggregation succeeded, even if it actually failed or is still running.
> We can solve this problem by doing the following:
> The RM should not consider an app to be fully completed (and thus removed 
> from its history) until the aggregation status has reached a terminal state 
> (e.g. SUCCEEDED, FAILED, TIME_OUT).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8627) EntityGroupFSTimelineStore hdfs done directory keeps on accumulating

2018-08-06 Thread Tarun Parimi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569873#comment-16569873
 ] 

Tarun Parimi commented on YARN-8627:


On further analysis, I found that this error occurs for application directories 
which themselves contain another application directory, such as 
/ats/done/1500089190015//017/application_1500089190015_17219/application_1500089190015_17219.
 In the audit logs I see that ATS tries to list the folder after deleting it, 
which causes the error.
{code:java}
19:35:45,944 INFO FSNamesystem.audit: allowed=true  
ugi=yarn/rm-u...@example.com (auth:KERBEROS)ip=/x.x.x.x cmd=listStatus  
src=/ats/done/1500089190015//017/application_1500089190015_17219
dst=nullperm=null   proto=rpc   
callerContext=yarn_ats_server_v1_5
19:35:45,945 INFO FSNamesystem.audit: allowed=true  
ugi=yarn/rm-u...@example.com (auth:KERBEROS)ip=/x.x.x.x cmd=listStatus  
src=/ats/done/1500089190015//017/application_1500089190015_17219
dst=nullperm=null   proto=rpc   
callerContext=yarn_ats_server_v1_5
19:35:45,946 INFO FSNamesystem.audit: allowed=true  
ugi=yarn/rm-u...@example.com (auth:KERBEROS)ip=/x.x.x.x cmd=listStatus  
src=/ats/done/1500089190015//017/application_1500089190015_17219/appattempt_1500089190015_17219_01
  dst=nullperm=null   proto=rpc   
callerContext=yarn_ats_server_v1_5
19:35:45,947 INFO FSNamesystem.audit: allowed=true  
ugi=yarn/rm-u...@example.com (auth:KERBEROS)ip=/x.x.x.x cmd=listStatus  
src=/ats/done/1500089190015//017/application_1500089190015_17219/application_1500089190015_17219
dst=nullperm=null   proto=rpc   
callerContext=yarn_ats_server_v1_5
19:35:45,948 INFO FSNamesystem.audit: allowed=true  
ugi=yarn/rm-u...@example.com (auth:KERBEROS)ip=/x.x.x.x cmd=listStatus  
src=/ats/done/1500089190015//017/application_1500089190015_17219/application_1500089190015_17219/appattempt_1500089190015_17219_01
  dst=nullperm=null   proto=rpc   
callerContext=yarn_ats_server_v1_5
19:35:45,952 INFO FSNamesystem.audit: allowed=true  
ugi=yarn/rm-u...@example.com (auth:KERBEROS)ip=/x.x.x.x cmd=delete  
src=/ats/done/1500089190015//017/application_1500089190015_17219
dst=nullperm=null   proto=rpc   
callerContext=yarn_ats_server_v1_5
19:35:45,953 INFO FSNamesystem.audit: allowed=true  
ugi=yarn/rm-u...@example.com (auth:KERBEROS)ip=/x.x.x.x cmd=listStatus  
src=/ats/done/1500089190015//017/application_1500089190015_17219
dst=nullperm=null   proto=rpc   
callerContext=yarn_ats_server_v1_5
{code}
I am not sure how this directory structure got created in the first place, but 
the cleaner thread should not list a directory after deleting it.

The {{EntityGroupFSTimelineStore#cleanLogs}} method tries to delete the parent 
directory {{dirpath}} while it is iterating over that same dirpath. It should 
only try to delete its children to avoid these issues, as sketched below. I am 
testing a patch which does this in my environment and it seems to fix the 
issue; I will upload the patch soon after doing further tests.
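
A minimal sketch of that approach, using the standard FileSystem API and a 
placeholder staleness check; this is only an illustration, not the actual 
{{EntityGroupFSTimelineStore#cleanLogs}} code:
{code:java}
import java.io.IOException;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

// Hedged sketch: iterate over the children of the done directory and delete
// only stale child application directories, never the parent directory that is
// still being iterated, so the cleaner never lists a path it just removed.
public final class CleanChildrenOnlySketch {

  static void cleanChildren(FileSystem fs, Path donePath, long retainMillis)
      throws IOException {
    long now = System.currentTimeMillis();
    RemoteIterator<FileStatus> it = fs.listStatusIterator(donePath);
    while (it.hasNext()) {
      FileStatus child = it.next();
      // Placeholder staleness check; the real logic is more involved.
      boolean stale = now - child.getModificationTime() > retainMillis;
      if (child.isDirectory() && stale) {
        fs.delete(child.getPath(), true); // delete the child, keep the parent
      }
    }
  }
}
{code}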

 

> EntityGroupFSTimelineStore hdfs done directory keeps on accumulating
> 
>
> Key: YARN-8627
> URL: https://issues.apache.org/jira/browse/YARN-8627
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.8.0
>Reporter: Tarun Parimi
>Assignee: Tarun Parimi
>Priority: Major
>
> The EntityLogCleaner thread exits with the following ERROR every time it 
> runs.  
> {code:java}
> 2018-07-18 19:59:39,837 INFO timeline.EntityGroupFSTimelineStore 
> (EntityGroupFSTimelineStore.java:cleanLogs(462)) - Deleting 
> hdfs://namenode/ats/done/1499684568068//018/application_1499684568068_18268
> 2018-07-18 19:59:39,844 INFO timeline.EntityGroupFSTimelineStore 
> (EntityGroupFSTimelineStore.java:cleanLogs(462)) - Deleting 
> hdfs://namenode/ats/done/1499684568068//018/application_1499684568068_18270
> 2018-07-18 19:59:39,848 ERROR timeline.EntityGroupFSTimelineStore 
> (EntityGroupFSTimelineStore.java:run(899)) - Error cleaning files  
> java.io.FileNotFoundException: File 
> hdfs://namenode/ats/done/1499684568068//018/application_1499684568068_18270
>  does not exist.  at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1062)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1069)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1040)
>   at 
> 

[jira] [Commented] (YARN-8614) Some typos in YarnConfiguration

2018-08-06 Thread Sen Zhao (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569861#comment-16569861
 ] 

Sen Zhao commented on YARN-8614:


No unit test is needed here.

> Some typos in YarnConfiguration
> ---
>
> Key: YARN-8614
> URL: https://issues.apache.org/jira/browse/YARN-8614
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sen Zhao
>Assignee: Sen Zhao
>Priority: Minor
> Attachments: YARN-8614.001.patch
>
>
> Fix some typos in comments.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8627) EntityGroupFSTimelineStore hdfs done directory keeps on accumulating

2018-08-06 Thread Tarun Parimi (JIRA)
Tarun Parimi created YARN-8627:
--

 Summary: EntityGroupFSTimelineStore hdfs done directory keeps on 
accumulating
 Key: YARN-8627
 URL: https://issues.apache.org/jira/browse/YARN-8627
 Project: Hadoop YARN
  Issue Type: Bug
  Components: timelineserver
Affects Versions: 2.8.0
Reporter: Tarun Parimi
Assignee: Tarun Parimi


The EntityLogCleaner thread exits with the following ERROR every time it runs. 
 
{code:java}
2018-07-18 19:59:39,837 INFO timeline.EntityGroupFSTimelineStore 
(EntityGroupFSTimelineStore.java:cleanLogs(462)) - Deleting 
hdfs://namenode/ats/done/1499684568068//018/application_1499684568068_18268
2018-07-18 19:59:39,844 INFO timeline.EntityGroupFSTimelineStore 
(EntityGroupFSTimelineStore.java:cleanLogs(462)) - Deleting 
hdfs://namenode/ats/done/1499684568068//018/application_1499684568068_18270
2018-07-18 19:59:39,848 ERROR timeline.EntityGroupFSTimelineStore 
(EntityGroupFSTimelineStore.java:run(899)) - Error cleaning files  
java.io.FileNotFoundException: File 
hdfs://namenode/ats/done/1499684568068//018/application_1499684568068_18270 
does not exist.  at 
org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1062)
  at 
org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1069)
  at 
org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1040)
  at 
org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1019)
  at 
org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1015)
  at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
  at 
org.apache.hadoop.hdfs.DistributedFileSystem.listStatusIterator(DistributedFileSystem.java:1015)
  at 
org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.shouldCleanAppLogDir(EntityGroupFSTimelineStore.java:480)
 
{code}
 
 Each time the thread gets scheduled, it is a different folder encountering the 
error. As a result, the thread is not able to clean all the old done 
directories, since it stops after this error. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7957) [UI2] Yarn service delete option disappears after stopping application

2018-08-06 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569837#comment-16569837
 ] 

genericqa commented on YARN-7957:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 30m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
41m 31s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 29s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 55m 23s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-7957 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12934429/YARN-7957.001.patch |
| Optional Tests |  asflicense  shadedclient  |
| uname | Linux 112369425ba8 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / bcfc985 |
| maven | version: Apache Maven 3.3.9 |
| Max. process+thread count | 336 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/21513/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> [UI2] Yarn service delete option disappears after stopping application
> --
>
> Key: YARN-7957
> URL: https://issues.apache.org/jira/browse/YARN-7957
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Affects Versions: 3.1.0
>Reporter: Yesha Vora
>Assignee: Akhil PB
>Priority: Critical
> Attachments: YARN-7957.001.patch
>
>
> Steps:
> 1) Launch a yarn service.
> 2) Go to the service page and click the Setting button -> "Stop Service". The 
> application will be stopped.
> 3) Refresh the page.
> Here, the setting button disappears, so the user cannot delete the service 
> from the UI after stopping the application.
> Expected behavior:
> The setting button should still be present after the application is stopped, 
> and in that state it should offer only the "Delete Service" action.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7991) Use ServiceState values to publish to ATS

2018-08-06 Thread Sunil Govindan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil Govindan updated YARN-7991:
-
Issue Type: Bug  (was: Sub-task)
Parent: (was: YARN-7957)

> Use ServiceState values to publish to ATS
> -
>
> Key: YARN-7991
> URL: https://issues.apache.org/jira/browse/YARN-7991
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-native-services
>Reporter: Gour Saha
>Assignee: Gour Saha
>Priority: Major
>
> Add the state DELETED to ServiceState and then use ServiceState values to 
> publish to ATS (instead of FinalApplicationStatus). 
> Refer to parent issue for more details.
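For illustration only, a rough sketch of the idea described above: publish the 
service's own lifecycle state, including a new DELETED value, rather than the 
coarser FinalApplicationStatus. The real enum is 
org.apache.hadoop.yarn.service.api.records.ServiceState; every name below other 
than DELETED is an assumption, not a copy of the actual code.

{code:java}
/**
 * Sketch only. The existing values are assumed; DELETED is the new terminal
 * state this issue proposes to add.
 */
enum ServiceStateSketch {
  ACCEPTED, STARTED, STABLE, STOPPED, FAILED,
  DELETED
}

class ServiceTimelinePublisherSketch {
  /**
   * Publish the service's own lifecycle state name to ATS instead of the
   * YARN FinalApplicationStatus, so consumers such as UI2 can tell a
   * STOPPED service apart from a DELETED one.
   */
  String stateValueForAts(ServiceStateSketch state) {
    return state.name();
  }
}
{code}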



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7993) [UI2] yarn-service page need to consider ServiceState to show stop/delete buttons

2018-08-06 Thread Sunil Govindan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil Govindan updated YARN-7993:
-
Issue Type: Bug  (was: Sub-task)
Parent: (was: YARN-7957)

> [UI2] yarn-service page need to consider ServiceState to show stop/delete 
> buttons
> -
>
> Key: YARN-7993
> URL: https://issues.apache.org/jira/browse/YARN-7993
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Sunil Govindan
>Assignee: Akhil PB
>Priority: Major
>
> The yarn service page has stop/delete buttons. These buttons have to be 
> shown or hidden based on the ServiceState of each app reported by ATS.
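A small Java sketch of the intended state-to-button mapping. The actual UI2 page 
is Ember/JavaScript and the state names here are assumptions, so this only 
illustrates the rule, not the patch itself.

{code:java}
import java.util.EnumSet;

class ServiceActionVisibilitySketch {

  enum ServiceAction { STOP, DELETE }

  /**
   * Hypothetical mapping from the ServiceState string reported by ATS to the
   * actions the service page should expose: a running service can be stopped
   * or deleted, a stopped service can only be deleted.
   */
  static EnumSet<ServiceAction> visibleActions(String serviceState) {
    switch (serviceState) {
      case "STARTED":
      case "STABLE":
        return EnumSet.of(ServiceAction.STOP, ServiceAction.DELETE);
      case "STOPPED":
        return EnumSet.of(ServiceAction.DELETE);
      default:
        // Unknown or transient states: hide both until the state is clear.
        return EnumSet.noneOf(ServiceAction.class);
    }
  }
}
{code}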



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org