[jira] [Commented] (YARN-9057) CSI jar file should not bundle third party dependencies

2018-12-04 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709684#comment-16709684
 ] 

Hadoop QA commented on YARN-9057:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
20s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
52m 33s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
22s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 16m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 17s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
21s{color} | {color:green} hadoop-assemblies in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
43s{color} | {color:green} hadoop-yarn-csi in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
40s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 89m  5s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9057 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12950647/YARN-9057.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  xml  |
| uname | Linux f34a21f8b6ff 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 
5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 228156c |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/22781/testReport/ |
| Max. process+thread count | 332 (vs. ulimit of 1) |
| modules | C: hadoop-assemblies 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-csi U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/22781/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> CSI jar file should not bundle third party dependencies

[jira] [Commented] (YARN-8914) Add xtermjs to YARN UI2

2018-12-04 Thread Akhil PB (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709647#comment-16709647
 ] 

Akhil PB commented on YARN-8914:


[~eyang] Please let me know what the steps are to make the terminal work. I was 
unable to enter input since it was frozen.

> Add xtermjs to YARN UI2
> ---
>
> Key: YARN-8914
> URL: https://issues.apache.org/jira/browse/YARN-8914
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-ui-v2
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-8914.001.patch, YARN-8914.002.patch, 
> YARN-8914.003.patch, YARN-8914.004.patch, YARN-8914.005.patch, 
> YARN-8914.006.patch, YARN-8914.007.patch, YARN-8914.008.patch, 
> YARN-8914.009.patch
>
>
> In the container listing from UI2, we can add a link to connect to docker 
> container using xtermjs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-8914) Add xtermjs to YARN UI2

2018-12-04 Thread Akhil PB (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709640#comment-16709640
 ] 

Akhil PB edited comment on YARN-8914 at 12/5/18 6:15 AM:
-

[~eyang] Please find the v9 patch [^YARN-8914.009.patch] with the latest UI2 
changes. The issue was in how requestedUser was accessed: the code snippet 
{{this.get('requestedUser')}} should be {{self.get('requestedUser')}}.
The following UI2-related changes were made in the v9 patch.
# Removed DOM access and termLink code from {{models/yarn-container.js}} and 
{{serializers/yarn-container.js}}, since these were not used.
# Removed the Terminal column from the attempts table in 
{{components/timeline-view.js}}.
# Fixed requestedUser issues in {{components/timeline-view.js}} and 
{{templates/components/container-table.hbs}}.

Please verify that all your changes are present in the v9 patch.


was (Author: akhilpb):
[~eyang] Please find the v9 patch [^YARN-8914.009.patch] with the latest UI2 
changes. The issue was in how requestedUser was accessed: the code snippet 
{{this.get('requestedUser')}} should be {{self.get('requestedUser')}}.
The following UI2-related changes were made.
# Removed DOM access and termLink code from {{models/yarn-container.js}} and 
{{serializers/yarn-container.js}}, since these were not used.
# Removed the Terminal column from the attempts table in 
{{components/timeline-view.js}}.
# Fixed requestedUser issues in {{components/timeline-view.js}} and 
{{templates/components/container-table.hbs}}.

Please verify that all your changes are present in the v9 patch.

> Add xtermjs to YARN UI2
> ---
>
> Key: YARN-8914
> URL: https://issues.apache.org/jira/browse/YARN-8914
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-ui-v2
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-8914.001.patch, YARN-8914.002.patch, 
> YARN-8914.003.patch, YARN-8914.004.patch, YARN-8914.005.patch, 
> YARN-8914.006.patch, YARN-8914.007.patch, YARN-8914.008.patch, 
> YARN-8914.009.patch
>
>
> In the container listing from UI2, we can add a link to connect to docker 
> container using xtermjs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-8914) Add xtermjs to YARN UI2

2018-12-04 Thread Akhil PB (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709640#comment-16709640
 ] 

Akhil PB edited comment on YARN-8914 at 12/5/18 6:16 AM:
-

[~eyang] Please find the v9 patch [^YARN-8914.009.patch] with the latest UI2 
changes. The issue was in how requestedUser was accessed: the code snippet 
{{this.get('requestedUser')}} should be {{self.get('requestedUser')}}.
The following UI2-related changes were made in the v9 patch.
# Removed DOM access and termLink code from {{models/yarn-container.js}} and 
{{serializers/yarn-container.js}}, since these were not used; only 
nodeHttpAddress, containerId, and user are needed for termLink.
# Removed the Terminal column from the attempts table in 
{{components/timeline-view.js}}.
# Fixed requestedUser issues in {{components/timeline-view.js}} and 
{{templates/components/container-table.hbs}}.

Please verify that all your changes are present in the v9 patch.


was (Author: akhilpb):
[~eyang] Please find the v9 patch [^YARN-8914.009.patch] with the latest UI2 
changes. The issue was in how requestedUser was accessed: the code snippet 
{{this.get('requestedUser')}} should be {{self.get('requestedUser')}}.
The following UI2-related changes were made in the v9 patch.
# Removed DOM access and termLink code from {{models/yarn-container.js}} and 
{{serializers/yarn-container.js}}, since these were not used.
# Removed the Terminal column from the attempts table in 
{{components/timeline-view.js}}.
# Fixed requestedUser issues in {{components/timeline-view.js}} and 
{{templates/components/container-table.hbs}}.

Please verify that all your changes are present in the v9 patch.

> Add xtermjs to YARN UI2
> ---
>
> Key: YARN-8914
> URL: https://issues.apache.org/jira/browse/YARN-8914
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-ui-v2
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-8914.001.patch, YARN-8914.002.patch, 
> YARN-8914.003.patch, YARN-8914.004.patch, YARN-8914.005.patch, 
> YARN-8914.006.patch, YARN-8914.007.patch, YARN-8914.008.patch, 
> YARN-8914.009.patch
>
>
> In the container listing from UI2, we can add a link to connect to docker 
> container using xtermjs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8914) Add xtermjs to YARN UI2

2018-12-04 Thread Akhil PB (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akhil PB updated YARN-8914:
---
Attachment: (was: YARN-8914.008.patch)

> Add xtermjs to YARN UI2
> ---
>
> Key: YARN-8914
> URL: https://issues.apache.org/jira/browse/YARN-8914
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-ui-v2
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-8914.001.patch, YARN-8914.002.patch, 
> YARN-8914.003.patch, YARN-8914.004.patch, YARN-8914.005.patch, 
> YARN-8914.006.patch, YARN-8914.007.patch, YARN-8914.008.patch
>
>
> In the container listing from UI2, we can add a link to connect to docker 
> container using xtermjs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-8914) Add xtermjs to YARN UI2

2018-12-04 Thread Akhil PB (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709640#comment-16709640
 ] 

Akhil PB edited comment on YARN-8914 at 12/5/18 6:02 AM:
-

[~eyang] Please find the v9 patch [^YARN-8914.009.patch] with the latest UI2 
changes. The issue was in how requestedUser was accessed: the code snippet 
{{this.get('requestedUser')}} should be {{self.get('requestedUser')}}.
The following UI2-related changes were made.
# Removed DOM access and termLink code from {{models/yarn-container.js}} and 
{{serializers/yarn-container.js}}, since these were not used.
# Removed the Terminal column from the attempts table in 
{{components/timeline-view.js}}.
# Fixed requestedUser issues in {{components/timeline-view.js}} and 
{{templates/components/container-table.hbs}}.

Please verify that all your changes are present in the v9 patch.


was (Author: akhilpb):
[~eyang] Please find the v8 patch [^YARN-8914.009.patch] with the latest UI2 
changes. The issue was in how requestedUser was accessed: the code snippet 
{{this.get('requestedUser')}} should be {{self.get('requestedUser')}}.
The following UI2-related changes were made.
# Removed DOM access and termLink code from {{models/yarn-container.js}} and 
{{serializers/yarn-container.js}}, since these were not used.
# Removed the Terminal column from the attempts table in 
{{components/timeline-view.js}}.
# Fixed requestedUser issues in {{components/timeline-view.js}} and 
{{templates/components/container-table.hbs}}.


> Add xtermjs to YARN UI2
> ---
>
> Key: YARN-8914
> URL: https://issues.apache.org/jira/browse/YARN-8914
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-ui-v2
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-8914.001.patch, YARN-8914.002.patch, 
> YARN-8914.003.patch, YARN-8914.004.patch, YARN-8914.005.patch, 
> YARN-8914.006.patch, YARN-8914.007.patch, YARN-8914.008.patch, 
> YARN-8914.009.patch
>
>
> In the container listing from UI2, we can add a link to connect to docker 
> container using xtermjs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8914) Add xtermjs to YARN UI2

2018-12-04 Thread Akhil PB (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akhil PB updated YARN-8914:
---
Attachment: YARN-8914.009.patch

> Add xtermjs to YARN UI2
> ---
>
> Key: YARN-8914
> URL: https://issues.apache.org/jira/browse/YARN-8914
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-ui-v2
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-8914.001.patch, YARN-8914.002.patch, 
> YARN-8914.003.patch, YARN-8914.004.patch, YARN-8914.005.patch, 
> YARN-8914.006.patch, YARN-8914.007.patch, YARN-8914.008.patch, 
> YARN-8914.009.patch
>
>
> In the container listing from UI2, we can add a link to connect to docker 
> container using xtermjs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-8914) Add xtermjs to YARN UI2

2018-12-04 Thread Akhil PB (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709640#comment-16709640
 ] 

Akhil PB edited comment on YARN-8914 at 12/5/18 6:01 AM:
-

[~eyang] Please find the v8 patch [^YARN-8914.009.patch] with the latest UI2 
changes. The issue was in how requestedUser was accessed: the code snippet 
{{this.get('requestedUser')}} should be {{self.get('requestedUser')}}.
The following UI2-related changes were made.
# Removed DOM access and termLink code from {{models/yarn-container.js}} and 
{{serializers/yarn-container.js}}, since these were not used.
# Removed the Terminal column from the attempts table in 
{{components/timeline-view.js}}.
# Fixed requestedUser issues in {{components/timeline-view.js}} and 
{{templates/components/container-table.hbs}}.



was (Author: akhilpb):
[~eyang] Please find the v8 patch [^YARN-8914.008.patch] with the latest UI2 
changes. The issue was in how requestedUser was accessed: the code snippet 
{{this.get('requestedUser')}} should be {{self.get('requestedUser')}}.
The following UI2-related changes were made.
# Removed DOM access and termLink code from {{models/yarn-container.js}} and 
{{serializers/yarn-container.js}}, since these were not used.
# Removed the Terminal column from the attempts table in 
{{components/timeline-view.js}}.
# Fixed requestedUser issues in {{components/timeline-view.js}} and 
{{templates/components/container-table.hbs}}.


> Add xtermjs to YARN UI2
> ---
>
> Key: YARN-8914
> URL: https://issues.apache.org/jira/browse/YARN-8914
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-ui-v2
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-8914.001.patch, YARN-8914.002.patch, 
> YARN-8914.003.patch, YARN-8914.004.patch, YARN-8914.005.patch, 
> YARN-8914.006.patch, YARN-8914.007.patch, YARN-8914.008.patch, 
> YARN-8914.009.patch
>
>
> In the container listing from UI2, we can add a link to connect to docker 
> container using xtermjs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8914) Add xtermjs to YARN UI2

2018-12-04 Thread Akhil PB (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709640#comment-16709640
 ] 

Akhil PB commented on YARN-8914:


[~eyang] Please find the v8 patch [^YARN-8914.008.patch] with the latest UI2 
changes. The issue was in how requestedUser was accessed: the code snippet 
{{this.get('requestedUser')}} should be {{self.get('requestedUser')}}.
The following UI2-related changes were made.
# Removed DOM access and termLink code from {{models/yarn-container.js}} and 
{{serializers/yarn-container.js}}, since these were not used.
# Removed the Terminal column from the attempts table in 
{{components/timeline-view.js}}.
# Fixed requestedUser issues in {{components/timeline-view.js}} and 
{{templates/components/container-table.hbs}}.


> Add xtermjs to YARN UI2
> ---
>
> Key: YARN-8914
> URL: https://issues.apache.org/jira/browse/YARN-8914
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-ui-v2
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-8914.001.patch, YARN-8914.002.patch, 
> YARN-8914.003.patch, YARN-8914.004.patch, YARN-8914.005.patch, 
> YARN-8914.006.patch, YARN-8914.007.patch, YARN-8914.008.patch, 
> YARN-8914.008.patch
>
>
> In the container listing from UI2, we can add a link to connect to docker 
> container using xtermjs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9057) CSI jar file should not bundle third party dependencies

2018-12-04 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709637#comment-16709637
 ] 

Weiwei Yang commented on YARN-9057:
---

Hi [~eyang], using the provided scope for hadoop-yarn-api/hadoop-yarn-common 
resolves this problem. Please help review. Thanks

> CSI jar file should not bundle third party dependencies
> ---
>
> Key: YARN-9057
> URL: https://issues.apache.org/jira/browse/YARN-9057
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: build
>Affects Versions: 3.3.0
>Reporter: Eric Yang
>Assignee: Weiwei Yang
>Priority: Blocker
> Attachments: YARN-9057.001.patch, YARN-9057.002.patch
>
>
> hadoop-yarn-csi-3.3.0-SNAPSHOT.jar bundles all third party classes like a 
> shaded jar instead of CSI only classes.  This is generating error messages 
> for YARN cli:
> {code}
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/local/hadoop-3.3.0-SNAPSHOT/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/local/hadoop-3.3.0-SNAPSHOT/share/hadoop/yarn/hadoop-yarn-csi-3.3.0-SNAPSHOT.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8914) Add xtermjs to YARN UI2

2018-12-04 Thread Akhil PB (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akhil PB updated YARN-8914:
---
Attachment: YARN-8914.008.patch

> Add xtermjs to YARN UI2
> ---
>
> Key: YARN-8914
> URL: https://issues.apache.org/jira/browse/YARN-8914
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-ui-v2
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-8914.001.patch, YARN-8914.002.patch, 
> YARN-8914.003.patch, YARN-8914.004.patch, YARN-8914.005.patch, 
> YARN-8914.006.patch, YARN-8914.007.patch, YARN-8914.008.patch, 
> YARN-8914.008.patch
>
>
> In the container listing from UI2, we can add a link to connect to docker 
> container using xtermjs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9057) CSI jar file should not bundle third party dependencies

2018-12-04 Thread Weiwei Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-9057:
--
Attachment: YARN-9057.002.patch

> CSI jar file should not bundle third party dependencies
> ---
>
> Key: YARN-9057
> URL: https://issues.apache.org/jira/browse/YARN-9057
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: build
>Affects Versions: 3.3.0
>Reporter: Eric Yang
>Assignee: Weiwei Yang
>Priority: Blocker
> Attachments: YARN-9057.001.patch, YARN-9057.002.patch
>
>
> hadoop-yarn-csi-3.3.0-SNAPSHOT.jar bundles all third party classes like a 
> shaded jar instead of CSI only classes.  This is generating error messages 
> for YARN cli:
> {code}
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/local/hadoop-3.3.0-SNAPSHOT/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/local/hadoop-3.3.0-SNAPSHOT/share/hadoop/yarn/hadoop-yarn-csi-3.3.0-SNAPSHOT.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8914) Add xtermjs to YARN UI2

2018-12-04 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709633#comment-16709633
 ] 

Hadoop QA commented on YARN-8914:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
19s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 15m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
31m 19s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui 
hadoop-client-modules/hadoop-client-minicluster . {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  5m 
28s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
21s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 14m 
58s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
3m 29s{color} | {color:orange} root: The patch generated 1 new + 2 unchanged - 
0 fixed = 3 total (was 2) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 12m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
3s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 16s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui 
hadoop-client-modules/hadoop-client-minicluster . {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  5m 
18s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}122m  2s{color} 
| {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
44s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}278m 41s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
|   | hadoop.registry.secure.TestSecureLogins |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Iss

[jira] [Commented] (YARN-8789) Add BoundedQueue to AsyncDispatcher

2018-12-04 Thread Wilfred Spiegelenburg (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709605#comment-16709605
 ] 

Wilfred Spiegelenburg commented on YARN-8789:
-

First: I would expect the change to be fully tested, so that the behaviour with 
a limited queue is known and described. Task failures are probably more 
acceptable. Are we really still seeing them with the change from MAPREDUCE-5124 
applied? If not, making this change is not really warranted.
Before we go further and make a change like this, I would also test the 
behaviour: what happens when the queue is full? Looking at the patch, there is 
far more change than needed: the current queue can be limited, and just that 
change would be far less impactful. The logic for taking an event is also 
changed, which I don't think is needed either. Going back to just the basic 
change of limiting the queue, after we find that it is needed, would be a 
better approach.

Based on that quick analysis, I would say this is not an acceptable change in 
its current form.
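
For illustration only, here is a hedged sketch (not the attached patch; the 
class and method names below are hypothetical) of the narrower change discussed 
above: bounding the existing event queue with a capacity-limited 
{{java.util.concurrent.LinkedBlockingQueue}} so that event producers block, 
i.e. are throttled, once the queue is full.
{code:java}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical, simplified stand-in for the dispatcher's event queue; this is
// an illustrative sketch, not the YARN-8789 patch. Conceptually, the only
// change versus an unbounded queue is passing a capacity to the constructor.
public class BoundedEventQueueSketch<E> {

  private final BlockingQueue<E> eventQueue;

  public BoundedEventQueueSketch(int capacity) {
    // With a capacity, enqueueing threads block once 'capacity' events are
    // pending, instead of letting the queue (and the heap) grow without bound.
    this.eventQueue = new LinkedBlockingQueue<>(capacity);
  }

  // Producer side: put() blocks until space is available, throttling callers
  // that generate events faster than the dispatcher handles them.
  public void enqueue(E event) throws InterruptedException {
    eventQueue.put(event);
  }

  // Consumer side is unchanged: take() blocks until an event arrives.
  public E dequeue() throws InterruptedException {
    return eventQueue.take();
  }
}
{code}
Whether blocking producers in this way is acceptable, versus letting tasks 
fail, is exactly the behaviour that should be tested and described before such 
a change goes in.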

> Add BoundedQueue to AsyncDispatcher
> ---
>
> Key: YARN-8789
> URL: https://issues.apache.org/jira/browse/YARN-8789
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: YARN-8789.1.patch, YARN-8789.10.patch, 
> YARN-8789.12.patch, YARN-8789.14.patch, YARN-8789.2.patch, YARN-8789.3.patch, 
> YARN-8789.4.patch, YARN-8789.5.patch, YARN-8789.6.patch, YARN-8789.7.patch, 
> YARN-8789.7.patch, YARN-8789.8.patch, YARN-8789.9.patch
>
>
> I recently came across a scenario where an MR ApplicationMaster was failing 
> with an OOM exception.  It had many thousands of Mappers and thousands of 
> Reducers.  It was noted in the logging that the event-queue of 
> {{AsyncDispatcher}} had a very large number of items in it and was seemingly 
> never decreasing.
> I started looking at the code and thought it could use some cleanup, 
> simplification, and the ability to specify a bounded queue so that any 
> incoming events are throttled until they can be processed.  This will protect 
> the ApplicationMaster from a flood of events.
> Logging Message:
> Size of event-queue is xxx



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.

2018-12-04 Thread Zhankun Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709547#comment-16709547
 ] 

Zhankun Tang edited comment on YARN-8714 at 12/5/18 3:04 AM:
-

[~leftnoteasy], [~liuxun323]. Per my testing, the YARN localizer can handle 
directories. I changed distributedShell's client to let it localize an HDFS 
directory "mydir" directly. This is enabled by YARN-2185.
{code:java}
Path p = new Path("hdfs:///user/yarn/submarine/jobs/tf-job-001/staging" +
 "/mydir");
FileStatus scFileStatus = fs.getFileStatus(p);
LocalResource r =
 LocalResource.newInstance(URL.fromURI(p.toUri()),
 LocalResourceType.FILE, LocalResourceVisibility.APPLICATION,
 scFileStatus.getLen(), scFileStatus.getModificationTime());
localResources.put("mydir", r);{code}
And the YARN localizer indeed downloads the HDFS dir to the local filesystem for DistributedShell.
{code:java}
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
1.py  2.py  dir1  test_kill9.sh
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/container_1543898305524_0004_01_01/
 -l
total 20
lrwxrwxrwx 1 yarn hadoop  111 12月  5 10:08 AppMaster.jar -> 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/10/AppMaster.jar
lrwxrwxrwx 1 yarn hadoop  103 12月  5 10:08 mydir -> 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
{code}
But the bad news is that Submarine utilizes the YARN native service, which 
doesn't know about this YARN ability and blocks it.
{code:java}
2018-12-05 10:06:40,286 ERROR client.ApiServiceClient: 
srcFile=hdfs://192.168.50.191:9000/user/yarn/submarine/jobs/tf-job-001/staging/mydir
 is a directory, which is not supported.
{code}

Two solutions are ahead of us at present:
1. Fix the improper handling of directories in the native service and then get 
this implemented.
2. Go ahead with our download, zip, and upload approach, which is more complex, 
and refactor it after option 1 is done.

I personally prefer solution 2 because, in that case, Submarine won't depend on 
a newer YARN (3.1.0). Any thoughts?


was (Author: tangzhankun):
[~leftnoteasy], [~liuxun323]. Per my testing, the YARN localizer can handle 
directories. I changed distributedShell's client to let it localize an HDFS 
directory "mydir" directly. 
{code:java}
Path p = new Path("hdfs:///user/yarn/submarine/jobs/tf-job-001/staging" +
 "/mydir");
FileStatus scFileStatus = fs.getFileStatus(p);
LocalResource r =
 LocalResource.newInstance(URL.fromURI(p.toUri()),
 LocalResourceType.FILE, LocalResourceVisibility.APPLICATION,
 scFileStatus.getLen(), scFileStatus.getModificationTime());
localResources.put("mydir", r);{code}
And the YARN localizer indeed downloads the HDFS dir to the local filesystem for DistributedShell.
{code:java}
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
1.py  2.py  dir1  test_kill9.sh
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/container_1543898305524_0004_01_01/
 -l
total 20
lrwxrwxrwx 1 yarn hadoop  111 12月  5 10:08 AppMaster.jar -> 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/10/AppMaster.jar
lrwxrwxrwx 1 yarn hadoop  103 12月  5 10:08 mydir -> 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
{code}
But the bad news is that Submarine utilizes the YARN native service, which 
doesn't know about this YARN ability and blocks it.
{code:java}
2018-12-05 10:06:40,286 ERROR client.ApiServiceClient: 
srcFile=hdfs://192.168.50.191:9000/user/yarn/submarine/jobs/tf-job-001/staging/mydir
 is a directory, which is not supported.
{code}

Two solutions are ahead of us at present:
1. Fix the improper handling of directories in the native service and then get 
this implemented.
2. Go ahead with our download, zip, and upload approach, which is more complex, 
and refactor it after option 1 is done.

Any thoughts?

> [Submarine] Support files/tarballs to be localized for a training job.
> --
>
> Key: YARN-8714
> URL: https://issues.apache.org/jira/browse/YARN-8714
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Zhankun Tang
>Priority: Major
> Attachments: YARN-8714-WIP1-trunk-001.patch, 
> YARN-8714-WIP1-trunk-002.patch, YARN-8714-trunk.001.patch, 
> YARN-8714

[jira] [Commented] (YARN-9071) NM and service AM don't have updated status for reinitialized containers

2018-12-04 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709554#comment-16709554
 ] 

Hadoop QA commented on YARN-9071:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
42s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 48s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m 
15s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 32s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 1 new + 194 unchanged - 0 fixed = 195 total (was 194) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 15s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 
34s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 15m 
39s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}117m  4s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9071 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12950625/YARN-9071.005.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 8fc24a7fc3e3 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 
5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 228156c |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://bu

[jira] [Updated] (YARN-9083) Support remote directory localization in yarn native service

2018-12-04 Thread Zhankun Tang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhankun Tang updated YARN-9083:
---
Description: 
When refining YARN-8714, I found that the YARN localizer seems able to handle a 
remote directory directly. In FSDownload.java#downloadAndUnpack, it uses 
"FileUtil.copy", which can handle a directory. This ability was added by YARN-2185.

For testing purposes, I changed distributedShell's client to let it localize an 
HDFS directory "mydir" directly. 
{code:java}
Path p = new Path("hdfs:///user/yarn/submarine/jobs/tf-job-001/staging" +
 "/mydir");
FileStatus scFileStatus = fs.getFileStatus(p);
LocalResource r =
 LocalResource.newInstance(URL.fromURI(p.toUri()),
 LocalResourceType.FILE, LocalResourceVisibility.APPLICATION,
 scFileStatus.getLen(), scFileStatus.getModificationTime());
localResources.put("mydir", r);{code}
And the YARN localizer indeed downloads the HDFS dir to the local filesystem for DistributedShell.
{code:java}
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
1.py  2.py  dir1  test_kill9.sh
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/container_1543898305524_0004_01_01/
 -l
total 20
lrwxrwxrwx 1 yarn hadoop  111 12月  5 10:08 AppMaster.jar -> 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/10/AppMaster.jar
lrwxrwxrwx 1 yarn hadoop  103 12月  5 10:08 mydir -> 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
{code}
But the YARN native service seems not to know about this YARN localizer ability 
and blocks it.
{code:java}
2018-12-05 10:06:40,286 ERROR client.ApiServiceClient: 
srcFile=hdfs://192.168.50.191:9000/user/yarn/submarine/jobs/tf-job-001/staging/mydir
 is a directory, which is not supported.{code}
We should enable this ability in the YARN native service.

  was:
When refining YARN-8714, I found that the YARN localizer seems able to handle a 
remote directory directly. In FSDownload.java#downloadAndUnpack, it uses 
"FileUtil.copy", which can handle a directory. This ability was added by YARN-2185.

For testing purposes, I changed distributedShell's client to let it localize an 
HDFS directory "mydir" directly. 
{code:java}
Path p = new Path("hdfs:///user/yarn/submarine/jobs/tf-job-001/staging" +
 "/mydir");
FileStatus scFileStatus = fs.getFileStatus(p);
LocalResource r =
 LocalResource.newInstance(URL.fromURI(p.toUri()),
 LocalResourceType.FILE, LocalResourceVisibility.APPLICATION,
 scFileStatus.getLen(), scFileStatus.getModificationTime());
localResources.put("mydir", r);{code}
And the YARN localizer indeed downloads the HDFS dir to the local filesystem for DistributedShell.
{code:java}
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
1.py  2.py  dir1  test_kill9.sh
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/container_1543898305524_0004_01_01/
 -l
total 20
lrwxrwxrwx 1 yarn hadoop  111 12月  5 10:08 AppMaster.jar -> 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/10/AppMaster.jar
lrwxrwxrwx 1 yarn hadoop  103 12月  5 10:08 mydir -> 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
{code}
But the YARN native service seems not to know about this YARN localizer ability 
and blocks it.
{code:java}
2018-12-05 10:06:40,286 ERROR client.ApiServiceClient: 
srcFile=hdfs://192.168.50.191:9000/user/yarn/submarine/jobs/tf-job-001/staging/mydir
 is a directory, which is not supported.{code}
We should utilize this ability in the YARN native service.


> Support remote directory localization in yarn native service
> 
>
> Key: YARN-9083
> URL: https://issues.apache.org/jira/browse/YARN-9083
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Zhankun Tang
>Assignee: Zhankun Tang
>Priority: Major
>
> When refining YARN-8714, I found that the YARN localizer seems able to handle 
> a remote directory directly. In FSDownload.java#downloadAndUnpack, it uses 
> "FileUtil.copy", which can handle a directory. This ability was added by 
> YARN-2185.
> For testing purposes, I changed distributedShell's client to let it localize 
> an HDFS directory "mydir" directly. 
> {code:java}
> Path p = new Path("hdfs:///user/yarn/submarine/jobs/tf-job-001/staging" +
>  "/mydir");
> FileStatus scFileStatus = fs.getFileStatus(p);
> LocalResource r =
> 

[jira] [Created] (YARN-9083) Support remote directory localization in yarn native service

2018-12-04 Thread Zhankun Tang (JIRA)
Zhankun Tang created YARN-9083:
--

 Summary: Support remote directory localization in yarn native 
service
 Key: YARN-9083
 URL: https://issues.apache.org/jira/browse/YARN-9083
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Zhankun Tang
Assignee: Zhankun Tang


When refining YARN-8714, I found that the YARN localizer seems able to handle a 
remote directory directly. In FSDownload.java#downloadAndUnpack, it uses 
"FileUtil.copy", which can handle a directory. This ability was added by YARN-2185.

For testing purposes, I changed distributedShell's client to let it localize an 
HDFS directory "mydir" directly. 
{code:java}
Path p = new Path("hdfs:///user/yarn/submarine/jobs/tf-job-001/staging" +
 "/mydir");
FileStatus scFileStatus = fs.getFileStatus(p);
LocalResource r =
 LocalResource.newInstance(URL.fromURI(p.toUri()),
 LocalResourceType.FILE, LocalResourceVisibility.APPLICATION,
 scFileStatus.getLen(), scFileStatus.getModificationTime());
localResources.put("mydir", r);{code}
And the YARN localizer indeed downloads the HDFS dir to the local filesystem for DistributedShell.
{code:java}
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
1.py  2.py  dir1  test_kill9.sh
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/container_1543898305524_0004_01_01/
 -l
total 20
lrwxrwxrwx 1 yarn hadoop  111 12月  5 10:08 AppMaster.jar -> 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/10/AppMaster.jar
lrwxrwxrwx 1 yarn hadoop  103 12月  5 10:08 mydir -> 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
{code}
But the YARN native service seems not to know about this YARN localizer ability 
and blocks it.
{code:java}
2018-12-05 10:06:40,286 ERROR client.ApiServiceClient: 
srcFile=hdfs://192.168.50.191:9000/user/yarn/submarine/jobs/tf-job-001/staging/mydir
 is a directory, which is not supported.{code}
We should utilize this ability in the YARN native service.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.

2018-12-04 Thread Zhankun Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709547#comment-16709547
 ] 

Zhankun Tang edited comment on YARN-8714 at 12/5/18 2:33 AM:
-

[~leftnoteasy], [~liuxun323]. Per my testing, the YARN localizer can handle 
directories. I changed distributedShell's client to let it localize an HDFS 
directory "mydir" directly. 
{code:java}
Path p = new Path("hdfs:///user/yarn/submarine/jobs/tf-job-001/staging" +
 "/mydir");
FileStatus scFileStatus = fs.getFileStatus(p);
LocalResource r =
 LocalResource.newInstance(URL.fromURI(p.toUri()),
 LocalResourceType.FILE, LocalResourceVisibility.APPLICATION,
 scFileStatus.getLen(), scFileStatus.getModificationTime());
localResources.put("mydir", r);{code}
And the YARN localizer indeed downloads the HDFS dir to the local filesystem for DistributedShell.
{code:java}
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
1.py  2.py  dir1  test_kill9.sh
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/container_1543898305524_0004_01_01/
 -l
total 20
lrwxrwxrwx 1 yarn hadoop  111 12月  5 10:08 AppMaster.jar -> 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/10/AppMaster.jar
lrwxrwxrwx 1 yarn hadoop  103 12月  5 10:08 mydir -> 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
{code}
But the bad news is that Submarine utilizes the YARN native service, which 
doesn't know about this YARN ability and blocks it.
{code:java}
2018-12-05 10:06:40,286 ERROR client.ApiServiceClient: 
srcFile=hdfs://192.168.50.191:9000/user/yarn/submarine/jobs/tf-job-001/staging/mydir
 is a directory, which is not supported.
{code}

Two solutions are ahead of us at present:
1. Fix the improper handling of directories in the native service and then get 
this implemented.
2. Go ahead with our download, zip, and upload approach, which is more complex, 
and refactor it after option 1 is done.

Any thoughts?


was (Author: tangzhankun):
[~leftnoteasy], [~liuxun323]. Per my testing, the YARN localizer can handle 
directories. I changed distributedShell's client to let it localize an HDFS 
directory "mydir" directly. 
{code:java}
Path p = new Path("hdfs:///user/yarn/submarine/jobs/tf-job-001/staging" +
 "/mydir");
FileStatus scFileStatus = fs.getFileStatus(p);
LocalResource r =
 LocalResource.newInstance(URL.fromURI(p.toUri()),
 LocalResourceType.FILE, LocalResourceVisibility.APPLICATION,
 scFileStatus.getLen(), scFileStatus.getModificationTime());
localResources.put("mydir", r);{code}
And the YARN localizer indeed downloads the HDFS dir to the local filesystem for DistributedShell.
{code:java}
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
1.py  2.py  dir1  test_kill9.sh
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/container_1543898305524_0004_01_01/
 -l
total 20
lrwxrwxrwx 1 yarn hadoop  111 12月  5 10:08 AppMaster.jar -> 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/10/AppMaster.jar
lrwxrwxrwx 1 yarn hadoop  103 12月  5 10:08 mydir -> 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
{code}
But the bad news is that Submarine utilizes the YARN native server, which 
doesn't know about this YARN ability and blocks it.
{code:java}
2018-12-05 10:06:40,286 ERROR client.ApiServiceClient: 
srcFile=hdfs://192.168.50.191:9000/user/yarn/submarine/jobs/tf-job-001/staging/mydir
 is a directory, which is not supported.
{code}

Two solutions are ahead of us at present:
1. Fix the improper handling of directories in the native service and then get 
this implemented.
2. Go ahead with our download, zip, and upload approach, which is more complex, 
and refactor it after option 1 is done.

Any thoughts?

> [Submarine] Support files/tarballs to be localized for a training job.
> --
>
> Key: YARN-8714
> URL: https://issues.apache.org/jira/browse/YARN-8714
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Zhankun Tang
>Priority: Major
> Attachments: YARN-8714-WIP1-trunk-001.patch, 
> YARN-8714-WIP1-trunk-002.patch, YARN-8714-trunk.001.patch, 
> YARN-8714-trunk.002.patch
>
>
> See 
> [https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit#heading=h.vk

[jira] [Commented] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.

2018-12-04 Thread Zhankun Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709547#comment-16709547
 ] 

Zhankun Tang commented on YARN-8714:


[~leftnoteasy], [~liuxun323]. Per my testing, the YARN localizer can handle 
directories. I changed distributedShell's client to let it localize an HDFS 
directory "mydir" directly. 
{code:java}
Path p = new Path("hdfs:///user/yarn/submarine/jobs/tf-job-001/staging" +
 "/mydir");
FileStatus scFileStatus = fs.getFileStatus(p);
LocalResource r =
 LocalResource.newInstance(URL.fromURI(p.toUri()),
 LocalResourceType.FILE, LocalResourceVisibility.APPLICATION,
 scFileStatus.getLen(), scFileStatus.getModificationTime());
localResources.put("mydir", r);{code}
And the YARN localizer indeed downloads the HDFS dir to the local filesystem for DistributedShell.
{code:java}
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
1.py  2.py  dir1  test_kill9.sh
yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/container_1543898305524_0004_01_01/
 -l
total 20
lrwxrwxrwx 1 yarn hadoop  111 12月  5 10:08 AppMaster.jar -> 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/10/AppMaster.jar
lrwxrwxrwx 1 yarn hadoop  103 12月  5 10:08 mydir -> 
/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir
{code}
But the bad news is that Submarine utilizes the YARN native server, which 
doesn't know about this YARN ability and blocks it.
{code:java}
2018-12-05 10:06:40,286 ERROR client.ApiServiceClient: 
srcFile=hdfs://192.168.50.191:9000/user/yarn/submarine/jobs/tf-job-001/staging/mydir
 is a directory, which is not supported.
{code}

Two solutions are ahead of us at present:
1. Fix the improper handling of directories in the native service and then get 
this implemented.
2. Go ahead with our download, zip, and upload approach, which is more complex, 
and refactor it after option 1 is done.

Any thoughts?

> [Submarine] Support files/tarballs to be localized for a training job.
> --
>
> Key: YARN-8714
> URL: https://issues.apache.org/jira/browse/YARN-8714
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Zhankun Tang
>Priority: Major
> Attachments: YARN-8714-WIP1-trunk-001.patch, 
> YARN-8714-WIP1-trunk-002.patch, YARN-8714-trunk.001.patch, 
> YARN-8714-trunk.002.patch
>
>
> See 
> [https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit#heading=h.vkxp9edl11m7],
>  {{job run --localization ...}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5168) Add port mapping handling when docker container use bridge network

2018-12-04 Thread Xun Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709512#comment-16709512
 ] 

Xun Liu commented on YARN-5168:
---

[~eyang], thanks for your tips, I will deal with it immediately. :)

> Add port mapping handling when docker container use bridge network
> --
>
> Key: YARN-5168
> URL: https://issues.apache.org/jira/browse/YARN-5168
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jun Gong
>Assignee: Xun Liu
>Priority: Major
>  Labels: Docker
> Attachments: YARN-5168.001.patch, YARN-5168.002.patch, 
> YARN-5168.003.patch, YARN-5168.004.patch, YARN-5168.005.patch, 
> YARN-5168.006.patch, YARN-5168.007.patch, YARN-5168.008.patch, 
> YARN-5168.009.patch, YARN-5168.010.patch
>
>
> YARN-4007 addresses different network setups when launching the docker 
> container. We need to support port mapping when the docker container uses a 
> bridge network.
> The following problems are what we faced:
> 1. Add "-P" to map the docker container's exposed ports automatically.
> 2. Add "-p" to let the user specify specific ports to map.
> 3. Add service registry support for the bridge network case, so apps can find 
> each other. It could be done outside of YARN, however it might be more 
> convenient to support it natively in YARN.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8914) Add xtermjs to YARN UI2

2018-12-04 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709477#comment-16709477
 ] 

Eric Yang commented on YARN-8914:
-

[~akhilpb] Patch 008 fixes issues 1-4 from your last comment.  Please 
review.  Thanks

> Add xtermjs to YARN UI2
> ---
>
> Key: YARN-8914
> URL: https://issues.apache.org/jira/browse/YARN-8914
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-ui-v2
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-8914.001.patch, YARN-8914.002.patch, 
> YARN-8914.003.patch, YARN-8914.004.patch, YARN-8914.005.patch, 
> YARN-8914.006.patch, YARN-8914.007.patch, YARN-8914.008.patch
>
>
> In the container listing from UI2, we can add a link to connect to docker 
> container using xtermjs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8914) Add xtermjs to YARN UI2

2018-12-04 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-8914:

Attachment: YARN-8914.008.patch

> Add xtermjs to YARN UI2
> ---
>
> Key: YARN-8914
> URL: https://issues.apache.org/jira/browse/YARN-8914
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-ui-v2
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-8914.001.patch, YARN-8914.002.patch, 
> YARN-8914.003.patch, YARN-8914.004.patch, YARN-8914.005.patch, 
> YARN-8914.006.patch, YARN-8914.007.patch, YARN-8914.008.patch
>
>
> In the container listing from UI2, we can add a link to connect to docker 
> container using xtermjs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9071) NM and service AM don't have updated status for reinitialized containers

2018-12-04 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709475#comment-16709475
 ] 

Chandni Singh commented on YARN-9071:
-

[~eyang] I have uploaded patch 5, where the IP and host are cleared on both the AM 
side and the NM side before the upgrade. Please take a look at it. 

> NM and service AM don't have updated status for reinitialized containers
> 
>
> Key: YARN-9071
> URL: https://issues.apache.org/jira/browse/YARN-9071
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Billie Rinaldi
>Assignee: Chandni Singh
>Priority: Critical
> Attachments: YARN-9071.001.patch, YARN-9071.002.patch, 
> YARN-9071.003.patch, YARN-9071.004.patch, YARN-9071.005.patch, q.log
>
>
> Container resource monitoring is not stopped during the reinitialization 
> process, and this prevents the NM from obtaining updated process tree 
> information when the container starts running again. I observed a 
> reinitialized container go from RUNNING to REINITIALIZING to 
> REINITIALIZING_AWAITING_KILL to SCHEDULED to RUNNING. Container monitoring 
> was then started for a second time, but since the trackingContainers entry 
> had already been initialized for the container, ContainersMonitor skipped 
> finding the new PID and IP for the container. A possible solution would be to 
> stop the container monitoring in the reinitialization process so that the 
> process tree information would be initialized properly when monitoring is 
> restarted. When the same container was stopped by the NM later, the NM did 
> not kill the container, and the service AM received an unexpected event (stop 
> at reinitializing).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9071) NM and service AM don't have updated status for reinitialized containers

2018-12-04 Thread Chandni Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandni Singh updated YARN-9071:

Attachment: YARN-9071.005.patch

> NM and service AM don't have updated status for reinitialized containers
> 
>
> Key: YARN-9071
> URL: https://issues.apache.org/jira/browse/YARN-9071
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Billie Rinaldi
>Assignee: Chandni Singh
>Priority: Critical
> Attachments: YARN-9071.001.patch, YARN-9071.002.patch, 
> YARN-9071.003.patch, YARN-9071.004.patch, YARN-9071.005.patch, q.log
>
>
> Container resource monitoring is not stopped during the reinitialization 
> process, and this prevents the NM from obtaining updated process tree 
> information when the container starts running again. I observed a 
> reinitialized container go from RUNNING to REINITIALIZING to 
> REINITIALIZING_AWAITING_KILL to SCHEDULED to RUNNING. Container monitoring 
> was then started for a second time, but since the trackingContainers entry 
> had already been initialized for the container, ContainersMonitor skipped 
> finding the new PID and IP for the container. A possible solution would be to 
> stop the container monitoring in the reinitialization process so that the 
> process tree information would be initialized properly when monitoring is 
> restarted. When the same container was stopped by the NM later, the NM did 
> not kill the container, and the service AM received an unexpected event (stop 
> at reinitializing).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9057) CSI jar file should not bundle third party dependencies

2018-12-04 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709465#comment-16709465
 ] 

Weiwei Yang commented on YARN-9057:
---

Thanks [~eyang], indeed that is not expected. Not sure why copying dependencies 
would ever move existing jars. Let me check, thanks!

> CSI jar file should not bundle third party dependencies
> ---
>
> Key: YARN-9057
> URL: https://issues.apache.org/jira/browse/YARN-9057
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: build
>Affects Versions: 3.3.0
>Reporter: Eric Yang
>Assignee: Weiwei Yang
>Priority: Blocker
> Attachments: YARN-9057.001.patch
>
>
> hadoop-yarn-csi-3.3.0-SNAPSHOT.jar bundles all third-party classes like a 
> shaded jar instead of only the CSI classes.  This generates error messages 
> for the YARN CLI:
> {code}
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/local/hadoop-3.3.0-SNAPSHOT/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/local/hadoop-3.3.0-SNAPSHOT/share/hadoop/yarn/hadoop-yarn-csi-3.3.0-SNAPSHOT.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9013) [GPG] fix order of steps cleaning Registry entries in ApplicationCleaner

2018-12-04 Thread Giovanni Matteo Fumarola (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709446#comment-16709446
 ] 

Giovanni Matteo Fumarola commented on YARN-9013:


Thanks [~botong]. +1 on [^YARN-9013-YARN-7402.v2.patch].

> [GPG] fix order of steps cleaning Registry entries in ApplicationCleaner
> 
>
> Key: YARN-9013
> URL: https://issues.apache.org/jira/browse/YARN-9013
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Major
> Attachments: YARN-9013-YARN-7402.v1.patch, 
> YARN-9013-YARN-7402.v2.patch
>
>
> ApplicationCleaner today deletes the entries for all finished (non-running) 
> applications in YarnRegistry using this logic:
>  # GPG gets the list of running applications from the Router.
>  # GPG gets the full list of applications in the registry.
>  # GPG deletes from the registry every app in 2 that’s not in 1.
> The problem is that jobs started between 1 and 2 meet the criteria in 
> 3, and thus get deleted by mistake. The fix/right order should be 2->1->3, 
> rather than 1->2->3.
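To make the reordering concrete, here is a minimal sketch of the corrected 
flow; the helper method names are illustrative only, not the actual GPG code.
{code:java}
// Corrected order 2 -> 1 -> 3: snapshot the registry first, then fetch the
// running apps. An app that starts after the registry snapshot is simply not
// in the snapshot, so it can no longer be deleted by mistake.
Set<ApplicationId> appsInRegistry = getAppsFromRegistry();    // step 2
Set<ApplicationId> runningApps = getRunningAppsFromRouter();  // step 1
for (ApplicationId app : appsInRegistry) {                    // step 3
  if (!runningApps.contains(app)) {
    deleteRegistryEntry(app);
  }
}
{code}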



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8870) [Submarine] Add submarine installation scripts

2018-12-04 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709356#comment-16709356
 ] 

Hudson commented on YARN-8870:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15561 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/15561/])
Revert "YARN-8870. [Submarine] Add submarine installation scripts. (Xun 
(wangda: rev 228156cfd1b474988bc4fedfbf7edddc87db41e3)
* (edit) hadoop-assemblies/src/main/resources/assemblies/hadoop-yarn-dist.xml
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/package/etcd/etcd.service
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/scripts/utils.sh
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/package/hadoop/container-executor.cfg
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/package/docker/daemon.json
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/scripts/environment.sh
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/package/calico/calico-node.service
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/install.conf
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/package/docker/docker.service
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/scripts/nvidia.sh
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/scripts/submarine.sh
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/scripts/docker.sh
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/scripts/hadoop.sh
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/scripts/menu.sh
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/install.sh
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/package/calico/calicoctl.cfg
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/package/submarine/submarine.sh
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/scripts/download-server.sh
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/scripts/etcd.sh
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/scripts/calico.sh
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/scripts/nvidia-docker.sh


> [Submarine] Add submarine installation scripts
> --
>
> Key: YARN-8870
> URL: https://issues.apache.org/jira/browse/YARN-8870
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Xun Liu
>Assignee: Xun Liu
>Priority: Critical
> Attachments: YARN-8870-addendum.008.patch, YARN-8870.001.patch, 
> YARN-8870.004.patch, YARN-8870.005.patch, YARN-8870.006.patch, 
> YARN-8870.007.patch, YARN-8870.009.patch, YARN-8870.010.patch, 
> YARN-8870.011.patch, YARN-8870.012.patch
>
>
> In order to reduce the difficulty of deploying the Hadoop {Submarine} runtime 
> environment (DNS, Docker, GPU, network, graphics card, operating system kernel 
> modifications and other components), I developed this installation script, 
> which provides one-click installation and can also be used to install, 
> uninstall, start, and stop individual components step by step.
>  
> design document: 
> [https://docs.google.com/document/d/1muCTGFuUXUvM4JaDYjKqX5liQEg-AsNgkxfLMIFxYHU/edit?usp=sharing]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8870) [Submarine] Add submarine installation scripts

2018-12-04 Thread Wangda Tan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709341#comment-16709341
 ] 

Wangda Tan commented on YARN-8870:
--

As we discussed offline, reverted the patch from branches. It's better to move 
such scripts outside of Hadoop core. 

> [Submarine] Add submarine installation scripts
> --
>
> Key: YARN-8870
> URL: https://issues.apache.org/jira/browse/YARN-8870
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Xun Liu
>Assignee: Xun Liu
>Priority: Critical
> Attachments: YARN-8870-addendum.008.patch, YARN-8870.001.patch, 
> YARN-8870.004.patch, YARN-8870.005.patch, YARN-8870.006.patch, 
> YARN-8870.007.patch, YARN-8870.009.patch, YARN-8870.010.patch, 
> YARN-8870.011.patch, YARN-8870.012.patch
>
>
> In order to reduce the difficulty of deploying the Hadoop {Submarine} runtime 
> environment (DNS, Docker, GPU, network, graphics card, operating system kernel 
> modifications and other components), I developed this installation script, 
> which provides one-click installation and can also be used to install, 
> uninstall, start, and stop individual components step by step.
>  
> design document: 
> [https://docs.google.com/document/d/1muCTGFuUXUvM4JaDYjKqX5liQEg-AsNgkxfLMIFxYHU/edit?usp=sharing]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8870) [Submarine] Add submarine installation scripts

2018-12-04 Thread Wangda Tan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-8870:
-
Target Version/s:   (was: 3.2.0)

> [Submarine] Add submarine installation scripts
> --
>
> Key: YARN-8870
> URL: https://issues.apache.org/jira/browse/YARN-8870
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Xun Liu
>Assignee: Xun Liu
>Priority: Critical
> Attachments: YARN-8870-addendum.008.patch, YARN-8870.001.patch, 
> YARN-8870.004.patch, YARN-8870.005.patch, YARN-8870.006.patch, 
> YARN-8870.007.patch, YARN-8870.009.patch, YARN-8870.010.patch, 
> YARN-8870.011.patch, YARN-8870.012.patch
>
>
> In order to reduce the difficulty of deploying the Hadoop {Submarine} runtime 
> environment (DNS, Docker, GPU, network, graphics card, operating system kernel 
> modifications and other components), I developed this installation script, 
> which provides one-click installation and can also be used to install, 
> uninstall, start, and stop individual components step by step.
>  
> design document: 
> [https://docs.google.com/document/d/1muCTGFuUXUvM4JaDYjKqX5liQEg-AsNgkxfLMIFxYHU/edit?usp=sharing]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8870) [Submarine] Add submarine installation scripts

2018-12-04 Thread Wangda Tan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-8870:
-
Fix Version/s: (was: 3.2.0)

> [Submarine] Add submarine installation scripts
> --
>
> Key: YARN-8870
> URL: https://issues.apache.org/jira/browse/YARN-8870
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Xun Liu
>Assignee: Xun Liu
>Priority: Critical
> Attachments: YARN-8870-addendum.008.patch, YARN-8870.001.patch, 
> YARN-8870.004.patch, YARN-8870.005.patch, YARN-8870.006.patch, 
> YARN-8870.007.patch, YARN-8870.009.patch, YARN-8870.010.patch, 
> YARN-8870.011.patch, YARN-8870.012.patch
>
>
> In order to reduce the difficulty of deploying the Hadoop {Submarine} runtime 
> environment (DNS, Docker, GPU, network, graphics card, operating system kernel 
> modifications and other components), I developed this installation script, 
> which provides one-click installation and can also be used to install, 
> uninstall, start, and stop individual components step by step.
>  
> design document: 
> [https://docs.google.com/document/d/1muCTGFuUXUvM4JaDYjKqX5liQEg-AsNgkxfLMIFxYHU/edit?usp=sharing]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9057) CSI jar file should not bundle third party dependencies

2018-12-04 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709320#comment-16709320
 ] 

Eric Yang commented on YARN-9057:
-

[~cheersyang] Thank you for the patch.  
HADOOP_HOME/share/hadoop/yarn/csi/hadoop-yarn-csi-3.3.0-SNAPSHOT.jar has the 
right content.

However, I get an error message when launching an application:

{code}
$ ./bin/yarn app -status abc
Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.NoClassDefFoundError: 
org/apache/hadoop/yarn/conf/YarnConfiguration
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
at java.lang.Class.getMethod0(Class.java:3018)
at java.lang.Class.getMethod(Class.java:1784)
at 
sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
Caused by: java.lang.ClassNotFoundException: 
org.apache.hadoop.yarn.conf.YarnConfiguration
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 7 more
{code}

When I look in HADOOP_HOME/share/hadoop/yarn, the hadoop-yarn-api-*.jar file is 
missing.  It was copied into:

{code}
$ tar tfvz hadoop-3.3.0-SNAPSHOT.tar.gz |grep yarn-api
-rw-rw-r-- eyang/eyang 3369775 2018-12-04 16:46 
hadoop-3.3.0-SNAPSHOT/share/hadoop/yarn/csi/lib/hadoop-yarn-api-3.3.0-SNAPSHOT.jar
{code}

This seems like unexpected behavior.

> CSI jar file should not bundle third party dependencies
> ---
>
> Key: YARN-9057
> URL: https://issues.apache.org/jira/browse/YARN-9057
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: build
>Affects Versions: 3.3.0
>Reporter: Eric Yang
>Assignee: Weiwei Yang
>Priority: Blocker
> Attachments: YARN-9057.001.patch
>
>
> hadoop-yarn-csi-3.3.0-SNAPSHOT.jar bundles all third-party classes like a 
> shaded jar instead of only the CSI classes.  This generates error messages 
> for the YARN CLI:
> {code}
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/local/hadoop-3.3.0-SNAPSHOT/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/local/hadoop-3.3.0-SNAPSHOT/share/hadoop/yarn/hadoop-yarn-csi-3.3.0-SNAPSHOT.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8937) Upgrade Curator version to 2.13.0 to fix ZK tests

2018-12-04 Thread Jason Lowe (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709293#comment-16709293
 ] 

Jason Lowe commented on YARN-8937:
--

Thanks for the excellent analysis!  +1 lgtm.  Committing this.

> Upgrade Curator version to 2.13.0 to fix ZK tests
> -
>
> Key: YARN-8937
> URL: https://issues.apache.org/jira/browse/YARN-8937
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.3.0
>Reporter: Jason Lowe
>Assignee: Akira Ajisaka
>Priority: Major
> Attachments: YARN-8937.01.patch
>
>
> TestLeaderElectorService hangs waiting for the TestingZooKeeperServer to 
> start and eventually gets killed by the surefire timeout.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5168) Add port mapping handling when docker container use bridge network

2018-12-04 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709246#comment-16709246
 ] 

Eric Yang commented on YARN-5168:
-

[~liuxun323] The patch looks good.  Is it possible to also expose this 
information in Application Attempts > Containers > Graph View and Grid View, in 
addition to the component instance view?  Thanks

> Add port mapping handling when docker container use bridge network
> --
>
> Key: YARN-5168
> URL: https://issues.apache.org/jira/browse/YARN-5168
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jun Gong
>Assignee: Xun Liu
>Priority: Major
>  Labels: Docker
> Attachments: YARN-5168.001.patch, YARN-5168.002.patch, 
> YARN-5168.003.patch, YARN-5168.004.patch, YARN-5168.005.patch, 
> YARN-5168.006.patch, YARN-5168.007.patch, YARN-5168.008.patch, 
> YARN-5168.009.patch, YARN-5168.010.patch
>
>
> YARN-4007 addresses different network setups when launching the docker 
> container. We need to support port mapping when the docker container uses a 
> bridge network.
> The following problems are what we faced:
> 1. Add "-P" to map the docker container's exposed ports automatically.
> 2. Add "-p" to let the user specify specific ports to map.
> 3. Add service registry support for the bridge network case, so apps can find 
> each other. It could be done outside of YARN, however it might be more 
> convenient to support it natively in YARN.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9071) NM and service AM don't have updated status for reinitialized containers

2018-12-04 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709176#comment-16709176
 ] 

Chandni Singh commented on YARN-9071:
-

As discussed offline, 

[~billie.rinaldi] I created YARN-9082 as a follow-up Jira to remove the delay 
in un-registering a metric.

[~eyang] I will put a fix on the YARN Service AM side to remove the IP address 
from the registry before reinitialization. Currently the default readiness 
check is for the presence of an IP, and it succeeds because the IP address 
is still there from the previous launch. If we remove the IP address before the 
reinit, the container will go into the READY state only once it has been 
successfully launched.

> NM and service AM don't have updated status for reinitialized containers
> 
>
> Key: YARN-9071
> URL: https://issues.apache.org/jira/browse/YARN-9071
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Billie Rinaldi
>Assignee: Chandni Singh
>Priority: Critical
> Attachments: YARN-9071.001.patch, YARN-9071.002.patch, 
> YARN-9071.003.patch, YARN-9071.004.patch, q.log
>
>
> Container resource monitoring is not stopped during the reinitialization 
> process, and this prevents the NM from obtaining updated process tree 
> information when the container starts running again. I observed a 
> reinitialized container go from RUNNING to REINITIALIZING to 
> REINITIALIZING_AWAITING_KILL to SCHEDULED to RUNNING. Container monitoring 
> was then started for a second time, but since the trackingContainers entry 
> had already been initialized for the container, ContainersMonitor skipped 
> finding the new PID and IP for the container. A possible solution would be to 
> stop the container monitoring in the reinitialization process so that the 
> process tree information would be initialized properly when monitoring is 
> restarted. When the same container was stopped by the NM later, the NM did 
> not kill the container, and the service AM received an unexpected event (stop 
> at reinitializing).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.

2018-12-04 Thread Wangda Tan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709178#comment-16709178
 ] 

Wangda Tan commented on YARN-8714:
--

Thanks [~tangzhankun], what I remember is that YARN doesn't support localizing a 
directory as a LocalResource, but I could be wrong as well. Hope you're correct 
:). 

Please keep us posted on your testing.

> [Submarine] Support files/tarballs to be localized for a training job.
> --
>
> Key: YARN-8714
> URL: https://issues.apache.org/jira/browse/YARN-8714
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Zhankun Tang
>Priority: Major
> Attachments: YARN-8714-WIP1-trunk-001.patch, 
> YARN-8714-WIP1-trunk-002.patch, YARN-8714-trunk.001.patch, 
> YARN-8714-trunk.002.patch
>
>
> See 
> [https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit#heading=h.vkxp9edl11m7],
>  {{job run --localization ...}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9082) Delay during unregistering metrics is unnecessary

2018-12-04 Thread Chandni Singh (JIRA)
Chandni Singh created YARN-9082:
---

 Summary: Delay during unregistering metrics is unnecessary
 Key: YARN-9082
 URL: https://issues.apache.org/jira/browse/YARN-9082
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Chandni Singh
Assignee: Chandni Singh


Discovered while debugging YARN-9071

Quoting [~billie.rinaldi]

{quote}

I looked at YARN-3619, where the unregistration delay was added. It seems like 
this was added because unregistration was performed in getMetrics, which was 
causing a ConcurrentModificationException. However, unregistration was moved 
from getMetrics into the finished method (in the same patch), and this leads me 
to believe that the delay is never needed. I'm inclined to think we should 
remove the delay entirely, but would like to hear other opinions.

{quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9071) NM and service AM don't have updated status for reinitialized containers

2018-12-04 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709089#comment-16709089
 ] 

Eric Yang commented on YARN-9071:
-

In my local testing, a container failed to start on node A and was moved 
to node B.  With patch 004, when performing an upgrade, the reinit will try to 
relaunch the container on node A.  With the default readiness check for an IP 
address, ContainerMonitor still holds the IP address of the previous instance of 
the container, without it being refreshed by the new instance.  The AM will 
incorrectly determine that the reinit of the container was successful, even 
though no actual container was launched.

> NM and service AM don't have updated status for reinitialized containers
> 
>
> Key: YARN-9071
> URL: https://issues.apache.org/jira/browse/YARN-9071
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Billie Rinaldi
>Assignee: Chandni Singh
>Priority: Critical
> Attachments: YARN-9071.001.patch, YARN-9071.002.patch, 
> YARN-9071.003.patch, YARN-9071.004.patch, q.log
>
>
> Container resource monitoring is not stopped during the reinitialization 
> process, and this prevents the NM from obtaining updated process tree 
> information when the container starts running again. I observed a 
> reinitialized container go from RUNNING to REINITIALIZING to 
> REINITIALIZING_AWAITING_KILL to SCHEDULED to RUNNING. Container monitoring 
> was then started for a second time, but since the trackingContainers entry 
> had already been initialized for the container, ContainersMonitor skipped 
> finding the new PID and IP for the container. A possible solution would be to 
> stop the container monitoring in the reinitialization process so that the 
> process tree information would be initialized properly when monitoring is 
> restarted. When the same container was stopped by the NM later, the NM did 
> not kill the container, and the service AM received an unexpected event (stop 
> at reinitializing).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9041) Performance Optimization of method FSPreemptionThread#identifyContainersToPreempt

2018-12-04 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709077#comment-16709077
 ] 

Hudson commented on YARN-9041:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15558 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/15558/])
YARN-9041. Performance Optimization of method (yufei: rev 
e89941fdbb3b382eeb487d32e5194909610ac334)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairSchedulerPreemption.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSPreemptionThread.java


> Performance Optimization of method 
> FSPreemptionThread#identifyContainersToPreempt
> -
>
> Key: YARN-9041
> URL: https://issues.apache.org/jira/browse/YARN-9041
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler, scheduler preemption
>Affects Versions: 3.1.1
>Reporter: Wanqiang Ji
>Assignee: Wanqiang Ji
>Priority: Major
> Fix For: 3.2.1
>
> Attachments: YARN-9041.001.patch, YARN-9041.002.patch, 
> YARN-9041.003.patch, YARN-9041.004.patch, YARN-9041.005.patch, 
> YARN-9041.006.patch, YARN-9041.007.patch
>
>
> In the FSPreemptionThread#identifyContainersToPreempt method, I suggest that when 
> this is AM preemption and locality relaxation is allowed, the search space, 
> which is currently expanded to all nodes, be changed to the remaining nodes. The 
> remaining nodes are equal to all nodes minus the potential nodes.
> The judging condition changes to:
>  # rr.getRelaxLocality()
>  # !ResourceRequest.isAnyLocation(rr.getResourceName())
>  # bestContainers != null
>  # bestContainers.numAMContainers > 0
> If my understanding deviates, please correct me. thx~
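For readers following along, the judging condition listed in the description 
corresponds roughly to the following sketch, based only on the names mentioned 
above (not the committed diff):
{code:java}
// Only widen the search when AM containers would otherwise be preempted for a
// request that allows relaxed locality, and even then search the remaining
// nodes (all nodes minus the potential nodes) rather than all nodes.
if (rr.getRelaxLocality()
    && !ResourceRequest.isAnyLocation(rr.getResourceName())
    && bestContainers != null
    && bestContainers.numAMContainers > 0) {
  // expand the search space to the remaining nodes only
}
{code}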



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9041) Performance Optimization of method FSPreemptionThread#identifyContainersToPreempt

2018-12-04 Thread Yufei Gu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709067#comment-16709067
 ] 

Yufei Gu commented on YARN-9041:


Committed to trunk. Thanks [~jiwq] for working on this. Thanks [~Steven Rand] 
for the review.

> Performance Optimization of method 
> FSPreemptionThread#identifyContainersToPreempt
> -
>
> Key: YARN-9041
> URL: https://issues.apache.org/jira/browse/YARN-9041
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler, scheduler preemption
>Affects Versions: 3.1.1
>Reporter: Wanqiang Ji
>Assignee: Wanqiang Ji
>Priority: Major
> Fix For: 3.2.1
>
> Attachments: YARN-9041.001.patch, YARN-9041.002.patch, 
> YARN-9041.003.patch, YARN-9041.004.patch, YARN-9041.005.patch, 
> YARN-9041.006.patch, YARN-9041.007.patch
>
>
> In the FSPreemptionThread#identifyContainersToPreempt method, I suggest that when 
> this is AM preemption and locality relaxation is allowed, the search space, 
> which is currently expanded to all nodes, be changed to the remaining nodes. The 
> remaining nodes are equal to all nodes minus the potential nodes.
> The judging condition changes to:
>  # rr.getRelaxLocality()
>  # !ResourceRequest.isAnyLocation(rr.getResourceName())
>  # bestContainers != null
>  # bestContainers.numAMContainers > 0
> If my understanding deviates, please correct me. thx~



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9041) Performance Optimization of method FSPreemptionThread#identifyContainersToPreempt

2018-12-04 Thread Yufei Gu (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated YARN-9041:
---
Fix Version/s: 3.2.1

> Performance Optimization of method 
> FSPreemptionThread#identifyContainersToPreempt
> -
>
> Key: YARN-9041
> URL: https://issues.apache.org/jira/browse/YARN-9041
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler, scheduler preemption
>Affects Versions: 3.1.1
>Reporter: Wanqiang Ji
>Assignee: Wanqiang Ji
>Priority: Major
> Fix For: 3.2.1
>
> Attachments: YARN-9041.001.patch, YARN-9041.002.patch, 
> YARN-9041.003.patch, YARN-9041.004.patch, YARN-9041.005.patch, 
> YARN-9041.006.patch, YARN-9041.007.patch
>
>
> In the FSPreemptionThread#identifyContainersToPreempt method, I suggest that when 
> this is AM preemption and locality relaxation is allowed, the search space, 
> which is currently expanded to all nodes, be changed to the remaining nodes. The 
> remaining nodes are equal to all nodes minus the potential nodes.
> The judging condition changes to:
>  # rr.getRelaxLocality()
>  # !ResourceRequest.isAnyLocation(rr.getResourceName())
>  # bestContainers != null
>  # bestContainers.numAMContainers > 0
> If my understanding deviates, please correct me. thx~



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9041) Performance Optimization of method FSPreemptionThread#identifyContainersToPreempt

2018-12-04 Thread Yufei Gu (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated YARN-9041:
---
Summary: Performance Optimization of method 
FSPreemptionThread#identifyContainersToPreempt  (was: Performance Optimization 
of FSPreemptionThread#identifyContainersToPreempt method)

> Performance Optimization of method 
> FSPreemptionThread#identifyContainersToPreempt
> -
>
> Key: YARN-9041
> URL: https://issues.apache.org/jira/browse/YARN-9041
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler, scheduler preemption
>Affects Versions: 3.1.1
>Reporter: Wanqiang Ji
>Assignee: Wanqiang Ji
>Priority: Major
> Attachments: YARN-9041.001.patch, YARN-9041.002.patch, 
> YARN-9041.003.patch, YARN-9041.004.patch, YARN-9041.005.patch, 
> YARN-9041.006.patch, YARN-9041.007.patch
>
>
> In the FSPreemptionThread#identifyContainersToPreempt method, I suggest that when 
> this is AM preemption and locality relaxation is allowed, the search space, 
> which is currently expanded to all nodes, be changed to the remaining nodes. The 
> remaining nodes are equal to all nodes minus the potential nodes.
> The judging condition changes to:
>  # rr.getRelaxLocality()
>  # !ResourceRequest.isAnyLocation(rr.getResourceName())
>  # bestContainers != null
>  # bestContainers.numAMContainers > 0
> If my understanding deviates, please correct me. thx~



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9041) Performance Optimization of FSPreemptionThread#identifyContainersToPreempt method

2018-12-04 Thread Yufei Gu (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated YARN-9041:
---
Summary: Performance Optimization of 
FSPreemptionThread#identifyContainersToPreempt method  (was: Optimize 
FSPreemptionThread#identifyContainersToPreempt method)

> Performance Optimization of FSPreemptionThread#identifyContainersToPreempt 
> method
> -
>
> Key: YARN-9041
> URL: https://issues.apache.org/jira/browse/YARN-9041
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler, scheduler preemption
>Affects Versions: 3.1.1
>Reporter: Wanqiang Ji
>Assignee: Wanqiang Ji
>Priority: Major
> Attachments: YARN-9041.001.patch, YARN-9041.002.patch, 
> YARN-9041.003.patch, YARN-9041.004.patch, YARN-9041.005.patch, 
> YARN-9041.006.patch, YARN-9041.007.patch
>
>
> In the FSPreemptionThread#identifyContainersToPreempt method, I suggest that when 
> this is AM preemption and locality relaxation is allowed, the search space, 
> which is currently expanded to all nodes, be changed to the remaining nodes. The 
> remaining nodes are equal to all nodes minus the potential nodes.
> The judging condition changes to:
>  # rr.getRelaxLocality()
>  # !ResourceRequest.isAnyLocation(rr.getResourceName())
>  # bestContainers != null
>  # bestContainers.numAMContainers > 0
> If my understanding deviates, please correct me. thx~



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9071) NM and service AM don't have updated status for reinitialized containers

2018-12-04 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709054#comment-16709054
 ] 

Hadoop QA commented on YARN-9071:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  5s{color} 
| {color:red} YARN-9071 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-9071 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/22778/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> NM and service AM don't have updated status for reinitialized containers
> 
>
> Key: YARN-9071
> URL: https://issues.apache.org/jira/browse/YARN-9071
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Billie Rinaldi
>Assignee: Chandni Singh
>Priority: Critical
> Attachments: YARN-9071.001.patch, YARN-9071.002.patch, 
> YARN-9071.003.patch, YARN-9071.004.patch, q.log
>
>
> Container resource monitoring is not stopped during the reinitialization 
> process, and this prevents the NM from obtaining updated process tree 
> information when the container starts running again. I observed a 
> reinitialized container go from RUNNING to REINITIALIZING to 
> REINITIALIZING_AWAITING_KILL to SCHEDULED to RUNNING. Container monitoring 
> was then started for a second time, but since the trackingContainers entry 
> had already been initialized for the container, ContainersMonitor skipped 
> finding the new PID and IP for the container. A possible solution would be to 
> stop the container monitoring in the reinitialization process so that the 
> process tree information would be initialized properly when monitoring is 
> restarted. When the same container was stopped by the NM later, the NM did 
> not kill the container, and the service AM received an unexpected event (stop 
> at reinitializing).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9071) NM and service AM don't have updated status for reinitialized containers

2018-12-04 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-9071:

Attachment: q.log

> NM and service AM don't have updated status for reinitialized containers
> 
>
> Key: YARN-9071
> URL: https://issues.apache.org/jira/browse/YARN-9071
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Billie Rinaldi
>Assignee: Chandni Singh
>Priority: Critical
> Attachments: YARN-9071.001.patch, YARN-9071.002.patch, 
> YARN-9071.003.patch, YARN-9071.004.patch, q.log
>
>
> Container resource monitoring is not stopped during the reinitialization 
> process, and this prevents the NM from obtaining updated process tree 
> information when the container starts running again. I observed a 
> reinitialized container go from RUNNING to REINITIALIZING to 
> REINITIALIZING_AWAITING_KILL to SCHEDULED to RUNNING. Container monitoring 
> was then started for a second time, but since the trackingContainers entry 
> had already been initialized for the container, ContainersMonitor skipped 
> finding the new PID and IP for the container. A possible solution would be to 
> stop the container monitoring in the reinitialization process so that the 
> process tree information would be initialized properly when monitoring is 
> restarted. When the same container was stopped by the NM later, the NM did 
> not kill the container, and the service AM received an unexpected event (stop 
> at reinitializing).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9071) NM and service AM don't have updated status for reinitialized containers

2018-12-04 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709037#comment-16709037
 ] 

Eric Yang commented on YARN-9071:
-

[~csingh] Something is strange with this patch.  This patch impacts upgrade; 
see the attached log file (q.log).
It looks like the container transitioned from STABLE to START, localize, STOP.  
The sequence seems wrong.

> NM and service AM don't have updated status for reinitialized containers
> 
>
> Key: YARN-9071
> URL: https://issues.apache.org/jira/browse/YARN-9071
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Billie Rinaldi
>Assignee: Chandni Singh
>Priority: Critical
> Attachments: YARN-9071.001.patch, YARN-9071.002.patch, 
> YARN-9071.003.patch, YARN-9071.004.patch
>
>
> Container resource monitoring is not stopped during the reinitialization 
> process, and this prevents the NM from obtaining updated process tree 
> information when the container starts running again. I observed a 
> reinitialized container go from RUNNING to REINITIALIZING to 
> REINITIALIZING_AWAITING_KILL to SCHEDULED to RUNNING. Container monitoring 
> was then started for a second time, but since the trackingContainers entry 
> had already been initialized for the container, ContainersMonitor skipped 
> finding the new PID and IP for the container. A possible solution would be to 
> stop the container monitoring in the reinitialization process so that the 
> process tree information would be initialized properly when monitoring is 
> restarted. When the same container was stopped by the NM later, the NM did 
> not kill the container, and the service AM received an unexpected event (stop 
> at reinitializing).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9071) NM and service AM don't have updated status for reinitialized containers

2018-12-04 Thread Billie Rinaldi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709028#comment-16709028
 ] 

Billie Rinaldi commented on YARN-9071:
--

I looked at YARN-3619, where the unregistration delay was added. It seems like 
this was added because unregistration was performed in getMetrics, which was 
causing a ConcurrentModificationException. However, unregistration was moved 
from getMetrics into the finished method (in the same patch), and this leads me 
to believe that the delay is never needed. I'm inclined to think we should 
remove the delay entirely, but would like to hear other opinions.

> NM and service AM don't have updated status for reinitialized containers
> 
>
> Key: YARN-9071
> URL: https://issues.apache.org/jira/browse/YARN-9071
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Billie Rinaldi
>Assignee: Chandni Singh
>Priority: Critical
> Attachments: YARN-9071.001.patch, YARN-9071.002.patch, 
> YARN-9071.003.patch, YARN-9071.004.patch
>
>
> Container resource monitoring is not stopped during the reinitialization 
> process, and this prevents the NM from obtaining updated process tree 
> information when the container starts running again. I observed a 
> reinitialized container go from RUNNING to REINITIALIZING to 
> REINITIALIZING_AWAITING_KILL to SCHEDULED to RUNNING. Container monitoring 
> was then started for a second time, but since the trackingContainers entry 
> had already been initialized for the container, ContainersMonitor skipped 
> finding the new PID and IP for the container. A possible solution would be to 
> stop the container monitoring in the reinitialization process so that the 
> process tree information would be initialized properly when monitoring is 
> restarted. When the same container was stopped by the NM later, the NM did 
> not kill the container, and the service AM received an unexpected event (stop 
> at reinitializing).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.

2018-12-04 Thread Zhankun Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16708874#comment-16708874
 ] 

Zhankun Tang edited comment on YARN-8714 at 12/4/18 3:32 PM:
-

[~leftnoteasy], [~liuxun323], while refining the patch, I found that the YARN 
localizer seems to be able to localize a *remote directory (hdfs, s3, etc.)*. In 
FSDownload.java#downloadAndUnpack, it uses "FileUtil.copy", which can handle a 
directory.

Relying on this can greatly simplify our implementation: there is no need to 
download the remote dir or zip the local dir anymore.

We may still need a configuration to limit the size of the remote file/dir to be 
localized to the container (see the sketch below).

I will verify and update the patch tomorrow.
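One possible shape for that size limit, as a hedged sketch only (the 
configuration key is made up; FileSystem#getContentSummary works for both 
files and directories):
{code:java}
// Sketch: refuse to localize a remote file/dir larger than a configured cap.
long limitBytes = conf.getLong("submarine.localization.max-size-bytes", // assumed key
    2L * 1024 * 1024 * 1024);
ContentSummary summary = fs.getContentSummary(remotePath);
if (summary.getLength() > limitBytes) {
  throw new IOException("Refusing to localize " + remotePath + ": "
      + summary.getLength() + " bytes exceeds the limit of " + limitBytes + " bytes");
}
{code}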


was (Author: tangzhankun):
[~leftnoteasy], [~liuxun323], while refining the patch, I found that the YARN 
localizer seems to be able to localize a *remote directory (hdfs, s3, etc.)*. In 
FSDownload.java#downloadAndUnpack, it uses "FileUtil.copy", which can handle a 
directory.

Relying on this can greatly simplify our implementation: there is no need to 
download the remote dir or zip the local dir anymore. I will verify and update 
the patch tomorrow.

> [Submarine] Support files/tarballs to be localized for a training job.
> --
>
> Key: YARN-8714
> URL: https://issues.apache.org/jira/browse/YARN-8714
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Zhankun Tang
>Priority: Major
> Attachments: YARN-8714-WIP1-trunk-001.patch, 
> YARN-8714-WIP1-trunk-002.patch, YARN-8714-trunk.001.patch, 
> YARN-8714-trunk.002.patch
>
>
> See 
> [https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit#heading=h.vkxp9edl11m7],
>  {{job run --localization ...}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.

2018-12-04 Thread Zhankun Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16708874#comment-16708874
 ] 

Zhankun Tang edited comment on YARN-8714 at 12/4/18 3:29 PM:
-

[~leftnoteasy], [~liuxun323], while refining the patch, I found that the YARN 
localizer seems to be able to localize a *remote directory (hdfs, s3, etc.)*. In 
FSDownload.java#downloadAndUnpack, it uses "FileUtil.copy", which can handle a 
directory.

Relying on this can greatly simplify our implementation: there is no need to 
download the remote dir or zip the local dir anymore. I will verify and update 
the patch tomorrow.


was (Author: tangzhankun):
[~leftnoteasy], [~liuxun323], while refining the patch, I found that the YARN 
localizer seems to be able to localize a *remote directory*. In 
FSDownload.java#downloadAndUnpack, it uses "FileUtil.copy", which can handle a 
directory.

Relying on this can greatly simplify our implementation: there is no need to 
download the remote dir or zip the local dir anymore. I will verify and update 
the patch tomorrow.

> [Submarine] Support files/tarballs to be localized for a training job.
> --
>
> Key: YARN-8714
> URL: https://issues.apache.org/jira/browse/YARN-8714
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Zhankun Tang
>Priority: Major
> Attachments: YARN-8714-WIP1-trunk-001.patch, 
> YARN-8714-WIP1-trunk-002.patch, YARN-8714-trunk.001.patch, 
> YARN-8714-trunk.002.patch
>
>
> See 
> [https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit#heading=h.vkxp9edl11m7],
>  {{job run --localization ...}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.

2018-12-04 Thread Zhankun Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16708874#comment-16708874
 ] 

Zhankun Tang commented on YARN-8714:


[~leftnoteasy], [~liuxun323], while refining the patch, I found that the YARN 
localizer seems to be able to localize a *remote directory*. In 
FSDownload.java#downloadAndUnpack, it uses "FileUtil.copy", which can handle a 
directory.

Relying on this can greatly simplify our implementation: there is no need to 
download the remote dir or zip the local dir anymore. I will verify and update 
the patch tomorrow.

> [Submarine] Support files/tarballs to be localized for a training job.
> --
>
> Key: YARN-8714
> URL: https://issues.apache.org/jira/browse/YARN-8714
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Zhankun Tang
>Priority: Major
> Attachments: YARN-8714-WIP1-trunk-001.patch, 
> YARN-8714-WIP1-trunk-002.patch, YARN-8714-trunk.001.patch, 
> YARN-8714-trunk.002.patch
>
>
> See 
> [https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit#heading=h.vkxp9edl11m7],
>  {{job run --localization ...}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6523) Newly retrieved security Tokens are sent as part of each heartbeat to each node from RM which is not desirable in large cluster

2018-12-04 Thread Jason Lowe (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16708868#comment-16708868
 ] 

Jason Lowe commented on YARN-6523:
--

Thanks for updating the patch!   If a unit test just added in a patch fails in 
the precommit build then there's usually something wrong with the test even if 
it passes locally.  It's likely to be a racy test, as the precommit builds are 
notorious for running unit tests with a different timing than seen locally.

The problem with these tests is that they still aren't really unit tests but 
rather integration tests that spin up an RM and an NM.  The first test
should only create a DelegationTokenRenewer with a mock RMContext and verify 
that RMContext#incrTokenSequenceNo is called when the appropriate token is 
created and when it is renewed.  No server start ups, heartbeats, etc.  All of 
that tends to be racy as async dispatchers are usually involved making it hard 
to know when something is done processing and therefore safe to examine for 
assertions.  DelegationTokenRenewer#addApplicationSync can be used to test the 
case where a token is created, and we can make 
DelegationTokenRenewer#requestNewHdfsDelegationTokenIfNeeded package-private so 
we can call it from a test with a token that needs to be renewed to test the 
renewal case.
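
As a rough illustration of that style of test, here is a minimal sketch. 
RMContext#incrTokenSequenceNo is assumed to be the method the patch adds, and 
setRMContext is assumed to be the wiring point, as used in the RM; the 
token/application setup is only indicated in comments, not implemented.

{code:java}
// Mock-based unit test sketch: no RM/NM startup, no heartbeats, just verify
// the mocked RMContext sees the sequence-number bump.
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;

import org.apache.hadoop.yarn.server.resourcemanager.RMContext;
import org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer;
import org.junit.Test;

public class TestDelegationTokenRenewerSequenceNo {

  @Test
  public void testTokenSequenceNoIncrementedOnNewToken() throws Exception {
    RMContext rmContext = mock(RMContext.class);

    DelegationTokenRenewer renewer = new DelegationTokenRenewer();
    renewer.setRMContext(rmContext);

    // ... register an application with a delegation token here, e.g. via
    // DelegationTokenRenewer#addApplicationSync(...), as suggested above ...

    // Assumption: incrTokenSequenceNo is added by the patch under review.
    verify(rmContext).incrTokenSequenceNo();
  }
}
{code}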

The second test is meant to verify that the ResourceTrackerService properly 
handles the token sequence number, so there should be a unit test that checks 
the system credentials are sent when the token sequence number mismatches and 
not sent when they match.  That test should be in TestResourceTrackerService, 
since that's what we're testing.  Passing a mock RMContext to the 
ResourceTrackerService when constructing it for the test makes it easy to 
manipulate the context and the credentials payload, and to verify that the 
credentials are only sent when expected.

NodeHeartbeatResponse should get/set a Collection rather than a List.  That 
allows ResourceTrackerService to pass the values of its tracking map directly 
rather than needing to convert it into a list first.
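
Roughly, the shape being suggested (illustrative names only, not the actual 
NodeHeartbeatResponse API):

{code:java}
// Illustrative only: exposing a Collection lets the caller hand over
// trackingMap.values() directly, without first copying into a List, e.g.
//   response.setSystemCredentialsForApps(trackingMap.values());
import java.util.Collection;

interface HeartbeatResponseSketch<C> {
  Collection<C> getSystemCredentialsForApps();

  void setSystemCredentialsForApps(Collection<C> systemCredentials);
}
{code}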

Typo in NodeHeartbeatResponse comment: "logAggreations"

NodeHeartbeatResponsePBImpl#setSystemCredentialsForApps should pass the 
collection directly to the ArrayList constructor so it doesn't have to guess at 
the initial size of the array then immediately discard it to reallocate a new 
one when the collection is larger than the initial guess.  Passing directly to 
the constructor allows ArrayList to allocate the correct array size the first 
time and reduces unnecessary garbage.
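
A small illustration of that sizing point, with placeholder names rather than 
the actual PBImpl members:

{code:java}
// new ArrayList<>(creds) allocates the backing array at the right size once,
// instead of starting from a default-sized array and reallocating (and
// discarding) it when the collection turns out to be larger.
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

class CredentialsHolder<T> {
  private List<T> systemCredentials;

  void setSystemCredentials(Collection<T> creds) {
    this.systemCredentials = new ArrayList<>(creds);
  }
}
{code}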

Nit: The name "systemCredentialsForAppsProto" in NodeHeartbeatResponsePBImpl 
implies it is a single proto rather than a collection of multiple.  Maybe just 
"systemCredentials"?

YarnServerBuilderUtils should pass the desired capacity to the ArrayList or 
HashMap constructor since it's trivial to compute and eliminates the 
possibility of needing to resize the collection due to a poor initial guess in 
the default constructor.


> Newly retrieved security Tokens are sent as part of each heartbeat to each 
> node from RM which is not desirable in large cluster
> ---
>
> Key: YARN-6523
> URL: https://issues.apache.org/jira/browse/YARN-6523
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: RM
>Affects Versions: 2.8.0, 2.7.3
>Reporter: Naganarasimha G R
>Assignee: Manikandan R
>Priority: Major
> Attachments: YARN-6523.001.patch, YARN-6523.002.patch, 
> YARN-6523.003.patch, YARN-6523.004.patch, YARN-6523.005.patch, 
> YARN-6523.006.patch, YARN-6523.007.patch, YARN-6523.008.patch, 
> YARN-6523.009.patch
>
>
> Currently, as part of the heartbeat response, the RM sets all applications' 
> tokens even though all applications might not be active on the node. On top 
> of that, NodeHeartbeatResponsePBImpl converts the tokens for each app into a 
> SystemCredentialsForAppsProto. Hence, for each node and each heartbeat, too 
> many SystemCredentialsForAppsProto objects were getting created.
> We hit an OOM while testing 2000 concurrent apps on a 500-node cluster with 
> 8GB RAM configured for the RM.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9057) CSI jar file should not bundle third party dependencies

2018-12-04 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16708658#comment-16708658
 ] 

Hadoop QA commented on YARN-9057:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
20s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
50m  1s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
20s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
3s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 28s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
24s{color} | {color:green} hadoop-assemblies in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
43s{color} | {color:green} hadoop-yarn-csi in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
39s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 85m 24s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9057 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12950539/YARN-9057.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  xml  |
| uname | Linux eba0bb6a87be 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 
5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / de42555 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/22777/testReport/ |
| Max. process+thread count | 339 (vs. ulimit of 1) |
| modules | C: hadoop-assemblies 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-csi U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/22777/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> CSI jar

[jira] [Commented] (YARN-8960) [Submarine] Can't get submarine service status using the command of "yarn app -status" under security environment

2018-12-04 Thread Zac Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16708652#comment-16708652
 ] 

Zac Zhou commented on YARN-8960:


I think it should be ok. [~leftnoteasy], any comments?

> [Submarine] Can't get submarine service status using the command of "yarn app 
> -status" under security environment
> -
>
> Key: YARN-8960
> URL: https://issues.apache.org/jira/browse/YARN-8960
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zac Zhou
>Assignee: Zac Zhou
>Priority: Major
> Fix For: 3.3.0, 3.2.1
>
> Attachments: YARN-8960.001.patch, YARN-8960.002.patch, 
> YARN-8960.003.patch, YARN-8960.004.patch, YARN-8960.005.patch, 
> YARN-8960.006.patch, YARN-8960.007.patch
>
>
> After submitting a submarine job, we tried to get service status using the 
> following command:
> yarn app -status ${service_name}
> But we got the following error:
> HTTP error code : 500
>  
> The stack in resourcemanager log is :
> {code}
> ERROR org.apache.hadoop.yarn.service.webapp.ApiServer: Get service failed: {}
> java.lang.reflect.UndeclaredThrowableException
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1748)
>  at 
> org.apache.hadoop.yarn.service.webapp.ApiServer.getServiceFromClient(ApiServer.java:800)
>  at 
> org.apache.hadoop.yarn.service.webapp.ApiServer.getService(ApiServer.java:186)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ...
> Caused by: org.apache.hadoop.yarn.exceptions.YarnException: No principal 
> specified in the persisted service definition, fail to connect to AM.
>  at 
> org.apache.hadoop.yarn.service.client.ServiceClient.createAMProxy(ServiceClient.java:1500)
>  at 
> org.apache.hadoop.yarn.service.client.ServiceClient.getStatus(ServiceClient.java:1376)
>  at 
> org.apache.hadoop.yarn.service.webapp.ApiServer.lambda$getServiceFromClient$4(ApiServer.java:804)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>  ... 68 more
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9001) [Submarine] Use AppAdminClient instead of ServiceClient to submit jobs

2018-12-04 Thread Zac Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16708649#comment-16708649
 ] 

Zac Zhou commented on YARN-9001:


Yup, I think it can be applied to 3.2.0, since this patch uses APIs from 3.1.0. 
It should be ok~

> [Submarine] Use AppAdminClient instead of ServiceClient to submit jobs
> --
>
> Key: YARN-9001
> URL: https://issues.apache.org/jira/browse/YARN-9001
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zac Zhou
>Assignee: Zac Zhou
>Priority: Major
> Fix For: 3.3.0, 3.2.1
>
> Attachments: YARN-9001-branch-3.2.001.patch, YARN-9001.001.patch, 
> YARN-9001.002.patch, YARN-9001.003.patch, YARN-9001.004.patch, 
> YARN-9001.005.patch
>
>
> For now, Submarine submits a service to YARN by using ServiceClient. We should 
> change it to AppAdminClient.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9057) CSI jar file should not bundle third party dependencies

2018-12-04 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16708581#comment-16708581
 ] 

Weiwei Yang commented on YARN-9057:
---

Attached a patch to remove the shading code; yarn-csi now copies its 
dependencies to share/hadoop/yarn/csi/lib, so it is self-contained and runs 
with its own classpath. I have tried using an AUX service to launch the 
service, and it works fine.

[~sunilg], [~ste...@apache.org], [~eyang], pls help to review.

Thanks.

> CSI jar file should not bundle third party dependencies
> ---
>
> Key: YARN-9057
> URL: https://issues.apache.org/jira/browse/YARN-9057
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: build
>Affects Versions: 3.3.0
>Reporter: Eric Yang
>Assignee: Weiwei Yang
>Priority: Blocker
> Attachments: YARN-9057.001.patch
>
>
> hadoop-yarn-csi-3.3.0-SNAPSHOT.jar bundles all third party classes like a 
> shaded jar instead of CSI only classes.  This is generating error messages 
> for YARN cli:
> {code}
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/local/hadoop-3.3.0-SNAPSHOT/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/local/hadoop-3.3.0-SNAPSHOT/share/hadoop/yarn/hadoop-yarn-csi-3.3.0-SNAPSHOT.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9057) CSI jar file should not bundle third party dependencies

2018-12-04 Thread Weiwei Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-9057:
--
Attachment: YARN-9057.001.patch

> CSI jar file should not bundle third party dependencies
> ---
>
> Key: YARN-9057
> URL: https://issues.apache.org/jira/browse/YARN-9057
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: build
>Affects Versions: 3.3.0
>Reporter: Eric Yang
>Assignee: Weiwei Yang
>Priority: Blocker
> Attachments: YARN-9057.001.patch
>
>
> hadoop-yarn-csi-3.3.0-SNAPSHOT.jar bundles all third party classes like a 
> shaded jar instead of CSI only classes.  This is generating error messages 
> for YARN cli:
> {code}
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/local/hadoop-3.3.0-SNAPSHOT/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/local/hadoop-3.3.0-SNAPSHOT/share/hadoop/yarn/hadoop-yarn-csi-3.3.0-SNAPSHOT.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9057) CSI jar file should not bundle third party dependencies

2018-12-04 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16708496#comment-16708496
 ] 

Steve Loughran commented on YARN-9057:
--

bq. it seems to create more problems than the ones it fixed. 

Afraid so. 

General practise in hadoop-*: unshaded in all our cross references, moving to 
shaded for public artifacts (which we still need to do for the object stores). 
And we dream of a java9-only world...

> CSI jar file should not bundle third party dependencies
> ---
>
> Key: YARN-9057
> URL: https://issues.apache.org/jira/browse/YARN-9057
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: build
>Affects Versions: 3.3.0
>Reporter: Eric Yang
>Assignee: Weiwei Yang
>Priority: Blocker
>
> hadoop-yarn-csi-3.3.0-SNAPSHOT.jar bundles all third party classes like a 
> shaded jar instead of CSI only classes.  This is generating error messages 
> for YARN cli:
> {code}
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/local/hadoop-3.3.0-SNAPSHOT/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/local/hadoop-3.3.0-SNAPSHOT/share/hadoop/yarn/hadoop-yarn-csi-3.3.0-SNAPSHOT.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-7897) Invalid NM log link published on Yarn UI when container fails

2018-12-04 Thread Akhil PB (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akhil PB resolved YARN-7897.

Resolution: Not A Bug

UI2 has no bug; the log link is displayed from the ATSv2 response data. If the 
log link is not available in the ATSv2 response, UI2 will display N/A.

> Invalid NM log link published on Yarn UI when container fails
> -
>
> Key: YARN-7897
> URL: https://issues.apache.org/jira/browse/YARN-7897
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Yesha Vora
>Assignee: Akhil PB
>Priority: Major
> Attachments: Screen Shot 2018-02-05 at 4.52.59 PM.png
>
>
> Steps:
> 1) Launch the Httpd example via the REST API in insecure mode
> 2) container_e04_1517875972784_0001_01_02 fails with "Unable to find 
> image 'centos/httpd-24-centos7:latest'"
> 3) Go to RM UI2 to debug the issue.
> The YARN app attempt page has incorrect values for Logs and Nodemanager UI
> Logs = N/A
> Nodemanager UI = http://nmhost:0



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-8230) [UI2] Attempt Info page url shows NA for several fields for container info

2018-12-04 Thread Akhil PB (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akhil PB resolved YARN-8230.

Resolution: Not A Bug

It is working as expected. The UI displays the data if available, otherwise N/A.

> [UI2] Attempt Info page url shows NA for several fields for container info
> --
>
> Key: YARN-8230
> URL: https://issues.apache.org/jira/browse/YARN-8230
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn, yarn-ui-v2
>Reporter: Sumana Sathish
>Assignee: Akhil PB
>Priority: Critical
>
> 1. Click on any application
> 2. Click on the appAttempt present 
> 3. Click on grid View
> 4. It shows container Info. But logs / nodemanager / and several fields show 
> NA, with finished time as Invalid



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8918) [Submarine] Correct method usage of str.subString in CliUtils

2018-12-04 Thread Zhankun Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16708338#comment-16708338
 ] 

Zhankun Tang commented on YARN-8918:


[~sunilg], it's a minor change, not important.

> [Submarine] Correct method usage of str.subString in CliUtils
> -
>
> Key: YARN-8918
> URL: https://issues.apache.org/jira/browse/YARN-8918
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhankun Tang
>Assignee: Zhankun Tang
>Priority: Minor
> Fix For: 3.3.0, 3.2.1
>
> Attachments: YARN-8918-trunk.001.patch, YARN-8918-trunk.002.patch, 
> YARN-8918-trunk.003.patch
>
>
> In CliUtils.java (line 74), there's an incorrect code block:
> {code:java}
> if (resourcesStr.endsWith("]")) {
>  resourcesStr = resourcesStr.substring(0, resourcesStr.length());
> }{code}
> The above if block will execute "resourcesStr = resourcesStr". It should be 
> "length() - 1".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org