[jira] [Commented] (YARN-9057) CSI jar file should not bundle third party dependencies
[ https://issues.apache.org/jira/browse/YARN-9057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709684#comment-16709684 ] Hadoop QA commented on YARN-9057: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 52m 33s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 46s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 22s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 16m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 17s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 21s{color} | {color:green} hadoop-assemblies in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 43s{color} | {color:green} hadoop-yarn-csi in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 40s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 89m 5s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-9057 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12950647/YARN-9057.002.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml | | uname | Linux f34a21f8b6ff 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 228156c | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/22781/testReport/ | | Max. process+thread count | 332 (vs. ulimit of 1) | | modules | C: hadoop-assemblies hadoop-yarn-project/hadoop-yarn/hadoop-yarn-csi U: . | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/22781/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > CSI jar
[jira] [Commented] (YARN-8914) Add xtermjs to YARN UI2
[ https://issues.apache.org/jira/browse/YARN-8914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709647#comment-16709647 ] Akhil PB commented on YARN-8914: [~eyang] Please let me know the steps to make the terminal work; I was unable to enter input because the terminal was frozen. > Add xtermjs to YARN UI2 > --- > > Key: YARN-8914 > URL: https://issues.apache.org/jira/browse/YARN-8914 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn-ui-v2 >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-8914.001.patch, YARN-8914.002.patch, > YARN-8914.003.patch, YARN-8914.004.patch, YARN-8914.005.patch, > YARN-8914.006.patch, YARN-8914.007.patch, YARN-8914.008.patch, > YARN-8914.009.patch > > > In the container listing from UI2, we can add a link to connect to a docker > container using xtermjs. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-8914) Add xtermjs to YARN UI2
[ https://issues.apache.org/jira/browse/YARN-8914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709640#comment-16709640 ] Akhil PB edited comment on YARN-8914 at 12/5/18 6:15 AM: - [~eyang] Please find the v9 patch [^YARN-8914.009.patch] with the latest UI2 changes. The issue was in how requestedUser was accessed: the code snippet {{this.get('requestedUser')}} should be {{self.get('requestedUser')}}. The following UI2-related changes were made in the v9 patch. # Removed unused DOM access and termLink code from {{models/yarn-container.js}} and {{serializers/yarn-container.js}}. # Removed the Terminal column from the attempts table in {{components/timeline-view.js}}. # Fixed requestedUser issues in {{components/timeline-view.js}} and {{templates/components/container-table.hbs}}. Please verify that all your changes are present in the v9 patch. was (Author: akhilpb): [~eyang] Please find the v9 patch [^YARN-8914.009.patch] with the latest UI2 changes. The issue was in how requestedUser was accessed: the code snippet {{this.get('requestedUser')}} should be {{self.get('requestedUser')}}. The following UI2-related changes were made. # Removed unused DOM access and termLink code from {{models/yarn-container.js}} and {{serializers/yarn-container.js}}. # Removed the Terminal column from the attempts table in {{components/timeline-view.js}}. # Fixed requestedUser issues in {{components/timeline-view.js}} and {{templates/components/container-table.hbs}}. Please verify that all your changes are present in the v9 patch. 
[jira] [Comment Edited] (YARN-8914) Add xtermjs to YARN UI2
[ https://issues.apache.org/jira/browse/YARN-8914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709640#comment-16709640 ] Akhil PB edited comment on YARN-8914 at 12/5/18 6:16 AM: - [~eyang] Please find the v9 patch [^YARN-8914.009.patch] with the latest UI2 changes. The issue was in how requestedUser was accessed: the code snippet {{this.get('requestedUser')}} should be {{self.get('requestedUser')}}. The following UI2-related changes were made in the v9 patch. # Removed unused DOM access and termLink code from {{models/yarn-container.js}} and {{serializers/yarn-container.js}}; we only need nodeHttpAddress, containerId and user for termLink. # Removed the Terminal column from the attempts table in {{components/timeline-view.js}}. # Fixed requestedUser issues in {{components/timeline-view.js}} and {{templates/components/container-table.hbs}}. Please verify that all your changes are present in the v9 patch. was (Author: akhilpb): [~eyang] Please find the v9 patch [^YARN-8914.009.patch] with the latest UI2 changes. The issue was in how requestedUser was accessed: the code snippet {{this.get('requestedUser')}} should be {{self.get('requestedUser')}}. The following UI2-related changes were made in the v9 patch. # Removed unused DOM access and termLink code from {{models/yarn-container.js}} and {{serializers/yarn-container.js}}. # Removed the Terminal column from the attempts table in {{components/timeline-view.js}}. # Fixed requestedUser issues in {{components/timeline-view.js}} and {{templates/components/container-table.hbs}}. Please verify that all your changes are present in the v9 patch. 
[jira] [Updated] (YARN-8914) Add xtermjs to YARN UI2
[ https://issues.apache.org/jira/browse/YARN-8914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akhil PB updated YARN-8914: --- Attachment: (was: YARN-8914.008.patch)
[jira] [Comment Edited] (YARN-8914) Add xtermjs to YARN UI2
[ https://issues.apache.org/jira/browse/YARN-8914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709640#comment-16709640 ] Akhil PB edited comment on YARN-8914 at 12/5/18 6:02 AM: - [~eyang] Please find the v9 patch [^YARN-8914.009.patch] with the latest UI2 changes. The issue was in how requestedUser was accessed: the code snippet {{this.get('requestedUser')}} should be {{self.get('requestedUser')}}. The following UI2-related changes were made. # Removed unused DOM access and termLink code from {{models/yarn-container.js}} and {{serializers/yarn-container.js}}. # Removed the Terminal column from the attempts table in {{components/timeline-view.js}}. # Fixed requestedUser issues in {{components/timeline-view.js}} and {{templates/components/container-table.hbs}}. Please verify that all your changes are present in the v9 patch. was (Author: akhilpb): [~eyang] Please find the v8 patch [^YARN-8914.009.patch] with the latest UI2 changes. The issue was in how requestedUser was accessed: the code snippet {{this.get('requestedUser')}} should be {{self.get('requestedUser')}}. The following UI2-related changes were made. # Removed unused DOM access and termLink code from {{models/yarn-container.js}} and {{serializers/yarn-container.js}}. # Removed the Terminal column from the attempts table in {{components/timeline-view.js}}. # Fixed requestedUser issues in {{components/timeline-view.js}} and {{templates/components/container-table.hbs}}. 
[jira] [Updated] (YARN-8914) Add xtermjs to YARN UI2
[ https://issues.apache.org/jira/browse/YARN-8914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akhil PB updated YARN-8914: --- Attachment: YARN-8914.009.patch
[jira] [Comment Edited] (YARN-8914) Add xtermjs to YARN UI2
[ https://issues.apache.org/jira/browse/YARN-8914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709640#comment-16709640 ] Akhil PB edited comment on YARN-8914 at 12/5/18 6:01 AM: - [~eyang] Please find the v8 patch [^YARN-8914.009.patch] with the latest UI2 changes. The issue was in how requestedUser was accessed: the code snippet {{this.get('requestedUser')}} should be {{self.get('requestedUser')}}. The following UI2-related changes were made. # Removed unused DOM access and termLink code from {{models/yarn-container.js}} and {{serializers/yarn-container.js}}. # Removed the Terminal column from the attempts table in {{components/timeline-view.js}}. # Fixed requestedUser issues in {{components/timeline-view.js}} and {{templates/components/container-table.hbs}}. was (Author: akhilpb): [~eyang] Please find the v8 patch [^YARN-8914.008.patch] with the latest UI2 changes. The issue was in how requestedUser was accessed: the code snippet {{this.get('requestedUser')}} should be {{self.get('requestedUser')}}. The following UI2-related changes were made. # Removed unused DOM access and termLink code from {{models/yarn-container.js}} and {{serializers/yarn-container.js}}. # Removed the Terminal column from the attempts table in {{components/timeline-view.js}}. # Fixed requestedUser issues in {{components/timeline-view.js}} and {{templates/components/container-table.hbs}}. 
[jira] [Commented] (YARN-8914) Add xtermjs to YARN UI2
[ https://issues.apache.org/jira/browse/YARN-8914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709640#comment-16709640 ] Akhil PB commented on YARN-8914: [~eyang] Please find the v8 patch [^YARN-8914.008.patch] with the latest UI2 changes. The issue was in how requestedUser was accessed: the code snippet {{this.get('requestedUser')}} should be {{self.get('requestedUser')}}. The following UI2-related changes were made. # Removed unused DOM access and termLink code from {{models/yarn-container.js}} and {{serializers/yarn-container.js}}. # Removed the Terminal column from the attempts table in {{components/timeline-view.js}}. # Fixed requestedUser issues in {{components/timeline-view.js}} and {{templates/components/container-table.hbs}}.
[jira] [Commented] (YARN-9057) CSI jar file should not bundle third party dependencies
[ https://issues.apache.org/jira/browse/YARN-9057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709637#comment-16709637 ] Weiwei Yang commented on YARN-9057: --- Hi [~eyang], using provided scope for hadoop-yarn-api/hadoop-yarn-common resolves this problem. Please help review. Thanks > CSI jar file should not bundle third party dependencies > --- > > Key: YARN-9057 > URL: https://issues.apache.org/jira/browse/YARN-9057 > Project: Hadoop YARN > Issue Type: Sub-task > Components: build >Affects Versions: 3.3.0 >Reporter: Eric Yang >Assignee: Weiwei Yang >Priority: Blocker > Attachments: YARN-9057.001.patch, YARN-9057.002.patch > > > hadoop-yarn-csi-3.3.0-SNAPSHOT.jar bundles all third party classes like a > shaded jar instead of CSI-only classes. This generates error messages > for the YARN CLI: > {code} > SLF4J: Class path contains multiple SLF4J bindings. > SLF4J: Found binding in > [jar:file:/usr/local/hadoop-3.3.0-SNAPSHOT/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/usr/local/hadoop-3.3.0-SNAPSHOT/share/hadoop/yarn/hadoop-yarn-csi-3.3.0-SNAPSHOT.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an > explanation. > SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] > {code}
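A rough sketch of what the provided-scope change being discussed could look like in the hadoop-yarn-csi pom.xml. This is illustrative only, not the actual YARN-9057 patch: the exact pom layout may differ, and the version elements are omitted on the assumption they are managed by the parent pom.

```xml
<!-- Illustrative: marking the YARN modules as provided keeps them (and their
     transitive SLF4J binding) out of the bundled hadoop-yarn-csi jar; they are
     expected on the cluster classpath at runtime instead. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-yarn-api</artifactId>
  <scope>provided</scope>
</dependency>
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-yarn-common</artifactId>
  <scope>provided</scope>
</dependency>
```

With provided scope the classes are still available at compile time, but the shade/assembly step no longer bundles them, which also avoids the duplicate StaticLoggerBinder warning quoted in the issue description.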
[jira] [Updated] (YARN-8914) Add xtermjs to YARN UI2
[ https://issues.apache.org/jira/browse/YARN-8914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akhil PB updated YARN-8914: --- Attachment: YARN-8914.008.patch
[jira] [Updated] (YARN-9057) CSI jar file should not bundle third party dependencies
[ https://issues.apache.org/jira/browse/YARN-9057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated YARN-9057: -- Attachment: YARN-9057.002.patch
[jira] [Commented] (YARN-8914) Add xtermjs to YARN UI2
[ https://issues.apache.org/jira/browse/YARN-8914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709633#comment-16709633 ] Hadoop QA commented on YARN-8914: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 21s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 19s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 15m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 31m 19s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui hadoop-client-modules/hadoop-client-minicluster . 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 5m 28s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 21s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 14m 58s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 3m 29s{color} | {color:orange} root: The patch generated 1 new + 2 unchanged - 0 fixed = 3 total (was 2) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 12m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 3s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 16s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui hadoop-client-modules/hadoop-client-minicluster . 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 5m 18s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}122m 2s{color} | {color:red} root in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 44s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}278m 41s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes | | | hadoop.registry.secure.TestSecureLogins | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Iss
[jira] [Commented] (YARN-8789) Add BoundedQueue to AsyncDispatcher
[ https://issues.apache.org/jira/browse/YARN-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709605#comment-16709605 ] Wilfred Spiegelenburg commented on YARN-8789: - First: I would expect the change to be fully tested, so that the behaviour with a limited queue is known and described. Task failures are probably more acceptable. Are we really still seeing them with the change from MAPREDUCE-5124 applied? If not, making this change is not really warranted. Before we go further with a change like this I would also test the behaviour: what happens when the queue is full? Looking at the patch, there is far more change than needed: the current queue can be limited, and just that change would be far less impactful. The logic for taking an event has also changed, which I don't think is needed either. Going back to just the basic change of limiting the queue, once we find that it is needed, would be a better approach. Based on that quick analysis I would say this is not an acceptable change in its current form. > Add BoundedQueue to AsyncDispatcher > --- > > Key: YARN-8789 > URL: https://issues.apache.org/jira/browse/YARN-8789 > Project: Hadoop YARN > Issue Type: Improvement > Components: applications >Affects Versions: 3.2.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Major > Attachments: YARN-8789.1.patch, YARN-8789.10.patch, > YARN-8789.12.patch, YARN-8789.14.patch, YARN-8789.2.patch, YARN-8789.3.patch, > YARN-8789.4.patch, YARN-8789.5.patch, YARN-8789.6.patch, YARN-8789.7.patch, > YARN-8789.7.patch, YARN-8789.8.patch, YARN-8789.9.patch > > > I recently came across a scenario where an MR ApplicationMaster was failing > with an OOM exception. It had many thousands of Mappers and thousands of > Reducers. It was noted in the logging that the event-queue of > {{AsyncDispatcher}} had a very large number of items in it and was seemingly > never decreasing. 
> I started looking at the code and thought it could use some clean up, > simplification, and the ability to specify a bounded queue so that any > incoming events are throttled until they can be processed. This will protect > the ApplicationMaster from a flood of events. > Logging Message: > Size of event-queue is xxx
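The "basic change" Wilfred suggests, simply bounding the existing event queue rather than restructuring the dispatcher, can be sketched as follows. This is an illustrative stand-in, not the YARN-8789 patch or Hadoop's actual AsyncDispatcher: the class and method names here are invented for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Minimal sketch: a dispatcher whose event queue has a fixed capacity.
// offer() reports a full queue to the caller, which can then throttle or
// drop; a put()-based variant would instead block the producer thread.
public class BoundedDispatcherSketch {
  private final BlockingQueue<String> eventQueue;

  public BoundedDispatcherSketch(int capacity) {
    this.eventQueue = new ArrayBlockingQueue<>(capacity);
  }

  /** Returns false when the queue is full, instead of growing without bound. */
  public boolean dispatch(String event) {
    return eventQueue.offer(event);
  }

  /** Consume the next event, or null if the queue is empty. */
  public String poll() {
    return eventQueue.poll();
  }

  public int size() {
    return eventQueue.size();
  }
}
```

The design question raised in the comment is exactly the offer-versus-put choice above: a blocking put() throttles event producers (protecting the AM from OOM) but risks stalling them, which is why the behaviour when the queue is full needs to be tested before the change is accepted.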
[jira] [Comment Edited] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709547#comment-16709547 ] Zhankun Tang edited comment on YARN-8714 at 12/5/18 3:04 AM: - [~leftnoteasy], [~liuxun323]. Per my testing, the YARN localizer can handle directories. I changed distributedShell's client to let it localize an HDFS directory "mydir" directly. This is enabled by YARN-2185. {code:java} Path p = new Path("hdfs:///user/yarn/submarine/jobs/tf-job-001/staging" + "/mydir"); FileStatus scFileStatus = fs.getFileStatus(p); LocalResource r = LocalResource.newInstance(URL.fromURI(p.toUri()), LocalResourceType.FILE, LocalResourceVisibility.APPLICATION, scFileStatus.getLen(), scFileStatus.getModificationTime()); localResources.put("mydir", r);{code} And the YARN localizer indeed downloads the HDFS dir to local for DistributedShell. {code:java} yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir 1.py 2.py dir1 test_kill9.sh yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/container_1543898305524_0004_01_01/ -l total 20 lrwxrwxrwx 1 yarn hadoop 111 12月 5 10:08 AppMaster.jar -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/10/AppMaster.jar lrwxrwxrwx 1 yarn hadoop 103 12月 5 10:08 mydir -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir {code} But the bad news is that Submarine uses the YARN native service, which is not aware of this YARN ability and blocks it. {code:java} 2018-12-05 10:06:40,286 ERROR client.ApiServiceClient: srcFile=hdfs://192.168.50.191:9000/user/yarn/submarine/jobs/tf-job-001/staging/mydir is a directory, which is not supported. 
{code} There are two solutions at present: 1. Fix the improper directory handling in the native service and then implement this. 2. Go ahead with our more complex download, zip, and upload approach, and refactor it after 1 is done. I personally prefer solution 2 because in that case Submarine won't depend on a newer YARN (3.1.0). Any thoughts? was (Author: tangzhankun): [~leftnoteasy], [~liuxun323]. Per my testing, the YARN localizer can handle directories. I changed DistributedShell's client to let it localize an HDFS directory "mydir" directly. {code:java} Path p = new Path("hdfs:///user/yarn/submarine/jobs/tf-job-001/staging" + "/mydir"); FileStatus scFileStatus = fs.getFileStatus(p); LocalResource r = LocalResource.newInstance(URL.fromURI(p.toUri()), LocalResourceType.FILE, LocalResourceVisibility.APPLICATION, scFileStatus.getLen(), scFileStatus.getModificationTime()); localResources.put("mydir", r);{code} And the YARN localizer indeed downloads the HDFS dir to local for DistributedShell. {code:java} yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir 1.py 2.py dir1 test_kill9.sh yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/container_1543898305524_0004_01_01/ -l total 20 lrwxrwxrwx 1 yarn hadoop 111 12月 5 10:08 AppMaster.jar -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/10/AppMaster.jar lrwxrwxrwx 1 yarn hadoop 103 12月 5 10:08 mydir -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir {code} But the bad news is that Submarine utilizes the YARN native service, which doesn't know about this YARN ability and blocks it.
{code:java} 2018-12-05 10:06:40,286 ERROR client.ApiServiceClient: srcFile=hdfs://192.168.50.191:9000/user/yarn/submarine/jobs/tf-job-001/staging/mydir is a directory, which is not supported. {code} There are two solutions at present: 1. Fix the improper directory handling in the native service and then implement this. 2. Go ahead with our more complex download, zip, and upload approach, and refactor it after 1 is done. Any thoughts? > [Submarine] Support files/tarballs to be localized for a training job. > -- > > Key: YARN-8714 > URL: https://issues.apache.org/jira/browse/YARN-8714 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Zhankun Tang >Priority: Major > Attachments: YARN-8714-WIP1-trunk-001.patch, > YARN-8714-WIP1-trunk-002.patch, YARN-8714-trunk.001.patch, > YARN-8714
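A note on solution 2 above: its "zip" step can be sketched with plain java.util.zip. The class and method names below are illustrative only (this is not Submarine code), and it assumes the remote directory has already been downloaded to a local path:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch of the "zip" step in solution 2: pack a local copy of
// the job directory into a single archive so it can be localized as one
// resource. Names are illustrative, not Submarine code.
public class ZipDirSketch {
    public static void zipDirectory(Path srcDir, Path zipFile) throws IOException {
        try (OutputStream out = Files.newOutputStream(zipFile);
             ZipOutputStream zos = new ZipOutputStream(out);
             Stream<Path> paths = Files.walk(srcDir)) {
            paths.filter(Files::isRegularFile)
                 .forEach(file -> {
                     try {
                         // Entry names are relative to the directory root,
                         // e.g. "dir1/1.py" for .../mydir/dir1/1.py.
                         zos.putNextEntry(new ZipEntry(srcDir.relativize(file).toString()));
                         Files.copy(file, zos);
                         zos.closeEntry();
                     } catch (IOException e) {
                         throw new UncheckedIOException(e);
                     }
                 });
        }
    }
}
```

The resulting archive could then be uploaded back to HDFS and registered as a single LocalResource (for example of type ARCHIVE, which the NodeManager already knows how to unpack).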
[jira] [Commented] (YARN-9071) NM and service AM don't have updated status for reinitialized containers
[ https://issues.apache.org/jira/browse/YARN-9071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709554#comment-16709554 ] Hadoop QA commented on YARN-9071: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 42s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 48s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 8m 15s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 32s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 1 new + 194 unchanged - 0 fixed = 195 total (was 194) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 15s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 34s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 15m 39s{color} | {color:green} hadoop-yarn-services-core in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 36s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}117m 4s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-9071 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12950625/YARN-9071.005.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 8fc24a7fc3e3 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 228156c | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | checkstyle | https://bu
[jira] [Updated] (YARN-9083) Support remote directory localization in yarn native service
[ https://issues.apache.org/jira/browse/YARN-9083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhankun Tang updated YARN-9083: --- Description: When refining YARN-8714, we found that the YARN localizer seems able to handle remote directories directly. In FSDownload.java#downloadAndUnpack, it uses "FileUtil.copy", which can handle directories. This ability was added by YARN-2185. For testing purposes, I changed DistributedShell's client to let it localize an HDFS directory "mydir" directly. {code:java} Path p = new Path("hdfs:///user/yarn/submarine/jobs/tf-job-001/staging" + "/mydir"); FileStatus scFileStatus = fs.getFileStatus(p); LocalResource r = LocalResource.newInstance(URL.fromURI(p.toUri()), LocalResourceType.FILE, LocalResourceVisibility.APPLICATION, scFileStatus.getLen(), scFileStatus.getModificationTime()); localResources.put("mydir", r);{code} And the YARN localizer indeed downloads the HDFS dir to local for DistributedShell. {code:java} yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir 1.py 2.py dir1 test_kill9.sh yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/container_1543898305524_0004_01_01/ -l total 20 lrwxrwxrwx 1 yarn hadoop 111 12月 5 10:08 AppMaster.jar -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/10/AppMaster.jar lrwxrwxrwx 1 yarn hadoop 103 12月 5 10:08 mydir -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir {code} But the YARN native service seems not to know about this YARN localizer ability and blocks it.
{code:java} 2018-12-05 10:06:40,286 ERROR client.ApiServiceClient: srcFile=hdfs://192.168.50.191:9000/user/yarn/submarine/jobs/tf-job-001/staging/mydir is a directory, which is not supported.{code} We should enable this ability in the YARN native service. was: When refining YARN-8714, we found that the YARN localizer seems able to handle remote directories directly. In FSDownload.java#downloadAndUnpack, it uses "FileUtil.copy", which can handle directories. This ability was added by YARN-2185. For testing purposes, I changed DistributedShell's client to let it localize an HDFS directory "mydir" directly. {code:java} Path p = new Path("hdfs:///user/yarn/submarine/jobs/tf-job-001/staging" + "/mydir"); FileStatus scFileStatus = fs.getFileStatus(p); LocalResource r = LocalResource.newInstance(URL.fromURI(p.toUri()), LocalResourceType.FILE, LocalResourceVisibility.APPLICATION, scFileStatus.getLen(), scFileStatus.getModificationTime()); localResources.put("mydir", r);{code} And the YARN localizer indeed downloads the HDFS dir to local for DistributedShell. {code:java} yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir 1.py 2.py dir1 test_kill9.sh yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/container_1543898305524_0004_01_01/ -l total 20 lrwxrwxrwx 1 yarn hadoop 111 12月 5 10:08 AppMaster.jar -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/10/AppMaster.jar lrwxrwxrwx 1 yarn hadoop 103 12月 5 10:08 mydir -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir {code} But the YARN native service seems not to know about this YARN localizer ability and blocks it.
{code:java} 2018-12-05 10:06:40,286 ERROR client.ApiServiceClient: srcFile=hdfs://192.168.50.191:9000/user/yarn/submarine/jobs/tf-job-001/staging/mydir is a directory, which is not supported.{code} We should utilize this ability in the YARN native service. > Support remote directory localization in yarn native service > > > Key: YARN-9083 > URL: https://issues.apache.org/jira/browse/YARN-9083 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Zhankun Tang >Assignee: Zhankun Tang >Priority: Major > > When refining YARN-8714, we found that the YARN localizer seems able to > handle remote directories directly. In FSDownload.java#downloadAndUnpack, it > uses "FileUtil.copy", which can handle directories. This ability was added by > YARN-2185. > For testing purposes, I changed DistributedShell's client to let it localize > an HDFS directory "mydir" directly. > {code:java} > Path p = new Path("hdfs:///user/yarn/submarine/jobs/tf-job-001/staging" + > "/mydir"); > FileStatus scFileStatus = fs.getFileStatus(p); > LocalResource r = >
[jira] [Created] (YARN-9083) Support remote directory localization in yarn native service
Zhankun Tang created YARN-9083: -- Summary: Support remote directory localization in yarn native service Key: YARN-9083 URL: https://issues.apache.org/jira/browse/YARN-9083 Project: Hadoop YARN Issue Type: Improvement Reporter: Zhankun Tang Assignee: Zhankun Tang When refining YARN-8714, we found that the YARN localizer seems able to handle remote directories directly. In FSDownload.java#downloadAndUnpack, it uses "FileUtil.copy", which can handle directories. This ability was added by YARN-2185. For testing purposes, I changed DistributedShell's client to let it localize an HDFS directory "mydir" directly. {code:java} Path p = new Path("hdfs:///user/yarn/submarine/jobs/tf-job-001/staging" + "/mydir"); FileStatus scFileStatus = fs.getFileStatus(p); LocalResource r = LocalResource.newInstance(URL.fromURI(p.toUri()), LocalResourceType.FILE, LocalResourceVisibility.APPLICATION, scFileStatus.getLen(), scFileStatus.getModificationTime()); localResources.put("mydir", r);{code} And the YARN localizer indeed downloads the HDFS dir to local for DistributedShell. {code:java} yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir 1.py 2.py dir1 test_kill9.sh yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/container_1543898305524_0004_01_01/ -l total 20 lrwxrwxrwx 1 yarn hadoop 111 12月 5 10:08 AppMaster.jar -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/10/AppMaster.jar lrwxrwxrwx 1 yarn hadoop 103 12月 5 10:08 mydir -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir {code} But the YARN native service seems not to know about this YARN localizer ability and blocks it.
{code:java} 2018-12-05 10:06:40,286 ERROR client.ApiServiceClient: srcFile=hdfs://192.168.50.191:9000/user/yarn/submarine/jobs/tf-job-001/staging/mydir is a directory, which is not supported.{code} We should utilize this ability in the YARN native service. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
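The description hinges on FSDownload#downloadAndUnpack delegating to FileUtil.copy, which recurses into directories. As a simplified stand-alone illustration of that recursive behavior (plain java.nio here, not the Hadoop implementation):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.stream.Stream;

// Simplified stand-alone illustration of a recursive directory copy,
// analogous in spirit to FileUtil.copy handling a directory source.
// Not Hadoop code.
public class RecursiveCopySketch {
    public static void copyTree(Path src, Path dst) throws IOException {
        try (Stream<Path> paths = Files.walk(src)) {
            paths.forEach(p -> {
                try {
                    Path target = dst.resolve(src.relativize(p).toString());
                    if (Files.isDirectory(p)) {
                        // Recreate subdirectories such as "dir1" under the target.
                        Files.createDirectories(target);
                    } else {
                        Files.createDirectories(target.getParent());
                        Files.copy(p, target, StandardCopyOption.REPLACE_EXISTING);
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }
}
```

Because the localizer already has this recursion at the copy layer, the remaining work described in YARN-9083 is on the native-service side, which currently rejects directory sources before the copy is ever attempted.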
[jira] [Comment Edited] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709547#comment-16709547 ] Zhankun Tang edited comment on YARN-8714 at 12/5/18 2:33 AM: - [~leftnoteasy], [~liuxun323]. Per my testing, the YARN localizer can handle directories. I changed DistributedShell's client to let it localize an HDFS directory "mydir" directly. {code:java} Path p = new Path("hdfs:///user/yarn/submarine/jobs/tf-job-001/staging" + "/mydir"); FileStatus scFileStatus = fs.getFileStatus(p); LocalResource r = LocalResource.newInstance(URL.fromURI(p.toUri()), LocalResourceType.FILE, LocalResourceVisibility.APPLICATION, scFileStatus.getLen(), scFileStatus.getModificationTime()); localResources.put("mydir", r);{code} And the YARN localizer indeed downloads the HDFS dir to local for DistributedShell. {code:java} yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir 1.py 2.py dir1 test_kill9.sh yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/container_1543898305524_0004_01_01/ -l total 20 lrwxrwxrwx 1 yarn hadoop 111 12月 5 10:08 AppMaster.jar -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/10/AppMaster.jar lrwxrwxrwx 1 yarn hadoop 103 12月 5 10:08 mydir -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir {code} But the bad news is that Submarine utilizes the YARN native service, which doesn't know about this YARN ability and blocks it. {code:java} 2018-12-05 10:06:40,286 ERROR client.ApiServiceClient: srcFile=hdfs://192.168.50.191:9000/user/yarn/submarine/jobs/tf-job-001/staging/mydir is a directory, which is not supported. {code} There are two solutions at present: 1.
Fix the improper directory handling in the native service and then implement this. 2. Go ahead with our more complex download, zip, and upload approach, and refactor it after 1 is done. Any thoughts? was (Author: tangzhankun): [~leftnoteasy], [~liuxun323]. Per my testing, the YARN localizer can handle directories. I changed DistributedShell's client to let it localize an HDFS directory "mydir" directly. {code:java} Path p = new Path("hdfs:///user/yarn/submarine/jobs/tf-job-001/staging" + "/mydir"); FileStatus scFileStatus = fs.getFileStatus(p); LocalResource r = LocalResource.newInstance(URL.fromURI(p.toUri()), LocalResourceType.FILE, LocalResourceVisibility.APPLICATION, scFileStatus.getLen(), scFileStatus.getModificationTime()); localResources.put("mydir", r);{code} And the YARN localizer indeed downloads the HDFS dir to local for DistributedShell. {code:java} yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir 1.py 2.py dir1 test_kill9.sh yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/container_1543898305524_0004_01_01/ -l total 20 lrwxrwxrwx 1 yarn hadoop 111 12月 5 10:08 AppMaster.jar -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/10/AppMaster.jar lrwxrwxrwx 1 yarn hadoop 103 12月 5 10:08 mydir -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir {code} But the bad news is that Submarine utilizes the YARN native service, which doesn't know about this YARN ability and blocks it. {code:java} 2018-12-05 10:06:40,286 ERROR client.ApiServiceClient: srcFile=hdfs://192.168.50.191:9000/user/yarn/submarine/jobs/tf-job-001/staging/mydir is a directory, which is not supported. {code} There are two solutions at present: 1.
Fix the improper directory handling in the native service and then implement this. 2. Go ahead with our more complex download, zip, and upload approach, and refactor it after 1 is done. Any thoughts? > [Submarine] Support files/tarballs to be localized for a training job. > -- > > Key: YARN-8714 > URL: https://issues.apache.org/jira/browse/YARN-8714 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Zhankun Tang >Priority: Major > Attachments: YARN-8714-WIP1-trunk-001.patch, > YARN-8714-WIP1-trunk-002.patch, YARN-8714-trunk.001.patch, > YARN-8714-trunk.002.patch > > > See > [https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit#heading=h.vk
[jira] [Commented] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709547#comment-16709547 ] Zhankun Tang commented on YARN-8714: [~leftnoteasy], [~liuxun323]. Per my testing, the YARN localizer can handle directories. I changed DistributedShell's client to let it localize an HDFS directory "mydir" directly. {code:java} Path p = new Path("hdfs:///user/yarn/submarine/jobs/tf-job-001/staging" + "/mydir"); FileStatus scFileStatus = fs.getFileStatus(p); LocalResource r = LocalResource.newInstance(URL.fromURI(p.toUri()), LocalResourceType.FILE, LocalResourceVisibility.APPLICATION, scFileStatus.getLen(), scFileStatus.getModificationTime()); localResources.put("mydir", r);{code} And the YARN localizer indeed downloads the HDFS dir to local for DistributedShell. {code:java} yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir 1.py 2.py dir1 test_kill9.sh yarn@master0-VirtualBox:~/apache-hadoop-install-dir/hadoop-dev-workspace$ ls /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/container_1543898305524_0004_01_01/ -l total 20 lrwxrwxrwx 1 yarn hadoop 111 12月 5 10:08 AppMaster.jar -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/10/AppMaster.jar lrwxrwxrwx 1 yarn hadoop 103 12月 5 10:08 mydir -> /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1543898305524_0004/filecache/13/mydir {code} But the bad news is that Submarine utilizes the YARN native service, which doesn't know about this YARN ability and blocks it. {code:java} 2018-12-05 10:06:40,286 ERROR client.ApiServiceClient: srcFile=hdfs://192.168.50.191:9000/user/yarn/submarine/jobs/tf-job-001/staging/mydir is a directory, which is not supported. {code} There are two solutions at present: 1.
Fix the improper directory handling in the native service and then implement this. 2. Go ahead with our more complex download, zip, and upload approach, and refactor it after 1 is done. Any thoughts? > [Submarine] Support files/tarballs to be localized for a training job. > -- > > Key: YARN-8714 > URL: https://issues.apache.org/jira/browse/YARN-8714 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Zhankun Tang >Priority: Major > Attachments: YARN-8714-WIP1-trunk-001.patch, > YARN-8714-WIP1-trunk-002.patch, YARN-8714-trunk.001.patch, > YARN-8714-trunk.002.patch > > > See > [https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit#heading=h.vkxp9edl11m7], > {{job run --localization ...}}
[jira] [Commented] (YARN-5168) Add port mapping handling when docker container use bridge network
[ https://issues.apache.org/jira/browse/YARN-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709512#comment-16709512 ] Xun Liu commented on YARN-5168: --- [~eyang], Thanks for your tips, I'll deal with it immediately. :) > Add port mapping handling when docker container use bridge network > -- > > Key: YARN-5168 > URL: https://issues.apache.org/jira/browse/YARN-5168 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jun Gong >Assignee: Xun Liu >Priority: Major > Labels: Docker > Attachments: YARN-5168.001.patch, YARN-5168.002.patch, > YARN-5168.003.patch, YARN-5168.004.patch, YARN-5168.005.patch, > YARN-5168.006.patch, YARN-5168.007.patch, YARN-5168.008.patch, > YARN-5168.009.patch, YARN-5168.010.patch > > > YARN-4007 addresses different network setups when launching the docker > container. We need to support port mapping when the docker container uses a > bridge network. > The following problems are what we faced: > 1. Add "-P" to map the docker container's exposed ports automatically. > 2. Add "-p" to let the user specify specific ports to map. > 3. Add service registry support for the bridge network case, so apps could find > each other. It could be done outside of YARN; however, it might be more convenient > to support it natively in YARN.
[jira] [Commented] (YARN-8914) Add xtermjs to YARN UI2
[ https://issues.apache.org/jira/browse/YARN-8914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709477#comment-16709477 ] Eric Yang commented on YARN-8914: - [~akhilpb] Patch 008 fixes the issues 1-4 from your last comments. Please review. Thanks > Add xtermjs to YARN UI2 > --- > > Key: YARN-8914 > URL: https://issues.apache.org/jira/browse/YARN-8914 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn-ui-v2 >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-8914.001.patch, YARN-8914.002.patch, > YARN-8914.003.patch, YARN-8914.004.patch, YARN-8914.005.patch, > YARN-8914.006.patch, YARN-8914.007.patch, YARN-8914.008.patch > > > In the container listing from UI2, we can add a link to connect to docker > container using xtermjs.
[jira] [Updated] (YARN-8914) Add xtermjs to YARN UI2
[ https://issues.apache.org/jira/browse/YARN-8914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-8914: Attachment: YARN-8914.008.patch > Add xtermjs to YARN UI2 > --- > > Key: YARN-8914 > URL: https://issues.apache.org/jira/browse/YARN-8914 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn-ui-v2 >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-8914.001.patch, YARN-8914.002.patch, > YARN-8914.003.patch, YARN-8914.004.patch, YARN-8914.005.patch, > YARN-8914.006.patch, YARN-8914.007.patch, YARN-8914.008.patch > > > In the container listing from UI2, we can add a link to connect to docker > container using xtermjs.
[jira] [Commented] (YARN-9071) NM and service AM don't have updated status for reinitialized containers
[ https://issues.apache.org/jira/browse/YARN-9071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709475#comment-16709475 ] Chandni Singh commented on YARN-9071: - [~eyang] I have uploaded patch 5, where the IP and host are cleared on both the AM side and the NM side before upgrade. Please take a look at it. > NM and service AM don't have updated status for reinitialized containers > > > Key: YARN-9071 > URL: https://issues.apache.org/jira/browse/YARN-9071 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Billie Rinaldi >Assignee: Chandni Singh >Priority: Critical > Attachments: YARN-9071.001.patch, YARN-9071.002.patch, > YARN-9071.003.patch, YARN-9071.004.patch, YARN-9071.005.patch, q.log > > > Container resource monitoring is not stopped during the reinitialization > process, and this prevents the NM from obtaining updated process tree > information when the container starts running again. I observed a > reinitialized container go from RUNNING to REINITIALIZING to > REINITIALIZING_AWAITING_KILL to SCHEDULED to RUNNING. Container monitoring > was then started for a second time, but since the trackingContainers entry > had already been initialized for the container, ContainersMonitor skipped > finding the new PID and IP for the container. A possible solution would be to > stop the container monitoring in the reinitialization process so that the > process tree information would be initialized properly when monitoring is > restarted. When the same container was stopped by the NM later, the NM did > not kill the container, and the service AM received an unexpected event (stop > at reinitializing).
[jira] [Updated] (YARN-9071) NM and service AM don't have updated status for reinitialized containers
[ https://issues.apache.org/jira/browse/YARN-9071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandni Singh updated YARN-9071: Attachment: YARN-9071.005.patch > NM and service AM don't have updated status for reinitialized containers > > > Key: YARN-9071 > URL: https://issues.apache.org/jira/browse/YARN-9071 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Billie Rinaldi >Assignee: Chandni Singh >Priority: Critical > Attachments: YARN-9071.001.patch, YARN-9071.002.patch, > YARN-9071.003.patch, YARN-9071.004.patch, YARN-9071.005.patch, q.log > > > Container resource monitoring is not stopped during the reinitialization > process, and this prevents the NM from obtaining updated process tree > information when the container starts running again. I observed a > reinitialized container go from RUNNING to REINITIALIZING to > REINITIALIZING_AWAITING_KILL to SCHEDULED to RUNNING. Container monitoring > was then started for a second time, but since the trackingContainers entry > had already been initialized for the container, ContainersMonitor skipped > finding the new PID and IP for the container. A possible solution would be to > stop the container monitoring in the reinitialization process so that the > process tree information would be initialized properly when monitoring is > restarted. When the same container was stopped by the NM later, the NM did > not kill the container, and the service AM received an unexpected event (stop > at reinitializing).
[jira] [Commented] (YARN-9057) CSI jar file should not bundle third party dependencies
[ https://issues.apache.org/jira/browse/YARN-9057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709465#comment-16709465 ] Weiwei Yang commented on YARN-9057: --- Thanks [~eyang], indeed that is not expected. Not sure why copy dependencies would ever move existing jars. Let me check, thanks! > CSI jar file should not bundle third party dependencies > --- > > Key: YARN-9057 > URL: https://issues.apache.org/jira/browse/YARN-9057 > Project: Hadoop YARN > Issue Type: Sub-task > Components: build >Affects Versions: 3.3.0 >Reporter: Eric Yang >Assignee: Weiwei Yang >Priority: Blocker > Attachments: YARN-9057.001.patch > > > hadoop-yarn-csi-3.3.0-SNAPSHOT.jar bundles all third-party classes like a > shaded jar instead of CSI-only classes. This is generating error messages > for the YARN CLI: > {code} > SLF4J: Class path contains multiple SLF4J bindings. > SLF4J: Found binding in > [jar:file:/usr/local/hadoop-3.3.0-SNAPSHOT/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/usr/local/hadoop-3.3.0-SNAPSHOT/share/hadoop/yarn/hadoop-yarn-csi-3.3.0-SNAPSHOT.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an > explanation. > SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] > {code}
[jira] [Commented] (YARN-9013) [GPG] fix order of steps cleaning Registry entries in ApplicationCleaner
[ https://issues.apache.org/jira/browse/YARN-9013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709446#comment-16709446 ] Giovanni Matteo Fumarola commented on YARN-9013: Thanks [~botong]. +1 on [^YARN-9013-YARN-7402.v2.patch]. > [GPG] fix order of steps cleaning Registry entries in ApplicationCleaner > > > Key: YARN-9013 > URL: https://issues.apache.org/jira/browse/YARN-9013 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Major > Attachments: YARN-9013-YARN-7402.v1.patch, > YARN-9013-YARN-7402.v2.patch > > > ApplicationCleaner today deletes the entry for every finished (non-running) > application in the YarnRegistry using this logic: > # GPG gets the list of running applications from the Router. > # GPG gets the full list of applications in the registry. > # GPG deletes from the registry every app in 2 that's not in 1. > The problem is that jobs that started between 1 and 2 meet the criteria in > 3, and thus get deleted by mistake. The fix/right order should be 2->1->3, > rather than 1->2->3.
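The ordering argument in YARN-9013 can be made concrete with a toy model (hypothetical names, not the actual GPG code). The delete set is the registry snapshot minus the running snapshot, so the running snapshot must be taken second:

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of the ApplicationCleaner delete-candidate selection
// (hypothetical names, not the GPG implementation). An app is deleted only
// if it appears in the registry snapshot AND is absent from the running
// snapshot.
public class CleanerOrderSketch {
    public static Set<String> deleteCandidates(Set<String> registrySnapshot,
                                               Set<String> runningSnapshot) {
        Set<String> candidates = new HashSet<>(registrySnapshot);
        candidates.removeAll(runningSnapshot);
        return candidates;
    }
}
```

With the buggy order (running snapshot first), an app that starts and registers between the two snapshots shows up in the registry snapshot but not in the running snapshot, so it is wrongly selected for deletion. With the fixed order (registry snapshot first), such an app is simply absent from the registry snapshot and can never be selected.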
[jira] [Commented] (YARN-8870) [Submarine] Add submarine installation scripts
[ https://issues.apache.org/jira/browse/YARN-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709356#comment-16709356 ] Hudson commented on YARN-8870: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15561 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/15561/]) Revert "YARN-8870. [Submarine] Add submarine installation scripts. (Xun (wangda: rev 228156cfd1b474988bc4fedfbf7edddc87db41e3) * (edit) hadoop-assemblies/src/main/resources/assemblies/hadoop-yarn-dist.xml * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/package/etcd/etcd.service * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/scripts/utils.sh * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/package/hadoop/container-executor.cfg * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/package/docker/daemon.json * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/scripts/environment.sh * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/package/calico/calico-node.service * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/install.conf * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/package/docker/docker.service * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/scripts/nvidia.sh * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/scripts/submarine.sh * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/scripts/docker.sh * (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/scripts/hadoop.sh * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/scripts/menu.sh * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/install.sh * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/package/calico/calicoctl.cfg * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/package/submarine/submarine.sh * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/scripts/download-server.sh * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/scripts/etcd.sh * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/scripts/calico.sh * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/installation/scripts/nvidia-docker.sh > [Submarine] Add submarine installation scripts > -- > > Key: YARN-8870 > URL: https://issues.apache.org/jira/browse/YARN-8870 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xun Liu >Assignee: Xun Liu >Priority: Critical > Attachments: YARN-8870-addendum.008.patch, YARN-8870.001.patch, > YARN-8870.004.patch, YARN-8870.005.patch, YARN-8870.006.patch, > YARN-8870.007.patch, YARN-8870.009.patch, YARN-8870.010.patch, > YARN-8870.011.patch, YARN-8870.012.patch > > > In order to reduce the deployment difficulty of Hadoop > {Submarine} DNS, Docker, GPU, Network, graphics card, operating system kernel > modification and other components, I specially developed this installation > script to deploy Hadoop \{Submarine} > runtime environment, providing one-click installation Scripts, which can also > be used to install, uninstall, start, and stop individual components step by > 
step. > > design document: > [https://docs.google.com/document/d/1muCTGFuUXUvM4JaDYjKqX5liQEg-AsNgkxfLMIFxYHU/edit?usp=sharing]
[jira] [Commented] (YARN-8870) [Submarine] Add submarine installation scripts
[ https://issues.apache.org/jira/browse/YARN-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709341#comment-16709341 ] Wangda Tan commented on YARN-8870: -- As we discussed offline, reverted the patch from branches. It's better to move such scripts outside of Hadoop core.
[jira] [Updated] (YARN-8870) [Submarine] Add submarine installation scripts
[ https://issues.apache.org/jira/browse/YARN-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8870: - Target Version/s: (was: 3.2.0)
[jira] [Updated] (YARN-8870) [Submarine] Add submarine installation scripts
[ https://issues.apache.org/jira/browse/YARN-8870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8870: - Fix Version/s: (was: 3.2.0)
[jira] [Commented] (YARN-9057) CSI jar file should not bundle third party dependencies
[ https://issues.apache.org/jira/browse/YARN-9057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709320#comment-16709320 ] Eric Yang commented on YARN-9057: - [~cheersyang] Thank you for the patch. HADOOP_HOME/share/hadoop/yarn/csi/hadoop-yarn-csi-3.3.0-SNAPSHOT.jar has the right content. However, I get error message when launching application: {code} $ ./bin/yarn app -status abc Error: A JNI error has occurred, please check your installation and try again Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/yarn/conf/YarnConfiguration at java.lang.Class.getDeclaredMethods0(Native Method) at java.lang.Class.privateGetDeclaredMethods(Class.java:2701) at java.lang.Class.privateGetMethodRecursive(Class.java:3048) at java.lang.Class.getMethod0(Class.java:3018) at java.lang.Class.getMethod(Class.java:1784) at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544) at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526) Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.yarn.conf.YarnConfiguration at java.net.URLClassLoader.findClass(URLClassLoader.java:381) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ... 7 more {code} When I look in HADOOP_HOME/share/hadoop/yarn, hadoop-yarn-api-*.jar file is missing. It copied hadoop-yarn-api-*.jar into: {code} $ tar tfvz hadoop-3.3.0-SNAPSHOT.tar.gz |grep yarn-api -rw-rw-r-- eyang/eyang 3369775 2018-12-04 16:46 hadoop-3.3.0-SNAPSHOT/share/hadoop/yarn/csi/lib/hadoop-yarn-api-3.3.0-SNAPSHOT.jar {code} Seems like an unexpected behavior. 
[jira] [Commented] (YARN-8937) Upgrade Curator version to 2.13.0 to fix ZK tests
[ https://issues.apache.org/jira/browse/YARN-8937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709293#comment-16709293 ] Jason Lowe commented on YARN-8937: -- Thanks for the excellent analysis! +1 lgtm. Committing this. > Upgrade Curator version to 2.13.0 to fix ZK tests > - > > Key: YARN-8937 > URL: https://issues.apache.org/jira/browse/YARN-8937 > Project: Hadoop YARN > Issue Type: Bug > Components: test >Affects Versions: 3.3.0 >Reporter: Jason Lowe >Assignee: Akira Ajisaka >Priority: Major > Attachments: YARN-8937.01.patch > > > TestLeaderElectorService hangs waiting for the TestingZooKeeperServer to > start and eventually gets killed by the surefire timeout.
[jira] [Commented] (YARN-5168) Add port mapping handling when docker container use bridge network
[ https://issues.apache.org/jira/browse/YARN-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709246#comment-16709246 ] Eric Yang commented on YARN-5168: - [~liuxun323] The patch looks good. Is it possible to also expose this information to Application Attempts > Containers > Graph View and Grid View, in addition to the component instance view? Thanks > Add port mapping handling when docker container use bridge network > -- > > Key: YARN-5168 > URL: https://issues.apache.org/jira/browse/YARN-5168 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jun Gong >Assignee: Xun Liu >Priority: Major > Labels: Docker > Attachments: YARN-5168.001.patch, YARN-5168.002.patch, > YARN-5168.003.patch, YARN-5168.004.patch, YARN-5168.005.patch, > YARN-5168.006.patch, YARN-5168.007.patch, YARN-5168.008.patch, > YARN-5168.009.patch, YARN-5168.010.patch > > > YARN-4007 addresses different network setups when launching the docker > container. We need to support port mapping when the docker container uses the > bridge network. > These are the problems we faced: > 1. Add "-P" to automatically map the docker container's exposed ports. > 2. Add "-p" to let the user specify specific ports to map. > 3. Add service registry support for the bridge network case, so apps can find > each other. It could be done outside of YARN; however, it might be more convenient > to support it natively in YARN.
[jira] [Commented] (YARN-9071) NM and service AM don't have updated status for reinitialized containers
[ https://issues.apache.org/jira/browse/YARN-9071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709176#comment-16709176 ] Chandni Singh commented on YARN-9071: - As discussed offline, [~billie.rinaldi] I created YARN-9082 as a follow-up Jira to remove the delay in un-registering a metric. [~eyang] I will put a fix on the Yarn Service AM side to remove the IP address from the registry before reinitialization. Currently the default readiness check is for the presence of an IP, and it succeeds because the IP address is still present from the previous launch. If we remove the IP address before the reinit, the container will go into the READY state only once it has been successfully relaunched. > NM and service AM don't have updated status for reinitialized containers > > > Key: YARN-9071 > URL: https://issues.apache.org/jira/browse/YARN-9071 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Billie Rinaldi >Assignee: Chandni Singh >Priority: Critical > Attachments: YARN-9071.001.patch, YARN-9071.002.patch, > YARN-9071.003.patch, YARN-9071.004.patch, q.log > > > Container resource monitoring is not stopped during the reinitialization > process, and this prevents the NM from obtaining updated process tree > information when the container starts running again. I observed a > reinitialized container go from RUNNING to REINITIALIZING to > REINITIALIZING_AWAITING_KILL to SCHEDULED to RUNNING. Container monitoring > was then started for a second time, but since the trackingContainers entry > had already been initialized for the container, ContainersMonitor skipped > finding the new PID and IP for the container. A possible solution would be to > stop the container monitoring in the reinitialization process so that the > process tree information would be initialized properly when monitoring is > restarted.
When the same container was stopped by the NM later, the NM did > not kill the container, and the service AM received an unexpected event (stop > at reinitializing).
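The readiness pitfall discussed above can be reduced to a small sketch. This is an illustration under stated assumptions, not the actual YARN Service AM code; `isReady` is a hypothetical stand-in for the default IP-based readiness check:

```java
public class ReadinessCheckSketch {

    // Default readiness check as described in the comments above: a component
    // instance is considered READY once an IP address is registered for it.
    static boolean isReady(String registeredIp) {
        return registeredIp != null && !registeredIp.isEmpty();
    }

    public static void main(String[] args) {
        // IP left in the registry from the container's previous launch.
        String staleIp = "10.0.0.5";

        // Without the fix: the stale IP makes the check pass immediately,
        // before the reinitialized container has actually come up.
        System.out.println(isReady(staleIp)); // true (incorrectly READY)

        // With the fix: the AM clears the IP from the registry before the
        // reinit, so the check can only pass after a successful relaunch
        // re-registers a fresh IP.
        String ipAfterClear = null;
        System.out.println(isReady(ipAfterClear)); // false until relaunch
    }
}
```

The design point is that readiness must be derived from state produced by the *new* container instance; clearing the registered IP before reinit guarantees the check cannot be satisfied by leftovers from the previous launch.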
[jira] [Commented] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709178#comment-16709178 ] Wangda Tan commented on YARN-8714: -- Thanks [~tangzhankun], what I remember is that YARN doesn't support localizing a directory as a LocalResource, but I could be wrong as well. Hope you're correct :). Please keep us posted on your testing. > [Submarine] Support files/tarballs to be localized for a training job. > -- > > Key: YARN-8714 > URL: https://issues.apache.org/jira/browse/YARN-8714 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Zhankun Tang >Priority: Major > Attachments: YARN-8714-WIP1-trunk-001.patch, > YARN-8714-WIP1-trunk-002.patch, YARN-8714-trunk.001.patch, > YARN-8714-trunk.002.patch > > > See > [https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit#heading=h.vkxp9edl11m7], > {{job run --localization ...}}
[jira] [Created] (YARN-9082) Delay during unregistering metrics is unnecessary
Chandni Singh created YARN-9082: --- Summary: Delay during unregistering metrics is unnecessary Key: YARN-9082 URL: https://issues.apache.org/jira/browse/YARN-9082 Project: Hadoop YARN Issue Type: Bug Reporter: Chandni Singh Assignee: Chandni Singh Discovered while debugging YARN-9071 Quoting [~billie.rinaldi] {quote} I looked at YARN-3619, where the unregistration delay was added. It seems like this was added because unregistration was performed in getMetrics, which was causing a ConcurrentModificationException. However, unregistration was moved from getMetrics into the finished method (in the same patch), and this leads me to believe that the delay is never needed. I'm inclined to think we should remove the delay entirely, but would like to hear other opinions. {quote}
[jira] [Commented] (YARN-9071) NM and service AM don't have updated status for reinitialized containers
[ https://issues.apache.org/jira/browse/YARN-9071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709089#comment-16709089 ] Eric Yang commented on YARN-9071: - In my local testing, a container failed to start on node A and was moved to node B. With patch 004, when performing an upgrade, the reinit tries to relaunch the container on node A. The default readiness check uses the IP address, and ContainerMonitor still holds the IP address of the previous container instance without refreshing it for the new instance. The AM therefore incorrectly determines that the reinit of the container succeeded, even though no actual container was launched.
[jira] [Commented] (YARN-9041) Performance Optimization of method FSPreemptionThread#identifyContainersToPreempt
[ https://issues.apache.org/jira/browse/YARN-9041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709077#comment-16709077 ] Hudson commented on YARN-9041: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15558 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/15558/]) YARN-9041. Performance Optimization of method (yufei: rev e89941fdbb3b382eeb487d32e5194909610ac334) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairSchedulerPreemption.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSPreemptionThread.java > Performance Optimization of method > FSPreemptionThread#identifyContainersToPreempt > - > > Key: YARN-9041 > URL: https://issues.apache.org/jira/browse/YARN-9041 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler, scheduler preemption >Affects Versions: 3.1.1 >Reporter: Wanqiang Ji >Assignee: Wanqiang Ji >Priority: Major > Fix For: 3.2.1 > > Attachments: YARN-9041.001.patch, YARN-9041.002.patch, > YARN-9041.003.patch, YARN-9041.004.patch, YARN-9041.005.patch, > YARN-9041.006.patch, YARN-9041.007.patch > > > In FSPreemptionThread#identifyContainersToPreempt method, I suggest if AM > preemption, and locality relaxation is allowed, then the search space is > expanded to all nodes changed to the remaining nodes. The remaining nodes are > equal to all nodes minus the potential nodes. > Judging condition changed to: > # rr.getRelaxLocality() > # !ResourceRequest.isAnyLocation(rr.getResourceName()) > # bestContainers != null > # bestContainers.numAMContainers > 0 > If I understand the deviation, please criticize me. 
thx~
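The four judging conditions listed in the description above can be condensed into a single predicate. This is an illustrative sketch with a hypothetical signature, not the actual FSPreemptionThread code; `bestContainersFound` stands in for `bestContainers != null`:

```java
public class PreemptionRetrySketch {

    static final String ANY = "*"; // stands in for ResourceRequest.ANY

    // Widen the search to the remaining nodes (all nodes minus the already
    // searched "potential" nodes) only when all four conditions hold.
    static boolean shouldRetryOnRemainingNodes(boolean relaxLocality,
                                               String resourceName,
                                               boolean bestContainersFound,
                                               int numAMContainers) {
        return relaxLocality                    // rr.getRelaxLocality()
                && !ANY.equals(resourceName)    // !isAnyLocation(resourceName)
                && bestContainersFound          // bestContainers != null
                && numAMContainers > 0;         // bestContainers.numAMContainers > 0
    }

    public static void main(String[] args) {
        // Node-local request whose best candidate set would preempt an AM:
        // worth widening the search to the remaining nodes.
        System.out.println(shouldRetryOnRemainingNodes(true, "node-17", true, 1)); // true

        // An ANY request already searched every node, so nothing remains to try.
        System.out.println(shouldRetryOnRemainingNodes(true, ANY, true, 1)); // false

        // No AM container at risk: the first answer is already good enough.
        System.out.println(shouldRetryOnRemainingNodes(true, "node-17", true, 0)); // false
    }
}
```

The point of the optimization is that the retry only pays off when the first pass both found candidates and would sacrifice an AM container; in every other case, re-searching nodes already covered is wasted work.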
[jira] [Commented] (YARN-9041) Performance Optimization of method FSPreemptionThread#identifyContainersToPreempt
[ https://issues.apache.org/jira/browse/YARN-9041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709067#comment-16709067 ] Yufei Gu commented on YARN-9041: Committed to trunk. Thanks [~jiwq] for working on this. Thanks [~Steven Rand] for the review.
[jira] [Updated] (YARN-9041) Performance Optimization of method FSPreemptionThread#identifyContainersToPreempt
[ https://issues.apache.org/jira/browse/YARN-9041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu updated YARN-9041: --- Fix Version/s: 3.2.1
[jira] [Updated] (YARN-9041) Performance Optimization of method FSPreemptionThread#identifyContainersToPreempt
[ https://issues.apache.org/jira/browse/YARN-9041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu updated YARN-9041: --- Summary: Performance Optimization of method FSPreemptionThread#identifyContainersToPreempt (was: Performance Optimization of FSPreemptionThread#identifyContainersToPreempt method)
[jira] [Updated] (YARN-9041) Performance Optimization of FSPreemptionThread#identifyContainersToPreempt method
[ https://issues.apache.org/jira/browse/YARN-9041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu updated YARN-9041: --- Summary: Performance Optimization of FSPreemptionThread#identifyContainersToPreempt method (was: Optimize FSPreemptionThread#identifyContainersToPreempt method)
[jira] [Commented] (YARN-9071) NM and service AM don't have updated status for reinitialized containers
[ https://issues.apache.org/jira/browse/YARN-9071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709054#comment-16709054 ] Hadoop QA commented on YARN-9071: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s{color} | {color:red} YARN-9071 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-9071 | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/22778/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > NM and service AM don't have updated status for reinitialized containers > > > Key: YARN-9071 > URL: https://issues.apache.org/jira/browse/YARN-9071 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Billie Rinaldi >Assignee: Chandni Singh >Priority: Critical > Attachments: YARN-9071.001.patch, YARN-9071.002.patch, > YARN-9071.003.patch, YARN-9071.004.patch, q.log > > > Container resource monitoring is not stopped during the reinitialization > process, and this prevents the NM from obtaining updated process tree > information when the container starts running again. I observed a > reinitialized container go from RUNNING to REINITIALIZING to > REINITIALIZING_AWAITING_KILL to SCHEDULED to RUNNING. Container monitoring > was then started for a second time, but since the trackingContainers entry > had already been initialized for the container, ContainersMonitor skipped > finding the new PID and IP for the container. 
A possible solution would be to > stop the container monitoring in the reinitialization process so that the > process tree information would be initialized properly when monitoring is > restarted. When the same container was stopped by the NM later, the NM did > not kill the container, and the service AM received an unexpected event (stop > at reinitializing).
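The stale-entry behavior and the proposed fix can be sketched with a toy tracking map. All names here (ContainersMonitorSketch, startMonitoring, stopMonitoring) are hypothetical; the real ContainersMonitor is far more involved. The point is only that clearing the trackingContainers entry during reinitialization lets the monitor rediscover the new PID when monitoring restarts.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: containerId -> discovered PID, standing in for the
// trackingContainers map described in the issue.
class ContainersMonitorSketch {
    private final Map<String, String> trackingContainers = new HashMap<>();

    void startMonitoring(String containerId, String pid) {
        // The monitor skips PID/IP discovery when an entry already exists;
        // putIfAbsent models that skip.
        trackingContainers.putIfAbsent(containerId, pid);
    }

    // Proposed fix: call this from the reinitialization transition so the
    // entry is re-created (and the new PID found) on the next start.
    void stopMonitoring(String containerId) {
        trackingContainers.remove(containerId);
    }

    String pidOf(String containerId) {
        return trackingContainers.get(containerId);
    }
}
```

Without the stopMonitoring call, the second startMonitoring keeps the stale PID, which is the observed bug.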
[jira] [Updated] (YARN-9071) NM and service AM don't have updated status for reinitialized containers
[ https://issues.apache.org/jira/browse/YARN-9071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-9071: Attachment: q.log > NM and service AM don't have updated status for reinitialized containers > > > Key: YARN-9071 > URL: https://issues.apache.org/jira/browse/YARN-9071 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Billie Rinaldi >Assignee: Chandni Singh >Priority: Critical > Attachments: YARN-9071.001.patch, YARN-9071.002.patch, > YARN-9071.003.patch, YARN-9071.004.patch, q.log > > > Container resource monitoring is not stopped during the reinitialization > process, and this prevents the NM from obtaining updated process tree > information when the container starts running again. I observed a > reinitialized container go from RUNNING to REINITIALIZING to > REINITIALIZING_AWAITING_KILL to SCHEDULED to RUNNING. Container monitoring > was then started for a second time, but since the trackingContainers entry > had already been initialized for the container, ContainersMonitor skipped > finding the new PID and IP for the container. A possible solution would be to > stop the container monitoring in the reinitialization process so that the > process tree information would be initialized properly when monitoring is > restarted. When the same container was stopped by the NM later, the NM did > not kill the container, and the service AM received an unexpected event (stop > at reinitializing). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9071) NM and service AM don't have updated status for reinitialized containers
[ https://issues.apache.org/jira/browse/YARN-9071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709037#comment-16709037 ] Eric Yang commented on YARN-9071: - [~csingh] Something is strange with this patch. This patch impacts upgrade, see the attached log file (q.log). It looks like container transitioned from STABLE to START, localize, STOP. The sequence seems wrong. > NM and service AM don't have updated status for reinitialized containers > > > Key: YARN-9071 > URL: https://issues.apache.org/jira/browse/YARN-9071 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Billie Rinaldi >Assignee: Chandni Singh >Priority: Critical > Attachments: YARN-9071.001.patch, YARN-9071.002.patch, > YARN-9071.003.patch, YARN-9071.004.patch > > > Container resource monitoring is not stopped during the reinitialization > process, and this prevents the NM from obtaining updated process tree > information when the container starts running again. I observed a > reinitialized container go from RUNNING to REINITIALIZING to > REINITIALIZING_AWAITING_KILL to SCHEDULED to RUNNING. Container monitoring > was then started for a second time, but since the trackingContainers entry > had already been initialized for the container, ContainersMonitor skipped > finding the new PID and IP for the container. A possible solution would be to > stop the container monitoring in the reinitialization process so that the > process tree information would be initialized properly when monitoring is > restarted. When the same container was stopped by the NM later, the NM did > not kill the container, and the service AM received an unexpected event (stop > at reinitializing). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9071) NM and service AM don't have updated status for reinitialized containers
[ https://issues.apache.org/jira/browse/YARN-9071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709028#comment-16709028 ] Billie Rinaldi commented on YARN-9071: -- I looked at YARN-3619, where the unregistration delay was added. It seems like this was added because unregistration was performed in getMetrics, which was causing a ConcurrentModificationException. However, unregistration was moved from getMetrics into the finished method (in the same patch), and this leads me to believe that the delay is never needed. I'm inclined to think we should remove the delay entirely, but would like to hear other opinions. > NM and service AM don't have updated status for reinitialized containers > > > Key: YARN-9071 > URL: https://issues.apache.org/jira/browse/YARN-9071 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Billie Rinaldi >Assignee: Chandni Singh >Priority: Critical > Attachments: YARN-9071.001.patch, YARN-9071.002.patch, > YARN-9071.003.patch, YARN-9071.004.patch > > > Container resource monitoring is not stopped during the reinitialization > process, and this prevents the NM from obtaining updated process tree > information when the container starts running again. I observed a > reinitialized container go from RUNNING to REINITIALIZING to > REINITIALIZING_AWAITING_KILL to SCHEDULED to RUNNING. Container monitoring > was then started for a second time, but since the trackingContainers entry > had already been initialized for the container, ContainersMonitor skipped > finding the new PID and IP for the container. A possible solution would be to > stop the container monitoring in the reinitialization process so that the > process tree information would be initialized properly when monitoring is > restarted. When the same container was stopped by the NM later, the NM did > not kill the container, and the service AM received an unexpected event (stop > at reinitializing). 
[jira] [Comment Edited] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16708874#comment-16708874 ] Zhankun Tang edited comment on YARN-8714 at 12/4/18 3:32 PM: - [~leftnoteasy], [~liuxun323] , While refining the patch, I found that the YARN localizer seems able to localize a *remote directory (hdfs, s3, etc.)*. In FSDownload.java#downloadAndUnpack, it uses "FileUtil.copy", which can handle directories. Relying on this can greatly simplify our implementation: no need to download a remote dir or zip a local dir anymore. We may still need a configuration to limit the remote file/dir size to be localized to the container. I will verify and update the patch tomorrow. was (Author: tangzhankun): [~leftnoteasy], [~liuxun323] , While refining the patch, I found that YARN localizer seems can localize *remote directory(hdfs, s3 .etc)*. In FSDownload.java#downloadAndUnpack, it uses "FileUtil.copy" which can handle directory. Depending on this can greatly simplify our implementation, no need to download remote dir or zip local dir anymore. I will verify and update the patch tomorrow. > [Submarine] Support files/tarballs to be localized for a training job. > -- > > Key: YARN-8714 > URL: https://issues.apache.org/jira/browse/YARN-8714 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Zhankun Tang >Priority: Major > Attachments: YARN-8714-WIP1-trunk-001.patch, > YARN-8714-WIP1-trunk-002.patch, YARN-8714-trunk.001.patch, > YARN-8714-trunk.002.patch > > > See > [https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit#heading=h.vkxp9edl11m7], > {{job run --localization ...}}
[jira] [Comment Edited] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16708874#comment-16708874 ] Zhankun Tang edited comment on YARN-8714 at 12/4/18 3:29 PM: - [~leftnoteasy], [~liuxun323] , While refining the patch, I found that the YARN localizer seems able to localize a *remote directory (hdfs, s3, etc.)*. In FSDownload.java#downloadAndUnpack, it uses "FileUtil.copy", which can handle directories. Relying on this can greatly simplify our implementation: no need to download a remote dir or zip a local dir anymore. I will verify and update the patch tomorrow. was (Author: tangzhankun): [~leftnoteasy], [~liuxun323] , While refining the patch, I found that YARN localizer seems can localize *remote directory*. In FSDownload.java#downloadAndUnpack, it uses "FileUtil.copy" which can handle directory. Depending on this can greatly simplify our implementation, no need to download remote dir or zip local dir anymore. I will verify and update the patch tomorrow. > [Submarine] Support files/tarballs to be localized for a training job. > -- > > Key: YARN-8714 > URL: https://issues.apache.org/jira/browse/YARN-8714 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Zhankun Tang >Priority: Major > Attachments: YARN-8714-WIP1-trunk-001.patch, > YARN-8714-WIP1-trunk-002.patch, YARN-8714-trunk.001.patch, > YARN-8714-trunk.002.patch > > > See > [https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit#heading=h.vkxp9edl11m7], > {{job run --localization ...}}
[jira] [Commented] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.
[ https://issues.apache.org/jira/browse/YARN-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16708874#comment-16708874 ] Zhankun Tang commented on YARN-8714: [~leftnoteasy], [~liuxun323] , While refining the patch, I found that the YARN localizer seems able to localize a *remote directory*. In FSDownload.java#downloadAndUnpack, it uses "FileUtil.copy", which can handle directories. Relying on this can greatly simplify our implementation: no need to download a remote dir or zip a local dir anymore. I will verify and update the patch tomorrow. > [Submarine] Support files/tarballs to be localized for a training job. > -- > > Key: YARN-8714 > URL: https://issues.apache.org/jira/browse/YARN-8714 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Zhankun Tang >Priority: Major > Attachments: YARN-8714-WIP1-trunk-001.patch, > YARN-8714-WIP1-trunk-002.patch, YARN-8714-trunk.001.patch, > YARN-8714-trunk.002.patch > > > See > [https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit#heading=h.vkxp9edl11m7], > {{job run --localization ...}}
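As a plain-JDK analogy (this is not Hadoop's FileUtil.copy implementation, only an illustration of why a recursive tree walk handles directories): Files.walk visits each parent before its children, so a directory entry is created at the destination before the files inside it are copied.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Iterator;
import java.util.stream.Stream;

// Plain-JDK sketch of a directory-aware copy, analogous in spirit to the
// behavior attributed to FileUtil.copy above.
class RecursiveCopy {
    static void copyTree(Path src, Path dst) throws IOException {
        try (Stream<Path> walk = Files.walk(src)) {
            Iterator<Path> it = walk.iterator();
            while (it.hasNext()) {
                Path p = it.next();
                // Map src/sub/file -> dst/sub/file. Copying a directory entry
                // creates the (empty) directory at the destination; parents
                // are visited before children, so the target dir always exists.
                Files.copy(p, dst.resolve(src.relativize(p).toString()),
                        StandardCopyOption.REPLACE_EXISTING);
            }
        }
    }
}
```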
[jira] [Commented] (YARN-6523) Newly retrieved security Tokens are sent as part of each heartbeat to each node from RM which is not desirable in large cluster
[ https://issues.apache.org/jira/browse/YARN-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16708868#comment-16708868 ] Jason Lowe commented on YARN-6523: -- Thanks for updating the patch! If a unit test just added in a patch fails in the precommit build then there's usually something wrong with the test even if it passes locally. It's likely to be a racy test, as the precommit builds are notorious for running unit tests with a different timing than seen locally. The problem with these tests is they still aren't really unit tests but rather integration tests where it is spinning up an RM and an NM. The first test should only create a DelegationTokenRenewer with a mock RMContext and verify that RMContext#incrTokenSequenceNo is called when the appropriate token is created and when it is renewed. No server start ups, heartbeats, etc. All of that tends to be racy as async dispatchers are usually involved making it hard to know when something is done processing and therefore safe to examine for assertions. DelegationTokenRenewer#addApplicationSync can be used to test the case where a token is created, and we can make DelegationTokenRenewer#requestNewHdfsDelegationTokenIfNeeded package-private so we can call it from a test with a token that needs to be renewed to test the renewal case. The second test is designed to test the ResourceTrackerService is properly handling the token sequence number, so there should be a unit test that verifies that the system credentials are sent when the token sequence number mismatches and not sent when they match. That test should be in TestResourceTrackerService, since that's what we're testing. If we pass a mock RMContext to the ResourceTrackerService when we construct it for the test, it makes it easy to manipulate it, along with the credentials payload, to verify in the test that the credentials are only sent when expected. 
NodeHeartbeatResponse should get/set a Collection rather than a List. That allows ResourceTrackerService to pass the values of its tracking map directly rather than needing to convert it into a list first. Typo in NodeHeartbeatResponse comment: "logAggreations" NodeHeartbeatResponsePBImpl#setSystemCredentialsForApps should pass the collection directly to the ArrayList constructor so it doesn't have to guess at the initial size of the array then immediately discard it to reallocate a new one when the collection is larger than the initial guess. Passing directly to the constructor allows ArrayList to allocate the correct array size the first time and reduces unnecessary garbage. Nit: The name "systemCredentialsForAppsProto" in NodeHeartbeatResponsePBImpl implies it is a single proto rather than a collection of multiple. Maybe just "systemCredentials"? YarnServerBuilderUtils should pass the desired capacity to the ArrayList or HashMap constructor since it's trivial to compute and eliminates the possibility of needing to resize the collection due to a poor initial guess in the default constructor. > Newly retrieved security Tokens are sent as part of each heartbeat to each > node from RM which is not desirable in large cluster > --- > > Key: YARN-6523 > URL: https://issues.apache.org/jira/browse/YARN-6523 > Project: Hadoop YARN > Issue Type: Improvement > Components: RM >Affects Versions: 2.8.0, 2.7.3 >Reporter: Naganarasimha G R >Assignee: Manikandan R >Priority: Major > Attachments: YARN-6523.001.patch, YARN-6523.002.patch, > YARN-6523.003.patch, YARN-6523.004.patch, YARN-6523.005.patch, > YARN-6523.006.patch, YARN-6523.007.patch, YARN-6523.008.patch, > YARN-6523.009.patch > > > Currently as part of heartbeat response RM sets all application's tokens > though all applications might not be active on the node. On top of it > NodeHeartbeatResponsePBImpl converts tokens for each app into > SystemCredentialsForAppsProto. 
Hence for each node and each heartbeat too > many SystemCredentialsForAppsProto objects were getting created. > We hit an OOM while testing 2000 concurrent apps on a 500-node cluster with > 8GB RAM configured for the RM
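The ArrayList sizing point in the review can be illustrated with a minimal sketch (plain JDK; the helper names are hypothetical, not code from the patch). Passing the source collection straight to the constructor sizes the backing array once from src.size(), instead of allocating a default-capacity array and growing it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

class CollectionCopy {
    // Anti-pattern flagged in the review: guess an initial capacity, then
    // addAll, which reallocates the backing array whenever the guess is low.
    static <T> List<T> copyWithGuess(Collection<T> src) {
        List<T> out = new ArrayList<>(10); // arbitrary guess
        out.addAll(src);                   // may trigger a resize + copy
        return out;
    }

    // Recommended: ArrayList(Collection) allocates the array at src.size()
    // the first time, producing less garbage.
    static <T> List<T> copyDirect(Collection<T> src) {
        return new ArrayList<>(src);
    }
}
```

Both produce the same list; only the allocation behavior differs, which matters on the RM heartbeat path where these objects are created per node per heartbeat.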
[jira] [Commented] (YARN-9057) CSI jar file should not bundle third party dependencies
[ https://issues.apache.org/jira/browse/YARN-9057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16708658#comment-16708658 ] Hadoop QA commented on YARN-9057: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 50m 1s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 51s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 3s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 28s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 24s{color} | {color:green} hadoop-assemblies in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 43s{color} | {color:green} hadoop-yarn-csi in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 39s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 85m 24s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | YARN-9057 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12950539/YARN-9057.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml | | uname | Linux eba0bb6a87be 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / de42555 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/22777/testReport/ | | Max. process+thread count | 339 (vs. ulimit of 1) | | modules | C: hadoop-assemblies hadoop-yarn-project/hadoop-yarn/hadoop-yarn-csi U: . | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/22777/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > CSI jar
[jira] [Commented] (YARN-8960) [Submarine] Can't get submarine service status using the command of "yarn app -status" under security environment
[ https://issues.apache.org/jira/browse/YARN-8960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16708652#comment-16708652 ] Zac Zhou commented on YARN-8960: I think it should be ok, [~leftnoteasy] any comments? > [Submarine] Can't get submarine service status using the command of "yarn app > -status" under security environment > - > > Key: YARN-8960 > URL: https://issues.apache.org/jira/browse/YARN-8960 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zac Zhou >Assignee: Zac Zhou >Priority: Major > Fix For: 3.3.0, 3.2.1 > > Attachments: YARN-8960.001.patch, YARN-8960.002.patch, > YARN-8960.003.patch, YARN-8960.004.patch, YARN-8960.005.patch, > YARN-8960.006.patch, YARN-8960.007.patch > > > After submitting a submarine job, we tried to get service status using the > following command: > yarn app -status ${service_name} > But we got the following error: > HTTP error code : 500 > > The stack in resourcemanager log is : > {code} > ERROR org.apache.hadoop.yarn.service.webapp.ApiServer: Get service failed: {} > java.lang.reflect.UndeclaredThrowableException > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1748) > at > org.apache.hadoop.yarn.service.webapp.ApiServer.getServiceFromClient(ApiServer.java:800) > at > org.apache.hadoop.yarn.service.webapp.ApiServer.getService(ApiServer.java:186) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ... > Caused by: org.apache.hadoop.yarn.exceptions.YarnException: No principal > specified in the persisted service definition, fail to connect to AM. 
> at > org.apache.hadoop.yarn.service.client.ServiceClient.createAMProxy(ServiceClient.java:1500) > at > org.apache.hadoop.yarn.service.client.ServiceClient.getStatus(ServiceClient.java:1376) > at > org.apache.hadoop.yarn.service.webapp.ApiServer.lambda$getServiceFromClient$4(ApiServer.java:804) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > ... 68 more > {code}
[jira] [Commented] (YARN-9001) [Submarine] Use AppAdminClient instead of ServiceClient to submit jobs
[ https://issues.apache.org/jira/browse/YARN-9001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16708649#comment-16708649 ] Zac Zhou commented on YARN-9001: Yup, I think it can be applied to 3.2.0. Since this patch uses APIs from 3.1.0, it should be ok. > [Submarine] Use AppAdminClient instead of ServiceClient to submit jobs > -- > > Key: YARN-9001 > URL: https://issues.apache.org/jira/browse/YARN-9001 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zac Zhou >Assignee: Zac Zhou >Priority: Major > Fix For: 3.3.0, 3.2.1 > > Attachments: YARN-9001-branch-3.2.001.patch, YARN-9001.001.patch, > YARN-9001.002.patch, YARN-9001.003.patch, YARN-9001.004.patch, > YARN-9001.005.patch > > > For now, Submarine submits a service to YARN by using ServiceClient. We should > change it to AppAdminClient
[jira] [Commented] (YARN-9057) CSI jar file should not bundle third party dependencies
[ https://issues.apache.org/jira/browse/YARN-9057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16708581#comment-16708581 ] Weiwei Yang commented on YARN-9057: --- Attached a patch to remove the shading code; yarn-csi now copies its dependencies to share/hadoop/yarn/csi/lib, so it is self-contained and runs with its own classpath. I have tried using an AUX service to launch the service, and it works fine. [~sunilg], [~ste...@apache.org], [~eyang], please help review. Thanks. > CSI jar file should not bundle third party dependencies > --- > > Key: YARN-9057 > URL: https://issues.apache.org/jira/browse/YARN-9057 > Project: Hadoop YARN > Issue Type: Sub-task > Components: build >Affects Versions: 3.3.0 >Reporter: Eric Yang >Assignee: Weiwei Yang >Priority: Blocker > Attachments: YARN-9057.001.patch > > > hadoop-yarn-csi-3.3.0-SNAPSHOT.jar bundles all third party classes like a > shaded jar instead of CSI only classes. This is generating error messages > for YARN cli: > {code} > SLF4J: Class path contains multiple SLF4J bindings. > SLF4J: Found binding in > [jar:file:/usr/local/hadoop-3.3.0-SNAPSHOT/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/usr/local/hadoop-3.3.0-SNAPSHOT/share/hadoop/yarn/hadoop-yarn-csi-3.3.0-SNAPSHOT.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an > explanation. > SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] > {code}
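The non-shaded packaging approach described above can be sketched with the standard maven-dependency-plugin. This is an illustrative config fragment, not the actual patch: the plugin and its copy-dependencies goal are real Maven features, but the execution id and outputDirectory shown are assumptions.

```xml
<!-- Hypothetical sketch: instead of shading third-party classes into the
     yarn-csi jar, copy the runtime dependencies into a private lib directory
     that only the CSI service places on its classpath. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-dependency-plugin</artifactId>
  <executions>
    <execution>
      <id>copy-csi-deps</id>
      <phase>package</phase>
      <goals>
        <goal>copy-dependencies</goal>
      </goals>
      <configuration>
        <!-- Illustrative target; the assembly would then place these
             under share/hadoop/yarn/csi/lib -->
        <outputDirectory>${project.build.directory}/lib</outputDirectory>
        <includeScope>runtime</includeScope>
      </configuration>
    </execution>
  </executions>
</plugin>
```

Keeping the dependencies out of the jar avoids the duplicate SLF4J binding reported in the issue, since only one copy of the binding ends up on the YARN CLI classpath.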
[jira] [Updated] (YARN-9057) CSI jar file should not bundle third party dependencies
[ https://issues.apache.org/jira/browse/YARN-9057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated YARN-9057: -- Attachment: YARN-9057.001.patch > CSI jar file should not bundle third party dependencies > --- > > Key: YARN-9057 > URL: https://issues.apache.org/jira/browse/YARN-9057 > Project: Hadoop YARN > Issue Type: Sub-task > Components: build >Affects Versions: 3.3.0 >Reporter: Eric Yang >Assignee: Weiwei Yang >Priority: Blocker > Attachments: YARN-9057.001.patch > > > hadoop-yarn-csi-3.3.0-SNAPSHOT.jar bundles all third party classes like a > shaded jar instead of CSI only classes. This is generating error messages > for YARN cli: > {code} > SLF4J: Class path contains multiple SLF4J bindings. > SLF4J: Found binding in > [jar:file:/usr/local/hadoop-3.3.0-SNAPSHOT/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/usr/local/hadoop-3.3.0-SNAPSHOT/share/hadoop/yarn/hadoop-yarn-csi-3.3.0-SNAPSHOT.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an > explanation. > SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9057) CSI jar file should not bundle third party dependencies
[ https://issues.apache.org/jira/browse/YARN-9057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16708496#comment-16708496 ] Steve Loughran commented on YARN-9057: -- bq. it seems to create more problems than the ones it fixed. Afraid so. General practise in hadoop-*: unshaded in all our cross references, moving to shaded for public artifacts (which we still need to do for the object stores). And we dream of a java9-only world... > CSI jar file should not bundle third party dependencies > --- > > Key: YARN-9057 > URL: https://issues.apache.org/jira/browse/YARN-9057 > Project: Hadoop YARN > Issue Type: Sub-task > Components: build >Affects Versions: 3.3.0 >Reporter: Eric Yang >Assignee: Weiwei Yang >Priority: Blocker > > hadoop-yarn-csi-3.3.0-SNAPSHOT.jar bundles all third party classes like a > shaded jar instead of CSI only classes. This is generating error messages > for YARN cli: > {code} > SLF4J: Class path contains multiple SLF4J bindings. > SLF4J: Found binding in > [jar:file:/usr/local/hadoop-3.3.0-SNAPSHOT/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/usr/local/hadoop-3.3.0-SNAPSHOT/share/hadoop/yarn/hadoop-yarn-csi-3.3.0-SNAPSHOT.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an > explanation. > SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] > {code}
[jira] [Resolved] (YARN-7897) Invalid NM log link published on Yarn UI when container fails
[ https://issues.apache.org/jira/browse/YARN-7897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akhil PB resolved YARN-7897. Resolution: Not A Bug UI2 has no bug; the log link is displayed from ATSv2 response data. If the log link is not available in the ATSv2 response, UI2 will display N/A. > Invalid NM log link published on Yarn UI when container fails > - > > Key: YARN-7897 > URL: https://issues.apache.org/jira/browse/YARN-7897 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn-ui-v2 >Reporter: Yesha Vora >Assignee: Akhil PB >Priority: Major > Attachments: Screen Shot 2018-02-05 at 4.52.59 PM.png > > > Steps: > 1) Launch Httpd example via rest api in unsecure mode > 2) container_e04_1517875972784_0001_01_02 fails with "Unable to find > image 'centos/httpd-24-centos7:latest" > 3) Go To RM UI2 to debug issue. > The Yarn app attempt page has incorrect values for Logs and Nodemanager UI > Logs = N/A > Nodemanager UI = http://nmhost:0
[jira] [Resolved] (YARN-8230) [UI2] Attempt Info page url shows NA for several fields for container info
[ https://issues.apache.org/jira/browse/YARN-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akhil PB resolved YARN-8230. Resolution: Not A Bug It is working as expected. The UI displays the data if available, otherwise N/A. > [UI2] Attempt Info page url shows NA for several fields for container info > -- > > Key: YARN-8230 > URL: https://issues.apache.org/jira/browse/YARN-8230 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn, yarn-ui-v2 >Reporter: Sumana Sathish >Assignee: Akhil PB >Priority: Critical > > 1. Click on any application > 2. Click on the appAttempt present > 3. Click on grid View > 4. It shows container Info. But logs / nodemanager / and several fields show > NA, with finished time as Invalid
[jira] [Commented] (YARN-8918) [Submarine] Correct method usage of str.subString in CliUtils
[ https://issues.apache.org/jira/browse/YARN-8918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16708338#comment-16708338 ] Zhankun Tang commented on YARN-8918: [~sunilg], it's a minor change, not important. > [Submarine] Correct method usage of str.subString in CliUtils > - > > Key: YARN-8918 > URL: https://issues.apache.org/jira/browse/YARN-8918 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zhankun Tang >Assignee: Zhankun Tang >Priority: Minor > Fix For: 3.3.0, 3.2.1 > > Attachments: YARN-8918-trunk.001.patch, YARN-8918-trunk.002.patch, > YARN-8918-trunk.003.patch > > > In CliUtils.java (line 74), there's an incorrect code block: > {code:java} > if (resourcesStr.endsWith("]")) { > resourcesStr = resourcesStr.substring(0, resourcesStr.length()); > }{code} > The above if block effectively executes "resourcesStr = resourcesStr". It should be > "length() - 1"
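The fix can be sketched as a small helper (the helper name is hypothetical; the real CliUtils code mutates a local resourcesStr variable in place).

```java
// Sketch of the fix described above: substring(0, length()) returns the
// string unchanged (the bug), while substring(0, length() - 1) actually
// drops the trailing ']'.
class BracketTrim {
    static String stripTrailingBracket(String resourcesStr) {
        if (resourcesStr.endsWith("]")) {
            return resourcesStr.substring(0, resourcesStr.length() - 1);
        }
        return resourcesStr;
    }
}
```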