[jira] [Updated] (YARN-6507) Add support in NodeManager to isolate FPGA devices with CGroups
[ https://issues.apache.org/jira/browse/YARN-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhankun Tang updated YARN-6507:
-------------------------------
    Fix Version/s:     (was: YARN-3926)

> Add support in NodeManager to isolate FPGA devices with CGroups
> ---------------------------------------------------------------
>
>          Key: YARN-6507
>          URL: https://issues.apache.org/jira/browse/YARN-6507
>      Project: Hadoop YARN
>   Issue Type: Sub-task
>   Components: yarn
>     Reporter: Zhankun Tang
>     Assignee: Zhankun Tang
>  Attachments: YARN-6507-branch-YARN-3926.001.patch,
>               YARN-6507-branch-YARN-3926.002.patch, YARN-6507-trunk.001.patch
>
> Support local FPGA resource scheduler to assign/isolate N FPGA slots to a
> container.
> At the beginning, support one vendor plugin with basic features to serve
> OpenCL applications.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
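The isolation described above (assign N FPGA slots to a container, deny the rest) is typically done by writing entries to the container's cgroups `devices.deny` file. The following is an illustrative sketch only, not the patch's code; the major device number 246 and all names are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Illustrative sketch of cgroups-style device isolation: given the FPGA
 * minor device numbers present on the node and the ones allocated to a
 * container, compute the "devices.deny" entries that block every other
 * FPGA device. The major number 246 is hypothetical.
 */
public class FpgaCgroupsSketch {
    /** Hypothetical major device number for the FPGA driver. */
    static final int FPGA_MAJOR = 246;

    static List<String> denyEntries(List<Integer> allMinors,
                                    List<Integer> allocatedMinors) {
        List<String> entries = new ArrayList<>();
        for (int minor : allMinors) {
            if (!allocatedMinors.contains(minor)) {
                // "c <major>:<minor> rwm" denies read/write/mknod on the char device
                entries.add("c " + FPGA_MAJOR + ":" + minor + " rwm");
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        // Node has 4 FPGA slots; the container was allocated slots 0 and 2.
        System.out.println(denyEntries(List.of(0, 1, 2, 3), List.of(0, 2)));
    }
}
```

In a real NodeManager these entries would be written into the container's devices cgroup; the sketch only computes them.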
[jira] [Updated] (YARN-6507) Add support in NodeManager to isolate FPGA devices with CGroups
[ https://issues.apache.org/jira/browse/YARN-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhankun Tang updated YARN-6507:
-------------------------------
    Attachment: YARN-6507-trunk.001.patch

Draft patch for FPGA java side code.

> Add support in NodeManager to isolate FPGA devices with CGroups
> ---------------------------------------------------------------
>
>          Key: YARN-6507
>          URL: https://issues.apache.org/jira/browse/YARN-6507
>      Project: Hadoop YARN
>   Issue Type: Sub-task
>   Components: yarn
>     Reporter: Zhankun Tang
>     Assignee: Zhankun Tang
>      Fix For: YARN-3926
>
>  Attachments: YARN-6507-branch-YARN-3926.001.patch,
>               YARN-6507-branch-YARN-3926.002.patch, YARN-6507-trunk.001.patch
>
> Support local FPGA resource scheduler to assign/isolate N FPGA slots to a
> container.
> At the beginning, support one vendor plugin with basic features to serve
> OpenCL applications.
[jira] [Commented] (YARN-7413) Support resource type in SLS
[ https://issues.apache.org/jira/browse/YARN-7413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226255#comment-16226255 ]

Daniel Templeton commented on YARN-7413:
----------------------------------------

Patch looks good. It would be nice to have the info in your previous comment
documented somewhere so that folks know about the feature. What about setting
the AM resources?

> Support resource type in SLS
> ----------------------------
>
>          Key: YARN-7413
>          URL: https://issues.apache.org/jira/browse/YARN-7413
>      Project: Hadoop YARN
>   Issue Type: Improvement
>   Components: scheduler-load-simulator
>     Reporter: Yufei Gu
>     Assignee: Yufei Gu
>  Attachments: YARN-7413.001.patch
[jira] [Commented] (YARN-7316) Cleaning up the usage of Resources and ResourceCalculator
[ https://issues.apache.org/jira/browse/YARN-7316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226244#comment-16226244 ]

Daniel Templeton commented on YARN-7316:
----------------------------------------

I had been planning to post a JIRA to remove the methods from {{Resources}}
that are just calculator pass-throughs. For me they hurt readability and
complicate the code with no benefit. How would you like to proceed here?

> Cleaning up the usage of Resources and ResourceCalculator
> ---------------------------------------------------------
>
>              Key: YARN-7316
>              URL: https://issues.apache.org/jira/browse/YARN-7316
>          Project: Hadoop YARN
>       Issue Type: Sub-task
> Affects Versions: 3.1.0
>         Reporter: lovekesh bansal
>         Assignee: lovekesh bansal
>         Priority: Minor
>          Fix For: 3.1.0
>
>      Attachments: YARN-7316_trunk.001.patch
>
> Cleaning up and unifying the usage of Resources and ResourceCalculator.
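The "pass-through" objection in the comment above can be made concrete with a minimal, hypothetical illustration; the names below are invented for this sketch and are not the actual Hadoop classes.

```java
/**
 * Hypothetical illustration of the pass-through pattern being objected to:
 * a static helper that only forwards to the calculator adds indirection
 * without changing behavior.
 */
public class PassThroughSketch {
    interface ResourceCalculator {
        long multiplyRound(long value, double by);
    }

    static class DefaultCalculator implements ResourceCalculator {
        public long multiplyRound(long value, double by) {
            return Math.round(value * by);
        }
    }

    /** Pass-through helper: does nothing but delegate to the calculator. */
    static long multiplyRound(ResourceCalculator rc, long value, double by) {
        return rc.multiplyRound(value, by);
    }

    public static void main(String[] args) {
        ResourceCalculator rc = new DefaultCalculator();
        // Both calls do exactly the same work; the wrapper is pure indirection,
        // which is the readability cost the comment describes.
        System.out.println(multiplyRound(rc, 10, 1.5) == rc.multiplyRound(10, 1.5));
    }
}
```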
[jira] [Commented] (YARN-6594) [API] Introduce SchedulingRequest object
[ https://issues.apache.org/jira/browse/YARN-6594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226226#comment-16226226 ]

Konstantinos Karanasos commented on YARN-6594:
----------------------------------------------

Thanks [~wangda]! And [~asuresh], [~jianhe], [~sunilg] for the feedback/reviews!

> [API] Introduce SchedulingRequest object
> ----------------------------------------
>
>          Key: YARN-6594
>          URL: https://issues.apache.org/jira/browse/YARN-6594
>      Project: Hadoop YARN
>   Issue Type: Sub-task
>     Reporter: Konstantinos Karanasos
>     Assignee: Konstantinos Karanasos
>      Fix For: YARN-6592
>
>  Attachments: YARN-6594-YARN-6592.002.patch, YARN-6594.001.patch
>
> This JIRA introduces a new SchedulingRequest object.
> It will be part of the {{AllocateRequest}} and will be used to define sizing
> (e.g., number of allocations, size of allocations) and placement constraints
> for allocations.
> Applications can use either this new object (when rich placement constraints
> are required) or the existing {{ResourceRequest}} object.
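The sizing described in the issue (number of allocations times per-allocation resources, plus a placement constraint) can be sketched as a simplified value object. This is an illustrative mirror, not the committed YARN API; all field and method names here are assumptions.

```java
/**
 * Simplified, illustrative mirror of the proposed SchedulingRequest:
 * sizing = numAllocations x per-allocation resources, plus an opaque
 * placement-constraint expression. Not the actual YARN class.
 */
public class SchedulingRequestSketch {
    final int numAllocations;
    final long memoryMb;
    final int vcores;
    final String placementConstraint; // hypothetical, e.g. "notin,node,hbase-m"

    SchedulingRequestSketch(int numAllocations, long memoryMb, int vcores,
                            String placementConstraint) {
        this.numAllocations = numAllocations;
        this.memoryMb = memoryMb;
        this.vcores = vcores;
        this.placementConstraint = placementConstraint;
    }

    /** Total memory the scheduler would need to satisfy this request. */
    long totalMemoryMb() {
        return (long) numAllocations * memoryMb;
    }

    /** Total vcores across all requested allocations. */
    int totalVcores() {
        return numAllocations * vcores;
    }

    public static void main(String[] args) {
        // Ask for 5 containers of 2048 MB / 2 vcores each.
        SchedulingRequestSketch req =
            new SchedulingRequestSketch(5, 2048, 2, "notin,node,hbase-m");
        System.out.println(req.totalMemoryMb() + " MB, " + req.totalVcores() + " vcores");
    }
}
```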
[jira] [Commented] (YARN-6413) Yarn Registry FS implementation
[ https://issues.apache.org/jira/browse/YARN-6413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226218#comment-16226218 ]

Jian He commented on YARN-6413:
-------------------------------

bq. will the yarn-native-services branch add the option to configure which
implementing class to use for registry?

No, it doesn't add that option.

> Yarn Registry FS implementation
> -------------------------------
>
>          Key: YARN-6413
>          URL: https://issues.apache.org/jira/browse/YARN-6413
>      Project: Hadoop YARN
>   Issue Type: Improvement
>   Components: amrmproxy, api, resourcemanager
>     Reporter: Ellen Hui
>     Assignee: Ellen Hui
>  Attachments: 0001-Registry-API-v2.patch,
>               0001-YARN-6413-Yarn-Registry-FS-Implementation.patch,
>               0002-Registry-API-v2.patch, 0003-Registry-API-api-only.patch,
>               0004-Registry-API-api-stubbed.patch, YARN-6413.v1.patch,
>               YARN-6413.v2.patch, YARN-6413.v3.patch, YARN-6413.v4.patch,
>               YARN-6413.v5.patch, YARN-6413.v6.patch, YARN-6413.v7.patch
>
> Add a RegistryOperations implementation that writes records to the file
> system. This does not include any changes to the API, to avoid compatibility
> issues.
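The core idea of a filesystem-backed registry (each registry path maps to a record file under a configured root directory) can be sketched as follows. The root directory and `_record.json` file naming are assumptions for illustration, not the patch's actual layout.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/**
 * Illustrative sketch of a filesystem-backed registry: each registry path
 * (e.g. /users/ellen/app-1) maps to a JSON record file under a configured
 * root directory. Naming conventions here are hypothetical.
 */
public class FsRegistrySketch {
    final Path root;

    FsRegistrySketch(String root) {
        this.root = Paths.get(root);
    }

    /** Map a registry path to the file that would hold its service record. */
    Path recordFile(String registryPath) {
        // Strip leading slashes so the registry path resolves under the root.
        String rel = registryPath.replaceAll("^/+", "");
        return root.resolve(rel).resolve("_record.json");
    }

    public static void main(String[] args) {
        FsRegistrySketch reg = new FsRegistrySketch("/registry");
        System.out.println(reg.recordFile("/users/ellen/app-1"));
    }
}
```

A real implementation would also serialize/deserialize the service record and handle listing and deletion; the sketch shows only the path mapping.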
[jira] [Commented] (YARN-6413) Yarn Registry FS implementation
[ https://issues.apache.org/jira/browse/YARN-6413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226184#comment-16226184 ]

Hadoop QA commented on YARN-6413:
---------------------------------

| (/) +1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 19s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 16m 12s | trunk passed |
| +1 | compile | 0m 18s | trunk passed |
| +1 | checkstyle | 0m 15s | trunk passed |
| +1 | mvnsite | 0m 23s | trunk passed |
| +1 | shadedclient | 10m 37s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 26s | trunk passed |
| +1 | javadoc | 0m 14s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 21s | the patch passed |
| +1 | compile | 0m 17s | the patch passed |
| +1 | javac | 0m 17s | the patch passed |
| +1 | checkstyle | 0m 11s | the patch passed |
| +1 | mvnsite | 0m 17s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 11m 13s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 34s | the patch passed |
| +1 | javadoc | 0m 13s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 46s | hadoop-yarn-registry in the patch passed. |
| +1 | asflicense | 0m 17s | The patch does not generate ASF License warnings. |
| | | 43m 7s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-6413 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12894894/YARN-6413.v7.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux fa036cefafb6 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 18:04:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / a8083aa |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/18253/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/18253/console |
| Powered by | Apache Yetus 0.7.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.

> Yarn Registry FS implementation
> -------------------------------
>
>     Key: YARN-6413
>     URL: https://issues.apache.org/jira/browse/YARN-6413
>
[jira] [Commented] (YARN-7330) Add support to show GPU on UI/metrics
[ https://issues.apache.org/jira/browse/YARN-7330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226181#comment-16226181 ]

Hadoop QA commented on YARN-7330:
---------------------------------

| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 19s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 59s | Maven dependency ordering for branch |
| +1 | mvninstall | 30m 21s | trunk passed |
| +1 | compile | 12m 50s | trunk passed |
| +1 | checkstyle | 1m 56s | trunk passed |
| +1 | mvnsite | 5m 47s | trunk passed |
| +1 | shadedclient | 21m 12s | branch has no errors when building and testing our client artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui |
| -1 | findbugs | 1m 30s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager in trunk has 2 extant Findbugs warnings. |
| +1 | javadoc | 1m 40s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 13s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 52s | the patch passed |
| +1 | compile | 8m 31s | the patch passed |
| +1 | javac | 8m 31s | the patch passed |
| -0 | checkstyle | 1m 14s | hadoop-yarn-project/hadoop-yarn: The patch generated 22 new + 65 unchanged - 1 fixed = 87 total (was 66) |
| +1 | mvnsite | 3m 16s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 12m 0s | patch has no errors when building and testing our client artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui |
| +1 | findbugs | 5m 6s | the patch passed |
| +1 | javadoc | 1m 45s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 43s | hadoop-yarn-api in the patch passed. |
| -1 | unit | 17m 8s | hadoop-yarn-server-nodemanager in the patch failed. |
| -1 | unit | 61m 21s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | unit | 3m 40s | hadoop-yarn-ui in the patch passed. |
| -1 | asflicense | 0m 57s | The patch generated 1 ASF License warnings. |
| | | 192m 34s | |
[jira] [Commented] (YARN-6508) Support FPGA plugin
[ https://issues.apache.org/jira/browse/YARN-6508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226162#comment-16226162 ]

Zhankun Tang commented on YARN-6508:
------------------------------------

The default vendor-specific plugin is implemented in YARN-6507, so I'm closing
this.

> Support FPGA plugin
> -------------------
>
>          Key: YARN-6508
>          URL: https://issues.apache.org/jira/browse/YARN-6508
>      Project: Hadoop YARN
>   Issue Type: Sub-task
>   Components: yarn
>     Reporter: Zhankun Tang
[jira] [Resolved] (YARN-6508) Support FPGA plugin
[ https://issues.apache.org/jira/browse/YARN-6508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhankun Tang resolved YARN-6508.
--------------------------------
    Resolution: Implemented

> Support FPGA plugin
> -------------------
>
>          Key: YARN-6508
>          URL: https://issues.apache.org/jira/browse/YARN-6508
>      Project: Hadoop YARN
>   Issue Type: Sub-task
>   Components: yarn
>     Reporter: Zhankun Tang
[jira] [Updated] (YARN-6507) Add support in NodeManager to isolate FPGA devices with CGroups
[ https://issues.apache.org/jira/browse/YARN-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhankun Tang updated YARN-6507:
-------------------------------
    Description:
Support local FPGA resource scheduler to assign/isolate N FPGA slots to a
container.
At the beginning, support one vendor plugin with basic features to serve
OpenCL applications.

  was:
Support local FPGA resource scheduler to assign/cleanup N FPGA slots to/from a
container.
Support vendor plugin framework with basic features that meets vendor
requirements.

> Add support in NodeManager to isolate FPGA devices with CGroups
> ---------------------------------------------------------------
>
>          Key: YARN-6507
>          URL: https://issues.apache.org/jira/browse/YARN-6507
>      Project: Hadoop YARN
>   Issue Type: Sub-task
>   Components: yarn
>     Reporter: Zhankun Tang
>     Assignee: Zhankun Tang
>      Fix For: YARN-3926
>
>  Attachments: YARN-6507-branch-YARN-3926.001.patch,
>               YARN-6507-branch-YARN-3926.002.patch
>
> Support local FPGA resource scheduler to assign/isolate N FPGA slots to a
> container.
> At the beginning, support one vendor plugin with basic features to serve
> OpenCL applications.
[jira] [Commented] (YARN-4511) Common scheduler changes supporting scheduler-specific implementations
[ https://issues.apache.org/jira/browse/YARN-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226157#comment-16226157 ]

Hadoop QA commented on YARN-4511:
---------------------------------

| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 10m 24s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 14 new or modified test files. |
|| || || || YARN-1011 Compile Tests ||
| 0 | mvndep | 5m 56s | Maven dependency ordering for branch |
| +1 | mvninstall | 11m 56s | YARN-1011 passed |
| +1 | compile | 13m 4s | YARN-1011 passed |
| +1 | checkstyle | 2m 2s | YARN-1011 passed |
| +1 | mvnsite | 1m 0s | YARN-1011 passed |
| +1 | shadedclient | 11m 37s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 41s | YARN-1011 passed |
| +1 | javadoc | 0m 46s | YARN-1011 passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 15s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 47s | the patch passed |
| +1 | compile | 9m 59s | the patch passed |
| +1 | javac | 9m 59s | the patch passed |
| -0 | checkstyle | 2m 6s | root: The patch generated 7 new + 1358 unchanged - 4 fixed = 1365 total (was 1362) |
| +1 | mvnsite | 1m 1s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 7m 28s | patch has no errors when building and testing our client artifacts. |
| -1 | findbugs | 1m 8s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager generated 4 new + 0 unchanged - 0 fixed = 4 total (was 0) |
| +1 | javadoc | 0m 43s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 60m 12s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | unit | 7m 48s | hadoop-sls in the patch passed. |
| +1 | asflicense | 0m 26s | The patch does not generate ASF License warnings. |
| | | 153m 51s | |

|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager |
| | Increment of volatile field org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode.numGuaranteedContainers in org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode.guaranteedContainerReleased(Container) At SchedulerNode.java:[line 383] |
| | Increment of volatile field
[jira] [Commented] (YARN-6507) Add support in NodeManager to isolate FPGA devices with CGroups
[ https://issues.apache.org/jira/browse/YARN-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226156#comment-16226156 ]

Zhankun Tang commented on YARN-6507:
------------------------------------

Because we'll support only one kind of plugin for the time being, I changed the
title from "Support FPGA abstraction framework on NM side" to "Add support in
NodeManager to isolate FPGA devices with CGroups".

> Add support in NodeManager to isolate FPGA devices with CGroups
> ---------------------------------------------------------------
>
>          Key: YARN-6507
>          URL: https://issues.apache.org/jira/browse/YARN-6507
>      Project: Hadoop YARN
>   Issue Type: Sub-task
>   Components: yarn
>     Reporter: Zhankun Tang
>     Assignee: Zhankun Tang
>      Fix For: YARN-3926
>
>  Attachments: YARN-6507-branch-YARN-3926.001.patch,
>               YARN-6507-branch-YARN-3926.002.patch
>
> Support local FPGA resource scheduler to assign/cleanup N FPGA slots to/from
> a container.
> Support vendor plugin framework with basic features that meets vendor
> requirements.
[jira] [Updated] (YARN-6507) Add support in NodeManager to isolate FPGA devices with CGroups
[ https://issues.apache.org/jira/browse/YARN-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhankun Tang updated YARN-6507:
-------------------------------
    Summary: Add support in NodeManager to isolate FPGA devices with CGroups
             (was: Support FPGA abstraction framework on NM side)

> Add support in NodeManager to isolate FPGA devices with CGroups
> ---------------------------------------------------------------
>
>          Key: YARN-6507
>          URL: https://issues.apache.org/jira/browse/YARN-6507
>      Project: Hadoop YARN
>   Issue Type: Sub-task
>   Components: yarn
>     Reporter: Zhankun Tang
>     Assignee: Zhankun Tang
>      Fix For: YARN-3926
>
>  Attachments: YARN-6507-branch-YARN-3926.001.patch,
>               YARN-6507-branch-YARN-3926.002.patch
>
> Support local FPGA resource scheduler to assign/cleanup N FPGA slots to/from
> a container.
> Support vendor plugin framework with basic features that meets vendor
> requirements.
[jira] [Commented] (YARN-7276) Federation Router Web Service fixes
[ https://issues.apache.org/jira/browse/YARN-7276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226132#comment-16226132 ]

Hadoop QA commented on YARN-7276:
---------------------------------

| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 15m 13s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 1s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| || || || branch-2 Compile Tests ||
| +1 | mvninstall | 10m 42s | branch-2 passed |
| +1 | compile | 0m 20s | branch-2 passed |
| +1 | checkstyle | 0m 15s | branch-2 passed |
| +1 | mvnsite | 0m 25s | branch-2 passed |
| +1 | findbugs | 0m 34s | branch-2 passed |
| +1 | javadoc | 0m 16s | branch-2 passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 23s | the patch passed |
| +1 | compile | 0m 19s | the patch passed |
| -1 | javac | 0m 19s | hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-router generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 | checkstyle | 0m 12s | the patch passed |
| +1 | mvnsite | 0m 23s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 0m 45s | the patch passed |
| +1 | javadoc | 0m 13s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 0m 45s | hadoop-yarn-server-router in the patch failed. |
| +1 | asflicense | 0m 21s | The patch does not generate ASF License warnings. |
| | | 32m 13s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.router.webapp.TestRouterWebServicesREST |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:17213a0 |
| JIRA Issue | YARN-7276 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12894820/YARN-7276-branch-2.006.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux af6dd3951d49 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-2 / 76ec5ea |
| maven | version: Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-10T16:41:47+00:00) |
| Default Java | 1.7.0_151 |
| findbugs | v3.0.0 |
| javac | https://builds.apache.org/job/PreCommit-YARN-Build/18252/artifact/out/diff-compile-javac-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-router.txt |
| unit | https://builds.apache.org/job/PreCommit-YARN-Build/18252/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-router.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/18252/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router |
| Console output |
[jira] [Commented] (YARN-6413) Yarn Registry FS implementation
[ https://issues.apache.org/jira/browse/YARN-6413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226125#comment-16226125 ]

Hadoop QA commented on YARN-6413:
---------------------------------

| (/) +1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 16s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 14m 50s | trunk passed |
| +1 | compile | 0m 13s | trunk passed |
| +1 | checkstyle | 0m 9s | trunk passed |
| +1 | mvnsite | 0m 15s | trunk passed |
| +1 | shadedclient | 8m 51s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 21s | trunk passed |
| +1 | javadoc | 0m 10s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 15s | the patch passed |
| +1 | compile | 0m 11s | the patch passed |
| +1 | javac | 0m 11s | the patch passed |
| +1 | checkstyle | 0m 8s | the patch passed |
| +1 | mvnsite | 0m 13s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 9m 48s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 27s | the patch passed |
| +1 | javadoc | 0m 10s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 35s | hadoop-yarn-registry in the patch passed. |
| +1 | asflicense | 0m 13s | The patch does not generate ASF License warnings. |
| | | 37m 20s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-6413 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12894894/YARN-6413.v7.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux cc225c580d48 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / a8083aa |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/18251/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/18251/console |
| Powered by | Apache Yetus 0.7.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.

> Yarn Registry FS implementation
> -------------------------------
>
>     Key: YARN-6413
>     URL: https://issues.apache.org/jira/browse/YARN-6413
>     Project:
[jira] [Commented] (YARN-7316) Cleaning up the usage of Resources and ResourceCalculator
[ https://issues.apache.org/jira/browse/YARN-7316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226124#comment-16226124 ] Sunil G commented on YARN-7316: --- I think it was mainly for readability and to follow a common approach similar to other APIs. Some of the APIs in the Resources class don't need a calculator either. This gives a common view that any ops on Resource objects can be done via static methods in the Resources class. Other than that, there are no other reasons that I can see. > Cleaning up the usage of Resources and ResourceCalculator > - > > Key: YARN-7316 > URL: https://issues.apache.org/jira/browse/YARN-7316 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.0 >Reporter: lovekesh bansal >Assignee: lovekesh bansal >Priority: Minor > Fix For: 3.1.0 > > Attachments: YARN-7316_trunk.001.patch > > > Cleaning up and unifying the usage of Resources and > ResourceCalculator. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
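Sunil's point above — that component-wise Resource operations can live as static helpers while only cross-dimension comparisons need a calculator — can be sketched roughly as follows. All class and method names here are invented for illustration; this is not YARN's actual Resources/ResourceCalculator API:

```java
import java.util.Comparator;

public class ResourcesSketch {
    // Illustrative stand-in for YARN's Resource type; not the real API.
    static final class Res {
        final long memoryMb;
        final int vcores;
        Res(long memoryMb, int vcores) { this.memoryMb = memoryMb; this.vcores = vcores; }
    }

    // Component-wise ops like add need no calculator at all.
    static Res add(Res a, Res b) {
        return new Res(a.memoryMb + b.memoryMb, a.vcores + b.vcores);
    }

    // Ops that must rank resources across dimensions take the comparison
    // policy explicitly -- the role a ResourceCalculator plays in YARN.
    static Res max(Comparator<Res> calc, Res a, Res b) {
        return calc.compare(a, b) >= 0 ? a : b;
    }

    public static void main(String[] args) {
        Res a = new Res(1024, 1);
        Res b = new Res(2048, 3);
        Res sum = add(a, b);
        Comparator<Res> byMemory = Comparator.comparingLong(r -> r.memoryMb);
        System.out.println(sum.memoryMb + ":" + sum.vcores); // prints 3072:4
        System.out.println(max(byMemory, a, b).memoryMb);    // prints 2048
    }
}
```

With this shape, callers always go through one static entry point, and only the ops that genuinely compare across dimensions carry the extra calculator argument.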
[jira] [Updated] (YARN-6413) Yarn Registry FS implementation
[ https://issues.apache.org/jira/browse/YARN-6413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ellen Hui updated YARN-6413: Attachment: YARN-6413.v7.patch Fix Checkstyle and whitespace > Yarn Registry FS implementation > --- > > Key: YARN-6413 > URL: https://issues.apache.org/jira/browse/YARN-6413 > Project: Hadoop YARN > Issue Type: Improvement > Components: amrmproxy, api, resourcemanager >Reporter: Ellen Hui >Assignee: Ellen Hui > Attachments: 0001-Registry-API-v2.patch, > 0001-YARN-6413-Yarn-Registry-FS-Implementation.patch, > 0002-Registry-API-v2.patch, 0003-Registry-API-api-only.patch, > 0004-Registry-API-api-stubbed.patch, YARN-6413.v1.patch, YARN-6413.v2.patch, > YARN-6413.v3.patch, YARN-6413.v4.patch, YARN-6413.v5.patch, > YARN-6413.v6.patch, YARN-6413.v7.patch > > > Add a RegistryOperations implementation that writes records to the file > system. This does not include any changes to the API, to avoid compatibility > issues.
[jira] [Commented] (YARN-7400) incorrect log preview displayed in jobhistory server ui
[ https://issues.apache.org/jira/browse/YARN-7400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226092#comment-16226092 ] Subru Krishnan commented on YARN-7400: -- Thanks [~djp] for the clarification and [~xgong] for the quick fix. It looks like it will make it into 2.9.0 anyway. > incorrect log preview displayed in jobhistory server ui > --- > > Key: YARN-7400 > URL: https://issues.apache.org/jira/browse/YARN-7400 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.9.0, 3.0.0, 3.1.0 >Reporter: Santhosh B Gowda >Assignee: Xuan Gong >Priority: Blocker > Attachments: YARN-7400.1.patch > > > In the job history server UI, if we enable the new log format, the container > preview log is displayed incorrectly, e.g. launch_container.sh is > displaying stderr logs.
[jira] [Commented] (YARN-6413) Yarn Registry FS implementation
[ https://issues.apache.org/jira/browse/YARN-6413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226080#comment-16226080 ] Hadoop QA commented on YARN-6413: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m 24s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 8m 51s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 11s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 9s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry: The patch generated 1 new + 15 unchanged - 0 fixed = 16 total (was 15) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 11 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 40s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 38s{color} | {color:green} hadoop-yarn-registry in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 51m 51s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-6413 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12894857/YARN-6413.v5.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux ea97c8f77ec2 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / a8083aa | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_131 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/18248/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-registry.txt | | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/18248/artifact/out/whitespace-eol.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/18248/testReport/ | | modules | C:
[jira] [Updated] (YARN-6413) Yarn Registry FS implementation
[ https://issues.apache.org/jira/browse/YARN-6413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ellen Hui updated YARN-6413: Attachment: YARN-6413.v6.patch Addressed [~jianhe]'s comments. Hi Jian, a quick question: will the yarn-native-services branch add the option to configure which implementing class to use for registry? > Yarn Registry FS implementation > --- > > Key: YARN-6413 > URL: https://issues.apache.org/jira/browse/YARN-6413 > Project: Hadoop YARN > Issue Type: Improvement > Components: amrmproxy, api, resourcemanager >Reporter: Ellen Hui >Assignee: Ellen Hui > Attachments: 0001-Registry-API-v2.patch, > 0001-YARN-6413-Yarn-Registry-FS-Implementation.patch, > 0002-Registry-API-v2.patch, 0003-Registry-API-api-only.patch, > 0004-Registry-API-api-stubbed.patch, YARN-6413.v1.patch, YARN-6413.v2.patch, > YARN-6413.v3.patch, YARN-6413.v4.patch, YARN-6413.v5.patch, YARN-6413.v6.patch > > > Add a RegistryOperations implementation that writes records to the file > system. This does not include any changes to the API, to avoid compatibility > issues.
[jira] [Commented] (YARN-7400) incorrect log preview displayed in jobhistory server ui
[ https://issues.apache.org/jira/browse/YARN-7400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226069#comment-16226069 ] Hadoop QA commented on YARN-7400: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 41s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 59s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 31s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 43m 48s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7400 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12894867/YARN-7400.1.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 7af99646e1b8 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / a8083aa | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_131 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/18246/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/18246/console | | Powered by | Apache Yetus 0.7.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > incorrect log preview displayed in jobhistory server ui >
[jira] [Created] (YARN-7418) Improve performance of locking in fair scheduler
Daniel Templeton created YARN-7418: -- Summary: Improve performance of locking in fair scheduler Key: YARN-7418 URL: https://issues.apache.org/jira/browse/YARN-7418 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 3.0.0-beta1 Reporter: Daniel Templeton Assignee: Daniel Templeton Based on initial testing, we can improve scheduler performance by 5%-10% with some simple optimizations.
[jira] [Commented] (YARN-7336) Unsafe cast from long to int Resource.hashCode() method
[ https://issues.apache.org/jira/browse/YARN-7336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226031#comment-16226031 ] Hudson commented on YARN-7336: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13161 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13161/]) YARN-7336. Unsafe cast from long to int Resource.hashCode() method (templedf: rev d64736d58965722b71d6eade578b6c4c266e6448) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/impl/LightWeightResource.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/Resource.java > Unsafe cast from long to int Resource.hashCode() method > --- > > Key: YARN-7336 > URL: https://issues.apache.org/jira/browse/YARN-7336 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 3.0.0-beta1, 3.1.0 >Reporter: Daniel Templeton >Assignee: Miklos Szegedi >Priority: Critical > Labels: ready-to-commit > Fix For: 3.1.0 > > Attachments: YARN-7336.000.patch, YARN-7336.001.patch > > > For example: > {code} > final int prime = 47; > long result = 0; > for (ResourceInformation entry : resources) { > result = prime * result + entry.hashCode(); > } > return (int) result; > {code}
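The {code} snippet quoted in the issue accumulates the hash in a long and then narrows it with a lossy (int) cast. One conventional fix — a sketch over an assumed stand-in type, not the actual committed YARN-7336 patch — is to accumulate in an int from the start, letting overflow wrap the same way java.util.Objects.hash does:

```java
public class HashCodeSketch {
    // Hypothetical stand-in for ResourceInformation entries; illustrative only.
    static final class Info {
        final String name;
        final long value;
        Info(String name, long value) { this.name = name; this.value = value; }
        @Override public int hashCode() { return 31 * name.hashCode() + Long.hashCode(value); }
    }

    // Accumulate in an int from the start, so no unsafe (int) cast of a
    // long accumulator is needed at the end.
    static int hash(Info[] resources) {
        final int prime = 47;
        int result = 0;
        for (Info entry : resources) {
            result = prime * result + entry.hashCode();
        }
        return result;
    }

    public static void main(String[] args) {
        Info[] r = { new Info("memory-mb", 4096L), new Info("vcores", 4L) };
        // Same inputs must yield the same hash on every call.
        System.out.println(hash(r) == hash(r)); // prints true
    }
}
```

Note that Long.hashCode(value) is also the standard way to fold a single long field into an int hash without truncation artifacts.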
[jira] [Commented] (YARN-6927) Add support for individual resource types requests in MapReduce
[ https://issues.apache.org/jira/browse/YARN-6927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226030#comment-16226030 ] Hudson commented on YARN-6927: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13161 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13161/]) YARN-6927. Add support for individual resource types requests in (templedf: rev 9a7e81083801a57d6bb96584988415cbef67460d) * (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskAttempt.java * (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java * (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestYARNRunner.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/util/resource/ResourceUtils.java * (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java * (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/TestMapreduceConfigFields.java * (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/YARNRunner.java > Add support for individual resource types requests in MapReduce > --- > > Key: YARN-6927 > URL: https://issues.apache.org/jira/browse/YARN-6927 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Daniel Templeton >Assignee: Gergo Repas > Fix For: 3.1.0 > > Attachments: YARN-6927.000.patch, YARN-6927.001.patch, > YARN-6927.002.patch, YARN-6927.003.patch, YARN-6927.004.patch, > YARN-6927.005.patch, YARN-6927.006.patch, YARN-6927.007.patch, > YARN-6927.008.patch, 
YARN-6927.009.patch, YARN-6927.010.patch > > > YARN-6504 adds support for resource profiles in MapReduce jobs, but resource > profiles don't give users much flexibility in their resource requests. To > satisfy users' needs, MapReduce should also allow users to specify arbitrary > resource requests.
[jira] [Updated] (YARN-4511) Common scheduler changes supporting scheduler-specific implementations
[ https://issues.apache.org/jira/browse/YARN-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-4511: - Attachment: YARN-4511-YARN-1011.10.patch > Common scheduler changes supporting scheduler-specific implementations > -- > > Key: YARN-4511 > URL: https://issues.apache.org/jira/browse/YARN-4511 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Haibo Chen > Attachments: YARN-4511-YARN-1011.00.patch, > YARN-4511-YARN-1011.01.patch, YARN-4511-YARN-1011.02.patch, > YARN-4511-YARN-1011.03.patch, YARN-4511-YARN-1011.04.patch, > YARN-4511-YARN-1011.05.patch, YARN-4511-YARN-1011.06.patch, > YARN-4511-YARN-1011.07.patch, YARN-4511-YARN-1011.08.patch, > YARN-4511-YARN-1011.09.patch, YARN-4511-YARN-1011.10.patch > >
[jira] [Reopened] (YARN-7202) Add UT for api-server
[ https://issues.apache.org/jira/browse/YARN-7202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gour Saha reopened YARN-7202: - Re-opening this, since the issues that were raised in this jira on "16/Oct/17 17:48" need to be addressed. Refer to this comment for the raised issues - https://issues.apache.org/jira/browse/YARN-7202?focusedCommentId=16206843=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16206843 > Add UT for api-server > - > > Key: YARN-7202 > URL: https://issues.apache.org/jira/browse/YARN-7202 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Jian He >Assignee: Eric Yang > Fix For: yarn-native-services > > Attachments: YARN-7202.yarn-native-services.001.patch, > YARN-7202.yarn-native-services.002.patch, > YARN-7202.yarn-native-services.003.patch, > YARN-7202.yarn-native-services.004.patch, > YARN-7202.yarn-native-services.005.patch, > YARN-7202.yarn-native-services.006.patch, > YARN-7202.yarn-native-services.007.patch, > YARN-7202.yarn-native-services.008.patch, > YARN-7202.yarn-native-services.011.patch, > YARN-7202.yarn-native-services.012.patch, > YARN-7202.yarn-native-services.013.patch, > YARN-7202.yarn-native-services.014.patch > >
[jira] [Commented] (YARN-4511) Common scheduler changes supporting scheduler-specific implementations
[ https://issues.apache.org/jira/browse/YARN-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226021#comment-16226021 ] Haibo Chen commented on YARN-4511: -- Filed YARN-7337 for SchedulerNodeReport changes because I think there is more work to do there. Uploaded the patch to include the following changes: 1) remove duplicate isDebugEnabled() 2) replace assert statements with throw YarnRuntimeExceptions 3) rename guaranteedContainerResourceReleased() to guaranteedContainerReleased, include all necessary updates. Similarly for opportunisticContainerResourceReleased(), opportunisticContainerResourceAllocated(), guaranteedContainerResourceAllocated() 4) replace containerAllocated(resource, allocatedResourceOpportunistic) with if (containerAllocated(resource, allocatedResourceOpportunistic)) { // nothing else to do } to make it consistent with guaranteed container allocation. > Common scheduler changes supporting scheduler-specific implementations > -- > > Key: YARN-4511 > URL: https://issues.apache.org/jira/browse/YARN-4511 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Haibo Chen > Attachments: YARN-4511-YARN-1011.00.patch, > YARN-4511-YARN-1011.01.patch, YARN-4511-YARN-1011.02.patch, > YARN-4511-YARN-1011.03.patch, YARN-4511-YARN-1011.04.patch, > YARN-4511-YARN-1011.05.patch, YARN-4511-YARN-1011.06.patch, > YARN-4511-YARN-1011.07.patch, YARN-4511-YARN-1011.08.patch, > YARN-4511-YARN-1011.09.patch > >
[jira] [Commented] (YARN-6413) Yarn Registry FS implementation
[ https://issues.apache.org/jira/browse/YARN-6413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226017#comment-16226017 ] Jian He commented on YARN-6413: --- - RegistryOperationsStoreService - rename to FSRegistryOperationsService ? - The ServiceRecord class has a bunch of format-only changes, please revert those > Yarn Registry FS implementation > --- > > Key: YARN-6413 > URL: https://issues.apache.org/jira/browse/YARN-6413 > Project: Hadoop YARN > Issue Type: Improvement > Components: amrmproxy, api, resourcemanager >Reporter: Ellen Hui >Assignee: Ellen Hui > Attachments: 0001-Registry-API-v2.patch, > 0001-YARN-6413-Yarn-Registry-FS-Implementation.patch, > 0002-Registry-API-v2.patch, 0003-Registry-API-api-only.patch, > 0004-Registry-API-api-stubbed.patch, YARN-6413.v1.patch, YARN-6413.v2.patch, > YARN-6413.v3.patch, YARN-6413.v4.patch, YARN-6413.v5.patch > > > Add a RegistryOperations implementation that writes records to the file > system. This does not include any changes to the API, to avoid compatibility > issues.
[jira] [Commented] (YARN-7394) Merge code paths for Reservation/Plan queues and Auto Created queues
[ https://issues.apache.org/jira/browse/YARN-7394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225986#comment-16225986 ] Wangda Tan commented on YARN-7394: -- [~suma.shivaprasad], Thanks for updating the patch, few comments: 1) Instead of adding {{validate}}, probably it's easier to let PlanQueue overwrite {{reinitialize}}. 2) Can we move setEntitlement to AutoCreatedLeafQueue. Beyond that, patch looks good to me. [~subru]/[~curino], wanna take a look? > Merge code paths for Reservation/Plan queues and Auto Created queues > > > Key: YARN-7394 > URL: https://issues.apache.org/jira/browse/YARN-7394 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Suma Shivaprasad >Assignee: Suma Shivaprasad > Attachments: YARN-7394.1.patch, YARN-7394.2.patch, YARN-7394.4.patch, > YARN-7394.5.patch, YARN-7394.patch > > > The initialization/reinitialization logic for ReservationQueue and > AutoCreated Leaf queues are similar. The proposal is to rename > ReservationQueue to a more generic name AutoCreatedLeafQueue which are either > managed by PlanQueue(already exists) or AutoCreateEnabledParentQueue (new > class).
[jira] [Comment Edited] (YARN-7394) Merge code paths for Reservation/Plan queues and Auto Created queues
[ https://issues.apache.org/jira/browse/YARN-7394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225986#comment-16225986 ] Wangda Tan edited comment on YARN-7394 at 10/30/17 11:40 PM: - [~suma.shivaprasad], Thanks for updating the patch, few comments: 1) Instead of adding {{validate}}, probably it's easier to let PlanQueue overwrite {{reinitialize}}. 2) Can we move setEntitlement to AutoCreatedLeafQueue. Beyond that, patch looks good to me. [~subru]/[~curino], wanna take a look? Changes should be straightforward. was (Author: leftnoteasy): [~suma.shivaprasad], Thanks for updating the patch, few comments: 1) Instead of adding {{validate}}, probably it's easier to let PlanQueue overwrite {{reinitialize}}. 2) Can we move setEntitlement to AutoCreatedLeafQueue. Beyond that, patch looks good to me. [~subru]/[~curino], wanna take a look? > Merge code paths for Reservation/Plan queues and Auto Created queues > > > Key: YARN-7394 > URL: https://issues.apache.org/jira/browse/YARN-7394 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Suma Shivaprasad >Assignee: Suma Shivaprasad > Attachments: YARN-7394.1.patch, YARN-7394.2.patch, YARN-7394.4.patch, > YARN-7394.5.patch, YARN-7394.patch > > > The initialization/reinitialization logic for ReservationQueue and > AutoCreated Leaf queues are similar. The proposal is to rename > ReservationQueue to a more generic name AutoCreatedLeafQueue which are either > managed by PlanQueue(already exists) or AutoCreateEnabledParentQueue (new > class).
[jira] [Commented] (YARN-4511) Common scheduler changes supporting scheduler-specific implementations
[ https://issues.apache.org/jira/browse/YARN-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225984#comment-16225984 ] Haibo Chen commented on YARN-4511: -- Thanks [~asuresh] for your review! bq. Is there a case where the 'resource' argument might be null in the former method call (since that is only case when the containerResourceAllocated method can return false)? Not sure. But this is the same behavior as before, so I figure it's safer to preserve it. The reason why I did not do the same in opportunisticContainerResourceAllocated is that there is nothing else to do if containerResourceAlllocated returns to true. i.e., if we were to make it consistent, it'll be like {code} if (containerAllocated(resource, allocatedResourceOpportunistic)) { // nothing else to do } {code} Not sure what's best style here. Open to all suggestions/preferences. bq. but then in the SchedulerNodeReport, shouldn't this.num = numOpp + numGuaranteed ? Yes. but given how used/avail is assigned in SchedulerNodeReport (used -> guaranteedResourceUsed, num-> numGuaranteedContainers), I created YARN-7337 to augment SchedulerNodeReport with opportunistic container stats instead of modifying existing variables and also make sure it is exposed properly in NodeReport api > Common scheduler changes supporting scheduler-specific implementations > -- > > Key: YARN-4511 > URL: https://issues.apache.org/jira/browse/YARN-4511 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Haibo Chen > Attachments: YARN-4511-YARN-1011.00.patch, > YARN-4511-YARN-1011.01.patch, YARN-4511-YARN-1011.02.patch, > YARN-4511-YARN-1011.03.patch, YARN-4511-YARN-1011.04.patch, > YARN-4511-YARN-1011.05.patch, YARN-4511-YARN-1011.06.patch, > YARN-4511-YARN-1011.07.patch, YARN-4511-YARN-1011.08.patch, > YARN-4511-YARN-1011.09.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional 
commands, e-mail: yarn-issues-h...@hadoop.apache.org
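The style question in the comment above (keeping the empty success branch for consistency versus calling the method purely for its side effect) can be sketched as follows. This is an illustrative stand-in, not the actual SchedulerNode code; the class and method names are hypothetical.

```java
// Illustrative sketch of the two call styles discussed above. Names are
// hypothetical; only the control-flow question mirrors the comment.
public final class AllocationCallStyle {
    private long opportunisticAllocated;

    // Returns false only when 'resource' is null, as described above.
    boolean containerAllocated(Long resource) {
        if (resource == null) {
            return false;
        }
        opportunisticAllocated += resource;
        return true;
    }

    // Style consistent with the guaranteed path: an empty "if" body.
    void allocateConsistentStyle(Long resource) {
        if (containerAllocated(resource)) {
            // nothing else to do
        }
    }

    // Alternative style: call for the side effect, ignore the result.
    void allocateTerseStyle(Long resource) {
        containerAllocated(resource);
    }

    long getOpportunisticAllocated() {
        return opportunisticAllocated;
    }
}
```

Both compile to the same behavior; the trade-off is purely readability versus consistency with the sibling method.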
[jira] [Commented] (YARN-7400) incorrect log preview displayed in jobhistory server ui
[ https://issues.apache.org/jira/browse/YARN-7400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225982#comment-16225982 ] Junping Du commented on YARN-7400: -- Hey [~subru], all of the new-log-format-related patches have landed on 2.9, so we plan to enable the new log format (and other log-related enhancements) in 2.9, and users can start to use it from 2.9 on. Given that this is really a small fix, it shouldn't actually block the 2.9 release. > incorrect log preview displayed in jobhistory server ui > --- > > Key: YARN-7400 > URL: https://issues.apache.org/jira/browse/YARN-7400 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.9.0, 3.0.0, 3.1.0 >Reporter: Santhosh B Gowda >Assignee: Xuan Gong >Priority: Blocker > Attachments: YARN-7400.1.patch > > > In the job history server ui, If we enable the new log format, the container > preview log is displayed incorrectly, for e.x launch_container.sh is > displaying stderr logs. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7400) incorrect log preview displayed in jobhistory server ui
[ https://issues.apache.org/jira/browse/YARN-7400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225981#comment-16225981 ] Xuan Gong commented on YARN-7400: - the refactoring work will be tracked by https://issues.apache.org/jira/browse/YARN-7417 > incorrect log preview displayed in jobhistory server ui > --- > > Key: YARN-7400 > URL: https://issues.apache.org/jira/browse/YARN-7400 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.9.0, 3.0.0, 3.1.0 >Reporter: Santhosh B Gowda >Assignee: Xuan Gong >Priority: Blocker > Attachments: YARN-7400.1.patch > > > In the job history server ui, If we enable the new log format, the container > preview log is displayed incorrectly, for e.x launch_container.sh is > displaying stderr logs. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-7417) Refactor IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to remove duplicate code
Xuan Gong created YARN-7417: --- Summary: Refactor IndexedFileAggregatedLogsBlock and TFileAggregatedLogsBlock to remove duplicate code Key: YARN-7417 URL: https://issues.apache.org/jira/browse/YARN-7417 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4511) Common scheduler changes supporting scheduler-specific implementations
[ https://issues.apache.org/jira/browse/YARN-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225973#comment-16225973 ] Haibo Chen commented on YARN-4511: -- Thanks [~leftnoteasy] for the review! bq. assert in the main code (Such as SchedulerNode) will be removed at runtime. Do you want to throw exception instead? The asserts are meant mainly for documentation purposes. I'll replace them with thrown exceptions instead. bq. Moving following statements to guaranteedContainerResourceReleased? Good point. I'll do that and rename it to guaranteedContainerReleased, given that its semantics have changed, and likewise for guaranteedContainerAllocated. > Common scheduler changes supporting scheduler-specific implementations > -- > > Key: YARN-4511 > URL: https://issues.apache.org/jira/browse/YARN-4511 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Haibo Chen > Attachments: YARN-4511-YARN-1011.00.patch, > YARN-4511-YARN-1011.01.patch, YARN-4511-YARN-1011.02.patch, > YARN-4511-YARN-1011.03.patch, YARN-4511-YARN-1011.04.patch, > YARN-4511-YARN-1011.05.patch, YARN-4511-YARN-1011.06.patch, > YARN-4511-YARN-1011.07.patch, YARN-4511-YARN-1011.08.patch, > YARN-4511-YARN-1011.09.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
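The change agreed on in the comment above can be illustrated with a minimal sketch: an assert is stripped unless the JVM runs with -ea, so an invariant in main (non-test) code is safer as an explicit exception. The class, method, and invariant here are hypothetical, not the real SchedulerNode check.

```java
// Sketch only: 'assert' statements are disabled by default at runtime,
// so an invariant in main code should be an unconditional throw instead.
public final class InvariantCheckSketch {
    private InvariantCheckSketch() { }

    // Before: assert allocated >= 0 : "negative allocation";
    // After: a check that also runs in production JVMs.
    static void checkAllocated(long allocated) {
        if (allocated < 0) {
            throw new IllegalStateException("negative allocation: " + allocated);
        }
    }
}
```

The exception message also preserves the documentation value the asserts were serving.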
[jira] [Commented] (YARN-7400) incorrect log preview displayed in jobhistory server ui
[ https://issues.apache.org/jira/browse/YARN-7400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225965#comment-16225965 ] Xuan Gong commented on YARN-7400: - Trivial patch for a quick fix. > incorrect log preview displayed in jobhistory server ui > --- > > Key: YARN-7400 > URL: https://issues.apache.org/jira/browse/YARN-7400 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.9.0, 3.0.0, 3.1.0 >Reporter: Santhosh B Gowda >Assignee: Xuan Gong >Priority: Blocker > Attachments: YARN-7400.1.patch > > > In the job history server ui, If we enable the new log format, the container > preview log is displayed incorrectly, for e.x launch_container.sh is > displaying stderr logs. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7400) incorrect log preview displayed in jobhistory server ui
[ https://issues.apache.org/jira/browse/YARN-7400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-7400: Attachment: YARN-7400.1.patch > incorrect log preview displayed in jobhistory server ui > --- > > Key: YARN-7400 > URL: https://issues.apache.org/jira/browse/YARN-7400 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.9.0, 3.0.0, 3.1.0 >Reporter: Santhosh B Gowda >Assignee: Xuan Gong >Priority: Blocker > Attachments: YARN-7400.1.patch > > > In the job history server ui, If we enable the new log format, the container > preview log is displayed incorrectly, for e.x launch_container.sh is > displaying stderr logs. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-7400) incorrect log preview displayed in jobhistory server ui
[ https://issues.apache.org/jira/browse/YARN-7400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225892#comment-16225892 ] Subru Krishnan edited comment on YARN-7400 at 10/30/17 10:57 PM: - [~xgong]/[~santhoshbg], is this a blocker for 2.9.0 as IIUC we don't enable the new log format yet? was (Author: subru): Is this a blocker for 2.9.0 as IIUC we don't enable the new log format yet? > incorrect log preview displayed in jobhistory server ui > --- > > Key: YARN-7400 > URL: https://issues.apache.org/jira/browse/YARN-7400 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.9.0, 3.0.0, 3.1.0 >Reporter: Santhosh B Gowda >Assignee: Xuan Gong >Priority: Blocker > > In the job history server ui, If we enable the new log format, the container > preview log is displayed incorrectly, for e.x launch_container.sh is > displaying stderr logs. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7342) Application page doesn't show correct metrics for reservation runs
[ https://issues.apache.org/jira/browse/YARN-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225896#comment-16225896 ] Subru Krishnan commented on YARN-7342: -- [~yufeigu], thanks for your response. I agree that it's not really a blocker for 2.9.0 but it will be good to have. > Application page doesn't show correct metrics for reservation runs > --- > > Key: YARN-7342 > URL: https://issues.apache.org/jira/browse/YARN-7342 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, reservation system >Affects Versions: 3.1.0 >Reporter: Yufei Gu > Attachments: Screen Shot 2017-10-16 at 17.27.48.png > > > As the screen shot shows, there are some bugs on the webUI while running job > with reservation. For examples, queue name should just be "root.queueA" > instead of internal queue name. All metrics(Allocated CPU, % of queue, etc) > are missing for reservation runs. These shouldn't be a blocker though. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7400) incorrect log preview displayed in jobhistory server ui
[ https://issues.apache.org/jira/browse/YARN-7400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225892#comment-16225892 ] Subru Krishnan commented on YARN-7400: -- Is this a blocker for 2.9.0 as IIUC we don't enable the new log format yet? > incorrect log preview displayed in jobhistory server ui > --- > > Key: YARN-7400 > URL: https://issues.apache.org/jira/browse/YARN-7400 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.9.0, 3.0.0, 3.1.0 >Reporter: Santhosh B Gowda >Assignee: Xuan Gong >Priority: Blocker > > In the job history server ui, If we enable the new log format, the container > preview log is displayed incorrectly, for e.x launch_container.sh is > displaying stderr logs. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7400) incorrect log preview displayed in jobhistory server ui
[ https://issues.apache.org/jira/browse/YARN-7400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-7400: - Target Version/s: 2.9.0, 3.0.0, 3.1.0 > incorrect log preview displayed in jobhistory server ui > --- > > Key: YARN-7400 > URL: https://issues.apache.org/jira/browse/YARN-7400 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.9.0, 3.0.0, 3.1.0 >Reporter: Santhosh B Gowda >Assignee: Xuan Gong >Priority: Blocker > > In the job history server ui, If we enable the new log format, the container > preview log is displayed incorrectly, for e.x launch_container.sh is > displaying stderr logs. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7330) Add support to show GPU on UI/metrics
[ https://issues.apache.org/jira/browse/YARN-7330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-7330: - Attachment: YARN-7330.2-wip.patch Attached ver.2 patch, added cluster resource usages for all resource types. > Add support to show GPU on UI/metrics > - > > Key: YARN-7330 > URL: https://issues.apache.org/jira/browse/YARN-7330 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Blocker > Attachments: YARN-7330.0-wip.patch, YARN-7330.1-wip.patch, > YARN-7330.2-wip.patch, screencapture-0-wip.png > > > We should be able to view GPU metrics from UI/REST API. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6413) Yarn Registry FS implementation
[ https://issues.apache.org/jira/browse/YARN-6413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ellen Hui updated YARN-6413: Attachment: YARN-6413.v5.patch Fix Checkstyle. The asflicense failures are caused by https://issues.apache.org/jira/browse/HADOOP-14990 > Yarn Registry FS implementation > --- > > Key: YARN-6413 > URL: https://issues.apache.org/jira/browse/YARN-6413 > Project: Hadoop YARN > Issue Type: Improvement > Components: amrmproxy, api, resourcemanager >Reporter: Ellen Hui >Assignee: Ellen Hui > Attachments: 0001-Registry-API-v2.patch, > 0001-YARN-6413-Yarn-Registry-FS-Implementation.patch, > 0002-Registry-API-v2.patch, 0003-Registry-API-api-only.patch, > 0004-Registry-API-api-stubbed.patch, YARN-6413.v1.patch, YARN-6413.v2.patch, > YARN-6413.v3.patch, YARN-6413.v4.patch, YARN-6413.v5.patch > > > Add a RegistryOperations implementation that writes records to the file > system. This does not include any changes to the API, to avoid compatibility > issues. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7371) NPE in ServiceMaster after RM is restarted and then the ServiceMaster is killed
[ https://issues.apache.org/jira/browse/YARN-7371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225841#comment-16225841 ] Subru Krishnan commented on YARN-7371: -- [~csingh]/[~billie.rinaldi], I am *not* in favor of replacing _allocationId_ with _priority_ as that's semantically incorrect. Moreover _allocationId_ was added exactly to serve the purpose. So I suggest to instead add _allocationId_ in recovery. Thanks. > NPE in ServiceMaster after RM is restarted and then the ServiceMaster is > killed > --- > > Key: YARN-7371 > URL: https://issues.apache.org/jira/browse/YARN-7371 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Chandni Singh >Assignee: Chandni Singh > Attachments: YARN-7371-yarn-native-services.001.patch, > YARN-7371-yarn-native-services.002.patch, > YARN-7371-yarn-native-services.003.patch, > YARN-7371-yarn-native-services.004.patch > > > java.lang.NullPointerException > at > org.apache.hadoop.yarn.service.ServiceScheduler.recoverComponents(ServiceScheduler.java:313) > at > org.apache.hadoop.yarn.service.ServiceScheduler.serviceStart(ServiceScheduler.java:265) > at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at > org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121) > at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at org.apache.hadoop.yarn.service.ServiceMaster.main(ServiceMaster.java:150) > Steps: > 1. Stopped RM and then started it > 2. Application was still running > 3. Killed the ServiceMaster to check if it recovers > 4. Next attempt failed with the above exception -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7371) NPE in ServiceMaster after RM is restarted and then the ServiceMaster is killed
[ https://issues.apache.org/jira/browse/YARN-7371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandni Singh updated YARN-7371: Attachment: YARN-7371-yarn-native-services.004.patch > NPE in ServiceMaster after RM is restarted and then the ServiceMaster is > killed > --- > > Key: YARN-7371 > URL: https://issues.apache.org/jira/browse/YARN-7371 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Chandni Singh >Assignee: Chandni Singh > Attachments: YARN-7371-yarn-native-services.001.patch, > YARN-7371-yarn-native-services.002.patch, > YARN-7371-yarn-native-services.003.patch, > YARN-7371-yarn-native-services.004.patch > > > java.lang.NullPointerException > at > org.apache.hadoop.yarn.service.ServiceScheduler.recoverComponents(ServiceScheduler.java:313) > at > org.apache.hadoop.yarn.service.ServiceScheduler.serviceStart(ServiceScheduler.java:265) > at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at > org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121) > at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at org.apache.hadoop.yarn.service.ServiceMaster.main(ServiceMaster.java:150) > Steps: > 1. Stopped RM and then started it > 2. Application was still running > 3. Killed the ServiceMaster to check if it recovers > 4. Next attempt failed with the above exception -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7197) Add support for a volume blacklist for docker containers
[ https://issues.apache.org/jira/browse/YARN-7197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225812#comment-16225812 ] Jason Lowe commented on YARN-7197: -- Solution 3 is more secure since the paths are unavailable within the container. Even if there is a trojan in the container that escalates privilege, the hacker needs to break outside of the container to access the path rather than accessing it within the container directly. bq. The third solution can throw people off, if they do not know about black-list is hijacked to empty location. I'm not sure how admins are going to be confused or upset that paths in the blacklist were not what was expected. Either it shouldn't be accessed and thus shouldn't matter what's there or it needs to be there and shouldn't be in the blacklist. Or am I missing a scenario where there's an in-between? IMHO it's more confusing to users if the blacklist doesn't actually prevent access to the blacklisted paths. The point of the blacklist is that containers should not be able to access those paths on the host. bq. Mounting from parent of black list directory will depends on filesystem acl to enforce the permission. If we are simply relying on filesystem ACLs to cover us if the user mounts above the blacklist then I would argue there's no point to the blacklist. Either we trust the filesystem ACLs or we don't. If we don't then letting the admin trivially configure a setup where the user can mount above the blacklist path is not helpful or intuitive. If we are serious about not trusting the filesystem ACLs from within the container (i.e.: actually need a blacklist) then we need to do whatever we can to prevent access even if the admins (un)intentionally configure paths above the blacklisted path. That means we need to do something like Solution 3 above or fail to create the container when the user mounts above. 
Choosing to fail the container means we're essentially at Solution 1 where the admin has no ability to cherry-pick out paths that should not be allowed and must maintain explicit paths to things that are allowed. Then the blacklist would have no utility short of documentation of what normally should not be placed in the whitelist. > Add support for a volume blacklist for docker containers > > > Key: YARN-7197 > URL: https://issues.apache.org/jira/browse/YARN-7197 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Shane Kumpf >Assignee: Eric Yang > Attachments: YARN-7197.001.patch, YARN-7197.002.patch > > > Docker supports bind mounting host directories into containers. Work is > underway to allow admins to configure a whilelist of volume mounts. While > this is a much needed and useful feature, it opens the door for > misconfiguration that may lead to users being able to compromise or crash the > system. > One example would be allowing users to mount /run from a host running > systemd, and then running systemd in that container, rendering the host > mostly unusable. > This issue is to add support for a default blacklist. The default blacklist > would be where we put files and directories that if mounted into a container, > are likely to have negative consequences. Users are encouraged not to remove > items from the default blacklist, but may do so if necessary. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
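The core check behind "Solution 3" as argued above (refusing a requested bind mount that equals a blacklisted path, lives under one, or is a parent of one, so that mounting above the blacklist cannot expose it) might look roughly like this. It is a hedged sketch, not the actual container-executor logic, and the method name is hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hedged sketch of a strict blacklist check: a mount request is rejected
// if it is a blacklisted path, lives inside one, or is an ancestor of one
// (mounting /run's parent, or /, would otherwise expose /run).
public final class MountBlacklistSketch {
    static boolean isAllowed(String requested, List<String> blacklist) {
        Path req = Paths.get(requested).normalize();
        for (String entry : blacklist) {
            Path banned = Paths.get(entry).normalize();
            // req.startsWith(banned): request is the banned path or inside it.
            // banned.startsWith(req): request is a parent of the banned path.
            if (req.startsWith(banned) || banned.startsWith(req)) {
                return false;
            }
        }
        return true;
    }
}
```

Rejecting the parent-path case is what distinguishes this from a naive prefix check that trusts filesystem ACLs once the mount succeeds.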
[jira] [Commented] (YARN-7342) Application page doesn't show correct metrics for reservation runs
[ https://issues.apache.org/jira/browse/YARN-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225809#comment-16225809 ] Yufei Gu commented on YARN-7342: I'm not familiar with the WebUI part; I will have a look first. Do you think it is necessary for 2.9? You don't think it is a blocker for 2.9, right? > Application page doesn't show correct metrics for reservation runs > --- > > Key: YARN-7342 > URL: https://issues.apache.org/jira/browse/YARN-7342 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, reservation system >Affects Versions: 3.1.0 >Reporter: Yufei Gu > Attachments: Screen Shot 2017-10-16 at 17.27.48.png > > > As the screen shot shows, there are some bugs on the webUI while running job > with reservation. For examples, queue name should just be "root.queueA" > instead of internal queue name. All metrics(Allocated CPU, % of queue, etc) > are missing for reservation runs. These shouldn't be a blocker though. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7395) NM fails to successfully kill tasks that run over their memory limit
[ https://issues.apache.org/jira/browse/YARN-7395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225790#comment-16225790 ] Eric Badger commented on YARN-7395: --- I have confirmed that I do *not* see this on trunk. On trunk the docker containers are correctly stopped when they run over their memory limit and there is no extra {{'}} or {{%27}} in the stop command. This may have been fixed as a result of YARN-6623. I will test trunk just before YARN-6623 to make sure > NM fails to successfully kill tasks that run over their memory limit > > > Key: YARN-7395 > URL: https://issues.apache.org/jira/browse/YARN-7395 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Eric Badger > > The NM correctly notes that the container is over its configured limit, but > then fails to successfully kill the process. So the Docker container AM stays > around and the job keeps running -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7146) Many RM unit tests failing with FairScheduler
[ https://issues.apache.org/jira/browse/YARN-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225783#comment-16225783 ] Robert Kanter commented on YARN-7146: - They don't appear to be related. TestOpportunisticContainerAllocationE2E fails on branch-2 without the patch, and many of the others are due to the RM failing to startup, and the ones I tried locally were fine. I kicked off another run. > Many RM unit tests failing with FairScheduler > - > > Key: YARN-7146 > URL: https://issues.apache.org/jira/browse/YARN-7146 > Project: Hadoop YARN > Issue Type: Bug > Components: test >Affects Versions: 3.0.0-beta1 >Reporter: Robert Kanter >Assignee: Robert Kanter > Fix For: 3.0.0-beta1, 3.1.0 > > Attachments: YARN-7146.001.patch, YARN-7146.002.patch, > YARN-7146.003.patch, YARN-7146.004.branch-2.patch, YARN-7146.004.patch > > > Many of the RM unit tests are failing when using the FairScheduler. > Here is a list of affected test classes: > {noformat} > TestYarnClient > TestApplicationCleanup > TestApplicationMasterLauncher > TestDecommissioningNodesWatcher > TestKillApplicationWithRMHA > TestNodeBlacklistingOnAMFailures > TestRM > TestRMAdminService > TestRMRestart > TestResourceTrackerService > TestWorkPreservingRMRestart > TestAMRMRPCNodeUpdates > TestAMRMRPCResponseId > TestAMRestart > TestApplicationLifetimeMonitor > TestNodesListManager > TestRMContainerImpl > TestAbstractYarnScheduler > TestSchedulerUtils > TestFairOrderingPolicy > TestAMRMTokens > TestDelegationTokenRenewer > {noformat} > Most of the test methods in these classes are failing, though some do succeed. > There's two main categories of issues: > # The test submits an application to the {{MockRM}} and waits for it to enter > a specific state, which it never does, and the test times out. We need to > call {{update()}} on the scheduler. > # The test throws a {{ClassCastException}} on {{FSQueueMetrics}} to > {{CSQueueMetrics}}. 
This is because {{QueueMetrics}} metrics are static, and > a previous test using FairScheduler initialized it, and the current test is > using CapacityScheduler. We need to reset the metrics. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
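The second failure category above (static {{QueueMetrics}} state leaking a FairScheduler-typed instance into a CapacityScheduler test) can be illustrated with a minimal, self-contained stand-in; this is not the Hadoop code, and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal illustration (not Hadoop code) of the static-state problem
// described above: a statically cached per-queue object created by one
// test survives into the next, whose cast to a different concrete type
// then fails. The fix is an explicit reset hook called between tests.
public final class StaticMetricsCacheSketch {
    private static final Map<String, Object> QUEUE_METRICS = new HashMap<>();

    // Returns the cached instance for the queue, creating it on first use,
    // even if the cached instance's type is not what the caller expects.
    static Object forQueue(String queue, Object freshInstance) {
        return QUEUE_METRICS.computeIfAbsent(queue, k -> freshInstance);
    }

    // What a per-test setup hook (e.g. JUnit @Before) should call.
    static void clearQueueMetrics() {
        QUEUE_METRICS.clear();
    }
}
```

Without the clear call, the second "test" silently receives the first test's instance; with it, each test starts from a clean cache.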
[jira] [Commented] (YARN-6940) FairScheduler: Enable Container update CodePaths and container resize testcase
[ https://issues.apache.org/jira/browse/YARN-6940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225779#comment-16225779 ] Daniel Templeton commented on YARN-6940: I'll take a look when I get a chance. > FairScheduler: Enable Container update CodePaths and container resize testcase > -- > > Key: YARN-6940 > URL: https://issues.apache.org/jira/browse/YARN-6940 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-6940.001.patch > > > After YARN-6216, the Container Update (which includes Resource increase and > decrease) code-paths are mostly scheduler agnostic. > This JIRA tracks the final minor change needed in the FairScheduler. It also > re-enables the {{TestAMRMClient#testAMRMClientWithContainerResourceChange}} > test for the FairScheduler - which verifies that it works for the > FairScheduler. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7316) Cleaning up the usage of Resources and ResourceCalculator
[ https://issues.apache.org/jira/browse/YARN-7316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225778#comment-16225778 ] Daniel Templeton commented on YARN-7316: I've actually been actively doing the opposite. It seems strange to me that we should pass the calc object to {{Resources}} so that it can call the method for us. It's harder to read, makes for longer lines that may need to be wrapped, and creates unnecessary coupling. It's not like using {{Resources}} as a proxy saves you from needing a calculator instance or does anything useful for you at all, really. What's the motivation for running all the calls through the {{Resources}} object? It makes sense for some of the comparisons, where it's actually doing something useful, but I don't get it for the rest of the calls. > Cleaning up the usage of Resources and ResourceCalculator > - > > Key: YARN-7316 > URL: https://issues.apache.org/jira/browse/YARN-7316 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.0 >Reporter: lovekesh bansal >Assignee: lovekesh bansal >Priority: Minor > Fix For: 3.1.0 > > Attachments: YARN-7316_trunk.001.patch > > > Cleaning up and Uniformizing the usage the usage of Resources and > ResourceCalculator. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
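The coupling questioned in the comment above can be seen side by side in a simplified form. These classes are stand-ins sketched for illustration; they are not the real {{Resources}} or {{ResourceCalculator}} API.

```java
// Simplified stand-ins to contrast the two call styles discussed above.
final class Calc {
    // Direct calculation; zero divisor handled to keep the sketch total.
    long divide(long a, long b) {
        return b == 0 ? 0 : a / b;
    }
}

final class ResourcesProxy {
    // Proxy style: the static helper only forwards to the calculator,
    // adding indirection without saving the caller a calculator instance.
    static long divide(Calc calc, long a, long b) {
        return calc.divide(a, b);
    }
}

public final class CallStyleDemo {
    public static void main(String[] args) {
        Calc calc = new Calc();
        long direct = calc.divide(10, 2);                    // direct call
        long viaProxy = ResourcesProxy.divide(calc, 10, 2);  // proxy call
        System.out.println(direct == viaProxy);
    }
}
```

Both styles need the calculator in hand, which is the point being made: the proxy earns its keep only where it adds logic (e.g. comparisons), not where it merely forwards.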
[jira] [Commented] (YARN-7244) ShuffleHandler is not aware of disks that are added
[ https://issues.apache.org/jira/browse/YARN-7244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225721#comment-16225721 ] Jason Lowe commented on YARN-7244: -- The ASF warnings are unrelated. +1 for the branch-2.8 patch. Committing this. > ShuffleHandler is not aware of disks that are added > --- > > Key: YARN-7244 > URL: https://issues.apache.org/jira/browse/YARN-7244 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Fix For: 2.9.0, 3.0.0 > > Attachments: YARN-7244-branch-2.8.001.patch, > YARN-7244-branch-2.8.002.patch, YARN-7244.001.patch, YARN-7244.002.patch, > YARN-7244.003.patch, YARN-7244.004.patch, YARN-7244.005.patch, > YARN-7244.006.patch, YARN-7244.007.patch, YARN-7244.008.patch, > YARN-7244.009.patch, YARN-7244.010.patch, YARN-7244.011.patch, > YARN-7244.012.patch, YARN-7244.013.patch > > > The ShuffleHandler permanently remembers the list of "good" disks on NM > startup. If disks later are added to the node then map tasks will start using > them but the ShuffleHandler will not be aware of them. The end result is that > the data cannot be shuffled from the node leading to fetch failures and > re-runs of the map tasks. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7371) NPE in ServiceMaster after RM is restarted and then the ServiceMaster is killed
[ https://issues.apache.org/jira/browse/YARN-7371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225713#comment-16225713 ] Billie Rinaldi commented on YARN-7371: -- Thanks for the patch, [~csingh]! The patch looks good overall, I just have a couple of small comments: * since allocateId is changed to an int in ServiceScheduler, I think we should change the Component constructor to take allocateId as an int, and we no longer need to cast allocateId to an int in Priority.newInstance((int) allocateId) in the constructor {code} Component( org.apache.hadoop.yarn.service.api.records.Component component, long allocateId, ServiceContext context) {code} * some of the checkstyle issues look easy to fix, specifically the cases where lines are too long and where a curly brace is on the wrong line > NPE in ServiceMaster after RM is restarted and then the ServiceMaster is > killed > --- > > Key: YARN-7371 > URL: https://issues.apache.org/jira/browse/YARN-7371 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Chandni Singh >Assignee: Chandni Singh > Attachments: YARN-7371-yarn-native-services.001.patch, > YARN-7371-yarn-native-services.002.patch, > YARN-7371-yarn-native-services.003.patch > > > java.lang.NullPointerException > at > org.apache.hadoop.yarn.service.ServiceScheduler.recoverComponents(ServiceScheduler.java:313) > at > org.apache.hadoop.yarn.service.ServiceScheduler.serviceStart(ServiceScheduler.java:265) > at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at > org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121) > at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at org.apache.hadoop.yarn.service.ServiceMaster.main(ServiceMaster.java:150) > Steps: > 1. Stopped RM and then started it > 2. Application was still running > 3. Killed the ServiceMaster to check if it recovers > 4. 
Next attempt failed with the above exception -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-7400) incorrect log preview displayed in jobhistory server ui
[ https://issues.apache.org/jira/browse/YARN-7400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong reassigned YARN-7400: --- Assignee: Xuan Gong > incorrect log preview displayed in jobhistory server ui > --- > > Key: YARN-7400 > URL: https://issues.apache.org/jira/browse/YARN-7400 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.9.0, 3.0.0, 3.1.0 >Reporter: Santhosh B Gowda >Assignee: Xuan Gong >Priority: Blocker > > In the job history server ui, the container preview log is displayed > incorrectly, for e.x launch_container.sh is displaying stderr logs. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7400) incorrect log preview displayed in jobhistory server ui
[ https://issues.apache.org/jira/browse/YARN-7400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-7400: Affects Version/s: (was: 2.7.3) 3.1.0 3.0.0 2.9.0 > incorrect log preview displayed in jobhistory server ui > --- > > Key: YARN-7400 > URL: https://issues.apache.org/jira/browse/YARN-7400 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.9.0, 3.0.0, 3.1.0 >Reporter: Santhosh B Gowda >Assignee: Xuan Gong >Priority: Blocker > > In the job history server UI, the container log preview is displayed > incorrectly; e.g. launch_container.sh displays stderr logs.
[jira] [Updated] (YARN-7400) incorrect log preview displayed in jobhistory server ui
[ https://issues.apache.org/jira/browse/YARN-7400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-7400: Description: In the job history server UI, if we enable the new log format, the container log preview is displayed incorrectly; e.g. launch_container.sh displays stderr logs. was: In the job history server UI, the container log preview is displayed incorrectly; e.g. launch_container.sh displays stderr logs. > incorrect log preview displayed in jobhistory server ui > --- > > Key: YARN-7400 > URL: https://issues.apache.org/jira/browse/YARN-7400 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.9.0, 3.0.0, 3.1.0 >Reporter: Santhosh B Gowda >Assignee: Xuan Gong >Priority: Blocker > > In the job history server UI, if we enable the new log format, the container > log preview is displayed incorrectly; e.g. launch_container.sh > displays stderr logs.
[jira] [Commented] (YARN-7378) Documentation changes post branch-2 merge
[ https://issues.apache.org/jira/browse/YARN-7378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225695#comment-16225695 ] Vrushali C commented on YARN-7378: -- Thanks [~rohithsharma] and [~varun_saxena] for the reviews and +1. I will commit this now. > Documentation changes post branch-2 merge > - > > Key: YARN-7378 > URL: https://issues.apache.org/jira/browse/YARN-7378 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineclient, timelinereader, timelineserver >Reporter: Varun Saxena >Assignee: Vrushali C > Attachments: YARN-7378-branch-2.0001.patch, > YARN-7378-branch-2.0002.patch, YARN-7378-branch-2.0003.patch, > YARN-7378-branch-2.0004.patch, YARN-7378-branch-2.0005.patch, schema creation > documentation.png > > > Need to update the documentation for the schema creator command. It should > include the timeline-service-hbase jar as well as hbase-server jar in > classpath when the command is to be run. Due to YARN-7190 classpath changes. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6909) The performance advantages of YARN-6679 are lost when resource types are used
[ https://issues.apache.org/jira/browse/YARN-6909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225659#comment-16225659 ] Daniel Templeton commented on YARN-6909: This patch looks like it's largely a refactor. It doesn't address the issue that when there are more than 2 resources, a PB is used even for internal operations. > The performance advantages of YARN-6679 are lost when resource types are used > - > > Key: YARN-6909 > URL: https://issues.apache.org/jira/browse/YARN-6909 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: YARN-3926 >Reporter: Daniel Templeton >Assignee: Sunil G >Priority: Critical > Attachments: YARN-6909.001.patch > > > YARN-6679 added the {{SimpleResource}} as a lightweight replacement for > {{ResourcePBImpl}} when a protobuf isn't needed. With resource types enabled > and anything other than memory and CPU defined, {{ResourcePBImpl}} will > always be used. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6930) Admins should be able to explicitly enable specific LinuxContainerRuntime in the NodeManager
[ https://issues.apache.org/jira/browse/YARN-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225652#comment-16225652 ] Junping Du commented on YARN-6930: -- bq. It has to be explicitly turned on in 2.9. In 2.8 it was turned on by default. If so, I agree that we should mark this fix in 2.9 as incompatible in case users upgrade from 2.8 to 2.9 with assumption that docker runtime is on by default. > Admins should be able to explicitly enable specific LinuxContainerRuntime in > the NodeManager > > > Key: YARN-6930 > URL: https://issues.apache.org/jira/browse/YARN-6930 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Reporter: Vinod Kumar Vavilapalli >Assignee: Shane Kumpf > Fix For: 2.9.0, 3.0.0-beta1, 2.8.2 > > Attachments: YARN-6930.001.patch, YARN-6930.002.patch, > YARN-6930.003.patch, YARN-6930.004.patch, YARN-6930.005.patch, > YARN-6930.006.patch, YARN-6930.branch-2.001.patch, > YARN-6930.branch-2.002.patch, YARN-6930.branch-2.8.001.patch, > YARN-6930.branch-2.8.002.patch, YARN-6930.branch-2.8.2.001.patch > > > Today, in the java land, all LinuxContainerRuntimes are always enabled when > using LinuxContainerExecutor and the user can simply invoke anything that > he/she wants - default, docker, java-sandbox. > We should have a way for admins to explicitly enable only specific runtimes > that he/she decides for the cluster. And by default, we should have > everything other than the default one disabled. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7378) Documentation changes post branch-2 merge
[ https://issues.apache.org/jira/browse/YARN-7378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225648#comment-16225648 ] Varun Saxena commented on YARN-7378: +1 LGTM > Documentation changes post branch-2 merge > - > > Key: YARN-7378 > URL: https://issues.apache.org/jira/browse/YARN-7378 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineclient, timelinereader, timelineserver >Reporter: Varun Saxena >Assignee: Vrushali C > Attachments: YARN-7378-branch-2.0001.patch, > YARN-7378-branch-2.0002.patch, YARN-7378-branch-2.0003.patch, > YARN-7378-branch-2.0004.patch, YARN-7378-branch-2.0005.patch, schema creation > documentation.png > > > Need to update the documentation for the schema creator command. It should > include the timeline-service-hbase jar as well as hbase-server jar in > classpath when the command is to be run. Due to YARN-7190 classpath changes. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6128) Add support for AMRMProxy HA
[ https://issues.apache.org/jira/browse/YARN-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Botong Huang updated YARN-6128: --- Attachment: YARN-6128.v0.patch > Add support for AMRMProxy HA > > > Key: YARN-6128 > URL: https://issues.apache.org/jira/browse/YARN-6128 > Project: Hadoop YARN > Issue Type: Sub-task > Components: amrmproxy, nodemanager >Reporter: Subru Krishnan >Assignee: Botong Huang > Attachments: YARN-6128.v0.patch > > > YARN-556 added the ability for RM failover without losing any running > applications. In a Federated YARN environment, there's additional state in > the {{AMRMProxy}} to allow for spanning across multiple sub-clusters, so we > need to enhance {{AMRMProxy}} to support HA.
[jira] [Commented] (YARN-5734) OrgQueue for easy CapacityScheduler queue configuration management
[ https://issues.apache.org/jira/browse/YARN-5734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225631#comment-16225631 ] Subru Krishnan commented on YARN-5734: -- [~jhung] (cc: [~mshen], [~xgong], [~leftnoteasy], [~zhz]), can you update the fix versions and release note in anticipation of 2.9.0 release. Thanks. > OrgQueue for easy CapacityScheduler queue configuration management > -- > > Key: YARN-5734 > URL: https://issues.apache.org/jira/browse/YARN-5734 > Project: Hadoop YARN > Issue Type: New Feature >Reporter: Min Shen >Assignee: Min Shen > Attachments: > OrgQueueAPI-BasedSchedulerConfigurationManagement_v2.pdf, > OrgQueue_API-Based_Config_Management_v1.pdf, OrgQueue_Design_v0.pdf, > YARN-5734-YARN-5734.001.patch > > > The current xml based configuration mechanism in CapacityScheduler makes it > very inconvenient to apply any changes to the queue configurations. We saw 2 > main drawbacks in the file based configuration mechanism: > # This makes it very inconvenient to automate queue configuration updates. > For example, in our cluster setup, we leverage the queue mapping feature from > YARN-2411 to route users to their dedicated organization queues. It could be > extremely cumbersome to keep updating the config file to manage the very > dynamic mapping between users to organizations. > # Even a user has the admin permission on one specific queue, that user is > unable to make any queue configuration changes to resize the subqueues, > changing queue ACLs, or creating new queues. All these operations need to be > performed in a centralized manner by the cluster administrators. > With these current limitations, we realized the need of a more flexible > configuration mechanism that allows queue configurations to be stored and > managed more dynamically. We developed the feature internally at LinkedIn > which introduces the concept of MutableConfigurationProvider. 
What it > essentially does is provide a set of configuration mutation APIs that > allow queue configurations to be updated externally through REST APIs. > When performing the queue configuration changes, the queue ACLs will be > honored, which means only queue administrators can make configuration changes > to a given queue. MutableConfigurationProvider is implemented as a pluggable > interface, and we have one implementation of this interface based on an > embedded Derby database. > This feature has been deployed at LinkedIn's Hadoop cluster for a year now, > and has gone through several iterations of gathering feedback from users > and improving accordingly. With this feature, cluster administrators are able > to automate many of the queue configuration management tasks, such as setting > the queue capacities to adjust cluster resources between queues based on > established resource consumption patterns, or updating the user-to-queue > mappings. We have attached our design documentation with this ticket > and would like to receive feedback from the community regarding how to best > integrate it with the latest version of YARN.
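The pluggable provider described above can be sketched roughly as follows. The interface and class names here are illustrative assumptions based on the description in this ticket, not the actual YARN-5734 API; a real implementation would persist to a store (e.g. the embedded Derby database mentioned above) rather than an in-memory map.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a pluggable configuration-mutation provider:
// queue configurations are updated through an API call, with queue ACLs
// enforced before the change is applied.
interface MutableConfigurationProvider {
    void mutateConfiguration(String user, Map<String, String> updates);
}

final class InMemoryConfigProvider implements MutableConfigurationProvider {
    private final Map<String, String> store = new HashMap<>();
    private final String queueAdmin;

    InMemoryConfigProvider(String queueAdmin) {
        this.queueAdmin = queueAdmin;
    }

    @Override
    public void mutateConfiguration(String user, Map<String, String> updates) {
        // Honor queue ACLs: only the queue administrator may mutate.
        if (!queueAdmin.equals(user)) {
            throw new SecurityException(user + " is not a queue admin");
        }
        store.putAll(updates);
    }

    String get(String key) {
        return store.get(key);
    }
}
```

In the design described above, such a provider would sit behind the RM's REST endpoints, so ACL checks happen server-side regardless of how the update request arrives.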
[jira] [Updated] (YARN-2915) Enable YARN RM scale out via federation using multiple RM's
[ https://issues.apache.org/jira/browse/YARN-2915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subru Krishnan updated YARN-2915: - Release Note: A federation-based approach to transparently scale a single YARN cluster to tens of thousands of nodes, by federating multiple YARN standalone clusters (sub-clusters). The applications running in this federated environment will see a single massive YARN cluster and will be able to schedule tasks on any node of the federated cluster. Under the hood, the federation system will negotiate with sub-clusters ResourceManagers and provide resources to the application. The goal is to allow an individual job to “span” sub-clusters seamlessly. (was: A federation-based approach to transparently scale a single YARN cluster to tens of thousands of nodes, by federating multiple YARN standalone clusters.The proposed approach is to divide a large (10-100k nodes) cluster into smaller units called sub-clusters, each with its own YARN RM and compute nodes. The federation system will stitch these sub-clusters together and make them appear as one large YARN cluster to the applications. The applications running in this federated environment will see a single massive YARN cluster and will be able to schedule tasks on any node of the federated cluster. Under the hood, the federation system will negotiate with sub-clusters resource managers and provide resources to the application. The goal is to allow an individual job to “span” sub-clusters seamlessly.) 
> Enable YARN RM scale out via federation using multiple RM's > --- > > Key: YARN-2915 > URL: https://issues.apache.org/jira/browse/YARN-2915 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager, resourcemanager >Reporter: Sriram Rao >Assignee: Subru Krishnan > Labels: federation > Fix For: 2.9.0, 3.0.0-beta1 > > Attachments: FEDERATION_CAPACITY_ALLOCATION_JIRA.pdf, > Federation-BoF.pdf, YARN-Federation-Hadoop-Summit_final.pptx, > Yarn_federation_design_v1.pdf, federation-prototype.patch > > > This is an umbrella JIRA that proposes to scale out YARN to support large > clusters comprising of tens of thousands of nodes. That is, rather than > limiting a YARN managed cluster to about 4k in size, the proposal is to > enable the YARN managed cluster to be elastically scalable. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-2915) Enable YARN RM scale out via federation using multiple RM's
[ https://issues.apache.org/jira/browse/YARN-2915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subru Krishnan updated YARN-2915: - Release Note: A federation-based approach to transparently scale a single YARN cluster to tens of thousands of nodes, by federating multiple YARN standalone clusters.The proposed approach is to divide a large (10-100k nodes) cluster into smaller units called sub-clusters, each with its own YARN RM and compute nodes. The federation system will stitch these sub-clusters together and make them appear as one large YARN cluster to the applications. The applications running in this federated environment will see a single massive YARN cluster and will be able to schedule tasks on any node of the federated cluster. Under the hood, the federation system will negotiate with sub-clusters resource managers and provide resources to the application. The goal is to allow an individual job to “span” sub-clusters seamlessly. (was: A federation-based approach to transparently scale a single YARN cluster to tens of thousands of nodes, by federating multiple YARN standalone clusters) > Enable YARN RM scale out via federation using multiple RM's > --- > > Key: YARN-2915 > URL: https://issues.apache.org/jira/browse/YARN-2915 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager, resourcemanager >Reporter: Sriram Rao >Assignee: Subru Krishnan > Labels: federation > Fix For: 2.9.0, 3.0.0-beta1 > > Attachments: FEDERATION_CAPACITY_ALLOCATION_JIRA.pdf, > Federation-BoF.pdf, YARN-Federation-Hadoop-Summit_final.pptx, > Yarn_federation_design_v1.pdf, federation-prototype.patch > > > This is an umbrella JIRA that proposes to scale out YARN to support large > clusters comprising of tens of thousands of nodes. That is, rather than > limiting a YARN managed cluster to about 4k in size, the proposal is to > enable the YARN managed cluster to be elastically scalable. 
[jira] [Updated] (YARN-2915) Enable YARN RM scale out via federation using multiple RM's
[ https://issues.apache.org/jira/browse/YARN-2915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subru Krishnan updated YARN-2915: - Release Note: A federation-based approach to transparently scale a single YARN cluster to tens of thousands of nodes, by federating multiple YARN sub-clusters > Enable YARN RM scale out via federation using multiple RM's > --- > > Key: YARN-2915 > URL: https://issues.apache.org/jira/browse/YARN-2915 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager, resourcemanager >Reporter: Sriram Rao >Assignee: Subru Krishnan > Labels: federation > Fix For: 2.9.0, 3.0.0-beta1 > > Attachments: FEDERATION_CAPACITY_ALLOCATION_JIRA.pdf, > Federation-BoF.pdf, YARN-Federation-Hadoop-Summit_final.pptx, > Yarn_federation_design_v1.pdf, federation-prototype.patch > > > This is an umbrella JIRA that proposes to scale out YARN to support large > clusters comprising of tens of thousands of nodes. That is, rather than > limiting a YARN managed cluster to about 4k in size, the proposal is to > enable the YARN managed cluster to be elastically scalable. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-2915) Enable YARN RM scale out via federation using multiple RM's
[ https://issues.apache.org/jira/browse/YARN-2915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subru Krishnan updated YARN-2915: - Release Note: A federation-based approach to transparently scale a single YARN cluster to tens of thousands of nodes, by federating multiple YARN standalone clusters (was: A federation-based approach to transparently scale a single YARN cluster to tens of thousands of nodes, by federating multiple YARN sub-clusters) > Enable YARN RM scale out via federation using multiple RM's > --- > > Key: YARN-2915 > URL: https://issues.apache.org/jira/browse/YARN-2915 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager, resourcemanager >Reporter: Sriram Rao >Assignee: Subru Krishnan > Labels: federation > Fix For: 2.9.0, 3.0.0-beta1 > > Attachments: FEDERATION_CAPACITY_ALLOCATION_JIRA.pdf, > Federation-BoF.pdf, YARN-Federation-Hadoop-Summit_final.pptx, > Yarn_federation_design_v1.pdf, federation-prototype.patch > > > This is an umbrella JIRA that proposes to scale out YARN to support large > clusters comprising of tens of thousands of nodes. That is, rather than > limiting a YARN managed cluster to about 4k in size, the proposal is to > enable the YARN managed cluster to be elastically scalable. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5326) Support for recurring reservations in the YARN ReservationSystem
[ https://issues.apache.org/jira/browse/YARN-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subru Krishnan updated YARN-5326: - Release Note: Add native support for recurring reservations (good till cancelled) to enable periodic allocations of the same resources. > Support for recurring reservations in the YARN ReservationSystem > > > Key: YARN-5326 > URL: https://issues.apache.org/jira/browse/YARN-5326 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Subru Krishnan >Assignee: Carlo Curino > Fix For: 2.9.0, 3.0.0, 3.1.0 > > Attachments: SupportRecurringReservationsInRayon.pdf > > > YARN-1051 introduced a ReservationSystem that enables the YARN RM to handle > time explicitly, i.e. users can now "reserve" capacity ahead of time which is > predictably allocated to them. Most SLA jobs/workflows are recurring so they > need the same resources periodically. With the current implementation, users > will have to make individual reservations for each run. This is an umbrella > JIRA to enhance the reservation system by adding native support for recurring > reservations.
[jira] [Commented] (YARN-7286) Add support for docker to have no capabilities
[ https://issues.apache.org/jira/browse/YARN-7286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225595#comment-16225595 ] Hadoop QA commented on YARN-7286: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 35s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 35m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 6s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 22m 32s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 43s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 28s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 1m 14s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 4m 23s{color} | {color:red} hadoop-yarn in the patch failed. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 4m 23s{color} | {color:red} hadoop-yarn in the patch failed. {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 44s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 1m 23s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 4s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 6m 37s{color} | {color:red} patch has errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 37s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 17s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 5m 28s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 1m 50s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 44s{color} | {color:red} The patch generated 3 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}117m 46s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7286 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12892691/YARN-7286.007.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle | | uname | Linux 909ef667415e 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / e4878a5 | | maven | version: Apache Maven 3.3.9 |
[jira] [Commented] (YARN-7394) Merge code paths for Reservation/Plan queues and Auto Created queues
[ https://issues.apache.org/jira/browse/YARN-7394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225574#comment-16225574 ] Hadoop QA commented on YARN-7394: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 1s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 6s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager in trunk has 2 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 26s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 30 new + 163 unchanged - 16 fixed = 193 total (was 179) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 6s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 54s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. 
{color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 25s{color} | {color:red} The patch generated 3 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}115m 28s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Timed out junit tests | org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA | | | org.apache.hadoop.yarn.server.resourcemanager.TestReservationSystemWithRMHA | | | org.apache.hadoop.yarn.server.resourcemanager.TestKillApplicationWithRMHA | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7394 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12894774/YARN-7394.5.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 83c459562eeb 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 18:04:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / e4878a5 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_131 | | findbugs
[jira] [Commented] (YARN-7410) Cleanup FixedValueResource to avoid dependency to ResourceUtils
[ https://issues.apache.org/jira/browse/YARN-7410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225557#comment-16225557 ] Wangda Tan commented on YARN-7410: -- Thanks [~sunilg] for trying this patch. [~templedf], I just updated the description. The major reason to reinitialize the resource map is that FixedValueResource cannot be updated when resource types are refreshed via the approach introduced by YARN-7307, so operations like {{Resources.compare(resource_x, Resources.none())}} will throw exceptions. Please let me know if you have any suggestions/concerns about this approach. I will update the patch to address #1/#2. Also, I discussed the approach with [~sunilg] last week. This is a short-term solution. Even after this patch, we still have issues with FixedValueResource: 1) The returned array is still mutable, which means callers can modify it and cause hard-to-debug issues. 2) We still calculate fair share, etc., while comparing the FixedValueResources to other resource types. We need to consider solutions to these issues in the longer term. > Cleanup FixedValueResource to avoid dependency to ResourceUtils > --- > > Key: YARN-7410 > URL: https://issues.apache.org/jira/browse/YARN-7410 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 3.1.0 >Reporter: Sunil G >Assignee: Wangda Tan > Attachments: YARN-7410.001.patch, YARN-7410.002.patch > > > After YARN-7307, Client/AM don't need to keep an up-to-date resource-type.xml > in the classpath. Instead, they can use YarnClient/ApplicationMasterProtocol > APIs to get the resource types from RM and refresh local types. > The biggest issue of this approach is FixedValueResource: since we initialize > FixedValueResource in a static block, it won't be updated when resource > types are refreshed.
> So we need to properly update FixedValueResource to make it can get > up-to-date results -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
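The stale-snapshot failure mode discussed in this thread can be reduced to a few lines. This is a hedged, self-contained sketch - the class and field names are invented for illustration and are not the actual Hadoop {{Resource}} API: a constant built in a static initializer captures the resource-type count at class-load time, while a lazily derived value tracks refreshes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FixedValueResourceSketch {
    // Mutable "cluster" view of resource types; starts with the mandatory two.
    public static final List<String> resourceTypes =
            new ArrayList<>(Arrays.asList("memory-mb", "vcores"));

    public static class StaleFixedResource {
        // Snapshot taken once when the class is first loaded -- the analogue
        // of initializing FixedValueResource in a static block.
        public static final long[] VALUES = new long[resourceTypes.size()];
    }

    public static class LazyFixedResource {
        // Re-derives the array from the current types on every access.
        public static long[] values() {
            return new long[resourceTypes.size()];
        }
    }

    public static void main(String[] args) {
        int atLoadTime = StaleFixedResource.VALUES.length;  // forces static init
        resourceTypes.add("yarn.io/gpu");                   // simulated refresh
        System.out.println("stale: " + StaleFixedResource.VALUES.length
                + " (captured " + atLoadTime + "), lazy: "
                + LazyFixedResource.values().length);
        // A 2-slot constant compared against a 3-type cluster resource is the
        // kind of mismatch that makes comparisons throw in the real code.
    }
}
```

The lazy variant is one way to model "properly update FixedValueResource"; the actual patch may instead reinitialize the constants explicitly on refresh.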
[jira] [Resolved] (YARN-5326) Support for recurring reservations in the YARN ReservationSystem
[ https://issues.apache.org/jira/browse/YARN-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subru Krishnan resolved YARN-5326. -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.1.0 3.0.0 2.9.0 Marking this as resolved as all sub-tasks are closed. > Support for recurring reservations in the YARN ReservationSystem > > > Key: YARN-5326 > URL: https://issues.apache.org/jira/browse/YARN-5326 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Subru Krishnan >Assignee: Carlo Curino > Fix For: 2.9.0, 3.0.0, 3.1.0 > > Attachments: SupportRecurringReservationsInRayon.pdf > > > YARN-1051 introduced a ReservationSystem that enables the YARN RM to handle > time explicitly, i.e. users can now "reserve" capacity ahead of time which is > predictably allocated to them. Most SLA jobs/workflows are recurring, so they > need the same resources periodically. With the current implementation, users > will have to make individual reservations for each run. This is an umbrella > JIRA to enhance the reservation system by adding native support for recurring > reservations.
[jira] [Updated] (YARN-6469) Extending Synthetic Load Generator and SLS for recurring reservation
[ https://issues.apache.org/jira/browse/YARN-6469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subru Krishnan updated YARN-6469: - Parent Issue: YARN-2572 (was: YARN-5326) > Extending Synthetic Load Generator and SLS for recurring reservation > > > Key: YARN-6469 > URL: https://issues.apache.org/jira/browse/YARN-6469 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Carlo Curino >Assignee: Carlo Curino > Attachments: YARN-6469.v0.patch, YARN-6469.v1.patch > > > This JIRA extends the synthetic load generator and SLS to support the > generation and submission of recurring jobs.
[jira] [Updated] (YARN-7410) Cleanup FixedValueResource to avoid dependency to ResourceUtils
[ https://issues.apache.org/jira/browse/YARN-7410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-7410: - Description: After YARN-7307, Client/AM don't need to keep an up-to-date resource-type.xml in the classpath. Instead, they can use YarnClient/ApplicationMasterProtocol APIs to get the resource types from the RM and refresh the local types. The biggest issue with this approach is FixedValueResource: since we initialize FixedValueResource in a static block, it won't be updated when resource types are refreshed. So we need to properly update FixedValueResource so that it returns up-to-date results was: Currently the FixedValueResource constants have some dependencies on ResourceUtils. This jira will help to clean up these dependencies. Thanks [~leftnoteasy] for finding the same. > Cleanup FixedValueResource to avoid dependency to ResourceUtils > --- > > Key: YARN-7410 > URL: https://issues.apache.org/jira/browse/YARN-7410 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 3.1.0 >Reporter: Sunil G >Assignee: Wangda Tan > Attachments: YARN-7410.001.patch, YARN-7410.002.patch > > > After YARN-7307, Client/AM don't need to keep an up-to-date resource-type.xml > in the classpath. Instead, they can use YarnClient/ApplicationMasterProtocol > APIs to get the resource types from the RM and refresh the local types. > The biggest issue with this approach is FixedValueResource: since we initialize > FixedValueResource in a static block, it won't be updated when resource > types are refreshed. > So we need to properly update FixedValueResource so that it returns > up-to-date results
[jira] [Commented] (YARN-7342) Application page doesn't show correct metrics for reservation runs
[ https://issues.apache.org/jira/browse/YARN-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225534#comment-16225534 ] Subru Krishnan commented on YARN-7342: -- [~yufeigu], can you get to it this week, as the 2.9.0 release date is this Friday (3rd Nov)? If so, update the target version accordingly. > Application page doesn't show correct metrics for reservation runs > --- > > Key: YARN-7342 > URL: https://issues.apache.org/jira/browse/YARN-7342 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, reservation system >Affects Versions: 3.1.0 >Reporter: Yufei Gu > Attachments: Screen Shot 2017-10-16 at 17.27.48.png > > > As the screenshot shows, there are some bugs in the web UI while running jobs > with reservations. For example, the queue name should just be "root.queueA" > instead of the internal queue name. All metrics (Allocated CPU, % of queue, etc.) > are missing for reservation runs. These shouldn't be a blocker, though.
[jira] [Commented] (YARN-7262) Add a hierarchy into the ZKRMStateStore for delegation token znodes to prevent jute buffer overflow
[ https://issues.apache.org/jira/browse/YARN-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225533#comment-16225533 ] Robert Kanter commented on YARN-7262: - Oops, sorry, I had misread your previous comment as "LGTM +1", not just "LGTM". > Add a hierarchy into the ZKRMStateStore for delegation token znodes to > prevent jute buffer overflow > --- > > Key: YARN-7262 > URL: https://issues.apache.org/jira/browse/YARN-7262 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.6.0 >Reporter: Robert Kanter >Assignee: Robert Kanter > Fix For: 2.9.0, 3.0.0 > > Attachments: YARN-7262.001.patch, YARN-7262.002.patch, > YARN-7262.003.patch, YARN-7262.003.patch > > > We've seen users running into a problem where the RM stores so > many delegation tokens in the {{ZKRMStateStore}} that the _listing_ of those > znodes is larger than the jute buffer. This is fine during normal operation, but > becomes a problem on a failover, because the RM will try to read in all of > the token znodes (i.e. call {{getChildren}} on the parent znode). This is > particularly bad because everything appears to be okay, but then if a > failover occurs you end up with no active RMs. > There was a similar problem with the YARN application data that was fixed in > YARN-2962 by adding a (configurable) hierarchy of znodes so the RM could pull > subchildren without overflowing the jute buffer (though it's off by default). > We should add a hierarchy similar to that of YARN-2962, but for the > delegation token znodes.
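The YARN-2962-style hierarchy the description refers to can be sketched as a path-bucketing scheme. This is a hedged illustration only - the split size, path layout, and configuration in the actual patch may differ: the leading digits of a zero-padded token sequence number become an intermediate "bucket" znode, so {{getChildren}} on the parent returns one entry per bucket instead of one per token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TokenZnodeHierarchy {
    // Number of trailing digits kept in the leaf node name; the leading
    // digits become the intermediate "bucket" znode. (Hypothetical constant --
    // the real patch makes the split configurable.)
    static final int LEAF_DIGITS = 3;

    // e.g. token 1234567 -> "/rmstore/tokens/0001234/567"
    public static String bucketedPath(String parent, long tokenSeq) {
        String name = String.format("%010d", tokenSeq);
        int cut = name.length() - LEAF_DIGITS;
        return parent + "/" + name.substring(0, cut) + "/" + name.substring(cut);
    }

    public static void main(String[] args) {
        // Listing the parent now returns one bucket per 1000 tokens instead
        // of one child per token, keeping the getChildren() reply small.
        Map<String, List<String>> buckets = new TreeMap<>();
        for (long seq = 0; seq < 5000; seq++) {
            String path = bucketedPath("/rmstore/tokens", seq);
            String bucket = path.substring(0, path.lastIndexOf('/'));
            buckets.computeIfAbsent(bucket, k -> new ArrayList<>()).add(path);
        }
        System.out.println(buckets.size() + " buckets for 5000 tokens");
    }
}
```

The failover read then walks buckets one at a time, so no single ZooKeeper reply has to carry the whole token population.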
[jira] [Commented] (YARN-7343) Add a junit test for ContainerScheduler recovery
[ https://issues.apache.org/jira/browse/YARN-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225529#comment-16225529 ] Subru Krishnan commented on YARN-7343: -- Moving it out of 2.9 as the patch needs rework and it's not a blocker as it's related to tests. > Add a junit test for ContainerScheduler recovery > > > Key: YARN-7343 > URL: https://issues.apache.org/jira/browse/YARN-7343 > Project: Hadoop YARN > Issue Type: Task >Reporter: kartheek muthyala >Assignee: Sampada Dehankar >Priority: Minor > Attachments: YARN-7343.001.patch > > > With queuing at NM, Container recovery becomes interesting. Add a junit test > for recovering containers in different states. This should test the recovery > with the ContainerScheduler class that was introduced for enabling container > queuing on contention of resources.
[jira] [Updated] (YARN-7415) RM Rest API Submit application documentation not easy to understand
[ https://issues.apache.org/jira/browse/YARN-7415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Atul Kulkarni updated YARN-7415: Priority: Major (was: Minor) > RM Rest API Submit application documentation not easy to understand > --- > > Key: YARN-7415 > URL: https://issues.apache.org/jira/browse/YARN-7415 > Project: Hadoop YARN > Issue Type: Bug > Components: api, docs, documentation >Affects Versions: 2.7.3 >Reporter: Atul Kulkarni > Labels: documentation, newbie, rest_api > > This specifically pertains to the "Cluster Applications API (Submit > Application)" documentation. > This line - “Please note that in order to submit an app, you must have an > authentication filter setup for the HTTP interface. The functionality > requires that a username is set in the HttpServletRequest. If no filter is > setup, the response will be an “UNAUTHORIZED” response.” is NOT very helpful > in conveying to the user what needs to happen on the client side or on the > REST API side. > Specifically: > 1. "Authentication filter setup for the HTTP interface" - > * Does this mean that one needs Kerberos enabled on the cluster? > * If not, what kind of HTTP authentication filter needs to be set up on > httpd or on the client side? A few Wikipedia or other links on how to do this would > go a long way in increasing adoption of the RM REST API. > 2. "The functionality requires that a username is set in the > HttpServletRequest" - > * HttpServletRequest is a Java class - does the REST API mandate that integrations > use Java? > * If not, what would be the equivalent in normal HTTP REST parlance? > This will help people understand what to do when using Python etc. > I have frustrated myself with this documentation over the last few days, and with the > many "tutorials" that claim to make it work on production clusters, and > finally reached the conclusion that if this documentation can be improved, > it would save everyone a LOT of time and effort. 
> I am happy to help fix this - I have no experience contributing to Apache > projects and may make mistakes, but I am willing to learn. I would need > answers to the above questions to be able to do anything with this. I still > have not figured out how to run the job without HDFS complaining about user > permissions, which I am guessing is related.
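For readers hitting the same wall: with Hadoop's default pseudo (simple) authentication filter, the caller identity is conveyed as the {{user.name}} query parameter rather than anything Java-specific; with Kerberos, the client performs SPNEGO instead. The sketch below only assembles the endpoint URL and a minimal JSON body for a POST to {{/ws/v1/cluster/apps}} and never contacts a cluster - the host name, app id, and command are placeholder values, and the field set is trimmed to illustrate the shape, not a complete submission.

```java
public class SubmitAppRequest {
    // Endpoint for the Cluster Applications API (Submit Application); with the
    // pseudo filter, the caller is identified by the user.name query parameter.
    public static String submitUrl(String rmHost, String user) {
        return "http://" + rmHost + ":8088/ws/v1/cluster/apps?user.name=" + user;
    }

    // Minimal illustrative JSON body; a real submission usually also carries
    // resource, queue, and local-resource entries.
    public static String minimalBody(String appId, String command) {
        return "{\"application-id\":\"" + appId + "\","
                + "\"application-name\":\"rest-demo\","
                + "\"am-container-spec\":{\"commands\":{\"command\":\""
                + command + "\"}},"
                + "\"application-type\":\"YARN\"}";
    }

    public static void main(String[] args) {
        System.out.println(submitUrl("rm.example.com", "alice"));
        System.out.println(minimalBody("application_1509000000000_0001", "sleep 60"));
    }
}
```

The application-id itself must first be obtained with a POST to the cluster's new-application endpoint; the HDFS permission errors mentioned above usually trace back to the same missing caller identity.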
[jira] [Updated] (YARN-7342) Application page doesn't show correct metrics for reservation runs
[ https://issues.apache.org/jira/browse/YARN-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subru Krishnan updated YARN-7342: - Target Version/s: 3.0.0, 3.1.0 (was: 2.9.0, 3.1.0) > Application page doesn't show correct metrics for reservation runs > --- > > Key: YARN-7342 > URL: https://issues.apache.org/jira/browse/YARN-7342 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, reservation system >Affects Versions: 3.1.0 >Reporter: Yufei Gu > Attachments: Screen Shot 2017-10-16 at 17.27.48.png > > > As the screenshot shows, there are some bugs in the web UI while running jobs > with reservations. For example, the queue name should just be "root.queueA" > instead of the internal queue name. All metrics (Allocated CPU, % of queue, etc.) > are missing for reservation runs. These shouldn't be a blocker, though.
[jira] [Updated] (YARN-7343) Add a junit test for ContainerScheduler recovery
[ https://issues.apache.org/jira/browse/YARN-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subru Krishnan updated YARN-7343: - Target Version/s: 3.0.0 (was: 2.9.0, 3.0.0) > Add a junit test for ContainerScheduler recovery > > > Key: YARN-7343 > URL: https://issues.apache.org/jira/browse/YARN-7343 > Project: Hadoop YARN > Issue Type: Task >Reporter: kartheek muthyala >Assignee: Sampada Dehankar >Priority: Minor > Attachments: YARN-7343.001.patch > > > With queuing at NM, Container recovery becomes interesting. Add a junit test > for recovering containers in different states. This should test the recovery > with the ContainerScheduler class that was introduced for enabling container > queuing on contention of resources.
[jira] [Updated] (YARN-7276) Federation Router Web Service fixes
[ https://issues.apache.org/jira/browse/YARN-7276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated YARN-7276: -- Attachment: YARN-7276-branch-2.006.patch Fixed javac issue with explicit cast. > Federation Router Web Service fixes > --- > > Key: YARN-7276 > URL: https://issues.apache.org/jira/browse/YARN-7276 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri > Attachments: YARN-7276-branch-2.000.patch, > YARN-7276-branch-2.001.patch, YARN-7276-branch-2.002.patch, > YARN-7276-branch-2.003.patch, YARN-7276-branch-2.004.patch, > YARN-7276-branch-2.005.patch, YARN-7276-branch-2.006.patch, > YARN-7276.000.patch, YARN-7276.001.patch, YARN-7276.002.patch, > YARN-7276.003.patch, YARN-7276.004.patch, YARN-7276.005.patch, > YARN-7276.006.patch, YARN-7276.007.patch, YARN-7276.009.patch, > YARN-7276.010.patch, YARN-7276.011.patch, YARN-7276.012.patch, > YARN-7276.013.patch, YARN-7276.014.patch > > > While testing YARN-3661, I found a few issues with the REST interface in the > Router: > * No support for empty content (error 204) > * Media type support > * Attributes in {{FederationInterceptorREST}} > * Support for empty states and labels > * DefaultMetricsSystem initialization is missing
[jira] [Commented] (YARN-6953) Clean up ResourceUtils.setMinimumAllocationForMandatoryResources() and setMaximumAllocationForMandatoryResources()
[ https://issues.apache.org/jira/browse/YARN-6953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225522#comment-16225522 ] Daniel Templeton commented on YARN-6953: +1 from me as well, except that the patch doesn't appear to apply anymore. > Clean up ResourceUtils.setMinimumAllocationForMandatoryResources() and > setMaximumAllocationForMandatoryResources() > -- > > Key: YARN-6953 > URL: https://issues.apache.org/jira/browse/YARN-6953 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: YARN-3926 >Reporter: Daniel Templeton >Assignee: Manikandan R >Priority: Minor > Labels: newbie > Attachments: YARN-6953-YARN-3926-WIP.patch, > YARN-6953-YARN-3926.001.patch, YARN-6953-YARN-3926.002.patch, > YARN-6953-YARN-3926.003.patch, YARN-6953-YARN-3926.004.patch, > YARN-6953-YARN-3926.005.patch, YARN-6953-YARN-3926.006.patch, > YARN-6953.007.patch, YARN-6953.008.patch > > > The {{setMinimumAllocationForMandatoryResources()}} and > {{setMaximumAllocationForMandatoryResources()}} methods are quite convoluted. > They'd be much simpler if they just handled CPU and memory manually instead > of trying to be clever about doing it in a loop. There are also issues, such > as the log warning always talking about memory or the last element of the > inner array being a copy of the first element.
[jira] [Commented] (YARN-7146) Many RM unit tests failing with FairScheduler
[ https://issues.apache.org/jira/browse/YARN-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225518#comment-16225518 ] Daniel Templeton commented on YARN-7146: [~rkanter], wanna check those test failures? > Many RM unit tests failing with FairScheduler > - > > Key: YARN-7146 > URL: https://issues.apache.org/jira/browse/YARN-7146 > Project: Hadoop YARN > Issue Type: Bug > Components: test >Affects Versions: 3.0.0-beta1 >Reporter: Robert Kanter >Assignee: Robert Kanter > Fix For: 3.0.0-beta1, 3.1.0 > > Attachments: YARN-7146.001.patch, YARN-7146.002.patch, > YARN-7146.003.patch, YARN-7146.004.branch-2.patch, YARN-7146.004.patch > > > Many of the RM unit tests are failing when using the FairScheduler. > Here is a list of affected test classes: > {noformat} > TestYarnClient > TestApplicationCleanup > TestApplicationMasterLauncher > TestDecommissioningNodesWatcher > TestKillApplicationWithRMHA > TestNodeBlacklistingOnAMFailures > TestRM > TestRMAdminService > TestRMRestart > TestResourceTrackerService > TestWorkPreservingRMRestart > TestAMRMRPCNodeUpdates > TestAMRMRPCResponseId > TestAMRestart > TestApplicationLifetimeMonitor > TestNodesListManager > TestRMContainerImpl > TestAbstractYarnScheduler > TestSchedulerUtils > TestFairOrderingPolicy > TestAMRMTokens > TestDelegationTokenRenewer > {noformat} > Most of the test methods in these classes are failing, though some do succeed. > There's two main categories of issues: > # The test submits an application to the {{MockRM}} and waits for it to enter > a specific state, which it never does, and the test times out. We need to > call {{update()}} on the scheduler. > # The test throws a {{ClassCastException}} on {{FSQueueMetrics}} to > {{CSQueueMetrics}}. This is because {{QueueMetrics}} metrics are static, and > a previous test using FairScheduler initialized it, and the current test is > using CapacityScheduler. We need to reset the metrics. 
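The second failure category above - a static metrics registry leaking state between scheduler types - can be shown in miniature. This is a hypothetical, self-contained model, not the actual QueueMetrics code: a metrics object created by a FairScheduler test is handed back, unchanged, to a later CapacityScheduler test, which would then fail casting it to its own subclass unless the registry is reset between tests.

```java
import java.util.HashMap;
import java.util.Map;

public class StaticMetricsPitfall {
    public static class QueueMetrics {}
    public static class FSQueueMetrics extends QueueMetrics {}
    public static class CSQueueMetrics extends QueueMetrics {}

    // Static registry keyed by queue name -- the reason state leaks between
    // tests that run in the same JVM.
    public static final Map<String, QueueMetrics> REGISTRY = new HashMap<>();

    public static QueueMetrics forQueue(String name, boolean fairScheduler) {
        return REGISTRY.computeIfAbsent(
                name, n -> fairScheduler ? new FSQueueMetrics()
                                         : new CSQueueMetrics());
    }

    public static void clearQueueMetrics() { REGISTRY.clear(); }

    public static void main(String[] args) {
        forQueue("root", true);                    // a FairScheduler test ran first
        QueueMetrics m = forQueue("root", false);  // CapacityScheduler test asks next
        System.out.println(m instanceof CSQueueMetrics);  // stale FS object returned
        clearQueueMetrics();                       // the reset the comment describes
        System.out.println(forQueue("root", false) instanceof CSQueueMetrics);
    }
}
```

Calling the reset in each test's setup (alongside a scheduler {{update()}} for category #1) is the shape of the fix the patch applies.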
[jira] [Updated] (YARN-7276) Federation Router Web Service fixes
[ https://issues.apache.org/jira/browse/YARN-7276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated YARN-7276: -- Attachment: (was: YARN-7276-branch-2.005.patch) > Federation Router Web Service fixes > --- > > Key: YARN-7276 > URL: https://issues.apache.org/jira/browse/YARN-7276 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri > Attachments: YARN-7276-branch-2.000.patch, > YARN-7276-branch-2.001.patch, YARN-7276-branch-2.002.patch, > YARN-7276-branch-2.003.patch, YARN-7276-branch-2.004.patch, > YARN-7276-branch-2.005.patch, YARN-7276.000.patch, YARN-7276.001.patch, > YARN-7276.002.patch, YARN-7276.003.patch, YARN-7276.004.patch, > YARN-7276.005.patch, YARN-7276.006.patch, YARN-7276.007.patch, > YARN-7276.009.patch, YARN-7276.010.patch, YARN-7276.011.patch, > YARN-7276.012.patch, YARN-7276.013.patch, YARN-7276.014.patch > > > While testing YARN-3661, I found a few issues with the REST interface in the > Router: > * No support for empty content (error 204) > * Media type support > * Attributes in {{FederationInterceptorREST}} > * Support for empty states and labels > * DefaultMetricsSystem initialization is missing
[jira] [Commented] (YARN-7262) Add a hierarchy into the ZKRMStateStore for delegation token znodes to prevent jute buffer overflow
[ https://issues.apache.org/jira/browse/YARN-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225516#comment-16225516 ] Daniel Templeton commented on YARN-7262: For the record, +1 > Add a hierarchy into the ZKRMStateStore for delegation token znodes to > prevent jute buffer overflow > --- > > Key: YARN-7262 > URL: https://issues.apache.org/jira/browse/YARN-7262 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.6.0 >Reporter: Robert Kanter >Assignee: Robert Kanter > Fix For: 2.9.0, 3.0.0 > > Attachments: YARN-7262.001.patch, YARN-7262.002.patch, > YARN-7262.003.patch, YARN-7262.003.patch > > > We've seen users running into a problem where the RM stores so > many delegation tokens in the {{ZKRMStateStore}} that the _listing_ of those > znodes is larger than the jute buffer. This is fine during normal operation, but > becomes a problem on a failover, because the RM will try to read in all of > the token znodes (i.e. call {{getChildren}} on the parent znode). This is > particularly bad because everything appears to be okay, but then if a > failover occurs you end up with no active RMs. > There was a similar problem with the YARN application data that was fixed in > YARN-2962 by adding a (configurable) hierarchy of znodes so the RM could pull > subchildren without overflowing the jute buffer (though it's off by default). > We should add a hierarchy similar to that of YARN-2962, but for the > delegation token znodes.
[jira] [Commented] (YARN-7276) Federation Router Web Service fixes
[ https://issues.apache.org/jira/browse/YARN-7276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225485#comment-16225485 ] Hadoop QA commented on YARN-7276: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 11m 27s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} branch-2 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 30s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 36s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s{color} | {color:green} branch-2 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | 
{color:red} 0m 19s{color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-router generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 46s{color} | {color:red} hadoop-yarn-server-router in the patch failed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 21s{color} | {color:red} The patch generated 3 ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 28m 17s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.router.webapp.TestRouterWebServicesREST | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:17213a0 | | JIRA Issue | YARN-7276 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12894781/YARN-7276-branch-2.005.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux cea5c007ba83 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 18:04:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | branch-2 / 2afb8ba | | maven | version: Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-10T16:41:47+00:00) | | Default Java | 1.7.0_151 | | findbugs | v3.0.0 | | javac | https://builds.apache.org/job/PreCommit-YARN-Build/18244/artifact/out/diff-compile-javac-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-router.txt | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/18244/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-router.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/18244/testReport/ | | asflicense | https://builds.apache.org/job/PreCommit-YARN-Build/18244/artifact/out/patch-asflicense-problems.txt | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router U:
[jira] [Comment Edited] (YARN-7197) Add support for a volume blacklist for docker containers
[ https://issues.apache.org/jira/browse/YARN-7197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225452#comment-16225452 ] Eric Yang edited comment on YARN-7197 at 10/30/17 6:19 PM: --- [~jlowe] {quote}Either /run isn't in the whitelist in the first place rendering the blacklist entry moot or /run is in the whitelist and the user can simply mount {{/run}} and access the blacklist path.{quote} Let's expand on a real-world example. A hacker tries to take control of {{/run/docker.socket}} to acquire root privileges and spawn root containers that access vital system areas, becoming root on the host system. The system admin placed {{/var}} in the read-write whitelist for the ability to write to database and log directories, without any blacklist capability. The hacker explicitly specifies {{/var/run/docker.socket}} to be included, putting the socket at {{/tmp/docker.socket}}. The hacker generates a docker image with {{/etc/group}} modified to include his own name, or with a setuid-bit binary in the container. The hacker can successfully gain control of the host-level docker without much effort. {{/run}} contains a growing list of software that puts its pid file or socket in this location. The system admin can't forbid other software (i.e. hdfs short-circuit read) from placing its socket in {{/run}} and sharing it between containers, due to company requirements. However, he still doesn't want to let the hacker gain root access. h3. Solution 1: The system admin carefully places {{/var/*}}, {{/run/\*}} (except /run/docker.socket), and {{/mnt/hdfs/user/\*}} (except yarn) in the read-write whitelist. None of the symlinks are exposed. The hacker cannot get in. h3. 
Solution 2 (All symlinks, and hardcoded locations are banned): (Current proposed patch) System admin specifies: white-list-read-write: {{/var}}, {{/run/\*}} (except /run/docker.socket), {{/mnt/hdfs/user/\*}} (exception yarn) black-list: {{/var/run}},{{/run/docker.socket}},{{/mnt/hdfs/user/yarn}} Hacker attempt to mount a symlink location resulting in access denied from container startup, or explicit hard coded location also result in ban. h3. Solution 3: (Replace black list location with empty directories): (Jason proposed implementation) System admin specifies: white-list-read-write: {{/var}},{{/run}},{{/mnt/hdfs/user}} black-list: {{/run/docker.socket}},{{/mnt/hdfs/user/yarn}} Hacker attempt to mount a symlink location resulting in access denied from container startup, or mount /run/docker.socket manually, but result in empty file. All solutions requires system administrator to enforce ability to upload secure image to private registry to prevent torjan horse in docker image. I can see the appeal that without having to do a high upkeep of white-list-read-write directories by the new proposal. The third solution can throw people off, if they do not know about black-list is hijacked to empty location. However, the depth of directories might defeat second solution. If community favors the third solution, I can revise patch accordingly. was (Author: eyang): [~jlowe] {quote}Either /run isn't in the whitelist in the first place rendering the blacklist entry moot or /run is in the whitelist and the user can simply mount {{/run}} and access the blacklist path.{quote} Let's expand on the real world example. A hacker tries to take control of {{/run/docker.socket}} to acquire root privileges and spawn root containers to access vital system area to become root on the host system. The system admin placed {{/var}} in read-write white list for ability to write to database and log directories, without black list capability. 
Hacker explicitly specify {{/var/run/docker.socket}} to be included, put the socket in {{/tmp/docker.socket}}. Hacker generates a docker image with {{/etc/group}} modified to include his own name or setuid bit binary in container. Hack can successfully gain control to host level docker without much effort. {{/run}} contains a growing list of software that put their pid file or socket in this location. System admin can't say no to not allow other software (i.e. hdfs short circuit read) to place their socket in {{/run}} location and share between containers due to company requirement. However, he still doesn't want to let hacker gain root access. h3. Solution 1: System admin placed {{/var/*}}, {{/run/\*}} (except /run/docker.socket), and {{/mnt/hdfs/user/*}} (except yarn), carefully in read-write white list. None of the symlink is exposed. Hacker can not get in. h3. Solution 2 (All symlinks, and hardcoded locations are banned): (Current proposed patch) System admin specifies: white-list-read-write: {{/var}}, {{/run/\*}} (except /run/docker.socket), {{/mnt/hdfs/user/\*}} (exception yarn) black-list: {{/var/run}},{{/run/docker.socket}},{{/mnt/hdfs/user/yarn}} Hacker attempt to mount a symlink location resulting in access
[jira] [Commented] (YARN-7244) ShuffleHandler is not aware of disks that are added
[ https://issues.apache.org/jira/browse/YARN-7244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225477#comment-16225477 ] Kuhu Shukla commented on YARN-7244: --- [~jlowe], request for comments on the 2.8 version of the patch. Appreciate it! > ShuffleHandler is not aware of disks that are added > --- > > Key: YARN-7244 > URL: https://issues.apache.org/jira/browse/YARN-7244 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Fix For: 2.9.0, 3.0.0 > > Attachments: YARN-7244-branch-2.8.001.patch, > YARN-7244-branch-2.8.002.patch, YARN-7244.001.patch, YARN-7244.002.patch, > YARN-7244.003.patch, YARN-7244.004.patch, YARN-7244.005.patch, > YARN-7244.006.patch, YARN-7244.007.patch, YARN-7244.008.patch, > YARN-7244.009.patch, YARN-7244.010.patch, YARN-7244.011.patch, > YARN-7244.012.patch, YARN-7244.013.patch > > > The ShuffleHandler permanently remembers the list of "good" disks on NM > startup. If disks later are added to the node then map tasks will start using > them but the ShuffleHandler will not be aware of them. The end result is that > the data cannot be shuffled from the node leading to fetch failures and > re-runs of the map tasks. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
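The failure mode described above comes from snapshotting the "good" disk list once at NM startup. A minimal sketch of the per-request alternative, in Python with hypothetical names (not the actual ShuffleHandler code): probe every currently configured local dir when an output is requested, so a disk added after startup is still searched.

```python
import os

def find_map_output(local_dirs, relative_path):
    """Look up a map output under every currently configured local dir.

    Hypothetical sketch: rather than consulting a list of "good" disks
    captured once at startup, probe all configured dirs at shuffle time,
    so outputs written to disks added after startup can still be served.
    """
    for d in local_dirs:
        candidate = os.path.join(d, relative_path)
        if os.path.isfile(candidate):
            return candidate
    return None  # genuinely missing output; the fetcher reports a failure
```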
[jira] [Comment Edited] (YARN-7197) Add support for a volume blacklist for docker containers
[ https://issues.apache.org/jira/browse/YARN-7197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225452#comment-16225452 ] Eric Yang edited comment on YARN-7197 at 10/30/17 6:18 PM: --- [~jlowe] {quote}Either /run isn't in the whitelist in the first place rendering the blacklist entry moot or /run is in the whitelist and the user can simply mount {{/run}} and access the blacklist path.{quote} Let's expand on a real-world example. A hacker tries to take control of {{/run/docker.socket}} to acquire root privileges and spawn root containers that reach vital system areas, becoming root on the host system. The system admin placed {{/var}} in the read-write white list for the ability to write to database and log directories, without any black-list capability. The hacker explicitly specifies {{/var/run/docker.socket}} to be included, putting the socket at {{/tmp/docker.socket}}. The hacker generates a docker image with {{/etc/group}} modified to include his own name, or with a setuid-bit binary in the container, and can gain control of the host-level docker daemon without much effort. {{/run}} contains a growing list of software that puts its pid file or socket in this location. The system admin cannot forbid other software (e.g. HDFS short-circuit read) from placing its socket in {{/run}} and sharing it between containers, due to company requirements. However, he still doesn't want to let the hacker gain root access. h3. Solution 1: The system admin carefully places {{/var/*}}, {{/run/\*}} (except /run/docker.socket), and {{/mnt/hdfs/user/*}} (except yarn) in the read-write white list. None of the symlinks is exposed. The hacker cannot get in. h3. 
Solution 2 (all symlinks and hard-coded locations are banned): (current proposed patch) System admin specifies: white-list-read-write: {{/var}}, {{/run/\*}} (except /run/docker.socket), {{/mnt/hdfs/user/\*}} (except yarn) black-list: {{/var/run}},{{/run/docker.socket}},{{/mnt/hdfs/user/yarn}} A hacker's attempt to mount a symlinked location results in access denied at container startup, and an explicitly hard-coded location is likewise banned. h3. Solution 3 (replace black-listed locations with empty directories): (Jason's proposed implementation) System admin specifies: white-list-read-write: {{/var}},{{/run}},{{/mnt/hdfs/user}} black-list: {{/run/docker.socket}},{{/mnt/hdfs/user/yarn}} A hacker's attempt to mount a symlinked location results in access denied at container startup; mounting /run/docker.socket manually yields only an empty file. All solutions require the system administrator to ensure that only vetted images can be uploaded to the private registry, to prevent a trojan horse in a docker image. I can see the appeal of the new proposal: it avoids the high upkeep of white-list-read-write directories. The third solution can throw people off if they do not know that black-listed paths are redirected to empty locations. However, deeply nested directories might defeat the second solution. If the community favors the third solution, I can revise the patch accordingly. was (Author: eyang): [~jlowe] {quote}Either /run isn't in the whitelist in the first place rendering the blacklist entry moot or /run is in the whitelist and the user can simply mount {{/run}} and access the blacklist path.{quote} Let's expand on a real-world example. A hacker tries to take control of {{/run/docker.socket}} to acquire root privileges and spawn root containers that reach vital system areas, becoming root on the host system. The system admin placed {{/var}} in the read-write white list for the ability to write to database and log directories, without any black-list capability. 
The hacker explicitly specifies {{/var/run/docker.socket}} to be included, putting the socket at {{/tmp/docker.socket}}. The hacker generates a docker image with {{/etc/group}} modified to include his own name, or with a setuid-bit binary in the container, and can gain control of the host-level docker daemon without much effort. {{/run}} contains a growing list of software that puts its pid file or socket in this location. The system admin cannot forbid other software (e.g. HDFS short-circuit read) from placing its socket in {{/run}} and sharing it between containers, due to company requirements. However, he still doesn't want to let the hacker gain root access. h3. Solution 1: The system admin carefully places {{/var/*}} and {{/run/\*}} (except /run/docker.socket) in the read-write white list. None of the symlinks is exposed. The hacker cannot get in. h3. Solution 2 (all symlinks and hard-coded locations are banned): (current proposed patch) System admin specifies: white-list-read-write: {{/var}}, {{/run/\*}} (except /run/docker.socket), {{/mnt/hdfs/user/\*}} (except yarn) black-list: {{/var/run}},{{/run/docker.socket}},{{/mnt/hdfs/user/yarn}} A hacker's attempt to mount a symlinked location results in access denied at container startup, or
[jira] [Commented] (YARN-6927) Add support for individual resource types requests in MapReduce
[ https://issues.apache.org/jira/browse/YARN-6927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225468#comment-16225468 ] Hadoop QA commented on YARN-6927: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 6s{color} | {color:red} YARN-6927 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-6927 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12894793/YARN-6927.010.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/18245/console | | Powered by | Apache Yetus 0.7.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Add support for individual resource types requests in MapReduce > --- > > Key: YARN-6927 > URL: https://issues.apache.org/jira/browse/YARN-6927 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Daniel Templeton >Assignee: Gergo Repas > Fix For: 3.1.0 > > Attachments: YARN-6927.000.patch, YARN-6927.001.patch, > YARN-6927.002.patch, YARN-6927.003.patch, YARN-6927.004.patch, > YARN-6927.005.patch, YARN-6927.006.patch, YARN-6927.007.patch, > YARN-6927.008.patch, YARN-6927.009.patch, YARN-6927.010.patch > > > YARN-6504 adds support for resource profiles in MapReduce jobs, but resource > profiles don't give users much flexibility in their resource requests. To > satisfy users' needs, MapReduce should also allow users to specify arbitrary > resource requests. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7197) Add support for a volume blacklist for docker containers
[ https://issues.apache.org/jira/browse/YARN-7197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225452#comment-16225452 ] Eric Yang commented on YARN-7197: - [~jlowe] {quote}Either /run isn't in the whitelist in the first place rendering the blacklist entry moot or /run is in the whitelist and the user can simply mount /run and access the blacklist path.{quote} Let's expand on a real-world example. A hacker tries to take control of {{/run/docker.socket}} to acquire root privileges and spawn root containers that reach vital system areas, becoming root on the host system. The system admin placed {{/var}} in the read-write white list for the ability to write to database and log directories, without any black-list capability. The hacker explicitly specifies {{/var/run/docker.socket}} to be included, putting the socket at {{/tmp/docker.socket}}. The hacker generates a docker image with /etc/group modified to include his own name, or with a setuid-bit binary in the container, and can gain control of the host-level docker daemon without much effort. {{/run}} contains a growing list of software that puts its pid file or socket in this location. The system admin cannot forbid other software from placing its socket in {{/run}} and sharing it between containers, due to company requirements. However, he still doesn't want to let the hacker gain root access. Solution 1: The system admin carefully places {{/var/*}} and {{/run/*}} (except /run/docker.socket) in the read-write white list. None of the symlinks is exposed. The hacker cannot get in. Solution 2 (all symlinks and explicit hard-coded locations are banned): (current proposed patch) System admin specifies: white-list-read-write: {{/var}}, {{/run/*}} (except /run/docker.socket), {{/mnt/hdfs/user/*}} (except yarn) black-list: {{/var/run}},{{/run/docker.socket}},{{/mnt/hdfs/user/yarn}} A hacker's attempt to mount a symlinked location results in access denied at container startup, and an explicitly hard-coded location is likewise banned. 
Solution 3 (ban symlinks and replace black-listed locations with empty directories): (Jason's proposed implementation) System admin specifies: white-list-read-write: {{/var}},{{/run}},{{/mnt/hdfs/user}} black-list: {{/run/docker.socket}},{{/mnt/hdfs/user/yarn}} A hacker's attempt to mount a symlinked location results in access denied at container startup; mounting /run/docker.socket manually yields only an empty file. All solutions require the system administrator to ensure that only vetted images can be uploaded to the private registry, to prevent a trojan horse in a docker image. I can see the appeal of the new proposal: it avoids the high upkeep of white-list-read-write directories. The third solution can throw people off if they do not know that black-listed paths are redirected to empty locations. However, the more deeply nested the directories, the harder they are to secure with the second solution. If the community favors the third solution, I can revise the patch accordingly. > Add support for a volume blacklist for docker containers > > > Key: YARN-7197 > URL: https://issues.apache.org/jira/browse/YARN-7197 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Shane Kumpf >Assignee: Eric Yang > Attachments: YARN-7197.001.patch, YARN-7197.002.patch > > > Docker supports bind mounting host directories into containers. Work is > underway to allow admins to configure a whitelist of volume mounts. While > this is a much needed and useful feature, it opens the door for > misconfiguration that may lead to users being able to compromise or crash the > system. > One example would be allowing users to mount /run from a host running > systemd, and then running systemd in that container, rendering the host > mostly unusable. > This issue is to add support for a default blacklist. The default blacklist > would be where we put files and directories that if mounted into a container, > are likely to have negative consequences. 
Users are encouraged not to remove > items from the default blacklist, but may do so if necessary. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
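The white-list/black-list precedence debated above can be sketched roughly as follows. This is a minimal illustrative sketch, not the actual container-executor code: the class and method names ({{MountValidator}}, {{isAllowed}}) are hypothetical, and it assumes the caller has already resolved symlinks (e.g. via {{Path.toRealPath()}}), which is how the symlink ban in solutions 2 and 3 would prevent {{/var/run}} from smuggling in {{/run}}.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;

/**
 * Hypothetical sketch of white-list/black-list bind-mount validation.
 * Black-list entries take precedence over white-list entries, and the
 * default is deny.
 */
public class MountValidator {
    private final List<Path> whitelist;  // prefixes allowed read-write
    private final List<Path> blacklist;  // prefixes always denied

    public MountValidator(List<String> whitelist, List<String> blacklist) {
        this.whitelist = whitelist.stream().map(Paths::get).collect(Collectors.toList());
        this.blacklist = blacklist.stream().map(Paths::get).collect(Collectors.toList());
    }

    /**
     * Decide whether a host path may be bind-mounted into a container.
     * The path is assumed to be canonical already (symlinks resolved),
     * so a request for /var/run/docker.socket would arrive here as
     * /run/docker.socket and be rejected by the black list.
     */
    public boolean isAllowed(String canonicalPath) {
        Path p = Paths.get(canonicalPath).normalize();
        for (Path banned : blacklist) {
            if (p.startsWith(banned)) {
                return false;  // black list wins over white list
            }
        }
        for (Path allowed : whitelist) {
            if (p.startsWith(allowed)) {
                return true;
            }
        }
        return false;  // default deny
    }
}
```

With the solution 2 style configuration above, {{/run/myapp.sock}} would be admitted by the {{/run}} white-list prefix while {{/run/docker.socket}} is rejected first by the black list.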
[jira] [Commented] (YARN-7410) Cleanup FixedValueResource to avoid dependency to ResourceUtils
[ https://issues.apache.org/jira/browse/YARN-7410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225447#comment-16225447 ] Daniel Templeton commented on YARN-7410: A couple of issues: # {{getResourceInformation(int)}} needs an override annotation # {{getResources()}} reinits the map but doesn't retry the operation Can you explain to me why we need to reinit the resource map in these calls? > Cleanup FixedValueResource to avoid dependency to ResourceUtils > --- > > Key: YARN-7410 > URL: https://issues.apache.org/jira/browse/YARN-7410 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 3.1.0 >Reporter: Sunil G >Assignee: Wangda Tan > Attachments: YARN-7410.001.patch, YARN-7410.002.patch > > > Currently FixedValueResource constants have some dependencies on > ResourceUtils. This jira will help to clean up these dependencies. > Thanks [~leftnoteasy] for finding the same. 
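The "reinits but doesn't retry" point can be illustrated with a generic lookup pattern. This is a hypothetical sketch, not the actual FixedValueResource code: if a lookup misses and the backing map is reinitialized, the lookup has to be attempted again, otherwise the reinit accomplishes nothing for the current caller.

```java
import java.util.Map;
import java.util.function.Supplier;

/**
 * Hypothetical sketch: a lookup backed by a map that is rebuilt on a
 * miss and, crucially, retried after the rebuild.
 */
public class ReloadableLookup<K, V> {
    private Map<K, V> map;
    private final Supplier<Map<K, V>> loader;

    public ReloadableLookup(Supplier<Map<K, V>> loader) {
        this.loader = loader;
        this.map = loader.get();
    }

    public V get(K key) {
        V value = map.get(key);
        if (value == null) {
            // Reinitialize the map, then retry the lookup once; without
            // the retry the caller still sees the stale miss even though
            // the rebuilt map may now contain the key.
            map = loader.get();
            value = map.get(key);
        }
        return value;
    }
}
```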
[jira] [Updated] (YARN-2497) Changes for fair scheduler to support allocate resource respect labels
[ https://issues.apache.org/jira/browse/YARN-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated YARN-2497: --- Attachment: YARN-2497.011.patch Rebased to latest trunk. > Changes for fair scheduler to support allocate resource respect labels > -- > > Key: YARN-2497 > URL: https://issues.apache.org/jira/browse/YARN-2497 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Wangda Tan >Assignee: Daniel Templeton > Attachments: YARN-2497.001.patch, YARN-2497.002.patch, > YARN-2497.003.patch, YARN-2497.004.patch, YARN-2497.005.patch, > YARN-2497.006.patch, YARN-2497.007.patch, YARN-2497.008.patch, > YARN-2497.009.patch, YARN-2497.010.patch, YARN-2497.011.patch, > YARN-2499.WIP01.patch > > 
[jira] [Commented] (YARN-7410) Cleanup FixedValueResource to avoid dependency to ResourceUtils
[ https://issues.apache.org/jira/browse/YARN-7410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225433#comment-16225433 ] Daniel Templeton commented on YARN-7410: Since this JIRA doesn't actually do anything about the dependencies, should the summary be changed to something more accurate? > Cleanup FixedValueResource to avoid dependency to ResourceUtils > --- > > Key: YARN-7410 > URL: https://issues.apache.org/jira/browse/YARN-7410 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 3.1.0 >Reporter: Sunil G >Assignee: Wangda Tan > Attachments: YARN-7410.001.patch, YARN-7410.002.patch > > > Currently FixedValueResource constants have some dependencies on > ResourceUtils. This jira will help to clean up these dependencies. > Thanks [~leftnoteasy] for finding the same. 
[jira] [Issue Comment Deleted] (YARN-7410) Cleanup FixedValueResource to avoid dependency to ResourceUtils
[ https://issues.apache.org/jira/browse/YARN-7410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated YARN-7410: --- Comment: was deleted (was: Since this JIRA doesn't actually do anything about the dependencies, should the summary be changed to something more accurate?) > Cleanup FixedValueResource to avoid dependency to ResourceUtils > --- > > Key: YARN-7410 > URL: https://issues.apache.org/jira/browse/YARN-7410 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 3.1.0 >Reporter: Sunil G >Assignee: Wangda Tan > Attachments: YARN-7410.001.patch, YARN-7410.002.patch > > > Currently FixedValueResource constants have some dependencies on > ResourceUtils. This jira will help to clean up these dependencies. > Thanks [~leftnoteasy] for finding the same. 
[jira] [Commented] (YARN-7159) Normalize unit of resource objects in RM and avoid to do unit conversion in critical path
[ https://issues.apache.org/jira/browse/YARN-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225426#comment-16225426 ] Manikandan R commented on YARN-7159: The TestClientRMService failure is related to this patch; the other unit test failures are not. Reading resources from a dependent jar doesn't work using the getFile() method. Attached a patch to fix this. > Normalize unit of resource objects in RM and avoid to do unit conversion in > critical path > - > > Key: YARN-7159 > URL: https://issues.apache.org/jira/browse/YARN-7159 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Wangda Tan >Assignee: Manikandan R >Priority: Critical > Attachments: YARN-7159.001.patch, YARN-7159.002.patch, > YARN-7159.003.patch, YARN-7159.004.patch, YARN-7159.005.patch, > YARN-7159.006.patch, YARN-7159.007.patch, YARN-7159.008.patch > > > Currently resource conversion can happen in the critical code path when > a different unit is specified by the client. This can significantly impact the > performance and throughput of the RM. We should normalize units when resources > are passed to the RM and avoid expensive unit conversion every time. 
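The getFile() pitfall mentioned in the comment is a general Java classpath issue: calling {{URL.getFile()}} on a resource that lives inside a jar yields a {{jar:file:...!/...}} style path that {{java.io.File}} cannot open, so tests pass when the resource is a plain file on the classpath but fail once it moves into a dependent jar. Reading through {{getResourceAsStream()}} works in both cases. A generic sketch (the {{ResourceReader}} class is hypothetical, not the actual test code):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ResourceReader {
    /**
     * Read a classpath resource as a string. Works whether the resource
     * lives in a directory on the classpath or inside a dependent jar,
     * unlike new File(url.getFile()), which fails for jar entries.
     */
    public static String read(String name) throws IOException {
        try (InputStream in =
                 ResourceReader.class.getClassLoader().getResourceAsStream(name)) {
            if (in == null) {
                throw new IOException("resource not found: " + name);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }
}
```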