[jira] [Commented] (YARN-7612) Add Placement Processor and planner framework
[ https://issues.apache.org/jira/browse/YARN-7612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284637#comment-16284637 ] genericqa commented on YARN-7612:

(x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 17s | Docker mode activated. |
|| Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
|| YARN-6592 Compile Tests ||
| 0 | mvndep | 1m 0s | Maven dependency ordering for branch |
| +1 | mvninstall | 16m 18s | YARN-6592 passed |
| +1 | compile | 7m 59s | YARN-6592 passed |
| +1 | checkstyle | 1m 7s | YARN-6592 passed |
| +1 | mvnsite | 2m 8s | YARN-6592 passed |
| +1 | shadedclient | 13m 26s | branch has no errors when building and testing our client artifacts. |
| -1 | findbugs | 1m 14s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in YARN-6592 has 1 extant Findbugs warnings. |
| +1 | javadoc | 1m 44s | YARN-6592 passed |
|| Patch Compile Tests ||
| 0 | mvndep | 0m 10s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 45s | the patch passed |
| +1 | compile | 6m 57s | the patch passed |
| +1 | cc | 6m 57s | the patch passed |
| +1 | javac | 6m 57s | the patch passed |
| -0 | checkstyle | 1m 5s | hadoop-yarn-project/hadoop-yarn: The patch generated 33 new + 560 unchanged - 1 fixed = 593 total (was 561) |
| +1 | mvnsite | 2m 2s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 11m 12s | patch has no errors when building and testing our client artifacts. |
| -1 | findbugs | 1m 17s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 | javadoc | 0m 27s | hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api generated 0 new + 0 unchanged - 1 fixed = 0 total (was 1) |
| +1 | javadoc | 0m 44s | hadoop-yarn-common in the patch passed. |
| +1 | javadoc | 0m 30s | hadoop-yarn-server-resourcemanager in the patch passed. |
|| Other Tests ||
| -1 | unit | 0m 41s | hadoop-yarn-api in the patch failed. |
| +1 | unit | 3m 4s | hadoop-yarn-common in the patch passed. |
| -1 | unit | 60m 45s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | asflicense | 0m 32s | The patch does not generate ASF License warnings. |
[jira] [Commented] (YARN-7591) NPE in async-scheduling mode of CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-7591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284620#comment-16284620 ] Wangda Tan commented on YARN-7591:

[~subru], cherry-picked to branch-3.0 / branch-2.9 / branch-2. Updated fix version. Thanks for the reminder!

> NPE in async-scheduling mode of CapacityScheduler
> Key: YARN-7591
> URL: https://issues.apache.org/jira/browse/YARN-7591
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacityscheduler
> Affects Versions: 3.0.0-alpha4, 2.9.1
> Reporter: Tao Yang
> Assignee: Tao Yang
> Priority: Critical
> Fix For: 3.0.0, 2.9.1
> Attachments: YARN-7591.001.patch, YARN-7591.002.patch
>
> Currently, in async-scheduling mode of the CapacityScheduler, an NPE may be raised in the following scenarios:
> (1) A user is removed after its last application finishes; an NPE may be raised if async-scheduling threads read from the user object without a null check.
> (2) An NPE may be raised when trying to fulfill a reservation for a finished application in {{CapacityScheduler#allocateContainerOnSingleNode}}:
> {code}
> RMContainer reservedContainer = node.getReservedContainer();
> if (reservedContainer != null) {
>   FiCaSchedulerApp reservedApplication = getCurrentAttemptForContainer(
>       reservedContainer.getContainerId());
>   // NPE here: reservedApplication could be null after this application finished
>   // Try to fulfill the reservation
>   LOG.info("Trying to fulfill reservation for application "
>       + reservedApplication.getApplicationId() + " on node: " + node.getNodeID());
> {code}
> (3) If proposal1 (allocate containerX on node1) and proposal2 (reserve containerY on node1) were generated by different async-scheduling threads around the same time, and proposal2 was submitted before proposal1, an NPE is raised when trying to submit proposal2 in {{FiCaSchedulerApp#commonCheckContainerAllocation}}:
> {code}
> if (reservedContainerOnNode != null) {
>   // NPE here: allocation.getAllocateFromReservedContainer() would be null for proposal2 in this case
>   RMContainer fromReservedContainer =
>       allocation.getAllocateFromReservedContainer().getRmContainer();
>   if (fromReservedContainer != reservedContainerOnNode) {
>     if (LOG.isDebugEnabled()) {
>       LOG.debug("Try to allocate from a non-existed reserved container");
>     }
>     return false;
>   }
> }
> {code}
--
This message was sent by Atlassian JIRA (v6.4.14#64029)
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
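To make scenario (2) concrete, here is a minimal, self-contained sketch of the kind of null check the fix calls for. The types below are simplified stand-ins introduced for illustration, not the actual CapacityScheduler classes:

```java
import java.util.HashMap;
import java.util.Map;

class ReservationGuardDemo {
    // Maps a container id to the application attempt the scheduler currently
    // tracks; a finished application no longer has an entry here.
    static final Map<String, String> currentAttempts = new HashMap<>();

    // Returns null when the application has already finished -- exactly the
    // case that triggers the NPE in the unguarded code.
    static String getCurrentAttemptForContainer(String containerId) {
        return currentAttempts.get(containerId);
    }

    // Guarded version of the fulfill-reservation path: skip the reservation
    // instead of dereferencing a null application.
    static boolean tryFulfillReservation(String reservedContainerId) {
        String reservedApplication =
            getCurrentAttemptForContainer(reservedContainerId);
        if (reservedApplication == null) {
            // Application finished between scheduling passes; nothing to do.
            return false;
        }
        System.out.println("Trying to fulfill reservation for application "
            + reservedApplication);
        return true;
    }

    public static void main(String[] args) {
        currentAttempts.put("container_1", "appattempt_1");
        System.out.println(tryFulfillReservation("container_1")); // true
        System.out.println(tryFulfillReservation("container_2")); // false
    }
}
```

The same guard-and-skip shape applies to scenario (3): check `allocation.getAllocateFromReservedContainer()` for null before chaining calls on it.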
[jira] [Updated] (YARN-7591) NPE in async-scheduling mode of CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-7591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-7591:
Fix Version/s: (was: 3.1.0) 2.9.1, 3.0.0
[jira] [Updated] (YARN-7591) NPE in async-scheduling mode of CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-7591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-7591:
Fix Version/s: 3.1.0
[jira] [Commented] (YARN-7612) Add Placement Processor and planner framework
[ https://issues.apache.org/jira/browse/YARN-7612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284608#comment-16284608 ] Wangda Tan commented on YARN-7612:

Thanks [~asuresh] for working on the patch. Here's my overall understanding of the patch:
a. The AM sends a SchedulingRequest.
b. The PlacementProcessor intercepts the AM's request, sends SchedulingRequests to the Algorithm (b1), and sets RejectedRequests (b2) in the AM response.
b1. The Algorithm generates placed requests (proposals) and rejected requests, and the PlacementProcessor processes the responses from the Algorithm.

In general, I think the workflow looks good: the PlacementProcessor is the true global scheduler, the Algorithm is the pluggable module, and common logic (such as how to handle proposals / rejected requests) is handled by the PlacementProcessor itself. There are a couple of high-level suggestions regarding the API design:
1. The Algorithm interface is not a true global scheduler: the BatchedRequests contain requests for only one app. Ideally, we should be able to assign containers based on requests from multiple apps, correct?
2. In addition, it is better not to pass the tagsManager / constraintsManager / node selector to the Algorithm. Instead, we should have a separate init method in the Algorithm API to store these util classes.
3. The interaction between the Algorithm and the PlacementProcessor is too tightly coupled: the Processor calls the Algorithm once and gets a response that includes the attempt id, etc. (I'm not quite sure why it needs the attempt id, could you explain a bit?)

I would suggest considering the following Algorithm API option:
1. The Algorithm holds references to scheduler state (including the scheduler itself, AllocationTagsManager, ConstraintsManager, etc.).
2. The Algorithm makes decisions purely based on scheduler state (no additional inputs).
3. The result of the Algorithm can be pulled by the Processor, or alternatively the Algorithm notifies the Processor.

I would prefer the latter, in which the Algorithm holds a PlacementResultNotifier reference (passed in the init method) and invokes it whenever a placement or reject decision is made.

> Add Placement Processor and planner framework
> Key: YARN-7612
> URL: https://issues.apache.org/jira/browse/YARN-7612
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Arun Suresh
> Assignee: Arun Suresh
> Attachments: YARN-7612-YARN-6592.001.patch, YARN-7612-YARN-6592.002.patch, YARN-7612-v2.wip.patch, YARN-7612.wip.patch
>
> This introduces a Placement Processor and a planning algorithm framework to handle placement constraints and scheduling requests from an app and place them on nodes.
> The actual planning algorithm(s) will be handled in YARN-7613.
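To illustrate the suggested decoupling, here is a minimal sketch of what a notifier-based Algorithm API could look like. All of the interfaces below (PlacementResultNotifier, PlacementAlgorithm, and the toy EverythingOnNode1 implementation) are hypothetical: they only follow the naming suggested in the comment and are not the actual YARN-7612 interfaces:

```java
import java.util.List;

// Callback handed to the algorithm at init time; the processor implements it
// and the algorithm pushes placement / rejection decisions through it
// whenever they are ready, instead of returning one synchronous response.
interface PlacementResultNotifier {
    void onPlaced(String schedulingRequestId, String nodeId);
    void onRejected(String schedulingRequestId, String reason);
}

interface PlacementAlgorithm {
    // Util classes (tags manager, constraints manager, ...) would also be
    // wired in here once, instead of being passed on every call.
    void init(PlacementResultNotifier notifier);

    // Decisions are made from scheduler state; results flow via the notifier.
    void place(List<String> schedulingRequestIds);
}

// Trivial algorithm that "places" everything on one node, to show the flow.
class EverythingOnNode1 implements PlacementAlgorithm {
    private PlacementResultNotifier notifier;

    public void init(PlacementResultNotifier notifier) {
        this.notifier = notifier;
    }

    public void place(List<String> ids) {
        for (String id : ids) {
            notifier.onPlaced(id, "node1");
        }
    }
}
```

With this shape, the processor no longer needs attempt ids in a response object; it aggregates whatever the notifier delivers, and the algorithm can batch requests across apps before deciding.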
[jira] [Updated] (YARN-7612) Add Placement Processor and planner framework
[ https://issues.apache.org/jira/browse/YARN-7612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-7612:
Attachment: YARN-7612-YARN-6592.002.patch

Updated patch: fixed some class names and added some documentation.
[jira] [Commented] (YARN-7630) Fix AMRMToken handling in AMRMProxy
[ https://issues.apache.org/jira/browse/YARN-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284544#comment-16284544 ] genericqa commented on YARN-7630:

(x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 16s | Docker mode activated. |
|| Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| trunk Compile Tests ||
| +1 | mvninstall | 16m 15s | trunk passed |
| +1 | compile | 0m 27s | trunk passed |
| +1 | checkstyle | 0m 15s | trunk passed |
| +1 | mvnsite | 0m 30s | trunk passed |
| +1 | shadedclient | 10m 35s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 48s | trunk passed |
| +1 | javadoc | 0m 20s | trunk passed |
|| Patch Compile Tests ||
| +1 | mvninstall | 0m 28s | the patch passed |
| +1 | compile | 0m 24s | the patch passed |
| +1 | javac | 0m 24s | the patch passed |
| +1 | checkstyle | 0m 12s | the patch passed |
| +1 | mvnsite | 0m 26s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 11m 5s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 55s | the patch passed |
| +1 | javadoc | 0m 19s | the patch passed |
|| Other Tests ||
| +1 | unit | 2m 0s | hadoop-yarn-server-common in the patch passed. |
| +1 | asflicense | 0m 20s | The patch does not generate ASF License warnings. |
| | | 49m 48s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7630 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12901329/YARN-7630.v1.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux de57ec5995ed 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 670e8d4 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/18851/testReport/ |
| Max. process+thread count | 335 (vs. ulimit of 5000) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/18851/console |
| Powered by | Apache Yetus 0.7.0-SNAPSHOT http://yetus.apache.org |
[jira] [Updated] (YARN-7605) Implement doAs for Api Service REST API
[ https://issues.apache.org/jira/browse/YARN-7605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-7605:
Attachment: YARN-7605.001.patch

- Added a server-side REST API check for Kerberos tokens
- Added a doAs call to perform operations

> Implement doAs for Api Service REST API
> Key: YARN-7605
> URL: https://issues.apache.org/jira/browse/YARN-7605
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Eric Yang
> Fix For: yarn-native-services
> Attachments: YARN-7605.001.patch
>
> In YARN-7540, all client entry points for the API service were centralized to use the REST API instead of making direct file system and resource manager RPC calls. This change helped centralize YARN metadata so it is owned by the yarn user, instead of crawling through every user's home directory to find metadata. The next step is to make sure "doAs" calls work properly for the API Service. The metadata is stored by the yarn user, but the actual workload still needs to be performed as the end user; hence the API service must authenticate the end user's Kerberos credential and perform a doAs call when requesting containers via ServiceClient.
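For reference, the doAs flow the description calls for has the same shape as the JDK's `Subject.doAs`, which Hadoop's `UserGroupInformation.doAs` follows: authenticate the end user, then run the privileged work under that identity. A minimal sketch, where `submitService` is a hypothetical stand-in for the real ServiceClient call:

```java
import java.security.PrivilegedExceptionAction;
import javax.security.auth.Subject;

class DoAsDemo {
    // Stand-in for the work that must run as the end user (in the real code
    // path this would talk to the ResourceManager via ServiceClient).
    static String submitService(String serviceName) {
        return "submitted " + serviceName;
    }

    public static void main(String[] args) throws Exception {
        // In practice the Subject is populated from the end user's Kerberos
        // credentials after the REST layer authenticates the request.
        Subject endUser = new Subject();
        String result = Subject.doAs(endUser,
            (PrivilegedExceptionAction<String>) () -> submitService("sleeper"));
        System.out.println(result); // submitted sleeper
    }
}
```

The key property is that the service process (running as yarn) performs the container request under the caller's identity, so YARN-side ACL and quota checks see the end user, not the service account.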
[jira] [Commented] (YARN-7064) Use cgroup to get container resource utilization
[ https://issues.apache.org/jira/browse/YARN-7064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284487#comment-16284487 ] genericqa commented on YARN-7064:

(x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 18s | Docker mode activated. |
|| Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
|| trunk Compile Tests ||
| 0 | mvndep | 0m 17s | Maven dependency ordering for branch |
| +1 | mvninstall | 17m 56s | trunk passed |
| +1 | compile | 14m 0s | trunk passed |
| +1 | checkstyle | 2m 9s | trunk passed |
| +1 | mvnsite | 3m 13s | trunk passed |
| +1 | shadedclient | 15m 58s | branch has no errors when building and testing our client artifacts. |
| -1 | findbugs | 1m 21s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in trunk has 1 extant Findbugs warnings. |
| +1 | javadoc | 2m 40s | trunk passed |
|| Patch Compile Tests ||
| 0 | mvndep | 0m 16s | Maven dependency ordering for patch |
| -1 | mvninstall | 0m 51s | hadoop-yarn-server-nodemanager in the patch failed. |
| +1 | compile | 12m 14s | the patch passed |
| +1 | javac | 12m 14s | the patch passed |
| +1 | checkstyle | 3m 0s | root: The patch generated 0 new + 262 unchanged - 3 fixed = 262 total (was 265) |
| +1 | mvnsite | 3m 7s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 9m 59s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 5m 26s | the patch passed |
| +1 | javadoc | 2m 34s | the patch passed |
|| Other Tests ||
| -1 | unit | 7m 46s | hadoop-common in the patch failed. |
| +1 | unit | 0m 41s | hadoop-yarn-api in the patch passed. |
| +1 | unit | 3m 3s | hadoop-yarn-common in the patch passed. |
| -1 | unit | 17m 15s | hadoop-yarn-server-nodemanager in the patch failed. |
| +1 | asflicense | 0m 33s | The patch does not generate ASF License warnings. |
| | | 127m 26s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.security.TestRaceWhenRelogin |
| | hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
[jira] [Assigned] (YARN-7512) Support service upgrade via YARN Service API and CLI
[ https://issues.apache.org/jira/browse/YARN-7512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandni Singh reassigned YARN-7512:
Assignee: Chandni Singh

> Support service upgrade via YARN Service API and CLI
> Key: YARN-7512
> URL: https://issues.apache.org/jira/browse/YARN-7512
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Gour Saha
> Assignee: Chandni Singh
> Fix For: yarn-native-services
>
> The YARN Service API and CLI need to support service (and container) upgrades, in line with what Slider supported in SLIDER-787 (http://slider.incubator.apache.org/docs/slider_specs/application_pkg_upgrade.html)
[jira] [Commented] (YARN-7512) Support service upgrade via YARN Service API and CLI
[ https://issues.apache.org/jira/browse/YARN-7512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284474#comment-16284474 ] Gour Saha commented on YARN-7512:
Need to use YARN feature YARN-4726 - Allocation reuse for application upgrades
[jira] [Updated] (YARN-7631) ResourceRequest with different Capacity (Resource) overrides each other in RM
[ https://issues.apache.org/jira/browse/YARN-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Botong Huang updated YARN-7631:
Attachment: resourcebug.patch

> ResourceRequest with different Capacity (Resource) overrides each other in RM
> Key: YARN-7631
> URL: https://issues.apache.org/jira/browse/YARN-7631
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Botong Huang
> Attachments: resourcebug.patch
>
> Today in AMRMClientImpl, the ResourceRequests (RRs) are keyed as: RequestId -> Priority -> ResourceName -> ExecutionType -> Resource (Capacity) -> ResourceRequestInfo (the actual RR). This means that only RRs with the same (requestId, priority, resourceName, executionType, resource) will be grouped and aggregated together.
> On the RM side, however, the mapping is SchedulerRequestKey (RequestId, priority) -> LocalityAppPlacementAllocator (ResourceName -> RR).
> The issue is that on the RM side, Resource is not part of the key to the RR at all. (ExecutionType is also not part of the RM-side key, but that is fine because the RM handles it separately as container update requests.) This means that under the same (requestId, priority, resourceName), RRs with different Resource values will be grouped together and override each other in the RM. As a result, some of the container requests are lost and will never be allocated.
> Furthermore, since the two RRs are kept under different keys on the AMRMClient side, allocation of RR1 will only trigger a cancel for RR1; the pending RR2 will not get resent either.
> I've attached a unit test (resourcebug.patch), which fails on trunk, to illustrate this issue.
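The key mismatch described above is easy to reproduce with plain maps. In this sketch the key types are simplified stand-ins introduced for illustration (not the real YARN classes): the client-side key includes the Resource, the RM-side key does not, so the second request silently overwrites the first on the RM side:

```java
import java.util.HashMap;
import java.util.Map;

class KeyMismatchDemo {
    // Client side keys requests by (requestId, priority, resourceName, resource).
    record ClientKey(long requestId, int priority, String resourceName,
                     String resource) {}
    // RM side keys them only by (requestId, priority, resourceName).
    record RmKey(long requestId, int priority, String resourceName) {}

    public static void main(String[] args) {
        Map<ClientKey, Integer> clientSide = new HashMap<>();
        Map<RmKey, Integer> rmSide = new HashMap<>();

        // Two requests: one container of <1GB,1core>, one of <2GB,2core>.
        clientSide.put(new ClientKey(0, 1, "*", "<1GB,1core>"), 1);
        clientSide.put(new ClientKey(0, 1, "*", "<2GB,2core>"), 1);

        // The RM-side key ignores the resource, so the second put overrides
        // the first and one of the requests is silently lost.
        rmSide.put(new RmKey(0, 1, "*"), 1);
        rmSide.put(new RmKey(0, 1, "*"), 1);

        System.out.println(clientSide.size()); // 2
        System.out.println(rmSide.size());     // 1
    }
}
```

This also shows why the client never resends RR2: a completed allocation only cancels the client-side entry whose key matched, while the RM has long since forgotten the overwritten one.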
[jira] [Updated] (YARN-7631) ResourceRequest with different Capacity (Resource) overrides each other in RM
[ https://issues.apache.org/jira/browse/YARN-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Botong Huang updated YARN-7631: --- Description: Today in AMRMClientImpl, the ResourceRequests (RRs) are kept as: RequestId -> Priority -> ResourceName -> ExecutionType -> Resource (Capacity) -> ResourceRequestInfo (the actual RR). This means that only RRs with the same (requestId, priority, resourceName, executionType, resource) will be grouped and aggregated together. On the RM side, however, the mapping is SchedulerRequestKey (RequestId, Priority) -> LocalityAppPlacementAllocator (ResourceName -> RR). The issue is that on the RM side, Resource is not part of the key to the RR at all. (ExecutionType is also absent from the RM-side key, but that is fine because the RM handles it separately as container update requests.) Consequently, under the same (requestId, priority, resourceName), RRs with different Resource values are grouped together and override each other in the RM. As a result, some container requests are lost and will never be allocated. Furthermore, since the two RRs are kept under different keys on the AMRMClient side, allocation of RR1 only triggers a cancel for RR1; the pending RR2 is never resent. I've attached a unit test (resourcebug.patch), failing in trunk, that illustrates this issue. (The previous description had the same content with different line wrapping.) > ResourceRequest with different Capacity (Resource) overrides each other in RM > - > > Key: YARN-7631 > URL: https://issues.apache.org/jira/browse/YARN-7631 > Project: Hadoop YARN > Issue Type: Bug > Reporter: Botong Huang > Attachments: resourcebug.patch
[jira] [Created] (YARN-7631) ResourceRequest with different Capacity (Resource) overrides each other in RM
Botong Huang created YARN-7631: -- Summary: ResourceRequest with different Capacity (Resource) overrides each other in RM Key: YARN-7631 URL: https://issues.apache.org/jira/browse/YARN-7631 Project: Hadoop YARN Issue Type: Bug Reporter: Botong Huang Today in AMRMClientImpl, the ResourceRequests (RRs) are kept as: RequestId -> Priority -> ResourceName -> ExecutionType -> Resource (Capacity) -> ResourceRequestInfo (the actual RR). This means that only RRs with the same (requestId, priority, resourceName, executionType, resource) will be grouped and aggregated together. On the RM side, however, the mapping is SchedulerRequestKey (RequestId, Priority) -> LocalityAppPlacementAllocator (ResourceName -> RR). The issue is that on the RM side, Resource is not part of the key to the RR at all. (ExecutionType is also absent from the RM-side key, but that is fine because the RM handles it separately as container update requests.) Consequently, under the same (requestId, priority, resourceName), RRs with different Resource values are grouped together and override each other in the RM. As a result, some container requests are lost and will never be allocated. Furthermore, since the two RRs are kept under different keys on the AMRMClient side, allocation of RR1 only triggers a cancel for RR1; the pending RR2 is never resent. I've attached a unit test (resourcebug.patch), failing in trunk, that illustrates this issue.
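The key mismatch in YARN-7631 can be illustrated with two plain maps. This is a minimal sketch, not the real YARN classes: `AmKey` and `RmKey` are hypothetical stand-ins whose only purpose is to show that the AM-side key includes the Resource (capacity) while the RM-side key does not, so two requests that differ only in Resource collapse into one entry on the RM side.

```java
import java.util.HashMap;
import java.util.Map;

public class KeyMismatchDemo {
    // AM side keys by (requestId, priority, resourceName, executionType, resource).
    record AmKey(long requestId, int priority, String resourceName,
                 String executionType, String resource) {}
    // RM side drops executionType AND resource from the key.
    record RmKey(long requestId, int priority, String resourceName) {}

    // RR1 asks for <2048MB,1vcore>, RR2 for <4096MB,2vcores>; the other
    // key components are identical.
    static int amSideEntries() {
        Map<AmKey, String> amSide = new HashMap<>();
        amSide.put(new AmKey(1, 0, "*", "GUARANTEED", "<2048,1>"), "RR1");
        amSide.put(new AmKey(1, 0, "*", "GUARANTEED", "<4096,2>"), "RR2");
        return amSide.size(); // two distinct requests on the AM side
    }

    static int rmSideEntries() {
        Map<RmKey, String> rmSide = new HashMap<>();
        rmSide.put(new RmKey(1, 0, "*"), "RR1");
        rmSide.put(new RmKey(1, 0, "*"), "RR2"); // same key: overrides RR1
        return rmSide.size(); // one entry: RR1 is silently lost
    }

    public static void main(String[] args) {
        System.out.println("AM side: " + amSideEntries() + " requests");
        System.out.println("RM side: " + rmSideEntries() + " request(s)");
    }
}
```

Because the RM-side put with an equal key replaces the earlier value, one of the two pending requests disappears without any error, matching the "lost and never allocated" symptom in the report.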
[jira] [Updated] (YARN-7630) Fix AMRMToken handling in AMRMProxy
[ https://issues.apache.org/jira/browse/YARN-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Botong Huang updated YARN-7630: --- Attachment: YARN-7630.v1.patch > Fix AMRMToken handling in AMRMProxy > --- > > Key: YARN-7630 > URL: https://issues.apache.org/jira/browse/YARN-7630 > Project: Hadoop YARN > Issue Type: Sub-task > Reporter: Botong Huang > Assignee: Botong Huang > Priority: Minor > Attachments: YARN-7630.v1.patch > > > Symptom: after the RM rolls over the master key for the AMRMToken, whenever the RPC connection from FederationInterceptor to the RM breaks due to a transient network issue and reconnects, heartbeats to the RM start failing with an "Invalid AMRMToken" exception. Whenever it hits, it happens for both the home RM and the secondary RMs. > Related facts: > 1. When the RM issues a new AMRMToken, it always sends it with the service name field set to an empty string. The RPC layer on the AM side sets it properly before starting to use it. > 2. The UGI keeps all tokens in a map from serviceName -> Token. Initially, AMRMClientUtils.createRMProxy() is used to load the first token and start the RM connection. > 3. When the RM renews the token, YarnServerSecurityUtils.updateAMRMToken() is used to load it into the UGI and replace the existing token (under the same serviceName key). > Bug: 2 (AMRMClientUtils.createRMProxy()) and 3 (YarnServerSecurityUtils.updateAMRMToken()) do not handle this sequence consistently. We always need to load the token (with the empty service name) into the UGI first, before setting the serviceName, so that the previous AMRMToken is overridden; but 2 does it in the reverse order. That is why, after the RM rolls the AMRMToken, the UGI ends up with two tokens. Whenever the RPC connection breaks and reconnects, the wrong token can be picked, triggering the exception. > Fix: load the AMRMToken into the UGI first, then update the service name field for RPC.
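The ordering bug described above can be modeled with a map whose key is captured from the token's service field at insertion time. This is a hedged sketch under that assumption, not the real UGI/Credentials API: if every writer inserts the token while its service is still empty and only then renames it, each new AMRMToken replaces the previous one under the same alias; if one writer sets the service before inserting, the token lands under a different key and two AMRMTokens coexist.

```java
import java.util.HashMap;
import java.util.Map;

public class TokenOrderDemo {
    // Simplified stand-in for a delegation token; not the Hadoop Token class.
    static class Token {
        String service = ""; // the RM issues AMRMTokens with an empty service
        final String secret;
        Token(String secret) { this.secret = secret; }
    }

    // Both components follow the load-first convention: the second put()
    // hits the same "" alias and replaces the rolled-over token.
    static Map<String, Token> consistentOrder() {
        Map<String, Token> ugi = new HashMap<>();
        Token t1 = new Token("master-key-v1");
        ugi.put(t1.service, t1);      // load while service is still ""
        t1.service = "rm-host:8030";  // then set the service for RPC
        Token t2 = new Token("master-key-v2");
        ugi.put(t2.service, t2);      // replaces t1 under the "" alias
        t2.service = "rm-host:8030";
        return ugi;
    }

    // One component sets the service before loading: its token is stored
    // under a different key, so the renewed token does not replace it.
    static Map<String, Token> reversedOrder() {
        Map<String, Token> ugi = new HashMap<>();
        Token t1 = new Token("master-key-v1");
        t1.service = "rm-host:8030";  // service set BEFORE loading
        ugi.put(t1.service, t1);
        Token t2 = new Token("master-key-v2");
        ugi.put(t2.service, t2);      // lands under "": two tokens now
        t2.service = "rm-host:8030";
        return ugi;
    }

    public static void main(String[] args) {
        System.out.println("consistent: " + consistentOrder().size() + " token(s)");
        System.out.println("reversed:   " + reversedOrder().size() + " token(s)");
    }
}
```

With two tokens carrying the same service name, a reconnect can pick the stale one nondeterministically, which matches the intermittent "Invalid AMRMToken" symptom.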
[jira] [Commented] (YARN-7630) Fix AMRMToken handling in AMRMProxy
[ https://issues.apache.org/jira/browse/YARN-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284460#comment-16284460 ] genericqa commented on YARN-7630: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 9m 51s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 41s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 33s{color} | {color:red} hadoop-yarn-server-common in the patch failed. {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 19s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 57s{color} | {color:green} hadoop-yarn-server-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 58m 28s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7630 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12901324/YARN-7630.v1.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux edb737455678 3.13.0-133-generic #182-Ubuntu SMP Tue Sep 19 15:49:21 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 04b84da | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | mvninstall | https://builds.apache.org/job/PreCommit-YARN-Build/18849/artifact/out/patch-mvninstall-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/18849/testReport/ | | Max. process+thread count | 330 (vs. ulimit of 5000) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common U:
[jira] [Commented] (YARN-7628) [Documentation] Documenting the ability to disable elasticity at leaf queue
[ https://issues.apache.org/jira/browse/YARN-7628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284459#comment-16284459 ] genericqa commented on YARN-7628: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 30s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 23m 34s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 51s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 34m 55s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7628 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12901327/YARN-7628.wip.1.patch | | Optional Tests | asflicense mvnsite | | uname | Linux 77394ac7621e 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 670e8d4 | | maven | version: Apache Maven 3.3.9 | | Max. process+thread count | 440 (vs. ulimit of 5000) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/18850/console | | Powered by | Apache Yetus 0.7.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > [Documentation] Documenting the ability to disable elasticity at leaf queue > --- > > Key: YARN-7628 > URL: https://issues.apache.org/jira/browse/YARN-7628 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Zian Chen >Assignee: Zian Chen > Attachments: YARN-7628.wip.1.patch > > > Update documentation after YARN-7274 gets in. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7540) Convert yarn app cli to call yarn api services
[ https://issues.apache.org/jira/browse/YARN-7540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284456#comment-16284456 ] Eric Yang commented on YARN-7540: - The failed unit test is not related to this JIRA. > Convert yarn app cli to call yarn api services > -- > > Key: YARN-7540 > URL: https://issues.apache.org/jira/browse/YARN-7540 > Project: Hadoop YARN > Issue Type: Sub-task > Reporter: Eric Yang > Assignee: Eric Yang > Fix For: yarn-native-services > > Attachments: YARN-7540.001.patch, YARN-7540.002.patch, YARN-7540.003.patch, YARN-7540.004.patch > > > For a YARN docker application, launching through the CLI works differently from launching through the REST API. All applications launched through the REST API are currently stored in the yarn user's HDFS home directory, while applications managed through the CLI are stored in individual users' HDFS home directories. For consistency, we want the yarn app CLI to interact with the API service to manage applications. For performance, it is easier to list all applications from one user's home directory than to crawl all users' home directories. For security, it is safer to access only one user's home directory instead of all of them. Given the reasons above, the proposal is to change how {{yarn app -launch}}, {{yarn app -list}} and {{yarn app -destroy}} work: instead of calling the HDFS API and RM API to launch containers, the CLI will call the API service REST API residing in the RM. The RM performs the persistence and the operations to launch the actual application.
[jira] [Commented] (YARN-6704) Add support for work preserving NM restart when FederationInterceptor is enabled in AMRMProxyService
[ https://issues.apache.org/jira/browse/YARN-6704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284449#comment-16284449 ] Hudson commented on YARN-6704: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13351 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13351/]) YARN-6704. Add support for work preserving NM restart when (subru: rev 670e8d4ec7e71fc3b054cd3b2826f869b649a788) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/amrmproxy/BaseAMRMProxyTest.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/amrmproxy/AMRMProxyService.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/amrmproxy/TestFederationInterceptor.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/server/MockResourceManagerFacade.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/amrmproxy/TestableFederationInterceptor.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/amrmproxy/FederationInterceptor.java > Add support for work preserving NM restart when FederationInterceptor is > enabled in AMRMProxyService > > > Key: YARN-6704 > URL: https://issues.apache.org/jira/browse/YARN-6704 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Botong Huang >Assignee: Botong Huang > Fix For: 3.1.0, 2.10.0, 2.9.1 > > Attachments: YARN-6704-YARN-2915.v1.patch, > YARN-6704-YARN-2915.v2.patch, YARN-6704.v3.patch, YARN-6704.v4.patch, > YARN-6704.v5.patch, YARN-6704.v6.patch, YARN-6704.v7.patch, > 
YARN-6704.v8.patch, YARN-6704.v9.patch > > > YARN-1336 added the ability to restart the NM without losing any running containers. {{AMRMProxy}} restart was added in YARN-6127. In a federated YARN environment, there is additional state in the {{FederationInterceptor}} to allow spanning across multiple sub-clusters, so we need to enhance {{FederationInterceptor}} to support work-preserving restart.
[jira] [Updated] (YARN-7511) NPE in ContainerLocalizer when localization failed for running container
[ https://issues.apache.org/jira/browse/YARN-7511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subru Krishnan updated YARN-7511: - Target Version/s: 3.1.0, 2.9.1 > NPE in ContainerLocalizer when localization failed for running container > > > Key: YARN-7511 > URL: https://issues.apache.org/jira/browse/YARN-7511 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.0.0-alpha4, 2.9.1 >Reporter: Tao Yang >Assignee: Tao Yang > Attachments: YARN-7511.001.patch > > > Error log: > {noformat} > 2017-09-30 20:14:32,839 FATAL [AsyncDispatcher event handler] > org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread > java.lang.NullPointerException > at > java.util.concurrent.ConcurrentHashMap.replaceNode(ConcurrentHashMap.java:1106) > at > java.util.concurrent.ConcurrentHashMap.remove(ConcurrentHashMap.java:1097) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceSet.resourceLocalizationFailed(ResourceSet.java:151) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl$ResourceLocalizationFailedWhileRunningTransition.transition(ContainerImpl.java:821) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl$ResourceLocalizationFailedWhileRunningTransition.transition(ContainerImpl.java:813) > at > org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362) > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:1335) > at > 
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:95) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:1372) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:1365) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110) > at java.lang.Thread.run(Thread.java:834) > 2017-09-30 20:14:32,842 INFO [AsyncDispatcher ShutDown handler] > org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye.. > {noformat} > To reproduce this problem: > 1. The container was running and ContainerManagerImpl#localize was called for it. > 2. Localization failed in ResourceLocalizationService$LocalizerRunner#run, which sent out a ContainerResourceFailedEvent with a null LocalResourceRequest. > 3. NPE in ResourceLocalizationFailedWhileRunningTransition#transition --> container.resourceSet.resourceLocalizationFailed(null) > We can fix this problem by ensuring that the request is not null before removing it.
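The root cause above is that `ConcurrentHashMap` rejects null keys: calling `remove(null)` throws a NullPointerException, which is exactly the frame at the top of the stack trace. The sketch below shows the proposed guard; the method and map names are illustrative stand-ins, not the actual `ResourceSet` code.

```java
import java.util.concurrent.ConcurrentHashMap;

public class NullRemoveDemo {
    // Stand-in for the pending-resources map kept per container.
    static final ConcurrentHashMap<String, String> pending = new ConcurrentHashMap<>();

    // Returns true when a (non-null) request was processed.
    static boolean resourceLocalizationFailed(String request) {
        if (request == null) {
            // Proposed fix: a failure event may carry a null
            // LocalResourceRequest; skip it instead of calling
            // pending.remove(null), which throws NPE.
            return false;
        }
        pending.remove(request); // safe: key is known to be non-null
        return true;
    }

    public static void main(String[] args) {
        pending.put("hdfs://ns/res1", "PENDING");
        System.out.println(resourceLocalizationFailed(null));          // guarded, no NPE
        System.out.println(resourceLocalizationFailed("hdfs://ns/res1"));
    }
}
```

Without the guard, the NPE escapes into the AsyncDispatcher thread and brings the whole dispatcher down, as the "Exiting, bbye.." log line shows.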
[jira] [Updated] (YARN-7508) NPE in FiCaSchedulerApp when debug log enabled in async-scheduling mode
[ https://issues.apache.org/jira/browse/YARN-7508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subru Krishnan updated YARN-7508: - Target Version/s: 3.1.0, 2.9.1 > NPE in FiCaSchedulerApp when debug log enabled in async-scheduling mode > --- > > Key: YARN-7508 > URL: https://issues.apache.org/jira/browse/YARN-7508 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler > Affects Versions: 2.9.0, 3.0.0-alpha4 > Reporter: Tao Yang > Assignee: Tao Yang > Attachments: YARN-7508.001.patch > > > YARN-6678 fixed the IllegalStateException problem, but the debug log it added may cause an NPE when trying to print the containerId of a non-existent reserved container on the node. Replacing {{schedulerContainer.getSchedulerNode().getReservedContainer().getContainerId()}} with {{schedulerContainer.getSchedulerNode().getReservedContainer()}} fixes this problem.
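The one-token fix above works because string concatenation renders a null reference as the literal "null", whereas chaining a method call onto the null reference throws. A minimal sketch, with simplified stand-in classes rather than the real scheduler types:

```java
public class DebugLogDemo {
    static class Container {
        final int id;
        Container(int id) { this.id = id; }
        int getContainerId() { return id; }
        @Override public String toString() { return "container_" + id; }
    }

    static class Node {
        Container reserved; // null when nothing is reserved on this node
        Container getReservedContainer() { return reserved; }
    }

    // Buggy pattern: NPEs whenever no container is reserved.
    static String unsafeLog(Node n) {
        return "reserved=" + n.getReservedContainer().getContainerId();
    }

    // Fixed pattern: concatenation renders null safely as "null".
    static String safeLog(Node n) {
        return "reserved=" + n.getReservedContainer();
    }

    public static void main(String[] args) {
        Node empty = new Node();
        System.out.println(safeLog(empty)); // prints reserved=null
        try {
            unsafeLog(empty);
        } catch (NullPointerException e) {
            System.out.println("NPE from chained call on null reservation");
        }
    }
}
```

The fix matters in async-scheduling mode because the reservation can be released by another thread between the check that produced the log statement and the log statement itself.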
[jira] [Commented] (YARN-7540) Convert yarn app cli to call yarn api services
[ https://issues.apache.org/jira/browse/YARN-7540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284443#comment-16284443 ] genericqa commented on YARN-7540: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 5 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 1s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 39s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 49s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 1s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 2 new + 55 unchanged - 0 fixed = 57 total (was 55) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 4s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 12s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 20m 56s{color} | {color:red} hadoop-yarn-client in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 55s{color} | {color:green} hadoop-yarn-services-core in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 36s{color} | {color:green} hadoop-yarn-services-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 30s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 95m 8s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.client.api.impl.TestAMRMClientOnRMRestart | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7540 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12901314/YARN-7540.004.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle | | uname | Linux
[jira] [Commented] (YARN-6704) Add support for work preserving NM restart when FederationInterceptor is enabled in AMRMProxyService
[ https://issues.apache.org/jira/browse/YARN-6704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284440#comment-16284440 ] Botong Huang commented on YARN-6704: Thanks [~subru]! > Add support for work preserving NM restart when FederationInterceptor is > enabled in AMRMProxyService > > > Key: YARN-6704 > URL: https://issues.apache.org/jira/browse/YARN-6704 > Project: Hadoop YARN > Issue Type: Sub-task > Reporter: Botong Huang > Assignee: Botong Huang > Fix For: 3.1.0, 2.10.0, 2.9.1 > > Attachments: YARN-6704-YARN-2915.v1.patch, YARN-6704-YARN-2915.v2.patch, YARN-6704.v3.patch, YARN-6704.v4.patch, YARN-6704.v5.patch, YARN-6704.v6.patch, YARN-6704.v7.patch, YARN-6704.v8.patch, YARN-6704.v9.patch > > > YARN-1336 added the ability to restart the NM without losing any running containers. {{AMRMProxy}} restart was added in YARN-6127. In a federated YARN environment, there is additional state in the {{FederationInterceptor}} to allow spanning across multiple sub-clusters, so we need to enhance {{FederationInterceptor}} to support work-preserving restart.
[jira] [Commented] (YARN-7591) NPE in async-scheduling mode of CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-7591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284441#comment-16284441 ] Subru Krishnan commented on YARN-7591: -- Thanks [~Tao Yang] for the contribution and [~leftnoteasy] for the review/commit. [~leftnoteasy], I see the commit in trunk but not in branch-2/2.9, so are you planning to cherry-pick it down? > NPE in async-scheduling mode of CapacityScheduler > - > > Key: YARN-7591 > URL: https://issues.apache.org/jira/browse/YARN-7591 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.0.0-alpha4, 2.9.1 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Critical > Attachments: YARN-7591.001.patch, YARN-7591.002.patch > > > Currently in async-scheduling mode of CapacityScheduler, an NPE may be raised > in the special scenarios below. > (1) A user should be removed after its last application finishes; an NPE may > be raised if something is read from the user object without a null check in > async-scheduling threads. > (2) An NPE may be raised when trying to fulfill a reservation for a finished > application in {{CapacityScheduler#allocateContainerOnSingleNode}}. > {code} > RMContainer reservedContainer = node.getReservedContainer(); > if (reservedContainer != null) { > FiCaSchedulerApp reservedApplication = getCurrentAttemptForContainer( > reservedContainer.getContainerId()); > // NPE here: reservedApplication could be null after this application > finished > // Try to fulfill the reservation > LOG.info( > "Trying to fulfill reservation for application " + > reservedApplication > .getApplicationId() + " on node: " + node.getNodeID()); > {code} > (3) If proposal1 (allocate containerX on node1) and proposal2 (reserve > containerY on node1) were generated by different async-scheduling threads > around the same time and proposal2 was submitted ahead of proposal1, an NPE > is raised when trying to submit proposal2 in > {{FiCaSchedulerApp#commonCheckContainerAllocation}}. 
> {code} > if (reservedContainerOnNode != null) { > // NPE here: allocation.getAllocateFromReservedContainer() should be > null for proposal2 in this case > RMContainer fromReservedContainer = > allocation.getAllocateFromReservedContainer().getRmContainer(); > if (fromReservedContainer != reservedContainerOnNode) { > if (LOG.isDebugEnabled()) { > LOG.debug( > "Try to allocate from a non-existed reserved container"); > } > return false; > } > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
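The defensive fix for scenarios like (2) is essentially a null guard before the dereference. Below is a minimal, self-contained sketch of that pattern; the names mirror the snippet above but the class and method are invented for illustration, and this is not the actual YARN patch.

```java
// Illustrative null-guard pattern for scenario (2); not the actual patch.
public class NullGuardSketch {

    // reservedApplication can become null once the application finishes,
    // so guard before dereferencing it instead of logging unconditionally.
    static String tryFulfillReservation(String reservedApplication, String nodeId) {
        if (reservedApplication == null) {
            return "skip: reserved application already finished on " + nodeId;
        }
        return "fulfill reservation for " + reservedApplication + " on " + nodeId;
    }

    public static void main(String[] args) {
        System.out.println(tryFulfillReservation(null, "node1"));
        System.out.println(tryFulfillReservation("application_1", "node1"));
    }
}
```

The same shape applies to scenario (3): check `allocation.getAllocateFromReservedContainer()` for null before calling through it, and bail out of the proposal instead of throwing.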
[jira] [Updated] (YARN-6704) Add support for work preserving NM restart when FederationInterceptor is enabled in AMRMProxyService
[ https://issues.apache.org/jira/browse/YARN-6704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subru Krishnan updated YARN-6704: - Target Version/s: (was: 3.1.0, 2.9.1) > Add support for work preserving NM restart when FederationInterceptor is > enabled in AMRMProxyService > > > Key: YARN-6704 > URL: https://issues.apache.org/jira/browse/YARN-6704 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Botong Huang >Assignee: Botong Huang > Fix For: 3.1.0, 2.10.0, 2.9.1 > > Attachments: YARN-6704-YARN-2915.v1.patch, > YARN-6704-YARN-2915.v2.patch, YARN-6704.v3.patch, YARN-6704.v4.patch, > YARN-6704.v5.patch, YARN-6704.v6.patch, YARN-6704.v7.patch, > YARN-6704.v8.patch, YARN-6704.v9.patch > > > YARN-1336 added the ability to restart the NM without losing any running > containers. {{AMRMProxy}} restart was added in YARN-6127. In a Federated YARN > environment, there's additional state in the {{FederationInterceptor}} to > allow for spanning across multiple sub-clusters, so we need to enhance > {{FederationInterceptor}} to support work-preserving restart. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7443) Add native FPGA module support to do isolation with cgroups
[ https://issues.apache.org/jira/browse/YARN-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284431#comment-16284431 ] Hudson commented on YARN-7443: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13350 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13350/]) YARN-7443. Add native FPGA module support to do isolation with cgroups. (wangda: rev 04b84da2456fb8c716e728b16db4851e2e87ec25) * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/modules/fpga/fpga-module.c * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/CMakeLists.txt * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/modules/fpga/test-fpga-module.cc * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/modules/fpga/fpga-module.h * (edit) hadoop-yarn-project/hadoop-yarn/conf/container-executor.cfg > Add native FPGA module support to do isolation with cgroups > --- > > Key: YARN-7443 > URL: https://issues.apache.org/jira/browse/YARN-7443 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Zhankun Tang >Assignee: Zhankun Tang > Fix For: 3.1.0 > > Attachments: YARN-7443-trunk.001.patch, YARN-7443-trunk.002.patch, > YARN-7443-trunk.003.patch, YARN-7443-trunk.004.patch > > > Only supports devices with a single configured major number in c-e.cfg for > now, so it is almost the same as the GPU native module. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7420) YARN UI changes to depict auto created queues
[ https://issues.apache.org/jira/browse/YARN-7420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284428#comment-16284428 ] Hudson commented on YARN-7420: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13350 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13350/]) YARN-7420. YARN UI changes to depict auto created queues. (Suma (wangda: rev f548bfffbdcd426811352d6920ee5fe50cd0182c) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerLeafQueueInfo.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerQueueInfo.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/CapacitySchedulerPage.java > YARN UI changes to depict auto created queues > -- > > Key: YARN-7420 > URL: https://issues.apache.org/jira/browse/YARN-7420 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Suma Shivaprasad >Assignee: Suma Shivaprasad > Fix For: 3.1.0 > > Attachments: ScreenShot_AutoCreated_Queues_Legend_color.png, > ScreenShot_Zero_capacity_queues_running_app.png, YARN-7420.1.patch, > YARN-7420.2.patch > > > Auto created queues will be depicted in a different color to indicate they > have been auto created and for easier distinction from manually > pre-configured queues. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7591) NPE in async-scheduling mode of CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-7591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284430#comment-16284430 ] Hudson commented on YARN-7591: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13350 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13350/]) YARN-7591. NPE in async-scheduling mode of CapacityScheduler. (Tao Yang (wangda: rev adca1a72e4eca2ea634551e9fb8e9b878c36cb5c) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java > NPE in async-scheduling mode of CapacityScheduler > - > > Key: YARN-7591 > URL: https://issues.apache.org/jira/browse/YARN-7591 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.0.0-alpha4, 2.9.1 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Critical > Attachments: YARN-7591.001.patch, YARN-7591.002.patch > > > Currently in async-scheduling mode of CapacityScheduler, an NPE may be raised > in the special scenarios below. > (1) A user should be removed after its last application finishes; an NPE may > be raised if something is read from the user object without a null check in > async-scheduling threads. > (2) An NPE may be raised when trying to fulfill a reservation for a finished > application in {{CapacityScheduler#allocateContainerOnSingleNode}}. 
> {code} > RMContainer reservedContainer = node.getReservedContainer(); > if (reservedContainer != null) { > FiCaSchedulerApp reservedApplication = getCurrentAttemptForContainer( > reservedContainer.getContainerId()); > // NPE here: reservedApplication could be null after this application > finished > // Try to fulfill the reservation > LOG.info( > "Trying to fulfill reservation for application " + > reservedApplication > .getApplicationId() + " on node: " + node.getNodeID()); > {code} > (3) If proposal1 (allocate containerX on node1) and proposal2 (reserve > containerY on node1) were generated by different async-scheduling threads > around the same time and proposal2 was submitted in front of proposal1, NPE > is raised when trying to submit proposal2 in > {{FiCaSchedulerApp#commonCheckContainerAllocation}}. > {code} > if (reservedContainerOnNode != null) { > // NPE here: allocation.getAllocateFromReservedContainer() should be > null for proposal2 in this case > RMContainer fromReservedContainer = > allocation.getAllocateFromReservedContainer().getRmContainer(); > if (fromReservedContainer != reservedContainerOnNode) { > if (LOG.isDebugEnabled()) { > LOG.debug( > "Try to allocate from a non-existed reserved container"); > } > return false; > } > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7473) Implement Framework and policy for capacity management of auto created queues
[ https://issues.apache.org/jira/browse/YARN-7473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284427#comment-16284427 ] Hudson commented on YARN-7473: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13350 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13350/]) YARN-7473. Implement Framework and policy for capacity management of (wangda: rev b38643c9a8dd2c53024ae830b9565a550d0ec39c) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/TempQueuePerPartition.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/YarnScheduler.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/ProportionalCapacityPreemptionPolicy.java * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerAutoCreatedQueueBase.java * (delete) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestAutoCreatedLeafQueue.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerDynamicBehavior.java * (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/SchedulerEventType.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerContext.java * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/QueueManagementChangeEvent.java * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AutoCreatedLeafQueueConfig.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/QueueEntitlement.java * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/QueueManagementChange.java * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AutoCreatedQueueManagementPolicy.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java * (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractAutoCreatedLeafQueue.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/CapacitySchedulerPlanFollower.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerAutoQueueCreation.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueue.java * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestGuaranteedOrZeroCapacityOverTimePolicy.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueueUtils.java * (edit)
[jira] [Commented] (YARN-7520) Queue Ordering policy changes for ordering auto created leaf queues within Managed parent Queues
[ https://issues.apache.org/jira/browse/YARN-7520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284429#comment-16284429 ] Hudson commented on YARN-7520: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13350 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13350/]) YARN-7520. Queue Ordering policy changes for ordering auto created leaf (wangda: rev a8316df8c05a7b3d1a5577174b838711a49ef971) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/policy/TestPriorityUtilizationQueueOrderingPolicy.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/policy/PriorityUtilizationQueueOrderingPolicy.java > Queue Ordering policy changes for ordering auto created leaf queues within > Managed parent Queues > > > Key: YARN-7520 > URL: https://issues.apache.org/jira/browse/YARN-7520 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Suma Shivaprasad >Assignee: Suma Shivaprasad > Fix For: 3.1.0 > > Attachments: YARN-7520.1.patch, YARN-7520.2.patch, YARN-7520.3.patch, > YARN-7520.4.patch, YARN-7520.5.patch, YARN-7520.6.patch > > > Queue Ordering policy currently uses priority, utilization and absolute > capacity for pre-configured parent queues to order leaf queues while > assigning containers. It needs modifications for auto created leaf queues > since they can have zero capacity -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7628) [Documentation] Documenting the ability to disable elasticity at leaf queue
[ https://issues.apache.org/jira/browse/YARN-7628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284425#comment-16284425 ] Wangda Tan commented on YARN-7628: -- Thanks [~Zian Chen], you don't have to mark it as WIP unless it is an unfinished feature (without tests, cannot compile, not functional, etc.). I suggest keeping it simple: just remove {{Defaults to -1 which disables it.}} from the original doc, and state that 1) the value is between 0 and 100, and 2) the admin needs to make sure absolute maximum capacity >= absolute capacity for each queue. Also, what's the behavior of specifying "-1" for maximum capacity? We should document that as well. > [Documentation] Documenting the ability to disable elasticity at leaf queue > --- > > Key: YARN-7628 > URL: https://issues.apache.org/jira/browse/YARN-7628 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Zian Chen >Assignee: Zian Chen > Attachments: YARN-7628.wip.1.patch > > > Update documentation after YARN-7274 gets in. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7274) Ability to disable elasticity at leaf queue level
[ https://issues.apache.org/jira/browse/YARN-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284426#comment-16284426 ] Hudson commented on YARN-7274: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13350 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13350/]) YARN-7274. Ability to disable elasticity at leaf queue level. (Zian Chen (wangda: rev 74665e3a7d7f05644d9a5abad5a3f2d47597d6c8) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestNodeLabelContainerAllocation.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueueUtils.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestQueueParsing.java > Ability to disable elasticity at leaf queue level > - > > Key: YARN-7274 > URL: https://issues.apache.org/jira/browse/YARN-7274 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Reporter: Scott Brokaw >Assignee: Zian Chen > Fix For: 3.1.0 > > Attachments: YARN-7274.2.patch, YARN-7274.wip.1.patch > > > The > [documentation|https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html] > defines yarn.scheduler.capacity..maximum-capacity as "Maximum > queue capacity in percentage (%) as a float. This limits the elasticity for > applications in the queue. Defaults to -1 which disables it." > However, setting this value to -1 sets maximum capacity to 100% but I thought > (perhaps incorrectly) that the intention of the -1 setting is that it would > disable elasticity. 
This is confirmed looking at the code: > {code:java} > public static final float MAXIMUM_CAPACITY_VALUE = 100; > public static final float DEFAULT_MAXIMUM_CAPACITY_VALUE = -1.0f; > .. > maxCapacity = (maxCapacity == DEFAULT_MAXIMUM_CAPACITY_VALUE) ? > MAXIMUM_CAPACITY_VALUE : maxCapacity; > {code} > The sum of yarn.scheduler.capacity..capacity for all queues, at > each level, must be equal to 100 but for > yarn.scheduler.capacity..maximum-capacity this value is actually > a percentage of the entire cluster not just the parent queue. Yet it cannot > be set lower than the leaf queue's capacity setting. This seems to make it > impossible to disable elasticity at a leaf queue level. > This improvement is proposing that YARN have the ability to have elasticity > disabled at a leaf queue level even if a parent queue permits elasticity by > having a yarn.scheduler.capacity..maximum-capacity greater than > its yarn.scheduler.capacity..capacity -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
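The reported behavior can be exercised in isolation: -1 is rewritten to 100% (full elasticity), not "disabled". The two constants and the ternary are copied from the snippet quoted above; the surrounding class and method name are invented for illustration only.

```java
// Stand-alone reproduction of the normalization quoted above; only the two
// constants and the ternary come from the snippet, the wrapper is illustrative.
public class MaxCapacitySketch {
    static final float MAXIMUM_CAPACITY_VALUE = 100;
    static final float DEFAULT_MAXIMUM_CAPACITY_VALUE = -1.0f;

    static float normalize(float maxCapacity) {
        // -1 does not disable elasticity: it is mapped to 100% of the cluster.
        return (maxCapacity == DEFAULT_MAXIMUM_CAPACITY_VALUE)
            ? MAXIMUM_CAPACITY_VALUE : maxCapacity;
    }

    public static void main(String[] args) {
        System.out.println(normalize(-1.0f)); // prints 100.0
        System.out.println(normalize(40.0f)); // prints 40.0
    }
}
```

This makes the documentation bug concrete: the only way to cap elasticity is an explicit maximum-capacity value, and that value cannot be set below the leaf queue's own capacity.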
[jira] [Updated] (YARN-6704) Add support for work preserving NM restart when FederationInterceptor is enabled in AMRMProxyService
[ https://issues.apache.org/jira/browse/YARN-6704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subru Krishnan updated YARN-6704: - Summary: Add support for work preserving NM restart when FederationInterceptor is enabled in AMRMProxyService (was: Add Federation Interceptor restart when work preserving NM is enabled) > Add support for work preserving NM restart when FederationInterceptor is > enabled in AMRMProxyService > > > Key: YARN-6704 > URL: https://issues.apache.org/jira/browse/YARN-6704 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Botong Huang >Assignee: Botong Huang > Attachments: YARN-6704-YARN-2915.v1.patch, > YARN-6704-YARN-2915.v2.patch, YARN-6704.v3.patch, YARN-6704.v4.patch, > YARN-6704.v5.patch, YARN-6704.v6.patch, YARN-6704.v7.patch, > YARN-6704.v8.patch, YARN-6704.v9.patch > > > YARN-1336 added the ability to restart the NM without losing any running > containers. {{AMRMProxy}} restart was added in YARN-6127. In a Federated YARN > environment, there's additional state in the {{FederationInterceptor}} to > allow for spanning across multiple sub-clusters, so we need to enhance > {{FederationInterceptor}} to support work-preserving restart. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7628) [Documentation] Documenting the ability to disable elasticity at leaf queue
[ https://issues.apache.org/jira/browse/YARN-7628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen updated YARN-7628: Issue Type: Bug (was: Task) > [Documentation] Documenting the ability to disable elasticity at leaf queue > --- > > Key: YARN-7628 > URL: https://issues.apache.org/jira/browse/YARN-7628 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Zian Chen >Assignee: Zian Chen > Attachments: YARN-7628.wip.1.patch > > > Update documentation after YARN-7274 gets in. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7628) [Documentation] Documenting the ability to disable elasticity at leaf queue
[ https://issues.apache.org/jira/browse/YARN-7628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen updated YARN-7628: Issue Type: Task (was: Bug) > [Documentation] Documenting the ability to disable elasticity at leaf queue > --- > > Key: YARN-7628 > URL: https://issues.apache.org/jira/browse/YARN-7628 > Project: Hadoop YARN > Issue Type: Task > Components: capacity scheduler >Reporter: Zian Chen >Assignee: Zian Chen > Attachments: YARN-7628.wip.1.patch > > > Update documentation after YARN-7274 gets in. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7628) [Documentation] Documenting the ability to disable elasticity at leaf queue
[ https://issues.apache.org/jira/browse/YARN-7628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen updated YARN-7628: Attachment: YARN-7628.wip.1.patch > [Documentation] Documenting the ability to disable elasticity at leaf queue > --- > > Key: YARN-7628 > URL: https://issues.apache.org/jira/browse/YARN-7628 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Zian Chen >Assignee: Zian Chen > Attachments: YARN-7628.wip.1.patch > > > Update documentation after YARN-7274 gets in. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7628) [Documentation] Documenting the ability to disable elasticity at leaf queue
[ https://issues.apache.org/jira/browse/YARN-7628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284405#comment-16284405 ] Zian Chen commented on YARN-7628: - Attached patch (WIP). [~leftnoteasy], any suggestions would be appreciated! > [Documentation] Documenting the ability to disable elasticity at leaf queue > --- > > Key: YARN-7628 > URL: https://issues.apache.org/jira/browse/YARN-7628 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Zian Chen >Assignee: Zian Chen > > Update documentation after YARN-7274 gets in. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7630) Fix AMRMToken handling in AMRMProxy
[ https://issues.apache.org/jira/browse/YARN-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Botong Huang updated YARN-7630: --- Attachment: YARN-7630.v1.patch > Fix AMRMToken handling in AMRMProxy > --- > > Key: YARN-7630 > URL: https://issues.apache.org/jira/browse/YARN-7630 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Minor > Attachments: YARN-7630.v1.patch > > > Symptom: after RM rolls over the master key for AMRMToken, whenever the RPC > connection from FederationInterceptor to RM breaks due to a transient network > issue and reconnects, the heartbeat to RM starts failing because of the “Invalid > AMRMToken” exception. Whenever it hits, it happens for both the home RM and > secondary RMs. > Related facts: > 1. When RM issues a new AMRMToken, it always sends it with the service name > field as an empty string. The RPC layer on the AM side will set it properly before > starting to use it. > 2. UGI keeps all tokens using a map from serviceName->Token. Initially > AMRMClientUtils.createRMProxy() is used to load the first token and start the > RM connection. > 3. When RM renews the token, YarnServerSecurityUtils.updateAMRMToken() is used > to load it into UGI and replace the existing token (with the same serviceName > key). > Bug: > The bug is that 2-AMRMClientUtils.createRMProxy() and > 3-YarnServerSecurityUtils.updateAMRMToken() do not handle the sequence > consistently. We always need to load the token (with empty service name) into > UGI first, before we set the serviceName, so that the previous AMRMToken will > be overridden. But 2 does it in reverse. That’s why, after RM rolls the > amrmToken, the UGI ends up with two tokens. Whenever the RPC connection breaks > and reconnects, the wrong token can be picked and thus trigger the > exception. 
> Fix: > Should load the AMRMToken into the UGI first and then update the service name > field for RPC. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
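Since the UGI keys tokens by their service name at insertion time, the ordering bug above can be modeled with a plain map. Everything in this sketch is a toy model; the class, method, and field names are invented for illustration and none of it is Hadoop API.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the UGI's serviceName -> token map; all names are invented.
public class TokenOrderSketch {
    static class Token {
        String service = "";  // RM issues the token with an empty service name
        final String key;
        Token(String key) { this.key = key; }
    }

    // The map is keyed by whatever the service name is *at insertion time*.
    static void addToken(Map<String, Token> ugiTokens, Token t) {
        ugiTokens.put(t.service, t);
    }

    // Buggy order (as in 2): set the service name first, then load. A later
    // renewal arriving with an empty service lands under a different key,
    // leaving two tokens in the map.
    static int buggyOrder() {
        Map<String, Token> ugi = new HashMap<>();
        Token initial = new Token("key-1");
        initial.service = "rm-address";     // service set before loading
        addToken(ugi, initial);             // keyed under "rm-address"
        addToken(ugi, new Token("key-2"));  // renewal keyed under ""
        return ugi.size();                  // 2: the stale token can be picked
    }

    // Fixed order (as in 3): load with the empty service first so the old
    // entry is replaced, then set the service name for RPC.
    static int fixedOrder() {
        Map<String, Token> ugi = new HashMap<>();
        addToken(ugi, new Token("key-1"));  // keyed under ""
        Token renewed = new Token("key-2");
        addToken(ugi, renewed);             // replaces the entry under ""
        renewed.service = "rm-address";     // set only after loading
        return ugi.size();                  // 1: only the fresh token remains
    }

    public static void main(String[] args) {
        System.out.println("buggy order tokens: " + buggyOrder()); // 2
        System.out.println("fixed order tokens: " + fixedOrder()); // 1
    }
}
```

The stale entry in the buggy ordering is exactly the token that a reconnecting RPC client may pick up, producing the “Invalid AMRMToken” failure after a key roll.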
[jira] [Updated] (YARN-7630) Fix AMRMToken handling in AMRMProxy
[ https://issues.apache.org/jira/browse/YARN-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Botong Huang updated YARN-7630:
---
Description:
Symptom: after the RM rolls over the master key for the AMRMToken, whenever the RPC connection from FederationInterceptor to the RM breaks due to a transient network issue and reconnects, heartbeats to the RM start failing with an "Invalid AMRMToken" exception. When it hits, it hits for both the home RM and the secondary RMs.

Related facts:
1. When the RM issues a new AMRMToken, it always sends it with the service name field set to the empty string. The RPC layer on the AM side sets the service name properly before starting to use the token.
2. UGI keeps all tokens in a map from serviceName -> Token. Initially, AMRMClientUtils.createRMProxy() is used to load the first token and start the RM connection.
3. When the RM renews the token, YarnServerSecurityUtils.updateAMRMToken() is used to load it into the UGI and replace the existing token (under the same serviceName key).

Bug: 2 (AMRMClientUtils.createRMProxy()) and 3 (YarnServerSecurityUtils.updateAMRMToken()) do not handle this sequence consistently. We always need to load the token (with the empty service name) into the UGI first, before setting the serviceName, so that the previous AMRMToken is overridden. But 2 does it in the reverse order. That is why, after the RM rolls the AMRMToken, the UGI ends up with two tokens; whenever the RPC connection breaks and reconnects, the wrong token can be picked, triggering the exception.

Fix: load the AMRMToken into the UGI first, then update the service name field for RPC.

was:
Symptom: after the RM rolls over the master key for the AMRMToken, whenever the RPC connection from FederationInterceptor to the RM breaks due to a transient network issue and reconnects, heartbeats to the RM start failing with an "Invalid AMRMToken" exception. When it hits, it hits for both the home RM and the secondary RMs.

Related facts:
1. When the RM issues a new AMRMToken, it always sends it with the service name field set to the empty string. The RPC layer on the AM side sets the service name properly before starting to use the token.
2. UGI keeps all tokens in a map from serviceName -> Token. Initially, AMRMClientUtils.createRMProxy() is used to load the first token and start the RM connection.
3. When the RM renews the token, YarnServerSecurityUtils.updateAMRMToken() is used to load it into the UGI and replace the existing token (under the same serviceName key).

Bug: 2 (AMRMClientUtils.createRMProxy()) and 3 (YarnServerSecurityUtils.updateAMRMToken()) do not handle this sequence consistently. We always need to load the token (with the empty service name) into the UGI first, before setting the serviceName, so that the previous AMRMToken is overridden. But 2 does it in the reverse order. That is why, after the RM rolls the AMRMToken, the UGI ends up with two tokens; whenever the RPC connection breaks and reconnects, the wrong token can be picked, triggering the exception.

> Fix AMRMToken handling in AMRMProxy
> ---
>
> Key: YARN-7630
> URL: https://issues.apache.org/jira/browse/YARN-7630
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Botong Huang
> Assignee: Botong Huang
> Priority: Minor
>
> Symptom: after the RM rolls over the master key for the AMRMToken, whenever the RPC connection from FederationInterceptor to the RM breaks due to a transient network issue and reconnects, heartbeats to the RM start failing with an "Invalid AMRMToken" exception. When it hits, it hits for both the home RM and the secondary RMs.
> Related facts:
> 1. When the RM issues a new AMRMToken, it always sends it with the service name field set to the empty string. The RPC layer on the AM side sets the service name properly before starting to use the token.
> 2. UGI keeps all tokens in a map from serviceName -> Token. Initially, AMRMClientUtils.createRMProxy() is used to load the first token and start the RM connection.
> 3. When the RM renews the token, YarnServerSecurityUtils.updateAMRMToken() is used to load it into the UGI and replace the existing token (under the same serviceName key).
> Bug: 2 (AMRMClientUtils.createRMProxy()) and 3 (YarnServerSecurityUtils.updateAMRMToken()) do not handle this sequence consistently. We always need to load the token (with the empty service name) into the UGI first, before setting the serviceName, so that the previous AMRMToken is overridden. But 2 does it in the reverse order. That is why, after the RM rolls the AMRMToken, the UGI ends up with two tokens; whenever the RPC connection breaks and reconnects, the wrong token can be picked, triggering the exception.
> Fix: load the AMRMToken into the UGI first, then update the service name field for RPC.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
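The ordering problem described above can be illustrated with a toy model: a plain map standing in for the UGI serviceName -> Token store. This is not the real UGI/Credentials API — the class and method names below are invented for the demo — but it shows why load-then-name replaces the stale token while name-then-load leaves two tokens behind.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of the UGI token store keyed by service name. Tokens arrive
// from the RM with an empty service name, so the previous token sits
// under the same (empty) key until it is re-keyed for RPC.
public class TokenStore {
  // Correct order per the fix: load the token under its current (empty)
  // service name first -- overriding the previous token stored there --
  // and only then re-key it under the real RPC service name.
  static void loadThenName(Map<String, String> ugi, String token, String service) {
    ugi.put("", token);          // replaces any previous token under ""
    String t = ugi.remove("");
    ugi.put(service, t);         // expose it under the RPC service name
  }

  // Buggy order: stamp the service name before loading. The previous
  // token under the empty key is never overridden, so the store ends up
  // with two tokens and a reconnect may pick the stale one.
  static void nameThenLoad(Map<String, String> ugi, String token, String service) {
    ugi.put(service, token);
  }

  public static void main(String[] args) {
    Map<String, String> good = new LinkedHashMap<>();
    good.put("", "oldToken");                  // token from the initial load
    loadThenName(good, "newToken", "rm:8030");
    System.out.println(good.size());           // 1 -- stale token replaced

    Map<String, String> bad = new LinkedHashMap<>();
    bad.put("", "oldToken");
    nameThenLoad(bad, "newToken", "rm:8030");
    System.out.println(bad.size());            // 2 -- stale token lingers
  }
}
```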
[jira] [Updated] (YARN-7628) [Documentation] Documenting the ability to disable elasticity at leaf queue
[ https://issues.apache.org/jira/browse/YARN-7628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-7628: - Description: Update documentation after YARN-7274 gets in. > [Documentation] Documenting the ability to disable elasticity at leaf queue > --- > > Key: YARN-7628 > URL: https://issues.apache.org/jira/browse/YARN-7628 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Zian Chen >Assignee: Zian Chen > > Update documentation after YARN-7274 gets in. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7628) [Documentation] Documenting the ability to disable elasticity at leaf queue
[ https://issues.apache.org/jira/browse/YARN-7628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284385#comment-16284385 ] Wangda Tan commented on YARN-7628: -- Converted to issue. > [Documentation] Documenting the ability to disable elasticity at leaf queue > --- > > Key: YARN-7628 > URL: https://issues.apache.org/jira/browse/YARN-7628 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Zian Chen >Assignee: Zian Chen > > Update documentation after YARN-7274 gets in. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7628) [Documentation] Documenting the ability to disable elasticity at leaf queue
[ https://issues.apache.org/jira/browse/YARN-7628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-7628: - Issue Type: Bug (was: Sub-task) Parent: (was: YARN-7274) > [Documentation] Documenting the ability to disable elasticity at leaf queue > --- > > Key: YARN-7628 > URL: https://issues.apache.org/jira/browse/YARN-7628 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Zian Chen >Assignee: Zian Chen > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7595) Container launching code suppresses close exceptions after writes
[ https://issues.apache.org/jira/browse/YARN-7595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284381#comment-16284381 ]

Jason Lowe commented on YARN-7595:
--
Thanks for the patch! Looks good overall, just a few nits.

{{containerScriptOutStream}} is created significantly before it really needs to be. There's a lot of stuff in the try block that does not need to be there, and we could postpone creating the script stream until just before the writeLaunchEnv call.

Nit: We could save some indentation on the following by putting the two resources in the same try-with-resources block rather than nesting one block within another. Not a must-fix; it's OK as nested blocks. If we keep it nested, the line break on the PrintStream creation is unnecessary.
{code}
    try (DataOutputStream out = lfs.create(wrapperScriptPath,
        EnumSet.of(CREATE, OVERWRITE))) {
      try (PrintStream pout = new PrintStream(out, false, "UTF-8")) {
        writeLocalWrapperScript(launchDst, pidFile, pout);
      }
    }
{code}
I filed YARN-7629 to track the unrelated TestContainerLaunch failure.

> Container launching code suppresses close exceptions after writes
> -
>
> Key: YARN-7595
> URL: https://issues.apache.org/jira/browse/YARN-7595
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Reporter: Jason Lowe
> Assignee: Jim Brennan
> Attachments: YARN-7595.001.patch
>
> There are a number of places in code related to container launching where the following pattern is used:
> {code}
> try {
>   ...write to stream outStream...
> } finally {
>   IOUtils.cleanupWithLogger(LOG, outStream);
> }
> {code}
> Unfortunately this suppresses any IOException that occurs during the close() method on outStream. If the stream is buffered or could otherwise fail to finish writing the file when trying to close, then this can lead to partial/corrupted data without throwing an I/O error.
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
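The suppression the issue describes is easy to reproduce with a stream whose close() throws, e.g. a buffered stream that flushes on close. The sketch below contrasts the swallow-on-close pattern with try-with-resources; {{FailingStream}} and the method names are invented for the demo, not NodeManager code.

```java
import java.io.Closeable;
import java.io.IOException;

// Demo of the YARN-7595 failure mode: errors raised in close() are lost
// when cleanup logs-and-swallows, but propagate under try-with-resources.
public class CloseDemo {
  static class FailingStream implements Closeable {
    void write(String s) { /* pretend to buffer the data */ }
    @Override public void close() throws IOException {
      throw new IOException("flush failed on close");
    }
  }

  // Mimics the old pattern: IOUtils.cleanupWithLogger logs and swallows
  // the close() exception, so the caller believes the write succeeded.
  static boolean writeSwallowingClose() {
    FailingStream out = new FailingStream();
    try {
      out.write("data");
    } finally {
      try { out.close(); } catch (IOException e) { /* logged, then dropped */ }
    }
    return true; // error hidden -- possible partial/corrupted file
  }

  // try-with-resources: the close() failure reaches the caller.
  static boolean writeWithResources() {
    try (FailingStream out = new FailingStream()) {
      out.write("data");
      return true;
    } catch (IOException e) {
      return false; // error surfaced -- caller can react
    }
  }

  public static void main(String[] args) {
    System.out.println(writeSwallowingClose()); // true  -> failure hidden
    System.out.println(writeWithResources());   // false -> failure surfaced
  }
}
```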
[jira] [Updated] (YARN-7630) Fix AMRMToken handling in AMRMProxy
[ https://issues.apache.org/jira/browse/YARN-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Botong Huang updated YARN-7630:
---
Issue Type: Sub-task (was: Bug)
Parent: YARN-5579

> Fix AMRMToken handling in AMRMProxy
> ---
>
> Key: YARN-7630
> URL: https://issues.apache.org/jira/browse/YARN-7630
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Botong Huang
> Assignee: Botong Huang
> Priority: Minor
>
> Symptom: after the RM rolls over the master key for the AMRMToken, whenever the RPC connection from FederationInterceptor to the RM breaks due to a transient network issue and reconnects, heartbeats to the RM start failing with an "Invalid AMRMToken" exception. When it hits, it hits for both the home RM and the secondary RMs.
> Related facts:
> 1. When the RM issues a new AMRMToken, it always sends it with the service name field set to the empty string. The RPC layer on the AM side sets the service name properly before starting to use the token.
> 2. UGI keeps all tokens in a map from serviceName -> Token. Initially, AMRMClientUtils.createRMProxy() is used to load the first token and start the RM connection.
> 3. When the RM renews the token, YarnServerSecurityUtils.updateAMRMToken() is used to load it into the UGI and replace the existing token (under the same serviceName key).
> Bug: 2 (AMRMClientUtils.createRMProxy()) and 3 (YarnServerSecurityUtils.updateAMRMToken()) do not handle this sequence consistently. We always need to load the token (with the empty service name) into the UGI first, before setting the serviceName, so that the previous AMRMToken is overridden. But 2 does it in the reverse order. That is why, after the RM rolls the AMRMToken, the UGI ends up with two tokens; whenever the RPC connection breaks and reconnects, the wrong token can be picked, triggering the exception.
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7629) TestContainerLaunch fails
[ https://issues.apache.org/jira/browse/YARN-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284376#comment-16284376 ] Jason Lowe commented on YARN-7629: -- This started failing after YARN-7381. I'm guessing the new debugging logic added to the container launch script by default is messing up the unit test somehow. [~xgong] [~leftnoteasy] could you take a look? > TestContainerLaunch fails > - > > Key: YARN-7629 > URL: https://issues.apache.org/jira/browse/YARN-7629 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Jason Lowe > > TestContainerLaunch#testValidEnvVariableSubstitution is failing: > {noformat} > [INFO] Running > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch > [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.824 > s <<< FAILURE! - in > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch > [ERROR] > testValidEnvVariableSubstitution(org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch) > Time elapsed: 0.322 s <<< FAILURE! 
> java.lang.AssertionError: Should not catch exception > at org.junit.Assert.fail(Assert.java:88) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch.testValidEnvVariableSubstitution(TestContainerLaunch.java:1813) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) > at org.junit.runners.ParentRunner.run(ParentRunner.java:309) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:369) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:275) > at > 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:239) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:160) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:373) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:334) > at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:119) > at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:407) > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-7630) Fix AMRMToken handling in AMRMProxy
Botong Huang created YARN-7630:
--
Summary: Fix AMRMToken handling in AMRMProxy
Key: YARN-7630
URL: https://issues.apache.org/jira/browse/YARN-7630
Project: Hadoop YARN
Issue Type: Bug
Reporter: Botong Huang
Assignee: Botong Huang
Priority: Minor

Symptom: after the RM rolls over the master key for the AMRMToken, whenever the RPC connection from FederationInterceptor to the RM breaks due to a transient network issue and reconnects, heartbeats to the RM start failing with an "Invalid AMRMToken" exception. When it hits, it hits for both the home RM and the secondary RMs.

Related facts:
1. When the RM issues a new AMRMToken, it always sends it with the service name field set to the empty string. The RPC layer on the AM side sets the service name properly before starting to use the token.
2. UGI keeps all tokens in a map from serviceName -> Token. Initially, AMRMClientUtils.createRMProxy() is used to load the first token and start the RM connection.
3. When the RM renews the token, YarnServerSecurityUtils.updateAMRMToken() is used to load it into the UGI and replace the existing token (under the same serviceName key).

Bug: 2 (AMRMClientUtils.createRMProxy()) and 3 (YarnServerSecurityUtils.updateAMRMToken()) do not handle this sequence consistently. We always need to load the token (with the empty service name) into the UGI first, before setting the serviceName, so that the previous AMRMToken is overridden. But 2 does it in the reverse order. That is why, after the RM rolls the AMRMToken, the UGI ends up with two tokens; whenever the RPC connection breaks and reconnects, the wrong token can be picked, triggering the exception.

-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-7629) TestContainerLaunch fails
Jason Lowe created YARN-7629: Summary: TestContainerLaunch fails Key: YARN-7629 URL: https://issues.apache.org/jira/browse/YARN-7629 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.0.0 Reporter: Jason Lowe TestContainerLaunch#testValidEnvVariableSubstitution is failing: {noformat} [INFO] Running org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.824 s <<< FAILURE! - in org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch [ERROR] testValidEnvVariableSubstitution(org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch) Time elapsed: 0.322 s <<< FAILURE! java.lang.AssertionError: Should not catch exception at org.junit.Assert.fail(Assert.java:88) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch.testValidEnvVariableSubstitution(TestContainerLaunch.java:1813) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) at org.junit.runners.ParentRunner.run(ParentRunner.java:309) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:369) at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:275) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:239) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:160) at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:373) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:334) at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:119) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:407) {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7590) Improve container-executor validation check
[ https://issues.apache.org/jira/browse/YARN-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284372#comment-16284372 ]

Miklos Szegedi commented on YARN-7590:
--
[~eyang], sorry about the delay. Due to the sensitivity of the issue I intend to do some end-to-end tests, but I have not gotten there yet.

> Improve container-executor validation check
> ---
>
> Key: YARN-7590
> URL: https://issues.apache.org/jira/browse/YARN-7590
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: security, yarn
> Affects Versions: 2.0.1-alpha, 2.2.0, 2.3.0, 2.4.0, 2.5.0, 2.6.0, 2.7.0, 2.8.0, 2.8.1, 3.0.0-beta1
> Reporter: Eric Yang
> Assignee: Eric Yang
> Attachments: YARN-7590.001.patch
>
> There is only minimal checking of the prefix path in container-executor. If YARN is compromised, an attacker can use container-executor to change the ownership of system files:
> {code}
> /usr/local/hadoop/bin/container-executor spark yarn 0 etc /home/yarn/tokens /home/spark / ls
> {code}
> This will change /etc to be owned by the spark user:
> {code}
> # ls -ld /etc
> drwxr-s---. 110 spark hadoop 8192 Nov 21 20:00 /etc
> {code}
> The spark user can then rewrite files under /etc to gain more access. We can improve this with additional checks in container-executor:
> # Make sure the prefix path is the same as the one in yarn-site.xml, and that yarn-site.xml is owned by root, mode 644, and marked as final in the property.
> # Make sure the user path is not a symlink and usercache is not a symlink.

-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
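The two checks proposed in YARN-7590 can be sketched in Java for illustration. The real container-executor is native C code, so the class and method names below are only a model of the intended validation, not the actual implementation.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Model of the proposed container-executor validation:
// (1) a target path must live under the configured prefix, so "/etc" is
//     rejected when the local dirs live under the YARN prefix;
// (2) no path handed to a privileged chown-style operation may be a symlink.
public class PathChecks {
  static boolean underPrefix(String prefix, String candidate) {
    Path p = Paths.get(prefix).toAbsolutePath().normalize();
    Path c = Paths.get(candidate).toAbsolutePath().normalize();
    // Path.startsWith compares whole components, so "/a/bc" does not
    // count as being under "/a/b".
    return c.startsWith(p);
  }

  static boolean notASymlink(String candidate) {
    return !Files.isSymbolicLink(Paths.get(candidate));
  }

  public static void main(String[] args) {
    // The attack in the report targets /etc; a prefix check rejects it.
    System.out.println(underPrefix("/hadoop/yarn/local", "/etc"));
    System.out.println(underPrefix("/hadoop/yarn/local",
        "/hadoop/yarn/local/usercache/spark"));
  }
}
```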
[jira] [Updated] (YARN-7064) Use cgroup to get container resource utilization
[ https://issues.apache.org/jira/browse/YARN-7064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Szegedi updated YARN-7064: - Attachment: YARN-7064.009.patch Fixing checkstyle and unit tests > Use cgroup to get container resource utilization > > > Key: YARN-7064 > URL: https://issues.apache.org/jira/browse/YARN-7064 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi > Attachments: YARN-7064.000.patch, YARN-7064.001.patch, > YARN-7064.002.patch, YARN-7064.003.patch, YARN-7064.004.patch, > YARN-7064.005.patch, YARN-7064.007.patch, YARN-7064.008.patch, > YARN-7064.009.patch > > > This is an addendum to YARN-6668. What happens is that that jira always wants > to rebase patches against YARN-1011 instead of trunk. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-7628) [Documentation] Documenting the ability to disable elasticity at leaf queue
Zian Chen created YARN-7628: --- Summary: [Documentation] Documenting the ability to disable elasticity at leaf queue Key: YARN-7628 URL: https://issues.apache.org/jira/browse/YARN-7628 Project: Hadoop YARN Issue Type: Sub-task Components: capacity scheduler Reporter: Zian Chen Assignee: Zian Chen -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6704) Add Federation Interceptor restart when work preserving NM is enabled
[ https://issues.apache.org/jira/browse/YARN-6704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284334#comment-16284334 ]

Botong Huang commented on YARN-6704:
The unit test failure is irrelevant.

> Add Federation Interceptor restart when work preserving NM is enabled
> -
>
> Key: YARN-6704
> URL: https://issues.apache.org/jira/browse/YARN-6704
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Botong Huang
> Assignee: Botong Huang
> Attachments: YARN-6704-YARN-2915.v1.patch, YARN-6704-YARN-2915.v2.patch, YARN-6704.v3.patch, YARN-6704.v4.patch, YARN-6704.v5.patch, YARN-6704.v6.patch, YARN-6704.v7.patch, YARN-6704.v8.patch, YARN-6704.v9.patch
>
> YARN-1336 added the ability to restart the NM without losing any running containers. {{AMRMProxy}} restart was added in YARN-6127. In a federated YARN environment, there is additional state in the {{FederationInterceptor}} to allow for spanning across multiple sub-clusters, so we need to enhance {{FederationInterceptor}} to support work-preserving restart.

-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7540) Convert yarn app cli to call yarn api services
[ https://issues.apache.org/jira/browse/YARN-7540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Yang updated YARN-7540:
Attachment: YARN-7540.004.patch

- Bugfix to make sure non-HA + SSL mode uses the yarn.resourcemanager.webapp.https.address property for resolution.

> Convert yarn app cli to call yarn api services
> --
>
> Key: YARN-7540
> URL: https://issues.apache.org/jira/browse/YARN-7540
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Eric Yang
> Assignee: Eric Yang
> Fix For: yarn-native-services
> Attachments: YARN-7540.001.patch, YARN-7540.002.patch, YARN-7540.003.patch, YARN-7540.004.patch
>
> For a YARN docker application, launching through the CLI works differently from launching through the REST API. All applications launched through the REST API are currently stored in the yarn user's HDFS home directory, while applications managed through the CLI are stored in each individual user's HDFS home directory. For consistency, we want the yarn app CLI to interact with the API service to manage applications. For performance, it is easier to list all applications from one user's home directory than to crawl all users' home directories. For security, it is safer to access only one user's home directory instead of all of them. Given the reasons above, the proposal is to change how {{yarn app -launch}}, {{yarn app -list}} and {{yarn app -destroy}} work: instead of calling the HDFS API and RM API to launch containers, the CLI will be converted to call the API service REST API that resides in the RM. The RM performs the persistence and the operations to launch the actual application.

-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7540) Convert yarn app cli to call yarn api services
[ https://issues.apache.org/jira/browse/YARN-7540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284309#comment-16284309 ]

Jian He commented on YARN-7540:
---
- config name for https is "webapp.https.address"
{code}
String rmAddress = conf.get("yarn.resourcemanager.webapp.address");
if (conf.getBoolean("hadoop.ssl.enabled", false)) {
  scheme = "https://";
}
{code}

> Convert yarn app cli to call yarn api services
> --
>
> Key: YARN-7540
> URL: https://issues.apache.org/jira/browse/YARN-7540
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Eric Yang
> Assignee: Eric Yang
> Fix For: yarn-native-services
> Attachments: YARN-7540.001.patch, YARN-7540.002.patch, YARN-7540.003.patch
>
> For a YARN docker application, launching through the CLI works differently from launching through the REST API. All applications launched through the REST API are currently stored in the yarn user's HDFS home directory, while applications managed through the CLI are stored in each individual user's HDFS home directory. For consistency, we want the yarn app CLI to interact with the API service to manage applications. For performance, it is easier to list all applications from one user's home directory than to crawl all users' home directories. For security, it is safer to access only one user's home directory instead of all of them. Given the reasons above, the proposal is to change how {{yarn app -launch}}, {{yarn app -list}} and {{yarn app -destroy}} work: instead of calling the HDFS API and RM API to launch containers, the CLI will be converted to call the API service REST API that resides in the RM. The RM performs the persistence and the operations to launch the actual application.

-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
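The review comment above is about picking the https-specific address key when SSL is on. A minimal sketch of that resolution, with a plain map standing in for Hadoop's Configuration: the two `yarn.resourcemanager.webapp.*` property names and `hadoop.ssl.enabled` come from the snippet above; the host/port values and method names are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// Resolve the RM webapp base URL: under SSL, consult the https-specific
// address property rather than the plain webapp address.
public class RmAddress {
  static String resolve(Map<String, String> conf) {
    boolean ssl = Boolean.parseBoolean(
        conf.getOrDefault("hadoop.ssl.enabled", "false"));
    if (ssl) {
      return "https://" + conf.get("yarn.resourcemanager.webapp.https.address");
    }
    return "http://" + conf.get("yarn.resourcemanager.webapp.address");
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put("yarn.resourcemanager.webapp.address", "rm.example.com:8088");
    conf.put("yarn.resourcemanager.webapp.https.address", "rm.example.com:8090");
    conf.put("hadoop.ssl.enabled", "true");
    System.out.println(resolve(conf)); // https://rm.example.com:8090
  }
}
```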
[jira] [Commented] (YARN-7609) mvn package fails by javadoc error
[ https://issues.apache.org/jira/browse/YARN-7609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284264#comment-16284264 ]

Chandni Singh commented on YARN-7609:
-
[~ajisakaa], {{mvn package -Pdist -DskipTests}} doesn't fail for me, so I don't know what the javadoc errors for IntelFpgaOpenclPlugin and AbstractFpgaVendocPlugin are. Could you please post them here?

> mvn package fails by javadoc error
> --
>
> Key: YARN-7609
> URL: https://issues.apache.org/jira/browse/YARN-7609
> Project: Hadoop YARN
> Issue Type: Bug
> Components: build, documentation
> Reporter: Akira Ajisaka
> Assignee: Chandni Singh
>
> {{mvn package -Pdist -DskipTests}} failed.
> {noformat}
> [ERROR] /home/centos/git/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/AllocateResponse.java:379: error: self-closing element not allowed
> [ERROR]*
> [ERROR] ^
> [ERROR] /home/centos/git/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/AllocateResponse.java:397: error: self-closing element not allowed
> [ERROR]*
> [ERROR] ^
> {noformat}

-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-7627) [ATSv2] When passing a non-number as metricslimit, the error message is wrong
Grant Sohn created YARN-7627: Summary: [ATSv2] When passing a non-number as metricslimit, the error message is wrong Key: YARN-7627 URL: https://issues.apache.org/jira/browse/YARN-7627 Project: Hadoop YARN Issue Type: Bug Components: api Affects Versions: 3.0.0-beta1 Reporter: Grant Sohn Priority: Trivial curl "$ATS_URL/ws/v2/timeline/apps/application_1512430070811_0022/entities/MAPREDUCE_JOB?metricslimit=w" returns: {"exception":"BadRequestException","message":"java.lang.Exception: createdTime start/end or limit or flowrunid is not a numeric value.","javaClassName":"org.apache.hadoop.yarn.webapp.BadRequestException"} and: curl "$ATS_URL/ws/v2/timeline/apps/application_1512430070811_0022?metricslimit=ALL" {"exception":"BadRequestException","message":"java.lang.Exception: flowrunid is not a numeric value.","javaClassName":"org.apache.hadoop.yarn.webapp.BadRequestException"} This could be part of YARN-6389 which indicates this functionality was not completed. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
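A fix in the direction the report suggests — validating each numeric query parameter individually so the error names the offending parameter instead of the catch-all "createdTime start/end or limit or flowrunid is not a numeric value" message — could look like the sketch below. The method and exception types are illustrative, not the ATSv2 reader code.

```java
// Per-parameter numeric validation that produces an accurate error
// message naming the query parameter that failed to parse.
public class ParamCheck {
  static long parseNumeric(String name, String value) {
    try {
      return Long.parseLong(value.trim());
    } catch (NumberFormatException e) {
      // e.g. "metricslimit is not a numeric value: w" -- the message the
      // reporter would have wanted to see for ?metricslimit=w
      throw new IllegalArgumentException(
          name + " is not a numeric value: " + value);
    }
  }

  public static void main(String[] args) {
    try {
      parseNumeric("metricslimit", "w");
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage()); // metricslimit is not a numeric value: w
    }
  }
}
```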
[jira] [Commented] (YARN-7540) Convert yarn app cli to call yarn api services
[ https://issues.apache.org/jira/browse/YARN-7540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284128#comment-16284128 ] genericqa commented on YARN-7540: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 4m 10s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 5 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 53s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 53s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 56s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 2 new + 55 unchanged - 0 fixed = 57 total (was 55) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 50s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 20m 50s{color} | {color:red} hadoop-yarn-client in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 43s{color} | {color:green} hadoop-yarn-services-core in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 30s{color} | {color:green} hadoop-yarn-services-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 32s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 98m 49s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.client.api.impl.TestAMRMClientOnRMRestart | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7540 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12901284/YARN-7540.003.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle | | uname | Linux
[jira] [Commented] (YARN-7290) Method canContainerBePreempted can return true when it shouldn't
[ https://issues.apache.org/jira/browse/YARN-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284035#comment-16284035 ] Andrew Wang commented on YARN-7290: --- I pulled this into branch-3.0.0 as well, thanks folks. > Method canContainerBePreempted can return true when it shouldn't > > > Key: YARN-7290 > URL: https://issues.apache.org/jira/browse/YARN-7290 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 3.0.0-beta1 >Reporter: Steven Rand >Assignee: Steven Rand > Fix For: 3.0.0, 3.1.0 > > Attachments: YARN-7290-failing-test.patch, YARN-7290.001.patch, > YARN-7290.002.patch, YARN-7290.003.patch, YARN-7290.004.patch, > YARN-7290.005.patch > > > In FSAppAttempt#canContainerBePreempted, we make sure that preempting the > given container would not put the app below its fair share: > {code} > // Check if the app's allocation will be over its fairshare even > // after preempting this container > Resource usageAfterPreemption = Resources.clone(getResourceUsage()); > // Subtract resources of containers already queued for preemption > synchronized (preemptionVariablesLock) { > Resources.subtractFrom(usageAfterPreemption, resourcesToBePreempted); > } > // Subtract this container's allocation to compute usage after preemption > Resources.subtractFrom( > usageAfterPreemption, container.getAllocatedResource()); > return !isUsageBelowShare(usageAfterPreemption, getFairShare()); > {code} > However, this only considers one container in isolation, and fails to > consider containers for the same app that we already added to > {{preemptableContainers}} in > FSPreemptionThread#identifyContainersToPreemptOnNode. Therefore we can have a > case where we preempt multiple containers from the same app, none of which by > itself puts the app below fair share, but which cumulatively do so. > I've attached a patch with a test to show this behavior. The flow is: > 1. 
Initially greedyApp runs in {{root.preemptable.child-1}} and is allocated > all the resources (8g and 8vcores) > 2. Then starvingApp runs in {{root.preemptable.child-2}} and requests 2 > containers, each of which is 3g and 3vcores in size. At this point both > greedyApp and starvingApp have a fair share of 4g (with DRF not in use). > 3. For the first container requested by starvingApp, we (correctly) preempt 3 > containers from greedyApp, each of which is 1g and 1vcore. > 4. For the second container requested by starvingApp, we again (this time > incorrectly) preempt 3 containers from greedyApp. This puts greedyApp below > its fair share, but happens anyway because all six times that we call > {{return !isUsageBelowShare(usageAfterPreemption, getFairShare());}}, the > value of {{usageAfterPreemption}} is 7g and 7vcores (confirmed using > debugger). > So in addition to accounting for {{resourcesToBePreempted}}, we also need to > account for containers that we're already planning on preempting in > FSPreemptionThread#identifyContainersToPreemptOnNode. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
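The accounting fix the YARN-7290 report above calls for can be illustrated with a small standalone sketch (hypothetical names and a deliberately simplified memory-only resource model — this is not the actual FSAppAttempt/FSPreemptionThread code): checking each candidate container in isolation lets all six 1g containers pass, because usageAfterPreemption is 7g on every call, while cumulatively subtracting the containers already selected in the current pass stops preemption once the app would drop below its 4g fair share.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of the YARN-7290 bug: per-container vs. cumulative preemption checks. */
public class PreemptionDemo {

    // Simplified resource model: memory in GB only (vcores behave identically here).
    static boolean usageBelowShare(int usage, int fairShare) {
        return usage < fairShare;
    }

    // Buggy check from the report: each container is judged in isolation,
    // so usageAfterPreemption is the same (8 - 1 = 7) on all six calls.
    static int allowedIsolated(int usage, int fairShare, int containerSize, int candidates) {
        int allowed = 0;
        for (int i = 0; i < candidates; i++) {
            int usageAfterPreemption = usage - containerSize;
            if (!usageBelowShare(usageAfterPreemption, fairShare)) {
                allowed++;
            }
        }
        return allowed;
    }

    // Fixed check: also subtract the containers already selected in this
    // identify-containers-to-preempt pass before testing the fair share.
    static int allowedCumulative(int usage, int fairShare, int containerSize, int candidates) {
        List<Integer> pickedThisPass = new ArrayList<>();
        for (int i = 0; i < candidates; i++) {
            int usageAfterPreemption = usage - containerSize;
            for (int picked : pickedThisPass) {
                usageAfterPreemption -= picked;
            }
            if (!usageBelowShare(usageAfterPreemption, fairShare)) {
                pickedThisPass.add(containerSize);
            }
        }
        return pickedThisPass.size();
    }

    public static void main(String[] args) {
        // greedyApp holds 8g, fair share is 4g, six 1g candidate containers.
        System.out.println(allowedIsolated(8, 4, 1, 6));   // 6: every isolated check passes
        System.out.println(allowedCumulative(8, 4, 1, 6)); // 4: stops once usage would fall below 4g
    }
}
```

With the cumulative check, preemption stops after four 1g containers, leaving the app exactly at its 4g fair share instead of being pushed to 2g.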
[jira] [Updated] (YARN-7390) All reservation related test cases failed when TestYarnClient runs against Fair Scheduler.
[ https://issues.apache.org/jira/browse/YARN-7390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated YARN-7390: -- Fix Version/s: (was: 3.0.0) 3.0.1 > All reservation related test cases failed when TestYarnClient runs against > Fair Scheduler. > -- > > Key: YARN-7390 > URL: https://issues.apache.org/jira/browse/YARN-7390 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, reservation system >Affects Versions: 2.9.0, 3.0.0, 3.1.0 >Reporter: Yufei Gu >Assignee: Yufei Gu > Fix For: 3.1.0, 2.9.1, 3.0.1 > > Attachments: YARN-7390.001.patch, YARN-7390.002.patch, > YARN-7390.003.patch, YARN-7390.004.patch, YARN-7390.005.patch, > YARN-7390.branch-2.001.patch > > > All reservation related test cases failed when {{TestYarnClient}} runs > against Fair Scheduler. To reproduce it, you need to set scheduler class to > Fair Scheduler in yarn-default.xml. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7469) Capacity Scheduler Intra-queue preemption: User can starve if newest app is exactly at user limit
[ https://issues.apache.org/jira/browse/YARN-7469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284032#comment-16284032 ] Andrew Wang commented on YARN-7469: --- I pulled this into branch-3.0.0 as well, thanks folks. > Capacity Scheduler Intra-queue preemption: User can starve if newest app is > exactly at user limit > - > > Key: YARN-7469 > URL: https://issues.apache.org/jira/browse/YARN-7469 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, yarn >Affects Versions: 2.9.0, 3.0.0-beta1, 2.8.2 >Reporter: Eric Payne >Assignee: Eric Payne > Fix For: 2.8.3, 3.0.0, 3.1.0, 2.10.0, 2.9.1 > > Attachments: UnitTestToShowStarvedUser.patch, YARN-7469.001.patch > > > Queue Configuration: > - Total Memory: 20GB > - 2 Queues > -- Queue1 > --- Memory: 10GB > --- MULP: 10% > --- ULF: 2.0 > - Minimum Container Size: 0.5GB > Use Case: > - User1 submits app1 to Queue1 and consumes 20GB > - User2 submits app2 to Queue1 and requests 7.5GB > - Preemption monitor preempts 7.5GB from app1. Capacity Scheduler gives those > resources to User2 > - User 3 submits app3 to Queue1. To begin with, app3 is requesting 1 > container for the AM. > - Preemption monitor never preempts a container. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7524) Remove unused FairSchedulerEventLog
[ https://issues.apache.org/jira/browse/YARN-7524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated YARN-7524: -- Fix Version/s: (was: 3.0.0) 3.0.1 > Remove unused FairSchedulerEventLog > --- > > Key: YARN-7524 > URL: https://issues.apache.org/jira/browse/YARN-7524 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg > Fix For: 3.1.0, 3.0.1 > > Attachments: YARN-7524.001.patch, YARN-7524.002.patch > > > The FairSchedulerEventLog is no longer used. It is only being written to in > one location in the FS (see YARN-1383) and the functionality requested in > that jira has been implemented using the normal OOTB logging in the > AbstractYarnScheduler. > The functionality the scheduler event log used to provide has been replaced > with normal logging and the scheduler state dump in YARN-6042 -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7611) Node manager web UI should display container type in containers page
[ https://issues.apache.org/jira/browse/YARN-7611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated YARN-7611: -- Fix Version/s: (was: 3.0.0) 3.0.1 > Node manager web UI should display container type in containers page > > > Key: YARN-7611 > URL: https://issues.apache.org/jira/browse/YARN-7611 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager, webapp >Affects Versions: 2.9.0 >Reporter: Weiwei Yang >Assignee: Weiwei Yang > Fix For: 2.9.1, 3.0.1 > > Attachments: YARN-7611.001.patch, YARN-7611.002.patch, > after_patch.png, before_patch.png > > > Currently node manager UI page > [http://:/node/allContainers] lists all containers, but > it doesn't contain {{ExecutionType}} column. To figure out the type, user has > to click each container link which is quite cumbersome. We should add a > column to display this info to give a more straightforward view. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7610) Extend Distributed Shell to support launching job with opportunistic containers
[ https://issues.apache.org/jira/browse/YARN-7610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated YARN-7610: -- Fix Version/s: (was: 3.0.0) 3.0.1 > Extend Distributed Shell to support launching job with opportunistic > containers > --- > > Key: YARN-7610 > URL: https://issues.apache.org/jira/browse/YARN-7610 > Project: Hadoop YARN > Issue Type: Sub-task > Components: applications/distributed-shell >Reporter: Weiwei Yang >Assignee: Weiwei Yang > Fix For: 3.0.1 > > Attachments: YARN-7610.001.patch, YARN-7610.002.patch, > YARN-7610.003.patch, YARN-7610.004.patch, YARN-7610.005.patch, added_doc.png, > outline_compare.png > > > Per the doc in > [https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/OpportunisticContainers.html#Running_a_Sample_Job], > a user can run some of the PI job's mappers as O containers. Similarly, we propose to > extend distributed shell to support specifying the container type; it will be > very helpful for testing. We propose to add the following argument: > {code} > $./bin/yarn org.apache.hadoop.yarn.applications.distributedshell.Client > -container_type Container execution type, > GUARANTEED or > OPPORTUNISTIC > {code} > Implication: all containers in a distributed shell job will be launched as the > user-specified container type (except for the AM); if not given, the default type is > {{GUARANTEED}}. The AM is always launched as a {{GUARANTEED}} container. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7625) Expose NM node/containers resource utilization in JVM metrics
[ https://issues.apache.org/jira/browse/YARN-7625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284028#comment-16284028 ] genericqa commented on YARN-7625: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 5 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 4s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 20s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 1 new + 93 unchanged - 1 fixed = 94 total (was 94) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 29s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 17m 53s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 65m 14s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch | | | hadoop.yarn.server.nodemanager.containermanager.monitor.TestContainersMonitorResourceChange | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7625 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12901264/YARN-7625.002.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux cb2c196436d0 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / f196383 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/18845/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | unit |
[jira] [Commented] (YARN-7612) Add Placement Processor and planner framework
[ https://issues.apache.org/jira/browse/YARN-7612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284020#comment-16284020 ] genericqa commented on YARN-7612: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 15m 49s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | || || || || {color:brown} YARN-6592 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 3m 25s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 22s{color} | {color:green} YARN-6592 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 9s{color} | {color:green} YARN-6592 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 11s{color} | {color:green} YARN-6592 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 16s{color} | {color:green} YARN-6592 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 27s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 26s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in YARN-6592 has 1 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 47s{color} | {color:green} YARN-6592 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 6m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 56s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 5s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 35 new + 560 unchanged - 1 fixed = 595 total (was 561) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 54s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 18s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api generated 0 new + 0 unchanged - 1 fixed = 0 total (was 1) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 44s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 40s{color} | {color:red} hadoop-yarn-api in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 5s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 62m 54s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 34s{color} | {color:green} The patch does not generate ASF License warnings.
[jira] [Commented] (YARN-7522) Introduce AllocationTagsManager to associate allocation tags to nodes
[ https://issues.apache.org/jira/browse/YARN-7522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283977#comment-16283977 ] Wangda Tan commented on YARN-7522: -- Thanks [~asuresh]/[~kkaranasos]/[~sunilg] for reviewing the patch. > Introduce AllocationTagsManager to associate allocation tags to nodes > - > > Key: YARN-7522 > URL: https://issues.apache.org/jira/browse/YARN-7522 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Fix For: YARN-6592 > > Attachments: YARN-7522.YARN-6592.002.patch, > YARN-7522.YARN-6592.003.patch, YARN-7522.YARN-6592.004.patch, > YARN-7522.YARN-6592.005.patch, YARN-7522.YARN-6592.wip-001.patch > > > This is different from YARN-6596, YARN-6596 is targeted to add constraint > manager to store intra/inter application placement constraints. This JIRA is > targeted to support storing maps between container-tags/applications and > nodes. This will be required by affinity/anti-affinity implementation and > cardinality. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
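The bookkeeping YARN-7522 describes — maps between container tags and nodes, to back affinity/anti-affinity and cardinality — can be sketched roughly as follows. The class and method names here are hypothetical; the real AllocationTagsManager API may differ. The cardinality query is what a placement constraint would consult: an anti-affinity constraint, for instance, only accepts nodes where the count is zero.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: per-node multiset of allocation tags with cardinality queries. */
public class TagStore {
    // tag -> (nodeId -> number of containers on that node carrying the tag)
    private final Map<String, Map<String, Integer>> tagToNodes = new HashMap<>();

    public void addContainer(String nodeId, String tag) {
        tagToNodes.computeIfAbsent(tag, t -> new HashMap<>())
                  .merge(nodeId, 1, Integer::sum);
    }

    public void removeContainer(String nodeId, String tag) {
        Map<String, Integer> nodes = tagToNodes.get(tag);
        if (nodes == null) {
            return;
        }
        // Decrement; drop the node entry entirely when the count reaches zero.
        nodes.computeIfPresent(nodeId, (n, c) -> c > 1 ? c - 1 : null);
        if (nodes.isEmpty()) {
            tagToNodes.remove(tag);
        }
    }

    // How many containers carrying this tag currently run on the node?
    public int cardinality(String nodeId, String tag) {
        return tagToNodes.getOrDefault(tag, Map.of()).getOrDefault(nodeId, 0);
    }

    public static void main(String[] args) {
        TagStore store = new TagStore();
        store.addContainer("node1", "hbase-regionserver");
        store.addContainer("node1", "hbase-regionserver");
        store.addContainer("node2", "hbase-regionserver");
        // An anti-affinity constraint on "hbase-regionserver" would reject
        // node1 and node2 here, accepting only nodes with cardinality 0.
        System.out.println(store.cardinality("node1", "hbase-regionserver")); // 2
        store.removeContainer("node1", "hbase-regionserver");
        System.out.println(store.cardinality("node1", "hbase-regionserver")); // 1
    }
}
```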
[jira] [Commented] (YARN-7473) Implement Framework and policy for capacity management of auto created queues
[ https://issues.apache.org/jira/browse/YARN-7473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283970#comment-16283970 ] Wangda Tan commented on YARN-7473: -- Thanks [~suma.shivaprasad], will commit by end of today if no objections. > Implement Framework and policy for capacity management of auto created queues > -- > > Key: YARN-7473 > URL: https://issues.apache.org/jira/browse/YARN-7473 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Suma Shivaprasad >Assignee: Suma Shivaprasad > Attachments: YARN-7473.1.patch, YARN-7473.10.patch, > YARN-7473.11.patch, YARN-7473.12.patch, YARN-7473.12.patch, > YARN-7473.13.patch, YARN-7473.14.patch, YARN-7473.15.patch, > YARN-7473.16.patch, YARN-7473.17.patch, YARN-7473.2.patch, YARN-7473.3.patch, > YARN-7473.4.patch, YARN-7473.5.patch, YARN-7473.6.patch, YARN-7473.7.patch, > YARN-7473.8.patch, YARN-7473.9.patch > > > This jira mainly addresses the following > > 1.Support adding pluggable policies on parent queue for dynamically managing > capacity/state for leaf queues. > 2. Implement a default policy that manages capacity based on pending > applications and either grants guaranteed or zero capacity to queues based on > parent's available guaranteed capacity. > 3. Integrate with SchedulingEditPolicy framework to trigger this periodically > and signal scheduler to take necessary actions for capacity/queue management. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7520) Queue Ordering policy changes for ordering auto created leaf queues within Managed parent Queues
[ https://issues.apache.org/jira/browse/YARN-7520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283950#comment-16283950 ] genericqa commented on YARN-7520: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 20m 9s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 4s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 27s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 20s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 41 new + 77 unchanged - 26 fixed = 118 total (was 103) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 25s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 62m 4s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}122m 52s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7520 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12901218/YARN-7520.6.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux fdc83820880f 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / f196383 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/18839/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | unit |
[jira] [Updated] (YARN-7540) Convert yarn app cli to call yarn api services
[ https://issues.apache.org/jira/browse/YARN-7540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-7540: Attachment: YARN-7540.003.patch - Fixed the issues discovered by Billie. > Convert yarn app cli to call yarn api services > -- > > Key: YARN-7540 > URL: https://issues.apache.org/jira/browse/YARN-7540 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Yang >Assignee: Eric Yang > Fix For: yarn-native-services > > Attachments: YARN-7540.001.patch, YARN-7540.002.patch, > YARN-7540.003.patch > > > For a YARN docker application, launching through the CLI works differently from > launching through the REST API. All applications launched through the REST API are > currently stored in the yarn user's HDFS home directory. Applications managed > through the CLI are stored in individual users' HDFS home directories. For > consistency, we want the yarn app cli to interact with the API service to > manage applications. For performance reasons, it is easier to list > all applications from one user's home directory than to crawl all > users' home directories. For security reasons, it is safer to access only one > user's home directory instead of all of them. Given the reasons above, the > proposal is to change how {{yarn app -launch}}, {{yarn app -list}} and {{yarn > app -destroy}} work. Instead of calling the HDFS API and RM API to launch > containers, the CLI will be converted to call the API service REST API that resides in the RM. > The RM performs the persistence and operations needed to launch the actual application. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7473) Implement Framework and policy for capacity management of auto created queues
[ https://issues.apache.org/jira/browse/YARN-7473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283923#comment-16283923 ] genericqa commented on YARN-7473: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m 26s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 8 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 8s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 1s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 35s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 110 new + 771 unchanged - 30 fixed = 881 total (was 801) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 37s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 56m 37s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}109m 42s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7473 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12901236/YARN-7473.17.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 698ce400540d 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / f196383 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/18844/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | unit |
[jira] [Commented] (YARN-7590) Improve container-executor validation check
[ https://issues.apache.org/jira/browse/YARN-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283906#comment-16283906 ] Eric Yang commented on YARN-7590: - [~miklos.szeg...@cloudera.com] Hi Miklos, would you mind reviewing this patch? Thanks > Improve container-executor validation check > --- > > Key: YARN-7590 > URL: https://issues.apache.org/jira/browse/YARN-7590 > Project: Hadoop YARN > Issue Type: Improvement > Components: security, yarn >Affects Versions: 2.0.1-alpha, 2.2.0, 2.3.0, 2.4.0, 2.5.0, 2.6.0, 2.7.0, > 2.8.0, 2.8.1, 3.0.0-beta1 >Reporter: Eric Yang >Assignee: Eric Yang > Attachments: YARN-7590.001.patch > > > There is only minimal validation of the prefix path in container-executor. If YARN is > compromised, an attacker can use container-executor to change the ownership of > system files: > {code} > /usr/local/hadoop/bin/container-executor spark yarn 0 etc /home/yarn/tokens > /home/spark / ls > {code} > This changes /etc to be owned by the spark user: > {code} > # ls -ld /etc > drwxr-s---. 110 spark hadoop 8192 Nov 21 20:00 /etc > {code} > The spark user can then rewrite files under /etc to gain more access. We can improve this > with additional checks in container-executor: > # Make sure the prefix path is the same as the one in yarn-site.xml, that > yarn-site.xml is owned by root with mode 644, and that the property is marked final. > # Make sure the user path is not a symlink and that usercache is not a symlink. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
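The symlink check proposed in point 2 above can be illustrated with a short sketch. Note this is Java for readability only; the real container-executor is native C, and the method name below is hypothetical, not part of any Hadoop API:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class PathValidation {
    // Walk each path component under a trusted root and reject the path
    // if any component is a symbolic link (hypothetical helper).
    static boolean hasSymlinkUnder(Path root, Path p) {
        Path rel = root.relativize(p.toAbsolutePath().normalize());
        Path cur = root;
        for (Path part : rel) {
            cur = cur.resolve(part);
            if (Files.isSymbolicLink(cur)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        // toRealPath() canonicalizes the root so only components below it are checked.
        Path root = Files.createTempDirectory("usercache").toRealPath();
        Path real = Files.createDirectory(root.resolve("appcache"));
        Path link = Files.createSymbolicLink(root.resolve("link"), real);
        System.out.println(hasSymlinkUnder(root, real)); // false
        System.out.println(hasSymlinkUnder(root, link)); // true
    }
}
```

Rejecting any symlinked component (rather than only the leaf) matters because an attacker who controls usercache can point an intermediate directory at a system path.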
[jira] [Updated] (YARN-7625) Expose NM node/containers resource utilization in JVM metrics
[ https://issues.apache.org/jira/browse/YARN-7625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated YARN-7625: -- Attachment: YARN-7625.002.patch > Expose NM node/containers resource utilization in JVM metrics > - > > Key: YARN-7625 > URL: https://issues.apache.org/jira/browse/YARN-7625 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Reporter: Weiwei Yang >Assignee: Weiwei Yang > Attachments: YARN-7625.001.patch, YARN-7625.002.patch > > > YARN-4055 added node resource utilization to the NM; we should expose this info > in NM metrics. It helps in the following cases: > # Users want to check NM load in the NM web UI or via the REST API > # Provides an API that the new YARN UI can integrate with to display NM > load status
[jira] [Commented] (YARN-7625) Expose NM node/containers resource utilization in JVM metrics
[ https://issues.apache.org/jira/browse/YARN-7625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283900#comment-16283900 ] Weiwei Yang commented on YARN-7625: --- Hi [~jlowe] I agree. To ensure the metrics are accurate, they should be published when their values get updated. This needs to be done in {{ContainersMonitorImpl}} and {{NodeResourceMonitorImpl}}. I just uploaded the v2 patch with the following changes: # Moved the metrics-publishing code to {{ContainersMonitorImpl}} and {{NodeResourceMonitorImpl}}; they publish the metrics in each monitor interval once utilization is updated # Added {{NodeManagerMetrics}} to the node manager {{Context}} so it can be accessed in the monitors # Added 2 UTs to test the metrics update in both {{ContainersMonitorImpl}} and {{NodeResourceMonitorImpl}} Please help review, thanks. > Expose NM node/containers resource utilization in JVM metrics > - > > Key: YARN-7625 > URL: https://issues.apache.org/jira/browse/YARN-7625 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Reporter: Weiwei Yang >Assignee: Weiwei Yang > Attachments: YARN-7625.001.patch > > > YARN-4055 added node resource utilization to the NM; we should expose this info > in NM metrics. It helps in the following cases: > # Users want to check NM load in the NM web UI or via the REST API > # Provides an API that the new YARN UI can integrate with to display NM > load status
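A minimal sketch of the push-style update described above, where the monitor publishes utilization into the metrics object each interval so that reads have no side effects. Class and method names are simplified stand-ins, not the actual {{NodeManagerMetrics}} or {{NodeResourceMonitorImpl}} APIs:

```java
// Hypothetical, simplified metrics holder; reads are side-effect free.
class NodeMetrics {
    private volatile float cpuUtilization;
    private volatile long physicalMemoryMB;

    void setNodeUtilization(float cpu, long memMB) {
        this.cpuUtilization = cpu;
        this.physicalMemoryMB = memMB;
    }
    float getCpuUtilization() { return cpuUtilization; }
    long getPhysicalMemoryMB() { return physicalMemoryMB; }
}

// Hypothetical monitor: pushes values the moment they are calculated,
// instead of computing them lazily inside a getter.
class NodeResourceMonitor {
    private final NodeMetrics metrics;
    NodeResourceMonitor(NodeMetrics metrics) { this.metrics = metrics; }

    // Called once per monitoring interval with freshly sampled values.
    void monitorOnce(float cpu, long memMB) {
        metrics.setNodeUtilization(cpu, memMB);
    }
}

public class MetricsPushDemo {
    public static void main(String[] args) {
        NodeMetrics metrics = new NodeMetrics();
        NodeResourceMonitor monitor = new NodeResourceMonitor(metrics);
        monitor.monitorOnce(0.75f, 2048);
        System.out.println(metrics.getCpuUtilization());   // 0.75
        System.out.println(metrics.getPhysicalMemoryMB()); // 2048
    }
}
```

Pushing on update decouples metric freshness from heartbeat frequency, which addresses the out-of-band-heartbeat concern raised in the review.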
[jira] [Commented] (YARN-7625) Expose NM node/containers resource utilization in JVM metrics
[ https://issues.apache.org/jira/browse/YARN-7625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283876#comment-16283876 ] genericqa commented on YARN-7625: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 15m 47s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 56s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 8s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 17m 45s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 80m 37s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7625 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12901212/YARN-7625.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 4d20271d8dcc 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / f196383 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/18841/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/18841/testReport/ | | Max. process+thread count | 301 (vs. ulimit of 5000) | | modules | C:
[jira] [Commented] (YARN-6589) Recover all resources when NM restart
[ https://issues.apache.org/jira/browse/YARN-6589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283870#comment-16283870 ] genericqa commented on YARN-6589: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 24s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 30m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 14s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 15m 46s{color} | {color:red} patch has errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 31s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 27s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 25s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 42s{color} | {color:red} The patch generated 1 ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 75m 17s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-6589 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12901211/YARN-6589.002.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 2a3a32193f57 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / f196383 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | findbugs | https://builds.apache.org/job/PreCommit-YARN-Build/18838/artifact/out/patch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | javadoc | https://builds.apache.org/job/PreCommit-YARN-Build/18838/artifact/out/patch-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/18838/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | Test Results |
[jira] [Commented] (YARN-7608) Incorrect sTarget column causing DataTable warning on RM application and scheduler web page
[ https://issues.apache.org/jira/browse/YARN-7608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283839#comment-16283839 ] genericqa commented on YARN-7608: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 20m 7s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 8m 58s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 32s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 0s{color} | {color:green} hadoop-yarn-server-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 60m 44s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | YARN-7608 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12901213/YARN-7608.002.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 854c97b8cdfa 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / f196383 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/18843/testReport/ | | Max. process+thread count | 440 (vs. ulimit of 5000) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/18843/console | | Powered by | Apache Yetus 0.7.0-SNAPSHOT http://yetus.apache.org |
[jira] [Commented] (YARN-7468) Provide means for container network policy control
[ https://issues.apache.org/jira/browse/YARN-7468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283834#comment-16283834 ] Xuan Gong commented on YARN-7468: - uploaded a new doc. Please take a look. > Provide means for container network policy control > -- > > Key: YARN-7468 > URL: https://issues.apache.org/jira/browse/YARN-7468 > Project: Hadoop YARN > Issue Type: Task > Components: nodemanager >Reporter: Clay B. >Priority: Minor > Attachments: [YARN-7468] [Design] Provide means for container network > policy control.pdf > > > To prevent data exfiltration from a YARN cluster, it would be very helpful to > have "firewall" rules able to map to a user/queue's containers.
[jira] [Updated] (YARN-7468) Provide means for container network policy control
[ https://issues.apache.org/jira/browse/YARN-7468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-7468: Attachment: [YARN-7468] [Design] Provide means for container network policy control.pdf > Provide means for container network policy control > -- > > Key: YARN-7468 > URL: https://issues.apache.org/jira/browse/YARN-7468 > Project: Hadoop YARN > Issue Type: Task > Components: nodemanager >Reporter: Clay B. >Priority: Minor > Attachments: [YARN-7468] [Design] Provide means for container network > policy control.pdf > > > To prevent data exfiltration from a YARN cluster, it would be very helpful to > have "firewall" rules able to map to a user/queue's containers.
[jira] [Commented] (YARN-7622) Allow fair-scheduler configuration on HDFS
[ https://issues.apache.org/jira/browse/YARN-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283810#comment-16283810 ] Greg Phillips commented on YARN-7622: - The failed unit test is unrelated; it is currently failing in unpatched trunk. The findbugs issue is also unrelated to the specific changes in this patch: it notes inconsistent synchronization on a field that is not modified here, and on inspection this appears to be a false positive. > Allow fair-scheduler configuration on HDFS > -- > > Key: YARN-7622 > URL: https://issues.apache.org/jira/browse/YARN-7622 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler, resourcemanager >Reporter: Greg Phillips >Assignee: Greg Phillips >Priority: Minor > Attachments: YARN-7622.001.patch > > > The FairScheduler requires the allocation file to be hosted on the local > filesystem on the RM node(s). Allowing HDFS to store the allocation file will > provide improved redundancy, more options for scheduler updates, and RM > failover consistency in HA.
[jira] [Commented] (YARN-6487) FairScheduler: remove continuous scheduling (YARN-1010)
[ https://issues.apache.org/jira/browse/YARN-6487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283668#comment-16283668 ] stefanlee commented on YARN-6487: - It seems continuous scheduling will impact scheduler performance. > FairScheduler: remove continuous scheduling (YARN-1010) > --- > > Key: YARN-6487 > URL: https://issues.apache.org/jira/browse/YARN-6487 > Project: Hadoop YARN > Issue Type: Task > Components: fairscheduler >Affects Versions: 2.7.0 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg > > Remove deprecated FairScheduler continuous scheduling code
[jira] [Commented] (YARN-1010) FairScheduler: decouple container scheduling from nodemanager heartbeats
[ https://issues.apache.org/jira/browse/YARN-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283660#comment-16283660 ] stefanlee commented on YARN-1010: - Thanks [~ywskycn], I wonder whether the _if (!completedContainers.isEmpty())_ check will impact scheduler performance; why was this check added here? > FairScheduler: decouple container scheduling from nodemanager heartbeats > > > Key: YARN-1010 > URL: https://issues.apache.org/jira/browse/YARN-1010 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Affects Versions: 2.1.0-beta >Reporter: Alejandro Abdelnur >Assignee: Wei Yan >Priority: Critical > Fix For: 2.3.0 > > Attachments: YARN-1010.patch, YARN-1010.patch > > > Currently, scheduling for a node is done when the node heartbeats. > For large clusters where the heartbeat interval is set to several seconds, this > delays scheduling of incoming allocations significantly. > We could have a continuous loop scanning all nodes and doing scheduling. If > there is availability, AMs will get the allocation in the next heartbeat after > the one that placed the request.
[jira] [Commented] (YARN-7625) Expose NM node/containers resource utilization in JVM metrics
[ https://issues.apache.org/jira/browse/YARN-7625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283556#comment-16283556 ] Jason Lowe commented on YARN-7625: -- Thanks for the patch! Get methods that have side effects are a bit less than ideal. Would it make more sense to update the metrics when the values are calculated rather than when they are retrieved? Due to out-of-band heartbeats, it's not unusual for the nodemanager to generate a node status more often than the ContainersMonitorImpl and NodeResourceMonitorImpl will calculate new values. Similarly, if the admin has configured a large node heartbeat interval, then the metric is only updated when the node heartbeats rather than when new values have been calculated by the monitoring code. > Expose NM node/containers resource utilization in JVM metrics > - > > Key: YARN-7625 > URL: https://issues.apache.org/jira/browse/YARN-7625 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Reporter: Weiwei Yang >Assignee: Weiwei Yang > Attachments: YARN-7625.001.patch > > > YARN-4055 added node resource utilization to the NM; we should expose this info > in NM metrics. It helps in the following cases: > # Users want to check NM load in the NM web UI or via the REST API > # Provides an API that the new YARN UI can integrate with to display NM > load status
[jira] [Comment Edited] (YARN-7494) Add muti node lookup support for better placement
[ https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283419#comment-16283419 ] Tao Yang edited comment on YARN-7494 at 12/8/17 11:47 AM: -- Thanks for the patch. [~sunilg] Some thoughts from my side: * Agree with 2) from [~leftnoteasy]. Sorting requirements may be different in some scenarios. For example, opportunistic containers would prefer considering node utilization to unallocated resource. I think we should support expandable sorting library and AppPlacementAllocator can choose from it. * This patch iterates all partition nodes to create new {{PartitionBasedCandidateNodeSet}} instance for every schedule process in {{CapacityScheduler#getCandidateNodeSet}}. I think we can keep a single instance to avoid always creating it. * This patch remains as it is to iterate all nodes and trigger the schedule process for every node in {{CapacityScheduler#schedule}}. Is it better to move {{multiNodePlacementEnabled}} check branch from {{CapacityScheduler#getCandidateNodeSet}} to {{CapacityScheduler#schedule}} and iterates all partitions to trigger the schedule process when multi-node-placement enabled ? was (Author: tao yang): Thanks for the patch. [~sunilg] Some thoughts from my side: * Agree with 2) from [~leftnoteasy]. Sorting requirement may be different in some scenarios. For example, opportunistic containers would prefer considering node utilization to unallocated resource. I think we should support expandable sorting library and AppPlacementAllocator can choose from it. * This patch iterates all partition nodes to create new {{PartitionBasedCandidateNodeSet}} instance for every schedule process in {{CapacityScheduler#getCandidateNodeSet}}. I think we can keep a single instance to avoid always creating the same set. * This patch remains as it is to iterates all nodes and trigger the schedule process for every node in CapacityScheduler#schedule. 
Is it better to move multiNodePlacementEnabled condition branch from {{CapacityScheduler#getCandidateNodeSet}} to {{CapacityScheduler#schedule}} and iterates all partitions to trigger the schedule process when multi-node-placement enabled ? > Add muti node lookup support for better placement > - > > Key: YARN-7494 > URL: https://issues.apache.org/jira/browse/YARN-7494 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Sunil G >Assignee: Sunil G > Attachments: YARN-7494.v0.patch > > > Instead of single node, for effectiveness we can consider a multi node lookup > based on partition to start with. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
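The idea of a swappable sorting policy for candidate nodes, raised in the discussion above, can be sketched as follows. The class and field names are illustrative only; this is not the actual AppPlacementAllocator or CandidateNodeSet API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical candidate-node record used for illustration.
class CandidateNode {
    final String host;
    final long unallocatedMB;
    final double utilization;
    CandidateNode(String host, long unallocatedMB, double utilization) {
        this.host = host;
        this.unallocatedMB = unallocatedMB;
        this.utilization = utilization;
    }
}

public class NodeSortingDemo {
    // Guaranteed-capacity requests: prefer the node with the most free resource.
    static final Comparator<CandidateNode> BY_UNALLOCATED =
        Comparator.comparingLong((CandidateNode n) -> n.unallocatedMB).reversed();
    // Opportunistic containers: prefer the least utilized node instead.
    static final Comparator<CandidateNode> BY_UTILIZATION =
        Comparator.comparingDouble(n -> n.utilization);

    public static void main(String[] args) {
        List<CandidateNode> nodes = new ArrayList<>(Arrays.asList(
            new CandidateNode("n1", 1024, 0.9),
            new CandidateNode("n2", 4096, 0.2),
            new CandidateNode("n3", 8192, 0.5)));
        nodes.sort(BY_UNALLOCATED);
        System.out.println(nodes.get(0).host); // n3
        nodes.sort(BY_UTILIZATION);
        System.out.println(nodes.get(0).host); // n2
    }
}
```

Keeping the comparator pluggable lets each placement allocator pick the ordering that fits its request type, which is the "expandable sorting library" suggestion made in the comment.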
[jira] [Comment Edited] (YARN-7494) Add muti node lookup support for better placement
[ https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283419#comment-16283419 ] Tao Yang edited comment on YARN-7494 at 12/8/17 11:45 AM: -- Thanks for the patch, [~sunilg]. Some thoughts from my side: * Agree with 2) from [~leftnoteasy]. Sorting requirements may differ across scenarios. For example, opportunistic containers would prefer to consider node utilization rather than unallocated resources. I think we should support an extensible sorting library that AppPlacementAllocator can choose from. * This patch iterates all partition nodes to create a new {{PartitionBasedCandidateNodeSet}} instance for every schedule process in {{CapacityScheduler#getCandidateNodeSet}}. I think we can keep a single instance to avoid repeatedly creating the same set. * This patch still iterates all nodes and triggers the schedule process for every node in {{CapacityScheduler#schedule}}. Is it better to move the multiNodePlacementEnabled condition branch from {{CapacityScheduler#getCandidateNodeSet}} to {{CapacityScheduler#schedule}} and iterate all partitions to trigger the schedule process when multi-node placement is enabled? > Add multi node lookup support for better placement > - > > Key: YARN-7494 > URL: https://issues.apache.org/jira/browse/YARN-7494 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Sunil G >Assignee: Sunil G > Attachments: YARN-7494.v0.patch > > > Instead of single node, for effectiveness we can consider a multi node lookup > based on partition to start with. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
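Tao Yang's first point — an extensible sorting library that AppPlacementAllocator can pick from — could look roughly like the sketch below. All names here ({{NodeSorterSketch}}, {{SNode}}, {{MOST_FREE}}, {{LEAST_UTILIZED}}) are hypothetical illustrations, not YARN classes; the point is only that sort policies become swappable Comparator instances.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch -- not real YARN classes. It only illustrates how
// node-sorting policies could become swappable Comparator instances that an
// AppPlacementAllocator selects per scheduling need.
public class NodeSorterSketch {

  // Minimal stand-in for a scheduler node.
  static class SNode {
    final String id;
    final long unallocatedMb;  // free (unallocated) memory on the node
    final double utilization;  // measured usage in [0.0, 1.0]

    SNode(String id, long unallocatedMb, double utilization) {
      this.id = id;
      this.unallocatedMb = unallocatedMb;
      this.utilization = utilization;
    }
  }

  // Most unallocated resource first: a fit for guaranteed containers.
  static final Comparator<SNode> MOST_FREE =
      Comparator.comparingLong((SNode n) -> n.unallocatedMb).reversed();

  // Lowest measured utilization first: what opportunistic containers prefer.
  static final Comparator<SNode> LEAST_UTILIZED =
      Comparator.comparingDouble(n -> n.utilization);

  // The allocator applies whichever policy was registered for the request.
  static List<SNode> sorted(List<SNode> nodes, Comparator<SNode> policy) {
    List<SNode> copy = new ArrayList<>(nodes);
    copy.sort(policy);
    return copy;
  }
}
```

A node that is lightly utilized but heavily reserved sorts differently under the two policies, which is exactly why a single hard-coded ordering does not fit both container types.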
[jira] [Updated] (YARN-7473) Implement Framework and policy for capacity management of auto created queues
[ https://issues.apache.org/jira/browse/YARN-7473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suma Shivaprasad updated YARN-7473: --- Attachment: YARN-7473.17.patch Rebased with trunk and minor changes to TestCapacitySchedulerAutoCreatedQueueBase > Implement Framework and policy for capacity management of auto created queues > -- > > Key: YARN-7473 > URL: https://issues.apache.org/jira/browse/YARN-7473 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Suma Shivaprasad >Assignee: Suma Shivaprasad > Attachments: YARN-7473.1.patch, YARN-7473.10.patch, > YARN-7473.11.patch, YARN-7473.12.patch, YARN-7473.12.patch, > YARN-7473.13.patch, YARN-7473.14.patch, YARN-7473.15.patch, > YARN-7473.16.patch, YARN-7473.17.patch, YARN-7473.2.patch, YARN-7473.3.patch, > YARN-7473.4.patch, YARN-7473.5.patch, YARN-7473.6.patch, YARN-7473.7.patch, > YARN-7473.8.patch, YARN-7473.9.patch > > > This jira mainly addresses the following > > 1.Support adding pluggable policies on parent queue for dynamically managing > capacity/state for leaf queues. > 2. Implement a default policy that manages capacity based on pending > applications and either grants guaranteed or zero capacity to queues based on > parent's available guaranteed capacity. > 3. Integrate with SchedulingEditPolicy framework to trigger this periodically > and signal scheduler to take necessary actions for capacity/queue management. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-7473) Implement Framework and policy for capacity management of auto created queues
[ https://issues.apache.org/jira/browse/YARN-7473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283262#comment-16283262 ] Suma Shivaprasad edited comment on YARN-7473 at 12/8/17 11:32 AM: -- Thanks [~wangda] Addressed checkstyle, findbugs and UT failure. was (Author: suma.shivaprasad): Addressed checkstyle, findbugs and UT failure. > Implement Framework and policy for capacity management of auto created queues > -- > > Key: YARN-7473 > URL: https://issues.apache.org/jira/browse/YARN-7473 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Suma Shivaprasad >Assignee: Suma Shivaprasad > Attachments: YARN-7473.1.patch, YARN-7473.10.patch, > YARN-7473.11.patch, YARN-7473.12.patch, YARN-7473.12.patch, > YARN-7473.13.patch, YARN-7473.14.patch, YARN-7473.15.patch, > YARN-7473.16.patch, YARN-7473.17.patch, YARN-7473.2.patch, YARN-7473.3.patch, > YARN-7473.4.patch, YARN-7473.5.patch, YARN-7473.6.patch, YARN-7473.7.patch, > YARN-7473.8.patch, YARN-7473.9.patch > > > This jira mainly addresses the following > > 1.Support adding pluggable policies on parent queue for dynamically managing > capacity/state for leaf queues. > 2. Implement a default policy that manages capacity based on pending > applications and either grants guaranteed or zero capacity to queues based on > parent's available guaranteed capacity. > 3. Integrate with SchedulingEditPolicy framework to trigger this periodically > and signal scheduler to take necessary actions for capacity/queue management. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
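The default policy described in the YARN-7473 summary — grant capacity to leaf queues while the parent still has guaranteed capacity available, zero afterwards — can be illustrated with the following sketch. The class, method, and parameters are hypothetical simplifications, not the actual patch's API.

```java
// Hypothetical simplification of the default capacity-management policy:
// auto-created leaf queues with pending applications receive their configured
// template capacity while the parent still has unassigned guaranteed
// capacity; once it is exhausted, later queues get zero guaranteed capacity.
public class AutoQueueCapacityPolicySketch {

  static float[] assignGuaranteed(float parentGuaranteed,
                                  float leafTemplate,
                                  int activeLeafCount) {
    float[] grants = new float[activeLeafCount];
    float remaining = parentGuaranteed;
    for (int i = 0; i < activeLeafCount; i++) {
      if (remaining >= leafTemplate) {
        grants[i] = leafTemplate;  // full guaranteed grant
        remaining -= leafTemplate;
      } else {
        grants[i] = 0f;            // zero-capacity queue
      }
    }
    return grants;
  }
}
```

With a parent at 100% and a 40% template, the third active leaf queue ends up at zero guaranteed capacity, which is why YARN-7520's ordering-policy changes are needed to still schedule such queues.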
[jira] [Commented] (YARN-5139) [Umbrella] Move YARN scheduler towards global scheduler
[ https://issues.apache.org/jira/browse/YARN-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283265#comment-16283265 ] Tao Yang commented on YARN-5139: Thanks [~leftnoteasy] for the detailed introduction above, and apologies for my late reply. {quote} please share the use cases of multiple nodes look up from your POV. We can incorporate it once working on implementations. {quote} I think YARN-6592, which can support many customized requirements, is good enough for our use cases. We will keep watching these issues and share feedback from our use cases. {quote} If you have interests/bandwidth, you may take a crack at YARN-7457, which is also crucial to make a clean separation of allocation algorithm. {quote} I would like to look into YARN-7457 and give it a try. Thanks. > [Umbrella] Move YARN scheduler towards global scheduler > --- > > Key: YARN-5139 > URL: https://issues.apache.org/jira/browse/YARN-5139 > Project: Hadoop YARN > Issue Type: New Feature >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: Explanantions of Global Scheduling (YARN-5139) > Implementation.pdf, YARN-5139-Concurrent-scheduling-performance-report.pdf, > YARN-5139-Global-Schedulingd-esign-and-implementation-notes-v2.pdf, > YARN-5139-Global-Schedulingd-esign-and-implementation-notes.pdf, > YARN-5139.000.patch, wip-1.YARN-5139.patch, wip-2.YARN-5139.patch, > wip-3.YARN-5139.patch, wip-4.YARN-5139.patch, wip-5.YARN-5139.patch > > > Existing YARN scheduler is based on node heartbeat. This can lead to > sub-optimal decisions because the scheduler can only look at one node at a time > when scheduling resources. 
> Pseudo code of existing scheduling logic looks like:
> {code}
> for node in allNodes:
>   Go to parentQueue
>     Go to leafQueue
>       for application in leafQueue.applications:
>         for resource-request in application.resource-requests:
>           try to schedule on node
> {code}
> Considering future complex resource placement requirements, such as node constraints (give me "a && b || c") or anti-affinity (do not allocate HBase regionservers and Storm workers on the same host), we may need to consider moving YARN scheduler towards global scheduling. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
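The pseudo code in the YARN-5139 description drives scheduling from a single heartbeating node. A global scheduler inverts this: a request is evaluated against a whole candidate set of nodes and the best match wins. The following contrast is a hedged sketch; the types are illustrative stand-ins, not YARN classes.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical contrast between heartbeat-driven and candidate-set (global)
// scheduling. Node and the method names are illustrative, not YARN APIs.
public class GlobalSchedulingSketch {

  static class Node {
    final String id;
    final long freeMb;
    Node(String id, long freeMb) { this.id = id; this.freeMb = freeMb; }
  }

  // Heartbeat style: the scheduler sees only the one node that reported in,
  // so it either places the request there or waits for another heartbeat.
  static Optional<Node> onHeartbeat(Node reporting, long demandMb) {
    return reporting.freeMb >= demandMb
        ? Optional.of(reporting) : Optional.empty();
  }

  // Global style: look across the whole candidate set at once, which makes
  // placement constraints expressible as filters over all nodes.
  static Optional<Node> fromCandidates(List<Node> candidates, long demandMb) {
    return candidates.stream()
        .filter(n -> n.freeMb >= demandMb)
        .max(Comparator.comparingLong(n -> n.freeMb)); // e.g. most-free wins
  }
}
```

In the heartbeat style a request that does not fit the reporting node simply waits; in the candidate-set style the same request can be matched immediately against any node that satisfies it.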
[jira] [Updated] (YARN-7473) Implement Framework and policy for capacity management of auto created queues
[ https://issues.apache.org/jira/browse/YARN-7473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suma Shivaprasad updated YARN-7473: --- Attachment: YARN-7473.16.patch Addressed checkstyle, findbugs and UT failure. > Implement Framework and policy for capacity management of auto created queues > -- > > Key: YARN-7473 > URL: https://issues.apache.org/jira/browse/YARN-7473 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Suma Shivaprasad >Assignee: Suma Shivaprasad > Attachments: YARN-7473.1.patch, YARN-7473.10.patch, > YARN-7473.11.patch, YARN-7473.12.patch, YARN-7473.12.patch, > YARN-7473.13.patch, YARN-7473.14.patch, YARN-7473.15.patch, > YARN-7473.16.patch, YARN-7473.2.patch, YARN-7473.3.patch, YARN-7473.4.patch, > YARN-7473.5.patch, YARN-7473.6.patch, YARN-7473.7.patch, YARN-7473.8.patch, > YARN-7473.9.patch > > > This jira mainly addresses the following > > 1.Support adding pluggable policies on parent queue for dynamically managing > capacity/state for leaf queues. > 2. Implement a default policy that manages capacity based on pending > applications and either grants guaranteed or zero capacity to queues based on > parent's available guaranteed capacity. > 3. Integrate with SchedulingEditPolicy framework to trigger this periodically > and signal scheduler to take necessary actions for capacity/queue management. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7520) Queue Ordering policy changes for ordering auto created leaf queues within Managed parent Queues
[ https://issues.apache.org/jira/browse/YARN-7520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suma Shivaprasad updated YARN-7520: --- Attachment: YARN-7520.6.patch Rebased with trunk > Queue Ordering policy changes for ordering auto created leaf queues within > Managed parent Queues > > > Key: YARN-7520 > URL: https://issues.apache.org/jira/browse/YARN-7520 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Suma Shivaprasad >Assignee: Suma Shivaprasad > Attachments: YARN-7520.1.patch, YARN-7520.2.patch, YARN-7520.3.patch, > YARN-7520.4.patch, YARN-7520.5.patch, YARN-7520.6.patch > > > Queue Ordering policy currently uses priority, utilization and absolute > capacity for pre-configured parent queues to order leaf queues while > assigning containers. It needs modifications for auto created leaf queues > since they can have zero capacity -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7420) YARN UI changes to depict auto created queues
[ https://issues.apache.org/jira/browse/YARN-7420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suma Shivaprasad updated YARN-7420: --- Attachment: ScreenShot_AutoCreated_Queues_Legend_color.png > YARN UI changes to depict auto created queues > -- > > Key: YARN-7420 > URL: https://issues.apache.org/jira/browse/YARN-7420 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Suma Shivaprasad >Assignee: Suma Shivaprasad > Attachments: ScreenShot_AutoCreated_Queues_Legend_color.png, > ScreenShot_Zero_capacity_queues_running_app.png, YARN-7420.1.patch, > YARN-7420.2.patch > > > Auto created queues will be depicted in a different color to indicate they > have been auto created and for easier distinction from manually > pre-configured queues. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7420) YARN UI changes to depict auto created queues
[ https://issues.apache.org/jira/browse/YARN-7420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suma Shivaprasad updated YARN-7420: --- Attachment: (was: ScreenShot_AutoCreated_Queues_Legend_color.png) > YARN UI changes to depict auto created queues > -- > > Key: YARN-7420 > URL: https://issues.apache.org/jira/browse/YARN-7420 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Suma Shivaprasad >Assignee: Suma Shivaprasad > Attachments: ScreenShot_Zero_capacity_queues_running_app.png, > YARN-7420.1.patch, YARN-7420.2.patch > > > Auto created queues will be depicted in a different color to indicate they > have been auto created and for easier distinction from manually > pre-configured queues. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7420) YARN UI changes to depict auto created queues
[ https://issues.apache.org/jira/browse/YARN-7420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suma Shivaprasad updated YARN-7420: --- Attachment: YARN-7420.2.patch Rebased patch with trunk > YARN UI changes to depict auto created queues > -- > > Key: YARN-7420 > URL: https://issues.apache.org/jira/browse/YARN-7420 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Suma Shivaprasad >Assignee: Suma Shivaprasad > Attachments: ScreenShot_AutoCreated_Queues_Legend_color.png, > ScreenShot_Zero_capacity_queues_running_app.png, YARN-7420.1.patch, > YARN-7420.2.patch > > > Auto created queues will be depicted in a different color to indicate they > have been auto created and for easier distinction from manually > pre-configured queues. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7612) Add Placement Processor and planner framework
[ https://issues.apache.org/jira/browse/YARN-7612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-7612: -- Attachment: (was: YARN-7612-YARN-6592.combined.patch) > Add Placement Processor and planner framework > - > > Key: YARN-7612 > URL: https://issues.apache.org/jira/browse/YARN-7612 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-7612-YARN-6592.001.patch, YARN-7612-v2.wip.patch, > YARN-7612.wip.patch > > > This introduces a Placement Processor and a Planning algorithm framework to > handle placement constraints and scheduling requests from an app and places > them on nodes. > The actual planning algorithm(s) will be handled in a YARN-7613. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7612) Add Placement Processor and planner framework
[ https://issues.apache.org/jira/browse/YARN-7612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-7612: -- Attachment: YARN-7612-YARN-6592.001.patch Re-attaching after rebase > Add Placement Processor and planner framework > - > > Key: YARN-7612 > URL: https://issues.apache.org/jira/browse/YARN-7612 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-7612-YARN-6592.001.patch, > YARN-7612-YARN-6592.combined.patch, YARN-7612-v2.wip.patch, > YARN-7612.wip.patch > > > This introduces a Placement Processor and a Planning algorithm framework to > handle placement constraints and scheduling requests from an app and places > them on nodes. > The actual planning algorithm(s) will be handled in a YARN-7613. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7608) Incorrect sTarget column causing DataTable warning on RM application and scheduler web page
[ https://issues.apache.org/jira/browse/YARN-7608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283232#comment-16283232 ] Weiwei Yang commented on YARN-7608: --- Thanks [~GergelyNovak] for the update, the change looks good to me. +1, pending Jenkins. I agree that the else branch does not seem to be reached from any page, so it should not be a problem. I think we are all set on this issue; I will commit once there is a clean Jenkins result. Thank you! > Incorrect sTarget column causing DataTable warning on RM application and > scheduler web page > --- > > Key: YARN-7608 > URL: https://issues.apache.org/jira/browse/YARN-7608 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, webapp >Affects Versions: 2.9.0 >Reporter: Weiwei Yang >Assignee: Gergely Novák > Attachments: YARN-7608.001.patch, YARN-7608.002.patch > > > On a cluster built from latest trunk, click {{% of Queue}} gives following > warning > {noformat} > DataTable warning (tableID="'apps'): Requested unknown parameter '15' from > the data source for row 0 > {noformat} > {{% of Cluster}} doesn't have this problem. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7608) Incorrect sTarget column causing DataTable warning on RM application and scheduler web page
[ https://issues.apache.org/jira/browse/YARN-7608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283222#comment-16283222 ] Gergely Novák commented on YARN-7608: - Nice catch, patch #2 fixes the fair scheduler page too. What about the else branch? The only usage of the method where both {{isFairSchedulerPage}} and {{isResourceManager}} are false is {{AHSView::preHead()}}, but as far as I can see, all subclasses of {{AHSView}} override {{preHead}}, so it is effectively not used anywhere. > Incorrect sTarget column causing DataTable warning on RM application and > scheduler web page > --- > > Key: YARN-7608 > URL: https://issues.apache.org/jira/browse/YARN-7608 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, webapp >Affects Versions: 2.9.0 >Reporter: Weiwei Yang >Assignee: Gergely Novák > Attachments: YARN-7608.001.patch, YARN-7608.002.patch > > > On a cluster built from latest trunk, click {{% of Queue}} gives following > warning > {noformat} > DataTable warning (tableID="'apps'): Requested unknown parameter '15' from > the data source for row 0 > {noformat} > {{% of Cluster}} doesn't have this problem. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
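The DataTables warning quoted in YARN-7608 fires when a column definition targets an index past the columns the page actually renders (here, index 15 on a table that only emits columns 0–14). The invariant behind it can be stated as a small check; the class and method names below are hypothetical, chosen only to illustrate the failure mode.

```java
import java.util.List;

// Hypothetical sketch of the invariant behind the DataTables warning: every
// column-definition target must point at a column the page actually renders.
// A target of 15 against 15 rendered columns (indices 0..14) is what produces
// "Requested unknown parameter '15' from the data source for row 0".
public class ColumnTargetCheck {

  // Returns the first out-of-range target, or -1 if all targets are valid.
  static int firstBadTarget(List<Integer> targets, int renderedColumns) {
    for (int t : targets) {
      if (t < 0 || t >= renderedColumns) {
        return t;
      }
    }
    return -1;
  }
}
```

The fix in the patch amounts to keeping the generated sTarget/column indices in step with the columns each page variant actually emits.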
[jira] [Updated] (YARN-7625) Expose NM node/containers resource utilization in JVM metrics
[ https://issues.apache.org/jira/browse/YARN-7625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated YARN-7625: -- Description: YARN-4055 adds node resource utilization to NM, we should expose these info in NM metrics, it helps in following cases: # Users want to check NM load in NM web UI or via rest API # Provide the API to further integrated to the new yarn UI, to display NM load status was: YARN-4055 adds node resource utilization to NM, we should expose these info in NM metrics, it helps in following cases: # Users want to check NM load in NM web UI or via rest API # Provide the API to further integrated to the new yarn UI, to display NM load status propose to add * Physical memory * Virtual memory * CPU utilization for both {{node}} and {{containers}} in NM metrics. > Expose NM node/containers resource utilization in JVM metrics > - > > Key: YARN-7625 > URL: https://issues.apache.org/jira/browse/YARN-7625 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Reporter: Weiwei Yang >Assignee: Weiwei Yang > Attachments: YARN-7625.001.patch > > > YARN-4055 adds node resource utilization to NM, we should expose these info > in NM metrics, it helps in following cases: > # Users want to check NM load in NM web UI or via rest API > # Provide the API to further integrated to the new yarn UI, to display NM > load status -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7625) Expose NM node/containers resource utilization in JVM metrics
[ https://issues.apache.org/jira/browse/YARN-7625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated YARN-7625: -- Description: YARN-4055 adds node resource utilization to NM, we should expose these info in NM metrics, it helps in following cases: # Users want to check NM load in NM web UI or via rest API # Provide the API to further integrated to the new yarn UI, to display NM load status propose to add * Physical memory * Virtual memory * CPU utilization for both {{node}} and {{containers}} in NM metrics. was: YARN-4055 adds node resource utilization to NM, we should expose these info in NM metrics, it helps in following cases: # Users want to check NM load in NM web UI or via rest API # Provide the API to further integrated to the new yarn UI, to display NM load status propose to add * Physical memory * Virtual memory * CPU utilization for both {{node}} and {{containers}} in NM metrics. > Expose NM node/containers resource utilization in JVM metrics > - > > Key: YARN-7625 > URL: https://issues.apache.org/jira/browse/YARN-7625 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Reporter: Weiwei Yang >Assignee: Weiwei Yang > Attachments: YARN-7625.001.patch > > > YARN-4055 adds node resource utilization to NM, we should expose these info > in NM metrics, it helps in following cases: > # Users want to check NM load in NM web UI or via rest API > # Provide the API to further integrated to the new yarn UI, to display NM > load status > propose to add > * Physical memory > * Virtual memory > * CPU utilization > for both {{node}} and {{containers}} in NM metrics. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
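The YARN-7625 proposal — physical memory, virtual memory, and CPU for both {{node}} and {{containers}} — amounts to six gauges refreshed by the node resource monitor. The sketch below uses plain fields with hypothetical names; the actual patch would register these through Hadoop's metrics2 library rather than hand-rolled getters.

```java
// Hypothetical sketch of the six utilization gauges proposed above, refreshed
// by the node resource monitor on each tick. Names and the update contract
// are illustrative, not the real NodeManagerMetrics API.
public class NodeUtilizationMetricsSketch {

  private long nodePhysicalMemMb;
  private long nodeVirtualMemMb;
  private float nodeCpuUsage;           // fraction of total vcores in use
  private long containersPhysicalMemMb;
  private long containersVirtualMemMb;
  private float containersCpuUsage;

  // Called from the monitoring thread whenever fresh utilization arrives.
  synchronized void update(long nodePmemMb, long nodeVmemMb, float nodeCpu,
                           long ctrPmemMb, long ctrVmemMb, float ctrCpu) {
    this.nodePhysicalMemMb = nodePmemMb;
    this.nodeVirtualMemMb = nodeVmemMb;
    this.nodeCpuUsage = nodeCpu;
    this.containersPhysicalMemMb = ctrPmemMb;
    this.containersVirtualMemMb = ctrVmemMb;
    this.containersCpuUsage = ctrCpu;
  }

  synchronized long getNodePhysicalMemMb() { return nodePhysicalMemMb; }
  synchronized float getContainersCpuUsage() { return containersCpuUsage; }
}
```

Exposing the values as metrics (rather than only in the web UI) is what lets the new YARN UI and external monitors poll NM load over the standard metrics endpoints.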
[jira] [Updated] (YARN-7608) Incorrect sTarget column causing DataTable warning on RM application and scheduler web page
[ https://issues.apache.org/jira/browse/YARN-7608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Novák updated YARN-7608: Attachment: YARN-7608.002.patch > Incorrect sTarget column causing DataTable warning on RM application and > scheduler web page > --- > > Key: YARN-7608 > URL: https://issues.apache.org/jira/browse/YARN-7608 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, webapp >Affects Versions: 2.9.0 >Reporter: Weiwei Yang >Assignee: Gergely Novák > Attachments: YARN-7608.001.patch, YARN-7608.002.patch > > > On a cluster built from latest trunk, click {{% of Queue}} gives following > warning > {noformat} > DataTable warning (tableID="'apps'): Requested unknown parameter '15' from > the data source for row 0 > {noformat} > {{% of Cluster}} doesn't have this problem. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org