[jira] [Commented] (YARN-8657) User limit calculation should be read-lock-protected within LeafQueue
[ https://issues.apache.org/jira/browse/YARN-8657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17565898#comment-17565898 ]

Masatake Iwasaki commented on YARN-8657:

Updated the target version to 3.2.5 in preparation for the 3.2.4 release.

> User limit calculation should be read-lock-protected within LeafQueue
> ---------------------------------------------------------------------
>
> Key: YARN-8657
> URL: https://issues.apache.org/jira/browse/YARN-8657
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacity scheduler
> Reporter: Sumana Sathish
> Assignee: Wangda Tan
> Priority: Major
> Attachments: YARN-8657.001.patch, YARN-8657.002.patch
>
>
> When async scheduling is enabled, user limit calculation could be wrong:
> It is possible that the scheduler calculated a user_limit, but inside {{canAssignToUser}} it has become stale.
> We need to protect user limit calculation.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8657) User limit calculation should be read-lock-protected within LeafQueue
[ https://issues.apache.org/jira/browse/YARN-8657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17194686#comment-17194686 ]

Hadoop QA commented on YARN-8657:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 11s{color} | {color:red} YARN-8657 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-8657 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12941021/YARN-8657.002.patch |
| Console output | https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/177/console |
| versions | git=2.17.1 |
| Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (YARN-8657) User limit calculation should be read-lock-protected within LeafQueue
[ https://issues.apache.org/jira/browse/YARN-8657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16915509#comment-16915509 ]

Wangda Tan commented on YARN-8657:

I'd prefer to move this to a later release and downgrade the priority. A stale user limit only causes some trouble in the allocation phase, and the result is double-checked by {{accept}} under the writeLock.

> User limit calculation should be read-lock-protected within LeafQueue
> ---------------------------------------------------------------------
>
> Key: YARN-8657
> URL: https://issues.apache.org/jira/browse/YARN-8657
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacity scheduler
> Reporter: Sumana Sathish
> Assignee: Wangda Tan
> Priority: Critical
> Attachments: YARN-8657.001.patch, YARN-8657.002.patch
>
>
> When async scheduling is enabled, user limit calculation could be wrong:
> It is possible that the scheduler calculated a user_limit, but inside {{canAssignToUser}} it has become stale.
> We need to protect user limit calculation.
[jira] [Commented] (YARN-8657) User limit calculation should be read-lock-protected within LeafQueue
[ https://issues.apache.org/jira/browse/YARN-8657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16913593#comment-16913593 ]

Hadoop QA commented on YARN-8657:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 7s{color} | {color:red} YARN-8657 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-8657 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12941021/YARN-8657.002.patch |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/24610/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (YARN-8657) User limit calculation should be read-lock-protected within LeafQueue
[ https://issues.apache.org/jira/browse/YARN-8657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16913205#comment-16913205 ]

Rohith Sharma K S commented on YARN-8657:

[~cheersyang] [~sunilg] [~leftnoteasy] [~bsteinbach] Any consensus on this?
[jira] [Commented] (YARN-8657) User limit calculation should be read-lock-protected within LeafQueue
[ https://issues.apache.org/jira/browse/YARN-8657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16627123#comment-16627123 ]

Antal Bálint Steinbach commented on YARN-8657:

Hi [~sunilg],

No, in {{canAssignToUserWithCache}} the lock is acquired inside the try block. The point is to move the acquisition before the try block.
[jira] [Commented] (YARN-8657) User limit calculation should be read-lock-protected within LeafQueue
[ https://issues.apache.org/jira/browse/YARN-8657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16626764#comment-16626764 ]

Weiwei Yang commented on YARN-8657:

Hi [~sunilg],

The unit test failure seems to be related to the patch. I ran it locally and it appears to be reproducible. Could you please check?
[jira] [Commented] (YARN-8657) User limit calculation should be read-lock-protected within LeafQueue
[ https://issues.apache.org/jira/browse/YARN-8657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16625814#comment-16625814 ]

Sunil Govindan commented on YARN-8657:

{code:java}
} finally {
  readLock.unlock();
}
{code}
We use the same pattern now in the new method {{canAssignToUserWithCache}}, correct?
[jira] [Commented] (YARN-8657) User limit calculation should be read-lock-protected within LeafQueue
[ https://issues.apache.org/jira/browse/YARN-8657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16625725#comment-16625725 ]

Antal Bálint Steinbach commented on YARN-8657:

Hi [~leftnoteasy],

Thanks for the patch. I ran into a very small issue while reading it. In line 1531
{code:java}
try {
  readLock.lock();
{code}
it is a better pattern to acquire the lock before entering the try block:
{code:java}
readLock.lock();
try {...} finally {
  readLock.unlock();
}
{code}
There are some threads about this on Stack Overflow, for example [https://stackoverflow.com/questions/31058681/java-locking-structure-best-pattern|http://example.com/]

There are more instances of this in the file. I just wanted to raise it while you are modifying the surrounding code.
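The lock-before-try pattern from the comment above can be sketched as follows. This is a minimal illustration and not code from the patch: the class `LockBeforeTrySketch`, its field `userLimit`, and the helper methods are hypothetical names chosen for the example. The reason for the ordering is that if acquisition were to fail inside the try block, the finally block would still run and attempt to release a lock the thread does not hold, and the resulting `IllegalMonitorStateException` could mask the original failure.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a queue guarded by a read/write lock.
class LockBeforeTrySketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final int userLimit = 10; // illustrative value only

  // Recommended shape: acquire first, then enter try, so the finally
  // block only runs once the lock is actually held.
  int readUserLimit() {
    lock.readLock().lock();
    try {
      return userLimit;
    } finally {
      lock.readLock().unlock();
    }
  }

  // Number of read holds currently outstanding; 0 means lock/unlock are balanced.
  int outstandingReadHolds() {
    return lock.getReadLockCount();
  }

  // Shows what a finally block would run into if lock() had failed inside
  // the try: releasing a read lock that the current thread does not hold.
  static boolean unlockWithoutHoldThrows() {
    ReentrantReadWriteLock unheld = new ReentrantReadWriteLock();
    try {
      unheld.readLock().unlock();
      return false;
    } catch (IllegalMonitorStateException expected) {
      return true;
    }
  }

  public static void main(String[] args) {
    LockBeforeTrySketch sketch = new LockBeforeTrySketch();
    System.out.println("userLimit = " + sketch.readUserLimit());
    System.out.println("unlock without hold throws: " + unlockWithoutHoldThrows());
  }
}
```

The same shape applies to the write lock; only the accessor (`writeLock()` instead of `readLock()`) changes.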
[jira] [Commented] (YARN-8657) User limit calculation should be read-lock-protected within LeafQueue
[ https://issues.apache.org/jira/browse/YARN-8657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16625623#comment-16625623 ]

Sunil Govindan commented on YARN-8657:

[~cheersyang] could you please check the latest patch?
[jira] [Commented] (YARN-8657) User limit calculation should be read-lock-protected within LeafQueue
[ https://issues.apache.org/jira/browse/YARN-8657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16625616#comment-16625616 ]

Hadoop QA commented on YARN-8657:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 27s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 32s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 1 new + 42 unchanged - 2 fixed = 43 total (was 44) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 25s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 16s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 32s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}124m 23s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerMultiNodes |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8657 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12941021/YARN-8657.002.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 5f0e73189a29 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 32a35dc |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/21945/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-ser
[jira] [Commented] (YARN-8657) User limit calculation should be read-lock-protected within LeafQueue
[ https://issues.apache.org/jira/browse/YARN-8657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16625443#comment-16625443 ]

Sunil Govindan commented on YARN-8657:

Thanks [~cheersyang]. I will help to rebase this patch.

> User limit calculation should be read-lock-protected within LeafQueue
> ---------------------------------------------------------------------
>
> Key: YARN-8657
> URL: https://issues.apache.org/jira/browse/YARN-8657
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacity scheduler
> Reporter: Sumana Sathish
> Assignee: Wangda Tan
> Priority: Critical
> Attachments: YARN-8657.001.patch
>
>
> When async scheduling is enabled, user limit calculation could be wrong:
> It is possible that the scheduler calculated a user_limit, but inside {{canAssignToUser}} it has become stale.
> We need to protect user limit calculation.
[jira] [Commented] (YARN-8657) User limit calculation should be read-lock-protected within LeafQueue
[ https://issues.apache.org/jira/browse/YARN-8657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16617342#comment-16617342 ]

Weiwei Yang commented on YARN-8657:

[~leftnoteasy], [~sunilg]

Sorry, I only just got time to look at this. The patch looks fine. My only concern is performance. computeUserLimitAndSetHeadroom is called in both the scheduling phase and the commit phase, so even if the user limit is slightly stale in the scheduling phase, the check in the commit phase should guarantee that there are no violations. Adding this sync block may reduce performance as a side effect. If I remember correctly, when we analyzed the conflicts a few months ago, this was not one of the top three causes of conflicts.

BTW, the patch no longer applies to trunk.

Thanks
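The argument above, that a slightly stale check in the scheduling phase is tolerable because the commit phase re-checks, can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the CapacityScheduler implementation: the class `TwoPhaseLimitSketch` and its methods `canAssign`, `commit`, and `setUserLimit` are hypothetical names, and the user limit is reduced to a single integer.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical model of an optimistic check followed by an authoritative re-check.
class TwoPhaseLimitSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private int userLimit;
  private int used;

  TwoPhaseLimitSketch(int userLimit) {
    this.userLimit = userLimit;
  }

  // Scheduling phase: cheap check under the read lock. The answer may be
  // stale by the time the proposal is committed.
  boolean canAssign(int ask) {
    lock.readLock().lock();
    try {
      return used + ask <= userLimit;
    } finally {
      lock.readLock().unlock();
    }
  }

  // Commit phase: re-validate under the write lock and apply the
  // allocation only if the limit still holds.
  boolean commit(int ask) {
    lock.writeLock().lock();
    try {
      if (used + ask > userLimit) {
        return false; // proposal rejected, nothing is changed
      }
      used += ask;
      return true;
    } finally {
      lock.writeLock().unlock();
    }
  }

  // Models a concurrent change of the limit between the two phases.
  void setUserLimit(int newLimit) {
    lock.writeLock().lock();
    try {
      userLimit = newLimit;
    } finally {
      lock.writeLock().unlock();
    }
  }

  int getUsed() {
    lock.readLock().lock();
    try {
      return used;
    } finally {
      lock.readLock().unlock();
    }
  }

  public static void main(String[] args) {
    TwoPhaseLimitSketch queue = new TwoPhaseLimitSketch(10);
    boolean proposed = queue.canAssign(8); // passes against the old limit
    queue.setUserLimit(5);                 // limit shrinks before commit
    boolean committed = queue.commit(8);   // re-check rejects the stale proposal
    System.out.println("proposed=" + proposed + ", committed=" + committed);
  }
}
```

In this model a stale scheduling-phase answer costs a rejected proposal rather than a limit violation, which is the trade-off the comment weighs against the cost of extra locking.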
[jira] [Commented] (YARN-8657) User limit calculation should be read-lock-protected within LeafQueue
[ https://issues.apache.org/jira/browse/YARN-8657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16608717#comment-16608717 ]

Sunil Govindan commented on YARN-8657:

[~leftnoteasy] Patch looks good to me. Will commit tomorrow if no objections.

cc [~cheersyang]
[jira] [Commented] (YARN-8657) User limit calculation should be read-lock-protected within LeafQueue
[ https://issues.apache.org/jira/browse/YARN-8657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16595937#comment-16595937 ]

Sunil Govindan commented on YARN-8657:

I think there are no changes in ordering. Only the annotation was a bit confusing. The existing trunk code was also doing some updates under the readLock, which is out of scope here. I think we can revisit that in another patch.

I'll check this patch in detail and share comments, if any.

cc [~cheersyang]
[jira] [Commented] (YARN-8657) User limit calculation should be read-lock-protected within LeafQueue
[ https://issues.apache.org/jira/browse/YARN-8657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16584238#comment-16584238 ]

Wangda Tan commented on YARN-8657:

[~sunilg], I'm not quite sure whether the patch changed the locking scope of the user limit calculation. It moved some logic from no lock to under the readLock, but I didn't see any read<->write lock changes. Please clarify.
[jira] [Commented] (YARN-8657) User limit calculation should be read-lock-protected within LeafQueue
[ https://issues.apache.org/jira/browse/YARN-8657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16579199#comment-16579199 ]

Sunil Govindan commented on YARN-8657:

Hi [~leftnoteasy]

Thanks for the patch. I have some doubts about it.

{{computeUserLimitAndSetHeadroom}} is invoked under the readLock in a few places and under the writeLock in {{updateClusterResource}}. With this patch, the method is now called under the readLock. However, apart from {{metrics.setAvailableResourcesToUser(nodePartition, user, headroom);}}, all the other setters are under their respective locks. I think this call should be protected as well: I have seen some recent issues in queue metrics, and I suspect this is one reason.

{{computeUserLimitAndSetHeadroom}} also has the following annotation, which I think needs to be revisited:

*@Lock(\{LeafQueue.class})*
[jira] [Commented] (YARN-8657) User limit calculation should be read-lock-protected within LeafQueue
[ https://issues.apache.org/jira/browse/YARN-8657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16579141#comment-16579141 ]

genericqa commented on YARN-8657:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 16s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 38s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 73m 45s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}128m 27s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8657 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12935450/YARN-8657.001.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux d4624b9f03f7 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 08:53:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 2385444 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21591/testReport/ |
| Max. process+thread count | 946 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/21591/console |
| Powered by | Apache Yetus
[jira] [Commented] (YARN-8657) User limit calculation should be read-lock-protected within LeafQueue
[ https://issues.apache.org/jira/browse/YARN-8657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16579053#comment-16579053 ]

Wangda Tan commented on YARN-8657:

[~sunil.gov...@gmail.com], [~cheersyang], could you help review this ticket?