[jira] [Commented] (YARN-6056) Yarn NM using LCE shows a failure when trying to delete a non-existing dir
[ https://issues.apache.org/jira/browse/YARN-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15800714#comment-15800714 ] Varun Saxena commented on YARN-6056: IIUC, the issue here is that the exit code returned from delete-as-user is not successful and hence indicates failure, but we would still continue to delete the other directories in the list. Correct? > Yarn NM using LCE shows a failure when trying to delete a non-existing dir > -- > > Key: YARN-6056 > URL: https://issues.apache.org/jira/browse/YARN-6056 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.6.5 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg > Attachments: YARN-6056-branch-2.6.1.patch > > > As part of YARN-2902 the clean-up of the local directories was changed to > ignore non-existing directories and proceed with the others in the list. This > part of the code change was not backported into branch-2.6; backporting just > that part now. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-6056) Yarn NM using LCE shows a failure when trying to delete a non-existing dir
[ https://issues.apache.org/jira/browse/YARN-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15800714#comment-15800714 ] Varun Saxena edited comment on YARN-6056 at 1/5/17 8:17 AM: [~wilfreds], IIUC, the issue here is that the exit code returned from delete-as-user is not successful and hence indicates failure, but we would still continue to delete the other directories in the list. Correct? was (Author: varun_saxena): IIUC, the issue here is that the exit code returned from delete-as-user is not successful and hence indicates failure, but we would still continue to delete the other directories in the list. Correct? > Yarn NM using LCE shows a failure when trying to delete a non-existing dir > -- > > Key: YARN-6056 > URL: https://issues.apache.org/jira/browse/YARN-6056 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.6.5 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg > Attachments: YARN-6056-branch-2.6.1.patch > > > As part of YARN-2902 the clean-up of the local directories was changed to > ignore non-existing directories and proceed with the others in the list. This > part of the code change was not backported into branch-2.6; backporting just > that part now.
[jira] [Commented] (YARN-5959) RM changes to support change of container ExecutionType
[ https://issues.apache.org/jira/browse/YARN-5959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15800721#comment-15800721 ] Hadoop QA commented on YARN-5959: (/) *+1 overall*
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 13s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 16 new or modified test files. |
| 0 | mvndep | 1m 48s | Maven dependency ordering for branch |
| +1 | mvninstall | 14m 30s | trunk passed |
| +1 | compile | 11m 15s | trunk passed |
| +1 | checkstyle | 2m 2s | trunk passed |
| +1 | mvnsite | 3m 20s | trunk passed |
| +1 | mvneclipse | 2m 14s | trunk passed |
| +1 | findbugs | 5m 32s | trunk passed |
| +1 | javadoc | 2m 33s | trunk passed |
| 0 | mvndep | 0m 18s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 48s | the patch passed |
| +1 | compile | 10m 43s | the patch passed |
| +1 | cc | 10m 43s | the patch passed |
| +1 | javac | 10m 43s | the patch passed |
| -0 | checkstyle | 2m 7s | root: The patch generated 37 new + 1694 unchanged - 18 fixed = 1731 total (was 1712) |
| +1 | mvnsite | 3m 41s | the patch passed |
| +1 | mvneclipse | 2m 29s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 6m 38s | the patch passed |
| +1 | javadoc | 0m 30s | hadoop-yarn-api in the patch passed. |
| +1 | javadoc | 0m 28s | hadoop-yarn-server-common in the patch passed. |
| +1 | javadoc | 0m 34s | hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 0 new + 912 unchanged - 1 fixed = 912 total (was 913) |
| +1 | javadoc | 0m 27s | hadoop-yarn-client in the patch passed. |
| +1 | javadoc | 0m 28s | hadoop-mapreduce-client-app in the patch passed. |
| +1 | javadoc | 0m 26s | hadoop-sls in the patch passed. |
| +1 | unit | 0m 40s | hadoop-yarn-api in the patch passed. |
| +1 | unit | 0m 41s | hadoop-yarn-server-common in the patch passed. |
| +1 | unit | 41m 23s | hadoop-yarn-server-resourcemanager in the patch passed. |
| +1 | unit | 16m 42s | hadoop-yarn-client in the patch passed. |
[jira] [Updated] (YARN-5585) [Atsv2] Reader side changes for entity prefix and support for pagination via additional filters
[ https://issues.apache.org/jira/browse/YARN-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith Sharma K S updated YARN-5585: Attachment: YARN-5585-YARN-5355.0006.patch Updated the patch as per the finalized Javadoc. > [Atsv2] Reader side changes for entity prefix and support for pagination via > additional filters > --- > > Key: YARN-5585 > URL: https://issues.apache.org/jira/browse/YARN-5585 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelinereader >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S >Priority: Critical > Labels: yarn-5355-merge-blocker > Attachments: 0001-YARN-5585.patch, YARN-5585-YARN-5355.0001.patch, > YARN-5585-YARN-5355.0002.patch, YARN-5585-YARN-5355.0003.patch, > YARN-5585-YARN-5355.0004.patch, YARN-5585-YARN-5355.0005.patch, > YARN-5585-YARN-5355.0006.patch, YARN-5585-workaround.patch, YARN-5585.v0.patch > > > The TimelineReader REST APIs provide a lot of filters to retrieve > applications. Along with those, it would be good to add a new filter, i.e. fromId, > so that entities can be retrieved after the fromId. > Current behavior: the default limit is set to 100. If there are 1000 entities, > the REST call gives the first/last 100 entities. How does one retrieve the next set of 100 > entities, i.e. 101 to 200 or 900 to 801? > Example: if applications are stored in the database as app-1 app-2 ... app-10, > *getApps?limit=5* gives app-1 to app-5. But there is > no way to retrieve the next 5 apps. > So the proposal is to have fromId in the filter, like > *getApps?limit=5&&fromId=app-5*, which gives the list of apps from app-6 to > app-10. > Since ATS targets storing a large number of entities, it is a very common > use case to get the next set of entities using fromId rather than querying all > the entities. This is very useful for pagination in the web UI. 
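The fromId pagination proposed in the description can be sketched with a small in-memory model. `FromIdPager` and `page` are illustrative names, not part of the TimelineReader code, and the exclusive-of-fromId semantics follow the example above (fromId=app-5 returns app-6 onwards):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of fromId paging over an ordered list of entity ids, mirroring
// queries like getApps?limit=5&fromId=app-5 from the issue description.
public class FromIdPager {
    /** Returns up to limit ids that follow fromId in the stored order. */
    public static List<String> page(List<String> stored, int limit, String fromId) {
        // Unknown or null fromId starts from the beginning (first page).
        int start = (fromId == null) ? 0 : stored.indexOf(fromId) + 1;
        List<String> out = new ArrayList<>();
        for (int i = start; i < stored.size() && out.size() < limit; i++) {
            out.add(stored.get(i));
        }
        return out;
    }
}
```

A client would take the last id of each returned page and pass it as fromId of the next request, which is what makes the scheme useful for web-UI pagination.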
[jira] [Commented] (YARN-5547) NMLeveldbStateStore should be more tolerant of unknown keys
[ https://issues.apache.org/jira/browse/YARN-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15800932#comment-15800932 ] Ajith S commented on YARN-5547: --- Hi guys, sorry for the delay. [~jlowe], thanks for your comments. You are right, we can avoid storing the killed state for a container which will not be recovered. Also, for deleting the unknown keys, would it be OK to remove them in {{NMLeveldbStateStoreService.loadContainerState(ContainerId, LeveldbIterator, String)}}? As per the patch, this happens after the warning log about the unknown keys. This avoids any extra scan of the store and hence the performance penalty. > NMLeveldbStateStore should be more tolerant of unknown keys > --- > > Key: YARN-5547 > URL: https://issues.apache.org/jira/browse/YARN-5547 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 2.6.0 >Reporter: Jason Lowe >Assignee: Ajith S > Attachments: YARN-5547.01.patch, YARN-5547.02.patch, > YARN-5547.03.patch > > > Whenever new keys are added to the NM state store it will break rolling > downgrades because the code will throw if it encounters an unrecognized key. > If instead it skipped unrecognized keys it could be simpler to continue > supporting rolling downgrades. We need to define the semantics of > unrecognized keys when containers and apps are cleaned up, e.g.: we may want > to delete all keys underneath an app or container directory when it is being > removed from the state store to prevent leaking unrecognized keys.
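The idea discussed above can be sketched as follows: while loading one container's keys, collect any key that is not in the known set and delete it afterwards, instead of throwing. The `TreeMap` stands in for the leveldb iterator, and all names here are illustrative rather than the actual NMLeveldbStateStoreService code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Sketch of tolerant loading: unknown keys under a container's prefix are
// logged/collected and then removed, so a rolling downgrade does not fail
// and no separate full scan of the store is needed later.
public class UnknownKeyCleanup {
    public static List<String> loadAndPrune(TreeMap<String, byte[]> store,
                                            String containerPrefix,
                                            Set<String> knownSuffixes) {
        List<String> unknown = new ArrayList<>();
        // Iterate only the keys under this container's prefix.
        for (Map.Entry<String, byte[]> e :
                 store.subMap(containerPrefix, containerPrefix + "\uffff").entrySet()) {
            String suffix = e.getKey().substring(containerPrefix.length());
            if (!knownSuffixes.contains(suffix)) {
                unknown.add(e.getKey());   // warn and remember, do not throw
            }
        }
        // Remove only the keys already seen during the load pass.
        store.keySet().removeAll(unknown);
        return unknown;
    }
}
```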
[jira] [Commented] (YARN-5585) [Atsv2] Reader side changes for entity prefix and support for pagination via additional filters
[ https://issues.apache.org/jira/browse/YARN-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15800956#comment-15800956 ] Hadoop QA commented on YARN-5585: (x) *-1 overall*
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 33s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
| 0 | mvndep | 1m 15s | Maven dependency ordering for branch |
| +1 | mvninstall | 9m 14s | YARN-5355 passed |
| +1 | compile | 3m 1s | YARN-5355 passed |
| +1 | checkstyle | 0m 47s | YARN-5355 passed |
| +1 | mvnsite | 1m 17s | YARN-5355 passed |
| +1 | mvneclipse | 0m 40s | YARN-5355 passed |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests |
| +1 | findbugs | 1m 31s | YARN-5355 passed |
| +1 | javadoc | 0m 43s | YARN-5355 passed |
| 0 | mvndep | 0m 10s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 57s | the patch passed |
| +1 | compile | 2m 26s | the patch passed |
| +1 | javac | 2m 26s | the patch passed |
| -0 | checkstyle | 0m 37s | hadoop-yarn-project/hadoop-yarn: The patch generated 17 new + 25 unchanged - 13 fixed = 42 total (was 38) |
| +1 | mvnsite | 1m 7s | the patch passed |
| +1 | mvneclipse | 0m 40s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests |
| +1 | findbugs | 1m 50s | the patch passed |
| +1 | javadoc | 0m 41s | the patch passed |
| +1 | unit | 0m 26s | hadoop-yarn-api in the patch passed. |
| +1 | unit | 0m 45s | hadoop-yarn-server-timelineservice in the patch passed. |
| +1 | unit | 4m 55s | hadoop-yarn-server-timelineservice-hbase-tests in the patch passed. |
| +1 | asflicense | 0m 20s | The patch does not generate ASF License warnings. |
| | | 41m 42s | |
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Issue | YARN-5585 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12845753/YARN-5585-YARN-5355.0006.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvns |
[jira] [Updated] (YARN-5554) MoveApplicationAcrossQueues does not check user permission on the target queue
[ https://issues.apache.org/jira/browse/YARN-5554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wilfred Spiegelenburg updated YARN-5554: Attachment: YARN-5554.14.patch Fixed one introduced checkstyle issue and the one remark from the review. > MoveApplicationAcrossQueues does not check user permission on the target queue > -- > > Key: YARN-5554 > URL: https://issues.apache.org/jira/browse/YARN-5554 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: Haibo Chen >Assignee: Wilfred Spiegelenburg > Labels: oct16-medium > Attachments: YARN-5554.10.patch, YARN-5554.11.patch, > YARN-5554.12.patch, YARN-5554.13.patch, YARN-5554.14.patch, > YARN-5554.2.patch, YARN-5554.3.patch, YARN-5554.4.patch, YARN-5554.5.patch, > YARN-5554.6.patch, YARN-5554.7.patch, YARN-5554.8.patch, YARN-5554.9.patch > > > The moveApplicationAcrossQueues operation currently does not check user > permission on the target queue. This incorrectly allows one user to move > his/her own applications to a queue that the user has no access to.
[jira] [Updated] (YARN-6015) AsyncDispatcher thread name can be set to improved debugging
[ https://issues.apache.org/jira/browse/YARN-6015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated YARN-6015: -- Attachment: YARN-6015.02.patch Thanks for the review. I have updated the patch based on your comments; please review. > AsyncDispatcher thread name can be set to improved debugging > > > Key: YARN-6015 > URL: https://issues.apache.org/jira/browse/YARN-6015 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Ajith S >Assignee: Ajith S > Attachments: YARN-6015.01.patch, YARN-6015.02.patch > > > Currently all the running instances of AsyncDispatcher have the same thread name. > To improve debugging, we can have an option to set the thread name.
[jira] [Commented] (YARN-6027) Support fromId for flows/flowrun apps
[ https://issues.apache.org/jira/browse/YARN-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801085#comment-15801085 ] Rohith Sharma K S commented on YARN-6027: - bq. We could consider the fromId as (user + flow), right? The flow entityId has the format *yarn-cluster/148357440/rohithsharmaks@Sleep job*. So, fromId could be the *id* itself. Thoughts? > Support fromId for flows/flowrun apps > - > > Key: YARN-6027 > URL: https://issues.apache.org/jira/browse/YARN-6027 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Labels: yarn-5355-merge-blocker > > In YARN-5585, fromId is supported for retrieving entities. We need a similar > filter for flows/flowRun apps, and for flow run and flow as well. > Along with supporting fromId, this JIRA should also discuss the following points: > * Should we throw an exception for entities/entity retrieval if duplicates are > found? > * TimelineEntity: > ** Should the equals method also check for idPrefix? > ** Is idPrefix part of identifiers?
[jira] [Commented] (YARN-6015) AsyncDispatcher thread name can be set to improved debugging
[ https://issues.apache.org/jira/browse/YARN-6015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801089#comment-15801089 ] Naganarasimha G R commented on YARN-6015: - Thanks for the patch [~ajithshetty]. Overall the modifications look fine; I will wait for the Jenkins run and, if there are no other comments, will commit it. > AsyncDispatcher thread name can be set to improved debugging > > > Key: YARN-6015 > URL: https://issues.apache.org/jira/browse/YARN-6015 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Ajith S >Assignee: Ajith S > Attachments: YARN-6015.01.patch, YARN-6015.02.patch > > > Currently all the running instances of AsyncDispatcher have the same thread name. > To improve debugging, we can have an option to set the thread name.
[jira] [Commented] (YARN-6027) Support fromId for flows/flowrun apps
[ https://issues.apache.org/jira/browse/YARN-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801147#comment-15801147 ] Varun Saxena commented on YARN-6027: Well, we can potentially reuse the ID. However, the query is within the scope of a cluster, and pagination will be within a specific day, so the first 2 parts are somewhat unnecessary. Either way, we may have to consider escaping the delimiters ("/") if they are part of the string itself. The user will also have to be checked for "@", though @ is not allowed in Linux usernames. Assuming the cluster is known by both reader and client and the timestamp will always be a number, both ends can identify how to parse the ID, but some logic will have to be added at both ends. We cannot directly replace the delimiters and construct the row key; this needs to be taken care of. > Support fromId for flows/flowrun apps > - > > Key: YARN-6027 > URL: https://issues.apache.org/jira/browse/YARN-6027 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Labels: yarn-5355-merge-blocker > > In YARN-5585, fromId is supported for retrieving entities. We need a similar > filter for flows/flowRun apps, and for flow run and flow as well. > Along with supporting fromId, this JIRA should also discuss the following points: > * Should we throw an exception for entities/entity retrieval if duplicates are > found? > * TimelineEntity: > ** Should the equals method also check for idPrefix? > ** Is idPrefix part of identifiers?
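The delimiter concern above can be illustrated with a generic escape/split pair: escape "/" (and the escape character itself) before joining the ID parts, and split only on unescaped "/" when parsing. This is a sketch under the assumption of backslash escaping; it is not the actual row-key or reader-side code:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of round-trippable delimiter handling for compound IDs such as
// cluster/timestamp/user@flow, where a part may itself contain "/".
public class DelimiterEscaping {
    /** Escapes backslashes first, then the "/" delimiter. */
    public static String escape(String s) {
        return s.replace("\\", "\\\\").replace("/", "\\/");
    }

    /** Splits on '/' characters that are not preceded by a backslash. */
    public static List<String> split(String joined) {
        List<String> parts = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        for (int i = 0; i < joined.length(); i++) {
            char c = joined.charAt(i);
            if (c == '\\' && i + 1 < joined.length()) {
                cur.append(joined.charAt(++i));   // take the next char literally
            } else if (c == '/') {
                parts.add(cur.toString());
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        parts.add(cur.toString());
        return parts;
    }
}
```

With this scheme, `split(escape(a) + "/" + escape(b))` always recovers the original parts, which is the property a simple `String.split("/")` on the raw ID would lack.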
[jira] [Commented] (YARN-6027) Support fromId for flows/flowrun apps
[ https://issues.apache.org/jira/browse/YARN-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801151#comment-15801151 ] Varun Saxena commented on YARN-6027: However, if I am not wrong, other OSes do allow @ in usernames. > Support fromId for flows/flowrun apps > - > > Key: YARN-6027 > URL: https://issues.apache.org/jira/browse/YARN-6027 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Labels: yarn-5355-merge-blocker > > In YARN-5585, fromId is supported for retrieving entities. We need a similar > filter for flows/flowRun apps, and for flow run and flow as well. > Along with supporting fromId, this JIRA should also discuss the following points: > * Should we throw an exception for entities/entity retrieval if duplicates are > found? > * TimelineEntity: > ** Should the equals method also check for idPrefix? > ** Is idPrefix part of identifiers?
[jira] [Updated] (YARN-5889) Improve user-limit calculation in capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-5889: -- Attachment: YARN-5889.0001.patch Attaching an initial version of the patch as per the discussion. > Improve user-limit calculation in capacity scheduler > > > Key: YARN-5889 > URL: https://issues.apache.org/jira/browse/YARN-5889 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Sunil G >Assignee: Sunil G > Attachments: YARN-5889.0001.patch, YARN-5889.v0.patch, > YARN-5889.v1.patch, YARN-5889.v2.patch > > > Currently the user-limit is computed during every heartbeat allocation cycle with > a write lock. To improve performance, this ticket focuses on moving the > user-limit calculation out of the heartbeat allocation flow.
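The idea in the description, computing the user limit when its inputs change rather than under a write lock on every heartbeat, can be sketched as follows. The formula and all names here are simplified stand-ins, not the CapacityScheduler implementation:

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch: recompute the per-user limit only on infrequent events (app
// added/removed, node added, config refresh) and let the hot per-heartbeat
// allocation path do a lock-free read of the cached value.
public class CachedUserLimit {
    private final AtomicLong cachedLimit = new AtomicLong();

    /** Called from events that can change the limit; simplified formula. */
    public void recompute(long queueCapacity, int activeUsers, float userLimitFactor) {
        long perUser = (long) Math.max(queueCapacity / Math.max(activeUsers, 1),
                                       queueCapacity * userLimitFactor);
        cachedLimit.set(perUser);
    }

    /** Called from the per-heartbeat allocation path: a plain atomic read. */
    public long getUserLimit() {
        return cachedLimit.get();
    }
}
```

The design trade-off is staleness for throughput: the allocation path may briefly see a slightly outdated limit, but it no longer takes the write lock on every heartbeat.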
[jira] [Commented] (YARN-6041) Opportunistic containers : Combined patch for branch-2
[ https://issues.apache.org/jira/browse/YARN-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801234#comment-15801234 ] Hadoop QA commented on YARN-6041: (x) *-1 overall*
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 16s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 40 new or modified test files. |
| 0 | mvndep | 1m 52s | Maven dependency ordering for branch |
| +1 | mvninstall | 8m 49s | branch-2 passed |
| +1 | compile | 6m 28s | branch-2 passed with JDK v1.8.0_111 |
| +1 | compile | 7m 30s | branch-2 passed with JDK v1.7.0_121 |
| +1 | checkstyle | 2m 29s | branch-2 passed |
| +1 | mvnsite | 6m 8s | branch-2 passed |
| +1 | mvneclipse | 3m 8s | branch-2 passed |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site |
| +1 | findbugs | 9m 54s | branch-2 passed |
| +1 | javadoc | 3m 54s | branch-2 passed with JDK v1.8.0_111 |
| +1 | javadoc | 4m 23s | branch-2 passed with JDK v1.7.0_121 |
| 0 | mvndep | 0m 17s | Maven dependency ordering for patch |
| +1 | mvninstall | 5m 4s | the patch passed |
| +1 | compile | 6m 27s | the patch passed with JDK v1.8.0_111 |
| +1 | cc | 6m 27s | the patch passed |
| -1 | javac | 6m 27s | root-jdk1.8.0_111 with JDK v1.8.0_111 generated 1 new + 860 unchanged - 1 fixed = 861 total (was 861) |
| +1 | compile | 7m 10s | the patch passed with JDK v1.7.0_121 |
| +1 | cc | 7m 10s | the patch passed |
| +1 | javac | 7m 10s | the patch passed |
| -0 | checkstyle | 2m 27s | root: The patch generated 19 new + 3469 unchanged - 43 fixed = 3488 total (was 3512) |
| +1 | mvnsite | 6m 32s | the patch passed |
| +1 | mvneclipse | 3m 49s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 4 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | xml | 0m 0s | The patch has no ill-formed XML file. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site |
| +1 | findbugs | 12m 45s | the patch passed |
| -1 | javadoc | | |
[jira] [Commented] (YARN-6041) Opportunistic containers : Combined patch for branch-2
[ https://issues.apache.org/jira/browse/YARN-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801509#comment-15801509 ] Arun Suresh commented on YARN-6041: --- As mentioned earlier, the javac, javadoc and whitespace errors are better left unfixed to retain the style of the existing files. The test case errors are not related. [~kasha] / [~leftnoteasy], do let me know if this is good for check-in. > Opportunistic containers : Combined patch for branch-2 > --- > > Key: YARN-6041 > URL: https://issues.apache.org/jira/browse/YARN-6041 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-6041-branch-2.001.patch, > YARN-6041-branch-2.002.patch, YARN-6041-branch-2.003.patch > > > This is a combined patch targeting branch-2 of the following JIRAs which have > already been committed to trunk : > YARN-5938. Refactoring OpportunisticContainerAllocator to use > SchedulerRequestKey instead of Priority and other misc fixes > YARN-5646. Add documentation and update config parameter names for scheduling > of OPPORTUNISTIC containers. > YARN-5982. Simplify opportunistic container parameters and metrics. > YARN-5918. Handle Opportunistic scheduling allocate request failure when NM > is lost. > YARN-4597. Introduce ContainerScheduler and a SCHEDULED state to NodeManager > container lifecycle. > YARN-5823. Update NMTokens in case of requests with only opportunistic > containers. > YARN-5377. Fix > TestQueuingContainerManager.testKillMultipleOpportunisticContainers. > YARN-2995. Enhance UI to show cluster resource utilization of various > container Execution types. > YARN-5799. Fix Opportunistic Allocation to set the correct value of Node Http > Address. > YARN-5486. Update OpportunisticContainerAllocatorAMService::allocate method > to handle OPPORTUNISTIC container requests. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6056) Yarn NM using LCE shows a failure when trying to delete a non-existing dir
[ https://issues.apache.org/jira/browse/YARN-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801512#comment-15801512 ] Wilfred Spiegelenburg commented on YARN-6056: - correct, if you pass in multiple directories, then a directory in that list which does not exist on the file system should not be fatal. We should not stop processing but should just continue with the next one in the list. In that way a directory that does not exist is not a failed delete. The end result is correct: the directory does not exist (any more) on the FS, and it should thus not be a failure. I am not sure what is going on with the build, but it looks like {{protoc}} failed, which caused a cascading failure. > Yarn NM using LCE shows a failure when trying to delete a non-existing dir > -- > > Key: YARN-6056 > URL: https://issues.apache.org/jira/browse/YARN-6056 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.6.5 >Reporter: Wilfred Spiegelenburg >Assignee: Wilfred Spiegelenburg > Attachments: YARN-6056-branch-2.6.1.patch > > > As part of YARN-2902 the clean up of the local directories was changed to > ignore non existing directories and proceed with others in the list. This > part of the code change was not backported into branch-2.6, backporting just > that part now. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
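The behaviour Wilfred describes above can be sketched as follows. This is a minimal, hypothetical illustration (not the actual LinuxContainerExecutor or container-executor code): a directory that is already absent is treated as a successful delete, and a real failure does not stop the remaining entries from being processed.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

public class DeletionSketch {
    // Returns the number of directories that failed to delete.
    // A directory that does not exist is NOT counted as a failure:
    // the desired end state (the dir is absent from the FS) already holds.
    public static int deleteAll(List<File> dirs) {
        int failures = 0;
        for (File dir : dirs) {
            if (!dir.exists()) {
                continue; // already gone: not fatal, move to the next entry
            }
            if (!dir.delete()) {
                failures++; // record the failure but keep processing the list
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Neither path exists, so this reports zero failed deletes.
        List<File> dirs = Arrays.asList(
            new File("/tmp/does-not-exist-a"),
            new File("/tmp/does-not-exist-b"));
        System.out.println(deleteAll(dirs));
    }
}
```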
[jira] [Commented] (YARN-6015) AsyncDispatcher thread name can be set to improved debugging
[ https://issues.apache.org/jira/browse/YARN-6015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801544#comment-15801544 ] Daniel Templeton commented on YARN-6015: +1 from me. I'll go kick Jenkins since it seems to have gone deaf lately. > AsyncDispatcher thread name can be set to improved debugging > > > Key: YARN-6015 > URL: https://issues.apache.org/jira/browse/YARN-6015 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Ajith S >Assignee: Ajith S > Attachments: YARN-6015.01.patch, YARN-6015.02.patch > > > Currently all the running instances of AsyncDispatcher have same thread name. > To improve debugging, we can have option to set thread name -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5554) MoveApplicationAcrossQueues does not check user permission on the target queue
[ https://issues.apache.org/jira/browse/YARN-5554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801637#comment-15801637 ] Daniel Templeton commented on YARN-5554: In doing a last pass, I have two questions on the test code: # In {{testMoveApplicationSubmitTargetQueue()}} and {{testMoveApplicationAdminTargetQueue()}}, would it make sense to test that the moves that are supposed to work do actually work? # Why a {{ConcurrentHashMap}} in {{createClientRMServiceForMoveApplicationRequest()}} instead of {{Collections.singletonMap()}}? > MoveApplicationAcrossQueues does not check user permission on the target queue > -- > > Key: YARN-5554 > URL: https://issues.apache.org/jira/browse/YARN-5554 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: Haibo Chen >Assignee: Wilfred Spiegelenburg > Labels: oct16-medium > Attachments: YARN-5554.10.patch, YARN-5554.11.patch, > YARN-5554.12.patch, YARN-5554.13.patch, YARN-5554.14.patch, > YARN-5554.2.patch, YARN-5554.3.patch, YARN-5554.4.patch, YARN-5554.5.patch, > YARN-5554.6.patch, YARN-5554.7.patch, YARN-5554.8.patch, YARN-5554.9.patch > > > moveApplicationAcrossQueues operation currently does not check user > permission on the target queue. This incorrectly allows one user to move > his/her own applications to a queue that the user has no access to -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5258) Document Use of Docker with LinuxContainerExecutor
[ https://issues.apache.org/jira/browse/YARN-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801657#comment-15801657 ] Daniel Templeton commented on YARN-5258: [~sidharta-s] or [~vvasudev], any comments? I'd love to get this into 2.8.0. > Document Use of Docker with LinuxContainerExecutor > -- > > Key: YARN-5258 > URL: https://issues.apache.org/jira/browse/YARN-5258 > Project: Hadoop YARN > Issue Type: Sub-task > Components: documentation >Affects Versions: 2.8.0 >Reporter: Daniel Templeton >Assignee: Daniel Templeton >Priority: Critical > Labels: oct16-easy > Attachments: YARN-5258.001.patch, YARN-5258.002.patch, > YARN-5258.003.patch, YARN-5258.004.patch > > > There aren't currently any docs that explain how to configure Docker and all > of its various options aside from reading all of the JIRAs. We need to > document the configuration, use, and troubleshooting, along with helpful > examples. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4990) Re-direction of a particular log file within in a container in NM UI does not redirect properly to Log Server ( history ) on container completion
[ https://issues.apache.org/jira/browse/YARN-4990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801743#comment-15801743 ] Jason Lowe commented on YARN-4990: -- This would be a nice fix to get into 2.8 and seems to be low risk. Any objections? > Re-direction of a particular log file within in a container in NM UI does not > redirect properly to Log Server ( history ) on container completion > - > > Key: YARN-4990 > URL: https://issues.apache.org/jira/browse/YARN-4990 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Hitesh Shah >Assignee: Xuan Gong > Fix For: 2.9.0, 3.0.0-alpha2 > > Attachments: YARN-4990.1.patch, YARN-4990.2.patch > > > The NM does the redirection to the history server correctly. However if the > user is viewing or has a link to a particular specific file, the redirect > ends up going to the top level page for the container and not redirecting to > the specific file. Additionally, the start param to show logs from the offset > 0 also goes missing. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5547) NMLeveldbStateStore should be more tolerant of unknown keys
[ https://issues.apache.org/jira/browse/YARN-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801788#comment-15801788 ] Jason Lowe commented on YARN-5547: -- bq. for deleting the unknown keys, would it be ok to remove unknown keys in NMLeveldbStateStoreService.loadContainerState(ContainerId, LeveldbIterator, String) .? That should be OK as long as we record the container as killed before we remove the unknown keys. When we eventually add the ability to ignore unknown keys without killing the container then it can be problematic. For example: # NM is on version V and is using key K, which is new in version V, that is not deemed critical to the recovery of a running container. # NM is downgraded to version V-1 # On startup, NM with version V-1 deletes the unknown key K for the container but keeps it running because it was deemed safe to ignore in the (yet to be added) state store key descriptor table # With the container still running, NM is upgraded to version V again # Now the container has lost key K yet was started on NM version V and continues to run on NM version V. If we skip the unknown keys that are deemed "safe to ignore" then we can leak per the concern above if the container completes on version V-1. One way to fix that case is to have the NM always try to delete the list of unknown keys in the (yet to be added) safe-to-ignore key descriptor table when the container completes. Should be fine unless that table gets to be particularly large. But we don't have to implement that now, only when we add the ability to ignore unknown keys without killing a container. For the purposes of this JIRA, we will always be killing containers that have unknown keys so it's simpler. 
> NMLeveldbStateStore should be more tolerant of unknown keys > --- > > Key: YARN-5547 > URL: https://issues.apache.org/jira/browse/YARN-5547 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 2.6.0 >Reporter: Jason Lowe >Assignee: Ajith S > Attachments: YARN-5547.01.patch, YARN-5547.02.patch, > YARN-5547.03.patch > > > Whenever new keys are added to the NM state store it will break rolling > downgrades because the code will throw if it encounters an unrecognized key. > If instead it skipped unrecognized keys it could be simpler to continue > supporting rolling downgrades. We need to define the semantics of > unrecognized keys when containers and apps are cleaned up, e.g.: we may want > to delete all keys underneath an app or container directory when it is being > removed from the state store to prevent leaking unrecognized keys. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
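The ordering constraint Jason calls out ("record the container as killed before we remove the unknown keys") can be sketched with a plain in-memory map standing in for the leveldb store. All names here are illustrative, not the actual NMLeveldbStateStoreService API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class UnknownKeySketch {
    // Hypothetical set of key suffixes this NM version understands.
    static final Set<String> KNOWN_SUFFIXES = new HashSet<>(
        Arrays.asList("request", "diagnostics", "launched", "killed", "exitcode"));

    // Scans one container's keys. If any suffix is unrecognized, the
    // container is recorded as killed BEFORE the unknown keys are dropped,
    // so a downgrade/upgrade cycle cannot leave a running container that
    // has silently lost state written by a newer version.
    public static boolean loadContainerState(Map<String, String> store, String prefix) {
        List<String> unknown = new ArrayList<>();
        for (String key : store.keySet()) {
            if (key.startsWith(prefix)
                && !KNOWN_SUFFIXES.contains(key.substring(prefix.length()))) {
                unknown.add(key);
            }
        }
        if (unknown.isEmpty()) {
            return false; // nothing unrecognized, container recovers normally
        }
        store.put(prefix + "killed", "true"); // record the kill first...
        for (String key : unknown) {
            store.remove(key); // ...only then clean up the unknown keys
        }
        return true;
    }
}
```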
[jira] [Commented] (YARN-6015) AsyncDispatcher thread name can be set to improved debugging
[ https://issues.apache.org/jira/browse/YARN-6015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801854#comment-15801854 ] Hadoop QA commented on YARN-6015: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 52s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 1s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 29s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | 
{color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 7s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 1s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 14m 17s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 45m 10s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}120m 58s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Issue | YARN-6015 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12845766/YARN-6015.02.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 1e2c0f9dd246 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / a605ff3 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/14565/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-
[jira] [Commented] (YARN-6050) AMs can't be scheduled on racks or nodes
[ https://issues.apache.org/jira/browse/YARN-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801872#comment-15801872 ] Robert Kanter commented on YARN-6050: - [~leftnoteasy], you're right. I should have changed that when I made the protobuf changes. I'll upload a new patch soon. Though I think we should still throw an exception if there's no ANY request, because otherwise the client will be expecting a specific rack or node, the scheduler won't be doing that, and they'll be left wondering why. An exception with a clear error message makes it more obvious what's happening. > AMs can't be scheduled on racks or nodes > > > Key: YARN-6050 > URL: https://issues.apache.org/jira/browse/YARN-6050 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.9.0, 3.0.0-alpha2 >Reporter: Robert Kanter >Assignee: Robert Kanter > Attachments: YARN-6050.001.patch, YARN-6050.002.patch, > YARN-6050.003.patch > > > Yarn itself supports rack/node aware scheduling for AMs; however, there > currently are two problems: > # To specify hard or soft rack/node requests, you have to specify more than > one {{ResourceRequest}}. For example, if you want to schedule an AM only on > "rackA", you have to create two {{ResourceRequest}}, like this: > {code} > ResourceRequest.newInstance(PRIORITY, ANY, CAPABILITY, NUM_CONTAINERS, false); > ResourceRequest.newInstance(PRIORITY, "rackA", CAPABILITY, NUM_CONTAINERS, > true); > {code} > The problem is that the Yarn API doesn't actually allow you to specify more > than one {{ResourceRequest}} in the {{ApplicationSubmissionContext}}. The > current behavior is to either build one from {{getResource}} or directly from > {{getAMContainerResourceRequest}}, depending on if > {{getAMContainerResourceRequest}} is null or not. We'll need to add a third > method, say {{getAMContainerResourceRequests}}, which takes a list of > {{ResourceRequest}} so that clients can specify the multiple resource > requests. 
> # There are some places where things are hardcoded to overwrite what the > client specifies. These are pretty straightforward to fix. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-6057) yarn.scheduler.minimum-allocation describtion update required
Bibin A Chundatt created YARN-6057: -- Summary: yarn.scheduler.minimum-allocation describtion update required Key: YARN-6057 URL: https://issues.apache.org/jira/browse/YARN-6057 Project: Hadoop YARN Issue Type: Bug Reporter: Bibin A Chundatt Priority: Minor {code} The minimum allocation for every container request at the RM, in terms of virtual CPU cores. Requests lower than this will throw a InvalidResourceRequestException. yarn.scheduler.minimum-allocation-vcores 1 {code} *Requests lower than this will throw a InvalidResourceRequestException.* Only incase of maximum allocation vcore and memory InvalidResourceRequestException is thrown -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6057) yarn.scheduler.minimum-allocation describtion update required
[ https://issues.apache.org/jira/browse/YARN-6057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801897#comment-15801897 ] Bibin A Chundatt commented on YARN-6057: IIUC for minimum allocation the requests gets rounded up to minimum value > yarn.scheduler.minimum-allocation describtion update required > - > > Key: YARN-6057 > URL: https://issues.apache.org/jira/browse/YARN-6057 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Priority: Minor > > {code} > > The minimum allocation for every container request at the RM, > in terms of virtual CPU cores. Requests lower than this will throw a > InvalidResourceRequestException. > yarn.scheduler.minimum-allocation-vcores > 1 > > {code} > *Requests lower than this will throw a InvalidResourceRequestException.* > Only incase of maximum allocation vcore and memory > InvalidResourceRequestException is thrown -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-6057) yarn.scheduler.minimum-allocation describtion update required
[ https://issues.apache.org/jira/browse/YARN-6057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801897#comment-15801897 ] Bibin A Chundatt edited comment on YARN-6057 at 1/5/17 5:15 PM: IIUC for minimum allocation the requests gets roundup to minimum value/ increment resource value was (Author: bibinchundatt): IIUC for minimum allocation the requests gets rounded up to minimum value > yarn.scheduler.minimum-allocation describtion update required > - > > Key: YARN-6057 > URL: https://issues.apache.org/jira/browse/YARN-6057 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Priority: Minor > > {code} > > The minimum allocation for every container request at the RM, > in terms of virtual CPU cores. Requests lower than this will throw a > InvalidResourceRequestException. > yarn.scheduler.minimum-allocation-vcores > 1 > > {code} > *Requests lower than this will throw a InvalidResourceRequestException.* > Only incase of maximum allocation vcore and memory > InvalidResourceRequestException is thrown -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5585) [Atsv2] Reader side changes for entity prefix and support for pagination via additional filters
[ https://issues.apache.org/jira/browse/YARN-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801911#comment-15801911 ] Varun Saxena commented on YARN-5585: Thanks [~rohithsharma] for the latest patch. The patch LGTM. Checkstyle issues are unrelated. Will wait for a day before committing it. > [Atsv2] Reader side changes for entity prefix and support for pagination via > additional filters > --- > > Key: YARN-5585 > URL: https://issues.apache.org/jira/browse/YARN-5585 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelinereader >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S >Priority: Critical > Labels: yarn-5355-merge-blocker > Attachments: 0001-YARN-5585.patch, YARN-5585-YARN-5355.0001.patch, > YARN-5585-YARN-5355.0002.patch, YARN-5585-YARN-5355.0003.patch, > YARN-5585-YARN-5355.0004.patch, YARN-5585-YARN-5355.0005.patch, > YARN-5585-YARN-5355.0006.patch, YARN-5585-workaround.patch, YARN-5585.v0.patch > > > TimelineReader REST API's provides lot of filters to retrieve the > applications. Along with those, it would be good to add new filter i.e fromId > so that entities can be retrieved after the fromId. > Current Behavior : Default limit is set to 100. If there are 1000 entities > then REST call gives first/last 100 entities. How to retrieve next set of 100 > entities i.e 101 to 200 OR 900 to 801? > Example : If applications are stored database, app-1 app-2 ... app-10. > *getApps?limit=5* gives app-1 to app-5. But to retrieve next 5 apps, there is > no way to achieve this. > So proposal is to have fromId in the filter like > *getApps?limit=5&&fromId=app-5* which gives list of apps from app-6 to > app-10. > Since ATS is targeting large number of entities storage, it is very common > use case to get next set of entities using fromId rather than querying all > the entites. This is very useful for pagination in web UI. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
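The fromId paging described in the report above can be sketched generically over an in-memory list (this is an illustration of the idea, not the TimelineReader implementation; following the description's *getApps?limit=5&&fromId=app-5* example, fromId is treated as exclusive here, i.e. results start after the given id):

```java
import java.util.Arrays;
import java.util.List;

public class PagingSketch {
    // Returns up to 'limit' ids that come after 'fromId' in the stored
    // order; a null fromId starts from the beginning of the list.
    public static List<String> page(List<String> ids, String fromId, int limit) {
        int start = 0;
        if (fromId != null) {
            int idx = ids.indexOf(fromId);
            if (idx >= 0) {
                start = idx + 1; // begin just after the supplied id
            }
        }
        int end = Math.min(start + limit, ids.size());
        return ids.subList(start, end);
    }

    public static void main(String[] args) {
        List<String> apps = Arrays.asList("app-1", "app-2", "app-3", "app-4",
            "app-5", "app-6", "app-7", "app-8", "app-9", "app-10");
        // Mirrors getApps?limit=5&fromId=app-5 -> app-6 .. app-10
        System.out.println(page(apps, "app-5", 5));
    }
}
```

The client drives subsequent pages by passing the last id it received, rather than the server scanning all entities up front.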
[jira] [Commented] (YARN-6027) Support fromId for flows/flowrun apps
[ https://issues.apache.org/jira/browse/YARN-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801934#comment-15801934 ] Varun Saxena commented on YARN-6027: By the way, in YARN-5585 we have kept fromId and fromIdPrefix as inclusive. We can't keep fromId here inclusive, right? Can clients determine the next flowId, or should we let the reader side do it? > Support fromId for flows/flowrun apps > - > > Key: YARN-6027 > URL: https://issues.apache.org/jira/browse/YARN-6027 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Labels: yarn-5355-merge-blocker > > In YARN-5585 , fromId is supported for retrieving entities. We need a similar > filter for flows/flowRun apps and flow run and flow as well. > Along with supporting fromId, this JIRA should also discuss following points > * Should we throw an exception for entities/entity retrieval if duplicates > found? > * TimelineEntity : > ** Should equals method also check for idPrefix? > ** Is idPrefix part of identifiers? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6057) yarn.scheduler.minimum-allocation describtion update required
[ https://issues.apache.org/jira/browse/YARN-6057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801933#comment-15801933 ] Daniel Templeton commented on YARN-6057: In the case of both minimum and maximum, resource requests that are out of bounds are quietly adjusted to be in bounds. (See {{DefaultResourceCalculator.normalize()}} and {{DominantResourceCalculator.normalize()}}) The minimum will also prevent NMs that have fewer vcores from starting. (See {{ResourceTrackerService.registerNodeManager()}}.) > yarn.scheduler.minimum-allocation describtion update required > - > > Key: YARN-6057 > URL: https://issues.apache.org/jira/browse/YARN-6057 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Priority: Minor > > {code} > > The minimum allocation for every container request at the RM, > in terms of virtual CPU cores. Requests lower than this will throw a > InvalidResourceRequestException. > yarn.scheduler.minimum-allocation-vcores > 1 > > {code} > *Requests lower than this will throw a InvalidResourceRequestException.* > Only incase of maximum allocation vcore and memory > InvalidResourceRequestException is thrown -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
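The quiet adjustment Daniel describes can be sketched numerically. This is an illustration of the rounding behaviour, not the actual DefaultResourceCalculator code: requests are rounded up to a multiple of the increment and clamped into [minimum, maximum] rather than rejected with an exception.

```java
public class NormalizeSketch {
    // Rounds 'requested' up to a multiple of 'increment', then clamps the
    // result into [minimum, maximum]; out-of-bounds requests are adjusted
    // quietly instead of throwing InvalidResourceRequestException.
    public static int normalize(int requested, int minimum, int maximum, int increment) {
        int rounded = ((Math.max(requested, minimum) + increment - 1) / increment) * increment;
        return Math.min(rounded, maximum);
    }

    public static void main(String[] args) {
        // A request below the minimum is raised to it (100 MB -> 1024 MB)...
        System.out.println(normalize(100, 1024, 8192, 512));
        // ...and a request above the maximum is capped (10000 MB -> 8192 MB).
        System.out.println(normalize(10000, 1024, 8192, 512));
    }
}
```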
[jira] [Assigned] (YARN-6057) yarn.scheduler.minimum-allocation describtion update required
[ https://issues.apache.org/jira/browse/YARN-6057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton reassigned YARN-6057: -- Assignee: Daniel Templeton > yarn.scheduler.minimum-allocation describtion update required > - > > Key: YARN-6057 > URL: https://issues.apache.org/jira/browse/YARN-6057 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Daniel Templeton >Priority: Minor > > {code} > > The minimum allocation for every container request at the RM, > in terms of virtual CPU cores. Requests lower than this will throw a > InvalidResourceRequestException. > yarn.scheduler.minimum-allocation-vcores > 1 > > {code} > *Requests lower than this will throw a InvalidResourceRequestException.* > Only incase of maximum allocation vcore and memory > InvalidResourceRequestException is thrown -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5585) [Atsv2] Reader side changes for entity prefix and support for pagination via additional filters
[ https://issues.apache.org/jira/browse/YARN-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801940#comment-15801940 ] Sangjin Lee commented on YARN-5585: --- +1. Thanks [~rohithsharma]! > [Atsv2] Reader side changes for entity prefix and support for pagination via > additional filters > --- > > Key: YARN-5585 > URL: https://issues.apache.org/jira/browse/YARN-5585 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelinereader >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S >Priority: Critical > Labels: yarn-5355-merge-blocker > Attachments: 0001-YARN-5585.patch, YARN-5585-YARN-5355.0001.patch, > YARN-5585-YARN-5355.0002.patch, YARN-5585-YARN-5355.0003.patch, > YARN-5585-YARN-5355.0004.patch, YARN-5585-YARN-5355.0005.patch, > YARN-5585-YARN-5355.0006.patch, YARN-5585-workaround.patch, YARN-5585.v0.patch > > > TimelineReader REST API's provides lot of filters to retrieve the > applications. Along with those, it would be good to add new filter i.e fromId > so that entities can be retrieved after the fromId. > Current Behavior : Default limit is set to 100. If there are 1000 entities > then REST call gives first/last 100 entities. How to retrieve next set of 100 > entities i.e 101 to 200 OR 900 to 801? > Example : If applications are stored database, app-1 app-2 ... app-10. > *getApps?limit=5* gives app-1 to app-5. But to retrieve next 5 apps, there is > no way to achieve this. > So proposal is to have fromId in the filter like > *getApps?limit=5&&fromId=app-5* which gives list of apps from app-6 to > app-10. > Since ATS is targeting large number of entities storage, it is very common > use case to get next set of entities using fromId rather than querying all > the entites. This is very useful for pagination in web UI. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6057) yarn.scheduler.minimum-allocation-* and yarn.scheduler.maximum-allocation-* descriptions are incorrect about behavior when a request is out of bounds
[ https://issues.apache.org/jira/browse/YARN-6057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated YARN-6057: --- Summary: yarn.scheduler.minimum-allocation-* and yarn.scheduler.maximum-allocation-* descriptions are incorrect about behavior when a request is out of bounds (was: yarn.scheduler.minimum-allocation describtion update required) > yarn.scheduler.minimum-allocation-* and yarn.scheduler.maximum-allocation-* > descriptions are incorrect about behavior when a request is out of bounds > - > > Key: YARN-6057 > URL: https://issues.apache.org/jira/browse/YARN-6057 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Daniel Templeton >Priority: Minor > > {code} > > The minimum allocation for every container request at the RM, > in terms of virtual CPU cores. Requests lower than this will throw a > InvalidResourceRequestException. > yarn.scheduler.minimum-allocation-vcores > 1 > > {code} > *Requests lower than this will throw a InvalidResourceRequestException.* > Only incase of maximum allocation vcore and memory > InvalidResourceRequestException is thrown -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-3955) Support for priority ACLs in CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-3955: -- Attachment: YARN-3955.0008.patch Thanks [~leftnoteasy] for the detailed comments. I have some more doubts here. 1) Common logic of checkAccess / getDefaultPriority can be merged further: both can get approvedPriority first. >> priority acls are stored in ascending order. So for checkAccess, we need to >> see whether ACL match or not and then submitted priority is lesser than >> configure priority. However in case there are no configurations for priority >> ACLs or ACLs are disabled, we still need to say access check is passed. Now >> for default priority, we will loop through all configured priority acls and >> if any ACLs are matching, we try to get max priority all group from which >> default could be taken. Do you mean that below methods also can be made common. {noformat} if (!isACLsEnable) { return true; } List acls = allAcls.get(queueName); if (acls == null || acls.isEmpty()) { return true; } {noformat} There is one issue here. If approvedPriorityACL comes are null, for checkAccess it means false. If we put above code also inside {{getPriorityPerUserACL}}, then we expect to return true if that returns null. Since there is conflict of interest, i pulled it out. May be you could explain a bit further if I missed some. 
2) As I commented above, are the changes to capacity-scheduler.xml related to the patch? I cannot find which module uses acl_access_priority in configuration. If not, could you add the correct default value? >> In {{CapacitySchedulerConfiguration.getAclKey(AccessType acl)}}, we try to >> get the priority ACL config from acl_access_priority, and that is used to >> parse and then populate the internal structures. By default I kept it *, but >> I have given an example as below.
{noformat}
The ACL of who can submit applications with configured priority.
For e.g, [user={name} group={name} max_priority={priority} default_priority={priority}]
{noformat}
3) CapacityScheduler: * updateApplicationPriority should hold writeLock? * Similarly, checkAndGetApplicationPriority should hold readLock? >> Done. Updated in patch. * checkAndGetApplicationPriority: when an app's priority is set to negative, I think we should use 0 instead of max. Thoughts?
{noformat}
if (appPriority.compareTo(getMaxClusterLevelAppPriority()) < 0) {
  appPriority = Priority
      .newInstance(getMaxClusterLevelAppPriority().getPriority());
}
{noformat}
This code will reset to cluster-max priority only if the submitted priority is more than the cluster max. Since I used {{compareTo}}, it does not look very readable. Now to your point, we never worry much about -ve priority as such, since we use priority as an integer. Do you feel we need to make 0 the lowest priority? 4) AppPriorityACLsMgr: * addPrioirityACLs, should we do "replace" instead of "add" to acl groups? If it is not intentional, could you add a test to make sure that update of ACLs works? (like a change from [1,2,3] to [1,3,4]) >> Could I add a clear-and-add model instead? It may be easier. Thoughts? >> Updated the patch as per this. * getPriorityPerUserACL -> getMappedPriorityAclForUGI. >> Done. 5) As I mentioned before, remove the readLock of LQ#getPriorityAcls, final should be enough. >> One doubt here. Since priorityAcls could also be updated in reinitialize, we >> can't make it final, right? 
For example, via refreshQueue's call flow. 6) YarnScheduler: why does the newly added method have SettableFuture in its parameters? It doesn't look very clean ... >> I agree with you. But we are doing the statestore update within the >> scheduler, hence we need to pass the future so that an exception is surfaced >> immediately. That is why we had to add this while doing the move to queue. > Support for priority ACLs in CapacityScheduler > -- > > Key: YARN-3955 > URL: https://issues.apache.org/jira/browse/YARN-3955 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler >Reporter: Sunil G >Assignee: Sunil G > Attachments: ApplicationPriority-ACL.pdf, > ApplicationPriority-ACLs-v2.pdf, YARN-3955.0001.patch, YARN-3955.0002.patch, > YARN-3955.0003.patch, YARN-3955.0004.patch, YARN-3955.0005.patch, > YARN-3955.0006.patch, YARN-3955.0007.patch, YARN-3955.0008.patch, > YARN-3955.v0.patch, YARN-3955.v1.patch, YARN-3955.wip1.patch > > > Support will be added for user-level access permission to use different > application priorities. This is to avoid situations where all users try > running max priority in the cluster and thus degrading the value of > priorities. > Access Control Lists can be set per priority level within each queue. Below > is an example configuration that can be added in capacity scheduler > configuration > file for each Queue level. > y
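The checkAccess / getDefaultPriority split discussed above can be sketched as a small self-contained model: the "ACLs disabled or unconfigured" short-circuit must return true for the access check, while a null result from the shared per-user lookup means "deny". All class and method names here are hypothetical illustrations, not the actual YARN-3955 classes.

```java
import java.util.*;

class PriorityAclModel {
    // One configured ACL entry: which user it covers and the priority bounds.
    static class AclEntry {
        final String user;
        final int maxPriority;
        final int defaultPriority;
        AclEntry(String user, int maxPriority, int defaultPriority) {
            this.user = user;
            this.maxPriority = maxPriority;
            this.defaultPriority = defaultPriority;
        }
    }

    boolean aclsEnabled = true;
    Map<String, List<AclEntry>> allAcls = new HashMap<>();

    // Shared lookup used by both access check and default-priority logic;
    // null means "no matching ACL for this user".
    AclEntry getMappedAclForUser(String queue, String user) {
        List<AclEntry> acls = allAcls.get(queue);
        if (acls == null) {
            return null;
        }
        for (AclEntry e : acls) {
            if (e.user.equals(user)) {
                return e;
            }
        }
        return null;
    }

    boolean checkAccess(String queue, String user, int requestedPriority) {
        // Short-circuit kept outside the shared lookup: disabled or
        // unconfigured ACLs mean the access check passes.
        if (!aclsEnabled) {
            return true;
        }
        List<AclEntry> acls = allAcls.get(queue);
        if (acls == null || acls.isEmpty()) {
            return true;
        }
        // Here a null entry means the user matched no ACL: deny.
        AclEntry e = getMappedAclForUser(queue, user);
        return e != null && requestedPriority <= e.maxPriority;
    }

    public static void main(String[] args) {
        PriorityAclModel m = new PriorityAclModel();
        // No ACLs configured for the queue: everything passes.
        if (!m.checkAccess("root.a", "alice", 5)) throw new AssertionError("unconfigured ACLs must pass");
        m.allAcls.put("root.a", Arrays.asList(new AclEntry("alice", 3, 1)));
        if (!m.checkAccess("root.a", "alice", 3)) throw new AssertionError("within limit must pass");
        if (m.checkAccess("root.a", "alice", 4)) throw new AssertionError("above limit must fail");
        if (m.checkAccess("root.a", "bob", 1)) throw new AssertionError("no matching ACL must fail");
        System.out.println("ok");
    }
}
```

This makes the "conflict of interest" concrete: the same null return value must mean "pass" before ACLs are consulted but "deny" after a match was expected, which is why the short-circuit was pulled out of the shared method.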
[jira] [Commented] (YARN-6056) Yarn NM using LCE shows a failure when trying to delete a non-existing dir
[ https://issues.apache.org/jira/browse/YARN-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801998#comment-15801998 ] Hadoop QA commented on YARN-6056: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 1m 49s{color} | {color:red} root in branch-2.6.1 failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 23s{color} | {color:red} hadoop-yarn-server-nodemanager in branch-2.6.1 failed with JDK v1.8.0_111. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 6s{color} | {color:red} hadoop-yarn-server-nodemanager in branch-2.6.1 failed with JDK v1.7.0_121. {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-server-nodemanager in branch-2.6.1 failed. {color} | | {color:red}-1{color} | {color:red} mvneclipse {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-server-nodemanager in branch-2.6.1 failed. {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 6s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 6s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK v1.8.0_111. 
{color} | | {color:red}-1{color} | {color:red} cc {color} | {color:red} 0m 6s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK v1.8.0_111. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 6s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK v1.8.0_111. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 7s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK v1.7.0_121. {color} | | {color:red}-1{color} | {color:red} cc {color} | {color:red} 0m 7s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK v1.7.0_121. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 7s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK v1.7.0_121. {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 8s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:red}-1{color} | {color:red} mvneclipse {color} | {color:red} 0m 8s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 2s{color} | {color:red} The patch has 765 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 19s{color} | {color:red} The patch 16 line(s) with tabs. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 6s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK v1.7.0_121. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 28s{color} | {color:red} The patch generated 47 ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 5m 2s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:date2017-01-05 | | JIRA Issue | YARN-6056 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12845723/YARN-6056-branch-2.6.1.patch | | Optional Tests | asflicense compile cc mvnsite javac unit | | uname | Linux 71e6b70555b4 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | branch-2.6.1 / 41d19f4 | | Default Java | 1.7.0_121 | | Multi-JDK versions | /usr/lib/jvm/java-8-oracle:1.8.0_111 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_121 | | mvninstall | https://builds.apache.org/job/PreCommit-YARN-Build/14567/artifact/patchprocess/branch-mvninstall-root.txt | | compile | https://builds.apache.org/job/PreCommit-YARN-Build/14567/artifact/patchprocess/branch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop
[jira] [Commented] (YARN-5959) RM changes to support change of container ExecutionType
[ https://issues.apache.org/jira/browse/YARN-5959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802113#comment-15802113 ] Wangda Tan commented on YARN-5959: -- Committing ... > RM changes to support change of container ExecutionType > --- > > Key: YARN-5959 > URL: https://issues.apache.org/jira/browse/YARN-5959 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-5959-YARN-5085.001.patch, > YARN-5959-YARN-5085.002.patch, YARN-5959-YARN-5085.003.patch, > YARN-5959-YARN-5085.004.patch, YARN-5959-YARN-5085.005.patch, > YARN-5959.005.patch, YARN-5959.combined.001.patch, YARN-5959.wip.002.patch, > YARN-5959.wip.003.patch, YARN-5959.wip.patch > > > RM side changes to allow an AM to ask for a change of ExecutionType. > Currently, there are two cases: > # *Promotion* : OPPORTUNISTIC to GUARANTEED. > # *Demotion* : GUARANTEED to OPPORTUNISTIC. > This is similar to YARN-1197, which allows for change of Container resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6050) AMs can't be scheduled on racks or nodes
[ https://issues.apache.org/jira/browse/YARN-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802134#comment-15802134 ] Wangda Tan commented on YARN-6050: -- [~rkanter], bq. Though I think we should still throw an exception if there's no ANY request because otherwise, the client will be expecting a specific rack or node, and it won't be doing that, and they'll be left wondering why. An exception with a clear error message makes it more obvious what's happening. I'm fine with either way, since the change you proposed could be treated as a bug fix instead of an incompatible behavior change. > AMs can't be scheduled on racks or nodes > > > Key: YARN-6050 > URL: https://issues.apache.org/jira/browse/YARN-6050 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.9.0, 3.0.0-alpha2 >Reporter: Robert Kanter >Assignee: Robert Kanter > Attachments: YARN-6050.001.patch, YARN-6050.002.patch, > YARN-6050.003.patch > > > Yarn itself supports rack/node aware scheduling for AMs; however, there > currently are two problems: > # To specify hard or soft rack/node requests, you have to specify more than > one {{ResourceRequest}}. For example, if you want to schedule an AM only on > "rackA", you have to create two {{ResourceRequest}}, like this: > {code} > ResourceRequest.newInstance(PRIORITY, ANY, CAPABILITY, NUM_CONTAINERS, false); > ResourceRequest.newInstance(PRIORITY, "rackA", CAPABILITY, NUM_CONTAINERS, > true); > {code} > The problem is that the Yarn API doesn't actually allow you to specify more > than one {{ResourceRequest}} in the {{ApplicationSubmissionContext}}. The > current behavior is to either build one from {{getResource}} or directly from > {{getAMContainerResourceRequest}}, depending on if > {{getAMContainerResourceRequest}} is null or not. 
We'll need to add a third > method, say {{getAMContainerResourceRequests}}, which takes a list of > {{ResourceRequest}} so that clients can specify the multiple resource > requests. > # There are some places where things are hardcoded to overwrite what the > client specifies. These are pretty straightforward to fix. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
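The validation debated in the comment above — reject an AM request list that has no off-switch ANY ("*") entry so the client fails fast with a clear error — can be sketched as a tiny standalone check. This is a hypothetical illustration over resource names only, not the actual YARN-6050 patch code.

```java
import java.util.Arrays;
import java.util.List;

class AmRequestValidator {
    static final String ANY = "*";

    // Reject empty lists and lists that lack an ANY request: a node- or
    // rack-only request could otherwise never be matched cluster-wide,
    // leaving the client wondering why the AM is never scheduled.
    static void validateAmRequests(List<String> resourceNames) {
        if (resourceNames == null || resourceNames.isEmpty()) {
            throw new IllegalArgumentException("At least one ResourceRequest is required");
        }
        if (!resourceNames.contains(ANY)) {
            throw new IllegalArgumentException("AM ResourceRequests must include an ANY (\"*\") request");
        }
    }

    public static void main(String[] args) {
        // ANY plus a rack preference: valid.
        validateAmRequests(Arrays.asList("*", "rackA"));
        boolean rejected = false;
        try {
            // Rack-only, no ANY request: should be rejected with a clear error.
            validateAmRequests(Arrays.asList("rackA"));
        } catch (IllegalArgumentException e) {
            rejected = true;
        }
        if (!rejected) throw new AssertionError("rack-only request list should be rejected");
        System.out.println("ok");
    }
}
```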
[jira] [Created] (YARN-6058) Support for listing all applications i.e /apps
Rohith Sharma K S created YARN-6058: --- Summary: Support for listing all applications i.e /apps Key: YARN-6058 URL: https://issues.apache.org/jira/browse/YARN-6058 Project: Hadoop YARN Issue Type: Sub-task Components: timelinereader Reporter: Rohith Sharma K S Assignee: Rohith Sharma K S Priority: Critical The primary use case for /apps is that many execution engines run on top of YARN, for example Tez and MR. These engines have their own UIs, which list the specific types of entities published by them, e.g. DAG entities. But these UIs are not aware of the userName, flowName, or applicationId submitted by these engines. Currently, since the user does not know the userName, flowName, or applicationId, he cannot retrieve any entities. By supporting /apps with filters, the user can list applications with a given ApplicationType. These applications can then be used for retrieving engine-specific entities like DAGs. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6040) Remove usage of ResourceRequest from AppSchedulerInfo / SchedulerApplicationAttempt
[ https://issues.apache.org/jira/browse/YARN-6040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-6040: - Attachment: YARN-6040.006.patch Attached 006 patch, rebased to latest trunk. [~asuresh] could you please review? > Remove usage of ResourceRequest from AppSchedulerInfo / > SchedulerApplicationAttempt > --- > > Key: YARN-6040 > URL: https://issues.apache.org/jira/browse/YARN-6040 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-6040.001.patch, YARN-6040.002.patch, > YARN-6040.003.patch, YARN-6040.004.patch, YARN-6040.005.patch, > YARN-6040.006.patch > > > As mentioned by YARN-5906, currently schedulers use ResourceRequest > heavily, so it will be very hard to adopt the new PowerfulResourceRequest > (YARN-4902). > This JIRA is the 2nd step of the refactoring, which removes usage of > ResourceRequest from AppSchedulingInfo / SchedulerApplicationAttempt. Instead > of returning ResourceRequest, it returns a lightweight and API-independent > object - {{PendingAsk}}. > The only remaining ResourceRequest API of AppSchedulingInfo will be used by > the web service to get the list of ResourceRequests. > So after this patch, usage of ResourceRequest will be isolated inside > AppSchedulingInfo, so it will be more flexible to update the internal data > structure and upgrade the old ResourceRequest API to the new one. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
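The "lightweight and API-independent object" idea in the description above can be illustrated with a minimal sketch: only a per-allocation size and an outstanding count, with none of ResourceRequest's locality, relax-locality, or label fields. The field and method names here are assumptions for illustration, not necessarily the exact PendingAsk API.

```java
final class PendingAskSketch {
    private final long perAllocationMemoryMb;
    private final int count;

    PendingAskSketch(long perAllocationMemoryMb, int count) {
        this.perAllocationMemoryMb = perAllocationMemoryMb;
        this.count = count;
    }

    long getPerAllocationMemoryMb() { return perAllocationMemoryMb; }
    int getCount() { return count; }

    // A scheduler hot path only needs "is anything pending?" and
    // "how big is one ask?" - no ResourceRequest details leak through.
    boolean isPending() { return count > 0; }

    public static void main(String[] args) {
        PendingAskSketch ask = new PendingAskSketch(1024, 3);
        if (!ask.isPending() || ask.getCount() != 3) throw new AssertionError("expected 3 pending asks");
        PendingAskSketch none = new PendingAskSketch(1024, 0);
        if (none.isPending()) throw new AssertionError("zero count must not be pending");
        System.out.println("ok");
    }
}
```

Keeping this object immutable and free of API types is what lets the internal data structure change (e.g. toward YARN-4902) without touching scheduler callers.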
[jira] [Resolved] (YARN-5234) ResourceManager REST API missing descriptions for what's returned when using Fair Scheduler
[ https://issues.apache.org/jira/browse/YARN-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Sohn resolved YARN-5234. -- Resolution: Fixed Fix Version/s: 3.0.0-alpha1 Looked at latest docs and this has been addressed. > ResourceManager REST API missing descriptions for what's returned when using > Fair Scheduler > --- > > Key: YARN-5234 > URL: https://issues.apache.org/jira/browse/YARN-5234 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation, fairscheduler, resourcemanager >Reporter: Grant Sohn >Priority: Minor > Fix For: 3.0.0-alpha1 > > > Cluster Scheduler API indicates support for Capacity and Fifo. What's > missing is what would be returned if using Fair scheduling. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6041) Opportunistic containers : Combined patch for branch-2
[ https://issues.apache.org/jira/browse/YARN-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802169#comment-15802169 ] Wangda Tan commented on YARN-6041: -- [~asuresh], How do you plan to merge these changes to branch-2? I think it might be better to cherry-pick them one by one, and file a separate JIRA to address the new comments from [~kasha] and me. Committing this huge patch to branch-2 would create trouble for future maintenance. Thoughts? > Opportunistic containers : Combined patch for branch-2 > --- > > Key: YARN-6041 > URL: https://issues.apache.org/jira/browse/YARN-6041 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-6041-branch-2.001.patch, > YARN-6041-branch-2.002.patch, YARN-6041-branch-2.003.patch > > > This is a combined patch targeting branch-2 of the following JIRAs which have > already been committed to trunk : > YARN-5938. Refactoring OpportunisticContainerAllocator to use > SchedulerRequestKey instead of Priority and other misc fixes > YARN-5646. Add documentation and update config parameter names for scheduling > of OPPORTUNISTIC containers. > YARN-5982. Simplify opportunistic container parameters and metrics. > YARN-5918. Handle Opportunistic scheduling allocate request failure when NM > is lost. > YARN-4597. Introduce ContainerScheduler and a SCHEDULED state to NodeManager > container lifecycle. > YARN-5823. Update NMTokens in case of requests with only opportunistic > containers. > YARN-5377. Fix > TestQueuingContainerManager.testKillMultipleOpportunisticContainers. > YARN-2995. Enhance UI to show cluster resource utilization of various > container Execution types. > YARN-5799. Fix Opportunistic Allocation to set the correct value of Node Http > Address. > YARN-5486. Update OpportunisticContainerAllocatorAMService::allocate method > to handle OPPORTUNISTIC container requests. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5556) Support for deleting queues without requiring a RM restart
[ https://issues.apache.org/jira/browse/YARN-5556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802176#comment-15802176 ] Xuan Gong commented on YARN-5556: - [~Naganarasimha] Thanks for the comments. [~leftnoteasy] Please comment if you have any further suggestions. bq. So if a user needs to delete a queue (say a2), he needs to remove the queue from its parent's "yarn.scheduler.capacity.<queue-path>.queues" config and also mark its state (yarn.scheduler.capacity.<queue-path>.state) as DELETED, right? You do not need to remove the queue from its parent's "yarn.scheduler.capacity.<queue-path>.queues" config; just mark its state (yarn.scheduler.capacity.<queue-path>.state) as DELETED. bq. How to delete intermediate queues? I presume we need NOT configure the state for each of its children, right? Or do we plan to support deletion of only leaf queues? We need not configure the state for each of its children. Just mark delete for the queue itself. bq. Do we need to consider moving queues (along with their apps) from one queue hierarchy to another? IMO it complicates things, but I am not sure about the real-world use cases. We can consider this scenario later. bq. In case of HA, I think it further complicates things: if both RMs are initialized with the old queue settings and the new queue config is then loaded, CS is aware of the deleted queue; but if an RM starts with the updated xml (with the deleted queue), the deleted queue information is not available, and if failover happens to this RM, then apps running on the deleted queue cannot be recovered as the queue doesn't exist. So do we need to start maintaining the deleted queues in the statestore, or do we need handling for creating queue objects for the queues whose state has been marked as deleted (in which case we need to consider the 2nd point)? Yes, this is the fundamental issue with the "configuration-based" approach. The api-based approach would solve this issue: https://issues.apache.org/jira/browse/YARN-5734. 
But for the "configuration-based" approach, in the RM HA case, we have to make sure the configuration file on every RM node is updated. bq. Do we need to consider showing the deleted queues in the webui? Maybe in another jira, but the code needs to be updated. Yes, we could file a separate jira and do it later. The basic workflow could be: before we can actually delete the queue, we should make sure the queue is in STOPPED state, which means this queue cannot accept any new applications, and all apps (including pending requests) have been finished (for now, we could simply wait, or add a command/flag to force kill later). Then, we could delete the queue and split its capacity. Thanks Xuan Gong > Support for deleting queues without requiring a RM restart > -- > > Key: YARN-5556 > URL: https://issues.apache.org/jira/browse/YARN-5556 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Xuan Gong >Assignee: Naganarasimha G R > Attachments: YARN-5556.v1.001.patch, YARN-5556.v1.002.patch, > YARN-5556.v1.003.patch, YARN-5556.v1.004.patch > > > Today, we can add or modify queues without restarting the RM, via a CS > refresh. But to delete a queue, we have to restart the ResourceManager. We > should support deleting queues without requiring a RM restart. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
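Under the proposal discussed in this comment, deleting a queue (say root.a.a2) would mean marking its state rather than removing it from the parent's queue list. A sketch of the capacity-scheduler.xml change, assuming the DELETED state value from the patch under review (not necessarily the semantics that were finally committed):

```xml
<!-- root.a still lists a2; only the state marks it for deletion. -->
<property>
  <name>yarn.scheduler.capacity.root.a.queues</name>
  <value>a1,a2</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.a.a2.state</name>
  <value>DELETED</value>
</property>
```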
[jira] [Commented] (YARN-4990) Re-direction of a particular log file within in a container in NM UI does not redirect properly to Log Server ( history ) on container completion
[ https://issues.apache.org/jira/browse/YARN-4990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802185#comment-15802185 ] Junping Du commented on YARN-4990: -- I am fine with it. > Re-direction of a particular log file within in a container in NM UI does not > redirect properly to Log Server ( history ) on container completion > - > > Key: YARN-4990 > URL: https://issues.apache.org/jira/browse/YARN-4990 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Hitesh Shah >Assignee: Xuan Gong > Fix For: 2.9.0, 3.0.0-alpha2 > > Attachments: YARN-4990.1.patch, YARN-4990.2.patch > > > The NM does the redirection to the history server correctly. However if the > user is viewing or has a link to a particular specific file, the redirect > ends up going to the top level page for the container and not redirecting to > the specific file. Additionally, the start param to show logs from the offset > 0 also goes missing. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6041) Opportunistic containers : Combined patch for branch-2
[ https://issues.apache.org/jira/browse/YARN-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802195#comment-15802195 ] Arun Suresh commented on YARN-6041: --- [~leftnoteasy], So what I planned to do was, like you mentioned, cherry-pick the 10 JIRAs specified in the description (I actually created the patch by doing just that, and then doing a "git diff ").. Then commit JUST the changes Karthik suggested as "YARN-6041: .." which I will cherry-pick onto trunk as well. > Opportunistic containers : Combined patch for branch-2 > --- > > Key: YARN-6041 > URL: https://issues.apache.org/jira/browse/YARN-6041 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-6041-branch-2.001.patch, > YARN-6041-branch-2.002.patch, YARN-6041-branch-2.003.patch > > > This is a combined patch targeting branch-2 of the following JIRAs which have > already been committed to trunk : > YARN-5938. Refactoring OpportunisticContainerAllocator to use > SchedulerRequestKey instead of Priority and other misc fixes > YARN-5646. Add documentation and update config parameter names for scheduling > of OPPORTUNISTIC containers. > YARN-5982. Simplify opportunistic container parameters and metrics. > YARN-5918. Handle Opportunistic scheduling allocate request failure when NM > is lost. > YARN-4597. Introduce ContainerScheduler and a SCHEDULED state to NodeManager > container lifecycle. > YARN-5823. Update NMTokens in case of requests with only opportunistic > containers. > YARN-5377. Fix > TestQueuingContainerManager.testKillMultipleOpportunisticContainers. > YARN-2995. Enhance UI to show cluster resource utilization of various > container Execution types. > YARN-5799. Fix Opportunistic Allocation to set the correct value of Node Http > Address. > YARN-5486. Update OpportunisticContainerAllocatorAMService::allocate method > to handle OPPORTUNISTIC container requests. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6041) Opportunistic containers : Combined patch for branch-2
[ https://issues.apache.org/jira/browse/YARN-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802212#comment-15802212 ] Wangda Tan commented on YARN-6041: -- [~asuresh], so it will generate 10 commits (plus one for suggestions from this JIRA), correct? It will be better to create a separate JIRA to track kasha's suggestions and commit it separately (so we will have a JIRA number) > Opportunistic containers : Combined patch for branch-2 > --- > > Key: YARN-6041 > URL: https://issues.apache.org/jira/browse/YARN-6041 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-6041-branch-2.001.patch, > YARN-6041-branch-2.002.patch, YARN-6041-branch-2.003.patch > > > This is a combined patch targeting branch-2 of the following JIRAs which have > already been committed to trunk : > YARN-5938. Refactoring OpportunisticContainerAllocator to use > SchedulerRequestKey instead of Priority and other misc fixes > YARN-5646. Add documentation and update config parameter names for scheduling > of OPPORTUNISTIC containers. > YARN-5982. Simplify opportunistic container parameters and metrics. > YARN-5918. Handle Opportunistic scheduling allocate request failure when NM > is lost. > YARN-4597. Introduce ContainerScheduler and a SCHEDULED state to NodeManager > container lifecycle. > YARN-5823. Update NMTokens in case of requests with only opportunistic > containers. > YARN-5377. Fix > TestQueuingContainerManager.testKillMultipleOpportunisticContainers. > YARN-2995. Enhance UI to show cluster resource utilization of various > container Execution types. > YARN-5799. Fix Opportunistic Allocation to set the correct value of Node Http > Address. > YARN-5486. Update OpportunisticContainerAllocatorAMService::allocate method > to handle OPPORTUNISTIC container requests. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5959) RM changes to support change of container ExecutionType
[ https://issues.apache.org/jira/browse/YARN-5959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802237#comment-15802237 ] Hudson commented on YARN-5959: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11075 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/11075/]) YARN-5959. RM changes to support change of container ExecutionType. (wangda: rev 0a55bd841ec0f2eb89a0383f4c589526e8b138d4) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/UpdateContainerRequest.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/OpportunisticContainerAllocatorAMService.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/Allocation.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoAppAttempt.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/YarnScheduler.java * (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClientOnRMRestart.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmcontainer/TestRMContainerImpl.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNM.java * (edit) hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/SLSCapacityScheduler.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java * (edit) hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/ResourceSchedulerWrapper.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_service_protos.proto * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/scheduler/OpportunisticContainerContext.java * (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/rm/TestRMContainerAllocator.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java * (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestReservations.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerUtils.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ContainerUpdateType.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java *
[jira] [Updated] (YARN-6050) AMs can't be scheduled on racks or nodes
[ https://issues.apache.org/jira/browse/YARN-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kanter updated YARN-6050: Attachment: YARN-6050.004.patch The 004 patch - Removes the {{getAMContainerResourceRequest}} check that [~leftnoteasy] pointed out - Adds a check that at least one {{ResourceRequest}} or a {{Resource}} is set, plus a test. > AMs can't be scheduled on racks or nodes > > > Key: YARN-6050 > URL: https://issues.apache.org/jira/browse/YARN-6050 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.9.0, 3.0.0-alpha2 >Reporter: Robert Kanter >Assignee: Robert Kanter > Attachments: YARN-6050.001.patch, YARN-6050.002.patch, > YARN-6050.003.patch, YARN-6050.004.patch > > > YARN itself supports rack/node-aware scheduling for AMs; however, there are > currently two problems: > # To specify hard or soft rack/node requests, you have to specify more than > one {{ResourceRequest}}. For example, if you want to schedule an AM only on > "rackA", you have to create two {{ResourceRequest}}s, like this: > {code} > ResourceRequest.newInstance(PRIORITY, ANY, CAPABILITY, NUM_CONTAINERS, false); > ResourceRequest.newInstance(PRIORITY, "rackA", CAPABILITY, NUM_CONTAINERS, > true); > {code} > The problem is that the YARN API doesn't actually allow you to specify more > than one {{ResourceRequest}} in the {{ApplicationSubmissionContext}}. The > current behavior is to either build one from {{getResource}} or take it directly from > {{getAMContainerResourceRequest}}, depending on whether > {{getAMContainerResourceRequest}} is null. We'll need to add a third > method, say {{getAMContainerResourceRequests}}, which takes a list of > {{ResourceRequest}}s so that clients can specify multiple resource > requests. > # There are some places where things are hardcoded to overwrite what the > client specifies. These are pretty straightforward to fix. 
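The null-or-not fallback described above (build a single ANY request from {{getResource}} unless an explicit request is given) extends naturally to a list-based API. A minimal sketch of that selection logic, with a simplified stand-in type rather than the actual YARN {{ResourceRequest}} class:

```java
import java.util.ArrayList;
import java.util.List;

public class AmRequestSelection {
    // Simplified stand-in for YARN's ResourceRequest: resourceName is "*" (ANY),
    // a rack name, or a host name; relaxLocality controls fallback to broader scopes.
    record ResourceRequest(String resourceName, int memoryMb, boolean relaxLocality) {}

    // Sketch of the proposed list-based behavior: prefer the explicit list,
    // otherwise fall back to a single ANY request built from the capability.
    static List<ResourceRequest> amRequests(List<ResourceRequest> explicit, int memoryMb) {
        if (explicit != null && !explicit.isEmpty()) {
            return explicit;
        }
        List<ResourceRequest> fallback = new ArrayList<>();
        fallback.add(new ResourceRequest("*", memoryMb, true));
        return fallback;
    }

    public static void main(String[] args) {
        // A hard rack request needs two entries, mirroring the snippet in the
        // issue: ANY with relaxLocality=false plus the rack itself.
        List<ResourceRequest> rackOnly = List.of(
            new ResourceRequest("*", 1024, false),
            new ResourceRequest("rackA", 1024, true));
        System.out.println(amRequests(rackOnly, 1024).size());            // 2
        System.out.println(amRequests(null, 1024).get(0).resourceName()); // *
    }
}
```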
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5556) Support for deleting queues without requiring a RM restart
[ https://issues.apache.org/jira/browse/YARN-5556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802255#comment-15802255 ] Wangda Tan commented on YARN-5556: -- Just discussed offline with [~xgong]. I think we don't need the additional DELETED state: first, it generates some maintenance overhead, for example we need to maintain state transitions to/from the DELETED state. And since, by design, a queue can be deleted only if it is stopped and has no running apps, the impact of a typo should be minimal. Our preference is to simply remove the queue from the config. As for re-distribution of a stopped/deleted queue's capacity: for a deleted queue it should be obvious, since the queue is gone, the sum of its siblings' capacities should be 100. For a stopped queue, our expectation is that it will be reactivated at some point, so it is better to keep its capacity as-is; the admin can update the max-capacity of its siblings to make sure the queue's capacity can be utilized. I think we need to update the design doc to make it up-to-date. Thoughts? > Support for deleting queues without requiring a RM restart > -- > > Key: YARN-5556 > URL: https://issues.apache.org/jira/browse/YARN-5556 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Xuan Gong >Assignee: Naganarasimha G R > Attachments: YARN-5556.v1.001.patch, YARN-5556.v1.002.patch, > YARN-5556.v1.003.patch, YARN-5556.v1.004.patch > > > Today, we can add or modify queues without restarting the RM, via a CS > refresh. But to delete a queue, we have to restart the ResourceManager. We > could support deleting queues without requiring an RM restart -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
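The invariant discussed above (after a delete, the remaining siblings' capacities must sum to 100) is the kind of check a config refresh would have to enforce. A minimal sketch, with illustrative names rather than CapacityScheduler's actual validation code:

```java
import java.util.HashMap;
import java.util.Map;

public class QueueCapacityCheck {
    // Returns true when the remaining sibling capacities still sum to 100%,
    // i.e. the delete is safe to accept during a refresh.
    static boolean capacitiesValidAfterDelete(Map<String, Double> siblings, String deleted) {
        Map<String, Double> remaining = new HashMap<>(siblings);
        remaining.remove(deleted);
        double sum = remaining.values().stream().mapToDouble(Double::doubleValue).sum();
        return Math.abs(sum - 100.0) < 1e-6;
    }

    public static void main(String[] args) {
        Map<String, Double> queues = new HashMap<>();
        queues.put("a", 30.0);
        queues.put("b", 30.0);
        queues.put("c", 40.0);
        // Deleting "c" without re-distributing its 40% must be rejected.
        System.out.println(capacitiesValidAfterDelete(queues, "c")); // false
        // After the admin moves c's share onto b, the delete is valid.
        queues.put("b", 70.0);
        System.out.println(capacitiesValidAfterDelete(queues, "c")); // true
    }
}
```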
[jira] [Updated] (YARN-5280) Allow YARN containers to run with Java Security Manager
[ https://issues.apache.org/jira/browse/YARN-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Phillips updated YARN-5280: Attachment: YARN-5280.006.patch > Allow YARN containers to run with Java Security Manager > --- > > Key: YARN-5280 > URL: https://issues.apache.org/jira/browse/YARN-5280 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager, yarn >Affects Versions: 2.6.4 >Reporter: Greg Phillips >Assignee: Greg Phillips >Priority: Minor > Labels: oct16-medium > Attachments: YARN-5280.001.patch, YARN-5280.002.patch, > YARN-5280.003.patch, YARN-5280.004.patch, YARN-5280.005.patch, > YARN-5280.006.patch, YARN-5280.patch, YARNContainerSandbox.pdf > > > YARN applications have the ability to perform privileged actions which have > the potential to add instability into the cluster. The Java Security Manager > can be used to prevent users from running privileged actions while still > allowing their core data processing use cases. > Introduce a YARN flag which will allow a Hadoop administrator to enable the > Java Security Manager for user code, while still providing complete > permissions to core Hadoop libraries. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
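A sandbox of this kind is typically driven by a Java policy file: trusted Hadoop code gets full permissions while user code gets a restricted set. The fragment below is only an illustration of that split; the paths and the exact permission set are hypothetical, not the policy shipped with the patch:

```
// Hypothetical java.policy sketch: trusted Hadoop libraries get all
// permissions, while user-submitted container code gets a minimal set.
grant codeBase "file:/opt/hadoop/share/hadoop/-" {
    permission java.security.AllPermission;
};

grant {
    // Illustrative defaults for user code: read-only system properties and
    // outbound connections for data-processing use cases.
    permission java.util.PropertyPermission "*", "read";
    permission java.net.SocketPermission "*", "connect,resolve";
};
```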
[jira] [Commented] (YARN-3955) Support for priority ACLs in CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802298#comment-15802298 ] Hadoop QA commented on YARN-3955: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 57s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 4s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 26s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | 
{color:green} 1m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 9m 37s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 42s{color} | {color:orange} root: The patch generated 13 new + 484 unchanged - 2 fixed = 497 total (was 486) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 11s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 15s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 31s{color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 1 new + 913 unchanged - 0 fixed = 914 total (was 913) {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 30s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 39m 37s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 5s{color} | {color:green} hadoop-sls in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 38s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}124m 58s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Issue | YARN-3955 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12845827/YARN-3955.0008.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml | | uname | Linux 25481fef2165 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / a605ff3 | | Default Java | 1.8.0_111
[jira] [Updated] (YARN-5964) Lower the granularity of locks in FairScheduler
[ https://issues.apache.org/jira/browse/YARN-5964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-5964: - Fix Version/s: (was: 2.7.1) > Lower the granularity of locks in FairScheduler > --- > > Key: YARN-5964 > URL: https://issues.apache.org/jira/browse/YARN-5964 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 2.7.1 > Environment: CentOS-7.1 >Reporter: zhengchenyu >Priority: Critical > Original Estimate: 2m > Remaining Estimate: 2m > > When too many applications are running, we found that clients couldn't submit > applications and the call queue length on port 8032 was high. I captured a jstack of > the ResourceManager when the call queue length was too high. I found that the > "IPC Server handler xxx on 8032" threads were waiting for the object lock of the > FairScheduler while nodeUpdate held it. The long processing time likely explains > why clients can't submit applications. > Here I don't address the submission problem itself, only the performance of the > FairScheduler. We can see that too many functions need the object lock; its > granularity is too coarse. For example, nodeUpdate and getAppWeight contend for > the same object lock, which is unreasonable and inefficient. I recommend replacing > the current coarse lock with finer-grained locks. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
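Finer-grained locking of the kind proposed here usually means replacing a coarse {{synchronized}} monitor with a read/write lock, so that read-mostly calls like getAppWeight don't queue behind nodeUpdate. A minimal sketch; the field and method names are illustrative, not FairScheduler's actual code:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class FineGrainedScheduler {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private double appWeight = 1.0;

    // Read-mostly path: many reader threads can proceed concurrently.
    double getAppWeight() {
        lock.readLock().lock();
        try {
            return appWeight;
        } finally {
            lock.readLock().unlock();
        }
    }

    // Mutating path (e.g. a node update): requires exclusive access.
    void nodeUpdate(double newWeight) {
        lock.writeLock().lock();
        try {
            appWeight = newWeight;
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        FineGrainedScheduler s = new FineGrainedScheduler();
        s.nodeUpdate(2.5);
        System.out.println(s.getAppWeight()); // 2.5
    }
}
```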
[jira] [Commented] (YARN-6032) SharedCacheManager cleaner task should rm InMemorySCMStore some cachedResources which does not exists in hdfs fs
[ https://issues.apache.org/jira/browse/YARN-6032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802329#comment-15802329 ] Junping Du commented on YARN-6032: -- Removed the fix version as the JIRA hasn't been resolved. > SharedCacheManager cleaner task should rm InMemorySCMStore some > cachedResources which does not exists in hdfs fs > - > > Key: YARN-6032 > URL: https://issues.apache.org/jira/browse/YARN-6032 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: Zhaofei Meng > > If cached resources exist in the SCM but not in HDFS, they will not be removed > from the SCM until the SCM is restarted. So we should add a check to the cleaner > task that removes cached resources which no longer exist in HDFS. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
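The proposed check boils down to: on each cleaner pass, drop store entries whose backing file no longer exists. The sketch below simulates the store and the filesystem with plain collections; the real code would consult the SCM store and HDFS, so all names here are illustrative:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class CleanerSketch {
    // Remove store entries (checksum -> path) whose path is no longer
    // present in the (simulated) filesystem.
    static void cleanStaleEntries(Map<String, String> store, Set<String> fsPaths) {
        store.entrySet().removeIf(e -> !fsPaths.contains(e.getValue()));
    }

    public static void main(String[] args) {
        Map<String, String> store = new HashMap<>();
        store.put("checksumA", "/sharedcache/a/jarA.jar");
        store.put("checksumB", "/sharedcache/b/jarB.jar");
        // Only jarA still exists on the simulated filesystem.
        Set<String> fs = new HashSet<>();
        fs.add("/sharedcache/a/jarA.jar");
        cleanStaleEntries(store, fs);
        System.out.println(store.keySet()); // [checksumA]
    }
}
```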
[jira] [Comment Edited] (YARN-3955) Support for priority ACLs in CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802330#comment-15802330 ] Wangda Tan edited comment on YARN-3955 at 1/5/17 7:45 PM: -- 1) bq. There is one issue here. If approvedPriorityACL comes are null, for checkAccess it means false. Ok gotcha, my bad, we cannot merge the two. 2) Got it, not related to your patch. The previous design of "acl-key" is bad; it will be hard to find which code path uses it... In addition, I didn't see a test case that parses raw priority ACLs (string) into a List of PriorityACLGroup. Could you point me to any existing test cases? A few renaming suggestions: - PriorityACLConfiguration \-> AppPriorityACLConfigurationParser (I was trying to find where the parser code is, and since we're adding queue priority in YARN-5864, it will be better to add an App\- prefix to distinguish them) - Similarly, PriorityAclConfig -> AppPriorityACLOwnerType (or any better name?) - PriorityACLGroup -> AppPriorityACLGroup - Do you think it is better to rename acl_access_priority to acl_app_max_priority? 3) bq. This code will reset to cluster-max priority only if submitted priority is more than cluster max. Since I used compareTo, it not looks very readable. Yeah, since we're using Priority in different ways, sometimes lower is more important and sometimes higher is more important. Could you use ">" to do the comparison? bq. checkAndGetApplicationPriority: when an app's priority set to negative, I think we should use 0 instead of max. Thoughts? Negative values look fine, since an app can set a lower priority if needed. 4) bq. Could I add a clear model. It may be more easy. Thoughts? Updated patch as per this. Not quite sure what you meant. From my understanding, the existing logic reads ACLs from the configs during refreshQueues, and what we need to do is replace all ACLs instead of appending to the previous ACL list, correct? bq. One doubt here. Since priorityAcls could also be updated in reinitialize, we can’t make it as final rt. refreshQueue’s call flow for eg. Since the returned list can be modified by another thread, the readLock alone cannot provide enough protection. The better way might be readLock + copyList. bq. But we are doing statestore update within scheduler. Hence we need to pass future to see exception is thrown immediately. Hence we had to add this while doing move to queue. Makes sense. was (Author: leftnoteasy): 1) bq. There is one issue here. If approvedPriorityACL comes are null, for checkAccess it means false. Ok gotcha, my bad, we cannot merge the two. 2) Got it, not related to your patch. The previous design of "acl-key" is bad; it will be hard to find which code path uses it... In addition, I didn't see a test case that parses raw priority ACLs (string) into a List of PriorityACLGroup. Could you point me to any existing test cases? A few renaming suggestions: - PriorityACLConfiguration -> AppPriorityACLConfigurationParser (I was trying to find where the parser code is, and since we're adding queue priority in YARN-5864, it will be better to add an App- prefix to distinguish them) - Similarly, PriorityAclConfig -> AppPriorityACLOwnerType (or any better name?) - PriorityACLGroup -> AppPriorityACLGroup - Do you think it is better to rename acl_access_priority to acl_app_max_priority? 3) bq. This code will reset to cluster-max priority only if submitted priority is more than cluster max. Since I used compareTo, it not looks very readable. Yeah, since we're using Priority in different ways, sometimes lower is more important and sometimes higher is more important. Could you use ">" to do the comparison? bq. checkAndGetApplicationPriority: when an app's priority set to negative, I think we should use 0 instead of max. Thoughts? Negative values look fine, since an app can set a lower priority if needed. 4) bq. Could I add a clear model. It may be more easy. Thoughts? Updated patch as per this. Not quite sure what you meant. From my understanding, the existing logic reads ACLs from the configs during refreshQueues, and what we need to do is replace all ACLs instead of appending to the previous ACL list, correct? bq. One doubt here. Since priorityAcls could also be updated in reinitialize, we can’t make it as final rt. refreshQueue’s call flow for eg. Since the returned list can be modified by another thread, the readLock alone cannot provide enough protection. The better way might be readLock + copyList. bq. But we are doing statestore update within scheduler. Hence we need to pass future to see exception is thrown immediately. Hence we had to add this while doing move to queue. Makes sense. > Support for priority ACLs in CapacityScheduler > -- > > Key: YARN-3955 > URL: https://issues.apache.org/jira/browse/YARN-3955 > Pro
[jira] [Updated] (YARN-6032) SharedCacheManager cleaner task should rm InMemorySCMStore some cachedResources which does not exists in hdfs fs
[ https://issues.apache.org/jira/browse/YARN-6032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-6032: - Fix Version/s: (was: 2.7.1) > SharedCacheManager cleaner task should rm InMemorySCMStore some > cachedResources which does not exists in hdfs fs > - > > Key: YARN-6032 > URL: https://issues.apache.org/jira/browse/YARN-6032 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: Zhaofei Meng > > If cacheresources exist in scm but not exist in hdfs,the cacheresources > whill not rm from scm until restart scm.So we shoult add check funcion in > cleaner task that rm the cachedResources which does not exists in hdfs fs. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-3955) Support for priority ACLs in CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802330#comment-15802330 ] Wangda Tan commented on YARN-3955: -- 1) bq. There is one issue here. If approvedPriorityACL comes are null, for checkAccess it means false. Ok gotcha, my bad, we cannot merge the two. 2) Got it, not related to your patch. The previous design of "acl-key" is bad; it will be hard to find which code path uses it... In addition, I didn't see a test case that parses raw priority ACLs (string) into a List of PriorityACLGroup. Could you point me to any existing test cases? A few renaming suggestions: - PriorityACLConfiguration -> AppPriorityACLConfigurationParser (I was trying to find where the parser code is, and since we're adding queue priority in YARN-5864, it will be better to add an App- prefix to distinguish them) - Similarly, PriorityAclConfig -> AppPriorityACLOwnerType (or any better name?) - PriorityACLGroup -> AppPriorityACLGroup - Do you think it is better to rename acl_access_priority to acl_app_max_priority? 3) bq. This code will reset to cluster-max priority only if submitted priority is more than cluster max. Since I used compareTo, it not looks very readable. Yeah, since we're using Priority in different ways, sometimes lower is more important and sometimes higher is more important. Could you use ">" to do the comparison? bq. checkAndGetApplicationPriority: when an app's priority set to negative, I think we should use 0 instead of max. Thoughts? Negative values look fine, since an app can set a lower priority if needed. 4) bq. Could I add a clear model. It may be more easy. Thoughts? Updated patch as per this. Not quite sure what you meant. From my understanding, the existing logic reads ACLs from the configs during refreshQueues, and what we need to do is replace all ACLs instead of appending to the previous ACL list, correct? bq. One doubt here. Since priorityAcls could also be updated in reinitialize, we can’t make it as final rt. refreshQueue’s call flow for eg. Since the returned list can be modified by another thread, the readLock alone cannot provide enough protection. The better way might be readLock + copyList. bq. But we are doing statestore update within scheduler. Hence we need to pass future to see exception is thrown immediately. Hence we had to add this while doing move to queue. Makes sense. > Support for priority ACLs in CapacityScheduler > -- > > Key: YARN-3955 > URL: https://issues.apache.org/jira/browse/YARN-3955 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler >Reporter: Sunil G >Assignee: Sunil G > Attachments: ApplicationPriority-ACL.pdf, > ApplicationPriority-ACLs-v2.pdf, YARN-3955.0001.patch, YARN-3955.0002.patch, > YARN-3955.0003.patch, YARN-3955.0004.patch, YARN-3955.0005.patch, > YARN-3955.0006.patch, YARN-3955.0007.patch, YARN-3955.0008.patch, > YARN-3955.v0.patch, YARN-3955.v1.patch, YARN-3955.wip1.patch > > > Support will be added for user-level access permission to use different > application priorities. This is to avoid situations where all users try > running at max priority in the cluster, thus degrading the value of > priorities. > Access Control Lists can be set per priority level within each queue. Below > is an example configuration that can be added in the capacity scheduler > configuration > file for each queue level. > yarn.scheduler.capacity.root...acl=user1,user2 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
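The "readLock + copyList" pattern discussed above returns a defensive copy made while holding the read lock, so callers never observe a list that a concurrent refresh is mutating. A minimal sketch with illustrative names, not the actual CapacityScheduler code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class PriorityAclHolder {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private List<String> priorityAcls = new ArrayList<>();

    // A refresh (e.g. reinitialize/refreshQueues) replaces the whole list
    // under the write lock.
    void refresh(List<String> newAcls) {
        lock.writeLock().lock();
        try {
            priorityAcls = new ArrayList<>(newAcls);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Readers get a copy made under the read lock; later refreshes cannot
    // mutate the list a caller is iterating over.
    List<String> getPriorityAcls() {
        lock.readLock().lock();
        try {
            return new ArrayList<>(priorityAcls);
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        PriorityAclHolder h = new PriorityAclHolder();
        h.refresh(List.of("user1", "user2"));
        List<String> snapshot = h.getPriorityAcls();
        h.refresh(List.of("user3"));
        System.out.println(snapshot);           // [user1, user2]
        System.out.println(h.getPriorityAcls()); // [user3]
    }
}
```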
[jira] [Commented] (YARN-5831) Propagate allowPreemptionFrom flag all the way down to the app
[ https://issues.apache.org/jira/browse/YARN-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802328#comment-15802328 ] Hadoop QA commented on YARN-5831: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 35s{color} | 
{color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 21s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 1 new + 46 unchanged - 0 fixed = 47 total (was 46) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 16s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 21s{color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 1 new + 913 unchanged - 0 fixed = 914 total (was 913) {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 41m 34s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 66m 18s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Issue | YARN-5831 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12845682/YARN-5831.003.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 32d87da1364c 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 0a55bd8 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/14568/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | javadoc | https://builds.apache.org/job/PreCommit-YARN-Build/14568/artifact/patchprocess/diff-javadoc-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/14568/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https
[jira] [Commented] (YARN-5955) Use threadpool or multiple thread to recover app
[ https://issues.apache.org/jira/browse/YARN-5955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802336#comment-15802336 ] Junping Du commented on YARN-5955: -- Removed the fix version as the JIRA hasn't been resolved. > Use threadpool or multiple thread to recover app > > > Key: YARN-5955 > URL: https://issues.apache.org/jira/browse/YARN-5955 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.7.1 >Reporter: Zhaofei Meng >Assignee: Ajith S > > Current app recovery is done one by one; using a thread pool can make recovery faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
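Parallel recovery as proposed usually means submitting one recovery task per stored application to a fixed-size pool and waiting for all of them, instead of a sequential loop. A sketch with illustrative names; the `recovered.add(app)` call stands in for the actual per-app recovery work:

```java
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelRecovery {
    public static void main(String[] args) throws InterruptedException {
        List<String> storedApps = List.of("app_1", "app_2", "app_3", "app_4");
        ConcurrentLinkedQueue<String> recovered = new ConcurrentLinkedQueue<>();

        // Fixed-size pool instead of the sequential one-by-one loop.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (String app : storedApps) {
            pool.submit(() -> recovered.add(app)); // stand-in for recoverApplication(app)
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);

        System.out.println(recovered.size()); // 4
    }
}
```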
[jira] [Updated] (YARN-5936) when cpu strict mode is closed, yarn couldn't assure scheduling fairness between containers
[ https://issues.apache.org/jira/browse/YARN-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-5936: - Fix Version/s: (was: 2.7.1) > when cpu strict mode is closed, yarn couldn't assure scheduling fairness > between containers > --- > > Key: YARN-5936 > URL: https://issues.apache.org/jira/browse/YARN-5936 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.7.1 > Environment: CentOS7.1 >Reporter: zhengchenyu >Priority: Critical > Original Estimate: 1m > Remaining Estimate: 1m > > When using LinuxContainer, setting > "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage" to > true can assure scheduling fairness via the cgroup CPU bandwidth control. But > the cgroup CPU bandwidth control leads to bad performance in our experience. > Without it, cgroup cpu.share is our only way to > assure scheduling fairness, but it is not completely effective. For example, > if there are two containers with the same vcores (meaning the same cpu.share), one > single-threaded and the other multi-threaded, the > multi-threaded one will get more CPU time. It's unreasonable! > Here is my test case: I submitted two distributedshell applications. 
The two > commands are below: > {code} > hadoop jar > share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar > org.apache.hadoop.yarn.applications.distributedshell.Client -jar > share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar > -shell_script ./run.sh -shell_args 10 -num_containers 1 -container_memory > 1024 -container_vcores 1 -master_memory 1024 -master_vcores 1 -priority 10 > hadoop jar > share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar > org.apache.hadoop.yarn.applications.distributedshell.Client -jar > share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar > -shell_script ./run.sh -shell_args 1 -num_containers 1 -container_memory > 1024 -container_vcores 1 -master_memory 1024 -master_vcores 1 -priority 10 > {code} > Here is the CPU time of the two containers: > {code} > PID USER PR NIVIRTRESSHR S %CPU %MEM TIME+ COMMAND > 15448 yarn 20 0 9059592 28336 9180 S 998.7 0.1 24:09.30 java > 15026 yarn 20 0 9050340 27480 9188 S 100.0 0.1 3:33.97 java > 13767 yarn 20 0 1799816 381208 18528 S 4.6 1.2 0:30.55 java >77 root rt 0 0 0 0 S 0.3 0.0 0:00.74 > migration/1 > {code} > We find the CPU time of the multi-threaded container is ten times that of the > single-threaded one, though the two containers have the same cpu.share. 
> notes: > run.sh > {code} > java -cp /home/yarn/loop.jar:$CLASSPATH loop.loop $1 > {code} > loop.java > {code} > package loop; > public class loop { > public static void main(String[] args) { > int loop = 1; > if(args.length>=1) { > System.out.println(args[0]); > loop = Integer.parseInt(args[0]); > } > for(int i=0;i<loop;i++) { > System.out.println("start thread " + i); > new Thread(new Runnable() { > @Override > public void run() { > int j=0; > while(true){j++;} > } > }).start(); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5955) Use threadpool or multiple thread to recover app
[ https://issues.apache.org/jira/browse/YARN-5955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-5955: - Fix Version/s: (was: 2.7.1) > Use threadpool or multiple thread to recover app > > > Key: YARN-5955 > URL: https://issues.apache.org/jira/browse/YARN-5955 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.7.1 >Reporter: Zhaofei Meng >Assignee: Ajith S > > Current app recovery is done one application at a time; using a thread pool could make recovery faster.
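The serial recovery loop described above could be parallelised with a fixed-size pool along these lines. This is only a sketch: `ParallelRecovery` and `recoverApplication` are stand-ins for the real RM recovery path, and ordering and error handling in the actual RMAppManager would need careful thought.

```java
import java.util.*;
import java.util.concurrent.*;

// Sketch only: recoverApplication() stands in for the per-app recovery
// work the RM currently does in a serial loop.
class ParallelRecovery {
    static final Set<String> recovered = ConcurrentHashMap.newKeySet();

    static void recoverApplication(String appId) {
        recovered.add(appId); // real code would rebuild the RMApp state here
    }

    static void recoverAll(List<String> appIds, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String id : appIds) {
            pool.execute(() -> recoverApplication(id));
        }
        pool.shutdown();
        try {
            // The RM must not start serving requests until every app is restored.
            pool.awaitTermination(10, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The blocking `awaitTermination` preserves today's contract that recovery finishes before the scheduler starts; only the per-app work runs concurrently.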
[jira] [Updated] (YARN-5846) Improve the fairscheduler attemptScheduler
[ https://issues.apache.org/jira/browse/YARN-5846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-5846: - Fix Version/s: (was: 2.7.1) > Improve the fairscheduler attemptScheduler > --- > > Key: YARN-5846 > URL: https://issues.apache.org/jira/browse/YARN-5846 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 2.7.1 > Environment: CentOS-7.1 >Reporter: zhengchenyu >Priority: Critical > Labels: fairscheduler > Original Estimate: 1m > Remaining Estimate: 1m > > When we assign a container, we must consider two factors: > (1) sort the queues and applications, and select the proper request; > (2) then ensure this request's host is exactly this node (data locality), > or skip this loop! > This algorithm treats the sorting of queues and applications as the primary factor. > When YARN considers data locality, for example, > yarn.scheduler.fair.locality.threshold.node=1, > yarn.scheduler.fair.locality.threshold.rack=1 (or > yarn.scheduler.fair.locality-delay-rack-ms and > yarn.scheduler.fair.locality-delay-node-ms are very large) and lots of > applications are running, the process of assigning containers becomes very slow. > I think data locality is more important than the order of queues and > applications. > I want a new algorithm like this: > (1) when the resourcemanager accepts a new request, notify the RMNodeImpl > and record the association between the RMNode and the request; > (2) when assigning containers for a node, assign containers directly from the > RMNodeImpl's association between the RMNode and requests; > (3) then consider the priority of queues and applications: in each object > of RMNodeImpl, we sort the associated requests; > (4) the sorting in the current algorithm is also costly, especially when > lots of applications are running and lots of sorting is done, so I > think we should sort the queues and applications in a daemon thread, because > some error in the queues' order is tolerable. 
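The proposal in steps (1)-(3) above amounts to maintaining an inverted index from node to outstanding requests, consulted on each node heartbeat. A minimal sketch — `NodeRequestIndex` and its methods are hypothetical illustration, not existing YARN classes:

```java
import java.util.*;

// Hypothetical index from host to the requests that asked for that host.
class NodeRequestIndex {
    private final Map<String, Deque<String>> requestsByHost = new HashMap<>();

    // Step (1): when a request arrives, record it against the host it named.
    void onRequestArrived(String host, String requestId) {
        requestsByHost.computeIfAbsent(host, h -> new ArrayDeque<>()).add(requestId);
    }

    // Step (2): on a node heartbeat, serve that node's own requests directly
    // instead of re-sorting every queue and application first.
    String nextRequestFor(String host) {
        Deque<String> q = requestsByHost.get(host);
        return (q == null || q.isEmpty()) ? null : q.poll();
    }
}
```

Step (3) would replace the FIFO deque with a priority ordering per node, which is the part the proposal suggests maintaining from a background thread.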
[jira] [Commented] (YARN-5846) Improve the fairscheduler attemptScheduler
[ https://issues.apache.org/jira/browse/YARN-5846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802340#comment-15802340 ] Junping Du commented on YARN-5846: -- Hi, we shouldn't set the fix version here unless the commit gets checked in. > Improve the fairscheduler attemptScheduler > --- > > Key: YARN-5846 > URL: https://issues.apache.org/jira/browse/YARN-5846 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 2.7.1 > Environment: CentOS-7.1 >Reporter: zhengchenyu >Priority: Critical > Labels: fairscheduler > Original Estimate: 1m > Remaining Estimate: 1m > > When we assign a container, we must consider two factors: > (1) sort the queues and applications, and select the proper request; > (2) then ensure this request's host is exactly this node (data locality), > or skip this loop! > This algorithm treats the sorting of queues and applications as the primary factor. > When YARN considers data locality, for example, > yarn.scheduler.fair.locality.threshold.node=1, > yarn.scheduler.fair.locality.threshold.rack=1 (or > yarn.scheduler.fair.locality-delay-rack-ms and > yarn.scheduler.fair.locality-delay-node-ms are very large) and lots of > applications are running, the process of assigning containers becomes very slow. > I think data locality is more important than the order of queues and > applications. > I want a new algorithm like this: > (1) when the resourcemanager accepts a new request, notify the RMNodeImpl > and record the association between the RMNode and the request; > (2) when assigning containers for a node, assign containers directly from the > RMNodeImpl's association between the RMNode and requests; > (3) then consider the priority of queues and applications: in each object > of RMNodeImpl, we sort the associated requests; > (4) the sorting in the current algorithm is also costly, especially when > lots of applications are running and lots of sorting is done, so I > think we should sort the queues and applications in a daemon thread, because > some error in the queues' order is tolerable.
[jira] [Updated] (YARN-3795) ZKRMStateStore crashes due to IOException: Broken pipe
[ https://issues.apache.org/jira/browse/YARN-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3795: - Fix Version/s: (was: 2.7.1) > ZKRMStateStore crashes due to IOException: Broken pipe > -- > > Key: YARN-3795 > URL: https://issues.apache.org/jira/browse/YARN-3795 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.5.0 >Reporter: lachisis >Priority: Critical > > 2015-06-05 06:06:54,848 INFO org.apache.zookeeper.ClientCnxn: Socket > connection established to dap88/134.41.33.88:2181, initiating session > 2015-06-05 06:06:54,876 INFO org.apache.zookeeper.ClientCnxn: Session > establishment complete on server dap88/134.41.33.88:2181, sessionid = > 0x34db2f72ac50c86, negotiated timeout = 1 > 2015-06-05 06:06:54,881 INFO > org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: > Watcher event type: None with state:SyncConnected for path:null for Service > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore in state > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: STARTED > 2015-06-05 06:06:54,881 INFO > org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: > ZKRMStateStore Session connected > 2015-06-05 06:06:54,881 INFO > org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: > ZKRMStateStore Session restored > 2015-06-05 06:06:54,881 WARN org.apache.zookeeper.ClientCnxn: Session > 0x34db2f72ac50c86 for server dap88/134.41.33.88:2181, unexpected error, > closing socket connection and attempting reconnect > java.io.IOException: Broken pipe > at sun.nio.ch.FileDispatcherImpl.write0(Native Method) > at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) > at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:94) > at sun.nio.ch.IOUtil.write(IOUtil.java:65) > at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:450) > at > 
org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:117) > at > org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355) > at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1075) > 2015-06-05 06:06:54,986 INFO > org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: > Watcher event type: None with state:Disconnected for path:null for Service > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore in state > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: STARTED > 2015-06-05 06:06:54,986 INFO > org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: > ZKRMStateStore Session disconnected > 2015-06-05 06:06:55,278 INFO org.apache.zookeeper.ClientCnxn: Opening socket > connection to server dap87/134.41.33.87:2181. Will not attempt to > authenticate using SASL (unknown error) > 2015-06-05 06:06:55,278 INFO org.apache.zookeeper.ClientCnxn: Socket > connection established to dap87/134.41.33.87:2181, initiating session > 2015-06-05 06:06:55,330 INFO org.apache.zookeeper.ClientCnxn: Session > establishment complete on server dap87/134.41.33.87:2181, sessionid = > 0x34db2f72ac50c86, negotiated timeout = 1 > 2015-06-05 06:06:55,343 INFO > org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: > Watcher event type: None with state:SyncConnected for path:null for Service > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore in state > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: STARTED > 2015-06-05 06:06:55,343 INFO > org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: > ZKRMStateStore Session connected > 2015-06-05 06:06:55,344 INFO > org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: > ZKRMStateStore Session restored > 2015-06-05 06:06:55,345 WARN org.apache.zookeeper.ClientCnxn: Session > 0x34db2f72ac50c86 for server dap87/134.41.33.87:2181, unexpected 
error, > closing socket connection and attempting reconnect > java.io.IOException: Broken pipe > at sun.nio.ch.FileDispatcherImpl.write0(Native Method) > at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) > at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:94) > at sun.nio.ch.IOUtil.write(IOUtil.java:65) > at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:450) > at > org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:117) > at > org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355) > at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1075) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-6040) Remove usage of ResourceRequest from AppSchedulerInfo / SchedulerApplicationAttempt
[ https://issues.apache.org/jira/browse/YARN-6040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802354#comment-15802354 ] Hadoop QA commented on YARN-6040: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 34s{color} | 
{color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 33s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 26 new + 954 unchanged - 19 fixed = 980 total (was 973) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 40m 42s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 65m 41s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Issue | YARN-6040 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12845837/YARN-6040.006.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 9c22d01c90a5 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 0a55bd8 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/14569/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/14569/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/14569/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-B
[jira] [Updated] (YARN-3614) FileSystemRMStateStore throw exception when failed to remove application, that cause resourcemanager to crash
[ https://issues.apache.org/jira/browse/YARN-3614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3614: - Fix Version/s: (was: 2.7.1) > FileSystemRMStateStore throw exception when failed to remove application, > that cause resourcemanager to crash > - > > Key: YARN-3614 > URL: https://issues.apache.org/jira/browse/YARN-3614 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.5.0, 2.7.0 >Reporter: lachisis >Priority: Critical > Labels: patch > Attachments: YARN-3614-1.patch > > > FileSystemRMStateStore is only an auxiliary plug-in of the rmstore. > When it fails to remove an application, I think a warning is enough, but currently the > resourcemanager crashes. > Recently, I configured > "yarn.resourcemanager.state-store.max-completed-applications" to limit the > number of applications in the rmstore. When the number of applications exceeds the limit, > some old applications will be removed. If the removal fails, the resourcemanager > will crash. > The following is the log: > 2015-05-11 06:58:43,815 INFO > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Removing > info for app: application_1430994493305_0053 > 2015-05-11 06:58:43,815 INFO > org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore: > Removing info for app: application_1430994493305_0053 at: > /hadoop/rmstore/FSRMStateRoot/RMAppRoot/application_1430994493305_0053 > 2015-05-11 06:58:43,816 ERROR > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Error > removing app: application_1430994493305_0053 > java.lang.Exception: Failed to delete > /hadoop/rmstore/FSRMStateRoot/RMAppRoot/application_1430994493305_0053 > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.deleteFile(FileSystemRMStateStore.java:572) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.removeApplicationStateInternal(FileSystemRMStateStore.java:471) > at > 
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$RemoveAppTransition.transition(RMStateStore.java:185) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$RemoveAppTransition.transition(RMStateStore.java:171) > at > org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362) > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:806) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:879) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:874) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) > at java.lang.Thread.run(Thread.java:745) > 2015-05-11 06:58:43,819 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received a > org.apache.hadoop.yarn.server.resourcemanager.RMFatalEvent of type > STATE_STORE_OP_FAILED. 
Cause: > java.lang.Exception: Failed to delete > /hadoop/rmstore/FSRMStateRoot/RMAppRoot/application_1430994493305_0053 > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.deleteFile(FileSystemRMStateStore.java:572) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.removeApplicationStateInternal(FileSystemRMStateStore.java:471) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$RemoveAppTransition.transition(RMStateStore.java:185) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$RemoveAppTransition.transition(RMStateStore.java:171) > at > org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362) > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateM
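The fix the reporter suggests is to degrade a failed state-store delete to a warning and continue, rather than raising a fatal event. A minimal sketch of that behaviour, using `java.nio.file` in place of the Hadoop FileSystem API (`SafeRemove` and `removeAppDir` are illustrative names, not the actual patch):

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.logging.Logger;

// Sketch of "warn instead of crash": a cleanup failure for old completed
// apps should not take down the ResourceManager.
class SafeRemove {
    private static final Logger LOG = Logger.getLogger(SafeRemove.class.getName());

    // Returns true if the path is gone afterwards; never propagates the failure.
    static boolean removeAppDir(Path appDir) {
        try {
            // deleteIfExists simply returns false for a missing path,
            // so a non-existent app dir is not an error at all.
            Files.deleteIfExists(appDir);
            return true;
        } catch (IOException e) {
            LOG.warning("Failed to remove " + appDir + ": " + e);
            return false;
        }
    }
}
```

The caller can retry or ignore the leftover directory on the next cleanup pass instead of firing STATE_STORE_OP_FAILED.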
[jira] [Updated] (YARN-3550) Improve YARN RM REST API error messages
[ https://issues.apache.org/jira/browse/YARN-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3550: - Fix Version/s: (was: 2.7.1) > Improve YARN RM REST API error messages > --- > > Key: YARN-3550 > URL: https://issues.apache.org/jira/browse/YARN-3550 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 2.6.0 >Reporter: Rajesh Kartha >Priority: Minor > > The error messages from an invalid REST call to the YARN RM REST service do > not yield useful messages. > Here is a simple example of using GET instead of POST to get a new > application id: > $ curl -X GET http://myhost:8088/ws/v1/cluster/apps/new-application > standalone="yes"?>WebApplicationExceptionjavax.ws.rs.WebApplicationException > and the RM log has this: > 2015-04-27 11:18:27,783 WARN webapp.GenericExceptionHandler > (GenericExceptionHandler.java:toResponse(98)) - INTERNAL_SERVER_ERROR > javax.ws.rs.WebApplicationException > at > com.sun.jersey.server.impl.uri.rules.TerminatingRule.accept(TerminatingRule.java:66) > at > com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108) > at > com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) > at > com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84) > at > com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1469) > at > com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1400) > at > com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1349) > at > com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1339) > at > com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416) > at > com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:537) > at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:886) > at > com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebAppFilter.doFilter(RMWebAppFilter.java:84) > It would be useful to return a meaningful error message.
[jira] [Updated] (YARN-3193) When visit standby RM webui, it will redirect to the active RM webui slowly.
[ https://issues.apache.org/jira/browse/YARN-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3193: - Fix Version/s: (was: 2.7.1) > When visit standby RM webui, it will redirect to the active RM webui slowly. > > > Key: YARN-3193 > URL: https://issues.apache.org/jira/browse/YARN-3193 > Project: Hadoop YARN > Issue Type: Improvement > Components: webapp >Reporter: Japs_123 >Assignee: Steve Loughran >Priority: Minor > > When visiting the standby RM web UI, it redirects to the active RM web UI, > but this redirect is very slow, which gives the client a bad experience. I have > tried visiting the standby namenode, and it shows the page to the client quickly. So, > can we improve this experience in YARN as in HDFS?
[jira] [Commented] (YARN-6040) Remove usage of ResourceRequest from AppSchedulerInfo / SchedulerApplicationAttempt
[ https://issues.apache.org/jira/browse/YARN-6040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802425#comment-15802425 ] Arun Suresh commented on YARN-6040: --- Thanks for updating the patch [~leftnoteasy], # Looks like SchedulingPlacementSet still exposes getPendingAllocationNumber(); can we change that to match PendingAsk? Thinking further, wondering if you need that method at all — you should be doing getPendingAsk(resourceName).getCount(), right? # Similarly, it looks like you might not need SchedulerApplicationAttempt::getPendingAllocationNumber() either. Everything else looks ok to me. > Remove usage of ResourceRequest from AppSchedulerInfo / > SchedulerApplicationAttempt > --- > > Key: YARN-6040 > URL: https://issues.apache.org/jira/browse/YARN-6040 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-6040.001.patch, YARN-6040.002.patch, > YARN-6040.003.patch, YARN-6040.004.patch, YARN-6040.005.patch, > YARN-6040.006.patch > > > As mentioned in YARN-5906, schedulers currently use ResourceRequest > heavily, so it will be very hard to adopt the new PowerfulResourceRequest > (YARN-4902). > This JIRA is the 2nd step of the refactoring, which removes usage of > ResourceRequest from AppSchedulingInfo / SchedulerApplicationAttempt. Instead > of returning ResourceRequest, it returns a lightweight and API-independent > object - {{PendingAsk}}. > The only remaining ResourceRequest API of AppSchedulingInfo will be used by > the web service to get the list of ResourceRequests. > After this patch, usage of ResourceRequest will be isolated inside > AppSchedulingInfo, making it more flexible to update the internal data > structure and upgrade the old ResourceRequest API to the new one.
[jira] [Created] (YARN-6059) Update paused container state in the state store
Hitesh Sharma created YARN-6059: --- Summary: Update paused container state in the state store Key: YARN-6059 URL: https://issues.apache.org/jira/browse/YARN-6059 Project: Hadoop YARN Issue Type: Sub-task Reporter: Hitesh Sharma Assignee: Hitesh Sharma
[jira] [Updated] (YARN-5246) NMWebAppFilter web redirects drop query parameters
[ https://issues.apache.org/jira/browse/YARN-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-5246: - Fix Version/s: (was: 2.9.0) 2.8.0 Thanks, Varun! I committed this to branch-2.8 as well. > NMWebAppFilter web redirects drop query parameters > -- > > Key: YARN-5246 > URL: https://issues.apache.org/jira/browse/YARN-5246 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Fix For: 2.8.0, 3.0.0-alpha1 > > Attachments: YARN-5246.001.patch, YARN-5246.002.patch > > > The NMWebAppFilter drops query parameters when it carries out a redirect to > the log server. This leads to problems when users have simple web > authentication setup.
[jira] [Updated] (YARN-4990) Re-direction of a particular log file within in a container in NM UI does not redirect properly to Log Server ( history ) on container completion
[ https://issues.apache.org/jira/browse/YARN-4990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-4990: - Fix Version/s: (was: 2.9.0) 2.8.0 Thanks, Xuan! I committed this to branch-2.8 as well. > Re-direction of a particular log file within in a container in NM UI does not > redirect properly to Log Server ( history ) on container completion > - > > Key: YARN-4990 > URL: https://issues.apache.org/jira/browse/YARN-4990 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Hitesh Shah >Assignee: Xuan Gong > Fix For: 2.8.0, 3.0.0-alpha2 > > Attachments: YARN-4990.1.patch, YARN-4990.2.patch > > > The NM does the redirection to the history server correctly. However, if the > user is viewing or has a link to a particular file, the redirect > ends up going to the top-level page for the container instead of redirecting to > the specific file. Additionally, the start param to show logs from offset > 0 also goes missing.
[jira] [Updated] (YARN-5222) DockerContainerExecutor dosn't set work directory
[ https://issues.apache.org/jira/browse/YARN-5222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-5222: - Fix Version/s: (was: 2.7.2) > DockerContainerExecutor dosn't set work directory > - > > Key: YARN-5222 > URL: https://issues.apache.org/jira/browse/YARN-5222 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.7.1, 2.7.2 > Environment: centos >Reporter: zhengchenyu >Priority: Critical > Labels: patch > Original Estimate: 168h > Remaining Estimate: 168h > > When I submit a Spark task in a Docker container, a NoClassDefFoundError happens, > but the MapReduce task doesn't have this problem. When launching the > Docker container, Docker doesn't set the work directory for the command, so the > program can't find spark-assembly-1.6.1-hadoop2.7.1.jar.
[jira] [Commented] (YARN-5222) DockerContainerExecutor dosn't set work directory
[ https://issues.apache.org/jira/browse/YARN-5222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802485#comment-15802485 ] Junping Du commented on YARN-5222: -- Also, please don't set the fix version field, as we don't have any patch committed yet. > DockerContainerExecutor dosn't set work directory > - > > Key: YARN-5222 > URL: https://issues.apache.org/jira/browse/YARN-5222 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.7.1, 2.7.2 > Environment: centos >Reporter: zhengchenyu >Priority: Critical > Labels: patch > Original Estimate: 168h > Remaining Estimate: 168h > > When I submit a Spark task in a Docker container, a NoClassDefFoundError happens, > but the MapReduce task doesn't have this problem. When launching the > Docker container, Docker doesn't set the work directory for the command, so the > program can't find spark-assembly-1.6.1-hadoop2.7.1.jar.
[jira] [Commented] (YARN-4348) ZKRMStateStore.syncInternal shouldn't wait for sync completion for avoiding blocking ZK's event thread
[ https://issues.apache.org/jira/browse/YARN-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802493#comment-15802493 ] Junping Du commented on YARN-4348: -- Hi [~jianhe] and [~ozawa], Does this fix need to go to trunk/branch-2/branch-2.8? > ZKRMStateStore.syncInternal shouldn't wait for sync completion for avoiding > blocking ZK's event thread > -- > > Key: YARN-4348 > URL: https://issues.apache.org/jira/browse/YARN-4348 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.7.2, 2.6.2 >Reporter: Tsuyoshi Ozawa >Assignee: Tsuyoshi Ozawa >Priority: Blocker > Fix For: 2.7.2, 2.6.3 > > Attachments: YARN-4348-branch-2.7.002.patch, > YARN-4348-branch-2.7.003.patch, YARN-4348-branch-2.7.004.patch, > YARN-4348.001.patch, YARN-4348.001.patch, log.txt > > > Jian mentioned that the current internal ZK configuration of ZKRMStateStore > can cause the following situation: > 1. syncInternal times out, > 2. but the sync succeeds later on. > We should use zkResyncWaitTime as the timeout value.
[jira] [Commented] (YARN-4348) ZKRMStateStore.syncInternal shouldn't wait for sync completion for avoiding blocking ZK's event thread
[ https://issues.apache.org/jira/browse/YARN-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802518#comment-15802518 ] Jian He commented on YARN-4348: --- No, it doesn't need to. The ZK store implementation has been changed to use Curator from 2.8 onwards. > ZKRMStateStore.syncInternal shouldn't wait for sync completion for avoiding > blocking ZK's event thread > -- > > Key: YARN-4348 > URL: https://issues.apache.org/jira/browse/YARN-4348 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.7.2, 2.6.2 >Reporter: Tsuyoshi Ozawa >Assignee: Tsuyoshi Ozawa >Priority: Blocker > Fix For: 2.7.2, 2.6.3 > > Attachments: YARN-4348-branch-2.7.002.patch, > YARN-4348-branch-2.7.003.patch, YARN-4348-branch-2.7.004.patch, > YARN-4348.001.patch, YARN-4348.001.patch, log.txt > > > Jian mentioned that the current internal ZK configuration of ZKRMStateStore > can cause the following situation: > 1. syncInternal times out, > 2. but the sync succeeds later on. > We should use zkResyncWaitTime as the timeout value. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5280) Allow YARN containers to run with Java Security Manager
[ https://issues.apache.org/jira/browse/YARN-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802560#comment-15802560 ] Hadoop QA commented on YARN-5280: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 20s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | 
{color:green} 1m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 38s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 45s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 5 new + 267 unchanged - 2 fixed = 272 total (was 269) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 31s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 23s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 0s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 64m 2s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Issue | YARN-5280 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12845844/YARN-5280.006.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml | | uname | Linux dcaeed50ee29 3.13.0-103-generic #150-Ubuntu SMP Thu Nov 24 10:34:17 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 0a55bd8 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/14571/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn.txt | | Te
[jira] [Commented] (YARN-4348) ZKRMStateStore.syncInternal shouldn't wait for sync completion for avoiding blocking ZK's event thread
[ https://issues.apache.org/jira/browse/YARN-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802604#comment-15802604 ] Junping Du commented on YARN-4348: -- Got it. Thanks for confirmation here, Jian! > ZKRMStateStore.syncInternal shouldn't wait for sync completion for avoiding > blocking ZK's event thread > -- > > Key: YARN-4348 > URL: https://issues.apache.org/jira/browse/YARN-4348 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.7.2, 2.6.2 >Reporter: Tsuyoshi Ozawa >Assignee: Tsuyoshi Ozawa >Priority: Blocker > Fix For: 2.7.2, 2.6.3 > > Attachments: YARN-4348-branch-2.7.002.patch, > YARN-4348-branch-2.7.003.patch, YARN-4348-branch-2.7.004.patch, > YARN-4348.001.patch, YARN-4348.001.patch, log.txt > > > Jian mentioned that the current internal ZK configuration of ZKRMStateStore > can cause the following situation: > 1. syncInternal times out, > 2. but the sync succeeds later on. > We should use zkResyncWaitTime as the timeout value. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6041) Opportunistic containers : Combined patch for branch-2
[ https://issues.apache.org/jira/browse/YARN-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802732#comment-15802732 ] Wangda Tan commented on YARN-6041: -- +1 to the latest patch. [~asuresh], please wait another day before cherry-picking it to see if there are any other comments. > Opportunistic containers : Combined patch for branch-2 > --- > > Key: YARN-6041 > URL: https://issues.apache.org/jira/browse/YARN-6041 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-6041-branch-2.001.patch, > YARN-6041-branch-2.002.patch, YARN-6041-branch-2.003.patch > > > This is a combined patch targeting branch-2 of the following JIRAs which have > already been committed to trunk : > YARN-5938. Refactoring OpportunisticContainerAllocator to use > SchedulerRequestKey instead of Priority and other misc fixes > YARN-5646. Add documentation and update config parameter names for scheduling > of OPPORTUNISTIC containers. > YARN-5982. Simplify opportunistic container parameters and metrics. > YARN-5918. Handle Opportunistic scheduling allocate request failure when NM > is lost. > YARN-4597. Introduce ContainerScheduler and a SCHEDULED state to NodeManager > container lifecycle. > YARN-5823. Update NMTokens in case of requests with only opportunistic > containers. > YARN-5377. Fix > TestQueuingContainerManager.testKillMultipleOpportunisticContainers. > YARN-2995. Enhance UI to show cluster resource utilization of various > container Execution types. > YARN-5799. Fix Opportunistic Allocation to set the correct value of Node Http > Address. > YARN-5486. Update OpportunisticContainerAllocatorAMService::allocate method > to handle OPPORTUNISTIC container requests. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4675) Reorganize TimeClientImpl into TimeClientV1Impl and TimeClientV2Impl
[ https://issues.apache.org/jira/browse/YARN-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802756#comment-15802756 ] Sangjin Lee commented on YARN-4675: --- One implication of refactoring the interface is that code that uses the timeline client would need to be updated along with the API changes. MR and DS are not a problem, but other off-hadoop clients such as Tez would need to make this change when it lands on trunk. I assume it is not a major problem, but just so that we are aware. cc [~rohithsharma] > Reorganize TimeClientImpl into TimeClientV1Impl and TimeClientV2Impl > > > Key: YARN-4675 > URL: https://issues.apache.org/jira/browse/YARN-4675 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R > Labels: YARN-5355, yarn-5355-merge-blocker > Attachments: YARN-4675-YARN-2928.v1.001.patch > > > We need to reorganize TimeClientImpl into TimeClientV1Impl, > TimeClientV2Impl and, if required, a base class, so that it's clear which part > of the code belongs to which version and is thus better maintainable. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-3544) AM logs link missing in the RM UI for a completed app
[ https://issues.apache.org/jira/browse/YARN-3544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3544: - Fix Version/s: 2.8.0 > AM logs link missing in the RM UI for a completed app > -- > > Key: YARN-3544 > URL: https://issues.apache.org/jira/browse/YARN-3544 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.7.0 >Reporter: Hitesh Shah >Assignee: Xuan Gong >Priority: Blocker > Labels: 2.6.1-candidate > Fix For: 2.6.1, 2.8.0, 2.7.1, 3.0.0-alpha1 > > Attachments: Screen Shot 2015-04-27 at 6.24.05 PM.png, > YARN-3544-branch-2.6.1.txt, YARN-3544-branch-2.7-1.2.patch, > YARN-3544-branch-2.7-1.patch, YARN-3544.1.patch > > > AM log links should always be present ( for both running and completed apps). > Likewise node info is also empty. This is usually quite crucial when trying > to debug where an AM was launched and a pointer to which NM's logs to look at > if the AM failed to launch. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-3681) yarn cmd says "could not find main class 'queue'" in windows
[ https://issues.apache.org/jira/browse/YARN-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3681: - Fix Version/s: 2.8.0 > yarn cmd says "could not find main class 'queue'" in windows > > > Key: YARN-3681 > URL: https://issues.apache.org/jira/browse/YARN-3681 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.7.0 > Environment: Windows Only >Reporter: Sumana Sathish >Assignee: Varun Saxena >Priority: Blocker > Labels: windows, yarn-client > Fix For: 2.8.0, 2.7.1, 3.0.0-alpha1 > > Attachments: YARN-3681.0.patch, YARN-3681.01.patch, > YARN-3681.1.patch, YARN-3681.branch-2.0.patch, yarncmd.png > > > Attached the screenshot of the command prompt in windows running yarn queue > command. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-3725) App submission via REST API is broken in secure mode due to Timeline DT service address is empty
[ https://issues.apache.org/jira/browse/YARN-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3725: - Fix Version/s: 2.8.0 > App submission via REST API is broken in secure mode due to Timeline DT > service address is empty > > > Key: YARN-3725 > URL: https://issues.apache.org/jira/browse/YARN-3725 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, timelineserver >Affects Versions: 2.7.0 >Reporter: Zhijie Shen >Assignee: Zhijie Shen >Priority: Blocker > Labels: 2.6.1-candidate > Fix For: 2.6.1, 2.8.0, 2.7.1, 3.0.0-alpha1 > > Attachments: YARN-3725-branch-2.6.1.txt, YARN-3725.1.patch > > > YARN-2971 changes TimelineClient to use the service address from the Timeline DT > to renew the DT instead of the configured address. This breaks the procedure of > submitting a YARN app via the REST API in secure mode. > The problem is that the service address is set by the client instead of the > server in Java code. The REST API response is an encoded token String, such that > it's inconvenient to deserialize it, set the service address, and > serialize it again. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-3733) Fix DominantRC#compare() does not work as expected if cluster resource is empty
[ https://issues.apache.org/jira/browse/YARN-3733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3733: - Fix Version/s: 2.8.0 > Fix DominantRC#compare() does not work as expected if cluster resource is > empty > --- > > Key: YARN-3733 > URL: https://issues.apache.org/jira/browse/YARN-3733 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.0 > Environment: Suse 11 Sp3 , 2 NM , 2 RM > one NM - 3 GB 6 v core >Reporter: Bibin A Chundatt >Assignee: Rohith Sharma K S >Priority: Blocker > Labels: 2.6.1-candidate > Fix For: 2.6.1, 2.8.0, 2.7.1, 3.0.0-alpha1 > > Attachments: 0001-YARN-3733.patch, 0002-YARN-3733.patch, > 0002-YARN-3733.patch, YARN-3733.patch > > > Steps to reproduce > = > 1. Install HA with 2 RM 2 NM (3072 MB * 2 total cluster) > 2. Configure map and reduce size to 512 MB after changing scheduler minimum > size to 512 MB > 3. Configure capacity scheduler and AM limit to .5 > (DominantResourceCalculator is configured) > 4. Submit 30 concurrent tasks > 5. Switch RM > Actual > = > For 12 jobs the AM gets allocated and all 12 start running > No other YARN child is initiated, *all 12 jobs stay in Running state forever* > Expected > === > Only 6 should be running at a time since max AM allocated is .5 (3072 MB) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
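The arithmetic behind the DominantRC#compare problem is worth spelling out: a dominant share divides each resource by the cluster total, and with an empty cluster resource the float division degenerates — Infinity for non-zero usage, NaN when the usage is also zero — so requests can no longer be ordered. A standalone sketch of that arithmetic, not the DominantResourceCalculator source:

```java
public class DominantShare {
    // Dominant share of (mem, vcores) against the cluster totals:
    // the larger of the two per-resource ratios.
    public static float share(long mem, long vcores, long clusterMem, long clusterVcores) {
        float memShare = (float) mem / clusterMem;   // x/0 -> Infinity, 0/0 -> NaN
        float cpuShare = (float) vcores / clusterVcores;
        return Math.max(memShare, cpuShare);
    }

    public static void main(String[] args) {
        // With an empty cluster resource both non-empty requests collapse
        // to Infinity, so neither ever looks bigger than the other and
        // limit checks based on compare() misbehave.
        float a = share(512, 1, 0, 0);
        float b = share(1024, 2, 0, 0);
        System.out.println(a > b);  // false
        System.out.println(a < b);  // false
        System.out.println(a == b); // true -- every request "equal"
    }
}
```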
[jira] [Updated] (YARN-3701) Isolating the error of generating a single app report when getting all apps from generic history service
[ https://issues.apache.org/jira/browse/YARN-3701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3701: - Fix Version/s: 2.8.0 > Isolating the error of generating a single app report when getting all apps > from generic history service > > > Key: YARN-3701 > URL: https://issues.apache.org/jira/browse/YARN-3701 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Zhijie Shen >Priority: Blocker > Fix For: 2.8.0, 2.7.1, 3.0.0-alpha1 > > Attachments: YARN-3701.1.patch > > > Currently, if an error occurs while generating a single app report when getting the > application list from the generic history service, the exception is thrown up. > Therefore, even if just 1 out of 100 apps has something wrong, the whole > app list is broken. The worst impact is making the default page (app list) > of the GHS web UI crash, while the REST API /applicationhistory/apps will also break. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5309) Fix SSLFactory truststore reloader thread leak in TimelineClientImpl
[ https://issues.apache.org/jira/browse/YARN-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-5309: - Fix Version/s: 2.8.0 > Fix SSLFactory truststore reloader thread leak in TimelineClientImpl > > > Key: YARN-5309 > URL: https://issues.apache.org/jira/browse/YARN-5309 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver, yarn >Affects Versions: 2.7.1 >Reporter: Thomas Friedrich >Assignee: Weiwei Yang >Priority: Blocker > Fix For: 2.8.0, 2.7.3, 3.0.0-alpha1 > > Attachments: YARN-5309.001.patch, YARN-5309.002.patch, > YARN-5309.003.patch, YARN-5309.004.patch, YARN-5309.005.patch, > YARN-5309.branch-2.7.3.001.patch, YARN-5309.branch-2.8.001.patch > > > We found a similar issue as HADOOP-11368 in TimelineClientImpl. The class > creates an instance of SSLFactory in newSslConnConfigurator and subsequently > creates the ReloadingX509TrustManager instance which in turn starts a trust > store reloader thread. > However, the SSLFactory is never destroyed and hence the trust store reloader > threads are not killed. > This problem was observed by a customer who had SSL enabled in Hadoop and > submitted many queries against the HiveServer2. 
After a few days, the HS2 > instance crashed and from the Java dump we could see many (over 13000) > threads like this: > "Truststore reloader thread" #126 daemon prio=5 os_prio=0 > tid=0x7f680d2e3000 nid=0x98fd waiting on > condition [0x7f67e482c000] >java.lang.Thread.State: TIMED_WAITING (sleeping) > at java.lang.Thread.sleep(Native Method) > at org.apache.hadoop.security.ssl.ReloadingX509TrustManager.run > (ReloadingX509TrustManager.java:225) > at java.lang.Thread.run(Thread.java:745) > HiveServer2 uses the JobClient to submit a job: > Thread [HiveServer2-Background-Pool: Thread-188] (Suspended (breakpoint at > line 89 in > ReloadingX509TrustManager)) > owns: Object (id=464) > owns: Object (id=465) > owns: Object (id=466) > owns: ServiceLoader (id=210) > ReloadingX509TrustManager.<init>(String, String, String, long) line: 89 > FileBasedKeyStoresFactory.init(SSLFactory$Mode) line: 209 > SSLFactory.init() line: 131 > TimelineClientImpl.newSslConnConfigurator(int, Configuration) line: 532 > TimelineClientImpl.newConnConfigurator(Configuration) line: 507 > TimelineClientImpl.serviceInit(Configuration) line: 269 > TimelineClientImpl(AbstractService).init(Configuration) line: 163 > YarnClientImpl.serviceInit(Configuration) line: 169 > YarnClientImpl(AbstractService).init(Configuration) line: 163 > ResourceMgrDelegate.serviceInit(Configuration) line: 102 > ResourceMgrDelegate(AbstractService).init(Configuration) line: 163 > ResourceMgrDelegate.<init>(YarnConfiguration) line: 96 > YARNRunner.<init>(Configuration) line: 112 > YarnClientProtocolProvider.create(Configuration) line: 34 > Cluster.initialize(InetSocketAddress, Configuration) line: 95 > Cluster.<init>(InetSocketAddress, Configuration) line: 82 > Cluster.<init>(Configuration) line: 75 > JobClient.init(JobConf) line: 475 > JobClient.<init>(JobConf) line: 454 > MapRedTask(ExecDriver).execute(DriverContext) line: 401 > MapRedTask.execute(DriverContext) line: 137 > MapRedTask(Task).executeTask() line: 160 > TaskRunner.runSequential() line: 88 > 
Driver.launchTask(Task, String, boolean, String, int, > DriverContext) line: 1653 > Driver.execute() line: 1412 > For every job, a new instance of JobClient/YarnClientImpl/TimelineClientImpl > is created. But because the HS2 process stays up for days, the previous trust > store reloader threads are still hanging around in the HS2 process and > eventually use all the resources available. > It seems like a similar fix as HADOOP-11368 is needed in TimelineClientImpl > but it doesn't have a destroy method to begin with. > One option to avoid this problem is to disable the yarn timeline service > (yarn.timeline-service.enabled=false). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
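The leak pattern itself is simple to model outside Hadoop: init() starts a background reloader thread and nothing ever calls a matching destroy(), so each short-lived client leaves one daemon thread behind. The class below is a hypothetical stand-in for the SSLFactory/ReloadingX509TrustManager pair, not Hadoop code; the actual patch wires equivalent cleanup into TimelineClientImpl's shutdown path:

```java
public class TrustStoreReloader {
    private Thread reloader;

    // Mimics ReloadingX509TrustManager: a periodic background check loop.
    public void init() {
        reloader = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    Thread.sleep(100);   // stand-in for the reload interval
                } catch (InterruptedException e) {
                    return;              // destroy() interrupts us
                }
            }
        }, "Truststore reloader thread");
        reloader.setDaemon(true);
        reloader.start();
    }

    // The missing piece in the leak: without this, every client leaves
    // one reloader thread running for the life of the process.
    public void destroy() {
        if (reloader != null) {
            reloader.interrupt();
            try { reloader.join(); } catch (InterruptedException ignored) { }
        }
    }

    public boolean isRunning() {
        return reloader != null && reloader.isAlive();
    }

    public static void main(String[] args) {
        TrustStoreReloader r = new TrustStoreReloader();
        r.init();
        r.destroy();   // pair every init() with a destroy()
        System.out.println("reloader still running: " + r.isRunning()); // → false
    }
}
```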
[jira] [Updated] (YARN-3426) Add jdiff support to YARN
[ https://issues.apache.org/jira/browse/YARN-3426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3426: - Fix Version/s: 2.8.0 > Add jdiff support to YARN > - > > Key: YARN-3426 > URL: https://issues.apache.org/jira/browse/YARN-3426 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Li Lu >Assignee: Li Lu >Priority: Blocker > Fix For: 2.8.0, 2.7.3, 3.0.0-alpha1 > > Attachments: YARN-3426-040615-1.patch, YARN-3426-040615.patch, > YARN-3426-040715.patch, YARN-3426-040815.patch, YARN-3426-05-12-2016.txt, > YARN-3426-06-09-2016.txt, YARN-3426-branch-2.005.patch, > YARN-3426-branch-2.8.005.patch > > > Maybe we'd like to extend our current jdiff tool for hadoop-common and hdfs > to YARN as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-3850) NM fails to read files from full disks which can lead to container logs being lost and other issues
[ https://issues.apache.org/jira/browse/YARN-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3850: - Fix Version/s: 2.8.0 > NM fails to read files from full disks which can lead to container logs being > lost and other issues > --- > > Key: YARN-3850 > URL: https://issues.apache.org/jira/browse/YARN-3850 > Project: Hadoop YARN > Issue Type: Bug > Components: log-aggregation, nodemanager >Affects Versions: 2.7.0 >Reporter: Varun Saxena >Assignee: Varun Saxena >Priority: Blocker > Labels: 2.6.1-candidate > Fix For: 2.6.1, 2.8.0, 2.7.1, 3.0.0-alpha1 > > Attachments: YARN-3850.01.patch, YARN-3850.02.patch > > > *Container logs* can be lost if a disk has become full (~90% full). > When an application finishes, we upload logs after aggregation by calling > {{AppLogAggregatorImpl#uploadLogsForContainers}}. But this call in turn > checks the eligible directories on a call to > {{LocalDirsHandlerService#getLogDirs}}, which in case of a full disk would > return nothing. So none of the container logs are aggregated and uploaded. > But on application finish, we also call > {{AppLogAggregatorImpl#doAppLogAggregationPostCleanUp()}}. This deletes the > application directory which contains container logs. This is because it calls > {{LocalDirsHandlerService#getLogDirsForCleanup}} which returns the full disks > as well. > So we are left with neither aggregated logs for the app nor the individual > container logs for the app. > In addition to this, there are 2 more issues : > # {{ContainerLogsUtil#getContainerLogDirs}} does not consider full disks so > NM will fail to serve up logs from full disks from its web interfaces. > # {{RecoveredContainerLaunch#locatePidFile}} also does not consider full > disks so it is possible that on container recovery, the PID file is not found. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
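The asymmetry in the YARN-3850 description — full disks excluded from {{getLogDirs}} but included in {{getLogDirsForCleanup}} — can be sketched as two filters over the same disk list. These are simplified stand-ins with hypothetical names, not the LocalDirsHandlerService API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class DirSelection {
    static final double FULL_THRESHOLD = 0.90;   // ~90% used counts as full

    // Dirs eligible for writing/aggregating logs: full disks are excluded,
    // so on a full disk nothing gets aggregated.
    public static List<String> logDirs(Map<String, Double> usage) {
        return usage.entrySet().stream()
            .filter(e -> e.getValue() < FULL_THRESHOLD)
            .map(Map.Entry::getKey)
            .collect(Collectors.toList());
    }

    // Dirs eligible for cleanup: full disks are included, so the same app
    // dir that was never aggregated still gets deleted -- losing the logs.
    public static List<String> logDirsForCleanup(Map<String, Double> usage) {
        return new ArrayList<>(usage.keySet());
    }

    public static void main(String[] args) {
        Map<String, Double> usage = Map.of("/disk1/logs", 0.95, "/disk2/logs", 0.40);
        System.out.println("aggregate from: " + logDirs(usage));
        System.out.println("clean up:       " + logDirsForCleanup(usage));
    }
}
```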
[jira] [Commented] (YARN-6022) Revert changes of AbstractResourceRequest
[ https://issues.apache.org/jira/browse/YARN-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802770#comment-15802770 ] Daniel Templeton commented on YARN-6022: Sorry, [~leftnoteasy], there's a conflict now. Mind rebasing? > Revert changes of AbstractResourceRequest > - > > Key: YARN-6022 > URL: https://issues.apache.org/jira/browse/YARN-6022 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Blocker > Attachments: YARN-6022.001.patch, YARN-6022.002.patch, > YARN-6022.003.patch > > > YARN-5774 added AbstractResourceRequest to make internal scheduler changes > easier; this is not a correct approach: for example, with this change, we > need to make AbstractResourceRequest public/stable. And end users could > use it like: > {code} > AbstractResourceRequest request = ... > request.setCapability(...) > {code} > But AbstractResourceRequest should not be visible to applications at all. > We need to revert it from branch-2.8 / branch-2 / trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-3764) CapacityScheduler should forbid moving LeafQueue from one parent to another
[ https://issues.apache.org/jira/browse/YARN-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3764: - Fix Version/s: 2.8.0 > CapacityScheduler should forbid moving LeafQueue from one parent to another > --- > > Key: YARN-3764 > URL: https://issues.apache.org/jira/browse/YARN-3764 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Blocker > Fix For: 2.8.0, 2.7.1, 3.0.0-alpha1 > > Attachments: YARN-3764.1.patch > > > Currently CapacityScheduler doesn't handle the case well, for example: > A queue structure: > {code} > root > | > a (100) > / \ >x y > (50) (50) > {code} > And reinitialize using following structure: > {code} > root > / \ > (50)a x (50) > | > y >(100) > {code} > The actual queue structure after reinitialize is: > {code} > root > /\ >a (50) x (50) > / \ > xy > (50) (100) > {code} > We should forbid admin doing that. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-3522) DistributedShell uses the wrong user to put timeline data
[ https://issues.apache.org/jira/browse/YARN-3522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3522: - Fix Version/s: 2.8.0 > DistributedShell uses the wrong user to put timeline data > - > > Key: YARN-3522 > URL: https://issues.apache.org/jira/browse/YARN-3522 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Zhijie Shen >Priority: Blocker > Fix For: 2.8.0, 2.7.1, 3.0.0-alpha1 > > Attachments: YARN-3522.1.patch, YARN-3522.2.patch, YARN-3522.3.patch > > > YARN-3287 breaks the timeline access control of distributed shell. In > distributed shell AM: > {code} > if (conf.getBoolean(YarnConfiguration.TIMELINE_SERVICE_ENABLED, > YarnConfiguration.DEFAULT_TIMELINE_SERVICE_ENABLED)) { > // Creating the Timeline Client > timelineClient = TimelineClient.createTimelineClient(); > timelineClient.init(conf); > timelineClient.start(); > } else { > timelineClient = null; > LOG.warn("Timeline service is not enabled"); > } > {code} > {code} > ugi.doAs(new PrivilegedExceptionAction() { > @Override > public TimelinePutResponse run() throws Exception { > return timelineClient.putEntities(entity); > } > }); > {code} > YARN-3287 changes the timeline client to get the right ugi at serviceInit, > but the DS AM still doesn't use the submitter ugi to init the timeline client; it uses > the ugi for each put entity call. This results in the wrong user for the put > request. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6022) Revert changes of AbstractResourceRequest
[ https://issues.apache.org/jira/browse/YARN-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802779#comment-15802779 ] Daniel Templeton commented on YARN-6022: Bah, nevermind. The conflict is trivial. I'll take care of it. > Revert changes of AbstractResourceRequest > - > > Key: YARN-6022 > URL: https://issues.apache.org/jira/browse/YARN-6022 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Blocker > Attachments: YARN-6022.001.patch, YARN-6022.002.patch, > YARN-6022.003.patch > > > YARN-5774 added AbstractResourceRequest to make internal scheduler changes > easier; this is not a correct approach: for example, with this change, we > need to make AbstractResourceRequest public/stable. And end users could > use it like: > {code} > AbstractResourceRequest request = ... > request.setCapability(...) > {code} > But AbstractResourceRequest should not be visible to applications at all. > We need to revert it from branch-2.8 / branch-2 / trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-4424) Fix deadlock in RMAppImpl

[ https://issues.apache.org/jira/browse/YARN-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated YARN-4424:
-----------------------------
    Fix Version/s: 2.8.0

> Fix deadlock in RMAppImpl
> -------------------------
>
>                 Key: YARN-4424
>                 URL: https://issues.apache.org/jira/browse/YARN-4424
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Yesha Vora
>            Assignee: Jian He
>            Priority: Blocker
>             Fix For: 2.8.0, 2.7.2, 2.6.3, 3.0.0-alpha1
>
>         Attachments: YARN-4424.1.patch
>
> {code}
> yarn@XXX:/mnt/hadoopqe$ /usr/hdp/current/hadoop-yarn-client/bin/yarn application -list -appStates NEW,NEW_SAVING,SUBMITTED,ACCEPTED,RUNNING
> 15/12/04 21:59:54 INFO impl.TimelineClientImpl: Timeline service address: http://XXX:8188/ws/v1/timeline/
> 15/12/04 21:59:54 INFO client.RMProxy: Connecting to ResourceManager at XXX/0.0.0.0:8050
> 15/12/04 21:59:55 INFO client.AHSProxy: Connecting to Application History server at XXX/0.0.0.0:10200
> {code}
> {code:title=RM log}
> 2015-12-04 21:59:19,744 INFO event.AsyncDispatcher (AsyncDispatcher.java:handle(243)) - Size of event-queue is 237000
> 2015-12-04 22:00:50,945 INFO event.AsyncDispatcher (AsyncDispatcher.java:handle(243)) - Size of event-queue is 238000
> 2015-12-04 22:02:22,416 INFO event.AsyncDispatcher (AsyncDispatcher.java:handle(243)) - Size of event-queue is 239000
> 2015-12-04 22:03:53,593 INFO event.AsyncDispatcher (AsyncDispatcher.java:handle(243)) - Size of event-queue is 240000
> 2015-12-04 22:05:24,856 INFO event.AsyncDispatcher (AsyncDispatcher.java:handle(243)) - Size of event-queue is 241000
> 2015-12-04 22:06:56,235 INFO event.AsyncDispatcher (AsyncDispatcher.java:handle(243)) - Size of event-queue is 242000
> 2015-12-04 22:08:27,510 INFO event.AsyncDispatcher (AsyncDispatcher.java:handle(243)) - Size of event-queue is 243000
> 2015-12-04 22:09:58,786 INFO event.AsyncDispatcher (AsyncDispatcher.java:handle(243)) - Size of event-queue is 244000
> {code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
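The symptom above (the AsyncDispatcher event queue growing without bound) is typical of a thread stuck inside a state-machine lock. One common deadlock class in code that uses `ReentrantReadWriteLock`, as RMAppImpl-style state machines do, is a read-to-write "upgrade": a thread holding the read lock can never acquire the write lock, because the write lock waits for all readers, including the requesting thread itself. This is a generic illustration only, not the specific bug fixed by the attached patch:

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ReadLockUpgradeDemo {
    // ReentrantReadWriteLock does not support lock upgrade: a thread holding
    // the read lock that asks for the write lock would block forever.
    public static boolean tryUpgrade() throws InterruptedException {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        lock.readLock().lock();
        try {
            // tryLock with a timeout lets us observe the failure without hanging.
            return lock.writeLock().tryLock(100, TimeUnit.MILLISECONDS);
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("upgrade acquired: " + tryUpgrade()); // prints: upgrade acquired: false
    }
}
```

A dispatcher thread wedged on a lock like this stops draining the queue, which then grows by roughly 1000 events every 90 seconds as in the log excerpt.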
[jira] [Updated] (YARN-4610) Reservations continue looking for one app causes other apps to starve

[ https://issues.apache.org/jira/browse/YARN-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated YARN-4610:
-----------------------------
    Fix Version/s: 2.8.0

> Reservations continue looking for one app causes other apps to starve
> ---------------------------------------------------------------------
>
>                 Key: YARN-4610
>                 URL: https://issues.apache.org/jira/browse/YARN-4610
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 2.7.1
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Blocker
>             Fix For: 2.8.0, 2.7.3, 3.0.0-alpha1
>
>         Attachments: YARN-4610-branch-2.7.002.patch, YARN-4610.001.patch, YARN-4610.branch-2.7.001.patch
>
> CapacityScheduler's LeafQueue has "reservations continue looking" logic that allows an application to unreserve elsewhere to fulfil a container request on a node that has available space. However, in 2.7 that logic seems to break allocations for subsequent apps in the queue. Once a user hits its user limit, subsequent apps in the queue for other users receive containers at a significantly reduced rate.
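The "continue looking" behavior the description refers to can be sketched as a toy allocation step. This is a deliberately simplified, hypothetical model (names like `tryAllocate` are illustrative); the real logic lives in CapacityScheduler's LeafQueue and also checks limits and locality:

```java
import java.util.ArrayList;
import java.util.List;

public class ContinueLookingDemo {
    static class App {
        Integer reservedNode;                       // node id holding a reservation, or null
        final List<Integer> allocated = new ArrayList<>();
    }

    /**
     * Toy "reservations continue looking" step: if this node can satisfy the
     * request but the app holds a reservation on a different node, drop that
     * reservation and allocate here instead of waiting for the reserved node.
     */
    static boolean tryAllocate(App app, int nodeId, int freeMb, int askMb) {
        if (freeMb < askMb) {
            return false;                           // node cannot fit the request
        }
        if (app.reservedNode != null && app.reservedNode != nodeId) {
            app.reservedNode = null;                // "continue looking": unreserve elsewhere
        }
        app.allocated.add(nodeId);
        return true;
    }

    public static void main(String[] args) {
        App a = new App();
        a.reservedNode = 1;                         // container reserved on node 1
        boolean ok = tryAllocate(a, 2, 8192, 4096); // node 2 has room right now
        System.out.println(ok + " reserved=" + a.reservedNode);
    }
}
```

The bug report is about what happens around this step for *other* apps in the queue: in 2.7, once one user hits its limit, the surrounding loop effectively stalls allocations for subsequent apps.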
[jira] [Created] (YARN-6060) Linux container executor fails to run container on directories mounted as noexec

Miklos Szegedi created YARN-6060:
------------------------------------

             Summary: Linux container executor fails to run container on directories mounted as noexec
                 Key: YARN-6060
                 URL: https://issues.apache.org/jira/browse/YARN-6060
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: nodemanager, yarn
            Reporter: Miklos Szegedi
            Assignee: Miklos Szegedi

If node manager directories are mounted as noexec, LCE fails with the following error:

{code}
Launching container...
Couldn't execute the container launch file /tmp/hadoop-/nm-local-dir/usercache//appcache/application_1483656052575_0001/container_1483656052575_0001_02_01/launch_container.sh - Permission denied
{code}
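The "Permission denied" here comes from the kernel refusing to exec() a file on a noexec mount, regardless of its execute bit. One mitigation direction is to invoke the shell interpreter explicitly, which only needs *read* access to the script. The sketch below illustrates that distinction with a temp file that lacks the execute bit (a stand-in for a noexec mount, which a test cannot easily create); it is an illustration, not necessarily the approach the attached patch takes:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class NoExecWorkaround {
    /**
     * Run a launch script by handing it to the shell interpreter.
     * Direct exec of the script would fail with "Permission denied" when the
     * file is not executable (or sits on a noexec mount); "sh script" only
     * requires read permission on the script file.
     */
    public static String runViaInterpreter(Path script) throws IOException, InterruptedException {
        Process p = new ProcessBuilder("sh", script.toString()).start();
        String out = new String(p.getInputStream().readAllBytes()).trim();
        p.waitFor();
        return out;
    }

    public static void main(String[] args) throws Exception {
        // Temp files are created without the execute bit.
        Path script = Files.createTempFile("launch", ".sh");
        Files.writeString(script, "echo ok\n");
        System.out.println(runViaInterpreter(script)); // prints: ok
    }
}
```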
[jira] [Commented] (YARN-6041) Opportunistic containers : Combined patch for branch-2

[ https://issues.apache.org/jira/browse/YARN-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802796#comment-15802796 ]

Karthik Kambatla commented on YARN-6041:
----------------------------------------

The changes to config names and methods look good to me.

> Opportunistic containers : Combined patch for branch-2
> ------------------------------------------------------
>
>                 Key: YARN-6041
>                 URL: https://issues.apache.org/jira/browse/YARN-6041
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Arun Suresh
>            Assignee: Arun Suresh
>
>         Attachments: YARN-6041-branch-2.001.patch, YARN-6041-branch-2.002.patch, YARN-6041-branch-2.003.patch
>
> This is a combined patch targeting branch-2 of the following JIRAs which have already been committed to trunk:
> YARN-5938. Refactoring OpportunisticContainerAllocator to use SchedulerRequestKey instead of Priority and other misc fixes.
> YARN-5646. Add documentation and update config parameter names for scheduling of OPPORTUNISTIC containers.
> YARN-5982. Simplify opportunistic container parameters and metrics.
> YARN-5918. Handle Opportunistic scheduling allocate request failure when NM is lost.
> YARN-4597. Introduce ContainerScheduler and a SCHEDULED state to NodeManager container lifecycle.
> YARN-5823. Update NMTokens in case of requests with only opportunistic containers.
> YARN-5377. Fix TestQueuingContainerManager.testKillMultipleOpportunisticContainers.
> YARN-2995. Enhance UI to show cluster resource utilization of various container Execution types.
> YARN-5799. Fix Opportunistic Allocation to set the correct value of Node Http Address.
> YARN-5486. Update OpportunisticContainerAllocatorAMService::allocate method to handle OPPORTUNISTIC container requests.
[jira] [Updated] (YARN-6049) Graceful Decommission web link is broken, gives 404 Not Found

[ https://issues.apache.org/jira/browse/YARN-6049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Wang updated YARN-6049:
------------------------------
    Fix Version/s: (was: 3.0.0-alpha2)

> Graceful Decommission web link is broken, gives 404 Not Found
> -------------------------------------------------------------
>
>                 Key: YARN-6049
>                 URL: https://issues.apache.org/jira/browse/YARN-6049
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: documentation
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Emre Sevinç
>            Priority: Minor
>              Labels: documentation, easyfix
>
> The Graceful Decommission page, http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/GracefulDecommission.html, is broken. It gives the error "Not Found. The requested URL /docs/current/hadoop-yarn/hadoop-yarn-site/GracefulDecommission.html was not found on this server."
> There are links to this problematic web page from all of the HTML pages in http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/
[jira] [Updated] (YARN-6049) Graceful Decommission web link is broken, gives 404 Not Found

[ https://issues.apache.org/jira/browse/YARN-6049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Wang updated YARN-6049:
------------------------------
    Target Version/s: 3.0.0-alpha2 (was: 3.0.0-alpha1)

> Graceful Decommission web link is broken, gives 404 Not Found
> -------------------------------------------------------------
>
>                 Key: YARN-6049
>                 URL: https://issues.apache.org/jira/browse/YARN-6049
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: documentation
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Emre Sevinç
>            Priority: Minor
>              Labels: documentation, easyfix
>
> The Graceful Decommission page, http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/GracefulDecommission.html, is broken. It gives the error "Not Found. The requested URL /docs/current/hadoop-yarn/hadoop-yarn-site/GracefulDecommission.html was not found on this server."
> There are links to this problematic web page from all of the HTML pages in http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/
[jira] [Commented] (YARN-5556) Support for deleting queues without requiring a RM restart

[ https://issues.apache.org/jira/browse/YARN-5556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802851#comment-15802851 ]

Xuan Gong commented on YARN-5556:
---------------------------------

Sounds good. Will update the design doc.

> Support for deleting queues without requiring a RM restart
> ----------------------------------------------------------
>
>                 Key: YARN-5556
>                 URL: https://issues.apache.org/jira/browse/YARN-5556
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: yarn
>            Reporter: Xuan Gong
>            Assignee: Naganarasimha G R
>
>         Attachments: YARN-5556.v1.001.patch, YARN-5556.v1.002.patch, YARN-5556.v1.003.patch, YARN-5556.v1.004.patch
>
> Today, we can add or modify queues without restarting the RM, via a CS refresh. But to delete a queue, we have to restart the ResourceManager. We should support deleting queues without requiring an RM restart.
[jira] [Updated] (YARN-6060) Linux container executor fails to run container on directories mounted as noexec

[ https://issues.apache.org/jira/browse/YARN-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Miklos Szegedi updated YARN-6060:
---------------------------------
    Attachment: YARN-6060.000.patch

> Linux container executor fails to run container on directories mounted as noexec
> --------------------------------------------------------------------------------
>
>                 Key: YARN-6060
>                 URL: https://issues.apache.org/jira/browse/YARN-6060
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager, yarn
>            Reporter: Miklos Szegedi
>            Assignee: Miklos Szegedi
>
>         Attachments: YARN-6060.000.patch
>
> If node manager directories are mounted as noexec, LCE fails with the following error:
> {code}
> Launching container...
> Couldn't execute the container launch file /tmp/hadoop-/nm-local-dir/usercache//appcache/application_1483656052575_0001/container_1483656052575_0001_02_01/launch_container.sh - Permission denied
> {code}
[jira] [Commented] (YARN-5724) [Umbrella] Better Queue Management in YARN

[ https://issues.apache.org/jira/browse/YARN-5724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15802882#comment-15802882 ]

Xuan Gong commented on YARN-5724:
---------------------------------

Updated the design doc based on the discussion on YARN-5556.

> [Umbrella] Better Queue Management in YARN
> ------------------------------------------
>
>                 Key: YARN-5724
>                 URL: https://issues.apache.org/jira/browse/YARN-5724
>             Project: Hadoop YARN
>          Issue Type: Task
>          Components: capacity scheduler
>            Reporter: Xuan Gong
>            Assignee: Xuan Gong
>
>         Attachments: Designdocv1-Configuration-basedQueueManagementinYARN.pdf, Designdocv2-Configuration-basedQueueManagementinYARN.pdf
>
> This serves as an umbrella ticket for tasks related to better queue management in YARN.
> Today, the only way to manage queues is for admins to edit configuration files and then issue a refresh command. This is inconvenient in many ways; for example, users cannot create, delete, or modify their own queues without talking to site-level admins.
> Even in today's configuration-based approach, several places need improvement:
> * It is possible today to add or modify queues without restarting the RM, via a CS refresh. But to delete a queue, we have to restart the ResourceManager.
> * When a queue is STOPPED, resources allocated to the queue could be handled better. Currently, they'll only be used if the other queues are set up to go over their capacity.
[jira] [Updated] (YARN-5724) [Umbrella] Better Queue Management in YARN

[ https://issues.apache.org/jira/browse/YARN-5724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuan Gong updated YARN-5724:
----------------------------
    Attachment: Designdocv2-Configuration-basedQueueManagementinYARN.pdf

> [Umbrella] Better Queue Management in YARN
> ------------------------------------------
>
>                 Key: YARN-5724
>                 URL: https://issues.apache.org/jira/browse/YARN-5724
>             Project: Hadoop YARN
>          Issue Type: Task
>          Components: capacity scheduler
>            Reporter: Xuan Gong
>            Assignee: Xuan Gong
>
>         Attachments: Designdocv1-Configuration-basedQueueManagementinYARN.pdf, Designdocv2-Configuration-basedQueueManagementinYARN.pdf
>
> This serves as an umbrella ticket for tasks related to better queue management in YARN.
> Today, the only way to manage queues is for admins to edit configuration files and then issue a refresh command. This is inconvenient in many ways; for example, users cannot create, delete, or modify their own queues without talking to site-level admins.
> Even in today's configuration-based approach, several places need improvement:
> * It is possible today to add or modify queues without restarting the RM, via a CS refresh. But to delete a queue, we have to restart the ResourceManager.
> * When a queue is STOPPED, resources allocated to the queue could be handled better. Currently, they'll only be used if the other queues are set up to go over their capacity.
[jira] [Updated] (YARN-6054) TimelineServer fails to start when some LevelDb state files are missing.

[ https://issues.apache.org/jira/browse/YARN-6054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Prakash updated YARN-6054:
-------------------------------
    Attachment: YARN-6054.01.patch

Here's a patch along with a unit test.

> TimelineServer fails to start when some LevelDb state files are missing.
> -------------------------------------------------------------------------
>
>                 Key: YARN-6054
>                 URL: https://issues.apache.org/jira/browse/YARN-6054
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 3.0.0-alpha2
>            Reporter: Ravi Prakash
>
>         Attachments: YARN-6054.01.patch
>
> We encountered an issue recently where the TimelineServer failed to start because some state files went missing.
> {code}
> 2016-11-21 20:46:43,134 INFO org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer failed in state INITED; cause: org.apache.hadoop.service.ServiceStateException: org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 9 missing files; e.g.: /timelineserver/leveldb-timeline-store.ldb/127897.sst
> org.apache.hadoop.service.ServiceStateException: org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 9 missing files; e.g.: /timelineserver/leveldb-timeline-store.ldb/127897.sst
> 2016-11-21 20:46:43,135 FATAL org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer: Error starting ApplicationHistoryServer
> org.apache.hadoop.service.ServiceStateException: org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 9 missing files; e.g.: /timelineserver/leveldb-timeline-store.ldb/127897.sst
>         at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
>         at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>         at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceInit(ApplicationHistoryServer.java:104)
>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.launchAppHistoryServer(ApplicationHistoryServer.java:172)
>         at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.main(ApplicationHistoryServer.java:182)
> Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 9 missing files; e.g.: /timelineserver/leveldb-timeline-store.ldb/127897.sst
>         at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
>         at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
>         at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
>         at org.apache.hadoop.yarn.server.timeline.LeveldbTimelineStore.serviceInit(LeveldbTimelineStore.java:229)
>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         ... 5 more
> 2016-11-21 20:46:43,136 INFO org.apache.hadoop.util.ExitUtil: Exiting with status -1
> {code}
> Ideally we shouldn't have any missing state files. However, I'd posit that the TimelineServer should degrade gracefully instead of failing to start at all.
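The graceful-degradation idea proposed in the last paragraph can be sketched as a wrapper around store initialization: attempt to open the real store, and on a corruption error fall back to an empty in-memory store with a loud log, rather than aborting startup. `Opener` is a hypothetical stand-in for `LeveldbTimelineStore.serviceInit()`; the actual patch may handle corruption differently (e.g. by attempting a LevelDB repair first):

```java
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class GracefulStore {
    interface Store {
        String get(String key);
    }

    // Hypothetical opener standing in for LeveldbTimelineStore initialization;
    // it throws when on-disk state is corrupt (e.g. missing .sst files).
    interface Opener {
        Store open() throws IOException;
    }

    /** Open the primary store, degrading to an empty in-memory store on corruption. */
    public static Store openOrDegrade(Opener primary) {
        try {
            return primary.open();
        } catch (IOException corrupt) {
            // Loud warning, but the service still comes up (with no history).
            System.err.println("State store unreadable, starting empty: " + corrupt.getMessage());
            Map<String, String> empty = new ConcurrentHashMap<>();
            return empty::get;
        }
    }
}
```

The trade-off is availability over completeness: the TimelineServer serves requests (possibly with gaps) instead of crash-looping until an operator deletes the corrupt LevelDB directory by hand.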