[jira] [Commented] (YARN-1904) Uniform the XXXXNotFound messages from ClientRMService and ApplicationHistoryClientService
[ https://issues.apache.org/jira/browse/YARN-1904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308802#comment-14308802 ] Hudson commented on YARN-1904: -- FAILURE: Integrated in Hadoop-trunk-Commit #7037 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7037/]) YARN-1904. Ensure exceptions thrown in ClientRMService & ApplicationHistoryClientService are uniform when application-attempt is not found. Contributed by Zhijie Shen. (acmurthy: rev 18b2507edaac991e3ed68d2f27eb96f6882137b9) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryClientService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java > Uniform the NotFound messages from ClientRMService and > ApplicationHistoryClientService > -- > > Key: YARN-1904 > URL: https://issues.apache.org/jira/browse/YARN-1904 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zhijie Shen >Assignee: Zhijie Shen > Fix For: 2.7.0 > > Attachments: YARN-1904.1.patch > > > It's good to make ClientRMService and ApplicationHistoryClientService throw > NotFoundException with similar messages -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1449) Protocol changes and implementations in NM side to support change container resource
[ https://issues.apache.org/jira/browse/YARN-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308799#comment-14308799 ] Hadoop QA commented on YARN-1449: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12618222/yarn-1449.5.patch against trunk revision 18b2507. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6534//console This message is automatically generated. > Protocol changes and implementations in NM side to support change container > resource > > > Key: YARN-1449 > URL: https://issues.apache.org/jira/browse/YARN-1449 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.2.0 >Reporter: Wangda Tan (No longer used) >Assignee: Wangda Tan (No longer used) > Attachments: yarn-1449.1.patch, yarn-1449.3.patch, yarn-1449.4.patch, > yarn-1449.5.patch > > > As described in YARN-1197, we need to add the following API/implementation changes: > 1) Add a "changeContainersResources" method in ContainerManagementProtocol > 2) Return succeeded/failed increased/decreased containers in the response of > "changeContainersResources" > 3) Add a "new decreased containers" field in NodeStatus which can help the NM > notify the RM of such changes > 4) Add a changeContainersResources implementation in ContainerManagerImpl > 5) Add changes in ContainersMonitorImpl to support changing the resource limit of > containers -- This message was sent by Atlassian JIRA (v6.3.4#6332)
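For illustration, a minimal sketch of the kind of protocol addition described in that list; the type and method names are assumptions for the sketch, not the committed API:
{code}
import java.util.List;
import java.util.Map;

// Hypothetical sketch (names assumed, not the committed API) of the proposed
// NM-side protocol change from items 1) and 2) above: one RPC that asks the
// NodeManager to change the resources of running containers and reports
// which changes succeeded or failed.
public interface ContainerManagementProtocolSketch {

  ChangeContainersResourcesResponse changeContainersResources(
      ChangeContainersResourcesRequest request);

  // Request: which containers to increase/decrease, with target sizes.
  interface ChangeContainersResourcesRequest {
    Map<String, Integer> getContainersToIncrease(); // containerId -> new memory (MB)
    Map<String, Integer> getContainersToDecrease();
  }

  // Response: per-container outcome so the AM can react to partial failures.
  interface ChangeContainersResourcesResponse {
    List<String> getSucceededChangedContainers();
    Map<String, Exception> getFailedChanges(); // containerId -> cause
  }
}
{code}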
[jira] [Updated] (YARN-1231) Fix test cases that will hit max-am-used-resources-percent limit after YARN-276
[ https://issues.apache.org/jira/browse/YARN-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated YARN-1231: Target Version/s: (was: 2.1.1-beta) > Fix test cases that will hit max-am-used-resources-percent limit after > YARN-276 > > > Key: YARN-1231 > URL: https://issues.apache.org/jira/browse/YARN-1231 > Project: Hadoop YARN > Issue Type: Task >Affects Versions: 2.1.1-beta >Reporter: Nemon Lou >Assignee: Nemon Lou > Labels: test > Attachments: YARN-1231.patch > > > Use a separate jira to fix YARN's test cases that will fail by hitting the > max-am-used-resources-percent limit after YARN-276. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1449) Protocol changes and implementations in NM side to support change container resource
[ https://issues.apache.org/jira/browse/YARN-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308786#comment-14308786 ] Arun C Murthy commented on YARN-1449: - Cleaning up stale PA patches. > Protocol changes and implementations in NM side to support change container > resource > > > Key: YARN-1449 > URL: https://issues.apache.org/jira/browse/YARN-1449 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.2.0 >Reporter: Wangda Tan (No longer used) >Assignee: Wangda Tan (No longer used) > Attachments: yarn-1449.1.patch, yarn-1449.3.patch, yarn-1449.4.patch, > yarn-1449.5.patch > > > As described in YARN-1197, we need to add the following API/implementation changes: > 1) Add a "changeContainersResources" method in ContainerManagementProtocol > 2) Return succeeded/failed increased/decreased containers in the response of > "changeContainersResources" > 3) Add a "new decreased containers" field in NodeStatus which can help the NM > notify the RM of such changes > 4) Add a changeContainersResources implementation in ContainerManagerImpl > 5) Add changes in ContainersMonitorImpl to support changing the resource limit of > containers -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-933) After an AppAttempt_1 got failed [ removal and releasing of container is done , AppAttempt_2 is scheduled ] again relaunching of AppAttempt_1 throws Exception at RM. And client exited before appattempt retries got over
[ https://issues.apache.org/jira/browse/YARN-933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated YARN-933: --- Assignee: Rohith > After an AppAttempt_1 got failed [ removal and releasing of container is done > , AppAttempt_2 is scheduled ] again relaunching of AppAttempt_1 throws > Exception at RM. And client exited before appattempt retries got over > -- > > Key: YARN-933 > URL: https://issues.apache.org/jira/browse/YARN-933 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.0.5-alpha >Reporter: J.Andreina >Assignee: Rohith > Attachments: YARN-933.patch > > > AM max retries is configured as 3 on both the client and RM side. > Step 1: Install a cluster with NMs on 2 machines > Step 2: Make ping from the RM machine to the NM1 machine succeed using the IP, > but fail using the hostname > Step 3: Execute a job > Step 4: After AM [ AppAttempt_1 ] allocation to the NM1 machine is done, a > connection loss happened. > Observation: > == > After AppAttempt_1 has moved to the failed state, release of the container for > AppAttempt_1 and application removal are successful. A new AppAttempt_2 is > spawned. > 1. Then a retry for AppAttempt_1 happens again. > 2. Again the RM side tries to launch AppAttempt_1, hence it fails with > InvalidStateTransitonException > 3. The client exited after AppAttempt_1 finished [but the job is actually > still running ], while the configured appattempts is 3 and the rest of the > appattempts are all spawned and running. > RMLogs: > == > 2013-07-17 16:22:51,013 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: > appattempt_1373952096466_0056_01 State change from SCHEDULED to ALLOCATED > 2013-07-17 16:35:48,171 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: host-10-18-40-15/10.18.40.59:8048. Already tried 36 time(s); > maxRetries=45 > 2013-07-17 16:36:07,091 INFO > org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: > Expired:container_1373952096466_0056_01_01 Timed out after 600 secs > 2013-07-17 16:36:07,093 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: > container_1373952096466_0056_01_01 Container Transitioned from ACQUIRED > to EXPIRED > 2013-07-17 16:36:07,093 INFO > org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: > Registering appattempt_1373952096466_0056_02 > 2013-07-17 16:36:07,131 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: > Application appattempt_1373952096466_0056_01 is done. finalState=FAILED > 2013-07-17 16:36:07,131 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: > Application removed - appId: application_1373952096466_0056 user: Rex > leaf-queue of parent: root #applications: 35 > 2013-07-17 16:36:07,132 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: > Application Submission: appattempt_1373952096466_0056_02, > 2013-07-17 16:36:07,138 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: > appattempt_1373952096466_0056_02 State change from SUBMITTED to SCHEDULED > 2013-07-17 16:36:30,179 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: host-10-18-40-15/10.18.40.59:8048. Already tried 38 time(s); > maxRetries=45 > 2013-07-17 16:38:36,203 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: host-10-18-40-15/10.18.40.59:8048. 
Already tried 44 time(s); > maxRetries=45 > 2013-07-17 16:38:56,207 INFO > org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Error > launching appattempt_1373952096466_0056_01. Got exception: > java.lang.reflect.UndeclaredThrowableException > 2013-07-17 16:38:56,207 ERROR > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: > Can't handle this event at current state > org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: > LAUNCH_FAILED at FAILED > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:445) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:630) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:99) > at > org.apache.
[jira] [Commented] (YARN-2246) Job History Link in RM UI is redirecting to the URL which contains Job Id twice
[ https://issues.apache.org/jira/browse/YARN-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308734#comment-14308734 ] Zhijie Shen commented on YARN-2246: --- bq. IMHO the proxy URL advertised to clients should always be http://rmaddr/proxy/appid, without other stuff tacked on the end. +1 for the idea. In WebAppProxyServlet, we don't need to append {{rest}} to the end of {{trackingUri}}, but just include the query param. BTW, this may assume the framework has done proper multiplexing on the tracking URL for each application, which means app1 has the tracking URL {{http://x/y/app1}}, while app2 has {{http://x/y/app2}}. It seems that both MR and TEZ (TEZ-2018) are composing the tracking URL in this way. > Job History Link in RM UI is redirecting to the URL which contains Job Id > twice > --- > > Key: YARN-2246 > URL: https://issues.apache.org/jira/browse/YARN-2246 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 3.0.0, 0.23.11, 2.5.0 >Reporter: Devaraj K >Assignee: Devaraj K > Attachments: MAPREDUCE-4064-1.patch, MAPREDUCE-4064.patch > > > {code:xml} > http://xx.x.x.x:19888/jobhistory/job/job_1332435449546_0001/jobhistory/job/job_1332435449546_0001 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
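As a rough sketch of that suggestion (simplified, and not the actual WebAppProxyServlet code), the proxy would forward only the original query string rather than appending the leftover path remainder:
{code}
import java.net.URI;

public class ProxyUriSketch {
  // Simplified illustration: build the target URI from the registered
  // tracking URI plus the original query string, without tacking the
  // extra path ("rest") onto the end.
  static URI buildTargetUri(URI trackingUri, String queryString) {
    String base = trackingUri.toString();
    return URI.create(queryString == null || queryString.isEmpty()
        ? base : base + "?" + queryString);
  }
}
{code}
With this shape, a tracking URL of http://x/y/app1 stays http://x/y/app1 on redirect instead of growing a duplicated path segment.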
[jira] [Commented] (YARN-893) Capacity scheduler allocates vcores to containers but does not report it in headroom
[ https://issues.apache.org/jira/browse/YARN-893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308727#comment-14308727 ] Tsuyoshi OZAWA commented on YARN-893: - Makes sense to me. Closing this as invalid. > Capacity scheduler allocates vcores to containers but does not report it in > headroom > > > Key: YARN-893 > URL: https://issues.apache.org/jira/browse/YARN-893 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.1.0-beta, 2.3.0 >Reporter: Bikas Saha >Assignee: Kenji Kikushima > Attachments: YARN-893-2.patch, YARN-893.patch > > > In non-DRF mode, it reports 0 vcores in the headroom but it allocates 1 vcore > to containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (YARN-893) Capacity scheduler allocates vcores to containers but does not report it in headroom
[ https://issues.apache.org/jira/browse/YARN-893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi OZAWA resolved YARN-893. - Resolution: Won't Fix > Capacity scheduler allocates vcores to containers but does not report it in > headroom > > > Key: YARN-893 > URL: https://issues.apache.org/jira/browse/YARN-893 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.1.0-beta, 2.3.0 >Reporter: Bikas Saha >Assignee: Kenji Kikushima > Attachments: YARN-893-2.patch, YARN-893.patch > > > In non-DRF mode, it reports 0 vcores in the headroom but it allocates 1 vcore > to containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1
[ https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308717#comment-14308717 ] Rajesh Balamohan commented on YARN-2928: In certain cases, it might be required to mine a specific job's data by exporting contents out of ATS. Would there be any support for an export tool to get data out of ATS? > Application Timeline Server (ATS) next gen: phase 1 > --- > > Key: YARN-2928 > URL: https://issues.apache.org/jira/browse/YARN-2928 > Project: Hadoop YARN > Issue Type: New Feature > Components: timelineserver >Reporter: Sangjin Lee >Priority: Critical > Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal > v1.pdf > > > We have the application timeline server implemented in yarn per YARN-1530 and > YARN-321. Although it is a great feature, we have recognized several critical > issues and features that need to be addressed. > This JIRA proposes the design and implementation changes to address those. > This is phase 1 of this effort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-893) Capacity scheduler allocates vcores to containers but does not report it in headroom
[ https://issues.apache.org/jira/browse/YARN-893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308705#comment-14308705 ] Arun C Murthy commented on YARN-893: [~kj-ki] & [~ozawa] - the {{DefaultResourceCalculator}} is meant to not use vcores by design - if vcores is desired, one should use the {{DominantResourceCalculator}}. Makes sense? > Capacity scheduler allocates vcores to containers but does not report it in > headroom > > > Key: YARN-893 > URL: https://issues.apache.org/jira/browse/YARN-893 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.1.0-beta, 2.3.0 >Reporter: Bikas Saha >Assignee: Kenji Kikushima > Attachments: YARN-893-2.patch, YARN-893.patch > > > In non-DRF mode, it reports 0 vcores in the headroom but it allocates 1 vcore > to containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
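For reference, a deployment that wants vcores reflected in headroom switches the CapacityScheduler to the dominant-resource calculator; a minimal sketch, shown programmatically here (the property is normally set in capacity-scheduler.xml):
{code}
import org.apache.hadoop.conf.Configuration;

public class DrfCalculatorExample {
  public static void main(String[] args) {
    // Account for vcores alongside memory in scheduling and headroom.
    Configuration conf = new Configuration();
    conf.set("yarn.scheduler.capacity.resource-calculator",
        "org.apache.hadoop.yarn.util.resource.DominantResourceCalculator");
  }
}
{code}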
[jira] [Commented] (YARN-1778) TestFSRMStateStore fails on trunk
[ https://issues.apache.org/jira/browse/YARN-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308663#comment-14308663 ] zhihai xu commented on YARN-1778: - [~ozawa], Not sure what you mean. The retry count is not hard-coded, based on the following code at [DFSOutputStream#completeFile|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java#L1540] {code} int retries = dfsClient.getConf().nBlockWriteLocateFollowingRetry; {code} nBlockWriteLocateFollowingRetry is decided by the configuration "dfs.client.block.write.locateFollowingBlock.retries". The problem for me is that the retry in DFSOutputStream#completeFile doesn't work. Based on the log, it retried 5 times over more than 30 seconds and still didn't work; then the exception "Unable to close file because the last block does not have enough number of replicas", generated from [FileSystemRMStateStore#writeFile|https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java#L583], caused an RM restart. My patch will work better, with retries at both the high layer (new code) and the low layer (old code), because it retries in FileSystemRMStateStore#writeFile: if any exception happens, it will [overwrite the file|https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java#L581] and redo everything. > TestFSRMStateStore fails on trunk > - > > Key: YARN-1778 > URL: https://issues.apache.org/jira/browse/YARN-1778 > Project: Hadoop YARN > Issue Type: Test >Reporter: Xuan Gong >Assignee: zhihai xu > Attachments: YARN-1778.000.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
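A minimal sketch of that high-layer retry idea (method and parameter names assumed, not the patch itself): on any failure the file is recreated with overwrite and the whole write is redone, rather than relying only on the low-level completeFile retry:
{code}
import java.io.IOException;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WriteFileRetrySketch {
  // Redo the entire write on failure, overwriting any half-finished file.
  static void writeFileWithRetries(FileSystem fs, Path file, byte[] data,
      int maxRetries, long retryIntervalMs)
      throws IOException, InterruptedException {
    for (int attempt = 0; ; attempt++) {
      try (FSDataOutputStream out = fs.create(file, true /* overwrite */)) {
        out.write(data);
        return; // close() succeeding implies the last block is replicated
      } catch (IOException e) {
        if (attempt >= maxRetries) {
          throw e; // give up after the configured number of attempts
        }
        Thread.sleep(retryIntervalMs); // back off, then redo everything
      }
    }
  }
}
{code}
Note that "Unable to close file..." surfaces from close(), which try-with-resources routes into the catch block, so each failed close also triggers a full rewrite.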
[jira] [Commented] (YARN-3100) Make YARN authorization pluggable
[ https://issues.apache.org/jira/browse/YARN-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308607#comment-14308607 ] Jian He commented on YARN-3100: --- I see, got your point. I'll fix this in the next patch. Thanks! And I think the existing code will have this problem when an exception is thrown [here| https://git1-us-west.apache.org/repos/asf?p=hadoop.git;a=blob;f=hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java;h=c1432101510b30cab5979223c4a52b813cfc7aee;hb=HEAD#l497]. I can open a jira for this if you also think this is an issue. > Make YARN authorization pluggable > - > > Key: YARN-3100 > URL: https://issues.apache.org/jira/browse/YARN-3100 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Jian He > Attachments: YARN-3100.1.patch, YARN-3100.2.patch > > > The goal is to make the YARN ACL model pluggable so as to integrate other > authorization tools such as Apache Ranger and Sentry. > Currently, we have > - admin ACL > - queue ACL > - application ACL > - time line domain ACL > - service ACL > The proposal is to create a YarnAuthorizationProvider interface. The current > implementation will be the default implementation. A Ranger or Sentry plug-in > can implement this interface. > Benefit: > - Unify the code base. With the default implementation, we can get rid of > each specific ACL manager such as AdminAclManager, ApplicationACLsManager, > QueueAclsManager etc. > - Enable Ranger and Sentry to do authorization for YARN. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3021) YARN's delegation-token handling disallows certain trust setups to operate properly
[ https://issues.apache.org/jira/browse/YARN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308580#comment-14308580 ] Hadoop QA commented on YARN-3021: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12696924/YARN-3021.003.patch against trunk revision 45ea53f. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 13 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.conf.TestJobConf Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6533//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6533//artifact/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-core.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6533//console This message is automatically generated. > YARN's delegation-token handling disallows certain trust setups to operate > properly > --- > > Key: YARN-3021 > URL: https://issues.apache.org/jira/browse/YARN-3021 > Project: Hadoop YARN > Issue Type: Bug > Components: security >Affects Versions: 2.3.0 >Reporter: Harsh J > Attachments: YARN-3021.001.patch, YARN-3021.002.patch, > YARN-3021.003.patch, YARN-3021.patch > > > Consider this scenario of 3 realms: A, B and COMMON, where A trusts COMMON, > and B trusts COMMON (a one-way trust in both cases), and both A and B run HDFS + YARN > clusters. > Now if one logs in with a COMMON credential, and runs a job on A's YARN that > needs to access B's HDFS (such as a DistCp), the operation fails in the RM, > as it attempts a renewDelegationToken(…) synchronously during application > submission (to validate the managed token before it adds it to a scheduler > for automatic renewal). The call obviously fails because the B realm will not trust > A's credentials (here, the RM's principal is the renewer). > In the 1.x JobTracker the same call is present, but it is done asynchronously, > and once the renewal attempt failed we simply ceased to schedule any further > renewal attempts, rather than failing the job immediately. > We should change the logic such that we attempt the renewal but go easy on > the failure and skip the scheduling alone, rather than bubble an error back > to the client, failing the app submission. This way the old behaviour is > retained. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3100) Make YARN authorization pluggable
[ https://issues.apache.org/jira/browse/YARN-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308559#comment-14308559 ] Chris Douglas commented on YARN-3100: - bq. I agree with you that if construction of Q' fails, we possibly get a mix of Q' and Q ACLs, which happens in the existing code. I think the existing code doesn't have this property. ACLs [parsed|https://git1-us-west.apache.org/repos/asf?p=hadoop.git;a=blob;f=hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java;h=c1432101510b30cab5979223c4a52b813cfc7aee;hb=HEAD#l156] from the config are stored in a [member field|https://git1-us-west.apache.org/repos/asf?p=hadoop.git;a=blob;f=hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractCSQueue.java;h=e4c26658b0bf5301892ce7c618402ece3a6ea360;hb=HEAD#l273]. If construction fails, those ACLs aren't installed. The patch moves enforcement to the authorizer: {noformat} public boolean hasAccess(QueueACL acl, UserGroupInformation user) { synchronized (this) { - if (acls.get(acl).isUserAllowed(user)) { + if (authorizer.checkPermission(toAccessType(acl), queueEntity, user)) { return true; } } {noformat} which is updated during construction of the replacement queue hierarchy. > Make YARN authorization pluggable > - > > Key: YARN-3100 > URL: https://issues.apache.org/jira/browse/YARN-3100 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Jian He > Attachments: YARN-3100.1.patch, YARN-3100.2.patch > > > The goal is to make the YARN ACL model pluggable so as to integrate other > authorization tools such as Apache Ranger and Sentry. > Currently, we have > - admin ACL > - queue ACL > - application ACL > - time line domain ACL > - service ACL > The proposal is to create a YarnAuthorizationProvider interface. The current > implementation will be the default implementation. A Ranger or Sentry plug-in > can implement this interface. > Benefit: > - Unify the code base. With the default implementation, we can get rid of > each specific ACL manager such as AdminAclManager, ApplicationACLsManager, > QueueAclsManager etc. > - Enable Ranger and Sentry to do authorization for YARN. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
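For context, a minimal sketch of what the proposed pluggable interface could look like; the signatures are illustrative assumptions based on this discussion, not the final API:
{code}
import java.util.Map;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.authorize.AccessControlList;

// Illustrative sketch of a pluggable authorizer: the default implementation
// keeps today's ACL behaviour; a Ranger or Sentry plug-in supplies its own.
// EntityT/AccessT stand in for the queue/app entity and access-type types.
public interface YarnAuthorizationProviderSketch<EntityT, AccessT> {

  void init(Configuration conf);

  // Install/replace the ACLs for an entity, e.g. on queue (re)construction.
  void setPermission(EntityT entity, Map<AccessT, AccessControlList> acls,
      UserGroupInformation user);

  // The check that hasAccess() above delegates to.
  boolean checkPermission(AccessT accessType, EntityT entity,
      UserGroupInformation user);
}
{code}
Keeping setPermission separate from queue construction is what makes the "mix of Q' and Q ACLs" concern relevant: the authorizer's state only changes when the replacement hierarchy installs its permissions.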
[jira] [Commented] (YARN-2664) Improve RM webapp to expose info about reservations.
[ https://issues.apache.org/jira/browse/YARN-2664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308499#comment-14308499 ] Karthik Kambatla commented on YARN-2664: I am okay with adding a "ReservationID" field even when reservations are not enabled. Would be nice to set it to *N/A* by default. > Improve RM webapp to expose info about reservations. > > > Key: YARN-2664 > URL: https://issues.apache.org/jira/browse/YARN-2664 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Carlo Curino >Assignee: Matteo Mazzucchelli > Attachments: PlannerPage_screenshot.pdf, YARN-2664.1.patch, > YARN-2664.2.patch, YARN-2664.3.patch, YARN-2664.4.patch, YARN-2664.5.patch, > YARN-2664.6.patch, YARN-2664.7.patch, YARN-2664.8.patch, YARN-2664.9.patch, > YARN-2664.patch, legal.patch, screenshot_reservation_UI.pdf > > > YARN-1051 provides a new functionality in the RM to ask for reservation on > resources. Exposing this through the webapp GUI is important. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1
[ https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308459#comment-14308459 ] Hitesh Shah commented on YARN-2928: --- [~vinodkv] Should have probably added more context from the design doc: "We assume that the failure semantics of the ATS writer companion is the same as the AM. If the ATS writer companion fails for any reason, we try to bring it back up, up to a specified number of times. If the maximum retries are exhausted, we consider it a fatal failure, and fail the application." > Application Timeline Server (ATS) next gen: phase 1 > --- > > Key: YARN-2928 > URL: https://issues.apache.org/jira/browse/YARN-2928 > Project: Hadoop YARN > Issue Type: New Feature > Components: timelineserver >Reporter: Sangjin Lee >Priority: Critical > Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal > v1.pdf > > > We have the application timeline server implemented in yarn per YARN-1530 and > YARN-321. Although it is a great feature, we have recognized several critical > issues and features that need to be addressed. > This JIRA proposes the design and implementation changes to address those. > This is phase 1 of this effort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2694) Ensure only single node labels specified in resource request / host, and node label expression only specified when resourceName=ANY
[ https://issues.apache.org/jira/browse/YARN-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308454#comment-14308454 ] Hadoop QA commented on YARN-2694: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12696916/YARN-2694-20150205-3.patch against trunk revision 4641196. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 9 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.client.TestResourceTrackerOnHA org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6532//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6532//console This message is automatically generated. > Ensure only single node labels specified in resource request / host, and node > label expression only specified when resourceName=ANY > --- > > Key: YARN-2694 > URL: https://issues.apache.org/jira/browse/YARN-2694 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, resourcemanager >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-2694-20141020-1.patch, YARN-2694-20141021-1.patch, > YARN-2694-20141023-1.patch, YARN-2694-20141023-2.patch, > YARN-2694-20141101-1.patch, YARN-2694-20141101-2.patch, > YARN-2694-20150121-1.patch, YARN-2694-20150122-1.patch, > YARN-2694-20150202-1.patch, YARN-2694-20150203-1.patch, > YARN-2694-20150203-2.patch, YARN-2694-20150204-1.patch, > YARN-2694-20150205-1.patch, YARN-2694-20150205-2.patch, > YARN-2694-20150205-3.patch > > > Currently, node label expression support in the capacity scheduler is only partially > complete. A node label expression specified in a ResourceRequest will only be > respected when it is specified at the ANY level, and a ResourceRequest/host with > multiple node labels makes user limit, etc. computation more > tricky. > For now we need to temporarily disable them; changes include: > - AMRMClient > - ApplicationMasterService > - RMAdminCLI > - CommonNodeLabelsManager -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3149) Typo in message for invalid application id
[ https://issues.apache.org/jira/browse/YARN-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308435#comment-14308435 ] Bibin A Chundatt commented on YARN-3149: - Thank you Tsuyoshi for resubmitting the patch and Xuan for committing. The result was strange. > Typo in message for invalid application id > -- > > Key: YARN-3149 > URL: https://issues.apache.org/jira/browse/YARN-3149 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Trivial > Fix For: 2.7.0 > > Attachments: YARN-3149.patch, YARN-3149.patch, screenshot-1.png > > > The message shown in the console is wrong when the application id format is wrong. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2664) Improve RM webapp to expose info about reservations.
[ https://issues.apache.org/jira/browse/YARN-2664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308432#comment-14308432 ] Carlo Curino commented on YARN-2664: [~vinodkv], [~kasha] this patch looks good, but [~chris.douglas] and I want your opinion on something: The patch adds a "ReservationID" field to the AppBlock. This is set to "(best effort)" for jobs without a reservation. Are we ok having the field always present (even if the reservation system is not turned on)? If not, we can ask [~mazzu] to add the proper switches, but it adds a lot of switches to the rendering code (not sure if it is worth it). nit pick: [~mazzu] please check code conventions; there is at least one place where you don't follow them (in NavBlock). > Improve RM webapp to expose info about reservations. > > > Key: YARN-2664 > URL: https://issues.apache.org/jira/browse/YARN-2664 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Carlo Curino >Assignee: Matteo Mazzucchelli > Attachments: PlannerPage_screenshot.pdf, YARN-2664.1.patch, > YARN-2664.2.patch, YARN-2664.3.patch, YARN-2664.4.patch, YARN-2664.5.patch, > YARN-2664.6.patch, YARN-2664.7.patch, YARN-2664.8.patch, YARN-2664.9.patch, > YARN-2664.patch, legal.patch, screenshot_reservation_UI.pdf > > > YARN-1051 provides a new functionality in the RM to ask for reservation on > resources. Exposing this through the webapp GUI is important. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1
[ https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308431#comment-14308431 ] Vinod Kumar Vavilapalli commented on YARN-2928: --- bq. The AM and the ATS writer are always considered as a pair, both in terms of resource allocation and failure handling. bq. Why is this necessary? Why does the ATS layer decide what is fatal or non-fatal for an application? This might have meant something different. Colocating the AM and the Timeline aggregator is a physical optimization that also simplifies scheduling a bit. So if the AM fails and runs on a different host, it may make sense to move the aggregator too. > Application Timeline Server (ATS) next gen: phase 1 > --- > > Key: YARN-2928 > URL: https://issues.apache.org/jira/browse/YARN-2928 > Project: Hadoop YARN > Issue Type: New Feature > Components: timelineserver >Reporter: Sangjin Lee >Priority: Critical > Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal > v1.pdf > > > We have the application timeline server implemented in yarn per YARN-1530 and > YARN-321. Although it is a great feature, we have recognized several critical > issues and features that need to be addressed. > This JIRA proposes the design and implementation changes to address those. > This is phase 1 of this effort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1
[ https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli reassigned YARN-2928: - Assignee: (was: Vinod Kumar Vavilapalli) Might have accidentally assigned it to myself, didn't realize before, unassigning. > Application Timeline Server (ATS) next gen: phase 1 > --- > > Key: YARN-2928 > URL: https://issues.apache.org/jira/browse/YARN-2928 > Project: Hadoop YARN > Issue Type: New Feature > Components: timelineserver >Reporter: Sangjin Lee >Priority: Critical > Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal > v1.pdf > > > We have the application timeline server implemented in yarn per YARN-1530 and > YARN-321. Although it is a great feature, we have recognized several critical > issues and features that need to be addressed. > This JIRA proposes the design and implementation changes to address those. > This is phase 1 of this effort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3021) YARN's delegation-token handling disallows certain trust setups to operate properly
[ https://issues.apache.org/jira/browse/YARN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308422#comment-14308422 ] Vinod Kumar Vavilapalli commented on YARN-3021: --- Though the patch unblocks the jobs in the short term, it seems like, long term, this is still bad. Applications that want to run for longer than 7 days in such setups will just fail, with no other way out. Maybe the solution is the following: - Explicitly have an external renewer system that has the right permissions to renew these tokens. Working with such an external renewer system needs support in frameworks, e.g., in MapReduce, a renewal server list similar to mapreduce.job.hdfs-servers. - The RM can simply inspect the incoming renewer specified in the token and skip renewing those tokens if the renewer doesn't match its own address. This way, we don't need an explicit API in the submission context. Apologies for going back and forth on this one. Does that work? /cc [~jianhe], [~kasha]. Irrespective of how we decide to skip tokens, the way the patch is skipping renewal will not work. In secure mode, DelegationTokenRenewer drives the app state machine. So if you skip adding the app itself to DTR, the app will be completely stuck. > YARN's delegation-token handling disallows certain trust setups to operate > properly > --- > > Key: YARN-3021 > URL: https://issues.apache.org/jira/browse/YARN-3021 > Project: Hadoop YARN > Issue Type: Bug > Components: security >Affects Versions: 2.3.0 >Reporter: Harsh J > Attachments: YARN-3021.001.patch, YARN-3021.002.patch, > YARN-3021.003.patch, YARN-3021.patch > > > Consider this scenario of 3 realms: A, B and COMMON, where A trusts COMMON, > and B trusts COMMON (a one-way trust in both cases), and both A and B run HDFS + YARN > clusters. > Now if one logs in with a COMMON credential, and runs a job on A's YARN that > needs to access B's HDFS (such as a DistCp), the operation fails in the RM, > as it attempts a renewDelegationToken(…) synchronously during application > submission (to validate the managed token before it adds it to a scheduler > for automatic renewal). The call obviously fails because the B realm will not trust > A's credentials (here, the RM's principal is the renewer). > In the 1.x JobTracker the same call is present, but it is done asynchronously, > and once the renewal attempt failed we simply ceased to schedule any further > renewal attempts, rather than failing the job immediately. > We should change the logic such that we attempt the renewal but go easy on > the failure and skip the scheduling alone, rather than bubble an error back > to the client, failing the app submission. This way the old behaviour is > retained. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
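A sketch of that renewer-matching idea (a hypothetical helper; the real DelegationTokenRenewer wiring is more involved): decode the token's designated renewer and renew only when it names this RM:
{code}
import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.security.token.TokenIdentifier;
import org.apache.hadoop.security.token.delegation.AbstractDelegationTokenIdentifier;

public class RenewerCheckSketch {
  // Skip scheduling renewal for tokens whose designated renewer is not this
  // RM, instead of failing app submission when the renew call is rejected.
  static boolean shouldRenew(Token<? extends TokenIdentifier> token,
      String rmPrincipal) throws IOException {
    TokenIdentifier ident = token.decodeIdentifier();
    if (ident instanceof AbstractDelegationTokenIdentifier) {
      Text renewer = ((AbstractDelegationTokenIdentifier) ident).getRenewer();
      return renewer != null && rmPrincipal.equals(renewer.toString());
    }
    return true; // unknown token kinds: keep today's behaviour
  }
}
{code}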
[jira] [Commented] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1
[ https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308413#comment-14308413 ] Hitesh Shah commented on YARN-2928: --- More questions: What are the main differences between meta-data and configuration? What search/sort/aggregate/count functionality is planned to be supported? It seems, based on the design doc, that certain functionality is not supported on configuration. Does this mean that it is simpler to dump all the data into the meta-data to make it searchable? Where do the current implementation's "otherInfo" and "primaryFilters" fit in? What use are events? Will there be a "streaming" API available to listen to all events based on some search criteria? If there is a hierarchy of objects, will there be support to listen to or retrieve all events for a given tree by providing a root node? How does an application define a relationship for its entity to a system entity? > Application Timeline Server (ATS) next gen: phase 1 > --- > > Key: YARN-2928 > URL: https://issues.apache.org/jira/browse/YARN-2928 > Project: Hadoop YARN > Issue Type: New Feature > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Vinod Kumar Vavilapalli >Priority: Critical > Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal > v1.pdf > > > We have the application timeline server implemented in yarn per YARN-1530 and > YARN-321. Although it is a great feature, we have recognized several critical > issues and features that need to be addressed. > This JIRA proposes the design and implementation changes to address those. > This is phase 1 of this effort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2423) TimelineClient should wrap all GET APIs to facilitate Java users
[ https://issues.apache.org/jira/browse/YARN-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308396#comment-14308396 ] Zhijie Shen commented on YARN-2423: --- Thanks for the patch, Robert! Overall, it seems to be on the right track. Here are some initial comments; I will read more of the patch. 1. IMHO, we may not want to expose the internal data structure {{NameValuePair}}. See how the filters are handled in {{TimelineEntity}}: NameValuePair -> (String key, Object value) and Collection<NameValuePair> -> Map<String, Set<Object>>? {code} NameValuePair primaryFilter, Collection<NameValuePair> secondaryFilters, {code} 2. From the point of view of the client, we actually do not need to be so specific about the collection type, such as {{SortedSet entityIds}}. Maybe we can be as general as Collection, or a List/Set that is consistent with the other collection interfaces used in the client lib. The collection will be serialized before sending to the server, and then deserialized to the suitable collection again. 3. Can we refactor doGettingJson and doPosting to reuse the code? > TimelineClient should wrap all GET APIs to facilitate Java users > > > Key: YARN-2423 > URL: https://issues.apache.org/jira/browse/YARN-2423 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zhijie Shen >Assignee: Robert Kanter > Attachments: YARN-2423.004.patch, YARN-2423.005.patch, > YARN-2423.006.patch, YARN-2423.patch, YARN-2423.patch, YARN-2423.patch > > > TimelineClient provides the Java method to put timeline entities. It's also > good to wrap over all GET APIs (both entity and domain), and deserialize the > JSON response into Java POJO objects. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
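To make comment 1 concrete, a hypothetical shape for a wrapped GET (method and parameter names are illustrative, not the final client API):
{code}
import java.io.IOException;
import java.util.List;
import java.util.Map;
import java.util.Set;
import org.apache.hadoop.yarn.api.records.timeline.TimelineEntity;

// Illustrative sketch: a GET wrapper that takes plain key/value filters
// (mirroring TimelineEntity's filter types) instead of exposing the
// internal NameValuePair structure.
public interface TimelineGetApiSketch {
  List<TimelineEntity> getEntities(
      String entityType,
      String primaryFilterKey, Object primaryFilterValue, // one primary filter
      Map<String, Set<Object>> secondaryFilters,          // as in TimelineEntity
      Long windowStart, Long windowEnd, Integer limit) throws IOException;
}
{code}
The implementation would issue the GET against the timeline REST endpoint and deserialize the JSON response into TimelineEntity POJOs, as the issue description proposes.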
[jira] [Updated] (YARN-3021) YARN's delegation-token handling disallows certain trust setups to operate properly
[ https://issues.apache.org/jira/browse/YARN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated YARN-3021: Attachment: YARN-3021.003.patch Uploaded rev 003 to address Zhihai's comments. Thanks Zhihai. > YARN's delegation-token handling disallows certain trust setups to operate > properly > --- > > Key: YARN-3021 > URL: https://issues.apache.org/jira/browse/YARN-3021 > Project: Hadoop YARN > Issue Type: Bug > Components: security >Affects Versions: 2.3.0 >Reporter: Harsh J > Attachments: YARN-3021.001.patch, YARN-3021.002.patch, > YARN-3021.003.patch, YARN-3021.patch > > > Consider this scenario of 3 realms: A, B and COMMON, where A trusts COMMON, > and B trusts COMMON (a one-way trust in both cases), and both A and B run HDFS + YARN > clusters. > Now if one logs in with a COMMON credential, and runs a job on A's YARN that > needs to access B's HDFS (such as a DistCp), the operation fails in the RM, > as it attempts a renewDelegationToken(…) synchronously during application > submission (to validate the managed token before it adds it to a scheduler > for automatic renewal). The call obviously fails because the B realm will not trust > A's credentials (here, the RM's principal is the renewer). > In the 1.x JobTracker the same call is present, but it is done asynchronously, > and once the renewal attempt failed we simply ceased to schedule any further > renewal attempts, rather than failing the job immediately. > We should change the logic such that we attempt the renewal but go easy on > the failure and skip the scheduling alone, rather than bubble an error back > to the client, failing the app submission. This way the old behaviour is > retained. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3144) Configuration for making delegation token failures to timeline server not-fatal
[ https://issues.apache.org/jira/browse/YARN-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308345#comment-14308345 ] Hadoop QA commented on YARN-3144: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12696908/YARN-3144.2.patch against trunk revision b77ff37. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6530//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6530//console This message is automatically generated. > Configuration for making delegation token failures to timeline server > not-fatal > --- > > Key: YARN-3144 > URL: https://issues.apache.org/jira/browse/YARN-3144 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Jonathan Eagles >Assignee: Jonathan Eagles > Attachments: YARN-3144.1.patch, YARN-3144.2.patch > > > Posting events to the timeline server is best-effort. However, a failure to get the > delegation tokens from the timeline server will kill the job. This patch adds > a configuration to make get-delegation-token operations "best-effort". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3145) ConcurrentModificationException on CapacityScheduler ParentQueue#getQueueUserAclInfo
[ https://issues.apache.org/jira/browse/YARN-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308337#comment-14308337 ] Tsuyoshi OZAWA commented on YARN-3145: -- Thanks Jian and Wangda for your review and thanks Rohith for taking this. > ConcurrentModificationException on CapacityScheduler > ParentQueue#getQueueUserAclInfo > > > Key: YARN-3145 > URL: https://issues.apache.org/jira/browse/YARN-3145 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Tsuyoshi OZAWA > Fix For: 2.7.0 > > Attachments: YARN-3145.001.patch, YARN-3145.002.patch > > > {code} > java.util.ConcurrentModificationException(java.util.ConcurrentModificationException > at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115) > at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.getQueueUserAclInfo(ParentQueue.java:347) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.getQueueUserAclInfo(ParentQueue.java:348) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getQueueUserAclInfo(CapacityScheduler.java:850) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueUserAcls(ClientRMService.java:844) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueUserAcls(ApplicationClientProtocolPBServiceImpl.java:250) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:335) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
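The usual remedy for this failure mode, sketched below under assumed field and helper names (not the committed patch): snapshot the shared collection under the lock, then iterate the snapshot, so a concurrent modification of the child-queue map cannot invalidate the iterator:
{code}
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

public class SnapshotIterationSketch<Q> {
  private final Collection<Q> childQueues = new ArrayList<Q>();

  // Copy under the lock and walk the copy, so a concurrent add/remove on
  // the live collection cannot throw ConcurrentModificationException
  // mid-iteration.
  public List<Q> snapshotChildren() {
    synchronized (this) {
      return new ArrayList<Q>(childQueues);
    }
  }
}
{code}
Callers like getQueueUserAclInfo would then aggregate ACL info from the snapshot instead of iterating the live TreeMap-backed set while it may be reinitialized.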
[jira] [Updated] (YARN-2694) Ensure only single node labels specified in resource request / host, and node label expression only specified when resourceName=ANY
[ https://issues.apache.org/jira/browse/YARN-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-2694: - Attachment: YARN-2694-20150205-3.patch Reverted the change of {code} if (null == req.getNodeLabelExpression() && ResourceRequest.ANY.equals(req.getResourceName())) { req.setNodeLabelExpression(asc.getNodeLabelExpression()); } {code} in ApplicationMasterService. We should not check for empty here, because empty means it has been set, and this logic is "overwrite when not set". Updated the patch. > Ensure only single node labels specified in resource request / host, and node > label expression only specified when resourceName=ANY > --- > > Key: YARN-2694 > URL: https://issues.apache.org/jira/browse/YARN-2694 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, resourcemanager >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-2694-20141020-1.patch, YARN-2694-20141021-1.patch, > YARN-2694-20141023-1.patch, YARN-2694-20141023-2.patch, > YARN-2694-20141101-1.patch, YARN-2694-20141101-2.patch, > YARN-2694-20150121-1.patch, YARN-2694-20150122-1.patch, > YARN-2694-20150202-1.patch, YARN-2694-20150203-1.patch, > YARN-2694-20150203-2.patch, YARN-2694-20150204-1.patch, > YARN-2694-20150205-1.patch, YARN-2694-20150205-2.patch, > YARN-2694-20150205-3.patch > > > Currently, node label expression support in the capacity scheduler is only partially > complete. A node label expression specified in a ResourceRequest will only be > respected when it is specified at the ANY level, and a ResourceRequest/host with > multiple node labels makes user limit, etc. computation more > tricky. > For now we need to temporarily disable them; changes include: > - AMRMClient > - ApplicationMasterService > - RMAdminCLI > - CommonNodeLabelsManager -- This message was sent by Atlassian JIRA (v6.3.4#6332)
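A compact sketch of the two constraints this JIRA enforces (helper and exception choices are assumptions; "&&" is the label-expression AND separator):
{code}
public class NodeLabelRequestCheckSketch {
  static final String ANY = "*"; // ResourceRequest.ANY

  // (1) at most one node label per expression; (2) an expression is only
  // allowed when resourceName == ANY.
  static void validate(String resourceName, String labelExpression) {
    if (labelExpression == null) {
      return; // nothing specified: always fine
    }
    if (labelExpression.split("&&").length > 1) {
      throw new IllegalArgumentException(
          "Only a single node label is allowed per resource request: "
              + labelExpression);
    }
    if (!ANY.equals(resourceName) && !labelExpression.trim().isEmpty()) {
      throw new IllegalArgumentException(
          "A node label expression may only be specified when resourceName=ANY");
    }
  }
}
{code}
Note how this interacts with the reverted change above: an empty expression counts as "set", so only a truly null expression is overwritten with the application-level default.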
[jira] [Commented] (YARN-3145) ConcurrentModificationException on CapacityScheduler ParentQueue#getQueueUserAclInfo
[ https://issues.apache.org/jira/browse/YARN-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308327#comment-14308327 ] Hudson commented on YARN-3145: -- FAILURE: Integrated in Hadoop-trunk-Commit #7030 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7030/]) YARN-3145. Fixed ConcurrentModificationException on CapacityScheduler ParentQueue#getQueueUserAclInfo. Contributed by Tsuyoshi OZAWA (jianhe: rev 4641196fe02af5cab3d56a9f3c78875c495dbe03) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java > ConcurrentModificationException on CapacityScheduler > ParentQueue#getQueueUserAclInfo > > > Key: YARN-3145 > URL: https://issues.apache.org/jira/browse/YARN-3145 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Tsuyoshi OZAWA > Fix For: 2.7.0 > > Attachments: YARN-3145.001.patch, YARN-3145.002.patch > > > {code} > java.util.ConcurrentModificationException(java.util.ConcurrentModificationException > at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115) > at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.getQueueUserAclInfo(ParentQueue.java:347) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.getQueueUserAclInfo(ParentQueue.java:348) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getQueueUserAclInfo(CapacityScheduler.java:850) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueUserAcls(ClientRMService.java:844) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueUserAcls(ApplicationClientProtocolPBServiceImpl.java:250) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:335) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (YARN-2292) RM web services should use hadoop-common for authentication using delegation tokens
[ https://issues.apache.org/jira/browse/YARN-2292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli resolved YARN-2292. --- Resolution: Duplicate Closing with the right status. > RM web services should use hadoop-common for authentication using delegation > tokens > --- > > Key: YARN-2292 > URL: https://issues.apache.org/jira/browse/YARN-2292 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Fix For: 2.6.0 > > > HADOOP-10771 refactors the WebHDFS authentication code to hadoop-common. > YARN-2290 will add support for passing delegation tokens via headers. Once > support is added RM web services should use the authentication code from > hadoop-common -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (YARN-2292) RM web services should use hadoop-common for authentication using delegation tokens
[ https://issues.apache.org/jira/browse/YARN-2292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli reopened YARN-2292: --- > RM web services should use hadoop-common for authentication using delegation > tokens > --- > > Key: YARN-2292 > URL: https://issues.apache.org/jira/browse/YARN-2292 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Fix For: 2.6.0 > > > HADOOP-10771 refactors the WebHDFS authentication code to hadoop-common. > YARN-2290 will add support for passing delegation tokens via headers. Once > support is added RM web services should use the authentication code from > hadoop-common -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (YARN-2291) Timeline and RM web services should use same authentication code
[ https://issues.apache.org/jira/browse/YARN-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli resolved YARN-2291. --- Resolution: Duplicate Closing with the right resolution. > Timeline and RM web services should use same authentication code > > > Key: YARN-2291 > URL: https://issues.apache.org/jira/browse/YARN-2291 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Fix For: 2.6.0 > > > The TimelineServer and the RM web services have very similar requirements and > implementation for authentication via delegation tokens apart from the fact > that the RM web services requires delegation tokens to be passed as a header. > They should use the same code base instead of different implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (YARN-2291) Timeline and RM web services should use same authentication code
[ https://issues.apache.org/jira/browse/YARN-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli reopened YARN-2291: --- > Timeline and RM web services should use same authentication code > > > Key: YARN-2291 > URL: https://issues.apache.org/jira/browse/YARN-2291 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Fix For: 2.6.0 > > > The TimelineServer and the RM web services have very similar requirements and > implementation for authentication via delegation tokens apart from the fact > that the RM web services requires delegation tokens to be passed as a header. > They should use the same code base instead of different implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1
[ https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308295#comment-14308295 ] Hitesh Shah commented on YARN-2928: --- Some questions on the design doc: bq. The AM and the ATS writer are always considered as a pair, both in terms of resource allocation and failure handling. Why is this necessary? Why does the ATS layer decide what is fatal or non-fatal for an application? From a Tez perspective, we have a different use-case when it comes to relationships with higher-level applications. A single Tez application can run multiple different Hive queries submitted by different users. In today's implementation, the data generated from each different query (within the same Tez YARN application) will have different access/privacy controls. How do you see the flow relationship being handled in this case, as there is an entity (Tez DAG) that is a child of both a Tez application as well as a Hive query? > Application Timeline Server (ATS) next gen: phase 1 > --- > > Key: YARN-2928 > URL: https://issues.apache.org/jira/browse/YARN-2928 > Project: Hadoop YARN > Issue Type: New Feature > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Vinod Kumar Vavilapalli >Priority: Critical > Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal > v1.pdf > > > We have the application timeline server implemented in yarn per YARN-1530 and > YARN-321. Although it is a great feature, we have recognized several critical > issues and features that need to be addressed. > This JIRA proposes the design and implementation changes to address those. > This is phase 1 of this effort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2694) Ensure only single node labels specified in resource request / host, and node label expression only specified when resourceName=ANY
[ https://issues.apache.org/jira/browse/YARN-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308292#comment-14308292 ] Hadoop QA commented on YARN-2694: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12696888/YARN-2694-20150205-2.patch against trunk revision e1990ab. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 9 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.client.TestResourceTrackerOnHA org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6529//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6529//console This message is automatically generated. > Ensure only single node labels specified in resource request / host, and node > label expression only specified when resourceName=ANY > --- > > Key: YARN-2694 > URL: https://issues.apache.org/jira/browse/YARN-2694 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, resourcemanager >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-2694-20141020-1.patch, YARN-2694-20141021-1.patch, > YARN-2694-20141023-1.patch, YARN-2694-20141023-2.patch, > YARN-2694-20141101-1.patch, YARN-2694-20141101-2.patch, > YARN-2694-20150121-1.patch, YARN-2694-20150122-1.patch, > YARN-2694-20150202-1.patch, YARN-2694-20150203-1.patch, > YARN-2694-20150203-2.patch, YARN-2694-20150204-1.patch, > YARN-2694-20150205-1.patch, YARN-2694-20150205-2.patch > > > Currently, node label expression supporting in capacity scheduler is partial > completed. Now node label expression specified in Resource Request will only > respected when it specified at ANY level. And a ResourceRequest/host with > multiple node labels will make user limit, etc. computation becomes more > tricky. > Now we need temporarily disable them, changes include, > - AMRMClient > - ApplicationMasterService > - RMAdminCLI > - CommonNodeLabelsManager -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3144) Configuration for making delegation token failures to timeline server not-fatal
[ https://issues.apache.org/jira/browse/YARN-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles updated YARN-3144: -- Attachment: YARN-3144.2.patch [~jlowe], the latest patch addresses your comments. Can you have a look when you get a chance? > Configuration for making delegation token failures to timeline server > not-fatal > --- > > Key: YARN-3144 > URL: https://issues.apache.org/jira/browse/YARN-3144 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Jonathan Eagles >Assignee: Jonathan Eagles > Attachments: YARN-3144.1.patch, YARN-3144.2.patch > > > Posting events to the timeline server is best-effort. However, getting the > delegation tokens from the timeline server will kill the job. This patch adds > a configuration to make get delegation token operations "best-effort". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
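A rough sketch of the best-effort behavior being added here; the config key name is an assumption for illustration, and the token-fetch call is a stand-in for the real TimelineClient API:
{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;

public class BestEffortTimelineToken {
  // Key name assumed for illustration; check yarn-default.xml for the real one.
  static final String BEST_EFFORT_KEY = "yarn.timeline-service.client.best-effort";

  static void fetchTimelineToken(Configuration conf) throws IOException {
    try {
      getDelegationTokenFromTimelineServer(); // stand-in for the TimelineClient call
    } catch (IOException e) {
      if (conf.getBoolean(BEST_EFFORT_KEY, false)) {
        // Timeline interaction is best-effort: log the failure and keep going
        // instead of killing the job.
        System.err.println("Ignoring timeline token failure: " + e);
      } else {
        throw e; // default behavior: token failures remain fatal
      }
    }
  }

  static void getDelegationTokenFromTimelineServer() throws IOException {
    throw new IOException("timeline server unreachable"); // stub for the sketch
  }
}
{code}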
[jira] [Commented] (YARN-3089) LinuxContainerExecutor does not handle file arguments to deleteAsUser
[ https://issues.apache.org/jira/browse/YARN-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308278#comment-14308278 ] Xuan Gong commented on YARN-3089: - bq. To avoid another race, are you filing the log aggregation rolling JIRA or shall I? I will do that. > LinuxContainerExecutor does not handle file arguments to deleteAsUser > - > > Key: YARN-3089 > URL: https://issues.apache.org/jira/browse/YARN-3089 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Jason Lowe >Assignee: Eric Payne >Priority: Blocker > Attachments: YARN-3089.v1.txt, YARN-3089.v2.txt, YARN-3089.v3.txt > > > YARN-2468 added the deletion of individual logs that are aggregated, but this > fails to delete log files when the LCE is being used. The LCE native > executable assumes the paths being passed are paths and the delete fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3089) LinuxContainerExecutor does not handle file arguments to deleteAsUser
[ https://issues.apache.org/jira/browse/YARN-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308270#comment-14308270 ] Xuan Gong commented on YARN-3089: - bq. is it true that any MapReduce job or other typical job will have their partial logs uploaded and then removed from the local filesystem after 24 hours? Yes, if all of the following hold:
* the application keeps running for more than 24 hours
* the application does not finish
* the user does not set any roll-over
* the user does not set anything in LogAggregationContext
* the user does not configure a proper value for yarn.nodemanager.delete.debug-delay-sec
> LinuxContainerExecutor does not handle file arguments to deleteAsUser > - > > Key: YARN-3089 > URL: https://issues.apache.org/jira/browse/YARN-3089 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Jason Lowe >Assignee: Eric Payne >Priority: Blocker > Attachments: YARN-3089.v1.txt, YARN-3089.v2.txt, YARN-3089.v3.txt > > > YARN-2468 added the deletion of individual logs that are aggregated, but this > fails to delete log files when the LCE is being used. The LCE native > executable assumes the paths being passed are paths and the delete fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
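For reference, a small sketch of the two knobs discussed in this thread, set through the Java Configuration API; the property names appear in the comments above, and the values here are illustrative only:
{code}
import org.apache.hadoop.conf.Configuration;

public class LogAggregationKnobs {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Roll-over interval for uploading logs of still-running apps; a
    // non-positive value is assumed to disable rolling.
    conf.setLong(
        "yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds",
        24 * 60 * 60);
    // Delay local deletion after aggregation, useful for debugging.
    conf.setLong("yarn.nodemanager.delete.debug-delay-sec", 3600);
    System.out.println(conf.getLong(
        "yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds", -1));
  }
}
{code}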
[jira] [Commented] (YARN-3089) LinuxContainerExecutor does not handle file arguments to deleteAsUser
[ https://issues.apache.org/jira/browse/YARN-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308245#comment-14308245 ] Jason Lowe commented on YARN-3089: -- Ah, comment race. Thanks for confirming Xuan. To avoid another race, are you filing the log aggregation rolling JIRA or shall I? Or is there already an existing JIRA? Back to this JIRA, I'd like to commit this tomorrow if there aren't any objections. > LinuxContainerExecutor does not handle file arguments to deleteAsUser > - > > Key: YARN-3089 > URL: https://issues.apache.org/jira/browse/YARN-3089 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Jason Lowe >Assignee: Eric Payne >Priority: Blocker > Attachments: YARN-3089.v1.txt, YARN-3089.v2.txt, YARN-3089.v3.txt > > > YARN-2468 added the deletion of individual logs that are aggregated, but this > fails to delete log files when the LCE is being used. The LCE native > executable assumes the paths being passed are paths and the delete fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3150) Documenting the timeline service v2
[ https://issues.apache.org/jira/browse/YARN-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-3150: -- Issue Type: Sub-task (was: Bug) Parent: YARN-2928 > Documenting the timeline service v2 > --- > > Key: YARN-3150 > URL: https://issues.apache.org/jira/browse/YARN-3150 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zhijie Shen >Assignee: Zhijie Shen > > Let's make sure we will have a document to describe what's new in TS v2, the > APIs, the client libs and so on. We should do better around documentation in > v2 than v1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3150) Documenting the timeline service v2
Zhijie Shen created YARN-3150: - Summary: Documenting the timeline service v2 Key: YARN-3150 URL: https://issues.apache.org/jira/browse/YARN-3150 Project: Hadoop YARN Issue Type: Bug Reporter: Zhijie Shen Assignee: Zhijie Shen Let's make sure we will have a document to describe what's new in TS v2, the APIs, the client libs and so on. We should do better around documentation in v2 than v1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3100) Make YARN authorization pluggable
[ https://issues.apache.org/jira/browse/YARN-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308239#comment-14308239 ] Jian He commented on YARN-3100: --- Chris, appreciate your feedback! bq. These (1) will be observable by readers of Q who share the instance. bq. I'm curious if (1) and (2) are an artifact of the new plugin architecture or if this also happens in the existing code IIUC, in case refreshQueue is successful, readers won't be able to observe the new Q' ACLs, because while constructing the new Q' it's holding the scheduler lock, and the reader (i.e. the caller of checkPermission) has to acquire the same scheduler lock to check permission. So I think readers won't be able to observe the new ACLs until the Q refresh is completed. But I agree with you that if construction of Q' fails, we possibly get a mix of Q' and Q ACLs, which happens in the existing code too. This could be a problem for other parts of refreshQueue as well, e.g. updating queue capacity. bq. I suppose it could track the sequence of calls that install ACLs and only publish new ACLs when it's received updates for everything, Yes, a simple way is that the plug-in can check if the ACL already exists and only add the new ones. As you said, this is not clean. I think maybe we can print a warning if the admin uses an external component for ACL management but still calls the refreshQueue API to update the ACLs. bq. but that could still yield (2) if the refresh adds new queues before the refresh fails. Yes, it will still yield (2) if the refresh fails. > Make YARN authorization pluggable > - > > Key: YARN-3100 > URL: https://issues.apache.org/jira/browse/YARN-3100 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Jian He > Attachments: YARN-3100.1.patch, YARN-3100.2.patch > > > The goal is to have YARN acl model pluggable so as to integrate other > authorization tool such as Apache Ranger, Sentry. > Currently, we have > - admin ACL > - queue ACL > - application ACL > - time line domain ACL > - service ACL > The proposal is to create a YarnAuthorizationProvider interface. Current > implementation will be the default implementation. Ranger or Sentry plug-in > can implement this interface. > Benefit: > - Unify the code base. With the default implementation, we can get rid of > each specific ACL manager such as AdminAclManager, ApplicationACLsManager, > QueueAclsManager etc. > - Enable Ranger, Sentry to do authorization for YARN. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
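To make the publication problem discussed above concrete, a hypothetical plug-in sketch that stages setPermission calls and publishes them in one step, so readers never observe a mix of Q and Q' ACLs; the interface and ACL representation are illustrative, not the actual YarnAuthorizationProvider API:
{code}
import java.util.HashMap;
import java.util.Map;

public class AtomicAclProvider {
  private volatile Map<String, String> publishedAcls = new HashMap<String, String>();
  private Map<String, String> pending;

  public synchronized void beginRefresh() {
    pending = new HashMap<String, String>(publishedAcls); // stage on a private copy
  }

  public synchronized void setPermission(String queue, String acl) {
    pending.put(queue, acl); // staged; not yet visible to readers
  }

  public synchronized void commitRefresh() {
    publishedAcls = pending; // one volatile write publishes all ACLs at once
    pending = null;
  }

  public boolean checkPermission(String queue, String user) {
    String acl = publishedAcls.get(queue); // always a consistent snapshot
    return acl != null && acl.contains(user);
  }
}
{code}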
[jira] [Commented] (YARN-3089) LinuxContainerExecutor does not handle file arguments to deleteAsUser
[ https://issues.apache.org/jira/browse/YARN-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308238#comment-14308238 ] Jason Lowe commented on YARN-3089: -- Looking closer at AppLogAggregatorImpl's rolling interval support, I'm worried enabling the rolling interval on a cluster will easily lose logs. For example if the cluster is configured with a log aggregation rolling interval of 24 hours, is it true that any MapReduce job or other typical job will have their partial logs uploaded and then removed from the local filesystem after 24 hours? It seems like one would have to set the rolling interval to be larger than the worst-case runtime of any application that doesn't roll its own logs to avoid loss of logs. > LinuxContainerExecutor does not handle file arguments to deleteAsUser > - > > Key: YARN-3089 > URL: https://issues.apache.org/jira/browse/YARN-3089 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Jason Lowe >Assignee: Eric Payne >Priority: Blocker > Attachments: YARN-3089.v1.txt, YARN-3089.v2.txt, YARN-3089.v3.txt > > > YARN-2468 added the deletion of individual logs that are aggregated, but this > fails to delete log files when the LCE is being used. The LCE native > executable assumes the paths being passed are paths and the delete fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3089) LinuxContainerExecutor does not handle file arguments to deleteAsUser
[ https://issues.apache.org/jira/browse/YARN-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308232#comment-14308232 ] Xuan Gong commented on YARN-3089: - bq. That interval isn't configurable per app, so I can see cases where someone puts an app on the cluster that isn't super long running, doesn't roll its own logs, but then this feature comes along and uploads a partial log and removes the active log files. Yes, that is an issue we have already recognized. Currently, even when we are running an MR job, it will upload the partial logs, which does not sound right. We need to fix it. > LinuxContainerExecutor does not handle file arguments to deleteAsUser > - > > Key: YARN-3089 > URL: https://issues.apache.org/jira/browse/YARN-3089 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Jason Lowe >Assignee: Eric Payne >Priority: Blocker > Attachments: YARN-3089.v1.txt, YARN-3089.v2.txt, YARN-3089.v3.txt > > > YARN-2468 added the deletion of individual logs that are aggregated, but this > fails to delete log files when the LCE is being used. The LCE native > executable assumes the paths being passed are paths and the delete fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3149) Typo in message for invalid application id
[ https://issues.apache.org/jira/browse/YARN-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308228#comment-14308228 ] Hudson commented on YARN-3149: -- FAILURE: Integrated in Hadoop-trunk-Commit #7029 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7029/]) YARN-3149. Fix typo in message for invalid application id. Contributed (xgong: rev b77ff37686e01b7497d3869fbc62789a5b123c0a) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/ConverterUtils.java * hadoop-yarn-project/CHANGES.txt > Typo in message for invalid application id > -- > > Key: YARN-3149 > URL: https://issues.apache.org/jira/browse/YARN-3149 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Trivial > Fix For: 2.7.0 > > Attachments: YARN-3149.patch, YARN-3149.patch, screenshot-1.png > > > Message in console wrong when application id format wrong -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3124) Capacity Scheduler LeafQueue/ParentQueue should use QueueCapacities to track capacities-by-label
[ https://issues.apache.org/jira/browse/YARN-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308227#comment-14308227 ] Hadoop QA commented on YARN-3124: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12696880/YARN-3124.2.patch against trunk revision e1990ab. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6528//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6528//console This message is automatically generated. > Capacity Scheduler LeafQueue/ParentQueue should use QueueCapacities to track > capacities-by-label > > > Key: YARN-3124 > URL: https://issues.apache.org/jira/browse/YARN-3124 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, client, resourcemanager >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-3124.1.patch, YARN-3124.2.patch > > > After YARN-3098, capacities-by-label (include > used-capacity/maximum-capacity/absolute-maximum-capacity, etc.) should be > tracked in QueueCapacities. > This patch is targeting to make capacities-by-label in CS Queues are all > tracked by QueueCapacities. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3149) Typo in message for invalid application id
[ https://issues.apache.org/jira/browse/YARN-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308216#comment-14308216 ] Xuan Gong commented on YARN-3149: - Committed to trunk/branch-2. Thanks, Bibin A Chundatt! > Typo in message for invalid application id > -- > > Key: YARN-3149 > URL: https://issues.apache.org/jira/browse/YARN-3149 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Trivial > Fix For: 2.7.0 > > Attachments: YARN-3149.patch, YARN-3149.patch, screenshot-1.png > > > Message in console wrong when application id format wrong -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3149) Typo in message for invalid application id
[ https://issues.apache.org/jira/browse/YARN-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308213#comment-14308213 ] Xuan Gong commented on YARN-3149: - +1 lgtm. Will commit it > Typo in message for invalid application id > -- > > Key: YARN-3149 > URL: https://issues.apache.org/jira/browse/YARN-3149 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Trivial > Attachments: YARN-3149.patch, YARN-3149.patch, screenshot-1.png > > > Message in console wrong when application id format wrong -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3089) LinuxContainerExecutor does not handle file arguments to deleteAsUser
[ https://issues.apache.org/jira/browse/YARN-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308204#comment-14308204 ] Jason Lowe commented on YARN-3089: -- bq. But if the user does not set any includePattern/excludePattern in LogAggregationContext, and does not want to roll-over the logs, all container logs will be written into the files with the same name (run the command, such as 1>>stdout, 2>>stderr). After we aggregate the logs into HDFS, the stdout/stderr will be deleted. In this case, this LRS app is affected. This sounds pretty bad. If the app doesn't roll its own logs but is normally not very chatty such that it isn't much of an issue, we're going to blow away the app's logs every log-aggregation.roll-monitoring-interval-seconds interval? That interval isn't configurable per app, so I can see cases where someone puts an app on the cluster that isn't super long running, doesn't roll its own logs, but then this feature comes along and uploads a partial log and removes the active log files. In other words, fixing this bug may actually start deleting logs we shouldn't delete on clusters using the LCE. However, that's a separate JIRA from this, as this is focusing on making the behavior of the LCE consistent with the default executor wrt. deletes. > LinuxContainerExecutor does not handle file arguments to deleteAsUser > - > > Key: YARN-3089 > URL: https://issues.apache.org/jira/browse/YARN-3089 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Jason Lowe >Assignee: Eric Payne >Priority: Blocker > Attachments: YARN-3089.v1.txt, YARN-3089.v2.txt, YARN-3089.v3.txt > > > YARN-2468 added the deletion of individual logs that are aggregated, but this > fails to delete log files when the LCE is being used. The LCE native > executable assumes the paths being passed are paths and the delete fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3100) Make YARN authorization pluggable
[ https://issues.apache.org/jira/browse/YARN-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308126#comment-14308126 ] Chris Douglas commented on YARN-3100: - bq. The reinitializeQueues looks to be transactional: it instantiates all new sub queues first and then updates the root queue and child queues accordingly. And the checkAccess chain will compete for the same scheduler lock with refreshQueue. If there's a queue with root _Q_, say we're constructing _Q'_. In the current patch, the {{YarnAuthorizationProvider}} singleton instance will get calls to {{setPermission()}} during construction of _Q'_. These (1) will be observable by readers of _Q_ who share the instance. I agree that if construction of _Q'_ fails then it won't get installed, but (2) _Q_ will run with a mix of _Q'_ and _Q_ ACLs because each call to {{setPermission()}} overwrites what was installed for _Q_. I'm curious if (1) and (2) are an artifact of the new plugin architecture or if this also happens in the existing code. Not for external implementations, but for the {{Default\*}} one. bq. Alternatively, the plug-in can choose to add new acl via the setPermission when refreshQueue is invoked, but not to replace existing acl. Also, whether to add new or update or no, this is something that plug-in itself can decide or make it configurable by user. Maybe I'm being dense, but I don't see how a plugin could implement those semantics cleanly. {{YarnAuthorizationProvider}} forces the instance to be a singleton, and it gets some sequence of calls to {{setPermission()}}. Since queues can't be deleted in the CS, I suppose it could track the sequence of calls that install ACLs and only publish new ACLs when it's received updates for everything, but that could still yield (2) if the refresh adds new queues before the refresh fails. > Make YARN authorization pluggable > - > > Key: YARN-3100 > URL: https://issues.apache.org/jira/browse/YARN-3100 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Jian He > Attachments: YARN-3100.1.patch, YARN-3100.2.patch > > > The goal is to have YARN acl model pluggable so as to integrate other > authorization tool such as Apache Ranger, Sentry. > Currently, we have > - admin ACL > - queue ACL > - application ACL > - time line domain ACL > - service ACL > The proposal is to create a YarnAuthorizationProvider interface. Current > implementation will be the default implementation. Ranger or Sentry plug-in > can implement this interface. > Benefit: > - Unify the code base. With the default implementation, we can get rid of > each specific ACL manager such as AdminAclManager, ApplicationACLsManager, > QueueAclsManager etc. > - Enable Ranger, Sentry to do authorization for YARN. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2246) Job History Link in RM UI is redirecting to the URL which contains Job Id twice
[ https://issues.apache.org/jira/browse/YARN-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308101#comment-14308101 ] Jason Lowe commented on YARN-2246: -- I think the bug is in RMAppAttemptImpl. When the AM unregisters, RMAppAttemptImpl.generateProxyUriWithScheme will take whatever URL the AM specified and replace the server with the proxy URL, e.g.: tracking URL http://x/y/z becomes http://rmaddr/proxy/appid/y/z. Then when the webproxy processes that URL it just replaces http://rmaddr/proxy/appid with the tracking URL, leading to http://x/y/z/y/z. IMHO the proxy URL advertised to clients should always be http://rmaddr/proxy/appid, without other stuff tacked on the end. That can map to whatever tracking URL the app provided when processed by the webproxy. I also wonder if we should list the final tracking URL on the RM UI rather than the proxy URL. Seems simpler to just direct them to the final tracking URL rather than through the RM proxy, unless there's a use case where the client can't reach the final tracking URL directly and needs to go through the proxy. I haven't heard of such a setup. > Job History Link in RM UI is redirecting to the URL which contains Job Id > twice > --- > > Key: YARN-2246 > URL: https://issues.apache.org/jira/browse/YARN-2246 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 3.0.0, 0.23.11, 2.5.0 >Reporter: Devaraj K >Assignee: Devaraj K > Attachments: MAPREDUCE-4064-1.patch, MAPREDUCE-4064.patch > > > {code:xml} > http://xx.x.x.x:19888/jobhistory/job/job_1332435449546_0001/jobhistory/job/job_1332435449546_0001 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
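A toy reproduction of the URL doubling described above; the two rewrites are paraphrased from the behavior of RMAppAttemptImpl and the web proxy, not copied from the actual Hadoop code:
{code}
import java.net.URI;

public class ProxyUrlDoubling {
  public static void main(String[] args) {
    URI tracking = URI.create("http://x/y/z");
    String proxyBase = "http://rmaddr/proxy/appid";
    // Unregister-time rewrite: keep the AM's path, swap in the proxy address.
    String advertised = proxyBase + tracking.getPath(); // .../proxy/appid/y/z
    // Proxy-time rewrite: replace the proxy prefix with the full tracking URL,
    // which re-appends the path a second time.
    String redirected = tracking + advertised.substring(proxyBase.length());
    System.out.println(redirected); // http://x/y/z/y/z -- the doubled path
  }
}
{code}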
[jira] [Updated] (YARN-2694) Ensure only single node labels specified in resource request / host, and node label expression only specified when resourceName=ANY
[ https://issues.apache.org/jira/browse/YARN-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-2694: - Attachment: YARN-2694-20150205-2.patch Local repo was out of sync; regenerated the patch. > Ensure only single node labels specified in resource request / host, and node > label expression only specified when resourceName=ANY > --- > > Key: YARN-2694 > URL: https://issues.apache.org/jira/browse/YARN-2694 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, resourcemanager >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-2694-20141020-1.patch, YARN-2694-20141021-1.patch, > YARN-2694-20141023-1.patch, YARN-2694-20141023-2.patch, > YARN-2694-20141101-1.patch, YARN-2694-20141101-2.patch, > YARN-2694-20150121-1.patch, YARN-2694-20150122-1.patch, > YARN-2694-20150202-1.patch, YARN-2694-20150203-1.patch, > YARN-2694-20150203-2.patch, YARN-2694-20150204-1.patch, > YARN-2694-20150205-1.patch, YARN-2694-20150205-2.patch > > > Currently, node label expression supporting in capacity scheduler is partial > completed. Now node label expression specified in Resource Request will only > respected when it specified at ANY level. And a ResourceRequest/host with > multiple node labels will make user limit, etc. computation becomes more > tricky. > Now we need temporarily disable them, changes include, > - AMRMClient > - ApplicationMasterService > - RMAdminCLI > - CommonNodeLabelsManager -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2694) Ensure only single node labels specified in resource request / host, and node label expression only specified when resourceName=ANY
[ https://issues.apache.org/jira/browse/YARN-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308089#comment-14308089 ] Hadoop QA commented on YARN-2694: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12696874/YARN-2694-20150205-1.patch against trunk revision e1990ab. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 9 new or modified test files. {color:red}-1 javac{color:red}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6527//console This message is automatically generated. > Ensure only single node labels specified in resource request / host, and node > label expression only specified when resourceName=ANY > --- > > Key: YARN-2694 > URL: https://issues.apache.org/jira/browse/YARN-2694 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, resourcemanager >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-2694-20141020-1.patch, YARN-2694-20141021-1.patch, > YARN-2694-20141023-1.patch, YARN-2694-20141023-2.patch, > YARN-2694-20141101-1.patch, YARN-2694-20141101-2.patch, > YARN-2694-20150121-1.patch, YARN-2694-20150122-1.patch, > YARN-2694-20150202-1.patch, YARN-2694-20150203-1.patch, > YARN-2694-20150203-2.patch, YARN-2694-20150204-1.patch, > YARN-2694-20150205-1.patch > > > Currently, node label expression supporting in capacity scheduler is partial > completed. Now node label expression specified in Resource Request will only > respected when it specified at ANY level. And a ResourceRequest/host with > multiple node labels will make user limit, etc. computation becomes more > tricky. > Now we need temporarily disable them, changes include, > - AMRMClient > - ApplicationMasterService > - RMAdminCLI > - CommonNodeLabelsManager -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3124) Capacity Scheduler LeafQueue/ParentQueue should use QueueCapacities to track capacities-by-label
[ https://issues.apache.org/jira/browse/YARN-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-3124: - Attachment: YARN-3124.2.patch Rebased against trunk. > Capacity Scheduler LeafQueue/ParentQueue should use QueueCapacities to track > capacities-by-label > > > Key: YARN-3124 > URL: https://issues.apache.org/jira/browse/YARN-3124 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, client, resourcemanager >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-3124.1.patch, YARN-3124.2.patch > > > After YARN-3098, capacities-by-label (include > used-capacity/maximum-capacity/absolute-maximum-capacity, etc.) should be > tracked in QueueCapacities. > This patch is targeting to make capacities-by-label in CS Queues are all > tracked by QueueCapacities. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
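Illustration only: the general shape of tracking capacities per node label that this patch consolidates into QueueCapacities; the class, fields, and methods below are assumptions, not YARN's API:
{code}
import java.util.HashMap;
import java.util.Map;

public class LabelCapacities {
  // Slots for the capacity types mentioned in the description; only CAPACITY
  // is exercised in this sketch.
  private static final int CAPACITY = 0, MAX_CAPACITY = 1, ABS_MAX_CAPACITY = 2;
  private final Map<String, float[]> byLabel = new HashMap<String, float[]>();

  public synchronized void setCapacity(String label, float value) {
    float[] caps = byLabel.get(label);
    if (caps == null) {
      caps = new float[3]; // one slot per capacity type, per node label
      byLabel.put(label, caps);
    }
    caps[CAPACITY] = value;
  }

  public synchronized float getCapacity(String label) {
    float[] caps = byLabel.get(label);
    return caps == null ? 0f : caps[CAPACITY];
  }
}
{code}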
[jira] [Commented] (YARN-3100) Make YARN authorization pluggable
[ https://issues.apache.org/jira/browse/YARN-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308061#comment-14308061 ] Jian He commented on YARN-3100: --- Do you mean the intermediate state where some queue ACLs are updated but some are not? The reinitializeQueues looks to be transactional: it instantiates all new sub queues first and then updates the root queue and child queues accordingly. And the checkAccess chain will compete for the same scheduler lock with refreshQueue. > Make YARN authorization pluggable > - > > Key: YARN-3100 > URL: https://issues.apache.org/jira/browse/YARN-3100 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Jian He > Attachments: YARN-3100.1.patch, YARN-3100.2.patch > > > The goal is to have YARN acl model pluggable so as to integrate other > authorization tool such as Apache Ranger, Sentry. > Currently, we have > - admin ACL > - queue ACL > - application ACL > - time line domain ACL > - service ACL > The proposal is to create a YarnAuthorizationProvider interface. Current > implementation will be the default implementation. Ranger or Sentry plug-in > can implement this interface. > Benefit: > - Unify the code base. With the default implementation, we can get rid of > each specific ACL manager such as AdminAclManager, ApplicationACLsManager, > QueueAclsManager etc. > - Enable Ranger, Sentry to do authorization for YARN. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2694) Ensure only single node labels specified in resource request / host, and node label expression only specified when resourceName=ANY
[ https://issues.apache.org/jira/browse/YARN-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-2694: - Attachment: YARN-2694-20150205-1.patch [~jianhe], Thanks for your comments. Addressed most of them, except: bq. Instead of throwing IOException, use precondition check too It will throw a RuntimeException; I'm not sure whether it will be caught by the framework. bq. maybe check "||" too? There's no need to check "||" here; it is already sufficient, because an empty expression will not contain "&&". Attached new patch. > Ensure only single node labels specified in resource request / host, and node > label expression only specified when resourceName=ANY > --- > > Key: YARN-2694 > URL: https://issues.apache.org/jira/browse/YARN-2694 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, resourcemanager >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-2694-20141020-1.patch, YARN-2694-20141021-1.patch, > YARN-2694-20141023-1.patch, YARN-2694-20141023-2.patch, > YARN-2694-20141101-1.patch, YARN-2694-20141101-2.patch, > YARN-2694-20150121-1.patch, YARN-2694-20150122-1.patch, > YARN-2694-20150202-1.patch, YARN-2694-20150203-1.patch, > YARN-2694-20150203-2.patch, YARN-2694-20150204-1.patch, > YARN-2694-20150205-1.patch > > > Currently, node label expression supporting in capacity scheduler is partial > completed. Now node label expression specified in Resource Request will only > respected when it specified at ANY level. And a ResourceRequest/host with > multiple node labels will make user limit, etc. computation becomes more > tricky. > Now we need temporarily disable them, changes include, > - AMRMClient > - ApplicationMasterService > - RMAdminCLI > - CommonNodeLabelsManager -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3145) ConcurrentModificationException on CapacityScheduler ParentQueue#getQueueUserAclInfo
[ https://issues.apache.org/jira/browse/YARN-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308033#comment-14308033 ] Hadoop QA commented on YARN-3145: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12696843/YARN-3145.002.patch against trunk revision 276485e. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6525//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6525//console This message is automatically generated. > ConcurrentModificationException on CapacityScheduler > ParentQueue#getQueueUserAclInfo > > > Key: YARN-3145 > URL: https://issues.apache.org/jira/browse/YARN-3145 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Tsuyoshi OZAWA > Attachments: YARN-3145.001.patch, YARN-3145.002.patch > > > {code} > ava.util.ConcurrentModificationException(java.util.ConcurrentModificationException > at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115) > at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.getQueueUserAclInfo(ParentQueue.java:347) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.getQueueUserAclInfo(ParentQueue.java:348) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getQueueUserAclInfo(CapacityScheduler.java:850) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueUserAcls(ClientRMService.java:844) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueUserAcls(ApplicationClientProtocolPBServiceImpl.java:250) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:335) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3145) ConcurrentModificationException on CapacityScheduler ParentQueue#getQueueUserAclInfo
[ https://issues.apache.org/jira/browse/YARN-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308015#comment-14308015 ] Wangda Tan commented on YARN-3145: -- Thanks [~ozawa] working on this, patch LGTM, +1. > ConcurrentModificationException on CapacityScheduler > ParentQueue#getQueueUserAclInfo > > > Key: YARN-3145 > URL: https://issues.apache.org/jira/browse/YARN-3145 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Tsuyoshi OZAWA > Attachments: YARN-3145.001.patch, YARN-3145.002.patch > > > {code} > ava.util.ConcurrentModificationException(java.util.ConcurrentModificationException > at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115) > at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.getQueueUserAclInfo(ParentQueue.java:347) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.getQueueUserAclInfo(ParentQueue.java:348) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getQueueUserAclInfo(CapacityScheduler.java:850) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueUserAcls(ClientRMService.java:844) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueUserAcls(ApplicationClientProtocolPBServiceImpl.java:250) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:335) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3149) Typo in message for invalid application id
[ https://issues.apache.org/jira/browse/YARN-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308013#comment-14308013 ] Hadoop QA commented on YARN-3149: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12696852/YARN-3149.patch against trunk revision e1990ab. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6526//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6526//console This message is automatically generated. > Typo in message for invalid application id > -- > > Key: YARN-3149 > URL: https://issues.apache.org/jira/browse/YARN-3149 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Trivial > Attachments: YARN-3149.patch, YARN-3149.patch, screenshot-1.png > > > Message in console wrong when application id format wrong -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2694) Ensure only single node labels specified in resource request / host, and node label expression only specified when resourceName=ANY
[ https://issues.apache.org/jira/browse/YARN-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308002#comment-14308002 ] Jian He commented on YARN-2694: --- A few minor comments; looks good overall. - check if empty too? {code} if (null == exp) { return; } {code} - remove the IOException from the method header {code} private Map> buildNodeLabelsMapFromStr(String args) throws IOException {code} - exceeds the 80 column limit {code} private void verifyAddRequestFailed(AMRMClient client, ContainerRequest request) { {code} - Instead of throwing IOException, use precondition check too {code} if (labels.size() > 1) { String msg = String.format("%d labels specified on host=%s" + ", please note that we do not support specifying multiple" + " labels on a single host for now.", labels.size(), nodeId.getHost()); LOG.error(msg); {code} - should the following check empty string too? {code} if (null == req.getNodeLabelExpression() && ResourceRequest.ANY.equals(req.getResourceName())) { {code} - maybe check "||" too? {code} if (labelExp != null && labelExp.contains("&&")) { {code} > Ensure only single node labels specified in resource request / host, and node > label expression only specified when resourceName=ANY > --- > > Key: YARN-2694 > URL: https://issues.apache.org/jira/browse/YARN-2694 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, resourcemanager >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-2694-20141020-1.patch, YARN-2694-20141021-1.patch, > YARN-2694-20141023-1.patch, YARN-2694-20141023-2.patch, > YARN-2694-20141101-1.patch, YARN-2694-20141101-2.patch, > YARN-2694-20150121-1.patch, YARN-2694-20150122-1.patch, > YARN-2694-20150202-1.patch, YARN-2694-20150203-1.patch, > YARN-2694-20150203-2.patch, YARN-2694-20150204-1.patch > > > Currently, node label expression supporting in capacity scheduler is partial > completed. Now node label expression specified in Resource Request will only > respected when it specified at ANY level. And a ResourceRequest/host with > multiple node labels will make user limit, etc. computation becomes more > tricky. > Now we need temporarily disable them, changes include, > - AMRMClient > - ApplicationMasterService > - RMAdminCLI > - CommonNodeLabelsManager -- This message was sent by Atlassian JIRA (v6.3.4#6332)
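A sketch of the precondition-style check suggested in the review above, using Guava's Preconditions; the method names are illustrative and the message text mirrors the snippet quoted in the comment:
{code}
import java.util.Set;
import com.google.common.base.Preconditions;

public class NodeLabelChecks {
  // Fail fast with an unchecked IllegalArgumentException instead of IOException.
  static void checkSingleLabelPerHost(Set<String> labels, String host) {
    Preconditions.checkArgument(labels.size() <= 1,
        "%s labels specified on host=%s, please note that we do not support"
            + " specifying multiple labels on a single host for now.",
        labels.size(), host);
  }

  // An empty expression cannot contain "&&", so this one check suffices.
  static void checkNoAndOperator(String labelExp) {
    Preconditions.checkArgument(labelExp == null || !labelExp.contains("&&"),
        "Invalid node label expression: %s", labelExp);
  }
}
{code}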
[jira] [Commented] (YARN-3141) Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp
[ https://issues.apache.org/jira/browse/YARN-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14307966#comment-14307966 ] Wangda Tan commented on YARN-3141: -- Just updated the description, thanks. > Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp > -- > > Key: YARN-3141 > URL: https://issues.apache.org/jira/browse/YARN-3141 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, scheduler >Reporter: Wangda Tan >Assignee: Wangda Tan > > Enhance locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp, > as mentioned in YARN-3091, a possible solution is using read/write lock. > Other fine-grained locks for specific purposes / bugs should be addressed in > separate tickets. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3141) Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp
[ https://issues.apache.org/jira/browse/YARN-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-3141: - Description: Enhance locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp, as mentioned in YARN-3091, a possible solution is using read/write lock. Other fine-grained locks for specific purposes / bugs should be addressed in separate tickets. > Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp > -- > > Key: YARN-3141 > URL: https://issues.apache.org/jira/browse/YARN-3141 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, scheduler >Reporter: Wangda Tan >Assignee: Wangda Tan > > Enhance locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp, > as mentioned in YARN-3091, a possible solution is using read/write lock. > Other fine-grained locks for specific purposes / bugs should be addressed in > separate tickets. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
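A minimal sketch of the read/write-lock pattern proposed here; the class and fields are illustrative stand-ins for SchedulerApplicationAttempt state, not the actual YARN code:
{code}
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class AttemptState {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private int liveContainers;

  public int getLiveContainers() {
    lock.readLock().lock(); // many readers may hold the read lock concurrently
    try {
      return liveContainers;
    } finally {
      lock.readLock().unlock();
    }
  }

  public void containerAllocated() {
    lock.writeLock().lock(); // writers get exclusive access
    try {
      liveContainers++;
    } finally {
      lock.writeLock().unlock();
    }
  }
}
{code}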
[jira] [Updated] (YARN-3069) Document missing properties in yarn-default.xml
[ https://issues.apache.org/jira/browse/YARN-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Chiang updated YARN-3069: - Attachment: YARN-3069.001.patch Initial version: - Make section separations consistent - Put in all known missing properties. Some may need to be moved to TestYarnConfigurationFields as an exception instead of a documented property. - Currently, all new properties are undocumented, but do exist within the yarn-default.xml file. > Document missing properties in yarn-default.xml > --- > > Key: YARN-3069 > URL: https://issues.apache.org/jira/browse/YARN-3069 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Ray Chiang >Assignee: Ray Chiang > Labels: supportability > Attachments: YARN-3069.001.patch > > > The following properties are currently not defined in yarn-default.xml. > These properties should either be > A) documented in yarn-default.xml OR > B) listed as an exception (with comments, e.g. for internal use) in the > TestYarnConfigurationFields unit test > Any comments for any of the properties below are welcome. > org.apache.hadoop.yarn.server.sharedcachemanager.RemoteAppChecker > org.apache.hadoop.yarn.server.sharedcachemanager.store.InMemorySCMStore > security.applicationhistory.protocol.acl > yarn.app.container.log.backups > yarn.app.container.log.dir > yarn.app.container.log.filesize > yarn.client.app-submission.poll-interval > yarn.client.application-client-protocol.poll-timeout-ms > yarn.is.minicluster > yarn.log.server.url > yarn.minicluster.control-resource-monitoring > yarn.minicluster.fixed.ports > yarn.minicluster.use-rpc > yarn.node-labels.fs-store.retry-policy-spec > yarn.node-labels.fs-store.root-dir > yarn.node-labels.manager-class > yarn.nodemanager.container-executor.os.sched.priority.adjustment > yarn.nodemanager.container-monitor.process-tree.class > yarn.nodemanager.disk-health-checker.enable > yarn.nodemanager.docker-container-executor.image-name > yarn.nodemanager.linux-container-executor.cgroups.delete-timeout-ms > yarn.nodemanager.linux-container-executor.group > yarn.nodemanager.log.deletion-threads-count > yarn.nodemanager.user-home-dir > yarn.nodemanager.webapp.https.address > yarn.nodemanager.webapp.spnego-keytab-file > yarn.nodemanager.webapp.spnego-principal > yarn.nodemanager.windows-secure-container-executor.group > yarn.resourcemanager.configuration.file-system-based-store > yarn.resourcemanager.delegation-token-renewer.thread-count > yarn.resourcemanager.delegation.key.update-interval > yarn.resourcemanager.delegation.token.max-lifetime > yarn.resourcemanager.delegation.token.renew-interval > yarn.resourcemanager.history-writer.multi-threaded-dispatcher.pool-size > yarn.resourcemanager.metrics.runtime.buckets > yarn.resourcemanager.nm-tokens.master-key-rolling-interval-secs > yarn.resourcemanager.reservation-system.class > yarn.resourcemanager.reservation-system.enable > yarn.resourcemanager.reservation-system.plan.follower > yarn.resourcemanager.reservation-system.planfollower.time-step > yarn.resourcemanager.rm.container-allocation.expiry-interval-ms > yarn.resourcemanager.webapp.spnego-keytab-file > yarn.resourcemanager.webapp.spnego-principal > yarn.scheduler.include-port-in-node-name > yarn.timeline-service.delegation.key.update-interval > yarn.timeline-service.delegation.token.max-lifetime > yarn.timeline-service.delegation.token.renew-interval > yarn.timeline-service.generic-application-history.enabled > yarn.timeline-service.generic-application-history.fs-history-store.compression-type > yarn.timeline-service.generic-application-history.fs-history-store.uri > yarn.timeline-service.generic-application-history.store-class > yarn.timeline-service.http-cross-origin.enabled > yarn.tracking.url.generator -- This message was sent by Atlassian JIRA (v6.3.4#6332)
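Illustrating option (B) from the description: keeping a skip-list of properties that are deliberately absent from yarn-default.xml, in the spirit of the TestYarnConfigurationFields pattern; the class and field names here are assumptions:
{code}
import java.util.HashSet;
import java.util.Set;

public class ConfigPropsSkipList {
  // Properties intentionally undocumented (e.g. internal or minicluster-only),
  // registered as exceptions so the unit test does not flag them.
  static final Set<String> PROPS_TO_SKIP_COMPARE = new HashSet<String>();
  static {
    PROPS_TO_SKIP_COMPARE.add("yarn.is.minicluster");
    PROPS_TO_SKIP_COMPARE.add("yarn.minicluster.fixed.ports");
  }
}
{code}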
[jira] [Commented] (YARN-2921) MockRM#waitForState methods can be too slow and flaky
[ https://issues.apache.org/jira/browse/YARN-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14307942#comment-14307942 ] Hadoop QA commented on YARN-2921: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12696827/YARN-2921.004.patch against trunk revision c4980a2. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestAllocationFileLoaderService Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6523//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6523//console This message is automatically generated. > MockRM#waitForState methods can be too slow and flaky > - > > Key: YARN-2921 > URL: https://issues.apache.org/jira/browse/YARN-2921 > Project: Hadoop YARN > Issue Type: Improvement > Components: test >Affects Versions: 2.6.0 >Reporter: Karthik Kambatla >Assignee: Tsuyoshi OZAWA > Attachments: YARN-2921.001.patch, YARN-2921.002.patch, > YARN-2921.003.patch, YARN-2921.004.patch > > > MockRM#waitForState methods currently sleep for too long (2 seconds and 1 > second). This leads to slow tests and sometimes failures if the > App/AppAttempt moves to another state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
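The waitForState pattern being tuned in this JIRA, as a generic sketch: poll in short increments under an overall timeout instead of sleeping one or two seconds per round; names are illustrative, not MockRM's actual signature:
{code}
public class WaitFor {
  interface Condition {
    boolean holds();
  }

  static boolean waitFor(Condition c, long timeoutMs) throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (System.currentTimeMillis() < deadline) {
      if (c.holds()) {
        return true; // state reached; return promptly instead of oversleeping
      }
      Thread.sleep(100); // short sleep keeps tests fast and less flaky
    }
    return c.holds(); // one final check before giving up
  }
}
{code}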
[jira] [Commented] (YARN-3147) Clean up RM web proxy code
[ https://issues.apache.org/jira/browse/YARN-3147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14307937#comment-14307937 ] Hadoop QA commented on YARN-3147: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12696839/YARN-3147-002.patch against trunk revision 276485e. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6524//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6524//console This message is automatically generated. > Clean up RM web proxy code > --- > > Key: YARN-3147 > URL: https://issues.apache.org/jira/browse/YARN-3147 > Project: Hadoop YARN > Issue Type: Improvement > Components: webapp >Affects Versions: 2.6.0 >Reporter: Steve Loughran >Assignee: Steve Loughran > Attachments: YARN-3147-001.patch, YARN-3147-002.patch > > > YARN-2084 covers fixing up the RM proxy & filter for REST support. > Before doing that, prepare for it by cleaning up the codebase: factoring out > the redirect logic into a single method, some minor reformatting, move to > SLF4J and Java7 code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3149) Typo in message for invalid application id
[ https://issues.apache.org/jira/browse/YARN-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi OZAWA updated YARN-3149: - Attachment: YARN-3149.patch The CI result looks strange... Resubmitting a patch. > Typo in message for invalid application id > -- > > Key: YARN-3149 > URL: https://issues.apache.org/jira/browse/YARN-3149 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Trivial > Attachments: YARN-3149.patch, YARN-3149.patch, screenshot-1.png > > > Message in console wrong when application id format wrong -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3149) Typo in message for invalid application id
[ https://issues.apache.org/jira/browse/YARN-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14307911#comment-14307911 ] Hadoop QA commented on YARN-3149: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12696814/YARN-3149.patch against trunk revision d27439f. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 javac{color}. The applied patch generated 1150 javac compiler warnings (more than the trunk's current 205 warnings). {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 48 warning messages. See https://builds.apache.org/job/PreCommit-YARN-Build/6521//artifact/patchprocess/diffJavadocWarnings.txt for details. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestApplicationACLs org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart org.apache.hadoop.yarn.server.resourcemanager.recovery.TestFSRMStateStore Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6521//testReport/ Javac warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6521//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6521//console This message is automatically generated. > Typo in message for invalid application id > -- > > Key: YARN-3149 > URL: https://issues.apache.org/jira/browse/YARN-3149 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Trivial > Attachments: YARN-3149.patch, screenshot-1.png > > > Message in console wrong when application id format wrong -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3145) ConcurrentModificationException on CapacityScheduler ParentQueue#getQueueUserAclInfo
[ https://issues.apache.org/jira/browse/YARN-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi OZAWA updated YARN-3145: - Attachment: YARN-3145.002.patch Thank you for the review, Jian. Good catch. Updating the patch to move the if block into the synchronized block. > ConcurrentModificationException on CapacityScheduler > ParentQueue#getQueueUserAclInfo > > > Key: YARN-3145 > URL: https://issues.apache.org/jira/browse/YARN-3145 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Tsuyoshi OZAWA > Attachments: YARN-3145.001.patch, YARN-3145.002.patch > > > {code} > ava.util.ConcurrentModificationException(java.util.ConcurrentModificationException > at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115) > at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.getQueueUserAclInfo(ParentQueue.java:347) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.getQueueUserAclInfo(ParentQueue.java:348) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getQueueUserAclInfo(CapacityScheduler.java:850) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueUserAcls(ClientRMService.java:844) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueUserAcls(ApplicationClientProtocolPBServiceImpl.java:250) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:335) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
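For readers following the patch: a minimal sketch, assuming the ParentQueue#completedContainer snippet quoted in the comments below, of what moving the re-sort inside the queue's synchronized section looks like. The authoritative change is YARN-3145.002.patch.
{code}
// Sketch only: the re-sort of childQueues now runs while holding the
// queue lock, so getQueueUserAclInfo can no longer iterate the child
// queue collection while completedContainer is mutating it.
synchronized (this) {
  // ... existing completedContainer bookkeeping ...
  if (sortQueues) {
    // reinsert the updated queue while still holding the queue lock
    for (Iterator<CSQueue> iter = childQueues.iterator(); iter.hasNext();) {
      CSQueue csqueue = iter.next();
      if (csqueue.equals(completedChildQueue)) {
        iter.remove();
        LOG.info("Re-sorting completed queue: " + csqueue.getQueuePath()
            + " stats: " + csqueue);
        childQueues.add(csqueue);
        break;
      }
    }
  }
}
{code}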
[jira] [Updated] (YARN-3147) Clean up RM web proxy code
[ https://issues.apache.org/jira/browse/YARN-3147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated YARN-3147: - Attachment: YARN-3147-002.patch Patch -002: Reinstate the deleted method, add javadocs saying: leave this alone. > Clean up RM web proxy code > --- > > Key: YARN-3147 > URL: https://issues.apache.org/jira/browse/YARN-3147 > Project: Hadoop YARN > Issue Type: Improvement > Components: webapp >Affects Versions: 2.6.0 >Reporter: Steve Loughran >Assignee: Steve Loughran > Attachments: YARN-3147-001.patch, YARN-3147-002.patch > > > YARN-2084 covers fixing up the RM proxy & filter for REST support. > Before doing that, prepare for it by cleaning up the codebase: factoring out > the redirect logic into a single method, some minor reformatting, move to > SLF4J and Java7 code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3145) ConcurrentModificationException on CapacityScheduler ParentQueue#getQueueUserAclInfo
[ https://issues.apache.org/jira/browse/YARN-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14307864#comment-14307864 ] Tsuyoshi OZAWA commented on YARN-3145: -- Thank you! > ConcurrentModificationException on CapacityScheduler > ParentQueue#getQueueUserAclInfo > > > Key: YARN-3145 > URL: https://issues.apache.org/jira/browse/YARN-3145 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Tsuyoshi OZAWA > Attachments: YARN-3145.001.patch > > > {code} > ava.util.ConcurrentModificationException(java.util.ConcurrentModificationException > at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115) > at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.getQueueUserAclInfo(ParentQueue.java:347) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.getQueueUserAclInfo(ParentQueue.java:348) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getQueueUserAclInfo(CapacityScheduler.java:850) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueUserAcls(ClientRMService.java:844) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueUserAcls(ApplicationClientProtocolPBServiceImpl.java:250) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:335) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-3145) ConcurrentModificationException on CapacityScheduler ParentQueue#getQueueUserAclInfo
[ https://issues.apache.org/jira/browse/YARN-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi OZAWA reassigned YARN-3145: Assignee: Tsuyoshi OZAWA (was: Rohith) > ConcurrentModificationException on CapacityScheduler > ParentQueue#getQueueUserAclInfo > > > Key: YARN-3145 > URL: https://issues.apache.org/jira/browse/YARN-3145 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Tsuyoshi OZAWA > Attachments: YARN-3145.001.patch > > > {code} > ava.util.ConcurrentModificationException(java.util.ConcurrentModificationException > at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115) > at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.getQueueUserAclInfo(ParentQueue.java:347) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.getQueueUserAclInfo(ParentQueue.java:348) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getQueueUserAclInfo(CapacityScheduler.java:850) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueUserAcls(ClientRMService.java:844) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueUserAcls(ApplicationClientProtocolPBServiceImpl.java:250) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:335) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (YARN-3146) Unused readObject() method in WebAppProxyServlet
[ https://issues.apache.org/jira/browse/YARN-3146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved YARN-3146. -- Resolution: Invalid Fix Version/s: 2.7.0 It's used by the servlet deserialization code which, while not a code path used by jersey/grizzly, is something findbugs needs. It needs to be left in, just with some javadocs to explain why. > Unused readObject() method in WebAppProxyServlet > > > Key: YARN-3146 > URL: https://issues.apache.org/jira/browse/YARN-3146 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 2.7.0 >Reporter: Steve Loughran >Assignee: Steve Loughran > Fix For: 2.7.0 > > Original Estimate: 2h > Remaining Estimate: 2h > > YARN-2940, "fix new findbugs", somehow inserted a new private method into > {{WebAppProxyServlet}}: {{readObject()}} > This method is not used and so unrelated to any findbugs warnings. > It should be deleted and only re-inserted with any code that actually uses it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
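A hedged sketch of the retained method with the kind of javadoc the resolution calls for; the exact wording lands with the YARN-3147 patches, and the body below is the standard Serializable hook rather than a copy of the committed code.
{code}
/**
 * Part of the java.io.Serializable contract for servlets. Not on any
 * jersey/grizzly code path, but findbugs expects it to exist:
 * leave this alone.
 */
private void readObject(java.io.ObjectInputStream input)
    throws java.io.IOException, ClassNotFoundException {
  // Default deserialization is all that is needed here.
  input.defaultReadObject();
}
{code}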
[jira] [Commented] (YARN-2031) YARN Proxy model doesn't support REST APIs in AMs
[ https://issues.apache.org/jira/browse/YARN-2031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14307858#comment-14307858 ] Steve Loughran commented on YARN-2031: -- I think httpclient needs to be set to specifically handle 307 + PUT/POST, etc. Leaving the basic redirect at 302 reduces change on the GET calls. The new ProxyUtility.sendRedirect() takes the HttpRequest as an argument, ready for the method to check the verb and choose the response code automatically. It could also consider returning something other than HTML if the input is, say, JSON. Though any automated client should just discard that. Anyway: yes, get YARN-3147 in first. There are some tests in the proxy code on servlet handling of GET redirects; these can be supplemented with tests that check the other verbs. Then we can look at the AM filter. > YARN Proxy model doesn't support REST APIs in AMs > - > > Key: YARN-2031 > URL: https://issues.apache.org/jira/browse/YARN-2031 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 2.4.0 >Reporter: Steve Loughran >Assignee: Steve Loughran > Attachments: YARN-2031.patch.001 > > > AMs can't support REST APIs because > # the AM filter redirects all requests to the proxy with a 302 response (not > 307) > # the proxy doesn't forward PUT/POST/DELETE verbs > Either the AM filter needs to return 307 and the proxy to forward the verbs, > or Am filter should not filter a REST bit of the web site -- This message was sent by Atlassian JIRA (v6.3.4#6332)
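A sketch of the verb-aware redirect described above; the method name follows the comment's ProxyUtility.sendRedirect(), but the body is an assumption about how the verb check could work, not the committed implementation.
{code}
import java.io.IOException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public static void sendRedirect(HttpServletRequest request,
    HttpServletResponse response, String target) throws IOException {
  String verb = request.getMethod();
  // Keep the classic 302 for GET/HEAD so existing clients see no change;
  // answer 307 for other verbs so clients replay the method and body.
  if ("GET".equals(verb) || "HEAD".equals(verb)) {
    response.setStatus(HttpServletResponse.SC_FOUND);              // 302
  } else {
    response.setStatus(HttpServletResponse.SC_TEMPORARY_REDIRECT); // 307
  }
  response.setHeader("Location", target);
}
{code}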
[jira] [Comment Edited] (YARN-3145) ConcurrentModificationException on CapacityScheduler ParentQueue#getQueueUserAclInfo
[ https://issues.apache.org/jira/browse/YARN-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14307857#comment-14307857 ] Jian He edited comment on YARN-3145 at 2/5/15 7:46 PM: --- thanks for working on this! In ParentQueue#completedContainer, the following is not synchronized, I think this should be inside the previous queue synchronization block. {code} if (sortQueues) { // reinsert the updated queue for (Iterator iter=childQueues.iterator(); iter.hasNext();) { CSQueue csqueue = iter.next(); if(csqueue.equals(completedChildQueue)) { iter.remove(); LOG.info("Re-sorting completed queue: " + csqueue.getQueuePath() + " stats: " + csqueue); childQueues.add(csqueue); break; } } } {code} was (Author: jianhe): thanks for working on this! In ParentQueue#completedContainer, the following is not synchronized, I think this should be inside the previous application synchronization block. {code} if (sortQueues) { // reinsert the updated queue for (Iterator iter=childQueues.iterator(); iter.hasNext();) { CSQueue csqueue = iter.next(); if(csqueue.equals(completedChildQueue)) { iter.remove(); LOG.info("Re-sorting completed queue: " + csqueue.getQueuePath() + " stats: " + csqueue); childQueues.add(csqueue); break; } } } {code} > ConcurrentModificationException on CapacityScheduler > ParentQueue#getQueueUserAclInfo > > > Key: YARN-3145 > URL: https://issues.apache.org/jira/browse/YARN-3145 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Rohith > Attachments: YARN-3145.001.patch > > > {code} > ava.util.ConcurrentModificationException(java.util.ConcurrentModificationException > at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115) > at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.getQueueUserAclInfo(ParentQueue.java:347) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.getQueueUserAclInfo(ParentQueue.java:348) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getQueueUserAclInfo(CapacityScheduler.java:850) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueUserAcls(ClientRMService.java:844) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueUserAcls(ApplicationClientProtocolPBServiceImpl.java:250) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:335) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3145) ConcurrentModificationException on CapacityScheduler ParentQueue#getQueueUserAclInfo
[ https://issues.apache.org/jira/browse/YARN-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14307857#comment-14307857 ] Jian He commented on YARN-3145: --- thanks for working on this! In ParentQueue#completedContainer, the following is not synchronized, I think this should be inside the previous application synchronization block. {code} if (sortQueues) { // reinsert the updated queue for (Iterator iter=childQueues.iterator(); iter.hasNext();) { CSQueue csqueue = iter.next(); if(csqueue.equals(completedChildQueue)) { iter.remove(); LOG.info("Re-sorting completed queue: " + csqueue.getQueuePath() + " stats: " + csqueue); childQueues.add(csqueue); break; } } } {code} > ConcurrentModificationException on CapacityScheduler > ParentQueue#getQueueUserAclInfo > > > Key: YARN-3145 > URL: https://issues.apache.org/jira/browse/YARN-3145 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Rohith > Attachments: YARN-3145.001.patch > > > {code} > ava.util.ConcurrentModificationException(java.util.ConcurrentModificationException > at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115) > at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.getQueueUserAclInfo(ParentQueue.java:347) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.getQueueUserAclInfo(ParentQueue.java:348) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getQueueUserAclInfo(CapacityScheduler.java:850) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueUserAcls(ClientRMService.java:844) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueUserAcls(ApplicationClientProtocolPBServiceImpl.java:250) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:335) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3148) allow CORS related headers to passthrough in WebAppProxyServlet
[ https://issues.apache.org/jira/browse/YARN-3148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14307855#comment-14307855 ] Hadoop QA commented on YARN-3148: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12696820/YARN-3148.001.patch against trunk revision c4980a2. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:red}-1 eclipse:eclipse{color}. The patch failed to build with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6522//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6522//console This message is automatically generated. > allow CORS related headers to passthrough in WebAppProxyServlet > --- > > Key: YARN-3148 > URL: https://issues.apache.org/jira/browse/YARN-3148 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Prakash Ramachandran >Assignee: Varun Saxena > Attachments: YARN-3148.001.patch > > > currently the WebAppProxyServlet filters the request headers as defined by > passThroughHeaders. Tez UI is building a webapp which using rest api to fetch > data from the am via the rm tracking url. > for this purpose it would be nice to have additional headers allowed > especially the ones related to CORS. A few of them that would help are > * Origin > * Access-Control-Request-Method > * Access-Control-Request-Headers -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1582) Capacity Scheduler: add a maximum-allocation-mb setting per queue
[ https://issues.apache.org/jira/browse/YARN-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14307836#comment-14307836 ] Hudson commented on YARN-1582: -- FAILURE: Integrated in Hadoop-trunk-Commit #7025 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7025/]) YARN-1582. Capacity Scheduler: add a maximum-allocation-mb setting per queue. Contributed by Thomas Graves (jlowe: rev 69c8a7f45be5c0aa6787b07f328d74f1e2ba5628) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerContext.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/CapacityScheduler.apt.vm * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractCSQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/YarnScheduler.java > Capacity Scheduler: add a maximum-allocation-mb setting per queue > -- > > Key: YARN-1582 > URL: https://issues.apache.org/jira/browse/YARN-1582 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 3.0.0, 0.23.10, 2.2.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Fix For: 2.7.0 > > Attachments: YARN-1582-branch-0.23.patch, YARN-1582.002.patch, > YARN-1582.003.patch > > > We want to allow certain queues to use larger container sizes while limiting > other queues to smaller container sizes. Setting it per queue will help > prevent abuse, help limit the impact of reservations, and allow changes in > the maximum container size to be rolled out more easily. > One reason this is needed is more application types are becoming available on > yarn and certain applications require more memory to run efficiently. While > we want to allow for that we don't want other applications to abuse that and > start requesting bigger containers than what they really need.
> Note that we could have this based on application type, but that might not be > totally accurate either since for example you might want to allow certain > users on MapReduce to use larger containers, while limiting other users of > MapReduce to smaller containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
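For operators, a hedged example of the per-queue override this change enables; the queue name below is made up, and the property name follows the usual capacity-scheduler per-queue convention — verify it against the CapacityScheduler.apt.vm shipped with the commit.
{code:xml}
<!-- Hypothetical queue root.bigmem may hand out containers up to 16 GB,
     while other queues keep the cluster-wide maximum-allocation-mb. -->
<property>
  <name>yarn.scheduler.capacity.root.bigmem.maximum-allocation-mb</name>
  <value>16384</value>
</property>
{code}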
[jira] [Commented] (YARN-1582) Capacity Scheduler: add a maximum-allocation-mb setting per queue
[ https://issues.apache.org/jira/browse/YARN-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14307822#comment-14307822 ] Jason Lowe commented on YARN-1582: -- Thanks for the review, Tom! Committing this. > Capacity Scheduler: add a maximum-allocation-mb setting per queue > -- > > Key: YARN-1582 > URL: https://issues.apache.org/jira/browse/YARN-1582 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 3.0.0, 0.23.10, 2.2.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Attachments: YARN-1582-branch-0.23.patch, YARN-1582.002.patch, > YARN-1582.003.patch > > > We want to allow certain queues to use larger container sizes while limiting > other queues to smaller container sizes. Setting it per queue will help > prevent abuse, help limit the impact of reservations, and allow changes in > the maximum container size to be rolled out more easily. > One reason this is needed is more application types are becoming available on > yarn and certain applications require more memory to run efficiently. While > we want to allow for that we don't want other applications to abuse that and > start requesting bigger containers than what they really need. > Note that we could have this based on application type, but that might not be > totally accurate either since for example you might want to allow certain > users on MapReduce to use larger containers, while limiting other users of > MapReduce to smaller containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2921) MockRM#waitForState methods can be too slow and flaky
[ https://issues.apache.org/jira/browse/YARN-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi OZAWA updated YARN-2921: - Attachment: YARN-2921.004.patch Make timeouts smaller and fix test failures. > MockRM#waitForState methods can be too slow and flaky > - > > Key: YARN-2921 > URL: https://issues.apache.org/jira/browse/YARN-2921 > Project: Hadoop YARN > Issue Type: Improvement > Components: test >Affects Versions: 2.6.0 >Reporter: Karthik Kambatla >Assignee: Tsuyoshi OZAWA > Attachments: YARN-2921.001.patch, YARN-2921.002.patch, > YARN-2921.003.patch, YARN-2921.004.patch > > > MockRM#waitForState methods currently sleep for too long (2 seconds and 1 > second). This leads to slow tests and sometimes failures if the > App/AppAttempt moves to another state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
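A sketch of the polling pattern the patch tightens, assuming the usual MockRM accessors; the interval and cap below are illustrative, not the values chosen in YARN-2921.004.patch.
{code}
public void waitForState(ApplicationId appId, RMAppState finalState)
    throws InterruptedException {
  RMApp app = getRMContext().getRMApps().get(appId);
  long deadline = System.currentTimeMillis() + 40 * 1000; // overall cap
  while (app.getState() != finalState
      && System.currentTimeMillis() < deadline) {
    Thread.sleep(50); // short poll instead of the old 1-2 second sleeps
  }
  Assert.assertEquals("App state is not correct (timedout)",
      finalState, app.getState());
}
{code}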
[jira] [Updated] (YARN-3149) Typo in message for invalid application id
[ https://issues.apache.org/jira/browse/YARN-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi OZAWA updated YARN-3149: - Fix Version/s: (was: 2.6.0) > Typo in message for invalid application id > -- > > Key: YARN-3149 > URL: https://issues.apache.org/jira/browse/YARN-3149 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Trivial > Attachments: YARN-3149.patch, screenshot-1.png > > > Message in console wrong when application id format wrong -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3149) Typo in message for invalid application id
[ https://issues.apache.org/jira/browse/YARN-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi OZAWA updated YARN-3149: - Hadoop Flags: Reviewed > Typo in message for invalid application id > -- > > Key: YARN-3149 > URL: https://issues.apache.org/jira/browse/YARN-3149 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Trivial > Attachments: YARN-3149.patch, screenshot-1.png > > > Message in console wrong when application id format wrong -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3149) Typo in message for invalid application id
[ https://issues.apache.org/jira/browse/YARN-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14307778#comment-14307778 ] Tsuyoshi OZAWA commented on YARN-3149: -- [~bibinchundatt] thank you for working on this jira. Good catch, and the fix looks good to me (+1). Pending Jenkins. > Typo in message for invalid application id > -- > > Key: YARN-3149 > URL: https://issues.apache.org/jira/browse/YARN-3149 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Trivial > Fix For: 2.6.0 > > Attachments: YARN-3149.patch, screenshot-1.png > > > Message in console wrong when application id format wrong -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3148) allow CORS related headers to passthrough in WebAppProxyServlet
[ https://issues.apache.org/jira/browse/YARN-3148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-3148: --- Attachment: YARN-3148.001.patch > allow CORS related headers to passthrough in WebAppProxyServlet > --- > > Key: YARN-3148 > URL: https://issues.apache.org/jira/browse/YARN-3148 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Prakash Ramachandran >Assignee: Varun Saxena > Attachments: YARN-3148.001.patch > > > currently the WebAppProxyServlet filters the request headers as defined by > passThroughHeaders. Tez UI is building a webapp which using rest api to fetch > data from the am via the rm tracking url. > for this purpose it would be nice to have additional headers allowed > especially the ones related to CORS. A few of them that would help are > * Origin > * Access-Control-Request-Method > * Access-Control-Request-Headers -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3149) Typo in message for invalid application id
[ https://issues.apache.org/jira/browse/YARN-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated YARN-3149: --- Assignee: Bibin A Chundatt > Typo in message for invalid application id > -- > > Key: YARN-3149 > URL: https://issues.apache.org/jira/browse/YARN-3149 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Trivial > Attachments: YARN-3149.patch, screenshot-1.png > > > Message in console wrong when application id format wrong -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3149) Typo in message for invalid application id
[ https://issues.apache.org/jira/browse/YARN-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-3149: --- Affects Version/s: 2.6.0 > Typo in message for invalid application id > -- > > Key: YARN-3149 > URL: https://issues.apache.org/jira/browse/YARN-3149 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Bibin A Chundatt >Priority: Trivial > Attachments: YARN-3149.patch, screenshot-1.png > > > Message in console wrong when application id format wrong -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3149) Typo in message for invalid application id
[ https://issues.apache.org/jira/browse/YARN-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-3149: --- Attachment: YARN-3149.patch > Typo in message for invalid application id > -- > > Key: YARN-3149 > URL: https://issues.apache.org/jira/browse/YARN-3149 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: Bibin A Chundatt >Priority: Trivial > Attachments: YARN-3149.patch, screenshot-1.png > > > Message in console wrong when application id format wrong -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3149) Typo in message for invalid application id
[ https://issues.apache.org/jira/browse/YARN-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-3149: --- Attachment: screenshot-1.png > Typo in message for invalid application id > -- > > Key: YARN-3149 > URL: https://issues.apache.org/jira/browse/YARN-3149 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: Bibin A Chundatt >Priority: Trivial > Attachments: screenshot-1.png > > > Message in console wrong when application id format wrong -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3149) Typo in message for invalid application id
[ https://issues.apache.org/jira/browse/YARN-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14307733#comment-14307733 ] Bibin A Chundatt commented on YARN-3149: The message should say application id, not app attempt id. Could anyone assign this to me? I would like to submit the patch. {code} try { return toApplicationId(it); } catch (NumberFormatException n) { throw new IllegalArgumentException("Invalid AppAttemptId: " + appIdStr, n); } {code} > Typo in message for invalid application id > -- > > Key: YARN-3149 > URL: https://issues.apache.org/jira/browse/YARN-3149 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: Bibin A Chundatt >Priority: Trivial > Attachments: screenshot-1.png > > > Message in console wrong when application id format wrong -- This message was sent by Atlassian JIRA (v6.3.4#6332)
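Based on the snippet above, the one-line fix the report implies — this path parses an application id, so the message should name ApplicationId rather than AppAttemptId (the actual change is in the attached YARN-3149.patch):
{code}
try {
  return toApplicationId(it);
} catch (NumberFormatException n) {
  // Corrected message: this is an application id, not an app attempt id.
  throw new IllegalArgumentException("Invalid ApplicationId: " + appIdStr, n);
}
{code}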
[jira] [Created] (YARN-3149) Typo in message for invalid application id
Bibin A Chundatt created YARN-3149: -- Summary: Typo in message for invalid application id Key: YARN-3149 URL: https://issues.apache.org/jira/browse/YARN-3149 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Bibin A Chundatt Priority: Trivial Message in console wrong when application id format wrong -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-914) Support graceful decommission of nodemanager
[ https://issues.apache.org/jira/browse/YARN-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14307724#comment-14307724 ] Ming Ma commented on YARN-914: -- I agree with Jason. It is easier if the NM doesn't need to know about decommission. There is a scalability issue that Junping might have brought up, but it shouldn't be a problem. To clarify the decomm node list: there appear to be two things; one is the decomm request list, and the other is the run time state of the decomm nodes. From Xuan's comment it appears we want to put the request in HDFS and leverage FileSystemBasedConfigurationProvider to read it at run time. Given it is considered configuration, that seems a good fit. Jason mentioned the state store; that can be used to track the run time state of the decomm. This is necessary given we plan to introduce a timeout for graceful decommission. However, if we assume ResourceOption's overcommitTimeout state is stored in the state store for the RM failover case as part of YARN-291, then the new active RM can just replay the state transition. If so, it seems we don't need to persist decomm run time state to the state store. Alternatively we can remove the graceful decommission timeout from the YARN layer and let the external decommission script handle that. If the script considers that the graceful decommission is taking too long, it can ask YARN to do an immediate decommission. BTW, it appears the fair scheduler doesn't support ConfigurationProvider. Recommission is another scenario. It can happen when a node is in the decommissioned state or the decommissioned_in_progress state. > Support graceful decommission of nodemanager > > > Key: YARN-914 > URL: https://issues.apache.org/jira/browse/YARN-914 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.0.4-alpha >Reporter: Luke Lu >Assignee: Junping Du > Attachments: Gracefully Decommission of NodeManager (v1).pdf > > > When NMs are decommissioned for non-fault reasons (capacity change etc.), > it's desirable to minimize the impact to running applications. > Currently if a NM is decommissioned, all running containers on the NM need to > be rescheduled on other NMs. Further more, for finished map tasks, if their > map output are not fetched by the reducers of the job, these map tasks will > need to be rerun as well. > We propose to introduce a mechanism to optionally gracefully decommission a > node manager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-3148) allow CORS related headers to passthrough in WebAppProxyServlet
[ https://issues.apache.org/jira/browse/YARN-3148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena reassigned YARN-3148: -- Assignee: Varun Saxena > allow CORS related headers to passthrough in WebAppProxyServlet > --- > > Key: YARN-3148 > URL: https://issues.apache.org/jira/browse/YARN-3148 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Prakash Ramachandran >Assignee: Varun Saxena > > currently the WebAppProxyServlet filters the request headers as defined by > passThroughHeaders. Tez UI is building a webapp which using rest api to fetch > data from the am via the rm tracking url. > for this purpose it would be nice to have additional headers allowed > especially the ones related to CORS. A few of them that would help are > * Origin > * Access-Control-Request-Method > * Access-Control-Request-Headers -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3101) In Fair Scheduler, fix canceling of reservations for exceeding max share
[ https://issues.apache.org/jira/browse/YARN-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14307689#comment-14307689 ] Anubhav Dhoot commented on YARN-3101: - Thanks [~sandyr] for the review and commit. > In Fair Scheduler, fix canceling of reservations for exceeding max share > > > Key: YARN-3101 > URL: https://issues.apache.org/jira/browse/YARN-3101 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Fix For: 2.7.0 > > Attachments: YARN-3101-Siqi.v1.patch, YARN-3101-Siqi.v2.patch, > YARN-3101.001.patch, YARN-3101.002.patch, YARN-3101.003.patch, > YARN-3101.003.patch, YARN-3101.004.patch, YARN-3101.004.patch > > > YARN-2811 added fitInMaxShare to validate reservations on a queue, but did > not count it during its calculations. It also had the condition reversed so > the test was still passing because both cancelled each other. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3148) allow CORS related headers to passthrough in WebAppProxyServlet
Prakash Ramachandran created YARN-3148: -- Summary: allow CORS related headers to passthrough in WebAppProxyServlet Key: YARN-3148 URL: https://issues.apache.org/jira/browse/YARN-3148 Project: Hadoop YARN Issue Type: Improvement Reporter: Prakash Ramachandran Currently the WebAppProxyServlet filters the request headers as defined by passThroughHeaders. Tez UI is building a webapp which uses the rest api to fetch data from the am via the rm tracking url. For this purpose it would be nice to have additional headers allowed, especially the ones related to CORS. A few of them that would help are * Origin * Access-Control-Request-Method * Access-Control-Request-Headers -- This message was sent by Atlassian JIRA (v6.3.4#6332)
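A hedged sketch of the requested change: extend the proxy's pass-through set with the CORS request headers. The passThroughHeaders name comes from the issue text; the pre-existing entries shown are illustrative, not a copy of the current field.
{code}
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

private static final Set<String> passThroughHeaders =
    new HashSet<String>(Arrays.asList(
        "User-Agent", "Accept", "Accept-Encoding", "Accept-Language",
        // CORS headers this JIRA asks to let through to the AM:
        "Origin",
        "Access-Control-Request-Method",
        "Access-Control-Request-Headers"));
{code}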
[jira] [Commented] (YARN-3144) Configuration for making delegation token failures to timeline server not-fatal
[ https://issues.apache.org/jira/browse/YARN-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14307656#comment-14307656 ] Zhijie Shen commented on YARN-3144: --- bq. How do we know if the job has used the service if we're trying to get the delegation token? I guess it's reasonable for the job submitter to know whether the app is using the timeline service (such as Tez), and whether putting data to the timeline server is enabled. I meant the decision of whether to load the timeline DT automatically can be made on a per-job basis. But Jonathan's case about Oozie workflows makes sense. Different apps in a workflow may even differ in their use of the timeline service. bq. My main purpose now is to given users options on whether the timeline server failures are fatal to the users job. Given the justification above, it sounds reasonable. Hopefully it could just be a temporary solution until the timeline service is highly available. > Configuration for making delegation token failures to timeline server > not-fatal > --- > > Key: YARN-3144 > URL: https://issues.apache.org/jira/browse/YARN-3144 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Jonathan Eagles >Assignee: Jonathan Eagles > Attachments: YARN-3144.1.patch > > > Posting events to the timeline server is best-effort. However, getting the > delegation tokens from the timeline server will kill the job. This patch adds > a configuration to make get delegation token operations "best-effort". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3101) In Fair Scheduler, fix canceling of reservations for exceeding max share
[ https://issues.apache.org/jira/browse/YARN-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14307629#comment-14307629 ] Hudson commented on YARN-3101: -- FAILURE: Integrated in Hadoop-trunk-Commit #7022 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7022/]) YARN-3101. In Fair Scheduler, fix canceling of reservations for exceeding max share (Anubhav Dhoot via Sandy Ryza) (sandy: rev b6466deac6d5d6344f693144290b46e2bef83a02) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java > In Fair Scheduler, fix canceling of reservations for exceeding max share > > > Key: YARN-3101 > URL: https://issues.apache.org/jira/browse/YARN-3101 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Attachments: YARN-3101-Siqi.v1.patch, YARN-3101-Siqi.v2.patch, > YARN-3101.001.patch, YARN-3101.002.patch, YARN-3101.003.patch, > YARN-3101.003.patch, YARN-3101.004.patch, YARN-3101.004.patch > > > YARN-2811 added fitInMaxShare to validate reservations on a queue, but did > not count it during its calculations. It also had the condition reversed so > the test was still passing because both cancelled each other. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3101) In Fair Scheduler, fix canceling of reservations for exceeding max share
[ https://issues.apache.org/jira/browse/YARN-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-3101: - Summary: In Fair Scheduler, fix canceling of reservations for exceeding max share (was: Fix canceling of reservations for exceeding max share) > In Fair Scheduler, fix canceling of reservations for exceeding max share > > > Key: YARN-3101 > URL: https://issues.apache.org/jira/browse/YARN-3101 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Attachments: YARN-3101-Siqi.v1.patch, YARN-3101-Siqi.v2.patch, > YARN-3101.001.patch, YARN-3101.002.patch, YARN-3101.003.patch, > YARN-3101.003.patch, YARN-3101.004.patch, YARN-3101.004.patch > > > YARN-2811 added fitInMaxShare to validate reservations on a queue, but did > not count it during its calculations. It also had the condition reversed so > the test was still passing because both cancelled each other. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3101) Fix canceling of reservations for exceeding max share
[ https://issues.apache.org/jira/browse/YARN-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-3101: - Summary: Fix canceling of reservations for exceeding max share (was: FairScheduler#fitInMaxShare was added to validate reservations but it does not consider it ) > Fix canceling of reservations for exceeding max share > - > > Key: YARN-3101 > URL: https://issues.apache.org/jira/browse/YARN-3101 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Attachments: YARN-3101-Siqi.v1.patch, YARN-3101-Siqi.v2.patch, > YARN-3101.001.patch, YARN-3101.002.patch, YARN-3101.003.patch, > YARN-3101.003.patch, YARN-3101.004.patch, YARN-3101.004.patch > > > YARN-2811 added fitInMaxShare to validate reservations on a queue, but did > not count it during its calculations. It also had the condition reversed so > the test was still passing because both cancelled each other. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3040) implement client-side API for handling flows
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14307615#comment-14307615 ] Robert Kanter commented on YARN-3040: - Tags are basically just arbitrary strings you can "attach" to an application. We had discussed the idea of specifying which applications belong to which flow by tagging them with the flow id. Tags were added in YARN-1461. > implement client-side API for handling flows > > > Key: YARN-3040 > URL: https://issues.apache.org/jira/browse/YARN-3040 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Robert Kanter > > Per design in YARN-2928, implement client-side API for handling *flows*. > Frameworks should be able to define and pass in all attributes of flows and > flow runs to YARN, and they should be passed into ATS writers. > YARN tags were discussed as a way to handle this piece of information. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
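A short sketch of what flow tagging could look like with the YARN-1461 tags API; the flow/flowrun tag convention here is illustrative, not a settled format.
{code}
import java.util.Arrays;
import java.util.HashSet;
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;

// Hypothetical convention: encode flow attributes as application tags
// on the submission context before submitting the app.
ApplicationSubmissionContext ctx = app.getApplicationSubmissionContext();
ctx.setApplicationTags(new HashSet<String>(
    Arrays.asList("flow:dailyIngest", "flowrun:2015-02-05")));
{code}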
[jira] [Commented] (YARN-3087) the REST server (web server) for per-node aggregator does not work if it runs inside node manager
[ https://issues.apache.org/jira/browse/YARN-3087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14307614#comment-14307614 ] Devaraj K commented on YARN-3087: - I did some investigation on this issue and found that the following is the cause. GuiceFilter maintains some static variables and accesses these for request processing. These values get overwritten by each new WebApp starting in the same process, so they hold only the last WebApp's details. That's why we can access only the last started WebApp. It also means the same WebApp is served on all the started WebApps' web ports, due to the GuiceFilter static data, instead of each port serving its own WebApp. {code:title=com.google.inject.servlet.GuiceFilter|borderStyle=solid} static final ThreadLocal<Context> localContext = new ThreadLocal<Context>(); static volatile FilterPipeline pipeline = new DefaultFilterPipeline(); ** static volatile WeakReference<ServletContext> servletContext = new WeakReference<ServletContext>(null); ** @Inject static void setPipeline(FilterPipeline pipeline) { GuiceFilter.pipeline = pipeline; } {code} I tried to override GuiceFilter with a custom filter (extending GuiceFilter) to avoid this static data overwriting. But the hard-coded GuiceFilter.class is used in different places of the guice-servlet module during initialization and request processing, which again causes the problem. {code:title=com.google.inject.servlet.InternalServletModule|borderStyle=solid} @Override protected void configure() { *** requestStaticInjection(GuiceFilter.class); *** {code} {code:title=com.google.inject.servlet.ServletScopes|borderStyle=solid} HttpServletRequest request = GuiceFilter.getRequest(); {code} Please share your thoughts on this issue. Thanks. > the REST server (web server) for per-node aggregator does not work if it runs > inside node manager > - > > Key: YARN-3087 > URL: https://issues.apache.org/jira/browse/YARN-3087 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Devaraj K > > This is related to YARN-3030. YARN-3030 sets up a per-node timeline > aggregator and the associated REST server. It runs fine as a standalone > process, but does not work if it runs inside the node manager due to possible > collisions of servlet mapping. > Exception: > {noformat} > org.apache.hadoop.yarn.webapp.WebAppException: /v2/timeline: controller for > v2 not found > at org.apache.hadoop.yarn.webapp.Router.resolveDefault(Router.java:232) > at org.apache.hadoop.yarn.webapp.Router.resolve(Router.java:140) > at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:134) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) > at > com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263) > at > com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178) > at > com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) > at > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62) > at > com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900) > at > com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834) > at > com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795) > ... > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
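To make the collision concrete, a self-contained illustration (hypothetical servlets and paths): each createInjector call runs requestStaticInjection(GuiceFilter.class), so the second webapp's pipeline replaces the first and every GuiceFilter instance in the JVM serves only the last mapping.
{code}
import javax.servlet.http.HttpServlet;
import com.google.inject.Guice;
import com.google.inject.Singleton;
import com.google.inject.servlet.ServletModule;

public class GuiceFilterCollisionDemo {
  @Singleton static class NmServlet extends HttpServlet {}
  @Singleton static class AggregatorServlet extends HttpServlet {}

  public static void main(String[] args) {
    Guice.createInjector(new ServletModule() {
      @Override protected void configureServlets() {
        serve("/nm/*").with(NmServlet.class);                  // NM webapp
      }
    });
    Guice.createInjector(new ServletModule() {
      @Override protected void configureServlets() {
        serve("/v2/timeline/*").with(AggregatorServlet.class); // aggregator
      }
    });
    // GuiceFilter.pipeline now holds only the second FilterPipeline, so a
    // request to /nm/* falls through its filter chain -- the same symptom
    // as the "controller for v2 not found" failure in this issue.
  }
}
{code}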
[jira] [Commented] (YARN-3087) the REST server (web server) for per-node aggregator does not work if it runs inside node manager
[ https://issues.apache.org/jira/browse/YARN-3087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14307607#comment-14307607 ] Devaraj K commented on YARN-3087: - Thanks [~sjlee0] for reporting this issue. > the REST server (web server) for per-node aggregator does not work if it runs > inside node manager > - > > Key: YARN-3087 > URL: https://issues.apache.org/jira/browse/YARN-3087 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Devaraj K > > This is related to YARN-3030. YARN-3030 sets up a per-node timeline > aggregator and the associated REST server. It runs fine as a standalone > process, but does not work if it runs inside the node manager due to possible > collisions of servlet mapping. > Exception: > {noformat} > org.apache.hadoop.yarn.webapp.WebAppException: /v2/timeline: controller for > v2 not found > at org.apache.hadoop.yarn.webapp.Router.resolveDefault(Router.java:232) > at org.apache.hadoop.yarn.webapp.Router.resolve(Router.java:140) > at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:134) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) > at > com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263) > at > com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178) > at > com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) > at > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62) > at > com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900) > at > com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834) > at > com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795) > ... > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)