[jira] [Commented] (YARN-1537) TestLocalResourcesTrackerImpl.testLocalResourceCache often failed

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308817#comment-14308817
 ] 

Hudson commented on YARN-1537:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7038 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7038/])
YARN-1537. Fix race condition in 
TestLocalResourcesTrackerImpl.testLocalResourceCache. Contributed by Xuan Gong. 
(acmurthy: rev 02f154a0016b7321bbe5b09f2da44a9b33797c36)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestLocalResourcesTrackerImpl.java
* hadoop-yarn-project/CHANGES.txt


> TestLocalResourcesTrackerImpl.testLocalResourceCache often failed
> -
>
> Key: YARN-1537
> URL: https://issues.apache.org/jira/browse/YARN-1537
> Project: Hadoop YARN
>  Issue Type: Test
>  Components: nodemanager
>Affects Versions: 2.2.0
>Reporter: Hong Shen
>Assignee: Xuan Gong
> Fix For: 2.7.0
>
> Attachments: YARN-1537.1.patch
>
>
> Here is the error log
> {code}
> Results :
> Failed tests: 
>   TestLocalResourcesTrackerImpl.testLocalResourceCache:351 
> Wanted but not invoked:
> eventHandler.handle(
> 
> isA(org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerResourceLocalizedEvent)
> );
> -> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestLocalResourcesTrackerImpl.testLocalResourceCache(TestLocalResourcesTrackerImpl.java:351)
> However, there were other interactions with this mock:
> -> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134)
> -> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134)
> {code}
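As a hedged aside (this is not necessarily what the committed YARN-1537 patch does; the fix may instead drain the test dispatcher before asserting), one common way to remove this kind of race in a Mockito-based test is to let the verification wait for the AsyncDispatcher thread rather than verifying immediately:

{code}
// Illustrative only (Mockito 1.x style, as used by Hadoop at the time).
import static org.mockito.Matchers.isA;
import static org.mockito.Mockito.timeout;
import static org.mockito.Mockito.verify;

import org.apache.hadoop.yarn.event.EventHandler;
import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerEvent;
import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerResourceLocalizedEvent;

public class RaceFreeVerifySketch {
  // Waits up to 5 seconds for the AsyncDispatcher thread to deliver the
  // localized event instead of asserting immediately and racing with it.
  static void verifyLocalizedEventually(EventHandler<ContainerEvent> handler) {
    verify(handler, timeout(5000))
        .handle(isA(ContainerResourceLocalizedEvent.class));
  }
}
{code}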



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3151) On Failover tracking url wrong in application cli for KILLED application

2015-02-06 Thread Bibin A Chundatt (JIRA)
Bibin A Chundatt created YARN-3151:
--

 Summary: On Failover tracking url wrong in application cli for 
KILLED application
 Key: YARN-3151
 URL: https://issues.apache.org/jira/browse/YARN-3151
 Project: Hadoop YARN
  Issue Type: Bug
  Components: client, resourcemanager
Affects Versions: 2.6.0
 Environment: 2 RM HA 
Reporter: Bibin A Chundatt
Priority: Minor


Run an application and kill it after it has started.
Check {color:red} ./yarn application -list -appStates KILLED {color}

(empty line)

{quote}

Application-Id Tracking-URL
application_1423219262738_0001  
http://:PORT>/cluster/app/application_1423219262738_0001

{quote}

Shutdown the active RM1
Check the same command {color:red} ./yarn application -list -appStates KILLED 
{color} after RM2 is active

{quote}

Application-Id Tracking-URL
application_1423219262738_0001  null

{quote}

The tracking URL for the application is shown as null.
Expected: the same URL as before failover should be shown.

ApplicationReport.getOriginalTrackingUrl() is null after failover.

org.apache.hadoop.yarn.client.cli.ApplicationCLI
listApplications(Set<String> appTypes,
  EnumSet<YarnApplicationState> appStates)
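As a hedged sketch of how the reported field can be observed programmatically: YarnClient, ApplicationReport and getOriginalTrackingUrl are the public API named above, while the rest of the program (configuration on the classpath, printing layout) is purely illustrative.

{code}
import java.util.EnumSet;

import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.api.records.YarnApplicationState;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class KilledAppTrackingUrl {
  public static void main(String[] args) throws Exception {
    YarnClient client = YarnClient.createYarnClient();
    client.init(new YarnConfiguration());  // assumes RM HA settings are on the classpath
    client.start();
    try {
      for (ApplicationReport report :
          client.getApplications(EnumSet.of(YarnApplicationState.KILLED))) {
        // After failover this is reported as null instead of the pre-failover URL.
        System.out.println(report.getApplicationId() + "\t"
            + report.getOriginalTrackingUrl());
      }
    } finally {
      client.stop();
    }
  }
}
{code}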




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-3151) On Failover tracking url wrong in application cli for KILLED application

2015-02-06 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith reassigned YARN-3151:


Assignee: Rohith

> On Failover tracking url wrong in application cli for KILLED application
> 
>
> Key: YARN-3151
> URL: https://issues.apache.org/jira/browse/YARN-3151
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client, resourcemanager
>Affects Versions: 2.6.0
> Environment: 2 RM HA 
>Reporter: Bibin A Chundatt
>Assignee: Rohith
>Priority: Minor
>
> Run an application and kill it after it has started.
> Check {color:red} ./yarn application -list -appStates KILLED {color}
> (empty line)
> {quote}
> Application-Id Tracking-URL
> application_1423219262738_0001  
> http://:PORT>/cluster/app/application_1423219262738_0001
> {quote}
> Shutdown the active RM1
> Check the same command {color:red} ./yarn application -list -appStates KILLED 
> {color} after RM2 is active
> {quote}
> Application-Id Tracking-URL
> application_1423219262738_0001  null
> {quote}
> The tracking URL for the application is shown as null.
> Expected: the same URL as before failover should be shown.
> ApplicationReport.getOriginalTrackingUrl() is null after failover.
> org.apache.hadoop.yarn.client.cli.ApplicationCLI
> listApplications(Set<String> appTypes,
>   EnumSet<YarnApplicationState> appStates)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-933) After an AppAttempt_1 got failed [ removal and releasing of container is done , AppAttempt_2 is scheduled ] again relaunching of AppAttempt_1 throws Exception at RM .And

2015-02-06 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308862#comment-14308862
 ] 

Rohith commented on YARN-933:
-

Sure, I will recheck the code for existence of problem and update the patch.

> After an AppAttempt_1 got failed [ removal and releasing of container is done 
> , AppAttempt_2 is scheduled ] again relaunching of AppAttempt_1 throws 
> Exception at RM .And client exited before appattempt retries got over
> --
>
> Key: YARN-933
> URL: https://issues.apache.org/jira/browse/YARN-933
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.0.5-alpha
>Reporter: J.Andreina
>Assignee: Rohith
> Attachments: YARN-933.patch
>
>
> AM max retries is configured as 3 on both the client and the RM side.
> Step 1: Install a cluster with NMs on 2 machines.
> Step 2: Make ping by IP from the RM machine to the NM1 machine succeed, but 
> ping by hostname fail.
> Step 3: Execute a job.
> Step 4: After the AM [ AppAttempt_1 ] is allocated to the NM1 machine, a 
> connection loss happens.
> Observation :
> ==
> After AppAttempt_1 has moved to the FAILED state, the release of AppAttempt_1's 
> container and the application removal are successful. A new AppAttempt_2 is 
> spawned.
> 1. Then a retry for AppAttempt_1 happens again.
> 2. The RM again tries to launch AppAttempt_1, which fails with 
> InvalidStateTransitonException.
> 3. The client exited after AppAttempt_1 finished [but the job is actually 
> still running], even though 3 app attempts are configured and the remaining 
> attempts are all spawned and running.
> RMLogs:
> ==
> 2013-07-17 16:22:51,013 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1373952096466_0056_01 State change from SCHEDULED to ALLOCATED
> 2013-07-17 16:35:48,171 INFO org.apache.hadoop.ipc.Client: Retrying connect 
> to server: host-10-18-40-15/10.18.40.59:8048. Already tried 36 time(s); 
> maxRetries=45
> 2013-07-17 16:36:07,091 INFO 
> org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: 
> Expired:container_1373952096466_0056_01_01 Timed out after 600 secs
> 2013-07-17 16:36:07,093 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
> container_1373952096466_0056_01_01 Container Transitioned from ACQUIRED 
> to EXPIRED
> 2013-07-17 16:36:07,093 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: 
> Registering appattempt_1373952096466_0056_02
> 2013-07-17 16:36:07,131 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
>  Application appattempt_1373952096466_0056_01 is done. finalState=FAILED
> 2013-07-17 16:36:07,131 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
> Application removed - appId: application_1373952096466_0056 user: Rex 
> leaf-queue of parent: root #applications: 35
> 2013-07-17 16:36:07,132 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
>  Application Submission: appattempt_1373952096466_0056_02, 
> 2013-07-17 16:36:07,138 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1373952096466_0056_02 State change from SUBMITTED to SCHEDULED
> 2013-07-17 16:36:30,179 INFO org.apache.hadoop.ipc.Client: Retrying connect 
> to server: host-10-18-40-15/10.18.40.59:8048. Already tried 38 time(s); 
> maxRetries=45
> 2013-07-17 16:38:36,203 INFO org.apache.hadoop.ipc.Client: Retrying connect 
> to server: host-10-18-40-15/10.18.40.59:8048. Already tried 44 time(s); 
> maxRetries=45
> 2013-07-17 16:38:56,207 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Error 
> launching appattempt_1373952096466_0056_01. Got exception: 
> java.lang.reflect.UndeclaredThrowableException
> 2013-07-17 16:38:56,207 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> LAUNCH_FAILED at FAILED
>  at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>  at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
>  at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:445)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:630)
>  at 
> org.apache.hadoop.yarn.server.
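The quoted stack trace is cut off above. As a hedged illustration of why the RM logs "Invalid event: LAUNCH_FAILED at FAILED", the sketch below builds a minimal, hypothetical state machine with the same org.apache.hadoop.yarn.state.StateMachineFactory API that RMAppAttemptImpl uses: the states, events and transitions here are invented for illustration, but the failure mode is the same, once an attempt is FAILED and no transition is registered for a late LAUNCH_FAILED, doTransition throws InvalidStateTransitonException.

{code}
import org.apache.hadoop.yarn.state.InvalidStateTransitonException;
import org.apache.hadoop.yarn.state.StateMachine;
import org.apache.hadoop.yarn.state.StateMachineFactory;

public class AttemptStateSketch {
  enum AttemptState { SCHEDULED, ALLOCATED, FAILED }
  enum AttemptEventType { ALLOCATE, LAUNCH_FAILED }
  static class AttemptEvent { }

  // Only ALLOCATED registers an arc for LAUNCH_FAILED; FAILED does not,
  // which is exactly the situation the RM log above complains about.
  private static final StateMachineFactory<AttemptStateSketch, AttemptState,
      AttemptEventType, AttemptEvent> FACTORY =
      new StateMachineFactory<AttemptStateSketch, AttemptState,
          AttemptEventType, AttemptEvent>(AttemptState.SCHEDULED)
          .addTransition(AttemptState.SCHEDULED, AttemptState.ALLOCATED,
              AttemptEventType.ALLOCATE)
          .addTransition(AttemptState.ALLOCATED, AttemptState.FAILED,
              AttemptEventType.LAUNCH_FAILED)
          .installTopology();

  private final StateMachine<AttemptState, AttemptEventType, AttemptEvent>
      stateMachine = FACTORY.make(this);

  public static void main(String[] args) {
    AttemptStateSketch attempt = new AttemptStateSketch();
    attempt.stateMachine.doTransition(AttemptEventType.ALLOCATE,
        new AttemptEvent());
    attempt.stateMachine.doTransition(AttemptEventType.LAUNCH_FAILED,
        new AttemptEvent());
    try {
      // A late LAUNCH_FAILED arrives after the attempt is already FAILED:
      // no transition is registered, so the state machine throws.
      attempt.stateMachine.doTransition(AttemptEventType.LAUNCH_FAILED,
          new AttemptEvent());
    } catch (InvalidStateTransitonException e) {
      System.out.println(e.getMessage());
    }
  }
}
{code}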

[jira] [Commented] (YARN-1537) TestLocalResourcesTrackerImpl.testLocalResourceCache often failed

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308947#comment-14308947
 ] 

Hudson commented on YARN-1537:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #96 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/96/])
YARN-1537. Fix race condition in 
TestLocalResourcesTrackerImpl.testLocalResourceCache. Contributed by Xuan Gong. 
(acmurthy: rev 02f154a0016b7321bbe5b09f2da44a9b33797c36)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestLocalResourcesTrackerImpl.java


> TestLocalResourcesTrackerImpl.testLocalResourceCache often failed
> -
>
> Key: YARN-1537
> URL: https://issues.apache.org/jira/browse/YARN-1537
> Project: Hadoop YARN
>  Issue Type: Test
>  Components: nodemanager
>Affects Versions: 2.2.0
>Reporter: Hong Shen
>Assignee: Xuan Gong
> Fix For: 2.7.0
>
> Attachments: YARN-1537.1.patch
>
>
> Here is the error log
> {code}
> Results :
> Failed tests: 
>   TestLocalResourcesTrackerImpl.testLocalResourceCache:351 
> Wanted but not invoked:
> eventHandler.handle(
> 
> isA(org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerResourceLocalizedEvent)
> );
> -> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestLocalResourcesTrackerImpl.testLocalResourceCache(TestLocalResourcesTrackerImpl.java:351)
> However, there were other interactions with this mock:
> -> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134)
> -> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3101) In Fair Scheduler, fix canceling of reservations for exceeding max share

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308949#comment-14308949
 ] 

Hudson commented on YARN-3101:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #96 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/96/])
YARN-3101. In Fair Scheduler, fix canceling of reservations for exceeding max 
share (Anubhav Dhoot via Sandy Ryza) (sandy: rev 
b6466deac6d5d6344f693144290b46e2bef83a02)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* hadoop-yarn-project/CHANGES.txt


> In Fair Scheduler, fix canceling of reservations for exceeding max share
> 
>
> Key: YARN-3101
> URL: https://issues.apache.org/jira/browse/YARN-3101
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Anubhav Dhoot
>Assignee: Anubhav Dhoot
> Fix For: 2.7.0
>
> Attachments: YARN-3101-Siqi.v1.patch, YARN-3101-Siqi.v2.patch, 
> YARN-3101.001.patch, YARN-3101.002.patch, YARN-3101.003.patch, 
> YARN-3101.003.patch, YARN-3101.004.patch, YARN-3101.004.patch
>
>
> YARN-2811 added fitInMaxShare to validate reservations on a queue, but did 
> not count the reservation in its calculations. It also had the condition 
> reversed, so the test still passed because the two errors cancelled each other out.
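As a hedged illustration of the intent of such a check (not the actual FairScheduler code, which operates on Resource objects and queue state rather than plain numbers), a fitInMaxShare-style validation should count the reservation under consideration and keep the total within the queue's max share:

{code}
// Illustrative only: scalar stand-in for the Resource-based comparison.
final class MaxShareCheckSketch {
  private MaxShareCheckSketch() {
  }

  // The reservation being validated must be included, and the comparison
  // must keep usage + reservation within the queue's max share.
  static boolean fitsInMaxShare(long usedMb, long reservationMb, long maxShareMb) {
    return usedMb + reservationMb <= maxShareMb;
  }
}
{code}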



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1904) Uniform the XXXXNotFound messages from ClientRMService and ApplicationHistoryClientService

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308953#comment-14308953
 ] 

Hudson commented on YARN-1904:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #96 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/96/])
YARN-1904. Ensure exceptions thrown in ClientRMService & 
ApplicationHistoryClientService are uniform when application-attempt is not 
found. Contributed by Zhijie Shen. (acmurthy: rev 
18b2507edaac991e3ed68d2f27eb96f6882137b9)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryClientService.java


> Uniform the NotFound messages from ClientRMService and 
> ApplicationHistoryClientService
> --
>
> Key: YARN-1904
> URL: https://issues.apache.org/jira/browse/YARN-1904
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Zhijie Shen
> Fix For: 2.7.0
>
> Attachments: YARN-1904.1.patch
>
>
> It's good to make ClientRMService and ApplicationHistoryClientService throw 
> NotFoundException with similar messages
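As a hedged sketch of the idea (not the committed patch), both services could build the not-found message through one shared helper so the wording stays identical; ApplicationNotFoundException is the real YARN exception type, while the helper class and its message text are illustrative.

{code}
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException;

final class NotFoundMessagesSketch {
  private NotFoundMessagesSketch() {
  }

  // One shared place to phrase the message keeps ClientRMService and
  // ApplicationHistoryClientService consistent.
  static ApplicationNotFoundException appNotFound(ApplicationId appId) {
    return new ApplicationNotFoundException(
        "Application with id '" + appId + "' doesn't exist.");
  }
}
{code}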



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3149) Typo in message for invalid application id

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308957#comment-14308957
 ] 

Hudson commented on YARN-3149:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #96 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/96/])
YARN-3149. Fix typo in message for invalid application id. Contributed (xgong: 
rev b77ff37686e01b7497d3869fbc62789a5b123c0a)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/ConverterUtils.java
* hadoop-yarn-project/CHANGES.txt


> Typo in message for invalid application id
> --
>
> Key: YARN-3149
> URL: https://issues.apache.org/jira/browse/YARN-3149
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Trivial
> Fix For: 2.7.0
>
> Attachments: YARN-3149.patch, YARN-3149.patch, screenshot-1.png
>
>
> The message shown on the console is wrong when the application id format is wrong.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1582) Capacity Scheduler: add a maximum-allocation-mb setting per queue

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308946#comment-14308946
 ] 

Hudson commented on YARN-1582:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #96 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/96/])
YARN-1582. Capacity Scheduler: add a maximum-allocation-mb setting per queue. 
Contributed by Thomas Graves (jlowe: rev 
69c8a7f45be5c0aa6787b07f328d74f1e2ba5628)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/CapacityScheduler.apt.vm
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerContext.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractCSQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/YarnScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java


> Capacity Scheduler: add a maximum-allocation-mb setting per queue 
> --
>
> Key: YARN-1582
> URL: https://issues.apache.org/jira/browse/YARN-1582
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 3.0.0, 0.23.10, 2.2.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
> Fix For: 2.7.0
>
> Attachments: YARN-1582-branch-0.23.patch, YARN-1582.002.patch, 
> YARN-1582.003.patch
>
>
> We want to allow certain queues to use larger container sizes while limiting 
> other queues to smaller container sizes.  Setting it per queue will help 
> prevent abuse, help limit the impact of reservations, and allow changes in 
> the maximum container size to be rolled out more easily.
> One reason this is needed is more application types are becoming available on 
> yarn and certain applications require more memory to run efficiently. While 
> we want to allow for that we don't want other applications to abuse that and 
> start requesting bigger containers than what they really need.  
> Note that we could have this based on application type, but that might not be 
> totally accurate either since for example you might want to allow certain 
> users on MapReduce to use larger containers, while limiting other users of 
> MapReduce to smaller containers.
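As a hedged sketch of the configuration surface this adds: the per-queue key follows the usual yarn.scheduler.capacity.<queue-path>.* naming, root.large below is a hypothetical queue, and in a real deployment the key lives in capacity-scheduler.xml; plain Configuration calls are used here only to show the key names.

{code}
import org.apache.hadoop.conf.Configuration;

public class PerQueueMaxAllocationSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Cluster-wide ceiling (yarn-site.xml), unchanged by this feature.
    conf.setInt("yarn.scheduler.maximum-allocation-mb", 16384);
    // Tighter ceiling for a hypothetical queue root.large; in practice this
    // key would be set in capacity-scheduler.xml.
    conf.setInt("yarn.scheduler.capacity.root.large.maximum-allocation-mb", 8192);

    // A queue-level value, when present, takes precedence for requests in
    // that queue; otherwise the cluster-wide ceiling applies.
    int queueMax = conf.getInt(
        "yarn.scheduler.capacity.root.large.maximum-allocation-mb",
        conf.getInt("yarn.scheduler.maximum-allocation-mb", 8192));
    System.out.println("max allocation for root.large (MB): " + queueMax);
  }
}
{code}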



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3145) ConcurrentModificationException on CapacityScheduler ParentQueue#getQueueUserAclInfo

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308952#comment-14308952
 ] 

Hudson commented on YARN-3145:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #96 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/96/])
YARN-3145. Fixed ConcurrentModificationException on CapacityScheduler 
ParentQueue#getQueueUserAclInfo. Contributed by Tsuyoshi OZAWA (jianhe: rev 
4641196fe02af5cab3d56a9f3c78875c495dbe03)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java
* hadoop-yarn-project/CHANGES.txt


> ConcurrentModificationException on CapacityScheduler 
> ParentQueue#getQueueUserAclInfo
> 
>
> Key: YARN-3145
> URL: https://issues.apache.org/jira/browse/YARN-3145
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Tsuyoshi OZAWA
> Fix For: 2.7.0
>
> Attachments: YARN-3145.001.patch, YARN-3145.002.patch
>
>
> {code}
> java.util.ConcurrentModificationException(java.util.ConcurrentModificationException
> at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115)
> at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.getQueueUserAclInfo(ParentQueue.java:347)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.getQueueUserAclInfo(ParentQueue.java:348)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getQueueUserAclInfo(CapacityScheduler.java:850)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueUserAcls(ClientRMService.java:844)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueUserAcls(ApplicationClientProtocolPBServiceImpl.java:250)
> at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:335)
> {code}
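A hedged sketch of the failure mode and one common mitigation follows (the committed fix may use a different mechanism); the class and field names below are hypothetical.

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

class QueueAclIterationSketch {
  private final TreeSet<String> childQueues = new TreeSet<String>();

  // Unsafe: if another thread adds or removes a child queue while this loop
  // runs, the TreeSet iterator throws ConcurrentModificationException,
  // matching the stack trace above.
  List<String> collectUnsafe() {
    List<String> names = new ArrayList<String>();
    for (String child : childQueues) {
      names.add(child);
    }
    return names;
  }

  // One common mitigation: hold the queue's lock for the read (or copy the
  // collection first) so readers never iterate a concurrently mutating set.
  synchronized List<String> collectSafe() {
    return new ArrayList<String>(childQueues);
  }
}
{code}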



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3100) Make YARN authorization pluggable

2015-02-06 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308987#comment-14308987
 ] 

Chris Douglas commented on YARN-3100:
-

Looking through {{AbstractCSQueue}} and {{CSQueueUtils}}, it looks like there 
are many misconfigurations that leave queues in an inconsistent state...

> Make YARN authorization pluggable
> -
>
> Key: YARN-3100
> URL: https://issues.apache.org/jira/browse/YARN-3100
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-3100.1.patch, YARN-3100.2.patch
>
>
> The goal is to make the YARN ACL model pluggable so that other authorization 
> tools such as Apache Ranger or Sentry can be integrated.
> Currently, we have 
> - admin ACL
> - queue ACL
> - application ACL
> - timeline domain ACL
> - service ACL
> The proposal is to create a YarnAuthorizationProvider interface. The current 
> implementation will be the default implementation. A Ranger or Sentry plug-in 
> can implement this interface.
> Benefit:
> -  Unify the code base. With the default implementation, we can get rid of 
> each specific ACL manager such as AdminAclManager, ApplicationACLsManager, 
> QueueAclsManager etc.
> - Enable Ranger and Sentry to do authorization for YARN. 
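Purely as a hedged illustration of what a pluggable interface in this spirit could look like (the interface name comes from the proposal above, but the method shapes below are hypothetical and not taken from the attached patches):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

// Hypothetical shape of a YarnAuthorizationProvider-style plug-in point;
// Ranger- or Sentry-backed implementations would supply their own logic.
public interface YarnAuthorizationProviderSketch {
  void init(Configuration conf);

  // True if 'user' may perform 'accessType' (e.g. SUBMIT_APPLICATIONS,
  // ADMINISTER_QUEUE) on the named entity (queue, application, domain, ...).
  boolean checkPermission(String accessType, String entityName,
      UserGroupInformation user);
}
{code}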



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1582) Capacity Scheduler: add a maximum-allocation-mb setting per queue

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309007#comment-14309007
 ] 

Hudson commented on YARN-1582:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #830 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/830/])
YARN-1582. Capacity Scheduler: add a maximum-allocation-mb setting per queue. 
Contributed by Thomas Graves (jlowe: rev 
69c8a7f45be5c0aa6787b07f328d74f1e2ba5628)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/YarnScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/CapacityScheduler.apt.vm
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractCSQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerContext.java


> Capacity Scheduler: add a maximum-allocation-mb setting per queue 
> --
>
> Key: YARN-1582
> URL: https://issues.apache.org/jira/browse/YARN-1582
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 3.0.0, 0.23.10, 2.2.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
> Fix For: 2.7.0
>
> Attachments: YARN-1582-branch-0.23.patch, YARN-1582.002.patch, 
> YARN-1582.003.patch
>
>
> We want to allow certain queues to use larger container sizes while limiting 
> other queues to smaller container sizes.  Setting it per queue will help 
> prevent abuse, help limit the impact of reservations, and allow changes in 
> the maximum container size to be rolled out more easily.
> One reason this is needed is more application types are becoming available on 
> yarn and certain applications require more memory to run efficiently. While 
> we want to allow for that we don't want other applications to abuse that and 
> start requesting bigger containers than what they really need.  
> Note that we could have this based on application type, but that might not be 
> totally accurate either since for example you might want to allow certain 
> users on MapReduce to use larger containers, while limiting other users of 
> MapReduce to smaller containers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1537) TestLocalResourcesTrackerImpl.testLocalResourceCache often failed

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309008#comment-14309008
 ] 

Hudson commented on YARN-1537:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #830 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/830/])
YARN-1537. Fix race condition in 
TestLocalResourcesTrackerImpl.testLocalResourceCache. Contributed by Xuan Gong. 
(acmurthy: rev 02f154a0016b7321bbe5b09f2da44a9b33797c36)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestLocalResourcesTrackerImpl.java


> TestLocalResourcesTrackerImpl.testLocalResourceCache often failed
> -
>
> Key: YARN-1537
> URL: https://issues.apache.org/jira/browse/YARN-1537
> Project: Hadoop YARN
>  Issue Type: Test
>  Components: nodemanager
>Affects Versions: 2.2.0
>Reporter: Hong Shen
>Assignee: Xuan Gong
> Fix For: 2.7.0
>
> Attachments: YARN-1537.1.patch
>
>
> Here is the error log
> {code}
> Results :
> Failed tests: 
>   TestLocalResourcesTrackerImpl.testLocalResourceCache:351 
> Wanted but not invoked:
> eventHandler.handle(
> 
> isA(org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerResourceLocalizedEvent)
> );
> -> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestLocalResourcesTrackerImpl.testLocalResourceCache(TestLocalResourcesTrackerImpl.java:351)
> However, there were other interactions with this mock:
> -> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134)
> -> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3101) In Fair Scheduler, fix canceling of reservations for exceeding max share

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309010#comment-14309010
 ] 

Hudson commented on YARN-3101:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #830 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/830/])
YARN-3101. In Fair Scheduler, fix canceling of reservations for exceeding max 
share (Anubhav Dhoot via Sandy Ryza) (sandy: rev 
b6466deac6d5d6344f693144290b46e2bef83a02)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java


> In Fair Scheduler, fix canceling of reservations for exceeding max share
> 
>
> Key: YARN-3101
> URL: https://issues.apache.org/jira/browse/YARN-3101
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Anubhav Dhoot
>Assignee: Anubhav Dhoot
> Fix For: 2.7.0
>
> Attachments: YARN-3101-Siqi.v1.patch, YARN-3101-Siqi.v2.patch, 
> YARN-3101.001.patch, YARN-3101.002.patch, YARN-3101.003.patch, 
> YARN-3101.003.patch, YARN-3101.004.patch, YARN-3101.004.patch
>
>
> YARN-2811 added fitInMaxShare to validate reservations on a queue, but did 
> not count the reservation in its calculations. It also had the condition 
> reversed, so the test still passed because the two errors cancelled each other out.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3145) ConcurrentModificationException on CapacityScheduler ParentQueue#getQueueUserAclInfo

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309013#comment-14309013
 ] 

Hudson commented on YARN-3145:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #830 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/830/])
YARN-3145. Fixed ConcurrentModificationException on CapacityScheduler 
ParentQueue#getQueueUserAclInfo. Contributed by Tsuyoshi OZAWA (jianhe: rev 
4641196fe02af5cab3d56a9f3c78875c495dbe03)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java


> ConcurrentModificationException on CapacityScheduler 
> ParentQueue#getQueueUserAclInfo
> 
>
> Key: YARN-3145
> URL: https://issues.apache.org/jira/browse/YARN-3145
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Tsuyoshi OZAWA
> Fix For: 2.7.0
>
> Attachments: YARN-3145.001.patch, YARN-3145.002.patch
>
>
> {code}
> java.util.ConcurrentModificationException(java.util.ConcurrentModificationException
> at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115)
> at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.getQueueUserAclInfo(ParentQueue.java:347)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.getQueueUserAclInfo(ParentQueue.java:348)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getQueueUserAclInfo(CapacityScheduler.java:850)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueUserAcls(ClientRMService.java:844)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueUserAcls(ApplicationClientProtocolPBServiceImpl.java:250)
> at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:335)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3149) Typo in message for invalid application id

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309018#comment-14309018
 ] 

Hudson commented on YARN-3149:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #830 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/830/])
YARN-3149. Fix typo in message for invalid application id. Contributed (xgong: 
rev b77ff37686e01b7497d3869fbc62789a5b123c0a)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/ConverterUtils.java


> Typo in message for invalid application id
> --
>
> Key: YARN-3149
> URL: https://issues.apache.org/jira/browse/YARN-3149
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Trivial
> Fix For: 2.7.0
>
> Attachments: YARN-3149.patch, YARN-3149.patch, screenshot-1.png
>
>
> The message shown on the console is wrong when the application id format is wrong.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1904) Uniform the XXXXNotFound messages from ClientRMService and ApplicationHistoryClientService

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309014#comment-14309014
 ] 

Hudson commented on YARN-1904:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #830 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/830/])
YARN-1904. Ensure exceptions thrown in ClientRMService & 
ApplicationHistoryClientService are uniform when application-attempt is not 
found. Contributed by Zhijie Shen. (acmurthy: rev 
18b2507edaac991e3ed68d2f27eb96f6882137b9)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryClientService.java
* hadoop-yarn-project/CHANGES.txt


> Uniform the NotFound messages from ClientRMService and 
> ApplicationHistoryClientService
> --
>
> Key: YARN-1904
> URL: https://issues.apache.org/jira/browse/YARN-1904
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Zhijie Shen
> Fix For: 2.7.0
>
> Attachments: YARN-1904.1.patch
>
>
> It's good to make ClientRMService and ApplicationHistoryClientService throw 
> NotFoundException with similar messages



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3101) In Fair Scheduler, fix canceling of reservations for exceeding max share

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309173#comment-14309173
 ] 

Hudson commented on YARN-3101:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #93 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/93/])
YARN-3101. In Fair Scheduler, fix canceling of reservations for exceeding max 
share (Anubhav Dhoot via Sandy Ryza) (sandy: rev 
b6466deac6d5d6344f693144290b46e2bef83a02)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java


> In Fair Scheduler, fix canceling of reservations for exceeding max share
> 
>
> Key: YARN-3101
> URL: https://issues.apache.org/jira/browse/YARN-3101
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Anubhav Dhoot
>Assignee: Anubhav Dhoot
> Fix For: 2.7.0
>
> Attachments: YARN-3101-Siqi.v1.patch, YARN-3101-Siqi.v2.patch, 
> YARN-3101.001.patch, YARN-3101.002.patch, YARN-3101.003.patch, 
> YARN-3101.003.patch, YARN-3101.004.patch, YARN-3101.004.patch
>
>
> YARN-2811 added fitInMaxShare to validate reservations on a queue, but did 
> not count the reservation in its calculations. It also had the condition 
> reversed, so the test still passed because the two errors cancelled each other out.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1904) Uniform the XXXXNotFound messages from ClientRMService and ApplicationHistoryClientService

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309177#comment-14309177
 ] 

Hudson commented on YARN-1904:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #93 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/93/])
YARN-1904. Ensure exceptions thrown in ClientRMService & 
ApplicationHistoryClientService are uniform when application-attempt is not 
found. Contributed by Zhijie Shen. (acmurthy: rev 
18b2507edaac991e3ed68d2f27eb96f6882137b9)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryClientService.java
* hadoop-yarn-project/CHANGES.txt


> Uniform the NotFound messages from ClientRMService and 
> ApplicationHistoryClientService
> --
>
> Key: YARN-1904
> URL: https://issues.apache.org/jira/browse/YARN-1904
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Zhijie Shen
> Fix For: 2.7.0
>
> Attachments: YARN-1904.1.patch
>
>
> It's good to make ClientRMService and ApplicationHistoryClientService throw 
> NotFoundException with similar messages



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1537) TestLocalResourcesTrackerImpl.testLocalResourceCache often failed

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309171#comment-14309171
 ] 

Hudson commented on YARN-1537:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #93 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/93/])
YARN-1537. Fix race condition in 
TestLocalResourcesTrackerImpl.testLocalResourceCache. Contributed by Xuan Gong. 
(acmurthy: rev 02f154a0016b7321bbe5b09f2da44a9b33797c36)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestLocalResourcesTrackerImpl.java
* hadoop-yarn-project/CHANGES.txt


> TestLocalResourcesTrackerImpl.testLocalResourceCache often failed
> -
>
> Key: YARN-1537
> URL: https://issues.apache.org/jira/browse/YARN-1537
> Project: Hadoop YARN
>  Issue Type: Test
>  Components: nodemanager
>Affects Versions: 2.2.0
>Reporter: Hong Shen
>Assignee: Xuan Gong
> Fix For: 2.7.0
>
> Attachments: YARN-1537.1.patch
>
>
> Here is the error log
> {code}
> Results :
> Failed tests: 
>   TestLocalResourcesTrackerImpl.testLocalResourceCache:351 
> Wanted but not invoked:
> eventHandler.handle(
> 
> isA(org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerResourceLocalizedEvent)
> );
> -> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestLocalResourcesTrackerImpl.testLocalResourceCache(TestLocalResourcesTrackerImpl.java:351)
> However, there were other interactions with this mock:
> -> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134)
> -> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3149) Typo in message for invalid application id

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309180#comment-14309180
 ] 

Hudson commented on YARN-3149:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #93 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/93/])
YARN-3149. Fix typo in message for invalid application id. Contributed (xgong: 
rev b77ff37686e01b7497d3869fbc62789a5b123c0a)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/ConverterUtils.java
* hadoop-yarn-project/CHANGES.txt


> Typo in message for invalid application id
> --
>
> Key: YARN-3149
> URL: https://issues.apache.org/jira/browse/YARN-3149
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Trivial
> Fix For: 2.7.0
>
> Attachments: YARN-3149.patch, YARN-3149.patch, screenshot-1.png
>
>
> The message shown on the console is wrong when the application id format is wrong.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1582) Capacity Scheduler: add a maximum-allocation-mb setting per queue

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309170#comment-14309170
 ] 

Hudson commented on YARN-1582:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #93 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/93/])
YARN-1582. Capacity Scheduler: add a maximum-allocation-mb setting per queue. 
Contributed by Thomas Graves (jlowe: rev 
69c8a7f45be5c0aa6787b07f328d74f1e2ba5628)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/YarnScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/CapacityScheduler.apt.vm
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerContext.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractCSQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java


> Capacity Scheduler: add a maximum-allocation-mb setting per queue 
> --
>
> Key: YARN-1582
> URL: https://issues.apache.org/jira/browse/YARN-1582
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 3.0.0, 0.23.10, 2.2.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
> Fix For: 2.7.0
>
> Attachments: YARN-1582-branch-0.23.patch, YARN-1582.002.patch, 
> YARN-1582.003.patch
>
>
> We want to allow certain queues to use larger container sizes while limiting 
> other queues to smaller container sizes.  Setting it per queue will help 
> prevent abuse, help limit the impact of reservations, and allow changes in 
> the maximum container size to be rolled out more easily.
> One reason this is needed is more application types are becoming available on 
> yarn and certain applications require more memory to run efficiently. While 
> we want to allow for that we don't want other applications to abuse that and 
> start requesting bigger containers than what they really need.  
> Note that we could have this based on application type, but that might not be 
> totally accurate either since for example you might want to allow certain 
> users on MapReduce to use larger containers, while limiting other users of 
> MapReduce to smaller containers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3145) ConcurrentModificationException on CapacityScheduler ParentQueue#getQueueUserAclInfo

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309176#comment-14309176
 ] 

Hudson commented on YARN-3145:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #93 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/93/])
YARN-3145. Fixed ConcurrentModificationException on CapacityScheduler 
ParentQueue#getQueueUserAclInfo. Contributed by Tsuyoshi OZAWA (jianhe: rev 
4641196fe02af5cab3d56a9f3c78875c495dbe03)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java


> ConcurrentModificationException on CapacityScheduler 
> ParentQueue#getQueueUserAclInfo
> 
>
> Key: YARN-3145
> URL: https://issues.apache.org/jira/browse/YARN-3145
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Tsuyoshi OZAWA
> Fix For: 2.7.0
>
> Attachments: YARN-3145.001.patch, YARN-3145.002.patch
>
>
> {code}
> java.util.ConcurrentModificationException(java.util.ConcurrentModificationException
> at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115)
> at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.getQueueUserAclInfo(ParentQueue.java:347)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.getQueueUserAclInfo(ParentQueue.java:348)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getQueueUserAclInfo(CapacityScheduler.java:850)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueUserAcls(ClientRMService.java:844)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueUserAcls(ApplicationClientProtocolPBServiceImpl.java:250)
> at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:335)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1582) Capacity Scheduler: add a maximum-allocation-mb setting per queue

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309195#comment-14309195
 ] 

Hudson commented on YARN-1582:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #2028 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2028/])
YARN-1582. Capacity Scheduler: add a maximum-allocation-mb setting per queue. 
Contributed by Thomas Graves (jlowe: rev 
69c8a7f45be5c0aa6787b07f328d74f1e2ba5628)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/CapacityScheduler.apt.vm
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractCSQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerContext.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/YarnScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java


> Capacity Scheduler: add a maximum-allocation-mb setting per queue 
> --
>
> Key: YARN-1582
> URL: https://issues.apache.org/jira/browse/YARN-1582
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 3.0.0, 0.23.10, 2.2.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
> Fix For: 2.7.0
>
> Attachments: YARN-1582-branch-0.23.patch, YARN-1582.002.patch, 
> YARN-1582.003.patch
>
>
> We want to allow certain queues to use larger container sizes while limiting 
> other queues to smaller container sizes.  Setting it per queue will help 
> prevent abuse, help limit the impact of reservations, and allow changes in 
> the maximum container size to be rolled out more easily.
> One reason this is needed is more application types are becoming available on 
> yarn and certain applications require more memory to run efficiently. While 
> we want to allow for that we don't want other applications to abuse that and 
> start requesting bigger containers than what they really need.  
> Note that we could have this based on application type, but that might not be 
> totally accurate either since for example you might want to allow certain 
> users on MapReduce to use larger containers, while limiting other users of 
> MapReduce to smaller containers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3145) ConcurrentModificationException on CapacityScheduler ParentQueue#getQueueUserAclInfo

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309202#comment-14309202
 ] 

Hudson commented on YARN-3145:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #2028 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2028/])
YARN-3145. Fixed ConcurrentModificationException on CapacityScheduler 
ParentQueue#getQueueUserAclInfo. Contributed by Tsuyoshi OZAWA (jianhe: rev 
4641196fe02af5cab3d56a9f3c78875c495dbe03)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java


> ConcurrentModificationException on CapacityScheduler 
> ParentQueue#getQueueUserAclInfo
> 
>
> Key: YARN-3145
> URL: https://issues.apache.org/jira/browse/YARN-3145
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Tsuyoshi OZAWA
> Fix For: 2.7.0
>
> Attachments: YARN-3145.001.patch, YARN-3145.002.patch
>
>
> {code}
> java.util.ConcurrentModificationException(java.util.ConcurrentModificationException
> at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115)
> at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.getQueueUserAclInfo(ParentQueue.java:347)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.getQueueUserAclInfo(ParentQueue.java:348)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getQueueUserAclInfo(CapacityScheduler.java:850)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueUserAcls(ClientRMService.java:844)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueUserAcls(ApplicationClientProtocolPBServiceImpl.java:250)
> at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:335)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3149) Typo in message for invalid application id

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309206#comment-14309206
 ] 

Hudson commented on YARN-3149:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #2028 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2028/])
YARN-3149. Fix typo in message for invalid application id. Contributed (xgong: 
rev b77ff37686e01b7497d3869fbc62789a5b123c0a)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/ConverterUtils.java


> Typo in message for invalid application id
> --
>
> Key: YARN-3149
> URL: https://issues.apache.org/jira/browse/YARN-3149
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Trivial
> Fix For: 2.7.0
>
> Attachments: YARN-3149.patch, YARN-3149.patch, screenshot-1.png
>
>
> Message in console wrong when application id format wrong



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1904) Uniform the XXXXNotFound messages from ClientRMService and ApplicationHistoryClientService

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309203#comment-14309203
 ] 

Hudson commented on YARN-1904:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #2028 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2028/])
YARN-1904. Ensure exceptions thrown in ClientRMService & 
ApplicationHistoryClientService are uniform when application-attempt is not 
found. Contributed by Zhijie Shen. (acmurthy: rev 
18b2507edaac991e3ed68d2f27eb96f6882137b9)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryClientService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
* hadoop-yarn-project/CHANGES.txt


> Uniform the NotFound messages from ClientRMService and 
> ApplicationHistoryClientService
> --
>
> Key: YARN-1904
> URL: https://issues.apache.org/jira/browse/YARN-1904
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Zhijie Shen
> Fix For: 2.7.0
>
> Attachments: YARN-1904.1.patch
>
>
> It's good to make ClientRMService and ApplicationHistoryClientService throw 
> NotFoundException with similar messages



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1537) TestLocalResourcesTrackerImpl.testLocalResourceCache often failed

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309196#comment-14309196
 ] 

Hudson commented on YARN-1537:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #2028 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2028/])
YARN-1537. Fix race condition in 
TestLocalResourcesTrackerImpl.testLocalResourceCache. Contributed by Xuan Gong. 
(acmurthy: rev 02f154a0016b7321bbe5b09f2da44a9b33797c36)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestLocalResourcesTrackerImpl.java
* hadoop-yarn-project/CHANGES.txt


> TestLocalResourcesTrackerImpl.testLocalResourceCache often failed
> -
>
> Key: YARN-1537
> URL: https://issues.apache.org/jira/browse/YARN-1537
> Project: Hadoop YARN
>  Issue Type: Test
>  Components: nodemanager
>Affects Versions: 2.2.0
>Reporter: Hong Shen
>Assignee: Xuan Gong
> Fix For: 2.7.0
>
> Attachments: YARN-1537.1.patch
>
>
> Here is the error log
> {code}
> Results :
> Failed tests: 
>   TestLocalResourcesTrackerImpl.testLocalResourceCache:351 
> Wanted but not invoked:
> eventHandler.handle(
> 
> isA(org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerResourceLocalizedEvent)
> );
> -> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestLocalResourcesTrackerImpl.testLocalResourceCache(TestLocalResourcesTrackerImpl.java:351)
> However, there were other interactions with this mock:
> -> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134)
> -> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3101) In Fair Scheduler, fix canceling of reservations for exceeding max share

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309199#comment-14309199
 ] 

Hudson commented on YARN-3101:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #2028 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2028/])
YARN-3101. In Fair Scheduler, fix canceling of reservations for exceeding max 
share (Anubhav Dhoot via Sandy Ryza) (sandy: rev 
b6466deac6d5d6344f693144290b46e2bef83a02)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java


> In Fair Scheduler, fix canceling of reservations for exceeding max share
> 
>
> Key: YARN-3101
> URL: https://issues.apache.org/jira/browse/YARN-3101
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Anubhav Dhoot
>Assignee: Anubhav Dhoot
> Fix For: 2.7.0
>
> Attachments: YARN-3101-Siqi.v1.patch, YARN-3101-Siqi.v2.patch, 
> YARN-3101.001.patch, YARN-3101.002.patch, YARN-3101.003.patch, 
> YARN-3101.003.patch, YARN-3101.004.patch, YARN-3101.004.patch
>
>
> YARN-2811 added fitInMaxShare to validate reservations on a queue, but it did 
> not count the reservation itself in its calculations. It also had the condition 
> reversed, so the test still passed because the two bugs cancelled each other out. 
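
For context, a minimal sketch of the kind of check being fixed: the pending reservation has to be counted on top of current usage when testing against the max share, and the comparison has to go the right way. The class and method names below are illustrative assumptions, not the actual FairScheduler code.

{code}
/**
 * Illustrative sketch of a corrected "fits in max share" check.
 * Names are assumptions for this example, not the real FairScheduler API.
 */
public class MaxShareCheckSketch {

  static class Queue {
    final String name;
    final long usedMb;      // resources already allocated to this queue
    final long maxShareMb;  // configured max share for this queue
    final Queue parent;     // null for the root queue

    Queue(String name, long usedMb, long maxShareMb, Queue parent) {
      this.name = name;
      this.usedMb = usedMb;
      this.maxShareMb = maxShareMb;
      this.parent = parent;
    }
  }

  /**
   * The reservation being considered must be added to current usage, and the
   * check must hold for the queue and all of its ancestors.
   */
  static boolean fitsInMaxShare(Queue queue, long reservationMb) {
    for (Queue q = queue; q != null; q = q.parent) {
      if (q.usedMb + reservationMb > q.maxShareMb) {
        return false;  // reservation would push this queue over its max share
      }
    }
    return true;
  }

  public static void main(String[] args) {
    Queue root = new Queue("root", 6144, 16384, null);
    Queue leaf = new Queue("root.q1", 3072, 4096, root);
    System.out.println(fitsInMaxShare(leaf, 1024));  // true: 3072 + 1024 <= 4096
    System.out.println(fitsInMaxShare(leaf, 2048));  // false: would exceed 4096
  }
}
{code}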



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1537) TestLocalResourcesTrackerImpl.testLocalResourceCache often failed

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309253#comment-14309253
 ] 

Hudson commented on YARN-1537:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #97 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/97/])
YARN-1537. Fix race condition in 
TestLocalResourcesTrackerImpl.testLocalResourceCache. Contributed by Xuan Gong. 
(acmurthy: rev 02f154a0016b7321bbe5b09f2da44a9b33797c36)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestLocalResourcesTrackerImpl.java


> TestLocalResourcesTrackerImpl.testLocalResourceCache often failed
> -
>
> Key: YARN-1537
> URL: https://issues.apache.org/jira/browse/YARN-1537
> Project: Hadoop YARN
>  Issue Type: Test
>  Components: nodemanager
>Affects Versions: 2.2.0
>Reporter: Hong Shen
>Assignee: Xuan Gong
> Fix For: 2.7.0
>
> Attachments: YARN-1537.1.patch
>
>
> Here is the error log
> {code}
> Results :
> Failed tests: 
>   TestLocalResourcesTrackerImpl.testLocalResourceCache:351 
> Wanted but not invoked:
> eventHandler.handle(
> 
> isA(org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerResourceLocalizedEvent)
> );
> -> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestLocalResourcesTrackerImpl.testLocalResourceCache(TestLocalResourcesTrackerImpl.java:351)
> However, there were other interactions with this mock:
> -> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134)
> -> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1582) Capacity Scheduler: add a maximum-allocation-mb setting per queue

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309252#comment-14309252
 ] 

Hudson commented on YARN-1582:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #97 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/97/])
YARN-1582. Capacity Scheduler: add a maximum-allocation-mb setting per queue. 
Contributed by Thomas Graves (jlowe: rev 
69c8a7f45be5c0aa6787b07f328d74f1e2ba5628)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractCSQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/YarnScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/CapacityScheduler.apt.vm
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerContext.java


> Capacity Scheduler: add a maximum-allocation-mb setting per queue 
> --
>
> Key: YARN-1582
> URL: https://issues.apache.org/jira/browse/YARN-1582
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 3.0.0, 0.23.10, 2.2.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
> Fix For: 2.7.0
>
> Attachments: YARN-1582-branch-0.23.patch, YARN-1582.002.patch, 
> YARN-1582.003.patch
>
>
> We want to allow certain queues to use larger container sizes while limiting 
> other queues to smaller container sizes.  Setting it per queue will help 
> prevent abuse, help limit the impact of reservations, and allow changes in 
> the maximum container size to be rolled out more easily.
> One reason this is needed is that more application types are becoming available 
> on YARN and certain applications require more memory to run efficiently. While 
> we want to allow for that, we don't want other applications to abuse it and 
> start requesting bigger containers than they really need.  
> Note that we could have this based on application type, but that might not be 
> totally accurate either since, for example, you might want to allow certain 
> users on MapReduce to use larger containers, while limiting other users of 
> MapReduce to smaller containers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1904) Uniform the XXXXNotFound messages from ClientRMService and ApplicationHistoryClientService

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309259#comment-14309259
 ] 

Hudson commented on YARN-1904:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #97 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/97/])
YARN-1904. Ensure exceptions thrown in ClientRMService & 
ApplicationHistoryClientService are uniform when application-attempt is not 
found. Contributed by Zhijie Shen. (acmurthy: rev 
18b2507edaac991e3ed68d2f27eb96f6882137b9)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryClientService.java
* hadoop-yarn-project/CHANGES.txt


> Uniform the NotFound messages from ClientRMService and 
> ApplicationHistoryClientService
> --
>
> Key: YARN-1904
> URL: https://issues.apache.org/jira/browse/YARN-1904
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Zhijie Shen
> Fix For: 2.7.0
>
> Attachments: YARN-1904.1.patch
>
>
> It's good to make ClientRMService and ApplicationHistoryClientService throw 
> NotFoundException with similar messages



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3101) In Fair Scheduler, fix canceling of reservations for exceeding max share

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309255#comment-14309255
 ] 

Hudson commented on YARN-3101:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #97 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/97/])
YARN-3101. In Fair Scheduler, fix canceling of reservations for exceeding max 
share (Anubhav Dhoot via Sandy Ryza) (sandy: rev 
b6466deac6d5d6344f693144290b46e2bef83a02)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* hadoop-yarn-project/CHANGES.txt


> In Fair Scheduler, fix canceling of reservations for exceeding max share
> 
>
> Key: YARN-3101
> URL: https://issues.apache.org/jira/browse/YARN-3101
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Anubhav Dhoot
>Assignee: Anubhav Dhoot
> Fix For: 2.7.0
>
> Attachments: YARN-3101-Siqi.v1.patch, YARN-3101-Siqi.v2.patch, 
> YARN-3101.001.patch, YARN-3101.002.patch, YARN-3101.003.patch, 
> YARN-3101.003.patch, YARN-3101.004.patch, YARN-3101.004.patch
>
>
> YARN-2811 added fitInMaxShare to validate reservations on a queue, but it did 
> not count the reservation itself in its calculations. It also had the condition 
> reversed, so the test still passed because the two bugs cancelled each other out. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3149) Typo in message for invalid application id

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309262#comment-14309262
 ] 

Hudson commented on YARN-3149:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #97 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/97/])
YARN-3149. Fix typo in message for invalid application id. Contributed (xgong: 
rev b77ff37686e01b7497d3869fbc62789a5b123c0a)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/ConverterUtils.java


> Typo in message for invalid application id
> --
>
> Key: YARN-3149
> URL: https://issues.apache.org/jira/browse/YARN-3149
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Trivial
> Fix For: 2.7.0
>
> Attachments: YARN-3149.patch, YARN-3149.patch, screenshot-1.png
>
>
> Message in console wrong when application id format wrong



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3145) ConcurrentModificationException on CapacityScheduler ParentQueue#getQueueUserAclInfo

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309258#comment-14309258
 ] 

Hudson commented on YARN-3145:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #97 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/97/])
YARN-3145. Fixed ConcurrentModificationException on CapacityScheduler 
ParentQueue#getQueueUserAclInfo. Contributed by Tsuyoshi OZAWA (jianhe: rev 
4641196fe02af5cab3d56a9f3c78875c495dbe03)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java


> ConcurrentModificationException on CapacityScheduler 
> ParentQueue#getQueueUserAclInfo
> 
>
> Key: YARN-3145
> URL: https://issues.apache.org/jira/browse/YARN-3145
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Tsuyoshi OZAWA
> Fix For: 2.7.0
>
> Attachments: YARN-3145.001.patch, YARN-3145.002.patch
>
>
> {code}
> java.util.ConcurrentModificationException(java.util.ConcurrentModificationException
> at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115)
> at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.getQueueUserAclInfo(ParentQueue.java:347)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.getQueueUserAclInfo(ParentQueue.java:348)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getQueueUserAclInfo(CapacityScheduler.java:850)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueUserAcls(ClientRMService.java:844)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueUserAcls(ApplicationClientProtocolPBServiceImpl.java:250)
> at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:335)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2809) Implement workaround for linux kernel panic when removing cgroup

2015-02-06 Thread Nathan Roberts (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nathan Roberts updated YARN-2809:
-
Attachment: YARN-2809-v2.patch

upmerge to latest trunk

> Implement workaround for linux kernel panic when removing cgroup
> 
>
> Key: YARN-2809
> URL: https://issues.apache.org/jira/browse/YARN-2809
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.6.0
> Environment:  RHEL 6.4
>Reporter: Nathan Roberts
>Assignee: Nathan Roberts
> Attachments: YARN-2809-v2.patch, YARN-2809.patch
>
>
> Some older versions of Linux have a bug that can cause a kernel panic when 
> the LCE attempts to remove a cgroup. It is a race condition, so it is fairly 
> rare, but on a cluster of a few thousand nodes it can result in a couple of 
> panics per day.
> This is the commit that likely (haven't verified) fixes the problem in Linux: 
> https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?h=linux-2.6.39.y&id=068c5cc5ac7414a8e9eb7856b4bf3cc4d4744267
> Details will be added in comments.
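
As a rough illustration of the shape such a workaround can take (not necessarily what the attached patch does), cgroup removal can be retried with short pauses so the node manager backs off while the kernel finishes releasing the cgroup; the path, timings and method name below are assumptions.

{code}
import java.io.File;

/**
 * Illustrative sketch only: retry cgroup directory removal with short
 * pauses instead of deleting it exactly once. Not the actual LCE code.
 */
public class CgroupDeleteSketch {

  /** Returns true if the cgroup directory was removed within the timeout. */
  static boolean deleteCgroupWithRetries(File cgroupDir, long timeoutMs)
      throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (System.currentTimeMillis() < deadline) {
      // rmdir on a cgroup only succeeds once the kernel has released all tasks.
      if (!cgroupDir.exists() || cgroupDir.delete()) {
        return true;
      }
      Thread.sleep(20);  // back off briefly before trying again
    }
    return false;  // give up and let a later cleanup pass handle it
  }

  public static void main(String[] args) throws InterruptedException {
    File dir = new File("/sys/fs/cgroup/cpu/hadoop-yarn/container_0001");
    System.out.println("deleted: " + deleteCgroupWithRetries(dir, 1000));
  }
}
{code}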



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2246) Job History Link in RM UI is redirecting to the URL which contains Job Id twice

2015-02-06 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309280#comment-14309280
 ] 

Jason Lowe commented on YARN-2246:
--

[~devaraj.k] are you still planning to address this issue?  It's a benign 
problem with the history server UI since it ignores the extra components of the 
URL, but there are some use cases with Tez and other instances where this needs 
to be fixed.

> Job History Link in RM UI is redirecting to the URL which contains Job Id 
> twice
> ---
>
> Key: YARN-2246
> URL: https://issues.apache.org/jira/browse/YARN-2246
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 3.0.0, 0.23.11, 2.5.0
>Reporter: Devaraj K
>Assignee: Devaraj K
> Attachments: MAPREDUCE-4064-1.patch, MAPREDUCE-4064.patch
>
>
> {code:xml}
> http://xx.x.x.x:19888/jobhistory/job/job_1332435449546_0001/jobhistory/job/job_1332435449546_0001
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1537) TestLocalResourcesTrackerImpl.testLocalResourceCache often failed

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309302#comment-14309302
 ] 

Hudson commented on YARN-1537:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2047 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2047/])
YARN-1537. Fix race condition in 
TestLocalResourcesTrackerImpl.testLocalResourceCache. Contributed by Xuan Gong. 
(acmurthy: rev 02f154a0016b7321bbe5b09f2da44a9b33797c36)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestLocalResourcesTrackerImpl.java
* hadoop-yarn-project/CHANGES.txt


> TestLocalResourcesTrackerImpl.testLocalResourceCache often failed
> -
>
> Key: YARN-1537
> URL: https://issues.apache.org/jira/browse/YARN-1537
> Project: Hadoop YARN
>  Issue Type: Test
>  Components: nodemanager
>Affects Versions: 2.2.0
>Reporter: Hong Shen
>Assignee: Xuan Gong
> Fix For: 2.7.0
>
> Attachments: YARN-1537.1.patch
>
>
> Here is the error log
> {code}
> Results :
> Failed tests: 
>   TestLocalResourcesTrackerImpl.testLocalResourceCache:351 
> Wanted but not invoked:
> eventHandler.handle(
> 
> isA(org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerResourceLocalizedEvent)
> );
> -> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestLocalResourcesTrackerImpl.testLocalResourceCache(TestLocalResourcesTrackerImpl.java:351)
> However, there were other interactions with this mock:
> -> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134)
> -> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3145) ConcurrentModificationException on CapacityScheduler ParentQueue#getQueueUserAclInfo

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309307#comment-14309307
 ] 

Hudson commented on YARN-3145:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2047 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2047/])
YARN-3145. Fixed ConcurrentModificationException on CapacityScheduler 
ParentQueue#getQueueUserAclInfo. Contributed by Tsuyoshi OZAWA (jianhe: rev 
4641196fe02af5cab3d56a9f3c78875c495dbe03)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java


> ConcurrentModificationException on CapacityScheduler 
> ParentQueue#getQueueUserAclInfo
> 
>
> Key: YARN-3145
> URL: https://issues.apache.org/jira/browse/YARN-3145
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Tsuyoshi OZAWA
> Fix For: 2.7.0
>
> Attachments: YARN-3145.001.patch, YARN-3145.002.patch
>
>
> {code}
> java.util.ConcurrentModificationException(java.util.ConcurrentModificationException
> at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115)
> at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.getQueueUserAclInfo(ParentQueue.java:347)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.getQueueUserAclInfo(ParentQueue.java:348)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getQueueUserAclInfo(CapacityScheduler.java:850)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueUserAcls(ClientRMService.java:844)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueUserAcls(ApplicationClientProtocolPBServiceImpl.java:250)
> at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:335)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3101) In Fair Scheduler, fix canceling of reservations for exceeding max share

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309304#comment-14309304
 ] 

Hudson commented on YARN-3101:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2047 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2047/])
YARN-3101. In Fair Scheduler, fix canceling of reservations for exceeding max 
share (Anubhav Dhoot via Sandy Ryza) (sandy: rev 
b6466deac6d5d6344f693144290b46e2bef83a02)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* hadoop-yarn-project/CHANGES.txt


> In Fair Scheduler, fix canceling of reservations for exceeding max share
> 
>
> Key: YARN-3101
> URL: https://issues.apache.org/jira/browse/YARN-3101
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Anubhav Dhoot
>Assignee: Anubhav Dhoot
> Fix For: 2.7.0
>
> Attachments: YARN-3101-Siqi.v1.patch, YARN-3101-Siqi.v2.patch, 
> YARN-3101.001.patch, YARN-3101.002.patch, YARN-3101.003.patch, 
> YARN-3101.003.patch, YARN-3101.004.patch, YARN-3101.004.patch
>
>
> YARN-2811 added fitInMaxShare to validate reservations on a queue, but it did 
> not count the reservation itself in its calculations. It also had the condition 
> reversed, so the test still passed because the two bugs cancelled each other out. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1582) Capacity Scheduler: add a maximum-allocation-mb setting per queue

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309301#comment-14309301
 ] 

Hudson commented on YARN-1582:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2047 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2047/])
YARN-1582. Capacity Scheduler: add a maximum-allocation-mb setting per queue. 
Contributed by Thomas Graves (jlowe: rev 
69c8a7f45be5c0aa6787b07f328d74f1e2ba5628)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerContext.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/YarnScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/CapacityScheduler.apt.vm
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractCSQueue.java


> Capacity Scheduler: add a maximum-allocation-mb setting per queue 
> --
>
> Key: YARN-1582
> URL: https://issues.apache.org/jira/browse/YARN-1582
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 3.0.0, 0.23.10, 2.2.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
> Fix For: 2.7.0
>
> Attachments: YARN-1582-branch-0.23.patch, YARN-1582.002.patch, 
> YARN-1582.003.patch
>
>
> We want to allow certain queues to use larger container sizes while limiting 
> other queues to smaller container sizes.  Setting it per queue will help 
> prevent abuse, help limit the impact of reservations, and allow changes in 
> the maximum container size to be rolled out more easily.
> One reason this is needed is that more application types are becoming available 
> on YARN and certain applications require more memory to run efficiently. While 
> we want to allow for that, we don't want other applications to abuse it and 
> start requesting bigger containers than they really need.  
> Note that we could have this based on application type, but that might not be 
> totally accurate either since, for example, you might want to allow certain 
> users on MapReduce to use larger containers, while limiting other users of 
> MapReduce to smaller containers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3149) Typo in message for invalid application id

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309311#comment-14309311
 ] 

Hudson commented on YARN-3149:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2047 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2047/])
YARN-3149. Fix typo in message for invalid application id. Contributed (xgong: 
rev b77ff37686e01b7497d3869fbc62789a5b123c0a)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/ConverterUtils.java


> Typo in message for invalid application id
> --
>
> Key: YARN-3149
> URL: https://issues.apache.org/jira/browse/YARN-3149
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Trivial
> Fix For: 2.7.0
>
> Attachments: YARN-3149.patch, YARN-3149.patch, screenshot-1.png
>
>
> Message in console wrong when application id format wrong



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1904) Uniform the XXXXNotFound messages from ClientRMService and ApplicationHistoryClientService

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309308#comment-14309308
 ] 

Hudson commented on YARN-1904:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2047 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2047/])
YARN-1904. Ensure exceptions thrown in ClientRMService & 
ApplicationHistoryClientService are uniform when application-attempt is not 
found. Contributed by Zhijie Shen. (acmurthy: rev 
18b2507edaac991e3ed68d2f27eb96f6882137b9)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryClientService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java


> Uniform the NotFound messages from ClientRMService and 
> ApplicationHistoryClientService
> --
>
> Key: YARN-1904
> URL: https://issues.apache.org/jira/browse/YARN-1904
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Zhijie Shen
> Fix For: 2.7.0
>
> Attachments: YARN-1904.1.patch
>
>
> It's good to make ClientRMService and ApplicationHistoryClientService throw 
> NotFoundException with similar messages



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (YARN-1126) Add validation of users input nodes-states options to nodes CLI

2015-02-06 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy reopened YARN-1126:
-

I'm re-opening this to commit the addendum patch from YARN-905 
(https://issues.apache.org/jira/secure/attachment/12606009/YARN-905-addendum.patch)
 since the other jira already went out in 2.3.0.

Targeting this for 2.7.0.

> Add validation of users input nodes-states options to nodes CLI
> ---
>
> Key: YARN-1126
> URL: https://issues.apache.org/jira/browse/YARN-1126
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Wei Yan
>Assignee: Wei Yan
>
> Follow the discussion in YARN-905.
> (1) case-insensitive checks for "all".
> (2) validation of user input: exit with a non-zero code and print all valid 
> states when the user gives an invalid state.
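
A minimal sketch of the validation being asked for: accept "all" case-insensitively, and reject an unknown state with a non-zero exit code plus a list of the valid states. The enum and method names are illustrative stand-ins, not the actual nodes CLI code.

{code}
import java.util.Arrays;

/**
 * Illustrative sketch of validating a --states argument for a nodes CLI.
 * The enum below stands in for the real NodeState values.
 */
public class NodeStatesOptionSketch {

  enum NodeState { NEW, RUNNING, UNHEALTHY, DECOMMISSIONED, LOST, REBOOTED }

  static NodeState[] parseStates(String arg) {
    // (1) case-insensitive check for "all"
    if ("all".equalsIgnoreCase(arg.trim())) {
      return NodeState.values();
    }
    String[] parts = arg.split(",");
    NodeState[] states = new NodeState[parts.length];
    for (int i = 0; i < parts.length; i++) {
      try {
        states[i] = NodeState.valueOf(parts[i].trim().toUpperCase());
      } catch (IllegalArgumentException e) {
        // (2) invalid input: print all valid states and exit non-zero
        System.err.println("Invalid node state: " + parts[i].trim()
            + ". Valid states are: " + Arrays.toString(NodeState.values())
            + " or ALL");
        System.exit(1);
      }
    }
    return states;
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(parseStates("running,lost")));
    System.out.println(Arrays.toString(parseStates("All")));
  }
}
{code}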



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2809) Implement workaround for linux kernel panic when removing cgroup

2015-02-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309343#comment-14309343
 ] 

Hadoop QA commented on YARN-2809:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12697032/YARN-2809-v2.patch
  against trunk revision 1425e3d.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6535//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6535//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-nodemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6535//console

This message is automatically generated.

> Implement workaround for linux kernel panic when removing cgroup
> 
>
> Key: YARN-2809
> URL: https://issues.apache.org/jira/browse/YARN-2809
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.6.0
> Environment:  RHEL 6.4
>Reporter: Nathan Roberts
>Assignee: Nathan Roberts
> Attachments: YARN-2809-v2.patch, YARN-2809.patch
>
>
> Some older versions of Linux have a bug that can cause a kernel panic when 
> the LCE attempts to remove a cgroup. It is a race condition, so it is fairly 
> rare, but on a cluster of a few thousand nodes it can result in a couple of 
> panics per day.
> This is the commit that likely (haven't verified) fixes the problem in Linux: 
> https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?h=linux-2.6.39.y&id=068c5cc5ac7414a8e9eb7856b4bf3cc4d4744267
> Details will be added in comments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3144) Configuration for making delegation token failures to timeline server not-fatal

2015-02-06 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309344#comment-14309344
 ] 

Jason Lowe commented on YARN-3144:
--

Thanks for updating the patch.  Comments:
* The added test no longer mocks the TimelineClient as it did before?  The 
test requires the timeline client to throw to work properly, and without the 
mock we could accidentally connect to a real timeline server (see the sketch 
after this list).
* Nit: Does timelineServicesBestEffort need to be visible anymore?
* Nit: Reading the doc string for the property in yarn-default.xml implies it 
should be true to make timeline operations fatal.
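
On the first point, a minimal sketch (using Mockito) of keeping the client mocked so the test exercises the failure path without ever reaching a real timeline server; the TokenFetcher interface below is a stand-in, not the real TimelineClient API.

{code}
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import java.io.IOException;

/**
 * Illustrative sketch: keep the client mocked so the test can force the
 * failure path without contacting a real timeline server.
 * TokenFetcher is a stand-in interface, not the real TimelineClient.
 */
public class BestEffortTokenTestSketch {

  interface TokenFetcher {
    String getDelegationToken(String renewer) throws IOException;
  }

  /** Code under test: swallow the failure when best-effort is enabled. */
  static String getTokenBestEffort(TokenFetcher fetcher, String renewer) {
    try {
      return fetcher.getDelegationToken(renewer);
    } catch (IOException e) {
      return null;  // best-effort: tolerate the failure
    }
  }

  public static void main(String[] args) throws IOException {
    // The mock always throws, simulating an unreachable timeline server.
    TokenFetcher fetcher = mock(TokenFetcher.class);
    when(fetcher.getDelegationToken("rm"))
        .thenThrow(new IOException("simulated timeline server failure"));

    // Expect null rather than a propagated exception.
    System.out.println("token = " + getTokenBestEffort(fetcher, "rm"));
  }
}
{code}

With best-effort disabled, the exception would instead propagate, which is the other branch a test like this would assert against.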


> Configuration for making delegation token failures to timeline server 
> not-fatal
> ---
>
> Key: YARN-3144
> URL: https://issues.apache.org/jira/browse/YARN-3144
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Attachments: YARN-3144.1.patch, YARN-3144.2.patch
>
>
> Posting events to the timeline server is best-effort. However, a failure to 
> get delegation tokens from the timeline server will kill the job. This patch 
> adds a configuration option to make get-delegation-token operations 
> "best-effort" as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3144) Configuration for making delegation token failures to timeline server not-fatal

2015-02-06 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated YARN-3144:
--
Attachment: YARN-3144.3.patch

Thanks, [~jlowe]. One more patch to fix up those issues.

> Configuration for making delegation token failures to timeline server 
> not-fatal
> ---
>
> Key: YARN-3144
> URL: https://issues.apache.org/jira/browse/YARN-3144
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Attachments: YARN-3144.1.patch, YARN-3144.2.patch, YARN-3144.3.patch
>
>
> Posting events to the timeline server is best-effort. However, a failure to 
> get delegation tokens from the timeline server will kill the job. This patch 
> adds a configuration option to make get-delegation-token operations 
> "best-effort" as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2246) Job History Link in RM UI is redirecting to the URL which contains Job Id twice

2015-02-06 Thread Devaraj K (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309390#comment-14309390
 ] 

Devaraj K commented on YARN-2246:
-

[~jlowe], [~zjshen] Thanks for your input. 

[~jlowe], I have started working on this and will provide a patch today. Thanks


> Job History Link in RM UI is redirecting to the URL which contains Job Id 
> twice
> ---
>
> Key: YARN-2246
> URL: https://issues.apache.org/jira/browse/YARN-2246
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 3.0.0, 0.23.11, 2.5.0
>Reporter: Devaraj K
>Assignee: Devaraj K
> Attachments: MAPREDUCE-4064-1.patch, MAPREDUCE-4064.patch
>
>
> {code:xml}
> http://xx.x.x.x:19888/jobhistory/job/job_1332435449546_0001/jobhistory/job/job_1332435449546_0001
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Moved] (YARN-3152) Missing hadoop exclude file fails RMs in HA

2015-02-06 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla moved HADOOP-11555 to YARN-3152:
-

  Component/s: (was: ha)
   resourcemanager
Affects Version/s: (was: 2.6.0)
   2.6.0
  Key: YARN-3152  (was: HADOOP-11555)
  Project: Hadoop YARN  (was: Hadoop Common)

> Missing hadoop exclude file fails RMs in HA
> ---
>
> Key: YARN-3152
> URL: https://issues.apache.org/jira/browse/YARN-3152
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
> Environment: Debian 7
>Reporter: Neill Lima
>
> I have two NNs in HA; they do not fail when the exclude file is not present 
> (hadoop-2.6.0/etc/hadoop/exclude). I had one RM and wanted to set up two in 
> HA. I did not create the exclude file at this point either. I applied the HA 
> RM settings properly, and when I started both RMs I began getting this 
> exception:
> 2015-02-06 12:25:25,326 WARN 
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root   
> OPERATION=transitionToActive  TARGET=RMHAProtocolService  
> RESULT=FAILURE  DESCRIPTION=Exception transitioning to active   
> PERMISSIONS=All users are allowed
> 2015-02-06 12:25:25,326 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
> Exception handling the winning of election
> org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:128)
>   at 
> org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:805)
>   at 
> org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:416)
>   at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
>   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> Caused by: org.apache.hadoop.ha.ServiceFailedException: Error when 
> transitioning to Active mode
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:304)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:126)
>   ... 4 more
> Caused by: org.apache.hadoop.ha.ServiceFailedException: 
> java.io.FileNotFoundException: /hadoop-2.6.0/etc/hadoop/exclude (No such file 
> or directory)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:626)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:297)
>   ... 5 more
> 2015-02-06 12:25:25,327 INFO org.apache.hadoop.ha.ActiveStandbyElector: 
> Trying to re-establish ZK session
> 2015-02-06 12:25:25,339 INFO org.apache.zookeeper.ZooKeeper: Session: 
> 0x44af32566180094 closed
> 2015-02-06 12:25:26,340 INFO org.apache.zookeeper.ZooKeeper: Initiating 
> client connection, connectString=x.x.x.x:2181,x.x.x.x:2181 
> sessionTimeout=1 
> watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@307587c
> 2015-02-06 12:25:26,341 INFO org.apache.zookeeper.ClientCnxn: Opening socket 
> connection to server x.x.x.x/x.x.x.x:2181. Will not attempt to authenticate 
> using SASL (unknown error)
> 2015-02-06 12:25:26,341 INFO org.apache.zookeeper.ClientCnxn: Socket 
> connection established to x.x.x.x/x.x.x.x:2181, initiating session
> The error is descriptive enough to resolve the problem, and it has been 
> fixed by creating the exclude file. 
> I am just suggesting this as an improvement: 
> - Should the RMs ignore the missing file, as the NNs do?
> - Should a single RM fail even when the file is not present?
> Just suggesting this to keep the behavior consistent when working 
> in HA (both NNs and RMs). 
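
A minimal sketch of the suggested behaviour, assuming a missing exclude file is simply treated as an empty exclude list instead of failing the transition to active; the file handling below is illustrative, not the actual AdminService code.

{code}
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/**
 * Illustrative sketch: a missing exclude file is treated as "no excluded
 * hosts" rather than as a fatal error during failover.
 */
public class ExcludeFileSketch {

  static List<String> readExcludedHosts(String path) throws IOException {
    File f = new File(path);
    if (!f.exists()) {
      // Suggested behaviour: log and continue with an empty exclude list,
      // mirroring what the NameNodes already do.
      System.out.println("Exclude file " + path + " not found; assuming empty");
      return Collections.emptyList();
    }
    List<String> hosts = new ArrayList<String>();
    BufferedReader reader = new BufferedReader(new FileReader(f));
    try {
      String line;
      while ((line = reader.readLine()) != null) {
        if (!line.trim().isEmpty()) {
          hosts.add(line.trim());
        }
      }
    } finally {
      reader.close();
    }
    return hosts;
  }

  public static void main(String[] args) throws IOException {
    System.out.println(readExcludedHosts("/hadoop-2.6.0/etc/hadoop/exclude"));
  }
}
{code}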



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3144) Configuration for making delegation token failures to timeline server not-fatal

2015-02-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309454#comment-14309454
 ] 

Hadoop QA commented on YARN-3144:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12697051/YARN-3144.3.patch
  against trunk revision 1425e3d.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common:

  org.apache.hadoop.yarn.client.TestResourceTrackerOnHA
  
org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6536//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6536//console

This message is automatically generated.

> Configuration for making delegation token failures to timeline server 
> not-fatal
> ---
>
> Key: YARN-3144
> URL: https://issues.apache.org/jira/browse/YARN-3144
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Attachments: YARN-3144.1.patch, YARN-3144.2.patch, YARN-3144.3.patch
>
>
> Posting events to the timeline server is best-effort. However, a failure to 
> get delegation tokens from the timeline server will kill the job. This patch 
> adds a configuration option to make get-delegation-token operations 
> "best-effort" as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-1449) Protocol changes and implementations in NM side to support change container resource

2015-02-06 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan reassigned YARN-1449:


Assignee: Wangda Tan  (was: Wangda Tan (No longer used))

> Protocol changes and implementations in NM side to support change container 
> resource
> 
>
> Key: YARN-1449
> URL: https://issues.apache.org/jira/browse/YARN-1449
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.2.0
>Reporter: Wangda Tan (No longer used)
>Assignee: Wangda Tan
> Attachments: yarn-1449.1.patch, yarn-1449.3.patch, yarn-1449.4.patch, 
> yarn-1449.5.patch
>
>
> As described in YARN-1197, we need to add API/implementation changes:
> 1) Add a "changeContainersResources" method in ContainerManagementProtocol
> 2) Return the succeeded/failed increased/decreased containers in the response 
> of "changeContainersResources"
> 3) Add a "new decreased containers" field in NodeStatus so the NM can notify 
> the RM of such changes
> 4) Add a changeContainersResources implementation in ContainerManagerImpl
> 5) Add changes in ContainersMonitorImpl to support changing the resource limit 
> of containers
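
For reference, a rough sketch of what the API surface listed above could look like; the type and method names simply mirror the description and are not the exact protobuf-backed signatures in the patch.

{code}
import java.util.List;

/**
 * Illustrative sketch mirroring the API changes listed above; not the
 * actual Hadoop interfaces.
 */
public class ChangeContainerResourceSketch {

  // 1) A "changeContainersResources" entry point on the NM-facing protocol.
  interface ContainerManagementProtocolSketch {
    ChangeContainersResourcesResponse changeContainersResources(
        ChangeContainersResourcesRequest request);
  }

  static class ChangeContainersResourcesRequest {
    List<String> containersToIncrease;  // container IDs with new, larger limits
    List<String> containersToDecrease;  // container IDs with new, smaller limits
  }

  // 2) The response reports which increases/decreases succeeded or failed.
  static class ChangeContainersResourcesResponse {
    List<String> succeededIncreasedContainers;
    List<String> succeededDecreasedContainers;
    List<String> failedIncreasedContainers;
    List<String> failedDecreasedContainers;
  }

  // 3) NodeStatus grows a field so the NM can tell the RM about new decreases.
  static class NodeStatusSketch {
    List<String> newlyDecreasedContainers;
  }
}
{code}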



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1449) Protocol changes and implementations in NM side to support change container resource

2015-02-06 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309516#comment-14309516
 ] 

Wangda Tan commented on YARN-1449:
--

Canceled patch.

> Protocol changes and implementations in NM side to support change container 
> resource
> 
>
> Key: YARN-1449
> URL: https://issues.apache.org/jira/browse/YARN-1449
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Wangda Tan (No longer used)
>Assignee: Wangda Tan
> Attachments: yarn-1449.1.patch, yarn-1449.3.patch, yarn-1449.4.patch, 
> yarn-1449.5.patch
>
>
> As described in YARN-1197, we need to add API/implementation changes:
> 1) Add a "changeContainersResources" method in ContainerManagementProtocol
> 2) Return the succeeded/failed increased/decreased containers in the response 
> of "changeContainersResources"
> 3) Add a "new decreased containers" field in NodeStatus so the NM can notify 
> the RM of such changes
> 4) Add a changeContainersResources implementation in ContainerManagerImpl
> 5) Add changes in ContainersMonitorImpl to support changing the resource limit 
> of containers



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3147) Clean up RM web proxy code

2015-02-06 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309555#comment-14309555
 ] 

Xuan Gong commented on YARN-3147:
-

Thanks for the patch, [~ste...@apache.org].
I will take a look shortly.


> Clean up RM web proxy code 
> ---
>
> Key: YARN-3147
> URL: https://issues.apache.org/jira/browse/YARN-3147
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: webapp
>Affects Versions: 2.6.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: YARN-3147-001.patch, YARN-3147-002.patch
>
>
> YARN-2084 covers fixing up the RM proxy & filter for REST support.
> Before doing that, prepare for it by cleaning up the codebase: factoring out 
> the redirect logic into a single method, doing some minor reformatting, and 
> moving to SLF4J and Java 7 code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3033) implement NM starting the ATS writer companion

2015-02-06 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309574#comment-14309574
 ] 

Sangjin Lee commented on YARN-3033:
---

[~djp], I think it'd be good to support *both* options.

I do see that some may want to run it as an aux service for simplicity of 
deployment (one fewer daemon to start), especially in a small setup. However, 
we do need to address the web app issue at YARN-3087 to avoid the undesirable 
module dependency.

A standalone daemon is probably safer, as it would affect the node manager less. 
We still need to poke this daemon for the AM lifecycle (hence the service part).

> implement NM starting the ATS writer companion
> --
>
> Key: YARN-3033
> URL: https://issues.apache.org/jira/browse/YARN-3033
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>
> Per design in YARN-2928, implement node managers starting the ATS writer 
> companion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3144) Configuration for making delegation token failures to timeline server not-fatal

2015-02-06 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309585#comment-14309585
 ] 

Jason Lowe commented on YARN-3144:
--

Thanks, Jon!  We're almost there, but on the final review before commit I found 
one last thing that I think should be fixed.  My apologies for not catching it 
sooner:

{code}
+} catch (Exception e ) {
+  if (timelineServiceBestEffort) {
+    LOG.warn("Failed to get delegation token from the timeline server");
+    return null;
+  }
{code}

I think it's important to log something about the exception that was received, 
otherwise it can be very frustrating to debug.  Not sure if we should log the 
full exception stack or just the message, but I think we should say more than 
just that it didn't work.
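
For illustration, one way the catch block could surface the cause while staying best-effort; a minimal sketch assuming an SLF4J-style logger, not the exact code in the patch.

{code}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * Illustrative sketch only: log the cause (message and stack) before
 * returning null in the best-effort path.
 */
public class BestEffortLoggingSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(BestEffortLoggingSketch.class);

  interface TokenCall<T> {
    T run() throws Exception;
  }

  static <T> T getTokenBestEffort(TokenCall<T> call, boolean bestEffort)
      throws Exception {
    try {
      return call.run();
    } catch (Exception e) {
      if (bestEffort) {
        // Include the exception so the failure stays debuggable.
        LOG.warn("Failed to get delegation token from the timeline server", e);
        return null;
      }
      throw e;
    }
  }
}
{code}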

> Configuration for making delegation token failures to timeline server 
> not-fatal
> ---
>
> Key: YARN-3144
> URL: https://issues.apache.org/jira/browse/YARN-3144
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Attachments: YARN-3144.1.patch, YARN-3144.2.patch, YARN-3144.3.patch
>
>
> Posting events to the timeline server is best-effort. However, a failure to 
> get delegation tokens from the timeline server will kill the job. This patch 
> adds a configuration option to make get-delegation-token operations 
> "best-effort" as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3041) create the ATS entity/event API

2015-02-06 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309612#comment-14309612
 ] 

Sangjin Lee commented on YARN-3041:
---

[~rkanter], [~Naganarasimha], IMO it might make sense to define all YARN system 
entities as explicit types. These would include flow runs, YARN apps, app 
attempts, and containers. They have well-defined meanings and relationships, so 
it seems natural to me. Thoughts?

> create the ATS entity/event API
> ---
>
> Key: YARN-3041
> URL: https://issues.apache.org/jira/browse/YARN-3041
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Robert Kanter
> Attachments: YARN-3041.preliminary.001.patch
>
>
> Per design in YARN-2928, create the ATS entity and events API.
> Also, as part of this JIRA, create YARN system entities (e.g. cluster, user, 
> flow, flow run, YARN app, ...).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3021) YARN's delegation-token handling disallows certain trust setups to operate properly

2015-02-06 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309639#comment-14309639
 ] 

Jian He commented on YARN-3021:
---

bq. Explicitly have an external renewer system that has the right permissions 
to renew these tokens. 
I think this is the correct long-term solution. RM today happens to be the 
renewer. But we need a central renewer component so that we can do 
cross-cluster renewals.  
bq. RM can simply inspect the incoming renewer specified in the token and skip 
renewing those tokens if the renewer doesn't match it's own address
I think in this case, the renewer specified in the token is the same as the RM. 
IIUC, the JobClient will request the token from B cluster, but still specify 
the renewer as the A cluster RM (via the A cluster local config), am I right?
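For illustration, the check being discussed could look roughly like this (a 
hypothetical sketch, not code from any attached patch):

{code}
import org.apache.hadoop.io.Text;

// Hypothetical sketch: the RM would skip renewal for tokens whose designated
// renewer is not this RM, instead of failing app submission on renew errors.
final class RenewerCheck {
  static boolean rmShouldRenew(Text renewerInToken, Text thisRmRenewer) {
    // A token that names a different renewer cannot be renewed by this RM anyway.
    return thisRmRenewer.equals(renewerInToken);
  }
}
{code}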

> YARN's delegation-token handling disallows certain trust setups to operate 
> properly
> ---
>
> Key: YARN-3021
> URL: https://issues.apache.org/jira/browse/YARN-3021
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.3.0
>Reporter: Harsh J
> Attachments: YARN-3021.001.patch, YARN-3021.002.patch, 
> YARN-3021.003.patch, YARN-3021.patch
>
>
> Consider this scenario of 3 realms: A, B and COMMON, where A trusts COMMON, 
> and B trusts COMMON (one way trusts both), and both A and B run HDFS + YARN 
> clusters.
> Now if one logs in with a COMMON credential, and runs a job on A's YARN that 
> needs to access B's HDFS (such as a DistCp), the operation fails in the RM, 
> as it attempts a renewDelegationToken(…) synchronously during application 
> submission (to validate the managed token before it adds it to a scheduler 
> for automatic renewal). The call obviously fails cause B realm will not trust 
> A's credentials (here, the RM's principal is the renewer).
> In the 1.x JobTracker the same call is present, but it is done asynchronously 
> and once the renewal attempt failed we simply ceased to schedule any further 
> attempts of renewals, rather than fail the job immediately.
> We should change the logic such that we attempt the renewal but go easy on 
> the failure and skip the scheduling alone, rather than bubble back an error 
> to the client, failing the app submission. This way the old behaviour is 
> retained.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3021) YARN's delegation-token handling disallows certain trust setups to operate properly

2015-02-06 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309650#comment-14309650
 ] 

Jian He commented on YARN-3021:
---

bq. the JobClient will request the token from B cluster, but still specify the 
renewer as the A cluster RM (via the A cluster local config)
If this is the case, the assumption here is problematic: why would I request a 
token from B but let the untrusted third party A renew my token in the first place?

> YARN's delegation-token handling disallows certain trust setups to operate 
> properly
> ---
>
> Key: YARN-3021
> URL: https://issues.apache.org/jira/browse/YARN-3021
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.3.0
>Reporter: Harsh J
> Attachments: YARN-3021.001.patch, YARN-3021.002.patch, 
> YARN-3021.003.patch, YARN-3021.patch
>
>
> Consider this scenario of 3 realms: A, B and COMMON, where A trusts COMMON, 
> and B trusts COMMON (one way trusts both), and both A and B run HDFS + YARN 
> clusters.
> Now if one logs in with a COMMON credential, and runs a job on A's YARN that 
> needs to access B's HDFS (such as a DistCp), the operation fails in the RM, 
> as it attempts a renewDelegationToken(…) synchronously during application 
> submission (to validate the managed token before it adds it to a scheduler 
> for automatic renewal). The call obviously fails cause B realm will not trust 
> A's credentials (here, the RM's principal is the renewer).
> In the 1.x JobTracker the same call is present, but it is done asynchronously 
> and once the renewal attempt failed we simply ceased to schedule any further 
> attempts of renewals, rather than fail the job immediately.
> We should change the logic such that we attempt the renewal but go easy on 
> the failure and skip the scheduling alone, rather than bubble back an error 
> to the client, failing the app submission. This way the old behaviour is 
> retained.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1142) MiniYARNCluster web ui does not work properly

2015-02-06 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309663#comment-14309663
 ] 

Sangjin Lee commented on YARN-1142:
---

Some more info on this at 
https://issues.apache.org/jira/browse/YARN-3087?focusedCommentId=14307614&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14307614

> MiniYARNCluster web ui does not work properly
> -
>
> Key: YARN-1142
> URL: https://issues.apache.org/jira/browse/YARN-1142
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.1.0-beta
>Reporter: Alejandro Abdelnur
> Fix For: 2.7.0
>
>
> When going to the RM http port, the NM web ui is displayed. It seems there is 
> a singleton somewhere that breaks things when RM & NMs run in the same 
> process.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3087) the REST server (web server) for per-node aggregator does not work if it runs inside node manager

2015-02-06 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309662#comment-14309662
 ] 

Sangjin Lee commented on YARN-3087:
---

Thanks for looking into this, [~devaraj.k]! Doesn't sound like there is a quick 
resolution then. :(

> the REST server (web server) for per-node aggregator does not work if it runs 
> inside node manager
> -
>
> Key: YARN-3087
> URL: https://issues.apache.org/jira/browse/YARN-3087
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Devaraj K
>
> This is related to YARN-3030. YARN-3030 sets up a per-node timeline 
> aggregator and the associated REST server. It runs fine as a standalone 
> process, but does not work if it runs inside the node manager due to possible 
> collisions of servlet mapping.
> Exception:
> {noformat}
> org.apache.hadoop.yarn.webapp.WebAppException: /v2/timeline: controller for 
> v2 not found
>   at org.apache.hadoop.yarn.webapp.Router.resolveDefault(Router.java:232)
>   at org.apache.hadoop.yarn.webapp.Router.resolve(Router.java:140)
>   at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:134)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
>   at 
> com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263)
>   at 
> com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178)
>   at 
> com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
>   at 
> com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62)
>   at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900)
>   at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834)
>   at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795)
> ...
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3021) YARN's delegation-token handling disallows certain trust setups to operate properly

2015-02-06 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309703#comment-14309703
 ] 

Yongjun Zhang commented on YARN-3021:
-

Hi [~vinodkv] and [~jianhe],

Thank you so much for review and commenting!

I will try to respond to part of your comments here and keep looking into the 
rest.

{quote}
RM can simply inspect the incoming renewer specified in the token and skip 
renewing those tokens if the renewer doesn't match it's own address. This way, 
we don't need an explicit API in the submission context.
{quote}
It seems that, regardless of this jira, we could make the above change, right? 
Any catch?

{quote}
Apologies for going back and forth on this one.
{quote}
I appreciate the insight you provided, and we are trying to figure out the best 
solution together. All the points you provided are reasonable, so absolutely no 
need for apologies here.

{quote}
Irrespective of how we decide to skip tokens, the way the patch is skipping 
renewal will not work. In secure mode, DelegationTokenRenewer drives the app 
state machine. So if you skip adding the app itself to DTR, the app will be 
completely 
{quote}
I did test in a secure env and it worked. Would you please elaborate?

{quote}
I think in this case, the renewer specified in the token is the same as the RM. 
IIUC, the JobClient will request the token from B cluster, but still specify 
the renewer as the A cluster RM (via the A cluster local config), am I right?
{quote}
I think that's the case. The problem is that there is no trust between A and B. 
So "common" should be the one to renew the token.

Thanks.





> YARN's delegation-token handling disallows certain trust setups to operate 
> properly
> ---
>
> Key: YARN-3021
> URL: https://issues.apache.org/jira/browse/YARN-3021
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.3.0
>Reporter: Harsh J
> Attachments: YARN-3021.001.patch, YARN-3021.002.patch, 
> YARN-3021.003.patch, YARN-3021.patch
>
>
> Consider this scenario of 3 realms: A, B and COMMON, where A trusts COMMON, 
> and B trusts COMMON (one way trusts both), and both A and B run HDFS + YARN 
> clusters.
> Now if one logs in with a COMMON credential, and runs a job on A's YARN that 
> needs to access B's HDFS (such as a DistCp), the operation fails in the RM, 
> as it attempts a renewDelegationToken(…) synchronously during application 
> submission (to validate the managed token before it adds it to a scheduler 
> for automatic renewal). The call obviously fails cause B realm will not trust 
> A's credentials (here, the RM's principal is the renewer).
> In the 1.x JobTracker the same call is present, but it is done asynchronously 
> and once the renewal attempt failed we simply ceased to schedule any further 
> attempts of renewals, rather than fail the job immediately.
> We should change the logic such that we attempt the renewal but go easy on 
> the failure and skip the scheduling alone, rather than bubble back an error 
> to the client, failing the app submission. This way the old behaviour is 
> retained.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3144) Configuration for making delegation token failures to timeline server not-fatal

2015-02-06 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated YARN-3144:
--
Attachment: YARN-3144.4.patch

No problem, [~jlowe]. Uploaded patch to add the exception message.

> Configuration for making delegation token failures to timeline server 
> not-fatal
> ---
>
> Key: YARN-3144
> URL: https://issues.apache.org/jira/browse/YARN-3144
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Attachments: YARN-3144.1.patch, YARN-3144.2.patch, YARN-3144.3.patch, 
> YARN-3144.4.patch
>
>
> Posting events to the timeline server is best-effort. However, getting the 
> delegation tokens from the timeline server will kill the job. This patch adds 
> a configuration to make get delegation token operations "best-effort".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2694) Ensure only single node labels specified in resource request / host, and node label expression only specified when resourceName=ANY

2015-02-06 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-2694:
--
Target Version/s: 2.7.0  (was: 2.6.0)

> Ensure only single node labels specified in resource request / host, and node 
> label expression only specified when resourceName=ANY
> ---
>
> Key: YARN-2694
> URL: https://issues.apache.org/jira/browse/YARN-2694
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, resourcemanager
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Fix For: 2.7.0
>
> Attachments: YARN-2694-20141020-1.patch, YARN-2694-20141021-1.patch, 
> YARN-2694-20141023-1.patch, YARN-2694-20141023-2.patch, 
> YARN-2694-20141101-1.patch, YARN-2694-20141101-2.patch, 
> YARN-2694-20150121-1.patch, YARN-2694-20150122-1.patch, 
> YARN-2694-20150202-1.patch, YARN-2694-20150203-1.patch, 
> YARN-2694-20150203-2.patch, YARN-2694-20150204-1.patch, 
> YARN-2694-20150205-1.patch, YARN-2694-20150205-2.patch, 
> YARN-2694-20150205-3.patch
>
>
> Currently, node label expression support in the capacity scheduler is only 
> partially completed. A node label expression specified in a ResourceRequest is 
> only respected when it is specified at the ANY level, and a ResourceRequest/host 
> with multiple node labels makes the user-limit (etc.) computation more tricky.
> For now we need to temporarily disable them; changes include:
> - AMRMClient
> - ApplicationMasterService
> - RMAdminCLI
> - CommonNodeLabelsManager



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2694) Ensure only single node labels specified in resource request / host, and node label expression only specified when resourceName=ANY

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309752#comment-14309752
 ] 

Hudson commented on YARN-2694:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7042 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7042/])
YARN-2694. Ensure only single node label specified in ResourceRequest. 
Contributed by Wangda Tan (jianhe: rev c1957fef29b07fea70938e971b30532a1e131fd0)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestContainerAllocation.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/TestRMNodeLabelsManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ResourceRequest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/AMRMClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/AMRMClientImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestSchedulerUtils.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockAM.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesNodeLabels.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/TestCommonNodeLabelsManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerUtils.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestRMAdminCLI.java


> Ensure only single node labels specified in resource request / host, and node 
> label expression only specified when resourceName=ANY
> ---
>
> Key: YARN-2694
> URL: https://issues.apache.org/jira/browse/YARN-2694
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, resourcemanager
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Fix For: 2.7.0
>
> Attachments: YARN-2694-20141020-1.patch, YARN-2694-20141021-1.patch, 
> YARN-2694-20141023-1.patch, YARN-2694-20141023-2.patch, 
> YARN-2694-20141101-1.patch, YARN-2694-20141101-2.patch, 
> YARN-2694-20150121-1.patch, YARN-2694-20150122-1.patch, 
> YARN-2694-20150202-1.patch, YARN-2694-20150203-1.patch, 
> YARN-2694-20150203-2.patch, YARN-2694-20150204-1.patch, 
> YARN-2694-20150205-1.patch, YARN-2694-20150205-2.patch, 
> YARN-2694-20150205-3.patch
>
>
> Currently, node label expression support in the capacity scheduler is only 
> partially completed. A node label expression specified in a ResourceRequest is 
> only respected when it is specified at the ANY level, and a ResourceRequest/host 
> with multiple node labels makes the user-limit (etc.) computation more tricky.
> For now we need to temporarily disable them; changes include:
> - AMRMClient
> - ApplicationMasterService
> - RMAdminCLI
> - CommonNodeLabelsManager



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3100) Make YARN authorization pluggable

2015-02-06 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309757#comment-14309757
 ] 

Jian He commented on YARN-3100:
---

bq. AbstractCSQueue and CSQueueUtils
Maybe I missed something, but I think these two are mostly fine: we create the 
new queue hierarchy first and then update the old queues, so if certain methods 
fail in these two classes, the new queue creation will fail upfront and the old 
queues will not be updated. Anyway, we can address this separately. 

> Make YARN authorization pluggable
> -
>
> Key: YARN-3100
> URL: https://issues.apache.org/jira/browse/YARN-3100
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-3100.1.patch, YARN-3100.2.patch
>
>
> The goal is to have YARN acl model pluggable so as to integrate other 
> authorization tool such as Apache Ranger, Sentry.
> Currently, we have 
> - admin ACL
> - queue ACL
> - application ACL
> - time line domain ACL
> - service ACL
> The proposal is to create a YarnAuthorizationProvider interface. Current 
> implementation will be the default implementation. Ranger or Sentry plug-in 
> can implement  this interface.
> Benefit:
> -  Unify the code base. With the default implementation, we can get rid of 
> each specific ACL manager such as AdminAclManager, ApplicationACLsManager, 
> QueueAclsManager etc.
> - Enable Ranger, Sentry to do authorization for YARN. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-281) Add a test for YARN Schedulers' MAXIMUM_ALLOCATION limits

2015-02-06 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan resolved YARN-281.
-
  Resolution: Won't Fix
Release Note: 
I think this may not be needed, since we already have tests in TestSchedulerUtils 
that verify minimum/maximum resource normalization/validation, and SchedulerUtils 
runs before the scheduler can see such resource requests.

Resolved as Won't Fix.

> Add a test for YARN Schedulers' MAXIMUM_ALLOCATION limits
> -
>
> Key: YARN-281
> URL: https://issues.apache.org/jira/browse/YARN-281
> Project: Hadoop YARN
>  Issue Type: Test
>  Components: scheduler
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Wangda Tan
>  Labels: test
>
> We currently have tests that test MINIMUM_ALLOCATION limits for FifoScheduler 
> and the likes, but no test for MAXIMUM_ALLOCATION yet. We should add a test 
> to prevent regressions of any kind on such limits.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3144) Configuration for making delegation token failures to timeline server not-fatal

2015-02-06 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309779#comment-14309779
 ] 

Jason Lowe commented on YARN-3144:
--

+1 pending Jenkins.

> Configuration for making delegation token failures to timeline server 
> not-fatal
> ---
>
> Key: YARN-3144
> URL: https://issues.apache.org/jira/browse/YARN-3144
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Attachments: YARN-3144.1.patch, YARN-3144.2.patch, YARN-3144.3.patch, 
> YARN-3144.4.patch
>
>
> Posting events to the timeline server is best-effort. However, getting the 
> delegation tokens from the timeline server will kill the job. This patch adds 
> a configuration to make get delegation token operations "best-effort".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2809) Implement workaround for linux kernel panic when removing cgroup

2015-02-06 Thread Nathan Roberts (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nathan Roberts updated YARN-2809:
-
Attachment: YARN-2809-v3.patch

> Implement workaround for linux kernel panic when removing cgroup
> 
>
> Key: YARN-2809
> URL: https://issues.apache.org/jira/browse/YARN-2809
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.6.0
> Environment:  RHEL 6.4
>Reporter: Nathan Roberts
>Assignee: Nathan Roberts
> Attachments: YARN-2809-v2.patch, YARN-2809-v3.patch, YARN-2809.patch
>
>
> Some older versions of linux have a bug that can cause a kernel panic when 
> the LCE attempts to remove a cgroup. It is a race condition so it's a bit 
> rare but on a few thousand node cluster it can result in a couple of panics 
> per day.
> This is the commit that likely (haven't verified) fixes the problem in linux: 
> https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?h=linux-2.6.39.y&id=068c5cc5ac7414a8e9eb7856b4bf3cc4d4744267
> Details will be added in comments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3144) Configuration for making delegation token failures to timeline server not-fatal

2015-02-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309816#comment-14309816
 ] 

Hadoop QA commented on YARN-3144:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12697097/YARN-3144.4.patch
  against trunk revision eaab959.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common:

  org.apache.hadoop.yarn.client.TestResourceTrackerOnHA
  
org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6537//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6537//console

This message is automatically generated.

> Configuration for making delegation token failures to timeline server 
> not-fatal
> ---
>
> Key: YARN-3144
> URL: https://issues.apache.org/jira/browse/YARN-3144
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Attachments: YARN-3144.1.patch, YARN-3144.2.patch, YARN-3144.3.patch, 
> YARN-3144.4.patch
>
>
> Posting events to the timeline server is best-effort. However, getting the 
> delegation tokens from the timeline server will kill the job. This patch adds 
> a configuration to make get delegation token operations "best-effort".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3126) FairScheduler: queue's usedResource is always more than the maxResource limit

2015-02-06 Thread Wei Yan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309865#comment-14309865
 ] 

Wei Yan commented on YARN-3126:
---

[~Xia Hu], I checked the latest trunk version and the problem is still there. 
Could you rebase the patch against trunk? Normally we fix the problem in trunk 
rather than in a previously released version. And we may need to get YARN-2083 
committed first.
Hey, [~kasha], do you have time to look at YARN-2083?
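For reference, the missing check described in the issue text below boils down to 
something like this (just a sketch with hypothetical names, not the actual 
FairScheduler code):

{code}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.util.resource.Resources;

// Hypothetical sketch: before assigning a container, verify the assignment
// would not push the queue's usage past its configured maximum.
final class QueueLimitCheck {
  static boolean wouldExceedMax(Resource used, Resource requested, Resource max) {
    return !Resources.fitsIn(Resources.add(used, requested), max);
  }
}
{code}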

> FairScheduler: queue's usedResource is always more than the maxResource limit
> -
>
> Key: YARN-3126
> URL: https://issues.apache.org/jira/browse/YARN-3126
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.3.0
> Environment: hadoop2.3.0. fair scheduler. spark 1.1.0. 
>Reporter: Xia Hu
>  Labels: assignContainer, fairscheduler, resources
> Attachments: resourcelimit.patch
>
>
> When submitting spark application(both spark-on-yarn-cluster and 
> spark-on-yarn-cleint model), the queue's usedResources assigned by 
> fairscheduler always can be more than the queue's maxResources limit.
> And by reading codes of fairscheduler, I suppose this issue happened because 
> of ignore to check the request resources when assign Container.
> Here is the detail:
> 1. choose a queue. In this process, it will check if queue's usedResource is 
> bigger than its max, with assignContainerPreCheck. 
> 2. then choose a app in the certain queue. 
> 3. then choose a container. And here is the question, there is no check 
> whether this container would make the queue sources over its max limit. If a 
> queue's usedResource is 13G, the maxResource limit is 16G, then a container 
> which asking for 4G resources may be assigned successful. 
> This problem will always happen in spark application, cause we can ask for 
> different container resources in different applications. 
> By the way, I have already use the patch from YARN-2083. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3120) YarnException on windows + org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to setup local dirnm-local-dir, which was marked as good.

2015-02-06 Thread vaidhyanathan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309867#comment-14309867
 ] 

vaidhyanathan commented on YARN-3120:
-

Hi Varun,

Thanks for responding. I started running the yarn cmd files as administrator and 
it worked; I also opened the command prompt and ran it in administrator mode.

The word count example worked fine the first time, but now I'm facing a different 
issue: when I run it again with the same setup, the job doesn't proceed after the 
step '15/02/06 15:38:26 INFO mapreduce.Job: Running job: 
job_1423255041751_0001', and when I check the console the status is 'Accepted' 
and the final status is 'Undefined'.

> YarnException on windows + 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to setup local 
> dirnm-local-dir, which was marked as good.
> ---
>
> Key: YARN-3120
> URL: https://issues.apache.org/jira/browse/YARN-3120
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
> Environment: Windows 8 , Hadoop 2.6.0
>Reporter: vaidhyanathan
>
> Hi,
> I tried to follow the instructions in 
> http://wiki.apache.org/hadoop/Hadoop2OnWindows and have set up 
> hadoop-2.6.0.jar on my Windows system.
> I was able to start everything properly, but when I try to run the wordcount 
> job as given in the above URL, the job fails with the exception below.
> 15/01/30 12:56:09 INFO localizer.ResourceLocalizationService: Localizer failed
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to setup local 
> dir /tmp/hadoop-haremangala/nm-local-dir, which was marked as good.
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.getInitializedLocalDirs(ResourceLocalizationService.java:1372)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.access$900(ResourceLocalizationService.java:137)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1085)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3144) Configuration for making delegation token failures to timeline server not-fatal

2015-02-06 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309886#comment-14309886
 ] 

Jason Lowe commented on YARN-3144:
--

Committing this.  The test failures appear to be unrelated, and they both pass 
for me locally with the patch applied.

> Configuration for making delegation token failures to timeline server 
> not-fatal
> ---
>
> Key: YARN-3144
> URL: https://issues.apache.org/jira/browse/YARN-3144
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Attachments: YARN-3144.1.patch, YARN-3144.2.patch, YARN-3144.3.patch, 
> YARN-3144.4.patch
>
>
> Posting events to the timeline server is best-effort. However, getting the 
> delegation tokens from the timeline server will kill the job. This patch adds 
> a configuration to make get delegation token operations "best-effort".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3100) Make YARN authorization pluggable

2015-02-06 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309909#comment-14309909
 ] 

Chris Douglas commented on YARN-3100:
-

Agreed; definitely a separate JIRA. As state is copied from the old queues, 
some of the methods called in {{CSQueueUtils}} throw exceptions, similar to the 
case you found in {{LeafQueue}}.

> Make YARN authorization pluggable
> -
>
> Key: YARN-3100
> URL: https://issues.apache.org/jira/browse/YARN-3100
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-3100.1.patch, YARN-3100.2.patch
>
>
> The goal is to have YARN acl model pluggable so as to integrate other 
> authorization tool such as Apache Ranger, Sentry.
> Currently, we have 
> - admin ACL
> - queue ACL
> - application ACL
> - time line domain ACL
> - service ACL
> The proposal is to create a YarnAuthorizationProvider interface. Current 
> implementation will be the default implementation. Ranger or Sentry plug-in 
> can implement  this interface.
> Benefit:
> -  Unify the code base. With the default implementation, we can get rid of 
> each specific ACL manager such as AdminAclManager, ApplicationACLsManager, 
> QueueAclsManager etc.
> - Enable Ranger, Sentry to do authorization for YARN. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3144) Configuration for making delegation token failures to timeline server not-fatal

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309907#comment-14309907
 ] 

Hudson commented on YARN-3144:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7043 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7043/])
YARN-3144. Configuration for making delegation token failures to timeline 
server not-fatal. Contributed by Jonathan Eagles (jlowe: rev 
6f10434a5ad965d50352602ce31a9fce353cb90c)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/YarnClientImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestYarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java


> Configuration for making delegation token failures to timeline server 
> not-fatal
> ---
>
> Key: YARN-3144
> URL: https://issues.apache.org/jira/browse/YARN-3144
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Fix For: 2.7.0
>
> Attachments: YARN-3144.1.patch, YARN-3144.2.patch, YARN-3144.3.patch, 
> YARN-3144.4.patch
>
>
> Posting events to the timeline server is best-effort. However, getting the 
> delegation tokens from the timeline server will kill the job. This patch adds 
> a configuration to make get delegation token operations "best-effort".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3089) LinuxContainerExecutor does not handle file arguments to deleteAsUser

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309906#comment-14309906
 ] 

Hudson commented on YARN-3089:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7043 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7043/])
YARN-3089. LinuxContainerExecutor does not handle file arguments to 
deleteAsUser. Contributed by Eric Payne (jlowe: rev 
4c484320b430950ce195cfad433a97099e117bad)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/test-container-executor.c
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c


> LinuxContainerExecutor does not handle file arguments to deleteAsUser
> -
>
> Key: YARN-3089
> URL: https://issues.apache.org/jira/browse/YARN-3089
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Eric Payne
>Priority: Blocker
> Fix For: 2.7.0
>
> Attachments: YARN-3089.v1.txt, YARN-3089.v2.txt, YARN-3089.v3.txt
>
>
> YARN-2468 added the deletion of individual logs that are aggregated, but this 
> fails to delete log files when the LCE is being used.  The LCE native 
> executable assumes the paths being passed are paths and the delete fails.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2809) Implement workaround for linux kernel panic when removing cgroup

2015-02-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309905#comment-14309905
 ] 

Hadoop QA commented on YARN-2809:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12697110/YARN-2809-v3.patch
  against trunk revision c1957fe.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6538//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6538//console

This message is automatically generated.

> Implement workaround for linux kernel panic when removing cgroup
> 
>
> Key: YARN-2809
> URL: https://issues.apache.org/jira/browse/YARN-2809
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.6.0
> Environment:  RHEL 6.4
>Reporter: Nathan Roberts
>Assignee: Nathan Roberts
> Attachments: YARN-2809-v2.patch, YARN-2809-v3.patch, YARN-2809.patch
>
>
> Some older versions of linux have a bug that can cause a kernel panic when 
> the LCE attempts to remove a cgroup. It is a race condition so it's a bit 
> rare but on a few thousand node cluster it can result in a couple of panics 
> per day.
> This is the commit that likely (haven't verified) fixes the problem in linux: 
> https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?h=linux-2.6.39.y&id=068c5cc5ac7414a8e9eb7856b4bf3cc4d4744267
> Details will be added in comments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2664) Improve RM webapp to expose info about reservations.

2015-02-06 Thread Matteo Mazzucchelli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Mazzucchelli updated YARN-2664:
--
Attachment: YARN-2664.10.patch

In this patch I set *N/A* instead of _(best effort)_.

> Improve RM webapp to expose info about reservations.
> 
>
> Key: YARN-2664
> URL: https://issues.apache.org/jira/browse/YARN-2664
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Carlo Curino
>Assignee: Matteo Mazzucchelli
> Attachments: PlannerPage_screenshot.pdf, YARN-2664.1.patch, 
> YARN-2664.10.patch, YARN-2664.2.patch, YARN-2664.3.patch, YARN-2664.4.patch, 
> YARN-2664.5.patch, YARN-2664.6.patch, YARN-2664.7.patch, YARN-2664.8.patch, 
> YARN-2664.9.patch, YARN-2664.patch, legal.patch, screenshot_reservation_UI.pdf
>
>
> YARN-1051 provides a new functionality in the RM to ask for reservation on 
> resources. Exposing this through the webapp GUI is important.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2809) Implement workaround for linux kernel panic when removing cgroup

2015-02-06 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309920#comment-14309920
 ] 

Jason Lowe commented on YARN-2809:
--

+1 lgtm.  Will commit this early next week if there are no objections.

> Implement workaround for linux kernel panic when removing cgroup
> 
>
> Key: YARN-2809
> URL: https://issues.apache.org/jira/browse/YARN-2809
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.6.0
> Environment:  RHEL 6.4
>Reporter: Nathan Roberts
>Assignee: Nathan Roberts
> Attachments: YARN-2809-v2.patch, YARN-2809-v3.patch, YARN-2809.patch
>
>
> Some older versions of linux have a bug that can cause a kernel panic when 
> the LCE attempts to remove a cgroup. It is a race condition so it's a bit 
> rare but on a few thousand node cluster it can result in a couple of panics 
> per day.
> This is the commit that likely (haven't verified) fixes the problem in linux: 
> https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?h=linux-2.6.39.y&id=068c5cc5ac7414a8e9eb7856b4bf3cc4d4744267
> Details will be added in comments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3143) RM Apps REST API can return NPE or entries missing id and other fields

2015-02-06 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309922#comment-14309922
 ] 

Kihwal Lee commented on YARN-3143:
--

+1 the patch looks good.

> RM Apps REST API can return NPE or entries missing id and other fields
> --
>
> Key: YARN-3143
> URL: https://issues.apache.org/jira/browse/YARN-3143
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 2.5.2
>Reporter: Kendall Thrapp
>Assignee: Jason Lowe
> Attachments: YARN-3143.001.patch
>
>
> I'm seeing intermittent null pointer exceptions being returned by
> the YARN Apps REST API.
> For example:
> {code}
> http://{cluster}:{port}/ws/v1/cluster/apps?finalStatus=UNDEFINED
> {code}
> JSON Response was:
> {code}
> {"RemoteException":{"exception":"NullPointerException","javaClassName":"java.lang.NullPointerException"}}
> {code}
> At a glance appears to be only when we query for unfinished apps (i.e. 
> finalStatus=UNDEFINED).  
> Possibly related, when I do get back a list of apps, sometimes one or more of 
> the apps will be missing most of the fields, like id, name, user, etc., and 
> the fields that are present all have zero for the value.  
> For example:
> {code}
> {"progress":0.0,"clusterId":0,"applicationTags":"","startedTime":0,"finishedTime":0,"elapsedTime":0,"allocatedMB":0,"allocatedVCores":0,"runningContainers":0,"preemptedResourceMB":0,"preemptedResourceVCores":0,"numNonAMContainerPreempted":0,"numAMContainerPreempted":0}
> {code}
> Let me know if there's any other information I can provide to help debug.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1126) Add validation of users input nodes-states options to nodes CLI

2015-02-06 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated YARN-1126:

Attachment: YARN-905-addendum.patch

Uploading patch from YARN-905 on behalf of [~ywskycn].

> Add validation of users input nodes-states options to nodes CLI
> ---
>
> Key: YARN-1126
> URL: https://issues.apache.org/jira/browse/YARN-1126
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Wei Yan
>Assignee: Wei Yan
> Attachments: YARN-905-addendum.patch
>
>
> Follow the discussion in YARN-905.
> (1) case-insensitive checks for "all".
> (2) validation of users input, exit with non-zero code and print all valid 
> states when user gives an invalid state.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3143) RM Apps REST API can return NPE or entries missing id and other fields

2015-02-06 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309949#comment-14309949
 ] 

Jason Lowe commented on YARN-3143:
--

Thanks for the review, Kihwal!  Committing this.

> RM Apps REST API can return NPE or entries missing id and other fields
> --
>
> Key: YARN-3143
> URL: https://issues.apache.org/jira/browse/YARN-3143
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 2.5.2
>Reporter: Kendall Thrapp
>Assignee: Jason Lowe
> Attachments: YARN-3143.001.patch
>
>
> I'm seeing intermittent null pointer exceptions being returned by
> the YARN Apps REST API.
> For example:
> {code}
> http://{cluster}:{port}/ws/v1/cluster/apps?finalStatus=UNDEFINED
> {code}
> JSON Response was:
> {code}
> {"RemoteException":{"exception":"NullPointerException","javaClassName":"java.lang.NullPointerException"}}
> {code}
> At a glance appears to be only when we query for unfinished apps (i.e. 
> finalStatus=UNDEFINED).  
> Possibly related, when I do get back a list of apps, sometimes one or more of 
> the apps will be missing most of the fields, like id, name, user, etc., and 
> the fields that are present all have zero for the value.  
> For example:
> {code}
> {"progress":0.0,"clusterId":0,"applicationTags":"","startedTime":0,"finishedTime":0,"elapsedTime":0,"allocatedMB":0,"allocatedVCores":0,"runningContainers":0,"preemptedResourceMB":0,"preemptedResourceVCores":0,"numNonAMContainerPreempted":0,"numAMContainerPreempted":0}
> {code}
> Let me know if there's any other information I can provide to help debug.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3089) LinuxContainerExecutor does not handle file arguments to deleteAsUser

2015-02-06 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309952#comment-14309952
 ] 

Vinod Kumar Vavilapalli commented on YARN-3089:
---

bq. Currently, even we are running a MR job, it will upload the partial logs 
which does not sound right. And we need to fix it.
Wow, this is a huge blocker. We should fix it in 2.6.1. [~xgong], can you 
please file a ticket and link it here? Tx.

> LinuxContainerExecutor does not handle file arguments to deleteAsUser
> -
>
> Key: YARN-3089
> URL: https://issues.apache.org/jira/browse/YARN-3089
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Eric Payne
>Priority: Blocker
> Fix For: 2.7.0
>
> Attachments: YARN-3089.v1.txt, YARN-3089.v2.txt, YARN-3089.v3.txt
>
>
> YARN-2468 added the deletion of individual logs that are aggregated, but this 
> fails to delete log files when the LCE is being used.  The LCE native 
> executable assumes the paths being passed are paths and the delete fails.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3153) Capacity Scheduler max AM resource percentage is mis-used as ratio

2015-02-06 Thread Wangda Tan (JIRA)
Wangda Tan created YARN-3153:


 Summary: Capacity Scheduler max AM resource percentage is mis-used 
as ratio
 Key: YARN-3153
 URL: https://issues.apache.org/jira/browse/YARN-3153
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Wangda Tan
Assignee: Wangda Tan
Priority: Critical


The existing Capacity Scheduler can limit the maximum number of applications 
running within a queue. The config is 
yarn.scheduler.capacity.maximum-am-resource-percent, but in the implementation it 
is actually used as a ratio: it assumes the input will be in \[0,1\]. So a user 
can currently specify values up to 100, which lets AMs use 100x of the queue 
capacity. We should fix that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3153) Capacity Scheduler max AM resource limit for queues is defined as percentage but used as ratio

2015-02-06 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-3153:
-
Summary: Capacity Scheduler max AM resource limit for queues is defined as 
percentage but used as ratio  (was: Capacity Scheduler max AM resource 
percentage is mis-used as ratio)

> Capacity Scheduler max AM resource limit for queues is defined as percentage 
> but used as ratio
> --
>
> Key: YARN-3153
> URL: https://issues.apache.org/jira/browse/YARN-3153
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Critical
>
> The existing Capacity Scheduler can limit the maximum number of applications 
> running within a queue. The config is 
> yarn.scheduler.capacity.maximum-am-resource-percent, but in the implementation 
> it is actually used as a ratio: it assumes the input will be in \[0,1\]. So a 
> user can currently specify values up to 100, which lets AMs use 100x of the 
> queue capacity. We should fix that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3154) Should not upload partial logs for MR jobs or other "short-running' applications

2015-02-06 Thread Xuan Gong (JIRA)
Xuan Gong created YARN-3154:
---

 Summary: Should not upload partial logs for MR jobs or other 
"short-running' applications 
 Key: YARN-3154
 URL: https://issues.apache.org/jira/browse/YARN-3154
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong
Priority: Blocker


Currently, if we run an MR job and do not set the log interval properly, its 
partial logs will be uploaded and then removed from the local filesystem, which 
is not right.

We should only upload partial logs for LRS applications.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3154) Should not upload partial logs for MR jobs or other "short-running' applications

2015-02-06 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309973#comment-14309973
 ] 

Xuan Gong commented on YARN-3154:
-

We can add a parameter to logAggregationContext indicating whether this app is an 
LRS app. Based on this flag, the NM can decide whether it needs to upload partial 
logs for this app.
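Just to illustrate the idea (hypothetical names, not the real 
LogAggregationContext API):

{code}
// Hypothetical sketch of the proposed flag; not actual YARN API.
final class LogAggregationPolicy {
  private final boolean longRunningService;

  LogAggregationPolicy(boolean longRunningService) {
    this.longRunningService = longRunningService;
  }

  // NM-side decision: only LRS apps get partial (rolling) log uploads;
  // short-running apps keep their logs until the app finishes.
  boolean shouldUploadPartialLogs() {
    return longRunningService;
  }
}
{code}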

> Should not upload partial logs for MR jobs or other "short-running' 
> applications 
> -
>
> Key: YARN-3154
> URL: https://issues.apache.org/jira/browse/YARN-3154
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>Priority: Blocker
>
> Currently, if we run an MR job and do not set the log interval properly, its 
> partial logs will be uploaded and then removed from the local filesystem, which 
> is not right.
> We should only upload partial logs for LRS applications.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3153) Capacity Scheduler max AM resource limit for queues is defined as percentage but used as ratio

2015-02-06 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309974#comment-14309974
 ] 

Wangda Tan commented on YARN-3153:
--

We basically have 3 options:
1) Keep the config name (...percentage) and continue to use it as a ratio, adding 
additional checking to make sure it fits in the range \[0,1\].
2) Keep the config name but use it as a percentage; this needs a yarn-default 
update as well, and will have some impact on existing deployments when they upgrade.
3) Change the config name to (...ratio), which would be an incompatible change.

Thoughts? [~vinodkv], [~jianhe]
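For option 1, the extra validation would be something along these lines (a sketch 
only, with a hypothetical class/method name):

{code}
// Hypothetical sketch of option 1's extra validation: reject values outside
// [0,1] instead of silently letting AMs exceed the queue capacity.
final class AmResourcePercentValidator {
  static float validate(float value) {
    if (value < 0.0f || value > 1.0f) {
      throw new IllegalArgumentException(
          "yarn.scheduler.capacity.maximum-am-resource-percent must be in [0,1], got "
              + value);
    }
    return value;
  }
}
{code}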

> Capacity Scheduler max AM resource limit for queues is defined as percentage 
> but used as ratio
> --
>
> Key: YARN-3153
> URL: https://issues.apache.org/jira/browse/YARN-3153
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Critical
>
> The existing Capacity Scheduler can limit the maximum number of applications 
> running within a queue. The config is 
> yarn.scheduler.capacity.maximum-am-resource-percent, but in the implementation 
> it is actually used as a ratio: it assumes the input will be in \[0,1\]. So a 
> user can currently specify values up to 100, which lets AMs use 100x of the 
> queue capacity. We should fix that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3154) Should not upload partial logs for MR jobs or other "short-running' applications

2015-02-06 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309986#comment-14309986
 ] 

Jason Lowe commented on YARN-3154:
--

Note that even LRS apps have issues if they don't do their own log rolling. If I 
remember correctly, stdout and stderr files are set up by the container executor, 
and we'll have partial logs uploaded and then deleted from the local filesystem, 
losing any subsequent logs written to these files or to any other files that 
aren't explicitly log-rolled and filtered via a log aggregation context.

IMHO we need to make sure we do _not_ delete anything for a running app 
_unless_ it has a log aggregation context filter to tell us what is safe to 
upload and delete.  Without that information, we cannot tell if a log file is 
"live" and therefore going to be deleted too early.

> Should not upload partial logs for MR jobs or other "short-running' 
> applications 
> -
>
> Key: YARN-3154
> URL: https://issues.apache.org/jira/browse/YARN-3154
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>Priority: Blocker
>
> Currently, if we run an MR job and do not set the log interval properly, its 
> partial logs will be uploaded and then removed from the local filesystem, which 
> is not right.
> We should only upload partial logs for LRS applications.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3143) RM Apps REST API can return NPE or entries missing id and other fields

2015-02-06 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310004#comment-14310004
 ] 

Jason Lowe commented on YARN-3143:
--

My apologies, I also meant to thank Eric for the original review!

> RM Apps REST API can return NPE or entries missing id and other fields
> --
>
> Key: YARN-3143
> URL: https://issues.apache.org/jira/browse/YARN-3143
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 2.5.2
>Reporter: Kendall Thrapp
>Assignee: Jason Lowe
> Fix For: 2.7.0
>
> Attachments: YARN-3143.001.patch
>
>
> I'm seeing intermittent null pointer exceptions being returned by
> the YARN Apps REST API.
> For example:
> {code}
> http://{cluster}:{port}/ws/v1/cluster/apps?finalStatus=UNDEFINED
> {code}
> JSON Response was:
> {code}
> {"RemoteException":{"exception":"NullPointerException","javaClassName":"java.lang.NullPointerException"}}
> {code}
> At a glance appears to be only when we query for unfinished apps (i.e. 
> finalStatus=UNDEFINED).  
> Possibly related, when I do get back a list of apps, sometimes one or more of 
> the apps will be missing most of the fields, like id, name, user, etc., and 
> the fields that are present all have zero for the value.  
> For example:
> {code}
> {"progress":0.0,"clusterId":0,"applicationTags":"","startedTime":0,"finishedTime":0,"elapsedTime":0,"allocatedMB":0,"allocatedVCores":0,"runningContainers":0,"preemptedResourceMB":0,"preemptedResourceVCores":0,"numNonAMContainerPreempted":0,"numAMContainerPreempted":0}
> {code}
> Let me know if there's any other information I can provide to help debug.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3153) Capacity Scheduler max AM resource limit for queues is defined as percentage but used as ratio

2015-02-06 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310023#comment-14310023
 ] 

Jian He commented on YARN-3153:
---

As the 
[doc|http://hadoop.apache.org/docs/r2.3.0/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html] 
already explicitly mentions "specified as float", to keep it compatible we may 
choose option 1).

> Capacity Scheduler max AM resource limit for queues is defined as percentage 
> but used as ratio
> --
>
> Key: YARN-3153
> URL: https://issues.apache.org/jira/browse/YARN-3153
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Critical
>
> The existing Capacity Scheduler can limit the maximum number of applications 
> running within a queue. The config is 
> yarn.scheduler.capacity.maximum-am-resource-percent, but it is actually used as 
> a ratio: the implementation assumes the input will be in \[0,1\]. So a user can 
> currently specify a value up to 100, which lets AMs use 100x the queue capacity. 
> We should fix that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3041) create the ATS entity/event API

2015-02-06 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310026#comment-14310026
 ] 

Zhijie Shen commented on YARN-3041:
---

bq. IMO it might make sense to define all YARN system entities as explicit types

Makes sense to me.

> create the ATS entity/event API
> ---
>
> Key: YARN-3041
> URL: https://issues.apache.org/jira/browse/YARN-3041
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Robert Kanter
> Attachments: YARN-3041.preliminary.001.patch
>
>
> Per design in YARN-2928, create the ATS entity and events API.
> Also, as part of this JIRA, create YARN system entities (e.g. cluster, user, 
> flow, flow run, YARN app, ...).
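
If the system entities do become explicit types, the simplest shape is probably 
an enum over the entities listed above; a sketch only, with illustrative names 
rather than anything from the preliminary patch:

{code}
/**
 * Sketch: explicit identifiers for the YARN system entities named in this
 * JIRA, instead of free-form type strings. Names are illustrative only and
 * the list would grow as further system entities are defined.
 */
public enum SystemEntityType {
  CLUSTER,
  USER,
  FLOW,
  FLOW_RUN,
  YARN_APPLICATION
}
{code}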



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3153) Capacity Scheduler max AM resource limit for queues is defined as percentage but used as ratio

2015-02-06 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310029#comment-14310029
 ] 

Vinod Kumar Vavilapalli commented on YARN-3153:
---

This is a hard one to solve.

+1 for option (1) for now. In addition to that, we can choose to deprecate this 
configuration completely and introduce a new one with the right semantics but 
with a name change: say yarn.scheduler.capacity.maximum-am-resources-percentage.

> Capacity Scheduler max AM resource limit for queues is defined as percentage 
> but used as ratio
> --
>
> Key: YARN-3153
> URL: https://issues.apache.org/jira/browse/YARN-3153
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Critical
>
> The existing Capacity Scheduler can limit the maximum number of applications 
> running within a queue. The config is 
> yarn.scheduler.capacity.maximum-am-resource-percent, but it is actually used as 
> a ratio: the implementation assumes the input will be in \[0,1\]. So a user can 
> currently specify a value up to 100, which lets AMs use 100x the queue capacity. 
> We should fix that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3154) Should not upload partial logs for MR jobs or other "short-running' applications

2015-02-06 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-3154:
--
Target Version/s: 2.7.0, 2.6.1

> Should not upload partial logs for MR jobs or other "short-running' 
> applications 
> -
>
> Key: YARN-3154
> URL: https://issues.apache.org/jira/browse/YARN-3154
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>Priority: Blocker
>
> Currently, if we are running an MR job and we do not set the log interval 
> properly, its partial logs will be uploaded and then removed from the 
> local filesystem, which is not right.
> We should only upload partial logs for LRS (long-running service) applications.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3153) Capacity Scheduler max AM resource limit for queues is defined as percentage but used as ratio

2015-02-06 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310032#comment-14310032
 ] 

Wangda Tan commented on YARN-3153:
--

Thanks for your feedback, I agree we should do 1) first. I think deprecating 
plus renaming is not graceful enough; users will get confused when they find one 
option deprecated but the system suggests a very similar one.

Will upload a patch for #1 shortly.
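
For reference, a minimal sketch of what option 1) amounts to, keeping the float 
semantics and rejecting anything outside \[0, 1\]; this illustrates the check 
only and is not the patch:

{code}
import org.apache.hadoop.conf.Configuration;

public final class MaxAmResourceCheck {

  public static final String MAX_AM_RESOURCE_PERCENT =
      "yarn.scheduler.capacity.maximum-am-resource-percent";
  public static final float DEFAULT_MAX_AM_RESOURCE_PERCENT = 0.1f;

  /** Read the value as a ratio and fail fast if it is outside [0, 1]. */
  public static float getValidatedMaxAmResourcePercent(Configuration conf) {
    float value = conf.getFloat(MAX_AM_RESOURCE_PERCENT,
        DEFAULT_MAX_AM_RESOURCE_PERCENT);
    if (value < 0.0f || value > 1.0f) {
      throw new IllegalArgumentException(MAX_AM_RESOURCE_PERCENT
          + " must be a ratio in [0, 1], but was " + value);
    }
    return value;
  }
}
{code}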

> Capacity Scheduler max AM resource limit for queues is defined as percentage 
> but used as ratio
> --
>
> Key: YARN-3153
> URL: https://issues.apache.org/jira/browse/YARN-3153
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Critical
>
> The existing Capacity Scheduler can limit the maximum number of applications 
> running within a queue. The config is 
> yarn.scheduler.capacity.maximum-am-resource-percent, but it is actually used as 
> a ratio: the implementation assumes the input will be in \[0,1\]. So a user can 
> currently specify a value up to 100, which lets AMs use 100x the queue capacity. 
> We should fix that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1126) Add validation of users input nodes-states options to nodes CLI

2015-02-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310033#comment-14310033
 ] 

Hadoop QA commented on YARN-1126:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12697127/YARN-905-addendum.patch
  against trunk revision 5c79439.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client:

  org.apache.hadoop.yarn.client.TestResourceTrackerOnHA
  
org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6540//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6540//console

This message is automatically generated.

> Add validation of users input nodes-states options to nodes CLI
> ---
>
> Key: YARN-1126
> URL: https://issues.apache.org/jira/browse/YARN-1126
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Wei Yan
>Assignee: Wei Yan
> Attachments: YARN-905-addendum.patch
>
>
> Follow the discussion in YARN-905.
> (1) case-insensitive checks for "all".
> (2) validation of user input: exit with a non-zero code and print all valid 
> states when the user gives an invalid state.
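
For what it's worth, the check itself can be tiny; a sketch under the assumption 
that the CLI parses the -states argument into the standard NodeState enum (this 
is not the attached patch):

{code}
import java.util.Arrays;
import java.util.EnumSet;
import java.util.Locale;
import org.apache.hadoop.yarn.api.records.NodeState;

public final class NodeStatesArg {

  /**
   * Parse one -states token: "all" (any case) selects every state; anything
   * else must name a NodeState, or we print the valid values and exit non-zero.
   */
  public static EnumSet<NodeState> parseOrExit(String token) {
    String normalized = token.trim().toUpperCase(Locale.ENGLISH);
    if ("ALL".equals(normalized)) {
      return EnumSet.allOf(NodeState.class);
    }
    try {
      return EnumSet.of(NodeState.valueOf(normalized));
    } catch (IllegalArgumentException e) {
      System.err.println("Invalid node state: " + token
          + ". Valid states are: " + Arrays.toString(NodeState.values()));
      System.exit(1);
      return null; // unreachable
    }
  }
}
{code}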



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2990) FairScheduler's delay-scheduling always waits for node-local and rack-local delays, even for off-rack-only requests

2015-02-06 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310034#comment-14310034
 ] 

Sandy Ryza commented on YARN-2990:
--

+1.  Sorry for the delay in getting to this.

> FairScheduler's delay-scheduling always waits for node-local and rack-local 
> delays, even for off-rack-only requests
> ---
>
> Key: YARN-2990
> URL: https://issues.apache.org/jira/browse/YARN-2990
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.6.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: yarn-2990-0.patch, yarn-2990-1.patch, yarn-2990-2.patch, 
> yarn-2990-test.patch
>
>
> Looking at the FairScheduler, it appears the node/rack locality delays are 
> used for all requests, even those that are only off-rack. 
> More details in comments. 
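
The gist of the change can be sketched in a few lines (hypothetical helper, not 
the FairScheduler code or the attached patches): when a request has no node- or 
rack-local asks at all, there is nothing for the delay to buy, so it should not 
be applied.

{code}
/** Hypothetical helper: decide whether the locality delay applies at all. */
public final class LocalityDelaySketch {

  public static boolean shouldApplyLocalityDelay(boolean hasNodeLocalAsks,
      boolean hasRackLocalAsks) {
    // An off-rack-only request (only "any" asks outstanding) gains nothing
    // from waiting for a node-local or rack-local opportunity, so skip the
    // node/rack delays and schedule it immediately.
    return hasNodeLocalAsks || hasRackLocalAsks;
  }
}
{code}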



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3154) Should not upload partial logs for MR jobs or other "short-running' applications

2015-02-06 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310042#comment-14310042
 ] 

Vinod Kumar Vavilapalli commented on YARN-3154:
---

Does having two separate notions work?
 - Today's LogAggregationContext's include/exclude patterns for the app to 
indicate which log files need to be aggregated explicitly at app finish. This 
works for regular apps.
 - A new include/exclude pattern for the app to indicate which log files need to be 
aggregated in a rolling fashion.
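
A minimal sketch of how the two notions could sit side by side in the context 
object; the field names here are hypothetical, not the proposed API:

{code}
/** Sketch only: a log aggregation context carrying both pattern pairs. */
public final class LogAggregationPatterns {

  // Applied once, when the application finishes (today's behaviour).
  private final String includePattern;
  private final String excludePattern;

  // Applied to the periodic, rolling uploads while the app is still running.
  // If these are not set, nothing is uploaded or deleted before app finish.
  private final String rolledLogsIncludePattern;
  private final String rolledLogsExcludePattern;

  public LogAggregationPatterns(String includePattern, String excludePattern,
      String rolledLogsIncludePattern, String rolledLogsExcludePattern) {
    this.includePattern = includePattern;
    this.excludePattern = excludePattern;
    this.rolledLogsIncludePattern = rolledLogsIncludePattern;
    this.rolledLogsExcludePattern = rolledLogsExcludePattern;
  }

  public String getIncludePattern() { return includePattern; }
  public String getExcludePattern() { return excludePattern; }
  public String getRolledLogsIncludePattern() { return rolledLogsIncludePattern; }
  public String getRolledLogsExcludePattern() { return rolledLogsExcludePattern; }
}
{code}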

> Should not upload partial logs for MR jobs or other "short-running' 
> applications 
> -
>
> Key: YARN-3154
> URL: https://issues.apache.org/jira/browse/YARN-3154
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>Priority: Blocker
>
> Currently, if we are running an MR job and we do not set the log interval 
> properly, its partial logs will be uploaded and then removed from the 
> local filesystem, which is not right.
> We should only upload partial logs for LRS (long-running service) applications.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3153) Capacity Scheduler max AM resource limit for queues is defined as percentage but used as ratio

2015-02-06 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310045#comment-14310045
 ] 

Vinod Kumar Vavilapalli commented on YARN-3153:
---

We could instead pick a radically different name. Or maybe two radically 
different ones: one for the ratio and one for the percentage.

> Capacity Scheduler max AM resource limit for queues is defined as percentage 
> but used as ratio
> --
>
> Key: YARN-3153
> URL: https://issues.apache.org/jira/browse/YARN-3153
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Critical
>
> The existing Capacity Scheduler can limit the maximum number of applications 
> running within a queue. The config is 
> yarn.scheduler.capacity.maximum-am-resource-percent, but it is actually used as 
> a ratio: the implementation assumes the input will be in \[0,1\]. So a user can 
> currently specify a value up to 100, which lets AMs use 100x the queue capacity. 
> We should fix that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3153) Capacity Scheduler max AM resource limit for queues is defined as percentage but used as ratio

2015-02-06 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310049#comment-14310049
 ] 

Wangda Tan commented on YARN-3153:
--

Good suggestion. I think we can deprecate the percent one, make sure its value 
is within \[0, 1\], and use a ratio/factor name for the new option. Sounds good?
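
Hadoop's Configuration already has a deprecation mechanism that makes this kind 
of rename painless; a rough sketch, with a made-up new key name:

{code}
import org.apache.hadoop.conf.Configuration;

public final class MaxAmResourceKeyRename {

  /** Existing key, still honoured but flagged as deprecated. */
  public static final String OLD_KEY =
      "yarn.scheduler.capacity.maximum-am-resource-percent";
  /** Hypothetical replacement key with the ratio semantics spelled out. */
  public static final String NEW_KEY =
      "yarn.scheduler.capacity.maximum-am-resource-ratio";

  static {
    // Values set under the old key are transparently forwarded to the new
    // key, and a deprecation warning is logged when the old key is used.
    Configuration.addDeprecation(OLD_KEY, NEW_KEY);
  }

  /** Read the new key and enforce the [0, 1] range. */
  public static float read(Configuration conf) {
    float value = conf.getFloat(NEW_KEY, 0.1f);
    if (value < 0.0f || value > 1.0f) {
      throw new IllegalArgumentException(
          NEW_KEY + " must be a ratio in [0, 1], but was " + value);
    }
    return value;
  }
}
{code}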

> Capacity Scheduler max AM resource limit for queues is defined as percentage 
> but used as ratio
> --
>
> Key: YARN-3153
> URL: https://issues.apache.org/jira/browse/YARN-3153
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Critical
>
> In existing Capacity Scheduler, it can limit max applications running within 
> a queue. The config is yarn.scheduler.capacity.maximum-am-resource-percent, 
> but actually, it is used as "ratio", in implementation, it assumes input will 
> be \[0,1\]. So now user can specify it up to 100, which makes AM can use 100x 
> of queue capacity. We should fix that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3143) RM Apps REST API can return NPE or entries missing id and other fields

2015-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310055#comment-14310055
 ] 

Hudson commented on YARN-3143:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7045 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7045/])
YARN-3143. RM Apps REST API can return NPE or entries missing id and other 
fields. Contributed by Jason Lowe (jlowe: rev 
da2fb2bc46bddf42d79c6d7664cbf0311973709e)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServices.java


> RM Apps REST API can return NPE or entries missing id and other fields
> --
>
> Key: YARN-3143
> URL: https://issues.apache.org/jira/browse/YARN-3143
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 2.5.2
>Reporter: Kendall Thrapp
>Assignee: Jason Lowe
> Fix For: 2.7.0
>
> Attachments: YARN-3143.001.patch
>
>
> I'm seeing intermittent null pointer exceptions being returned by
> the YARN Apps REST API.
> For example:
> {code}
> http://{cluster}:{port}/ws/v1/cluster/apps?finalStatus=UNDEFINED
> {code}
> JSON Response was:
> {code}
> {"RemoteException":{"exception":"NullPointerException","javaClassName":"java.lang.NullPointerException"}}
> {code}
> At a glance, this appears to happen only when we query for unfinished apps (i.e. 
> finalStatus=UNDEFINED).  
> Possibly related: when I do get back a list of apps, sometimes one or more of 
> the apps will be missing most of the fields, like id, name, user, etc., and 
> the fields that are present are all zero.  
> For example:
> {code}
> {"progress":0.0,"clusterId":0,"applicationTags":"","startedTime":0,"finishedTime":0,"elapsedTime":0,"allocatedMB":0,"allocatedVCores":0,"runningContainers":0,"preemptedResourceMB":0,"preemptedResourceVCores":0,"numNonAMContainerPreempted":0,"numAMContainerPreempted":0}
> {code}
> Let me know if there's any other information I can provide to help debug.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

