[jira] [Updated] (YARN-2807) Option --forceactive does not work as described in usage of yarn rmadmin -transitionToActive
[ https://issues.apache.org/jira/browse/YARN-2807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated YARN-2807: --- Attachment: YARN-2807.2.patch Thanks for the comment [~ajisakaa]. I rethought this and left --forcemanual undocumented in the usage of HAAdmin, as intended, because it is dangerous, especially for HDFS. I just added a description of --forcemanual to the YARN site documentation. Option --forceactive does not work as described in usage of yarn rmadmin -transitionToActive Key: YARN-2807 URL: https://issues.apache.org/jira/browse/YARN-2807 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Wangda Tan Priority: Critical Attachments: YARN-2807.1.patch, YARN-2807.2.patch Currently the help message of yarn rmadmin -transitionToActive is: {code} transitionToActive: incorrect number of arguments Usage: HAAdmin [-transitionToActive serviceId [--forceactive]] {code} But --forceactive does not work as expected. When transitioning the RM state with --forceactive: {code} yarn rmadmin -transitionToActive rm2 --forceactive Automatic failover is enabled for org.apache.hadoop.yarn.client.RMHAServiceTarget@64c9f31e Refusing to manually manage HA state, since it may cause a split-brain scenario or other incorrect state. If you are very sure you know what you are doing, please specify the forcemanual flag. {code} As shown above, we still cannot transitionToActive when automatic failover is enabled, even with --forceactive. The option that does work is {{--forcemanual}}, yet nothing in the usage describes it. I think we should fix this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
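For illustration, a hedged sketch of the invocations involved. Only the refused --forceactive command and the hint to use the forcemanual flag appear in the report above; the placement of --forcemanual on the command line is an assumption and may differ by version.
{code}
# refused when automatic failover is enabled (output quoted in the description)
yarn rmadmin -transitionToActive rm2 --forceactive

# hypothetical invocation using the flag the error message asks for;
# --forcemanual is deliberately left out of the printed usage
yarn rmadmin -transitionToActive rm2 --forcemanual
{code}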
[jira] [Updated] (YARN-2807) Option --forceactive does not work as described in usage of yarn rmadmin -transitionToActive
[ https://issues.apache.org/jira/browse/YARN-2807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated YARN-2807: --- Priority: Minor (was: Critical) Option --forceactive does not work as described in usage of yarn rmadmin -transitionToActive Key: YARN-2807 URL: https://issues.apache.org/jira/browse/YARN-2807 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Wangda Tan Priority: Minor Attachments: YARN-2807.1.patch, YARN-2807.2.patch Currently the help message of yarn rmadmin -transitionToActive is: {code} transitionToActive: incorrect number of arguments Usage: HAAdmin [-transitionToActive serviceId [--forceactive]] {code} But --forceactive does not work as expected. When transitioning the RM state with --forceactive: {code} yarn rmadmin -transitionToActive rm2 --forceactive Automatic failover is enabled for org.apache.hadoop.yarn.client.RMHAServiceTarget@64c9f31e Refusing to manually manage HA state, since it may cause a split-brain scenario or other incorrect state. If you are very sure you know what you are doing, please specify the forcemanual flag. {code} As shown above, we still cannot transitionToActive when automatic failover is enabled, even with --forceactive. The option that does work is {{--forcemanual}}, yet nothing in the usage describes it. I think we should fix this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2427) Add support for moving apps between queues in RM web services
[ https://issues.apache.org/jira/browse/YARN-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14265860#comment-14265860 ] Hadoop QA commented on YARN-2427: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12690283/apache-yarn-2427.4.patch against trunk revision 4cd66f7. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6251//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6251//console This message is automatically generated. Add support for moving apps between queues in RM web services - Key: YARN-2427 URL: https://issues.apache.org/jira/browse/YARN-2427 Project: Hadoop YARN Issue Type: New Feature Components: resourcemanager Reporter: Varun Vasudev Assignee: Varun Vasudev Attachments: apache-yarn-2427.0.patch, apache-yarn-2427.1.patch, apache-yarn-2427.2.patch, apache-yarn-2427.3.patch, apache-yarn-2427.4.patch Support for moving apps from one queue to another is now present in CapacityScheduler and FairScheduler. We should expose the functionality via RM web services as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2997) NM keeps sending finished containers to RM until app is finished
[ https://issues.apache.org/jira/browse/YARN-2997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14265865#comment-14265865 ] Hadoop QA commented on YARN-2997: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12690286/YARN-2997.4.patch against trunk revision 4cd66f7. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:red}-1 eclipse:eclipse{color}. The patch failed to build with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6252//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6252//console This message is automatically generated. NM keeps sending finished containers to RM until app is finished Key: YARN-2997 URL: https://issues.apache.org/jira/browse/YARN-2997 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.6.0 Reporter: Chengbing Liu Assignee: Chengbing Liu Attachments: YARN-2997.2.patch, YARN-2997.3.patch, YARN-2997.4.patch, YARN-2997.patch We have seen in RM log a lot of {quote} INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Null container completed... {quote} It is caused by NM sending completed containers repeatedly until the app is finished. On the RM side, the container is already released, hence {{getRMContainer}} returns null. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2997) NM keeps sending finished containers to RM until app is finished
[ https://issues.apache.org/jira/browse/YARN-2997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengbing Liu updated YARN-2997: Attachment: YARN-2997.4.patch Updated patch. The testing-only method is removed. {{pendingCompletedContainers.clear()}} is added at the end of {{removeOrTrackCompletedContainersFromContext}}, and also in the RESYNC section, to clear the cache so that these outdated container statuses will not be reported to the restarted RM. NM keeps sending finished containers to RM until app is finished Key: YARN-2997 URL: https://issues.apache.org/jira/browse/YARN-2997 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.6.0 Reporter: Chengbing Liu Assignee: Chengbing Liu Attachments: YARN-2997.2.patch, YARN-2997.3.patch, YARN-2997.4.patch, YARN-2997.patch We have seen a lot of the following in the RM log: {quote} INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Null container completed... {quote} It is caused by the NM sending completed containers repeatedly until the app is finished. On the RM side, the container has already been released, hence {{getRMContainer}} returns null. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
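As a rough illustration of the bookkeeping described in the comment above, here is a minimal sketch. It is not the actual NodeStatusUpdater code; the class, method signatures, and types are assumptions, and only the {{pendingCompletedContainers}} cache and the points at which it is cleared come from the comment.
{code}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Illustrative sketch only: completed containers are cached in
 * pendingCompletedContainers so that repeated heartbeats do not keep
 * re-sending the same finished containers, and the cache is cleared once the
 * containers have been reported (and on RESYNC, so stale statuses never reach
 * a restarted ResourceManager).
 */
public class CompletedContainerReporter {

  // completed containers already handed to the RM, keyed by container id
  private final Map<String, String> pendingCompletedContainers = new HashMap<>();

  /** Pick the statuses to send in this heartbeat, skipping ones already sent. */
  List<String> removeOrTrackCompletedContainersFromContext(Map<String, String> completedInContext) {
    List<String> toReport = new ArrayList<>();
    for (Map.Entry<String, String> e : completedInContext.entrySet()) {
      if (!pendingCompletedContainers.containsKey(e.getKey())) {
        pendingCompletedContainers.put(e.getKey(), e.getValue());
        toReport.add(e.getValue());
      }
    }
    return toReport;
  }

  /** Called once the reported containers have been removed from the NM context. */
  void clearPending() {
    // the patch clears the cache at the end of
    // removeOrTrackCompletedContainersFromContext for the same reason
    pendingCompletedContainers.clear();
  }

  /** On RESYNC the cache is cleared so outdated statuses are not re-sent. */
  void onResync() {
    pendingCompletedContainers.clear();
  }
}
{code}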
[jira] [Commented] (YARN-2996) Refine some fs operations in FileSystemRMStateStore to improve performance
[ https://issues.apache.org/jira/browse/YARN-2996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14265835#comment-14265835 ] Yi Liu commented on YARN-2996: -- The test failure and the findbugs warning are not related. Refine some fs operations in FileSystemRMStateStore to improve performance -- Key: YARN-2996 URL: https://issues.apache.org/jira/browse/YARN-2996 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Yi Liu Assignee: Yi Liu Attachments: YARN-2996.001.patch, YARN-2996.002.patch In {{FileSystemRMStateStore}}, we can refine some fs operations to improve performance: *1.* Several places invoke {{fs.exists}} and then {{fs.getFileStatus}}; we can merge them to save one RPC call: {code} if (fs.exists(versionNodePath)) { FileStatus status = fs.getFileStatus(versionNodePath); {code} *2.* {code} protected void updateFile(Path outputPath, byte[] data) throws Exception { Path newPath = new Path(outputPath.getParent(), outputPath.getName() + ".new"); // use writeFile to make sure .new file is created atomically writeFile(newPath, data); replaceFile(newPath, outputPath); } {code} The {{updateFile}} method is not ideal either: it writes the file to _output\_file_.tmp, renames it to _output\_file_.new, and then renames that to _output\_file_; we can eliminate one rename operation. Also, there is one unnecessary import that we can remove. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
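A minimal sketch of the first refinement, assuming a helper method of our own naming; it only shows the standard FileSystem idiom of calling {{getFileStatus}} once and treating {{FileNotFoundException}} as a missing node, instead of issuing a separate {{fs.exists}} RPC first.
{code}
import java.io.FileNotFoundException;
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class VersionNodeReader {
  /**
   * Sketch: instead of fs.exists() followed by fs.getFileStatus() (two RPCs),
   * call getFileStatus() once and treat FileNotFoundException as
   * "the node does not exist".
   */
  static FileStatus getFileStatusIfExists(FileSystem fs, Path versionNodePath)
      throws IOException {
    try {
      return fs.getFileStatus(versionNodePath);   // single RPC
    } catch (FileNotFoundException e) {
      return null;                                // replaces the exists() check
    }
  }
}
{code}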
[jira] [Updated] (YARN-3010) Fix recent findbug issue in AbstractYarnScheduler
[ https://issues.apache.org/jira/browse/YARN-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated YARN-3010: - Description: A new findbugs issue was reported recently in the latest trunk: {quote} IS Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.rmContext; locked 91% of time {quote} https://issues.apache.org/jira/browse/YARN-2996?focusedCommentId=14265760page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14265760 https://builds.apache.org/job/PreCommit-YARN-Build/6249//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html was: A new findbugs issue was reported recently in the latest trunk: {quote} IS Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.rmContext; locked 91% of time {quote} https://builds.apache.org/job/PreCommit-YARN-Build/6249//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Fix recent findbug issue in AbstractYarnScheduler - Key: YARN-3010 URL: https://issues.apache.org/jira/browse/YARN-3010 Project: Hadoop YARN Issue Type: Bug Reporter: Yi Liu Assignee: Yi Liu Priority: Minor Attachments: YARN-3010.001.patch A new findbugs issue was reported recently in the latest trunk: {quote} IS Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.rmContext; locked 91% of time {quote} https://issues.apache.org/jira/browse/YARN-2996?focusedCommentId=14265760page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14265760 https://builds.apache.org/job/PreCommit-YARN-Build/6249//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
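For context, a hedged illustration of the pattern behind this kind of FindBugs warning. This is not the actual AbstractYarnScheduler change; it only shows the generic shape of a field that is locked on most accesses and one typical way to make the accesses consistent.
{code}
/**
 * Illustration only: an "Inconsistent synchronization ... locked 91% of time"
 * warning flags a field guarded by the lock in most places but read or written
 * without it somewhere. Routing every access through synchronized accessors
 * (or declaring the field volatile when only the reference is published) is a
 * common way to make the accesses consistent.
 */
public class SchedulerContextHolder {

  private Object rmContext;   // stand-in for the RMContext reference

  public synchronized void setRMContext(Object rmContext) {
    this.rmContext = rmContext;
  }

  // an unsynchronized variant of this getter is what would trigger the warning
  public synchronized Object getRMContext() {
    return rmContext;
  }
}
{code}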
[jira] [Commented] (YARN-2616) Add CLI client to the registry to list/view entries
[ https://issues.apache.org/jira/browse/YARN-2616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14265959#comment-14265959 ] Steve Loughran commented on YARN-2616: -- Thanks for doing the tests; without them the code that was checked in earlier doesn't officially exist. There's an outstanding patch, YARN-2683, which I'm trying to get in; it moves all registry config settings to core-default and documents the registry in the hadoop site docs. This impacts the CLI in a couple of ways: *config* Once YARN-2683 is in, all registry options will move to the core config, not the yarn config; this lets the registry run without any other YARN dependencies. Can you switch to using the basic {{Configuration}}? *docs* The YARN-2683 patch will provide the structure for adding documentation on the CLI. If we can get that patch in, it'll be easy to round off the CLI with a basic manpage. h3. Testing *test assertions* There are lots of test operations like {code} result = cli.run(new String[] { "ls", NonSlashPath }); assertEquals(-1, result); {code} This could be factored out into some method assertResult(cli, int code, String... args) which includes the arg list on a failure. Minor: lots of tabs in the source. Indent with (two) spaces please. *failure testing* Can you add some tests with invalid bindings and see how the CLI fails? e.g. no valid ZK host/port. Add CLI client to the registry to list/view entries --- Key: YARN-2616 URL: https://issues.apache.org/jira/browse/YARN-2616 Project: Hadoop YARN Issue Type: Sub-task Components: client Reporter: Steve Loughran Assignee: Akshay Radia Attachments: YARN-2616-003.patch, yarn-2616-v1.patch, yarn-2616-v2.patch, yarn-2616-v4.patch, yarn-2616-v5.patch, yarn-2616-v6.patch registry needs a CLI interface -- This message was sent by Atlassian JIRA (v6.3.4#6332)
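A sketch of the helper suggested above, assuming JUnit assertions and a placeholder CLI interface; only the {{assertResult(cli, int code, String... args)}} idea and the requirement to include the argument list in the failure message come from the comment.
{code}
import static org.junit.Assert.assertEquals;

import java.util.Arrays;

/**
 * Sketch of the suggested test helper; the enclosing class and the Cli type
 * are placeholders for the registry CLI under test.
 */
public class RegistryCliTestHelper {

  interface Cli {            // stand-in for the registry CLI object in the tests
    int run(String[] args);
  }

  static void assertResult(Cli cli, int expectedCode, String... args) {
    int result = cli.run(args);
    // the argument list is part of the message so a failure is self-describing
    assertEquals("Unexpected exit code for: " + Arrays.toString(args),
        expectedCode, result);
  }
}
{code}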
[jira] [Updated] (YARN-3010) Fix recent findbug issue in AbstractYarnScheduler
[ https://issues.apache.org/jira/browse/YARN-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated YARN-3010: - Attachment: YARN-3010.002.patch Updated patch. Fix recent findbug issue in AbstractYarnScheduler - Key: YARN-3010 URL: https://issues.apache.org/jira/browse/YARN-3010 Project: Hadoop YARN Issue Type: Bug Reporter: Yi Liu Assignee: Yi Liu Priority: Minor Attachments: YARN-3010.001.patch, YARN-3010.002.patch A new findbugs issue was reported recently in the latest trunk: {quote} IS Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.rmContext; locked 91% of time {quote} https://issues.apache.org/jira/browse/YARN-2996?focusedCommentId=14265760page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14265760 https://builds.apache.org/job/PreCommit-YARN-Build/6249//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2360) Fair Scheduler: Display dynamic fair share for queues on the scheduler page
[ https://issues.apache.org/jira/browse/YARN-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14265977#comment-14265977 ] Hudson commented on YARN-2360: -- FAILURE: Integrated in Hadoop-Yarn-trunk #799 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/799/]) Move YARN-2360 from 2.6 to 2.7 in CHANGES.txt (kasha: rev 41d72cbd48e6df7be3d177eaf04d73e88cf38381) * hadoop-yarn-project/CHANGES.txt Fair Scheduler: Display dynamic fair share for queues on the scheduler page --- Key: YARN-2360 URL: https://issues.apache.org/jira/browse/YARN-2360 Project: Hadoop YARN Issue Type: New Feature Components: fairscheduler Reporter: Ashwin Shankar Assignee: Ashwin Shankar Fix For: 2.7.0 Attachments: Screen Shot 2014-07-28 at 1.12.19 PM.png, Screen_Shot_v3.png, Screen_Shot_v4.png, Screen_Shot_v5.png, YARN-2360-v1.txt, YARN-2360-v2.txt, YARN-2360-v3.patch, YARN-2360-v4.patch, YARN-2360-v5.patch, yarn-2360-6.patch Based on the discussion in YARN-2026, we'd like to display dynamic fair share for queues on the scheduler page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2958) RMStateStore seems to unnecessarily and wrongly store sequence number separately
[ https://issues.apache.org/jira/browse/YARN-2958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14265982#comment-14265982 ] Hudson commented on YARN-2958: -- FAILURE: Integrated in Hadoop-Yarn-trunk #799 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/799/]) YARN-2958. Made RMStateStore not update the last sequence number when updating the delegation token. Contributed by Varun Saxena. (zjshen: rev 562a701945be3a672f9cb5a52cc6db2c1589ba2b) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/NullRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStoreTestBase.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/LeveldbRMStateStore.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/RMDelegationTokenSecretManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStoreRMDTEvent.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/MemoryRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStore.java RMStateStore seems to unnecessarily and wrongly store sequence number separately Key: YARN-2958 URL: https://issues.apache.org/jira/browse/YARN-2958 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Zhijie Shen Assignee: Varun Saxena Priority: Blocker Fix For: 2.7.0 Attachments: YARN-2958.001.patch, YARN-2958.002.patch, YARN-2958.003.patch, YARN-2958.004.patch It seems that RMStateStore updates last sequence number when storing or updating each individual DT, to recover the latest sequence number when RM restarting. First, the current logic seems to be problematic: {code} public synchronized void updateRMDelegationTokenAndSequenceNumber( RMDelegationTokenIdentifier rmDTIdentifier, Long renewDate, int latestSequenceNumber) { if(isFencedState()) { LOG.info(State store is in Fenced state. 
Can't update RM Delegation Token."); return; } try { updateRMDelegationTokenAndSequenceNumberInternal(rmDTIdentifier, renewDate, latestSequenceNumber); } catch (Exception e) { notifyStoreOperationFailed(e); } } {code} {code} @Override protected void updateStoredToken(RMDelegationTokenIdentifier id, long renewDate) { try { LOG.info("updating RMDelegation token with sequence number: " + id.getSequenceNumber()); rmContext.getStateStore().updateRMDelegationTokenAndSequenceNumber(id, renewDate, id.getSequenceNumber()); } catch (Exception e) { LOG.error("Error in updating persisted RMDelegationToken with sequence number: " + id.getSequenceNumber()); ExitUtil.terminate(1, e); } } {code} According to the code above, the last sequence number is updated in the store even when a DT is merely renewed, which is wrong. For example, we have the following sequence: 1. Get DT 1 (seq = 1) 2. Get DT 2 (seq = 2) 3. Renew DT 1 (seq = 1) 4. Restart RM The stored and then recovered last sequence number is 1, so the next DT created after the RM restarts will conflict with DT 2 on sequence number. Second, the aforementioned bug doesn't actually happen, because the recovered last sequence number is later overwritten by the correct one. {code} public void recover(RMState rmState) throws Exception { LOG.info("recovering RMDelegationTokenSecretManager."); //
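To illustrate the last point, the following is a hedged sketch, not the real {{RMDelegationTokenSecretManager.recover()}}: it only shows why the stale persisted counter is harmless, namely that recovery can re-derive the correct value by taking the maximum sequence number among the recovered tokens.
{code}
import java.util.Map;

/**
 * Sketch only: recovery ignores a stale persisted "last sequence number" by
 * folding in the sequence numbers of the tokens that are actually recovered.
 */
public class SequenceNumberRecovery {

  static int recoverSequenceNumber(int persistedLastSequenceNumber,
      Map<Integer, Long> recoveredTokenSeqToRenewDate) {
    int seq = persistedLastSequenceNumber;
    for (int tokenSeq : recoveredTokenSeqToRenewDate.keySet()) {
      // the largest sequence number among recovered tokens overrides the
      // possibly stale value that was stored separately
      seq = Math.max(seq, tokenSeq);
    }
    return seq;
  }
}
{code}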
[jira] [Commented] (YARN-2574) Add support for FairScheduler to the ReservationSystem
[ https://issues.apache.org/jira/browse/YARN-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14265978#comment-14265978 ] Hudson commented on YARN-2574: -- FAILURE: Integrated in Hadoop-Yarn-trunk #799 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/799/]) YARN-2881. [YARN-2574] Implement PlanFollower for FairScheduler. (Anubhav Dhoot via kasha) (kasha: rev 0c4b11267717eb451fa6ed4c586317f2db32fbd5) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerDynamicBehavior.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationConstants.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairSchedulerPreemption.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSystemTestUtil.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/ReservationQueueConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueueManager.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractSchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/FairSchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/PlanQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacitySchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestFairReservationSystem.java * 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestFairSchedulerPlanFollower.java Add support for FairScheduler to the ReservationSystem -- Key: YARN-2574 URL: https://issues.apache.org/jira/browse/YARN-2574 Project: Hadoop YARN Issue Type: New Feature Components: fairscheduler Reporter: Subru Krishnan Assignee: Anubhav Dhoot YARN-1051 introduces the ReservationSystem and the current implementation is based on CapacityScheduler. This JIRA proposes adding support for FairScheduler -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3010) Fix recent findbug issue in AbstractYarnScheduler
[ https://issues.apache.org/jira/browse/YARN-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266041#comment-14266041 ] Hadoop QA commented on YARN-3010: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12690315/YARN-3010.002.patch against trunk revision 4cd66f7. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6253//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6253//console This message is automatically generated. Fix recent findbug issue in AbstractYarnScheduler - Key: YARN-3010 URL: https://issues.apache.org/jira/browse/YARN-3010 Project: Hadoop YARN Issue Type: Bug Reporter: Yi Liu Assignee: Yi Liu Priority: Minor Attachments: YARN-3010.001.patch, YARN-3010.002.patch A new findbug issues reported recently in latest trunk: {quote} ISInconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.rmContext; locked 91% of time {quote} https://issues.apache.org/jira/browse/YARN-2996?focusedCommentId=14265760page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14265760 https://builds.apache.org/job/PreCommit-YARN-Build/6249//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2958) RMStateStore seems to unnecessarily and wrongly store sequence number separately
[ https://issues.apache.org/jira/browse/YARN-2958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14265997#comment-14265997 ] Hudson commented on YARN-2958: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #65 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/65/]) YARN-2958. Made RMStateStore not update the last sequence number when updating the delegation token. Contributed by Varun Saxena. (zjshen: rev 562a701945be3a672f9cb5a52cc6db2c1589ba2b) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStoreRMDTEvent.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/MemoryRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStoreTestBase.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/RMDelegationTokenSecretManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/LeveldbRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/NullRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java RMStateStore seems to unnecessarily and wrongly store sequence number separately Key: YARN-2958 URL: https://issues.apache.org/jira/browse/YARN-2958 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Zhijie Shen Assignee: Varun Saxena Priority: Blocker Fix For: 2.7.0 Attachments: YARN-2958.001.patch, YARN-2958.002.patch, YARN-2958.003.patch, YARN-2958.004.patch It seems that RMStateStore updates last sequence number when storing or updating each individual DT, to recover the latest sequence number when RM restarting. First, the current logic seems to be problematic: {code} public synchronized void updateRMDelegationTokenAndSequenceNumber( RMDelegationTokenIdentifier rmDTIdentifier, Long renewDate, int latestSequenceNumber) { if(isFencedState()) { LOG.info(State store is in Fenced state. 
Can't update RM Delegation Token."); return; } try { updateRMDelegationTokenAndSequenceNumberInternal(rmDTIdentifier, renewDate, latestSequenceNumber); } catch (Exception e) { notifyStoreOperationFailed(e); } } {code} {code} @Override protected void updateStoredToken(RMDelegationTokenIdentifier id, long renewDate) { try { LOG.info("updating RMDelegation token with sequence number: " + id.getSequenceNumber()); rmContext.getStateStore().updateRMDelegationTokenAndSequenceNumber(id, renewDate, id.getSequenceNumber()); } catch (Exception e) { LOG.error("Error in updating persisted RMDelegationToken with sequence number: " + id.getSequenceNumber()); ExitUtil.terminate(1, e); } } {code} According to the code above, the last sequence number is updated in the store even when a DT is merely renewed, which is wrong. For example, we have the following sequence: 1. Get DT 1 (seq = 1) 2. Get DT 2 (seq = 2) 3. Renew DT 1 (seq = 1) 4. Restart RM The stored and then recovered last sequence number is 1, so the next DT created after the RM restarts will conflict with DT 2 on sequence number. Second, the aforementioned bug doesn't actually happen, because the recovered last sequence number is later overwritten by the correct one. {code} public void recover(RMState rmState) throws Exception { LOG.info("recovering RMDelegationTokenSecretManager.");
[jira] [Commented] (YARN-2881) Implement PlanFollower for FairScheduler
[ https://issues.apache.org/jira/browse/YARN-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14265991#comment-14265991 ] Hudson commented on YARN-2881: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #65 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/65/]) YARN-2881. [YARN-2574] Implement PlanFollower for FairScheduler. (Anubhav Dhoot via kasha) (kasha: rev 0c4b11267717eb451fa6ed4c586317f2db32fbd5) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestFairReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationConstants.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSystemTestUtil.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairSchedulerPreemption.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueueManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractSchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacitySchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestFairSchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/ReservationQueueConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerDynamicBehavior.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/FairSchedulerPlanFollower.java * 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/PlanQueue.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java Implement PlanFollower for FairScheduler Key: YARN-2881 URL: https://issues.apache.org/jira/browse/YARN-2881 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Fix For: 2.7.0 Attachments: YARN-2881.001.patch, YARN-2881.002.patch, YARN-2881.002.patch, YARN-2881.003.patch, YARN-2881.004.patch, YARN-2881.005.patch, YARN-2881.006.patch, YARN-2881.prelim.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2574) Add support for FairScheduler to the ReservationSystem
[ https://issues.apache.org/jira/browse/YARN-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14265993#comment-14265993 ] Hudson commented on YARN-2574: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #65 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/65/]) YARN-2881. [YARN-2574] Implement PlanFollower for FairScheduler. (Anubhav Dhoot via kasha) (kasha: rev 0c4b11267717eb451fa6ed4c586317f2db32fbd5) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestFairReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationConstants.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSystemTestUtil.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairSchedulerPreemption.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueueManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractSchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacitySchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestFairSchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/ReservationQueueConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerDynamicBehavior.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/FairSchedulerPlanFollower.java * 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/PlanQueue.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java Add support for FairScheduler to the ReservationSystem -- Key: YARN-2574 URL: https://issues.apache.org/jira/browse/YARN-2574 Project: Hadoop YARN Issue Type: New Feature Components: fairscheduler Reporter: Subru Krishnan Assignee: Anubhav Dhoot YARN-1051 introduces the ReservationSystem and the current implementation is based on CapacityScheduler. This JIRA proposes adding support for FairScheduler -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2360) Fair Scheduler: Display dynamic fair share for queues on the scheduler page
[ https://issues.apache.org/jira/browse/YARN-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14265992#comment-14265992 ] Hudson commented on YARN-2360: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #65 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/65/]) Move YARN-2360 from 2.6 to 2.7 in CHANGES.txt (kasha: rev 41d72cbd48e6df7be3d177eaf04d73e88cf38381) * hadoop-yarn-project/CHANGES.txt Fair Scheduler: Display dynamic fair share for queues on the scheduler page --- Key: YARN-2360 URL: https://issues.apache.org/jira/browse/YARN-2360 Project: Hadoop YARN Issue Type: New Feature Components: fairscheduler Reporter: Ashwin Shankar Assignee: Ashwin Shankar Fix For: 2.7.0 Attachments: Screen Shot 2014-07-28 at 1.12.19 PM.png, Screen_Shot_v3.png, Screen_Shot_v4.png, Screen_Shot_v5.png, YARN-2360-v1.txt, YARN-2360-v2.txt, YARN-2360-v3.patch, YARN-2360-v4.patch, YARN-2360-v5.patch, yarn-2360-6.patch Based on the discussion in YARN-2026, we'd like to display dynamic fair share for queues on the scheduler page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2881) Implement PlanFollower for FairScheduler
[ https://issues.apache.org/jira/browse/YARN-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266142#comment-14266142 ] Hudson commented on YARN-2881: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1997 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1997/]) YARN-2881. [YARN-2574] Implement PlanFollower for FairScheduler. (Anubhav Dhoot via kasha) (kasha: rev 0c4b11267717eb451fa6ed4c586317f2db32fbd5) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacitySchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractSchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSystemTestUtil.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/PlanQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairSchedulerPreemption.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationConstants.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/FairSchedulerPlanFollower.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/ReservationQueueConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestFairSchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestFairReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerDynamicBehavior.java 
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueueManager.java Implement PlanFollower for FairScheduler Key: YARN-2881 URL: https://issues.apache.org/jira/browse/YARN-2881 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Fix For: 2.7.0 Attachments: YARN-2881.001.patch, YARN-2881.002.patch, YARN-2881.002.patch, YARN-2881.003.patch, YARN-2881.004.patch, YARN-2881.005.patch, YARN-2881.006.patch, YARN-2881.prelim.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2958) RMStateStore seems to unnecessarily and wrongly store sequence number separately
[ https://issues.apache.org/jira/browse/YARN-2958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266148#comment-14266148 ] Hudson commented on YARN-2958: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1997 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1997/]) YARN-2958. Made RMStateStore not update the last sequence number when updating the delegation token. Contributed by Varun Saxena. (zjshen: rev 562a701945be3a672f9cb5a52cc6db2c1589ba2b) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStoreRMDTEvent.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/LeveldbRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/NullRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/RMDelegationTokenSecretManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStoreTestBase.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/MemoryRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java RMStateStore seems to unnecessarily and wrongly store sequence number separately Key: YARN-2958 URL: https://issues.apache.org/jira/browse/YARN-2958 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Zhijie Shen Assignee: Varun Saxena Priority: Blocker Fix For: 2.7.0 Attachments: YARN-2958.001.patch, YARN-2958.002.patch, YARN-2958.003.patch, YARN-2958.004.patch It seems that RMStateStore updates last sequence number when storing or updating each individual DT, to recover the latest sequence number when RM restarting. First, the current logic seems to be problematic: {code} public synchronized void updateRMDelegationTokenAndSequenceNumber( RMDelegationTokenIdentifier rmDTIdentifier, Long renewDate, int latestSequenceNumber) { if(isFencedState()) { LOG.info(State store is in Fenced state. 
Can't update RM Delegation Token."); return; } try { updateRMDelegationTokenAndSequenceNumberInternal(rmDTIdentifier, renewDate, latestSequenceNumber); } catch (Exception e) { notifyStoreOperationFailed(e); } } {code} {code} @Override protected void updateStoredToken(RMDelegationTokenIdentifier id, long renewDate) { try { LOG.info("updating RMDelegation token with sequence number: " + id.getSequenceNumber()); rmContext.getStateStore().updateRMDelegationTokenAndSequenceNumber(id, renewDate, id.getSequenceNumber()); } catch (Exception e) { LOG.error("Error in updating persisted RMDelegationToken with sequence number: " + id.getSequenceNumber()); ExitUtil.terminate(1, e); } } {code} According to the code above, the last sequence number is updated in the store even when a DT is merely renewed, which is wrong. For example, we have the following sequence: 1. Get DT 1 (seq = 1) 2. Get DT 2 (seq = 2) 3. Renew DT 1 (seq = 1) 4. Restart RM The stored and then recovered last sequence number is 1, so the next DT created after the RM restarts will conflict with DT 2 on sequence number. Second, the aforementioned bug doesn't actually happen, because the recovered last sequence number is later overwritten by the correct one. {code} public void recover(RMState rmState) throws Exception { LOG.info("recovering RMDelegationTokenSecretManager."); //
[jira] [Commented] (YARN-2360) Fair Scheduler: Display dynamic fair share for queues on the scheduler page
[ https://issues.apache.org/jira/browse/YARN-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266143#comment-14266143 ] Hudson commented on YARN-2360: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1997 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1997/]) Move YARN-2360 from 2.6 to 2.7 in CHANGES.txt (kasha: rev 41d72cbd48e6df7be3d177eaf04d73e88cf38381) * hadoop-yarn-project/CHANGES.txt Fair Scheduler: Display dynamic fair share for queues on the scheduler page --- Key: YARN-2360 URL: https://issues.apache.org/jira/browse/YARN-2360 Project: Hadoop YARN Issue Type: New Feature Components: fairscheduler Reporter: Ashwin Shankar Assignee: Ashwin Shankar Fix For: 2.7.0 Attachments: Screen Shot 2014-07-28 at 1.12.19 PM.png, Screen_Shot_v3.png, Screen_Shot_v4.png, Screen_Shot_v5.png, YARN-2360-v1.txt, YARN-2360-v2.txt, YARN-2360-v3.patch, YARN-2360-v4.patch, YARN-2360-v5.patch, yarn-2360-6.patch Based on the discussion in YARN-2026, we'd like to display dynamic fair share for queues on the scheduler page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2571) RM to support YARN registry
[ https://issues.apache.org/jira/browse/YARN-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-2571: - Target Version/s: 2.7.0 (was: 2.6.0) RM to support YARN registry Key: YARN-2571 URL: https://issues.apache.org/jira/browse/YARN-2571 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Steve Loughran Assignee: Steve Loughran Attachments: YARN-2571-001.patch, YARN-2571-002.patch, YARN-2571-003.patch, YARN-2571-005.patch, YARN-2571-007.patch, YARN-2571-008.patch, YARN-2571-009.patch The RM needs to (optionally) integrate with the YARN registry: # startup: create the /services and /users paths with system ACLs (yarn, hdfs principals) # app-launch: create the user directory /users/$username with the relevant permissions (CRD) for them to create subnodes. # attempt, container, app completion: remove service records with the matching persistence and ID -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2574) Add support for FairScheduler to the ReservationSystem
[ https://issues.apache.org/jira/browse/YARN-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266144#comment-14266144 ] Hudson commented on YARN-2574: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1997 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1997/]) YARN-2881. [YARN-2574] Implement PlanFollower for FairScheduler. (Anubhav Dhoot via kasha) (kasha: rev 0c4b11267717eb451fa6ed4c586317f2db32fbd5) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerDynamicBehavior.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractSchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSystemTestUtil.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueueManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestFairSchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacitySchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/FairSchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationConstants.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestFairReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/PlanQueue.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairSchedulerPreemption.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java * 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/ReservationQueueConfiguration.java Add support for FairScheduler to the ReservationSystem -- Key: YARN-2574 URL: https://issues.apache.org/jira/browse/YARN-2574 Project: Hadoop YARN Issue Type: New Feature Components: fairscheduler Reporter: Subru Krishnan Assignee: Anubhav Dhoot YARN-1051 introduces the ReservationSystem and the current implementation is based on CapacityScheduler. This JIRA proposes adding support for FairScheduler -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2958) RMStateStore seems to unnecessarily and wrongly store sequence number separately
[ https://issues.apache.org/jira/browse/YARN-2958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266171#comment-14266171 ] Hudson commented on YARN-2958: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk-Java8 #62 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/62/]) YARN-2958. Made RMStateStore not update the last sequence number when updating the delegation token. Contributed by Varun Saxena. (zjshen: rev 562a701945be3a672f9cb5a52cc6db2c1589ba2b) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/LeveldbRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStoreTestBase.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/RMDelegationTokenSecretManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStoreRMDTEvent.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/NullRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/MemoryRMStateStore.java RMStateStore seems to unnecessarily and wrongly store sequence number separately Key: YARN-2958 URL: https://issues.apache.org/jira/browse/YARN-2958 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Zhijie Shen Assignee: Varun Saxena Priority: Blocker Fix For: 2.7.0 Attachments: YARN-2958.001.patch, YARN-2958.002.patch, YARN-2958.003.patch, YARN-2958.004.patch It seems that RMStateStore updates last sequence number when storing or updating each individual DT, to recover the latest sequence number when RM restarting. First, the current logic seems to be problematic: {code} public synchronized void updateRMDelegationTokenAndSequenceNumber( RMDelegationTokenIdentifier rmDTIdentifier, Long renewDate, int latestSequenceNumber) { if(isFencedState()) { LOG.info(State store is in Fenced state. 
Can't update RM Delegation Token.); return; } try { updateRMDelegationTokenAndSequenceNumberInternal(rmDTIdentifier, renewDate, latestSequenceNumber); } catch (Exception e) { notifyStoreOperationFailed(e); } } {code} {code} @Override protected void updateStoredToken(RMDelegationTokenIdentifier id, long renewDate) { try { LOG.info("updating RMDelegation token with sequence number: " + id.getSequenceNumber()); rmContext.getStateStore().updateRMDelegationTokenAndSequenceNumber(id, renewDate, id.getSequenceNumber()); } catch (Exception e) { LOG.error("Error in updating persisted RMDelegationToken with sequence number: " + id.getSequenceNumber()); ExitUtil.terminate(1, e); } } {code} According to the code above, even when renewing a DT, the last sequence number is updated in the store, which is wrong. For example, take the following sequence: 1. Get DT 1 (seq = 1) 2. Get DT 2 (seq = 2) 3. Renew DT 1 (seq = 1) 4. Restart RM. The stored and then recovered last sequence number is 1, so the next DT created after the RM restarts will conflict with DT 2 on sequence number. Second, the aforementioned bug does not actually occur, because the recovered last sequence number has been overwritten by the correct one. {code} public void recover(RMState rmState) throws Exception { LOG.info("recovering RMDelegationTokenSecretManager.");
[jira] [Commented] (YARN-2360) Fair Scheduler: Display dynamic fair share for queues on the scheduler page
[ https://issues.apache.org/jira/browse/YARN-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266166#comment-14266166 ] Hudson commented on YARN-2360: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk-Java8 #62 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/62/]) Move YARN-2360 from 2.6 to 2.7 in CHANGES.txt (kasha: rev 41d72cbd48e6df7be3d177eaf04d73e88cf38381) * hadoop-yarn-project/CHANGES.txt Fair Scheduler: Display dynamic fair share for queues on the scheduler page --- Key: YARN-2360 URL: https://issues.apache.org/jira/browse/YARN-2360 Project: Hadoop YARN Issue Type: New Feature Components: fairscheduler Reporter: Ashwin Shankar Assignee: Ashwin Shankar Fix For: 2.7.0 Attachments: Screen Shot 2014-07-28 at 1.12.19 PM.png, Screen_Shot_v3.png, Screen_Shot_v4.png, Screen_Shot_v5.png, YARN-2360-v1.txt, YARN-2360-v2.txt, YARN-2360-v3.patch, YARN-2360-v4.patch, YARN-2360-v5.patch, yarn-2360-6.patch Based on the discussion in YARN-2026, we'd like to display dynamic fair share for queues on the scheduler page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2574) Add support for FairScheduler to the ReservationSystem
[ https://issues.apache.org/jira/browse/YARN-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266167#comment-14266167 ] Hudson commented on YARN-2574: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk-Java8 #62 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/62/]) YARN-2881. [YARN-2574] Implement PlanFollower for FairScheduler. (Anubhav Dhoot via kasha) (kasha: rev 0c4b11267717eb451fa6ed4c586317f2db32fbd5) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestFairSchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationConstants.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairSchedulerPreemption.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerDynamicBehavior.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacitySchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueueManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/ReservationQueueConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestFairReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSystemTestUtil.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/PlanQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java * hadoop-yarn-project/CHANGES.txt * 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractSchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/FairSchedulerPlanFollower.java Add support for FairScheduler to the ReservationSystem -- Key: YARN-2574 URL: https://issues.apache.org/jira/browse/YARN-2574 Project: Hadoop YARN Issue Type: New Feature Components: fairscheduler Reporter: Subru Krishnan Assignee: Anubhav Dhoot YARN-1051 introduces the ReservationSystem and the current implementation is based on CapacityScheduler. This JIRA proposes adding support for FairScheduler -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2881) Implement PlanFollower for FairScheduler
[ https://issues.apache.org/jira/browse/YARN-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266165#comment-14266165 ] Hudson commented on YARN-2881: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk-Java8 #62 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/62/]) YARN-2881. [YARN-2574] Implement PlanFollower for FairScheduler. (Anubhav Dhoot via kasha) (kasha: rev 0c4b11267717eb451fa6ed4c586317f2db32fbd5) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestFairSchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationConstants.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairSchedulerPreemption.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerDynamicBehavior.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacitySchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueueManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/ReservationQueueConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestFairReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSystemTestUtil.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/PlanQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java * hadoop-yarn-project/CHANGES.txt * 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractSchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/FairSchedulerPlanFollower.java Implement PlanFollower for FairScheduler Key: YARN-2881 URL: https://issues.apache.org/jira/browse/YARN-2881 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Fix For: 2.7.0 Attachments: YARN-2881.001.patch, YARN-2881.002.patch, YARN-2881.002.patch, YARN-2881.003.patch, YARN-2881.004.patch, YARN-2881.005.patch, YARN-2881.006.patch, YARN-2881.prelim.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2605) [RM HA] Rest api endpoints doing redirect incorrectly
[ https://issues.apache.org/jira/browse/YARN-2605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266109#comment-14266109 ] Steve Loughran commented on YARN-2605: -- Can it just send a 307 (resubmit same verb) response to the caller? That will be picked up by browsers and handled as a new GET, while REST clients (including curl, jersey, httpclient) will either GET or resubmit the original verb depending on their config. Sending a custom structured response won't work with those existing clients. [RM HA] Rest api endpoints doing redirect incorrectly - Key: YARN-2605 URL: https://issues.apache.org/jira/browse/YARN-2605 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.4.0 Reporter: bc Wong Labels: newbie The standby RM's webui tries to do a redirect via meta-refresh. That is fine for pages designed to be viewed by web browsers. But the API endpoints shouldn't do that. Most programmatic HTTP clients do not do meta-refresh. I'd suggest HTTP 303, or return a well-defined error message (json or xml) stating the standby status and a link to the active RM. The standby RM is returning this today: {noformat} $ curl -i http://bcsec-1.ent.cloudera.com:8088/ws/v1/cluster/metrics HTTP/1.1 200 OK Cache-Control: no-cache Expires: Thu, 25 Sep 2014 18:34:53 GMT Date: Thu, 25 Sep 2014 18:34:53 GMT Pragma: no-cache Expires: Thu, 25 Sep 2014 18:34:53 GMT Date: Thu, 25 Sep 2014 18:34:53 GMT Pragma: no-cache Content-Type: text/plain; charset=UTF-8 Refresh: 3; url=http://bcsec-2.ent.cloudera.com:8088/ws/v1/cluster/metrics Content-Length: 117 Server: Jetty(6.1.26) This is standby RM. Redirecting to the current active RM: http://bcsec-2.ent.cloudera.com:8088/ws/v1/cluster/metrics {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
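The 307 approach suggested above can be sketched as follows. This is an illustrative servlet filter only, not the actual RM web app code; the filter name, the /ws/ prefix check, and the way the active RM address is obtained are assumptions.
{code}
import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Illustrative only: answer REST calls on the standby RM with a 307 so that
// clients re-issue the same verb against the active RM.
public class StandbyRedirectFilter implements Filter {

  // Assumption: in real code the active RM web address would come from HA config.
  private final String activeRMWebBase = "http://active-rm:8088";

  @Override
  public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
      throws IOException, ServletException {
    HttpServletRequest httpReq = (HttpServletRequest) req;
    HttpServletResponse httpRes = (HttpServletResponse) res;
    if (httpReq.getRequestURI().startsWith("/ws/")) {
      // 307 Temporary Redirect: the client resubmits the original verb and body.
      httpRes.setStatus(307);
      httpRes.setHeader("Location", activeRMWebBase + httpReq.getRequestURI());
      return;
    }
    chain.doFilter(req, res); // non-API pages keep their existing behaviour
  }

  @Override
  public void init(FilterConfig filterConfig) {
  }

  @Override
  public void destroy() {
  }
}
{code}
With a 307, curl, Jersey, and similar clients that follow redirects re-issue the original verb (GET stays GET, PUT stays PUT), which is the behaviour the comment asks for, whereas a meta-refresh page is only honoured by browsers.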
[jira] [Commented] (YARN-2571) RM to support YARN registry
[ https://issues.apache.org/jira/browse/YARN-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266201#comment-14266201 ] Hadoop QA commented on YARN-2571: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12685837/YARN-2571-009.patch against trunk revision 4cd66f7. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 16 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6254//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6254//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6254//console This message is automatically generated. RM to support YARN registry Key: YARN-2571 URL: https://issues.apache.org/jira/browse/YARN-2571 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Steve Loughran Assignee: Steve Loughran Attachments: YARN-2571-001.patch, YARN-2571-002.patch, YARN-2571-003.patch, YARN-2571-005.patch, YARN-2571-007.patch, YARN-2571-008.patch, YARN-2571-009.patch The RM needs to (optionally) integrate with the YARN registry: # startup: create the /services and /users paths with system ACLs (yarn, hdfs principals) # app-launch: create the user directory /users/$username with the relevant permissions (CRD) for them to create subnodes. # attempt, container, app completion: remove service records with the matching persistence and ID -- This message was sent by Atlassian JIRA (v6.3.4#6332)
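For readers unfamiliar with the registry layout described in the issue, the following sketch shows the intent of the startup and app-launch steps using the plain ZooKeeper client. It is illustrative only: the real implementation lives in hadoop-yarn-registry and is not shown here, and the connect string, the /registry prefix, and the open ACLs below are assumptions (the actual code applies system ACLs for the yarn and hdfs principals).
{code}
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class RegistryBootstrapSketch {

  public static void main(String[] args) throws Exception {
    // Assumption: connect string and session timeout would come from configuration.
    ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> { });

    // Startup: create the root /services and /users paths (system ACLs elided here).
    for (String path : new String[] {"/registry", "/registry/services", "/registry/users"}) {
      ensure(zk, path);
    }

    // App launch: create /users/<username> so the user can create service subnodes.
    String user = "alice"; // would come from the submitted application's owner
    ensure(zk, "/registry/users/" + user);

    zk.close();
  }

  private static void ensure(ZooKeeper zk, String path)
      throws KeeperException, InterruptedException {
    if (zk.exists(path, false) == null) {
      zk.create(path, new byte[0], ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
    }
  }
}
{code}
The third step from the description (removing service records whose persistence matches a completed attempt, container, or application) would walk the same tree on the corresponding RM events; it is omitted above to keep the sketch short.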
[jira] [Commented] (YARN-2360) Fair Scheduler: Display dynamic fair share for queues on the scheduler page
[ https://issues.apache.org/jira/browse/YARN-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266207#comment-14266207 ] Hudson commented on YARN-2360: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #66 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/66/]) Move YARN-2360 from 2.6 to 2.7 in CHANGES.txt (kasha: rev 41d72cbd48e6df7be3d177eaf04d73e88cf38381) * hadoop-yarn-project/CHANGES.txt Fair Scheduler: Display dynamic fair share for queues on the scheduler page --- Key: YARN-2360 URL: https://issues.apache.org/jira/browse/YARN-2360 Project: Hadoop YARN Issue Type: New Feature Components: fairscheduler Reporter: Ashwin Shankar Assignee: Ashwin Shankar Fix For: 2.7.0 Attachments: Screen Shot 2014-07-28 at 1.12.19 PM.png, Screen_Shot_v3.png, Screen_Shot_v4.png, Screen_Shot_v5.png, YARN-2360-v1.txt, YARN-2360-v2.txt, YARN-2360-v3.patch, YARN-2360-v4.patch, YARN-2360-v5.patch, yarn-2360-6.patch Based on the discussion in YARN-2026, we'd like to display dynamic fair share for queues on the scheduler page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2574) Add support for FairScheduler to the ReservationSystem
[ https://issues.apache.org/jira/browse/YARN-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266208#comment-14266208 ] Hudson commented on YARN-2574: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #66 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/66/]) YARN-2881. [YARN-2574] Implement PlanFollower for FairScheduler. (Anubhav Dhoot via kasha) (kasha: rev 0c4b11267717eb451fa6ed4c586317f2db32fbd5) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationConstants.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/PlanQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSystemTestUtil.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairSchedulerPreemption.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/ReservationQueueConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestFairSchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueueManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestFairReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/FairSchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractSchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacitySchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerDynamicBehavior.java Add support for FairScheduler to the ReservationSystem -- Key: YARN-2574 URL: https://issues.apache.org/jira/browse/YARN-2574 Project: Hadoop YARN Issue Type: New Feature Components: fairscheduler Reporter: Subru Krishnan Assignee: Anubhav Dhoot YARN-1051 introduces the ReservationSystem and the current implementation is based on CapacityScheduler. This JIRA proposes adding support for FairScheduler -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2958) RMStateStore seems to unnecessarily and wrongly store sequence number separately
[ https://issues.apache.org/jira/browse/YARN-2958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266212#comment-14266212 ] Hudson commented on YARN-2958: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #66 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/66/]) YARN-2958. Made RMStateStore not update the last sequence number when updating the delegation token. Contributed by Varun Saxena. (zjshen: rev 562a701945be3a672f9cb5a52cc6db2c1589ba2b) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStoreRMDTEvent.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/NullRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/MemoryRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStore.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStoreTestBase.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/RMDelegationTokenSecretManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/LeveldbRMStateStore.java RMStateStore seems to unnecessarily and wrongly store sequence number separately Key: YARN-2958 URL: https://issues.apache.org/jira/browse/YARN-2958 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Zhijie Shen Assignee: Varun Saxena Priority: Blocker Fix For: 2.7.0 Attachments: YARN-2958.001.patch, YARN-2958.002.patch, YARN-2958.003.patch, YARN-2958.004.patch It seems that RMStateStore updates last sequence number when storing or updating each individual DT, to recover the latest sequence number when RM restarting. First, the current logic seems to be problematic: {code} public synchronized void updateRMDelegationTokenAndSequenceNumber( RMDelegationTokenIdentifier rmDTIdentifier, Long renewDate, int latestSequenceNumber) { if(isFencedState()) { LOG.info(State store is in Fenced state. 
Can't update RM Delegation Token.); return; } try { updateRMDelegationTokenAndSequenceNumberInternal(rmDTIdentifier, renewDate, latestSequenceNumber); } catch (Exception e) { notifyStoreOperationFailed(e); } } {code} {code} @Override protected void updateStoredToken(RMDelegationTokenIdentifier id, long renewDate) { try { LOG.info("updating RMDelegation token with sequence number: " + id.getSequenceNumber()); rmContext.getStateStore().updateRMDelegationTokenAndSequenceNumber(id, renewDate, id.getSequenceNumber()); } catch (Exception e) { LOG.error("Error in updating persisted RMDelegationToken with sequence number: " + id.getSequenceNumber()); ExitUtil.terminate(1, e); } } {code} According to the code above, even when renewing a DT, the last sequence number is updated in the store, which is wrong. For example, take the following sequence: 1. Get DT 1 (seq = 1) 2. Get DT 2 (seq = 2) 3. Renew DT 1 (seq = 1) 4. Restart RM. The stored and then recovered last sequence number is 1, so the next DT created after the RM restarts will conflict with DT 2 on sequence number. Second, the aforementioned bug does not actually occur, because the recovered last sequence number has been overwritten by the correct one. {code} public void recover(RMState rmState) throws Exception { LOG.info(recovering
[jira] [Commented] (YARN-2881) Implement PlanFollower for FairScheduler
[ https://issues.apache.org/jira/browse/YARN-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266206#comment-14266206 ] Hudson commented on YARN-2881: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #66 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/66/]) YARN-2881. [YARN-2574] Implement PlanFollower for FairScheduler. (Anubhav Dhoot via kasha) (kasha: rev 0c4b11267717eb451fa6ed4c586317f2db32fbd5) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationConstants.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/PlanQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSystemTestUtil.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairSchedulerPreemption.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/ReservationQueueConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestFairSchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueueManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestFairReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/FairSchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractSchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacitySchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerDynamicBehavior.java Implement PlanFollower for FairScheduler Key: YARN-2881 URL: https://issues.apache.org/jira/browse/YARN-2881 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Fix For: 2.7.0 Attachments: YARN-2881.001.patch, YARN-2881.002.patch, YARN-2881.002.patch, YARN-2881.003.patch, YARN-2881.004.patch, YARN-2881.005.patch, YARN-2881.006.patch, YARN-2881.prelim.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2881) Implement PlanFollower for FairScheduler
[ https://issues.apache.org/jira/browse/YARN-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266249#comment-14266249 ] Hudson commented on YARN-2881: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2016 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2016/]) YARN-2881. [YARN-2574] Implement PlanFollower for FairScheduler. (Anubhav Dhoot via kasha) (kasha: rev 0c4b11267717eb451fa6ed4c586317f2db32fbd5) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/ReservationQueueConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/PlanQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationConstants.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerDynamicBehavior.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSystemTestUtil.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractSchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairSchedulerPreemption.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestFairSchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/FairSchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacitySchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestFairReservationSystem.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueueManager.java Implement PlanFollower for FairScheduler Key: YARN-2881 URL: https://issues.apache.org/jira/browse/YARN-2881 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Fix For: 2.7.0 Attachments: YARN-2881.001.patch, YARN-2881.002.patch, YARN-2881.002.patch, YARN-2881.003.patch, YARN-2881.004.patch, YARN-2881.005.patch, YARN-2881.006.patch, YARN-2881.prelim.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2574) Add support for FairScheduler to the ReservationSystem
[ https://issues.apache.org/jira/browse/YARN-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266251#comment-14266251 ] Hudson commented on YARN-2574: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2016 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2016/]) YARN-2881. [YARN-2574] Implement PlanFollower for FairScheduler. (Anubhav Dhoot via kasha) (kasha: rev 0c4b11267717eb451fa6ed4c586317f2db32fbd5) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/ReservationQueueConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/PlanQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationConstants.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerDynamicBehavior.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSystemTestUtil.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractSchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairSchedulerPreemption.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestFairSchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/FairSchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacitySchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestFairReservationSystem.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueueManager.java Add support for FairScheduler to the ReservationSystem -- Key: YARN-2574 URL: https://issues.apache.org/jira/browse/YARN-2574 Project: Hadoop YARN Issue Type: New Feature Components: fairscheduler Reporter: Subru Krishnan Assignee: Anubhav Dhoot YARN-1051 introduces the ReservationSystem and the current implementation is based on CapacityScheduler. This JIRA proposes adding support for FairScheduler -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2360) Fair Scheduler: Display dynamic fair share for queues on the scheduler page
[ https://issues.apache.org/jira/browse/YARN-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266250#comment-14266250 ] Hudson commented on YARN-2360: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2016 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2016/]) Move YARN-2360 from 2.6 to 2.7 in CHANGES.txt (kasha: rev 41d72cbd48e6df7be3d177eaf04d73e88cf38381) * hadoop-yarn-project/CHANGES.txt Fair Scheduler: Display dynamic fair share for queues on the scheduler page --- Key: YARN-2360 URL: https://issues.apache.org/jira/browse/YARN-2360 Project: Hadoop YARN Issue Type: New Feature Components: fairscheduler Reporter: Ashwin Shankar Assignee: Ashwin Shankar Fix For: 2.7.0 Attachments: Screen Shot 2014-07-28 at 1.12.19 PM.png, Screen_Shot_v3.png, Screen_Shot_v4.png, Screen_Shot_v5.png, YARN-2360-v1.txt, YARN-2360-v2.txt, YARN-2360-v3.patch, YARN-2360-v4.patch, YARN-2360-v5.patch, yarn-2360-6.patch Based on the discussion in YARN-2026, we'd like to display dynamic fair share for queues on the scheduler page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2958) RMStateStore seems to unnecessarily and wrongly store sequence number separately
[ https://issues.apache.org/jira/browse/YARN-2958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266255#comment-14266255 ] Hudson commented on YARN-2958: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2016 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2016/]) YARN-2958. Made RMStateStore not update the last sequence number when updating the delegation token. Contributed by Varun Saxena. (zjshen: rev 562a701945be3a672f9cb5a52cc6db2c1589ba2b) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/RMDelegationTokenSecretManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStoreRMDTEvent.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStoreTestBase.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/LeveldbRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/NullRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/MemoryRMStateStore.java RMStateStore seems to unnecessarily and wrongly store sequence number separately Key: YARN-2958 URL: https://issues.apache.org/jira/browse/YARN-2958 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Zhijie Shen Assignee: Varun Saxena Priority: Blocker Fix For: 2.7.0 Attachments: YARN-2958.001.patch, YARN-2958.002.patch, YARN-2958.003.patch, YARN-2958.004.patch It seems that RMStateStore updates last sequence number when storing or updating each individual DT, to recover the latest sequence number when RM restarting. First, the current logic seems to be problematic: {code} public synchronized void updateRMDelegationTokenAndSequenceNumber( RMDelegationTokenIdentifier rmDTIdentifier, Long renewDate, int latestSequenceNumber) { if(isFencedState()) { LOG.info(State store is in Fenced state. 
Can't update RM Delegation Token.); return; } try { updateRMDelegationTokenAndSequenceNumberInternal(rmDTIdentifier, renewDate, latestSequenceNumber); } catch (Exception e) { notifyStoreOperationFailed(e); } } {code} {code} @Override protected void updateStoredToken(RMDelegationTokenIdentifier id, long renewDate) { try { LOG.info("updating RMDelegation token with sequence number: " + id.getSequenceNumber()); rmContext.getStateStore().updateRMDelegationTokenAndSequenceNumber(id, renewDate, id.getSequenceNumber()); } catch (Exception e) { LOG.error("Error in updating persisted RMDelegationToken with sequence number: " + id.getSequenceNumber()); ExitUtil.terminate(1, e); } } {code} According to the code above, even when renewing a DT, the last sequence number is updated in the store, which is wrong. For example, take the following sequence: 1. Get DT 1 (seq = 1) 2. Get DT 2 (seq = 2) 3. Renew DT 1 (seq = 1) 4. Restart RM. The stored and then recovered last sequence number is 1, so the next DT created after the RM restarts will conflict with DT 2 on sequence number. Second, the aforementioned bug does not actually occur, because the recovered last sequence number has been overwritten by the correct one. {code} public void recover(RMState rmState) throws Exception { LOG.info("recovering RMDelegationTokenSecretManager.");
[jira] [Updated] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is minimumAllocation
[ https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Craig Welch updated YARN-2637: -- Attachment: YARN-2637.27.patch Patch adds tests which fail when the null check for RMContext.getScheduler is not present in FiCaSchedulerApp. maximum-am-resource-percent could be violated when resource of AM is minimumAllocation Key: YARN-2637 URL: https://issues.apache.org/jira/browse/YARN-2637 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.6.0 Reporter: Wangda Tan Assignee: Craig Welch Priority: Critical Attachments: YARN-2637.0.patch, YARN-2637.1.patch, YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.15.patch, YARN-2637.16.patch, YARN-2637.17.patch, YARN-2637.18.patch, YARN-2637.19.patch, YARN-2637.2.patch, YARN-2637.20.patch, YARN-2637.21.patch, YARN-2637.22.patch, YARN-2637.23.patch, YARN-2637.25.patch, YARN-2637.26.patch, YARN-2637.27.patch, YARN-2637.6.patch, YARN-2637.7.patch, YARN-2637.9.patch Currently, the number of AMs in a leaf queue is calculated in the following way: {code} max_am_resource = queue_max_capacity * maximum_am_resource_percent #max_am_number = max_am_resource / minimum_allocation #max_am_number_for_each_user = #max_am_number * userlimit * userlimit_factor {code} And when a new application is submitted to the RM, it checks whether the app can be activated in the following way: {code} for (Iterator<FiCaSchedulerApp> i = pendingApplications.iterator(); i.hasNext(); ) { FiCaSchedulerApp application = i.next(); // Check queue limit if (getNumActiveApplications() >= getMaximumActiveApplications()) { break; } // Check user limit User user = getUser(application.getUser()); if (user.getActiveApplications() < getMaximumActiveApplicationsPerUser()) { user.activateApplication(); activeApplications.add(application); i.remove(); LOG.info("Application " + application.getApplicationId() + " from user: " + application.getUser() + " activated in queue: " + getQueueName()); } } {code} For example, if a queue has capacity = 1G and max_am_resource_percent = 0.2, the maximum resource that AMs can use is 200M. Assuming minimum_allocation = 1M, up to 200 AMs can be launched, and if each AM actually uses 5M (> minimum_allocation), all apps can still be activated, occupying all of the queue's resources instead of only max_am_resource_percent of the queue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
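To make the failure mode in the description concrete, here is a small, self-contained sketch contrasting the count-based activation limit with a limit based on the AMs' actual resource usage. It is not CapacityScheduler or FiCaSchedulerApp code; all names and numbers are invented for illustration.
{code}
import java.util.ArrayList;
import java.util.List;

public class AmLimitSketch {
  static final int QUEUE_CAPACITY_MB = 1024;          // 1G queue
  static final double MAX_AM_RESOURCE_PERCENT = 0.2;  // 200 MB for all AMs
  static final int MINIMUM_ALLOCATION_MB = 1;

  public static void main(String[] args) {
    int maxAmResourceMb = (int) (QUEUE_CAPACITY_MB * MAX_AM_RESOURCE_PERCENT); // 200 MB
    int maxAmNumber = maxAmResourceMb / MINIMUM_ALLOCATION_MB;                 // 200 apps

    // 200 pending apps whose AMs each actually need 5 MB (> minimum_allocation).
    List<Integer> pendingAmMb = new ArrayList<>();
    for (int i = 0; i < 200; i++) {
      pendingAmMb.add(5);
    }

    // Count-based check (the behaviour the description criticizes): all 200 apps
    // are activated, and their AMs consume 1000 MB, i.e. nearly the whole queue.
    int activatedByCount = 0, usedByCountMb = 0;
    for (int am : pendingAmMb) {
      if (activatedByCount >= maxAmNumber) break;
      activatedByCount++;
      usedByCountMb += am;
    }

    // Resource-based check (the direction of the fix): stop activating once the
    // AMs' combined resource would exceed max_am_resource.
    int activatedByResource = 0, usedByResourceMb = 0;
    for (int am : pendingAmMb) {
      if (usedByResourceMb + am > maxAmResourceMb) break;
      activatedByResource++;
      usedByResourceMb += am;
    }

    System.out.printf("count-based:    %d AMs, %d MB%n", activatedByCount, usedByCountMb);
    System.out.printf("resource-based: %d AMs, %d MB%n", activatedByResource, usedByResourceMb);
  }
}
{code}
With the numbers from the description (1G queue, 20% AM share, 1M minimum allocation, 5 MB per AM), the count-based check activates 200 AMs using 1000 MB, while the resource-based check stops at 40 AMs and 200 MB.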
[jira] [Commented] (YARN-2217) Shared cache client side changes
[ https://issues.apache.org/jira/browse/YARN-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266602#comment-14266602 ] Chris Trezzo commented on YARN-2217: Will do. Thanks! Shared cache client side changes Key: YARN-2217 URL: https://issues.apache.org/jira/browse/YARN-2217 Project: Hadoop YARN Issue Type: Sub-task Reporter: Chris Trezzo Assignee: Chris Trezzo Attachments: YARN-2217-trunk-v1.patch, YARN-2217-trunk-v2.patch, YARN-2217-trunk-v3.patch, YARN-2217-trunk-v4.patch, YARN-2217-trunk-v5.patch, YARN-2217-trunk-v6.patch Implement the client side changes for the shared cache. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-914) Support graceful decommission of nodemanager
[ https://issues.apache.org/jira/browse/YARN-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266692#comment-14266692 ] Ming Ma commented on YARN-914: -- Thanks, Junping. The timeout is definitely necessary. * Sounds like we need a new state for NM, called decommission_in_progress, used while the NM is draining its containers. When the RM considers the decommission complete, the node will be marked decommissioned. * To clarify my earlier comment ("all its map output are fetched or until all the applications the node touches have completed"), the question is when YARN can declare a node's state has been gracefully drained and thus the node gracefully decommissioned (admins can shut down the whole machine without any impact on jobs). For MR, the state could be running tasks/containers or mapper outputs. Say we have a timeout of 30 minutes for decommission: if it takes 3 minutes to finish the mappers on the node and another 5 minutes for the job to finish, then YARN can declare the node gracefully decommissioned in 8 minutes instead of waiting for 30 minutes. The RM knows all applications on any given NM, so if all applications on a node have completed, the RM can mark the node decommissioned. * Yes, I meant long running services. If YARN just kills the containers upon a decommission request, the impact could vary. Some services might not have state to drain, or maybe the services can handle the state migration on their own without YARN's help. For such services, maybe we can just use ResourceOption's timeout for that; set timeout to 0 and the NM will just kill the containers. * Given we don't plan to have applications checkpoint and migrate state, it doesn't seem necessary to have YARN notify applications upon decommission requests. Just to call it out. * It might be useful to have a new state called decommissioned_timeout, so that admins know whether the node was gracefully decommissioned or not. Thoughts? Support graceful decommission of nodemanager Key: YARN-914 URL: https://issues.apache.org/jira/browse/YARN-914 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.0.4-alpha Reporter: Luke Lu Assignee: Junping Du When NMs are decommissioned for non-fault reasons (capacity change etc.), it's desirable to minimize the impact on running applications. Currently if a NM is decommissioned, all running containers on the NM need to be rescheduled on other NMs. Furthermore, for finished map tasks, if their map outputs are not fetched by the reducers of the job, these map tasks will need to be rerun as well. We propose to introduce a mechanism to optionally gracefully decommission a node manager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
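The node lifecycle being discussed can be summarized in a small sketch. This is purely illustrative: DECOMMISSION_IN_PROGRESS and DECOMMISSIONED_TIMEOUT are the states proposed in the comment above, not existing YARN states, and the method below is not real RM code.
{code}
// Illustrative only: models the proposed graceful-decommission lifecycle.
public class GracefulDecommissionSketch {

  enum SketchNodeState {
    RUNNING,
    DECOMMISSION_IN_PROGRESS, // draining containers / waiting for apps to finish
    DECOMMISSIONED,           // drained and safe to shut down
    DECOMMISSIONED_TIMEOUT    // proposed: drained only because the timeout expired
  }

  /**
   * Decide the node state once either all applications that touched the node
   * have completed or the admin-supplied timeout has expired.
   */
  static SketchNodeState nextState(boolean allAppsOnNodeCompleted,
                                   long elapsedMillis,
                                   long timeoutMillis) {
    if (allAppsOnNodeCompleted) {
      return SketchNodeState.DECOMMISSIONED;          // e.g. done after 8 minutes
    }
    if (elapsedMillis >= timeoutMillis) {
      return SketchNodeState.DECOMMISSIONED_TIMEOUT;  // gave up after 30 minutes
    }
    return SketchNodeState.DECOMMISSION_IN_PROGRESS;  // keep draining
  }

  public static void main(String[] args) {
    System.out.println(nextState(true, 8 * 60_000L, 30 * 60_000L));
    System.out.println(nextState(false, 31 * 60_000L, 30 * 60_000L));
  }
}
{code}
This matches the example in the comment: a node whose applications finish after 8 minutes is declared decommissioned immediately, rather than waiting out the full 30-minute timeout.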
[jira] [Commented] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is minimumAllocation
[ https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266691#comment-14266691 ] Hadoop QA commented on YARN-2637: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12690392/YARN-2637.27.patch against trunk revision 4cd66f7. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 8 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.security.TestRMDelegationTokens Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6256//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6256//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6256//console This message is automatically generated. maximum-am-resource-percent could be violated when resource of AM is minimumAllocation Key: YARN-2637 URL: https://issues.apache.org/jira/browse/YARN-2637 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.6.0 Reporter: Wangda Tan Assignee: Craig Welch Priority: Critical Attachments: YARN-2637.0.patch, YARN-2637.1.patch, YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.15.patch, YARN-2637.16.patch, YARN-2637.17.patch, YARN-2637.18.patch, YARN-2637.19.patch, YARN-2637.2.patch, YARN-2637.20.patch, YARN-2637.21.patch, YARN-2637.22.patch, YARN-2637.23.patch, YARN-2637.25.patch, YARN-2637.26.patch, YARN-2637.27.patch, YARN-2637.6.patch, YARN-2637.7.patch, YARN-2637.9.patch Currently, the number of AMs in a leaf queue is calculated in the following way:
{code}
max_am_resource = queue_max_capacity * maximum_am_resource_percent
#max_am_number = max_am_resource / minimum_allocation
#max_am_number_for_each_user = #max_am_number * userlimit * userlimit_factor
{code}
And when a new application is submitted to the RM, it checks whether the app can be activated in the following way:
{code}
for (Iterator<FiCaSchedulerApp> i = pendingApplications.iterator(); i.hasNext(); ) {
  FiCaSchedulerApp application = i.next();
  // Check queue limit
  if (getNumActiveApplications() >= getMaximumActiveApplications()) {
    break;
  }
  // Check user limit
  User user = getUser(application.getUser());
  if (user.getActiveApplications() < getMaximumActiveApplicationsPerUser()) {
    user.activateApplication();
    activeApplications.add(application);
    i.remove();
    LOG.info("Application " + application.getApplicationId()
        + " from user: " + application.getUser()
        + " activated in queue: " + getQueueName());
  }
}
{code}
An example: if a queue has capacity = 1G and max_am_resource_percent = 0.2, the maximum resource that AMs can use is 200M. Assuming minimum_allocation = 1M, the number of AMs that can be launched is 200, and if a user uses 5M for each AM (> minimum_allocation), all apps can still be activated, and they will occupy all of the queue's resources instead of only max_am_resource_percent of the queue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
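For illustration only: the kind of activation check the patches above are working toward would track the actual AM resource of running applications against the queue's AM limit, rather than deriving an AM count from minimum_allocation. The class and method names below are hypothetical, not taken from any attached patch.
{code}
// Hypothetical sketch: enforce maximum-am-resource-percent with the real AM size.
public class AmResourceLimitSketch {
  private final long queueMaxCapacityMb;      // e.g. 1024 (1G) in the example above
  private final double maxAmResourcePercent;  // e.g. 0.2
  private long amResourceUsedMb = 0;

  public AmResourceLimitSketch(long queueMaxCapacityMb, double maxAmResourcePercent) {
    this.queueMaxCapacityMb = queueMaxCapacityMb;
    this.maxAmResourcePercent = maxAmResourcePercent;
  }

  /** True if an AM needing amResourceMb still fits under the queue AM limit. */
  public synchronized boolean canActivate(long amResourceMb) {
    long amLimitMb = (long) (queueMaxCapacityMb * maxAmResourcePercent);
    return amResourceUsedMb + amResourceMb <= amLimitMb;
  }

  public synchronized void activate(long amResourceMb) {
    amResourceUsedMb += amResourceMb;
  }
}
{code}
With the numbers from the description (1G queue, 20% AM limit, 5M per AM), such a check stops activation after roughly 40 AMs instead of allowing 200.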
[jira] [Resolved] (YARN-2574) Add support for FairScheduler to the ReservationSystem
[ https://issues.apache.org/jira/browse/YARN-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot resolved YARN-2574. - Resolution: Fixed Fix Version/s: 2.7.0 Add support for FairScheduler to the ReservationSystem -- Key: YARN-2574 URL: https://issues.apache.org/jira/browse/YARN-2574 Project: Hadoop YARN Issue Type: New Feature Components: fairscheduler Reporter: Subru Krishnan Assignee: Anubhav Dhoot Fix For: 2.7.0 YARN-1051 introduces the ReservationSystem and the current implementation is based on CapacityScheduler. This JIRA proposes adding support for FairScheduler -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is minimumAllocation
[ https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266707#comment-14266707 ] Wangda Tan commented on YARN-2637: -- Failed test should not relate to this patch. Could you check the findbugs warning? Besides the findbugs warning, +1. Thanks, maximum-am-resource-percent could be violated when resource of AM is minimumAllocation Key: YARN-2637 URL: https://issues.apache.org/jira/browse/YARN-2637 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.6.0 Reporter: Wangda Tan Assignee: Craig Welch Priority: Critical Attachments: YARN-2637.0.patch, YARN-2637.1.patch, YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.15.patch, YARN-2637.16.patch, YARN-2637.17.patch, YARN-2637.18.patch, YARN-2637.19.patch, YARN-2637.2.patch, YARN-2637.20.patch, YARN-2637.21.patch, YARN-2637.22.patch, YARN-2637.23.patch, YARN-2637.25.patch, YARN-2637.26.patch, YARN-2637.27.patch, YARN-2637.6.patch, YARN-2637.7.patch, YARN-2637.9.patch Currently, number of AM in leaf queue will be calculated in following way: {code} max_am_resource = queue_max_capacity * maximum_am_resource_percent #max_am_number = max_am_resource / minimum_allocation #max_am_number_for_each_user = #max_am_number * userlimit * userlimit_factor {code} And when submit new application to RM, it will check if an app can be activated in following way: {code} for (IteratorFiCaSchedulerApp i=pendingApplications.iterator(); i.hasNext(); ) { FiCaSchedulerApp application = i.next(); // Check queue limit if (getNumActiveApplications() = getMaximumActiveApplications()) { break; } // Check user limit User user = getUser(application.getUser()); if (user.getActiveApplications() getMaximumActiveApplicationsPerUser()) { user.activateApplication(); activeApplications.add(application); i.remove(); LOG.info(Application + application.getApplicationId() + from user: + application.getUser() + activated in queue: + getQueueName()); } } {code} An example is, If a queue has capacity = 1G, max_am_resource_percent = 0.2, the maximum resource that AM can use is 200M, assuming minimum_allocation=1M, #am can be launched is 200, and if user uses 5M for each AM ( minimum_allocation). All apps can still be activated, and it will occupy all resource of a queue instead of only a max_am_resource_percent of a queue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2982) Use ReservationQueueConfiguration in CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-2982: Issue Type: Bug (was: Sub-task) Parent: (was: YARN-2574) Use ReservationQueueConfiguration in CapacityScheduler -- Key: YARN-2982 URL: https://issues.apache.org/jira/browse/YARN-2982 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler, resourcemanager Reporter: Anubhav Dhoot ReservationQueueConfiguration is common to reservation irrespective of Scheduler. It would be good to have CapacityScheduler also support this -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2982) Use ReservationQueueConfiguration in CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-2982: Component/s: (was: fairscheduler) Use ReservationQueueConfiguration in CapacityScheduler -- Key: YARN-2982 URL: https://issues.apache.org/jira/browse/YARN-2982 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler, resourcemanager Reporter: Anubhav Dhoot ReservationQueueConfiguration is common to reservation irrespective of Scheduler. It would be good to have CapacityScheduler also support this -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1680) availableResources sent to applicationMaster in heartbeat should exclude blacklistedNodes free memory.
[ https://issues.apache.org/jira/browse/YARN-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266843#comment-14266843 ] Chen He commented on YARN-1680: --- Since label scheduling patches are continuously being checked into trunk, we need to consider a little bit more than just blacklisted nodes in the headroom and user limit calculation. It is possible that the app asks for some labeled nodes in its ResourceRequest but some of them have already been blacklisted by the cluster. availableResources sent to applicationMaster in heartbeat should exclude blacklistedNodes free memory. -- Key: YARN-1680 URL: https://issues.apache.org/jira/browse/YARN-1680 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.2.0, 2.3.0 Environment: SuSE 11 SP2 + Hadoop-2.3 Reporter: Rohith Assignee: Chen He Attachments: YARN-1680-WIP.patch, YARN-1680-v2.patch, YARN-1680-v2.patch, YARN-1680.patch There are 4 NodeManagers with 8GB each, so total cluster capacity is 32GB. Cluster slow start is set to 1. A running job's reducer tasks occupy 29GB of the cluster. One NodeManager (NM-4) became unstable (3 map tasks got killed), so the MRAppMaster blacklisted the unstable NodeManager (NM-4). All reducer tasks are running in the cluster now. The MRAppMaster does not preempt the reducers, because the headroom used for the reducer preemption calculation still includes the blacklisted node's memory. This makes jobs hang forever (the ResourceManager does not assign any new containers on blacklisted nodes, but the availableResource it returns still counts the whole cluster's free memory). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
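One way to picture the behaviour being asked for: subtract the free capacity of blacklisted nodes from the headroom reported to the AM. The sketch below is illustrative arithmetic only and does not use the actual scheduler or MRAppMaster classes.
{code}
import java.util.Map;
import java.util.Set;

// Illustrative only: headroom reported to an AM should not count free memory
// on nodes the application has blacklisted.
class BlacklistAwareHeadroomSketch {
  static long adjustedHeadroomMb(long rawHeadroomMb, Map<String, Long> freeMbPerNode,
      Set<String> blacklistedNodes) {
    long blacklistedFreeMb = 0;
    for (String node : blacklistedNodes) {
      blacklistedFreeMb += freeMbPerNode.getOrDefault(node, 0L);
    }
    return Math.max(0, rawHeadroomMb - blacklistedFreeMb);
  }
}
{code}
In the scenario from the description, most of the 3GB of free memory sits on the blacklisted NM-4, so the adjusted headroom drops toward zero and the AM can decide to preempt reducers instead of waiting forever.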
[jira] [Updated] (YARN-2933) Capacity Scheduler preemption policy should only consider capacity without labels temporarily
[ https://issues.apache.org/jira/browse/YARN-2933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayank Bansal updated YARN-2933: Attachment: YARN-2933-4.patch Thanks [~wangda] for review. I updated the patch based on the comments. Thanks, Mayank Capacity Scheduler preemption policy should only consider capacity without labels temporarily - Key: YARN-2933 URL: https://issues.apache.org/jira/browse/YARN-2933 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler Reporter: Wangda Tan Assignee: Mayank Bansal Attachments: YARN-2933-1.patch, YARN-2933-2.patch, YARN-2933-3.patch, YARN-2933-4.patch Currently, we have capacity enforcement on each queue for each label in CapacityScheduler, but we don't have preemption policy to support that. YARN-2498 is targeting to support preemption respect node labels, but we have some gaps in code base, like queues/FiCaScheduler should be able to get usedResource/pendingResource, etc. by label. These items potentially need to refactor CS which we need spend some time carefully think about. For now, what immediately we can do is allow calculate ideal_allocation and preempt containers only for resources on nodes without labels, to avoid regression like: A cluster has some nodes with labels and some not, assume queueA isn't satisfied for resource without label, but for now, preemption policy may preempt resource from nodes with labels for queueA, that is not correct. Again, it is just a short-term enhancement, YARN-2498 will consider preemption respecting node-labels for Capacity Scheduler which is our final target. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
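As a rough illustration of the short-term behaviour described above, the preemption calculation would simply ignore capacity that sits on labeled nodes. This is a standalone sketch with made-up names, not the ProportionalCapacityPreemptionPolicy code from the attached patches.
{code}
import java.util.Map;
import java.util.Set;

// Sketch: count only unlabeled nodes when computing the capacity that the
// preemption policy is allowed to reason about (short-term, until YARN-2498).
class UnlabeledCapacitySketch {
  static long preemptableCapacityMb(Map<String, Set<String>> labelsPerNode,
      Map<String, Long> capacityMbPerNode) {
    long total = 0;
    for (Map.Entry<String, Long> e : capacityMbPerNode.entrySet()) {
      Set<String> labels = labelsPerNode.get(e.getKey());
      if (labels == null || labels.isEmpty()) {
        total += e.getValue(); // unlabeled node: eligible for ideal_allocation math
      }
    }
    return total;
  }
}
{code}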
[jira] [Commented] (YARN-2230) Fix description of yarn.scheduler.maximum-allocation-vcores in yarn-default.xml (or code)
[ https://issues.apache.org/jira/browse/YARN-2230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266880#comment-14266880 ] Jian He commented on YARN-2230: --- +1 Fix description of yarn.scheduler.maximum-allocation-vcores in yarn-default.xml (or code) - Key: YARN-2230 URL: https://issues.apache.org/jira/browse/YARN-2230 Project: Hadoop YARN Issue Type: Bug Components: client, documentation, scheduler Affects Versions: 2.4.0 Reporter: Adam Kawa Assignee: Vijay Bhat Priority: Minor Attachments: YARN-2230.001.patch, YARN-2230.002.patch When a user requests more vcores than the allocation limit (e.g. mapreduce.map.cpu.vcores is larger than yarn.scheduler.maximum-allocation-vcores), then InvalidResourceRequestException is thrown - https://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerUtils.java
{code}
if (resReq.getCapability().getVirtualCores() < 0 ||
    resReq.getCapability().getVirtualCores() > maximumResource.getVirtualCores()) {
  throw new InvalidResourceRequestException("Invalid resource request"
      + ", requested virtual cores < 0"
      + ", or requested virtual cores > max configured"
      + ", requestedVirtualCores=" + resReq.getCapability().getVirtualCores()
      + ", maxVirtualCores=" + maximumResource.getVirtualCores());
}
{code}
According to the documentation - yarn-default.xml http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-common/yarn-default.xml, the request should be capped to the allocation limit.
{code}
<property>
  <description>The maximum allocation for every container request at the RM, in terms of virtual CPU cores. Requests higher than this won't take effect, and will get capped to this value.</description>
  <name>yarn.scheduler.maximum-allocation-vcores</name>
  <value>32</value>
</property>
{code}
This means that: * Either the documentation or the code should be corrected (unless this exception is handled elsewhere accordingly, but it looks like it is not). This behavior is confusing, because when such a job (with mapreduce.map.cpu.vcores larger than yarn.scheduler.maximum-allocation-vcores) is submitted, it does not make any progress. The warnings/exceptions are thrown on the scheduler (RM) side, e.g.
{code}
2014-06-29 00:34:51,469 WARN org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Invalid resource ask by application appattempt_1403993411503_0002_01
org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, requested virtual cores < 0, or requested virtual cores > max configured, requestedVirtualCores=32, maxVirtualCores=3
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:237)
at org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.validateResourceRequests(RMServerUtils.java:80)
at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:420)
.
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1986)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1982)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:416)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1980)
{code}
* IMHO, such an exception should be forwarded to the client. Otherwise, it is not obvious why a job does not make any progress. The same looks to apply to memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
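For reference, the capping behaviour that the yarn-default.xml description promises is a one-line clamp; the snippet below is only a sketch of that alternative semantics, not code from any patch attached here.
{code}
// Sketch of "cap instead of reject": a request above
// yarn.scheduler.maximum-allocation-vcores is clamped to the configured limit.
class VcoreCapSketch {
  static int capVirtualCores(int requestedVcores, int maxAllocationVcores) {
    if (requestedVcores < 0) {
      throw new IllegalArgumentException("requested virtual cores < 0");
    }
    return Math.min(requestedVcores, maxAllocationVcores);
  }
}
{code}
For example, capVirtualCores(32, 3) would return 3 rather than rejecting the request, which is what the documented wording suggests.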
[jira] [Commented] (YARN-2427) Add support for moving apps between queues in RM web services
[ https://issues.apache.org/jira/browse/YARN-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266900#comment-14266900 ] Hudson commented on YARN-2427: -- FAILURE: Integrated in Hadoop-trunk-Commit #6818 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6818/]) YARN-2427. Added the API of moving apps between queues in RM web services. Contributed by Varun Vasudev. (zjshen: rev 60103fca04dc713183e4ec9e12f961642e7d1001) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/JAXBContextResolver.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/AppQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ResourceManagerRest.apt.vm * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestFifoScheduler.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesAppsModification.java Add support for moving apps between queues in RM web services - Key: YARN-2427 URL: https://issues.apache.org/jira/browse/YARN-2427 Project: Hadoop YARN Issue Type: New Feature Components: resourcemanager Reporter: Varun Vasudev Assignee: Varun Vasudev Fix For: 2.7.0 Attachments: apache-yarn-2427.0.patch, apache-yarn-2427.1.patch, apache-yarn-2427.2.patch, apache-yarn-2427.3.patch, apache-yarn-2427.4.patch Support for moving apps from one queue to another is now present in CapacityScheduler and FairScheduler. We should expose the functionality via RM web services as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
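A hedged example of invoking the new endpoint from Java follows. The path and JSON body are written from this JIRA's summary and may not match the committed ResourceManagerRest.apt.vm exactly; the RM address and application id are placeholders.
{code}
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Assumed shape of the app-to-queue move exposed by the RM web services.
public class MoveAppQueueExample {
  public static void main(String[] args) throws Exception {
    String rm = "http://rm-host:8088";                 // placeholder RM address
    String appId = "application_0000000000000_0001";   // placeholder app id
    URL url = new URL(rm + "/ws/v1/cluster/apps/" + appId + "/queue");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("PUT");
    conn.setRequestProperty("Content-Type", "application/json");
    conn.setDoOutput(true);
    try (OutputStream out = conn.getOutputStream()) {
      out.write("{\"queue\":\"targetQueue\"}".getBytes("UTF-8"));
    }
    System.out.println("HTTP " + conn.getResponseCode());
  }
}
{code}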
[jira] [Commented] (YARN-2571) RM to support YARN registry
[ https://issues.apache.org/jira/browse/YARN-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266911#comment-14266911 ] Xuan Gong commented on YARN-2571: - Thanks for the patch. Overall looks fine. 1. Could we move the registry service from the always-on services to the active services? For example, if RM HA is enabled, only the active RM can start the registry service. 2. The Curator Framework is used here to do the ZK operations. I am not familiar with this. Does the Curator framework provide an automatic fencing mechanism when we write/delete the related znodes? For example, only the active RM can write data to the znodes. The standby RM should not be allowed to do anything. RM to support YARN registry Key: YARN-2571 URL: https://issues.apache.org/jira/browse/YARN-2571 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Steve Loughran Assignee: Steve Loughran Attachments: YARN-2571-001.patch, YARN-2571-002.patch, YARN-2571-003.patch, YARN-2571-005.patch, YARN-2571-007.patch, YARN-2571-008.patch, YARN-2571-009.patch The RM needs to (optionally) integrate with the YARN registry: # startup: create the /services and /users paths with system ACLs (yarn, hdfs principals) # app-launch: create the user directory /users/$username with the relevant permissions (CRD) for them to create subnodes. # attempt, container, app completion: remove service records with the matching persistence and ID -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3003) Provide API for client to retrieve label to node mapping
[ https://issues.apache.org/jira/browse/YARN-3003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266919#comment-14266919 ] Varun Saxena commented on YARN-3003: Thanks [~leftnoteasy] for your input. Yes, it's about getting labels to nodes. Separating it out into two APIs is cleaner and less confusing to the user. Basically, I wanted your input on whether we need this new API or not. I had a similar idea in mind regarding how to fix this issue. Will look at YARN-2943 as well, as [~Naganarasimha] suggested. Thanks. Provide API for client to retrieve label to node mapping Key: YARN-3003 URL: https://issues.apache.org/jira/browse/YARN-3003 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Ted Yu Assignee: Varun Saxena Currently YarnClient#getNodeToLabels() returns the mapping from NodeId to the set of labels associated with the node. A client (such as Slider) may be interested in the label to node mapping - given a label, return the nodes with this label. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
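Until a dedicated API lands, a client can derive the label-to-node view by inverting the map that YarnClient#getNodeToLabels() already returns (per the description above); a minimal sketch of that inversion, written independently of the YARN record types, is below.
{code}
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch: invert a node -> labels map into a label -> nodes map on the client side.
class LabelToNodesSketch {
  static <N> Map<String, Set<N>> labelsToNodes(Map<N, Set<String>> nodeToLabels) {
    Map<String, Set<N>> result = new HashMap<>();
    for (Map.Entry<N, Set<String>> e : nodeToLabels.entrySet()) {
      for (String label : e.getValue()) {
        result.computeIfAbsent(label, k -> new HashSet<>()).add(e.getKey());
      }
    }
    return result;
  }
}
{code}
A dedicated server-side API is still preferable for large clusters, since the client otherwise has to pull the full node-to-labels map just to answer one label query.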
[jira] [Commented] (YARN-2902) Killing a container that is localizing can orphan resources in the DOWNLOADING state
[ https://issues.apache.org/jira/browse/YARN-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266923#comment-14266923 ] Varun Saxena commented on YARN-2902: Kindly review this. It has been pending for a long time. Killing a container that is localizing can orphan resources in the DOWNLOADING state Key: YARN-2902 URL: https://issues.apache.org/jira/browse/YARN-2902 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.5.0 Reporter: Jason Lowe Assignee: Varun Saxena Fix For: 2.7.0 Attachments: YARN-2902.002.patch, YARN-2902.patch If a container is in the process of localizing when it is stopped/killed then resources are left in the DOWNLOADING state. If no other container comes along and requests these resources, they linger around with no references but aren't cleaned up during normal cache cleanup scans, since the scan will never delete resources in the DOWNLOADING state even if their reference count is zero. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
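The failure mode in the description can be pictured with a small sketch of a cleanup pass that, unlike the current behaviour, also reclaims DOWNLOADING entries whose reference count has dropped to zero. The states and fields are simplified stand-ins, not the NodeManager's LocalizedResource implementation.
{code}
import java.util.Iterator;
import java.util.Map;

// Simplified stand-ins for the NM local-resource states relevant here.
enum ResState { DOWNLOADING, LOCALIZED }

class CachedResource {
  ResState state;
  int refCount;
}

// Sketch: a cache cleanup scan that also removes orphaned DOWNLOADING resources,
// instead of only ever considering LOCALIZED entries for deletion.
class CacheCleanupSketch {
  void clean(Map<String, CachedResource> cache) {
    Iterator<Map.Entry<String, CachedResource>> it = cache.entrySet().iterator();
    while (it.hasNext()) {
      CachedResource r = it.next().getValue();
      if (r.refCount == 0
          && (r.state == ResState.LOCALIZED || r.state == ResState.DOWNLOADING)) {
        it.remove(); // a real NM would also delete the partial files on disk and
                     // apply its cache-size limits before evicting LOCALIZED entries
      }
    }
  }
}
{code}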
[jira] [Commented] (YARN-2936) YARNDelegationTokenIdentifier doesn't set proto.builder now
[ https://issues.apache.org/jira/browse/YARN-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266925#comment-14266925 ] Varun Saxena commented on YARN-2936: [~jianhe], sorry, I couldn't get you. What do you mean by core change? The newly added test code calls getProto and that is where the change is. YARNDelegationTokenIdentifier doesn't set proto.builder now --- Key: YARN-2936 URL: https://issues.apache.org/jira/browse/YARN-2936 Project: Hadoop YARN Issue Type: Bug Reporter: Zhijie Shen Assignee: Varun Saxena Attachments: YARN-2936.001.patch, YARN-2936.002.patch, YARN-2936.003.patch, YARN-2936.004.patch, YARN-2936.005.patch After YARN-2743, the setters are removed from YARNDelegationTokenIdentifier, such that when constructing an object which extends YARNDelegationTokenIdentifier, proto.builder is not set at all. Later on, when we call getProto() on it, we will just get an empty proto object. It seems to do no harm to the production code path, as we will always call getBytes() before using proto to persist the DT in the state store, when generating the password. I think the setter is removed to avoid duplicating the setting of the fields when getBytes() is called. However, YARNDelegationTokenIdentifier doesn't work properly alone. YARNDelegationTokenIdentifier is tightly coupled with the logic in secretManager. It's vulnerable if something is changed at secretManager. For example, in the test case of YARN-2837, I spent time to figure out we need to execute getBytes() first to make sure the testing DTs can be properly put into the state store. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2978) ResourceManager crashes with NPE while getting queue info
[ https://issues.apache.org/jira/browse/YARN-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266881#comment-14266881 ] Hudson commented on YARN-2978: -- FAILURE: Integrated in Hadoop-trunk-Commit #6817 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6817/]) YARN-2978. Fixed potential NPE while getting queue info. Contributed by Varun Saxena (jianhe: rev dd57c2047bfd21910acc38c98153eedf1db75169) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractCSQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java ResourceManager crashes with NPE while getting queue info - Key: YARN-2978 URL: https://issues.apache.org/jira/browse/YARN-2978 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.5.1 Reporter: Jason Tufo Assignee: Varun Saxena Priority: Critical Labels: capacityscheduler, resourcemanager Fix For: 2.7.0 Attachments: YARN-2978.001.patch, YARN-2978.002.patch, YARN-2978.003.patch, YARN-2978.004.patch java.lang.NullPointerException at org.apache.hadoop.yarn.proto.YarnProtos$QueueInfoProto.isInitialized(YarnProtos.java:29625) at org.apache.hadoop.yarn.proto.YarnProtos$QueueInfoProto$Builder.build(YarnProtos.java:29939) at org.apache.hadoop.yarn.api.records.impl.pb.QueueInfoPBImpl.mergeLocalToProto(QueueInfoPBImpl.java:290) at org.apache.hadoop.yarn.api.records.impl.pb.QueueInfoPBImpl.getProto(QueueInfoPBImpl.java:157) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.convertToProtoFormat(GetQueueInfoResponsePBImpl.java:128) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.mergeLocalToBuilder(GetQueueInfoResponsePBImpl.java:104) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.mergeLocalToProto(GetQueueInfoResponsePBImpl.java:111) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.getProto(GetQueueInfoResponsePBImpl.java:53) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueInfo(ApplicationClientProtocolPBServiceImpl.java:235) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:333) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is minimumAllocation
[ https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Craig Welch updated YARN-2637: -- Attachment: YARN-2637.28.patch maximum-am-resource-percent could be violated when resource of AM is minimumAllocation Key: YARN-2637 URL: https://issues.apache.org/jira/browse/YARN-2637 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.6.0 Reporter: Wangda Tan Assignee: Craig Welch Priority: Critical Attachments: YARN-2637.0.patch, YARN-2637.1.patch, YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.15.patch, YARN-2637.16.patch, YARN-2637.17.patch, YARN-2637.18.patch, YARN-2637.19.patch, YARN-2637.2.patch, YARN-2637.20.patch, YARN-2637.21.patch, YARN-2637.22.patch, YARN-2637.23.patch, YARN-2637.25.patch, YARN-2637.26.patch, YARN-2637.27.patch, YARN-2637.28.patch, YARN-2637.6.patch, YARN-2637.7.patch, YARN-2637.9.patch Currently, number of AM in leaf queue will be calculated in following way: {code} max_am_resource = queue_max_capacity * maximum_am_resource_percent #max_am_number = max_am_resource / minimum_allocation #max_am_number_for_each_user = #max_am_number * userlimit * userlimit_factor {code} And when submit new application to RM, it will check if an app can be activated in following way: {code} for (IteratorFiCaSchedulerApp i=pendingApplications.iterator(); i.hasNext(); ) { FiCaSchedulerApp application = i.next(); // Check queue limit if (getNumActiveApplications() = getMaximumActiveApplications()) { break; } // Check user limit User user = getUser(application.getUser()); if (user.getActiveApplications() getMaximumActiveApplicationsPerUser()) { user.activateApplication(); activeApplications.add(application); i.remove(); LOG.info(Application + application.getApplicationId() + from user: + application.getUser() + activated in queue: + getQueueName()); } } {code} An example is, If a queue has capacity = 1G, max_am_resource_percent = 0.2, the maximum resource that AM can use is 200M, assuming minimum_allocation=1M, #am can be launched is 200, and if user uses 5M for each AM ( minimum_allocation). All apps can still be activated, and it will occupy all resource of a queue instead of only a max_am_resource_percent of a queue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2979) Unsupported operation exception in message building (YarnProtos)
[ https://issues.apache.org/jira/browse/YARN-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266908#comment-14266908 ] Varun Saxena commented on YARN-2979: Resolving this as well as YARN-2978 is committed Unsupported operation exception in message building (YarnProtos) Key: YARN-2979 URL: https://issues.apache.org/jira/browse/YARN-2979 Project: Hadoop YARN Issue Type: Bug Components: nodemanager, resourcemanager Affects Versions: 2.5.1 Reporter: Jason Tufo Assignee: Varun Saxena Fix For: 2.7.0 java.lang.UnsupportedOperationException at java.util.AbstractList.add(AbstractList.java:148) at java.util.AbstractList.add(AbstractList.java:108) at com.google.protobuf.AbstractMessageLite$Builder.addAll(AbstractMessageLite.java:330) at org.apache.hadoop.yarn.proto.YarnProtos$QueueInfoProto$Builder.addAllApplications(YarnProtos.java:30702) at org.apache.hadoop.yarn.api.records.impl.pb.QueueInfoPBImpl.addApplicationsToProto(QueueInfoPBImpl.java:227) at org.apache.hadoop.yarn.api.records.impl.pb.QueueInfoPBImpl.mergeLocalToBuilder(QueueInfoPBImpl.java:282) at org.apache.hadoop.yarn.api.records.impl.pb.QueueInfoPBImpl.mergeLocalToProto(QueueInfoPBImpl.java:289) at org.apache.hadoop.yarn.api.records.impl.pb.QueueInfoPBImpl.getProto(QueueInfoPBImpl.java:157) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.convertToProtoFormat(GetQueueInfoResponsePBImpl.java:128) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.mergeLocalToBuilder(GetQueueInfoResponsePBImpl.java:104) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.mergeLocalToProto(GetQueueInfoResponsePBImpl.java:111) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.getProto(GetQueueInfoResponsePBImpl.java:53) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueInfo(ApplicationClientProtocolPBServiceImpl.java:235) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:333) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (YARN-2979) Unsupported operation exception in message building (YarnProtos)
[ https://issues.apache.org/jira/browse/YARN-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena resolved YARN-2979. Resolution: Fixed Fix Version/s: 2.7.0 Unsupported operation exception in message building (YarnProtos) Key: YARN-2979 URL: https://issues.apache.org/jira/browse/YARN-2979 Project: Hadoop YARN Issue Type: Bug Components: nodemanager, resourcemanager Affects Versions: 2.5.1 Reporter: Jason Tufo Assignee: Varun Saxena Fix For: 2.7.0 java.lang.UnsupportedOperationException at java.util.AbstractList.add(AbstractList.java:148) at java.util.AbstractList.add(AbstractList.java:108) at com.google.protobuf.AbstractMessageLite$Builder.addAll(AbstractMessageLite.java:330) at org.apache.hadoop.yarn.proto.YarnProtos$QueueInfoProto$Builder.addAllApplications(YarnProtos.java:30702) at org.apache.hadoop.yarn.api.records.impl.pb.QueueInfoPBImpl.addApplicationsToProto(QueueInfoPBImpl.java:227) at org.apache.hadoop.yarn.api.records.impl.pb.QueueInfoPBImpl.mergeLocalToBuilder(QueueInfoPBImpl.java:282) at org.apache.hadoop.yarn.api.records.impl.pb.QueueInfoPBImpl.mergeLocalToProto(QueueInfoPBImpl.java:289) at org.apache.hadoop.yarn.api.records.impl.pb.QueueInfoPBImpl.getProto(QueueInfoPBImpl.java:157) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.convertToProtoFormat(GetQueueInfoResponsePBImpl.java:128) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.mergeLocalToBuilder(GetQueueInfoResponsePBImpl.java:104) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.mergeLocalToProto(GetQueueInfoResponsePBImpl.java:111) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.getProto(GetQueueInfoResponsePBImpl.java:53) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueInfo(ApplicationClientProtocolPBServiceImpl.java:235) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:333) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2978) ResourceManager crashes with NPE while getting queue info
[ https://issues.apache.org/jira/browse/YARN-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266920#comment-14266920 ] Varun Saxena commented on YARN-2978: Thanks [~jianhe] for the review and commit. ResourceManager crashes with NPE while getting queue info - Key: YARN-2978 URL: https://issues.apache.org/jira/browse/YARN-2978 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.5.1 Reporter: Jason Tufo Assignee: Varun Saxena Priority: Critical Labels: capacityscheduler, resourcemanager Fix For: 2.7.0 Attachments: YARN-2978.001.patch, YARN-2978.002.patch, YARN-2978.003.patch, YARN-2978.004.patch java.lang.NullPointerException at org.apache.hadoop.yarn.proto.YarnProtos$QueueInfoProto.isInitialized(YarnProtos.java:29625) at org.apache.hadoop.yarn.proto.YarnProtos$QueueInfoProto$Builder.build(YarnProtos.java:29939) at org.apache.hadoop.yarn.api.records.impl.pb.QueueInfoPBImpl.mergeLocalToProto(QueueInfoPBImpl.java:290) at org.apache.hadoop.yarn.api.records.impl.pb.QueueInfoPBImpl.getProto(QueueInfoPBImpl.java:157) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.convertToProtoFormat(GetQueueInfoResponsePBImpl.java:128) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.mergeLocalToBuilder(GetQueueInfoResponsePBImpl.java:104) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.mergeLocalToProto(GetQueueInfoResponsePBImpl.java:111) at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.getProto(GetQueueInfoResponsePBImpl.java:53) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueInfo(ApplicationClientProtocolPBServiceImpl.java:235) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:333) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2423) TimelineClient should wrap all GET APIs to facilitate Java users
[ https://issues.apache.org/jira/browse/YARN-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266985#comment-14266985 ] Hadoop QA commented on YARN-2423: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12690438/YARN-2423.006.patch against trunk revision 60103fc. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:red}-1 eclipse:eclipse{color}. The patch failed to build with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6259//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6259//console This message is automatically generated. TimelineClient should wrap all GET APIs to facilitate Java users Key: YARN-2423 URL: https://issues.apache.org/jira/browse/YARN-2423 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Robert Kanter Attachments: YARN-2423.004.patch, YARN-2423.005.patch, YARN-2423.006.patch, YARN-2423.patch, YARN-2423.patch, YARN-2423.patch TimelineClient provides the Java method to put timeline entities. It's also good to wrap over all GET APIs (both entity and domain), and deserialize the json response into Java POJO objects. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM
[ https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267068#comment-14267068 ] Andy Schlaikjer commented on YARN-1529: --- Any update on this? These new metrics look valuable. Add Localization overhead metrics to NM --- Key: YARN-1529 URL: https://issues.apache.org/jira/browse/YARN-1529 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Reporter: Gera Shegalov Assignee: Gera Shegalov Attachments: YARN-1529.v01.patch, YARN-1529.v02.patch, YARN-1529.v03.patch Users are often unaware of localization cost that their jobs incur. To measure effectiveness of localization caches it is necessary to expose the overhead in the form of metrics. We propose addition of the following metrics to NodeManagerMetrics. When a container is about to launch, its set of LocalResources has to be fetched from a central location, typically on HDFS, that results in a number of download requests for the files missing in caches. LocalizedFilesMissed: total files (requests) downloaded from DFS. Cache misses. LocalizedFilesCached: total localization requests that were served from local caches. Cache hits. LocalizedBytesMissed: total bytes downloaded from DFS due to cache misses. LocalizedBytesCached: total bytes satisfied from local caches. Localized(Files|Bytes)CachedRatio: percentage of localized (files|bytes) that were served out of cache: ratio = 100 * caches / (caches + misses) LocalizationDownloadNanos: total elapsed time in nanoseconds for a container to go from ResourceRequestTransition to LocalizedTransition -- This message was sent by Atlassian JIRA (v6.3.4#6332)
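The cache-ratio formula quoted in the description, ratio = 100 * caches / (caches + misses), is plain arithmetic and easy to sanity-check; the helper below is just that formula with a divide-by-zero guard, not the proposed NodeManagerMetrics code.
{code}
// ratio = 100 * caches / (caches + misses), guarding against an empty sample.
class LocalizationRatioSketch {
  static long cachedRatioPercent(long cachedFiles, long missedFiles) {
    long total = cachedFiles + missedFiles;
    return total == 0 ? 0 : (100 * cachedFiles) / total;
  }
}
{code}
For example, 30 requests served from the local cache and 10 downloaded from DFS give a 75% hit ratio.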
[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM
[ https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267075#comment-14267075 ] Hadoop QA commented on YARN-1529: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12621292/YARN-1529.v03.patch against trunk revision 788ee35. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6260//console This message is automatically generated. Add Localization overhead metrics to NM --- Key: YARN-1529 URL: https://issues.apache.org/jira/browse/YARN-1529 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Reporter: Gera Shegalov Assignee: Gera Shegalov Attachments: YARN-1529.v01.patch, YARN-1529.v02.patch, YARN-1529.v03.patch Users are often unaware of localization cost that their jobs incur. To measure effectiveness of localization caches it is necessary to expose the overhead in the form of metrics. We propose addition of the following metrics to NodeManagerMetrics. When a container is about to launch, its set of LocalResources has to be fetched from a central location, typically on HDFS, that results in a number of download requests for the files missing in caches. LocalizedFilesMissed: total files (requests) downloaded from DFS. Cache misses. LocalizedFilesCached: total localization requests that were served from local caches. Cache hits. LocalizedBytesMissed: total bytes downloaded from DFS due to cache misses. LocalizedBytesCached: total bytes satisfied from local caches. Localized(Files|Bytes)CachedRatio: percentage of localized (files|bytes) that were served out of cache: ratio = 100 * caches / (caches + misses) LocalizationDownloadNanos: total elapsed time in nanoseconds for a container to go from ResourceRequestTransition to LocalizedTransition -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is minimumAllocation
[ https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267086#comment-14267086 ] Yi Liu commented on YARN-2637: -- {quote} Findbugs was the result of changing the ratio of sync to unsync accesses which hit the findbugs limits, but not the pattern itself, which looks fine, so added fb exclusion. {quote} Not exactly, in FairScheduler, it's a real issue, we need *synchronized* for _resolveReservationQueueName_. Already have a JIRA YARN-3010 to fix the findbugs... maximum-am-resource-percent could be violated when resource of AM is minimumAllocation Key: YARN-2637 URL: https://issues.apache.org/jira/browse/YARN-2637 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.6.0 Reporter: Wangda Tan Assignee: Craig Welch Priority: Critical Attachments: YARN-2637.0.patch, YARN-2637.1.patch, YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.15.patch, YARN-2637.16.patch, YARN-2637.17.patch, YARN-2637.18.patch, YARN-2637.19.patch, YARN-2637.2.patch, YARN-2637.20.patch, YARN-2637.21.patch, YARN-2637.22.patch, YARN-2637.23.patch, YARN-2637.25.patch, YARN-2637.26.patch, YARN-2637.27.patch, YARN-2637.28.patch, YARN-2637.6.patch, YARN-2637.7.patch, YARN-2637.9.patch Currently, number of AM in leaf queue will be calculated in following way: {code} max_am_resource = queue_max_capacity * maximum_am_resource_percent #max_am_number = max_am_resource / minimum_allocation #max_am_number_for_each_user = #max_am_number * userlimit * userlimit_factor {code} And when submit new application to RM, it will check if an app can be activated in following way: {code} for (IteratorFiCaSchedulerApp i=pendingApplications.iterator(); i.hasNext(); ) { FiCaSchedulerApp application = i.next(); // Check queue limit if (getNumActiveApplications() = getMaximumActiveApplications()) { break; } // Check user limit User user = getUser(application.getUser()); if (user.getActiveApplications() getMaximumActiveApplicationsPerUser()) { user.activateApplication(); activeApplications.add(application); i.remove(); LOG.info(Application + application.getApplicationId() + from user: + application.getUser() + activated in queue: + getQueueName()); } } {code} An example is, If a queue has capacity = 1G, max_am_resource_percent = 0.2, the maximum resource that AM can use is 200M, assuming minimum_allocation=1M, #am can be launched is 200, and if user uses 5M for each AM ( minimum_allocation). All apps can still be activated, and it will occupy all resource of a queue instead of only a max_am_resource_percent of a queue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
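The warning being discussed is the usual inconsistent-synchronization pattern findbugs flags: a field guarded by a lock in most accesses but touched without it elsewhere. The sketch below is generic, with invented names, and is not the FairScheduler code; it only shows why making a method such as resolveReservationQueueName synchronized silences the warning and closes the race.
{code}
// Generic inconsistent-synchronization example (what findbugs typically reports
// as IS2_INCONSISTENT_SYNC): every access of the field must take the same lock.
class QueueNameResolverSketch {
  private String reservationQueueName;

  synchronized void update(String name) {
    reservationQueueName = name;
  }

  // If this read were unsynchronized, findbugs would flag the field; adding
  // synchronized here is the shape of the fix suggested for FairScheduler.
  synchronized String resolve() {
    return reservationQueueName;
  }
}
{code}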
[jira] [Commented] (YARN-2936) YARNDelegationTokenIdentifier doesn't set proto.builder now
[ https://issues.apache.org/jira/browse/YARN-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267098#comment-14267098 ] Varun Saxena commented on YARN-2936: [~jianhe], changed the test. Kindly review. YARNDelegationTokenIdentifier doesn't set proto.builder now --- Key: YARN-2936 URL: https://issues.apache.org/jira/browse/YARN-2936 Project: Hadoop YARN Issue Type: Bug Reporter: Zhijie Shen Assignee: Varun Saxena Attachments: YARN-2936.001.patch, YARN-2936.002.patch, YARN-2936.003.patch, YARN-2936.004.patch, YARN-2936.005.patch, YARN-2936.006.patch After YARN-2743, the setters are removed from YARNDelegationTokenIdentifier, such that when constructing a object which extends YARNDelegationTokenIdentifier, proto.builder is not set at all. Later on, when we call getProto() of it, we will just get an empty proto object. It seems to do no harm to the production code path, as we will always call getBytes() before using proto to persist the DT in the state store, when we generating the password. I think the setter is removed to avoid duplicating setting the fields why getBytes() is called. However, YARNDelegationTokenIdentifier doesn't work properly alone. YARNDelegationTokenIdentifier is tightly coupled with the logic in secretManager. It's vulnerable if something is changed at secretManager. For example, in the test case of YARN-2837, I spent time to figure out we need to execute getBytes() first to make sure the testing DTs can be properly put into the state store. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2213) Change proxy-user cookie log in AmIpFilter to DEBUG
[ https://issues.apache.org/jira/browse/YARN-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-2213: --- Attachment: YARN-2213.001.patch Change proxy-user cookie log in AmIpFilter to DEBUG --- Key: YARN-2213 URL: https://issues.apache.org/jira/browse/YARN-2213 Project: Hadoop YARN Issue Type: Task Reporter: Ted Yu Assignee: Varun Saxena Priority: Minor Fix For: 2.7.0 Attachments: YARN-2213.001.patch I saw a lot of the following lines in AppMaster log: {code} 14/06/24 17:12:36 WARN web.SliderAmIpFilter: Could not find proxy-user cookie, so user will not be set 14/06/24 17:12:39 WARN web.SliderAmIpFilter: Could not find proxy-user cookie, so user will not be set 14/06/24 17:12:39 WARN web.SliderAmIpFilter: Could not find proxy-user cookie, so user will not be set {code} For long running app, this would consume considerable log space. Log level should be changed to DEBUG. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
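The requested change is essentially a one-line log-level adjustment; a generic illustration with commons-logging is below. The real message lives in AmIpFilter / SliderAmIpFilter, so this is only the shape of the change, not the patch itself.
{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

// Shape of the fix: demote the per-request warning to DEBUG so long-running
// AMs do not fill their logs with it.
class ProxyUserCookieLoggingSketch {
  private static final Log LOG = LogFactory.getLog(ProxyUserCookieLoggingSketch.class);

  void onMissingProxyUserCookie() {
    if (LOG.isDebugEnabled()) {
      LOG.debug("Could not find proxy-user cookie, so user will not be set");
    }
  }
}
{code}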
[jira] [Updated] (YARN-2428) LCE default banned user list should have yarn
[ https://issues.apache.org/jira/browse/YARN-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-2428: --- Attachment: (was: YARN-2428.patch) LCE default banned user list should have yarn - Key: YARN-2428 URL: https://issues.apache.org/jira/browse/YARN-2428 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: Allen Wittenauer Assignee: Varun Saxena Priority: Trivial Labels: newbie Fix For: 2.7.0 When task-controller was retrofitted to YARN, the default banned user list didn't add yarn. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2213) Change proxy-user cookie log in AmIpFilter to DEBUG
[ https://issues.apache.org/jira/browse/YARN-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267100#comment-14267100 ] Ted Yu commented on YARN-2213: -- lgtm Change proxy-user cookie log in AmIpFilter to DEBUG --- Key: YARN-2213 URL: https://issues.apache.org/jira/browse/YARN-2213 Project: Hadoop YARN Issue Type: Task Reporter: Ted Yu Assignee: Varun Saxena Priority: Minor Attachments: YARN-2213.001.patch I saw a lot of the following lines in AppMaster log: {code} 14/06/24 17:12:36 WARN web.SliderAmIpFilter: Could not find proxy-user cookie, so user will not be set 14/06/24 17:12:39 WARN web.SliderAmIpFilter: Could not find proxy-user cookie, so user will not be set 14/06/24 17:12:39 WARN web.SliderAmIpFilter: Could not find proxy-user cookie, so user will not be set {code} For long running app, this would consume considerable log space. Log level should be changed to DEBUG. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2902) Killing a container that is localizing can orphan resources in the DOWNLOADING state
[ https://issues.apache.org/jira/browse/YARN-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266968#comment-14266968 ] Varun Saxena commented on YARN-2902: Thanks [~jlowe] Killing a container that is localizing can orphan resources in the DOWNLOADING state Key: YARN-2902 URL: https://issues.apache.org/jira/browse/YARN-2902 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.5.0 Reporter: Jason Lowe Assignee: Varun Saxena Fix For: 2.7.0 Attachments: YARN-2902.002.patch, YARN-2902.patch If a container is in the process of localizing when it is stopped/killed then resources are left in the DOWNLOADING state. If no other container comes along and requests these resources they linger around with no reference counts but aren't cleaned up during normal cache cleanup scans since it will never delete resources in the DOWNLOADING state even if their reference count is zero. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is minimumAllocation
[ https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267030#comment-14267030 ] Craig Welch commented on YARN-2637: --- Findbugs was the result of changing the ratio of sync to unsync accesses which hit the findbugs limits, but not the pattern itself, which looks fine, so added fb exclusion. TestFairScheduler passes on my box with the change so build server related / not a real issue. Was not originally planning to address the max am percent for user as that wasn't the issue we kept encountering but forgot to mention this / edit the jira to reflect. However, I'm going to see what the impact would be of adding that now then we can decide to include it or move to it's own jira. maximum-am-resource-percent could be violated when resource of AM is minimumAllocation Key: YARN-2637 URL: https://issues.apache.org/jira/browse/YARN-2637 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.6.0 Reporter: Wangda Tan Assignee: Craig Welch Priority: Critical Attachments: YARN-2637.0.patch, YARN-2637.1.patch, YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.15.patch, YARN-2637.16.patch, YARN-2637.17.patch, YARN-2637.18.patch, YARN-2637.19.patch, YARN-2637.2.patch, YARN-2637.20.patch, YARN-2637.21.patch, YARN-2637.22.patch, YARN-2637.23.patch, YARN-2637.25.patch, YARN-2637.26.patch, YARN-2637.27.patch, YARN-2637.28.patch, YARN-2637.6.patch, YARN-2637.7.patch, YARN-2637.9.patch Currently, number of AM in leaf queue will be calculated in following way: {code} max_am_resource = queue_max_capacity * maximum_am_resource_percent #max_am_number = max_am_resource / minimum_allocation #max_am_number_for_each_user = #max_am_number * userlimit * userlimit_factor {code} And when submit new application to RM, it will check if an app can be activated in following way: {code} for (IteratorFiCaSchedulerApp i=pendingApplications.iterator(); i.hasNext(); ) { FiCaSchedulerApp application = i.next(); // Check queue limit if (getNumActiveApplications() = getMaximumActiveApplications()) { break; } // Check user limit User user = getUser(application.getUser()); if (user.getActiveApplications() getMaximumActiveApplicationsPerUser()) { user.activateApplication(); activeApplications.add(application); i.remove(); LOG.info(Application + application.getApplicationId() + from user: + application.getUser() + activated in queue: + getQueueName()); } } {code} An example is, If a queue has capacity = 1G, max_am_resource_percent = 0.2, the maximum resource that AM can use is 200M, assuming minimum_allocation=1M, #am can be launched is 200, and if user uses 5M for each AM ( minimum_allocation). All apps can still be activated, and it will occupy all resource of a queue instead of only a max_am_resource_percent of a queue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2936) YARNDelegationTokenIdentifier doesn't set proto.builder now
[ https://issues.apache.org/jira/browse/YARN-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-2936: --- Attachment: YARN-2936.006.patch YARNDelegationTokenIdentifier doesn't set proto.builder now --- Key: YARN-2936 URL: https://issues.apache.org/jira/browse/YARN-2936 Project: Hadoop YARN Issue Type: Bug Reporter: Zhijie Shen Assignee: Varun Saxena Attachments: YARN-2936.001.patch, YARN-2936.002.patch, YARN-2936.003.patch, YARN-2936.004.patch, YARN-2936.005.patch, YARN-2936.006.patch After YARN-2743, the setters are removed from YARNDelegationTokenIdentifier, such that when constructing a object which extends YARNDelegationTokenIdentifier, proto.builder is not set at all. Later on, when we call getProto() of it, we will just get an empty proto object. It seems to do no harm to the production code path, as we will always call getBytes() before using proto to persist the DT in the state store, when we generating the password. I think the setter is removed to avoid duplicating setting the fields why getBytes() is called. However, YARNDelegationTokenIdentifier doesn't work properly alone. YARNDelegationTokenIdentifier is tightly coupled with the logic in secretManager. It's vulnerable if something is changed at secretManager. For example, in the test case of YARN-2837, I spent time to figure out we need to execute getBytes() first to make sure the testing DTs can be properly put into the state store. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2933) Capacity Scheduler preemption policy should only consider capacity without labels temporarily
[ https://issues.apache.org/jira/browse/YARN-2933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266981#comment-14266981 ] Hadoop QA commented on YARN-2933: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12690428/YARN-2933-4.patch against trunk revision dd57c20. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6257//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6257//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6257//console This message is automatically generated. Capacity Scheduler preemption policy should only consider capacity without labels temporarily - Key: YARN-2933 URL: https://issues.apache.org/jira/browse/YARN-2933 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler Reporter: Wangda Tan Assignee: Mayank Bansal Attachments: YARN-2933-1.patch, YARN-2933-2.patch, YARN-2933-3.patch, YARN-2933-4.patch Currently, we have capacity enforcement on each queue for each label in CapacityScheduler, but we don't have preemption policy to support that. YARN-2498 is targeting to support preemption respect node labels, but we have some gaps in code base, like queues/FiCaScheduler should be able to get usedResource/pendingResource, etc. by label. These items potentially need to refactor CS which we need spend some time carefully think about. For now, what immediately we can do is allow calculate ideal_allocation and preempt containers only for resources on nodes without labels, to avoid regression like: A cluster has some nodes with labels and some not, assume queueA isn't satisfied for resource without label, but for now, preemption policy may preempt resource from nodes with labels for queueA, that is not correct. Again, it is just a short-term enhancement, YARN-2498 will consider preemption respecting node-labels for Capacity Scheduler which is our final target. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2902) Killing a container that is localizing can orphan resources in the DOWNLOADING state
[ https://issues.apache.org/jira/browse/YARN-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266947#comment-14266947 ] Jason Lowe commented on YARN-2902: -- Sorry for the delay, Varun, as I was busy with end-of-year items and vacation. I'll try to get to this by the end of the week. Killing a container that is localizing can orphan resources in the DOWNLOADING state Key: YARN-2902 URL: https://issues.apache.org/jira/browse/YARN-2902 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.5.0 Reporter: Jason Lowe Assignee: Varun Saxena Fix For: 2.7.0 Attachments: YARN-2902.002.patch, YARN-2902.patch If a container is in the process of localizing when it is stopped/killed then resources are left in the DOWNLOADING state. If no other container comes along and requests these resources they linger around with no reference counts but aren't cleaned up during normal cache cleanup scans since it will never delete resources in the DOWNLOADING state even if their reference count is zero. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is minimumAllocation
[ https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266994#comment-14266994 ] Hadoop QA commented on YARN-2637: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12690430/YARN-2637.28.patch against trunk revision dd57c20. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 8 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6258//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6258//console This message is automatically generated. maximum-am-resource-percent could be violated when resource of AM is minimumAllocation Key: YARN-2637 URL: https://issues.apache.org/jira/browse/YARN-2637 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.6.0 Reporter: Wangda Tan Assignee: Craig Welch Priority: Critical Attachments: YARN-2637.0.patch, YARN-2637.1.patch, YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.15.patch, YARN-2637.16.patch, YARN-2637.17.patch, YARN-2637.18.patch, YARN-2637.19.patch, YARN-2637.2.patch, YARN-2637.20.patch, YARN-2637.21.patch, YARN-2637.22.patch, YARN-2637.23.patch, YARN-2637.25.patch, YARN-2637.26.patch, YARN-2637.27.patch, YARN-2637.28.patch, YARN-2637.6.patch, YARN-2637.7.patch, YARN-2637.9.patch Currently, the number of AMs in a leaf queue is calculated in the following way:
{code}
max_am_resource = queue_max_capacity * maximum_am_resource_percent
#max_am_number = max_am_resource / minimum_allocation
#max_am_number_for_each_user = #max_am_number * userlimit * userlimit_factor
{code}
And when a new application is submitted to the RM, it checks whether the app can be activated in the following way:
{code}
for (Iterator<FiCaSchedulerApp> i = pendingApplications.iterator(); i.hasNext();) {
  FiCaSchedulerApp application = i.next();
  // Check queue limit
  if (getNumActiveApplications() >= getMaximumActiveApplications()) {
    break;
  }
  // Check user limit
  User user = getUser(application.getUser());
  if (user.getActiveApplications() < getMaximumActiveApplicationsPerUser()) {
    user.activateApplication();
    activeApplications.add(application);
    i.remove();
    LOG.info("Application " + application.getApplicationId() +
        " from user: " + application.getUser() +
        " activated in queue: " + getQueueName());
  }
}
{code}
An example: if a queue has capacity = 1G and max_am_resource_percent = 0.2, the maximum resource that AMs can use is 200M. Assuming minimum_allocation=1M, the number of AMs that can be launched is 200, and if each AM actually uses 5M (> minimum_allocation).
All apps can still be activated, and it will occupy all resource of a queue instead of only a max_am_resource_percent of a queue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
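To make the limit in the description concrete, here is a minimal sketch of activating applications against the queue's AM resource rather than an AM count derived from minimum_allocation. It is an illustration only, simplified to memory, and {{getAMResource()}} is an assumed accessor; it is not necessarily what the attached patches do.
{code}
// Hypothetical activation check: stop activating once the sum of actual AM sizes
// would exceed queue_max_capacity * maximum_am_resource_percent.
int amLimitMB = (int) (queueMaxCapacity.getMemory() * maxAMResourcePercent);
int amUsedMB = 0;
for (Iterator<FiCaSchedulerApp> i = pendingApplications.iterator(); i.hasNext();) {
  FiCaSchedulerApp application = i.next();
  int amMB = application.getAMResource().getMemory();   // assumed accessor for the real AM size
  if (amUsedMB + amMB > amLimitMB) {
    break;  // activating this AM would violate maximum-am-resource-percent
  }
  amUsedMB += amMB;
  activeApplications.add(application);
  i.remove();
}
{code}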
[jira] [Commented] (YARN-2933) Capacity Scheduler preemption policy should only consider capacity without labels temporarily
[ https://issues.apache.org/jira/browse/YARN-2933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267008#comment-14267008 ] Wangda Tan commented on YARN-2933: -- Hi [~mayank_bansal], Thanks for updating. In Proportional...Policy, some minor comments: 1. Nobody is using {{getNodeLabels}}; it should be removed. 2. {{setNodeLabels}} is too simple to be a method; I suggest removing it too. 3. {{getNonLabeledResources}} should be private. 4. {{isLabeledContainer}} could be written like
{code}
private boolean isLabeledContainer(RMContainer c) {
  return labels.containsKey(c.getAllocatedNode());
}
{code}
to avoid traversing all keys. I suggest removing this method since it's too simple; at the very least, it should be private. In the test, the current {{testIdealAllocationForLabels}} is not correct. In your test, queueA/B have a total guaranteed *NON_LABELED* resource of 100 and they used 100 *NON_LABELED* resource, but {{NodeLabelsManager.getResourceByLabel(no-label)}} is only 80 (non-labeled used/configured resource > NodeLabelsManager.getResourceByLabel(no-label)). One thing worth taking care of: if we don't do anything about how TestPro..Policy mocks queues and applications, all used/configured capacities are *NON_LABELED* capacity. I suggest writing the test like:
{code}
@Test
public void testIdealAllocationForLabels() {
  int[][] qData = new int[][] {
      //  /   A   B
      { 80, 40, 40 }, // abs
      { 80, 80, 80 }, // maxcap
      { 80, 80,  0 }, // used
      { 70, 20, 50 }, // pending
      {  0,  0,  0 }, // reserved
      {  5,  4,  1 }, // apps
      { -1,  1,  1 }, // req granularity
      {  2,  0,  0 }, // subqueues
  };
  setAMContainer = true;
  setLabelContainer = true;
  Map<NodeId, Set<String>> labels = new HashMap<NodeId, Set<String>>();
  NodeId node = NodeId.newInstance("node1", 0);
  Set<String> labelSet = new HashSet<String>();
  labelSet.add("x");
  labels.put(node, labelSet);
  when(lm.getNodeLabels()).thenReturn(labels);
  ProportionalCapacityPreemptionPolicy policy = buildPolicy(qData);
  // Subtracting Label X resources from cluster resources
  when(lm.getResourceByLabel(anyString(), any(Resource.class))).thenReturn(
      Resources.clone(Resource.newInstance(80, 0)));
  clusterResources.setMemory(100);
  policy.editSchedule();
  // By skipping AM Container and Labeled container, all other 18 containers
  // of appD will be preempted
  verify(mDisp, times(18)).handle(argThat(new IsPreemptionRequestFor(appD)));
  // By skipping AM Container and Labeled container, all other 18 containers
  // of appC will be preempted
  verify(mDisp, times(18)).handle(argThat(new IsPreemptionRequestFor(appC)));
  // rest 4 containers from appB will be preempted
  verify(mDisp, times(4)).handle(argThat(new IsPreemptionRequestFor(appB)));
  setAMContainer = false;
  setLabelContainer = false;
}
{code}
Now the configured *NON_LABELED* resource is 80; before entering policy.editSchedule, {{clusterResources.setMemory(100);}} makes clusterResource > non-labeled-resource, and in the computation it will only consider the cluster resource to be 80 after {{getNonLabeledResources}}. And could you take a look at the findbugs warning? Thoughts?
Capacity Scheduler preemption policy should only consider capacity without labels temporarily - Key: YARN-2933 URL: https://issues.apache.org/jira/browse/YARN-2933 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler Reporter: Wangda Tan Assignee: Mayank Bansal Attachments: YARN-2933-1.patch, YARN-2933-2.patch, YARN-2933-3.patch, YARN-2933-4.patch Currently, we have capacity enforcement on each queue for each label in CapacityScheduler, but we don't have preemption policy to support that. YARN-2498 is targeting to support preemption respect node labels, but we have some gaps in code base, like queues/FiCaScheduler should be able to get usedResource/pendingResource, etc. by label. These items potentially need to refactor CS which we need spend some time carefully think about. For now, what immediately we can do is allow calculate ideal_allocation and preempt containers only for resources on nodes without labels, to avoid regression like: A cluster has some nodes with labels and some not, assume queueA isn't satisfied for resource without label, but for now, preemption policy may preempt resource from nodes with labels for queueA, that is not correct. Again, it is just a short-term enhancement, YARN-2498 will consider preemption respecting node-labels for Capacity Scheduler which is our final target. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
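To complement the review notes above, here is a rough sketch of the selection-side check the test comments rely on ("By skipping AM Container and Labeled container"): skip AM containers and containers on labeled nodes when picking preemption candidates. The method name and surrounding wiring are assumptions, not the attached patch.
{code}
// Hypothetical candidate filter: only containers on non-labeled nodes, never the AM.
private boolean isPreemptionCandidate(RMContainer c, Map<NodeId, Set<String>> labels) {
  if (c.isAMContainer()) {
    return false;                              // AM containers are skipped
  }
  if (labels.containsKey(c.getAllocatedNode())) {
    return false;                              // container runs on a labeled node
  }
  return true;
}
{code}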
[jira] [Commented] (YARN-2996) Refine some fs operations in FileSystemRMStateStore to improve performance
[ https://issues.apache.org/jira/browse/YARN-2996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267065#comment-14267065 ] Yi Liu commented on YARN-2996: -- Yes, Zhijie {quote} Good catch! It seems that MemoryRMStateStore#storeOrUpdateAMRMTokenSecretManagerState needs to be fixed too. {quote} The {{.002}} patch already includes this fix. Refine some fs operations in FileSystemRMStateStore to improve performance -- Key: YARN-2996 URL: https://issues.apache.org/jira/browse/YARN-2996 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Yi Liu Assignee: Yi Liu Attachments: YARN-2996.001.patch, YARN-2996.002.patch In {{FileSystemRMStateStore}}, we can refine some fs operations to improve performance: *1.* Several places invoke {{fs.exists}} and then {{fs.getFileStatus}}; we can merge them to save one RPC call
{code}
if (fs.exists(versionNodePath)) {
  FileStatus status = fs.getFileStatus(versionNodePath);
{code}
*2.*
{code}
protected void updateFile(Path outputPath, byte[] data) throws Exception {
  Path newPath = new Path(outputPath.getParent(), outputPath.getName() + ".new");
  // use writeFile to make sure .new file is created atomically
  writeFile(newPath, data);
  replaceFile(newPath, outputPath);
}
{code}
The {{updateFile}} method is not efficient either: it writes the file to _output\_file_.tmp, renames it to _output\_file_.new, and then renames that to _output\_file_; we can eliminate one rename operation. Also there is one unnecessary import, which we can remove. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
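As a small sketch of point *1.* above, the helper below replaces the exists-then-getFileStatus pair with a single call, treating {{FileNotFoundException}} as "the path is absent"; the helper name is illustrative.
{code}
// One RPC instead of two: FileSystem.getFileStatus throws FileNotFoundException
// when the path does not exist, so the separate exists() probe can be dropped.
private FileStatus getFileStatusIfExists(FileSystem fs, Path path) throws IOException {
  try {
    return fs.getFileStatus(path);
  } catch (FileNotFoundException e) {
    return null;  // caller treats null the same as "versionNodePath does not exist"
  }
}
{code}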
[jira] [Commented] (YARN-2936) YARNDelegationTokenIdentifier doesn't set proto.builder now
[ https://issues.apache.org/jira/browse/YARN-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267077#comment-14267077 ] Varun Saxena commented on YARN-2936: Oh you mean, the newly added test code passes even without the change. Will look at it and change test accordingly. YARNDelegationTokenIdentifier doesn't set proto.builder now --- Key: YARN-2936 URL: https://issues.apache.org/jira/browse/YARN-2936 Project: Hadoop YARN Issue Type: Bug Reporter: Zhijie Shen Assignee: Varun Saxena Attachments: YARN-2936.001.patch, YARN-2936.002.patch, YARN-2936.003.patch, YARN-2936.004.patch, YARN-2936.005.patch After YARN-2743, the setters are removed from YARNDelegationTokenIdentifier, such that when constructing a object which extends YARNDelegationTokenIdentifier, proto.builder is not set at all. Later on, when we call getProto() of it, we will just get an empty proto object. It seems to do no harm to the production code path, as we will always call getBytes() before using proto to persist the DT in the state store, when we generating the password. I think the setter is removed to avoid duplicating setting the fields why getBytes() is called. However, YARNDelegationTokenIdentifier doesn't work properly alone. YARNDelegationTokenIdentifier is tightly coupled with the logic in secretManager. It's vulnerable if something is changed at secretManager. For example, in the test case of YARN-2837, I spent time to figure out we need to execute getBytes() first to make sure the testing DTs can be properly put into the state store. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
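As an illustration of the coupling described in this issue (based on the description above, not on the patches), the snippet below shows why callers currently have to invoke {{getBytes()}} before {{getProto()}}; the constructor arguments are arbitrary and the exact method visibility is assumed.
{code}
// Until getBytes() serializes the identifier, the proto builder has no fields set,
// so getProto() yields an effectively empty proto object.
RMDelegationTokenIdentifier id = new RMDelegationTokenIdentifier(
    new Text("owner"), new Text("renewer"), new Text("realUser"));
byte[] before = id.getProto().toByteArray();   // empty: fields not yet copied into the builder
id.getBytes();                                 // serializing populates the proto fields
byte[] after = id.getProto().toByteArray();    // now reflects owner/renewer/realUser
{code}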
[jira] [Updated] (YARN-2213) Change proxy-user cookie log in AmIpFilter to DEBUG
[ https://issues.apache.org/jira/browse/YARN-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-2213: --- Attachment: (was: YARN-2213.patch) Change proxy-user cookie log in AmIpFilter to DEBUG --- Key: YARN-2213 URL: https://issues.apache.org/jira/browse/YARN-2213 Project: Hadoop YARN Issue Type: Task Reporter: Ted Yu Assignee: Varun Saxena Priority: Minor Fix For: 2.7.0 I saw a lot of the following lines in AppMaster log: {code} 14/06/24 17:12:36 WARN web.SliderAmIpFilter: Could not find proxy-user cookie, so user will not be set 14/06/24 17:12:39 WARN web.SliderAmIpFilter: Could not find proxy-user cookie, so user will not be set 14/06/24 17:12:39 WARN web.SliderAmIpFilter: Could not find proxy-user cookie, so user will not be set {code} For long running app, this would consume considerable log space. Log level should be changed to DEBUG. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3003) Provide API for client to retrieve label to node mapping
[ https://issues.apache.org/jira/browse/YARN-3003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267108#comment-14267108 ] Naganarasimha G R commented on YARN-3003: - Hi [~leftnoteasy], Yes, my idea was to transpose: since a given node can belong to multiple labels, getting the node-to-label mapping and then transposing it would be better, as a node-to-label mapping data structure already exists. Further, would it be better to support a label expression and return the list of nodes matching it, rather than taking a set of labels as input? Maybe [~yuzhih...@gmail.com] can describe more about the scenario or intended usage of this interface? Provide API for client to retrieve label to node mapping Key: YARN-3003 URL: https://issues.apache.org/jira/browse/YARN-3003 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Ted Yu Assignee: Varun Saxena Currently YarnClient#getNodeToLabels() returns the mapping from NodeId to set of labels associated with the node. Client (such as Slider) may be interested in label to node mapping - given label, return the nodes with this label. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
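A minimal sketch of the transposition idea discussed above: invert the existing node-to-labels map into a labels-to-nodes map (a node may carry several labels, so one node can appear under multiple keys). Variable names are illustrative.
{code}
// Build label -> nodes from the existing NodeId -> labels mapping.
Map<String, Set<NodeId>> labelsToNodes = new HashMap<String, Set<NodeId>>();
for (Map.Entry<NodeId, Set<String>> entry : nodeToLabels.entrySet()) {
  for (String label : entry.getValue()) {
    Set<NodeId> nodes = labelsToNodes.get(label);
    if (nodes == null) {
      nodes = new HashSet<NodeId>();
      labelsToNodes.put(label, nodes);
    }
    nodes.add(entry.getKey());
  }
}
{code}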
[jira] [Commented] (YARN-2427) Add support for moving apps between queues in RM web services
[ https://issues.apache.org/jira/browse/YARN-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266891#comment-14266891 ] Zhijie Shen commented on YARN-2427: --- bq. If you feel strongly about it, I can remove it. I realize we have done the similar thing for app state endpoint. Let's keep to it for app queue. The new patch looks good to me. Will commit it. Add support for moving apps between queues in RM web services - Key: YARN-2427 URL: https://issues.apache.org/jira/browse/YARN-2427 Project: Hadoop YARN Issue Type: New Feature Components: resourcemanager Reporter: Varun Vasudev Assignee: Varun Vasudev Attachments: apache-yarn-2427.0.patch, apache-yarn-2427.1.patch, apache-yarn-2427.2.patch, apache-yarn-2427.3.patch, apache-yarn-2427.4.patch Support for moving apps from one queue to another is now present in CapacityScheduler and FairScheduler. We should expose the functionality via RM web services as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2423) TimelineClient should wrap all GET APIs to facilitate Java users
[ https://issues.apache.org/jira/browse/YARN-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kanter updated YARN-2423: Attachment: YARN-2423.006.patch 006 patch is rebased on latest trunk TimelineClient should wrap all GET APIs to facilitate Java users Key: YARN-2423 URL: https://issues.apache.org/jira/browse/YARN-2423 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Robert Kanter Attachments: YARN-2423.004.patch, YARN-2423.005.patch, YARN-2423.006.patch, YARN-2423.patch, YARN-2423.patch, YARN-2423.patch TimelineClient provides the Java method to put timeline entities. It's also good to wrap over all GET APIs (both entity and domain), and deserialize the json response into Java POJO objects. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1680) availableResources sent to applicationMaster in heartbeat should exclude blacklistedNodes free memory.
[ https://issues.apache.org/jira/browse/YARN-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266877#comment-14266877 ] Jason Lowe commented on YARN-1680: -- bq. It is possible that the app asks for some labeled nodes in its ResourceRequest but some of them have already been blocked listed by cluster. Yes, agreed. However it would be very useful to have a patch that just fixes the blacklisted node case in the interim since many clusters (most at this point) are not using labels. If it's easy to add label consideration into this then go for it. Otherwise I think it would be better to make incremental steps by fixing the existing issue of blacklisted nodes and address the label issue in a separate JIRA. availableResources sent to applicationMaster in heartbeat should exclude blacklistedNodes free memory. -- Key: YARN-1680 URL: https://issues.apache.org/jira/browse/YARN-1680 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.2.0, 2.3.0 Environment: SuSE 11 SP2 + Hadoop-2.3 Reporter: Rohith Assignee: Chen He Attachments: YARN-1680-WIP.patch, YARN-1680-v2.patch, YARN-1680-v2.patch, YARN-1680.patch There are 4 NodeManagers with 8GB each.Total cluster capacity is 32GB.Cluster slow start is set to 1. Job is running reducer task occupied 29GB of cluster.One NodeManager(NM-4) is become unstable(3 Map got killed), MRAppMaster blacklisted unstable NodeManager(NM-4). All reducer task are running in cluster now. MRAppMaster does not preempt the reducers because for Reducer preemption calculation, headRoom is considering blacklisted nodes memory. This makes jobs to hang forever(ResourceManager does not assing any new containers on blacklisted nodes but returns availableResouce considers cluster free memory). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
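As a hedged sketch of the interim fix discussed above (blacklisted nodes only, no label handling), the snippet subtracts the free resources of the application's blacklisted nodes from the headroom before it is sent in the heartbeat; the {{blacklistedNodesForApp}} collection is an assumption.
{code}
// Hypothetical adjustment before reporting headroom to the AM: resources that are
// free only on nodes this app has blacklisted are not really available to it.
Resource usable = Resources.clone(headroom);
for (SchedulerNode node : blacklistedNodesForApp) {        // assumed collection
  Resources.subtractFrom(usable, node.getAvailableResource());
}
// clamp at zero so the AM never sees negative headroom
if (usable.getMemory() < 0) {
  usable.setMemory(0);
}
if (usable.getVirtualCores() < 0) {
  usable.setVirtualCores(0);
}
{code}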
[jira] [Updated] (YARN-3002) YARN documentation needs updating post-shell rewrite
[ https://issues.apache.org/jira/browse/YARN-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated YARN-3002: --- Attachment: YARN-3002-01.patch -01: * Make these documents consistent with and reference HADOOP-10908 appropriately. * Add quite a few missing commands and options. :( * Style fixes here and there * Alphabetize the subcommands YARN documentation needs updating post-shell rewrite Key: YARN-3002 URL: https://issues.apache.org/jira/browse/YARN-3002 Project: Hadoop YARN Issue Type: Bug Components: documentation Affects Versions: 3.0.0 Reporter: Allen Wittenauer Attachments: YARN-3002-00.patch, YARN-3002-01.patch After HADOOP-9902, the YARN documentation is out of date. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2556) Tool to measure the performance of the timeline server
[ https://issues.apache.org/jira/browse/YARN-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266866#comment-14266866 ] Chen He commented on YARN-2556: --- Current benchmark only contains basic Timelineserver write / sec. Do we need to add more? Tool to measure the performance of the timeline server -- Key: YARN-2556 URL: https://issues.apache.org/jira/browse/YARN-2556 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Jonathan Eagles Assignee: chang li Attachments: YARN-2556-WIP.patch, YARN-2556-WIP.patch, yarn2556.patch, yarn2556.patch, yarn2556_wip.patch We need to be able to understand the capacity model for the timeline server to give users the tools they need to deploy a timeline server with the correct capacity. I propose we create a mapreduce job that can measure timeline server write and read performance. Transactions per second, I/O for both read and write would be a good start. This could be done as an example or test job that could be tied into gridmix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
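For reference, a bare-bones write-throughput probe could look like the sketch below, assuming the standard {{TimelineClient}} Java API; entity type, ids and the entity count are arbitrary benchmark values, and this is not the attached tool.
{code}
// Put N small entities and report puts per second against the timeline server.
public static void main(String[] args) throws Exception {
  TimelineClient client = TimelineClient.createTimelineClient();
  client.init(new YarnConfiguration());
  client.start();
  try {
    int numEntities = 10000;
    long start = System.currentTimeMillis();
    for (int i = 0; i < numEntities; i++) {
      TimelineEntity entity = new TimelineEntity();
      entity.setEntityType("BENCHMARK");
      entity.setEntityId("entity_" + i);
      entity.setStartTime(System.currentTimeMillis());
      client.putEntities(entity);
    }
    long elapsedMs = System.currentTimeMillis() - start;
    System.out.println("writes/sec = " + (numEntities * 1000.0 / elapsedMs));
  } finally {
    client.stop();
  }
}
{code}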
[jira] [Commented] (YARN-2936) YARNDelegationTokenIdentifier doesn't set proto.builder now
[ https://issues.apache.org/jira/browse/YARN-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266930#comment-14266930 ] Jian He commented on YARN-2936: --- sorry for confusion, I meant the core/production code (not test code) change. YARNDelegationTokenIdentifier doesn't set proto.builder now --- Key: YARN-2936 URL: https://issues.apache.org/jira/browse/YARN-2936 Project: Hadoop YARN Issue Type: Bug Reporter: Zhijie Shen Assignee: Varun Saxena Attachments: YARN-2936.001.patch, YARN-2936.002.patch, YARN-2936.003.patch, YARN-2936.004.patch, YARN-2936.005.patch After YARN-2743, the setters are removed from YARNDelegationTokenIdentifier, such that when constructing a object which extends YARNDelegationTokenIdentifier, proto.builder is not set at all. Later on, when we call getProto() of it, we will just get an empty proto object. It seems to do no harm to the production code path, as we will always call getBytes() before using proto to persist the DT in the state store, when we generating the password. I think the setter is removed to avoid duplicating setting the fields why getBytes() is called. However, YARNDelegationTokenIdentifier doesn't work properly alone. YARNDelegationTokenIdentifier is tightly coupled with the logic in secretManager. It's vulnerable if something is changed at secretManager. For example, in the test case of YARN-2837, I spent time to figure out we need to execute getBytes() first to make sure the testing DTs can be properly put into the state store. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2848) (FICA) Applications should maintain an application specific 'cluster' resource to calculate headroom and userlimit
[ https://issues.apache.org/jira/browse/YARN-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266590#comment-14266590 ] Chen He commented on YARN-2848: --- I guess the label is provide by users or applications to choose what nodes to run. The Blacklist is detected by system that what nodes are not stable to run. The blacklisted nodes could be regarded as a special label or NOT label. However, we need extra synchronization process to keep the consistency of users/apps requests and unstable nodes before making scheduling decision. YARN-1680 could be a solution before we actually settle down the label scope and the synchronization overhead issue. (FICA) Applications should maintain an application specific 'cluster' resource to calculate headroom and userlimit -- Key: YARN-2848 URL: https://issues.apache.org/jira/browse/YARN-2848 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler Reporter: Craig Welch Assignee: Craig Welch Likely solutions to [YARN-1680] (properly handling node and rack blacklisting with cluster level node additions and removals) will entail managing an application-level slice of the cluster resource available to the application for use in accurately calculating the application headroom and user limit. There is an assumption that events which impact this resource will occur less frequently than the need to calculate headroom, userlimit, etc (which is a valid assumption given that occurs per-allocation heartbeat). Given that, the application should (with assistance from cluster-level code...) detect changes to the composition of the cluster (node addition, removal) and when those have occurred, calculate an application specific cluster resource by comparing cluster nodes to it's own blacklist (both rack and individual node). I think it makes sense to include nodelabel considerations into this calculation as it will be efficient to do both at the same time and the single resource value reflecting both constraints could then be used for efficient frequent headroom and userlimit calculations while remaining highly accurate. The application would need to be made aware of nodelabel changes it is interested in (the application or removal of labels of interest to the application to/from nodes). For this purpose, the application submissions's nodelabel expression would be used to determine the nodelabel impact on the resource used to calculate userlimit and headroom (Cases where the application elected to request resources not using the application level label expression are out of scope for this - but for the common usecase of an application which uses a particular expression throughout, userlimit and headroom would be accurate) This could also provide an overall mechanism for handling application-specific resource constraints which might be added in the future. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
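A rough sketch of the application-specific 'cluster' resource described above, recomputed only when the node set or the app's blacklist changes and then reused for frequent headroom/userlimit calculations; all names here are hypothetical.
{code}
// Cached per-application view of the cluster, excluding blacklisted hosts/racks.
private Resource appClusterResource = Resources.createResource(0);

void recomputeAppClusterResource(Collection<SchedulerNode> nodes,
    Set<String> blacklistedHosts, Set<String> blacklistedRacks) {
  Resource total = Resources.createResource(0);
  for (SchedulerNode node : nodes) {
    if (blacklistedHosts.contains(node.getNodeName())
        || blacklistedRacks.contains(node.getRackName())) {
      continue;  // this application cannot be placed on these nodes
    }
    Resources.addTo(total, node.getTotalResource());
  }
  appClusterResource = total;  // consumed by headroom/userlimit code until the next change
}
{code}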
[jira] [Updated] (YARN-2556) Tool to measure the performance of the timeline server
[ https://issues.apache.org/jira/browse/YARN-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated YARN-2556: -- Target Version/s: 2.7.0 (was: 2.6.0) Tool to measure the performance of the timeline server -- Key: YARN-2556 URL: https://issues.apache.org/jira/browse/YARN-2556 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Jonathan Eagles Assignee: chang li Attachments: YARN-2556-WIP.patch, YARN-2556-WIP.patch, yarn2556.patch, yarn2556.patch, yarn2556_wip.patch We need to be able to understand the capacity model for the timeline server to give users the tools they need to deploy a timeline server with the correct capacity. I propose we create a mapreduce job that can measure timeline server write and read performance. Transactions per second, I/O for both read and write would be a good start. This could be done as an example or test job that could be tied into gridmix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2997) NM keeps sending finished containers to RM until app is finished
[ https://issues.apache.org/jira/browse/YARN-2997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267210#comment-14267210 ] Chengbing Liu commented on YARN-2997: - Once got an RESYNC, NM calls {{getNMContainerStatuses}}, which will loop over all containers in the NM context, remove those whose app is not in the NM context, finally report to RM. The method {{getNMContainerStatuses}} remains unchanged before and after this patch. The logic of removing containers from context is also unchanged. From a different viewpoint, {{pendingCompletedContainers}} contains the following: * completed containers, whose app is stopped, and the container is removed from the NM context. * completed containers, whose app is NOT stopped (which implies their apps are in the NM context), and the container is NOT removed from the NM context. The first kind will not be reported to RM since they are not in the NM context, so they will not be looped. The second kind will be reported to RM since they are in the NM context, and their apps must be in the NM context. Finally, the changes of this patch can be summarized as follows: * Does not send finished container statuses repeatedly for running application * Send completed container statuses again in case of lost heartbeat (normal heartbeat, not RESYNC) I hope this will clarify your doubts. NM keeps sending finished containers to RM until app is finished Key: YARN-2997 URL: https://issues.apache.org/jira/browse/YARN-2997 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.6.0 Reporter: Chengbing Liu Assignee: Chengbing Liu Attachments: YARN-2997.2.patch, YARN-2997.3.patch, YARN-2997.4.patch, YARN-2997.patch We have seen in RM log a lot of {quote} INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Null container completed... {quote} It is caused by NM sending completed containers repeatedly until the app is finished. On the RM side, the container is already released, hence {{getRMContainer}} returns null. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
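The bookkeeping described above might look roughly like the sketch below; {{pendingCompletedContainers}} is the map named in the comment, while the helper and callback names are assumptions rather than the exact patch.
{code}
// Completed containers are reported until the RM acknowledges them, then dropped,
// so statuses are not re-sent on every heartbeat for a still-running application
// but are retried if a heartbeat is lost.
private final Map<ContainerId, ContainerStatus> pendingCompletedContainers =
    new HashMap<ContainerId, ContainerStatus>();

List<ContainerStatus> getContainerStatusesForHeartbeat() {
  for (ContainerStatus status : getFinishedContainersInContext()) {   // assumed helper
    pendingCompletedContainers.put(status.getContainerId(), status);
  }
  // anything still pending is (re-)sent; a lost heartbeat just means it goes out again
  return new ArrayList<ContainerStatus>(pendingCompletedContainers.values());
}

void onHeartbeatAcknowledged() {            // assumed callback on a successful response
  pendingCompletedContainers.clear();
}
{code}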
[jira] [Commented] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is minimumAllocation
[ https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267219#comment-14267219 ] Hadoop QA commented on YARN-2637: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12690467/YARN-2637.29.patch against trunk revision 788ee35. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 8 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6264//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6264//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6264//console This message is automatically generated. maximum-am-resource-percent could be violated when resource of AM is minimumAllocation Key: YARN-2637 URL: https://issues.apache.org/jira/browse/YARN-2637 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.6.0 Reporter: Wangda Tan Assignee: Craig Welch Priority: Critical Attachments: YARN-2637.0.patch, YARN-2637.1.patch, YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.15.patch, YARN-2637.16.patch, YARN-2637.17.patch, YARN-2637.18.patch, YARN-2637.19.patch, YARN-2637.2.patch, YARN-2637.20.patch, YARN-2637.21.patch, YARN-2637.22.patch, YARN-2637.23.patch, YARN-2637.25.patch, YARN-2637.26.patch, YARN-2637.27.patch, YARN-2637.28.patch, YARN-2637.29.patch, YARN-2637.6.patch, YARN-2637.7.patch, YARN-2637.9.patch Currently, number of AM in leaf queue will be calculated in following way: {code} max_am_resource = queue_max_capacity * maximum_am_resource_percent #max_am_number = max_am_resource / minimum_allocation #max_am_number_for_each_user = #max_am_number * userlimit * userlimit_factor {code} And when submit new application to RM, it will check if an app can be activated in following way: {code} for (IteratorFiCaSchedulerApp i=pendingApplications.iterator(); i.hasNext(); ) { FiCaSchedulerApp application = i.next(); // Check queue limit if (getNumActiveApplications() = getMaximumActiveApplications()) { break; } // Check user limit User user = getUser(application.getUser()); if (user.getActiveApplications() getMaximumActiveApplicationsPerUser()) { user.activateApplication(); activeApplications.add(application); i.remove(); LOG.info(Application + application.getApplicationId() + from user: + application.getUser() + activated in queue: + getQueueName()); } } {code} An example is, If a queue has capacity = 1G, max_am_resource_percent = 0.2, the maximum resource that AM can use is 200M, assuming minimum_allocation=1M, #am can be launched is 200, and if user uses 5M for each AM ( minimum_allocation). 
All apps can still be activated, and it will occupy all resource of a queue instead of only a max_am_resource_percent of a queue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2428) LCE default banned user list should have yarn
[ https://issues.apache.org/jira/browse/YARN-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-2428: --- Attachment: YARN-2428.001.patch LCE default banned user list should have yarn - Key: YARN-2428 URL: https://issues.apache.org/jira/browse/YARN-2428 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: Allen Wittenauer Assignee: Varun Saxena Priority: Trivial Labels: newbie Attachments: YARN-2428.001.patch When task-controller was retrofitted to YARN, the default banned user list didn't add yarn. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3003) Provide API for client to retrieve label to node mapping
[ https://issues.apache.org/jira/browse/YARN-3003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267125#comment-14267125 ] Varun Saxena commented on YARN-3003: [~Naganarasimha], the thing you are looking at is a Bidirectional Map. I think Guava has such functionality. Will explore and update once I start working on this issue. Provide API for client to retrieve label to node mapping Key: YARN-3003 URL: https://issues.apache.org/jira/browse/YARN-3003 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Ted Yu Assignee: Varun Saxena Currently YarnClient#getNodeToLabels() returns the mapping from NodeId to set of labels associated with the node. Client (such as Slider) may be interested in label to node mapping - given label, return the nodes with this label. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2213) Change proxy-user cookie log in AmIpFilter to DEBUG
[ https://issues.apache.org/jira/browse/YARN-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267159#comment-14267159 ] Hadoop QA commented on YARN-2213: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12690459/YARN-2213.001.patch against trunk revision 788ee35. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6262//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6262//console This message is automatically generated. Change proxy-user cookie log in AmIpFilter to DEBUG --- Key: YARN-2213 URL: https://issues.apache.org/jira/browse/YARN-2213 Project: Hadoop YARN Issue Type: Task Reporter: Ted Yu Assignee: Varun Saxena Priority: Minor Attachments: YARN-2213.001.patch I saw a lot of the following lines in AppMaster log: {code} 14/06/24 17:12:36 WARN web.SliderAmIpFilter: Could not find proxy-user cookie, so user will not be set 14/06/24 17:12:39 WARN web.SliderAmIpFilter: Could not find proxy-user cookie, so user will not be set 14/06/24 17:12:39 WARN web.SliderAmIpFilter: Could not find proxy-user cookie, so user will not be set {code} For long running app, this would consume considerable log space. Log level should be changed to DEBUG. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3011) NM dies because of the failure of resource localization
Wang Hao created YARN-3011: -- Summary: NM dies because of the failure of resource localization Key: YARN-3011 URL: https://issues.apache.org/jira/browse/YARN-3011 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.5.1 Reporter: Wang Hao NM dies because of IllegalArgumentException when localize resource. 2014-12-29 13:43:58,699 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ hdfs://hadoop002.dx.momo.com:8020/user/hadoop/share/lib/oozie/json-simple-1.1.jar, 1416997035456, FILE, null } 2014-12-29 13:43:58,699 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ hdfs://hadoop002.dx.momo.com:8020/user/hive/src/final_test_ooize/test_ooize_job1.sql/, 1419831474153, FILE, null } 2014-12-29 13:43:58,701 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread java.lang.IllegalArgumentException: Can not create a Path from an empty string at org.apache.hadoop.fs.Path.checkPathArg(Path.java:127) at org.apache.hadoop.fs.Path.init(Path.java:135) at org.apache.hadoop.fs.Path.init(Path.java:94) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl.getPathForLocalization(LocalResourcesTrackerImpl.java:420) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.addResource(ResourceLocalizationService.java:758) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:672) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:614) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) at java.lang.Thread.run(Thread.java:745) 2014-12-29 13:43:58,701 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Initializing user hadoop 2014-12-29 13:43:58,702 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye.. 2014-12-29 13:43:58,704 INFO org.apache.hadoop.mapred.ShuffleHandler: Setting connection close header... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-3011) NM dies because of the failure of resource localization
[ https://issues.apache.org/jira/browse/YARN-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena reassigned YARN-3011: -- Assignee: Varun Saxena NM dies because of the failure of resource localization --- Key: YARN-3011 URL: https://issues.apache.org/jira/browse/YARN-3011 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.5.1 Reporter: Wang Hao Assignee: Varun Saxena NM dies because of IllegalArgumentException when localize resource. 2014-12-29 13:43:58,699 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ hdfs://hadoop002.dx.momo.com:8020/user/hadoop/share/lib/oozie/json-simple-1.1.jar, 1416997035456, FILE, null } 2014-12-29 13:43:58,699 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ hdfs://hadoop002.dx.momo.com:8020/user/hive/src/final_test_ooize/test_ooize_job1.sql/, 1419831474153, FILE, null } 2014-12-29 13:43:58,701 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread java.lang.IllegalArgumentException: Can not create a Path from an empty string at org.apache.hadoop.fs.Path.checkPathArg(Path.java:127) at org.apache.hadoop.fs.Path.init(Path.java:135) at org.apache.hadoop.fs.Path.init(Path.java:94) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl.getPathForLocalization(LocalResourcesTrackerImpl.java:420) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.addResource(ResourceLocalizationService.java:758) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:672) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:614) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) at java.lang.Thread.run(Thread.java:745) 2014-12-29 13:43:58,701 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Initializing user hadoop 2014-12-29 13:43:58,702 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye.. 2014-12-29 13:43:58,704 INFO org.apache.hadoop.mapred.ShuffleHandler: Setting connection close header... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is minimumAllocation
[ https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Craig Welch updated YARN-2637: -- Attachment: YARN-2637.29.patch Take a go adding user am limit also (needs further verification/test), see test impact maximum-am-resource-percent could be violated when resource of AM is minimumAllocation Key: YARN-2637 URL: https://issues.apache.org/jira/browse/YARN-2637 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.6.0 Reporter: Wangda Tan Assignee: Craig Welch Priority: Critical Attachments: YARN-2637.0.patch, YARN-2637.1.patch, YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.15.patch, YARN-2637.16.patch, YARN-2637.17.patch, YARN-2637.18.patch, YARN-2637.19.patch, YARN-2637.2.patch, YARN-2637.20.patch, YARN-2637.21.patch, YARN-2637.22.patch, YARN-2637.23.patch, YARN-2637.25.patch, YARN-2637.26.patch, YARN-2637.27.patch, YARN-2637.28.patch, YARN-2637.29.patch, YARN-2637.6.patch, YARN-2637.7.patch, YARN-2637.9.patch Currently, number of AM in leaf queue will be calculated in following way: {code} max_am_resource = queue_max_capacity * maximum_am_resource_percent #max_am_number = max_am_resource / minimum_allocation #max_am_number_for_each_user = #max_am_number * userlimit * userlimit_factor {code} And when submit new application to RM, it will check if an app can be activated in following way: {code} for (IteratorFiCaSchedulerApp i=pendingApplications.iterator(); i.hasNext(); ) { FiCaSchedulerApp application = i.next(); // Check queue limit if (getNumActiveApplications() = getMaximumActiveApplications()) { break; } // Check user limit User user = getUser(application.getUser()); if (user.getActiveApplications() getMaximumActiveApplicationsPerUser()) { user.activateApplication(); activeApplications.add(application); i.remove(); LOG.info(Application + application.getApplicationId() + from user: + application.getUser() + activated in queue: + getQueueName()); } } {code} An example is, If a queue has capacity = 1G, max_am_resource_percent = 0.2, the maximum resource that AM can use is 200M, assuming minimum_allocation=1M, #am can be launched is 200, and if user uses 5M for each AM ( minimum_allocation). All apps can still be activated, and it will occupy all resource of a queue instead of only a max_am_resource_percent of a queue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2936) YARNDelegationTokenIdentifier doesn't set proto.builder now
[ https://issues.apache.org/jira/browse/YARN-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267153#comment-14267153 ] Hadoop QA commented on YARN-2936: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12690457/YARN-2936.006.patch against trunk revision 788ee35. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6261//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6261//console This message is automatically generated. YARNDelegationTokenIdentifier doesn't set proto.builder now --- Key: YARN-2936 URL: https://issues.apache.org/jira/browse/YARN-2936 Project: Hadoop YARN Issue Type: Bug Reporter: Zhijie Shen Assignee: Varun Saxena Attachments: YARN-2936.001.patch, YARN-2936.002.patch, YARN-2936.003.patch, YARN-2936.004.patch, YARN-2936.005.patch, YARN-2936.006.patch After YARN-2743, the setters are removed from YARNDelegationTokenIdentifier, such that when constructing a object which extends YARNDelegationTokenIdentifier, proto.builder is not set at all. Later on, when we call getProto() of it, we will just get an empty proto object. It seems to do no harm to the production code path, as we will always call getBytes() before using proto to persist the DT in the state store, when we generating the password. I think the setter is removed to avoid duplicating setting the fields why getBytes() is called. However, YARNDelegationTokenIdentifier doesn't work properly alone. YARNDelegationTokenIdentifier is tightly coupled with the logic in secretManager. It's vulnerable if something is changed at secretManager. For example, in the test case of YARN-2837, I spent time to figure out we need to execute getBytes() first to make sure the testing DTs can be properly put into the state store. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3011) NM dies because of the failure of resource localization
[ https://issues.apache.org/jira/browse/YARN-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267246#comment-14267246 ] Wang Hao commented on YARN-3011: I submitted a job to Oozie. In my workflow.xml, the value of the script tag ends with '/' by mistake:
<workflow-app xmlns="uri:oozie:workflow:0.2" name="hive-wf">
  <start to="create_hive"/>
  <action name="create_hive">
    <hive xmlns="uri:oozie:hive-action:0.2">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <configuration>
        <property>
          <name>oozie.action.sharelib.for.hive</name>
          <value>hive2</value>
        </property>
        <property>
          <name>oozie.launcher.action.main.class</name>
          <value>org.apache.oozie.action.hadoop.Hive2Main</value>
        </property>
        <property>
          <name>mapreduce.job.queuename</name>
          <value>${queueName}</value>
        </property>
      </configuration>
      <script>test_ooize_job1.sql/</script>
      <param>hivevar:dbname=offline</param>
      <param>hivevar:partition_date=20141228</param>
    </hive>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Hive failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <end name="end"/>
</workflow-app>
When the NM localized the resource, the file test_ooize_job1.sql/ caused an exception in the getPathForLocalization function of LocalResourcesTrackerImpl. In getPathForLocalization, when the Path is created, the second parameter ends up empty: {{Path localPath = new Path(rPath, req.getPath().getName());}} Finally, the exception causes the AsyncDispatcher to shut down the JVM. So I think we should handle this exception; otherwise it can cause lots of NMs to die. NM dies because of the failure of resource localization --- Key: YARN-3011 URL: https://issues.apache.org/jira/browse/YARN-3011 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.5.1 Reporter: Wang Hao Assignee: Varun Saxena NM dies because of IllegalArgumentException when localizing a resource.
2014-12-29 13:43:58,699 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ hdfs://hadoop002.dx.momo.com:8020/user/hadoop/share/lib/oozie/json-simple-1.1.jar, 1416997035456, FILE, null } 2014-12-29 13:43:58,699 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ hdfs://hadoop002.dx.momo.com:8020/user/hive/src/final_test_ooize/test_ooize_job1.sql/, 1419831474153, FILE, null } 2014-12-29 13:43:58,701 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread java.lang.IllegalArgumentException: Can not create a Path from an empty string at org.apache.hadoop.fs.Path.checkPathArg(Path.java:127) at org.apache.hadoop.fs.Path.init(Path.java:135) at org.apache.hadoop.fs.Path.init(Path.java:94) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl.getPathForLocalization(LocalResourcesTrackerImpl.java:420) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.addResource(ResourceLocalizationService.java:758) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:672) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:614) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) at java.lang.Thread.run(Thread.java:745) 2014-12-29 13:43:58,701 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Initializing user hadoop 2014-12-29 13:43:58,702 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye.. 2014-12-29 13:43:58,704 INFO org.apache.hadoop.mapred.ShuffleHandler: Setting connection close header... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
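Following the analysis in the comment above, a defensive check in the localization path could look like the sketch below: validate the resource name before constructing the Path and fail only that resource, instead of letting the IllegalArgumentException escape and stop the AsyncDispatcher. The failure event and surrounding wiring are assumptions, not the eventual fix.
{code}
// Hypothetical guard: a request like ".../test_ooize_job1.sql/" has an empty getName(),
// and new Path(parent, "") throws IllegalArgumentException inside the dispatcher thread.
String name = req.getPath().getName();
if (name.isEmpty()) {
  LOG.error("Invalid resource path (trailing '/'): " + req.getPath());
  // assumed event: mark just this resource as failed so the container fails cleanly
  tracker.handle(new ResourceFailedLocalizationEvent(req,
      "Resource path ends with '/': " + req.getPath()));
} else {
  Path localPath = new Path(rPath, name);
  // ... continue with normal localization
}
{code}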
[jira] [Commented] (YARN-2807) Option --forceactive not works as described in usage of yarn rmadmin -transitionToActive
[ https://issues.apache.org/jira/browse/YARN-2807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267329#comment-14267329 ] Hadoop QA commented on YARN-2807: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12690300/YARN-2807.2.patch against trunk revision 788ee35. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common: org.apache.hadoop.security.ssl.TestReloadingX509TrustManager Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6265//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6265//console This message is automatically generated. Option --forceactive not works as described in usage of yarn rmadmin -transitionToActive Key: YARN-2807 URL: https://issues.apache.org/jira/browse/YARN-2807 Project: Hadoop YARN Issue Type: Sub-task Components: documentation, resourcemanager Reporter: Wangda Tan Assignee: Masatake Iwasaki Priority: Minor Attachments: YARN-2807.1.patch, YARN-2807.2.patch Currently the help message of yarn rmadmin -transitionToActive is: {code} transitionToActive: incorrect number of arguments Usage: HAAdmin [-transitionToActive serviceId [--forceactive]] {code} But the --forceactive not works as expected. When transition RM state with --forceactive: {code} yarn rmadmin -transitionToActive rm2 --forceactive Automatic failover is enabled for org.apache.hadoop.yarn.client.RMHAServiceTarget@64c9f31e Refusing to manually manage HA state, since it may cause a split-brain scenario or other incorrect state. If you are very sure you know what you are doing, please specify the forcemanual flag. {code} As shown above, we still cannot transitionToActive when automatic failover is enabled with --forceactive. The option can work is: {{--forcemanual}}, there's no place in usage describes this option. I think we should fix this -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3009) TimelineWebServices always parses primary and secondary filters as numbers if first char is a number
[ https://issues.apache.org/jira/browse/YARN-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267275#comment-14267275 ] Naganarasimha G R commented on YARN-3009: - Hi [~cwensel] Took a look @ code and test cases. Seems like its not a issue, if the filter value is placed within double quotes then its expected to be read as a string, if not it will read as numerical object itself (refer {{TestTimelineWebServices.testPrimaryFilterNumericString() testPrimaryFilterNumericStringWithQuotes()}} ) May be you can share the URL which you are using to store and accessing the timeline entities through webservice, which can help in validating this issue further TimelineWebServices always parses primary and secondary filters as numbers if first char is a number Key: YARN-3009 URL: https://issues.apache.org/jira/browse/YARN-3009 Project: Hadoop YARN Issue Type: Bug Components: timelineserver Affects Versions: 2.4.0 Reporter: Chris K Wensel Assignee: Naganarasimha G R If you pass a filter value that starts with a number (7CCA...), the filter value will be parsed into the Number '7' causing the filter to fail the search. Should be noted the actual value as stored via a PUT operation is properly parsed and stored as a String. This manifests as a very hard to identify issue with DAGClient in Apache Tez and naming dags/vertices with alphanumeric guid values. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2807) Option --forceactive not works as described in usage of yarn rmadmin -transitionToActive
[ https://issues.apache.org/jira/browse/YARN-2807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated YARN-2807: Component/s: documentation Option --forceactive not works as described in usage of yarn rmadmin -transitionToActive Key: YARN-2807 URL: https://issues.apache.org/jira/browse/YARN-2807 Project: Hadoop YARN Issue Type: Sub-task Components: documentation, resourcemanager Reporter: Wangda Tan Assignee: Masatake Iwasaki Priority: Minor Attachments: YARN-2807.1.patch, YARN-2807.2.patch Currently the help message of yarn rmadmin -transitionToActive is: {code} transitionToActive: incorrect number of arguments Usage: HAAdmin [-transitionToActive serviceId [--forceactive]] {code} But the --forceactive not works as expected. When transition RM state with --forceactive: {code} yarn rmadmin -transitionToActive rm2 --forceactive Automatic failover is enabled for org.apache.hadoop.yarn.client.RMHAServiceTarget@64c9f31e Refusing to manually manage HA state, since it may cause a split-brain scenario or other incorrect state. If you are very sure you know what you are doing, please specify the forcemanual flag. {code} As shown above, we still cannot transitionToActive when automatic failover is enabled with --forceactive. The option can work is: {{--forcemanual}}, there's no place in usage describes this option. I think we should fix this -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2996) Refine some fs operations in FileSystemRMStateStore to improve performance
[ https://issues.apache.org/jira/browse/YARN-2996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267322#comment-14267322 ] Zhijie Shen commented on YARN-2996: --- My bad. I mean MemoryRMStateStore#updateRMDelegationTokenState. It contains two other synchronized methods, but it's better to keep them atomic, and not interpolated by other operations. Refine some fs operations in FileSystemRMStateStore to improve performance -- Key: YARN-2996 URL: https://issues.apache.org/jira/browse/YARN-2996 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Yi Liu Assignee: Yi Liu Attachments: YARN-2996.001.patch, YARN-2996.002.patch In {{FileSystemRMStateStore}}, we can refine some fs operations to improve performance: *1.* There are several places invoke {{fs.exists}}, then {{fs.getFileStatus}}, we can merge them to save one RPC call {code} if (fs.exists(versionNodePath)) { FileStatus status = fs.getFileStatus(versionNodePath); {code} *2.* {code} protected void updateFile(Path outputPath, byte[] data) throws Exception { Path newPath = new Path(outputPath.getParent(), outputPath.getName() + .new); // use writeFile to make sure .new file is created atomically writeFile(newPath, data); replaceFile(newPath, outputPath); } {code} The {{updateFile}} is not good too, it write file to _output\_file_.tmp, then rename to _output\_file_.new, then rename it to _output\_file_, we can reduce one rename operation. Also there is one unnecessary import, we can remove it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2807) Option --forceactive not works as described in usage of yarn rmadmin -transitionToActive
[ https://issues.apache.org/jira/browse/YARN-2807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267358#comment-14267358 ] Akira AJISAKA commented on YARN-2807: - Thanks [~iwasakims] for updating the patch. Mostly looks good to me. Minor comment: Would you remove trailing whitespaces in YarnCommands.apt.vm? Option --forceactive not works as described in usage of yarn rmadmin -transitionToActive Key: YARN-2807 URL: https://issues.apache.org/jira/browse/YARN-2807 Project: Hadoop YARN Issue Type: Sub-task Components: documentation, resourcemanager Reporter: Wangda Tan Assignee: Masatake Iwasaki Priority: Minor Attachments: YARN-2807.1.patch, YARN-2807.2.patch Currently the help message of yarn rmadmin -transitionToActive is: {code} transitionToActive: incorrect number of arguments Usage: HAAdmin [-transitionToActive serviceId [--forceactive]] {code} But the --forceactive not works as expected. When transition RM state with --forceactive: {code} yarn rmadmin -transitionToActive rm2 --forceactive Automatic failover is enabled for org.apache.hadoop.yarn.client.RMHAServiceTarget@64c9f31e Refusing to manually manage HA state, since it may cause a split-brain scenario or other incorrect state. If you are very sure you know what you are doing, please specify the forcemanual flag. {code} As shown above, we still cannot transitionToActive when automatic failover is enabled with --forceactive. The option can work is: {{--forcemanual}}, there's no place in usage describes this option. I think we should fix this -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2996) Refine some fs operations in FileSystemRMStateStore to improve performance
[ https://issues.apache.org/jira/browse/YARN-2996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated YARN-2996: - Attachment: YARN-2996.003.patch OK, I see, update the patch. Thanks Zhijie. Refine some fs operations in FileSystemRMStateStore to improve performance -- Key: YARN-2996 URL: https://issues.apache.org/jira/browse/YARN-2996 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Yi Liu Assignee: Yi Liu Attachments: YARN-2996.001.patch, YARN-2996.002.patch, YARN-2996.003.patch In {{FileSystemRMStateStore}}, we can refine some fs operations to improve performance: *1.* There are several places invoke {{fs.exists}}, then {{fs.getFileStatus}}, we can merge them to save one RPC call {code} if (fs.exists(versionNodePath)) { FileStatus status = fs.getFileStatus(versionNodePath); {code} *2.* {code} protected void updateFile(Path outputPath, byte[] data) throws Exception { Path newPath = new Path(outputPath.getParent(), outputPath.getName() + .new); // use writeFile to make sure .new file is created atomically writeFile(newPath, data); replaceFile(newPath, outputPath); } {code} The {{updateFile}} is not good too, it write file to _output\_file_.tmp, then rename to _output\_file_.new, then rename it to _output\_file_, we can reduce one rename operation. Also there is one unnecessary import, we can remove it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)