[jira] [Commented] (YARN-1327) Fix nodemgr native compilation problems on FreeBSD9

2013-12-12 Thread Radim Kolar (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846155#comment-13846155
 ] 

Radim Kolar commented on YARN-1327:
---

Can anybody look at this? It breaks Hadoop on FreeBSD.

 Fix nodemgr native compilation problems on FreeBSD9
 ---

 Key: YARN-1327
 URL: https://issues.apache.org/jira/browse/YARN-1327
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 3.0.0
Reporter: Radim Kolar
Assignee: Radim Kolar
 Fix For: 3.0.0, 2.3.0

 Attachments: nodemgr-portability.txt


 There are several portability problems preventing the native component from 
 compiling on FreeBSD:
 1. libgen.h is not included. The correct function prototype is there, but Linux 
 glibc has a workaround that defines it for the user if libgen.h is not directly 
 included. Include this file directly.
 2. Query the maximum size of the login name using sysconf. This follows the same 
 code style as the rest of the code, which already uses sysconf.
 3. cgroups are a Linux-only feature; make their compilation conditional and return 
 an error if mount_cgroup is attempted on a non-Linux OS.
 4. Do not use the POSIX function setpgrp(), since it clashes with the function of 
 the same name from BSD 4.2; use an equivalent function. After inspecting the glibc 
 sources, it is just a shortcut for setpgid(0,0).
 These changes make it compile on both Linux and FreeBSD.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (YARN-1028) Add FailoverProxyProvider like capability to RMProxy

2013-12-12 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-1028:
---

Attachment: yarn-1028-6.patch

 Add FailoverProxyProvider like capability to RMProxy
 

 Key: YARN-1028
 URL: https://issues.apache.org/jira/browse/YARN-1028
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Karthik Kambatla
 Attachments: yarn-1028-1.patch, yarn-1028-2.patch, yarn-1028-3.patch, 
 yarn-1028-4.patch, yarn-1028-5.patch, yarn-1028-6.patch, 
 yarn-1028-draft-cumulative.patch


 RMProxy layer currently abstracts RM discovery and implements it by looking 
 up service information from configuration. Motivated by HDFS and using 
 existing classes from Common, we can add failover proxy providers that may 
 provide RM discovery in extensible ways.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-1028) Add FailoverProxyProvider like capability to RMProxy

2013-12-12 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846315#comment-13846315
 ] 

Karthik Kambatla commented on YARN-1028:


Thanks Tom. Updated the patch to address your comments. 

bq. It looks like the behaviour in this patch differs from the way failover is 
implemented for HDFS HA, where it is controlled by dfs.client.failover settings
For consistency, all yarn-failover configs are prefixed by 
yarn.client.failover. The suffixes are also similar to the ones HDFS uses, but 
use hyphens instead of dots for consistency with the rest of the YARN configs.
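For illustration, a minimal client-side sketch following that naming convention (the exact keys and the provider class name below are assumptions, not taken from the patch):
{code}
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class FailoverConfSketch {
  public static void main(String[] args) {
    // Hypothetical keys following the "yarn.client.failover" prefix and
    // hyphenated suffixes described above; assumptions, not read from the patch.
    YarnConfiguration conf = new YarnConfiguration();
    conf.setBoolean("yarn.resourcemanager.ha.enabled", true);
    conf.set("yarn.client.failover-proxy-provider",
        "org.apache.hadoop.yarn.client.ConfiguredFailoverProxyProvider"); // assumed FQCN
    conf.setInt("yarn.client.failover-max-attempts", 15);
    conf.setLong("yarn.client.failover-sleep-base-ms", 500L);
    System.out.println(conf.get("yarn.client.failover-proxy-provider"));
  }
}
{code}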

bq. Why do you need both YarnFailoverProxyProvider and 
ConfiguredFailoverProxyProvider?
Changed {{YarnFailoverProxyProvider}} to an interface with a single method 
{{#init(Conf, RMProxy, Class<T> protocol)}}. This init() is called after 
creating an instance of the specified class. HDFS, on the other hand, expects 
the plugged-in FailoverProxyProvider to have a constructor of a particular 
form. I think the approach in the current patch is cleaner, since anyone writing a 
plugin knows they should provide an init method. What do you think? I can remove 
YarnFailoverProxyProvider altogether if you think that is a better 
approach.
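A minimal sketch of the interface shape described above (the sketch name and the raw RMProxy parameter are assumptions; the real interface in the patch may differ):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.client.RMProxy;

// Sketch only: a single init() method called right after the provider is
// created reflectively, instead of requiring a constructor with a fixed
// signature as HDFS does for its FailoverProxyProvider plugins.
public interface YarnFailoverProxyProviderSketch<T> {
  void init(Configuration conf, RMProxy rmProxy, Class<T> protocol);
}
{code}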


 Add FailoverProxyProvider like capability to RMProxy
 

 Key: YARN-1028
 URL: https://issues.apache.org/jira/browse/YARN-1028
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Karthik Kambatla
 Attachments: yarn-1028-1.patch, yarn-1028-2.patch, yarn-1028-3.patch, 
 yarn-1028-4.patch, yarn-1028-5.patch, yarn-1028-6.patch, 
 yarn-1028-draft-cumulative.patch


 RMProxy layer currently abstracts RM discovery and implements it by looking 
 up service information from configuration. Motivated by HDFS and using 
 existing classes from Common, we can add failover proxy providers that may 
 provide RM discovery in extensible ways.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Created] (YARN-1502) Protocol changes and implementations in RM side to support change container resource

2013-12-12 Thread Wangda Tan (JIRA)
Wangda Tan created YARN-1502:


 Summary: Protocol changes and implementations in RM side to 
support change container resource
 Key: YARN-1502
 URL: https://issues.apache.org/jira/browse/YARN-1502
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.4.0
Reporter: Wangda Tan
Assignee: Wangda Tan


As described in YARN-1197, we need to add API/implementation changes:
1) Add a List<ContainerResourceIncreaseRequest> parameter to the YarnScheduler interface
2) Make resource-changed containers available in AllocateResponse
3) Add an implementation on the Capacity Scheduler side to support increase/decrease

For other details, please refer to the design doc and discussion in YARN-1197.
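For a rough illustration of the direction (the type and method names below are assumptions based on this summary, not the actual patch):
{code}
import java.util.List;

// Hypothetical stand-ins for the records named above; only the shape matters here.
interface ContainerResourceIncreaseRequest { /* containerId + target capability */ }
interface ContainerResourceDecrease { /* containerId + reduced capability */ }

// 1) The scheduler's allocate() call grows an extra list of increase requests.
interface YarnSchedulerSketch {
  AllocationSketch allocate(List<ContainerResourceIncreaseRequest> increaseRequests);
}

// 2) The AM-facing response reports resource-changed containers back.
interface AllocationSketch {
  List<ContainerResourceIncreaseRequest> getIncreasedContainers();
  List<ContainerResourceDecrease> getDecreasedContainers();
}
{code}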



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (YARN-1502) Protocol changes and implementations in RM side to support change container resource

2013-12-12 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-1502:
-

Attachment: yarn-1502.1.patch

Attached the first patch of scheduler changes for preview, just to find out 
whether there are any big issues with this approach. 
Currently this is still in development; container increasing is supported and has 
unit tests, but container decreasing is not tested yet.
Hope you can shed some light on it! :)

 Protocol changes and implementations in RM side to support change container 
 resource
 

 Key: YARN-1502
 URL: https://issues.apache.org/jira/browse/YARN-1502
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.4.0
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: yarn-1502.1.patch


 As described in YARN-1197, we need to add API/implementation changes:
 1) Add a List<ContainerResourceIncreaseRequest> parameter to the YarnScheduler interface
 2) Make resource-changed containers available in AllocateResponse
 3) Add an implementation on the Capacity Scheduler side to support 
 increase/decrease
 For other details, please refer to the design doc and discussion in YARN-1197.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (YARN-1029) Allow embedding leader election into the RM

2013-12-12 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-1029:
---

Attachment: yarn-1029-0.patch

yarn-1029-0.patch is a working patch that uses ActiveStandbyElector directly. 

The ZKFCProtocol implementation is straightforward - 60 lines of 
code between AdminService and RMZKActiveStandbyElector, along with 
error/exception handling. 
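For context, a rough sketch of how the elector callbacks could drive the RM's HA transitions; ActiveStandbyElectorCallback is the existing Common interface, but the RMHATransitions hook and the wiring below are hypothetical, not code from the patch:
{code}
import org.apache.hadoop.ha.ActiveStandbyElector;
import org.apache.hadoop.ha.ServiceFailedException;

// Hypothetical hook standing in for the real AdminService wiring.
interface RMHATransitions {
  void toActive() throws ServiceFailedException;
  void toStandby();
}

public class RMElectorCallbackSketch
    implements ActiveStandbyElector.ActiveStandbyElectorCallback {

  private final RMHATransitions rm;

  public RMElectorCallbackSketch(RMHATransitions rm) {
    this.rm = rm;
  }

  @Override
  public void becomeActive() throws ServiceFailedException {
    rm.toActive();    // won the ZK election
  }

  @Override
  public void becomeStandby() {
    rm.toStandby();   // lost the election or asked to step down
  }

  @Override
  public void enterNeutralMode() {
    // ZK connection lost; keep the current state until the elector re-decides.
  }

  @Override
  public void notifyFatalError(String errorMessage) {
    throw new RuntimeException("Elector failed: " + errorMessage);
  }

  @Override
  public void fenceOldActive(byte[] oldActiveData) {
    // In this sketch, fencing is left to the shared RM state store.
  }
}
{code}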

 Allow embedding leader election into the RM
 ---

 Key: YARN-1029
 URL: https://issues.apache.org/jira/browse/YARN-1029
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Karthik Kambatla
 Attachments: yarn-1029-0.patch, yarn-1029-approach.patch


 It should be possible to embed the common ActiveStandbyElector into the RM such 
 that ZooKeeper-based leader election and notification are built in. In 
 conjunction with a ZK state store, this configuration will be a simple 
 deployment option.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-1028) Add FailoverProxyProvider like capability to RMProxy

2013-12-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846325#comment-13846325
 ] 

Hadoop QA commented on YARN-1028:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12618407/yarn-1028-6.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests:

  org.apache.hadoop.yarn.server.TestContainerManagerSecurity
  org.apache.hadoop.yarn.server.TestRMNMSecretKeys

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2649//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2649//console

This message is automatically generated.

 Add FailoverProxyProvider like capability to RMProxy
 

 Key: YARN-1028
 URL: https://issues.apache.org/jira/browse/YARN-1028
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Karthik Kambatla
 Attachments: yarn-1028-1.patch, yarn-1028-2.patch, yarn-1028-3.patch, 
 yarn-1028-4.patch, yarn-1028-5.patch, yarn-1028-6.patch, 
 yarn-1028-draft-cumulative.patch


 RMProxy layer currently abstracts RM discovery and implements it by looking 
 up service information from configuration. Motivated by HDFS and using 
 existing classes from Common, we can add failover proxy providers that may 
 provide RM discovery in extensible ways.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (YARN-1029) Allow embedding leader election into the RM

2013-12-12 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-1029:
---

Attachment: embedded-zkfc-approach.patch

Also uploading the embedded-zkfc-approach patch that I worked on earlier. It might not 
apply to trunk/branch-2 anymore, though. The uploaded patch doesn't remove any 
of the overheads or handle short-circuit. 

After having implemented both approaches, I sincerely feel the 
ActiveStandbyElector approach is simpler, cleaner, and more straightforward than the 
embedded ZKFC approach. Refactoring ZKFC would only add more work, without 
apparent gains. 

When working on the embedded ZKFC approach, I ran it by Todd, and he suggested 
we might want to use ActiveStandbyElector directly and do away with unnecessary 
failover code paths if we are not using the rest of the ZKFC features. Thanks to 
his suggestion, the code definitely looks simpler that way. 

[~vinodkv] - is there a good technical reason for using ZKFC instead of 
ActiveStandbyElector directly? In this case, we only need election and would be 
using ZKFC just for the elector. 

 Allow embedding leader election into the RM
 ---

 Key: YARN-1029
 URL: https://issues.apache.org/jira/browse/YARN-1029
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Karthik Kambatla
 Attachments: embedded-zkfc-approach.patch, yarn-1029-0.patch, 
 yarn-1029-approach.patch


 It should be possible to embed the common ActiveStandbyElector into the RM such 
 that ZooKeeper-based leader election and notification are built in. In 
 conjunction with a ZK state store, this configuration will be a simple 
 deployment option.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-1028) Add FailoverProxyProvider like capability to RMProxy

2013-12-12 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846332#comment-13846332
 ] 

Karthik Kambatla commented on YARN-1028:


The test failures are unrelated.

 Add FailoverProxyProvider like capability to RMProxy
 

 Key: YARN-1028
 URL: https://issues.apache.org/jira/browse/YARN-1028
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Karthik Kambatla
 Attachments: yarn-1028-1.patch, yarn-1028-2.patch, yarn-1028-3.patch, 
 yarn-1028-4.patch, yarn-1028-5.patch, yarn-1028-6.patch, 
 yarn-1028-draft-cumulative.patch


 RMProxy layer currently abstracts RM discovery and implements it by looking 
 up service information from configuration. Motivated by HDFS and using 
 existing classes from Common, we can add failover proxy providers that may 
 provide RM discovery in extensible ways.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-1488) Allow containers to delegate resources to another container

2013-12-12 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846343#comment-13846343
 ] 

Arun C Murthy commented on YARN-1488:
-

[~henryr] Good to see you around here.

bq. Would the recipient and delegated containers have to match the queues to 
which their original resources were granted?

No, not at all.

The requirement, gleaned from discussions in YARN-1404 (and HDFS-4949, if you 
squint hard *smile*) is that you'd want an external framework, call it Gamma, 
which runs in a separate queue, call it GammaQ.

Now, a user Alice, belonging to queue Phi, needs to use Gamma.

The key requirement is that Gamma would like to leverage YARN's workload 
management capabilities (queues, SLAs etc.) rather than merely run under YARN 
to leverage YARN's resource management.

Use-cases: 
# Alice running queries on Impala (resource: memory, cpu, others in future)
# Bob caching data-sets in HDFS i.e. DataNodes (resource: memory)
# Charlie doing a bunch of I/O operations on HBase/Accumulo (resource: cpu, 
iops).

If we all agree on the use cases, then it is very critical to support 
source and target containers belonging to different queues - that would be key 
to allowing these external frameworks to leverage YARN's workload management.

Does that make sense? 

This would definitely require the NodeManager (and, potentially, the external 
system i.e. impalad, datanode etc.) to maintain the resource-map so that they 
can return the original source container to YARN for various reasons (finished 
the task at hand, preemption to respect queue SLAs etc.)

We could, and should, allow the recipient service to decide how to manage the 
resource map for itself (i.e. decouple that from how the NodeManager manages 
the mapping) - this could be either a single cgroup (which the NodeManager has 
to manage for the external framework anyway) or a hierarchy within.

Thoughts? Thanks.

 Allow containers to delegate resources to another container
 ---

 Key: YARN-1488
 URL: https://issues.apache.org/jira/browse/YARN-1488
 Project: Hadoop YARN
  Issue Type: New Feature
Reporter: Arun C Murthy

 We should allow containers to delegate resources to another container. This 
 would allow external frameworks to share not just YARN's resource-management 
 capabilities but also its workload-management capabilities.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-1488) Allow containers to delegate resources to another container

2013-12-12 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846347#comment-13846347
 ] 

Arun C Murthy commented on YARN-1488:
-

bq. We could, and should, allow the recipient service to decide how to manage 
the resource map for itself

[~henryr]: To clarify, we would have to do this whether or not we decide to 
take the delegation approach. Thanks.

 Allow containers to delegate resources to another container
 ---

 Key: YARN-1488
 URL: https://issues.apache.org/jira/browse/YARN-1488
 Project: Hadoop YARN
  Issue Type: New Feature
Reporter: Arun C Murthy

 We should allow containers to delegate resources to another container. This 
 would allow external frameworks to share not just YARN's resource-management 
 capabilities but also its workload-management capabilities.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Comment Edited] (YARN-1488) Allow containers to delegate resources to another container

2013-12-12 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846343#comment-13846343
 ] 

Arun C Murthy edited comment on YARN-1488 at 12/12/13 3:06 PM:
---

[~henryr] Good to see you around here.

bq. Would the recipient and delegated containers have to match the queues to 
which their original resources were granted?

No, not at all.

The requirement, gleaned from discussions in YARN-1404 (and HDFS-4949, if you 
squint hard *smile*) is that you'd want an external framework, call it Gamma, 
which runs in a separate queue, call it GammaQ.

Now, a user Alice, belonging to queue Alpha, needs to use Gamma.

The key requirement is that Gamma would like to leverage YARN's workload 
management capabilities (queues, SLAs etc.) rather than merely run under YARN 
to leverage YARN's resource management.

Use-cases: 
# Alice, belonging to queue Phi, running queries on Impala (resource: memory, 
cpu, others in future)
# Bob, belonging to queue Alpha, caching data-sets in HDFS i.e. DataNodes 
(resource: memory)
# Charlie, belonging to queue Beta, doing a bunch of I/O operations on 
HBase/Accumulo (resource: cpu, iops).

If we all agree on the use cases, then it is very critical to support 
source and target containers belonging to different queues - that would be key 
to allowing these external frameworks to leverage YARN's workload management.

Does that make sense? 

This would definitely require the NodeManager (and, potentially, the external 
system i.e. impalad, datanode etc.) to maintain the resource-map so that they 
can return the original source container to YARN for various reasons (finished 
the task at hand, preemption to respect queue SLAs etc.)

We could, and should, allow the recipient service to decide how to manage the 
resource map for itself (i.e. decouple that from how the NodeManager manages 
the mapping) - this could be either a single cgroup (which the NodeManager has 
to manage for the external framework anyway) or a hierarchy within.

Thoughts? Thanks.


was (Author: acmurthy):
[~henryr] Good to see you around here.

bq. Would the recipient and delegated containers have to match the queues to 
which their original resources were granted?

No, not at all.

The requirement, gleaned from discussions in YARN-1404 (and HDFS-4949, if you 
squint hard *smile*) is that you'd want an external framework, call it Gamma, 
which runs in a separate queue, call it GammaQ.

Now, a user Alice, belonging to queue Phi, needs to use Gamma.

The key requirement is that Gamma would like to leverage YARN's workload 
management capabilities (queues, SLAs etc.) rather than merely run under YARN 
to leverage YARN's resource management.

Use-cases: 
# Alice running queries on Impala (resource: memory, cpu, others in future)
# Bob caching data-sets in HDFS i.e. DataNodes (resource: memory)
# Charlie doing a bunch of I/O operations on HBase/Accumulo (resource: cpu, 
iops).

If we all agree on the use cases, then it is very critical to support 
source and target containers belonging to different queues - that would be key 
to allowing these external frameworks to leverage YARN's workload management.

Does that make sense? 

This would definitely require the NodeManager (and, potentially, the external 
system i.e. impalad, datanode etc.) to maintain the resource-map so that they 
can return the original source container to YARN for various reasons (finished 
the task at hand, preemption to respect queue SLAs etc.)

We could, and should, allow the recipient service to decide how to manage the 
resource map for itself (i.e. decouple that from how the NodeManager manages 
the mapping) - this could be either a single cgroup (which the NodeManager has 
to manage for the external framework anyway) or a hierarchy within.

Thoughts? Thanks.

 Allow containers to delegate resources to another container
 ---

 Key: YARN-1488
 URL: https://issues.apache.org/jira/browse/YARN-1488
 Project: Hadoop YARN
  Issue Type: New Feature
Reporter: Arun C Murthy

 We should allow containers to delegate resources to another container. This 
 would allow external frameworks to share not just YARN's resource-management 
 capabilities but also its workload-management capabilities.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-1491) Upgrade JUnit3 TestCase to JUnit 4

2013-12-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846369#comment-13846369
 ] 

Hudson commented on YARN-1491:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1610 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1610/])
YARN-1491. Upgrade JUnit3 TestCase to JUnit 4 (Chen He via jeagles) (jeagles: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1550204)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestLinuxResourceCalculatorPlugin.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestWindowsBasedProcessTree.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestWindowsResourceCalculatorPlugin.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestYarnVersionInfo.java


 Upgrade JUnit3 TestCase to JUnit 4
 --

 Key: YARN-1491
 URL: https://issues.apache.org/jira/browse/YARN-1491
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Jonathan Eagles
Assignee: Chen He
  Labels: newbie
 Fix For: 3.0.0, 2.4.0

 Attachments: Yarn-1491.patch


 There are still four references to test classes that extend from 
 junit.framework.TestCase
 hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestYarnVersionInfo.java
 hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestWindowsResourceCalculatorPlugin.java
 hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestLinuxResourceCalculatorPlugin.java
 hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestWindowsBasedProcessTree.java



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-1481) Move internal services logic from AdminService to ResourceManager

2013-12-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846371#comment-13846371
 ] 

Hudson commented on YARN-1481:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1610 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1610/])
YARN-1481. Move internal services logic from AdminService to ResourceManager. 
(vinodkv via kasha) (kasha: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1550167)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/AdminService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContext.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContextImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java


 Move internal services logic from AdminService to ResourceManager
 -

 Key: YARN-1481
 URL: https://issues.apache.org/jira/browse/YARN-1481
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
 Fix For: 2.4.0

 Attachments: YARN-1481-20131207.txt, YARN-1481-20131209.txt


 This is something I found while reviewing YARN-1318, but didn't halt that 
 patch as many cycles went there already. Some top level issues
  - Not easy to follow RM's service life cycle
 -- RM adds only AdminService as its service directly.
 -- Other services are added to RM when AdminService's init calls 
 RM.activeServices.init()
  - Overall, AdminService shouldn't encompass all of RM's HA state management. 
 It was originally supposed to be the implementation of just the RPC server.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-408) Capacity Scheduler delay scheduling should not be disabled by default

2013-12-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846375#comment-13846375
 ] 

Hudson commented on YARN-408:
-

FAILURE: Integrated in Hadoop-Hdfs-trunk #1610 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1610/])
YARN-408. Change CapacityScheduler to not disable delay-scheduling by default. 
Contributed by Mayank Bansal. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1550245)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/conf/capacity-scheduler.xml
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRM.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java


 Capacity Scheduler delay scheduling should not be disabled by default
 -

 Key: YARN-408
 URL: https://issues.apache.org/jira/browse/YARN-408
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.0.3-alpha
Reporter: Mayank Bansal
Assignee: Mayank Bansal
Priority: Minor
 Fix For: 2.4.0

 Attachments: YARN-408-trunk-2.patch, YARN-408-trunk-3.patch, 
 YARN-408-trunk.patch


 Capacity Scheduler delay scheduling should not be disabled by default.
 It should be enabled, set to the number of nodes in one rack.
 Thanks,
 Mayank



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (YARN-1197) Support changing resources of an allocated container

2013-12-12 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-1197:
-

Attachment: yarn-1197-scheduler-v1.pdf

Attached the scheduler design doc for increasing and decreasing. I've uploaded a 
very rough draft preview patch of the scheduler changes in YARN-1502.

 Support changing resources of an allocated container
 

 Key: YARN-1197
 URL: https://issues.apache.org/jira/browse/YARN-1197
 Project: Hadoop YARN
  Issue Type: Task
  Components: api, nodemanager, resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: mapreduce-project.patch.ver.1, 
 tools-project.patch.ver.1, yarn-1197-scheduler-v1.pdf, yarn-1197-v2.pdf, 
 yarn-1197-v3.pdf, yarn-1197-v4.pdf, yarn-1197-v5.pdf, yarn-1197.pdf, 
 yarn-api-protocol.patch.ver.1, yarn-pb-impl.patch.ver.1, 
 yarn-server-common.patch.ver.1, yarn-server-nodemanager.patch.ver.1, 
 yarn-server-resourcemanager.patch.ver.1


 The current YARN resource management logic assumes the resource allocated to a 
 container is fixed during its lifetime. When users want to change the resource 
 of an allocated container, the only way is to release it and allocate a new 
 container with the expected size.
 Allowing run-time changes to the resources of an allocated container will give us 
 better control of resource usage on the application side.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-1197) Support changing resources of an allocated container

2013-12-12 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846389#comment-13846389
 ] 

Wangda Tan commented on YARN-1197:
--

Copying the text from the scheduler design doc here for easier discussion; please feel 
free to let me know your comments!

*Basic Requirements*
We need to support handling resource increase requests from the AM and resource 
decrease notifications from the NM.
* Such resource changes should be reflected in FiCaSchedulerNode/App, LeafQueue, and 
ParentQueue (usedResource, reservedResource, etc.)
* If a user's increase request cannot be satisfied immediately, it 
will be reserved in the node/app (node/app means FiCaSchedulerApp/Node, same 
below), as before.

*Advanced Requirements*
* We need to gracefully handle race conditions:
** Only acquired/running containers can be increased
** Container decreasing only takes effect on acquired/running containers. 
(If a container is finished/killed, etc., all of its resources are released, so 
we don't need to decrease it.)
** A user may submit a new increase request for a container that already has a 
pending increase request. We need to replace the pending request 
with the new one.
** When a requested container resource is less than or equal to the existing container 
resource: 
*** the request is ignored if there is no pending increase request for this container
*** otherwise the request is ignored and the pending increase request is canceled
** When a pending increase request exists and a decrease notification for 
the same container arrives, the container will be decreased and the pending 
increase request will be canceled

*Requirements not clear*
* Do we need a time-out parameter for a reserved resource increase request, to 
avoid it occupying the node's resources for too long? (Do we have such a parameter for 
reserving a "normal" container in CS?)
* How do we decide whether an increase request or a normal container request is 
satisfied first? (Currently, I simply make CS satisfy increase requests first.) 
Should it be a configurable parameter?

*Current Implementation*

*1) Decrease Container*
I start with decreasing containers because it's easier to understand.
Decreased containers are handled in nodeUpdate() of the Capacity Scheduler.
When CS receives decreased containers from the NM, it processes them one by one 
with the following steps (a rough sketch follows the list):

* Check if the container is in the running state (because this is reported by the NM, its 
state will either be running or completed); skip it if not.
* Remove the increase request on the same container-id if it exists
* Decrease/update the container resource in 
FiCaSchedulerApp/AppSchedulingInfo/FiCaSchedulerNode/LeafQueue/ParentQueue/other-related-metrics
* Update the resource in the Container.
* Return the decreased container to the AM by calling setDecreasedContainer in the 
AllocateResponse
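A minimal Java sketch of the decrease path listed above; the helper types here are hypothetical stand-ins, not the real FiCaScheduler/queue classes from the patch:
{code}
import java.util.List;

// Hypothetical stand-ins for the scheduler pieces named above; the real patch
// works against FiCaSchedulerApp/FiCaSchedulerNode/LeafQueue directly.
interface DecreasedContainer {
  String containerId();
  boolean isRunning();
}

interface SchedulerBookkeeping {
  void removePendingIncrease(String containerId);     // cancel a stale increase request
  void decreaseUsedResource(DecreasedContainer c);    // app/node/queue metrics + Container
  void recordDecreasedForAM(DecreasedContainer c);    // surfaced later via AllocateResponse
}

final class DecreaseHandlerSketch {
  private final SchedulerBookkeeping bookkeeping;

  DecreaseHandlerSketch(SchedulerBookkeeping bookkeeping) {
    this.bookkeeping = bookkeeping;
  }

  /** Mirrors the nodeUpdate() steps listed above, one container at a time. */
  void handleDecreases(List<DecreasedContainer> reportedByNodeManager) {
    for (DecreasedContainer c : reportedByNodeManager) {
      if (!c.isRunning()) {
        continue; // a completed container has already released its resources
      }
      bookkeeping.removePendingIncrease(c.containerId());
      bookkeeping.decreaseUsedResource(c);
      bookkeeping.recordDecreasedForAM(c);
    }
  }
}
{code}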

*2) Increase Container*
Increasing a container is much more complex than decreasing one.

*Steps to add a container increase request (pseudo code)*
In CapacityScheduler.allocate(...)
{code}
foreach (increase_request):
    if (state != ACQUIRED) and (state != RUNNING):
        continue;

    // Remove the old request on the same container-id if it exists
    if increase_request_exist(increase_request.getContainerId()):
        remove(increase_request);

    // The asked target resource should be larger than the existing resource
    if increase_request.ask_resource <= existing_resource(increase_request.getContainerId()):
        continue;

    // Add it to the application
    getApplication(increase_request.getContainerId()).add(increase_request)
{code}

*Steps to handle a container increase request*
2.1) In CapacityScheduler.nodeUpdate(...):
{code}
if node.is_reserved():
    if reserved-increase-request:
        LeafQueue.assignReservedIncreaseRequest(...)
    elif reserved-normal-container:
        ...
else:
    ParentQueue.assignContainers(...)
    // this will finally call
    // LeafQueue.assignContainers(...)
{code}

2.2) In CapacityScheduler.nodeUpdate(...):
{code}
if request-is-fit-in-resource:
    allocate-resource
    update container token
    add to AllocateResponse
    return allocated-resource
else:
    return None
{code}

2.3) In LeafQueue.assignContainers(...):
{code}
foreach (application):
    // do increase allocation first
    foreach (increase_request):
        // check if we can allocate it
        // within queue/user limits, etc.
        // return None if not satisfied

        if request-is-fit-in-resource:
            allocate-resource
            update container token
            add to AllocateResponse
        else:
            reserve in app/node
            return reserved-resource

    // do normal allocation
    ...
{code}

*API changes in CapacityScheduler*
1)YarnScheduler
{code}
   public Allocation 

[jira] [Commented] (YARN-1028) Add FailoverProxyProvider like capability to RMProxy

2013-12-12 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846395#comment-13846395
 ] 

Tom White commented on YARN-1028:
-

Thanks for the explanation of how failover works, Karthik. I think the failover 
configuration is much better now - the patch is very close. Just a few minor 
comments:

*  The YarnFailoverProxyProvider interface is an improvement. It might be good 
to have RM in its name since it is about RM failover. Ditto for 
ConfiguredFailoverProxyProvider.
* It would be nice to have YarnClientImpl still report which RM it submitted to 
- the logical name when HA is enabled, the host/port when not. 
* Nit: TestRMFailover has a spurious log message LOG.error(KK)
* Nit: YARN_MINI_CLUSTER_USE_RPC and DEFAULT_YARN_MINI_CLUSTER_USE_RPC - should 
be MINICLUSTER (without a space) for consistency with existing names.

 Add FailoverProxyProvider like capability to RMProxy
 

 Key: YARN-1028
 URL: https://issues.apache.org/jira/browse/YARN-1028
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Karthik Kambatla
 Attachments: yarn-1028-1.patch, yarn-1028-2.patch, yarn-1028-3.patch, 
 yarn-1028-4.patch, yarn-1028-5.patch, yarn-1028-6.patch, 
 yarn-1028-draft-cumulative.patch


 RMProxy layer currently abstracts RM discovery and implements it by looking 
 up service information from configuration. Motivated by HDFS and using 
 existing classes from Common, we can add failover proxy providers that may 
 provide RM discovery in extensible ways.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (YARN-1028) Add FailoverProxyProvider like capability to RMProxy

2013-12-12 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-1028:
---

Attachment: yarn-1028-7.patch

Thanks again, Tom. New patch that addresses your comments.

 Add FailoverProxyProvider like capability to RMProxy
 

 Key: YARN-1028
 URL: https://issues.apache.org/jira/browse/YARN-1028
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Karthik Kambatla
 Attachments: yarn-1028-1.patch, yarn-1028-2.patch, yarn-1028-3.patch, 
 yarn-1028-4.patch, yarn-1028-5.patch, yarn-1028-6.patch, yarn-1028-7.patch, 
 yarn-1028-draft-cumulative.patch


 RMProxy layer currently abstracts RM discovery and implements it by looking 
 up service information from configuration. Motivated by HDFS and using 
 existing classes from Common, we can add failover proxy providers that may 
 provide RM discovery in extensible ways.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-1028) Add FailoverProxyProvider like capability to RMProxy

2013-12-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846501#comment-13846501
 ] 

Hadoop QA commented on YARN-1028:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12618440/yarn-1028-7.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests:

  org.apache.hadoop.yarn.server.TestContainerManagerSecurity
  org.apache.hadoop.yarn.server.TestRMNMSecretKeys

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2650//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/2650//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-client.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2650//console

This message is automatically generated.

 Add FailoverProxyProvider like capability to RMProxy
 

 Key: YARN-1028
 URL: https://issues.apache.org/jira/browse/YARN-1028
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Karthik Kambatla
 Attachments: yarn-1028-1.patch, yarn-1028-2.patch, yarn-1028-3.patch, 
 yarn-1028-4.patch, yarn-1028-5.patch, yarn-1028-6.patch, yarn-1028-7.patch, 
 yarn-1028-draft-cumulative.patch


 RMProxy layer currently abstracts RM discovery and implements it by looking 
 up service information from configuration. Motivated by HDFS and using 
 existing classes from Common, we can add failover proxy providers that may 
 provide RM discovery in extensible ways.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (YARN-1028) Add FailoverProxyProvider like capability to RMProxy

2013-12-12 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-1028:
---

Attachment: yarn-1028-8.patch

Fixed the findbugs warning. Also verified the output from YarnClientImpl by running a 
job against an HA cluster.

 Add FailoverProxyProvider like capability to RMProxy
 

 Key: YARN-1028
 URL: https://issues.apache.org/jira/browse/YARN-1028
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Karthik Kambatla
 Attachments: yarn-1028-1.patch, yarn-1028-2.patch, yarn-1028-3.patch, 
 yarn-1028-4.patch, yarn-1028-5.patch, yarn-1028-6.patch, yarn-1028-7.patch, 
 yarn-1028-8.patch, yarn-1028-draft-cumulative.patch


 RMProxy layer currently abstracts RM discovery and implements it by looking 
 up service information from configuration. Motivated by HDFS and using 
 existing classes from Common, we can add failover proxy providers that may 
 provide RM discovery in extensible ways.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-1028) Add FailoverProxyProvider like capability to RMProxy

2013-12-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846574#comment-13846574
 ] 

Hadoop QA commented on YARN-1028:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12618449/yarn-1028-8.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests:

  org.apache.hadoop.yarn.server.TestContainerManagerSecurity
  org.apache.hadoop.yarn.server.TestRMNMSecretKeys

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2651//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2651//console

This message is automatically generated.

 Add FailoverProxyProvider like capability to RMProxy
 

 Key: YARN-1028
 URL: https://issues.apache.org/jira/browse/YARN-1028
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Karthik Kambatla
 Attachments: yarn-1028-1.patch, yarn-1028-2.patch, yarn-1028-3.patch, 
 yarn-1028-4.patch, yarn-1028-5.patch, yarn-1028-6.patch, yarn-1028-7.patch, 
 yarn-1028-8.patch, yarn-1028-draft-cumulative.patch


 RMProxy layer currently abstracts RM discovery and implements it by looking 
 up service information from configuration. Motivated by HDFS and using 
 existing classes from Common, we can add failover proxy providers that may 
 provide RM discovery in extensible ways.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-1325) Enabling HA should check Configuration contains multiple RMs

2013-12-12 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846696#comment-13846696
 ] 

Xuan Gong commented on YARN-1325:
-

The test case failures are unrelated.

 Enabling HA should check Configuration contains multiple RMs
 

 Key: YARN-1325
 URL: https://issues.apache.org/jira/browse/YARN-1325
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Tsuyoshi OZAWA
Assignee: Xuan Gong
  Labels: ha
 Attachments: YARN-1325.1.patch, YARN-1325.2.patch, YARN-1325.3.patch, 
 YARN-1325.4.patch


 Currently, we can enable the RM HA configuration without multiple RM 
 ids (YarnConfiguration.RM_HA_IDS). This behaviour can cause incorrect operation. 
 The ResourceManager should verify that more than one RM id is specified in 
 RM_HA_IDS.
 One idea is to support a strict mode to enforce this check via 
 configuration (e.g. yarn.resourcemanager.ha.strict-mode.enabled).



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Assigned] (YARN-1180) Update capacity scheduler docs to include types on the configs

2013-12-12 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He reassigned YARN-1180:
-

Assignee: Chen He

 Update capacity scheduler docs to include types on the configs
 --

 Key: YARN-1180
 URL: https://issues.apache.org/jira/browse/YARN-1180
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 3.0.0, 2.1.0-beta, 0.23.9
Reporter: Thomas Graves
Assignee: Chen He
  Labels: documentation, newbie

 The capacity scheduler docs 
 (http://hadoop.apache.org/docs/r2.1.0-beta/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html)
  don't include types for all the configs. For instance, 
 minimum-user-limit-percent doesn't say it's an Int. It is also the only setting 
 in the Resource Allocation configs that is an Int rather than a float.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-1325) Enabling HA should check Configuration contains multiple RMs

2013-12-12 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846730#comment-13846730
 ] 

Xuan Gong commented on YARN-1325:
-

Thanks.
https://issues.apache.org/jira/browse/YARN-1463 is being used to track the test case 
failures.

 Enabling HA should check Configuration contains multiple RMs
 

 Key: YARN-1325
 URL: https://issues.apache.org/jira/browse/YARN-1325
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Tsuyoshi OZAWA
Assignee: Xuan Gong
  Labels: ha
 Fix For: 2.4.0

 Attachments: YARN-1325.1.patch, YARN-1325.2.patch, YARN-1325.3.patch, 
 YARN-1325.4.patch


 Currently, we can enable the RM HA configuration without multiple RM 
 ids (YarnConfiguration.RM_HA_IDS). This behaviour can cause incorrect operation. 
 The ResourceManager should verify that more than one RM id is specified in 
 RM_HA_IDS.
 One idea is to support a strict mode to enforce this check via 
 configuration (e.g. yarn.resourcemanager.ha.strict-mode.enabled).



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Created] (YARN-1503) Support making additional 'LocalResources' available to running containers

2013-12-12 Thread Siddharth Seth (JIRA)
Siddharth Seth created YARN-1503:


 Summary: Support making additional 'LocalResources' available to 
running containers
 Key: YARN-1503
 URL: https://issues.apache.org/jira/browse/YARN-1503
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Siddharth Seth
Assignee: Siddharth Seth


We have a use case where additional resources (jars, libraries, etc.) need to be 
made available to an already running container. Ideally, we'd like this to be 
done via YARN (instead of having potentially multiple containers per node 
download resources on their own).

Proposal:
  The NM would support an additional API where a list of resources can be specified, 
something like localizeResource(ContainerId, Map<String, LocalResource>).
  The NM would also require an additional API to get state for these resources - 
getLocalizationState(ContainerId) - which returns the current state of all 
local resources for the specified container(s).
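For illustration only, one possible shape of such an NM-side API (the names, return types, and state enum are assumptions drawn from this proposal, not an existing protocol):
{code}
import java.util.Map;
import org.apache.hadoop.yarn.api.records.ContainerId;
import org.apache.hadoop.yarn.api.records.LocalResource;

// Hedged sketch of the proposed NM API; nothing here exists in YARN yet.
public interface ContainerLocalizationSketch {

  /** Ask the NM to localize additional resources for a running container. */
  void localizeResource(ContainerId containerId,
                        Map<String, LocalResource> resources);

  /** Poll the current localization state of the container's resources. */
  Map<String, LocalizationState> getLocalizationState(ContainerId containerId);

  /** Hypothetical per-resource state reported back by the NM. */
  enum LocalizationState { PENDING, LOCALIZING, LOCALIZED, FAILED }
}
{code}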




--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (YARN-1498) RM changes for moving apps between queues

2013-12-12 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1498:
-

Attachment: YARN-1498.patch

 RM changes for moving apps between queues
 -

 Key: YARN-1498
 URL: https://issues.apache.org/jira/browse/YARN-1498
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.2.0
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: YARN-1498.patch






--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-1325) Enabling HA should check Configuration contains multiple RMs

2013-12-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846784#comment-13846784
 ] 

Hudson commented on YARN-1325:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #4875 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4875/])
YARN-1325. Modified RM HA configuration validation to also ensure that multiple 
RMs are configured. Contributed by Xuan Gong. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1550524)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/HAUtil.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/conf/TestHAUtil.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStore.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java


 Enabling HA should check Configuration contains multiple RMs
 

 Key: YARN-1325
 URL: https://issues.apache.org/jira/browse/YARN-1325
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Tsuyoshi OZAWA
Assignee: Xuan Gong
  Labels: ha
 Fix For: 2.4.0

 Attachments: YARN-1325.1.patch, YARN-1325.2.patch, YARN-1325.3.patch, 
 YARN-1325.4.patch


 Currently, we can enable the RM HA configuration without multiple RM 
 ids (YarnConfiguration.RM_HA_IDS). This behaviour can cause incorrect operation. 
 The ResourceManager should verify that more than one RM id is specified in 
 RM_HA_IDS.
 One idea is to support a strict mode to enforce this check via 
 configuration (e.g. yarn.resourcemanager.ha.strict-mode.enabled).



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-1311) Fix app specific scheduler-events' names to be app-attempt based

2013-12-12 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846816#comment-13846816
 ] 

Jian He commented on YARN-1311:
---

patch looks good, check it in

 Fix app specific scheduler-events' names to be app-attempt based
 

 Key: YARN-1311
 URL: https://issues.apache.org/jira/browse/YARN-1311
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Priority: Trivial
 Attachments: YARN-1311-20131015.txt, YARN-1311-20131211.1.txt, 
 YARN-1311-20131211.txt


 Today, APP_ADDED and APP_REMOVED are sent to the scheduler. They are 
 misnomers as schedulers only deal with AppAttempts today. This JIRA is for 
 fixing their names so that we can add App-level events in the near future, 
 notably for work-preserving RM-restart.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-1498) RM changes for moving apps between queues

2013-12-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846822#comment-13846822
 ] 

Hadoop QA commented on YARN-1498:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12618484/YARN-1498.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.TestQueueMetrics

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2652//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2652//console

This message is automatically generated.

 RM changes for moving apps between queues
 -

 Key: YARN-1498
 URL: https://issues.apache.org/jira/browse/YARN-1498
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.2.0
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: YARN-1498.patch






--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (YARN-1485) Enabling HA should verify the RM service addresses configurations have been set for every RM Ids defined in RM_HA_IDs

2013-12-12 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-1485:


Attachment: YARN-1485.1.patch

 Enabling HA should verify the RM service addresses configurations have been 
 set for every RM Ids defined in RM_HA_IDs
 -

 Key: YARN-1485
 URL: https://issues.apache.org/jira/browse/YARN-1485
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-1485.1.patch


 After YARN-1325, YarnConfiguration.RM_HA_IDS will contain multiple 
 RM ids. We need to verify that the RM service address configurations have 
 been set for all of the RM ids.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-1463) TestContainerManagerSecurity#testContainerManager fails

2013-12-12 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846865#comment-13846865
 ] 

Haohui Mai commented on YARN-1463:
--

Just discussed this with [~vinodkv]; we believe that the unit tests should be fixed 
as well. Maybe we can fix the YARN tests by specifying the keytabs, just like 
TestSecureNamenode does.

 TestContainerManagerSecurity#testContainerManager fails
 ---

 Key: YARN-1463
 URL: https://issues.apache.org/jira/browse/YARN-1463
 Project: Hadoop YARN
  Issue Type: Test
Reporter: Ted Yu
Assignee: Binglin Chang
 Attachments: YARN-1463.v1.patch, YARN-1463.v2.patch


 Here is stack trace:
 {code}
 testContainerManager[1](org.apache.hadoop.yarn.server.TestContainerManagerSecurity)
   Time elapsed: 1.756 sec   ERROR!
 org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.IOException: 
 ResourceManager failed to start. Final state is STOPPED
   at 
 org.apache.hadoop.yarn.server.MiniYARNCluster$ResourceManagerWrapper.serviceStart(MiniYARNCluster.java:253)
   at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
   at 
 org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
   at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
   at 
 org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:110)
 {code}



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (YARN-1180) Update capacity scheduler docs to include types on the configs

2013-12-12 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated YARN-1180:
--

Attachment: Yarn-1180.patch

 Update capacity scheduler docs to include types on the configs
 --

 Key: YARN-1180
 URL: https://issues.apache.org/jira/browse/YARN-1180
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 3.0.0, 2.1.0-beta, 0.23.9
Reporter: Thomas Graves
Assignee: Chen He
  Labels: documentation, newbie
 Fix For: 2.4.0

 Attachments: Yarn-1180.patch


 The capacity scheduler docs 
 (http://hadoop.apache.org/docs/r2.1.0-beta/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html)
  don't include types for all the configs. For instance, the 
 minimum-user-limit-percent entry doesn't say it's an Int. It is also the only 
 setting for the Resource Allocation configs that is an Int rather than a float.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-1180) Update capacity scheduler docs to include types on the configs

2013-12-12 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13846893#comment-13846893
 ] 

Chen He commented on YARN-1180:
---

patch submitted!

 Update capacity scheduler docs to include types on the configs
 --

 Key: YARN-1180
 URL: https://issues.apache.org/jira/browse/YARN-1180
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 3.0.0, 2.1.0-beta, 0.23.9
Reporter: Thomas Graves
Assignee: Chen He
  Labels: documentation, newbie
 Fix For: 2.4.0

 Attachments: Yarn-1180.patch


 The capacity scheduler docs 
 (http://hadoop.apache.org/docs/r2.1.0-beta/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html)
  don't include types for all the configs. For instance, the 
 minimum-user-limit-percent entry doesn't say it's an Int. It is also the only 
 setting for the Resource Allocation configs that is an Int rather than a float.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-1311) Fix app specific scheduler-events' names to be app-attempt based

2013-12-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13846898#comment-13846898
 ] 

Hudson commented on YARN-1311:
--

FAILURE: Integrated in Hadoop-trunk-Commit #4876 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4876/])
YARN-1311. Fixed app specific scheduler-events' names to be app-attempt based. 
Contributed by Vinod Kumar Vavilapalli (jianhe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1550579)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/AppAddedSchedulerEvent.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/AppAttemptAddedSchedulerEvent.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/AppAttemptRemovedSchedulerEvent.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/AppRemovedSchedulerEvent.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/SchedulerEventType.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/Application.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestFifoScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java


 Fix app specific scheduler-events' names to be app-attempt based
 

 Key: YARN-1311
 URL: https://issues.apache.org/jira/browse/YARN-1311
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Priority: Trivial
 Attachments: YARN-1311-20131015.txt, YARN-1311-20131211.1.txt, 
 YARN-1311-20131211.txt


 Today, APP_ADDED and APP_REMOVED are sent to the scheduler. They are 
 misnomers as schedulers only deal with AppAttempts today. This JIRA is for 
 fixing their names so that we can add App-level events in the near future, 
 notably for work-preserving RM-restart.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-1180) Update capacity scheduler docs to include types on the configs

2013-12-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13846907#comment-13846907
 ] 

Hadoop QA commented on YARN-1180:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12618492/Yarn-1180.patch
  against trunk revision .

{color:red}-1 patch{color}.  Trunk compilation may be broken.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2654//console

This message is automatically generated.

 Update capacity scheduler docs to include types on the configs
 --

 Key: YARN-1180
 URL: https://issues.apache.org/jira/browse/YARN-1180
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 3.0.0, 2.1.0-beta, 0.23.9
Reporter: Thomas Graves
Assignee: Chen He
  Labels: documentation, newbie
 Fix For: 2.4.0

 Attachments: Yarn-1180.patch


 The capacity scheduler docs 
 (http://hadoop.apache.org/docs/r2.1.0-beta/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html)
  don't include types for all the configs. For instance, the 
 minimum-user-limit-percent entry doesn't say it's an Int. It is also the only 
 setting for the Resource Allocation configs that is an Int rather than a float.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-1485) Enabling HA should verify the RM service addresses configurations have been set for every RM Ids defined in RM_HA_IDs

2013-12-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13846914#comment-13846914
 ] 

Hadoop QA commented on YARN-1485:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12618490/YARN-1485.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests:

  org.apache.hadoop.yarn.server.TestContainerManagerSecurity
  org.apache.hadoop.yarn.server.TestRMNMSecretKeys

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2653//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2653//console

This message is automatically generated.

 Enabling HA should verify the RM service addresses configurations have been 
 set for every RM Ids defined in RM_HA_IDs
 -

 Key: YARN-1485
 URL: https://issues.apache.org/jira/browse/YARN-1485
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-1485.1.patch


 After YARN-1325, the YarnConfiguration.RM_HA_IDS will contain multiple 
 RM_Ids. We need to verify that the RM service address configurations have 
 been set for all of the RM_Ids.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-1415) In scheduler UI, including used memory in Memory Total seems to be inaccurate

2013-12-12 Thread Siqi Li (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13846938#comment-13846938
 ] 

Siqi Li commented on YARN-1415:
---

According to the tests in TestQueueMetrics.java, AvailableMB is never deducted 
when memory is allocated to applications; it effectively represents the total 
memory of the cluster. Therefore, the totalMB displayed in the UI should only 
include AvailableMB.


 In scheduler UI, including used memory in Memory Total seems to be 
 inaccurate
 ---

 Key: YARN-1415
 URL: https://issues.apache.org/jira/browse/YARN-1415
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager, scheduler
Reporter: Siqi Li
 Fix For: 2.1.0-beta

 Attachments: 1.png, 2.png


 Memory Total is currently a sum of availableMB, allocatedMB, and 
 reservedMB. 
 It seems that the term availableMB actually means total memory, since it 
 doesn't get decreased when some jobs use a certain amount of memory.
 Hence, either Memory Total should not include allocatedMB, or availableMB 
 is not being updated properly.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-1311) Fix app specific scheduler-events' names to be app-attempt based

2013-12-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13846940#comment-13846940
 ] 

Hudson commented on YARN-1311:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #4877 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4877/])
Reverting YARN-1311. Fixed app specific scheduler-events' names to be 
app-attempt based. Contributed by Vinod Kumar Vavilapalli (jianhe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1550594)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/AppAddedSchedulerEvent.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/AppAttemptAddedSchedulerEvent.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/AppAttemptRemovedSchedulerEvent.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/AppRemovedSchedulerEvent.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/SchedulerEventType.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/Application.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestFifoScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java


 Fix app specific scheduler-events' names to be app-attempt based
 

 Key: YARN-1311
 URL: https://issues.apache.org/jira/browse/YARN-1311
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Priority: Trivial
 Fix For: 2.4.0

 Attachments: YARN-1311-20131015.txt, YARN-1311-20131211.1.txt, 
 YARN-1311-20131211.txt


 Today, APP_ADDED and APP_REMOVED are sent to the scheduler. They are 
 misnomers as schedulers only deal with AppAttempts today. This JIRA is for 
 fixing their names so that we can add App-level events in the near future, 
 notably for work-preserving RM-restart.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

[jira] [Commented] (YARN-1415) In scheduler UI, including used memory in Memory Total seems to be inaccurate

2013-12-12 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13846951#comment-13846951
 ] 

Sandy Ryza commented on YARN-1415:
--

Currently, all the schedulers interpret available MB to mean non-allocated 
memory.  Check out CSQueueUtils.updateQueueStatistics, 
FairScheduler.updateRootQueueMetrics, and FifoScheduler.nodeUpdate.  If 
TestQueueMetrics does not reflect this, it's TestQueueMetrics that is 
misinterpreting.
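
To make the two readings concrete, here is a toy sketch (illustrative only, not the actual QueueMetrics/CSQueueUtils code; all names and numbers are assumptions) of the bookkeeping described above, including the sum the web UI labels Memory Total:

{code}
// Toy bookkeeping, not the real YARN classes: "available" shrinks as containers
// are allocated, and the UI's "Memory Total" is available + allocated (+ reserved).
final class ClusterMemorySketch {
  private long clusterMB;    // capacity reported by all registered nodes
  private long allocatedMB;  // memory currently granted to containers
  private long reservedMB;   // memory reserved for pending container requests

  void nodeAdded(long mb)          { clusterMB += mb; }
  void containerAllocated(long mb) { allocatedMB += mb; }
  void containerReserved(long mb)  { reservedMB += mb; }

  // what the schedulers publish as availableMB: non-allocated capacity
  long availableMB() { return clusterMB - allocatedMB; }

  // what the UI sums into "Memory Total" per the issue description
  long memoryTotalMB() { return availableMB() + allocatedMB + reservedMB; }
}
{code}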

 In scheduler UI, including used memory in Memory Total seems to be 
 inaccurate
 ---

 Key: YARN-1415
 URL: https://issues.apache.org/jira/browse/YARN-1415
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager, scheduler
Reporter: Siqi Li
 Fix For: 2.1.0-beta

 Attachments: 1.png, 2.png


 Memory Total is currently a sum of availableMB, allocatedMB, and 
 reservedMB. 
 It seems that the term availableMB actually means total memory, since it 
 doesn't get decreased when some jobs use a certain amount of memory.
 Hence, either Memory Total should not include allocatedMB, or availableMB 
 is not being updated properly.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-1391) Lost node list should be identify by NodeId

2013-12-12 Thread Gera Shegalov (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13846969#comment-13846969
 ] 

Gera Shegalov commented on YARN-1391:
-

Siqi asked me to chime in. 
An example of a real yet atypical scenario I am aware of is an HPC scale-up 
machine. A single node manager does not scale to manage all the containers that 
can run concurrently there. So you have a choice of either unnecessarily 
fragmenting the machine into a number of smaller VMs/OSes or running multiple NMs 
without any virtualization overhead.

It's always been possible to run multiple TTs in MRv1 as well.

 Lost node list should be identify by NodeId
 ---

 Key: YARN-1391
 URL: https://issues.apache.org/jira/browse/YARN-1391
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.0.5-alpha
Reporter: Siqi Li
Assignee: Siqi Li
 Attachments: YARN-1391.v1.patch


 In the case of multiple node managers on a single machine, each of them should be 
 identified by NodeId, which is more specific than just the host name.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-1311) Fix app specific scheduler-events' names to be app-attempt based

2013-12-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847050#comment-13847050
 ] 

Hudson commented on YARN-1311:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #4878 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4878/])
YARN-1311. Fixed app specific scheduler-events' names to be app-attempt based. 
Contributed by Vinod Kumar Vavilapalli (jianhe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1550613)
* 
/hadoop/common/trunk/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/ResourceSchedulerWrapper.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/AppAddedSchedulerEvent.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/AppAttemptAddedSchedulerEvent.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/AppAttemptRemovedSchedulerEvent.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/AppRemovedSchedulerEvent.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/SchedulerEventType.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/Application.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestFifoScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java


 Fix app specific scheduler-events' names to be app-attempt based
 

 Key: YARN-1311
 URL: https://issues.apache.org/jira/browse/YARN-1311
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Priority: Trivial
 Fix For: 2.4.0

 Attachments: YARN-1311-20131015.txt, YARN-1311-20131211.1.txt, 
 YARN-1311-20131211.txt


 Today, APP_ADDED and APP_REMOVED are sent to the scheduler. They are 
 misnomers as schedulers only deal with AppAttempts today. This JIRA is for 
 fixing their names so that we can add App-level events in the near future, 
 notably for work-preserving RM-restart.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

[jira] [Commented] (YARN-1311) Fix app specific scheduler-events' names to be app-attempt based

2013-12-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847054#comment-13847054
 ] 

Hudson commented on YARN-1311:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #4879 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4879/])
Updated CHANGES.txt for YARN-1311. (jianhe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1550615)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt


 Fix app specific scheduler-events' names to be app-attempt based
 

 Key: YARN-1311
 URL: https://issues.apache.org/jira/browse/YARN-1311
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Priority: Trivial
 Fix For: 2.4.0

 Attachments: YARN-1311-20131015.txt, YARN-1311-20131211.1.txt, 
 YARN-1311-20131211.txt


 Today, APP_ADDED and APP_REMOVED are sent to the scheduler. They are 
 misnomers as schedulers only deal with AppAttempts today. This JIRA is for 
 fixing their names so that we can add App-level events in the near future, 
 notably for work-preserving RM-restart.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-312) Add updateNodeResource in ResourceManagerAdministrationProtocol

2013-12-12 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847062#comment-13847062
 ] 

Junping Du commented on YARN-312:
-

Thanks [~vinodkv] for review and comments! All points make sense to me. Please 
see my reply.
bq. The patch isn't applying anymore. Please update.
Sure. Will update in next patch.
bq. There's a better way to implement the map. See ApplicationACLMapProto in 
yarn_protos.proto for example and its usage. This should avoid the length 
checks in AdminService. In a similar vein, the java APIs can directly deal with 
maps.
Thanks for your suggestion here. Yes. That seems better, will update in next 
patch.
bq. Didn't review the previous patches, but I think we should have a better 
name instead of ResourceOption. Will file a JIRA.
Yes. Please share your idea there. Thanks.
bq. The UpdateNodeResourceRequest and response objects need to be @Public too?
Yes. Nice catch. Will change it to public.
bq. Failure handling: If there is an invalid node, should we reject the change 
completely or partially update all the correctly defined nodes? You've done the 
former. Seems fine. Maybe say the same in the exception message? That we are 
rejecting all requests?
I tried to keep it simple by not allowing partial updates. Will update the 
exception message.
bq. Are you not doing the CLI support for the update resources in this patch? I 
think we should. Here or separate patch.
Yes. This is major work for YARN-313. Make sense?
bq. Again, didn't review previous patch. So we need to fix here or elsewhere: 
RMNode is supposed to be a read-only interface, so setResourceOption() 
doesn't belong there. It should be an event to the node informing it of the change 
in resources.
That's a good point! Can we fix it in a separate JIRA, given this patch is big 
enough and we may want it to stay dedicated to the RPC changes?
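
To make the map-based request shape mentioned above concrete, a rough sketch of what a map-keyed request could look like on the Java side (the class and method names are illustrative assumptions, not the committed YARN-312 API; the node identity and resource setting are modelled as plain strings/ints rather than the real NodeId/ResourceOption records):

{code}
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: keeping one map entry per node means the server never has
// to zip and length-check two parallel lists, which is the point of the review comment.
final class UpdateNodeResourceRequestSketch {
  private final Map<String, Integer> nodeToMemoryMB = new HashMap<>();

  void setNodeResource(String nodeId, int memoryMB) {
    nodeToMemoryMB.put(nodeId, memoryMB);
  }

  Map<String, Integer> getNodeResourceMap() {
    return Collections.unmodifiableMap(nodeToMemoryMB);
  }
}
{code}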



 Add updateNodeResource in ResourceManagerAdministrationProtocol
 ---

 Key: YARN-312
 URL: https://issues.apache.org/jira/browse/YARN-312
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api
Affects Versions: 2.2.0
Reporter: Junping Du
Assignee: Junping Du
 Attachments: YARN-312-v1.patch, YARN-312-v2.patch, YARN-312-v3.patch, 
 YARN-312-v4.1.patch, YARN-312-v4.patch, YARN-312-v5.1.patch, 
 YARN-312-v5.patch, YARN-312-v6.patch, YARN-312-v7.1.patch, 
 YARN-312-v7.1.patch, YARN-312-v7.patch, YARN-312-v8.patch


 Add fundamental RPC (ResourceManagerAdministrationProtocol) to support node's 
 resource change. For design detail, please refer parent JIRA: YARN-291.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-1495) Allow moving apps between queues

2013-12-12 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847081#comment-13847081
 ] 

Vinod Kumar Vavilapalli commented on YARN-1495:
---

Hi Sandy, some questions and quick thoughts on this ticket:
 - Any specific use-case? An example where it can be used, to justify this isn't 
feature creep?
 - What happens when scheduling-constraints are violated? The client will just 
get an error? It kind of depends on the type of scheduling constraint.
 - Who initiates the move: any regular user, or just admins? Given your 
description of ACLs, it seems like anyone.
 - Only running apps can be moved? There are races w.r.t. apps that are 
submitted but not accepted and close-to-completion apps.
 - The ACLs choice seems straightforward and makes sense.

There is some non-trivial stuff that needs ironing out, outside of schedulers.
 - While the move happens,
-- Apps may be in the process of submitting new requests. What happens to 
them? I guess queue-move and new-requests should be synchronized.
-- Preemption monitors will need to be notified, as they know a lot about the 
schedulers but sit outside them.
-- There will be a potentially large change in the headroom for the 
application.

 Allow moving apps between queues
 

 Key: YARN-1495
 URL: https://issues.apache.org/jira/browse/YARN-1495
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: scheduler
Affects Versions: 2.2.0
Reporter: Sandy Ryza
Assignee: Sandy Ryza

 This is an umbrella JIRA for work needed to allow moving YARN applications 
 from one queue to another.  The work will consist of additions in the command 
 line options, additions in the client RM protocol, and changes in the 
 schedulers to support this.
 I have a picture of how this should function in the Fair Scheduler, but I'm 
 not familiar enough with the Capacity Scheduler for the same there.  
 Ultimately, the decision as to whether an application can be moved should go 
 down to the scheduler - some schedulers may wish not to support this at all.  
 However, schedulers that do support it should share some common semantics 
 around ACLs and what happens to running containers.
 Here is how I see the general semantics working out:
 * A move request is issued by the client.  After it gets past ACLs, the 
 scheduler checks whether executing the move will violate any constraints. For 
 the Fair Scheduler, these would be queue maxRunningApps and queue 
 maxResources constraints
 * All running containers are transferred from the old queue to the new queue
 * All outstanding requests are transferred from the old queue to the new queue
 Here is how I see the ACLs of this working out:
 * To move an app from a queue a user must have modify access on the app or 
 administer access on the queue
 * To move an app to a queue a user must have submit access on the queue or 
 administer access on the queue 



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-312) Add updateNodeResource in ResourceManagerAdministrationProtocol

2013-12-12 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847084#comment-13847084
 ] 

Vinod Kumar Vavilapalli commented on YARN-312:
--

Sure, go ahead and update the patch. Tx.

 Add updateNodeResource in ResourceManagerAdministrationProtocol
 ---

 Key: YARN-312
 URL: https://issues.apache.org/jira/browse/YARN-312
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api
Affects Versions: 2.2.0
Reporter: Junping Du
Assignee: Junping Du
 Attachments: YARN-312-v1.patch, YARN-312-v2.patch, YARN-312-v3.patch, 
 YARN-312-v4.1.patch, YARN-312-v4.patch, YARN-312-v5.1.patch, 
 YARN-312-v5.patch, YARN-312-v6.patch, YARN-312-v7.1.patch, 
 YARN-312-v7.1.patch, YARN-312-v7.patch, YARN-312-v8.patch


 Add fundamental RPC (ResourceManagerAdministrationProtocol) to support node's 
 resource change. For design detail, please refer parent JIRA: YARN-291.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (YARN-1029) Allow embedding leader election into the RM

2013-12-12 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-1029:
---

Attachment: yarn-1029-0.patch

Patch with tests for automatic and manual failover. The major pending item is 
adding the configs and their descriptions to yarn-default.xml. Will address that 
once we agree that using ActiveStandbyElector is the simpler approach.

BTW, this patch is to be applied on top of the latest one for YARN-1028.

 Allow embedding leader election into the RM
 ---

 Key: YARN-1029
 URL: https://issues.apache.org/jira/browse/YARN-1029
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Karthik Kambatla
 Attachments: embedded-zkfc-approach.patch, yarn-1029-0.patch, 
 yarn-1029-0.patch, yarn-1029-approach.patch


 It should be possible to embed common ActiveStandyElector into the RM such 
 that ZooKeeper based leader election and notification is in-built. In 
 conjunction with a ZK state store, this configuration will be a simple 
 deployment option.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-1495) Allow moving apps between queues

2013-12-12 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847108#comment-13847108
 ] 

Sandy Ryza commented on YARN-1495:
--

Thanks for taking a look Vinod.

bq. Any specific use-case? Example where it can be used? To justify this isn't 
feature creep.
Yeah, we've seen requests for this a few times.  I think the most common 
scenario is that someone's job is running slowly because of the queue it's in, 
and the job needs to be placed in a queue where it can complete more 
quickly.  This can occur because it's taking longer than expected and a 
deadline is approaching, the original queue is fuller than expected, the job 
was submitted incorrectly in the first place but has made some progress, or for 
a number of other reasons.

bq. What happens when scheduling-constraints are violated? The client will just 
get an error? It kind of depends on the type of scheduling constraint.
Not sure how this should play out for the Capacity Scheduler, but for the Fair 
Scheduler constraints I mentioned in the description I think the client should 
get an error. I suppose another option would be to kill containers until the 
constraints are satisfied, but I think this is a lot more work and not 
clearly better behavior.

bq. Who initiates the move any regular user or just admins?
My opinion is any regular user, within ACLs.  I.e. if I could kill my job and 
resubmit it to a different queue, I should be able to move it.

bq. Only running apps can be moved?
I don't see a reason that we shouldn't be able to move an app that has been 
submitted, but not accepted, or that is very close to completion.  In some 
cases we may not need to touch the scheduler.  There are definitely race 
conditions we need to be careful of here.

bq. Apps may be in the process of submitting new requests. What happens to 
them? I guess queue-move and new-requests should be synchronized.
Right.


 Allow moving apps between queues
 

 Key: YARN-1495
 URL: https://issues.apache.org/jira/browse/YARN-1495
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: scheduler
Affects Versions: 2.2.0
Reporter: Sandy Ryza
Assignee: Sandy Ryza

 This is an umbrella JIRA for work needed to allow moving YARN applications 
 from one queue to another.  The work will consist of additions in the command 
 line options, additions in the client RM protocol, and changes in the 
 schedulers to support this.
 I have a picture of how this should function in the Fair Scheduler, but I'm 
 not familiar enough with the Capacity Scheduler for the same there.  
 Ultimately, the decision as to whether an application can be moved should go 
 down to the scheduler - some schedulers may wish not to support this at all.  
 However, schedulers that do support it should share some common semantics 
 around ACLs and what happens to running containers.
 Here is how I see the general semantics working out:
 * A move request is issued by the client.  After it gets past ACLs, the 
 scheduler checks whether executing the move will violate any constraints. For 
 the Fair Scheduler, these would be queue maxRunningApps and queue 
 maxResources constraints
 * All running containers are transferred from the old queue to the new queue
 * All outstanding requests are transferred from the old queue to the new queue
 Here is how I see the ACLs of this working out:
 * To move an app from a queue a user must have modify access on the app or 
 administer access on the queue
 * To move an app to a queue a user must have submit access on the queue or 
 administer access on the queue 



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (YARN-1498) Common scheduler changes for moving apps between queues

2013-12-12 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1498:
-

Attachment: YARN-1498-1.patch

 Common scheduler changes for moving apps between queues
 ---

 Key: YARN-1498
 URL: https://issues.apache.org/jira/browse/YARN-1498
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.2.0
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: YARN-1498-1.patch, YARN-1498.patch






--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (YARN-1498) Common scheduler changes for moving apps between queues

2013-12-12 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1498:
-

Summary: Common scheduler changes for moving apps between queues  (was: RM 
changes for moving apps between queues)

 Common scheduler changes for moving apps between queues
 ---

 Key: YARN-1498
 URL: https://issues.apache.org/jira/browse/YARN-1498
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.2.0
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: YARN-1498.patch






--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (YARN-1498) Common scheduler changes for moving apps between queues

2013-12-12 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1498:
-

Description: This JIRA is to track changes that aren't in particular 
schedulers but that help them support moving apps between queues.

 Common scheduler changes for moving apps between queues
 ---

 Key: YARN-1498
 URL: https://issues.apache.org/jira/browse/YARN-1498
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.2.0
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: YARN-1498-1.patch, YARN-1498.patch


 This JIRA is to track changes that aren't in particular schedulers but that 
 help them support moving apps between queues.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-1498) Common scheduler changes for moving apps between queues

2013-12-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847136#comment-13847136
 ] 

Hadoop QA commented on YARN-1498:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12618534/YARN-1498-1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2655//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2655//console

This message is automatically generated.

 Common scheduler changes for moving apps between queues
 ---

 Key: YARN-1498
 URL: https://issues.apache.org/jira/browse/YARN-1498
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.2.0
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: YARN-1498-1.patch, YARN-1498.patch


 This JIRA is to track changes that aren't in particular schedulers but that 
 help them support moving apps between queues.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-1495) Allow moving apps between queues

2013-12-12 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847134#comment-13847134
 ] 

Sandy Ryza commented on YARN-1495:
--

Also, a coding question you can maybe provide guidance on?

Ideally, we would like to return from the RPC with whether or not the operation 
succeeded.  However, we need to go down through the app, app attempt, and 
finally, the scheduler to determine this.  We could achieve this in a couple of 
ways:
* Use an async event at each level, as is the convention (e.g. as is done for 
killing an application).  Have the call in ClientRMService block and wait for 
things to get sorted out lower down before returning.  Not entirely sure what 
we would wait for, because the ClientRMService itself doesn't receive events.  
A Future might be clean (see the sketch below).
* Bypass events and go synchronously through to the scheduler.
Is one of these preferred?  Is there a third path I'm missing?
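
A minimal sketch of the Future-based option, with every name assumed for illustration (this is not the eventual implementation): the RPC handler posts an event carrying a future and blocks on it, while the scheduler-side handler completes it once the move is applied or rejected.

{code}
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

// Illustrative event carrying a future that the scheduler side completes.
final class MoveAppEventSketch {
  final String appId;
  final String targetQueue;
  final CompletableFuture<Void> result = new CompletableFuture<>();

  MoveAppEventSketch(String appId, String targetQueue) {
    this.appId = appId;
    this.targetQueue = targetQueue;
  }
}

final class MoveRpcSketch {
  // What the client-facing RPC handler could do: dispatch the event, then wait.
  static void handleMoveRpc(MoveAppEventSketch event)
      throws ExecutionException, InterruptedException {
    dispatchToScheduler(event);   // asynchronous hand-off in the real system
    event.result.get();           // block until the scheduler reports the outcome
  }

  // What the scheduler-side handler could do once it has processed the event.
  static void dispatchToScheduler(MoveAppEventSketch event) {
    try {
      // perform ACL checks and the actual queue move here
      event.result.complete(null);             // success unblocks the RPC thread
    } catch (RuntimeException e) {
      event.result.completeExceptionally(e);   // failure surfaces as an exception
    }
  }
}
{code}

The synchronous alternative (the second bullet) would instead call into the scheduler directly from the RPC handler and return its result on the same thread.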

 Allow moving apps between queues
 

 Key: YARN-1495
 URL: https://issues.apache.org/jira/browse/YARN-1495
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: scheduler
Affects Versions: 2.2.0
Reporter: Sandy Ryza
Assignee: Sandy Ryza

 This is an umbrella JIRA for work needed to allow moving YARN applications 
 from one queue to another.  The work will consist of additions in the command 
 line options, additions in the client RM protocol, and changes in the 
 schedulers to support this.
 I have a picture of how this should function in the Fair Scheduler, but I'm 
 not familiar enough with the Capacity Scheduler for the same there.  
 Ultimately, the decision as to whether an application can be moved should go 
 down to the scheduler - some schedulers may wish not to support this at all.  
 However, schedulers that do support it should share some common semantics 
 around ACLs and what happens to running containers.
 Here is how I see the general semantics working out:
 * A move request is issued by the client.  After it gets past ACLs, the 
 scheduler checks whether executing the move will violate any constraints. For 
 the Fair Scheduler, these would be queue maxRunningApps and queue 
 maxResources constraints
 * All running containers are transferred from the old queue to the new queue
 * All outstanding requests are transferred from the old queue to the new queue
 Here is how I see the ACLs of this working out:
 * To move an app from a queue a user must have modify access on the app or 
 administer access on the queue
 * To move an app to a queue a user must have submit access on the queue or 
 administer access on the queue 



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (YARN-1363) Get / Cancel / Renew delegation token api should be non blocking

2013-12-12 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-1363:
--

Attachment: YARN-1363.3.patch

Uploaded a new patch which includes test cases and some bug fixes.

 Get / Cancel / Renew delegation token api should be non blocking
 

 Key: YARN-1363
 URL: https://issues.apache.org/jira/browse/YARN-1363
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Omkar Vinit Joshi
Assignee: Zhijie Shen
 Attachments: YARN-1363.1.patch, YARN-1363.2.patch, YARN-1363.3.patch


 Today GetDelegationToken, CancelDelegationToken and RenewDelegationToken are 
 all blocking APIs.
 * As a part of these calls we try to update the RMStateStore, and that may slow 
 the call down.
 * Since we have a limited number of client request handlers, we may fill them 
 up quickly.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-1413) [YARN-321] AHS WebUI should serve aggregated logs as well

2013-12-12 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847208#comment-13847208
 ] 

Zhijie Shen commented on YARN-1413:
---

Some comments:

1. The wrong javadoc below:
{code}
+  /*
+   * (non-Javadoc)
+   * 
+   * @see
+   * org.apache.hadoop.mapreduce.v2.hs.webapp.AHSView#preHead(org.apache.hadoop
+   * .yarn.webapp.hamlet.Hamlet.HTML)
+   */
{code}
{code}
+  /**
+   * The content of this page is the JobBlock
+   * 
+   * @return HsJobBlock.class
+   */
{code}

2. I think the better way to construct the logURL in attempt/container blocks 
is to use ContainerReport.getLogURL directly (adding host:port prefix), instead 
of combining several attributes. The logURL should be set correctly in 
RMContainer final transition.
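
For illustration, a hedged sketch of point 2; only ContainerReport#getLogUrl (spelled getLogURL in the comment above) is taken from the discussion, while the wrapper class, method name, and scheme handling are assumptions:

{code}
import org.apache.hadoop.yarn.api.records.ContainerReport;

// Illustrative helper: build the full link by prefixing the serving web app's
// host:port onto the log URL already recorded in the container report.
final class LogLinkSketch {
  static String buildContainerLogLink(String logWebAppHostPort, ContainerReport report) {
    return "http://" + logWebAppHostPort + report.getLogUrl();
  }
}
{code}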


 [YARN-321] AHS WebUI should serve aggregated logs as well
 --

 Key: YARN-1413
 URL: https://issues.apache.org/jira/browse/YARN-1413
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Mayank Bansal
 Attachments: YARN-1413-1.patch






--
This message was sent by Atlassian JIRA
(v6.1.4#6159)