[jira] [Commented] (YARN-690) RM exits on token cancel/renew problems

2013-05-17 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660388#comment-13660388
 ] 

Vinod Kumar Vavilapalli commented on YARN-690:
--

Bobby, the fix went in a little too fast for any of us to notice, you should 
give others a bit of time to be able to look at it. Tx.

While this is a quick fix that should help, we should think of more long term 
solutions - specifically looking for correct exceptions etc. After our recent 
exception work, mainly after YARN-628 and MAPREDUCE-5254, we can look for 
IOException specifically. Is that enough?

 RM exits on token cancel/renew problems
 ---

 Key: YARN-690
 URL: https://issues.apache.org/jira/browse/YARN-690
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 3.0.0, 0.23.7, 2.0.5-beta
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Blocker
 Fix For: 3.0.0, 2.0.5-beta, 0.23.8

 Attachments: YARN-690.patch, YARN-690.patch


 The DelegationTokenRenewer thread is critical to the RM.  When a 
 non-IOException occurs, the thread calls System.exit to prevent the RM from 
 running w/o the thread.  It should be exiting only on non-RuntimeExceptions.
 The problem is especially bad in 23 because the yarn protobuf layer converts 
 IOExceptions into UndeclaredThrowableExceptions (RuntimeException) which 
 causes the renewer to abort the process.  An UnknownHostException takes down 
 the RM...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-662) Enforce required parameters for all the protocols

2013-05-17 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660396#comment-13660396
 ] 

Vinod Kumar Vavilapalli commented on YARN-662:
--

We can do it server side as well as on the client side, but clearly server-side 
is a must. If isn't a big change, protocol by protocol, we should fix both the 
server side as well as the client libraries. It'll be easier that way both for 
dev and review.

bq. Does this include making null checks etc on all incoming fields in the API 
handlers?
Yes, these would essentially be null checks.

 Enforce required parameters for all the protocols
 -

 Key: YARN-662
 URL: https://issues.apache.org/jira/browse/YARN-662
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Siddharth Seth
Assignee: Zhijie Shen

 All proto fields are marked as options. We need to mark some of them as 
 requried, or enforce these server side. Server side is likely better since 
 that's more flexible (Example deprecating a field type in favour of another - 
 either of the two must be present)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-695) masterContainer and status are in ApplicationReportProto but not in ApplicationReport

2013-05-17 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-695:
-

Issue Type: Sub-task  (was: Bug)
Parent: YARN-386

 masterContainer and status are in ApplicationReportProto but not in 
 ApplicationReport
 -

 Key: YARN-695
 URL: https://issues.apache.org/jira/browse/YARN-695
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Zhijie Shen

 If masterContainer and status are no longer part of ApplicationReport, they 
 should be removed from proto as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-617) In unsercure mode, AM can fake resource requirements

2013-05-17 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660400#comment-13660400
 ] 

Vinod Kumar Vavilapalli commented on YARN-617:
--

This looks good, checking it in..

 In unsercure mode, AM can fake resource requirements 
 -

 Key: YARN-617
 URL: https://issues.apache.org/jira/browse/YARN-617
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Omkar Vinit Joshi
Priority: Minor
 Attachments: YARN-617.20130501.1.patch, YARN-617.20130501.patch, 
 YARN-617.20130502.patch, YARN-617-20130507.patch, YARN-617.20130508.patch, 
 YARN-617-20130513.patch, YARN-617-20130515.patch, 
 YARN-617-20130516.branch-2.patch, YARN-617-20130516.trunk.patch


 Without security, it is impossible to completely avoid AMs faking resources. 
 We can at the least make it as difficult as possible by using the same 
 container tokens and the RM-NM shared key mechanism over unauthenticated 
 RM-NM channel.
 In the minimum, this will avoid accidental bugs in AMs in unsecure mode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-617) In unsercure mode, AM can fake resource requirements

2013-05-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660405#comment-13660405
 ] 

Hudson commented on YARN-617:
-

Integrated in Hadoop-trunk-Commit #3764 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3764/])
YARN-617. Made ContainerTokens to be used for validation at NodeManager 
also in unsecure mode to prevent AMs from faking resource requirements in 
unsecure mode. Contributed by Omkar Vinit Joshi. (Revision 1483667)

 Result = SUCCESS
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1483667
Files : 
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncherImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/application/ApplicationImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/security/NMContainerTokenSecretManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/DummyContainerManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/LocalRMInterface.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/MockNodeStatusUpdater.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestEventFlow.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerReboot.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerShutdown.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdater.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/BaseContainerManagerTest.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/application/TestApplication.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/krb5.conf
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AppSchedulable.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNodes.java
* 

[jira] [Commented] (YARN-628) Fix YarnException unwrapping

2013-05-17 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660411#comment-13660411
 ] 

Siddharth Seth commented on YARN-628:
-

Good catch on the javadoc. That should be fixed.

bq. The only solution is creating a union (IOException, YarnRemoteException)
Or just using a Throwable with an instance check, but that check seems 
unnecessary. One downside to the current approach (other than the ugly code) is 
that the trace for the exception includes the call to the unwrap method.

 Fix YarnException unwrapping
 

 Key: YARN-628
 URL: https://issues.apache.org/jira/browse/YARN-628
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.0.4-alpha
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Fix For: 2.0.5-beta

 Attachments: YARN-628.txt, YARN-628.txt, YARN-628.txt, YARN-628.txt.2


 Unwrapping of YarnRemoteExceptions (currently in YarnRemoteExceptionPBImpl, 
 RPCUtil post YARN-625) is broken, and often ends up throwin 
 UndeclaredThrowableException. This needs to be fixed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-617) In unsercure mode, AM can fake resource requirements

2013-05-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660548#comment-13660548
 ] 

Hudson commented on YARN-617:
-

Integrated in Hadoop-Yarn-trunk #212 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/212/])
YARN-617. Made ContainerTokens to be used for validation at NodeManager 
also in unsecure mode to prevent AMs from faking resource requirements in 
unsecure mode. Contributed by Omkar Vinit Joshi. (Revision 1483667)

 Result = FAILURE
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1483667
Files : 
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncherImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/application/ApplicationImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/security/NMContainerTokenSecretManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/DummyContainerManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/LocalRMInterface.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/MockNodeStatusUpdater.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestEventFlow.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerReboot.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerShutdown.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdater.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/BaseContainerManagerTest.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/application/TestApplication.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/krb5.conf
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AppSchedulable.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNodes.java
* 

[jira] [Commented] (YARN-690) RM exits on token cancel/renew problems

2013-05-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660550#comment-13660550
 ] 

Hudson commented on YARN-690:
-

Integrated in Hadoop-Yarn-trunk #212 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/212/])
YARN-690. RM exits on token cancel/renew problems (daryn via bobby) 
(Revision 1483578)

 Result = FAILURE
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1483578
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java


 RM exits on token cancel/renew problems
 ---

 Key: YARN-690
 URL: https://issues.apache.org/jira/browse/YARN-690
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 3.0.0, 0.23.7, 2.0.5-beta
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Blocker
 Fix For: 3.0.0, 2.0.5-beta, 0.23.8

 Attachments: YARN-690.patch, YARN-690.patch


 The DelegationTokenRenewer thread is critical to the RM.  When a 
 non-IOException occurs, the thread calls System.exit to prevent the RM from 
 running w/o the thread.  It should be exiting only on non-RuntimeExceptions.
 The problem is especially bad in 23 because the yarn protobuf layer converts 
 IOExceptions into UndeclaredThrowableExceptions (RuntimeException) which 
 causes the renewer to abort the process.  An UnknownHostException takes down 
 the RM...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-690) RM exits on token cancel/renew problems

2013-05-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660678#comment-13660678
 ] 

Hudson commented on YARN-690:
-

Integrated in Hadoop-Hdfs-0.23-Build #610 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/610/])
svn merge -c 1483578 FIXES: YARN-690. RM exits on token cancel/renew 
problems (daryn via bobby) (Revision 1483581)

 Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1483581
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java


 RM exits on token cancel/renew problems
 ---

 Key: YARN-690
 URL: https://issues.apache.org/jira/browse/YARN-690
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 3.0.0, 0.23.7, 2.0.5-beta
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Blocker
 Fix For: 3.0.0, 2.0.5-beta, 0.23.8

 Attachments: YARN-690.patch, YARN-690.patch


 The DelegationTokenRenewer thread is critical to the RM.  When a 
 non-IOException occurs, the thread calls System.exit to prevent the RM from 
 running w/o the thread.  It should be exiting only on non-RuntimeExceptions.
 The problem is especially bad in 23 because the yarn protobuf layer converts 
 IOExceptions into UndeclaredThrowableExceptions (RuntimeException) which 
 causes the renewer to abort the process.  An UnknownHostException takes down 
 the RM...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-617) In unsercure mode, AM can fake resource requirements

2013-05-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660692#comment-13660692
 ] 

Hudson commented on YARN-617:
-

Integrated in Hadoop-Hdfs-trunk #1401 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1401/])
YARN-617. Made ContainerTokens to be used for validation at NodeManager 
also in unsecure mode to prevent AMs from faking resource requirements in 
unsecure mode. Contributed by Omkar Vinit Joshi. (Revision 1483667)

 Result = FAILURE
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1483667
Files : 
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncherImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/application/ApplicationImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/security/NMContainerTokenSecretManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/DummyContainerManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/LocalRMInterface.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/MockNodeStatusUpdater.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestEventFlow.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerReboot.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerShutdown.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdater.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/BaseContainerManagerTest.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/application/TestApplication.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/krb5.conf
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AppSchedulable.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNodes.java
* 

[jira] [Commented] (YARN-690) RM exits on token cancel/renew problems

2013-05-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660694#comment-13660694
 ] 

Hudson commented on YARN-690:
-

Integrated in Hadoop-Hdfs-trunk #1401 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1401/])
YARN-690. RM exits on token cancel/renew problems (daryn via bobby) 
(Revision 1483578)

 Result = FAILURE
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1483578
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java


 RM exits on token cancel/renew problems
 ---

 Key: YARN-690
 URL: https://issues.apache.org/jira/browse/YARN-690
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 3.0.0, 0.23.7, 2.0.5-beta
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Blocker
 Fix For: 3.0.0, 2.0.5-beta, 0.23.8

 Attachments: YARN-690.patch, YARN-690.patch


 The DelegationTokenRenewer thread is critical to the RM.  When a 
 non-IOException occurs, the thread calls System.exit to prevent the RM from 
 running w/o the thread.  It should be exiting only on non-RuntimeExceptions.
 The problem is especially bad in 23 because the yarn protobuf layer converts 
 IOExceptions into UndeclaredThrowableExceptions (RuntimeException) which 
 causes the renewer to abort the process.  An UnknownHostException takes down 
 the RM...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-617) In unsercure mode, AM can fake resource requirements

2013-05-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660712#comment-13660712
 ] 

Hudson commented on YARN-617:
-

Integrated in Hadoop-Mapreduce-trunk #1428 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1428/])
YARN-617. Made ContainerTokens to be used for validation at NodeManager 
also in unsecure mode to prevent AMs from faking resource requirements in 
unsecure mode. Contributed by Omkar Vinit Joshi. (Revision 1483667)

 Result = FAILURE
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1483667
Files : 
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncherImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/application/ApplicationImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/security/NMContainerTokenSecretManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/DummyContainerManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/LocalRMInterface.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/MockNodeStatusUpdater.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestEventFlow.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerReboot.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerShutdown.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdater.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/BaseContainerManagerTest.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/application/TestApplication.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/krb5.conf
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AppSchedulable.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNodes.java
* 

[jira] [Commented] (YARN-689) Add multiplier unit to resourcecapabilities

2013-05-17 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660779#comment-13660779
 ] 

Alejandro Abdelnur commented on YARN-689:
-

The headroom needed to run a container, given all the typical dependencies, is 
significant. Having the multiplier decoupled from the minimum allows a better 
tuning of the cluster to maximize utilization/allocation.

As I was trying to exemplify in the description, take the current min default 
is 1GB, MRAM (via YARNRunner) asks for 1.5GB as default, meaning you are 
getting alway 2GB.


 Add multiplier unit to resourcecapabilities
 ---

 Key: YARN-689
 URL: https://issues.apache.org/jira/browse/YARN-689
 Project: Hadoop YARN
  Issue Type: Bug
  Components: api, scheduler
Affects Versions: 2.0.4-alpha
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Attachments: YARN-689.patch


 Currently we overloading the minimum resource value as the actual multiplier 
 used by the scheduler.
 Today with a minimum memory set to 1GB, requests for 1.5GB are always 
 translated to allocation of 2GB.
 We should decouple the minimum allocation from the multiplier.
 The multiplier should also be exposed to the client via the 
 RegisterApplicationMasterResponse

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-689) Add multiplier unit to resourcecapabilities

2013-05-17 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660840#comment-13660840
 ] 

Hitesh Shah commented on YARN-689:
--

The assumption you seem to be making is that all applications will require the 
same minimum headroom. This may be the case with all MR jobs but need not be 
true for other applications running on the same cluster. This could be solved 
by just setting minimum allocation to 512 MB. 

 Add multiplier unit to resourcecapabilities
 ---

 Key: YARN-689
 URL: https://issues.apache.org/jira/browse/YARN-689
 Project: Hadoop YARN
  Issue Type: Bug
  Components: api, scheduler
Affects Versions: 2.0.4-alpha
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Attachments: YARN-689.patch


 Currently we overloading the minimum resource value as the actual multiplier 
 used by the scheduler.
 Today with a minimum memory set to 1GB, requests for 1.5GB are always 
 translated to allocation of 2GB.
 We should decouple the minimum allocation from the multiplier.
 The multiplier should also be exposed to the client via the 
 RegisterApplicationMasterResponse

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-689) Add multiplier unit to resourcecapabilities

2013-05-17 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660874#comment-13660874
 ] 

Alejandro Abdelnur commented on YARN-689:
-

There is a headroom that YARN requires, that is YARN_MINIMUM = YARN_HEADROOM.

Then, as you said, there it is headroom specific to each up, APP_MINIMUM = 
YARN_MINIMUM + APP_HEADROOM.

Then, there is the multiplier that defines the valid increments (which are used 
for normalization).

Going to a concret example, using increments of 256MB seems a reasonable unit, 
but if you set your YARN_MINIMUM to 256MB you'll run OOM doing basic stuff.

Furthermore, given how things have been in the past, I see the headroom 
required by the framework growing, thus requiring the YARN_MIN to increase. And 
if the minimum stays tied to the multipler it will lead to under utilization.

What is the concern of having a multiplier that allows decoupling the minimum 
from the multiplier? What is conceptually wrong with it? We could have the 
default tied up to minimum, thus preserving current behavior.


 Add multiplier unit to resourcecapabilities
 ---

 Key: YARN-689
 URL: https://issues.apache.org/jira/browse/YARN-689
 Project: Hadoop YARN
  Issue Type: Bug
  Components: api, scheduler
Affects Versions: 2.0.4-alpha
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Attachments: YARN-689.patch


 Currently we overloading the minimum resource value as the actual multiplier 
 used by the scheduler.
 Today with a minimum memory set to 1GB, requests for 1.5GB are always 
 translated to allocation of 2GB.
 We should decouple the minimum allocation from the multiplier.
 The multiplier should also be exposed to the client via the 
 RegisterApplicationMasterResponse

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-392) Make it possible to schedule to specific nodes without dropping locality

2013-05-17 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660904#comment-13660904
 ] 

Bikas Saha commented on YARN-392:
-

This is confusing me. The purpose of this jira is to add support for specific 
nodes/racks for scheduling. ie dont relax locality automatically. In that 
context, what does it mean to disable allocation of containers on a node which 
sounds like blacklisting the node? 

 Make it possible to schedule to specific nodes without dropping locality
 

 Key: YARN-392
 URL: https://issues.apache.org/jira/browse/YARN-392
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Sandy Ryza
 Attachments: YARN-392-1.patch, YARN-392-2.patch, YARN-392-2.patch, 
 YARN-392-2.patch, YARN-392-3.patch, YARN-392-4.patch, YARN-392.patch


 Currently its not possible to specify scheduling requests for specific nodes 
 and nowhere else. The RM automatically relaxes locality to rack and * and 
 assigns non-specified machines to the app.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-638) Restore RMDelegationTokens after RM Restart

2013-05-17 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-638:
-

Attachment: YARN-638.12.patch

New patch loaded.
bq.Should this be AssertEquals (same as done earlier for RM1 in the test)?
new rm has its own master keys when it starts
After the change, testRMDTMasterKeyStateOnRollingMasterKey takes 4.9s and 
testRemoveExpiredMasterKeyInRMStateStore takes 1.8s

 Restore RMDelegationTokens after RM Restart
 ---

 Key: YARN-638
 URL: https://issues.apache.org/jira/browse/YARN-638
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Jian He
Assignee: Jian He
 Attachments: YARN-638.10.patch, YARN-638.11.patch, YARN-638.12.patch, 
 YARN-638.1.patch, YARN-638.2.patch, YARN-638.3.patch, YARN-638.4.patch, 
 YARN-638.5.patch, YARN-638.6.patch, YARN-638.7.patch, YARN-638.8.patch, 
 YARN-638.9.patch


 This is missed in YARN-581. After RM restart, RMDelegationTokens need to be 
 added both in DelegationTokenRenewer (addressed in YARN-581), and 
 delegationTokenSecretManager

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-689) Add multiplier unit to resourcecapabilities

2013-05-17 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660919#comment-13660919
 ] 

Hitesh Shah commented on YARN-689:
--

I am not sure I understand the headroom references that you gave. 

I will try and explain my understanding:

There is no headroom that YARN needs. It runs a container via a shell script 
that can in turn launch a shell script ( in case of distributed shell ) or a 
jvm ( in case of MR ). To run such a container, it does not require a minimum 
set of resources. The shell container could run with say 100 MB or even less, 
where as the MR case due to how MR tasks work may need anywhere from 1 GB to 2 
GB. 

The minimum allocation defined at the scheduler level in RM is actually just 
the multiplier. Maybe, calling it minimum is confusing and we could change it 
to be slot size or multiplier value? 

Lets consider 4 application types: 
  - App1 needs 1.5 GB sized container .
  - App2 needs 2 GB sized containers
  - App3 needs 400 MB sized containers. 
  - App4 needs 1 GB sized containers.

From the above, the simplest answer will be to set the slot size to 512 MB ( 
which is currently set using the minimum allocation config property ). Each 
application has its own set of defaults which can then be translated into 
multiples of slot sizes. 

Currently, MR has 3 settings:
  - Application Master memory : default is 1.5 GB
  - map memory : default is 1 GB
  - reduce memory : default is 1 GB

Due to yarn's default configs of 1 GB slot size ( and max container size of 8 
GB ), the AM ends up taking 2 GB and the maps/reduces take up 1 GB each. To 
make the AM use 1.5 GB containers ( instead of 2 GB ), yarn's minimum 
allocation (slot size) could be changed to 512 MB. 

Question is whether we need an additional minimum configuration on top of the 
already available multiplier setting? 











 Add multiplier unit to resourcecapabilities
 ---

 Key: YARN-689
 URL: https://issues.apache.org/jira/browse/YARN-689
 Project: Hadoop YARN
  Issue Type: Bug
  Components: api, scheduler
Affects Versions: 2.0.4-alpha
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Attachments: YARN-689.patch


 Currently we overloading the minimum resource value as the actual multiplier 
 used by the scheduler.
 Today with a minimum memory set to 1GB, requests for 1.5GB are always 
 translated to allocation of 2GB.
 We should decouple the minimum allocation from the multiplier.
 The multiplier should also be exposed to the client via the 
 RegisterApplicationMasterResponse

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-695) masterContainer and status are in ApplicationReportProto but not in ApplicationReport

2013-05-17 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660920#comment-13660920
 ] 

Zhijie Shen commented on YARN-695:
--

Status has been used in ApplicationReportPBImpl, but the code looks buggy. The 
following two setters both check applicationId and clear status.

{code}
  @Override
  public void setApplicationId(ApplicationId applicationId) {
maybeInitBuilder();
if (applicationId == null)
  builder.clearStatus();
this.applicationId = applicationId;
  }

  @Override
  public void setCurrentApplicationAttemptId(ApplicationAttemptId 
applicationAttemptId) {
maybeInitBuilder();
if (applicationId == null)
  builder.clearStatus();
this.currentApplicationAttemptId = applicationAttemptId;
  }
{code}

 masterContainer and status are in ApplicationReportProto but not in 
 ApplicationReport
 -

 Key: YARN-695
 URL: https://issues.apache.org/jira/browse/YARN-695
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Zhijie Shen

 If masterContainer and status are no longer part of ApplicationReport, they 
 should be removed from proto as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-638) Restore RMDelegationTokens after RM Restart

2013-05-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660925#comment-13660925
 ] 

Hadoop QA commented on YARN-638:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12583668/YARN-638.12.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/949//console

This message is automatically generated.

 Restore RMDelegationTokens after RM Restart
 ---

 Key: YARN-638
 URL: https://issues.apache.org/jira/browse/YARN-638
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Jian He
Assignee: Jian He
 Attachments: YARN-638.10.patch, YARN-638.11.patch, YARN-638.12.patch, 
 YARN-638.1.patch, YARN-638.2.patch, YARN-638.3.patch, YARN-638.4.patch, 
 YARN-638.5.patch, YARN-638.6.patch, YARN-638.7.patch, YARN-638.8.patch, 
 YARN-638.9.patch


 This is missed in YARN-581. After RM restart, RMDelegationTokens need to be 
 added both in DelegationTokenRenewer (addressed in YARN-581), and 
 delegationTokenSecretManager

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-613) Create NM proxy per NM instead of per container

2013-05-17 Thread Omkar Vinit Joshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Omkar Vinit Joshi updated YARN-613:
---

Attachment: AMNMToken.docx

 Create NM proxy per NM instead of per container
 ---

 Key: YARN-613
 URL: https://issues.apache.org/jira/browse/YARN-613
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Omkar Vinit Joshi
 Attachments: AMNMToken.docx


 Currently a new NM proxy has to be created per container since the secure 
 authentication is using a containertoken from the container.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-689) Add multiplier unit to resourcecapabilities

2013-05-17 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660943#comment-13660943
 ] 

Alejandro Abdelnur commented on YARN-689:
-

Got the point of a container not having a headroom (I was just thinking of JVM 
containers which typically end up having the framework in the classpath).

Unless I'm mistaken, with the current logic I could not run a shell app using 
less than the 'minimum', which happens to be the multiplier, correct?

If we decide a separate multiplier has not merits, then we should rename the 
current minimum to multiplier (and indicate it works with base 1).

 Add multiplier unit to resourcecapabilities
 ---

 Key: YARN-689
 URL: https://issues.apache.org/jira/browse/YARN-689
 Project: Hadoop YARN
  Issue Type: Bug
  Components: api, scheduler
Affects Versions: 2.0.4-alpha
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Attachments: YARN-689.patch


 Currently we overloading the minimum resource value as the actual multiplier 
 used by the scheduler.
 Today with a minimum memory set to 1GB, requests for 1.5GB are always 
 translated to allocation of 2GB.
 We should decouple the minimum allocation from the multiplier.
 The multiplier should also be exposed to the client via the 
 RegisterApplicationMasterResponse

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-392) Make it possible to schedule to specific nodes without dropping locality

2013-05-17 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13660946#comment-13660946
 ] 

Sandy Ryza commented on YARN-392:
-

We are implementing the approach outlined by Arun in his first comment on 
YARN-398.  Although it is not the primary goal, the approach does allow for 
node/rack blacklisting, and loses nothing by doing so.  Even if we were to say 
that you can't set the disable-allocation flag on node-level requests, it would 
still be possible to blacklist racks by setting the disable flag on a rack and 
submitting node requests for nodes under it.  It would also still be possible 
to blacklist nodes by whitelisting every other node on its rack.  Allowing the 
disable-allocation flag on node-level requests just makes the semantics more 
consistent.

I'll update the title of the JIRA to better reflect this.


 Make it possible to schedule to specific nodes without dropping locality
 

 Key: YARN-392
 URL: https://issues.apache.org/jira/browse/YARN-392
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Sandy Ryza
 Attachments: YARN-392-1.patch, YARN-392-2.patch, YARN-392-2.patch, 
 YARN-392-2.patch, YARN-392-3.patch, YARN-392-4.patch, YARN-392.patch


 Currently its not possible to specify scheduling requests for specific nodes 
 and nowhere else. The RM automatically relaxes locality to rack and * and 
 assigns non-specified machines to the app.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-392) Make it possible to specify hard locality constraints in resource requests

2013-05-17 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-392:


Summary: Make it possible to specify hard locality constraints in resource 
requests  (was: Make it possible to schedule to specific nodes without dropping 
locality)

 Make it possible to specify hard locality constraints in resource requests
 --

 Key: YARN-392
 URL: https://issues.apache.org/jira/browse/YARN-392
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Sandy Ryza
 Attachments: YARN-392-1.patch, YARN-392-2.patch, YARN-392-2.patch, 
 YARN-392-2.patch, YARN-392-3.patch, YARN-392-4.patch, YARN-392.patch


 Currently its not possible to specify scheduling requests for specific nodes 
 and nowhere else. The RM automatically relaxes locality to rack and * and 
 assigns non-specified machines to the app.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Moved] (YARN-696) Enable multiple states to to be specified in Resource Manager apps REST call

2013-05-17 Thread Konstantin Boudnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Boudnik moved HADOOP-9571 to YARN-696:
-

  Component/s: (was: util)
   resourcemanager
Affects Version/s: (was: 2.0.4-alpha)
   2.0.4-alpha
  Key: YARN-696  (was: HADOOP-9571)
  Project: Hadoop YARN  (was: Hadoop Common)

 Enable multiple states to to be specified in Resource Manager apps REST call
 

 Key: YARN-696
 URL: https://issues.apache.org/jira/browse/YARN-696
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.4-alpha
Reporter: Trevor Lorimer
Priority: Trivial

 Within the YARN Resource Manager REST API the GET call which returns all 
 Applications can be filtered by a single State query parameter (http://rm 
 http address:port/ws/v1/cluster/apps). 
 There are 8 possible states (New, Submitted, Accepted, Running, Finishing, 
 Finished, Failed, Killed), if no state parameter is specified all states are 
 returned, however if a sub-set of states is required then multiple REST calls 
 are required (max. of 7).
 The proposal is to be able to specify multiple states in a single REST call.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-689) Add multiplier unit to resourcecapabilities

2013-05-17 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13661010#comment-13661010
 ] 

Hitesh Shah commented on YARN-689:
--

+1 to renaming the current minimum to indicate slot size or multiplier. 

 Add multiplier unit to resourcecapabilities
 ---

 Key: YARN-689
 URL: https://issues.apache.org/jira/browse/YARN-689
 Project: Hadoop YARN
  Issue Type: Bug
  Components: api, scheduler
Affects Versions: 2.0.4-alpha
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Attachments: YARN-689.patch


 Currently we overloading the minimum resource value as the actual multiplier 
 used by the scheduler.
 Today with a minimum memory set to 1GB, requests for 1.5GB are always 
 translated to allocation of 2GB.
 We should decouple the minimum allocation from the multiplier.
 The multiplier should also be exposed to the client via the 
 RegisterApplicationMasterResponse

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-638) Restore RMDelegationTokens after RM Restart

2013-05-17 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-638:
-

Attachment: YARN-638.13.patch

fix merge conflicts  

 Restore RMDelegationTokens after RM Restart
 ---

 Key: YARN-638
 URL: https://issues.apache.org/jira/browse/YARN-638
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Jian He
Assignee: Jian He
 Attachments: YARN-638.10.patch, YARN-638.11.patch, YARN-638.12.patch, 
 YARN-638.13.patch, YARN-638.1.patch, YARN-638.2.patch, YARN-638.3.patch, 
 YARN-638.4.patch, YARN-638.5.patch, YARN-638.6.patch, YARN-638.7.patch, 
 YARN-638.8.patch, YARN-638.9.patch


 This is missed in YARN-581. After RM restart, RMDelegationTokens need to be 
 added both in DelegationTokenRenewer (addressed in YARN-581), and 
 delegationTokenSecretManager

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-688) Containers not cleaned up when NM received SHUTDOWN event from NodeStatusUpdater

2013-05-17 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13661037#comment-13661037
 ] 

Omkar Vinit Joshi commented on YARN-688:


bq. +  nodeStatusUpdater.getNodeStatusAndUpdateContainersInContext();
why is this required?

you might need to rebase patch based on YARN-617

testNodeStatusUpdaterRetryAndNMShutdown - startContainer code might change 
after above patch.



 Containers not cleaned up when NM received SHUTDOWN event from 
 NodeStatusUpdater
 

 Key: YARN-688
 URL: https://issues.apache.org/jira/browse/YARN-688
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He
Assignee: Jian He
 Attachments: YARN-688.1.patch


 Currently, both SHUTDOWN event from nodeStatusUpdater and CleanupContainers 
 event happens to be on the same dispatcher thread, CleanupContainers Event 
 will not be processed until SHUTDOWN event is processed. see similar problem 
 on YARN-495.
 On normal NM shutdown, this is not a problem since normal stop happens on 
 shutdownHook thread.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-688) Containers not cleaned up when NM received SHUTDOWN event from NodeStatusUpdater

2013-05-17 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13661042#comment-13661042
 ] 

Jian He commented on YARN-688:
--

bq.  nodeStatusUpdater.getNodeStatusAndUpdateContainersInContext();
this is required, since this method is removing the completed containers from 
the context

Will rebase later.

 Containers not cleaned up when NM received SHUTDOWN event from 
 NodeStatusUpdater
 

 Key: YARN-688
 URL: https://issues.apache.org/jira/browse/YARN-688
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He
Assignee: Jian He
 Attachments: YARN-688.1.patch


 Currently, both SHUTDOWN event from nodeStatusUpdater and CleanupContainers 
 event happens to be on the same dispatcher thread, CleanupContainers Event 
 will not be processed until SHUTDOWN event is processed. see similar problem 
 on YARN-495.
 On normal NM shutdown, this is not a problem since normal stop happens on 
 shutdownHook thread.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-696) Enable multiple states to to be specified in Resource Manager apps REST call

2013-05-17 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13661052#comment-13661052
 ] 

Sandy Ryza commented on YARN-696:
-

In YARN-642, we're addressing a similar issue with the /nodes api.  There we 
resolved to have an all option, but not allow specifying multiple states.  
Worth thinking about for this as well.   Although, the number of apps of apps 
can be much larger than the number of nodes, so being able to have more control 
over which ones are returned might be more important in this case.

 Enable multiple states to to be specified in Resource Manager apps REST call
 

 Key: YARN-696
 URL: https://issues.apache.org/jira/browse/YARN-696
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.4-alpha
Reporter: Trevor Lorimer
Priority: Trivial

 Within the YARN Resource Manager REST API the GET call which returns all 
 Applications can be filtered by a single State query parameter (http://rm 
 http address:port/ws/v1/cluster/apps). 
 There are 8 possible states (New, Submitted, Accepted, Running, Finishing, 
 Finished, Failed, Killed), if no state parameter is specified all states are 
 returned, however if a sub-set of states is required then multiple REST calls 
 are required (max. of 7).
 The proposal is to be able to specify multiple states in a single REST call.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-697) Move addPersistedDelegationToken to AbstractDelegationTokenSecretManager to be used by both HDFSRM

2013-05-17 Thread Jian He (JIRA)
Jian He created YARN-697:


 Summary: Move addPersistedDelegationToken to 
AbstractDelegationTokenSecretManager to be used by both HDFSRM
 Key: YARN-697
 URL: https://issues.apache.org/jira/browse/YARN-697
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He


Is it possible to move addPersistedDelegationToken to 
AbstractDelegationTokenSecretManager to be used by both HDFS  RM ?

Also, Is it possible to rename logUpdateMasterKey to storeNewMasterKey,
logExpireToken to removeStoredToken to be used by both HDFS  RM for persisting 
and recovering keys/tokens?



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-697) Move addPersistedDelegationToken to AbstractDelegationTokenSecretManager to be used by both HDFSRM

2013-05-17 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-697:
-

Description: 
Is it possible to move addPersistedDelegationToken in 
DelegationTokenSecretManager to AbstractDelegationTokenSecretManager to be used 
by both HDFS  RM ?

Also, Is it possible to rename logUpdateMasterKey to storeNewMasterKey,
logExpireToken to removeStoredToken to be used by both HDFS  RM for persisting 
and recovering keys/tokens?



  was:
Is it possible to move addPersistedDelegationToken to 
AbstractDelegationTokenSecretManager to be used by both HDFS  RM ?

Also, Is it possible to rename logUpdateMasterKey to storeNewMasterKey,
logExpireToken to removeStoredToken to be used by both HDFS  RM for persisting 
and recovering keys/tokens?




 Move addPersistedDelegationToken to AbstractDelegationTokenSecretManager to 
 be used by both HDFSRM
 ---

 Key: YARN-697
 URL: https://issues.apache.org/jira/browse/YARN-697
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He

 Is it possible to move addPersistedDelegationToken in 
 DelegationTokenSecretManager to AbstractDelegationTokenSecretManager to be 
 used by both HDFS  RM ?
 Also, Is it possible to rename logUpdateMasterKey to storeNewMasterKey,
 logExpireToken to removeStoredToken to be used by both HDFS  RM for 
 persisting and recovering keys/tokens?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-563) Add application type to ApplicationReport

2013-05-17 Thread Mayank Bansal (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated YARN-563:
---

Attachment: YARN-563-trunk-4.patch

 Add application type to ApplicationReport 
 --

 Key: YARN-563
 URL: https://issues.apache.org/jira/browse/YARN-563
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Thomas Weise
Assignee: Mayank Bansal
 Attachments: YARN-563-trunk-1.patch, YARN-563-trunk-2.patch, 
 YARN-563-trunk-3.patch, YARN-563-trunk-4.patch


 This field is needed to distinguish different types of applications (app 
 master implementations). For example, we may run applications of type XYZ in 
 a cluster alongside MR and would like to filter applications by type.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-563) Add application type to ApplicationReport

2013-05-17 Thread Mayank Bansal (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13661125#comment-13661125
 ] 

Mayank Bansal commented on YARN-563:


Thanks Vinod for review. 

I have updated all your comments.

[~hitesh]

Application type is part of the application report and we are not doing any 
filtering neither in CLI nor in rest api.
I think user can get the application report and filter at their end thats what 
seems to work till now for other attributes as well.

If we want attribute based filtering then probably we need to add more apis.
If you think thats useful I can create a seprate JIRA and will work on that.

Thoughts?

Thanks,
Mayank

 Add application type to ApplicationReport 
 --

 Key: YARN-563
 URL: https://issues.apache.org/jira/browse/YARN-563
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Thomas Weise
Assignee: Mayank Bansal
 Attachments: YARN-563-trunk-1.patch, YARN-563-trunk-2.patch, 
 YARN-563-trunk-3.patch, YARN-563-trunk-4.patch


 This field is needed to distinguish different types of applications (app 
 master implementations). For example, we may run applications of type XYZ in 
 a cluster alongside MR and would like to filter applications by type.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-697) Move addPersistedDelegationToken to AbstractDelegationTokenSecretManager to be used by both HDFSRM

2013-05-17 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-697:
-

Description: 
Is it possible to move addPersistedDelegationToken in 
DelegationTokenSecretManager to AbstractDelegationTokenSecretManager?

Also, Is it possible to rename logUpdateMasterKey to storeNewMasterKey AND 
logExpireToken to removeStoredToken for persisting and recovering keys/tokens?

These methods are likely to be common methods and be used by overridden 
secretManager

  was:
Is it possible to move addPersistedDelegationToken in 
DelegationTokenSecretManager to AbstractDelegationTokenSecretManager to be used 
by both HDFS  RM ?

Also, Is it possible to rename logUpdateMasterKey to storeNewMasterKey,
logExpireToken to removeStoredToken to be used by both HDFS  RM for persisting 
and recovering keys/tokens?




 Move addPersistedDelegationToken to AbstractDelegationTokenSecretManager to 
 be used by both HDFSRM
 ---

 Key: YARN-697
 URL: https://issues.apache.org/jira/browse/YARN-697
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He

 Is it possible to move addPersistedDelegationToken in 
 DelegationTokenSecretManager to AbstractDelegationTokenSecretManager?
 Also, Is it possible to rename logUpdateMasterKey to storeNewMasterKey AND 
 logExpireToken to removeStoredToken for persisting and recovering keys/tokens?
 These methods are likely to be common methods and be used by overridden 
 secretManager

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-695) masterContainer and status are in ApplicationReportProto but not in ApplicationReport

2013-05-17 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-695:
-

Attachment: YARN-695.1.patch

Removed the two fields from ApplicatonReportProto, changed the two setters in 
ApplicationREportPBImpl, and added a test.

 masterContainer and status are in ApplicationReportProto but not in 
 ApplicationReport
 -

 Key: YARN-695
 URL: https://issues.apache.org/jira/browse/YARN-695
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Attachments: YARN-695.1.patch


 If masterContainer and status are no longer part of ApplicationReport, they 
 should be removed from proto as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-697) Move addPersistedDelegationToken to AbstractDelegationTokenSecretManager

2013-05-17 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-697:
-

Summary: Move addPersistedDelegationToken to 
AbstractDelegationTokenSecretManager  (was: Move addPersistedDelegationToken to 
AbstractDelegationTokenSecretManager to be used by both HDFSRM)

 Move addPersistedDelegationToken to AbstractDelegationTokenSecretManager
 

 Key: YARN-697
 URL: https://issues.apache.org/jira/browse/YARN-697
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He

 Is it possible to move addPersistedDelegationToken in 
 DelegationTokenSecretManager to AbstractDelegationTokenSecretManager?
 Also, Is it possible to rename logUpdateMasterKey to storeNewMasterKey AND 
 logExpireToken to removeStoredToken for persisting and recovering keys/tokens?
 These methods are likely to be common methods and be used by overridden 
 secretManager

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-563) Add application type to ApplicationReport

2013-05-17 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13661131#comment-13661131
 ] 

Hitesh Shah commented on YARN-563:
--

I would say that listing applications based on type ( via CLI or webservices ) 
should become a feature of the framework. Need not be part of this jira but it 
should be part of the framework. Filtering 1 apps for a particular type is 
not good usability in my opinion.

 Add application type to ApplicationReport 
 --

 Key: YARN-563
 URL: https://issues.apache.org/jira/browse/YARN-563
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Thomas Weise
Assignee: Mayank Bansal
 Attachments: YARN-563-trunk-1.patch, YARN-563-trunk-2.patch, 
 YARN-563-trunk-3.patch, YARN-563-trunk-4.patch


 This field is needed to distinguish different types of applications (app 
 master implementations). For example, we may run applications of type XYZ in 
 a cluster alongside MR and would like to filter applications by type.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-638) Restore RMDelegationTokens after RM Restart

2013-05-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13661132#comment-13661132
 ] 

Hadoop QA commented on YARN-638:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12583687/YARN-638.13.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  org.apache.hadoop.hdfs.web.TestWebHdfsTimeouts
  
org.apache.hadoop.hdfs.qjournal.client.TestQuorumJournalManager

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/950//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/950//console

This message is automatically generated.

 Restore RMDelegationTokens after RM Restart
 ---

 Key: YARN-638
 URL: https://issues.apache.org/jira/browse/YARN-638
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Jian He
Assignee: Jian He
 Attachments: YARN-638.10.patch, YARN-638.11.patch, YARN-638.12.patch, 
 YARN-638.13.patch, YARN-638.1.patch, YARN-638.2.patch, YARN-638.3.patch, 
 YARN-638.4.patch, YARN-638.5.patch, YARN-638.6.patch, YARN-638.7.patch, 
 YARN-638.8.patch, YARN-638.9.patch


 This is missed in YARN-581. After RM restart, RMDelegationTokens need to be 
 added both in DelegationTokenRenewer (addressed in YARN-581), and 
 delegationTokenSecretManager

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-695) masterContainer and status are in ApplicationReportProto but not in ApplicationReport

2013-05-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13661141#comment-13661141
 ] 

Hadoop QA commented on YARN-695:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12583701/YARN-695.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/951//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/951//console

This message is automatically generated.

 masterContainer and status are in ApplicationReportProto but not in 
 ApplicationReport
 -

 Key: YARN-695
 URL: https://issues.apache.org/jira/browse/YARN-695
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Attachments: YARN-695.1.patch


 If masterContainer and status are no longer part of ApplicationReport, they 
 should be removed from proto as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-689) Add multiplier unit to resourcecapabilities

2013-05-17 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13661159#comment-13661159
 ] 

Sandy Ryza commented on YARN-689:
-

Consider this situation:
All the nodes in a cluster have 3 GB.  Because YARN does not yet allow us to 
set limits on disk IO, we don't want more than 3 containers on each node.  We 
do, however, think it is reasonable to run two 1.5 GB AMs on a node.  How would 
we specify this with only a multiplier?

 Add multiplier unit to resourcecapabilities
 ---

 Key: YARN-689
 URL: https://issues.apache.org/jira/browse/YARN-689
 Project: Hadoop YARN
  Issue Type: Bug
  Components: api, scheduler
Affects Versions: 2.0.4-alpha
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Attachments: YARN-689.patch


 Currently we overloading the minimum resource value as the actual multiplier 
 used by the scheduler.
 Today with a minimum memory set to 1GB, requests for 1.5GB are always 
 translated to allocation of 2GB.
 We should decouple the minimum allocation from the multiplier.
 The multiplier should also be exposed to the client via the 
 RegisterApplicationMasterResponse

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-698) Review of Field Rules, Default Values and Sanity Check for ClientRMProtocol

2013-05-17 Thread Zhijie Shen (JIRA)
Zhijie Shen created YARN-698:


 Summary: Review of Field Rules, Default Values and Sanity Check 
for ClientRMProtocol
 Key: YARN-698
 URL: https://issues.apache.org/jira/browse/YARN-698
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Zhijie Shen
Assignee: Zhijie Shen


We need to check the fields of the protos used by ClientRMProtocol 
(recursively) to clarify the following stuff:

1. Whether the field should be required or optional
2. What the default value should be if the field is optional
3. Whether sanity check is required to validate the input value against the 
field's value domain.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-563) Add application type to ApplicationReport

2013-05-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13661173#comment-13661173
 ] 

Hadoop QA commented on YARN-563:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12583699/YARN-563-trunk-4.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 9 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/952//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/952//console

This message is automatically generated.

 Add application type to ApplicationReport 
 --

 Key: YARN-563
 URL: https://issues.apache.org/jira/browse/YARN-563
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Thomas Weise
Assignee: Mayank Bansal
 Attachments: YARN-563-trunk-1.patch, YARN-563-trunk-2.patch, 
 YARN-563-trunk-3.patch, YARN-563-trunk-4.patch


 This field is needed to distinguish different types of applications (app 
 master implementations). For example, we may run applications of type XYZ in 
 a cluster alongside MR and would like to filter applications by type.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-569) CapacityScheduler: support for preemption (using a capacity monitor)

2013-05-17 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13661175#comment-13661175
 ] 

Carlo Curino commented on YARN-569:
---

We post an improved version of the patch, that reflects: 
- the committed versions of YARN-45, and YARN-567
- uses the resource-based version of YARN-45, and 
- handles hierarchies of queues 

The key change to handle hierarchies is to:
- roll up pending requests from the leaf to parents
- compute the ideal capacity assignment (same algo as before) for level of 
the three from the top down
- determine preemption as (current - ideal) in the leafs and select containers 

This covers nicely the use case brought up by Bikas, Arun, Hitish, Sid, and 
Vinod where a (even heavily) over-capacity 
leaf queue should not be preempted if its parent is within capacity. We 
included this specific test as part of 
our unit tests. 

Note: my previous [comment | 
https://issues.apache.org/jira/browse/YARN-569?focusedCommentId=13638825page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13638825]
 about having doubts on the priority-first still stands. Priorities capture the 
order in which the application wants containers, but they are not updated 
after containers are granted to capture the relative relevance of containers at 
runtime. This is way using a resource-based PreemptionMessage is important, 
since it allows the underlying app to pick a different set of containers. This 
is what we do in the implementation of this for mapreduce (MAPREDUCE-5196 and 
friends), where we preempt reducers instead of maps whenever possible.



 CapacityScheduler: support for preemption (using a capacity monitor)
 

 Key: YARN-569
 URL: https://issues.apache.org/jira/browse/YARN-569
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: 3queues.pdf, CapScheduler_with_preemption.pdf, 
 preemption.2.patch, YARN-569.1.patch, YARN-569.patch, YARN-569.patch


 There is a tension between the fast-pace reactive role of the 
 CapacityScheduler, which needs to respond quickly to 
 applications resource requests, and node updates, and the more introspective, 
 time-based considerations 
 needed to observe and correct for capacity balance. To this purpose we opted 
 instead of hacking the delicate
 mechanisms of the CapacityScheduler directly to add support for preemption by 
 means of a Capacity Monitor,
 which can be run optionally as a separate service (much like the 
 NMLivelinessMonitor).
 The capacity monitor (similarly to equivalent functionalities in the fairness 
 scheduler) operates running on intervals 
 (e.g., every 3 seconds), observe the state of the assignment of resources to 
 queues from the capacity scheduler, 
 performs off-line computation to determine if preemption is needed, and how 
 best to edit the current schedule to 
 improve capacity, and generates events that produce four possible actions:
 # Container de-reservations
 # Resource-based preemptions
 # Container-based preemptions
 # Container killing
 The actions listed above are progressively more costly, and it is up to the 
 policy to use them as desired to achieve the rebalancing goals. 
 Note that due to the lag in the effect of these actions the policy should 
 operate at the macroscopic level (e.g., preempt tens of containers
 from a queue) and not trying to tightly and consistently micromanage 
 container allocations. 
 - Preemption policy  (ProportionalCapacityPreemptionPolicy): 
 - 
 Preemption policies are by design pluggable, in the following we present an 
 initial policy (ProportionalCapacityPreemptionPolicy) we have been 
 experimenting with.  The ProportionalCapacityPreemptionPolicy behaves as 
 follows:
 # it gathers from the scheduler the state of the queues, in particular, their 
 current capacity, guaranteed capacity and pending requests (*)
 # if there are pending requests from queues that are under capacity it 
 computes a new ideal balanced state (**)
 # it computes the set of preemptions needed to repair the current schedule 
 and achieve capacity balance (accounting for natural completion rates, and 
 respecting bounds on the amount of preemption we allow for each round)
 # it selects which applications to preempt from each over-capacity queue (the 
 last one in the FIFO order)
 # it remove reservations from the most recently assigned app until the amount 
 of resource to reclaim is obtained, or until no more reservations exits
 # (if not enough) it issues preemptions for containers from the same 
 applications (reverse chronological order, last assigned container first) 
 again until necessary or until 

[jira] [Updated] (YARN-569) CapacityScheduler: support for preemption (using a capacity monitor)

2013-05-17 Thread Carlo Curino (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carlo Curino updated YARN-569:
--

Attachment: YARN-569.2.patch

 CapacityScheduler: support for preemption (using a capacity monitor)
 

 Key: YARN-569
 URL: https://issues.apache.org/jira/browse/YARN-569
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: 3queues.pdf, CapScheduler_with_preemption.pdf, 
 preemption.2.patch, YARN-569.1.patch, YARN-569.2.patch, YARN-569.patch, 
 YARN-569.patch


 There is a tension between the fast-pace reactive role of the 
 CapacityScheduler, which needs to respond quickly to 
 applications resource requests, and node updates, and the more introspective, 
 time-based considerations 
 needed to observe and correct for capacity balance. To this purpose we opted 
 instead of hacking the delicate
 mechanisms of the CapacityScheduler directly to add support for preemption by 
 means of a Capacity Monitor,
 which can be run optionally as a separate service (much like the 
 NMLivelinessMonitor).
 The capacity monitor (similarly to equivalent functionalities in the fairness 
 scheduler) operates running on intervals 
 (e.g., every 3 seconds), observe the state of the assignment of resources to 
 queues from the capacity scheduler, 
 performs off-line computation to determine if preemption is needed, and how 
 best to edit the current schedule to 
 improve capacity, and generates events that produce four possible actions:
 # Container de-reservations
 # Resource-based preemptions
 # Container-based preemptions
 # Container killing
 The actions listed above are progressively more costly, and it is up to the 
 policy to use them as desired to achieve the rebalancing goals. 
 Note that due to the lag in the effect of these actions the policy should 
 operate at the macroscopic level (e.g., preempt tens of containers
 from a queue) and not trying to tightly and consistently micromanage 
 container allocations. 
 - Preemption policy  (ProportionalCapacityPreemptionPolicy): 
 - 
 Preemption policies are by design pluggable, in the following we present an 
 initial policy (ProportionalCapacityPreemptionPolicy) we have been 
 experimenting with.  The ProportionalCapacityPreemptionPolicy behaves as 
 follows:
 # it gathers from the scheduler the state of the queues, in particular, their 
 current capacity, guaranteed capacity and pending requests (*)
 # if there are pending requests from queues that are under capacity it 
 computes a new ideal balanced state (**)
 # it computes the set of preemptions needed to repair the current schedule 
 and achieve capacity balance (accounting for natural completion rates, and 
 respecting bounds on the amount of preemption we allow for each round)
 # it selects which applications to preempt from each over-capacity queue (the 
 last one in the FIFO order)
 # it remove reservations from the most recently assigned app until the amount 
 of resource to reclaim is obtained, or until no more reservations exits
 # (if not enough) it issues preemptions for containers from the same 
 applications (reverse chronological order, last assigned container first) 
 again until necessary or until no containers except the AM container are left,
 # (if not enough) it moves onto unreserve and preempt from the next 
 application. 
 # containers that have been asked to preempt are tracked across executions. 
 If a containers is among the one to be preempted for more than a certain 
 time, the container is moved in a the list of containers to be forcibly 
 killed. 
 Notes:
 (*) at the moment, in order to avoid double-counting of the requests, we only 
 look at the ANY part of pending resource requests, which means we might not 
 preempt on behalf of AMs that ask only for specific locations but not any. 
 (**) The ideal balance state is one in which each queue has at least its 
 guaranteed capacity, and the spare capacity is distributed among queues (that 
 wants some) as a weighted fair share. Where the weighting is based on the 
 guaranteed capacity of a queue, and the function runs to a fix point.  
 Tunables of the ProportionalCapacityPreemptionPolicy:
 # observe-only mode (i.e., log the actions it would take, but behave as 
 read-only)
 # how frequently to run the policy
 # how long to wait between preemption and kill of a container
 # which fraction of the containers I would like to obtain should I preempt 
 (has to do with the natural rate at which containers are returned)
 # deadzone size, i.e., what % of over-capacity should I ignore (if we are off 
 perfect balance by some small % we ignore it)
 # overall amount of preemption we can 

[jira] [Commented] (YARN-569) CapacityScheduler: support for preemption (using a capacity monitor)

2013-05-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13661185#comment-13661185
 ] 

Hadoop QA commented on YARN-569:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12583710/YARN-569.2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/953//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/953//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/953//console

This message is automatically generated.

 CapacityScheduler: support for preemption (using a capacity monitor)
 

 Key: YARN-569
 URL: https://issues.apache.org/jira/browse/YARN-569
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: 3queues.pdf, CapScheduler_with_preemption.pdf, 
 preemption.2.patch, YARN-569.1.patch, YARN-569.2.patch, YARN-569.patch, 
 YARN-569.patch


 There is a tension between the fast-pace reactive role of the 
 CapacityScheduler, which needs to respond quickly to 
 applications resource requests, and node updates, and the more introspective, 
 time-based considerations 
 needed to observe and correct for capacity balance. To this purpose we opted 
 instead of hacking the delicate
 mechanisms of the CapacityScheduler directly to add support for preemption by 
 means of a Capacity Monitor,
 which can be run optionally as a separate service (much like the 
 NMLivelinessMonitor).
 The capacity monitor (similarly to equivalent functionalities in the fairness 
 scheduler) operates running on intervals 
 (e.g., every 3 seconds), observe the state of the assignment of resources to 
 queues from the capacity scheduler, 
 performs off-line computation to determine if preemption is needed, and how 
 best to edit the current schedule to 
 improve capacity, and generates events that produce four possible actions:
 # Container de-reservations
 # Resource-based preemptions
 # Container-based preemptions
 # Container killing
 The actions listed above are progressively more costly, and it is up to the 
 policy to use them as desired to achieve the rebalancing goals. 
 Note that due to the lag in the effect of these actions the policy should 
 operate at the macroscopic level (e.g., preempt tens of containers
 from a queue) and not trying to tightly and consistently micromanage 
 container allocations. 
 - Preemption policy  (ProportionalCapacityPreemptionPolicy): 
 - 
 Preemption policies are by design pluggable, in the following we present an 
 initial policy (ProportionalCapacityPreemptionPolicy) we have been 
 experimenting with.  The ProportionalCapacityPreemptionPolicy behaves as 
 follows:
 # it gathers from the scheduler the state of the queues, in particular, their 
 current capacity, guaranteed capacity and pending requests (*)
 # if there are pending requests from queues that are under capacity it 
 computes a new ideal balanced state (**)
 # it computes the set of preemptions needed to repair the current schedule 
 and achieve capacity balance (accounting for natural completion rates, and 
 respecting bounds on the amount of preemption we allow for each round)
 # it selects which applications to preempt from each over-capacity queue (the 
 last one in the FIFO order)
 # it remove reservations from the most recently assigned app until the amount 
 of resource to reclaim is obtained, or until no more reservations exits
 # (if not enough) it issues preemptions for 

[jira] [Updated] (YARN-689) Add multiplier unit to resourcecapabilities

2013-05-17 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated YARN-689:


Attachment: YARN-689.patch

updating patch to straighten up test-patch report:

* new normalizeInt() method was not handling properly usecase where the value 
was rounded up over the maximum, was not being capped down.
* TestAMMRClient was using wrong constants in the testcase
* run into diffs between YarnConfiguration and yarn-default.xml default number 
of cores, set yarn-default.xml to the same value used in YarnConfiguration (4, 
it was 32).

 Add multiplier unit to resourcecapabilities
 ---

 Key: YARN-689
 URL: https://issues.apache.org/jira/browse/YARN-689
 Project: Hadoop YARN
  Issue Type: Bug
  Components: api, scheduler
Affects Versions: 2.0.4-alpha
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Attachments: YARN-689.patch, YARN-689.patch


 Currently we overloading the minimum resource value as the actual multiplier 
 used by the scheduler.
 Today with a minimum memory set to 1GB, requests for 1.5GB are always 
 translated to allocation of 2GB.
 We should decouple the minimum allocation from the multiplier.
 The multiplier should also be exposed to the client via the 
 RegisterApplicationMasterResponse

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-689) Add multiplier unit to resourcecapabilities

2013-05-17 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13661235#comment-13661235
 ] 

Alejandro Abdelnur commented on YARN-689:
-

Sandy, thanks for coming up with an example showing the shortcoming of 
overloading minimum to be the multipler

 Add multiplier unit to resourcecapabilities
 ---

 Key: YARN-689
 URL: https://issues.apache.org/jira/browse/YARN-689
 Project: Hadoop YARN
  Issue Type: Bug
  Components: api, scheduler
Affects Versions: 2.0.4-alpha
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Attachments: YARN-689.patch, YARN-689.patch


 Currently we overloading the minimum resource value as the actual multiplier 
 used by the scheduler.
 Today with a minimum memory set to 1GB, requests for 1.5GB are always 
 translated to allocation of 2GB.
 We should decouple the minimum allocation from the multiplier.
 The multiplier should also be exposed to the client via the 
 RegisterApplicationMasterResponse

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-689) Add multiplier unit to resourcecapabilities

2013-05-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13661236#comment-13661236
 ] 

Hadoop QA commented on YARN-689:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12583718/YARN-689.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/954//console

This message is automatically generated.

 Add multiplier unit to resourcecapabilities
 ---

 Key: YARN-689
 URL: https://issues.apache.org/jira/browse/YARN-689
 Project: Hadoop YARN
  Issue Type: Bug
  Components: api, scheduler
Affects Versions: 2.0.4-alpha
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Attachments: YARN-689.patch, YARN-689.patch


 Currently we overloading the minimum resource value as the actual multiplier 
 used by the scheduler.
 Today with a minimum memory set to 1GB, requests for 1.5GB are always 
 translated to allocation of 2GB.
 We should decouple the minimum allocation from the multiplier.
 The multiplier should also be exposed to the client via the 
 RegisterApplicationMasterResponse

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-689) Add multiplier unit to resourcecapabilities

2013-05-17 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated YARN-689:


Attachment: YARN-689.patch

rebasing to HEAD resolving some import conflicts.

 Add multiplier unit to resourcecapabilities
 ---

 Key: YARN-689
 URL: https://issues.apache.org/jira/browse/YARN-689
 Project: Hadoop YARN
  Issue Type: Bug
  Components: api, scheduler
Affects Versions: 2.0.4-alpha
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Attachments: YARN-689.patch, YARN-689.patch, YARN-689.patch


 Currently we overloading the minimum resource value as the actual multiplier 
 used by the scheduler.
 Today with a minimum memory set to 1GB, requests for 1.5GB are always 
 translated to allocation of 2GB.
 We should decouple the minimum allocation from the multiplier.
 The multiplier should also be exposed to the client via the 
 RegisterApplicationMasterResponse

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-597) TestFSDownload fails on Windows because of dependencies on tar/gzip/jar tools

2013-05-17 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13661274#comment-13661274
 ] 

Ivan Mitic commented on YARN-597:
-

Thanks Chris and Arun!

 TestFSDownload fails on Windows because of dependencies on tar/gzip/jar tools
 -

 Key: YARN-597
 URL: https://issues.apache.org/jira/browse/YARN-597
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic
 Fix For: 3.0.0

 Attachments: YARN-597.patch


 {{testDownloadArchive}}, {{testDownloadPatternJar}} and 
 {{testDownloadArchiveZip}} fail with the similar Shell ExitCodeException:
 {code}
 testDownloadArchiveZip(org.apache.hadoop.yarn.util.TestFSDownload)  Time 
 elapsed: 480 sec   ERROR!
 org.apache.hadoop.util.Shell$ExitCodeException: bash: line 0: cd: 
 /D:/svn/t/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/target/TestFSDownload:
  No such file or directory
 gzip: 1: No such file or directory
   at org.apache.hadoop.util.Shell.runCommand(Shell.java:377)
   at org.apache.hadoop.util.Shell.run(Shell.java:292)
   at 
 org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:497)
   at 
 org.apache.hadoop.yarn.util.TestFSDownload.createZipFile(TestFSDownload.java:225)
   at 
 org.apache.hadoop.yarn.util.TestFSDownload.testDownloadArchiveZip(TestFSDownload.java:503)
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-689) Add multiplier unit to resourcecapabilities

2013-05-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13661278#comment-13661278
 ] 

Hadoop QA commented on YARN-689:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12583721/YARN-689.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 8 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/955//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/955//console

This message is automatically generated.

 Add multiplier unit to resourcecapabilities
 ---

 Key: YARN-689
 URL: https://issues.apache.org/jira/browse/YARN-689
 Project: Hadoop YARN
  Issue Type: Bug
  Components: api, scheduler
Affects Versions: 2.0.4-alpha
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
 Attachments: YARN-689.patch, YARN-689.patch, YARN-689.patch


 Currently we overloading the minimum resource value as the actual multiplier 
 used by the scheduler.
 Today with a minimum memory set to 1GB, requests for 1.5GB are always 
 translated to allocation of 2GB.
 We should decouple the minimum allocation from the multiplier.
 The multiplier should also be exposed to the client via the 
 RegisterApplicationMasterResponse

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira