[jira] [Commented] (YARN-1166) YARN 'appsFailed' metric should be of type 'counter'
[ https://issues.apache.org/jira/browse/YARN-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861324#comment-13861324 ] Hadoop QA commented on YARN-1166: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12621270/YARN-1166.6.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2789//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2789//console This message is automatically generated. YARN 'appsFailed' metric should be of type 'counter' Key: YARN-1166 URL: https://issues.apache.org/jira/browse/YARN-1166 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Srimanth Gunturi Assignee: Zhijie Shen Priority: Blocker Attachments: YARN-1166.2.patch, YARN-1166.3.patch, YARN-1166.4.patch, YARN-1166.5.patch, YARN-1166.6.patch, YARN-1166.patch Currently in YARN's queue metrics, the cumulative metric 'appsFailed' is of type 'gauge' - which means the exact value will be reported. All other cumulative queue metrics (AppsSubmitted, AppsCompleted, AppsKilled) are of type 'counter' - meaning Ganglia will use slope to provide deltas between time-points. To be consistent, the AppsFailed metric should also be of type 'counter'. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
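For readers less familiar with the Hadoop metrics2 library, the counter/gauge distinction is carried by the mutable metric type a field is declared with. A minimal illustrative sketch of what moving appsFailed from a gauge to a counter amounts to (the class below is hypothetical, not the actual YARN-1166 patch):
{code}
// Hypothetical sketch, not the YARN-1166 patch itself. In metrics2, counters and
// gauges are distinct mutable metric types; Ganglia graphs counters as slopes/deltas.
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.MutableCounterInt;
import org.apache.hadoop.metrics2.lib.MutableGaugeInt;

@Metrics(context = "yarn")
class ExampleQueueMetrics {
  // A counter, as the other cumulative app metrics already are:
  @Metric("# of apps submitted") MutableCounterInt appsSubmitted;
  // Before the change: a gauge, so the exact value is reported each interval.
  @Metric("# of apps failed (gauge)") MutableGaugeInt appsFailedAsGauge;
  // After the change: a counter, consistent with appsSubmitted/Completed/Killed.
  @Metric("# of apps failed (counter)") MutableCounterInt appsFailedAsCounter;

  // Fields are instantiated when the source is registered with the metrics system,
  // e.g. DefaultMetricsSystem.instance().register(...); after that:
  void failApp() {
    appsFailedAsCounter.incr();
  }
}
{code}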
[jira] [Commented] (YARN-1506) Replace set resource change on RMNode/SchedulerNode directly with event notification.
[ https://issues.apache.org/jira/browse/YARN-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861341#comment-13861341 ] Junping Du commented on YARN-1506: -- Looks like the JVM crashed while running the test. Kicking off the Jenkins test again. Replace set resource change on RMNode/SchedulerNode directly with event notification. - Key: YARN-1506 URL: https://issues.apache.org/jira/browse/YARN-1506 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager, scheduler Reporter: Junping Du Assignee: Junping Du Priority: Blocker Attachments: YARN-1506-v1.patch, YARN-1506-v2.patch, YARN-1506-v3.patch, YARN-1506-v4.patch According to Vinod's comments on YARN-312 (https://issues.apache.org/jira/browse/YARN-312?focusedCommentId=13846087&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13846087), we should replace RMNode.setResourceOption() with some resource change event. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
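As a rough illustration of the event-notification approach being discussed, a resource change could be delivered to the node through the RM dispatcher rather than by calling RMNode.setResourceOption() directly. The event class and enum value below are assumed names for the sketch, not necessarily what the attached patch introduces:
{code}
// Illustrative sketch only; RMNodeResourceUpdateEvent and RESOURCE_UPDATE are
// assumed names, not necessarily those of the YARN-1506 patch.
import org.apache.hadoop.yarn.api.records.NodeId;
import org.apache.hadoop.yarn.server.api.records.ResourceOption;
import org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeEvent;
import org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeEventType;

class RMNodeResourceUpdateEvent extends RMNodeEvent {
  private final ResourceOption resourceOption;

  RMNodeResourceUpdateEvent(NodeId nodeId, ResourceOption resourceOption) {
    super(nodeId, RMNodeEventType.RESOURCE_UPDATE); // assumed event type
    this.resourceOption = resourceOption;
  }

  ResourceOption getResourceOption() {
    return resourceOption;
  }
}

// The caller (e.g. AdminService) would then dispatch instead of mutating the node:
//   rmContext.getDispatcher().getEventHandler()
//       .handle(new RMNodeResourceUpdateEvent(nodeId, resourceOption));
// and RMNodeImpl would handle RESOURCE_UPDATE in its state machine, notifying the
// scheduler about the new capacity in turn.
{code}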
[jira] [Commented] (YARN-1506) Replace set resource change on RMNode/SchedulerNode directly with event notification.
[ https://issues.apache.org/jira/browse/YARN-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861362#comment-13861362 ] Hadoop QA commented on YARN-1506: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12621217/YARN-1506-v4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-tools/hadoop-sls hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2790//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2790//console This message is automatically generated. Replace set resource change on RMNode/SchedulerNode directly with event notification. - Key: YARN-1506 URL: https://issues.apache.org/jira/browse/YARN-1506 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager, scheduler Reporter: Junping Du Assignee: Junping Du Priority: Blocker Attachments: YARN-1506-v1.patch, YARN-1506-v2.patch, YARN-1506-v3.patch, YARN-1506-v4.patch According to Vinod's comments on YARN-312 (https://issues.apache.org/jira/browse/YARN-312?focusedCommentId=13846087page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13846087), we should replace RMNode.setResourceOption() with some resource change event. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1529) Add Localization overhead metrics to NM
[ https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gera Shegalov updated YARN-1529: Attachment: YARN-1529.v03.patch addressing javadoc warning Add Localization overhead metrics to NM --- Key: YARN-1529 URL: https://issues.apache.org/jira/browse/YARN-1529 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Reporter: Gera Shegalov Assignee: Gera Shegalov Attachments: YARN-1529.v01.patch, YARN-1529.v02.patch, YARN-1529.v03.patch Users are often unaware of the localization cost that their jobs incur. To measure the effectiveness of localization caches, it is necessary to expose the overhead in the form of metrics. We propose addition of the following metrics to NodeManagerMetrics. When a container is about to launch, its set of LocalResources has to be fetched from a central location, typically on HDFS, which results in a number of download requests for the files missing in caches. LocalizedFilesMissed: total files (requests) downloaded from DFS. Cache misses. LocalizedFilesCached: total localization requests that were served from local caches. Cache hits. LocalizedBytesMissed: total bytes downloaded from DFS due to cache misses. LocalizedBytesCached: total bytes satisfied from local caches. Localized(Files|Bytes)CachedRatio: percentage of localized (files|bytes) that were served out of cache: ratio = 100 * caches / (caches + misses) LocalizationDownloadNanos: total elapsed time in nanoseconds for a container to go from ResourceRequestTransition to LocalizedTransition -- This message was sent by Atlassian JIRA (v6.1.5#6160)
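As a rough sketch of how such counters typically look in the metrics2 style used by NodeManagerMetrics, including the cached-ratio computation from the formula above (illustrative only; the real field names and types are whatever the attached patch defines):
{code}
// Illustrative sketch; names follow the proposal above, not necessarily the patch.
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.MutableCounterLong;

@Metrics(about = "Localization metrics sketch", context = "yarn")
class LocalizationMetricsSketch {
  @Metric("files downloaded from DFS (cache misses)") MutableCounterLong localizedFilesMissed;
  @Metric("requests served from local caches (cache hits)") MutableCounterLong localizedFilesCached;
  @Metric("bytes downloaded from DFS due to cache misses") MutableCounterLong localizedBytesMissed;
  @Metric("bytes satisfied from local caches") MutableCounterLong localizedBytesCached;
  @Metric("nanoseconds spent downloading resources") MutableCounterLong localizationDownloadNanos;

  // ratio = 100 * caches / (caches + misses), per the description above
  @Metric("percentage of localized files served out of cache")
  public int getLocalizedFilesCachedRatio() {
    long hits = localizedFilesCached.value();
    long total = hits + localizedFilesMissed.value();
    return total == 0 ? 0 : (int) (100 * hits / total);
  }
}
{code}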
[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM
[ https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861403#comment-13861403 ] Hadoop QA commented on YARN-1529: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12621292/YARN-1529.v03.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2791//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2791//console This message is automatically generated. Add Localization overhead metrics to NM --- Key: YARN-1529 URL: https://issues.apache.org/jira/browse/YARN-1529 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Reporter: Gera Shegalov Assignee: Gera Shegalov Attachments: YARN-1529.v01.patch, YARN-1529.v02.patch, YARN-1529.v03.patch Users are often unaware of localization cost that their jobs incur. To measure effectiveness of localization caches it is necessary to expose the overhead in the form of metrics. We propose addition of the following metrics to NodeManagerMetrics. When a container is about to launch, its set of LocalResources has to be fetched from a central location, typically on HDFS, that results in a number of download requests for the files missing in caches. LocalizedFilesMissed: total files (requests) downloaded from DFS. Cache misses. LocalizedFilesCached: total localization requests that were served from local caches. Cache hits. LocalizedBytesMissed: total bytes downloaded from DFS due to cache misses. LocalizedBytesCached: total bytes satisfied from local caches. Localized(Files|Bytes)CachedRatio: percentage of localized (files|bytes) that were served out of cache: ratio = 100 * caches / (caches + misses) LocalizationDownloadNanos: total elapsed time in nanoseconds for a container to go from ResourceRequestTransition to LocalizedTransition -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1493) Schedulers don't recognize apps separately from app-attempts
[ https://issues.apache.org/jira/browse/YARN-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861428#comment-13861428 ] Hudson commented on YARN-1493: -- FAILURE: Integrated in Hadoop-Yarn-trunk #441 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/441/]) YARN-1493. Changed ResourceManager and Scheduler interfacing to recognize app-attempts separately from apps. Contributed by Jian He. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1554896) * /hadoop/common/trunk/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/ResourceSchedulerWrapper.java * /hadoop/common/trunk/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/SLSCapacityScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppEventType.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptEventType.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/event/RMAppAttemptRejectedEvent.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/ActiveUsersManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerAppReport.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerAppUtils.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplication.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueue.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerNode.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/AppAddedSchedulerEvent.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/AppAttemptAddedSchedulerEvent.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/AppRemovedSchedulerEvent.java *
[jira] [Commented] (YARN-1549) TestUnmanagedAMLauncher#testDSShell fails in trunk
[ https://issues.apache.org/jira/browse/YARN-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861426#comment-13861426 ] Hudson commented on YARN-1549: -- FAILURE: Integrated in Hadoop-Yarn-trunk #441 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/441/]) YARN-1549. Fixed a bug in ResourceManager's ApplicationMasterService that was causing unamanged AMs to not finish correctly. Contributed by haosdent. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1554886) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher/src/test/java/org/apache/hadoop/yarn/applications/unmanagedamlauncher/TestUnmanagedAMLauncher.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java TestUnmanagedAMLauncher#testDSShell fails in trunk -- Key: YARN-1549 URL: https://issues.apache.org/jira/browse/YARN-1549 Project: Hadoop YARN Issue Type: Test Affects Versions: 2.2.0 Reporter: Ted Yu Assignee: haosdent Fix For: 2.4.0 Attachments: YARN-1549.1.patch, YARN-1549.patch The following error is reproducible: {code} testDSShell(org.apache.hadoop.yarn.applications.unmanagedamlauncher.TestUnmanagedAMLauncher) Time elapsed: 14.911 sec ERROR! java.lang.RuntimeException: Failed to receive final expected state in ApplicationReport, CurrentState=RUNNING, ExpectedStates=FINISHED,FAILED,KILLED at org.apache.hadoop.yarn.applications.unmanagedamlauncher.UnmanagedAMLauncher.monitorApplication(UnmanagedAMLauncher.java:447) at org.apache.hadoop.yarn.applications.unmanagedamlauncher.UnmanagedAMLauncher.run(UnmanagedAMLauncher.java:352) at org.apache.hadoop.yarn.applications.unmanagedamlauncher.TestUnmanagedAMLauncher.testDSShell(TestUnmanagedAMLauncher.java:147) {code} See https://builds.apache.org/job/Hadoop-Yarn-trunk/435 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
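For context, the failing assertion comes from a monitoring loop of the following general shape: poll the ApplicationReport until the application reaches a terminal state, or give up. This is only an illustrative sketch using the public YarnClient API, not UnmanagedAMLauncher's own code:
{code}
// Illustrative sketch of a report-polling loop; not UnmanagedAMLauncher itself.
import java.util.EnumSet;
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.api.records.YarnApplicationState;
import org.apache.hadoop.yarn.client.api.YarnClient;

final class AppCompletionMonitor {
  private static final EnumSet<YarnApplicationState> TERMINAL_STATES = EnumSet.of(
      YarnApplicationState.FINISHED, YarnApplicationState.FAILED, YarnApplicationState.KILLED);

  static YarnApplicationState waitForCompletion(YarnClient client, ApplicationId appId,
      long timeoutMs) throws Exception {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (System.currentTimeMillis() < deadline) {
      ApplicationReport report = client.getApplicationReport(appId);
      YarnApplicationState state = report.getYarnApplicationState();
      if (TERMINAL_STATES.contains(state)) {
        return state;
      }
      Thread.sleep(1000); // poll again; the test fails when RUNNING never progresses
    }
    throw new RuntimeException("Failed to receive final expected state in ApplicationReport");
  }
}
{code}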
[jira] [Commented] (YARN-1549) TestUnmanagedAMLauncher#testDSShell fails in trunk
[ https://issues.apache.org/jira/browse/YARN-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861511#comment-13861511 ] Hudson commented on YARN-1549: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1633 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1633/]) YARN-1549. Fixed a bug in ResourceManager's ApplicationMasterService that was causing unamanged AMs to not finish correctly. Contributed by haosdent. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1554886) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher/src/test/java/org/apache/hadoop/yarn/applications/unmanagedamlauncher/TestUnmanagedAMLauncher.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java TestUnmanagedAMLauncher#testDSShell fails in trunk -- Key: YARN-1549 URL: https://issues.apache.org/jira/browse/YARN-1549 Project: Hadoop YARN Issue Type: Test Affects Versions: 2.2.0 Reporter: Ted Yu Assignee: haosdent Fix For: 2.4.0 Attachments: YARN-1549.1.patch, YARN-1549.patch The following error is reproducible: {code} testDSShell(org.apache.hadoop.yarn.applications.unmanagedamlauncher.TestUnmanagedAMLauncher) Time elapsed: 14.911 sec ERROR! java.lang.RuntimeException: Failed to receive final expected state in ApplicationReport, CurrentState=RUNNING, ExpectedStates=FINISHED,FAILED,KILLED at org.apache.hadoop.yarn.applications.unmanagedamlauncher.UnmanagedAMLauncher.monitorApplication(UnmanagedAMLauncher.java:447) at org.apache.hadoop.yarn.applications.unmanagedamlauncher.UnmanagedAMLauncher.run(UnmanagedAMLauncher.java:352) at org.apache.hadoop.yarn.applications.unmanagedamlauncher.TestUnmanagedAMLauncher.testDSShell(TestUnmanagedAMLauncher.java:147) {code} See https://builds.apache.org/job/Hadoop-Yarn-trunk/435 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1493) Schedulers don't recognize apps separately from app-attempts
[ https://issues.apache.org/jira/browse/YARN-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861513#comment-13861513 ] Hudson commented on YARN-1493: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1633 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1633/]) YARN-1493. Changed ResourceManager and Scheduler interfacing to recognize app-attempts separately from apps. Contributed by Jian He. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1554896) * /hadoop/common/trunk/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/ResourceSchedulerWrapper.java * /hadoop/common/trunk/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/SLSCapacityScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppEventType.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptEventType.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/event/RMAppAttemptRejectedEvent.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/ActiveUsersManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerAppReport.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerAppUtils.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplication.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueue.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerNode.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/AppAddedSchedulerEvent.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/AppAttemptAddedSchedulerEvent.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/AppRemovedSchedulerEvent.java *
[jira] [Commented] (YARN-1549) TestUnmanagedAMLauncher#testDSShell fails in trunk
[ https://issues.apache.org/jira/browse/YARN-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861556#comment-13861556 ] Hudson commented on YARN-1549: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1658 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1658/]) YARN-1549. Fixed a bug in ResourceManager's ApplicationMasterService that was causing unamanged AMs to not finish correctly. Contributed by haosdent. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1554886) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher/src/test/java/org/apache/hadoop/yarn/applications/unmanagedamlauncher/TestUnmanagedAMLauncher.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java TestUnmanagedAMLauncher#testDSShell fails in trunk -- Key: YARN-1549 URL: https://issues.apache.org/jira/browse/YARN-1549 Project: Hadoop YARN Issue Type: Test Affects Versions: 2.2.0 Reporter: Ted Yu Assignee: haosdent Fix For: 2.4.0 Attachments: YARN-1549.1.patch, YARN-1549.patch The following error is reproducible: {code} testDSShell(org.apache.hadoop.yarn.applications.unmanagedamlauncher.TestUnmanagedAMLauncher) Time elapsed: 14.911 sec ERROR! java.lang.RuntimeException: Failed to receive final expected state in ApplicationReport, CurrentState=RUNNING, ExpectedStates=FINISHED,FAILED,KILLED at org.apache.hadoop.yarn.applications.unmanagedamlauncher.UnmanagedAMLauncher.monitorApplication(UnmanagedAMLauncher.java:447) at org.apache.hadoop.yarn.applications.unmanagedamlauncher.UnmanagedAMLauncher.run(UnmanagedAMLauncher.java:352) at org.apache.hadoop.yarn.applications.unmanagedamlauncher.TestUnmanagedAMLauncher.testDSShell(TestUnmanagedAMLauncher.java:147) {code} See https://builds.apache.org/job/Hadoop-Yarn-trunk/435 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1493) Schedulers don't recognize apps separately from app-attempts
[ https://issues.apache.org/jira/browse/YARN-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861558#comment-13861558 ] Hudson commented on YARN-1493: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1658 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1658/]) YARN-1493. Changed ResourceManager and Scheduler interfacing to recognize app-attempts separately from apps. Contributed by Jian He. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1554896) * /hadoop/common/trunk/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/ResourceSchedulerWrapper.java * /hadoop/common/trunk/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/SLSCapacityScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppEventType.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptEventType.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/event/RMAppAttemptRejectedEvent.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/ActiveUsersManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerAppReport.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerAppUtils.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplication.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueue.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerNode.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/AppAddedSchedulerEvent.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/AppAttemptAddedSchedulerEvent.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/AppRemovedSchedulerEvent.java *
[jira] [Created] (YARN-1556) NPE getting application report with a null appId
Steve Loughran created YARN-1556: Summary: NPE getting application report with a null appId Key: YARN-1556 URL: https://issues.apache.org/jira/browse/YARN-1556 Project: Hadoop YARN Issue Type: Bug Reporter: Steve Loughran Priority: Trivial If you accidentally pass in a null appId to get application report, you get an NPE back. This is arguably as intended, except that maybe a guard statement could report this in such a way as to make it easy for callers to track down the cause. {code} java.lang.NullPointerException: java.lang.NullPointerException org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): java.lang.NullPointerException at java.util.concurrent.ConcurrentHashMap.hash(ConcurrentHashMap.java:333) at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:988) at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:243) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:120) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:241) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042) at org.apache.hadoop.ipc.Client.call(Client.java:1347) at org.apache.hadoop.ipc.Client.call(Client.java:1300) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) at com.sun.proxy.$Proxy75.getApplicationReport(Unknown Source) at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:137) ... 28 more {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
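A guard of the kind suggested could look roughly like the following. This is a hedged sketch, not the eventual fix; whether to throw ApplicationNotFoundException or a plain IllegalArgumentException is an open choice:
{code}
// Hedged sketch of a fail-fast guard for a null ApplicationId; not the committed fix.
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException;

final class ApplicationIdGuard {
  static ApplicationId checkApplicationId(ApplicationId applicationId)
      throws ApplicationNotFoundException {
    if (applicationId == null) {
      // Report the problem explicitly instead of letting the null reach the
      // ConcurrentHashMap lookup in ClientRMService and surface as a bare NPE.
      throw new ApplicationNotFoundException(
          "Invalid ApplicationId: null. Pass a valid application id when requesting a report.");
    }
    return applicationId;
  }
}
{code}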
[jira] [Created] (YARN-1557) TestYarnClient#testAMMRTokens fails in trunk
Xuan Gong created YARN-1557: --- Summary: TestYarnClient#testAMMRTokens fails in trunk Key: YARN-1557 URL: https://issues.apache.org/jira/browse/YARN-1557 Project: Hadoop YARN Issue Type: Test Reporter: Xuan Gong -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1482) WebApplicationProxy should be always-on w.r.t HA even if it is embedded in the RM
[ https://issues.apache.org/jira/browse/YARN-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861678#comment-13861678 ] Xuan Gong commented on YARN-1482: - The testcase failure can be tracked in https://issues.apache.org/jira/browse/YARN-1557 WebApplicationProxy should be always-on w.r.t HA even if it is embedded in the RM - Key: YARN-1482 URL: https://issues.apache.org/jira/browse/YARN-1482 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Xuan Gong Attachments: YARN-1482.1.patch, YARN-1482.2.patch, YARN-1482.3.patch, YARN-1482.4.patch, YARN-1482.4.patch This way, even if an RM goes to standby mode, we can affect a redirect to the active. And more importantly, users will not suddenly see all their links stop working. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1029) Allow embedding leader election into the RM
[ https://issues.apache.org/jira/browse/YARN-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861713#comment-13861713 ] Bikas Saha commented on YARN-1029: -- Patch looks good to me, although the flakiness of the new test needs to be monitored. One option would be to walk through the test in a debugger to satisfy yourself that things are indeed happening the way they should. Let's commit the patch and move on to the next items. I think this patch may have partially covered some of the work of the ZKFC jira. We can address further comments as they come. Allow embedding leader election into the RM --- Key: YARN-1029 URL: https://issues.apache.org/jira/browse/YARN-1029 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Karthik Kambatla Attachments: embedded-zkfc-approach.patch, yarn-1029-0.patch, yarn-1029-0.patch, yarn-1029-1.patch, yarn-1029-2.patch, yarn-1029-3.patch, yarn-1029-4.patch, yarn-1029-5.patch, yarn-1029-6.patch, yarn-1029-7.patch, yarn-1029-7.patch, yarn-1029-8.patch, yarn-1029-9.patch, yarn-1029-approach.patch It should be possible to embed the common ActiveStandbyElector into the RM such that ZooKeeper-based leader election and notification is built in. In conjunction with a ZK state store, this configuration will be a simple deployment option. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1029) Allow embedding leader election into the RM
[ https://issues.apache.org/jira/browse/YARN-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861716#comment-13861716 ] Bikas Saha commented on YARN-1029: -- Thanks for your patience through the review. These things are pretty subtle and the more time we spent making it simple and thinking through stuff the better later on. Although I am sure we will be surprised by real life later on :P Allow embedding leader election into the RM --- Key: YARN-1029 URL: https://issues.apache.org/jira/browse/YARN-1029 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Karthik Kambatla Attachments: embedded-zkfc-approach.patch, yarn-1029-0.patch, yarn-1029-0.patch, yarn-1029-1.patch, yarn-1029-2.patch, yarn-1029-3.patch, yarn-1029-4.patch, yarn-1029-5.patch, yarn-1029-6.patch, yarn-1029-7.patch, yarn-1029-7.patch, yarn-1029-8.patch, yarn-1029-9.patch, yarn-1029-approach.patch It should be possible to embed common ActiveStandyElector into the RM such that ZooKeeper based leader election and notification is in-built. In conjunction with a ZK state store, this configuration will be a simple deployment option. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1410) Handle client failover during 2 step client API's like app submission
[ https://issues.apache.org/jira/browse/YARN-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861731#comment-13861731 ] Xuan Gong commented on YARN-1410: - [~bikassaha], [~kkambatl]: any further comments? Handle client failover during 2 step client API's like app submission - Key: YARN-1410 URL: https://issues.apache.org/jira/browse/YARN-1410 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Xuan Gong Attachments: YARN-1410.1.patch App submission involves 1) creating an appId and 2) using that appId to submit an ApplicationSubmissionContext to the RM. The client may have obtained an appId from an RM, the RM may have failed over, and the client may submit the app to the new RM. Since the new RM has a different notion of cluster timestamp (used to create the app id), the new RM may reject the app submission, resulting in an unexpected failure on the client side. The same may happen for other 2 step client API operations. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
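For readers unfamiliar with the two steps being referred to, here they are using the public YarnClient API (illustration only; the failover handling itself is what this JIRA is about and is not shown):
{code}
// The two client-side steps referred to above, shown for illustration.
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.client.api.YarnClientApplication;

final class TwoStepSubmission {
  static ApplicationId submit(YarnClient client) throws Exception {
    // Step 1: obtain an ApplicationId; it encodes the issuing RM's cluster timestamp.
    YarnClientApplication app = client.createApplication();
    ApplicationSubmissionContext context = app.getApplicationSubmissionContext();
    // ... fill in AM container spec, resources, queue, etc. ...

    // Step 2: submit. If the RM failed over between steps, the new RM sees an appId
    // minted with a different cluster timestamp and may reject the submission.
    return client.submitApplication(context);
  }
}
{code}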
[jira] [Commented] (YARN-1557) TestYarnClient#testAMMRTokens fails in trunk
[ https://issues.apache.org/jira/browse/YARN-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861776#comment-13861776 ] Jian He commented on YARN-1557: --- This is caused by YARN-1493 and will be fixed in YARN-1490 TestYarnClient#testAMMRTokens fails in trunk Key: YARN-1557 URL: https://issues.apache.org/jira/browse/YARN-1557 Project: Hadoop YARN Issue Type: Test Reporter: Xuan Gong -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1029) Allow embedding leader election into the RM
[ https://issues.apache.org/jira/browse/YARN-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861777#comment-13861777 ] Vinod Kumar Vavilapalli commented on YARN-1029: --- bq. Vinod Kumar Vavilapalli - did you get a chance to look at the latest patch? Looking at it right now.. Allow embedding leader election into the RM --- Key: YARN-1029 URL: https://issues.apache.org/jira/browse/YARN-1029 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Karthik Kambatla Attachments: embedded-zkfc-approach.patch, yarn-1029-0.patch, yarn-1029-0.patch, yarn-1029-1.patch, yarn-1029-2.patch, yarn-1029-3.patch, yarn-1029-4.patch, yarn-1029-5.patch, yarn-1029-6.patch, yarn-1029-7.patch, yarn-1029-7.patch, yarn-1029-8.patch, yarn-1029-9.patch, yarn-1029-approach.patch It should be possible to embed common ActiveStandyElector into the RM such that ZooKeeper based leader election and notification is in-built. In conjunction with a ZK state store, this configuration will be a simple deployment option. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1557) TestYarnClient#testAMMRTokens fails in trunk
[ https://issues.apache.org/jira/browse/YARN-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861780#comment-13861780 ] Jian He commented on YARN-1557: --- btw, this is a test issue, not a core code issue. TestYarnClient#testAMMRTokens fails in trunk Key: YARN-1557 URL: https://issues.apache.org/jira/browse/YARN-1557 Project: Hadoop YARN Issue Type: Test Reporter: Xuan Gong -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1490) RM should optionally not kill all containers when an ApplicationMaster exits
[ https://issues.apache.org/jira/browse/YARN-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-1490: -- Attachment: YARN-1490.2.patch RM should optionally not kill all containers when an ApplicationMaster exits Key: YARN-1490 URL: https://issues.apache.org/jira/browse/YARN-1490 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Jian He Attachments: YARN-1490.1.patch, YARN-1490.2.patch This is needed to enable work-preserving AM restart. Some apps can choose to reconnect with old running containers, some may not want to. This should be an option. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
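From the submitting application's point of view, making this optional presumably means a per-app flag on the ApplicationSubmissionContext. A hedged sketch follows; the flag name is an assumption for illustration, and the actual knob is whatever this patch defines:
{code}
// Hedged sketch; setKeepContainersAcrossApplicationAttempts is an assumed flag name.
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.client.api.YarnClientApplication;

final class WorkPreservingSubmitSketch {
  static void submitKeepingContainers(YarnClient client) throws Exception {
    YarnClientApplication app = client.createApplication();
    ApplicationSubmissionContext ctx = app.getApplicationSubmissionContext();
    // Ask the RM not to kill running containers when the AM exits, so a restarted
    // AM attempt can re-register and reconnect to them.
    ctx.setKeepContainersAcrossApplicationAttempts(true);
    client.submitApplication(ctx);
  }
}
{code}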
[jira] [Commented] (YARN-1490) RM should optionally not kill all containers when an ApplicationMaster exits
[ https://issues.apache.org/jira/browse/YARN-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861850#comment-13861850 ] Hadoop QA commented on YARN-1490: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12621367/YARN-1490.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 15 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-tools/hadoop-sls hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2792//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2792//console This message is automatically generated. RM should optionally not kill all containers when an ApplicationMaster exits Key: YARN-1490 URL: https://issues.apache.org/jira/browse/YARN-1490 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Jian He Attachments: YARN-1490.1.patch, YARN-1490.2.patch This is needed to enable work-preserving AM restart. Some apps can chose to reconnect with old running containers, some may not want to. This should be an option. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1495) Allow moving apps between queues
[ https://issues.apache.org/jira/browse/YARN-1495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861864#comment-13861864 ] Sandy Ryza commented on YARN-1495: -- Good point, Bikas. Filed YARN-1558 for this. Allow moving apps between queues Key: YARN-1495 URL: https://issues.apache.org/jira/browse/YARN-1495 Project: Hadoop YARN Issue Type: New Feature Components: scheduler Affects Versions: 2.2.0 Reporter: Sandy Ryza Assignee: Sandy Ryza This is an umbrella JIRA for work needed to allow moving YARN applications from one queue to another. The work will consist of additions in the command line options, additions in the client RM protocol, and changes in the schedulers to support this. I have a picture of how this should function in the Fair Scheduler, but I'm not familiar enough with the Capacity Scheduler to say the same there. Ultimately, the decision of whether an application can be moved should go down to the scheduler - some schedulers may wish not to support this at all. However, schedulers that do support it should share some common semantics around ACLs and what happens to running containers. Here is how I see the general semantics working out: * A move request is issued by the client. After it gets past ACLs, the scheduler checks whether executing the move will violate any constraints. For the Fair Scheduler, these would be queue maxRunningApps and queue maxResources constraints * All running containers are transferred from the old queue to the new queue * All outstanding requests are transferred from the old queue to the new queue Here is how I see the ACLs of this working out: * To move an app from a queue a user must have modify access on the app or administer access on the queue * To move an app to a queue a user must have submit access on the queue or administer access on the queue -- This message was sent by Atlassian JIRA (v6.1.5#6160)
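A hedged sketch of the flow described above, with purely illustrative names standing in for whatever the individual schedulers end up exposing:
{code}
// Illustrative sketch of the move semantics above; all names are stand-ins.
final class MoveAppSketch {

  static void moveApplication(SchedulerView scheduler, String user,
      String appId, String targetQueue) throws Exception {
    // 1. ACLs: modify-app or administer-queue on the source; submit or administer on the target.
    if (!scheduler.checkMoveAccess(user, appId, targetQueue)) {
      throw new SecurityException("User " + user + " may not move " + appId);
    }
    // 2. Constraints: e.g. Fair Scheduler maxRunningApps / maxResources on the target queue.
    scheduler.checkMoveConstraints(appId, targetQueue);
    // 3. Transfer accounting from the old queue to the new one.
    scheduler.transferRunningContainers(appId, targetQueue);
    scheduler.transferOutstandingRequests(appId, targetQueue);
  }

  /** Hypothetical facade over whatever a concrete scheduler exposes. */
  interface SchedulerView {
    boolean checkMoveAccess(String user, String appId, String targetQueue);
    void checkMoveConstraints(String appId, String targetQueue) throws Exception;
    void transferRunningContainers(String appId, String targetQueue);
    void transferOutstandingRequests(String appId, String targetQueue);
  }
}
{code}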
[jira] [Created] (YARN-1558) Persist app queue changes in the RM state store
Sandy Ryza created YARN-1558: Summary: Persist app queue changes in the RM state store Key: YARN-1558 URL: https://issues.apache.org/jira/browse/YARN-1558 Project: Hadoop YARN Issue Type: Sub-task Reporter: Sandy Ryza -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1490) RM should optionally not kill all containers when an ApplicationMaster exits
[ https://issues.apache.org/jira/browse/YARN-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-1490: -- Attachment: YARN-1490.3.patch RM should optionally not kill all containers when an ApplicationMaster exits Key: YARN-1490 URL: https://issues.apache.org/jira/browse/YARN-1490 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Jian He Attachments: YARN-1490.1.patch, YARN-1490.2.patch, YARN-1490.3.patch This is needed to enable work-preserving AM restart. Some apps can chose to reconnect with old running containers, some may not want to. This should be an option. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1490) RM should optionally not kill all containers when an ApplicationMaster exits
[ https://issues.apache.org/jira/browse/YARN-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861906#comment-13861906 ] Hadoop QA commented on YARN-1490: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12621381/YARN-1490.3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 16 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-tools/hadoop-sls hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2793//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2793//console This message is automatically generated. RM should optionally not kill all containers when an ApplicationMaster exits Key: YARN-1490 URL: https://issues.apache.org/jira/browse/YARN-1490 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Jian He Attachments: YARN-1490.1.patch, YARN-1490.2.patch, YARN-1490.3.patch This is needed to enable work-preserving AM restart. Some apps can chose to reconnect with old running containers, some may not want to. This should be an option. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1496) Protocol additions to allow moving apps between queues
[ https://issues.apache.org/jira/browse/YARN-1496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861921#comment-13861921 ] Sandy Ryza commented on YARN-1496: -- Uploading a polished patch Protocol additions to allow moving apps between queues -- Key: YARN-1496 URL: https://issues.apache.org/jira/browse/YARN-1496 Project: Hadoop YARN Issue Type: Sub-task Components: scheduler Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-1496.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1496) Protocol additions to allow moving apps between queues
[ https://issues.apache.org/jira/browse/YARN-1496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1496: - Attachment: YARN-1496-1.patch Protocol additions to allow moving apps between queues -- Key: YARN-1496 URL: https://issues.apache.org/jira/browse/YARN-1496 Project: Hadoop YARN Issue Type: Sub-task Components: scheduler Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-1496-1.patch, YARN-1496.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1496) Protocol additions to allow moving apps between queues
[ https://issues.apache.org/jira/browse/YARN-1496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861935#comment-13861935 ] Hadoop QA commented on YARN-1496: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12621388/YARN-1496-1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 javac{color:red}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2794//console This message is automatically generated. Protocol additions to allow moving apps between queues -- Key: YARN-1496 URL: https://issues.apache.org/jira/browse/YARN-1496 Project: Hadoop YARN Issue Type: Sub-task Components: scheduler Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-1496-1.patch, YARN-1496.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1496) Protocol additions to allow moving apps between queues
[ https://issues.apache.org/jira/browse/YARN-1496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1496: - Attachment: YARN-1496-2.patch Protocol additions to allow moving apps between queues -- Key: YARN-1496 URL: https://issues.apache.org/jira/browse/YARN-1496 Project: Hadoop YARN Issue Type: Sub-task Components: scheduler Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-1496-1.patch, YARN-1496-2.patch, YARN-1496.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1496) Protocol additions to allow moving apps between queues
[ https://issues.apache.org/jira/browse/YARN-1496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861949#comment-13861949 ] Sandy Ryza commented on YARN-1496: -- Fixing compilation issue Protocol additions to allow moving apps between queues -- Key: YARN-1496 URL: https://issues.apache.org/jira/browse/YARN-1496 Project: Hadoop YARN Issue Type: Sub-task Components: scheduler Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-1496-1.patch, YARN-1496-2.patch, YARN-1496.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1490) RM should optionally not kill all containers when an ApplicationMaster exits
[ https://issues.apache.org/jira/browse/YARN-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861951#comment-13861951 ] Jian He commented on YARN-1490: --- sounds better, thanks! RM should optionally not kill all containers when an ApplicationMaster exits Key: YARN-1490 URL: https://issues.apache.org/jira/browse/YARN-1490 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Jian He Attachments: YARN-1490.1.patch, YARN-1490.2.patch, YARN-1490.3.patch This is needed to enable work-preserving AM restart. Some apps can chose to reconnect with old running containers, some may not want to. This should be an option. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1136) Replace junit.framework.Assert with org.junit.Assert
[ https://issues.apache.org/jira/browse/YARN-1136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861957#comment-13861957 ] Hadoop QA commented on YARN-1136: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12620886/yarn1136.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2796//console This message is automatically generated. Replace junit.framework.Assert with org.junit.Assert Key: YARN-1136 URL: https://issues.apache.org/jira/browse/YARN-1136 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Karthik Kambatla Assignee: Chen He Labels: newbie, test Attachments: yarn1136.patch There are several places where we are using junit.framework.Assert instead of org.junit.Assert. {code}grep -rn junit.framework.Assert hadoop-yarn-project/ --include=*.java{code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
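The mechanical change per test file is just the import swap; an illustrative example:
{code}
// Before: import junit.framework.Assert;   (JUnit 3 vintage, deprecated)
// After:
import org.junit.Assert;
import org.junit.Test;

public class TestAssertImportExample {
  @Test
  public void testSomething() {
    Assert.assertEquals("values should match", 42, 40 + 2);
  }
}
{code}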
[jira] [Commented] (YARN-1136) Replace junit.framework.Assert with org.junit.Assert
[ https://issues.apache.org/jira/browse/YARN-1136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861967#comment-13861967 ] Chen He commented on YARN-1136: --- The console output link (https://builds.apache.org/job/PreCommit-YARN-Build/2796//console) returns "Service Temporarily Unavailable". Replace junit.framework.Assert with org.junit.Assert Key: YARN-1136 URL: https://issues.apache.org/jira/browse/YARN-1136 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Karthik Kambatla Assignee: Chen He Labels: newbie, test Attachments: yarn1136.patch There are several places where we are using junit.framework.Assert instead of org.junit.Assert. {code}grep -rn junit.framework.Assert hadoop-yarn-project/ --include=*.java{code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1555) [YARN-321] Failing tests in org.apache.hadoop.yarn.server.applicationhistoryservice.*
[ https://issues.apache.org/jira/browse/YARN-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861971#comment-13861971 ] Mayank Bansal commented on YARN-1555: - +1 Committing Thanks, Mayank [YARN-321] Failing tests in org.apache.hadoop.yarn.server.applicationhistoryservice.* - Key: YARN-1555 URL: https://issues.apache.org/jira/browse/YARN-1555 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Attachments: YARN-1555-20140102.txt Several tests are failing on the latest YARN-321 branch. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1555) [YARN-321] Failing tests in org.apache.hadoop.yarn.server.applicationhistoryservice.*
[ https://issues.apache.org/jira/browse/YARN-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861989#comment-13861989 ] Mayank Bansal commented on YARN-1555: - Committed to YARN-321 branch. Thanks [~vinodkv] [YARN-321] Failing tests in org.apache.hadoop.yarn.server.applicationhistoryservice.* - Key: YARN-1555 URL: https://issues.apache.org/jira/browse/YARN-1555 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Attachments: YARN-1555-20140102.txt Several tests are failing on the latest YARN-321 branch. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1029) Allow embedding leader election into the RM
[ https://issues.apache.org/jira/browse/YARN-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861990#comment-13861990 ] Vinod Kumar Vavilapalli commented on YARN-1029: --- Some comments/questions on the last patch: - yarn_server_resourcemanager_service_protos.proto: RMActiveNodeInfoProto -> ActiveRMInfoProto? - yarn-default.xml: “This kind of failover is embedded in the RM and does not explicitly fence stores.” - “does not” or “does”? - I think we should force admins to set yarn.resourcemanager.cluster-id explicitly (only in case HA is enabled for now). Defaults don’t tend to be changed and a default cluster-id can potentially cause hard-to-debug issues. - No need for YarnBadConfigurationException. It isn’t adding any value and is inconsistent with how we tackle misconfigs everywhere. Let’s just use YarnRuntimeException. - Why is ZK added to hadoop-yarn-client module? It should be only in server-common? - RMFatalEventType.EMBEDDED_ELECTOR -> EMBEDDED_ELECTOR_FAILED or something like that? Similarly STORE_FENCED to STATE_STORE_FENCED and STORE_OP_FAILED to STATE_STORE_OP_FAILED for making it explicit. - EmbeddedElectorService - Initialized in AdminService? It can be initialized in the ResourceManager class itself and it can access AdminService via RMContext. - It can similarly access rmDispatcher from RMContext. Testing - We should have one test that switches off the automatic failover. Maybe retain the old testExplicitFailover test in TestRMFailover? - TestRMHA.testTransitionsWhenAutomaticFailoverEnabled: After each transition, check the state? Allow embedding leader election into the RM --- Key: YARN-1029 URL: https://issues.apache.org/jira/browse/YARN-1029 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Karthik Kambatla Attachments: embedded-zkfc-approach.patch, yarn-1029-0.patch, yarn-1029-0.patch, yarn-1029-1.patch, yarn-1029-2.patch, yarn-1029-3.patch, yarn-1029-4.patch, yarn-1029-5.patch, yarn-1029-6.patch, yarn-1029-7.patch, yarn-1029-7.patch, yarn-1029-8.patch, yarn-1029-9.patch, yarn-1029-approach.patch It should be possible to embed common ActiveStandyElector into the RM such that ZooKeeper based leader election and notification is in-built. In conjunction with a ZK state store, this configuration will be a simple deployment option. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1029) Allow embedding leader election into the RM
[ https://issues.apache.org/jira/browse/YARN-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13862013#comment-13862013 ] Karthik Kambatla commented on YARN-1029: bq. Initialized in AdminService? It can be initialize in ResourceManager class itself and it can access AdminService via RMContext. We initially had it in the RM, but thought AdminService is a better place. http://tinyurl.com/qdo2vos bq. Why is ZK added to hadoop-yarn-client module? It should be only in server-common? TestRMFailover needs it. bq. yarn-default.xml: This kind of failover is embedded in the RM and does not explicitly fence stores.” - “does not” or “does”? The elector doesn't explicitly fence (as in the way HDFS does), it is implicit and the store is supposed to ensure a single RM can modify it at any point in time. bq. I think we should force admins to set yarn.resourcemanager.cluster-id explicitly (only in case HA is enabled for now). Defaults don’t tend to be changed and a default cluster-id can potentially cause hard-to-debug issues. I am okay either way, but I think the fewer configs we *force* admins to set the better. If there is a single cluster, it should be perfectly okay to just use the default. No? Will address remaining suggestions. Allow embedding leader election into the RM --- Key: YARN-1029 URL: https://issues.apache.org/jira/browse/YARN-1029 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Karthik Kambatla Attachments: embedded-zkfc-approach.patch, yarn-1029-0.patch, yarn-1029-0.patch, yarn-1029-1.patch, yarn-1029-2.patch, yarn-1029-3.patch, yarn-1029-4.patch, yarn-1029-5.patch, yarn-1029-6.patch, yarn-1029-7.patch, yarn-1029-7.patch, yarn-1029-8.patch, yarn-1029-9.patch, yarn-1029-approach.patch It should be possible to embed common ActiveStandyElector into the RM such that ZooKeeper based leader election and notification is in-built. In conjunction with a ZK state store, this configuration will be a simple deployment option. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
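To make the wiring under discussion concrete, here is a hedged, purely illustrative sketch (hypothetical names, not the EmbeddedElectorService from the attached patches): the elector's leadership callbacks drive the RM's HA transitions through the AdminService that is reachable from the RMContext, which is what the RMContext-related review comments are about.
{code}
// Hedged sketch only: a hypothetical embedded elector callback, not the class
// in the attached patches. It illustrates the wiring being discussed.
public class EmbeddedElectorSketch {

  /** Stand-in for the subset of AdminService used here (hypothetical). */
  interface HATransitioner {
    void transitionToActive() throws Exception;
    void transitionToStandby() throws Exception;
  }

  private final HATransitioner admin;

  EmbeddedElectorSketch(HATransitioner adminFromRMContext) {
    // Per the review comments, the AdminService (and the dispatcher) would be
    // looked up via RMContext rather than being handed to the elector directly.
    this.admin = adminFromRMContext;
  }

  // Called by the leader elector when this RM wins the ZooKeeper election.
  void becomeActive() throws Exception {
    admin.transitionToActive();
  }

  // Called when leadership is lost or the ZK session lapses.
  void becomeStandby() throws Exception {
    admin.transitionToStandby();
  }
}
{code}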
[jira] [Commented] (YARN-1496) Protocol additions to allow moving apps between queues
[ https://issues.apache.org/jira/browse/YARN-1496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13862051#comment-13862051 ] Hadoop QA commented on YARN-1496: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12621392/YARN-1496-2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The test build failed in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2795//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2795//console This message is automatically generated. Protocol additions to allow moving apps between queues -- Key: YARN-1496 URL: https://issues.apache.org/jira/browse/YARN-1496 Project: Hadoop YARN Issue Type: Sub-task Components: scheduler Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-1496-1.patch, YARN-1496-2.patch, YARN-1496.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1461) RM API and RM changes to handle tags for running jobs
[ https://issues.apache.org/jira/browse/YARN-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13862099#comment-13862099 ] Zhijie Shen commented on YARN-1461: --- Thanks Karthik for the patch. In addition to the discussion in YARN-1399, here're some comments on the patch. 1. How about making the two constants configurable? {code} + @InterfaceStability.Evolving + public static final int MAX_TAGS = 10; + @InterfaceStability.Evolving + public static final int MAX_TAG_LENGTH = 25; {code} 2. Should ApplicationSubmissionContext#newInstance have String[] tags as well? Same for ApplicationReport and GetApplicationsRequest. Or did you leave it out on purpose for the sake of compatibility? If so, I'm just feeling we're going to have more newInstance methods that cannot cover all the fields the objects should have. 3. Should we consider both case-sensitive and -insensitive matching, and both AND and OR logic? {code} + if (tags != null && !tags.isEmpty()) { +   Set<String> appTags = application.getTags(); +   if (appTags == null || appTags.isEmpty()) { +     continue; +   } +   boolean match = false; +   for (String tag : tags) { +     if (appTags.contains(tag)) { +       match = true; +       break; +     } +   } +   if (!match) { +     continue; +   } + } {code} 4. IMHO, one useful web UI addition is to list the top tags (or a tag cloud) on the side bar. When one tag is clicked, the applications with this tag are shown on the page. Anyway, we can deal with the new UI in a separate ticket. RM API and RM changes to handle tags for running jobs - Key: YARN-1461 URL: https://issues.apache.org/jira/browse/YARN-1461 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.2.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Attachments: yarn-1461-1.patch, yarn-1461-2.patch, yarn-1461-3.patch, yarn-1461-4.patch, yarn-1461-5.patch, yarn-1461-6.patch, yarn-1461-6.patch, yarn-1461-7.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
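On point 3, a case-insensitive OR match only needs both sides lower-cased before the containment check; a standalone sketch under that assumption (not the patch code):
{code}
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class TagMatchSketch {
  /** True if any requested tag matches an application tag, ignoring case (OR semantics). */
  static boolean matchesAny(Set<String> requestedTags, Set<String> appTags) {
    if (requestedTags == null || requestedTags.isEmpty()) {
      return true; // no tag filter requested
    }
    if (appTags == null || appTags.isEmpty()) {
      return false;
    }
    Set<String> lowered = new HashSet<String>();
    for (String tag : appTags) {
      lowered.add(tag.toLowerCase(Locale.ENGLISH));
    }
    for (String tag : requestedTags) {
      if (lowered.contains(tag.toLowerCase(Locale.ENGLISH))) {
        return true;
      }
    }
    return false;
  }
}
{code}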
[jira] [Updated] (YARN-1453) [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc comments
[ https://issues.apache.org/jira/browse/YARN-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated YARN-1453: - Attachment: 1453-trunk.patch 1453-branch-2.patch Updated patches refreshed to latest trunk and branch-2. [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc comments - Key: YARN-1453 URL: https://issues.apache.org/jira/browse/YARN-1453 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.0.0, 2.4.0 Reporter: Andrew Purtell Priority: Minor Attachments: 1453-branch-2.patch, 1453-branch-2.patch, 1453-trunk.patch, 1453-trunk.patch Javadoc is more strict by default in JDK8 and will error out on malformed or illegal tags found in doc comments. Although tagged as JDK8 all of the required changes are generic Javadoc cleanups. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
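For context on the kind of cleanup these patches contain (illustrative only, not lifted from them): JDK8's doclint turns previously tolerated HTML problems in doc comments into errors, so unescaped angle brackets and ampersands have to be escaped or wrapped in {@code ...}.
{code}
// Before (JDK8 doclint error, e.g. "malformed HTML"):
//   /** Returns pairs of <name, value> for each queue & its parent. */
//
// After: escape the characters or wrap them in {@code ...}.
/** Returns pairs of {@code <name, value>} for each queue &amp; its parent. */
public final class JavadocCleanupExample {
  private JavadocCleanupExample() { }
}
{code}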
[jira] [Commented] (YARN-1029) Allow embedding leader election into the RM
[ https://issues.apache.org/jira/browse/YARN-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13862110#comment-13862110 ] Vinod Kumar Vavilapalli commented on YARN-1029: --- bq. We initially had it in the RM, but thought AdminService is a better place. http://tinyurl.com/qdo2vos Sure. It's not a big deal either way. Let's leave it the way you had in the latest patch. But the comment about using fields from RMContext holds. bq. TestRMFailover needs it. Hm.. then let's put it in server-common as a compile-time dependency and specifically in hadoop-yarn-client as a test-dependency. Okay? bq. The elector doesn't explicitly fence [...] Maybe state that somehow? It did confuse me a little. bq. I am okay either way, but I think the fewer configs we force admins to set the better. If there is a single cluster, it should be perfectly okay to just use the default. No? Yeah, thought about it. But it seemed to me that the problem of debugging bad issues with conflicting cluster-ids is worse than the little convenience the default value is bringing. Allow embedding leader election into the RM --- Key: YARN-1029 URL: https://issues.apache.org/jira/browse/YARN-1029 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Karthik Kambatla Attachments: embedded-zkfc-approach.patch, yarn-1029-0.patch, yarn-1029-0.patch, yarn-1029-1.patch, yarn-1029-2.patch, yarn-1029-3.patch, yarn-1029-4.patch, yarn-1029-5.patch, yarn-1029-6.patch, yarn-1029-7.patch, yarn-1029-7.patch, yarn-1029-8.patch, yarn-1029-9.patch, yarn-1029-approach.patch It should be possible to embed common ActiveStandyElector into the RM such that ZooKeeper based leader election and notification is in-built. In conjunction with a ZK state store, this configuration will be a simple deployment option. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1029) Allow embedding leader election into the RM
[ https://issues.apache.org/jira/browse/YARN-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13862111#comment-13862111 ] Vinod Kumar Vavilapalli commented on YARN-1029: --- Oh, and apologies for the delayed review, holidays and all. And tx for being patient too. I hope to commit this over this week-end or as soon as you can make it available. Tx. Allow embedding leader election into the RM --- Key: YARN-1029 URL: https://issues.apache.org/jira/browse/YARN-1029 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Karthik Kambatla Attachments: embedded-zkfc-approach.patch, yarn-1029-0.patch, yarn-1029-0.patch, yarn-1029-1.patch, yarn-1029-2.patch, yarn-1029-3.patch, yarn-1029-4.patch, yarn-1029-5.patch, yarn-1029-6.patch, yarn-1029-7.patch, yarn-1029-7.patch, yarn-1029-8.patch, yarn-1029-9.patch, yarn-1029-approach.patch It should be possible to embed common ActiveStandyElector into the RM such that ZooKeeper based leader election and notification is in-built. In conjunction with a ZK state store, this configuration will be a simple deployment option. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1555) [YARN-321] Failing tests in org.apache.hadoop.yarn.server.applicationhistoryservice.*
[ https://issues.apache.org/jira/browse/YARN-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1555: -- Fix Version/s: YARN-321 Hadoop Flags: Reviewed Tx Mayank! You have to set the reviewed-flag and the fix-version during the commit. Setting this one myself for now. [YARN-321] Failing tests in org.apache.hadoop.yarn.server.applicationhistoryservice.* - Key: YARN-1555 URL: https://issues.apache.org/jira/browse/YARN-1555 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Fix For: YARN-321 Attachments: YARN-1555-20140102.txt Several tests are failing on the latest YARN-321 branch. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (YARN-1559) Race between ServerRMProxy and ClientRMProxy setting RMProxy#INSTANCE
Karthik Kambatla created YARN-1559: -- Summary: Race between ServerRMProxy and ClientRMProxy setting RMProxy#INSTANCE Key: YARN-1559 URL: https://issues.apache.org/jira/browse/YARN-1559 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.4.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Blocker RMProxy#INSTANCE is a non-final static field and both ServerRMProxy and ClientRMProxy set it. This leads to races as witnessed on - YARN-1482. Sample trace: {noformat} java.lang.IllegalArgumentException: RM does not support this client protocol at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88) at org.apache.hadoop.yarn.client.ClientRMProxy.checkAllowedProtocols(ClientRMProxy.java:119) at org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider.init(ConfiguredRMFailoverProxyProvider.java:58) at org.apache.hadoop.yarn.client.RMProxy.createRMFailoverProxyProvider(RMProxy.java:158) at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:88) at org.apache.hadoop.yarn.server.api.ServerRMProxy.createRMProxy(ServerRMProxy.java:56) {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
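A hedged, standalone illustration of the hazard being reported (names are hypothetical, not the RMProxy code): when two unrelated classes both assign one mutable static field, the last writer wins and the other caller can end up talking to the wrong proxy; giving each side its own final instance removes the race.
{code}
public class StaticInstanceRace {

  // Problematic pattern: one mutable static shared by two unrelated callers.
  static class SharedProxy {
    static SharedProxy INSTANCE; // last writer wins; readers may see the wrong one
  }

  // Safer pattern: each caller owns its own final instance, so there is
  // nothing to race on.
  static class ClientSideProxy {
    private static final ClientSideProxy INSTANCE = new ClientSideProxy();
    static ClientSideProxy get() { return INSTANCE; }
  }

  static class ServerSideProxy {
    private static final ServerSideProxy INSTANCE = new ServerSideProxy();
    static ServerSideProxy get() { return INSTANCE; }
  }
}
{code}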
[jira] [Commented] (YARN-1482) WebApplicationProxy should be always-on w.r.t HA even if it is embedded in the RM
[ https://issues.apache.org/jira/browse/YARN-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13862118#comment-13862118 ] Karthik Kambatla commented on YARN-1482: bq. in TestRMFailOver.java to pass the test case. Otherwise it will throw out this exception: Ran into something similar - believe it is because of a race between ClientRMProxy and ServerRMProxy - the way we set INSTANCE is unorthodox and lends itself to these. Created YARN-1559. WebApplicationProxy should be always-on w.r.t HA even if it is embedded in the RM - Key: YARN-1482 URL: https://issues.apache.org/jira/browse/YARN-1482 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Xuan Gong Attachments: YARN-1482.1.patch, YARN-1482.2.patch, YARN-1482.3.patch, YARN-1482.4.patch, YARN-1482.4.patch This way, even if an RM goes to standby mode, we can affect a redirect to the active. And more importantly, users will not suddenly see all their links stop working. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1029) Allow embedding leader election into the RM
[ https://issues.apache.org/jira/browse/YARN-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13862120#comment-13862120 ] Karthik Kambatla commented on YARN-1029: No problem. Thanks for the clarification, Vinod. Will take care of these changes as well. While adding testExplicitFailover back, ran into YARN-1559. Might make sense to fix that first. Allow embedding leader election into the RM --- Key: YARN-1029 URL: https://issues.apache.org/jira/browse/YARN-1029 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Karthik Kambatla Attachments: embedded-zkfc-approach.patch, yarn-1029-0.patch, yarn-1029-0.patch, yarn-1029-1.patch, yarn-1029-2.patch, yarn-1029-3.patch, yarn-1029-4.patch, yarn-1029-5.patch, yarn-1029-6.patch, yarn-1029-7.patch, yarn-1029-7.patch, yarn-1029-8.patch, yarn-1029-9.patch, yarn-1029-approach.patch It should be possible to embed common ActiveStandyElector into the RM such that ZooKeeper based leader election and notification is in-built. In conjunction with a ZK state store, this configuration will be a simple deployment option. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1506) Replace set resource change on RMNode/SchedulerNode directly with event notification.
[ https://issues.apache.org/jira/browse/YARN-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13862121#comment-13862121 ] Junping Du commented on YARN-1506: -- The patch is available for review. [~vinodkv] and [~bikassaha], mind giving it a review? Thanks! Replace set resource change on RMNode/SchedulerNode directly with event notification. - Key: YARN-1506 URL: https://issues.apache.org/jira/browse/YARN-1506 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager, scheduler Reporter: Junping Du Assignee: Junping Du Priority: Blocker Attachments: YARN-1506-v1.patch, YARN-1506-v2.patch, YARN-1506-v3.patch, YARN-1506-v4.patch According to Vinod's comments on YARN-312 (https://issues.apache.org/jira/browse/YARN-312?focusedCommentId=13846087page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13846087), we should replace RMNode.setResourceOption() with some resource change event. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
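As a hedged sketch of the shape being proposed (all names are assumptions, not necessarily what the v4 patch uses): instead of callers invoking RMNode.setResourceOption() directly, the admin path would dispatch a node-scoped event carrying the new resource, and the RMNode and scheduler would apply it on their own dispatcher threads.
{code}
// Hypothetical names throughout; this only illustrates the event-driven shape.
public class NodeResourceUpdateSketch {

  /** Stand-in for the resource payload (e.g. ResourceOption in YARN). */
  static class ResourcePayload {
    final int memoryMB;
    final int vcores;
    ResourcePayload(int memoryMB, int vcores) {
      this.memoryMB = memoryMB;
      this.vcores = vcores;
    }
  }

  /** Event sent to an RMNode instead of calling a setter on it directly. */
  static class NodeResourceUpdateEvent {
    final String nodeId;
    final ResourcePayload updated;
    NodeResourceUpdateEvent(String nodeId, ResourcePayload updated) {
      this.nodeId = nodeId;
      this.updated = updated;
    }
  }

  /** The node applies the update inside its own state machine/dispatcher thread. */
  interface NodeEventHandler {
    void handle(NodeResourceUpdateEvent event);
  }
}
{code}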
[jira] [Assigned] (YARN-1556) NPE getting application report with a null appId
[ https://issues.apache.org/jira/browse/YARN-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haosdent reassigned YARN-1556: -- Assignee: haosdent NPE getting application report with a null appId Key: YARN-1556 URL: https://issues.apache.org/jira/browse/YARN-1556 Project: Hadoop YARN Issue Type: Bug Reporter: Steve Loughran Assignee: haosdent Priority: Trivial If you accidentally pass in a null appId to get application report, you get an NPE back. This is arguably as intended, except that maybe a guard statement could report this in such a way as to make it easy for callers to track down the cause. {code} java.lang.NullPointerException: java.lang.NullPointerException org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): java.lang.NullPointerException at java.util.concurrent.ConcurrentHashMap.hash(ConcurrentHashMap.java:333) at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:988) at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:243) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:120) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:241) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042) at org.apache.hadoop.ipc.Client.call(Client.java:1347) at org.apache.hadoop.ipc.Client.call(Client.java:1300) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) at com.sun.proxy.$Proxy75.getApplicationReport(Unknown Source) at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:137) ... 28 more {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1556) NPE getting application report with a null appId
[ https://issues.apache.org/jira/browse/YARN-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13862238#comment-13862238 ] haosdent commented on YARN-1556: How about check if the appid is null in getApplicationReport and return a more friendly error message? NPE getting application report with a null appId Key: YARN-1556 URL: https://issues.apache.org/jira/browse/YARN-1556 Project: Hadoop YARN Issue Type: Bug Reporter: Steve Loughran Assignee: haosdent Priority: Trivial If you accidentally pass in a null appId to get application report, you get an NPE back. This is arguably as intended, except that maybe a guard statement could report this in such a way as to make it easy for callers to track down the cause. {code} java.lang.NullPointerException: java.lang.NullPointerException org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): java.lang.NullPointerException at java.util.concurrent.ConcurrentHashMap.hash(ConcurrentHashMap.java:333) at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:988) at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:243) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:120) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:241) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042) at org.apache.hadoop.ipc.Client.call(Client.java:1347) at org.apache.hadoop.ipc.Client.call(Client.java:1300) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) at com.sun.proxy.$Proxy75.getApplicationReport(Unknown Source) at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:137) ... 28 more {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
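A minimal sketch of the guard being suggested, factored as a helper for readability; the exception type and message are assumptions, not the committed fix.
{code}
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException;

public final class AppIdGuard {
  private AppIdGuard() { }

  /**
   * Fail fast with a descriptive message instead of letting the
   * ConcurrentHashMap lookup throw a bare NullPointerException.
   */
  public static void checkApplicationId(ApplicationId applicationId)
      throws ApplicationNotFoundException {
    if (applicationId == null) {
      throw new ApplicationNotFoundException(
          "Invalid application id: null. Please specify an application id.");
    }
  }
}
{code}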