date:20141204


 [ 
https://issues.apache.org/jira/browse/YARN-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-2762:
-
Attachment: YARN-2762.2.patch

 RMAdminCLI node-labels-related args should be trimmed and checked before 
 sending to RM
 --

 Key: YARN-2762
 URL: https://issues.apache.org/jira/browse/YARN-2762
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Rohith
Assignee: Rohith
Priority: Minor
 Attachments: YARN-2762.1.patch, YARN-2762.2.patch, YARN-2762.2.patch, 
 YARN-2762.patch


 All NodeLabel args validation's are done at server side. The same can be done 
 at RMAdminCLI so that unnecessary RPC calls can be avoided.
 And for the input such as x,y,,z,, no need to add empty string instead can 
 be skipped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2762) RMAdminCLI node-labels-related args should be trimmed and checked before sending to RM


[ 
https://issues.apache.org/jira/browse/YARN-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234164#comment-14234164
 ] 

Rohith commented on YARN-2762:
--

Updated patch with log message consistency.Kindly review

 RMAdminCLI node-labels-related args should be trimmed and checked before 
 sending to RM
 --

 Key: YARN-2762
 URL: https://issues.apache.org/jira/browse/YARN-2762
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Rohith
Assignee: Rohith
Priority: Minor
 Attachments: YARN-2762.1.patch, YARN-2762.2.patch, YARN-2762.2.patch, 
 YARN-2762.patch


 All NodeLabel args validation's are done at server side. The same can be done 
 at RMAdminCLI so that unnecessary RPC calls can be avoided.
 And for the input such as x,y,,z,, no need to add empty string instead can 
 be skipped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (YARN-2914) Potential race condition in SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()


 [ 
https://issues.apache.org/jira/browse/YARN-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-2914:
---
Summary: Potential race condition in 
SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()  (was: 
Potential race condition in 
SharedCacheUploaderMetrics/ClientSCMMetrics#getInstance())

 Potential race condition in 
 SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()
 

 Key: YARN-2914
 URL: https://issues.apache.org/jira/browse/YARN-2914
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Ted Yu
Assignee: Varun Saxena
Priority: Minor
 Fix For: 2.7.0

 Attachments: YARN-2914.patch


 {code}
   public static ClientSCMMetrics getInstance() {
 ClientSCMMetrics topMetrics = Singleton.INSTANCE.impl;
 if (topMetrics == null) {
   throw new IllegalStateException(
 {code}
 getInstance() doesn't hold lock on Singleton.this
 This may result in IllegalStateException being thrown prematurely.
 [~ctrezzo] reported that SharedCacheUploaderMetrics has also same kind of 
 race condition.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (YARN-2914) Potential race condition in SharedCacheUploaderMetrics/ClientSCMMetrics#getInstance()


 [ 
https://issues.apache.org/jira/browse/YARN-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-2914:
---
Summary: Potential race condition in 
SharedCacheUploaderMetrics/ClientSCMMetrics#getInstance()  (was: Potential race 
condition in 
SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance())

 Potential race condition in 
 SharedCacheUploaderMetrics/ClientSCMMetrics#getInstance()
 -

 Key: YARN-2914
 URL: https://issues.apache.org/jira/browse/YARN-2914
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Ted Yu
Assignee: Varun Saxena
Priority: Minor
 Fix For: 2.7.0

 Attachments: YARN-2914.patch


 {code}
   public static ClientSCMMetrics getInstance() {
 ClientSCMMetrics topMetrics = Singleton.INSTANCE.impl;
 if (topMetrics == null) {
   throw new IllegalStateException(
 {code}
 getInstance() doesn't hold lock on Singleton.this
 This may result in IllegalStateException being thrown prematurely.
 [~ctrezzo] reported that SharedCacheUploaderMetrics has also same kind of 
 race condition.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (YARN-2914) Potential race condition in SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()


 [ 
https://issues.apache.org/jira/browse/YARN-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-2914:
---
Summary: Potential race condition in 
SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()  (was: 
Potential race condition in 
SharedCacheUploaderMetrics/ClientSCMMetrics#getInstance())

 Potential race condition in 
 SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()
 

 Key: YARN-2914
 URL: https://issues.apache.org/jira/browse/YARN-2914
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Ted Yu
Assignee: Varun Saxena
Priority: Minor
 Fix For: 2.7.0

 Attachments: YARN-2914.002.patch, YARN-2914.patch


 {code}
   public static ClientSCMMetrics getInstance() {
 ClientSCMMetrics topMetrics = Singleton.INSTANCE.impl;
 if (topMetrics == null) {
   throw new IllegalStateException(
 {code}
 getInstance() doesn't hold lock on Singleton.this
 This may result in IllegalStateException being thrown prematurely.
 [~ctrezzo] reported that SharedCacheUploaderMetrics has also same kind of 
 race condition.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (YARN-2914) Potential race condition in SharedCacheUploaderMetrics/ClientSCMMetrics#getInstance()


 [ 
https://issues.apache.org/jira/browse/YARN-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-2914:
---
Attachment: YARN-2914.002.patch

 Potential race condition in 
 SharedCacheUploaderMetrics/ClientSCMMetrics#getInstance()
 -

 Key: YARN-2914
 URL: https://issues.apache.org/jira/browse/YARN-2914
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Ted Yu
Assignee: Varun Saxena
Priority: Minor
 Fix For: 2.7.0

 Attachments: YARN-2914.002.patch, YARN-2914.patch


 {code}
   public static ClientSCMMetrics getInstance() {
 ClientSCMMetrics topMetrics = Singleton.INSTANCE.impl;
 if (topMetrics == null) {
   throw new IllegalStateException(
 {code}
 getInstance() doesn't hold lock on Singleton.this
 This may result in IllegalStateException being thrown prematurely.
 [~ctrezzo] reported that SharedCacheUploaderMetrics has also same kind of 
 race condition.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2914) Potential race condition in SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()


[ 
https://issues.apache.org/jira/browse/YARN-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234316#comment-14234316
 ] 

Varun Saxena commented on YARN-2914:


Also changed the issue title to reflect fix for CleanerMetrics as well

 Potential race condition in 
 SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()
 

 Key: YARN-2914
 URL: https://issues.apache.org/jira/browse/YARN-2914
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Ted Yu
Assignee: Varun Saxena
Priority: Minor
 Fix For: 2.7.0

 Attachments: YARN-2914.002.patch, YARN-2914.patch


 {code}
   public static ClientSCMMetrics getInstance() {
 ClientSCMMetrics topMetrics = Singleton.INSTANCE.impl;
 if (topMetrics == null) {
   throw new IllegalStateException(
 {code}
 getInstance() doesn't hold lock on Singleton.this
 This may result in IllegalStateException being thrown prematurely.
 [~ctrezzo] reported that SharedCacheUploaderMetrics has also same kind of 
 race condition.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2914) Potential race condition in SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()


[ 
https://issues.apache.org/jira/browse/YARN-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234313#comment-14234313
 ] 

Varun Saxena commented on YARN-2914:


[~ctrezzo], [~ozawa] and [~tedyu], kindly review. 
Have updated code for CleanerMetrics as well. Configuration object being passed 
was unnecessary, so removed it.


 Potential race condition in 
 SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()
 

 Key: YARN-2914
 URL: https://issues.apache.org/jira/browse/YARN-2914
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Ted Yu
Assignee: Varun Saxena
Priority: Minor
 Fix For: 2.7.0

 Attachments: YARN-2914.002.patch, YARN-2914.patch


 {code}
   public static ClientSCMMetrics getInstance() {
 ClientSCMMetrics topMetrics = Singleton.INSTANCE.impl;
 if (topMetrics == null) {
   throw new IllegalStateException(
 {code}
 getInstance() doesn't hold lock on Singleton.this
 This may result in IllegalStateException being thrown prematurely.
 [~ctrezzo] reported that SharedCacheUploaderMetrics has also same kind of 
 race condition.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (YARN-2437) start-yarn.sh/stop-yarn should give info


 [ 
https://issues.apache.org/jira/browse/YARN-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-2437:
---
Attachment: YARN-2437.patch

 start-yarn.sh/stop-yarn should give info
 

 Key: YARN-2437
 URL: https://issues.apache.org/jira/browse/YARN-2437
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scripts
Reporter: Allen Wittenauer
Assignee: Varun Saxena
  Labels: newbie
 Fix For: 2.7.0

 Attachments: YARN-2437.patch


 With the merger and cleanup of the daemon launch code, yarn-daemons.sh no 
 longer prints Starting information.  This should be made more of an analog 
 of start-dfs.sh/stop-dfs.sh.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (YARN-2923) Support configuration based NodeLabelsProvider Service in Distributed Node Label Configuration Setup


 [ 
https://issues.apache.org/jira/browse/YARN-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-2923:

Issue Type: Sub-task  (was: New Feature)
Parent: YARN-2492

 Support configuration based NodeLabelsProvider Service in Distributed Node 
 Label Configuration Setup 
 -

 Key: YARN-2923
 URL: https://issues.apache.org/jira/browse/YARN-2923
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Reporter: Naganarasimha G R
Assignee: Naganarasimha G R

 As part of Distributed Node Labels configuration we need to support Node 
 labels to be configured in Yarn-site.xml. And on modification of Node Labels 
 configuration in yarn-site.xml, NM should be able to get modified Node labels 
 from this NodeLabelsprovider service without NM restart



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)


 [ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-2495:

Description: 
Target of this JIRA is to allow admin specify labels in each NM, this covers
- User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or using 
script suggested by [~aw] (YARN-2729) )
- NM will send labels to RM via ResourceTracker API
- RM will set labels in NodeLabelManager when NM register/update labels

  was:
Target of this JIRA is to allow admin specify labels in each NM, this covers
- User can set labels in each NM (by setting yarn-site.xml or using script 
suggested by [~aw])
- NM will send labels to RM via ResourceTracker API
- RM will set labels in NodeLabelManager when NM register/update labels


 Allow admin specify labels from each NM (Distributed configuration)
 ---

 Key: YARN-2495
 URL: https://issues.apache.org/jira/browse/YARN-2495
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Naganarasimha G R
 Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
 YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
 YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, 
 YARN-2495_20141022.1.patch


 Target of this JIRA is to allow admin specify labels in each NM, this covers
 - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or 
 using script suggested by [~aw] (YARN-2729) )
 - NM will send labels to RM via ResourceTracker API
 - RM will set labels in NodeLabelManager when NM register/update labels



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2437) start-yarn.sh/stop-yarn should give info

2014-12-04 Thread Allen Wittenauer (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234369#comment-14234369
 ] 

Allen Wittenauer commented on YARN-2437:


This code change should be in start and stop and not in yarn.  Doing it yarn 
means that running 'yarn resourcemanager' throws a message which is not ideal 
for startup scripts.

 start-yarn.sh/stop-yarn should give info
 

 Key: YARN-2437
 URL: https://issues.apache.org/jira/browse/YARN-2437
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scripts
Reporter: Allen Wittenauer
Assignee: Varun Saxena
  Labels: newbie
 Fix For: 2.7.0

 Attachments: YARN-2437.patch


 With the merger and cleanup of the daemon launch code, yarn-daemons.sh no 
 longer prints Starting information.  This should be made more of an analog 
 of start-dfs.sh/stop-dfs.sh.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2900) Application (Attempt and Container) Not Found in AHS results in Internal Server Error (500)


[ 
https://issues.apache.org/jira/browse/YARN-2900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234377#comment-14234377
 ] 

Hadoop QA commented on YARN-2900:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12684991/YARN-2900.patch
  against trunk revision 9d1a8f5.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5990//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5990//console

This message is automatically generated.

 Application (Attempt and Container) Not Found in AHS results in Internal 
 Server Error (500)
 ---

 Key: YARN-2900
 URL: https://issues.apache.org/jira/browse/YARN-2900
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles
Assignee: Mit Desai
 Attachments: YARN-2900.patch, YARN-2900.patch, YARN-2900.patch, 
 YARN-2900.patch


 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.convertToApplicationReport(ApplicationHistoryManagerImpl.java:128)
   at 
 org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.getApplication(ApplicationHistoryManagerImpl.java:118)
   at 
 org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:222)
   at 
 org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:219)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679)
   at 
 org.apache.hadoop.yarn.server.webapp.WebServices.getApp(WebServices.java:218)
   ... 59 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)


 [ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-2495:

Attachment: YARN-2495.20141204-1.patch

Hi [~wangda],
I have done the required modifications for segregating conf-based node label 
provider implementation to a separated ticket.

 Allow admin specify labels from each NM (Distributed configuration)
 ---

 Key: YARN-2495
 URL: https://issues.apache.org/jira/browse/YARN-2495
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Naganarasimha G R
 Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
 YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
 YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, 
 YARN-2495.20141204-1.patch, YARN-2495_20141022.1.patch


 Target of this JIRA is to allow admin specify labels in each NM, this covers
 - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or 
 using script suggested by [~aw] (YARN-2729) )
 - NM will send labels to RM via ResourceTracker API
 - RM will set labels in NodeLabelManager when NM register/update labels



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (YARN-2923) Support configuration based NodeLabelsProvider Service in Distributed Node Label Configuration Setup


 [ 
https://issues.apache.org/jira/browse/YARN-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-2923:

Attachment: YARN-2923.20141204-1.patch

uploading patch after segregating configuration based NodeLabelProvider service 
from YARN-2495

 Support configuration based NodeLabelsProvider Service in Distributed Node 
 Label Configuration Setup 
 -

 Key: YARN-2923
 URL: https://issues.apache.org/jira/browse/YARN-2923
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Reporter: Naganarasimha G R
Assignee: Naganarasimha G R
 Attachments: YARN-2923.20141204-1.patch


 As part of Distributed Node Labels configuration we need to support Node 
 labels to be configured in Yarn-site.xml. And on modification of Node Labels 
 configuration in yarn-site.xml, NM should be able to get modified Node labels 
 from this NodeLabelsprovider service without NM restart



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2043) Rename internal names to being Timeline Service instead of application history


[ 
https://issues.apache.org/jira/browse/YARN-2043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234421#comment-14234421
 ] 

Naganarasimha G R commented on YARN-2043:
-

Need to fix some issues mentioned as part of yarn-2838

 Rename internal names to being Timeline Service instead of application history
 --

 Key: YARN-2043
 URL: https://issues.apache.org/jira/browse/YARN-2043
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Naganarasimha G R

 Like package and class names. In line with YARN-2033, YARN-1982 etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is minimumAllocation

2014-12-04 Thread Junping Du (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234475#comment-14234475
 ] 

Junping Du commented on YARN-2637:
--

Thanks [~cwelch] for replying my comments and [~leftnoteasy] for your feedback.
bq. option 1 does have the possible issue you describe, and the issue with 
possibly starving all other queues if one queue has the am percent set higher 
than the others I mentioned above.
Agree. Option 1 could be leveraged by malicious behaviors in a multi-tenant 
scenario, i.e. one can ask more AM resources to block AMs in other queues. 
Option 2 sounds reasonable and I agree that we should make sure at least 1 AM 
get launched and warn this case (percentage is set too low). Thoughts?

 maximum-am-resource-percent could be violated when resource of AM is  
 minimumAllocation
 

 Key: YARN-2637
 URL: https://issues.apache.org/jira/browse/YARN-2637
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Wangda Tan
Assignee: Craig Welch
Priority: Critical
 Attachments: YARN-2637.0.patch, YARN-2637.1.patch, 
 YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.2.patch, YARN-2637.6.patch, 
 YARN-2637.7.patch, YARN-2637.9.patch


 Currently, number of AM in leaf queue will be calculated in following way:
 {code}
 max_am_resource = queue_max_capacity * maximum_am_resource_percent
 #max_am_number = max_am_resource / minimum_allocation
 #max_am_number_for_each_user = #max_am_number * userlimit * userlimit_factor
 {code}
 And when submit new application to RM, it will check if an app can be 
 activated in following way:
 {code}
 for (IteratorFiCaSchedulerApp i=pendingApplications.iterator(); 
  i.hasNext(); ) {
   FiCaSchedulerApp application = i.next();
   
   // Check queue limit
   if (getNumActiveApplications() = getMaximumActiveApplications()) {
 break;
   }
   
   // Check user limit
   User user = getUser(application.getUser());
   if (user.getActiveApplications()  
 getMaximumActiveApplicationsPerUser()) {
 user.activateApplication();
 activeApplications.add(application);
 i.remove();
 LOG.info(Application  + application.getApplicationId() +
  from user:  + application.getUser() + 
  activated in queue:  + getQueueName());
   }
 }
 {code}
 An example is,
 If a queue has capacity = 1G, max_am_resource_percent  = 0.2, the maximum 
 resource that AM can use is 200M, assuming minimum_allocation=1M, #am can be 
 launched is 200, and if user uses 5M for each AM ( minimum_allocation). All 
 apps can still be activated, and it will occupy all resource of a queue 
 instead of only a max_am_resource_percent of a queue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2762) RMAdminCLI node-labels-related args should be trimmed and checked before sending to RM


[ 
https://issues.apache.org/jira/browse/YARN-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234499#comment-14234499
 ] 

Wangda Tan commented on YARN-2762:
--

[~rohithsharma],
I thought you uploaded a wrong patch :)

bq. Do you mean validation check for variable map i.e MapNodeId, SetString 
map = new HashMapNodeId, SetString(); 
Yes 

bq. should be done not only for map.isEmpty but also for map values i.e set 
individually? If so, what if input string is host1:port,, host2:port,x?
host1:port host2,x has another meaning, which means user wants to remove all 
labels in host1.

Thanks,

 RMAdminCLI node-labels-related args should be trimmed and checked before 
 sending to RM
 --

 Key: YARN-2762
 URL: https://issues.apache.org/jira/browse/YARN-2762
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Rohith
Assignee: Rohith
Priority: Minor
 Attachments: YARN-2762.1.patch, YARN-2762.2.patch, YARN-2762.2.patch, 
 YARN-2762.patch


 All NodeLabel args validation's are done at server side. The same can be done 
 at RMAdminCLI so that unnecessary RPC calls can be avoided.
 And for the input such as x,y,,z,, no need to add empty string instead can 
 be skipped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (YARN-2437) start-yarn.sh/stop-yarn should give info


 [ 
https://issues.apache.org/jira/browse/YARN-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-2437:
---
Attachment: YARN-2437.001.patch

 start-yarn.sh/stop-yarn should give info
 

 Key: YARN-2437
 URL: https://issues.apache.org/jira/browse/YARN-2437
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scripts
Reporter: Allen Wittenauer
Assignee: Varun Saxena
  Labels: newbie
 Fix For: 2.7.0

 Attachments: YARN-2437.001.patch, YARN-2437.patch


 With the merger and cleanup of the daemon launch code, yarn-daemons.sh no 
 longer prints Starting information.  This should be made more of an analog 
 of start-dfs.sh/stop-dfs.sh.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2437) start-yarn.sh/stop-yarn should give info


[ 
https://issues.apache.org/jira/browse/YARN-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234502#comment-14234502
 ] 

Varun Saxena commented on YARN-2437:


[~aw], we cant print hosts information alongside Starting message unlike 
start-dfs.sh though. Because we do not yet support a command like hdfs getconf 
-namenodes in YARN

 start-yarn.sh/stop-yarn should give info
 

 Key: YARN-2437
 URL: https://issues.apache.org/jira/browse/YARN-2437
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scripts
Reporter: Allen Wittenauer
Assignee: Varun Saxena
  Labels: newbie
 Fix For: 2.7.0

 Attachments: YARN-2437.001.patch, YARN-2437.patch


 With the merger and cleanup of the daemon launch code, yarn-daemons.sh no 
 longer prints Starting information.  This should be made more of an analog 
 of start-dfs.sh/stop-dfs.sh.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)


[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234506#comment-14234506
 ] 

Wangda Tan commented on YARN-2495:
--

Hi [~Naganarasimha],
What I meant is completely remove conf-based node label provider implementation 
(not leave a not-implemented class it in the patch), it should be fine to use a 
dummy node label provider in tests.

 Allow admin specify labels from each NM (Distributed configuration)
 ---

 Key: YARN-2495
 URL: https://issues.apache.org/jira/browse/YARN-2495
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Naganarasimha G R
 Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
 YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
 YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, 
 YARN-2495.20141204-1.patch, YARN-2495_20141022.1.patch


 Target of this JIRA is to allow admin specify labels in each NM, this covers
 - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or 
 using script suggested by [~aw] (YARN-2729) )
 - NM will send labels to RM via ResourceTracker API
 - RM will set labels in NodeLabelManager when NM register/update labels



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (YARN-2924) Node to labels mapping should not transfer to lowercase when adding from RMAdminCLI

Wangda Tan created YARN-2924:


 Summary: Node to labels mapping should not transfer to lowercase 
when adding from RMAdminCLI
 Key: YARN-2924
 URL: https://issues.apache.org/jira/browse/YARN-2924
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: client
Reporter: Wangda Tan
Assignee: Wangda Tan


In existing implementation, when parsing node-to-labels mapping from 
RMAdminCLI, it transferred all labels to lowercase:

{code}
  for (int i = 1; i  splits.length; i++) {
if (!splits[i].trim().isEmpty()) {
  map.get(nodeId).add(splits[i].trim().toLowerCase());
}
  }
{code}

That is not correct, we should fix that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (YARN-2762) RMAdminCLI node-labels-related args should be trimmed and checked before sending to RM


 [ 
https://issues.apache.org/jira/browse/YARN-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-2762:
-
Attachment: YARN-2762.3.patch

Apologies and thanks wangda for finding wrong patch. I was thinking still why 
QA did not run even its been long time since I attached the patch!!!

 RMAdminCLI node-labels-related args should be trimmed and checked before 
 sending to RM
 --

 Key: YARN-2762
 URL: https://issues.apache.org/jira/browse/YARN-2762
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Rohith
Assignee: Rohith
Priority: Minor
 Attachments: YARN-2762.1.patch, YARN-2762.2.patch, YARN-2762.2.patch, 
 YARN-2762.3.patch, YARN-2762.patch


 All NodeLabel args validation's are done at server side. The same can be done 
 at RMAdminCLI so that unnecessary RPC calls can be avoided.
 And for the input such as x,y,,z,, no need to add empty string instead can 
 be skipped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (YARN-2924) Node to labels mapping should not transfer to lowercase when adding from RMAdminCLI


 [ 
https://issues.apache.org/jira/browse/YARN-2924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-2924:
-
Attachment: YARN-2924.1.patch

Attached a fix for this, without this fix, modified test will fail.

 Node to labels mapping should not transfer to lowercase when adding from 
 RMAdminCLI
 ---

 Key: YARN-2924
 URL: https://issues.apache.org/jira/browse/YARN-2924
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: client
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-2924.1.patch


 In existing implementation, when parsing node-to-labels mapping from 
 RMAdminCLI, it transferred all labels to lowercase:
 {code}
   for (int i = 1; i  splits.length; i++) {
 if (!splits[i].trim().isEmpty()) {
   map.get(nodeId).add(splits[i].trim().toLowerCase());
 }
   }
 {code}
 That is not correct, we should fix that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2762) RMAdminCLI node-labels-related args should be trimmed and checked before sending to RM


[ 
https://issues.apache.org/jira/browse/YARN-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234521#comment-14234521
 ] 

Rohith commented on YARN-2762:
--

I updated correct patch, Kindly review.
The difference from previous patch is only log messages.

 RMAdminCLI node-labels-related args should be trimmed and checked before 
 sending to RM
 --

 Key: YARN-2762
 URL: https://issues.apache.org/jira/browse/YARN-2762
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Rohith
Assignee: Rohith
Priority: Minor
 Attachments: YARN-2762.1.patch, YARN-2762.2.patch, YARN-2762.2.patch, 
 YARN-2762.3.patch, YARN-2762.patch


 All NodeLabel args validation's are done at server side. The same can be done 
 at RMAdminCLI so that unnecessary RPC calls can be avoided.
 And for the input such as x,y,,z,, no need to add empty string instead can 
 be skipped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2762) RMAdminCLI node-labels-related args should be trimmed and checked before sending to RM


[ 
https://issues.apache.org/jira/browse/YARN-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234526#comment-14234526
 ] 

Wangda Tan commented on YARN-2762:
--

[~rohithsharma],
The Jenkins should have some issues, I've uploaded several patches yesterday, 
but none of them get ran.

One comment is, could you please merge same error message to a final field of 
RMAdminCLI, like 
{code}
   static final String NO_LABEL = xx
{code}

Thanks,

 RMAdminCLI node-labels-related args should be trimmed and checked before 
 sending to RM
 --

 Key: YARN-2762
 URL: https://issues.apache.org/jira/browse/YARN-2762
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Rohith
Assignee: Rohith
Priority: Minor
 Attachments: YARN-2762.1.patch, YARN-2762.2.patch, YARN-2762.2.patch, 
 YARN-2762.3.patch, YARN-2762.patch


 All NodeLabel args validation's are done at server side. The same can be done 
 at RMAdminCLI so that unnecessary RPC calls can be avoided.
 And for the input such as x,y,,z,, no need to add empty string instead can 
 be skipped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2921) MockRM#waitForState methods can be too slow and flaky

2014-12-04 Thread Tsuyoshi OZAWA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234540#comment-14234540
 ] 

Tsuyoshi OZAWA commented on YARN-2921:
--

Thanks for your suggestion, Karthik. I agree with that we should use 
wait-by-loop approach in this case. I'll update it in a next patch.

{quote}
Other than the smaller sleep, we should also handle the case where the App or 
AppAttempt enters the required state and then moves to a latter state. e.g. App 
moving to RUNNING state when we are waiting for it to get ACCEPTED.
{quote}

I spent some time how I can implement it - how about using variadic as a 
argument of waitFor()? This will work without changing code.

{code}
  public void waitForState(RMAppAttemptState... finalStatesArray)
  throws Exception {
EnumSetRMAppAttemptState finalStates
= EnumSet.copyOf(Arrays.asList(finalStatesArray));
while (!finalStates.contains(attempt.getAppAttemptState())  !isTimeout() {
}
{code}

 MockRM#waitForState methods can be too slow and flaky
 -

 Key: YARN-2921
 URL: https://issues.apache.org/jira/browse/YARN-2921
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: test
Affects Versions: 2.6.0
Reporter: Karthik Kambatla
Assignee: Tsuyoshi OZAWA
 Attachments: YARN-2921.001.patch


 MockRM#waitForState methods currently sleep for too long (2 seconds and 1 
 second). This leads to slow tests and sometimes failures if the 
 App/AppAttempt moves to another state. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (YARN-2762) RMAdminCLI node-labels-related args should be trimmed and checked before sending to RM


 [ 
https://issues.apache.org/jira/browse/YARN-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-2762:
-
Attachment: YARN-2762.4.patch

Thanks for quick review. I updated the patch, please review

 RMAdminCLI node-labels-related args should be trimmed and checked before 
 sending to RM
 --

 Key: YARN-2762
 URL: https://issues.apache.org/jira/browse/YARN-2762
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Rohith
Assignee: Rohith
Priority: Minor
 Attachments: YARN-2762.1.patch, YARN-2762.2.patch, YARN-2762.2.patch, 
 YARN-2762.3.patch, YARN-2762.4.patch, YARN-2762.patch


 All NodeLabel args validation's are done at server side. The same can be done 
 at RMAdminCLI so that unnecessary RPC calls can be avoided.
 And for the input such as x,y,,z,, no need to add empty string instead can 
 be skipped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (YARN-2213) Change proxy-user cookie log in AmIpFilter to DEBUG


 [ 
https://issues.apache.org/jira/browse/YARN-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-2213:
---
Attachment: YARN-2213.patch

 Change proxy-user cookie log in AmIpFilter to DEBUG
 ---

 Key: YARN-2213
 URL: https://issues.apache.org/jira/browse/YARN-2213
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Ted Yu
Assignee: Varun Saxena
Priority: Minor
 Fix For: 2.7.0

 Attachments: YARN-2213.patch


 I saw a lot of the following lines in AppMaster log:
 {code}
 14/06/24 17:12:36 WARN web.SliderAmIpFilter: Could not find proxy-user 
 cookie, so user will not be set
 14/06/24 17:12:39 WARN web.SliderAmIpFilter: Could not find proxy-user 
 cookie, so user will not be set
 14/06/24 17:12:39 WARN web.SliderAmIpFilter: Could not find proxy-user 
 cookie, so user will not be set
 {code}
 For long running app, this would consume considerable log space.
 Log level should be changed to DEBUG.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Assigned] (YARN-2213) Change proxy-user cookie log in AmIpFilter to DEBUG


 [ 
https://issues.apache.org/jira/browse/YARN-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena reassigned YARN-2213:
--

Assignee: Varun Saxena

 Change proxy-user cookie log in AmIpFilter to DEBUG
 ---

 Key: YARN-2213
 URL: https://issues.apache.org/jira/browse/YARN-2213
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Ted Yu
Assignee: Varun Saxena
Priority: Minor
 Fix For: 2.7.0

 Attachments: YARN-2213.patch


 I saw a lot of the following lines in AppMaster log:
 {code}
 14/06/24 17:12:36 WARN web.SliderAmIpFilter: Could not find proxy-user 
 cookie, so user will not be set
 14/06/24 17:12:39 WARN web.SliderAmIpFilter: Could not find proxy-user 
 cookie, so user will not be set
 14/06/24 17:12:39 WARN web.SliderAmIpFilter: Could not find proxy-user 
 cookie, so user will not be set
 {code}
 For long running app, this would consume considerable log space.
 Log level should be changed to DEBUG.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2869) CapacityScheduler should trim sub queue names when parse configuration

[
https://issues.apache.org/jira/browse/YARN-2869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234577#comment-14234577
]

Hadoop QA commented on YARN-2869:
-

{color:red}-1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12685021/YARN-2869-3.patch
against trunk revision 1bbcc3d.

{color:green}+1 @author{color}. The patch does not contain any @author
tags.

{color:green}+1 tests included{color}. The patch appears to include 10 new
or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the
total number of javac compiler warnings.

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with
eclipse:eclipse.

{color:green}+1 findbugs{color}. The patch does not introduce any new
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase
the total number of release audit warnings.

{color:red}-1 core tests{color}. The patch failed these unit tests in
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesFairScheduler

org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps

org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes

org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesAppsModification

org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesCapacitySched

org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServices

org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodeLabels

org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokens
org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart

org.apache.hadoop.yarn.server.resourcemanager.TestAMAuthorization

The following test timeouts occurred in
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

org.apache.hadoop.yarn.applications.distributedshell.TestDistrTests
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesAppsModification

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results:
https://builds.apache.org/job/PreCommit-YARN-Build/5992//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5992//console

This message is automatically generated.

CapacityScheduler should trim sub queue names when parse configuration
--

Key: YARN-2869
URL: https://issues.apache.org/jira/browse/YARN-2869
Project: Hadoop YARN
Issue Type: Bug
Components: capacityscheduler, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
Attachments: YARN-2869-1.patch, YARN-2869-2.patch, YARN-2869-3.patch

Currently, capacity scheduler doesn't trim sub queue name when parsing queue
names, for example, the configuration
{code}
configuration
property
name...root.queues/name
value a, b , c/value
/property
property
name...root.b.capacity/name
value100/value
/property

...
/property
{code}
Will fail with error:
{code}
java.lang.IllegalArgumentException: Illegal capacity of -1.0 for queue root.
a
at
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.getCapacity(CapacitySchedulerConfiguration.java:332)
at
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.getCapacityFromConf(LeafQueue.java:196)

{code}
It will try to find a queues with name a, b , and c, which is
apparently wrong, we should do trimming on these sub queue names.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2800) Remove MemoryNodeLabelsStore and add a way to enable/disable node labels feature

[
https://issues.apache.org/jira/browse/YARN-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234578#comment-14234578
]

Hadoop QA commented on YARN-2800:
-

{color:red}-1 overall{color}. Here are the results of testing the latest
attachment

http://issues.apache.org/jira/secure/attachment/12685015/YARN-2800-20141203-1.patch
against trunk revision 1bbcc3d.

{color:green}+1 @author{color}. The patch does not contain any @author
tags.

{color:green}+1 tests included{color}. The patch appears to include 10 new
or modified test files.

{color:red}-1 javac{color}. The applied patch generated 1226 javac
compiler warnings (more than the trunk's current 1221 warnings).

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with
eclipse:eclipse.

{color:green}+1 findbugs{color}. The patch does not introduce any new
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase
the total number of release audit warnings.

org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShellWithNodeLabels

org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell
org.apache.hadoop.yarn.client.api.impl.TestTimelineClient

org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestAllocationFileLoaderService
org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart

org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesAppsModification

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results:
https://builds.apache.org/job/PreCommit-YARN-Build/5993//testReport/
Javac warnings:
https://builds.apache.org/job/PreCommit-YARN-Build/5993//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5993//console

This message is automatically generated.

Remove MemoryNodeLabelsStore and add a way to enable/disable node labels
feature

Key: YARN-2800
URL: https://issues.apache.org/jira/browse/YARN-2800
Project: Hadoop YARN
Issue Type: Sub-task
Components: client, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
Attachments: YARN-2800-20141102-1.patch, YARN-2800-20141102-2.patch,
YARN-2800-20141118-1.patch, YARN-2800-20141118-2.patch,
YARN-2800-20141119-1.patch, YARN-2800-20141203-1.patch

In the past, we have a MemoryNodeLabelStore, mostly for user to try this
feature without configuring where to store node labels on file system. It
seems convenient for user to try this, but actually it causes some bad use
experience. User may add/remove labels, and edit capacity-scheduler.xml.
After RM restart, labels will gone, (we store it in mem). And RM cannot get
started if we have some queue uses labels, and the labels don't exist in
cluster.
As what we discussed, we should have an explicitly way to let user specify if
he/she wants this feature or not. If node label is disabled, any operations
trying to modify/use node labels will throw exception.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2301) Improve yarn container command


[ 
https://issues.apache.org/jira/browse/YARN-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234583#comment-14234583
 ] 

Hadoop QA commented on YARN-2301:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12685097/YARN-2301.20141204-1.patch
  against trunk revision 1bbcc3d.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5994//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5994//console

This message is automatically generated.

 Improve yarn container command
 --

 Key: YARN-2301
 URL: https://issues.apache.org/jira/browse/YARN-2301
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Jian He
Assignee: Naganarasimha G R
  Labels: usability
 Attachments: YARN-2301.01.patch, YARN-2301.03.patch, 
 YARN-2301.20141120-1.patch, YARN-2301.20141203-1.patch, 
 YARN-2301.20141204-1.patch, YARN-2303.patch


 While running yarn container -list Application Attempt ID command, some 
 observations:
 1) the scheme (e.g. http/https  ) before LOG-URL is missing
 2) the start-time is printed as milli seconds (e.g. 1405540544844). Better to 
 print as time format.
 3) finish-time is 0 if container is not yet finished. May be N/A
 4) May have an option to run as yarn container -list appId OR  yarn 
 application -list-containers appId also.  
 As attempt Id is not shown on console, this is easier for user to just copy 
 the appId and run it, may  also be useful for container-preserving AM 
 restart. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2437) start-yarn.sh/stop-yarn should give info

2014-12-04 Thread Allen Wittenauer (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234594#comment-14234594
 ] 

Allen Wittenauer commented on YARN-2437:


Why can't you just use the HDFS one?

 start-yarn.sh/stop-yarn should give info
 

 Key: YARN-2437
 URL: https://issues.apache.org/jira/browse/YARN-2437
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scripts
Reporter: Allen Wittenauer
Assignee: Varun Saxena
  Labels: newbie
 Fix For: 2.7.0

 Attachments: YARN-2437.001.patch, YARN-2437.patch


 With the merger and cleanup of the daemon launch code, yarn-daemons.sh no 
 longer prints Starting information.  This should be made more of an analog 
 of start-dfs.sh/stop-dfs.sh.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2301) Improve yarn container command

2014-12-04 Thread Jian He (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234605#comment-14234605
 ] 

Jian He commented on YARN-2301:
---

Hi [~Naganarasimha], 
patch looks good. thanks for updating!

reviewer sometimes cancel the patch for the patch to be updated. You can just 
re-submit the patch once upload a new patch.

 Improve yarn container command
 --

 Key: YARN-2301
 URL: https://issues.apache.org/jira/browse/YARN-2301
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Jian He
Assignee: Naganarasimha G R
  Labels: usability
 Attachments: YARN-2301.01.patch, YARN-2301.03.patch, 
 YARN-2301.20141120-1.patch, YARN-2301.20141203-1.patch, 
 YARN-2301.20141204-1.patch, YARN-2303.patch


 While running yarn container -list Application Attempt ID command, some 
 observations:
 1) the scheme (e.g. http/https  ) before LOG-URL is missing
 2) the start-time is printed as milli seconds (e.g. 1405540544844). Better to 
 print as time format.
 3) finish-time is 0 if container is not yet finished. May be N/A
 4) May have an option to run as yarn container -list appId OR  yarn 
 application -list-containers appId also.  
 As attempt Id is not shown on console, this is easier for user to just copy 
 the appId and run it, may  also be useful for container-preserving AM 
 restart. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2920) CapacityScheduler should be notified when labels on nodes changed

[
https://issues.apache.org/jira/browse/YARN-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234608#comment-14234608
]

Hadoop QA commented on YARN-2920:
-

{color:red}-1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12685013/YARN-2920.1.patch
against trunk revision 1bbcc3d.

{color:green}+1 @author{color}. The patch does not contain any @author
tags.

{color:green}+1 tests included{color}. The patch appears to include 1 new
or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the
total number of javac compiler warnings.

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with
eclipse:eclipse.

{color:green}+1 findbugs{color}. The patch does not introduce any new
Findbugs (version 2.0.3) warnings.

{color:red}-1 release audit{color}. The applied patch generated 2
release audit warnings.

{color:red}-1 core tests{color}. The patch failed these unit tests in
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

org.apache.hadoop.yarn.server.resourcemanager.TestClientRMTokens

org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart

org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerQueueACLs

org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerQueueACLs

org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler

org.apache.hadoop.yarn.server.resourcemanager.TestApplicationACLs

org.apache.hadoop.yarn.server.resourcemanager.scheduler.TestAbstractYarnScheduler

org.apache.hadoop.yarn.server.resourcemanager.TestContainerResourceUsage

org.apache.hadoop.yarn.server.resourcemanager.TestMoveApplication

org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCSQueueUtils
org.apache.hadoop.yarn.server.resourcemanager.TestRM

org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler

org.apache.hadoop.yarn.server.resourcemanager.TestClientRMService

The following test timeouts occurred in
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

org.apache.hadoop.yarn.server.resourcemanager.security.TestClientToAMTokens
org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results:
https://builds.apache.org/job/PreCommit-YARN-Build/5991//testReport/
Release audit warnings:
https://builds.apache.org/job/PreCommit-YARN-Build/5991//artifact/patchprocess/patchReleaseAuditProblems.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5991//console

This message is automatically generated.

CapacityScheduler should be notified when labels on nodes changed
-

Key: YARN-2920
URL: https://issues.apache.org/jira/browse/YARN-2920
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Wangda Tan
Assignee: Wangda Tan
Attachments: YARN-2920.1.patch

Currently, labels on nodes changes will only be handled by
RMNodeLabelsManager, but that is not enough upon labels on nodes changes:
- Scheduler should be able to do take actions to running containers. (Like
kill/preempt/do-nothing)
- Used / available capacity in scheduler should be updated for future
planning.
We need add a new event to pass such updates to scheduler

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2301) Improve yarn container command

2014-12-04 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234618#comment-14234618
 ] 

Hudson commented on YARN-2301:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6649 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6649/])
YARN-2301. Improved yarn container command. Contributed by Naganarasimha G R 
(jianhe: rev 258623ff8bb1a1057ae3501d4f20982d5a59ea34)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContext.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmcontainer/TestRMContainerImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestYarnCLI.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/ApplicationCLI.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmcontainer/RMContainerImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContextImpl.java
* hadoop-yarn-project/CHANGES.txt


 Improve yarn container command
 --

 Key: YARN-2301
 URL: https://issues.apache.org/jira/browse/YARN-2301
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Jian He
Assignee: Naganarasimha G R
  Labels: usability
 Fix For: 2.7.0

 Attachments: YARN-2301.01.patch, YARN-2301.03.patch, 
 YARN-2301.20141120-1.patch, YARN-2301.20141203-1.patch, 
 YARN-2301.20141204-1.patch, YARN-2303.patch


 While running yarn container -list Application Attempt ID command, some 
 observations:
 1) the scheme (e.g. http/https  ) before LOG-URL is missing
 2) the start-time is printed as milli seconds (e.g. 1405540544844). Better to 
 print as time format.
 3) finish-time is 0 if container is not yet finished. May be N/A
 4) May have an option to run as yarn container -list appId OR  yarn 
 application -list-containers appId also.  
 As attempt Id is not shown on console, this is easier for user to just copy 
 the appId and run it, may  also be useful for container-preserving AM 
 restart. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (YARN-2189) Admin service for cache manager

2014-12-04 Thread Chris Trezzo (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-2189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated YARN-2189:
---
Attachment: YARN-2189-trunk-v7.patch

[~kasha] V7 attached. V6 to V7 diff: 
https://github.com/ctrezzo/hadoop/commit/ef90864334a3b0569ccfbc3a1d3783afd7ea89f4

I cleaned up the imports in SCMAdmin. I started to make the more generic audit 
logger, but all methods in RMAuditLogger are static! This makes it difficult to 
create an interface or abstract class that can be shared across servers. I 
looked at making a more generic verify access method that would be shared, but 
the utility is lost because of the inability to share the logger. I could make 
an RMAuditLogger without static methods, but that seems like something that 
probably deserves its own jira. Let me know what you think. I could be missing 
something completely.

 Admin service for cache manager
 ---

 Key: YARN-2189
 URL: https://issues.apache.org/jira/browse/YARN-2189
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Chris Trezzo
Assignee: Chris Trezzo
 Attachments: YARN-2189-trunk-v1.patch, YARN-2189-trunk-v2.patch, 
 YARN-2189-trunk-v3.patch, YARN-2189-trunk-v4.patch, YARN-2189-trunk-v5.patch, 
 YARN-2189-trunk-v6.patch, YARN-2189-trunk-v7.patch


 Implement the admin service for the shared cache manager. This service is 
 responsible for handling administrative commands such as manually running a 
 cleaner task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2437) start-yarn.sh/stop-yarn should give info


[ 
https://issues.apache.org/jira/browse/YARN-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234700#comment-14234700
 ] 

Varun Saxena commented on YARN-2437:


[~aw], you mean why cant we use the same logic which is present in start-dfs.sh 
to print the list of node managers to be started ?  

 start-yarn.sh/stop-yarn should give info
 

 Key: YARN-2437
 URL: https://issues.apache.org/jira/browse/YARN-2437
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scripts
Reporter: Allen Wittenauer
Assignee: Varun Saxena
  Labels: newbie
 Fix For: 2.7.0

 Attachments: YARN-2437.001.patch, YARN-2437.patch


 With the merger and cleanup of the daemon launch code, yarn-daemons.sh no 
 longer prints Starting information.  This should be made more of an analog 
 of start-dfs.sh/stop-dfs.sh.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2437) start-yarn.sh/stop-yarn should give info

2014-12-04 Thread Allen Wittenauer (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234706#comment-14234706
 ] 

Allen Wittenauer commented on YARN-2437:


Yup. Go ahead and use 'hdfs getconf' to get the info.

 start-yarn.sh/stop-yarn should give info
 

 Key: YARN-2437
 URL: https://issues.apache.org/jira/browse/YARN-2437
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scripts
Reporter: Allen Wittenauer
Assignee: Varun Saxena
  Labels: newbie
 Fix For: 2.7.0

 Attachments: YARN-2437.001.patch, YARN-2437.patch


 With the merger and cleanup of the daemon launch code, yarn-daemons.sh no 
 longer prints Starting information.  This should be made more of an analog 
 of start-dfs.sh/stop-dfs.sh.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2437) start-yarn.sh/stop-yarn should give info

[
https://issues.apache.org/jira/browse/YARN-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234736#comment-14234736
]

Varun Saxena commented on YARN-2437:

[~aw], what I meant in my earlier comment was hdfs getconf is not supported
for YARN. hdfs getconf essentially reads hdfs-site.xml and gets a list of
namenodes by using below command :
{code}
hdfs getconf -namenodes
{code}

Infact, even hdfs getconf -datanodes is not supported.
So in start-dfs.sh, we can list all the namenodes which are being started but
cant do the same for datanodes. So you will find only Starting datanodes
message when datanodes are started.

For YARN, as there is no such command, we cant get hosts info. And hence can
only print Starting resourcemanager or Starting nodemanagers.
Even in 2.4, we only printed starting resourcemanager, logging to x
There is a JIRA open though for implementing yarn getconf command.

start-yarn.sh/stop-yarn should give info

Key: YARN-2437
URL: https://issues.apache.org/jira/browse/YARN-2437
Project: Hadoop YARN
Issue Type: Improvement
Components: scripts
Reporter: Allen Wittenauer
Assignee: Varun Saxena
Labels: newbie
Fix For: 2.7.0

Attachments: YARN-2437.001.patch, YARN-2437.patch

With the merger and cleanup of the daemon launch code, yarn-daemons.sh no
longer prints Starting information. This should be made more of an analog
of start-dfs.sh/stop-dfs.sh.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2914) Potential race condition in SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()

2014-12-04 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234734#comment-14234734
 ] 

Ted Yu commented on YARN-2914:
--

lgtm

I triggered a QA run manually.

 Potential race condition in 
 SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()
 

 Key: YARN-2914
 URL: https://issues.apache.org/jira/browse/YARN-2914
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Ted Yu
Assignee: Varun Saxena
Priority: Minor
 Fix For: 2.7.0

 Attachments: YARN-2914.002.patch, YARN-2914.patch


 {code}
   public static ClientSCMMetrics getInstance() {
 ClientSCMMetrics topMetrics = Singleton.INSTANCE.impl;
 if (topMetrics == null) {
   throw new IllegalStateException(
 {code}
 getInstance() doesn't hold lock on Singleton.this
 This may result in IllegalStateException being thrown prematurely.
 [~ctrezzo] reported that SharedCacheUploaderMetrics has also same kind of 
 race condition.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2437) start-yarn.sh/stop-yarn should give info


[ 
https://issues.apache.org/jira/browse/YARN-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234750#comment-14234750
 ] 

Varun Saxena commented on YARN-2437:


We can although read the ${CONFIG_DIR}/slaves file and get a list of slaves. 
Currently slaves file is read(contents not printed) in hadoop_connect_to_hosts 
function in hadoop-functions.sh
We can read it in yarn-daemons.sh too, if the file exists to provide hosts info.

 start-yarn.sh/stop-yarn should give info
 

 Key: YARN-2437
 URL: https://issues.apache.org/jira/browse/YARN-2437
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scripts
Reporter: Allen Wittenauer
Assignee: Varun Saxena
  Labels: newbie
 Fix For: 2.7.0

 Attachments: YARN-2437.001.patch, YARN-2437.patch


 With the merger and cleanup of the daemon launch code, yarn-daemons.sh no 
 longer prints Starting information.  This should be made more of an analog 
 of start-dfs.sh/stop-dfs.sh.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2914) Potential race condition in SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()


[ 
https://issues.apache.org/jira/browse/YARN-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234754#comment-14234754
 ] 

Hadoop QA commented on YARN-2914:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12685124/YARN-2914.002.patch
  against trunk revision 26d8dec.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5995//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5995//console

This message is automatically generated.

 Potential race condition in 
 SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()
 

 Key: YARN-2914
 URL: https://issues.apache.org/jira/browse/YARN-2914
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Ted Yu
Assignee: Varun Saxena
Priority: Minor
 Fix For: 2.7.0

 Attachments: YARN-2914.002.patch, YARN-2914.patch


 {code}
   public static ClientSCMMetrics getInstance() {
 ClientSCMMetrics topMetrics = Singleton.INSTANCE.impl;
 if (topMetrics == null) {
   throw new IllegalStateException(
 {code}
 getInstance() doesn't hold lock on Singleton.this
 This may result in IllegalStateException being thrown prematurely.
 [~ctrezzo] reported that SharedCacheUploaderMetrics has also same kind of 
 race condition.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2437) start-yarn.sh/stop-yarn should give info


[ 
https://issues.apache.org/jira/browse/YARN-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234760#comment-14234760
 ] 

Varun Saxena commented on YARN-2437:


I meant read it in start-yarn.sh

 start-yarn.sh/stop-yarn should give info
 

 Key: YARN-2437
 URL: https://issues.apache.org/jira/browse/YARN-2437
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scripts
Reporter: Allen Wittenauer
Assignee: Varun Saxena
  Labels: newbie
 Fix For: 2.7.0

 Attachments: YARN-2437.001.patch, YARN-2437.patch


 With the merger and cleanup of the daemon launch code, yarn-daemons.sh no 
 longer prints Starting information.  This should be made more of an analog 
 of start-dfs.sh/stop-dfs.sh.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2914) Potential race condition in SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()

2014-12-04 Thread Tsuyoshi OZAWA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234765#comment-14234765
 ] 

Tsuyoshi OZAWA commented on YARN-2914:
--

LGTM(non-binding).

 Potential race condition in 
 SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()
 

 Key: YARN-2914
 URL: https://issues.apache.org/jira/browse/YARN-2914
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Ted Yu
Assignee: Varun Saxena
Priority: Minor
 Fix For: 2.7.0

 Attachments: YARN-2914.002.patch, YARN-2914.patch


 {code}
   public static ClientSCMMetrics getInstance() {
 ClientSCMMetrics topMetrics = Singleton.INSTANCE.impl;
 if (topMetrics == null) {
   throw new IllegalStateException(
 {code}
 getInstance() doesn't hold lock on Singleton.this
 This may result in IllegalStateException being thrown prematurely.
 [~ctrezzo] reported that SharedCacheUploaderMetrics has also same kind of 
 race condition.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2837) Timeline server needs to recover the timeline DT when restarting

2014-12-04 Thread Jian He (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234801#comment-14234801
 ] 

Jian He commented on YARN-2837:
---

Thanks Zhijie and Li, some comments on my side:

- Logic in {{Version loadedVersion = loadVersion();}} Consider this scenario: 
CURRENT_VERSION_INFO = 2.0; there’s no version info currently saved in 
store-store. {{loadVersion}} returns 1.0; It’ll throw inCompatible exception, 
even though it should not. 
- we can probably use protobuf to incorporate both tokenIdentifer and the 
renewDate to support better compatibility. e.g. RMDelegationTokenIdentifierData
{code}
  renewDate = in.readLong();
  tokenId.readFields(in);
{code}
- {{LeveldbTimelineStateStore#updateToken}}, it’s always adding new token, the 
old token still remain, we should remove the old token
- {{AbstractDelegationTokenSecretManager#delegationTokenSequenceNumber}} is not 
updated on recovery; the implementation seems using sequenceNumber as the key, 
we need to keep track of the latest sequenceNumber so that it can be recovered.
- the following looks ok , a simpler way might be to just concatenate two 
strings.
{code}
.add(TOKEN_MASTER_KEY_ENTRY_PREFIX).add(Integer.toString(keyId)){code}
- the following log is present in both 
{{TimelineDelegationTokenSecretManager#storeNewMasterKey}} and the underlying 
state-store implementation; only printing  in one place is enough. similar to 
other operations.
{code}
  if (LOG.isDebugEnabled()) {
LOG.debug(Storing master key  + key.getKeyId());
  }
{code}
- FILENAME-DB_NAME; leveldb-state-store.ldb - timeline-state-store.ldb
- default path for state store is the same as time-line store for application 
data. If apps posts massive data in store, will that also affect system data 
seek performance ? If so, we should have a different store path from the one 
for apps.  Or we could force user to configure the path properly and throw 
exception otherwise.

 Timeline server needs to recover the timeline DT when restarting
 

 Key: YARN-2837
 URL: https://issues.apache.org/jira/browse/YARN-2837
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Zhijie Shen
Priority: Blocker
 Fix For: 2.7.0

 Attachments: YARN-2837.1.patch, YARN-2837.2.patch, YARN-2837.3.patch


 Timeline server needs to recover the stateful information when restarting as 
 RM/NM/JHS does now. So far the stateful information only includes the 
 timeline DT. Without recovery, the timeline DT of the existing YARN apps is 
 not long valid, and cannot be renewed any more after the timeline server is 
 restarted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2914) Potential race condition in SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()

2014-12-04 Thread Chris Trezzo (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234810#comment-14234810
 ] 

Chris Trezzo commented on YARN-2914:


+1 Looks good to me as well. Thanks!

 Potential race condition in 
 SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()
 

 Key: YARN-2914
 URL: https://issues.apache.org/jira/browse/YARN-2914
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Ted Yu
Assignee: Varun Saxena
Priority: Minor
 Fix For: 2.7.0

 Attachments: YARN-2914.002.patch, YARN-2914.patch


 {code}
   public static ClientSCMMetrics getInstance() {
 ClientSCMMetrics topMetrics = Singleton.INSTANCE.impl;
 if (topMetrics == null) {
   throw new IllegalStateException(
 {code}
 getInstance() doesn't hold lock on Singleton.this
 This may result in IllegalStateException being thrown prematurely.
 [~ctrezzo] reported that SharedCacheUploaderMetrics has also same kind of 
 race condition.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2914) Potential race condition in SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()

2014-12-04 Thread Sangjin Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234836#comment-14234836
 ] 

Sangjin Lee commented on YARN-2914:
---

LGTM (non-binding). Thanks [~varun_saxena]!

 Potential race condition in 
 SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()
 

 Key: YARN-2914
 URL: https://issues.apache.org/jira/browse/YARN-2914
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Ted Yu
Assignee: Varun Saxena
Priority: Minor
 Fix For: 2.7.0

 Attachments: YARN-2914.002.patch, YARN-2914.patch


 {code}
   public static ClientSCMMetrics getInstance() {
 ClientSCMMetrics topMetrics = Singleton.INSTANCE.impl;
 if (topMetrics == null) {
   throw new IllegalStateException(
 {code}
 getInstance() doesn't hold lock on Singleton.this
 This may result in IllegalStateException being thrown prematurely.
 [~ctrezzo] reported that SharedCacheUploaderMetrics has also same kind of 
 race condition.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (YARN-2900) Application (Attempt and Container) Not Found in AHS results in Internal Server Error (500)

2014-12-04 Thread Mit Desai (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-2900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mit Desai updated YARN-2900:

Attachment: YARN-2900.patch

Refining the patch

 Application (Attempt and Container) Not Found in AHS results in Internal 
 Server Error (500)
 ---

 Key: YARN-2900
 URL: https://issues.apache.org/jira/browse/YARN-2900
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles
Assignee: Mit Desai
 Attachments: YARN-2900.patch, YARN-2900.patch, YARN-2900.patch, 
 YARN-2900.patch, YARN-2900.patch


 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.convertToApplicationReport(ApplicationHistoryManagerImpl.java:128)
   at 
 org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.getApplication(ApplicationHistoryManagerImpl.java:118)
   at 
 org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:222)
   at 
 org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:219)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679)
   at 
 org.apache.hadoop.yarn.server.webapp.WebServices.getApp(WebServices.java:218)
   ... 59 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2837) Timeline server needs to recover the timeline DT when restarting

2014-12-04 Thread Li Lu (JIRA)

[
https://issues.apache.org/jira/browse/YARN-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234844#comment-14234844
]

Li Lu commented on YARN-2837:
-

{quote}
Logic in Version loadedVersion = loadVersion(); Consider this scenario:
CURRENT_VERSION_INFO = 2.0; there’s no version info currently saved in
store-store. loadVersion returns 1.0; It’ll throw inCompatible exception, even
though it should not.
{quote}
This looks to be a valid concern, but I noticed similar logic also exist in
LeveldbTimelineStore. We need to be consistent on this logic.

{quote}
default path for state store is the same as time-line store for application
data. If apps posts massive data in store, will that also affect system data
seek performance ?
{quote}
The two leveldb stores are working on different leveldb files, so I think it's
fine.

Timeline server needs to recover the timeline DT when restarting

Key: YARN-2837
URL: https://issues.apache.org/jira/browse/YARN-2837
Project: Hadoop YARN
Issue Type: New Feature
Components: timelineserver
Reporter: Zhijie Shen
Assignee: Zhijie Shen
Priority: Blocker
Fix For: 2.7.0

Attachments: YARN-2837.1.patch, YARN-2837.2.patch, YARN-2837.3.patch

Timeline server needs to recover the stateful information when restarting as
RM/NM/JHS does now. So far the stateful information only includes the
timeline DT. Without recovery, the timeline DT of the existing YARN apps is
not long valid, and cannot be renewed any more after the timeline server is
restarted.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2837) Timeline server needs to recover the timeline DT when restarting

2014-12-04 Thread Zhijie Shen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234847#comment-14234847
 ] 

Zhijie Shen commented on YARN-2837:
---

bq.  It’ll throw inCompatible exception, even though it should not.

I guess it's way we want to do. When currently CURRENT_VERSION_INFO = 1.0, we 
make it compatible with the existing store without version info. When 
CURRENT_VERSION_INFO is upgraded to 2.0, the db schema is no longer compatible 
to 1.0 and the exception is thrown. Going directly from no version to 2.0 may 
not be a valid use case.

Here I simply reuse the same logic that other leveldb impl is using. If you 
have some concern, we can figure out it separately.

bq. RMDelegationTokenIdentifierData

I noticed it before, but I don't think it is really necessary to create another 
protobuf obj just because we have one additional long integer to ser/des.

bq. it’s always adding new token, the old token still remain, we should remove 
the old token

{{db.put(k, v);}} will update the value if key already exists.

bq. the following looks ok , a simpler way might be to just concatenate two 
strings.

That's different. See KeyBuilder  and KeyParser for more detail. It's need to 
be broken into two section for the convenience of parsing it.

bq. Or we could force user to configure the path properly and throw exception 
otherwise.

I prefer to keep state store and data store under the same dir by default.  For 
advanced deployment, the user is free to config them separately.

bq. we need to keep track of the latest sequenceNumber so that it can be 
recovered.

It is updated here.
{code}
public void recover(TimelineServiceState state) throws IOException {
  LOG.info(Recovering  + getClass().getSimpleName());
  for (DelegationKey key : state.getTokenMasterKeyState()) {
addKey(key);
  }
  for (EntryTimelineDelegationTokenIdentifier, Long entry :
  state.getTokenState().entrySet()) {
addPersistedDelegationToken(entry.getKey(), entry.getValue());
  }
}
{code}

Otherwise, I updated the patch accordingly. BTW, I've removed the cache config, 
as it is not so important to the state store.

 Timeline server needs to recover the timeline DT when restarting
 

 Key: YARN-2837
 URL: https://issues.apache.org/jira/browse/YARN-2837
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Zhijie Shen
Priority: Blocker
 Fix For: 2.7.0

 Attachments: YARN-2837.1.patch, YARN-2837.2.patch, YARN-2837.3.patch


 Timeline server needs to recover the stateful information when restarting as 
 RM/NM/JHS does now. So far the stateful information only includes the 
 timeline DT. Without recovery, the timeline DT of the existing YARN apps is 
 not long valid, and cannot be renewed any more after the timeline server is 
 restarted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (YARN-2837) Timeline server needs to recover the timeline DT when restarting

2014-12-04 Thread Zhijie Shen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-2837:
--
Attachment: YARN-2837.4.patch

 Timeline server needs to recover the timeline DT when restarting
 

 Key: YARN-2837
 URL: https://issues.apache.org/jira/browse/YARN-2837
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Zhijie Shen
Priority: Blocker
 Fix For: 2.7.0

 Attachments: YARN-2837.1.patch, YARN-2837.2.patch, YARN-2837.3.patch, 
 YARN-2837.4.patch


 Timeline server needs to recover the stateful information when restarting as 
 RM/NM/JHS does now. So far the stateful information only includes the 
 timeline DT. Without recovery, the timeline DT of the existing YARN apps is 
 not long valid, and cannot be renewed any more after the timeline server is 
 restarted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Created] (YARN-2925) Internal fields in LeafQueue access should be protected when accessed from FiCaSchedulerApp to calculate Headroom

Wangda Tan created YARN-2925:


 Summary: Internal fields in LeafQueue access should be protected 
when accessed from FiCaSchedulerApp to calculate Headroom
 Key: YARN-2925
 URL: https://issues.apache.org/jira/browse/YARN-2925
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Reporter: Wangda Tan
Assignee: Wangda Tan
Priority: Critical


Upon YARN-2644, FiCaScheduler will calculation up-to-date headroom before 
sending back Allocation response to AM.

Headroom calculation is happened in LeafQueue side, uses fields like used 
resource, etc. But it is not protected by any lock of LeafQueue, so it might be 
corrupted is someone else is editing it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2925) Internal fields in LeafQueue access should be protected when accessed from FiCaSchedulerApp to calculate Headroom


[ 
https://issues.apache.org/jira/browse/YARN-2925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234898#comment-14234898
 ] 

Wangda Tan commented on YARN-2925:
--

We cannot simply add a synchronized modifier to internal fields used to get 
user-limit and headroom, it will lead to deadlock:
Assume:
- Thread 1 is CS's message handler, it process a node's heartbeat and trying to 
allocate some containers. It will acquires LeafQueue's synchronized lock first, 
then acquires corresponding FiCaScheduler's synchronized lock
- Thread 2 is ApplicationMasterService.allocate, it will all CS.allocate, first 
will acquires FiCaScheduler's synchronized lock, then it will acquires 
LeafQueue's synchronized
Thread 1/2 will be deadlock after then.

Basically, we have two choices to solve this problem and avoid deadlock 
mentioned above,
- Adding synchronized modifier to CapacityScheduler.allocate, that writing 
operations to LeafQueue will be protected by CapacityScheduler lock. But 
according to read world use case, CapacityScheduler.allocate will be called by 
all application between a short period, lock whole CS seems too inefficiency 
here.
- Adding a fine-grained lock in LeafQueue, only protect resource/capacity 
related fields. With this, fields could be protected and CS lock will be 
avoided altogether, so I prefer to do the 2nd way. 

 Internal fields in LeafQueue access should be protected when accessed from 
 FiCaSchedulerApp to calculate Headroom
 -

 Key: YARN-2925
 URL: https://issues.apache.org/jira/browse/YARN-2925
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Reporter: Wangda Tan
Assignee: Wangda Tan
Priority: Critical

 Upon YARN-2644, FiCaScheduler will calculation up-to-date headroom before 
 sending back Allocation response to AM.
 Headroom calculation is happened in LeafQueue side, uses fields like used 
 resource, etc. But it is not protected by any lock of LeafQueue, so it might 
 be corrupted is someone else is editing it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2189) Admin service for cache manager

2014-12-04 Thread Karthik Kambatla (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234899#comment-14234899
 ] 

Karthik Kambatla commented on YARN-2189:


bq. but all methods in RMAuditLogger are static!
Yeah, that is unfortunate. We should probably fix this, but agree on doing it 
in another JIRA. 

The latest patch looks good to me. +1, pending Jenkins. 


 Admin service for cache manager
 ---

 Key: YARN-2189
 URL: https://issues.apache.org/jira/browse/YARN-2189
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Chris Trezzo
Assignee: Chris Trezzo
 Attachments: YARN-2189-trunk-v1.patch, YARN-2189-trunk-v2.patch, 
 YARN-2189-trunk-v3.patch, YARN-2189-trunk-v4.patch, YARN-2189-trunk-v5.patch, 
 YARN-2189-trunk-v6.patch, YARN-2189-trunk-v7.patch


 Implement the admin service for the shared cache manager. This service is 
 responsible for handling administrative commands such as manually running a 
 cleaner task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2189) Admin service for cache manager


[ 
https://issues.apache.org/jira/browse/YARN-2189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234935#comment-14234935
 ] 

Hadoop QA commented on YARN-2189:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12685173/YARN-2189-trunk-v7.patch
  against trunk revision 26d8dec.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager:

  org.apache.hadoop.yarn.client.TestResourceTrackerOnHA
  
org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5996//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5996//console

This message is automatically generated.

 Admin service for cache manager
 ---

 Key: YARN-2189
 URL: https://issues.apache.org/jira/browse/YARN-2189
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Chris Trezzo
Assignee: Chris Trezzo
 Attachments: YARN-2189-trunk-v1.patch, YARN-2189-trunk-v2.patch, 
 YARN-2189-trunk-v3.patch, YARN-2189-trunk-v4.patch, YARN-2189-trunk-v5.patch, 
 YARN-2189-trunk-v6.patch, YARN-2189-trunk-v7.patch


 Implement the admin service for the shared cache manager. This service is 
 responsible for handling administrative commands such as manually running a 
 cleaner task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2900) Application (Attempt and Container) Not Found in AHS results in Internal Server Error (500)

2014-12-04 Thread Jonathan Eagles (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234967#comment-14234967
 ] 

Jonathan Eagles commented on YARN-2900:
---

+1. [~zjshen], any last comments before this goes in?

 Application (Attempt and Container) Not Found in AHS results in Internal 
 Server Error (500)
 ---

 Key: YARN-2900
 URL: https://issues.apache.org/jira/browse/YARN-2900
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles
Assignee: Mit Desai
 Attachments: YARN-2900.patch, YARN-2900.patch, YARN-2900.patch, 
 YARN-2900.patch, YARN-2900.patch


 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.convertToApplicationReport(ApplicationHistoryManagerImpl.java:128)
   at 
 org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.getApplication(ApplicationHistoryManagerImpl.java:118)
   at 
 org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:222)
   at 
 org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:219)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679)
   at 
 org.apache.hadoop.yarn.server.webapp.WebServices.getApp(WebServices.java:218)
   ... 59 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-1492) truly shared cache for jars (jobjar/libjar)

2014-12-04 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14235024#comment-14235024
 ] 

Hudson commented on YARN-1492:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6651 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6651/])
YARN-2189. [YARN-1492] Admin service for cache manager. (Chris Trezzo via 
kasha) (kasha: rev 78968155d7f87f2147faf96c5eef9c23dba38db8)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/impl/pb/service/SCMAdminProtocolPBServiceImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/SCMAdminProtocol.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/SCMAdminProtocolPB.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/RunSharedCacheCleanerTaskRequestPBImpl.java
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/bin/yarn
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/pom.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/impl/pb/client/SCMAdminProtocolPBClientImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/SCMAdminProtocolService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/RunSharedCacheCleanerTaskResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/SCMAdmin.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/test/java/org/apache/hadoop/yarn/server/sharedcachemanager/TestSCMAdminProtocolService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/RunSharedCacheCleanerTaskRequest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/server/SCM_Admin_protocol.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/RunSharedCacheCleanerTaskResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/SharedCacheManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml


 truly shared cache for jars (jobjar/libjar)
 ---

 Key: YARN-1492
 URL: https://issues.apache.org/jira/browse/YARN-1492
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 2.0.4-alpha
Reporter: Sangjin Lee
Assignee: Chris Trezzo
Priority: Critical
 Attachments: YARN-1492-all-trunk-v1.patch, 
 YARN-1492-all-trunk-v2.patch, YARN-1492-all-trunk-v3.patch, 
 YARN-1492-all-trunk-v4.patch, YARN-1492-all-trunk-v5.patch, 
 shared_cache_design.pdf, shared_cache_design_v2.pdf, 
 shared_cache_design_v3.pdf, shared_cache_design_v4.pdf, 
 shared_cache_design_v5.pdf, shared_cache_design_v6.pdf


 Currently there is the distributed cache that enables you to cache jars and 
 files so that attempts from the same job can reuse them. However, sharing is 
 limited with the distributed cache because it is normally on a per-job basis. 
 On a large cluster, sometimes copying of jobjars and libjars becomes so 
 prevalent that it consumes a large portion of the network bandwidth, not to 
 speak of defeating the purpose of bringing compute to where data is. This 
 is wasteful because in most cases code doesn't change much across many jobs.
 I'd like to propose and discuss feasibility of introducing a truly shared 
 cache so that multiple jobs from multiple users can share and cache jars. 
 This JIRA is to open the discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2189) Admin service for cache manager

2014-12-04 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14235023#comment-14235023
 ] 

Hudson commented on YARN-2189:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6651 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6651/])
YARN-2189. [YARN-1492] Admin service for cache manager. (Chris Trezzo via 
kasha) (kasha: rev 78968155d7f87f2147faf96c5eef9c23dba38db8)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/server/SCM_Admin_protocol.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/impl/pb/service/SCMAdminProtocolPBServiceImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/SCMAdminProtocolPB.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/RunSharedCacheCleanerTaskResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/RunSharedCacheCleanerTaskRequestPBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/SCMAdminProtocol.java
* hadoop-yarn-project/hadoop-yarn/bin/yarn
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/SCMAdminProtocolService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/impl/pb/client/SCMAdminProtocolPBClientImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/SCMAdmin.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/SharedCacheManager.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/pom.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/RunSharedCacheCleanerTaskResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/RunSharedCacheCleanerTaskRequest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/test/java/org/apache/hadoop/yarn/server/sharedcachemanager/TestSCMAdminProtocolService.java


 Admin service for cache manager
 ---

 Key: YARN-2189
 URL: https://issues.apache.org/jira/browse/YARN-2189
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Chris Trezzo
Assignee: Chris Trezzo
 Fix For: 2.7.0

 Attachments: YARN-2189-trunk-v1.patch, YARN-2189-trunk-v2.patch, 
 YARN-2189-trunk-v3.patch, YARN-2189-trunk-v4.patch, YARN-2189-trunk-v5.patch, 
 YARN-2189-trunk-v6.patch, YARN-2189-trunk-v7.patch


 Implement the admin service for the shared cache manager. This service is 
 responsible for handling administrative commands such as manually running a 
 cleaner task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2837) Timeline server needs to recover the timeline DT when restarting


[ 
https://issues.apache.org/jira/browse/YARN-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14235055#comment-14235055
 ] 

Hadoop QA commented on YARN-2837:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12685212/YARN-2837.4.patch
  against trunk revision 7896815.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5997//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5997//console

This message is automatically generated.

 Timeline server needs to recover the timeline DT when restarting
 

 Key: YARN-2837
 URL: https://issues.apache.org/jira/browse/YARN-2837
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Zhijie Shen
Priority: Blocker
 Fix For: 2.7.0

 Attachments: YARN-2837.1.patch, YARN-2837.2.patch, YARN-2837.3.patch, 
 YARN-2837.4.patch


 Timeline server needs to recover the stateful information when restarting as 
 RM/NM/JHS does now. So far the stateful information only includes the 
 timeline DT. Without recovery, the timeline DT of the existing YARN apps is 
 not long valid, and cannot be renewed any more after the timeline server is 
 restarted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2900) Application (Attempt and Container) Not Found in AHS results in Internal Server Error (500)


[ 
https://issues.apache.org/jira/browse/YARN-2900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14235079#comment-14235079
 ] 

Hadoop QA commented on YARN-2900:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12685209/YARN-2900.patch
  against trunk revision 7896815.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5999//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5999//console

This message is automatically generated.

 Application (Attempt and Container) Not Found in AHS results in Internal 
 Server Error (500)
 ---

 Key: YARN-2900
 URL: https://issues.apache.org/jira/browse/YARN-2900
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles
Assignee: Mit Desai
 Attachments: YARN-2900.patch, YARN-2900.patch, YARN-2900.patch, 
 YARN-2900.patch, YARN-2900.patch


 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.convertToApplicationReport(ApplicationHistoryManagerImpl.java:128)
   at 
 org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.getApplication(ApplicationHistoryManagerImpl.java:118)
   at 
 org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:222)
   at 
 org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:219)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679)
   at 
 org.apache.hadoop.yarn.server.webapp.WebServices.getApp(WebServices.java:218)
   ... 59 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2914) Potential race condition in SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()


[ 
https://issues.apache.org/jira/browse/YARN-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14235110#comment-14235110
 ] 

Hadoop QA commented on YARN-2914:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12685124/YARN-2914.002.patch
  against trunk revision 7896815.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6000//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6000//console

This message is automatically generated.

 Potential race condition in 
 SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()
 

 Key: YARN-2914
 URL: https://issues.apache.org/jira/browse/YARN-2914
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Ted Yu
Assignee: Varun Saxena
Priority: Minor
 Fix For: 2.7.0

 Attachments: YARN-2914.002.patch, YARN-2914.patch


 {code}
   public static ClientSCMMetrics getInstance() {
 ClientSCMMetrics topMetrics = Singleton.INSTANCE.impl;
 if (topMetrics == null) {
   throw new IllegalStateException(
 {code}
 getInstance() doesn't hold lock on Singleton.this
 This may result in IllegalStateException being thrown prematurely.
 [~ctrezzo] reported that SharedCacheUploaderMetrics has also same kind of 
 race condition.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2900) Application (Attempt and Container) Not Found in AHS results in Internal Server Error (500)

2014-12-04 Thread Zhijie Shen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14235116#comment-14235116
 ] 

Zhijie Shen commented on YARN-2900:
---

Thanks for working on this bug, [~mitdesai] and [~jeagles]! Here's my feedback 
on this patch:

1. Debug message is better to be wrapped in the {{if (LOG.isDebugEnabled())}} 
block.

2. NotFoundException is web only stuff. It shouldn't thrown from 
ApplicationHistoryManagerOnTimelineStore. Why not just returning null? If 
returning null, no change is required in WebServices, right?

3. In ApplicationHistoryClientService, for those getXXXs() methods, we don't 
throw exception, but just return the empty list.

4. In ApplicationHistoryClientService, take {{getApplicationReport}} for an 
example.
{code}
 GetApplicationReportResponse response =
GetApplicationReportResponse.newInstance(history
  .getApplication(applicationId));
{code}
could be changed to
{code}
ApplicationReport appReport = history.getApplication(applicationId);
if (appReport == null) {
  throw new ApplicationNotFoundException();
}
GetApplicationReportResponse response =
GetApplicationReportResponse.newInstance(appReport);
{code}
Other get-single-report methods can be changed accordingly.


 Application (Attempt and Container) Not Found in AHS results in Internal 
 Server Error (500)
 ---

 Key: YARN-2900
 URL: https://issues.apache.org/jira/browse/YARN-2900
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles
Assignee: Mit Desai
 Attachments: YARN-2900.patch, YARN-2900.patch, YARN-2900.patch, 
 YARN-2900.patch, YARN-2900.patch


 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.convertToApplicationReport(ApplicationHistoryManagerImpl.java:128)
   at 
 org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.getApplication(ApplicationHistoryManagerImpl.java:118)
   at 
 org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:222)
   at 
 org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:219)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679)
   at 
 org.apache.hadoop.yarn.server.webapp.WebServices.getApp(WebServices.java:218)
   ... 59 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-2437) start-yarn.sh/stop-yarn should give info