[jira] [Commented] (YARN-2437) start-yarn.sh/stop-yarn should give info

2014-12-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14235196#comment-14235196
 ] 

Hadoop QA commented on YARN-2437:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12685148/YARN-2437.001.patch
  against trunk revision 7896815.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.TestHdfsAdmin

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5998//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5998//console

This message is automatically generated.

> start-yarn.sh/stop-yarn should give info
> 
>
> Key: YARN-2437
> URL: https://issues.apache.org/jira/browse/YARN-2437
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scripts
>Reporter: Allen Wittenauer
>Assignee: Varun Saxena
>  Labels: newbie
> Fix For: 2.7.0
>
> Attachments: YARN-2437.001.patch, YARN-2437.patch
>
>
> With the merger and cleanup of the daemon launch code, yarn-daemons.sh no 
> longer prints "Starting" information.  This should be made more of an analog 
> of start-dfs.sh/stop-dfs.sh.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2900) Application (Attempt and Container) Not Found in AHS results in Internal Server Error (500)

2014-12-04 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14235116#comment-14235116
 ] 

Zhijie Shen commented on YARN-2900:
---

Thanks for working on this bug, [~mitdesai] and [~jeagles]! Here's my feedback 
on this patch:

1. The debug message should be wrapped in an {{if (LOG.isDebugEnabled())}} 
block.

2. NotFoundException is web-only. It shouldn't be thrown from 
ApplicationHistoryManagerOnTimelineStore. Why not just return null? If it 
returns null, no change is required in WebServices, right?

3. In ApplicationHistoryClientService, the getXXXs() methods don't throw an 
exception; they just return an empty list.

4. In ApplicationHistoryClientService, take {{getApplicationReport}} as an 
example.
{code}
GetApplicationReportResponse response =
    GetApplicationReportResponse.newInstance(
        history.getApplication(applicationId));
{code}
could be changed to
{code}
ApplicationReport appReport = history.getApplication(applicationId);
if (appReport == null) {
  throw new ApplicationNotFoundException();
}
GetApplicationReportResponse response =
    GetApplicationReportResponse.newInstance(appReport);
{code}
Other get-single-report methods can be changed accordingly.
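
As a rough illustration of points 2 to 4, here is a minimal self-contained sketch. All class names in it (NotFoundSketch, HistoryManager, ClientService, AppNotFoundException) are hypothetical stand-ins and not the real YARN classes. The assumption is that the manager layer returns null or an empty list, and only the client-facing layer turns a missing single report into an exception.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class NotFoundSketch {
  /** Hypothetical stand-in for YARN's ApplicationNotFoundException. */
  static final class AppNotFoundException extends Exception {
    AppNotFoundException(String message) {
      super(message);
    }
  }

  /** Hypothetical history manager: it knows nothing about web or RPC error types. */
  static final class HistoryManager {
    private final Map<String, String> reports = new HashMap<>();
    private final Map<String, List<String>> attempts = new HashMap<>();

    void put(String appId, String report) {
      reports.put(appId, report);
    }

    /** Returns null when the application is unknown (point 2). */
    String getApplication(String appId) {
      return reports.get(appId);
    }

    /** Returns an empty list when nothing is stored (point 3). */
    List<String> getAttempts(String appId) {
      List<String> found = attempts.get(appId);
      return found == null ? Collections.<String>emptyList() : found;
    }
  }

  /** Hypothetical client-facing service: converts null into an exception (point 4). */
  static final class ClientService {
    private final HistoryManager history;

    ClientService(HistoryManager history) {
      this.history = history;
    }

    String getApplicationReport(String appId) throws AppNotFoundException {
      String report = history.getApplication(appId);
      if (report == null) {
        throw new AppNotFoundException("Application " + appId + " not found");
      }
      return report;
    }

    List<String> getAttempts(String appId) {
      return history.getAttempts(appId);
    }
  }
}
```

With this split, a web layer could map the null (or the exception) to a 404 on its own, without the manager depending on web-only types.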


> Application (Attempt and Container) Not Found in AHS results in Internal 
> Server Error (500)
> ---
>
> Key: YARN-2900
> URL: https://issues.apache.org/jira/browse/YARN-2900
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Jonathan Eagles
>Assignee: Mit Desai
> Attachments: YARN-2900.patch, YARN-2900.patch, YARN-2900.patch, 
> YARN-2900.patch, YARN-2900.patch
>
>
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.convertToApplicationReport(ApplicationHistoryManagerImpl.java:128)
>   at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.getApplication(ApplicationHistoryManagerImpl.java:118)
>   at 
> org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:222)
>   at 
> org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:219)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679)
>   at 
> org.apache.hadoop.yarn.server.webapp.WebServices.getApp(WebServices.java:218)
>   ... 59 more





[jira] [Commented] (YARN-2914) Potential race condition in SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()

2014-12-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14235110#comment-14235110
 ] 

Hadoop QA commented on YARN-2914:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12685124/YARN-2914.002.patch
  against trunk revision 7896815.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6000//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6000//console

This message is automatically generated.

> Potential race condition in 
> SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()
> 
>
> Key: YARN-2914
> URL: https://issues.apache.org/jira/browse/YARN-2914
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Ted Yu
>Assignee: Varun Saxena
>Priority: Minor
> Fix For: 2.7.0
>
> Attachments: YARN-2914.002.patch, YARN-2914.patch
>
>
> {code}
>   public static ClientSCMMetrics getInstance() {
>     ClientSCMMetrics topMetrics = Singleton.INSTANCE.impl;
>     if (topMetrics == null) {
>       throw new IllegalStateException(
> {code}
> getInstance() doesn't hold the lock on Singleton.this.
> This may result in an IllegalStateException being thrown prematurely.
> [~ctrezzo] reported that SharedCacheUploaderMetrics has the same kind of 
> race condition.
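
One way to close the race described above is to read the field under the same lock that guards its initialization. The following is only a sketch under that assumption; MetricsSingletonSketch, initSingleton() and get() are hypothetical names, and this is not necessarily what the attached patch does.

```java
final class MetricsSingletonSketch {
  private enum Singleton {
    INSTANCE;

    private MetricsSingletonSketch impl;

    /** Creates the instance once; guarded by the enum constant's monitor. */
    synchronized MetricsSingletonSketch init() {
      if (impl == null) {
        impl = new MetricsSingletonSketch();
      }
      return impl;
    }

    /** Reads the field under the same monitor, so a completed init() is visible. */
    synchronized MetricsSingletonSketch get() {
      return impl;
    }
  }

  private MetricsSingletonSketch() {
  }

  static MetricsSingletonSketch initSingleton() {
    return Singleton.INSTANCE.init();
  }

  static MetricsSingletonSketch getInstance() {
    MetricsSingletonSketch metrics = Singleton.INSTANCE.get();
    if (metrics == null) {
      throw new IllegalStateException("metrics have not been initialized");
    }
    return metrics;
  }
}
```

Declaring the field volatile would be another option for visibility; the synchronized getter is used here only because it mirrors the lock mentioned in the description.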





[jira] [Commented] (YARN-2900) Application (Attempt and Container) Not Found in AHS results in Internal Server Error (500)

2014-12-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14235079#comment-14235079
 ] 

Hadoop QA commented on YARN-2900:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12685209/YARN-2900.patch
  against trunk revision 7896815.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5999//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5999//console

This message is automatically generated.

> Application (Attempt and Container) Not Found in AHS results in Internal 
> Server Error (500)
> ---
>
> Key: YARN-2900
> URL: https://issues.apache.org/jira/browse/YARN-2900
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Jonathan Eagles
>Assignee: Mit Desai
> Attachments: YARN-2900.patch, YARN-2900.patch, YARN-2900.patch, 
> YARN-2900.patch, YARN-2900.patch
>
>
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.convertToApplicationReport(ApplicationHistoryManagerImpl.java:128)
>   at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.getApplication(ApplicationHistoryManagerImpl.java:118)
>   at 
> org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:222)
>   at 
> org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:219)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679)
>   at 
> org.apache.hadoop.yarn.server.webapp.WebServices.getApp(WebServices.java:218)
>   ... 59 more





[jira] [Commented] (YARN-2837) Timeline server needs to recover the timeline DT when restarting

2014-12-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14235055#comment-14235055
 ] 

Hadoop QA commented on YARN-2837:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12685212/YARN-2837.4.patch
  against trunk revision 7896815.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5997//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5997//console

This message is automatically generated.

> Timeline server needs to recover the timeline DT when restarting
> 
>
> Key: YARN-2837
> URL: https://issues.apache.org/jira/browse/YARN-2837
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: timelineserver
>Reporter: Zhijie Shen
>Assignee: Zhijie Shen
>Priority: Blocker
> Fix For: 2.7.0
>
> Attachments: YARN-2837.1.patch, YARN-2837.2.patch, YARN-2837.3.patch, 
> YARN-2837.4.patch
>
>
> Timeline server needs to recover the stateful information when restarting, as 
> RM/NM/JHS do now. So far the stateful information only includes the 
> timeline DT. Without recovery, the timeline DT of the existing YARN apps is 
> no longer valid, and cannot be renewed any more after the timeline server is 
> restarted.





[jira] [Commented] (YARN-2189) Admin service for cache manager

2014-12-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14235023#comment-14235023
 ] 

Hudson commented on YARN-2189:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6651 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6651/])
YARN-2189. [YARN-1492] Admin service for cache manager. (Chris Trezzo via 
kasha) (kasha: rev 78968155d7f87f2147faf96c5eef9c23dba38db8)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/server/SCM_Admin_protocol.proto
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/impl/pb/service/SCMAdminProtocolPBServiceImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/SCMAdminProtocolPB.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/RunSharedCacheCleanerTaskResponsePBImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/RunSharedCacheCleanerTaskRequestPBImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/SCMAdminProtocol.java
* hadoop-yarn-project/hadoop-yarn/bin/yarn
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/SCMAdminProtocolService.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/impl/pb/client/SCMAdminProtocolPBClientImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/SCMAdmin.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/SharedCacheManager.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/pom.xml
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/RunSharedCacheCleanerTaskResponse.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/RunSharedCacheCleanerTaskRequest.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_service_protos.proto
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/test/java/org/apache/hadoop/yarn/server/sharedcachemanager/TestSCMAdminProtocolService.java


> Admin service for cache manager
> ---
>
> Key: YARN-2189
> URL: https://issues.apache.org/jira/browse/YARN-2189
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
> Fix For: 2.7.0
>
> Attachments: YARN-2189-trunk-v1.patch, YARN-2189-trunk-v2.patch, 
> YARN-2189-trunk-v3.patch, YARN-2189-trunk-v4.patch, YARN-2189-trunk-v5.patch, 
> YARN-2189-trunk-v6.patch, YARN-2189-trunk-v7.patch
>
>
> Implement the admin service for the shared cache manager. This service is 
> responsible for handling administrative commands such as manually running a 
> cleaner task.





[jira] [Commented] (YARN-1492) truly shared cache for jars (jobjar/libjar)

2014-12-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14235024#comment-14235024
 ] 

Hudson commented on YARN-1492:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6651 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6651/])
YARN-2189. [YARN-1492] Admin service for cache manager. (Chris Trezzo via 
kasha) (kasha: rev 78968155d7f87f2147faf96c5eef9c23dba38db8)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/impl/pb/service/SCMAdminProtocolPBServiceImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/SCMAdminProtocol.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/SCMAdminProtocolPB.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/RunSharedCacheCleanerTaskRequestPBImpl.java
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/bin/yarn
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/pom.xml
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/impl/pb/client/SCMAdminProtocolPBClientImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/SCMAdminProtocolService.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/RunSharedCacheCleanerTaskResponsePBImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/SCMAdmin.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/test/java/org/apache/hadoop/yarn/server/sharedcachemanager/TestSCMAdminProtocolService.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/RunSharedCacheCleanerTaskRequest.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/server/SCM_Admin_protocol.proto
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_service_protos.proto
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/RunSharedCacheCleanerTaskResponse.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/SharedCacheManager.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml


> truly shared cache for jars (jobjar/libjar)
> ---
>
> Key: YARN-1492
> URL: https://issues.apache.org/jira/browse/YARN-1492
> Project: Hadoop YARN
>  Issue Type: New Feature
>Affects Versions: 2.0.4-alpha
>Reporter: Sangjin Lee
>Assignee: Chris Trezzo
>Priority: Critical
> Attachments: YARN-1492-all-trunk-v1.patch, 
> YARN-1492-all-trunk-v2.patch, YARN-1492-all-trunk-v3.patch, 
> YARN-1492-all-trunk-v4.patch, YARN-1492-all-trunk-v5.patch, 
> shared_cache_design.pdf, shared_cache_design_v2.pdf, 
> shared_cache_design_v3.pdf, shared_cache_design_v4.pdf, 
> shared_cache_design_v5.pdf, shared_cache_design_v6.pdf
>
>
> Currently there is the distributed cache that enables you to cache jars and 
> files so that attempts from the same job can reuse them. However, sharing is 
> limited with the distributed cache because it is normally on a per-job basis. 
> On a large cluster, sometimes copying of jobjars and libjars becomes so 
> prevalent that it consumes a large portion of the network bandwidth, not to 
> speak of defeating the purpose of "bringing compute to where data is". This 
> is wasteful because in most cases code doesn't change much across many jobs.
> I'd like to propose and discuss feasibility of introducing a truly shared 
> cache so that multiple jobs from multiple users can share and cache jars. 
> This JIRA is to open the discussion.





[jira] [Commented] (YARN-2900) Application (Attempt and Container) Not Found in AHS results in Internal Server Error (500)

2014-12-04 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234967#comment-14234967
 ] 

Jonathan Eagles commented on YARN-2900:
---

+1. [~zjshen], any last comments before this goes in?

> Application (Attempt and Container) Not Found in AHS results in Internal 
> Server Error (500)
> ---
>
> Key: YARN-2900
> URL: https://issues.apache.org/jira/browse/YARN-2900
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Jonathan Eagles
>Assignee: Mit Desai
> Attachments: YARN-2900.patch, YARN-2900.patch, YARN-2900.patch, 
> YARN-2900.patch, YARN-2900.patch
>
>
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.convertToApplicationReport(ApplicationHistoryManagerImpl.java:128)
>   at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.getApplication(ApplicationHistoryManagerImpl.java:118)
>   at 
> org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:222)
>   at 
> org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:219)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679)
>   at 
> org.apache.hadoop.yarn.server.webapp.WebServices.getApp(WebServices.java:218)
>   ... 59 more





[jira] [Commented] (YARN-2189) Admin service for cache manager

2014-12-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234935#comment-14234935
 ] 

Hadoop QA commented on YARN-2189:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12685173/YARN-2189-trunk-v7.patch
  against trunk revision 26d8dec.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager:

  org.apache.hadoop.yarn.client.TestResourceTrackerOnHA
  org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5996//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5996//console

This message is automatically generated.

> Admin service for cache manager
> ---
>
> Key: YARN-2189
> URL: https://issues.apache.org/jira/browse/YARN-2189
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
> Attachments: YARN-2189-trunk-v1.patch, YARN-2189-trunk-v2.patch, 
> YARN-2189-trunk-v3.patch, YARN-2189-trunk-v4.patch, YARN-2189-trunk-v5.patch, 
> YARN-2189-trunk-v6.patch, YARN-2189-trunk-v7.patch
>
>
> Implement the admin service for the shared cache manager. This service is 
> responsible for handling administrative commands such as manually running a 
> cleaner task.





[jira] [Commented] (YARN-2189) Admin service for cache manager

2014-12-04 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234899#comment-14234899
 ] 

Karthik Kambatla commented on YARN-2189:


bq. but all methods in RMAuditLogger are static!
Yeah, that is unfortunate. We should probably fix this, but agree on doing it 
in another JIRA. 

The latest patch looks good to me. +1, pending Jenkins. 


> Admin service for cache manager
> ---
>
> Key: YARN-2189
> URL: https://issues.apache.org/jira/browse/YARN-2189
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
> Attachments: YARN-2189-trunk-v1.patch, YARN-2189-trunk-v2.patch, 
> YARN-2189-trunk-v3.patch, YARN-2189-trunk-v4.patch, YARN-2189-trunk-v5.patch, 
> YARN-2189-trunk-v6.patch, YARN-2189-trunk-v7.patch
>
>
> Implement the admin service for the shared cache manager. This service is 
> responsible for handling administrative commands such as manually running a 
> cleaner task.





[jira] [Commented] (YARN-2925) Internal fields in LeafQueue access should be protected when accessed from FiCaSchedulerApp to calculate Headroom

2014-12-04 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234898#comment-14234898
 ] 

Wangda Tan commented on YARN-2925:
--

We cannot simply synchronize access to the internal fields used to get 
user-limit and headroom; that would lead to a deadlock.
Assume:
- Thread 1 is CS's message handler. It processes a node's heartbeat and tries to 
allocate some containers. It acquires LeafQueue's synchronized lock first, 
then acquires the corresponding FiCaScheduler's synchronized lock.
- Thread 2 is ApplicationMasterService.allocate. It calls CS.allocate, which 
first acquires FiCaScheduler's synchronized lock, then acquires 
LeafQueue's synchronized lock.
Threads 1 and 2 take the two locks in opposite order, so they can deadlock.

Basically, we have two choices to solve this problem and avoid the deadlock 
mentioned above:
- Add a synchronized modifier to CapacityScheduler.allocate, so that write 
operations on LeafQueue are protected by the CapacityScheduler lock. But 
in real-world use cases, CapacityScheduler.allocate is called by 
all applications within a short period, so locking the whole CS seems too 
inefficient here.
- Add a fine-grained lock in LeafQueue that only protects resource/capacity 
related fields. With this, the fields are protected and the CS lock is 
avoided altogether, so I prefer the 2nd way. 
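
To make the second option more concrete, here is a minimal sketch of a fine-grained lock that guards only the resource fields. LeafQueueLockSketch, its fields and its methods are hypothetical and heavily simplified (a single memory counter instead of real Resource objects); it only assumes that headroom is derived from a limit and the used amount.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class LeafQueueLockSketch {
  // Guards only the resource counters below, not the whole queue.
  private final ReentrantReadWriteLock resourceLock = new ReentrantReadWriteLock();
  private final long limitMemoryMb;
  private long usedMemoryMb;

  LeafQueueLockSketch(long limitMemoryMb) {
    this.limitMemoryMb = limitMemoryMb;
  }

  /** Called on the allocation path; takes the write lock briefly. */
  void allocate(long memoryMb) {
    resourceLock.writeLock().lock();
    try {
      usedMemoryMb += memoryMb;
    } finally {
      resourceLock.writeLock().unlock();
    }
  }

  /** Called when a container is released; takes the write lock briefly. */
  void release(long memoryMb) {
    resourceLock.writeLock().lock();
    try {
      usedMemoryMb -= memoryMb;
    } finally {
      resourceLock.writeLock().unlock();
    }
  }

  /** Called from the application side; needs only the read lock, never the queue monitor. */
  long getHeadroomMb() {
    resourceLock.readLock().lock();
    try {
      return Math.max(0L, limitMemoryMb - usedMemoryMb);
    } finally {
      resourceLock.readLock().unlock();
    }
  }
}
```

Because the headroom reader never waits for the queue's own synchronized lock, the opposite-order acquisition between the two threads described above does not arise for this path.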

> Internal fields in LeafQueue access should be protected when accessed from 
> FiCaSchedulerApp to calculate Headroom
> -
>
> Key: YARN-2925
> URL: https://issues.apache.org/jira/browse/YARN-2925
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Critical
>
> Upon YARN-2644, FiCaScheduler will calculate up-to-date headroom before 
> sending back the Allocation response to the AM.
> Headroom calculation happens on the LeafQueue side and uses fields like used 
> resource, etc. But it is not protected by any lock of LeafQueue, so it might 
> be corrupted if someone else is editing it.





[jira] [Created] (YARN-2925) Internal fields in LeafQueue access should be protected when accessed from FiCaSchedulerApp to calculate Headroom

2014-12-04 Thread Wangda Tan (JIRA)
Wangda Tan created YARN-2925:


 Summary: Internal fields in LeafQueue access should be protected 
when accessed from FiCaSchedulerApp to calculate Headroom
 Key: YARN-2925
 URL: https://issues.apache.org/jira/browse/YARN-2925
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Reporter: Wangda Tan
Assignee: Wangda Tan
Priority: Critical


Upon YARN-2644, FiCaScheduler will calculate up-to-date headroom before 
sending back the Allocation response to the AM.

Headroom calculation happens on the LeafQueue side and uses fields like used 
resource, etc. But it is not protected by any lock of LeafQueue, so it might be 
corrupted if someone else is editing it.





[jira] [Updated] (YARN-2837) Timeline server needs to recover the timeline DT when restarting

2014-12-04 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-2837:
--
Attachment: YARN-2837.4.patch

> Timeline server needs to recover the timeline DT when restarting
> 
>
> Key: YARN-2837
> URL: https://issues.apache.org/jira/browse/YARN-2837
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: timelineserver
>Reporter: Zhijie Shen
>Assignee: Zhijie Shen
>Priority: Blocker
> Fix For: 2.7.0
>
> Attachments: YARN-2837.1.patch, YARN-2837.2.patch, YARN-2837.3.patch, 
> YARN-2837.4.patch
>
>
> Timeline server needs to recover the stateful information when restarting, as 
> RM/NM/JHS do now. So far the stateful information only includes the 
> timeline DT. Without recovery, the timeline DT of the existing YARN apps is 
> no longer valid, and cannot be renewed any more after the timeline server is 
> restarted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2837) Timeline server needs to recover the timeline DT when restarting

2014-12-04 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234847#comment-14234847
 ] 

Zhijie Shen commented on YARN-2837:
---

bq.  It’ll throw inCompatible exception, even though it should not.

I guess that's the way we want it to work. While CURRENT_VERSION_INFO = 1.0, we 
make it compatible with the existing store without version info. When 
CURRENT_VERSION_INFO is upgraded to 2.0, the db schema is no longer compatible 
with 1.0 and the exception is thrown. Going directly from no version to 2.0 may 
not be a valid use case.

Here I simply reuse the same logic that the other leveldb impls are using. If you 
have some concerns, we can figure it out separately.
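
The behavior described here can be sketched roughly as follows. This is an illustration only: SchemaVersionSketch and isCompatible are hypothetical names, and the rule that compatibility means "same major version" with a missing version treated as 1.x is an assumption based on the explanation above, not a copy of the store's code.

```java
final class SchemaVersionSketch {
  /** Assumed default for a store that was written before version info existed. */
  static final int DEFAULT_MAJOR_VERSION = 1;

  /**
   * Returns true when the stored schema can be opened by the current code.
   * A null storedMajor means the store carries no version info at all.
   */
  static boolean isCompatible(Integer storedMajor, int currentMajor) {
    int effectiveMajor = (storedMajor == null) ? DEFAULT_MAJOR_VERSION : storedMajor;
    return effectiveMajor == currentMajor;
  }
}
```

Under these assumptions an unversioned store opens fine while the current version is 1.x, and is rejected once the current version moves to 2.x, which matches the "no version to 2.0" case mentioned above.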

bq. RMDelegationTokenIdentifierData

I noticed it before, but I don't think it is really necessary to create another 
protobuf obj just because we have one additional long integer to ser/des.

bq. it’s always adding new token, the old token still remain, we should remove 
the old token

{{db.put(k, v);}} will update the value if the key already exists.

bq. the following looks ok , a simpler way might be to just concatenate two 
strings.

That's different. See KeyBuilder and KeyParser for more detail. It needs to 
be broken into two sections for the convenience of parsing it.

bq. Or we could force user to configure the path properly and throw exception 
otherwise.

I prefer to keep the state store and data store under the same dir by default. For 
advanced deployments, the user is free to configure them separately.

bq. we need to keep track of the latest sequenceNumber so that it can be 
recovered.

It is updated here.
{code}
public void recover(TimelineServiceState state) throws IOException {
  LOG.info("Recovering " + getClass().getSimpleName());
  for (DelegationKey key : state.getTokenMasterKeyState()) {
    addKey(key);
  }
  for (Entry<TimelineDelegationTokenIdentifier, Long> entry :
      state.getTokenState().entrySet()) {
    addPersistedDelegationToken(entry.getKey(), entry.getValue());
  }
}
{code}
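How replaying persisted tokens can also restore the latest sequence number is sketched below. It assumes each recovered token carries its own sequence number and that the secret manager keeps the maximum it has seen; `RecoveredTokens` and its members are hypothetical stand-ins, not the Hadoop classes named in this thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a secret manager that replays persisted tokens.
final class RecoveredTokens {
  private final Map<Integer, Long> renewDateBySeqNum = new HashMap<>();
  private int latestSequenceNumber = 0;

  /** Re-adds a persisted token and remembers the highest sequence number seen. */
  void addPersistedToken(int sequenceNumber, long renewDate) {
    // put() replaces the value if the key is already present, so re-adding is an update.
    renewDateBySeqNum.put(sequenceNumber, renewDate);
    if (sequenceNumber > latestSequenceNumber) {
      latestSequenceNumber = sequenceNumber;
    }
  }

  /** The next token issued after recovery continues from the recovered maximum. */
  int nextSequenceNumber() {
    return ++latestSequenceNumber;
  }

  int tokenCount() {
    return renewDateBySeqNum.size();
  }
}
```

The point of the sketch is only that replaying tokens in any order leaves the counter at the largest recovered value, so newly issued tokens do not reuse a key.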

Otherwise, I updated the patch accordingly. BTW, I've removed the cache config, 
as it is not so important to the state store.



[jira] [Commented] (YARN-2837) Timeline server needs to recover the timeline DT when restarting

2014-12-04 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234844#comment-14234844
 ] 

Li Lu commented on YARN-2837:
-

{quote}
Logic in Version loadedVersion = loadVersion(); Consider this scenario: 
CURRENT_VERSION_INFO = 2.0; there’s no version info currently saved in 
store-store. loadVersion returns 1.0; It’ll throw inCompatible exception, even 
though it should not.
{quote}
This looks to be a valid concern, but I noticed similar logic also exists in 
LeveldbTimelineStore. We need to be consistent on this logic. 

{quote}
default path for state store is the same as time-line store for application 
data. If apps posts massive data in store, will that also affect system data 
seek performance ?
{quote}
The two leveldb stores are working on different leveldb files, so I think it's 
fine. 



[jira] [Updated] (YARN-2900) Application (Attempt and Container) Not Found in AHS results in Internal Server Error (500)

2014-12-04 Thread Mit Desai (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mit Desai updated YARN-2900:

Attachment: YARN-2900.patch

Refining the patch

> Application (Attempt and Container) Not Found in AHS results in Internal 
> Server Error (500)
> ---
>
> Key: YARN-2900
> URL: https://issues.apache.org/jira/browse/YARN-2900
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Jonathan Eagles
>Assignee: Mit Desai
> Attachments: YARN-2900.patch, YARN-2900.patch, YARN-2900.patch, 
> YARN-2900.patch, YARN-2900.patch
>
>
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.convertToApplicationReport(ApplicationHistoryManagerImpl.java:128)
>   at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.getApplication(ApplicationHistoryManagerImpl.java:118)
>   at 
> org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:222)
>   at 
> org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:219)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679)
>   at 
> org.apache.hadoop.yarn.server.webapp.WebServices.getApp(WebServices.java:218)
>   ... 59 more





[jira] [Commented] (YARN-2914) Potential race condition in SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()

2014-12-04 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234836#comment-14234836
 ] 

Sangjin Lee commented on YARN-2914:
---

LGTM (non-binding). Thanks [~varun_saxena]!

> Potential race condition in 
> SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()
> 
>
> Key: YARN-2914
> URL: https://issues.apache.org/jira/browse/YARN-2914
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Ted Yu
>Assignee: Varun Saxena
>Priority: Minor
> Fix For: 2.7.0
>
> Attachments: YARN-2914.002.patch, YARN-2914.patch
>
>
> {code}
>   public static ClientSCMMetrics getInstance() {
>     ClientSCMMetrics topMetrics = Singleton.INSTANCE.impl;
>     if (topMetrics == null) {
>       throw new IllegalStateException(
> {code}
> getInstance() doesn't hold the lock on Singleton.this.
> This may result in IllegalStateException being thrown prematurely.
> [~ctrezzo] reported that SharedCacheUploaderMetrics also has the same kind of 
> race condition.
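One common way to close this kind of race is to read the instance under the same lock that guards its initialization. The sketch below shows that idea only; it is not necessarily what the attached patch does, and `SketchMetrics`, `initSingleton` and `name` are hypothetical names.

```java
// Hypothetical sketch: initialization and lookup share the class lock.
final class SketchMetrics {
  private static SketchMetrics instance; // guarded by SketchMetrics.class

  private final String name;

  private SketchMetrics(String name) {
    this.name = name;
  }

  /** Initializes the singleton once; later calls are no-ops. */
  static synchronized void initSingleton(String name) {
    if (instance == null) {
      instance = new SketchMetrics(name);
    }
  }

  /** Holds the same lock as initSingleton, so it cannot observe a half-finished init. */
  static synchronized SketchMetrics getInstance() {
    if (instance == null) {
      throw new IllegalStateException("Metrics have not been initialized yet");
    }
    return instance;
  }

  String name() {
    return name;
  }
}
```

Because both methods synchronize on the class, a caller of `getInstance()` either runs before initialization has started or after it has completed.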





[jira] [Commented] (YARN-2914) Potential race condition in SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()

2014-12-04 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234810#comment-14234810
 ] 

Chris Trezzo commented on YARN-2914:


+1 Looks good to me as well. Thanks!



[jira] [Commented] (YARN-2837) Timeline server needs to recover the timeline DT when restarting

2014-12-04 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234801#comment-14234801
 ] 

Jian He commented on YARN-2837:
---

Thanks Zhijie and Li, some comments on my side:

- Logic in {{Version loadedVersion = loadVersion();}} Consider this scenario: 
CURRENT_VERSION_INFO = 2.0; there’s no version info currently saved in 
store-store. {{loadVersion}} returns 1.0; It’ll throw inCompatible exception, 
even though it should not. 
- We can probably use protobuf to incorporate both the tokenIdentifier and the 
renewDate to support better compatibility, e.g. RMDelegationTokenIdentifierData
{code}
  renewDate = in.readLong();
  tokenId.readFields(in);
{code}
- {{LeveldbTimelineStateStore#updateToken}} always adds a new token while the 
old token still remains; we should remove the old token.
- {{AbstractDelegationTokenSecretManager#delegationTokenSequenceNumber}} is not 
updated on recovery; the implementation seems to use sequenceNumber as the key, 
so we need to keep track of the latest sequenceNumber so that it can be recovered.
- The following looks OK; a simpler way might be to just concatenate the two 
strings.
{code}
.add(TOKEN_MASTER_KEY_ENTRY_PREFIX).add(Integer.toString(keyId)){code}
- The following log is present in both 
{{TimelineDelegationTokenSecretManager#storeNewMasterKey}} and the underlying 
state-store implementation; printing it in one place is enough. The same applies 
to other operations.
{code}
  if (LOG.isDebugEnabled()) {
    LOG.debug("Storing master key " + key.getKeyId());
  }
{code}
- FILENAME->DB_NAME; leveldb-state-store.ldb -> timeline-state-store.ldb
- The default path for the state store is the same as the timeline store for 
application data. If apps post massive data to the store, will that also affect 
system data seek performance? If so, we should have a different store path from 
the one for apps. Or we could force the user to configure the path properly and 
throw an exception otherwise.



[jira] [Commented] (YARN-2914) Potential race condition in SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()

2014-12-04 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234765#comment-14234765
 ] 

Tsuyoshi OZAWA commented on YARN-2914:
--

LGTM(non-binding).



[jira] [Commented] (YARN-2437) start-yarn.sh/stop-yarn should give info

2014-12-04 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234760#comment-14234760
 ] 

Varun Saxena commented on YARN-2437:


I meant read it in start-yarn.sh

> start-yarn.sh/stop-yarn should give info
> 
>
> Key: YARN-2437
> URL: https://issues.apache.org/jira/browse/YARN-2437
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scripts
>Reporter: Allen Wittenauer
>Assignee: Varun Saxena
>  Labels: newbie
> Fix For: 2.7.0
>
> Attachments: YARN-2437.001.patch, YARN-2437.patch
>
>
> With the merger and cleanup of the daemon launch code, yarn-daemons.sh no 
> longer prints "Starting" information.  This should be made more of an analog 
> of start-dfs.sh/stop-dfs.sh.





[jira] [Commented] (YARN-2914) Potential race condition in SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()

2014-12-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234754#comment-14234754
 ] 

Hadoop QA commented on YARN-2914:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12685124/YARN-2914.002.patch
  against trunk revision 26d8dec.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5995//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5995//console

This message is automatically generated.



[jira] [Commented] (YARN-2437) start-yarn.sh/stop-yarn should give info

2014-12-04 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234750#comment-14234750
 ] 

Varun Saxena commented on YARN-2437:


We can, however, read the ${CONFIG_DIR}/slaves file and get a list of slaves. 
Currently the slaves file is read (contents not printed) in the 
hadoop_connect_to_hosts function in hadoop-functions.sh.
We can read it in yarn-daemons.sh too, if the file exists, to provide hosts info.



[jira] [Commented] (YARN-2914) Potential race condition in SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()

2014-12-04 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234734#comment-14234734
 ] 

Ted Yu commented on YARN-2914:
--

lgtm

I triggered a QA run manually.



[jira] [Commented] (YARN-2437) start-yarn.sh/stop-yarn should give info

2014-12-04 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234736#comment-14234736
 ] 

Varun Saxena commented on YARN-2437:


[~aw], what I meant in my earlier comment was that "hdfs getconf" is not 
supported for YARN. hdfs getconf essentially reads hdfs-site.xml and gets a list 
of namenodes using the command below:
{code}
hdfs getconf -namenodes
{code}

In fact, even "hdfs getconf -datanodes" is not supported.
So in start-dfs.sh, we can list all the namenodes which are being started but 
can't do the same for datanodes. So you will find only a "Starting datanodes" 
message when datanodes are started.

For YARN, as there is no such command, we can't get hosts info, and hence can 
only print "Starting resourcemanager" or "Starting nodemanagers".
Even in 2.4, we only printed "starting resourcemanager, logging to x".
There is a JIRA open, though, for implementing a "yarn getconf" command.





[jira] [Commented] (YARN-2437) start-yarn.sh/stop-yarn should give info

2014-12-04 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234706#comment-14234706
 ] 

Allen Wittenauer commented on YARN-2437:


Yup. Go ahead and use 'hdfs getconf' to get the info.



[jira] [Commented] (YARN-2437) start-yarn.sh/stop-yarn should give info

2014-12-04 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234700#comment-14234700
 ] 

Varun Saxena commented on YARN-2437:


[~aw], do you mean why can't we use the same logic that is present in 
start-dfs.sh to print the list of node managers to be started?



[jira] [Updated] (YARN-2189) Admin service for cache manager

2014-12-04 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated YARN-2189:
---
Attachment: YARN-2189-trunk-v7.patch

[~kasha] V7 attached. V6 to V7 diff: 
https://github.com/ctrezzo/hadoop/commit/ef90864334a3b0569ccfbc3a1d3783afd7ea89f4

I cleaned up the imports in SCMAdmin. I started to make the more generic audit 
logger, but all methods in RMAuditLogger are static! This makes it difficult to 
create an interface or abstract class that can be shared across servers. I 
looked at making a more generic verify access method that would be shared, but 
the utility is lost because of the inability to share the logger. I could make 
an RMAuditLogger without static methods, but that seems like something that 
probably deserves its own jira. Let me know what you think. I could be missing 
something completely.
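The difficulty described here can be illustrated with a small sketch: static methods cannot be declared on an interface for different servers to implement, whereas instance methods can. Everything below (`AuditLogger`, `InMemoryAuditLogger`, `AccessVerifier.verifyAccess`) is hypothetical and only shows the shape of the idea; it is not the RMAuditLogger API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical interface: instance methods, so each server can supply its own logger.
interface AuditLogger {
  void logSuccess(String user, String operation, String target);

  void logFailure(String user, String operation, String target, String reason);
}

// Illustrative implementation that records lines in memory instead of writing a log file.
final class InMemoryAuditLogger implements AuditLogger {
  final List<String> lines = new ArrayList<>();

  @Override
  public void logSuccess(String user, String operation, String target) {
    lines.add("SUCCESS user=" + user + " op=" + operation + " target=" + target);
  }

  @Override
  public void logFailure(String user, String operation, String target, String reason) {
    lines.add("FAILURE user=" + user + " op=" + operation + " target=" + target
        + " reason=" + reason);
  }
}

// A shared access check becomes possible once the logger is passed in as an object.
final class AccessVerifier {
  static boolean verifyAccess(AuditLogger log, Set<String> admins, String user,
      String operation, String target) {
    if (admins.contains(user)) {
      log.logSuccess(user, operation, target);
      return true;
    }
    log.logFailure(user, operation, target, "Unauthorized user");
    return false;
  }
}
```

In this shape, each server would pass its own `AuditLogger` implementation to the shared check instead of calling static methods on a server-specific class.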

> Admin service for cache manager
> ---
>
> Key: YARN-2189
> URL: https://issues.apache.org/jira/browse/YARN-2189
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
> Attachments: YARN-2189-trunk-v1.patch, YARN-2189-trunk-v2.patch, 
> YARN-2189-trunk-v3.patch, YARN-2189-trunk-v4.patch, YARN-2189-trunk-v5.patch, 
> YARN-2189-trunk-v6.patch, YARN-2189-trunk-v7.patch
>
>
> Implement the admin service for the shared cache manager. This service is 
> responsible for handling administrative commands such as manually running a 
> cleaner task.





[jira] [Commented] (YARN-2301) Improve yarn container command

2014-12-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234618#comment-14234618
 ] 

Hudson commented on YARN-2301:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6649 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6649/])
YARN-2301. Improved yarn container command. Contributed by Naganarasimha G R 
(jianhe: rev 258623ff8bb1a1057ae3501d4f20982d5a59ea34)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContext.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmcontainer/TestRMContainerImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestYarnCLI.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/ApplicationCLI.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmcontainer/RMContainerImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContextImpl.java
* hadoop-yarn-project/CHANGES.txt


> Improve yarn container command
> --
>
> Key: YARN-2301
> URL: https://issues.apache.org/jira/browse/YARN-2301
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.6.0
>Reporter: Jian He
>Assignee: Naganarasimha G R
>  Labels: usability
> Fix For: 2.7.0
>
> Attachments: YARN-2301.01.patch, YARN-2301.03.patch, 
> YARN-2301.20141120-1.patch, YARN-2301.20141203-1.patch, 
> YARN-2301.20141204-1.patch, YARN-2303.patch
>
>
> While running yarn container -list  command, some 
> observations:
> 1) the scheme (e.g. http/https  ) before LOG-URL is missing
> 2) the start-time is printed as milliseconds (e.g. 1405540544844). Better to 
> print it in a time format.
> 3) finish-time is 0 if the container is not yet finished. It could be "N/A".
> 4) May have an option to run as yarn container -list  OR  yarn 
> application -list-containers  also.  
> As attempt Id is not shown on console, this is easier for user to just copy 
> the appId and run it, may  also be useful for container-preserving AM 
> restart. 
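Observations 2 and 3 could be handled with a small formatting helper along these lines. This is only a sketch: the `ContainerTimes` class, the date pattern, and the use of UTC are assumptions for illustration, not what the committed patch does.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper for printing container start and finish times.
final class ContainerTimes {

  /** Formats epoch milliseconds as a date; a non-positive value means "not set yet". */
  static String format(long epochMillis) {
    if (epochMillis <= 0) {
      return "N/A";
    }
    // SimpleDateFormat is not thread-safe, so a new instance is created per call.
    SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss 'UTC'");
    fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
    return fmt.format(new Date(epochMillis));
  }
}
```

A finish-time of 0 is rendered as "N/A", and a start-time such as 1405540544844 is rendered as a readable timestamp rather than a raw number.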





[jira] [Commented] (YARN-2920) CapacityScheduler should be notified when labels on nodes changed

2014-12-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234608#comment-14234608
 ] 

Hadoop QA commented on YARN-2920:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12685013/YARN-2920.1.patch
  against trunk revision 1bbcc3d.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 2 
release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  org.apache.hadoop.yarn.server.resourcemanager.TestClientRMTokens
  org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart
  org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerQueueACLs
  org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerQueueACLs
  org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler
  org.apache.hadoop.yarn.server.resourcemanager.TestApplicationACLs
  org.apache.hadoop.yarn.server.resourcemanager.scheduler.TestAbstractYarnScheduler
  org.apache.hadoop.yarn.server.resourcemanager.TestContainerResourceUsage
  org.apache.hadoop.yarn.server.resourcemanager.TestMoveApplication
  org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCSQueueUtils
  org.apache.hadoop.yarn.server.resourcemanager.TestRM
  org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler
  org.apache.hadoop.yarn.server.resourcemanager.TestClientRMService

  The following test timeouts occurred in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

org.apache.hadoop.yarn.server.resourcemanager.security.TestClientToAMTokens
org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5991//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/5991//artifact/patchprocess/patchReleaseAuditProblems.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5991//console

This message is automatically generated.

> CapacityScheduler should be notified when labels on nodes changed
> -
>
> Key: YARN-2920
> URL: https://issues.apache.org/jira/browse/YARN-2920
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-2920.1.patch
>
>
> Currently, label changes on nodes are only handled by 
> RMNodeLabelsManager, but that is not enough when labels on nodes change:
> - The scheduler should be able to take actions on running containers (like 
> kill/preempt/do-nothing).
> - Used / available capacity in the scheduler should be updated for future 
> planning.
> We need to add a new event to pass such updates to the scheduler.
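A rough sketch of what such an event and its handling might look like follows. All names here (`NodeLabelsUpdatedEvent`, `SketchScheduler`, `labelsOf`) are hypothetical and chosen for illustration; the attached patch may define the event quite differently.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical event carrying the new label set for each changed node.
final class NodeLabelsUpdatedEvent {
  private final Map<String, Set<String>> labelsByNode;

  NodeLabelsUpdatedEvent(Map<String, Set<String>> labelsByNode) {
    // Copy defensively so later changes by the sender do not affect the event.
    this.labelsByNode = Collections.unmodifiableMap(new HashMap<>(labelsByNode));
  }

  Map<String, Set<String>> getLabelsByNode() {
    return labelsByNode;
  }
}

// Hypothetical scheduler side: records the new labels when the event arrives.
final class SketchScheduler {
  private final Map<String, Set<String>> currentLabels = new HashMap<>();

  void handle(NodeLabelsUpdatedEvent event) {
    currentLabels.putAll(event.getLabelsByNode());
    // A real scheduler would also recompute used/available capacity here and
    // decide what to do with running containers (kill, preempt, or nothing).
  }

  Set<String> labelsOf(String node) {
    Set<String> labels = currentLabels.get(node);
    return labels == null ? Collections.<String>emptySet() : labels;
  }
}
```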





[jira] [Commented] (YARN-2301) Improve yarn container command

2014-12-04 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234605#comment-14234605
 ] 

Jian He commented on YARN-2301:
---

Hi [~Naganarasimha], 
the patch looks good. Thanks for updating!

Reviewers sometimes cancel the patch so that it can be updated. You can just 
re-submit the patch once you upload a new one.



[jira] [Commented] (YARN-2437) start-yarn.sh/stop-yarn should give info

2014-12-04 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234594#comment-14234594
 ] 

Allen Wittenauer commented on YARN-2437:


Why can't you just use the HDFS one?

> start-yarn.sh/stop-yarn should give info
> 
>
> Key: YARN-2437
> URL: https://issues.apache.org/jira/browse/YARN-2437
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scripts
>Reporter: Allen Wittenauer
>Assignee: Varun Saxena
>  Labels: newbie
> Fix For: 2.7.0
>
> Attachments: YARN-2437.001.patch, YARN-2437.patch
>
>
> With the merger and cleanup of the daemon launch code, yarn-daemons.sh no 
> longer prints "Starting" information.  This should be made more of an analog 
> of start-dfs.sh/stop-dfs.sh.





[jira] [Commented] (YARN-2301) Improve yarn container command

2014-12-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234583#comment-14234583
 ] 

Hadoop QA commented on YARN-2301:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12685097/YARN-2301.20141204-1.patch
  against trunk revision 1bbcc3d.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5994//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5994//console

This message is automatically generated.

> Improve yarn container command
> --
>
> Key: YARN-2301
> URL: https://issues.apache.org/jira/browse/YARN-2301
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.6.0
>Reporter: Jian He
>Assignee: Naganarasimha G R
>  Labels: usability
> Attachments: YARN-2301.01.patch, YARN-2301.03.patch, 
> YARN-2301.20141120-1.patch, YARN-2301.20141203-1.patch, 
> YARN-2301.20141204-1.patch, YARN-2303.patch
>
>
> While running the yarn container -list  command, some 
> observations:
> 1) the scheme (e.g. http/https) before LOG-URL is missing
> 2) the start-time is printed in milliseconds (e.g. 1405540544844). It would be 
> better to print it in a time format.
> 3) finish-time is 0 if the container is not yet finished. It could be "N/A".
> 4) There could be an option to run as yarn container -list  OR  yarn 
> application -list-containers  also.  
> As the attempt Id is not shown on the console, it is easier for the user to just 
> copy the appId and run it; this may also be useful for container-preserving AM 
> restart. 





[jira] [Commented] (YARN-2800) Remove MemoryNodeLabelsStore and add a way to enable/disable node labels feature

2014-12-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234578#comment-14234578
 ] 

Hadoop QA commented on YARN-2800:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12685015/YARN-2800-20141203-1.patch
  against trunk revision 1bbcc3d.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 10 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 1226 javac 
compiler warnings (more than the trunk's current 1221 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShellWithNodeLabels
  
org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell
  org.apache.hadoop.yarn.client.api.impl.TestTimelineClient
  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestAllocationFileLoaderService
  org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart

  The following test timeouts occurred in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesAppsModification

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5993//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/5993//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5993//console

This message is automatically generated.

> Remove MemoryNodeLabelsStore and add a way to enable/disable node labels 
> feature
> 
>
> Key: YARN-2800
> URL: https://issues.apache.org/jira/browse/YARN-2800
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: client, resourcemanager
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-2800-20141102-1.patch, YARN-2800-20141102-2.patch, 
> YARN-2800-20141118-1.patch, YARN-2800-20141118-2.patch, 
> YARN-2800-20141119-1.patch, YARN-2800-20141203-1.patch
>
>
> In the past, we had a MemoryNodeLabelStore, mostly so that users could try this 
> feature without configuring where to store node labels on the file system. It 
> seems convenient for trying the feature, but it actually causes a bad user 
> experience. A user may add/remove labels and edit capacity-scheduler.xml. 
> After an RM restart, the labels will be gone (we store them in memory), and the 
> RM cannot start if some queue uses labels that don't exist in the 
> cluster.
> As we discussed, we should have an explicit way to let the user specify whether 
> he/she wants this feature or not. If node labels are disabled, any operation 
> trying to modify/use node labels will throw an exception.
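
The enable/disable behaviour described above could look roughly like the sketch
below. It is an assumption-laden illustration, not the YARN-2800 patch: the class
NodeLabelsSketch, its methods, and the choice of IllegalStateException are all
hypothetical.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; names are illustrative and not taken from the patch.
final class NodeLabelsSketch {
    private final boolean enabled;
    private final Set<String> labels = new HashSet<>();

    NodeLabelsSketch(boolean enabled) {
        this.enabled = enabled;
    }

    // Every operation that modifies or uses labels goes through this guard.
    private void checkEnabled() {
        if (!enabled) {
            throw new IllegalStateException("Node labels are disabled");
        }
    }

    void addLabel(String label) {
        checkEnabled();
        labels.add(label);
    }

    Set<String> getLabels() {
        checkEnabled();
        return Collections.unmodifiableSet(labels);
    }
}
```

Putting the check in a single guard method keeps the "disabled" behaviour uniform
across all label operations.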





[jira] [Commented] (YARN-2869) CapacityScheduler should trim sub queue names when parse configuration

2014-12-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234577#comment-14234577
 ] 

Hadoop QA commented on YARN-2869:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12685021/YARN-2869-3.patch
  against trunk revision 1bbcc3d.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 10 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesFairScheduler
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesAppsModification
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesCapacitySched
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServices
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodeLabels
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokens
  org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart
  
org.apache.hadoop.yarn.server.resourcemanager.TestAMAuthorization

  The following test timeouts occurred in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

org.apache.hadoop.yarn.applications.distributedshell.TestDistrTests
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesAppsModification

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5992//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5992//console

This message is automatically generated.

> CapacityScheduler should trim sub queue names when parse configuration
> --
>
> Key: YARN-2869
> URL: https://issues.apache.org/jira/browse/YARN-2869
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler, resourcemanager
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-2869-1.patch, YARN-2869-2.patch, YARN-2869-3.patch
>
>
> Currently, capacity scheduler doesn't trim sub queue name when parsing queue 
> names, for example, the configuration
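> {code}
> <configuration>
>  <property>
>  <name>...root.queues</name>
>  <value> a, b  , c</value>
>  </property>
>  <property>
>  <name>...root.b.capacity</name>
>  <value>100</value>
>  </property>
>  ...
> </configuration>
> {code}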
> Will fail with error: 
> {code}
> java.lang.IllegalArgumentException: Illegal capacity of -1.0 for queue root. 
> a 
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.getCapacity(CapacitySchedulerConfiguration.java:332)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.getCapacityFromConf(LeafQueue.java:196)
> 
> {code}
> It will try to find queues named " a", " b  ", and " c", which is 
> clearly wrong; we should trim these sub queue names.
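
The trimming described above can be illustrated with a small, self-contained
sketch. It is not the CapacityScheduler code: QueueNames and parse are
hypothetical names, and skipping empty entries is an extra assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not the actual CapacitySchedulerConfiguration logic.
final class QueueNames {
    // Splits a comma-separated queue list and trims each name, so that
    // " a, b  , c" yields "a", "b", "c" instead of " a", " b  ", " c".
    static List<String> parse(String raw) {
        List<String> names = new ArrayList<>();
        for (String part : raw.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) { // assumption: empty entries are ignored
                names.add(name);
            }
        }
        return names;
    }
}
```

With trimmed names, a lookup such as ...root.b.capacity matches the configured
key instead of failing on a name with surrounding spaces.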





[jira] [Updated] (YARN-2213) Change proxy-user cookie log in AmIpFilter to DEBUG

2014-12-04 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-2213:
---
Attachment: YARN-2213.patch

> Change proxy-user cookie log in AmIpFilter to DEBUG
> ---
>
> Key: YARN-2213
> URL: https://issues.apache.org/jira/browse/YARN-2213
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Ted Yu
>Assignee: Varun Saxena
>Priority: Minor
> Fix For: 2.7.0
>
> Attachments: YARN-2213.patch
>
>
> I saw a lot of the following lines in AppMaster log:
> {code}
> 14/06/24 17:12:36 WARN web.SliderAmIpFilter: Could not find proxy-user 
> cookie, so user will not be set
> 14/06/24 17:12:39 WARN web.SliderAmIpFilter: Could not find proxy-user 
> cookie, so user will not be set
> 14/06/24 17:12:39 WARN web.SliderAmIpFilter: Could not find proxy-user 
> cookie, so user will not be set
> {code}
> For long running app, this would consume considerable log space.
> Log level should be changed to DEBUG.





[jira] [Assigned] (YARN-2213) Change proxy-user cookie log in AmIpFilter to DEBUG

2014-12-04 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena reassigned YARN-2213:
--

Assignee: Varun Saxena

> Change proxy-user cookie log in AmIpFilter to DEBUG
> ---
>
> Key: YARN-2213
> URL: https://issues.apache.org/jira/browse/YARN-2213
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Ted Yu
>Assignee: Varun Saxena
>Priority: Minor
> Fix For: 2.7.0
>
> Attachments: YARN-2213.patch
>
>
> I saw a lot of the following lines in AppMaster log:
> {code}
> 14/06/24 17:12:36 WARN web.SliderAmIpFilter: Could not find proxy-user 
> cookie, so user will not be set
> 14/06/24 17:12:39 WARN web.SliderAmIpFilter: Could not find proxy-user 
> cookie, so user will not be set
> 14/06/24 17:12:39 WARN web.SliderAmIpFilter: Could not find proxy-user 
> cookie, so user will not be set
> {code}
> For long running app, this would consume considerable log space.
> Log level should be changed to DEBUG.





[jira] [Updated] (YARN-2762) RMAdminCLI node-labels-related args should be trimmed and checked before sending to RM

2014-12-04 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-2762:
-
Attachment: YARN-2762.4.patch

Thanks for the quick review. I updated the patch; please review.

> RMAdminCLI node-labels-related args should be trimmed and checked before 
> sending to RM
> --
>
> Key: YARN-2762
> URL: https://issues.apache.org/jira/browse/YARN-2762
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Rohith
>Assignee: Rohith
>Priority: Minor
> Attachments: YARN-2762.1.patch, YARN-2762.2.patch, YARN-2762.2.patch, 
> YARN-2762.3.patch, YARN-2762.4.patch, YARN-2762.patch
>
>
> All NodeLabel args validations are done at the server side. The same can be done 
> at RMAdminCLI so that unnecessary RPC calls can be avoided.
> And for input such as "x,y,,z,", there is no need to add empty strings; they can 
> be skipped.
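
A client-side check of this kind might look like the sketch below. It is only an
illustration under stated assumptions: LabelArgs, parse, and the error message are
hypothetical and are not taken from RMAdminCLI or from any YARN-2762 patch.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical client-side validation; not the actual RMAdminCLI code.
final class LabelArgs {
    // Trims each label, skips empty entries (so "x,y,,z," yields x, y, z),
    // and rejects input with no labels before any RPC would be made.
    static Set<String> parse(String args) {
        Set<String> labels = new LinkedHashSet<>();
        if (args != null) {
            for (String part : args.split(",")) {
                String label = part.trim();
                if (!label.isEmpty()) {
                    labels.add(label);
                }
            }
        }
        if (labels.isEmpty()) {
            // The message text is an assumption made for this sketch.
            throw new IllegalArgumentException("No node labels specified");
        }
        return labels;
    }
}
```

Failing fast on the client means an obviously invalid argument never reaches the
RM.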





[jira] [Commented] (YARN-2921) MockRM#waitForState methods can be too slow and flaky

2014-12-04 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234540#comment-14234540
 ] 

Tsuyoshi OZAWA commented on YARN-2921:
--

Thanks for your suggestion, Karthik. I agree that we should use a 
wait-by-loop approach in this case. I'll update it in the next patch.

{quote}
Other than the smaller sleep, we should also handle the case where the App or 
AppAttempt enters the required state and then moves to a later state. e.g. App 
moving to RUNNING state when we are waiting for it to get ACCEPTED.
{quote}

I spent some time on how to implement it. How about using a variadic 
argument for waitFor()? This will work without changing existing code.

{code}
  public void waitForState(RMAppAttemptState... finalStatesArray)
      throws Exception {
    EnumSet<RMAppAttemptState> finalStates
        = EnumSet.copyOf(Arrays.asList(finalStatesArray));
    while (!finalStates.contains(attempt.getAppAttemptState()) && !isTimeout()) {
      // wait briefly before re-checking the state
    }
  }
{code}
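
For illustration, the wait-by-loop idea with several acceptable states can be
written as a self-contained sketch. This is not MockRM code: StateWaiter, the
Supplier-based state source, and the timeout and poll parameters are assumptions
made for the example.

```java
import java.util.EnumSet;
import java.util.function.Supplier;

// Hypothetical, generic version of the polling loop; not the MockRM code.
final class StateWaiter {
    // Polls the current state until it is one of the accepted states or the
    // timeout expires, and returns the last state that was observed.
    @SafeVarargs
    static <E extends Enum<E>> E waitForState(Supplier<E> current,
            long timeoutMs, long pollMs, E first, E... rest) {
        EnumSet<E> accepted = EnumSet.of(first, rest);
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        E state = current.get();
        while (!accepted.contains(state) && System.nanoTime() < deadline) {
            try {
                Thread.sleep(pollMs); // short sleep keeps tests fast
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            state = current.get();
        }
        return state;
    }
}
```

Passing both the awaited state and its successors (for example ACCEPTED and
RUNNING) covers the case where the state advances between two polls. The caller
still has to check the returned state, because the loop also ends on timeout.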

> MockRM#waitForState methods can be too slow and flaky
> -
>
> Key: YARN-2921
> URL: https://issues.apache.org/jira/browse/YARN-2921
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.6.0
>Reporter: Karthik Kambatla
>Assignee: Tsuyoshi OZAWA
> Attachments: YARN-2921.001.patch
>
>
> MockRM#waitForState methods currently sleep for too long (2 seconds and 1 
> second). This leads to slow tests and sometimes failures if the 
> App/AppAttempt moves to another state. 





[jira] [Commented] (YARN-2762) RMAdminCLI node-labels-related args should be trimmed and checked before sending to RM

2014-12-04 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234526#comment-14234526
 ] 

Wangda Tan commented on YARN-2762:
--

[~rohithsharma],
Jenkins seems to have some issues: I uploaded several patches yesterday, 
but none of them got run.

One comment: could you please merge the repeated error message into a final field 
of RMAdminCLI, like 
{code}
   static final String NO_LABEL = "xx";
{code}

Thanks,

> RMAdminCLI node-labels-related args should be trimmed and checked before 
> sending to RM
> --
>
> Key: YARN-2762
> URL: https://issues.apache.org/jira/browse/YARN-2762
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Rohith
>Assignee: Rohith
>Priority: Minor
> Attachments: YARN-2762.1.patch, YARN-2762.2.patch, YARN-2762.2.patch, 
> YARN-2762.3.patch, YARN-2762.patch
>
>
> All NodeLabel args validations are done at the server side. The same can be done 
> at RMAdminCLI so that unnecessary RPC calls can be avoided.
> And for input such as "x,y,,z,", there is no need to add empty strings; they can 
> be skipped.





[jira] [Commented] (YARN-2762) RMAdminCLI node-labels-related args should be trimmed and checked before sending to RM

2014-12-04 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234521#comment-14234521
 ] 

Rohith commented on YARN-2762:
--

I uploaded the correct patch; kindly review.
The only difference from the previous patch is the log messages.

> RMAdminCLI node-labels-related args should be trimmed and checked before 
> sending to RM
> --
>
> Key: YARN-2762
> URL: https://issues.apache.org/jira/browse/YARN-2762
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Rohith
>Assignee: Rohith
>Priority: Minor
> Attachments: YARN-2762.1.patch, YARN-2762.2.patch, YARN-2762.2.patch, 
> YARN-2762.3.patch, YARN-2762.patch
>
>
> All NodeLabel args validations are done at the server side. The same can be done 
> at RMAdminCLI so that unnecessary RPC calls can be avoided.
> And for input such as "x,y,,z,", there is no need to add empty strings; they can 
> be skipped.





[jira] [Updated] (YARN-2924) Node to labels mapping should not transfer to lowercase when adding from RMAdminCLI

2014-12-04 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-2924:
-
Attachment: YARN-2924.1.patch

Attached a fix for this; without the fix, the modified test will fail.

> Node to labels mapping should not transfer to lowercase when adding from 
> RMAdminCLI
> ---
>
> Key: YARN-2924
> URL: https://issues.apache.org/jira/browse/YARN-2924
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: client
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-2924.1.patch
>
>
> In the existing implementation, when parsing the node-to-labels mapping from 
> RMAdminCLI, all labels are converted to lowercase:
> {code}
>   for (int i = 1; i < splits.length; i++) {
>     if (!splits[i].trim().isEmpty()) {
>       map.get(nodeId).add(splits[i].trim().toLowerCase());
>     }
>   }
> {code}
> That is not correct; we should fix that.





[jira] [Updated] (YARN-2762) RMAdminCLI node-labels-related args should be trimmed and checked before sending to RM

2014-12-04 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-2762:
-
Attachment: YARN-2762.3.patch

Apologies, and thanks Wangda for spotting the wrong patch. I was still wondering 
why QA did not run even though it had been a long time since I attached the patch!

> RMAdminCLI node-labels-related args should be trimmed and checked before 
> sending to RM
> --
>
> Key: YARN-2762
> URL: https://issues.apache.org/jira/browse/YARN-2762
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Rohith
>Assignee: Rohith
>Priority: Minor
> Attachments: YARN-2762.1.patch, YARN-2762.2.patch, YARN-2762.2.patch, 
> YARN-2762.3.patch, YARN-2762.patch
>
>
> All NodeLabel args validations are done at the server side. The same can be done 
> at RMAdminCLI so that unnecessary RPC calls can be avoided.
> And for input such as "x,y,,z,", there is no need to add empty strings; they can 
> be skipped.





[jira] [Created] (YARN-2924) Node to labels mapping should not transfer to lowercase when adding from RMAdminCLI

2014-12-04 Thread Wangda Tan (JIRA)
Wangda Tan created YARN-2924:


 Summary: Node to labels mapping should not transfer to lowercase 
when adding from RMAdminCLI
 Key: YARN-2924
 URL: https://issues.apache.org/jira/browse/YARN-2924
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: client
Reporter: Wangda Tan
Assignee: Wangda Tan


In the existing implementation, when parsing the node-to-labels mapping from 
RMAdminCLI, all labels are converted to lowercase:

{code}
  for (int i = 1; i < splits.length; i++) {
    if (!splits[i].trim().isEmpty()) {
      map.get(nodeId).add(splits[i].trim().toLowerCase());
    }
  }
{code}

That is not correct; we should fix that.
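
A case-preserving variant of the parsing can be sketched as follows. This is not
the YARN-2924 patch: the class NodeToLabels, the use of plain strings as node
keys, and the assumed input format (entries separated by whitespace, each of the
form host:port,label1,label2) are illustrative assumptions.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical parser; node ids are plain strings here for simplicity.
final class NodeToLabels {
    // Assumed input format: "host1:port,labelA,labelB host2:port,labelC".
    static Map<String, Set<String>> parse(String args) {
        Map<String, Set<String>> map = new HashMap<>();
        for (String entry : args.trim().split("\\s+")) {
            if (entry.isEmpty()) {
                continue;
            }
            String[] splits = entry.split(",");
            String node = splits[0].trim();
            Set<String> labels =
                map.computeIfAbsent(node, k -> new LinkedHashSet<>());
            for (int i = 1; i < splits.length; i++) {
                String label = splits[i].trim();
                if (!label.isEmpty()) {
                    labels.add(label); // keep the original case
                }
            }
        }
        return map;
    }
}
```

The only behavioural difference from the quoted loop is that toLowerCase() is
dropped, so a label such as GPU stays distinct from gpu.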





[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)

2014-12-04 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234506#comment-14234506
 ] 

Wangda Tan commented on YARN-2495:
--

Hi [~Naganarasimha],
What I meant is to completely remove the conf-based node label provider 
implementation (not leave a not-implemented class in the patch); it should be 
fine to use a dummy node label provider in tests.

> Allow admin specify labels from each NM (Distributed configuration)
> ---
>
> Key: YARN-2495
> URL: https://issues.apache.org/jira/browse/YARN-2495
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
> YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
> YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, 
> YARN-2495.20141204-1.patch, YARN-2495_20141022.1.patch
>
>
> Target of this JIRA is to allow admin specify labels in each NM, this covers
> - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or 
> using script suggested by [~aw] (YARN-2729) )
> - NM will send labels to RM via ResourceTracker API
> - RM will set labels in NodeLabelManager when NM register/update labels





[jira] [Commented] (YARN-2437) start-yarn.sh/stop-yarn should give info

2014-12-04 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234502#comment-14234502
 ] 

Varun Saxena commented on YARN-2437:


[~aw], unlike start-dfs.sh, we can't print host information alongside the 
Starting message, because YARN does not yet support a command like hdfs getconf 
-namenodes.

> start-yarn.sh/stop-yarn should give info
> 
>
> Key: YARN-2437
> URL: https://issues.apache.org/jira/browse/YARN-2437
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scripts
>Reporter: Allen Wittenauer
>Assignee: Varun Saxena
>  Labels: newbie
> Fix For: 2.7.0
>
> Attachments: YARN-2437.001.patch, YARN-2437.patch
>
>
> With the merger and cleanup of the daemon launch code, yarn-daemons.sh no 
> longer prints "Starting" information.  This should be made more of an analog 
> of start-dfs.sh/stop-dfs.sh.





[jira] [Updated] (YARN-2437) start-yarn.sh/stop-yarn should give info

2014-12-04 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-2437:
---
Attachment: YARN-2437.001.patch

> start-yarn.sh/stop-yarn should give info
> 
>
> Key: YARN-2437
> URL: https://issues.apache.org/jira/browse/YARN-2437
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scripts
>Reporter: Allen Wittenauer
>Assignee: Varun Saxena
>  Labels: newbie
> Fix For: 2.7.0
>
> Attachments: YARN-2437.001.patch, YARN-2437.patch
>
>
> With the merger and cleanup of the daemon launch code, yarn-daemons.sh no 
> longer prints "Starting" information.  This should be made more of an analog 
> of start-dfs.sh/stop-dfs.sh.





[jira] [Commented] (YARN-2762) RMAdminCLI node-labels-related args should be trimmed and checked before sending to RM

2014-12-04 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234499#comment-14234499
 ] 

Wangda Tan commented on YARN-2762:
--

[~rohithsharma],
I thought you uploaded a wrong patch :)

bq. Do you mean validation check for variable map i.e Map<NodeId, Set<String>> 
map = new HashMap<NodeId, Set<String>>(); 
Yes 

bq. should be done not only for map.isEmpty but also for map values i.e set 
individually? If so, what if input string is host1:port,, host2:port,x?
host1:port host2,x has another meaning: it means the user wants to remove all 
labels on host1.

Thanks,

> RMAdminCLI node-labels-related args should be trimmed and checked before 
> sending to RM
> --
>
> Key: YARN-2762
> URL: https://issues.apache.org/jira/browse/YARN-2762
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Rohith
>Assignee: Rohith
>Priority: Minor
> Attachments: YARN-2762.1.patch, YARN-2762.2.patch, YARN-2762.2.patch, 
> YARN-2762.patch
>
>
> All NodeLabel args validations are done at the server side. The same can be done 
> at RMAdminCLI so that unnecessary RPC calls can be avoided.
> And for input such as "x,y,,z,", there is no need to add empty strings; they can 
> be skipped.





[jira] [Commented] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is > minimumAllocation

2014-12-04 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234475#comment-14234475
 ] 

Junping Du commented on YARN-2637:
--

Thanks [~cwelch] for replying to my comments and [~leftnoteasy] for your feedback.
bq. option 1 does have the possible issue you describe, and the issue with 
possibly starving all other queues if one queue has the am percent set higher 
than the others I mentioned above.
Agree. Option 1 could be exploited in a multi-tenant scenario, i.e. one tenant 
can ask for more AM resources to block AMs in other queues. 
Option 2 sounds reasonable, and I agree that we should make sure at least 1 AM 
gets launched and warn in this case (the percentage is set too low). Thoughts?

> maximum-am-resource-percent could be violated when resource of AM is > 
> minimumAllocation
> 
>
> Key: YARN-2637
> URL: https://issues.apache.org/jira/browse/YARN-2637
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Wangda Tan
>Assignee: Craig Welch
>Priority: Critical
> Attachments: YARN-2637.0.patch, YARN-2637.1.patch, 
> YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.2.patch, YARN-2637.6.patch, 
> YARN-2637.7.patch, YARN-2637.9.patch
>
>
> Currently, number of AM in leaf queue will be calculated in following way:
> {code}
> max_am_resource = queue_max_capacity * maximum_am_resource_percent
> #max_am_number = max_am_resource / minimum_allocation
> #max_am_number_for_each_user = #max_am_number * userlimit * userlimit_factor
> {code}
> And when submit new application to RM, it will check if an app can be 
> activated in following way:
> {code}
> for (Iterator<FiCaSchedulerApp> i = pendingApplications.iterator(); 
>      i.hasNext(); ) {
>   FiCaSchedulerApp application = i.next();
>   
>   // Check queue limit
>   if (getNumActiveApplications() >= getMaximumActiveApplications()) {
>     break;
>   }
>   
>   // Check user limit
>   User user = getUser(application.getUser());
>   if (user.getActiveApplications() < 
>       getMaximumActiveApplicationsPerUser()) {
>     user.activateApplication();
>     activeApplications.add(application);
>     i.remove();
>     LOG.info("Application " + application.getApplicationId() +
>         " from user: " + application.getUser() + 
>         " activated in queue: " + getQueueName());
>   }
> }
> {code}
> For example:
> If a queue has capacity = 1G and max_am_resource_percent = 0.2, the maximum 
> resource that AMs can use is 200M. Assuming minimum_allocation = 1M, the number 
> of AMs that can be launched is 200. If a user uses 5M for each AM 
> (> minimum_allocation), all apps can still be activated, and they will occupy 
> all of the queue's resource instead of only max_am_resource_percent of it.
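
The arithmetic in this example can be checked with a short sketch. It is
illustrative only: AmLimitExample and its methods are hypothetical, sizes are in
MB with 1G taken as 1000M to match the example, and the resource-based check is
just one possible reading of the options discussed in this thread.

```java
// Illustrative arithmetic only; names are hypothetical and do not mirror LeafQueue.
final class AmLimitExample {
    // Count-based limit from the description:
    // (queue capacity * max AM percent) / minimum allocation.
    static long maxAmCount(long queueCapacityMb, double maxAmPercent,
            long minAllocationMb) {
        long maxAmResourceMb = (long) (queueCapacityMb * maxAmPercent);
        return maxAmResourceMb / minAllocationMb;
    }

    // Resource-based check (an assumption for this sketch): admit an AM only
    // if it fits into the AM budget, but always let the first AM through.
    static boolean canActivate(long usedAmMb, long requestMb,
            long maxAmResourceMb) {
        return usedAmMb == 0 || usedAmMb + requestMb <= maxAmResourceMb;
    }
}
```

With capacity 1000M, percent 0.2 and minimum allocation 1M, maxAmCount gives 200;
200 AMs of 5M each use 1000M, the whole queue. Under the resource-based check,
only 40 such AMs (200M in total) would be activated.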





[jira] [Commented] (YARN-2043) Rename internal names to being Timeline Service instead of application history

2014-12-04 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234421#comment-14234421
 ] 

Naganarasimha G R commented on YARN-2043:
-

Need to fix some issues mentioned as part of YARN-2838.

> Rename internal names to being Timeline Service instead of application history
> --
>
> Key: YARN-2043
> URL: https://issues.apache.org/jira/browse/YARN-2043
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Naganarasimha G R
>
> Like package and class names. In line with YARN-2033, YARN-1982 etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2923) Support configuration based NodeLabelsProvider Service in Distributed Node Label Configuration Setup

2014-12-04 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-2923:

Attachment: YARN-2923.20141204-1.patch

Uploading the patch after separating the configuration-based NodeLabelProvider 
service from YARN-2495.

> Support configuration based NodeLabelsProvider Service in Distributed Node 
> Label Configuration Setup 
> -
>
> Key: YARN-2923
> URL: https://issues.apache.org/jira/browse/YARN-2923
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
> Attachments: YARN-2923.20141204-1.patch
>
>
> As part of the Distributed Node Labels configuration, we need to support node 
> labels being configured in yarn-site.xml. When the node labels configuration 
> in yarn-site.xml is modified, the NM should be able to get the modified node 
> labels from this NodeLabelsProvider service without an NM restart.





[jira] [Updated] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)

2014-12-04 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-2495:

Attachment: YARN-2495.20141204-1.patch

Hi [~wangda],
I have made the required modifications to move the conf-based node label 
provider implementation into a separate ticket.

> Allow admin specify labels from each NM (Distributed configuration)
> ---
>
> Key: YARN-2495
> URL: https://issues.apache.org/jira/browse/YARN-2495
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
> YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
> YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, 
> YARN-2495.20141204-1.patch, YARN-2495_20141022.1.patch
>
>
> Target of this JIRA is to allow admin specify labels in each NM, this covers
> - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or 
> using script suggested by [~aw] (YARN-2729) )
> - NM will send labels to RM via ResourceTracker API
> - RM will set labels in NodeLabelManager when NM register/update labels





[jira] [Commented] (YARN-2900) Application (Attempt and Container) Not Found in AHS results in Internal Server Error (500)

2014-12-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234377#comment-14234377
 ] 

Hadoop QA commented on YARN-2900:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12684991/YARN-2900.patch
  against trunk revision 9d1a8f5.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5990//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5990//console

This message is automatically generated.

> Application (Attempt and Container) Not Found in AHS results in Internal 
> Server Error (500)
> ---
>
> Key: YARN-2900
> URL: https://issues.apache.org/jira/browse/YARN-2900
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Jonathan Eagles
>Assignee: Mit Desai
> Attachments: YARN-2900.patch, YARN-2900.patch, YARN-2900.patch, 
> YARN-2900.patch
>
>
> Caused by: java.lang.NullPointerException
>   at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.convertToApplicationReport(ApplicationHistoryManagerImpl.java:128)
>   at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.getApplication(ApplicationHistoryManagerImpl.java:118)
>   at org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:222)
>   at org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:219)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679)
>   at org.apache.hadoop.yarn.server.webapp.WebServices.getApp(WebServices.java:218)
>   ... 59 more





[jira] [Commented] (YARN-2437) start-yarn.sh/stop-yarn should give info

2014-12-04 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234369#comment-14234369
 ] 

Allen Wittenauer commented on YARN-2437:


This code change should be in the start and stop scripts and not in yarn.  
Doing it in yarn means that running 'yarn resourcemanager' prints a message, 
which is not ideal for startup scripts.

> start-yarn.sh/stop-yarn should give info
> 
>
> Key: YARN-2437
> URL: https://issues.apache.org/jira/browse/YARN-2437
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scripts
>Reporter: Allen Wittenauer
>Assignee: Varun Saxena
>  Labels: newbie
> Fix For: 2.7.0
>
> Attachments: YARN-2437.patch
>
>
> With the merger and cleanup of the daemon launch code, yarn-daemons.sh no 
> longer prints "Starting" information.  This should be made more of an analog 
> of start-dfs.sh/stop-dfs.sh.





[jira] [Updated] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)

2014-12-04 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-2495:

Description: 
Target of this JIRA is to allow admin specify labels in each NM, this covers
- User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or using 
script suggested by [~aw] (YARN-2729) )
- NM will send labels to RM via ResourceTracker API
- RM will set labels in NodeLabelManager when NM register/update labels

  was:
Target of this JIRA is to allow admin specify labels in each NM, this covers
- User can set labels in each NM (by setting yarn-site.xml or using script 
suggested by [~aw])
- NM will send labels to RM via ResourceTracker API
- RM will set labels in NodeLabelManager when NM register/update labels


> Allow admin specify labels from each NM (Distributed configuration)
> ---
>
> Key: YARN-2495
> URL: https://issues.apache.org/jira/browse/YARN-2495
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
> YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
> YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, 
> YARN-2495_20141022.1.patch
>
>
> Target of this JIRA is to allow admin specify labels in each NM, this covers
> - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or 
> using script suggested by [~aw] (YARN-2729) )
> - NM will send labels to RM via ResourceTracker API
> - RM will set labels in NodeLabelManager when NM register/update labels





[jira] [Updated] (YARN-2923) Support configuration based NodeLabelsProvider Service in Distributed Node Label Configuration Setup

2014-12-04 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-2923:

Issue Type: Sub-task  (was: New Feature)
Parent: YARN-2492

> Support configuration based NodeLabelsProvider Service in Distributed Node 
> Label Configuration Setup 
> -
>
> Key: YARN-2923
> URL: https://issues.apache.org/jira/browse/YARN-2923
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>
> As part of the Distributed Node Labels configuration, we need to support node 
> labels being configured in yarn-site.xml. When the node labels configuration 
> in yarn-site.xml is modified, the NM should be able to get the modified node 
> labels from this NodeLabelsProvider service without an NM restart.





[jira] [Created] (YARN-2923) Support configuration based NodeLabelsProvider Service in Distributed Node Label Configuration Setup

2014-12-04 Thread Naganarasimha G R (JIRA)
Naganarasimha G R created YARN-2923:
---

 Summary: Support configuration based NodeLabelsProvider Service in 
Distributed Node Label Configuration Setup 
 Key: YARN-2923
 URL: https://issues.apache.org/jira/browse/YARN-2923
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: nodemanager
Reporter: Naganarasimha G R
Assignee: Naganarasimha G R


As part of the Distributed Node Labels configuration, we need to support node 
labels being configured in yarn-site.xml. When the node labels configuration 
in yarn-site.xml is modified, the NM should be able to get the modified node 
labels from this NodeLabelsProvider service without an NM restart.






[jira] [Updated] (YARN-2437) start-yarn.sh/stop-yarn should give info

2014-12-04 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-2437:
---
Attachment: YARN-2437.patch

> start-yarn.sh/stop-yarn should give info
> 
>
> Key: YARN-2437
> URL: https://issues.apache.org/jira/browse/YARN-2437
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scripts
>Reporter: Allen Wittenauer
>Assignee: Varun Saxena
>  Labels: newbie
> Fix For: 2.7.0
>
> Attachments: YARN-2437.patch
>
>
> With the merger and cleanup of the daemon launch code, yarn-daemons.sh no 
> longer prints "Starting" information.  This should be made more of an analog 
> of start-dfs.sh/stop-dfs.sh.





[jira] [Commented] (YARN-2914) Potential race condition in SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()

2014-12-04 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234313#comment-14234313
 ] 

Varun Saxena commented on YARN-2914:


[~ctrezzo], [~ozawa] and [~tedyu], kindly review. 
I have updated the code for CleanerMetrics as well. The Configuration object 
being passed was unnecessary, so I removed it.


> Potential race condition in 
> SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()
> 
>
> Key: YARN-2914
> URL: https://issues.apache.org/jira/browse/YARN-2914
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Ted Yu
>Assignee: Varun Saxena
>Priority: Minor
> Fix For: 2.7.0
>
> Attachments: YARN-2914.002.patch, YARN-2914.patch
>
>
> {code}
>   public static ClientSCMMetrics getInstance() {
> ClientSCMMetrics topMetrics = Singleton.INSTANCE.impl;
> if (topMetrics == null) {
>   throw new IllegalStateException(
> {code}
> getInstance() doesn't hold lock on Singleton.this
> This may result in IllegalStateException being thrown prematurely.
> [~ctrezzo] reported that SharedCacheUploaderMetrics also has the same kind of 
> race condition.
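One common way to avoid this kind of race in a lazily initialized singleton is the initialization-on-demand holder idiom, sketched below. This is an assumption-laden illustration and not necessarily what the attached patches do: DemoMetrics is a hypothetical stand-in for classes such as ClientSCMMetrics, and the real classes also register themselves with the metrics system, which is omitted here.

```java
// Hypothetical sketch; DemoMetrics is not a Hadoop class.
class DemoMetrics {
  private DemoMetrics() {
    // Real metrics classes would register their sources here.
  }

  // The JVM initializes Holder at most once, on first access, and
  // publishes INSTANCE safely to every thread, so getInstance() needs
  // neither a lock nor a null check.
  private static final class Holder {
    static final DemoMetrics INSTANCE = new DemoMetrics();
  }

  static DemoMetrics getInstance() {
    return Holder.INSTANCE;
  }

  public static void main(String[] args) {
    System.out.println(getInstance() == getInstance());
  }
}
```

With this shape there is no window in which a caller can observe an uninitialized instance, so the IllegalStateException path is no longer needed.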





[jira] [Commented] (YARN-2914) Potential race condition in SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()

2014-12-04 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234316#comment-14234316
 ] 

Varun Saxena commented on YARN-2914:


I also changed the issue title to reflect the fix for CleanerMetrics.

> Potential race condition in 
> SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()
> 
>
> Key: YARN-2914
> URL: https://issues.apache.org/jira/browse/YARN-2914
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Ted Yu
>Assignee: Varun Saxena
>Priority: Minor
> Fix For: 2.7.0
>
> Attachments: YARN-2914.002.patch, YARN-2914.patch
>
>
> {code}
>   public static ClientSCMMetrics getInstance() {
> ClientSCMMetrics topMetrics = Singleton.INSTANCE.impl;
> if (topMetrics == null) {
>   throw new IllegalStateException(
> {code}
> getInstance() doesn't hold lock on Singleton.this
> This may result in IllegalStateException being thrown prematurely.
> [~ctrezzo] reported that SharedCacheUploaderMetrics also has the same kind of 
> race condition.





[jira] [Updated] (YARN-2914) Potential race condition in SharedCacheUploaderMetrics/ClientSCMMetrics#getInstance()

2014-12-04 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-2914:
---
Attachment: YARN-2914.002.patch

> Potential race condition in 
> SharedCacheUploaderMetrics/ClientSCMMetrics#getInstance()
> -
>
> Key: YARN-2914
> URL: https://issues.apache.org/jira/browse/YARN-2914
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Ted Yu
>Assignee: Varun Saxena
>Priority: Minor
> Fix For: 2.7.0
>
> Attachments: YARN-2914.002.patch, YARN-2914.patch
>
>
> {code}
>   public static ClientSCMMetrics getInstance() {
> ClientSCMMetrics topMetrics = Singleton.INSTANCE.impl;
> if (topMetrics == null) {
>   throw new IllegalStateException(
> {code}
> getInstance() doesn't hold lock on Singleton.this
> This may result in IllegalStateException being thrown prematurely.
> [~ctrezzo] reported that SharedCacheUploaderMetrics also has the same kind of 
> race condition.





[jira] [Updated] (YARN-2914) Potential race condition in SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()

2014-12-04 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-2914:
---
Summary: Potential race condition in 
SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()  (was: 
Potential race condition in 
SharedCacheUploaderMetrics/ClientSCMMetrics#getInstance())

> Potential race condition in 
> SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()
> 
>
> Key: YARN-2914
> URL: https://issues.apache.org/jira/browse/YARN-2914
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Ted Yu
>Assignee: Varun Saxena
>Priority: Minor
> Fix For: 2.7.0
>
> Attachments: YARN-2914.002.patch, YARN-2914.patch
>
>
> {code}
>   public static ClientSCMMetrics getInstance() {
> ClientSCMMetrics topMetrics = Singleton.INSTANCE.impl;
> if (topMetrics == null) {
>   throw new IllegalStateException(
> {code}
> getInstance() doesn't hold lock on Singleton.this
> This may result in IllegalStateException being thrown prematurely.
> [~ctrezzo] reported that SharedCacheUploaderMetrics also has the same kind of 
> race condition.





[jira] [Updated] (YARN-2914) Potential race condition in SharedCacheUploaderMetrics/ClientSCMMetrics#getInstance()

2014-12-04 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-2914:
---
Summary: Potential race condition in 
SharedCacheUploaderMetrics/ClientSCMMetrics#getInstance()  (was: Potential race 
condition in 
SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance())

> Potential race condition in 
> SharedCacheUploaderMetrics/ClientSCMMetrics#getInstance()
> -
>
> Key: YARN-2914
> URL: https://issues.apache.org/jira/browse/YARN-2914
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Ted Yu
>Assignee: Varun Saxena
>Priority: Minor
> Fix For: 2.7.0
>
> Attachments: YARN-2914.patch
>
>
> {code}
>   public static ClientSCMMetrics getInstance() {
> ClientSCMMetrics topMetrics = Singleton.INSTANCE.impl;
> if (topMetrics == null) {
>   throw new IllegalStateException(
> {code}
> getInstance() doesn't hold lock on Singleton.this
> This may result in IllegalStateException being thrown prematurely.
> [~ctrezzo] reported that SharedCacheUploaderMetrics also has the same kind of 
> race condition.





[jira] [Updated] (YARN-2914) Potential race condition in SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()

2014-12-04 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-2914:
---
Summary: Potential race condition in 
SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()  (was: 
Potential race condition in 
SharedCacheUploaderMetrics/ClientSCMMetrics#getInstance())

> Potential race condition in 
> SharedCacheUploaderMetrics/CleanerMetrics/ClientSCMMetrics#getInstance()
> 
>
> Key: YARN-2914
> URL: https://issues.apache.org/jira/browse/YARN-2914
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Ted Yu
>Assignee: Varun Saxena
>Priority: Minor
> Fix For: 2.7.0
>
> Attachments: YARN-2914.patch
>
>
> {code}
>   public static ClientSCMMetrics getInstance() {
> ClientSCMMetrics topMetrics = Singleton.INSTANCE.impl;
> if (topMetrics == null) {
>   throw new IllegalStateException(
> {code}
> getInstance() doesn't hold lock on Singleton.this
> This may result in IllegalStateException being thrown prematurely.
> [~ctrezzo] reported that SharedCacheUploaderMetrics also has the same kind of 
> race condition.





[jira] [Commented] (YARN-2762) RMAdminCLI node-labels-related args should be trimmed and checked before sending to RM

2014-12-04 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234164#comment-14234164
 ] 

Rohith commented on YARN-2762:
--

Updated the patch to make the log messages consistent. Kindly review.

> RMAdminCLI node-labels-related args should be trimmed and checked before 
> sending to RM
> --
>
> Key: YARN-2762
> URL: https://issues.apache.org/jira/browse/YARN-2762
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Rohith
>Assignee: Rohith
>Priority: Minor
> Attachments: YARN-2762.1.patch, YARN-2762.2.patch, YARN-2762.2.patch, 
> YARN-2762.patch
>
>
> All NodeLabel argument validation is done on the server side. The same can be 
> done in RMAdminCLI so that unnecessary RPC calls can be avoided.
> Also, for input such as "x,y,,z,", empty strings need not be added and can 
> be skipped instead.
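The client-side trimming described above can be sketched as follows. This is a minimal illustration under stated assumptions: LabelArgsSketch and parseLabels are hypothetical names, not RMAdminCLI methods, and the real validation rules for label names are not reproduced.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the actual RMAdminCLI code.
class LabelArgsSketch {
  // Splits a comma-separated argument, trims each entry, and skips
  // entries that are empty after trimming.
  static List<String> parseLabels(String args) {
    List<String> labels = new ArrayList<String>();
    if (args == null) {
      return labels;
    }
    for (String part : args.split(",")) {
      String label = part.trim();
      if (!label.isEmpty()) {
        labels.add(label);
      }
    }
    return labels;
  }

  public static void main(String[] args) {
    System.out.println(parseLabels("x,y,,z,"));
  }
}
```

Doing this before building the request means an input that contains only separators or whitespace can be rejected locally without an RPC call.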





[jira] [Updated] (YARN-2762) RMAdminCLI node-labels-related args should be trimmed and checked before sending to RM

2014-12-04 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-2762:
-
Attachment: YARN-2762.2.patch

> RMAdminCLI node-labels-related args should be trimmed and checked before 
> sending to RM
> --
>
> Key: YARN-2762
> URL: https://issues.apache.org/jira/browse/YARN-2762
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Rohith
>Assignee: Rohith
>Priority: Minor
> Attachments: YARN-2762.1.patch, YARN-2762.2.patch, YARN-2762.2.patch, 
> YARN-2762.patch
>
>
> All NodeLabel argument validation is done on the server side. The same can be 
> done in RMAdminCLI so that unnecessary RPC calls can be avoided.
> Also, for input such as "x,y,,z,", empty strings need not be added and can 
> be skipped instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2301) Improve yarn container command

2014-12-04 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-2301:

Attachment: YARN-2301.20141204-1.patch

Hi [~jianhe],
I have addressed the review comments. One test case failure 
(org.apache.hadoop.yarn.server.resourcemanager.TestClientRMService) was 
related to my modifications, so I have corrected it. The following test case 
failures/errors are not related to my changes:
# org.apache.hadoop.yarn.client.TestResourceTrackerOnHA
# org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA
# org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart

Also, I am unsure whether to change the status to Patch Available, as you 
reverted it to Open twice earlier. Please check and let me know; if required, 
I will update the status to Patch Available.

> Improve yarn container command
> --
>
> Key: YARN-2301
> URL: https://issues.apache.org/jira/browse/YARN-2301
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.6.0
>Reporter: Jian He
>Assignee: Naganarasimha G R
>  Labels: usability
> Attachments: YARN-2301.01.patch, YARN-2301.03.patch, 
> YARN-2301.20141120-1.patch, YARN-2301.20141203-1.patch, 
> YARN-2301.20141204-1.patch, YARN-2303.patch
>
>
> While running yarn container -list  command, some 
> observations:
> 1) the scheme (e.g. http/https  ) before LOG-URL is missing
> 2) the start-time is printed in milliseconds (e.g. 1405540544844). Better to 
> print it in a time format.
> 3) finish-time is 0 if the container is not yet finished. It could be "N/A".
> 4) May have an option to run as yarn container -list  OR  yarn 
> application -list-containers  also.  
> As attempt Id is not shown on console, this is easier for user to just copy 
> the appId and run it, may  also be useful for container-preserving AM 
> restart. 
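Observations 2) and 3) above could be handled by a small formatting helper along these lines. This is only a sketch: ContainerTimeSketch and formatTime are hypothetical names, and the date pattern and the fixed UTC time zone are assumptions chosen to keep the output reproducible, not what the yarn CLI uses.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical sketch; not the actual yarn CLI code.
class ContainerTimeSketch {
  // Returns "N/A" for an unset (zero or negative) timestamp, otherwise a
  // human-readable time for the given epoch milliseconds.
  static String formatTime(long epochMillis) {
    if (epochMillis <= 0) {
      return "N/A";
    }
    // SimpleDateFormat is not thread-safe, so a new instance is created per call.
    SimpleDateFormat fmt =
        new SimpleDateFormat("yyyy-MM-dd HH:mm:ss z", Locale.US);
    fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
    return fmt.format(new Date(epochMillis));
  }

  public static void main(String[] args) {
    System.out.println("start-time:  " + formatTime(1405540544844L));
    System.out.println("finish-time: " + formatTime(0L));
  }
}
```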





[jira] [Commented] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException

2014-12-04 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234044#comment-14234044
 ] 

Akira AJISAKA commented on YARN-2910:
-

bq.  if app submissions are frequent, I'd rather slow down requests for queue 
info than the submissions themselves.
Makes sense to me. +1 (non-binding) for using {{SynchronizedList}}. Thanks.

> FSLeafQueue can throw ConcurrentModificationException
> -
>
> Key: YARN-2910
> URL: https://issues.apache.org/jira/browse/YARN-2910
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.5.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
> Attachments: FSLeafQueue_concurrent_exception.txt, YARN-2910.patch
>
>
> The lists that maintain the runnable and non-runnable apps are standard 
> ArrayLists, but there is no guarantee that they will only be manipulated by 
> one thread in the system. This can lead to the following exception:
> {noformat}
> 2014-11-12 02:29:01,169 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN CONTACTING RM.
> java.util.ConcurrentModificationException: java.util.ConcurrentModificationException
> at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859)
> at java.util.ArrayList$Itr.next(ArrayList.java:831)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.getResourceUsage(FSLeafQueue.java:147)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.getHeadroom(FSAppAttempt.java:180)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.allocate(FairScheduler.java:923)
> at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:516)
> {noformat}
> Full stack trace in the attached file.
> We should guard against that by using a thread-safe list such as 
> java.util.concurrent.CopyOnWriteArrayList.





[jira] [Commented] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException

2014-12-04 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234035#comment-14234035
 ] 

Sandy Ryza commented on YARN-2910:
--

Using a CopyOnWriteArrayList would make adding an application an O(n) 
operation.  On many clusters, this happens quite frequently.  Acquiring a lock 
is cheap when there is no contention.  If app submissions are frequent, I'd 
rather slow down requests for queue info than the submissions themselves.  
Otherwise, the former shouldn't have a large effect on the performance of the 
latter.   
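The trade-off discussed above can be sketched with a synchronized list. This is a simplified illustration and not the FSLeafQueue code: AppListSketch and its members are hypothetical, and per-app usage is reduced to an int. Adding stays amortized O(1), while readers that iterate must hold the list's lock for the duration of the loop.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; not the actual FSLeafQueue code.
class AppListSketch {
  private final List<Integer> appUsageMb =
      Collections.synchronizedList(new ArrayList<Integer>());

  // Individual calls such as add() are synchronized by the wrapper.
  void addApp(int usageMb) {
    appUsageMb.add(usageMb);
  }

  // Iteration is not synchronized by the wrapper, so the caller must lock
  // the list itself; otherwise a concurrent add() can still cause a
  // ConcurrentModificationException.
  int getResourceUsage() {
    synchronized (appUsageMb) {
      int total = 0;
      for (int usage : appUsageMb) {
        total += usage;
      }
      return total;
    }
  }

  public static void main(String[] args) {
    AppListSketch queue = new AppListSketch();
    queue.addApp(1024);
    queue.addApp(2048);
    System.out.println("usage = " + queue.getResourceUsage() + "M");
  }
}
```

A CopyOnWriteArrayList would remove the need for the synchronized block in the reader, at the price of copying the backing array on every add, which is the O(n) cost mentioned in the comment.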


> FSLeafQueue can throw ConcurrentModificationException
> -
>
> Key: YARN-2910
> URL: https://issues.apache.org/jira/browse/YARN-2910
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.5.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
> Attachments: FSLeafQueue_concurrent_exception.txt, YARN-2910.patch
>
>
> The lists that maintain the runnable and non-runnable apps are standard 
> ArrayLists, but there is no guarantee that they will only be manipulated by 
> one thread in the system. This can lead to the following exception:
> {noformat}
> 2014-11-12 02:29:01,169 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN CONTACTING RM.
> java.util.ConcurrentModificationException: java.util.ConcurrentModificationException
> at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859)
> at java.util.ArrayList$Itr.next(ArrayList.java:831)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.getResourceUsage(FSLeafQueue.java:147)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.getHeadroom(FSAppAttempt.java:180)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.allocate(FairScheduler.java:923)
> at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:516)
> {noformat}
> Full stack trace in the attached file.
> We should guard against that by using a thread-safe list such as 
> java.util.concurrent.CopyOnWriteArrayList.





[jira] [Commented] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException

2014-12-04 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234023#comment-14234023
 ] 

Akira AJISAKA commented on YARN-2910:
-

As [~wilfreds] mentioned, synchronization can cause the RM to slow down if 
there are many getQueueInfo requests from clients at a time, so I'm thinking 
{{CopyOnWriteArrayList}} might be better.
By the way, we should also fix CapacityScheduler. (YARN-2922)

> FSLeafQueue can throw ConcurrentModificationException
> -
>
> Key: YARN-2910
> URL: https://issues.apache.org/jira/browse/YARN-2910
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.5.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
> Attachments: FSLeafQueue_concurrent_exception.txt, YARN-2910.patch
>
>
> The lists that maintain the runnable and non-runnable apps are standard 
> ArrayLists, but there is no guarantee that they will only be manipulated by 
> one thread in the system. This can lead to the following exception:
> {noformat}
> 2014-11-12 02:29:01,169 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN CONTACTING RM.
> java.util.ConcurrentModificationException: java.util.ConcurrentModificationException
> at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859)
> at java.util.ArrayList$Itr.next(ArrayList.java:831)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.getResourceUsage(FSLeafQueue.java:147)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.getHeadroom(FSAppAttempt.java:180)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.allocate(FairScheduler.java:923)
> at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:516)
> {noformat}
> Full stack trace in the attached file.
> We should guard against that by using a thread-safe list such as 
> java.util.concurrent.CopyOnWriteArrayList.


