[jira] [Commented] (YARN-1880) Cleanup TestApplicationClientProtocolOnHA
[ https://issues.apache.org/jira/browse/YARN-1880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377403#comment-14377403 ]

Tsuyoshi Ozawa commented on YARN-1880:
--------------------------------------

[~qwertymaniac] [~ajisakaa] thank you for the review!

> Cleanup TestApplicationClientProtocolOnHA
> -----------------------------------------
>
>                 Key: YARN-1880
>                 URL: https://issues.apache.org/jira/browse/YARN-1880
>             Project: Hadoop YARN
>          Issue Type: Test
>          Components: test
>    Affects Versions: 2.6.0
>            Reporter: Tsuyoshi Ozawa
>            Assignee: Tsuyoshi Ozawa
>            Priority: Trivial
>             Fix For: 2.8.0
>
>         Attachments: YARN-1880.1.patch
>
>
> The tests introduced in YARN-1521 include multiple assertions joined with &&.
> We should separate them, because with a compound assertion it is difficult to
> identify which condition failed.
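For illustration, here is the kind of cleanup being reviewed, as a minimal JUnit sketch; the variable names are invented for the example and not taken from TestApplicationClientProtocolOnHA:

{code}
// Before: a compound assertion; on failure JUnit only reports
// "expected true", with no hint which of the two conditions was violated.
Assert.assertTrue(report.getHost().equals(expectedHost)
    && report.getRpcPort() == expectedRpcPort);

// After: separate assertions, each failing with a precise message.
Assert.assertEquals(expectedHost, report.getHost());
Assert.assertEquals(expectedRpcPort, report.getRpcPort());
{code}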
[jira] [Commented] (YARN-1880) Cleanup TestApplicationClientProtocolOnHA
[ https://issues.apache.org/jira/browse/YARN-1880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377382#comment-14377382 ]

Hudson commented on YARN-1880:
------------------------------

FAILURE: Integrated in Hadoop-trunk-Commit #7413 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7413/])
YARN-1880. Cleanup TestApplicationClientProtocolOnHA. Contributed by ozawa. (harsh: rev fbceb3b41834d6899c4353fb24f12ba3ecf67faf)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestApplicationClientProtocolOnHA.java
* hadoop-yarn-project/CHANGES.txt
[jira] [Updated] (YARN-1880) Cleanup TestApplicationClientProtocolOnHA
[ https://issues.apache.org/jira/browse/YARN-1880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated YARN-1880:
--------------------------
    Component/s: test
[jira] [Updated] (YARN-1880) Cleanup TestApplicationClientProtocolOnHA
[ https://issues.apache.org/jira/browse/YARN-1880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated YARN-1880:
--------------------------
    Affects Version/s: 2.6.0
[jira] [Commented] (YARN-1880) Cleanup TestApplicationClientProtocolOnHA
[ https://issues.apache.org/jira/browse/YARN-1880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377378#comment-14377378 ]

Harsh J commented on YARN-1880:
-------------------------------

+1, this still applies. Committing shortly, thanks [~ozawa] (and [~ajisakaa] for the earlier review)!
[jira] [Commented] (YARN-3225) New parameter or CLI for decommissioning node gracefully in RMAdmin CLI
[ https://issues.apache.org/jira/browse/YARN-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377372#comment-14377372 ]

Devaraj K commented on YARN-3225:
---------------------------------

{code}
org.apache.hadoop.yarn.server.resourcemanager.TestRM
{code}
This test failure is not related to the patch.

> New parameter or CLI for decommissioning node gracefully in RMAdmin CLI
> ------------------------------------------------------------------------
>
>                 Key: YARN-3225
>                 URL: https://issues.apache.org/jira/browse/YARN-3225
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Junping Du
>            Assignee: Devaraj K
>         Attachments: YARN-3225-1.patch, YARN-3225.patch, YARN-914.patch
>
>
> A new CLI (or an existing CLI with new parameters) should put each node on the
> decommission list into decommissioning status and track a timeout, terminating
> the nodes that haven't finished by then.
[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)
[ https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377356#comment-14377356 ]

Hadoop QA commented on YARN-2495:
---------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12706826/YARN-2495.20150324-1.patch
against trunk revision 9fae455.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7088//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7088//console

This message is automatically generated.

> Allow admin specify labels from each NM (Distributed configuration)
> --------------------------------------------------------------------
>
>                 Key: YARN-2495
>                 URL: https://issues.apache.org/jira/browse/YARN-2495
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Wangda Tan
>            Assignee: Naganarasimha G R
>         Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, YARN-2495.20150305-1.patch, YARN-2495.20150309-1.patch, YARN-2495.20150318-1.patch, YARN-2495.20150320-1.patch, YARN-2495.20150321-1.patch, YARN-2495.20150324-1.patch, YARN-2495_20141022.1.patch
>
>
> The target of this JIRA is to allow admins to specify labels on each NM. This covers:
> - Users can set labels on each NM (by setting yarn-site.xml (YARN-2923) or using a script as suggested by [~aw] (YARN-2729))
> - The NM will send its labels to the RM via the ResourceTracker API
> - The RM will set the labels in NodeLabelManager when the NM registers/updates labels
[jira] [Commented] (YARN-3394) WebApplication proxy documentation is incomplete
[ https://issues.apache.org/jira/browse/YARN-3394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377320#comment-14377320 ]

Tsuyoshi Ozawa commented on YARN-3394:
--------------------------------------

+1 for having the document.

> WebApplication proxy documentation is incomplete
> -------------------------------------------------
>
>                 Key: YARN-3394
>                 URL: https://issues.apache.org/jira/browse/YARN-3394
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.6.0
>            Reporter: Bibin A Chundatt
>            Assignee: Naganarasimha G R
>            Priority: Minor
>
>
> The web proxy documentation (hadoop-yarn/hadoop-yarn-site/WebApplicationProxy.html) is incomplete. Missing topics:
> 1. Configuration to run the proxy as a separate server (service start/stop)
> 2. Steps to start it as a daemon service
> 3. Secure mode for the web proxy
[jira] [Created] (YARN-3394) WebApplication proxy documentation is incomplete
Bibin A Chundatt created YARN-3394:
--------------------------------------

             Summary: WebApplication proxy documentation is incomplete
                 Key: YARN-3394
                 URL: https://issues.apache.org/jira/browse/YARN-3394
             Project: Hadoop YARN
          Issue Type: Bug
          Components: resourcemanager
    Affects Versions: 2.6.0
            Reporter: Bibin A Chundatt
            Assignee: Naganarasimha G R
            Priority: Minor

The web proxy documentation (hadoop-yarn/hadoop-yarn-site/WebApplicationProxy.html) is incomplete. Missing topics:
1. Configuration to run the proxy as a separate server (service start/stop)
2. Steps to start it as a daemon service
3. Secure mode for the web proxy
[jira] [Commented] (YARN-3347) Improve YARN log command to get AMContainer logs
[ https://issues.apache.org/jira/browse/YARN-3347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377265#comment-14377265 ]

Hadoop QA commented on YARN-3347:
---------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12706818/YARN-3347.2.rebase.patch
against trunk revision 9fae455.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7087//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7087//console

This message is automatically generated.

> Improve YARN log command to get AMContainer logs
> -------------------------------------------------
>
>                 Key: YARN-3347
>                 URL: https://issues.apache.org/jira/browse/YARN-3347
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Xuan Gong
>            Assignee: Xuan Gong
>         Attachments: YARN-3347.1.patch, YARN-3347.1.rebase.patch, YARN-3347.2.patch, YARN-3347.2.rebase.patch
>
>
> Right now, we can specify an applicationId, node HTTP address, and container ID
> to get a specific container's log, or specify only an applicationId to get all
> the container logs. It is very hard for users to get the AM container's logs,
> even though those logs carry the most useful information, because users need to
> know the AM container's ID and the related node HTTP address.
> We could improve the YARN log command to let users fetch AM container logs directly.
[jira] [Commented] (YARN-3336) FileSystem memory leak in DelegationTokenRenewer
[ https://issues.apache.org/jira/browse/YARN-3336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377258#comment-14377258 ]

zhihai xu commented on YARN-3336:
---------------------------------

[~cnauroth], Not a problem, thanks for the notification.

> FileSystem memory leak in DelegationTokenRenewer
> -------------------------------------------------
>
>                 Key: YARN-3336
>                 URL: https://issues.apache.org/jira/browse/YARN-3336
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>            Reporter: zhihai xu
>            Assignee: zhihai xu
>            Priority: Critical
>             Fix For: 2.7.0
>
>         Attachments: YARN-3336.000.patch, YARN-3336.001.patch, YARN-3336.002.patch, YARN-3336.003.patch, YARN-3336.004.patch
>
>
> FileSystem memory leak in DelegationTokenRenewer.
> Every time DelegationTokenRenewer#obtainSystemTokensForUser is called, a new FileSystem entry is added to FileSystem#CACHE and is never garbage collected.
> This is the implementation of obtainSystemTokensForUser:
> {code}
> protected Token<?>[] obtainSystemTokensForUser(String user,
>     final Credentials credentials) throws IOException, InterruptedException {
>   // Get new hdfs tokens on behalf of this user
>   UserGroupInformation proxyUser =
>       UserGroupInformation.createProxyUser(user,
>           UserGroupInformation.getLoginUser());
>   Token<?>[] newTokens =
>       proxyUser.doAs(new PrivilegedExceptionAction<Token<?>[]>() {
>         @Override
>         public Token<?>[] run() throws Exception {
>           return FileSystem.get(getConfig()).addDelegationTokens(
>               UserGroupInformation.getLoginUser().getUserName(), credentials);
>         }
>       });
>   return newTokens;
> }
> {code}
> The memory leak happens when FileSystem.get(getConfig()) is called with a new proxy user, because createProxyUser always creates a new Subject.
> The calling sequence is FileSystem.get(getConfig()) => FileSystem.get(getDefaultUri(conf), conf) => FileSystem.CACHE.get(uri, conf) => FileSystem.CACHE.getInternal(uri, conf, key) => FileSystem.CACHE.map.get(key) => createFileSystem(uri, conf)
> {code}
> public static UserGroupInformation createProxyUser(String user,
>     UserGroupInformation realUser) {
>   if (user == null || user.isEmpty()) {
>     throw new IllegalArgumentException("Null user");
>   }
>   if (realUser == null) {
>     throw new IllegalArgumentException("Null real user");
>   }
>   Subject subject = new Subject();
>   Set<Principal> principals = subject.getPrincipals();
>   principals.add(new User(user));
>   principals.add(new RealUser(realUser));
>   UserGroupInformation result = new UserGroupInformation(subject);
>   result.setAuthenticationMethod(AuthenticationMethod.PROXY);
>   return result;
> }
> {code}
> FileSystem#Cache#Key.equals compares the ugi:
> {code}
> Key(URI uri, Configuration conf, long unique) throws IOException {
>   scheme = uri.getScheme() == null ? "" : uri.getScheme().toLowerCase();
>   authority = uri.getAuthority() == null ? "" : uri.getAuthority().toLowerCase();
>   this.unique = unique;
>   this.ugi = UserGroupInformation.getCurrentUser();
> }
>
> public boolean equals(Object obj) {
>   if (obj == this) {
>     return true;
>   }
>   if (obj != null && obj instanceof Key) {
>     Key that = (Key) obj;
>     return isEqual(this.scheme, that.scheme)
>         && isEqual(this.authority, that.authority)
>         && isEqual(this.ugi, that.ugi)
>         && (this.unique == that.unique);
>   }
>   return false;
> }
> {code}
> UserGroupInformation.equals compares the subject by reference:
> {code}
> public boolean equals(Object o) {
>   if (o == this) {
>     return true;
>   } else if (o == null || getClass() != o.getClass()) {
>     return false;
>   } else {
>     return subject == ((UserGroupInformation) o).subject;
>   }
> }
> {code}
> So every time createProxyUser and FileSystem.get(getConfig()) are called, a new FileSystem is created and a new entry is added to FileSystem.CACHE.
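One way to avoid the leak, as a sketch (not necessarily the committed YARN-3336 patch): create an uncached FileSystem with FileSystem.newInstance() and close it when done, so nothing stays pinned in FileSystem.CACHE under the throw-away proxy UGI:

{code}
Token<?>[] newTokens =
    proxyUser.doAs(new PrivilegedExceptionAction<Token<?>[]>() {
      @Override
      public Token<?>[] run() throws Exception {
        // newInstance() bypasses FileSystem.CACHE, so this instance can
        // be closed explicitly instead of lingering in the cache keyed
        // by the newly created proxy-user Subject.
        FileSystem fs = FileSystem.newInstance(getConfig());
        try {
          return fs.addDelegationTokens(
              UserGroupInformation.getLoginUser().getUserName(), credentials);
        } finally {
          fs.close();
        }
      }
    });
{code}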
[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)
[ https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377252#comment-14377252 ]

Naganarasimha G R commented on YARN-2495:
-----------------------------------------

Hi [~leftnoteasy], oops, a mistake on my side; I have uploaded the patch with the correction. Please check.
[jira] [Updated] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)
[ https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Naganarasimha G R updated YARN-2495:
------------------------------------
    Attachment: YARN-2495.20150324-1.patch
[jira] [Commented] (YARN-3393) Getting application(s) goes wrong when app finishes before starting the attempt
[ https://issues.apache.org/jira/browse/YARN-3393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377216#comment-14377216 ]

Hudson commented on YARN-3393:
------------------------------

FAILURE: Integrated in Hadoop-trunk-Commit #7409 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7409/])
YARN-3393. Getting application(s) goes wrong when app finishes before (xgong: rev 9fae455e26e0230107e1c6db58a49a5b6b296cf4)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/TestApplicationHistoryManagerOnTimelineStore.java
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryManagerOnTimelineStore.java

> Getting application(s) goes wrong when app finishes before starting the attempt
> ---------------------------------------------------------------------------------
>
>                 Key: YARN-3393
>                 URL: https://issues.apache.org/jira/browse/YARN-3393
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: timelineserver
>            Reporter: Zhijie Shen
>            Assignee: Zhijie Shen
>            Priority: Critical
>             Fix For: 2.7.0
>
>         Attachments: YARN-3393.1.patch
>
>
> When generating the app report in ApplicationHistoryManagerOnTimelineStore, the code checks whether appAttempt == null:
> {code}
> ApplicationAttemptReport appAttempt =
>     getApplicationAttempt(app.appReport.getCurrentApplicationAttemptId());
> if (appAttempt != null) {
>   app.appReport.setHost(appAttempt.getHost());
>   app.appReport.setRpcPort(appAttempt.getRpcPort());
>   app.appReport.setTrackingUrl(appAttempt.getTrackingUrl());
>   app.appReport.setOriginalTrackingUrl(appAttempt.getOriginalTrackingUrl());
> }
> {code}
> However, {{getApplicationAttempt}} doesn't return null; it throws ApplicationAttemptNotFoundException:
> {code}
> if (entity == null) {
>   throw new ApplicationAttemptNotFoundException(
>       "The entity for application attempt " + appAttemptId +
>       " doesn't exist in the timeline store");
> } else {
>   return convertToApplicationAttemptReport(entity);
> }
> {code}
> The two pieces of code aren't coupled well.
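A plausible shape for the fix, sketched from the two snippets above (not necessarily the committed YARN-3393 patch): treat ApplicationAttemptNotFoundException as "no attempt yet" instead of letting it escape while generating the report.

{code}
ApplicationAttemptReport appAttempt = null;
try {
  appAttempt = getApplicationAttempt(
      app.appReport.getCurrentApplicationAttemptId());
} catch (ApplicationAttemptNotFoundException e) {
  // The app finished before its first attempt reached the timeline
  // store; leave host/port/tracking-URL at their defaults.
}
if (appAttempt != null) {
  app.appReport.setHost(appAttempt.getHost());
  app.appReport.setRpcPort(appAttempt.getRpcPort());
  app.appReport.setTrackingUrl(appAttempt.getTrackingUrl());
  app.appReport.setOriginalTrackingUrl(appAttempt.getOriginalTrackingUrl());
}
{code}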
[jira] [Updated] (YARN-3347) Improve YARN log command to get AMContainer logs
[ https://issues.apache.org/jira/browse/YARN-3347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuan Gong updated YARN-3347:
----------------------------
    Attachment: YARN-3347.2.rebase.patch
[jira] [Commented] (YARN-3393) Getting application(s) goes wrong when app finishes before starting the attempt
[ https://issues.apache.org/jira/browse/YARN-3393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377211#comment-14377211 ]

Xuan Gong commented on YARN-3393:
---------------------------------

Committed into trunk/branch-2/branch-2.7. Thanks, zhijie.
[jira] [Commented] (YARN-3393) Getting application(s) goes wrong when app finishes before starting the attempt
[ https://issues.apache.org/jira/browse/YARN-3393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377204#comment-14377204 ]

Xuan Gong commented on YARN-3393:
---------------------------------

+1 LGTM. Will commit.
[jira] [Commented] (YARN-3021) YARN's delegation-token handling disallows certain trust setups to operate properly over DistCp
[ https://issues.apache.org/jira/browse/YARN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377201#comment-14377201 ]

Hadoop QA commented on YARN-3021:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12706795/YARN-3021.006.patch
against trunk revision 2c238ae.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
    org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerNodeLabelUpdate

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7085//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7085//console

This message is automatically generated.

> YARN's delegation-token handling disallows certain trust setups to operate properly over DistCp
> -------------------------------------------------------------------------------------------------
>
>                 Key: YARN-3021
>                 URL: https://issues.apache.org/jira/browse/YARN-3021
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 2.3.0
>            Reporter: Harsh J
>            Assignee: Yongjun Zhang
>         Attachments: YARN-3021.001.patch, YARN-3021.002.patch, YARN-3021.003.patch, YARN-3021.004.patch, YARN-3021.005.patch, YARN-3021.006.patch, YARN-3021.patch
>
>
> Consider this scenario of 3 realms: A, B and COMMON, where A trusts COMMON and B trusts COMMON (one-way trusts in both cases), and both A and B run HDFS + YARN clusters.
> Now if one logs in with a COMMON credential and runs a job on A's YARN that needs to access B's HDFS (such as a DistCp), the operation fails in the RM, as it attempts a renewDelegationToken(…) synchronously during application submission (to validate the managed token before adding it to a scheduler for automatic renewal). The call obviously fails because the B realm will not trust A's credentials (here, the RM's principal is the renewer).
> In the 1.x JobTracker the same call is present, but it is done asynchronously, and once the renewal attempt failed we simply ceased to schedule any further renewal attempts rather than failing the job immediately.
> We should change the logic such that we attempt the renewal but go easy on the failure: skip the scheduling alone rather than bubble an error back to the client and fail the app submission. This way the old behaviour is retained.
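A hedged sketch of the tolerant-renewal behavior the description proposes; renewToken() and setTimerForTokenRenewal() are illustrative stand-ins, not the actual DelegationTokenRenewer method names:

{code}
try {
  renewToken(dttr);               // eager validation of the token
  setTimerForTokenRenewal(dttr);  // schedule automatic renewal
} catch (IOException e) {
  // e.g. the B realm refuses to let A's RM principal renew the token.
  // Log and skip scheduling rather than bubbling the error back to the
  // client and failing app submission, matching the 1.x JobTracker.
  LOG.warn("Unable to renew token, skipping scheduled renewals: " + dttr, e);
}
{code}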
[jira] [Commented] (YARN-3034) [Collector wireup] Implement RM starting its timeline collector
[ https://issues.apache.org/jira/browse/YARN-3034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377196#comment-14377196 ]

Naganarasimha G R commented on YARN-3034:
-----------------------------------------

Hi [~zjshen],

bq. According to this comments, it seems that you want to create a separate stack to put entities into RMTimelineCollector, right? If so, the current design makes sense.

Yes, I wanted to create a separate stack similar to SystemMetricsPublisher, so that ATS v1 and v2 are less coupled and the removal of SMP, once it is completely deprecated, is smoother.

bq. yarn.resourcemanager.system-metrics-publisher.enabled for v1 SystemMetricsPublisher. For v2, both RM and NM reads yarn.system-metrics-publisher.enabled? No need to have v1/v2 flag?

On second thought, I feel this approach is better: once we deprecate SMP, a separate version-type flag would just be unnecessary extra configuration. If everyone is fine with this, I will move back to the approach Zhijie mentioned. A configuration sketch follows below.

> [Collector wireup] Implement RM starting its timeline collector
> -----------------------------------------------------------------
>
>                 Key: YARN-3034
>                 URL: https://issues.apache.org/jira/browse/YARN-3034
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Naganarasimha G R
>         Attachments: YARN-3034-20150312-1.patch, YARN-3034.20150205-1.patch, YARN-3034.20150316-1.patch, YARN-3034.20150318-1.patch, YARN-3034.20150320-1.patch
>
>
> Per design in YARN-2928, implement resource managers starting their own ATS writers.
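For concreteness, the two switches under discussion would look like this in yarn-site.xml; the v1 key already exists, while the bare v2 key is the proposal in this thread, not a released property:

{code:xml}
<!-- v1: read only by the RM, controls the existing SystemMetricsPublisher -->
<property>
  <name>yarn.resourcemanager.system-metrics-publisher.enabled</name>
  <value>true</value>
</property>

<!-- v2 (proposed): read by both RM and NM; no separate v1/v2 version flag -->
<property>
  <name>yarn.system-metrics-publisher.enabled</name>
  <value>true</value>
</property>
{code}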
[jira] [Commented] (YARN-3244) Add user specified information for clean-up container in ApplicationSubmissionContext
[ https://issues.apache.org/jira/browse/YARN-3244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377175#comment-14377175 ]

Hadoop QA commented on YARN-3244:
---------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12706802/YARN-3244.2.patch
against trunk revision 2c238ae.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7086//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7086//console

This message is automatically generated.

> Add user specified information for clean-up container in ApplicationSubmissionContext
> ---------------------------------------------------------------------------------------
>
>                 Key: YARN-3244
>                 URL: https://issues.apache.org/jira/browse/YARN-3244
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api
>            Reporter: Xuan Gong
>            Assignee: Xuan Gong
>         Attachments: YARN-3244.1.patch, YARN-3244.2.patch
>
>
> To launch a user-specified clean-up container, users need to provide proper information to YARN. It should at least have the following properties:
> * A flag indicating whether the clean-up container needs to be launched
> * A time-out period indicating how long the clean-up container can run
> * maxRetry times
> * The containerLaunchContext for the clean-up container
[jira] [Commented] (YARN-3393) Getting application(s) goes wrong when app finishes before starting the attempt
[ https://issues.apache.org/jira/browse/YARN-3393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377133#comment-14377133 ]

Hadoop QA commented on YARN-3393:
---------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12706792/YARN-3393.1.patch
against trunk revision 2c238ae.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7084//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7084//console

This message is automatically generated.
[jira] [Commented] (YARN-3244) Add user specified information for clean-up container in ApplicationSubmissionContext
[ https://issues.apache.org/jira/browse/YARN-3244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377098#comment-14377098 ]

Xuan Gong commented on YARN-3244:
---------------------------------

Created a new object named CleanupContainer which includes the launch-context for the clean-up container and maxCleanupContainerAttempts. Also added two global YARN configurations: RM_CLEAN_UP_CONTAINER_TIMEOUT_MS and RM_CLEAN_UP_CONTAINER_MAX_ATTEMPTS.
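To make the shape of the proposal concrete, a purely hypothetical usage sketch extrapolated from the names in this comment; CleanupContainer and its setters are proposed in this patch, not an existing YARN API, and the property strings are guesses for illustration:

{code}
// Hypothetical API sketch based only on the names above.
ContainerLaunchContext cleanupCtx = createCleanupLaunchContext();
CleanupContainer cleanup = CleanupContainer.newInstance(
    cleanupCtx,  // launch-context for the clean-up container
    3);          // maxCleanupContainerAttempts
applicationSubmissionContext.setCleanupContainer(cleanup);

// Cluster-wide limits would come from the two new configuration constants:
//   RM_CLEAN_UP_CONTAINER_TIMEOUT_MS   -> e.g. yarn.resourcemanager.cleanup-container.timeout-ms
//   RM_CLEAN_UP_CONTAINER_MAX_ATTEMPTS -> e.g. yarn.resourcemanager.cleanup-container.max-attempts
{code}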
[jira] [Updated] (YARN-3244) Add user specified information for clean-up container in ApplicationSubmissionContext
[ https://issues.apache.org/jira/browse/YARN-3244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuan Gong updated YARN-3244:
----------------------------
    Attachment: YARN-3244.2.patch

Addressed all the latest comments.
[jira] [Updated] (YARN-3021) YARN's delegation-token handling disallows certain trust setups to operate properly over DistCp
[ https://issues.apache.org/jira/browse/YARN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yongjun Zhang updated YARN-3021:
--------------------------------
    Attachment: YARN-3021.006.patch

The test failure seems to be unrelated; uploading the same patch 06 to trigger another Jenkins run.
[jira] [Updated] (YARN-3021) YARN's delegation-token handling disallows certain trust setups to operate properly over DistCp
[ https://issues.apache.org/jira/browse/YARN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yongjun Zhang updated YARN-3021:
--------------------------------
    Attachment: (was: YARN-3021.006.patch)
[jira] [Updated] (YARN-3393) Getting application(s) goes wrong when app finishes before starting the attempt
[ https://issues.apache.org/jira/browse/YARN-3393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhijie Shen updated YARN-3393:
------------------------------
    Attachment: YARN-3393.1.patch

Created the patch to fix the problem.
[jira] [Commented] (YARN-2901) Add errors and warning stats to RM, NM web UI
[ https://issues.apache.org/jira/browse/YARN-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377040#comment-14377040 ]

Wangda Tan commented on YARN-2901:
----------------------------------

Hi [~vvasudev],
I spent some time taking a look at the Log4jMetricsAppender implementation (will cover the other modified components in the next round).

1) Log4jMetricsAppender
1.1 Better to place it in yarn-server-common?
1.2 If you agree with the above, how about putting it into package o.a.h.y.server.metrics (or utils)?
1.3 Rename it to Log4jWarnErrorMetricsAppender?
1.4 Comments about the implementation:
I think the current cleanup implementation can be improved. The cutoff process for messages/counts basically loops over all stored items, which could be inefficient (imagine the number of stored messages exceeding the threshold), and the existing logic in the patch could leave lots of messages stored in the meantime (tons of messages can be generated in 5 minutes, which is the purge task's run interval).
If you make the data structure SortedMap<String, SortedMap<Long, Integer>> for errors (and warnings), where the outer map is sorted by value (the SortedMap with the smallest timestamp goes first) and the inner map is sorted by key (smallest timestamp first), the purge can happen whenever we add an event: it takes at most log(N=500) time, and no extra timer task is needed.
To make a SortedMap sort by value, one approach is described in the first answer at http://stackoverflow.com/questions/109383/how-to-sort-a-mapkey-value-on-the-values-in-java. Here, value = SortedMap<Long, Integer>, and we can sort the SortedMaps according to the smallest key in each.
One corner case to consider: a single message can accumulate lots of different timestamps, so we need to purge the inner SortedMap too. For better readability, you could wrap the SortedMap in an inner class like MessageInfo. A sketch of the idea follows below.

> Add errors and warning stats to RM, NM web UI
> ----------------------------------------------
>
>                 Key: YARN-2901
>                 URL: https://issues.apache.org/jira/browse/YARN-2901
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager, resourcemanager
>            Reporter: Varun Vasudev
>            Assignee: Varun Vasudev
>         Attachments: Exception collapsed.png, Exception expanded.jpg, Screen Shot 2015-03-19 at 7.40.02 PM.png, apache-yarn-2901.0.patch, apache-yarn-2901.1.patch
>
>
> It would be really useful to have statistics on the number of errors and warnings in the RM and NM web UI. I'm thinking about:
> 1. The number of errors and warnings in the past 5 min/1 hour/12 hours/day
> 2. The top 'n' (20?) most common exceptions in the past 5 min/1 hour/12 hours/day
> By errors and warnings I'm referring to the log level.
> I suspect we can probably achieve this by writing a custom appender? (I'm open to suggestions on alternate mechanisms for implementing this.)
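One way to realize the purge-on-add idea Wangda describes, as a sketch; the class and field names are invented for the example, not taken from the patch, and it assumes distinct millisecond timestamps for the eviction index (a real implementation would break ties and also bound the inner maps):

{code}
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class BoundedErrorStore {
  private static final int MAX_MESSAGES = 500;

  // message -> (timestamp -> count), inner map sorted by timestamp
  private final Map<String, TreeMap<Long, Integer>> errors =
      new HashMap<String, TreeMap<Long, Integer>>();
  // oldest timestamp of each message -> message, the eviction index
  private final TreeMap<Long, String> byOldest = new TreeMap<Long, String>();

  public synchronized void add(String message, long timestamp) {
    TreeMap<Long, Integer> times = errors.get(message);
    if (times == null) {
      times = new TreeMap<Long, Integer>();
      errors.put(message, times);
    } else {
      // The index entry keyed by the old firstKey may move; drop it first.
      byOldest.remove(times.firstKey());
    }
    Integer count = times.get(timestamp);
    times.put(timestamp, count == null ? 1 : count + 1);
    byOldest.put(times.firstKey(), message);
    if (errors.size() > MAX_MESSAGES) {
      // Evict the message whose oldest occurrence is stalest: O(log n),
      // done on every add, so no separate purge timer task is needed.
      Map.Entry<Long, String> victim = byOldest.pollFirstEntry();
      errors.remove(victim.getValue());
    }
  }
}
{code}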
[jira] [Commented] (YARN-3347) Improve YARN log command to get AMContainer logs
[ https://issues.apache.org/jira/browse/YARN-3347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377031#comment-14377031 ]

Hadoop QA commented on YARN-3347:
---------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12706770/YARN-3347.2.patch
against trunk revision 2c238ae.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7083//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7083//console

This message is automatically generated.
[jira] [Commented] (YARN-3387) container complete message couldn't pass to am if am restarted and rm changed
[ https://issues.apache.org/jira/browse/YARN-3387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377019#comment-14377019 ]

sandflee commented on YARN-3387:
--------------------------------

yes

> container complete message couldn't pass to am if am restarted and rm changed
> -------------------------------------------------------------------------------
>
>                 Key: YARN-3387
>                 URL: https://issues.apache.org/jira/browse/YARN-3387
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.6.0
>            Reporter: sandflee
>            Priority: Critical
>
>
> Suppose AM work-preserving restart and RM HA are enabled.
> A container-complete message is recorded in appAttempt.justFinishedContainers in the RM. Normally, all attempts of one app share the same justFinishedContainers, but after an RM failover every attempt has its own justFinishedContainers. So in the following situation the container-complete message cannot reach the AM:
> 1. the AM restarts
> 2. the RM changes (failover)
> 3. a container launched by the first AM completes
> The container-complete message is passed to appAttempt1, not appAttempt2, but the AM pulls finished containers from appAttempt2 (the current app attempt).
[jira] [Commented] (YARN-3304) ResourceCalculatorProcessTree#getCpuUsagePercent default return value is inconsistent with other getters
[ https://issues.apache.org/jira/browse/YARN-3304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377008#comment-14377008 ]

Hadoop QA commented on YARN-3304:
---------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12706767/YARN-3304-v2.patch
against trunk revision 2c238ae.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7082//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7082//console

This message is automatically generated.

> ResourceCalculatorProcessTree#getCpuUsagePercent default return value is inconsistent with other getters
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-3304
>                 URL: https://issues.apache.org/jira/browse/YARN-3304
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>            Reporter: Junping Du
>            Assignee: Karthik Kambatla
>            Priority: Blocker
>
>         Attachments: YARN-3304-v2.patch, YARN-3304.patch
>
>
> Per discussions in YARN-3296, getCpuUsagePercent() returns -1 in the unavailable case while the other resource metrics return 0 in the same case, which is inconsistent.
[jira] [Updated] (YARN-3347) Improve YARN log command to get AMContainer logs
[ https://issues.apache.org/jira/browse/YARN-3347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuan Gong updated YARN-3347:
----------------------------
    Attachment: YARN-3347.2.patch

Fixed the -1 on findbugs.
[jira] [Commented] (YARN-3304) ResourceCalculatorProcessTree#getCpuUsagePercent default return value is inconsistent with other getters
[ https://issues.apache.org/jira/browse/YARN-3304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376973#comment-14376973 ]

Karthik Kambatla commented on YARN-3304:
----------------------------------------

bq. That's an incompatible change which sounds not necessary for now.

In previous releases, we have never called these APIs Public even if they were intended to be sub-classed. In my mind, this is the last opportunity to decide what the API should do, and I think consistent and reasonable return values should be given a higher priority than compatibility.

bq. May be we don't have to leverage "-1" in resource usage to distinguish unavailable case? e.g. we can have some boolean value to identify the resource is available or not which sounds more correct than using odd value like Karthik Kambatla mentioned before.

I am okay with adding boolean methods to capture unavailability, but that seems a little overboard. Using -1 in the ResourceCalculatorProcessTree is okay by me; my concern was with logging this -1 value in the metrics. In either case, I would like the container usage metrics to check whether the usage is available before logging it.

bq. So I propose to go patch here (after fixing a minor test failure) in 2.7 given this is a blocker and we can fix YARN-3392 later in 2.8. Thoughts?

Since it is not too much work or risk, I would prefer we fix both in 2.7. This is solely wearing my Apache hat; my Cloudera hat doesn't really mind it being in 2.8 vs 2.7.
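A small sketch of the availability check being asked for, assuming the -1 sentinel lands as a constant (say, ResourceCalculatorProcessTree.UNAVAILABLE == -1, the convention under discussion here) and with containerMetrics.recordCpuUsage() as an illustrative metrics call rather than a confirmed API:

{code}
float cpuUsagePercent = pTree.getCpuUsagePercent();
// Skip the gauge entirely when the plugin could not measure usage,
// so -1 never leaks into the recorded metrics.
if (cpuUsagePercent != ResourceCalculatorProcessTree.UNAVAILABLE) {
  containerMetrics.recordCpuUsage((int) cpuUsagePercent);
}
{code}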
[jira] [Updated] (YARN-3304) ResourceCalculatorProcessTree#getCpuUsagePercent default return value is inconsistent with other getters
[ https://issues.apache.org/jira/browse/YARN-3304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated YARN-3304:
-----------------------------
    Attachment: YARN-3304-v2.patch

Updated the patch to v2 to fix the test failure in the 1st patch.
[jira] [Commented] (YARN-3021) YARN's delegation-token handling disallows certain trust setups to operate properly over DistCp
[ https://issues.apache.org/jira/browse/YARN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376944#comment-14376944 ] Hadoop QA commented on YARN-3021: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12706735/YARN-3021.006.patch against trunk revision 972f1f1. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServices org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesHttpStaticUserPermissions org.apache.hadoop.yarn.server.resourcemanager.TestRM org.apache.hadoop.yarn.server.resourcemanager.recovery.TestFSRMStateStore org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7081//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7081//console This message is automatically generated. > YARN's delegation-token handling disallows certain trust setups to operate > properly over DistCp > --- > > Key: YARN-3021 > URL: https://issues.apache.org/jira/browse/YARN-3021 > Project: Hadoop YARN > Issue Type: Bug > Components: security >Affects Versions: 2.3.0 >Reporter: Harsh J >Assignee: Yongjun Zhang > Attachments: YARN-3021.001.patch, YARN-3021.002.patch, > YARN-3021.003.patch, YARN-3021.004.patch, YARN-3021.005.patch, > YARN-3021.006.patch, YARN-3021.patch > > > Consider this scenario of 3 realms: A, B and COMMON, where A trusts COMMON, > and B trusts COMMON (a one-way trust in both cases), and both A and B run HDFS + YARN > clusters. > Now if one logs in with a COMMON credential, and runs a job on A's YARN that > needs to access B's HDFS (such as a DistCp), the operation fails in the RM, > as it attempts a renewDelegationToken(…) synchronously during application > submission (to validate the managed token before it adds it to a scheduler > for automatic renewal). The call obviously fails because the B realm will not trust > A's credentials (here, the RM's principal is the renewer). > In the 1.x JobTracker the same call is present, but it is done asynchronously, > and once the renewal attempt failed we simply ceased to schedule any further > renewal attempts, rather than failing the job immediately. > We should change the logic such that we attempt the renewal but go easy on > the failure and skip only the scheduling, rather than bubbling an error back > to the client and failing the app submission. This way the old behaviour is > retained. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3304) ResourceCalculatorProcessTree#getCpuUsagePercent default return value is inconsistent with other getters
[ https://issues.apache.org/jira/browse/YARN-3304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376920#comment-14376920 ] Junping Du commented on YARN-3304: -- Thanks [~adhoot] for the comments! bq. If we use a default of zero we cannot distinguish when it's unavailable versus zero usage. That will make the future "track the improvement to handle unavailable case later" nearly impossible to do. Maybe we don't have to use "-1" in resource usage to mark the unavailable case? E.g. we could have a boolean value to indicate whether the resource is available, which sounds more correct than using an odd value, as [~ka...@cloudera.com] mentioned before. bq. I propose we make all the defaults consistently -1. That's an incompatible change, which seems unnecessary for now. bq. I can fix the metrics as well to use this to implement tracking of the unavailable case. I opened YARN-3392 for that. Agreed that we should fix the metrics side later. But even then, changing all default values to -1 is still behavior that is incompatible with already-released versions. So I propose we go with the patch here (after fixing a minor test failure) in 2.7, given this is a blocker, and fix YARN-3392 later in 2.8. Thoughts? > ResourceCalculatorProcessTree#getCpuUsagePercent default return value is > inconsistent with other getters > > > Key: YARN-3304 > URL: https://issues.apache.org/jira/browse/YARN-3304 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Junping Du >Assignee: Karthik Kambatla >Priority: Blocker > Attachments: YARN-3304.patch > > > Per discussions in YARN-3296, getCpuUsagePercent() returns -1 for the > unavailable case while other resource metrics return 0 in the same case, > which is inconsistent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3336) FileSystem memory leak in DelegationTokenRenewer
[ https://issues.apache.org/jira/browse/YARN-3336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376914#comment-14376914 ] Chris Nauroth commented on YARN-3336: - [~zxu], I apologize, but I missed entering your name on the git commit message: {code} commit 6ca1f12024fd7cec7b01df0f039ca59f3f365dc1 Author: cnauroth Date: Mon Mar 23 10:45:50 2015 -0700 YARN-3336. FileSystem memory leak in DelegationTokenRenewer. {code} Unfortunately, this isn't something we can change, because it could mess up the git history. You're still there in CHANGES.txt though, so you get the proper credit for the patch: {code} YARN-3336. FileSystem memory leak in DelegationTokenRenewer. (Zhihai Xu via cnauroth) {code} > FileSystem memory leak in DelegationTokenRenewer > > > Key: YARN-3336 > URL: https://issues.apache.org/jira/browse/YARN-3336 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Fix For: 2.7.0 > > Attachments: YARN-3336.000.patch, YARN-3336.001.patch, > YARN-3336.002.patch, YARN-3336.003.patch, YARN-3336.004.patch > > > FileSystem memory leak in DelegationTokenRenewer. > Every time DelegationTokenRenewer#obtainSystemTokensForUser is called, a new > FileSystem entry will be added to FileSystem#CACHE which will never be > garbage collected. > This is the implementation of obtainSystemTokensForUser: > {code} > protected Token<?>[] obtainSystemTokensForUser(String user, > final Credentials credentials) throws IOException, InterruptedException > { > // Get new hdfs tokens on behalf of this user > UserGroupInformation proxyUser = > UserGroupInformation.createProxyUser(user, > UserGroupInformation.getLoginUser()); > Token<?>[] newTokens = > proxyUser.doAs(new PrivilegedExceptionAction<Token<?>[]>() { > @Override > public Token<?>[] run() throws Exception { > return FileSystem.get(getConfig()).addDelegationTokens( > UserGroupInformation.getLoginUser().getUserName(), credentials); > } > }); > return newTokens; > } > {code} > The memory leak happens when FileSystem.get(getConfig()) is called with a > new proxy user, because createProxyUser always creates a new Subject. > The calling sequence is > FileSystem.get(getConfig())=>FileSystem.get(getDefaultUri(conf), > conf)=>FileSystem.CACHE.get(uri, conf)=>FileSystem.CACHE.getInternal(uri, > conf, key)=>FileSystem.CACHE.map.get(key)=>createFileSystem(uri, conf) > {code} > public static UserGroupInformation createProxyUser(String user, > UserGroupInformation realUser) { > if (user == null || user.isEmpty()) { > throw new IllegalArgumentException("Null user"); > } > if (realUser == null) { > throw new IllegalArgumentException("Null real user"); > } > Subject subject = new Subject(); > Set<Principal> principals = subject.getPrincipals(); > principals.add(new User(user)); > principals.add(new RealUser(realUser)); > UserGroupInformation result = new UserGroupInformation(subject); > result.setAuthenticationMethod(AuthenticationMethod.PROXY); > return result; > } > {code} > FileSystem#Cache#Key.equals will compare the ugi: > {code} > Key(URI uri, Configuration conf, long unique) throws IOException { > scheme = uri.getScheme()==null?"":uri.getScheme().toLowerCase(); > authority = > uri.getAuthority()==null?"":uri.getAuthority().toLowerCase(); > this.unique = unique; > this.ugi = UserGroupInformation.getCurrentUser(); > } > public boolean equals(Object obj) { > if (obj == this) { > return true; > } > if (obj != null && obj instanceof Key) { > Key that = (Key)obj; > return isEqual(this.scheme, that.scheme) > && isEqual(this.authority, that.authority) > && isEqual(this.ugi, that.ugi) > && (this.unique == that.unique); > } > return false; > } > {code} > UserGroupInformation.equals compares the subject by reference: > {code} > public boolean equals(Object o) { > if (o == this) { > return true; > } else if (o == null || getClass() != o.getClass()) { > return false; > } else { > return subject == ((UserGroupInformation) o).subject; > } > } > {code} > So in this case, every time createProxyUser and FileSystem.get(getConfig()) > are called, a new FileSystem will be created and a new entry will be added to > FileSystem.CACHE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
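One common way to plug this kind of leak, sketched under the assumption that the FileSystem is only needed for this one call (the committed patch may do it differently):
{code}
Token<?>[] newTokens =
    proxyUser.doAs(new PrivilegedExceptionAction<Token<?>[]>() {
      @Override
      public Token<?>[] run() throws Exception {
        // FileSystem.newInstance() bypasses FileSystem.CACHE, so the
        // instance can be closed here instead of staying in the cache
        // forever, keyed by the freshly created proxy UGI.
        FileSystem fs = FileSystem.newInstance(getConfig());
        try {
          return fs.addDelegationTokens(
              UserGroupInformation.getLoginUser().getUserName(), credentials);
        } finally {
          fs.close();
        }
      }
    });
{code}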
[jira] [Commented] (YARN-3383) AdminService should use "warn" instead of "info" to log exception when operation fails
[ https://issues.apache.org/jira/browse/YARN-3383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376830#comment-14376830 ] Hadoop QA commented on YARN-3383: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12706717/YARN-3383-032315.patch against trunk revision 972f1f1. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7080//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7080//console This message is automatically generated. > AdminService should use "warn" instead of "info" to log exception when > operation fails > -- > > Key: YARN-3383 > URL: https://issues.apache.org/jira/browse/YARN-3383 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: Wangda Tan >Assignee: Li Lu > Attachments: YARN-3383-032015.patch, YARN-3383-032315.patch > > > Now it uses info: > {code} > private YarnException logAndWrapException(IOException ioe, String user, > String argName, String msg) throws YarnException { > LOG.info("Exception " + msg, ioe); > {code} > But it should use warn instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3021) YARN's delegation-token handling disallows certain trust setups to operate properly over DistCp
[ https://issues.apache.org/jira/browse/YARN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376765#comment-14376765 ] Yongjun Zhang commented on YARN-3021: - Hi [~jianhe], Thanks a lot for the clarification. I did a new rev (06) to address your latest comment, and also tested it against real clusters. Would you please take a further look? Thanks. > YARN's delegation-token handling disallows certain trust setups to operate > properly over DistCp > --- > > Key: YARN-3021 > URL: https://issues.apache.org/jira/browse/YARN-3021 > Project: Hadoop YARN > Issue Type: Bug > Components: security >Affects Versions: 2.3.0 >Reporter: Harsh J >Assignee: Yongjun Zhang > Attachments: YARN-3021.001.patch, YARN-3021.002.patch, > YARN-3021.003.patch, YARN-3021.004.patch, YARN-3021.005.patch, > YARN-3021.006.patch, YARN-3021.patch > > > Consider this scenario of 3 realms: A, B and COMMON, where A trusts COMMON, > and B trusts COMMON (a one-way trust in both cases), and both A and B run HDFS + YARN > clusters. > Now if one logs in with a COMMON credential, and runs a job on A's YARN that > needs to access B's HDFS (such as a DistCp), the operation fails in the RM, > as it attempts a renewDelegationToken(…) synchronously during application > submission (to validate the managed token before it adds it to a scheduler > for automatic renewal). The call obviously fails because the B realm will not trust > A's credentials (here, the RM's principal is the renewer). > In the 1.x JobTracker the same call is present, but it is done asynchronously, > and once the renewal attempt failed we simply ceased to schedule any further > renewal attempts, rather than failing the job immediately. > We should change the logic such that we attempt the renewal but go easy on > the failure and skip only the scheduling, rather than bubbling an error back > to the client and failing the app submission. This way the old behaviour is > retained. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3021) YARN's delegation-token handling disallows certain trust setups to operate properly over DistCp
[ https://issues.apache.org/jira/browse/YARN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated YARN-3021: Attachment: YARN-3021.006.patch > YARN's delegation-token handling disallows certain trust setups to operate > properly over DistCp > --- > > Key: YARN-3021 > URL: https://issues.apache.org/jira/browse/YARN-3021 > Project: Hadoop YARN > Issue Type: Bug > Components: security >Affects Versions: 2.3.0 >Reporter: Harsh J >Assignee: Yongjun Zhang > Attachments: YARN-3021.001.patch, YARN-3021.002.patch, > YARN-3021.003.patch, YARN-3021.004.patch, YARN-3021.005.patch, > YARN-3021.006.patch, YARN-3021.patch > > > Consider this scenario of 3 realms: A, B and COMMON, where A trusts COMMON, > and B trusts COMMON (a one-way trust in both cases), and both A and B run HDFS + YARN > clusters. > Now if one logs in with a COMMON credential, and runs a job on A's YARN that > needs to access B's HDFS (such as a DistCp), the operation fails in the RM, > as it attempts a renewDelegationToken(…) synchronously during application > submission (to validate the managed token before it adds it to a scheduler > for automatic renewal). The call obviously fails because the B realm will not trust > A's credentials (here, the RM's principal is the renewer). > In the 1.x JobTracker the same call is present, but it is done asynchronously, > and once the renewal attempt failed we simply ceased to schedule any further > renewal attempts, rather than failing the job immediately. > We should change the logic such that we attempt the renewal but go easy on > the failure and skip only the scheduling, rather than bubbling an error back > to the client and failing the app submission. This way the old behaviour is > retained. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3391) Clearly define flow ID/ flow run / flow version in API and storage
[ https://issues.apache.org/jira/browse/YARN-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated YARN-3391: -- Description: To continue the discussion in YARN-3040, let's figure out the best way to describe the flow. Some key issues that we need to conclude on: - How do we include the flow version in the context so that it gets passed into the collector and to the storage eventually? - Should the flow run id be a number as opposed to a generic string? - Default behavior for the flow run id if it is missing (i.e. client did not set it) - How do we handle flow attributes in the case of nested levels of flows? was: To continue the discussion in YARN-3040, let's figure out the best way to describe the flow. Some key issues that we need to conclude on: - How do we include the flow version in the context so that it gets passed into the collector and to the storage eventually? - Should the flow run id be a number as opposed to a generic string? - Default behavior for the flow run id if it is missing (i.e. client did not set it) > Clearly define flow ID/ flow run / flow version in API and storage > -- > > Key: YARN-3391 > URL: https://issues.apache.org/jira/browse/YARN-3391 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Zhijie Shen > > To continue the discussion in YARN-3040, let's figure out the best way to > describe the flow. > Some key issues that we need to conclude on: > - How do we include the flow version in the context so that it gets passed > into the collector and to the storage eventually? > - Should the flow run id be a number as opposed to a generic string? > - Default behavior for the flow run id if it is missing (i.e. client did not > set it) > - How do we handle flow attributes in the case of nested levels of flows? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3391) Clearly define flow ID/ flow run / flow version in API and storage
[ https://issues.apache.org/jira/browse/YARN-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated YARN-3391: -- Description: To continue the discussion in YARN-3040, let's figure out the best way to describe the flow. Some key issues that we need to conclude on: - How do we include the flow version in the context so that it gets passed into the collector and to the storage eventually? - Should the flow run id be a number as opposed to a generic string? - Default behavior for the flow run id if it is missing (i.e. client did not set it) was:To continue the discussion in YARN-3040, let's figure out the best way to describe the flow. > Clearly define flow ID/ flow run / flow version in API and storage > -- > > Key: YARN-3391 > URL: https://issues.apache.org/jira/browse/YARN-3391 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Zhijie Shen > > To continue the discussion in YARN-3040, let's figure out the best way to > describe the flow. > Some key issues that we need to conclude on: > - How do we include the flow version in the context so that it gets passed > into the collector and to the storage eventually? > - Should the flow run id be a number as opposed to a generic string? > - Default behavior for the flow run id if it is missing (i.e. client did not > set it) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376749#comment-14376749 ] Sangjin Lee commented on YARN-3040: --- Thanks [~zjshen] for the updated patch! I am comfortable with continuing to work on the flow-related items in the separate JIRA. I'll jot down the key points in that JIRA shortly. I went over the latest patch, and overall it looks good. I do have a few comments: (AppLevelTimelineCollector.java) {code} protected void serviceInit(Configuration conf) throws Exception { context.setClusterId(conf.get(YarnConfiguration.RM_CLUSTER_ID, YarnConfiguration.DEFAULT_RM_CLUSTER_ID)); context.setUserId(UserGroupInformation.getCurrentUser().getShortUserName()); context.setFlowId(TimelineUtils.generateDefaultFlowIdBasedOnAppId(appId)); context.setFlowRunId("0"); context.setAppId(appId.toString()); {code} I'm not sure about these set calls. Are they here just to initialize the context to default values? For example, UGI.getCurrentUser().getShortUserName() would return the user under which the daemon was started (whether it is the NM or a standalone daemon) in the case of a per-node daemon, which is highly likely to be incorrect. Do we need to bother setting default values if they are going to be incorrect anyway, for example, for the user? At minimum, it would be helpful to have a comment here explaining why this is being done. (AMLauncher.java) - Do we need to be case-insensitive here? I think we can be strict about the tag names? - You might want to be a bit defensive about the tag not carrying any value (e.g. "TIMELINE_FLOW_ID_TAG:"). If the value is empty, tag.substring() would throw an IndexOutOfBoundsException (a defensive version is sketched below). > [Data Model] Make putEntities operation be aware of the app's context > - > > Key: YARN-3040 > URL: https://issues.apache.org/jira/browse/YARN-3040 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Zhijie Shen > Attachments: YARN-3040.1.patch, YARN-3040.2.patch, YARN-3040.3.patch > > > Per design in YARN-2928, implement client-side API for handling *flows*. > Frameworks should be able to define and pass in all attributes of flows and > flow runs to YARN, and they should be passed into ATS writers. > YARN tags were discussed as a way to handle this piece of information. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
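A sketch of the defensive tag parsing being asked for (the tag name follows the comment above; the helper and constant are hypothetical):
{code}
// Hypothetical constant; the real tag name is defined by the YARN-3040 patch.
static final String FLOW_ID_TAG_PREFIX = "TIMELINE_FLOW_ID_TAG:";

static String parseFlowIdTag(String tag) {
  // Be strict about case and defensive about a missing value, so an empty
  // "TIMELINE_FLOW_ID_TAG:" cannot trigger IndexOutOfBoundsException.
  if (tag.startsWith(FLOW_ID_TAG_PREFIX)
      && tag.length() > FLOW_ID_TAG_PREFIX.length()) {
    return tag.substring(FLOW_ID_TAG_PREFIX.length());
  }
  return null; // not a flow ID tag, or the value is empty
}
{code}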
[jira] [Commented] (YARN-3241) FairScheduler handles "invalid" queue names inconsistently
[ https://issues.apache.org/jira/browse/YARN-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376737#comment-14376737 ] zhihai xu commented on YARN-3241: - Thanks [~kasha] for the valuable feedback and for committing the patch! > FairScheduler handles "invalid" queue names inconsistently > -- > > Key: YARN-3241 > URL: https://issues.apache.org/jira/browse/YARN-3241 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: zhihai xu >Assignee: zhihai xu > Fix For: 2.8.0 > > Attachments: YARN-3241.000.patch, YARN-3241.001.patch, > YARN-3241.002.patch > > > Leading spaces, trailing spaces and empty sub queue names may cause a > MetricsException ("Metrics source XXX already exists!") when adding an application to > the FairScheduler. > The reason is that QueueMetrics parses the queue name differently from the > QueueManager. > QueueMetrics uses Q_SPLITTER to parse the queue name: it removes leading and > trailing spaces in sub queue names, and it also removes empty sub queue > names. > {code} > static final Splitter Q_SPLITTER = > Splitter.on('.').omitEmptyStrings().trimResults(); > {code} > But QueueManager won't remove leading spaces, trailing spaces or empty sub > queue names. > This causes FSQueue and FSQueueMetrics to fall out of sync: > QueueManager considers the two queue names different, so it tries to > create a new queue, > while FSQueueMetrics treats them as the same queue, which raises the > "Metrics source XXX already exists!" MetricsException. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
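A quick illustration of the mismatch described above (standard Guava Splitter behaviour; the queue names are illustrative):
{code}
import com.google.common.base.Splitter;
import com.google.common.collect.Lists;

public class QueueNameMismatch {
  public static void main(String[] args) {
    Splitter q = Splitter.on('.').omitEmptyStrings().trimResults();
    // Both spellings collapse to the same segments for metrics purposes...
    System.out.println(Lists.newArrayList(q.split("root. queueA"))); // [root, queueA]
    System.out.println(Lists.newArrayList(q.split("root.queueA")));  // [root, queueA]
    // ...but QueueManager compares the raw strings and sees two distinct
    // queues, so both end up registering the metrics source "root.queueA",
    // triggering "Metrics source XXX already exists!".
  }
}
{code}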
[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle
[ https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376719#comment-14376719 ] Li Lu commented on YARN-3047: - Hi [~varun_saxena], thanks for the new patch. Could you please elaborate on exactly which comments will be addressed in YARN-3051? Thanks! BTW, in the 003 patch I can still see TimelineEvents.java. Do we still need that? > [Data Serving] Set up ATS reader with basic request serving structure and > lifecycle > --- > > Key: YARN-3047 > URL: https://issues.apache.org/jira/browse/YARN-3047 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: YARN-3047.001.patch, YARN-3047.003.patch, > YARN-3047.02.patch > > > Per design in YARN-2938, set up the ATS reader and implement its basic > structure as a service, including lifecycle management, request serving, and > so on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3393) Getting application(s) goes wrong when app finishes before starting the attempt
Zhijie Shen created YARN-3393: - Summary: Getting application(s) goes wrong when app finishes before starting the attempt Key: YARN-3393 URL: https://issues.apache.org/jira/browse/YARN-3393 Project: Hadoop YARN Issue Type: Bug Components: timelineserver Reporter: Zhijie Shen Assignee: Zhijie Shen Priority: Critical When generating the app report in ApplicationHistoryManagerOnTimelineStore, the code checks whether appAttempt == null: {code} ApplicationAttemptReport appAttempt = getApplicationAttempt(app.appReport.getCurrentApplicationAttemptId()); if (appAttempt != null) { app.appReport.setHost(appAttempt.getHost()); app.appReport.setRpcPort(appAttempt.getRpcPort()); app.appReport.setTrackingUrl(appAttempt.getTrackingUrl()); app.appReport.setOriginalTrackingUrl(appAttempt.getOriginalTrackingUrl()); } {code} However, {{getApplicationAttempt}} doesn't return null but throws ApplicationAttemptNotFoundException: {code} if (entity == null) { throw new ApplicationAttemptNotFoundException( "The entity for application attempt " + appAttemptId + " doesn't exist in the timeline store"); } else { return convertToApplicationAttemptReport(entity); } {code} The two pieces of code aren't coupled well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
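A minimal sketch of reconciling the two sides, assuming the fix goes in the direction of catching the exception rather than expecting null:
{code}
ApplicationAttemptReport appAttempt = null;
try {
  appAttempt = getApplicationAttempt(
      app.appReport.getCurrentApplicationAttemptId());
} catch (ApplicationAttemptNotFoundException e) {
  // The app finished before any attempt reached the timeline store;
  // leave the attempt-derived fields of the report unset.
}
if (appAttempt != null) {
  app.appReport.setHost(appAttempt.getHost());
  app.appReport.setRpcPort(appAttempt.getRpcPort());
  app.appReport.setTrackingUrl(appAttempt.getTrackingUrl());
  app.appReport.setOriginalTrackingUrl(appAttempt.getOriginalTrackingUrl());
}
{code}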
[jira] [Assigned] (YARN-2605) [RM HA] Rest api endpoints doing redirect incorrectly
[ https://issues.apache.org/jira/browse/YARN-2605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot reassigned YARN-2605: --- Assignee: Anubhav Dhoot > [RM HA] Rest api endpoints doing redirect incorrectly > - > > Key: YARN-2605 > URL: https://issues.apache.org/jira/browse/YARN-2605 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.4.0 >Reporter: bc Wong >Assignee: Anubhav Dhoot > Labels: newbie > > The standby RM's webui tries to do a redirect via meta-refresh. That is fine > for pages designed to be viewed by web browsers, but the API endpoints > shouldn't do that. Most programmatic HTTP clients do not do meta-refresh. I'd > suggest HTTP 303, or returning a well-defined error message (JSON or XML) > stating the standby status with a link to the active RM. > The standby RM is returning this today: > {noformat} > $ curl -i http://bcsec-1.ent.cloudera.com:8088/ws/v1/cluster/metrics > HTTP/1.1 200 OK > Cache-Control: no-cache > Expires: Thu, 25 Sep 2014 18:34:53 GMT > Date: Thu, 25 Sep 2014 18:34:53 GMT > Pragma: no-cache > Expires: Thu, 25 Sep 2014 18:34:53 GMT > Date: Thu, 25 Sep 2014 18:34:53 GMT > Pragma: no-cache > Content-Type: text/plain; charset=UTF-8 > Refresh: 3; url=http://bcsec-2.ent.cloudera.com:8088/ws/v1/cluster/metrics > Content-Length: 117 > Server: Jetty(6.1.26) > This is standby RM. Redirecting to the current active RM: > http://bcsec-2.ent.cloudera.com:8088/ws/v1/cluster/metrics > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
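A sketch of what the REST endpoints could do instead (illustrative servlet-level code, not the actual RM webapp wiring):
{code}
import java.io.IOException;
import javax.servlet.http.HttpServletResponse;

// Answer API clients on the standby RM with a real HTTP 303 redirect
// instead of an HTML meta-refresh, so programmatic clients (e.g.
// curl -L, HttpClient) can follow it automatically.
void redirectToActiveRM(HttpServletResponse response, String activeUrl)
    throws IOException {
  response.setStatus(HttpServletResponse.SC_SEE_OTHER); // 303
  response.setHeader("Location", activeUrl);
}
{code}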
[jira] [Commented] (YARN-3304) ResourceCalculatorProcessTree#getCpuUsagePercent default return value is inconsistent with other getters
[ https://issues.apache.org/jira/browse/YARN-3304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376695#comment-14376695 ] Anubhav Dhoot commented on YARN-3304: - Hi [~djp] [~vinodkv], If we use a default of zero we cannot distinguish when it's unavailable versus zero usage. That will make the future "track the improvement to handle unavailable case later" nearly impossible to do. I propose we make all the defaults consistently -1. I can fix the metrics as well to use this to implement tracking of the unavailable case. I opened YARN-3392 for that. > ResourceCalculatorProcessTree#getCpuUsagePercent default return value is > inconsistent with other getters > > > Key: YARN-3304 > URL: https://issues.apache.org/jira/browse/YARN-3304 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Junping Du >Assignee: Karthik Kambatla >Priority: Blocker > Attachments: YARN-3304.patch > > > Per discussions in YARN-3296, getCpuUsagePercent() returns -1 for the > unavailable case while other resource metrics return 0 in the same case, > which is inconsistent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
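What "consistently -1" could look like on the base class, sketched only as an illustration (the shared constant is hypothetical here):
{code}
public abstract class ResourceCalculatorProcessTree {
  // One shared sentinel for "could not be measured", instead of some
  // getters defaulting to 0 and others to -1.
  public static final int UNAVAILABLE = -1;

  public long getCumulativeRssmem() {
    return UNAVAILABLE;
  }

  public float getCpuUsagePercent() {
    return UNAVAILABLE;
  }
}
{code}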
[jira] [Created] (YARN-3392) Change NodeManager metrics to not populate resource usage metrics if they are unavailable
Anubhav Dhoot created YARN-3392: --- Summary: Change NodeManager metrics to not populate resource usage metrics if they are unavailable Key: YARN-3392 URL: https://issues.apache.org/jira/browse/YARN-3392 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3390) RMTimelineCollector should have the context info of each app
[ https://issues.apache.org/jira/browse/YARN-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376684#comment-14376684 ] Zhijie Shen commented on YARN-3390: --- It shouldn't. Storage layer implementations only depend on the writer interface, which is covered in YARN-3040. > RMTimelineCollector should have the context info of each app > > > Key: YARN-3390 > URL: https://issues.apache.org/jira/browse/YARN-3390 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Zhijie Shen > > RMTimelineCollector should have the context info of each app whose entities > have been put -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2868) FairScheduler: Metric for latency to allocate first container for an application
[ https://issues.apache.org/jira/browse/YARN-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376672#comment-14376672 ] Hudson commented on YARN-2868: -- FAILURE: Integrated in Hadoop-trunk-Commit #7407 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7407/]) YARN-2868. FairScheduler: Metric for latency to allocate first container for an application. (Ray Chiang via kasha) (kasha: rev 972f1f1ab94a26ec446a272ad030fe13f03ed442) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/QueueMetrics.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java * hadoop-yarn-project/CHANGES.txt > FairScheduler: Metric for latency to allocate first container for an > application > > > Key: YARN-2868 > URL: https://issues.apache.org/jira/browse/YARN-2868 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Ray Chiang >Assignee: Ray Chiang > Labels: metrics, supportability > Fix For: 2.8.0 > > Attachments: YARN-2868-01.patch, YARN-2868.002.patch, > YARN-2868.003.patch, YARN-2868.004.patch, YARN-2868.005.patch, > YARN-2868.006.patch, YARN-2868.007.patch, YARN-2868.008.patch, > YARN-2868.009.patch, YARN-2868.010.patch, YARN-2868.011.patch, > YARN-2868.012.patch > > > Add a metric to measure the latency between "starting container allocation" > and "first container actually allocated". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3390) RMTimelineCollector should have the context info of each app
[ https://issues.apache.org/jira/browse/YARN-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376670#comment-14376670 ] Li Lu commented on YARN-3390: - Hi [~zjshen], could you please confirm whether this JIRA will also block all storage layer implementations, or whether we can proceed after YARN-3040 is in? Thanks! > RMTimelineCollector should have the context info of each app > > > Key: YARN-3390 > URL: https://issues.apache.org/jira/browse/YARN-3390 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Zhijie Shen > > RMTimelineCollector should have the context info of each app whose entities > have been put -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-3040: -- Attachment: YARN-3040.3.patch Uploaded a new patch to address the comments so far. The notable changes in this patch are removing the timestamp suffix and adding a default for RM_CLUSTER_ID, so that the ID won't change across RM restarts or failover. > [Data Model] Make putEntities operation be aware of the app's context > - > > Key: YARN-3040 > URL: https://issues.apache.org/jira/browse/YARN-3040 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Zhijie Shen > Attachments: YARN-3040.1.patch, YARN-3040.2.patch, YARN-3040.3.patch > > > Per design in YARN-2928, implement client-side API for handling *flows*. > Frameworks should be able to define and pass in all attributes of flows and > flow runs to YARN, and they should be passed into ATS writers. > YARN tags were discussed as a way to handle this piece of information. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3386) Cgroups feature should work with default hierarchy settings of CentOS 7
[ https://issues.apache.org/jira/browse/YARN-3386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376645#comment-14376645 ] Karthik Kambatla commented on YARN-3386: YARN-2194 seems to imply there are more changes required for cgroups to work with RHEL/CentOS 7. Should this be marked a duplicate of the other? > Cgroups feature should work with default hierarchy settings of CentOS 7 > --- > > Key: YARN-3386 > URL: https://issues.apache.org/jira/browse/YARN-3386 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki > > The path found by CgroupsLCEResourcesHandler#parseMtab contains a comma and > results in failure of the container-executor. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3383) AdminService should use "warn" instead of "info" to log exception when operation fails
[ https://issues.apache.org/jira/browse/YARN-3383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu updated YARN-3383: Attachment: YARN-3383-032315.patch Rebased the patch against the latest trunk. > AdminService should use "warn" instead of "info" to log exception when > operation fails > -- > > Key: YARN-3383 > URL: https://issues.apache.org/jira/browse/YARN-3383 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: Wangda Tan >Assignee: Li Lu > Attachments: YARN-3383-032015.patch, YARN-3383-032315.patch > > > Now it uses info: > {code} > private YarnException logAndWrapException(IOException ioe, String user, > String argName, String msg) throws YarnException { > LOG.info("Exception " + msg, ioe); > {code} > But it should use warn instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
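The one-word fix being discussed, sketched in place (the method body beyond the log line is elided in the JIRA description, so the return shown here is illustrative):
{code}
private YarnException logAndWrapException(IOException ioe, String user,
    String argName, String msg) throws YarnException {
  // A failed admin operation deserves WARN, not routine INFO.
  LOG.warn("Exception " + msg, ioe);
  return RPCUtil.getRemoteException(ioe);
}
{code}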
[jira] [Commented] (YARN-3387) container complete message couldn't pass to am if am restarted and rm changed
[ https://issues.apache.org/jira/browse/YARN-3387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376640#comment-14376640 ] Karthik Kambatla commented on YARN-3387: Does this imply our work-preserving AM restart is broken on an RM failover? > container complete message couldn't pass to am if am restarted and rm changed > - > > Key: YARN-3387 > URL: https://issues.apache.org/jira/browse/YARN-3387 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: sandflee >Priority: Critical > > Suppose work-preserving AM restart and RM HA are enabled. > The container-complete message is passed to appAttempt.justFinishedContainers in > the RM. In the normal situation, all attempts of one app share the same > justFinishedContainers, but when the RM changes, every attempt gets its own > justFinishedContainers. So in the situation below, the container-complete message > can't be passed to the AM: > 1. the AM restarts > 2. the RM changes > 3. a container launched by the first AM completes > The container-complete message will be passed to appAttempt1, not appAttempt2, but > the AM pulls finished containers from appAttempt2 (the currentAppAttempt). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
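One possible direction, purely as an illustration (not a committed fix): after failover, let the current attempt absorb completed-container records recorded against earlier attempts, so the AM's allocate() heartbeat still receives them.
{code}
// Illustrative sketch only: merge completed-container records left on
// earlier attempts into the current attempt after RM failover, since the
// AM polls only its current attempt for finished containers.
RMAppAttempt current = app.getCurrentAppAttempt();
for (RMAppAttempt attempt : app.getAppAttempts().values()) {
  if (attempt != current) {
    current.getJustFinishedContainers()
        .addAll(attempt.getJustFinishedContainers());
    attempt.getJustFinishedContainers().clear();
  }
}
{code}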
[jira] [Updated] (YARN-3387) container complete message couldn't pass to am if am restarted and rm changed
[ https://issues.apache.org/jira/browse/YARN-3387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-3387: --- Priority: Critical (was: Major) Target Version/s: 2.7.0 > container complete message couldn't pass to am if am restarted and rm changed > - > > Key: YARN-3387 > URL: https://issues.apache.org/jira/browse/YARN-3387 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: sandflee >Priority: Critical > > Suppose work-preserving AM restart and RM HA are enabled. > The container-complete message is passed to appAttempt.justFinishedContainers in > the RM. In the normal situation, all attempts of one app share the same > justFinishedContainers, but when the RM changes, every attempt gets its own > justFinishedContainers. So in the situation below, the container-complete message > can't be passed to the AM: > 1. the AM restarts > 2. the RM changes > 3. a container launched by the first AM completes > The container-complete message will be passed to appAttempt1, not appAttempt2, but > the AM pulls finished containers from appAttempt2 (the currentAppAttempt). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2868) FairScheduler: Metric for latency to allocate first container for an application
[ https://issues.apache.org/jira/browse/YARN-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-2868: --- Summary: FairScheduler: Metric for latency to allocate first container for an application (was: Add metric for initial container launch time to FairScheduler) > FairScheduler: Metric for latency to allocate first container for an > application > > > Key: YARN-2868 > URL: https://issues.apache.org/jira/browse/YARN-2868 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Ray Chiang >Assignee: Ray Chiang > Labels: metrics, supportability > Attachments: YARN-2868-01.patch, YARN-2868.002.patch, > YARN-2868.003.patch, YARN-2868.004.patch, YARN-2868.005.patch, > YARN-2868.006.patch, YARN-2868.007.patch, YARN-2868.008.patch, > YARN-2868.009.patch, YARN-2868.010.patch, YARN-2868.011.patch, > YARN-2868.012.patch > > > Add a metric to measure the latency between "starting container allocation" > and "first container actually allocated". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2868) Add metric for initial container launch time to FairScheduler
[ https://issues.apache.org/jira/browse/YARN-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376633#comment-14376633 ] Karthik Kambatla commented on YARN-2868: +1, checking this in. > Add metric for initial container launch time to FairScheduler > - > > Key: YARN-2868 > URL: https://issues.apache.org/jira/browse/YARN-2868 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Ray Chiang >Assignee: Ray Chiang > Labels: metrics, supportability > Attachments: YARN-2868-01.patch, YARN-2868.002.patch, > YARN-2868.003.patch, YARN-2868.004.patch, YARN-2868.005.patch, > YARN-2868.006.patch, YARN-2868.007.patch, YARN-2868.008.patch, > YARN-2868.009.patch, YARN-2868.010.patch, YARN-2868.011.patch, > YARN-2868.012.patch > > > Add a metric to measure the latency between "starting container allocation" > and "first container actually allocated". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3383) AdminService should use "warn" instead of "info" to log exception when operation fails
[ https://issues.apache.org/jira/browse/YARN-3383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376629#comment-14376629 ] Hadoop QA commented on YARN-3383: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12706096/YARN-3383-032015.patch against trunk revision 2bc097c. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7079//console This message is automatically generated. > AdminService should use "warn" instead of "info" to log exception when > operation fails > -- > > Key: YARN-3383 > URL: https://issues.apache.org/jira/browse/YARN-3383 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: Wangda Tan >Assignee: Li Lu > Attachments: YARN-3383-032015.patch > > > Now it uses info: > {code} > private YarnException logAndWrapException(IOException ioe, String user, > String argName, String msg) throws YarnException { > LOG.info("Exception " + msg, ioe); > {code} > But it should use warn instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)
[ https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376605#comment-14376605 ] Wangda Tan commented on YARN-2495: -- Hmm.. {{StringArrayProto.stringElement -> elements}} is still not changed in the latest patch; could you take a look again? I meant to remove the "string" prefix, since StringArrayProto already indicates that. Beyond that, the patch LGTM. > Allow admin specify labels from each NM (Distributed configuration) > --- > > Key: YARN-2495 > URL: https://issues.apache.org/jira/browse/YARN-2495 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Wangda Tan >Assignee: Naganarasimha G R > Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, > YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, > YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, > YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, > YARN-2495.20150305-1.patch, YARN-2495.20150309-1.patch, > YARN-2495.20150318-1.patch, YARN-2495.20150320-1.patch, > YARN-2495.20150321-1.patch, YARN-2495_20141022.1.patch > > > The target of this JIRA is to allow admins to specify labels on each NM. This covers: > - Users can set labels on each NM (by setting yarn-site.xml (YARN-2923) or > using the script suggested by [~aw] (YARN-2729)) > - The NM will send labels to the RM via the ResourceTracker API > - The RM will set labels in the NodeLabelManager when NMs register/update labels -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376602#comment-14376602 ] Junping Du commented on YARN-3040: -- Hi [~zjshen], thanks for the patch! I am still reviewing it but have some quick comments so far: {code} + public static String generateDefaultClusterIdBasedOnAppId( + ApplicationId appId) { +return "cluster_" + appId.getClusterTimestamp(); + } {code} It seems the appId's ClusterTimestamp comes from the RM and changes every time the RM restarts. I think we need a ClusterID here that stays consistent across RM restarts, don't we? Otherwise applications submitted to the same cluster could get different ClusterIDs just because the RM failed over, which is not what users would expect. I suggest adding a configuration for the user to supply a specific ClusterID, with the generated (and variable) value only as a default for test purposes. {code} + rpc getTimelienCollectorContext (GetTimelineCollectorContextRequestProto) returns (GetTimelineCollectorContextResponseProto); {code} One typo here and in other places: "Timelien" should be "Timeline". {code} -import java.util.ArrayList; -import java.util.HashMap; -import java.util.List; -import java.util.Map; -import java.util.Vector; +import java.util.*; {code} We shouldn't do this; it could load unnecessary classes. {code} + * The aggregator needs to get the context information including user, flow {code} aggregator => collector > [Data Model] Make putEntities operation be aware of the app's context > - > > Key: YARN-3040 > URL: https://issues.apache.org/jira/browse/YARN-3040 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Zhijie Shen > Attachments: YARN-3040.1.patch, YARN-3040.2.patch > > > Per design in YARN-2928, implement client-side API for handling *flows*. > Frameworks should be able to define and pass in all attributes of flows and > flow runs to YARN, and they should be passed into ATS writers. > YARN tags were discussed as a way to handle this piece of information. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
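A sketch of the configurable ClusterID being suggested (using the existing {{yarn.resourcemanager.cluster-id}} key as an example; the actual patch may choose a different key):
{code}
// Prefer an admin-configured, restart-stable cluster ID; fall back to the
// appId's cluster timestamp (which changes on RM restart) only as a default.
public static String getClusterId(Configuration conf, ApplicationId appId) {
  String clusterId = conf.get(YarnConfiguration.RM_CLUSTER_ID);
  if (clusterId != null && !clusterId.isEmpty()) {
    return clusterId;
  }
  return "cluster_" + appId.getClusterTimestamp();
}
{code}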
[jira] [Commented] (YARN-3241) FairScheduler handles "invalid" queue names inconsistently
[ https://issues.apache.org/jira/browse/YARN-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376577#comment-14376577 ] Hudson commented on YARN-3241: -- FAILURE: Integrated in Hadoop-trunk-Commit #7406 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7406/]) YARN-3241. FairScheduler handles invalid queue names inconsistently. (Zhihai Xu via kasha) (kasha: rev 2bc097cd14692e6ceb06bff959f28531534eb307) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueueManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestAllocationFileLoaderService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/InvalidQueueNameException.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestQueueManager.java > FairScheduler handles "invalid" queue names inconsistently > -- > > Key: YARN-3241 > URL: https://issues.apache.org/jira/browse/YARN-3241 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: zhihai xu >Assignee: zhihai xu > Attachments: YARN-3241.000.patch, YARN-3241.001.patch, > YARN-3241.002.patch > > > Leading spaces, trailing spaces and empty sub queue names may cause a > MetricsException ("Metrics source XXX already exists!") when adding an application to > the FairScheduler. > The reason is that QueueMetrics parses the queue name differently from the > QueueManager. > QueueMetrics uses Q_SPLITTER to parse the queue name: it removes leading and > trailing spaces in sub queue names, and it also removes empty sub queue > names. > {code} > static final Splitter Q_SPLITTER = > Splitter.on('.').omitEmptyStrings().trimResults(); > {code} > But QueueManager won't remove leading spaces, trailing spaces or empty sub > queue names. > This causes FSQueue and FSQueueMetrics to fall out of sync: > QueueManager considers the two queue names different, so it tries to > create a new queue, > while FSQueueMetrics treats them as the same queue, which raises the > "Metrics source XXX already exists!" MetricsException. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3034) [Collector wireup] Implement RM starting its timeline collector
[ https://issues.apache.org/jira/browse/YARN-3034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376575#comment-14376575 ] Zhijie Shen commented on YARN-3034: --- bq. the only way RMTimelineCollector can be invoked is through SystemMetricsPublisher's (SMP) public methods Oh, I probably misunderstood your intention. I used to think this was the way you wanted to put the data into RMTimelineCollector. In that case, we could put RMTimelineCollector inside SystemMetricsPublisher, and wherever we invoke the timeline client, we call RMTimelineCollector for v2. According to this comment, it seems that you want to create a separate stack to put entities into RMTimelineCollector, right? If so, the current design makes sense. bq. So we need a configuration on the NM side too, but we cannot use the existing one I meant we keep {{yarn.resourcemanager.system-metrics-publisher.enabled}} for the v1 SystemMetricsPublisher. For v2, both the RM and the NM read {{yarn.system-metrics-publisher.enabled}}? Then there is no need for a v1/v2 flag? > [Collector wireup] Implement RM starting its timeline collector > --- > > Key: YARN-3034 > URL: https://issues.apache.org/jira/browse/YARN-3034 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Naganarasimha G R > Attachments: YARN-3034-20150312-1.patch, YARN-3034.20150205-1.patch, > YARN-3034.20150316-1.patch, YARN-3034.20150318-1.patch, > YARN-3034.20150320-1.patch > > > Per design in YARN-2928, implement resource managers starting their own ATS > writers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3241) FairScheduler handles "invalid" queue names inconsistently
[ https://issues.apache.org/jira/browse/YARN-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-3241: --- Summary: FairScheduler handles "invalid" queue names inconsistently (was: Leading space, trailing space and empty sub queue name may cause MetricsException for fair scheduler) > FairScheduler handles "invalid" queue names inconsistently > -- > > Key: YARN-3241 > URL: https://issues.apache.org/jira/browse/YARN-3241 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: zhihai xu >Assignee: zhihai xu > Attachments: YARN-3241.000.patch, YARN-3241.001.patch, > YARN-3241.002.patch > > > Leading spaces, trailing spaces and empty sub queue names may cause a > MetricsException ("Metrics source XXX already exists!") when adding an application to > the FairScheduler. > The reason is that QueueMetrics parses the queue name differently from the > QueueManager. > QueueMetrics uses Q_SPLITTER to parse the queue name: it removes leading and > trailing spaces in sub queue names, and it also removes empty sub queue > names. > {code} > static final Splitter Q_SPLITTER = > Splitter.on('.').omitEmptyStrings().trimResults(); > {code} > But QueueManager won't remove leading spaces, trailing spaces or empty sub > queue names. > This causes FSQueue and FSQueueMetrics to fall out of sync: > QueueManager considers the two queue names different, so it tries to > create a new queue, > while FSQueueMetrics treats them as the same queue, which raises the > "Metrics source XXX already exists!" MetricsException. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3362) Add node label usage in RM CapacityScheduler web UI
[ https://issues.apache.org/jira/browse/YARN-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376563#comment-14376563 ] Naganarasimha G R commented on YARN-3362: - Thanks for the feedback [~leftnoteasy], bq. different labels under same queue can have different user-limit/capacity/maximum-capacity/max-am-resource, etc. If this is the case then the approach you specified makes sense, but by "can" do you mean it is not there currently and can come in the future? Beyond the repeated info, the other drawback I can see is this: suppose the user limit is not reached for a particular label, but the user has reached their limit at the overall queue level; it will be difficult for the user to go through all the labels and find out whether they have reached the queue limit. Correct me if my understanding of this is wrong. > Add node label usage in RM CapacityScheduler web UI > --- > > Key: YARN-3362 > URL: https://issues.apache.org/jira/browse/YARN-3362 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, resourcemanager, webapp >Reporter: Wangda Tan >Assignee: Naganarasimha G R > > We don't have node label usage in the RM CapacityScheduler web UI now; without > this, it is hard for users to understand what happened to nodes that have > labels assigned to them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3241) Leading space, trailing space and empty sub queue name may cause MetricsException for fair scheduler
[ https://issues.apache.org/jira/browse/YARN-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376559#comment-14376559 ] Karthik Kambatla commented on YARN-3241: +1. Checking this in. > Leading space, trailing space and empty sub queue name may cause > MetricsException for fair scheduler > > > Key: YARN-3241 > URL: https://issues.apache.org/jira/browse/YARN-3241 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: zhihai xu >Assignee: zhihai xu > Attachments: YARN-3241.000.patch, YARN-3241.001.patch, > YARN-3241.002.patch > > > Leading spaces, trailing spaces and empty sub queue names may cause a > MetricsException ("Metrics source XXX already exists!") when adding an application to > the FairScheduler. > The reason is that QueueMetrics parses the queue name differently from the > QueueManager. > QueueMetrics uses Q_SPLITTER to parse the queue name: it removes leading and > trailing spaces in sub queue names, and it also removes empty sub queue > names. > {code} > static final Splitter Q_SPLITTER = > Splitter.on('.').omitEmptyStrings().trimResults(); > {code} > But QueueManager won't remove leading spaces, trailing spaces or empty sub > queue names. > This causes FSQueue and FSQueueMetrics to fall out of sync: > QueueManager considers the two queue names different, so it tries to > create a new queue, > while FSQueueMetrics treats them as the same queue, which raises the > "Metrics source XXX already exists!" MetricsException. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3024) LocalizerRunner should give DIE action when all resources are localized
[ https://issues.apache.org/jira/browse/YARN-3024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376556#comment-14376556 ] Karthik Kambatla commented on YARN-3024: [~chengbing.liu] - thanks for the clarifications. Makes sense. For the TODOs, it would be nice to have follow-up JIRAs. If it is not too much trouble, can you create them so interested contributors can follow up? > LocalizerRunner should give DIE action when all resources are localized > --- > > Key: YARN-3024 > URL: https://issues.apache.org/jira/browse/YARN-3024 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.6.0 >Reporter: Chengbing Liu >Assignee: Chengbing Liu > Fix For: 2.7.0 > > Attachments: YARN-3024.01.patch, YARN-3024.02.patch, > YARN-3024.03.patch, YARN-3024.04.patch > > > We have observed that {{LocalizerRunner}} always gives a LIVE action at the > end of the localization process. > The problem is that {{findNextResource()}} can return null even when {{pending}} > was not empty prior to the call. This method removes localized resources from > {{pending}}, therefore we should check the return value and give a DIE action > when it returns null. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
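A simplified sketch of the heartbeat-response logic the description calls for (class and enum names from the NM localization code; the surrounding structure is illustrative):
{code}
// findNextResource() removes resources that turn out to be already
// localized from 'pending', so a null return can mean "everything is
// done" rather than "nothing was pending".
LocalizerResourceRequestEvent next = findNextResource();
if (next != null) {
  response.setLocalizerAction(LocalizerAction.LIVE);
} else if (pending.isEmpty()) {
  // All resources are localized; no reason to keep the localizer alive.
  response.setLocalizerAction(LocalizerAction.DIE);
}
{code}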
[jira] [Commented] (YARN-3034) [Collector wireup] Implement RM starting its timeline collector
[ https://issues.apache.org/jira/browse/YARN-3034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376498#comment-14376498 ] Naganarasimha G R commented on YARN-3034: - Thanks for the comments [~zjshen], bq. and in this approach, I don't think we should couple RMTimelineCollector and SystemMetricsPublisher. Keeping SystemMetricsPublisher separate, we can easily deprecate and even remove it from the code base later. Maybe I am missing something here. If the RM or the RM context is not aware of it, the only way RMTimelineCollector can be invoked is through SystemMetricsPublisher's (SMP) public methods such as appCreated, appFinished and appAttemptRegistered; alternatively, RMTimelineCollector can have its own event handler, and during initialization SMP can select either the event handler in its own class or RMTimelineCollector's. But there will still be a dependency on the event source calling SMP's public methods. So I feel it will not be any smoother to deprecate and remove SystemMetricsPublisher, as it will still hold the code for creating RMTimelineCollector and for sending events to RMTimelineCollector to publish to ATS v2. bq. Moreover, we can keep the existing config as it is now, and create a new config to control starting the v2 RM data-writing stack. IMHO the current config is better: in ATS v2, container events are planned to move to the NM side (YARN-3045), so we also require a configuration on the NM side, but we cannot reuse the existing {{"yarn.resourcemanager.system-metrics-publisher.enabled"}} as it reads like an RM-side configuration only. The approach in the patch uses a single config for both the NM and the RM. > [Collector wireup] Implement RM starting its timeline collector > --- > > Key: YARN-3034 > URL: https://issues.apache.org/jira/browse/YARN-3034 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Naganarasimha G R > Attachments: YARN-3034-20150312-1.patch, YARN-3034.20150205-1.patch, > YARN-3034.20150316-1.patch, YARN-3034.20150318-1.patch, > YARN-3034.20150320-1.patch > > > Per design in YARN-2928, implement resource managers starting their own ATS > writers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3304) ResourceCalculatorProcessTree#getCpuUsagePercent default return value is inconsistent with other getters
[ https://issues.apache.org/jira/browse/YARN-3304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376466#comment-14376466 ] Hadoop QA commented on YARN-3304: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12706680/YARN-3304.patch against trunk revision 6ca1f12. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common: org.apache.hadoop.yarn.util.TestProcfsBasedProcessTree Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7078//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7078//console This message is automatically generated. > ResourceCalculatorProcessTree#getCpuUsagePercent default return value is > inconsistent with other getters > > > Key: YARN-3304 > URL: https://issues.apache.org/jira/browse/YARN-3304 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Junping Du >Assignee: Karthik Kambatla >Priority: Blocker > Attachments: YARN-3304.patch > > > Per discussions in YARN-3296, getCpuUsagePercent() will return -1 for > the unavailable case while other resource metrics return 0 in the same case, > which sounds inconsistent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3336) FileSystem memory leak in DelegationTokenRenewer
[ https://issues.apache.org/jira/browse/YARN-3336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376447#comment-14376447 ] zhihai xu commented on YARN-3336: - Thanks [~cnauroth] for valuable feedback and committing the patch! Greatly appreciated. > FileSystem memory leak in DelegationTokenRenewer > > > Key: YARN-3336 > URL: https://issues.apache.org/jira/browse/YARN-3336 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Fix For: 2.7.0 > > Attachments: YARN-3336.000.patch, YARN-3336.001.patch, > YARN-3336.002.patch, YARN-3336.003.patch, YARN-3336.004.patch > > > FileSystem memory leak in DelegationTokenRenewer. > Every time DelegationTokenRenewer#obtainSystemTokensForUser is called, a new > FileSystem entry will be added to FileSystem#CACHE which will never be > garbage collected. > This is the implementation of obtainSystemTokensForUser: > {code} > protected Token<?>[] obtainSystemTokensForUser(String user, > final Credentials credentials) throws IOException, InterruptedException > { > // Get new hdfs tokens on behalf of this user > UserGroupInformation proxyUser = > UserGroupInformation.createProxyUser(user, > UserGroupInformation.getLoginUser()); > Token<?>[] newTokens = > proxyUser.doAs(new PrivilegedExceptionAction<Token<?>[]>() { > @Override > public Token<?>[] run() throws Exception { > return FileSystem.get(getConfig()).addDelegationTokens( > UserGroupInformation.getLoginUser().getUserName(), credentials); > } > }); > return newTokens; > } > {code} > The memory leak happened when FileSystem.get(getConfig()) is called with a > new proxy user. > Because createProxyUser will always create a new Subject. > The calling sequence is > FileSystem.get(getConfig())=>FileSystem.get(getDefaultUri(conf), > conf)=>FileSystem.CACHE.get(uri, conf)=>FileSystem.CACHE.getInternal(uri, > conf, key)=>FileSystem.CACHE.map.get(key)=>createFileSystem(uri, conf) > {code} > public static UserGroupInformation createProxyUser(String user, > UserGroupInformation realUser) { > if (user == null || user.isEmpty()) { > throw new IllegalArgumentException("Null user"); > } > if (realUser == null) { > throw new IllegalArgumentException("Null real user"); > } > Subject subject = new Subject(); > Set<Principal> principals = subject.getPrincipals(); > principals.add(new User(user)); > principals.add(new RealUser(realUser)); > UserGroupInformation result = new UserGroupInformation(subject); > result.setAuthenticationMethod(AuthenticationMethod.PROXY); > return result; > } > {code} > FileSystem#Cache#Key.equals will compare the ugi > {code} > Key(URI uri, Configuration conf, long unique) throws IOException { > scheme = uri.getScheme()==null?"":uri.getScheme().toLowerCase(); > authority = > uri.getAuthority()==null?"":uri.getAuthority().toLowerCase(); > this.unique = unique; > this.ugi = UserGroupInformation.getCurrentUser(); > } > public boolean equals(Object obj) { > if (obj == this) { > return true; > } > if (obj != null && obj instanceof Key) { > Key that = (Key)obj; > return isEqual(this.scheme, that.scheme) > && isEqual(this.authority, that.authority) > && isEqual(this.ugi, that.ugi) > && (this.unique == that.unique); > } > return false; > } > {code} > UserGroupInformation.equals will compare subject by reference. 
> {code} > public boolean equals(Object o) { > if (o == this) { > return true; > } else if (o == null || getClass() != o.getClass()) { > return false; > } else { > return subject == ((UserGroupInformation) o).subject; > } > } > {code} > So in this case, every time createProxyUser and FileSystem.get(getConfig()) > are called, a new FileSystem will be created and a new entry will be added to > FileSystem.CACHE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
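A sketch of one way to plug the leak (not necessarily the committed patch): evict the FileSystem instances cached under the one-off proxy UGI once the tokens have been obtained, so the CACHE entry keyed by the fresh Subject cannot accumulate. FileSystem.closeAllForUGI is an existing Hadoop API; the surrounding method mirrors the snippet quoted above.
{code}
protected Token<?>[] obtainSystemTokensForUser(String user,
    final Credentials credentials) throws IOException, InterruptedException {
  UserGroupInformation proxyUser =
      UserGroupInformation.createProxyUser(user,
          UserGroupInformation.getLoginUser());
  return proxyUser.doAs(new PrivilegedExceptionAction<Token<?>[]>() {
    @Override
    public Token<?>[] run() throws Exception {
      FileSystem fs = FileSystem.get(getConfig());
      try {
        return fs.addDelegationTokens(
            UserGroupInformation.getLoginUser().getUserName(), credentials);
      } finally {
        // Close (and evict from FileSystem.CACHE) everything cached for
        // this throwaway proxy UGI, instead of leaking one entry per call.
        FileSystem.closeAllForUGI(UserGroupInformation.getCurrentUser());
      }
    }
  });
}
{code}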
[jira] [Commented] (YARN-3034) [Collector wireup] Implement RM starting its timeline collector
[ https://issues.apache.org/jira/browse/YARN-3034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376446#comment-14376446 ] Zhijie Shen commented on YARN-3034: --- bq. so I think it's not an incompatible change. Please provide your opinion on the same. Sorry, I missed that piece. bq. IIUC, the SystemMetricsPublisher.publish*Event methods can determine which version of ATS to publish to and can post accordingly? I meant that in the current approach SystemMetricsPublisher can be self-contained. RMTimelineCollector can be a private member of SystemMetricsPublisher, constructed and started there. It's not necessary for it to be visible in the RM and its context objects. bq. we might not require much of SystemMetricsPublisher's functionality and it would just be delegating the calls to RMTimelineCollector. I'm not sure if there has been previous discussion about the way for the RM to put entities, but this approach sounds cleaner, and in this approach, I don't think we should couple RMTimelineCollector and SystemMetricsPublisher. Keeping SystemMetricsPublisher separate, we can easily deprecate and even remove it from the code base later. Moreover, we can keep the existing config as it is now, and create a new config to control starting the v2 RM data-writing stack. > [Collector wireup] Implement RM starting its timeline collector > --- > > Key: YARN-3034 > URL: https://issues.apache.org/jira/browse/YARN-3034 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Naganarasimha G R > Attachments: YARN-3034-20150312-1.patch, YARN-3034.20150205-1.patch, > YARN-3034.20150316-1.patch, YARN-3034.20150318-1.patch, > YARN-3034.20150320-1.patch > > > Per design in YARN-2928, implement resource managers starting their own ATS > writers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3225) New parameter or CLI for decommissioning node gracefully in RMAdmin CLI
[ https://issues.apache.org/jira/browse/YARN-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376444#comment-14376444 ] Hadoop QA commented on YARN-3225: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12706513/YARN-3225-1.patch against trunk revision 7e6f384. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestRM Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7077//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7077//console This message is automatically generated. > New parameter or CLI for decommissioning node gracefully in RMAdmin CLI > --- > > Key: YARN-3225 > URL: https://issues.apache.org/jira/browse/YARN-3225 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Junping Du >Assignee: Devaraj K > Attachments: YARN-3225-1.patch, YARN-3225.patch, YARN-914.patch > > > A new CLI (or an existing CLI with parameters) should put each node on the > decommission list into decommissioning status, and track a timeout to terminate > the nodes that haven't finished. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3034) [Collector wireup] Implement RM starting its timeline collector
[ https://issues.apache.org/jira/browse/YARN-3034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376415#comment-14376415 ] Naganarasimha G R commented on YARN-3034: - Also [~zjshen], the earlier thought process behind exposing RMTimelineCollector to the RM and its context was to gradually replace SystemMetricsPublisher with RMTimelineCollector, as I felt that once we deprecate & completely remove ATS v1, we might not require much of SystemMetricsPublisher's functionality and it would just be delegating the calls to RMTimelineCollector. Your thoughts? > [Collector wireup] Implement RM starting its timeline collector > --- > > Key: YARN-3034 > URL: https://issues.apache.org/jira/browse/YARN-3034 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Naganarasimha G R > Attachments: YARN-3034-20150312-1.patch, YARN-3034.20150205-1.patch, > YARN-3034.20150316-1.patch, YARN-3034.20150318-1.patch, > YARN-3034.20150320-1.patch > > > Per design in YARN-2928, implement resource managers starting their own ATS > writers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3044) [Event producers] Implement RM writing app lifecycle events to ATS
[ https://issues.apache.org/jira/browse/YARN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376402#comment-14376402 ] Naganarasimha G R commented on YARN-3044: - Hi [~zjshen] & [~sjlee0], as part of this jira I am planning to capture the following basic App and AppAttempt lifecycle events in {{RMTimelineCollector}}: * ApplicationCreated * ApplicationFinished * ApplicationACLsUpdated * AppAttemptRegistered * AppAttemptFinished Apart from these, are there any other events you have thought about capturing (as I remember, somewhere Sangjin had mentioned capturing all the lifecycle events/states)? > [Event producers] Implement RM writing app lifecycle events to ATS > -- > > Key: YARN-3044 > URL: https://issues.apache.org/jira/browse/YARN-3044 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Naganarasimha G R > > Per design in YARN-2928, implement RM writing app lifecycle events to ATS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376400#comment-14376400 ] Zhijie Shen commented on YARN-3040: --- [~sjlee0], thanks for the additional comments, but would you mind continuing the flow attributes discussion in YARN-3391 to unblock this jira? In this jira, how about focusing on the data flow for passing this context info to the collector? Whatever the flow info should specifically be, this patch works out the path to collect it from the user via the application submission context and pass it to the RM, the NM, and finally the collector. If we're okay with this approach, it is easy for us to add new flow info or correct existing flow info later on. I filed YARN-3391 to fork off the flow-related discussion. > [Data Model] Make putEntities operation be aware of the app's context > - > > Key: YARN-3040 > URL: https://issues.apache.org/jira/browse/YARN-3040 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Zhijie Shen > Attachments: YARN-3040.1.patch, YARN-3040.2.patch > > > Per design in YARN-2928, implement client-side API for handling *flows*. > Frameworks should be able to define and pass in all attributes of flows and > flow runs to YARN, and they should be passed into ATS writers. > YARN tags were discussed as a way to handle this piece of information. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3136) getTransferredContainers can be a bottleneck during AM registration
[ https://issues.apache.org/jira/browse/YARN-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376380#comment-14376380 ] Hadoop QA commented on YARN-3136: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12706581/0008-YARN-3136.patch against trunk revision 36af4a9. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 14 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-tools/hadoop-sls hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7076//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/7076//artifact/patchprocess/newPatchFindbugsWarningshadoop-sls.html Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/7076//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7076//console This message is automatically generated. > getTransferredContainers can be a bottleneck during AM registration > --- > > Key: YARN-3136 > URL: https://issues.apache.org/jira/browse/YARN-3136 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Affects Versions: 2.6.0 >Reporter: Jason Lowe >Assignee: Sunil G > Attachments: 0001-YARN-3136.patch, 0002-YARN-3136.patch, > 0003-YARN-3136.patch, 0004-YARN-3136.patch, 0005-YARN-3136.patch, > 0006-YARN-3136.patch, 0007-YARN-3136.patch, 0008-YARN-3136.patch > > > While examining RM stack traces on a busy cluster I noticed a pattern of AMs > stuck waiting for the scheduler lock trying to call getTransferredContainers. > The scheduler lock is highly contended, especially on a large cluster with > many nodes heartbeating, and it would be nice if we could find a way to > eliminate the need to grab this lock during this call. We've already done > similar work during AM allocate calls to make sure they don't needlessly grab > the scheduler lock, and it would be good to do so here as well, if possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3391) Clearly define flow ID/ flow run / flow version in API and storage
Zhijie Shen created YARN-3391: - Summary: Clearly define flow ID/ flow run / flow version in API and storage Key: YARN-3391 URL: https://issues.apache.org/jira/browse/YARN-3391 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Zhijie Shen To continue the discussion in YARN-3040, let's figure out the best way to describe the flow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3304) ResourceCalculatorProcessTree#getCpuUsagePercent default return value is inconsistent with other getters
[ https://issues.apache.org/jira/browse/YARN-3304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3304: - Attachment: YARN-3304.patch Delivering a quick patch to fix it, given this is a blocker for the release. > ResourceCalculatorProcessTree#getCpuUsagePercent default return value is > inconsistent with other getters > > > Key: YARN-3304 > URL: https://issues.apache.org/jira/browse/YARN-3304 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Junping Du >Assignee: Karthik Kambatla >Priority: Blocker > Attachments: YARN-3304.patch > > > Per discussions in YARN-3296, getCpuUsagePercent() will return -1 for > the unavailable case while other resource metrics return 0 in the same case, > which sounds inconsistent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
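For reference, a minimal sketch of the consistency fix under discussion, assuming the getters converge on one shared "cannot measure" sentinel (the class and constant names here are illustrative, not necessarily what the patch uses):
{code}
public abstract class ProcessTreeSketch {
  // One shared sentinel instead of a mix of 0 and -1 across getters.
  public static final int UNAVAILABLE = -1;

  /** Memory getter: returns UNAVAILABLE rather than 0 when unmeasurable. */
  public long getRssMemorySize() {
    return UNAVAILABLE;
  }

  /** CPU getter: keeps its negative sentinel, now via the same constant. */
  public float getCpuUsagePercent() {
    return UNAVAILABLE;
  }
}
{code}
Callers can then test against the one constant instead of special-casing 0 for some metrics and -1 for others.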
[jira] [Commented] (YARN-3034) [Collector wireup] Implement RM starting its timeline collector
[ https://issues.apache.org/jira/browse/YARN-3034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376349#comment-14376349 ] Naganarasimha G R commented on YARN-3034: - Thanks for your comments [~zjshen] bq. RM_SYSTEM_METRICS_PUBLISHER_ENABLED -> SYSTEM_METRICS_PUBLISHER_ENABLED is an incompatible change. : I incorporated this based on [~vinodkv]'s [comment|https://issues.apache.org/jira/browse/YARN-3034?focusedCommentId=14360797&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14360797], and I have also added the old keys as part of {{addDeprecatedKeys}}, so I think it's not an incompatible change. Please provide your opinion on the same. bq. RMTimelineCollector doesn't need to be exposed to RM and its context. It seems to be enough to construct it inside SystemMetricsPublisher only. IIUC, the SystemMetricsPublisher.publish*Event methods can determine which version of ATS to publish to and can post accordingly? > [Collector wireup] Implement RM starting its timeline collector > --- > > Key: YARN-3034 > URL: https://issues.apache.org/jira/browse/YARN-3034 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Naganarasimha G R > Attachments: YARN-3034-20150312-1.patch, YARN-3034.20150205-1.patch, > YARN-3034.20150316-1.patch, YARN-3034.20150318-1.patch, > YARN-3034.20150320-1.patch > > > Per design in YARN-2928, implement resource managers starting their own ATS > writers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
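The {{addDeprecatedKeys}} mechanism referred to above rests on Hadoop's Configuration key-deprecation support. A hedged sketch of the registration (the new key name below is an assumption for illustration, not the one in the patch):
{code}
import org.apache.hadoop.conf.Configuration;

public final class MetricsPublisherKeys {
  public static final String OLD_KEY =
      "yarn.resourcemanager.system-metrics-publisher.enabled";
  // Assumed new, RM/NM-neutral key; the patch may use a different name.
  public static final String NEW_KEY =
      "yarn.system-metrics-publisher.enabled";

  static {
    // Reads and writes of the old key are transparently redirected to the
    // new one, which is why the rename need not be an incompatible change.
    Configuration.addDeprecation(OLD_KEY, NEW_KEY);
  }

  private MetricsPublisherKeys() {}
}
{code}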
[jira] [Commented] (YARN-3390) RMTimelineCollector should have the context info of each app
[ https://issues.apache.org/jira/browse/YARN-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376323#comment-14376323 ] Naganarasimha G R commented on YARN-3390: - Hi [~zjshen], shall I work on this jira, since I can utilize the same in YARN-3044? > RMTimelineCollector should have the context info of each app > > > Key: YARN-3390 > URL: https://issues.apache.org/jira/browse/YARN-3390 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Zhijie Shen > > RMTimelineCollector should have the context info of each app whose entity > has been put -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3336) FileSystem memory leak in DelegationTokenRenewer
[ https://issues.apache.org/jira/browse/YARN-3336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376300#comment-14376300 ] Hudson commented on YARN-3336: -- FAILURE: Integrated in Hadoop-trunk-Commit #7405 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7405/]) YARN-3336. FileSystem memory leak in DelegationTokenRenewer. (cnauroth: rev 6ca1f12024fd7cec7b01df0f039ca59f3f365dc1) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestDelegationTokenRenewer.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java > FileSystem memory leak in DelegationTokenRenewer > > > Key: YARN-3336 > URL: https://issues.apache.org/jira/browse/YARN-3336 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Critical > Fix For: 2.7.0 > > Attachments: YARN-3336.000.patch, YARN-3336.001.patch, > YARN-3336.002.patch, YARN-3336.003.patch, YARN-3336.004.patch > > > FileSystem memory leak in DelegationTokenRenewer. > Every time DelegationTokenRenewer#obtainSystemTokensForUser is called, a new > FileSystem entry will be added to FileSystem#CACHE which will never be > garbage collected. > This is the implementation of obtainSystemTokensForUser: > {code} > protected Token<?>[] obtainSystemTokensForUser(String user, > final Credentials credentials) throws IOException, InterruptedException > { > // Get new hdfs tokens on behalf of this user > UserGroupInformation proxyUser = > UserGroupInformation.createProxyUser(user, > UserGroupInformation.getLoginUser()); > Token<?>[] newTokens = > proxyUser.doAs(new PrivilegedExceptionAction<Token<?>[]>() { > @Override > public Token<?>[] run() throws Exception { > return FileSystem.get(getConfig()).addDelegationTokens( > UserGroupInformation.getLoginUser().getUserName(), credentials); > } > }); > return newTokens; > } > {code} > The memory leak happened when FileSystem.get(getConfig()) is called with a > new proxy user. > Because createProxyUser will always create a new Subject. 
> The calling sequence is > FileSystem.get(getConfig())=>FileSystem.get(getDefaultUri(conf), > conf)=>FileSystem.CACHE.get(uri, conf)=>FileSystem.CACHE.getInternal(uri, > conf, key)=>FileSystem.CACHE.map.get(key)=>createFileSystem(uri, conf) > {code} > public static UserGroupInformation createProxyUser(String user, > UserGroupInformation realUser) { > if (user == null || user.isEmpty()) { > throw new IllegalArgumentException("Null user"); > } > if (realUser == null) { > throw new IllegalArgumentException("Null real user"); > } > Subject subject = new Subject(); > Set<Principal> principals = subject.getPrincipals(); > principals.add(new User(user)); > principals.add(new RealUser(realUser)); > UserGroupInformation result = new UserGroupInformation(subject); > result.setAuthenticationMethod(AuthenticationMethod.PROXY); > return result; > } > {code} > FileSystem#Cache#Key.equals will compare the ugi > {code} > Key(URI uri, Configuration conf, long unique) throws IOException { > scheme = uri.getScheme()==null?"":uri.getScheme().toLowerCase(); > authority = > uri.getAuthority()==null?"":uri.getAuthority().toLowerCase(); > this.unique = unique; > this.ugi = UserGroupInformation.getCurrentUser(); > } > public boolean equals(Object obj) { > if (obj == this) { > return true; > } > if (obj != null && obj instanceof Key) { > Key that = (Key)obj; > return isEqual(this.scheme, that.scheme) > && isEqual(this.authority, that.authority) > && isEqual(this.ugi, that.ugi) > && (this.unique == that.unique); > } > return false; > } > {code} > UserGroupInformation.equals will compare subject by reference. > {code} > public boolean equals(Object o) { > if (o == this) { > return true; > } else if (o == null || getClass() != o.getClass()) { > return false; > } else { > return subject == ((UserGroupInformation) o).subject; > } > } > {code} > So in this case, every time createProxyUser and FileSystem.get(getConfig()) > are called, a new FileSystem will be created and a new entry will be added to > FileSystem.CACHE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376285#comment-14376285 ] Sangjin Lee commented on YARN-3040: --- {quote} I can understand this particular case described above. Like my prior comment about flow run ID, my concern is whether flow/version/run's explicit hierarchy is general enough to capture most use cases. IMHO, by nature, the hierarchy is the tree of flows, and a flow can be the flow of flows or the flow of apps. However, if other users just want to use one level of flow, version/run info seems to be redundant. On the other side, if we use the recursive flow structure, it's flexible to have anywhere from one to many flow levels. We can treat the first level as the flow, the second as the version, and the third as the run. I don't have expert knowledge about workflow systems such as Oozie, but just want to think my concern out loud. That said, if flow/version/run is the general description of a flow, I agree we should pass in these three env vars together and separately. {quote} Agreed that we need to consider both use cases (single level and multi-level). I just want to clarify that even with one level of flows, it is possible (and in fact it is more common) that there are multiple runs for a given flow version, and multiple versions for a given flow name; e.g. "foo.pig"/"v.1"/1, "foo.pig"/"v.1"/2, ..., "foo.pig"/"v.2"/10, "foo.pig"/"v.2"/11, ... Also, my mental model is that flow id/version/run-id is not a hierarchy. It's just a group of 3 attributes (although there is some implied contains relationship). Also, when we store these 3 attributes in the storage, I suspect schemas like HBase/phoenix will probably make only the flow id (name) and the flow run id part of the primary/row key, and store the flow version in a separate table. > [Data Model] Make putEntities operation be aware of the app's context > - > > Key: YARN-3040 > URL: https://issues.apache.org/jira/browse/YARN-3040 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Zhijie Shen > Attachments: YARN-3040.1.patch, YARN-3040.2.patch > > > Per design in YARN-2928, implement client-side API for handling *flows*. > Frameworks should be able to define and pass in all attributes of flows and > flow runs to YARN, and they should be passed into ATS writers. > YARN tags were discussed as a way to handle this piece of information. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376255#comment-14376255 ] Sangjin Lee commented on YARN-3040: --- bq. I can see the benefit. For example, if it represents the timestamp, we can filter the flow runs and say give me the runs in the last 5 mins. But my concern is whether it's the general way to let the user describe a run. The design doc says the flow runs for a given flow must have "unique and totally ordered run identifiers". We obviously had numbers in mind when we had that (mostly coming from the ease of sorting and ordering in the storage). And that's the convention we will push frameworks to use. I think it is important that we make it a number (long). However, there is a difference between having numbers as run id's and having timestamps as run id's. I don't think we need to go so far as requiring timestamps as run id's. As long as they are numbers, I think it would be fine. I can imagine some flows using run id's like "1", "2", ... We could allow any arbitrary scheme to generate the run id's, but the challenge is it might seriously hamper the ability to store and sort them efficiently. And, in most cases, the timestamp of the flow start is quite a natural scheme, and I would think most frameworks will just adopt that scheme. What do you think? On a related note, we should also generate the default run id if it is missing. I realize this could be a bit tricky. If the flow id is also missing, then we're treating this single YARN app as a flow in and of itself. Then we can do flow/version/run id = (yarn app name)/("1")/(app submission timestamp). This is also mentioned in the design doc. However, if the flow id is provided but not the flow run id, it can be tricky as there can be multiple YARN apps for the given flow run. One obvious solution might be to reject app submission if the flow client (not the timeline client) sets the flow id but not the flow run id. For that we'd need some kind of a common layer for checks. Thoughts? > [Data Model] Make putEntities operation be aware of the app's context > - > > Key: YARN-3040 > URL: https://issues.apache.org/jira/browse/YARN-3040 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Zhijie Shen > Attachments: YARN-3040.1.patch, YARN-3040.2.patch > > > Per design in YARN-2928, implement client-side API for handling *flows*. > Frameworks should be able to define and pass in all attributes of flows and > flow runs to YARN, and they should be passed into ATS writers. > YARN tags were discussed as a way to handle this piece of information. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
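A sketch of the defaulting/validation rule proposed in the comment above (the class and method names are illustrative, not a YARN API): a missing flow id makes the app its own flow, while a flow id without a run id is rejected at the "common layer for checks".
{code}
final class FlowContextSketch {
  final String flowId;
  final String flowVersion;
  final long flowRunId;

  private FlowContextSketch(String id, String version, long runId) {
    this.flowId = id;
    this.flowVersion = version;
    this.flowRunId = runId;
  }

  static FlowContextSketch forApp(String appName, long submitTimeMillis,
      String flowId, String flowVersion, Long flowRunId) {
    if (flowId == null) {
      // The single YARN app is a flow in and of itself:
      // (yarn app name) / "1" / (app submission timestamp)
      return new FlowContextSketch(appName, "1", submitTimeMillis);
    }
    if (flowRunId == null) {
      // Ambiguous: several apps may belong to one flow run, so reject
      // rather than guess.
      throw new IllegalArgumentException("flow id set without a flow run id");
    }
    return new FlowContextSketch(flowId,
        flowVersion == null ? "1" : flowVersion, flowRunId);
  }
}
{code}
Keeping the run id a plain long preserves the cheap total ordering the design doc asks for, whatever scheme (timestamps or counters) a framework picks.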
[jira] [Commented] (YARN-3034) [Collector wireup] Implement RM starting its timeline collector
[ https://issues.apache.org/jira/browse/YARN-3034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376256#comment-14376256 ] Zhijie Shen commented on YARN-3034: --- Some comments about the patch: 1. RM_SYSTEM_METRICS_PUBLISHER_ENABLED -> SYSTEM_METRICS_PUBLISHER_ENABLED is an incompatible change. 2. RMTimelineCollector doesn't need to be exposed to RM and its context. It seems to be enough to construct it inside SystemMetricsPublisher only. bq. I would prefer the former, as it would be simpler to review. Please provide your opinion I filed a separate Jira: YARN-3390 > [Collector wireup] Implement RM starting its timeline collector > --- > > Key: YARN-3034 > URL: https://issues.apache.org/jira/browse/YARN-3034 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Naganarasimha G R > Attachments: YARN-3034-20150312-1.patch, YARN-3034.20150205-1.patch, > YARN-3034.20150316-1.patch, YARN-3034.20150318-1.patch, > YARN-3034.20150320-1.patch > > > Per design in YARN-2928, implement resource managers starting their own ATS > writers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3390) RMTimelineCollector should have the context info of each app
Zhijie Shen created YARN-3390: - Summary: RMTimelineCollector should have the context info of each app Key: YARN-3390 URL: https://issues.apache.org/jira/browse/YARN-3390 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Zhijie Shen RMTimelineCollector should have the context info of each app whose entity has been put -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3034) [Collector wireup] Implement RM starting its timeline collector
[ https://issues.apache.org/jira/browse/YARN-3034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376229#comment-14376229 ] Naganarasimha G R commented on YARN-3034: - Thanks [~sjlee0] & [~djp] for the reviews. {{"so I still suggest adding some check and warning here."}}: well, currently I log a warning message, {{"RMTimelineCollector has not been configured to publish System Metrics in ATS V2"}}, if it is not configured to publish system metrics for ATS v2. Will that suffice? bq. Zhijie Shen, can we put that work on your patch in YARN-3040? Or do you suggest something else? We can do it in 2 ways: * as Zhijie suggested [earlier|https://issues.apache.org/jira/browse/YARN-3034?focusedCommentId=14372342&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14372342], we can handle it in a separate jira * we can handle it as part of YARN-3044 (which I am working on) I would prefer the former, as it would be simpler to review. Please provide your opinion > [Collector wireup] Implement RM starting its timeline collector > --- > > Key: YARN-3034 > URL: https://issues.apache.org/jira/browse/YARN-3034 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Naganarasimha G R > Attachments: YARN-3034-20150312-1.patch, YARN-3034.20150205-1.patch, > YARN-3034.20150316-1.patch, YARN-3034.20150318-1.patch, > YARN-3034.20150320-1.patch > > > Per design in YARN-2928, implement resource managers starting their own ATS > writers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3034) [Collector wireup] Implement RM starting its timeline collector
[ https://issues.apache.org/jira/browse/YARN-3034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376228#comment-14376228 ] Zhijie Shen commented on YARN-3034: --- Let me elaborate on my previous comments. In YARN-3040, I'm working on making the context info available in the app-level collector, such that when we use the timeline client to put an entity inside the AM and the NM, the entity will be automatically associated with this context. This jira is to create the RM collector. To achieve the similar thing, the RM collector should have the context info available too. The RM has all this information (it should be inside RMApp), so the RM collector needs to make sure this information is available in some way when putting an entity. I'm okay if you want to exclude this work here, and I'll file a separate jira for it. However, I want to exclude it from YARN-3040 to prevent the patch there from growing even bigger. That one is required to unblock the frameworks writing their specific data, and I wish it could get in asap. > [Collector wireup] Implement RM starting its timeline collector > --- > > Key: YARN-3034 > URL: https://issues.apache.org/jira/browse/YARN-3034 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Naganarasimha G R > Attachments: YARN-3034-20150312-1.patch, YARN-3034.20150205-1.patch, > YARN-3034.20150316-1.patch, YARN-3034.20150318-1.patch, > YARN-3034.20150320-1.patch > > > Per design in YARN-2928, implement resource managers starting their own ATS > writers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3136) getTransferredContainers can be a bottleneck during AM registration
[ https://issues.apache.org/jira/browse/YARN-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-3136: -- Attachment: 0008-YARN-3136.patch > getTransferredContainers can be a bottleneck during AM registration > --- > > Key: YARN-3136 > URL: https://issues.apache.org/jira/browse/YARN-3136 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Affects Versions: 2.6.0 >Reporter: Jason Lowe >Assignee: Sunil G > Attachments: 0001-YARN-3136.patch, 0002-YARN-3136.patch, > 0003-YARN-3136.patch, 0004-YARN-3136.patch, 0005-YARN-3136.patch, > 0006-YARN-3136.patch, 0007-YARN-3136.patch, 0008-YARN-3136.patch > > > While examining RM stack traces on a busy cluster I noticed a pattern of AMs > stuck waiting for the scheduler lock trying to call getTransferredContainers. > The scheduler lock is highly contended, especially on a large cluster with > many nodes heartbeating, and it would be nice if we could find a way to > eliminate the need to grab this lock during this call. We've already done > similar work during AM allocate calls to make sure they don't needlessly grab > the scheduler lock, and it would be good to do so here as well, if possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3034) [Collector wireup] Implement RM starting its timeline collector
[ https://issues.apache.org/jira/browse/YARN-3034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376200#comment-14376200 ] Junping Du commented on YARN-3034: -- Thanks [~Naganarasimha] for updating the patch! bq. Also, we should add a warning message log if the user puts something illegal here, or it just goes silent without any warning. This I feel is not required, as we don't do this for any other configuration, and we have also clearly captured the possible values in yarn-default.xml. Most configurations get loaded as a boolean value or an int. Some String configurations are for loading classes, so a ClassNotFound will get thrown immediately if the name is wrong. This is a different case, so I still suggest adding some check and warning here. For the context info, [~zjshen], can we put that work on your patch in YARN-3040? Or do you suggest something else? > [Collector wireup] Implement RM starting its timeline collector > --- > > Key: YARN-3034 > URL: https://issues.apache.org/jira/browse/YARN-3034 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Naganarasimha G R > Attachments: YARN-3034-20150312-1.patch, YARN-3034.20150205-1.patch, > YARN-3034.20150316-1.patch, YARN-3034.20150318-1.patch, > YARN-3034.20150320-1.patch > > > Per design in YARN-2928, implement resource managers starting their own ATS > writers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3389) Two attempts might operate on same data structures concurrently
[ https://issues.apache.org/jira/browse/YARN-3389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376180#comment-14376180 ] Hadoop QA commented on YARN-3389: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12706549/YARN-3389.01.patch against trunk revision 0b9f12c. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.security.TestRMDelegationTokens org.apache.hadoop.yarn.server.resourcemanager.TestRM Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7075//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7075//console This message is automatically generated. > Two attempts might operate on same data structures concurrently > --- > > Key: YARN-3389 > URL: https://issues.apache.org/jira/browse/YARN-3389 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Jun Gong >Assignee: Jun Gong > Attachments: YARN-3389.01.patch > > > In AttemptFailedTransition, the new attempt will get references to > the failed attempt's state ('justFinishedContainers' and 'finishedContainersSentToAM'). > Then the two attempts might operate on these two > variables concurrently, e.g. they might both update 'justFinishedContainers' > when they are each handling a CONTAINER_FINISHED event. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
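A toy model of the hazard described in the report (the field name mirrors the JIRA; everything else is illustrative, not the RMAppAttempt code): aliasing the failed attempt's collection lets two attempts mutate it concurrently, whereas copying on transfer keeps each attempt's state private.
{code}
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class AttemptSketch {
  // Thread-safe and, crucially, owned by this attempt alone.
  private final List<String> justFinishedContainers =
      new CopyOnWriteArrayList<>();

  void transferStateFrom(AttemptSketch failedAttempt) {
    // Snapshot instead of sharing the reference, so a late
    // CONTAINER_FINISHED event on the old attempt cannot race with us.
    justFinishedContainers.addAll(failedAttempt.justFinishedContainers);
  }

  void onContainerFinished(String containerId) {
    justFinishedContainers.add(containerId);
  }
}
{code}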
[jira] [Commented] (YARN-3040) [Data Model] Make putEntities operation be aware of the app's context
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376169#comment-14376169 ] Zhijie Shen commented on YARN-3040: --- bq. It sounds not quite scalable if we have one client for each app in the RM... In RM/NM, I think we can and we should implement a wrapper layer, which may contain multiple applications, to have a delegator write the data for multiple applications. bq. One most significant advantage to have run ids as integers is we can easily sort all existing runs for one flow in ascending or descending order. This might be a solid use case in general? I can see the benefit. For example, if it represents the timestamp, we can filter the flow runs and say give me the runs in the last 5 mins. But my concern is whether it's the general way to let the user describe a run. bq. Hmm, I didn't think the version as part of the flow id. I can understand this particular case described above. Like my prior comment about flow run ID, my concern is whether flow/version/run's explicit hierarchy is general enough to capture most use cases. IMHO, by nature, the hierarchy is the tree of flows, and a flow can be the flow of flows or the flow of apps. However, if other users just want to use one level of flow, version/run info seems to be redundant. On the other side, if we use the recursive flow structure, it's flexible to have anywhere from one to many flow levels. We can treat the first level as the flow, the second as the version, and the third as the run. I don't have expert knowledge about workflow systems such as Oozie, but just want to think my concern out loud. That said, if flow/version/run is the general description of a flow, I agree we should pass in these three env vars together and separately. bq. Mostly fine, but I have some concerns about rolling upgrades. bq. I'm still not sure why it would make sense to have different logical cluster id's every time the RM/cluster restarts. I meant the admin can configure a cluster ID explicitly, which won't be appended with the timestamp. I added it for the default value to distinguish the clusters that are started by you and me, but thinking about it again, the RM restart problem makes sense. I'll change the default not to append the timestamp. > [Data Model] Make putEntities operation be aware of the app's context > - > > Key: YARN-3040 > URL: https://issues.apache.org/jira/browse/YARN-3040 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Zhijie Shen > Attachments: YARN-3040.1.patch, YARN-3040.2.patch > > > Per design in YARN-2928, implement client-side API for handling *flows*. > Frameworks should be able to define and pass in all attributes of flows and > flow runs to YARN, and they should be passed into ATS writers. > YARN tags were discussed as a way to handle this piece of information. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3034) [Collector wireup] Implement RM starting its timeline collector
[ https://issues.apache.org/jira/browse/YARN-3034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376150#comment-14376150 ] Sangjin Lee commented on YARN-3034: --- LGTM. Let's wait to hear from Zhijie. > [Collector wireup] Implement RM starting its timeline collector > --- > > Key: YARN-3034 > URL: https://issues.apache.org/jira/browse/YARN-3034 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Naganarasimha G R > Attachments: YARN-3034-20150312-1.patch, YARN-3034.20150205-1.patch, > YARN-3034.20150316-1.patch, YARN-3034.20150318-1.patch, > YARN-3034.20150320-1.patch > > > Per design in YARN-2928, implement resource managers starting their own ATS > writers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle
[ https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376109#comment-14376109 ] Zhijie Shen commented on YARN-3047: --- bq. We can probably move it to yarn-api. I prefer keeping it in the server module, unless it's supposed to be public to users. bq. This has to be discussed though as Zhijie Shen thinks we can use the same v1 config. My opinion is that the collector should bind to a random port, which will be reported to the timeline client. The reader, as a single daemon, should start on a configured port, and users know it from the config. bq. TimelineReaderWebServer If you'd like to keep "reader", I'm fine with it, but let's still say TimelineReaderServer. Meanwhile, TimelineReaderWebService -> TimelineReaderWebService*s*. > [Data Serving] Set up ATS reader with basic request serving structure and > lifecycle > --- > > Key: YARN-3047 > URL: https://issues.apache.org/jira/browse/YARN-3047 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: YARN-3047.001.patch, YARN-3047.003.patch, > YARN-3047.02.patch > > > Per design in YARN-2938, set up the ATS reader as a service and implement the > basic structure as a service. It includes lifecycle management, request > serving, and so on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
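The port policy being proposed, in a small self-contained sketch (plain sockets stand in for the actual collector/reader web servers, and the system property name is made up): binding to port 0 gives the collector an OS-assigned ephemeral port that can then be reported back, while the reader uses a fixed, configured port.
{code}
import java.net.ServerSocket;

public class PortPolicySketch {
  public static void main(String[] args) throws Exception {
    // Collector: port 0 = let the OS pick; the chosen port is then
    // discovered and reported to the timeline client.
    try (ServerSocket collector = new ServerSocket(0)) {
      System.out.println("collector bound to " + collector.getLocalPort());
    }

    // Reader: a fixed port that users learn from configuration.
    int readerPort = Integer.getInteger("reader.port", 8188);
    try (ServerSocket reader = new ServerSocket(readerPort)) {
      System.out.println("reader bound to " + reader.getLocalPort());
    }
  }
}
{code}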
[jira] [Commented] (YARN-3111) Fix ratio problem on FairScheduler page
[ https://issues.apache.org/jira/browse/YARN-3111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376086#comment-14376086 ] Hadoop QA commented on YARN-3111: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12706530/YARN-3111.v2.patch against trunk revision 0b9f12c. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7074//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7074//console This message is automatically generated. > Fix ratio problem on FairScheduler page > --- > > Key: YARN-3111 > URL: https://issues.apache.org/jira/browse/YARN-3111 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.6.0 >Reporter: Peng Zhang >Assignee: Peng Zhang >Priority: Minor > Attachments: YARN-3111.1.patch, YARN-3111.png, YARN-3111.v2.patch, > parenttooltip.png > > > Found 3 problems on the FairScheduler page: > 1. Only memory is computed for the ratio, even when the queue schedulingPolicy is DRF. > 2. When min resources are configured larger than real resources, the steady > fair share ratio is so large that it runs off the page. > 3. When cluster resources are 0 (no nodemanager started), the ratio is displayed as > "NaN% used". > The attached image shows a snapshot of the above problems. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3384) TestLogAggregationService.verifyContainerLogs fails after YARN-2777
[ https://issues.apache.org/jira/browse/YARN-3384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376067#comment-14376067 ] Hudson commented on YARN-3384: -- SUCCESS: Integrated in Hadoop-trunk-Commit #7402 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7402/]) YARN-3384. TestLogAggregationService.verifyContainerLogs fails after YARN-2777. Contributed by Naganarasimha G R. (ozawa: rev 82eda771e05cf2b31788ee1582551e65f1c0f9aa) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java > TestLogAggregationService.verifyContainerLogs fails after YARN-2777 > --- > > Key: YARN-3384 > URL: https://issues.apache.org/jira/browse/YARN-3384 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Minor > Labels: test-fail > Fix For: 2.7.0 > > Attachments: YARN-3384.20150321-1.patch > > > The following test cases of TestLogAggregationService are failing: > testMultipleAppsLogAggregation > testLogAggregationServiceWithRetention > testLogAggregationServiceWithInterval > testLogAggregationServiceWithPatterns -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2777) Mark the end of individual log in aggregated log
[ https://issues.apache.org/jira/browse/YARN-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376066#comment-14376066 ] Hudson commented on YARN-2777: -- SUCCESS: Integrated in Hadoop-trunk-Commit #7402 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7402/]) YARN-3384. TestLogAggregationService.verifyContainerLogs fails after YARN-2777. Contributed by Naganarasimha G R. (ozawa: rev 82eda771e05cf2b31788ee1582551e65f1c0f9aa) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java > Mark the end of individual log in aggregated log > > > Key: YARN-2777 > URL: https://issues.apache.org/jira/browse/YARN-2777 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Ted Yu >Assignee: Varun Saxena > Labels: log-aggregation > Fix For: 2.7.0 > > Attachments: YARN-2777.001.patch, YARN-2777.02.patch > > > Below is snippet of aggregated log showing hbase master log: > {code} > LogType: hbase-hbase-master-ip-172-31-34-167.log > LogUploadTime: 29-Oct-2014 22:31:55 > LogLength: 24103045 > Log Contents: > Wed Oct 29 15:43:57 UTC 2014 Starting master on ip-172-31-34-167 > ... > at > org.apache.hadoop.hbase.master.cleaner.CleanerChore.chore(CleanerChore.java:124) > at org.apache.hadoop.hbase.Chore.run(Chore.java:80) > at java.lang.Thread.run(Thread.java:745) > LogType: hbase-hbase-master-ip-172-31-34-167.out > {code} > Since logs from various daemons are aggregated in one log file, it would be > desirable to mark the end of one log before starting with the next. > e.g. with such a line: > {code} > End of LogType: hbase-hbase-master-ip-172-31-34-167.log > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3388) userlimit isn't playing well with DRF calculator
[ https://issues.apache.org/jira/browse/YARN-3388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376060#comment-14376060 ] Nathan Roberts commented on YARN-3388: --
Example (lots of things going on in this algorithm; I simplified to just the key pieces for clarity). Tuples are resources: [memory] or [memory,cpu].
just memory:
- Queue Capacity is [100]
- 2 active users, both request [10] at a time
- User1 is at [45], User2 is at [40]
- Limit is calculated to be 100/2=50, so both users can allocate
- User2 goes to [50]; used Capacity is now 45+50=95, Limit is still 50
- User1 goes to [55]; used Capacity is now 50+55=105, Limit is now 105/2
- User2 goes to [60]; used Capacity is now 60+55=115, Limit is now 115/2
- So on and so forth until maxCapacity is hit. Notice how the users essentially leapfrog one another, allowing the Limit to continually move higher.
memory and cpu:
- Queue Capacity is [100,100]
- 2 active users: User1 asks for [10,20] at a time, User2 asks for [20,10]
- User1 is at [35,45], User2 is at [45,35]
- Limit is calculated to be [100/2=50, 100/2=50], so both users can allocate
- User2 goes to [65,45]; used Capacity is now [65+35=100, 45+45=90], Limit is still [50,50]
- User1 goes to [45,65]; used Capacity is now [65+45=110, 45+65=110], Limit is now [110/2=55, 110/2=55]
- User1 and User2 are now both considered over limit and neither can allocate: User1 is over on cpu, User2 is over on memory.
Open to suggestions on simple ways to fix this. I'm currently thinking a reasonable (simple, effective, computationally cheap, mostly fair) approach might be to give some small percentage of additional leeway for userLimit.
> userlimit isn't playing well with DRF calculator > > > Key: YARN-3388 > URL: https://issues.apache.org/jira/browse/YARN-3388 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 2.6.0 >Reporter: Nathan Roberts >Assignee: Nathan Roberts > >
> When there are multiple active users in a queue, it should be possible for those users to make use of capacity up to max_capacity (or close). The resources should be distributed fairly among the active users in the queue. This works pretty well when there is a single resource being scheduled. However, when there are multiple resources the situation gets more complex and the current algorithm tends to get stuck at Capacity. An example is illustrated in the subsequent comment.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
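The leapfrog-then-stall behavior described in the comment can be reproduced with a short, self-contained simulation. This is a deliberately simplified model of the user-limit computation (limit = max(queue capacity, used capacity) / active users, with a user blocked once it exceeds the limit on any resource), not CapacityScheduler code; with one resource the users alternate past the rising limit, while with two resources each user is dominant on a different resource and both stall at the same limit:
{code}
public class UserLimitLeapfrog {

  // A user may allocate only if it is at or below the limit on every resource.
  static boolean underLimit(int[] used, double limit) {
    for (int u : used) {
      if (u > limit) return false;
    }
    return true;
  }

  static void simulate(int[] u1, int[] ask1, int[] u2, int[] ask2,
                       int queueCapacity, int maxCapacity) {
    int dims = u1.length;
    for (int round = 0; round < 20; round++) {
      int usedCap = 0;
      for (int d = 0; d < dims; d++) {
        usedCap = Math.max(usedCap, u1[d] + u2[d]);
      }
      if (usedCap >= maxCapacity) {
        System.out.println("reached maxCapacity");
        return;
      }
      // The limit grows with consumption, which is what lets users leapfrog.
      double limit = Math.max(queueCapacity, usedCap) / 2.0;
      boolean progress = false;
      if (underLimit(u1, limit)) {
        for (int d = 0; d < dims; d++) u1[d] += ask1[d];
        progress = true;
      }
      if (underLimit(u2, limit)) {
        for (int d = 0; d < dims; d++) u2[d] += ask2[d];
        progress = true;
      }
      if (!progress) {
        System.out.println("stuck: both users over limit " + limit);
        return;
      }
    }
  }

  public static void main(String[] args) {
    // Single resource: leapfrogs all the way to maxCapacity.
    simulate(new int[]{45}, new int[]{10},
             new int[]{40}, new int[]{10}, 100, 200);
    // Two resources: deadlocks with both users over limit 55, as in the
    // comment (User1 over on cpu, User2 over on memory).
    simulate(new int[]{35, 45}, new int[]{10, 20},
             new int[]{45, 35}, new int[]{20, 10}, 100, 200);
  }
}
{code}
One way to model the leeway idea from the comment would be to relax the check to something like used <= limit * 1.1, which would let whichever user is less far over the limit keep making progress instead of both stalling.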
[jira] [Commented] (YARN-3384) TestLogAggregationService.verifyContainerLogs fails after YARN-2777
[ https://issues.apache.org/jira/browse/YARN-3384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376059#comment-14376059 ] Naganarasimha G R commented on YARN-3384: - Thanks [~ozawa] for reviewing and committing the patch :) > TestLogAggregationService.verifyContainerLogs fails after YARN-2777 > --- > > Key: YARN-3384 > URL: https://issues.apache.org/jira/browse/YARN-3384 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Minor > Labels: test-fail > Fix For: 2.7.0 > > Attachments: YARN-3384.20150321-1.patch > > > The following test cases of TestLogAggregationService are failing: > testMultipleAppsLogAggregation > testLogAggregationServiceWithRetention > testLogAggregationServiceWithInterval > testLogAggregationServiceWithPatterns -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3384) TestLogAggregationService.verifyContainerLogs fails after YARN-2777
[ https://issues.apache.org/jira/browse/YARN-3384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi Ozawa updated YARN-3384: - Summary: TestLogAggregationService.verifyContainerLogs fails after YARN-2777 (was: Test failures since TestLogAggregationService.verifyContainerLogs fails after YARN-2777) > TestLogAggregationService.verifyContainerLogs fails after YARN-2777 > --- > > Key: YARN-3384 > URL: https://issues.apache.org/jira/browse/YARN-3384 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Minor > Labels: test-fail > Attachments: YARN-3384.20150321-1.patch > > > The following test cases of TestLogAggregationService are failing: > testMultipleAppsLogAggregation > testLogAggregationServiceWithRetention > testLogAggregationServiceWithInterval > testLogAggregationServiceWithPatterns -- This message was sent by Atlassian JIRA (v6.3.4#6332)