[jira] [Updated] (YARN-3517) RM web ui for dumping scheduler logs should be for admins only
[ https://issues.apache.org/jira/browse/YARN-3517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev updated YARN-3517: Attachment: YARN-3517.001.patch Uploaded patch with fix. RM web ui for dumping scheduler logs should be for admins only -- Key: YARN-3517 URL: https://issues.apache.org/jira/browse/YARN-3517 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Reporter: Varun Vasudev Assignee: Varun Vasudev Attachments: YARN-3517.001.patch YARN-3294 allows users to dump scheduler logs from the web UI. This should be for admins only. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
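A minimal sketch of the kind of admin-only guard being discussed (hypothetical; the actual patch touches the RM web UI code, which is not reproduced here). It assumes only the standard {{yarn.admin.acl}} setting plus Hadoop's {{AccessControlList}} and {{UserGroupInformation}} classes, and "web-user" stands in for the authenticated caller:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.authorize.AccessControlList;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class AdminOnlyLogDumpSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Build the admin ACL from yarn.admin.acl (default "*", i.e. everyone).
    AccessControlList adminAcl = new AccessControlList(
        conf.get(YarnConfiguration.YARN_ADMIN_ACL,
            YarnConfiguration.DEFAULT_YARN_ADMIN_ACL));

    // "web-user" is an assumed stand-in for the authenticated web UI caller.
    UserGroupInformation callerUGI =
        UserGroupInformation.createRemoteUser("web-user");

    if (!adminAcl.isUserAllowed(callerUGI)) {
      throw new SecurityException("Only admins may dump scheduler logs");
    }
    // ... otherwise proceed to trigger the scheduler log dump ...
  }
}
{code}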
[jira] [Updated] (YARN-3514) Active directory usernames like domain\login cause YARN failures
[ https://issues.apache.org/jira/browse/YARN-3514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated YARN-3514: Component/s: (was: yarn) nodemanager Target Version/s: 2.8.0 Assignee: Chris Nauroth Active directory usernames like domain\login cause YARN failures Key: YARN-3514 URL: https://issues.apache.org/jira/browse/YARN-3514 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.2.0 Environment: CentOS6 Reporter: john lilley Assignee: Chris Nauroth Priority: Minor Attachments: YARN-3514.001.patch We have a 2.2.0 (Cloudera 5.3) cluster running on CentOS6 that is Kerberos-enabled and uses an external AD domain controller for the KDC. We are able to authenticate, browse HDFS, etc. However, YARN fails during localization because it seems to get confused by the presence of a \ character in the local user name. Our AD authentication on the nodes goes through sssd and set configured to map AD users onto the form domain\username. For example, our test user has a Kerberos principal of hadoopu...@domain.com and that maps onto a CentOS user domain\hadoopuser. We have no problem validating that user with PAM, logging in as that user, su-ing to that user, etc. However, when we attempt to run a YARN application master, the localization step fails when setting up the local cache directory for the AM. The error that comes out of the RM logs: 2015-04-17 12:47:09 INFO net.redpoint.yarnapp.Client[0]: monitorApplication: ApplicationReport: appId=1, state=FAILED, progress=0.0, finalStatus=FAILED, diagnostics='Application application_1429295486450_0001 failed 1 times due to AM Container for appattempt_1429295486450_0001_01 exited with exitCode: -1000 due to: Application application_1429295486450_0001 initialization failed (exitCode=255) with output: main : command provided 0 main : user is DOMAIN\hadoopuser main : requested yarn user is domain\hadoopuser org.apache.hadoop.util.DiskChecker$DiskErrorException: Cannot create directory: /data/yarn/nm/usercache/domain%5Chadoopuser/appcache/application_1429295486450_0001/filecache/10 at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:105) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.download(ContainerLocalizer.java:199) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:241) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:169) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.main(ContainerLocalizer.java:347) .Failing this attempt.. Failing the application.' However, when we look on the node launching the AM, we see this: [root@rpb-cdh-kerb-2 ~]# cd /data/yarn/nm/usercache [root@rpb-cdh-kerb-2 usercache]# ls -l drwxr-s--- 4 DOMAIN\hadoopuser yarn 4096 Apr 17 12:10 domain\hadoopuser There appears to be different treatment of the \ character in different places. Something creates the directory as domain\hadoopuser but something else later attempts to use it as domain%5Chadoopuser. I’m not sure where or why the URL escapement converts the \ to %5C or why this is not consistent. I should also mention, for the sake of completeness, our auth_to_local rule is set up to map u...@domain.com to domain\user: RULE:[1:$1@$0](^.*@DOMAIN\.COM$)s/^(.*)@DOMAIN\.COM$/domain\\$1/g -- This message was sent by Atlassian JIRA (v6.3.4#6332)
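The auth_to_local rule quoted above can be exercised in isolation with Hadoop's {{KerberosName}} utility. The snippet below is a sketch only: the principal {{hadoopuser@DOMAIN.COM}} is an assumed example (the reporter's value is obfuscated above), and the rule is the one from the description written as a Java string literal with doubled backslashes:
{code}
import org.apache.hadoop.security.authentication.util.KerberosName;

public class AuthToLocalSketch {
  public static void main(String[] args) throws Exception {
    // The rule from the description, plus DEFAULT as a catch-all.
    KerberosName.setRules(
        "RULE:[1:$1@$0](^.*@DOMAIN\\.COM$)s/^(.*)@DOMAIN\\.COM$/domain\\\\$1/g\n"
            + "DEFAULT");

    // Assumed example principal; expected to map to domain\hadoopuser,
    // i.e. a short name that contains a backslash.
    KerberosName name = new KerberosName("hadoopuser@DOMAIN.COM");
    System.out.println(name.getShortName());
  }
}
{code}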
[jira] [Updated] (YARN-3517) RM web ui for dumping scheduler logs should be for admins only
[ https://issues.apache.org/jira/browse/YARN-3517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated YARN-3517: --- Component/s: security RM web ui for dumping scheduler logs should be for admins only -- Key: YARN-3517 URL: https://issues.apache.org/jira/browse/YARN-3517 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, security Affects Versions: 2.7.0 Reporter: Varun Vasudev Assignee: Varun Vasudev Labels: security Attachments: YARN-3517.001.patch YARN-3294 allows users to dump scheduler logs from the web UI. This should be for admins only. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3517) RM web ui for dumping scheduler logs should be for admins only
[ https://issues.apache.org/jira/browse/YARN-3517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated YARN-3517: --- Labels: security (was: ) RM web ui for dumping scheduler logs should be for admins only -- Key: YARN-3517 URL: https://issues.apache.org/jira/browse/YARN-3517 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, security Affects Versions: 2.7.0 Reporter: Varun Vasudev Assignee: Varun Vasudev Labels: security Attachments: YARN-3517.001.patch YARN-3294 allows users to dump scheduler logs from the web UI. This should be for admins only. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3489) RMServerUtils.validateResourceRequests should only obtain queue info once
[ https://issues.apache.org/jira/browse/YARN-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504437#comment-14504437 ] Hadoop QA commented on YARN-3489: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726790/YARN-3489.02.patch against trunk revision d52de61. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.security.TestRMDelegationTokens Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7417//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7417//console This message is automatically generated. RMServerUtils.validateResourceRequests should only obtain queue info once - Key: YARN-3489 URL: https://issues.apache.org/jira/browse/YARN-3489 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.6.0 Reporter: Jason Lowe Assignee: Varun Saxena Attachments: YARN-3489.01.patch, YARN-3489.02.patch Since the label support was added we now get the queue info for each request being validated in SchedulerUtils.validateResourceRequest. If validateResourceRequests needs to validate a lot of requests at a time (e.g.: large cluster with lots of varied locality in the requests) then it will get the queue info for each request. Since we build the queue info this generates a lot of unnecessary garbage, as the queue isn't changing between requests. We should grab the queue info once and pass it down rather than building it again for each request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
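The improvement being proposed is essentially loop hoisting. The fragment below is a self-contained illustration with made-up types and method names, not the real {{RMServerUtils}}/{{SchedulerUtils}} signatures: the queue info is fetched once and reused for every request instead of being rebuilt per request.
{code}
import java.util.Arrays;
import java.util.List;

public class ValidateOnceSketch {
  // Stand-in for the scheduler-built queue snapshot.
  static class QueueInfo { }

  // Stand-in for the scheduler; each call builds a fresh object (garbage).
  static QueueInfo getQueueInfo(String queue) {
    return new QueueInfo();
  }

  static void validate(String request, QueueInfo queueInfo) {
    // validate the request against the already-fetched queue info
  }

  public static void main(String[] args) {
    List<String> requests = Arrays.asList("node-local", "rack-local", "any");

    // Fetch the queue info once, outside the loop, and pass it down.
    QueueInfo queueInfo = getQueueInfo("default");
    for (String request : requests) {
      validate(request, queueInfo);
    }
  }
}
{code}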
[jira] [Updated] (YARN-3517) RM web ui for dumping scheduler logs should be for admins only
[ https://issues.apache.org/jira/browse/YARN-3517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev updated YARN-3517: Affects Version/s: 2.7.0 RM web ui for dumping scheduler logs should be for admins only -- Key: YARN-3517 URL: https://issues.apache.org/jira/browse/YARN-3517 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Reporter: Varun Vasudev Assignee: Varun Vasudev YARN-3294 allows users to dump scheduler logs from the web UI. This should be for admins only. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3517) RM web ui for dumping scheduler logs should be for admins only
Varun Vasudev created YARN-3517: --- Summary: RM web ui for dumping scheduler logs should be for admins only Key: YARN-3517 URL: https://issues.apache.org/jira/browse/YARN-3517 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Varun Vasudev Assignee: Varun Vasudev YARN-3294 allows users to dump scheduler logs from the web UI. This should be for admins only. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3514) Active directory usernames like domain\login cause YARN failures
[ https://issues.apache.org/jira/browse/YARN-3514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated YARN-3514: Attachment: YARN-3514.001.patch I'm attaching a patch with the fix I described in my last comment. I added a test that passes a file name containing a '\' character through localization. With the existing code using {{URI#getRawPath}}, the test fails as shown below. (Note the incorrect URI-encoded path, similar to the reported symptom in the description.) After switching to {{URI#getPath}}, the test passes as expected. {code} Failed tests: TestContainerLocalizer.testLocalizerDiskCheckDoesNotUriEncodePath:265 Argument(s) are different! Wanted: containerLocalizer.checkDir(/my\File); - at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestContainerLocalizer.testLocalizerDiskCheckDoesNotUriEncodePath(TestContainerLocalizer.java:265) Actual invocation has different arguments: containerLocalizer.checkDir(/my%5CFile); - at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestContainerLocalizer.testLocalizerDiskCheckDoesNotUriEncodePath(TestContainerLocalizer.java:264) {code} Active directory usernames like domain\login cause YARN failures Key: YARN-3514 URL: https://issues.apache.org/jira/browse/YARN-3514 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 2.2.0 Environment: CentOS6 Reporter: john lilley Priority: Minor Attachments: YARN-3514.001.patch We have a 2.2.0 (Cloudera 5.3) cluster running on CentOS6 that is Kerberos-enabled and uses an external AD domain controller for the KDC. We are able to authenticate, browse HDFS, etc. However, YARN fails during localization because it seems to get confused by the presence of a \ character in the local user name. Our AD authentication on the nodes goes through sssd and set configured to map AD users onto the form domain\username. For example, our test user has a Kerberos principal of hadoopu...@domain.com and that maps onto a CentOS user domain\hadoopuser. We have no problem validating that user with PAM, logging in as that user, su-ing to that user, etc. However, when we attempt to run a YARN application master, the localization step fails when setting up the local cache directory for the AM. 
The error that comes out of the RM logs: 2015-04-17 12:47:09 INFO net.redpoint.yarnapp.Client[0]: monitorApplication: ApplicationReport: appId=1, state=FAILED, progress=0.0, finalStatus=FAILED, diagnostics='Application application_1429295486450_0001 failed 1 times due to AM Container for appattempt_1429295486450_0001_01 exited with exitCode: -1000 due to: Application application_1429295486450_0001 initialization failed (exitCode=255) with output: main : command provided 0 main : user is DOMAIN\hadoopuser main : requested yarn user is domain\hadoopuser org.apache.hadoop.util.DiskChecker$DiskErrorException: Cannot create directory: /data/yarn/nm/usercache/domain%5Chadoopuser/appcache/application_1429295486450_0001/filecache/10 at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:105) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.download(ContainerLocalizer.java:199) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:241) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:169) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.main(ContainerLocalizer.java:347) .Failing this attempt.. Failing the application.' However, when we look on the node launching the AM, we see this: [root@rpb-cdh-kerb-2 ~]# cd /data/yarn/nm/usercache [root@rpb-cdh-kerb-2 usercache]# ls -l drwxr-s--- 4 DOMAIN\hadoopuser yarn 4096 Apr 17 12:10 domain\hadoopuser There appears to be different treatment of the \ character in different places. Something creates the directory as domain\hadoopuser but something else later attempts to use it as domain%5Chadoopuser. I’m not sure where or why the URL escapement converts the \ to %5C or why this is not consistent. I should also mention, for the sake of completeness, our auth_to_local rule is set up to map u...@domain.com to domain\user: RULE:[1:$1@$0](^.*@DOMAIN\.COM$)s/^(.*)@DOMAIN\.COM$/domain\\$1/g -- This message was sent by Atlassian JIRA (v6.3.4#6332)
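The behavior behind the fix can be reproduced with a few lines of plain {{java.net.URI}} (a standalone sketch, independent of the ContainerLocalizer code): the multi-argument constructor percent-encodes the backslash, so {{getRawPath}} returns the {{%5C}} form while {{getPath}} returns the decoded path.
{code}
import java.net.URI;
import java.net.URISyntaxException;

public class BackslashPathSketch {
  public static void main(String[] args) throws URISyntaxException {
    // The multi-argument constructor quotes characters that are illegal in a
    // URI path, such as the backslash in an AD-style user name.
    URI uri = new URI("file", null,
        "/data/yarn/nm/usercache/domain\\hadoopuser/filecache/10", null, null);

    System.out.println(uri.getRawPath()); // .../domain%5Chadoopuser/... (encoded)
    System.out.println(uri.getPath());    // .../domain\hadoopuser/...  (decoded)
  }
}
{code}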
[jira] [Updated] (YARN-3516) killing ContainerLocalizer action doesn't take effect when private localizer receives FETCH_FAILURE status.
[ https://issues.apache.org/jira/browse/YARN-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated YARN-3516: Attachment: (was: YARN-3516.000.patch) killing ContainerLocalizer action doesn't take effect when private localizer receives FETCH_FAILURE status. --- Key: YARN-3516 URL: https://issues.apache.org/jira/browse/YARN-3516 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: zhihai xu Assignee: zhihai xu killing ContainerLocalizer action doesn't take effect when private localizer receives FETCH_FAILURE status. This is a typo from YARN-3024: since YARN-3024, the ContainerLocalizer is killed only if the local {{action}} variable is set to {{LocalizerAction.DIE}}, so calling {{response.setLocalizerAction}} directly gets overwritten. This is also a regression from the old code. It also makes sense to kill the ContainerLocalizer when FETCH_FAILURE happens, because the container will send a CLEANUP_CONTAINER_RESOURCES event after the localization failure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
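A simplified, self-contained sketch of the bug described above (the type and method names only loosely mirror the NodeManager code): the heartbeat handler finalizes the response from a local {{action}} variable, so setting the action directly on the response earlier in the method is silently overwritten.
{code}
public class LocalizerHeartbeatSketch {
  enum LocalizerAction { LIVE, DIE }

  static class LocalizerHeartbeatResponse {
    private LocalizerAction action = LocalizerAction.LIVE;
    void setLocalizerAction(LocalizerAction a) { this.action = a; }
    LocalizerAction getLocalizerAction() { return action; }
  }

  static LocalizerHeartbeatResponse onFetchFailure() {
    LocalizerHeartbeatResponse response = new LocalizerHeartbeatResponse();
    LocalizerAction action = LocalizerAction.LIVE;

    // Buggy variant: this call is lost because the response is finalized
    // from the local variable at the end of the method.
    // response.setLocalizerAction(LocalizerAction.DIE);

    // Fixed variant: set the local variable that actually drives the response.
    action = LocalizerAction.DIE;

    response.setLocalizerAction(action);
    return response;
  }

  public static void main(String[] args) {
    System.out.println(onFetchFailure().getLocalizerAction()); // prints DIE
  }
}
{code}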
[jira] [Commented] (YARN-3319) Implement a FairOrderingPolicy
[ https://issues.apache.org/jira/browse/YARN-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505421#comment-14505421 ] Hadoop QA commented on YARN-3319: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726913/YARN-3319.73.patch against trunk revision 8ddbb8d. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7426//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/7426//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7426//console This message is automatically generated. Implement a FairOrderingPolicy -- Key: YARN-3319 URL: https://issues.apache.org/jira/browse/YARN-3319 Project: Hadoop YARN Issue Type: Sub-task Components: scheduler Reporter: Craig Welch Assignee: Craig Welch Attachments: YARN-3319.13.patch, YARN-3319.14.patch, YARN-3319.17.patch, YARN-3319.35.patch, YARN-3319.39.patch, YARN-3319.45.patch, YARN-3319.47.patch, YARN-3319.53.patch, YARN-3319.58.patch, YARN-3319.70.patch, YARN-3319.71.patch, YARN-3319.72.patch, YARN-3319.73.patch Implement a FairOrderingPolicy which prefers to allocate to SchedulerProcesses with least current usage, very similar to the FairScheduler's FairSharePolicy. The Policy will offer allocations to applications in a queue in order of least resources used, and preempt applications in reverse order (from most resources used). This will include conditional support for sizeBasedWeight style adjustment Optionally, based on a conditional configuration to enable sizeBasedWeight (default false), an adjustment to boost larger applications (to offset the natural preference for smaller applications) will adjust the resource usage value based on demand, dividing it by the below value: Math.log1p(app memory demand) / Math.log(2); In cases where the above is indeterminate (two applications are equal after this comparison), behavior falls back to comparison based on the application id, which is generally lexically FIFO for that comparison -- This message was sent by Atlassian JIRA (v6.3.4#6332)
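The sizeBasedWeight adjustment described above amounts to dividing an application's usage by log2(1 + demand) before comparing. The comparator below is an illustrative sketch with made-up field names, not the actual SchedulerProcess/OrderingPolicy API:
{code}
import java.util.Comparator;

public class FairOrderingSketch {
  static class App {
    final String applicationId;
    final long usedMB;    // current resource usage
    final long demandMB;  // current demand

    App(String applicationId, long usedMB, long demandMB) {
      this.applicationId = applicationId;
      this.usedMB = usedMB;
      this.demandMB = demandMB;
    }

    // With sizeBasedWeight enabled, usage is divided by log2(1 + demand) to
    // offset the natural preference for smaller applications. Real code
    // would also guard against a zero demand.
    double weightedUsage(boolean sizeBasedWeight) {
      if (!sizeBasedWeight) {
        return usedMB;
      }
      return usedMB / (Math.log1p(demandMB) / Math.log(2));
    }
  }

  static Comparator<App> comparator(boolean sizeBasedWeight) {
    return Comparator
        .comparingDouble((App a) -> a.weightedUsage(sizeBasedWeight))
        // Tie-breaker: application id, which is generally lexically FIFO.
        .thenComparing(a -> a.applicationId);
  }

  public static void main(String[] args) {
    App big = new App("application_0001", 4096, 8192);
    App small = new App("application_0002", 2048, 1024);
    // The application with the smaller weighted usage is offered resources first.
    System.out.println(comparator(true).compare(big, small) < 0
        ? big.applicationId : small.applicationId);
  }
}
{code}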
[jira] [Commented] (YARN-3520) get rid of excessive stacktrace caused by expired cookie in timeline log
[ https://issues.apache.org/jira/browse/YARN-3520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505522#comment-14505522 ] Jonathan Eagles commented on YARN-3520: --- +1. While this jira is regarding a custom KerberosAuthenticationHandler, I agree with the excessive logging. Failing to login shouldn't cause an exception. Exceptions in the log should be confined to unexpected conditions that are unhandled. get rid of excessive stacktrace caused by expired cookie in timeline log Key: YARN-3520 URL: https://issues.apache.org/jira/browse/YARN-3520 Project: Hadoop YARN Issue Type: Bug Reporter: Chang Li Assignee: Chang Li Attachments: YARN-3520.patch {code} WARN sso.CookieValidatorHelpers: Cookie has expired by 25364187 msec WARN server.AuthenticationFilter: Authentication exception: Invalid Cookie 166 org.apache.hadoop.security.authentication.client.AuthenticationException: Invalid Bouncer Cookie 167 at KerberosAuthenticationHandler.bouncerAuthenticate(KerberosAuthenticationHandler.java:94) 168 at AuthenticationHandler.authenticate(KerberosAuthenticationHandler.java:82) 169 at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:507) 170 at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) 171 at org.apache.hadoop.yarn.server.timeline.webapp.CrossOriginFilter.doFilter(CrossOriginFilter.java:95) 172 at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) 173 at org.mortbay.servlet.UserAgentFilter.doFilter(UserAgentFilter.java:78) 174 at GzipFilter.doFilter(GzipFilter.java:188) 175 at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) 176 at org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1224) 177 at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) 178 at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45) 179 at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) 180 at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45) 181 at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) 182 at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399) 183 at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) 184 at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182) 185 at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766) 186 at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450) 187 at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230) 188 at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) 189 at org.mortbay.jetty.Server.handle(Server.java:326) 190 at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542) 191 at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928) 192 at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549) 193 at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212) 194 at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) 195 at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410) 196 at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) WARN sso.CookieValidatorHelpers: Cookie has expired by 25373197 msec {code} 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
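The change under review is essentially about how the failure is logged. Below is a standalone sketch of the idea, using SLF4J directly; it is not the actual AuthenticationFilter code, and the class and method names are assumptions. An expected failure such as an expired cookie is logged as a one-line WARN instead of passing the exception to the logger, which prints the full stack trace shown above.
{code}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class QuietAuthLoggingSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(QuietAuthLoggingSketch.class);

  static class AuthenticationException extends Exception {
    AuthenticationException(String msg) { super(msg); }
  }

  static void authenticate(boolean cookieExpired) throws AuthenticationException {
    if (cookieExpired) {
      throw new AuthenticationException("Invalid Cookie");
    }
  }

  public static void main(String[] args) {
    try {
      authenticate(true);
    } catch (AuthenticationException ex) {
      // Noisy: LOG.warn("Authentication exception: " + ex.getMessage(), ex);
      // Quiet: message only, no stack trace for an expected condition.
      LOG.warn("Authentication exception: {}", ex.getMessage());
    }
  }
}
{code}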
[jira] [Commented] (YARN-3520) get rid of excessive stacktrace caused by expired cookie in timeline log
[ https://issues.apache.org/jira/browse/YARN-3520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505549#comment-14505549 ] Chang Li commented on YARN-3520: [~jlowe] could you please help do a final review of this patch and help commit it? Thanks get rid of excessive stacktrace caused by expired cookie in timeline log Key: YARN-3520 URL: https://issues.apache.org/jira/browse/YARN-3520 Project: Hadoop YARN Issue Type: Bug Reporter: Chang Li Assignee: Chang Li Attachments: YARN-3520.patch {code} WARN sso.CookieValidatorHelpers: Cookie has expired by 25364187 msec WARN server.AuthenticationFilter: Authentication exception: Invalid Cookie 166 org.apache.hadoop.security.authentication.client.AuthenticationException: Invalid Bouncer Cookie 167 at KerberosAuthenticationHandler.bouncerAuthenticate(KerberosAuthenticationHandler.java:94) 168 at AuthenticationHandler.authenticate(KerberosAuthenticationHandler.java:82) 169 at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:507) 170 at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) 171 at org.apache.hadoop.yarn.server.timeline.webapp.CrossOriginFilter.doFilter(CrossOriginFilter.java:95) 172 at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) 173 at org.mortbay.servlet.UserAgentFilter.doFilter(UserAgentFilter.java:78) 174 at GzipFilter.doFilter(GzipFilter.java:188) 175 at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) 176 at org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1224) 177 at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) 178 at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45) 179 at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) 180 at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45) 181 at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) 182 at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399) 183 at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) 184 at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182) 185 at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766) 186 at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450) 187 at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230) 188 at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) 189 at org.mortbay.jetty.Server.handle(Server.java:326) 190 at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542) 191 at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928) 192 at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549) 193 at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212) 194 at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) 195 at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410) 196 at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) WARN sso.CookieValidatorHelpers: Cookie has expired by 25373197 msec {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3046) [Event producers] Implement MapReduce AM writing MR events to v2 ATS
[ https://issues.apache.org/jira/browse/YARN-3046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505580#comment-14505580 ] Junping Du commented on YARN-3046: -- bq. If that turns out to be the case, we could conceivably create subclasses in a standard place for MR, and have all use cases use those concrete subclasses. But I'm fine deferring that aspect a little bit. It's not a critical point. Another option is to make HierarchicalTimelineEntity non abstract? I also agree we should open a new JIRA to discuss this later. [Event producers] Implement MapReduce AM writing MR events to v2 ATS Key: YARN-3046 URL: https://issues.apache.org/jira/browse/YARN-3046 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Junping Du Attachments: YARN-3046-no-test-v2.patch, YARN-3046-no-test.patch, YARN-3046-v1-rebase.patch, YARN-3046-v1.patch, YARN-3046-v2.patch, YARN-3046-v3.patch, YARN-3046-v4.patch, YARN-3046-v5.patch, YARN-3046-v6.patch Per design in YARN-2928, select a handful of MR metrics (e.g. HDFS bytes written) and have the MR AM write the framework-specific metrics to ATS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3437) convert load test driver to timeline service v.2
[ https://issues.apache.org/jira/browse/YARN-3437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505593#comment-14505593 ] Junping Du commented on YARN-3437: -- I agree too, given we don't have a clear plan for YARN-2556 so far; there is no reason to block other ongoing efforts. One suggestion (optional only): can we adjust the name (or package path) slightly for the file duplicated with YARN-2556 (TimelineServerPerformance.java)? We could then have an additional patch that removes the duplicated file once YARN-2556 gets into trunk. I assume this would make rebasing YARN-2928 back onto trunk/branch-2 easier, with fewer conflicts. Thoughts? convert load test driver to timeline service v.2 Key: YARN-3437 URL: https://issues.apache.org/jira/browse/YARN-3437 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Sangjin Lee Attachments: YARN-3437.001.patch, YARN-3437.002.patch This subtask covers the work for converting the proposed patch for the load test driver (YARN-2556) to work with the timeline service v.2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3517) RM web ui for dumping scheduler logs should be for admins only
[ https://issues.apache.org/jira/browse/YARN-3517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505301#comment-14505301 ] Sunil G commented on YARN-3517: --- Thanks [~vvasudev] Patch looks good. RM web ui for dumping scheduler logs should be for admins only -- Key: YARN-3517 URL: https://issues.apache.org/jira/browse/YARN-3517 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, security Affects Versions: 2.7.0 Reporter: Varun Vasudev Assignee: Varun Vasudev Priority: Blocker Labels: security Attachments: YARN-3517.001.patch, YARN-3517.002.patch, YARN-3517.003.patch YARN-3294 allows users to dump scheduler logs from the web UI. This should be for admins only. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3516) killing ContainerLocalizer action doesn't take effect when private localizer receives FETCH_FAILURE status.
[ https://issues.apache.org/jira/browse/YARN-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505326#comment-14505326 ] Xuan Gong commented on YARN-3516: - [~zxu] Thanks for working on this jira. I will take a look shortly. killing ContainerLocalizer action doesn't take effect when private localizer receives FETCH_FAILURE status. --- Key: YARN-3516 URL: https://issues.apache.org/jira/browse/YARN-3516 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: zhihai xu Assignee: zhihai xu Attachments: YARN-3516.000.patch killing ContainerLocalizer action doesn't take effect when private localizer receives FETCH_FAILURE status. This is a typo from YARN-3024. With YARN-3024, ContainerLocalizer will be killed only if {{action}} is set to {{LocalizerAction.DIE}}, calling {{response.setLocalizerAction}} will be overwritten. This is also a regression from old code. Also it make sense to kill the ContainerLocalizer when FETCH_FAILURE happened, because the container will send CLEANUP_CONTAINER_RESOURCES event after localization failure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3482) Report NM available resources in heartbeat
[ https://issues.apache.org/jira/browse/YARN-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505420#comment-14505420 ] Inigo Goiri commented on YARN-3482: --- I agree, 2 is more distributed and is a better fit for the model that we want to push. I'll implement it today. Report NM available resources in heartbeat -- Key: YARN-3482 URL: https://issues.apache.org/jira/browse/YARN-3482 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager, resourcemanager Affects Versions: 2.7.0 Reporter: Inigo Goiri Original Estimate: 504h Remaining Estimate: 504h NMs are usually collocated with other processes like HDFS, Impala or HBase. To manage this scenario correctly, YARN should be aware of the actual available resources. The proposal is to have an interface to dynamically change the available resources and report this to the RM in every heartbeat. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3482) Report NM available resources in heartbeat
[ https://issues.apache.org/jira/browse/YARN-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505440#comment-14505440 ] Lei Guo commented on YARN-3482: --- What's the relationship between this and 3332? They should be considered together. Report NM available resources in heartbeat -- Key: YARN-3482 URL: https://issues.apache.org/jira/browse/YARN-3482 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager, resourcemanager Affects Versions: 2.7.0 Reporter: Inigo Goiri Original Estimate: 504h Remaining Estimate: 504h NMs are usually collocated with other processes like HDFS, Impala or HBase. To manage this scenario correctly, YARN should be aware of the actual available resources. The proposal is to have an interface to dynamically change the available resources and report this to the RM in every heartbeat. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3431) Sub resources of timeline entity needs to be passed to a separate endpoint.
[ https://issues.apache.org/jira/browse/YARN-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505334#comment-14505334 ] Sangjin Lee commented on YARN-3431: --- It looks good to me. One small suggestion (it's not critical but would be nicer): It would be a little more consistent and perform slightly better if the type check in getChildren() is consolidated into validateChildren(). In validateChildren() we iterate over the set anyway, and we could do the type check as part of validating it. What do you think? Sub resources of timeline entity needs to be passed to a separate endpoint. --- Key: YARN-3431 URL: https://issues.apache.org/jira/browse/YARN-3431 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Zhijie Shen Assignee: Zhijie Shen Attachments: YARN-3431.1.patch, YARN-3431.2.patch, YARN-3431.3.patch, YARN-3431.4.patch, YARN-3431.5.patch We have TimelineEntity and some other entities as subclass that inherit from it. However, we only have a single endpoint, which consume TimelineEntity rather than sub-classes and this endpoint will check the incoming request body contains exactly TimelineEntity object. However, the json data which is serialized from sub-class object seems not to be treated as an TimelineEntity object, and won't be deserialized into the corresponding sub-class object which cause deserialization failure as some discussions in YARN-3334 : https://issues.apache.org/jira/browse/YARN-3334?focusedCommentId=14391059page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14391059. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
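The suggestion above is to fold the type check into the pass that validateChildren() already makes over the child set. A rough sketch with hypothetical names follows; the real TimelineEntity API differs.
{code}
import java.util.LinkedHashSet;
import java.util.Set;

public class ValidateChildrenSketch {
  static class Identifier {
    final String type;
    final String id;
    Identifier(String type, String id) { this.type = type; this.id = id; }
  }

  private final Set<Identifier> children = new LinkedHashSet<>();
  private final String expectedChildType;

  ValidateChildrenSketch(String expectedChildType) {
    this.expectedChildType = expectedChildType;
  }

  void validateChildren() {
    for (Identifier child : children) {
      if (child.id == null) {
        throw new IllegalStateException("child id cannot be null");
      }
      // Type check consolidated into the same iteration as the other checks.
      if (!expectedChildType.equals(child.type)) {
        throw new IllegalStateException("unexpected child type: " + child.type);
      }
    }
  }

  Set<Identifier> getChildren() {
    // The per-child type check that used to live here is now part of
    // validateChildren(), which iterates over the set anyway.
    return children;
  }

  public static void main(String[] args) {
    ValidateChildrenSketch entity = new ValidateChildrenSketch("YARN_CONTAINER");
    entity.children.add(new Identifier("YARN_CONTAINER", "container_1"));
    entity.validateChildren(); // passes; a wrong type would throw here
  }
}
{code}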
[jira] [Commented] (YARN-3482) Report NM available resources in heartbeat
[ https://issues.apache.org/jira/browse/YARN-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505396#comment-14505396 ] Karthik Kambatla commented on YARN-3482: I like 2 better. Report NM available resources in heartbeat -- Key: YARN-3482 URL: https://issues.apache.org/jira/browse/YARN-3482 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager, resourcemanager Affects Versions: 2.7.0 Reporter: Inigo Goiri Original Estimate: 504h Remaining Estimate: 504h NMs are usually collocated with other processes like HDFS, Impala or HBase. To manage this scenario correctly, YARN should be aware of the actual available resources. The proposal is to have an interface to dynamically change the available resources and report this to the RM in every heartbeat. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-3521) Support return structured NodeLabel objects in REST API when call getClusterNodeLabels
[ https://issues.apache.org/jira/browse/YARN-3521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G reassigned YARN-3521: - Assignee: Sunil G Support return structured NodeLabel objects in REST API when call getClusterNodeLabels -- Key: YARN-3521 URL: https://issues.apache.org/jira/browse/YARN-3521 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Wangda Tan Assignee: Sunil G In YARN-3413, yarn cluster CLI returns NodeLabel instead of String, we should make the same change on the REST API side to keep them consistent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3468) NM should not blindly rename usercache/filecache/nmPrivate on restart
[ https://issues.apache.org/jira/browse/YARN-3468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505561#comment-14505561 ] Hadoop QA commented on YARN-3468: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726942/YARN-3468.v2.patch against trunk revision 424a00d. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerReboot org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7430//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7430//console This message is automatically generated. NM should not blindly rename usercache/filecache/nmPrivate on restart - Key: YARN-3468 URL: https://issues.apache.org/jira/browse/YARN-3468 Project: Hadoop YARN Issue Type: Bug Reporter: Siqi Li Assignee: Siqi Li Attachments: YARN-3468.v1.patch, YARN-3468.v2.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3445) Cache runningApps in RMNode for getting running apps on given NodeId
[ https://issues.apache.org/jira/browse/YARN-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505333#comment-14505333 ] Vinod Kumar Vavilapalli commented on YARN-3445: --- Better than before; I will comment once I see an updated patch. Cache runningApps in RMNode for getting running apps on given NodeId Key: YARN-3445 URL: https://issues.apache.org/jira/browse/YARN-3445 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager, resourcemanager Affects Versions: 2.7.0 Reporter: Junping Du Assignee: Junping Du Attachments: YARN-3445.patch Per discussion in YARN-3334, we need to filter out unnecessary collector info from the RM in the heartbeat response. Our proposal is to add a cache of runningApps in RMNode, so the RM only sends back collectors for locally running apps. This is also needed in YARN-914 (graceful decommission): if there are no running apps on an NM in the decommissioning stage, it will get decommissioned immediately. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3516) killing ContainerLocalizer action doesn't take effect when private localizer receives FETCH_FAILURE status.
[ https://issues.apache.org/jira/browse/YARN-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505409#comment-14505409 ] Hadoop QA commented on YARN-3516: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726922/YARN-3516.000.patch against trunk revision 8ddbb8d. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServicesApps org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServices org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServicesContainers Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7428//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7428//console This message is automatically generated. killing ContainerLocalizer action doesn't take effect when private localizer receives FETCH_FAILURE status. --- Key: YARN-3516 URL: https://issues.apache.org/jira/browse/YARN-3516 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: zhihai xu Assignee: zhihai xu Attachments: YARN-3516.000.patch killing ContainerLocalizer action doesn't take effect when private localizer receives FETCH_FAILURE status. This is a typo from YARN-3024. With YARN-3024, ContainerLocalizer will be killed only if {{action}} is set to {{LocalizerAction.DIE}}, calling {{response.setLocalizerAction}} will be overwritten. This is also a regression from old code. Also it make sense to kill the ContainerLocalizer when FETCH_FAILURE happened, because the container will send CLEANUP_CONTAINER_RESOURCES event after localization failure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3514) Active directory usernames like domain\login cause YARN failures
[ https://issues.apache.org/jira/browse/YARN-3514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated YARN-3514: Attachment: YARN-3514.002.patch In the first patch, the new test passed for me locally but failed on Jenkins. I think this is because I was using a hard-coded destination path for the localized resource, and this might have caused a permissions violation on the Jenkins host. Here is patch v002. I changed the test so that the localized resource is relative to the user's filecache, which is in the proper test working directory. I also added a second test to make sure that we don't accidentally URI-decode anything. bq. I am very impressed with the short time it took to patch. Thanks! Before we declare victory though, can you check that your local file system allows the '\' character in file and directory names? The patch here definitely fixes a bug, but testing the '\' character on your local file system will tell us whether or not the whole problem is resolved for your deployment. Even better would be if you have the capability to test with my patch applied. Active directory usernames like domain\login cause YARN failures Key: YARN-3514 URL: https://issues.apache.org/jira/browse/YARN-3514 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.2.0 Environment: CentOS6 Reporter: john lilley Assignee: Chris Nauroth Priority: Minor Attachments: YARN-3514.001.patch, YARN-3514.002.patch We have a 2.2.0 (Cloudera 5.3) cluster running on CentOS6 that is Kerberos-enabled and uses an external AD domain controller for the KDC. We are able to authenticate, browse HDFS, etc. However, YARN fails during localization because it seems to get confused by the presence of a \ character in the local user name. Our AD authentication on the nodes goes through sssd and set configured to map AD users onto the form domain\username. For example, our test user has a Kerberos principal of hadoopu...@domain.com and that maps onto a CentOS user domain\hadoopuser. We have no problem validating that user with PAM, logging in as that user, su-ing to that user, etc. However, when we attempt to run a YARN application master, the localization step fails when setting up the local cache directory for the AM. 
The error that comes out of the RM logs: 2015-04-17 12:47:09 INFO net.redpoint.yarnapp.Client[0]: monitorApplication: ApplicationReport: appId=1, state=FAILED, progress=0.0, finalStatus=FAILED, diagnostics='Application application_1429295486450_0001 failed 1 times due to AM Container for appattempt_1429295486450_0001_01 exited with exitCode: -1000 due to: Application application_1429295486450_0001 initialization failed (exitCode=255) with output: main : command provided 0 main : user is DOMAIN\hadoopuser main : requested yarn user is domain\hadoopuser org.apache.hadoop.util.DiskChecker$DiskErrorException: Cannot create directory: /data/yarn/nm/usercache/domain%5Chadoopuser/appcache/application_1429295486450_0001/filecache/10 at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:105) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.download(ContainerLocalizer.java:199) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:241) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:169) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.main(ContainerLocalizer.java:347) .Failing this attempt.. Failing the application.' However, when we look on the node launching the AM, we see this: [root@rpb-cdh-kerb-2 ~]# cd /data/yarn/nm/usercache [root@rpb-cdh-kerb-2 usercache]# ls -l drwxr-s--- 4 DOMAIN\hadoopuser yarn 4096 Apr 17 12:10 domain\hadoopuser There appears to be different treatment of the \ character in different places. Something creates the directory as domain\hadoopuser but something else later attempts to use it as domain%5Chadoopuser. I’m not sure where or why the URL escapement converts the \ to %5C or why this is not consistent. I should also mention, for the sake of completeness, our auth_to_local rule is set up to map u...@domain.com to domain\user: RULE:[1:$1@$0](^.*@DOMAIN\.COM$)s/^(.*)@DOMAIN\.COM$/domain\\$1/g -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3046) [Event producers] Implement MapReduce AM writing MR events/counters to v2 ATS
[ https://issues.apache.org/jira/browse/YARN-3046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3046: - Summary: [Event producers] Implement MapReduce AM writing MR events/counters to v2 ATS (was: [Event producers] Implement MapReduce AM writing MR events to v2 ATS) [Event producers] Implement MapReduce AM writing MR events/counters to v2 ATS - Key: YARN-3046 URL: https://issues.apache.org/jira/browse/YARN-3046 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Junping Du Attachments: YARN-3046-no-test-v2.patch, YARN-3046-no-test.patch, YARN-3046-v1-rebase.patch, YARN-3046-v1.patch, YARN-3046-v2.patch, YARN-3046-v3.patch, YARN-3046-v4.patch, YARN-3046-v5.patch, YARN-3046-v6.patch Per design in YARN-2928, select a handful of MR metrics (e.g. HDFS bytes written) and have the MR AM write the framework-specific metrics to ATS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3520) get rid of excessive stacktrace caused by expired cookie in timeline log
[ https://issues.apache.org/jira/browse/YARN-3520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505425#comment-14505425 ] Mit Desai commented on YARN-3520: - lgtm +1 (non-binding) This change is related to logging so there is no need for tests. get rid of excessive stacktrace caused by expired cookie in timeline log Key: YARN-3520 URL: https://issues.apache.org/jira/browse/YARN-3520 Project: Hadoop YARN Issue Type: Bug Reporter: Chang Li Assignee: Chang Li Attachments: YARN-3520.patch {code} WARN sso.CookieValidatorHelpers: Cookie has expired by 25364187 msec WARN server.AuthenticationFilter: Authentication exception: Invalid Cookie 166 org.apache.hadoop.security.authentication.client.AuthenticationException: Invalid Bouncer Cookie 167 at KerberosAuthenticationHandler.bouncerAuthenticate(KerberosAuthenticationHandler.java:94) 168 at AuthenticationHandler.authenticate(KerberosAuthenticationHandler.java:82) 169 at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:507) 170 at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) 171 at org.apache.hadoop.yarn.server.timeline.webapp.CrossOriginFilter.doFilter(CrossOriginFilter.java:95) 172 at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) 173 at org.mortbay.servlet.UserAgentFilter.doFilter(UserAgentFilter.java:78) 174 at GzipFilter.doFilter(GzipFilter.java:188) 175 at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) 176 at org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1224) 177 at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) 178 at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45) 179 at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) 180 at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45) 181 at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) 182 at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399) 183 at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) 184 at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182) 185 at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766) 186 at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450) 187 at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230) 188 at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) 189 at org.mortbay.jetty.Server.handle(Server.java:326) 190 at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542) 191 at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928) 192 at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549) 193 at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212) 194 at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) 195 at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410) 196 at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) WARN sso.CookieValidatorHelpers: Cookie has expired by 25373197 msec {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3434) Interaction between reservations and userlimit can result in significant ULF violation
[ https://issues.apache.org/jira/browse/YARN-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505602#comment-14505602 ] Hadoop QA commented on YARN-3434: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726966/YARN-3434.patch against trunk revision 997408e. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7432//console This message is automatically generated. Interaction between reservations and userlimit can result in significant ULF violation -- Key: YARN-3434 URL: https://issues.apache.org/jira/browse/YARN-3434 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 2.6.0 Reporter: Thomas Graves Assignee: Thomas Graves Attachments: YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, YARN-3434.patch ULF was set to 1.0 User was able to consume 1.4X queue capacity. It looks like when this application launched, it reserved about 1000 containers, each 8G each, within about 5 seconds. I think this allowed the logic in assignToUser() to allow the userlimit to be surpassed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3520) get rid of excessive stacktrace caused by expired cookie in timeline log
[ https://issues.apache.org/jira/browse/YARN-3520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505311#comment-14505311 ] Hadoop QA commented on YARN-3520: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726911/YARN-3520.patch against trunk revision 8ddbb8d. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-auth. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7425//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7425//console This message is automatically generated. get rid of excessive stacktrace caused by expired cookie in timeline log Key: YARN-3520 URL: https://issues.apache.org/jira/browse/YARN-3520 Project: Hadoop YARN Issue Type: Bug Reporter: Chang Li Assignee: Chang Li Attachments: YARN-3520.patch {code} WARN sso.CookieValidatorHelpers: Cookie has expired by 25364187 msec WARN server.AuthenticationFilter: Authentication exception: Invalid Cookie 166 org.apache.hadoop.security.authentication.client.AuthenticationException: Invalid Bouncer Cookie 167 at KerberosAuthenticationHandler.bouncerAuthenticate(KerberosAuthenticationHandler.java:94) 168 at AuthenticationHandler.authenticate(KerberosAuthenticationHandler.java:82) 169 at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:507) 170 at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) 171 at org.apache.hadoop.yarn.server.timeline.webapp.CrossOriginFilter.doFilter(CrossOriginFilter.java:95) 172 at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) 173 at org.mortbay.servlet.UserAgentFilter.doFilter(UserAgentFilter.java:78) 174 at GzipFilter.doFilter(GzipFilter.java:188) 175 at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) 176 at org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1224) 177 at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) 178 at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45) 179 at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) 180 at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45) 181 at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) 182 at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399) 183 at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) 184 at 
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182) 185 at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766) 186 at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450) 187 at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230) 188 at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) 189 at org.mortbay.jetty.Server.handle(Server.java:326) 190 at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542) 191 at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928) 192 at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549) 193 at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212) 194 at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) 195 at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410) 196 at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) WARN sso.CookieValidatorHelpers: Cookie has expired by 25373197 msec {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3521) Support return structured NodeLabel objects in REST API when call getClusterNodeLabels
[ https://issues.apache.org/jira/browse/YARN-3521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505437#comment-14505437 ] Sunil G commented on YARN-3521: --- I have recently done some work on the REST APIs. I would like to take this over; please reassign otherwise. Support return structured NodeLabel objects in REST API when call getClusterNodeLabels -- Key: YARN-3521 URL: https://issues.apache.org/jira/browse/YARN-3521 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Wangda Tan Assignee: Sunil G In YARN-3413, the yarn cluster CLI returns NodeLabel instead of String; we should make the same change on the REST API side to keep them consistent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
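For illustration, one plausible shape for a structured NodeLabel in the REST response is a small JAXB DAO carrying the label name plus its attributes rather than a bare string; the class and field names here are assumptions, not the eventual API.
{code}
import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlRootElement;

// Hedged sketch of a structured node-label DAO for the RM web services.
@XmlRootElement(name = "nodeLabel")
@XmlAccessorType(XmlAccessType.FIELD)
public class NodeLabelInfoSketch {
  private String name;
  private boolean exclusive = true;

  public NodeLabelInfoSketch() { }          // required by JAXB

  public NodeLabelInfoSketch(String name, boolean exclusive) {
    this.name = name;
    this.exclusive = exclusive;
  }

  public String getName() { return name; }
  public boolean isExclusive() { return exclusive; }
}
{code}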
[jira] [Updated] (YARN-3434) Interaction between reservations and userlimit can result in significant ULF violation
[ https://issues.apache.org/jira/browse/YARN-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated YARN-3434: Attachment: YARN-3434.patch Updated based on review comments. Interaction between reservations and userlimit can result in significant ULF violation -- Key: YARN-3434 URL: https://issues.apache.org/jira/browse/YARN-3434 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 2.6.0 Reporter: Thomas Graves Assignee: Thomas Graves Attachments: YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, YARN-3434.patch ULF was set to 1.0. A user was able to consume 1.4X the queue capacity. It looks like when this application launched, it reserved about 1000 containers, 8G each, within about 5 seconds. I think this allowed the logic in assignToUser() to let the userlimit be surpassed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
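The guard being tightened can be illustrated with plain arithmetic (this is not the CapacityScheduler code): before a reservation or allocation is granted, the user's consumption plus the pending request should be checked against the user limit, so a burst of reservations cannot blow past it.
{code}
// Illustrative only: an assignToUser()-style guard, with resources reduced to MB.
public class UserLimitCheck {
  static boolean canAssign(long consumedMB, long requestedMB, long userLimitMB) {
    // Refuse the assignment if granting it would push the user past the limit.
    return consumedMB + requestedMB <= userLimitMB;
  }

  public static void main(String[] args) {
    // With a 100 GB user limit, a burst of 8 GB reservations must stop at the limit.
    System.out.println(canAssign(96_000, 8_000, 100_000)); // false
  }
}
{code}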
[jira] [Commented] (YARN-3410) YARN admin should be able to remove individual application records from RMStateStore
[ https://issues.apache.org/jira/browse/YARN-3410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505299#comment-14505299 ] Vinod Kumar Vavilapalli commented on YARN-3410: --- Seems like this is going in first. If not, this should also take care of YARN-2268. YARN admin should be able to remove individual application records from RMStateStore Key: YARN-3410 URL: https://issues.apache.org/jira/browse/YARN-3410 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager, yarn Reporter: Wangda Tan Assignee: Rohith Priority: Critical Attachments: 0001-YARN-3410-v1.patch, 0001-YARN-3410.patch, 0001-YARN-3410.patch, 0002-YARN-3410.patch, 0003-YARN-3410.patch, 0004-YARN-3410-branch-2.patch, 0004-YARN-3410.patch When the RM state store enters an unexpected state (one example is YARN-2340, where an attempt is not in a final state but the app has already completed), the RM can never come up unless the RMStateStore is formatted. I think we should support removing individual application records from the RMStateStore to unblock the RM admin, who can then choose between waiting for a fix and formatting the state store. In addition, the RM should be able to report all fatal errors (which will shut down the RM) during app recovery; this can save the admin some time when removing apps in a bad state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
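A rough sketch of the operation the proposal asks for, with hypothetical names (the actual patch defines its own RM CLI entry point and store methods):
{code}
// Hypothetical sketch: an admin-facing hook that deletes the persisted record of a
// single application so the RM can recover even when that one record is corrupt.
public abstract class RemovableStateStore {
  /** Permanently drop the stored state of one application (name is illustrative). */
  public abstract void removeApplicationState(String applicationId) throws Exception;

  /** Example admin flow: remove the bad record instead of formatting the whole store. */
  public void recoverFromBadRecord(String badAppId) throws Exception {
    removeApplicationState(badAppId);
  }
}
{code}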
[jira] [Updated] (YARN-3413) Node label attributes (like exclusivity) should settable via addToClusterNodeLabels but shouldn't be changeable at runtime
[ https://issues.apache.org/jira/browse/YARN-3413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-3413: -- Summary: Node label attributes (like exclusivity) should settable via addToClusterNodeLabels but shouldn't be changeable at runtime (was: Node label attributes (like exclusive or not) should be able to set when addToClusterNodeLabels and shouldn't be changed during runtime) Node label attributes (like exclusivity) should settable via addToClusterNodeLabels but shouldn't be changeable at runtime -- Key: YARN-3413 URL: https://issues.apache.org/jira/browse/YARN-3413 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-3413.1.patch, YARN-3413.2.patch As mentioned in : https://issues.apache.org/jira/browse/YARN-3345?focusedCommentId=14384947page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14384947. Changing node label exclusivity and/or other attributes may not be a real use case, and also we should support setting node label attributes whiling adding them to cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3519) registerApplicationMaster couldn't get all running containers if rm is rebuilding container info while am is relaunched
[ https://issues.apache.org/jira/browse/YARN-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505378#comment-14505378 ] Jian He commented on YARN-3519: --- [~sandflee], is this the issue that the AM could re-register with the RM before containers are actually recovered in the RM? This is a known issue which is tracked at YARN-2038. registerApplicationMaster couldn't get all running containers if rm is rebuilding container info while am is relaunched Key: YARN-3519 URL: https://issues.apache.org/jira/browse/YARN-3519 Project: Hadoop YARN Issue Type: Bug Reporter: sandflee 1. The RM fails over and has recovered all app info but not all container info. 2. The AM is relaunched and registers with the RM. 3. The NM with containers launched by the AM re-registers with the RM. The containers on the NM and the corresponding NMTokens cannot be passed to the AM. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3521) Support return structured NodeLabel objects in REST API when call getClusterNodeLabels
Wangda Tan created YARN-3521: Summary: Support return structured NodeLabel objects in REST API when call getClusterNodeLabels Key: YARN-3521 URL: https://issues.apache.org/jira/browse/YARN-3521 Project: Hadoop YARN Issue Type: Sub-task Reporter: Wangda Tan In YARN-3413, yarn cluster CLI returns NodeLabel instead of String, we should make the same change in REST API side to make them consistency. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3482) Report NM available resources in heartbeat
[ https://issues.apache.org/jira/browse/YARN-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505476#comment-14505476 ] Inigo Goiri commented on YARN-3482: --- [~grey], the ultimate target of this task is to provide an interface for external applications to change the amount of available resources in a node. A part of YARN-3332 targets a smarter way of calculating the amount of resources available to an NM; this can be somewhat related, but I think this effort is still needed. Anyway, thanks for the pointer, as I'm targeting some of the sub-tasks described in that task. Report NM available resources in heartbeat -- Key: YARN-3482 URL: https://issues.apache.org/jira/browse/YARN-3482 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager, resourcemanager Affects Versions: 2.7.0 Reporter: Inigo Goiri Original Estimate: 504h Remaining Estimate: 504h NMs are usually collocated with other processes like HDFS, Impala or HBase. To manage this scenario correctly, YARN should be aware of the actual available resources. The proposal is to have an interface to dynamically change the available resources and report this to the RM in every heartbeat. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
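A sketch of the interface the task describes, with assumed names (not the eventual YARN API): an external agent adjusts what the NM considers available, and the heartbeat path reports that value to the RM.
{code}
import org.apache.hadoop.yarn.api.records.Resource;

// Illustrative interface only; method names are assumptions.
public interface NodeResourceReporter {
  /** Called by an external monitor for co-located services (HDFS, HBase, Impala, ...). */
  void setAvailableResource(Resource available);

  /** Read when building the node status sent to the RM on every heartbeat. */
  Resource getAvailableResource();
}
{code}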
[jira] [Commented] (YARN-2268) Disallow formatting the RMStateStore when there is an RM running
[ https://issues.apache.org/jira/browse/YARN-2268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505322#comment-14505322 ] Xuan Gong commented on YARN-2268: - bq. If an active RM creates an "I am using the state-store" lock-file, then the command can bail out. Similarly, the command can create an "I am blowing up the state-store while you were presumably away" lock-file, so that the RM can crash deterministically when a format is in progress. +1 for the proposal. This is probably the simplest way to fix the issue. Disallow formatting the RMStateStore when there is an RM running Key: YARN-2268 URL: https://issues.apache.org/jira/browse/YARN-2268 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.6.0 Reporter: Karthik Kambatla Assignee: Rohith Attachments: 0001-YARN-2268.patch YARN-2131 adds a way to format the RMStateStore. However, it can be a problem if we format the store while an RM is actively using it. It would be nice to fail the format if there is an RM running and using this store. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
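A minimal sketch of the lock-file idea in the quoted proposal, assuming a filesystem-backed store purely for illustration (the real store may be ZooKeeper- or HDFS-based, and the marker names are made up):
{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hedged sketch: a running RM drops an "in-use" marker while it holds the store;
// the format command refuses to run when that marker exists, and drops its own
// "formatting" marker so a concurrently running RM can fail fast.
public class StateStoreLocks {
  private final Path inUse;
  private final Path formatting;

  StateStoreLocks(String storeDir) {
    this.inUse = Paths.get(storeDir, "RM_IN_USE");
    this.formatting = Paths.get(storeDir, "FORMAT_IN_PROGRESS");
  }

  void beforeFormat() throws IOException {
    if (Files.exists(inUse)) {
      throw new IOException("An RM appears to be using this store; refusing to format");
    }
    Files.createFile(formatting);            // make a live RM crash deterministically
  }

  void onRmStart() throws IOException {
    if (Files.exists(formatting)) {
      throw new IOException("State store is being formatted; shutting down");
    }
    Files.createFile(inUse);
  }
}
{code}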
[jira] [Commented] (YARN-3437) convert load test driver to timeline service v.2
[ https://issues.apache.org/jira/browse/YARN-3437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505382#comment-14505382 ] Sangjin Lee commented on YARN-3437: --- I think we need to make progress on this as this is blocking other JIRAs and also it's tied to the schema evaluation. My vote is to get this committed, and adjust this once YARN-2556 lands and we rebase. Thoughts? convert load test driver to timeline service v.2 Key: YARN-3437 URL: https://issues.apache.org/jira/browse/YARN-3437 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Sangjin Lee Attachments: YARN-3437.001.patch, YARN-3437.002.patch This subtask covers the work for converting the proposed patch for the load test driver (YARN-2556) to work with the timeline service v.2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3468) NM should not blindly rename usercache/filecache/nmPrivate on restart
[ https://issues.apache.org/jira/browse/YARN-3468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated YARN-3468: -- Attachment: YARN-3468.v2.patch NM should not blindly rename usercache/filecache/nmPrivate on restart - Key: YARN-3468 URL: https://issues.apache.org/jira/browse/YARN-3468 Project: Hadoop YARN Issue Type: Bug Reporter: Siqi Li Assignee: Siqi Li Attachments: YARN-3468.v1.patch, YARN-3468.v2.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2268) Disallow formatting the RMStateStore when there is an RM running
[ https://issues.apache.org/jira/browse/YARN-2268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505296#comment-14505296 ] Vinod Kumar Vavilapalli commented on YARN-2268: --- Further, this should also take care of YARN-3410, whichever patch goes in first. Disallow formatting the RMStateStore when there is an RM running Key: YARN-2268 URL: https://issues.apache.org/jira/browse/YARN-2268 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.6.0 Reporter: Karthik Kambatla Assignee: Rohith Attachments: 0001-YARN-2268.patch YARN-2131 adds a way to format the RMStateStore. However, it can be a problem if we format the store while an RM is actively using it. It would be nice to fail the format if there is an RM running and using this store. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3516) killing ContainerLocalizer action doesn't take effect when private localizer receives FETCH_FAILURE status.
[ https://issues.apache.org/jira/browse/YARN-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated YARN-3516: Attachment: YARN-3516.000.patch killing ContainerLocalizer action doesn't take effect when private localizer receives FETCH_FAILURE status. --- Key: YARN-3516 URL: https://issues.apache.org/jira/browse/YARN-3516 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: zhihai xu Assignee: zhihai xu Attachments: YARN-3516.000.patch The killing ContainerLocalizer action doesn't take effect when the private localizer receives a FETCH_FAILURE status. This is a typo introduced by YARN-3024: with YARN-3024, the ContainerLocalizer will be killed only if {{action}} is set to {{LocalizerAction.DIE}}, but the later call to {{response.setLocalizerAction}} overwrites that value. This is also a regression from the old code. It also makes sense to kill the ContainerLocalizer when a FETCH_FAILURE happens, because the container will send a CLEANUP_CONTAINER_RESOURCES event after the localization failure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
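The bug pattern and its fix can be sketched as follows (illustrative only; the enum value mirrors the NM's LocalizerAction.DIE but the class is not the real heartbeat code): keep the kill decision in one place and never let a later status overwrite it.
{code}
// Hedged sketch of the overwrite bug described above.
enum LocalizerAction { LIVE, DIE }

class HeartbeatResponseSketch {
  private LocalizerAction action = LocalizerAction.LIVE;

  void onFetchFailure() {
    action = LocalizerAction.DIE;   // decide to kill the ContainerLocalizer
  }

  void onOtherStatus() {
    // Bug pattern: unconditionally resetting the action here would overwrite DIE.
    // Keep the existing decision unless we explicitly want to change it.
    if (action != LocalizerAction.DIE) {
      action = LocalizerAction.LIVE;
    }
  }

  LocalizerAction buildResponseAction() {
    return action;                  // applied to the response exactly once
  }
}
{code}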
[jira] [Updated] (YARN-3413) Node label attributes (like exclusivity) should settable via addToClusterNodeLabels but shouldn't be changeable at runtime
[ https://issues.apache.org/jira/browse/YARN-3413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-3413: - Attachment: YARN-3413.3.patch Thanks for review, [~vinodkv]: bq. We should simply force the labelName to follow a ( ) block - i.e. anything next to a comma up to the left parenthesis is a label. Right? I think we need to support both; if we enforce this, it will be an incompatible behavior change. Other comments are all addressed. In addition: - Make GetClusterNodeLabelResponse return NodeLabel instead of String. - Filed YARN-3521 to track REST API changes. Node label attributes (like exclusivity) should settable via addToClusterNodeLabels but shouldn't be changeable at runtime -- Key: YARN-3413 URL: https://issues.apache.org/jira/browse/YARN-3413 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-3413.1.patch, YARN-3413.2.patch, YARN-3413.3.patch As mentioned in: https://issues.apache.org/jira/browse/YARN-3345?focusedCommentId=14384947page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14384947, changing node label exclusivity and/or other attributes may not be a real use case, and we should also support setting node label attributes while adding them to the cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
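To make the syntax under discussion concrete, here is an illustrative parser for label specs of the form "x(exclusive=false),y", where the attribute block is optional; this is a sketch of the grammar being debated, not the code in the patch.
{code}
import java.util.LinkedHashMap;
import java.util.Map;

// Hedged sketch: map each label name to its exclusivity flag, defaulting to exclusive.
public class NodeLabelSpecParser {
  static Map<String, Boolean> parse(String spec) {
    Map<String, Boolean> labels = new LinkedHashMap<>();
    for (String part : spec.split(",")) {
      part = part.trim();
      int open = part.indexOf('(');
      if (open < 0) {
        labels.put(part, Boolean.TRUE);               // no attributes: exclusive label
      } else {
        String name = part.substring(0, open);
        String attrs = part.substring(open + 1, part.lastIndexOf(')'));
        boolean exclusive = !attrs.contains("exclusive=false");
        labels.put(name, exclusive);
      }
    }
    return labels;
  }

  public static void main(String[] args) {
    System.out.println(parse("x(exclusive=false),y"));  // {x=false, y=true}
  }
}
{code}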
[jira] [Commented] (YARN-3517) RM web ui for dumping scheduler logs should be for admins only
[ https://issues.apache.org/jira/browse/YARN-3517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505501#comment-14505501 ] Hadoop QA commented on YARN-3517: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726919/YARN-3517.003.patch against trunk revision 8ddbb8d. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7427//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7427//console This message is automatically generated. RM web ui for dumping scheduler logs should be for admins only -- Key: YARN-3517 URL: https://issues.apache.org/jira/browse/YARN-3517 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, security Affects Versions: 2.7.0 Reporter: Varun Vasudev Assignee: Varun Vasudev Priority: Blocker Labels: security Attachments: YARN-3517.001.patch, YARN-3517.002.patch, YARN-3517.003.patch YARN-3294 allows users to dump scheduler logs from the web UI. This should be for admins only. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3521) Support return structured NodeLabel objects in REST API when call getClusterNodeLabels
[ https://issues.apache.org/jira/browse/YARN-3521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505503#comment-14505503 ] Wangda Tan commented on YARN-3521: -- [~sunilg], thanks for taking this, it's yours :) Support return structured NodeLabel objects in REST API when call getClusterNodeLabels -- Key: YARN-3521 URL: https://issues.apache.org/jira/browse/YARN-3521 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Wangda Tan Assignee: Sunil G In YARN-3413, yarn cluster CLI returns NodeLabel instead of String, we should make the same change in REST API side to make them consistency. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3519) registerApplicationMaster couldn't get all running containers if rm is rebuilding container info while am is relaunched
sandflee created YARN-3519: -- Summary: registerApplicationMaster couldn't get all running containers if rm is rebuilding container info while am is relaunched Key: YARN-3519 URL: https://issues.apache.org/jira/browse/YARN-3519 Project: Hadoop YARN Issue Type: Bug Reporter: sandflee 1. The RM fails over and has recovered all app info but not all container info. 2. The AM is relaunched and registers with the RM. 3. The NM with containers launched by the AM re-registers with the RM. The containers on the NM and the corresponding NMTokens cannot be passed to the AM. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3463) Integrate OrderingPolicy Framework with CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-3463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504813#comment-14504813 ] Hudson commented on YARN-3463: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #161 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/161/]) YARN-3463. Integrate OrderingPolicy Framework with CapacityScheduler. (Craig Welch via wangda) (wangda: rev 44872b76fcc0ddfbc7b0a4e54eef50fe8708e0f5) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/policy/AbstractComparatorOrderingPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/policy/FifoOrderingPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/CapacitySchedulerPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/TestProportionalCapacityPreemptionPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/policy/OrderingPolicy.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerLeafQueueInfo.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/ProportionalCapacityPreemptionPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimits.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java Integrate OrderingPolicy Framework with CapacityScheduler - Key: YARN-3463 URL: https://issues.apache.org/jira/browse/YARN-3463 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler Reporter: Craig Welch Assignee: Craig Welch Fix For: 2.8.0 Attachments: YARN-3463.50.patch, YARN-3463.61.patch, YARN-3463.64.patch, YARN-3463.65.patch, YARN-3463.66.patch, YARN-3463.67.patch, YARN-3463.68.patch, YARN-3463.69.patch, YARN-3463.70.patch Integrate the OrderingPolicy Framework with the CapacityScheduler -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3497) ContainerManagementProtocolProxy modifies IPC timeout conf without making a copy
[ https://issues.apache.org/jira/browse/YARN-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504815#comment-14504815 ] Hudson commented on YARN-3497: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #161 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/161/]) YARN-3497. ContainerManagementProtocolProxy modifies IPC timeout conf without making a copy. Contributed by Jason Lowe (jianhe: rev f967fd2f21791c5c4a5a090cc14ee88d155d2e2b) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/ContainerManagementProtocolProxy.java ContainerManagementProtocolProxy modifies IPC timeout conf without making a copy Key: YARN-3497 URL: https://issues.apache.org/jira/browse/YARN-3497 Project: Hadoop YARN Issue Type: Bug Components: client Affects Versions: 2.6.0 Reporter: Jason Lowe Assignee: Jason Lowe Fix For: 2.7.1 Attachments: YARN-3497.001.patch, YARN-3497.002.patch yarn-client's ContainerManagementProtocolProxy is updating ipc.client.connection.maxidletime in the conf passed in without making a copy of it. That modification leaks into other systems using the same conf and can cause them to setup RPC connections with a timeout of zero as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
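The fix pattern described in the commit can be sketched as follows (illustrative, not the committed code): copy the Configuration before mutating the idle-time setting so the caller's shared conf is untouched.
{code}
import org.apache.hadoop.conf.Configuration;

// Hedged sketch: defensive copy before changing ipc.client.connection.maxidletime.
public class ProxyConfCopy {
  static Configuration confForCmProxy(Configuration callerConf) {
    Configuration copy = new Configuration(callerConf);   // copy, don't mutate the original
    copy.setInt("ipc.client.connection.maxidletime", 0);  // only affects the copy
    return copy;
  }
}
{code}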
[jira] [Updated] (YARN-3445) Cache runningApps in RMNode for getting running apps on given NodeId
[ https://issues.apache.org/jira/browse/YARN-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3445: - Description: Per discussion in YARN-3334, we need filter out unnecessary collectors info from RM in heartbeat response. Our propose is to add cache for runningApps in RMNode, so RM only send collectors for local running apps back. This is also needed in YARN-914 (graceful decommission) that if no running apps in NM which is in decommissioning stage, it will get decommissioned immediately. (was: Per discussion in YARN-3334, we need filter out unnecessary collectors info from RM in heartbeat response. Our propose is to add additional field for running apps in NM heartbeat request, so RM only send collectors for local running apps back. This is also needed in YARN-914 (graceful decommission) that if no running apps in NM which is in decommissioning stage, it will get decommissioned immediately. ) Cache runningApps in RMNode for getting running apps on given NodeId Key: YARN-3445 URL: https://issues.apache.org/jira/browse/YARN-3445 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager, resourcemanager Affects Versions: 2.7.0 Reporter: Junping Du Assignee: Junping Du Attachments: YARN-3445.patch Per discussion in YARN-3334, we need filter out unnecessary collectors info from RM in heartbeat response. Our propose is to add cache for runningApps in RMNode, so RM only send collectors for local running apps back. This is also needed in YARN-914 (graceful decommission) that if no running apps in NM which is in decommissioning stage, it will get decommissioned immediately. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
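A hedged sketch of the caching idea in the updated description (types and names are illustrative, not the RMNode code): track the apps currently running on each node so the RM can return only their collectors in the heartbeat response and can decommission an idle node immediately.
{code}
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative per-node cache of running applications.
public class RunningAppsCache {
  private final Map<String, Set<String>> runningAppsByNode = new ConcurrentHashMap<>();

  void containerStarted(String nodeId, String appId) {
    runningAppsByNode.computeIfAbsent(nodeId, n -> ConcurrentHashMap.newKeySet())
        .add(appId);
  }

  void appFinishedOnNode(String nodeId, String appId) {
    Set<String> apps = runningAppsByNode.get(nodeId);
    if (apps != null) {
      apps.remove(appId);
    }
  }

  boolean safeToDecommission(String nodeId) {
    Set<String> apps = runningAppsByNode.get(nodeId);
    return apps == null || apps.isEmpty();   // no running apps: decommission right away
  }
}
{code}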
[jira] [Commented] (YARN-3406) Add a Running Container for RM Web UI
[ https://issues.apache.org/jira/browse/YARN-3406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504928#comment-14504928 ] Hadoop QA commented on YARN-3406: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726857/YARN-3406.2.patch against trunk revision 8ddbb8d. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7421//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7421//console This message is automatically generated. Add a Running Container for RM Web UI - Key: YARN-3406 URL: https://issues.apache.org/jira/browse/YARN-3406 Project: Hadoop YARN Issue Type: Improvement Reporter: Ryu Kobayashi Assignee: Ryu Kobayashi Priority: Minor Attachments: YARN-3406.1.patch, YARN-3406.2.patch, screenshot.png, screenshot2.png View the number of containers in the all application list. And, add REST API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3514) Active directory usernames like domain\login cause YARN failures
[ https://issues.apache.org/jira/browse/YARN-3514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504521#comment-14504521 ] Hadoop QA commented on YARN-3514: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726815/YARN-3514.001.patch against trunk revision d52de61. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestContainerLocalizer Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7419//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7419//console This message is automatically generated. Active directory usernames like domain\login cause YARN failures Key: YARN-3514 URL: https://issues.apache.org/jira/browse/YARN-3514 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.2.0 Environment: CentOS6 Reporter: john lilley Assignee: Chris Nauroth Priority: Minor Attachments: YARN-3514.001.patch We have a 2.2.0 (Cloudera 5.3) cluster running on CentOS6 that is Kerberos-enabled and uses an external AD domain controller for the KDC. We are able to authenticate, browse HDFS, etc. However, YARN fails during localization because it seems to get confused by the presence of a \ character in the local user name. Our AD authentication on the nodes goes through sssd and set configured to map AD users onto the form domain\username. For example, our test user has a Kerberos principal of hadoopu...@domain.com and that maps onto a CentOS user domain\hadoopuser. We have no problem validating that user with PAM, logging in as that user, su-ing to that user, etc. However, when we attempt to run a YARN application master, the localization step fails when setting up the local cache directory for the AM. 
The error that comes out of the RM logs: 2015-04-17 12:47:09 INFO net.redpoint.yarnapp.Client[0]: monitorApplication: ApplicationReport: appId=1, state=FAILED, progress=0.0, finalStatus=FAILED, diagnostics='Application application_1429295486450_0001 failed 1 times due to AM Container for appattempt_1429295486450_0001_01 exited with exitCode: -1000 due to: Application application_1429295486450_0001 initialization failed (exitCode=255) with output: main : command provided 0 main : user is DOMAIN\hadoopuser main : requested yarn user is domain\hadoopuser org.apache.hadoop.util.DiskChecker$DiskErrorException: Cannot create directory: /data/yarn/nm/usercache/domain%5Chadoopuser/appcache/application_1429295486450_0001/filecache/10 at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:105) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.download(ContainerLocalizer.java:199) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:241) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:169) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.main(ContainerLocalizer.java:347) .Failing this attempt.. Failing the application.' However, when we look on the node launching the AM, we see this: [root@rpb-cdh-kerb-2 ~]# cd /data/yarn/nm/usercache [root@rpb-cdh-kerb-2 usercache]# ls -l drwxr-s--- 4 DOMAIN\hadoopuser yarn 4096 Apr 17 12:10 domain\hadoopuser There appears to be different treatment of the \ character in different places. Something creates the directory as domain\hadoopuser but something else later attempts to use it as domain%5Chadoopuser. I’m not sure where or why the URL escapement converts the \ to
[jira] [Assigned] (YARN-3484) Fix up yarn top shell code
[ https://issues.apache.org/jira/browse/YARN-3484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev reassigned YARN-3484: --- Assignee: Varun Vasudev Fix up yarn top shell code -- Key: YARN-3484 URL: https://issues.apache.org/jira/browse/YARN-3484 Project: Hadoop YARN Issue Type: Bug Components: scripts Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: Varun Vasudev We need to do some work on yarn top's shell code. a) Just checking for TERM isn't good enough. We really need to check the return on tput, especially since the output will not be a number but an error string which will likely blow up the java code in horrible ways. b) All the single bracket tests should be double brackets to force the bash built-in. c) I'd think I'd rather see the shell portion in a function since it's rather large. This will allow for args, etc, to get local'ized and clean up the case statement. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3484) Fix up yarn top shell code
[ https://issues.apache.org/jira/browse/YARN-3484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev updated YARN-3484: Attachment: YARN-3484.001.patch Allen, I've uploaded a patch to address your comments. Can you please review? If it looks good to you, I'll upload a version for branch-2. Fix up yarn top shell code -- Key: YARN-3484 URL: https://issues.apache.org/jira/browse/YARN-3484 Project: Hadoop YARN Issue Type: Bug Components: scripts Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: Varun Vasudev Attachments: YARN-3484.001.patch We need to do some work on yarn top's shell code. a) Just checking for TERM isn't good enough. We really need to check the return on tput, especially since the output will not be a number but an error string which will likely blow up the java code in horrible ways. b) All the single bracket tests should be double brackets to force the bash built-in. c) I'd think I'd rather see the shell portion in a function since it's rather large. This will allow for args, etc, to get local'ized and clean up the case statement. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3294) Allow dumping of Capacity Scheduler debug logs via web UI for a fixed time period
[ https://issues.apache.org/jira/browse/YARN-3294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504613#comment-14504613 ] Varun Vasudev commented on YARN-3294: - [~tgraves] - thanks for pointing out the admin issue. My apologies for missing it. I've filed YARN-3517 and updated it with a patch, which allows only admins to use the functionality. Can you please review and leave comments there? Allow dumping of Capacity Scheduler debug logs via web UI for a fixed time period - Key: YARN-3294 URL: https://issues.apache.org/jira/browse/YARN-3294 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler Reporter: Varun Vasudev Assignee: Varun Vasudev Fix For: 2.8.0 Attachments: Screen Shot 2015-03-12 at 8.51.25 PM.png, apache-yarn-3294.0.patch, apache-yarn-3294.1.patch, apache-yarn-3294.2.patch, apache-yarn-3294.3.patch, apache-yarn-3294.4.patch It would be nice to have a button on the web UI that would allow dumping of debug logs for just the capacity scheduler for a fixed period of time(1 min, 5 min or so) in a separate log file. It would be useful when debugging scheduler behavior without affecting the rest of the resourcemanager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3046) [Event producers] Implement MapReduce AM writing MR events to v2 ATS
[ https://issues.apache.org/jira/browse/YARN-3046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504712#comment-14504712 ] Junping Du commented on YARN-3046: -- Thanks [~zjshen] and [~sjlee0] for review and comments! bq. createEntity's comments need to be updated to reflect the latest code changes. Nice catch! Updated in v6 patch. bq. The following code does nothing, and can be removed. The only thing it does is get rid of falling into default (for unrecognized event) where it will be return directly. Let's keep it here. bq. OK, this is another existing bug I'll leave it up to you to decide whether we want to fix this in ATS v.1 in a separate JIRA. Let's fix in refactor patch (MAPREDUCE-6318) given we already roll back a prefix tiny bug on v1. bq. I.150: same issue Good catch! Fix it in v6 patch. [Event producers] Implement MapReduce AM writing MR events to v2 ATS Key: YARN-3046 URL: https://issues.apache.org/jira/browse/YARN-3046 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Junping Du Attachments: YARN-3046-no-test-v2.patch, YARN-3046-no-test.patch, YARN-3046-v1-rebase.patch, YARN-3046-v1.patch, YARN-3046-v2.patch, YARN-3046-v3.patch, YARN-3046-v4.patch, YARN-3046-v5.patch Per design in YARN-2928, select a handful of MR metrics (e.g. HDFS bytes written) and have the MR AM write the framework-specific metrics to ATS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3046) [Event producers] Implement MapReduce AM writing MR events to v2 ATS
[ https://issues.apache.org/jira/browse/YARN-3046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3046: - Attachment: YARN-3046-v6.patch [Event producers] Implement MapReduce AM writing MR events to v2 ATS Key: YARN-3046 URL: https://issues.apache.org/jira/browse/YARN-3046 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Junping Du Attachments: YARN-3046-no-test-v2.patch, YARN-3046-no-test.patch, YARN-3046-v1-rebase.patch, YARN-3046-v1.patch, YARN-3046-v2.patch, YARN-3046-v3.patch, YARN-3046-v4.patch, YARN-3046-v5.patch, YARN-3046-v6.patch Per design in YARN-2928, select a handful of MR metrics (e.g. HDFS bytes written) and have the MR AM write the framework-specific metrics to ATS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3518) default rm/am expire interval should less than default resourcemanager connect wait time
sandflee created YARN-3518: -- Summary: default rm/am expire interval should less than default resourcemanager connect wait time Key: YARN-3518 URL: https://issues.apache.org/jira/browse/YARN-3518 Project: Hadoop YARN Issue Type: Bug Components: nodemanager, resourcemanager Reporter: sandflee Take the AM for example: if the AM can't connect to the RM, then after the AM expiry interval (600s) the RM relaunches the AM, and there will be two AMs at the same time until the resourcemanager connect max wait time (900s) has passed. DEFAULT_RESOURCEMANAGER_CONNECT_MAX_WAIT_MS = 15 * 60 * 1000; DEFAULT_RM_AM_EXPIRY_INTERVAL_MS = 60; DEFAULT_RM_NM_EXPIRY_INTERVAL_MS = 60; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
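The arithmetic behind the report, using the values quoted above (10 minutes for AM expiry versus 15 minutes of connect retries), is just a window comparison; the snippet below is illustrative only.
{code}
// Illustrative: the AM keeps retrying the RM for longer than the RM waits before
// declaring the AM expired and launching a replacement, so two AMs can overlap.
public class ExpiryWindow {
  public static void main(String[] args) {
    long rmAmExpiryMs = 10 * 60 * 1000;        // RM gives up on the AM after 10 min
    long amConnectMaxWaitMs = 15 * 60 * 1000;  // AM retries the RM for up to 15 min
    long overlapMs = amConnectMaxWaitMs - rmAmExpiryMs;
    System.out.println("Two AMs may coexist for up to " + (overlapMs / 1000) + "s");
  }
}
{code}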
[jira] [Commented] (YARN-3516) killing ContainerLocalizer action doesn't take effect when private localizer receives FETCH_FAILURE status.
[ https://issues.apache.org/jira/browse/YARN-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504513#comment-14504513 ] Hadoop QA commented on YARN-3516: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726776/YARN-3516.000.patch against trunk revision d52de61. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestDataTransferProtocol The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestLeaseRecovery2 Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7415//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7415//console This message is automatically generated. killing ContainerLocalizer action doesn't take effect when private localizer receives FETCH_FAILURE status. --- Key: YARN-3516 URL: https://issues.apache.org/jira/browse/YARN-3516 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: zhihai xu Assignee: zhihai xu Attachments: YARN-3516.000.patch killing ContainerLocalizer action doesn't take effect when private localizer receives FETCH_FAILURE status. This is a typo from YARN-3024. With YARN-3024, ContainerLocalizer will be killed only if {{action}} is set to {{LocalizerAction.DIE}}, calling {{response.setLocalizerAction}} will be overwritten. This is also a regression from old code. Also it make sense to kill the ContainerLocalizer when FETCH_FAILURE happened, because the container will send CLEANUP_CONTAINER_RESOURCES event after localization failure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3517) RM web ui for dumping scheduler logs should be for admins only
[ https://issues.apache.org/jira/browse/YARN-3517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504554#comment-14504554 ] Hadoop QA commented on YARN-3517: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726810/YARN-3517.001.patch against trunk revision d52de61. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7418//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7418//console This message is automatically generated. RM web ui for dumping scheduler logs should be for admins only -- Key: YARN-3517 URL: https://issues.apache.org/jira/browse/YARN-3517 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, security Affects Versions: 2.7.0 Reporter: Varun Vasudev Assignee: Varun Vasudev Labels: security Attachments: YARN-3517.001.patch YARN-3294 allows users to dump scheduler logs from the web UI. This should be for admins only. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3517) RM web ui for dumping scheduler logs should be for admins only
[ https://issues.apache.org/jira/browse/YARN-3517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504617#comment-14504617 ] Varun Vasudev commented on YARN-3517: - Test failure is unrelated. RM web ui for dumping scheduler logs should be for admins only -- Key: YARN-3517 URL: https://issues.apache.org/jira/browse/YARN-3517 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, security Affects Versions: 2.7.0 Reporter: Varun Vasudev Assignee: Varun Vasudev Labels: security Attachments: YARN-3517.001.patch YARN-3294 allows users to dump scheduler logs from the web UI. This should be for admins only. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3484) Fix up yarn top shell code
[ https://issues.apache.org/jira/browse/YARN-3484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504660#comment-14504660 ] Hadoop QA commented on YARN-3484: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726827/YARN-3484.001.patch against trunk revision d52de61. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7420//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7420//console This message is automatically generated. Fix up yarn top shell code -- Key: YARN-3484 URL: https://issues.apache.org/jira/browse/YARN-3484 Project: Hadoop YARN Issue Type: Bug Components: scripts Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: Varun Vasudev Attachments: YARN-3484.001.patch We need to do some work on yarn top's shell code. a) Just checking for TERM isn't good enough. We really need to check the return on tput, especially since the output will not be a number but an error string which will likely blow up the java code in horrible ways. b) All the single bracket tests should be double brackets to force the bash built-in. c) I'd think I'd rather see the shell portion in a function since it's rather large. This will allow for args, etc, to get local'ized and clean up the case statement. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3511) Add errors and warnings page to ATS
[ https://issues.apache.org/jira/browse/YARN-3511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504621#comment-14504621 ] Varun Vasudev commented on YARN-3511: - The eclipse target passes on my machine. I'm not sure if we can add any tests for this. Add errors and warnings page to ATS --- Key: YARN-3511 URL: https://issues.apache.org/jira/browse/YARN-3511 Project: Hadoop YARN Issue Type: Improvement Components: timelineserver Reporter: Varun Vasudev Assignee: Varun Vasudev Attachments: YARN-3511.001.patch YARN-2901 adds the capability to view errors and warnings on the web UI. The ATS was missed out. Add support for the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3463) Integrate OrderingPolicy Framework with CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-3463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505077#comment-14505077 ] Hudson commented on YARN-3463: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #171 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/171/]) YARN-3463. Integrate OrderingPolicy Framework with CapacityScheduler. (Craig Welch via wangda) (wangda: rev 44872b76fcc0ddfbc7b0a4e54eef50fe8708e0f5) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/policy/FifoOrderingPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/TestProportionalCapacityPreemptionPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerLeafQueueInfo.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/policy/OrderingPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimits.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/ProportionalCapacityPreemptionPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/CapacitySchedulerPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/policy/AbstractComparatorOrderingPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java Integrate OrderingPolicy Framework with CapacityScheduler - Key: YARN-3463 URL: https://issues.apache.org/jira/browse/YARN-3463 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler Reporter: Craig Welch Assignee: Craig Welch Fix For: 2.8.0 Attachments: YARN-3463.50.patch, YARN-3463.61.patch, YARN-3463.64.patch, YARN-3463.65.patch, YARN-3463.66.patch, YARN-3463.67.patch, YARN-3463.68.patch, YARN-3463.69.patch, YARN-3463.70.patch Integrate the OrderingPolicy Framework with the CapacityScheduler -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3497) ContainerManagementProtocolProxy modifies IPC timeout conf without making a copy
[ https://issues.apache.org/jira/browse/YARN-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505079#comment-14505079 ] Hudson commented on YARN-3497: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #171 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/171/]) YARN-3497. ContainerManagementProtocolProxy modifies IPC timeout conf without making a copy. Contributed by Jason Lowe (jianhe: rev f967fd2f21791c5c4a5a090cc14ee88d155d2e2b) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/ContainerManagementProtocolProxy.java * hadoop-yarn-project/CHANGES.txt ContainerManagementProtocolProxy modifies IPC timeout conf without making a copy Key: YARN-3497 URL: https://issues.apache.org/jira/browse/YARN-3497 Project: Hadoop YARN Issue Type: Bug Components: client Affects Versions: 2.6.0 Reporter: Jason Lowe Assignee: Jason Lowe Fix For: 2.7.1 Attachments: YARN-3497.001.patch, YARN-3497.002.patch yarn-client's ContainerManagementProtocolProxy is updating ipc.client.connection.maxidletime in the conf passed in without making a copy of it. That modification leaks into other systems using the same conf and can cause them to setup RPC connections with a timeout of zero as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3494) Expose AM resource limit and user limit in QueueMetrics
[ https://issues.apache.org/jira/browse/YARN-3494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504958#comment-14504958 ] Hadoop QA commented on YARN-3494: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726861/0002-YARN-3494.patch against trunk revision 8ddbb8d. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7423//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7423//console This message is automatically generated. Expose AM resource limit and user limit in QueueMetrics Key: YARN-3494 URL: https://issues.apache.org/jira/browse/YARN-3494 Project: Hadoop YARN Issue Type: Improvement Reporter: Jian He Assignee: Rohith Attachments: 0001-YARN-3494.patch, 0002-YARN-3494.patch, 0002-YARN-3494.patch Now we have the AM resource limit and user limit shown on the web UI, it would be useful to expose them in the QueueMetrics as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3517) RM web ui for dumping scheduler logs should be for admins only
[ https://issues.apache.org/jira/browse/YARN-3517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504954#comment-14504954 ] Thomas Graves commented on YARN-3517: - Thanks for following up on this. Could you also change it to not show the button if you aren't an admin? I don't want to confuse users by having a button there that doesn't do anything. One other thing: could you add some CSS or something to make it look more like a button? Right now it just looks like text and I didn't know it was clickable at first. The placement of it seems a bit weird to me also, but as long as it's only showing up for admins that is less of an issue. I haven't looked at the patch in detail, but I see we are creating a new AdminACLsManager each time. It would be nice if we didn't have to do that. RM web ui for dumping scheduler logs should be for admins only -- Key: YARN-3517 URL: https://issues.apache.org/jira/browse/YARN-3517 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, security Affects Versions: 2.7.0 Reporter: Varun Vasudev Assignee: Varun Vasudev Labels: security Attachments: YARN-3517.001.patch YARN-3294 allows users to dump scheduler logs from the web UI. This should be for admins only. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3517) RM web ui for dumping scheduler logs should be for admins only
[ https://issues.apache.org/jira/browse/YARN-3517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev updated YARN-3517: Attachment: YARN-3517.002.patch Uploaded a new patch to address Thomas's comments. bq. Could you also change it to not show the button if you aren't an admin? Fixed. {quote} One other thing is could you add some css or something to make it look more like a button. Right now it just looks like text and I didn't know it was clickable at first. The placement of it seems a bit weird to me also but as along as its only showing up for admins that is less of an issue. {quote} I've added some style elements to make it look better. {quote} I haven't looked at the patch if details but I see we are creating a new AdminACLsManager each time. It would be nice if we didn't have to do that. {quote} Fixed. RM web ui for dumping scheduler logs should be for admins only -- Key: YARN-3517 URL: https://issues.apache.org/jira/browse/YARN-3517 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, security Affects Versions: 2.7.0 Reporter: Varun Vasudev Assignee: Varun Vasudev Labels: security Attachments: YARN-3517.001.patch, YARN-3517.002.patch YARN-3294 allows users to dump scheduler logs from the web UI. This should be for admins only. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
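A minimal sketch of the admin gate being discussed, assuming a stand-in for the RM's ACL manager (the actual patch wires this through the existing web UI and ACL classes):
{code}
import org.apache.hadoop.security.UserGroupInformation;

// Hedged sketch: only render the dump-scheduler-logs control, and only honor the
// request, for admin callers.
public class SchedulerLogAccess {
  interface AdminChecker {             // stand-in for the RM's ACL manager
    boolean isAdmin(UserGroupInformation callerUGI);
  }

  static void dumpSchedulerLogs(UserGroupInformation caller, AdminChecker acls) {
    if (!acls.isAdmin(caller)) {
      // Non-admins get a clear refusal instead of a working dump button.
      throw new SecurityException("Only admins may dump scheduler logs");
    }
    // ... trigger the time-bounded debug-log capture here ...
  }
}
{code}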
[jira] [Commented] (YARN-3437) convert load test driver to timeline service v.2
[ https://issues.apache.org/jira/browse/YARN-3437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504842#comment-14504842 ] Junping Du commented on YARN-3437: -- Filing a new JIRA to track fine-grained performance data sounds good to me. Given the patch here has duplicated code with YARN-2556, I would like to understand our plan for YARN-2556. [~jeagles], can you share your vision on this? It looks like this JIRA blocks YARN-3390 (a refactoring JIRA), which blocks YARN-3044 (RM writing events to the v2 ATS service). I would like a clear path so all the patches can go in as a pipeline while getting rid of any potential deadlock. :) Maybe the first step is to get YARN-2556 committed, and then rebase the patch here? [~jeagles], [~sjlee0] and [~zjshen], what's your opinion on this? convert load test driver to timeline service v.2 Key: YARN-3437 URL: https://issues.apache.org/jira/browse/YARN-3437 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Sangjin Lee Attachments: YARN-3437.001.patch, YARN-3437.002.patch This subtask covers the work for converting the proposed patch for the load test driver (YARN-2556) to work with the timeline service v.2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3514) Active directory usernames like domain\login cause YARN failures
[ https://issues.apache.org/jira/browse/YARN-3514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504869#comment-14504869 ] john lilley commented on YARN-3514: --- Thank you! I am very impressed with the short time it took to patch. Active directory usernames like domain\login cause YARN failures Key: YARN-3514 URL: https://issues.apache.org/jira/browse/YARN-3514 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.2.0 Environment: CentOS6 Reporter: john lilley Assignee: Chris Nauroth Priority: Minor Attachments: YARN-3514.001.patch We have a 2.2.0 (Cloudera 5.3) cluster running on CentOS6 that is Kerberos-enabled and uses an external AD domain controller for the KDC. We are able to authenticate, browse HDFS, etc. However, YARN fails during localization because it seems to get confused by the presence of a \ character in the local user name. Our AD authentication on the nodes goes through sssd and set configured to map AD users onto the form domain\username. For example, our test user has a Kerberos principal of hadoopu...@domain.com and that maps onto a CentOS user domain\hadoopuser. We have no problem validating that user with PAM, logging in as that user, su-ing to that user, etc. However, when we attempt to run a YARN application master, the localization step fails when setting up the local cache directory for the AM. The error that comes out of the RM logs: 2015-04-17 12:47:09 INFO net.redpoint.yarnapp.Client[0]: monitorApplication: ApplicationReport: appId=1, state=FAILED, progress=0.0, finalStatus=FAILED, diagnostics='Application application_1429295486450_0001 failed 1 times due to AM Container for appattempt_1429295486450_0001_01 exited with exitCode: -1000 due to: Application application_1429295486450_0001 initialization failed (exitCode=255) with output: main : command provided 0 main : user is DOMAIN\hadoopuser main : requested yarn user is domain\hadoopuser org.apache.hadoop.util.DiskChecker$DiskErrorException: Cannot create directory: /data/yarn/nm/usercache/domain%5Chadoopuser/appcache/application_1429295486450_0001/filecache/10 at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:105) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.download(ContainerLocalizer.java:199) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:241) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:169) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.main(ContainerLocalizer.java:347) .Failing this attempt.. Failing the application.' However, when we look on the node launching the AM, we see this: [root@rpb-cdh-kerb-2 ~]# cd /data/yarn/nm/usercache [root@rpb-cdh-kerb-2 usercache]# ls -l drwxr-s--- 4 DOMAIN\hadoopuser yarn 4096 Apr 17 12:10 domain\hadoopuser There appears to be different treatment of the \ character in different places. Something creates the directory as domain\hadoopuser but something else later attempts to use it as domain%5Chadoopuser. I’m not sure where or why the URL escapement converts the \ to %5C or why this is not consistent. 
I should also mention, for the sake of completeness, our auth_to_local rule is set up to map u...@domain.com to domain\user: RULE:[1:$1@$0](^.*@DOMAIN\.COM$)s/^(.*)@DOMAIN\.COM$/domain\\$1/g -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3497) ContainerManagementProtocolProxy modifies IPC timeout conf without making a copy
[ https://issues.apache.org/jira/browse/YARN-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504748#comment-14504748 ] Hudson commented on YARN-3497: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #170 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/170/]) YARN-3497. ContainerManagementProtocolProxy modifies IPC timeout conf without making a copy. Contributed by Jason Lowe (jianhe: rev f967fd2f21791c5c4a5a090cc14ee88d155d2e2b) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/ContainerManagementProtocolProxy.java ContainerManagementProtocolProxy modifies IPC timeout conf without making a copy Key: YARN-3497 URL: https://issues.apache.org/jira/browse/YARN-3497 Project: Hadoop YARN Issue Type: Bug Components: client Affects Versions: 2.6.0 Reporter: Jason Lowe Assignee: Jason Lowe Fix For: 2.7.1 Attachments: YARN-3497.001.patch, YARN-3497.002.patch yarn-client's ContainerManagementProtocolProxy is updating ipc.client.connection.maxidletime in the conf passed in without making a copy of it. That modification leaks into other systems using the same conf and can cause them to setup RPC connections with a timeout of zero as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
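For illustration, a minimal sketch of the defensive-copy pattern this fix describes (hypothetical class and method names, not the actual patch): copy the caller's Configuration before tuning the IPC idle timeout, so the change cannot leak back into other components that share the original object.
{code}
import org.apache.hadoop.conf.Configuration;

public class ProxyConfSketch {
  /**
   * Return a private copy of the caller's Configuration with the IPC idle
   * timeout tuned for container-management RPC. Because we copy first, the
   * caller's Configuration (possibly shared with other subsystems) stays
   * untouched.
   */
  static Configuration newProxyConf(Configuration callerConf) {
    Configuration conf = new Configuration(callerConf); // defensive copy
    conf.setInt("ipc.client.connection.maxidletime", 0);
    return conf;
  }
}
{code}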
[jira] [Commented] (YARN-3463) Integrate OrderingPolicy Framework with CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-3463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504746#comment-14504746 ] Hudson commented on YARN-3463: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #170 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/170/]) YARN-3463. Integrate OrderingPolicy Framework with CapacityScheduler. (Craig Welch via wangda) (wangda: rev 44872b76fcc0ddfbc7b0a4e54eef50fe8708e0f5) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/policy/FifoOrderingPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerLeafQueueInfo.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/CapacitySchedulerPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/TestProportionalCapacityPreemptionPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimits.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/policy/AbstractComparatorOrderingPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/ProportionalCapacityPreemptionPolicy.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/policy/OrderingPolicy.java Integrate OrderingPolicy Framework with CapacityScheduler - Key: YARN-3463 URL: https://issues.apache.org/jira/browse/YARN-3463 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler Reporter: Craig Welch Assignee: Craig Welch Fix For: 2.8.0 Attachments: YARN-3463.50.patch, YARN-3463.61.patch, YARN-3463.64.patch, YARN-3463.65.patch, YARN-3463.66.patch, YARN-3463.67.patch, YARN-3463.68.patch, YARN-3463.69.patch, YARN-3463.70.patch Integrate the OrderingPolicy Framework with the CapacityScheduler -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3445) Cache runningApps in RMNode for getting running apps on given NodeId
[ https://issues.apache.org/jira/browse/YARN-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3445: - Summary: Cache runningApps in RMNode for getting running apps on given NodeId (was: NM notify RM on running Apps in NM-RM heartbeat) Cache runningApps in RMNode for getting running apps on given NodeId Key: YARN-3445 URL: https://issues.apache.org/jira/browse/YARN-3445 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager, resourcemanager Affects Versions: 2.7.0 Reporter: Junping Du Assignee: Junping Du Attachments: YARN-3445.patch Per discussion in YARN-3334, we need to filter out unnecessary collector info from the RM in the heartbeat response. Our proposal is to add an additional field for running apps in the NM heartbeat request, so the RM only sends back collectors for locally running apps. This is also needed in YARN-914 (graceful decommission): if an NM in the decommissioning stage has no running apps, it can get decommissioned immediately. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
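For illustration, a purely hypothetical sketch of the idea: caching each node's running apps from heartbeats lets the RM send back only the collectors that node needs. The real RMNode and collector bookkeeping in YARN is more involved; every type below is a simplified stand-in.
{code}
import java.util.*;

class NodeCollectorFilterExample {
  // appId -> collector address known to the RM (hypothetical bookkeeping)
  private final Map<String, String> collectorsByApp = new HashMap<>();
  // nodeId -> apps reported as running in that NM's last heartbeat
  private final Map<String, Set<String>> runningAppsByNode = new HashMap<>();

  void onHeartbeat(String nodeId, Set<String> runningApps) {
    runningAppsByNode.put(nodeId, runningApps); // cache per node
  }

  /** Collectors to include in the heartbeat response for this node only. */
  Map<String, String> collectorsFor(String nodeId) {
    Map<String, String> result = new HashMap<>();
    for (String appId :
        runningAppsByNode.getOrDefault(nodeId, Collections.emptySet())) {
      String addr = collectorsByApp.get(appId);
      if (addr != null) {
        result.put(appId, addr);
      }
    }
    return result;
  }
}
{code}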
[jira] [Updated] (YARN-3406) Add a Running Container for RM Web UI
[ https://issues.apache.org/jira/browse/YARN-3406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryu Kobayashi updated YARN-3406: Attachment: YARN-3406.2.patch [~ozawa] I fixed the conflict. I also changed the label name to Running Containers (same as the REST API). In addition, I fixed a bug in the sort order of the fair scheduler. Add a Running Container for RM Web UI - Key: YARN-3406 URL: https://issues.apache.org/jira/browse/YARN-3406 Project: Hadoop YARN Issue Type: Improvement Reporter: Ryu Kobayashi Assignee: Ryu Kobayashi Priority: Minor Attachments: YARN-3406.1.patch, YARN-3406.2.patch, screenshot.png, screenshot2.png View the number of containers in the all-applications list, and add a REST API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3494) Expose AM resource limit and user limit in QueueMetrics
[ https://issues.apache.org/jira/browse/YARN-3494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504830#comment-14504830 ] Hadoop QA commented on YARN-3494: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726858/0002-YARN-3494.patch against trunk revision 8ddbb8d. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7422//console This message is automatically generated. Expose AM resource limit and user limit in QueueMetrics Key: YARN-3494 URL: https://issues.apache.org/jira/browse/YARN-3494 Project: Hadoop YARN Issue Type: Improvement Reporter: Jian He Assignee: Rohith Attachments: 0001-YARN-3494.patch, 0002-YARN-3494.patch Now we have the AM resource limit and user limit shown on the web UI, it would be useful to expose them in the QueueMetrics as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3494) Expose AM resource limit and user limit in QueueMetrics
[ https://issues.apache.org/jira/browse/YARN-3494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated YARN-3494: - Attachment: 0002-YARN-3494.patch Expose AM resource limit and user limit in QueueMetrics Key: YARN-3494 URL: https://issues.apache.org/jira/browse/YARN-3494 Project: Hadoop YARN Issue Type: Improvement Reporter: Jian He Assignee: Rohith Attachments: 0001-YARN-3494.patch, 0002-YARN-3494.patch, 0002-YARN-3494.patch Now we have the AM resource limit and user limit shown on the web UI, it would be useful to expose them in the QueueMetrics as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3517) RM web ui for dumping scheduler logs should be for admins only
[ https://issues.apache.org/jira/browse/YARN-3517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505185#comment-14505185 ] Hadoop QA commented on YARN-3517: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726890/YARN-3517.002.patch against trunk revision 8ddbb8d. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7424//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7424//console This message is automatically generated. RM web ui for dumping scheduler logs should be for admins only -- Key: YARN-3517 URL: https://issues.apache.org/jira/browse/YARN-3517 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, security Affects Versions: 2.7.0 Reporter: Varun Vasudev Assignee: Varun Vasudev Labels: security Attachments: YARN-3517.001.patch, YARN-3517.002.patch YARN-3294 allows users to dump scheduler logs from the web UI. This should be for admins only. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3463) Integrate OrderingPolicy Framework with CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-3463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505129#comment-14505129 ] Hudson commented on YARN-3463: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2120 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2120/]) YARN-3463. Integrate OrderingPolicy Framework with CapacityScheduler. (Craig Welch via wangda) (wangda: rev 44872b76fcc0ddfbc7b0a4e54eef50fe8708e0f5) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/TestProportionalCapacityPreemptionPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerLeafQueueInfo.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/policy/AbstractComparatorOrderingPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/ProportionalCapacityPreemptionPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/policy/OrderingPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimits.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/CapacitySchedulerPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/policy/FifoOrderingPolicy.java * hadoop-yarn-project/CHANGES.txt Integrate OrderingPolicy Framework with CapacityScheduler - Key: YARN-3463 URL: https://issues.apache.org/jira/browse/YARN-3463 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler Reporter: Craig Welch Assignee: Craig Welch Fix For: 2.8.0 Attachments: YARN-3463.50.patch, YARN-3463.61.patch, YARN-3463.64.patch, YARN-3463.65.patch, YARN-3463.66.patch, YARN-3463.67.patch, YARN-3463.68.patch, YARN-3463.69.patch, YARN-3463.70.patch Integrate the OrderingPolicy Framework with the CapacityScheduler -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3497) ContainerManagementProtocolProxy modifies IPC timeout conf without making a copy
[ https://issues.apache.org/jira/browse/YARN-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505131#comment-14505131 ] Hudson commented on YARN-3497: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2120 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2120/]) YARN-3497. ContainerManagementProtocolProxy modifies IPC timeout conf without making a copy. Contributed by Jason Lowe (jianhe: rev f967fd2f21791c5c4a5a090cc14ee88d155d2e2b) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/ContainerManagementProtocolProxy.java * hadoop-yarn-project/CHANGES.txt ContainerManagementProtocolProxy modifies IPC timeout conf without making a copy Key: YARN-3497 URL: https://issues.apache.org/jira/browse/YARN-3497 Project: Hadoop YARN Issue Type: Bug Components: client Affects Versions: 2.6.0 Reporter: Jason Lowe Assignee: Jason Lowe Fix For: 2.7.1 Attachments: YARN-3497.001.patch, YARN-3497.002.patch yarn-client's ContainerManagementProtocolProxy is updating ipc.client.connection.maxidletime in the conf passed in without making a copy of it. That modification leaks into other systems using the same conf and can cause them to setup RPC connections with a timeout of zero as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3514) Active directory usernames like domain\login cause YARN failures
[ https://issues.apache.org/jira/browse/YARN-3514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505644#comment-14505644 ] john lilley commented on YARN-3514: --- We did work around the issue by changing our username mapping in sssd and auth_to_local rules to use plain usernames, that seemed to be the path of least resistance. Active directory usernames like domain\login cause YARN failures Key: YARN-3514 URL: https://issues.apache.org/jira/browse/YARN-3514 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.2.0 Environment: CentOS6 Reporter: john lilley Assignee: Chris Nauroth Priority: Minor Attachments: YARN-3514.001.patch, YARN-3514.002.patch We have a 2.2.0 (Cloudera 5.3) cluster running on CentOS6 that is Kerberos-enabled and uses an external AD domain controller for the KDC. We are able to authenticate, browse HDFS, etc. However, YARN fails during localization because it seems to get confused by the presence of a \ character in the local user name. Our AD authentication on the nodes goes through sssd and set configured to map AD users onto the form domain\username. For example, our test user has a Kerberos principal of hadoopu...@domain.com and that maps onto a CentOS user domain\hadoopuser. We have no problem validating that user with PAM, logging in as that user, su-ing to that user, etc. However, when we attempt to run a YARN application master, the localization step fails when setting up the local cache directory for the AM. The error that comes out of the RM logs: 2015-04-17 12:47:09 INFO net.redpoint.yarnapp.Client[0]: monitorApplication: ApplicationReport: appId=1, state=FAILED, progress=0.0, finalStatus=FAILED, diagnostics='Application application_1429295486450_0001 failed 1 times due to AM Container for appattempt_1429295486450_0001_01 exited with exitCode: -1000 due to: Application application_1429295486450_0001 initialization failed (exitCode=255) with output: main : command provided 0 main : user is DOMAIN\hadoopuser main : requested yarn user is domain\hadoopuser org.apache.hadoop.util.DiskChecker$DiskErrorException: Cannot create directory: /data/yarn/nm/usercache/domain%5Chadoopuser/appcache/application_1429295486450_0001/filecache/10 at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:105) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.download(ContainerLocalizer.java:199) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:241) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:169) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.main(ContainerLocalizer.java:347) .Failing this attempt.. Failing the application.' However, when we look on the node launching the AM, we see this: [root@rpb-cdh-kerb-2 ~]# cd /data/yarn/nm/usercache [root@rpb-cdh-kerb-2 usercache]# ls -l drwxr-s--- 4 DOMAIN\hadoopuser yarn 4096 Apr 17 12:10 domain\hadoopuser There appears to be different treatment of the \ character in different places. Something creates the directory as domain\hadoopuser but something else later attempts to use it as domain%5Chadoopuser. I’m not sure where or why the URL escapement converts the \ to %5C or why this is not consistent. 
I should also mention, for the sake of completeness, our auth_to_local rule is set up to map u...@domain.com to domain\user: RULE:[1:$1@$0](^.*@DOMAIN\.COM$)s/^(.*)@DOMAIN\.COM$/domain\\$1/g -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3495) Confusing log generated by FairScheduler
[ https://issues.apache.org/jira/browse/YARN-3495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505754#comment-14505754 ] Hudson commented on YARN-3495: -- FAILURE: Integrated in Hadoop-trunk-Commit #7627 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7627/]) YARN-3495. Confusing log generated by FairScheduler. Contributed by Brahma Reddy Battula. (ozawa: rev 105afd54779852c518b978101f23526143e234a5) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * hadoop-yarn-project/CHANGES.txt Confusing log generated by FairScheduler Key: YARN-3495 URL: https://issues.apache.org/jira/browse/YARN-3495 Project: Hadoop YARN Issue Type: Bug Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula Fix For: 2.8.0 Attachments: YARN-3495.patch 2015-04-16 12:03:48,531 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Null container completed... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2740) ResourceManager side should properly handle node label modifications when distributed node label configuration enabled
[ https://issues.apache.org/jira/browse/YARN-2740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505709#comment-14505709 ] Wangda Tan commented on YARN-2740: -- General LGTM, some minor comments. 1) Mark YarnConfiguration.isDistributedNodeLabelConfiguration as @Private. 2) It's better to cover the remove-label case, since removing a label = removing the label in the cluster + removing it from nodes; add a test to make sure it works in distributed mode, same as in TestRMAdminService/TestRMWebServicesNodeLabels. 3) RMWebServices.replaceLabelsOnNode(s) should be merged so we don't need to maintain both. ResourceManager side should properly handle node label modifications when distributed node label configuration enabled -- Key: YARN-2740 URL: https://issues.apache.org/jira/browse/YARN-2740 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Wangda Tan Assignee: Naganarasimha G R Fix For: 2.8.0 Attachments: YARN-2740-20141024-1.patch, YARN-2740.20150320-1.patch, YARN-2740.20150327-1.patch, YARN-2740.20150411-1.patch, YARN-2740.20150411-2.patch, YARN-2740.20150411-3.patch, YARN-2740.20150417-1.patch, YARN-2740.20150420-1.patch, YARN-2740.20150421-1.patch According to YARN-2495, when distributed node label configuration is enabled: - RMAdmin / REST API should reject change-labels-on-node operations. - CommonNodeLabelsManager shouldn't persist labels on nodes when NMs heartbeat. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2556) Tool to measure the performance of the timeline server
[ https://issues.apache.org/jira/browse/YARN-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505832#comment-14505832 ] Jonathan Eagles commented on YARN-2556: --- I'm very swamped currently. Even though it would take very little time to address this, I just can't find the time. Please let's just move on and I will get to it in time. Tool to measure the performance of the timeline server -- Key: YARN-2556 URL: https://issues.apache.org/jira/browse/YARN-2556 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Jonathan Eagles Assignee: Chang Li Attachments: YARN-2556-WIP.patch, YARN-2556-WIP.patch, YARN-2556.1.patch, YARN-2556.2.patch, YARN-2556.patch, yarn2556.patch, yarn2556.patch, yarn2556_wip.patch We need to be able to understand the capacity model for the timeline server to give users the tools they need to deploy a timeline server with the correct capacity. I propose we create a mapreduce job that can measure timeline server write and read performance. Transactions per second, I/O for both read and write would be a good start. This could be done as an example or test job that could be tied into gridmix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3225) New parameter or CLI for decommissioning node gracefully in RMAdmin CLI
[ https://issues.apache.org/jira/browse/YARN-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated YARN-3225: Attachment: YARN-3225-5.patch New parameter or CLI for decommissioning node gracefully in RMAdmin CLI --- Key: YARN-3225 URL: https://issues.apache.org/jira/browse/YARN-3225 Project: Hadoop YARN Issue Type: Sub-task Reporter: Junping Du Assignee: Devaraj K Attachments: YARN-3225-1.patch, YARN-3225-2.patch, YARN-3225-3.patch, YARN-3225-4.patch, YARN-3225-5.patch, YARN-3225.patch, YARN-914.patch New CLI (or existing CLI with parameters) should put each node on decommission list to decommissioning status and track timeout to terminate the nodes that haven't get finished. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3514) Active directory usernames like domain\login cause YARN failures
[ https://issues.apache.org/jira/browse/YARN-3514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505661#comment-14505661 ] Hadoop QA commented on YARN-3514: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726964/YARN-3514.002.patch against trunk revision 997408e. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7431//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7431//console This message is automatically generated. Active directory usernames like domain\login cause YARN failures Key: YARN-3514 URL: https://issues.apache.org/jira/browse/YARN-3514 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.2.0 Environment: CentOS6 Reporter: john lilley Assignee: Chris Nauroth Priority: Minor Attachments: YARN-3514.001.patch, YARN-3514.002.patch We have a 2.2.0 (Cloudera 5.3) cluster running on CentOS6 that is Kerberos-enabled and uses an external AD domain controller for the KDC. We are able to authenticate, browse HDFS, etc. However, YARN fails during localization because it seems to get confused by the presence of a \ character in the local user name. Our AD authentication on the nodes goes through sssd and set configured to map AD users onto the form domain\username. For example, our test user has a Kerberos principal of hadoopu...@domain.com and that maps onto a CentOS user domain\hadoopuser. We have no problem validating that user with PAM, logging in as that user, su-ing to that user, etc. However, when we attempt to run a YARN application master, the localization step fails when setting up the local cache directory for the AM. 
The error that comes out of the RM logs: 2015-04-17 12:47:09 INFO net.redpoint.yarnapp.Client[0]: monitorApplication: ApplicationReport: appId=1, state=FAILED, progress=0.0, finalStatus=FAILED, diagnostics='Application application_1429295486450_0001 failed 1 times due to AM Container for appattempt_1429295486450_0001_01 exited with exitCode: -1000 due to: Application application_1429295486450_0001 initialization failed (exitCode=255) with output: main : command provided 0 main : user is DOMAIN\hadoopuser main : requested yarn user is domain\hadoopuser org.apache.hadoop.util.DiskChecker$DiskErrorException: Cannot create directory: /data/yarn/nm/usercache/domain%5Chadoopuser/appcache/application_1429295486450_0001/filecache/10 at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:105) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.download(ContainerLocalizer.java:199) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:241) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:169) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.main(ContainerLocalizer.java:347) .Failing this attempt.. Failing the application.' However, when we look on the node launching the AM, we see this: [root@rpb-cdh-kerb-2 ~]# cd /data/yarn/nm/usercache [root@rpb-cdh-kerb-2 usercache]# ls -l drwxr-s--- 4 DOMAIN\hadoopuser yarn 4096 Apr 17 12:10 domain\hadoopuser There appears to be different treatment of the \ character in different places. Something creates the directory as domain\hadoopuser but something else later attempts to use it as domain%5Chadoopuser. I’m not sure where or why the URL escapement converts the \ to %5C or why this is not consistent. I should also mention, for the sake of completeness, our
[jira] [Commented] (YARN-2740) ResourceManager side should properly handle node label modifications when distributed node label configuration enabled
[ https://issues.apache.org/jira/browse/YARN-2740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505891#comment-14505891 ] Naganarasimha G R commented on YARN-2740: - Thanks for the comment [~wangda]. bq. 2) It's better to cover the remove-label case, since removing a label = removing the label in the cluster + removing it from nodes; add a test to make sure it works in distributed mode, same as in TestRMAdminService/TestRMWebServicesNodeLabels. IIUC you want to {{prevent removing a clusterNodeLabel while distributed configuration is enabled}} and to add test cases for it? bq. 3) RMWebServices.replaceLabelsOnNode(s) should be merged so we don't need to maintain both. I don't mind working on it here, but I have two queries: * Should we still support the existing two REST APIs for replaceLabelsOnNode (one for a single node and one for multiple nodes) and ensure the common part is extracted to a new method? * I understand that it makes it easier for the committer to limit the number of check-ins, but is it good in terms of maintainability to include code changes in the patch that are not related to this JIRA's description? ResourceManager side should properly handle node label modifications when distributed node label configuration enabled -- Key: YARN-2740 URL: https://issues.apache.org/jira/browse/YARN-2740 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Wangda Tan Assignee: Naganarasimha G R Fix For: 2.8.0 Attachments: YARN-2740-20141024-1.patch, YARN-2740.20150320-1.patch, YARN-2740.20150327-1.patch, YARN-2740.20150411-1.patch, YARN-2740.20150411-2.patch, YARN-2740.20150411-3.patch, YARN-2740.20150417-1.patch, YARN-2740.20150420-1.patch, YARN-2740.20150421-1.patch According to YARN-2495, when distributed node label configuration is enabled: - RMAdmin / REST API should reject change-labels-on-node operations. - CommonNodeLabelsManager shouldn't persist labels on nodes when NMs heartbeat. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3519) registerApplicationMaster couldn't get all running containers if rm is rebuilding container info while am is relaunched
[ https://issues.apache.org/jira/browse/YARN-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505894#comment-14505894 ] sandflee commented on YARN-3519: Yes, the same issue. registerApplicationMaster couldn't get all running containers if rm is rebuilding container info while am is relaunched Key: YARN-3519 URL: https://issues.apache.org/jira/browse/YARN-3519 Project: Hadoop YARN Issue Type: Bug Reporter: sandflee 1. The RM fails over and has recovered all app info but not yet all container info. 2. The AM is relaunched and registers with the RM. 3. The NM with containers launched by the AM re-registers with the RM. The containers on the NM and the corresponding NMTokens could not be passed to the AM. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3363) add localization and container launch time to ContainerMetrics at NM to show these timing information for each active container.
[ https://issues.apache.org/jira/browse/YARN-3363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505744#comment-14505744 ] Anubhav Dhoot commented on YARN-3363: - The durations do not seem to belong in a ProcessTree class. Instead of flowing the durations through the ProcessTree class, can we add the metrics directly in ContainersMonitorImpl#handle by reading startEvent.getLaunchDuration() directly? Nit: ContainerMetrics#recordTime could maybe be renamed to recordStateChangeDurations or something to that effect. sendContainerMonitorStartEvent calculates the durations in two different ways; maybe introduce two local variables for launchDuration and localizationDuration, and then we do not need the comment. add localization and container launch time to ContainerMetrics at NM to show these timing information for each active container. Key: YARN-3363 URL: https://issues.apache.org/jira/browse/YARN-3363 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Reporter: zhihai xu Assignee: zhihai xu Labels: metrics, supportability Attachments: YARN-3363.000.patch Add localization and container launch time to ContainerMetrics at the NM to show this timing information for each active container. Currently ContainerMetrics has the container's actual memory usage (YARN-2984), actual CPU usage (YARN-3122), resource and pid (YARN-3022). It would be better to also have localization and container launch time in ContainerMetrics for each active container. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
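For illustration, a rough sketch of the flow the reviewer suggests: record the durations when the start event is handled, rather than threading them through the process-tree class. Every type below is a simplified stand-in for the real NM classes, and the method names merely mirror those mentioned in the comment.
{code}
class ContainerStartMonitoringEvent {
  private final long launchDuration;       // ms spent launching the container
  private final long localizationDuration; // ms spent localizing resources

  ContainerStartMonitoringEvent(long launchDuration, long localizationDuration) {
    this.launchDuration = launchDuration;
    this.localizationDuration = localizationDuration;
  }
  long getLaunchDuration() { return launchDuration; }
  long getLocalizationDuration() { return localizationDuration; }
}

class ContainerMetricsSketch {
  // Hypothetical recorder mirroring the suggested recordStateChangeDurations()
  void recordStateChangeDurations(long launchMs, long localizationMs) {
    // real code would update NM metrics objects here
    System.out.printf("launch=%dms localization=%dms%n", launchMs, localizationMs);
  }
}

class ContainersMonitorSketch {
  private final ContainerMetricsSketch metrics = new ContainerMetricsSketch();

  // Analogue of ContainersMonitorImpl#handle for the container start event.
  void handle(ContainerStartMonitoringEvent startEvent) {
    metrics.recordStateChangeDurations(
        startEvent.getLaunchDuration(), startEvent.getLocalizationDuration());
  }
}
{code}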
[jira] [Updated] (YARN-3134) [Storage implementation] Exploiting the option of using Phoenix to access HBase backend
[ https://issues.apache.org/jira/browse/YARN-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu updated YARN-3134: Attachment: YARN-3134-042115.patch In this patch I addressed [~djp]'s comments on code readability and on releasing PreparedStatements: PreparedStatements are now released when the try-with-resources statements complete. I've also addressed the concurrent modification exception pointed out by [~zjshen]. Now we're using per-thread Phoenix JDBC connections, which are mapped to the same heavy-weight HBase connection internally. [Storage implementation] Exploiting the option of using Phoenix to access HBase backend --- Key: YARN-3134 URL: https://issues.apache.org/jira/browse/YARN-3134 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Zhijie Shen Assignee: Li Lu Attachments: YARN-3134-040915_poc.patch, YARN-3134-041015_poc.patch, YARN-3134-041415_poc.patch, YARN-3134-042115.patch, YARN-3134DataSchema.pdf Quote the introduction on Phoenix web page: {code} Apache Phoenix is a relational database layer over HBase delivered as a client-embedded JDBC driver targeting low latency queries over HBase data. Apache Phoenix takes your SQL query, compiles it into a series of HBase scans, and orchestrates the running of those scans to produce regular JDBC result sets. The table metadata is stored in an HBase table and versioned, such that snapshot queries over prior versions will automatically use the correct schema. Direct use of the HBase API, along with coprocessors and custom filters, results in performance on the order of milliseconds for small queries, or seconds for tens of millions of rows. {code} It may simplify how our implementation reads/writes data from/to HBase, and makes it easy to build indexes and compose complex queries. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
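For illustration, a minimal sketch of the two patterns mentioned in the comment: a per-thread Phoenix JDBC connection (Phoenix multiplexes these onto one HBase connection internally) plus try-with-resources so each PreparedStatement is always released. The JDBC URL and table/column names below are hypothetical, not the YARN-3134 schema.
{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class PhoenixWriterSketch {
  private static final String JDBC_URL = "jdbc:phoenix:zk-host"; // hypothetical
  private static final ThreadLocal<Connection> CONN =
      ThreadLocal.withInitial(PhoenixWriterSketch::open);

  private static Connection open() {
    try {
      return DriverManager.getConnection(JDBC_URL);
    } catch (SQLException e) {
      throw new RuntimeException("Failed to open Phoenix connection", e);
    }
  }

  public void writeEntity(String entityId, String info) throws SQLException {
    String sql = "UPSERT INTO TIMELINE_ENTITY (ID, INFO) VALUES (?, ?)"; // hypothetical table
    // try-with-resources guarantees the PreparedStatement is closed (released)
    // even if execution throws.
    try (PreparedStatement stmt = CONN.get().prepareStatement(sql)) {
      stmt.setString(1, entityId);
      stmt.setString(2, info);
      stmt.executeUpdate();
    }
    CONN.get().commit();
  }
}
{code}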
[jira] [Commented] (YARN-3287) TimelineClient kerberos authentication failure uses wrong login context.
[ https://issues.apache.org/jira/browse/YARN-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505910#comment-14505910 ] Zhijie Shen commented on YARN-3287: --- It breaks the timeline access control of distributed shell. In distributed shell AM:
{code}
if (conf.getBoolean(YarnConfiguration.TIMELINE_SERVICE_ENABLED,
    YarnConfiguration.DEFAULT_TIMELINE_SERVICE_ENABLED)) {
  // Creating the Timeline Client
  timelineClient = TimelineClient.createTimelineClient();
  timelineClient.init(conf);
  timelineClient.start();
} else {
  timelineClient = null;
  LOG.warn("Timeline service is not enabled");
}
{code}
{code}
ugi.doAs(new PrivilegedExceptionAction<TimelinePutResponse>() {
  @Override
  public TimelinePutResponse run() throws Exception {
    return timelineClient.putEntities(entity);
  }
});
{code}
This Jira changes the timeline client to get the right ugi at serviceInit, but the DS AM still doesn't use the submitter ugi to init the timeline client; instead it uses the ugi for each put entity call. It results in the put request being made as the wrong user. TimelineClient kerberos authentication failure uses wrong login context. Key: YARN-3287 URL: https://issues.apache.org/jira/browse/YARN-3287 Project: Hadoop YARN Issue Type: Bug Reporter: Jonathan Eagles Assignee: Daryn Sharp Fix For: 2.7.0 Attachments: YARN-3287.1.patch, YARN-3287.2.patch, YARN-3287.3.patch, timeline.patch TimelineClientImpl:doPosting is not wrapped in a doAs, which can cause failure for yarn clients to create timeline domains during job submission. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3413) Node label attributes (like exclusivity) should settable via addToClusterNodeLabels but shouldn't be changeable at runtime
[ https://issues.apache.org/jira/browse/YARN-3413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-3413: - Attachment: YARN-3413.4.patch Attached ver.4, which addresses the findbugs warnings and test failures (the MR test failures seem unrelated to this patch). Node label attributes (like exclusivity) should settable via addToClusterNodeLabels but shouldn't be changeable at runtime -- Key: YARN-3413 URL: https://issues.apache.org/jira/browse/YARN-3413 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-3413.1.patch, YARN-3413.2.patch, YARN-3413.3.patch, YARN-3413.4.patch As mentioned in : https://issues.apache.org/jira/browse/YARN-3345?focusedCommentId=14384947page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14384947. Changing node label exclusivity and/or other attributes may not be a real use case, and also we should support setting node label attributes while adding them to the cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3514) Active directory usernames like domain\login cause YARN failures
[ https://issues.apache.org/jira/browse/YARN-3514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505711#comment-14505711 ] Chris Nauroth commented on YARN-3514: - [~john.lil...@redpoint.net], thank you for the confirmation. Active directory usernames like domain\login cause YARN failures Key: YARN-3514 URL: https://issues.apache.org/jira/browse/YARN-3514 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.2.0 Environment: CentOS6 Reporter: john lilley Assignee: Chris Nauroth Priority: Minor Attachments: YARN-3514.001.patch, YARN-3514.002.patch We have a 2.2.0 (Cloudera 5.3) cluster running on CentOS6 that is Kerberos-enabled and uses an external AD domain controller for the KDC. We are able to authenticate, browse HDFS, etc. However, YARN fails during localization because it seems to get confused by the presence of a \ character in the local user name. Our AD authentication on the nodes goes through sssd and set configured to map AD users onto the form domain\username. For example, our test user has a Kerberos principal of hadoopu...@domain.com and that maps onto a CentOS user domain\hadoopuser. We have no problem validating that user with PAM, logging in as that user, su-ing to that user, etc. However, when we attempt to run a YARN application master, the localization step fails when setting up the local cache directory for the AM. The error that comes out of the RM logs: 2015-04-17 12:47:09 INFO net.redpoint.yarnapp.Client[0]: monitorApplication: ApplicationReport: appId=1, state=FAILED, progress=0.0, finalStatus=FAILED, diagnostics='Application application_1429295486450_0001 failed 1 times due to AM Container for appattempt_1429295486450_0001_01 exited with exitCode: -1000 due to: Application application_1429295486450_0001 initialization failed (exitCode=255) with output: main : command provided 0 main : user is DOMAIN\hadoopuser main : requested yarn user is domain\hadoopuser org.apache.hadoop.util.DiskChecker$DiskErrorException: Cannot create directory: /data/yarn/nm/usercache/domain%5Chadoopuser/appcache/application_1429295486450_0001/filecache/10 at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:105) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.download(ContainerLocalizer.java:199) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:241) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:169) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.main(ContainerLocalizer.java:347) .Failing this attempt.. Failing the application.' However, when we look on the node launching the AM, we see this: [root@rpb-cdh-kerb-2 ~]# cd /data/yarn/nm/usercache [root@rpb-cdh-kerb-2 usercache]# ls -l drwxr-s--- 4 DOMAIN\hadoopuser yarn 4096 Apr 17 12:10 domain\hadoopuser There appears to be different treatment of the \ character in different places. Something creates the directory as domain\hadoopuser but something else later attempts to use it as domain%5Chadoopuser. I’m not sure where or why the URL escapement converts the \ to %5C or why this is not consistent. 
I should also mention, for the sake of completeness, our auth_to_local rule is set up to map u...@domain.com to domain\user: RULE:[1:$1@$0](^.*@DOMAIN\.COM$)s/^(.*)@DOMAIN\.COM$/domain\\$1/g -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3514) Active directory usernames like domain\login cause YARN failures
[ https://issues.apache.org/jira/browse/YARN-3514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505640#comment-14505640 ] john lilley commented on YARN-3514: --- Sadly, we aren't equipped to upgrade and patch, we are mandated to go with the flow of the commercial distros we support. However I can assure you that our local FS definitely supports the \ in the filename, as I saw the usercache folder with the \ in it. Active directory usernames like domain\login cause YARN failures Key: YARN-3514 URL: https://issues.apache.org/jira/browse/YARN-3514 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.2.0 Environment: CentOS6 Reporter: john lilley Assignee: Chris Nauroth Priority: Minor Attachments: YARN-3514.001.patch, YARN-3514.002.patch We have a 2.2.0 (Cloudera 5.3) cluster running on CentOS6 that is Kerberos-enabled and uses an external AD domain controller for the KDC. We are able to authenticate, browse HDFS, etc. However, YARN fails during localization because it seems to get confused by the presence of a \ character in the local user name. Our AD authentication on the nodes goes through sssd and set configured to map AD users onto the form domain\username. For example, our test user has a Kerberos principal of hadoopu...@domain.com and that maps onto a CentOS user domain\hadoopuser. We have no problem validating that user with PAM, logging in as that user, su-ing to that user, etc. However, when we attempt to run a YARN application master, the localization step fails when setting up the local cache directory for the AM. The error that comes out of the RM logs: 2015-04-17 12:47:09 INFO net.redpoint.yarnapp.Client[0]: monitorApplication: ApplicationReport: appId=1, state=FAILED, progress=0.0, finalStatus=FAILED, diagnostics='Application application_1429295486450_0001 failed 1 times due to AM Container for appattempt_1429295486450_0001_01 exited with exitCode: -1000 due to: Application application_1429295486450_0001 initialization failed (exitCode=255) with output: main : command provided 0 main : user is DOMAIN\hadoopuser main : requested yarn user is domain\hadoopuser org.apache.hadoop.util.DiskChecker$DiskErrorException: Cannot create directory: /data/yarn/nm/usercache/domain%5Chadoopuser/appcache/application_1429295486450_0001/filecache/10 at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:105) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.download(ContainerLocalizer.java:199) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:241) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:169) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.main(ContainerLocalizer.java:347) .Failing this attempt.. Failing the application.' However, when we look on the node launching the AM, we see this: [root@rpb-cdh-kerb-2 ~]# cd /data/yarn/nm/usercache [root@rpb-cdh-kerb-2 usercache]# ls -l drwxr-s--- 4 DOMAIN\hadoopuser yarn 4096 Apr 17 12:10 domain\hadoopuser There appears to be different treatment of the \ character in different places. Something creates the directory as domain\hadoopuser but something else later attempts to use it as domain%5Chadoopuser. I’m not sure where or why the URL escapement converts the \ to %5C or why this is not consistent. 
I should also mention, for the sake of completeness, our auth_to_local rule is set up to map u...@domain.com to domain\user: RULE:[1:$1@$0](^.*@DOMAIN\.COM$)s/^(.*)@DOMAIN\.COM$/domain\\$1/g -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3413) Node label attributes (like exclusivity) should settable via addToClusterNodeLabels but shouldn't be changeable at runtime
[ https://issues.apache.org/jira/browse/YARN-3413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505815#comment-14505815 ] Hadoop QA commented on YARN-3413: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726935/YARN-3413.3.patch against trunk revision dfc1c4c. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 19 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.mapred.TestReporter org.apache.hadoop.mapreduce.v2.TestUberAM org.apache.hadoop.mapred.TestSpecialCharactersInOutputPath org.apache.hadoop.mapred.TestMRIntermediateDataEncryption org.apache.hadoop.yarn.server.resourcemanager.TestClientRMService The test build failed in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7429//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/7429//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-api.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7429//console This message is automatically generated. Node label attributes (like exclusivity) should settable via addToClusterNodeLabels but shouldn't be changeable at runtime -- Key: YARN-3413 URL: https://issues.apache.org/jira/browse/YARN-3413 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-3413.1.patch, YARN-3413.2.patch, YARN-3413.3.patch As mentioned in : https://issues.apache.org/jira/browse/YARN-3345?focusedCommentId=14384947page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14384947. Changing node label exclusivity and/or other attributes may not be a real use case, and also we should support setting node label attributes whiling adding them to cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3522) DistributedShell uses the wrong user to put timeline data
Zhijie Shen created YARN-3522: - Summary: DistributedShell uses the wrong user to put timeline data Key: YARN-3522 URL: https://issues.apache.org/jira/browse/YARN-3522 Project: Hadoop YARN Issue Type: Bug Components: timelineserver Reporter: Zhijie Shen Assignee: Zhijie Shen Priority: Blocker YARN-3287 breaks the timeline access control of distributed shell. In distributed shell AM:
{code}
if (conf.getBoolean(YarnConfiguration.TIMELINE_SERVICE_ENABLED,
    YarnConfiguration.DEFAULT_TIMELINE_SERVICE_ENABLED)) {
  // Creating the Timeline Client
  timelineClient = TimelineClient.createTimelineClient();
  timelineClient.init(conf);
  timelineClient.start();
} else {
  timelineClient = null;
  LOG.warn("Timeline service is not enabled");
}
{code}
{code}
ugi.doAs(new PrivilegedExceptionAction<TimelinePutResponse>() {
  @Override
  public TimelinePutResponse run() throws Exception {
    return timelineClient.putEntities(entity);
  }
});
{code}
YARN-3287 changes the timeline client to get the right ugi at serviceInit, but the DS AM still doesn't use the submitter ugi to init the timeline client; instead it uses the ugi for each put entity call. It results in the put request being made as the wrong user. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
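For illustration, a minimal sketch of the fix direction described above: create and initialize the timeline client as the submitter's UGI, so that identity is captured once instead of relying only on the UGI wrapped around each put call. This is simplified and not the actual YARN-3522 patch.
{code}
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.yarn.api.records.timeline.TimelineEntity;
import org.apache.hadoop.yarn.api.records.timeline.TimelinePutResponse;
import org.apache.hadoop.yarn.client.api.TimelineClient;

public class SubmitterTimelineClientSketch {
  // Create (and init) the client while running as the submitter's UGI, so the
  // client captures the right identity up front.
  public static TimelineClient createAsSubmitter(
      UserGroupInformation submitterUgi, Configuration conf) throws Exception {
    return submitterUgi.doAs(new PrivilegedExceptionAction<TimelineClient>() {
      @Override
      public TimelineClient run() {
        TimelineClient client = TimelineClient.createTimelineClient();
        client.init(conf);
        client.start();
        return client;
      }
    });
  }

  public static TimelinePutResponse publish(
      TimelineClient client, TimelineEntity entity) throws Exception {
    return client.putEntities(entity);
  }
}
{code}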
[jira] [Moved] (YARN-3523) Cleanup ResourceManagerAdministrationProtocol interface audience
[ https://issues.apache.org/jira/browse/YARN-3523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan moved MAPREDUCE-6326 to YARN-3523: - Component/s: (was: resourcemanager) (was: client) resourcemanager client Key: YARN-3523 (was: MAPREDUCE-6326) Project: Hadoop YARN (was: Hadoop Map/Reduce) Cleanup ResourceManagerAdministrationProtocol interface audience Key: YARN-3523 URL: https://issues.apache.org/jira/browse/YARN-3523 Project: Hadoop YARN Issue Type: Bug Components: client, resourcemanager Reporter: Wangda Tan I noticed ResourceManagerAdministrationProtocol has @Private audience for the class and @Public audience for methods. It doesn't make sense to me. We should make class audience and methods audience consistent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3523) Cleanup ResourceManagerAdministrationProtocol interface audience
[ https://issues.apache.org/jira/browse/YARN-3523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505971#comment-14505971 ] Naganarasimha G R commented on YARN-3523: - Hi [~wangda], IIUC we need to make all the methods have @Private audience, right, since all the methods here are for administrative purposes? If you already have a patch for this, please feel free to reassign :) Cleanup ResourceManagerAdministrationProtocol interface audience Key: YARN-3523 URL: https://issues.apache.org/jira/browse/YARN-3523 Project: Hadoop YARN Issue Type: Bug Components: client, resourcemanager Reporter: Wangda Tan Assignee: Naganarasimha G R I noticed ResourceManagerAdministrationProtocol has @Private audience for the class and @Public audience for methods. It doesn't make sense to me. We should make the class audience and the method audience consistent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3495) Confusing log generated by FairScheduler
[ https://issues.apache.org/jira/browse/YARN-3495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506044#comment-14506044 ] Brahma Reddy Battula commented on YARN-3495: Thanks a lot [~ozawa] for reviewing and committing the patch!!! Confusing log generated by FairScheduler Key: YARN-3495 URL: https://issues.apache.org/jira/browse/YARN-3495 Project: Hadoop YARN Issue Type: Bug Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula Fix For: 2.8.0 Attachments: YARN-3495.patch 2015-04-16 12:03:48,531 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Null container completed... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
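As an aside, one plausible shape of a less confusing message (a sketch only, not necessarily the change in YARN-3495.patch) is to name the container and event instead of logging the bare "Null container completed..." line; rmContainer, containerStatus, and event are assumed to be the parameters of the scheduler's completed-container handler.
{code}
// Sketch: report which container completed without a matching RMContainer.
if (rmContainer == null) {
  LOG.info("Container " + containerStatus.getContainerId()
      + " completed with event " + event
      + ", but corresponding RMContainer doesn't exist.");
  return;
}
{code}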
[jira] [Commented] (YARN-3301) Fix the format issue of the new RM web UI and AHS web UI
[ https://issues.apache.org/jira/browse/YARN-3301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506121#comment-14506121 ] Xuan Gong commented on YARN-3301: - bq. seems the outstanding Resource Requests table still has some format issue Attached a new patch and a screenshot of the web page. Fix the format issue of the new RM web UI and AHS web UI Key: YARN-3301 URL: https://issues.apache.org/jira/browse/YARN-3301 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Xuan Gong Assignee: Xuan Gong Attachments: Screen Shot 2015-04-21 at 5.09.25 PM.png, Screen Shot 2015-04-21 at 5.38.39 PM.png, YARN-3301.1.patch, YARN-3301.2.patch, YARN-3301.3.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3523) Cleanup ResourceManagerAdministrationProtocol interface audience
[ https://issues.apache.org/jira/browse/YARN-3523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506118#comment-14506118 ] Wangda Tan commented on YARN-3523: -- [~Naganarasimha], thanks for taking this. Actually I'm not sure what the correct audience setting for ResourceManagerAdministrationProtocol should be: as a public API, it should be @Public, but in practice it is only used by RMAdminCLI, and I don't know whether any 3rd-party projects write their own admin CLI and implement ResourceManagerAdministrationProtocol. To not break compatibility, I think the simple solution is to only change ResourceManagerAdministrationProtocol from @Private to @Public. Let me know if there are any thoughts on this. Cleanup ResourceManagerAdministrationProtocol interface audience Key: YARN-3523 URL: https://issues.apache.org/jira/browse/YARN-3523 Project: Hadoop YARN Issue Type: Bug Components: client, resourcemanager Reporter: Wangda Tan Assignee: Naganarasimha G R I noticed ResourceManagerAdministrationProtocol has @Private audience for the class and @Public audience for methods. It doesn't make sense to me. We should make the class audience and the method audience consistent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
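To make the inconsistency concrete, a small sketch of the two shapes under discussion; refreshQueues is a real method of the protocol, but the signature and annotation style are simplified here rather than copied from trunk.
{code}
// Shape described in the issue: class-level @Private, method-level @Public.
@InterfaceAudience.Private
public interface ResourceManagerAdministrationProtocol {
  @InterfaceAudience.Public
  RefreshQueuesResponse refreshQueues(RefreshQueuesRequest request)
      throws YarnException, IOException;
}

// One consistent alternative along the lines of the comment above: class and
// methods both @Public. Whether to instead mark everything @Private (per the
// earlier comment) is the open question in this thread.
@InterfaceAudience.Public
public interface ResourceManagerAdministrationProtocol {
  @InterfaceAudience.Public
  RefreshQueuesResponse refreshQueues(RefreshQueuesRequest request)
      throws YarnException, IOException;
}
{code}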
[jira] [Updated] (YARN-3301) Fix the format issue of the new RM web UI and AHS web UI
[ https://issues.apache.org/jira/browse/YARN-3301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-3301: Attachment: YARN-3301.3.patch Fix the format issue of the new RM web UI and AHS web UI Key: YARN-3301 URL: https://issues.apache.org/jira/browse/YARN-3301 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Xuan Gong Assignee: Xuan Gong Attachments: Screen Shot 2015-04-21 at 5.09.25 PM.png, Screen Shot 2015-04-21 at 5.38.39 PM.png, YARN-3301.1.patch, YARN-3301.2.patch, YARN-3301.3.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3301) Fix the format issue of the new RM web UI and AHS web UI
[ https://issues.apache.org/jira/browse/YARN-3301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-3301: Attachment: Screen Shot 2015-04-21 at 5.38.39 PM.png Fix the format issue of the new RM web UI and AHS web UI Key: YARN-3301 URL: https://issues.apache.org/jira/browse/YARN-3301 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Xuan Gong Assignee: Xuan Gong Attachments: Screen Shot 2015-04-21 at 5.09.25 PM.png, Screen Shot 2015-04-21 at 5.38.39 PM.png, YARN-3301.1.patch, YARN-3301.2.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)