[jira] [Commented] (YARN-2428) LCE default banned user list should have yarn

2015-01-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14298757#comment-14298757
 ] 

Hudson commented on YARN-2428:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2040 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2040/])
YARN-2428. LCE default banned user list should have yarn (Varun Saxena via aw) 
(aw: rev 9dd0b7a2ab6538d8f72b004eb97c2750ff3d98dd)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c


 LCE default banned user list should have yarn
 -

 Key: YARN-2428
 URL: https://issues.apache.org/jira/browse/YARN-2428
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Allen Wittenauer
Assignee: Varun Saxena
Priority: Trivial
  Labels: newbie
 Fix For: 3.0.0

 Attachments: YARN-2428.001.patch


 When task-controller was retrofitted to YARN, the default banned user list 
 didn't include the yarn user.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2854) The document about timeline service and generic service needs to be updated

2015-01-30 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14298908#comment-14298908
 ] 

Zhijie Shen commented on YARN-2854:
---

Will take a look

 The document about timeline service and generic service needs to be updated
 ---

 Key: YARN-2854
 URL: https://issues.apache.org/jira/browse/YARN-2854
 Project: Hadoop YARN
  Issue Type: Bug
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Naganarasimha G R
Priority: Critical
 Attachments: TimelineServer.html, YARN-2854.20141120-1.patch, 
 YARN-2854.20150128.1.patch, timeline_structure.jpg






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3100) Make YARN authorization pluggable

2015-01-30 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14298881#comment-14298881
 ] 

Allen Wittenauer commented on YARN-3100:


I still see no provided evidence as to why YARN needs its own ACL 
implementation.  It was always a mistake that queue ACLs and the like weren't 
implemented with the common ACL implementation, given how simplistic YARN's 
needs ultimately are. This seems like a good opportunity to fix it without 
making more technical debt as proposed by this JIRA.

I'm still at -1.

 Make YARN authorization pluggable
 -

 Key: YARN-3100
 URL: https://issues.apache.org/jira/browse/YARN-3100
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He
Assignee: Jian He
 Attachments: YARN-3100.1.patch, YARN-3100.2.patch


 The goal is to make the YARN ACL model pluggable so that other 
 authorization tools such as Apache Ranger and Sentry can be integrated.
 Currently, we have 
 - admin ACL
 - queue ACL
 - application ACL
 - time line domain ACL
 - service ACL
 The proposal is to create a YarnAuthorizationProvider interface. The current 
 implementation will be the default; a Ranger or Sentry plug-in 
 can implement this interface.
 Benefit:
 -  Unify the code base. With the default implementation, we can get rid of 
 each specific ACL manager such as AdminAclManager, ApplicationACLsManager, 
 QueueAclsManager, etc.
 - Enable Ranger and Sentry to do authorization for YARN.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3108) ApplicationHistoryServer doesn't process -D arguments

2015-01-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14298752#comment-14298752
 ] 

Hudson commented on YARN-3108:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2040 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2040/])
YARN-3108. ApplicationHistoryServer doesn't process -D arguments (Chang Li via 
jeagles) (jeagles: rev 30a8778c632c0f57cdd005080a470065a60756a8)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryServer.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/TestApplicationHistoryServer.java


 ApplicationHistoryServer doesn't process -D arguments
 -

 Key: YARN-3108
 URL: https://issues.apache.org/jira/browse/YARN-3108
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: chang li
Assignee: chang li
 Fix For: 2.7.0

 Attachments: yarn3108.patch, yarn3108.patch, yarn3108.patch


 ApplicationHistoryServer doesn't process -D arguments when created; it would be 
 nice to have it do that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3029) FSDownload.unpack() uses local locale for FS case conversion, may not work everywhere

2015-01-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14298756#comment-14298756
 ] 

Hudson commented on YARN-3029:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2040 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2040/])
YARN-3029. FSDownload.unpack() uses local locale for FS case conversion, may 
not work everywhere. Contributed by Varun Saxena. (ozawa: rev 
7acce7d3648d6f1e45ce280e2147e7dedf5693fc)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/FSDownload.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestFSDownload.java
* hadoop-yarn-project/CHANGES.txt


 FSDownload.unpack() uses local locale for FS case conversion, may not work 
 everywhere
 -

 Key: YARN-3029
 URL: https://issues.apache.org/jira/browse/YARN-3029
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: Steve Loughran
Assignee: Varun Saxena
 Attachments: YARN-3029-003.patch, YARN-3029.001.patch, 
 YARN-3029.002.patch


 {{FSDownload.unpack()}} lower-cases filenames in the local locale before 
 looking at extensions such as tar, zip, etc.
 {code}
 String lowerDst = dst.getName().toLowerCase();
 {code}
 It MUST use an English locale (e.g. Locale.ENGLISH) for the conversion, else a 
 file named .ZIP won't be recognised as a zip file on a cluster running in a 
 Turkish locale.
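A minimal, runnable sketch of the pitfall (illustration only, not the actual patch): in a Turkish locale, upper-case "I" lower-cases to a dotless "ı", so an extension check based on default-locale lowercasing misses ".ZIP".
{code}
import java.util.Locale;

public class LocaleLowerCaseDemo {
  public static void main(String[] args) {
    String name = "archive.ZIP";
    // Turkish lower-casing turns "I" into a dotless "ı", so the suffix check fails.
    String turkish = name.toLowerCase(new Locale("tr", "TR"));
    // An explicit English locale keeps the expected ASCII mapping.
    String english = name.toLowerCase(Locale.ENGLISH);
    System.out.println(turkish.endsWith(".zip")); // false
    System.out.println(english.endsWith(".zip")); // true
  }
}
{code}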



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3119) Memory limit check need not be enforced unless aggregate usage of all containers is near limit

2015-01-30 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14298876#comment-14298876
 ] 

Allen Wittenauer commented on YARN-3119:


How should scheduling behave in this scenario?  What happens if multiple 
containers are over their limit and/or what order are containers killed?  

 Memory limit check need not be enforced unless aggregate usage of all 
 containers is near limit
 --

 Key: YARN-3119
 URL: https://issues.apache.org/jira/browse/YARN-3119
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3119.prelim.patch


 Today we kill any container pre-emptively even if the total usage of 
 containers on that node is well within the overall limit for YARN. If instead we 
 enforce the memory limit only when the total usage of all containers is close to 
 some configurable ratio of the overall memory assigned to containers, we can 
 allow for flexibility in container memory usage without adverse effects. This is 
 similar in principle to how cgroups uses soft_limit_in_bytes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3101) FairScheduler#fitInMaxShare was added to validate reservations but it does not consider it

2015-01-30 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14298887#comment-14298887
 ] 

Sandy Ryza commented on YARN-3101:
--

[~adhoot] is this the same condition that's evaluated when reserving a resource 
in the first place?  I.e. might we ever make a reservation and then immediately 
end up canceling it?

Also, I believe [~l201514] is correct that 
reservedAppSchedulable.getResource(reservedPriority))) will not return the 
right quantity and node.getReservedContainer().getReservedResource() is 
correct. 

Last of all, while we're at it, can we rename fitInMaxShare to 
fitsInMaxShare?

 FairScheduler#fitInMaxShare was added to validate reservations but it does 
 not consider it 
 ---

 Key: YARN-3101
 URL: https://issues.apache.org/jira/browse/YARN-3101
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3101-Siqi.v1.patch, YARN-3101.001.patch, 
 YARN-3101.002.patch


 YARN-2811 added fitInMaxShare to validate reservations on a queue, but did 
 not include the reservation in its calculations. It also had the condition 
 reversed, so the test still passed because the two bugs cancelled each other out.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3101) FairScheduler#fitInMaxShare was added to validate reservations but it does not consider it

2015-01-30 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated YARN-3101:
--
Attachment: (was: YARN-3101-Siqi.v2.patch)

 FairScheduler#fitInMaxShare was added to validate reservations but it does 
 not consider it 
 ---

 Key: YARN-3101
 URL: https://issues.apache.org/jira/browse/YARN-3101
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3101-Siqi.v1.patch, YARN-3101-Siqi.v2.patch, 
 YARN-3101.001.patch, YARN-3101.002.patch


 YARN-2811 added fitInMaxShare to validate reservations on a queue, but did 
 not include the reservation in its calculations. It also had the condition 
 reversed, so the test still passed because the two bugs cancelled each other out.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2808) yarn client tool can not list app_attempt's container info correctly

2015-01-30 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-2808:

Attachment: YARN-2808.20150130-1.patch

The findbugs issue is fixed, and the following test failures are not related to 
my changes:
org.apache.hadoop.yarn.client.TestResourceTrackerOnHA
org.apache.hadoop.yarn.client.api.impl.TestAMRMClient
org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA
org.apache.hadoop.yarn.client.cli.TestRMAdminCLI

[~zjshen], can you please take a look at this issue too...

 yarn client tool can not list app_attempt's container info correctly
 

 Key: YARN-2808
 URL: https://issues.apache.org/jira/browse/YARN-2808
 Project: Hadoop YARN
  Issue Type: Bug
  Components: client
Affects Versions: 2.6.0
Reporter: Gordon Wang
Assignee: Naganarasimha G R
 Attachments: YARN-2808.20150126-1.patch, YARN-2808.20150130-1.patch


 When enabling timeline server, yarn client can not list the container info 
 for a application attempt correctly.
 Here is the reproduce step.
 # enabling yarn timeline server
 # submit a MR job
 # after the job is finished. use yarn client to list the container info of 
 the app attempt.
 Then, since the RM has cached the application's attempt info, the output show 
 {noformat}
 [hadoop@localhost hadoop-3.0.0-SNAPSHOT]$ ./bin/yarn container -list 
 appattempt_1415168250217_0001_01
 14/11/05 01:19:15 WARN util.NativeCodeLoader: Unable to load native-hadoop 
 library for your platform... using builtin-java classes where applicable
 14/11/05 01:19:15 INFO impl.TimelineClientImpl: Timeline service address: 
 http://0.0.0.0:8188/ws/v1/timeline/
 14/11/05 01:19:16 INFO client.RMProxy: Connecting to ResourceManager at 
 /0.0.0.0:8032
 14/11/05 01:19:16 INFO client.AHSProxy: Connecting to Application History 
 server at /0.0.0.0:10200
 Total number of containers :0
   Container-Id  Start Time Finish 
 Time   StateHost  
   LOG-URL
 {noformat}
 But if the rm is restarted, client can fetch the container info from timeline 
 server correctly.
 {noformat}
 [hadoop@localhost hadoop-3.0.0-SNAPSHOT]$ ./bin/yarn container -list 
 appattempt_1415168250217_0001_01
 14/11/05 01:21:06 WARN util.NativeCodeLoader: Unable to load native-hadoop 
 library for your platform... using builtin-java classes where applicable
 14/11/05 01:21:06 INFO impl.TimelineClientImpl: Timeline service address: 
 http://0.0.0.0:8188/ws/v1/timeline/
 14/11/05 01:21:06 INFO client.RMProxy: Connecting to ResourceManager at 
 /0.0.0.0:8032
 14/11/05 01:21:06 INFO client.AHSProxy: Connecting to Application History 
 server at /0.0.0.0:10200
 Total number of containers :4
   Container-Id  Start Time Finish 
 Time   StateHost  
   LOG-URL
 container_1415168250217_0001_01_01   1415168318376   
 1415168349896COMPLETElocalhost.localdomain:47024 
 http://0.0.0.0:8188/applicationhistory/logs/localhost.localdomain:47024/container_1415168250217_0001_01_01/container_1415168250217_0001_01_01/hadoop
 container_1415168250217_0001_01_02   1415168326399   
 1415168334858COMPLETElocalhost.localdomain:47024 
 http://0.0.0.0:8188/applicationhistory/logs/localhost.localdomain:47024/container_1415168250217_0001_01_02/container_1415168250217_0001_01_02/hadoop
 container_1415168250217_0001_01_03   1415168326400   
 1415168335277COMPLETElocalhost.localdomain:47024 
 http://0.0.0.0:8188/applicationhistory/logs/localhost.localdomain:47024/container_1415168250217_0001_01_03/container_1415168250217_0001_01_03/hadoop
 container_1415168250217_0001_01_04   1415168335825   
 1415168343873COMPLETElocalhost.localdomain:47024 
 http://0.0.0.0:8188/applicationhistory/logs/localhost.localdomain:47024/container_1415168250217_0001_01_04/container_1415168250217_0001_01_04/hadoop
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3104) RM generates new AMRM tokens every heartbeat between rolling and activation

2015-01-30 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299032#comment-14299032
 ] 

Jian He commented on YARN-3104:
---

Thanks for your detailed explanation. Effectively, so far, the new token the client 
gets from the server is not used by the server at all for re-authentication.

patch looks good to me.

 RM generates new AMRM tokens every heartbeat between rolling and activation
 ---

 Key: YARN-3104
 URL: https://issues.apache.org/jira/browse/YARN-3104
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: YARN-3104.001.patch


 When the RM rolls a new AMRM secret, it conveys this to the AMs when it 
 notices they are still connected with the old key.  However neither the RM 
 nor the AM explicitly close the connection or otherwise try to reconnect with 
 the new secret.  Therefore the RM keeps thinking the AM doesn't have the new 
 token on every heartbeat and keeps sending new tokens for the period between 
 the key roll and the key activation.  Once activated the RM no longer squawks 
 in its logs about needing to generate a new token every heartbeat (i.e.: 
 second) for every app, but the apps can still be using the old token.  The 
 token is only checked upon connection to the RM.  The apps don't reconnect 
 when sent a new token, and the RM doesn't force them to reconnect by closing 
 the connection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2543) Resource usage should be published to the timeline server as well

2015-01-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299123#comment-14299123
 ] 

Hadoop QA commented on YARN-2543:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12695578/YARN-2543.20150130-1.patch
  against trunk revision f2c9109.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 1 
release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler
  
org.apache.hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisher
  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6467//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6467//artifact/patchprocess/patchReleaseAuditProblems.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6467//console

This message is automatically generated.

 Resource usage should be published to the timeline server as well
 -

 Key: YARN-2543
 URL: https://issues.apache.org/jira/browse/YARN-2543
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: 2.6.0
Reporter: Zhijie Shen
Assignee: Naganarasimha G R
 Attachments: YARN-2543.20150125-1.patch, YARN-2543.20150130-1.patch


 RM will include the resource usage in the app report, but generic history 
 service doesn't, because RM doesn't publish this data to the timeline server



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2942) Aggregated Log Files should be compacted

2015-01-30 Thread Robert Kanter (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated YARN-2942:

Attachment: YARN-2942-preliminary.002.patch

I've just uploaded YARN-2942-preliminary.002.patch, which follows the v2 design 
doc.  It's basically all there except for unit tests and ZK/Curator security 
stuff, which I'm still working on.  

Here's some more technical info on the implementation:
- Compacted logs are placed in the same directory as the aggregated logs, 
which has some benefits, including
-- No need to duplicate directory structure
-- {{AggregatedLogDeletionService}} handles cleaning up old compacted logs 
without any changes :)
- {{CompactedAggregatedLogFormat}} has a Reader and Writer that handles the 
details of reading and writing the compacted logs.
- To simplify some reading code in the {{AggregatedLogsBlock}}, I created a 
{{LogFormatReader}} interface, which defines some common methods that both an 
{{AggregatedLogFormat.LogReader}} and a 
{{CompactedAggregatedLogFormat.LogReader}} have
- The {{AggregatedLogsBlock}} first tries to read from the compacted log file; 
if it can't find it, or can't find the container in the index, or has some 
other problem, it will fall back to the aggregated log and have the same 
behavior as before
-- The file formats for the aggregated logs and compacted logs are similar 
enough that the {{AggregatedLogFormat.ContainerLogsReader}} can be used on 
either, so there's no new log file parsing code for that
- Here's the process that the NM goes through (if compaction is enabled):
-- After the {{AppLogAggregatorImpl}} is done uploading aggregated log files, 
it will then try to acquire the Curator lock for the current application
-- Then it will append its log file
-- Then it will delete its aggregated log file
-- Then it will release the lock

It would be great if I could get some feedback on the current patch so far.
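For readers following the steps above, here is a rough sketch of the per-application locking sequence, assuming Curator's InterProcessMutex; the lock path and the two helper methods are hypothetical stand-ins for the real format classes, not code from the patch:
{code}
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.recipes.locks.InterProcessMutex;

class CompactionSketch {
  // Hypothetical stand-ins for the reader/writer logic described above.
  void appendToCompactedLog(String appId) { /* append this NM's aggregated log */ }
  void deleteAggregatedLog(String appId)  { /* remove this NM's aggregated file */ }

  void compactAfterUpload(CuratorFramework curator, String appId) throws Exception {
    // One lock per application so only one NM appends to the compacted log at a time.
    InterProcessMutex lock = new InterProcessMutex(curator, "/yarn-log-compaction/" + appId);
    lock.acquire();
    try {
      appendToCompactedLog(appId);   // append this NM's log file
      deleteAggregatedLog(appId);    // delete the per-NM aggregated file
    } finally {
      lock.release();                // release the lock
    }
  }
}
{code}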

 Aggregated Log Files should be compacted
 

 Key: YARN-2942
 URL: https://issues.apache.org/jira/browse/YARN-2942
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 2.6.0
Reporter: Robert Kanter
Assignee: Robert Kanter
 Attachments: CompactedAggregatedLogsProposal_v1.pdf, 
 CompactedAggregatedLogsProposal_v2.pdf, YARN-2942-preliminary.001.patch, 
 YARN-2942-preliminary.002.patch


 Turning on log aggregation allows users to easily store container logs in 
 HDFS and subsequently view them in the YARN web UIs from a central place.  
 Currently, there is a separate log file for each Node Manager.  This can be a 
 problem for HDFS if you have a cluster with many nodes as you’ll slowly start 
 accumulating many (possibly small) files per YARN application.  The current 
 “solution” for this problem is to configure YARN (actually the JHS) to 
 automatically delete these files after some amount of time.  
 We should improve this by compacting the per-node aggregated log files into 
 one log file per application.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2031) YARN Proxy model doesn't support REST APIs in AMs

2015-01-30 Thread Jonathan Maron (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Maron updated YARN-2031:
-
Attachment: YARN-2031.patch.001

initial attempt at supplying the correct HTTP response code (307) from the 
proxy servlet

 YARN Proxy model doesn't support REST APIs in AMs
 -

 Key: YARN-2031
 URL: https://issues.apache.org/jira/browse/YARN-2031
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.4.0
Reporter: Steve Loughran
Assignee: Steve Loughran
 Attachments: YARN-2031.patch.001


 AMs can't support REST APIs because
 # the AM filter redirects all requests to the proxy with a 302 response (not 
 307)
 # the proxy doesn't forward PUT/POST/DELETE verbs
 Either the AM filter needs to return 307 and the proxy needs to forward the 
 verbs, or the AM filter should not filter the REST part of the web site.
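As a minimal illustration of the first point (a sketch against the plain servlet API, not the actual AmIpFilter code), a 307 redirect preserves the request method and body, so PUT/POST/DELETE get replayed against the proxy:
{code}
import javax.servlet.http.HttpServletResponse;

class Redirect307Sketch {
  // Sketch only: issue a 307 instead of the 302 produced by sendRedirect(),
  // so clients replay PUT/POST/DELETE against the proxy URL.
  static void redirectToProxy(HttpServletResponse resp, String proxyUrl) {
    resp.setStatus(307);                  // 307 Temporary Redirect keeps the verb and body
    resp.setHeader("Location", proxyUrl);
  }
}
{code}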



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3119) Memory limit check need not be enforced unless aggregate usage of all containers is near limit

2015-01-30 Thread Anubhav Dhoot (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299304#comment-14299304
 ] 

Anubhav Dhoot commented on YARN-3119:
-

Scheduling should continue as before. If a newly scheduled container causes the 
ratio to be exceeded, we would kill the offending containers. If we don't exceed 
the limit, the offending containers get a chance to succeed, which can improve 
the throughput of jobs that have skews like this.
If multiple containers are over their limit, they are all killed for now. In the 
future we can be more sophisticated and kill containers in reverse order of the 
amount by which they exceed their limit, or by some other criteria, until we go 
back below the ratio. That would be a good follow-up improvement.
In general this JIRA attempts to make memory a little more of a flexible 
resource.
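A hedged sketch of the check being discussed; the names (maybeEnforce, enforcementRatio, the maps) are illustrative and not from the preliminary patch:
{code}
import java.util.Map;

class SoftLimitSketch {
  // Illustration only: enforce per-container limits only when the node's
  // aggregate container usage nears a configurable fraction of the memory
  // assigned to containers.
  static void maybeEnforce(Map<String, Long> usageByContainer,
                           Map<String, Long> limitByContainer,
                           long totalContainerMemory, double enforcementRatio) {
    long aggregate = usageByContainer.values().stream().mapToLong(Long::longValue).sum();
    if (aggregate < enforcementRatio * totalContainerMemory) {
      return; // enough headroom on the node: let over-limit containers keep running
    }
    for (Map.Entry<String, Long> e : usageByContainer.entrySet()) {
      if (e.getValue() > limitByContainer.getOrDefault(e.getKey(), Long.MAX_VALUE)) {
        System.out.println("would kill over-limit container " + e.getKey());
      }
    }
  }
}
{code}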

 Memory limit check need not be enforced unless aggregate usage of all 
 containers is near limit
 --

 Key: YARN-3119
 URL: https://issues.apache.org/jira/browse/YARN-3119
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3119.prelim.patch


 Today we kill any container pre-emptively even if the total usage of 
 containers on that node is well within the overall limit for YARN. If instead we 
 enforce the memory limit only when the total usage of all containers is close to 
 some configurable ratio of the overall memory assigned to containers, we can 
 allow for flexibility in container memory usage without adverse effects. This is 
 similar in principle to how cgroups uses soft_limit_in_bytes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3101) FairScheduler#fitInMaxShare was added to validate reservations but it does not consider it

2015-01-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299306#comment-14299306
 ] 

Hadoop QA commented on YARN-3101:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12695624/YARN-3101-Siqi.v2.patch
  against trunk revision 951b360.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6470//console

This message is automatically generated.

 FairScheduler#fitInMaxShare was added to validate reservations but it does 
 not consider it 
 ---

 Key: YARN-3101
 URL: https://issues.apache.org/jira/browse/YARN-3101
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3101-Siqi.v1.patch, YARN-3101-Siqi.v2.patch, 
 YARN-3101.001.patch, YARN-3101.002.patch


 YARN-2811 added fitInMaxShare to validate reservations on a queue, but did 
 not include the reservation in its calculations. It also had the condition 
 reversed, so the test still passed because the two bugs cancelled each other out.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3101) FairScheduler#fitInMaxShare was added to validate reservations but it does not consider it

2015-01-30 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated YARN-3101:
--
Attachment: YARN-3101-Siqi.v2.patch

 FairScheduler#fitInMaxShare was added to validate reservations but it does 
 not consider it 
 ---

 Key: YARN-3101
 URL: https://issues.apache.org/jira/browse/YARN-3101
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3101-Siqi.v1.patch, YARN-3101-Siqi.v2.patch, 
 YARN-3101.001.patch, YARN-3101.002.patch


 YARN-2811 added fitInMaxShare to validate reservations on a queue, but did 
 not include the reservation in its calculations. It also had the condition 
 reversed, so the test still passed because the two bugs cancelled each other out.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3101) FairScheduler#fitInMaxShare was added to validate reservations but it does not consider it

2015-01-30 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated YARN-3101:
--
Attachment: YARN-3101-Siqi.v2.patch

 FairScheduler#fitInMaxShare was added to validate reservations but it does 
 not consider it 
 ---

 Key: YARN-3101
 URL: https://issues.apache.org/jira/browse/YARN-3101
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3101-Siqi.v1.patch, YARN-3101-Siqi.v2.patch, 
 YARN-3101.001.patch, YARN-3101.002.patch


 YARN-2811 added fitInMaxShare to validate reservations on a queue, but did 
 not include the reservation in its calculations. It also had the condition 
 reversed, so the test still passed because the two bugs cancelled each other out.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2868) Add metric for initial container launch time to FairScheduler

2015-01-30 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299302#comment-14299302
 ] 

Wangda Tan commented on YARN-2868:
--

bq. Our scenario is debugging queue related issues for which we need queue 
related metrics because scheduling decisions are made based on the queue. What 
would be a good place to add metrics for all those queue related metrics?
It makes sense to me since it's use-case driven.
However, I wonder whether the first container allocation delay is correctly 
calculated in this patch. Consider a queue with some pending applications where 
no resources get allocated from the RM (perhaps because of some issue in the 
cluster). In this case, the first container allocation delay will be 0. I think 
we should count the time an app spends waiting for the RM to allocate a 
container. Then, even if no container is allocated in a queue, the first 
container allocation delay will keep increasing consistently, which can help in 
troubleshooting cluster issues.

Does this make sense? [~jianhe].

 Add metric for initial container launch time to FairScheduler
 -

 Key: YARN-2868
 URL: https://issues.apache.org/jira/browse/YARN-2868
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Ray Chiang
Assignee: Anubhav Dhoot
  Labels: metrics, supportability
 Attachments: YARN-2868-01.patch, YARN-2868.002.patch, 
 YARN-2868.003.patch, YARN-2868.004.patch, YARN-2868.005.patch, 
 YARN-2868.006.patch, YARN-2868.007.patch


 Add a metric to measure the latency between starting container allocation 
 and first container actually allocated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3119) Memory limit check need not be enforced unless aggregate usage of all containers is near limit

2015-01-30 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299326#comment-14299326
 ] 

Wangda Tan commented on YARN-3119:
--

IMHO, this could be problematic if an under-used container (c1) wants to get 
more resources, but the resources are over-used by another container (c2). It is 
possible that c1 tries to allocate memory and fails because memory is exhausted, 
since the NM needs some time to get the resources back (by killing c2).

 Memory limit check need not be enforced unless aggregate usage of all 
 containers is near limit
 --

 Key: YARN-3119
 URL: https://issues.apache.org/jira/browse/YARN-3119
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3119.prelim.patch


 Today we kill any container pre-emptively even if the total usage of 
 containers on that node is well within the overall limit for YARN. If instead we 
 enforce the memory limit only when the total usage of all containers is close to 
 some configurable ratio of the overall memory assigned to containers, we can 
 allow for flexibility in container memory usage without adverse effects. This is 
 similar in principle to how cgroups uses soft_limit_in_bytes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2543) Resource usage should be published to the timeline server as well

2015-01-30 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299317#comment-14299317
 ] 

Zhijie Shen commented on YARN-2543:
---

Can you check the test failure in TestSystemMetricsPublisher?

 Resource usage should be published to the timeline server as well
 -

 Key: YARN-2543
 URL: https://issues.apache.org/jira/browse/YARN-2543
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: 2.6.0
Reporter: Zhijie Shen
Assignee: Naganarasimha G R
 Attachments: YARN-2543.20150125-1.patch, YARN-2543.20150130-1.patch


 RM will include the resource usage in the app report, but generic history 
 service doesn't, because RM doesn't publish this data to the timeline server



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3101) FairScheduler#fitInMaxShare was added to validate reservations but it does not consider it

2015-01-30 Thread Anubhav Dhoot (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299348#comment-14299348
 ] 

Anubhav Dhoot commented on YARN-3101:
-

Hi [~l201514] why do we need to make the comparator  instead =. What case 
does this address? 
[~sandyr] I did not see a check when placing a reservation. We check queue 
usage once in FSLeafQueue#assignContainerPreCheck, but we do not know the 
container size until the actual reserve happens in FSAppAttempt's reserve.

 FairScheduler#fitInMaxShare was added to validate reservations but it does 
 not consider it 
 ---

 Key: YARN-3101
 URL: https://issues.apache.org/jira/browse/YARN-3101
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3101-Siqi.v1.patch, YARN-3101-Siqi.v2.patch, 
 YARN-3101.001.patch, YARN-3101.002.patch


 YARN-2811 added fitInMaxShare to validate reservations on a queue, but did 
 not include the reservation in its calculations. It also had the condition 
 reversed, so the test still passed because the two bugs cancelled each other out.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3112) AM restart and keep containers from previous attempts, then new container launch failed

2015-01-30 Thread Jack Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299003#comment-14299003
 ] 

Jack Chen commented on YARN-3112:
-

I have found the cause of this error: the newly launched app attempt transfers 
the old containers from previous attempts, so the NodeSet in 
NMTokenSecretManagerInRM.java is already populated. When the new app attempt 
gets the allocated containers via pullNewlyAllocatedContainersAndNMTokens(), it 
gets a null nmToken because of the full NodeSet in createAndGetNMToken(). The 
null nmToken is returned to the ContainerLauncher, so the new container fails to 
launch. What I have done is clear the NodeSet in 
pullNewlyAllocatedContainersAndNMTokens() before the creation of the container 
and NM tokens.

   public synchronized ContainersAndNMTokensAllocation
       pullNewlyAllocatedContainersAndNMTokens() {
     List<Container> returnContainerList =
         new ArrayList<Container>(newlyAllocatedContainers.size());
     List<NMToken> nmTokens = new ArrayList<NMToken>();
+    // clear the NodeSet for NMTokens
+    rmContext.getNMTokenSecretManager().clearNodeSetForAttempt(getApplicationAttemptId());
     for (Iterator<RMContainer> i = newlyAllocatedContainers.iterator(); i
         .hasNext();) {
       RMContainer rmContainer = i.next();
       Container container = rmContainer.getContainer();
       try {
         // create container token and NMToken altogether.
         container.setContainerToken(rmContext.getContainerTokenSecretManager()
             .createContainerToken(container.getId(), container.getNodeId(),
                 getUser(), container.getResource(), container.getPriority(),
                 rmContainer.getCreationTime(), this.logAggregationContext));
         NMToken nmToken =
             rmContext.getNMTokenSecretManager().createAndGetNMToken(getUser(),
                 getApplicationAttemptId(), container);
+        // check whether nmToken is null
+        LOG.info("[hchen] NMToken for container " + container.getId()
+            + " NMToken:" + nmToken);
         if (nmToken != null) {
           nmTokens.add(nmToken);
         }
       } catch (IllegalArgumentException e) {
         // DNS might be down, skip returning this container.
         LOG.error("Error trying to assign container token and NM token to"
             + " an allocated container " + container.getId(), e);
         continue;
       }
       returnContainerList.add(container);
       i.remove();
       rmContainer.handle(new RMContainerEvent(rmContainer.getContainerId(),
           RMContainerEventType.ACQUIRED));
     }
     return new ContainersAndNMTokensAllocation(returnContainerList, nmTokens);
   }

 AM restart and keep containers from previous attempts, then new container 
 launch failed
 ---

 Key: YARN-3112
 URL: https://issues.apache.org/jira/browse/YARN-3112
 Project: Hadoop YARN
  Issue Type: Bug
  Components: applications, resourcemanager
Affects Versions: 2.6.0
 Environment: in real linux cluster
Reporter: Jack Chen

 This error is very similar to YARN-1795 and YARN-1839, but I have checked the 
 solutions of those JIRAs and their patches are already included in my version. I 
 think this error is caused by the different NMTokens between the old and new 
 app attempts. The new AM has inherited the old tokens from the previous AM 
 according to my configuration (keepContainers=true), so the tokens for new 
 containers are replaced by the old ones in the NMTokenCache.
 206 2015-01-29 10:04:49,603 ERROR [ContainerLauncher #0] 
 org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Container 
 launch failed for  container_1422546145900_0001_02_02 : 
 org.apache.hadoop.security.token.SecretManager$InvalidToken: No NMToken sent 
 for ixk02:47625
  207 ›   at 
 org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProt
  ocolProxy.java:256)
  208 ›   at 
 org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.init(ContainerManagementProtoc
  olProxy.java:246)
  209 ›   at 
 org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:132)
  210 ›   at 
 org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:401)
  211 ›   at 
 org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138)
  212 ›   at 
 

[jira] [Commented] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1

2015-01-30 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299137#comment-14299137
 ] 

Sangjin Lee commented on YARN-2928:
---

Thanks [~zjshen] for putting it together! It looks good mostly. Some high level 
comments...

(1) Are "relates to" and "is related to" meant to capture the parent-child 
relationship?

(2)
(Flow) run and application definitely have a parent-child relationship.

Now it's less clear between the flow and the flow run. One scenario that is 
definitely worth considering is a flow of flows, and that brings some 
complications to this.

Suppose you have an oozie flow that starts a pig script which in turn spawns 
multiple MR jobs. If flow is an entity and parent of the flow run, how to model 
this situation becomes more challenging. One idea might be

oozie flow -> oozie flow run -> pig flow -> pig flow run -> MR job

However, the oozie flow run is not really the parent of the pig flow. Rather, 
the oozie flow run is the parent of the pig flow run.

Another idea is not to have the flow as a separate entity but as metadata of 
the flow run entities. And that's actually what the design doc indicates (see 
sections 3.1.1. and 3.1.2).

Now one issue with not having the flow as an entity is that it might complicate 
the aggregation scenario. More on that later...

(3) Could we stick with the same terminology as in the design doc? Those are 
flow and flow run. Thoughts? Better suggestions?

(4)
The part about the metrics would need to be further expanded with the metrics 
API JIRA, but I definitely see at least two types of metrics: one that requires 
a time series and another that doesn't. The former may be something like CPU, 
and the latter would be something like HDFS bytes written for example.

For the latter type, the only value that matters for a given metric is the 
latest value. And depending on which type, the way to implement the storage 
could be hugely different.

I think we need to come up with a well-defined set of metric types that cover 
most useful cases. Initially we said we were going to look at the existing 
hadoop metrics types, but we might need to come up with our own here.

(5)
The parent-child relationship (and therefore the necessity of making things 
entities) is tightly related with *aggregation* (rolling up the values from 
children to parent). The idea was that for parent-child entities aggregation 
would be done generically as part of creating/updating those entities (what we 
called primary aggregation in some discussion).

If cluster or user is not an entity, then there is no parent-child 
relationship, and aggregation from flows to user or cluster would have to be 
done explicitly outside the context of the parent-child relationship.

Of course that is doable; we could just do it as specific aggregation. Maybe 
that's what we need to do (and the queue-level aggregation which Robert 
mentioned could be treated in the same manner).

Either way, I think we should mention how the run/flow/user/cluster/queue 
aggregation would be done.
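To make the distinction in (4) concrete, a purely illustrative sketch (these types and names do not exist anywhere in the code yet):
{code}
import java.util.NavigableMap;
import java.util.TreeMap;

class MetricShapesSketch {
  // Single-value metric: only the latest value matters (e.g. HDFS bytes written).
  static class SingleValueMetric {
    volatile long latest;
    void update(long value) { latest = value; }
  }

  // Time-series metric: every (timestamp, value) point is retained (e.g. CPU).
  static class TimeSeriesMetric {
    final NavigableMap<Long, Double> points = new TreeMap<>();
    void record(long timestampMs, double value) { points.put(timestampMs, value); }
  }
}
{code}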

 Application Timeline Server (ATS) next gen: phase 1
 ---

 Key: YARN-2928
 URL: https://issues.apache.org/jira/browse/YARN-2928
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Vinod Kumar Vavilapalli
Priority: Critical
 Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal 
 v1.pdf


 We have the application timeline server implemented in yarn per YARN-1530 and 
 YARN-321. Although it is a great feature, we have recognized several critical 
 issues and features that need to be addressed.
 This JIRA proposes the design and implementation changes to address those. 
 This is phase 1 of this effort.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3020) n similar addContainerRequest()s produce n*(n+1)/2 containers

2015-01-30 Thread Peter D Kirchner (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299184#comment-14299184
 ] 

Peter D Kirchner commented on YARN-3020:


The expected usage you describe and the current implementation contain a basic 
synchronization problem.
The client application's RPC updates requests to the RM before it receives the 
containers newly assigned during that heartbeat.
Therefore, if (as is currently the case) the client calculates the total 
requests, the total is too large by at least the number of matching incoming 
assignments.
Per expected usage and current implementation, both add and remove cause this 
obsolete, too-high total to be sent.
Cause or coincidence, I see applications (including but not limited to 
distributedShell) making matching requests in a short interval and never 
calling remove.
They receive the behavior they need, or closer to it than the expected usage 
would produce.

Further, in this API implementation/expected usage the remove API tries to 
serve two purposes that are similar but not identical: to update the 
client-side bookkeeping and to identify the request data to be sent to the 
server.  The problem here is that if there are only removes for allocated 
containers, then the server-side bookkeeping is correct until the client sends 
the total.  The removes called for incoming assigned containers should not be 
forwarded to the RM until there is at least one matching add, or a bona-fide 
removal of a previously add-ed request.

I suppose the current implementation could be defended because its error is:
1) only too high by the number of matching incoming assignments,
2) persists only for the number of heartbeats it takes to clear the 
out of sync condition
3) results in spurious allocations only once the application's 
intentional matching requests were granted.
I maintain that spurious allocations are worst-case and especially damaging if 
obtained by preemption.

I want to suggest an alternative that is simpler and accurate, and limited to 
the AMRMClient and RM. The fact that the scheduler is updated by replacement 
informs the choice of where Yarn should calculate that total for a matching 
request.
The client is in a position to accurately calculate how much its current wants 
differ from what it has asked for over its life.
This suggests a fix to the synchronization problem by having the client send 
the net of add/remove requests it has accumulated over a heartbeat cycle,
and having the RM update its totals, from the difference obtained from the 
client, using synchronized methods.
(Note, this client would not ordinarily call remove when it received a 
container, as the scheduler has already
properly accounted for it when it made the allocation).

 n similar addContainerRequest()s produce n*(n+1)/2 containers
 -

 Key: YARN-3020
 URL: https://issues.apache.org/jira/browse/YARN-3020
 Project: Hadoop YARN
  Issue Type: Bug
  Components: client
Affects Versions: 2.5.0, 2.6.0, 2.5.1, 2.5.2
Reporter: Peter D Kirchner
   Original Estimate: 24h
  Remaining Estimate: 24h

 BUG: If the application master calls addContainerRequest() n times, but with 
 the same priority, I get up to 1+2+3+...+n containers = n*(n+1)/2 .  The most 
 containers are requested when the interval between calls to 
 addContainerRequest() exceeds the heartbeat interval of calls to allocate() 
 (in AMRMClientImpl's run() method).
 If the application master calls addContainerRequest() n times, but with a 
 unique priority each time, I get n containers (as I intended).
 Analysis:
 There is a logic problem in AMRMClientImpl.java.
 Although AMRMClientImpl.java, allocate() does an ask.clear() , on subsequent 
 calls to addContainerRequest(), addResourceRequest() finds the previous 
 matching remoteRequest and increments the container count rather than 
 starting anew, and does an addResourceRequestToAsk() which defeats the 
 ask.clear().
 From documentation and code comments, it was hard for me to discern the 
 intended behavior of the API, but the inconsistency reported in this issue 
 suggests one case or the other is implemented incorrectly.
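A toy illustration of the arithmetic (not AMRMClientImpl code), assuming each addContainerRequest() lands in its own heartbeat so the accumulated count is re-sent and re-granted every time:
{code}
class OverAskSketch {
  public static void main(String[] args) {
    int n = 3;
    int outstanding = 0; // container count stored in the matching ResourceRequest
    int granted = 0;     // containers the RM ends up allocating
    for (int call = 1; call <= n; call++) {
      outstanding++;          // addContainerRequest() bumps the existing request
      granted += outstanding; // the next allocate() heartbeat sends the full count again
    }
    System.out.println(granted); // 1 + 2 + 3 = 6 = n*(n+1)/2
  }
}
{code}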



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3120) YarnException on windows + org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to setup local dir nm-local-dir, which was marked as good.

2015-01-30 Thread vaidhyanathan (JIRA)
vaidhyanathan created YARN-3120:
---

 Summary: YarnException on windows + 
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to setup local 
dir nm-local-dir, which was marked as good.
 Key: YARN-3120
 URL: https://issues.apache.org/jira/browse/YARN-3120
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
 Environment: Windows 8 , Hadoop 2.6.0
Reporter: vaidhyanathan


Hi,

I tried to follow the instructions in 
http://wiki.apache.org/hadoop/Hadoop2OnWindows and have set up hadoop-2.6.0 
on my Windows system.

I was able to start everything properly, but when I try to run the wordcount job 
as given in the above URL, the job fails with the exception below.
15/01/30 12:56:09 INFO localizer.ResourceLocalizationService: Localizer failed
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to setup local dir /tmp/hadoop-haremangala/nm-local-dir, which was marked as good.
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.getInitializedLocalDirs(ResourceLocalizationService.java:1372)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.access$900(ResourceLocalizationService.java:137)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1085)




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2808) yarn client tool can not list app_attempt's container info correctly

2015-01-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299033#comment-14299033
 ] 

Hadoop QA commented on YARN-2808:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12695573/YARN-2808.20150130-1.patch
  against trunk revision f2c9109.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 1 
release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client:

  org.apache.hadoop.yarn.client.TestResourceTrackerOnHA
  
org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6466//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6466//artifact/patchprocess/patchReleaseAuditProblems.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6466//console

This message is automatically generated.

 yarn client tool can not list app_attempt's container info correctly
 

 Key: YARN-2808
 URL: https://issues.apache.org/jira/browse/YARN-2808
 Project: Hadoop YARN
  Issue Type: Bug
  Components: client
Affects Versions: 2.6.0
Reporter: Gordon Wang
Assignee: Naganarasimha G R
 Attachments: YARN-2808.20150126-1.patch, YARN-2808.20150130-1.patch


 When enabling timeline server, yarn client can not list the container info 
 for a application attempt correctly.
 Here is the reproduce step.
 # enabling yarn timeline server
 # submit a MR job
 # after the job is finished. use yarn client to list the container info of 
 the app attempt.
 Then, since the RM has cached the application's attempt info, the output show 
 {noformat}
 [hadoop@localhost hadoop-3.0.0-SNAPSHOT]$ ./bin/yarn container -list 
 appattempt_1415168250217_0001_01
 14/11/05 01:19:15 WARN util.NativeCodeLoader: Unable to load native-hadoop 
 library for your platform... using builtin-java classes where applicable
 14/11/05 01:19:15 INFO impl.TimelineClientImpl: Timeline service address: 
 http://0.0.0.0:8188/ws/v1/timeline/
 14/11/05 01:19:16 INFO client.RMProxy: Connecting to ResourceManager at 
 /0.0.0.0:8032
 14/11/05 01:19:16 INFO client.AHSProxy: Connecting to Application History 
 server at /0.0.0.0:10200
 Total number of containers :0
   Container-Id  Start Time Finish 
 Time   StateHost  
   LOG-URL
 {noformat}
 But if the rm is restarted, client can fetch the container info from timeline 
 server correctly.
 {noformat}
 [hadoop@localhost hadoop-3.0.0-SNAPSHOT]$ ./bin/yarn container -list 
 appattempt_1415168250217_0001_01
 14/11/05 01:21:06 WARN util.NativeCodeLoader: Unable to load native-hadoop 
 library for your platform... using builtin-java classes where applicable
 14/11/05 01:21:06 INFO impl.TimelineClientImpl: Timeline service address: 
 http://0.0.0.0:8188/ws/v1/timeline/
 14/11/05 01:21:06 INFO client.RMProxy: Connecting to ResourceManager at 
 /0.0.0.0:8032
 14/11/05 01:21:06 INFO client.AHSProxy: Connecting to Application History 
 server at /0.0.0.0:10200
 Total number of containers :4
   Container-Id  Start Time Finish 
 Time   StateHost  
   LOG-URL
 container_1415168250217_0001_01_01   1415168318376   
 1415168349896COMPLETElocalhost.localdomain:47024 
 http://0.0.0.0:8188/applicationhistory/logs/localhost.localdomain:47024/container_1415168250217_0001_01_01/container_1415168250217_0001_01_01/hadoop
 container_1415168250217_0001_01_02   1415168326399   
 1415168334858COMPLETElocalhost.localdomain:47024 
 http://0.0.0.0:8188/applicationhistory/logs/localhost.localdomain:47024/container_1415168250217_0001_01_02/container_1415168250217_0001_01_02/hadoop
 container_1415168250217_0001_01_03

[jira] [Updated] (YARN-2543) Resource usage should be published to the timeline server as well

2015-01-30 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-2543:

Attachment: YARN-2543.20150130-1.patch

Hi [~zjshen],
The test case failure below is not related to my changes:
org.apache.hadoop.yarn.server.resourcemanager.recovery.TestFSRMStateStore

Points 1, 2 and 3 are taken care of in the attached patch. 
I will start working on the other issue.

 Resource usage should be published to the timeline server as well
 -

 Key: YARN-2543
 URL: https://issues.apache.org/jira/browse/YARN-2543
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: 2.6.0
Reporter: Zhijie Shen
Assignee: Naganarasimha G R
 Attachments: YARN-2543.20150125-1.patch, YARN-2543.20150130-1.patch


 RM will include the resource usage in the app report, but generic history 
 service doesn't, because RM doesn't publish this data to the timeline server



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3075) NodeLabelsManager implementation to retrieve label to node mapping

2015-01-30 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-3075:
---
Attachment: YARN-3075.003.patch

 NodeLabelsManager implementation to retrieve label to node mapping
 --

 Key: YARN-3075
 URL: https://issues.apache.org/jira/browse/YARN-3075
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.7.0
Reporter: Varun Saxena
Assignee: Varun Saxena
 Attachments: YARN-3075.001.patch, YARN-3075.002.patch, 
 YARN-3075.003.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3107) Update TestYarnConfigurationFields to flag missing properties in yarn-default.xml with an error

2015-01-30 Thread Ray Chiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Chiang updated YARN-3107:
-
Attachment: YARN-3107.001.patch

Getting an example checked in.  The check can't be turned on until yarn-default.xml is clean.

- Set flag to check xml files
- Add SCMStore properties to ignore
- Add ATS v1 properties to ignore


 Update TestYarnConfigurationFields to flag missing properties in 
 yarn-default.xml with an error
 ---

 Key: YARN-3107
 URL: https://issues.apache.org/jira/browse/YARN-3107
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Ray Chiang
Assignee: Ray Chiang
  Labels: supportability
 Attachments: YARN-3107.001.patch


 TestYarnConfigurationFields currently makes sure each property in 
 yarn-default.xml is documented in one of the YARN configuration Java classes. 
  The reverse check can be turned on once each YARN property is:
 A) documented in yarn-default.xml OR
 B) listed as an exception (with comments, e.g. for internal use) in the 
 TestYarnConfigurationFields unit test



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3075) NodeLabelsManager implementation to retrieve label to node mapping

2015-01-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14299173#comment-14299173
 ] 

Hadoop QA commented on YARN-3075:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12695584/YARN-3075.003.patch
  against trunk revision f2c9109.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 1 
release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6468//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6468//artifact/patchprocess/patchReleaseAuditProblems.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6468//console

This message is automatically generated.

 NodeLabelsManager implementation to retrieve label to node mapping
 --

 Key: YARN-3075
 URL: https://issues.apache.org/jira/browse/YARN-3075
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.7.0
Reporter: Varun Saxena
Assignee: Varun Saxena
 Attachments: YARN-3075.001.patch, YARN-3075.002.patch, 
 YARN-3075.003.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3100) Make YARN authorization pluggable

2015-01-30 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298999#comment-14298999
 ] 

Jian He commented on YARN-3100:
---

bq. It was always a mistake that queue ACLs and the like weren't implemented 
with the common ACL implementation,
Would you please specify exactly which piece of the common service ACL 
implementation YARN should have re-used but did not? YARN always re-uses any 
existing library from common.

 Make YARN authorization pluggable
 -

 Key: YARN-3100
 URL: https://issues.apache.org/jira/browse/YARN-3100
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He
Assignee: Jian He
 Attachments: YARN-3100.1.patch, YARN-3100.2.patch


 The goal is to have YARN acl model pluggable so as to integrate other 
 authorization tool such as Apache Ranger, Sentry.
 Currently, we have 
 - admin ACL
 - queue ACL
 - application ACL
 - time line domain ACL
 - service ACL
 The proposal is to create a YarnAuthorizationProvider interface. Current 
 implementation will be the default implementation. Ranger or Sentry plug-in 
 can implement  this interface.
 Benefit:
 -  Unify the code base. With the default implementation, we can get rid of 
 each specific ACL manager such as AdminAclManager, ApplicationACLsManager, 
 QueueAclsManager etc.
 - Enable Ranger, Sentry to do authorization for YARN. 
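
As a rough illustration of the pluggability being proposed, a hypothetical provider interface is sketched below; the method names and signatures are made up for this note and are not the API in the attached patches.

{code}
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

/**
 * Hypothetical shape of a pluggable YARN authorizer. The default implementation
 * would keep today's AccessControlList-based checks, while a Ranger or Sentry
 * plug-in would delegate the decision to its own policy store.
 */
public interface AuthorizationProviderSketch {

  /** Called once at RM start-up with the service configuration. */
  void init(Configuration conf);

  /** Register the ACLs configured for an entity such as a queue or an application. */
  void setPermission(String entity, List<String> acls, UserGroupInformation owner);

  /** Decide whether the user may perform the given action on the entity. */
  boolean checkPermission(UserGroupInformation user, String entity, String action);
}
{code}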



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2808) yarn client tool can not list app_attempt's container info correctly

2015-01-30 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14299101#comment-14299101
 ] 

Zhijie Shen commented on YARN-2808:
---

I think the patch should work, though it is not guaranteed that all the 
containers will be returned for a running attempt, due to a race condition in 
which a container has finished and its info has been pushed to the timeline 
server but not yet persisted. Anyway, it will be a good improvement in terms of 
user experience.

Some minor comments:

1. Is it possible to improve the performance? A large application could have 
hundreds of containers, and it is not efficient to loop through them many 
times. Maybe run through them once and put the ids in a HashSet for the 
membership check (see the sketch after these comments)?
{code}
for (int i = 0; i < containersFromHistoryServer.size(); i++) {
  if (containersFromHistoryServer.get(i).getContainerId()
      .equals(tmp.getContainerId())) {
    containersFromHistoryServer.remove(i);
    // Remove the container from AHS, as the container from the RM will have
    // the latest information
    break;
  }
}
{code}

2. In the test, can we add a case where a running container is known to the RM 
and also appears in the timeline server (because part of its information has 
been written there), and verify that the container info cached in the RM is 
used instead of the partial info from the timeline server?
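
A minimal sketch of the single-pass idea from point 1 is below; it assumes both lists hold ContainerReport objects and that the history-server list is modifiable. The class and variable names are illustrative, not code from the patch.

{code}
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import org.apache.hadoop.yarn.api.records.ContainerId;
import org.apache.hadoop.yarn.api.records.ContainerReport;

public class ContainerMergeSketch {
  /**
   * Build a set of the RM-side container ids once, then drop the AHS entries
   * whose ids appear in it. The RM copy is the fresher one, so the AHS copy is
   * the one removed. This is O(n + m) instead of the nested-loop O(n * m).
   */
  static void removeDuplicates(List<ContainerReport> containersFromRM,
      List<ContainerReport> containersFromHistoryServer) {
    Set<ContainerId> rmIds = new HashSet<ContainerId>();
    for (ContainerReport rmContainer : containersFromRM) {
      rmIds.add(rmContainer.getContainerId());
    }
    for (Iterator<ContainerReport> it = containersFromHistoryServer.iterator();
        it.hasNext();) {
      if (rmIds.contains(it.next().getContainerId())) {
        it.remove();
      }
    }
  }
}
{code}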

 yarn client tool can not list app_attempt's container info correctly
 

 Key: YARN-2808
 URL: https://issues.apache.org/jira/browse/YARN-2808
 Project: Hadoop YARN
  Issue Type: Bug
  Components: client
Affects Versions: 2.6.0
Reporter: Gordon Wang
Assignee: Naganarasimha G R
 Attachments: YARN-2808.20150126-1.patch, YARN-2808.20150130-1.patch


 When enabling the timeline server, the yarn client can not list the container info 
 for an application attempt correctly.
 Here is the reproduce step.
 # enabling yarn timeline server
 # submit a MR job
 # after the job is finished. use yarn client to list the container info of 
 the app attempt.
 Then, since the RM has cached the application's attempt info, the output show 
 {noformat}
 [hadoop@localhost hadoop-3.0.0-SNAPSHOT]$ ./bin/yarn container -list 
 appattempt_1415168250217_0001_01
 14/11/05 01:19:15 WARN util.NativeCodeLoader: Unable to load native-hadoop 
 library for your platform... using builtin-java classes where applicable
 14/11/05 01:19:15 INFO impl.TimelineClientImpl: Timeline service address: 
 http://0.0.0.0:8188/ws/v1/timeline/
 14/11/05 01:19:16 INFO client.RMProxy: Connecting to ResourceManager at 
 /0.0.0.0:8032
 14/11/05 01:19:16 INFO client.AHSProxy: Connecting to Application History 
 server at /0.0.0.0:10200
 Total number of containers :0
   Container-Id  Start Time Finish 
 Time   StateHost  
   LOG-URL
 {noformat}
 But if the rm is restarted, client can fetch the container info from timeline 
 server correctly.
 {noformat}
 [hadoop@localhost hadoop-3.0.0-SNAPSHOT]$ ./bin/yarn container -list 
 appattempt_1415168250217_0001_01
 14/11/05 01:21:06 WARN util.NativeCodeLoader: Unable to load native-hadoop 
 library for your platform... using builtin-java classes where applicable
 14/11/05 01:21:06 INFO impl.TimelineClientImpl: Timeline service address: 
 http://0.0.0.0:8188/ws/v1/timeline/
 14/11/05 01:21:06 INFO client.RMProxy: Connecting to ResourceManager at 
 /0.0.0.0:8032
 14/11/05 01:21:06 INFO client.AHSProxy: Connecting to Application History 
 server at /0.0.0.0:10200
 Total number of containers :4
   Container-Id  Start Time Finish 
 Time   StateHost  
   LOG-URL
 container_1415168250217_0001_01_01   1415168318376   
 1415168349896COMPLETElocalhost.localdomain:47024 
 http://0.0.0.0:8188/applicationhistory/logs/localhost.localdomain:47024/container_1415168250217_0001_01_01/container_1415168250217_0001_01_01/hadoop
 container_1415168250217_0001_01_02   1415168326399   
 1415168334858COMPLETElocalhost.localdomain:47024 
 http://0.0.0.0:8188/applicationhistory/logs/localhost.localdomain:47024/container_1415168250217_0001_01_02/container_1415168250217_0001_01_02/hadoop
 container_1415168250217_0001_01_03   1415168326400   
 1415168335277COMPLETElocalhost.localdomain:47024 
 

[jira] [Commented] (YARN-3120) YarnException on windows + org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to setup local dirnm-local-dir, which was marked as good.

2015-01-30 Thread vaidhyanathan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14299111#comment-14299111
 ] 

vaidhyanathan commented on YARN-3120:
-

I tried to change the folder permission for nmprivate manually with chmod 700, 
as it expects, but the issue doesn't seem to be resolved.

 YarnException on windows + 
 org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to setup local 
 dirnm-local-dir, which was marked as good.
 ---

 Key: YARN-3120
 URL: https://issues.apache.org/jira/browse/YARN-3120
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
 Environment: Windows 8 , Hadoop 2.6.0
Reporter: vaidhyanathan

 Hi,
 I tried to follow the instructions in 
 http://wiki.apache.org/hadoop/Hadoop2OnWindows and have set up 
 hadoop-2.6.0.jar on my Windows system.
 I was able to start everything properly, but when I try to run the wordcount 
 job as given in the above URL, the job fails with the exception below.
 15/01/30 12:56:09 INFO localizer.ResourceLocalizationService: Localizer failed
 org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to setup local 
 di
 r /tmp/hadoop-haremangala/nm-local-dir, which was marked as good.
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.
 ResourceLocalizationService.getInitializedLocalDirs(ResourceLocalizationService.
 java:1372)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.
 ResourceLocalizationService.access$900(ResourceLocalizationService.java:137)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.
 ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java
 :1085)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3075) NodeLabelsManager implementation to retrieve label to node mapping

2015-01-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14299180#comment-14299180
 ] 

Hadoop QA commented on YARN-3075:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12695585/YARN-3075.003.patch
  against trunk revision f2c9109.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 1 
release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestAllocationFileLoaderService

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6469//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6469//artifact/patchprocess/patchReleaseAuditProblems.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6469//console

This message is automatically generated.

 NodeLabelsManager implementation to retrieve label to node mapping
 --

 Key: YARN-3075
 URL: https://issues.apache.org/jira/browse/YARN-3075
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.7.0
Reporter: Varun Saxena
Assignee: Varun Saxena
 Attachments: YARN-3075.001.patch, YARN-3075.002.patch, 
 YARN-3075.003.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3075) NodeLabelsManager implementation to retrieve label to node mapping

2015-01-30 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-3075:
---
Attachment: YARN-3075.003.patch

 NodeLabelsManager implementation to retrieve label to node mapping
 --

 Key: YARN-3075
 URL: https://issues.apache.org/jira/browse/YARN-3075
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.7.0
Reporter: Varun Saxena
Assignee: Varun Saxena
 Attachments: YARN-3075.001.patch, YARN-3075.002.patch, 
 YARN-3075.003.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3075) NodeLabelsManager implementation to retrieve label to node mapping

2015-01-30 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-3075:
---
Attachment: (was: YARN-3075.003.patch)

 NodeLabelsManager implementation to retrieve label to node mapping
 --

 Key: YARN-3075
 URL: https://issues.apache.org/jira/browse/YARN-3075
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.7.0
Reporter: Varun Saxena
Assignee: Varun Saxena
 Attachments: YARN-3075.001.patch, YARN-3075.002.patch, 
 YARN-3075.003.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3099) Capacity Scheduler LeafQueue/ParentQueue should use ResourceUsage to track used-resources-by-label.

2015-01-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14299364#comment-14299364
 ] 

Hudson commented on YARN-3099:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6971 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6971/])
YARN-3099. Capacity Scheduler LeafQueue/ParentQueue should use ResourceUsage to 
track used-resources-by-label. Contributed by Wangda Tan (jianhe: rev 
86358221fc85a7743052a0b4c1647353508bf308)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestWorkPreservingRMRestart.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractCSQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCSQueueUtils.java


 Capacity Scheduler LeafQueue/ParentQueue should use ResourceUsage to track 
 used-resources-by-label.
 ---

 Key: YARN-3099
 URL: https://issues.apache.org/jira/browse/YARN-3099
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api, client, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Fix For: 2.7.0

 Attachments: YARN-3099.1.patch, YARN-3099.2.patch, YARN-3099.3.patch, 
 YARN-3099.4.patch


 After YARN-3092, resource-by-label (include 
 used-resource/pending-resource/reserved-resource/AM-resource, etc.) should be 
 tracked in ResourceUsage.
 To make each individual patch smaller to get easier review, this patch is 
 targeting to make used-resources-by-label in CS Queues are all tracked by 
 ResourceUsage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3101) FairScheduler#fitInMaxShare was added to validate reservations but it does not consider it

2015-01-30 Thread Siqi Li (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14299403#comment-14299403
 ] 

Siqi Li commented on YARN-3101:
---

Feel free to still use <=. This doesn't change the overall behavior.

 FairScheduler#fitInMaxShare was added to validate reservations but it does 
 not consider it 
 ---

 Key: YARN-3101
 URL: https://issues.apache.org/jira/browse/YARN-3101
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3101-Siqi.v1.patch, YARN-3101-Siqi.v2.patch, 
 YARN-3101.001.patch, YARN-3101.002.patch


 YARN-2811 added fitInMaxShare to validate reservations on a queue, but did 
 not count it during its calculations. It also had the condition reversed so 
 the test was still passing because both cancelled each other. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2543) Resource usage should be published to the timeline server as well

2015-01-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14299611#comment-14299611
 ] 

Hadoop QA commented on YARN-2543:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12695701/YARN-2543.20150131-1.patch
  against trunk revision 09ad9a8.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 1 
release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  org.apache.hadoop.yarn.client.TestResourceTrackerOnHA
  
org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6473//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6473//artifact/patchprocess/patchReleaseAuditProblems.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6473//console

This message is automatically generated.

 Resource usage should be published to the timeline server as well
 -

 Key: YARN-2543
 URL: https://issues.apache.org/jira/browse/YARN-2543
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: 2.6.0
Reporter: Zhijie Shen
Assignee: Naganarasimha G R
 Attachments: YARN-2543.20150125-1.patch, YARN-2543.20150130-1.patch, 
 YARN-2543.20150131-1.patch


 RM will include the resource usage in the app report, but generic history 
 service doesn't, because RM doesn't publish this data to the timeline server



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3101) FairScheduler#fitInMaxShare was added to validate reservations but it does not consider it

2015-01-30 Thread Anubhav Dhoot (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14299396#comment-14299396
 ] 

Anubhav Dhoot commented on YARN-3101:
-

The test case was modified to add a negative test showing that reservations 
which should be maintained are no longer kept. So we need the test case changes 
from my patch.

 FairScheduler#fitInMaxShare was added to validate reservations but it does 
 not consider it 
 ---

 Key: YARN-3101
 URL: https://issues.apache.org/jira/browse/YARN-3101
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3101-Siqi.v1.patch, YARN-3101-Siqi.v2.patch, 
 YARN-3101.001.patch, YARN-3101.002.patch


 YARN-2811 added fitInMaxShare to validate reservations on a queue, but did 
 not count it during its calculations. It also had the condition reversed so 
 the test was still passing because both cancelled each other. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3101) FairScheduler#fitInMaxShare was added to validate reservations but it does not consider it

2015-01-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14299423#comment-14299423
 ] 

Hadoop QA commented on YARN-3101:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12695632/YARN-3101-Siqi.v2.patch
  against trunk revision 951b360.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 1 
release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6471//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6471//artifact/patchprocess/patchReleaseAuditProblems.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6471//console

This message is automatically generated.

 FairScheduler#fitInMaxShare was added to validate reservations but it does 
 not consider it 
 ---

 Key: YARN-3101
 URL: https://issues.apache.org/jira/browse/YARN-3101
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3101-Siqi.v1.patch, YARN-3101-Siqi.v2.patch, 
 YARN-3101.001.patch, YARN-3101.002.patch


 YARN-2811 added fitInMaxShare to validate reservations on a queue, but did 
 not count it during its calculations. It also had the condition reversed so 
 the test was still passing because both cancelled each other. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3104) RM generates new AMRM tokens every heartbeat between rolling and activation

2015-01-30 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14299474#comment-14299474
 ] 

Jian He commented on YARN-3104:
---

Hi Jason, just one comment on the patch:
{{amrmToken.decodeIdentifier().getKeyId()}} internally does reflection. Will it 
be costly to invoke this on every AM heartbeat? Maybe we can cache the keyId (a 
sketch follows below)?
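
A minimal sketch of the caching idea, assuming the key id is fixed for a given Token instance; everything except decodeIdentifier()/getKeyId() is a made-up name for illustration, not code from the patch.

{code}
import java.io.IOException;
import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.yarn.security.AMRMTokenIdentifier;

/** Hypothetical helper: decode the AMRM token identifier once per token instance. */
public class AmrmKeyIdCache {
  private Token<AMRMTokenIdentifier> lastToken;
  private int lastKeyId;

  public synchronized int keyIdOf(Token<AMRMTokenIdentifier> amrmToken)
      throws IOException {
    if (amrmToken != lastToken) {
      // decodeIdentifier() deserializes the identifier (reflection under the
      // hood), so only do it when the attempt has been handed a new token.
      lastKeyId = amrmToken.decodeIdentifier().getKeyId();
      lastToken = amrmToken;
    }
    return lastKeyId;
  }
}
{code}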

 RM generates new AMRM tokens every heartbeat between rolling and activation
 ---

 Key: YARN-3104
 URL: https://issues.apache.org/jira/browse/YARN-3104
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: YARN-3104.001.patch


 When the RM rolls a new AMRM secret, it conveys this to the AMs when it 
 notices they are still connected with the old key.  However neither the RM 
 nor the AM explicitly close the connection or otherwise try to reconnect with 
 the new secret.  Therefore the RM keeps thinking the AM doesn't have the new 
 token on every heartbeat and keeps sending new tokens for the period between 
 the key roll and the key activation.  Once activated the RM no longer squawks 
 in its logs about needing to generate a new token every heartbeat (i.e.: 
 second) for every app, but the apps can still be using the old token.  The 
 token is only checked upon connection to the RM.  The apps don't reconnect 
 when sent a new token, and the RM doesn't force them to reconnect by closing 
 the connection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2543) Resource usage should be published to the timeline server as well

2015-01-30 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-2543:

Attachment: YARN-2543.20150131-1.patch

I missed making the modifications for TestSystemMetricsPublisher; I have 
corrected them now.

 Resource usage should be published to the timeline server as well
 -

 Key: YARN-2543
 URL: https://issues.apache.org/jira/browse/YARN-2543
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: 2.6.0
Reporter: Zhijie Shen
Assignee: Naganarasimha G R
 Attachments: YARN-2543.20150125-1.patch, YARN-2543.20150130-1.patch, 
 YARN-2543.20150131-1.patch


 RM will include the resource usage in the app report, but generic history 
 service doesn't, because RM doesn't publish this data to the timeline server



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1

2015-01-30 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14299530#comment-14299530
 ] 

Robert Kanter commented on YARN-2928:
-

To mirror my comment from the doc that Sangjin is referring to, I had said:
{quote}It would be useful to be able to aggregate to queues; what would be a 
good way to fit those into the data model?{quote}
in the "Some issues to address" section.

As discussed, if we only do child -> parent aggregation (primary 
aggregation), then we can't aggregate to queues, because they don't really fit 
in that path.

 Application Timeline Server (ATS) next gen: phase 1
 ---

 Key: YARN-2928
 URL: https://issues.apache.org/jira/browse/YARN-2928
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Vinod Kumar Vavilapalli
Priority: Critical
 Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal 
 v1.pdf


 We have the application timeline server implemented in yarn per YARN-1530 and 
 YARN-321. Although it is a great feature, we have recognized several critical 
 issues and features that need to be addressed.
 This JIRA proposes the design and implementation changes to address those. 
 This is phase 1 of this effort.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3122) Metrics for container's actual CPU usage

2015-01-30 Thread Anubhav Dhoot (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14299570#comment-14299570
 ] 

Anubhav Dhoot commented on YARN-3122:
-

Similar to YARN-2984 this would track CPU usage

 Metrics for container's actual CPU usage
 

 Key: YARN-3122
 URL: https://issues.apache.org/jira/browse/YARN-3122
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot

 It would be nice to capture resource usage per container, for a variety of 
 reasons. This JIRA is to track CPU usage. 
 YARN-2965 tracks the resource usage on the node, and the two implementations 
 should reuse code as much as possible. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3122) Metrics for container's actual CPU usage

2015-01-30 Thread Anubhav Dhoot (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Dhoot updated YARN-3122:

Description: 
It would be nice to capture resource usage per container, for a variety of 
reasons. This JIRA is to track CPU usage. 

YARN-2965 tracks the resource usage on the node, and the two implementations 
should reuse code as much as possible. 



  was:
It would be nice to capture resource usage per container, for a variety of 
reasons. This JIRA is to track memory usage. 

YARN-2965 tracks the resource usage on the node, and the two implementations 
should reuse code as much as possible. 




 Metrics for container's actual CPU usage
 

 Key: YARN-3122
 URL: https://issues.apache.org/jira/browse/YARN-3122
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot

 It would be nice to capture resource usage per container, for a variety of 
 reasons. This JIRA is to track CPU usage. 
 YARN-2965 tracks the resource usage on the node, and the two implementations 
 should reuse code as much as possible. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3122) Metrics for container's actual CPU usage

2015-01-30 Thread Anubhav Dhoot (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Dhoot updated YARN-3122:

Fix Version/s: (was: 2.7.0)

 Metrics for container's actual CPU usage
 

 Key: YARN-3122
 URL: https://issues.apache.org/jira/browse/YARN-3122
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot

 It would be nice to capture resource usage per container, for a variety of 
 reasons. This JIRA is to track memory usage. 
 YARN-2965 tracks the resource usage on the node, and the two implementations 
 should reuse code as much as possible. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3041) create the ATS entity/event API

2015-01-30 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14299517#comment-14299517
 ] 

Sangjin Lee commented on YARN-3041:
---

bq. I suggest using more generalized in/outbound relationship instead of 
parent-child one.

One parent can obviously have multiple children, but we said in the current 
design that we want to limit each entity to a single parent. The consideration 
was that the parent-child relationship is really there to handle aggregation 
along a linear hierarchy, and multiple parents complicate that significantly.
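
To make the constraint concrete, here is a purely illustrative data-structure sketch (not the API under review): arbitrary "related" links are allowed, but the aggregation parent is limited to one.

{code}
import java.util.HashSet;
import java.util.Set;

/** Illustrative only: one aggregation parent, any number of non-aggregating links. */
public class SingleParentEntitySketch {
  private final String id;
  private String parentId;                          // at most one; keeps aggregation linear
  private final Set<String> relatedIds = new HashSet<String>(); // free-form links may form a DAG

  public SingleParentEntitySketch(String id) {
    this.id = id;
  }

  public void setParent(String newParentId) {
    if (parentId != null && !parentId.equals(newParentId)) {
      throw new IllegalStateException("aggregation parent already set for " + id);
    }
    parentId = newParentId;
  }

  public void addRelated(String otherId) {
    relatedIds.add(otherId);
  }
}
{code}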

 create the ATS entity/event API
 ---

 Key: YARN-3041
 URL: https://issues.apache.org/jira/browse/YARN-3041
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Robert Kanter
 Attachments: YARN-3041.preliminary.001.patch


 Per design in YARN-2928, create the ATS entity and events API.
 Also, as part of this JIRA, create YARN system entities (e.g. cluster, user, 
 flow, flow run, YARN app, ...).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3101) FairScheduler#fitInMaxShare was added to validate reservations but it does not consider it

2015-01-30 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14299428#comment-14299428
 ] 

Sandy Ryza commented on YARN-3101:
--

In that case it sounds like the behavior is that we can go one container over 
the max resources.  While this might be worth changing in a separate JIRA, we 
should maintain that behavior with the reservations.
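
For illustration, the semantics being discussed boil down to a strictly-less-than check; the method and parameter names below are hypothetical and this is not the FairScheduler#fitInMaxShare code.

{code}
/** Illustrative only: admit one more container/reservation while strictly under maxShare. */
public class MaxShareCheckSketch {
  /**
   * A new container or reservation is admitted as long as what the queue
   * already holds (allocated plus reserved) is still strictly below maxShare,
   * so the queue can end up at most one container over its maximum. That
   * matches the existing allocation behavior described above.
   */
  static boolean canAdmitOneMore(long allocatedMB, long reservedMB, long maxShareMB) {
    return allocatedMB + reservedMB < maxShareMB;
  }

  public static void main(String[] args) {
    System.out.println(canAdmitOneMore(4096, 2048, 8192)); // true: still under the 8 GB max
    System.out.println(canAdmitOneMore(8192, 0, 8192));    // false: already at the max
  }
}
{code}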

 FairScheduler#fitInMaxShare was added to validate reservations but it does 
 not consider it 
 ---

 Key: YARN-3101
 URL: https://issues.apache.org/jira/browse/YARN-3101
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3101-Siqi.v1.patch, YARN-3101-Siqi.v2.patch, 
 YARN-3101.001.patch, YARN-3101.002.patch


 YARN-2811 added fitInMaxShare to validate reservations on a queue, but did 
 not count it during its calculations. It also had the condition reversed so 
 the test was still passing because both cancelled each other. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-3122) Metrics for container's actual CPU usage

2015-01-30 Thread Anubhav Dhoot (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Dhoot reassigned YARN-3122:
---

Assignee: Anubhav Dhoot  (was: Karthik Kambatla)

 Metrics for container's actual CPU usage
 

 Key: YARN-3122
 URL: https://issues.apache.org/jira/browse/YARN-3122
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot

 It would be nice to capture resource usage per container, for a variety of 
 reasons. This JIRA is to track memory usage. 
 YARN-2965 tracks the resource usage on the node, and the two implementations 
 should reuse code as much as possible. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3122) Metrics for container's actual CPU usage

2015-01-30 Thread Anubhav Dhoot (JIRA)
Anubhav Dhoot created YARN-3122:
---

 Summary: Metrics for container's actual CPU usage
 Key: YARN-3122
 URL: https://issues.apache.org/jira/browse/YARN-3122
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: Anubhav Dhoot
Assignee: Karthik Kambatla
 Fix For: 2.7.0


It would be nice to capture resource usage per container, for a variety of 
reasons. This JIRA is to track memory usage. 

YARN-2965 tracks the resource usage on the node, and the two implementations 
should reuse code as much as possible. 





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3077) RM should create yarn.resourcemanager.zk-state-store.parent-path recursively

2015-01-30 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-3077:
--
Assignee: Chun Chen

 RM should create yarn.resourcemanager.zk-state-store.parent-path recursively
 

 Key: YARN-3077
 URL: https://issues.apache.org/jira/browse/YARN-3077
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Reporter: Chun Chen
Assignee: Chun Chen
 Attachments: YARN-3077.2.patch, YARN-3077.3.patch, YARN-3077.patch


 If multiple clusters share a zookeeper cluster, users might use 
 /rmstore/${yarn.resourcemanager.cluster-id} as the state store path. If a user 
 specifies a custom value which is not a top-level path for 
 ${yarn.resourcemanager.zk-state-store.parent-path}, YARN should create the 
 parent path first.
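
For readers unfamiliar with the ZooKeeper API, a minimal sketch of creating a multi-level parent path one component at a time is below (plain ZooKeeper client, open ACLs, no retry logic); it is not the code in the attached patches.

{code}
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class ZkRecursiveCreateSketch {
  /** Create every missing component of an absolute znode path, one level at a time. */
  static void createRecursively(ZooKeeper zk, String path)
      throws KeeperException, InterruptedException {
    StringBuilder current = new StringBuilder();
    for (String part : path.split("/")) {
      if (part.isEmpty()) {
        continue; // skip the empty element produced by the leading "/"
      }
      current.append('/').append(part);
      if (zk.exists(current.toString(), false) == null) {
        try {
          zk.create(current.toString(), new byte[0],
              ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        } catch (KeeperException.NodeExistsException e) {
          // another RM may have created this level concurrently; that is fine
        }
      }
    }
  }
}
{code}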



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3101) FairScheduler#fitInMaxShare was added to validate reservations but it does not consider it

2015-01-30 Thread Anubhav Dhoot (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Dhoot updated YARN-3101:

Attachment: YARN-3101.003.patch

This reverts the changes to the test case testReservationWhileMultiplePriorities 
that were made by YARN-2811. As there are no limits on the queue, no reservation 
should be removed, so the older behavior of the test still applies.

 FairScheduler#fitInMaxShare was added to validate reservations but it does 
 not consider it 
 ---

 Key: YARN-3101
 URL: https://issues.apache.org/jira/browse/YARN-3101
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3101-Siqi.v1.patch, YARN-3101-Siqi.v2.patch, 
 YARN-3101.001.patch, YARN-3101.002.patch, YARN-3101.003.patch


 YARN-2811 added fitInMaxShare to validate reservations on a queue, but did 
 not count it during its calculations. It also had the condition reversed so 
 the test was still passing because both cancelled each other. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-3121) FairScheduler preemption metrics

2015-01-30 Thread Anubhav Dhoot (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Dhoot reassigned YARN-3121:
---

Assignee: Anubhav Dhoot

 FairScheduler preemption metrics
 

 Key: YARN-3121
 URL: https://issues.apache.org/jira/browse/YARN-3121
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: fairscheduler
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot

 Add FSQueueMetrics for preemption-related information



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3121) FairScheduler preemption metrics

2015-01-30 Thread Anubhav Dhoot (JIRA)
Anubhav Dhoot created YARN-3121:
---

 Summary: FairScheduler preemption metrics
 Key: YARN-3121
 URL: https://issues.apache.org/jira/browse/YARN-3121
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: fairscheduler
Reporter: Anubhav Dhoot


Add FSQueueMetrics for preemption-related information



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3077) RM should create yarn.resourcemanager.zk-state-store.parent-path recursively

2015-01-30 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14299542#comment-14299542
 ] 

Jian He commented on YARN-3077:
---

[~chenchun], I added you to the contributor list. You should be able to assign 
jira to yourself now.

 RM should create yarn.resourcemanager.zk-state-store.parent-path recursively
 

 Key: YARN-3077
 URL: https://issues.apache.org/jira/browse/YARN-3077
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Reporter: Chun Chen
Assignee: Chun Chen
 Fix For: 2.7.0

 Attachments: YARN-3077.2.patch, YARN-3077.3.patch, YARN-3077.patch


 If multiple clusters share a zookeeper cluster, users might use 
 /rmstore/${yarn.resourcemanager.cluster-id} as the state store path. If a user 
 specifies a custom value which is not a top-level path for 
 ${yarn.resourcemanager.zk-state-store.parent-path}, YARN should create the 
 parent path first.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3077) RM should create yarn.resourcemanager.zk-state-store.parent-path recursively

2015-01-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14299547#comment-14299547
 ] 

Hudson commented on YARN-3077:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6974 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6974/])
YARN-3077. Fixed RM to create zk root path recursively. Contributed by Chun 
Chen (jianhe: rev 054a947989d6ccbe54a803ca96dcebeba8328367)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStore.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java


 RM should create yarn.resourcemanager.zk-state-store.parent-path recursively
 

 Key: YARN-3077
 URL: https://issues.apache.org/jira/browse/YARN-3077
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Reporter: Chun Chen
Assignee: Chun Chen
 Fix For: 2.7.0

 Attachments: YARN-3077.2.patch, YARN-3077.3.patch, YARN-3077.patch


 If multiple clusters share a zookeeper cluster, users might use 
 /rmstore/${yarn.resourcemanager.cluster-id} as the state store path. If a user 
 specifies a custom value which is not a top-level path for 
 ${yarn.resourcemanager.zk-state-store.parent-path}, YARN should create the 
 parent path first.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3041) create the ATS entity/event API

2015-01-30 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14299548#comment-14299548
 ] 

Robert Kanter commented on YARN-3041:
-

I'm glad it mostly matches up with the doc.

1. I think that makes sense.  A Metric doesn't need all of the stuff that it's 
inheriting from {{TimelineServiceEntity}}.  I'm already using the old 
{{TimelineEvent}}, which matches up with what you had in the doc (other than 
having {{eventInfo}} instead of {{metadata}}).

2. It sounds like we may need more discussion in this area.  As [~sjlee0] 
pointed out, we had originally settled on a single parent so as to have a linear 
hierarchy for aggregation.  This is different from the "Relates to" and "Is 
related to" links in the doc, which form a DAG.  I wonder if it makes sense to 
have a parent-child relationship only to relate the entities to each other 
(e.g. an Application is a child of a Run, etc.), and some other structure (not 
sure what) for aggregation?  That would help us capture aggregation paths for 
things that don't fit in the parental hierarchy.  Though that makes things more 
complicated :(

3. You're right: they don't really need all the stuff they're inheriting from 
{{TimelineServiceEntity}}.  I think they really only need the relationship 
field(s) and an id.  

I'll do some refactoring for another prelim version.

 create the ATS entity/event API
 ---

 Key: YARN-3041
 URL: https://issues.apache.org/jira/browse/YARN-3041
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Robert Kanter
 Attachments: YARN-3041.preliminary.001.patch


 Per design in YARN-2928, create the ATS entity and events API.
 Also, as part of this JIRA, create YARN system entities (e.g. cluster, user, 
 flow, flow run, YARN app, ...).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3101) FairScheduler#fitInMaxShare was added to validate reservations but it does not consider it

2015-01-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14299592#comment-14299592
 ] 

Hadoop QA commented on YARN-3101:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12695697/YARN-3101.003.patch
  against trunk revision 09ad9a8.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 1 
release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.recovery.TestFSRMStateStore

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6472//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6472//artifact/patchprocess/patchReleaseAuditProblems.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6472//console

This message is automatically generated.

 FairScheduler#fitInMaxShare was added to validate reservations but it does 
 not consider it 
 ---

 Key: YARN-3101
 URL: https://issues.apache.org/jira/browse/YARN-3101
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3101-Siqi.v1.patch, YARN-3101-Siqi.v2.patch, 
 YARN-3101.001.patch, YARN-3101.002.patch, YARN-3101.003.patch


 YARN-2811 added fitInMaxShare to validate reservations on a queue, but did 
 not count it during its calculations. It also had the condition reversed so 
 the test was still passing because both cancelled each other. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1983) Support heterogeneous container types at runtime on YARN

2015-01-30 Thread Chun Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chen updated YARN-1983:

Attachment: YARN-1983.2.patch

Update the patch to rewrite unit tests.

 Support heterogeneous container types at runtime on YARN
 

 Key: YARN-1983
 URL: https://issues.apache.org/jira/browse/YARN-1983
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Junping Du
 Attachments: YARN-1983.2.patch, YARN-1983.patch


 Different container types (default, LXC, docker, VM box, etc.) have different 
 semantics on isolation of security, namespace/env, performance, etc.
 Per discussions in YARN-1964, we have some good thoughts on supporting 
 different types of containers running on YARN and specified by application at 
 runtime which largely enhance YARN's flexibility to meet heterogenous app's 
 requirement on isolation at runtime.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3120) YarnException on windows + org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to setup local dirnm-local-dir, which was marked as good.

2015-01-30 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14299677#comment-14299677
 ] 

Varun Vasudev commented on YARN-3120:
-

Are you running everything on your install drive (C:)? Windows has special 
security permissions on the install drive. Try creating another partition and 
setting the local dirs on that partition.

 YarnException on windows + 
 org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to setup local 
 dirnm-local-dir, which was marked as good.
 ---

 Key: YARN-3120
 URL: https://issues.apache.org/jira/browse/YARN-3120
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
 Environment: Windows 8 , Hadoop 2.6.0
Reporter: vaidhyanathan

 Hi,
 I tried to follow the instructions in 
 http://wiki.apache.org/hadoop/Hadoop2OnWindows and have set up 
 hadoop-2.6.0.jar on my Windows system.
 I was able to start everything properly, but when I try to run the wordcount 
 job as given in the above URL, the job fails with the exception below.
 15/01/30 12:56:09 INFO localizer.ResourceLocalizationService: Localizer failed
 org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to setup local 
 di
 r /tmp/hadoop-haremangala/nm-local-dir, which was marked as good.
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.
 ResourceLocalizationService.getInitializedLocalDirs(ResourceLocalizationService.
 java:1372)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.
 ResourceLocalizationService.access$900(ResourceLocalizationService.java:137)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.
 ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java
 :1085)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1983) Support heterogeneous container types at runtime on YARN

2015-01-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14299688#comment-14299688
 ] 

Hadoop QA commented on YARN-1983:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12695731/YARN-1983.2.patch
  against trunk revision 26c2de3.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 10 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 1 
release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:

  
org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.TestContainersMonitor

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6474//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6474//artifact/patchprocess/patchReleaseAuditProblems.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6474//console

This message is automatically generated.

 Support heterogeneous container types at runtime on YARN
 

 Key: YARN-1983
 URL: https://issues.apache.org/jira/browse/YARN-1983
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Junping Du
 Attachments: YARN-1983.2.patch, YARN-1983.patch


 Different container types (default, LXC, docker, VM box, etc.) have different 
 semantics on isolation of security, namespace/env, performance, etc.
 Per discussions in YARN-1964, we have some good thoughts on supporting 
 different types of containers running on YARN and specified by application at 
 runtime which largely enhance YARN's flexibility to meet heterogenous app's 
 requirement on isolation at runtime.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3090) DeletionService can silently ignore deletion task failures

2015-01-30 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-3090:
---
Attachment: YARN-3090.001.patch

 DeletionService can silently ignore deletion task failures
 --

 Key: YARN-3090
 URL: https://issues.apache.org/jira/browse/YARN-3090
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Jason Lowe
Assignee: Varun Saxena
 Attachments: YARN-3090.001.patch


 If a non-I/O exception occurs while the DeletionService is executing a 
 deletion task then it will be silently ignored.  The exception bubbles up to 
 the thread workers of the ScheduledThreadPoolExecutor which simply attaches 
 the throwable to the Future that was returned when the task was scheduled.  
 However the thread pool is used as a fire-and-forget pool, so nothing ever 
 looks at the Future and therefore the exception is never logged.
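
One way to surface such failures, sketched under the assumption that the deletion tasks are plain Runnables submitted to a ScheduledThreadPoolExecutor (this is not the code in the attached patch): override afterExecute and pull the Throwable out of the completed Future, along the lines of the example in the ThreadPoolExecutor javadoc.

{code}
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledThreadPoolExecutor;

/** Illustrative only: make exceptions from fire-and-forget tasks visible. */
public class LoggingScheduledExecutor extends ScheduledThreadPoolExecutor {
  public LoggingScheduledExecutor(int corePoolSize) {
    super(corePoolSize);
  }

  @Override
  protected void afterExecute(Runnable r, Throwable t) {
    super.afterExecute(r, t);
    // Scheduled tasks are wrapped in a Future, so a thrown exception never
    // reaches this hook's Throwable argument; it only lives inside the Future.
    if (t == null && r instanceof Future<?> && ((Future<?>) r).isDone()) {
      try {
        ((Future<?>) r).get();
      } catch (CancellationException ce) {
        // the task was cancelled; nothing to log
      } catch (InterruptedException ie) {
        Thread.currentThread().interrupt();
      } catch (ExecutionException ee) {
        t = ee.getCause();
      }
    }
    if (t != null) {
      // the NM would use its logger here instead of stderr
      System.err.println("Deletion task failed: " + t);
    }
  }
}
{code}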



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3101) FairScheduler#fitInMaxShare was added to validate reservations but it does not consider it

2015-01-30 Thread Anubhav Dhoot (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14299689#comment-14299689
 ] 

Anubhav Dhoot commented on YARN-3101:
-

The failure does not reproduce after pulling the latest trunk, and the release 
audit warning is in an unrelated file.

 FairScheduler#fitInMaxShare was added to validate reservations but it does 
 not consider it 
 ---

 Key: YARN-3101
 URL: https://issues.apache.org/jira/browse/YARN-3101
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3101-Siqi.v1.patch, YARN-3101-Siqi.v2.patch, 
 YARN-3101.001.patch, YARN-3101.002.patch, YARN-3101.003.patch


 YARN-2811 added fitInMaxShare to validate reservations on a queue, but did 
 not count it during its calculations. It also had the condition reversed so 
 the test was still passing because both cancelled each other. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3090) DeletionService can silently ignore deletion task failures

2015-01-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14299700#comment-14299700
 ] 

Hadoop QA commented on YARN-3090:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12695733/YARN-3090.001.patch
  against trunk revision 26c2de3.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 2.0.3) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 1 
release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6475//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6475//artifact/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6475//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-nodemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6475//console

This message is automatically generated.

 DeletionService can silently ignore deletion task failures
 --

 Key: YARN-3090
 URL: https://issues.apache.org/jira/browse/YARN-3090
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Jason Lowe
Assignee: Varun Saxena
 Attachments: YARN-3090.001.patch


 If a non-I/O exception occurs while the DeletionService is executing a 
 deletion task then it will be silently ignored.  The exception bubbles up to 
 the thread workers of the ScheduledThreadPoolExecutor which simply attaches 
 the throwable to the Future that was returned when the task was scheduled.  
 However the thread pool is used as a fire-and-forget pool, so nothing ever 
 looks at the Future and therefore the exception is never logged.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3048) handle how to set up and start/stop ATS reader instances

2015-01-30 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14299690#comment-14299690
 ] 

Varun Saxena commented on YARN-3048:


[~sjlee0], isn't YARN-3118 similar to this JIRA? Or is the intention of this JIRA 
something else?

 handle how to set up and start/stop ATS reader instances
 

 Key: YARN-3048
 URL: https://issues.apache.org/jira/browse/YARN-3048
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Varun Saxena

 Per design in YARN-2928, come up with a way to set up and start/stop ATS 
 reader instances.
 This should allow setting up multiple instances and managing user traffic to 
 those instances.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3090) DeletionService can silently ignore deletion task failures

2015-01-30 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-3090:
---
Attachment: YARN-3090.002.patch

 DeletionService can silently ignore deletion task failures
 --

 Key: YARN-3090
 URL: https://issues.apache.org/jira/browse/YARN-3090
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Jason Lowe
Assignee: Varun Saxena
 Attachments: YARN-3090.001.patch, YARN-3090.002.patch


 If a non-I/O exception occurs while the DeletionService is executing a 
 deletion task then it will be silently ignored.  The exception bubbles up to 
 the thread workers of the ScheduledThreadPoolExecutor which simply attaches 
 the throwable to the Future that was returned when the task was scheduled.  
 However the thread pool is used as a fire-and-forget pool, so nothing ever 
 looks at the Future and therefore the exception is never logged.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3101) FairScheduler#fitInMaxShare was added to validate reservations but it does not consider it

2015-01-30 Thread Siqi Li (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14299369#comment-14299369
 ] 

Siqi Li commented on YARN-3101:
---

[~adhoot] The reason for using  instead of = is basically to keep the 
test case intact.  

 FairScheduler#fitInMaxShare was added to validate reservations but it does 
 not consider it 
 ---

 Key: YARN-3101
 URL: https://issues.apache.org/jira/browse/YARN-3101
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3101-Siqi.v1.patch, YARN-3101-Siqi.v2.patch, 
 YARN-3101.001.patch, YARN-3101.002.patch


 YARN-2811 added fitInMaxShare to validate reservations on a queue, but it did 
 not count the reservation itself in its calculations. It also had the condition 
 reversed, so the test still passed because the two mistakes cancelled each other out. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3089) LinuxContainerExecutor does not handle file arguments to deleteAsUser

2015-01-30 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298637#comment-14298637
 ] 

Eric Payne commented on YARN-3089:
--

Thank you, [~sunilg], for your review of this patch.

{quote}
{code}
int subDirEmptyStr = (subdir == NULL || subdir[0] == 0);
{code}
I think strlen(subdir) also has to be checked against 0, correct?
{quote}
Checking {{strlen(subdir) == 0}} amounts to the same thing as {{subdir[0] == 0}}: 
{{strlen}} returns 0 exactly when the first byte of the string is the terminator, 
and internally its check takes the form {{*s == '\0'}}. By testing for the empty 
string directly, as the existing patch does, we avoid the overhead of another 
function call.

 LinuxContainerExecutor does not handle file arguments to deleteAsUser
 -

 Key: YARN-3089
 URL: https://issues.apache.org/jira/browse/YARN-3089
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Jason Lowe
Assignee: Eric Payne
Priority: Blocker
 Attachments: YARN-3089.v1.txt


 YARN-2468 added the deletion of individual logs that are aggregated, but this 
 fails to delete log files when the LCE is being used.  The LCE native 
 executable assumes the paths being passed are directories, and the delete fails.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2428) LCE default banned user list should have yarn

2015-01-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298663#comment-14298663
 ] 

Hudson commented on YARN-2428:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2021 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2021/])
YARN-2428. LCE default banned user list should have yarn (Varun Saxena via aw) 
(aw: rev 9dd0b7a2ab6538d8f72b004eb97c2750ff3d98dd)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c
* hadoop-yarn-project/CHANGES.txt


 LCE default banned user list should have yarn
 -

 Key: YARN-2428
 URL: https://issues.apache.org/jira/browse/YARN-2428
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Allen Wittenauer
Assignee: Varun Saxena
Priority: Trivial
  Labels: newbie
 Fix For: 3.0.0

 Attachments: YARN-2428.001.patch


 When task-controller was retrofitted to YARN, the default banned user list 
 didn't add yarn.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3108) ApplicationHistoryServer doesn't process -D arguments

2015-01-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298658#comment-14298658
 ] 

Hudson commented on YARN-3108:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2021 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2021/])
YARN-3108. ApplicationHistoryServer doesn't process -D arguments (Chang Li via 
jeagles) (jeagles: rev 30a8778c632c0f57cdd005080a470065a60756a8)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/TestApplicationHistoryServer.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryServer.java


 ApplicationHistoryServer doesn't process -D arguments
 -

 Key: YARN-3108
 URL: https://issues.apache.org/jira/browse/YARN-3108
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: chang li
Assignee: chang li
 Fix For: 2.7.0

 Attachments: yarn3108.patch, yarn3108.patch, yarn3108.patch


 ApplicationHistoryServer doesn't process -D arguments when created; it would be 
 nice to have it do that.  
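 For reference, a common Hadoop pattern for honouring -D arguments at startup is 
 GenericOptionsParser; a generic, hedged sketch (not the actual 
 ApplicationHistoryServer change):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.GenericOptionsParser;

public class DashDArgsDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // GenericOptionsParser applies -Dkey=value (and the other generic options)
    // onto the Configuration and returns whatever arguments remain.
    String[] remaining = new GenericOptionsParser(conf, args).getRemainingArgs();
    System.out.println("yarn.timeline-service.enabled = "
        + conf.get("yarn.timeline-service.enabled"));
    System.out.println("remaining args: " + String.join(" ", remaining));
  }
}
{code}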



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3029) FSDownload.unpack() uses local locale for FS case conversion, may not work everywhere

2015-01-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298662#comment-14298662
 ] 

Hudson commented on YARN-3029:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2021 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2021/])
YARN-3029. FSDownload.unpack() uses local locale for FS case conversion, may 
not work everywhere. Contributed by Varun Saxena. (ozawa: rev 
7acce7d3648d6f1e45ce280e2147e7dedf5693fc)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestFSDownload.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/FSDownload.java


 FSDownload.unpack() uses local locale for FS case conversion, may not work 
 everywhere
 -

 Key: YARN-3029
 URL: https://issues.apache.org/jira/browse/YARN-3029
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: Steve Loughran
Assignee: Varun Saxena
 Attachments: YARN-3029-003.patch, YARN-3029.001.patch, 
 YARN-3029.002.patch


 {{FSDownload.unpack()}} lower-cases filenames in the local (default) locale before 
 looking at the extensions for tar, zip, etc.
 {code}
 String lowerDst = dst.getName().toLowerCase();
 {code}
 It MUST use an English locale for the conversion, else a file named *.ZIP won't be 
 recognised as a zipfile on a cluster running in a Turkish locale.
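 A minimal, self-contained demonstration of the pitfall (a hedged sketch, not the 
 actual FSDownload code):
{code}
import java.util.Locale;

public class TurkishLocaleDemo {
  public static void main(String[] args) {
    // Simulate a cluster whose JVM default locale is Turkish.
    Locale.setDefault(new Locale("tr", "TR"));

    String name = "DATA.ZIP";
    // Default-locale lower-casing maps 'I' to dotless i (U+0131), so the
    // extension check fails.
    System.out.println(name.toLowerCase().endsWith(".zip"));               // false
    // A locale-insensitive conversion behaves the same everywhere.
    System.out.println(name.toLowerCase(Locale.ENGLISH).endsWith(".zip")); // true
  }
}
{code}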



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3029) FSDownload.unpack() uses local locale for FS case conversion, may not work everywhere

2015-01-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298679#comment-14298679
 ] 

Hudson commented on YARN-3029:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #86 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/86/])
YARN-3029. FSDownload.unpack() uses local locale for FS case conversion, may 
not work everywhere. Contributed by Varun Saxena. (ozawa: rev 
7acce7d3648d6f1e45ce280e2147e7dedf5693fc)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/FSDownload.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestFSDownload.java


 FSDownload.unpack() uses local locale for FS case conversion, may not work 
 everywhere
 -

 Key: YARN-3029
 URL: https://issues.apache.org/jira/browse/YARN-3029
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: Steve Loughran
Assignee: Varun Saxena
 Attachments: YARN-3029-003.patch, YARN-3029.001.patch, 
 YARN-3029.002.patch


 {{FSDownload.unpack()}} lower-cases filenames in the local (default) locale before 
 looking at the extensions for tar, zip, etc.
 {code}
 String lowerDst = dst.getName().toLowerCase();
 {code}
 It MUST use an English locale for the conversion, else a file named *.ZIP won't be 
 recognised as a zipfile on a cluster running in a Turkish locale.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3108) ApplicationHistoryServer doesn't process -D arguments

2015-01-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298675#comment-14298675
 ] 

Hudson commented on YARN-3108:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #86 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/86/])
YARN-3108. ApplicationHistoryServer doesn't process -D arguments (Chang Li via 
jeagles) (jeagles: rev 30a8778c632c0f57cdd005080a470065a60756a8)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryServer.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/TestApplicationHistoryServer.java


 ApplicationHistoryServer doesn't process -D arguments
 -

 Key: YARN-3108
 URL: https://issues.apache.org/jira/browse/YARN-3108
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: chang li
Assignee: chang li
 Fix For: 2.7.0

 Attachments: yarn3108.patch, yarn3108.patch, yarn3108.patch


 ApplicationHistoryServer doesn't process -D arguments when created; it would be 
 nice to have it do that.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2428) LCE default banned user list should have yarn

2015-01-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298680#comment-14298680
 ] 

Hudson commented on YARN-2428:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #86 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/86/])
YARN-2428. LCE default banned user list should have yarn (Varun Saxena via aw) 
(aw: rev 9dd0b7a2ab6538d8f72b004eb97c2750ff3d98dd)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c
* hadoop-yarn-project/CHANGES.txt


 LCE default banned user list should have yarn
 -

 Key: YARN-2428
 URL: https://issues.apache.org/jira/browse/YARN-2428
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Allen Wittenauer
Assignee: Varun Saxena
Priority: Trivial
  Labels: newbie
 Fix For: 3.0.0

 Attachments: YARN-2428.001.patch


 When task-controller was retrofitted to YARN, the default banned user list 
 didn't add yarn.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2854) The document about timeline service and generic service needs to be updated

2015-01-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298647#comment-14298647
 ] 

Hadoop QA commented on YARN-2854:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12695043/timeline_structure.jpg
  against trunk revision f2c9109.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6465//console

This message is automatically generated.

 The document about timeline service and generic service needs to be updated
 ---

 Key: YARN-2854
 URL: https://issues.apache.org/jira/browse/YARN-2854
 Project: Hadoop YARN
  Issue Type: Bug
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Naganarasimha G R
Priority: Critical
 Attachments: TimelineServer.html, YARN-2854.20141120-1.patch, 
 YARN-2854.20150128.1.patch, timeline_structure.jpg






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3104) RM generates new AMRM tokens every heartbeat between rolling and activation

2015-01-30 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-3104:
-
Summary: RM generates new AMRM tokens every heartbeat between rolling and 
activation  (was: RM continues to send new AMRM tokens every heartbeat between 
rolling and activation)

Yes, the connection is not re-established so the updated token in the client's 
UGI is never re-sent to the RPC server.  Therefore every time the 
RM asks the RPC server for the client's UGI we will continue to get the old 
one.  Since the RM thinks the client is still using the token that was used 
when the connection was established, it continues to regenerate tokens (and 
emit corresponding logs) every heartbeat for the interval between when the new 
key was rolled and when it is activated (i.e.: as long as nextMasterKey != 
null).

To tell whether the client really is using the new token we either need the RPC 
connection to be re-established or a way to tell the RPC layer to 
re-authenticate the connection.  I don't believe there's a good way to do 
either of those given the RPC API, so this patch works around the issue a bit 
by comparing the token we have recorded for the app attempt with the next key.  
It solves the problem of regenerating tokens unnecessarily for the same app 
attempt.  However we will continue to send the token each heartbeat since we 
cannot tell whether the client really has the new token.  I tweaked the summary 
accordingly.
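
To make that concrete, here is a hedged sketch of the comparison (the class, field, 
and method names are hypothetical, not the actual ResourceManager code):
{code}
import java.util.HashMap;
import java.util.Map;

// Hedged sketch: only hand out a fresh AMRM token if this attempt's token
// was not already minted with the rolled-but-not-yet-activated key.
class AmrmTokenRollTracker {
  private Integer nextMasterKeyId;                  // null when no roll is in progress
  private final Map<String, Integer> attemptKeyIds = new HashMap<>();

  boolean shouldRegenerate(String appAttemptId) {
    if (nextMasterKeyId == null) {
      return false;                                 // nothing rolled, nothing to do
    }
    return !nextMasterKeyId.equals(attemptKeyIds.get(appAttemptId));
  }

  void recordIssuedToken(String appAttemptId, int masterKeyId) {
    attemptKeyIds.put(appAttemptId, masterKeyId);
  }

  void onKeyRolled(int newKeyId) { nextMasterKeyId = newKeyId; }
  void onKeyActivated()          { nextMasterKeyId = null; }
}
{code}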

 RM generates new AMRM tokens every heartbeat between rolling and activation
 ---

 Key: YARN-3104
 URL: https://issues.apache.org/jira/browse/YARN-3104
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: YARN-3104.001.patch


 When the RM rolls a new AMRM secret, it conveys this to the AMs when it 
 notices they are still connected with the old key.  However neither the RM 
 nor the AM explicitly close the connection or otherwise try to reconnect with 
 the new secret.  Therefore the RM keeps thinking the AM doesn't have the new 
 token on every heartbeat and keeps sending new tokens for the period between 
 the key roll and the key activation.  Once activated the RM no longer squawks 
 in its logs about needing to generate a new token every heartbeat (i.e.: 
 second) for every app, but the apps can still be using the old token.  The 
 token is only checked upon connection to the RM.  The apps don't reconnect 
 when sent a new token, and the RM doesn't force them to reconnect by closing 
 the connection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2428) LCE default banned user list should have yarn

2015-01-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298722#comment-14298722
 ] 

Hudson commented on YARN-2428:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #90 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/90/])
YARN-2428. LCE default banned user list should have yarn (Varun Saxena via aw) 
(aw: rev 9dd0b7a2ab6538d8f72b004eb97c2750ff3d98dd)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c
* hadoop-yarn-project/CHANGES.txt


 LCE default banned user list should have yarn
 -

 Key: YARN-2428
 URL: https://issues.apache.org/jira/browse/YARN-2428
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Allen Wittenauer
Assignee: Varun Saxena
Priority: Trivial
  Labels: newbie
 Fix For: 3.0.0

 Attachments: YARN-2428.001.patch


 When task-controller was retrofitted to YARN, the default banned user list 
 didn't add yarn.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3029) FSDownload.unpack() uses local locale for FS case conversion, may not work everywhere

2015-01-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298721#comment-14298721
 ] 

Hudson commented on YARN-3029:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #90 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/90/])
YARN-3029. FSDownload.unpack() uses local locale for FS case conversion, may 
not work everywhere. Contributed by Varun Saxena. (ozawa: rev 
7acce7d3648d6f1e45ce280e2147e7dedf5693fc)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/FSDownload.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestFSDownload.java


 FSDownload.unpack() uses local locale for FS case conversion, may not work 
 everywhere
 -

 Key: YARN-3029
 URL: https://issues.apache.org/jira/browse/YARN-3029
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: Steve Loughran
Assignee: Varun Saxena
 Attachments: YARN-3029-003.patch, YARN-3029.001.patch, 
 YARN-3029.002.patch


 {{FSDownload.unpack()}} lower-cases filenames in the local (default) locale before 
 looking at the extensions for tar, zip, etc.
 {code}
 String lowerDst = dst.getName().toLowerCase();
 {code}
 It MUST use an English locale for the conversion, else a file named *.ZIP won't be 
 recognised as a zipfile on a cluster running in a Turkish locale.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3108) ApplicationHistoryServer doesn't process -D arguments

2015-01-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298717#comment-14298717
 ] 

Hudson commented on YARN-3108:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #90 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/90/])
YARN-3108. ApplicationHistoryServer doesn't process -D arguments (Chang Li via 
jeagles) (jeagles: rev 30a8778c632c0f57cdd005080a470065a60756a8)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/TestApplicationHistoryServer.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryServer.java
* hadoop-yarn-project/CHANGES.txt


 ApplicationHistoryServer doesn't process -D arguments
 -

 Key: YARN-3108
 URL: https://issues.apache.org/jira/browse/YARN-3108
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: chang li
Assignee: chang li
 Fix For: 2.7.0

 Attachments: yarn3108.patch, yarn3108.patch, yarn3108.patch


 ApplicationHistoryServer doesn't process -D arguments when created; it would be 
 nice to have it do that.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2428) LCE default banned user list should have yarn

2015-01-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298464#comment-14298464
 ] 

Hudson commented on YARN-2428:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #89 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/89/])
YARN-2428. LCE default banned user list should have yarn (Varun Saxena via aw) 
(aw: rev 9dd0b7a2ab6538d8f72b004eb97c2750ff3d98dd)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c


 LCE default banned user list should have yarn
 -

 Key: YARN-2428
 URL: https://issues.apache.org/jira/browse/YARN-2428
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Allen Wittenauer
Assignee: Varun Saxena
Priority: Trivial
  Labels: newbie
 Fix For: 3.0.0

 Attachments: YARN-2428.001.patch


 When task-controller was retrofitted to YARN, the default banned user list 
 didn't add yarn.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3029) FSDownload.unpack() uses local locale for FS case conversion, may not work everywhere

2015-01-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298463#comment-14298463
 ] 

Hudson commented on YARN-3029:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #89 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/89/])
YARN-3029. FSDownload.unpack() uses local locale for FS case conversion, may 
not work everywhere. Contributed by Varun Saxena. (ozawa: rev 
7acce7d3648d6f1e45ce280e2147e7dedf5693fc)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/FSDownload.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestFSDownload.java


 FSDownload.unpack() uses local locale for FS case conversion, may not work 
 everywhere
 -

 Key: YARN-3029
 URL: https://issues.apache.org/jira/browse/YARN-3029
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: Steve Loughran
Assignee: Varun Saxena
 Attachments: YARN-3029-003.patch, YARN-3029.001.patch, 
 YARN-3029.002.patch


 {{FSDownload.unpack()}} lower-cases filenames in the local (default) locale before 
 looking at the extensions for tar, zip, etc.
 {code}
 String lowerDst = dst.getName().toLowerCase();
 {code}
 It MUST use an English locale for the conversion, else a file named *.ZIP won't be 
 recognised as a zipfile on a cluster running in a Turkish locale.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3108) ApplicationHistoryServer doesn't process -D arguments

2015-01-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298459#comment-14298459
 ] 

Hudson commented on YARN-3108:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #89 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/89/])
YARN-3108. ApplicationHistoryServer doesn't process -D arguments (Chang Li via 
jeagles) (jeagles: rev 30a8778c632c0f57cdd005080a470065a60756a8)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryServer.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/TestApplicationHistoryServer.java


 ApplicationHistoryServer doesn't process -D arguments
 -

 Key: YARN-3108
 URL: https://issues.apache.org/jira/browse/YARN-3108
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: chang li
Assignee: chang li
 Fix For: 2.7.0

 Attachments: yarn3108.patch, yarn3108.patch, yarn3108.patch


 ApplicationHistoryServer doesn't process -D arguments when created; it would be 
 nice to have it do that.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3119) Memory limit check need not be enforced unless aggregate usage of all containers is near limit

2015-01-30 Thread Anubhav Dhoot (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Dhoot updated YARN-3119:

Component/s: nodemanager

 Memory limit check need not be enforced unless aggregate usage of all 
 containers is near limit
 --

 Key: YARN-3119
 URL: https://issues.apache.org/jira/browse/YARN-3119
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3119.prelim.patch


 Today we kill any container that exceeds its memory limit, even if the total 
 usage of all containers on that node is well within the limit for YARN. Instead, 
 if we enforce the per-container memory limit only when the total usage of all 
 containers is close to some configurable ratio of the overall memory assigned to 
 containers, we can allow for flexibility in container memory usage without adverse 
 effects. This is similar in principle to how cgroups uses soft_limit_in_bytes.
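 A hedged sketch of the proposed policy (the names and plain-byte accounting are 
 hypothetical, not the NodeManager's actual containers monitor):
{code}
// Hedged sketch of the proposed policy; names are hypothetical.
final class SoftMemoryEnforcement {
  /**
   * Enforce the per-container limit only when the aggregate usage of all
   * containers approaches a configurable ratio of the node memory assigned
   * to containers, similar in spirit to cgroups soft_limit_in_bytes.
   */
  static boolean shouldKill(long containerUsageBytes, long containerLimitBytes,
                            long aggregateUsageBytes, long nodeContainerMemoryBytes,
                            double enforcementRatio) {
    boolean nodeUnderPressure =
        aggregateUsageBytes >= enforcementRatio * nodeContainerMemoryBytes;
    return nodeUnderPressure && containerUsageBytes > containerLimitBytes;
  }
}
{code}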



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-3119) Memory limit check need not be enforced unless aggregate usage of all containers is near limit

2015-01-30 Thread Anubhav Dhoot (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Dhoot reassigned YARN-3119:
---

Assignee: Anubhav Dhoot

 Memory limit check need not be enforced unless aggregate usage of all 
 containers is near limit
 --

 Key: YARN-3119
 URL: https://issues.apache.org/jira/browse/YARN-3119
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3119.prelim.patch


 Today we kill any container that exceeds its memory limit, even if the total 
 usage of all containers on that node is well within the limit for YARN. Instead, 
 if we enforce the per-container memory limit only when the total usage of all 
 containers is close to some configurable ratio of the overall memory assigned to 
 containers, we can allow for flexibility in container memory usage without adverse 
 effects. This is similar in principle to how cgroups uses soft_limit_in_bytes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3089) LinuxContainerExecutor does not handle file arguments to deleteAsUser

2015-01-30 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298371#comment-14298371
 ] 

Sunil G commented on YARN-3089:
---

Hi [~eepayne]

Thank you for bringing this up.. I have a comment on same.

{code}
int subDirEmptyStr = (subdir == NULL || subdir[0] == 0);
{code}
I think strlen(subdir) also has to be checked against 0, correct?





 LinuxContainerExecutor does not handle file arguments to deleteAsUser
 -

 Key: YARN-3089
 URL: https://issues.apache.org/jira/browse/YARN-3089
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Jason Lowe
Assignee: Eric Payne
Priority: Blocker
 Attachments: YARN-3089.v1.txt


 YARN-2468 added the deletion of individual logs that are aggregated, but this 
 fails to delete log files when the LCE is being used.  The LCE native 
 executable assumes the paths being passed are directories, and the delete fails.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3108) ApplicationHistoryServer doesn't process -D arguments

2015-01-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298477#comment-14298477
 ] 

Hudson commented on YARN-3108:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #823 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/823/])
YARN-3108. ApplicationHistoryServer doesn't process -D arguments (Chang Li via 
jeagles) (jeagles: rev 30a8778c632c0f57cdd005080a470065a60756a8)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/TestApplicationHistoryServer.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryServer.java
* hadoop-yarn-project/CHANGES.txt


 ApplicationHistoryServer doesn't process -D arguments
 -

 Key: YARN-3108
 URL: https://issues.apache.org/jira/browse/YARN-3108
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: chang li
Assignee: chang li
 Fix For: 2.7.0

 Attachments: yarn3108.patch, yarn3108.patch, yarn3108.patch


 ApplicationHistoryServer doesn't process -D arguments when created; it would be 
 nice to have it do that.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3029) FSDownload.unpack() uses local locale for FS case conversion, may not work everywhere

2015-01-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298481#comment-14298481
 ] 

Hudson commented on YARN-3029:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #823 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/823/])
YARN-3029. FSDownload.unpack() uses local locale for FS case conversion, may 
not work everywhere. Contributed by Varun Saxena. (ozawa: rev 
7acce7d3648d6f1e45ce280e2147e7dedf5693fc)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestFSDownload.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/FSDownload.java
* hadoop-yarn-project/CHANGES.txt


 FSDownload.unpack() uses local locale for FS case conversion, may not work 
 everywhere
 -

 Key: YARN-3029
 URL: https://issues.apache.org/jira/browse/YARN-3029
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.6.0
Reporter: Steve Loughran
Assignee: Varun Saxena
 Attachments: YARN-3029-003.patch, YARN-3029.001.patch, 
 YARN-3029.002.patch


 {{FSDownload.unpack()}} lower-cases filenames in the local (default) locale before 
 looking at the extensions for tar, zip, etc.
 {code}
 String lowerDst = dst.getName().toLowerCase();
 {code}
 It MUST use an English locale for the conversion, else a file named *.ZIP won't be 
 recognised as a zipfile on a cluster running in a Turkish locale.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3021) YARN's delegation-token handling disallows certain trust setups to operate properly

2015-01-30 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298513#comment-14298513
 ] 

Harsh J commented on YARN-3021:
---

Overall the patch looks fine to me, but please do hold up for [~vinodkv] or 
another YARN active committer to take a look.

Could you add a test case for this as well, to catch regressions in behaviour in 
the future? For example, it could be done by submitting an app with an invalid 
token while this option is turned on. With the option turned off, such a thing 
will always fail and the app gets rejected, but with the fix in place the proper 
behaviour is that it at least passes through the submit procedure. Check out the 
test case modified in the earlier patch for a reusable reference.

Also, could you document the added MR config in mapred-default.xml, describing 
its use and also marking it as advanced, since it disables some features of a 
regular resilient application such as token reuse and renewals.

 YARN's delegation-token handling disallows certain trust setups to operate 
 properly
 ---

 Key: YARN-3021
 URL: https://issues.apache.org/jira/browse/YARN-3021
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Affects Versions: 2.3.0
Reporter: Harsh J
 Attachments: YARN-3021.001.patch, YARN-3021.patch


 Consider this scenario of 3 realms: A, B and COMMON, where A trusts COMMON, 
 and B trusts COMMON (one way trusts both), and both A and B run HDFS + YARN 
 clusters.
 Now if one logs in with a COMMON credential, and runs a job on A's YARN that 
 needs to access B's HDFS (such as a DistCp), the operation fails in the RM, 
 as it attempts a renewDelegationToken(…) synchronously during application 
 submission (to validate the managed token before it adds it to a scheduler 
 for automatic renewal). The call obviously fails because the B realm will not 
 trust A's credentials (here, the RM's principal is the renewer).
 In the 1.x JobTracker the same call is present, but it is done asynchronously 
 and once the renewal attempt failed we simply ceased to schedule any further 
 attempts of renewals, rather than fail the job immediately.
 We should change the logic such that we attempt the renewal but go easy on 
 the failure and skip the scheduling alone, rather than bubble back an error 
 to the client, failing the app submission. This way the old behaviour is 
 retained.
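 A hedged sketch of the proposed behaviour (the Token interface and method names 
 below are placeholders, not the RM's actual delegation-token renewal code):
{code}
import java.io.IOException;
import java.util.logging.Logger;

// Hedged sketch: attempt the renewal at submission time, but treat a failure
// as "skip automatic renewal" rather than rejecting the application.
final class LenientTokenRenewal {
  private static final Logger LOG = Logger.getLogger("LenientTokenRenewal");

  interface Token {
    long renew() throws IOException;
  }

  static void validateAndMaybeSchedule(Token token, Runnable scheduleRenewal) {
    try {
      token.renew();          // validate the token while we can
      scheduleRenewal.run();  // only schedule periodic renewals on success
    } catch (IOException e) {
      LOG.warning("Token renewal failed; skipping automatic renewal: " + e);
    }
  }
}
{code}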



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2428) LCE default banned user list should have yarn

2015-01-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298482#comment-14298482
 ] 

Hudson commented on YARN-2428:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #823 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/823/])
YARN-2428. LCE default banned user list should have yarn (Varun Saxena via aw) 
(aw: rev 9dd0b7a2ab6538d8f72b004eb97c2750ff3d98dd)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c
* hadoop-yarn-project/CHANGES.txt


 LCE default banned user list should have yarn
 -

 Key: YARN-2428
 URL: https://issues.apache.org/jira/browse/YARN-2428
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Allen Wittenauer
Assignee: Varun Saxena
Priority: Trivial
  Labels: newbie
 Fix For: 3.0.0

 Attachments: YARN-2428.001.patch


 When task-controller was retrofitted to YARN, the default banned user list 
 didn't add yarn.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)