[jira] [Updated] (YARN-1027) Implement RMHAServiceProtocol

2013-08-06 Thread nemon lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nemon lou updated YARN-1027:


Assignee: Karthik Kambatla  (was: nemon lou)

 Implement RMHAServiceProtocol
 -

 Key: YARN-1027
 URL: https://issues.apache.org/jira/browse/YARN-1027
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Karthik Kambatla

 Implement existing HAServiceProtocol from Hadoop common. This protocol is the 
 single point of interaction between the RM and HA clients/services.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-1033) Expose RM active/standby state to web UI and metrics

2013-08-06 Thread nemon lou (JIRA)
nemon lou created YARN-1033:
---

 Summary: Expose RM active/standby state to web UI and metrics
 Key: YARN-1033
 URL: https://issues.apache.org/jira/browse/YARN-1033
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.1.0-beta
Reporter: nemon lou


Both the active and standby RMs shall expose their web servers and show their 
current state (active or standby) on the web page.
Cluster metrics also need this state for monitoring.
RM web services shall refuse client requests unless they query for the RM state.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1033) Expose RM active/standby state to web UI and metrics

2013-08-06 Thread nemon lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nemon lou updated YARN-1033:


Assignee: nemon lou

 Expose RM active/standby state to web UI and metrics
 

 Key: YARN-1033
 URL: https://issues.apache.org/jira/browse/YARN-1033
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.1.0-beta
Reporter: nemon lou
Assignee: nemon lou

 Both the active and standby RMs shall expose their web servers and show their 
 current state (active or standby) on the web page.
 Cluster metrics also need this state for monitoring.
 RM web services shall refuse client requests unless they query for the RM state.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1033) Expose RM active/standby state to web UI and metrics

2013-08-06 Thread nemon lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nemon lou updated YARN-1033:


Description: 
Both the active and standby RMs shall expose their web servers and show their 
current state (active or standby) on the web page.
Cluster metrics also need this state for monitoring.
Standby RM web services shall refuse client requests unless they query for the 
RM state.

  was:
Both the active and standby RMs shall expose their web servers and show their 
current state (active or standby) on the web page.
Cluster metrics also need this state for monitoring.
RM web services shall refuse client requests unless they query for the RM state.


 Expose RM active/standby state to web UI and metrics
 

 Key: YARN-1033
 URL: https://issues.apache.org/jira/browse/YARN-1033
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.1.0-beta
Reporter: nemon lou
Assignee: nemon lou

 Both the active and standby RMs shall expose their web servers and show their 
 current state (active or standby) on the web page.
 Cluster metrics also need this state for monitoring.
 Standby RM web services shall refuse client requests unless they query for the 
 RM state.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1024) Define a virtual core unambiguously

2013-08-06 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13730622#comment-13730622
 ] 

Junping Du commented on YARN-1024:
--

I would also prefer #1 for scheduling resources, as #2 is only meaningful for 
charging/billing, as [~philip] mentioned above.
For #2, a simple measure like ECU (released in 2006/2007 but unchanged over 7 
years, which goes against Moore's law :)) has two commonly questioned scenarios:
- Assigning multiple slow p-cores (4 x 1G) to a single-threaded task (1 x 4G) 
that asks for one fast core (mapped to multiple vcores) cannot help performance 
and wastes CPU: unused cores still consume timer interrupts, the idle loop 
consumes resources, and maintaining a consistent memory view among multiple 
vCPUs costs resources as well. All of this is unnecessary. It is also possible 
for the OS CPU scheduler to migrate a single-threaded workload among multiple 
vCPUs, thereby losing cache locality.
- Assigning a single faster p-core (1 x 4G) to a multi-threaded task that asks 
for multiple slow cores (4 x 1G) causes the performance issues Steve mentioned 
above and in YARN-972: too much overhead from process context switches and 
cache misses.
#1 sounds more reasonable, and 1 vcore does not have to be 1 pcore; it could be 
mapped to 1 vCPU under virtualization and overcommitted later (with a 
configured ratio) by the virtualization platform.

 Define a virtual core unambiguously
 ---

 Key: YARN-1024
 URL: https://issues.apache.org/jira/browse/YARN-1024
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Arun C Murthy
Assignee: Arun C Murthy

 We need to clearly define the meaning of a virtual core unambiguously so that 
 it's easy to migrate applications between clusters.
 For e.g. here is Amazon EC2 definition of ECU: 
 http://aws.amazon.com/ec2/faqs/#What_is_an_EC2_Compute_Unit_and_why_did_you_introduce_it
 Essentially we need to clearly define a YARN Virtual Core (YVC).
 Equivalently, we can use ECU itself: *One EC2 Compute Unit provides the 
 equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.*

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1031) JQuery UI components reference external css in branch-23

2013-08-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13730708#comment-13730708
 ] 

Hudson commented on YARN-1031:
--

SUCCESS: Integrated in Hadoop-Hdfs-0.23-Build #691 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/691/])
YARN-1031. JQuery UI components reference external css in branch-23 (jeagles) 
(jeagles: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1510775)
* /hadoop/common/branches/branch-0.23/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/view/JQueryUI.java
* 
/hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/jquery
* 
/hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/jquery/themes-1.8.16
* 
/hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/jquery/themes-1.8.16/base
* 
/hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/jquery/themes-1.8.16/base/jquery-ui.css


 JQuery UI components reference external css in branch-23
 

 Key: YARN-1031
 URL: https://issues.apache.org/jira/browse/YARN-1031
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 0.23.9
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles
 Fix For: 0.23.10

 Attachments: YARN-1031-branch-0.23.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-696) Enable multiple states to be specified in Resource Manager apps REST call

2013-08-06 Thread Trevor Lorimer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Trevor Lorimer updated YARN-696:


Attachment: (was: YARN-696.diff)

 Enable multiple states to be specified in Resource Manager apps REST call
 

 Key: YARN-696
 URL: https://issues.apache.org/jira/browse/YARN-696
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.4-alpha
Reporter: Trevor Lorimer
Assignee: Trevor Lorimer
Priority: Trivial

 Within the YARN Resource Manager REST API, the GET call that returns all 
 applications can be filtered by a single State query parameter 
 (http://<rm http address:port>/ws/v1/cluster/apps).
 There are 8 possible states (New, Submitted, Accepted, Running, Finishing, 
 Finished, Failed, Killed). If no state parameter is specified, all states are 
 returned; however, if a sub-set of states is required, multiple REST calls are 
 needed (up to 7).
 The proposal is to allow multiple states to be specified in a single REST call.
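 To make the proposal concrete, a client call could look roughly like the sketch 
 below. This is only an illustration: the comma-separated {{states}} query 
 parameter name and the RM address are assumptions, not the final API.
{code}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class MultiStateAppsQuery {
  public static void main(String[] args) throws Exception {
    // Placeholder RM web address; adjust to your cluster.
    String rmAddress = "http://localhost:8088";
    // Hypothetical multi-state filter: one comma-separated "states" parameter
    // instead of today's single "state" parameter (one call instead of up to 7).
    URL url = new URL(rmAddress
        + "/ws/v1/cluster/apps?states=RUNNING,FINISHED,FAILED");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestProperty("Accept", "application/json");
    try (BufferedReader in = new BufferedReader(
        new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
      String line;
      while ((line = in.readLine()) != null) {
        System.out.println(line); // JSON list of apps in any requested state
      }
    } finally {
      conn.disconnect();
    }
  }
}
{code}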

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-696) Enable multiple states to be specified in Resource Manager apps REST call

2013-08-06 Thread Trevor Lorimer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Trevor Lorimer updated YARN-696:


Attachment: YARN-696.diff

 Enable multiple states to be specified in Resource Manager apps REST call
 

 Key: YARN-696
 URL: https://issues.apache.org/jira/browse/YARN-696
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.4-alpha
Reporter: Trevor Lorimer
Assignee: Trevor Lorimer
Priority: Trivial
 Attachments: YARN-696.diff


 Within the YARN Resource Manager REST API, the GET call that returns all 
 applications can be filtered by a single State query parameter 
 (http://<rm http address:port>/ws/v1/cluster/apps).
 There are 8 possible states (New, Submitted, Accepted, Running, Finishing, 
 Finished, Failed, Killed). If no state parameter is specified, all states are 
 returned; however, if a sub-set of states is required, multiple REST calls are 
 needed (up to 7).
 The proposal is to allow multiple states to be specified in a single REST call.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-696) Enable multiple states to be specified in Resource Manager apps REST call

2013-08-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13730822#comment-13730822
 ] 

Hadoop QA commented on YARN-696:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12596352/YARN-696.diff
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1659//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1659//console

This message is automatically generated.

 Enable multiple states to be specified in Resource Manager apps REST call
 

 Key: YARN-696
 URL: https://issues.apache.org/jira/browse/YARN-696
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.4-alpha
Reporter: Trevor Lorimer
Assignee: Trevor Lorimer
Priority: Trivial
 Attachments: YARN-696.diff


 Within the YARN Resource Manager REST API, the GET call that returns all 
 applications can be filtered by a single State query parameter 
 (http://<rm http address:port>/ws/v1/cluster/apps).
 There are 8 possible states (New, Submitted, Accepted, Running, Finishing, 
 Finished, Failed, Killed). If no state parameter is specified, all states are 
 returned; however, if a sub-set of states is required, multiple REST calls are 
 needed (up to 7).
 The proposal is to allow multiple states to be specified in a single REST call.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-90) NodeManager should identify failed disks becoming good back again

2013-08-06 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-90?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13730850#comment-13730850
 ] 

Ravi Prakash commented on YARN-90:
--

Do we know what we need to do for this JIRA? I can see that in 
DirectoryCollection we need to be able to remove entries from failedDirs, and 
be able to recognize that in the LocalDirsHandler service. Would anything else 
need to be done?
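
To illustrate the kind of re-check being discussed, here is a rough sketch. The 
class and method names are illustrative assumptions only and do not match the 
real DirectoryCollection/LocalDirsHandlerService API.
{code}
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; names do not match the real DirectoryCollection API.
public class FailedDirRechecker {
  private final List<String> goodDirs = new ArrayList<String>();
  private final List<String> failedDirs = new ArrayList<String>();

  // Move directories that have become healthy again from failedDirs to goodDirs.
  public synchronized void recheckFailedDirs() {
    List<String> recovered = new ArrayList<String>();
    for (String dir : failedDirs) {
      File f = new File(dir);
      // A very simple health test: the directory exists and is writable again.
      if (f.isDirectory() && f.canWrite()) {
        recovered.add(dir);
      }
    }
    failedDirs.removeAll(recovered);
    goodDirs.addAll(recovered);
  }
}
{code}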

 NodeManager should identify failed disks becoming good back again
 -

 Key: YARN-90
 URL: https://issues.apache.org/jira/browse/YARN-90
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Reporter: Ravi Gummadi

 MAPREDUCE-3121 makes the NodeManager identify disk failures. But once a disk 
 goes down, it is marked as failed forever. To reuse that disk (after it becomes 
 good again), the NodeManager needs a restart. This JIRA is to improve the 
 NodeManager to reuse good disks (which may have been bad some time back).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1032) NPE in RackResolve

2013-08-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13730895#comment-13730895
 ] 

Hadoop QA commented on YARN-1032:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12596360/YARN-1032.2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1660//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1660//console

This message is automatically generated.

 NPE in RackResolve
 --

 Key: YARN-1032
 URL: https://issues.apache.org/jira/browse/YARN-1032
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.0.5-alpha
 Environment: linux
Reporter: Lohit Vijayarenu
Priority: Minor
 Attachments: YARN-1032.1.patch, YARN-1032.2.patch


 We found a case where our rack resolve script was not returning a rack due to 
 a problem resolving the host address. This exception was seen in 
 RackResolver.java as an NPE, ultimately caught in RMContainerAllocator.
 {noformat}
 2013-08-01 07:11:37,708 ERROR [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN 
 CONTACTING RM. 
 java.lang.NullPointerException
   at 
 org.apache.hadoop.yarn.util.RackResolver.coreResolve(RackResolver.java:99)
   at 
 org.apache.hadoop.yarn.util.RackResolver.resolve(RackResolver.java:92)
   at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$ScheduledRequests.assignMapsWithLocality(RMContainerAllocator.java:1039)
   at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$ScheduledRequests.assignContainers(RMContainerAllocator.java:925)
   at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$ScheduledRequests.assign(RMContainerAllocator.java:861)
   at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$ScheduledRequests.access$400(RMContainerAllocator.java:681)
   at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:219)
   at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:243)
   at java.lang.Thread.run(Thread.java:722)
 {noformat}
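 For context, the defensive fix amounts to a null check along these lines, 
 falling back to the default rack; this is only a sketch of the idea, not the 
 attached patch.
{code}
import java.util.List;

import org.apache.hadoop.net.NetworkTopology;

// Sketch of a null-safe rack lookup; not the actual YARN-1032 patch.
public final class SafeRackResolve {
  public static String resolveOrDefault(List<String> resolved) {
    // The topology script may return null or an empty list when host
    // resolution fails; fall back to the default rack instead of NPE-ing.
    if (resolved == null || resolved.isEmpty() || resolved.get(0) == null) {
      return NetworkTopology.DEFAULT_RACK;
    }
    return resolved.get(0);
  }
}
{code}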

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1024) Define a virtual core unambiguously

2013-08-06 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13730961#comment-13730961
 ] 

Eli Collins commented on YARN-1024:
---

bq. vcores are optional anyway (only used in DRF) 

Sandy corrected me offline that while this is true for the CS, it is not true 
for the FS, which by default (without DRF) will not schedule more containers' 
worth of vcores than the configured vcores (which seems like it could lead to 
under-utilization, given that the default resource calculator only uses memory 
and not every container needs a whole core). By default the number of vcores is 
the number of cores on the machine, and MR asks for containers with 1 vcore, so 
we effectively have vcore=pcore today as the default (reinforced by the 
decision to remove the notion of pcore in YARN-782).

 Define a virtual core unambiguously
 ---

 Key: YARN-1024
 URL: https://issues.apache.org/jira/browse/YARN-1024
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Arun C Murthy
Assignee: Arun C Murthy

 We need to clearly define the meaning of a virtual core unambiguously so that 
 it's easy to migrate applications between clusters.
 For e.g. here is Amazon EC2 definition of ECU: 
 http://aws.amazon.com/ec2/faqs/#What_is_an_EC2_Compute_Unit_and_why_did_you_introduce_it
 Essentially we need to clearly define a YARN Virtual Core (YVC).
 Equivalently, we can use ECU itself: *One EC2 Compute Unit provides the 
 equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.*

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-1034) Remove experimental in the Fair Scheduler documentation

2013-08-06 Thread Sandy Ryza (JIRA)
Sandy Ryza created YARN-1034:


 Summary: Remove experimental in the Fair Scheduler documentation
 Key: YARN-1034
 URL: https://issues.apache.org/jira/browse/YARN-1034
 Project: Hadoop YARN
  Issue Type: Task
  Components: documentation, scheduler
Affects Versions: 2.1.0-beta
 Environment: The YARN Fair Scheduler is largely stable now, and should 
no longer be declared experimental.
Reporter: Sandy Ryza
Assignee: Karthik Kambatla




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-291) Dynamic resource configuration

2013-08-06 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731037#comment-13731037
 ] 

Alejandro Abdelnur commented on YARN-291:
-

Are we talking about an admin call to the RM that would set a resource 
correction on a per-node basis, with the RM adjusting the NM-reported resource 
capacity based on that correction? This would not require changes in the NMs. 
Potentially the correction could be applied on the node update event before it 
reaches the scheduler implementation, thus staying transparent to the scheduler 
implementation. And if we want to persist these corrections, that could be done 
by the RM itself.

If I have got things right, I'm OK with the approach.
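
If that reading is right, the correction would be applied to the NM-reported 
capacity roughly as in the sketch below; the helper class is purely 
illustrative and is not part of the attached patches.
{code}
import org.apache.hadoop.yarn.api.records.Resource;

// Illustrative only: apply an admin-supplied correction to the capacity an NM
// reported, before the node update reaches the scheduler implementation.
public final class NodeResourceCorrection {
  public static Resource apply(Resource reported, int memoryDeltaMB,
      int vcoreDelta) {
    int memory = Math.max(0, reported.getMemory() + memoryDeltaMB);
    int vcores = Math.max(0, reported.getVirtualCores() + vcoreDelta);
    return Resource.newInstance(memory, vcores);
  }
}
{code}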

 Dynamic resource configuration
 --

 Key: YARN-291
 URL: https://issues.apache.org/jira/browse/YARN-291
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: nodemanager, scheduler
Reporter: Junping Du
Assignee: Junping Du
  Labels: features
 Attachments: Elastic Resources for YARN-v0.2.pdf, 
 YARN-291-AddClientRMProtocolToSetNodeResource-03.patch, 
 YARN-291-all-v1.patch, YARN-291-core-HeartBeatAndScheduler-01.patch, 
 YARN-291-JMXInterfaceOnNM-02.patch, 
 YARN-291-OnlyUpdateWhenResourceChange-01-fix.patch, 
 YARN-291-YARNClientCommandline-04.patch


 The current Hadoop YARN resource management logic assumes per-node resources 
 are static during the lifetime of the NM process. Allowing run-time 
 configuration of per-node resources will give us finer-grained resource 
 elasticity. This allows Hadoop workloads to coexist efficiently with other 
 workloads on the same hardware, whether or not the environment is virtualized. 
 More background and design details can be found in the attached proposal.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (YARN-160) nodemanagers should obtain cpu/memory values from underlying OS

2013-08-06 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur reassigned YARN-160:
---

Assignee: (was: Alejandro Abdelnur)

I have my hands full at the moment, so I won't be able to take this one on for a 
while.

Making it unassigned in case somebody wants to take a stab at it.

 nodemanagers should obtain cpu/memory values from underlying OS
 ---

 Key: YARN-160
 URL: https://issues.apache.org/jira/browse/YARN-160
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 2.0.3-alpha
Reporter: Alejandro Abdelnur
 Fix For: 2.1.0-beta


 As mentioned in YARN-2:
 *NM memory and CPU configs*
 Currently these values come from the NM's configuration; we should be able to 
 obtain them from the OS (i.e., in the case of Linux, from /proc/meminfo and 
 /proc/cpuinfo). As this is highly OS dependent, we should have an interface 
 that obtains this information. In addition, implementations of this interface 
 should be able to specify a mem/cpu offset (the amount of mem/cpu not to be 
 made available as YARN resources); this would allow reserving mem/cpu for the 
 OS and other services outside of YARN containers.
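 As an illustration of the idea (not the interface proposed here), a minimal 
 sketch of reading MemTotal from /proc/meminfo on Linux and subtracting a 
 reserved offset could look like this:
{code}
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class ProcMemInfo {
  // Returns MemTotal from /proc/meminfo in MB minus a reserved offset,
  // or -1 if the value cannot be determined.
  public static long availableForYarnMB(long reservedForOsMB)
      throws IOException {
    try (BufferedReader r = new BufferedReader(
        new FileReader("/proc/meminfo"))) {
      String line;
      while ((line = r.readLine()) != null) {
        if (line.startsWith("MemTotal:")) {
          // Format: "MemTotal:       16384000 kB"
          String[] parts = line.trim().split("\\s+");
          long totalKb = Long.parseLong(parts[1]);
          return Math.max(0, totalKb / 1024 - reservedForOsMB);
        }
      }
    }
    return -1;
  }
}
{code}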

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1019) YarnConfiguration validation for local disk path and http addresses.

2013-08-06 Thread Joseph Kniest (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731053#comment-13731053
 ] 

Joseph Kniest commented on YARN-1019:
-

Hi, I'm new to YARN; where do I look in the code base for this?

 YarnConfiguration validation for local disk path and http addresses.
 

 Key: YARN-1019
 URL: https://issues.apache.org/jira/browse/YARN-1019
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Omkar Vinit Joshi
Priority: Minor
  Labels: newbie

 Today we are not validating certain configuration parameters set in 
 yarn-site.xml.
 1) Configurations related to paths, such as local-dirs and log-dirs: our NM 
 crashes during startup if they are set to relative paths rather than absolute 
 paths. To avoid such failures we can enforce checks (absolute paths) before 
 startup, i.e. before the directory handler creates the directories.
 2) The same applies to all parameters using hostname:port, unless we are OK 
 with the default port.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1019) YarnConfiguration validation for local disk path and http addresses.

2013-08-06 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731060#comment-13731060
 ] 

Omkar Vinit Joshi commented on YARN-1019:
-

Hi, welcome to the YARN group. You can probably get started here: [Checkout 
Code|http://wiki.apache.org/hadoop/HowToContribute]. Subscribe to the user/dev 
mailing lists and ask general questions there (such as how to check out the 
code or issues running it); here we usually discuss problems related to the 
current issue. To get started, run YARN and a simple MapReduce program. Once 
you are familiar with that, you can take up one of the tickets marked as newbie 
and start working on it.

 YarnConfiguration validation for local disk path and http addresses.
 

 Key: YARN-1019
 URL: https://issues.apache.org/jira/browse/YARN-1019
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Omkar Vinit Joshi
Priority: Minor
  Labels: newbie

 Today we are not validating certain configuration parameters set in 
 yarn-site.xml.
 1) Configurations related to paths, such as local-dirs and log-dirs: our NM 
 crashes during startup if they are set to relative paths rather than absolute 
 paths. To avoid such failures we can enforce checks (absolute paths) before 
 startup, i.e. before the directory handler creates the directories.
 2) The same applies to all parameters using hostname:port, unless we are OK 
 with the default port.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1035) NPE when trying to create an error message response of RPC

2013-08-06 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731063#comment-13731063
 ] 

Steve Loughran commented on YARN-1035:
--

{code}
8]] INFO  DataNode.clienttrace (BlockSender.java:sendBlock(695)) - src: 
/127.0.0.1:58247, dest: /127.0.0.1:58308, bytes: 5439, op: HDFS_READ, cliID: 
DFSClient_NONMAPREDUCE_-539248485_697, offset: 0, srvID: 
DS-502087106-10.11.3.237-58247-1375813762260, blockid: 
BP-384257351-10.11.3.237-1375813760919:blk_1073741832_1008, duration: 293000
2013-08-06 11:29:30,802 [IPC Server handler 1 on 58224] INFO  
localizer.LocalizedResource (LocalizedResource.java:handle(196)) - Resource 
hdfs://localhost:58246/user/stevel/.hoya/cluster/TestLiveRegionService/generated/hbase-env.sh
 transitioned from DOWNLOADING to LOCALIZED
2013-08-06 11:29:30,802 [AsyncDispatcher event handler] INFO  
container.Container (ContainerImpl.java:handle(860)) - Container 
container_1375813755119_0001_01_02 transitioned from LOCALIZING to LOCALIZED
2013-08-06 11:29:30,921 [AsyncDispatcher event handler] INFO  
container.Container (ContainerImpl.java:handle(860)) - Container 
container_1375813755119_0001_01_02 transitioned from LOCALIZED to RUNNING
2013-08-06 11:29:31,140 [ContainersLauncher #0] INFO  
nodemanager.DefaultContainerExecutor 
(DefaultContainerExecutor.java:launchContainer(189)) - launchContainer: [nice, 
-n, 0, bash, -c, 
/Users/stevel/Projects/Hortonworks/Projects/hoya/target/TestLiveRegionService/TestLiveRegionService-localDir-nm-0_0/usercache/stevel/appcache/application_1375813755119_0001/container_1375813755119_0001_01_02/default_container_executor.sh]
2013-08-06 11:29:31,169 [ProcessThread(sid:0 cport:-1):] INFO  
server.PrepRequestProcessor (PrepRequestProcessor.java:pRequest(627)) - Got 
user-level KeeperException when processing sessionid:0x14054e3f67f0001 
type:delete cxid:0x13 zxid:0xc txntype:-1 reqpath:n/a Error 
Path:/yarnapps_hoya_stevel_TestLiveRegionService/backup-masters/10.11.3.237,58296,1375813768541
 Error:KeeperErrorCode = NoNode for 
/yarnapps_hoya_stevel_TestLiveRegionService/backup-masters/10.11.3.237,58296,1375813768541
2013-08-06 11:29:31,713 [Socket Reader #1 for port 58246] INFO  ipc.Server 
(Server.java:doRead(800)) - IPC Server listener on 58246: readAndProcess from 
client 127.0.0.1 threw exception [java.lang.NullPointerException]
java.lang.NullPointerException
at 
org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto$Builder.setErrorMsg(RpcHeaderProtos.java:1843)
at org.apache.hadoop.ipc.Server.setupResponse(Server.java:2330)
at org.apache.hadoop.ipc.Server.access$2900(Server.java:121)
at org.apache.hadoop.ipc.Server$Connection.doSaslReply(Server.java:1430)
at 
org.apache.hadoop.ipc.Server$Connection.initializeAuthContext(Server.java:1548)
at 
org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1507)
at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:791)
at 
org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:590)
at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:565)
2013-08-06 11:29:31,729 [Socket Reader #1 for port 58246] INFO  ipc.Server 
(Server.java:doRead(800)) - IPC Server listener on 58246: readAndProcess from 
client 127.0.0.1 threw exception [java.lang.NullPointerException]
java.lang.NullPointerException
at 
org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto$Builder.setErrorMsg(RpcHeaderProtos.java:1843)
at org.apache.hadoop.ipc.Server.setupResponse(Server.java:2330)
at org.apache.hadoop.ipc.Server.access$2900(Server.java:121)
at org.apache.hadoop.ipc.Server$Connection.doSaslReply(Server.java:1430)
at 
org.apache.hadoop.ipc.Server$Connection.initializeAuthContext(Server.java:1548)
at 
org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1507)
at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:791)
at 
org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:590)
at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:565)
2013-08-06 11:29:32,070 [ProcessThread(sid:0 cport:-1):] INFO  
server.PrepRequestProcessor (PrepRequestProcessor.java:pRequest2Txn(476)) - 
Processed session termination for sessionid: 0x14054e3f67f0001
{code}

 NPE when trying to create an error message response of RPC
 --

 Key: YARN-1035
 URL: https://issues.apache.org/jira/browse/YARN-1035
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.1.1-beta
Reporter: Steve Loughran

 I'm seeing an NPE which is raised when the server is trying to create an 
 error response to send back to the caller and there is no error text.
 The root cause is probably somewhere in SASL, but sending something back to 
 the caller would seem preferable to NPE-ing server-side.

[jira] [Created] (YARN-1035) NPE when trying to create an error message response of RPC

2013-08-06 Thread Steve Loughran (JIRA)
Steve Loughran created YARN-1035:


 Summary: NPE when trying to create an error message response of RPC
 Key: YARN-1035
 URL: https://issues.apache.org/jira/browse/YARN-1035
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.1.1-beta
Reporter: Steve Loughran


I'm seeing an NPE which is raised when the server is trying to create an error 
response to send back to the caller and there is no error text.

The root cause is probably somewhere in SASL, but sending something back to the 
caller would seem preferable to NPE-ing server-side.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1035) NPE when trying to create an error message response of RPC

2013-08-06 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731064#comment-13731064
 ] 

Steve Loughran commented on YARN-1035:
--

Looking up the stack, it's in
{code}

private void doSaslReply(Exception ioe) throws IOException {
  setupResponse(authFailedResponse, authFailedCall,
  RpcStatusProto.FATAL, RpcErrorCodeProto.FATAL_UNAUTHORIZED,
  null, ioe.getClass().getName(), ioe.getLocalizedMessage());
  responder.doRespond(authFailedCall);
}

{code}

This code assumes that {{ioe.getLocalizedMessage()}} always returns a 
non-null string, but some exceptions do return null. For a robust response, 
{{ioe.toString()}} should be used.
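
A minimal sketch of that null-safe fallback (not necessarily the fix that ends 
up being committed):
{code}
// Sketch: prefer the localized message, but never hand a null error string to
// the protobuf response builder, which rejects null values.
public final class ErrorText {
  public static String of(Throwable t) {
    String msg = t.getLocalizedMessage();
    return (msg != null) ? msg : t.toString(); // toString() is never null
  }
}
{code}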

 NPE when trying to create an error message response of RPC
 --

 Key: YARN-1035
 URL: https://issues.apache.org/jira/browse/YARN-1035
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.1.1-beta
Reporter: Steve Loughran

 I'm seeing an NPE which is raised when the server is trying to create an 
 error response to send back to the caller and there is no error text.
 The root cause is probably somewhere in SASL, but sending something back to 
 the caller would seem preferable to NPE-ing server-side.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1019) YarnConfiguration validation for local disk path and http addresses.

2013-08-06 Thread Joseph Kniest (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731106#comment-13731106
 ] 

Joseph Kniest commented on YARN-1019:
-

Thanks, I've done all that: built the latest from source and kicked off a 
sample MapReduce job. Now I'm looking for where this is handled in the code.

 YarnConfiguration validation for local disk path and http addresses.
 

 Key: YARN-1019
 URL: https://issues.apache.org/jira/browse/YARN-1019
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Omkar Vinit Joshi
Priority: Minor
  Labels: newbie

 Today we are not validating certain configuration parameters set in 
 yarn-site.xml.
 1) Configurations related to paths, such as local-dirs and log-dirs: our NM 
 crashes during startup if they are set to relative paths rather than absolute 
 paths. To avoid such failures we can enforce checks (absolute paths) before 
 startup, i.e. before the directory handler creates the directories.
 2) The same applies to all parameters using hostname:port, unless we are OK 
 with the default port.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-985) Nodemanager should log where a resource was localized

2013-08-06 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated YARN-985:
--

Attachment: YARN-985.branch-0.23.patch

For branch-0.23


 Nodemanager should log where a resource was localized
 -

 Key: YARN-985
 URL: https://issues.apache.org/jira/browse/YARN-985
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 3.0.0, 2.0.4-alpha, 0.23.9
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Attachments: YARN-985.branch-0.23.patch, YARN-985.patch, 
 YARN-985.patch


 When a resource is localized, we should log WHERE on the local disk it was 
 localized. This helps in debugging afterwards (e.g. if the disk was to go 
 bad).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-985) Nodemanager should log where a resource was localized

2013-08-06 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated YARN-985:
--

Attachment: YARN-985.patch

This is for trunk. I've incorporated Omkar's suggestion now.
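
For readers following along, the change boils down to logging the local path at 
localization time, roughly as in the sketch below; the class here is purely 
illustrative, while the real change lives in 
LocalResourcesTrackerImpl/LocalizedResource (see the attached patches).
{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

// Illustrative only; the actual patch modifies the NM localizer classes.
public final class LocalizationLogging {
  private static final Log LOG = LogFactory.getLog(LocalizationLogging.class);

  public static void logLocalized(String remoteResource, String localPath) {
    // Record WHERE on the local disks the resource ended up, to help debug
    // later (e.g. when a particular disk goes bad).
    LOG.info("Resource " + remoteResource + " was localized at " + localPath);
  }
}
{code}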

 Nodemanager should log where a resource was localized
 -

 Key: YARN-985
 URL: https://issues.apache.org/jira/browse/YARN-985
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 3.0.0, 2.0.4-alpha, 0.23.9
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Attachments: YARN-985.branch-0.23.patch, YARN-985.patch, 
 YARN-985.patch


 When a resource is localized, we should log WHERE on the local disk it was 
 localized. This helps in debugging afterwards (e.g. if the disk was to go 
 bad).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1019) YarnConfiguration validation for local disk path and http addresses.

2013-08-06 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731120#comment-13731120
 ] 

Omkar Vinit Joshi commented on YARN-1019:
-

Start with YarnConfiguration.java, track all the places where it is used, and 
fix all path-related and host:port checks. Once done, upload a patch; someone 
will take a look at it. Make sure your patch file name follows something like 
the jira-number-date-in-yyyy-mm-dd.number.patch format; it will help reviewers. 
Also make sure your code is formatted well and your changes are as minimal as 
possible. You are set then. Start contributing!!
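
As a rough sketch of the kind of pre-startup check being suggested, assuming 
the standard yarn.nodemanager.local-dirs / log-dirs keys (the validator class 
itself is hypothetical):
{code}
import java.nio.file.Paths;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

// Hypothetical pre-startup validation: fail fast on relative local/log dirs.
public final class DirConfigValidator {
  public static void validate(Configuration conf) {
    String[] keys = {YarnConfiguration.NM_LOCAL_DIRS,
                     YarnConfiguration.NM_LOG_DIRS};
    for (String key : keys) {
      for (String dir : conf.getTrimmedStrings(key)) {
        if (!Paths.get(dir).isAbsolute()) {
          throw new IllegalArgumentException(
              key + " contains a relative path: " + dir);
        }
      }
    }
  }
}
{code}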

 YarnConfiguration validation for local disk path and http addresses.
 

 Key: YARN-1019
 URL: https://issues.apache.org/jira/browse/YARN-1019
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Omkar Vinit Joshi
Priority: Minor
  Labels: newbie

 Today we are not validating certain configuration parameters set in 
 yarn-site.xml.
 1) Configurations related to paths, such as local-dirs and log-dirs: our NM 
 crashes during startup if they are set to relative paths rather than absolute 
 paths. To avoid such failures we can enforce checks (absolute paths) before 
 startup, i.e. before the directory handler creates the directories.
 2) The same applies to all parameters using hostname:port, unless we are OK 
 with the default port.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-985) Nodemanager should log where a resource was localized

2013-08-06 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731125#comment-13731125
 ] 

Omkar Vinit Joshi commented on YARN-985:


+1

 Nodemanager should log where a resource was localized
 -

 Key: YARN-985
 URL: https://issues.apache.org/jira/browse/YARN-985
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 3.0.0, 2.0.4-alpha, 0.23.9
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Attachments: YARN-985.branch-0.23.patch, YARN-985.patch, 
 YARN-985.patch


 When a resource is localized, we should log WHERE on the local disk it was 
 localized. This helps in debugging afterwards (e.g. if the disk was to go 
 bad).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1008) MiniYARNCluster with multiple nodemanagers, all nodes have same key for allocations

2013-08-06 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731136#comment-13731136
 ] 

Alejandro Abdelnur commented on YARN-1008:
--

[~vinodkv], I don't think the change should go beyond the minicluster, for the 
following reason: in a real cluster there is one NM per node. That said, maybe 
what we should do is let AMs specify a HOST:PORT (which will typically be the 
DN HOST:PORT); in the case of the minicluster, we would need a mapping from DN 
HOST:PORT to NM HOST:PORT when processing the resource request. We should also 
support HOST:PORT directly, without mapping, for cases where MiniHDFS is not 
there.

[~ojoshi], multiple NMs register with their nodeIds, which contain HOST:PORT, 
so you do have multiple nodes in the minicluster. But the scheduler logic, in 
all schedulers, uses node.getHost() to do the scheduling; that is why you see 
it working fine, as all nodes report the same host. The problem is that you 
have no control over which NM you get.

The challenge is how we get this to work nicely in minicluster and real setups 
without disruption.
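
To make the host vs. host:port distinction concrete, here is a toy 
illustration; it is not MiniYARNCluster code and simply assumes the node key is 
a plain string.
{code}
import java.util.HashMap;
import java.util.Map;

// Toy illustration: two NMs on the same host collapse to one key when only the
// hostname is used, but stay distinct when host:port is the key.
public class NodeKeyDemo {
  public static void main(String[] args) {
    String[][] nodes = { {"localhost", "45454"}, {"localhost", "45455"} };

    Map<String, String> byHost = new HashMap<String, String>();
    Map<String, String> byHostPort = new HashMap<String, String>();
    for (String[] n : nodes) {
      byHost.put(n[0], n[0] + ":" + n[1]);   // second NM overwrites the first
      byHostPort.put(n[0] + ":" + n[1], n[0] + ":" + n[1]);
    }
    System.out.println("keyed by host: " + byHost.size() + " node(s)");          // 1
    System.out.println("keyed by host:port: " + byHostPort.size() + " node(s)"); // 2
  }
}
{code}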

 MiniYARNCluster with multiple nodemanagers, all nodes have same key for 
 allocations
 ---

 Key: YARN-1008
 URL: https://issues.apache.org/jira/browse/YARN-1008
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.1.0-beta
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur

 While the NMs are keyed using the NodeId, the allocation is done based on the 
 hostname. 
 This makes the different nodes indistinguishable to the scheduler.
 There should be an option to enable host:port instead of just the host for 
 allocations. The nodes reported to the AM should report the 'key' (host or 
 host:port).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-985) Nodemanager should log where a resource was localized

2013-08-06 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731166#comment-13731166
 ] 

Jonathan Eagles commented on YARN-985:
--

Looks like we are all happy. Putting this in. Thanks, everybody.

 Nodemanager should log where a resource was localized
 -

 Key: YARN-985
 URL: https://issues.apache.org/jira/browse/YARN-985
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 3.0.0, 2.0.4-alpha, 0.23.9
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Attachments: YARN-985.branch-0.23.patch, YARN-985.patch, 
 YARN-985.patch


 When a resource is localized, we should log WHERE on the local disk it was 
 localized. This helps in debugging afterwards (e.g. if the disk was to go 
 bad).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-985) Nodemanager should log where a resource was localized

2013-08-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731212#comment-13731212
 ] 

Hudson commented on YARN-985:
-

SUCCESS: Integrated in Hadoop-trunk-Commit #4221 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4221/])
YARN-985. Nodemanager should log where a resource was localized (Ravi Prakash 
via jeagles) (jeagles: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1511100)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/LocalResourcesTrackerImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/LocalizedResource.java


 Nodemanager should log where a resource was localized
 -

 Key: YARN-985
 URL: https://issues.apache.org/jira/browse/YARN-985
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 3.0.0, 2.0.4-alpha, 0.23.9
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Fix For: 3.0.0, 2.3.0, 0.23.10

 Attachments: YARN-985.branch-0.23.patch, YARN-985.patch, 
 YARN-985.patch


 When a resource is localized, we should log WHERE on the local disk it was 
 localized. This helps in debugging afterwards (e.g. if the disk was to go 
 bad).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1004) yarn.scheduler.minimum|maximum|increment-allocation-mb should have scheduler

2013-08-06 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731249#comment-13731249
 ] 

Alejandro Abdelnur commented on YARN-1004:
--

bq. Isn't it simpler for FS to ignore the existing configs?

It is simpler, but it is not correct; it will create confusion due to 
misconfigurations when moving from one scheduler to another (either way).

 yarn.scheduler.minimum|maximum|increment-allocation-mb should have scheduler
 

 Key: YARN-1004
 URL: https://issues.apache.org/jira/browse/YARN-1004
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
Priority: Blocker
 Attachments: YARN-1004.patch


 As yarn.scheduler.minimum-allocation-mb is now a scheduler-specific 
 configuration, and functions differently for the Fair and Capacity 
 schedulers, it would be less confusing for the config names to include the 
 scheduler names, i.e. yarn.scheduler.fair.minimum-allocation-mb, 
 yarn.scheduler.capacity.minimum-allocation-mb, and 
 yarn.scheduler.fifo.minimum-allocation-mb.
 The same goes for yarn.scheduler.increment-allocation-mb, which only exists 
 for the Fair Scheduler, and yarn.scheduler.maximum-allocation-mb, for 
 consistency.
 If we wish to preserve backwards compatibility, we can deprecate the old 
 configs to the new ones. 
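 If backwards compatibility is preserved via deprecation, it could look roughly 
 like the sketch below. The new key names are the ones proposed in this JIRA, 
 but whether Configuration.addDeprecation is the mechanism actually used is an 
 assumption, and an old key can only be mapped to one scheduler's key here, 
 which is part of why this is just an illustration.
{code}
import org.apache.hadoop.conf.Configuration;

// Sketch: map the old scheduler-agnostic keys onto proposed per-scheduler keys
// (shown for the Fair Scheduler only).
public final class SchedulerConfigDeprecations {
  public static void register() {
    Configuration.addDeprecation("yarn.scheduler.minimum-allocation-mb",
        new String[] {"yarn.scheduler.fair.minimum-allocation-mb"});
    Configuration.addDeprecation("yarn.scheduler.maximum-allocation-mb",
        new String[] {"yarn.scheduler.fair.maximum-allocation-mb"});
    // yarn.scheduler.increment-allocation-mb exists only for the Fair Scheduler.
    Configuration.addDeprecation("yarn.scheduler.increment-allocation-mb",
        new String[] {"yarn.scheduler.fair.increment-allocation-mb"});
  }
}
{code}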

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-589) Expose a REST API for monitoring the fair scheduler

2013-08-06 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-589:


Attachment: YARN-589-2.patch

 Expose a REST API for monitoring the fair scheduler
 ---

 Key: YARN-589
 URL: https://issues.apache.org/jira/browse/YARN-589
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: fairscheduler.xml, YARN-589-1.patch, YARN-589-2.patch, 
 YARN-589.patch


 The fair scheduler should have an HTTP interface that exposes information 
 such as applications per queue, fair shares, demands, current allocations.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1021) Yarn Scheduler Load Simulator

2013-08-06 Thread Wei Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Yan updated YARN-1021:
--

Attachment: YARN-1021-images.tar.gz
YARN-1021-demo.tar.gz
YARN-1021.pdf

YARN-1021.pdf: simulator documentation.
YARN-1021-demo.tar.gz: configuration (for YARN) and data used for a demo run.
YARN-1021-images.tar.gz: images used by the simulator site documentation.

 Yarn Scheduler Load Simulator
 -

 Key: YARN-1021
 URL: https://issues.apache.org/jira/browse/YARN-1021
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: scheduler
Reporter: Wei Yan
Assignee: Wei Yan
 Attachments: YARN-1021-demo.tar.gz, YARN-1021-images.tar.gz, 
 YARN-1021.pdf


 The Yarn Scheduler is a fertile area of interest with different 
 implementations, e.g., Fifo, Capacity and Fair  schedulers. Meanwhile, 
 several optimizations are also made to improve scheduler performance for 
 different scenarios and workload. Each scheduler algorithm has its own set of 
 features, and drives scheduling decisions by many factors, such as fairness, 
 capacity guarantee, resource availability, etc. It is very important to 
 evaluate a scheduler algorithm very well before we deploy it in a production 
 cluster. Unfortunately, currently it is non-trivial to evaluate a scheduling 
 algorithm. Evaluating in a real cluster is always time and cost consuming, 
 and it is also very hard to find a large-enough cluster. Hence, a simulator 
 which can predict how well a scheduler algorithm performs for some specific 
 workload would be quite useful.
 We want to build a Scheduler Load Simulator to simulate large-scale Yarn 
 clusters and application loads in a single machine. This would be invaluable 
 in furthering Yarn by providing a tool for researchers and developers to 
 prototype new scheduler features and predict their behavior and performance 
 with a reasonable amount of confidence, thereby aiding rapid innovation.
 The simulator will exercise the real Yarn ResourceManager removing the 
 network factor by simulating NodeManagers and ApplicationMasters via handling 
 and dispatching NM/AMs heartbeat events from within the same JVM.
 To keep track of scheduler behavior and performance, a scheduler wrapper 
 will wrap the real scheduler.
 The simulator will produce real time metrics while executing, including:
 * Resource usages for whole cluster and each queue, which can be utilized to 
 configure cluster and queue's capacity.
 * The detailed application execution trace (recorded in relation to simulated 
 time), which can be analyzed to understand/validate the  scheduler behavior 
 (individual jobs turn around time, throughput, fairness, capacity guarantee, 
 etc).
 * Several key metrics of scheduler algorithm, such as time cost of each 
 scheduler operation (allocate, handle, etc), which can be utilized by Hadoop 
 developers to find the code hot spots and scalability limits.
 The simulator will provide real time charts showing the behavior of the 
 scheduler and its performance.
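 As a flavor of the scheduler wrapper mentioned in the description, a timing 
 wrapper might look roughly like the sketch below; the interface is a stand-in, 
 not the real YARN scheduler API.
{code}
// Illustrative stand-in interface; the simulator wraps the real YARN scheduler.
interface SimScheduler {
  void handle(Object event);
}

// Wrapper that records the time cost of each operation it forwards.
public class TimingSchedulerWrapper implements SimScheduler {
  private final SimScheduler delegate;
  private long totalNanos;
  private long calls;

  public TimingSchedulerWrapper(SimScheduler delegate) {
    this.delegate = delegate;
  }

  @Override
  public void handle(Object event) {
    long start = System.nanoTime();
    try {
      delegate.handle(event);
    } finally {
      totalNanos += System.nanoTime() - start;
      calls++;
    }
  }

  public double avgHandleMillis() {
    return calls == 0 ? 0.0 : (totalNanos / 1.0e6) / calls;
  }
}
{code}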

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1021) Yarn Scheduler Load Simulator

2013-08-06 Thread Wei Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Yan updated YARN-1021:
--

Description: 
The Yarn Scheduler is a fertile area of interest with different 
implementations, e.g., Fifo, Capacity and Fair  schedulers. Meanwhile, several 
optimizations are also made to improve scheduler performance for different 
scenarios and workload. Each scheduler algorithm has its own set of features, 
and drives scheduling decisions by many factors, such as fairness, capacity 
guarantee, resource availability, etc. It is very important to evaluate a 
scheduler algorithm very well before we deploy it in a production cluster. 
Unfortunately, currently it is non-trivial to evaluate a scheduling algorithm. 
Evaluating in a real cluster is always time and cost consuming, and it is also 
very hard to find a large-enough cluster. Hence, a simulator which can predict 
how well a scheduler algorithm performs for some specific workload would be 
quite useful.

We want to build a Scheduler Load Simulator to simulate large-scale Yarn 
clusters and application loads in a single machine. This would be invaluable in 
furthering Yarn by providing a tool for researchers and developers to prototype 
new scheduler features and predict their behavior and performance with 
a reasonable amount of confidence, thereby aiding rapid innovation.

The simulator will exercise the real Yarn ResourceManager removing the network 
factor by simulating NodeManagers and ApplicationMasters via handling and 
dispatching NM/AMs heartbeat events from within the same JVM.

To keep track of scheduler behavior and performance, a scheduler wrapper 
will wrap the real scheduler.

The simulator will produce real time metrics while executing, including:

* Resource usages for whole cluster and each queue, which can be utilized to 
configure cluster and queue's capacity.
* The detailed application execution trace (recorded in relation to simulated 
time), which can be analyzed to understand/validate the  scheduler behavior 
(individual jobs turn around time, throughput, fairness, capacity guarantee, 
etc).
* Several key metrics of scheduler algorithm, such as time cost of each 
scheduler operation (allocate, handle, etc), which can be utilized by Hadoop 
developers to find the code hot spots and scalability limits.

The simulator will provide real time charts showing the behavior of the 
scheduler and its performance.

A short demo is available at http://www.youtube.com/watch?v=6thLi8q0qLE, showing 
how to use the simulator to simulate the Fair Scheduler and the Capacity 
Scheduler.

  was:
The Yarn Scheduler is a fertile area of interest with different 
implementations, e.g., the Fifo, Capacity and Fair schedulers. Meanwhile, several 
optimizations have also been made to improve scheduler performance for different 
scenarios and workloads. Each scheduler algorithm has its own set of features, 
and drives scheduling decisions by many factors, such as fairness, capacity 
guarantee, resource availability, etc. It is very important to evaluate a 
scheduler algorithm well before we deploy it in a production cluster. 
Unfortunately, it is currently non-trivial to evaluate a scheduling algorithm: 
evaluating in a real cluster is always time- and cost-consuming, and it is also 
very hard to find a large-enough cluster. Hence, a simulator which can predict 
how well a scheduler algorithm would work for some specific workload would be 
quite useful.

We want to build a Scheduler Load Simulator to simulate large-scale Yarn 
clusters and application loads on a single machine. This would be invaluable in 
furthering Yarn by providing a tool for researchers and developers to prototype 
new scheduler features and predict their behavior and performance with a 
reasonable amount of confidence, thereby aiding rapid innovation.

The simulator will exercise the real Yarn ResourceManager while removing the 
network factor by simulating NodeManagers and ApplicationMasters, handling and 
dispatching NM/AM heartbeat events from within the same JVM.

To keep track of scheduler behavior and performance, a scheduler wrapper will 
wrap the real scheduler.

The simulator will produce real time metrics while executing, including:

* Resource usage for the whole cluster and for each queue, which can be utilized 
to configure the cluster's and each queue's capacity.
* The detailed application execution trace (recorded in relation to simulated 
time), which can be analyzed to understand/validate the scheduler behavior 
(individual job turnaround time, throughput, fairness, capacity guarantee, etc).
* Several key metrics of the scheduler algorithm, such as the time cost of each 
scheduler operation (allocate, handle, etc), which can be utilized by Hadoop 
developers to find the code hot spots and scalability limits.

The simulator will provide real-time charts showing the behavior of the 
scheduler and its performance.


 Yarn Scheduler Load Simulator
 

[jira] [Commented] (YARN-589) Expose a REST API for monitoring the fair scheduler

2013-08-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13731436#comment-13731436
 ] 

Hadoop QA commented on YARN-589:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12596445/YARN-589-2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1662//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1662//console

This message is automatically generated.

 Expose a REST API for monitoring the fair scheduler
 ---

 Key: YARN-589
 URL: https://issues.apache.org/jira/browse/YARN-589
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: fairscheduler.xml, YARN-589-1.patch, YARN-589-2.patch, 
 YARN-589.patch


 The fair scheduler should have an HTTP interface that exposes information 
 such as applications per queue, fair shares, demands, current allocations.
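
 As a hedged usage sketch, once such an interface exists it could presumably be 
 queried through the RM web services. The endpoint path below 
 (/ws/v1/cluster/scheduler) and the default web port 8088 follow the existing RM 
 REST conventions and are assumptions here; the actual path and response fields 
 are whatever the attached patch exposes.

{code}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

/** Sketch: fetch scheduler info from the RM web services and print it.
 *  The path is an assumption based on the existing /ws/v1/cluster endpoints. */
public class FairSchedulerInfoClient {
  public static void main(String[] args) throws Exception {
    String rm = args.length > 0 ? args[0] : "localhost:8088";
    URL url = new URL("http://" + rm + "/ws/v1/cluster/scheduler");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestProperty("Accept", "application/xml"); // or application/json
    BufferedReader in =
        new BufferedReader(new InputStreamReader(conn.getInputStream()));
    String line;
    while ((line = in.readLine()) != null) {
      // queue names, fair shares, demands, current allocations, ...
      System.out.println(line);
    }
    in.close();
  }
}
{code}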

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1019) YarnConfiguration validation for local disk path and http addresses.

2013-08-06 Thread Joseph Kniest (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13731462#comment-13731462
 ] 

Joseph Kniest commented on YARN-1019:
-

Ok, so for this YarnConfiguration module: do other portions of the codebase 
access it for config info like directories and so on, and do I need to find all 
those places? How does that information get passed to this object? Ultimately, 
we want to find where this object gets instantiated and ensure that it doesn't 
get relative paths, correct? And what exactly do we want with number 2 of this 
issue? I'm confused about that one.

 YarnConfiguration validation for local disk path and http addresses.
 

 Key: YARN-1019
 URL: https://issues.apache.org/jira/browse/YARN-1019
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Omkar Vinit Joshi
Priority: Minor
  Labels: newbie

 Today we are not validating certain configuration parameters set in 
 yarn-site.xml. 1) Configurations related to paths, such as local-dirs and 
 log-dirs: our NM crashes during startup if they are set to relative paths 
 rather than absolute paths. To avoid such failures we can enforce checks 
 (absolute paths) before startup, i.e. before the directory handler actually 
 creates the directories.
 2) Also validate all the parameters of the form hostname:port, unless we are 
 ok with the default port.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-1036) Distributed Cache gives inconsistent result if cache files get deleted from task tracker

2013-08-06 Thread Ravi Prakash (JIRA)
Ravi Prakash created YARN-1036:
--

 Summary: Distributed Cache gives inconsistent result if cache 
files get deleted from task tracker 
 Key: YARN-1036
 URL: https://issues.apache.org/jira/browse/YARN-1036
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 0.23.9
Reporter: Ravi Prakash
Assignee: Ravi Prakash


This is a JIRA to backport MAPREDUCE-4342. I had to open a new JIRA because 
that one had been closed. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1036) Distributed Cache gives inconsistent result if cache files get deleted from task tracker

2013-08-06 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated YARN-1036:
---

Attachment: YARN-1036.branch-0.23.patch

This is exactly the same patch as MAPREDUCE-4342.

 Distributed Cache gives inconsistent result if cache files get deleted from 
 task tracker 
 -

 Key: YARN-1036
 URL: https://issues.apache.org/jira/browse/YARN-1036
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 0.23.9
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Attachments: YARN-1036.branch-0.23.patch


 This is a JIRA to backport MAPREDUCE-4342. I had to open a new JIRA because 
 that one had been closed. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (YARN-1010) FairScheduler: decouple container scheduling from nodemanager heartbeats

2013-08-06 Thread Wei Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Yan reassigned YARN-1010:
-

Assignee: Wei Yan  (was: Alejandro Abdelnur)

 FairScheduler: decouple container scheduling from nodemanager heartbeats
 

 Key: YARN-1010
 URL: https://issues.apache.org/jira/browse/YARN-1010
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Affects Versions: 2.1.0-beta
Reporter: Alejandro Abdelnur
Assignee: Wei Yan
Priority: Critical

 Currently, scheduling for a node is done when the node heartbeats.
 For large clusters where the heartbeat interval is set to several seconds, this 
 delays scheduling of incoming allocations significantly.
 We could have a continuous loop scanning all nodes and doing scheduling. If 
 there is availability, AMs will get the allocation in the next heartbeat after 
 the one that placed the request.
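
 A minimal sketch of the continuous-scheduling idea described above; the node 
 interface and names here are illustrative assumptions, not the FairScheduler's 
 actual internals.

{code}
import java.util.List;

/** Sketch: a background thread that scans all nodes and attempts to assign
 *  containers, instead of scheduling only on node heartbeats. */
public class ContinuousSchedulingThread extends Thread {

  /** Hypothetical node abstraction with a single scheduling hook. */
  public interface SchedulableNode {
    void attemptScheduling();
  }

  private final List<SchedulableNode> nodes;
  private final long intervalMs;
  private volatile boolean running = true;

  public ContinuousSchedulingThread(List<SchedulableNode> nodes, long intervalMs) {
    this.nodes = nodes;
    this.intervalMs = intervalMs;
    setDaemon(true);
  }

  @Override
  public void run() {
    while (running) {
      for (SchedulableNode node : nodes) {
        node.attemptScheduling(); // assign containers if resources are available
      }
      try {
        Thread.sleep(intervalMs); // much shorter than the NM heartbeat interval
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
    }
  }

  public void shutdown() {
    running = false;
  }
}
{code}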

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1036) Distributed Cache gives inconsistent result if cache files get deleted from task tracker

2013-08-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13731470#comment-13731470
 ] 

Hadoop QA commented on YARN-1036:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12596459/YARN-1036.branch-0.23.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1664//console

This message is automatically generated.

 Distributed Cache gives inconsistent result if cache files get deleted from 
 task tracker 
 -

 Key: YARN-1036
 URL: https://issues.apache.org/jira/browse/YARN-1036
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 0.23.9
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Attachments: YARN-1036.branch-0.23.patch


 This is a JIRA to backport MAPREDUCE-4342. I had to open a new JIRA because 
 that one had been closed. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1021) Yarn Scheduler Load Simulator

2013-08-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13731474#comment-13731474
 ] 

Hadoop QA commented on YARN-1021:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12596449/YARN-1021.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 1163 javac 
compiler warnings (more than the trunk's current 1147 warnings).

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 28 new 
Findbugs (version 1.3.9) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 7 
release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-assemblies hadoop-tools/hadoop-sls hadoop-tools/hadoop-tools-dist.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1663//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/1663//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/1663//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-sls.html
Javac warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/1663//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1663//console

This message is automatically generated.

 Yarn Scheduler Load Simulator
 -

 Key: YARN-1021
 URL: https://issues.apache.org/jira/browse/YARN-1021
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: scheduler
Reporter: Wei Yan
Assignee: Wei Yan
 Attachments: YARN-1021-demo.tar.gz, YARN-1021-images.tar.gz, 
 YARN-1021.patch, YARN-1021.pdf


 The Yarn Scheduler is a fertile area of interest with different 
 implementations, e.g., the Fifo, Capacity and Fair schedulers. Meanwhile, 
 several optimizations have also been made to improve scheduler performance for 
 different scenarios and workloads. Each scheduler algorithm has its own set of 
 features, and drives scheduling decisions by many factors, such as fairness, 
 capacity guarantee, resource availability, etc. It is very important to 
 evaluate a scheduler algorithm well before we deploy it in a production 
 cluster. Unfortunately, it is currently non-trivial to evaluate a scheduling 
 algorithm: evaluating in a real cluster is always time- and cost-consuming, and 
 it is also very hard to find a large-enough cluster. Hence, a simulator which 
 can predict how well a scheduler algorithm would work for some specific 
 workload would be quite useful.
 We want to build a Scheduler Load Simulator to simulate large-scale Yarn 
 clusters and application loads on a single machine. This would be invaluable 
 in furthering Yarn by providing a tool for researchers and developers to 
 prototype new scheduler features and predict their behavior and performance 
 with a reasonable amount of confidence, thereby aiding rapid innovation.
 The simulator will exercise the real Yarn ResourceManager while removing the 
 network factor by simulating NodeManagers and ApplicationMasters, handling 
 and dispatching NM/AM heartbeat events from within the same JVM.
 To keep track of scheduler behavior and performance, a scheduler wrapper 
 will wrap the real scheduler.
 The simulator will produce real time metrics while executing, including:
 * Resource usage for the whole cluster and for each queue, which can be 
 utilized to configure the cluster's and each queue's capacity.
 * The detailed application execution trace (recorded in relation to simulated 
 time), which can be analyzed to understand/validate the scheduler behavior 
 (individual job turnaround time, throughput, fairness, capacity guarantee, 
 etc).
 * Several key metrics of the scheduler algorithm, such as the time cost of 
 each scheduler operation (allocate, handle, etc), which can be utilized by 
 Hadoop developers to find the code hot spots and scalability limits.
 The simulator will provide real-time charts showing the behavior of the 
 scheduler and its performance.
 A short demo is available at http://www.youtube.com/watch?v=6thLi8q0qLE, 
 showing how to use the simulator to simulate the Fair Scheduler and the 
 Capacity Scheduler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (YARN-1036) Distributed Cache gives inconsistent result if cache files get deleted from task tracker

2013-08-06 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13731473#comment-13731473
 ] 

Omkar Vinit Joshi commented on YARN-1036:
-

Thanks, [~raviprak].
Should we isolate the logic for the LOCALIZED and REQUEST scenarios? Thoughts?
{code}
+  if (rsrc != null && (!isResourcePresent(rsrc))) {
+    LOG.info("Resource " + rsrc.getLocalPath()
+        + " is missing, localizing it again");
+    localrsrc.remove(req);
+    rsrc = null;
+  }
{code}
This code does not need to be executed when a resource is getting LOCALIZED; in 
trunk we have isolated them. Since in branch 0.23 we don't have anything like 
localCacheDirectoryManager, it probably makes sense to just keep the break and 
do nothing in case it is LOCALIZED?
{code}
case LOCALIZED: break;
case REQUEST:
+  if (rsrc != null && (!isResourcePresent(rsrc))) {
+    LOG.info("Resource " + rsrc.getLocalPath()
+        + " is missing, localizing it again");
+    localrsrc.remove(req);
+    rsrc = null;
+  }
.
{code}
I didn't review the test code.
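
For context, a hedged sketch of what the isResourcePresent check referenced 
above might boil down to: deciding whether the file a resource was localized to 
still exists on local disk. This is only an illustrative simplification with a 
hypothetical resource view, not the actual NodeManager implementation.

{code}
import java.io.File;

/** Illustrative only: a localized resource is considered present if the file
 *  it was localized to still exists on the local disk. */
public class ResourcePresenceCheck {

  /** Hypothetical minimal view of a localized resource. */
  public interface LocalizedResourceView {
    String getLocalPathString(); // null if not yet localized
  }

  public static boolean isResourcePresent(LocalizedResourceView rsrc) {
    String localPath = rsrc.getLocalPathString();
    if (localPath == null) {
      return false; // never localized, nothing on disk to check
    }
    return new File(localPath).exists();
  }
}
{code}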

 Distributed Cache gives inconsistent result if cache files get deleted from 
 task tracker 
 -

 Key: YARN-1036
 URL: https://issues.apache.org/jira/browse/YARN-1036
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 0.23.9
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Attachments: YARN-1036.branch-0.23.patch


 This is a JIRA to backport MAPREDUCE-4342. I had to open a new JIRA because 
 that one had been closed. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-353) Add Zookeeper-based store implementation for RMStateStore

2013-08-06 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-353:
--

Attachment: YARN-353.11.patch

I manually inspected the fields findbugs is complaining about and don't see any 
particular issues or additional need for synchronization.

Uploading a patch that adds exclusions for the two fields in question.
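
For reference, findbugs exclusions of this kind typically go into the project's 
findbugs exclude filter. A hedged example entry is sketched below; the field 
name and bug pattern are placeholders, not the actual fields or warnings from 
this patch.

{code}
<!-- Sketch of a findbugs exclude-filter entry; field name and bug pattern are
     placeholders, not the actual exclusions added by the patch. -->
<FindBugsFilter>
  <Match>
    <Class name="org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore" />
    <Field name="someField" />
    <Bug pattern="IS2_INCONSISTENT_SYNC" />
  </Match>
</FindBugsFilter>
{code}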


 Add Zookeeper-based store implementation for RMStateStore
 -

 Key: YARN-353
 URL: https://issues.apache.org/jira/browse/YARN-353
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Hitesh Shah
Assignee: Bikas Saha
 Attachments: YARN-353.10.patch, YARN-353.11.patch, YARN-353.1.patch, 
 YARN-353.2.patch, YARN-353.3.patch, YARN-353.4.patch, YARN-353.5.patch, 
 YARN-353.6.patch, YARN-353.7.patch, YARN-353.8.patch, YARN-353.9.patch


 Add store that writes RM state data to ZK

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-353) Add Zookeeper-based store implementation for RMStateStore

2013-08-06 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13731480#comment-13731480
 ] 

Karthik Kambatla commented on YARN-353:
---

YARN-353.11.patch is the patch with findbugs exclusions.

 Add Zookeeper-based store implementation for RMStateStore
 -

 Key: YARN-353
 URL: https://issues.apache.org/jira/browse/YARN-353
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Hitesh Shah
Assignee: Bikas Saha
 Attachments: YARN-353.10.patch, YARN-353.11.patch, YARN-353.1.patch, 
 YARN-353.2.patch, YARN-353.3.patch, YARN-353.4.patch, YARN-353.5.patch, 
 YARN-353.6.patch, YARN-353.7.patch, YARN-353.8.patch, YARN-353.9.patch


 Add store that writes RM state data to ZK

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-353) Add Zookeeper-based store implementation for RMStateStore

2013-08-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13731528#comment-13731528
 ] 

Hadoop QA commented on YARN-353:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12596465/YARN-353.11.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1665//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1665//console

This message is automatically generated.

 Add Zookeeper-based store implementation for RMStateStore
 -

 Key: YARN-353
 URL: https://issues.apache.org/jira/browse/YARN-353
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Hitesh Shah
Assignee: Bikas Saha
 Attachments: YARN-353.10.patch, YARN-353.11.patch, YARN-353.1.patch, 
 YARN-353.2.patch, YARN-353.3.patch, YARN-353.4.patch, YARN-353.5.patch, 
 YARN-353.6.patch, YARN-353.7.patch, YARN-353.8.patch, YARN-353.9.patch


 Add store that writes RM state data to ZK

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1019) YarnConfiguration validation for local disk path and http addresses.

2013-08-06 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13731551#comment-13731551
 ] 

Omkar Vinit Joshi commented on YARN-1019:
-

[~josephkniest], to give you more insight into how it is used (general 
configuration reading in hadoop):

bq. Ok so this module YarnConfiguration, does other portions of the codebase 
access this for the config info like directories and what not and I need to 
find all those places?
Probably not. If you are using eclipse for hadoop development, just open the 
call hierarchy for the variable under consideration, say 
YarnConfiguration#RM_ADDRESS. You will see where it is getting used, which 
narrows your search. You can ignore places where it is used inside test code; 
you don't need to validate test code. But you will have to add a unit test case 
later to verify your changes.

bq. How does that information get passed to this object?
Probably you don't need to worry about this. You can trace
{code}
new YarnConfiguration()
{code}
call. It will read from the configuration files: for YARN, yarn-site.xml; for 
HDFS, hdfs-site.xml; for CORE, core-site.xml.

bq. Ultimately, we want to find where this object gets instantiated and ensure 
that it doesn't get relative paths correct?
Yes, for all the places where we are getting file paths we need to ensure this. 
Make sure the check is not OS specific, i.e. it works for WINDOWS/LINUX/MAC.

bq. What exactly do we want with number 2 of this issue? I'm confused about 
that one
When we expect, for example, RM_ADDRESS, we expect it to be of the form 
host:port; just validate that.
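
To make the two checks concrete, here is a minimal sketch of the kind of 
validation being asked for. The helper names are purely illustrative, not 
existing Yarn code.

{code}
import java.io.File;

/** Illustrative validation helpers for the two cases described above. */
public class ConfigValidation {

  /** 1) Path-like configs (local-dirs, log-dirs, ...) must be absolute. */
  public static void checkAbsolutePaths(String key, String commaSeparatedDirs) {
    for (String dir : commaSeparatedDirs.split(",")) {
      if (!new File(dir.trim()).isAbsolute()) {
        throw new IllegalArgumentException(
            key + " contains a relative path: " + dir);
      }
    }
  }

  /** 2) Address configs must look like host:port
   *  (unless a default port is acceptable). */
  public static void checkHostPort(String key, String address) {
    int idx = address.lastIndexOf(':');
    if (idx <= 0 || idx == address.length() - 1) {
      throw new IllegalArgumentException(
          key + " must be of the form host:port: " + address);
    }
    try {
      int port = Integer.parseInt(address.substring(idx + 1));
      if (port < 0 || port > 65535) {
        throw new IllegalArgumentException(key + " has an invalid port: " + port);
      }
    } catch (NumberFormatException e) {
      throw new IllegalArgumentException(key + " has a non-numeric port: " + address);
    }
  }
}
{code}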

Finally, once you have done the changes, you need to create a patch and upload 
it via More Actions -- Attach Files, and then Submit Patch.

 YarnConfiguration validation for local disk path and http addresses.
 

 Key: YARN-1019
 URL: https://issues.apache.org/jira/browse/YARN-1019
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Omkar Vinit Joshi
Priority: Minor
  Labels: newbie

 Today we are not validating certain configuration parameters set in 
 yarn-site.xml. 1) Configurations related to paths, such as local-dirs and 
 log-dirs: our NM crashes during startup if they are set to relative paths 
 rather than absolute paths. To avoid such failures we can enforce checks 
 (absolute paths) before startup, i.e. before the directory handler actually 
 creates the directories.
 2) Also validate all the parameters of the form hostname:port, unless we are 
 ok with the default port.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-1037) Create a helper function to create a local resource object given a path to file

2013-08-06 Thread Hitesh Shah (JIRA)
Hitesh Shah created YARN-1037:
-

 Summary: Create a helper function to create a local resource 
object given a path to file
 Key: YARN-1037
 URL: https://issues.apache.org/jira/browse/YARN-1037
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Hitesh Shah


A helper function that, given either a qualified or non-qualified path, 
constructs a local resource object.

It should be available in one of the client library layers for developers to 
write against.
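
A hedged sketch of what such a helper might look like against the existing Yarn 
records API; this is one possible shape under those assumptions, not the 
eventual library method.

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.yarn.api.records.LocalResource;
import org.apache.hadoop.yarn.api.records.LocalResourceType;
import org.apache.hadoop.yarn.api.records.LocalResourceVisibility;
import org.apache.hadoop.yarn.util.ConverterUtils;

import java.io.IOException;

/** Sketch: build a LocalResource from a (possibly non-qualified) path. */
public class LocalResourceHelper {

  public static LocalResource createLocalResource(Configuration conf, Path path,
      LocalResourceType type, LocalResourceVisibility visibility)
      throws IOException {
    FileSystem fs = path.getFileSystem(conf);
    Path qualified = fs.makeQualified(path);     // qualify non-qualified paths
    FileStatus status = fs.getFileStatus(qualified);
    return LocalResource.newInstance(
        ConverterUtils.getYarnUrlFromPath(qualified),
        type, visibility,
        status.getLen(), status.getModificationTime());
  }
}
{code}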


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-899) Get queue administration ACLs working

2013-08-06 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13731602#comment-13731602
 ] 

Xuan Gong commented on YARN-899:


Create a QueueACLsManager to save the ApplicationId to CSQueue mapping. Whenever 
users try to get the applicationReport, list applications, or kill applications 
through the command line, web service or UI, the QueueACLsManager will check the 
user's permission.
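
A minimal sketch of the approach described above, with simplified, hypothetical 
types standing in for the real scheduler queue and ACL classes; it only 
illustrates the ApplicationId-to-queue bookkeeping and the permission check.

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch: remembers which queue an application was submitted to and checks
 *  administer-queue permission before view/kill operations. Types simplified. */
public class QueueACLsManagerSketch {

  /** Hypothetical stand-in for the scheduler queue with its ACLs. */
  public interface Queue {
    boolean hasAdministerAccess(String user);
  }

  private final Map<String, Queue> appToQueue =
      new ConcurrentHashMap<String, Queue>();

  public void addApplication(String applicationId, Queue queue) {
    appToQueue.put(applicationId, queue);
  }

  public void removeApplication(String applicationId) {
    appToQueue.remove(applicationId);
  }

  /** Called from CLI/web service/UI paths before returning reports or
   *  killing applications. */
  public boolean checkAccess(String user, String applicationId) {
    Queue queue = appToQueue.get(applicationId);
    return queue != null && queue.hasAdministerAccess(user);
  }
}
{code}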


 Get queue administration ACLs working
 -

 Key: YARN-899
 URL: https://issues.apache.org/jira/browse/YARN-899
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Xuan Gong
 Attachments: YARN-899.1.patch


 The Capacity Scheduler documents the 
 yarn.scheduler.capacity.root.queue-path.acl_administer_queue config option 
 for controlling who can administer a queue, but it is not hooked up to 
 anything.  The Fair Scheduler could make use of a similar option as well.  
 This is a feature-parity regression from MR1.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-899) Get queue administration ACLs working

2013-08-06 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-899:
---

Attachment: YARN-899.1.patch

 Get queue administration ACLs working
 -

 Key: YARN-899
 URL: https://issues.apache.org/jira/browse/YARN-899
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Xuan Gong
 Attachments: YARN-899.1.patch


 The Capacity Scheduler documents the 
 yarn.scheduler.capacity.root.queue-path.acl_administer_queue config option 
 for controlling who can administer a queue, but it is not hooked up to 
 anything.  The Fair Scheduler could make use of a similar option as well.  
 This is a feature-parity regression from MR1.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira