[jira] [Updated] (YARN-2963) Helper library that allows requesting containers from multiple queues
[ https://issues.apache.org/jira/browse/YARN-2963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated YARN-2963:
-----------------------------------
Attachment: yarn-2963-preview.patch

Here is a preview of what I have in mind. Appreciate any early feedback. I'll post another patch with tests and any API simplification.

Helper library that allows requesting containers from multiple queues
----------------------------------------------------------------------
Key: YARN-2963
URL: https://issues.apache.org/jira/browse/YARN-2963
Project: Hadoop YARN
Issue Type: New Feature
Components: client
Affects Versions: 2.6.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Attachments: yarn-2963-preview.patch

As proposed on the mailing list (yarn-dev), it would be nice to have a way for YARN applications to request containers from multiple queues. For example, Oozie might want to run a single AM for all user jobs and request one container per launcher.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
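To make the proposal concrete, here is a hypothetical sketch of the kind of API such a helper library might expose. The interface name and its methods are invented for illustration and are not taken from yarn-2963-preview.patch:

{code}
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

/**
 * Hypothetical helper that multiplexes container requests across queues,
 * e.g. so a single Oozie AM can request one launcher container per user
 * job from that job's queue. All names here are illustrative only.
 */
public interface MultiQueueAMRMClient {

  /** Register a queue the AM wants to request containers from. */
  void addQueue(String queue);

  /** Ask for a container charged against the given queue. */
  void addContainerRequest(String queue, ContainerRequest request);

  /** Withdraw a pending request from the given queue. */
  void removeContainerRequest(String queue, ContainerRequest request);
}
{code}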
[jira] [Commented] (YARN-3071) Remove invalid char from sample conf in doc of FairScheduler
[ https://issues.apache.org/jira/browse/YARN-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283741#comment-14283741 ]

Hudson commented on YARN-3071:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk #813 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/813/])
YARN-3071. Remove invalid char from sample conf in doc of FairScheduler. (Contributed by Masatake Iwasaki) (aajisaka: rev 4a5c3a4cfee6b8008a722801821e64850582a985)
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/FairScheduler.apt.vm

Remove invalid char from sample conf in doc of FairScheduler
------------------------------------------------------------
Key: YARN-3071
URL: https://issues.apache.org/jira/browse/YARN-3071
Project: Hadoop YARN
Issue Type: Bug
Components: documentation
Affects Versions: 2.5.0, 2.6.0
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
Priority: Trivial
Fix For: 2.7.0
Attachments: YARN-3071.001.patch, YARN-3071.002.patch

Copying and pasting the sample conf causes a failure on RM startup:

{code}
Caused by: org.xml.sax.SAXParseException; systemId: file:/home/iwasakims/dist/hadoop-2.6.0/etc/hadoop/fair-scheduler.xml; lineNumber: 18; columnNumber: 5; The content of elements must consist of well-formed character data or markup.
        at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
        at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
        at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:205)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:250)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1275)
        ... 9 more
{code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
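Since the failure only shows up at RM startup, one way to catch a bad copy-paste beforehand is to run the same kind of DOM parse that AllocationFileLoaderService performs. A minimal sketch using only the JDK (the file path is an example):

{code}
import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;

public class ValidateAllocationsFile {
  public static void main(String[] args) throws Exception {
    // Throws org.xml.sax.SAXParseException, as in the report above,
    // if the pasted sample conf contains an invalid character.
    DocumentBuilderFactory.newInstance().newDocumentBuilder()
        .parse(new File("etc/hadoop/fair-scheduler.xml"));
    System.out.println("fair-scheduler.xml is well-formed");
  }
}
{code}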
[jira] [Commented] (YARN-3015) yarn classpath command should support same options as hadoop classpath.
[ https://issues.apache.org/jira/browse/YARN-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283742#comment-14283742 ]

Hudson commented on YARN-3015:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk #813 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/813/])
YARN-3015. yarn classpath command should support same options as hadoop classpath. Contributed by Varun Saxena. (cnauroth: rev cb0a15d20180c7ca3799e63a2d53aa8dee800abd)
* hadoop-yarn-project/hadoop-yarn/bin/yarn.cmd
* hadoop-yarn-project/hadoop-yarn/bin/yarn
* hadoop-yarn-project/CHANGES.txt

yarn classpath command should support same options as hadoop classpath.
------------------------------------------------------------------------
Key: YARN-3015
URL: https://issues.apache.org/jira/browse/YARN-3015
Project: Hadoop YARN
Issue Type: Bug
Components: scripts
Reporter: Chris Nauroth
Assignee: Varun Saxena
Priority: Minor
Fix For: 2.7.0
Attachments: YARN-3015-branch-2.patch, YARN-3015.002.patch, YARN-3015.003.patch, YARN-3015.004.patch, YARN-3015.005.patch

HADOOP-10903 enhanced the {{hadoop classpath}} command to support optional expansion of the wildcards and bundling the classpath into a jar file containing a manifest with the Class-Path attribute. The other classpath commands should do the same for consistency.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
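For context, the jar mode added by HADOOP-10903 writes the entire classpath into a jar manifest's Class-Path attribute instead of printing it. A small sketch of inspecting such a jar with standard JDK classes (the jar name is an example, assumed to come from something like {{yarn classpath --jar cp.jar}}):

{code}
import java.util.jar.Attributes;
import java.util.jar.JarFile;

public class PrintManifestClasspath {
  public static void main(String[] args) throws Exception {
    // Reads the bundled classpath back out of the generated manifest.
    try (JarFile jar = new JarFile("cp.jar")) {
      System.out.println(jar.getManifest().getMainAttributes()
          .getValue(Attributes.Name.CLASS_PATH));
    }
  }
}
{code}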
[jira] [Commented] (YARN-2933) Capacity Scheduler preemption policy should only consider capacity without labels temporarily
[ https://issues.apache.org/jira/browse/YARN-2933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283740#comment-14283740 ]

Hudson commented on YARN-2933:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk #813 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/813/])
YARN-2933. Capacity Scheduler preemption policy should only consider capacity without labels temporarily. Contributed by Mayank Bansal (wangda: rev 0a2d3e717d9c42090a32ff177991a222a1e34132)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/TestProportionalCapacityPreemptionPolicy.java
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/ProportionalCapacityPreemptionPolicy.java

Capacity Scheduler preemption policy should only consider capacity without labels temporarily
----------------------------------------------------------------------------------------------
Key: YARN-2933
URL: https://issues.apache.org/jira/browse/YARN-2933
Project: Hadoop YARN
Issue Type: Sub-task
Components: capacityscheduler
Reporter: Wangda Tan
Assignee: Mayank Bansal
Fix For: 2.7.0
Attachments: YARN-2933-1.patch, YARN-2933-2.patch, YARN-2933-3.patch, YARN-2933-4.patch, YARN-2933-5.patch, YARN-2933-6.patch, YARN-2933-7.patch, YARN-2933-8.patch, YARN-2933-9.patch

Currently, we have capacity enforcement on each queue for each label in CapacityScheduler, but we don't have a preemption policy to support that. YARN-2498 targets preemption that respects node labels, but we have some gaps in the code base; for example, queues/FiCaScheduler should be able to get usedResource/pendingResource, etc. by label. These items potentially require refactoring CS, which we need to spend some time thinking through carefully.

For now, what we can do immediately is calculate ideal_allocation and preempt containers only for resources on nodes without labels, to avoid regressions like the following: a cluster has some nodes with labels and some without; assume queueA isn't satisfied for resources without labels. Today, the preemption policy may preempt resources from labeled nodes for queueA, which is not correct.

Again, this is just a short-term enhancement; YARN-2498 will add preemption respecting node labels for the Capacity Scheduler, which is our final target.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
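A rough illustration of the temporary rule described above: only resources on unlabeled nodes count toward ideal_allocation and preemption. The class and method names here are assumptions made for the sketch, not the patch's actual code:

{code}
import java.util.Set;

// Illustrative only: decide whether a node's resources may be considered
// by the preemption policy under the temporary no-labels rule.
final class PreemptionLabelFilter {
  static boolean considerForPreemption(Set<String> nodeLabels) {
    // Containers on labeled nodes are skipped entirely, so a queue that is
    // unsatisfied for unlabeled resources never preempts labeled capacity.
    return nodeLabels == null || nodeLabels.isEmpty();
  }
}
{code}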
[jira] [Commented] (YARN-3071) Remove invalid char from sample conf in doc of FairScheduler
[ https://issues.apache.org/jira/browse/YARN-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283687#comment-14283687 ]

Hudson commented on YARN-3071:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #79 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/79/])
YARN-3071. Remove invalid char from sample conf in doc of FairScheduler. (Contributed by Masatake Iwasaki) (aajisaka: rev 4a5c3a4cfee6b8008a722801821e64850582a985)
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/FairScheduler.apt.vm

Remove invalid char from sample conf in doc of FairScheduler
------------------------------------------------------------
Key: YARN-3071
URL: https://issues.apache.org/jira/browse/YARN-3071
Project: Hadoop YARN
Issue Type: Bug
Components: documentation
Affects Versions: 2.5.0, 2.6.0
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
Priority: Trivial
Fix For: 2.7.0
Attachments: YARN-3071.001.patch, YARN-3071.002.patch

Copying and pasting the sample conf causes a failure on RM startup:

{code}
Caused by: org.xml.sax.SAXParseException; systemId: file:/home/iwasakims/dist/hadoop-2.6.0/etc/hadoop/fair-scheduler.xml; lineNumber: 18; columnNumber: 5; The content of elements must consist of well-formed character data or markup.
        at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
        at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
        at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:205)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:250)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1275)
        ... 9 more
{code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2933) Capacity Scheduler preemption policy should only consider capacity without labels temporarily
[ https://issues.apache.org/jira/browse/YARN-2933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283686#comment-14283686 ]

Hudson commented on YARN-2933:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #79 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/79/])
YARN-2933. Capacity Scheduler preemption policy should only consider capacity without labels temporarily. Contributed by Mayank Bansal (wangda: rev 0a2d3e717d9c42090a32ff177991a222a1e34132)
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/ProportionalCapacityPreemptionPolicy.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/TestProportionalCapacityPreemptionPolicy.java

Capacity Scheduler preemption policy should only consider capacity without labels temporarily
----------------------------------------------------------------------------------------------
Key: YARN-2933
URL: https://issues.apache.org/jira/browse/YARN-2933
Project: Hadoop YARN
Issue Type: Sub-task
Components: capacityscheduler
Reporter: Wangda Tan
Assignee: Mayank Bansal
Fix For: 2.7.0
Attachments: YARN-2933-1.patch, YARN-2933-2.patch, YARN-2933-3.patch, YARN-2933-4.patch, YARN-2933-5.patch, YARN-2933-6.patch, YARN-2933-7.patch, YARN-2933-8.patch, YARN-2933-9.patch

Currently, we have capacity enforcement on each queue for each label in CapacityScheduler, but we don't have a preemption policy to support that. YARN-2498 targets preemption that respects node labels, but we have some gaps in the code base; for example, queues/FiCaScheduler should be able to get usedResource/pendingResource, etc. by label. These items potentially require refactoring CS, which we need to spend some time thinking through carefully.

For now, what we can do immediately is calculate ideal_allocation and preempt containers only for resources on nodes without labels, to avoid regressions like the following: a cluster has some nodes with labels and some without; assume queueA isn't satisfied for resources without labels. Today, the preemption policy may preempt resources from labeled nodes for queueA, which is not correct.

Again, this is just a short-term enhancement; YARN-2498 will add preemption respecting node labels for the Capacity Scheduler, which is our final target.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3015) yarn classpath command should support same options as hadoop classpath.
[ https://issues.apache.org/jira/browse/YARN-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283688#comment-14283688 ]

Hudson commented on YARN-3015:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #79 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/79/])
YARN-3015. yarn classpath command should support same options as hadoop classpath. Contributed by Varun Saxena. (cnauroth: rev cb0a15d20180c7ca3799e63a2d53aa8dee800abd)
* hadoop-yarn-project/hadoop-yarn/bin/yarn
* hadoop-yarn-project/hadoop-yarn/bin/yarn.cmd
* hadoop-yarn-project/CHANGES.txt

yarn classpath command should support same options as hadoop classpath.
------------------------------------------------------------------------
Key: YARN-3015
URL: https://issues.apache.org/jira/browse/YARN-3015
Project: Hadoop YARN
Issue Type: Bug
Components: scripts
Reporter: Chris Nauroth
Assignee: Varun Saxena
Priority: Minor
Fix For: 2.7.0
Attachments: YARN-3015-branch-2.patch, YARN-3015.002.patch, YARN-3015.003.patch, YARN-3015.004.patch, YARN-3015.005.patch

HADOOP-10903 enhanced the {{hadoop classpath}} command to support optional expansion of the wildcards and bundling the classpath into a jar file containing a manifest with the Class-Path attribute. The other classpath commands should do the same for consistency.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1039) Add parameter for YARN resource requests to indicate long lived
[ https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283734#comment-14283734 ]

Steve Loughran commented on YARN-1039:
--------------------------------------

I've always envisaged the flag switching on some different policies, though with container preservation across restarts, labels, log aggregation, and windows for failure tracking, much of that is already dealt with. Otherwise, the longevity flag could be of use in:

# RM UI. There's no percentage-done any more, just live/not-live. This already causes confusion for our Slider users.
# Placement: do you want 100% of a node's capacity to go to long-lived work, at the expense of being able to run anything short-lived there?
# Pre-emption. The cost of pre-emption may be higher, but at the same time long-lived containers are the ones you may want to pre-empt the most, because the scheduler knows they won't go away any time soon.

The easy target is the UI, as that doesn't need scheduling changes, and the current percentage-done view doesn't work. Something that indicates live/not-live makes more sense (though not red/green, unless you don't want colour-blind people using your app).

Add parameter for YARN resource requests to indicate long lived
----------------------------------------------------------------
Key: YARN-1039
URL: https://issues.apache.org/jira/browse/YARN-1039
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Affects Versions: 3.0.0, 2.1.1-beta
Reporter: Steve Loughran
Assignee: Craig Welch
Attachments: YARN-1039.1.patch, YARN-1039.2.patch, YARN-1039.3.patch

A container request could support a new parameter, long-lived. This could be used by a scheduler that would know not to host the service on a transient (cloud: spot-priced) node. Schedulers could also decide whether or not to allocate multiple long-lived containers on the same node.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
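For concreteness, a hypothetical use of such a flag on a request. The {{setLongLived()}} setter below is an assumption made for illustration; the actual API shape is exactly what this JIRA is still debating:

{code}
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.api.records.ResourceRequest;

// Sketch only: ask for one anywhere-placed 1GB/1-core container and mark
// it long-lived so the scheduler can apply placement/pre-emption policy.
ResourceRequest req = ResourceRequest.newInstance(
    Priority.newInstance(1), ResourceRequest.ANY,
    Resource.newInstance(1024, 1), 1);
req.setLongLived(true);  // hypothetical flag, not in the current API
{code}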
[jira] [Updated] (YARN-3003) Provide API for client to retrieve label to node mapping
[ https://issues.apache.org/jira/browse/YARN-3003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Saxena updated YARN-3003:
-------------------------------
Attachment: (was: YARN-3003.001.patch)

Provide API for client to retrieve label to node mapping
---------------------------------------------------------
Key: YARN-3003
URL: https://issues.apache.org/jira/browse/YARN-3003
Project: Hadoop YARN
Issue Type: Sub-task
Components: client, resourcemanager
Reporter: Ted Yu
Assignee: Varun Saxena
Attachments: YARN-3003.001.patch

Currently, YarnClient#getNodeToLabels() returns the mapping from NodeId to the set of labels associated with the node. A client (such as Slider) may be interested in the label-to-node mapping: given a label, return the nodes with this label.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
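A sketch of how the inverse mapping might be consumed, mirroring the existing {{getNodeToLabels()}}. The {{getLabelsToNodes()}} signature is inferred from the issue description, not quoted from the attached patch, and "gpu" is an example label:

{code}
import java.util.Map;
import java.util.Set;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.NodeId;
import org.apache.hadoop.yarn.client.api.YarnClient;

public class LabelsToNodesExample {
  public static void main(String[] args) throws Exception {
    YarnClient client = YarnClient.createYarnClient();
    client.init(new Configuration());
    client.start();
    // Assumed new API: label -> nodes carrying that label.
    Map<String, Set<NodeId>> labelsToNodes = client.getLabelsToNodes();
    System.out.println(labelsToNodes.get("gpu"));
    client.stop();
  }
}
{code}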
[jira] [Updated] (YARN-3003) Provide API for client to retrieve label to node mapping
[ https://issues.apache.org/jira/browse/YARN-3003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Saxena updated YARN-3003:
-------------------------------
Attachment: YARN-3003.001.patch

Provide API for client to retrieve label to node mapping
---------------------------------------------------------
Key: YARN-3003
URL: https://issues.apache.org/jira/browse/YARN-3003
Project: Hadoop YARN
Issue Type: Sub-task
Components: client, resourcemanager
Reporter: Ted Yu
Assignee: Varun Saxena
Attachments: YARN-3003.001.patch

Currently, YarnClient#getNodeToLabels() returns the mapping from NodeId to the set of labels associated with the node. A client (such as Slider) may be interested in the label-to-node mapping: given a label, return the nodes with this label.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2933) Capacity Scheduler preemption policy should only consider capacity without labels temporarily
[ https://issues.apache.org/jira/browse/YARN-2933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283878#comment-14283878 ]

Hudson commented on YARN-2933:
------------------------------

SUCCESS: Integrated in Hadoop-Hdfs-trunk #2011 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2011/])
YARN-2933. Capacity Scheduler preemption policy should only consider capacity without labels temporarily. Contributed by Mayank Bansal (wangda: rev 0a2d3e717d9c42090a32ff177991a222a1e34132)
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/ProportionalCapacityPreemptionPolicy.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/TestProportionalCapacityPreemptionPolicy.java

Capacity Scheduler preemption policy should only consider capacity without labels temporarily
----------------------------------------------------------------------------------------------
Key: YARN-2933
URL: https://issues.apache.org/jira/browse/YARN-2933
Project: Hadoop YARN
Issue Type: Sub-task
Components: capacityscheduler
Reporter: Wangda Tan
Assignee: Mayank Bansal
Fix For: 2.7.0
Attachments: YARN-2933-1.patch, YARN-2933-2.patch, YARN-2933-3.patch, YARN-2933-4.patch, YARN-2933-5.patch, YARN-2933-6.patch, YARN-2933-7.patch, YARN-2933-8.patch, YARN-2933-9.patch

Currently, we have capacity enforcement on each queue for each label in CapacityScheduler, but we don't have a preemption policy to support that. YARN-2498 targets preemption that respects node labels, but we have some gaps in the code base; for example, queues/FiCaScheduler should be able to get usedResource/pendingResource, etc. by label. These items potentially require refactoring CS, which we need to spend some time thinking through carefully.

For now, what we can do immediately is calculate ideal_allocation and preempt containers only for resources on nodes without labels, to avoid regressions like the following: a cluster has some nodes with labels and some without; assume queueA isn't satisfied for resources without labels. Today, the preemption policy may preempt resources from labeled nodes for queueA, which is not correct.

Again, this is just a short-term enhancement; YARN-2498 will add preemption respecting node labels for the Capacity Scheduler, which is our final target.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3015) yarn classpath command should support same options as hadoop classpath.
[ https://issues.apache.org/jira/browse/YARN-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283880#comment-14283880 ]

Hudson commented on YARN-3015:
------------------------------

SUCCESS: Integrated in Hadoop-Hdfs-trunk #2011 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2011/])
YARN-3015. yarn classpath command should support same options as hadoop classpath. Contributed by Varun Saxena. (cnauroth: rev cb0a15d20180c7ca3799e63a2d53aa8dee800abd)
* hadoop-yarn-project/hadoop-yarn/bin/yarn
* hadoop-yarn-project/hadoop-yarn/bin/yarn.cmd
* hadoop-yarn-project/CHANGES.txt

yarn classpath command should support same options as hadoop classpath.
------------------------------------------------------------------------
Key: YARN-3015
URL: https://issues.apache.org/jira/browse/YARN-3015
Project: Hadoop YARN
Issue Type: Bug
Components: scripts
Reporter: Chris Nauroth
Assignee: Varun Saxena
Priority: Minor
Fix For: 2.7.0
Attachments: YARN-3015-branch-2.patch, YARN-3015.002.patch, YARN-3015.003.patch, YARN-3015.004.patch, YARN-3015.005.patch

HADOOP-10903 enhanced the {{hadoop classpath}} command to support optional expansion of the wildcards and bundling the classpath into a jar file containing a manifest with the Class-Path attribute. The other classpath commands should do the same for consistency.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3071) Remove invalid char from sample conf in doc of FairScheduler
[ https://issues.apache.org/jira/browse/YARN-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283879#comment-14283879 ]

Hudson commented on YARN-3071:
------------------------------

SUCCESS: Integrated in Hadoop-Hdfs-trunk #2011 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2011/])
YARN-3071. Remove invalid char from sample conf in doc of FairScheduler. (Contributed by Masatake Iwasaki) (aajisaka: rev 4a5c3a4cfee6b8008a722801821e64850582a985)
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/FairScheduler.apt.vm

Remove invalid char from sample conf in doc of FairScheduler
------------------------------------------------------------
Key: YARN-3071
URL: https://issues.apache.org/jira/browse/YARN-3071
Project: Hadoop YARN
Issue Type: Bug
Components: documentation
Affects Versions: 2.5.0, 2.6.0
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
Priority: Trivial
Fix For: 2.7.0
Attachments: YARN-3071.001.patch, YARN-3071.002.patch

Copying and pasting the sample conf causes a failure on RM startup:

{code}
Caused by: org.xml.sax.SAXParseException; systemId: file:/home/iwasakims/dist/hadoop-2.6.0/etc/hadoop/fair-scheduler.xml; lineNumber: 18; columnNumber: 5; The content of elements must consist of well-formed character data or markup.
        at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
        at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
        at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:205)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:250)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1275)
        ... 9 more
{code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3071) Remove invalid char from sample conf in doc of FairScheduler
[ https://issues.apache.org/jira/browse/YARN-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283866#comment-14283866 ]

Hudson commented on YARN-3071:
------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #76 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/76/])
YARN-3071. Remove invalid char from sample conf in doc of FairScheduler. (Contributed by Masatake Iwasaki) (aajisaka: rev 4a5c3a4cfee6b8008a722801821e64850582a985)
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/FairScheduler.apt.vm

Remove invalid char from sample conf in doc of FairScheduler
------------------------------------------------------------
Key: YARN-3071
URL: https://issues.apache.org/jira/browse/YARN-3071
Project: Hadoop YARN
Issue Type: Bug
Components: documentation
Affects Versions: 2.5.0, 2.6.0
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
Priority: Trivial
Fix For: 2.7.0
Attachments: YARN-3071.001.patch, YARN-3071.002.patch

Copying and pasting the sample conf causes a failure on RM startup:

{code}
Caused by: org.xml.sax.SAXParseException; systemId: file:/home/iwasakims/dist/hadoop-2.6.0/etc/hadoop/fair-scheduler.xml; lineNumber: 18; columnNumber: 5; The content of elements must consist of well-formed character data or markup.
        at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
        at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
        at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:205)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:250)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1275)
        ... 9 more
{code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3015) yarn classpath command should support same options as hadoop classpath.
[ https://issues.apache.org/jira/browse/YARN-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283867#comment-14283867 ]

Hudson commented on YARN-3015:
------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #76 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/76/])
YARN-3015. yarn classpath command should support same options as hadoop classpath. Contributed by Varun Saxena. (cnauroth: rev cb0a15d20180c7ca3799e63a2d53aa8dee800abd)
* hadoop-yarn-project/hadoop-yarn/bin/yarn.cmd
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/bin/yarn

yarn classpath command should support same options as hadoop classpath.
------------------------------------------------------------------------
Key: YARN-3015
URL: https://issues.apache.org/jira/browse/YARN-3015
Project: Hadoop YARN
Issue Type: Bug
Components: scripts
Reporter: Chris Nauroth
Assignee: Varun Saxena
Priority: Minor
Fix For: 2.7.0
Attachments: YARN-3015-branch-2.patch, YARN-3015.002.patch, YARN-3015.003.patch, YARN-3015.004.patch, YARN-3015.005.patch

HADOOP-10903 enhanced the {{hadoop classpath}} command to support optional expansion of the wildcards and bundling the classpath into a jar file containing a manifest with the Class-Path attribute. The other classpath commands should do the same for consistency.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2933) Capacity Scheduler preemption policy should only consider capacity without labels temporarily
[ https://issues.apache.org/jira/browse/YARN-2933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283865#comment-14283865 ]

Hudson commented on YARN-2933:
------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #76 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/76/])
YARN-2933. Capacity Scheduler preemption policy should only consider capacity without labels temporarily. Contributed by Mayank Bansal (wangda: rev 0a2d3e717d9c42090a32ff177991a222a1e34132)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/ProportionalCapacityPreemptionPolicy.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/TestProportionalCapacityPreemptionPolicy.java
* hadoop-yarn-project/CHANGES.txt

Capacity Scheduler preemption policy should only consider capacity without labels temporarily
----------------------------------------------------------------------------------------------
Key: YARN-2933
URL: https://issues.apache.org/jira/browse/YARN-2933
Project: Hadoop YARN
Issue Type: Sub-task
Components: capacityscheduler
Reporter: Wangda Tan
Assignee: Mayank Bansal
Fix For: 2.7.0
Attachments: YARN-2933-1.patch, YARN-2933-2.patch, YARN-2933-3.patch, YARN-2933-4.patch, YARN-2933-5.patch, YARN-2933-6.patch, YARN-2933-7.patch, YARN-2933-8.patch, YARN-2933-9.patch

Currently, we have capacity enforcement on each queue for each label in CapacityScheduler, but we don't have a preemption policy to support that. YARN-2498 targets preemption that respects node labels, but we have some gaps in the code base; for example, queues/FiCaScheduler should be able to get usedResource/pendingResource, etc. by label. These items potentially require refactoring CS, which we need to spend some time thinking through carefully.

For now, what we can do immediately is calculate ideal_allocation and preempt containers only for resources on nodes without labels, to avoid regressions like the following: a cluster has some nodes with labels and some without; assume queueA isn't satisfied for resources without labels. Today, the preemption policy may preempt resources from labeled nodes for queueA, which is not correct.

Again, this is just a short-term enhancement; YARN-2498 will add preemption respecting node labels for the Capacity Scheduler, which is our final target.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2933) Capacity Scheduler preemption policy should only consider capacity without labels temporarily
[ https://issues.apache.org/jira/browse/YARN-2933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283926#comment-14283926 ]

Hudson commented on YARN-2933:
------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2030 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2030/])
YARN-2933. Capacity Scheduler preemption policy should only consider capacity without labels temporarily. Contributed by Mayank Bansal (wangda: rev 0a2d3e717d9c42090a32ff177991a222a1e34132)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/TestProportionalCapacityPreemptionPolicy.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/ProportionalCapacityPreemptionPolicy.java
* hadoop-yarn-project/CHANGES.txt

Capacity Scheduler preemption policy should only consider capacity without labels temporarily
----------------------------------------------------------------------------------------------
Key: YARN-2933
URL: https://issues.apache.org/jira/browse/YARN-2933
Project: Hadoop YARN
Issue Type: Sub-task
Components: capacityscheduler
Reporter: Wangda Tan
Assignee: Mayank Bansal
Fix For: 2.7.0
Attachments: YARN-2933-1.patch, YARN-2933-2.patch, YARN-2933-3.patch, YARN-2933-4.patch, YARN-2933-5.patch, YARN-2933-6.patch, YARN-2933-7.patch, YARN-2933-8.patch, YARN-2933-9.patch

Currently, we have capacity enforcement on each queue for each label in CapacityScheduler, but we don't have a preemption policy to support that. YARN-2498 targets preemption that respects node labels, but we have some gaps in the code base; for example, queues/FiCaScheduler should be able to get usedResource/pendingResource, etc. by label. These items potentially require refactoring CS, which we need to spend some time thinking through carefully.

For now, what we can do immediately is calculate ideal_allocation and preempt containers only for resources on nodes without labels, to avoid regressions like the following: a cluster has some nodes with labels and some without; assume queueA isn't satisfied for resources without labels. Today, the preemption policy may preempt resources from labeled nodes for queueA, which is not correct.

Again, this is just a short-term enhancement; YARN-2498 will add preemption respecting node labels for the Capacity Scheduler, which is our final target.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3015) yarn classpath command should support same options as hadoop classpath.
[ https://issues.apache.org/jira/browse/YARN-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283927#comment-14283927 ]

Hudson commented on YARN-3015:
------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2030 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2030/])
YARN-3015. yarn classpath command should support same options as hadoop classpath. Contributed by Varun Saxena. (cnauroth: rev cb0a15d20180c7ca3799e63a2d53aa8dee800abd)
* hadoop-yarn-project/hadoop-yarn/bin/yarn
* hadoop-yarn-project/hadoop-yarn/bin/yarn.cmd
* hadoop-yarn-project/CHANGES.txt

yarn classpath command should support same options as hadoop classpath.
------------------------------------------------------------------------
Key: YARN-3015
URL: https://issues.apache.org/jira/browse/YARN-3015
Project: Hadoop YARN
Issue Type: Bug
Components: scripts
Reporter: Chris Nauroth
Assignee: Varun Saxena
Priority: Minor
Fix For: 2.7.0
Attachments: YARN-3015-branch-2.patch, YARN-3015.002.patch, YARN-3015.003.patch, YARN-3015.004.patch, YARN-3015.005.patch

HADOOP-10903 enhanced the {{hadoop classpath}} command to support optional expansion of the wildcards and bundling the classpath into a jar file containing a manifest with the Class-Path attribute. The other classpath commands should do the same for consistency.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3015) yarn classpath command should support same options as hadoop classpath.
[ https://issues.apache.org/jira/browse/YARN-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283983#comment-14283983 ]

Varun Saxena commented on YARN-3015:
------------------------------------

Thanks, [~cnauroth], for the review and commit.

yarn classpath command should support same options as hadoop classpath.
------------------------------------------------------------------------
Key: YARN-3015
URL: https://issues.apache.org/jira/browse/YARN-3015
Project: Hadoop YARN
Issue Type: Bug
Components: scripts
Reporter: Chris Nauroth
Assignee: Varun Saxena
Priority: Minor
Fix For: 2.7.0
Attachments: YARN-3015-branch-2.patch, YARN-3015.002.patch, YARN-3015.003.patch, YARN-3015.004.patch, YARN-3015.005.patch

HADOOP-10903 enhanced the {{hadoop classpath}} command to support optional expansion of the wildcards and bundling the classpath into a jar file containing a manifest with the Class-Path attribute. The other classpath commands should do the same for consistency.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2933) Capacity Scheduler preemption policy should only consider capacity without labels temporarily
[ https://issues.apache.org/jira/browse/YARN-2933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283903#comment-14283903 ]

Hudson commented on YARN-2933:
------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #80 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/80/])
YARN-2933. Capacity Scheduler preemption policy should only consider capacity without labels temporarily. Contributed by Mayank Bansal (wangda: rev 0a2d3e717d9c42090a32ff177991a222a1e34132)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/ProportionalCapacityPreemptionPolicy.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/TestProportionalCapacityPreemptionPolicy.java
* hadoop-yarn-project/CHANGES.txt

Capacity Scheduler preemption policy should only consider capacity without labels temporarily
----------------------------------------------------------------------------------------------
Key: YARN-2933
URL: https://issues.apache.org/jira/browse/YARN-2933
Project: Hadoop YARN
Issue Type: Sub-task
Components: capacityscheduler
Reporter: Wangda Tan
Assignee: Mayank Bansal
Fix For: 2.7.0
Attachments: YARN-2933-1.patch, YARN-2933-2.patch, YARN-2933-3.patch, YARN-2933-4.patch, YARN-2933-5.patch, YARN-2933-6.patch, YARN-2933-7.patch, YARN-2933-8.patch, YARN-2933-9.patch

Currently, we have capacity enforcement on each queue for each label in CapacityScheduler, but we don't have a preemption policy to support that. YARN-2498 targets preemption that respects node labels, but we have some gaps in the code base; for example, queues/FiCaScheduler should be able to get usedResource/pendingResource, etc. by label. These items potentially require refactoring CS, which we need to spend some time thinking through carefully.

For now, what we can do immediately is calculate ideal_allocation and preempt containers only for resources on nodes without labels, to avoid regressions like the following: a cluster has some nodes with labels and some without; assume queueA isn't satisfied for resources without labels. Today, the preemption policy may preempt resources from labeled nodes for queueA, which is not correct.

Again, this is just a short-term enhancement; YARN-2498 will add preemption respecting node labels for the Capacity Scheduler, which is our final target.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3003) Provide API for client to retrieve label to node mapping
[ https://issues.apache.org/jira/browse/YARN-3003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Saxena updated YARN-3003:
-------------------------------
Attachment: (was: YARN-3003.001.patch)

Provide API for client to retrieve label to node mapping
---------------------------------------------------------
Key: YARN-3003
URL: https://issues.apache.org/jira/browse/YARN-3003
Project: Hadoop YARN
Issue Type: Sub-task
Components: client, resourcemanager
Reporter: Ted Yu
Assignee: Varun Saxena

Currently, YarnClient#getNodeToLabels() returns the mapping from NodeId to the set of labels associated with the node. A client (such as Slider) may be interested in the label-to-node mapping: given a label, return the nodes with this label.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3003) Provide API for client to retrieve label to node mapping
[ https://issues.apache.org/jira/browse/YARN-3003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Saxena updated YARN-3003:
-------------------------------
Attachment: YARN-3003.001.patch

Provide API for client to retrieve label to node mapping
---------------------------------------------------------
Key: YARN-3003
URL: https://issues.apache.org/jira/browse/YARN-3003
Project: Hadoop YARN
Issue Type: Sub-task
Components: client, resourcemanager
Reporter: Ted Yu
Assignee: Varun Saxena
Attachments: YARN-3003.001.patch

Currently, YarnClient#getNodeToLabels() returns the mapping from NodeId to the set of labels associated with the node. A client (such as Slider) may be interested in the label-to-node mapping: given a label, return the nodes with this label.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3003) Provide API for client to retrieve label to node mapping
[ https://issues.apache.org/jira/browse/YARN-3003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284196#comment-14284196 ]

Hadoop QA commented on YARN-3003:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12693309/YARN-3003.001.patch
against trunk revision c94c0d2.

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 7 new or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

org.apache.hadoop.mapred.TestNetworkedJob
org.apache.hadoop.yarn.client.TestResourceTrackerOnHA
org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA
org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart
org.apache.hadoop.yarn.server.resourcemanager.nodelabels.TestRMNodeLabelsManager
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerNodeLabelUpdate

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6365//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6365//console

This message is automatically generated.

Provide API for client to retrieve label to node mapping
---------------------------------------------------------
Key: YARN-3003
URL: https://issues.apache.org/jira/browse/YARN-3003
Project: Hadoop YARN
Issue Type: Sub-task
Components: client, resourcemanager
Reporter: Ted Yu
Assignee: Varun Saxena
Attachments: YARN-3003.001.patch

Currently, YarnClient#getNodeToLabels() returns the mapping from NodeId to the set of labels associated with the node. A client (such as Slider) may be interested in the label-to-node mapping: given a label, return the nodes with this label.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3003) Provide API for client to retrieve label to node mapping
[ https://issues.apache.org/jira/browse/YARN-3003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284205#comment-14284205 ]

Ted Yu commented on YARN-3003:
------------------------------

For message LabelsToNodeIdProto, should it be named LabelsToNodeIdsProto, since the nodeId field is repeated?

Provide API for client to retrieve label to node mapping
---------------------------------------------------------
Key: YARN-3003
URL: https://issues.apache.org/jira/browse/YARN-3003
Project: Hadoop YARN
Issue Type: Sub-task
Components: client, resourcemanager
Reporter: Ted Yu
Assignee: Varun Saxena
Attachments: YARN-3003.001.patch

Currently, YarnClient#getNodeToLabels() returns the mapping from NodeId to the set of labels associated with the node. A client (such as Slider) may be interested in the label-to-node mapping: given a label, return the nodes with this label.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3030) set up ATS writer with basic request serving structure and lifecycle
[ https://issues.apache.org/jira/browse/YARN-3030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284209#comment-14284209 ]

Zhijie Shen commented on YARN-3030:
-----------------------------------

bq. If that is not feasible, I'd say run test-patch.sh by hand to ensure basic issues are caught

+1. Sounds like a good idea.

I took a look at the patch, and had some thoughts:

1. The aggregator may have two responsibilities (perhaps except the RM aggregator): (a) collecting the timeline data from the application, and (b) putting it into a scalable storage. I can see BaseAggregatorService is the abstraction for the latter piece. However, for the former piece we don't have such an abstraction; the implementation is embedded in the per-node aggregator. It's fine now, but once we move on to a per-app aggregator, we would need to copy-paste the same collecting logic. IMHO, we should have an abstraction for collecting the timeline data from the app too, and make a REST-based implementation now. In the future, we can even replace it with an RPC-based implementation. Per-node and per-app aggregators would be assembled from both collecting and aggregating services, while the RM aggregator only consists of the aggregating service, because it pulls its internal data only.

2. For the per-node aggregator, we may want to implement the {{AuxiliaryService}} interface, such that it can be installed in the NM as an auxiliary service. The interface provides similar lifecycle hooks to the ones we want, such as init app and stop app.

set up ATS writer with basic request serving structure and lifecycle
---------------------------------------------------------------------
Key: YARN-3030
URL: https://issues.apache.org/jira/browse/YARN-3030
Project: Hadoop YARN
Issue Type: Sub-task
Components: timelineserver
Reporter: Sangjin Lee
Assignee: Sangjin Lee
Attachments: YARN-3030.001.patch

Per design in YARN-2928, create an ATS writer as a service, and implement the basic service structure including the lifecycle management. Also, as part of this JIRA, we should come up with the ATS client API for sending requests to this ATS writer.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
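A minimal skeleton of the {{AuxiliaryService}} hooks being referred to. The class and service names are placeholders, but the overridden methods are the actual lifecycle callbacks an NM auxiliary service receives:

{code}
import java.nio.ByteBuffer;

import org.apache.hadoop.yarn.server.api.ApplicationInitializationContext;
import org.apache.hadoop.yarn.server.api.ApplicationTerminationContext;
import org.apache.hadoop.yarn.server.api.AuxiliaryService;

public class PerNodeAggregatorService extends AuxiliaryService {

  public PerNodeAggregatorService() {
    super("timeline_aggregator");  // placeholder service name
  }

  @Override
  public void initializeApplication(ApplicationInitializationContext context) {
    // Set up per-app aggregation state when the app starts on this node.
  }

  @Override
  public void stopApplication(ApplicationTerminationContext context) {
    // Flush and tear down per-app aggregation state.
  }

  @Override
  public ByteBuffer getMetaData() {
    return ByteBuffer.allocate(0);  // nothing to hand back to AMs
  }
}
{code}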
[jira] [Updated] (YARN-3003) Provide API for client to retrieve label to node mapping
[ https://issues.apache.org/jira/browse/YARN-3003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Saxena updated YARN-3003:
-------------------------------
Attachment: YARN-3003.001.patch

Provide API for client to retrieve label to node mapping
---------------------------------------------------------
Key: YARN-3003
URL: https://issues.apache.org/jira/browse/YARN-3003
Project: Hadoop YARN
Issue Type: Sub-task
Components: client, resourcemanager
Reporter: Ted Yu
Assignee: Varun Saxena
Attachments: YARN-3003.001.patch

Currently, YarnClient#getNodeToLabels() returns the mapping from NodeId to the set of labels associated with the node. A client (such as Slider) may be interested in the label-to-node mapping: given a label, return the nodes with this label.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3015) yarn classpath command should support same options as hadoop classpath.
[ https://issues.apache.org/jira/browse/YARN-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283904#comment-14283904 ]

Hudson commented on YARN-3015:
------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #80 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/80/])
YARN-3015. yarn classpath command should support same options as hadoop classpath. Contributed by Varun Saxena. (cnauroth: rev cb0a15d20180c7ca3799e63a2d53aa8dee800abd)
* hadoop-yarn-project/hadoop-yarn/bin/yarn
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/bin/yarn.cmd

yarn classpath command should support same options as hadoop classpath.
------------------------------------------------------------------------
Key: YARN-3015
URL: https://issues.apache.org/jira/browse/YARN-3015
Project: Hadoop YARN
Issue Type: Bug
Components: scripts
Reporter: Chris Nauroth
Assignee: Varun Saxena
Priority: Minor
Fix For: 2.7.0
Attachments: YARN-3015-branch-2.patch, YARN-3015.002.patch, YARN-3015.003.patch, YARN-3015.004.patch, YARN-3015.005.patch

HADOOP-10903 enhanced the {{hadoop classpath}} command to support optional expansion of the wildcards and bundling the classpath into a jar file containing a manifest with the Class-Path attribute. The other classpath commands should do the same for consistency.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1039) Add parameter for YARN resource requests to indicate long lived
[ https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284104#comment-14284104 ]

Carlo Curino commented on YARN-1039:
------------------------------------

I am happy the conversation is re-ignited. As I mentioned [above|https://issues.apache.org/jira/browse/YARN-1039?focusedCommentId=14048345&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14048345], the long-lived tag is a coarse-grained version of the notion of duration we added to the ReservationRequest (which tracks ResourceRequest very closely) in YARN-1051. The idea is that the AM could provide an estimate of the task duration, enabling (beyond what Steve already listed above) optimistic scheduling decisions like the one in YARN-2877 for very short tasks (we ran several experiments, and the potential for increased utilization is substantial). Given a duration parameter, expressing long-lived can be done by setting the duration to a large value (MAX_INT, -1, or whatever convention).

Add parameter for YARN resource requests to indicate long lived
----------------------------------------------------------------
Key: YARN-1039
URL: https://issues.apache.org/jira/browse/YARN-1039
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Affects Versions: 3.0.0, 2.1.1-beta
Reporter: Steve Loughran
Assignee: Craig Welch
Attachments: YARN-1039.1.patch, YARN-1039.2.patch, YARN-1039.3.patch

A container request could support a new parameter, long-lived. This could be used by a scheduler that would know not to host the service on a transient (cloud: spot-priced) node. Schedulers could also decide whether or not to allocate multiple long-lived containers on the same node.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
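The duration notion Carlo mentions is already concrete in the YARN-1051 reservation API; under the convention he suggests, "long lived" would simply be a very large duration. A fragment-level sketch:

{code}
import org.apache.hadoop.yarn.api.records.ReservationRequest;
import org.apache.hadoop.yarn.api.records.Resource;

// One 1GB/1-core container with an effectively unbounded duration (ms)
// standing in for "long lived" under the suggested convention.
ReservationRequest rr = ReservationRequest.newInstance(
    Resource.newInstance(1024, 1),  // capability per container
    1,                              // number of containers
    1,                              // concurrency
    Long.MAX_VALUE);                // duration estimate
{code}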
[jira] [Commented] (YARN-1021) Yarn Scheduler Load Simulator
[ https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284099#comment-14284099 ]

Wei Yan commented on YARN-1021:
-------------------------------

Thanks for the reply, [~kasha].

[~fcolzada], sorry for the late reply. I just checked hadoop-2.6.0 using the sample-conf and sample-data, and it works fine. The first exception occurs because the web module is not loaded yet; just wait 2-3 seconds after you start the simulator. This exception does not affect the simulator's operation. For the second one, could you send your config and workload files to me? I can look into it.

[~kasha], could you help review YARN-1393, which provides a quick-start tutorial?

Yarn Scheduler Load Simulator
-----------------------------
Key: YARN-1021
URL: https://issues.apache.org/jira/browse/YARN-1021
Project: Hadoop YARN
Issue Type: New Feature
Components: scheduler
Reporter: Wei Yan
Assignee: Wei Yan
Fix For: 2.3.0
Attachments: YARN-1021-demo.tar.gz, YARN-1021-images.tar.gz, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.pdf

The Yarn Scheduler is a fertile area of interest with different implementations, e.g., the Fifo, Capacity, and Fair schedulers. Meanwhile, several optimizations have been made to improve scheduler performance for different scenarios and workloads. Each scheduler algorithm has its own set of features and drives scheduling decisions by many factors, such as fairness, capacity guarantees, resource availability, etc. It is very important to evaluate a scheduler algorithm well before deploying it in a production cluster. Unfortunately, it is currently non-trivial to evaluate a scheduling algorithm. Evaluating in a real cluster is always time- and cost-consuming, and it is also very hard to find a large-enough cluster. Hence, a simulator which can predict how well a scheduler algorithm works for a specific workload would be quite useful.

We want to build a Scheduler Load Simulator to simulate large-scale Yarn clusters and application loads on a single machine. This would be invaluable in furthering Yarn by providing a tool for researchers and developers to prototype new scheduler features and predict their behavior and performance with a reasonable amount of confidence, thereby aiding rapid innovation.

The simulator will exercise the real Yarn ResourceManager, removing the network factor by simulating NodeManagers and ApplicationMasters via handling and dispatching NM/AM heartbeat events from within the same JVM. To keep track of scheduler behavior and performance, a scheduler wrapper will wrap the real scheduler.

The simulator will produce real-time metrics while executing, including:

* Resource usage for the whole cluster and each queue, which can be utilized to configure the cluster and each queue's capacity.
* The detailed application execution trace (recorded in relation to simulated time), which can be analyzed to understand/validate scheduler behavior (individual jobs' turnaround time, throughput, fairness, capacity guarantees, etc.).
* Several key metrics of the scheduler algorithm, such as the time cost of each scheduler operation (allocate, handle, etc.), which can be utilized by Hadoop developers to find code hot spots and scalability limits.

The simulator will provide real-time charts showing the behavior of the scheduler and its performance.

A short demo is available at http://www.youtube.com/watch?v=6thLi8q0qLE, showing how to use the simulator to simulate the Fair Scheduler and the Capacity Scheduler.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3047) set up ATS reader with basic request serving structure and lifecycle
[ https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284130#comment-14284130 ]

Varun Saxena commented on YARN-3047:
------------------------------------

[~sjlee0], I wanted to know whether we will have a single instance of the ATS reader that clients connect to for querying (with that instance in turn launching multiple threads to query storage in parallel), or a global instance which distributes requests to multiple instances of ATS readers?

set up ATS reader with basic request serving structure and lifecycle
---------------------------------------------------------------------
Key: YARN-3047
URL: https://issues.apache.org/jira/browse/YARN-3047
Project: Hadoop YARN
Issue Type: Sub-task
Components: timelineserver
Reporter: Sangjin Lee
Assignee: Varun Saxena

Per design in YARN-2938, set up the ATS reader as a service and implement the basic structure as a service. It includes lifecycle management, request serving, and so on.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1039) Add parameter for YARN resource requests to indicate long lived
[ https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284325#comment-14284325 ]

Craig Welch commented on YARN-1039:
-----------------------------------

Another thought: if we do need this kind of flag, I think we should detach the notion from duration or long life as such. It's more about service vs. batch. A service's duration is not necessarily related to any preset notion of a work item it will start, work on, and complete; it is started to handle work given to it, of unknown quantity (potentially many different items), and stopped when no longer needed. It's not so much about the duration as the lifecycle (a batch operation may have a longer runtime than a service, for example). So I'd suggest dropping the temporal flavor and going with service vs. batch, or something along those lines.

Add parameter for YARN resource requests to indicate long lived
----------------------------------------------------------------
Key: YARN-1039
URL: https://issues.apache.org/jira/browse/YARN-1039
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Affects Versions: 3.0.0, 2.1.1-beta
Reporter: Steve Loughran
Assignee: Craig Welch
Attachments: YARN-1039.1.patch, YARN-1039.2.patch, YARN-1039.3.patch

A container request could support a new parameter, long-lived. This could be used by a scheduler that would know not to host the service on a transient (cloud: spot-priced) node. Schedulers could also decide whether or not to allocate multiple long-lived containers on the same node.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-3074) Nodemanager dies when localizer runner tries to write to a full disk
[ https://issues.apache.org/jira/browse/YARN-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Saxena reassigned YARN-3074:
----------------------------------
Assignee: Varun Saxena

Nodemanager dies when localizer runner tries to write to a full disk
---------------------------------------------------------------------
Key: YARN-3074
URL: https://issues.apache.org/jira/browse/YARN-3074
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Affects Versions: 2.5.0
Reporter: Jason Lowe
Assignee: Varun Saxena

When a LocalizerRunner tries to write to a full disk, it can bring down the NodeManager process. Instead of failing the whole process, we should fail only the container and make a best attempt to keep going.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
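The spirit of the proposed fix, as a sketch: catch the disk-write failure in the LocalizerRunner and fail only that container's localization. The helper name and the event usage below are assumptions for illustration, not the eventual patch:

{code}
// Sketch only (inside a hypothetical LocalizerRunner.run()):
try {
  writeLocalizerCredentials(tokenDest);  // hypothetical disk write that can
                                         // fail with ENOSPC on a full disk
} catch (IOException e) {
  // Fail this container's localization instead of letting the exception
  // propagate and kill the NodeManager JVM.
  LOG.error("Localization failed for " + containerId, e);
  dispatcher.getEventHandler().handle(
      new ContainerResourceFailedEvent(containerId, null, e.getMessage()));
}
{code}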
[jira] [Commented] (YARN-2731) Fixed RegisterApplicationMasterResponsePBImpl to properly invoke maybeInitBuilder
[ https://issues.apache.org/jira/browse/YARN-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284579#comment-14284579 ]

Hudson commented on YARN-2731:
------------------------------

FAILURE: Integrated in Hadoop-trunk-Commit #6895 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6895/])
YARN-2731. Fixed RegisterApplicationMasterResponsePBImpl to properly invoke maybeInitBuilder. (Contributed by Carlo Curino) (wangda: rev f250ad1773b19713d6aea81ae290ebb4c90fd44b)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/RegisterApplicationMasterResponsePBImpl.java
* hadoop-yarn-project/CHANGES.txt

Fixed RegisterApplicationMasterResponsePBImpl to properly invoke maybeInitBuilder
----------------------------------------------------------------------------------
Key: YARN-2731
URL: https://issues.apache.org/jira/browse/YARN-2731
Project: Hadoop YARN
Issue Type: Bug
Reporter: Carlo Curino
Assignee: Carlo Curino
Fix For: 2.7.0
Attachments: YARN-2731.patch

If I am not mistaken, in RegisterApplicationMasterResponsePBImpl we fail to initialize the builder in setNMTokensFromPreviousAttempts(), and we initialize the builder in the wrong place in setClientToAMTokenMasterKey().

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
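For readers outside the PB code, this is the canonical PBImpl guard the fix is about, condensed into a fragment (surrounding class, imports, and field handling omitted): every setter must run it before mutating state, or the update can land on a stale builder.

{code}
private void maybeInitBuilder() {
  // Re-seed the builder from the current proto if we are still backed by
  // the immutable proto (viaProto) or no builder exists yet. A setter that
  // skips this guard can silently lose its update.
  if (viaProto || builder == null) {
    builder = RegisterApplicationMasterResponseProto.newBuilder(proto);
  }
  viaProto = false;
}

public void setNMTokensFromPreviousAttempts(List<NMToken> nmTokens) {
  maybeInitBuilder();  // the call this fix adds before any mutation
  this.nmTokens = nmTokens;
}
{code}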
[jira] [Commented] (YARN-2896) Server side PB changes for Priority Label Manager and Admin CLI support
[ https://issues.apache.org/jira/browse/YARN-2896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284431#comment-14284431 ] Eric Payne commented on YARN-2896: -- Thanks, [~sunilg], for working on this feature and posting this patch to support the PB framework. I have a general question about why job priorities need labels. Why can't they just be number based? It seems like extra work to label them, pass the labels, and then interpret them. {{PriorityLabelsPerQueue.java}}:
{code}
public String toString() {
  return "Max priority label: " + this.getMaxPriorityLabel() + " ,"
      + "Default priority label: " + this.getDefaultPriorityLabel();
}
{code}
This is just a nit, but in {{PriorityLabelsPerQueue#toString}}, the space should be after the comma. Currently, this will output something like this:
{code}
Max priority label: foo ,Default priority label: bar
{code}
{code}
public int compareTo(PriorityLabelsPerQueue priorityLabelsPerQueue) {
  int defltLabelCompare = this.getDefaultPriorityLabel().compareTo(
      priorityLabelsPerQueue.getDefaultPriorityLabel());
  if (defltLabelCompare == 0) {
    return this.getMaxPriorityLabel().compareTo(
        priorityLabelsPerQueue.getMaxPriorityLabel());
  } else {
    return defltLabelCompare;
  }
}
{code}
{{PriorityLabelsPerQueue#compareTo}} should probably check for null for {{priorityLabelsPerQueue}}, {{this.getDefaultPriorityLabel()}}, and {{this.getMaxPriorityLabel()}}. If {{priorityLabelsPerQueue}} is null, {{this.getDefaultPriorityLabel()}} returns null, or {{this.getMaxPriorityLabel()}} returns null, {{compareTo}} will throw an NPE. {{yarn_protos.proto}}:
{code}
message ApplicationPriorityProto {
  optional string app_priority = 1;
}
{code}
Where is this referenced? Server side PB changes for Priority Label Manager and Admin CLI support --- Key: YARN-2896 URL: https://issues.apache.org/jira/browse/YARN-2896 Project: Hadoop YARN Issue Type: Sub-task Components: api, resourcemanager Reporter: Sunil G Assignee: Sunil G Attachments: 0001-YARN-2896.patch, 0002-YARN-2896.patch, 0003-YARN-2896.patch, 0004-YARN-2896.patch Common changes: * PB support changes required for Admin APIs * PB support for File System store (Priority Label Store) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
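As an illustration of the null-safety concern, a defensive variant of {{compareTo}} might look like the sketch below (the null-ordering choice is invented for illustration, not something the patch prescribes):
{code}
public int compareTo(PriorityLabelsPerQueue other) {
  if (other == null) {
    throw new NullPointerException("priorityLabelsPerQueue must not be null");
  }
  int defaultCompare = compareNullSafe(
      this.getDefaultPriorityLabel(), other.getDefaultPriorityLabel());
  if (defaultCompare != 0) {
    return defaultCompare;
  }
  return compareNullSafe(
      this.getMaxPriorityLabel(), other.getMaxPriorityLabel());
}

// Order null labels before non-null ones instead of throwing an NPE.
private static int compareNullSafe(String a, String b) {
  if (a == null && b == null) {
    return 0;
  }
  if (a == null) {
    return -1;
  }
  if (b == null) {
    return 1;
  }
  return a.compareTo(b);
}
{code}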
[jira] [Commented] (YARN-3020) n similar addContainerRequest()s produce n*(n+1)/2 containers
[ https://issues.apache.org/jira/browse/YARN-3020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284451#comment-14284451 ] Wangda Tan commented on YARN-3020: -- [~peterdkirchner], the expected usage of AMRMClient is as follows (thanks for input from [~hitesh] and [~jianhe]): when you receive newly allocated containers from the RM, you should manually call {{removeContainerRequest}} to remove the corresponding pending container requests. AMRMClient itself will not automatically deduct #pendingContainerRequests. The reason is that when a container is allocated by the RM, AMRMClient doesn't know which ResourceRequest the container was allocated for. You may think that since a container has a priority, capability and resourceName, AMRMClient could look up the ResourceRequest via {{getMatchingRequests}}, but some applications may use the container for another purpose (AMRMClient cannot understand an application's specific logic), so the AM should call {{removeContainerRequest}} itself. To improve this, I think 1) we need to add this behavior to the YARN docs -- people should better understand how to use AMRMClient; and 2) maybe we should add a default implementation that automatically deducts pending resource requests by priority/resource-name/capability of allocated containers (users could disable this default behavior and implement their own logic to deduct pending resource requests). Does this make sense to you? Thanks, Wangda n similar addContainerRequest()s produce n*(n+1)/2 containers - Key: YARN-3020 URL: https://issues.apache.org/jira/browse/YARN-3020 Project: Hadoop YARN Issue Type: Bug Components: client Affects Versions: 2.5.0, 2.6.0, 2.5.1, 2.5.2 Reporter: Peter D Kirchner Original Estimate: 24h Remaining Estimate: 24h BUG: If the application master calls addContainerRequest() n times, but with the same priority, I get up to 1+2+3+...+n containers = n*(n+1)/2 . The most containers are requested when the interval between calls to addContainerRequest() exceeds the heartbeat interval of calls to allocate() (in AMRMClientImpl's run() method). If the application master calls addContainerRequest() n times, but with a unique priority each time, I get n containers (as I intended). Analysis: There is a logic problem in AMRMClientImpl.java. Although AMRMClientImpl.java, allocate() does an ask.clear() , on subsequent calls to addContainerRequest(), addResourceRequest() finds the previous matching remoteRequest and increments the container count rather than starting anew, and does an addResourceRequestToAsk() which defeats the ask.clear(). From documentation and code comments, it was hard for me to discern the intended behavior of the API, but the inconsistency reported in this issue suggests one case or the other is implemented incorrectly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
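To make the usage Wangda describes concrete, here is a minimal sketch of what an AM might do after each allocate() round, using {{getMatchingRequests}} and {{removeContainerRequest}} (the matching rule shown is an assumption; applications that requested containers with the ANY resource name should match on {{ResourceRequest.ANY}} rather than the container's host):
{code}
import java.util.Collection;
import java.util.List;
import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

// AMRMClient does not deduct pending requests on its own, so remove
// one matching request per newly allocated container.
void deductPendingRequests(AMRMClient<ContainerRequest> amClient,
    List<Container> allocated) {
  for (Container container : allocated) {
    List<? extends Collection<ContainerRequest>> matches =
        amClient.getMatchingRequests(container.getPriority(),
            container.getNodeId().getHost(), container.getResource());
    if (!matches.isEmpty() && !matches.get(0).isEmpty()) {
      amClient.removeContainerRequest(matches.get(0).iterator().next());
    }
  }
}
{code}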
[jira] [Commented] (YARN-2731) RegisterApplicationMasterResponsePBImpl: not properly initialized builder
[ https://issues.apache.org/jira/browse/YARN-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284491#comment-14284491 ] Wangda Tan commented on YARN-2731: -- Good find, [~curino]! LGTM, will commit once Jenkins gets back. RegisterApplicationMasterResponsePBImpl: not properly initialized builder - Key: YARN-2731 URL: https://issues.apache.org/jira/browse/YARN-2731 Project: Hadoop YARN Issue Type: Bug Reporter: Carlo Curino Assignee: Carlo Curino Fix For: 2.7.0 Attachments: YARN-2731.patch If I am not mistaken in RegisterApplicationMasterResponsePBImpl we are missing to initialize the builder in setNMTokensFromPreviousAttempts(), and we initialize the builder in the wrong place in: setClientToAMTokenMasterKey -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3076) YarnClient implementation to retrieve label to node mapping
[ https://issues.apache.org/jira/browse/YARN-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-3076: --- Summary: YarnClient implementation to retrieve label to node mapping (was: YarnClient related changes to retrieve label to node mapping) YarnClient implementation to retrieve label to node mapping --- Key: YARN-3076 URL: https://issues.apache.org/jira/browse/YARN-3076 Project: Hadoop YARN Issue Type: Sub-task Components: client Affects Versions: 2.7.0 Reporter: Varun Saxena Assignee: Varun Saxena -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3076) Client side implementation to retrieve label to node mapping
[ https://issues.apache.org/jira/browse/YARN-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284599#comment-14284599 ] Wangda Tan commented on YARN-3076: -- Converted this and YARN-3075 to sub-tasks of YARN-2492. I also suggest renaming their titles: they're not simply client side/server side; they should refer to YarnClient and NodeLabelsManager, since the YarnClient change itself spans both client and server (ClientRMService). Client side implementation to retrieve label to node mapping Key: YARN-3076 URL: https://issues.apache.org/jira/browse/YARN-3076 Project: Hadoop YARN Issue Type: Sub-task Components: client Affects Versions: 2.7.0 Reporter: Varun Saxena Assignee: Varun Saxena -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1039) Add parameter for YARN resource requests to indicate long lived
[ https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284446#comment-14284446 ] Jian Fang commented on YARN-1039: - The duration concept comes with good intentions, but what I am really afraid of is that it could introduce enormous complexity to YARN if it is not designed properly. First, there are many moving parts under the hood for the estimation; for example, the timing on a 30 node cluster may be significantly different from that on a 300 node cluster. Getting into the measurement and estimation business is very much like walking into the benchmark comparison business, which is very hard in reality. Secondly, the duration probably relies on hadoop customers to provide a proper value if YARN is not smart enough to derive the value by itself, which could be impractical for many customers. Remember that many hadoop users are not even developers; many of them rely on high level components such as pig and hive to run hadoop jobs, and they probably don't know or care about the estimation. As a result, at the very least, the duration should only be an enhancement when a value is provided; YARN should still work properly without such a value. Add parameter for YARN resource requests to indicate long lived - Key: YARN-1039 URL: https://issues.apache.org/jira/browse/YARN-1039 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 3.0.0, 2.1.1-beta Reporter: Steve Loughran Assignee: Craig Welch Attachments: YARN-1039.1.patch, YARN-1039.2.patch, YARN-1039.3.patch A container request could support a new parameter long-lived. This could be used by a scheduler that would know not to host the service on a transient (cloud: spot priced) node. Schedulers could also decide whether or not to allocate multiple long-lived containers on the same node -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3076) YarnClient related changes to retrieve label to node mapping
[ https://issues.apache.org/jira/browse/YARN-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-3076: --- Summary: YarnClient related changes to retrieve label to node mapping (was: Client side implementation to retrieve label to node mapping) YarnClient related changes to retrieve label to node mapping Key: YARN-3076 URL: https://issues.apache.org/jira/browse/YARN-3076 Project: Hadoop YARN Issue Type: Sub-task Components: client Affects Versions: 2.7.0 Reporter: Varun Saxena Assignee: Varun Saxena -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2466) Umbrella issue for Yarn launched Docker Containers
[ https://issues.apache.org/jira/browse/YARN-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284637#comment-14284637 ] Max commented on YARN-2466: --- It would be helpful, but it should be switchable. It should be possible to activate it for debugging; turning it off will increase security for production systems. Umbrella issue for Yarn launched Docker Containers -- Key: YARN-2466 URL: https://issues.apache.org/jira/browse/YARN-2466 Project: Hadoop YARN Issue Type: New Feature Affects Versions: 2.4.1 Reporter: Abin Shahab Assignee: Abin Shahab Docker (https://www.docker.io/) is, increasingly, a very popular container technology. In context of YARN, the support for Docker will provide a very elegant solution to allow applications to package their software into a Docker container (entire Linux file system incl. custom versions of perl, python etc.) and use it as a blueprint to launch all their YARN containers with requisite software environment. This provides both consistency (all YARN containers will have the same software environment) and isolation (no interference with whatever is installed on the physical machine). In addition to software isolation mentioned above, Docker containers will provide resource, network, and user-namespace isolation. Docker provides resource isolation through cgroups, similar to LinuxContainerExecutor. This prevents one job from taking other jobs resource(memory and CPU) on the same hadoop cluster. User-namespace isolation will ensure that the root on the container is mapped an unprivileged user on the host. This is currently being added to Docker. Network isolation will ensure that one user’s network traffic is completely isolated from another user’s network traffic. Last but not the least, the interaction of Docker and Kerberos will have to be worked out. These Docker containers must work in a secure hadoop environment. Additional details are here: https://wiki.apache.org/hadoop/dineshs/IsolatingYarnAppsInDockerContainers -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3074) Nodemanager dies when localizer runner tries to write to a full disk
[ https://issues.apache.org/jira/browse/YARN-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284587#comment-14284587 ] Chris Douglas commented on YARN-3074: - bq. catch FSError since it will be a common and recoverable error in this case. +1 Nodemanager dies when localizer runner tries to write to a full disk Key: YARN-3074 URL: https://issues.apache.org/jira/browse/YARN-3074 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.5.0 Reporter: Jason Lowe Assignee: Varun Saxena When a LocalizerRunner tries to write to a full disk it can bring down the nodemanager process. Instead of failing the whole process we should fail only the container and make a best attempt to keep going. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2466) Umbrella issue for Yarn launched Docker Containers
[ https://issues.apache.org/jira/browse/YARN-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284598#comment-14284598 ] Chen He commented on YARN-2466: --- Hi [~ashahab], do we need to add a configuration parameter that can enable the container in interactive mode, such as yarn.docker.interactive? Then a user could attach to the running container for debugging purposes. Umbrella issue for Yarn launched Docker Containers -- Key: YARN-2466 URL: https://issues.apache.org/jira/browse/YARN-2466 Project: Hadoop YARN Issue Type: New Feature Affects Versions: 2.4.1 Reporter: Abin Shahab Assignee: Abin Shahab Docker (https://www.docker.io/) is, increasingly, a very popular container technology. In context of YARN, the support for Docker will provide a very elegant solution to allow applications to package their software into a Docker container (entire Linux file system incl. custom versions of perl, python etc.) and use it as a blueprint to launch all their YARN containers with requisite software environment. This provides both consistency (all YARN containers will have the same software environment) and isolation (no interference with whatever is installed on the physical machine). In addition to software isolation mentioned above, Docker containers will provide resource, network, and user-namespace isolation. Docker provides resource isolation through cgroups, similar to LinuxContainerExecutor. This prevents one job from taking other jobs resource(memory and CPU) on the same hadoop cluster. User-namespace isolation will ensure that the root on the container is mapped an unprivileged user on the host. This is currently being added to Docker. Network isolation will ensure that one user’s network traffic is completely isolated from another user’s network traffic. Last but not the least, the interaction of Docker and Kerberos will have to be worked out. These Docker containers must work in a secure hadoop environment. Additional details are here: https://wiki.apache.org/hadoop/dineshs/IsolatingYarnAppsInDockerContainers -- This message was sent by Atlassian JIRA (v6.3.4#6332)
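If such a switch were added, reading it would presumably look like the sketch below (the property name is just the one proposed in the comment; it does not exist in YARN today, and the default is off so production clusters stay locked down):
{code}
import org.apache.hadoop.conf.Configuration;

// Hypothetical flag from the discussion above: allow attaching to a
// running Docker container only when explicitly enabled for debugging.
boolean interactiveModeAllowed(Configuration conf) {
  return conf.getBoolean("yarn.docker.interactive", false);
}
{code}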
[jira] [Commented] (YARN-2466) Umbrella issue for Yarn launched Docker Containers
[ https://issues.apache.org/jira/browse/YARN-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284898#comment-14284898 ] Chen He commented on YARN-2466: --- Yes, that is what I mean. Thank you for clarifying it, [~mikhmv]. Umbrella issue for Yarn launched Docker Containers -- Key: YARN-2466 URL: https://issues.apache.org/jira/browse/YARN-2466 Project: Hadoop YARN Issue Type: New Feature Affects Versions: 2.4.1 Reporter: Abin Shahab Assignee: Abin Shahab Docker (https://www.docker.io/) is, increasingly, a very popular container technology. In context of YARN, the support for Docker will provide a very elegant solution to allow applications to package their software into a Docker container (entire Linux file system incl. custom versions of perl, python etc.) and use it as a blueprint to launch all their YARN containers with requisite software environment. This provides both consistency (all YARN containers will have the same software environment) and isolation (no interference with whatever is installed on the physical machine). In addition to software isolation mentioned above, Docker containers will provide resource, network, and user-namespace isolation. Docker provides resource isolation through cgroups, similar to LinuxContainerExecutor. This prevents one job from taking other jobs resource(memory and CPU) on the same hadoop cluster. User-namespace isolation will ensure that the root on the container is mapped an unprivileged user on the host. This is currently being added to Docker. Network isolation will ensure that one user’s network traffic is completely isolated from another user’s network traffic. Last but not the least, the interaction of Docker and Kerberos will have to be worked out. These Docker containers must work in a secure hadoop environment. Additional details are here: https://wiki.apache.org/hadoop/dineshs/IsolatingYarnAppsInDockerContainers -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1039) Add parameter for YARN resource requests to indicate long lived
[ https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284920#comment-14284920 ] Konstantinos Karanasos commented on YARN-1039: -- Let me add my thoughts regarding whether we should allow a duration to be reported instead of just a boolean switch for short tasks. I am actively involved in adding distributed scheduling capabilities ([YARN-2877]). We have performed an extensive experimental evaluation that has shown significant performance improvements in terms of throughput and latency, especially where short tasks are concerned. In that scenario, having the ability to specify the duration of the task is crucial (for deciding what type of container to use [[YARN-2882]], for estimating the waiting time in the NMs [[YARN-2886]], etc.). I understand the concerns that have been raised about how to provide the right task duration. However, this can be done either based on historical information (previous waves of this task type or previous executions of the same job) or on application level knowledge. We are already experimenting with ways to deal with imprecise task durations. That said, I definitely agree with [~john.jian.fang] that the user should not *have to* provide any task duration (i.e., the system should work properly in case no durations are provided), but on the other hand, in case she does, we should be able to take advantage of it. Moreover, as [~curino] pointed out, if the API exposes an integer instead of a boolean, we can simulate the boolean switch (e.g., by setting the value to MAX_INT for long tasks), but if we simply use a boolean, we would have to change the API in the future to support duration. Add parameter for YARN resource requests to indicate long lived - Key: YARN-1039 URL: https://issues.apache.org/jira/browse/YARN-1039 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 3.0.0, 2.1.1-beta Reporter: Steve Loughran Assignee: Craig Welch Attachments: YARN-1039.1.patch, YARN-1039.2.patch, YARN-1039.3.patch A container request could support a new parameter long-lived. This could be used by a scheduler that would know not to host the service on a transient (cloud: spot priced) node. Schedulers could also decide whether or not to allocate multiple long-lived containers on the same node -- This message was sent by Atlassian JIRA (v6.3.4#6332)
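To make the encoding argument concrete, a hypothetical sketch (every name here is invented for illustration; nothing like this exists in the YARN API today):
{code}
// Encode "long-lived" as a sentinel duration: the boolean switch falls
// out of the integer encoding for free, while the reverse (boolean to
// duration) would require an API change later.
class DurationHint {
  static final long LONG_LIVED = Long.MAX_VALUE; // no preset duration

  private final long expectedDurationSeconds;

  DurationHint(long expectedDurationSeconds) {
    this.expectedDurationSeconds = expectedDurationSeconds;
  }

  boolean isLongLived() {
    return expectedDurationSeconds == LONG_LIVED;
  }
}
{code}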
[jira] [Commented] (YARN-2731) RegisterApplicationMasterResponsePBImpl: not properly initialized builder
[ https://issues.apache.org/jira/browse/YARN-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284553#comment-14284553 ] Hadoop QA commented on YARN-2731: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12676492/YARN-2731.patch against trunk revision dd0228b. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6368//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6368//console This message is automatically generated. RegisterApplicationMasterResponsePBImpl: not properly initialized builder - Key: YARN-2731 URL: https://issues.apache.org/jira/browse/YARN-2731 Project: Hadoop YARN Issue Type: Bug Reporter: Carlo Curino Assignee: Carlo Curino Fix For: 2.7.0 Attachments: YARN-2731.patch If I am not mistaken in RegisterApplicationMasterResponsePBImpl we are missing to initialize the builder in setNMTokensFromPreviousAttempts(), and we initialize the builder in the wrong place in: setClientToAMTokenMasterKey -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2731) Fixed RegisterApplicationMasterResponsePBImpl to properly invoke maybeInitBuilder
[ https://issues.apache.org/jira/browse/YARN-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284575#comment-14284575 ] Wangda Tan commented on YARN-2731: -- Committed to trunk and branch-2, thanks Carlo! Fixed RegisterApplicationMasterResponsePBImpl to properly invoke maybeInitBuilder - Key: YARN-2731 URL: https://issues.apache.org/jira/browse/YARN-2731 Project: Hadoop YARN Issue Type: Bug Reporter: Carlo Curino Assignee: Carlo Curino Fix For: 2.7.0 Attachments: YARN-2731.patch If I am not mistaken in RegisterApplicationMasterResponsePBImpl we are missing to initialize the builder in setNMTokensFromPreviousAttempts(), and we initialize the builder in the wrong place in: setClientToAMTokenMasterKey -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2731) Fixed RegisterApplicationMasterResponsePBImpl to properly invoke maybeInitBuilder
[ https://issues.apache.org/jira/browse/YARN-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-2731: - Summary: Fixed RegisterApplicationMasterResponsePBImpl to properly invoke maybeInitBuilder (was: RegisterApplicationMasterResponsePBImpl: not properly initialized builder) Fixed RegisterApplicationMasterResponsePBImpl to properly invoke maybeInitBuilder - Key: YARN-2731 URL: https://issues.apache.org/jira/browse/YARN-2731 Project: Hadoop YARN Issue Type: Bug Reporter: Carlo Curino Assignee: Carlo Curino Fix For: 2.7.0 Attachments: YARN-2731.patch If I am not mistaken in RegisterApplicationMasterResponsePBImpl we are missing to initialize the builder in setNMTokensFromPreviousAttempts(), and we initialize the builder in the wrong place in: setClientToAMTokenMasterKey -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3076) Client side implementation to retrieve label to node mapping
Varun Saxena created YARN-3076: -- Summary: Client side implementation to retrieve label to node mapping Key: YARN-3076 URL: https://issues.apache.org/jira/browse/YARN-3076 Project: Hadoop YARN Issue Type: Task Components: client Affects Versions: 2.7.0 Reporter: Varun Saxena Assignee: Varun Saxena -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3030) set up ATS writer with basic request serving structure and lifecycle
[ https://issues.apache.org/jira/browse/YARN-3030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated YARN-3030: -- Attachment: YARN-3030.002.patch Posted patch v.2. Basically switched PerNodeAggregator to AuxiliaryService. Ran the test-patch script: {color:green}+1 overall{color}. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version ) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. set up ATS writer with basic request serving structure and lifecycle Key: YARN-3030 URL: https://issues.apache.org/jira/browse/YARN-3030 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Sangjin Lee Attachments: YARN-3030.001.patch, YARN-3030.002.patch Per design in YARN-2928, create an ATS writer as a service, and implement the basic service structure including the lifecycle management. Also, as part of this JIRA, we should come up with the ATS client API for sending requests to this ATS writer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
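For readers unfamiliar with the mechanism: switching the per-node aggregator to an auxiliary service means extending the NM's {{AuxiliaryService}} hook, roughly as in this bare-bones sketch (class and service names are placeholders, not the patch's actual code):
{code}
import java.nio.ByteBuffer;
import org.apache.hadoop.yarn.server.api.ApplicationInitializationContext;
import org.apache.hadoop.yarn.server.api.ApplicationTerminationContext;
import org.apache.hadoop.yarn.server.api.AuxiliaryService;

public class PerNodeAggregatorService extends AuxiliaryService {

  public PerNodeAggregatorService() {
    super("timeline_aggregator"); // placeholder service name
  }

  @Override
  public void initializeApplication(ApplicationInitializationContext context) {
    // Set up per-application aggregation state when an application's
    // first container starts on this node.
  }

  @Override
  public void stopApplication(ApplicationTerminationContext context) {
    // Tear down per-application state when the application finishes.
  }

  @Override
  public ByteBuffer getMetaData() {
    // No metadata to hand back to AMs in this sketch.
    return ByteBuffer.allocate(0);
  }
}
{code}
Like the shuffle service, such a service would be registered with the NM through the yarn.nodemanager.aux-services configuration.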
[jira] [Commented] (YARN-3077) RM should create yarn.resourcemanager.zk-state-store.parent-path recursively
[ https://issues.apache.org/jira/browse/YARN-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285164#comment-14285164 ] Varun Saxena commented on YARN-3077: Sure...go ahead. I have unassigned it RM should create yarn.resourcemanager.zk-state-store.parent-path recursively Key: YARN-3077 URL: https://issues.apache.org/jira/browse/YARN-3077 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Chun Chen If multiple clusters share a zookeeper cluster, users might use /rmstore/${yarn.resourcemanager.cluster-id} as the state store path. If user specified a customer value which is not a top-level path for ${yarn.resourcemanager.zk-state-store.parent-path}, yarn should create parent path first. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3077) RM should create yarn.resourcemanager.zk-state-store.parent-path recursively
[ https://issues.apache.org/jira/browse/YARN-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-3077: --- Assignee: (was: Varun Saxena) RM should create yarn.resourcemanager.zk-state-store.parent-path recursively Key: YARN-3077 URL: https://issues.apache.org/jira/browse/YARN-3077 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Chun Chen If multiple clusters share a zookeeper cluster, users might use /rmstore/${yarn.resourcemanager.cluster-id} as the state store path. If user specified a customer value which is not a top-level path for ${yarn.resourcemanager.zk-state-store.parent-path}, yarn should create parent path first. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3078) LogCLIHelpers lacks of a blank space before string 'does not exist'
sam liu created YARN-3078: - Summary: LogCLIHelpers lacks of a blank space before string 'does not exist' Key: YARN-3078 URL: https://issues.apache.org/jira/browse/YARN-3078 Project: Hadoop YARN Issue Type: Bug Components: log-aggregation Affects Versions: trunk-win Reporter: sam liu Priority: Minor LogCLIHelpers lacks a blank space before the string 'does not exist', which produces an incorrect return message. For example, I ran the command 'yarn logs -applicationId application_1421742816585_0003', and the return message includes 'logs/application_1421742816585_0003does not exist'. Obviously this is incorrect; the correct return message should be 'logs/application_1421742816585_0003 does not exist' -- This message was sent by Atlassian JIRA (v6.3.4#6332)
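The fix is presumably a one-character change to the message construction, along these lines (variable name invented; the real code lives in LogCLIHelpers):
{code}
// Before: path and message are concatenated without a separator,
// yielding ".../application_1421742816585_0003does not exist".
System.out.println(remoteAppLogDir + "does not exist");

// After: add the missing blank space before the message.
System.out.println(remoteAppLogDir + " does not exist");
{code}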
[jira] [Commented] (YARN-3077) RM should create yarn.resourcemanager.zk-state-store.parent-path recursively
[ https://issues.apache.org/jira/browse/YARN-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285174#comment-14285174 ] Chun Chen commented on YARN-3077: - Thanks. :) RM should create yarn.resourcemanager.zk-state-store.parent-path recursively Key: YARN-3077 URL: https://issues.apache.org/jira/browse/YARN-3077 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Chun Chen If multiple clusters share a zookeeper cluster, users might use /rmstore/${yarn.resourcemanager.cluster-id} as the state store path. If user specified a customer value which is not a top-level path for ${yarn.resourcemanager.zk-state-store.parent-path}, yarn should create parent path first. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3037) create HBase cluster backing storage implementation for ATS writes
[ https://issues.apache.org/jira/browse/YARN-3037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285004#comment-14285004 ] Sangjin Lee commented on YARN-3037: --- Agreed. We should definitely work together on this. I think this depends on the object model (YARN-3041) and since the new object model will be different than the existing one (esp. regarding flows, etc.), I suspect the implementation would be quite different. Nonetheless, we should be able to use learnings from the previous work that's done. We should have more discussions on this as you said. create HBase cluster backing storage implementation for ATS writes -- Key: YARN-3037 URL: https://issues.apache.org/jira/browse/YARN-3037 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Vrushali C Per design in YARN-2928, create a backing storage implementation for ATS writes based on a full HBase cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3077) RM should create yarn.resourcemanager.zk-state-store.parent-path recursively
[ https://issues.apache.org/jira/browse/YARN-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285161#comment-14285161 ] Chun Chen commented on YARN-3077: - [~varun_saxena] I would like to do it myself. RM should create yarn.resourcemanager.zk-state-store.parent-path recursively Key: YARN-3077 URL: https://issues.apache.org/jira/browse/YARN-3077 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Chun Chen Assignee: Varun Saxena If multiple clusters share a zookeeper cluster, users might use /rmstore/${yarn.resourcemanager.cluster-id} as the state store path. If user specified a customer value which is not a top-level path for ${yarn.resourcemanager.zk-state-store.parent-path}, yarn should create parent path first. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3078) LogCLIHelpers lacks of a blank space before string 'does not exist'
[ https://issues.apache.org/jira/browse/YARN-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sam liu updated YARN-3078: -- Attachment: YARN-3078.001.patch This fix resolves the issue. Could an expert please review the patch and assign this JIRA to me? Thanks! LogCLIHelpers lacks of a blank space before string 'does not exist' --- Key: YARN-3078 URL: https://issues.apache.org/jira/browse/YARN-3078 Project: Hadoop YARN Issue Type: Bug Components: log-aggregation Affects Versions: trunk-win Reporter: sam liu Priority: Minor Attachments: YARN-3078.001.patch LogCLIHelpers lacks a blank space before the string 'does not exist', which produces an incorrect return message. For example, I ran the command 'yarn logs -applicationId application_1421742816585_0003', and the return message includes 'logs/application_1421742816585_0003does not exist'. Obviously this is incorrect; the correct return message should be 'logs/application_1421742816585_0003 does not exist' -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3078) LogCLIHelpers lacks of a blank space before string 'does not exist'
[ https://issues.apache.org/jira/browse/YARN-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285160#comment-14285160 ] Hadoop QA commented on YARN-3078: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12693495/YARN-3078.001.patch against trunk revision 73b72a0. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6369//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6369//console This message is automatically generated. LogCLIHelpers lacks of a blank space before string 'does not exist' --- Key: YARN-3078 URL: https://issues.apache.org/jira/browse/YARN-3078 Project: Hadoop YARN Issue Type: Bug Components: log-aggregation Affects Versions: trunk-win Reporter: sam liu Priority: Minor Attachments: YARN-3078.001.patch LogCLIHelpers lacks of a blank space before string 'does not exist' and it will bring incorrect return message. For example, I ran command 'yarn logs -applicationId application_1421742816585_0003', and the return message includes 'logs/application_1421742816585_0003does not exist'. Obviously it's incorrect and the correct return message should be 'logs/application_1421742816585_0003 does not exist' -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3077) RM should create yarn.resourcemanager.zk-state-store.parent-path recursively
Chun Chen created YARN-3077: --- Summary: RM should create yarn.resourcemanager.zk-state-store.parent-path recursively Key: YARN-3077 URL: https://issues.apache.org/jira/browse/YARN-3077 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Chun Chen If multiple clusters share a zookeeper cluster, users might use /rmstore/${yarn.resourcemanager.cluster-id} as the state store path. If the user specifies a custom value which is not a top-level path for ${yarn.resourcemanager.zk-state-store.parent-path}, yarn should create the parent path first. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
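Creating the parent path recursively would amount to something like the following against the raw ZooKeeper client (a sketch with simplified ACL and error handling; the actual RM state store code may differ):
{code}
import java.util.List;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.ACL;

// Create every missing znode on the way to parentPath, e.g. for
// "/rmstore/cluster1" create "/rmstore" and then "/rmstore/cluster1".
static void createParentPath(ZooKeeper zk, String parentPath, List<ACL> acls)
    throws KeeperException, InterruptedException {
  StringBuilder path = new StringBuilder();
  for (String segment : parentPath.split("/")) {
    if (segment.isEmpty()) {
      continue; // skip the empty segment before the leading "/"
    }
    path.append("/").append(segment);
    if (zk.exists(path.toString(), false) == null) {
      try {
        zk.create(path.toString(), new byte[0], acls, CreateMode.PERSISTENT);
      } catch (KeeperException.NodeExistsException e) {
        // benign race: another process created it between exists() and create()
      }
    }
  }
}
{code}
Curator users get the same behavior from creatingParentsIfNeeded() on the create builder.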
[jira] [Updated] (YARN-2868) Add metric for initial container launch time
[ https://issues.apache.org/jira/browse/YARN-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Chiang updated YARN-2868: - Attachment: YARN-2868.006.patch Implement changes based on feedback. Add metric for initial container launch time Key: YARN-2868 URL: https://issues.apache.org/jira/browse/YARN-2868 Project: Hadoop YARN Issue Type: Improvement Reporter: Ray Chiang Assignee: Ray Chiang Labels: metrics, supportability Attachments: YARN-2868-01.patch, YARN-2868.002.patch, YARN-2868.003.patch, YARN-2868.004.patch, YARN-2868.005.patch, YARN-2868.006.patch Add a metric to measure the latency between starting container allocation and first container actually allocated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
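The metric being added boils down to timestamping the first successful allocation relative to when the attempt started asking for containers, conceptually like this sketch (class name invented; the patch wires the value into RM metrics rather than printing it):
{code}
// Minimal illustration of "initial container launch time": the latency
// from the start of allocation to the first container actually allocated.
class FirstAllocationTimer {
  private final long startMillis = System.currentTimeMillis();
  private boolean reported = false;

  // Call on every successful allocation; only the first one is timed.
  synchronized void onContainerAllocated() {
    if (!reported) {
      long latencyMs = System.currentTimeMillis() - startMillis;
      System.out.println("First container allocated after " + latencyMs + " ms");
      reported = true;
    }
  }
}
{code}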
[jira] [Created] (YARN-3075) Server side implementation to retrieve label to node mapping
Varun Saxena created YARN-3075: -- Summary: Server side implementation to retrieve label to node mapping Key: YARN-3075 URL: https://issues.apache.org/jira/browse/YARN-3075 Project: Hadoop YARN Issue Type: Task Components: resourcemanager Affects Versions: 2.7.0 Reporter: Varun Saxena Assignee: Varun Saxena -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3075) Server side implementation to retrieve label to node mapping
[ https://issues.apache.org/jira/browse/YARN-3075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-3075: - Issue Type: Sub-task (was: Task) Parent: YARN-2492 Server side implementation to retrieve label to node mapping Key: YARN-3075 URL: https://issues.apache.org/jira/browse/YARN-3075 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.7.0 Reporter: Varun Saxena Assignee: Varun Saxena -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3076) Client side implementation to retrieve label to node mapping
[ https://issues.apache.org/jira/browse/YARN-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-3076: - Issue Type: Sub-task (was: Task) Parent: YARN-2492 Client side implementation to retrieve label to node mapping Key: YARN-3076 URL: https://issues.apache.org/jira/browse/YARN-3076 Project: Hadoop YARN Issue Type: Sub-task Components: client Affects Versions: 2.7.0 Reporter: Varun Saxena Assignee: Varun Saxena -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3028) Better syntax for replace label CLI
[ https://issues.apache.org/jira/browse/YARN-3028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated YARN-3028: - Attachment: 0001-YARN-3028.patch Better syntax for replace label CLI --- Key: YARN-3028 URL: https://issues.apache.org/jira/browse/YARN-3028 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Jian He Assignee: Rohith Attachments: 0001-YARN-3028.patch The command to replace label now is such: {code} yarn rmadmin -replaceLabelsOnNode [node1:port,label1,label2 node2:port,label1,label2] {code} Instead of {code} node1:port,label1,label2 {code} I think it's better to say {code} node1:port=label1,label2 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
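For reference, parsing the proposed {{node1:port=label1,label2}} form is straightforward; a sketch follows (not the patch's code, which also keeps the older comma-separated syntax):
{code}
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Parse "node1:port=label1,label2 node2:port=label1" into
// a host:port -> labels map.
static Map<String, List<String>> parseNodeToLabels(String args) {
  Map<String, List<String>> result = new HashMap<String, List<String>>();
  for (String nodeSpec : args.trim().split("\\s+")) {
    String[] parts = nodeSpec.split("=", 2);
    String node = parts[0]; // e.g. node1:port
    String[] labels =
        parts.length > 1 ? parts[1].split(",") : new String[0];
    result.put(node, Arrays.asList(labels));
  }
  return result;
}
{code}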
[jira] [Commented] (YARN-3024) LocalizerRunner should give DIE action when all resources are localized
[ https://issues.apache.org/jira/browse/YARN-3024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285287#comment-14285287 ] Xuan Gong commented on YARN-3024: - Yes, I will take a look shortly. LocalizerRunner should give DIE action when all resources are localized --- Key: YARN-3024 URL: https://issues.apache.org/jira/browse/YARN-3024 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.6.0 Reporter: Chengbing Liu Assignee: Chengbing Liu Attachments: YARN-3024.01.patch, YARN-3024.02.patch, YARN-3024.03.patch We have observed that {{LocalizerRunner}} always gives a LIVE action at the end of localization process. The problem is {{findNextResource()}} can return null even when {{pending}} was not empty prior to the call. This method removes localized resources from {{pending}}, therefore we should check the return value, and gives DIE action when it returns null. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
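The change under review boils down to acting on the return value of {{findNextResource()}}; schematically (types reduced to a local enum for illustration, where the real code builds a LocalizerHeartbeatResponse):
{code}
enum LocalizerAction { LIVE, DIE }

// findNextResource() removes already-localized resources from "pending",
// so it can return null even if "pending" was non-empty before the call.
LocalizerAction decideAction(Object nextResource, boolean pendingEmpty) {
  if (nextResource == null && pendingEmpty) {
    // Everything is localized: tell the localizer to exit now instead
    // of keeping it alive for another heartbeat round.
    return LocalizerAction.DIE;
  }
  return LocalizerAction.LIVE;
}
{code}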
[jira] [Commented] (YARN-3028) Better syntax for replace label CLI
[ https://issues.apache.org/jira/browse/YARN-3028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285256#comment-14285256 ] Hadoop QA commented on YARN-3028: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12693515/0001-YARN-3028.patch against trunk revision 6b17eb9. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client: org.apache.hadoop.yarn.client.TestResourceTrackerOnHA org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA org.apache.hadoop.yarn.client.api.impl.TestYarnClient Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6370//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6370//console This message is automatically generated. Better syntax for replace label CLI --- Key: YARN-3028 URL: https://issues.apache.org/jira/browse/YARN-3028 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Jian He Assignee: Rohith Attachments: 0001-YARN-3028.patch The command to replace label now is such: {code} yarn rmadmin -replaceLabelsOnNode [node1:port,label1,label2 node2:port,label1,label2] {code} Instead of {code} node1:port,label1,label2 {code} I think it's better to say {code} node1:port=label1,label2 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3077) RM should create yarn.resourcemanager.zk-state-store.parent-path recursively
[ https://issues.apache.org/jira/browse/YARN-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chun Chen updated YARN-3077: Attachment: YARN-3077.patch RM should create yarn.resourcemanager.zk-state-store.parent-path recursively Key: YARN-3077 URL: https://issues.apache.org/jira/browse/YARN-3077 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Chun Chen Attachments: YARN-3077.patch If multiple clusters share a zookeeper cluster, users might use /rmstore/${yarn.resourcemanager.cluster-id} as the state store path. If user specified a customer value which is not a top-level path for ${yarn.resourcemanager.zk-state-store.parent-path}, yarn should create parent path first. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3057) Need update apps' runnability when reloading allocation files for FairScheduler
[ https://issues.apache.org/jira/browse/YARN-3057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-3057: --- Attachment: YARN-3057.patch Need update apps' runnability when reloading allocation files for FairScheduler --- Key: YARN-3057 URL: https://issues.apache.org/jira/browse/YARN-3057 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Jun Gong Assignee: Jun Gong Attachments: YARN-3057.patch If we submit an app and the number of running apps in its corresponding leaf queue has reached its max limit, the app will be put into 'nonRunnableApps', and its runnability will only be updated when an appattempt is removed (FairScheduler calls `updateRunnabilityOnAppRemoval` at that time). Suppose only service apps are running: they will not finish, so the submitted app will never be scheduled even if we raise the leaf queue's max limit. I think we need to update apps' runnability when reloading allocation files for FairScheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2896) Server side PB changes for Priority Label Manager and Admin CLI support
[ https://issues.apache.org/jira/browse/YARN-2896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285090#comment-14285090 ] Sunil G commented on YARN-2896: --- Thank you [~eepayne] and [~wangda] for the comments. The idea of keeping Application Priority as a string is better handling and ease of use from the user's perspective. Internally, the RM will have a corresponding integer mapping, and only that will be used by the schedulers. Hence, as Wangda mentioned, it can be treated just like an integer with user priority etc. A rough idea: a user submits a job with priority "High" and the scheduler treats it as an integer, namely 3. The Priority Label Manager will act as an interface between the user and the scheduler and can give the priority as a string or an integer accordingly. Now coming to the advantages: an admin can operate on names or labels for priority, which is easier, and the labels can be displayed very easily in the UI. An admin can also configure the priority labels to match local conventions by defining the corresponding integer mapping for each label. For example:
{noformat}
yarn.cluster.priority-labels=low:1,medium:3,high:5
{noformat}
Configuring ACLs based on a priority label name will also be easier:
{noformat}
yarn.scheduler.capacity.root.queueA.High.acl=user1,user2
{noformat}
Please share your thoughts. I will address the other comments from Eric and will update a patch. Server side PB changes for Priority Label Manager and Admin CLI support --- Key: YARN-2896 URL: https://issues.apache.org/jira/browse/YARN-2896 Project: Hadoop YARN Issue Type: Sub-task Components: api, resourcemanager Reporter: Sunil G Assignee: Sunil G Attachments: 0001-YARN-2896.patch, 0002-YARN-2896.patch, 0003-YARN-2896.patch, 0004-YARN-2896.patch Common changes: * PB support changes required for Admin APIs * PB support for File System store (Priority Label Store) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
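A sketch of parsing the example mapping above into label-to-integer form (the property and format come from the example in the comment, not from committed code):
{code}
import java.util.HashMap;
import java.util.Map;

// Parse "low:1,medium:3,high:5" into {low=1, medium=3, high=5}.
static Map<String, Integer> parsePriorityLabels(String value) {
  Map<String, Integer> labelToPriority = new HashMap<String, Integer>();
  for (String entry : value.split(",")) {
    String[] kv = entry.split(":", 2);
    labelToPriority.put(kv[0].trim(), Integer.parseInt(kv[1].trim()));
  }
  return labelToPriority;
}
{code}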
[jira] [Assigned] (YARN-3077) RM should create yarn.resourcemanager.zk-state-store.parent-path recursively
[ https://issues.apache.org/jira/browse/YARN-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena reassigned YARN-3077: -- Assignee: Varun Saxena RM should create yarn.resourcemanager.zk-state-store.parent-path recursively Key: YARN-3077 URL: https://issues.apache.org/jira/browse/YARN-3077 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Chun Chen Assignee: Varun Saxena If multiple clusters share a zookeeper cluster, users might use /rmstore/${yarn.resourcemanager.cluster-id} as the state store path. If user specified a customer value which is not a top-level path for ${yarn.resourcemanager.zk-state-store.parent-path}, yarn should create parent path first. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3028) Better syntax for replace label CLI
[ https://issues.apache.org/jira/browse/YARN-3028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285213#comment-14285213 ] Rohith commented on YARN-3028: -- Attached a patch that supports both syntaxes; kindly review. Better syntax for replace label CLI --- Key: YARN-3028 URL: https://issues.apache.org/jira/browse/YARN-3028 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Jian He Assignee: Rohith Attachments: 0001-YARN-3028.patch The command to replace label now is such: {code} yarn rmadmin -replaceLabelsOnNode [node1:port,label1,label2 node2:port,label1,label2] {code} Instead of {code} node1:port,label1,label2 {code} I think it's better to say {code} node1:port=label1,label2 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3024) LocalizerRunner should give DIE action when all resources are localized
[ https://issues.apache.org/jira/browse/YARN-3024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285270#comment-14285270 ] Chengbing Liu commented on YARN-3024: - [~xgong] Could you please review the changed logic described in my last comment? This patch will save at least one second for each localization process. LocalizerRunner should give DIE action when all resources are localized --- Key: YARN-3024 URL: https://issues.apache.org/jira/browse/YARN-3024 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.6.0 Reporter: Chengbing Liu Assignee: Chengbing Liu Attachments: YARN-3024.01.patch, YARN-3024.02.patch, YARN-3024.03.patch We have observed that {{LocalizerRunner}} always gives a LIVE action at the end of localization process. The problem is {{findNextResource()}} can return null even when {{pending}} was not empty prior to the call. This method removes localized resources from {{pending}}, therefore we should check the return value, and gives DIE action when it returns null. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2868) Add metric for initial container launch time
[ https://issues.apache.org/jira/browse/YARN-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285306#comment-14285306 ] Hadoop QA commented on YARN-2868: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12693520/YARN-2868.006.patch against trunk revision 6b17eb9. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6371//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6371//console This message is automatically generated. Add metric for initial container launch time Key: YARN-2868 URL: https://issues.apache.org/jira/browse/YARN-2868 Project: Hadoop YARN Issue Type: Improvement Reporter: Ray Chiang Assignee: Ray Chiang Labels: metrics, supportability Attachments: YARN-2868-01.patch, YARN-2868.002.patch, YARN-2868.003.patch, YARN-2868.004.patch, YARN-2868.005.patch, YARN-2868.006.patch Add a metric to measure the latency between starting container allocation and first container actually allocated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3074) Nodemanager dies when localizer runner tries to write to a full disk
Jason Lowe created YARN-3074: Summary: Nodemanager dies when localizer runner tries to write to a full disk Key: YARN-3074 URL: https://issues.apache.org/jira/browse/YARN-3074 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.5.0 Reporter: Jason Lowe When a LocalizerRunner tries to write to a full disk it can bring down the nodemanager process. Instead of failing the whole process we should fail only the container and make a best attempt to keep going. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3074) Nodemanager dies when localizer runner tries to write to a full disk
[ https://issues.apache.org/jira/browse/YARN-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284243#comment-14284243 ] Jason Lowe commented on YARN-3074: -- Sample stacktrace:
{noformat}
2015-01-16 12:06:56,399 [LocalizerRunner for container_1416815736267_3849544_01_000817] FATAL yarn.YarnUncaughtExceptionHandler: Thread Thread[LocalizerRunner for container_1416815736267_3849544_01_000817,5,main] threw an Error. Shutting down now...
org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device
        at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:226)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
        at java.io.FilterOutputStream.close(FilterOutputStream.java:157)
        at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
        at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
        at org.apache.hadoop.fs.ChecksumFs$ChecksumFSOutputSummer.close(ChecksumFs.java:366)
        at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
        at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.writeCredentials(ResourceLocalizationService.java:1125)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1068)
Caused by: java.io.IOException: No space left on device
        at java.io.FileOutputStream.writeBytes(Native Method)
        at java.io.FileOutputStream.write(FileOutputStream.java:318)
        at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:224)
        ... 10 more
{noformat}
Looks like the Hadoop filesystem layer helpfully changed what was originally an IOException into an FSError. FSError is _not_ an Exception, so the try...catch(Exception) block in LocalizerRunner.run doesn't catch it. It then bubbles up to the top of the thread, and the uncaught exception handler kills the whole process. We should consider catching Throwable rather than Exception in LocalizerRunner.run, or at least also catch FSError since it will be a common and recoverable error in this case. Nodemanager dies when localizer runner tries to write to a full disk Key: YARN-3074 URL: https://issues.apache.org/jira/browse/YARN-3074 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.5.0 Reporter: Jason Lowe When a LocalizerRunner tries to write to a full disk it can bring down the nodemanager process. Instead of failing the whole process we should fail only the container and make a best attempt to keep going. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
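The suggested hardening is simply to widen what the runner catches; a sketch of the shape (not the actual LocalizerRunner code, and the cleanup hook named in the comment is hypothetical):
{code}
import org.apache.hadoop.fs.FSError;

public void run() {
  try {
    // ... localize resources, write credentials, etc. ...
  } catch (FSError e) {
    // FSError extends Error, so catch (Exception) misses it; a full disk
    // surfaces here and should fail only this container, not the NM.
    // failContainer(e); // hypothetical per-container cleanup
  } catch (Exception e) {
    // existing handling: fail the container and keep the NM alive
  }
}
{code}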
[jira] [Commented] (YARN-1039) Add parameter for YARN resource requests to indicate long lived
[ https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284290#comment-14284290 ] Wangda Tan commented on YARN-1039: -- For task placement of long-lived requests, YARN-796 could take care of deciding which instance should run a specific long-lived request. Users can either manually specify the labels they want for such long-lived containers, or rules can be added on the scheduler side to configure and attach labels to long-lived requests automatically. Add parameter for YARN resource requests to indicate long lived - Key: YARN-1039 URL: https://issues.apache.org/jira/browse/YARN-1039 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 3.0.0, 2.1.1-beta Reporter: Steve Loughran Assignee: Craig Welch Attachments: YARN-1039.1.patch, YARN-1039.2.patch, YARN-1039.3.patch A container request could support a new parameter long-lived. This could be used by a scheduler that would know not to host the service on a transient (cloud: spot priced) node. Schedulers could also decide whether or not to allocate multiple long-lived containers on the same node -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3003) Provide API for client to retrieve label to node mapping
[ https://issues.apache.org/jira/browse/YARN-3003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-3003: --- Attachment: YARN-3003.002.patch Provide API for client to retrieve label to node mapping Key: YARN-3003 URL: https://issues.apache.org/jira/browse/YARN-3003 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Ted Yu Assignee: Varun Saxena Attachments: YARN-3003.001.patch, YARN-3003.002.patch Currently, YarnClient#getNodeToLabels() returns the mapping from NodeId to the set of labels associated with each node. Clients (such as Slider) may be interested in the label-to-node mapping: given a label, return the nodes carrying that label. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
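Until such an API lands, a client can derive the inverse mapping itself from the call the description mentions. A minimal sketch, assuming only YarnClient#getNodeToLabels() as described above (Java 7 style, matching the codebase of the time):
{code}
import java.io.IOException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import org.apache.hadoop.yarn.api.records.NodeId;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.exceptions.YarnException;

public class LabelsToNodesSketch {
  // Invert Map<NodeId, Set<String>> into Map<String, Set<NodeId>> on the
  // client side. Works, but every caller pays for fetching and inverting
  // the full node-to-labels map, which is the efficiency concern raised
  // in the discussion below.
  public static Map<String, Set<NodeId>> labelsToNodes(YarnClient client)
      throws YarnException, IOException {
    Map<String, Set<NodeId>> inverse = new HashMap<String, Set<NodeId>>();
    for (Map.Entry<NodeId, Set<String>> e :
        client.getNodeToLabels().entrySet()) {
      for (String label : e.getValue()) {
        Set<NodeId> nodes = inverse.get(label);
        if (nodes == null) {
          nodes = new HashSet<NodeId>();
          inverse.put(label, nodes);
        }
        nodes.add(e.getKey());
      }
    }
    return inverse;
  }
}
{code}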
[jira] [Commented] (YARN-3003) Provide API for client to retrieve label to node mapping
[ https://issues.apache.org/jira/browse/YARN-3003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284224#comment-14284224 ] Wangda Tan commented on YARN-3003: -- Thanks for providing your thoughts! On Naga's first point: I think performance is one concern (as mentioned by Varun, we may need to rewrite some parts of the code to make it efficient), and we need more solid use cases / input for that. As a first step, we can focus on simply receiving a set of labels as input and returning a labels-to-nodes mapping. On the second point, I think it's important to keep it in memory; not much extra memory is needed to hold such a mapping. And thanks to [~varun_saxena] for working on this. One suggestion: could you split it into two parts? One adds the getLabelsToNodes API and implementation to NodeLabelsManager; the other adds the API and implementation to YarnClient (the latter depends on the former). Sounds good? Provide API for client to retrieve label to node mapping Key: YARN-3003 URL: https://issues.apache.org/jira/browse/YARN-3003 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Ted Yu Assignee: Varun Saxena Attachments: YARN-3003.001.patch Currently, YarnClient#getNodeToLabels() returns the mapping from NodeId to the set of labels associated with each node. Clients (such as Slider) may be interested in the label-to-node mapping: given a label, return the nodes carrying that label. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
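In code terms, the suggested split might look like the interfaces below. These are shapes inferred from the comment above, not committed API; the exact names, argument lists, and placement are assumptions.
{code}
import java.io.IOException;
import java.util.Map;
import java.util.Set;
import org.apache.hadoop.yarn.api.records.NodeId;
import org.apache.hadoop.yarn.exceptions.YarnException;

// Part 1 (proposed): an in-memory mapping maintained by the RM-side
// node labels manager, queryable for all labels or a requested subset.
interface NodeLabelsManagerAddition {
  Map<String, Set<NodeId>> getLabelsToNodes();
  Map<String, Set<NodeId>> getLabelsToNodes(Set<String> labels);
}

// Part 2 (proposed): the client-facing API on YarnClient, backed by a
// new RM protocol call; it depends on part 1 being in place.
interface YarnClientAddition {
  Map<String, Set<NodeId>> getLabelsToNodes()
      throws YarnException, IOException;
}
{code}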
[jira] [Commented] (YARN-3003) Provide API for client to retrieve label to node mapping
[ https://issues.apache.org/jira/browse/YARN-3003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284231#comment-14284231 ] Varun Saxena commented on YARN-3003: [~leftnoteasy], you mean break this JIRA into two parts? Provide API for client to retrieve label to node mapping Key: YARN-3003 URL: https://issues.apache.org/jira/browse/YARN-3003 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Ted Yu Assignee: Varun Saxena Attachments: YARN-3003.001.patch Currently, YarnClient#getNodeToLabels() returns the mapping from NodeId to the set of labels associated with each node. Clients (such as Slider) may be interested in the label-to-node mapping: given a label, return the nodes carrying that label. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1039) Add parameter for YARN resource requests to indicate long lived
[ https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284232#comment-14284232 ] Jian Fang commented on YARN-1039: - Thanks, Steve, for your clarification. The long-lived concept seems to make sense now if this flag is associated with a policy switch in YARN. I think the above is only one part of the story; the cluster infrastructure itself is probably another part we need to consider, just like the spot instance feature in EC2 mentioned in this JIRA. The long-lived concept has broader implications for Hadoop clusters in a cloud environment. For example, the instance type could affect container scheduling. We should also take this concept into consideration for elastic features such as graceful expansion and shrinking of a cluster in the cloud. On the other side, I still think YARN-796 should be used together with the long-lived concept. For example, how would the resource manager know which instance should run a long-lived daemon/task? There should be a mapping between the long-lived concept and the tags/labels provided by the instance, right? Add parameter for YARN resource requests to indicate long lived - Key: YARN-1039 URL: https://issues.apache.org/jira/browse/YARN-1039 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 3.0.0, 2.1.1-beta Reporter: Steve Loughran Assignee: Craig Welch Attachments: YARN-1039.1.patch, YARN-1039.2.patch, YARN-1039.3.patch A container request could support a new parameter, long-lived. This could be used by a scheduler that would know not to host the service on a transient (cloud: spot-priced) node. Schedulers could also decide whether or not to allocate multiple long-lived containers on the same node. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1039) Add parameter for YARN resource requests to indicate long lived
[ https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284288#comment-14284288 ] Wangda Tan commented on YARN-1039: -- I agree with Carlo on this point. Duration can cover both long-lived and short-lived. It may be hard to estimate the exact running time of a container, but a rough estimate can help the scheduler make better decisions and provide the corresponding information to the user, as Steve mentioned. Add parameter for YARN resource requests to indicate long lived - Key: YARN-1039 URL: https://issues.apache.org/jira/browse/YARN-1039 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 3.0.0, 2.1.1-beta Reporter: Steve Loughran Assignee: Craig Welch Attachments: YARN-1039.1.patch, YARN-1039.2.patch, YARN-1039.3.patch A container request could support a new parameter, long-lived. This could be used by a scheduler that would know not to host the service on a transient (cloud: spot-priced) node. Schedulers could also decide whether or not to allocate multiple long-lived containers on the same node. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3020) n similar addContainerRequest()s produce n*(n+1)/2 containers
[ https://issues.apache.org/jira/browse/YARN-3020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284301#comment-14284301 ] Peter D Kirchner commented on YARN-3020: https://issues.apache.org/jira/secure/ViewProfile.jspa?name=ywskycn : Please take a look at this snippet modifying distributedShell, and the output, and perhaps you will get my point. Observe that the accounting behind what gets sent to the RM on heartbeats following either addContainerRequest() or removeContainerRequest() is defective. 100 containers are assigned as the result of this code, which ostensibly requests only 10: 10 adds, with interleaved heartbeats, followed by 10 removes that, with interleaved heartbeats, should be no-ops. 55 containers result from the adds (1+2+3+4+5+6+7+8+9+10). 45 additional containers are requested as the result of the 10 calls to remove (9+8+7+6+5+4+3+2+1).
{code}
for (int i = 0; i < 20; i++) {
  try {
    ContainerRequest containerAsk = setupContainerAskForRM();
    if (i < 10) {
      amRMClient.addContainerRequest(containerAsk);
    } else {
      amRMClient.removeContainerRequest(containerAsk);
    }
    Thread.sleep(1500);
    List list1 = amRMClient.getMatchingRequests(containerAsk.getPriority(),
        "*", containerAsk.getCapability());
    LinkedHashSet set1 = (java.util.LinkedHashSet) (list1.get(0));
    System.out.println("i=" + i + " outstanding=" + set1.size());
  } catch (InterruptedException e1) {
    e1.printStackTrace();
  }
}
{code}
{noformat}
DEBUG [Thread-7] (AMRMClientImpl.java:585) - addResourceRequest: applicationId= priority=0 resourceName=* numContainers=1 #asks=1
i=0 outstanding=1
DEBUG [Thread-7] (AMRMClientImpl.java:585) - addResourceRequest: applicationId= priority=0 resourceName=* numContainers=2 #asks=1
i=1 outstanding=2
DEBUG [Thread-7] (AMRMClientImpl.java:585) - addResourceRequest: applicationId= priority=0 resourceName=* numContainers=3 #asks=1
i=2 outstanding=3
DEBUG [Thread-7] (AMRMClientImpl.java:585) - addResourceRequest: applicationId= priority=0 resourceName=* numContainers=4 #asks=1
i=3 outstanding=4
DEBUG [Thread-7] (AMRMClientImpl.java:585) - addResourceRequest: applicationId= priority=0 resourceName=* numContainers=5 #asks=1
i=4 outstanding=5
DEBUG [Thread-7] (AMRMClientImpl.java:585) - addResourceRequest: applicationId= priority=0 resourceName=* numContainers=6 #asks=1
i=5 outstanding=6
DEBUG [Thread-7] (AMRMClientImpl.java:585) - addResourceRequest: applicationId= priority=0 resourceName=* numContainers=7 #asks=1
i=6 outstanding=7
DEBUG [Thread-7] (AMRMClientImpl.java:585) - addResourceRequest: applicationId= priority=0 resourceName=* numContainers=8 #asks=1
i=7 outstanding=8
DEBUG [Thread-7] (AMRMClientImpl.java:585) - addResourceRequest: applicationId= priority=0 resourceName=* numContainers=9 #asks=1
i=8 outstanding=9
DEBUG [Thread-7] (AMRMClientImpl.java:585) - addResourceRequest: applicationId= priority=0 resourceName=* numContainers=10 #asks=1
i=9 outstanding=10
DEBUG [Thread-7] (AMRMClientImpl.java:619) - BEFORE decResourceRequest: applicationId= priority=0 resourceName=* numContainers=10 #asks=0
INFO [Thread-7] (AMRMClientImpl.java:652) - AFTER decResourceRequest: applicationId= priority=0 resourceName=* numContainers=9 #asks=1
i=10 outstanding=10
DEBUG [Thread-7] (AMRMClientImpl.java:619) - BEFORE decResourceRequest: applicationId= priority=0 resourceName=* numContainers=9 #asks=0
INFO [Thread-7] (AMRMClientImpl.java:652) - AFTER decResourceRequest: applicationId= priority=0 resourceName=* numContainers=8 #asks=1
i=11 outstanding=10
DEBUG [Thread-7] (AMRMClientImpl.java:619) - BEFORE decResourceRequest: applicationId= priority=0 resourceName=* numContainers=8 #asks=0
INFO [Thread-7] (AMRMClientImpl.java:652) - AFTER decResourceRequest: applicationId= priority=0 resourceName=* numContainers=7 #asks=1
i=12 outstanding=10
DEBUG [Thread-7] (AMRMClientImpl.java:619) - BEFORE decResourceRequest: applicationId= priority=0 resourceName=* numContainers=7 #asks=0
INFO [Thread-7] (AMRMClientImpl.java:652) - AFTER decResourceRequest: applicationId= priority=0 resourceName=* numContainers=6 #asks=1
i=13 outstanding=10
DEBUG [Thread-7] (AMRMClientImpl.java:619) - BEFORE decResourceRequest: applicationId= priority=0 resourceName=* numContainers=6 #asks=0
INFO [Thread-7] (AMRMClientImpl.java:652) - AFTER decResourceRequest: applicationId= priority=0 resourceName=* numContainers=5 #asks=1
i=14 outstanding=10
DEBUG [Thread-7] (AMRMClientImpl.java:619) - BEFORE decResourceRequest: applicationId= priority=0 resourceName=* numContainers=5 #asks=0
INFO [Thread-7] (AMRMClientImpl.java:652) - AFTER decResourceRequest: applicationId= priority=0 resourceName=* numContainers=4 #asks=1
i=15 outstanding=10
DEBUG [Thread-7] (AMRMClientImpl.java:619) - BEFORE decResourceRequest: applicationId= priority=0 resourceName=* numContainers=4 #asks=0
INFO [Thread-7] (AMRMClientImpl.java:652) - AFTER
{noformat}
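For completeness, the totals above match the triangular-number pattern the issue title names (n*(n+1)/2 for n similar requests). A throwaway check of the arithmetic, nothing more:
{code}
public class TriangularCheck {
  public static void main(String[] args) {
    int fromAdds = 0, fromRemoves = 0;
    for (int k = 1; k <= 10; k++) fromAdds += k;   // 1+2+...+10 = 55
    for (int k = 1; k <= 9; k++) fromRemoves += k; // 9+8+...+1  = 45
    System.out.println(fromAdds + fromRemoves);    // prints 100
  }
}
{code}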
[jira] [Commented] (YARN-1039) Add parameter for YARN resource requests to indicate long lived
[ https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284304#comment-14284304 ] Craig Welch commented on YARN-1039: --- As I understand it (and I may be wrong on this), the original intent of this JIRA was to provide a boolean switch to control a set of behaviors expected to be important for a long-running service - among other things, what sort of nodes to schedule on and how to handle logs. This could be on a sliding scale based on duration, but I'm not sure that works so well - at what duration do we start to change how we handle logs and/or where we schedule things? While related, I think that converting this from a boolean to a range will make it more difficult to use for the intended use case. I also think that packing all of these behaviors into one parameter might be a negative overall. To [~john.jian.fang]'s point, using this to determine where to schedule tasks so as to avoid spot instances and the like has, as of now, really been superseded by Node Labels, and I do not think we should add additional functionality for that here - Node Labels is the way to handle that part of the use case. That leaves, potentially among other things, affinity/anti-affinity (not scheduling long-running tasks together, or deliberately scheduling them together) and log handling (how do we tell the system we want long-running-service log handling, if in fact the system needs to be told that). I submit that it would be better to have separate solutions for each of these needs, which can be bundled together to achieve the overall use case; I think that will provide better control without adding too much complexity for the end user. That means breaking this out into affinity/anti-affinity and logging configuration. We could always have a single convenience parameter (like this one) that sets the others; I'm not sure we'll actually need it, but I do think splitting the bundled functionality into individual items (some of which may already be being worked on elsewhere) is the way to go. Add parameter for YARN resource requests to indicate long lived - Key: YARN-1039 URL: https://issues.apache.org/jira/browse/YARN-1039 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 3.0.0, 2.1.1-beta Reporter: Steve Loughran Assignee: Craig Welch Attachments: YARN-1039.1.patch, YARN-1039.2.patch, YARN-1039.3.patch A container request could support a new parameter, long-lived. This could be used by a scheduler that would know not to host the service on a transient (cloud: spot-priced) node. Schedulers could also decide whether or not to allocate multiple long-lived containers on the same node. -- This message was sent by Atlassian JIRA (v6.3.4#6332)