[jira] [Updated] (YARN-1021) Yarn Scheduler Load Simulator

2013-09-24 Thread Wei Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Yan updated YARN-1021:
--

Attachment: YARN-1021.patch

 Yarn Scheduler Load Simulator
 -

 Key: YARN-1021
 URL: https://issues.apache.org/jira/browse/YARN-1021
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: scheduler
Reporter: Wei Yan
Assignee: Wei Yan
 Attachments: YARN-1021-demo.tar.gz, YARN-1021-images.tar.gz, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.pdf, YARN-1021.pdf


 The YARN scheduler is a fertile area of interest, with several different implementations, e.g., the FIFO, Capacity, and Fair schedulers. Meanwhile, various optimizations are made to improve scheduler performance for different scenarios and workloads. Each scheduler algorithm has its own set of features and drives scheduling decisions by many factors, such as fairness, capacity guarantees, resource availability, etc. It is very important to evaluate a scheduler algorithm thoroughly before deploying it in a production cluster. Unfortunately, evaluating a scheduling algorithm is currently non-trivial: evaluating in a real cluster is time- and cost-consuming, and it is also very hard to find a large enough cluster. Hence, a simulator that can predict how well a scheduler algorithm performs for a specific workload would be quite useful.
 We want to build a Scheduler Load Simulator that simulates large-scale YARN clusters and application loads on a single machine. This would be invaluable in furthering YARN by providing a tool for researchers and developers to prototype new scheduler features and predict their behavior and performance with a reasonable amount of confidence, thereby aiding rapid innovation.
 The simulator will exercise the real YARN ResourceManager while removing the network factor, by simulating NodeManagers and ApplicationMasters and handling and dispatching NM/AM heartbeat events from within the same JVM.
 To keep track of scheduler behavior and performance, a scheduler wrapper will wrap the real scheduler.
 The simulator will produce real-time metrics while executing, including:
 * Resource usage for the whole cluster and for each queue, which can be used to configure cluster and queue capacities.
 * A detailed application execution trace (recorded in relation to simulated time), which can be analyzed to understand and validate scheduler behavior (individual job turnaround time, throughput, fairness, capacity guarantees, etc.).
 * Several key metrics of the scheduler algorithm, such as the time cost of each scheduler operation (allocate, handle, etc.), which Hadoop developers can use to find hot spots in the code and scalability limits.
 The simulator will provide real-time charts showing the behavior and performance of the scheduler.
 A short demo is available at http://www.youtube.com/watch?v=6thLi8q0qLE, showing how to use the simulator to simulate the Fair Scheduler and the Capacity Scheduler.
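
To make the scheduler wrapper and per-operation timing concrete, here is a minimal, purely illustrative Java sketch; it is not taken from the attached patch, and the SimpleScheduler/TimedScheduler names are hypothetical:

{code}
// Purely illustrative sketch (not from the attached patch): a wrapper that times
// each allocate() call. SimpleScheduler/TimedScheduler are hypothetical names;
// the real wrapper would wrap the actual YARN scheduler interface.
import java.util.concurrent.atomic.AtomicLong;

interface SimpleScheduler {
  void allocate(String appId, int numContainers);
}

class TimedScheduler implements SimpleScheduler {
  private final SimpleScheduler real;                    // the real scheduler being wrapped
  private final AtomicLong totalNanos = new AtomicLong();
  private final AtomicLong calls = new AtomicLong();

  TimedScheduler(SimpleScheduler real) { this.real = real; }

  public void allocate(String appId, int numContainers) {
    long start = System.nanoTime();
    try {
      real.allocate(appId, numContainers);               // delegate to the wrapped scheduler
    } finally {
      totalNanos.addAndGet(System.nanoTime() - start);
      calls.incrementAndGet();
    }
  }

  /** Average cost of an allocate() call, in milliseconds. */
  double avgAllocateMillis() {
    long n = calls.get();
    return n == 0 ? 0.0 : (totalNanos.get() / 1000000.0) / n;
  }
}
{code}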

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1021) Yarn Scheduler Load Simulator

2013-09-24 Thread Wei Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Yan updated YARN-1021:
--

Attachment: YARN-1021.pdf

 Yarn Scheduler Load Simulator
 -

 Key: YARN-1021
 URL: https://issues.apache.org/jira/browse/YARN-1021
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: scheduler
Reporter: Wei Yan
Assignee: Wei Yan
 Attachments: YARN-1021-demo.tar.gz, YARN-1021-images.tar.gz, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.pdf, YARN-1021.pdf


 The YARN scheduler is a fertile area of interest, with several different implementations, e.g., the FIFO, Capacity, and Fair schedulers. Meanwhile, various optimizations are made to improve scheduler performance for different scenarios and workloads. Each scheduler algorithm has its own set of features and drives scheduling decisions by many factors, such as fairness, capacity guarantees, resource availability, etc. It is very important to evaluate a scheduler algorithm thoroughly before deploying it in a production cluster. Unfortunately, evaluating a scheduling algorithm is currently non-trivial: evaluating in a real cluster is time- and cost-consuming, and it is also very hard to find a large enough cluster. Hence, a simulator that can predict how well a scheduler algorithm performs for a specific workload would be quite useful.
 We want to build a Scheduler Load Simulator that simulates large-scale YARN clusters and application loads on a single machine. This would be invaluable in furthering YARN by providing a tool for researchers and developers to prototype new scheduler features and predict their behavior and performance with a reasonable amount of confidence, thereby aiding rapid innovation.
 The simulator will exercise the real YARN ResourceManager while removing the network factor, by simulating NodeManagers and ApplicationMasters and handling and dispatching NM/AM heartbeat events from within the same JVM.
 To keep track of scheduler behavior and performance, a scheduler wrapper will wrap the real scheduler.
 The simulator will produce real-time metrics while executing, including:
 * Resource usage for the whole cluster and for each queue, which can be used to configure cluster and queue capacities.
 * A detailed application execution trace (recorded in relation to simulated time), which can be analyzed to understand and validate scheduler behavior (individual job turnaround time, throughput, fairness, capacity guarantees, etc.).
 * Several key metrics of the scheduler algorithm, such as the time cost of each scheduler operation (allocate, handle, etc.), which Hadoop developers can use to find hot spots in the code and scalability limits.
 The simulator will provide real-time charts showing the behavior and performance of the scheduler.
 A short demo is available at http://www.youtube.com/watch?v=6thLi8q0qLE, showing how to use the simulator to simulate the Fair Scheduler and the Capacity Scheduler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1021) Yarn Scheduler Load Simulator

2013-09-24 Thread Wei Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Yan updated YARN-1021:
--

Attachment: (was: YARN-1021.pdf)

 Yarn Scheduler Load Simulator
 -

 Key: YARN-1021
 URL: https://issues.apache.org/jira/browse/YARN-1021
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: scheduler
Reporter: Wei Yan
Assignee: Wei Yan
 Attachments: YARN-1021-demo.tar.gz, YARN-1021-images.tar.gz, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.pdf


 The YARN scheduler is a fertile area of interest, with several different implementations, e.g., the FIFO, Capacity, and Fair schedulers. Meanwhile, various optimizations are made to improve scheduler performance for different scenarios and workloads. Each scheduler algorithm has its own set of features and drives scheduling decisions by many factors, such as fairness, capacity guarantees, resource availability, etc. It is very important to evaluate a scheduler algorithm thoroughly before deploying it in a production cluster. Unfortunately, evaluating a scheduling algorithm is currently non-trivial: evaluating in a real cluster is time- and cost-consuming, and it is also very hard to find a large enough cluster. Hence, a simulator that can predict how well a scheduler algorithm performs for a specific workload would be quite useful.
 We want to build a Scheduler Load Simulator that simulates large-scale YARN clusters and application loads on a single machine. This would be invaluable in furthering YARN by providing a tool for researchers and developers to prototype new scheduler features and predict their behavior and performance with a reasonable amount of confidence, thereby aiding rapid innovation.
 The simulator will exercise the real YARN ResourceManager while removing the network factor, by simulating NodeManagers and ApplicationMasters and handling and dispatching NM/AM heartbeat events from within the same JVM.
 To keep track of scheduler behavior and performance, a scheduler wrapper will wrap the real scheduler.
 The simulator will produce real-time metrics while executing, including:
 * Resource usage for the whole cluster and for each queue, which can be used to configure cluster and queue capacities.
 * A detailed application execution trace (recorded in relation to simulated time), which can be analyzed to understand and validate scheduler behavior (individual job turnaround time, throughput, fairness, capacity guarantees, etc.).
 * Several key metrics of the scheduler algorithm, such as the time cost of each scheduler operation (allocate, handle, etc.), which Hadoop developers can use to find hot spots in the code and scalability limits.
 The simulator will provide real-time charts showing the behavior and performance of the scheduler.
 A short demo is available at http://www.youtube.com/watch?v=6thLi8q0qLE, showing how to use the simulator to simulate the Fair Scheduler and the Capacity Scheduler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1226) Inconsistent hostname leads to low data locality

2013-09-24 Thread Kaibo Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaibo Zhou updated YARN-1226:
-

Summary: Inconsistent hostname leads to low data locality  (was: 
Inconsistent hostname leads to poor data locality)

 Inconsistent hostname leads to low data locality
 

 Key: YARN-1226
 URL: https://issues.apache.org/jira/browse/YARN-1226
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler
Affects Versions: 0.23.3, 2.0.0-alpha, 2.1.0-beta
Reporter: Kaibo Zhou

 When I run a MapReduce job that uses TableInputFormat to scan an HBase table on a YARN cluster with 140+ nodes, I consistently get very low data locality, around 0~10%.
 The scheduler is the Capacity Scheduler. HBase and Hadoop are integrated in the cluster, with NodeManager, DataNode, and HRegionServer running on the same node.
 The reason for the low data locality is that most machines in the cluster use IPv6 and only a few use IPv4. NodeManager uses InetAddress.getLocalHost().getHostName() to get the host name, but the result of this call depends on whether IPv4 or IPv6 is in use; see [InetAddress.getLocalHost().getHostName() returns FQDN|http://bugs.sun.com/view_bug.do?bug_id=7166687].
 On machines with IPv4, NodeManager gets the hostName as search042097.sqa.cm4.site.net.
 But on machines with IPv6, NodeManager gets the hostName as search042097.sqa.cm4; if run with IPv6 disabled (-Djava.net.preferIPv4Stack=true), it returns search042097.sqa.cm4.site.net.

 For the MapReduce job that scans the HBase table, the InputSplit contains node locations as [FQDNs|http://en.wikipedia.org/wiki/FQDN], e.g. search042097.sqa.cm4.site.net, because in HBase the RegionServers' hostnames are assigned by the HMaster. The HMaster communicates with the RegionServers and obtains each region server's host name using Java NIO: clientChannel.socket().getInetAddress().getHostName().
 Also see the startup log of the region server:
 13:06:21,200 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us hostname to use. Was=search042024.sqa.cm4, Now=search042024.sqa.cm4.site.net

 As you can see, most machines in the YARN cluster, which use IPv6, get the short hostname, but HBase always gets the full hostname, so the hosts cannot be matched (see RMContainerAllocator::assignToMap). This leads to poor locality.
 After I used java.net.preferIPv4Stack to force IPv4 in YARN, I got 70+% data locality in the cluster.
 Thanks,
 Kaibo
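
A quick way to see which of the two hostnames a given machine reports is to check it directly on that host. This is an illustrative snippet only (the HostNameCheck class is hypothetical, not part of YARN or HBase):

{code}
// Illustrative check only (not part of YARN or HBase): prints the hostname the
// JVM resolves on this machine, which is what the NodeManager ends up registering.
import java.net.InetAddress;
import java.net.UnknownHostException;

public class HostNameCheck {
  public static void main(String[] args) throws UnknownHostException {
    InetAddress local = InetAddress.getLocalHost();
    System.out.println("hostName  = " + local.getHostName());
    System.out.println("canonical = " + local.getCanonicalHostName());
  }
}
{code}

Running it once normally and once with -Djava.net.preferIPv4Stack=true should show the short versus fully qualified name described above.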

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1226) Inconsistent hostname leads to poor data locality

2013-09-24 Thread Kaibo Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaibo Zhou updated YARN-1226:
-

Summary: Inconsistent hostname leads to poor data locality  (was: ipv4 and 
ipv6 lead to poor data locality)

 Inconsistent hostname leads to poor data locality
 -

 Key: YARN-1226
 URL: https://issues.apache.org/jira/browse/YARN-1226
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler
Affects Versions: 0.23.3, 2.0.0-alpha, 2.1.0-beta
Reporter: Kaibo Zhou

 When I run a MapReduce job that uses TableInputFormat to scan an HBase table on a YARN cluster with 140+ nodes, I consistently get very low data locality, around 0~10%.
 The scheduler is the Capacity Scheduler. HBase and Hadoop are integrated in the cluster, with NodeManager, DataNode, and HRegionServer running on the same node.
 The reason for the low data locality is that most machines in the cluster use IPv6 and only a few use IPv4. NodeManager uses InetAddress.getLocalHost().getHostName() to get the host name, but the result of this call depends on whether IPv4 or IPv6 is in use; see [InetAddress.getLocalHost().getHostName() returns FQDN|http://bugs.sun.com/view_bug.do?bug_id=7166687].
 On machines with IPv4, NodeManager gets the hostName as search042097.sqa.cm4.site.net.
 But on machines with IPv6, NodeManager gets the hostName as search042097.sqa.cm4; if run with IPv6 disabled (-Djava.net.preferIPv4Stack=true), it returns search042097.sqa.cm4.site.net.

 For the MapReduce job that scans the HBase table, the InputSplit contains node locations as [FQDNs|http://en.wikipedia.org/wiki/FQDN], e.g. search042097.sqa.cm4.site.net, because in HBase the RegionServers' hostnames are assigned by the HMaster. The HMaster communicates with the RegionServers and obtains each region server's host name using Java NIO: clientChannel.socket().getInetAddress().getHostName().
 Also see the startup log of the region server:
 13:06:21,200 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us hostname to use. Was=search042024.sqa.cm4, Now=search042024.sqa.cm4.site.net

 As you can see, most machines in the YARN cluster, which use IPv6, get the short hostname, but HBase always gets the full hostname, so the hosts cannot be matched (see RMContainerAllocator::assignToMap). This leads to poor locality.
 After I used java.net.preferIPv4Stack to force IPv4 in YARN, I got 70+% data locality in the cluster.
 Thanks,
 Kaibo

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start

2013-09-24 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776072#comment-13776072
 ] 

Siddharth Seth commented on YARN-1229:
--

I'm in favour of renaming the shuffle service id as well, and enforcing 
constraints on the names. Shell parameters apparently have name restrictions - 
http://stackoverflow.com/questions/2821043/allowed-characters-in-linux-environment-variable-names
 has some links to standards. Setting aux-service name restrictions based on 
shell name restrictions seems ok to me.

This is an incompatible change though. Sites which have Hadoop 2 (or 0.23) 
deployed would need to change their configs to reflect the shuffle service name 
update. (The shuffleService isn't started when using the default hadoop 
configuration files).

An alternative could be to use base32 encoding for the service name - but I would prefer not going there.

 Shell$ExitCodeException could happen if AM fails to start
 -

 Key: YARN-1229
 URL: https://issues.apache.org/jira/browse/YARN-1229
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
Priority: Blocker
 Fix For: 2.1.1-beta


 I run a sleep job. If the AM fails to start, this exception can occur:
 13/09/20 11:00:23 INFO mapreduce.Job: Job job_1379673267098_0020 failed with 
 state FAILED due to: Application application_1379673267098_0020 failed 1 
 times due to AM Container for appattempt_1379673267098_0020_01 exited 
 with  exitCode: 1 due to: Exception from container-launch:
 org.apache.hadoop.util.Shell$ExitCodeException: 
 /myappcache/application_1379673267098_0020/container_1379673267098_0020_01_01/launch_container.sh:
  line 12: export: 
 `NM_AUX_SERVICE_mapreduce.shuffle=AAA0+gA=
 ': not a valid identifier
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
 at org.apache.hadoop.util.Shell.run(Shell.java:379)
 at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
 at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:270)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:78)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 .Failing this attempt.. Failing the application.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1156) Change NodeManager AllocatedGB and AvailableGB metrics to show decimal values

2013-09-24 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated YARN-1156:


Assignee: Tsuyoshi OZAWA

 Change NodeManager AllocatedGB and AvailableGB metrics to show decimal values
 -

 Key: YARN-1156
 URL: https://issues.apache.org/jira/browse/YARN-1156
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.1.0-beta
Reporter: Akira AJISAKA
Assignee: Tsuyoshi OZAWA
Priority: Minor
  Labels: metrics, newbie
 Fix For: 2.3.0

 Attachments: YARN-1156.1.patch


 The AllocatedGB and AvailableGB metrics are currently integer-typed. If 500MB of memory is allocated to containers four times, AllocatedGB is incremented four times by {{(int)500/1024}}, which is 0. That is, the memory actually allocated is 2000MB, but the metric shows 0GB. Let's use a float type for these metrics.
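
A tiny sketch of the arithmetic in question (illustrative only, not the actual NodeManagerMetrics code):

{code}
// Illustrative only (not the actual NodeManager metrics code): integer GB metrics
// lose sub-GB allocations to integer division, while float metrics do not.
public class GbMetricDemo {
  public static void main(String[] args) {
    int allocatedGbInt = 0;       // current behaviour: integer-typed GB metric
    float allocatedGbFloat = 0f;  // proposed behaviour: float-typed GB metric
    for (int i = 0; i < 4; i++) { // four allocations of 500MB each
      allocatedGbInt += 500 / 1024;      // integer division adds 0 every time
      allocatedGbFloat += 500 / 1024f;   // float division adds ~0.488 every time
    }
    System.out.println("int   AllocatedGB = " + allocatedGbInt);    // prints 0
    System.out.println("float AllocatedGB = " + allocatedGbFloat);  // prints ~1.953 (2000MB)
  }
}
{code}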

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-90) NodeManager should identify failed disks becoming good back again

2013-09-24 Thread nijel (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-90?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776113#comment-13776113
 ] 

nijel commented on YARN-90:
---

To handle this, we can check the failed dirs first in DirectoryCollection.checkDirs() and add them back to localDirs if the directories have recovered from the error.
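
A rough sketch of what that could look like (hypothetical and simplified; the real DirectoryCollection fields and health checks differ):

{code}
// Hypothetical, simplified sketch; shown only to illustrate the
// "re-add recovered dirs" idea, not the actual DirectoryCollection code.
import java.io.File;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class DirectoryCollectionSketch {
  private final List<String> localDirs = new ArrayList<String>();
  private final List<String> failedDirs = new ArrayList<String>();

  /** Before the usual health check, retry previously failed dirs. */
  void checkDirs() {
    for (Iterator<String> it = failedDirs.iterator(); it.hasNext();) {
      String dir = it.next();
      if (isUsable(dir)) {      // the directory has recovered from its earlier error
        it.remove();
        localDirs.add(dir);     // make it available to the NodeManager again
      }
    }
    // ... existing logic that moves newly failed dirs from localDirs to failedDirs ...
  }

  private boolean isUsable(String dir) {
    File f = new File(dir);
    return f.isDirectory() && f.canRead() && f.canWrite() && f.canExecute();
  }
}
{code}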


 NodeManager should identify failed disks becoming good back again
 -

 Key: YARN-90
 URL: https://issues.apache.org/jira/browse/YARN-90
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Reporter: Ravi Gummadi

 MAPREDUCE-3121 makes the NodeManager identify disk failures. But once a disk goes down, it is marked as failed forever. To reuse that disk (after it becomes good again), the NodeManager needs a restart. This JIRA is to improve the NodeManager so it can reuse good disks (which may have been bad some time back).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1226) Inconsistent hostname leads to low data locality

2013-09-24 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776128#comment-13776128
 ] 

Steve Loughran commented on YARN-1226:
--

Right now "Don't use IPv6" is one of those installation rules: 
[http://wiki.apache.org/hadoop/HadoopIPv6], precisely because of issues with IPv6 
in Java.

Now, if there are some bits of code that could be changed to make things work 
slightly better, they'd be welcome, but right now the focus is on IPv4 - if this 
is an IPv6 problem, it's going to get low priority.

 Inconsistent hostname leads to low data locality
 

 Key: YARN-1226
 URL: https://issues.apache.org/jira/browse/YARN-1226
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler
Affects Versions: 0.23.3, 2.0.0-alpha, 2.1.0-beta
Reporter: Kaibo Zhou

 When I run a MapReduce job that uses TableInputFormat to scan an HBase table on a YARN cluster with 140+ nodes, I consistently get very low data locality, around 0~10%.
 The scheduler is the Capacity Scheduler. HBase and Hadoop are integrated in the cluster, with NodeManager, DataNode, and HRegionServer running on the same node.
 The reason for the low data locality is that most machines in the cluster use IPv6 and only a few use IPv4. NodeManager uses InetAddress.getLocalHost().getHostName() to get the host name, but the result of this call depends on whether IPv4 or IPv6 is in use; see [InetAddress.getLocalHost().getHostName() returns FQDN|http://bugs.sun.com/view_bug.do?bug_id=7166687].
 On machines with IPv4, NodeManager gets the hostName as search042097.sqa.cm4.site.net.
 But on machines with IPv6, NodeManager gets the hostName as search042097.sqa.cm4; if run with IPv6 disabled (-Djava.net.preferIPv4Stack=true), it returns search042097.sqa.cm4.site.net.

 For the MapReduce job that scans the HBase table, the InputSplit contains node locations as [FQDNs|http://en.wikipedia.org/wiki/FQDN], e.g. search042097.sqa.cm4.site.net, because in HBase the RegionServers' hostnames are assigned by the HMaster. The HMaster communicates with the RegionServers and obtains each region server's host name using Java NIO: clientChannel.socket().getInetAddress().getHostName().
 Also see the startup log of the region server:
 13:06:21,200 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us hostname to use. Was=search042024.sqa.cm4, Now=search042024.sqa.cm4.site.net

 As you can see, most machines in the YARN cluster, which use IPv6, get the short hostname, but HBase always gets the full hostname, so the hosts cannot be matched (see RMContainerAllocator::assignToMap). This leads to poor locality.
 After I used java.net.preferIPv4Stack to force IPv4 in YARN, I got 70+% data locality in the cluster.
 Thanks,
 Kaibo

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1226) Inconsistent hostname leads to low data locality on IPv6 hosts

2013-09-24 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated YARN-1226:
-

Environment: Linux, IPv6
Summary: Inconsistent hostname leads to low data locality on IPv6 hosts 
 (was: Inconsistent hostname leads to low data locality)

 Inconsistent hostname leads to low data locality on IPv6 hosts
 --

 Key: YARN-1226
 URL: https://issues.apache.org/jira/browse/YARN-1226
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler
Affects Versions: 0.23.3, 2.0.0-alpha, 2.1.0-beta
 Environment: Linux, IPv6
Reporter: Kaibo Zhou

 When I run a MapReduce job that uses TableInputFormat to scan an HBase table on a YARN cluster with 140+ nodes, I consistently get very low data locality, around 0~10%.
 The scheduler is the Capacity Scheduler. HBase and Hadoop are integrated in the cluster, with NodeManager, DataNode, and HRegionServer running on the same node.
 The reason for the low data locality is that most machines in the cluster use IPv6 and only a few use IPv4. NodeManager uses InetAddress.getLocalHost().getHostName() to get the host name, but the result of this call depends on whether IPv4 or IPv6 is in use; see [InetAddress.getLocalHost().getHostName() returns FQDN|http://bugs.sun.com/view_bug.do?bug_id=7166687].
 On machines with IPv4, NodeManager gets the hostName as search042097.sqa.cm4.site.net.
 But on machines with IPv6, NodeManager gets the hostName as search042097.sqa.cm4; if run with IPv6 disabled (-Djava.net.preferIPv4Stack=true), it returns search042097.sqa.cm4.site.net.

 For the MapReduce job that scans the HBase table, the InputSplit contains node locations as [FQDNs|http://en.wikipedia.org/wiki/FQDN], e.g. search042097.sqa.cm4.site.net, because in HBase the RegionServers' hostnames are assigned by the HMaster. The HMaster communicates with the RegionServers and obtains each region server's host name using Java NIO: clientChannel.socket().getInetAddress().getHostName().
 Also see the startup log of the region server:
 13:06:21,200 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us hostname to use. Was=search042024.sqa.cm4, Now=search042024.sqa.cm4.site.net

 As you can see, most machines in the YARN cluster, which use IPv6, get the short hostname, but HBase always gets the full hostname, so the hosts cannot be matched (see RMContainerAllocator::assignToMap). This leads to poor locality.
 After I used java.net.preferIPv4Stack to force IPv4 in YARN, I got 70+% data locality in the cluster.
 Thanks,
 Kaibo

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1021) Yarn Scheduler Load Simulator

2013-09-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776142#comment-13776142
 ] 

Hadoop QA commented on YARN-1021:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12604747/YARN-1021.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 11 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 1149 javac 
compiler warnings (more than the trunk's current 1145 warnings).

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-assemblies hadoop-tools/hadoop-sls hadoop-tools/hadoop-tools-dist.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1995//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/1995//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1995//console

This message is automatically generated.

 Yarn Scheduler Load Simulator
 -

 Key: YARN-1021
 URL: https://issues.apache.org/jira/browse/YARN-1021
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: scheduler
Reporter: Wei Yan
Assignee: Wei Yan
 Attachments: YARN-1021-demo.tar.gz, YARN-1021-images.tar.gz, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.pdf


 The YARN scheduler is a fertile area of interest, with several different implementations, e.g., the FIFO, Capacity, and Fair schedulers. Meanwhile, various optimizations are made to improve scheduler performance for different scenarios and workloads. Each scheduler algorithm has its own set of features and drives scheduling decisions by many factors, such as fairness, capacity guarantees, resource availability, etc. It is very important to evaluate a scheduler algorithm thoroughly before deploying it in a production cluster. Unfortunately, evaluating a scheduling algorithm is currently non-trivial: evaluating in a real cluster is time- and cost-consuming, and it is also very hard to find a large enough cluster. Hence, a simulator that can predict how well a scheduler algorithm performs for a specific workload would be quite useful.
 We want to build a Scheduler Load Simulator that simulates large-scale YARN clusters and application loads on a single machine. This would be invaluable in furthering YARN by providing a tool for researchers and developers to prototype new scheduler features and predict their behavior and performance with a reasonable amount of confidence, thereby aiding rapid innovation.
 The simulator will exercise the real YARN ResourceManager while removing the network factor, by simulating NodeManagers and ApplicationMasters and handling and dispatching NM/AM heartbeat events from within the same JVM.
 To keep track of scheduler behavior and performance, a scheduler wrapper will wrap the real scheduler.
 The simulator will produce real-time metrics while executing, including:
 * Resource usage for the whole cluster and for each queue, which can be used to configure cluster and queue capacities.
 * A detailed application execution trace (recorded in relation to simulated time), which can be analyzed to understand and validate scheduler behavior (individual job turnaround time, throughput, fairness, capacity guarantees, etc.).
 * Several key metrics of the scheduler algorithm, such as the time cost of each scheduler operation (allocate, handle, etc.), which Hadoop developers can use to find hot spots in the code and scalability limits.
 The simulator will provide real-time charts showing the behavior and performance of the scheduler.
 A short demo is available at http://www.youtube.com/watch?v=6thLi8q0qLE, showing how to use the simulator to simulate the Fair Scheduler and the Capacity Scheduler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more 

[jira] [Updated] (YARN-1021) Yarn Scheduler Load Simulator

2013-09-24 Thread Wei Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Yan updated YARN-1021:
--

Attachment: YARN-1021.patch

 Yarn Scheduler Load Simulator
 -

 Key: YARN-1021
 URL: https://issues.apache.org/jira/browse/YARN-1021
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: scheduler
Reporter: Wei Yan
Assignee: Wei Yan
 Attachments: YARN-1021-demo.tar.gz, YARN-1021-images.tar.gz, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.pdf


 The YARN scheduler is a fertile area of interest, with several different implementations, e.g., the FIFO, Capacity, and Fair schedulers. Meanwhile, various optimizations are made to improve scheduler performance for different scenarios and workloads. Each scheduler algorithm has its own set of features and drives scheduling decisions by many factors, such as fairness, capacity guarantees, resource availability, etc. It is very important to evaluate a scheduler algorithm thoroughly before deploying it in a production cluster. Unfortunately, evaluating a scheduling algorithm is currently non-trivial: evaluating in a real cluster is time- and cost-consuming, and it is also very hard to find a large enough cluster. Hence, a simulator that can predict how well a scheduler algorithm performs for a specific workload would be quite useful.
 We want to build a Scheduler Load Simulator that simulates large-scale YARN clusters and application loads on a single machine. This would be invaluable in furthering YARN by providing a tool for researchers and developers to prototype new scheduler features and predict their behavior and performance with a reasonable amount of confidence, thereby aiding rapid innovation.
 The simulator will exercise the real YARN ResourceManager while removing the network factor, by simulating NodeManagers and ApplicationMasters and handling and dispatching NM/AM heartbeat events from within the same JVM.
 To keep track of scheduler behavior and performance, a scheduler wrapper will wrap the real scheduler.
 The simulator will produce real-time metrics while executing, including:
 * Resource usage for the whole cluster and for each queue, which can be used to configure cluster and queue capacities.
 * A detailed application execution trace (recorded in relation to simulated time), which can be analyzed to understand and validate scheduler behavior (individual job turnaround time, throughput, fairness, capacity guarantees, etc.).
 * Several key metrics of the scheduler algorithm, such as the time cost of each scheduler operation (allocate, handle, etc.), which Hadoop developers can use to find hot spots in the code and scalability limits.
 The simulator will provide real-time charts showing the behavior and performance of the scheduler.
 A short demo is available at http://www.youtube.com/watch?v=6thLi8q0qLE, showing how to use the simulator to simulate the Fair Scheduler and the Capacity Scheduler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1021) Yarn Scheduler Load Simulator

2013-09-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776173#comment-13776173
 ] 

Hadoop QA commented on YARN-1021:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12604767/YARN-1021.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 11 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-assemblies hadoop-tools/hadoop-sls hadoop-tools/hadoop-tools-dist:

  org.apache.hadoop.yarn.sls.TestSLSRunner

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1996//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1996//console

This message is automatically generated.

 Yarn Scheduler Load Simulator
 -

 Key: YARN-1021
 URL: https://issues.apache.org/jira/browse/YARN-1021
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: scheduler
Reporter: Wei Yan
Assignee: Wei Yan
 Attachments: YARN-1021-demo.tar.gz, YARN-1021-images.tar.gz, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.pdf


 The YARN scheduler is a fertile area of interest, with several different implementations, e.g., the FIFO, Capacity, and Fair schedulers. Meanwhile, various optimizations are made to improve scheduler performance for different scenarios and workloads. Each scheduler algorithm has its own set of features and drives scheduling decisions by many factors, such as fairness, capacity guarantees, resource availability, etc. It is very important to evaluate a scheduler algorithm thoroughly before deploying it in a production cluster. Unfortunately, evaluating a scheduling algorithm is currently non-trivial: evaluating in a real cluster is time- and cost-consuming, and it is also very hard to find a large enough cluster. Hence, a simulator that can predict how well a scheduler algorithm performs for a specific workload would be quite useful.
 We want to build a Scheduler Load Simulator that simulates large-scale YARN clusters and application loads on a single machine. This would be invaluable in furthering YARN by providing a tool for researchers and developers to prototype new scheduler features and predict their behavior and performance with a reasonable amount of confidence, thereby aiding rapid innovation.
 The simulator will exercise the real YARN ResourceManager while removing the network factor, by simulating NodeManagers and ApplicationMasters and handling and dispatching NM/AM heartbeat events from within the same JVM.
 To keep track of scheduler behavior and performance, a scheduler wrapper will wrap the real scheduler.
 The simulator will produce real-time metrics while executing, including:
 * Resource usage for the whole cluster and for each queue, which can be used to configure cluster and queue capacities.
 * A detailed application execution trace (recorded in relation to simulated time), which can be analyzed to understand and validate scheduler behavior (individual job turnaround time, throughput, fairness, capacity guarantees, etc.).
 * Several key metrics of the scheduler algorithm, such as the time cost of each scheduler operation (allocate, handle, etc.), which Hadoop developers can use to find hot spots in the code and scalability limits.
 The simulator will provide real-time charts showing the behavior and performance of the scheduler.
 A short demo is available at http://www.youtube.com/watch?v=6thLi8q0qLE, showing how to use the simulator to simulate the Fair Scheduler and the Capacity Scheduler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1231) Fix test cases that will hit max-am-used-resources-percent limit after YARN-276

2013-09-24 Thread Nemon Lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated YARN-1231:


Attachment: YARN-1231.patch

A patch fixing test cases in hadoop-yarn-server-resourcemanager project.

 Fix test cases that will hit max-am-used-resources-percent limit after 
 YARN-276
 

 Key: YARN-1231
 URL: https://issues.apache.org/jira/browse/YARN-1231
 Project: Hadoop YARN
  Issue Type: Task
Affects Versions: 2.1.1-beta
Reporter: Nemon Lou
Assignee: Nemon Lou
  Labels: test
 Attachments: YARN-1231.patch


 This is a separate JIRA to fix the YARN test cases that will fail by hitting the max-am-used-resources-percent limit after YARN-276.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1231) Fix test cases that will hit max-am-used-resources-percent limit after YARN-276

2013-09-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776302#comment-13776302
 ] 

Hadoop QA commented on YARN-1231:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12604791/YARN-1231.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 10 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1997//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1997//console

This message is automatically generated.

 Fix test cases that will hit max-am-used-resources-percent limit after 
 YARN-276
 

 Key: YARN-1231
 URL: https://issues.apache.org/jira/browse/YARN-1231
 Project: Hadoop YARN
  Issue Type: Task
Affects Versions: 2.1.1-beta
Reporter: Nemon Lou
Assignee: Nemon Lou
  Labels: test
 Attachments: YARN-1231.patch


 This is a separate JIRA to fix the YARN test cases that will fail by hitting the max-am-used-resources-percent limit after YARN-276.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1021) Yarn Scheduler Load Simulator

2013-09-24 Thread Wei Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Yan updated YARN-1021:
--

Attachment: YARN-1021.patch

 Yarn Scheduler Load Simulator
 -

 Key: YARN-1021
 URL: https://issues.apache.org/jira/browse/YARN-1021
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: scheduler
Reporter: Wei Yan
Assignee: Wei Yan
 Attachments: YARN-1021-demo.tar.gz, YARN-1021-images.tar.gz, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.pdf


 The YARN scheduler is a fertile area of interest, with several different implementations, e.g., the FIFO, Capacity, and Fair schedulers. Meanwhile, various optimizations are made to improve scheduler performance for different scenarios and workloads. Each scheduler algorithm has its own set of features and drives scheduling decisions by many factors, such as fairness, capacity guarantees, resource availability, etc. It is very important to evaluate a scheduler algorithm thoroughly before deploying it in a production cluster. Unfortunately, evaluating a scheduling algorithm is currently non-trivial: evaluating in a real cluster is time- and cost-consuming, and it is also very hard to find a large enough cluster. Hence, a simulator that can predict how well a scheduler algorithm performs for a specific workload would be quite useful.
 We want to build a Scheduler Load Simulator that simulates large-scale YARN clusters and application loads on a single machine. This would be invaluable in furthering YARN by providing a tool for researchers and developers to prototype new scheduler features and predict their behavior and performance with a reasonable amount of confidence, thereby aiding rapid innovation.
 The simulator will exercise the real YARN ResourceManager while removing the network factor, by simulating NodeManagers and ApplicationMasters and handling and dispatching NM/AM heartbeat events from within the same JVM.
 To keep track of scheduler behavior and performance, a scheduler wrapper will wrap the real scheduler.
 The simulator will produce real-time metrics while executing, including:
 * Resource usage for the whole cluster and for each queue, which can be used to configure cluster and queue capacities.
 * A detailed application execution trace (recorded in relation to simulated time), which can be analyzed to understand and validate scheduler behavior (individual job turnaround time, throughput, fairness, capacity guarantees, etc.).
 * Several key metrics of the scheduler algorithm, such as the time cost of each scheduler operation (allocate, handle, etc.), which Hadoop developers can use to find hot spots in the code and scalability limits.
 The simulator will provide real-time charts showing the behavior and performance of the scheduler.
 A short demo is available at http://www.youtube.com/watch?v=6thLi8q0qLE, showing how to use the simulator to simulate the Fair Scheduler and the Capacity Scheduler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start

2013-09-24 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776350#comment-13776350
 ] 

Chris Nauroth commented on YARN-1229:
-

BTW, if we use {{[a-zA-Z_]+[a-zA-Z0-9_]*}}, then that will be compatible with 
Windows too.  It looks like Windows actually allows many more characters than 
that, but I think it makes sense to stick to a minimal set that we expect to 
work cross-platform.
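
As a purely illustrative sketch (not the actual NodeManager code), enforcing that pattern on aux-service names could look like this:

{code}
// Hypothetical sketch: reject aux-service names that would not be valid
// shell/Windows environment variable names.
import java.util.regex.Pattern;

class AuxServiceNameValidator {
  private static final Pattern VALID_NAME = Pattern.compile("[a-zA-Z_]+[a-zA-Z0-9_]*");

  static void checkName(String name) {
    if (!VALID_NAME.matcher(name).matches()) {
      throw new IllegalArgumentException("Invalid aux-service name '" + name
          + "': must match " + VALID_NAME.pattern());
    }
  }
}
{code}

Under such a rule, a name like mapreduce_shuffle would pass while mapreduce.shuffle would be rejected up front, instead of failing later in launch_container.sh as in the error below.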

 Shell$ExitCodeException could happen if AM fails to start
 -

 Key: YARN-1229
 URL: https://issues.apache.org/jira/browse/YARN-1229
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
Priority: Blocker
 Fix For: 2.1.1-beta


 I run a sleep job. If the AM fails to start, this exception can occur:
 13/09/20 11:00:23 INFO mapreduce.Job: Job job_1379673267098_0020 failed with 
 state FAILED due to: Application application_1379673267098_0020 failed 1 
 times due to AM Container for appattempt_1379673267098_0020_01 exited 
 with  exitCode: 1 due to: Exception from container-launch:
 org.apache.hadoop.util.Shell$ExitCodeException: 
 /myappcache/application_1379673267098_0020/container_1379673267098_0020_01_01/launch_container.sh:
  line 12: export: 
 `NM_AUX_SERVICE_mapreduce.shuffle=AAA0+gA=
 ': not a valid identifier
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
 at org.apache.hadoop.util.Shell.run(Shell.java:379)
 at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
 at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:270)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:78)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 .Failing this attempt.. Failing the application.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1021) Yarn Scheduler Load Simulator

2013-09-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776357#comment-13776357
 ] 

Hadoop QA commented on YARN-1021:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12604801/YARN-1021.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 11 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-assemblies hadoop-tools/hadoop-sls hadoop-tools/hadoop-tools-dist.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1998//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1998//console

This message is automatically generated.

 Yarn Scheduler Load Simulator
 -

 Key: YARN-1021
 URL: https://issues.apache.org/jira/browse/YARN-1021
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: scheduler
Reporter: Wei Yan
Assignee: Wei Yan
 Attachments: YARN-1021-demo.tar.gz, YARN-1021-images.tar.gz, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.pdf


 The YARN scheduler is a fertile area of interest, with several different implementations, e.g., the FIFO, Capacity, and Fair schedulers. Meanwhile, various optimizations are made to improve scheduler performance for different scenarios and workloads. Each scheduler algorithm has its own set of features and drives scheduling decisions by many factors, such as fairness, capacity guarantees, resource availability, etc. It is very important to evaluate a scheduler algorithm thoroughly before deploying it in a production cluster. Unfortunately, evaluating a scheduling algorithm is currently non-trivial: evaluating in a real cluster is time- and cost-consuming, and it is also very hard to find a large enough cluster. Hence, a simulator that can predict how well a scheduler algorithm performs for a specific workload would be quite useful.
 We want to build a Scheduler Load Simulator that simulates large-scale YARN clusters and application loads on a single machine. This would be invaluable in furthering YARN by providing a tool for researchers and developers to prototype new scheduler features and predict their behavior and performance with a reasonable amount of confidence, thereby aiding rapid innovation.
 The simulator will exercise the real YARN ResourceManager while removing the network factor, by simulating NodeManagers and ApplicationMasters and handling and dispatching NM/AM heartbeat events from within the same JVM.
 To keep track of scheduler behavior and performance, a scheduler wrapper will wrap the real scheduler.
 The simulator will produce real-time metrics while executing, including:
 * Resource usage for the whole cluster and for each queue, which can be used to configure cluster and queue capacities.
 * A detailed application execution trace (recorded in relation to simulated time), which can be analyzed to understand and validate scheduler behavior (individual job turnaround time, throughput, fairness, capacity guarantees, etc.).
 * Several key metrics of the scheduler algorithm, such as the time cost of each scheduler operation (allocate, handle, etc.), which Hadoop developers can use to find hot spots in the code and scalability limits.
 The simulator will provide real-time charts showing the behavior and performance of the scheduler.
 A short demo is available at http://www.youtube.com/watch?v=6thLi8q0qLE, showing how to use the simulator to simulate the Fair Scheduler and the Capacity Scheduler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1021) Yarn Scheduler Load Simulator

2013-09-24 Thread Wei Yan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776364#comment-13776364
 ] 

Wei Yan commented on YARN-1021:
---

Updated with a new patch according to [~tucu00]'s latest comments.
The simulator now also supports two types of input:
(1) Rumen traces, so users can feed their Rumen traces directly to the simulator.
(2) The simulator's own trace format (sls), which is much simpler and lets users easily generate various workloads. The simulator also provides a tool to help users convert Rumen traces to sls traces.

 Yarn Scheduler Load Simulator
 -

 Key: YARN-1021
 URL: https://issues.apache.org/jira/browse/YARN-1021
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: scheduler
Reporter: Wei Yan
Assignee: Wei Yan
 Attachments: YARN-1021-demo.tar.gz, YARN-1021-images.tar.gz, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.pdf


 The YARN scheduler is a fertile area of interest, with several different implementations, e.g., the FIFO, Capacity, and Fair schedulers. Meanwhile, various optimizations are made to improve scheduler performance for different scenarios and workloads. Each scheduler algorithm has its own set of features and drives scheduling decisions by many factors, such as fairness, capacity guarantees, resource availability, etc. It is very important to evaluate a scheduler algorithm thoroughly before deploying it in a production cluster. Unfortunately, evaluating a scheduling algorithm is currently non-trivial: evaluating in a real cluster is time- and cost-consuming, and it is also very hard to find a large enough cluster. Hence, a simulator that can predict how well a scheduler algorithm performs for a specific workload would be quite useful.
 We want to build a Scheduler Load Simulator that simulates large-scale YARN clusters and application loads on a single machine. This would be invaluable in furthering YARN by providing a tool for researchers and developers to prototype new scheduler features and predict their behavior and performance with a reasonable amount of confidence, thereby aiding rapid innovation.
 The simulator will exercise the real YARN ResourceManager while removing the network factor, by simulating NodeManagers and ApplicationMasters and handling and dispatching NM/AM heartbeat events from within the same JVM.
 To keep track of scheduler behavior and performance, a scheduler wrapper will wrap the real scheduler.
 The simulator will produce real-time metrics while executing, including:
 * Resource usage for the whole cluster and for each queue, which can be used to configure cluster and queue capacities.
 * A detailed application execution trace (recorded in relation to simulated time), which can be analyzed to understand and validate scheduler behavior (individual job turnaround time, throughput, fairness, capacity guarantees, etc.).
 * Several key metrics of the scheduler algorithm, such as the time cost of each scheduler operation (allocate, handle, etc.), which Hadoop developers can use to find hot spots in the code and scalability limits.
 The simulator will provide real-time charts showing the behavior and performance of the scheduler.
 A short demo is available at http://www.youtube.com/watch?v=6thLi8q0qLE, showing how to use the simulator to simulate the Fair Scheduler and the Capacity Scheduler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1021) Yarn Scheduler Load Simulator

2013-09-24 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776367#comment-13776367
 ] 

Alejandro Abdelnur commented on YARN-1021:
--

[~ywskycn], we shouldn't use /tmp as it does not get cleaned up by the build; 
instead we should use a temp subdir under target/, easily done by:

{code}
// requires java.io.File and java.util.UUID
File dir = new File("target", UUID.randomUUID().toString());
dir.mkdirs();
{code}

And the documentation, in the appendix, should have a complete yet simple example 
of an sls JSON input file as a reference.
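For illustration only, a hypothetical sketch of what one entry in such an sls JSON file might look like; the field names and values are assumptions, not necessarily the format the patch's documentation defines:

{code}
{
  "am.type" : "mapreduce",
  "job.start.ms" : 0,
  "job.end.ms" : 95375,
  "job.queue.name" : "sls_queue_1",
  "job.id" : "job_1",
  "job.user" : "default",
  "job.tasks" : [ {
    "container.host" : "/default-rack/node1",
    "container.start.ms" : 6664,
    "container.end.ms" : 23707,
    "container.priority" : 20,
    "container.type" : "map"
  } ]
}
{code}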


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1021) Yarn Scheduler Load Simulator

2013-09-24 Thread Wei Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Yan updated YARN-1021:
--

Attachment: YARN-1021.patch


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1021) Yarn Scheduler Load Simulator

2013-09-24 Thread Wei Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Yan updated YARN-1021:
--

Attachment: (was: YARN-1021.pdf)


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1021) Yarn Scheduler Load Simulator

2013-09-24 Thread Wei Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Yan updated YARN-1021:
--

Attachment: YARN-1021.pdf


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1021) Yarn Scheduler Load Simulator

2013-09-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776441#comment-13776441
 ] 

Hadoop QA commented on YARN-1021:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12604818/YARN-1021.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 11 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-assemblies hadoop-tools/hadoop-sls hadoop-tools/hadoop-tools-dist.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1999//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1999//console

This message is automatically generated.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1204) Need to add https port related property in Yarn

2013-09-24 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776485#comment-13776485
 ] 

Vinod Kumar Vavilapalli commented on YARN-1204:
---

The latest patch looks good to me. +1. Checking this in.

 Need to add https port related property in Yarn
 ---

 Key: YARN-1204
 URL: https://issues.apache.org/jira/browse/YARN-1204
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Yesha Vora
Assignee: Omkar Vinit Joshi
 Attachments: YARN-1204.20131018.1.patch, YARN-1204.20131020.1.patch, 
 YARN-1204.20131020.2.patch, YARN-1204.20131020.3.patch, 
 YARN-1204.20131020.4.patch, YARN-1204.20131023.1.patch


 There is no Yarn property available to configure the https port for the 
 ResourceManager, NodeManager and history server. Currently, Yarn services use 
 the port defined for http [defined by 
 'mapreduce.jobhistory.webapp.address', 'yarn.nodemanager.webapp.address', 
 'yarn.resourcemanager.webapp.address'] for running services over the https 
 protocol.
 Yarn should have a list of properties to assign the https port for RM, NM and JHS.
 They could be like the following:
 yarn.nodemanager.webapp.https.address
 yarn.resourcemanager.webapp.https.address
 mapreduce.jobhistory.webapp.https.address 
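 For illustration, a minimal sketch of how code might read one of the proposed properties once it exists; the property name is the one proposed above, and the default value shown is an assumption:

{code}
import org.apache.hadoop.conf.Configuration;

// Illustrative only: read the proposed RM https webapp address, falling back to
// an assumed default when the property is not set.
public final class HttpsAddressLookup {
  public static String rmHttpsAddress(Configuration conf) {
    return conf.get("yarn.resourcemanager.webapp.https.address", "0.0.0.0:8090");
  }
}
{code}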

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1068) Add admin support for HA operations

2013-09-24 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776484#comment-13776484
 ] 

Karthik Kambatla commented on YARN-1068:


[~bikassaha], when you get a chance, can you review the latest patch? 

 Add admin support for HA operations
 ---

 Key: YARN-1068
 URL: https://issues.apache.org/jira/browse/YARN-1068
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
  Labels: ha
 Attachments: yarn-1068-1.patch, yarn-1068-2.patch, yarn-1068-3.patch, 
 yarn-1068-4.patch, yarn-1068-5.patch, yarn-1068-6.patch, 
 yarn-1068-prelim.patch


 Support HA admin operations to facilitate transitioning the RM to Active and 
 Standby states.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start

2013-09-24 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-1229:


Attachment: YARN-1229.1.patch

The attached patch changes mapreduce.shuffle to MapreduceShuffle. It also enforces 
the check (service names should contain only a-zA-Z0-9) in AuxService.
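A minimal sketch of the kind of check described, written as a standalone helper for illustration; the actual enforcement lives in the NodeManager's aux-services code and may be structured differently:

{code}
import java.util.regex.Pattern;

// Illustrative sketch: reject aux-service names containing anything other than
// a-z, A-Z, 0-9, so the derived environment variable stays a valid shell identifier.
public final class AuxServiceNameCheck {
  private static final Pattern VALID_NAME = Pattern.compile("^[A-Za-z0-9]+$");

  public static void validate(String name) {
    if (name == null || !VALID_NAME.matcher(name).matches()) {
      throw new IllegalArgumentException(
          "Aux service name '" + name + "' must contain only a-zA-Z0-9");
    }
  }
}
{code}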


 Shell$ExitCodeException could happen if AM fails to start
 -

 Key: YARN-1229
 URL: https://issues.apache.org/jira/browse/YARN-1229
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
Priority: Blocker
 Fix For: 2.1.1-beta

 Attachments: YARN-1229.1.patch


 I ran a sleep job. If the AM fails to start, this exception can occur:
 13/09/20 11:00:23 INFO mapreduce.Job: Job job_1379673267098_0020 failed with 
 state FAILED due to: Application application_1379673267098_0020 failed 1 
 times due to AM Container for appattempt_1379673267098_0020_01 exited 
 with  exitCode: 1 due to: Exception from container-launch:
 org.apache.hadoop.util.Shell$ExitCodeException: 
 /myappcache/application_1379673267098_0020/container_1379673267098_0020_01_01/launch_container.sh:
  line 12: export: 
 `NM_AUX_SERVICE_mapreduce.shuffle=AAA0+gA=
 ': not a valid identifier
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
 at org.apache.hadoop.util.Shell.run(Shell.java:379)
 at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
 at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:270)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:78)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 .Failing this attempt.. Failing the application.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1089) Add YARN compute units alongside virtual cores

2013-09-24 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776503#comment-13776503
 ] 

Arun C Murthy commented on YARN-1089:
-

I don't think we should put this in branch-2.1 or target this for hadoop-2.2.

This is a major new feature which can be implemented in a compatible manner - 
let's target this for 2.3.0.

 Add YARN compute units alongside virtual cores
 --

 Key: YARN-1089
 URL: https://issues.apache.org/jira/browse/YARN-1089
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: api
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: YARN-1089-1.patch, YARN-1089.patch


 Based on discussion in YARN-1024, we will add YARN compute units as a 
 resource for requesting and scheduling CPU processing power.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1068) Add admin support for HA operations

2013-09-24 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776502#comment-13776502
 ] 

Alejandro Abdelnur commented on YARN-1068:
--

One nit: in the RMHAProtocolService, the {{serviceStop()}} should be symmetric 
with the start, in the sense that it should do the {{if (haEnabled)}} check to stop 
the HAAdmin server (instead of doing this check in the HAAdmin service itself).
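A minimal sketch of the symmetry being asked for, with assumed field and helper names rather than the actual RMHAProtocolService code:

{code}
// Illustrative only: guard both start and stop of the HAAdmin server with the
// same haEnabled check, instead of pushing the stop-side check into HAAdmin.
public class HaLifecycleSketch {
  private final boolean haEnabled;

  public HaLifecycleSketch(boolean haEnabled) {
    this.haEnabled = haEnabled;
  }

  public void serviceStart() {
    if (haEnabled) {
      startHAAdminServer();
    }
  }

  public void serviceStop() {
    if (haEnabled) {
      stopHAAdminServer();  // mirrors the guard used on the start side
    }
  }

  private void startHAAdminServer() { /* start the HAAdmin RPC server (stub) */ }
  private void stopHAAdminServer()  { /* stop the HAAdmin RPC server (stub) */ }
}
{code}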



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1089) Add YARN compute units alongside virtual cores

2013-09-24 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated YARN-1089:


Target Version/s: 2.3.0  (was: 2.1.1-beta)


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-1232) Configuration support for RM HA

2013-09-24 Thread Karthik Kambatla (JIRA)
Karthik Kambatla created YARN-1232:
--

 Summary: Configuration support for RM HA
 Key: YARN-1232
 URL: https://issues.apache.org/jira/browse/YARN-1232
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla


We should augment the configuration to allow users to specify two RMs and the 
individual RPC addresses for them. This blocks ConfiguredFailoverProxyProvider.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1232) Configuration support for RM HA

2013-09-24 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-1232:
---

Attachment: yarn-1232-1.patch

Patch that adds the configs to YarnConfiguration and hooks them up to RM 
startup and the RMProxy implementation through HAUtil.
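A hypothetical sketch of the kind of settings this enables; apart from yarn.resourcemanager.ha.nodes.id, which is mentioned later in this thread, the property names and values are assumptions for illustration only:

{code}
import org.apache.hadoop.conf.Configuration;

// Illustrative only: two RMs with individual RPC addresses. The exact keys are
// defined by the patch; the ones below are assumptions.
public final class RmHaConfExample {
  public static Configuration example() {
    Configuration conf = new Configuration();
    conf.set("yarn.resourcemanager.ha.nodes", "rm1,rm2");
    conf.set("yarn.resourcemanager.ha.nodes.id", "rm1");
    conf.set("yarn.resourcemanager.address.rm1", "rm1.example.com:8032");
    conf.set("yarn.resourcemanager.address.rm2", "rm2.example.com:8032");
    return conf;
  }
}
{code}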


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1232) Configuration support for RM HA

2013-09-24 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776508#comment-13776508
 ] 

Karthik Kambatla commented on YARN-1232:


Will post another patch that describes these configs in yarn-default.xml. Don't 
think we can have default values for these though.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1028) Add FailoverProxyProvider like capability to RMProxy

2013-09-24 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776509#comment-13776509
 ] 

Karthik Kambatla commented on YARN-1028:


Using the configs introduced in YARN-1232, we should be able to retry alternate 
RMs by setting {{yarn.resourcemanager.ha.nodes.id}}. [~devaraj.k], I hope it is 
okay if I take this up.

 Add FailoverProxyProvider like capability to RMProxy
 

 Key: YARN-1028
 URL: https://issues.apache.org/jira/browse/YARN-1028
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Devaraj K

 RMProxy layer currently abstracts RM discovery and implements it by looking 
 up service information from configuration. Motivated by HDFS and using 
 existing classes from Common, we can add failover proxy providers that may 
 provide RM discovery in extensible ways.
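 Purely as an illustration of the idea (not Hadoop's actual FailoverProxyProvider API), a provider could simply cycle through the configured RM addresses on each failover:

{code}
import java.util.List;

// Illustrative sketch: rotate through the RM addresses found in configuration
// whenever a failover is requested. Names are assumptions, not the real classes.
public class RoundRobinRmAddressProvider {
  private final List<String> rmAddresses;  // e.g. ["rm1:8032", "rm2:8032"]
  private int current = 0;

  public RoundRobinRmAddressProvider(List<String> rmAddresses) {
    this.rmAddresses = rmAddresses;
  }

  public String currentAddress() {
    return rmAddresses.get(current);
  }

  public void performFailover() {
    current = (current + 1) % rmAddresses.size();
  }
}
{code}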

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1204) Need to add https port related property in Yarn

2013-09-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776515#comment-13776515
 ] 

Hudson commented on YARN-1204:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #4462 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4462/])
YARN-1204. Added separate configuration properties for https for RM and NM 
without which servers enabled with https will also start on http ports. 
Contributed by Omkar Vinit Joshi.
MAPREDUCE-5523. Added separate configuration properties for https for JHS 
without which even when https is enabled, it starts on http port itself. 
Contributed by Omkar Vinit Joshi. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525947)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/webapp/AppController.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/webapp/WebAppUtil.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/jobhistory/JHAdminConfig.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/MiniMRYarnCluster.java
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/util
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/util/WebAppUtils.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/conf/TestYarnConfiguration.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/NavBlock.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/WebServer.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/main/java/org/apache/hadoop/yarn/server/webproxy/WebAppProxy.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/main/java/org/apache/hadoop/yarn/server/webproxy/WebAppProxyServlet.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/main/java/org/apache/hadoop/yarn/server/webproxy/amfilter/AmFilterInitializer.java



--
This message is 

[jira] [Updated] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start

2013-09-24 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-1229:


Attachment: YARN-1229.2.patch

Added a test case.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1068) Add admin support for HA operations

2013-09-24 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-1068:
---

Attachment: yarn-1068-7.patch

Thanks [~tucu00]. Updated patch to address the comment.

 Add admin support for HA operations
 ---

 Key: YARN-1068
 URL: https://issues.apache.org/jira/browse/YARN-1068
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
  Labels: ha
 Attachments: yarn-1068-1.patch, yarn-1068-2.patch, yarn-1068-3.patch, 
 yarn-1068-4.patch, yarn-1068-5.patch, yarn-1068-6.patch, yarn-1068-7.patch, 
 yarn-1068-prelim.patch


 Support HA admin operations to facilitate transitioning the RM to Active and 
 Standby states.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start

2013-09-24 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776529#comment-13776529
 ] 

Xuan Gong commented on YARN-1229:
-

Ran the full YARN test suite; all the YARN tests pass.
Ran the full MAPREDUCE test suite; some tests in the mapred package have a timeout 
issue, which I do not think is caused by this patch.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start

2013-09-24 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776546#comment-13776546
 ] 

Bikas Saha commented on YARN-1229:
--

Base32 encoding is a good idea if we don't want to break compatibility. It 
basically boils down to that.

Xuan, the AuxServiceHelper is still using the NM_AUX_SERVICE prefix, which has _ 
in it.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1068) Add admin support for HA operations

2013-09-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776571#comment-13776571
 ] 

Hadoop QA commented on YARN-1068:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12604842/yarn-1068-7.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2000//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2000//console

This message is automatically generated.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-1233) NodeManager doesn't renew krb5 creds

2013-09-24 Thread Allen Wittenauer (JIRA)
Allen Wittenauer created YARN-1233:
--

 Summary: NodeManager doesn't renew krb5 creds
 Key: YARN-1233
 URL: https://issues.apache.org/jira/browse/YARN-1233
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.1.0-beta
Reporter: Allen Wittenauer


In 2.1.0-beta-rc1 (sorry, haven't upgraded yet) the NM is not renewing krb5 
TGTs after they expire.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start

2013-09-24 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-1229:


Attachment: YARN-1229.3.patch

Changed the NM_AUX_SERVICE prefix to NodeManagerAuxService to eliminate the _
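A minimal sketch of the resulting environment variable name under this change; the helper name and exact concatenation are assumptions, not the actual AuxServiceHelper code:

{code}
// Illustrative only: with the underscore-free prefix, a service named
// "MapreduceShuffle" yields "NodeManagerAuxServiceMapreduceShuffle", which is a
// valid shell identifier for `export`.
public final class AuxServiceEnvName {
  public static String envVarFor(String serviceName) {
    return "NodeManagerAuxService" + serviceName;
  }
}
{code}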


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1157) ResourceManager UI has invalid tracking URL link for distributed shell application

2013-09-24 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-1157:


Attachment: YARN-1157.5.patch

Created the patch based on the latest trunk.

 ResourceManager UI has invalid tracking URL link for distributed shell 
 application
 --

 Key: YARN-1157
 URL: https://issues.apache.org/jira/browse/YARN-1157
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
 Fix For: 2.1.1-beta

 Attachments: YARN-1157.1.patch, YARN-1157.2.patch, YARN-1157.2.patch, 
 YARN-1157.3.patch, YARN-1157.4.patch, YARN-1157.5.patch


 Submit a YARN distributed shell application and go to the ResourceManager Web 
 UI. The application appears there. In the Tracking UI column there will be a 
 history link. Click on that link: instead of showing the application master web 
 UI, an HTTP error 500 appears.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start

2013-09-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776598#comment-13776598
 ] 

Hadoop QA commented on YARN-1229:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12604849/YARN-1229.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2002//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2002//console

This message is automatically generated.

 Shell$ExitCodeException could happen if AM fails to start
 -

 Key: YARN-1229
 URL: https://issues.apache.org/jira/browse/YARN-1229
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
Priority: Blocker
 Fix For: 2.1.1-beta

 Attachments: YARN-1229.1.patch, YARN-1229.2.patch, YARN-1229.3.patch


 I ran a sleep job. If the AM fails to start, this exception could occur:
 13/09/20 11:00:23 INFO mapreduce.Job: Job job_1379673267098_0020 failed with 
 state FAILED due to: Application application_1379673267098_0020 failed 1 
 times due to AM Container for appattempt_1379673267098_0020_01 exited 
 with  exitCode: 1 due to: Exception from container-launch:
 org.apache.hadoop.util.Shell$ExitCodeException: 
 /myappcache/application_1379673267098_0020/container_1379673267098_0020_01_01/launch_container.sh:
  line 12: export: 
 `NM_AUX_SERVICE_mapreduce.shuffle=AAA0+gA=
 ': not a valid identifier
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
 at org.apache.hadoop.util.Shell.run(Shell.java:379)
 at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
 at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:270)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:78)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 .Failing this attempt.. Failing the application.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1157) ResourceManager UI has invalid tracking URL link for distributed shell application

2013-09-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776629#comment-13776629
 ] 

Hadoop QA commented on YARN-1157:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12604851/YARN-1157.5.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2003//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2003//console

This message is automatically generated.

 ResourceManager UI has invalid tracking URL link for distributed shell 
 application
 --

 Key: YARN-1157
 URL: https://issues.apache.org/jira/browse/YARN-1157
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
 Fix For: 2.1.1-beta

 Attachments: YARN-1157.1.patch, YARN-1157.2.patch, YARN-1157.2.patch, 
 YARN-1157.3.patch, YARN-1157.4.patch, YARN-1157.5.patch


 Submit a YARN distributed shell application and go to the ResourceManager web 
 UI. The application appears, and the Tracking UI column shows a history link. 
 Clicking that link returns an HTTP 500 error instead of the application master 
 web UI.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1021) Yarn Scheduler Load Simulator

2013-09-24 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776628#comment-13776628
 ] 

Alejandro Abdelnur commented on YARN-1021:
--

+1

 Yarn Scheduler Load Simulator
 -

 Key: YARN-1021
 URL: https://issues.apache.org/jira/browse/YARN-1021
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: scheduler
Reporter: Wei Yan
Assignee: Wei Yan
 Attachments: YARN-1021-demo.tar.gz, YARN-1021-images.tar.gz, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.pdf


 The Yarn Scheduler is a fertile area of interest with different 
 implementations, e.g., Fifo, Capacity and Fair  schedulers. Meanwhile, 
 several optimizations are also made to improve scheduler performance for 
 different scenarios and workload. Each scheduler algorithm has its own set of 
 features, and drives scheduling decisions by many factors, such as fairness, 
 capacity guarantee, resource availability, etc. It is very important to 
 evaluate a scheduler algorithm very well before we deploy it in a production 
 cluster. Unfortunately, currently it is non-trivial to evaluate a scheduling 
 algorithm. Evaluating in a real cluster is always time and cost consuming, 
 and it is also very hard to find a large-enough cluster. Hence, a simulator 
 which can predict how well a scheduler algorithm for some specific workload 
 would be quite useful.
 We want to build a Scheduler Load Simulator to simulate large-scale Yarn 
 clusters and application loads in a single machine. This would be invaluable 
 in furthering Yarn by providing a tool for researchers and developers to 
 prototype new scheduler features and predict their behavior and performance 
 with reasonable amount of confidence, there-by aiding rapid innovation.
 The simulator will exercise the real Yarn ResourceManager removing the 
 network factor by simulating NodeManagers and ApplicationMasters via handling 
 and dispatching NM/AMs heartbeat events from within the same JVM.
 To keep tracking of scheduler behavior and performance, a scheduler wrapper 
 will wrap the real scheduler.
 The simulator will produce real time metrics while executing, including:
 * Resource usages for whole cluster and each queue, which can be utilized to 
 configure cluster and queue's capacity.
 * The detailed application execution trace (recorded in relation to simulated 
 time), which can be analyzed to understand/validate the  scheduler behavior 
 (individual jobs turn around time, throughput, fairness, capacity guarantee, 
 etc).
 * Several key metrics of scheduler algorithm, such as time cost of each 
 scheduler operation (allocate, handle, etc), which can be utilized by Hadoop 
 developers to find the code spots and scalability limits.
 The simulator will provide real time charts showing the behavior of the 
 scheduler and its performance.
 A short demo is available http://www.youtube.com/watch?v=6thLi8q0qLE, showing 
 how to use simulator to simulate Fair Scheduler and Capacity Scheduler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1157) ResourceManager UI has invalid tracking URL link for distributed shell application

2013-09-24 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-1157:


Attachment: YARN-1157.6.patch

Adding more comments in RegisterApplicationMasterRequest and 
FinishApplicationMasterRequest

 ResourceManager UI has invalid tracking URL link for distributed shell 
 application
 --

 Key: YARN-1157
 URL: https://issues.apache.org/jira/browse/YARN-1157
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
 Fix For: 2.1.1-beta

 Attachments: YARN-1157.1.patch, YARN-1157.2.patch, YARN-1157.2.patch, 
 YARN-1157.3.patch, YARN-1157.4.patch, YARN-1157.5.patch, YARN-1157.6.patch


 Submit a YARN distributed shell application and go to the ResourceManager web 
 UI. The application appears, and the Tracking UI column shows a history link. 
 Clicking that link returns an HTTP 500 error instead of the application master 
 web UI.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1068) Add admin support for HA operations

2013-09-24 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776634#comment-13776634
 ] 

Bikas Saha commented on YARN-1068:
--

It would be educative to compare the HAAdmin server start code with the existing 
RM admin server, AdminService. I notice 2 things. 
1) AdminService does not use the HAServiceProtocolServerSideTranslatorPB pattern.
2) AdminService does something with HADOOP_SECURITY_AUTHORIZATION which is 
missing in HAAdminService. This probably defines who has access to perform the 
admin operations. We will likely need that for HAAdmin, right?

Having thought about this, it seems to me that this jira is actually blocked by 
YARN-986. Without a concept of a logical name, how can we expect the CLI etc. to 
find the correct RM address from configuration? The client conf files would be 
expected to have entries for all RM instances, and we would need to be able to 
issue admin commands to any one of them. So we need to be able to address them 
via a logical name, right? The current approach that picks the 
RM_HA_ADMIN_SERVICE address therefore does not seem like a viable solution. 
Similarly, server conf files would need to tell the server what its logical name 
is so that it can pick up instance-specific configurations. This is precisely 
why we have the HAAdmin.resolveTarget() method.
Again, it would be educative to look at NNHAServiceTarget on the client side and 
at the NameNode constructor, where it uses the logical name to translate and 
re-write the server-side conf.
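
To make the logical-name idea a bit more concrete, here is a rough, purely 
hypothetical sketch of what client-side configuration could look like; the 
actual property names would come out of YARN-986/YARN-1232, not this comment:

{code}
<!-- Hypothetical property names and values, for illustration only. -->
<property>
  <name>yarn.resourcemanager.cluster-id</name>
  <value>rm-cluster</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.ids</name>
  <value>rm1,rm2</value>
</property>
<property>
  <name>yarn.resourcemanager.admin.address.rm1</name>
  <value>rm1.example.com:8033</value>
</property>
<property>
  <name>yarn.resourcemanager.admin.address.rm2</name>
  <value>rm2.example.com:8033</value>
</property>
{code}

An admin CLI could then resolve a logical target such as rm1 to its admin 
address from configuration, analogous to what HAAdmin.resolveTarget() and 
NNHAServiceTarget do on the HDFS side.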

 Add admin support for HA operations
 ---

 Key: YARN-1068
 URL: https://issues.apache.org/jira/browse/YARN-1068
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
  Labels: ha
 Attachments: yarn-1068-1.patch, yarn-1068-2.patch, yarn-1068-3.patch, 
 yarn-1068-4.patch, yarn-1068-5.patch, yarn-1068-6.patch, yarn-1068-7.patch, 
 yarn-1068-prelim.patch


 Support HA admin operations to facilitate transitioning the RM to Active and 
 Standby states.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-986) YARN should have a ClusterId/ServiceId

2013-09-24 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776637#comment-13776637
 ] 

Bikas Saha commented on YARN-986:
-

This should be used to set the service address for tokens. This would also be 
needed to pick up the correct configs for HA scenarios.

 YARN should have a ClusterId/ServiceId
 --

 Key: YARN-986
 URL: https://issues.apache.org/jira/browse/YARN-986
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli

 This needs to be done to support non-IP-based failover of the RM. Once the 
 server sets the token service address to this generic ClusterId/ServiceId, 
 clients can translate it to the appropriate final IP and then be able to select 
 tokens via TokenSelectors.
 Some workarounds for other related issues were put in place at YARN-945.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-986) YARN should have a ClusterId/ServiceId

2013-09-24 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated YARN-986:


Summary: YARN should have a ClusterId/ServiceId  (was: YARN should have a 
ClusterId/ServiceId that should be used to set the service address for tokens)

 YARN should have a ClusterId/ServiceId
 --

 Key: YARN-986
 URL: https://issues.apache.org/jira/browse/YARN-986
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli

 This needs to be done to support non-IP-based failover of the RM. Once the 
 server sets the token service address to this generic ClusterId/ServiceId, 
 clients can translate it to the appropriate final IP and then be able to select 
 tokens via TokenSelectors.
 Some workarounds for other related issues were put in place at YARN-945.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1068) Add admin support for HA operations

2013-09-24 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776653#comment-13776653
 ] 

Karthik Kambatla commented on YARN-1068:


Thanks [~bikassaha], agree with most of your points.

bq. AdminService does not use the HAServiceProtocolServerSideTranslatorPB 
pattern
The reason for this is our attempt to reuse most of the common code - protos 
and client implementations.

bq. Having thought about this, it seems to me that this jira is actually 
blocked by YARN-986.
To fix the admin support in its entirety, I agree that we need YARN-1232 and 
YARN-986. That said, for ease of development, I would propose splitting the 
admin support into two parts (JIRAs): basic support (this JIRA) to go in first 
to help test YARN-1232 and YARN-986, and complete admin support that adds the 
remaining parts. Otherwise, we would need to apply this on top of those other 
JIRAs to test. Thoughts?



 Add admin support for HA operations
 ---

 Key: YARN-1068
 URL: https://issues.apache.org/jira/browse/YARN-1068
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
  Labels: ha
 Attachments: yarn-1068-1.patch, yarn-1068-2.patch, yarn-1068-3.patch, 
 yarn-1068-4.patch, yarn-1068-5.patch, yarn-1068-6.patch, yarn-1068-7.patch, 
 yarn-1068-prelim.patch


 Support HA admin operations to facilitate transitioning the RM to Active and 
 Standby states.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1157) ResourceManager UI has invalid tracking URL link for distributed shell application

2013-09-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776663#comment-13776663
 ] 

Hadoop QA commented on YARN-1157:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12604859/YARN-1157.6.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2004//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2004//console

This message is automatically generated.

 ResourceManager UI has invalid tracking URL link for distributed shell 
 application
 --

 Key: YARN-1157
 URL: https://issues.apache.org/jira/browse/YARN-1157
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
 Fix For: 2.1.1-beta

 Attachments: YARN-1157.1.patch, YARN-1157.2.patch, YARN-1157.2.patch, 
 YARN-1157.3.patch, YARN-1157.4.patch, YARN-1157.5.patch, YARN-1157.6.patch


 Submit a YARN distributed shell application and go to the ResourceManager web 
 UI. The application appears, and the Tracking UI column shows a history link. 
 Clicking that link returns an HTTP 500 error instead of the application master 
 web UI.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1157) ResourceManager UI has invalid tracking URL link for distributed shell application

2013-09-24 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776684#comment-13776684
 ] 

Jian He commented on YARN-1157:
---

Tests look much cleaner, thanks for the update. Patch looks good, +1

 ResourceManager UI has invalid tracking URL link for distributed shell 
 application
 --

 Key: YARN-1157
 URL: https://issues.apache.org/jira/browse/YARN-1157
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
 Fix For: 2.1.1-beta

 Attachments: YARN-1157.1.patch, YARN-1157.2.patch, YARN-1157.2.patch, 
 YARN-1157.3.patch, YARN-1157.4.patch, YARN-1157.5.patch, YARN-1157.6.patch


 Submit a YARN distributed shell application and go to the ResourceManager web 
 UI. The application appears, and the Tracking UI column shows a history link. 
 Clicking that link returns an HTTP 500 error instead of the application master 
 web UI.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start

2013-09-24 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776765#comment-13776765
 ] 

Bikas Saha commented on YARN-1229:
--

Looks good to me.

 Shell$ExitCodeException could happen if AM fails to start
 -

 Key: YARN-1229
 URL: https://issues.apache.org/jira/browse/YARN-1229
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
Priority: Blocker
 Fix For: 2.1.1-beta

 Attachments: YARN-1229.1.patch, YARN-1229.2.patch, YARN-1229.3.patch


 I ran a sleep job. If the AM fails to start, this exception could occur:
 13/09/20 11:00:23 INFO mapreduce.Job: Job job_1379673267098_0020 failed with 
 state FAILED due to: Application application_1379673267098_0020 failed 1 
 times due to AM Container for appattempt_1379673267098_0020_01 exited 
 with  exitCode: 1 due to: Exception from container-launch:
 org.apache.hadoop.util.Shell$ExitCodeException: 
 /myappcache/application_1379673267098_0020/container_1379673267098_0020_01_01/launch_container.sh:
  line 12: export: 
 `NM_AUX_SERVICE_mapreduce.shuffle=AAA0+gA=
 ': not a valid identifier
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
 at org.apache.hadoop.util.Shell.run(Shell.java:379)
 at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
 at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:270)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:78)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 .Failing this attempt.. Failing the application.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1214) Register ClientToken MasterKey in SecretManager after it is saved

2013-09-24 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776802#comment-13776802
 ] 

Bikas Saha commented on YARN-1214:
--

+1

 Register ClientToken MasterKey in SecretManager after it is saved
 -

 Key: YARN-1214
 URL: https://issues.apache.org/jira/browse/YARN-1214
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Jian He
Assignee: Jian He
 Attachments: YARN-1214.1.patch, YARN-1214.2.patch, YARN-1214.3.patch, 
 YARN-1214.4.patch, YARN-1214.5.patch, YARN-1214.patch


 Currently, the app attempt ClientToken master key is registered before it is 
 saved. This can cause a problem: if the client gets the token before the master 
 key is saved and the RM then crashes, the RM cannot reload the master key after 
 it restarts because it was never saved. As a result, the client is left holding 
 an invalid token.
 We can register the client token master key only after it has been saved in the store.
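
A rough, self-contained sketch of the proposed ordering is below; the class and 
method names are purely illustrative stand-ins for the RMStateStore and 
SecretManager interactions, not the actual patch:

{code}
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: persist the client-token master key first, then
// register it, so that an RM crash between the two steps cannot leave a
// client holding a token whose master key was never saved.
public class MasterKeyOrderingSketch {
  private final Map<String, byte[]> store =
      new HashMap<String, byte[]>();   // stands in for the persistent state store
  private final Map<String, byte[]> secrets =
      new HashMap<String, byte[]>();   // stands in for the secret manager

  public void storeAndRegister(String appAttemptId, byte[] masterKey) {
    store.put(appAttemptId, masterKey);    // 1. save to the (persistent) store
    secrets.put(appAttemptId, masterKey);  // 2. only then register for client use
  }
}
{code}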

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1214) Register ClientToken MasterKey in SecretManager after it is saved

2013-09-24 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-1214:
--

Attachment: YARN-1214.6.patch

patch rebased

 Register ClientToken MasterKey in SecretManager after it is saved
 -

 Key: YARN-1214
 URL: https://issues.apache.org/jira/browse/YARN-1214
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Jian He
Assignee: Jian He
 Attachments: YARN-1214.1.patch, YARN-1214.2.patch, YARN-1214.3.patch, 
 YARN-1214.4.patch, YARN-1214.5.patch, YARN-1214.6.patch, YARN-1214.patch


 Currently, the app attempt ClientToken master key is registered before it is 
 saved. This can cause a problem: if the client gets the token before the master 
 key is saved and the RM then crashes, the RM cannot reload the master key after 
 it restarts because it was never saved. As a result, the client is left holding 
 an invalid token.
 We can register the client token master key only after it has been saved in the store.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1214) Register ClientToken MasterKey in SecretManager after it is saved

2013-09-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776843#comment-13776843
 ] 

Hadoop QA commented on YARN-1214:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12604886/YARN-1214.6.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2005//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2005//console

This message is automatically generated.

 Register ClientToken MasterKey in SecretManager after it is saved
 -

 Key: YARN-1214
 URL: https://issues.apache.org/jira/browse/YARN-1214
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Jian He
Assignee: Jian He
 Attachments: YARN-1214.1.patch, YARN-1214.2.patch, YARN-1214.3.patch, 
 YARN-1214.4.patch, YARN-1214.5.patch, YARN-1214.6.patch, YARN-1214.patch


 Currently, the app attempt ClientToken master key is registered before it is 
 saved. This can cause a problem: if the client gets the token before the master 
 key is saved and the RM then crashes, the RM cannot reload the master key after 
 it restarts because it was never saved. As a result, the client is left holding 
 an invalid token.
 We can register the client token master key only after it has been saved in the store.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-624) Support gang scheduling in the AM RM protocol

2013-09-24 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776847#comment-13776847
 ] 

Carlo Curino commented on YARN-624:
---

Hi Guys,

I would like to quantify the typical waste of resources while 
hoarding containers towards a gang, for Giraph or Storm. 
Does anyone have an intuition/measure of the typical time delay and container 
slot-time wasted while hoarding containers before the 
useful part of the computation starts?  Thanks.


 Support gang scheduling in the AM RM protocol
 -

 Key: YARN-624
 URL: https://issues.apache.org/jira/browse/YARN-624
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api, scheduler
Affects Versions: 2.0.4-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza

 Per discussion on YARN-392 and elsewhere, gang scheduling, in which a 
 scheduler runs a set of tasks when they can all be run at the same time, 
 would be a useful feature for YARN schedulers to support.
 Currently, AMs can approximate this by holding on to containers until they 
 get all the ones they need.  However, this lends itself to deadlocks when 
 different AMs are waiting on the same containers.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start

2013-09-24 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776852#comment-13776852
 ] 

Vinod Kumar Vavilapalli commented on YARN-1229:
---

*sigh* more incompatible changes. Thought for a while about whether we could do 
it in a compatible manner, but there doesn't seem to be any way.

Looked at the patch, +1 for the changes. Let's get it in asap.

 Shell$ExitCodeException could happen if AM fails to start
 -

 Key: YARN-1229
 URL: https://issues.apache.org/jira/browse/YARN-1229
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
Priority: Blocker
 Fix For: 2.1.1-beta

 Attachments: YARN-1229.1.patch, YARN-1229.2.patch, YARN-1229.3.patch


 I ran a sleep job. If the AM fails to start, this exception could occur:
 13/09/20 11:00:23 INFO mapreduce.Job: Job job_1379673267098_0020 failed with 
 state FAILED due to: Application application_1379673267098_0020 failed 1 
 times due to AM Container for appattempt_1379673267098_0020_01 exited 
 with  exitCode: 1 due to: Exception from container-launch:
 org.apache.hadoop.util.Shell$ExitCodeException: 
 /myappcache/application_1379673267098_0020/container_1379673267098_0020_01_01/launch_container.sh:
  line 12: export: 
 `NM_AUX_SERVICE_mapreduce.shuffle=AAA0+gA=
 ': not a valid identifier
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
 at org.apache.hadoop.util.Shell.run(Shell.java:379)
 at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
 at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:270)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:78)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 .Failing this attempt.. Failing the application.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start

2013-09-24 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated YARN-1229:
-

Fix Version/s: (was: 2.1.1-beta)
   2.1.2-beta

 Shell$ExitCodeException could happen if AM fails to start
 -

 Key: YARN-1229
 URL: https://issues.apache.org/jira/browse/YARN-1229
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
Priority: Blocker
 Fix For: 2.1.2-beta

 Attachments: YARN-1229.1.patch, YARN-1229.2.patch, YARN-1229.3.patch


 I ran a sleep job. If the AM fails to start, this exception could occur:
 13/09/20 11:00:23 INFO mapreduce.Job: Job job_1379673267098_0020 failed with 
 state FAILED due to: Application application_1379673267098_0020 failed 1 
 times due to AM Container for appattempt_1379673267098_0020_01 exited 
 with  exitCode: 1 due to: Exception from container-launch:
 org.apache.hadoop.util.Shell$ExitCodeException: 
 /myappcache/application_1379673267098_0020/container_1379673267098_0020_01_01/launch_container.sh:
  line 12: export: 
 `NM_AUX_SERVICE_mapreduce.shuffle=AAA0+gA=
 ': not a valid identifier
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
 at org.apache.hadoop.util.Shell.run(Shell.java:379)
 at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
 at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:270)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:78)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 .Failing this attempt.. Failing the application.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1204) Need to add https port related property in Yarn

2013-09-24 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-1204:
--

Fix Version/s: 2.1.2-beta

 Need to add https port related property in Yarn
 ---

 Key: YARN-1204
 URL: https://issues.apache.org/jira/browse/YARN-1204
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Yesha Vora
Assignee: Omkar Vinit Joshi
 Fix For: 2.1.2-beta

 Attachments: YARN-1204.20131018.1.patch, YARN-1204.20131020.1.patch, 
 YARN-1204.20131020.2.patch, YARN-1204.20131020.3.patch, 
 YARN-1204.20131020.4.patch, YARN-1204.20131023.1.patch


 There is no YARN property available to configure the https port for the 
 ResourceManager, NodeManager and history server. Currently, YARN services use 
 the port defined for http [defined by 
 'mapreduce.jobhistory.webapp.address', 'yarn.nodemanager.webapp.address', 
 'yarn.resourcemanager.webapp.address'] for running services on the https protocol.
 YARN should have a list of properties to assign https ports for the RM, NM and 
 JHS. They could be like below.
 yarn.nodemanager.webapp.https.address
 yarn.resourcemanager.webapp.https.address
 mapreduce.jobhistory.webapp.https.address 
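
For illustration only, yarn-site.xml / mapred-site.xml entries for such 
properties might look like the following; the host:port values are placeholders 
and the final property names are up to the patch:

{code}
<!-- Hypothetical example entries; addresses are placeholders. -->
<property>
  <name>yarn.resourcemanager.webapp.https.address</name>
  <value>rm.example.com:8090</value>
</property>
<property>
  <name>yarn.nodemanager.webapp.https.address</name>
  <value>0.0.0.0:8044</value>
</property>
<property>
  <name>mapreduce.jobhistory.webapp.https.address</name>
  <value>jhs.example.com:19890</value>
</property>
{code}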

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start

2013-09-24 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776865#comment-13776865
 ] 

Siddharth Seth commented on YARN-1229:
--

Just looked at the patch; it'd be nice to include underscores as well - that 
provides a separator in the allowed character set.

 Shell$ExitCodeException could happen if AM fails to start
 -

 Key: YARN-1229
 URL: https://issues.apache.org/jira/browse/YARN-1229
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
Priority: Blocker
 Fix For: 2.1.2-beta

 Attachments: YARN-1229.1.patch, YARN-1229.2.patch, YARN-1229.3.patch


 I ran a sleep job. If the AM fails to start, this exception could occur:
 13/09/20 11:00:23 INFO mapreduce.Job: Job job_1379673267098_0020 failed with 
 state FAILED due to: Application application_1379673267098_0020 failed 1 
 times due to AM Container for appattempt_1379673267098_0020_01 exited 
 with  exitCode: 1 due to: Exception from container-launch:
 org.apache.hadoop.util.Shell$ExitCodeException: 
 /myappcache/application_1379673267098_0020/container_1379673267098_0020_01_01/launch_container.sh:
  line 12: export: 
 `NM_AUX_SERVICE_mapreduce.shuffle=AAA0+gA=
 ': not a valid identifier
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
 at org.apache.hadoop.util.Shell.run(Shell.java:379)
 at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
 at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:270)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:78)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 .Failing this attempt.. Failing the application.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1128) FifoPolicy.computeShares throws NPE on empty list of Schedulables

2013-09-24 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated YARN-1128:
-

Fix Version/s: 2.1.2-beta

 FifoPolicy.computeShares throws NPE on empty list of Schedulables
 -

 Key: YARN-1128
 URL: https://issues.apache.org/jira/browse/YARN-1128
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Karthik Kambatla
 Fix For: 2.1.2-beta

 Attachments: yarn-1128-1.patch


 FifoPolicy gives all of a queue's share to the earliest-scheduled application.
 {code}
 Schedulable earliest = null;
 for (Schedulable schedulable : schedulables) {
   if (earliest == null ||
       schedulable.getStartTime() < earliest.getStartTime()) {
     earliest = schedulable;
   }
 }
 earliest.setFairShare(Resources.clone(totalResources));
 {code}
 If the queue has no schedulables in it, earliest will be left null, leading 
 to an NPE on the last line.
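
One possible minimal guard, sketched here for illustration (the actual change in 
yarn-1128-1.patch may differ), is to assign the fair share only when a 
schedulable was found:

{code}
Schedulable earliest = null;
for (Schedulable schedulable : schedulables) {
  if (earliest == null ||
      schedulable.getStartTime() < earliest.getStartTime()) {
    earliest = schedulable;
  }
}
// Guard against an empty list of schedulables.
if (earliest != null) {
  earliest.setFairShare(Resources.clone(totalResources));
}
{code}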

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1203) Application Manager UI does not appear with Https enabled

2013-09-24 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-1203:
--

Fix Version/s: 2.1.2-beta

 Application Manager UI does not appear with Https enabled
 -

 Key: YARN-1203
 URL: https://issues.apache.org/jira/browse/YARN-1203
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Yesha Vora
Assignee: Omkar Vinit Joshi
 Fix For: 2.1.2-beta

 Attachments: YARN-1203.20131017.1.patch, YARN-1203.20131017.2.patch, 
 YARN-1203.20131017.3.patch, YARN-1203.20131018.1.patch, 
 YARN-1203.20131018.2.patch, YARN-1203.20131019.1.patch


 Need to add support to disable 'hadoop.ssl.enabled' for MR jobs.
 A job should be able to run on the http protocol by setting the 
 'hadoop.ssl.enabled' property at the job level.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start

2013-09-24 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776872#comment-13776872
 ] 

Chris Nauroth commented on YARN-1229:
-

Agreed on underscores.  Various resources indicate that 
{{[a-zA-Z_]+[a-zA-Z0-9_]*}} is a good format that we can expect to work 
cross-platform.
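
For illustration, a regex-based check along those lines might look like the 
minimal sketch below; the class and method names are hypothetical, not the 
actual patch:

{code}
import java.util.regex.Pattern;

// Hypothetical sketch of validating an aux-service name so that the exported
// environment variable NM_AUX_SERVICE_<name> is a valid shell identifier.
public final class AuxServiceNameCheck {
  private static final Pattern VALID_NAME =
      Pattern.compile("^[a-zA-Z_]+[a-zA-Z0-9_]*$");

  public static boolean isValid(String name) {
    // Reject null and empty names as well as names containing disallowed
    // characters, such as the '.' in "mapreduce.shuffle".
    return name != null && !name.isEmpty() && VALID_NAME.matcher(name).matches();
  }
}
{code}

With such a check, a name like mapreduce_shuffle passes, while mapreduce.shuffle 
(the name behind the export error quoted in this issue) is rejected up front.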

 Shell$ExitCodeException could happen if AM fails to start
 -

 Key: YARN-1229
 URL: https://issues.apache.org/jira/browse/YARN-1229
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
Priority: Blocker
 Fix For: 2.1.2-beta

 Attachments: YARN-1229.1.patch, YARN-1229.2.patch, YARN-1229.3.patch


 I ran a sleep job. If the AM fails to start, this exception could occur:
 13/09/20 11:00:23 INFO mapreduce.Job: Job job_1379673267098_0020 failed with 
 state FAILED due to: Application application_1379673267098_0020 failed 1 
 times due to AM Container for appattempt_1379673267098_0020_01 exited 
 with  exitCode: 1 due to: Exception from container-launch:
 org.apache.hadoop.util.Shell$ExitCodeException: 
 /myappcache/application_1379673267098_0020/container_1379673267098_0020_01_01/launch_container.sh:
  line 12: export: 
 `NM_AUX_SERVICE_mapreduce.shuffle=AAA0+gA=
 ': not a valid identifier
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
 at org.apache.hadoop.util.Shell.run(Shell.java:379)
 at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
 at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:270)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:78)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 .Failing this attempt.. Failing the application.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1214) Register ClientToken MasterKey in SecretManager after it is saved

2013-09-24 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated YARN-1214:
-

Priority: Critical  (was: Major)

 Register ClientToken MasterKey in SecretManager after it is saved
 -

 Key: YARN-1214
 URL: https://issues.apache.org/jira/browse/YARN-1214
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Jian He
Assignee: Jian He
Priority: Critical
 Attachments: YARN-1214.1.patch, YARN-1214.2.patch, YARN-1214.3.patch, 
 YARN-1214.4.patch, YARN-1214.5.patch, YARN-1214.6.patch, YARN-1214.patch


 Currently, app attempt ClientToken master key is registered before it is 
 saved. This can cause problem that before the master key is saved, client 
 gets the token and RM also crashes, RM cannot reloads the master key back 
 after it restarts as it is not saved. As a result, client is holding an 
 invalid token.
 We can register the client token master key after it is saved in the store.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start

2013-09-24 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-1229:


Attachment: YARN-1229.4.patch

Allow _ as a valid character in auxServiceName, and disallow an auxServiceName 
starting with a number

 Shell$ExitCodeException could happen if AM fails to start
 -

 Key: YARN-1229
 URL: https://issues.apache.org/jira/browse/YARN-1229
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
Priority: Blocker
 Fix For: 2.1.2-beta

 Attachments: YARN-1229.1.patch, YARN-1229.2.patch, YARN-1229.3.patch, 
 YARN-1229.4.patch


 I ran a sleep job. If the AM fails to start, this exception could occur:
 13/09/20 11:00:23 INFO mapreduce.Job: Job job_1379673267098_0020 failed with 
 state FAILED due to: Application application_1379673267098_0020 failed 1 
 times due to AM Container for appattempt_1379673267098_0020_01 exited 
 with  exitCode: 1 due to: Exception from container-launch:
 org.apache.hadoop.util.Shell$ExitCodeException: 
 /myappcache/application_1379673267098_0020/container_1379673267098_0020_01_01/launch_container.sh:
  line 12: export: 
 `NM_AUX_SERVICE_mapreduce.shuffle=AAA0+gA=
 ': not a valid identifier
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
 at org.apache.hadoop.util.Shell.run(Shell.java:379)
 at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
 at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:270)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:78)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 .Failing this attempt.. Failing the application.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1053) Diagnostic message from ContainerExitEvent is ignored in ContainerImpl

2013-09-24 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated YARN-1053:


Fix Version/s: (was: 2.1.1-beta)
   2.1.2-beta

 Diagnostic message from ContainerExitEvent is ignored in ContainerImpl
 --

 Key: YARN-1053
 URL: https://issues.apache.org/jira/browse/YARN-1053
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Omkar Vinit Joshi
Assignee: Omkar Vinit Joshi
  Labels: newbie
 Fix For: 2.3.0, 2.1.2-beta

 Attachments: YARN-1053.20130809.patch


 If the container launch fails then we send a ContainerExitEvent. This event 
 contains the exitCode and a diagnostic message. Today we ignore the diagnostic 
 message while handling this event inside ContainerImpl. Fixing it, as it is 
 useful in diagnosing the failure.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1158) ResourceManager UI has application stdout missing if application stdout is not in the same directory as AppMaster stdout

2013-09-24 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated YARN-1158:


Fix Version/s: (was: 2.1.1-beta)
   2.1.2-beta

 ResourceManager UI has application stdout missing if application stdout is 
 not in the same directory as AppMaster stdout
 

 Key: YARN-1158
 URL: https://issues.apache.org/jira/browse/YARN-1158
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Tassapol Athiapinya
 Fix For: 2.1.2-beta


 Configure yarn-site.xml's yarn.nodemanager.local-dirs to multiple 
 directories and turn on log aggregation. Run a distributed shell application 
 that writes AppMaster.stdout in one directory and stdout in another 
 directory. Go to the ResourceManager web UI and open the container logs. Only 
 AppMaster.stdout appears.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1121) RMStateStore should flush all pending store events before closing

2013-09-24 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated YARN-1121:


Fix Version/s: (was: 2.1.1-beta)
   2.1.2-beta

 RMStateStore should flush all pending store events before closing
 -

 Key: YARN-1121
 URL: https://issues.apache.org/jira/browse/YARN-1121
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Bikas Saha
 Fix For: 2.1.2-beta


 on serviceStop it should wait for all internal pending events to drain before 
 stopping.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start

2013-09-24 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776915#comment-13776915
 ] 

Siddharth Seth commented on YARN-1229:
--

Took a quick look.
- Can you please rename MapreduceShuffle to mapreduce_shuffle (closer to the 
old name)?
- The check can be regex-based, rather than walking through all the characters.
- Include an empty check along with the null check.

 Shell$ExitCodeException could happen if AM fails to start
 -

 Key: YARN-1229
 URL: https://issues.apache.org/jira/browse/YARN-1229
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
Priority: Blocker
 Fix For: 2.1.2-beta

 Attachments: YARN-1229.1.patch, YARN-1229.2.patch, YARN-1229.3.patch, 
 YARN-1229.4.patch


 I ran a sleep job. If the AM fails to start, this exception could occur:
 13/09/20 11:00:23 INFO mapreduce.Job: Job job_1379673267098_0020 failed with 
 state FAILED due to: Application application_1379673267098_0020 failed 1 
 times due to AM Container for appattempt_1379673267098_0020_01 exited 
 with  exitCode: 1 due to: Exception from container-launch:
 org.apache.hadoop.util.Shell$ExitCodeException: 
 /myappcache/application_1379673267098_0020/container_1379673267098_0020_01_01/launch_container.sh:
  line 12: export: 
 `NM_AUX_SERVICE_mapreduce.shuffle=AAA0+gA=
 ': not a valid identifier
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
 at org.apache.hadoop.util.Shell.run(Shell.java:379)
 at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
 at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:270)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:78)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 .Failing this attempt.. Failing the application.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1167) Submitted distributed shell application shows appMasterHost = empty

2013-09-24 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated YARN-1167:


Fix Version/s: (was: 2.1.1-beta)
   2.1.2-beta

 Submitted distributed shell application shows appMasterHost = empty
 ---

 Key: YARN-1167
 URL: https://issues.apache.org/jira/browse/YARN-1167
 Project: Hadoop YARN
  Issue Type: Bug
  Components: applications/distributed-shell
Reporter: Tassapol Athiapinya
 Fix For: 2.1.2-beta


 Submit a distributed shell application. Once the application reaches the 
 RUNNING state, the app master host should not be empty. In reality, it is empty.
 ==console logs==
 distributedshell.Client: Got application report from ASM for, appId=12, 
 clientToAMToken=null, appDiagnostics=, appMasterHost=, appQueue=default, 
 appMasterRpcPort=0, appStartTime=1378505161360, yarnAppState=RUNNING, 
 distributedFinalState=UNDEFINED, 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1168) Cannot run echo \"Hello World\"

2013-09-24 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated YARN-1168:


Fix Version/s: (was: 2.1.1-beta)
   2.1.2-beta

 Cannot run echo \"Hello World\"
 -

 Key: YARN-1168
 URL: https://issues.apache.org/jira/browse/YARN-1168
 Project: Hadoop YARN
  Issue Type: Bug
  Components: applications/distributed-shell
Reporter: Tassapol Athiapinya
Priority: Critical
 Fix For: 2.1.2-beta


 Run
 $ ssh localhost echo \"Hello World\"
 with bash; it succeeds and Hello World is shown in stdout.
 Run distributed shell with a similar echo command, that is, either
 $ /usr/bin/yarn  org.apache.hadoop.yarn.applications.distributedshell.Client 
 -jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell-2.*.jar 
 -shell_command echo -shell_args \"Hello World\"
 or
 $ /usr/bin/yarn  org.apache.hadoop.yarn.applications.distributedshell.Client 
 -jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell-2.*.jar 
 -shell_command echo -shell_args Hello World
 {code:title=yarn logs -- only hello is shown}
 LogType: stdout
 LogLength: 6
 Log Contents:
 hello
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1149) NM throws InvalidStateTransitonException: Invalid event: APPLICATION_LOG_HANDLING_FINISHED at RUNNING

2013-09-24 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated YARN-1149:


Fix Version/s: (was: 2.1.1-beta)
   2.1.2-beta

 NM throws InvalidStateTransitonException: Invalid event: 
 APPLICATION_LOG_HANDLING_FINISHED at RUNNING
 -

 Key: YARN-1149
 URL: https://issues.apache.org/jira/browse/YARN-1149
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Ramya Sunil
Assignee: Xuan Gong
 Fix For: 2.1.2-beta

 Attachments: YARN-1149.1.patch, YARN-1149.2.patch, YARN-1149.3.patch, 
 YARN-1149.4.patch


 When the nodemanager receives a kill signal after an application has finished 
 execution but before log aggregation has kicked in, 
 InvalidStateTransitonException: Invalid event: 
 APPLICATION_LOG_HANDLING_FINISHED at RUNNING is thrown.
 {noformat}
 2013-08-25 20:45:00,875 INFO  logaggregation.AppLogAggregatorImpl 
 (AppLogAggregatorImpl.java:finishLogAggregation(254)) - Application just 
 finished : application_1377459190746_0118
 2013-08-25 20:45:00,876 INFO  logaggregation.AppLogAggregatorImpl 
 (AppLogAggregatorImpl.java:uploadLogsForContainer(105)) - Starting aggregate 
 log-file for app application_1377459190746_0118 at 
 /app-logs/foo/logs/application_1377459190746_0118/host_45454.tmp
 2013-08-25 20:45:00,876 INFO  logaggregation.LogAggregationService 
 (LogAggregationService.java:stopAggregators(151)) - Waiting for aggregation 
 to complete for application_1377459190746_0118
 2013-08-25 20:45:00,891 INFO  logaggregation.AppLogAggregatorImpl 
 (AppLogAggregatorImpl.java:uploadLogsForContainer(122)) - Uploading logs for 
 container container_1377459190746_0118_01_04. Current good log dirs are 
 /tmp/yarn/local
 2013-08-25 20:45:00,915 INFO  logaggregation.AppLogAggregatorImpl 
 (AppLogAggregatorImpl.java:doAppLogAggregation(182)) - Finished aggregate 
 log-file for app application_1377459190746_0118
 2013-08-25 20:45:00,925 WARN  application.Application 
 (ApplicationImpl.java:handle(427)) - Can't handle this event at current state
 org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
 APPLICATION_LOG_HANDLING_FINISHED at RUNNING
 at 
 org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
  
 at 
 org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
 at 
 org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:425)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:59)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:697)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:689)
 at 
 org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134)
 at 
 org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:81)   
 at java.lang.Thread.run(Thread.java:662)
 2013-08-25 20:45:00,926 INFO  application.Application 
 (ApplicationImpl.java:handle(430)) - Application 
 application_1377459190746_0118 transitioned from RUNNING to null
 2013-08-25 20:45:00,927 WARN  monitor.ContainersMonitorImpl 
 (ContainersMonitorImpl.java:run(463)) - 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl
  is interrupted. Exiting.
 2013-08-25 20:45:00,938 INFO  ipc.Server (Server.java:stop(2437)) - Stopping 
 server on 8040
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1157) ResourceManager UI has invalid tracking URL link for distributed shell application

2013-09-24 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated YARN-1157:


Fix Version/s: (was: 2.1.1-beta)
   2.1.2-beta

 ResourceManager UI has invalid tracking URL link for distributed shell 
 application
 --

 Key: YARN-1157
 URL: https://issues.apache.org/jira/browse/YARN-1157
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
 Fix For: 2.1.2-beta

 Attachments: YARN-1157.1.patch, YARN-1157.2.patch, YARN-1157.2.patch, 
 YARN-1157.3.patch, YARN-1157.4.patch, YARN-1157.5.patch, YARN-1157.6.patch


 Submit a YARN distributed shell application and go to the ResourceManager web 
 UI. The application appears, and the Tracking UI column shows a history 
 link. Clicking that link returns an HTTP 500 error instead of showing the 
 application master web UI.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1022) Unnecessary INFO logs in AMRMClientAsync

2013-09-24 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated YARN-1022:


Fix Version/s: (was: 2.1.1-beta)
   2.1.2-beta

 Unnecessary INFO logs in AMRMClientAsync
 

 Key: YARN-1022
 URL: https://issues.apache.org/jira/browse/YARN-1022
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Bikas Saha
Priority: Minor
  Labels: newbie
 Fix For: 2.1.2-beta


 Logs like the following should be at debug level, or else every legitimate stop 
 causes unnecessary exception traces in the logs.
 464 2013-08-03 20:01:34,459 INFO [AMRM Heartbeater thread] 
 org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl:
 Heartbeater interrupted
 465 java.lang.InterruptedException: sleep interrupted
 466   at java.lang.Thread.sleep(Native Method)
 467   at 
 org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:249)
 468 2013-08-03 20:01:34,460 INFO [AMRM Callback Handler Thread] 
 org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl:   
 Interrupted while waiting for queue
 469 java.lang.InterruptedException
 470   at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.
  java:1961)
 471   at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1996)
 472   at 
 java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
 473   at 
 org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:275)
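
A minimal sketch of the suggested change, assuming the commons-logging API used elsewhere in YARN; this is illustrative, not the actual AMRMClientAsyncImpl code:
{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

// Hypothetical sketch: an interrupt during a legitimate stop is expected,
// so log it at debug level instead of INFO with a stack trace.
public class HeartbeatSleepSketch {
  private static final Log LOG = LogFactory.getLog(HeartbeatSleepSketch.class);

  void sleepBetweenHeartbeats(long intervalMs) {
    try {
      Thread.sleep(intervalMs);
    } catch (InterruptedException e) {
      LOG.debug("Heartbeater interrupted", e);
      Thread.currentThread().interrupt();
    }
  }
}
{code}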

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1142) MiniYARNCluster web ui does not work properly

2013-09-24 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated YARN-1142:


Fix Version/s: (was: 2.1.1-beta)
   2.1.2-beta

 MiniYARNCluster web ui does not work properly
 -

 Key: YARN-1142
 URL: https://issues.apache.org/jira/browse/YARN-1142
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Alejandro Abdelnur
 Fix For: 2.1.2-beta


 When going to the RM http port, the NM web UI is displayed. It seems there is 
 a singleton somewhere that breaks things when the RM and NMs run in the same 
 process.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1131) $ yarn logs should return a message that log aggregation is in progress if the YARN application is running

2013-09-24 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated YARN-1131:


Fix Version/s: (was: 2.1.1-beta)
   2.1.2-beta

 $ yarn logs should return a message that log aggregation is in progress if 
 the YARN application is running
 -

 Key: YARN-1131
 URL: https://issues.apache.org/jira/browse/YARN-1131
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: client
Reporter: Tassapol Athiapinya
Assignee: Junping Du
Priority: Minor
 Fix For: 2.1.2-beta


 In the case when log aggregation is enabled, if a user submits a MapReduce job 
 and runs $ yarn logs -applicationId <app ID> while the YARN application is 
 running, the command returns no message and drops the user back to the shell. It 
 would be nice to tell the user that log aggregation is in progress.
 {code}
 -bash-4.1$ /usr/bin/yarn logs -applicationId application_1377900193583_0002
 -bash-4.1$
 {code}
 At the same time, if an invalid application ID is given, the YARN CLI should say 
 that the application ID is incorrect rather than throwing 
 NoSuchElementException.
 {code}
 $ /usr/bin/yarn logs -applicationId application_0
 Exception in thread "main" java.util.NoSuchElementException
 at com.google.common.base.AbstractIterator.next(AbstractIterator.java:75)
 at 
 org.apache.hadoop.yarn.util.ConverterUtils.toApplicationId(ConverterUtils.java:124)
 at 
 org.apache.hadoop.yarn.util.ConverterUtils.toApplicationId(ConverterUtils.java:119)
 at org.apache.hadoop.yarn.logaggregation.LogDumper.run(LogDumper.java:110)
 at org.apache.hadoop.yarn.logaggregation.LogDumper.main(LogDumper.java:255)
 {code}
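
A minimal sketch of the kind of up-front check being asked for; the wrapper class and method are hypothetical, not the actual LogDumper change:
{code}
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.util.ConverterUtils;

// Hypothetical sketch: validate the application ID argument instead of letting
// ConverterUtils surface a raw NoSuchElementException to the user.
public class AppIdArgCheck {
  public static ApplicationId parse(String arg) {
    try {
      return ConverterUtils.toApplicationId(arg);
    } catch (Exception e) {
      throw new IllegalArgumentException("Invalid ApplicationId: " + arg
          + " (expected something like application_1377900193583_0002)", e);
    }
  }
}
{code}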

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start

2013-09-24 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-1229:


Attachment: YARN-1229.5.patch

1. Change the service name to mapreduce_shuffle
2. Use a regex to validate auxName

 Shell$ExitCodeException could happen if AM fails to start
 -

 Key: YARN-1229
 URL: https://issues.apache.org/jira/browse/YARN-1229
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
Priority: Blocker
 Fix For: 2.1.2-beta

 Attachments: YARN-1229.1.patch, YARN-1229.2.patch, YARN-1229.3.patch, 
 YARN-1229.4.patch, YARN-1229.5.patch


 I run a sleep job. If the AM fails to start, this exception could occur:
 13/09/20 11:00:23 INFO mapreduce.Job: Job job_1379673267098_0020 failed with 
 state FAILED due to: Application application_1379673267098_0020 failed 1 
 times due to AM Container for appattempt_1379673267098_0020_01 exited 
 with  exitCode: 1 due to: Exception from container-launch:
 org.apache.hadoop.util.Shell$ExitCodeException: 
 /myappcache/application_1379673267098_0020/container_1379673267098_0020_01_01/launch_container.sh:
  line 12: export: 
 `NM_AUX_SERVICE_mapreduce.shuffle=AAA0+gA=
 ': not a valid identifier
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
 at org.apache.hadoop.util.Shell.run(Shell.java:379)
 at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
 at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:270)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:78)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 .Failing this attempt.. Failing the application.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start

2013-09-24 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776945#comment-13776945
 ] 

Siddharth Seth commented on YARN-1229:
--

Patch looks good. Missed this earlier, but there are several references to 
mapreduce.shuffle in the documentation which need to be updated.
Also, since it's being updated - can you make the Pattern final? Thanks
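
A minimal sketch of the kind of aux-service-name validation being discussed; the regex, class name, and message below are assumptions for illustration, not the contents of the attached patch:
{code}
import java.util.regex.Pattern;

// Hypothetical sketch only; the committed AuxServices.java check may differ.
public class AuxServiceNameCheck {
  // A final Pattern, as requested above: letters, digits and underscores only,
  // so the name can safely be exported as a shell environment variable.
  private static final Pattern NAME_PATTERN =
      Pattern.compile("^[A-Za-z_][A-Za-z0-9_]*$");

  public static void validate(String auxName) {
    if (auxName == null || !NAME_PATTERN.matcher(auxName).matches()) {
      throw new IllegalArgumentException(
          "Invalid auxiliary service name: " + auxName);
    }
  }
}
{code}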

 Shell$ExitCodeException could happen if AM fails to start
 -

 Key: YARN-1229
 URL: https://issues.apache.org/jira/browse/YARN-1229
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
Priority: Blocker
 Fix For: 2.1.2-beta

 Attachments: YARN-1229.1.patch, YARN-1229.2.patch, YARN-1229.3.patch, 
 YARN-1229.4.patch, YARN-1229.5.patch


 I run a sleep job. If the AM fails to start, this exception could occur:
 13/09/20 11:00:23 INFO mapreduce.Job: Job job_1379673267098_0020 failed with 
 state FAILED due to: Application application_1379673267098_0020 failed 1 
 times due to AM Container for appattempt_1379673267098_0020_01 exited 
 with  exitCode: 1 due to: Exception from container-launch:
 org.apache.hadoop.util.Shell$ExitCodeException: 
 /myappcache/application_1379673267098_0020/container_1379673267098_0020_01_01/launch_container.sh:
  line 12: export: 
 `NM_AUX_SERVICE_mapreduce.shuffle=AAA0+gA=
 ': not a valid identifier
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
 at org.apache.hadoop.util.Shell.run(Shell.java:379)
 at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
 at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:270)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:78)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 .Failing this attempt.. Failing the application.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start

2013-09-24 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-1229:


Attachment: YARN-1229.6.patch

fix documentation

 Shell$ExitCodeException could happen if AM fails to start
 -

 Key: YARN-1229
 URL: https://issues.apache.org/jira/browse/YARN-1229
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
Priority: Blocker
 Fix For: 2.1.2-beta

 Attachments: YARN-1229.1.patch, YARN-1229.2.patch, YARN-1229.3.patch, 
 YARN-1229.4.patch, YARN-1229.5.patch, YARN-1229.6.patch


 I run a sleep job. If the AM fails to start, this exception could occur:
 13/09/20 11:00:23 INFO mapreduce.Job: Job job_1379673267098_0020 failed with 
 state FAILED due to: Application application_1379673267098_0020 failed 1 
 times due to AM Container for appattempt_1379673267098_0020_01 exited 
 with  exitCode: 1 due to: Exception from container-launch:
 org.apache.hadoop.util.Shell$ExitCodeException: 
 /myappcache/application_1379673267098_0020/container_1379673267098_0020_01_01/launch_container.sh:
  line 12: export: 
 `NM_AUX_SERVICE_mapreduce.shuffle=AAA0+gA=
 ': not a valid identifier
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
 at org.apache.hadoop.util.Shell.run(Shell.java:379)
 at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
 at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:270)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:78)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 .Failing this attempt.. Failing the application.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start

2013-09-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776975#comment-13776975
 ] 

Hadoop QA commented on YARN-1229:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12604917/YARN-1229.5.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2007//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2007//console

This message is automatically generated.

 Shell$ExitCodeException could happen if AM fails to start
 -

 Key: YARN-1229
 URL: https://issues.apache.org/jira/browse/YARN-1229
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
Priority: Blocker
 Fix For: 2.1.2-beta

 Attachments: YARN-1229.1.patch, YARN-1229.2.patch, YARN-1229.3.patch, 
 YARN-1229.4.patch, YARN-1229.5.patch, YARN-1229.6.patch


 I run a sleep job. If the AM fails to start, this exception could occur:
 13/09/20 11:00:23 INFO mapreduce.Job: Job job_1379673267098_0020 failed with 
 state FAILED due to: Application application_1379673267098_0020 failed 1 
 times due to AM Container for appattempt_1379673267098_0020_01 exited 
 with  exitCode: 1 due to: Exception from container-launch:
 org.apache.hadoop.util.Shell$ExitCodeException: 
 /myappcache/application_1379673267098_0020/container_1379673267098_0020_01_01/launch_container.sh:
  line 12: export: 
 `NM_AUX_SERVICE_mapreduce.shuffle=AAA0+gA=
 ': not a valid identifier
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
 at org.apache.hadoop.util.Shell.run(Shell.java:379)
 at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
 at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:270)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:78)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 .Failing this attempt.. Failing the application.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start

2013-09-24 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776979#comment-13776979
 ] 

Siddharth Seth commented on YARN-1229:
--

+1. Committing.

 Shell$ExitCodeException could happen if AM fails to start
 -

 Key: YARN-1229
 URL: https://issues.apache.org/jira/browse/YARN-1229
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
Priority: Blocker
 Fix For: 2.1.2-beta

 Attachments: YARN-1229.1.patch, YARN-1229.2.patch, YARN-1229.3.patch, 
 YARN-1229.4.patch, YARN-1229.5.patch, YARN-1229.6.patch


 I run a sleep job. If the AM fails to start, this exception could occur:
 13/09/20 11:00:23 INFO mapreduce.Job: Job job_1379673267098_0020 failed with 
 state FAILED due to: Application application_1379673267098_0020 failed 1 
 times due to AM Container for appattempt_1379673267098_0020_01 exited 
 with  exitCode: 1 due to: Exception from container-launch:
 org.apache.hadoop.util.Shell$ExitCodeException: 
 /myappcache/application_1379673267098_0020/container_1379673267098_0020_01_01/launch_container.sh:
  line 12: export: 
 `NM_AUX_SERVICE_mapreduce.shuffle=AAA0+gA=
 ': not a valid identifier
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
 at org.apache.hadoop.util.Shell.run(Shell.java:379)
 at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
 at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:270)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:78)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 .Failing this attempt.. Failing the application.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1232) Configuration support for RM HA

2013-09-24 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-1232:
---

Attachment: yarn-1232-2.patch

Patch that adds descriptions and tests for HAUtil, to be applied on trunk.

 Configuration support for RM HA
 ---

 Key: YARN-1232
 URL: https://issues.apache.org/jira/browse/YARN-1232
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
  Labels: ha
 Attachments: yarn-1232-1.patch, yarn-1232-2.patch


 We should augment the configuration to allow users to specify two RMs and the 
 individual RPC addresses for them. This blocks 
 ConfiguredFailoverProxyProvider.
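
A minimal sketch of the kind of configuration this would enable; the property names follow the RM HA keys as they eventually shipped and are assumptions here, not necessarily what yarn-1232-2.patch defines:
{code}
import org.apache.hadoop.conf.Configuration;

// Hypothetical sketch: two RMs (rm1, rm2) with individual RPC addresses.
public class RmHaConfSketch {
  public static Configuration twoRmConf() {
    Configuration conf = new Configuration();
    conf.setBoolean("yarn.resourcemanager.ha.enabled", true);
    conf.set("yarn.resourcemanager.ha.rm-ids", "rm1,rm2");
    conf.set("yarn.resourcemanager.address.rm1", "rm1.example.com:8032");
    conf.set("yarn.resourcemanager.address.rm2", "rm2.example.com:8032");
    return conf;
  }
}
{code}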

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Moved] (YARN-1234) Container localizer logs are not created in secured cluster

2013-09-24 Thread Omkar Vinit Joshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Omkar Vinit Joshi moved MAPREDUCE-5532 to YARN-1234:


Key: YARN-1234  (was: MAPREDUCE-5532)
Project: Hadoop YARN  (was: Hadoop Map/Reduce)

  Container localizer logs are not created in secured cluster
 

 Key: YARN-1234
 URL: https://issues.apache.org/jira/browse/YARN-1234
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Omkar Vinit Joshi
Assignee: Omkar Vinit Joshi

 When we are running ContainerLocalizer in a secured cluster we potentially are 
 not creating any log file to track log messages. Having one would help in 
 identifying ContainerLocalization issues in a secured cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1234) Container localizer logs are not created in secured cluster

2013-09-24 Thread Omkar Vinit Joshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Omkar Vinit Joshi updated YARN-1234:


Fix Version/s: 2.1.2-beta

  Container localizer logs are not created in secured cluster
 

 Key: YARN-1234
 URL: https://issues.apache.org/jira/browse/YARN-1234
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Omkar Vinit Joshi
Assignee: Omkar Vinit Joshi
 Fix For: 2.1.2-beta


 When we are running ContainerLocalizer in a secured cluster we potentially are 
 not creating any log file to track log messages. Having one would help in 
 identifying ContainerLocalization issues in a secured cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1234) Container localizer logs are not created in secured cluster

2013-09-24 Thread Omkar Vinit Joshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Omkar Vinit Joshi updated YARN-1234:


Component/s: nodemanager

  Container localizer logs are not created in secured cluster
 

 Key: YARN-1234
 URL: https://issues.apache.org/jira/browse/YARN-1234
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Omkar Vinit Joshi
Assignee: Omkar Vinit Joshi
 Fix For: 2.1.2-beta


 When we are running ContainerLocalizer in a secured cluster we potentially are 
 not creating any log file to track log messages. Having one would help in 
 identifying ContainerLocalization issues in a secured cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start

2013-09-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777009#comment-13777009
 ] 

Hudson commented on YARN-1229:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #4463 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4463/])
YARN-1229. Define constraints on Auxiliary Service names. Change ShuffleHandler 
service name from mapreduce.shuffle to mapreduce_shuffle. Contributed by Xuan 
Gong. (sseth: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1526065)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/ClusterSetup.apt.vm
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/SingleCluster.apt.vm
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/INSTALL
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/PluggableShuffleAndPluggableSort.apt.vm
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxServices.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestAuxServices.java


 Shell$ExitCodeException could happen if AM fails to start
 -

 Key: YARN-1229
 URL: https://issues.apache.org/jira/browse/YARN-1229
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
Priority: Blocker
 Fix For: 2.1.2-beta

 Attachments: YARN-1229.1.patch, YARN-1229.2.patch, YARN-1229.3.patch, 
 YARN-1229.4.patch, YARN-1229.5.patch, YARN-1229.6.patch


 I run a sleep job. If the AM fails to start, this exception could occur:
 13/09/20 11:00:23 INFO mapreduce.Job: Job job_1379673267098_0020 failed with 
 state FAILED due to: Application application_1379673267098_0020 failed 1 
 times due to AM Container for appattempt_1379673267098_0020_01 exited 
 with  exitCode: 1 due to: Exception from container-launch:
 org.apache.hadoop.util.Shell$ExitCodeException: 
 /myappcache/application_1379673267098_0020/container_1379673267098_0020_01_01/launch_container.sh:
  line 12: export: 
 `NM_AUX_SERVICE_mapreduce.shuffle=AAA0+gA=
 ': not a valid identifier
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
 at org.apache.hadoop.util.Shell.run(Shell.java:379)
 at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
 at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:270)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:78)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 .Failing this attempt.. Failing the application.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1232) Configuration support for RM HA

2013-09-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777021#comment-13777021
 ] 

Hadoop QA commented on YARN-1232:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12604931/yarn-1232-2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 1 
release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  org.apache.hadoop.yarn.client.cli.TestYarnCLI
  org.apache.hadoop.yarn.client.TestGetGroups
  org.apache.hadoop.yarn.client.api.impl.TestYarnClient
  org.apache.hadoop.yarn.client.api.impl.TestAMRMClient
  org.apache.hadoop.yarn.client.api.impl.TestNMClient
  org.apache.hadoop.yarn.conf.TestYarnConfiguration
  org.apache.hadoop.yarn.logaggregation.TestLogDumper
  
org.apache.hadoop.yarn.server.resourcemanager.security.TestAMRMTokens
  
org.apache.hadoop.yarn.server.resourcemanager.TestApplicationMasterLauncher
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesFairScheduler
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebApp
  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler
  
org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.TestRMContainerImpl
  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFSLeafQueue
  
org.apache.hadoop.yarn.server.resourcemanager.recovery.TestRMStateStore
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps
  org.apache.hadoop.yarn.server.resourcemanager.TestRMHA
  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.TestSchedulerUtils
  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestQueueParsing
  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestChildQueueOrder
  
org.apache.hadoop.yarn.server.resourcemanager.TestResourceManager
  
org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService
  
org.apache.hadoop.yarn.server.resourcemanager.TestApplicationCleanup
  
org.apache.hadoop.yarn.server.resourcemanager.TestClientRMService
  
org.apache.hadoop.yarn.server.resourcemanager.TestFifoScheduler
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesCapacitySched
  
org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions
  
org.apache.hadoop.yarn.server.resourcemanager.TestRMNodeTransitions
  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerEventLog
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes
  
org.apache.hadoop.yarn.server.resourcemanager.applicationmasterservice.TestApplicationMasterService
  
org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRMRPCNodeUpdates
  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationLimits
  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.TestFifoScheduler
  org.apache.hadoop.yarn.server.resourcemanager.TestRM
  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestLeafQueue
  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestParentQueue
  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.TestQueueMetrics
  

[jira] [Commented] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start

2013-09-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777023#comment-13777023
 ] 

Hadoop QA commented on YARN-1229:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12604922/YARN-1229.6.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2008//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2008//console

This message is automatically generated.

 Shell$ExitCodeException could happen if AM fails to start
 -

 Key: YARN-1229
 URL: https://issues.apache.org/jira/browse/YARN-1229
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
Priority: Blocker
 Fix For: 2.1.2-beta

 Attachments: YARN-1229.1.patch, YARN-1229.2.patch, YARN-1229.3.patch, 
 YARN-1229.4.patch, YARN-1229.5.patch, YARN-1229.6.patch


 I run a sleep job. If the AM fails to start, this exception could occur:
 13/09/20 11:00:23 INFO mapreduce.Job: Job job_1379673267098_0020 failed with 
 state FAILED due to: Application application_1379673267098_0020 failed 1 
 times due to AM Container for appattempt_1379673267098_0020_01 exited 
 with  exitCode: 1 due to: Exception from container-launch:
 org.apache.hadoop.util.Shell$ExitCodeException: 
 /myappcache/application_1379673267098_0020/container_1379673267098_0020_01_01/launch_container.sh:
  line 12: export: 
 `NM_AUX_SERVICE_mapreduce.shuffle=AAA0+gA=
 ': not a valid identifier
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
 at org.apache.hadoop.util.Shell.run(Shell.java:379)
 at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
 at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:270)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:78)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 .Failing this attempt.. Failing the application.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-899) Get queue administration ACLs working

2013-09-24 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-899:
---

Attachment: YARN-899.7.patch

create patch based on the latest trunk

 Get queue administration ACLs working
 -

 Key: YARN-899
 URL: https://issues.apache.org/jira/browse/YARN-899
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Xuan Gong
 Attachments: YARN-899.1.patch, YARN-899.2.patch, YARN-899.3.patch, 
 YARN-899.4.patch, YARN-899.5.patch, YARN-899.5.patch, YARN-899.6.patch, 
 YARN-899.7.patch


 The Capacity Scheduler documents the 
 yarn.scheduler.capacity.root.<queue-path>.acl_administer_queue config option 
 for controlling who can administer a queue, but it is not hooked up to 
 anything.  The Fair Scheduler could make use of a similar option as well.  
 This is a feature-parity regression from MR1.
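
A minimal sketch of setting the documented option; the queue name is just an example, and how the patch wires the ACL into the scheduler's checks is not shown:
{code}
import org.apache.hadoop.conf.Configuration;

// Hypothetical sketch: grant queue administration on root.default to the
// "hadoop-admins" group (the leading space means "no users, only groups").
public class QueueAdminAclSketch {
  public static Configuration adminAclConf() {
    Configuration conf = new Configuration();
    conf.set("yarn.scheduler.capacity.root.default.acl_administer_queue",
        " hadoop-admins");
    return conf;
  }
}
{code}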

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-899) Get queue administration ACLs working

2013-09-24 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-899:
---

Attachment: YARN-899.8.patch

 Get queue administration ACLs working
 -

 Key: YARN-899
 URL: https://issues.apache.org/jira/browse/YARN-899
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Xuan Gong
 Attachments: YARN-899.1.patch, YARN-899.2.patch, YARN-899.3.patch, 
 YARN-899.4.patch, YARN-899.5.patch, YARN-899.5.patch, YARN-899.6.patch, 
 YARN-899.7.patch, YARN-899.8.patch


 The Capacity Scheduler documents the 
 yarn.scheduler.capacity.root.<queue-path>.acl_administer_queue config option 
 for controlling who can administer a queue, but it is not hooked up to 
 anything.  The Fair Scheduler could make use of a similar option as well.  
 This is a feature-parity regression from MR1.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1089) Add YARN compute units alongside virtual cores

2013-09-24 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777034#comment-13777034
 ] 

Sandy Ryza commented on YARN-1089:
--

I'm ok with waiting until 2.3.  In case it's not clear, the consequence of 
this is that until then it will be impossible to place more tasks on a node 
than its number of virtual cores, which is essentially its number of physical 
cores.

I think we should make YARN-976, documenting the meaning of vcores, a blocker 
for 2.2.

 Add YARN compute units alongside virtual cores
 --

 Key: YARN-1089
 URL: https://issues.apache.org/jira/browse/YARN-1089
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: api
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: YARN-1089-1.patch, YARN-1089.patch


 Based on discussion in YARN-1024, we will add YARN compute units as a 
 resource for requesting and scheduling CPU processing power.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-674) Slow or failing DelegationToken renewals on submission itself make RM unavailable

2013-09-24 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777044#comment-13777044
 ] 

Jian He commented on YARN-674:
--

Is this related to the ClientRMService.renewDelegationToken method?

 Slow or failing DelegationToken renewals on submission itself make RM 
 unavailable
 -

 Key: YARN-674
 URL: https://issues.apache.org/jira/browse/YARN-674
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli

 This was caused by YARN-280. A slow or down NameNode will make it look 
 like the RM is unavailable, as it may run out of RPC handlers due to blocked 
 client submissions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1089) Add YARN compute units alongside virtual cores

2013-09-24 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777051#comment-13777051
 ] 

Bikas Saha commented on YARN-1089:
--

At this point, I am not seeing the benefit of creating yet another CPU-related 
configuration. While I am not against useful configurations, it's already hard 
to configure YARN. Like Vinod and others said, can a summary of the discussions 
made elsewhere be placed here?

 Add YARN compute units alongside virtual cores
 --

 Key: YARN-1089
 URL: https://issues.apache.org/jira/browse/YARN-1089
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: api
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: YARN-1089-1.patch, YARN-1089.patch


 Based on discussion in YARN-1024, we will add YARN compute units as a 
 resource for requesting and scheduling CPU processing power.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1214) Register ClientToken MasterKey in SecretManager after it is saved

2013-09-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777056#comment-13777056
 ] 

Hudson commented on YARN-1214:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #4464 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4464/])
YARN-1214. Register ClientToken MasterKey in SecretManager after it is saved 
(Jian He via bikas) (bikas: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1526078)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttempt.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/ClientToAMTokenSecretManagerInRM.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestRMStateStore.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestClientToAMTokens.java


 Register ClientToken MasterKey in SecretManager after it is saved
 -

 Key: YARN-1214
 URL: https://issues.apache.org/jira/browse/YARN-1214
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Jian He
Assignee: Jian He
Priority: Critical
 Fix For: 2.1.2-beta

 Attachments: YARN-1214.1.patch, YARN-1214.2.patch, YARN-1214.3.patch, 
 YARN-1214.4.patch, YARN-1214.5.patch, YARN-1214.6.patch, YARN-1214.patch


 Currently, the app attempt ClientToken master key is registered before it is 
 saved. This can cause a problem: if the client gets the token before the master 
 key is saved and the RM then crashes, the RM cannot reload the master key after 
 it restarts, because it was never saved. As a result, the client is holding an 
 invalid token.
 We can register the client token master key after it is saved in the store.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1215) Yarn URL should include userinfo

2013-09-24 Thread Chuan Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chuan Liu updated YARN-1215:


Attachment: YARN-1215-trunk.2.patch

Attaching a new patch that adds a userInfo field to 
org.apache.hadoop.yarn.api.records.URL. This appends an optional field to the 
existing .proto file, which is allowed according to the compatibility guide at:
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html#Wire_compatibility

 Yarn URL should include userinfo
 

 Key: YARN-1215
 URL: https://issues.apache.org/jira/browse/YARN-1215
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Chuan Liu
Assignee: Chuan Liu
 Attachments: YARN-1215-trunk.2.patch, YARN-1215-trunk.patch


 In the {{org.apache.hadoop.yarn.api.records.URL}} class, we don't have a 
 userinfo as part of the URL. When converting a {{java.net.URI}} object into 
 the YARN URL object in the {{ConverterUtils.getYarnUrlFromURI()}} method, we 
 set the uri host as the url host. If the uri has a userinfo part, the userinfo is 
 discarded. This leads to information loss if the original uri has a 
 userinfo, e.g. foo://username:passw...@example.com will be converted to 
 foo://example.com and the username/password information is lost during the 
 conversion.
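
A minimal sketch of a userinfo-preserving conversion; URL.setUserInfo() is the setter assumed to be added by the patch, and the rest follows the usual record-setter style rather than the patch itself:
{code}
import java.net.URI;
import org.apache.hadoop.yarn.api.records.URL;
import org.apache.hadoop.yarn.util.Records;

// Hypothetical sketch only; the actual ConverterUtils.getYarnUrlFromURI()
// change in YARN-1215-trunk.2.patch may differ.
public class UriToYarnUrlSketch {
  public static URL toYarnUrl(URI uri) {
    URL url = Records.newRecord(URL.class);
    url.setScheme(uri.getScheme());
    url.setHost(uri.getHost());
    url.setPort(uri.getPort());
    url.setFile(uri.getPath());
    if (uri.getUserInfo() != null) {
      url.setUserInfo(uri.getUserInfo()); // assumed new field/setter
    }
    return url;
  }
}
{code}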

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-899) Get queue administration ACLs working

2013-09-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777069#comment-13777069
 ] 

Hadoop QA commented on YARN-899:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12604938/YARN-899.8.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 11 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2010//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2010//console

This message is automatically generated.

 Get queue administration ACLs working
 -

 Key: YARN-899
 URL: https://issues.apache.org/jira/browse/YARN-899
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Xuan Gong
 Attachments: YARN-899.1.patch, YARN-899.2.patch, YARN-899.3.patch, 
 YARN-899.4.patch, YARN-899.5.patch, YARN-899.5.patch, YARN-899.6.patch, 
 YARN-899.7.patch, YARN-899.8.patch


 The Capacity Scheduler documents the 
 yarn.scheduler.capacity.root.<queue-path>.acl_administer_queue config option 
 for controlling who can administer a queue, but it is not hooked up to 
 anything.  The Fair Scheduler could make use of a similar option as well.  
 This is a feature-parity regression from MR1.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1089) Add YARN compute units alongside virtual cores

2013-09-24 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777071#comment-13777071
 ] 

Sandy Ryza commented on YARN-1089:
--

As was requested, I posted a summary of the proposal on YARN-1024.

In case it's not clear on the summary, here's the problem we're trying to solve:
We want jobs to be portable between clusters. CPU is not a fluid resource in 
the way memory is. The number of cores on a machine is just as important as its 
total processing power when scheduling tasks.

Imagine a cluster where every node has powerful CPUs with many cores.  One type 
of task that will be run on the cluster saturates a full CPU, but another type 
of task that will be run on the cluster contains two threads, each of which can 
saturate only half a full CPU.  If we have a single dimension for CPU requests, 
these tasks will request an equal number of those.  What happens if we then 
move those tasks to a cluster with CPUs whose cores are half as fast?  The 
first task will run half as fast, and the second task will run in the same 
amount of time.  It's in the first task's interest to only request half as many 
CPU resources on that cluster.

I'm also afraid of things getting complicated, but I can't think of anything 
better that doesn't require having the meaning of a virtual core vary widely 
from cluster to cluster.

 Add YARN compute units alongside virtual cores
 --

 Key: YARN-1089
 URL: https://issues.apache.org/jira/browse/YARN-1089
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: api
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: YARN-1089-1.patch, YARN-1089.patch


 Based on discussion in YARN-1024, we will add YARN compute units as a 
 resource for requesting and scheduling CPU processing power.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1128) FifoPolicy.computeShares throws NPE on empty list of Schedulables

2013-09-24 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1128:
-

Hadoop Flags: Reviewed

Committed to trunk, branch-2, and branch-2.1-beta

 FifoPolicy.computeShares throws NPE on empty list of Schedulables
 -

 Key: YARN-1128
 URL: https://issues.apache.org/jira/browse/YARN-1128
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Karthik Kambatla
 Fix For: 2.1.2-beta

 Attachments: yarn-1128-1.patch


 FifoPolicy gives all of a queue's share to the earliest-scheduled application.
 {code}
 Schedulable earliest = null;
 for (Schedulable schedulable : schedulables) {
   if (earliest == null ||
   schedulable.getStartTime()  earliest.getStartTime()) {
 earliest = schedulable;
   }
 }
 earliest.setFairShare(Resources.clone(totalResources));
 {code}
 If the queue has no schedulables in it, earliest will be left null, leading 
 to an NPE on the last line.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

