[jira] [Commented] (YARN-1188) The context of QueueMetrics becomes 'default' when using FairScheduler

2013-09-23 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774336#comment-13774336
 ] 

Sandy Ryza commented on YARN-1188:
--

+1

 The context of QueueMetrics becomes 'default' when using FairScheduler
 --

 Key: YARN-1188
 URL: https://issues.apache.org/jira/browse/YARN-1188
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Akira AJISAKA
Priority: Minor
  Labels: metrics, newbie
 Attachments: YARN-1188.1.patch


 I found the context of QueueMetrics changed from 'yarn' to 'default' when I 
 was using FairScheduler.
 The context should always be 'yarn'; this can be fixed by adding an annotation 
 to FSQueueMetrics like below:
 {code}
 + @Metrics(context="yarn")
 public class FSQueueMetrics extends QueueMetrics {
 {code}
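
 For reference, a minimal standalone sketch (illustrative only; the class and 
 metric names below are made up, and this is not the attached patch) of how a 
 class-level @Metrics annotation pins the metrics2 context to 'yarn':
 {code}
 // Illustrative sketch only, assuming Hadoop's metrics2 annotations; the class
 // and metric below are made up. The class-level @Metrics annotation is what
 // pins the context to 'yarn'; without it the context falls back to 'default'.
 import org.apache.hadoop.metrics2.annotation.Metric;
 import org.apache.hadoop.metrics2.annotation.Metrics;
 import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem;
 import org.apache.hadoop.metrics2.lib.MutableGaugeInt;

 @Metrics(context = "yarn")
 public class ExampleQueueMetrics {

   @Metric("Example pending-apps gauge") MutableGaugeInt appsPending;

   public static ExampleQueueMetrics create() {
     // Registering the annotated object makes the metrics system emit its
     // records tagged with the 'yarn' context.
     return DefaultMetricsSystem.instance().register(
         "ExampleQueueMetrics", "Example queue metrics", new ExampleQueueMetrics());
   }
 }
 {code}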

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1188) The context of QueueMetrics becomes 'default' when using FairScheduler

2013-09-23 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1188:
-

Assignee: Akira AJISAKA

 The context of QueueMetrics becomes 'default' when using FairScheduler
 --

 Key: YARN-1188
 URL: https://issues.apache.org/jira/browse/YARN-1188
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Minor
  Labels: metrics, newbie
 Attachments: YARN-1188.1.patch


 I found the context of QueueMetrics changed from 'yarn' to 'default' when I 
 was using FairScheduler.
 The context should always be 'yarn'; this can be fixed by adding an annotation 
 to FSQueueMetrics like below:
 {code}
 + @Metrics(context="yarn")
 public class FSQueueMetrics extends QueueMetrics {
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1188) The context of QueueMetrics becomes 'default' when using FairScheduler

2013-09-23 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774344#comment-13774344
 ] 

Sandy Ryza commented on YARN-1188:
--

I just committed this to trunk and branch-2.  Thanks Tsuyoshi!

 The context of QueueMetrics becomes 'default' when using FairScheduler
 --

 Key: YARN-1188
 URL: https://issues.apache.org/jira/browse/YARN-1188
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Minor
  Labels: metrics, newbie
 Attachments: YARN-1188.1.patch


 I found the context of QueueMetrics changed from 'yarn' to 'default' when I 
 was using FairScheduler.
 The context should always be 'yarn'; this can be fixed by adding an annotation 
 to FSQueueMetrics like below:
 {code}
 + @Metrics(context="yarn")
 public class FSQueueMetrics extends QueueMetrics {
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1188) The context of QueueMetrics becomes 'default' when using FairScheduler

2013-09-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774350#comment-13774350
 ] 

Hudson commented on YARN-1188:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #4451 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4451/])
Fix credit for YARN-1188 (sandy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525518)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt


 The context of QueueMetrics becomes 'default' when using FairScheduler
 --

 Key: YARN-1188
 URL: https://issues.apache.org/jira/browse/YARN-1188
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Akira AJISAKA
Priority: Minor
  Labels: metrics, newbie
 Attachments: YARN-1188.1.patch


 I found the context of QueueMetrics changed from 'yarn' to 'default' when I 
 was using FairScheduler.
 The context should always be 'yarn'; this can be fixed by adding an annotation 
 to FSQueueMetrics like below:
 {code}
 + @Metrics(context="yarn")
 public class FSQueueMetrics extends QueueMetrics {
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1188) The context of QueueMetrics becomes 'default' when using FairScheduler

2013-09-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774457#comment-13774457
 ] 

Hudson commented on YARN-1188:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #341 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/341/])
Fix credit for YARN-1188 (sandy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525518)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
YARN-1188. The context of QueueMetrics becomes default when using FairScheduler 
(Akira Ajisaka via Sandy Ryza) (sandy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525516)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSQueueMetrics.java


 The context of QueueMetrics becomes 'default' when using FairScheduler
 --

 Key: YARN-1188
 URL: https://issues.apache.org/jira/browse/YARN-1188
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Akira AJISAKA
Priority: Minor
  Labels: metrics, newbie
 Attachments: YARN-1188.1.patch


 I found the context of QueueMetrics changed from 'yarn' to 'default' when I 
 was using FairScheduler.
 The context should always be 'yarn'; this can be fixed by adding an annotation 
 to FSQueueMetrics like below:
 {code}
 + @Metrics(context="yarn")
 public class FSQueueMetrics extends QueueMetrics {
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1188) The context of QueueMetrics becomes 'default' when using FairScheduler

2013-09-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774531#comment-13774531
 ] 

Hudson commented on YARN-1188:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1557 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1557/])
Fix credit for YARN-1188 (sandy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525518)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
YARN-1188. The context of QueueMetrics becomes default when using FairScheduler 
(Akira Ajisaka via Sandy Ryza) (sandy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525516)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSQueueMetrics.java


 The context of QueueMetrics becomes 'default' when using FairScheduler
 --

 Key: YARN-1188
 URL: https://issues.apache.org/jira/browse/YARN-1188
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Akira AJISAKA
Priority: Minor
  Labels: metrics, newbie
 Attachments: YARN-1188.1.patch


 I found the context of QueueMetrics changed from 'yarn' to 'default' when I 
 was using FairScheduler.
 The context should always be 'yarn'; this can be fixed by adding an annotation 
 to FSQueueMetrics like below:
 {code}
 + @Metrics(context="yarn")
 public class FSQueueMetrics extends QueueMetrics {
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1188) The context of QueueMetrics becomes 'default' when using FairScheduler

2013-09-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774537#comment-13774537
 ] 

Hudson commented on YARN-1188:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1531 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1531/])
Fix credit for YARN-1188 (sandy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525518)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
YARN-1188. The context of QueueMetrics becomes default when using FairScheduler 
(Akira Ajisaka via Sandy Ryza) (sandy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525516)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSQueueMetrics.java


 The context of QueueMetrics becomes 'default' when using FairScheduler
 --

 Key: YARN-1188
 URL: https://issues.apache.org/jira/browse/YARN-1188
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Akira AJISAKA
Priority: Minor
  Labels: metrics, newbie
 Attachments: YARN-1188.1.patch


 I found the context of QueueMetrics changed from 'yarn' to 'default' when I 
 was using FairScheduler.
 The context should always be 'yarn'; this can be fixed by adding an annotation 
 to FSQueueMetrics like below:
 {code}
 + @Metrics(context="yarn")
 public class FSQueueMetrics extends QueueMetrics {
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1041) RM to bind and notify a restarted AM of existing containers

2013-09-23 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774539#comment-13774539
 ] 

Steve Loughran commented on YARN-1041:
--

I'd add that discarding outstanding requests on AM restart would spare the AM 
the problem of handling container allocations it was not expecting. After 
enumerating the active set of containers, the AM could make its own decisions 
about how many containers it needs, and where.

 RM to bind and notify a restarted AM of existing containers
 ---

 Key: YARN-1041
 URL: https://issues.apache.org/jira/browse/YARN-1041
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 3.0.0
Reporter: Steve Loughran

 For long-lived containers we don't want the AM to be a SPOF.
 When the RM restarts a (failed) AM, it should be given the list of containers 
 it had already been allocated. The AM should then be able to contact the NMs 
 to get details on them. NMs would also need to do any binding of the 
 containers needed to handle a moved/restarted AM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1226) ipv4 and ipv6 affect job data locality

2013-09-23 Thread Kaibo Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaibo Zhou updated YARN-1226:
-

Description: 
When I run a MapReduce job which uses TableInputFormat to scan an HBase table 
on a YARN cluster with 140+ nodes, I consistently get very low data locality, 
around 0~10%. 

The scheduler is the Capacity Scheduler. HBase and Hadoop are integrated in the 
cluster, with NodeManager, DataNode and HRegionServer running on the same node.

The reason for the low data locality is that most machines in the cluster use 
IPv6 and only a few use IPv4. NodeManager uses 
InetAddress.getLocalHost().getHostName() to get the host name, but the result 
of this call depends on whether IPv4 or IPv6 is in use; see 
[InetAddress.getLocalHost().getHostName() returns 
FQDN|http://bugs.sun.com/view_bug.do?bug_id=7166687]. 

On machines with IPv4, NodeManager gets the hostName 
search042097.sqa.cm4.site.net, but on machines with IPv6 it gets 
search042097.sqa.cm4. If run with IPv6 disabled 
(-Djava.net.preferIPv4Stack=true), it returns search042097.sqa.cm4.site.net.

For the MapReduce job that scans the HBase table, the InputSplit contains node 
locations as [FQDNs|http://en.wikipedia.org/wiki/FQDN], e.g. 
search042097.sqa.cm4.site.net, because in HBase the RegionServers' hostnames 
are assigned by the HMaster. The HMaster communicates with the RegionServers 
and gets each region server's host name using Java NIO: 
clientChannel.socket().getInetAddress().getHostName().
Also see the startup log of a region server:

13:06:21,200 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master 
passed us hostname to use. Was=search042024.sqa.cm4, 
Now=search042024.sqa.cm4.site.net

As you can see, most machines in the YARN cluster with IPv6 get the short 
hostname, but HBase always gets the full hostname, so the hosts cannot be 
matched (see RMContainerAllocator::assignToMap). This leads to poor locality.

After I used java.net.preferIPv4Stack to force IPv4 in YARN, I got 70+% data 
locality in the cluster.

Thanks,
Kaibo

  was:
When I run a mapreduce job which use TableInputFormat to scan a hbase table on 
yarn cluser with 140+ nodes, I consistently get very low data locality around 
0~10%. 

The scheduler is Capacity Scheduler. Hbase and hadoop are integrated in the 
cluster with NodeManager, DataNode and HRegionServer run on the same node.

The reason of low data locality is: most machines in the cluster uses IPV6, few 
machines use IPV4. NodeManager use InetAddress.getLocalHost().getHostName()
 to get the host name, but the return result of this function depends on IPV4 
or IPV6, see [InetAddress.getLocalHost().getHostName() returns 
FQDN|http://bugs.sun.com/view_bug.do?bug_id=7166687]. 

On machines with ipv4, NodeManager get hostName as: 
search042097.sqa.cm4.site.net
But on machines with ipv6, NodeManager get hostName as: search042097.sqa.cm4
if run with IPv6 disabled, -Djava.net.preferIPv4Stack=true, then returns 
search042097.sqa.cm4.site.net.

For the mapred job which scan hbase table, the InputSplit contains node 
locations of FQDN, e.g. search042097.sqa.cm4.site.net. Because in hbase, the 
RegionServers' hostnames are allocated by HMaster. HMaster communicate with 
RegionServers and get the region server's host name use java NIO: 
clientChannel.socket().getInetAddress().getHostName().
Also see the startup log of region server:

13:06:21,200 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master 
passed us hostname to use. Was=search042024.sqa.cm4, 
Now=search042024.sqa.cm4.site.net

As you can see, most machines in the Yarn cluster with IPV6 get the short 
hostname, but hbase always get the full hostname, so the Host cannot matched 
(see RMContainerAllocator::assignToMap).This can lead to poor locality.

After I use java.net.preferIPv4Stack to force IPv4 in yarn, I get 70+% data 
locality in the cluster.

Thanks,
Kaibo


 ipv4 and ipv6 affect job data locality
 --

 Key: YARN-1226
 URL: https://issues.apache.org/jira/browse/YARN-1226
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler
Affects Versions: 0.23.3, 2.0.0-alpha, 2.1.0-beta
Reporter: Kaibo Zhou
Priority: Minor

 When I run a MapReduce job which uses TableInputFormat to scan an HBase table 
 on a YARN cluster with 140+ nodes, I consistently get very low data locality, 
 around 0~10%. 
 The scheduler is the Capacity Scheduler. HBase and Hadoop are integrated in 
 the cluster, with NodeManager, DataNode and HRegionServer running on the same 
 node.
 The reason for the low data locality is that most machines in the cluster use 
 IPv6 and only a few use IPv4. NodeManager uses 
 InetAddress.getLocalHost().getHostName() to get the host name, but the result 
 of this call depends on whether IPv4 or IPv6 is in use; see 
 [InetAddress.getLocalHost().getHostName() returns 
 

[jira] [Updated] (YARN-1226) ipv4 and ipv6 affect job data locality

2013-09-23 Thread Kaibo Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaibo Zhou updated YARN-1226:
-

Description: 
When I run a MapReduce job which uses TableInputFormat to scan an HBase table 
on a YARN cluster with 140+ nodes, I consistently get very low data locality, 
around 0~10%. 

The scheduler is the Capacity Scheduler. HBase and Hadoop are integrated in the 
cluster, with NodeManager, DataNode and HRegionServer running on the same node.

The reason for the low data locality is that most machines in the cluster use 
IPv6 and only a few use IPv4. NodeManager uses 
InetAddress.getLocalHost().getHostName() to get the host name, but the result 
of this call depends on whether IPv4 or IPv6 is in use; see 
[InetAddress.getLocalHost().getHostName() returns 
FQDN|http://bugs.sun.com/view_bug.do?bug_id=7166687]. 

On machines with IPv4, NodeManager gets the hostName 
search042097.sqa.cm4.site.net, but on machines with IPv6 it gets 
search042097.sqa.cm4. If run with IPv6 disabled 
(-Djava.net.preferIPv4Stack=true), it returns search042097.sqa.cm4.site.net.

For the MapReduce job that scans the HBase table, the InputSplit contains node 
locations as FQDNs, e.g. search042097.sqa.cm4.site.net, because in HBase the 
RegionServers' hostnames are assigned by the HMaster. The HMaster communicates 
with the RegionServers and gets each region server's host name using Java NIO: 
clientChannel.socket().getInetAddress().getHostName().
Also see the startup log of a region server:

13:06:21,200 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master 
passed us hostname to use. Was=search042024.sqa.cm4, 
Now=search042024.sqa.cm4.site.net

As you can see, most machines in the YARN cluster with IPv6 get the short 
hostname, but HBase always gets the full hostname, so the hosts cannot be 
matched (see RMContainerAllocator::assignToMap). This leads to poor locality.

After I used java.net.preferIPv4Stack to force IPv4 in YARN, I got 70+% data 
locality in the cluster.

Thanks,
Kaibo

  was:
When I run a mapreduce job which use TableInputFormat to scan a hbase table on 
yarn cluser with 140+ nodes, I consistently get very low data locality around 
0~10%. 

The scheduler is Capacity Scheduler. Hbase and hadoop are integrated in the 
cluster with NodeManager, DataNode and HRegionServer run on the same node.

The reason of low data locality is: most machines in the cluster uses IPV6, few 
machines use IPV4. NodeManager use InetAddress.getLocalHost().getHostName()
 to get the host name, but the return result of this function depends on IPV4 
or IPV6, see [InetAddress.getLocalHost().getHostName() returns 
FQDN](http://bugs.sun.com/view_bug.do?bug_id=7166687). 

On machines with ipv4, NodeManager get hostName as: 
search042097.sqa.cm4.site.net
But on machines with ipv6, NodeManager get hostName as: search042097.sqa.cm4
if run with IPv6 disabled, -Djava.net.preferIPv4Stack=true, then returns 
search042097.sqa.cm4.site.net.

For the mapred job which scan hbase table, the InputSplit contains node 
locations of FQDN, e.g. search042097.sqa.cm4.site.net. Because in hbase, the 
RegionServers' hostnames are allocated by HMaster. HMaster communicate with 
RegionServers and get the region server's host name use java NIO: 
clientChannel.socket().getInetAddress().getHostName().
Also see the startup log of region server:

13:06:21,200 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master 
passed us hostname to use. Was=search042024.sqa.cm4, 
Now=search042024.sqa.cm4.site.net

As you can see, most machines in the Yarn cluster with IPV6 get the short 
hostname, but hbase always get the full hostname, so the Host cannot matched 
(see RMContainerAllocator::assignToMap).This can lead to poor locality.

After I use java.net.preferIPv4Stack to force IPv4 in yarn, I get 70+% data 
locality in the cluster.

Thanks,
Kaibo


 ipv4 and ipv6 affect job data locality
 --

 Key: YARN-1226
 URL: https://issues.apache.org/jira/browse/YARN-1226
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler
Affects Versions: 0.23.3, 2.0.0-alpha, 2.1.0-beta
Reporter: Kaibo Zhou
Priority: Minor

 When I run a MapReduce job which uses TableInputFormat to scan an HBase table 
 on a YARN cluster with 140+ nodes, I consistently get very low data locality, 
 around 0~10%. 
 The scheduler is the Capacity Scheduler. HBase and Hadoop are integrated in 
 the cluster, with NodeManager, DataNode and HRegionServer running on the same 
 node.
 The reason for the low data locality is that most machines in the cluster use 
 IPv6 and only a few use IPv4. NodeManager uses 
 InetAddress.getLocalHost().getHostName() to get the host name, but the result 
 of this call depends on whether IPv4 or IPv6 is in use; see 
 [InetAddress.getLocalHost().getHostName() returns 
 

[jira] [Updated] (YARN-1226) ipv4 and ipv6 affect job data locality

2013-09-23 Thread Kaibo Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaibo Zhou updated YARN-1226:
-

Description: 
When I run a MapReduce job which uses TableInputFormat to scan an HBase table 
on a YARN cluster with 140+ nodes, I consistently get very low data locality, 
around 0~10%. 

The scheduler is the Capacity Scheduler. HBase and Hadoop are integrated in the 
cluster, with NodeManager, DataNode and HRegionServer running on the same node.

The reason for the low data locality is that most machines in the cluster use 
IPv6 and only a few use IPv4. NodeManager uses 
InetAddress.getLocalHost().getHostName() to get the host name, but the result 
of this call depends on whether IPv4 or IPv6 is in use; see 
[InetAddress.getLocalHost().getHostName() returns 
FQDN|http://bugs.sun.com/view_bug.do?bug_id=7166687]. 

On machines with IPv4, NodeManager gets the hostName 
search042097.sqa.cm4.site.net, but on machines with IPv6 it gets 
search042097.sqa.cm4. If run with IPv6 disabled 
(-Djava.net.preferIPv4Stack=true), it returns search042097.sqa.cm4.site.net.


For the MapReduce job that scans the HBase table, the InputSplit contains node 
locations as [FQDNs|http://en.wikipedia.org/wiki/FQDN], e.g. 
search042097.sqa.cm4.site.net, because in HBase the RegionServers' hostnames 
are assigned by the HMaster. The HMaster communicates with the RegionServers 
and gets each region server's host name using Java NIO: 
clientChannel.socket().getInetAddress().getHostName().
Also see the startup log of a region server:

13:06:21,200 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master 
passed us hostname to use. Was=search042024.sqa.cm4, 
Now=search042024.sqa.cm4.site.net


As you can see, most machines in the YARN cluster with IPv6 get the short 
hostname, but HBase always gets the full hostname, so the hosts cannot be 
matched (see RMContainerAllocator::assignToMap). This leads to poor locality.

After I used java.net.preferIPv4Stack to force IPv4 in YARN, I got 70+% data 
locality in the cluster.

Thanks,
Kaibo

  was:
When I run a mapreduce job which use TableInputFormat to scan a hbase table on 
yarn cluser with 140+ nodes, I consistently get very low data locality around 
0~10%. 

The scheduler is Capacity Scheduler. Hbase and hadoop are integrated in the 
cluster with NodeManager, DataNode and HRegionServer run on the same node.

The reason of low data locality is: most machines in the cluster uses IPV6, few 
machines use IPV4. NodeManager use InetAddress.getLocalHost().getHostName()
 to get the host name, but the return result of this function depends on IPV4 
or IPV6, see [InetAddress.getLocalHost().getHostName() returns 
FQDN|http://bugs.sun.com/view_bug.do?bug_id=7166687]. 

On machines with ipv4, NodeManager get hostName as: 
search042097.sqa.cm4.site.net
But on machines with ipv6, NodeManager get hostName as: search042097.sqa.cm4
if run with IPv6 disabled, -Djava.net.preferIPv4Stack=true, then returns 
search042097.sqa.cm4.site.net.

For the mapred job which scan hbase table, the InputSplit contains node 
locations of [FQDN|http://en.wikipedia.org/wiki/FQDN], e.g. 
search042097.sqa.cm4.site.net. Because in hbase, the RegionServers' hostnames 
are allocated by HMaster. HMaster communicate with RegionServers and get the 
region server's host name use java NIO: 
clientChannel.socket().getInetAddress().getHostName().
Also see the startup log of region server:

13:06:21,200 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master 
passed us hostname to use. Was=search042024.sqa.cm4, 
Now=search042024.sqa.cm4.site.net

As you can see, most machines in the Yarn cluster with IPV6 get the short 
hostname, but hbase always get the full hostname, so the Host cannot matched 
(see RMContainerAllocator::assignToMap).This can lead to poor locality.

After I use java.net.preferIPv4Stack to force IPv4 in yarn, I get 70+% data 
locality in the cluster.

Thanks,
Kaibo


 ipv4 and ipv6 affect job data locality
 --

 Key: YARN-1226
 URL: https://issues.apache.org/jira/browse/YARN-1226
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler
Affects Versions: 0.23.3, 2.0.0-alpha, 2.1.0-beta
Reporter: Kaibo Zhou
Priority: Minor

 When I run a MapReduce job which uses TableInputFormat to scan an HBase table 
 on a YARN cluster with 140+ nodes, I consistently get very low data locality, 
 around 0~10%. 
 The scheduler is the Capacity Scheduler. HBase and Hadoop are integrated in 
 the cluster, with NodeManager, DataNode and HRegionServer running on the same 
 node.
 The reason for the low data locality is that most machines in the cluster use 
 IPv6 and only a few use IPv4. NodeManager uses 
 InetAddress.getLocalHost().getHostName() to get the host name, but the result 
 of this call depends on whether IPv4 or IPv6 is in use; see 
 

[jira] [Updated] (YARN-1226) ipv4 and ipv6 affect job data locality

2013-09-23 Thread Kaibo Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaibo Zhou updated YARN-1226:
-

Description: 
When I run a MapReduce job which uses TableInputFormat to scan an HBase table 
on a YARN cluster with 140+ nodes, I consistently get very low data locality, 
around 0~10%. 

The scheduler is the Capacity Scheduler. HBase and Hadoop are integrated in the 
cluster, with NodeManager, DataNode and HRegionServer running on the same node.

The reason for the low data locality is that most machines in the cluster use 
IPv6 and only a few use IPv4. NodeManager uses 
InetAddress.getLocalHost().getHostName() to get the host name, but the result 
of this call depends on whether IPv4 or IPv6 is in use; see 
[InetAddress.getLocalHost().getHostName() returns 
FQDN|http://bugs.sun.com/view_bug.do?bug_id=7166687]. 

On machines with IPv4, NodeManager gets the hostName 
search042097.sqa.cm4.site.net, but on machines with IPv6 it gets 
search042097.sqa.cm4. If run with IPv6 disabled 
(-Djava.net.preferIPv4Stack=true), it returns search042097.sqa.cm4.site.net.

For the MapReduce job that scans the HBase table, the InputSplit contains node 
locations as [FQDNs|http://en.wikipedia.org/wiki/FQDN], e.g. 
search042097.sqa.cm4.site.net, because in HBase the RegionServers' hostnames 
are assigned by the HMaster. The HMaster communicates with the RegionServers 
and gets each region server's host name using Java NIO: 
clientChannel.socket().getInetAddress().getHostName().
Also see the startup log of a region server:

13:06:21,200 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master 
passed us hostname to use. Was=search042024.sqa.cm4, 
Now=search042024.sqa.cm4.site.net

As you can see, most machines in the YARN cluster with IPv6 get the short 
hostname, but HBase always gets the full hostname, so the hosts cannot be 
matched (see RMContainerAllocator::assignToMap). This leads to poor locality.

After I used java.net.preferIPv4Stack to force IPv4 in YARN, I got 70+% data 
locality in the cluster.

Thanks,
Kaibo

  was:
When I run a mapreduce job which use TableInputFormat to scan a hbase table on 
yarn cluser with 140+ nodes, I consistently get very low data locality around 
0~10%. 

The scheduler is Capacity Scheduler. Hbase and hadoop are integrated in the 
cluster with NodeManager, DataNode and HRegionServer run on the same node.

The reason of low data locality is: most machines in the cluster uses IPV6, few 
machines use IPV4. NodeManager use InetAddress.getLocalHost().getHostName()
 to get the host name, but the return result of this function depends on IPV4 
or IPV6, see [InetAddress.getLocalHost().getHostName() returns 
FQDN|http://bugs.sun.com/view_bug.do?bug_id=7166687]. 

On machines with ipv4, NodeManager get hostName as: 
search042097.sqa.cm4.site.net
But on machines with ipv6, NodeManager get hostName as: search042097.sqa.cm4
if run with IPv6 disabled, -Djava.net.preferIPv4Stack=true, then returns 
search042097.sqa.cm4.site.net.

For the mapred job which scan hbase table, the InputSplit contains node 
locations of [FQDN|http://en.wikipedia.org/wiki/FQDN], e.g. 
search042097.sqa.cm4.site.net. Because in hbase, the RegionServers' hostnames 
are allocated by HMaster. HMaster communicate with RegionServers and get the 
region server's host name use java NIO: 
clientChannel.socket().getInetAddress().getHostName().
Also see the startup log of region server:

13:06:21,200 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master 
passed us hostname to use. Was=search042024.sqa.cm4, 
Now=search042024.sqa.cm4.site.net

As you can see, most machines in the Yarn cluster with IPV6 get the short 
hostname, but hbase always get the full hostname, so the Host cannot matched 
(see RMContainerAllocator::assignToMap).This can lead to poor locality.

After I use java.net.preferIPv4Stack to force IPv4 in yarn, I get 70+% data 
locality in the cluster.

Thanks,
Kaibo


 ipv4 and ipv6 affect job data locality
 --

 Key: YARN-1226
 URL: https://issues.apache.org/jira/browse/YARN-1226
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler
Affects Versions: 0.23.3, 2.0.0-alpha, 2.1.0-beta
Reporter: Kaibo Zhou
Priority: Minor

 When I run a MapReduce job which uses TableInputFormat to scan an HBase table 
 on a YARN cluster with 140+ nodes, I consistently get very low data locality, 
 around 0~10%. 
 The scheduler is the Capacity Scheduler. HBase and Hadoop are integrated in 
 the cluster, with NodeManager, DataNode and HRegionServer running on the same 
 node.
 The reason for the low data locality is that most machines in the cluster use 
 IPv6 and only a few use IPv4. NodeManager uses 
 InetAddress.getLocalHost().getHostName() to get the host name, but the result 
 of this call depends on whether IPv4 or IPv6 is in use; see 

[jira] [Created] (YARN-1226) ipv4 and ipv6 affect job data locality

2013-09-23 Thread Kaibo Zhou (JIRA)
Kaibo Zhou created YARN-1226:


 Summary: ipv4 and ipv6 affect job data locality
 Key: YARN-1226
 URL: https://issues.apache.org/jira/browse/YARN-1226
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler
Affects Versions: 2.1.0-beta, 2.0.0-alpha, 0.23.3
Reporter: Kaibo Zhou
Priority: Minor


When I run a MapReduce job which uses TableInputFormat to scan an HBase table 
on a YARN cluster with 140+ nodes, I consistently get very low data locality 
(with the Capacity Scheduler), around 0~10%.

HBase and Hadoop are integrated in the cluster, with NodeManager, DataNode and 
HRegionServer running on the same node.

The reason for the low data locality is that most machines in the cluster use 
IPv6 and only a few use IPv4. NodeManager uses 
InetAddress.getLocalHost().getHostName() to get the host name, but the result 
of this call depends on whether IPv4 or IPv6 is in use; see 
http://bugs.sun.com/view_bug.do?bug_id=7166687. 

On machines with IPv4, NodeManager gets the hostName 
search042097.sqa.cm4.site.net, but on machines with IPv6 it gets 
search042097.sqa.cm4. If run with IPv6 disabled 
(-Djava.net.preferIPv4Stack=true), it returns search042097.sqa.cm4.site.net.

For the MapReduce job that scans the HBase table, the InputSplit contains node 
locations as FQDNs, e.g. search042097.sqa.cm4.site.net, because in HBase the 
RegionServers' hostnames are assigned by the HMaster. The HMaster communicates 
with the RegionServers and gets each region server's host name using Java NIO: 
clientChannel.socket().getInetAddress().getHostName().
Also see the startup log of a region server:

13:06:21,200 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master 
passed us hostname to use. Was=search042024.sqa.cm4, 
Now=search042024.sqa.cm4.site.net

As you can see, most machines in the YARN cluster with IPv6 get the short 
hostname, but HBase always gets the full hostname, so the hosts cannot be 
matched (see RMContainerAllocator::assignToMap). This leads to poor locality.

After I used java.net.preferIPv4Stack to force IPv4 in YARN, I got 70+% data 
locality in the cluster.

Thanks,
Kaibo
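
As a side note, a tiny standalone sketch (hypothetical class name, not part of 
any patch) that prints the two hostname views involved in the mismatch 
described above:
{code}
// Illustrative sketch only: prints the hostname the JVM resolves for the local
// machine (what NodeManager reports) next to the canonical FQDN form. On a
// dual-stack host these can differ (short name vs. FQDN), which is the
// mismatch against HBase's FQDN-based split locations described above.
import java.net.InetAddress;
import java.net.UnknownHostException;

public class HostnameCheck {
  public static void main(String[] args) throws UnknownHostException {
    InetAddress local = InetAddress.getLocalHost();
    System.out.println("getHostName():          " + local.getHostName());
    System.out.println("getCanonicalHostName(): " + local.getCanonicalHostName());
    // Running with -Djava.net.preferIPv4Stack=true typically makes
    // getHostName() return the FQDN again, matching what HBase reports.
  }
}
{code}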

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1226) ipv4 and ipv6 affect job data locality

2013-09-23 Thread Kaibo Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaibo Zhou updated YARN-1226:
-

Description: 
When I run a MapReduce job which uses TableInputFormat to scan an HBase table 
on a YARN cluster with 140+ nodes, I consistently get very low data locality, 
around 0~10%. 

The scheduler is the Capacity Scheduler. HBase and Hadoop are integrated in the 
cluster, with NodeManager, DataNode and HRegionServer running on the same node.

The reason for the low data locality is that most machines in the cluster use 
IPv6 and only a few use IPv4. NodeManager uses 
InetAddress.getLocalHost().getHostName() to get the host name, but the result 
of this call depends on whether IPv4 or IPv6 is in use; see 
http://bugs.sun.com/view_bug.do?bug_id=7166687. 

On machines with IPv4, NodeManager gets the hostName 
search042097.sqa.cm4.site.net, but on machines with IPv6 it gets 
search042097.sqa.cm4. If run with IPv6 disabled 
(-Djava.net.preferIPv4Stack=true), it returns search042097.sqa.cm4.site.net.

For the MapReduce job that scans the HBase table, the InputSplit contains node 
locations as FQDNs, e.g. search042097.sqa.cm4.site.net, because in HBase the 
RegionServers' hostnames are assigned by the HMaster. The HMaster communicates 
with the RegionServers and gets each region server's host name using Java NIO: 
clientChannel.socket().getInetAddress().getHostName().
Also see the startup log of a region server:

13:06:21,200 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master 
passed us hostname to use. Was=search042024.sqa.cm4, 
Now=search042024.sqa.cm4.site.net

As you can see, most machines in the YARN cluster with IPv6 get the short 
hostname, but HBase always gets the full hostname, so the hosts cannot be 
matched (see RMContainerAllocator::assignToMap). This leads to poor locality.

After I used java.net.preferIPv4Stack to force IPv4 in YARN, I got 70+% data 
locality in the cluster.

Thanks,
Kaibo

  was:
When I run a mapreduce job which use TableInputFormat to scan a hbase table on 
yarn cluser with 140+ nodes, I consistently get very low data locality 
(capacity scheduler) around 0~10%.

Hbase and hadoop are integrated in the cluster with NodeManager, DataNode and 
HRegionServer run on the same node.

The reason of low data locality is: most machines in the cluster uses IPV6, few 
machines use IPV4. NodeManager use InetAddress.getLocalHost().getHostName()
 to get the host name, but the return result of this function depends on IPV4 
or IPV6, see http://bugs.sun.com/view_bug.do?bug_id=7166687;. 

On machines with ipv4, NodeManager get hostName as: 
search042097.sqa.cm4.site.net
But on machines with ipv6, NodeManager get hostName as: search042097.sqa.cm4
if run with IPv6 disabled, -Djava.net.preferIPv4Stack=true, then returns 
search042097.sqa.cm4.site.net.

For the mapred job which scan hbase table, the InputSplit contains node 
locations of FQDN, e.g. search042097.sqa.cm4.site.net. Because in hbase, the 
RegionServers' hostnames are allocated by HMaster. HMaster communicate with 
RegionServers and get the region server's host name use java NIO: 
clientChannel.socket().getInetAddress().getHostName().
Also see the startup log of region server:

13:06:21,200 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master 
passed us hostname to use. Was=search042024.sqa.cm4, 
Now=search042024.sqa.cm4.site.net

As you can see, most machines in the Yarn cluster with IPV6 get the short 
hostname, but hbase always get the full hostname, so the Host cannot matched 
(see RMContainerAllocator::assignToMap).This can lead to poor locality.

After I use java.net.preferIPv4Stack to force IPv4 in yarn, I get 70+% data 
locality in the cluster.

Thanks,
Kaibo


 ipv4 and ipv6 affect job data locality
 --

 Key: YARN-1226
 URL: https://issues.apache.org/jira/browse/YARN-1226
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler
Affects Versions: 0.23.3, 2.0.0-alpha, 2.1.0-beta
Reporter: Kaibo Zhou
Priority: Minor

 When I run a MapReduce job which uses TableInputFormat to scan an HBase table 
 on a YARN cluster with 140+ nodes, I consistently get very low data locality, 
 around 0~10%. 
 The scheduler is the Capacity Scheduler. HBase and Hadoop are integrated in 
 the cluster, with NodeManager, DataNode and HRegionServer running on the same 
 node.
 The reason for the low data locality is that most machines in the cluster use 
 IPv6 and only a few use IPv4. NodeManager uses 
 InetAddress.getLocalHost().getHostName() to get the host name, but the result 
 of this call depends on whether IPv4 or IPv6 is in use; see 
 http://bugs.sun.com/view_bug.do?bug_id=7166687. 
 On machines with IPv4, NodeManager gets the hostName 
 search042097.sqa.cm4.site.net, but on machines with IPv6 it gets 
 search042097.sqa.cm4. If run with 

[jira] [Updated] (YARN-1226) ipv4 and ipv6 affect job data locality

2013-09-23 Thread Kaibo Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaibo Zhou updated YARN-1226:
-

Description: 
When I run a MapReduce job which uses TableInputFormat to scan an HBase table 
on a YARN cluster with 140+ nodes, I consistently get very low data locality, 
around 0~10%. 

The scheduler is the Capacity Scheduler. HBase and Hadoop are integrated in the 
cluster, with NodeManager, DataNode and HRegionServer running on the same node.

The reason for the low data locality is that most machines in the cluster use 
IPv6 and only a few use IPv4. NodeManager uses 
InetAddress.getLocalHost().getHostName() to get the host name, but the result 
of this call depends on whether IPv4 or IPv6 is in use; see 
[InetAddress.getLocalHost().getHostName() returns 
FQDN](http://bugs.sun.com/view_bug.do?bug_id=7166687). 

On machines with IPv4, NodeManager gets the hostName 
search042097.sqa.cm4.site.net, but on machines with IPv6 it gets 
search042097.sqa.cm4. If run with IPv6 disabled 
(-Djava.net.preferIPv4Stack=true), it returns search042097.sqa.cm4.site.net.

For the MapReduce job that scans the HBase table, the InputSplit contains node 
locations as FQDNs, e.g. search042097.sqa.cm4.site.net, because in HBase the 
RegionServers' hostnames are assigned by the HMaster. The HMaster communicates 
with the RegionServers and gets each region server's host name using Java NIO: 
clientChannel.socket().getInetAddress().getHostName().
Also see the startup log of a region server:

13:06:21,200 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master 
passed us hostname to use. Was=search042024.sqa.cm4, 
Now=search042024.sqa.cm4.site.net

As you can see, most machines in the YARN cluster with IPv6 get the short 
hostname, but HBase always gets the full hostname, so the hosts cannot be 
matched (see RMContainerAllocator::assignToMap). This leads to poor locality.

After I used java.net.preferIPv4Stack to force IPv4 in YARN, I got 70+% data 
locality in the cluster.

Thanks,
Kaibo

  was:
When I run a mapreduce job which use TableInputFormat to scan a hbase table on 
yarn cluser with 140+ nodes, I consistently get very low data locality around 
0~10%. 

The scheduler is Capacity Scheduler. Hbase and hadoop are integrated in the 
cluster with NodeManager, DataNode and HRegionServer run on the same node.

The reason of low data locality is: most machines in the cluster uses IPV6, few 
machines use IPV4. NodeManager use InetAddress.getLocalHost().getHostName()
 to get the host name, but the return result of this function depends on IPV4 
or IPV6, see http://bugs.sun.com/view_bug.do?bug_id=7166687;. 

On machines with ipv4, NodeManager get hostName as: 
search042097.sqa.cm4.site.net
But on machines with ipv6, NodeManager get hostName as: search042097.sqa.cm4
if run with IPv6 disabled, -Djava.net.preferIPv4Stack=true, then returns 
search042097.sqa.cm4.site.net.

For the mapred job which scan hbase table, the InputSplit contains node 
locations of FQDN, e.g. search042097.sqa.cm4.site.net. Because in hbase, the 
RegionServers' hostnames are allocated by HMaster. HMaster communicate with 
RegionServers and get the region server's host name use java NIO: 
clientChannel.socket().getInetAddress().getHostName().
Also see the startup log of region server:

13:06:21,200 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master 
passed us hostname to use. Was=search042024.sqa.cm4, 
Now=search042024.sqa.cm4.site.net

As you can see, most machines in the Yarn cluster with IPV6 get the short 
hostname, but hbase always get the full hostname, so the Host cannot matched 
(see RMContainerAllocator::assignToMap).This can lead to poor locality.

After I use java.net.preferIPv4Stack to force IPv4 in yarn, I get 70+% data 
locality in the cluster.

Thanks,
Kaibo


 ipv4 and ipv6 affect job data locality
 --

 Key: YARN-1226
 URL: https://issues.apache.org/jira/browse/YARN-1226
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler
Affects Versions: 0.23.3, 2.0.0-alpha, 2.1.0-beta
Reporter: Kaibo Zhou
Priority: Minor

 When I run a MapReduce job which uses TableInputFormat to scan an HBase table 
 on a YARN cluster with 140+ nodes, I consistently get very low data locality, 
 around 0~10%. 
 The scheduler is the Capacity Scheduler. HBase and Hadoop are integrated in 
 the cluster, with NodeManager, DataNode and HRegionServer running on the same 
 node.
 The reason for the low data locality is that most machines in the cluster use 
 IPv6 and only a few use IPv4. NodeManager uses 
 InetAddress.getLocalHost().getHostName() to get the host name, but the result 
 of this call depends on whether IPv4 or IPv6 is in use; see 
 [InetAddress.getLocalHost().getHostName() returns 
 FQDN](http://bugs.sun.com/view_bug.do?bug_id=7166687). 
 On machines with IPv4, NodeManager gets 

[jira] [Updated] (YARN-1226) ipv4 and ipv6 lead to poor data locality

2013-09-23 Thread Kaibo Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaibo Zhou updated YARN-1226:
-

Priority: Major  (was: Minor)

 ipv4 and ipv6 lead to poor data locality
 

 Key: YARN-1226
 URL: https://issues.apache.org/jira/browse/YARN-1226
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler
Affects Versions: 0.23.3, 2.0.0-alpha, 2.1.0-beta
Reporter: Kaibo Zhou

 When I run a MapReduce job which uses TableInputFormat to scan an HBase table 
 on a YARN cluster with 140+ nodes, I consistently get very low data locality, 
 around 0~10%. 
 The scheduler is the Capacity Scheduler. HBase and Hadoop are integrated in 
 the cluster, with NodeManager, DataNode and HRegionServer running on the same 
 node.
 The reason for the low data locality is that most machines in the cluster use 
 IPv6 and only a few use IPv4. NodeManager uses 
 InetAddress.getLocalHost().getHostName() to get the host name, but the result 
 of this call depends on whether IPv4 or IPv6 is in use; see 
 [InetAddress.getLocalHost().getHostName() returns 
 FQDN|http://bugs.sun.com/view_bug.do?bug_id=7166687]. 
 On machines with IPv4, NodeManager gets the hostName 
 search042097.sqa.cm4.site.net, but on machines with IPv6 it gets 
 search042097.sqa.cm4. If run with IPv6 disabled 
 (-Djava.net.preferIPv4Stack=true), it returns search042097.sqa.cm4.site.net.
 
 For the MapReduce job that scans the HBase table, the InputSplit contains node 
 locations as [FQDNs|http://en.wikipedia.org/wiki/FQDN], e.g. 
 search042097.sqa.cm4.site.net, because in HBase the RegionServers' hostnames 
 are assigned by the HMaster. The HMaster communicates with the RegionServers 
 and gets each region server's host name using Java NIO: 
 clientChannel.socket().getInetAddress().getHostName().
 Also see the startup log of a region server:
 13:06:21,200 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master 
 passed us hostname to use. Was=search042024.sqa.cm4, 
 Now=search042024.sqa.cm4.site.net
 
 As you can see, most machines in the YARN cluster with IPv6 get the short 
 hostname, but HBase always gets the full hostname, so the hosts cannot be 
 matched (see RMContainerAllocator::assignToMap). This leads to poor locality.
 After I used java.net.preferIPv4Stack to force IPv4 in YARN, I got 70+% data 
 locality in the cluster.
 Thanks,
 Kaibo

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1226) ipv4 and ipv6 lead to poor data locality

2013-09-23 Thread Kaibo Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaibo Zhou updated YARN-1226:
-

Summary: ipv4 and ipv6 lead to poor data locality  (was: ipv4 and ipv6 
affect job data locality)

 ipv4 and ipv6 lead to poor data locality
 

 Key: YARN-1226
 URL: https://issues.apache.org/jira/browse/YARN-1226
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler
Affects Versions: 0.23.3, 2.0.0-alpha, 2.1.0-beta
Reporter: Kaibo Zhou
Priority: Minor

 When I run a MapReduce job which uses TableInputFormat to scan an HBase table 
 on a YARN cluster with 140+ nodes, I consistently get very low data locality, 
 around 0~10%. 
 The scheduler is the Capacity Scheduler. HBase and Hadoop are integrated in 
 the cluster, with NodeManager, DataNode and HRegionServer running on the same 
 node.
 The reason for the low data locality is that most machines in the cluster use 
 IPv6 and only a few use IPv4. NodeManager uses 
 InetAddress.getLocalHost().getHostName() to get the host name, but the result 
 of this call depends on whether IPv4 or IPv6 is in use; see 
 [InetAddress.getLocalHost().getHostName() returns 
 FQDN|http://bugs.sun.com/view_bug.do?bug_id=7166687]. 
 On machines with IPv4, NodeManager gets the hostName 
 search042097.sqa.cm4.site.net, but on machines with IPv6 it gets 
 search042097.sqa.cm4. If run with IPv6 disabled 
 (-Djava.net.preferIPv4Stack=true), it returns search042097.sqa.cm4.site.net.
 
 For the MapReduce job that scans the HBase table, the InputSplit contains node 
 locations as [FQDNs|http://en.wikipedia.org/wiki/FQDN], e.g. 
 search042097.sqa.cm4.site.net, because in HBase the RegionServers' hostnames 
 are assigned by the HMaster. The HMaster communicates with the RegionServers 
 and gets each region server's host name using Java NIO: 
 clientChannel.socket().getInetAddress().getHostName().
 Also see the startup log of a region server:
 13:06:21,200 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master 
 passed us hostname to use. Was=search042024.sqa.cm4, 
 Now=search042024.sqa.cm4.site.net
 
 As you can see, most machines in the YARN cluster with IPv6 get the short 
 hostname, but HBase always gets the full hostname, so the hosts cannot be 
 matched (see RMContainerAllocator::assignToMap). This leads to poor locality.
 After I used java.net.preferIPv4Stack to force IPv4 in YARN, I got 70+% data 
 locality in the cluster.
 Thanks,
 Kaibo

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-1227) Update Single Cluster doc to use yarn.resourcemanager.hostname

2013-09-23 Thread Sandy Ryza (JIRA)
Sandy Ryza created YARN-1227:


 Summary: Update Single Cluster doc to use 
yarn.resourcemanager.hostname
 Key: YARN-1227
 URL: https://issues.apache.org/jira/browse/YARN-1227
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: documentation
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza


Now that yarn.resourcemanager.hostname can be used in place of 
yarn.resourcemanager.address, yarn.resourcemanager.scheduler.address, etc., we 
should update the doc to use it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1204) Need to add https port related property in Yarn

2013-09-23 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774742#comment-13774742
 ] 

Vinod Kumar Vavilapalli commented on YARN-1204:
---

Quickly looked at the patch. In ResourceManager.java and WebAppProxy.java, you 
replaced the usage of {{YarnConfiguration.getProxyHostAndPort()}} with 
{{WebAppUtils.getResolvedRMWebAppURL()}}. These look like bugs to me.

Can you also look at the test failures? There are a lot of them; some are 
obviously tracked elsewhere, but let's make sure all of them are tracked.

 Need to add https port related property in Yarn
 ---

 Key: YARN-1204
 URL: https://issues.apache.org/jira/browse/YARN-1204
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Yesha Vora
Assignee: Omkar Vinit Joshi
 Attachments: YARN-1204.20131018.1.patch, YARN-1204.20131020.1.patch, 
 YARN-1204.20131020.2.patch, YARN-1204.20131020.3.patch, 
 YARN-1204.20131020.4.patch


 There is no YARN property available to configure the HTTPS port for the 
 ResourceManager, NodeManager and history server. Currently, YARN services use 
 the port defined for HTTP [defined by 
 'mapreduce.jobhistory.webapp.address', 'yarn.nodemanager.webapp.address', 
 'yarn.resourcemanager.webapp.address'] when running services over the HTTPS 
 protocol.
 YARN should have a list of properties to assign HTTPS ports for the RM, NM 
 and JHS.
 They could be like below:
 yarn.nodemanager.webapp.https.address
 yarn.resourcemanager.webapp.https.address
 mapreduce.jobhistory.webapp.https.address 
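
 A small usage sketch of the proposed properties (property names as listed 
 above; the default values shown are placeholders for illustration only):
 {code}
 // Hypothetical sketch: how a daemon could resolve its HTTPS web address from
 // the proposed properties via the standard Hadoop Configuration API. The
 // default values here are placeholders, not values defined by any patch.
 import org.apache.hadoop.conf.Configuration;

 public class HttpsAddressExample {
   public static void main(String[] args) {
     Configuration conf = new Configuration();
     String rmHttps  = conf.get("yarn.resourcemanager.webapp.https.address", "0.0.0.0:8090");
     String nmHttps  = conf.get("yarn.nodemanager.webapp.https.address", "0.0.0.0:8044");
     String jhsHttps = conf.get("mapreduce.jobhistory.webapp.https.address", "0.0.0.0:19890");
     System.out.println(rmHttps + " " + nmHttps + " " + jhsHttps);
   }
 }
 {code}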

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1221) With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely

2013-09-23 Thread Siqi Li (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774765#comment-13774765
 ] 

Siqi Li commented on YARN-1221:
---

According to the FairScheduler log:
2013-09-23 17:32:30,593 ASSIGN  atla-aub-37-sr1.prod.twttr.net  memory:2048, 
vCores:1
2013-09-23 17:32:35,591 ASSIGN  atla-aub-37-sr1.prod.twttr.net  memory:4096, 
vCores:1
2013-09-23 17:32:36,595 ASSIGN  atla-aub-37-sr1.prod.twttr.net  memory:4096, 
vCores:1
2013-09-23 17:32:37,598 ASSIGN  atla-aub-37-sr1.prod.twttr.net  memory:4096, 
vCores:1
2013-09-23 17:32:38,602 ASSIGN  atla-aub-37-sr1.prod.twttr.net  memory:4096, 
vCores:1
2013-09-23 17:32:39,606 ASSIGN  atla-aub-37-sr1.prod.twttr.net  memory:4096, 
vCores:1
2013-09-23 17:32:43,622 ASSIGN  atla-aub-37-sr1.prod.twttr.net  memory:2048, 
vCores:1
2013-09-23 17:32:48,640 ASSIGN  atla-aub-37-sr1.prod.twttr.net  memory:4096, 
vCores:1
2013-09-23 17:32:49,647 ASSIGN  atla-aub-37-sr1.prod.twttr.net  memory:-1, 
vCores:0
2013-09-23 17:33:11,213 ASSIGN  atla-aub-37-sr1.prod.twttr.net  memory:4096, 
vCores:1
2013-09-23 17:33:11,245 ASSIGN  atla-aub-37-sr1.prod.twttr.net  memory:-1, 
vCores:0
2013-09-23 17:33:13,221 ASSIGN  atla-aub-37-sr1.prod.twttr.net  memory:4096, 
vCores:1

The -1 might be the problem; it should be 4096, and vCores should be 1 instead 
of 0.
I tried several times: if the log has only one memory:-1, vCores:0 entry, the 
reserved memory will be 4G after all the jobs are done, and if the log has two 
memory:-1, vCores:0 entries, the reserved memory will be 8G after all the jobs 
are done.

 With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely
 -

 Key: YARN-1221
 URL: https://issues.apache.org/jira/browse/YARN-1221
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager, scheduler
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1157) ResourceManager UI has invalid tracking URL link for distributed shell application

2013-09-23 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-1157:


Attachment: YARN-1157.3.patch

Updated patch based on the latest trunk.

 ResourceManager UI has invalid tracking URL link for distributed shell 
 application
 --

 Key: YARN-1157
 URL: https://issues.apache.org/jira/browse/YARN-1157
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
 Fix For: 2.1.1-beta

 Attachments: YARN-1157.1.patch, YARN-1157.2.patch, YARN-1157.2.patch, 
 YARN-1157.3.patch


 Submit a YARN distributed shell application. Go to the ResourceManager web 
 UI. The application definitely appears. In the Tracking UI column, there will 
 be a history link. Click on that link. Instead of showing the 
 ApplicationMaster web UI, an HTTP 500 error appears.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1227) Update Single Cluster doc to use yarn.resourcemanager.hostname

2013-09-23 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774812#comment-13774812
 ] 

Karthik Kambatla commented on YARN-1227:


This is related to HA-specific configuration changes we will be doing shortly 
as part of YARN-149.

 Update Single Cluster doc to use yarn.resourcemanager.hostname
 --

 Key: YARN-1227
 URL: https://issues.apache.org/jira/browse/YARN-1227
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: documentation
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
  Labels: newbie

 Now that yarn.resourcemanager.hostname can be used in place of 
 yarn.resourcemanager.address, yarn.resourcemanager.scheduler.address, etc., 
 we should update the doc to use it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1222) Make improvements in ZKRMStateStore for fencing

2013-09-23 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774658#comment-13774658
 ] 

Karthik Kambatla commented on YARN-1222:


Another approach we could take is to let the user provide the ACLs - rm1-acl, 
rm2-acl - in the configuration, much along the lines of how ACLs are passed 
today. This would also allow users to hook these up to Kerberos credentials if 
they want to.
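To make the multi-op idea in the description below concrete, here is a rough sketch; 
the znode paths are made up and the ACLs are left wide open, whereas the real fencing 
would come from restricting create/delete permissions on the root znode to the active RM:

{code}
import java.util.Arrays;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.Op;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

class FencedZkWrite {
  // Hypothetical paths; the real layout is whatever ZKRMStateStore uses.
  private static final String ROOT = "/rmstore";
  private static final String FENCE = ROOT + "/RM_FENCING_LOCK";

  // Wrap a setData in create-lock / setData / delete-lock so that an RM whose
  // create/delete permission on the root znode has been revoked fails atomically.
  static void fencedSetData(ZooKeeper zk, String path, byte[] data)
      throws KeeperException, InterruptedException {
    zk.multi(Arrays.asList(
        Op.create(FENCE, new byte[0], ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT),
        Op.setData(path, data, -1),
        Op.delete(FENCE, -1)));
  }
}
{code}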

 Make improvements in ZKRMStateStore for fencing
 ---

 Key: YARN-1222
 URL: https://issues.apache.org/jira/browse/YARN-1222
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Karthik Kambatla
 Attachments: yarn-1222-1.patch


 Use multi-operations for every ZK interaction. 
 In every operation, automatically create/delete a lock znode that is a 
 child of the root znode. This achieves fencing by modifying the 
 create/delete permissions on the root znode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-1228) Don't allow other file than fair-scheduler.xml to be Fair Scheduler allocations file

2013-09-23 Thread Sandy Ryza (JIRA)
Sandy Ryza created YARN-1228:


 Summary: Don't allow other file than fair-scheduler.xml to be Fair 
Scheduler allocations file
 Key: YARN-1228
 URL: https://issues.apache.org/jira/browse/YARN-1228
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Affects Versions: 2.1.1-beta
Reporter: Sandy Ryza


Currently the Fair Scheduler is configured in two ways
* An allocations file that has a different format than the standard Hadoop 
configuration file, which makes it easier to specify hierarchical objects like 
queues and their properties. 
* With properties like yarn.scheduler.fair.max.assign that are specified in the 
standard Hadoop configuration format.

The standard and default way of configuring it is to use fair-scheduler.xml as 
the allocations file and to put the yarn.scheduler properties in yarn-site.xml.

It is also possible to specify a different file as the allocations file, and to 
place the yarn.scheduler properties in fair-scheduler.xml, which will be 
interpreted as in the standard Hadoop configuration format.  This flexibility 
is both confusing and unnecessary.  There's no need to keep around the second 
way.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1188) The context of QueueMetrics becomes 'default' when using FairScheduler

2013-09-23 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1188:
-

Assignee: Tsuyoshi OZAWA

 The context of QueueMetrics becomes 'default' when using FairScheduler
 --

 Key: YARN-1188
 URL: https://issues.apache.org/jira/browse/YARN-1188
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Akira AJISAKA
Assignee: Tsuyoshi OZAWA
Priority: Minor
  Labels: metrics, newbie
 Attachments: YARN-1188.1.patch


 I found the context of QueueMetrics changed to 'default' from 'yarn' when I 
 was using FairScheduler.
 The context should always be 'yarn' by adding an annotation to FSQueueMetrics 
 like below:
 {code}
 + @Metrics(context=yarn)
 public class FSQueueMetrics extends QueueMetrics {
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1227) Update Single Cluster doc to use yarn.resourcemanager.hostname

2013-09-23 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1227:
-

Labels: newbie  (was: )

 Update Single Cluster doc to use yarn.resourcemanager.hostname
 --

 Key: YARN-1227
 URL: https://issues.apache.org/jira/browse/YARN-1227
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: documentation
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
  Labels: newbie

 Now that yarn.resourcemanager.hostname can be used in place of 
 yarn.resourcemanager.address, yarn.resourcemanager.scheduler.address, etc., 
 we should update the doc to use it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start

2013-09-23 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong reassigned YARN-1229:
---

Assignee: Xuan Gong

 Shell$ExitCodeException could happen if AM fails to start
 -

 Key: YARN-1229
 URL: https://issues.apache.org/jira/browse/YARN-1229
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
Priority: Critical
 Fix For: 2.1.1-beta


 I run a sleep job. If the AM fails to start, this exception could occur:
 13/09/20 11:00:23 INFO mapreduce.Job: Job job_1379673267098_0020 failed with 
 state FAILED due to: Application application_1379673267098_0020 failed 1 
 times due to AM Container for appattempt_1379673267098_0020_01 exited 
 with  exitCode: 1 due to: Exception from container-launch:
 org.apache.hadoop.util.Shell$ExitCodeException: 
 /myappcache/application_1379673267098_0020/container_1379673267098_0020_01_01/launch_container.sh:
  line 12: export: 
 `NM_AUX_SERVICE_mapreduce.shuffle=AAA0+gA=
 ': not a valid identifier
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
 at org.apache.hadoop.util.Shell.run(Shell.java:379)
 at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
 at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:270)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:78)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 .Failing this attempt.. Failing the application.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-819) ResourceManager and NodeManager should check for a minimum allowed version

2013-09-23 Thread Robert Parker (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775697#comment-13775697
 ] 

Robert Parker commented on YARN-819:


Jon, Thanks for the review and nice catch on branch-2.

* Changed the TestResourceTrackerService#testNodeRegistrationVersionLessThanRM 
test case to run on branch-2 and greater.

* The jira mentions reboot as an option.  A reboot would cover the case where 
new software is deployed but the NM process is not restarted. There is no 
guarantee that the new version can talk to the older version, so rejecting 
the connection will satisfy the requirement and is much less complicated.

* Corrected the other three code issues.



 ResourceManager and NodeManager should check for a minimum allowed version
 --

 Key: YARN-819
 URL: https://issues.apache.org/jira/browse/YARN-819
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager, resourcemanager
Affects Versions: 2.0.4-alpha
Reporter: Robert Parker
Assignee: Robert Parker
 Attachments: YARN-819-1.patch, YARN-819-2.patch, YARN-819-3.patch


 Our use case is during upgrade on a large cluster several NodeManagers may 
 not restart with the new version.  Once the RM comes back up the NodeManager 
 will re-register without issue to the RM.
 The NM should report its version to the RM.  The RM should have a configuration 
 to disallow the check (the default), require a version equal to the RM's (to prevent 
 a config change for each release), equal to or greater than the RM's (to allow NM 
 upgrades), and finally an explicit version or version range.
 The RM should also have a configuration on how to treat a mismatch: 
 REJECT, or REBOOT the NM.
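As a rough illustration of the version policy described above, with made-up names 
rather than the patch's actual configuration keys or code:

{code}
// Illustrative only: a minimum-version policy of the shape described in this issue.
enum VersionPolicy { NONE, EQUAL, EQUAL_OR_GREATER }

final class NodeVersionCheck {
  static boolean accept(String nmVersion, String rmVersion, VersionPolicy policy) {
    switch (policy) {
      case NONE:             return true;                        // default: no check
      case EQUAL:            return nmVersion.equals(rmVersion); // must match the RM exactly
      case EQUAL_OR_GREATER: return compare(nmVersion, rmVersion) >= 0;
      default:               return false;
    }
  }

  // Naive dotted-version comparison; suffixes such as "-alpha" are ignored.
  private static int compare(String a, String b) {
    String[] x = a.split("[.-]"), y = b.split("[.-]");
    for (int i = 0; i < Math.min(x.length, y.length); i++) {
      if (!x[i].matches("\\d+") || !y[i].matches("\\d+")) break;
      int c = Integer.compare(Integer.parseInt(x[i]), Integer.parseInt(y[i]));
      if (c != 0) return c;
    }
    return 0;
  }
}
{code}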

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-451) Add more metrics to RM page

2013-09-23 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775700#comment-13775700
 ] 

Jason Lowe commented on YARN-451:
-

Is knowing how big an application might get in the future important?  Knowing 
how big an application is right now, both in terms of what it's using and what 
it's asking for, seems more relevant for understanding why a queue is 
overloaded or jobs aren't getting scheduled as quickly as expected.  The 
ApplicationResourceUsageReport already contains this information, and it should 
be straightforward to report as part of the ApplicationResourceUsageReport for 
display via the web UI, CLI, or REST services.

Note that YARN-415 is already attempting to do this for historical resource 
usage so it will be easy to see which jobs have taken large amounts of 
resources and could have slowed other jobs in the past.
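For example, the current usage and ask can be pulled straight out of the report the 
RM already builds (a sketch; how it gets surfaced in the web UI/CLI is up to the patch):

{code}
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.api.records.ApplicationResourceUsageReport;

// Sketch: summarize what an application is using and asking for right now.
final class AppUsageSummary {
  static String summarize(ApplicationReport report) {
    ApplicationResourceUsageReport usage = report.getApplicationResourceUsageReport();
    return String.format("containers=%d used=%s asked=%s reserved=%s",
        usage.getNumUsedContainers(),
        usage.getUsedResources(),
        usage.getNeededResources(),
        usage.getReservedResources());
  }
}
{code}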

 Add more metrics to RM page
 ---

 Key: YARN-451
 URL: https://issues.apache.org/jira/browse/YARN-451
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Lohit Vijayarenu
Assignee: Sangjin Lee
Priority: Blocker
 Attachments: in_progress_2x.png, yarn-451-trunk-20130916.1.patch


 ResourceManager webUI shows list of RUNNING applications, but it does not 
 tell which applications are requesting more resource compared to others. With 
 cluster running hundreds of applications at once it would be useful to have 
 some kind of metric to show high-resource usage applications vs low-resource 
 usage ones. At the minimum showing number of containers is good option.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1204) Need to add https port related property in Yarn

2013-09-23 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775703#comment-13775703
 ] 

Omkar Vinit Joshi commented on YARN-1204:
-

Thanks Vinod for the review.

bq. Quickly looked at the patch. In ResourceManager.java and WebAppProxy.java, 
you replaced the usage of YarnConfiguration.getProxyHostAndPort() with 
WebAppUtils.getResolvedRMWebAppURL().
Fixed.

bq. Can you also look at test-failures? That's a lot of them, some of them are 
obviously tracked elsewhere, let's make sure all test-failures are tracked.
I see all these test failures even without the patch.

 Need to add https port related property in Yarn
 ---

 Key: YARN-1204
 URL: https://issues.apache.org/jira/browse/YARN-1204
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Yesha Vora
Assignee: Omkar Vinit Joshi
 Attachments: YARN-1204.20131018.1.patch, YARN-1204.20131020.1.patch, 
 YARN-1204.20131020.2.patch, YARN-1204.20131020.3.patch, 
 YARN-1204.20131020.4.patch, YARN-1204.20131023.1.patch


 There is no YARN property available to configure the HTTPS port for the Resource 
 Manager, NodeManager and history server. Currently, YARN services use the 
 port defined for HTTP [defined by 
 'mapreduce.jobhistory.webapp.address', 'yarn.nodemanager.webapp.address', 
 'yarn.resourcemanager.webapp.address'] for running services over HTTPS.
 YARN should have a list of properties to assign HTTPS ports for the RM, NM and JHS.
 They could be like below:
 yarn.nodemanager.webapp.https.address
 yarn.resourcemanager.webapp.https.address
 mapreduce.jobhistory.webapp.https.address 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Comment Edited] (YARN-451) Add more metrics to RM page

2013-09-23 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775700#comment-13775700
 ] 

Jason Lowe edited comment on YARN-451 at 9/23/13 9:59 PM:
--

Is knowing how big an application might get in the future important?  Knowing 
how big an application is right now, both in terms of what it's using and what 
it's asking for, seems more relevant for understanding why a queue is 
overloaded or jobs aren't getting scheduled as quickly as expected.  The 
ApplicationResourceUsageReport already contains this information, and it should 
be straightforward to display via the web UI, CLI, or REST services.

Note that YARN-415 is already attempting to do this for historical resource 
usage so it will be easy to see which jobs have taken large amounts of 
resources and could have slowed other jobs in the past.

  was (Author: jlowe):
Is knowing how big an application might get in the future important?  
Knowing how big an application is right now, both in terms of what it's using 
and what it's asking for, seems more relevant for understanding why a queue is 
overloaded or jobs aren't getting scheduled as quickly as expected.  The 
ApplicationResourceUsageReport already contains this information, and it should 
be straightforward to report as part of the ApplicationResourceUsageReport for 
display via the web UI, CLI, or REST services.

Note that YARN-415 is already attempting to do this for historical resource 
usage so it will be easy to see which jobs have taken large amounts of 
resources and could have slowed other jobs in the past.
  
 Add more metrics to RM page
 ---

 Key: YARN-451
 URL: https://issues.apache.org/jira/browse/YARN-451
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Lohit Vijayarenu
Assignee: Sangjin Lee
Priority: Blocker
 Attachments: in_progress_2x.png, yarn-451-trunk-20130916.1.patch


 ResourceManager webUI shows list of RUNNING applications, but it does not 
 tell which applications are requesting more resource compared to others. With 
 cluster running hundreds of applications at once it would be useful to have 
 some kind of metric to show high-resource usage applications vs low-resource 
 usage ones. At the minimum showing number of containers is good option.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start

2013-09-23 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775711#comment-13775711
 ] 

Xuan Gong commented on YARN-1229:
-

[~vinodkv], [~bikassaha], [~hitesh], [~sseth], [~jlowe], [~cnauroth]

The bug shows an error in launch_container.sh while trying to export 
NM_AUX_SERVICE_mapreduce.shuffle. The problem is that '.' is not a valid 
character in an environment variable name. To solve this, we might need to 
rename the service.
There are three places that need to be renamed (use mapreduce_shuffle instead of 
mapreduce.shuffle):
{code}
  public static final String MAPREDUCE_SHUFFLE_SERVICEID =
      "mapreduce.shuffle";
{code}
in ShuffleHandler.java.

The other two places are in yarn-site.xml:
{code}
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce.shuffle</value>
  <description>shuffle service that needs to be set for Map Reduce to run</description>
</property>

<property>
  <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
{code}

We can simply replace all three places with mapreduce_shuffle, or we can split 
the shuffle service out of the aux-services, say by creating a new property 
called mapreduce_shuffle_service. The ShuffleHandler could then read this property 
instead of defining MAPREDUCE_SHUFFLE_SERVICEID by itself, and 
AuxService#init() would need to read both mapreduce_shuffle_service and 
yarn.nodemanager.aux-services to do the initialization. 

An alternative is to convert all special characters to "_", with 
AuxServiceHelpers becoming the public API to access this data.

Since we're trying to rename variables, this can be considered backward 
incompatible. I would like to get in touch with folks who are already using it.
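A tiny sketch of that alternative, reusing the AuxServiceHelpers name from above 
(the method name is illustrative):

{code}
// Hypothetical helper: turn an aux-service name into a valid shell identifier,
// e.g. "mapreduce.shuffle" -> "NM_AUX_SERVICE_mapreduce_shuffle".
final class AuxServiceHelpers {
  static String toEnvName(String serviceId) {
    return "NM_AUX_SERVICE_" + serviceId.replaceAll("[^A-Za-z0-9_]", "_");
  }
}
{code}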

 Shell$ExitCodeException could happen if AM fails to start
 -

 Key: YARN-1229
 URL: https://issues.apache.org/jira/browse/YARN-1229
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
Priority: Critical
 Fix For: 2.1.1-beta


 I run a sleep job. If the AM fails to start, this exception could occur:
 13/09/20 11:00:23 INFO mapreduce.Job: Job job_1379673267098_0020 failed with 
 state FAILED due to: Application application_1379673267098_0020 failed 1 
 times due to AM Container for appattempt_1379673267098_0020_01 exited 
 with  exitCode: 1 due to: Exception from container-launch:
 org.apache.hadoop.util.Shell$ExitCodeException: 
 /myappcache/application_1379673267098_0020/container_1379673267098_0020_01_01/launch_container.sh:
  line 12: export: 
 `NM_AUX_SERVICE_mapreduce.shuffle=AAA0+gA=
 ': not a valid identifier
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
 at org.apache.hadoop.util.Shell.run(Shell.java:379)
 at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
 at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:270)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:78)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 .Failing this attempt.. Failing the application.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1215) Yarn URL should include userinfo

2013-09-23 Thread shanyu zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775718#comment-13775718
 ] 

shanyu zhao commented on YARN-1215:
---

Looks good to me overall. But I think the best fix is to add a userInfo field 
to org.apache.hadoop.yarn.api.records.URL.
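A minimal illustration of the loss using plain java.net.URI; the 
yarn.api.records.URL side is omitted because it currently has no userinfo field 
to copy into:

{code}
import java.net.URI;

public class UserInfoLoss {
  public static void main(String[] args) throws Exception {
    URI uri = new URI("foo://username:secret@example.com/path");
    System.out.println(uri.getHost());      // example.com -- all that the YARN URL keeps
    System.out.println(uri.getUserInfo());  // username:secret -- dropped by the conversion
  }
}
{code}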

 Yarn URL should include userinfo
 

 Key: YARN-1215
 URL: https://issues.apache.org/jira/browse/YARN-1215
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Chuan Liu
Assignee: Chuan Liu
 Attachments: YARN-1215-trunk.patch


 In the {{org.apache.hadoop.yarn.api.records.URL}} class, we don't have 
 userinfo as part of the URL. When converting a {{java.net.URI}} object into 
 the YARN URL object in the {{ConverterUtils.getYarnUrlFromURI()}} method, we 
 set the uri host as the url host. If the uri has a userinfo part, the userinfo is 
 discarded. This leads to information loss if the original uri has 
 userinfo, e.g. foo://username:passw...@example.com will be converted to 
 foo://example.com and the username/password information is lost during the 
 conversion.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1199) Make NM/RM Versions Available

2013-09-23 Thread Robert Parker (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775724#comment-13775724
 ] 

Robert Parker commented on YARN-1199:
-

+1 non-binding

 Make NM/RM Versions Available
 -

 Key: YARN-1199
 URL: https://issues.apache.org/jira/browse/YARN-1199
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Mit Desai
Assignee: Mit Desai
 Attachments: YARN-1199.patch, YARN-1199.patch, YARN-1199.patch


 Now as we have the NM and RM Versions available, we can display the YARN 
 version of nodes running in the cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1157) ResourceManager UI has invalid tracking URL link for distributed shell application

2013-09-23 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775730#comment-13775730
 ] 

Jian He commented on YARN-1157:
---

For the following code, we may create a common function for both 
AMRegisteredTransition and AMUnregisteredTransition
{code}
  String url = unregisterEvent.getTrackingUrl();
  if (url == null || url.trim().isEmpty()) {
    appAttempt.origTrackingUrl = "N/A";
  } else {
appAttempt.origTrackingUrl = url;
  }
  appAttempt.proxiedTrackingUrl = 
appAttempt.generateProxyUriWithoutScheme(appAttempt.origTrackingUrl);
{code}
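Something along these lines, for instance (the helper name is just an illustration):

{code}
// Shared by AMRegisteredTransition and AMUnregisteredTransition.
private static String sanitizeTrackingUrl(String url) {
  // Fall back to "N/A" when the AM did not supply a usable tracking URL.
  return (url == null || url.trim().isEmpty()) ? "N/A" : url;
}
{code}

Both transitions could then set appAttempt.origTrackingUrl from this helper before 
generating the proxied URL.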

bq. Let's document RegisterApplicationMasterRequest.getTrackingUrl() and 
setTrackingUrl()
Can you also document this in the specific method comments, for both 
registerRequest and unregisterRequest? And also say something like: for those 
default values, it will fall back to the ResourceManager's app page.

Typo in RegisterApplicationMasterRequest: "are all values"

The tests can probably be done with 
TestRMAppAttemptTransitions.runApplicationAttempt. In fact, the earlier tests in 
TestRMAppAttemptImpl can probably also be merged into 
TestRMAppAttemptTransitions, so we don't need to change the visibility of 
AMRegisteredTransition and AMUnregisteredTransition. 


 ResourceManager UI has invalid tracking URL link for distributed shell 
 application
 --

 Key: YARN-1157
 URL: https://issues.apache.org/jira/browse/YARN-1157
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
 Fix For: 2.1.1-beta

 Attachments: YARN-1157.1.patch, YARN-1157.2.patch, YARN-1157.2.patch, 
 YARN-1157.3.patch


 Submit a YARN distributed shell application. Go to the ResourceManager Web UI. The 
 application definitely appears. In the Tracking UI column, there will be a history 
 link. Click on that link. Instead of showing the application master web UI, an HTTP 
 error 500 appears.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (YARN-1210) During RM restart, RM should start a new attempt only when previous attempt exits for real

2013-09-23 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He reassigned YARN-1210:
-

Assignee: Jian He  (was: Vinod Kumar Vavilapalli)

 During RM restart, RM should start a new attempt only when previous attempt 
 exits for real
 --

 Key: YARN-1210
 URL: https://issues.apache.org/jira/browse/YARN-1210
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Jian He

 When the RM recovers, it can wait for existing AMs to contact it back and then 
 kill them forcefully before even starting a new AM. In the worst case, the RM will 
 start a new AppAttempt after waiting for 10 mins (the expiry interval). This way 
 we'll minimize multiple AMs racing with each other. This can help with issues in 
 downstream components like Pig, Hive and Oozie during RM restart.
 Meanwhile, new apps will proceed as usual while existing apps wait for 
 recovery.
 This can continue to be useful after work-preserving restart, so that AMs 
 which can properly sync back up with RM can continue to run and those that 
 don't are guaranteed to be killed before starting a new attempt.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-955) [YARN-321] History Service should create the RPC server and wire it to HistoryStorage

2013-09-23 Thread Mayank Bansal (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated YARN-955:
---

Attachment: YARN-955-1.patch

Attaching patch.

Thanks,
Mayank

 [YARN-321] History Service should create the RPC server and wire it to 
 HistoryStorage
 -

 Key: YARN-955
 URL: https://issues.apache.org/jira/browse/YARN-955
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Mayank Bansal
 Attachments: YARN-955-1.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1228) Clean up Fair Scheduler configuration loading

2013-09-23 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1228:
-

Summary: Clean up Fair Scheduler configuration loading  (was: Don't allow 
other file than fair-scheduler.xml to be Fair Scheduler allocations file)

 Clean up Fair Scheduler configuration loading
 -

 Key: YARN-1228
 URL: https://issues.apache.org/jira/browse/YARN-1228
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Affects Versions: 2.1.1-beta
Reporter: Sandy Ryza

 Currently the Fair Scheduler is configured in two ways
 * An allocations file that has a different format than the standard Hadoop 
 configuration file, which makes it easier to specify hierarchical objects 
 like queues and their properties. 
 * With properties like yarn.scheduler.fair.max.assign that are specified in 
 the standard Hadoop configuration format.
 The standard and default way of configuring it is to use fair-scheduler.xml 
 as the allocations file and to put the yarn.scheduler properties in 
 yarn-site.xml.
 It is also possible to specify a different file as the allocations file, and 
 to place the yarn.scheduler properties in fair-scheduler.xml, which will be 
 interpreted as in the standard Hadoop configuration format.  This flexibility 
 is both confusing and unnecessary.  There's no need to keep around the second 
 way.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1210) During RM restart, RM should start a new attempt only when previous attempt exits for real

2013-09-23 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775762#comment-13775762
 ] 

Jian He commented on YARN-1210:
---

Also, the NM needs to be changed to at least report back the finished containers on 
NM resync.

 During RM restart, RM should start a new attempt only when previous attempt 
 exits for real
 --

 Key: YARN-1210
 URL: https://issues.apache.org/jira/browse/YARN-1210
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Jian He

 When the RM recovers, it can wait for existing AMs to contact it back and then 
 kill them forcefully before even starting a new AM. In the worst case, the RM will 
 start a new AppAttempt after waiting for 10 mins (the expiry interval). This way 
 we'll minimize multiple AMs racing with each other. This can help with issues in 
 downstream components like Pig, Hive and Oozie during RM restart.
 Meanwhile, new apps will proceed as usual while existing apps wait for 
 recovery.
 This can continue to be useful after work-preserving restart, so that AMs 
 which can properly sync back up with RM can continue to run and those that 
 don't are guaranteed to be killed before starting a new attempt.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1228) Clean up Fair Scheduler configuration loading

2013-09-23 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1228:
-

Attachment: YARN-1228.patch

 Clean up Fair Scheduler configuration loading
 -

 Key: YARN-1228
 URL: https://issues.apache.org/jira/browse/YARN-1228
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Affects Versions: 2.1.1-beta
Reporter: Sandy Ryza
 Attachments: YARN-1228.patch


 Currently the Fair Scheduler is configured in two ways
 * An allocations file that has a different format than the standard Hadoop 
 configuration file, which makes it easier to specify hierarchical objects 
 like queues and their properties. 
 * With properties like yarn.scheduler.fair.max.assign that are specified in 
 the standard Hadoop configuration format.
 The standard and default way of configuring it is to use fair-scheduler.xml 
 as the allocations file and to put the yarn.scheduler properties in 
 yarn-site.xml.
 It is also possible to specify a different file as the allocations file, and 
 to place the yarn.scheduler properties in fair-scheduler.xml, which will be 
 interpreted as in the standard Hadoop configuration format.  This flexibility 
 is both confusing and unnecessary.
 Additionally, the allocation file is loaded as fair-scheduler.xml from the 
 classpath if it is not specified, but is loaded as a File if it is.  This 
 causes two problems
 1. We see different behavior when not setting the 
 yarn.scheduler.fair.allocation.file, and setting it to fair-scheduler.xml, 
 which is its default.
 2. Classloaders may choose to cache resources, which can break the reload 
 logic when yarn.scheduler.fair.allocation.file is not specified.
 We should never allow the yarn.scheduler properties to go into 
 fair-scheduler.xml.  And we should always load the allocations file as a 
 file, not as a resource on the classpath.  To preserve existing behavior and 
 allow loading files from the classpath, we can look for files on the 
 classpath, but strip off their scheme and interpret them as Files.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1228) Clean up Fair Scheduler configuration loading

2013-09-23 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1228:
-

Description: 
Currently the Fair Scheduler is configured in two ways
* An allocations file that has a different format than the standard Hadoop 
configuration file, which makes it easier to specify hierarchical objects like 
queues and their properties. 
* With properties like yarn.scheduler.fair.max.assign that are specified in the 
standard Hadoop configuration format.

The standard and default way of configuring it is to use fair-scheduler.xml as 
the allocations file and to put the yarn.scheduler properties in yarn-site.xml.

It is also possible to specify a different file as the allocations file, and to 
place the yarn.scheduler properties in fair-scheduler.xml, which will be 
interpreted as in the standard Hadoop configuration format.  This flexibility 
is both confusing and unnecessary.

Additionally, the allocation file is loaded as fair-scheduler.xml from the 
classpath if it is not specified, but is loaded as a File if it is.  This 
causes two problems
1. We see different behavior when not setting the 
yarn.scheduler.fair.allocation.file, and setting it to fair-scheduler.xml, 
which is its default.
2. Classloaders may choose to cache resources, which can break the reload logic 
when yarn.scheduler.fair.allocation.file is not specified.

We should never allow the yarn.scheduler properties to go into 
fair-scheduler.xml.  And we should always load the allocations file as a file, 
not as a resource on the classpath.  To preserve existing behavior and allow 
loading files from the classpath, we can look for files on the classpath, but 
strip off their scheme and interpret them as Files.
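A rough sketch of that last idea; the method and class names are illustrative and 
the real FairSchedulerConfiguration code may differ:

{code}
import java.io.File;
import java.net.URL;

// Sketch: resolve the allocations file as a plain File, falling back to the
// classpath but stripping the URL scheme so reload-by-modification-time keeps working.
final class AllocationFileResolver {
  static File resolveAllocationFile(String allocFilePath) {
    File file = new File(allocFilePath);
    if (!file.exists()) {
      URL url = Thread.currentThread().getContextClassLoader().getResource(allocFilePath);
      if (url != null && "file".equals(url.getProtocol())) {
        file = new File(url.getPath());
      }
    }
    return file;
  }
}
{code}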


  was:
Currently the Fair Scheduler is configured in two ways
* An allocations file that has a different format than the standard Hadoop 
configuration file, which makes it easier to specify hierarchical objects like 
queues and their properties. 
* With properties like yarn.scheduler.fair.max.assign that are specified in the 
standard Hadoop configuration format.

The standard and default way of configuring it is to use fair-scheduler.xml as 
the allocations file and to put the yarn.scheduler properties in yarn-site.xml.

It is also possible to specify a different file as the allocations file, and to 
place the yarn.scheduler properties in fair-scheduler.xml, which will be 
interpreted as in the standard Hadoop configuration format.  This flexibility 
is both confusing and unnecessary.  There's no need to keep around the second 
way.



 Clean up Fair Scheduler configuration loading
 -

 Key: YARN-1228
 URL: https://issues.apache.org/jira/browse/YARN-1228
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Affects Versions: 2.1.1-beta
Reporter: Sandy Ryza

 Currently the Fair Scheduler is configured in two ways
 * An allocations file that has a different format than the standard Hadoop 
 configuration file, which makes it easier to specify hierarchical objects 
 like queues and their properties. 
 * With properties like yarn.scheduler.fair.max.assign that are specified in 
 the standard Hadoop configuration format.
 The standard and default way of configuring it is to use fair-scheduler.xml 
 as the allocations file and to put the yarn.scheduler properties in 
 yarn-site.xml.
 It is also possible to specify a different file as the allocations file, and 
 to place the yarn.scheduler properties in fair-scheduler.xml, which will be 
 interpreted as in the standard Hadoop configuration format.  This flexibility 
 is both confusing and unnecessary.
 Additionally, the allocation file is loaded as fair-scheduler.xml from the 
 classpath if it is not specified, but is loaded as a File if it is.  This 
 causes two problems
 1. We see different behavior when not setting the 
 yarn.scheduler.fair.allocation.file, and setting it to fair-scheduler.xml, 
 which is its default.
 2. Classloaders may choose to cache resources, which can break the reload 
 logic when yarn.scheduler.fair.allocation.file is not specified.
 We should never allow the yarn.scheduler properties to go into 
 fair-scheduler.xml.  And we should always load the allocations file as a 
 file, not as a resource on the classpath.  To preserve existing behavior and 
 allow loading files from the classpath, we can look for files on the 
 classpath, but strip off their scheme and interpret them as Files.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1228) Clean up Fair Scheduler configuration loading

2013-09-23 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775769#comment-13775769
 ] 

Sandy Ryza commented on YARN-1228:
--

Existing tests verify that absolute paths work and that giving no file works. 
Adding a file to the classpath at runtime is difficult, so I verified that it picks 
up files from the classpath by manually testing on a pseudo-distributed cluster.


 Clean up Fair Scheduler configuration loading
 -

 Key: YARN-1228
 URL: https://issues.apache.org/jira/browse/YARN-1228
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Affects Versions: 2.1.1-beta
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: YARN-1228.patch


 Currently the Fair Scheduler is configured in two ways
 * An allocations file that has a different format than the standard Hadoop 
 configuration file, which makes it easier to specify hierarchical objects 
 like queues and their properties. 
 * With properties like yarn.scheduler.fair.max.assign that are specified in 
 the standard Hadoop configuration format.
 The standard and default way of configuring it is to use fair-scheduler.xml 
 as the allocations file and to put the yarn.scheduler properties in 
 yarn-site.xml.
 It is also possible to specify a different file as the allocations file, and 
 to place the yarn.scheduler properties in fair-scheduler.xml, which will be 
 interpreted as in the standard Hadoop configuration format.  This flexibility 
 is both confusing and unnecessary.
 Additionally, the allocation file is loaded as fair-scheduler.xml from the 
 classpath if it is not specified, but is loaded as a File if it is.  This 
 causes two problems
 1. We see different behavior when not setting the 
 yarn.scheduler.fair.allocation.file, and setting it to fair-scheduler.xml, 
 which is its default.
 2. Classloaders may choose to cache resources, which can break the reload 
 logic when yarn.scheduler.fair.allocation.file is not specified.
 We should never allow the yarn.scheduler properties to go into 
 fair-scheduler.xml.  And we should always load the allocations file as a 
 file, not as a resource on the classpath.  To preserve existing behavior and 
 allow loading files from the classpath, we can look for files on the 
 classpath, but strip off their scheme and interpret them as Files.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-955) [YARN-321] History Service should create the RPC server and wire it to HistoryStorage

2013-09-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775797#comment-13775797
 ] 

Hadoop QA commented on YARN-955:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12604687/YARN-955-1.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1986//console

This message is automatically generated.

 [YARN-321] History Service should create the RPC server and wire it to 
 HistoryStorage
 -

 Key: YARN-955
 URL: https://issues.apache.org/jira/browse/YARN-955
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Mayank Bansal
 Attachments: YARN-955-1.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1228) Clean up Fair Scheduler configuration loading

2013-09-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775823#comment-13775823
 ] 

Hadoop QA commented on YARN-1228:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12604697/YARN-1228.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1985//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1985//console

This message is automatically generated.

 Clean up Fair Scheduler configuration loading
 -

 Key: YARN-1228
 URL: https://issues.apache.org/jira/browse/YARN-1228
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Affects Versions: 2.1.1-beta
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: YARN-1228.patch


 Currently the Fair Scheduler is configured in two ways
 * An allocations file that has a different format than the standard Hadoop 
 configuration file, which makes it easier to specify hierarchical objects 
 like queues and their properties. 
 * With properties like yarn.scheduler.fair.max.assign that are specified in 
 the standard Hadoop configuration format.
 The standard and default way of configuring it is to use fair-scheduler.xml 
 as the allocations file and to put the yarn.scheduler properties in 
 yarn-site.xml.
 It is also possible to specify a different file as the allocations file, and 
 to place the yarn.scheduler properties in fair-scheduler.xml, which will be 
 interpreted as in the standard Hadoop configuration format.  This flexibility 
 is both confusing and unnecessary.
 Additionally, the allocation file is loaded as fair-scheduler.xml from the 
 classpath if it is not specified, but is loaded as a File if it is.  This 
 causes two problems
 1. We see different behavior when not setting the 
 yarn.scheduler.fair.allocation.file, and setting it to fair-scheduler.xml, 
 which is its default.
 2. Classloaders may choose to cache resources, which can break the reload 
 logic when yarn.scheduler.fair.allocation.file is not specified.
 We should never allow the yarn.scheduler properties to go into 
 fair-scheduler.xml.  And we should always load the allocations file as a 
 file, not as a resource on the classpath.  To preserve existing behavior and 
 allow loading files from the classpath, we can look for files on the 
 classpath, but strip off their scheme and interpret them as Files.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1214) Register ClientToken MasterKey in SecretManager after it is saved

2013-09-23 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-1214:
--

Attachment: YARN-1214.5.patch

New patch added the comments.

 Register ClientToken MasterKey in SecretManager after it is saved
 -

 Key: YARN-1214
 URL: https://issues.apache.org/jira/browse/YARN-1214
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Jian He
Assignee: Jian He
 Attachments: YARN-1214.1.patch, YARN-1214.2.patch, YARN-1214.3.patch, 
 YARN-1214.4.patch, YARN-1214.5.patch, YARN-1214.patch


 Currently, the app attempt ClientToken master key is registered before it is 
 saved. This can cause a problem: if the client gets the token and the RM 
 crashes before the master key is saved, the RM cannot reload the master key 
 after it restarts, because it was never saved. As a result, the client is 
 holding an invalid token.
 We can register the client token master key after it is saved in the store.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1157) ResourceManager UI has invalid tracking URL link for distributed shell application

2013-09-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775850#comment-13775850
 ] 

Hadoop QA commented on YARN-1157:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12604637/YARN-1157.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1987//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1987//console

This message is automatically generated.

 ResourceManager UI has invalid tracking URL link for distributed shell 
 application
 --

 Key: YARN-1157
 URL: https://issues.apache.org/jira/browse/YARN-1157
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
 Fix For: 2.1.1-beta

 Attachments: YARN-1157.1.patch, YARN-1157.2.patch, YARN-1157.2.patch, 
 YARN-1157.3.patch


 Submit a YARN distributed shell application. Go to the ResourceManager Web UI. The 
 application definitely appears. In the Tracking UI column, there will be a history 
 link. Click on that link. Instead of showing the application master web UI, an HTTP 
 error 500 appears.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1068) Add admin support for HA operations

2013-09-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775853#comment-13775853
 ] 

Hadoop QA commented on YARN-1068:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12604406/yarn-1068-5.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1988//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1988//console

This message is automatically generated.

 Add admin support for HA operations
 ---

 Key: YARN-1068
 URL: https://issues.apache.org/jira/browse/YARN-1068
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
  Labels: ha
 Attachments: yarn-1068-1.patch, yarn-1068-2.patch, yarn-1068-3.patch, 
 yarn-1068-4.patch, yarn-1068-5.patch, yarn-1068-prelim.patch


 Support HA admin operations to facilitate transitioning the RM to Active and 
 Standby states.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-49) Improve distributed shell application to work on a secure cluster

2013-09-23 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-49?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-49:


Attachment: YARN-49-20130923.3.txt

Straightforward patch to add security:
 - The client obtains a delegation token from the default file-system (only the 
default FS today, have to extend more) and puts it in the AM Container's tokens.
 - Because everything else magically happens, AMRMToken, NMToken, 
ContainerToken etc. are already taken care of.
 - One thing that I'm doing in the AM is to filter out the AMRMToken so it is not 
sent across to containers.

No unit tests. Tested this on a single-node secure setup.
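For reference, the client-side token step is roughly the usual delegation-token 
dance (a sketch with illustrative class and method names; error handling and the 
choice of renewer are omitted):

{code}
import java.nio.ByteBuffer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.io.DataOutputBuffer;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;

final class ClientTokenSetup {
  // Fetch a delegation token for the default FileSystem and ship it with the AM container.
  static void addFsTokens(Configuration conf, String renewer, ContainerLaunchContext amContainer)
      throws Exception {
    Credentials credentials = new Credentials();
    FileSystem fs = FileSystem.get(conf);
    fs.addDelegationTokens(renewer, credentials);
    DataOutputBuffer dob = new DataOutputBuffer();
    credentials.writeTokenStorageToStream(dob);
    amContainer.setTokens(ByteBuffer.wrap(dob.getData(), 0, dob.getLength()));
  }
}
{code}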

 Improve distributed shell application to work on a secure cluster
 -

 Key: YARN-49
 URL: https://issues.apache.org/jira/browse/YARN-49
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: applications/distributed-shell
Reporter: Hitesh Shah
Assignee: Vinod Kumar Vavilapalli
 Attachments: YARN-49-20130923.3.txt




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1068) Add admin support for HA operations

2013-09-23 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-1068:
---

Attachment: yarn-1068-6.patch

Minor update to the patch - RMHAProtocolService should continue to be an 
AbstractService, and not a CompositeService.

 Add admin support for HA operations
 ---

 Key: YARN-1068
 URL: https://issues.apache.org/jira/browse/YARN-1068
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
  Labels: ha
 Attachments: yarn-1068-1.patch, yarn-1068-2.patch, yarn-1068-3.patch, 
 yarn-1068-4.patch, yarn-1068-5.patch, yarn-1068-6.patch, 
 yarn-1068-prelim.patch


 Support HA admin operations to facilitate transitioning the RM to Active and 
 Standby states.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-819) ResourceManager and NodeManager should check for a minimum allowed version

2013-09-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775877#comment-13775877
 ] 

Hadoop QA commented on YARN-819:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12604672/YARN-819-3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1990//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1990//console

This message is automatically generated.

 ResourceManager and NodeManager should check for a minimum allowed version
 --

 Key: YARN-819
 URL: https://issues.apache.org/jira/browse/YARN-819
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager, resourcemanager
Affects Versions: 2.0.4-alpha
Reporter: Robert Parker
Assignee: Robert Parker
 Attachments: YARN-819-1.patch, YARN-819-2.patch, YARN-819-3.patch


 Our use case: during an upgrade on a large cluster, several NodeManagers may 
 not restart with the new version.  Once the RM comes back up, the NodeManagers 
 will re-register with the RM without issue.
 The NM should report its version to the RM.  The RM should have a configuration 
 offering these checks: disabled (default), equal to the RM (to avoid a config 
 change for each release), equal to or greater than the RM (to allow NM 
 upgrades), and finally an explicit version or version range.
 The RM should also have a configuration for how to treat a mismatch: 
 REJECT, or REBOOT the NM.
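 A hypothetical sketch of the policy described above (class, method, and policy names 
 here are illustrative, not the names used in the attached patches):
 {code}
 public class NodeVersionCheck {

   // Illustrative policies mirroring the options described above.
   enum Policy { DISABLED, EQUAL_TO_RM, EQUAL_OR_GREATER_THAN_RM, EXPLICIT_MINIMUM }

   static boolean isNodeManagerAllowed(Policy policy, String rmVersion,
       String nmVersion, String explicitMinimum) {
     switch (policy) {
       case DISABLED:                 return true;                               // default
       case EQUAL_TO_RM:              return rmVersion.equals(nmVersion);
       case EQUAL_OR_GREATER_THAN_RM: return compare(nmVersion, rmVersion) >= 0;
       case EXPLICIT_MINIMUM:         return compare(nmVersion, explicitMinimum) >= 0;
       default:                       return false;
     }
   }

   // Naive dotted-version comparison ("2.1.0-beta" -> 2.1.0); a real
   // implementation would be more robust and could also support version ranges.
   static int compare(String a, String b) {
     String[] x = a.split("-")[0].split("\\.");
     String[] y = b.split("-")[0].split("\\.");
     for (int i = 0; i < Math.max(x.length, y.length); i++) {
       int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
       int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
       if (xi != yi) {
         return Integer.compare(xi, yi);
       }
     }
     return 0;
   }
 }
 {code}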

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-1230) Fair scheduler aclSubmitApps does not handle acls with only groups

2013-09-23 Thread Sandy Ryza (JIRA)
Sandy Ryza created YARN-1230:


 Summary: Fair scheduler aclSubmitApps does not handle acls with 
only groups
 Key: YARN-1230
 URL: https://issues.apache.org/jira/browse/YARN-1230
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.1.1-beta
Reporter: Sandy Ryza


ACLs are specified like "user1,user2 group1,group2". An ACL that contains only 
groups should look like " group1,group2", but it will be interpreted incorrectly 
by the Fair Scheduler because it trims the leading space.
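 A self-contained illustration (plain string handling, not the actual FairScheduler 
 parsing code) of how trimming the leading space loses the users/groups distinction:
 {code}
 public class AclTrimExample {
   public static void main(String[] args) {
     String acl = " group1,group2";            // groups-only ACL: empty users part
     String[] parts = acl.split(" ", 2);
     System.out.println("users=[" + parts[0] + "] groups=[" + parts[1] + "]");
     // prints: users=[] groups=[group1,group2]

     String[] trimmed = acl.trim().split(" ", 2);
     System.out.println("users=[" + trimmed[0] + "] groups="
         + (trimmed.length > 1 ? "[" + trimmed[1] + "]" : "[]"));
     // prints: users=[group1,group2] groups=[] -- the groups are now treated as users
   }
 }
 {code}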

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-1231) Fix test cases that will hit max-am-used-resources-percent limit after YARN-276

2013-09-23 Thread Nemon Lou (JIRA)
Nemon Lou created YARN-1231:
---

 Summary: Fix test cases that will hit 
max-am-used-resources-percent limit after YARN-276
 Key: YARN-1231
 URL: https://issues.apache.org/jira/browse/YARN-1231
 Project: Hadoop YARN
  Issue Type: Task
Affects Versions: 2.1.1-beta
Reporter: Nemon Lou
Assignee: Nemon Lou


Use a separate JIRA to fix YARN's test cases that will fail by hitting the 
max-am-used-resources-percent limit after YARN-276.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1157) ResourceManager UI has invalid tracking URL link for distributed shell application

2013-09-23 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775906#comment-13775906
 ] 

Xuan Gong commented on YARN-1157:
-

bq.For the following code, we may create a common function for both 
AMRegisteredTransition and AMUnregisteredTransition

Done

bq.Can you also document in the specific method comments ? for both 
registerRequest and unregisterRequest. And also say something like for those 
default values, will fallback to ResourceManager's app page

Added

bq.Typo in RegisterApplicationMasterRequest: are all values

Fixed

bq.The tests can probably be done with 
TestRMAppAttemptTransitions.runApplicationAttempt, In fact,the earlier tests in 
TestRMAppAttemptImpl can probably also be merged into 
TestRMAppAttemptTransitions. and so we don't need to change the visibility of 
AMregisteredTransition and AMUnregisteredTransition.

Removed TestRMAppAttemptImpl. We will cover all its tests in 
TestRMAppAttemptTransitions. Changed the visibility of AMRegisteredTransition 
and AMUnregisteredTransition back to private.
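
For reference, a minimal sketch of the register/unregister calls involved (assuming the 
standard protocol record factories; the empty tracking URL is the "default value" case 
that should fall back to the ResourceManager's app page):
{code}
import org.apache.hadoop.yarn.api.protocolrecords.FinishApplicationMasterRequest;
import org.apache.hadoop.yarn.api.protocolrecords.RegisterApplicationMasterRequest;
import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;

public class TrackingUrlExample {

  // Register with an empty tracking URL; the RM should link to its own app page.
  static RegisterApplicationMasterRequest register(String host, int rpcPort) {
    return RegisterApplicationMasterRequest.newInstance(host, rpcPort, "");
  }

  // Unregister the same way; again no AM-provided tracking URL.
  static FinishApplicationMasterRequest unregister(String diagnostics) {
    return FinishApplicationMasterRequest.newInstance(
        FinalApplicationStatus.SUCCEEDED, diagnostics, "");
  }
}
{code}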



 ResourceManager UI has invalid tracking URL link for distributed shell 
 application
 --

 Key: YARN-1157
 URL: https://issues.apache.org/jira/browse/YARN-1157
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
 Fix For: 2.1.1-beta

 Attachments: YARN-1157.1.patch, YARN-1157.2.patch, YARN-1157.2.patch, 
 YARN-1157.3.patch


 Submit a YARN distributed shell application. Go to the ResourceManager web UI. 
 The application definitely appears. In the Tracking UI column, there will be a 
 history link. Click on that link. Instead of showing the application master web 
 UI, an HTTP 500 error appears.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1157) ResourceManager UI has invalid tracking URL link for distributed shell application

2013-09-23 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-1157:


Attachment: YARN-1157.4.patch

 ResourceManager UI has invalid tracking URL link for distributed shell 
 application
 --

 Key: YARN-1157
 URL: https://issues.apache.org/jira/browse/YARN-1157
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
 Fix For: 2.1.1-beta

 Attachments: YARN-1157.1.patch, YARN-1157.2.patch, YARN-1157.2.patch, 
 YARN-1157.3.patch, YARN-1157.4.patch


 Submit a YARN distributed shell application. Go to the ResourceManager web UI. 
 The application definitely appears. In the Tracking UI column, there will be a 
 history link. Click on that link. Instead of showing the application master web 
 UI, an HTTP 500 error appears.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-49) Improve distributed shell application to work on a secure cluster

2013-09-23 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-49?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775910#comment-13775910
 ] 

Omkar Vinit Joshi commented on YARN-49:
---

Thanks vinod..
bq. Because everything else magically happens, AMRMToken, NMToken, 
ContainerToken etc are already taken care of.
This is good and will serve as an example for other YARN app writers on how to 
use the client libraries.

bq.One thing that I'm doing in AM is to filter out AMRMToken from sending them 
across to containers.
+1
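
A minimal sketch of that filtering step (following the general Hadoop security APIs; 
this is not the exact patch code):
{code}
import java.nio.ByteBuffer;
import java.util.Iterator;

import org.apache.hadoop.io.DataOutputBuffer;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.yarn.security.AMRMTokenIdentifier;

public class ContainerTokensHelper {

  // Build the token blob for launched containers with the AMRMToken stripped out,
  // so containers cannot talk to the RM as the AM.
  static ByteBuffer tokensForContainers() throws Exception {
    Credentials credentials =
        UserGroupInformation.getCurrentUser().getCredentials();
    Iterator<? extends Token<?>> iter = credentials.getAllTokens().iterator();
    while (iter.hasNext()) {
      if (iter.next().getKind().equals(AMRMTokenIdentifier.KIND_NAME)) {
        iter.remove();
      }
    }
    DataOutputBuffer dob = new DataOutputBuffer();
    credentials.writeTokenStorageToStream(dob);
    return ByteBuffer.wrap(dob.getData(), 0, dob.getLength());
  }
}
{code}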

bq. No unit tests. Tested this on a single node secure setup.
Tested this on my local secure setup. Also tested AMRMToken removal.

+1 lgtm

 Improve distributed shell application to work on a secure cluster
 -

 Key: YARN-49
 URL: https://issues.apache.org/jira/browse/YARN-49
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: applications/distributed-shell
Reporter: Hitesh Shah
Assignee: Vinod Kumar Vavilapalli
 Attachments: YARN-49-20130923.3.txt




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start

2013-09-23 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775918#comment-13775918
 ] 

Bikas Saha commented on YARN-1229:
--

We should probably just rename it to MapreduceShuffle. In many cases the '.' 
works, and so we didn't catch it in the tests.

We should also put some convention on the service names to make them safe, e.g. 
a service name may only contain a-zA-Z0-9. In yarn-site/code etc. we can document 
it and enforce it in the code that reads aux-services from the config.
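
A hypothetical sketch of such a check (the regex and method are illustrative only, not 
the committed fix):
{code}
import java.util.regex.Pattern;

public class AuxServiceNameCheck {

  // Letters, digits and underscores only, so NM_AUX_SERVICE_<name> stays a
  // valid shell identifier when it is exported in launch_container.sh.
  private static final Pattern VALID_NAME =
      Pattern.compile("^[A-Za-z_][A-Za-z0-9_]*$");

  static void checkName(String name) {
    if (!VALID_NAME.matcher(name).matches()) {
      throw new IllegalArgumentException("Invalid aux-service name '" + name
          + "': only letters, digits and '_' are allowed");
    }
  }

  public static void main(String[] args) {
    checkName("mapreduce_shuffle");   // passes
    checkName("mapreduce.shuffle");   // throws: the '.' breaks the exported variable
  }
}
{code}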

 Shell$ExitCodeException could happen if AM fails to start
 -

 Key: YARN-1229
 URL: https://issues.apache.org/jira/browse/YARN-1229
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
Priority: Critical
 Fix For: 2.1.1-beta


 I ran a sleep job. If the AM fails to start, this exception can occur:
 13/09/20 11:00:23 INFO mapreduce.Job: Job job_1379673267098_0020 failed with 
 state FAILED due to: Application application_1379673267098_0020 failed 1 
 times due to AM Container for appattempt_1379673267098_0020_01 exited 
 with  exitCode: 1 due to: Exception from container-launch:
 org.apache.hadoop.util.Shell$ExitCodeException: 
 /myappcache/application_1379673267098_0020/container_1379673267098_0020_01_01/launch_container.sh:
  line 12: export: 
 `NM_AUX_SERVICE_mapreduce.shuffle=AAA0+gA=
 ': not a valid identifier
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
 at org.apache.hadoop.util.Shell.run(Shell.java:379)
 at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
 at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:270)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:78)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 .Failing this attempt.. Failing the application.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start

2013-09-23 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated YARN-1229:
-

Priority: Blocker  (was: Critical)

 Shell$ExitCodeException could happen if AM fails to start
 -

 Key: YARN-1229
 URL: https://issues.apache.org/jira/browse/YARN-1229
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
Priority: Blocker
 Fix For: 2.1.1-beta


 I ran a sleep job. If the AM fails to start, this exception can occur:
 13/09/20 11:00:23 INFO mapreduce.Job: Job job_1379673267098_0020 failed with 
 state FAILED due to: Application application_1379673267098_0020 failed 1 
 times due to AM Container for appattempt_1379673267098_0020_01 exited 
 with  exitCode: 1 due to: Exception from container-launch:
 org.apache.hadoop.util.Shell$ExitCodeException: 
 /myappcache/application_1379673267098_0020/container_1379673267098_0020_01_01/launch_container.sh:
  line 12: export: 
 `NM_AUX_SERVICE_mapreduce.shuffle=AAA0+gA=
 ': not a valid identifier
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
 at org.apache.hadoop.util.Shell.run(Shell.java:379)
 at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
 at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:270)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:78)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 .Failing this attempt.. Failing the application.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start

2013-09-23 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated YARN-1229:
-

Hadoop Flags: Incompatible change

 Shell$ExitCodeException could happen if AM fails to start
 -

 Key: YARN-1229
 URL: https://issues.apache.org/jira/browse/YARN-1229
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
Priority: Blocker
 Fix For: 2.1.1-beta


 I ran a sleep job. If the AM fails to start, this exception can occur:
 13/09/20 11:00:23 INFO mapreduce.Job: Job job_1379673267098_0020 failed with 
 state FAILED due to: Application application_1379673267098_0020 failed 1 
 times due to AM Container for appattempt_1379673267098_0020_01 exited 
 with  exitCode: 1 due to: Exception from container-launch:
 org.apache.hadoop.util.Shell$ExitCodeException: 
 /myappcache/application_1379673267098_0020/container_1379673267098_0020_01_01/launch_container.sh:
  line 12: export: 
 `NM_AUX_SERVICE_mapreduce.shuffle=AAA0+gA=
 ': not a valid identifier
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
 at org.apache.hadoop.util.Shell.run(Shell.java:379)
 at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
 at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:270)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:78)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 .Failing this attempt.. Failing the application.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1089) Add YARN compute units alongside virtual cores

2013-09-23 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775920#comment-13775920
 ] 

Bikas Saha commented on YARN-1089:
--

I am afraid this is getting confusing.

 Add YARN compute units alongside virtual cores
 --

 Key: YARN-1089
 URL: https://issues.apache.org/jira/browse/YARN-1089
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: api
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: YARN-1089-1.patch, YARN-1089.patch


 Based on discussion in YARN-1024, we will add YARN compute units as a 
 resource for requesting and scheduling CPU processing power.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1204) Need to add https port related property in Yarn

2013-09-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775933#comment-13775933
 ] 

Hadoop QA commented on YARN-1204:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12604675/YARN-1204.20131023.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy:

  org.apache.hadoop.mapreduce.TestMRJobClient

  The following test timeouts occurred in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy:

org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator
org.apache.hadoop.mapreduce.v2.TestUberAM

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1989//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1989//console

This message is automatically generated.

 Need to add https port related property in Yarn
 ---

 Key: YARN-1204
 URL: https://issues.apache.org/jira/browse/YARN-1204
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Yesha Vora
Assignee: Omkar Vinit Joshi
 Attachments: YARN-1204.20131018.1.patch, YARN-1204.20131020.1.patch, 
 YARN-1204.20131020.2.patch, YARN-1204.20131020.3.patch, 
 YARN-1204.20131020.4.patch, YARN-1204.20131023.1.patch


 There is no YARN property available to configure the HTTPS port for the 
 ResourceManager, NodeManager, and history server. Currently, YARN services use 
 the port defined for HTTP [defined by 
 'mapreduce.jobhistory.webapp.address', 'yarn.nodemanager.webapp.address', 
 'yarn.resourcemanager.webapp.address'] when running services over HTTPS.
 YARN should have a list of properties to assign HTTPS ports for the RM, NM, and 
 JHS. They could look like the following:
 yarn.nodemanager.webapp.https.address
 yarn.resourcemanager.webapp.https.address
 mapreduce.jobhistory.webapp.https.address
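 A minimal sketch of how such properties would be read (the property names are the ones 
 proposed above; the fall-back host:port values below are purely illustrative):
 {code}
 import org.apache.hadoop.conf.Configuration;

 public class HttpsAddressExample {
   public static void main(String[] args) {
     Configuration conf = new Configuration();
     // Default values here are illustrative, not the values chosen by the patch.
     String rmHttps = conf.get("yarn.resourcemanager.webapp.https.address",
         "0.0.0.0:8090");
     String nmHttps = conf.get("yarn.nodemanager.webapp.https.address",
         "0.0.0.0:8044");
     String jhsHttps = conf.get("mapreduce.jobhistory.webapp.https.address",
         "0.0.0.0:19890");
     System.out.println(rmHttps + " " + nmHttps + " " + jhsHttps);
   }
 }
 {code}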

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1068) Add admin support for HA operations

2013-09-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775941#comment-13775941
 ] 

Hadoop QA commented on YARN-1068:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12604708/yarn-1068-6.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1991//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1991//console

This message is automatically generated.

 Add admin support for HA operations
 ---

 Key: YARN-1068
 URL: https://issues.apache.org/jira/browse/YARN-1068
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
  Labels: ha
 Attachments: yarn-1068-1.patch, yarn-1068-2.patch, yarn-1068-3.patch, 
 yarn-1068-4.patch, yarn-1068-5.patch, yarn-1068-6.patch, 
 yarn-1068-prelim.patch


 Support HA admin operations to facilitate transitioning the RM to Active and 
 Standby states.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-49) Improve distributed shell application to work on a secure cluster

2013-09-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-49?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775952#comment-13775952
 ] 

Hadoop QA commented on YARN-49:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12604705/YARN-49-20130923.3.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client:

  
org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1992//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1992//console

This message is automatically generated.

 Improve distributed shell application to work on a secure cluster
 -

 Key: YARN-49
 URL: https://issues.apache.org/jira/browse/YARN-49
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: applications/distributed-shell
Reporter: Hitesh Shah
Assignee: Vinod Kumar Vavilapalli
 Attachments: YARN-49-20130923.3.txt




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1214) Register ClientToken MasterKey in SecretManager after it is saved

2013-09-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775960#comment-13775960
 ] 

Hadoop QA commented on YARN-1214:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12604704/YARN-1214.5.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1993//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1993//console

This message is automatically generated.

 Register ClientToken MasterKey in SecretManager after it is saved
 -

 Key: YARN-1214
 URL: https://issues.apache.org/jira/browse/YARN-1214
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Jian He
Assignee: Jian He
 Attachments: YARN-1214.1.patch, YARN-1214.2.patch, YARN-1214.3.patch, 
 YARN-1214.4.patch, YARN-1214.5.patch, YARN-1214.patch


 Currently, the app attempt ClientToken master key is registered before it is 
 saved. This can cause a problem: if the client gets the token before the master 
 key is saved and the RM then crashes, the RM cannot reload the master key after 
 it restarts because it was never saved. As a result, the client is left holding 
 an invalid token.
 We can register the client token master key only after it has been saved in the 
 store.
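 A hypothetical sketch of the intended ordering (interface and method names here are 
 illustrative, not the RM's actual classes):
 {code}
 public class ClientTokenKeyFlow {

   interface StateStore {
     void storeMasterKey(byte[] key) throws Exception;
   }

   interface ClientTokenSecretManager {
     void registerMasterKey(byte[] key);
   }

   // Persist the master key first, then register it, so that a client can only
   // ever receive a token whose key will survive an RM restart.
   static void saveThenRegister(StateStore store,
       ClientTokenSecretManager secretManager, byte[] masterKey) throws Exception {
     store.storeMasterKey(masterKey);            // 1. durably save the key
     secretManager.registerMasterKey(masterKey); // 2. only then hand tokens to clients
   }
 }
 {code}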

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1157) ResourceManager UI has invalid tracking URL link for distributed shell application

2013-09-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775965#comment-13775965
 ] 

Hadoop QA commented on YARN-1157:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12604723/YARN-1157.4.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1994//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1994//console

This message is automatically generated.

 ResourceManager UI has invalid tracking URL link for distributed shell 
 application
 --

 Key: YARN-1157
 URL: https://issues.apache.org/jira/browse/YARN-1157
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Tassapol Athiapinya
Assignee: Xuan Gong
 Fix For: 2.1.1-beta

 Attachments: YARN-1157.1.patch, YARN-1157.2.patch, YARN-1157.2.patch, 
 YARN-1157.3.patch, YARN-1157.4.patch


 Submit a YARN distributed shell application. Go to the ResourceManager web UI. 
 The application definitely appears. In the Tracking UI column, there will be a 
 history link. Click on that link. Instead of showing the application master web 
 UI, an HTTP 500 error appears.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira