[jira] [Commented] (YARN-1188) The context of QueueMetrics becomes 'default' when using FairScheduler
[ https://issues.apache.org/jira/browse/YARN-1188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774336#comment-13774336 ] Sandy Ryza commented on YARN-1188: -- +1 The context of QueueMetrics becomes 'default' when using FairScheduler -- Key: YARN-1188 URL: https://issues.apache.org/jira/browse/YARN-1188 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Akira AJISAKA Priority: Minor Labels: metrics, newbie Attachments: YARN-1188.1.patch I found that the context of QueueMetrics changed from 'yarn' to 'default' when I was using FairScheduler. The context should always be 'yarn'; this can be fixed by adding an annotation to FSQueueMetrics like below: {code} + @Metrics(context="yarn") public class FSQueueMetrics extends QueueMetrics { {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
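[Editor's note] A minimal, self-contained sketch of why the annotation is needed. ParentMetrics and ChildMetrics below are illustrative stand-ins for QueueMetrics and FSQueueMetrics, not the Hadoop source; the point is that Java annotations are not inherited, which is consistent with the behaviour reported here.
{code}
// Illustrative sketch only -- not the actual Hadoop source. ParentMetrics stands in for
// QueueMetrics and ChildMetrics for FSQueueMetrics. Because Java annotations are not
// inherited, the subclass needs its own @Metrics annotation to stay in the 'yarn'
// context; without it, the metrics system falls back to the 'default' context.
import org.apache.hadoop.metrics2.annotation.Metrics;

@Metrics(context="yarn")
class ParentMetrics {
}

@Metrics(context="yarn")   // the line the attached patch effectively adds to FSQueueMetrics
class ChildMetrics extends ParentMetrics {
}
{code}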
[jira] [Updated] (YARN-1188) The context of QueueMetrics becomes 'default' when using FairScheduler
[ https://issues.apache.org/jira/browse/YARN-1188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1188: - Assignee: Akira AJISAKA The context of QueueMetrics becomes 'default' when using FairScheduler -- Key: YARN-1188 URL: https://issues.apache.org/jira/browse/YARN-1188 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Akira AJISAKA Assignee: Akira AJISAKA Priority: Minor Labels: metrics, newbie Attachments: YARN-1188.1.patch I found that the context of QueueMetrics changed from 'yarn' to 'default' when I was using FairScheduler. The context should always be 'yarn'; this can be fixed by adding an annotation to FSQueueMetrics like below: {code} + @Metrics(context="yarn") public class FSQueueMetrics extends QueueMetrics { {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1188) The context of QueueMetrics becomes 'default' when using FairScheduler
[ https://issues.apache.org/jira/browse/YARN-1188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774344#comment-13774344 ] Sandy Ryza commented on YARN-1188: -- I just committed this to trunk and branch-2. Thanks Tsuyoshi! The context of QueueMetrics becomes 'default' when using FairScheduler -- Key: YARN-1188 URL: https://issues.apache.org/jira/browse/YARN-1188 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Akira AJISAKA Assignee: Akira AJISAKA Priority: Minor Labels: metrics, newbie Attachments: YARN-1188.1.patch I found that the context of QueueMetrics changed from 'yarn' to 'default' when I was using FairScheduler. The context should always be 'yarn'; this can be fixed by adding an annotation to FSQueueMetrics like below: {code} + @Metrics(context="yarn") public class FSQueueMetrics extends QueueMetrics { {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1188) The context of QueueMetrics becomes 'default' when using FairScheduler
[ https://issues.apache.org/jira/browse/YARN-1188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774350#comment-13774350 ] Hudson commented on YARN-1188: -- SUCCESS: Integrated in Hadoop-trunk-Commit #4451 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4451/]) Fix credit for YARN-1188 (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525518) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt The context of QueueMetrics becomes 'default' when using FairScheduler -- Key: YARN-1188 URL: https://issues.apache.org/jira/browse/YARN-1188 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Akira AJISAKA Priority: Minor Labels: metrics, newbie Attachments: YARN-1188.1.patch I found that the context of QueueMetrics changed from 'yarn' to 'default' when I was using FairScheduler. The context should always be 'yarn'; this can be fixed by adding an annotation to FSQueueMetrics like below: {code} + @Metrics(context="yarn") public class FSQueueMetrics extends QueueMetrics { {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1188) The context of QueueMetrics becomes 'default' when using FairScheduler
[ https://issues.apache.org/jira/browse/YARN-1188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774457#comment-13774457 ] Hudson commented on YARN-1188: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #341 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/341/]) Fix credit for YARN-1188 (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525518) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt YARN-1188. The context of QueueMetrics becomes default when using FairScheduler (Akira Ajisaka via Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525516) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSQueueMetrics.java The context of QueueMetrics becomes 'default' when using FairScheduler -- Key: YARN-1188 URL: https://issues.apache.org/jira/browse/YARN-1188 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Akira AJISAKA Priority: Minor Labels: metrics, newbie Attachments: YARN-1188.1.patch I found that the context of QueueMetrics changed from 'yarn' to 'default' when I was using FairScheduler. The context should always be 'yarn'; this can be fixed by adding an annotation to FSQueueMetrics like below: {code} + @Metrics(context="yarn") public class FSQueueMetrics extends QueueMetrics { {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1188) The context of QueueMetrics becomes 'default' when using FairScheduler
[ https://issues.apache.org/jira/browse/YARN-1188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774531#comment-13774531 ] Hudson commented on YARN-1188: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1557 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1557/]) Fix credit for YARN-1188 (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525518) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt YARN-1188. The context of QueueMetrics becomes default when using FairScheduler (Akira Ajisaka via Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525516) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSQueueMetrics.java The context of QueueMetrics becomes 'default' when using FairScheduler -- Key: YARN-1188 URL: https://issues.apache.org/jira/browse/YARN-1188 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Akira AJISAKA Priority: Minor Labels: metrics, newbie Attachments: YARN-1188.1.patch I found that the context of QueueMetrics changed from 'yarn' to 'default' when I was using FairScheduler. The context should always be 'yarn'; this can be fixed by adding an annotation to FSQueueMetrics like below: {code} + @Metrics(context="yarn") public class FSQueueMetrics extends QueueMetrics { {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1188) The context of QueueMetrics becomes 'default' when using FairScheduler
[ https://issues.apache.org/jira/browse/YARN-1188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774537#comment-13774537 ] Hudson commented on YARN-1188: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1531 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1531/]) Fix credit for YARN-1188 (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525518) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt YARN-1188. The context of QueueMetrics becomes default when using FairScheduler (Akira Ajisaka via Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1525516) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSQueueMetrics.java The context of QueueMetrics becomes 'default' when using FairScheduler -- Key: YARN-1188 URL: https://issues.apache.org/jira/browse/YARN-1188 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Akira AJISAKA Priority: Minor Labels: metrics, newbie Attachments: YARN-1188.1.patch I found that the context of QueueMetrics changed from 'yarn' to 'default' when I was using FairScheduler. The context should always be 'yarn'; this can be fixed by adding an annotation to FSQueueMetrics like below: {code} + @Metrics(context="yarn") public class FSQueueMetrics extends QueueMetrics { {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1041) RM to bind and notify a restarted AM of existing containers
[ https://issues.apache.org/jira/browse/YARN-1041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774539#comment-13774539 ] Steve Loughran commented on YARN-1041: -- I'd add that discarding outstanding requests on AM restart would spare the AM the problem of handling container allocations it was not expecting. After enumerating the active set of containers, the AM could make its own decisions about how many containers it needs, and where. RM to bind and notify a restarted AM of existing containers --- Key: YARN-1041 URL: https://issues.apache.org/jira/browse/YARN-1041 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 3.0.0 Reporter: Steve Loughran For long-lived containers we don't want the AM to be a SPOF. When the RM restarts a (failed) AM, it should be given the list of containers it had already been allocated. The AM should then be able to contact the NMs to get details on them. NMs would also need to do any binding of the containers needed to handle a moved/restarted AM. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
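[Editor's note] A rough, hypothetical sketch of what the re-sync flow under discussion could look like from the AM side. The registerApplicationMaster call is the existing AMRMClient API in 2.1; the commented-out getContainersFromPreviousAttempt() call is invented here for illustration and does not exist in this version.
{code}
// Hypothetical sketch of the re-sync flow discussed above; not an existing YARN feature.
import org.apache.hadoop.yarn.api.protocolrecords.RegisterApplicationMasterResponse;
import org.apache.hadoop.yarn.client.api.AMRMClient;

public class AmResyncSketch {
  static void resync(AMRMClient<AMRMClient.ContainerRequest> amRmClient) throws Exception {
    // Real API in 2.1: a restarted AM re-registers with the RM.
    RegisterApplicationMasterResponse response =
        amRmClient.registerApplicationMaster("am-host", 0, "");

    // Hypothetical: the RM hands back the containers it still tracks for this application.
    // List<Container> existing = response.getContainersFromPreviousAttempt();

    // With outstanding requests discarded by the RM, the AM would rebuild its view from
    // the enumerated containers (confirming each with its NM), then request only the
    // additional containers it still needs, and decide where it wants them.
  }
}
{code}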
[jira] [Created] (YARN-1226) ipv4 and ipv6 affect job data locality
Kaibo Zhou created YARN-1226: Summary: ipv4 and ipv6 affect job data locality Key: YARN-1226 URL: https://issues.apache.org/jira/browse/YARN-1226 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler Affects Versions: 2.1.0-beta, 2.0.0-alpha, 0.23.3 Reporter: Kaibo Zhou Priority: Minor When I run a mapreduce job that uses TableInputFormat to scan an HBase table on a YARN cluster with 140+ nodes, I consistently get very low data locality (Capacity Scheduler), around 0~10%. HBase and Hadoop are integrated in the cluster, with NodeManager, DataNode and HRegionServer running on the same node. The reason for the low data locality is that most machines in the cluster use IPv6 and a few use IPv4. NodeManager uses InetAddress.getLocalHost().getHostName() to get the host name, but the result of this call depends on whether IPv4 or IPv6 is in use; see "InetAddress.getLocalHost().getHostName() returns FQDN" (http://bugs.sun.com/view_bug.do?bug_id=7166687). On machines with IPv4, NodeManager gets the hostName as search042097.sqa.cm4.site.net, but on machines with IPv6, NodeManager gets the hostName as search042097.sqa.cm4; if run with IPv6 disabled (-Djava.net.preferIPv4Stack=true), it returns search042097.sqa.cm4.site.net. For the mapred job that scans the HBase table, the InputSplit contains node locations as FQDNs, e.g. search042097.sqa.cm4.site.net, because in HBase the RegionServers' hostnames are assigned by the HMaster: the HMaster communicates with the RegionServers and gets each region server's host name using Java NIO, clientChannel.socket().getInetAddress().getHostName(). Also see the startup log of a region server: 13:06:21,200 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us hostname to use. Was=search042024.sqa.cm4, Now=search042024.sqa.cm4.site.net As you can see, most machines in the YARN cluster with IPv6 get the short hostname, but HBase always gets the full hostname, so the hosts cannot be matched (see RMContainerAllocator::assignToMap). This can lead to poor locality. After I used java.net.preferIPv4Stack to force IPv4 in YARN, I got 70+% data locality in the cluster. Thanks, Kaibo -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
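[Editor's note] The mismatch is easy to reproduce outside of YARN. The following small check is illustrative only (it is not YARN or HBase code); running it on an affected machine with and without -Djava.net.preferIPv4Stack=true shows the short-name vs FQDN difference the reporter describes.
{code}
// Illustrative only: shows why exact-string host matching (as in
// RMContainerAllocator::assignToMap) fails when one side reports a short hostname
// and the other a fully qualified one.
import java.net.InetAddress;

public class HostNameCheck {
  public static void main(String[] args) throws Exception {
    // Depending on the resolver and on IPv4 vs IPv6 preference, getHostName() may return
    // a short name ("search042097.sqa.cm4") or an FQDN ("search042097.sqa.cm4.site.net").
    String reported = InetAddress.getLocalHost().getHostName();
    // getCanonicalHostName() asks the resolver explicitly for the fully qualified name.
    String canonical = InetAddress.getLocalHost().getCanonicalHostName();
    System.out.println("getHostName():          " + reported);
    System.out.println("getCanonicalHostName(): " + canonical);
    // An exact comparison treats the short and fully qualified forms as different hosts,
    // which is what breaks locality matching between NodeManager and HBase split locations.
    System.out.println("exact match: " + reported.equals(canonical));
  }
}
{code}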
[jira] [Updated] (YARN-1226) ipv4 and ipv6 lead to poor data locality
[ https://issues.apache.org/jira/browse/YARN-1226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaibo Zhou updated YARN-1226: - Priority: Major (was: Minor) ipv4 and ipv6 lead to poor data locality Key: YARN-1226 URL: https://issues.apache.org/jira/browse/YARN-1226 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler Affects Versions: 0.23.3, 2.0.0-alpha, 2.1.0-beta Reporter: Kaibo Zhou When I run a mapreduce job that uses TableInputFormat to scan an HBase table on a YARN cluster with 140+ nodes, I consistently get very low data locality, around 0~10%. The scheduler is the Capacity Scheduler. HBase and Hadoop are integrated in the cluster, with NodeManager, DataNode and HRegionServer running on the same node. The reason for the low data locality is that most machines in the cluster use IPv6 and a few use IPv4. NodeManager uses InetAddress.getLocalHost().getHostName() to get the host name, but the result of this call depends on whether IPv4 or IPv6 is in use; see [InetAddress.getLocalHost().getHostName() returns FQDN|http://bugs.sun.com/view_bug.do?bug_id=7166687]. On machines with IPv4, NodeManager gets the hostName as search042097.sqa.cm4.site.net, but on machines with IPv6, NodeManager gets the hostName as search042097.sqa.cm4; if run with IPv6 disabled (-Djava.net.preferIPv4Stack=true), it returns search042097.sqa.cm4.site.net. For the mapred job that scans the HBase table, the InputSplit contains node locations given as an [FQDN|http://en.wikipedia.org/wiki/FQDN], e.g. search042097.sqa.cm4.site.net, because in HBase the RegionServers' hostnames are assigned by the HMaster: the HMaster communicates with the RegionServers and gets each region server's host name using Java NIO, clientChannel.socket().getInetAddress().getHostName(). Also see the startup log of a region server: 13:06:21,200 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us hostname to use. Was=search042024.sqa.cm4, Now=search042024.sqa.cm4.site.net As you can see, most machines in the YARN cluster with IPv6 get the short hostname, but HBase always gets the full hostname, so the hosts cannot be matched (see RMContainerAllocator::assignToMap). This can lead to poor locality. After I used java.net.preferIPv4Stack to force IPv4 in YARN, I got 70+% data locality in the cluster. Thanks, Kaibo -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1226) ipv4 and ipv6 lead to poor data locality
[ https://issues.apache.org/jira/browse/YARN-1226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaibo Zhou updated YARN-1226: - Summary: ipv4 and ipv6 lead to poor data locality (was: ipv4 and ipv6 affect job data locality) ipv4 and ipv6 lead to poor data locality Key: YARN-1226 URL: https://issues.apache.org/jira/browse/YARN-1226 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler Affects Versions: 0.23.3, 2.0.0-alpha, 2.1.0-beta Reporter: Kaibo Zhou Priority: Minor When I run a mapreduce job that uses TableInputFormat to scan an HBase table on a YARN cluster with 140+ nodes, I consistently get very low data locality, around 0~10%. The scheduler is the Capacity Scheduler. HBase and Hadoop are integrated in the cluster, with NodeManager, DataNode and HRegionServer running on the same node. The reason for the low data locality is that most machines in the cluster use IPv6 and a few use IPv4. NodeManager uses InetAddress.getLocalHost().getHostName() to get the host name, but the result of this call depends on whether IPv4 or IPv6 is in use; see [InetAddress.getLocalHost().getHostName() returns FQDN|http://bugs.sun.com/view_bug.do?bug_id=7166687]. On machines with IPv4, NodeManager gets the hostName as search042097.sqa.cm4.site.net, but on machines with IPv6, NodeManager gets the hostName as search042097.sqa.cm4; if run with IPv6 disabled (-Djava.net.preferIPv4Stack=true), it returns search042097.sqa.cm4.site.net. For the mapred job that scans the HBase table, the InputSplit contains node locations given as an [FQDN|http://en.wikipedia.org/wiki/FQDN], e.g. search042097.sqa.cm4.site.net, because in HBase the RegionServers' hostnames are assigned by the HMaster: the HMaster communicates with the RegionServers and gets each region server's host name using Java NIO, clientChannel.socket().getInetAddress().getHostName(). Also see the startup log of a region server: 13:06:21,200 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us hostname to use. Was=search042024.sqa.cm4, Now=search042024.sqa.cm4.site.net As you can see, most machines in the YARN cluster with IPv6 get the short hostname, but HBase always gets the full hostname, so the hosts cannot be matched (see RMContainerAllocator::assignToMap). This can lead to poor locality. After I used java.net.preferIPv4Stack to force IPv4 in YARN, I got 70+% data locality in the cluster. Thanks, Kaibo -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-1227) Update Single Cluster doc to use yarn.resourcemanager.hostname
Sandy Ryza created YARN-1227: Summary: Update Single Cluster doc to use yarn.resourcemanager.hostname Key: YARN-1227 URL: https://issues.apache.org/jira/browse/YARN-1227 Project: Hadoop YARN Issue Type: Improvement Components: documentation Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Now that yarn.resourcemanager.hostname can be used in place of yarn.resourcemanager.address, yarn.resourcemanager.scheduler.address, etc., we should update the doc to use it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
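[Editor's note] A sketch of the simplification the updated doc would illustrate, assuming yarn-default.xml is on the classpath so the ${yarn.resourcemanager.hostname}-based defaults expand; the hostname value below is a placeholder.
{code}
// Sketch only: once yarn.resourcemanager.hostname is set, the individual RM service
// addresses fall back to that host with their standard default ports, so a single
// property can replace several explicit yarn.resourcemanager.*.address entries.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class RmHostnameSketch {
  public static void main(String[] args) {
    Configuration conf = new YarnConfiguration();
    conf.set("yarn.resourcemanager.hostname", "rm.example.com");   // placeholder host
    // With yarn-default.xml loaded, these expand to rm.example.com:<default port>
    // (e.g. 8032, 8030, 8088) instead of requiring their own entries in yarn-site.xml.
    System.out.println(conf.get("yarn.resourcemanager.address"));
    System.out.println(conf.get("yarn.resourcemanager.scheduler.address"));
    System.out.println(conf.get("yarn.resourcemanager.webapp.address"));
  }
}
{code}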
[jira] [Commented] (YARN-1204) Need to add https port related property in Yarn
[ https://issues.apache.org/jira/browse/YARN-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774742#comment-13774742 ] Vinod Kumar Vavilapalli commented on YARN-1204: --- Quickly looked at the patch. In ResourceManager.java and WebAppProxy.java, you replaced the usage of {{YarnConfiguration.getProxyHostAndPort()}} with {{WebAppUtils.getResolvedRMWebAppURL()}}. These look like bugs to me. Can you also look at the test failures? There are a lot of them; some are obviously tracked elsewhere, but let's make sure all of the failures are tracked. Need to add https port related property in Yarn --- Key: YARN-1204 URL: https://issues.apache.org/jira/browse/YARN-1204 Project: Hadoop YARN Issue Type: Bug Reporter: Yesha Vora Assignee: Omkar Vinit Joshi Attachments: YARN-1204.20131018.1.patch, YARN-1204.20131020.1.patch, YARN-1204.20131020.2.patch, YARN-1204.20131020.3.patch, YARN-1204.20131020.4.patch There is no YARN property available to configure the https port for the ResourceManager, NodeManager and history server. Currently, YARN services use the port defined for http [defined by 'mapreduce.jobhistory.webapp.address', 'yarn.nodemanager.webapp.address', 'yarn.resourcemanager.webapp.address'] when running services over the https protocol. YARN should have a list of properties to assign https ports for the RM, NM and JHS. They could be like the following: yarn.nodemanager.webapp.https.address yarn.resourcemanager.webapp.https.address mapreduce.jobhistory.webapp.https.address -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
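[Editor's note] A small hedged sketch of the property split the description proposes. The *.https.address name follows the issue text; the fallback values below are placeholders, not committed defaults, and may differ from what the final patch adopts.
{code}
// Sketch of the proposed split between http and https web addresses.
import org.apache.hadoop.conf.Configuration;

public class WebAppHttpsAddressSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Today: the same address property is reused regardless of protocol.
    String httpAddr = conf.get("yarn.resourcemanager.webapp.address", "0.0.0.0:8088");
    // Proposed: a dedicated https address, consulted when the web apps run over SSL.
    String httpsAddr = conf.get("yarn.resourcemanager.webapp.https.address", "0.0.0.0:8090");
    System.out.println("http:  " + httpAddr);
    System.out.println("https: " + httpsAddr);
  }
}
{code}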
[jira] [Commented] (YARN-1221) With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely
[ https://issues.apache.org/jira/browse/YARN-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774765#comment-13774765 ] Siqi Li commented on YARN-1221: --- According to the fair scheduler log:
2013-09-23 17:32:30,593 ASSIGN atla-aub-37-sr1.prod.twttr.net memory:2048, vCores:1
2013-09-23 17:32:35,591 ASSIGN atla-aub-37-sr1.prod.twttr.net memory:4096, vCores:1
2013-09-23 17:32:36,595 ASSIGN atla-aub-37-sr1.prod.twttr.net memory:4096, vCores:1
2013-09-23 17:32:37,598 ASSIGN atla-aub-37-sr1.prod.twttr.net memory:4096, vCores:1
2013-09-23 17:32:38,602 ASSIGN atla-aub-37-sr1.prod.twttr.net memory:4096, vCores:1
2013-09-23 17:32:39,606 ASSIGN atla-aub-37-sr1.prod.twttr.net memory:4096, vCores:1
2013-09-23 17:32:43,622 ASSIGN atla-aub-37-sr1.prod.twttr.net memory:2048, vCores:1
2013-09-23 17:32:48,640 ASSIGN atla-aub-37-sr1.prod.twttr.net memory:4096, vCores:1
2013-09-23 17:32:49,647 ASSIGN atla-aub-37-sr1.prod.twttr.net memory:-1, vCores:0
2013-09-23 17:33:11,213 ASSIGN atla-aub-37-sr1.prod.twttr.net memory:4096, vCores:1
2013-09-23 17:33:11,245 ASSIGN atla-aub-37-sr1.prod.twttr.net memory:-1, vCores:0
2013-09-23 17:33:13,221 ASSIGN atla-aub-37-sr1.prod.twttr.net memory:4096, vCores:1
The memory:-1 entries might be the problem; the memory should be 4096 and the vCores should be 1 instead of 0. I tried several times: if the log has only one "memory:-1, vCores:0" entry, the reserved memory is 4G after all the jobs are done, and if the log has two "memory:-1, vCores:0" entries, the reserved memory is 8G after all the jobs are done. With Fair Scheduler, reserved MB reported in RM web UI increases indefinitely - Key: YARN-1221 URL: https://issues.apache.org/jira/browse/YARN-1221 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Affects Versions: 2.1.0-beta Reporter: Sandy Ryza -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1157) ResourceManager UI has invalid tracking URL link for distributed shell application
[ https://issues.apache.org/jira/browse/YARN-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1157: Attachment: YARN-1157.3.patch Updated the patch based on the latest trunk. ResourceManager UI has invalid tracking URL link for distributed shell application -- Key: YARN-1157 URL: https://issues.apache.org/jira/browse/YARN-1157 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Tassapol Athiapinya Assignee: Xuan Gong Fix For: 2.1.1-beta Attachments: YARN-1157.1.patch, YARN-1157.2.patch, YARN-1157.2.patch, YARN-1157.3.patch Submit a YARN distributed shell application. Go to the ResourceManager web UI. The application definitely appears. In the Tracking UI column, there will be a history link. Click on that link. Instead of showing the application master web UI, an HTTP 500 error appears. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1227) Update Single Cluster doc to use yarn.resourcemanager.hostname
[ https://issues.apache.org/jira/browse/YARN-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774812#comment-13774812 ] Karthik Kambatla commented on YARN-1227: This is related to the HA-specific configuration changes we will be doing shortly as part of YARN-149. Update Single Cluster doc to use yarn.resourcemanager.hostname -- Key: YARN-1227 URL: https://issues.apache.org/jira/browse/YARN-1227 Project: Hadoop YARN Issue Type: Improvement Components: documentation Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Labels: newbie Now that yarn.resourcemanager.hostname can be used in place of yarn.resourcemanager.address, yarn.resourcemanager.scheduler.address, etc., we should update the doc to use it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1222) Make improvements in ZKRMStateStore for fencing
[ https://issues.apache.org/jira/browse/YARN-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774658#comment-13774658 ] Karthik Kambatla commented on YARN-1222: Another approach we could take is to let the user provide the ACLs - rm1-acl, rm2-acl - in the configuration, very much along the lines of how ACLs are passed today. This would also allow users to hook these up to Kerberos credentials if they want to. Make improvements in ZKRMStateStore for fencing --- Key: YARN-1222 URL: https://issues.apache.org/jira/browse/YARN-1222 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Karthik Kambatla Attachments: yarn-1222-1.patch Using multi-operations for every ZK interaction. In every operation, automatically creating/deleting a lock znode that is a child of the root znode. This is to achieve fencing by modifying the create/delete permissions on the root znode. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
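[Editor's note] A rough sketch of the fencing pattern the issue describes, wrapping each state-store write in a ZooKeeper multi() that creates and deletes a lock znode under the root. This is not the ZKRMStateStore code; the paths and the permissive ACL list are placeholders (the real scheme would restrict create/delete on the root znode to the active RM's credentials).
{code}
// Sketch of the fencing idea: the write only succeeds if this RM can still create and
// delete children of the root znode, so a fenced-out RM fails fast and atomically.
import java.util.Arrays;
import java.util.List;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.Op;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class FencedWriteSketch {
  static void fencedSetData(ZooKeeper zk, String rootPath, String nodePath, byte[] data)
      throws Exception {
    String lockPath = rootPath + "/RM_FENCING_LOCK";   // placeholder name
    List<Op> ops = Arrays.asList(
        Op.create(lockPath, new byte[0], ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT),
        Op.setData(nodePath, data, -1),
        Op.delete(lockPath, -1));
    zk.multi(ops);   // atomic: either all three operations succeed or the write is rejected
  }
}
{code}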
[jira] [Created] (YARN-1228) Don't allow other file than fair-scheduler.xml to be Fair Scheduler allocations file
Sandy Ryza created YARN-1228: Summary: Don't allow other file than fair-scheduler.xml to be Fair Scheduler allocations file Key: YARN-1228 URL: https://issues.apache.org/jira/browse/YARN-1228 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Affects Versions: 2.1.1-beta Reporter: Sandy Ryza Currently the Fair Scheduler is configured in two ways:
* An allocations file that has a different format than the standard Hadoop configuration file, which makes it easier to specify hierarchical objects like queues and their properties.
* Properties like yarn.scheduler.fair.max.assign that are specified in the standard Hadoop configuration format.
The standard and default way of configuring it is to use fair-scheduler.xml as the allocations file and to put the yarn.scheduler properties in yarn-site.xml. It is also possible to specify a different file as the allocations file, and to place the yarn.scheduler properties in fair-scheduler.xml, which will then be interpreted as being in the standard Hadoop configuration format. This flexibility is both confusing and unnecessary. There's no need to keep around the second way. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1188) The context of QueueMetrics becomes 'default' when using FairScheduler
[ https://issues.apache.org/jira/browse/YARN-1188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1188: - Assignee: Tsuyoshi OZAWA The context of QueueMetrics becomes 'default' when using FairScheduler -- Key: YARN-1188 URL: https://issues.apache.org/jira/browse/YARN-1188 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Akira AJISAKA Assignee: Tsuyoshi OZAWA Priority: Minor Labels: metrics, newbie Attachments: YARN-1188.1.patch I found the context of QueueMetrics changed to 'default' from 'yarn' when I was using FairScheduler. The context should always be 'yarn' by adding an annotation to FSQueueMetrics like below: {code}
+ @Metrics(context="yarn")
public class FSQueueMetrics extends QueueMetrics {
{code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1227) Update Single Cluster doc to use yarn.resourcemanager.hostname
[ https://issues.apache.org/jira/browse/YARN-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1227: - Labels: newbie (was: ) Update Single Cluster doc to use yarn.resourcemanager.hostname -- Key: YARN-1227 URL: https://issues.apache.org/jira/browse/YARN-1227 Project: Hadoop YARN Issue Type: Improvement Components: documentation Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Labels: newbie Now that yarn.resourcemanager.hostname can be used in place of yarn.resourcemanager.address, yarn.resourcemanager.scheduler.address, etc., we should update the doc to use it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start
[ https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong reassigned YARN-1229: --- Assignee: Xuan Gong Shell$ExitCodeException could happen if AM fails to start - Key: YARN-1229 URL: https://issues.apache.org/jira/browse/YARN-1229 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.1.1-beta Reporter: Tassapol Athiapinya Assignee: Xuan Gong Priority: Critical Fix For: 2.1.1-beta I run sleep job. If AM fails to start, this exception could occur: 13/09/20 11:00:23 INFO mapreduce.Job: Job job_1379673267098_0020 failed with state FAILED due to: Application application_1379673267098_0020 failed 1 times due to AM Container for appattempt_1379673267098_0020_01 exited with exitCode: 1 due to: Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException: /myappcache/application_1379673267098_0020/container_1379673267098_0020_01_01/launch_container.sh: line 12: export: `NM_AUX_SERVICE_mapreduce.shuffle=AAA0+gA= ': not a valid identifier at org.apache.hadoop.util.Shell.runCommand(Shell.java:464) at org.apache.hadoop.util.Shell.run(Shell.java:379) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:270) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:78) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) .Failing this attempt.. Failing the application. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-819) ResourceManager and NodeManager should check for a minimum allowed version
[ https://issues.apache.org/jira/browse/YARN-819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775697#comment-13775697 ] Robert Parker commented on YARN-819: Jon, Thanks for the review and nice catch on branch-2. * Changed the TestResourceTrackerService#testNodeRegistrationVersionLessThanRM test case to run on branch-2 and greater. * The jira mentions reboot as an option. A reboot would cover the case where new software is deployed but the NM process is not restarted. There is no guarantee that the new version can talk to the older version, so rejection of the connection will satisfy the requirement and is much less complicated. * Corrected the other three code issues. ResourceManager and NodeManager should check for a minimum allowed version -- Key: YARN-819 URL: https://issues.apache.org/jira/browse/YARN-819 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager, resourcemanager Affects Versions: 2.0.4-alpha Reporter: Robert Parker Assignee: Robert Parker Attachments: YARN-819-1.patch, YARN-819-2.patch, YARN-819-3.patch Our use case is that during an upgrade on a large cluster, several NodeManagers may not restart with the new version. Once the RM comes back up the NodeManager will re-register without issue to the RM. The NM should report its version to the RM. The RM should have a configuration to disallow the check (default), equal to the RM (to prevent config change for each release), equal to or greater than RM (to allow NM upgrades), and finally an explicit version or version range. The RM should also have a configuration on how to treat the mismatch: REJECT, or REBOOT the NM. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
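To make the proposal concrete, an illustrative sketch of a registration-time minimum-version check; the property name, the NONE sentinel and the comparison are assumptions for illustration, not the attached patch:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.VersionUtil;

public class NodeVersionCheck {
  // Returns true if the registering NM's version satisfies the configured minimum.
  static boolean isAllowed(Configuration conf, String nmVersion) {
    // Hypothetical property name; "NONE" sentinel means "do not check".
    String minVersion = conf.get("yarn.resourcemanager.nodemanager.minimum.version", "NONE");
    return "NONE".equals(minVersion)
        || VersionUtil.compareVersions(nmVersion, minVersion) >= 0;
  }

  public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.set("yarn.resourcemanager.nodemanager.minimum.version", "2.1.0");
    System.out.println(isAllowed(conf, "2.0.4"));  // false -> registration would be rejected
    System.out.println(isAllowed(conf, "2.1.1"));  // true
  }
}
{code}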
[jira] [Commented] (YARN-451) Add more metrics to RM page
[ https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775700#comment-13775700 ] Jason Lowe commented on YARN-451: - Is knowing how big an application might get in the future important? Knowing how big an application is right now, both in terms of what it's using and what it's asking for, seems more relevant for understanding why a queue is overloaded or jobs aren't getting scheduled as quickly as expected. The ApplicationResourceUsageReport already contains this information, and it should be straightforward to report as part of the ApplicationResourceUsageReport for display via the web UI, CLI, or REST services. Note that YARN-415 is already attempting to do this for historical resource usage so it will be easy to see which jobs have taken large amounts of resources and could have slowed other jobs in the past. Add more metrics to RM page --- Key: YARN-451 URL: https://issues.apache.org/jira/browse/YARN-451 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.0.3-alpha Reporter: Lohit Vijayarenu Assignee: Sangjin Lee Priority: Blocker Attachments: in_progress_2x.png, yarn-451-trunk-20130916.1.patch ResourceManager webUI shows list of RUNNING applications, but it does not tell which applications are requesting more resource compared to others. With cluster running hundreds of applications at once it would be useful to have some kind of metric to show high-resource usage applications vs low-resource usage ones. At the minimum showing number of containers is good option. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
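As a sketch of how that current per-application usage could be surfaced programmatically with the existing client API (illustrative, not part of the attached patch):
{code}
import java.util.List;

import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.api.records.ApplicationResourceUsageReport;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class AppUsageDump {
  public static void main(String[] args) throws Exception {
    YarnClient client = YarnClient.createYarnClient();
    client.init(new YarnConfiguration());
    client.start();
    // Print what each application is using and asking for right now.
    List<ApplicationReport> apps = client.getApplications();
    for (ApplicationReport app : apps) {
      ApplicationResourceUsageReport usage = app.getApplicationResourceUsageReport();
      System.out.println(app.getApplicationId()
          + " used=" + usage.getUsedResources()
          + " needed=" + usage.getNeededResources()
          + " containers=" + usage.getNumUsedContainers());
    }
    client.stop();
  }
}
{code}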
[jira] [Commented] (YARN-1204) Need to add https port related property in Yarn
[ https://issues.apache.org/jira/browse/YARN-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775703#comment-13775703 ] Omkar Vinit Joshi commented on YARN-1204: - Thanks vinod for the review.. bq. Quickly looked at the patch. In ResourceManager.java and WebAppProxy.java, you replaced the usage of YarnConfiguration.getProxyHostAndPort() with WebAppUtils.getResolvedRMWebAppURL(). fixed.. bq. Can you also look at test-failures? That's a lot of them, some of them are obviously tracked elsewhere, let's make sure all test-failures are tracked. I see all these test failures even without patch.. Need to add https port related property in Yarn --- Key: YARN-1204 URL: https://issues.apache.org/jira/browse/YARN-1204 Project: Hadoop YARN Issue Type: Bug Reporter: Yesha Vora Assignee: Omkar Vinit Joshi Attachments: YARN-1204.20131018.1.patch, YARN-1204.20131020.1.patch, YARN-1204.20131020.2.patch, YARN-1204.20131020.3.patch, YARN-1204.20131020.4.patch, YARN-1204.20131023.1.patch There is no yarn property available to configure https port for Resource manager, nodemanager and history server. Currently, Yarn services uses the port defined for http [defined by 'mapreduce.jobhistory.webapp.address','yarn.nodemanager.webapp.address', 'yarn.resourcemanager.webapp.address'] for running services on https protocol. Yarn should have list of property to assign https port for RM, NM and JHS. It can be like below. yarn.nodemanager.webapp.https.address yarn.resourcemanager.webapp.https.address mapreduce.jobhistory.webapp.https.address -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
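An illustrative sketch of how the proposed properties would be wired up; the host names and ports are example values, not taken from this issue:
{code}
import org.apache.hadoop.conf.Configuration;

public class HttpsAddressExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Separate https endpoints instead of reusing the http webapp addresses.
    conf.set("yarn.resourcemanager.webapp.https.address", "rmhost:8090");
    conf.set("yarn.nodemanager.webapp.https.address", "0.0.0.0:8044");
    conf.set("mapreduce.jobhistory.webapp.https.address", "jhshost:19890");
    System.out.println(conf.get("yarn.resourcemanager.webapp.https.address"));
  }
}
{code}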
[jira] [Comment Edited] (YARN-451) Add more metrics to RM page
[ https://issues.apache.org/jira/browse/YARN-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775700#comment-13775700 ] Jason Lowe edited comment on YARN-451 at 9/23/13 9:59 PM: -- Is knowing how big an application might get in the future important? Knowing how big an application is right now, both in terms of what it's using and what it's asking for, seems more relevant for understanding why a queue is overloaded or jobs aren't getting scheduled as quickly as expected. The ApplicationResourceUsageReport already contains this information, and it should be straightforward to display via the web UI, CLI, or REST services. Note that YARN-415 is already attempting to do this for historical resource usage so it will be easy to see which jobs have taken large amounts of resources and could have slowed other jobs in the past. was (Author: jlowe): Is knowing how big an application might get in the future important? Knowing how big an application is right now, both in terms of what it's using and what it's asking for, seems more relevant for understanding why a queue is overloaded or jobs aren't getting scheduled as quickly as expected. The ApplicationResourceUsageReport already contains this information, and it should be straightforward to report as part of the ApplicationResourceUsageReport for display via the web UI, CLI, or REST services. Note that YARN-415 is already attempting to do this for historical resource usage so it will be easy to see which jobs have taken large amounts of resources and could have slowed other jobs in the past. Add more metrics to RM page --- Key: YARN-451 URL: https://issues.apache.org/jira/browse/YARN-451 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.0.3-alpha Reporter: Lohit Vijayarenu Assignee: Sangjin Lee Priority: Blocker Attachments: in_progress_2x.png, yarn-451-trunk-20130916.1.patch ResourceManager webUI shows list of RUNNING applications, but it does not tell which applications are requesting more resource compared to others. With cluster running hundreds of applications at once it would be useful to have some kind of metric to show high-resource usage applications vs low-resource usage ones. At the minimum showing number of containers is good option. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start
[ https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775711#comment-13775711 ] Xuan Gong commented on YARN-1229: - [~vinodkv], [~bikassaha], [~hitesh], [~sseth], [~jlowe], [~cnauroth] The bug shows an error in launch_container.sh while trying to export NM_AUX_SERVICE_mapreduce.shuffle. The problem is that '.' is not considered a valid character in an environment variable name. In order to solve this, we might need to rename the service name. There are three places that need to be renamed (use mapreduce_shuffle instead of mapreduce.shuffle): {code}
public static final String MAPREDUCE_SHUFFLE_SERVICEID = "mapreduce.shuffle";
{code} in ShuffleHandler.java. The other two places are in yarn-site.xml {code}
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce.shuffle</value>
  <description>shuffle service that needs to be set for Map Reduce to run</description>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
{code} We can simply replace all three places with mapreduce_shuffle, or we can split the shuffle service out of the aux_services, say, create a new property called mapreduce_shuffle_service. The ShuffleHandler can read this property instead of defining MAPREDUCE_SHUFFLE_SERVICEID by itself. And AuxService#init() will need to read both mapreduce_shuffle_service and yarn.nodemanager.aux-services to do the initialization. An alternative is to convert all special characters to '_', and AuxServiceHelpers becomes the public API to access this data. Since we're trying to rename variables, this can be considered backward incompatible. I would like to get in touch with folks who are already using it. Shell$ExitCodeException could happen if AM fails to start - Key: YARN-1229 URL: https://issues.apache.org/jira/browse/YARN-1229 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.1.1-beta Reporter: Tassapol Athiapinya Assignee: Xuan Gong Priority: Critical Fix For: 2.1.1-beta I run sleep job.
If AM fails to start, this exception could occur: 13/09/20 11:00:23 INFO mapreduce.Job: Job job_1379673267098_0020 failed with state FAILED due to: Application application_1379673267098_0020 failed 1 times due to AM Container for appattempt_1379673267098_0020_01 exited with exitCode: 1 due to: Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException: /myappcache/application_1379673267098_0020/container_1379673267098_0020_01_01/launch_container.sh: line 12: export: `NM_AUX_SERVICE_mapreduce.shuffle=AAA0+gA= ': not a valid identifier at org.apache.hadoop.util.Shell.runCommand(Shell.java:464) at org.apache.hadoop.util.Shell.run(Shell.java:379) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:270) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:78) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) .Failing this attempt.. Failing the application. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1215) Yarn URL should include userinfo
[ https://issues.apache.org/jira/browse/YARN-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775718#comment-13775718 ] shanyu zhao commented on YARN-1215: --- Looks good to me overall. But I think the best fix is to add a userInfo field to org.apache.hadoop.yarn.api.records.URL. Yarn URL should include userinfo Key: YARN-1215 URL: https://issues.apache.org/jira/browse/YARN-1215 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.0.0 Reporter: Chuan Liu Assignee: Chuan Liu Attachments: YARN-1215-trunk.patch In the {{org.apache.hadoop.yarn.api.records.URL}} class, we don't have a userinfo as part of the URL. When converting a {{java.net.URI}} object into the YARN URL object in the {{ConverterUtils.getYarnUrlFromURI()}} method, we will set the uri host as the url host. If the uri has a userinfo part, the userinfo is discarded. This will lead to information loss if the original uri has the userinfo, e.g. foo://username:passw...@example.com will be converted to foo://example.com and username/password information is lost during the conversion. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
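A small sketch that demonstrates the information loss described above; the URI and credentials are made-up values:
{code}
import java.net.URI;

import org.apache.hadoop.yarn.api.records.URL;
import org.apache.hadoop.yarn.util.ConverterUtils;

public class UserInfoLossDemo {
  public static void main(String[] args) throws Exception {
    URI uri = new URI("foo://username:secret@example.com/some/path");
    URL yarnUrl = ConverterUtils.getYarnUrlFromURI(uri);
    // Only scheme/host/port/file survive the conversion today, so the
    // userinfo ("username:secret") is silently dropped.
    System.out.println(uri.getUserInfo());  // username:secret
    System.out.println(yarnUrl.getHost());  // example.com
  }
}
{code}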
[jira] [Commented] (YARN-1199) Make NM/RM Versions Available
[ https://issues.apache.org/jira/browse/YARN-1199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775724#comment-13775724 ] Robert Parker commented on YARN-1199: - +1 non-binding Make NM/RM Versions Available - Key: YARN-1199 URL: https://issues.apache.org/jira/browse/YARN-1199 Project: Hadoop YARN Issue Type: Improvement Reporter: Mit Desai Assignee: Mit Desai Attachments: YARN-1199.patch, YARN-1199.patch, YARN-1199.patch Now as we have the NM and RM Versions available, we can display the YARN version of nodes running in the cluster. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1157) ResourceManager UI has invalid tracking URL link for distributed shell application
[ https://issues.apache.org/jira/browse/YARN-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775730#comment-13775730 ] Jian He commented on YARN-1157: --- For the following code, we may create a common function for both AMRegisteredTransition and AMUnregisteredTransition {code}
String url = unregisterEvent.getTrackingUrl();
if (url == null || url.trim().isEmpty()) {
  appAttempt.origTrackingUrl = "N/A";
} else {
  appAttempt.origTrackingUrl = url;
}
appAttempt.proxiedTrackingUrl =
    appAttempt.generateProxyUriWithoutScheme(appAttempt.origTrackingUrl);
{code} bq. Let's document RegisterApplicationMasterRequest.getTrackingUrl() and setTrackingUrl() Can you also document this in the specific method comments, for both registerRequest and unregisterRequest? And also say something like: for those default values, it will fall back to the ResourceManager's app page. Typo in RegisterApplicationMasterRequest: are all values The tests can probably be done with TestRMAppAttemptTransitions.runApplicationAttempt. In fact, the earlier tests in TestRMAppAttemptImpl can probably also be merged into TestRMAppAttemptTransitions, so we don't need to change the visibility of AMRegisteredTransition and AMUnregisteredTransition. ResourceManager UI has invalid tracking URL link for distributed shell application -- Key: YARN-1157 URL: https://issues.apache.org/jira/browse/YARN-1157 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Tassapol Athiapinya Assignee: Xuan Gong Fix For: 2.1.1-beta Attachments: YARN-1157.1.patch, YARN-1157.2.patch, YARN-1157.2.patch, YARN-1157.3.patch Submit YARN distributed shell application. Goto ResourceManager Web UI. The application definitely appears. In Tracking UI column, there will be history link. Click on that link. Instead of showing application master web UI, HTTP error 500 would appear. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
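A sketch of the shared helper being suggested, with an illustrative name; both transitions could delegate to it:
{code}
// Illustrative helper, not the committed change:
private static String sanitizeTrackingUrl(String url) {
  return (url == null || url.trim().isEmpty()) ? "N/A" : url;
}

// appAttempt.origTrackingUrl = sanitizeTrackingUrl(event.getTrackingUrl());
// appAttempt.proxiedTrackingUrl =
//     appAttempt.generateProxyUriWithoutScheme(appAttempt.origTrackingUrl);
{code}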
[jira] [Assigned] (YARN-1210) During RM restart, RM should start a new attempt only when previous attempt exits for real
[ https://issues.apache.org/jira/browse/YARN-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He reassigned YARN-1210: - Assignee: Jian He (was: Vinod Kumar Vavilapalli) During RM restart, RM should start a new attempt only when previous attempt exits for real -- Key: YARN-1210 URL: https://issues.apache.org/jira/browse/YARN-1210 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Jian He When RM recovers, it can wait for existing AMs to contact RM back and then kill them forcefully before even starting a new AM. Worst case, RM will start a new AppAttempt after waiting for 10 mins ( the expiry interval). This way we'll minimize multiple AMs racing with each other. This can help issues with downstream components like Pig, Hive and Oozie during RM restart. In the mean while, new apps will proceed as usual as existing apps wait for recovery. This can continue to be useful after work-preserving restart, so that AMs which can properly sync back up with RM can continue to run and those that don't are guaranteed to be killed before starting a new attempt. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-955) [YARN-321] History Service should create the RPC server and wire it to HistoryStorage
[ https://issues.apache.org/jira/browse/YARN-955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayank Bansal updated YARN-955: --- Attachment: YARN-955-1.patch Attaching patch. Thanks, Mayank [YARN-321] History Service should create the RPC server and wire it to HistoryStorage - Key: YARN-955 URL: https://issues.apache.org/jira/browse/YARN-955 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Mayank Bansal Attachments: YARN-955-1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1228) Clean up Fair Scheduler configuration loading
[ https://issues.apache.org/jira/browse/YARN-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1228: - Summary: Clean up Fair Scheduler configuration loading (was: Don't allow other file than fair-scheduler.xml to be Fair Scheduler allocations file) Clean up Fair Scheduler configuration loading - Key: YARN-1228 URL: https://issues.apache.org/jira/browse/YARN-1228 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Affects Versions: 2.1.1-beta Reporter: Sandy Ryza Currently the Fair Scheduler is configured in two ways * An allocations file that has a different format than the standard Hadoop configuration file, which makes it easier to specify hierarchical objects like queues and their properties. * With properties like yarn.scheduler.fair.max.assign that are specified in the standard Hadoop configuration format. The standard and default way of configuring it is to use fair-scheduler.xml as the allocations file and to put the yarn.scheduler properties in yarn-site.xml. It is also possible to specify a different file as the allocations file, and to place the yarn.scheduler properties in fair-scheduler.xml, which will be interpreted as in the standard Hadoop configuration format. This flexibility is both confusing and unnecessary. There's no need to keep around the second way. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1210) During RM restart, RM should start a new attempt only when previous attempt exits for real
[ https://issues.apache.org/jira/browse/YARN-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775762#comment-13775762 ] Jian He commented on YARN-1210: --- Also, NM needs to be changed to at least report back the finished containers on NM resync During RM restart, RM should start a new attempt only when previous attempt exits for real -- Key: YARN-1210 URL: https://issues.apache.org/jira/browse/YARN-1210 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Jian He When RM recovers, it can wait for existing AMs to contact RM back and then kill them forcefully before even starting a new AM. Worst case, RM will start a new AppAttempt after waiting for 10 mins ( the expiry interval). This way we'll minimize multiple AMs racing with each other. This can help issues with downstream components like Pig, Hive and Oozie during RM restart. In the mean while, new apps will proceed as usual as existing apps wait for recovery. This can continue to be useful after work-preserving restart, so that AMs which can properly sync back up with RM can continue to run and those that don't are guaranteed to be killed before starting a new attempt. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1228) Clean up Fair Scheduler configuration loading
[ https://issues.apache.org/jira/browse/YARN-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1228: - Attachment: YARN-1228.patch Clean up Fair Scheduler configuration loading - Key: YARN-1228 URL: https://issues.apache.org/jira/browse/YARN-1228 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Affects Versions: 2.1.1-beta Reporter: Sandy Ryza Attachments: YARN-1228.patch Currently the Fair Scheduler is configured in two ways * An allocations file that has a different format than the standard Hadoop configuration file, which makes it easier to specify hierarchical objects like queues and their properties. * With properties like yarn.scheduler.fair.max.assign that are specified in the standard Hadoop configuration format. The standard and default way of configuring it is to use fair-scheduler.xml as the allocations file and to put the yarn.scheduler properties in yarn-site.xml. It is also possible to specify a different file as the allocations file, and to place the yarn.scheduler properties in fair-scheduler.xml, which will be interpreted as in the standard Hadoop configuration format. This flexibility is both confusing and unnecessary. Additionally, the allocation file is loaded as fair-scheduler.xml from the classpath if it is not specified, but is loaded as a File if it is. This causes two problems 1. We see different behavior when not setting the yarn.scheduler.fair.allocation.file, and setting it to fair-scheduler.xml, which is its default. 2. Classloaders may choose to cache resources, which can break the reload logic when yarn.scheduler.fair.allocation.file is not specified. We should never allow the yarn.scheduler properties to go into fair-scheduler.xml. And we should always load the allocations file as a file, not as a resource on the classpath. To preserve existing behavior and allow loading files from the classpath, we can look for files on the classpath, but strip of their scheme and interpret them as Files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1228) Clean up Fair Scheduler configuration loading
[ https://issues.apache.org/jira/browse/YARN-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1228: - Description: Currently the Fair Scheduler is configured in two ways * An allocations file that has a different format than the standard Hadoop configuration file, which makes it easier to specify hierarchical objects like queues and their properties. * With properties like yarn.scheduler.fair.max.assign that are specified in the standard Hadoop configuration format. The standard and default way of configuring it is to use fair-scheduler.xml as the allocations file and to put the yarn.scheduler properties in yarn-site.xml. It is also possible to specify a different file as the allocations file, and to place the yarn.scheduler properties in fair-scheduler.xml, which will be interpreted as in the standard Hadoop configuration format. This flexibility is both confusing and unnecessary. Additionally, the allocation file is loaded as fair-scheduler.xml from the classpath if it is not specified, but is loaded as a File if it is. This causes two problems 1. We see different behavior when not setting the yarn.scheduler.fair.allocation.file, and setting it to fair-scheduler.xml, which is its default. 2. Classloaders may choose to cache resources, which can break the reload logic when yarn.scheduler.fair.allocation.file is not specified. We should never allow the yarn.scheduler properties to go into fair-scheduler.xml. And we should always load the allocations file as a file, not as a resource on the classpath. To preserve existing behavior and allow loading files from the classpath, we can look for files on the classpath, but strip of their scheme and interpret them as Files. was: Currently the Fair Scheduler is configured in two ways * An allocations file that has a different format than the standard Hadoop configuration file, which makes it easier to specify hierarchical objects like queues and their properties. * With properties like yarn.scheduler.fair.max.assign that are specified in the standard Hadoop configuration format. The standard and default way of configuring it is to use fair-scheduler.xml as the allocations file and to put the yarn.scheduler properties in yarn-site.xml. It is also possible to specify a different file as the allocations file, and to place the yarn.scheduler properties in fair-scheduler.xml, which will be interpreted as in the standard Hadoop configuration format. This flexibility is both confusing and unnecessary. There's no need to keep around the second way. Clean up Fair Scheduler configuration loading - Key: YARN-1228 URL: https://issues.apache.org/jira/browse/YARN-1228 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Affects Versions: 2.1.1-beta Reporter: Sandy Ryza Currently the Fair Scheduler is configured in two ways * An allocations file that has a different format than the standard Hadoop configuration file, which makes it easier to specify hierarchical objects like queues and their properties. * With properties like yarn.scheduler.fair.max.assign that are specified in the standard Hadoop configuration format. The standard and default way of configuring it is to use fair-scheduler.xml as the allocations file and to put the yarn.scheduler properties in yarn-site.xml. It is also possible to specify a different file as the allocations file, and to place the yarn.scheduler properties in fair-scheduler.xml, which will be interpreted as in the standard Hadoop configuration format. 
This flexibility is both confusing and unnecessary. Additionally, the allocation file is loaded as fair-scheduler.xml from the classpath if it is not specified, but is loaded as a File if it is. This causes two problems 1. We see different behavior when not setting the yarn.scheduler.fair.allocation.file, and setting it to fair-scheduler.xml, which is its default. 2. Classloaders may choose to cache resources, which can break the reload logic when yarn.scheduler.fair.allocation.file is not specified. We should never allow the yarn.scheduler properties to go into fair-scheduler.xml. And we should always load the allocations file as a file, not as a resource on the classpath. To preserve existing behavior and allow loading files from the classpath, we can look for files on the classpath, but strip of their scheme and interpret them as Files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1228) Clean up Fair Scheduler configuration loading
[ https://issues.apache.org/jira/browse/YARN-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775769#comment-13775769 ] Sandy Ryza commented on YARN-1228: -- Existing tests verify that absolute paths work and that not specifying any file works. Adding a file to the classpath at runtime is difficult, so I verified that it picks up files from the classpath by manually testing on a pseudo-distributed cluster. Clean up Fair Scheduler configuration loading - Key: YARN-1228 URL: https://issues.apache.org/jira/browse/YARN-1228 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Affects Versions: 2.1.1-beta Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-1228.patch Currently the Fair Scheduler is configured in two ways * An allocations file that has a different format than the standard Hadoop configuration file, which makes it easier to specify hierarchical objects like queues and their properties. * With properties like yarn.scheduler.fair.max.assign that are specified in the standard Hadoop configuration format. The standard and default way of configuring it is to use fair-scheduler.xml as the allocations file and to put the yarn.scheduler properties in yarn-site.xml. It is also possible to specify a different file as the allocations file, and to place the yarn.scheduler properties in fair-scheduler.xml, which will be interpreted as in the standard Hadoop configuration format. This flexibility is both confusing and unnecessary. Additionally, the allocation file is loaded as fair-scheduler.xml from the classpath if it is not specified, but is loaded as a File if it is. This causes two problems: 1. We see different behavior when not setting the yarn.scheduler.fair.allocation.file, and setting it to fair-scheduler.xml, which is its default. 2. Classloaders may choose to cache resources, which can break the reload logic when yarn.scheduler.fair.allocation.file is not specified. We should never allow the yarn.scheduler properties to go into fair-scheduler.xml. And we should always load the allocations file as a file, not as a resource on the classpath. To preserve existing behavior and allow loading files from the classpath, we can look for files on the classpath, but strip off their scheme and interpret them as Files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
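A minimal sketch of the resolution strategy described in the last paragraph of the description (look on the classpath, strip the scheme, fall back to a plain file); the class and method names are illustrative and this is not the attached patch:
{code}
import java.io.File;
import java.net.URL;

public class AllocFileResolver {
  // If the configured allocation file is not an absolute path, look it up on
  // the classpath; either way, hand back a plain File so reloads bypass any
  // classloader resource caching.
  static File resolveAllocationFile(String allocFilePath) {
    File file = new File(allocFilePath);
    if (!file.isAbsolute()) {
      URL url = Thread.currentThread().getContextClassLoader().getResource(allocFilePath);
      if (url != null) {
        return new File(url.getPath());  // strip the scheme, treat as a file
      }
    }
    return file;
  }
}
{code}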
[jira] [Commented] (YARN-955) [YARN-321] History Service should create the RPC server and wire it to HistoryStorage
[ https://issues.apache.org/jira/browse/YARN-955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775797#comment-13775797 ] Hadoop QA commented on YARN-955: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12604687/YARN-955-1.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1986//console This message is automatically generated. [YARN-321] History Service should create the RPC server and wire it to HistoryStorage - Key: YARN-955 URL: https://issues.apache.org/jira/browse/YARN-955 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Mayank Bansal Attachments: YARN-955-1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1228) Clean up Fair Scheduler configuration loading
[ https://issues.apache.org/jira/browse/YARN-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775823#comment-13775823 ] Hadoop QA commented on YARN-1228: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12604697/YARN-1228.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1985//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1985//console This message is automatically generated. Clean up Fair Scheduler configuration loading - Key: YARN-1228 URL: https://issues.apache.org/jira/browse/YARN-1228 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Affects Versions: 2.1.1-beta Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-1228.patch Currently the Fair Scheduler is configured in two ways * An allocations file that has a different format than the standard Hadoop configuration file, which makes it easier to specify hierarchical objects like queues and their properties. * With properties like yarn.scheduler.fair.max.assign that are specified in the standard Hadoop configuration format. The standard and default way of configuring it is to use fair-scheduler.xml as the allocations file and to put the yarn.scheduler properties in yarn-site.xml. It is also possible to specify a different file as the allocations file, and to place the yarn.scheduler properties in fair-scheduler.xml, which will be interpreted as in the standard Hadoop configuration format. This flexibility is both confusing and unnecessary. Additionally, the allocation file is loaded as fair-scheduler.xml from the classpath if it is not specified, but is loaded as a File if it is. This causes two problems 1. We see different behavior when not setting the yarn.scheduler.fair.allocation.file, and setting it to fair-scheduler.xml, which is its default. 2. Classloaders may choose to cache resources, which can break the reload logic when yarn.scheduler.fair.allocation.file is not specified. We should never allow the yarn.scheduler properties to go into fair-scheduler.xml. And we should always load the allocations file as a file, not as a resource on the classpath. To preserve existing behavior and allow loading files from the classpath, we can look for files on the classpath, but strip of their scheme and interpret them as Files. 
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1214) Register ClientToken MasterKey in SecretManager after it is saved
[ https://issues.apache.org/jira/browse/YARN-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-1214: -- Attachment: YARN-1214.5.patch New patch added the comments. Register ClientToken MasterKey in SecretManager after it is saved - Key: YARN-1214 URL: https://issues.apache.org/jira/browse/YARN-1214 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Jian He Assignee: Jian He Attachments: YARN-1214.1.patch, YARN-1214.2.patch, YARN-1214.3.patch, YARN-1214.4.patch, YARN-1214.5.patch, YARN-1214.patch Currently, app attempt ClientToken master key is registered before it is saved. This can cause problem that before the master key is saved, client gets the token and RM also crashes, RM cannot reloads the master key back after it restarts as it is not saved. As a result, client is holding an invalid token. We can register the client token master key after it is saved in the store. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1157) ResourceManager UI has invalid tracking URL link for distributed shell application
[ https://issues.apache.org/jira/browse/YARN-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775850#comment-13775850 ] Hadoop QA commented on YARN-1157: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12604637/YARN-1157.3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1987//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1987//console This message is automatically generated. ResourceManager UI has invalid tracking URL link for distributed shell application -- Key: YARN-1157 URL: https://issues.apache.org/jira/browse/YARN-1157 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Tassapol Athiapinya Assignee: Xuan Gong Fix For: 2.1.1-beta Attachments: YARN-1157.1.patch, YARN-1157.2.patch, YARN-1157.2.patch, YARN-1157.3.patch Submit YARN distributed shell application. Goto ResourceManager Web UI. The application definitely appears. In Tracking UI column, there will be history link. Click on that link. Instead of showing application master web UI, HTTP error 500 would appear. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1068) Add admin support for HA operations
[ https://issues.apache.org/jira/browse/YARN-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775853#comment-13775853 ] Hadoop QA commented on YARN-1068: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12604406/yarn-1068-5.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1988//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1988//console This message is automatically generated. Add admin support for HA operations --- Key: YARN-1068 URL: https://issues.apache.org/jira/browse/YARN-1068 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Karthik Kambatla Assignee: Karthik Kambatla Labels: ha Attachments: yarn-1068-1.patch, yarn-1068-2.patch, yarn-1068-3.patch, yarn-1068-4.patch, yarn-1068-5.patch, yarn-1068-prelim.patch Support HA admin operations to facilitate transitioning the RM to Active and Standby states. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-49) Improve distributed shell application to work on a secure cluster
[ https://issues.apache.org/jira/browse/YARN-49?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-49: Attachment: YARN-49-20130923.3.txt Straight forward patch to add security - Client obtains delegation token from default file-system (only default FS today, have to extend more) and puts it in AM Container tokens. - Because everything else magically happens, AMRMToken, NMToken, ContainerToken etc are already taken care of. - One thing that I'm doing in AM is to filter out AMRMToken from sending them across to containers. No unit tests. Tested this on a single node secure setup. Improve distributed shell application to work on a secure cluster - Key: YARN-49 URL: https://issues.apache.org/jira/browse/YARN-49 Project: Hadoop YARN Issue Type: Sub-task Components: applications/distributed-shell Reporter: Hitesh Shah Assignee: Vinod Kumar Vavilapalli Attachments: YARN-49-20130923.3.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
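For context, a rough sketch of the client-side step described above (fetch a delegation token from the default FileSystem and attach it to the AM container); the "rm" renewer and the method name are illustrative, not the attached patch:
{code}
import java.nio.ByteBuffer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.io.DataOutputBuffer;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;

public class DelegationTokenSetup {
  // Obtain a delegation token from the default FileSystem and hand it to the
  // AM container; AMRMToken/NMToken/ContainerToken handling stays with YARN.
  static void addFsTokens(Configuration conf, ContainerLaunchContext amContainer)
      throws Exception {
    Credentials credentials = new Credentials();
    FileSystem fs = FileSystem.get(conf);
    fs.addDelegationTokens("rm", credentials);  // renewer is an assumption
    DataOutputBuffer dob = new DataOutputBuffer();
    credentials.writeTokenStorageToStream(dob);
    amContainer.setTokens(ByteBuffer.wrap(dob.getData(), 0, dob.getLength()));
  }
}
{code}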
[jira] [Updated] (YARN-1068) Add admin support for HA operations
[ https://issues.apache.org/jira/browse/YARN-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1068: --- Attachment: yarn-1068-6.patch Minor update to the patch - RMHAProtocolService should continue to be an AbstractService, and not a CompositeService. Add admin support for HA operations --- Key: YARN-1068 URL: https://issues.apache.org/jira/browse/YARN-1068 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Karthik Kambatla Assignee: Karthik Kambatla Labels: ha Attachments: yarn-1068-1.patch, yarn-1068-2.patch, yarn-1068-3.patch, yarn-1068-4.patch, yarn-1068-5.patch, yarn-1068-6.patch, yarn-1068-prelim.patch Support HA admin operations to facilitate transitioning the RM to Active and Standby states. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-819) ResourceManager and NodeManager should check for a minimum allowed version
[ https://issues.apache.org/jira/browse/YARN-819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775877#comment-13775877 ] Hadoop QA commented on YARN-819: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12604672/YARN-819-3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1990//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1990//console This message is automatically generated. ResourceManager and NodeManager should check for a minimum allowed version -- Key: YARN-819 URL: https://issues.apache.org/jira/browse/YARN-819 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager, resourcemanager Affects Versions: 2.0.4-alpha Reporter: Robert Parker Assignee: Robert Parker Attachments: YARN-819-1.patch, YARN-819-2.patch, YARN-819-3.patch Our use case is during upgrade on a large cluster several NodeManagers may not restart with the new version. Once the RM comes back up the NodeManager will re-register without issue to the RM. The NM should report the version the RM. The RM should have a configuration to disallow the check (default), equal to the RM (to prevent config change for each release), equal to or greater than RM (to allow NM upgrades), and finally an explicit version or version range. The RM should also have an configuration on how to treat the mismatch: REJECT, or REBOOT the NM. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-1230) Fair scheduler aclSubmitApps does not handle acls with only groups
Sandy Ryza created YARN-1230: Summary: Fair scheduler aclSubmitApps does not handle acls with only groups Key: YARN-1230 URL: https://issues.apache.org/jira/browse/YARN-1230 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.1.1-beta Reporter: Sandy Ryza ACLs are specified like "user1,user2 group1,group2". An ACL with only groups is specified with a leading space, like " group1,group2", but will be interpreted incorrectly by the Fair Scheduler because it trims the leading space. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
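For illustration, here is how a group-only ACL behaves with Hadoop's AccessControlList when the leading space is preserved; the group names are examples and this is not the eventual fix:
{code}
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.authorize.AccessControlList;

public class GroupOnlyAclDemo {
  public static void main(String[] args) throws Exception {
    // A group-only ACL is written as "<users> <groups>" with an empty users part,
    // i.e. with a leading space.
    AccessControlList acl = new AccessControlList(" group1,group2");
    UserGroupInformation ugi = UserGroupInformation.getCurrentUser();
    // If the allocations-file loader trims the leading space, "group1" would be
    // misread as a user name instead of a group.
    System.out.println(acl.isUserAllowed(ugi));
  }
}
{code}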
[jira] [Created] (YARN-1231) Fix test cases that will hit max-am-used-resources-percent limit after YARN-276
Nemon Lou created YARN-1231: --- Summary: Fix test cases that will hit max-am-used-resources-percent limit after YARN-276 Key: YARN-1231 URL: https://issues.apache.org/jira/browse/YARN-1231 Project: Hadoop YARN Issue Type: Task Affects Versions: 2.1.1-beta Reporter: Nemon Lou Assignee: Nemon Lou Use a separate jira to fix YARN's test cases that will fail by hitting the max-am-used-resources-percent limit after YARN-276. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1157) ResourceManager UI has invalid tracking URL link for distributed shell application
[ https://issues.apache.org/jira/browse/YARN-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775906#comment-13775906 ] Xuan Gong commented on YARN-1157: - bq.For the following code, we may create a common function for both AMRegisteredTransition and AMUnregisteredTransition Done bq.Can you also document in the specific method comments ? for both registerRequest and unregisterRequest. And also say something like for those default values, will fallback to ResourceManager's app page Added bq.Typo in RegisterApplicationMasterRequest: are all values Fixed bq.The tests can probably be done with TestRMAppAttemptTransitions.runApplicationAttempt, In fact,the earlier tests in TestRMAppAttemptImpl can probably also be merged into TestRMAppAttemptTransitions. and so we don't need to change the visibility of AMregisteredTransition and AMUnregisteredTransition. Removed TestRMAppAttemptImpl. We will cover all its tests in TestRMAppAttemptTransitions. Change the visibility of AMregisteredTransition and AMUnregisteredTransition back to private. ResourceManager UI has invalid tracking URL link for distributed shell application -- Key: YARN-1157 URL: https://issues.apache.org/jira/browse/YARN-1157 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Tassapol Athiapinya Assignee: Xuan Gong Fix For: 2.1.1-beta Attachments: YARN-1157.1.patch, YARN-1157.2.patch, YARN-1157.2.patch, YARN-1157.3.patch Submit YARN distributed shell application. Goto ResourceManager Web UI. The application definitely appears. In Tracking UI column, there will be history link. Click on that link. Instead of showing application master web UI, HTTP error 500 would appear. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1157) ResourceManager UI has invalid tracking URL link for distributed shell application
[ https://issues.apache.org/jira/browse/YARN-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1157: Attachment: YARN-1157.4.patch ResourceManager UI has invalid tracking URL link for distributed shell application -- Key: YARN-1157 URL: https://issues.apache.org/jira/browse/YARN-1157 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Tassapol Athiapinya Assignee: Xuan Gong Fix For: 2.1.1-beta Attachments: YARN-1157.1.patch, YARN-1157.2.patch, YARN-1157.2.patch, YARN-1157.3.patch, YARN-1157.4.patch Submit YARN distributed shell application. Goto ResourceManager Web UI. The application definitely appears. In Tracking UI column, there will be history link. Click on that link. Instead of showing application master web UI, HTTP error 500 would appear. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-49) Improve distributed shell application to work on a secure cluster
[ https://issues.apache.org/jira/browse/YARN-49?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775910#comment-13775910 ] Omkar Vinit Joshi commented on YARN-49: --- Thanks, Vinod. bq. Because everything else magically happens, AMRMToken, NMToken, ContainerToken etc are already taken care of. This is good and will serve as an example for other YARN app writers to use the client libraries. bq.One thing that I'm doing in AM is to filter out AMRMToken from sending them across to containers. +1 bq. No unit tests. Tested this on a single node secure setup. Tested this on my local secure setup. Also tested AMRMToken removal. +1 lgtm Improve distributed shell application to work on a secure cluster - Key: YARN-49 URL: https://issues.apache.org/jira/browse/YARN-49 Project: Hadoop YARN Issue Type: Sub-task Components: applications/distributed-shell Reporter: Hitesh Shah Assignee: Vinod Kumar Vavilapalli Attachments: YARN-49-20130923.3.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
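The AMRMToken filtering mentioned above is the piece most other secure YARN applications will want to copy. A minimal sketch of the idea, not the exact code from the YARN-49 patch: before serializing its credentials into the container launch context, the AM drops any token of the AMRMToken kind so containers never receive the AM-to-RM token.
{code}
import java.util.Iterator;
import org.apache.hadoop.io.DataOutputBuffer;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.security.token.TokenIdentifier;
import org.apache.hadoop.yarn.security.AMRMTokenIdentifier;

public class ContainerTokens {
  // Serialize the AM's current credentials for container launch,
  // after removing the AMRMToken.
  public static DataOutputBuffer tokensForContainers() throws Exception {
    Credentials credentials =
        UserGroupInformation.getCurrentUser().getCredentials();
    Iterator<Token<? extends TokenIdentifier>> iter =
        credentials.getAllTokens().iterator();
    while (iter.hasNext()) {
      Token<? extends TokenIdentifier> token = iter.next();
      // The AMRMToken is only meaningful between the AM and the RM;
      // containers should never see it.
      if (token.getKind().equals(AMRMTokenIdentifier.KIND_NAME)) {
        iter.remove();
      }
    }
    DataOutputBuffer dob = new DataOutputBuffer();
    credentials.writeTokenStorageToStream(dob);
    return dob;
  }
}
{code}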
[jira] [Commented] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start
[ https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775918#comment-13775918 ] Bikas Saha commented on YARN-1229: -- We should probably just rename it to MapreduceShuffle. In many cases, the '.' works and so we didn't catch it in the tests. We should also put some convention on the service names to make them safe, e.g. a service name can only contain a-zA-Z0-9. In yarn-site/code etc. we can put comments about it and enforce it in the code that reads aux-services from the config. Shell$ExitCodeException could happen if AM fails to start - Key: YARN-1229 URL: https://issues.apache.org/jira/browse/YARN-1229 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.1.1-beta Reporter: Tassapol Athiapinya Assignee: Xuan Gong Priority: Critical Fix For: 2.1.1-beta I run sleep job. If AM fails to start, this exception could occur: 13/09/20 11:00:23 INFO mapreduce.Job: Job job_1379673267098_0020 failed with state FAILED due to: Application application_1379673267098_0020 failed 1 times due to AM Container for appattempt_1379673267098_0020_01 exited with exitCode: 1 due to: Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException: /myappcache/application_1379673267098_0020/container_1379673267098_0020_01_01/launch_container.sh: line 12: export: `NM_AUX_SERVICE_mapreduce.shuffle=AAA0+gA= ': not a valid identifier at org.apache.hadoop.util.Shell.runCommand(Shell.java:464) at org.apache.hadoop.util.Shell.run(Shell.java:379) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:270) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:78) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) .Failing this attempt.. Failing the application. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
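A rough sketch of the convention Bikas suggests, assuming a hypothetical check added wherever the NodeManager reads yarn.nodemanager.aux-services from the configuration (the class, method, and exact character set are illustrative, not from a committed patch):
{code}
import java.util.regex.Pattern;

public class AuxServiceNameValidator {
  // Aux-service names end up embedded in shell environment variable names
  // (NM_AUX_SERVICE_<name>), so only shell-identifier-safe characters
  // should be allowed. The exact character set here is illustrative.
  private static final Pattern VALID_NAME = Pattern.compile("[A-Za-z0-9_]+");

  public static void checkServiceName(String name) {
    if (!VALID_NAME.matcher(name).matches()) {
      // A name like "mapreduce.shuffle" fails fast here instead of producing
      // the obscure "not a valid identifier" error in launch_container.sh.
      throw new IllegalArgumentException("Invalid aux-service name '" + name
          + "': only [A-Za-z0-9_] characters are allowed");
    }
  }
}
{code}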
[jira] [Updated] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start
[ https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated YARN-1229: - Priority: Blocker (was: Critical) Shell$ExitCodeException could happen if AM fails to start - Key: YARN-1229 URL: https://issues.apache.org/jira/browse/YARN-1229 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.1.1-beta Reporter: Tassapol Athiapinya Assignee: Xuan Gong Priority: Blocker Fix For: 2.1.1-beta I run sleep job. If AM fails to start, this exception could occur: 13/09/20 11:00:23 INFO mapreduce.Job: Job job_1379673267098_0020 failed with state FAILED due to: Application application_1379673267098_0020 failed 1 times due to AM Container for appattempt_1379673267098_0020_01 exited with exitCode: 1 due to: Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException: /myappcache/application_1379673267098_0020/container_1379673267098_0020_01_01/launch_container.sh: line 12: export: `NM_AUX_SERVICE_mapreduce.shuffle=AAA0+gA= ': not a valid identifier at org.apache.hadoop.util.Shell.runCommand(Shell.java:464) at org.apache.hadoop.util.Shell.run(Shell.java:379) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:270) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:78) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) .Failing this attempt.. Failing the application. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1229) Shell$ExitCodeException could happen if AM fails to start
[ https://issues.apache.org/jira/browse/YARN-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated YARN-1229: - Hadoop Flags: Incompatible change Shell$ExitCodeException could happen if AM fails to start - Key: YARN-1229 URL: https://issues.apache.org/jira/browse/YARN-1229 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.1.1-beta Reporter: Tassapol Athiapinya Assignee: Xuan Gong Priority: Blocker Fix For: 2.1.1-beta I run sleep job. If AM fails to start, this exception could occur: 13/09/20 11:00:23 INFO mapreduce.Job: Job job_1379673267098_0020 failed with state FAILED due to: Application application_1379673267098_0020 failed 1 times due to AM Container for appattempt_1379673267098_0020_01 exited with exitCode: 1 due to: Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException: /myappcache/application_1379673267098_0020/container_1379673267098_0020_01_01/launch_container.sh: line 12: export: `NM_AUX_SERVICE_mapreduce.shuffle=AAA0+gA= ': not a valid identifier at org.apache.hadoop.util.Shell.runCommand(Shell.java:464) at org.apache.hadoop.util.Shell.run(Shell.java:379) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:270) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:78) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) .Failing this attempt.. Failing the application. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1089) Add YARN compute units alongside virtual cores
[ https://issues.apache.org/jira/browse/YARN-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775920#comment-13775920 ] Bikas Saha commented on YARN-1089: -- I am afraid this is getting confusing. Add YARN compute units alongside virtual cores -- Key: YARN-1089 URL: https://issues.apache.org/jira/browse/YARN-1089 Project: Hadoop YARN Issue Type: Improvement Components: api Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-1089-1.patch, YARN-1089.patch Based on discussion in YARN-1024, we will add YARN compute units as a resource for requesting and scheduling CPU processing power. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1204) Need to add https port related property in Yarn
[ https://issues.apache.org/jira/browse/YARN-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775933#comment-13775933 ] Hadoop QA commented on YARN-1204: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12604675/YARN-1204.20131023.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy: org.apache.hadoop.mapreduce.TestMRJobClient The following test timeouts occurred in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy: org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator org.apache.hadoop.mapreduce.v2.TestUberAM {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1989//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1989//console This message is automatically generated. Need to add https port related property in Yarn --- Key: YARN-1204 URL: https://issues.apache.org/jira/browse/YARN-1204 Project: Hadoop YARN Issue Type: Bug Reporter: Yesha Vora Assignee: Omkar Vinit Joshi Attachments: YARN-1204.20131018.1.patch, YARN-1204.20131020.1.patch, YARN-1204.20131020.2.patch, YARN-1204.20131020.3.patch, YARN-1204.20131020.4.patch, YARN-1204.20131023.1.patch There is no yarn property available to configure https port for Resource manager, nodemanager and history server. 
Currently, YARN services use the port defined for HTTP [defined by 'mapreduce.jobhistory.webapp.address', 'yarn.nodemanager.webapp.address', 'yarn.resourcemanager.webapp.address'] when running services over the HTTPS protocol. YARN should have a list of properties to assign HTTPS ports for the RM, NM and JHS. They could be like the below: yarn.nodemanager.webapp.https.address yarn.resourcemanager.webapp.https.address mapreduce.jobhistory.webapp.https.address -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
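If the proposed properties are adopted, setting them could look roughly like the sketch below (host names and port numbers are placeholders; the property names are the ones proposed in this issue, not a finalized API):
{code}
import org.apache.hadoop.conf.Configuration;

public class HttpsAddressConfig {
  // Placeholder hosts/ports; property names are those proposed in YARN-1204.
  public static Configuration withHttpsAddresses(Configuration conf) {
    conf.set("yarn.resourcemanager.webapp.https.address", "rm.example.com:8090");
    conf.set("yarn.nodemanager.webapp.https.address", "0.0.0.0:8044");
    conf.set("mapreduce.jobhistory.webapp.https.address", "jhs.example.com:19890");
    return conf;
  }
}
{code}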
[jira] [Commented] (YARN-1068) Add admin support for HA operations
[ https://issues.apache.org/jira/browse/YARN-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775941#comment-13775941 ] Hadoop QA commented on YARN-1068: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12604708/yarn-1068-6.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1991//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1991//console This message is automatically generated. Add admin support for HA operations --- Key: YARN-1068 URL: https://issues.apache.org/jira/browse/YARN-1068 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Karthik Kambatla Assignee: Karthik Kambatla Labels: ha Attachments: yarn-1068-1.patch, yarn-1068-2.patch, yarn-1068-3.patch, yarn-1068-4.patch, yarn-1068-5.patch, yarn-1068-6.patch, yarn-1068-prelim.patch Support HA admin operations to facilitate transitioning the RM to Active and Standby states. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-49) Improve distributed shell application to work on a secure cluster
[ https://issues.apache.org/jira/browse/YARN-49?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775952#comment-13775952 ] Hadoop QA commented on YARN-49: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12604705/YARN-49-20130923.3.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client: org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1992//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1992//console This message is automatically generated. Improve distributed shell application to work on a secure cluster - Key: YARN-49 URL: https://issues.apache.org/jira/browse/YARN-49 Project: Hadoop YARN Issue Type: Sub-task Components: applications/distributed-shell Reporter: Hitesh Shah Assignee: Vinod Kumar Vavilapalli Attachments: YARN-49-20130923.3.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1214) Register ClientToken MasterKey in SecretManager after it is saved
[ https://issues.apache.org/jira/browse/YARN-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775960#comment-13775960 ] Hadoop QA commented on YARN-1214: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12604704/YARN-1214.5.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1993//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1993//console This message is automatically generated. Register ClientToken MasterKey in SecretManager after it is saved - Key: YARN-1214 URL: https://issues.apache.org/jira/browse/YARN-1214 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Jian He Assignee: Jian He Attachments: YARN-1214.1.patch, YARN-1214.2.patch, YARN-1214.3.patch, YARN-1214.4.patch, YARN-1214.5.patch, YARN-1214.patch Currently, the app attempt ClientToken master key is registered before it is saved. This can cause a problem: if the client gets the token before the master key is saved and the RM then crashes, the RM cannot reload the master key after it restarts because it was never saved. As a result, the client is holding an invalid token. We can register the client token master key after it is saved in the store. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1157) ResourceManager UI has invalid tracking URL link for distributed shell application
[ https://issues.apache.org/jira/browse/YARN-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775965#comment-13775965 ] Hadoop QA commented on YARN-1157: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12604723/YARN-1157.4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1994//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1994//console This message is automatically generated. ResourceManager UI has invalid tracking URL link for distributed shell application -- Key: YARN-1157 URL: https://issues.apache.org/jira/browse/YARN-1157 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Tassapol Athiapinya Assignee: Xuan Gong Fix For: 2.1.1-beta Attachments: YARN-1157.1.patch, YARN-1157.2.patch, YARN-1157.2.patch, YARN-1157.3.patch, YARN-1157.4.patch Submit YARN distributed shell application. Goto ResourceManager Web UI. The application definitely appears. In Tracking UI column, there will be history link. Click on that link. Instead of showing application master web UI, HTTP error 500 would appear. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira