[jira] [Commented] (MAPREDUCE-6286) A typo in HistoryViewer makes some code useless, which causes counter limits are not reset correctly.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390441#comment-14390441 ]

Hudson commented on MAPREDUCE-6286:
-----------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk #884 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/884/])
Reverted MAPREDUCE-6286, MAPREDUCE-6199, and MAPREDUCE-5875 from branch-2.7. Editing CHANGES.txt to reflect this. (vinodkv: rev e428fea73029ea0c3494c71a50c5f6c994888fd2)
* hadoop-mapreduce-project/CHANGES.txt

A typo in HistoryViewer makes some code useless, which causes counter limits are not reset correctly.
-----------------------------------------------------------------------------------------------------

                Key: MAPREDUCE-6286
                URL: https://issues.apache.org/jira/browse/MAPREDUCE-6286
            Project: Hadoop Map/Reduce
         Issue Type: Bug
         Components: client
   Affects Versions: 2.6.0
           Reporter: zhihai xu
           Assignee: zhihai xu
            Fix For: 2.8.0
        Attachments: MAPREDUCE-6286.000.patch

A typo in HistoryViewer makes some code useless and causes counter limits not to be reset correctly. The typo is {{Limits.reset(conf);}} we should use jobConf instead of conf. With the typo, the following code becomes useless:
{code}
final Path jobConfPath = new Path(jobFile.getParent(),
    jobDetails[0] + "_" + jobDetails[1] + "_" + jobDetails[2] + "_conf.xml");
final Configuration jobConf = new Configuration(conf);
jobConf.addResource(fs.open(jobConfPath), jobConfPath.toString());
{code}
This code loads the configuration from the job configuration file so that the Limits can be reset based on the newly loaded configuration. But with the typo, the Limits are reset from the old configuration instead, so this is clearly a typo.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
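The effect of the typo can be illustrated with a minimal, self-contained sketch. `Config`, the stand-in `Limits`, and the default of 120 are simplified stand-ins for the real Hadoop classes, not the actual HistoryViewer code; the property name `mapreduce.job.counters.max` matches the Hadoop setting the issue is about.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal stand-in for org.apache.hadoop.conf.Configuration: a key/value map
// with a copy constructor, mirroring `new Configuration(conf)`.
class Config {
    final Map<String, String> props = new HashMap<>();

    Config() { }

    Config(Config other) { props.putAll(other.props); }

    void set(String key, String value) { props.put(key, value); }

    int getInt(String key, int defaultValue) {
        String v = props.get(key);
        return v == null ? defaultValue : Integer.parseInt(v);
    }
}

// Stand-in for org.apache.hadoop.mapreduce.counters.Limits: static state
// that is re-initialized from whichever configuration reset() receives.
class Limits {
    static int countersMax = 120;  // illustrative default

    static void reset(Config conf) {
        countersMax = conf.getInt("mapreduce.job.counters.max", 120);
    }
}

public class HistoryViewerTypoDemo {
    public static void main(String[] args) {
        Config conf = new Config();         // cluster-wide configuration
        Config jobConf = new Config(conf);  // per-job config, as loaded from the *_conf.xml file
        jobConf.set("mapreduce.job.counters.max", "500");

        Limits.reset(conf);                 // the typo: the job-level override is ignored
        System.out.println("with typo: countersMax = " + Limits.countersMax);  // 120

        Limits.reset(jobConf);              // the fix: the per-job limit takes effect
        System.out.println("with fix:  countersMax = " + Limits.countersMax);  // 500
    }
}
```

Passing `conf` leaves the limit at the stale cluster-wide value; passing `jobConf` picks up the per-job override, which is exactly the one-word fix in the patch.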
[jira] [Commented] (MAPREDUCE-6199) AbstractCounters are not reset completely on deserialization
[ https://issues.apache.org/jira/browse/MAPREDUCE-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390438#comment-14390438 ]

Hudson commented on MAPREDUCE-6199:
-----------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk #884 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/884/])
Reverted MAPREDUCE-6286, MAPREDUCE-6199, and MAPREDUCE-5875 from branch-2.7. Editing CHANGES.txt to reflect this. (vinodkv: rev e428fea73029ea0c3494c71a50c5f6c994888fd2)
* hadoop-mapreduce-project/CHANGES.txt

AbstractCounters are not reset completely on deserialization
------------------------------------------------------------

                Key: MAPREDUCE-6199
                URL: https://issues.apache.org/jira/browse/MAPREDUCE-6199
            Project: Hadoop Map/Reduce
         Issue Type: Bug
           Reporter: Anubhav Dhoot
           Assignee: Anubhav Dhoot
            Fix For: 2.8.0
        Attachments: mr-6199.001.patch, mr-6199.001.patch, mr-6199.002.patch

AbstractCounters are only partially reset on deserialization. This patch resets them completely.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
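The general pattern behind the fix (clear all existing state before repopulating from the stream) can be sketched in plain Java. `SimpleCounters` below is an illustrative stand-in, not the actual AbstractCounters code:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Illustrative counters holder; the real fix lives in AbstractCounters.readFields().
class SimpleCounters {
    final Map<String, Long> counters = new HashMap<>();

    void increment(String name, long by) {
        counters.merge(name, by, Long::sum);
    }

    void write(DataOutput out) throws IOException {
        out.writeInt(counters.size());
        for (Map.Entry<String, Long> e : counters.entrySet()) {
            out.writeUTF(e.getKey());
            out.writeLong(e.getValue());
        }
    }

    void readFields(DataInput in) throws IOException {
        counters.clear();  // the complete reset: drop ALL old state before reading
        int n = in.readInt();
        for (int i = 0; i < n; i++) {
            counters.put(in.readUTF(), in.readLong());
        }
    }
}

public class ResetOnDeserializeDemo {
    public static void main(String[] args) throws IOException {
        SimpleCounters src = new SimpleCounters();
        src.increment("RECORDS", 7);

        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        src.write(new DataOutputStream(buf));

        SimpleCounters dst = new SimpleCounters();
        dst.increment("STALE", 99);  // pre-existing state that must not survive
        dst.readFields(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));

        System.out.println(dst.counters);  // only RECORDS=7; STALE was reset away
    }
}
```

Without the `clear()` call, counters that exist in the destination object but not in the stream would silently survive deserialization, which is the partial-reset bug this issue describes.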
[jira] [Commented] (MAPREDUCE-6286) A typo in HistoryViewer makes some code useless, which causes counter limits are not reset correctly.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390429#comment-14390429 ]

Hudson commented on MAPREDUCE-6286:
-----------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #150 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/150/])
Reverted MAPREDUCE-6286, MAPREDUCE-6199, and MAPREDUCE-5875 from branch-2.7. Editing CHANGES.txt to reflect this. (vinodkv: rev e428fea73029ea0c3494c71a50c5f6c994888fd2)
* hadoop-mapreduce-project/CHANGES.txt

(Repeated issue summary and description omitted; see the first MAPREDUCE-6286 notification above.)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-5875) Make Counter limits consistent across JobClient, MRAppMaster, and YarnChild
[ https://issues.apache.org/jira/browse/MAPREDUCE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390425#comment-14390425 ]

Hudson commented on MAPREDUCE-5875:
-----------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #150 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/150/])
Reverted MAPREDUCE-6286, MAPREDUCE-6199, and MAPREDUCE-5875 from branch-2.7. Editing CHANGES.txt to reflect this. (vinodkv: rev e428fea73029ea0c3494c71a50c5f6c994888fd2)
* hadoop-mapreduce-project/CHANGES.txt

Make Counter limits consistent across JobClient, MRAppMaster, and YarnChild
---------------------------------------------------------------------------

                Key: MAPREDUCE-5875
                URL: https://issues.apache.org/jira/browse/MAPREDUCE-5875
            Project: Hadoop Map/Reduce
         Issue Type: Bug
         Components: applicationmaster, client, task
   Affects Versions: 2.4.0
           Reporter: Gera Shegalov
           Assignee: Gera Shegalov
            Fix For: 2.8.0
        Attachments: MAPREDUCE-5875.v01.patch, MAPREDUCE-5875.v02.patch, MAPREDUCE-5875.v03.patch, MAPREDUCE-5875.v04.patch, MAPREDUCE-5875.v05.patch, MAPREDUCE-5875.v06.patch, MAPREDUCE-5875.v07.patch, MAPREDUCE-5875.v08.patch, MAPREDUCE-5875.v09.patch

Currently, the counter limits mapreduce.job.counters.* handled by {{org.apache.hadoop.mapreduce.counters.Limits}} are initialized asymmetrically: on the client side and in the AM, job.xml is ignored, whereas it is taken into account in YarnChild. It would be good to make the Limits job-configurable, such that the max counters/groups is only increased when needed. With the current Limits implementation relying on static constants, this is going to be challenging for tools that submit jobs concurrently without resorting to class-loading isolation. The patch that I am uploading is not perfect but demonstrates the issue.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
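The "only increased when needed" idea can be sketched as a grow-only cap. This is an illustrative interpretation of the description above, not the committed patch; all names are hypothetical, and the real logic lives in {{org.apache.hadoop.mapreduce.counters.Limits}}.

```java
// Illustrative sketch of job-configurable counter limits that only ever grow:
// a job submission may raise the JVM-wide cap but never lower it, so tools
// submitting jobs concurrently from one JVM cannot shrink limits out from
// under each other. Names here are hypothetical stand-ins.
public class GrowOnlyLimits {
    private static int countersMax = 120;  // illustrative default cap

    // Apply a job's requested cap; only increase the shared limit when needed.
    public static synchronized void applyJobLimit(int requested) {
        if (requested > countersMax) {
            countersMax = requested;
        }
    }

    public static synchronized int getCountersMax() {
        return countersMax;
    }

    public static void main(String[] args) {
        applyJobLimit(100);  // below the current cap: no effect
        applyJobLimit(500);  // above the current cap: the limit grows
        System.out.println("countersMax = " + getCountersMax());  // 500
    }
}
```

A grow-only cap sidesteps the static-constant problem the reporter describes: concurrent submitters can each request what they need without one job's lower limit breaking another job in the same JVM.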
[jira] [Commented] (MAPREDUCE-5875) Make Counter limits consistent across JobClient, MRAppMaster, and YarnChild
[ https://issues.apache.org/jira/browse/MAPREDUCE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390437#comment-14390437 ]

Hudson commented on MAPREDUCE-5875:
-----------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk #884 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/884/])
Reverted MAPREDUCE-6286, MAPREDUCE-6199, and MAPREDUCE-5875 from branch-2.7. Editing CHANGES.txt to reflect this. (vinodkv: rev e428fea73029ea0c3494c71a50c5f6c994888fd2)
* hadoop-mapreduce-project/CHANGES.txt

(Repeated issue summary and description omitted; see the first MAPREDUCE-5875 notification above.)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6199) AbstractCounters are not reset completely on deserialization
[ https://issues.apache.org/jira/browse/MAPREDUCE-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390426#comment-14390426 ]

Hudson commented on MAPREDUCE-6199:
-----------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #150 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/150/])
Reverted MAPREDUCE-6286, MAPREDUCE-6199, and MAPREDUCE-5875 from branch-2.7. Editing CHANGES.txt to reflect this. (vinodkv: rev e428fea73029ea0c3494c71a50c5f6c994888fd2)
* hadoop-mapreduce-project/CHANGES.txt

(Repeated issue summary and description omitted; see the first MAPREDUCE-6199 notification above.)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6286) A typo in HistoryViewer makes some code useless, which causes counter limits are not reset correctly.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390621#comment-14390621 ]

Hudson commented on MAPREDUCE-6286:
-----------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk #2082 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2082/])
Reverted MAPREDUCE-6286, MAPREDUCE-6199, and MAPREDUCE-5875 from branch-2.7. Editing CHANGES.txt to reflect this. (vinodkv: rev e428fea73029ea0c3494c71a50c5f6c994888fd2)
* hadoop-mapreduce-project/CHANGES.txt

(Repeated issue summary and description omitted; see the first MAPREDUCE-6286 notification above.)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-5875) Make Counter limits consistent across JobClient, MRAppMaster, and YarnChild
[ https://issues.apache.org/jira/browse/MAPREDUCE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390617#comment-14390617 ]

Hudson commented on MAPREDUCE-5875:
-----------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk #2082 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2082/])
Reverted MAPREDUCE-6286, MAPREDUCE-6199, and MAPREDUCE-5875 from branch-2.7. Editing CHANGES.txt to reflect this. (vinodkv: rev e428fea73029ea0c3494c71a50c5f6c994888fd2)
* hadoop-mapreduce-project/CHANGES.txt

(Repeated issue summary and description omitted; see the first MAPREDUCE-5875 notification above.)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-5875) Make Counter limits consistent across JobClient, MRAppMaster, and YarnChild
[ https://issues.apache.org/jira/browse/MAPREDUCE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390633#comment-14390633 ]

Hudson commented on MAPREDUCE-5875:
-----------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #150 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/150/])
Reverted MAPREDUCE-6286, MAPREDUCE-6199, and MAPREDUCE-5875 from branch-2.7. Editing CHANGES.txt to reflect this. (vinodkv: rev e428fea73029ea0c3494c71a50c5f6c994888fd2)
* hadoop-mapreduce-project/CHANGES.txt

(Repeated issue summary and description omitted; see the first MAPREDUCE-5875 notification above.)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6286) A typo in HistoryViewer makes some code useless, which causes counter limits are not reset correctly.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390637#comment-14390637 ]

Hudson commented on MAPREDUCE-6286:
-----------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #150 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/150/])
Reverted MAPREDUCE-6286, MAPREDUCE-6199, and MAPREDUCE-5875 from branch-2.7. Editing CHANGES.txt to reflect this. (vinodkv: rev e428fea73029ea0c3494c71a50c5f6c994888fd2)
* hadoop-mapreduce-project/CHANGES.txt

(Repeated issue summary and description omitted; see the first MAPREDUCE-6286 notification above.)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6199) AbstractCounters are not reset completely on deserialization
[ https://issues.apache.org/jira/browse/MAPREDUCE-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390634#comment-14390634 ]

Hudson commented on MAPREDUCE-6199:
-----------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #150 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/150/])
Reverted MAPREDUCE-6286, MAPREDUCE-6199, and MAPREDUCE-5875 from branch-2.7. Editing CHANGES.txt to reflect this. (vinodkv: rev e428fea73029ea0c3494c71a50c5f6c994888fd2)
* hadoop-mapreduce-project/CHANGES.txt

(Repeated issue summary and description omitted; see the first MAPREDUCE-6199 notification above.)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6199) AbstractCounters are not reset completely on deserialization
[ https://issues.apache.org/jira/browse/MAPREDUCE-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390656#comment-14390656 ]

Hudson commented on MAPREDUCE-6199:
-----------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #141 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/141/])
Reverted MAPREDUCE-6286, MAPREDUCE-6199, and MAPREDUCE-5875 from branch-2.7. Editing CHANGES.txt to reflect this. (vinodkv: rev e428fea73029ea0c3494c71a50c5f6c994888fd2)
* hadoop-mapreduce-project/CHANGES.txt

(Repeated issue summary and description omitted; see the first MAPREDUCE-6199 notification above.)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-5875) Make Counter limits consistent across JobClient, MRAppMaster, and YarnChild
[ https://issues.apache.org/jira/browse/MAPREDUCE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390655#comment-14390655 ]

Hudson commented on MAPREDUCE-5875:
-----------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #141 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/141/])
Reverted MAPREDUCE-6286, MAPREDUCE-6199, and MAPREDUCE-5875 from branch-2.7. Editing CHANGES.txt to reflect this. (vinodkv: rev e428fea73029ea0c3494c71a50c5f6c994888fd2)
* hadoop-mapreduce-project/CHANGES.txt

(Repeated issue summary and description omitted; see the first MAPREDUCE-5875 notification above.)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6286) A typo in HistoryViewer makes some code useless, which causes counter limits are not reset correctly.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390659#comment-14390659 ]

Hudson commented on MAPREDUCE-6286:
-----------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #141 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/141/])
Reverted MAPREDUCE-6286, MAPREDUCE-6199, and MAPREDUCE-5875 from branch-2.7. Editing CHANGES.txt to reflect this. (vinodkv: rev e428fea73029ea0c3494c71a50c5f6c994888fd2)
* hadoop-mapreduce-project/CHANGES.txt

(Repeated issue summary and description omitted; see the first MAPREDUCE-6286 notification above.)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6304) Specifying node labels when submitting MR jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-6304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wangda Tan updated MAPREDUCE-6304:
----------------------------------
    Description: Per the discussion on YARN-796, we need a mechanism in MAPREDUCE to specify node labels when submitting MR jobs.  (was: Per the discussion on Yarn-796, we need a mechanism in MAPREDUCE to specify node labels when submitting MR jobs.)

Specifying node labels when submitting MR jobs
----------------------------------------------

                Key: MAPREDUCE-6304
                URL: https://issues.apache.org/jira/browse/MAPREDUCE-6304
            Project: Hadoop Map/Reduce
         Issue Type: New Feature
           Reporter: Jian Fang

Per the discussion on YARN-796, we need a mechanism in MAPREDUCE to specify node labels when submitting MR jobs.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6304) Specifying node labels when submitting MR jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-6304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14391724#comment-14391724 ]

Jian Fang commented on MAPREDUCE-6304:
--------------------------------------

Link related JIRAs.

(Repeated issue summary and description omitted; see the MAPREDUCE-6304 update notification above.)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6076) Zero map split input length combine with none zero map split input length will cause MR1 job hung.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Kanter updated MAPREDUCE-6076:
-------------------------------------
        Resolution: Fixed
     Fix Version/s: 1.3.0
      Hadoop Flags: Reviewed
            Status: Resolved  (was: Patch Available)

Thanks Zhihai. Committed to branch-1!

Zero map split input length combine with none zero map split input length will cause MR1 job hung.
--------------------------------------------------------------------------------------------------

                Key: MAPREDUCE-6076
                URL: https://issues.apache.org/jira/browse/MAPREDUCE-6076
            Project: Hadoop Map/Reduce
         Issue Type: Bug
         Components: mrv1
           Reporter: zhihai xu
           Assignee: zhihai xu
            Fix For: 1.3.0
        Attachments: MAPREDUCE-6076.branch-1.000.patch

A zero map split input length combined with non-zero map split input lengths can cause an MR1 job to hang. This problem may happen when using HBase input splits (TableSplit). The HBase split input length can be zero for unknown regions or non-zero for known regions, as in the following code:
{code}
// TableSplit.java
public long getLength() {
  return length;
}

// RegionSizeCalculator.java
public long getRegionSize(byte[] regionId) {
  Long size = sizeMap.get(regionId);
  if (size == null) {
    LOG.debug("Unknown region:" + Arrays.toString(regionId));
    return 0;
  } else {
    return size;
  }
}
{code}
The TableSplit length comes from RegionSizeCalculator.getRegionSize. The job hangs because, in MR1, if these zero-split-input-length map tasks are scheduled and completed before all of the non-zero-split-input-length map tasks are scheduled, scheduling a new map task in JobInProgress.java fails the TaskTracker resources check:
{code}
// findNewMapTask
// Check to ensure this TaskTracker has enough resources to
// run tasks from this job
long outSize = resourceEstimator.getEstimatedMapOutputSize();
long availSpace = tts.getResourceStatus().getAvailableSpace();
if (availSpace < outSize) {
  LOG.warn("No room for map task. Node " + tts.getHost() +
           " has " + availSpace +
           " bytes free; but we expect map to take " + outSize);
  return -1; //see if a different TIP might work better.
}
{code}
The resource calculation is:
{code}
// in ResourceEstimator.java
protected synchronized long getEstimatedTotalMapOutputSize() {
  if (completedMapsUpdates < threshholdToUse) {
    return 0;
  } else {
    long inputSize = job.getInputLength() + job.desiredMaps();
    //add desiredMaps() so that randomwriter case doesn't blow up
    //the multiplication might lead to overflow, casting it with
    //double prevents it
    long estimate = Math.round(((double)inputSize *
        completedMapsOutputSize * 2.0) / completedMapsInputSize);
    if (LOG.isDebugEnabled()) {
      LOG.debug("estimate total map output will be " + estimate);
    }
    return estimate;
  }
}

protected synchronized void updateWithCompletedTask(TaskStatus ts,
                                                    TaskInProgress tip) {
  //-1 indicates error, which we don't average in.
  if (tip.isMapTask() && ts.getOutputSize() != -1) {
    completedMapsUpdates++;
    completedMapsInputSize += (tip.getMapInputSize() + 1);
    completedMapsOutputSize += ts.getOutputSize();
    if (LOG.isDebugEnabled()) {
      LOG.debug("completedMapsUpdates:" + completedMapsUpdates +
                " completedMapsInputSize:" + completedMapsInputSize +
                " completedMapsOutputSize:" + completedMapsOutputSize);
    }
  }
}
{code}
You can see that in this calculation completedMapsInputSize will be a very small number while inputSize * completedMapsOutputSize will be a very big number. For example, with completedMapsInputSize = 1, inputSize = 100 MBytes, and completedMapsOutputSize = 100 MBytes, the estimate will be around 5000TB, which is more than most TaskTracker disk space. So I think that if the map split input length is 0, it means the split input length is unknown, and it is reasonable to use the map output size as the input size for the calculation in ResourceEstimator. I will upload a fix based on this method.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
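The proposed remedy (treat a zero split length as unknown and substitute the task's output size when accumulating completedMapsInputSize) can be sketched as follows. This is a simplified stand-in for MR1's ResourceEstimator illustrating the idea, not the committed branch-1 patch:

```java
// Simplified stand-in for MR1's ResourceEstimator showing the proposed fix:
// when a completed map's split length is 0 (unknown, e.g. an HBase TableSplit
// for an unknown region), use its output size as the input size so the
// accumulated output/input ratio is not wildly inflated.
public class ZeroSplitEstimator {
    private long completedMapsInputSize = 0;
    private long completedMapsOutputSize = 0;

    synchronized void updateWithCompletedTask(long mapInputSize, long mapOutputSize) {
        if (mapInputSize == 0) {
            // unknown split length: assume input is roughly the output, not ~0
            completedMapsInputSize += mapOutputSize + 1;
        } else {
            completedMapsInputSize += mapInputSize + 1;
        }
        completedMapsOutputSize += mapOutputSize;
    }

    // estimate = total input * observed output/input ratio * 2 (MR1's safety factor)
    synchronized long estimateTotalMapOutput(long totalInputSize) {
        if (completedMapsInputSize == 0) {
            return 0;
        }
        return Math.round(((double) totalInputSize * completedMapsOutputSize * 2.0)
                          / completedMapsInputSize);
    }

    public static void main(String[] args) {
        ZeroSplitEstimator est = new ZeroSplitEstimator();
        est.updateWithCompletedTask(0, 100_000_000L);  // zero-length HBase split completes

        // With the fix the observed ratio stays near 1, so the estimate stays on
        // the order of the input size instead of exploding to thousands of TB.
        System.out.println(est.estimateTotalMapOutput(100_000_000L));
    }
}
```

With the unpatched behavior (`completedMapsInputSize` incremented by only 1 for a zero-length split), the same scenario divides by ~1 and produces an estimate far larger than any TaskTracker's free disk, which is exactly the scheduling deadlock described above.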
[jira] [Commented] (MAPREDUCE-6076) Zero map split input length combine with none zero map split input length will cause MR1 job hung.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14391777#comment-14391777 ]

Robert Kanter commented on MAPREDUCE-6076:
------------------------------------------

+1

(Repeated issue summary and description omitted; see the MAPREDUCE-6076 update notification above.)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6297) Task Id of the failed task in diagnostics should link to the task page
[ https://issues.apache.org/jira/browse/MAPREDUCE-6297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14391805#comment-14391805 ] Hadoop QA commented on MAPREDUCE-6297: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12708805/MAPREDUCE-6297.v3.patch against trunk revision 3c7adaa. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs: org.apache.hadoop.mapreduce.v2.hs.webapp.TestBlocks org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesTasks Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5365//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5365//console This message is automatically generated. 
Task Id of the failed task in diagnostics should link to the task page -- Key: MAPREDUCE-6297 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6297 Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobhistoryserver Affects Versions: 2.6.0 Reporter: Siqi Li Assignee: Siqi Li Priority: Minor Attachments: 58CCA024-7455-4A87-BCFD-C88054FF841B.png, MAPREDUCE-6297.v1.patch, MAPREDUCE-6297.v2.patch, MAPREDUCE-6297.v3.patch Currently we have to copy it and search in the task list. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MAPREDUCE-6304) Specifying node labels when submitting MR jobs
Jian Fang created MAPREDUCE-6304: Summary: Specifying node labels when submitting MR jobs Key: MAPREDUCE-6304 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6304 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: Jian Fang Per the discussion on Yarn-796, we need a mechanism in MAPREDUCE to specify node labels when submitting MR jobs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6302) deadlock in a job between map and reduce cores allocation
[ https://issues.apache.org/jira/browse/MAPREDUCE-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14392021#comment-14392021 ] mai shurong commented on MAPREDUCE-6302: The version is hadoop-2.6.0 deadlock in a job between map and reduce cores allocation -- Key: MAPREDUCE-6302 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6302 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.6.0 Reporter: mai shurong Priority: Critical Attachments: AM_log_head10.txt.gz, AM_log_tail10.txt.gz, queue_with_max163cores.png, queue_with_max263cores.png, queue_with_max333cores.png I submit a big job, which has 500 maps and 350 reduces, to a queue (fairscheduler) with 300 max cores. When the big mapreduce job has completed 100% of its maps, the 300 reduces occupy all 300 cores in the queue. Then a map fails and retries, waiting for a core, while the 300 reduces are waiting for the failed map to finish, so a deadlock occurs. As a result, the job is blocked, and later jobs in the queue cannot run because there are no available cores in the queue. I think there is a similar issue for the memory of a queue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-5875) Make Counter limits consistent across JobClient, MRAppMaster, and YarnChild
[ https://issues.apache.org/jira/browse/MAPREDUCE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14391183#comment-14391183 ] Hudson commented on MAPREDUCE-5875: --- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2100 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2100/]) Reverted MAPREDUCE-6286, MAPREDUCE-6199, and MAPREDUCE-5875 from branch-2.7. Editing CHANGES.txt to reflect this. (vinodkv: rev e428fea73029ea0c3494c71a50c5f6c994888fd2) * hadoop-mapreduce-project/CHANGES.txt Make Counter limits consistent across JobClient, MRAppMaster, and YarnChild --- Key: MAPREDUCE-5875 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5875 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster, client, task Affects Versions: 2.4.0 Reporter: Gera Shegalov Assignee: Gera Shegalov Fix For: 2.8.0 Attachments: MAPREDUCE-5875.v01.patch, MAPREDUCE-5875.v02.patch, MAPREDUCE-5875.v03.patch, MAPREDUCE-5875.v04.patch, MAPREDUCE-5875.v05.patch, MAPREDUCE-5875.v06.patch, MAPREDUCE-5875.v07.patch, MAPREDUCE-5875.v08.patch, MAPREDUCE-5875.v09.patch Currently, counter limits mapreduce.job.counters.* handled by {{org.apache.hadoop.mapreduce.counters.Limits}} are initialized asymmetrically: on the client side, and on the AM, job.xml is ignored whereas it's taken into account in YarnChild. It would be good to make the Limits job-configurable, such that max counters/groups is only increased when needed. With the current Limits implementation relying on static constants, it's going to be challenging for tools that submit jobs concurrently without resorting to class loading isolation. The patch that I am uploading is not perfect but demonstrates the issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6199) AbstractCounters are not reset completely on deserialization
[ https://issues.apache.org/jira/browse/MAPREDUCE-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14391184#comment-14391184 ] Hudson commented on MAPREDUCE-6199: --- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2100 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2100/]) Reverted MAPREDUCE-6286, MAPREDUCE-6199, and MAPREDUCE-5875 from branch-2.7. Editing CHANGES.txt to reflect this. (vinodkv: rev e428fea73029ea0c3494c71a50c5f6c994888fd2) * hadoop-mapreduce-project/CHANGES.txt AbstractCounters are not reset completely on deserialization Key: MAPREDUCE-6199 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6199 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Fix For: 2.8.0 Attachments: mr-6199.001.patch, mr-6199.001.patch, mr-6199.002.patch AbstractCounters are partially reset on deserialization. This patch completely resets them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6300) Task list sort by task id broken
[ https://issues.apache.org/jira/browse/MAPREDUCE-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated MAPREDUCE-6300: --- Attachment: MAPREDUCE-6300.v3.patch Task list sort by task id broken Key: MAPREDUCE-6300 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6300 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Siqi Li Assignee: Siqi Li Priority: Minor Attachments: MAPREDUCE-6300.v1.patch, MAPREDUCE-6300.v2.patch, MAPREDUCE-6300.v3.patch, screenshot-1.png -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6286) A typo in HistoryViewer makes some code useless, which causes counter limits are not reset correctly.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14391187#comment-14391187 ] Hudson commented on MAPREDUCE-6286: --- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2100 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2100/]) Reverted MAPREDUCE-6286, MAPREDUCE-6199, and MAPREDUCE-5875 from branch-2.7. Editing CHANGES.txt to reflect this. (vinodkv: rev e428fea73029ea0c3494c71a50c5f6c994888fd2) * hadoop-mapreduce-project/CHANGES.txt A typo in HistoryViewer makes some code useless, which causes counter limits are not reset correctly. - Key: MAPREDUCE-6286 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6286 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 2.6.0 Reporter: zhihai xu Assignee: zhihai xu Fix For: 2.8.0 Attachments: MAPREDUCE-6286.000.patch A typo in HistoryViewer makes some code useless and causes counter limits not to be reset correctly. The typo is Limits.reset(conf); we should use jobConf instead of conf. With the typo, the following code becomes useless: {code} final Path jobConfPath = new Path(jobFile.getParent(), jobDetails[0] + "_" + jobDetails[1] + "_" + jobDetails[2] + "_conf.xml"); final Configuration jobConf = new Configuration(conf); jobConf.addResource(fs.open(jobConfPath), jobConfPath.toString()); {code} The code is meant to load the configuration from the job configuration file and reset the Limits based on the configuration newly loaded from that file. But with the typo, the Limits are reset with the old configuration, so the code above has no effect. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
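A minimal, self-contained illustration of this bug class (hypothetical names, with a plain Map standing in for Hadoop's Configuration; not the actual HistoryViewer code): resetting limits from the base conf instead of the freshly loaded job conf silently ignores per-job overrides.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the bug described above, not the actual
// Hadoop source. `effectiveLimit` builds the job conf the same way
// HistoryViewer does (base conf plus the job's own conf.xml), then
// resets the limit from either the right or the wrong object.
class LimitsSketch {
    static int maxCounters;

    static void reset(Map<String, String> conf) {
        maxCounters = Integer.parseInt(
                conf.getOrDefault("mapreduce.job.counters.max", "120"));
    }

    static int effectiveLimit(Map<String, String> conf,
                              Map<String, String> jobOverrides,
                              boolean buggy) {
        Map<String, String> jobConf = new HashMap<>(conf); // new Configuration(conf)
        jobConf.putAll(jobOverrides);                      // addResource(conf.xml)
        reset(buggy ? conf : jobConf); // the typo: conf instead of jobConf
        return maxCounters;
    }
}
```

With the typo the per-job override is lost and the stale default wins; with the fix the job's own value takes effect.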
[jira] [Commented] (MAPREDUCE-6300) Task list sort by task id broken
[ https://issues.apache.org/jira/browse/MAPREDUCE-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14391143#comment-14391143 ] Hadoop QA commented on MAPREDUCE-6300: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12708739/MAPREDUCE-6300.v3.patch against trunk revision 4922394. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5363//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5363//console This message is automatically generated. 
Task list sort by task id broken Key: MAPREDUCE-6300 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6300 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Siqi Li Assignee: Siqi Li Priority: Minor Attachments: MAPREDUCE-6300.v1.patch, MAPREDUCE-6300.v2.patch, MAPREDUCE-6300.v3.patch, screenshot-1.png -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-4844) Counters.java doesn't obey Java Memory Model
[ https://issues.apache.org/jira/browse/MAPREDUCE-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14391157#comment-14391157 ] Hadoop QA commented on MAPREDUCE-4844: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12708750/MAPREDUCE-4844-002.patch against trunk revision 4922394. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5364//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5364//console This message is automatically generated. Counters.java doesn't obey Java Memory Model Key: MAPREDUCE-4844 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4844 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 3.0.0, 2.6.0 Reporter: Gera Shegalov Assignee: Brahma Reddy Battula Attachments: MAPREDUCE-4844-002.patch, MAPREDUCE-4844-002.patch, MAPREDUCE-4844-branch-1.patch Counters have a number of immutable fields that have not been declared 'final'. 
For example, the field groups is not final. It is, however, accessed in a couple of methods that are declared 'synchronized'. While there is a happens-before relationship between these methods calls, there is none between the Counters object initialization and these synchronized methods. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
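A minimal, hypothetical illustration of the pattern the report describes (not the actual Counters source): a logically immutable field that is not declared final, guarded only by synchronized methods.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the actual Counters.java. The field below
// is assigned once and never reassigned, but it is not final, so the
// Java Memory Model gives no guarantee that a thread which receives
// the object through an unsynchronized publication sees the
// constructor's write to `groups`.
class UnsafeCounters {
    private Map<String, Long> groups = new HashMap<>(); // should be final

    synchronized void increment(String group) {
        groups.merge(group, 1L, Long::sum);
    }

    synchronized long value(String group) {
        return groups.getOrDefault(group, 0L);
    }
    // The synchronized methods establish happens-before among
    // *each other*, but nothing orders object construction before an
    // unsynchronized publication of the reference. Declaring `groups`
    // final would make the object safely publishable.
}
```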
[jira] [Resolved] (MAPREDUCE-3286) Unit tests for MAPREDUCE-3186 - User jobs are getting hanged if the Resource manager process goes down and comes up while job is getting executed.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Payne resolved MAPREDUCE-3286. --- Resolution: Invalid Target Version/s: (was: ) Release Note: (was: New Yarn configuration property: Name: yarn.app.mapreduce.am.scheduler.connection.retries Description: Number of times AM should retry to contact RM if connection is lost.) RM has been refactored and restructured a few times in the past 3.5 years. Closing as invalid. Unit tests for MAPREDUCE-3186 - User jobs are getting hanged if the Resource manager process goes down and comes up while job is getting executed. -- Key: MAPREDUCE-3286 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3286 Project: Hadoop Map/Reduce Issue Type: Test Components: mrv2 Affects Versions: 0.23.0 Environment: linux Reporter: Eric Payne Assignee: Eric Payne Labels: test If the resource manager is restarted while job execution is in progress, the job hangs. The UI shows the job as running. The RM log throws an error: ERROR org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: AppAttemptId doesnt exist in cache appattempt_1318579738195_0004_01 In the console, the MRAppMaster and Runjar processes are not killed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6302) deadlock in a job between map and reduce cores allocation
[ https://issues.apache.org/jira/browse/MAPREDUCE-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14391208#comment-14391208 ] Wangda Tan commented on MAPREDUCE-6302: --- Moved to mapreduce. And [~shurong.mai], could you confirm the Hadoop version you're currently using? deadlock in a job between map and reduce cores allocation -- Key: MAPREDUCE-6302 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6302 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.6.0 Reporter: mai shurong Priority: Critical Attachments: AM_log_head10.txt.gz, AM_log_tail10.txt.gz, queue_with_max163cores.png, queue_with_max263cores.png, queue_with_max333cores.png I submit a big job, which has 500 maps and 350 reduces, to a queue (fairscheduler) with 300 max cores. When the big mapreduce job has completed 100% of its maps, the 300 reduces occupy all 300 cores in the queue. Then a map fails and retries, waiting for a core, while the 300 reduces are waiting for the failed map to finish, so a deadlock occurs. As a result, the job is blocked, and later jobs in the queue cannot run because there are no available cores in the queue. I think there is a similar issue for the memory of a queue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Moved] (MAPREDUCE-6302) deadlock in a job between map and reduce cores allocation
[ https://issues.apache.org/jira/browse/MAPREDUCE-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan moved YARN-3416 to MAPREDUCE-6302: - Component/s: (was: fairscheduler) Affects Version/s: (was: 2.6.0) 2.6.0 Key: MAPREDUCE-6302 (was: YARN-3416) Project: Hadoop Map/Reduce (was: Hadoop YARN) deadlock in a job between map and reduce cores allocation -- Key: MAPREDUCE-6302 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6302 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.6.0 Reporter: mai shurong Priority: Critical Attachments: AM_log_head10.txt.gz, AM_log_tail10.txt.gz, queue_with_max163cores.png, queue_with_max263cores.png, queue_with_max333cores.png I submit a big job, which has 500 maps and 350 reduces, to a queue (fairscheduler) with 300 max cores. When the big mapreduce job has completed 100% of its maps, the 300 reduces occupy all 300 cores in the queue. Then a map fails and retries, waiting for a core, while the 300 reduces are waiting for the failed map to finish, so a deadlock occurs. As a result, the job is blocked, and later jobs in the queue cannot run because there are no available cores in the queue. I think there is a similar issue for the memory of a queue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6303) Read timeout when retrying a fetch error can be fatal to a reducer
[ https://issues.apache.org/jira/browse/MAPREDUCE-6303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14391516#comment-14391516 ] Jason Lowe commented on MAPREDUCE-6303: --- Sample reduce log snippet showing the issue: {noformat} 2015-03-28 00:31:54,393 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#7 at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1694) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:150) at java.net.SocketInputStream.read(SocketInputStream.java:121) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read1(BufferedInputStream.java:275) at java.io.BufferedInputStream.read(BufferedInputStream.java:334) at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:633) at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:579) at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1322) at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.verifyConnection(Fetcher.java:427) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.setupConnectionsWithRetry(Fetcher.java:392) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:338) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193) 
2015-03-28 00:31:54,511 INFO [main] org.apache.hadoop.mapred.Task: Runnning cleanup for the task {noformat} The problem is that the code catches an IOException while trying to shuffle, and within the catch block it throws _again_, which leaks up to the top of the Fetcher thread and kills the task. Read timeout when retrying a fetch error can be fatal to a reducer -- Key: MAPREDUCE-6303 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6303 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.6.0 Reporter: Jason Lowe Priority: Blocker If a reducer encounters an error trying to fetch from a node and then encounters a read timeout when trying to re-establish the connection, the reducer can fail. The read timeout exception can leak to the top of the Fetcher thread, which will cause the reduce task to tear down. This type of error can repeat across reducer attempts, causing jobs to fail due to a single bad node. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
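A minimal sketch of the defensive pattern implied by the report, with hypothetical names rather than the actual Fetcher code: the secondary IOException raised while recovering from the first fetch error is caught and treated as another failed fetch, instead of escaping to the top of the thread.

```java
import java.io.IOException;

// Hypothetical sketch, not the actual Fetcher.copyFromHost code:
// any exception thrown while handling the first fetch error is
// contained and reported as a failed fetch, so the shuffle scheduler
// can retry later rather than the whole reduce task dying.
class FetchRetrySketch {
    interface Host { void fetch() throws IOException; }

    // Returns true if the fetch (or its single retry) succeeded.
    static boolean fetchWithRetry(Host host) {
        try {
            host.fetch();
            return true;
        } catch (IOException first) {
            try {
                // Re-establishing the connection can itself time out;
                // java.net.SocketTimeoutException is an IOException.
                host.fetch();
                return true;
            } catch (IOException second) {
                // Contain the secondary failure instead of letting it
                // propagate out of the fetcher thread.
                return false;
            }
        }
    }
}
```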
[jira] [Created] (MAPREDUCE-6303) Read timeout when retrying a fetch error can be fatal to a reducer
Jason Lowe created MAPREDUCE-6303: - Summary: Read timeout when retrying a fetch error can be fatal to a reducer Key: MAPREDUCE-6303 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6303 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.6.0 Reporter: Jason Lowe Priority: Blocker If a reducer encounters an error trying to fetch from a node and then encounters a read timeout when trying to re-establish the connection, the reducer can fail. The read timeout exception can leak to the top of the Fetcher thread, which will cause the reduce task to tear down. This type of error can repeat across reducer attempts, causing jobs to fail due to a single bad node. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6297) Task Id of the failed task in diagnostics should link to the task page
[ https://issues.apache.org/jira/browse/MAPREDUCE-6297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated MAPREDUCE-6297: --- Attachment: MAPREDUCE-6297.v3.patch Task Id of the failed task in diagnostics should link to the task page -- Key: MAPREDUCE-6297 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6297 Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobhistoryserver Affects Versions: 2.6.0 Reporter: Siqi Li Assignee: Siqi Li Priority: Minor Attachments: 58CCA024-7455-4A87-BCFD-C88054FF841B.png, MAPREDUCE-6297.v1.patch, MAPREDUCE-6297.v2.patch, MAPREDUCE-6297.v3.patch Currently we have to copy it and search in the task list. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-4844) Counters.java doesn't obey Java Memory Model
[ https://issues.apache.org/jira/browse/MAPREDUCE-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula updated MAPREDUCE-4844: Attachment: MAPREDUCE-4844-002.patch Counters.java doesn't obey Java Memory Model Key: MAPREDUCE-4844 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4844 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 3.0.0, 2.6.0 Reporter: Gera Shegalov Assignee: Brahma Reddy Battula Attachments: MAPREDUCE-4844-002.patch, MAPREDUCE-4844-002.patch, MAPREDUCE-4844-branch-1.patch Counters have a number of immutable fields that have not been declared 'final'. For example, the field groups is not final. It is, however, accessed in a couple of methods that are declared 'synchronized'. While there is a happens-before relationship between these methods calls, there is none between the Counters object initialization and these synchronized methods. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-4844) Counters.java doesn't obey Java Memory Model
[ https://issues.apache.org/jira/browse/MAPREDUCE-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14391057#comment-14391057 ] Brahma Reddy Battula commented on MAPREDUCE-4844: - Thanks a lot for review..Updated patch..Kindly Review... Counters.java doesn't obey Java Memory Model Key: MAPREDUCE-4844 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4844 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 3.0.0, 2.6.0 Reporter: Gera Shegalov Assignee: Brahma Reddy Battula Attachments: MAPREDUCE-4844-002.patch, MAPREDUCE-4844-002.patch, MAPREDUCE-4844-branch-1.patch Counters have a number of immutable fields that have not been declared 'final'. For example, the field groups is not final. It is, however, accessed in a couple of methods that are declared 'synchronized'. While there is a happens-before relationship between these methods calls, there is none between the Counters object initialization and these synchronized methods. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6300) Task list sort by task id broken
[ https://issues.apache.org/jira/browse/MAPREDUCE-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14391511#comment-14391511 ] Siqi Li commented on MAPREDUCE-6300: Not sure if that 134KB patch in YARN-3323 really fixed the problem. Task list sort by task id broken Key: MAPREDUCE-6300 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6300 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Siqi Li Assignee: Siqi Li Priority: Minor Attachments: MAPREDUCE-6300.v1.patch, MAPREDUCE-6300.v2.patch, MAPREDUCE-6300.v3.patch, screenshot-1.png -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MAPREDUCE-6303) Read timeout when retrying a fetch error can be fatal to a reducer
[ https://issues.apache.org/jira/browse/MAPREDUCE-6303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reassigned MAPREDUCE-6303: - Assignee: Jason Lowe Read timeout when retrying a fetch error can be fatal to a reducer -- Key: MAPREDUCE-6303 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6303 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.6.0 Reporter: Jason Lowe Assignee: Jason Lowe Priority: Blocker If a reducer encounters an error trying to fetch from a node and then encounters a read timeout when trying to re-establish the connection, the reducer can fail. The read timeout exception can leak to the top of the Fetcher thread, which will cause the reduce task to tear down. This type of error can repeat across reducer attempts, causing jobs to fail due to a single bad node. -- This message was sent by Atlassian JIRA (v6.3.4#6332)