[jira] [Resolved] (MAPREDUCE-6005) native-task: fix some valgrind errors
[ https://issues.apache.org/jira/browse/MAPREDUCE-6005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Binglin Chang resolved MAPREDUCE-6005. -- Resolution: Fixed native-task: fix some valgrind errors -- Key: MAPREDUCE-6005 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6005 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: task Reporter: Binglin Chang Assignee: Binglin Chang Attachments: MAPREDUCE-6005.v1.patch, MAPREDUCE-6005.v2.patch, MAPREDUCE-6005.v3.patch, MAPREDUCE-6005.v4.patch Running the tests under valgrind shows some bugs; this JIRA tries to fix them. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-6005) native-task: fix some valgrind errors
[ https://issues.apache.org/jira/browse/MAPREDUCE-6005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084351#comment-14084351 ] Binglin Chang commented on MAPREDUCE-6005: -- I have committed this, thanks Sean. native-task: fix some valgrind errors -- Key: MAPREDUCE-6005 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6005 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: task Reporter: Binglin Chang Assignee: Binglin Chang Attachments: MAPREDUCE-6005.v1.patch, MAPREDUCE-6005.v2.patch, MAPREDUCE-6005.v3.patch, MAPREDUCE-6005.v4.patch Running the tests under valgrind shows some bugs; this JIRA tries to fix them. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-6015) Make MR ApplicationMaster disable loading user's jars firstly
[ https://issues.apache.org/jira/browse/MAPREDUCE-6015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084528#comment-14084528 ] Bing Jiang commented on MAPREDUCE-6015: --- [~sjlee0], thanks very much. I agree with you about 'mapreduce.job.classloader'; it can tackle many class-conflict issues. However, if the user's jar contains classes that conflict with those needed to launch the AM, that property alone cannot help. For example, jetty-6.1.26 (a Hadoop system jar) conflicts with jetty-6.1.11.jar (a user jar), and the AM fails to launch: == org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster java.lang.NoSuchMethodError: org.mortbay.jetty.webapp.WebAppContext.getUnavailableException()Ljava/lang/Throwable; at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:813) at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:273) at org.apache.hadoop.mapreduce.v2.app.client.MRClientService.serviceStart(MRClientService.java:141) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1072) === And since the user's logic code depends on jetty-6.1.11.jar, mapreduce.user.classpath.first cannot simply be disabled. This gives me the impression that the AM and normal tasks should be treated separately. Make MR ApplicationMaster disable loading user's jars firstly -- Key: MAPREDUCE-6015 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6015 Project: Hadoop Map/Reduce Issue Type: Improvement Components: applicationmaster Affects Versions: 2.4.1 Reporter: Bing Jiang In most cases, we want to use -Dmapreduce.user.classpath.first=true to pick user's jars ahead of Hadoop system jars, so that tasks run in the customized environment when the Hadoop system default library contains different versions of dependent jars.
However, using -Dmapreduce.user.classpath.first=true can cause the ApplicationMaster to fail to launch due to conflicting classes. Since in most cases users do not customize the ApplicationMaster of the MapReduce framework, I believe we can treat MRAppMaster differently from MapTask/ReduceTask with respect to loading user jars in the classloader. A property such as '-Dmapreduce.am.user.classpath.first=false' could disable loading user jars first for the AM only. -- This message was sent by Atlassian JIRA (v6.2#6252)
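The discussion above involves three classpath knobs. A mapred-site.xml sketch of how they fit together (hedged: mapreduce.user.classpath.first and mapreduce.job.classloader are existing Hadoop properties; mapreduce.am.user.classpath.first is only the property proposed in this JIRA and does not exist in any release):

```xml
<!-- mapred-site.xml (sketch) -->
<!-- Prefer the user's jars over Hadoop's system jars for tasks. -->
<property>
  <name>mapreduce.user.classpath.first</name>
  <value>true</value>
</property>
<!-- Alternative: isolate job classes in a separate classloader. -->
<property>
  <name>mapreduce.job.classloader</name>
  <value>true</value>
</property>
<!-- PROPOSED in MAPREDUCE-6015, not an existing property:
     keep the AM itself on the system classpath while tasks
     still load user jars first. -->
<property>
  <name>mapreduce.am.user.classpath.first</name>
  <value>false</value>
</property>
```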
[jira] [Created] (MAPREDUCE-6023) Fix SuppressWarnings from unchecked to rawtypes in O.A.H.mapreduce.lib.input.TaggedInputSplit
Junping Du created MAPREDUCE-6023: - Summary: Fix SuppressWarnings from unchecked to rawtypes in O.A.H.mapreduce.lib.input.TaggedInputSplit Key: MAPREDUCE-6023 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6023 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Junping Du Priority: Minor -- This message was sent by Atlassian JIRA (v6.2#6252)
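For context, a hedged illustration of the distinction the summary refers to: @SuppressWarnings("unchecked") covers unchecked casts and calls, while "rawtypes" covers uses of raw generic types, so code holding a raw Class reference (as TaggedInputSplit does for its InputFormat class) needs the latter. The class and member names below are invented for the demo:

```java
// Illustration only: which @SuppressWarnings value silences which warning.
import java.util.ArrayList;
import java.util.List;

public class SuppressionDemo {

    // Raw type usage -> "rawtypes" warning; "unchecked" would not cover it.
    @SuppressWarnings("rawtypes")
    public static Class inputFormatClass = ArrayList.class;

    // Unchecked cast -> "unchecked" warning; "rawtypes" would not cover it.
    @SuppressWarnings("unchecked")
    public static List<String> uncheckedCast(Object o) {
        return (List<String>) o;
    }

    public static void main(String[] args) {
        List<String> l = uncheckedCast(new ArrayList<String>());
        l.add("ok");
        System.out.println(inputFormatClass.getSimpleName() + " " + l.get(0));
    }
}
```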
[jira] [Updated] (MAPREDUCE-6014) New task status field in task attempts table can lead to an empty web page
[ https://issues.apache.org/jira/browse/MAPREDUCE-6014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mit Desai updated MAPREDUCE-6014: - Attachment: MAPREDUCE-6014.patch Attaching patch for trunk and branch-2 New task status field in task attempts table can lead to an empty web page --- Key: MAPREDUCE-6014 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6014 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.5.0 Reporter: Mit Desai Assignee: Mit Desai Attachments: MAPREDUCE-6014.patch MAPREDUCE-5550 added a new task attempts field but didn't Javascript-escape the contents. Tasks with status messages that have newlines or other characters can then break the parsing of the web page and leave the user with a blank page. -- This message was sent by Atlassian JIRA (v6.2#6252)
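The breakage described above comes from embedding raw status strings inside the page's JavaScript. A minimal sketch of the kind of escaping the fix needs (illustration only, not the actual MAPREDUCE-6014 patch; Hadoop's web layer has its own escaping utilities):

```java
// Escape a task status string so it can sit safely inside a JavaScript
// string literal on the attempts page. Newlines and quotes are the
// characters that broke the page.
public class JsEscape {
    public static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (char c : s.toCharArray()) {
            switch (c) {
                case '\\': out.append("\\\\"); break;
                case '"':  out.append("\\\""); break;
                case '\'': out.append("\\'");  break;
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("status:\nline two \"quoted\""));
    }
}
```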
[jira] [Updated] (MAPREDUCE-6014) New task status field in task attempts table can lead to an empty web page
[ https://issues.apache.org/jira/browse/MAPREDUCE-6014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mit Desai updated MAPREDUCE-6014: - Status: Patch Available (was: Open) New task status field in task attempts table can lead to an empty web page --- Key: MAPREDUCE-6014 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6014 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.5.0 Reporter: Mit Desai Assignee: Mit Desai Attachments: MAPREDUCE-6014.patch MAPREDUCE-5550 added a new task attempts field but didn't Javascript-escape the contents. Tasks with status messages that have newlines or other characters can then break the parsing of the web page and leave the user with a blank page. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-6014) New task status field in task attempts table can lead to an empty web page
[ https://issues.apache.org/jira/browse/MAPREDUCE-6014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084837#comment-14084837 ] Hadoop QA commented on MAPREDUCE-6014: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12659661/MAPREDUCE-6014.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4785//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4785//console This message is automatically generated. New task status field in task attempts table can lead to an empty web page --- Key: MAPREDUCE-6014 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6014 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.5.0 Reporter: Mit Desai Assignee: Mit Desai Attachments: MAPREDUCE-6014.patch MAPREDUCE-5550 added a new task attempts field but didn't Javascript-escape the contents. 
Tasks with status messages that have newlines or other characters can then break the parsing of the web page and leave the user with a blank page. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-6014) New task status field in task attempts table can lead to an empty web page
[ https://issues.apache.org/jira/browse/MAPREDUCE-6014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085015#comment-14085015 ] Jonathan Eagles commented on MAPREDUCE-6014: +1. This will correct this issue New task status field in task attempts table can lead to an empty web page --- Key: MAPREDUCE-6014 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6014 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.5.0 Reporter: Mit Desai Assignee: Mit Desai Attachments: MAPREDUCE-6014.patch MAPREDUCE-5550 added a new task attempts field but didn't Javascript-escape the contents. Tasks with status messages that have newlines or other characters can then break the parsing of the web page and leave the user with a blank page. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-6018) Create a framework specific config to enable timeline server
[ https://issues.apache.org/jira/browse/MAPREDUCE-6018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated MAPREDUCE-6018: --- Issue Type: Sub-task (was: Improvement) Parent: MAPREDUCE-5858 Create a framework specific config to enable timeline server Key: MAPREDUCE-6018 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6018 Project: Hadoop Map/Reduce Issue Type: Sub-task Reporter: Jonathan Eagles -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5858) [Umbrella] MR should make use of the timeline server
[ https://issues.apache.org/jira/browse/MAPREDUCE-5858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085148#comment-14085148 ] Zhijie Shen commented on MAPREDUCE-5858: Unassigned the umbrella ticket as it may be contributed to by multiple stakeholders. [Umbrella] MR should make use of the timeline server Key: MAPREDUCE-5858 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5858 Project: Hadoop Map/Reduce Issue Type: Task Reporter: Zhijie Shen Now MR relies on its own JobHistoryServer for MR-specific history information. Given the timeline server is ready, we should gradually migrate MR historic data to it as well, relieving MR from maintaining its own history server daemon. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5858) [Umbrella] MR should make use of the timeline server
[ https://issues.apache.org/jira/browse/MAPREDUCE-5858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated MAPREDUCE-5858: --- Assignee: (was: Zhijie Shen) [Umbrella] MR should make use of the timeline server Key: MAPREDUCE-5858 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5858 Project: Hadoop Map/Reduce Issue Type: Task Reporter: Zhijie Shen Now MR relies on its own JobHistoryServer for MR-specific history information. Given the timeline server is ready, we should gradually migrate MR historic data to it as well, relieving MR from maintaining its own history server daemon. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5858) [Umbrella] MR should make use of the timeline server
[ https://issues.apache.org/jira/browse/MAPREDUCE-5858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085166#comment-14085166 ] Jonathan Eagles commented on MAPREDUCE-5858: [~zjshen], we need to be careful when designing this. In particular, users of the MR job history server will want to:
- Continue using the job history server with no runtime dependency on a timeline server being present
- Enable or disable the timeline server from the client side
- Automatic or manual pluggable configuration, or disabling, of an in-memory solution for local-mode testing
- Minicluster integration
- Allow history to be enabled for the timeline server and the job history server at the same time
- Decide what to do about the tracking URL for the different scenarios above
- A new MR UI using the timeline backing store
[Umbrella] MR should make use of the timeline server Key: MAPREDUCE-5858 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5858 Project: Hadoop Map/Reduce Issue Type: Task Reporter: Zhijie Shen Now MR relies on its own JobHistoryServer for MR-specific history information. Given the timeline server is ready, we should gradually migrate MR historic data to it as well, relieving MR from maintaining its own history server daemon. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-2911) Hamster: Hadoop And Mpi on the same cluSTER
[ https://issues.apache.org/jira/browse/MAPREDUCE-2911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hars Vardhan updated MAPREDUCE-2911: Assignee: Ralph Castain (was: Hars Vardhan) Hamster: Hadoop And Mpi on the same cluSTER --- Key: MAPREDUCE-2911 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2911 Project: Hadoop Map/Reduce Issue Type: New Feature Components: mrv2 Affects Versions: 0.23.0 Environment: All Unix-Environments Reporter: Milind Bhandarkar Assignee: Ralph Castain Original Estimate: 336h Remaining Estimate: 336h MPI is commonly used for many machine-learning applications. OpenMPI (http://www.open-mpi.org/) is a popular BSD-licensed version of MPI. In the past, running MPI application on a Hadoop cluster was achieved using Hadoop Streaming (http://videolectures.net/nipsworkshops2010_ye_gbd/), but it was kludgy. After the resource-manager separation from JobTracker in Hadoop, we have all the tools needed to make MPI a first-class citizen on a Hadoop cluster. I am currently working on the patch to make MPI an application-master. Initial version of this patch will be available soon (hopefully before September 10.) This jira will track the development of Hamster: The application master for MPI. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-2911) Hamster: Hadoop And Mpi on the same cluSTER
[ https://issues.apache.org/jira/browse/MAPREDUCE-2911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hars Vardhan updated MAPREDUCE-2911: Assignee: Ralph van Etten (was: Ralph Castain) Hamster: Hadoop And Mpi on the same cluSTER --- Key: MAPREDUCE-2911 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2911 Project: Hadoop Map/Reduce Issue Type: New Feature Components: mrv2 Affects Versions: 0.23.0 Environment: All Unix-Environments Reporter: Milind Bhandarkar Assignee: Ralph van Etten Original Estimate: 336h Remaining Estimate: 336h MPI is commonly used for many machine-learning applications. OpenMPI (http://www.open-mpi.org/) is a popular BSD-licensed version of MPI. In the past, running MPI application on a Hadoop cluster was achieved using Hadoop Streaming (http://videolectures.net/nipsworkshops2010_ye_gbd/), but it was kludgy. After the resource-manager separation from JobTracker in Hadoop, we have all the tools needed to make MPI a first-class citizen on a Hadoop cluster. I am currently working on the patch to make MPI an application-master. Initial version of this patch will be available soon (hopefully before September 10.) This jira will track the development of Hamster: The application master for MPI. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-2911) Hamster: Hadoop And Mpi on the same cluSTER
[ https://issues.apache.org/jira/browse/MAPREDUCE-2911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hars Vardhan updated MAPREDUCE-2911: Assignee: Ralph H Castain (was: Ralph van Etten) Hamster: Hadoop And Mpi on the same cluSTER --- Key: MAPREDUCE-2911 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2911 Project: Hadoop Map/Reduce Issue Type: New Feature Components: mrv2 Affects Versions: 0.23.0 Environment: All Unix-Environments Reporter: Milind Bhandarkar Assignee: Ralph H Castain Original Estimate: 336h Remaining Estimate: 336h MPI is commonly used for many machine-learning applications. OpenMPI (http://www.open-mpi.org/) is a popular BSD-licensed version of MPI. In the past, running MPI application on a Hadoop cluster was achieved using Hadoop Streaming (http://videolectures.net/nipsworkshops2010_ye_gbd/), but it was kludgy. After the resource-manager separation from JobTracker in Hadoop, we have all the tools needed to make MPI a first-class citizen on a Hadoop cluster. I am currently working on the patch to make MPI an application-master. Initial version of this patch will be available soon (hopefully before September 10.) This jira will track the development of Hamster: The application master for MPI. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-2911) Hamster: Hadoop And Mpi on the same cluSTER
[ https://issues.apache.org/jira/browse/MAPREDUCE-2911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hars Vardhan updated MAPREDUCE-2911: Assignee: (was: Ralph H Castain) Hamster: Hadoop And Mpi on the same cluSTER --- Key: MAPREDUCE-2911 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2911 Project: Hadoop Map/Reduce Issue Type: New Feature Components: mrv2 Affects Versions: 0.23.0 Environment: All Unix-Environments Reporter: Milind Bhandarkar Original Estimate: 336h Remaining Estimate: 336h MPI is commonly used for many machine-learning applications. OpenMPI (http://www.open-mpi.org/) is a popular BSD-licensed version of MPI. In the past, running MPI application on a Hadoop cluster was achieved using Hadoop Streaming (http://videolectures.net/nipsworkshops2010_ye_gbd/), but it was kludgy. After the resource-manager separation from JobTracker in Hadoop, we have all the tools needed to make MPI a first-class citizen on a Hadoop cluster. I am currently working on the patch to make MPI an application-master. Initial version of this patch will be available soon (hopefully before September 10.) This jira will track the development of Hamster: The application master for MPI. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5968) Work directory is not deleted in DistCache if Exception happen in downloadCacheObject.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085447#comment-14085447 ] Karthik Kambatla commented on MAPREDUCE-5968: - Patch looks mostly good. A couple of minor comments:
# Do we need to set delWorkDir to false in both places? The latter is always executed and the former can be skipped.
{code}
+      // promote the output to the final location
+      if (!localFs.rename(workDir, finalDir)) {
+        localFs.delete(workDir, true);
+        delWorkDir = false;
+        if (!localFs.exists(finalDir)) {
+          throw new IOException("Failed to promote distributed cache object " +
+              workDir + " to " + finalDir);
+        }
+        // someone else promoted first
+        return 0;
+      }
+      delWorkDir = false;
{code}
# I understand the "-work-" suffix comes from how the work directory name is generated. Can we create the work directory name in a method that can be accessed from both the production and test code, so the test continues to be useful in the future?
{code}
+    String workDir = destination.getParent().toString() + "-work-";
{code}
Work directory is not deleted in DistCache if Exception happen in downloadCacheObject. --- Key: MAPREDUCE-5968 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5968 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1 Affects Versions: 1.2.1 Reporter: zhihai xu Assignee: zhihai xu Attachments: MAPREDUCE-5968.branch1.patch In downloadCacheObject, the cache file is copied to a temporary work directory first, then the work directory is renamed to the final directory. If an IOException happens during the copy, the work directory is not deleted, leaving garbage data in the local disk cache.
For example, if the MR application uses the Distributed Cache to send a very large archive/file (50G) and the disk fills up during the copy, an IOException is triggered, the work directory is neither deleted nor renamed, and it occupies a big chunk of disk space. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5968) Work directory is not deleted in DistCache if Exception happen in downloadCacheObject.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085503#comment-14085503 ] zhihai xu commented on MAPREDUCE-5968: -- Thanks for the comments; these are good findings. 1. Based on your suggestion, I can optimize the code: remove delWorkDir and, in the finally block, check whether the work dir exists before deleting it. 2. Define "-work-" as a constant in the class so it can be reused by both production and test code. Work directory is not deleted in DistCache if Exception happen in downloadCacheObject. --- Key: MAPREDUCE-5968 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5968 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1 Affects Versions: 1.2.1 Reporter: zhihai xu Assignee: zhihai xu Attachments: MAPREDUCE-5968.branch1.patch In downloadCacheObject, the cache file is copied to a temporary work directory first, then the work directory is renamed to the final directory. If an IOException happens during the copy, the work directory is not deleted, leaving garbage data in the local disk cache. For example, if the MR application uses the Distributed Cache to send a very large archive/file (50G) and the disk fills up during the copy, an IOException is triggered, the work directory is neither deleted nor renamed, and it occupies a big chunk of disk space. -- This message was sent by Atlassian JIRA (v6.2#6252)
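The cleanup pattern the two comments converge on can be sketched as: copy into a "-work-" directory, promote it by rename, and delete any leftovers in a finally block, with the "-work-" suffix exposed as a shared constant. This is an assumed shape using plain java.io for illustration, not the actual branch-1 patch (the real code goes through Hadoop's LocalFileSystem):

```java
// Sketch: work-dir lifecycle with guaranteed cleanup on failure.
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class WorkDirCleanup {
    // "-work-" suffix kept as a constant so tests can reuse it,
    // as suggested in the review.
    public static final String WORK_SUFFIX = "-work-";

    public static void download(File finalDir) throws IOException {
        File workDir = new File(finalDir.getParent(),
                finalDir.getName() + WORK_SUFFIX);
        try {
            if (!workDir.mkdirs()) {
                throw new IOException("could not create " + workDir);
            }
            // ... copy the cache object into workDir here; may throw ...
            if (!workDir.renameTo(finalDir)) {
                throw new IOException("failed to promote " + workDir
                        + " to " + finalDir);
            }
        } finally {
            // After a successful rename workDir no longer exists, so this
            // only removes leftovers from a failed copy or rename.
            if (workDir.exists()) {
                deleteRecursively(workDir);
            }
        }
    }

    static void deleteRecursively(File f) throws IOException {
        File[] kids = f.listFiles();
        if (kids != null) {
            for (File k : kids) deleteRecursively(k);
        }
        Files.deleteIfExists(f.toPath());
    }
}
```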
[jira] [Updated] (MAPREDUCE-4815) FileOutputCommitter.commitJob can be very slow for jobs with many output files
[ https://issues.apache.org/jira/browse/MAPREDUCE-4815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated MAPREDUCE-4815: --- Attachment: MAPREDUCE-4815.v1.patch FileOutputCommitter.commitJob can be very slow for jobs with many output files -- Key: MAPREDUCE-4815 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4815 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.3, 2.0.1-alpha, 2.4.1 Reporter: Jason Lowe Assignee: Arun C Murthy Attachments: MAPREDUCE-4815.v1.patch If a job generates many files to commit then the commitJob method call at the end of the job can take minutes. This is a performance regression from 1.x, as 1.x had the tasks commit directly to the final output directory as they were completing and commitJob had very little to do. The commit work was processed in parallel and overlapped the processing of outstanding tasks. In 0.23/2.x, the commit is single-threaded and waits until all tasks have completed before commencing. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-4815) FileOutputCommitter.commitJob can be very slow for jobs with many output files
[ https://issues.apache.org/jira/browse/MAPREDUCE-4815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated MAPREDUCE-4815: --- Assignee: Siqi Li (was: Arun C Murthy) Affects Version/s: 2.4.1 Status: Patch Available (was: Open) FileOutputCommitter.commitJob can be very slow for jobs with many output files -- Key: MAPREDUCE-4815 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4815 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 2.4.1, 2.0.1-alpha, 0.23.3 Reporter: Jason Lowe Assignee: Siqi Li Attachments: MAPREDUCE-4815.v1.patch If a job generates many files to commit then the commitJob method call at the end of the job can take minutes. This is a performance regression from 1.x, as 1.x had the tasks commit directly to the final output directory as they were completing and commitJob had very little to do. The commit work was processed in parallel and overlapped the processing of outstanding tasks. In 0.23/2.x, the commit is single-threaded and waits until all tasks have completed before commencing. -- This message was sent by Atlassian JIRA (v6.2#6252)
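The fix direction implied by the description, overlapping the per-file commit work instead of doing it serially, can be sketched with a plain thread pool. This is an assumed approach for illustration, not necessarily what MAPREDUCE-4815.v1.patch does:

```java
// Sketch: run the per-file rename tasks of commitJob concurrently
// instead of one at a time, propagating the first failure.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelCommit {
    public static void commitAll(List<Runnable> renames, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (Runnable r : renames) {
                futures.add(pool.submit(r));
            }
            for (Future<?> f : futures) {
                f.get(); // rethrows if any rename task failed
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

The pool size would be a tunable; many small renames are latency-bound against the NameNode, so even a modest number of threads overlaps most of the wait.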
[jira] [Commented] (MAPREDUCE-6007) Create a new option for distcp -p which causes raw.* namespace extended attributes to not be preserved
[ https://issues.apache.org/jira/browse/MAPREDUCE-6007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085543#comment-14085543 ] Andrew Wang commented on MAPREDUCE-6007: Hi Charles, thanks for the patch. I had to take notes while reviewing this patch; the behavior is kind of complicated. We have a variety of flags that can be specified, and the destination FS can have different levels of support. It'd be very useful to specify this behavior in gory detail in the DistCp documentation. Check me on this though. Options:
{noformat}
-px : preserve raw and non-raw xattrs
-pr : no xattrs are preserved
-p  : preserve raw xattrs
-pxr: preserve non-raw xattrs
(no -p option): no xattrs are preserved
{noformat}
Behavior with a given src and dst, varying levels of dst support:
* raw src, raw dst: the options apply as specified above
* raw src, non-raw dst, dst supports xattrs but has no {{/.reserved/raw}}: we will fail to set raw xattrs at runtime.
* raw src, dst doesn't support xattrs: if {{-pX}} is specified, throws an exception. Otherwise, silently discards raw xattrs.
Some discussion on the above:
* If the src is {{/.reserved/raw}}, the user is expecting preservation of raw xattrs when {{-p}} or {{-pX}} is specified. In this scenario, we should test that the dest is {{/.reserved/raw}} and that it's present on the dstFS.
* There might be other weird cases; I haven't thought through all of them.
Some code review comments:
Misc:
- We have both {{noPreserveRaw}} and {{preserveRaw}} booleans; can we standardize on one everywhere? I'd like a negative one, call it {{disableRaw}} or {{excludeRaw}}, since it better captures the meaning of the flag. {{exclude}} feels a bit better IMO, but it looks like {{-pe}} is taken.
- What's the expected behavior when the dest doesn't support xattrs or reserved raw, or supports xattrs but not reserved raw?
- CopyListing: this is where we'd also test whether the destFS has a /.reserved/raw directory
- CopyMapper: two periods in the block comment
Documentation:
- I don't want to tie raw preservation just to encryption, since we might also use it for compression. How about this instead:
{quote}
d: disable preservation of raw namespace extended attributes
...
raw namespace extended attributes are preserved by default if supported. Specifying -pd disables preservation of these xattrs.
{quote}
- As noted above, it'd be good to have the expected preservation behavior laid out in the distcp documentation.
DistCp:
{code}
if (!Path.getPathWithoutSchemeAndAuthority(target).toString().
{code}
What if the target is a relative path here?
Test:
- Any reason this isn't part of the existing XAttr test? They seem pretty similar, and you also added a PXD test to the existing test.
- No need to call makeFilesAndDirs in the @BeforeClass.
- Doesn't there need to be a non-raw attribute set so you can test some of these combinations?
- Can we test what happens when the dest FS doesn't support xattrs or raw xattrs?
Create a new option for distcp -p which causes raw.* namespace extended attributes to not be preserved -- Key: MAPREDUCE-6007 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6007 Project: Hadoop Map/Reduce Issue Type: New Feature Components: distcp Affects Versions: fs-encryption Reporter: Charles Lamb Assignee: Charles Lamb Attachments: MAPREDUCE-6007.001.patch As part of the Data at Rest Encryption work (HDFS-6134), we need to create a new option for distcp which causes raw.* namespace extended attributes to not be preserved. See the doc in HDFS-6509 for details. The default for this option will be to preserve raw.* xattrs. -- This message was sent by Atlassian JIRA (v6.2#6252)
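The failure-mode rule in the review above (fail fast only when xattr preservation was explicitly requested, otherwise silently drop raw xattrs) can be condensed into one decision helper. This is a hedged sketch; the method name, class name, and exception type are invented for illustration and are not DistCp's actual API:

```java
// Sketch of the raw.* xattr copy decision for a single destination FS.
public class RawXattrPolicy {
    /**
     * Decide whether raw.* xattrs can be copied to the destination.
     * Throws only when the user explicitly asked for preservation
     * but the destination cannot store raw xattrs.
     */
    public static boolean shouldCopyRawXattrs(boolean preserveXattrsRequested,
                                              boolean destSupportsRawXattrs) {
        if (destSupportsRawXattrs) {
            return true;
        }
        if (preserveXattrsRequested) {
            throw new IllegalStateException(
                "destination does not support raw.* xattrs but -px was given");
        }
        return false; // silently discard raw xattrs
    }
}
```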
[jira] [Updated] (MAPREDUCE-5968) Work directory is not deleted in DistCache if Exception happen in downloadCacheObject.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated MAPREDUCE-5968: - Attachment: MAPREDUCE-5968.branch1_new.patch Work directory is not deleted in DistCache if Exception happen in downloadCacheObject. --- Key: MAPREDUCE-5968 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5968 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1 Affects Versions: 1.2.1 Reporter: zhihai xu Assignee: zhihai xu Attachments: MAPREDUCE-5968.branch1.patch, MAPREDUCE-5968.branch1_new.patch Work directory is not deleted in DistCache if Exception happen in downloadCacheObject. In downloadCacheObject, the cache file will be copied to temporarily work directory first, then the work directory will be renamed to the final directory. If IOException happens during the copy, the work directory will not be deleted. This will cause garbage data left in local disk cache. For example If the MR application use Distributed Cache to send a very large Archive/file(50G), if the disk is full during the copy, then the IOException will be triggered, the work directory will be not deleted or renamed and the work directory will occupy a big chunk of disk space. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5968) Work directory is not deleted in DistCache if Exception happen in downloadCacheObject.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085563#comment-14085563 ] zhihai xu commented on MAPREDUCE-5968: -- Hi Karthik, I attached a new patch to address these issues. Please review it. Thanks, zhihai. Work directory is not deleted in DistCache if Exception happen in downloadCacheObject. --- Key: MAPREDUCE-5968 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5968 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1 Affects Versions: 1.2.1 Reporter: zhihai xu Assignee: zhihai xu Attachments: MAPREDUCE-5968.branch1.patch, MAPREDUCE-5968.branch1_new.patch In downloadCacheObject, the cache file is copied to a temporary work directory first, then the work directory is renamed to the final directory. If an IOException happens during the copy, the work directory is not deleted, leaving garbage data in the local disk cache. For example, if the MR application uses the Distributed Cache to send a very large archive/file (50G) and the disk fills up during the copy, an IOException is triggered, the work directory is neither deleted nor renamed, and it occupies a big chunk of disk space. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5968) Work directory is not deleted in DistCache if Exception happen in downloadCacheObject.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated MAPREDUCE-5968: - Attachment: MAPREDUCE-5968.branch1_new1.patch Work directory is not deleted in DistCache if Exception happen in downloadCacheObject. --- Key: MAPREDUCE-5968 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5968 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1 Affects Versions: 1.2.1 Reporter: zhihai xu Assignee: zhihai xu Attachments: MAPREDUCE-5968.branch1.patch, MAPREDUCE-5968.branch1_new.patch, MAPREDUCE-5968.branch1_new1.patch Work directory is not deleted in DistCache if Exception happen in downloadCacheObject. In downloadCacheObject, the cache file will be copied to temporarily work directory first, then the work directory will be renamed to the final directory. If IOException happens during the copy, the work directory will not be deleted. This will cause garbage data left in local disk cache. For example If the MR application use Distributed Cache to send a very large Archive/file(50G), if the disk is full during the copy, then the IOException will be triggered, the work directory will be not deleted or renamed and the work directory will occupy a big chunk of disk space. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5968) Work directory is not deleted in DistCache if an Exception happens in downloadCacheObject.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085617#comment-14085617 ] zhihai xu commented on MAPREDUCE-5968: -- I just uploaded a new patch, MAPREDUCE-5968.branch1_new1.patch, to remove the duplicate localFs.delete(workDir, true);. Thanks, Karthik, for your review.
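The fix being discussed above amounts to a single cleanup point for the work directory. A minimal sketch of that pattern, not Hadoop's actual DistributedCache code (the names downloadCacheObject, workDir, and finalDir mirror the issue text but the implementation here is illustrative):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Sketch: copy the cache file into a temporary work directory, rename it
// into place on success, and delete it on any failure so no partial data
// is left in the local disk cache.
public class CacheDownloadSketch {

    static void downloadCacheObject(byte[] cacheData, Path workDir, Path finalDir)
            throws IOException {
        try {
            Files.createDirectories(workDir);
            // 1. copy the cache file into the temporary work directory
            Files.write(workDir.resolve("cachefile"), cacheData);
            // 2. rename the work directory to the final directory
            Files.move(workDir, finalDir);
        } catch (IOException e) {
            // single cleanup point: remove the work directory on any failure,
            // avoiding both leaked garbage and duplicate delete calls
            deleteRecursively(workDir);
            throw e;
        }
    }

    static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(dir)) {
            // delete children before parents
            paths.sorted(Comparator.reverseOrder())
                 .forEach(p -> p.toFile().delete());
        }
    }
}
```

Putting the delete in a single catch block (rather than sprinkling localFs.delete calls through the method) is what the branch1_new1 patch revision is after.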
[jira] [Commented] (MAPREDUCE-4815) FileOutputCommitter.commitJob can be very slow for jobs with many output files
[ https://issues.apache.org/jira/browse/MAPREDUCE-4815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085651#comment-14085651 ] Hadoop QA commented on MAPREDUCE-4815: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12659761/MAPREDUCE-4815.v1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core: org.apache.hadoop.mapreduce.lib.output.TestPreemptableFileOutputCommitter {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4786//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4786//console This message is automatically generated. 
FileOutputCommitter.commitJob can be very slow for jobs with many output files -- Key: MAPREDUCE-4815 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4815 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.3, 2.0.1-alpha, 2.4.1 Reporter: Jason Lowe Assignee: Siqi Li Attachments: MAPREDUCE-4815.v1.patch If a job generates many files to commit, the commitJob call at the end of the job can take minutes. This is a performance regression from 1.x: there, tasks committed directly to the final output directory as they completed, so commitJob had very little to do; the commit work was processed in parallel and overlapped the processing of outstanding tasks. In 0.23/2.x, the commit is single-threaded and waits until all tasks have completed before commencing.
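The serial commit described above can be sketched as a thread-pool version: each pending rename of a task's output into the final output directory is submitted to a pool instead of being done one at a time. This is a hedged illustration under assumed names (commitJob and moveTasks here are stand-ins, not FileOutputCommitter's real API):

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch: overlap the per-file/per-task renames that commitJob performs,
// instead of executing them in a single-threaded loop.
public class ParallelCommitSketch {

    static void commitJob(List<Runnable> moveTasks, int threads)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (Runnable move : moveTasks) {
            pool.execute(move);  // each rename proceeds concurrently
        }
        pool.shutdown();
        // the job is only declared committed once every rename has finished
        if (!pool.awaitTermination(10, TimeUnit.MINUTES)) {
            throw new InterruptedException("commit did not finish in time");
        }
    }
}
```

Since each rename touches a distinct file, the moves are independent and need no locking; the only synchronization point is waiting for the pool to drain before marking the job committed.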
[jira] [Commented] (MAPREDUCE-2911) Hamster: Hadoop And Mpi on the same cluSTER
[ https://issues.apache.org/jira/browse/MAPREDUCE-2911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085696#comment-14085696 ] Zhuoluo (Clark) Yang commented on MAPREDUCE-2911: - Hi Madhurima, I am not quite sure whether there is a public repo of the OpenMPI work. There is an alpha-version MPICH2 implementation at https://github.com/clarkyzl/mpich2-yarn; you may want to try it. Hamster: Hadoop And Mpi on the same cluSTER --- Key: MAPREDUCE-2911 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2911 Project: Hadoop Map/Reduce Issue Type: New Feature Components: mrv2 Affects Versions: 0.23.0 Environment: All Unix-Environments Reporter: Milind Bhandarkar Original Estimate: 336h Remaining Estimate: 336h MPI is commonly used for many machine-learning applications. OpenMPI (http://www.open-mpi.org/) is a popular BSD-licensed version of MPI. In the past, running MPI applications on a Hadoop cluster was achieved using Hadoop Streaming (http://videolectures.net/nipsworkshops2010_ye_gbd/), but it was kludgy. After the separation of the resource manager from the JobTracker in Hadoop, we have all the tools needed to make MPI a first-class citizen on a Hadoop cluster. I am currently working on a patch to make MPI an application master. An initial version of this patch will be available soon (hopefully before September 10). This jira will track the development of Hamster: the application master for MPI.
[jira] [Commented] (MAPREDUCE-5968) Work directory is not deleted in DistCache if an Exception happens in downloadCacheObject.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085723#comment-14085723 ] Hadoop QA commented on MAPREDUCE-5968: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12659787/MAPREDUCE-5968.branch1_new1.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4787//console This message is automatically generated.
[jira] [Commented] (MAPREDUCE-5984) native-task: upgrade lz4 to latest version
[ https://issues.apache.org/jira/browse/MAPREDUCE-5984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085733#comment-14085733 ] Sean Zhong commented on MAPREDUCE-5984: --- Looks good, +1 native-task: upgrade lz4 to latest version --- Key: MAPREDUCE-5984 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5984 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: task Reporter: Binglin Chang Assignee: Binglin Chang Priority: Minor Attachments: MAPREDUCE-5984.v1.patch, MAPREDUCE-5984.v2.patch
[jira] [Commented] (MAPREDUCE-5976) native-task should not fail to build if snappy is missing
[ https://issues.apache.org/jira/browse/MAPREDUCE-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085734#comment-14085734 ] Sean Zhong commented on MAPREDUCE-5976: --- [~tlipcon], can you take a look at the new patch? native-task should not fail to build if snappy is missing - Key: MAPREDUCE-5976 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5976 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: task Reporter: Todd Lipcon Assignee: Sean Zhong Attachments: mapreduce-5976-v2.txt, mapreduce-5976.txt Other native parts of Hadoop will automatically disable snappy support if snappy is not present and -Drequire.snappy is not passed. native-task should do the same. (right now, it fails to build if snappy is missing) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5978) native-task CompressTest failure on Ubuntu
[ https://issues.apache.org/jira/browse/MAPREDUCE-5978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085735#comment-14085735 ] Sean Zhong commented on MAPREDUCE-5978: --- Ok, patch looks good, +1 native-task CompressTest failure on Ubuntu -- Key: MAPREDUCE-5978 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5978 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: task Reporter: Todd Lipcon Assignee: Manu Zhang Attachments: mapreduce-5978.txt The MR-2841 branch fails the following unit tests on my box: CompressTest.testBzip2Compress:84 file compare result: if they are the same ,then return true expected:true but was:false CompressTest.testDefaultCompress:116 file compare result: if they are the same ,then return true expected:true but was:false We need to fix these before merging. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-6004) native-task should not fail to build if zlib is missing
[ https://issues.apache.org/jira/browse/MAPREDUCE-6004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14085740#comment-14085740 ] Manu Zhang commented on MAPREDUCE-6004: --- Closed, since zlib is a prerequisite for compiling Hadoop native code, as documented in https://github.com/apache/hadoop-common/blob/trunk/BUILDING.txt#L12, so there is no need to check for it here. native-task should not fail to build if zlib is missing --- Key: MAPREDUCE-6004 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6004 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: task Reporter: Manu Zhang zlib is required by Gzip. We need to check for its existence at build time and exclude Gzip-related code when zlib is missing, similar to MAPREDUCE-5976.
[jira] [Resolved] (MAPREDUCE-6004) native-task should not fail to build if zlib is missing
[ https://issues.apache.org/jira/browse/MAPREDUCE-6004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manu Zhang resolved MAPREDUCE-6004. --- Resolution: Not a Problem