[jira] [Created] (MAPREDUCE-5927) Getting following error
Kedar Dixit created MAPREDUCE-5927:
--
Summary: Getting following error
Key: MAPREDUCE-5927
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5927
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: applicationmaster
Reporter: Kedar Dixit
Priority: Blocker
--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5927) Getting following error
[ https://issues.apache.org/jira/browse/MAPREDUCE-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kedar Dixit updated MAPREDUCE-5927: --- Description: 14/06/16 16:21:48 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this. 14/06/16 16:21:49 INFO input.FileInputFormat: Total input paths to process : 1 14/06/16 16:21:49 INFO mapreduce.JobSubmitter: number of splits:1 14/06/16 16:21:49 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name 14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir 14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.working.dir is deprecated. 
Instead, use mapreduce.job.working.dir 14/06/16 16:21:49 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1402913701967_0006 14/06/16 16:21:49 INFO impl.YarnClientImpl: Submitted application application_1402913701967_0006 to ResourceManager at master/10.71.71.110:8032 14/06/16 16:21:49 INFO mapreduce.Job: The url to track the job: http://gs-1695:8088/proxy/application_1402913701967_0006/ 14/06/16 16:21:49 INFO mapreduce.Job: Running job: job_1402913701967_0006 14/06/16 16:21:54 INFO mapreduce.Job: Job job_1402913701967_0006 running in uber mode : false 14/06/16 16:21:54 INFO mapreduce.Job: map 0% reduce 0% 14/06/16 16:21:54 INFO mapreduce.Job: Job job_1402913701967_0006 failed with state FAILED due to: Application application_1402913701967_0006 failed 2 times due to AM Container for appattempt_1402913701967_0006_02 exited with exitCode: 1 due to: Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException: at org.apache.hadoop.util.Shell.runCommand(Shell.java:464) at org.apache.hadoop.util.Shell.run(Shell.java:379) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) .Failing this attempt.. Failing the application. 14/06/16 16:21:54 INFO mapreduce.Job: Counters: 0 Getting following error --- Key: MAPREDUCE-5927 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5927 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster Reporter: Kedar Dixit Priority: Blocker 14/06/16 16:21:48 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this. 14/06/16 16:21:49 INFO input.FileInputFormat: Total input paths to process : 1 14/06/16 16:21:49 INFO mapreduce.JobSubmitter: number of splits:1 14/06/16 16:21:49 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.jar is deprecated.
[jira] [Updated] (MAPREDUCE-5927) Getting following error
[ https://issues.apache.org/jira/browse/MAPREDUCE-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kedar Dixit updated MAPREDUCE-5927: --- Description: Hi, I am getting following error, while running application on cluser - 14/06/16 16:21:48 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this. 14/06/16 16:21:49 INFO input.FileInputFormat: Total input paths to process : 1 14/06/16 16:21:49 INFO mapreduce.JobSubmitter: number of splits:1 14/06/16 16:21:49 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name 14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir 14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.working.dir is deprecated. 
Instead, use mapreduce.job.working.dir 14/06/16 16:21:49 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1402913701967_0006 14/06/16 16:21:49 INFO impl.YarnClientImpl: Submitted application application_1402913701967_0006 to ResourceManager at master/10.71.71.110:8032 14/06/16 16:21:49 INFO mapreduce.Job: The url to track the job: http://gs-1695:8088/proxy/application_1402913701967_0006/ 14/06/16 16:21:49 INFO mapreduce.Job: Running job: job_1402913701967_0006 14/06/16 16:21:54 INFO mapreduce.Job: Job job_1402913701967_0006 running in uber mode : false 14/06/16 16:21:54 INFO mapreduce.Job: map 0% reduce 0% 14/06/16 16:21:54 INFO mapreduce.Job: Job job_1402913701967_0006 failed with state FAILED due to: Application application_1402913701967_0006 failed 2 times due to AM Container for appattempt_1402913701967_0006_02 exited with exitCode: 1 due to: Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException: at org.apache.hadoop.util.Shell.runCommand(Shell.java:464) at org.apache.hadoop.util.Shell.run(Shell.java:379) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) .Failing this attempt.. Failing the application. 14/06/16 16:21:54 INFO mapreduce.Job: Counters: 0 Can you please help me in fixing this ? Thanks, ~Kedar was: 14/06/16 16:21:48 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this. 14/06/16 16:21:49 INFO input.FileInputFormat: Total input paths to process : 1 14/06/16 16:21:49 INFO mapreduce.JobSubmitter: number of splits:1 14/06/16 16:21:49 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces 14/06/16 16:21:49 INFO Configuration.deprecation:
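The JobSubmitter warning at the top of the log above ("Hadoop command-line option parsing not performed") refers to the standard Tool/ToolRunner driver pattern. A minimal sketch of that pattern follows; MyJobDriver is a hypothetical stand-in for the reporter's actual driver class, and the mapper/reducer settings are left as placeholders:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Hypothetical driver illustrating the Tool/ToolRunner pattern the warning asks for.
public class MyJobDriver extends Configured implements Tool {

  @Override
  public int run(String[] args) throws Exception {
    // getConf() already contains any generic options (-D, -files, -libjars)
    // parsed by ToolRunner, which is what makes the warning go away.
    Job job = Job.getInstance(getConf(), "my job");
    job.setJarByClass(MyJobDriver.class);
    // set mapper, reducer, and output key/value classes for the real job here
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    return job.waitForCompletion(true) ? 0 : 1;
  }

  public static void main(String[] args) throws Exception {
    // ToolRunner strips the generic Hadoop options before calling run().
    System.exit(ToolRunner.run(new Configuration(), new MyJobDriver(), args));
  }
}

Note that this only removes the JobSubmitter warning; the container-launch failure further down is a separate problem (see the resolution comment later in this thread).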
[jira] [Commented] (MAPREDUCE-5927) Getting following error
[ https://issues.apache.org/jira/browse/MAPREDUCE-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032335#comment-14032335 ] Kedar Dixit commented on MAPREDUCE-5927: My .bashrc has the following configuration:
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-amd64/
export HADOOP_HOME=/home/gslab/setup/hadoop-2.2.0
#export JAVA_HOME=$(/usr/libexec/java_home)
export HADOOP_MAPRED_HOME=/home/gslab/setup/hadoop-2.2.0
export HADOOP_COMMON_HOME=/home/gslab/setup/hadoop-2.2.0
export HADOOP_HDFS_HOME=/home/gslab/setup/hadoop-2.2.0
export YARN_HOME=/home/gslab/setup/hadoop-2.2.0
export HADOOP_YARN_HOME=/home/gslab/setup/hadoop-2.2.0
export HADOOP_CONF_DIR=/home/gslab/setup/hadoop-2.2.0
#export MAHOUT_HOME=/home/gslab/setup/mahout-distribution-0.9
export MAHOUT_HOME=/home/gslab/setup/mahout-trunk
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$MAHOUT_HOME/bin:/$JAVA_HOME/bin
export PATH=$PATH:$MAHOUT_HOME
Getting following error --- Key: MAPREDUCE-5927 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5927 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster Reporter: Kedar Dixit Priority: Blocker Hi, I am getting following error, while running application on cluser - 14/06/16 16:21:48 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this. 14/06/16 16:21:49 INFO input.FileInputFormat: Total input paths to process : 1 14/06/16 16:21:49 INFO mapreduce.JobSubmitter: number of splits:1 14/06/16 16:21:49 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name 14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir 14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.working.dir is deprecated. 
Instead, use mapreduce.job.working.dir 14/06/16 16:21:49 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1402913701967_0006 14/06/16 16:21:49 INFO impl.YarnClientImpl: Submitted application application_1402913701967_0006 to ResourceManager at master/10.71.71.110:8032 14/06/16 16:21:49 INFO mapreduce.Job: The url to track the job: http://gs-1695:8088/proxy/application_1402913701967_0006/ 14/06/16 16:21:49 INFO mapreduce.Job: Running job: job_1402913701967_0006 14/06/16 16:21:54 INFO mapreduce.Job: Job job_1402913701967_0006 running in uber mode : false 14/06/16 16:21:54 INFO mapreduce.Job: map 0% reduce 0% 14/06/16 16:21:54 INFO mapreduce.Job: Job job_1402913701967_0006 failed with state FAILED due to: Application application_1402913701967_0006 failed 2 times due to AM Container for appattempt_1402913701967_0006_02 exited with exitCode: 1 due to: Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException: at org.apache.hadoop.util.Shell.runCommand(Shell.java:464) at org.apache.hadoop.util.Shell.run(Shell.java:379) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79) at
[jira] [Updated] (MAPREDUCE-5927) Getting following error
[ https://issues.apache.org/jira/browse/MAPREDUCE-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kedar Dixit updated MAPREDUCE-5927: --- Assignee: Vinod Kumar Vavilapalli Getting following error --- Key: MAPREDUCE-5927 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5927 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster Reporter: Kedar Dixit Assignee: Vinod Kumar Vavilapalli Priority: Blocker Hi, I am getting following error, while running application on cluser - 14/06/16 16:21:48 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this. 14/06/16 16:21:49 INFO input.FileInputFormat: Total input paths to process : 1 14/06/16 16:21:49 INFO mapreduce.JobSubmitter: number of splits:1 14/06/16 16:21:49 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name 14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir 14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.working.dir is deprecated. 
Instead, use mapreduce.job.working.dir 14/06/16 16:21:49 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1402913701967_0006 14/06/16 16:21:49 INFO impl.YarnClientImpl: Submitted application application_1402913701967_0006 to ResourceManager at master/10.71.71.110:8032 14/06/16 16:21:49 INFO mapreduce.Job: The url to track the job: http://gs-1695:8088/proxy/application_1402913701967_0006/ 14/06/16 16:21:49 INFO mapreduce.Job: Running job: job_1402913701967_0006 14/06/16 16:21:54 INFO mapreduce.Job: Job job_1402913701967_0006 running in uber mode : false 14/06/16 16:21:54 INFO mapreduce.Job: map 0% reduce 0% 14/06/16 16:21:54 INFO mapreduce.Job: Job job_1402913701967_0006 failed with state FAILED due to: Application application_1402913701967_0006 failed 2 times due to AM Container for appattempt_1402913701967_0006_02 exited with exitCode: 1 due to: Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException: at org.apache.hadoop.util.Shell.runCommand(Shell.java:464) at org.apache.hadoop.util.Shell.run(Shell.java:379) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) .Failing this attempt.. Failing the application. 14/06/16 16:21:54 INFO mapreduce.Job: Counters: 0 Can you please help me in fixing this ? Thanks, ~Kedar -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5898) distcp to support preserving HDFS extended attributes(XAttrs)
[ https://issues.apache.org/jira/browse/MAPREDUCE-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032352#comment-14032352 ] Hudson commented on MAPREDUCE-5898: --- SUCCESS: Integrated in Hadoop-Yarn-trunk #585 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/585/]) Moved CHANGES.txt entries of MAPREDUCE-5898, MAPREDUCE-5920, HDFS-6464, HDFS-6375 from trunk to 2.5 section on merging HDFS-2006 to branch-2 (umamahesh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1602699) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt distcp to support preserving HDFS extended attributes(XAttrs) - Key: MAPREDUCE-5898 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5898 Project: Hadoop Map/Reduce Issue Type: Bug Components: distcp Affects Versions: 3.0.0 Reporter: Uma Maheswara Rao G Assignee: Yi Liu Fix For: 3.0.0, 2.5.0 Attachments: MAPREDUCE-5898.1.patch, MAPREDUCE-5898.patch This JIRA is to track the distcp support to handle the Xattrs with preserving options. Add a new command line argument to support that. -- This message was sent by Atlassian JIRA (v6.2#6252)
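For context on the feature itself: once this change is in, xattr preservation is expected to be exposed through DistCp's -p (preserve) flag, with 'x' as the option letter for extended attributes per the documentation update tracked in MAPREDUCE-5920. A hypothetical invocation, with placeholder cluster names and paths, would look like: hadoop distcp -px hdfs://src-cluster/data hdfs://dest-cluster/data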
[jira] [Commented] (MAPREDUCE-5920) Add Xattr option in DistCp docs
[ https://issues.apache.org/jira/browse/MAPREDUCE-5920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032355#comment-14032355 ] Hudson commented on MAPREDUCE-5920: --- SUCCESS: Integrated in Hadoop-Yarn-trunk #585 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/585/]) Moved CHANGES.txt entries of MAPREDUCE-5898, MAPREDUCE-5920, HDFS-6464, HDFS-6375 from trunk to 2.5 section on merging HDFS-2006 to branch-2 (umamahesh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1602699) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt Add Xattr option in DistCp docs Key: MAPREDUCE-5920 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5920 Project: Hadoop Map/Reduce Issue Type: Bug Components: distcp, documentation Affects Versions: 3.0.0 Reporter: Uma Maheswara Rao G Assignee: Yi Liu Priority: Minor Fix For: 3.0.0, 2.5.0 Attachments: MAPREDUCE-5920.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers
Niels Basjes created MAPREDUCE-5928:
---
Summary: Deadlock allocating containers for mappers and reducers
Key: MAPREDUCE-5928
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928
Project: Hadoop Map/Reduce
Issue Type: Bug
Environment: Hadoop 2.4.0 (as packaged by HortonWorks in HDP 2.1.2)
Reporter: Niels Basjes

I have a small cluster consisting of 8 desktop class systems (1 master + 7 workers). Due to the small memory of these systems I configured yarn as follows:
{quote}
yarn.nodemanager.resource.memory-mb = 2200
yarn.scheduler.minimum-allocation-mb = 250
{quote}
On my client I did
{quote}
mapreduce.map.memory.mb = 512
mapreduce.reduce.memory.mb = 512
{quote}
Now I run a job with 27 mappers and 32 reducers. After a while I saw this deadlock occur:
- All nodes had been filled to their maximum capacity with reducers.
- 1 Mapper was waiting for a container slot to start in.
I tried killing reducer attempts but that didn't help (new reducer attempts simply took the existing container).
*Workaround*: I set this value from my job. The default value is 0.05 (= 5%)
{quote}
mapreduce.job.reduce.slowstart.completedmaps = 0.99f
{quote}
--
This message was sent by Atlassian JIRA (v6.2#6252)
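A sketch of how the workaround above can be applied per job; the class name is illustrative only, and the same setting can also be passed on the command line as -Dmapreduce.job.reduce.slowstart.completedmaps=0.99 when the driver goes through ToolRunner:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// Illustrative only: hold reducers back until 99% of the maps have completed,
// so reducers cannot occupy every container while maps are still waiting.
public class SlowstartWorkaroundExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.setFloat("mapreduce.job.reduce.slowstart.completedmaps", 0.99f);
    Job job = Job.getInstance(conf, "slowstart workaround example");
    // ... configure mapper, reducer, input and output paths as usual, then submit ...
  }
}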
[jira] [Updated] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers
[ https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated MAPREDUCE-5928: Attachment: MR job stuck in deadlock.png.jpg Deadlock allocating containers for mappers and reducers --- Key: MAPREDUCE-5928 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928 Project: Hadoop Map/Reduce Issue Type: Bug Environment: Hadoop 2.4.0 (as packaged by HortonWorks in HDP 2.1.2) Reporter: Niels Basjes Attachments: MR job stuck in deadlock.png.jpg I have a small cluster consisting of 8 desktop class systems (1 master + 7 workers). Due to the small memory of these systems I configured yarn as follows: {quote} yarn.nodemanager.resource.memory-mb = 2200 yarn.scheduler.minimum-allocation-mb = 250 {quote} On my client I did {quote} mapreduce.map.memory.mb = 512 mapreduce.reduce.memory.mb = 512 {quote} Now I run a job with 27 mappers and 32 reducers. After a while I saw this deadlock occur: - All nodes had been filled to their maximum capacity with reducers. - 1 Mapper was waiting for a container slot to start in. I tried killing reducer attempts but that didn't help (new reducer attempts simply took the existing container). *Workaround*: I set this value from my job. The default value is 0.05 (= 5%) {quote} mapreduce.job.reduce.slowstart.completedmaps = 0.99f {quote} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers
[ https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated MAPREDUCE-5928: Attachment: Cluster fully loaded.png.jpg NOTE: Node2 had issues so the system took it offline (0 containers). Perhaps this is what confused the MapReduce application? Deadlock allocating containers for mappers and reducers --- Key: MAPREDUCE-5928 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928 Project: Hadoop Map/Reduce Issue Type: Bug Environment: Hadoop 2.4.0 (as packaged by HortonWorks in HDP 2.1.2) Reporter: Niels Basjes Attachments: Cluster fully loaded.png.jpg, MR job stuck in deadlock.png.jpg I have a small cluster consisting of 8 desktop class systems (1 master + 7 workers). Due to the small memory of these systems I configured yarn as follows: {quote} yarn.nodemanager.resource.memory-mb = 2200 yarn.scheduler.minimum-allocation-mb = 250 {quote} On my client I did {quote} mapreduce.map.memory.mb = 512 mapreduce.reduce.memory.mb = 512 {quote} Now I run a job with 27 mappers and 32 reducers. After a while I saw this deadlock occur: - All nodes had been filled to their maximum capacity with reducers. - 1 Mapper was waiting for a container slot to start in. I tried killing reducer attempts but that didn't help (new reducer attempts simply took the existing container). *Workaround*: I set this value from my job. The default value is 0.05 (= 5%) {quote} mapreduce.job.reduce.slowstart.completedmaps = 0.99f {quote} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5920) Add Xattr option in DistCp docs
[ https://issues.apache.org/jira/browse/MAPREDUCE-5920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032451#comment-14032451 ] Hudson commented on MAPREDUCE-5920: --- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1776 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1776/]) Moved CHANGES.txt entries of MAPREDUCE-5898, MAPREDUCE-5920, HDFS-6464, HDFS-6375 from trunk to 2.5 section on merging HDFS-2006 to branch-2 (umamahesh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1602699) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt Add Xattr option in DistCp docs Key: MAPREDUCE-5920 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5920 Project: Hadoop Map/Reduce Issue Type: Bug Components: distcp, documentation Affects Versions: 3.0.0 Reporter: Uma Maheswara Rao G Assignee: Yi Liu Priority: Minor Fix For: 3.0.0, 2.5.0 Attachments: MAPREDUCE-5920.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5018) Support raw binary data with Hadoop streaming
[ https://issues.apache.org/jira/browse/MAPREDUCE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Willis updated MAPREDUCE-5018: - Target Version/s: 1.1.2, trunk (was: trunk, 1.1.2) Status: Patch Available (was: Open) Support raw binary data with Hadoop streaming - Key: MAPREDUCE-5018 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5018 Project: Hadoop Map/Reduce Issue Type: New Feature Components: contrib/streaming Affects Versions: 1.1.2, trunk Reporter: Jay Hacker Assignee: Steven Willis Priority: Minor Attachments: MAPREDUCE-5018-branch-1.1.patch, MAPREDUCE-5018.patch, MAPREDUCE-5018.patch, justbytes.jar, mapstream People often have a need to run older programs over many files, and turn to Hadoop streaming as a reliable, performant batch system. There are good reasons for this: 1. Hadoop is convenient: they may already be using it for mapreduce jobs, and it is easy to spin up a cluster in the cloud. 2. It is reliable: HDFS replicates data and the scheduler retries failed jobs. 3. It is reasonably performant: it moves the code to the data, maintaining locality, and scales with the number of nodes. Historically Hadoop is of course oriented toward processing key/value pairs, and so needs to interpret the data passing through it. Unfortunately, this makes it difficult to use Hadoop streaming with programs that don't deal in key/value pairs, or with binary data in general. For example, something as simple as running md5sum to verify the integrity of files will not give the correct result, due to Hadoop's interpretation of the data. There have been several attempts at binary serialization schemes for Hadoop streaming, such as TypedBytes (HADOOP-1722); however, these are still aimed at efficiently encoding key/value pairs, and not passing data through unmodified. Even the RawBytes serialization scheme adds length fields to the data, rendering it not-so-raw. I often have a need to run a Unix filter on files stored in HDFS; currently, the only way I can do this on the raw data is to copy the data out and run the filter on one machine, which is inconvenient, slow, and unreliable. It would be very convenient to run the filter as a map-only job, allowing me to build on existing (well-tested!) building blocks in the Unix tradition instead of reimplementing them as mapreduce programs. However, most existing tools don't know about file splits, and so want to process whole files; and of course many expect raw binary input and output. The solution is to run a map-only job with an InputFormat and OutputFormat that just pass raw bytes and don't split. It turns out to be a little more complicated with streaming; I have attached a patch with the simplest solution I could come up with. I call the format JustBytes (as RawBytes was already taken), and it should be usable with most recent versions of Hadoop. -- This message was sent by Atlassian JIRA (v6.2#6252)
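The attached patch itself is not reproduced in this digest, but the idea described in the last paragraph above (a map-only job whose input format refuses to split and hands each mapper a file's raw bytes) looks roughly like the sketch below. The class names are invented for illustration; this is not the JustBytes code from the attachment.

import java.io.IOException;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

// Illustrative whole-file input format: one split per file, one record per file,
// value = the file's raw bytes. Assumes each file fits comfortably in memory.
public class WholeFileInputFormat extends FileInputFormat<NullWritable, BytesWritable> {

  @Override
  protected boolean isSplitable(JobContext context, Path file) {
    return false;  // never split, so each mapper sees a complete file
  }

  @Override
  public RecordReader<NullWritable, BytesWritable> createRecordReader(
      InputSplit split, TaskAttemptContext context) {
    return new WholeFileRecordReader();
  }

  static class WholeFileRecordReader extends RecordReader<NullWritable, BytesWritable> {
    private FileSplit split;
    private TaskAttemptContext context;
    private final BytesWritable value = new BytesWritable();
    private boolean processed = false;

    @Override
    public void initialize(InputSplit split, TaskAttemptContext context) {
      this.split = (FileSplit) split;
      this.context = context;
    }

    @Override
    public boolean nextKeyValue() throws IOException {
      if (processed) {
        return false;
      }
      // Read the entire file into the value buffer in one shot.
      byte[] contents = new byte[(int) split.getLength()];
      Path file = split.getPath();
      FileSystem fs = file.getFileSystem(context.getConfiguration());
      FSDataInputStream in = fs.open(file);
      try {
        IOUtils.readFully(in, contents, 0, contents.length);
        value.set(contents, 0, contents.length);
      } finally {
        IOUtils.closeStream(in);
      }
      processed = true;
      return true;
    }

    @Override
    public NullWritable getCurrentKey() { return NullWritable.get(); }

    @Override
    public BytesWritable getCurrentValue() { return value; }

    @Override
    public float getProgress() { return processed ? 1.0f : 0.0f; }

    @Override
    public void close() { }
  }
}

A matching output format would write the value bytes verbatim, with no keys, separators, or length prefixes; the streaming-specific wiring is the part the attached patch covers.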
[jira] [Commented] (MAPREDUCE-5898) distcp to support preserving HDFS extended attributes(XAttrs)
[ https://issues.apache.org/jira/browse/MAPREDUCE-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14032448#comment-14032448 ] Hudson commented on MAPREDUCE-5898: --- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1776 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1776/]) Moved CHANGES.txt entries of MAPREDUCE-5898, MAPREDUCE-5920, HDFS-6464, HDFS-6375 from trunk to 2.5 section on merging HDFS-2006 to branch-2 (umamahesh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1602699) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt distcp to support preserving HDFS extended attributes(XAttrs) - Key: MAPREDUCE-5898 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5898 Project: Hadoop Map/Reduce Issue Type: Bug Components: distcp Affects Versions: 3.0.0 Reporter: Uma Maheswara Rao G Assignee: Yi Liu Fix For: 3.0.0, 2.5.0 Attachments: MAPREDUCE-5898.1.patch, MAPREDUCE-5898.patch This JIRA to track the distcp support to handle the Xattrs with preserving options. Add new command line argument to support that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5018) Support raw binary data with Hadoop streaming
[ https://issues.apache.org/jira/browse/MAPREDUCE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Willis updated MAPREDUCE-5018: - Target Version/s: 1.1.2, trunk (was: trunk, 1.1.2) Status: Open (was: Patch Available) Support raw binary data with Hadoop streaming - Key: MAPREDUCE-5018 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5018 Project: Hadoop Map/Reduce Issue Type: New Feature Components: contrib/streaming Affects Versions: 1.1.2, trunk Reporter: Jay Hacker Assignee: Steven Willis Priority: Minor Attachments: MAPREDUCE-5018-branch-1.1.patch, MAPREDUCE-5018.patch, MAPREDUCE-5018.patch, justbytes.jar, mapstream People often have a need to run older programs over many files, and turn to Hadoop streaming as a reliable, performant batch system. There are good reasons for this: 1. Hadoop is convenient: they may already be using it for mapreduce jobs, and it is easy to spin up a cluster in the cloud. 2. It is reliable: HDFS replicates data and the scheduler retries failed jobs. 3. It is reasonably performant: it moves the code to the data, maintaining locality, and scales with the number of nodes. Historically Hadoop is of course oriented toward processing key/value pairs, and so needs to interpret the data passing through it. Unfortunately, this makes it difficult to use Hadoop streaming with programs that don't deal in key/value pairs, or with binary data in general. For example, something as simple as running md5sum to verify the integrity of files will not give the correct result, due to Hadoop's interpretation of the data. There have been several attempts at binary serialization schemes for Hadoop streaming, such as TypedBytes (HADOOP-1722); however, these are still aimed at efficiently encoding key/value pairs, and not passing data through unmodified. Even the RawBytes serialization scheme adds length fields to the data, rendering it not-so-raw. I often have a need to run a Unix filter on files stored in HDFS; currently, the only way I can do this on the raw data is to copy the data out and run the filter on one machine, which is inconvenient, slow, and unreliable. It would be very convenient to run the filter as a map-only job, allowing me to build on existing (well-tested!) building blocks in the Unix tradition instead of reimplementing them as mapreduce programs. However, most existing tools don't know about file splits, and so want to process whole files; and of course many expect raw binary input and output. The solution is to run a map-only job with an InputFormat and OutputFormat that just pass raw bytes and don't split. It turns out to be a little more complicated with streaming; I have attached a patch with the simplest solution I could come up with. I call the format JustBytes (as RawBytes was already taken), and it should be usable with most recent versions of Hadoop. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers
[ https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14032473#comment-14032473 ] Jason Lowe commented on MAPREDUCE-5928: --- This sounds like a bug in either headroom calculation or in RMContainerAllocator where the AM decides whether to preempt reducers. Could you look in the AM log and see what it saw for the headroom and whether it made any attempt at all to ramp down reducers? Deadlock allocating containers for mappers and reducers --- Key: MAPREDUCE-5928 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928 Project: Hadoop Map/Reduce Issue Type: Bug Environment: Hadoop 2.4.0 (as packaged by HortonWorks in HDP 2.1.2) Reporter: Niels Basjes Attachments: Cluster fully loaded.png.jpg, MR job stuck in deadlock.png.jpg I have a small cluster consisting of 8 desktop class systems (1 master + 7 workers). Due to the small memory of these systems I configured yarn as follows: {quote} yarn.nodemanager.resource.memory-mb = 2200 yarn.scheduler.minimum-allocation-mb = 250 {quote} On my client I did {quote} mapreduce.map.memory.mb = 512 mapreduce.reduce.memory.mb = 512 {quote} Now I run a job with 27 mappers and 32 reducers. After a while I saw this deadlock occur: - All nodes had been filled to their maximum capacity with reducers. - 1 Mapper was waiting for a container slot to start in. I tried killing reducer attempts but that didn't help (new reducer attempts simply took the existing container). *Workaround*: I set this value from my job. The default value is 0.05 (= 5%) {quote} mapreduce.job.reduce.slowstart.completedmaps = 0.99f {quote} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers
[ https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14032476#comment-14032476 ] Niels Basjes commented on MAPREDUCE-5928: - I'm not the only one who ran into this: http://hortonworks.com/community/forums/topic/mapreduce-race-condition-big-job/ Deadlock allocating containers for mappers and reducers --- Key: MAPREDUCE-5928 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928 Project: Hadoop Map/Reduce Issue Type: Bug Environment: Hadoop 2.4.0 (as packaged by HortonWorks in HDP 2.1.2) Reporter: Niels Basjes Attachments: Cluster fully loaded.png.jpg, MR job stuck in deadlock.png.jpg I have a small cluster consisting of 8 desktop class systems (1 master + 7 workers). Due to the small memory of these systems I configured yarn as follows: {quote} yarn.nodemanager.resource.memory-mb = 2200 yarn.scheduler.minimum-allocation-mb = 250 {quote} On my client I did {quote} mapreduce.map.memory.mb = 512 mapreduce.reduce.memory.mb = 512 {quote} Now I run a job with 27 mappers and 32 reducers. After a while I saw this deadlock occur: - All nodes had been filled to their maximum capacity with reducers. - 1 Mapper was waiting for a container slot to start in. I tried killing reducer attempts but that didn't help (new reducer attempts simply took the existing container). *Workaround*: I set this value from my job. The default value is 0.05 (= 5%) {quote} mapreduce.job.reduce.slowstart.completedmaps = 0.99f {quote} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Reopened] (MAPREDUCE-5927) Getting following error
[ https://issues.apache.org/jira/browse/MAPREDUCE-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reopened MAPREDUCE-5927: --- Getting following error --- Key: MAPREDUCE-5927 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5927 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster Reporter: Kedar Dixit Assignee: Vinod Kumar Vavilapalli Priority: Blocker Hi, I am getting following error, while running application on cluser - 14/06/16 16:21:48 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this. 14/06/16 16:21:49 INFO input.FileInputFormat: Total input paths to process : 1 14/06/16 16:21:49 INFO mapreduce.JobSubmitter: number of splits:1 14/06/16 16:21:49 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name 14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir 14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.working.dir is deprecated. 
Instead, use mapreduce.job.working.dir 14/06/16 16:21:49 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1402913701967_0006 14/06/16 16:21:49 INFO impl.YarnClientImpl: Submitted application application_1402913701967_0006 to ResourceManager at master/10.71.71.110:8032 14/06/16 16:21:49 INFO mapreduce.Job: The url to track the job: http://gs-1695:8088/proxy/application_1402913701967_0006/ 14/06/16 16:21:49 INFO mapreduce.Job: Running job: job_1402913701967_0006 14/06/16 16:21:54 INFO mapreduce.Job: Job job_1402913701967_0006 running in uber mode : false 14/06/16 16:21:54 INFO mapreduce.Job: map 0% reduce 0% 14/06/16 16:21:54 INFO mapreduce.Job: Job job_1402913701967_0006 failed with state FAILED due to: Application application_1402913701967_0006 failed 2 times due to AM Container for appattempt_1402913701967_0006_02 exited with exitCode: 1 due to: Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException: at org.apache.hadoop.util.Shell.runCommand(Shell.java:464) at org.apache.hadoop.util.Shell.run(Shell.java:379) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) .Failing this attempt.. Failing the application. 14/06/16 16:21:54 INFO mapreduce.Job: Counters: 0 Can you please help me in fixing this ? Thanks, ~Kedar -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-5927) Getting following error
[ https://issues.apache.org/jira/browse/MAPREDUCE-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe resolved MAPREDUCE-5927. --- Resolution: Fixed Getting following error --- Key: MAPREDUCE-5927 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5927 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster Reporter: Kedar Dixit Assignee: Vinod Kumar Vavilapalli Priority: Blocker Hi, I am getting following error, while running application on cluser - 14/06/16 16:21:48 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this. 14/06/16 16:21:49 INFO input.FileInputFormat: Total input paths to process : 1 14/06/16 16:21:49 INFO mapreduce.JobSubmitter: number of splits:1 14/06/16 16:21:49 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name 14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir 14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.working.dir is deprecated. 
Instead, use mapreduce.job.working.dir 14/06/16 16:21:49 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1402913701967_0006 14/06/16 16:21:49 INFO impl.YarnClientImpl: Submitted application application_1402913701967_0006 to ResourceManager at master/10.71.71.110:8032 14/06/16 16:21:49 INFO mapreduce.Job: The url to track the job: http://gs-1695:8088/proxy/application_1402913701967_0006/ 14/06/16 16:21:49 INFO mapreduce.Job: Running job: job_1402913701967_0006 14/06/16 16:21:54 INFO mapreduce.Job: Job job_1402913701967_0006 running in uber mode : false 14/06/16 16:21:54 INFO mapreduce.Job: map 0% reduce 0% 14/06/16 16:21:54 INFO mapreduce.Job: Job job_1402913701967_0006 failed with state FAILED due to: Application application_1402913701967_0006 failed 2 times due to AM Container for appattempt_1402913701967_0006_02 exited with exitCode: 1 due to: Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException: at org.apache.hadoop.util.Shell.runCommand(Shell.java:464) at org.apache.hadoop.util.Shell.run(Shell.java:379) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) .Failing this attempt.. Failing the application. 14/06/16 16:21:54 INFO mapreduce.Job: Counters: 0 Can you please help me in fixing this ? Thanks, ~Kedar -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-5927) Getting following error
[ https://issues.apache.org/jira/browse/MAPREDUCE-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe resolved MAPREDUCE-5927. --- Resolution: Invalid This is a general support question better asked on the u...@hadoop.apache.org list. JIRA is for tracking bugs and features in Hadoop and not a general user support channel. In this case the ApplicationMaster is crashing shortly after startup. You'll need to examine the ApplicationMaster log to determine what happened -- click on the tracking URL and then from there go to the AM logs link or you can also use the yarn logs command if log aggregation is enabled on your cluster. Getting following error --- Key: MAPREDUCE-5927 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5927 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster Reporter: Kedar Dixit Assignee: Vinod Kumar Vavilapalli Priority: Blocker Hi, I am getting following error, while running application on cluser - 14/06/16 16:21:48 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this. 14/06/16 16:21:49 INFO input.FileInputFormat: Total input paths to process : 1 14/06/16 16:21:49 INFO mapreduce.JobSubmitter: number of splits:1 14/06/16 16:21:49 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name 14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir 14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.working.dir is deprecated. 
Instead, use mapreduce.job.working.dir 14/06/16 16:21:49 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1402913701967_0006 14/06/16 16:21:49 INFO impl.YarnClientImpl: Submitted application application_1402913701967_0006 to ResourceManager at master/10.71.71.110:8032 14/06/16 16:21:49 INFO mapreduce.Job: The url to track the job: http://gs-1695:8088/proxy/application_1402913701967_0006/ 14/06/16 16:21:49 INFO mapreduce.Job: Running job: job_1402913701967_0006 14/06/16 16:21:54 INFO mapreduce.Job: Job job_1402913701967_0006 running in uber mode : false 14/06/16 16:21:54 INFO mapreduce.Job: map 0% reduce 0% 14/06/16 16:21:54 INFO mapreduce.Job: Job job_1402913701967_0006 failed with state FAILED due to: Application application_1402913701967_0006 failed 2 times due to AM Container for appattempt_1402913701967_0006_02 exited with exitCode: 1 due to: Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException: at org.apache.hadoop.util.Shell.runCommand(Shell.java:464) at org.apache.hadoop.util.Shell.run(Shell.java:379) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
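For anyone hitting the same failure: the AM log that the resolution comment points to can be fetched with the YARN CLI when log aggregation is enabled, for example yarn logs -applicationId application_1402913701967_0006 (add -appOwner <user> if you are not the submitting user). The exception that made the AM exit with code 1 should appear in that container's stderr/syslog rather than in the client output quoted above.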
[jira] [Commented] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers
[ https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032496#comment-14032496 ] Jonathan Eagles commented on MAPREDUCE-5928: I think this is a case of task preemption not working since the headroom calculation is not correct. Can you verify you are using the capacity scheduler? See YARN-1198. Deadlock allocating containers for mappers and reducers --- Key: MAPREDUCE-5928 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928 Project: Hadoop Map/Reduce Issue Type: Bug Environment: Hadoop 2.4.0 (as packaged by HortonWorks in HDP 2.1.2) Reporter: Niels Basjes Attachments: Cluster fully loaded.png.jpg, MR job stuck in deadlock.png.jpg I have a small cluster consisting of 8 desktop class systems (1 master + 7 workers). Due to the small memory of these systems I configured yarn as follows: {quote} yarn.nodemanager.resource.memory-mb = 2200 yarn.scheduler.minimum-allocation-mb = 250 {quote} On my client I did {quote} mapreduce.map.memory.mb = 512 mapreduce.reduce.memory.mb = 512 {quote} Now I run a job with 27 mappers and 32 reducers. After a while I saw this deadlock occur: - All nodes had been filled to their maximum capacity with reducers. - 1 Mapper was waiting for a container slot to start in. I tried killing reducer attempts but that didn't help (new reducer attempts simply took the existing container). *Workaround*: I set this value from my job. The default value is 0.05 (= 5%) {quote} mapreduce.job.reduce.slowstart.completedmaps = 0.99f {quote} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers
[ https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032503#comment-14032503 ] Jason Lowe commented on MAPREDUCE-5928: --- I'm wondering if the fact that the nodemanager memory has a fractional remainder when it's full triggers the issue. With tasks all being 512MB that means each node will have 152MB remaining. I'm guessing that with enough nodes those remainders will add up to appear to be enough space to run another task, but in reality that task cannot be scheduled since the memory being reported is fragmented. Deadlock allocating containers for mappers and reducers --- Key: MAPREDUCE-5928 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928 Project: Hadoop Map/Reduce Issue Type: Bug Environment: Hadoop 2.4.0 (as packaged by HortonWorks in HDP 2.1.2) Reporter: Niels Basjes Attachments: Cluster fully loaded.png.jpg, MR job stuck in deadlock.png.jpg I have a small cluster consisting of 8 desktop class systems (1 master + 7 workers). Due to the small memory of these systems I configured yarn as follows: {quote} yarn.nodemanager.resource.memory-mb = 2200 yarn.scheduler.minimum-allocation-mb = 250 {quote} On my client I did {quote} mapreduce.map.memory.mb = 512 mapreduce.reduce.memory.mb = 512 {quote} Now I run a job with 27 mappers and 32 reducers. After a while I saw this deadlock occur: - All nodes had been filled to their maximum capacity with reducers. - 1 Mapper was waiting for a container slot to start in. I tried killing reducer attempts but that didn't help (new reducer attempts simply took the existing container). *Workaround*: I set this value from my job. The default value is 0.05 (= 5%) {quote} mapreduce.job.reduce.slowstart.completedmaps = 0.99f {quote} -- This message was sent by Atlassian JIRA (v6.2#6252)
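To make the arithmetic in this comment explicit (ignoring the AM's own container): with yarn.nodemanager.resource.memory-mb = 2200 and uniform 512 MB tasks, each node fits floor(2200 / 512) = 4 containers and strands 2200 - 4 * 512 = 152 MB. Across the 7 worker nodes that is 7 * 152 = 1064 MB of reported headroom, which looks like room for two more 512 MB containers even though no single node can actually place one. That would match the stuck-mapper symptom described in the issue.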
[jira] [Updated] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers
[ https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated MAPREDUCE-5928: Attachment: AM-MR-syslog - Cleaned.txt.gz I downloaded the Application Master log and attached it to this issue. (I changed the domainname of the nodes) Deadlock allocating containers for mappers and reducers --- Key: MAPREDUCE-5928 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928 Project: Hadoop Map/Reduce Issue Type: Bug Environment: Hadoop 2.4.0 (as packaged by HortonWorks in HDP 2.1.2) Reporter: Niels Basjes Attachments: AM-MR-syslog - Cleaned.txt.gz, Cluster fully loaded.png.jpg, MR job stuck in deadlock.png.jpg I have a small cluster consisting of 8 desktop class systems (1 master + 7 workers). Due to the small memory of these systems I configured yarn as follows: {quote} yarn.nodemanager.resource.memory-mb = 2200 yarn.scheduler.minimum-allocation-mb = 250 {quote} On my client I did {quote} mapreduce.map.memory.mb = 512 mapreduce.reduce.memory.mb = 512 {quote} Now I run a job with 27 mappers and 32 reducers. After a while I saw this deadlock occur: - All nodes had been filled to their maximum capacity with reducers. - 1 Mapper was waiting for a container slot to start in. I tried killing reducer attempts but that didn't help (new reducer attempts simply took the existing container). *Workaround*: I set this value from my job. The default value is 0.05 (= 5%) {quote} mapreduce.job.reduce.slowstart.completedmaps = 0.99f {quote} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5018) Support raw binary data with Hadoop streaming
[ https://issues.apache.org/jira/browse/MAPREDUCE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14032514#comment-14032514 ] Hadoop QA commented on MAPREDUCE-5018: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12644886/MAPREDUCE-5018.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 2 warning messages. See https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4662//artifact/trunk/patchprocess/diffJavadocWarnings.txt for details. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-tools/hadoop-streaming. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4662//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4662//console This message is automatically generated. Support raw binary data with Hadoop streaming - Key: MAPREDUCE-5018 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5018 Project: Hadoop Map/Reduce Issue Type: New Feature Components: contrib/streaming Affects Versions: trunk, 1.1.2 Reporter: Jay Hacker Assignee: Steven Willis Priority: Minor Attachments: MAPREDUCE-5018-branch-1.1.patch, MAPREDUCE-5018.patch, MAPREDUCE-5018.patch, justbytes.jar, mapstream People often have a need to run older programs over many files, and turn to Hadoop streaming as a reliable, performant batch system. There are good reasons for this: 1. Hadoop is convenient: they may already be using it for mapreduce jobs, and it is easy to spin up a cluster in the cloud. 2. It is reliable: HDFS replicates data and the scheduler retries failed jobs. 3. It is reasonably performant: it moves the code to the data, maintaining locality, and scales with the number of nodes. Historically Hadoop is of course oriented toward processing key/value pairs, and so needs to interpret the data passing through it. Unfortunately, this makes it difficult to use Hadoop streaming with programs that don't deal in key/value pairs, or with binary data in general. For example, something as simple as running md5sum to verify the integrity of files will not give the correct result, due to Hadoop's interpretation of the data. There have been several attempts at binary serialization schemes for Hadoop streaming, such as TypedBytes (HADOOP-1722); however, these are still aimed at efficiently encoding key/value pairs, and not passing data through unmodified. Even the RawBytes serialization scheme adds length fields to the data, rendering it not-so-raw. 
I often have a need to run a Unix filter on files stored in HDFS; currently, the only way I can do this on the raw data is to copy the data out and run the filter on one machine, which is inconvenient, slow, and unreliable. It would be very convenient to run the filter as a map-only job, allowing me to build on existing (well-tested!) building blocks in the Unix tradition instead of reimplementing them as mapreduce programs. However, most existing tools don't know about file splits, and so want to process whole files; and of course many expect raw binary input and output. The solution is to run a map-only job with an InputFormat and OutputFormat that just pass raw bytes and don't split. It turns out to be a little more complicated with streaming; I have attached a patch with the simplest solution I could come up with. I call the format JustBytes (as RawBytes was already taken), and it should be usable with most recent versions of Hadoop. -- This message was sent by Atlassian JIRA (v6.2#6252)
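As an illustration of the approach described (a map-only job whose InputFormat never splits and passes raw bytes through), here is a simplified sketch; it is not the attached JustBytes code, and the class name is invented for the example:
{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public class WholeFileBytesInputFormat
    extends FileInputFormat<NullWritable, BytesWritable> {

  @Override
  protected boolean isSplitable(JobContext context, Path file) {
    return false; // never split: each map task sees exactly one whole file
  }

  @Override
  public RecordReader<NullWritable, BytesWritable> createRecordReader(
      InputSplit split, TaskAttemptContext context) {
    return new RecordReader<NullWritable, BytesWritable>() {
      private FileSplit fileSplit;
      private Configuration conf;
      private final BytesWritable value = new BytesWritable();
      private boolean done = false;

      @Override
      public void initialize(InputSplit s, TaskAttemptContext ctx) {
        fileSplit = (FileSplit) s;
        conf = ctx.getConfiguration();
      }

      @Override
      public boolean nextKeyValue() throws IOException {
        if (done) {
          return false;
        }
        // Read the entire file into a single value, unmodified.
        Path path = fileSplit.getPath();
        byte[] bytes = new byte[(int) fileSplit.getLength()];
        try (FSDataInputStream in = path.getFileSystem(conf).open(path)) {
          IOUtils.readFully(in, bytes, 0, bytes.length);
        }
        value.set(bytes, 0, bytes.length);
        done = true;
        return true;
      }

      @Override public NullWritable getCurrentKey() { return NullWritable.get(); }
      @Override public BytesWritable getCurrentValue() { return value; }
      @Override public float getProgress() { return done ? 1.0f : 0.0f; }
      @Override public void close() { }
    };
  }
}
{code}
Paired with a matching OutputFormat that writes values verbatim and a job configured with zero reducers, a sketch like this would let an external filter see each file's bytes untouched.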
[jira] [Commented] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers
[ https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14032516#comment-14032516 ] Niels Basjes commented on MAPREDUCE-5928: - I have not actively configured any scheduling. So I guess it is running the 'default' setting ? Deadlock allocating containers for mappers and reducers --- Key: MAPREDUCE-5928 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928 Project: Hadoop Map/Reduce Issue Type: Bug Environment: Hadoop 2.4.0 (as packaged by HortonWorks in HDP 2.1.2) Reporter: Niels Basjes Attachments: AM-MR-syslog - Cleaned.txt.gz, Cluster fully loaded.png.jpg, MR job stuck in deadlock.png.jpg I have a small cluster consisting of 8 desktop class systems (1 master + 7 workers). Due to the small memory of these systems I configured yarn as follows: {quote} yarn.nodemanager.resource.memory-mb = 2200 yarn.scheduler.minimum-allocation-mb = 250 {quote} On my client I did {quote} mapreduce.map.memory.mb = 512 mapreduce.reduce.memory.mb = 512 {quote} Now I run a job with 27 mappers and 32 reducers. After a while I saw this deadlock occur: - All nodes had been filled to their maximum capacity with reducers. - 1 Mapper was waiting for a container slot to start in. I tried killing reducer attempts but that didn't help (new reducer attempts simply took the existing container). *Workaround*: I set this value from my job. The default value is 0.05 (= 5%) {quote} mapreduce.job.reduce.slowstart.completedmaps = 0.99f {quote} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5898) distcp to support preserving HDFS extended attributes(XAttrs)
[ https://issues.apache.org/jira/browse/MAPREDUCE-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14032532#comment-14032532 ] Hudson commented on MAPREDUCE-5898: --- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1803 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1803/]) Moved CHANGES.txt entries of MAPREDUCE-5898, MAPREDUCE-5920, HDFS-6464, HDFS-6375 from trunk to 2.5 section on merging HDFS-2006 to branch-2 (umamahesh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1602699) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt distcp to support preserving HDFS extended attributes(XAttrs) - Key: MAPREDUCE-5898 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5898 Project: Hadoop Map/Reduce Issue Type: Bug Components: distcp Affects Versions: 3.0.0 Reporter: Uma Maheswara Rao G Assignee: Yi Liu Fix For: 3.0.0, 2.5.0 Attachments: MAPREDUCE-5898.1.patch, MAPREDUCE-5898.patch This JIRA to track the distcp support to handle the Xattrs with preserving options. Add new command line argument to support that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5920) Add Xattr option in DistCp docs
[ https://issues.apache.org/jira/browse/MAPREDUCE-5920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14032535#comment-14032535 ] Hudson commented on MAPREDUCE-5920: --- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1803 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1803/]) Moved CHANGES.txt entries of MAPREDUCE-5898, MAPREDUCE-5920, HDFS-6464, HDFS-6375 from trunk to 2.5 section on merging HDFS-2006 to branch-2 (umamahesh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1602699) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt Add Xattr option in DistCp docs Key: MAPREDUCE-5920 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5920 Project: Hadoop Map/Reduce Issue Type: Bug Components: distcp, documentation Affects Versions: 3.0.0 Reporter: Uma Maheswara Rao G Assignee: Yi Liu Priority: Minor Fix For: 3.0.0, 2.5.0 Attachments: MAPREDUCE-5920.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers
[ https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14032545#comment-14032545 ] Jason Lowe commented on MAPREDUCE-5928: --- I'm pretty sure you're using the CapacityScheduler since that's been the default in Apache Hadoop for some time now. I'm not positive about the HDP release, but I suspect it, too, is configured to use the CapacityScheduler by default. After a quick examination of the AM log, it looks like a couple of things are going on. The AM is blacklisting one of the nodes, and we can see that node not being used in the cluster picture. There's a known issue with headroom calculation not taking into account blacklisted nodes. See YARN-1680. The node ends up being blacklisted because the NM shot a number of the tasks for being over container limits. It looks like the containers are being allocated as using 500MB but the JVM heap sizes are set to 512MB. Note that the container size includes the size of the entire process tree for the task. That's not just the heap, so it needs to also include thread stacks, JVM data, JVM code, any subprocesses launched (e.g.: hadoop streaming) etc. If you really need a 512MB heap then I'd allocate 768MB or maybe even 1024MB containers, depending on what the tasks are doing. It does look like some fractional memory shenanigans could be involved here, as the picture shows most of the nodes having only 200MB free. It'd be interesting to know if you still hit the deadlock after fixing the cause of the blacklisting. Deadlock allocating containers for mappers and reducers --- Key: MAPREDUCE-5928 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928 Project: Hadoop Map/Reduce Issue Type: Bug Environment: Hadoop 2.4.0 (as packaged by HortonWorks in HDP 2.1.2) Reporter: Niels Basjes Attachments: AM-MR-syslog - Cleaned.txt.gz, Cluster fully loaded.png.jpg, MR job stuck in deadlock.png.jpg I have a small cluster consisting of 8 desktop class systems (1 master + 7 workers). Due to the small memory of these systems I configured yarn as follows: {quote} yarn.nodemanager.resource.memory-mb = 2200 yarn.scheduler.minimum-allocation-mb = 250 {quote} On my client I did {quote} mapreduce.map.memory.mb = 512 mapreduce.reduce.memory.mb = 512 {quote} Now I run a job with 27 mappers and 32 reducers. After a while I saw this deadlock occur: - All nodes had been filled to their maximum capacity with reducers. - 1 Mapper was waiting for a container slot to start in. I tried killing reducer attempts but that didn't help (new reducer attempts simply took the existing container). *Workaround*: I set this value from my job. The default value is 0.05 (= 5%) {quote} mapreduce.job.reduce.slowstart.completedmaps = 0.99f {quote} -- This message was sent by Atlassian JIRA (v6.2#6252)
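To make the sizing advice above concrete, a hedged sketch of the client-side settings (the values are illustrative; the right numbers depend on the job):
{code}
// Sketch: keep the JVM heap at 512MB but request a larger container, since
// the container limit covers the whole process tree (heap, thread stacks,
// JVM code/data, any child processes), not just -Xmx.
Configuration conf = new Configuration();
conf.setInt("mapreduce.map.memory.mb", 768);        // container size for maps
conf.set("mapreduce.map.java.opts", "-Xmx512m");    // heap stays at 512MB
conf.setInt("mapreduce.reduce.memory.mb", 768);     // container size for reduces
conf.set("mapreduce.reduce.java.opts", "-Xmx512m");
{code}
With yarn.scheduler.minimum-allocation-mb = 250 the scheduler would presumably normalize such requests to a multiple of 250MB, so the fractional-remainder effect discussed earlier is worth rechecking after the change.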
[jira] [Commented] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers
[ https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14032564#comment-14032564 ] Niels Basjes commented on MAPREDUCE-5928: - I took the 'dead' node (node2) offline (completely stopped all hadoop/yarn related daemons) and ran the same job again after it had disappeared from all overviews. Now it does complete all mappers. Deadlock allocating containers for mappers and reducers --- Key: MAPREDUCE-5928 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928 Project: Hadoop Map/Reduce Issue Type: Bug Environment: Hadoop 2.4.0 (as packaged by HortonWorks in HDP 2.1.2) Reporter: Niels Basjes Attachments: AM-MR-syslog - Cleaned.txt.gz, Cluster fully loaded.png.jpg, MR job stuck in deadlock.png.jpg I have a small cluster consisting of 8 desktop class systems (1 master + 7 workers). Due to the small memory of these systems I configured yarn as follows: {quote} yarn.nodemanager.resource.memory-mb = 2200 yarn.scheduler.minimum-allocation-mb = 250 {quote} On my client I did {quote} mapreduce.map.memory.mb = 512 mapreduce.reduce.memory.mb = 512 {quote} Now I run a job with 27 mappers and 32 reducers. After a while I saw this deadlock occur: - All nodes had been filled to their maximum capacity with reducers. - 1 Mapper was waiting for a container slot to start in. I tried killing reducer attempts but that didn't help (new reducer attempts simply took the existing container). *Workaround*: I set this value from my job. The default value is 0.05 (= 5%) {quote} mapreduce.job.reduce.slowstart.completedmaps = 0.99f {quote} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (MAPREDUCE-5929) YARNRunner.java, path for jobJarPath not set correctly
Chao Tian created MAPREDUCE-5929: Summary: YARNRunner.java, path for jobJarPath not set correctly Key: MAPREDUCE-5929 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5929 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.2.0 Reporter: Chao Tian In YARNRunner.java, line 357 reads: Path jobJarPath = new Path(jobConf.get(MRJobConfig.JAR)); This causes the job.jar path to be missing the scheme, host and port number on distributed file systems other than HDFS. If we compare line 357 with line 344, the job.xml path there is actually set as Path jobConfPath = new Path(jobSubmitDir,MRJobConfig.JOB_CONF_FILE); It appears jobSubmitDir is missing on line 357, which causes this problem. On HDFS, the additional path-qualification step corrects the problem, but not on other generic distributed file systems. The proposed change is to replace line 357 with Path jobJarPath = new Path(jobConf.get(jobSubmitDir,MRJobConfig.JAR)); -- This message was sent by Atlassian JIRA (v6.2#6252)
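For illustration only (this is not the actual YARNRunner source; the staging-directory value and default shown are examples), the difference between the two constructions is roughly:
{code}
// Hypothetical sketch contrasting the two path constructions discussed above.
Configuration conf = new Configuration();
// A fully qualified staging directory, e.g. on a non-HDFS file system:
Path jobSubmitDir = new Path("viewfs://cluster/user/alice/.staging/job_0001");

// Like line 344: resolved against the already-qualified staging directory,
// so the scheme/host/port of the submission file system are preserved.
Path jobConfPath = new Path(jobSubmitDir, "job.xml");

// Like line 357 as reported: built only from the raw configuration value
// ("mapreduce.job.jar"), which may be an unqualified path; HDFS later
// qualifies it, but other file systems may not.
Path jobJarPath = new Path(conf.get("mapreduce.job.jar", "job.jar"));
{code}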
[jira] [Commented] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers
[ https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14032766#comment-14032766 ] Niels Basjes commented on MAPREDUCE-5928: - Where/how can I determine for sure if the capacity scheduler is used? Deadlock allocating containers for mappers and reducers --- Key: MAPREDUCE-5928 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928 Project: Hadoop Map/Reduce Issue Type: Bug Environment: Hadoop 2.4.0 (as packaged by HortonWorks in HDP 2.1.2) Reporter: Niels Basjes Attachments: AM-MR-syslog - Cleaned.txt.gz, Cluster fully loaded.png.jpg, MR job stuck in deadlock.png.jpg I have a small cluster consisting of 8 desktop class systems (1 master + 7 workers). Due to the small memory of these systems I configured yarn as follows: {quote} yarn.nodemanager.resource.memory-mb = 2200 yarn.scheduler.minimum-allocation-mb = 250 {quote} On my client I did {quote} mapreduce.map.memory.mb = 512 mapreduce.reduce.memory.mb = 512 {quote} Now I run a job with 27 mappers and 32 reducers. After a while I saw this deadlock occur: - All nodes had been filled to their maximum capacity with reducers. - 1 Mapper was waiting for a container slot to start in. I tried killing reducer attempts but that didn't help (new reducer attempts simply took the existing container). *Workaround*: I set this value from my job. The default value is 0.05 (= 5%) {quote} mapreduce.job.reduce.slowstart.completedmaps = 0.99f {quote} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5926) Support utf-8 text with BOM (byte order marker) for branch-1
[ https://issues.apache.org/jira/browse/MAPREDUCE-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14032773#comment-14032773 ] Karthik Kambatla commented on MAPREDUCE-5926: - [~zxu] - can we do this as part of MAPREDUCE-5777 itself? You can annotate the patch with branch-1. Support utf-8 text with BOM (byte order marker) for branch-1 Key: MAPREDUCE-5926 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5926 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Reporter: zhihai xu Assignee: zhihai xu Attachments: MAPREDUCE-5926.000.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5926) Support utf-8 text with BOM (byte order marker) for branch-1
[ https://issues.apache.org/jira/browse/MAPREDUCE-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14032789#comment-14032789 ] zhihai xu commented on MAPREDUCE-5926: -- Karthik - Ok, I will do that. Thanks for the information. Support utf-8 text with BOM (byte order marker) for branch-1 Key: MAPREDUCE-5926 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5926 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mrv1 Reporter: zhihai xu Assignee: zhihai xu Attachments: MAPREDUCE-5926.000.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers
[ https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14032802#comment-14032802 ] Jason Lowe commented on MAPREDUCE-5928: --- You can click on the Tools-Configuration link on the UI and verify yarn.resourcemanager.scheduler.class is org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler or look for capacityscheduler in the RM logs. Deadlock allocating containers for mappers and reducers --- Key: MAPREDUCE-5928 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928 Project: Hadoop Map/Reduce Issue Type: Bug Environment: Hadoop 2.4.0 (as packaged by HortonWorks in HDP 2.1.2) Reporter: Niels Basjes Attachments: AM-MR-syslog - Cleaned.txt.gz, Cluster fully loaded.png.jpg, MR job stuck in deadlock.png.jpg I have a small cluster consisting of 8 desktop class systems (1 master + 7 workers). Due to the small memory of these systems I configured yarn as follows: {quote} yarn.nodemanager.resource.memory-mb = 2200 yarn.scheduler.minimum-allocation-mb = 250 {quote} On my client I did {quote} mapreduce.map.memory.mb = 512 mapreduce.reduce.memory.mb = 512 {quote} Now I run a job with 27 mappers and 32 reducers. After a while I saw this deadlock occur: - All nodes had been filled to their maximum capacity with reducers. - 1 Mapper was waiting for a container slot to start in. I tried killing reducer attempts but that didn't help (new reducer attempts simply took the existing container). *Workaround*: I set this value from my job. The default value is 0.05 (= 5%) {quote} mapreduce.job.reduce.slowstart.completedmaps = 0.99f {quote} -- This message was sent by Atlassian JIRA (v6.2#6252)
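A quick programmatic cross-check is also possible (a sketch; the web UI and RM logs described above remain the authoritative source, since this only reads the configuration files visible on the local classpath):
{code}
// Sketch: print the scheduler class resolved from yarn-site.xml /
// yarn-default.xml on the local classpath.
Configuration conf = new YarnConfiguration();
System.out.println(conf.get("yarn.resourcemanager.scheduler.class"));
// Expect ...scheduler.capacity.CapacityScheduler when the CapacityScheduler is in use.
{code}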
[jira] [Commented] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers
[ https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14032831#comment-14032831 ] Niels Basjes commented on MAPREDUCE-5928: - Confirmed, it is using the CapacityScheduler:
{code}
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
  <source>yarn-default.xml</source>
</property>
{code}
I'm going to fiddle with the memory setting tomorrow. Deadlock allocating containers for mappers and reducers --- Key: MAPREDUCE-5928 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928 Project: Hadoop Map/Reduce Issue Type: Bug Environment: Hadoop 2.4.0 (as packaged by HortonWorks in HDP 2.1.2) Reporter: Niels Basjes Attachments: AM-MR-syslog - Cleaned.txt.gz, Cluster fully loaded.png.jpg, MR job stuck in deadlock.png.jpg I have a small cluster consisting of 8 desktop class systems (1 master + 7 workers). Due to the small memory of these systems I configured yarn as follows: {quote} yarn.nodemanager.resource.memory-mb = 2200 yarn.scheduler.minimum-allocation-mb = 250 {quote} On my client I did {quote} mapreduce.map.memory.mb = 512 mapreduce.reduce.memory.mb = 512 {quote} Now I run a job with 27 mappers and 32 reducers. After a while I saw this deadlock occur: - All nodes had been filled to their maximum capacity with reducers. - 1 Mapper was waiting for a container slot to start in. I tried killing reducer attempts but that didn't help (new reducer attempts simply took the existing container). *Workaround*: I set this value from my job. The default value is 0.05 (= 5%) {quote} mapreduce.job.reduce.slowstart.completedmaps = 0.99f {quote} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive
[ https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14033114#comment-14033114 ] Karthik Kambatla commented on MAPREDUCE-5844: - Thanks for updating the patch, Maysam. A few comments: # Unfortunately, RMContainerAllocator and RMContainerRequestor are not annotated as @Private classes. So, all the fields/methods that are made accessible should have a @Private annotation in addition to the @VisibleForTesting annotation. # By moving TestRMContainerAllocator to be in the same package as the above two files, we can limit the visibility to package-private instead of public. Can you please check if that is straightforward? # Can we combine the following two statements into one?
{code}
allocationDelayThresholdMs = conf.getInt(
    MRJobConfig.MR_JOB_REDUCER_PREEMPT_DELAY_SEC,
    MRJobConfig.DEFAULT_MR_JOB_REDUCER_PREEMPT_DELAY_SEC);
allocationDelayThresholdMs *= 1000; // sec -> ms
{code}
# Nit: Rename setMapResourceReqt and setReduceResourceReqt to end in Request instead of Reqt? # Nit: In the tests, can we use a smaller sleep time? Also, instead of sleeping for an extra second, can we sleep for the exact time and then check if the reducer gets preempted in a loop with much smaller sleep? YARN/MR should use a Clock so tests don't have to actually sleep for that long. Reducer Preemption is too aggressive Key: MAPREDUCE-5844 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5844 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Maysam Yabandeh Assignee: Maysam Yabandeh Attachments: MAPREDUCE-5844.patch, MAPREDUCE-5844.patch We observed cases where the reducer preemption makes the job finish much later, and the preemption does not seem to be necessary since after preemption both the preempted reducer and the mapper are assigned immediately--meaning that there was already enough space for the mapper. The logic for triggering preemption is in RMContainerAllocator::preemptReducesIfNeeded. The preemption is triggered if the following is true:
{code}
headroom + am * |m| + pr * |r| < mapResourceRequest
{code}
where am is the number of assigned mappers, |m| is the mapper size, pr is the number of reducers being preempted, and |r| is the reducer size. The original idea apparently was that if headroom is not big enough for the new mapper requests, reducers should be preempted. This would work if the job is alone in the cluster. Once we have queues, the headroom calculation becomes more complicated and it would require a separate headroom calculation per queue/job. So, as a result, the headroom variable is effectively given up on currently: *headroom is always set to 0* What this implies is that preemption becomes very aggressive, not considering whether there is enough space for the mappers or not. -- This message was sent by Atlassian JIRA (v6.2#6252)
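The combined form asked for in the third comment would presumably look something like this (illustrative only, reusing the constants quoted above):
{code}
// Read the preemption delay in seconds and convert to milliseconds in one statement.
allocationDelayThresholdMs = 1000 * conf.getInt(
    MRJobConfig.MR_JOB_REDUCER_PREEMPT_DELAY_SEC,
    MRJobConfig.DEFAULT_MR_JOB_REDUCER_PREEMPT_DELAY_SEC); // sec -> ms
{code}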
[jira] [Updated] (MAPREDUCE-5844) Reducer Preemption is too aggressive
[ https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maysam Yabandeh updated MAPREDUCE-5844: --- Attachment: MAPREDUCE-5844.patch Thanks [~kasha] for the comments. I am attaching a new patch that has them applied. I was thinking about a proper name for setReduceResourceReqt. On one hand, by changing it to setReduceResourceRequest it becomes more readable. On the other hand, by using setReduceResourceReqt we adhere to the java standard for naming getters and setter (here reduceResourceReqt). I am more inclined towards the latter and I was wondering if you are ok with that. Reducer Preemption is too aggressive Key: MAPREDUCE-5844 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5844 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Maysam Yabandeh Assignee: Maysam Yabandeh Attachments: MAPREDUCE-5844.patch, MAPREDUCE-5844.patch, MAPREDUCE-5844.patch We observed cases where the reducer preemption makes the job finish much later, and the preemption does not seem to be necessary since after preemption both the preempted reducer and the mapper are assigned immediately--meaning that there was already enough space for the mapper. The logic for triggering preemption is at RMContainerAllocator::preemptReducesIfNeeded The preemption is triggered if the following is true: {code} headroom + am * |m| + pr * |r| mapResourceRequest {code} where am: number of assigned mappers, |m| is mapper size, pr is number of reducers being preempted, and |r| is the reducer size. The original idea apparently was that if headroom is not big enough for the new mapper requests, reducers should be preempted. This would work if the job is alone in the cluster. Once we have queues, the headroom calculation becomes more complicated and it would require a separate headroom calculation per queue/job. So, as a result headroom variable is kind of given up currently: *headroom is always set to 0* What this implies to the speculation is that speculation becomes very aggressive, not considering whether there is enough space for the mappers or not. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive
[ https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14033288#comment-14033288 ] Karthik Kambatla commented on MAPREDUCE-5844: - Can we change the field names also to end in Request instead of Reqt? Reducer Preemption is too aggressive Key: MAPREDUCE-5844 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5844 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Maysam Yabandeh Assignee: Maysam Yabandeh Attachments: MAPREDUCE-5844.patch, MAPREDUCE-5844.patch, MAPREDUCE-5844.patch We observed cases where the reducer preemption makes the job finish much later, and the preemption does not seem to be necessary since after preemption both the preempted reducer and the mapper are assigned immediately--meaning that there was already enough space for the mapper. The logic for triggering preemption is at RMContainerAllocator::preemptReducesIfNeeded The preemption is triggered if the following is true: {code} headroom + am * |m| + pr * |r| mapResourceRequest {code} where am: number of assigned mappers, |m| is mapper size, pr is number of reducers being preempted, and |r| is the reducer size. The original idea apparently was that if headroom is not big enough for the new mapper requests, reducers should be preempted. This would work if the job is alone in the cluster. Once we have queues, the headroom calculation becomes more complicated and it would require a separate headroom calculation per queue/job. So, as a result headroom variable is kind of given up currently: *headroom is always set to 0* What this implies to the speculation is that speculation becomes very aggressive, not considering whether there is enough space for the mappers or not. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive
[ https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14033298#comment-14033298 ] Hadoop QA commented on MAPREDUCE-5844: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12650692/MAPREDUCE-5844.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4663//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4663//console This message is automatically generated. Reducer Preemption is too aggressive Key: MAPREDUCE-5844 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5844 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Maysam Yabandeh Assignee: Maysam Yabandeh Attachments: MAPREDUCE-5844.patch, MAPREDUCE-5844.patch, MAPREDUCE-5844.patch We observed cases where the reducer preemption makes the job finish much later, and the preemption does not seem to be necessary since after preemption both the preempted reducer and the mapper are assigned immediately--meaning that there was already enough space for the mapper. The logic for triggering preemption is at RMContainerAllocator::preemptReducesIfNeeded The preemption is triggered if the following is true: {code} headroom + am * |m| + pr * |r| mapResourceRequest {code} where am: number of assigned mappers, |m| is mapper size, pr is number of reducers being preempted, and |r| is the reducer size. The original idea apparently was that if headroom is not big enough for the new mapper requests, reducers should be preempted. This would work if the job is alone in the cluster. Once we have queues, the headroom calculation becomes more complicated and it would require a separate headroom calculation per queue/job. So, as a result headroom variable is kind of given up currently: *headroom is always set to 0* What this implies to the speculation is that speculation becomes very aggressive, not considering whether there is enough space for the mappers or not. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5844) Reducer Preemption is too aggressive
[ https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maysam Yabandeh updated MAPREDUCE-5844: --- Attachment: MAPREDUCE-5844.patch Attaching the patch that also updates the variables' names: reduceResourceRequest and mapResourceRequest Reducer Preemption is too aggressive Key: MAPREDUCE-5844 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5844 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Maysam Yabandeh Assignee: Maysam Yabandeh Attachments: MAPREDUCE-5844.patch, MAPREDUCE-5844.patch, MAPREDUCE-5844.patch, MAPREDUCE-5844.patch We observed cases where the reducer preemption makes the job finish much later, and the preemption does not seem to be necessary since after preemption both the preempted reducer and the mapper are assigned immediately--meaning that there was already enough space for the mapper. The logic for triggering preemption is at RMContainerAllocator::preemptReducesIfNeeded The preemption is triggered if the following is true: {code} headroom + am * |m| + pr * |r| mapResourceRequest {code} where am: number of assigned mappers, |m| is mapper size, pr is number of reducers being preempted, and |r| is the reducer size. The original idea apparently was that if headroom is not big enough for the new mapper requests, reducers should be preempted. This would work if the job is alone in the cluster. Once we have queues, the headroom calculation becomes more complicated and it would require a separate headroom calculation per queue/job. So, as a result headroom variable is kind of given up currently: *headroom is always set to 0* What this implies to the speculation is that speculation becomes very aggressive, not considering whether there is enough space for the mappers or not. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Moved] (MAPREDUCE-5930) Document MapReduce metrics
[ https://issues.apache.org/jira/browse/MAPREDUCE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA moved HDFS-6550 to MAPREDUCE-5930: Component/s: (was: documentation) documentation Key: MAPREDUCE-5930 (was: HDFS-6550) Project: Hadoop Map/Reduce (was: Hadoop HDFS) Document MapReduce metrics -- Key: MAPREDUCE-5930 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5930 Project: Hadoop Map/Reduce Issue Type: Improvement Components: documentation Reporter: Akira AJISAKA Assignee: Akira AJISAKA MapReduce-side of HADOOP-6350. Add MapReduce metrics to Metrics document. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5930) Document MapReduce metrics
[ https://issues.apache.org/jira/browse/MAPREDUCE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1407#comment-1407 ] Akira AJISAKA commented on MAPREDUCE-5930: -- Moved to the correct (MapReduce) project. Document MapReduce metrics -- Key: MAPREDUCE-5930 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5930 Project: Hadoop Map/Reduce Issue Type: Improvement Components: documentation Reporter: Akira AJISAKA Assignee: Akira AJISAKA MapReduce-side of HADOOP-6350. Add MapReduce metrics to Metrics document. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-4065) Add .proto files to built tarball
[ https://issues.apache.org/jira/browse/MAPREDUCE-4065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi OZAWA updated MAPREDUCE-4065: -- Assignee: Tsuyoshi OZAWA Affects Version/s: 2.4.0 Status: Patch Available (was: Open) Add .proto files to built tarball - Key: MAPREDUCE-4065 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4065 Project: Hadoop Map/Reduce Issue Type: Bug Components: build Affects Versions: 0.23.2, 2.4.0 Reporter: Ralph H Castain Assignee: Tsuyoshi OZAWA Attachments: MAPREDUCE-4065.1.patch Please add the .proto files to the built tarball so that users can build 3rd party tools that use protocol buffers without having to do an svn checkout of the source code. Sorry I don't know more about Maven, or I would provide a patch. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-4065) Add .proto files to built tarball
[ https://issues.apache.org/jira/browse/MAPREDUCE-4065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi OZAWA updated MAPREDUCE-4065: -- Attachment: MAPREDUCE-4065.1.patch Included the *.proto files in the built tarball. The directory hierarchy is as follows:
{code}
hadoop-3.0.0-SNAPSHOT$ tree proto/
proto/
|-- hadoop-common
|   |-- GenericRefreshProtocol.proto
|   |-- GetUserMappingsProtocol.proto
|   |-- HAServiceProtocol.proto
|   |-- IpcConnectionContext.proto
|   |-- ProtobufRpcEngine.proto
|   |-- ProtocolInfo.proto
|   |-- RefreshAuthorizationPolicyProtocol.proto
|   |-- RefreshCallQueueProtocol.proto
|   |-- RefreshUserMappingsProtocol.proto
|   |-- RpcHeader.proto
|   |-- Security.proto
|   `-- ZKFCProtocol.proto
|-- hadoop-hdfs
|   |-- ClientDatanodeProtocol.proto
|   |-- ClientNamenodeProtocol.proto
|   |-- DatanodeProtocol.proto
|   |-- HAZKInfo.proto
|   |-- InterDatanodeProtocol.proto
|   |-- JournalProtocol.proto
|   |-- NamenodeProtocol.proto
|   |-- QJournalProtocol.proto
|   |-- acl.proto
|   |-- datatransfer.proto
|   |-- fsimage.proto
|   |-- hdfs.proto
|   `-- xattr.proto
|-- hadoop-mapreduce-client
|   |-- MRClientProtocol.proto
|   |-- mr_protos.proto
|   `-- mr_service_protos.proto
|-- hadoop-yarn-api
|   |-- application_history_client.proto
|   |-- applicationclient_protocol.proto
|   |-- applicationmaster_protocol.proto
|   |-- containermanagement_protocol.proto
|   |-- yarn_protos.proto
|   `-- yarn_service_protos.proto
`-- hadoop-yarn-server-common
{code}
Add .proto files to built tarball - Key: MAPREDUCE-4065 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4065 Project: Hadoop Map/Reduce Issue Type: Bug Components: build Affects Versions: 0.23.2, 2.4.0 Reporter: Ralph H Castain Attachments: MAPREDUCE-4065.1.patch Please add the .proto files to the built tarball so that users can build 3rd party tools that use protocol buffers without having to do an svn checkout of the source code. Sorry I don't know more about Maven, or I would provide a patch. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive
[ https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14033350#comment-14033350 ] Hadoop QA commented on MAPREDUCE-5844: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12650707/MAPREDUCE-5844.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4664//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4664//console This message is automatically generated. Reducer Preemption is too aggressive Key: MAPREDUCE-5844 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5844 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Maysam Yabandeh Assignee: Maysam Yabandeh Attachments: MAPREDUCE-5844.patch, MAPREDUCE-5844.patch, MAPREDUCE-5844.patch, MAPREDUCE-5844.patch We observed cases where the reducer preemption makes the job finish much later, and the preemption does not seem to be necessary since after preemption both the preempted reducer and the mapper are assigned immediately--meaning that there was already enough space for the mapper. The logic for triggering preemption is at RMContainerAllocator::preemptReducesIfNeeded The preemption is triggered if the following is true: {code} headroom + am * |m| + pr * |r| mapResourceRequest {code} where am: number of assigned mappers, |m| is mapper size, pr is number of reducers being preempted, and |r| is the reducer size. The original idea apparently was that if headroom is not big enough for the new mapper requests, reducers should be preempted. This would work if the job is alone in the cluster. Once we have queues, the headroom calculation becomes more complicated and it would require a separate headroom calculation per queue/job. So, as a result headroom variable is kind of given up currently: *headroom is always set to 0* What this implies to the speculation is that speculation becomes very aggressive, not considering whether there is enough space for the mappers or not. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-4065) Add .proto files to built tarball
[ https://issues.apache.org/jira/browse/MAPREDUCE-4065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14033379#comment-14033379 ] Hadoop QA commented on MAPREDUCE-4065: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12650714/MAPREDUCE-4065.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-dist. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4665//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4665//console This message is automatically generated. Add .proto files to built tarball - Key: MAPREDUCE-4065 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4065 Project: Hadoop Map/Reduce Issue Type: Bug Components: build Affects Versions: 0.23.2, 2.4.0 Reporter: Ralph H Castain Assignee: Tsuyoshi OZAWA Attachments: MAPREDUCE-4065.1.patch Please add the .proto files to the built tarball so that users can build 3rd party tools that use protocol buffers without having to do an svn checkout of the source code. Sorry I don't know more about Maven, or I would provide a patch. -- This message was sent by Atlassian JIRA (v6.2#6252)