[jira] [Created] (MAPREDUCE-5927) Getting following error

2014-06-16 Thread Kedar Dixit (JIRA)
Kedar Dixit created MAPREDUCE-5927:
--

 Summary: Getting following error
 Key: MAPREDUCE-5927
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5927
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Reporter: Kedar Dixit
Priority: Blocker






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5927) Getting following error


2014-06-16 Thread Kedar Dixit (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kedar Dixit updated MAPREDUCE-5927:
---

Description: 
14/06/16 16:21:48 WARN mapreduce.JobSubmitter: Hadoop command-line option 
parsing not performed. Implement the Tool interface and execute your 
application with ToolRunner to remedy this.
14/06/16 16:21:49 INFO input.FileInputFormat: Total input paths to process : 1
14/06/16 16:21:49 INFO mapreduce.JobSubmitter: number of splits:1
14/06/16 16:21:49 INFO Configuration.deprecation: user.name is deprecated. 
Instead, use mapreduce.job.user.name
14/06/16 16:21:49 INFO Configuration.deprecation: mapred.jar is deprecated. 
Instead, use mapreduce.job.jar
14/06/16 16:21:49 INFO Configuration.deprecation: mapred.reduce.tasks is 
deprecated. Instead, use mapreduce.job.reduces
14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.value.class is 
deprecated. Instead, use mapreduce.job.output.value.class
14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.map.class is 
deprecated. Instead, use mapreduce.job.map.class
14/06/16 16:21:49 INFO Configuration.deprecation: mapred.job.name is 
deprecated. Instead, use mapreduce.job.name
14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.inputformat.class 
is deprecated. Instead, use mapreduce.job.inputformat.class
14/06/16 16:21:49 INFO Configuration.deprecation: mapred.input.dir is 
deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.dir is 
deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.outputformat.class 
is deprecated. Instead, use mapreduce.job.outputformat.class
14/06/16 16:21:49 INFO Configuration.deprecation: mapred.map.tasks is 
deprecated. Instead, use mapreduce.job.maps
14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.key.class is 
deprecated. Instead, use mapreduce.job.output.key.class
14/06/16 16:21:49 INFO Configuration.deprecation: mapred.working.dir is 
deprecated. Instead, use mapreduce.job.working.dir
14/06/16 16:21:49 INFO mapreduce.JobSubmitter: Submitting tokens for job: 
job_1402913701967_0006
14/06/16 16:21:49 INFO impl.YarnClientImpl: Submitted application 
application_1402913701967_0006 to ResourceManager at master/10.71.71.110:8032
14/06/16 16:21:49 INFO mapreduce.Job: The url to track the job: 
http://gs-1695:8088/proxy/application_1402913701967_0006/
14/06/16 16:21:49 INFO mapreduce.Job: Running job: job_1402913701967_0006
14/06/16 16:21:54 INFO mapreduce.Job: Job job_1402913701967_0006 running in 
uber mode : false
14/06/16 16:21:54 INFO mapreduce.Job:  map 0% reduce 0%
14/06/16 16:21:54 INFO mapreduce.Job: Job job_1402913701967_0006 failed with 
state FAILED due to: Application application_1402913701967_0006 failed 2 times 
due to AM Container for appattempt_1402913701967_0006_02 exited with  
exitCode: 1 due to: Exception from container-launch:
org.apache.hadoop.util.Shell$ExitCodeException:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
at org.apache.hadoop.util.Shell.run(Shell.java:379)
at 
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
at 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)


.Failing this attempt.. Failing the application.
14/06/16 16:21:54 INFO mapreduce.Job: Counters: 0
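For an AM container that dies with exitCode: 1 and no further detail, the stderr of the failed container is usually the fastest diagnostic; with log aggregation enabled (yarn.log-aggregation-enable=true) it can be pulled with the `yarn logs` CLI. A minimal sketch, using the application id from the log above:

```shell
# Build the log-retrieval command for the failed application.
# Requires log aggregation to be enabled on the cluster; otherwise the
# container's stderr lives under the NodeManager's local log directories.
app_id="application_1402913701967_0006"
cmd="yarn logs -applicationId ${app_id}"
echo "${cmd}"
```

Without aggregation, the same stderr can be found on the worker that ran the AM attempt, under yarn.nodemanager.log-dirs.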



[jira] [Updated] (MAPREDUCE-5927) Getting following error

2014-06-16 Thread Kedar Dixit (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kedar Dixit updated MAPREDUCE-5927:
---

Description: 
Hi,

I am getting the following error while running an application on the cluster:

14/06/16 16:21:48 WARN mapreduce.JobSubmitter: Hadoop command-line option 
parsing not performed. Implement the Tool interface and execute your 
application with ToolRunner to remedy this.
14/06/16 16:21:49 INFO input.FileInputFormat: Total input paths to process : 1
14/06/16 16:21:49 INFO mapreduce.JobSubmitter: number of splits:1
14/06/16 16:21:49 INFO Configuration.deprecation: user.name is deprecated. 
Instead, use mapreduce.job.user.name
14/06/16 16:21:49 INFO Configuration.deprecation: mapred.jar is deprecated. 
Instead, use mapreduce.job.jar
14/06/16 16:21:49 INFO Configuration.deprecation: mapred.reduce.tasks is 
deprecated. Instead, use mapreduce.job.reduces
14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.value.class is 
deprecated. Instead, use mapreduce.job.output.value.class
14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.map.class is 
deprecated. Instead, use mapreduce.job.map.class
14/06/16 16:21:49 INFO Configuration.deprecation: mapred.job.name is 
deprecated. Instead, use mapreduce.job.name
14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.inputformat.class 
is deprecated. Instead, use mapreduce.job.inputformat.class
14/06/16 16:21:49 INFO Configuration.deprecation: mapred.input.dir is 
deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.dir is 
deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.outputformat.class 
is deprecated. Instead, use mapreduce.job.outputformat.class
14/06/16 16:21:49 INFO Configuration.deprecation: mapred.map.tasks is 
deprecated. Instead, use mapreduce.job.maps
14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.key.class is 
deprecated. Instead, use mapreduce.job.output.key.class
14/06/16 16:21:49 INFO Configuration.deprecation: mapred.working.dir is 
deprecated. Instead, use mapreduce.job.working.dir
14/06/16 16:21:49 INFO mapreduce.JobSubmitter: Submitting tokens for job: 
job_1402913701967_0006
14/06/16 16:21:49 INFO impl.YarnClientImpl: Submitted application 
application_1402913701967_0006 to ResourceManager at master/10.71.71.110:8032
14/06/16 16:21:49 INFO mapreduce.Job: The url to track the job: 
http://gs-1695:8088/proxy/application_1402913701967_0006/
14/06/16 16:21:49 INFO mapreduce.Job: Running job: job_1402913701967_0006
14/06/16 16:21:54 INFO mapreduce.Job: Job job_1402913701967_0006 running in 
uber mode : false
14/06/16 16:21:54 INFO mapreduce.Job:  map 0% reduce 0%
14/06/16 16:21:54 INFO mapreduce.Job: Job job_1402913701967_0006 failed with 
state FAILED due to: Application application_1402913701967_0006 failed 2 times 
due to AM Container for appattempt_1402913701967_0006_02 exited with  
exitCode: 1 due to: Exception from container-launch:
org.apache.hadoop.util.Shell$ExitCodeException:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
at org.apache.hadoop.util.Shell.run(Shell.java:379)
at 
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
at 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)


.Failing this attempt.. Failing the application.
14/06/16 16:21:54 INFO mapreduce.Job: Counters: 0


Can you please help me fix this?

Thanks,
~Kedar



[jira] [Commented] (MAPREDUCE-5927) Getting following error

2014-06-16 Thread Kedar Dixit (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032335#comment-14032335
 ] 

Kedar Dixit commented on MAPREDUCE-5927:


My .bashrc has following confs.

export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-amd64/
export HADOOP_HOME=/home/gslab/setup/hadoop-2.2.0


#export JAVA_HOME=$(/usr/libexec/java_home)
export HADOOP_MAPRED_HOME=/home/gslab/setup/hadoop-2.2.0
export HADOOP_COMMON_HOME=/home/gslab/setup/hadoop-2.2.0
export HADOOP_HDFS_HOME=/home/gslab/setup/hadoop-2.2.0
export YARN_HOME=/home/gslab/setup/hadoop-2.2.0
export HADOOP_YARN_HOME=/home/gslab/setup/hadoop-2.2.0
export HADOOP_CONF_DIR=/home/gslab/setup/hadoop-2.2.0

#export MAHOUT_HOME=/home/gslab/setup/mahout-distribution-0.9
export MAHOUT_HOME=/home/gslab/setup/mahout-trunk
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$MAHOUT_HOME/bin:/$JAVA_HOME/bin
export PATH=$PATH:$MAHOUT_HOME
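One thing worth double-checking in the listing above: HADOOP_CONF_DIR points at the installation root. For an Apache Hadoop 2.2.0 tarball install, the configuration files conventionally live in the etc/hadoop subdirectory, so a more usual setting would be (paths taken from the report, shown for illustration only):

```shell
# Conventional conf dir for a Hadoop 2.x tarball install is
# $HADOOP_HOME/etc/hadoop, not the installation root itself.
HADOOP_HOME=/home/gslab/setup/hadoop-2.2.0
export HADOOP_CONF_DIR="$HADOOP_HOME/etc/hadoop"
echo "$HADOOP_CONF_DIR"
```

Incidentally, the PATH line above also contains a stray slash (`:/$JAVA_HOME/bin`), which would yield a nonexistent path component.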


[jira] [Updated] (MAPREDUCE-5927) Getting following error

2014-06-16 Thread Kedar Dixit (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kedar Dixit updated MAPREDUCE-5927:
---

Assignee: Vinod Kumar Vavilapalli






[jira] [Commented] (MAPREDUCE-5898) distcp to support preserving HDFS extended attributes(XAttrs)

2014-06-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032352#comment-14032352
 ] 

Hudson commented on MAPREDUCE-5898:
---

SUCCESS: Integrated in Hadoop-Yarn-trunk #585 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/585/])
Moved CHANGES.txt entries of MAPREDUCE-5898, MAPREDUCE-5920, HDFS-6464, 
HDFS-6375 from trunk to 2.5 section on merging HDFS-2006 to branch-2 
(umamahesh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1602699)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt


 distcp to support preserving HDFS extended attributes(XAttrs)
 -

 Key: MAPREDUCE-5898
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5898
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distcp
Affects Versions: 3.0.0
Reporter: Uma Maheswara Rao G
Assignee: Yi Liu
 Fix For: 3.0.0, 2.5.0

 Attachments: MAPREDUCE-5898.1.patch, MAPREDUCE-5898.patch


 This JIRA tracks adding distcp support for preserving HDFS extended 
 attributes (XAttrs).
 Add a new command-line argument to support that.
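The preserve flag this issue extends is distcp's -p option; with the fix (Fix For 2.5.0 above), xattrs can be preserved alongside other attributes via a -px style invocation. A hedged sketch (flag letter as introduced by this JIRA; hostnames and paths are placeholders):

```shell
# Copy while preserving permissions (p) and extended attributes (x).
# Source and destination paths below are illustrative placeholders.
src="hdfs://nn1:8020/data/in"
dst="hdfs://nn2:8020/data/in"
cmd="hadoop distcp -px ${src} ${dst}"
echo "${cmd}"
```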





[jira] [Commented] (MAPREDUCE-5920) Add Xattr option in DistCp docs

2014-06-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032355#comment-14032355
 ] 

Hudson commented on MAPREDUCE-5920:
---

SUCCESS: Integrated in Hadoop-Yarn-trunk #585 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/585/])
Moved CHANGES.txt entries of MAPREDUCE-5898, MAPREDUCE-5920, HDFS-6464, 
HDFS-6375 from trunk to 2.5 section on merging HDFS-2006 to branch-2 
(umamahesh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1602699)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt


 Add Xattr option in DistCp docs 
 

 Key: MAPREDUCE-5920
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5920
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distcp, documentation
Affects Versions: 3.0.0
Reporter: Uma Maheswara Rao G
Assignee: Yi Liu
Priority: Minor
 Fix For: 3.0.0, 2.5.0

 Attachments: MAPREDUCE-5920.patch








[jira] [Created] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers

2014-06-16 Thread Niels Basjes (JIRA)
Niels Basjes created MAPREDUCE-5928:
---

 Summary: Deadlock allocating containers for mappers and reducers
 Key: MAPREDUCE-5928
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928
 Project: Hadoop Map/Reduce
  Issue Type: Bug
 Environment: Hadoop 2.4.0 (as packaged by HortonWorks in HDP 2.1.2)
Reporter: Niels Basjes


I have a small cluster consisting of 8 desktop-class systems (1 master + 7 
workers).
Due to these systems' limited memory, I configured YARN as follows:
{quote}
yarn.nodemanager.resource.memory-mb = 2200
yarn.scheduler.minimum-allocation-mb = 250
{quote}
On my client I did
{quote}
mapreduce.map.memory.mb = 512
mapreduce.reduce.memory.mb = 512
{quote}
Now I run a job with 27 mappers and 32 reducers.
After a while I saw this deadlock occur:
-   All nodes had been filled to their maximum capacity with reducers.
-   1 Mapper was waiting for a container slot to start in.

I tried killing reducer attempts but that didn't help (new reducer attempts 
simply took the existing container).

*Workaround*:
I set this value from my job (the default is 0.05, i.e. 5%):
{quote}
mapreduce.job.reduce.slowstart.completedmaps = 0.99f
{quote}
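The numbers above make the starvation plausible: if each request is rounded up to a multiple of the minimum allocation, a 512 MB task occupies 750 MB, so a 2200 MB node holds at most two containers and the 7 workers together hold 14, fewer than the 32 reducers. A back-of-envelope check (the rounding rule is scheduler-specific; this assumes CapacityScheduler-style normalization to multiples of yarn.scheduler.minimum-allocation-mb):

```shell
# How many task containers fit per node, assuming each request is
# rounded up to a multiple of yarn.scheduler.minimum-allocation-mb.
node_mb=2200
min_alloc_mb=250
task_mb=512
rounded_mb=$(( (task_mb + min_alloc_mb - 1) / min_alloc_mb * min_alloc_mb ))
per_node=$(( node_mb / rounded_mb ))
echo "rounded=${rounded_mb}MB per_node=${per_node} cluster_total=$(( per_node * 7 ))"
```

On the command line the workaround corresponds to passing `-D mapreduce.job.reduce.slowstart.completedmaps=0.99` to a Tool-based driver.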






[jira] [Updated] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers

2014-06-16 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated MAPREDUCE-5928:


Attachment: MR job stuck in deadlock.png.jpg






[jira] [Updated] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers

2014-06-16 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated MAPREDUCE-5928:


Attachment: Cluster fully loaded.png.jpg

NOTE: Node2 had issues so the system took it offline (0 containers). 
Perhaps this is what confused the MapReduce application?






[jira] [Commented] (MAPREDUCE-5920) Add Xattr option in DistCp docs

2014-06-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032451#comment-14032451
 ] 

Hudson commented on MAPREDUCE-5920:
---

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1776 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1776/])
Moved CHANGES.txt entries of MAPREDUCE-5898, MAPREDUCE-5920, HDFS-6464, 
HDFS-6375 from trunk to 2.5 section on merging HDFS-2006 to branch-2 
(umamahesh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1602699)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt









[jira] [Updated] (MAPREDUCE-5018) Support raw binary data with Hadoop streaming

2014-06-16 Thread Steven Willis (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Willis updated MAPREDUCE-5018:
-

Target Version/s: 1.1.2, trunk  (was: trunk, 1.1.2)
  Status: Patch Available  (was: Open)

 Support raw binary data with Hadoop streaming
 -

 Key: MAPREDUCE-5018
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5018
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: contrib/streaming
Affects Versions: 1.1.2, trunk
Reporter: Jay Hacker
Assignee: Steven Willis
Priority: Minor
 Attachments: MAPREDUCE-5018-branch-1.1.patch, MAPREDUCE-5018.patch, 
 MAPREDUCE-5018.patch, justbytes.jar, mapstream


 People often have a need to run older programs over many files, and turn to 
 Hadoop streaming as a reliable, performant batch system.  There are good 
 reasons for this:
 1. Hadoop is convenient: they may already be using it for mapreduce jobs, and 
 it is easy to spin up a cluster in the cloud.
 2. It is reliable: HDFS replicates data and the scheduler retries failed jobs.
 3. It is reasonably performant: it moves the code to the data, maintaining 
 locality, and scales with the number of nodes.
 Historically Hadoop is of course oriented toward processing key/value pairs, 
 and so needs to interpret the data passing through it.  Unfortunately, this 
 makes it difficult to use Hadoop streaming with programs that don't deal in 
 key/value pairs, or with binary data in general.  For example, something as 
 simple as running md5sum to verify the integrity of files will not give the 
 correct result, due to Hadoop's interpretation of the data.  
 There have been several attempts at binary serialization schemes for Hadoop 
 streaming, such as TypedBytes (HADOOP-1722); however, these are still aimed 
 at efficiently encoding key/value pairs, and not passing data through 
 unmodified.  Even the RawBytes serialization scheme adds length fields to 
 the data, rendering it not-so-raw.
 I often have a need to run a Unix filter on files stored in HDFS; currently, 
 the only way I can do this on the raw data is to copy the data out and run 
 the filter on one machine, which is inconvenient, slow, and unreliable.  It 
 would be very convenient to run the filter as a map-only job, allowing me to 
 build on existing (well-tested!) building blocks in the Unix tradition 
 instead of reimplementing them as mapreduce programs.
 However, most existing tools don't know about file splits, and so want to 
 process whole files; and of course many expect raw binary input and output.  
 The solution is to run a map-only job with an InputFormat and OutputFormat 
 that just pass raw bytes and don't split.  It turns out to be a little more 
 complicated with streaming; I have attached a patch with the simplest 
 solution I could come up with.  I call the format JustBytes (as RawBytes 
 was already taken), and it should be usable with most recent versions of 
 Hadoop.
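The description above boils down to a map-only streaming job with non-splitting byte-passthrough input/output formats. A sketch of what such an invocation could look like, using the md5sum example from the text (the format class names echo the JustBytes naming but are illustrative stand-ins for the classes in the attached patch; setting reduces to 0 makes the job map-only):

```shell
# Map-only streaming job running an arbitrary Unix filter over whole,
# unsplit files. Format class names are illustrative placeholders for
# the JustBytes formats attached to this issue.
cmd="hadoop jar hadoop-streaming.jar -D mapreduce.job.reduces=0 \
-inputformat org.example.JustBytesInputFormat \
-outputformat org.example.JustBytesOutputFormat \
-input /data/files -output /data/md5 -mapper md5sum"
echo "${cmd}"
```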





[jira] [Commented] (MAPREDUCE-5898) distcp to support preserving HDFS extended attributes(XAttrs)

2014-06-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032448#comment-14032448
 ] 

Hudson commented on MAPREDUCE-5898:
---

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1776 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1776/])
Moved CHANGES.txt entries of MAPREDUCE-5898, MAPREDUCE-5920, HDFS-6464, 
HDFS-6375 from trunk to 2.5 section on merging HDFS-2006 to branch-2 
(umamahesh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1602699)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt


 distcp to support preserving HDFS extended attributes(XAttrs)
 -

 Key: MAPREDUCE-5898
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5898
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distcp
Affects Versions: 3.0.0
Reporter: Uma Maheswara Rao G
Assignee: Yi Liu
 Fix For: 3.0.0, 2.5.0

 Attachments: MAPREDUCE-5898.1.patch, MAPREDUCE-5898.patch


 This JIRA tracks distcp support for preserving HDFS extended attributes 
 (XAttrs).
 A new command-line argument will be added to support it.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5018) Support raw binary data with Hadoop streaming

2014-06-16 Thread Steven Willis (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Willis updated MAPREDUCE-5018:
-

Target Version/s: 1.1.2, trunk  (was: trunk, 1.1.2)
  Status: Open  (was: Patch Available)

 Support raw binary data with Hadoop streaming
 -

 Key: MAPREDUCE-5018
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5018
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: contrib/streaming
Affects Versions: 1.1.2, trunk
Reporter: Jay Hacker
Assignee: Steven Willis
Priority: Minor
 Attachments: MAPREDUCE-5018-branch-1.1.patch, MAPREDUCE-5018.patch, 
 MAPREDUCE-5018.patch, justbytes.jar, mapstream


 People often have a need to run older programs over many files, and turn to 
 Hadoop streaming as a reliable, performant batch system.  There are good 
 reasons for this:
 1. Hadoop is convenient: they may already be using it for mapreduce jobs, and 
 it is easy to spin up a cluster in the cloud.
 2. It is reliable: HDFS replicates data and the scheduler retries failed jobs.
 3. It is reasonably performant: it moves the code to the data, maintaining 
 locality, and scales with the number of nodes.
 Historically Hadoop is of course oriented toward processing key/value pairs, 
 and so needs to interpret the data passing through it.  Unfortunately, this 
 makes it difficult to use Hadoop streaming with programs that don't deal in 
 key/value pairs, or with binary data in general.  For example, something as 
 simple as running md5sum to verify the integrity of files will not give the 
 correct result, due to Hadoop's interpretation of the data.  
 There have been several attempts at binary serialization schemes for Hadoop 
 streaming, such as TypedBytes (HADOOP-1722); however, these are still aimed 
 at efficiently encoding key/value pairs, and not passing data through 
 unmodified.  Even the RawBytes serialization scheme adds length fields to 
 the data, rendering it not-so-raw.
 I often have a need to run a Unix filter on files stored in HDFS; currently, 
 the only way I can do this on the raw data is to copy the data out and run 
 the filter on one machine, which is inconvenient, slow, and unreliable.  It 
 would be very convenient to run the filter as a map-only job, allowing me to 
 build on existing (well-tested!) building blocks in the Unix tradition 
 instead of reimplementing them as mapreduce programs.
 However, most existing tools don't know about file splits, and so want to 
 process whole files; and of course many expect raw binary input and output.  
 The solution is to run a map-only job with an InputFormat and OutputFormat 
 that just pass raw bytes and don't split.  It turns out to be a little more 
 complicated with streaming; I have attached a patch with the simplest 
 solution I could come up with.  I call the format JustBytes (as RawBytes 
 was already taken), and it should be usable with most recent versions of 
 Hadoop.
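
 The non-splitting idea described above can be sketched as follows. This is an
 illustrative outline only, not the attached JustBytes patch; the class name is
 hypothetical, and `createRecordReader` is left abstract to keep the sketch short:

 ```java
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.io.BytesWritable;
 import org.apache.hadoop.io.NullWritable;
 import org.apache.hadoop.mapreduce.JobContext;
 import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

 // An InputFormat that refuses to split files, so each mapper
 // receives exactly one whole file's worth of raw bytes.
 public abstract class WholeFileInputFormat
     extends FileInputFormat<NullWritable, BytesWritable> {
   @Override
   protected boolean isSplitable(JobContext context, Path file) {
     return false;  // never split: whole files only
   }
 }
 ```

 A matching OutputFormat would likewise write values verbatim with no
 key/value separators or length prefixes.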



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers

2014-06-16 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032473#comment-14032473
 ] 

Jason Lowe commented on MAPREDUCE-5928:
---

This sounds like a bug in either the headroom calculation or in 
RMContainerAllocator, where the AM decides whether to preempt reducers.  Could 
you look in the AM log and see what it saw for the headroom and whether it made 
any attempt at all to ramp down reducers?

 Deadlock allocating containers for mappers and reducers
 ---

 Key: MAPREDUCE-5928
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928
 Project: Hadoop Map/Reduce
  Issue Type: Bug
 Environment: Hadoop 2.4.0 (as packaged by HortonWorks in HDP 2.1.2)
Reporter: Niels Basjes
 Attachments: Cluster fully loaded.png.jpg, MR job stuck in 
 deadlock.png.jpg


 I have a small cluster consisting of 8 desktop class systems (1 master + 7 
 workers).
 Due to the small memory of these systems I configured yarn as follows:
 {quote}
 yarn.nodemanager.resource.memory-mb = 2200
 yarn.scheduler.minimum-allocation-mb = 250
 {quote}
 On my client I did
 {quote}
 mapreduce.map.memory.mb = 512
 mapreduce.reduce.memory.mb = 512
 {quote}
 Now I run a job with 27 mappers and 32 reducers.
 After a while I saw this deadlock occur:
 - All nodes had been filled to their maximum capacity with reducers.
 - 1 Mapper was waiting for a container slot to start in.
 I tried killing reducer attempts but that didn't help (new reducer attempts 
 simply took the existing container).
 *Workaround*:
 I set this value from my job. The default value is 0.05 (= 5%)
 {quote}
 mapreduce.job.reduce.slowstart.completedmaps = 0.99f
 {quote}
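
 For reference, the workaround above can also be applied from the job driver
 rather than the command line; a minimal sketch using the standard
 Configuration API (class name hypothetical):

 ```java
 import org.apache.hadoop.conf.Configuration;

 public class SlowstartWorkaround {
   public static void main(String[] args) {
     Configuration conf = new Configuration();
     // Hold reducers back until 99% of maps complete, so reducers
     // cannot occupy every container while maps still need slots.
     conf.setFloat("mapreduce.job.reduce.slowstart.completedmaps", 0.99f);
     // ... pass conf to Job.getInstance(conf, ...) when submitting ...
   }
 }
 ```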



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers

2014-06-16 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032476#comment-14032476
 ] 

Niels Basjes commented on MAPREDUCE-5928:
-

I'm not the only one who ran into this: 
http://hortonworks.com/community/forums/topic/mapreduce-race-condition-big-job/

 Deadlock allocating containers for mappers and reducers
 ---

 Key: MAPREDUCE-5928
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928
 Project: Hadoop Map/Reduce
  Issue Type: Bug
 Environment: Hadoop 2.4.0 (as packaged by HortonWorks in HDP 2.1.2)
Reporter: Niels Basjes
 Attachments: Cluster fully loaded.png.jpg, MR job stuck in 
 deadlock.png.jpg


 I have a small cluster consisting of 8 desktop class systems (1 master + 7 
 workers).
 Due to the small memory of these systems I configured yarn as follows:
 {quote}
 yarn.nodemanager.resource.memory-mb = 2200
 yarn.scheduler.minimum-allocation-mb = 250
 {quote}
 On my client I did
 {quote}
 mapreduce.map.memory.mb = 512
 mapreduce.reduce.memory.mb = 512
 {quote}
 Now I run a job with 27 mappers and 32 reducers.
 After a while I saw this deadlock occur:
 - All nodes had been filled to their maximum capacity with reducers.
 - 1 Mapper was waiting for a container slot to start in.
 I tried killing reducer attempts but that didn't help (new reducer attempts 
 simply took the existing container).
 *Workaround*:
 I set this value from my job. The default value is 0.05 (= 5%)
 {quote}
 mapreduce.job.reduce.slowstart.completedmaps = 0.99f
 {quote}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Reopened] (MAPREDUCE-5927) Getting following error

2014-06-16 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe reopened MAPREDUCE-5927:
---


 Getting following error
 ---

 Key: MAPREDUCE-5927
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5927
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Reporter: Kedar Dixit
Assignee: Vinod Kumar Vavilapalli
Priority: Blocker

 Hi,
 I am getting the following error while running an application on the cluster:
 14/06/16 16:21:48 WARN mapreduce.JobSubmitter: Hadoop command-line option 
 parsing not performed. Implement the Tool interface and execute your 
 application with ToolRunner to remedy this.
 14/06/16 16:21:49 INFO input.FileInputFormat: Total input paths to process : 1
 14/06/16 16:21:49 INFO mapreduce.JobSubmitter: number of splits:1
 14/06/16 16:21:49 INFO Configuration.deprecation: user.name is deprecated. 
 Instead, use mapreduce.job.user.name
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.jar is deprecated. 
 Instead, use mapreduce.job.jar
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.reduce.tasks is 
 deprecated. Instead, use mapreduce.job.reduces
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.value.class 
 is deprecated. Instead, use mapreduce.job.output.value.class
 14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.map.class is 
 deprecated. Instead, use mapreduce.job.map.class
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.job.name is 
 deprecated. Instead, use mapreduce.job.name
 14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.inputformat.class 
 is deprecated. Instead, use mapreduce.job.inputformat.class
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.input.dir is 
 deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.dir is 
 deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
 14/06/16 16:21:49 INFO Configuration.deprecation: 
 mapreduce.outputformat.class is deprecated. Instead, use 
 mapreduce.job.outputformat.class
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.map.tasks is 
 deprecated. Instead, use mapreduce.job.maps
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.key.class is 
 deprecated. Instead, use mapreduce.job.output.key.class
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.working.dir is 
 deprecated. Instead, use mapreduce.job.working.dir
 14/06/16 16:21:49 INFO mapreduce.JobSubmitter: Submitting tokens for job: 
 job_1402913701967_0006
 14/06/16 16:21:49 INFO impl.YarnClientImpl: Submitted application 
 application_1402913701967_0006 to ResourceManager at master/10.71.71.110:8032
 14/06/16 16:21:49 INFO mapreduce.Job: The url to track the job: 
 http://gs-1695:8088/proxy/application_1402913701967_0006/
 14/06/16 16:21:49 INFO mapreduce.Job: Running job: job_1402913701967_0006
 14/06/16 16:21:54 INFO mapreduce.Job: Job job_1402913701967_0006 running in 
 uber mode : false
 14/06/16 16:21:54 INFO mapreduce.Job:  map 0% reduce 0%
 14/06/16 16:21:54 INFO mapreduce.Job: Job job_1402913701967_0006 failed with 
 state FAILED due to: Application application_1402913701967_0006 failed 2 
 times due to AM Container for appattempt_1402913701967_0006_02 exited 
 with  exitCode: 1 due to: Exception from container-launch:
 org.apache.hadoop.util.Shell$ExitCodeException:
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
 at org.apache.hadoop.util.Shell.run(Shell.java:379)
 at 
 org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
 at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:744)
 .Failing this attempt.. Failing the application.
 14/06/16 16:21:54 INFO mapreduce.Job: Counters: 0
 Can you please help me fix this?
 Thanks,
 ~Kedar



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-5927) Getting following error

2014-06-16 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-5927.
---

Resolution: Fixed

 Getting following error
 ---

 Key: MAPREDUCE-5927
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5927
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Reporter: Kedar Dixit
Assignee: Vinod Kumar Vavilapalli
Priority: Blocker

 Hi,
 I am getting the following error while running an application on the cluster:
 14/06/16 16:21:48 WARN mapreduce.JobSubmitter: Hadoop command-line option 
 parsing not performed. Implement the Tool interface and execute your 
 application with ToolRunner to remedy this.
 14/06/16 16:21:49 INFO input.FileInputFormat: Total input paths to process : 1
 14/06/16 16:21:49 INFO mapreduce.JobSubmitter: number of splits:1
 14/06/16 16:21:49 INFO Configuration.deprecation: user.name is deprecated. 
 Instead, use mapreduce.job.user.name
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.jar is deprecated. 
 Instead, use mapreduce.job.jar
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.reduce.tasks is 
 deprecated. Instead, use mapreduce.job.reduces
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.value.class 
 is deprecated. Instead, use mapreduce.job.output.value.class
 14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.map.class is 
 deprecated. Instead, use mapreduce.job.map.class
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.job.name is 
 deprecated. Instead, use mapreduce.job.name
 14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.inputformat.class 
 is deprecated. Instead, use mapreduce.job.inputformat.class
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.input.dir is 
 deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.dir is 
 deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
 14/06/16 16:21:49 INFO Configuration.deprecation: 
 mapreduce.outputformat.class is deprecated. Instead, use 
 mapreduce.job.outputformat.class
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.map.tasks is 
 deprecated. Instead, use mapreduce.job.maps
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.key.class is 
 deprecated. Instead, use mapreduce.job.output.key.class
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.working.dir is 
 deprecated. Instead, use mapreduce.job.working.dir
 14/06/16 16:21:49 INFO mapreduce.JobSubmitter: Submitting tokens for job: 
 job_1402913701967_0006
 14/06/16 16:21:49 INFO impl.YarnClientImpl: Submitted application 
 application_1402913701967_0006 to ResourceManager at master/10.71.71.110:8032
 14/06/16 16:21:49 INFO mapreduce.Job: The url to track the job: 
 http://gs-1695:8088/proxy/application_1402913701967_0006/
 14/06/16 16:21:49 INFO mapreduce.Job: Running job: job_1402913701967_0006
 14/06/16 16:21:54 INFO mapreduce.Job: Job job_1402913701967_0006 running in 
 uber mode : false
 14/06/16 16:21:54 INFO mapreduce.Job:  map 0% reduce 0%
 14/06/16 16:21:54 INFO mapreduce.Job: Job job_1402913701967_0006 failed with 
 state FAILED due to: Application application_1402913701967_0006 failed 2 
 times due to AM Container for appattempt_1402913701967_0006_02 exited 
 with  exitCode: 1 due to: Exception from container-launch:
 org.apache.hadoop.util.Shell$ExitCodeException:
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
 at org.apache.hadoop.util.Shell.run(Shell.java:379)
 at 
 org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
 at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:744)
 .Failing this attempt.. Failing the application.
 14/06/16 16:21:54 INFO mapreduce.Job: Counters: 0
 Can you please help me fix this?
 Thanks,
 ~Kedar



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (MAPREDUCE-5927) Getting following error

2014-06-16 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-5927.
---

Resolution: Invalid

This is a general support question better asked on the u...@hadoop.apache.org 
list.  JIRA is for tracking bugs and features in Hadoop and not a general user 
support channel.

In this case the ApplicationMaster is crashing shortly after startup.  You'll 
need to examine the ApplicationMaster log to determine what happened: click 
on the tracking URL and follow the AM logs link, or use the yarn logs command 
if log aggregation is enabled on your cluster.
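
Separately, the first warning in the quoted log below (command-line option 
parsing not performed) goes away once the driver implements Tool and is 
launched through ToolRunner; a minimal sketch, with a hypothetical class name:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyJob extends Configured implements Tool {
  @Override
  public int run(String[] args) throws Exception {
    // getConf() carries any -D options ToolRunner parsed from the command line
    Job job = Job.getInstance(getConf(), "my-job");
    // ... set mapper/reducer classes and input/output paths here ...
    return job.waitForCompletion(true) ? 0 : 1;
  }

  public static void main(String[] args) throws Exception {
    System.exit(ToolRunner.run(new Configuration(), new MyJob(), args));
  }
}
```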

 Getting following error
 ---

 Key: MAPREDUCE-5927
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5927
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: applicationmaster
Reporter: Kedar Dixit
Assignee: Vinod Kumar Vavilapalli
Priority: Blocker

 Hi,
 I am getting the following error while running an application on the cluster:
 14/06/16 16:21:48 WARN mapreduce.JobSubmitter: Hadoop command-line option 
 parsing not performed. Implement the Tool interface and execute your 
 application with ToolRunner to remedy this.
 14/06/16 16:21:49 INFO input.FileInputFormat: Total input paths to process : 1
 14/06/16 16:21:49 INFO mapreduce.JobSubmitter: number of splits:1
 14/06/16 16:21:49 INFO Configuration.deprecation: user.name is deprecated. 
 Instead, use mapreduce.job.user.name
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.jar is deprecated. 
 Instead, use mapreduce.job.jar
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.reduce.tasks is 
 deprecated. Instead, use mapreduce.job.reduces
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.value.class 
 is deprecated. Instead, use mapreduce.job.output.value.class
 14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.map.class is 
 deprecated. Instead, use mapreduce.job.map.class
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.job.name is 
 deprecated. Instead, use mapreduce.job.name
 14/06/16 16:21:49 INFO Configuration.deprecation: mapreduce.inputformat.class 
 is deprecated. Instead, use mapreduce.job.inputformat.class
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.input.dir is 
 deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.dir is 
 deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
 14/06/16 16:21:49 INFO Configuration.deprecation: 
 mapreduce.outputformat.class is deprecated. Instead, use 
 mapreduce.job.outputformat.class
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.map.tasks is 
 deprecated. Instead, use mapreduce.job.maps
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.output.key.class is 
 deprecated. Instead, use mapreduce.job.output.key.class
 14/06/16 16:21:49 INFO Configuration.deprecation: mapred.working.dir is 
 deprecated. Instead, use mapreduce.job.working.dir
 14/06/16 16:21:49 INFO mapreduce.JobSubmitter: Submitting tokens for job: 
 job_1402913701967_0006
 14/06/16 16:21:49 INFO impl.YarnClientImpl: Submitted application 
 application_1402913701967_0006 to ResourceManager at master/10.71.71.110:8032
 14/06/16 16:21:49 INFO mapreduce.Job: The url to track the job: 
 http://gs-1695:8088/proxy/application_1402913701967_0006/
 14/06/16 16:21:49 INFO mapreduce.Job: Running job: job_1402913701967_0006
 14/06/16 16:21:54 INFO mapreduce.Job: Job job_1402913701967_0006 running in 
 uber mode : false
 14/06/16 16:21:54 INFO mapreduce.Job:  map 0% reduce 0%
 14/06/16 16:21:54 INFO mapreduce.Job: Job job_1402913701967_0006 failed with 
 state FAILED due to: Application application_1402913701967_0006 failed 2 
 times due to AM Container for appattempt_1402913701967_0006_02 exited 
 with  exitCode: 1 due to: Exception from container-launch:
 org.apache.hadoop.util.Shell$ExitCodeException:
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
 at org.apache.hadoop.util.Shell.run(Shell.java:379)
 at 
 org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
 at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

[jira] [Commented] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers

2014-06-16 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032496#comment-14032496
 ] 

Jonathan Eagles commented on MAPREDUCE-5928:


I think this is a case of task preemption not working because the headroom 
calculation is incorrect. Can you verify that you are using the capacity scheduler?

YARN-1198

 Deadlock allocating containers for mappers and reducers
 ---

 Key: MAPREDUCE-5928
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928
 Project: Hadoop Map/Reduce
  Issue Type: Bug
 Environment: Hadoop 2.4.0 (as packaged by HortonWorks in HDP 2.1.2)
Reporter: Niels Basjes
 Attachments: Cluster fully loaded.png.jpg, MR job stuck in 
 deadlock.png.jpg


 I have a small cluster consisting of 8 desktop class systems (1 master + 7 
 workers).
 Due to the small memory of these systems I configured yarn as follows:
 {quote}
 yarn.nodemanager.resource.memory-mb = 2200
 yarn.scheduler.minimum-allocation-mb = 250
 {quote}
 On my client I did
 {quote}
 mapreduce.map.memory.mb = 512
 mapreduce.reduce.memory.mb = 512
 {quote}
 Now I run a job with 27 mappers and 32 reducers.
 After a while I saw this deadlock occur:
 - All nodes had been filled to their maximum capacity with reducers.
 - 1 Mapper was waiting for a container slot to start in.
 I tried killing reducer attempts but that didn't help (new reducer attempts 
 simply took the existing container).
 *Workaround*:
 I set this value from my job. The default value is 0.05 (= 5%)
 {quote}
 mapreduce.job.reduce.slowstart.completedmaps = 0.99f
 {quote}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers

2014-06-16 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032503#comment-14032503
 ] 

Jason Lowe commented on MAPREDUCE-5928:
---

I'm wondering whether the fact that the nodemanager memory has a fractional 
remainder when it's full triggers the issue.  With tasks all being 512MB, each 
node will have 152MB remaining.  I'm guessing that with enough nodes those 
remainders add up to what appears to be enough space to run another task, but 
in reality that task cannot be scheduled because the reported memory is 
fragmented.
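
The arithmetic behind this suspicion, using the numbers from this issue, can 
be spelled out:

```java
public class HeadroomSketch {
    public static void main(String[] args) {
        int nodeMemMb = 2200;  // yarn.nodemanager.resource.memory-mb
        int taskMemMb = 512;   // mapreduce.map/reduce.memory.mb
        int workers = 7;       // worker nodes in the reported cluster

        int containersPerNode = nodeMemMb / taskMemMb;                   // 4
        int perNodeLeftover = nodeMemMb - containersPerNode * taskMemMb; // 152 MB
        int aggregateLeftover = workers * perNodeLeftover;               // 1064 MB

        // The aggregate headroom (1064 MB) looks like room for two more
        // 512 MB containers, yet no single node has 512 MB free.
        System.out.println("per-node leftover:  " + perNodeLeftover + " MB");
        System.out.println("aggregate leftover: " + aggregateLeftover + " MB");
        System.out.println("fits on one node:   " + (perNodeLeftover >= taskMemMb));
    }
}
```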

 Deadlock allocating containers for mappers and reducers
 ---

 Key: MAPREDUCE-5928
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928
 Project: Hadoop Map/Reduce
  Issue Type: Bug
 Environment: Hadoop 2.4.0 (as packaged by HortonWorks in HDP 2.1.2)
Reporter: Niels Basjes
 Attachments: Cluster fully loaded.png.jpg, MR job stuck in 
 deadlock.png.jpg


 I have a small cluster consisting of 8 desktop class systems (1 master + 7 
 workers).
 Due to the small memory of these systems I configured yarn as follows:
 {quote}
 yarn.nodemanager.resource.memory-mb = 2200
 yarn.scheduler.minimum-allocation-mb = 250
 {quote}
 On my client I did
 {quote}
 mapreduce.map.memory.mb = 512
 mapreduce.reduce.memory.mb = 512
 {quote}
 Now I run a job with 27 mappers and 32 reducers.
 After a while I saw this deadlock occur:
 - All nodes had been filled to their maximum capacity with reducers.
 - 1 Mapper was waiting for a container slot to start in.
 I tried killing reducer attempts but that didn't help (new reducer attempts 
 simply took the existing container).
 *Workaround*:
 I set this value from my job. The default value is 0.05 (= 5%)
 {quote}
 mapreduce.job.reduce.slowstart.completedmaps = 0.99f
 {quote}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers

2014-06-16 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated MAPREDUCE-5928:


Attachment: AM-MR-syslog - Cleaned.txt.gz

I downloaded the Application Master log and attached it to this issue. (I 
changed the domain name of the nodes.)

 Deadlock allocating containers for mappers and reducers
 ---

 Key: MAPREDUCE-5928
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928
 Project: Hadoop Map/Reduce
  Issue Type: Bug
 Environment: Hadoop 2.4.0 (as packaged by HortonWorks in HDP 2.1.2)
Reporter: Niels Basjes
 Attachments: AM-MR-syslog - Cleaned.txt.gz, Cluster fully 
 loaded.png.jpg, MR job stuck in deadlock.png.jpg


 I have a small cluster consisting of 8 desktop class systems (1 master + 7 
 workers).
 Due to the small memory of these systems I configured yarn as follows:
 {quote}
 yarn.nodemanager.resource.memory-mb = 2200
 yarn.scheduler.minimum-allocation-mb = 250
 {quote}
 On my client I did
 {quote}
 mapreduce.map.memory.mb = 512
 mapreduce.reduce.memory.mb = 512
 {quote}
 Now I run a job with 27 mappers and 32 reducers.
 After a while I saw this deadlock occur:
 - All nodes had been filled to their maximum capacity with reducers.
 - 1 Mapper was waiting for a container slot to start in.
 I tried killing reducer attempts but that didn't help (new reducer attempts 
 simply took the existing container).
 *Workaround*:
 I set this value from my job. The default value is 0.05 (= 5%)
 {quote}
 mapreduce.job.reduce.slowstart.completedmaps = 0.99f
 {quote}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5018) Support raw binary data with Hadoop streaming

2014-06-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032514#comment-14032514
 ] 

Hadoop QA commented on MAPREDUCE-5018:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12644886/MAPREDUCE-5018.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 2 
warning messages.
See 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4662//artifact/trunk/patchprocess/diffJavadocWarnings.txt
 for details.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-tools/hadoop-streaming.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4662//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4662//console

This message is automatically generated.

 Support raw binary data with Hadoop streaming
 -

 Key: MAPREDUCE-5018
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5018
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: contrib/streaming
Affects Versions: trunk, 1.1.2
Reporter: Jay Hacker
Assignee: Steven Willis
Priority: Minor
 Attachments: MAPREDUCE-5018-branch-1.1.patch, MAPREDUCE-5018.patch, 
 MAPREDUCE-5018.patch, justbytes.jar, mapstream


 People often have a need to run older programs over many files, and turn to 
 Hadoop streaming as a reliable, performant batch system.  There are good 
 reasons for this:
 1. Hadoop is convenient: they may already be using it for mapreduce jobs, and 
 it is easy to spin up a cluster in the cloud.
 2. It is reliable: HDFS replicates data and the scheduler retries failed jobs.
 3. It is reasonably performant: it moves the code to the data, maintaining 
 locality, and scales with the number of nodes.
 Historically Hadoop is of course oriented toward processing key/value pairs, 
 and so needs to interpret the data passing through it.  Unfortunately, this 
 makes it difficult to use Hadoop streaming with programs that don't deal in 
 key/value pairs, or with binary data in general.  For example, something as 
 simple as running md5sum to verify the integrity of files will not give the 
 correct result, due to Hadoop's interpretation of the data.  
 There have been several attempts at binary serialization schemes for Hadoop 
 streaming, such as TypedBytes (HADOOP-1722); however, these are still aimed 
 at efficiently encoding key/value pairs, and not passing data through 
 unmodified.  Even the RawBytes serialization scheme adds length fields to 
 the data, rendering it not-so-raw.
 I often have a need to run a Unix filter on files stored in HDFS; currently, 
 the only way I can do this on the raw data is to copy the data out and run 
 the filter on one machine, which is inconvenient, slow, and unreliable.  It 
 would be very convenient to run the filter as a map-only job, allowing me to 
 build on existing (well-tested!) building blocks in the Unix tradition 
 instead of reimplementing them as mapreduce programs.
 However, most existing tools don't know about file splits, and so want to 
 process whole files; and of course many expect raw binary input and output.  
 The solution is to run a map-only job with an InputFormat and OutputFormat 
 that just pass raw bytes and don't split.  It turns out to be a little more 
 complicated with streaming; I have attached a patch with the simplest 
 solution I could come up with.  I call the format JustBytes (as RawBytes 
 was already taken), and it should be usable with most recent versions of 
 Hadoop.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers

2014-06-16 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032516#comment-14032516
 ] 

Niels Basjes commented on MAPREDUCE-5928:
-

I have not actively configured any scheduling.
So I guess it is running the 'default' setting ?


 Deadlock allocating containers for mappers and reducers
 ---

 Key: MAPREDUCE-5928
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928
 Project: Hadoop Map/Reduce
  Issue Type: Bug
 Environment: Hadoop 2.4.0 (as packaged by HortonWorks in HDP 2.1.2)
Reporter: Niels Basjes
 Attachments: AM-MR-syslog - Cleaned.txt.gz, Cluster fully 
 loaded.png.jpg, MR job stuck in deadlock.png.jpg


 I have a small cluster consisting of 8 desktop class systems (1 master + 7 
 workers).
 Due to the small memory of these systems I configured yarn as follows:
 {quote}
 yarn.nodemanager.resource.memory-mb = 2200
 yarn.scheduler.minimum-allocation-mb = 250
 {quote}
 On my client I did
 {quote}
 mapreduce.map.memory.mb = 512
 mapreduce.reduce.memory.mb = 512
 {quote}
 Now I run a job with 27 mappers and 32 reducers.
 After a while I saw this deadlock occur:
 - All nodes had been filled to their maximum capacity with reducers.
 - 1 Mapper was waiting for a container slot to start in.
 I tried killing reducer attempts but that didn't help (new reducer attempts 
 simply took the existing container).
 *Workaround*:
 I set this value from my job. The default value is 0.05 (= 5%)
 {quote}
 mapreduce.job.reduce.slowstart.completedmaps = 0.99f
 {quote}
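 As a rough model of what the workaround changes (a simplified sketch, not the 
 actual RMContainerAllocator logic): reducers are only scheduled once the 
 completed-map fraction reaches the slowstart threshold, so raising it to 0.99 
 keeps reducers out of the way until essentially all 27 maps have finished:

```java
public class SlowstartDemo {
    // Simplified gate: schedule reducers once enough maps have completed.
    public static boolean canScheduleReducers(int completedMaps, int totalMaps,
                                              float slowstart) {
        return completedMaps >= Math.ceil(slowstart * totalMaps);
    }

    public static void main(String[] args) {
        // Default 0.05: reducers may start after only 2 of 27 maps.
        System.out.println(canScheduleReducers(2, 27, 0.05f));  // true
        // Workaround 0.99: reducers wait until all 27 maps are done.
        System.out.println(canScheduleReducers(26, 27, 0.99f)); // false
        System.out.println(canScheduleReducers(27, 27, 0.99f)); // true
    }
}
```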



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5898) distcp to support preserving HDFS extended attributes(XAttrs)

2014-06-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032532#comment-14032532
 ] 

Hudson commented on MAPREDUCE-5898:
---

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1803 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1803/])
Moved CHANGES.txt entries of MAPREDUCE-5898, MAPREDUCE-5920, HDFS-6464, 
HDFS-6375 from trunk to 2.5 section on merging HDFS-2006 to branch-2 
(umamahesh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1602699)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt


 distcp to support preserving HDFS extended attributes(XAttrs)
 -

 Key: MAPREDUCE-5898
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5898
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distcp
Affects Versions: 3.0.0
Reporter: Uma Maheswara Rao G
Assignee: Yi Liu
 Fix For: 3.0.0, 2.5.0

 Attachments: MAPREDUCE-5898.1.patch, MAPREDUCE-5898.patch


 This JIRA to track the distcp support to handle the Xattrs with preserving 
 options.
 Add new command line argument to support that.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5920) Add Xattr option in DistCp docs

2014-06-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032535#comment-14032535
 ] 

Hudson commented on MAPREDUCE-5920:
---

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1803 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1803/])
Moved CHANGES.txt entries of MAPREDUCE-5898, MAPREDUCE-5920, HDFS-6464, 
HDFS-6375 from trunk to 2.5 section on merging HDFS-2006 to branch-2 
(umamahesh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1602699)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt


 Add Xattr option in DistCp docs 
 

 Key: MAPREDUCE-5920
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5920
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distcp, documentation
Affects Versions: 3.0.0
Reporter: Uma Maheswara Rao G
Assignee: Yi Liu
Priority: Minor
 Fix For: 3.0.0, 2.5.0

 Attachments: MAPREDUCE-5920.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers

2014-06-16 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032545#comment-14032545
 ] 

Jason Lowe commented on MAPREDUCE-5928:
---

I'm pretty sure you're using the CapacityScheduler since that's been the 
default in Apache Hadoop for some time now.  I'm not positive about the HDP 
release, but I suspect it, too, is configured to use the CapacityScheduler by 
default.

After a quick examination of the AM log, it looks like a couple of things are 
going on.  The AM is blacklisting one of the nodes, and we can see that node 
not being used in the cluster picture.  There's a known issue with headroom 
calculation not taking into account blacklisted nodes.  See YARN-1680. 

The node ends up being blacklisted because the NM shot a number of the tasks 
for being over container limits.  It looks like the containers are being 
allocated as using 500MB but the JVM heap sizes are set to 512MB.  Note that 
the container size includes the size of the entire process tree for the task.  
That's not just the heap, so it needs to also include thread stacks, JVM data, 
JVM code, any subprocesses launched (e.g.: hadoop streaming) etc.  If you 
really need a 512MB heap then I'd allocate 768MB or maybe even 1024MB 
containers, depending on what the tasks are doing.
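The sizing rule above can be sketched as arithmetic. This is an illustrative 
heuristic only, not an official Hadoop formula; the 50% overhead factor is an 
assumption chosen to match the suggested 768-1024MB range for a 512MB heap:

```java
public class ContainerSizing {
    // Heuristic: container = heap + non-heap overhead allowance, rounded up
    // to the scheduler's minimum-allocation granularity. The overheadFactor
    // is an assumed value for illustration, not a Hadoop constant.
    public static int containerMb(int heapMb, double overheadFactor, int minAllocMb) {
        double needed = heapMb * (1.0 + overheadFactor);
        return (int) (Math.ceil(needed / minAllocMb) * minAllocMb);
    }

    public static void main(String[] args) {
        // 512MB heap, 50% overhead, 250MB allocation granularity -> 1000MB,
        // consistent with the 768-1024MB advice above.
        System.out.println(containerMb(512, 0.5, 250));
    }
}
```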

It does look like some fractional memory shenanigans could be involved here, as 
the picture shows most of the nodes having only 200MB free.  It'd be 
interesting to know if you still hit the deadlock after fixing the cause of the 
blacklisting.

 Deadlock allocating containers for mappers and reducers
 ---

 Key: MAPREDUCE-5928
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928
 Project: Hadoop Map/Reduce
  Issue Type: Bug
 Environment: Hadoop 2.4.0 (as packaged by HortonWorks in HDP 2.1.2)
Reporter: Niels Basjes
 Attachments: AM-MR-syslog - Cleaned.txt.gz, Cluster fully 
 loaded.png.jpg, MR job stuck in deadlock.png.jpg


 I have a small cluster consisting of 8 desktop class systems (1 master + 7 
 workers).
 Due to the small memory of these systems I configured yarn as follows:
 {quote}
 yarn.nodemanager.resource.memory-mb = 2200
 yarn.scheduler.minimum-allocation-mb = 250
 {quote}
 On my client I did
 {quote}
 mapreduce.map.memory.mb = 512
 mapreduce.reduce.memory.mb = 512
 {quote}
 Now I run a job with 27 mappers and 32 reducers.
 After a while I saw this deadlock occur:
 - All nodes had been filled to their maximum capacity with reducers.
 - 1 Mapper was waiting for a container slot to start in.
 I tried killing reducer attempts but that didn't help (new reducer attempts 
 simply took the existing container).
 *Workaround*:
 I set this value from my job. The default value is 0.05 (= 5%)
 {quote}
 mapreduce.job.reduce.slowstart.completedmaps = 0.99f
 {quote}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers

2014-06-16 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032564#comment-14032564
 ] 

Niels Basjes commented on MAPREDUCE-5928:
-

I took the 'dead' node (node2) offline (completely stopped all hadoop/yarn 
related daemons) and ran the same job again after it had disappeared from all 
overviews.
Now it does complete all mappers.

 Deadlock allocating containers for mappers and reducers
 ---

 Key: MAPREDUCE-5928
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928
 Project: Hadoop Map/Reduce
  Issue Type: Bug
 Environment: Hadoop 2.4.0 (as packaged by HortonWorks in HDP 2.1.2)
Reporter: Niels Basjes
 Attachments: AM-MR-syslog - Cleaned.txt.gz, Cluster fully 
 loaded.png.jpg, MR job stuck in deadlock.png.jpg


 I have a small cluster consisting of 8 desktop class systems (1 master + 7 
 workers).
 Due to the small memory of these systems I configured yarn as follows:
 {quote}
 yarn.nodemanager.resource.memory-mb = 2200
 yarn.scheduler.minimum-allocation-mb = 250
 {quote}
 On my client I did
 {quote}
 mapreduce.map.memory.mb = 512
 mapreduce.reduce.memory.mb = 512
 {quote}
 Now I run a job with 27 mappers and 32 reducers.
 After a while I saw this deadlock occur:
 - All nodes had been filled to their maximum capacity with reducers.
 - 1 Mapper was waiting for a container slot to start in.
 I tried killing reducer attempts but that didn't help (new reducer attempts 
 simply took the existing container).
 *Workaround*:
 I set this value from my job. The default value is 0.05 (= 5%)
 {quote}
 mapreduce.job.reduce.slowstart.completedmaps = 0.99f
 {quote}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (MAPREDUCE-5929) YARNRunner.java, path for jobJarPath not set correctly

2014-06-16 Thread Chao Tian (JIRA)
Chao Tian created MAPREDUCE-5929:


 Summary: YARNRunner.java, path for jobJarPath not set correctly
 Key: MAPREDUCE-5929
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5929
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.2.0
Reporter: Chao Tian


In YARNRunner.java, line 357,

Path jobJarPath = new Path(jobConf.get(MRJobConfig.JAR));

This causes the job.jar path to be missing the scheme, host, and port number on 
distributed file systems other than HDFS. 

If we compare line 357 with line 344, there job.xml is actually set as
 
Path jobConfPath = new Path(jobSubmitDir,MRJobConfig.JOB_CONF_FILE);

It appears jobSubmitDir is missing on line 357, which causes this problem. In 
HDFS, the additional qualification step corrects the problem, but it does not on 
other generic distributed file systems.

The proposed change is to replace line 357 with

Path jobJarPath = new Path(jobConf.get(jobSubmitDir,MRJobConfig.JAR));
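The missing-scheme symptom can be reproduced with plain java.net.URI, 
independent of Hadoop's Path class (illustrative only; the s3://bucket/staging/ 
submit directory is a made-up example):

```java
import java.net.URI;

public class QualifyDemo {
    public static void main(String[] args) {
        // A path built from the bare config value carries no scheme or authority.
        URI jobJar = URI.create("job.jar");
        System.out.println(jobJar.getScheme()); // null

        // Resolving it against the submit directory supplies the scheme, host,
        // and port, which is what constructing the Path relative to
        // jobSubmitDir achieves on line 344 for job.xml.
        URI submitDir = URI.create("s3://bucket/staging/");
        URI qualified = submitDir.resolve(jobJar);
        System.out.println(qualified); // s3://bucket/staging/job.jar
    }
}
```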

 





--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers

2014-06-16 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032766#comment-14032766
 ] 

Niels Basjes commented on MAPREDUCE-5928:
-

Where/how can I determine for sure if the capacity scheduler is used?

 Deadlock allocating containers for mappers and reducers
 ---

 Key: MAPREDUCE-5928
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928
 Project: Hadoop Map/Reduce
  Issue Type: Bug
 Environment: Hadoop 2.4.0 (as packaged by HortonWorks in HDP 2.1.2)
Reporter: Niels Basjes
 Attachments: AM-MR-syslog - Cleaned.txt.gz, Cluster fully 
 loaded.png.jpg, MR job stuck in deadlock.png.jpg


 I have a small cluster consisting of 8 desktop class systems (1 master + 7 
 workers).
 Due to the small memory of these systems I configured yarn as follows:
 {quote}
 yarn.nodemanager.resource.memory-mb = 2200
 yarn.scheduler.minimum-allocation-mb = 250
 {quote}
 On my client I did
 {quote}
 mapreduce.map.memory.mb = 512
 mapreduce.reduce.memory.mb = 512
 {quote}
 Now I run a job with 27 mappers and 32 reducers.
 After a while I saw this deadlock occur:
 - All nodes had been filled to their maximum capacity with reducers.
 - 1 Mapper was waiting for a container slot to start in.
 I tried killing reducer attempts but that didn't help (new reducer attempts 
 simply took the existing container).
 *Workaround*:
 I set this value from my job. The default value is 0.05 (= 5%)
 {quote}
 mapreduce.job.reduce.slowstart.completedmaps = 0.99f
 {quote}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5926) Support utf-8 text with BOM (byte order marker) for branch-1

2014-06-16 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032773#comment-14032773
 ] 

Karthik Kambatla commented on MAPREDUCE-5926:
-

[~zxu] - can we do this as part of MAPREDUCE-5777 itself? You can annotate the 
patch with branch-1.

 Support utf-8 text with BOM (byte order marker) for branch-1
 

 Key: MAPREDUCE-5926
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5926
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: MAPREDUCE-5926.000.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5926) Support utf-8 text with BOM (byte order marker) for branch-1

2014-06-16 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032789#comment-14032789
 ] 

zhihai xu commented on MAPREDUCE-5926:
--

Karthik - Ok, I will do that. Thanks for the information.




 Support utf-8 text with BOM (byte order marker) for branch-1
 

 Key: MAPREDUCE-5926
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5926
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv1
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: MAPREDUCE-5926.000.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers

2014-06-16 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032802#comment-14032802
 ] 

Jason Lowe commented on MAPREDUCE-5928:
---

You can click on the Tools->Configuration link in the UI and verify that 
yarn.resourcemanager.scheduler.class is 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler,
 or look for CapacityScheduler in the RM logs.

 Deadlock allocating containers for mappers and reducers
 ---

 Key: MAPREDUCE-5928
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928
 Project: Hadoop Map/Reduce
  Issue Type: Bug
 Environment: Hadoop 2.4.0 (as packaged by HortonWorks in HDP 2.1.2)
Reporter: Niels Basjes
 Attachments: AM-MR-syslog - Cleaned.txt.gz, Cluster fully 
 loaded.png.jpg, MR job stuck in deadlock.png.jpg


 I have a small cluster consisting of 8 desktop class systems (1 master + 7 
 workers).
 Due to the small memory of these systems I configured yarn as follows:
 {quote}
 yarn.nodemanager.resource.memory-mb = 2200
 yarn.scheduler.minimum-allocation-mb = 250
 {quote}
 On my client I did
 {quote}
 mapreduce.map.memory.mb = 512
 mapreduce.reduce.memory.mb = 512
 {quote}
 Now I run a job with 27 mappers and 32 reducers.
 After a while I saw this deadlock occur:
 - All nodes had been filled to their maximum capacity with reducers.
 - 1 Mapper was waiting for a container slot to start in.
 I tried killing reducer attempts but that didn't help (new reducer attempts 
 simply took the existing container).
 *Workaround*:
 I set this value from my job. The default value is 0.05 (= 5%)
 {quote}
 mapreduce.job.reduce.slowstart.completedmaps = 0.99f
 {quote}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5928) Deadlock allocating containers for mappers and reducers

2014-06-16 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032831#comment-14032831
 ] 

Niels Basjes commented on MAPREDUCE-5928:
-

Confirmed. It is using the CapacityScheduler:
{code}
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
  <source>yarn-default.xml</source>
</property>
{code}

I'm going to fiddle with the memory setting tomorrow.

 Deadlock allocating containers for mappers and reducers
 ---

 Key: MAPREDUCE-5928
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5928
 Project: Hadoop Map/Reduce
  Issue Type: Bug
 Environment: Hadoop 2.4.0 (as packaged by HortonWorks in HDP 2.1.2)
Reporter: Niels Basjes
 Attachments: AM-MR-syslog - Cleaned.txt.gz, Cluster fully 
 loaded.png.jpg, MR job stuck in deadlock.png.jpg


 I have a small cluster consisting of 8 desktop class systems (1 master + 7 
 workers).
 Due to the small memory of these systems I configured yarn as follows:
 {quote}
 yarn.nodemanager.resource.memory-mb = 2200
 yarn.scheduler.minimum-allocation-mb = 250
 {quote}
 On my client I did
 {quote}
 mapreduce.map.memory.mb = 512
 mapreduce.reduce.memory.mb = 512
 {quote}
 Now I run a job with 27 mappers and 32 reducers.
 After a while I saw this deadlock occur:
 - All nodes had been filled to their maximum capacity with reducers.
 - 1 Mapper was waiting for a container slot to start in.
 I tried killing reducer attempts but that didn't help (new reducer attempts 
 simply took the existing container).
 *Workaround*:
 I set this value from my job. The default value is 0.05 (= 5%)
 {quote}
 mapreduce.job.reduce.slowstart.completedmaps = 0.99f
 {quote}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-06-16 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033114#comment-14033114
 ] 

Karthik Kambatla commented on MAPREDUCE-5844:
-

Thanks for updating the patch, Maysam.

Few comments:
# Unfortunately, RMContainerAllocator and RMContainerRequestor are not 
annotated as @Private classes. So, all the fields/methods that are made 
accessible should have a @Private annotation in addition to the 
@VisibleForTesting annotation.
# By moving TestRMContainerAllocator to be in the same package as the above two 
files, we can limit the visibility to package-private instead of public. Can 
you please check if that is straight-forward? 
# Can we combine the following two statements into one? 
{code}
allocationDelayThresholdMs = conf.getInt(
MRJobConfig.MR_JOB_REDUCER_PREEMPT_DELAY_SEC,
MRJobConfig.DEFAULT_MR_JOB_REDUCER_PREEMPT_DELAY_SEC);
allocationDelayThresholdMs *= 1000; //sec - ms
{code}
# Nit: Rename setMapResourceReqt and setReduceResourceReqt to end in Request 
instead of Reqt?
# Nit: In the tests, can we use a smaller sleep time? Also, instead of sleeping 
for an extra second, can we sleep for the exact time and then check in a loop 
with a much smaller sleep whether the reducer gets preempted? YARN/MR should 
use a Clock so tests don't have to actually sleep for that long.
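For comment 3 above, the two statements could be folded into one. A sketch, 
with plain values standing in for Configuration and MRJobConfig so it runs 
standalone:

```java
public class CombineDemo {
    // Stand-in for conf.getInt(MR_JOB_REDUCER_PREEMPT_DELAY_SEC, DEFAULT_...);
    // the value 5 is an arbitrary example, not a Hadoop default.
    static int getPreemptDelaySec() { return 5; }

    public static void main(String[] args) {
        // Single statement: read the configured seconds and convert to ms at once.
        long allocationDelayThresholdMs = 1000L * getPreemptDelaySec();
        System.out.println(allocationDelayThresholdMs); // 5000
    }
}
```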

 Reducer Preemption is too aggressive
 

 Key: MAPREDUCE-5844
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5844
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Maysam Yabandeh
Assignee: Maysam Yabandeh
 Attachments: MAPREDUCE-5844.patch, MAPREDUCE-5844.patch


 We observed cases where the reducer preemption makes the job finish much 
 later, and the preemption does not seem to be necessary since after 
 preemption both the preempted reducer and the mapper are assigned 
 immediately--meaning that there was already enough space for the mapper.
 The logic for triggering preemption is at 
 RMContainerAllocator::preemptReducesIfNeeded
 The preemption is triggered if the following is true:
 {code}
 headroom + am * |m| + pr * |r| < mapResourceRequest
 {code} 
 where am: number of assigned mappers, |m| is mapper size, pr is number of 
 reducers being preempted, and |r| is the reducer size.
 The original idea apparently was that if headroom is not big enough for the 
 new mapper requests, reducers should be preempted. This would work if the job 
 is alone in the cluster. Once we have queues, the headroom calculation 
 becomes more complicated and it would require a separate headroom calculation 
 per queue/job.
 So, as a result, the headroom variable is effectively given up on currently: 
 *headroom is always set to 0*. What this implies for speculation is that it 
 becomes very aggressive, not considering whether there is enough space for 
 the mappers or not.
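 The trigger condition above can be written out as a standalone predicate (a 
 sketch of the inequality only, not the real 
 RMContainerAllocator::preemptReducesIfNeeded code):

```java
public class PreemptCheck {
    // headroom + am*|m| + pr*|r| < mapResourceRequest  =>  preempt a reducer.
    public static boolean shouldPreempt(long headroom, int assignedMaps, long mapSize,
                                        int preemptingReducers, long reduceSize,
                                        long mapResourceRequest) {
        return headroom + (long) assignedMaps * mapSize
                + (long) preemptingReducers * reduceSize < mapResourceRequest;
    }

    public static void main(String[] args) {
        // With headroom forced to 0 (as the description notes), a pending map
        // request triggers preemption even when the cluster has room.
        System.out.println(shouldPreempt(0, 0, 512, 0, 512, 512)); // true
    }
}
```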



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-06-16 Thread Maysam Yabandeh (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maysam Yabandeh updated MAPREDUCE-5844:
---

Attachment: MAPREDUCE-5844.patch

Thanks [~kasha] for the comments. I am attaching a new patch that has them 
applied.

I was thinking about a proper name for setReduceResourceReqt. On one hand, by 
changing it to setReduceResourceRequest it becomes more readable. On the other 
hand, by using setReduceResourceReqt we adhere to the Java standard for naming 
getters and setters (here reduceResourceReqt). I am more inclined towards the 
latter and I was wondering if you are ok with that.

 Reducer Preemption is too aggressive
 

 Key: MAPREDUCE-5844
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5844
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Maysam Yabandeh
Assignee: Maysam Yabandeh
 Attachments: MAPREDUCE-5844.patch, MAPREDUCE-5844.patch, 
 MAPREDUCE-5844.patch


 We observed cases where the reducer preemption makes the job finish much 
 later, and the preemption does not seem to be necessary since after 
 preemption both the preempted reducer and the mapper are assigned 
 immediately--meaning that there was already enough space for the mapper.
 The logic for triggering preemption is at 
 RMContainerAllocator::preemptReducesIfNeeded
 The preemption is triggered if the following is true:
 {code}
 headroom + am * |m| + pr * |r| < mapResourceRequest
 {code} 
 where am: number of assigned mappers, |m| is mapper size, pr is number of 
 reducers being preempted, and |r| is the reducer size.
 The original idea apparently was that if headroom is not big enough for the 
 new mapper requests, reducers should be preempted. This would work if the job 
 is alone in the cluster. Once we have queues, the headroom calculation 
 becomes more complicated and it would require a separate headroom calculation 
 per queue/job.
 So, as a result, the headroom variable is effectively given up on currently: 
 *headroom is always set to 0*. What this implies for speculation is that it 
 becomes very aggressive, not considering whether there is enough space for 
 the mappers or not.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-06-16 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033288#comment-14033288
 ] 

Karthik Kambatla commented on MAPREDUCE-5844:
-

Can we change the field names also to end in Request instead of Reqt? 

 Reducer Preemption is too aggressive
 

 Key: MAPREDUCE-5844
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5844
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Maysam Yabandeh
Assignee: Maysam Yabandeh
 Attachments: MAPREDUCE-5844.patch, MAPREDUCE-5844.patch, 
 MAPREDUCE-5844.patch


 We observed cases where the reducer preemption makes the job finish much 
 later, and the preemption does not seem to be necessary since after 
 preemption both the preempted reducer and the mapper are assigned 
 immediately--meaning that there was already enough space for the mapper.
 The logic for triggering preemption is at 
 RMContainerAllocator::preemptReducesIfNeeded
 The preemption is triggered if the following is true:
 {code}
 headroom + am * |m| + pr * |r| < mapResourceRequest
 {code} 
 where am: number of assigned mappers, |m| is mapper size, pr is number of 
 reducers being preempted, and |r| is the reducer size.
 The original idea apparently was that if headroom is not big enough for the 
 new mapper requests, reducers should be preempted. This would work if the job 
 is alone in the cluster. Once we have queues, the headroom calculation 
 becomes more complicated and it would require a separate headroom calculation 
 per queue/job.
 So, as a result, the headroom variable is effectively given up on currently: 
 *headroom is always set to 0*. What this implies for speculation is that it 
 becomes very aggressive, not considering whether there is enough space for 
 the mappers or not.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-06-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033298#comment-14033298
 ] 

Hadoop QA commented on MAPREDUCE-5844:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12650692/MAPREDUCE-5844.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4663//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4663//console

This message is automatically generated.

 Reducer Preemption is too aggressive
 

 Key: MAPREDUCE-5844
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5844
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Maysam Yabandeh
Assignee: Maysam Yabandeh
 Attachments: MAPREDUCE-5844.patch, MAPREDUCE-5844.patch, 
 MAPREDUCE-5844.patch


 We observed cases where the reducer preemption makes the job finish much 
 later, and the preemption does not seem to be necessary since after 
 preemption both the preempted reducer and the mapper are assigned 
 immediately--meaning that there was already enough space for the mapper.
 The logic for triggering preemption is at 
 RMContainerAllocator::preemptReducesIfNeeded
 The preemption is triggered if the following is true:
 {code}
 headroom + am * |m| + pr * |r| < mapResourceRequest
 {code} 
 where am: number of assigned mappers, |m| is mapper size, pr is number of 
 reducers being preempted, and |r| is the reducer size.
 The original idea apparently was that if headroom is not big enough for the 
 new mapper requests, reducers should be preempted. This would work if the job 
 is alone in the cluster. Once we have queues, the headroom calculation 
 becomes more complicated and it would require a separate headroom calculation 
 per queue/job.
 So, as a result, the headroom variable is effectively given up on currently: 
 *headroom is always set to 0*. What this implies for speculation is that it 
 becomes very aggressive, not considering whether there is enough space for 
 the mappers or not.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-06-16 Thread Maysam Yabandeh (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maysam Yabandeh updated MAPREDUCE-5844:
---

Attachment: MAPREDUCE-5844.patch

Attaching the patch that also updates the variables' names: 
reduceResourceRequest and mapResourceRequest

 Reducer Preemption is too aggressive
 

 Key: MAPREDUCE-5844
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5844
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Maysam Yabandeh
Assignee: Maysam Yabandeh
 Attachments: MAPREDUCE-5844.patch, MAPREDUCE-5844.patch, 
 MAPREDUCE-5844.patch, MAPREDUCE-5844.patch


 We observed cases where the reducer preemption makes the job finish much 
 later, and the preemption does not seem to be necessary since after 
 preemption both the preempted reducer and the mapper are assigned 
 immediately--meaning that there was already enough space for the mapper.
 The logic for triggering preemption is at 
 RMContainerAllocator::preemptReducesIfNeeded
 The preemption is triggered if the following is true:
 {code}
 headroom + am * |m| + pr * |r| < mapResourceRequest
 {code} 
 where am: number of assigned mappers, |m| is mapper size, pr is number of 
 reducers being preempted, and |r| is the reducer size.
 The original idea apparently was that if headroom is not big enough for the 
 new mapper requests, reducers should be preempted. This would work if the job 
 is alone in the cluster. Once we have queues, the headroom calculation 
 becomes more complicated and it would require a separate headroom calculation 
 per queue/job.
 So, as a result, the headroom variable is effectively given up on currently: 
 *headroom is always set to 0*. What this implies for speculation is that it 
 becomes very aggressive, not considering whether there is enough space for 
 the mappers or not.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Moved] (MAPREDUCE-5930) Document MapReduce metrics

2014-06-16 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA moved HDFS-6550 to MAPREDUCE-5930:


Component/s: (was: documentation)
 documentation
Key: MAPREDUCE-5930  (was: HDFS-6550)
Project: Hadoop Map/Reduce  (was: Hadoop HDFS)

 Document MapReduce metrics
 --

 Key: MAPREDUCE-5930
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5930
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: documentation
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA

 MapReduce-side of HADOOP-6350. Add MapReduce metrics to Metrics document.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5930) Document MapReduce metrics

2014-06-16 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1407#comment-1407
 ] 

Akira AJISAKA commented on MAPREDUCE-5930:
--

Moved to the correct (MapReduce) project.

 Document MapReduce metrics
 --

 Key: MAPREDUCE-5930
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5930
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: documentation
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA

 MapReduce-side of HADOOP-6350. Add MapReduce metrics to Metrics document.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-4065) Add .proto files to built tarball

2014-06-16 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated MAPREDUCE-4065:
--

 Assignee: Tsuyoshi OZAWA
Affects Version/s: 2.4.0
   Status: Patch Available  (was: Open)

 Add .proto files to built tarball
 -

 Key: MAPREDUCE-4065
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4065
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: build
Affects Versions: 0.23.2, 2.4.0
Reporter: Ralph H Castain
Assignee: Tsuyoshi OZAWA
 Attachments: MAPREDUCE-4065.1.patch


 Please add the .proto files to the built tarball so that users can build 3rd 
 party tools that use protocol buffers without having to do an svn checkout of 
 the source code.
 Sorry I don't know more about Maven, or I would provide a patch.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (MAPREDUCE-4065) Add .proto files to built tarball

2014-06-16 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated MAPREDUCE-4065:
--

Attachment: MAPREDUCE-4065.1.patch

Included the *.proto files in the built tarball. The directory hierarchy is as follows:

{code}
hadoop-3.0.0-SNAPSHOT$ tree proto/
proto/
|-- hadoop-common
|   |-- GenericRefreshProtocol.proto
|   |-- GetUserMappingsProtocol.proto
|   |-- HAServiceProtocol.proto
|   |-- IpcConnectionContext.proto
|   |-- ProtobufRpcEngine.proto
|   |-- ProtocolInfo.proto
|   |-- RefreshAuthorizationPolicyProtocol.proto
|   |-- RefreshCallQueueProtocol.proto
|   |-- RefreshUserMappingsProtocol.proto
|   |-- RpcHeader.proto
|   |-- Security.proto
|   `-- ZKFCProtocol.proto
|-- hadoop-hdfs
|   |-- ClientDatanodeProtocol.proto
|   |-- ClientNamenodeProtocol.proto
|   |-- DatanodeProtocol.proto
|   |-- HAZKInfo.proto
|   |-- InterDatanodeProtocol.proto
|   |-- JournalProtocol.proto
|   |-- NamenodeProtocol.proto
|   |-- QJournalProtocol.proto
|   |-- acl.proto
|   |-- datatransfer.proto
|   |-- fsimage.proto
|   |-- hdfs.proto
|   `-- xattr.proto
|-- hadoop-mapreduce-client
|   |-- MRClientProtocol.proto
|   |-- mr_protos.proto
|   `-- mr_service_protos.proto
|-- hadoop-yarn-api
|   |-- application_history_client.proto
|   |-- applicationclient_protocol.proto
|   |-- applicationmaster_protocol.proto
|   |-- containermanagement_protocol.proto
|   |-- yarn_protos.proto
|   `-- yarn_service_protos.proto
`-- hadoop-yarn-server-common
{code}

 Add .proto files to built tarball
 -

 Key: MAPREDUCE-4065
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4065
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: build
Affects Versions: 0.23.2, 2.4.0
Reporter: Ralph H Castain
 Attachments: MAPREDUCE-4065.1.patch


 Please add the .proto files to the built tarball so that users can build 3rd 
 party tools that use protocol buffers without having to do an svn checkout of 
 the source code.
 Sorry I don't know more about Maven, or I would provide a patch.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-06-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033350#comment-14033350
 ] 

Hadoop QA commented on MAPREDUCE-5844:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12650707/MAPREDUCE-5844.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4664//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4664//console

This message is automatically generated.

 Reducer Preemption is too aggressive
 

 Key: MAPREDUCE-5844
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5844
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Maysam Yabandeh
Assignee: Maysam Yabandeh
 Attachments: MAPREDUCE-5844.patch, MAPREDUCE-5844.patch, 
 MAPREDUCE-5844.patch, MAPREDUCE-5844.patch


 We observed cases where the reducer preemption makes the job finish much 
 later, and the preemption does not seem to be necessary since after 
 preemption both the preempted reducer and the mapper are assigned 
 immediately--meaning that there was already enough space for the mapper.
 The logic for triggering preemption is at 
 RMContainerAllocator::preemptReducesIfNeeded
 The preemption is triggered if the following is true:
 {code}
 headroom + am * |m| + pr * |r| < mapResourceRequest
 {code} 
 where am is the number of assigned mappers, |m| is the mapper size, pr is the 
 number of reducers being preempted, and |r| is the reducer size.
 The original idea apparently was that if headroom is not big enough for the 
 new mapper requests, reducers should be preempted. This would work if the job 
 is alone in the cluster. Once we have queues, the headroom calculation 
 becomes more complicated and it would require a separate headroom calculation 
 per queue/job.
 So, as a result, the headroom variable is effectively given up at present: *headroom is 
 always set to 0*. The implication for preemption is that it becomes very 
 aggressive, without considering whether there is actually enough space for 
 the mappers or not.
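 A minimal sketch of the trigger condition described above (method and parameter 
 names here are illustrative, not the actual RMContainerAllocator members):
 {code}
public class PreemptionCheck {
    // Returns true when even the headroom plus resources already committed to
    // assigned mappers and already-preempted reducers cannot fit one more mapper,
    // i.e. headroom + am * |m| + pr * |r| < mapResourceRequest.
    static boolean shouldPreemptReducers(long headroom,
                                         int assignedMappers, long mapSize,
                                         int preemptedReducers, long reduceSize,
                                         long mapResourceRequest) {
        return headroom + assignedMappers * mapSize
                        + preemptedReducers * reduceSize < mapResourceRequest;
    }

    public static void main(String[] args) {
        // With headroom hard-coded to 0 (as the report notes), the condition
        // fires as soon as the committed resources fall short of one mapper.
        System.out.println(shouldPreemptReducers(0, 0, 1024, 0, 1024, 2048)); // true
        // A real headroom larger than the request would suppress preemption.
        System.out.println(shouldPreemptReducers(4096, 0, 1024, 0, 1024, 2048)); // false
    }
}
 {code}
 With headroom pinned to 0, the only brake on the condition is the resources 
 already committed, which is what makes the preemption so aggressive.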



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-4065) Add .proto files to built tarball

2014-06-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033379#comment-14033379
 ] 

Hadoop QA commented on MAPREDUCE-4065:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12650714/MAPREDUCE-4065.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-dist.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4665//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4665//console

This message is automatically generated.

 Add .proto files to built tarball
 -

 Key: MAPREDUCE-4065
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4065
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: build
Affects Versions: 0.23.2, 2.4.0
Reporter: Ralph H Castain
Assignee: Tsuyoshi OZAWA
 Attachments: MAPREDUCE-4065.1.patch


 Please add the .proto files to the built tarball so that users can build 3rd 
 party tools that use protocol buffers without having to do an svn checkout of 
 the source code.
 Sorry I don't know more about Maven, or I would provide a patch.



--
This message was sent by Atlassian JIRA
(v6.2#6252)