Raghvendra Singh created MAPREDUCE-6438:
-------------------------------------------
Summary: mapreduce fails with job.jar does not exist
Key: MAPREDUCE-6438
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6438
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Reporter: Raghvendra Singh
I have a Hortonworks distribution (2.2.6.0-2800) of Hadoop that runs MapReduce
jobs on YARN, and I have a simple MapReduce job that reads compressed data
files from HDFS, processes them, and then saves the result in HBase via bulk
load.
Here is the program that does it:
{code}
final Configuration hadoopConfiguration = new Configuration();
hadoopConfiguration.set("yarn.resourcemanager.address", "XXXXXX");
hadoopConfiguration.set("yarn.resourcemanager.scheduler.address", "XXXXXX");
hadoopConfiguration.set("mapreduce.framework.name", "yarn");
hadoopConfiguration.set("mapreduce.jobtracker.staging.root.dir", "XXXXXXXX");
final Job job = Job.getInstance(hadoopConfiguration, "migration");
job.setJarByClass(BlitzService.class);
job.setMapperClass(DataMigrationMapper.class);
job.setMapOutputKeyClass(ImmutableBytesWritable.class);
job.setMapOutputValueClass(KeyValue.class);
job.setReducerClass(DataMigrationReducer.class);
job.setCombinerClass(DataMigrationReducer.class);
HFileOutputFormat2.configureIncrementalLoad(job, hTable);
FileInputFormat.setInputPaths(job, filesToProcess.toArray(new Path[filesToProcess.size()]));
HFileOutputFormat2.setOutputPath(job, new Path(SOME PATH));
job.waitForCompletion(true);
{code}
This should be very simple to run, but I get the following exception when
running the job:
{code}
INFO [2015-07-23 23:53:20,222] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /172.30.0.147:8032
WARN [2015-07-23 23:53:20,383] org.apache.hadoop.mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
INFO [2015-07-23 23:53:20,492] org.apache.hadoop.mapreduce.lib.input.FileInputFormat: Total input paths to process : 16
INFO [2015-07-23 23:53:20,561] org.apache.hadoop.mapreduce.JobSubmitter: number of splits:16
INFO [2015-07-23 23:53:20,719] org.apache.hadoop.mapreduce.JobSubmitter: Submitting tokens for job: job_1437695344326_0002
INFO [2015-07-23 23:53:20,842] org.apache.hadoop.yarn.client.api.impl.YarnClientImpl: Submitted application application_1437695344326_0002
INFO [2015-07-23 23:53:20,867] org.apache.hadoop.mapreduce.Job: The url to track the job: http://ip-172-30-0-147.us-west-2.compute.internal:8088/proxy/application_1437695344326_0002/
INFO [2015-07-23 23:53:20,868] org.apache.hadoop.mapreduce.Job: Running job: job_1437695344326_0002
INFO [2015-07-23 23:53:35,994] org.apache.hadoop.mapreduce.Job: Job job_1437695344326_0002 running in uber mode : false
INFO [2015-07-23 23:53:35,995] org.apache.hadoop.mapreduce.Job: map 0% reduce 0%
INFO [2015-07-23 23:53:43,053] org.apache.hadoop.mapreduce.Job: Task Id : attempt_1437695344326_0002_m_000001_1000, Status : FAILED
File file:/tmp/hadoop-yarn/staging/root/.staging/job_1437695344326_0002/job.jar does not exist
java.io.FileNotFoundException: File file:/tmp/hadoop-yarn/staging/root/.staging/job_1437695344326_0002/job.jar does not exist
    at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:608)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:821)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:598)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:414)
    at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:251)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
INFO [2015-07-23 23:53:44,075] org.apache.hadoop.mapreduce.Job: Task Id : attempt_1437695344326_0002_m_000002_1000, Status : FAILED
File file:/tmp/hadoop-yarn/staging/root/.staging/job_1437695344326_0002/job.jar does not exist
java.io.FileNotFoundException: File file:/tmp/hadoop-yarn/staging/root/.staging/job_1437695344326_0002/job.jar does not exist
    at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:608)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:821)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:598)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:414)
    at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:251)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
{code}
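The failing path starts with file: and the trace goes through RawLocalFileSystem, so the task containers appear to be looking for job.jar on their own local disk rather than on HDFS. This is only a reading of the log, not a confirmed diagnosis. Below is a minimal sketch, using only the Java standard library, for checking which filesystem scheme a staging path carries. The class and method names (StagingPathCheck, schemeOf, isLocalPath) are hypothetical and not part of Hadoop.

```java
import java.net.URI;

public class StagingPathCheck {

    /** Returns the URI scheme of a path string, or "" when the path has none. */
    public static String schemeOf(String path) {
        String scheme = URI.create(path).getScheme();
        return scheme == null ? "" : scheme;
    }

    /**
     * True only for an explicit file: scheme. A path without a scheme is
     * resolved against the configured default filesystem, which this
     * sketch does not try to determine.
     */
    public static boolean isLocalPath(String path) {
        return "file".equals(schemeOf(path));
    }

    public static void main(String[] args) {
        // Path copied from the FileNotFoundException in the log.
        String failing =
            "file:/tmp/hadoop-yarn/staging/root/.staging/job_1437695344326_0002/job.jar";
        System.out.println("scheme=" + schemeOf(failing)
            + " local=" + isLocalPath(failing));
    }
}
```

If the staging directory is expected to live on HDFS, a file: scheme here would suggest that the client-side Configuration did not pick up the cluster's default filesystem setting.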
Also attached are core-site.xml, mapred-site.xml, hdfs-site.xml and
yarn-site.xml.
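For anyone inspecting the attached files, here is a minimal sketch that reads one property out of a Hadoop-style *-site.xml document using only the Java standard library. The class name SiteXmlProperty, the method lookup, and the sample host in main are hypothetical. fs.defaultFS is used as the example key on the assumption that the default filesystem is the setting worth checking first.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class SiteXmlProperty {

    /** Returns the value of the named property, or null if it is absent. */
    public static String lookup(String xml, String key) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            // Each setting is a <property> element with <name> and <value> children.
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element p = (Element) props.item(i);
                NodeList names = p.getElementsByTagName("name");
                NodeList values = p.getElementsByTagName("value");
                if (names.getLength() > 0 && values.getLength() > 0
                        && key.equals(names.item(0).getTextContent().trim())) {
                    return values.item(0).getTextContent().trim();
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical content; the real value comes from the attached core-site.xml.
        String xml = "<configuration><property>"
            + "<name>fs.defaultFS</name>"
            + "<value>hdfs://namenode.example:8020</value>"
            + "</property></configuration>";
        System.out.println(lookup(xml, "fs.defaultFS"));
    }
}
```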
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)