Hello everyone,
I'm trying to run a sequence of MR jobs in Oozie, using a Java action for
their drivers.
The problem is that the MR jobs run locally instead of on the Hadoop
cluster. How can I fix this?
The first job reads from HBase, performs some processing and writes the
result to HDFS, and the next job should read from that output. The first
job has 10 mappers, but I'm only showing the last one as an example.
Here is the log from the HBase MR job:
Aw==, start row: 9-777-1123456789113, end row:
9-777-1123456789114, region location: hdp-slave1.nissatech.local:16020)
2016-05-04 14:33:48,373 INFO [LocalJobRunner Map Task Executor #0]
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Process
identifier=hconnection-0x860ce79 connecting to ZooKeeper
ensemble=192.168.84.27:2181
2016-05-04 14:33:48,373 INFO [LocalJobRunner Map Task Executor #0]
org.apache.zookeeper.ZooKeeper: Initiating client connection,
connectString=192.168.84.27:2181 sessionTimeout=90000
watcher=hconnection-0x860ce790x0, quorum=192.168.84.27:2181,
baseZNode=/hbase-unsecure
2016-05-04 14:33:48,378 INFO [LocalJobRunner Map Task Executor
#0-SendThread(192.168.84.27:2181)] org.apache.zookeeper.ClientCnxn:
Opening socket connection to server 192.168.84.27/192.168.84.27:2181.
Will not attempt to authenticate using SASL (unknown error)
2016-05-04 14:33:48,379 INFO [LocalJobRunner Map Task Executor
#0-SendThread(192.168.84.27:2181)] org.apache.zookeeper.ClientCnxn:
Socket connection established to 192.168.84.27/192.168.84.27:2181,
initiating session
2016-05-04 14:33:48,391 INFO [LocalJobRunner Map Task Executor
#0-SendThread(192.168.84.27:2181)] org.apache.zookeeper.ClientCnxn:
Session establishment complete on server
192.168.84.27/192.168.84.27:2181, sessionid = 0x152f8f85214096b,
negotiated timeout = 40000
2016-05-04 14:33:48,394 INFO [LocalJobRunner Map Task Executor #0]
org.apache.hadoop.hbase.mapreduce.TableInputFormatBase: Input split
length: 0 bytes.
2016-05-04 14:33:48,590 INFO [LocalJobRunner Map Task Executor #0]
org.apache.hadoop.mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
2016-05-04 14:33:48,590 INFO [LocalJobRunner Map Task Executor #0]
org.apache.hadoop.mapred.MapTask: mapreduce.task.io.sort.mb: 100
2016-05-04 14:33:48,590 INFO [LocalJobRunner Map Task Executor #0]
org.apache.hadoop.mapred.MapTask: soft limit at 83886080
2016-05-04 14:33:48,590 INFO [LocalJobRunner Map Task Executor #0]
org.apache.hadoop.mapred.MapTask: bufstart = 0; bufvoid = 104857600
2016-05-04 14:33:48,591 INFO [LocalJobRunner Map Task Executor #0]
org.apache.hadoop.mapred.MapTask: kvstart = 26214396; length = 6553600
2016-05-04 14:33:48,592 INFO [LocalJobRunner Map Task Executor #0]
org.apache.hadoop.mapred.MapTask: Map output collector class =
org.apache.hadoop.mapred.MapTask$MapOutputBuffer
2016-05-04 14:33:48,801 INFO [LocalJobRunner Map Task Executor #0]
org.apache.hadoop.mapred.LocalJobRunner:
2016-05-04 14:33:48,802 INFO [LocalJobRunner Map Task Executor #0]
org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation:
Closing zookeeper sessionid=0x152f8f85214096b
2016-05-04 14:33:48,828 INFO [LocalJobRunner Map Task Executor
#0-EventThread] org.apache.zookeeper.ClientCnxn: EventThread shut down
2016-05-04 14:33:48,828 INFO [LocalJobRunner Map Task Executor #0]
org.apache.zookeeper.ZooKeeper: Session: 0x152f8f85214096b closed
2016-05-04 14:33:48,839 INFO [LocalJobRunner Map Task Executor #0]
org.apache.hadoop.mapred.MapTask: Starting flush of map output
2016-05-04 14:33:48,839 INFO [LocalJobRunner Map Task Executor #0]
org.apache.hadoop.mapred.MapTask: Spilling map output
2016-05-04 14:33:48,839 INFO [LocalJobRunner Map Task Executor #0]
org.apache.hadoop.mapred.MapTask: bufstart = 0; bufend = 5734062;
bufvoid = 104857600
2016-05-04 14:33:48,839 INFO [LocalJobRunner Map Task Executor #0]
org.apache.hadoop.mapred.MapTask: kvstart = 26214396(104857584); kvend =
26210008(104840032); length = 4389/6553600
2016-05-04 14:33:48,874 INFO [LocalJobRunner Map Task Executor #0]
org.apache.hadoop.mapred.MapTask: Finished spill 0
2016-05-04 14:33:48,877 INFO [LocalJobRunner Map Task Executor #0]
org.apache.hadoop.mapred.Task:
Task:attempt_local1149688163_0001_m_000009_0 is done. And is in the
process of committing
2016-05-04 14:33:48,897 INFO [LocalJobRunner Map Task Executor #0]
org.apache.hadoop.mapred.LocalJobRunner: map
2016-05-04 14:33:48,897 INFO [LocalJobRunner Map Task Executor #0]
org.apache.hadoop.mapred.Task: Task
'attempt_local1149688163_0001_m_000009_0' done.
2016-05-04 14:33:48,897 INFO [LocalJobRunner Map Task Executor #0]
org.apache.hadoop.mapred.LocalJobRunner: Finishing task:
attempt_local1149688163_0001_m_000009_0
2016-05-04 14:33:48,897 INFO [Thread-42]
org.apache.hadoop.mapred.LocalJobRunner: map task executor complete.
2016-05-04 14:33:48,901 INFO [Thread-42]
org.apache.hadoop.mapred.LocalJobRunner: Waiting for reduce tasks
2016-05-04 14:33:48,901 INFO [pool-9-thread-1]
org.apache.hadoop.mapred.LocalJobRunner: Starting task:
attempt_local1149688163_0001_r_000000_0
2016-05-04 14:33:48,918 INFO [pool-9-thread-1]
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: File Output
Committer Algorithm version is 1
2016-05-04 14:33:48,919 INFO [pool-9-thread-1]
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter:
FileOutputCommitter skip cleanup _temporary folders under output
directory:false, ignore cleanup failures: false
2016-05-04 14:33:48,919 INFO [pool-9-thread-1]
org.apache.hadoop.mapred.Task: Using ResourceCalculatorProcessTree : [ ]
2016-05-04 14:33:48,932 INFO [pool-9-thread-1]
org.apache.hadoop.mapred.ReduceTask: Using ShuffleConsumerPlugin:
org.apache.hadoop.mapreduce.task.reduce.Shuffle@697f13c9
2016-05-04 14:33:48,959 INFO [pool-9-thread-1]
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: MergerManager:
memoryLimit=289931264, maxSingleShuffleLimit=72482816,
mergeThreshold=191354640, ioSortFactor=10, memToMemMergeOutputsThreshold=10
2016-05-04 14:33:48,965 INFO [EventFetcher for fetching Map
Completion Events] org.apache.hadoop.mapreduce.task.reduce.EventFetcher:
attempt_local1149688163_0001_r_000000_0 Thread started: EventFetcher for
fetching Map Completion Events
2016-05-04 14:33:49,035 INFO [localfetcher#1]
org.apache.hadoop.mapreduce.task.reduce.LocalFetcher: localfetcher#1
about to shuffle output of map attempt_local1149688163_0001_m_000007_0
decomp: 5381537 len: 5381541 to MEMORY
2016-05-04 14:33:49,056 INFO [localfetcher#1]
org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 5381537
bytes from map-output for attempt_local1149688163_0001_m_000007_0
2016-05-04 14:33:49,061 INFO [localfetcher#1]
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl:
closeInMemoryFile -> map-output of size: 5381537,
inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->5381537
2016-05-04 14:33:49,070 INFO [localfetcher#1]
org.apache.hadoop.mapreduce.task.reduce.LocalFetcher: localfetcher#1
about to shuffle output of map attempt_local1149688163_0001_m_000000_0
decomp: 5472201 len: 5472205 to MEMORY
2016-05-04 14:33:49,084 INFO [localfetcher#1]
org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 5472201
bytes from map-output for attempt_local1149688163_0001_m_000000_0
2016-05-04 14:33:49,084 INFO [localfetcher#1]
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl:
closeInMemoryFile -> map-output of size: 5472201,
inMemoryMapOutputs.size() -> 2, commitMemory -> 5381537, usedMemory
->10853738
2016-05-04 14:33:49,110 INFO [localfetcher#1]
org.apache.hadoop.mapreduce.task.reduce.LocalFetcher: localfetcher#1
about to shuffle output of map attempt_local1149688163_0001_m_000001_0
decomp: 5387977 len: 5387981 to MEMORY
2016-05-04 14:33:49,124 INFO [localfetcher#1]
org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 5387977
bytes from map-output for attempt_local1149688163_0001_m_000001_0
2016-05-04 14:33:49,125 INFO [localfetcher#1]
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl:
closeInMemoryFile -> map-output of size: 5387977,
inMemoryMapOutputs.size() -> 3, commitMemory -> 10853738, usedMemory
->16241715
2016-05-04 14:33:49,129 INFO [localfetcher#1]
org.apache.hadoop.mapreduce.task.reduce.LocalFetcher: localfetcher#1
about to shuffle output of map attempt_local1149688163_0001_m_000004_0
decomp: 5347914 len: 5347918 to MEMORY
2016-05-04 14:33:49,143 INFO [localfetcher#1]
org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 5347914
bytes from map-output for attempt_local1149688163_0001_m_000004_0
2016-05-04 14:33:49,144 INFO [localfetcher#1]
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl:
closeInMemoryFile -> map-output of size: 5347914,
inMemoryMapOutputs.size() -> 4, commitMemory -> 16241715, usedMemory
->21589629
2016-05-04 14:33:49,148 INFO [localfetcher#1]
org.apache.hadoop.mapreduce.task.reduce.LocalFetcher: localfetcher#1
about to shuffle output of map attempt_local1149688163_0001_m_000002_0
decomp: 5671398 len: 5671402 to MEMORY
2016-05-04 14:33:49,161 INFO [localfetcher#1]
org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 5671398
bytes from map-output for attempt_local1149688163_0001_m_000002_0
2016-05-04 14:33:49,161 INFO [localfetcher#1]
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl:
closeInMemoryFile -> map-output of size: 5671398,
inMemoryMapOutputs.size() -> 5, commitMemory -> 21589629, usedMemory
->27261027
2016-05-04 14:33:49,166 INFO [localfetcher#1]
org.apache.hadoop.mapreduce.task.reduce.LocalFetcher: localfetcher#1
about to shuffle output of map attempt_local1149688163_0001_m_000005_0
decomp: 5743249 len: 5743253 to MEMORY
2016-05-04 14:33:49,180 INFO [localfetcher#1]
org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 5743249
bytes from map-output for attempt_local1149688163_0001_m_000005_0
2016-05-04 14:33:49,180 INFO [localfetcher#1]
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl:
closeInMemoryFile -> map-output of size: 5743249,
inMemoryMapOutputs.size() -> 6, commitMemory -> 27261027, usedMemory
->33004276
2016-05-04 14:33:49,184 INFO [localfetcher#1]
org.apache.hadoop.mapreduce.task.reduce.LocalFetcher: localfetcher#1
about to shuffle output of map attempt_local1149688163_0001_m_000008_0
decomp: 5471488 len: 5471492 to MEMORY
2016-05-04 14:33:49,197 INFO [localfetcher#1]
org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 5471488
bytes from map-output for attempt_local1149688163_0001_m_000008_0
2016-05-04 14:33:49,197 INFO [localfetcher#1]
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl:
closeInMemoryFile -> map-output of size: 5471488,
inMemoryMapOutputs.size() -> 7, commitMemory -> 33004276, usedMemory
->38475764
2016-05-04 14:33:49,313 INFO [localfetcher#1]
org.apache.hadoop.mapreduce.task.reduce.LocalFetcher: localfetcher#1
about to shuffle output of map attempt_local1149688163_0001_m_000003_0
decomp: 5579502 len: 5579506 to MEMORY
2016-05-04 14:33:49,327 INFO [localfetcher#1]
org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 5579502
bytes from map-output for attempt_local1149688163_0001_m_000003_0
2016-05-04 14:33:49,327 INFO [localfetcher#1]
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl:
closeInMemoryFile -> map-output of size: 5579502,
inMemoryMapOutputs.size() -> 8, commitMemory -> 38475764, usedMemory
->44055266
2016-05-04 14:33:49,332 INFO [localfetcher#1]
org.apache.hadoop.mapreduce.task.reduce.LocalFetcher: localfetcher#1
about to shuffle output of map attempt_local1149688163_0001_m_000006_0
decomp: 5605456 len: 5605460 to MEMORY
2016-05-04 14:33:49,344 INFO [main]
org.apache.hadoop.mapreduce.Job: map 100% reduce 0%
2016-05-04 14:33:49,349 INFO [localfetcher#1]
org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 5605456
bytes from map-output for attempt_local1149688163_0001_m_000006_0
2016-05-04 14:33:49,349 INFO [localfetcher#1]
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl:
closeInMemoryFile -> map-output of size: 5605456,
inMemoryMapOutputs.size() -> 9, commitMemory -> 44055266, usedMemory
->49660722
2016-05-04 14:33:49,354 INFO [localfetcher#1]
org.apache.hadoop.mapreduce.task.reduce.LocalFetcher: localfetcher#1
about to shuffle output of map attempt_local1149688163_0001_m_000009_0
decomp: 5738455 len: 5738459 to MEMORY
2016-05-04 14:33:49,370 INFO [localfetcher#1]
org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 5738455
bytes from map-output for attempt_local1149688163_0001_m_000009_0
2016-05-04 14:33:49,370 INFO [localfetcher#1]
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl:
closeInMemoryFile -> map-output of size: 5738455,
inMemoryMapOutputs.size() -> 10, commitMemory -> 49660722, usedMemory
->55399177
2016-05-04 14:33:49,373 INFO [EventFetcher for fetching Map
Completion Events] org.apache.hadoop.mapreduce.task.reduce.EventFetcher:
EventFetcher is interrupted.. Returning
2016-05-04 14:33:49,375 INFO [pool-9-thread-1]
org.apache.hadoop.mapred.LocalJobRunner: 10 / 10 copied.
2016-05-04 14:33:49,376 INFO [pool-9-thread-1]
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: finalMerge
called with 10 in-memory map-outputs and 0 on-disk map-outputs
2016-05-04 14:33:49,388 INFO [pool-9-thread-1]
org.apache.hadoop.mapred.Merger: Merging 10 sorted segments
2016-05-04 14:33:49,389 INFO [pool-9-thread-1]
org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 10
segments left of total size: 55398877 bytes
2016-05-04 14:33:49,711 INFO [pool-9-thread-1]
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: Merged 10
segments, 55399177 bytes to disk to satisfy reduce memory limit
2016-05-04 14:33:49,712 INFO [pool-9-thread-1]
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: Merging 1
files, 55399163 bytes from disk
2016-05-04 14:33:49,713 INFO [pool-9-thread-1]
org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: Merging 0
segments, 0 bytes from memory into reduce
2016-05-04 14:33:49,714 INFO [pool-9-thread-1]
org.apache.hadoop.mapred.Merger: Merging 1 sorted segments
2016-05-04 14:33:49,714 INFO [pool-9-thread-1]
org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 1
segments left of total size: 55399129 bytes
2016-05-04 14:33:49,715 INFO [pool-9-thread-1]
org.apache.hadoop.mapred.LocalJobRunner: 10 / 10 copied.
2016-05-04 14:33:49,742 INFO [Thread-42]
org.apache.hadoop.mapred.LocalJobRunner: reduce task executor complete.
2016-05-04 14:33:49,797 WARN [Thread-42]
org.apache.hadoop.mapred.LocalJobRunner: job_local1149688163_0001
java.lang.Exception: java.io.IOException: Mkdirs failed to create
file:/user/hdfs/sessions/777/23115/inputRecordsAsWritables/_temporary/0/_temporary/attempt_local1149688163_0001_r_000000_0
(exists=false,
cwd=file:/hadoop/yarn/local/usercache/hdfs/appcache/application_1461858162941_0054/container_e12_1461858162941_0054_01_000002)
at
org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
Caused by: java.io.IOException: Mkdirs failed to create
file:/user/hdfs/sessions/777/23115/inputRecordsAsWritables/_temporary/0/_temporary/attempt_local1149688163_0001_r_000000_0
(exists=false,
cwd=file:/hadoop/yarn/local/usercache/hdfs/appcache/application_1461858162941_0054/container_e12_1461858162941_0054_01_000002)
at
org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:449)
at
org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:435)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:909)
at
org.apache.hadoop.io.SequenceFile$Writer.<init>(SequenceFile.java:1074)
at
org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:273)
at
org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:530)
at
org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat.getSequenceWriter(SequenceFileOutputFormat.java:64)
at
org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat.getRecordWriter(SequenceFileOutputFormat.java:75)
at
org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.<init>(ReduceTask.java:540)
at
org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:614)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
at
org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2016-05-04 14:33:50,346 INFO [main]
org.apache.hadoop.mapreduce.Job: Job job_local1149688163_0001 failed
with state FAILED due to: NA
2016-05-04 14:33:50,407 INFO [main]
org.apache.hadoop.mapreduce.Job: Counters: 38
File System Counters
FILE: Number of bytes read=1287449333
FILE: Number of bytes written=1607139426
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=1111590
HDFS: Number of bytes written=220
HDFS: Number of read operations=40
HDFS: Number of large read operations=0
HDFS: Number of write operations=20
Map-Reduce Framework
Map input records=10906
Map output records=10906
Map output bytes=55355550
Map output materialized bytes=55399217
Input split bytes=2900
Combine input records=0
Combine output records=0
Reduce input groups=0
Reduce shuffle bytes=55399217
Reduce input records=0
Reduce output records=0
Spilled Records=10906
Shuffled Maps =10
Failed Shuffles=0
Merged Map outputs=10
GC time elapsed (ms)=641
CPU time spent (ms)=11290
Physical memory (bytes) snapshot=4507889664
Virtual memory (bytes) snapshot=22225674240
Total committed heap usage (bytes)=2925002752
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=0
And here is the exception from the next job:
Failing Oozie Launcher, Main class
[org.apache.oozie.action.hadoop.JavaMain], main() threw exception,
org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path
does not exist: file:/user/hdfs/sessions/777/23115/inputRecordsAsWritables
org.apache.oozie.action.hadoop.JavaMainException:
org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path
does not exist: file:/user/hdfs/sessions/777/23115/inputRecordsAsWritables
at org.apache.oozie.action.hadoop.JavaMain.run(JavaMain.java:59)
at
org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:47)
at org.apache.oozie.action.hadoop.JavaMain.main(JavaMain.java:35)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at
org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:241)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by:
org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path
does not exist: file:/user/hdfs/sessions/777/23115/inputRecordsAsWritables
at
org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:323)
at
org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:265)
at
org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:59)
at
org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:387)
at
org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:301)
at
org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:318)
at
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:196)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
at
com.nissatech.kmedoidsusingfames.algorithms.initialization.RandomSeedDriver.generateRandomSeed(RandomSeedDriver.java:52)
at
com.nissatech.kmedoidsusingfames.algorithms.initialization.ScalableKMeansPPInitialization.performInitialization(ScalableKMeansPPInitialization.java:43)
at
com.nissatech.kmedoidsusingfames.algorithms.kmedoids.KMedoidsUsingFAMES.perform(KMedoidsUsingFAMES.java:54)
at
com.nissatech.kmedoidsusingfames.algorithms.ClusteringAlgorithmRepetitor.performIteratingForSameNoOfClusters(ClusteringAlgorithmRepetitor.java:43)
at
com.nissatech.kmedoidsusingfames.algorithms.ClusteringAlgorithmIterator.performTraining(ClusteringAlgorithmIterator.java:46)
at
com.nissatech.kmedoidsusingfames.orchestration.Orchestrator.main(Orchestrator.java:74)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.oozie.action.hadoop.JavaMain.run(JavaMain.java:56)
... 15 more
It seems to me that the first job runs locally (note the LocalJobRunner
entries and the file:/ paths), and hence there is no result on HDFS for
the next job to read. Am I wrong?
___________________________
I was able to make my MR jobs run on the HDP cluster by adding this to
the configuration (based on the following link):
Configuration conf = new Configuration(false);
conf.addResource(new Path("file:///",
        System.getProperty("oozie.action.conf.xml")));
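In case it helps, here is a minimal sketch of how I could centralize this
so every driver in the sequence picks up the action configuration without
repeating the snippet. The class and method names are my own invention,
not anything from the Oozie or Hadoop APIs:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

// Hypothetical helper (my own, not part of Oozie/Hadoop): builds a
// Configuration that includes the Oozie action config when the job is
// launched from an Oozie Java action, and falls back to the classpath
// *-site.xml files otherwise.
public final class JobConfFactory {

    private JobConfFactory() {}

    public static Configuration create() {
        // Oozie sets this system property in the launcher JVM; it points
        // to an XML file holding the action's effective configuration
        // (including mapreduce.framework.name and fs.defaultFS).
        String actionConf = System.getProperty("oozie.action.conf.xml");
        if (actionConf != null) {
            Configuration conf = new Configuration(false);
            conf.addResource(new Path("file:///", actionConf));
            return conf;
        }
        // Outside Oozie (e.g. plain "hadoop jar"), the default constructor
        // loads core-site.xml / mapred-site.xml / yarn-site.xml as usual.
        return new Configuration();
    }
}
```

Each driver would then call JobConfFactory.create() instead of new
Configuration(), so the same code works both under Oozie and standalone.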
But why do I need to do that, and how can I avoid it? I have a sequence
of MR jobs run from this Java action, and I don't want to bind myself to
Oozie by adding this to the configuration of each job. Is there a way to
make my jobs run on the cluster from Oozie by default?
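My current guess (an assumption on my part, not verified against the
Oozie source) is that new Configuration() only reads the *-site.xml files
it finds on the classpath, and the launcher container apparently doesn't
carry the cluster's mapred-site.xml, so mapreduce.framework.name silently
falls back to "local" and fs.defaultFS to "file:///". That would explain
both the LocalJobRunner entries and the file:/user/hdfs/... paths. If
that's right, an uglier but Oozie-independent workaround would be setting
the two properties explicitly; hostnames and ports below are placeholders
for my cluster, not values from the logs:

```java
import org.apache.hadoop.conf.Configuration;

public class ExplicitClusterConf {
    public static Configuration create() {
        Configuration conf = new Configuration();
        // Without these, a config-less classpath defaults the job to
        // LocalJobRunner and the local file system.
        conf.set("mapreduce.framework.name", "yarn");
        // Placeholder host/ports; substitute the real NameNode and
        // ResourceManager addresses for the cluster.
        conf.set("fs.defaultFS", "hdfs://namenode.example.com:8020");
        conf.set("yarn.resourcemanager.address",
                 "resourcemanager.example.com:8032");
        return conf;
    }
}
```

Hard-coding addresses like this is obviously fragile, which is why I'd
prefer a way for the cluster config to be picked up automatically.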
I should probably mention that this is an HDP cluster and that the setup
was performed through Ambari.
--
*Marko Dinić*
/Software engineer @/
Nissatech
Kajmakčalanska 8
18000 Niš, Serbia
website <http://www.nissatech.com> | email
<mailto:[email protected]>
tel/fax: +381 18 288 111
mobile: +381 63 82 49 556
skype: vesto91