oozie workflow fails for hive query of ORC table

xufeng Mon, 25 May 2015 02:06:08 -0700

Hi everyone,

Has anyone run into this issue and solved it?
Thanks for your help.

http://zh.hortonworks.com/community/forums/topic/oozie-workflow-fails-for-hive-query-of-orc-table/
************************************************************

We have a Hive script that handles incremental updates for a big table. The
script essentially merges the incremental table with the base table and then
uses the result to overwrite the base table. The base table is an ORC table.
The script works without any problem when executed from the Hive shell. But
when we run it in an Oozie workflow via the Hive action, it fails because, in
the second stage of the MapReduce job, all the reduce attempts fail with the
following error:
“TaskAttempt killed because it ran on unusable node”
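
The script itself is not included in this post. As a rough illustration of the merge-then-overwrite pattern described above, here is a minimal Python sketch over in-memory rows. It assumes rows are matched on a hypothetical `id` column and that an incremental row replaces the base row with the same key; the actual Hive script may use different keys or merge rules.

```python
from typing import Dict, Iterable, List

Row = Dict[str, object]


def merge_incremental(base: Iterable[Row],
                      incremental: Iterable[Row],
                      key: str = "id") -> List[Row]:
    """Apply incremental rows on top of base rows.

    Assumption (not from the original post): rows are matched on `key`,
    and an incremental row replaces the base row with the same key.
    """
    # Index the base rows by key; dicts preserve insertion order.
    merged: Dict[object, Row] = {row[key]: row for row in base}
    for row in incremental:
        # Existing keys are updated in place, new keys are appended.
        merged[row[key]] = row
    return list(merged.values())


# Hypothetical sample data standing in for the base and incremental tables.
base_table = [{"id": 1, "value": "a"}, {"id": 2, "value": "b"}]
incremental_table = [{"id": 2, "value": "b2"}, {"id": 3, "value": "c"}]

# "Overwrite" the base table with the merged result.
base_table = merge_incremental(base_table, incremental_table)
```

In Hive, the equivalent would be a query that combines both tables and writes the result back over the base table; the sketch only shows the intended row-level semantics.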

We also tried to use the Shell action of the Oozie workflow to execute the
script. It also fails, with a different error, shown below: it complains that
it can’t find the meta file job.splitmetainfo, which is generated
automatically by MapReduce. How can we make it work in an Oozie workflow?
Thanks a lot in advance.
==================================================================
Job init failed : org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.FileNotFoundException: File does not exist: hdfs://qa1-sjc001-031.i.jasperwireless.com:8020/user/hdfs/.staging/job_1424730037228_0101/job.splitmetainfo
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1568)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1432)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1390)
at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:996)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:138)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1289)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1057)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$4.run(MRAppMaster.java:1500)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1496)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1429)
Caused by: java.io.FileNotFoundException: File does not exist: hdfs://qa1-sjc001-031.i.jasperwireless.com:8020/user/hdfs/.staging/job_1424730037228_0101/job.splitmetainfo
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1122)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1114)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1114)
at org.apache.hadoop.mapreduce.split.SplitMetaInfoReader.readSplitMetaInfo(SplitMetaInfoReader.java:51)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:15
