Georgi Ivanov created OOZIE-2643:
------------------------------------
Summary: Hive on Tez via Oozie fails with ClassNotFoundException
against tables with HCatalog JsonSerDe
Key: OOZIE-2643
URL: https://issues.apache.org/jira/browse/OOZIE-2643
Project: Oozie
Issue Type: Bug
Reporter: Georgi Ivanov
If we create an Oozie workflow with a Hive action that uses the Tez execution
engine and references a table defined with the HCatalog JsonSerDe, Oozie does
not properly localize the sharelib jars for the Tez session. It localizes them
for the Hive action, but they do not propagate to Tez. The same workflow works
fine with Hive on MR; the problem appears only with Oozie and Tez together.
The workflow fails with a ClassNotFoundException:
2016-08-18 08:35:00,337 [ERROR] [TezChild] |tez.TezProcessor|:
java.lang.RuntimeException: Map operator initialization failed
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:265)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:149)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:181)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.data.JsonSerDe not found
    at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:347)
    at org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:382)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:227)
    ... 15 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.data.JsonSerDe not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
    at org.apache.hadoop.hive.ql.plan.PartitionDesc.getDeserializer(PartitionDesc.java:143)
    at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:313)
    ... 17 more
The only workaround I have found so far is to add the HCatalog sharelib
directory to tez.aux.uris as a configuration property in workflow.xml:
<property>
    <name>tez.aux.uris</name>
    <value>${nameNode}/user/oozie/share/lib/hcatalog/</value>
</property>
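
For context, here is a rough sketch of where that property might sit inside a complete Hive action. Only the tez.aux.uris property and its value come from the report above. The schema versions, the workflow, node and script names, the ${jobTracker} parameter, and setting hive.execution.engine in the action configuration are all assumptions made for illustration, so adjust them to your own setup.

```xml
<!-- Sketch only: names and schema versions below are assumed, not taken from the report. -->
<workflow-app xmlns="uri:oozie:workflow:0.5" name="hive-tez-jsonserde-wf">
    <start to="hive-node"/>

    <action name="hive-node">
        <hive xmlns="uri:oozie:hive-action:0.5">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <!-- Assumption: the Tez engine is selected per action here;
                     it could equally be set in hive-site.xml or in the script. -->
                <property>
                    <name>hive.execution.engine</name>
                    <value>tez</value>
                </property>
                <!-- Workaround from the report: expose the HCatalog sharelib
                     jars to the Tez session as auxiliary resources. -->
                <property>
                    <name>tez.aux.uris</name>
                    <value>${nameNode}/user/oozie/share/lib/hcatalog/</value>
                </property>
            </configuration>
            <!-- Hypothetical script that queries the JsonSerDe-backed table. -->
            <script>query.hql</script>
        </hive>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>Hive action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

Placing the property in the action's configuration block keeps the workaround scoped to the one action that needs the SerDe instead of changing cluster-wide Tez settings.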