Georgi Ivanov created OOZIE-2643:
------------------------------------

             Summary: Hive on Tez via Oozie fails with ClassNotFoundException 
against tables with HCatalog JSON SerDe
                 Key: OOZIE-2643
                 URL: https://issues.apache.org/jira/browse/OOZIE-2643
             Project: Oozie
          Issue Type: Bug
            Reporter: Georgi Ivanov


If we create an Oozie workflow with a Hive action that uses the Tez execution 
engine and references a table using the HCatalog JSON SerDe, Oozie does not 
properly localize the sharelib jars for the Tez session. It localizes them for 
the Hive action, but they do not propagate to Tez. The same setup works fine 
with Hive on MR; the problem appears only with Oozie and Tez. The workflow 
fails with a ClassNotFoundException:

2016-08-18 08:35:00,337 [ERROR] [TezChild] |tez.TezProcessor|: java.lang.RuntimeException: Map operator initialization failed
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:265)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:149)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:181)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.data.JsonSerDe not found
at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:347)
at org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:382)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:227)
... 15 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.data.JsonSerDe not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
at org.apache.hadoop.hive.ql.plan.PartitionDesc.getDeserializer(PartitionDesc.java:143)
at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:313)
... 17 more

The only workaround I have found so far is to add the hcatalog sharelib 
directory to tez.aux.uris as a configuration property inside workflow.xml:

                <property>
                    <name>tez.aux.uris</name>
                    <value>${nameNode}/user/oozie/share/lib/hcatalog/</value>
                </property>
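
For context, here is a rough sketch of where that property might sit in a 
complete workflow.xml. Only the tez.aux.uris property and its value come from 
this report; the workflow and action names, the script name (query.hql), the 
${jobTracker} parameter, the schema versions, and setting 
hive.execution.engine in the action configuration are all assumptions made for 
illustration. The engine could just as well be selected in hive-site.xml or in 
the script itself.

```xml
<!-- Sketch only: names, script and schema versions below are assumptions. -->
<workflow-app name="hive-tez-json-wf" xmlns="uri:oozie:workflow:0.5">
    <start to="hive-tez-json"/>
    <action name="hive-tez-json">
        <hive xmlns="uri:oozie:hive-action:0.5">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <!-- Assumption: Tez is selected here rather than elsewhere. -->
                <property>
                    <name>hive.execution.engine</name>
                    <value>tez</value>
                </property>
                <!-- Workaround from this report: expose the hcatalog
                     sharelib directory to the Tez session. -->
                <property>
                    <name>tez.aux.uris</name>
                    <value>${nameNode}/user/oozie/share/lib/hcatalog/</value>
                </property>
            </configuration>
            <!-- Hypothetical script that queries the JSON SerDe table. -->
            <script>query.hql</script>
        </hive>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Hive action failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

The sharelib path shown is the one used above; on a cluster where the sharelib 
lives in a timestamped lib_* subdirectory, the value would presumably need to 
point there instead.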



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
