> On January 23, 2015, 2:05 a.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java,
> >  line 220
> > <https://reviews.apache.org/r/30107/diff/4/?file=829689#file829689line220>
> >
> >     So, this is the code that adds the jars to the classpath of the remote 
> > driver?
> >     
>     I'm wondering why these jars are necessary in order to deserialize 
> > SparkWork.
> 
> chengxiang li wrote:
>     Same as in the previous comments: SparkWork contains MapWork/ReduceWork, 
> which contains the operator tree, and UTFFOperator needs to load a class from 
> the added jar.
> 
> Xuefu Zhang wrote:
>     Sorry, but which operator? UTFFOperator? I couldn't find it in the Hive source.

Sorry, as you can see from the error log in JIRA, the extra class from the 
added jar is referenced by UDTFOperator (through its UDTFDesc):

org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: 
de.bankmark.bigbench.queries.q10.SentimentUDF
Serialization trace:
genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc)
conf (org.apache.hadoop.hive.ql.exec.UDTFOperator)
childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
childOperators (org.apache.hadoop.hive.ql.exec.MapJoinOperator)
childOperators (org.apache.hadoop.hive.ql.exec.FilterOperator)
childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
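
For illustration only, here is a minimal sketch of the general idea behind the
fix: make the added jars visible through the thread context classloader before
the plan is deserialized. This is not the code in the patch. The class
AddedJarClasspath and both of its methods are hypothetical names, and the
sample jar holds a placeholder entry instead of a real UDF class.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

// Hypothetical helper for illustration; not the SparkClientUtilities from the patch.
final class AddedJarClasspath {
    private AddedJarClasspath() {}

    /**
     * Wraps the current thread context classloader in a URLClassLoader that can
     * also see the given jars, installs it on the current thread, and returns it.
     */
    static ClassLoader addToContextClassLoader(List<Path> jars) throws MalformedURLException {
        List<URL> urls = new ArrayList<>();
        for (Path jar : jars) {
            urls.add(jar.toUri().toURL());
        }
        ClassLoader parent = Thread.currentThread().getContextClassLoader();
        ClassLoader extended = new URLClassLoader(urls.toArray(new URL[0]), parent);
        Thread.currentThread().setContextClassLoader(extended);
        return extended;
    }

    /** Builds a tiny jar with one placeholder entry so the sketch can be tried without a real UDF jar. */
    static Path writeSampleJar(String entryName) throws IOException {
        Path jar = Files.createTempFile("added", ".jar");
        try (OutputStream out = Files.newOutputStream(jar);
             JarOutputStream jos = new JarOutputStream(out)) {
            jos.putNextEntry(new JarEntry(entryName));
            jos.write("placeholder".getBytes(StandardCharsets.UTF_8));
            jos.closeEntry();
        }
        return jar;
    }
}
```

A deserializer that resolves classes through the thread context classloader
would then be able to find entries from the added jar. Whether Kryo picks up
that classloader depends on how it is configured, so treat that part as an
assumption as well.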


- chengxiang


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30107/#review69329
-----------------------------------------------------------


On January 22, 2015, 9:23 a.m., chengxiang li wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30107/
> -----------------------------------------------------------
> 
> (Updated January 22, 2015, 9:23 a.m.)
> 
> 
> Review request for hive and Xuefu Zhang.
> 
> 
> Bugs: HIVE-9410
>     https://issues.apache.org/jira/browse/HIVE-9410
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> The RemoteDriver does not have the added jars in its classpath, so it fails 
> to deserialize SparkWork with a ClassNotFoundException. For Hive on MR, when 
> a jar is added through the Hive CLI, Hive adds it to the CLI classpath 
> (through the thread context classloader) and to the distributed cache as 
> well. Compared to Hive on MR, Hive on Spark has an extra RemoteDriver 
> component, so we should add the added jars to its classpath as well.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java d7cb111 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java 30a00a7 
>   spark-client/src/main/java/org/apache/hive/spark/client/JobContext.java 00aa4ec 
>   spark-client/src/main/java/org/apache/hive/spark/client/JobContextImpl.java 1eb3ff2 
>   spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 5f9be65 
>   spark-client/src/main/java/org/apache/hive/spark/client/SparkClientUtilities.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/30107/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> chengxiang li
> 
>
