On 2 Sep 2016, at 18:40, Dongjoon Hyun <dongj...@apache.org> wrote:

Hi, Rostyslav,

After your email, I also tried to search in this morning, but I didn't find a 
proper one.

The last related issue is SPARK-8064, `Upgrade Hive to 1.2`

https://issues.apache.org/jira/browse/SPARK-8064

If you want, you can file a JIRA issue covering your pain points, and then track progress through it.

I guess you have more reasons for doing that than just a compilation issue.



That was a pretty major change, as Spark SQL and the Spark Thrift server make use of the library in ways the Hive authors never intended, which forced the Spark team to do terrible things to get everything to hook up (Thrift especially).

On the SQL side of things, parser changes broke stuff, as did changed error messages. The work there involved catching up with those changes and distinguishing real regressions from error-message rewording that merely triggered false alarms.
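To make the false-alarm point concrete, here is a purely hypothetical Scala sketch (not actual Spark test code) of the kind of assertion that breaks across an upgrade: it checks the wording of the failure rather than the fact that the query was rejected:

  // Hypothetical helper, not from the Spark code base: it passes only when the
  // failure message contains a specific Hive phrasing, so a harmless rewording
  // in a new Hive release fails the test even though the query is still
  // (correctly) rejected.
  def expectParseError(runQuery: () => Unit): Unit = {
    try {
      runQuery()
      throw new AssertionError("expected the query to be rejected")
    } catch {
      case e: Exception =>
        // Brittle: couples the test to Hive's exact error text.
        assert(e.getMessage != null && e.getMessage.contains("cannot recognize input"),
          s"unexpected error message: ${e.getMessage}")
    }
  }

Asserting on the exception type or an error code, where one exists, survives that kind of rewording far better.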

Oh, and then there was the Kryo version. Twitter have been moving Chill to Kryo 3 in sync with their other codebases (Storm?), and Spark's Kryo version is driven by Chill; Hive needs to stay in sync there, or (as is done for Spark) a custom build of the hive JAR has to be made, forcing it onto the same Kryo version as Chill and Spark.
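Purely as an illustration (the coordinates and version numbers below are assumptions, not taken from Spark's or Hive's real builds), pinning the Chill/Kryo pair in an sbt project looks roughly like this; the point is that Kryo arrives transitively via Chill, so anything Hive-side has to agree with whatever Kryo major version Chill pulls in:

  // build.sbt fragment, sbt 0.13 style; versions are illustrative only.
  dependencyOverrides ++= Set(
    "com.twitter"          %% "chill"       % "0.8.0",
    "com.esotericsoftware" %  "kryo-shaded" % "3.0.3"
  )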

I did some preparatory work on a branch opening the Hive Thrift server up for better subclassing:

https://issues.apache.org/jira/browse/SPARK-10793
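Very roughly, and with nothing below taken from real Spark or Hive code, the shape being aimed at is a clean subclass of the server rather than a forked copy of it:

  import org.apache.hadoop.hive.conf.HiveConf
  import org.apache.hive.service.server.HiveServer2

  // Sketch of the kind of extension point the branch is after: subclass the
  // Thrift server and hook initialisation. Whether the services an embedder
  // needs can actually be registered here without reflection tricks is exactly
  // what the "better subclassing" work has to enable.
  class EmbeddedThriftServer extends HiveServer2 {
    override def init(hiveConf: HiveConf): Unit = {
      // Register embedder-specific services here (hypothetical today, since
      // much of the server's internal state is not open to subclasses).
      super.init(hiveConf)
    }
  }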

(FWIW, Hive 1.2.1 actually uses a copy and paste of the Hadoop 0.23 version of the Hadoop YARN service classes, without the YARN-117 changes. If those could be moved back to the Hadoop reference implementation (i.e. commit to Hadoop 2.2+ and migrate back), and the Thrift classes were reworked for better subclassing, life would be simpler, leaving only the SQL changes and the protobuf and Kryo versions...)
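For reference, the post-YARN-117 lifecycle in Hadoop 2.2+ looks roughly like this minimal sketch against org.apache.hadoop.service.AbstractService (illustrative, not Hive code); the classes copied into Hive 1.2.1 predate this model:

  import org.apache.hadoop.conf.Configuration
  import org.apache.hadoop.service.AbstractService

  // Subclasses override the protected serviceInit/serviceStart/serviceStop
  // hooks and AbstractService drives the state transitions.
  class ExampleService extends AbstractService("example") {
    override def serviceInit(conf: Configuration): Unit = {
      // one-off configuration goes here
      super.serviceInit(conf)
    }
    override def serviceStart(): Unit = {
      // acquire resources, start threads
      super.serviceStart()
    }
    override def serviceStop(): Unit = {
      // release resources; must be safe to call even if start never completed
      super.serviceStop()
    }
  }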

Bests,
Dongjoon.



On Fri, Sep 2, 2016 at 12:51 AM, Rostyslav Sotnychenko <r.sotnyche...@gmail.com> wrote:
Hello!

I tried compiling Spark 2.0 with Hive 2.0, but as expected, this failed.

So I am wondering whether there are any talks going on about adding support for Hive 2.x to Spark? I was unable to find any JIRA about this.


Thanks,
Rostyslav


