If the link to PR/1819 is broken, here it is: https://github.com/apache/spark/pull/1819.
On Sun, Aug 10, 2014 at 5:56 PM, Eric Friedman <eric.d.fried...@gmail.com> wrote:

> Thanks Michael, I can try that too.
>
> I know you guys aren't in sales/marketing (thank G-d), but given all the
> hoopla about the CDH<->DataBricks partnership, it'd be awesome if you guys
> were somewhat more aligned, by which I mean that the DataBricks releases on
> Apache that say "for CDH5" would actually work on CDH5. I know Cloudera has
> to qualify them for support and so on, but if DataBricks development
> treated mainstream CDH as the primary deployment target, well that would be
> great. Of course I'm being selfish. *smile*
>
> On Aug 10, 2014, at 2:43 PM, Michael Armbrust <mich...@databricks.com> wrote:
>
>> I imagine it's not the only instance of this kind of problem people
>> will ever encounter. Can you rebuild Spark with this particular
>> release of Hive?
>
> Unfortunately the Hive APIs that we use change too much from release to
> release to make this possible. There is a JIRA for compiling Spark SQL
> against Hive 13: SPARK-2706
> <https://issues.apache.org/jira/browse/SPARK-2706>.
>
>> If I try to add hive-exec-0.12.0-cdh5.0.3.jar to my SPARK_CLASSPATH, in
>> order to get DeprecatedParquetInputFormat, I find out that there is an
>> incompatibility in the SerDeUtils class. Spark's Hive snapshot expects to
>> find
>
> Instead of including CDH's version of Hive, I'd try just including the
> Hive jars for Parquet from here:
> http://mvnrepository.com/artifact/com.twitter/parquet-hive-bundle/1.5.0
>
> However, support for this is a work in progress. You'll likely need to
> make sure you have a version of Spark that includes this commit (added last
> Friday):
> https://github.com/apache/spark/commit/9016af3f2729101027e33593e094332f05f48d92
>
> Another option would be to try this *experimental* patch: pr/1819.
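The classpath suggestion in the quoted reply can be sketched as follows. This is a minimal sketch, not a verified recipe: it assumes the parquet-hive-bundle-1.5.0.jar has already been downloaded from Maven Central to a local directory (the path below is hypothetical), and that you are on a Spark 1.x release, where the SPARK_CLASSPATH environment variable is read when launching the driver and executors.

```shell
# Hypothetical local path to the Twitter Parquet Hive bundle (version 1.5.0,
# as suggested above); adjust to wherever you actually downloaded the jar.
JAR="$HOME/jars/parquet-hive-bundle-1.5.0.jar"

# Prepend the jar to SPARK_CLASSPATH, preserving any existing entries.
# The ${VAR:+...} expansion only appends the ":" separator when
# SPARK_CLASSPATH was already non-empty.
export SPARK_CLASSPATH="$JAR${SPARK_CLASSPATH:+:$SPARK_CLASSPATH}"
echo "$SPARK_CLASSPATH"
```

The point of this approach, per the reply above, is to get the Parquet/Hive classes onto the classpath without pulling in CDH's full hive-exec jar, which conflicts with the Hive version Spark SQL was compiled against.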