Re: Is it possible to use Hadoop 3.x and Hive 3.x using spark 2.4?

Daniel de Oliveira Mantovani Mon, 06 Jul 2020 15:33:06 -0700

Hi Teja,

To access Hive 3 from Apache Spark 2.x you need to use the Hive Warehouse
Connector from Cloudera:
https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.5/integrating-hive/content/hive_hivewarehouseconnector_for_handling_apache_spark_data.html
It has many limitations. You can only write to Hive managed tables in ORC
format. You can work around this by writing to Hive unmanaged (external)
tables instead, where Parquet will work.
The performance is also not the same as with Spark's native Hive integration.
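
As a rough illustration of the unmanaged-table workaround, here is a minimal
Python sketch that only builds the Hive DDL for an external Parquet table.
The helper name, table name, columns and path are all hypothetical, and the
Spark write step appears only as a comment because it needs a real cluster.
Treat it as a starting point under those assumptions, not a tested recipe.

```python
def external_parquet_ddl(table, columns, location):
    """Build Hive DDL for an unmanaged (external) Parquet table.

    table    -- hypothetical table name, e.g. "sales_db.events"
    columns  -- list of (column_name, hive_type) pairs
    location -- directory where Spark wrote the Parquet files
    """
    cols = ", ".join(f"`{name}` {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} ({cols}) "
        f"STORED AS PARQUET LOCATION '{location}'"
    )


# Hypothetical table, schema and path, chosen only for illustration.
ddl = external_parquet_ddl(
    "sales_db.events",
    [("id", "BIGINT"), ("name", "STRING")],
    "/data/external/events",
)
print(ddl)

# On the cluster (not run here), Spark 2.4 would first write the files:
#     df.write.mode("overwrite").parquet("/data/external/events")
# The DDL above would then be executed against Hive 3, for example
# through beeline or the connector's session object.
```

Since Hive does not own the files of an external table, Spark can keep
writing plain Parquet to that directory without going through the managed
ORC path.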


Good luck


On Mon, Jul 6, 2020 at 3:16 PM Sean Owen <sro...@gmail.com> wrote:

> 2.4 works with Hadoop 3 (optionally) and Hive 1. I doubt it will work
> connecting to Hadoop 3 / Hive 3; it's possible in a few cases.
> It's also possible some vendor distributions support this combination.
>
> On Mon, Jul 6, 2020 at 7:51 AM Teja <saiteja.pa...@gmail.com> wrote:
> >
> > We use spark 2.4.0 to connect to Hadoop 2.7 cluster and query from Hive
> > Metastore version 2.3. But the Cluster managing team has decided to
> > upgrade to Hadoop 3.x and Hive 3.x. We could not migrate to spark 3 yet,
> > which is compatible with Hadoop 3 and Hive 3, as we could not test if
> > anything breaks.
> >
> > *Is there any possible way to stick to spark 2.4.x version and still be
> > able to use Hadoop 3 and Hive 3?*
> >
> > I got to know backporting is one option but I am not sure how. It would
> > be great if you could point me in that direction.
> >
> >
> >
> > --
> > Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
> >
> > ---------------------------------------------------------------------
> > To unsubscribe e-mail: user-unsubscr...@spark.apache.org
> >
>

--
Daniel Mantovani
