Hello all,
I am trying to understand how the Spark SQL integration with Hive works.
Whenever I build Spark with the -Phive -Phive-thriftserver options, I see that
it is packaged with hive-2.3.7*.jars and spark-hive*.jars, yet the
documentation claims that Spark can talk to different versions of Hive. If
that is the case, what should I do if I have Hive 3.2.1 running on my
instance and I want my Spark application to talk to that Hive cluster?
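For reference, the build command I am running is roughly the one below (the
-DskipTests flag is just what I happen to use); this is how I end up with the
bundled Hive 2.3.7 jars:

    ./build/mvn -Phive -Phive-thriftserver -DskipTests clean package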

Does this mean I have to build Spark against Hive 3.2.1, or, as the
documentation states, is it enough to just point Spark at the metastore jars
via spark-defaults.conf?
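To make that second option concrete, what I have in mind is something along
these lines in spark-defaults.conf (/opt/hive/lib is just a placeholder for
wherever my Hive 3.2.1 jars actually live, and I am not sure whether 3.2.1 is
an accepted value for the version property):

    spark.sql.hive.metastore.version   3.2.1
    spark.sql.hive.metastore.jars      /opt/hive/lib/*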

Should I also add my Hive 3.2.1 lib directory to SPARK_DIST_CLASSPATH? And
would there be conflicts between the bundled Hive 2.3.7 jars and the Hive
3.2.1 jars in that case?
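By that I mean something like the following in conf/spark-env.sh, again with
/opt/hive/lib as a placeholder path:

    export SPARK_DIST_CLASSPATH=$SPARK_DIST_CLASSPATH:/opt/hive/lib/*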


Thanks!
