Hi,

To access Hive tables, Spark uses its native API by default. For this you need hive-site.xml set up in $SPARK_HOME/conf, typically as a symlink to your Hive installation's copy:
$SPARK_HOME/conf/hive-site.xml -> /data6/hduser/hive-3.0.0/conf/hive-site.xml

With that in place you can work with Hive tables directly:

val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
hiveContext.sql("use ilayer")
val account_table = hiveContext.table("joint_accounts")  // account_table is a DataFrame

Alternatively, you can access any version of Hive on any host over a JDBC connection. Example using the Cloudera driver (the only one that works reliably, I think):

Driver:         com.cloudera.hive.jdbc41.HS2Driver
Connection URL: jdbc:hive2://rhes75:10099   ## Hive thrift server port

Minimal sketches of both routes appear after the quoted thread below.

HTH

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

Disclaimer: Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.


On Thu, 22 Oct 2020 at 17:47, Ravi Shankar <r...@unifisoftware.com> wrote:

> Hello Mitch,
> I am just trying to access Hive tables from my Hive 3.2.1 cluster using
> Spark. Basically I just want my Spark jobs to be able to access these
> Hive tables. I want to understand how Spark jobs interact with Hive to
> access these tables.
>
> - I see that whenever I build Spark with Hive support (-Phive
> -Phive-thriftserver), it gets built with Hive 2.3.7 jars. So, will it be
> OK if I access tables created using my Hive 3.2.1 cluster?
> - Do I have to add Hive 3.2.1 jars to Spark's SPARK_DIST_CLASSPATH?
>
>
> On Thu, Oct 22, 2020 at 11:20 AM Mich Talebzadeh <
> mich.talebza...@gmail.com> wrote:
>
>> Hi Ravi,
>>
>> What exactly are you trying to do?
>>
>> Do you want to enhance Spark SQL, or do you want to run Hive on the
>> Spark engine?
>>
>> HTH
>>
>>
>> On Thu, 22 Oct 2020 at 16:36, Ravi Shankar <r...@unifisoftware.com>
>> wrote:
>>
>>> Hello all,
>>> I am trying to understand how the Spark SQL integration with Hive works.
>>> Whenever I build Spark with the -Phive -Phive-thriftserver options, I
>>> see that it is packaged with hive-2.3.7*.jars and spark-hive*.jars, and
>>> the documentation claims that Spark can talk to different versions of
>>> Hive. If that is the case, what should I do if I have Hive 3.2.1 running
>>> on my instance and I want my Spark application to talk to that Hive
>>> cluster?
>>>
>>> Does this mean I have to build Spark with Hive version 3.2.1, or, as the
>>> documentation states, is it enough if I just add the metastore jars to
>>> spark-defaults.conf?
>>>
>>> Should I add my Hive 3.2.1 lib to SPARK_DIST_CLASSPATH as well? Will
>>> there be conflicts between the Hive 2.3.7 jars and the Hive 3.2.1 jars I
>>> will have in this case?
>>>
>>> Thanks!
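
For the question about the built-in Hive 2.3.7 client versus a Hive 3.x cluster: rather than rebuilding Spark, the documented route is to keep the built-in build and point Spark at your own metastore client jars via spark.sql.hive.metastore.version and spark.sql.hive.metastore.jars. Below is a minimal sketch in Scala using the current SparkSession API; the version string and the jar path are assumptions, so set them to match your installation and check which metastore versions your Spark release supports.

import org.apache.spark.sql.SparkSession

// Minimal sketch: SparkSession with Hive support, using an external Hive
// metastore client instead of the built-in 2.3.7 one. The version string
// and jar path below are example values, not a tested configuration; the
// classpath typically needs both Hive and Hadoop client jars.
val spark = SparkSession.builder()
  .appName("hive-access-sketch")
  .config("spark.sql.hive.metastore.version", "3.1.2")
  .config("spark.sql.hive.metastore.jars", "/data6/hduser/hive-3.0.0/lib/*")
  .enableHiveSupport()
  .getOrCreate()

spark.sql("use ilayer")
val account_table = spark.table("joint_accounts")  // DataFrame, as with HiveContext
account_table.show(5)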
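
And a minimal sketch of the JDBC route, reusing the spark session defined above. The driver class and URL are the ones quoted earlier; the schema-qualified table name and any authentication options you may need are assumptions, and the Cloudera driver jar must be on the driver and executor classpath.

// Minimal sketch: reading a Hive table over JDBC via HiveServer2.
// Adjust table name, URL and credentials to your cluster.
val accountsViaJdbc = spark.read
  .format("jdbc")
  .option("driver", "com.cloudera.hive.jdbc41.HS2Driver")
  .option("url", "jdbc:hive2://rhes75:10099")   // HiveServer2 thrift host:port
  .option("dbtable", "ilayer.joint_accounts")
  .load()

accountsViaJdbc.printSchema()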