Hi, I have to run a few insert into queries that use Hive partitions. My
tables have two Hive partition columns named server and date. I currently
execute the insert into queries using hiveContext as shown below, and they
work fine:

hiveContext.sql("insert into summary1 "
    + "partition(server='a1',date='2015-05-22') select from sourcetbl bla bla");
hiveContext.sql("insert into summary2 "
    + "partition(server='a1',date='2015-05-22') select from sourcetbl bla bla");

I want the above queries to run across all partitions. The server partition
ranges from a1 to a1000 and the date is always yesterday's date. The job will
run every day and cover all server partitions for yesterday's date.
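
To make the requirement concrete, here is a rough sketch of how the statements for one day could be generated. The class name PartitionInsertStatements, the method build, and the selectClause parameter are only illustrative names (the real select is elided above), and it assumes Java 8 for java.time.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

final class PartitionInsertStatements {

    // Builds one insert statement per server partition (a1 .. a<serverCount>)
    // for the given table and date. selectClause is a placeholder for the
    // real select, which is not shown in this post.
    static List<String> build(String table, int serverCount, LocalDate date,
                              String selectClause) {
        List<String> statements = new ArrayList<>(serverCount);
        for (int i = 1; i <= serverCount; i++) {
            // LocalDate formats as yyyy-MM-dd, matching the date partition values.
            statements.add(String.format(
                "insert into %s partition(server='a%d',date='%s') %s",
                table, i, date, selectClause));
        }
        return statements;
    }

    public static void main(String[] args) {
        LocalDate yesterday = LocalDate.now().minusDays(1);
        List<String> statements = build("summary1", 1000, yesterday, "select ...");
        System.out.println(statements.size() + " statements, first: " + statements.get(0));
    }
}
```

Each generated string would then be passed to hiveContext.sql, the same way as the two hand-written queries.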

I was thinking of doing something like this, but I am not sure if it is a
good approach.

DataFrame partitionFrame = hiveContext.sql("show partitions where "
    + "date='2015-05-07'");
partitionFrame.forEach(); // execute above queries inside foreach

Will the queries run in parallel across all partitions if I use
dataframe.foreach and execute them inside it? Please advise, I am new to
Spark. Thanks in advance.
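
In case hiveContext cannot be used inside foreach (as far as I understand it lives on the driver, not on the executors), the alternative I can think of is to submit the queries from several driver-side threads. Below is a minimal sketch of that idea. ParallelSqlRunner, runAll, and the runSql callback are made-up names, and it assumes that calling hiveContext.sql concurrently from multiple threads is safe, which I have not verified.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

final class ParallelSqlRunner {

    // Runs every statement through runSql on a fixed pool of driver-side
    // threads and waits for all of them. Returns the number of statements
    // that completed; the first failure is rethrown.
    static int runAll(List<String> statements, Consumer<String> runSql, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (String sql : statements) {
                pending.add(pool.submit(() -> runSql.accept(sql)));
            }
            int completed = 0;
            for (Future<?> f : pending) {
                try {
                    f.get(); // blocks until this statement has finished
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", e);
                } catch (ExecutionException e) {
                    throw new IllegalStateException("statement failed", e.getCause());
                }
                completed++;
            }
            return completed;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Stand-in callback that only prints; in the real job this would be
        // something like sql -> hiveContext.sql(sql) (an assumption).
        List<String> statements = Arrays.asList("query 1", "query 2", "query 3");
        int done = runAll(statements, sql -> System.out.println("running: " + sql), 2);
        System.out.println(done + " statements completed");
    }
}
```

The thread count would limit how many insert queries are in flight at once. I do not know what a sensible value is for 1000 server partitions.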



