Hi Dev,

I have thousands of legacy Hive queries. As part of our plan to move to
Spark, we are planning to migrate these Hive queries to Spark. There are
two approaches:


   1. Hive on Spark, which is similar to changing the execution engine
   for Hive queries, as with Tez.
   2. Migrating the Hive queries to HiveContext/Spark SQL (see the sketch
   after this list), an approach used by Facebook and presented at a Spark
   conference:
   https://databricks.com/session/experiences-migrating-hive-workload-to-sparksql
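
To make option 2 concrete, here is a minimal sketch of the kind of change
I mean; the database, table, and HiveQL below are hypothetical
placeholders. Option 1, by contrast, is essentially
"set hive.execution.engine=spark;" on the Hive side.

    import org.apache.spark.sql.SparkSession

    // Minimal sketch of option 2: replay an existing Hive query through
    // Spark SQL. legacy_db.events and the query are hypothetical placeholders.
    val spark = SparkSession.builder()
      .appName("hive-to-sparksql-migration")
      .enableHiveSupport() // read tables from the existing Hive metastore
      .getOrCreate()

    // The original HiveQL is submitted unchanged via spark.sql(...)
    spark.sql("SELECT dt, COUNT(*) AS cnt FROM legacy_db.events GROUP BY dt")
      .show()

    spark.stop()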


Can you please guide me on which option to go for? I am personally
inclined toward option 2, since it also allows the use of the latest Spark.

Please help me with this, as there are not many comparisons available
online that keep Spark 3.0 in perspective.

Regards
Pralabh Kumar
