Re: Hive on Spark Vs Spark SQL
It's a completely different path.

On Sun, Nov 15, 2015 at 10:37 PM, kiran lonikar wrote:
> I would like to know if Hive on Spark uses or shares the execution code
> with Spark SQL or DataFrames?
>
> More specifically, does Hive on Spark benefit from the changes made to
> Spark SQL, project Tungsten? Or is it completely different execution path
> where it creates its own plan and executes on RDD?
>
> -Kiran
Re: Hive on Spark Vs Spark SQL
So it does not benefit from Project Tungsten, right?

On Mon, Nov 16, 2015 at 12:07 PM, Reynold Xin wrote:
> It's a completely different path.
>
> On Sun, Nov 15, 2015 at 10:37 PM, kiran lonikar wrote:
>> I would like to know if Hive on Spark uses or shares the execution code
>> with Spark SQL or DataFrames?
>>
>> More specifically, does Hive on Spark benefit from the changes made to
>> Spark SQL, project Tungsten? Or is it completely different execution path
>> where it creates its own plan and executes on RDD?
>>
>> -Kiran
Re: Hive on Spark Vs Spark SQL
No, it does not -- although it'd benefit from some of the work to make shuffle more robust.

On Sun, Nov 15, 2015 at 10:45 PM, kiran lonikar wrote:
> So does not benefit from Project Tungsten right?
>
> On Mon, Nov 16, 2015 at 12:07 PM, Reynold Xin wrote:
>> It's a completely different path.
>>
>> On Sun, Nov 15, 2015 at 10:37 PM, kiran lonikar wrote:
>>> I would like to know if Hive on Spark uses or shares the execution code
>>> with Spark SQL or DataFrames?
>>>
>>> More specifically, does Hive on Spark benefit from the changes made to
>>> Spark SQL, project Tungsten? Or is it completely different execution path
>>> where it creates its own plan and executes on RDD?
>>>
>>> -Kiran
Re: Hive on Spark VS Spark SQL
I don't think that's quite the difference. Any SQL engine has a query planner and an execution engine. Both of these use Spark for execution. HoS uses Hive for query planning. Although it's not optimized for execution on Spark per se, it's got a lot of language support and is stable/mature. Spark SQL's query planner is less developed at this point but purpose-built for Spark as an execution engine. Spark SQL is also how you put SQL-like operations in a Spark program -- programmatic SQL if you will -- which isn't what Hive or therefore HoS does.

HoS is good if you're already using Hive and need its language features and need it as it works today, and want a faster batch execution version of it.

On Wed, May 20, 2015 at 7:18 AM, Debasish Das debasish.da...@gmail.com wrote:
> SparkSQL was built to improve upon Hive on Spark runtime further...
>
> On Tue, May 19, 2015 at 10:37 PM, guoqing0...@yahoo.com.hk guoqing0...@yahoo.com.hk wrote:
>> Hive on Spark and SparkSQL which should be better, and what are the key
>> characteristics and the advantages and the disadvantages between ?
>>
>> --
>> guoqing0...@yahoo.com.hk
Re: Hive on Spark VS Spark SQL
SparkSQL was built to improve upon Hive on Spark runtime further...

On Tue, May 19, 2015 at 10:37 PM, guoqing0...@yahoo.com.hk guoqing0...@yahoo.com.hk wrote:
> Hive on Spark and SparkSQL which should be better, and what are the key
> characteristics and the advantages and the disadvantages between ?
>
> --
> guoqing0...@yahoo.com.hk
Re: Hive on Spark VS Spark SQL
And if I am not wrong, the Spark SQL API is intended to move closer to SQL standards. I feel it's a clever decision on Spark's part to keep both APIs operational. The short-term confusion is worth the long-term benefits.

On 20 May 2015 17:19, Sean Owen so...@cloudera.com wrote:
> I don't think that's quite the difference. Any SQL engine has a query
> planner and an execution engine. Both of these use Spark for execution.
> HoS uses Hive for query planning. Although it's not optimized for
> execution on Spark per se, it's got a lot of language support and is
> stable/mature. Spark SQL's query planner is less developed at this point
> but purpose-built for Spark as an execution engine. Spark SQL is also how
> you put SQL-like operations in a Spark program -- programmatic SQL if you
> will -- which isn't what Hive or therefore HoS does.
>
> HoS is good if you're already using Hive and need its language features
> and need it as it works today, and want a faster batch execution version
> of it.
>
> On Wed, May 20, 2015 at 7:18 AM, Debasish Das debasish.da...@gmail.com wrote:
>> SparkSQL was built to improve upon Hive on Spark runtime further...
>>
>> On Tue, May 19, 2015 at 10:37 PM, guoqing0...@yahoo.com.hk guoqing0...@yahoo.com.hk wrote:
>>> Hive on Spark and SparkSQL which should be better, and what are the key
>>> characteristics and the advantages and the disadvantages between ?
>>>
>>> --
>>> guoqing0...@yahoo.com.hk
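
To illustrate the planner / execution-engine split described in this thread, here is a minimal sketch in plain Python. It does not use Hive or Spark at all: `Step`, `execute`, `plan_hive_like` and `plan_sparksql_like` are hypothetical names invented for the illustration, and neither real planner is modeled. It only shows how two front ends can produce different plans for the same query while handing them to one shared execution layer, which is roughly why an optimization that lives inside one front end's operators need not carry over to the other.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One unit of work in a plan (hypothetical, for illustration only)."""
    name: str                    # label, used only for inspection
    fn: Callable[[list], list]   # transformation applied to a list of rows


def execute(steps: List[Step], rows: list) -> list:
    """Toy stand-in for the shared execution layer: run steps in order."""
    for step in steps:
        rows = step.fn(rows)
    return rows


def plan_hive_like(min_age: int) -> List[Step]:
    """Assumed 'front end A': plans the query as two separate operators."""
    return [
        Step("filter", lambda rs: [r for r in rs if r["age"] > min_age]),
        Step("project", lambda rs: [r["name"] for r in rs]),
    ]


def plan_sparksql_like(min_age: int) -> List[Step]:
    """Assumed 'front end B': plans the same query as one fused operator."""
    return [
        Step(
            "fused_filter_project",
            lambda rs: [r["name"] for r in rs if r["age"] > min_age],
        ),
    ]


# Same query ("names of rows with age > 18"), same input, two different plans.
ROWS = [
    {"name": "alice", "age": 30},
    {"name": "bob", "age": 17},
    {"name": "carol", "age": 42},
]

result_a = execute(plan_hive_like(18), ROWS)
result_b = execute(plan_sparksql_like(18), ROWS)

print([s.name for s in plan_hive_like(18)], result_a)
print([s.name for s in plan_sparksql_like(18)], result_b)
```

Both plans return the same rows, but their operators differ, so a change to `fused_filter_project` would not affect the plan built from `filter` and `project`. A change to `execute` itself would affect both, which loosely mirrors the remark above about shuffle robustness.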