Re: Hive on Spark Vs Spark SQL

2015-11-15 Thread Reynold Xin
It's a completely different path.


On Sun, Nov 15, 2015 at 10:37 PM, kiran lonikar  wrote:

> I would like to know if Hive on Spark uses or shares the execution code
> with Spark SQL or DataFrames?
>
> More specifically, does Hive on Spark benefit from the changes made to
> Spark SQL, such as Project Tungsten? Or is it a completely different
> execution path where it creates its own plan and executes on RDDs?
>
> -Kiran
>
>


Re: Hive on Spark Vs Spark SQL

2015-11-15 Thread kiran lonikar
So it does not benefit from Project Tungsten, right?


On Mon, Nov 16, 2015 at 12:07 PM, Reynold Xin  wrote:

> It's a completely different path.
>
>
> On Sun, Nov 15, 2015 at 10:37 PM, kiran lonikar  wrote:
>
>> I would like to know if Hive on Spark uses or shares the execution code
>> with Spark SQL or DataFrames?
>>
>> More specifically, does Hive on Spark benefit from the changes made to
>> Spark SQL, such as Project Tungsten? Or is it a completely different
>> execution path where it creates its own plan and executes on RDDs?
>>
>> -Kiran
>>
>>
>


Re: Hive on Spark Vs Spark SQL

2015-11-15 Thread Reynold Xin
No, it does not -- although it would benefit from some of the work to make
shuffle more robust.


On Sun, Nov 15, 2015 at 10:45 PM, kiran lonikar  wrote:

> So it does not benefit from Project Tungsten, right?
>
>
> On Mon, Nov 16, 2015 at 12:07 PM, Reynold Xin  wrote:
>
>> It's a completely different path.
>>
>>
>> On Sun, Nov 15, 2015 at 10:37 PM, kiran lonikar 
>> wrote:
>>
>>> I would like to know if Hive on Spark uses or shares the execution code
>>> with Spark SQL or DataFrames?
>>>
>>> More specifically, does Hive on Spark benefit from the changes made to
>>> Spark SQL, such as Project Tungsten? Or is it a completely different
>>> execution path where it creates its own plan and executes on RDDs?
>>>
>>> -Kiran
>>>
>>>
>>
>


Re: Hive on Spark VS Spark SQL

2015-05-20 Thread Sean Owen
I don't think that's quite the difference. Any SQL engine has a query
planner and an execution engine. Both of these use Spark for execution. HoS
uses Hive for query planning. Although Hive's planner is not optimized for
execution on Spark per se, it has a lot of language support and is
stable/mature. Spark SQL's query planner is less developed at this point but
purpose-built for Spark as an execution engine. Spark SQL is also how you put
SQL-like operations in a Spark program -- programmatic SQL if you will --
which isn't what Hive, or therefore HoS, does. HoS is good if you're already
using Hive, need its language features, need it as it works today, and want
a faster batch execution version of it.
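
To make the "programmatic SQL" point concrete, here is a minimal sketch of
Spark SQL used inside a Spark program. It is an illustration only, not code
from this thread. It assumes Spark 2.x or later, where SparkSession is the
entry point (the 1.x releases current when this was written used SQLContext or
HiveContext instead). The object name, the "orders" view, and the sample rows
are made up for the example. Hive on Spark, by contrast, is selected inside
Hive itself (the hive.execution.engine=spark setting), and queries there are
still written and planned as HiveQL.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.sum

// Hypothetical example object; names and data are illustrative assumptions.
object ProgrammaticSqlSketch {
  def main(args: Array[String]): Unit = {
    // Local session for experimentation; a real job would get its master
    // from spark-submit rather than hard-coding it.
    val spark = SparkSession.builder()
      .appName("programmatic-sql-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // A small in-memory DataFrame standing in for real data.
    val orders = Seq(("alice", 30.0), ("bob", 12.5), ("alice", 7.5))
      .toDF("customer", "amount")
    orders.createOrReplaceTempView("orders")

    // The same aggregation expressed as a SQL string and as DataFrame calls.
    // Both go through Spark SQL's own planner and execution path.
    val bySql = spark.sql(
      "SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer")
    val byApi = orders.groupBy("customer").agg(sum("amount").as("total"))

    bySql.show()
    byApi.show()

    spark.stop()
  }
}
```

Because the SQL string and the DataFrame calls are planned by the same engine,
they can be mixed freely with other Spark code in one program, which is the
part Hive (and so HoS) does not offer.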

On Wed, May 20, 2015 at 7:18 AM, Debasish Das debasish.da...@gmail.com
wrote:

 SparkSQL was built to improve upon Hive on Spark runtime further...

 On Tue, May 19, 2015 at 10:37 PM, guoqing0...@yahoo.com.hk 
 guoqing0...@yahoo.com.hk wrote:

 Which is better, Hive on Spark or Spark SQL, and what are the key
 characteristics, advantages, and disadvantages of each?

 --
 guoqing0...@yahoo.com.hk





Re: Hive on Spark VS Spark SQL

2015-05-20 Thread Debasish Das
SparkSQL was built to improve upon Hive on Spark runtime further...

On Tue, May 19, 2015 at 10:37 PM, guoqing0...@yahoo.com.hk 
guoqing0...@yahoo.com.hk wrote:

 Which is better, Hive on Spark or Spark SQL, and what are the key
 characteristics, advantages, and disadvantages of each?

 --
 guoqing0...@yahoo.com.hk



Re: Hive on Spark VS Spark SQL

2015-05-20 Thread ayan guha
And if I am not wrong, the Spark SQL API is intended to move closer to SQL
standards. I feel it's a clever decision on Spark's part to keep both APIs
operational. The short-term confusion is worth the long-term benefits.
On 20 May 2015 17:19, Sean Owen so...@cloudera.com wrote:

 I don't think that's quite the difference. Any SQL engine has a query
 planner and an execution engine. Both of these use Spark for execution. HoS
 uses Hive for query planning. Although Hive's planner is not optimized for
 execution on Spark per se, it has a lot of language support and is
 stable/mature. Spark SQL's query planner is less developed at this point but
 purpose-built for Spark as an execution engine. Spark SQL is also how you put
 SQL-like operations in a Spark program -- programmatic SQL if you will --
 which isn't what Hive, or therefore HoS, does. HoS is good if you're already
 using Hive, need its language features, need it as it works today, and want
 a faster batch execution version of it.

 On Wed, May 20, 2015 at 7:18 AM, Debasish Das debasish.da...@gmail.com
 wrote:

 SparkSQL was built to improve upon Hive on Spark runtime further...

 On Tue, May 19, 2015 at 10:37 PM, guoqing0...@yahoo.com.hk 
 guoqing0...@yahoo.com.hk wrote:

 Which is better, Hive on Spark or Spark SQL, and what are the key
 characteristics, advantages, and disadvantages of each?

 --
 guoqing0...@yahoo.com.hk