Re: Best way to Hive to Spark migration

2018-04-05 Thread Jörn Franke
And the usual hint when migrating: do not just migrate, but also optimize the 
ETL process design. That brings the most benefit.

> On 5. Apr 2018, at 08:18, Jörn Franke  wrote:
> 
> Ok this is not much detail, but you are probably best off if you migrate them 
> to SparkSQL.
> 
> It also depends on the Hive version and the Spark version. If you have a 
> recent Hive version with Tez+LLAP, I would not expect much difference. Spark 
> SQL can even be less performant, since it only recently gained some features 
> such as a cost-based optimizer.
> 
>> On 5. Apr 2018, at 08:02, Pralabh Kumar  wrote:
>> 
>> Hi 
>> 
>> I have a lot of ETL jobs (complex ones). Since they are SLA-critical, I am 
>> planning to migrate them to Spark.
>> 
>>> On Thu, Apr 5, 2018 at 10:46 AM, Jörn Franke  wrote:
>>> You need to provide more context on what you currently do in Hive and what 
>>> you expect from the migration.
>>> 
>>>> On 5. Apr 2018, at 05:43, Pralabh Kumar  wrote:
>>>> 
>>>> Hi Spark group
>>>> 
>>>> What's the best way to migrate Hive to Spark?
>>>> 
>>>> 1) Use HiveContext of Spark
>>>> 2) Use Hive on Spark 
>>>> (https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started)
>>>> 3) Migrate Hive to Calcite to Spark SQL
>>>> 
>>>> 
>>>> Regards
>>>> 
>> 


Re: Best way to Hive to Spark migration

2018-04-04 Thread Jörn Franke
Ok this is not much detail, but you are probably best off if you migrate them 
to SparkSQL.

It also depends on the Hive version and the Spark version. If you have a 
recent Hive version with Tez+LLAP, I would not expect much difference. Spark 
SQL can even be less performant, since it only recently gained some features 
such as a cost-based optimizer.
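
To make the Spark SQL route concrete, here is a minimal sketch of running an 
existing HiveQL script through Spark SQL. It is not from the original jobs: it 
assumes Spark 2.2 or later with PySpark installed and a hive-site.xml that 
points Spark at the existing Hive metastore. The helper names 
(split_statements, run_script, build_session) and the tables sales and 
sales_daily are made up for illustration.

```python
# Sketch only: run an existing HiveQL script through Spark SQL.
# Assumptions (not from the thread): Spark 2.2+, PySpark installed, and a
# hive-site.xml on the classpath so Spark can reach the Hive metastore.
# All helper names and table names below are hypothetical.


def split_statements(script):
    """Split a HiveQL script on ';', dropping '--' comment lines and blanks.

    Deliberately naive: it does not handle ';' inside string literals.
    """
    lines = [ln for ln in script.splitlines() if not ln.strip().startswith("--")]
    return [s.strip() for s in "\n".join(lines).split(";") if s.strip()]


def run_script(spark, script):
    """Run each statement with spark.sql() and return the last result."""
    result = None
    for stmt in split_statements(script):
        result = spark.sql(stmt)
    return result


def build_session(app_name="hive-etl-on-spark"):
    """Create a SparkSession backed by the Hive metastore."""
    from pyspark.sql import SparkSession  # imported lazily; needs PySpark

    return (
        SparkSession.builder
        .appName(app_name)
        .enableHiveSupport()  # Spark 2.x replacement for HiveContext
        .config("spark.sql.cbo.enabled", "true")  # cost-based optimizer, off by default
        .getOrCreate()
    )


EXAMPLE_SCRIPT = """
-- hypothetical ETL step
ANALYZE TABLE sales COMPUTE STATISTICS;
INSERT OVERWRITE TABLE sales_daily
SELECT sale_date, SUM(amount) FROM sales GROUP BY sale_date;
"""
```

With PySpark available, run_script(build_session(), EXAMPLE_SCRIPT) would 
submit the two statements in order. The cost-based optimizer only helps if 
table statistics exist, which is why the sketch runs ANALYZE TABLE first. 
Whether this is faster than Hive on Tez+LLAP has to be measured per job.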

> On 5. Apr 2018, at 08:02, Pralabh Kumar  wrote:
> 
> Hi 
> 
> I have a lot of ETL jobs (complex ones). Since they are SLA-critical, I am 
> planning to migrate them to Spark.
> 
>> On Thu, Apr 5, 2018 at 10:46 AM, Jörn Franke  wrote:
>> You need to provide more context on what you currently do in Hive and what 
>> you expect from the migration.
>> 
>>> On 5. Apr 2018, at 05:43, Pralabh Kumar  wrote:
>>> 
>>> Hi Spark group
>>> 
>>> What's the best way to migrate Hive to Spark?
>>> 
>>> 1) Use HiveContext of Spark
>>> 2) Use Hive on Spark 
>>> (https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started)
>>> 3) Migrate Hive to Calcite to Spark SQL
>>> 
>>> 
>>> Regards
>>> 
> 


Re: Best way to Hive to Spark migration

2018-04-04 Thread Pralabh Kumar
Hi

I have a lot of ETL jobs (complex ones). Since they are SLA-critical, I am
planning to migrate them to Spark.

On Thu, Apr 5, 2018 at 10:46 AM, Jörn Franke  wrote:

> You need to provide more context on what you currently do in Hive and what
> you expect from the migration.
>
> On 5. Apr 2018, at 05:43, Pralabh Kumar  wrote:
>
> Hi Spark group
>
> What's the best way to migrate Hive to Spark?
>
> 1) Use HiveContext of Spark
> 2) Use Hive on Spark
> (https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started)
> 3) Migrate Hive to Calcite to Spark SQL
>
>
> Regards
>
>


Re: Best way to Hive to Spark migration

2018-04-04 Thread Jörn Franke
You need to provide more context on what you currently do in Hive and what 
you expect from the migration.

> On 5. Apr 2018, at 05:43, Pralabh Kumar  wrote:
> 
> Hi Spark group
> 
> What's the best way to migrate Hive to Spark?
> 
> 1) Use HiveContext of Spark
> 2) Use Hive on Spark 
> (https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started)
> 3) Migrate Hive to Calcite to Spark SQL
> 
> 
> Regards
>