Re: What factors need to be considered when upgrading to Spark 2.1.0 from Spark 1.6.0

2017-09-29 Thread Yana Kadiyska
One thing to note, if you are using Mesos, is that the required Mesos version
changed from 0.21 to 1.0.0. So moving to a newer Spark might push you into
larger infrastructure upgrades as well.

On Fri, Sep 22, 2017 at 2:39 PM, Gokula Krishnan D 
wrote:

> Hello All,
>
> Currently our batch ETL jobs are on Spark 1.6.0, and we are planning to
> upgrade to Spark 2.1.0.
>
> With minor code changes (such as configuration and the SparkSession setup)
> we were able to run the existing jobs on Spark 2.1.0.
>
> However, we noticed that job completion times are much better on Spark
> 1.6.0 than on Spark 2.1.0.
>
> For instance, Job A completed in 50s on Spark 1.6.0.
>
> With the same input, Job A completed in 1.5 minutes on Spark 2.1.0.
>
> Are there any specific factors that need to be considered when switching
> from Spark 1.6.0 to Spark 2.1.0?
>
>
>
> Thanks & Regards,
> Gokula Krishnan (Gokul)
>


Re: What factors need to be considered when upgrading to Spark 2.1.0 from Spark 1.6.0

2017-09-29 Thread Gokula Krishnan D
Do you see any changes or improvements in the core API in Spark 2.x compared
with Spark 1.6.0?




Thanks & Regards,
Gokula Krishnan (Gokul)

On Mon, Sep 25, 2017 at 1:32 PM, Gokula Krishnan D 
wrote:

> Thanks for the reply. I forgot to mention that our batch ETL jobs use
> Core Spark (the RDD API).
>
>
> On Sep 22, 2017, at 3:13 PM, Vadim Semenov 
> wrote:
>
> 1. 40s is pretty negligible unless you run your job very frequently; many
> factors can influence that.
>
> 2. Try to compare the CPU time instead of the wall-clock time
>
> 3. Check the stages that got slower and compare the DAGs
>
> 4. Test with dynamic allocation disabled
>
> On Fri, Sep 22, 2017 at 2:39 PM, Gokula Krishnan D 
> wrote:
>
>> Hello All,
>>
>> Currently our batch ETL jobs are on Spark 1.6.0, and we are planning to
>> upgrade to Spark 2.1.0.
>>
>> With minor code changes (such as configuration and the SparkSession setup)
>> we were able to run the existing jobs on Spark 2.1.0.
>>
>> However, we noticed that job completion times are much better on Spark
>> 1.6.0 than on Spark 2.1.0.
>>
>> For instance, Job A completed in 50s on Spark 1.6.0.
>>
>> With the same input, Job A completed in 1.5 minutes on Spark 2.1.0.
>>
>> Are there any specific factors that need to be considered when switching
>> from Spark 1.6.0 to Spark 2.1.0?
>>
>>
>>
>> Thanks & Regards,
>> Gokula Krishnan (Gokul)
>>
>
>
>


Re: What factors need to be considered when upgrading to Spark 2.1.0 from Spark 1.6.0

2017-09-25 Thread Gokula Krishnan D
Thanks for the reply. I forgot to mention that our batch ETL jobs use
Core Spark (the RDD API).


On Sep 22, 2017, at 3:13 PM, Vadim Semenov 
wrote:

1. 40s is pretty negligible unless you run your job very frequently; many
factors can influence that.

2. Try to compare the CPU time instead of the wall-clock time

3. Check the stages that got slower and compare the DAGs

4. Test with dynamic allocation disabled

On Fri, Sep 22, 2017 at 2:39 PM, Gokula Krishnan D 
wrote:

> Hello All,
>
> Currently our batch ETL jobs are on Spark 1.6.0, and we are planning to
> upgrade to Spark 2.1.0.
>
> With minor code changes (such as configuration and the SparkSession setup)
> we were able to run the existing jobs on Spark 2.1.0.
>
> However, we noticed that job completion times are much better on Spark
> 1.6.0 than on Spark 2.1.0.
>
> For instance, Job A completed in 50s on Spark 1.6.0.
>
> With the same input, Job A completed in 1.5 minutes on Spark 2.1.0.
>
> Are there any specific factors that need to be considered when switching
> from Spark 1.6.0 to Spark 2.1.0?
>
>
>
> Thanks & Regards,
> Gokula Krishnan (Gokul)
>


Re: What factors need to be considered when upgrading to Spark 2.1.0 from Spark 1.6.0

2017-09-23 Thread vaquar khan
http://spark.apache.org/docs/latest/sql-programming-guide.html#migration-guide
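
For quick reference, the headline code-level change that guide describes for
the 1.6-to-2.x move is the unified SparkSession entry point replacing
SQLContext/HiveContext. A minimal Scala sketch (the app name and input path
below are placeholders, not taken from this thread):

import org.apache.spark.sql.SparkSession

object MigrationExample {
  def main(args: Array[String]): Unit = {
    // Spark 1.6 entry points, shown for comparison:
    //   val sqlContext  = new org.apache.spark.sql.SQLContext(sc)
    //   val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)

    // Spark 2.x: one unified entry point.
    val spark = SparkSession.builder()
      .appName("migration-example")   // placeholder app name
      .enableHiveSupport()            // only if HiveContext features were used before
      .getOrCreate()

    // Reads now hang off the session; the path is a placeholder.
    val df = spark.read.parquet("/path/to/input")
    df.printSchema()

    spark.stop()
  }
}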

Regards,
Vaquar khan

On Fri, Sep 22, 2017 at 4:41 PM, Gokula Krishnan D 
wrote:

> Thanks for the reply. I forgot to mention that our batch ETL jobs use
> Core Spark (the RDD API).
>
>
> On Sep 22, 2017, at 3:13 PM, Vadim Semenov 
> wrote:
>
> 1. 40s is pretty negligible unless you run your job very frequently; many
> factors can influence that.
>
> 2. Try to compare the CPU time instead of the wall-clock time
>
> 3. Check the stages that got slower and compare the DAGs
>
> 4. Test with dynamic allocation disabled
>
> On Fri, Sep 22, 2017 at 2:39 PM, Gokula Krishnan D 
> wrote:
>
>> Hello All,
>>
>> Currently our batch ETL jobs are on Spark 1.6.0, and we are planning to
>> upgrade to Spark 2.1.0.
>>
>> With minor code changes (such as configuration and the SparkSession setup)
>> we were able to run the existing jobs on Spark 2.1.0.
>>
>> However, we noticed that job completion times are much better on Spark
>> 1.6.0 than on Spark 2.1.0.
>>
>> For instance, Job A completed in 50s on Spark 1.6.0.
>>
>> With the same input, Job A completed in 1.5 minutes on Spark 2.1.0.
>>
>> Are there any specific factors that need to be considered when switching
>> from Spark 1.6.0 to Spark 2.1.0?
>>
>>
>>
>> Thanks & Regards,
>> Gokula Krishnan (Gokul)
>>
>
>
>


-- 
Regards,
Vaquar Khan
+1 -224-436-0783
Greater Chicago


Re: What factors need to be considered when upgrading to Spark 2.1.0 from Spark 1.6.0

2017-09-22 Thread Gokula Krishnan D
Thanks for the reply. I forgot to mention that our batch ETL jobs use
Core Spark (the RDD API).


> On Sep 22, 2017, at 3:13 PM, Vadim Semenov  
> wrote:
> 
> 1. 40s is pretty negligible unless you run your job very frequently; many
> factors can influence that.
> 
> 2. Try to compare the CPU time instead of the wall-clock time
> 
> 3. Check the stages that got slower and compare the DAGs
> 
> 4. Test with dynamic allocation disabled
> 
> On Fri, Sep 22, 2017 at 2:39 PM, Gokula Krishnan D wrote:
> Hello All,
>
> Currently our batch ETL jobs are on Spark 1.6.0, and we are planning to
> upgrade to Spark 2.1.0.
>
> With minor code changes (such as configuration and the SparkSession setup)
> we were able to run the existing jobs on Spark 2.1.0.
>
> However, we noticed that job completion times are much better on Spark
> 1.6.0 than on Spark 2.1.0.
>
> For instance, Job A completed in 50s on Spark 1.6.0.
>
> With the same input, Job A completed in 1.5 minutes on Spark 2.1.0.
>
> Are there any specific factors that need to be considered when switching
> from Spark 1.6.0 to Spark 2.1.0?
>
>
>
> Thanks & Regards,
> Gokula Krishnan (Gokul)
> 



Re: What factors need to be considered when upgrading to Spark 2.1.0 from Spark 1.6.0

2017-09-22 Thread Vadim Semenov
1. 40s is pretty negligible unless you run your job very frequently; many
factors can influence that.

2. Try to compare the CPU time instead of the wall-clock time

3. Check the stages that got slower and compare the DAGs

4. Test with dynamic allocation disabled
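
For point 4, a minimal Scala sketch of pinning the resources so both versions
run with an identical executor footprint (the app name and numbers are
placeholders, and the exact settings depend on your cluster manager):

import org.apache.spark.{SparkConf, SparkContext}

object JobAFixedResources {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("job-a-fixed-resources")              // placeholder app name
      .set("spark.dynamicAllocation.enabled", "false")  // point 4: no scaling during the run
      .set("spark.executor.instances", "10")            // placeholder; YARN-style pinning
      .set("spark.executor.cores", "4")                 // placeholder
      // On standalone/Mesos, cap the total cores instead:
      // .set("spark.cores.max", "40")

    val sc = new SparkContext(conf)
    // ... run Job A exactly as before ...
    sc.stop()
  }
}

For points 2 and 3, the per-stage task metrics and the DAG visualization are
available in the Spark UI (or from the history server for completed runs),
which makes it easier to see which stages actually regressed.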

On Fri, Sep 22, 2017 at 2:39 PM, Gokula Krishnan D 
wrote:

> Hello All,
>
> Currently our batch ETL jobs are on Spark 1.6.0, and we are planning to
> upgrade to Spark 2.1.0.
>
> With minor code changes (such as configuration and the SparkSession setup)
> we were able to run the existing jobs on Spark 2.1.0.
>
> However, we noticed that job completion times are much better on Spark
> 1.6.0 than on Spark 2.1.0.
>
> For instance, Job A completed in 50s on Spark 1.6.0.
>
> With the same input, Job A completed in 1.5 minutes on Spark 2.1.0.
>
> Are there any specific factors that need to be considered when switching
> from Spark 1.6.0 to Spark 2.1.0?
>
>
>
> Thanks & Regards,
> Gokula Krishnan (Gokul)
>


What factors need to be considered when upgrading to Spark 2.1.0 from Spark 1.6.0

2017-09-22 Thread Gokula Krishnan D
Hello All,

Currently our batch ETL jobs are on Spark 1.6.0, and we are planning to
upgrade to Spark 2.1.0.

With minor code changes (such as configuration and the SparkSession setup)
we were able to run the existing jobs on Spark 2.1.0.
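
Roughly, the change was along these lines (a minimal sketch rather than our
actual job code; the app name is a placeholder):

import org.apache.spark.sql.SparkSession

object BatchEtlJob {
  def main(args: Array[String]): Unit = {
    // Spark 1.6 style, for comparison:
    //   val conf = new SparkConf().setAppName("batch-etl-job")
    //   val sc   = new SparkContext(conf)

    // Spark 2.1 style: build a SparkSession and reuse its SparkContext,
    // so the existing RDD-based code keeps working unchanged.
    val spark = SparkSession.builder()
      .appName("batch-etl-job")   // placeholder app name
      .getOrCreate()
    val sc = spark.sparkContext

    // ... existing RDD transformations and actions ...

    spark.stop()
  }
}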

However, we noticed that job completion times are much better on Spark
1.6.0 than on Spark 2.1.0.

For instance, Job A completed in 50s on Spark 1.6.0.

With the same input, Job A completed in 1.5 minutes on Spark 2.1.0.

Are there any specific factors that need to be considered when switching
from Spark 1.6.0 to Spark 2.1.0?



Thanks & Regards,
Gokula Krishnan (Gokul)