Re: Arrow optimization in conversion from R DataFrame to Spark DataFrame

2018-11-10 Thread Hyukjin Kwon
Thanks, guys!

On Sat, Nov 10, 2018 at 7:35 AM, Bryan Cutler wrote:

> Great work Hyukjin!  I'm not too familiar with R, but I'll take a look at
> the PR.
>
> Bryan
>
> On Fri, Nov 9, 2018 at 9:19 AM Shivaram Venkataraman <
> shiva...@eecs.berkeley.edu> wrote:
>
>> Thanks Hyukjin! Very cool results
>>
>> Shivaram
>> On Fri, Nov 9, 2018 at 10:58 AM Felix Cheung wrote:
>> >
>> > Very cool!
>> >
>> >
>> > 
>> > From: Hyukjin Kwon 
>> > Sent: Thursday, November 8, 2018 10:29 AM
>> > To: dev
>> > Subject: Arrow optimization in conversion from R DataFrame to Spark DataFrame
>> >
>> > Hi all,
>> >
>> > I am trying to introduce an Arrow optimization for R by reusing the
>> > PySpark Arrow optimization.
>> >
>> > It speeds up the R DataFrame to Spark DataFrame conversion by up to
>> > roughly 900% to 1200%.
>> >
>> > It appears to be working fine so far; however, I would appreciate it if
>> > you could take some time to look at the PR
>> > (https://github.com/apache/spark/pull/22954) so that we can go ahead as
>> > soon as the R API of Arrow is released.
>> >
>> > More importantly, I would like more people who are familiar with the
>> > Arrow R API and also interested in the Spark side to get involved. I have
>> > already cc'ed some people I know, but please come, review, and discuss
>> > both the Spark side and the Arrow side.
>> >
>> > Thanks.
>> >
>>
>> -
>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>
>>


Re: Arrow optimization in conversion from R DataFrame to Spark DataFrame

2018-11-09 Thread Bryan Cutler
Great work, Hyukjin! I'm not too familiar with R, but I'll take a look at
the PR.

Bryan



Re: Arrow optimization in conversion from R DataFrame to Spark DataFrame

2018-11-09 Thread Shivaram Venkataraman
Thanks, Hyukjin! Very cool results.

Shivaram



Re: Arrow optimization in conversion from R DataFrame to Spark DataFrame

2018-11-09 Thread Felix Cheung
Very cool!



From: Hyukjin Kwon 
Sent: Thursday, November 8, 2018 10:29 AM
To: dev
Subject: Arrow optimization in conversion from R DataFrame to Spark DataFrame

Hi all,

I am trying to introduce an Arrow optimization for R by reusing the PySpark
Arrow optimization.

It speeds up the R DataFrame to Spark DataFrame conversion by up to roughly
900% to 1200%.

It appears to be working fine so far; however, I would appreciate it if you
could take some time to look at the PR
(https://github.com/apache/spark/pull/22954) so that we can go ahead as soon
as the R API of Arrow is released.

More importantly, I would like more people who are familiar with the Arrow R
API and also interested in the Spark side to get involved. I have already
cc'ed some people I know, but please come, review, and discuss both the Spark
side and the Arrow side.

Thanks.