Re: Arrow optimization in conversion from R DataFrame to Spark DataFrame
Thanks, guys!

On Sat, Nov 10, 2018 at 7:35 AM, Bryan Cutler wrote:

> Great work Hyukjin! I'm not too familiar with R, but I'll take a look at the PR.
>
> Bryan
Re: Arrow optimization in conversion from R DataFrame to Spark DataFrame
Great work Hyukjin! I'm not too familiar with R, but I'll take a look at the PR.

Bryan

On Fri, Nov 9, 2018 at 9:19 AM Shivaram Venkataraman <shiva...@eecs.berkeley.edu> wrote:

> Thanks Hyukjin! Very cool results
>
> Shivaram
Re: Arrow optimization in conversion from R DataFrame to Spark DataFrame
Thanks Hyukjin! Very cool results

Shivaram

On Fri, Nov 9, 2018 at 10:58 AM Felix Cheung wrote:

> Very cool!
Re: Arrow optimization in conversion from R DataFrame to Spark DataFrame
Very cool!

From: Hyukjin Kwon
Sent: Thursday, November 8, 2018 10:29 AM
To: dev
Subject: Arrow optimization in conversion from R DataFrame to Spark DataFrame

Hi all,

I am trying to introduce an Arrow optimization for R by reusing the PySpark Arrow optimization.

It makes the conversion from R DataFrame to Spark DataFrame up to roughly 900% to 1200% faster.

It looks to be working fine so far; however, I would appreciate it if you could take some time to look at the PR (https://github.com/apache/spark/pull/22954) so that we can go ahead as soon as the R API of Arrow is released.

More importantly, I would like more reviewers who are familiar with the Arrow R API and also interested in the Spark side. I have already cc'ed some people I know, but please come, review, and discuss both the Spark side and the Arrow side.

Thanks.