Hi all,

I am trying to introduce Arrow optimization for R by reusing the existing
PySpark Arrow optimization.

It speeds up the R DataFrame to Spark DataFrame conversion by up to roughly
900% to 1200%.

It seems to be working fine so far; however, I would appreciate it if you
could take some time to look at the PR
(https://github.com/apache/spark/pull/22954) so that we can go ahead as
soon as the R API of Arrow is released.

More importantly, I would like more people who are familiar with the Arrow
R API and also interested in the Spark side to get involved. I have already
cc'ed some people I know, but please come, review, and discuss both the
Spark side and the Arrow side.

Thanks.
