HyukjinKwon edited a comment on issue #22954: [SPARK-25981][R] Enables Arrow optimization from R DataFrame to Spark DataFrame
URL: https://github.com/apache/spark/pull/22954#issuecomment-457214769

To cut it short, I think this PR is ready to go. I reran the benchmark and updated the PR description. A few things to mention:

1. Arrow is not released on CRAN yet, and it looks like that is going to take a few months (see [ARROW-3204](https://issues.apache.org/jira/browse/ARROW-3204)). So, for now, it has to be installed manually.
   - It can be installed by `Rscript -e 'remotes::install_github("apache/[email protected]", subdir = "r")'`.
   - I used macOS Mojave 10.14.2 and had to fix some problems in my environment. Please contact me if you face any issues while installing it. If this turns out to happen widely, I will document it somewhere.
2. It looks like we can run the build via AppVeyor once Arrow is on CRAN (see [ARROW-3204](https://issues.apache.org/jira/browse/ARROW-3204)).
3. We should remove the workarounds that I used to avoid the CRAN check (see https://github.com/apache/spark/pull/22954#discussion_r250585248 and https://github.com/apache/spark/pull/22954#discussion_r250618871).

Next items (I'm going to investigate first before filing JIRAs):

1. I'm going to take a look at whether we can do this for Spark DataFrame -> R DataFrame too.
2. Also, I'm going to take a look at the R native function APIs like lapply and gapply and see if we can optimize them.
3. Before the Spark 3.0 release, I will document this. Hopefully, we can get rid of both workarounds I mentioned above before then.
