HyukjinKwon edited a comment on issue #22954: [SPARK-25981][R] Enables Arrow optimization from R DataFrame to Spark DataFrame
URL: https://github.com/apache/spark/pull/22954#issuecomment-457214769
 
 
To cut it short, I think this PR is ready to go. I reran the benchmark and updated the PR description.
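
For context, here is roughly how this looks from the R side. This is a minimal sketch, assuming the existing `spark.sql.execution.arrow.enabled` conf (the one PySpark uses) gates the R path as well:

```r
# A rough sketch of the user-facing path this PR optimizes. Assumes the
# existing spark.sql.execution.arrow.enabled conf also covers the R side.
library(SparkR)
sparkR.session(master = "local[*]",
               sparkConfig = list(spark.sql.execution.arrow.enabled = "true"))

# createDataFrame() on a plain R data.frame is the conversion being optimized.
rdf <- data.frame(id = seq_len(100000), value = rnorm(100000))
sdf <- createDataFrame(rdf)
printSchema(sdf)

sparkR.session.stop()
```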
   
A few things to mention:
   
1. Arrow is not released on CRAN yet, and it looks like that is going to take a few months (see [ARROW-3204](https://issues.apache.org/jira/browse/ARROW-3204)). So, for now, it should be installed manually.
    - It can be installed by `Rscript -e 'remotes::install_github("apache/arrow@apache-arrow-0.12.0", subdir = "r")'`.
    - I used macOS Mojave 10.14.2 and hit some problems I had to fix in my environment. Please contact me if you face issues while installing it. If this is happening broadly, I will document it somewhere.
   
2. It looks like we can run the build via AppVeyor once Arrow is on CRAN (see [ARROW-3204](https://issues.apache.org/jira/browse/ARROW-3204)).
   
3. We should remove the workarounds I used to avoid the CRAN check (see https://github.com/apache/spark/pull/22954#discussion_r250585248 and https://github.com/apache/spark/pull/22954#discussion_r250618871); the sketch below illustrates the kind of indirection involved.
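
For reference, the workaround is roughly the following indirection, which keeps `R CMD check` from flagging `arrow` as an undeclared dependency. This is an illustrative sketch of the pattern, not the exact code from the PR:

```r
# Illustrative sketch of the CRAN-check workaround: call requireNamespace()
# through an alias so the static dependency scan in R CMD check does not
# flag "arrow", which is not declared in DESCRIPTION yet.
requireNamespace1 <- requireNamespace
if (requireNamespace1("arrow", quietly = TRUE)) {
  # take the Arrow-based serialization path
} else {
  # fall back to the existing row-at-a-time serialization
}
```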
   
Next items (I'm going to investigate these before filing JIRAs):
   
1. I'm going to look into whether we can do this for the Spark DataFrame -> R DataFrame direction too.
2. I'm also going to look at R native function APIs like lapply and gapply and see if we can optimize those as well (see the sketch after this list).
3. Before the Spark 3.0 release, I will document this. Hopefully, we can get rid of both workarounds I mentioned above before then.
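
To make the second item concrete, this is the kind of `gapply` call whose worker-side (de)serialization could benefit. The example is just the standard SparkR `gapply` pattern (mean `disp` per `gear` over `mtcars`); the Arrow part is still to be investigated, not something this PR covers:

```r
library(SparkR)
sparkR.session(master = "local[*]")

df <- createDataFrame(mtcars)

# Mean displacement per gear. The UDF runs over per-group R data.frames on
# the workers, which is where Arrow-based transfer could help.
schema <- structType(structField("gear", "double"),
                     structField("avg_disp", "double"))
result <- gapply(df, "gear",
                 function(key, x) {
                   data.frame(key, mean(x$disp))
                 },
                 schema)
head(collect(result))
```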
