e-strauss commented on PR #2296:
URL: https://github.com/apache/systemds/pull/2296#issuecomment-3117658741

   @phaniarnab I added a new experiment for Spark with Arrow tables as a
comparison. For the experiment, I created an Arrow table in Python and
transferred it to Spark, triggering the computation with a `count` action. The
experiment can be found
[here](https://gist.github.com/e-strauss/4e55c9e3cd4f80f671e1023bdd2ce5b8).
   
   For larger data sizes, the runtime goes down significantly, since Spark
switches from a LocalRelation in createDataFrame to an RDD-based
createDataFrame. Both paths use the Arrow optimization.
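
   For readers who do not want to open the gist, the experiment described above could look roughly like the sketch below. This is not the code from the gist. The helper names (`make_columns`, `time_arrow_to_spark`, `main`), the column layout, and the row counts are made up for illustration. It also assumes the Arrow table is converted with `to_pandas()` before calling `createDataFrame`, since the comment does not say how the table was handed over.

```python
import time


def make_columns(n_rows, n_cols):
    """Build plain-Python column data: n_cols float columns of n_rows values."""
    return {f"c{j}": [float(i + j) for i in range(n_rows)] for j in range(n_cols)}


def time_arrow_to_spark(spark, n_rows, n_cols):
    """Create an Arrow table, pass it to Spark, and time until count() returns."""
    import pyarrow as pa  # lazy import: make_columns stays dependency-free

    table = pa.table(make_columns(n_rows, n_cols))
    # Assumption: go through pandas, which createDataFrame accepts as input.
    pdf = table.to_pandas()

    start = time.perf_counter()
    df = spark.createDataFrame(pdf)
    rows = df.count()  # the action forces Spark to actually transfer the data
    return rows, time.perf_counter() - start


def main():
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[*]")
        # Enable Arrow-based conversion between pandas and Spark.
        .config("spark.sql.execution.arrow.pyspark.enabled", "true")
        .getOrCreate()
    )
    for n_rows in (1_000, 100_000):  # illustrative sizes only
        rows, seconds = time_arrow_to_spark(spark, n_rows, n_cols=10)
        print(f"{n_rows} rows: count={rows}, {seconds:.3f}s")
    spark.stop()
```

   Calling `main()` needs `pyspark`, `pyarrow`, and `pandas` installed. The timing deliberately starts after the table is built, so only `createDataFrame` and `count` are measured.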
   


