Re: [PR] [WIP] Data transfer Python <--> Java [systemds]

via GitHub Fri, 25 Jul 2025 05:45:06 -0700


e-strauss commented on PR #2296:
URL: https://github.com/apache/systemds/pull/2296#issuecomment-3117658741


   @phaniarnab I added a new Spark experiment that uses Arrow tables as a 
comparison. For the experiment, I created an Arrow table in Python and 
transferred it to Spark, triggering the computation with a count action. The 
experiment can be found 
[here](https://gist.github.com/e-strauss/4e55c9e3cd4f80f671e1023bdd2ce5b8).
   
   For larger data sizes, the runtime goes down significantly, since Spark 
switches from a LocalRelation in createDataFrame to an RDD-based 
createDataFrame. Both paths use the Arrow optimization. 
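
   A minimal sketch of this kind of measurement, not the linked gist itself. 
The helper names (`time_call`, `build_arrow_table`, `transfer_and_count`), the 
single-column test data, and the row counts are hypothetical. Going through 
`to_pandas()` is also an assumption: it relies on the 
`spark.sql.execution.arrow.pyspark.enabled` setting for the Arrow-optimized 
path, and recent Spark releases may accept the Arrow table directly.

```python
import time


def time_call(fn, *args):
    """Return (result, elapsed_seconds) for a single call of fn(*args)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def build_arrow_table(n_rows):
    """Build a one-column Arrow table of doubles (hypothetical test data)."""
    import pyarrow as pa  # imported lazily so the helpers load without pyarrow

    return pa.table({"x": [float(i) for i in range(n_rows)]})


def transfer_and_count(spark, table):
    """Hand the Arrow table to Spark and force evaluation with count()."""
    # Assumption: convert via pandas so createDataFrame can use the
    # Arrow-optimized pandas path enabled in the session config below.
    df = spark.createDataFrame(table.to_pandas())
    return df.count()


if __name__ == "__main__":
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[*]")
        .config("spark.sql.execution.arrow.pyspark.enabled", "true")
        .getOrCreate()
    )
    # Hypothetical sizes; a sweep like this is where a change in
    # createDataFrame strategy would become visible.
    for n_rows in (1_000, 100_000, 1_000_000):
        table = build_arrow_table(n_rows)
        count, seconds = time_call(transfer_and_count, spark, table)
        print(f"{n_rows} rows -> count={count}, {seconds:.3f}s")
    spark.stop()
```

   The count action matters because createDataFrame alone is lazy, so timing 
it without an action would not include the actual transfer.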
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
