e-strauss commented on PR #2296: URL: https://github.com/apache/systemds/pull/2296#issuecomment-3117658741
@phaniarnab I added a new Spark experiment that uses Arrow tables as a comparison. In it, I create an Arrow table in Python and transfer it to Spark, triggering the computation with a `count` action. The experiment can be found [here](https://gist.github.com/e-strauss/4e55c9e3cd4f80f671e1023bdd2ce5b8). For larger data sizes, the runtime goes down significantly, because Spark switches from a LocalRelation in createDataFrame to an RDD-based createDataFrame. Both paths use the Arrow optimization.
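For readers without the gist at hand, the measurement described above could look roughly like the sketch below. It is not the code from the gist: the helper names `make_arrow_table`, `time_transfer`, and `main`, the column layout, and the row counts are all hypothetical, and it assumes PySpark 4.0 or newer, where `createDataFrame` accepts a `pyarrow.Table` directly.

```python
import time


def make_arrow_table(n_rows, n_cols=4):
    """Build a pyarrow.Table of random float64 columns (sizes are illustrative)."""
    # Imported lazily so the timing helper below works without these packages.
    import numpy as np
    import pyarrow as pa

    columns = {f"c{i}": np.random.rand(n_rows) for i in range(n_cols)}
    return pa.table(columns)


def time_transfer(spark, table):
    """Hand `table` to Spark and force evaluation with count().

    Returns (row_count, elapsed_seconds).
    """
    start = time.perf_counter()
    # Assumption: PySpark >= 4.0, where createDataFrame accepts a pyarrow.Table.
    df = spark.createDataFrame(table)
    n = df.count()  # action: triggers the actual computation
    return n, time.perf_counter() - start


def main():
    """Time the transfer for a few hypothetical data sizes on a local session."""
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[*]")
        .config("spark.sql.execution.arrow.pyspark.enabled", "true")
        .getOrCreate()
    )
    for n_rows in (1_000, 100_000, 10_000_000):
        n, secs = time_transfer(spark, make_arrow_table(n_rows))
        print(f"{n_rows:>12,d} rows: {secs:.3f} s (count={n})")
    spark.stop()
```

Calling `main()` runs the sweep and needs `pyspark`, `pyarrow`, and `numpy` installed. The timing includes both `createDataFrame` and `count`, so it covers the whole transfer regardless of which internal path Spark chooses for a given size.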