Dewey Dunnington created SEDONA-723:
---------------------------------------

             Summary: Add Arrow write format
                 Key: SEDONA-723
                 URL: https://issues.apache.org/jira/browse/SEDONA-723
             Project: Apache Sedona
          Issue Type: Improvement
            Reporter: Dewey Dunnington


In SEDONA-660, SEDONA-714, and SEDONA-717, we wired up the ArrowSerializer from SparkConnect to accelerate transfer between the JVM and Python on the driver. For queries whose results are arbitrarily large, or whose size is unknown when the query is issued, this can cause out-of-memory errors on the driver. It would be helpful to have an escape hatch, and an Arrow write format would provide one.

Writing Arrow files is also a useful way for Sedona users to build services on top of Sedona (e.g., by returning the URLs of the written Arrow files, as described in https://arrow.apache.org/blog/2025/01/10/arrow-result-transfer/ ).

This should probably be a feature of Spark itself; however, I don't think the existing conversion infrastructure is flexible enough to handle it. I'll put up a draft PR exploring the idea to see if there is interest!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)