paleolimbot opened a new issue, #1756: URL: https://github.com/apache/sedona/issues/1756
I'm still getting to know how the input/output works for Spark/Sedona, but I noticed that there's a `_collect_as_arrow()` method on data frames (I think this is exposed as `.toArrow()` in pyspark 4.0.0), and I'm wondering if there's an opportunity to implement something like `Adapter.to_geoarrow()` to provide compatibility with tools that import/export GeoArrow (e.g., geopandas, geoarrow-rs).

```python
import os

import geopandas
import pyspark
from sedona.spark import SedonaContext

config = SedonaContext.builder().getOrCreate()
sedona = SedonaContext.create(config)

gs = geopandas.GeoSeries.from_wkt(["POINT (1 2)"])
gdf = geopandas.GeoDataFrame({"geometry": gs})
sedona.createDataFrame(gdf)._collect_as_arrow()
#> [pyarrow.RecordBatch
#> geometry: binary
#> ----
#> geometry: [1200000001000000000000000000F03F0000000000000040]]
```

I am assuming that the binary here is the same binary that is being serialized/deserialized in https://github.com/apache/sedona/tree/52b6ae8e71601cdf36a6176198839bc3daf5547c/python/src. A simple case might be converting that serialization to WKB and exporting `geoarrow.wkb`, which has the widest compatibility. With some information about which geometry types are present, it would be possible to generate "native" geoarrow (i.e., all coordinates together with separate buffers for part/ring offsets).
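
For illustration only, here is a rough sketch of what the Arrow-side piece of the `geoarrow.wkb` route could look like, assuming the binary column has already been converted to standard WKB (the conversion from Sedona's internal serialization is the part I'm less sure about). The `tag_as_geoarrow_wkb` helper and the column name are hypothetical; it just attaches the `geoarrow.wkb` extension name via field metadata:

```python
import pyarrow as pa


def tag_as_geoarrow_wkb(batch: pa.RecordBatch, geometry_column: str = "geometry") -> pa.RecordBatch:
    """Hypothetical helper: mark a binary column as geoarrow.wkb via field metadata.

    Assumes the column already holds standard WKB (i.e., Sedona's internal
    serialization has been converted beforehand, e.g. with shapely).
    """
    schema = batch.schema
    idx = schema.get_field_index(geometry_column)
    field = schema.field(idx).with_metadata({"ARROW:extension:name": "geoarrow.wkb"})
    # Swap in the tagged field; the underlying buffers are untouched.
    return pa.RecordBatch.from_arrays(batch.columns, schema=schema.set(idx, field))
```

Once the column is tagged like this, consumers such as `geopandas.GeoDataFrame.from_arrow()` (geopandas >= 1.0) or geoarrow-rs should be able to pick it up without any Sedona-specific knowledge, and the "native" geoarrow encodings could be layered on later in the same place.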