Hello,

There is an emerging spec[1] for storing geospatial data in Arrow and passing it through Parquet files in the geopandas world. There is also a new R package, sfarrow[2], that implements a wrapper to do the same in R. Both define a serialization[3] for storing geospatial data as an Arrow table (and thus also when saving to Parquet with Arrow).
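To make the serialization described in [3] a bit more concrete, here is a rough sketch of the idea: geometries encoded as WKB bytes in a binary column, plus a small piece of schema metadata saying so. The helper names, the `geo` metadata key, and the JSON fields are assumptions chosen for illustration, not taken from the spec, and only 2D points are handled. The WKB part uses the standard library only; the hypothetical `build_geo_table` helper assumes pyarrow is installed.

```python
import json
import struct

# WKB layout for a 2D point: 1-byte byte-order flag (1 = little-endian),
# 4-byte unsigned geometry type (1 = Point), then x and y as 8-byte doubles.
_POINT_FORMAT = "<BIdd"


def encode_point_wkb(x, y):
    """Encode a 2D point as little-endian WKB (21 bytes)."""
    return struct.pack(_POINT_FORMAT, 1, 1, x, y)


def decode_point_wkb(wkb):
    """Decode a little-endian WKB 2D point back into an (x, y) tuple."""
    byte_order, geom_type, x, y = struct.unpack(_POINT_FORMAT, wkb)
    if byte_order != 1 or geom_type != 1:
        raise ValueError("this sketch only handles little-endian 2D points")
    return (x, y)


def geo_metadata(column="geometry"):
    """Build schema-level metadata recording that `column` holds WKB.

    The key name and JSON fields are illustrative assumptions, not the
    layout defined by the spec in [1].
    """
    payload = {"primary_column": column, "columns": {column: {"encoding": "WKB"}}}
    return {b"geo": json.dumps(payload).encode("utf-8")}


def build_geo_table(names, points):
    """Build an Arrow table with a binary WKB column and geo metadata."""
    import pyarrow as pa  # imported lazily; only this helper needs pyarrow

    geometry = pa.array(
        [encode_point_wkb(x, y) for x, y in points], type=pa.binary()
    )
    table = pa.table({"name": pa.array(names), "geometry": geometry})
    # Schema metadata is a bytes-to-bytes mapping attached to the table schema.
    return table.replace_schema_metadata(geo_metadata("geometry"))
```

A reader that finds the metadata entry knows to decode the `geometry` column as WKB; one that does not still sees an ordinary binary column. Writing such a table to Parquet with Arrow should carry the schema metadata along with it.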
I can see a number of ways that we might interact with standards like these, and for any that we pursue it would be good to clarify that in our docs:

1. Point to the standard: we could mention that this standard exists and that anyone building a geospatial-aware application _could_ refer to it if they want to.

2. Adopt a/this standard: this could range from stating that we've adopted it as the way that spatial data _ought_ to be stored, to asking the creators whether it would be better maintained within the Arrow project itself (either by adopting it or by creating a fork; of course, communication with the folks working on it now would be critical!).

3. Create extension type(s) for geospatial data: this would require adopting a standard like the one linked, and on top of that providing an extension type within Arrow itself that the various clients could implement as they see fit.

4. Create new, fully separate type(s) for geospatial data: again, this would require adopting a standard of some sort, but we would implement it as a specific type and presumably support it in as many of the clients as we could.

There are of course pros and cons to all of these. This type of data *is* somewhat specialized, and I don't think we want a huge profusion of types for every specialized kind of data out there. But at a minimum we should acknowledge (or adopt) a standard if one exists and encourage implementations that use Arrow to follow it (as sfarrow does to be compatible with geopandas), so that there is some level of interoperability and people don't need to reinvent the wheel each time they store spatial data.

Thoughts? Are there other projects out there that already do something like this with Arrow that we should consider?
[1] https://github.com/geopandas/geo-arrow-spec/pull/2
[2] https://github.com/wcjochem/sfarrow
[3] For now they create a binary WKB column and attach a bit of metadata to the schema recording that this was done, though there are other ways one could encode this, and the spec might include other ways to store this data in the future.

-Jon