paleolimbot commented on PR #45459: URL: https://github.com/apache/arrow/pull/45459#issuecomment-2749852558
Thank you for the review! > To confirm, the segfault I mentioned is no longer present with the latest change to remove GeoCrsContext handling. Yes, I think previously there was a `throw ParquetException(...)` that should have been a `return Status(...)`. I believe that was due to the bug whose fix you merged in GeoPandas (it was generating a string containing JSON and not a JSON object, and the previous iteration only supported JSON objects and threw the error when it saw a string). The bug now no longer affects this code, because it writes the stringified JSON to the LogicalType crs (the same result as serializing the RapidJSON object into the LogicalType crs). > for the Parquet code to "see" a geometry column in the Arrow data it is writing, it needs to be an actual registered extension type, and having the extension metadata in the field metadata is not sufficient? Yes. I believe we have access to the `Field` there and this is theoretically possible. I was going for a degree of "opt-in" here since this is all new (an earlier version of this PR exposed an option `write_geospatial_types`, which is another option). I really fine with anything here but I do thinking having some version of opt-in would be good for the first release that this lands in. I think a better long-term solution would be a canonical extension type (i.e., always registered), and while I'd prefer to merge this as experimental and have that discussion over the next few months, I'm happy to implement any consensus we end up on here. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org