paleolimbot commented on PR #45459:
URL: https://github.com/apache/arrow/pull/45459#issuecomment-2749852558

   Thank you for the review!
   
   > To confirm, the segfault I mentioned is no longer present with the latest 
change to remove GeoCrsContext handling.
   
   Yes, I think previously there was a `throw ParquetException(...)` that 
should have been a `return Status(...)`. I believe that was due to the bug 
whose fix you merged in GeoPandas (it was generating a string containing JSON 
and not a JSON object, and the previous iteration only supported JSON objects 
and threw the error when it saw a string). The bug now no longer affects this 
code, because it writes the stringified JSON to the LogicalType crs (the same 
result as serializing the RapidJSON object into the LogicalType crs).
   
   > for the Parquet code to "see" a geometry column in the Arrow data it is 
writing, it needs to be an actual registered extension type, and having the 
extension metadata in the field metadata is not sufficient?
   
   Yes. I believe we have access to the `Field` there and this is theoretically 
possible. I was going for a degree of "opt-in" here since this is all new (an 
earlier version of this PR exposed an option `write_geospatial_types`, which is 
another option). I really fine with anything here but I do thinking having some 
version of opt-in would be good for the first release that this lands in.
   
   I think a better long-term solution would be a canonical extension type 
(i.e., always registered), and while I'd prefer to merge this as experimental 
and have that discussion over the next few months, I'm happy to implement any 
consensus we end up on here.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to