paleolimbot commented on PR #10065:
URL: https://github.com/apache/arrow-rs/pull/10065#issuecomment-4653799290

   > Is this a "breaking change"
   
   Definitely not in the "Rust code will stop compiling" sense. From a metadata 
semantics point of view I would consider this a bug fix...the currently emitted 
metadata would have either triggered errors or resulted in silently/subtly 
incorrect results in at least one common case (the default Parquet != GeoArrow 
default CRS), and intentionally invalid metadata is required to write 
geography. In order to have any of this code triggered one would need to build 
parquet with the geospatial feature enabled, which I am not sure is common (but 
I don't know this).
   
   That said there is a workaround if this needs to be punted (in SedonaDB we 
have a UDF that intentionally creates invalid GeoArrow field metadata to write 
valid Parquet output).
   
   > DO you have any suggestions on how to review this PR? 
   
   I hope the below is helpful!
   
   > other implementations I can cross reference with
   
   Arrow C++ would be the other one: 
https://github.com/apache/arrow/blob/bfc9cdb2cc8e1dcbfef5e97db2e3e9f5a15fd356/cpp/src/parquet/geospatial/util_json_internal.cc#L35-L142
   
   The only difference here is the `srid:0` conversion. That was only just 
clarified in the Parquet spec in the last few weeks and I haven't had a chance 
to PR it into that spec yet.
   
   > Like other implementations I can cross reference with
   
   The two CRS specs:
   
   - https://geoarrow.org/extension-types.html#extension-metadata
   - 
https://github.com/apache/parquet-format/blob/master/Geospatial.md#coordinate-reference-system
   
   > Is it right to have embedded quotes here?
   
   Good catch! I'll update. It works because we do try to parse the value as 
JSON first (to avoid taking a valid JSON object from the Parquet field, which 
is a string, escaping it, and re-inserting it into a the JSON GeoArrow 
metadata) but it's not what I was trying to test 😬 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to