Re: [PR] PARQUET-2471: Add geometry logical type [parquet-format]

via GitHub Tue, 28 May 2024 06:40:27 -0700


paleolimbot commented on PR #240:
URL: https://github.com/apache/parquet-format/pull/240#issuecomment-2135246899


   That is a great point!
   
   > Or will this require that all Parquet implementations have some baseline 
level of geospatial support?
   
   I think the minimum would be "ability to access logical type information", 
which is where the "serialized metadata" (as opposed to thrift-specified 
metadata) is nice because it would let that evolve without a change to the 
reader. In Arrow C++ or pyarrow I believe there is already access to a Parquet 
logical type (this is more difficult to reach than file-level metadata, which 
was already propagated to an Arrow schema, albeit sometimes incorrectly when 
multiple files or a renamed column were involved). The second level might be 
the ability to write column statistics, which would require a WKB parser.
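
   To give a rough sense of what that "second level" asks of a writer, here is 
a minimal sketch in Python, assuming the statistic is a 2D bounding box over 
the column's WKB values. The function names `wkb_bbox` and `column_bbox` are 
hypothetical and not part of any Parquet or Arrow API, and the sketch only 
handles 2D Point, LineString, and Polygon values:

```python
import struct

# WKB geometry type codes handled by this sketch (2D only).
WKB_POINT, WKB_LINESTRING, WKB_POLYGON = 1, 2, 3


def wkb_bbox(wkb):
    """Return (xmin, ymin, xmax, ymax) for one 2D WKB value, or None if it has no points."""
    # Byte 0 is the byte-order flag: 1 = little-endian, 0 = big-endian.
    endian = "<" if wkb[0] == 1 else ">"
    # Bytes 1-4 hold the geometry type as an unsigned 32-bit integer.
    (geom_type,) = struct.unpack_from(endian + "I", wkb, 1)
    offset = 5

    def read_points(count, start):
        # Each 2D point is two 8-byte doubles (x, y).
        values = struct.unpack_from(f"{endian}{2 * count}d", wkb, start)
        return list(zip(values[0::2], values[1::2])), start + 16 * count

    if geom_type == WKB_POINT:
        points, _ = read_points(1, offset)
    elif geom_type == WKB_LINESTRING:
        (n_points,) = struct.unpack_from(endian + "I", wkb, offset)
        points, _ = read_points(n_points, offset + 4)
    elif geom_type == WKB_POLYGON:
        (n_rings,) = struct.unpack_from(endian + "I", wkb, offset)
        offset += 4
        points = []
        for _ in range(n_rings):
            (n_points,) = struct.unpack_from(endian + "I", wkb, offset)
            ring, offset = read_points(n_points, offset + 4)
            points.extend(ring)
    else:
        # Multi-geometries, collections, and Z/M variants are out of scope here.
        raise ValueError(f"unsupported WKB geometry type: {geom_type}")

    if not points:
        return None
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def column_bbox(values):
    """Merge per-value bounding boxes into one box for a column chunk; nulls are skipped."""
    boxes = [wkb_bbox(v) for v in values if v is not None]
    boxes = [b for b in boxes if b is not None]
    if not boxes:
        return None
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


# Example: a little-endian point (1, 2) and a two-point linestring.
point = struct.pack("<BIdd", 1, WKB_POINT, 1.0, 2.0)
line = struct.pack("<BII4d", 1, WKB_LINESTRING, 2, -3.0, 0.5, 4.0, 7.0)
print(column_bbox([point, None, line]))  # (-3.0, 0.5, 4.0, 7.0)
```

   Even this reduced case has to deal with byte order and nested rings, and a 
real writer would also need multi-geometries, Z/M coordinates, and empty 
geometries, so this is more than a reader that only passes the logical type 
through would have to carry.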
   
   The flip side of this argument is that embedding geospatial details in the 
format allows Parquet to be a more effective geospatial file format for the 
readers/writers that *do* care about these details.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


