paleolimbot commented on PR #240: URL: https://github.com/apache/parquet-format/pull/240#issuecomment-2135246899
That is a great point!

> Or will this require that all Parquet implementations have some baseline level of geospatial support?

I think the minimum would be "ability to access logical type information", which is where the "serialized metadata" (as opposed to thrift-specified metadata) is nice, because it would let that metadata evolve without a change to the reader. In Arrow C++ or pyarrow I believe there is already access to a Parquet logical type (although it is harder to reach than file-level metadata, which was already propagated to an Arrow schema, albeit sometimes incorrectly if multiple files or a renamed column were involved). The second level might be the ability to write column statistics, which would require a WKB parser.

The flip side of this argument is that embedding geospatial details in the format allows Parquet to be a more effective geospatial file format for the readers/writers that *do* care about these details.
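To give a sense of what "would require a WKB parser" means for a writer, here is a minimal sketch in Python. It assumes the column statistics in question are something like a per-column bounding box, which is my assumption for illustration and not something stated above. The function names `wkb_xy_bounds` and `column_bounds` are hypothetical, and only 2D Point and LineString values are handled; a real writer would also need polygons, multi-geometries, collections, Z/M dimensions, and empty points.

```python
import struct

# Geometry type codes from the WKB encoding (2D only in this sketch).
WKB_POINT = 1
WKB_LINESTRING = 2


def wkb_xy_bounds(wkb):
    """Return (xmin, ymin, xmax, ymax) for a 2D WKB Point or LineString.

    Hypothetical helper; returns None for a LineString with no coordinates.
    """
    # Byte 0 is the byte-order flag: 0 = big endian, 1 = little endian.
    endian = "<" if wkb[0] == 1 else ">"
    # Bytes 1..4 hold the geometry type as an unsigned 32-bit integer.
    (geom_type,) = struct.unpack_from(endian + "I", wkb, 1)

    if geom_type == WKB_POINT:
        n_coords, offset = 1, 5
    elif geom_type == WKB_LINESTRING:
        # A LineString stores its coordinate count before the coordinates.
        (n_coords,) = struct.unpack_from(endian + "I", wkb, 5)
        offset = 9
    else:
        raise NotImplementedError(f"geometry type {geom_type} not handled")

    if n_coords == 0:
        return None

    # Coordinates are interleaved doubles: x0, y0, x1, y1, ...
    values = struct.unpack_from(f"{endian}{2 * n_coords}d", wkb, offset)
    xs, ys = values[0::2], values[1::2]
    return min(xs), min(ys), max(xs), max(ys)


def column_bounds(values):
    """Merge per-value bounds over a column, skipping nulls (None)."""
    boxes = [wkb_xy_bounds(v) for v in values if v is not None]
    boxes = [b for b in boxes if b is not None]
    if not boxes:
        return None
    xmin, ymin, xmax, ymax = zip(*boxes)
    return min(xmin), min(ymin), max(xmax), max(ymax)


# A little-endian POINT (1 2) and a big-endian LINESTRING (-3 0.5, 4 7).
point = struct.pack("<BIdd", 1, WKB_POINT, 1.0, 2.0)
line = struct.pack(">BII4d", 0, WKB_LINESTRING, 2, -3.0, 0.5, 4.0, 7.0)

print(column_bounds([point, None, line]))
```

Even this reduced case has to deal with per-value byte order and type dispatch, which is the kind of geometry-specific logic a writer takes on at the "second level", while a reader that only exposes the logical type needs none of it.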
