Drabble commented on PR #855: URL: https://github.com/apache/incubator-baremaps/pull/855#issuecomment-2131242523
I did a little bit of reading on the GeoParquet spec: https://github.com/opengeospatial/geoparquet/blob/main/format-specs/geoparquet.md

I have a feeling that what our library does is extract the data from a generic Parquet file and additionally expose the GeoParquet metadata. We don't really do any GeoParquet-specific processing apart from parsing the metadata.

We could provide extra functions to parse the WKB/GeoArrow binary into a Java class that loads the geometries and sets the correct CRS. For Baremaps, I don't think this is really useful, as we store raw WKB in the database, but it could be useful for someone else using the library. Maybe this should be considered in a second iteration?

The big question is: should we return a logical type called Geometry instead of the binary type? The same question applies to other logical types such as STRING or Date. And should we rename the `Primitive` class to something else, since String and Geometry are not primitives?

Regarding the GeoParquet metadata, I believe we should expose it through a function like `geoParquetReader.getGeoParquetMetadata()` instead of including it inside each `GeoParquetGroupImpl` object. The GeoParquet metadata should be the same for every record, as long as the files are valid.

As for writing GeoParquet files, I think this is not useful for Baremaps. We should maybe consider it as a second or third step.

I think there are a lot of open questions about supporting the entire specification and validating schemas if we try to implement that. Some things to consider supporting are:

- Single-geometry type encodings based on the [GeoArrow](https://geoarrow.org/) specification instead of WKB.
- Validating input GeoParquet files. For example, according to the spec, there should never be geometries inside nested objects.
- Using more complex types for `GeoParquetColumnMetadata`. For example, the edges field could be an enum. The spec describes it as: `Name of the coordinate system for the edges. Must be one of "planar" or "spherical". The default value is "planar".`
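To make the enum suggestion for the edges field concrete, here is a minimal sketch in plain Java. The names `Edges` and `fromMetadata` are hypothetical and do not exist in the PR; the only facts taken from the spec text quoted above are the two allowed values and the `"planar"` default. Matching is assumed to be exact (case-sensitive), which is an assumption rather than something the quoted text spells out.

```java
/** Hypothetical enum for the "edges" field of a geometry column's metadata. */
enum Edges {
  PLANAR("planar"),
  SPHERICAL("spherical");

  // The string value as it appears in the file metadata.
  private final String value;

  Edges(String value) {
    this.value = value;
  }

  String value() {
    return value;
  }

  /**
   * Maps the raw metadata string to an enum constant.
   * A missing (null) field falls back to the default, PLANAR.
   * Any other unknown value is rejected, since the field must be
   * one of the two allowed names.
   */
  static Edges fromMetadata(String raw) {
    if (raw == null) {
      return PLANAR;
    }
    for (Edges edges : values()) {
      if (edges.value.equals(raw)) {
        return edges;
      }
    }
    throw new IllegalArgumentException(
        "edges must be \"planar\" or \"spherical\", got: " + raw);
  }
}
```

With something like this, `Edges.fromMetadata(null)` yields `Edges.PLANAR`, and an unexpected value fails at parse time instead of leaking into later processing as an arbitrary string. Whether rejecting unknown values is the right behavior for the reader, as opposed to a separate validation step, is a design choice left open here.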
