paleolimbot commented on code in PR #240:
URL: https://github.com/apache/parquet-format/pull/240#discussion_r1696730689
##########
src/main/thrift/parquet.thrift:
##########
@@ -373,6 +453,51 @@ struct JsonType {
struct BsonType {
}
+/**
+ * Physical type and encoding for the geometry type.
+ */
+enum GeometryEncoding {
+ /**
+ * Allowed for physical type: BYTE_ARRAY.
+ *
+ * Well-known binary (WKB) representations of geometries. It supports 2D or
+ * 3D geometries of the standard geometry types (Point, LineString, Polygon,
+ * MultiPoint, MultiLineString, MultiPolygon, and GeometryCollection). This
+ * is the preferred option for maximum portability.
+ *
+ * This encoding enables GeometryStatistics to be set in the column chunk
+ * and page index.
+ */
+ WKB = 0;
+
+ // TODO: add native encoding from GeoParquet/GeoArrow
+}
+
+/**
+ * Geometry logical type annotation (added in 2.11.0)
+ */
+struct GeometryType {
+ /**
+ * Physical type and encoding for the geometry type. Please refer to the
+ * definition of GeometryEncoding for more detail.
+ */
+ 1: required GeometryEncoding encoding;
+ /**
+ * Edges of polygon.
+ */
+ 2: required Edges edges;
+ /**
+ * Coordinate Reference System, i.e. mapping of how coordinates refer to
+ * precise locations on earth, e.g. OGC:CRS84
+ */
+ 3: optional string crs;
Review Comment:
> Example:
I know this is just an example, but can we make this a string to avoid a
Thrift update when a new CRS encoding arrives?
> `WKT2 = 0;`
If this is included as an option it would need to be more explicit about
what *kind* of WKT2 we're talking about (I think we'd mean WKT2 2019):
https://github.com/OSGeo/PROJ/blob/79b4f28c10d1695da841ca33d6f14fced2a2979a/src/proj.h#L793
> introduce additional overhead to the implementer
It is true that the implementer typically only translates the coordinates
into some native library representation (e.g., JTS, GEOS) or performs
computation on them. I think the main thing here is to state explicitly how
to resolve the projection parameters from a CRS representation. With WKT and
PROJJSON they are embedded (i.e., an implementer of a coordinate transform
does not have to look anything up in a database to convert to another CRS on
the same datum, like long/lat to a Mercator projection). With SRID, one would
have to make it clear how to resolve the identifier (EPSG or PROJ database
version, URI of a lookup table of some kind, etc.).
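
To make the difference concrete, here is a rough Python sketch (not part of
the proposal). The PROJJSON document is hand-written and heavily abridged, and
`SRID_REGISTRY`, `ellipsoid_from_projjson`, and `ellipsoid_from_srid` are
hypothetical names used only for illustration. A real reader would use a
library such as PROJ rather than parsing this by hand.

```python
import json

# Abridged, hand-written PROJJSON-style definition (illustrative only).
# The ellipsoid parameters travel with the CRS string itself.
PROJJSON_CRS84 = json.dumps({
    "type": "GeographicCRS",
    "name": "WGS 84 (CRS84)",
    "datum": {
        "type": "GeodeticReferenceFrame",
        "name": "World Geodetic System 1984",
        "ellipsoid": {
            "name": "WGS 84",
            "semi_major_axis": 6378137,
            "inverse_flattening": 298.257223563,
        },
    },
})

# Hypothetical stand-in for an external registry, e.g. one specific
# version of the EPSG database. The file alone does not contain this.
SRID_REGISTRY = {4326: PROJJSON_CRS84}


def ellipsoid_from_projjson(crs_text):
    """Read the ellipsoid parameters straight out of the embedded definition."""
    ellipsoid = json.loads(crs_text)["datum"]["ellipsoid"]
    return ellipsoid["semi_major_axis"], ellipsoid["inverse_flattening"]


def ellipsoid_from_srid(srid, registry):
    """Resolve an SRID; this only works if the reader has the right registry."""
    if srid not in registry:
        raise LookupError(f"SRID {srid} is not in the registry available to this reader")
    return ellipsoid_from_projjson(registry[srid])
```

With the embedded form, `ellipsoid_from_projjson(PROJJSON_CRS84)` needs nothing
beyond the string stored in the file. With an SRID, the result depends on
which registry the reader happens to have, which is the part the spec would
need to pin down.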
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]