cholmes commented on code in PR #240:
URL: https://github.com/apache/parquet-format/pull/240#discussion_r1712845719


##########
src/main/thrift/parquet.thrift:
##########
@@ -373,6 +453,51 @@ struct JsonType {
 struct BsonType {
 }
 
+/**
+ * Physical type and encoding for the geometry type.
+ */
+enum GeometryEncoding {
+  /**
+   * Allowed for physical type: BYTE_ARRAY.
+   *
+   * Well-known binary (WKB) representations of geometries. It supports 2D or
+   * 3D geometries of the standard geometry types (Point, LineString, Polygon,
+   * MultiPoint, MultiLineString, MultiPolygon, and GeometryCollection). This
+   * is the preferred option for maximum portability.
+   *
+   * This encoding enables GeometryStatistics to be set in the column chunk
+   * and page index.
+   */
+  WKB = 0;
+
+  // TODO: add native encoding from GeoParquet/GeoArrow
+}
+
+/**
+ * Geometry logical type annotation (added in 2.11.0)
+ */
+struct GeometryType {
+  /**
+   * Physical type and encoding for the geometry type. Please refer to the
+   * definition of GeometryEncoding for more detail.
+   */
+  1: required GeometryEncoding encoding;
+  /**
+   * Edges of polygon.
+   */
+  2: required Edges edges;
+  /**
+   * Coordinate Reference System, i.e. mapping of how coordinates refer to
+   * precise locations on earth, e.g. OGC:CRS84
+   */
+  3: optional string crs;

Review Comment:
   > what do we put for crs_kind when crs="OGC:CRS84". It is empty ?
   
   +1 to everything @jiayuasu said, and just to be totally clear: it would often
be empty. Including the CRS and kind/encoding in this case is mostly
informative. Implementations should understand that if they see
crs="OGC:CRS84" they don't need to check the CRS and kind/encoding, and if the
values differ they should use CRS84 and ignore the provided CRS.
   
   We should link from the core definition to the definition of OGC:CRS84 in
all possible encodings (WKT1, WKT2, PROJJSON, etc.) so that any
projection-aware library is sure to get exactly the right definition. The goal
for GeoParquet was to let implementations that only want to support long/lat
do so without having to parse or worry about anything else, since CRS handling
adds a lot of complexity and requires some sort of geo library to parse.
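
   A minimal sketch of that reader-side short-circuit, in Python, might look
like the following. The names `resolve_crs`, `ResolvedCrs`, and the `crs_kind`
parameter are hypothetical and only illustrate the behavior discussed in this
thread; they are not part of the Thrift definition or of any existing library.
How a missing `crs` should be treated is not settled here, so the sketch simply
passes it through.

```python
from typing import NamedTuple, Optional

# Identifier discussed in this thread for plain longitude/latitude data.
CRS84 = "OGC:CRS84"


class ResolvedCrs(NamedTuple):
    """Hypothetical result type: what a reader decided about a column's CRS."""
    is_lon_lat: bool           # True when the reader can treat data as CRS84
    definition: Optional[str]  # CRS string a geo library would still need to parse
    kind: Optional[str]        # encoding of that string (e.g. WKT2, PROJJSON)


def resolve_crs(crs: Optional[str], crs_kind: Optional[str] = None) -> ResolvedCrs:
    """Short-circuit on OGC:CRS84 so simple readers never parse a CRS.

    Assumption: when crs is exactly "OGC:CRS84", any accompanying kind/encoding
    is informative only and is ignored, as suggested in the comment above.
    """
    if crs == CRS84:
        return ResolvedCrs(is_lon_lat=True, definition=None, kind=None)
    # Anything else is handed back untouched for a projection-aware library.
    return ResolvedCrs(is_lon_lat=False, definition=crs, kind=crs_kind)
```

   With this shape, a long/lat-only implementation checks `is_lon_lat` and
rejects or skips everything else, without ever needing a geo library.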



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

