paleolimbot commented on code in PR #240:
URL: https://github.com/apache/parquet-format/pull/240#discussion_r1769208683


##########
src/main/thrift/parquet.thrift:
##########
@@ -237,6 +237,135 @@ struct SizeStatistics {
    3: optional list<i64> definition_level_histogram;
 }
 
+/**
+ * Physical type and encoding for the geometry type.
+ */
+enum GeometryEncoding {
+  /**
+   * Allowed for physical type: BYTE_ARRAY.
+   *
+   * Well-known binary (WKB) representations of geometries.
+   *
+   * To be clear, we follow the same rule of WKB and coordinate axis order from
+   * GeoParquet [1][2]. Geometries SHOULD be encoded as ISO WKB [3][4]
+   * supporting XY, XYZ, XYM, XYZM and the standard geometry types
+   * Point, LineString, Polygon, MultiPoint, MultiLineString, MultiPolygon,
+   * and GeometryCollection). Coordinate order is always (x, y) where x is
+   * easting or longitude and y is northing or latitude. This ordering explicitly
+   * overrides the axis order as specified in the CRS following the GeoPackage
+   * specification [5].
+   *
+   * This is the preferred encoding for maximum portability. It also supports
+   * GeometryStatistics to be set in the column chunk and page index.
+   *
+   * [1] https://github.com/opengeospatial/geoparquet/blob/v1.1.0/format-specs/geoparquet.md?plain=1#L92
+   * [2] https://github.com/opengeospatial/geoparquet/blob/v1.1.0/format-specs/geoparquet.md?plain=1#L155
+   * [3] https://portal.ogc.org/files/?artifact_id=18241
+   * [4] https://www.iso.org/standard/60343.html
+   * [5] https://www.geopackage.org/spec130/#gpb_spec
+   */
+  WKB = 0;
+}
+
+/**
+ * Interpretation for edges of elements of a GEOMETRY logical type. In other
+ * words, whether a point between two vertices should be interpolated in
+ * its XY dimensions as if it were a Cartesian line connecting the two
+ * vertices (planar) or the shortest spherical arc between the longitude
+ * and latitude represented by the two vertices (spherical). This value
+ * applies to all non-point geometry objects and is independent of the
+ * coordinate reference system.
+ *
+ * Because most systems currently assume planar edges and do not support
+ * spherical edges, planar should be used as the default value.
+ */
+enum EdgeInterpolation {
+  PLANAR = 0;
+  SPHERICAL = 1;
+}
+
+/**
+ * A custom binary-encoded polygon or multi-polygon to represent a covering of
+ * geometries. For example, it may be a bounding box or an envelope of geometries

Review Comment:
   Great point that there is room for improvement on the phrasing here!
   
   I believe the intent is that this can be any polygon that completely covers
   the values it represents, such that for every value it represents,
   `st_intersects(arbitrary_geometry, covering)` is guaranteed to be true
   whenever `st_intersects(arbitrary_geometry, value)` is true. One easy way to
   generate this is to take the bounding box (as defined here) and return its
   vertices as a polygon. Your example is a horizontal line (in Cartesian space,
   which is how it could be defined if the EdgeInterpolation were set to
   PLANAR), so its covering would be a degenerate Polygon (but could still be
   defined). For spherical edges, one could compute a discrete global grid
   covering (e.g., S2 or H3) and convert the boundary of that to a polygon.
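
   To make the planar case concrete, here is a minimal sketch of the
   bounding-box approach described above. It is illustrative only: the helper
   names (`bounding_box`, `bbox_to_polygon_ring`, `may_intersect`) are
   hypothetical and not part of the PR, the covering is kept as a plain
   coordinate ring rather than the binary encoding the spec text calls for,
   and only PLANAR edges are considered.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
BBox = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def bounding_box(coords: Sequence[Point]) -> BBox:
    """Planar bounding box of a sequence of (x, y) vertices."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)


def bbox_to_polygon_ring(bbox: BBox) -> List[Point]:
    """Return the bounding box corners as a closed polygon ring."""
    xmin, ymin, xmax, ymax = bbox
    return [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax), (xmin, ymin)]


def boxes_intersect(a: BBox, b: BBox) -> bool:
    """True if two axis-aligned boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def may_intersect(covering_ring: Sequence[Point], query: BBox) -> bool:
    """Conservative filter: False means no covered value can intersect the query.

    Assumes the covering is itself an axis-aligned box, as produced above.
    """
    return boxes_intersect(bounding_box(covering_ring), query)


# A horizontal line, as in the example under discussion (assumed coordinates).
line = [(0.0, 5.0), (10.0, 5.0)]

# Its covering collapses to a zero-height, degenerate polygon.
covering = bbox_to_polygon_ring(bounding_box(line))

print(covering)
print(may_intersect(covering, (3.0, 4.0, 4.0, 6.0)))  # True: query box crosses y = 5
print(may_intersect(covering, (3.0, 6.0, 4.0, 7.0)))  # False: query box lies above the line
```

   The filter can only skip data safely: a `False` result rules out every
   covered value, while a `True` result still requires checking the actual
   geometries.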



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

