desruisseaux commented on code in PR #240:
URL: https://github.com/apache/parquet-format/pull/240#discussion_r1731852567


##########
src/main/thrift/parquet.thrift:
##########
@@ -237,6 +237,98 @@ struct SizeStatistics {
    3: optional list<i64> definition_level_histogram;
 }
 
+/**
+ * Interpretation for edges of GEOMETRY logical type, i.e. whether the edge
+ * between points represent a straight cartesian line or the shortest line on
+ * the sphere. It applies to all non-point geometry objects.
+ */
+enum Edges {
+  PLANAR = 0;
+  SPHERICAL = 1;
+}
+
+/**
+ * A custom binary-encoded polygon or multi-polygon to represent a covering of
+ * geometries. For example, it may be a bounding box or an envelope of geometries
+ * when a bounding box cannot be built (e.g. a geometry has spherical edges, or if
+ * an edge of geographic coordinates crosses the antimeridian). In addition, it can
+ * also be used to provide vendor-agnostic coverings like S2 or H3 grids.
+ */
+struct Covering {
+  /**
+   * A type of covering. Currently accepted values: "WKB".
+   */
+  1: required string kind;
+  /**
+   * A payload specific to kind. Below are the supported values:
+   * - WKB: well-known binary of a POLYGON or MULTI-POLYGON that completely
+   *   covers the contents. This will be interpreted according to the same CRS
+   *   and edges defined by the logical type.
+   */
+  2: required binary value;
+}
+
+/**
+ * Bounding box of geometries in the representation of min/max value pair of
+ * coordinates from each axis. Values of Z and M are omitted for 2D geometries.
+ * Filter pushdown on geometries are only safe for planar spatial predicate
+ * but it is recommended that the writer always generates bounding box statistics,
+ * regardless of whether the geometries are planar or spherical.

Review Comment:
   I cannot speak for the author's intent, but I see three difficulties with
bounding boxes on a sphere:
   
   * Boxes crossing the anti-meridian (e.g., from 170° to −170° of longitude).
Pretty much every operation (union, intersection, adding points) becomes more
complicated. There is no easy fix: switching to, e.g., the 0…360° convention
works only in special cases.
   * As the top/bottom border of the box gets closer to the north/south pole,
the box becomes wider in longitude even if the real-world feature is not that
large. If the box includes the pole, it becomes 360° wide even if the feature
is very small. This makes the box quite ineffective as the "smallest" enclosing
bounding box. In particular, it can ruin the performance that we would expect
from tiling as soon as a box's border gets close enough to a pole.
   * When testing whether a point is inside the box, we can get false positives
or false negatives near the top and bottom borders. This can happen if the box
describes the minimum and maximum coordinate values of all control points (or
nodes) of the geometries, and if the line segments between those points are
interpreted as the shortest paths. Because geodesics appear as curves on a
(_latitude_, _longitude_) map, these curves may cross (exit then reenter) a
bounding box border.
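
   To make the first and third points concrete, here is a minimal Python
sketch. It is not part of the PR: the function names are hypothetical, the
Earth is assumed to be a perfect sphere, and a box with west > east is assumed
to mean "crosses the anti-meridian" (a common convention, not something the
spec text above states).

```python
import math

def lon_in_range(lon, west, east):
    """Test a longitude against [west, east], in degrees.

    Assumption: west > east encodes a range that crosses the anti-meridian.
    """
    if west <= east:
        return west <= lon <= east
    return lon >= west or lon <= east

def to_unit_vector(lat_deg, lon_deg):
    """Convert geographic coordinates (degrees) to a 3D unit vector."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))

def midpoint_latitude(lat1, lon1, lat2, lon2):
    """Latitude (degrees) of the midpoint of the shortest great-circle arc.

    Not defined for antipodal points, where the sum vector has zero length.
    """
    a = to_unit_vector(lat1, lon1)
    b = to_unit_vector(lat2, lon2)
    m = [a[i] + b[i] for i in range(3)]
    norm = math.sqrt(sum(c * c for c in m))
    return math.degrees(math.asin(m[2] / norm))

# Point 1: a box from 170° to -170° is only 20° wide, but taking
# min/max of the two longitudes describes the 340° complement instead.
crossing_ok = lon_in_range(175.0, 170.0, -170.0)   # inside the real box
naive_wrong = lon_in_range(0.0, -170.0, 170.0)     # inside the min/max box

# Point 3: both end points are at 60°N, so a min/max box over the
# vertices stops at 60°N, yet the geodesic between them goes farther north.
peak = midpoint_latitude(60.0, -60.0, 60.0, 60.0)  # roughly 73.9
```

   In this example, a point at about 70°N, 0°E lies south of the spherical
edge but outside the min/max box built from the two vertices, which is the
kind of wrong answer near the top border described above.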



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

