paleolimbot commented on code in PR #240:
URL: https://github.com/apache/parquet-format/pull/240#discussion_r1693305287
##########
src/main/thrift/parquet.thrift:
##########
@@ -237,6 +237,83 @@ struct SizeStatistics {
3: optional list<i64> definition_level_histogram;
}
+/**
+ * Interpretation for edges of GEOMETRY logical type, i.e. whether the edge
+ * between points represent a straight cartesian line or the shortest line on
+ * the sphere. Please note that it only applies to polygons.
+ */
+enum Edges {
+ PLANAR = 0;
+ SPHERICAL = 1;
+}
+
+/**
+ * A custom WKB-encoded polygon or multi-polygon to represent a covering of
+ * geometries. For example, it may be a bounding box, or an evelope of
geometries
+ * when a bounding box cannot be built (e.g. a geometry has spherical edges,
or if
+ * an edge of geographic coordinates crosses the antimeridian). In addition,
it can
+ * also be used to provide vendor-agnostic coverings like S2 or H3 grids.
+ */
+struct Covering {
+ /** Bytes of a WKB-encoded geometry */
+ 1: required binary geometry;
+ /** Edges of the geometry, which is independent of edges from the logical
type */
+ 2: required Edges edges;
Review Comment:
If we do `metadata` we should probably remove the `geometry`/`edges`, which
were included to perhaps avoid hard-coding vendor names into the Parquet. That
said it is supremely more useful to have the `int64[]` array of values on read
than the boundaries of that covering as WKB (it is more or less equivalent to
generate the boundaries on write given access to S2 C++ or H3 C).
```
struct Covering {
/** A type of covering. Currently accepted values are "WKT", "S2 and "H3".
*/
1: required string kind;
/** A payload specific to kind:
* - WKT: well-known text of a POLYGON that completely covers the contents.
* This will be interpreted according to the same CRS and edges defined
by
* the logical type.
* - S2: A JSON array of S2 cell identifiers encoded as hexadecimal
strings.
* - H3: A JSON array of H3 cell identifiers encoded as hexadecimal
strings.
*/
2: required string value;
}
```
...and allow more than one of these to be in `GeometryStatistics`. I'm not
awesome at thrift but maybe there's a `Map<string: string>` available as well
(i.e., it isn't necessary to have more than one covering of a specific type).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]