wgtmac commented on code in PR #240:
URL: https://github.com/apache/parquet-format/pull/240#discussion_r1641623813
##########
src/main/thrift/parquet.thrift:
##########
@@ -237,6 +237,78 @@ struct SizeStatistics {
3: optional list<i64> definition_level_histogram;
}
+/**
+ * Interpretation for edges of GEOMETRY logical type, i.e. whether the edge
+ * between points represents a straight cartesian line or the shortest line on
+ * the sphere.
+ */
+enum Edges {
+ PLANAR = 0;
+ SPHERICAL = 1;
+}
+
+/**
+ * A custom WKB-encoded geometry data to be used in geometry statistics.
+ * The geometry may be a polygon to encode an s2 or h3 covering to provide
+ * vendor-agnostic coverings, or an envelope of geometries when a bounding
+ * box cannot be built (e.g. a geometry has spherical edges, or if an edge
+ * of geographic coordinates crosses the antimeridian).
+ */
+struct Geometry {
+ /** Bytes of a WKB-encoded geometry */
+ 1: required binary geometry;
+ /**
+ * Edges of the geometry if it is a polygon. It may be different to the
+ * edges attribute from the GEOMETRY logical type.
+ */
+ 2: optional Edges edges;
Review Comment:
I'm inclined to make it `required`, because the geometry is a polygon anyway
and a polygon always has this attribute. This does oblige the writer that
generates the geometry statistics to fill in the `Edges` field, but it
simplifies the reader, which no longer needs to derive the attribute from the
logical type. Deriving it is also not always possible: when the geometry column
contains no polygon values, the logical type may not have the `Edges` field at
all. cc @paleolimbot
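
To illustrate the reader-side cost, here is a rough sketch of the fallback a reader would need while `edges` stays `optional`. It is written in Python for brevity; `GeometryStats`, `resolve_edges`, and `logical_type_edges` are hypothetical names made up for this comment, not part of any existing Parquet reader API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Edges(Enum):
    # Mirrors the Thrift enum in this PR.
    PLANAR = 0
    SPHERICAL = 1


@dataclass
class GeometryStats:
    # Hypothetical in-memory form of the Thrift `Geometry` struct.
    geometry: bytes                 # WKB-encoded bytes
    edges: Optional[Edges] = None   # optional, as currently proposed


def resolve_edges(stats: GeometryStats,
                  logical_type_edges: Optional[Edges]) -> Edges:
    """Return the edge interpretation for a statistics geometry.

    With an optional field the reader needs a fallback chain, and the
    chain can still come up empty.
    """
    if stats.edges is not None:
        return stats.edges
    if logical_type_edges is not None:
        return logical_type_edges
    # Neither the statistics nor the logical type carries the attribute.
    raise ValueError("edges is undefined for this statistics geometry")


# Statistics that carry their own edges ignore the logical type.
own = GeometryStats(b"", Edges.SPHERICAL)
print(resolve_edges(own, Edges.PLANAR).name)

# Statistics without edges fall back to the logical type.
print(resolve_edges(GeometryStats(b""), Edges.PLANAR).name)
```

If the field were `required`, the whole function would reduce to reading `stats.edges`, and the error branch would disappear.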
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]