jiayuasu commented on code in PR #494:
URL: https://github.com/apache/parquet-format/pull/494#discussion_r2069349181
##########
Geospatial.md:
##########
@@ -94,6 +94,41 @@ Bounding box is defined as the thrift struct below in the representation of
min/max value pair of coordinates from each axis. Note that X and Y Values are
always present. Z and M are omitted for 2D geospatial instances.
+Writers should follow the guidelines below when calculating bounding boxes in
+the presence of edge cases.
+
+* `null` instance: Skip it and continue processing the remaining
+ geospatial instances. Do not produce a bounding box if all instances are null.
+* Non-`null` instance with [special geospatial values](#special-geospatial-values):
+ * X and Y: Skip any special X or Y value and continue processing the
Review Comment:
Taking a LineString object as an example, `LINESTRING (0 NaN, 0 1, 1 2)`:
We have 3 levels here:
1. A geospatial instance: this linestring object
2. A geospatial coordinate:
   * We have 3 geospatial coordinates: `0 NaN`, `0 1`, `1 2`
3. A geospatial value:
   * For X, we have `0`, `0`, `1`
   * For Y, we have `NaN`, `1`, `2`

The special geospatial value in this spec refers to the 3rd level. When
calculating the bbox, we omit only the single bad value `NaN` and use X (`0`,
`0`, `1`) and Y (`1`, `2`), so the resulting bbox is X [0, 1] and Y [1, 2].
That's why we don't have a definition for a `special coordinate` (2nd level):
we look at the 3rd level directly.
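
For illustration, here is a minimal Python sketch of the per-value skipping described above. It is not taken from the PR or from any Parquet implementation: the helper names `axis_bounds` and `bbox_xy` are hypothetical, coordinates are modeled as plain `(x, y)` tuples, and `NaN` is assumed to be the only special value.

```python
import math


def axis_bounds(values):
    """Return (min, max) over the non-NaN values, or None if none remain."""
    # Skip each special value individually (3rd level), not the whole coordinate.
    kept = [v for v in values if not math.isnan(v)]
    if not kept:
        return None
    return (min(kept), max(kept))


def bbox_xy(coords):
    """Compute X and Y bounds independently for a list of (x, y) tuples."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return {"x": axis_bounds(xs), "y": axis_bounds(ys)}


# LINESTRING (0 NaN, 0 1, 1 2)
line = [(0.0, float("nan")), (0.0, 1.0), (1.0, 2.0)]
result = bbox_xy(line)
```

Note that the X value `0` from the coordinate `0 NaN` still contributes to the X bounds; only the `NaN` Y value is dropped.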
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]