wgtmac commented on code in PR #494: URL: https://github.com/apache/parquet-format/pull/494#discussion_r2055264910
##########
Geospatial.md:
##########
@@ -94,6 +94,36 @@ Bounding box is defined as the thrift struct below in the representation of
 min/max value pair of coordinates from each axis. Note that X and Y Values are
 always present. Z and M are omitted for 2D geospatial instances.

+Writers should follow the guidelines below when calculating bounding boxes in
+the presence of invalid values. An invalid geospatial value refers to any of
+the following: `NaN`, `null`, `does not exist` (e.g., LINESTRING EMPTY), or
+`out of bounds` (e.g., `x < -180` or `x > 180` for `GEOGRAPHY` types):
+
+* X and Y: Skip any invalid X or Y value and processing the remaining X or Y
+  values. Do not produce a bounding box if all X or all Y values are invalid.
+
+* Z: Skip any invalid Z value and continue processing the remaining Z values.
+  Omit Z from the bounding box if all Z values are invalid.
+
+* M: Skip any invalid M value and continue processing the remaining M values.
+  Omit M from the bounding box if all M values are invalid.
+
+Readers should follow the guidelines below when examining bounding boxes:
+
+* No bounding box: No assumptions can be made about the presence or absence
+  of invalid values. Readers may need to load all individual coordinate
+  values for validation.
+
+* A bounding box is present:
+  * X and Y: X and Y of the bounding box must be present. Readers should

Review Comment:
If any X or Y value in the bbox is invalid, the bbox is malformed and cannot be used.

##########
Geospatial.md:
##########
@@ -94,6 +94,36 @@ Bounding box is defined as the thrift struct below in the representation of
 min/max value pair of coordinates from each axis. Note that X and Y Values are
 always present. Z and M are omitted for 2D geospatial instances.

+Writers should follow the guidelines below when calculating bounding boxes in
+the presence of invalid values. An invalid geospatial value refers to any of
+the following: `NaN`, `null`, `does not exist` (e.g., LINESTRING EMPTY), or
+`out of bounds` (e.g., `x < -180` or `x > 180` for `GEOGRAPHY` types):
+
+* X and Y: Skip any invalid X or Y value and processing the remaining X or Y

Review Comment:
```suggestion
* X and Y: Skip any invalid X or Y value and continue processing the remaining X or Y
```

##########
Geospatial.md:
##########
@@ -94,6 +94,36 @@ Bounding box is defined as the thrift struct below in the representation of
 min/max value pair of coordinates from each axis. Note that X and Y Values are
 always present. Z and M are omitted for 2D geospatial instances.

+Writers should follow the guidelines below when calculating bounding boxes in
+the presence of invalid values. An invalid geospatial value refers to any of
+the following: `NaN`, `null`, `does not exist` (e.g., LINESTRING EMPTY), or
+`out of bounds` (e.g., `x < -180` or `x > 180` for `GEOGRAPHY` types):
+
+* X and Y: Skip any invalid X or Y value and processing the remaining X or Y
+  values. Do not produce a bounding box if all X or all Y values are invalid.
+
+* Z: Skip any invalid Z value and continue processing the remaining Z values.
+  Omit Z from the bounding box if all Z values are invalid.
+
+* M: Skip any invalid M value and continue processing the remaining M values.
+  Omit M from the bounding box if all M values are invalid.
+
+Readers should follow the guidelines below when examining bounding boxes:
+
+* No bounding box: No assumptions can be made about the presence or absence

Review Comment:
We cannot make assumptions about valid values either. For example, we cannot treat this as an empty bbox.

##########
Geospatial.md:
##########
@@ -94,6 +94,36 @@ Bounding box is defined as the thrift struct below in the representation of
 min/max value pair of coordinates from each axis. Note that X and Y Values are
 always present. Z and M are omitted for 2D geospatial instances.

+Writers should follow the guidelines below when calculating bounding boxes in
+the presence of invalid values. An invalid geospatial value refers to any of
+the following: `NaN`, `null`, `does not exist` (e.g., LINESTRING EMPTY), or
+`out of bounds` (e.g., `x < -180` or `x > 180` for `GEOGRAPHY` types):
+
+* X and Y: Skip any invalid X or Y value and processing the remaining X or Y
+  values. Do not produce a bounding box if all X or all Y values are invalid.
+
+* Z: Skip any invalid Z value and continue processing the remaining Z values.
+  Omit Z from the bounding box if all Z values are invalid.
+
+* M: Skip any invalid M value and continue processing the remaining M values.
+  Omit M from the bounding box if all M values are invalid.
+
+Readers should follow the guidelines below when examining bounding boxes:
+
+* No bounding box: No assumptions can be made about the presence or absence
+  of invalid values. Readers may need to load all individual coordinate
+  values for validation.
+
+* A bounding box is present:
+  * X and Y: X and Y of the bounding box must be present. Readers should

Review Comment:
Ditto for Z and M below.

##########
Geospatial.md:
##########
@@ -94,6 +94,36 @@ Bounding box is defined as the thrift struct below in the representation of
 min/max value pair of coordinates from each axis. Note that X and Y Values are
 always present. Z and M are omitted for 2D geospatial instances.

+Writers should follow the guidelines below when calculating bounding boxes in
+the presence of invalid values. An invalid geospatial value refers to any of
+the following: `NaN`, `null`, `does not exist` (e.g., LINESTRING EMPTY), or

Review Comment:
It seems worth providing a concrete example for each case. For example, I still don't understand what `null` means here. Is it a null binary value in Parquet, or a null value in WKB? We can add a section below named `Invalid geospatial value` and link it here.

--
This is an automated message from the Apache Git Service.
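For readers following the thread, here is a minimal Python sketch of the writer-side guidelines quoted in the hunks above. These guidelines are still under review in this PR, so this is one possible reading, not a settled behavior. The names `axis_range` and `bounding_box`, the tuple layout `(x, y[, z[, m]])`, and the dict-shaped result are all hypothetical. The sketch also assumes that each axis is filtered independently, that an empty geometry simply contributes no coordinates, and that the only range check is the `x` bound for `GEOGRAPHY` mentioned in the diff.

```python
import math
from typing import Dict, Iterable, Optional, Sequence, Tuple

Range = Tuple[float, float]


def axis_range(values: Iterable[Optional[float]],
               lo: float = -math.inf,
               hi: float = math.inf) -> Optional[Range]:
    """Return (min, max) over the valid values, or None if none are valid.

    A value counts as invalid here if it is None, NaN, or outside [lo, hi].
    """
    kept = [v for v in values
            if v is not None and not math.isnan(v) and lo <= v <= hi]
    return (min(kept), max(kept)) if kept else None


def bounding_box(coords: Iterable[Sequence[Optional[float]]],
                 geography: bool = False) -> Optional[Dict[str, Range]]:
    """Compute a bounding box, skipping invalid values axis by axis.

    Each coordinate is assumed to be (x, y), (x, y, z) or (x, y, z, m).
    """
    coords = list(coords)
    # Assumption: only the x bound from the diff is enforced for GEOGRAPHY.
    x_lo, x_hi = (-180.0, 180.0) if geography else (-math.inf, math.inf)

    xs = axis_range((c[0] for c in coords), x_lo, x_hi)
    ys = axis_range(c[1] for c in coords)
    if xs is None or ys is None:
        # All X or all Y values are invalid: do not produce a bounding box.
        return None

    box: Dict[str, Range] = {"x": xs, "y": ys}
    zs = axis_range(c[2] for c in coords if len(c) > 2)
    ms = axis_range(c[3] for c in coords if len(c) > 3)
    if zs is not None:  # Omit Z when every Z value is invalid or absent.
        box["z"] = zs
    if ms is not None:  # Omit M when every M value is invalid or absent.
        box["m"] = ms
    return box
```

Under this reading, a column whose X values are all `NaN` yields no bounding box at all, while a column with only some invalid Z values still gets a Z range from the remaining ones.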
