2010YOUY01 opened a new pull request, #578: URL: https://github.com/apache/sedona-db/pull/578
Follow up to https://github.com/apache/sedona-db/pull/560

## Rationale

`read_parquet()` now supports an override option, `geometry_columns`, to cast Binary columns to geometries. This is an unsafe operation, so adding a validation step is helpful.

This PR currently limits validation to WKB validity. In the long term, additional metadata properties can also be validated. If any entry fails validation, an error is returned.

Geometry columns are inferred from two sources: (Geo)Parquet metadata and the user-provided `geometry_columns` override option. The `validate` option applies uniformly to both: if `validate = true`, geometry columns are always validated. I think this approach is more general and simpler.

## Implementation

Propagate the `validate` option to `GeoParquetFileOpener` (which yields the final decoded batches as the data source output), and use it to optionally validate the decoded batches for WKB validity.
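To illustrate the kind of check involved, here is a minimal Python sketch of per-entry WKB validation that fails on the first bad entry. It is not the code in this PR: the names `check_wkb_header` and `validate_column` are hypothetical, and the check is deliberately simplified to the WKB header only (byte-order marker plus an ISO geometry type code in the basic 1 to 7 range), whereas a real validator would also walk the coordinate payload.

```python
import struct


def check_wkb_header(buf: bytes) -> int:
    """Return the ISO WKB geometry type code, or raise ValueError.

    Simplified assumption: only the 5-byte header is inspected, and only
    the seven basic geometry types (with optional Z/M/ZM offsets) pass.
    """
    if len(buf) < 5:
        raise ValueError("WKB too short to contain a header")
    order = buf[0]
    if order == 0:
        fmt = ">I"  # big-endian (XDR)
    elif order == 1:
        fmt = "<I"  # little-endian (NDR)
    else:
        raise ValueError(f"invalid byte-order marker: {order}")
    (geom_type,) = struct.unpack_from(fmt, buf, 1)
    # ISO WKB encodes dimensionality as a multiple of 1000 (Z, M, ZM).
    dims, base = divmod(geom_type, 1000)
    if dims not in (0, 1, 2, 3) or not 1 <= base <= 7:
        raise ValueError(f"unsupported geometry type code: {geom_type}")
    return geom_type


def validate_column(values) -> None:
    """Validate every non-null entry; raise on the first invalid one."""
    for i, value in enumerate(values):
        if value is None:  # nulls are skipped, not treated as invalid
            continue
        try:
            check_wkb_header(value)
        except ValueError as exc:
            raise ValueError(f"invalid WKB at row {i}: {exc}") from exc


# A little-endian 2D point (1.0, 2.0) passes; arbitrary bytes do not.
point = struct.pack("<BIdd", 1, 1, 1.0, 2.0)
validate_column([point, None])
```

Raising on the first invalid entry mirrors the behavior described above, where any failing entry turns the whole read into an error instead of silently producing a malformed geometry column.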
