paleolimbot commented on issue #46270:
URL: https://github.com/apache/arrow/issues/46270#issuecomment-2914901238
> why would you prune a row group with empty column statistics?

In the case of
`... WHERE st_intersects(col_name, st_geomfromtext('POINT (0 1)'))`,
values of `col_name` that are null or EMPTY (a special geometry value that has
no coordinates) never evaluate to true. If `col_name` is null or EMPTY for
every row in a row group (e.g., a Parquet file of sales locations where the
data source didn't provide a location for some sales and expressed that as a
null or EMPTY), there is no point in scanning that row group.
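A minimal sketch of that pruning decision, assuming the reader has a per-row-group bounding box available. `RowGroupGeoStats` and `may_intersect_point` are hypothetical names made up for illustration, not part of the Arrow or Parquet API, and representing the "completely empty" state as `bbox=None` is an assumption of this sketch:

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class RowGroupGeoStats:
    # Hypothetical container, not the Arrow/Parquet API.
    # bbox is (xmin, ymin, xmax, ymax), or None when every value in the
    # row group was null or EMPTY, so no coordinate was ever observed.
    bbox: Optional[Tuple[float, float, float, float]]


def may_intersect_point(stats: Optional[RowGroupGeoStats], x: float, y: float) -> bool:
    """Return False only when the row group provably has no matching row."""
    if stats is None:
        # No statistics were written at all: nothing is known, so scan.
        return True
    if stats.bbox is None:
        # Statistics exist and say "no coordinates": null and EMPTY values
        # never satisfy st_intersects, so the row group can be skipped.
        return False
    xmin, ymin, xmax, ymax = stats.bbox
    return xmin <= x <= xmax and ymin <= y <= ymax


row_groups = [
    RowGroupGeoStats(bbox=(-1.0, 0.0, 2.0, 3.0)),    # contains POINT (0 1)
    RowGroupGeoStats(bbox=None),                      # all null or EMPTY
    RowGroupGeoStats(bbox=(10.0, 10.0, 20.0, 20.0)),  # far from the point
    None,                                             # statistics missing
]

# Indices of the row groups that still need to be scanned for POINT (0 1).
to_scan = [i for i, rg in enumerate(row_groups) if may_intersect_point(rg, 0.0, 1.0)]
```

The distinction that matters here is between statistics that are missing (must scan) and statistics that are present but empty (safe to skip).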
On the flip side, when writing a row group that contains only EMPTY or null
values, we have to write that to the Parquet metadata in a special way, so
`GeoStatistics` has to be able to express the "completely empty" state (so that
it can be detected when writing to Thrift).
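One way to picture the writer side is an accumulator that starts out "completely empty" and only leaves that state once a coordinate is seen. This is a sketch under assumptions: `GeoStatsSketch` is a hypothetical stand-in rather than the real `GeoStatistics` class, and the `{"bbox": None}` placeholder only marks where the special case would be handled; it is not the actual Thrift encoding.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class GeoStatsSketch:
    # Hypothetical stand-in for illustration, not the real GeoStatistics class.
    # Bounds start inverted (min = +inf, max = -inf) to mean "nothing seen yet".
    xmin: float = math.inf
    ymin: float = math.inf
    xmax: float = -math.inf
    ymax: float = -math.inf

    def update(self, coords: Iterable[Tuple[float, float]]) -> None:
        # A null or EMPTY geometry contributes no coordinates, so passing an
        # empty iterable leaves the bounds untouched.
        for x, y in coords:
            self.xmin = min(self.xmin, x)
            self.ymin = min(self.ymin, y)
            self.xmax = max(self.xmax, x)
            self.ymax = max(self.ymax, y)

    def is_empty(self) -> bool:
        # Bounds are still inverted only if no coordinate was ever added.
        return self.xmin > self.xmax

    def to_metadata(self) -> dict:
        # Placeholder for the serialization step (assumption of this sketch):
        # the empty state is detected here and written differently.
        if self.is_empty():
            return {"bbox": None}
        return {"bbox": (self.xmin, self.ymin, self.xmax, self.ymax)}


stats = GeoStatsSketch()
stats.update([])  # a row group holding only null or EMPTY values
empty_metadata = stats.to_metadata()

stats.update([(0.0, 1.0)])  # now a real coordinate arrives
filled_metadata = stats.to_metadata()
```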