paleolimbot commented on issue #46270:
URL: https://github.com/apache/arrow/issues/46270#issuecomment-2914262617

   Guaranteeing emptiness is important for two reasons:
   
   - When pruning row groups for a range query along the lines of 
`st_intersects(col_name, st_geomfromtext('POINT (0 1)'))`, truly empty column 
statistics indicate a row group that can be pruned. Statistics that were not 
provided by the Parquet metadata (or were provided but were invalid) indicate 
a row group that cannot be pruned, because it may still contain matching 
values.
   - After accumulating statistics during writing, `ToThrift()` needs to know 
if the bounds for a given dimension are completely empty because it has to 
serialize that in a specific way.
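   
   To illustrate the distinction in the two points above, here is a minimal 
sketch for a single dimension. `DimBounds`, `Decide()`, and `Demo()` are 
hypothetical names, not the actual Arrow/Parquet API, and the "min > max means 
empty" sentinel is an assumption made for this example only.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>

// Hypothetical per-dimension bounds; not the real GeoStatistics class.
struct DimBounds {
  // Assumed sentinel for this sketch: min > max means "no values seen".
  double min = std::numeric_limits<double>::infinity();
  double max = -std::numeric_limits<double>::infinity();
  // False when the metadata was missing or could not be interpreted.
  bool valid = true;

  // Widen the bounds to include one more value while writing.
  void Update(double v) {
    min = std::min(min, v);
    max = std::max(max, v);
  }

  // Empty is a positive statement: statistics exist and cover nothing.
  bool is_empty() const { return valid && min > max; }
};

enum class PruneDecision { kPrune, kKeep };

// Decide whether a row group can be skipped for the query range [qmin, qmax].
inline PruneDecision Decide(const DimBounds& b, double qmin, double qmax) {
  if (!b.valid) return PruneDecision::kKeep;       // unknown: must read it
  if (b.is_empty()) return PruneDecision::kPrune;  // known to hold nothing
  const bool overlaps = b.min <= qmax && qmin <= b.max;
  return overlaps ? PruneDecision::kKeep : PruneDecision::kPrune;
}

// Usage: empty and missing statistics lead to opposite decisions.
inline void Demo() {
  DimBounds empty;    // statistics present, no values accumulated
  DimBounds missing;  // statistics absent or invalid
  missing.valid = false;
  assert(Decide(empty, 0.0, 1.0) == PruneDecision::kPrune);
  assert(Decide(missing, 0.0, 1.0) == PruneDecision::kKeep);
}
```

   A writer-side serializer could check `is_empty()` per dimension before 
deciding how to encode the bounds. How `ToThrift()` actually encodes them is 
not shown here.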
   
   > The user would rather some property that guarantees that statistics are 
valid, I think. 
   
   I think the least confusing interface would be 
`GeoStatistics::IntersectsBox(window_xmin, window_ymin, window_xmax, 
window_ymax)`. I will take a stab at that and perhaps mark everything else as 
internal?
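
   A rough sketch of what such a method could look like, so the caller never 
has to reason about empty vs. invalid bounds directly. Only the method name 
and parameter names come from the proposal above. The `BoxStatsSketch` type, 
its `State` enum, and the method body are assumptions for illustration, and 
the sketch ignores any wraparound convention for X bounds.

```cpp
#include <cassert>

// Hypothetical stand-in for the statistics object; not the Arrow class.
struct BoxStatsSketch {
  enum class State { kMissingOrInvalid, kEmpty, kValid };

  State state = State::kMissingOrInvalid;
  double xmin = 0, ymin = 0, xmax = 0, ymax = 0;

  // Returns false only when the row group provably cannot intersect the
  // window; returns true whenever it might.
  bool IntersectsBox(double window_xmin, double window_ymin,
                     double window_xmax, double window_ymax) const {
    switch (state) {
      case State::kMissingOrInvalid:
        return true;   // nothing known: the row group cannot be pruned
      case State::kEmpty:
        return false;  // known to contain no coordinates: safe to prune
      case State::kValid:
        break;
    }
    // Standard closed-interval overlap test in X and Y.
    return xmin <= window_xmax && window_xmin <= xmax &&
           ymin <= window_ymax && window_ymin <= ymax;
  }
};

// Usage: the point window from POINT (0 1) against stats covering [10, 20].
inline bool DemoPointQuery() {
  BoxStatsSketch stats;
  stats.state = BoxStatsSketch::State::kValid;
  stats.xmin = 10; stats.ymin = 10; stats.xmax = 20; stats.ymax = 20;
  return stats.IntersectsBox(0, 1, 0, 1);
}
```

   With a single boolean answer, a scanner would only need to skip row groups 
for which the method returns false.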


