paleolimbot commented on PR #46649:
URL: https://github.com/apache/arrow/pull/46649#issuecomment-2931029109

   > Why was it done that way, if emptiness is a useful information to have?
   
   This was discussed in 
https://github.com/apache/parquet-format/pull/494, where the consensus was that 
checking the `null_count` for a column chunk against the number of rows in the 
row group would catch the most common case (the row group is all null). We then 
discovered that we don't currently write null counts for unsorted logical 
types, but hopefully we can fix that 
(https://github.com/apache/arrow/pull/46275).
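
   As a rough illustration of the check described above, here is a minimal 
sketch in Python. The function name `chunk_is_all_null` and its parameters are 
hypothetical and are not part of any Arrow or Parquet API; it assumes a flat 
(non-repeated) column, so that the number of values in the chunk equals the 
number of rows in the row group, and it treats a missing null count as 
"unknown" rather than guessing.

```python
from typing import Optional


def chunk_is_all_null(null_count: Optional[int], num_rows: int) -> Optional[bool]:
    """Report whether a column chunk holds only nulls.

    Hypothetical helper: `null_count` is the value read from the column
    chunk statistics (None if the writer did not record it) and `num_rows`
    is the row count of the enclosing row group.
    """
    if null_count is None:
        # No null count was written, so emptiness cannot be determined.
        return None
    # Every row is null exactly when the null count equals the row count.
    return null_count == num_rows
```

   A reader could treat a `True` result as "the geospatial statistics for this 
chunk are empty" and fall back to reading the data when the result is `None`.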
   
   > And is there a point in exposing emptiness in our geostats APIs? 
   
   We use the same API for producing and consuming GeoStatistics (this was 
modelled after the regular Statistics). We could change the write path to use 
only internals, although I am not sure this would be less confusing.


