wgtmac commented on PR #2971:
URL: https://github.com/apache/parquet-java/pull/2971#issuecomment-2797272927

   > I wasn't able to find a null count for a row group in statistics for all 
null values (or otherwise) because (at least in C++) the statistics aren't 
written because the sort order is unknown? 
   
   @paleolimbot You're right. I was talking about the `statistics` in the 
column chunk metadata, which are unfortunately disabled because the sort order 
is unknown. They work for other data types.
   
   > we can also clarify in the comments of the format that an omitted bbox 
(when GeospatialStatistics exists) occurs if-and-only-if there are no x or y 
values? (And also that omitted z and/or m statistics occur if-and-only-if there 
were no z and/or m values, respectively, which is true today in both Java and 
C++)
   
   I think we need to do this. Usually the spec should describe both the 
writer's and the reader's view. We could suggest that `writers` not produce a 
bbox if any x/y value is NaN and leave the z/m axis unset if it contains any 
NaN, and that `readers` ignore the (malformed) bbox if it contains any NaN value.
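
   As a rough illustration of the writer/reader rule suggested above, here is 
a minimal sketch. `BboxSketch`, `update`, `shouldWrite`, and `isUsable` are 
hypothetical names made up for this example; they are not the parquet-java or 
parquet-format API, and only the x/y axes are shown (z/m would presumably 
follow the same pattern per axis).

```java
// Hypothetical sketch only: not the actual parquet-java implementation.
final class BboxSketch {
    private double xMin = Double.POSITIVE_INFINITY;
    private double xMax = Double.NEGATIVE_INFINITY;
    private double yMin = Double.POSITIVE_INFINITY;
    private double yMax = Double.NEGATIVE_INFINITY;
    private boolean sawValue = false;
    private boolean sawNaN = false;

    // Writer side: accumulate one coordinate, remembering whether any NaN was seen.
    void update(double x, double y) {
        if (Double.isNaN(x) || Double.isNaN(y)) {
            sawNaN = true;
            return;
        }
        sawValue = true;
        xMin = Math.min(xMin, x);
        xMax = Math.max(xMax, x);
        yMin = Math.min(yMin, y);
        yMax = Math.max(yMax, y);
    }

    // Writer side: produce a bbox only if there were x/y values and none was NaN.
    boolean shouldWrite() {
        return sawValue && !sawNaN;
    }

    // Reader side: treat a bbox with any NaN bound as malformed and ignore it.
    static boolean isUsable(double xMin, double xMax, double yMin, double yMax) {
        return !(Double.isNaN(xMin) || Double.isNaN(xMax)
                || Double.isNaN(yMin) || Double.isNaN(yMax));
    }
}
```

   With this shape, an omitted bbox means either "no x/y values" or "some NaN 
was seen", so a reader cannot tell the two apart from the bbox alone; whether 
that is acceptable is part of what the spec wording would have to settle.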


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

