shuai-xu opened a new pull request, #2055: URL: https://github.com/apache/orc/pull/2055
### What changes were proposed in this pull request?

This PR fixes a bug in the C++ reader: if the column statistics in an ORC file are not fully written and lack the `hasNull` field, users may get wrong results when reading the file.

For example, take a file with schema struct<string col1, string col2> that has 10 rows, where every value in col1 is non-null and every value in col2 is null. The statistics written by Trino for col1 may be `numberOfValues: 10 stringStatistics { minimum: "10" maximum: "100" sum: 565 }`, and the statistics for col2 may be `numberOfValues: 0`. Neither contains the `hasNull` field. A query for the rows where col2 is null then returns nothing.

### Why are the changes needed?

Users may get wrong results because of this bug.

### How was this patch tested?

Unit tests were added.

### Was this patch authored or co-authored using generative AI tooling?

No
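To illustrate the failure mode described in this pull request, here is a minimal sketch of how a missing `hasNull` field can be handled conservatively during predicate evaluation. It is not the actual patch and does not use the ORC C++ API: `ColumnStats`, `mayContainNull`, and `canSkipForIsNull` are hypothetical names invented for this example, and treating an absent `hasNull` as "may contain nulls" is an assumption about one safe behavior, not a statement of what the fix does.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical, simplified stand-in for a column-statistics record.
// It is NOT the ORC protobuf message; it only models the two fields
// mentioned in the pull request description.
struct ColumnStats {
  uint64_t numberOfValues = 0;   // count of non-null values
  std::optional<bool> hasNull;   // empty when the writer omitted the field
};

// Assumption: when hasNull is absent we cannot prove the column has no
// nulls, so we report that it may contain them.
inline bool mayContainNull(const ColumnStats& stats) {
  return stats.hasNull.value_or(true);
}

// A unit of data can be skipped for "col IS NULL" only when the statistics
// prove that no null is present.
inline bool canSkipForIsNull(const ColumnStats& stats) {
  return !mayContainNull(stats);
}
```

With the col2 statistics from the example (`numberOfValues: 0`, no `hasNull`), `canSkipForIsNull` returns false, so the rows are still read and an `IS NULL` filter can match them. Defaulting the absent field to false instead would allow the data to be skipped, which matches the empty result reported above.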