gszadovszky commented on PR #196: URL: https://github.com/apache/parquet-format/pull/196#issuecomment-1491890773
Thank you, @JFinis, for working on this. This is not an easy topic. I am afraid that, for the sake of backward compatibility, we cannot avoid encoding NaN values into the column index min/max lists. There is no such thing as a "missing value" in these lists: we encode actual primitive values, and we need to store something there for each page. That is why we have `null_pages`: it flags whether the values encoded for the corresponding page are valid or not. The only way I can think of to stay backward compatible is to store NaN values in min/max; otherwise we confuse older readers.
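
To make the constraint concrete, here is a minimal Python sketch of the parallel-list layout described above. It is an illustration under stated assumptions, not the parquet-format definition or any reader's actual code: `ColumnIndexSketch`, `encode_double`, and `build_index` are hypothetical names, values are assumed to be little-endian doubles, and skipping NaN when a page also holds ordinary values is a choice made only for this example. The point is that every page occupies one slot in each list, so an all-NaN page still needs some value written for it.

```python
import math
import struct
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ColumnIndexSketch:
    """Hypothetical stand-in for a column index: one entry per page in every list."""
    null_pages: List[bool] = field(default_factory=list)
    min_values: List[bytes] = field(default_factory=list)
    max_values: List[bytes] = field(default_factory=list)


def encode_double(value: float) -> bytes:
    # Assumption for this sketch: doubles are stored as 8 little-endian bytes.
    return struct.pack("<d", value)


def build_index(pages: List[List[Optional[float]]]) -> ColumnIndexSketch:
    index = ColumnIndexSketch()
    for page in pages:
        non_null = [v for v in page if v is not None]
        if not non_null:
            # All-null page: null_pages marks the min/max slots as not valid,
            # but the slots themselves still have to be filled.
            index.null_pages.append(True)
            index.min_values.append(b"")
            index.max_values.append(b"")
            continue

        ordered = [v for v in non_null if not math.isnan(v)]
        if ordered:
            lo, hi = min(ordered), max(ordered)
        else:
            # Only NaN values: there is no way to leave the slot "missing",
            # so this sketch stores NaN itself.
            lo = hi = math.nan

        index.null_pages.append(False)
        index.min_values.append(encode_double(lo))
        index.max_values.append(encode_double(hi))
    return index


pages = [[1.0, 2.0], [None, None], [math.nan, math.nan]]
index = build_index(pages)
print(index.null_pages)
```

The three lists always have the same length as the number of pages, which is why dropping the entry for an all-NaN page is not an option without shifting every later page's statistics.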
