rdblue commented on a change in pull request #86: Fix row group dictionary
filter handling of null values.
URL: https://github.com/apache/incubator-iceberg/pull/86#discussion_r253943635
##########
File path:
parquet/src/main/java/com/netflix/iceberg/parquet/ParquetDictionaryRowGroupFilter.java
##########
@@ -282,7 +285,7 @@ public Boolean or(Boolean leftResult, Boolean rightResult)
{
}
Set<T> dictionary = dict(id, lit.comparator());
- if (dictionary.size() > 1) {
+ if (dictionary.size() > 1 || mayContainNulls.get(id)) {
Review comment:
This only runs if the check above doesn't return. That check guarantees
that all pages are dictionary-encoded, so there must be a dictionary. I think it
may be better to throw an exception in `dict` if there is no dictionary, instead
of handling dictionary-encoded pages with no dictionary here.
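
A minimal sketch of that suggestion, not the actual Iceberg code: the class
`DictionaryLookupSketch`, its `rawDictionaries` map, the `addDictionary` helper,
and the exception message are all hypothetical names used only for illustration.
The idea is that `dict` fails fast when no dictionary is available, so callers
never need a special case for a missing or empty dictionary.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for the row group filter's visitor; not the real class.
class DictionaryLookupSketch {
  // Assumption: decoded dictionary values keyed by column id.
  // A missing key models "no dictionary page could be read for this column".
  private final Map<Integer, Object[]> rawDictionaries = new HashMap<>();

  // Illustrative helper for registering a column's dictionary values.
  void addDictionary(int id, Object... values) {
    rawDictionaries.put(id, values);
  }

  @SuppressWarnings("unchecked")
  <T> Set<T> dict(int id, Comparator<T> comparator) {
    Object[] raw = rawDictionaries.get(id);
    if (raw == null) {
      // All pages are known to be dictionary-encoded at this point, so a
      // missing dictionary is a broken invariant rather than a normal case.
      throw new IllegalStateException("No dictionary found for column id: " + id);
    }

    // Collect the distinct values, ordered by the literal's comparator.
    Set<T> values = new TreeSet<>(comparator);
    for (Object value : raw) {
      values.add((T) value);
    }
    return values;
  }
}
```

With a lookup like this, the calling code can rely on `dict` either returning
the dictionary values or failing loudly.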