Hi everyone,

The Spark community caught a correctness bug in Parquet, tracked as PARQUET-1510 <https://issues.apache.org/jira/browse/PARQUET-1510> and SPARK-26677 <https://issues.apache.org/jira/browse/SPARK-26677>: the dictionary filter was ignoring null values and incorrectly skipping row groups, which silently drops rows that should be returned.
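To make the failure mode concrete, here is a minimal sketch of the kind of predicate that hits it (the column name and value are made up). Nulls never appear in a dictionary, so for a row group whose dictionary holds only "x" but whose pages also contain nulls, the filter concluded nothing could match and dropped the whole row group, losing the null rows that satisfy the predicate:

  import static org.apache.parquet.filter2.predicate.FilterApi.binaryColumn;
  import static org.apache.parquet.filter2.predicate.FilterApi.notEq;

  import org.apache.parquet.filter2.predicate.FilterPredicate;
  import org.apache.parquet.io.api.Binary;

  public class NotEqNullExample {
    public static void main(String[] args) {
      // Hypothetical column "c": in a row group where the dictionary contains
      // only "x" but the pages also contain nulls, the buggy dictionary filter
      // decided every value was "x" and dropped the row group, discarding the
      // null rows that match c != "x" under Parquet's filter semantics.
      FilterPredicate pred = notEq(binaryColumn("c"), Binary.fromString("x"));
      System.out.println(pred);
    }
  }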
Spark is considering disabling Parquet dictionary filters, but PARQUET-1309 <https://issues.apache.org/jira/browse/PARQUET-1309> gets in the way because the stats and dictionary filter config properties are read swapped (see the sketch at the end of this mail). And it is a bad idea to disable filtering for all of Parquet because of a bug like this. (I've also suggested a work-around that I think is more likely.)

Since this is a correctness bug, and Spark can't update to 1.11.0 in a patch release even if that Parquet release were already finished, I think we should create a 1.10.1 release. I would include the fixes for PARQUET-1309 and PARQUET-1510.

Is everyone okay with me creating a release candidate for 1.10.1? If so, are there any other bugs that should be fixed in 1.10.1?
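To illustrate why the swapped properties matter, here is a rough sketch of turning the row group filters off through the Hadoop configuration. This is only an illustration, not the work-around mentioned above, and whether a particular engine honors these keys depends on how it builds its read options:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.parquet.hadoop.ParquetInputFormat;

  public class DisableDictionaryFilteringSketch {
    public static void main(String[] args) {
      Configuration conf = new Configuration();
      // With the two keys read swapped on an unpatched 1.10.0 (PARQUET-1309),
      // flipping only the dictionary key actually toggles the stats filter.
      // Setting both to false is the blunt but unambiguous option, at the cost
      // of also losing stats-based row group filtering.
      conf.setBoolean(ParquetInputFormat.DICTIONARY_FILTERING_ENABLED, false);
      conf.setBoolean(ParquetInputFormat.STATS_FILTERING_ENABLED, false);
    }
  }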
rb

--
Ryan Blue
Software Engineer
Netflix