Hello Impala Public Jenkins,
I'd like you to do a code review. Please visit
http://gerrit.cloudera.org:8080/18888
to review the following change.
Change subject: IMPALA-11345: Parquet Bloom filtering failure if column is
added to the schema
......................................................................
IMPALA-11345: Parquet Bloom filtering failure if column is added to the
schema
If a new column was added to an existing table with existing data and
Parquet Bloom filtering was turned ON, queries having an equality
conjunct on the new column failed.
This was because the old Parquet data files did not have the new column
in their schema, so the scanner could not find a column for the conjunct
in those files. This was treated as an error and the query failed.
After this patch, this situation is no longer treated as an error and the
conjunct is simply disregarded for Bloom filtering in the files that
lack the new column.
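The new behaviour can be illustrated with a minimal Python sketch. This
is not the code in be/src/exec/parquet/hdfs-parquet-scanner.cc; the names
EqConjunct and conjuncts_usable_for_bloom_filtering are hypothetical, and
the file schema is modelled as a plain mapping from column name to column
index purely for illustration.

```python
from typing import Dict, List, NamedTuple, Tuple


class EqConjunct(NamedTuple):
    """Hypothetical stand-in for an equality conjunct: `column = value`."""
    column: str
    value: object


def conjuncts_usable_for_bloom_filtering(
    file_schema: Dict[str, int],
    conjuncts: List[EqConjunct],
) -> List[Tuple[int, EqConjunct]]:
    """Pair each conjunct with its column index in this particular file.

    A conjunct whose column is absent from the file schema (for example,
    a column added to the table after the file was written) is skipped
    for Bloom filtering instead of being reported as an error.
    """
    usable = []
    for conjunct in conjuncts:
        col_idx = file_schema.get(conjunct.column)
        if col_idx is None:
            # The file predates the column: disregard the conjunct here.
            continue
        usable.append((col_idx, conjunct))
    return usable


# An old data file that was written before `new_col` was added.
old_file_schema = {"id": 0, "name": 1}
query_conjuncts = [EqConjunct("id", 5), EqConjunct("new_col", 7)]
usable = conjuncts_usable_for_bloom_filtering(old_file_schema, query_conjuncts)
```

In this sketch only the conjunct on `id` is kept for the old file; the
conjunct on `new_col` is dropped for that file alone and would still be
used for files whose schema contains the column.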
Testing:
- added the test
TestParquetBloomFilter::test_parquet_bloom_filtering_schema_change in
tests/query_test/test_parquet_bloom_filter.py that checks that a
query as described above does not fail.
Change-Id: Ief3e6b6358d3dff3abe5beeda752033a7e8e16a6
Reviewed-on: http://gerrit.cloudera.org:8080/18779
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
---
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M tests/query_test/test_parquet_bloom_filter.py
2 files changed, 87 insertions(+), 6 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/18888/1
--
To view, visit http://gerrit.cloudera.org:8080/18888
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: branch-4.1.1
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ief3e6b6358d3dff3abe5beeda752033a7e8e16a6
Gerrit-Change-Number: 18888
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>