[ https://issues.apache.org/jira/browse/HIVE-11033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14591441#comment-14591441 ]
Prasanth Jayachandran commented on HIVE-11033:
----------------------------------------------
Test failures are not related.
> BloomFilter index is not honored by ORC reader
> ----------------------------------------------
>
> Key: HIVE-11033
> URL: https://issues.apache.org/jira/browse/HIVE-11033
> Project: Hive
> Issue Type: Bug
> Affects Versions: 1.2.0
> Reporter: Allan Yan
> Assignee: Prasanth Jayachandran
> Attachments: HIVE-11033.2.patch, HIVE-11033.patch
>
>
> There is a bug in the org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl class
> that causes the bloom filter index saved in the ORC file to go unused. The
> root cause is that the bloomFilterIndices variable defined in the SargApplier
> class supersedes the one defined in its parent class. Therefore, in
> RecordReaderImpl.pickRowGroups():
> {code}
> protected boolean[] pickRowGroups() throws IOException {
>   // if we don't have a sarg or indexes, we read everything
>   if (sargApp == null) {
>     return null;
>   }
>   readRowIndex(currentStripe, included, sargApp.sargColumns);
>   return sargApp.pickRowGroups(stripes.get(currentStripe), indexes);
> }
> {code}
> The bloomFilterIndices array populated by readRowIndex() is not picked up by
> the sargApp object, which consults only its own, never-populated array. One
> solution is to make SargApplier.bloomFilterIndices a reference to its parent
> counterpart.
> {noformat}
> 18:46 $ diff src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java.original
> 174d173
> < bloomFilterIndices = new OrcProto.BloomFilterIndex[types.size()];
> 178c177
> < sarg, options.getColumnNames(), strideRate, types, included.length, bloomFilterIndices);
> ---
> > sarg, options.getColumnNames(), strideRate, types, included.length);
> 204a204
> > bloomFilterIndices = new OrcProto.BloomFilterIndex[types.size()];
> 673c673
> < List<OrcProto.Type> types, int includedCount, OrcProto.BloomFilterIndex[] bloomFilterIndices) {
> ---
> > List<OrcProto.Type> types, int includedCount) {
> 677c677
> < this.bloomFilterIndices = bloomFilterIndices;
> ---
> > bloomFilterIndices = new OrcProto.BloomFilterIndex[types.size()];
> {noformat}
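
The effect described above can be reproduced outside Hive with a small, self-contained sketch. Everything here is hypothetical and only for illustration: the class names (BloomIndexSharingSketch, Reader, PrivateCopyApplier, SharedArrayApplier), the seesIndexFor() method, and the use of String as a stand-in for OrcProto.BloomFilterIndex are not Hive code. It contrasts a helper that allocates its own array in its constructor with one that receives the reader's array, which is roughly the idea behind the proposed diff.

```java
public class BloomIndexSharingSketch {

    /** Mimics the reported bug: the helper allocates its own array,
     *  so it never sees what the reader loads later. */
    static final class PrivateCopyApplier {
        private final String[] bloomFilterIndices;

        PrivateCopyApplier(int columnCount) {
            this.bloomFilterIndices = new String[columnCount];
        }

        boolean seesIndexFor(int column) {
            return bloomFilterIndices[column] != null;
        }
    }

    /** Mimics the proposed fix: the helper keeps a reference to the
     *  array owned by the reader instead of allocating a new one. */
    static final class SharedArrayApplier {
        private final String[] bloomFilterIndices;

        SharedArrayApplier(String[] sharedIndices) {
            this.bloomFilterIndices = sharedIndices;
        }

        boolean seesIndexFor(int column) {
            return bloomFilterIndices[column] != null;
        }
    }

    /** Hypothetical stand-in for the record reader. */
    static final class Reader {
        final String[] bloomFilterIndices;
        final PrivateCopyApplier buggy;
        final SharedArrayApplier fixed;

        Reader(int columnCount) {
            // The array must exist before it is handed to the helper,
            // which is why the diff moves the allocation earlier.
            bloomFilterIndices = new String[columnCount];
            buggy = new PrivateCopyApplier(columnCount);
            fixed = new SharedArrayApplier(bloomFilterIndices);
        }

        /** Stand-in for readRowIndex(): fills the reader's own array. */
        void readRowIndex(int column) {
            bloomFilterIndices[column] = "bloom-filter-for-column-" + column;
        }
    }

    public static void main(String[] args) {
        Reader reader = new Reader(3);
        reader.readRowIndex(1);
        System.out.println("private copy sees index: " + reader.buggy.seesIndexFor(1)); // false
        System.out.println("shared array sees index: " + reader.fixed.seesIndexFor(1)); // true
    }
}
```

Because Java arrays are passed by reference, the second helper observes every entry the reader writes after construction, with no extra synchronization step between the two objects.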