[jira] [Commented] (HIVE-11033) BloomFilter index is not honored by ORC reader

Hive QA (JIRA) Wed, 17 Jun 2015 11:26:45 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-11033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590261#comment-14590261
 ]


Hive QA commented on HIVE-11033:
--------------------------------



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12740053/HIVE-11033.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9008 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_corr
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join28
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4287/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4287/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4287/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12740053 - PreCommit-HIVE-TRUNK-Build

> BloomFilter index is not honored by ORC reader
> ----------------------------------------------
>
>                 Key: HIVE-11033
>                 URL: https://issues.apache.org/jira/browse/HIVE-11033
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.2.0
>            Reporter: Allan Yan
>            Assignee: Prasanth Jayachandran
>         Attachments: HIVE-11033.patch
>
>
> There is a bug in the org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl class 
> which caused the bloom filter index saved in the ORC file not being used. The 
> root cause is the bloomFilterIndices variable defined in the SargApplier 
> class superseded the one defined in its parent class. Therefore, in the 
> ReaderImpl.pickRowGroups()
> {code}
>   protected boolean[] pickRowGroups() throws IOException {
>     // if we don't have a sarg or indexes, we read everything
>     if (sargApp == null) {
>       return null;
>     }
>     readRowIndex(currentStripe, included, sargApp.sargColumns);
>     return sargApp.pickRowGroups(stripes.get(currentStripe), indexes);
>   }
> {code}
> The bloomFilterIndices populated by readRowIndex() is not picked up by 
> sargApp object. One solution is to make SargApplier.bloomFilterIndices a 
> reference to its parent counterpart.
> {noformat}
> 18:46 $ diff src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java 
> src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java.original
> 174d173
> <     bloomFilterIndices = new OrcProto.BloomFilterIndex[types.size()];
> 178c177
> <           sarg, options.getColumnNames(), strideRate, types, 
> included.length, bloomFilterIndices);
> ---
> >           sarg, options.getColumnNames(), strideRate, types, 
> > included.length);
> 204a204
> >     bloomFilterIndices = new OrcProto.BloomFilterIndex[types.size()];
> 673c673
> <         List<OrcProto.Type> types, int includedCount, 
> OrcProto.BloomFilterIndex[] bloomFilterIndices) {
> ---
> >         List<OrcProto.Type> types, int includedCount) {
> 677c677
> <       this.bloomFilterIndices = bloomFilterIndices;
> ---
> >       bloomFilterIndices = new OrcProto.BloomFilterIndex[types.size()];
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-11033) BloomFilter index is not honored by ORC reader

Reply via email to