[jira] [Commented] (PARQUET-1061) parquet dictionary filter does not work.

Junjie Chen (JIRA) Tue, 18 Jul 2017 17:41:42 -0700

    [ https://issues.apache.org/jira/browse/PARQUET-1061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16092409#comment-16092409 ]


Junjie Chen commented on PARQUET-1061:
--------------------------------------

Yes, I already set parquet.filter.dictionary.enabled to true.

ParquetFileReader#filterRowGroups is called in 
ParquetRecordReader#initializeInternalReader. However, within 
initializeInternalReader, filterRowGroups is called only when 
(*rowGroupOffsets == null*). If rowGroupOffsets == null means there is no row 
group in the split (am I right?), then the call is in the wrong place.
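
To make the branch behaviour concrete, here is a minimal, self-contained 
sketch of the control flow in the snippet quoted below. It is not parquet-mr 
code: the class RowGroupFilterSketch, the method selectRowGroups, the string 
row-group names, and the predicate are all hypothetical stand-ins, and the 
real reader filters in place rather than returning a list. It only shows that 
the filter is skipped whenever rowGroupOffsets is non-null.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model of the branch in initializeInternalReader; not parquet-mr code.
public class RowGroupFilterSketch {

    // Returns the row groups that would be read for the given split.
    static List<String> selectRowGroups(List<String> blocks,
                                        long[] rowGroupOffsets,
                                        Predicate<String> filter) {
        if (rowGroupOffsets != null) {
            // verify a row group was found for each offset
            if (blocks.size() != rowGroupOffsets.length) {
                throw new IllegalStateException(
                    "All of the offsets in the split should be found in the file.");
            }
            // no data filter is applied on this path
            return blocks;
        } else {
            // apply data filters (stand-in for reader.filterRowGroups(...))
            List<String> kept = new ArrayList<>();
            for (String block : blocks) {
                if (filter.test(block)) {
                    kept.add(block);
                }
            }
            return kept;
        }
    }

    public static void main(String[] args) {
        List<String> blocks = Arrays.asList("rg0-match", "rg1-skip");
        Predicate<String> filter = b -> b.endsWith("match");

        // Offsets present: both row groups are kept, the filter never runs.
        System.out.println(selectRowGroups(blocks, new long[] {4L, 1024L}, filter));
        // Offsets absent: only the matching row group is kept.
        System.out.println(selectRowGroups(blocks, null, filter));
    }
}
```

In this model, a split that carries offsets always reads every listed row 
group, which matches the behaviour I observed.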

> parquet dictionary filter does not work.
> ----------------------------------------
>
>                 Key: PARQUET-1061
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1061
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-mr
>    Affects Versions: 1.9.0
>         Environment: Hive 2.2.0 + Parquet-mr 1.9.0/master
>            Reporter: Junjie Chen
>
> When performing a selective query, we observed that the dictionary filter 
> was not applied.  Please see the following code snippet. 
>     if (rowGroupOffsets != null) {
>       // verify a row group was found for each offset
>       List<BlockMetaData> blocks = reader.getFooter().getBlocks();
>       if (blocks.size() != rowGroupOffsets.length) {
>         throw new IllegalStateException(
>             "All of the offsets in the split should be found in the file."
>             + " expected: " + Arrays.toString(rowGroupOffsets)
>             + " found: " + blocks);
>       }
>     } else {
> *Why is the data filter applied only when rowGroupOffsets is null?*
>       // apply data filters
>       reader.filterRowGroups(getFilter(configuration));
>     }
> The filter takes effect after I move the code from the else block into the 
> second-level if. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
