[jira] [Commented] (DRILL-5795) Filter pushdown for parquet handles multi rowgroup file

ASF GitHub Bot (JIRA) Tue, 19 Sep 2017 16:48:26 -0700

    [ 
https://issues.apache.org/jira/browse/DRILL-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172509#comment-16172509
 ]


ASF GitHub Bot commented on DRILL-5795:
---------------------------------------

Github user dprofeta commented on the issue:

    https://github.com/apache/drill/pull/949
  
    @parthchandra Can you please review?


> Filter pushdown for parquet handles multi rowgroup file
> -------------------------------------------------------
>
>                 Key: DRILL-5795
>                 URL: https://issues.apache.org/jira/browse/DRILL-5795
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - Parquet
>            Reporter: Damien Profeta
>
> DRILL-1950 implemented filter pushdown for Parquet files, but only for the 
> case of one row group per file. When a file contains multiple row groups, 
> Drill detects that a row group can be pruned but then tells the drillbit to 
> read the whole file, which leads to a performance issue.
> Having multiple row groups per file helps to handle partitioned datasets 
> while still reading only the relevant subset of data, without ending up with 
> more files than really needed.
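
To illustrate the behaviour the description asks for, here is a minimal sketch of per-row-group pruning based on min/max column statistics. It is not Drill's implementation: `RowGroup`, `may_match`, and `prune_row_groups` are hypothetical names, and the statistics are made-up values for one integer column.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RowGroup:
    """Hypothetical stand-in for one row group's statistics on a single column."""
    index: int
    min_value: int
    max_value: int


def may_match(rg: RowGroup, lo: int, hi: int) -> bool:
    """Return True if the row group's [min, max] range overlaps the filter range [lo, hi]."""
    return rg.max_value >= lo and rg.min_value <= hi


def prune_row_groups(row_groups: List[RowGroup], lo: int, hi: int) -> List[int]:
    """Return the indexes of the row groups that still need to be read."""
    return [rg.index for rg in row_groups if may_match(rg, lo, hi)]


# One file with three row groups (illustrative statistics only).
row_groups = [
    RowGroup(index=0, min_value=1, max_value=100),
    RowGroup(index=1, min_value=101, max_value=200),
    RowGroup(index=2, min_value=201, max_value=300),
]

# Filter: column BETWEEN 150 AND 180. Only row group 1 can contain matches.
selected = prune_row_groups(row_groups, 150, 180)
print(selected)
```

The point of the sketch is that the scan is planned from the list of surviving row groups rather than from the file as a whole, so a file with a single matching row group does not cause every row group in it to be read.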



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
