[GitHub] drill issue #949: DRILL-5795: Parquet Filter push down at rowgroup level
Github user dprofeta commented on the issue: https://github.com/apache/drill/pull/949 Indeed, the plan now carries an extra piece of information: the number of row groups scanned. I think the plan check should be fixed to verify numFiles and usedMetadataFile separately, without expecting them to be next to each other. ---
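To illustrate the suggestion above, here is a minimal sketch of checking each attribute on its own instead of matching one adjacent pattern. The class name `PlanPatternCheck` and the method `containsAll` are hypothetical and are not part of Drill's test framework; the plan fragment is copied from the failure reported in this thread.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Drill or its test framework.
class PlanPatternCheck {

    // Returns true only if every regex is found somewhere in the plan text,
    // regardless of the order or distance between the matches.
    static boolean containsAll(String plan, String... regexes) {
        for (String regex : regexes) {
            if (!Pattern.compile(regex).matcher(plan).find()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Fragment of the actual plan reported in this thread.
        String plan = "numFiles=2, numRowGroups=2, usedMetadataFile=true";

        // Old baseline: expects the two attributes to be adjacent.
        boolean adjacent = Pattern.compile(".*numFiles=2, usedMetadataFile=true.*")
                .matcher(plan).find();
        System.out.println(adjacent); // prints false

        // Separate checks: tolerate attributes inserted in between.
        System.out.println(containsAll(plan, "numFiles=2", "usedMetadataFile=true")); // prints true
    }
}
```

Checking the attributes independently keeps the baseline stable if further fields are added to the group scan description later.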
[GitHub] drill issue #949: DRILL-5795: Parquet Filter push down at rowgroup level
Github user cchang738 commented on the issue: https://github.com/apache/drill/pull/949 There is a plan verification failure due to a plan change. The plan baseline needs to be updated after this PR is merged.

Plan Verification Failures:
/root/drillAutomation/mapr/framework/resources/Functional/int96/q28.q
Query: explain plan for select voter_id, name from `hive1_parquet_part` where date_part('year', create_timestamp1)=2018
Expected and actual text plans are different.
Expected: .*numFiles=2, usedMetadataFile=true.*
Actual:
00-00    Screen
00-01      Project(voter_id=[$0], name=[$1])
00-02        Project(voter_id=[$1], name=[$2])
00-03          Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/drill/testdata/subqueries/hive1_parquet_part/0_0_10.parquet], ReadEntryWithPath [path=/drill/testdata/subqueries/hive1_parquet_part/0_0_9.parquet]], selectionRoot=/drill/testdata/subqueries/hive1_parquet_part, numFiles=2, numRowGroups=2, usedMetadataFile=true, cacheFileRoot=/drill/testdata/subqueries/hive1_parquet_part, columns=[`create_timestamp1`, `voter_id`, `name`]]])
---
[GitHub] drill issue #949: DRILL-5795: Parquet Filter push down at rowgroup level
Github user priteshm commented on the issue: https://github.com/apache/drill/pull/949 @paul-rogers, @kkhatua can you provide some more information on the test case that failed? Hopefully, @dprofeta can replicate it in his environment. ---
[GitHub] drill issue #949: DRILL-5795: Parquet Filter push down at rowgroup level
Github user paul-rogers commented on the issue: https://github.com/apache/drill/pull/949 This change causes one of our functional tests to fail. We will have to track down the issue and either update the test, or post the problem here. ---
[GitHub] drill issue #949: DRILL-5795: Parquet Filter push down at rowgroup level
Github user dprofeta commented on the issue: https://github.com/apache/drill/pull/949 @paul-rogers done ---
[GitHub] drill issue #949: DRILL-5795: Parquet Filter push down at rowgroup level
Github user paul-rogers commented on the issue: https://github.com/apache/drill/pull/949 Please resolve commits and rebase onto master. ---
[GitHub] drill issue #949: DRILL-5795: Parquet Filter push down at rowgroup level
Github user dprofeta commented on the issue: https://github.com/apache/drill/pull/949 I have updated the patch with a unit test and fixed an issue that occurred when everything is filtered out. ---
[GitHub] drill issue #949: DRILL-5795: Parquet Filter push down at rowgroup level
Github user dprofeta commented on the issue: https://github.com/apache/drill/pull/949 I will add a unit test that checks the number of row groups scanned by the group scan, to verify that the filter actually prunes row groups. ---
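As a rough sketch of what such a check could look like, the snippet below pulls the `numRowGroups` value out of a text plan so a test can compare it with the expected count. The class `RowGroupCount` and its method are hypothetical names introduced for illustration, and the code assumes the plan prints the attribute as `numRowGroups=<n>`, as in the plan quoted elsewhere in this thread; it is not the unit test from the patch.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not the unit test added by the patch.
class RowGroupCount {

    // Assumes the plan text renders the attribute as "numRowGroups=<digits>".
    private static final Pattern NUM_ROW_GROUPS = Pattern.compile("numRowGroups=(\\d+)");

    // Returns the first numRowGroups value found in the plan, or empty if absent.
    static OptionalInt numRowGroups(String plan) {
        Matcher m = NUM_ROW_GROUPS.matcher(plan);
        if (m.find()) {
            return OptionalInt.of(Integer.parseInt(m.group(1)));
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        String plan = "numFiles=2, numRowGroups=2, usedMetadataFile=true";
        System.out.println(numRowGroups(plan).getAsInt()); // prints 2
    }
}
```

A test built on this idea would run the same query with and without a selective filter and assert that the filtered plan reports fewer row groups.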
[GitHub] drill issue #949: DRILL-5795: Parquet Filter push down at rowgroup level
Github user dprofeta commented on the issue: https://github.com/apache/drill/pull/949 @parthchandra Can you please review? ---