[GitHub] drill issue #949: DRILL-5795: Parquet Filter push down at rowgroup level

2017-10-24 Thread dprofeta
Github user dprofeta commented on the issue:

https://github.com/apache/drill/pull/949
  
Indeed, the plan now contains an additional piece of information: the number 
of row groups scanned. I think the plan check should be fixed by checking 
numFiles and usedMetadataFile separately, without expecting them to be next 
to each other.
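
As a rough illustration of that suggestion, the sketch below runs the old 
baseline pattern and two separate patterns against the scan attributes from 
the plan quoted in this thread. It is not the test framework's actual 
verification code; the helper name plan_matches_all is hypothetical, and the 
plan fragment has been joined onto one line for readability.

```python
import re

# Scan attributes taken from the plan quoted in this thread, joined onto one line.
plan = (
    "selectionRoot=/drill/testdata/subqueries/hive1_parquet_part, numFiles=2, "
    "numRowGroups=2, usedMetadataFile=true, "
    "cacheFileRoot=/drill/testdata/subqueries/hive1_parquet_part"
)

# The existing baseline expects the two attributes to be adjacent.
old_baseline = r".*numFiles=2, usedMetadataFile=true.*"


def plan_matches_all(plan_text, patterns):
    """Hypothetical helper: True if every pattern occurs somewhere in the plan."""
    return all(re.search(p, plan_text, re.DOTALL) for p in patterns)


# numRowGroups=2 now sits between the two attributes, so the old pattern fails.
old_ok = re.search(old_baseline, plan, re.DOTALL) is not None

# Checking each attribute on its own does not depend on their order or adjacency.
new_ok = plan_matches_all(plan, [r"numFiles=2\b", r"usedMetadataFile=true\b"])

print("old baseline matches:", old_ok)
print("separate checks match:", new_ok)
```

With separate patterns, the baseline would keep passing if further attributes 
were added between numFiles and usedMetadataFile later on.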


---


[GitHub] drill issue #949: DRILL-5795: Parquet Filter push down at rowgroup level

2017-10-23 Thread cchang738
Github user cchang738 commented on the issue:

https://github.com/apache/drill/pull/949
  
There is a plan verification failure caused by a plan change. The plan 
baseline needs to be updated after this PR is merged.

Plan Verification Failures:
/root/drillAutomation/mapr/framework/resources/Functional/int96/q28.q
Query: 
explain plan for select voter_id, name from `hive1_parquet_part` where 
date_part('year', create_timestamp1)=2018

Expected and actual text plans are different.
Expected:
.*numFiles=2, usedMetadataFile=true.*

Actual:
00-00    Screen
00-01      Project(voter_id=[$0], name=[$1])
00-02        Project(voter_id=[$1], name=[$2])
00-03          Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath 
[path=/drill/testdata/subqueries/hive1_parquet_part/0_0_10.parquet], 
ReadEntryWithPath 
[path=/drill/testdata/subqueries/hive1_parquet_part/0_0_9.parquet]], 
selectionRoot=/drill/testdata/subqueries/hive1_parquet_part, numFiles=2, 
numRowGroups=2, usedMetadataFile=true, 
cacheFileRoot=/drill/testdata/subqueries/hive1_parquet_part, 
columns=[`create_timestamp1`, `voter_id`, `name`]]])


---


[GitHub] drill issue #949: DRILL-5795: Parquet Filter push down at rowgroup level

2017-10-16 Thread priteshm
Github user priteshm commented on the issue:

https://github.com/apache/drill/pull/949
  
@paul-rogers, @kkhatua can you provide some more information on the test 
case that failed? Hopefully, @dprofeta can replicate it in his environment.


---


[GitHub] drill issue #949: DRILL-5795: Parquet Filter push down at rowgroup level

2017-10-09 Thread paul-rogers
Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/949
  
This change causes one of our functional tests to fail. We will have to 
track down the issue and either update the test, or post the problem here.


---


[GitHub] drill issue #949: DRILL-5795: Parquet Filter push down at rowgroup level

2017-10-02 Thread dprofeta
Github user dprofeta commented on the issue:

https://github.com/apache/drill/pull/949
  
@paul-rogers done


---


[GitHub] drill issue #949: DRILL-5795: Parquet Filter push down at rowgroup level

2017-09-30 Thread paul-rogers
Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/949
  
Please resolve commits and rebase onto master.


---


[GitHub] drill issue #949: DRILL-5795: Parquet Filter push down at rowgroup level

2017-09-21 Thread dprofeta
Github user dprofeta commented on the issue:

https://github.com/apache/drill/pull/949
  
I have updated the patch with a unit test and fixed an issue that occurred 
when everything is filtered out.


---


[GitHub] drill issue #949: DRILL-5795: Parquet Filter push down at rowgroup level

2017-09-20 Thread dprofeta
Github user dprofeta commented on the issue:

https://github.com/apache/drill/pull/949
  
I will add a unit test that checks the number of row groups scanned by the 
group scan, to verify that the filter actually prunes row groups.
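
One possible way to make such a check from a plan's text is sketched below. 
It relies on the numRowGroups attribute that appears in the plan quoted 
elsewhere in this thread; the function name scanned_row_groups is 
hypothetical, and this is not the unit test added to the patch, which is not 
shown in this thread.

```python
import re


def scanned_row_groups(plan_text):
    """Hypothetical helper: return the numRowGroups value in a plan, or None."""
    match = re.search(r"numRowGroups=(\d+)", plan_text)
    return int(match.group(1)) if match else None


# Fragment of the scan attributes from the plan quoted in this thread.
plan_fragment = "numFiles=2, numRowGroups=2, usedMetadataFile=true"

# A pruning test could compare this count with the total number of row groups
# in the table and expect fewer to be scanned when the filter applies.
print(scanned_row_groups(plan_fragment))
```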


---


[GitHub] drill issue #949: DRILL-5795: Parquet Filter push down at rowgroup level

2017-09-19 Thread dprofeta
Github user dprofeta commented on the issue:

https://github.com/apache/drill/pull/949
  
@parthchandra Can you please review?


---