[GitHub] drill issue #949: DRILL-5795: Parquet Filter push down at rowgroup level

2017-10-24 Thread dprofeta
Github user dprofeta commented on the issue: https://github.com/apache/drill/pull/949 Indeed, the plan now contains an additional piece of information: the number of rowgroups scanned. I think the plan check should be fixed by checking numFiles and usedMetadataFile separately, without
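The idea of checking the two plan attributes separately can be sketched as follows. This is only an illustration: the class name `PlanCheck`, the helper `containsAll`, and the plan string used in the usage note are hypothetical and are not taken from Drill's test framework.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Drill's test utilities.
final class PlanCheck {
    // Returns true when every regex is found somewhere in the plan text.
    // Each pattern is searched independently, so extra attributes that
    // appear between two expected ones do not break the check.
    static boolean containsAll(String plan, String... regexes) {
        for (String regex : regexes) {
            if (!Pattern.compile(regex).matcher(plan).find()) {
                return false;
            }
        }
        return true;
    }
}
```

With an assumed plan fragment such as `numFiles=1, numRowGroups=2, usedMetadataFile=false`, calling `PlanCheck.containsAll(plan, "numFiles=1", "usedMetadataFile=false")` succeeds, whereas a single pattern `"numFiles=1, usedMetadataFile=false"` no longer matches once another attribute sits between the two.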

[GitHub] drill issue #976: DRILL-5797: Choose parquet reader from read columns

2017-10-17 Thread dprofeta
Github user dprofeta commented on the issue: https://github.com/apache/drill/pull/976 I updated the javadoc with Paul's remarks.

[GitHub] drill issue #976: DRILL-5797: Choose parquet reader from read columns

2017-10-16 Thread dprofeta
Github user dprofeta commented on the issue: https://github.com/apache/drill/pull/976 Here is the updated PR. Yes, I also wanted to add groups without repetition. It is only a matter of naming, so it should not be hard, but when I tested it, the fast reader was not able to handle it.

[GitHub] drill pull request #976: DRILL-5797: Choose parquet reader from read columns

2017-10-09 Thread dprofeta
Github user dprofeta commented on a diff in the pull request: https://github.com/apache/drill/pull/976#discussion_r143403559 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetScanBatchCreator.java --- @@ -156,18 +160,39 @@ public ScanBatch getBatch

[GitHub] drill pull request #976: DRILL-5797: Choose parquet reader from read columns

2017-10-09 Thread dprofeta
Github user dprofeta commented on a diff in the pull request: https://github.com/apache/drill/pull/976#discussion_r143403657 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetScanBatchCreator.java --- @@ -156,18 +160,39 @@ public ScanBatch getBatch

[GitHub] drill pull request #976: DRILL-5797: Choose parquet reader from read columns

2017-10-09 Thread dprofeta
Github user dprofeta commented on a diff in the pull request: https://github.com/apache/drill/pull/976#discussion_r143403232 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetScanBatchCreator.java --- @@ -156,18 +160,39 @@ public ScanBatch getBatch

[GitHub] drill issue #976: DRILL-5797: Choose parquet reader from read columns

2017-10-06 Thread dprofeta
Github user dprofeta commented on the issue: https://github.com/apache/drill/pull/976 @sachouche Can you review it?

[GitHub] drill pull request #976: DRILL-5797: Choose parquet reader from read columns

2017-10-06 Thread dprofeta
GitHub user dprofeta opened a pull request: https://github.com/apache/drill/pull/976 DRILL-5797: Choose parquet reader from read columns ParquetRecordReader is not able to read complex columns. However, it is able to read simple columns in a file containing complex columns
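A minimal sketch of the selection idea named in the PR title, choosing the reader from the columns that are actually read rather than from the whole file schema. The names `ReaderChooser`, `Reader`, and `choose` are hypothetical and do not reflect the actual code in ParquetScanBatchCreator; treating a column missing from the schema as simple is also an assumption made only for this example.

```java
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not Drill's implementation.
final class ReaderChooser {
    enum Reader { FAST, COMPLEX }

    // isComplexByColumn: for each column in the file, whether it is complex.
    // projected: the columns the query actually reads.
    static Reader choose(Map<String, Boolean> isComplexByColumn, List<String> projected) {
        for (String column : projected) {
            // Assumption: a column absent from the schema is treated as simple.
            if (isComplexByColumn.getOrDefault(column, false)) {
                return Reader.COMPLEX;
            }
        }
        // No projected column is complex, so the fast reader can be used
        // even though the file may contain complex columns elsewhere.
        return Reader.FAST;
    }
}
```

For a file with a simple column `id` and a complex column `tags`, projecting only `id` selects `FAST`, while projecting `tags` as well selects `COMPLEX`.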

[GitHub] drill issue #949: DRILL-5795: Parquet Filter push down at rowgroup level

2017-10-02 Thread dprofeta
Github user dprofeta commented on the issue: https://github.com/apache/drill/pull/949 @paul-rogers done

[GitHub] drill issue #949: DRILL-5795: Parquet Filter push down at rowgroup level

2017-09-21 Thread dprofeta
Github user dprofeta commented on the issue: https://github.com/apache/drill/pull/949 I have updated the patch with a unit test and fixed an issue that occurred when everything is filtered out.

[GitHub] drill issue #949: DRILL-5795: Parquet Filter push down at rowgroup level

2017-09-20 Thread dprofeta
Github user dprofeta commented on the issue: https://github.com/apache/drill/pull/949 I will add a unit test that checks the number of rowgroups scanned by the groupscan, to verify that the filter is actually able to prune rowgroups.

[GitHub] drill issue #949: DRILL-5795: Parquet Filter push down at rowgroup level

2017-09-19 Thread dprofeta
Github user dprofeta commented on the issue: https://github.com/apache/drill/pull/949 @parthchandra Can you please review?

[GitHub] drill pull request #949: DRILL-5795: Parquet Filter push down at rowgroup le...

2017-09-19 Thread dprofeta
GitHub user dprofeta opened a pull request: https://github.com/apache/drill/pull/949 DRILL-5795: Parquet Filter push down at rowgroup level Before this commit, the filter could prune only complete files. When a file was composed of multiple rowgroups, it was not able to prune one
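The difference between pruning whole files and pruning individual rowgroups can be sketched as below. This is an illustration under stated assumptions, not the code of the PR: `RowGroupStats` and `PruningSketch` are hypothetical types, the statistics are reduced to a min/max pair on one numeric column, and the only predicate considered is `col > threshold`.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for one rowgroup's min/max statistics on a single
// numeric column. This is NOT Drill's or Parquet's metadata API.
final class RowGroupStats {
    final String file;
    final int index;
    final long min;
    final long max;

    RowGroupStats(String file, int index, long min, long max) {
        this.file = file;
        this.index = index;
        this.min = min;
        this.max = max;
    }

    // "col > threshold" can only match rows if max exceeds the threshold.
    boolean mayMatchGreaterThan(long threshold) {
        return max > threshold;
    }
}

final class PruningSketch {
    // File-level pruning: a file survives with ALL of its rowgroups
    // as soon as one of its rowgroups may match.
    static List<RowGroupStats> pruneFiles(List<RowGroupStats> all, long threshold) {
        Set<String> liveFiles = new LinkedHashSet<>();
        for (RowGroupStats rg : all) {
            if (rg.mayMatchGreaterThan(threshold)) {
                liveFiles.add(rg.file);
            }
        }
        List<RowGroupStats> kept = new ArrayList<>();
        for (RowGroupStats rg : all) {
            if (liveFiles.contains(rg.file)) {
                kept.add(rg);
            }
        }
        return kept;
    }

    // Rowgroup-level pruning: each rowgroup is kept or dropped
    // based on its own statistics only.
    static List<RowGroupStats> pruneRowGroups(List<RowGroupStats> all, long threshold) {
        List<RowGroupStats> kept = new ArrayList<>();
        for (RowGroupStats rg : all) {
            if (rg.mayMatchGreaterThan(threshold)) {
                kept.add(rg);
            }
        }
        return kept;
    }
}
```

For a file whose two rowgroups cover the ranges [0, 10] and [11, 20], the predicate `col > 10` keeps both rowgroups under file-level pruning but only the second one under rowgroup-level pruning.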