Ben-Zvi commented on a change in pull request #1783: DRILL-7240: Catch runtime pruning filter-match exceptions and do not prune these rowgroups
URL: https://github.com/apache/drill/pull/1783#discussion_r281435745
 
 

 ##########
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/AbstractParquetScanBatchCreator.java
 ##########
 @@ -184,18 +185,30 @@ protected ScanBatch getBatch(ExecutorFragmentContext context, AbstractParquetRow
           //
           // Perform the Run-Time Pruning - i.e. Skip this rowgroup if the match fails
           //
-          RowsMatch match = FilterEvaluatorUtils.matches(filterPredicate, columnsStatistics, footerRowCount);
-
-          // collect logging info
-          long timeToRead = pruneTimer.elapsed(TimeUnit.MICROSECONDS);
-          pruneTimer.stop();
-          pruneTimer.reset();
-          totalPruneTime += timeToRead;
-          logger.trace("Run-time pruning: {} row-group {} (RG index: {} row count: {}), took {} usec", // trace each single rowgroup
-            match == RowsMatch.NONE ? "Excluded" : "Included", rowGroup.getPath(), rowGroupIndex, footerRowCount, timeToRead);
+          RowsMatch matchResult = RowsMatch.ALL;
+          try {
+            matchResult = FilterEvaluatorUtils.matches(filterPredicate, columnsStatistics, footerRowCount);
+
+            // collect logging info
+            long timeToRead = pruneTimer.elapsed(TimeUnit.MICROSECONDS);
+            pruneTimer.stop();
+            pruneTimer.reset();
+            totalPruneTime += timeToRead;
+            logger.trace("Run-time pruning: {} row-group {} (RG index: {} row count: {}), took {} usec", // trace each single rowgroup
+              matchResult == RowsMatch.NONE ? "Excluded" : "Included", rowGroup.getPath(), rowGroupIndex, footerRowCount, timeToRead);
+          } catch (ClassCastException cce) {
+            if ( ! matchCastErrorNotified ) {
+              logger.info("Run-time pruning check failed due to type casting. Skipping pruning rowgroups starting from {}. (Error: {})", rowGroup.getPath(), cce.getMessage());
 
 Review comment:
   Done. Here is a sample from the log (tested with 6 rowgroups/files: three fail the pruning check, and of the remaining three one was pruned):
   ```
   2019-05-06 18:37:19,809 [232f1eb3-2cab-7d16-1222-c05eca0fdceb:frag:0:0] TRACE o.a.d.e.s.p.AbstractParquetScanBatchCreator - ParquetTrace,Read Footer,,/tmp/twofoo/sub/0_0_3.parquet,,0,0,0,170059
   2019-05-06 18:37:19,828 [232f1eb3-2cab-7d16-1222-c05eca0fdceb:frag:0:0] TRACE o.a.d.e.s.p.AbstractParquetScanBatchCreator - ParquetTrace,Read Footer,,/tmp/twofoo/sub/0_0_2.parquet,,0,0,0,4691
   2019-05-06 18:37:19,829 [232f1eb3-2cab-7d16-1222-c05eca0fdceb:frag:0:0] TRACE o.a.d.e.s.p.AbstractParquetScanBatchCreator - Run-time pruning: Included row-group /tmp/twofoo/sub/0_0_2.parquet (RG index: 0 row count: 2), took 466 usec
   2019-05-06 18:37:19,869 [232f1eb3-2cab-7d16-1222-c05eca0fdceb:frag:0:0] TRACE o.a.d.e.s.p.AbstractParquetScanBatchCreator - ParquetTrace,Read Footer,,/tmp/twofoo/sub/0_0_0.parquet,,0,0,0,37316
   2019-05-06 18:37:19,870 [232f1eb3-2cab-7d16-1222-c05eca0fdceb:frag:0:0] TRACE o.a.d.e.s.p.AbstractParquetScanBatchCreator - Run-time pruning: Excluded row-group /tmp/twofoo/sub/0_0_0.parquet (RG index: 0 row count: 2), took 683 usec
   2019-05-06 18:37:19,909 [232f1eb3-2cab-7d16-1222-c05eca0fdceb:frag:0:0] TRACE o.a.d.e.s.p.AbstractParquetScanBatchCreator - ParquetTrace,Read Footer,,/tmp/twofoo/sub/0_0_1.parquet,,0,0,0,38874
   2019-05-06 18:37:19,914 [232f1eb3-2cab-7d16-1222-c05eca0fdceb:frag:0:0] TRACE o.a.d.e.s.p.AbstractParquetScanBatchCreator - ParquetTrace,Read Footer,,/tmp/twofoo/sub/0_0_4.parquet,,0,0,0,2417
   2019-05-06 18:37:19,915 [232f1eb3-2cab-7d16-1222-c05eca0fdceb:frag:0:0] TRACE o.a.d.e.s.p.AbstractParquetScanBatchCreator - Run-time pruning: Included row-group /tmp/twofoo/sub/0_0_4.parquet (RG index: 0 row count: 2), took 356 usec
   2019-05-06 18:37:19,919 [232f1eb3-2cab-7d16-1222-c05eca0fdceb:frag:0:0] TRACE o.a.d.e.s.p.AbstractParquetScanBatchCreator - ParquetTrace,Read Footer,,/tmp/twofoo/sub/0_0_5.parquet,,0,0,0,2749
   2019-05-06 18:37:19,922 [232f1eb3-2cab-7d16-1222-c05eca0fdceb:frag:0:0] INFO  o.a.d.e.s.p.AbstractParquetScanBatchCreator - Finished parquet_runtime_pruning in 1505 usec. Out of given 6 rowgroups, 1 were pruned.
   2019-05-06 18:37:19,922 [232f1eb3-2cab-7d16-1222-c05eca0fdceb:frag:0:0] INFO  o.a.d.e.s.p.AbstractParquetScanBatchCreator - Run-time pruning check skipped for 3 out of 6 rowgroups due to: java.lang.Integer cannot be cast to java.lang.Long
   ```
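
For anyone reading this without the full diff, the behaviour the log shows can be approximated with a small standalone sketch. This is not the code in `AbstractParquetScanBatchCreator`: the class `PruningSketch`, the `Result` holder, the `prune` method, and the local `RowsMatch` enum are all hypothetical stand-ins, and the matcher is a plain function rather than `FilterEvaluatorUtils.matches`. It only illustrates the idea of defaulting to "keep the rowgroup" when the filter match throws a `ClassCastException`, and remembering the first error for a single summary message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class PruningSketch {
  // Hypothetical stand-in for Drill's RowsMatch; not the real enum.
  enum RowsMatch { ALL, NONE, SOME }

  /** Hypothetical holder for the outcome of one pruning pass. */
  static final class Result {
    final List<String> kept = new ArrayList<>();
    int pruned;        // rowgroups excluded because the filter matched no rows
    int skipped;       // rowgroups whose pruning check threw and was skipped
    String firstError; // message of the first cast failure, reported once
  }

  static Result prune(List<String> rowGroups, Function<String, RowsMatch> matcher) {
    Result result = new Result();
    for (String rowGroup : rowGroups) {
      RowsMatch match = RowsMatch.ALL; // default: do not prune
      try {
        match = matcher.apply(rowGroup);
      } catch (ClassCastException cce) {
        // The check failed, so the rowgroup is kept; only the first error is remembered.
        if (result.firstError == null) {
          result.firstError = cce.getMessage();
        }
        result.skipped++;
      }
      if (match == RowsMatch.NONE) {
        result.pruned++;
      } else {
        result.kept.add(rowGroup);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // Mirrors the test above: three files fail the check, one of the other three is pruned.
    List<String> rowGroups =
        Arrays.asList("0_0_0", "0_0_1", "0_0_2", "0_0_3", "0_0_4", "0_0_5");
    Result r = prune(rowGroups, name -> {
      switch (name) {
        case "0_0_0": return RowsMatch.NONE;
        case "0_0_2":
        case "0_0_4": return RowsMatch.SOME;
        default: throw new ClassCastException(
            "java.lang.Integer cannot be cast to java.lang.Long");
      }
    });
    System.out.println("Out of given " + rowGroups.size() + " rowgroups, "
        + r.pruned + " were pruned; check skipped for " + r.skipped
        + " due to: " + r.firstError);
  }
}
```

The point of initialising the match to `ALL` before the `try` is that a failed check can never cause a rowgroup to be dropped; the worst case is that it is read without the benefit of pruning.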
