[
https://issues.apache.org/jira/browse/DRILL-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360194#comment-16360194
]
Paul Rogers commented on DRILL-6147:
------------------------------------
Salim Says:
Predicate Pushdown
- The reason I invoked Predicate Pushdown within the document is to support the
analysis:
o Notice how Record Batch materialization could involve many more pages
o A solution that relies mainly on the current set of pages (one per column)
might pay a heavy IO price without much to show for it
+ By waiting for all columns to have at least one page loaded, upfront
statistics can be gathered
+ Batch memory is then divided optimally across columns and the current
batch size is computed
+ Unfortunately, such logic will fail if more pages are involved than the
ones taken into consideration
o Example -
+ Two variable length columns c1 and c2
+ Reader waits for two pages P1-1 and P2-1 so that it can a) allocate memory
optimally across c1 and c2 and b) compute a batch size that minimizes the
overflow logic
+ Assume, because of data length skew or predicate pushdown, that more
pages are involved in loading the batch
+ For c1: {P1-1, P1-2, P1-3, P1-4}; for c2: {P2-1, P2-2}
+ It is now quite possible that the overflow logic will not be optimal
since only two pages' statistics were considered instead of six (see the
sketch after this list)
- I have added new logic to the ScanBatch to log (on demand) extra batch
statistics that will help us assess the efficiency of the batch sizing
strategy; I will add this information to the document when this sub-task is done
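To make the sizing concern concrete, below is a minimal sketch of the idea,
assuming the row count is estimated only from the statistics of the first
loaded page of each column. The class and field names are illustrative
stand-ins, not the actual Parquet reader code; the point is that skew in later
pages (P1-2 .. P1-4) invalidates an estimate built from P1-1 and P2-1 alone.
{code:java}
// Hypothetical sketch of the one-page-per-column batch-sizing idea; the names
// are illustrative and do not come from the actual Parquet reader code.
final class BatchSizer {

  /** Per-page statistics, as reported by a page header. */
  static final class PageStats {
    final long uncompressedBytes;
    final long valueCount;

    PageStats(long uncompressedBytes, long valueCount) {
      this.uncompressedBytes = uncompressedBytes;
      this.valueCount = valueCount;
    }
  }

  /**
   * Estimate the number of rows per batch from the first page of each column,
   * given a per-batch memory budget in bytes.
   */
  static int estimateRowsPerBatch(long batchMemoryBudget, PageStats[] firstPages) {
    // Estimated bytes per row, summed across columns, using only the first
    // loaded page of each column (P1-1 and P2-1 in the example above).
    double bytesPerRow = 0;
    for (PageStats page : firstPages) {
      bytesPerRow += (double) page.uncompressedBytes / page.valueCount;
    }
    // If later pages (P1-2 .. P1-4 for c1) carry much larger values, this
    // estimate is too optimistic and the overflow logic must compensate.
    return Math.max(1, (int) (batchMemoryBudget / bytesPerRow));
  }
}
{code}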
Paul's comment:
Perhaps we are conflating a number of things here; this is not, in fact, that
simple. Predicates must be applied per row, not per column or per page. If the
predicate is "X = 5", then we must exclude the values of every column in any
row where X != 5. This means we cannot do independent bulk operations per
column if we are also doing per-row predicate filtering. Said another way,
predicate push-down forces row-by-row processing, even though the underlying
storage format is columnar. (This is why the Filter operator works row by row.)
The result set loader has built-in support for predicate push-down. Once a row
is loaded, but before it is saved, we can apply a predicate to that row. If we
don't want the row, we don't save it, and the result set loader simply
overwrites that row with the next one. That code works and is tested. (The
predicate evaluation logic would have to be added; the row-handling parts are
done. Here again, why not build on the existing work, adding additional value
such as the filter evaluation logic?)
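A minimal sketch of that write-then-save pattern follows; the RowWriter
interface is a stand-in invented for illustration, not the actual Drill result
set loader API.
{code:java}
// Hypothetical stand-in for a row writer; only the write/test/save flow matters.
interface RowWriter {
  void startRow();                     // position the writers at the next row slot
  void setLong(int colIndex, long value);
  void saveRow();                      // commit the row to the batch
  // Not calling saveRow() leaves the slot free to be overwritten by the next row.
}

final class FilteringLoader {

  /** Load rows, keeping only those where column X equals the given value. */
  static void loadFiltered(long[] xValues, long[] yValues, long match, RowWriter writer) {
    for (int i = 0; i < xValues.length; i++) {
      writer.startRow();
      writer.setLong(0, xValues[i]);   // column X
      writer.setLong(1, yValues[i]);   // some other column
      if (xValues[i] == match) {
        writer.saveRow();              // predicate "X = match" holds: keep the row
      }
      // else: skip saveRow(); the next startRow() reuses/overwrites this slot
    }
  }
}
{code}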
The discussion above is about optimizing I/O. But predicates work row by row.
I am not sure how page buffering impacts per-row predicates: one has to load
all the pages for a batch, and decode them, even if we need only a single value
from a page. (Parquet does not allow random access to values within a page due
to encoding, if I understand correctly.) One optimization would be to read just
the predicate column and apply the predicate to those values (X in my example
above). Then, if no X value matches the predicate, we can skip reading the
other pages. But I'd have thought we could get the same result from the footer
metadata...
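A minimal sketch of those two pruning ideas, assuming we already have the
row-group min/max from the footer and the decoded values of the predicate
column; the methods are illustrative stand-ins, not the Parquet or Drill APIs.
{code:java}
// Hypothetical sketch of predicate-driven pruning; not actual Parquet/Drill code.
final class PredicatePruning {

  /** Row-group pruning from footer metadata: can "X = target" match at all? */
  static boolean rowGroupCanMatch(long minX, long maxX, long target) {
    return target >= minX && target <= maxX;
  }

  /** Page-level check: decode only column X and test whether any value matches. */
  static boolean anyValueMatches(long[] decodedXValues, long target) {
    for (long v : decodedXValues) {
      if (v == target) {
        return true;                   // at least one row qualifies: read the other columns
      }
    }
    return false;                      // no match: the other columns' pages can be skipped
  }
}
{code}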
> Limit batch size for Flat Parquet Reader
> ----------------------------------------
>
> Key: DRILL-6147
> URL: https://issues.apache.org/jira/browse/DRILL-6147
> Project: Apache Drill
> Issue Type: Improvement
> Components: Storage - Parquet
> Reporter: salim achouche
> Assignee: salim achouche
> Priority: Major
> Fix For: 1.13.0
>
>
> The Parquet reader currently uses a hard-coded batch size limit (32k rows)
> when creating scan batches; there is no parameter nor any logic for
> controlling the amount of memory used. This enhancement will allow Drill to
> take an extra input parameter to control direct memory usage.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)