[
https://issues.apache.org/jira/browse/DRILL-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385395#comment-16385395
]
Paul Rogers commented on DRILL-6147:
------------------------------------
To follow up, we should look at all sides of the issue. One factor I overlooked
in my previous note is that code available now is better than code promised
later.
DRILL-6147 is available today and will immediately give users a performance
boost. The result set loader is large and will take some months to commit, and
so can't offer a benefit until then.
It is hard to argue that we should wait. Let's get DRILL-6147 in now, then
revisit the issue later (doing the proposed test) once the result set loader is
available.
And, as discussed, DRILL-6147 works only for the flat Parquet reader. We'll
need the result set loader for the Parquet reader that reads nested types.
> Limit batch size for Flat Parquet Reader
> ----------------------------------------
>
> Key: DRILL-6147
> URL: https://issues.apache.org/jira/browse/DRILL-6147
> Project: Apache Drill
> Issue Type: Improvement
> Components: Storage - Parquet
> Reporter: salim achouche
> Assignee: salim achouche
> Priority: Major
> Fix For: 1.14.0
>
>
> The Parquet reader currently uses a hard-coded batch size limit (32k rows)
> when creating scan batches; there is neither a parameter nor any logic for
> controlling the amount of memory used. This enhancement will allow Drill to
> take an extra input parameter to control direct memory usage.
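As a rough illustration of the idea in the description above, a memory-based
limit could be combined with the existing row cap along the following lines.
This is only a sketch under stated assumptions, not the DRILL-6147 patch: the
class name, the method name, and the notion of a single estimated row width are
all hypothetical, and "32k" is assumed here to mean 32 * 1024 rows.

```java
// Hypothetical sketch: derive a per-batch row count from a memory budget.
// None of these names come from the Drill code base.
final class BatchSizeSketch {
    // Row cap from the issue description; "32k" is assumed to mean 32 * 1024.
    static final int MAX_ROWS_PER_BATCH = 32 * 1024;

    /**
     * Returns how many rows to place in one batch so that the batch stays
     * within memoryLimitBytes, never exceeding the fixed row cap and never
     * dropping below one row.
     */
    static int rowsPerBatch(long memoryLimitBytes, long estimatedRowWidthBytes) {
        if (memoryLimitBytes <= 0 || estimatedRowWidthBytes <= 0) {
            throw new IllegalArgumentException("limit and row width must be positive");
        }
        long rowsThatFit = memoryLimitBytes / estimatedRowWidthBytes;
        long bounded = Math.max(1L, Math.min(rowsThatFit, (long) MAX_ROWS_PER_BATCH));
        return (int) bounded;
    }

    public static void main(String[] args) {
        long limit = 16L * 1024 * 1024; // assumed 16 MiB budget, for illustration
        // Narrow rows: the memory budget is not the constraint, the row cap is.
        System.out.println(rowsPerBatch(limit, 100));
        // Wide rows: the memory budget reduces the batch below the row cap.
        System.out.println(rowsPerBatch(limit, 4096));
    }
}
```

With narrow rows the fixed row cap still governs; with wide rows the memory
budget takes over, which is the case the hard-coded limit alone cannot handle.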
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)