[jira] [Commented] (DRILL-6147) Limit batch size for Flat Parquet Reader

Paul Rogers (JIRA) Sun, 04 Mar 2018 13:31:30 -0800

    [ https://issues.apache.org/jira/browse/DRILL-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385395#comment-16385395 ]


Paul Rogers commented on DRILL-6147:
------------------------------------

To follow up, we should look at all sides of the issue. One factor overlooked 
in my previous note is that code now is better than code later.

DRILL-6147 is available today and will immediately give users a performance 
boost. The result set loader is large and will take some months to commit, and 
so can't offer a benefit until then.

It is hard to argue that we should wait. Let's get DRILL-6147 in now, then
revisit the issue later (doing the proposed test) once the result set loader
is available.

And, as discussed, DRILL-6147 works only for the flat Parquet reader. We'll 
need the result set loader for the Parquet reader that reads nested types.


> Limit batch size for Flat Parquet Reader
> ----------------------------------------
>
>                 Key: DRILL-6147
>                 URL: https://issues.apache.org/jira/browse/DRILL-6147
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - Parquet
>            Reporter: salim achouche
>            Assignee: salim achouche
>            Priority: Major
>             Fix For: 1.14.0
>
>
> The Parquet reader currently uses a hard-coded batch size limit (32k rows)
> when creating scan batches; there is neither a parameter nor any logic for
> controlling the amount of memory used. This enhancement will allow Drill to
> take an extra input parameter to control direct memory usage.
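
To make the description above concrete, here is a minimal sketch of how a
memory budget could bound the row count of a batch. It is not the DRILL-6147
implementation: the class name, the method name, and the idea of a single
estimated row width are all assumptions made for illustration. Only the 32k
row cap comes from the description.

```java
public class BatchSizeSketch {
    // Row cap taken from the issue description (32k rows per batch).
    static final int MAX_ROWS_PER_BATCH = 32 * 1024;

    /**
     * Hypothetical helper, not a Drill API: returns how many rows fit in the
     * given memory budget, never more than the row cap and never fewer than 1.
     */
    static int rowsPerBatch(long memoryLimitBytes, int estimatedRowWidthBytes) {
        if (memoryLimitBytes <= 0 || estimatedRowWidthBytes <= 0) {
            throw new IllegalArgumentException("limits must be positive");
        }
        // Rows that fit in the budget, assuming every row has the estimated width.
        long rowsThatFit = memoryLimitBytes / estimatedRowWidthBytes;
        // Apply the existing row cap, then keep at least one row so the scan
        // can still make progress when a single row exceeds the budget.
        long capped = Math.min(rowsThatFit, MAX_ROWS_PER_BATCH);
        return (int) Math.max(1L, capped);
    }
}
```

Under these assumptions, a 16 MiB budget with 100-byte rows still hits the row
cap (32768 rows), while the same budget with 4096-byte rows yields 4096 rows,
so the memory limit only takes effect for wide rows.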



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
