[jira] [Commented] (DRILL-6147) Limit batch size for Flat Parquet Reader

ASF GitHub Bot (JIRA) Thu, 28 Jun 2018 19:17:35 -0700


    [ 
https://issues.apache.org/jira/browse/DRILL-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16527066#comment-16527066
 ]


ASF GitHub Bot commented on DRILL-6147:
---------------------------------------

sachouche commented on a change in pull request #1330: DRILL-6147: Adding 
Columnar Parquet Batch Sizing functionality
URL: https://github.com/apache/drill/pull/1330#discussion_r198940291
 
 

 ##########
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/VarLenBulkPageReader.java
 ##########
 @@ -72,14 +79,22 @@
     this.columnPrecInfo = columnPrecInfoInput;
     this.entry = new VarLenColumnBulkEntry(this.columnPrecInfo);
     this.containerCallback = containerCallbackInput;
+    this.fieldOverflowStateContainer = fieldOverflowStateContainer;
 
     // Initialize the Variable Length Entry Readers
-    fixedReader = new VarLenFixedEntryReader(buffer, pageInfo, columnPrecInfo, entry);
-    nullableFixedReader = new VarLenNullableFixedEntryReader(buffer, pageInfo, columnPrecInfo, entry);
-    variableLengthReader = new VarLenEntryReader(buffer, pageInfo, columnPrecInfo, entry);
-    nullableVLReader = new VarLenNullableEntryReader(buffer, pageInfo, columnPrecInfo, entry);
-    dictionaryReader = new VarLenEntryDictionaryReader(buffer, pageInfo, columnPrecInfo, entry);
-    nullableDictionaryReader = new VarLenNullableDictionaryReader(buffer, pageInfo, columnPrecInfo, entry);
+    fixedReader = new VarLenFixedEntryReader(buffer, pageInfo, columnPrecInfo, entry, containerCallback);
 
 Review comment:
   Bulk reading of variable-length columns involves only a couple of 4k 
buffers; I'm not sure RS is needed for this.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> Limit batch size for Flat Parquet Reader
> ----------------------------------------
>
>                 Key: DRILL-6147
>                 URL: https://issues.apache.org/jira/browse/DRILL-6147
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - Parquet
>            Reporter: salim achouche
>            Assignee: salim achouche
>            Priority: Major
>             Fix For: 1.14.0
>
>
> The Parquet reader currently uses a hard-coded batch size limit (32k rows) 
> when creating scan batches; there is neither a parameter nor any logic for 
> controlling the amount of memory used. This enhancement will allow Drill to 
> take an extra input parameter to control direct memory usage.
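
To illustrate the idea in the description above, here is a minimal sketch of 
deriving a per-batch row count from a memory budget instead of relying only on 
the fixed 32k-row cap. It is not Drill's implementation: the class name, the 
method rowsPerBatch, and both parameters (memoryBudgetBytes, 
estimatedRowWidthBytes) are hypothetical names chosen for this example.

```java
// Hypothetical sketch, not Drill code: cap batch rows by a memory budget.
final class BatchSizeSketch {
  /** Row cap taken from the issue description (32k rows). */
  static final int MAX_ROWS_PER_BATCH = 32 * 1024;

  /**
   * Returns how many rows fit in the given budget, never exceeding the
   * 32k-row cap and never dropping below one row.
   * Both parameters are assumptions made for illustration.
   */
  static int rowsPerBatch(long memoryBudgetBytes, long estimatedRowWidthBytes) {
    if (memoryBudgetBytes <= 0 || estimatedRowWidthBytes <= 0) {
      throw new IllegalArgumentException("budget and row width must be positive");
    }
    long rowsThatFit = memoryBudgetBytes / estimatedRowWidthBytes;
    long capped = Math.min(rowsThatFit, MAX_ROWS_PER_BATCH);
    // Always make progress, even if a single row exceeds the budget.
    return (int) Math.max(1L, capped);
  }

  public static void main(String[] args) {
    long budget = 16L * 1024 * 1024; // 16 MiB, an arbitrary example budget
    System.out.println(rowsPerBatch(budget, 100));  // narrow rows: row cap applies
    System.out.println(rowsPerBatch(budget, 4096)); // wide rows: memory budget applies
  }
}
```

With narrow rows the row cap is the binding limit; with wide rows the memory 
budget is, which is the case a fixed row count cannot handle.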



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
