[ 
https://issues.apache.org/jira/browse/DRILL-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16528199#comment-16528199
 ] 

Robert Hou commented on DRILL-6147:
-----------------------------------

This is a new style of testing for QA, at least for me.  As far as I can tell, 
QA has only been testing the correctness of a query, i.e. whether the query 
returned the right data or whether the explain plan is correct.

For batch size testing, there is no change in the data that is returned.  
Instead, QA is verifying that each batch is created correctly.  At the moment, 
we can only do this by looking at the logs, so we need log messages that are 
specific to QA testing.  Each test exercises a specific operator, so for a 
given query we would prefer to check only the log messages for that operator.
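
As a rough sketch of what such an operator-specific log check could look like: 
the log line format, the operator names, and the function name below are all 
made up for illustration and are not Drill's actual log output.  The idea is 
only to filter the log down to one operator and flag any batch that exceeds 
the configured limit.

```python
import re

# Hypothetical log line format, invented for this sketch; Drill's actual
# batch-sizing log messages may look different.
LOG_PATTERN = re.compile(
    r"BATCH_STATS operator=(?P<operator>\w+) "
    r"rows=(?P<rows>\d+) bytes=(?P<bytes>\d+)"
)


def oversized_batches(log_lines, operator, max_batch_bytes):
    """Return (rows, bytes) for each batch of `operator` above the limit."""
    violations = []
    for line in log_lines:
        match = LOG_PATTERN.search(line)
        if match is None or match.group("operator") != operator:
            continue  # skip unrelated lines and other operators
        rows = int(match.group("rows"))
        size = int(match.group("bytes"))
        if size > max_batch_bytes:
            violations.append((rows, size))
    return violations


# Made-up sample log lines in the hypothetical format above.
sample_log = [
    "DEBUG BATCH_STATS operator=PARQUET_SCAN rows=4096 bytes=1048576",
    "DEBUG BATCH_STATS operator=FLATTEN rows=8192 bytes=9000000",
    "DEBUG BATCH_STATS operator=PARQUET_SCAN rows=32768 bytes=20000000",
]

# Check only the scan operator against an assumed 16 MiB batch limit.
print(oversized_batches(sample_log, "PARQUET_SCAN", 16 * 1024 * 1024))
```

A test would pass when the returned list is empty for the operator under 
test, regardless of what other operators logged for the same query.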

We may have had some similar features where the data has not changed, and I'm 
not sure how QA verified them.  I do have experience with external sort.  The 
external sort project that was recently delivered was focussed on executing 
queries that were failing.  So QA's main task was to verify the queries 
succeeded with the new version.

We will likely face similar issues as we continue work in resource management.  
We can always look at better ways to do this.

> Limit batch size for Flat Parquet Reader
> ----------------------------------------
>
>                 Key: DRILL-6147
>                 URL: https://issues.apache.org/jira/browse/DRILL-6147
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - Parquet
>            Reporter: salim achouche
>            Assignee: salim achouche
>            Priority: Major
>              Labels: ready-to-commit
>             Fix For: 1.14.0
>
>
> The Parquet reader currently uses a hard-coded batch size limit (32k rows) 
> when creating scan batches; there is no parameter nor any logic for 
> controlling the amount of memory used. This enhancement will allow Drill to 
> take an extra input parameter to control direct memory usage.


