[ 
https://issues.apache.org/jira/browse/DRILL-6678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia updated DRILL-6678:
-------------------------------------
    Description: In most cases SelectionVectorRemover sits downstream of a 
Filter, which reduces the number of records to be copied into the output 
container. In those cases, if SelectionVectorRemover packs the output batch to 
approximately maximum utilization, based on the RecordBatchSizer target record 
count, it will emit fewer output batches, which should improve performance. 
During the Lateral & Unnest performance evaluation we noticed a significant 
decrease in performance as the number of batches increases for the same number 
of rows (i.e. when batches are not fully packed).  (was: SelectionVectorRemover 
in most of the cases is downstream to Filter which reduces the number of 
records to be copied in output container. In those cases if 
SelectionVectorRemover can pack the output batch to maximum utilization that 
will reduce the number of output batches from it and will help to improve 
performance. During Lateral & Unnest  Performance evaluation we have noticed a 
significant decrease in performance as number of batches increases for same 
number of rows (i.e. Batch is not fully packed))

> Improve SelectionVectorRemover to pack output batch based on BatchSizing
> ------------------------------------------------------------------------
>
>                 Key: DRILL-6678
>                 URL: https://issues.apache.org/jira/browse/DRILL-6678
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Execution - Relational Operators
>    Affects Versions: 1.14.0
>            Reporter: Sorabh Hamirwasia
>            Assignee: Sorabh Hamirwasia
>            Priority: Major
>
> In most cases SelectionVectorRemover sits downstream of a Filter, which 
> reduces the number of records to be copied into the output container. In 
> those cases, if SelectionVectorRemover packs the output batch to 
> approximately maximum utilization, based on the RecordBatchSizer target 
> record count, it will emit fewer output batches, which should improve 
> performance. During the Lateral & Unnest performance evaluation we noticed a 
> significant decrease in performance as the number of batches increases for 
> the same number of rows (i.e. when batches are not fully packed).
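
The packing idea described above can be illustrated with a minimal sketch. This is not Drill's implementation: the real operator copies value vectors and would take its target from RecordBatchSizer, whereas this example uses plain lists and a fixed target. The class BatchPacker, the method pack, and the parameter targetRecordCount are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BatchPacker {

    /**
     * Hypothetical helper: copies the rows that survived filtering out of
     * sparsely filled input batches into output batches holding up to
     * targetRecordCount rows each. Only the last output batch may be
     * partially filled.
     */
    public static List<List<Integer>> pack(List<List<Integer>> filteredBatches,
                                           int targetRecordCount) {
        if (targetRecordCount <= 0) {
            throw new IllegalArgumentException("targetRecordCount must be positive");
        }
        List<List<Integer>> output = new ArrayList<>();
        List<Integer> current = new ArrayList<>(targetRecordCount);
        for (List<Integer> batch : filteredBatches) {
            for (Integer row : batch) {
                current.add(row);
                // Emit the output batch only once it reaches the target size.
                if (current.size() == targetRecordCount) {
                    output.add(current);
                    current = new ArrayList<>(targetRecordCount);
                }
            }
        }
        // Flush whatever remains as a final, possibly partial, batch.
        if (!current.isEmpty()) {
            output.add(current);
        }
        return output;
    }

    public static void main(String[] args) {
        // Three incoming batches, each left with only a few rows after the filter.
        List<List<Integer>> incoming = Arrays.asList(
                Arrays.asList(1, 2),
                Arrays.asList(3),
                Arrays.asList(4, 5, 6));
        // With a target of 4 rows, three sparse batches become two output batches.
        System.out.println(pack(incoming, 4)); // prints [[1, 2, 3, 4], [5, 6]]
    }
}
```

Without packing, each sparse incoming batch would map to its own output batch (three here). Carrying rows across input-batch boundaries is what reduces the batch count for the same number of rows.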



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
