[
https://issues.apache.org/jira/browse/DRILL-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Venki Korukanti updated DRILL-2178:
-----------------------------------
Attachment: 0001-DRILL-2178-Update-outgoing-record-batch-size-and-all.patch
Attaching patch. Unit tests passed. Currently running functional and sf 100
parquet/text tests.
> Update outgoing record batch size and allocation in PartitionSender
> -------------------------------------------------------------------
>
> Key: DRILL-2178
> URL: https://issues.apache.org/jira/browse/DRILL-2178
> Project: Apache Drill
> Issue Type: Bug
> Reporter: Venki Korukanti
> Assignee: Venki Korukanti
> Fix For: 0.8.0
>
> Attachments:
> 0001-DRILL-2178-Update-outgoing-record-batch-size-and-all.patch
>
>
> Currently we allocate memory for vectors in the partition sender's
> OutgoingRecordBatch using allocateNew(), which for most ValueVectors allocates
> space for a capacity of 4096 records, but we flush the current record batch as
> soon as we reach 1000 records, which wastes memory. Automatic resizing kicks in
> after a few record batches are flushed, but auto resize always doubles or
> halves the capacity. This causes the buffer record capacity to flip between
> 512 and 2048.
> This JIRA is to:
> 1. Decide on the outgoing record batch size depending upon the number of
> receivers of the partition sender. The default value is 1024, but when the
> number of receivers exceeds 1000, change it to 512.
> 2. Allocate value vector space sized for the outgoing record batch size
> decided in (1). For this we make use of
> {{AllocationHelper.allocate(ValueVector v, int valueCount, int bytesPerValue,
> int repeatedPerTop)}}. {{bytesPerValue}} and {{repeatedPerTop}} are currently
> hard-coded to 50 and 10, but this shouldn't matter, as these values apply only
> to variable-width and repeated vectors, which can realloc if they run out of
> space.
>
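The sizing rule in (1) and the waste described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the code in the attached patch: the class name, constant names, and both helper methods are hypothetical, and only the numbers (1024, 512, 1000, 4096) come from the issue description. In the real change, the chosen size would presumably be passed as valueCount to AllocationHelper.allocate, which is not called here so that the sketch runs without Drill on the classpath.

```java
public class OutgoingBatchSizing {
    // Values quoted in the issue description.
    static final int DEFAULT_BATCH_SIZE = 1024;
    static final int REDUCED_BATCH_SIZE = 512;
    static final int RECEIVER_THRESHOLD = 1000;

    /** Hypothetical helper: picks the outgoing batch size from the receiver count. */
    static int outgoingBatchSize(int numReceivers) {
        // Only when the receiver count exceeds the threshold is the smaller size used.
        return numReceivers > RECEIVER_THRESHOLD ? REDUCED_BATCH_SIZE : DEFAULT_BATCH_SIZE;
    }

    /** Hypothetical helper: fraction of allocated value slots unused at flush time. */
    static double unusedFraction(int allocatedCapacity, int recordsAtFlush) {
        return 1.0 - (double) recordsAtFlush / allocatedCapacity;
    }

    public static void main(String[] args) {
        System.out.println("200 receivers  -> batch size " + outgoingBatchSize(200));
        System.out.println("1500 receivers -> batch size " + outgoingBatchSize(1500));
        // Capacity of 4096 records, flushed at 1000 records (the current behavior).
        System.out.println("unused fraction: " + unusedFraction(4096, 1000));
    }
}
```

With a capacity of 4096 and a flush at 1000 records, roughly 76% of the allocated value slots are never used, which is the waste that allocating for the decided batch size is meant to avoid.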