[ 
https://issues.apache.org/jira/browse/DRILL-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013023#comment-16013023
 ] 

Paul Rogers commented on DRILL-5517:
------------------------------------

Highlights of changes:

* Provide a {{setBytesBounded()}} method in DrillBuf that sets the bytes if they 
fit in the current buffer and returns false otherwise. (The current set methods 
throw exceptions instead.)
* Provide equivalent methods in the UDLE ({{UnsafeDirectLittleEndian}}) class.
* Define a constant for the maximum vector buffer size, along with an 
experimental system option to adjust this size at startup.
* For all vectors with data, define new {{setScalar()}} and {{setArrayItem()}} 
methods that implement the size-aware semantics given in the issue description. 
{{setScalar()}} is for individual values; {{setArrayItem()}} is for items that 
are members of an array. The semantics of the two cases differ slightly.
* For all vectors with data, provide a {{copyEntry()}} method to be used when 
handling "overflow" rows in record batch writers.
* For all vectors with data, provide an {{exchange()}} method that swaps the 
buffers between two vectors. Unlike the {{TransferPair}} mechanism, this 
exchange is for vectors within a single operator, using the same allocator, so 
no memory accounting is needed. Used when swapping between the "full" and 
"overflow" batches in the record batch writer.
* In each fixed-width vector, define a constant for the value width rather than 
using a (generated) "magic number".
* In each fixed-width vector, create a constant for the maximum number of 
values that fit in either 64K or the maximum vector length.
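
To make the bounded-set idea concrete, here is a minimal sketch of a 
fixed-width vector whose set operation returns false instead of throwing once 
the buffer would pass its cap. It is a toy, not the Drill code: the class name 
{{BoundedIntVector}}, the constructor parameters, and the method signatures are 
assumptions made for illustration; only the method names {{setScalar()}} and 
{{exchange()}} and the 16 MB figure come from this issue.

```java
import java.util.Arrays;

/** Toy stand-in for a fixed-width INT vector; hypothetical, not Drill's generated class. */
final class BoundedIntVector {
    /** Width of one value in bytes, named rather than left as a magic number. */
    static final int VALUE_WIDTH = 4;
    /** Default buffer cap of 16 MB, the limit named in the issue description. */
    static final int MAX_BUFFER_SIZE = 16 * 1024 * 1024;

    private final int maxValueCount; // number of values that fit under the cap
    private int[] data;

    BoundedIntVector(int initialCapacity) {
        this(initialCapacity, MAX_BUFFER_SIZE);
    }

    /** maxBufferBytes mimics an adjustable size option; an assumption of this sketch. */
    BoundedIntVector(int initialCapacity, int maxBufferBytes) {
        this.maxValueCount = maxBufferBytes / VALUE_WIDTH;
        this.data = new int[Math.min(initialCapacity, maxValueCount)];
    }

    /** Sets the value, growing as needed; returns false if growth would pass the cap. */
    boolean setScalar(int index, int value) {
        if (index >= maxValueCount) {
            return false; // the caller decides what to do; no exception is thrown
        }
        if (index >= data.length) {
            int newCapacity = Math.max(data.length, 1);
            while (newCapacity <= index) {
                newCapacity *= 2; // grow by doubling in this toy
            }
            data = Arrays.copyOf(data, Math.min(newCapacity, maxValueCount));
        }
        data[index] = value;
        return true;
    }

    int get(int index) {
        return data[index];
    }

    /** Swaps buffers with another vector; nothing is copied in this toy. */
    void exchange(BoundedIntVector other) {
        int[] tmp = data;
        data = other.data;
        other.data = tmp;
    }
}
```

With a 16-byte cap, for example, indexes 0 through 3 can be set and index 4 is 
refused, so the caller can react to the false return rather than catch an 
exception.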

This work required reviewing much existing code. A number of cosmetic cleanups 
were done as problems were noticed.

A unit test verifies the semantics of the new methods for typical generated 
required, optional and repeated vectors for both the fixed-width and 
variable-width cases.


> Provide size-aware set operations in value vectors
> --------------------------------------------------
>
>                 Key: DRILL-5517
>                 URL: https://issues.apache.org/jira/browse/DRILL-5517
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.11.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>             Fix For: 1.11.0
>
>
> DRILL-5211 describes a memory fragmentation issue in Drill. The resolution is 
> to limit vector sizes to 16 MB (the size of Netty memory allocation "slabs"). 
> The effort starts by providing "size-aware" set operations in value vectors 
> that:
> * Operate as {{setSafe()}} while vectors are below 16 MB.
> * Return false if setting the value (and growing the vector) would exceed the 
> vector limit.
> The methods in value vectors then become the foundation on which we can 
> construct size-aware record batch "writers."
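
As a rough illustration of how a size-aware writer might sit on top of such 
set operations, the sketch below writes rows into a "current" batch until one 
column refuses a value, finishes that row in an "overflow" batch, and swaps the 
two when the full batch is harvested. Everything here is hypothetical 
({{ToyVarCharVector}}, {{ToyBatchWriter}}, {{writeRow()}}, {{harvest()}}, and all 
signatures); only the roles of {{setScalar()}}, {{copyEntry()}} and {{exchange()}} 
follow the comment above. It also assumes any single value fits in an empty 
vector.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Toy variable-width column with a byte budget; a stand-in, not Drill's vector class. */
final class ToyVarCharVector {
    private final int maxBytes;
    private List<String> values = new ArrayList<>();
    private int usedBytes;

    ToyVarCharVector(int maxBytes) { this.maxBytes = maxBytes; }

    /** Appends the value if it fits the budget; returns false and changes nothing otherwise. */
    boolean setScalar(int index, String value) {
        if (index != values.size()) throw new IllegalArgumentException("toy vector is append-only");
        int width = value.getBytes(StandardCharsets.UTF_8).length;
        if (usedBytes + width > maxBytes) return false;
        values.add(value);
        usedBytes += width;
        return true;
    }

    /** Copies one entry from another vector into this one. */
    boolean copyEntry(int toIndex, ToyVarCharVector from, int fromIndex) {
        return setScalar(toIndex, from.values.get(fromIndex));
    }

    /** Swaps contents with another vector of the same writer. */
    void exchange(ToyVarCharVector other) {
        List<String> v = values; values = other.values; other.values = v;
        int b = usedBytes; usedBytes = other.usedBytes; other.usedBytes = b;
    }

    List<String> snapshot(int count) { return new ArrayList<>(values.subList(0, count)); }

    void clear() { values.clear(); usedBytes = 0; }
}

/** Hypothetical writer that keeps the row that did not fit as the first row of the next batch. */
final class ToyBatchWriter {
    private final ToyVarCharVector[] current;
    private final ToyVarCharVector[] overflow;
    private int rowCount;
    private boolean full;

    ToyBatchWriter(int columns, int maxBytesPerColumn) {
        current = new ToyVarCharVector[columns];
        overflow = new ToyVarCharVector[columns];
        for (int c = 0; c < columns; c++) {
            current[c] = new ToyVarCharVector(maxBytesPerColumn);
            overflow[c] = new ToyVarCharVector(maxBytesPerColumn);
        }
    }

    /** Writes one row; returns false once the batch is full (the row is kept as overflow). */
    boolean writeRow(String... row) {
        if (full) throw new IllegalStateException("harvest the full batch first");
        for (int c = 0; c < row.length; c++) {
            if (!current[c].setScalar(rowCount, row[c])) {
                boolean ok = true;
                // Move the columns already written for this row, then finish it in overflow.
                for (int done = 0; done < c; done++) ok &= overflow[done].copyEntry(0, current[done], rowCount);
                for (int rest = c; rest < row.length; rest++) ok &= overflow[rest].setScalar(0, row[rest]);
                if (!ok) throw new IllegalStateException("single value larger than the vector limit");
                full = true;
                return false;
            }
        }
        rowCount++;
        return true;
    }

    /** Returns the completed rows column by column, then makes overflow the current batch. */
    List<List<String>> harvest() {
        List<List<String>> batch = new ArrayList<>();
        for (int c = 0; c < current.length; c++) {
            batch.add(current[c].snapshot(rowCount)); // excludes any partial row
            current[c].clear();
            current[c].exchange(overflow[c]);
        }
        rowCount = full ? 1 : 0;
        full = false;
        return batch;
    }
}
```

The point of the swap is that the overflow row is never rewritten: after 
{{harvest()}}, it is already in place as row 0 of the next batch.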



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
