[
https://issues.apache.org/jira/browse/DRILL-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013023#comment-16013023
]
Paul Rogers commented on DRILL-5517:
------------------------------------
Highlights of changes:
* Provide a {{setBytesBounded()}} method in {{DrillBuf}} that sets the bytes if
they fit in the current buffer and returns false otherwise. (The existing set
methods throw an exception instead.)
* Equivalent methods in the UDLE ({{UnsafeDirectLittleEndian}}) class.
* A constant that defines the maximum vector buffer size, plus an experimental
system option to adjust this size at startup.
* For all vectors with data, define new {{setScalar()}} and {{setArrayItem()}}
methods that implement the size-aware semantics from the issue description.
{{setScalar()}} is for individual values, {{setArrayItem()}} is for items that
are members of an array. The semantics of the two cases are slightly different.
* For all vectors with data, provide a {{copyEntry()}} method to be used when
handling "overflow" rows in record batch writers.
* For all vectors with data, provide an {{exchange()}} method that swaps the
buffers between two vectors. Unlike the {{TransferPair}} mechanism, this
exchange is for vectors within a single operator, using the same allocator, so
no memory accounting is needed. Used when swapping between the "full" and
"overflow" batches in the record batch writer.
* In each fixed-width vector, define a constant for the value width rather than
using a (generated) "magic number".
* In each fixed-width vector, create a constant for the maximum number of
values that fit in either 64K or the maximum vector length.
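
To illustrate the bounded-set semantics listed above, here is a minimal sketch of a fixed-width vector whose set method grows the backing storage like {{setSafe()}} but returns false once growth would pass a size limit. {{SketchIntVector}}, its constructor parameters and its {{int[]}} backing store are assumptions made for illustration only; they are not the Drill classes or signatures, which operate on direct memory buffers.

```java
import java.util.Arrays;

/** Hypothetical stand-in for a fixed-width vector; not Drill's actual class. */
final class SketchIntVector {
    /** Bytes per value, named rather than left as a magic number. */
    static final int VALUE_WIDTH = 4;

    private final int maxValueCount; // derived from the buffer size limit
    private int[] data;

    SketchIntVector(int initialValueCount, int maxBufferBytes) {
        this.maxValueCount = maxBufferBytes / VALUE_WIDTH;
        this.data = new int[Math.max(1, Math.min(initialValueCount, maxValueCount))];
    }

    /**
     * Writes the value, growing the storage if needed.
     * Returns false, leaving the vector unchanged, if the index lies past the limit.
     */
    boolean setScalar(int index, int value) {
        if (index < 0) {
            throw new IndexOutOfBoundsException("negative index: " + index);
        }
        if (index >= maxValueCount) {
            return false; // growing to hold this index would exceed the limit
        }
        if (index >= data.length) {
            long newLength = data.length;
            while (newLength <= index) {
                newLength *= 2; // double the capacity until the index fits
            }
            // Never grow past the limit, even if doubling overshoots it.
            data = Arrays.copyOf(data, (int) Math.min(newLength, maxValueCount));
        }
        data[index] = value;
        return true;
    }

    int get(int index) {
        return data[index];
    }

    int capacity() {
        return data.length;
    }
}
```

With a limit of {{16 * 1024 * 1024}} bytes, this sketch would accept 4,194,304 four-byte values before returning false. The caller checks a boolean rather than catching an exception, which is what lets a batch writer decide what to do with the value that did not fit.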
This work required reviewing much existing code. A number of cosmetic cleanups
were done as problems were noticed.
A unit test verifies the semantics of the new methods for typical generated
required, optional and repeated vectors for both the fixed-width and
variable-width cases.
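
The interplay of the bounded set, {{copyEntry()}} and {{exchange()}} in a record batch writer can be sketched as follows. This is only one way to read the description above, not the writer that this work builds toward: {{SketchColumn}}, {{SketchBatchWriter}}, {{writeRow()}} and {{harvest()}} are hypothetical names, the columns hold plain {{int}} values, and each column is given its own value limit so that a row can overflow partway through.

```java
/** Hypothetical column with a fixed value limit; not a Drill vector. */
final class SketchColumn {
    private int[] data;

    SketchColumn(int maxValues) {
        data = new int[maxValues];
    }

    /** Bounded set: returns false instead of throwing when the column is full. */
    boolean setScalar(int index, int value) {
        if (index >= data.length) {
            return false;
        }
        data[index] = value;
        return true;
    }

    int get(int index) {
        return data[index];
    }

    /** Copies one entry from another column into this one. */
    void copyEntry(int toIndex, SketchColumn from, int fromIndex) {
        data[toIndex] = from.data[fromIndex];
    }

    /** Swaps backing storage with another column; nothing is copied. */
    void exchange(SketchColumn other) {
        int[] tmp = data;
        data = other.data;
        other.data = tmp;
    }
}

/** Hypothetical writer that diverts the row that does not fit to an overflow batch. */
final class SketchBatchWriter {
    private final SketchColumn[] current;
    private final SketchColumn[] overflow;
    private int rowCount;           // complete rows in the current batch
    private boolean hasOverflowRow; // true once a row has been diverted

    /** Each limit is assumed to be at least 1. */
    SketchBatchWriter(int... maxValuesPerColumn) {
        current = new SketchColumn[maxValuesPerColumn.length];
        overflow = new SketchColumn[maxValuesPerColumn.length];
        for (int c = 0; c < maxValuesPerColumn.length; c++) {
            current[c] = new SketchColumn(maxValuesPerColumn[c]);
            overflow[c] = new SketchColumn(maxValuesPerColumn[c]);
        }
    }

    /** Returns true if the row fit; false if it went to overflow and the batch is full. */
    boolean writeRow(int... values) {
        if (hasOverflowRow) {
            throw new IllegalStateException("harvest the full batch first");
        }
        for (int col = 0; col < values.length; col++) {
            if (!current[col].setScalar(rowCount, values[col])) {
                // Columns already written for this row move to overflow row 0.
                for (int done = 0; done < col; done++) {
                    overflow[done].copyEntry(0, current[done], rowCount);
                }
                // The remaining columns are written straight into overflow row 0.
                for (int rest = col; rest < values.length; rest++) {
                    overflow[rest].setScalar(0, values[rest]);
                }
                hasOverflowRow = true;
                return false;
            }
        }
        rowCount++;
        return true;
    }

    /** Returns the complete rows, then makes the overflow row the start of the next batch. */
    int[][] harvest() {
        int[][] rows = new int[rowCount][current.length];
        for (int r = 0; r < rowCount; r++) {
            for (int c = 0; c < current.length; c++) {
                rows[r][c] = current[c].get(r);
            }
        }
        for (int c = 0; c < current.length; c++) {
            current[c].exchange(overflow[c]);
        }
        rowCount = hasOverflowRow ? 1 : 0;
        hasOverflowRow = false;
        return rows;
    }
}
```

In this sketch a writer created with limits of 4 and 3 values accepts three rows; the fourth row fails on the second column, so its first column is copied to the overflow batch and the row becomes row 0 of the next batch after {{harvest()}}. The swap in {{exchange()}} moves no data, which mirrors the point that both batches belong to the same operator and allocator.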
> Provide size-aware set operations in value vectors
> --------------------------------------------------
>
> Key: DRILL-5517
> URL: https://issues.apache.org/jira/browse/DRILL-5517
> Project: Apache Drill
> Issue Type: Improvement
> Affects Versions: 1.11.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> DRILL-5211 describes a memory fragmentation issue in Drill. The resolution is
> to limit vector sizes to 16 MB (the size of Netty memory allocation "slabs").
> The effort starts by providing "size-aware" set operations in value vectors,
> which:
> * Operate as {{setSafe()}} while vectors are below 16 MB.
> * Return false if setting the value (and growing the vector) would exceed the
> vector limit.
> The methods in value vectors then become the foundation on which we can
> construct size-aware record batch "writers."