[
https://issues.apache.org/jira/browse/DRILL-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15957996#comment-15957996
]
Paul Rogers commented on DRILL-5416:
------------------------------------
The solution seems to be to change the serialization. Today we treat the vector
as the unit of serialization (creating composite buffers as needed during read
and write). Note, however, that maps are treated as a special case: the map
vector itself contains serialization code that writes its component vectors.
The revision is to use the map pattern for all vectors. Serializing vectors
becomes a tree-walk: visit each vector. If the vector is simple (has only a
buffer) serialize that. If it is composite (has more than one buffer or
vector), visit each to serialize.
Then, reverse the process on read so that each vector is backed by its own
buffer. That way, any free space is owned by a single vector, and we can
correctly account for the memory needs of each vector.
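A minimal sketch of that tree-walk, assuming hypothetical `SimpleVector` and
`CompositeVector` stand-ins and the existing (length, bytes) framing; this is
illustration only, not Drill's actual `ValueVector` API:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.List;

// Hypothetical stand-ins for Drill's value vectors, for illustration only.
interface Vector {
    void writeTo(DataOutputStream out) throws IOException;
}

// A simple vector owns exactly one buffer; serialize it as (length, bytes).
class SimpleVector implements Vector {
    private final byte[] buffer;
    SimpleVector(byte[] buffer) { this.buffer = buffer; }
    @Override
    public void writeTo(DataOutputStream out) throws IOException {
        out.writeInt(buffer.length);
        out.write(buffer);
    }
}

// A composite vector (e.g. a map) owns child vectors; serializing it is a
// tree walk that visits each child in turn.
class CompositeVector implements Vector {
    private final List<Vector> children;
    CompositeVector(List<Vector> children) { this.children = children; }
    @Override
    public void writeTo(DataOutputStream out) throws IOException {
        for (Vector child : children) {
            child.writeTo(out); // simple children write their one buffer;
        }                       // composite children keep walking the tree
    }
}

public class TreeWalkSerialization {
    public static void main(String[] args) throws IOException {
        Vector map = new CompositeVector(List.of(
            new SimpleVector(new byte[] {1, 2, 3}),
            new SimpleVector(new byte[] {4, 5})));
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        map.writeTo(new DataOutputStream(bytes));
        // 4-byte length + 3 data bytes, then 4-byte length + 2 data bytes
        System.out.println(bytes.size());
    }
}
```

On read, the same walk runs in reverse: each (length, bytes) pair is
deserialized into a buffer owned by exactly one vector.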
> Vectors read from disk report incorrect memory sizes
> ----------------------------------------------------
>
> Key: DRILL-5416
> URL: https://issues.apache.org/jira/browse/DRILL-5416
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.8.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Priority: Minor
> Fix For: 1.11.0
>
>
> The external sort and revised hash agg operators spill to disk using a vector
> serialization mechanism. This mechanism serializes each vector as a (length,
> bytes) pair.
> Before spilling, if we check the memory used for a vector (using the new
> {{RecordBatchSizer}} class), we learn of the actual memory consumed by the
> vector, including any unused space in the vector.
> If we spill the vector, then reread it, the reported storage size is wrong.
> On reading, the code allocates a buffer, based on the saved length, rounded
> up to the next power of two. Then, when building the vector, we "slice" the
> read buffer, setting the memory size to the data size.
> For example, suppose we save 20 1-byte fields. The size on disk is 20 bytes.
> The read buffer is rounded up to 32 bytes (the size of the original,
> pre-spill buffer). We read the 20 bytes and create a vector by slicing the
> buffer, so the vector reports its memory size as 20, "hiding" the extra,
> unused 12 bytes.
> As a result, when computing memory sizes, we receive incorrect numbers.
> Working with false numbers means that the code cannot safely operate within a
> memory budget, causing the user to receive an unexpected OOM error.
> As it turns out, the code path that does the slicing is used only for reads
> from disk. This ticket asks to remove the slicing step: just use the
> allocated buffer directly so that the after-read vector reports the correct
> memory usage; same as the before-spill vector.
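The mismatch in the example above can be reproduced with plain arithmetic (a
sketch; `nextPowerOfTwo` is a hypothetical helper mirroring the allocator's
rounding behavior, not Drill code):

```java
public class SpillSizeMismatch {
    // Mirrors the allocator behavior described above: allocations are
    // rounded up to the next power of two (hypothetical helper).
    static int nextPowerOfTwo(int n) {
        return n <= 1 ? 1 : Integer.highestOneBit(n - 1) << 1;
    }

    public static void main(String[] args) {
        int dataSize = 20;                        // 20 one-byte fields on disk
        int allocated = nextPowerOfTwo(dataSize); // buffer actually allocated
        int reported = dataSize;                  // sliced vector reports only the data size
        System.out.println(allocated + " allocated, " + reported
            + " reported, " + (allocated - reported) + " bytes hidden");
    }
}
```

Using the allocated buffer directly, as the ticket proposes, makes the
reported size equal the allocated size, so the hidden bytes disappear.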
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)