[
https://issues.apache.org/jira/browse/DRILL-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142282#comment-16142282
]
ASF GitHub Bot commented on DRILL-5657:
---------------------------------------
Github user paul-rogers commented on the issue:
https://github.com/apache/drill/pull/914
Merged a set of enhancements & bug fixes. Compared to the first set of
commits:
* Provides tools to print the sometimes-complex tuple model, schema, and
writer objects used for complex data types.
* Sharpens up the specification of the client API used to write data,
especially for obscure cases such as abandoning rows, vector overflow inside
deep structures, etc.
* Revises the internal “WriterEvents” API used to pass events down the
writer hierarchy in response to the API revisions.
* Fully handles the case where a client chooses to “abandon” and rewrite a
row (perhaps based on a filter condition).
* Makes a couple of minor vector revisions. For example, RepeatedMapVector
now allows building an instance from an existing offset vector (needed when
harvesting batches with overflow).
* Adds more Javadoc.
* Adds many unit tests.
* Includes fixes made in response to the unit tests.
* Moves the vector cache into the result set loader layer.
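
To make the "abandon" and overflow cases above concrete, here is a minimal toy sketch in Java. It is not the result set loader API: the class SizeAwareWriterSketch and its methods are hypothetical names invented for illustration, a row is modeled as a single string, and a small character budget stands in for the vector size limit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model; none of these names come from Drill.
class SizeAwareWriterSketch {
    private final int limit;                                    // size budget per batch
    private final List<List<String>> full = new ArrayList<>();  // batches ready to harvest
    private List<String> batch = new ArrayList<>();             // saved rows in the active batch
    private int used = 0;                                       // budget used by saved rows
    private final StringBuilder row = new StringBuilder();      // row currently being written

    SizeAwareWriterSketch(int limit) {
        this.limit = limit;
    }

    // Write a value into the in-progress row. If the value does not fit,
    // the saved rows are set aside as a full batch and the in-progress
    // row carries over, intact, into a fresh batch.
    void write(String value) {
        if (!batch.isEmpty() && used + row.length() + value.length() > limit) {
            full.add(batch);
            batch = new ArrayList<>();
            used = 0;
        }
        row.append(value);
    }

    // Discard the in-progress row, e.g. because a filter rejected it.
    void abandon() {
        row.setLength(0);
    }

    // Commit the in-progress row to the active batch.
    void save() {
        batch.add(row.toString());
        used += row.length();
        row.setLength(0);
    }

    List<List<String>> fullBatches() {
        return full;
    }

    List<String> activeBatch() {
        return batch;
    }
}
```

In this sketch an abandoned row simply leaves no trace in the batch, and a write that overflows mid-row moves the whole partial row to the next batch rather than splitting it across two.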
Remaining open issues:
* Column projection
* Support for lists and repeated lists
> Implement size-aware result set loader
> --------------------------------------
>
> Key: DRILL-5657
> URL: https://issues.apache.org/jira/browse/DRILL-5657
> Project: Apache Drill
> Issue Type: Improvement
> Affects Versions: Future
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Fix For: Future
>
>
> A recent extension to Drill's set of test tools created a "row set"
> abstraction to allow us to create, and verify, record batches with very few
> lines of code. Part of this work involved creating a set of "column
> accessors" in the vector subsystem. Column readers provide a uniform API to
> obtain data from columns (vectors), while column writers provide a uniform
> writing interface.
> DRILL-5211 discusses a set of changes to limit value vectors to 16 MB in size
> (to avoid memory fragmentation due to Drill's two memory allocators). The
> column accessors have proven to be so useful that they will be the basis for
> the new, size-aware writers used by Drill's record readers.
> A step in that direction is to retrofit the column writers to use the
> size-aware {{setScalar()}} and {{setArray()}} methods introduced in
> DRILL-5517.
> Since the test framework row set classes are (at present) the only consumer
> of the accessors, those classes must also be updated with the changes.
> This then allows us to add a new "row mutator" class that handles size-aware
> vector writing, including the case in which a vector fills in the middle of a
> row.