GitHub user paul-rogers opened a pull request:

    https://github.com/apache/drill/pull/1161

    DRILL-6230: Extend row set readers to handle hyper vectors

    The current row set readers have incomplete support for hyper-vectors. To 
add full support, we need an interface that supports either single batches or 
hyper-batches. Accessing vectors in hyper-batches differs depending on whether 
the vector is at the top level or is nested. See this post for details. This PR 
also includes a simpler reader template that replaces the original three 
classes with one, in parallel with the writers.
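
    A rough sketch of the top-level versus nested distinction, not the code in 
this PR. It assumes a selection-vector entry that packs the batch index into the 
upper 16 bits and the row offset into the lower 16 bits, and it assumes that a 
nested vector must be reached through its parent map in the chosen batch. All 
class and method names here (`HyperBatchSketch`, `MapBatch`, `readTopLevel`, 
`readNested`) are hypothetical stand-ins, with plain `int[]` arrays in place of 
value vectors.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only: none of these types are Drill classes. */
final class HyperBatchSketch {

    /** Packs a batch index and a row offset into one 32-bit entry (assumed layout). */
    static int encode(int batchIndex, int rowOffset) {
        return (batchIndex << 16) | (rowOffset & 0xFFFF);
    }

    /** Upper 16 bits: which batch the row lives in. */
    static int batchIndex(int entry) {
        return entry >>> 16;
    }

    /** Lower 16 bits: the row position inside that batch. */
    static int rowOffset(int entry) {
        return entry & 0xFFFF;
    }

    /** Top-level column: one vector per batch, selected directly by batch index. */
    static int readTopLevel(List<int[]> hyperVector, int entry) {
        return hyperVector.get(batchIndex(entry))[rowOffset(entry)];
    }

    /** Hypothetical stand-in for a map vector that owns one child vector. */
    static final class MapBatch {
        final int[] child;

        MapBatch(int[] child) {
            this.child = child;
        }
    }

    /** Nested column: resolve the parent map for the batch first, then its child. */
    static int readNested(List<MapBatch> hyperMap, int entry) {
        MapBatch parent = hyperMap.get(batchIndex(entry));
        return parent.child[rowOffset(entry)];
    }

    public static void main(String[] args) {
        List<int[]> topLevel = Arrays.asList(new int[] {10, 11, 12}, new int[] {20, 21, 22});
        List<MapBatch> maps = Arrays.asList(
            new MapBatch(new int[] {1, 2, 3}),
            new MapBatch(new int[] {4, 5, 6}));

        int entry = encode(1, 2); // batch 1, row 2
        System.out.println(readTopLevel(topLevel, entry));
        System.out.println(readNested(maps, entry));
    }
}
```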
    
    Key changes:
    
    * Refactor the readers to generate just the required reader, then build up 
the optional and repeated readers as layers on top of the generated reader. 
This is the same structure that the writers already use.
    * Add and test support for hyper-vectors.
    * Extend the existing "vector accessor" abstraction to fully support the 
highly complex process of locating nested vectors (those within a map or union) 
in a hyper-batch.
    * Introduce the idea of a "null state" abstraction to handle the messy null 
handling in unions and repeated lists.
    * Modify tests as needed for the new internal format of vector readers.
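
    The layering and "null state" ideas in the list above could look roughly 
like the following. This is a minimal sketch under stated assumptions, not the 
accessor code in this PR: the names `RequiredReader`, `NullState`, 
`OptionalReader`, and `RepeatedReader` are hypothetical, an `int[]` stands in 
for the values vector, a `boolean[]` for the "is set" bits of a nullable 
vector, and a second `int[]` for the offsets of a repeated vector.

```java
/** Illustrative sketch only: hypothetical names, not the Drill accessor API. */
final class LayeredReaderSketch {

    /** Stand-in for the one generated reader: reads a non-null value at a position. */
    interface RequiredReader {
        int getInt(int position);
    }

    /** Answers "is this row null?" without exposing how nulls are stored. */
    interface NullState {
        boolean isNull(int row);
    }

    /** Null state for required columns: never null. */
    static final NullState NEVER_NULL = row -> false;

    /** Null state for nullable columns, backed by an "is set" flag per row. */
    static NullState fromIsSetBits(boolean[] isSet) {
        return row -> !isSet[row];
    }

    /** Optional layer: the same required reader plus a null state. */
    static final class OptionalReader {
        private final RequiredReader values;
        private final NullState nulls;

        OptionalReader(RequiredReader values, NullState nulls) {
            this.values = values;
            this.nulls = nulls;
        }

        boolean isNull(int row) {
            return nulls.isNull(row);
        }

        /** Returns null for null rows, otherwise delegates to the required reader. */
        Integer get(int row) {
            return nulls.isNull(row) ? null : values.getInt(row);
        }
    }

    /** Repeated layer: offsets delimit each row's run of values (length = rows + 1). */
    static final class RepeatedReader {
        private final RequiredReader values;
        private final int[] offsets;

        RepeatedReader(RequiredReader values, int[] offsets) {
            this.values = values;
            this.offsets = offsets;
        }

        int size(int row) {
            return offsets[row + 1] - offsets[row];
        }

        int get(int row, int index) {
            return values.getInt(offsets[row] + index);
        }
    }

    public static void main(String[] args) {
        int[] data = {10, 20, 30, 40};
        RequiredReader required = position -> data[position];

        OptionalReader optional =
            new OptionalReader(required, fromIsSetBits(new boolean[] {true, false, true, true}));
        System.out.println(optional.get(0) + " " + optional.get(1));

        RepeatedReader repeated = new RepeatedReader(required, new int[] {0, 1, 1, 4});
        System.out.println(repeated.size(2) + " " + repeated.get(2, 1));
    }
}
```

    The point of the sketch is that only the required reader depends on the 
vector type; the optional and repeated layers are type-independent wrappers, 
and the null state can be swapped without touching either.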
    
    To keep the PR from getting overly large, this PR strips out the actual 
union and list support. That support will be added in a future PR. Similarly, 
there are matching changes to writers that will also be done in a separate PR.
    
    Other minor changes:
    
    * Revise the previous utility PR. In some cases, it turns out to be 
cleaner to use a separate `mapValue()` function instead of `objArray()`, even 
though both produce an object array. Calling it `mapValue()` makes it a bit 
clearer what we're trying to accomplish.
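
    To illustrate the naming point only: the two helpers below are local 
stand-ins defined for this sketch, and the signatures of the real test 
utilities may differ. Both return the same kind of `Object[]`; the difference 
is what the call site tells the reader.

```java
import java.util.Arrays;

/** Illustrative sketch only: local stand-ins, not the actual test utilities. */
final class RowSetValueSketch {

    /** Wraps the elements of a repeated (array) column. */
    static Object[] objArray(Object... elements) {
        return elements;
    }

    /** Wraps the members of a map column; same result type, clearer intent. */
    static Object[] mapValue(Object... members) {
        return members;
    }

    public static void main(String[] args) {
        // A row with an int column, a map column (two members), and a string array column.
        Object[] row = {1, mapValue(10, "fred"), objArray("a", "b")};
        System.out.println(Arrays.deepToString(row));
    }
}
```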
    
    This PR is not needed for Drill 1.13; it can go into Drill 1.14.
    
    See [this 
post](https://github.com/paul-rogers/drill/wiki/Batch-Handling-Upgrades) for 
details of the end-state toward which this PR is one step.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/paul-rogers/drill DRILL-6230

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/1161.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1161
    
----
commit e9891c561088d2c79ab1758dc857a8a52ec253ac
Author: Paul Rogers <progers@...>
Date:   2018-03-11T07:43:36Z

    Accessor revisions

commit 6f6e3eb803793d71a5e8dba8362737bac66d923c
Author: Paul Rogers <progers@...>
Date:   2018-03-11T22:41:42Z

    Merge of exec row set readers & tests

commit 65cd6205ea8e85ac4e001634ffa24268a57ce273
Author: Paul Rogers <progers@...>
Date:   2018-03-12T00:23:35Z

    Fixed tests to remove work not in this PR

----

