[jira] [Commented] (DRILL-5657) Implement size-aware result set loader

ASF GitHub Bot (JIRA) Fri, 25 Aug 2017 14:52:20 -0700

    [ 
https://issues.apache.org/jira/browse/DRILL-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142282#comment-16142282
 ]


ASF GitHub Bot commented on DRILL-5657:
---------------------------------------

Github user paul-rogers commented on the issue:

    https://github.com/apache/drill/pull/914
  
    Merged a set of enhancements & bug fixes. Compared to the first set of 
commits:
    
    * Provides tools to print the sometimes-complex tuple model, schema, and 
writer objects used for complex data types.
    * Sharpens up the specification of the client API used to write data, 
especially for obscure cases such as abandoning rows, vector overflow inside 
deep structures, etc.
    * Revises the internal “WriterEvents” API, which passes events down the 
writer hierarchy, to match the client API revisions.
    * Fully handles the case where a client chooses to “abandon” and rewrite a 
row (perhaps based on a filter condition).
    * Makes a couple of minor vector revisions, e.g. RepeatedMapVector now 
allows building an instance from an existing offset vector (needed when 
harvesting batches with overflow).
    * Adds more Javadoc.
    * Adds many unit tests.
    * Includes fixes made in response to unit tests.
    * Moves the vector cache into the result set loader layer.
    
    Remaining open issues:
    
    * Column projection
    * Support for lists and repeated lists
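
The "abandon and rewrite a row" case listed above can be pictured with a
minimal sketch. Everything below is hypothetical and written only for
illustration: the class and method names (RowWriter, setInt, save, harvest) are
assumptions, not the API in this pull request. The idea shown is that a row
counts only once it is saved, so a row that fails a filter is simply
overwritten by the next write.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch only; none of these names are Drill's API. */
class AbandonRowSketch {

  /** Minimal single-column row writer: a row counts only once save() is called. */
  static class RowWriter {
    private final List<Integer> vector = new ArrayList<>();
    private int rowCount; // number of saved (committed) rows

    /** Writes the column value for the current, not-yet-saved row. */
    void setInt(int value) {
      if (rowCount < vector.size()) {
        vector.set(rowCount, value); // overwrite the value left by an abandoned row
      } else {
        vector.add(value);
      }
    }

    /** Commits the current row; skipping this call abandons the row.
     *  This sketch assumes setInt() was called first. */
    void save() {
      rowCount++;
    }

    /** Returns only the saved rows, dropping a trailing abandoned value if present. */
    List<Integer> harvest() {
      return new ArrayList<>(vector.subList(0, rowCount));
    }
  }

  /** Writes every input value, but saves only rows that pass an "is even" filter. */
  static List<Integer> loadEvens(int[] input) {
    RowWriter writer = new RowWriter();
    for (int value : input) {
      writer.setInt(value);
      if (value % 2 == 0) {
        writer.save();
      }
      // Otherwise the row is abandoned and the next setInt() overwrites it.
    }
    return writer.harvest();
  }

  public static void main(String[] args) {
    System.out.println(loadEvens(new int[] {1, 2, 3, 4, 5}));
  }
}
```

In this sketch the filter runs after the value has been written, which is the
reason a writer would need to support abandoning a row at all.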


> Implement size-aware result set loader
> --------------------------------------
>
>                 Key: DRILL-5657
>                 URL: https://issues.apache.org/jira/browse/DRILL-5657
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: Future
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>             Fix For: Future
>
>
> A recent extension to Drill's set of test tools created a "row set" 
> abstraction that lets us create and verify record batches with very few 
> lines of code. Part of this work involved creating a set of "column 
> accessors" in the vector subsystem. Column readers provide a uniform API to 
> obtain data from columns (vectors), while column writers provide a uniform 
> writing interface.
> DRILL-5211 discusses a set of changes to limit value vectors to 16 MB in size 
> (to avoid memory fragmentation due to Drill's two memory allocators). The 
> column accessors have proven to be so useful that they will be the basis for 
> the new, size-aware writers used by Drill's record readers.
> A step in that direction is to retrofit the column writers to use the 
> size-aware {{setScalar()}} and {{setArray()}} methods introduced in 
> DRILL-5517.
> Since the test framework row set classes are (at present) the only consumer 
> of the accessors, those classes must also be updated with the changes.
> This then allows us to add a new "row mutator" class that handles size-aware 
> vector writing, including the case in which a vector fills in the middle of a 
> row.
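
As a rough illustration of the mid-row overflow case described above, the
sketch below uses a tiny byte limit as a stand-in for the 16 MB cap. All names
here (Loader, set, saveRow, finish) are hypothetical and are not the "row
mutator" or any Drill API. It assumes one byte per character and lets a single
row exceed the limit. When a column would overflow partway through a row, the
completed rows are set aside as a full batch and the values already written
for the in-flight row move to a fresh batch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of size-aware writing; not Drill's implementation. */
class OverflowSketch {

  static class Loader {
    private final int width;       // number of columns
    private final int limitBytes;  // per-column size cap (stand-in for 16 MB)
    private final List<List<List<String>>> fullBatches = new ArrayList<>();
    private List<List<String>> columns; // active batch, one list per column
    private int[] usedBytes;
    private int rowCount;               // complete rows in the active batch

    Loader(int width, int limitBytes) {
      this.width = width;
      this.limitBytes = limitBytes;
      newBatch();
    }

    private void newBatch() {
      columns = new ArrayList<>();
      for (int c = 0; c < width; c++) {
        columns.add(new ArrayList<>());
      }
      usedBytes = new int[width];
      rowCount = 0;
    }

    /** Writes one column of the current row, rolling over first if it would not fit. */
    void set(int col, String value) {
      int size = value.length(); // assumption: one byte per character
      if (rowCount > 0 && usedBytes[col] + size > limitBytes) {
        overflow();
      }
      columns.get(col).add(value);
      usedBytes[col] += size;
    }

    void saveRow() {
      rowCount++;
    }

    /** Sets aside the complete rows and carries the partial row into a new batch. */
    private void overflow() {
      List<List<String>> old = columns;
      int complete = rowCount;
      newBatch();
      for (int c = 0; c < width; c++) {
        List<String> partial = old.get(c).subList(complete, old.get(c).size());
        for (String v : partial) {
          columns.get(c).add(v);
          usedBytes[c] += v.length();
        }
        partial.clear(); // removes the in-flight values from the old batch
      }
      fullBatches.add(old);
    }

    /** Call between rows: returns every batch, including the active one if non-empty. */
    List<List<List<String>>> finish() {
      if (rowCount > 0) {
        fullBatches.add(columns);
      }
      return fullBatches;
    }
  }

  /** Third row writes column 1, then overflows on column 0 in the middle of the row. */
  static List<List<List<String>>> demo() {
    Loader loader = new Loader(2, 8);
    loader.set(0, "aaaa"); loader.set(1, "x"); loader.saveRow();
    loader.set(0, "bbbb"); loader.set(1, "y"); loader.saveRow();
    loader.set(1, "z");    loader.set(0, "cccc"); loader.saveRow();
    return loader.finish();
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

The point of the sketch is that the client keeps writing the same row without
noticing the rollover; only the loader knows a batch boundary was crossed.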



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
