GitHub user ppadma opened a pull request:

    https://github.com/apache/drill/pull/1125

     DRILL-6126: Allocate memory for value vectors upfront in flatten operator

    Made changes to allocate memory upfront for the flatten operator, based on 
sizing calculations.
    This requires allocating a single column (which can be nested) for a 
particular record count and set of allocation hints. Refactored the code a bit 
to support that.
    Instead of assuming a worst-case fragmentation factor of 2, changed the 
logic to round the calculated number of rows down to the nearest power of two. 
This allows us to pack value vectors more densely and helps with memory 
utilization.
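
    A minimal sketch of the rounding step, for illustration only. The class 
and method names here are hypothetical and not taken from the Drill code base, 
and the 16 MB budget and 100-byte row width are made-up inputs. It assumes the 
row count comes from dividing a memory budget by an estimated row width, and 
that a batch should always hold at least one row.

```java
// Hypothetical helper: illustrates rounding a computed row count down to a
// power of two. Not the actual Drill implementation.
final class RowCountRounding {

    // Returns the largest power of two that is <= n.
    // For n < 1 this returns 1, an assumption made so a batch is never empty.
    static int roundDownToPowerOfTwo(int n) {
        return n < 1 ? 1 : Integer.highestOneBit(n);
    }

    // Computes how many rows fit in the budget, then rounds that count down
    // to a power of two.
    static int rowsPerBatch(long memoryBudgetBytes, int rowWidthBytes) {
        if (rowWidthBytes <= 0) {
            throw new IllegalArgumentException("rowWidthBytes must be positive");
        }
        long rows = memoryBudgetBytes / rowWidthBytes;
        int capped = (int) Math.min(rows, Integer.MAX_VALUE);
        return roundDownToPowerOfTwo(capped);
    }

    public static void main(String[] args) {
        // 16 MB / 100 bytes = 167772 rows, rounded down to 2^17.
        System.out.println(rowsPerBatch(16L * 1024 * 1024, 100)); // prints 131072
    }
}
```

    With a power-of-two row count, fixed-width vectors sized for that count do 
not need the extra headroom that a blanket factor of 2 would reserve.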
    RepeatedMapVector and RepeatedListVector extended 
RepeatedFixedWidthVectorLike. This is wrong and was causing problems in the 
allocation logic (specifically allocatePrecomputedChildCount in 
AllocationHelper). Fixed that.
    This PR has 2 commits: one for all of the above, and a second one for
    DRILL-6162: Enhance record batch sizer to retain nesting information.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ppadma/drill DRILL-6126

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/1125.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1125
    
----
commit 58c6b9ad584e56c71d982feaaa43ad32b5011eef
Author: Padma Penumarthy <ppenumar97@...>
Date:   2018-02-21T17:33:12Z

    DRILL-6162: Enhance record batch sizer to retain nesting information for 
map columns.

commit f7c09131179b75d10ffe195785c9aef3b9c7ed97
Author: Padma Penumarthy <ppenumar97@...>
Date:   2018-02-21T17:35:47Z

    DRILL-6126: Allocate memory for value vectors upfront in flatten operator

----

