GitHub user ppadma opened a pull request: https://github.com/apache/drill/pull/1125
    DRILL-6126: Allocate memory for value vectors upfront in flatten operator

    Made changes to allocate memory upfront for the flatten operator, based on sizing calculations.
    This requires allocating a single column (which can be nested) for a particular record count and set of allocation hints, so the code was refactored a bit to support that.

    Instead of assuming a worst-case fragmentation factor of 2, the logic now rounds the calculated number of rows down to the nearest power of two. This allows us to pack value vectors more densely and helps with memory utilization.

    RepeatedMapVector and RepeatedListVector are extended from RepeatedFixedWidthVectorLike. This is wrong and causes problems in the allocation logic (more specifically, allocatePrecomputedChildCount in AllocationHelper). Fixed that.

    This PR has 2 commits: one for all of the above, and a second one for DRILL-6162: Enhance record batch sizer to retain nesting information.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ppadma/drill DRILL-6126

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/1125.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1125

----
commit 58c6b9ad584e56c71d982feaaa43ad32b5011eef
Author: Padma Penumarthy <ppenumar97@...>
Date:   2018-02-21T17:33:12Z

    DRILL-6162: Enhance record batch sizer to retain nesting information for map columns.

commit f7c09131179b75d10ffe195785c9aef3b9c7ed97
Author: Padma Penumarthy <ppenumar97@...>
Date:   2018-02-21T17:35:47Z

    DRILL-6126: Allocate memory for value vectors upfront in flatten operator

----

---
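
For reviewers who want a concrete picture of the power-of-two rounding described in the PR, here is a minimal sketch in Java. It is not the code from this PR: the class name, method name, memory budget, and row width below are all hypothetical and chosen only for illustration. The assumption is that a row count is first derived from a memory budget and a per-row width, and is then rounded down so that the count is itself a power of two.

```java
// Hypothetical illustration only; not the implementation in this pull request.
final class RowCountRounding {

    /**
     * Returns the largest power of two that is less than or equal to rowCount.
     * Returning 1 for non-positive input is an assumption made for this sketch.
     */
    static int roundDownToPowerOfTwo(int rowCount) {
        if (rowCount < 1) {
            return 1;
        }
        // Integer.highestOneBit keeps only the most significant set bit,
        // which is exactly the nearest power of two at or below the value.
        return Integer.highestOneBit(rowCount);
    }

    public static void main(String[] args) {
        int budgetBytes = 16 * 1024 * 1024; // assumed output batch memory budget
        int rowWidthBytes = 1200;           // assumed row width reported by a sizer
        int rawRows = budgetBytes / rowWidthBytes;
        System.out.println(rawRows + " -> " + roundDownToPowerOfTwo(rawRows));
    }
}
```

With this helper, a calculated count of 1000 rows becomes 512, and a count that is already a power of two, such as 4096, is left unchanged.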