Ben-Zvi commented on issue #1650: DRILL-6707: Allow Merge-Join batch resizing to go smaller than current outgoing row count
URL: https://github.com/apache/drill/pull/1650#issuecomment-471149386
 
 
   Pushed another commit, where the changes are:
   
   (1) Renamed all the "TargetOutputRowCount" to "MaxRowCount" ...
   (2) Removed the "minus 1" for `setCurrentOutgoingMaxRowCount()` (though this should be restored; see below).
   (3) In the record batch memory manager, made
             `MAX_NUM_ROWS = ValueVector.MAX_ROW_COUNT - 1`
        so all the references start 1 below the 64K limit.
   (4) `setOutputRowCount()` has two implementations; the second just sets the given value as is. Changed it to check for a power of two (and greater than 1), and then subtract 1 (see the sketch after this list).
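
   A minimal sketch of what (3) and (4) amount to, assuming a standalone class with illustrative names (the real code lives in Drill's record batch memory manager and is not reproduced here):

   ```java
   // Hypothetical sketch of changes (3) and (4); names are illustrative only.
   public class RecordBatchMemoryManagerSketch {
     // Change (3): start one below the 64K value-vector limit.
     public static final int MAX_NUM_ROWS = 65536 - 1;  // i.e. ValueVector.MAX_ROW_COUNT - 1

     private int outputRowCount = MAX_NUM_ROWS;

     // Change (4): the variant that used to store the given value as-is now
     // checks for a power of two (greater than 1) and subtracts 1 first.
     public void setOutputRowCount(int targetOutputRowCount) {
       if (targetOutputRowCount > 1 && Integer.bitCount(targetOutputRowCount) == 1) {
         targetOutputRowCount--;               // e.g. 128 becomes 127
       }
       outputRowCount = Math.min(targetOutputRowCount, MAX_NUM_ROWS);
     }

     public int getOutputRowCount() {
       return outputRowCount;
     }
   }
   ```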
   
   The test `testUnnestLimitBatchSize_WithExcludedCols()` still fails. The reason may have to do with `LateralJoinBatch.innerNext()`, which calls "update" on the memory manager (twice: left and right), and **AFTER** that calls allocateVectors(). However, the update assumes that the outgoing vectors are already allocated.
      Also, allocateVectors() uses "outputIndex == 0" to check whether the allocation has not yet happened. This may be wrong (e.g., the vectors are allocated, but the batch is still empty).
       In the failed test, the allocated batch was of size 255 but empty, and the update calculated that the remaining memory fits 128 entries. So the new allocation size became 127, while the "max rows" was calculated as MIN(255, 0 + 128), hence 128. ==> Thus undoing (2) above could fix that.
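
    A small sketch of that arithmetic, using the numbers reported by the failing test (all variable names here are hypothetical):

    ```java
    // Illustration of the failing-test arithmetic from the comment above.
    public class LateralJoinMaxRowsExample {
      public static void main(String[] args) {
        int allocatedBatchSize = 255;        // outgoing batch allocated for 255 rows, but still empty
        int outgoingRowCount = 0;            // batch is empty
        int rowsFittingInRemaining = 128;    // what update() computed from the remaining memory

        // With change (4), 128 is a power of two, so the next allocation size becomes 127.
        int nextAllocationSize = rowsFittingInRemaining - 1;   // 127

        // The "max rows" limit, computed without the minus-1 that change (2) removed:
        int maxRows = Math.min(allocatedBatchSize, outgoingRowCount + rowsFittingInRemaining); // 128

        System.out.println("allocation size = " + nextAllocationSize + ", max rows = " + maxRows);
        // maxRows (128) exceeds nextAllocationSize (127); restoring the minus-1 of (2)
        // would presumably lower the limit back to 127 and keep the two consistent.
      }
    }
    ```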
    
    
