Ben-Zvi commented on issue #1650: DRILL-6707: Allow Merge-Join batch resizing to go smaller than current outgoing row count URL: https://github.com/apache/drill/pull/1650#issuecomment-471149386

Pushed another commit, with the following changes:

(1) Renamed all the "TargetOutputRowCount" occurrences to "MaxRowCount" ...
(2) Removed the "minus 1" from `setCurrentOutgoingMaxRowCount()` (though this should probably be restored; see below).
(3) In the record batch memory manager, made `MAX_NUM_ROWS = ValueVector.MAX_ROW_COUNT - 1`, so all the references start one below the 64K limit.
(4) `setOutputRowCount()` has two implementations; the second just set the given value as is, so it was changed to check for a power of two (greater than 1) and then subtract 1. (See the arithmetic sketch after this comment.)

The test `testUnnestLimitBatchSize_WithExcludedCols()` still fails. The reason may be related to `LateralJoinBatch.innerNext()`, which calls "update" on the memory manager (twice: left and right) and **AFTER** that calls `allocateVectors()`. However, the update assumed that the outgoing vectors are already allocated. Also, `allocateVectors()` uses "outputIndex == 0" to check whether the allocation has not happened yet, which may be wrong (e.g., allocated, but the batch is still empty). In the failing test the allocated batch was of size 255 but empty, and the update calculated that the remaining memory fits 128 entries. So the new allocation size became 127, while the "max rows" was calculated as MIN(255, 0 + 128), hence 128. ==> Thus undoing (2) above could fix that.
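For clarity, here is a minimal standalone sketch of the arithmetic described in (4) and in the failing-test analysis. It is plain Java and does not use Drill's actual `RecordBatchMemoryManager` API; the class and method names below are illustrative only. It shows the "one below a power of two" allocation target and the `MIN(allocated size, outputIndex + rows that fit)` max-row computation that yields 128 against an allocation target of 127.

```java
// Illustrative sketch only -- not Drill's actual RecordBatchMemoryManager API.
public class BatchSizerSketch {

  // Mirrors ValueVector.MAX_ROW_COUNT (64K); references start one below it.
  static final int MAX_ROW_COUNT = 64 * 1024;
  static final int MAX_NUM_ROWS = MAX_ROW_COUNT - 1;

  private int outputRowCount = MAX_NUM_ROWS;

  // Change (4): accept only a power of two (> 1), then keep one less than it.
  public void setOutputRowCount(int targetRowCount) {
    assert targetRowCount > 1 && (targetRowCount & (targetRowCount - 1)) == 0
        : "target must be a power of two greater than 1";
    outputRowCount = Math.min(targetRowCount - 1, MAX_NUM_ROWS);
  }

  public int getOutputRowCount() {
    return outputRowCount;
  }

  // The "max rows" computation from the failing-test analysis: limited by the
  // already-allocated batch size and by how many more rows the remaining
  // memory can hold past the current output index.
  public static int currentOutgoingMaxRowCount(int allocatedBatchSize,
                                               int outputIndex,
                                               int rowsThatFitInRemainingMemory) {
    return Math.min(allocatedBatchSize, outputIndex + rowsThatFitInRemainingMemory);
  }

  public static void main(String[] args) {
    BatchSizerSketch sizer = new BatchSizerSketch();

    // Failing-test numbers: batch allocated for 255 rows but still empty
    // (outputIndex == 0); the update decides the remaining memory fits 128 rows.
    sizer.setOutputRowCount(128);                        // allocation target becomes 127
    int maxRows = currentOutgoingMaxRowCount(255, 0, 128);

    System.out.println("allocation target = " + sizer.getOutputRowCount()); // 127
    System.out.println("max rows          = " + maxRows);                   // 128 > 127
    // Restoring the "minus 1" on the max-row-count side (undoing change (2))
    // would bring the two numbers back in line.
  }
}
```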
