Ben-Zvi commented on a change in pull request #1650: DRILL-6707: Allow Merge-Join batch resizing to go smaller than current outgoing row count
URL: https://github.com/apache/drill/pull/1650#discussion_r265366168
 
 

 ##########
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/record/JoinBatchMemoryManager.java
 ##########
 @@ -40,7 +40,27 @@ public JoinBatchMemoryManager(int outputBatchSize, RecordBatch leftBatch,
     this.columnsToExclude = excludedColumns;
   }
 
-  private int updateInternal(int inputIndex, int outputPosition,  boolean useAggregate) {
+  /**
+   * Update the memory manager parameters based on the new incoming batch
+   *
+   * Notice three (possibly) different "row counts" for the outgoing batches:
+   *
+   *  1. The rowCount that the current outgoing batch was allocated with (always a power of 2; e.g. 8192)
+   *  2. The new rowCount computed based on the newly seen input rows (always a power of 2); may be bigger than (1) if the
+   *     new input rows are much smaller than before (e.g. 16384), or smaller (e.g. 4096) if the new rows are much wider.
+   *     Subsequent outgoing batches would be allocated based on this (2) new rowCount.
+   *  3. The target rowCount for the current outgoing batch. While initially (1), it may be resized down if the new rows
+   *     are getting bigger. In any case it won't be resized above (1) (to avoid IOOB) or below the current number of rows
+   *     in that batch (i.e., outputPosition). (Need not be a power of two; e.g., 7983).
+   *
 
 Review comment:
   Add a comment about the "- 1" issue (e.g., power of 2, minus 1).
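
 For illustration only, here is a minimal sketch of how the three row counts
 in the quoted Javadoc could relate, and of what a "power of 2, minus 1"
 rounding might look like. It is not the code from this pull request: the
 class name, method names, and the MIN_ROWS / MAX_ROWS bounds are all
 hypothetical, and the reading of the "- 1" issue (a count such as 8191
 instead of 8192) is an assumption based on the review comment alone.

```java
// Hypothetical sketch; none of these names come from the Drill code base.
class RowCountSketch {
    // Assumed bounds, chosen only so the example is self-contained.
    static final int MIN_ROWS = 1;
    static final int MAX_ROWS = 65535;

    // Rounds rowCount down to the nearest power of two, subtracts 1,
    // then clamps the result to [MIN_ROWS, MAX_ROWS].
    static int powerOfTwoMinusOne(int rowCount) {
        int pow2 = Integer.highestOneBit(Math.max(rowCount, 1));
        return Math.min(MAX_ROWS, Math.max(pow2 - 1, MIN_ROWS));
    }

    // Row count (3): the target for the current outgoing batch.
    //   allocatedRowCount = row count (1), what the batch was allocated with
    //   newRowCount       = row count (2), computed from the new input rows
    //   outputPosition    = rows already written to the current batch
    static int targetForCurrentBatch(int allocatedRowCount,
                                     int newRowCount,
                                     int outputPosition) {
        // Never grow past the allocation (would risk an IOOB).
        int capped = Math.min(newRowCount, allocatedRowCount);
        // Never shrink below the rows already in the batch.
        return Math.max(capped, outputPosition);
    }

    public static void main(String[] args) {
        // Rows got wider after 7983 rows were already written.
        System.out.println(targetForCurrentBatch(8192, 4096, 7983));
        // Rows got narrower; the current batch stays at its allocation.
        System.out.println(targetForCurrentBatch(8192, 16384, 100));
        // The "minus 1" rounding applied to an arbitrary estimate.
        System.out.println(powerOfTwoMinusOne(10000));
    }
}
```

 With the Javadoc's own numbers, a batch allocated for 8192 rows that already
 holds 7983 rows keeps a target of 7983 even when the new row count drops to
 4096, which is why (3) need not be a power of two.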
