Ben-Zvi commented on a change in pull request #1324: DRILL-6310: limit batch
size for hash aggregate
URL: https://github.com/apache/drill/pull/1324#discussion_r199289952
##########
File path:
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/common/HashTableTemplate.java
##########
@@ -694,7 +694,7 @@ public PutStatus put(int incomingRowIdx, IndexPointer htIdxHolder, int hashCode,
}
htIdxHolder.value = currentIdx;
return addedBatch ? PutStatus.NEW_BATCH_ADDED :
- (freeIndex + 1 > totalIndexSize) ?
+ (freeIndex + 1 > prevIndexSize + batchHolders.get(batchHolders.size()-1).getTargetBatchRowCount()) ?
Review comment:
`prevIndexSize` is used in two places: here, and in the check for whether a
new batch is needed. Both use it the same way, by adding the size of the last
batch, and both are part of the **HOT** code path, called for every row.
If this variable were set to already include the size of the last batch, the
per-row addition and the `batchHolders` lookup would be avoided.
Would that be simple to do? It requires knowing the size of the last batch at
the point where this variable is set.
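
To make the suggestion concrete, here is a minimal sketch of the idea, not
the actual `HashTableTemplate` code. The class `IndexLimitSketch`, the field
`indexLimit`, and the methods `addBatch`, `insertRow`, and the two
`lastBatchFull...` checks are hypothetical names; it also assumes that
`prevIndexSize` counts the rows covered by all batches before the last one
and that the last batch's target row count is known when the batch is added.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: all names here are hypothetical, not Drill's API.
class IndexLimitSketch {
    // Target row count of each batch, in the order the batches were added.
    private final List<Integer> targetRowCounts = new ArrayList<>();
    // Assumption: rows covered by every batch before the last one.
    private int prevIndexSize = 0;
    // Cached limit: prevIndexSize plus the last batch's target row count.
    private int indexLimit = 0;
    private int freeIndex = 0;

    // Cold path: runs once per new batch, so the limit is updated here.
    void addBatch(int targetRowCount) {
        if (!targetRowCounts.isEmpty()) {
            prevIndexSize += targetRowCounts.get(targetRowCounts.size() - 1);
        }
        targetRowCounts.add(targetRowCount);
        indexLimit = prevIndexSize + targetRowCount;
    }

    void insertRow() {
        freeIndex++;
    }

    // Hot path in the shape of the diff: a list lookup and an add per row.
    boolean lastBatchFullRecomputed() {
        return freeIndex + 1
            > prevIndexSize + targetRowCounts.get(targetRowCounts.size() - 1);
    }

    // Hot path with the cached limit: a single comparison per row.
    boolean lastBatchFullCached() {
        return freeIndex + 1 > indexLimit;
    }

    public static void main(String[] args) {
        IndexLimitSketch s = new IndexLimitSketch();
        s.addBatch(3);
        for (int i = 0; i < 3; i++) {
            s.insertRow();
        }
        // Both checks should agree at every point.
        System.out.println(s.lastBatchFullRecomputed() + " " + s.lastBatchFullCached());
    }
}
```

Whether this is simple in the real code depends on the open question above:
the cached value is only correct if the last batch's size is available, and
does not change, after the point where the variable is set.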