[ 
https://issues.apache.org/jira/browse/DRILL-6236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16495973#comment-16495973
 ] 

ASF GitHub Bot commented on DRILL-6236:
---------------------------------------

Ben-Zvi commented on a change in pull request #1227: DRILL-6236: batch sizing 
for hash join
URL: https://github.com/apache/drill/pull/1227#discussion_r191964302
 
 

 ##########
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/record/JoinBatchMemoryManager.java
 ##########
 @@ -85,13 +71,50 @@ public int update(int inputIndex, int outputPosition) {
   }
 
   @Override
-  public RecordBatchSizer.ColumnSize getColumnSize(String name) {
-    RecordBatchSizer leftSizer = getRecordBatchSizer(LEFT_INDEX);
-    RecordBatchSizer rightSizer = getRecordBatchSizer(RIGHT_INDEX);
+  public int update(int inputIndex, int outputPosition, boolean useAggregate) {
+    switch (inputIndex) {
+      case LEFT_INDEX:
 
 Review comment:
   A cleanup suggestion: there are too many "update()" methods. Also, the LEFT 
never uses the aggregate, and the RIGHT always uses it. So how about this instead:
   ```
   // Shared helper ("foo" is a placeholder name): re-size the batch, refresh the
   // incoming stats, then pick the average or the latest row width.
   private int foo(RecordBatch batch, int inputIndex, boolean useAggregate) {
     setRecordBatchSizer(inputIndex, new RecordBatchSizer(batch));
     updateIncomingStats(inputIndex);
     return useAggregate
         ? (int) getAvgInputRowWidth(inputIndex)
         : getRecordBatchSizer(inputIndex).getRowAllocSize();
   }

   // RIGHT always uses the aggregate (average) row width.
   public int updateRight(RecordBatch batch, int outputPosition) {
     rightRowWidth = foo(batch, RIGHT_INDEX, true);
     return updateInternal(outputPosition);
   }

   // LEFT never uses the aggregate; it takes the current batch's row width.
   public int updateLeft(RecordBatch batch, int outputPosition) {
     leftRowWidth = foo(batch, LEFT_INDEX, false);
     return updateInternal(outputPosition);
   }
   ```
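
   To make the intended behaviour concrete, here is a minimal standalone sketch 
of the same shape. It does not use Drill's classes: JoinSizerSketch, rowWidth(), 
the (batchRowWidth, batchRowCount) parameters and the "budget divided by combined 
row width" rule in updateInternal() are all assumptions made for illustration, 
not the actual JoinBatchMemoryManager logic.

```java
// Hypothetical stand-in for the memory manager; none of these names are Drill APIs.
class JoinSizerSketch {
  static final int LEFT_INDEX = 0;
  static final int RIGHT_INDEX = 1;

  // Running totals per input, used to compute the aggregate (average) row width.
  private final long[] totalBytes = new long[2];
  private final long[] totalRows = new long[2];

  private int leftRowWidth;
  private int rightRowWidth;
  private final int outputBatchBytes; // assumed memory budget for one output batch

  JoinSizerSketch(int outputBatchBytes) {
    this.outputBatchBytes = outputBatchBytes;
  }

  // Single shared helper: record the batch, then return either the running
  // average row width (useAggregate) or the width of the batch just seen.
  private int rowWidth(int inputIndex, int batchRowWidth, int batchRowCount,
                       boolean useAggregate) {
    totalBytes[inputIndex] += (long) batchRowWidth * batchRowCount;
    totalRows[inputIndex] += batchRowCount;
    if (!useAggregate || totalRows[inputIndex] == 0) {
      return batchRowWidth;
    }
    return (int) (totalBytes[inputIndex] / totalRows[inputIndex]);
  }

  // RIGHT always uses the aggregate width.
  public int updateRight(int batchRowWidth, int batchRowCount) {
    rightRowWidth = rowWidth(RIGHT_INDEX, batchRowWidth, batchRowCount, true);
    return updateInternal();
  }

  // LEFT never uses the aggregate; only the latest batch counts.
  public int updateLeft(int batchRowWidth, int batchRowCount) {
    leftRowWidth = rowWidth(LEFT_INDEX, batchRowWidth, batchRowCount, false);
    return updateInternal();
  }

  // Assumed rule: output row count = budget / combined row width.
  private int updateInternal() {
    int outputRowWidth = Math.max(1, leftRowWidth + rightRowWidth);
    return outputBatchBytes / outputRowWidth;
  }
}
```

   With this shape, callers never pass a boolean; the choice between the 
average and the latest width is fixed by which method they call.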
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> batch sizing for hash join
> --------------------------
>
>                 Key: DRILL-6236
>                 URL: https://issues.apache.org/jira/browse/DRILL-6236
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Execution - Flow
>    Affects Versions: 1.13.0
>            Reporter: Padma Penumarthy
>            Assignee: Padma Penumarthy
>            Priority: Major
>             Fix For: 1.14.0
>
>
> limit output batch size for hash join based on memory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
