Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/822#discussion_r118811771
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggTemplate.java ---
    @@ -18,82 +18,161 @@
     package org.apache.drill.exec.physical.impl.aggregate;
     
     import java.io.IOException;
    +import java.io.InputStream;
    +import java.io.OutputStream;
     import java.util.ArrayList;
     import java.util.Iterator;
     import java.util.List;
    +import java.util.concurrent.TimeUnit;
     
     import javax.inject.Named;
     
    +import com.google.common.base.Stopwatch;
    +
    +import org.apache.drill.common.exceptions.DrillRuntimeException;
     import org.apache.drill.common.expression.ErrorCollector;
     import org.apache.drill.common.expression.ErrorCollectorImpl;
     import org.apache.drill.common.expression.ExpressionPosition;
     import org.apache.drill.common.expression.FieldReference;
     import org.apache.drill.common.expression.LogicalExpression;
    +
    +import org.apache.drill.common.expression.SchemaPath;
    +import org.apache.drill.exec.ExecConstants;
    +import org.apache.drill.exec.cache.VectorAccessibleSerializable;
     import org.apache.drill.exec.compile.sig.RuntimeOverridden;
     import org.apache.drill.exec.exception.ClassTransformationException;
    +import org.apache.drill.exec.exception.OutOfMemoryException;
     import org.apache.drill.exec.exception.SchemaChangeException;
     import org.apache.drill.exec.expr.TypeHelper;
    +
     import org.apache.drill.exec.memory.BufferAllocator;
     import org.apache.drill.exec.ops.FragmentContext;
     import org.apache.drill.exec.ops.MetricDef;
    +import org.apache.drill.exec.ops.OperatorContext;
     import org.apache.drill.exec.ops.OperatorStats;
     import org.apache.drill.exec.physical.config.HashAggregate;
     import org.apache.drill.exec.physical.impl.common.ChainedHashTable;
     import org.apache.drill.exec.physical.impl.common.HashTable;
     import org.apache.drill.exec.physical.impl.common.HashTableConfig;
     import org.apache.drill.exec.physical.impl.common.HashTableStats;
     import org.apache.drill.exec.physical.impl.common.IndexPointer;
    +
    +import org.apache.drill.exec.physical.impl.spill.RecordBatchSizer;
    +
    +import org.apache.drill.exec.physical.impl.spill.SpillSet;
    +import org.apache.drill.exec.planner.physical.AggPrelBase;
    +
    +import org.apache.drill.exec.proto.UserBitShared;
    +import org.apache.drill.exec.record.CloseableRecordBatch;
    +
     import org.apache.drill.exec.record.MaterializedField;
    +
     import org.apache.drill.exec.record.RecordBatch;
    -import org.apache.drill.exec.record.RecordBatch.IterOutcome;
    -import org.apache.drill.exec.record.TypedFieldId;
    +import org.apache.drill.exec.record.BatchSchema;
    +
     import org.apache.drill.exec.record.VectorContainer;
    +
    +import org.apache.drill.exec.record.TypedFieldId;
    +
    +import org.apache.drill.exec.record.RecordBatch.IterOutcome;
     import org.apache.drill.exec.record.VectorWrapper;
    +import org.apache.drill.exec.record.WritableBatch;
    +import org.apache.drill.exec.record.selection.SelectionVector2;
    +import org.apache.drill.exec.record.selection.SelectionVector4;
    +
     import org.apache.drill.exec.vector.AllocationHelper;
    +
     import org.apache.drill.exec.vector.FixedWidthVector;
     import org.apache.drill.exec.vector.ObjectVector;
     import org.apache.drill.exec.vector.ValueVector;
    +
     import org.apache.drill.exec.vector.VariableWidthVector;
     
    +import org.apache.hadoop.fs.Path;
    +
    +import static org.apache.drill.exec.record.RecordBatch.MAX_BATCH_SIZE;
    +
    --- End diff ---
    
    It is impressive how you were able to slide spilling into the existing
code structure. Performance and modularity are never hard requirements, of
course, but both may be worth at least considering here.
    
    Putting so much code in a template has a large drawback: our current
byte-code-based code generation performs at its worst when templates are
large. This template is the base for the generated code. In traditional
Java, the size of a subclass is independent of the size of its superclass,
so if we used "plain old" Java, the size of this template would have no
performance impact.
    
    But with byte-code manipulation, each generated class contains a
complete copy of the byte code for the template class. With a huge
template, we make a huge copy every time: we pay a cost in the time it
takes to make the copy and then analyze the resulting byte codes, and we
fill up the code cache with many copies of the same code.
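
    To make the cost concrete, here is a minimal sketch, with hypothetical
names rather than Drill's actual classes, of why plain subclassing keeps
each generated class small:

    ```java
    // Hypothetical sketch: with ordinary Java subclassing, the template's
    // byte code is compiled once and shared through virtual dispatch.
    abstract class BigTemplate {
      // Imagine hundreds of lines of spill and aggregation logic here.
      // Under plain subclassing this byte code exists only in
      // BigTemplate.class, no matter how many subclasses we generate.
      public void doWork(int rows) {
        for (int i = 0; i < rows; i++) {
          process(hashKey(i));
        }
      }

      protected abstract int hashKey(int rowIndex);

      private void process(int hash) {
        // ... large shared body elided ...
      }
    }

    // The "generated" part: a tiny subclass with only the per-query code.
    class GeneratedAgg extends BigTemplate {
      @Override
      protected int hashKey(int rowIndex) {
        return rowIndex * 31; // stands in for generated expression code
      }
    }
    ```

    Byte-code merging instead physically copies the equivalent of
BigTemplate's methods into every generated class, so the template's full
size is paid again for each query.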
    
    Three possible solutions:
    
    1. Ignore the problem. (This is probably the right choice until
performance becomes a concern.)
    2. Refactor the code so that the bulk of the logic lives in a
non-template class, with only a thin layer of code in the template. (See
the sketch after this list.)
    3. Use "plain old" Java compilation for this class to avoid the many
large copies described above.
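
    For option 2, here is a rough sketch, again with hypothetical names, of
the shape such a split might take; it illustrates the idea and is not a
working patch:

    ```java
    // All of the heavyweight spill machinery moves into an ordinary class
    // that is compiled once and never byte-code copied.
    class SpillHelper {
      void spillPartition(int partition) {
        // ... write the partition's batches to the spill file ...
      }

      void readBack(int partition) {
        // ... re-read spilled batches for the second pass ...
      }
    }

    // The template shrinks to the generated methods plus thin delegation.
    abstract class ThinHashAggTemplate {
      private final SpillHelper spill = new SpillHelper();

      // Only the per-query generated code stays in the template.
      protected abstract int getHashValue(int rowIndex);

      public void onMemoryPressure(int partition) {
        spill.spillPartition(partition); // thin delegation only
      }
    }
    ```

    For option 3, the plain-Java compilation path (marking the code
generator as plain-Java capable, added in DRILL-5052, if memory serves)
should avoid the byte-code copy entirely, though I have not tried it
against this template.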
    
    The other problem is that this class now has so many state variables
that fully testing and understanding it will be hard. There is value in
smaller chunks: they reduce the cost of testing and maintenance.

