> On March 1, 2014, 4:20 p.m., Jacques Nadeau wrote: > > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/common/HashTableTemplate.java, > > line 96 > > <https://reviews.apache.org/r/18347/diff/1/?file=500195#file500195line96> > > > > Same as above, let's use IntVector instead of primitive array, > > otherwise memory accounting is challenging. > > Aman Sinha wrote: > Agreed...will make these off-heap.
Made links, hashValues and startIndices off-heap. - Aman ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18347/#review35902 ----------------------------------------------------------- On March 6, 2014, 2:10 a.m., Aman Sinha wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/18347/ > ----------------------------------------------------------- > > (Updated March 6, 2014, 2:10 a.m.) > > > Review request for drill, Jacques Nadeau, Jason Altekruse, Mehant Baid, and > Steven Phillips. > > > Bugs: Drill-318, Drill-335 and and > https://issues.apache.org/jira/browse/Drill-318 > https://issues.apache.org/jira/browse/Drill-335 > https://issues.apache.org/jira/browse/and > > > Repository: drill-git > > > Description > ------- > > This patch contains the following main components: > 1. Implementation of the hash aggregation execution operator - this has two > main parts: the HashAggTemplate and the HashAggBatch. > 2. Implementation of a hash table which is used by the hash aggregation. The > hash table hash two main parts: the HashTableTemplate and the > ChainedHashTable. The hash table internally uses the notion of 'BatchHolder' > to keep track of all keys that can fit within one batch of 64K values. New > BatcHolder objects are created as needed. Each BatchHolder has its own > vector container. The HashAggregate also has a similar structure and it > keeps track of the workspace variables. > (NOTE: An initial design document for the hash aggregation and hash table > was already attached with Drill-335. The document has not yet been updated > with the latest implementation ... will try to do that in the near future). > 3. Jinfeng's changes to use workspace vectors in the generated code for > aggregate functions (previously, for streaming aggregate we only needed to > maintain workspace variable for 1 running group; however for hash aggregate > we need to maintain it for all groups). > 4. Fix for Drill-318: because of #3 above, the previous fix for Drill-318 is > not valid anymore. I modified the template generation code for the aggregate > functions such that they conform to the new infrastructure. > 5. The original AggTemplate, AggBatch and Aggregator classes have been moved > to corresponding StreamingAggTemplate, StreamingAggBatch and > StreamingAggregator in order to differentiate it from hash aggregation. > These appear as new files but the code there has not changed. > > > Diffs > ----- > > > common/src/main/java/org/apache/drill/common/expression/fn/AggregationFunctions.java > 5b46b78 > exec/java-exec/src/main/codegen/config.fmpp 8f1060a > exec/java-exec/src/main/codegen/data/AggrTypes1.tdd PRE-CREATION > exec/java-exec/src/main/codegen/data/AggrTypes2.tdd PRE-CREATION > exec/java-exec/src/main/codegen/templates/AggrTypeFunctions1.java > PRE-CREATION > exec/java-exec/src/main/codegen/templates/AggrTypeFunctions2.java > PRE-CREATION > exec/java-exec/src/main/codegen/templates/TypeHelper.java 8c72cf7 > > exec/java-exec/src/main/java/org/apache/drill/exec/compile/sig/GeneratorMapping.java > f99150e > > exec/java-exec/src/main/java/org/apache/drill/exec/compile/sig/MappingSet.java > 1cb2380 > exec/java-exec/src/main/java/org/apache/drill/exec/expr/ClassGenerator.java > e674eab > > exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/DrillAggFuncHolder.java > 7622865 > > exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/DrillFuncHolder.java > dcdf1ea > > exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/FunctionConverter.java > befa9bf > > exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/agg/impl/CountFunctions.java > b0939f1 > > exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/agg/impl/SumFunctions.java > 57905fe > > exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/BitFunctions.java > 41bb7c6 > exec/java-exec/src/main/java/org/apache/drill/exec/ops/FragmentContext.java > 8462622 > exec/java-exec/src/main/java/org/apache/drill/exec/opt/BasicOptimizer.java > aed4802 > > exec/java-exec/src/main/java/org/apache/drill/exec/physical/OperatorCost.java > 0bade41 > > exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/AbstractPhysicalVisitor.java > ec7244a > > exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/PhysicalVisitor.java > 2e2d7fd > > exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/HashAggregate.java > PRE-CREATION > > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/AggBatch.java > 86fea4e > > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/AggBatchCreator.java > 36344f9 > > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/AggTemplate.java > a8f39e3 > > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/Aggregator.java > 2683045 > > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggBatch.java > PRE-CREATION > > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggBatchCreator.java > PRE-CREATION > > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggTemplate.java > PRE-CREATION > > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggregator.java > PRE-CREATION > > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/StreamingAggBatch.java > PRE-CREATION > > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/StreamingAggBatchCreator.java > PRE-CREATION > > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/StreamingAggTemplate.java > PRE-CREATION > > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/StreamingAggregator.java > PRE-CREATION > > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/common/ChainedHashTable.java > PRE-CREATION > > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/common/HashTable.java > PRE-CREATION > > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/common/HashTableConfig.java > PRE-CREATION > > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/common/HashTableTemplate.java > PRE-CREATION > > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/project/ProjectRecordBatch.java > 509d13b > > exec/java-exec/src/main/java/org/apache/drill/exec/record/VectorContainer.java > 6e764a8 > exec/java-exec/src/test/java/org/apache/drill/exec/expr/ExpressionTest.java > 0de64b4 > > exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/SimpleRootExec.java > ccf7758 > > exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestSimpleFunctions.java > f6a8096 > > exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/agg/TestHashAggr.java > PRE-CREATION > > exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/common/TestHashTable.java > PRE-CREATION > exec/java-exec/src/test/resources/common/test_hashtable1.json PRE-CREATION > > Diff: https://reviews.apache.org/r/18347/diff/ > > > Testing > ------- > > I have run several tests manually as part of TestHashAggr...these tests use > TPC-H data and in particular a relatively large 'Orders' table. However, I > have not yet packaged the tests to run as part of JUnit since the location > and size of the parquet files needs to be figured out. I will continue to > work on that. > > > Thanks, > > Aman Sinha > >
