GitHub user jianqiao opened a pull request:

    https://github.com/apache/incubator-quickstep/pull/179

    QUICKSTEP-70-71 Improve aggregation performance

    This PR implements two features that improve aggregation performance:
    1. Adds `CollisionFreeVectorTable` to support specialized high-performance 
aggregation.
    2. Adds support for aggregation copy elision that we only materialize 
intermediate results for non-trivial expressions.
    
    For feature 1, when the group-by attribute is a range-bounded single 
attribute of `INT` or `LONG` type. We can use a vector of type 
`std::vector<std::atomic<StateT>>` to store the aggregation states, where 
`StateT` is the aggregation state type (currently restricted to `LONG` and 
`DOUBLE`). Then during aggregation, for each tuple, we locate the aggregation 
state with the group-by key's value as index to the state vector, and 
concurrently update the state with C++'s atomic primitives.
    
    For feature 2, note that the current implementation of aggregation always 
creates a `ColumnVectorsValueAccessor` to store the results of ALL the input 
expressions. However, we can avoid the creation of a column vector (thus 
avoiding copying values into the column vector) if the aggregation is on a 
simple attribute, e.g. `SUM(x)`. Thus by PR, when performing aggregation we 
prepare two input `ValueAccessor`s: one BASE accessor that is created directly 
from the input relation's storage block, and one DERIVED accessor that is the 
temporary result `ColumnVectorsValueAccessor`. Each aggregation argument may be 
from the base accessor (meaning that it is a simple attribute) or from the 
derived accessor (meaning that it is a non-trivial expression that gets 
evaluated). The two accessors are then properly handled in aggregation handles 
and aggregation hash tables.
    
    **Main changes:**
    `expressions/aggregation`: Updated the aggregation handles to support copy 
elision. Also did some cleanups.
    `relational_operators`: Added `InitializeAggregationOperator` to support 
parallel initialization of the aggregation state (just `memset` the memory to 
0) -- because it takes a relatively long time to do the initialization with 
single thread if the aggregation hash table is large.
    `storage`: Added `CollisionFreeVectorTable`. Renamed `FastHashTable` to 
`PackedPayloadHashTable`, made it support copy elision, and cleaned up the 
class to remove unused methods. Refactored `AggregationOperationState` to 
support copy elision and support the new aggregation. Moved aggregation code 
out of `StorageBlock`.
    
    This PR significantly improves some TPC-H queries' performance. For 
example, it improves TPC-H Q18 from ~27.5s to ~3.5s, with scale factor 100 on a 
cloudlab machine.
    
    Below shows the TPC-H performance (scale factor 100 on a cloudlab machine) 
with recently committed optimizations up to this point:
    
    |  **TPCH SF100** | **master (ms)** | **w/ optimizations (ms)** |
    |  ------ | ------ | ------ |
    |  Q01 | 13629 | 11221 |
    |  Q02 | 537 | 460 |
    |  Q03 | 4824 | 4124 |
    |  Q04 | 2185 | 2203 |
    |  Q05 | 5517 | 5282 |
    |  Q06 | 399 | 401 |
    |  Q07 | 18563 | 3456 |
    |  Q08 | 1746 | 899 |
    |  Q09 | 7247 | 5586 |
    |  Q10 | 6745 | 5665 |
    |  Q11 | 1053 | 247 |
    |  Q12 | 1713 | 1698 |
    |  Q13 | 22896 | 15582 |
    |  Q14 | 805 | 745 |
    |  Q15 | 897 | 431 |
    |  Q16 | 9942 | 9158 |
    |  Q17 | 1588 | 1117 |
    |  Q18 | 27459 | 3507 |
    |  Q19 | 1711 | 1609 |
    |  Q20 | 1204 | 1014 |
    |  Q21 | 8671 | 7886 |
    |  Q22 | 6178 | 724 |
    |  **Total** | **145509** | **83016** |

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/incubator-quickstep collision-free-agg

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-quickstep/pull/179.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #179
    
----
commit 68be4a614f55412632f13051295327fecba1fada
Author: Jianqiao Zhu <jianq...@cs.wisc.edu>
Date:   2017-01-30T20:46:39Z

    - Adds CollisionFreeVectorTable to support specialized fast path 
aggregation for range-bounded single integer group-by key.
    - Supports copy elision for aggregation.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

Reply via email to