[ https://issues.apache.org/jira/browse/PHOENIX-4751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16567742#comment-16567742 ]
Hudson commented on PHOENIX-4751:
---------------------------------
FAILURE: Integrated in Jenkins build PreCommit-PHOENIX-Build #1958 (See
[https://builds.apache.org/job/PreCommit-PHOENIX-Build/1958/])
PHOENIX-4751 Implement client-side hash aggregation (tdsilva: rev
2379080d8348f7f9953c5ab95bdb473ab326b5c3)
* (add) phoenix-core/src/main/java/org/apache/phoenix/iterate/ClientHashAggregatingResultIterator.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/execute/ClientAggregatePlan.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/parse/HintNode.java
* (add) phoenix-core/src/it/java/org/apache/phoenix/end2end/ClientHashAggregateIT.java
> Support client-side hash aggregation with SORT_MERGE_JOIN
> ---------------------------------------------------------
>
> Key: PHOENIX-4751
> URL: https://issues.apache.org/jira/browse/PHOENIX-4751
> Project: Phoenix
> Issue Type: Improvement
> Affects Versions: 4.14.0, 4.13.1
> Reporter: Gerald Sangudi
> Assignee: Gerald Sangudi
> Priority: Major
> Fix For: 4.15.0, 5.1.0
>
> Attachments:
> 0001-PHOENIX-4751-Add-HASH_AGGREGATE-hint.4.x-HBase-1.4.patch,
> 0001-PHOENIX-4751-Implement-client-side-has.4.x-HBase-1.4.patch,
> 0001-PHOENIX-4751-Implement-client-side-hash-aggre.master.patch,
> 0002-PHOENIX-4751-Begin-implementation-of-c.4.x-HBase-1.4.patch,
> 0003-PHOENIX-4751-Generated-aggregated-resu.4.x-HBase-1.4.patch,
> 0004-PHOENIX-4751-Sort-results-of-client-ha.4.x-HBase-1.4.patch,
> 0005-PHOENIX-4751-Add-integration-test-for-.4.x-HBase-1.4.patch,
> 0006-PHOENIX-4751-Fix-and-run-integration-t.4.x-HBase-1.4.patch,
> 0007-PHOENIX-4751-Add-integration-test-for-.4.x-HBase-1.4.patch,
> 0008-PHOENIX-4751-Verify-EXPLAIN-plan-for-b.4.x-HBase-1.4.patch,
> 0009-PHOENIX-4751-Standardize-null-checks-a.4.x-HBase-1.4.patch,
> 0010-PHOENIX-4751-Abort-when-client-aggrega.4.x-HBase-1.4.patch,
> 0011-PHOENIX-4751-Use-Phoenix-memory-mgmt-t.4.x-HBase-1.4.patch,
> 0012-PHOENIX-4751-Remove-extra-memory-limit.4.x-HBase-1.4.patch,
> 0013-PHOENIX-4751-Sort-only-when-necessary.4.x-HBase-1.4.patch,
> 0014-PHOENIX-4751-Sort-only-when-necessary-.4.x-HBase-1.4.patch,
> 0015-PHOENIX-4751-Show-client-hash-aggregat.4.x-HBase-1.4.patch,
> 0016-PHOENIX-4751-Handle-reverse-sort-add-c.4.x-HBase-1.4.patch
>
>
> A GROUP BY that follows a SORT_MERGE_JOIN should be able to use hash
> aggregation in some cases, for improved performance.
> Currently, when a GROUP BY follows a SORT_MERGE_JOIN, the GROUP BY does not
> use hash aggregation. It instead performs a CLIENT SORT followed by a CLIENT
> AGGREGATE. Performance can be improved if (a) the GROUP BY output does not
> need to be sorted, and (b) the GROUP BY input is large and has low
> cardinality.
> Hash aggregation can initially be enabled through a hint. Here is an example
> from Phoenix 4.13.1 that would benefit from hash aggregation if the GROUP BY
> input is large with low cardinality.
> CREATE TABLE unsalted (
> keyA BIGINT NOT NULL,
> keyB BIGINT NOT NULL,
> val SMALLINT,
> CONSTRAINT pk PRIMARY KEY (keyA, keyB)
> );
> EXPLAIN
> SELECT /*+ USE_SORT_MERGE_JOIN */
> t1.val v1, t2.val v2, COUNT(*) c
> FROM unsalted t1 JOIN unsalted t2
> ON (t1.keyA = t2.keyA)
> GROUP BY t1.val, t2.val;
>
> +-----------------------------------------------------------+----------------+---------------+
> | PLAN                                                      | EST_BYTES_READ | EST_ROWS_READ |
> +-----------------------------------------------------------+----------------+---------------+
> | SORT-MERGE-JOIN (INNER) TABLES                            | null           | null          |
> |     CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER UNSALTED | null           | null          |
> | AND                                                       | null           | null          |
> |     CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER UNSALTED | null           | null          |
> | CLIENT SORTED BY [TO_DECIMAL(T1.VAL), T2.VAL]             | null           | null          |
> | CLIENT AGGREGATE INTO DISTINCT ROWS BY [T1.VAL, T2.VAL]   | null           | null          |
> +-----------------------------------------------------------+----------------+---------------+
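
To illustrate the difference between the two strategies described in the issue, here is a minimal Java sketch of COUNT(*) ... GROUP BY over two columns. It is not the Phoenix implementation: the class HashAggregateSketch, its two methods, and the int[] row representation are hypothetical names chosen for this example only. The sort-based path orders all input rows before aggregating, while the hash-based path keeps one counter per distinct key, which is why it helps when the input is large, the number of distinct groups is small, and sorted output is not required.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Phoenix code: each row is {v1, v2} and we
// compute COUNT(*) grouped by (v1, v2).
class HashAggregateSketch {

    // Sort-based strategy: sort every input row, then aggregate runs of
    // equal keys in one pass. Cost is dominated by sorting all rows.
    static Map<List<Integer>, Long> sortThenAggregate(List<int[]> rows) {
        List<int[]> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.<int[]>comparingInt(r -> r[0])
                              .thenComparingInt(r -> r[1]));

        Map<List<Integer>, Long> out = new LinkedHashMap<>();
        List<Integer> current = null;
        long count = 0;
        for (int[] r : sorted) {
            List<Integer> key = Arrays.asList(r[0], r[1]);
            if (!key.equals(current)) {
                // Key changed: emit the finished group and start a new one.
                if (current != null) {
                    out.put(current, count);
                }
                current = key;
                count = 0;
            }
            count++;
        }
        if (current != null) {
            out.put(current, count);
        }
        return out;
    }

    // Hash-based strategy: one pass over unsorted input, one map entry per
    // distinct key. Memory grows with the number of groups, not rows, and
    // the output order is unspecified.
    static Map<List<Integer>, Long> hashAggregate(List<int[]> rows) {
        Map<List<Integer>, Long> counts = new HashMap<>();
        for (int[] r : rows) {
            counts.merge(Arrays.asList(r[0], r[1]), 1L, Long::sum);
        }
        return counts;
    }
}
```

Both methods return the same groups and counts; only the sort-based one guarantees key order, which matches condition (a) in the description above.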
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)