[ https://issues.apache.org/jira/browse/PHOENIX-7397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
chenglei updated PHOENIX-7397: ------------------------------ Description: A common use case of union all is to union multiple homogeneous tables and then further aggregate or sort , just as following sql statements: {code:java} select A, B,C, sum(...),sum(....) from ( select a1 as A, b1 as B,c1 as C, .... from T1 union all select a2,b2,c2,..... from T2 union all select a3,b3,c3, .... ) group by A,B,C having.... {code} {code:java} select A, B,C, sum(...),sum(...) from ( select a1 as A, b1 as B,c1 as C, sum(...), sum(...),.... from T1 group by a1,b1,c1 union all select a2,b2,c2,.sum(...),sum(...).... from T2 group by a2,b2,c2 union all select a3,b3,c3, sum(...),sum(...).... group by a3,b3,c3 ) group by A,B,C having.... {code} {code:java} select A, B,C... from ( select a1 as A, b1 as B,c1 as C, .... from T1 union all select a2,b2,c2,..... from T2 union all select a3,b3,c3, .... ) where ...... order by A,B,C {code} For situations similar to above three SQL queries, if each subquery of union all is actually ordered by A, B, and C, then {{UnionPlan}} just needs to perform a simple merge on the subquery results to ensure the overall order of the union all . However, for the current implementation of Phoenix, the outer {{ClientAggregatePlan}} or {{ClientScanPlan}} does not know this information within {{UnionPlan}} and will still perform heavyweight processing based on the assumption that the overall result of union all is completely unordered. was: A common use case of union all is to union multiple homogeneous tables and then further aggregate or sort , just as following sql statements: {code:java} select A, B,C, sum(...),sum(....) from ( select a1 as A, b1 as B,c1 as C, .... from T1 union all select a2,b2,c2,..... from T2 union all select a3,b3,c3, .... ) group by A,B,C having.... {code} {code:java} select A, B,C, sum(...),sum(...) from ( select a1 as A, b1 as B,c1 as C, sum(...), sum(...),.... from T1 group by a1,b1,c1 union all select a2,b2,c2,.sum(...),sum(...).... from T2 group by a2,b2,c2 union all select a3,b3,c3, sum(...),sum(...).... group by a3,b3,c3 ) group by A,B,C having.... {code} {code:java} select A, B,C... from ( select a1 as A, b1 as B,c1 as C, .... from T1 union all select a2,b2,c2,..... from T2 union all select a3,b3,c3, .... ) where ...... order by A,B,C {code} For situations similar to above three SQL queries, if each subquery of union all is actually ordered by A, B, and C, then {{UnionPlan}} just needs to perform a simple merge on the subquery results to ensure the overall order of the union all result. However, for the current implementation of Phoenix, the outer {{ClientAggregatePlan}} or {{ClientScanPlan}} does not know this information within {{UnionPlan}} and will still perform heavyweight processing based on the assumption that the overall result of union all is completely unordered. > Optimize ClientAggregatePlan/ClientScanPlan when inner query plan is UnionPlan > ------------------------------------------------------------------------------ > > Key: PHOENIX-7397 > URL: https://issues.apache.org/jira/browse/PHOENIX-7397 > Project: Phoenix > Issue Type: Improvement > Components: core > Affects Versions: 5.2.0, 5.1.3 > Reporter: chenglei > Priority: Major > > A common use case of union all is to union multiple homogeneous tables and > then further aggregate or sort , just as following sql statements: > {code:java} > select A, B,C, sum(...),sum(....) > from ( > select a1 as A, b1 as B,c1 as C, .... > from T1 > union all > select a2,b2,c2,..... > from T2 > union all > select a3,b3,c3, .... > ) > group by A,B,C > having.... > {code} > {code:java} > select A, B,C, sum(...),sum(...) > from ( > select a1 as A, b1 as B,c1 as C, sum(...), sum(...),.... > from T1 > group by a1,b1,c1 > union all > select a2,b2,c2,.sum(...),sum(...).... > from T2 > group by a2,b2,c2 > union all > select a3,b3,c3, sum(...),sum(...).... > group by a3,b3,c3 > ) > group by A,B,C > having.... > {code} > {code:java} > select A, B,C... > from ( > select a1 as A, b1 as B,c1 as C, .... > from T1 > union all > select a2,b2,c2,..... > from T2 > union all > select a3,b3,c3, .... > ) > where ...... > order by A,B,C > {code} > For situations similar to above three SQL queries, if each subquery of union > all is actually ordered by A, B, and C, then {{UnionPlan}} just needs to > perform a simple merge on the subquery results to ensure the overall order of > the union all . However, for the current implementation of Phoenix, the outer > {{ClientAggregatePlan}} or {{ClientScanPlan}} does not know this information > within {{UnionPlan}} and will still perform heavyweight processing based on > the assumption that the overall result of union all is completely unordered. -- This message was sent by Atlassian Jira (v8.20.10#820010)