Hi Haisgheng, I agree with you, since the groupby on t1.b, by definition, produces one tuple per distinct t1.b value, for any given t1.b distinct values you have at most one tuple both sides, so the direct multiplication cnt*c suffice as SUM0 sums over a single element.
But I don't know the logic behind the rule, so I cannot say if the rewriting is there for other reasons (triggering other possible chains of rewriting for instance), but definitely an interesting remark :) Best regards, Alessandro On 29 March 2018 at 13:54, Haisheng Yuan <[email protected]> wrote: > Hi devs, > > While reading SqlSplittableAggFunction and AggregateJoinTransposeRule > source > code, I noticed that in CountSplitter.topSplit, it creates an SUM0 AggCall, > which makes sense for the following query: > select count(t1.a) from t1 join t2 on t1.b=t2.b; > > which will be transformed to: > select sum0(cnt*c) from (select b, count(a) as cnt from t1 group by b) t1 > join (select b, count(*) as c from t2 group by b)t2 on t1.b=t2.b ; > > But for another query: > select t1.b, count(t1.a) from t1 join t2 on t1.b=t2.b group by t1.b; > > which will be transformed to: > select t1.b, sum0(cnt*c) from (select b, count(a) as cnt from t1 group by > b) t1 join (select b, count(*) as c from t2 group by b)t2 on t1.b=t2.b > group by t1.b; > > In fact, query without additional SUM0 group agg should be enough: > select t1.b, cnt*c from (select b, count(a) as cnt from t1 group by b) t1 > join (select b, count(*) as c from t2 group by b) t2 on t1.b=t2.b; > > Am I missing something? Correct me if I am wrong. > > Thanks~ > Haisheng Yuan >
