In SQL, if I enable the local-global parameter:
    set table.optimizer.agg-phase-strategy=TWO_PHASE;
or enable the Partial-Final parameters:
    set table.optimizer.distinct-agg.split.enabled=true;
    set table.optimizer.distinct-agg.split.bucket-num=1024;
do I still need to rewrite the SQL into the two-stage form myself?
For example:
    Original SQL:
    SELECT day, COUNT(DISTINCT buy_id) as cnt FROM T GROUP BY day;

    SQL after the DISTINCT field buy_id is automatically scattered by modulo 1024:
    SELECT day, SUM(cnt) total
    FROM (
    SELECT day, MOD(buy_id, 1024), COUNT(DISTINCT buy_id) as cnt
    FROM T GROUP BY day, MOD(buy_id, 1024))
    GROUP BY day
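
To make clear what the rewritten query relies on, here is a minimal Python sketch (plain Python, not Flink code). It assumes non-negative integer buy_id values. The function names and sample rows are made up for illustration. Because every buy_id falls into exactly one bucket, summing the per-bucket distinct counts should give the same result as the single COUNT(DISTINCT).

```python
from collections import defaultdict

BUCKETS = 1024  # mirrors table.optimizer.distinct-agg.split.bucket-num in the question


def distinct_direct(rows):
    """One-stage form: COUNT(DISTINCT buy_id) ... GROUP BY day."""
    seen = defaultdict(set)
    for day, buy_id in rows:
        seen[day].add(buy_id)
    return {day: len(ids) for day, ids in seen.items()}


def distinct_two_stage(rows, buckets=BUCKETS):
    """Two-stage form: distinct count per (day, buy_id mod buckets), then SUM per day."""
    # Inner query: GROUP BY day, MOD(buy_id, 1024)
    partial = defaultdict(set)
    for day, buy_id in rows:
        partial[(day, buy_id % buckets)].add(buy_id)
    # Outer query: SUM(cnt) ... GROUP BY day
    total = defaultdict(int)
    for (day, _bucket), ids in partial.items():
        total[day] += len(ids)
    return dict(total)


# Hypothetical sample data: (day, buy_id). 1 and 1025 share a bucket, as do 7 and 2055.
rows = [
    ("2021-01-01", 1),
    ("2021-01-01", 1),
    ("2021-01-01", 1025),
    ("2021-01-01", 7),
    ("2021-01-02", 7),
    ("2021-01-02", 2055),
]

print(distinct_direct(rows))
print(distinct_two_stage(rows))
```

Both calls should print the same per-day counts. The sketch only shows why the two SQL forms are equivalent. It says nothing about whether Flink applies the split on its own.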

Or will Flink rewrite the SQL for me automatically, so that I don't need to care about it?

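Also, when I only enable the parameters above without rewriting the SQL, there seems to be no optimization, and I don't see any two-phase operators in the Flink web UI either.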
<http://apache-flink.147419.n8.nabble.com/file/t1346/%E7%AE%97%E5%AD%90.png> 





