Hi!

Reusing common sub-plans are an optimization of Flink. Flink is really
reusing them in runtime and the results of the reused tasks are calculated
only once.

Vasily Melnik <vasily.mel...@glowbyteconsulting.com> 于2021年9月2日周四 下午6:32写道:

>
> Hi all.
>
> Using SQL with blink planner for batch calculations, i see *Reused*
> nodes in Optimized Execution Plan while making self join operations:
>
>
> == Optimized Execution Plan ==
> Union(all=[true], union=[id, v, v0, w0$o0])
> :- OverAggregate(orderBy=[id DESC], window#0=[ROW_NUMBER(*) AS w0$o0 ROWS
> BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW], select=[id, v, v0,
> w0$o0])(reuse_id=[2])
> :  +- Sort(orderBy=[id DESC])
> :     +- Exchange(distribution=[single])
> :        +- Calc(select=[id, v, v0])
> :           +- HashJoin(joinType=[LeftOuterJoin], where=[($f2 = id0)],
> select=[id, v, $f2, id0, v0], build=[right])
> :              :- Exchange(distribution=[hash[$f2]])
> :              :  +- Calc(select=[id, v, (id + 1) AS $f2])
> :              :     +- TableSourceScan(table=[[default_catalog,
> default_database, t1]], fields=[id, v])(reuse_id=[1])
> :              +- Exchange(distribution=[hash[id]])
> :                 +- *Reused*(reference_id=[1])
> +- *Reused*(reference_id=[2])
>
>
> Question is: do these steps (scans, intermediate calculations) really be
> calculated once or it is just a print shortcut?
>

Reply via email to