[jira] [Updated] (FLINK-24048) Move changeLog inference out of optimizing phase

Shuo Cheng (Jira) Sun, 29 Aug 2021 23:18:06 -0700


     [ 
https://issues.apache.org/jira/browse/FLINK-24048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Shuo Cheng updated FLINK-24048:
-------------------------------
    Description: 
Currently, when a SQL job has multiple sinks, the DAG is split into multiple 
RelNode blocks. Because changeLog inference happens in the optimizing phase, we 
need to propagate the changeLog mode among the blocks to ensure that each block 
generates an accurate physical plan.

In the current solution, the DAG is optimized 3 times in order to propagate the 
changeLog mode, which is inefficient. Instead, we can optimize the DAG once, 
expand it into a physical node tree, and then infer the changeLog mode. In 
this way, the DAG is optimized only once.

(The minibatch interval can be inferred in the same way.)
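
To illustrate the proposal, here is a minimal sketch in Python. It is a toy 
model, not planner code: PhysicalNode, needs_update_before and 
infer_update_before are hypothetical names, and the rule "a node must emit 
UPDATE_BEFORE if any of its consumers needs it" is a deliberate simplification 
(real inference depends on each operator). The point is only that, once the 
whole DAG is expanded into physical nodes, a shared node can see the 
requirements of all its consumers in a single pass, with no re-optimization.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalNode:
    """Hypothetical stand-in for a physical plan node; not a Flink class."""
    name: str
    inputs: list = field(default_factory=list)
    # Only set on sinks in this toy model: does the sink need UPDATE_BEFORE?
    needs_update_before: bool = False


def topological_order(sinks):
    """Return every node reachable from the sinks, inputs before consumers."""
    order, seen = [], set()

    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for child in node.inputs:
            visit(child)
        order.append(node)

    for sink in sinks:
        visit(sink)
    return order


def infer_update_before(sinks):
    """Single pass over the already-optimized DAG, consumers before inputs.

    Simplified rule (an assumption of this sketch): a node must emit
    UPDATE_BEFORE if at least one of its consumers needs it.
    """
    order = topological_order(sinks)
    required = {}
    for node in reversed(order):
        need = required.get(id(node), False) or node.needs_update_before
        required[id(node)] = need
        for child in node.inputs:
            required[id(child)] = required.get(id(child), False) or need
    return {node.name: required[id(node)] for node in order}


# Two sinks share one aggregate; only sink_b needs UPDATE_BEFORE.
source = PhysicalNode("source")
agg = PhysicalNode("agg", [source])
sink_a = PhysicalNode("sink_a", [agg])
sink_b = PhysicalNode("sink_b", [agg], needs_update_before=True)
modes = infer_update_before([sink_a, sink_b])
```

In this sketch the shared "agg" node is marked as required to emit 
UPDATE_BEFORE because sink_b needs it, even though sink_a does not. With 
per-block inference, that information would have to be propagated between 
blocks before each block could be planned correctly.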

  was:
Currently, when there are multiple sinks in a sql job, the DAG is split into 
multiple relNode blocks; as changeLog inference is in optimizing phase, we need 
to propagate the changeLog mode among blocks to ensure each block can generate 
an accurate physical plan.

In current solution, the DAG is optimized 3 times in order to propagate 
changeLog mode, which is inefficient. Actually, we can just optimize the DAG, 
expanding the DAG to a physical node tree, and then infer changeLog mode. In 
this way, the dag is only optimized 1 time.

(Similarly, minibatch interval can also be inferred is same way)


> Move changeLog inference out of optimizing phase
> ------------------------------------------------
>
>                 Key: FLINK-24048
>                 URL: https://issues.apache.org/jira/browse/FLINK-24048
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / Planner
>    Affects Versions: 1.14.0
>            Reporter: Shuo Cheng
>            Priority: Major
>             Fix For: 1.15.0



