[
https://issues.apache.org/jira/browse/FLINK-20478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jark Wu reassigned FLINK-20478:
-------------------------------
Assignee: Jark Wu
> Adjust the explain result
> -------------------------
>
> Key: FLINK-20478
> URL: https://issues.apache.org/jira/browse/FLINK-20478
> Project: Flink
> Issue Type: Sub-task
> Components: Table SQL / Planner
> Reporter: godfrey he
> Assignee: Jark Wu
> Priority: Major
>
> Currently, the explain result includes "Abstract Syntax Tree", "Optimized
> Logical Plan" and "Physical Execution Plan". However, the "Optimized Logical
> Plan" is actually an {{ExecNode}} graph, while the
> "[ExplainDetail|https://github.com/apache/flink/blob/master/flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/api/ExplainDetail.java]"
> represents the expected explain details, which currently include
> {{ESTIMATED_COST}} and {{CHANGELOG_MODE}}. Those details can only be used for
> Calcite {{RelNode}}s, not for {{ExecNode}}s. So I suggest making the
> following adjustments:
> 1. Keep "Abstract Syntax Tree" as is, which represents the original
> (un-optimized) {{RelNode}} graph converted from {{SqlNode}}.
> 2. Rename "Optimized Logical Plan" to "Optimized Physical Plan", which
> represents the optimized physical {{RelNode}} graph composed of
> {{FlinkPhysicalRel}}. {{ESTIMATED_COST}} and {{CHANGELOG_MODE}} describe the
> expected explain details for "Optimized Physical Plan".
> 3. Replace "Physical Execution Plan" with "Optimized Execution Plan", which
> represents the optimized {{ExecNode}} graph. Currently, many optimizations
> are based on the {{ExecNode}} graph, such as sub-plan reuse and multiple
> input rewrite. We may introduce more optimizations in the future, so there
> will be more and more differences between "Optimized Physical Plan" and
> "Optimized Execution Plan". We do not want to show two execution plans, and
> "Physical Execution Plan" for {{StreamGraph}} is less important than
> "Optimized Execution Plan". If we want to introduce "Physical Execution Plan"
> in the future, we can add a type named "PHYSICAL_EXECUTION_PLAN" in
> {{ExplainDetail}} to support it. There is already an issue for a similar
> change, [FLINK-19687|https://issues.apache.org/jira/browse/FLINK-19687].
> The following example shows the explain result after the adjustment:
> {code}
> == Abstract Syntax Tree ==
> LogicalLegacySink(name=[`default_catalog`.`default_database`.`upsertSink1`], fields=[a, cnt])
> +- LogicalProject(a=[$0], cnt=[$1])
>    +- LogicalFilter(condition=[>($1, 10)])
>       +- LogicalAggregate(group=[{0}], cnt=[COUNT()])
>          +- LogicalProject(a=[$0])
>             +- LogicalTableScan(table=[[default_catalog, default_database, MyTable1]])
> LogicalLegacySink(name=[`default_catalog`.`default_database`.`upsertSink2`], fields=[a, cnt])
> +- LogicalProject(a=[$0], cnt=[$1])
>    +- LogicalFilter(condition=[<($1, 10)])
>       +- LogicalAggregate(group=[{0}], cnt=[COUNT()])
>          +- LogicalProject(a=[$0])
>             +- LogicalTableScan(table=[[default_catalog, default_database, MyTable1]])
> == Optimized Physical Plan ==
> LegacySink(name=[`default_catalog`.`default_database`.`upsertSink1`], fields=[a, cnt])
> +- Calc(select=[a, cnt], where=[>(cnt, 10)])
>    +- GroupAggregate(groupBy=[a], select=[a, COUNT(*) AS cnt])
>       +- Exchange(distribution=[hash[a]])
>          +- Calc(select=[a])
>             +- DataStreamScan(table=[[default_catalog, default_database, MyTable1]], fields=[a, b, c])
> LegacySink(name=[`default_catalog`.`default_database`.`upsertSink2`], fields=[a, cnt])
> +- Calc(select=[a, cnt], where=[<(cnt, 10)])
>    +- GroupAggregate(groupBy=[a], select=[a, COUNT(*) AS cnt])
>       +- Exchange(distribution=[hash[a]])
>          +- Calc(select=[a])
>             +- DataStreamScan(table=[[default_catalog, default_database, MyTable1]], fields=[a, b, c])
> == Optimized Execution Plan ==
> GroupAggregate(groupBy=[a], select=[a, COUNT(*) AS cnt], reuse_id=[1])
> +- Exchange(distribution=[hash[a]])
>    +- Calc(select=[a])
>       +- DataStreamScan(table=[[default_catalog, default_database, MyTable1]], fields=[a, b, c])
> LegacySink(name=[`default_catalog`.`default_database`.`upsertSink1`], fields=[a, cnt])
> +- Calc(select=[a, cnt], where=[>(cnt, 10)])
>    +- Reused(reference_id=[1])
> LegacySink(name=[`default_catalog`.`default_database`.`upsertSink2`], fields=[a, cnt])
> +- Calc(select=[a, cnt], where=[<(cnt, 10)])
>    +- Reused(reference_id=[1])
> {code}
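
To illustrate why the "Optimized Execution Plan" above differs from the "Optimized Physical Plan", here is a minimal, stdlib-only Java sketch of how a shared sub-plan might be printed once with a reuse_id and referenced elsewhere as Reused(reference_id=[...]). This is not Flink's planner code: the Node class, the ReusePlanSketch class and its explain method are hypothetical names introduced only for this illustration, and the sketch assumes that root nodes (sinks) are never inputs of other nodes.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

final class ReusePlanSketch {

    /** Hypothetical stand-in for an ExecNode: a name, its attributes, and its inputs. */
    static final class Node {
        final String name;
        final String attributes;
        final List<Node> inputs;

        Node(String name, String attributes, Node... inputs) {
            this.name = name;
            this.attributes = attributes;
            this.inputs = List.of(inputs);
        }
    }

    /** Prints every shared sub-plan once, then each root with references to the shared parts. */
    static String explain(List<Node> roots) {
        // Count parents per node by object identity; remember first-discovery order.
        Map<Node, Integer> parentCount = new IdentityHashMap<>();
        List<Node> discovered = new ArrayList<>();
        for (Node root : roots) {
            countParents(root, parentCount, discovered);
        }
        // A node with more than one parent is a reused sub-plan and gets a reuse id.
        Map<Node, Integer> reuseIds = new IdentityHashMap<>();
        List<Node> shared = new ArrayList<>();
        for (Node node : discovered) {
            if (parentCount.get(node) > 1) {
                shared.add(node);
                reuseIds.put(node, shared.size());
            }
        }
        StringBuilder out = new StringBuilder();
        for (Node node : shared) {
            render(node, 0, true, reuseIds, out);
        }
        for (Node root : roots) {
            render(root, 0, false, reuseIds, out);
        }
        return out.toString();
    }

    private static void countParents(Node node, Map<Node, Integer> parentCount, List<Node> order) {
        for (Node input : node.inputs) {
            int seen = parentCount.merge(input, 1, Integer::sum);
            if (seen == 1) { // first time this input is reached: record it and descend
                order.add(input);
                countParents(input, parentCount, order);
            }
        }
    }

    private static void render(Node node, int depth, boolean isDefinition,
                               Map<Node, Integer> reuseIds, StringBuilder out) {
        String indent = depth == 0 ? "" : "   ".repeat(depth - 1) + "+- ";
        Integer id = reuseIds.get(node);
        if (id != null && !isDefinition) {
            // A shared node that is not being defined here is only referenced.
            out.append(indent).append("Reused(reference_id=[").append(id).append("])\n");
            return;
        }
        out.append(indent).append(node.name).append('(').append(node.attributes);
        if (id != null) {
            out.append(", reuse_id=[").append(id).append(']');
        }
        out.append(")\n");
        for (Node input : node.inputs) {
            render(input, depth + 1, false, reuseIds, out);
        }
    }

    public static void main(String[] args) {
        Node scan = new Node("DataStreamScan",
                "table=[[default_catalog, default_database, MyTable1]], fields=[a, b, c]");
        Node calc = new Node("Calc", "select=[a]", scan);
        Node exchange = new Node("Exchange", "distribution=[hash[a]]", calc);
        Node agg = new Node("GroupAggregate", "groupBy=[a], select=[a, COUNT(*) AS cnt]", exchange);
        Node sink1 = new Node("LegacySink", "name=[upsertSink1], fields=[a, cnt]",
                new Node("Calc", "select=[a, cnt], where=[>(cnt, 10)]", agg));
        Node sink2 = new Node("LegacySink", "name=[upsertSink2], fields=[a, cnt]",
                new Node("Calc", "select=[a, cnt], where=[<(cnt, 10)]", agg));
        System.out.print(explain(List.of(sink1, sink2)));
    }
}
```

In this sketch the GroupAggregate node feeds both sinks, so it is printed once as its own tree and each sink's Calc refers to it by id, which mirrors the shape of the "Optimized Execution Plan" section in the example. Sharing is detected by object identity here; how the real planner decides that two sub-plans are reusable is outside the scope of the sketch.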
--
This message was sent by Atlassian Jira
(v8.3.4#803005)