[ https://issues.apache.org/jira/browse/SPARK-43946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
jinhai-cloud updated SPARK-43946:
---------------------------------
Description:
{code:sql}
with t1 as (
  select rand() c3
),
t2 as (select * from t1)
select c3 from t1 where c3 > 0
{code}
{code}
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.InlineCTE ===
 WithCTE                                                WithCTE
 :- CTERelationDef 0, false                             :- CTERelationDef 0, false
 :  +- Project [rand(3418873542988342437) AS c3#236]    :  +- Project [rand(3418873542988342437) AS c3#236]
 :     +- OneRowRelation                                :     +- OneRowRelation
!:- CTERelationDef 1, false                             +- Project [c3#236]
!:  +- Project [c3#236]                                    +- Filter (c3#236 > cast(0 as double))
!:     +- CTERelationRef 0, true, [c3#236]                    +- CTERelationRef 0, true, [c3#236]
!+- Project [c3#236]
!   +- Filter (c3#236 > cast(0 as double))
!      +- CTERelationRef 0, true, [c3#236]
{code}
When the InlineCTE rule is applied to the query above, CTERelationDef 0 is not
inlined because its refCount is 2: one reference comes from the main query and
one from CTERelationDef 1 (t2), which is itself never referenced.
However, in the optimized logical plan CTERelationDef 1 has been dropped, so
CTERelationDef 0 is referenced only once and the plan can be optimized further.
Therefore, we can add a rule, *RemoveRedundantCTEDef*, that deletes
unreferenced CTERelationDefs so that the refCount is not miscalculated.
With the unreferenced definition removed, the plan can be optimized to:
{code}
Project [c3#236]
+- Filter (c3#236 > cast(0 as double))
   +- Project [rand(-7871530451581327544) AS c3#236]
      +- OneRowRelation
{code}
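The refCount problem can be illustrated with a small toy model. This is only a sketch and not Spark's Catalyst code: the `CTEDef` class and the functions `naive_ref_counts` and `remove_unreferenced_defs` are hypothetical names introduced for this example. It assumes that every reference is counted, including references made from definitions that nothing uses, and that the proposed rule keeps only the definitions reachable from the main query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CTEDef:
    """Toy stand-in for a CTERelationDef: an id plus the ids of the CTEs its body references."""
    cte_id: int
    refs: tuple = ()


def naive_ref_counts(defs, main_refs):
    """Count every reference, including those made from definitions that nothing uses."""
    counts = {d.cte_id: 0 for d in defs}
    for d in defs:
        for ref in d.refs:
            counts[ref] += 1
    for ref in main_refs:
        counts[ref] += 1
    return counts


def remove_unreferenced_defs(defs, main_refs):
    """Keep only the definitions reachable (directly or transitively) from the main query."""
    by_id = {d.cte_id: d for d in defs}
    reachable = set()
    stack = list(main_refs)
    while stack:
        cte_id = stack.pop()
        if cte_id not in reachable:
            reachable.add(cte_id)
            stack.extend(by_id[cte_id].refs)
    return [d for d in defs if d.cte_id in reachable]


# The query from this issue: t1 is def 0, t2 is def 1 (it references t1),
# and the main query references only t1.
defs = [CTEDef(0), CTEDef(1, refs=(0,))]
main_refs = [0]

before = naive_ref_counts(defs, main_refs)        # def 0 is counted twice
pruned = remove_unreferenced_defs(defs, main_refs)
after = naive_ref_counts(pruned, main_refs)       # def 0 is counted once
```

In this model, def 0 has a count of 2 before pruning and 1 after, which corresponds to the point at which CTERelationDef 0 becomes eligible for inlining in the plan above.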
> Add rule to remove unused CTEDef
> --------------------------------
>
> Key: SPARK-43946
> URL: https://issues.apache.org/jira/browse/SPARK-43946
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.2.4, 3.3.2, 3.4.0
> Reporter: jinhai-cloud
> Priority: Major