[ 
https://issues.apache.org/jira/browse/SPARK-54593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-54593:
----------------------------------
    Affects Version/s: 4.3.0
                           (was: 3.0.0)
                           (was: 4.0.0)

> Expand DPP for LogicalRelation and LogicalRDD
> ---------------------------------------------
>
>                 Key: SPARK-54593
>                 URL: https://issues.apache.org/jira/browse/SPARK-54593
>             Project: Spark
>          Issue Type: Improvement
>          Components: Optimizer, SQL
>    Affects Versions: 4.3.0
>            Reporter: Dustin Smith
>            Assignee: Chao Sun
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.3.0
>
>
> Enable Dynamic Partition Pruning (DPP) for queries involving LocalRelation 
> and LogicalRDD as the filtering side of broadcast joins. These logical plan 
> nodes represent small, materialized datasets that are ideal candidates for 
> DPP optimization.
>  #  PartitionPruning.scala
>  ** Modified hasSelectivePredicate() to treat LocalRelation and LogicalRDD as 
> selective predicates
>  ** Added overhead calculation for LocalRelation and LogicalRDD in 
> calculatePlanOverhead()
>  ** LogicalRDD is only considered for DPP when it has originStats with 
> rowCount, indicating materialized data
>  # Test Coverage
>  ** Added 5 comprehensive tests in DynamicPartitionPruningSuite.scala:
>  *** DPP with LocalRelation in broadcast join
>  *** DPP with LogicalRDD from cached DataFrame in broadcast join
>  *** DPP with empty LocalRelation 
>  *** DPP should not trigger for LogicalRDD without originStats 
>  *** DPP with large LocalRelation 
> GitHub PR: https://github.com/apache/spark/pull/53324
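
The selectivity rule described above can be illustrated with a small, self-contained sketch. This is a toy model under stated assumptions, not the code from the pull request: the `Plan` types below are hypothetical stand-ins for Spark's Catalyst nodes, `originRowCount` is a hypothetical field modelling `originStats` with a `rowCount`, and treating every `Filter` as selective is a simplification.

```scala
object DppSketch {
  // Hypothetical stand-ins for Catalyst logical plan nodes (not Spark's classes).
  sealed trait Plan
  // A small in-memory relation with a known number of rows.
  final case class LocalRelation(rowCount: Long) extends Plan
  // An RDD-backed relation; originRowCount models originStats.rowCount when present.
  final case class LogicalRDD(originRowCount: Option[Long]) extends Plan
  // A plain table scan with no filter applied.
  final case class Scan(table: String) extends Plan
  // A filter on top of a child plan.
  final case class Filter(condition: String, child: Plan) extends Plan

  // Returns true when the plan may serve as the filtering side for DPP
  // in this toy model.
  def hasSelectivePredicate(plan: Plan): Boolean = plan match {
    case Filter(_, _)        => true  // simplification: any filter counts as selective
    case LocalRelation(_)    => true  // materialized local data
    case LogicalRDD(Some(_)) => true  // row count recorded, so treated as materialized
    case LogicalRDD(None)    => false // no origin stats: not considered for DPP
    case Scan(_)             => false // unfiltered scan is not selective
  }
}
```

In this sketch, `DppSketch.hasSelectivePredicate(DppSketch.LogicalRDD(None))` is `false`, mirroring the listed test "DPP should not trigger for LogicalRDD without originStats", while a `LogicalRDD` that carries a row count is accepted.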



--
This message was sent by Atlassian Jira
(v8.20.10#820010)