[ 
https://issues.apache.org/jira/browse/SPARK-28860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lai Zhou updated SPARK-28860:
-----------------------------
    Description: 
Currently, star-schema detection uses TableAccessCardinality to reorder the
dimension tables when the join is a selective star join.

[StarSchemaDetection.scala#L341|https://github.com/apache/spark/blob/98e1a4cea44d7cb2f6d502c0202ad3cac2a1ad8d/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/StarSchemaDetection.scala#L341]
{code:java}
if (isSelectiveStarJoin(dimTables, conditions)) {
  val reorderDimTables = dimTables.map { plan =>
    TableAccessCardinality(plan, getTableAccessCardinality(plan))
  }.sortBy(_.size).map {
    case TableAccessCardinality(p1, _) => p1
  }
{code}
 

However, the getTableAccessCardinality method doesn't consider the ColumnStats
of the equi-join key. I'm not sure whether we should compute the join
cardinality for each dimension table based on its join key here.

[~ioana-delaney]
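
To make the question concrete, here is a minimal, self-contained sketch of how join-key column statistics could change the ordering. It does not use Spark's classes: DimTable, FactTable, estimatedJoinRows and reorder are hypothetical names introduced only for illustration, the row counts are assumed to be post-filter estimates, and the formula is the textbook equi-join estimate |F| * |D| / max(ndv(F.k), ndv(D.k)), not necessarily what the rule should adopt.

```scala
// Hypothetical, simplified stand-ins; these are NOT Spark's classes.
// rowCount is assumed to be the estimated row count after local filters.
final case class DimTable(name: String, rowCount: BigInt, joinKeyDistinctCount: BigInt)

// joinKeyDistinctCount maps a dimension name to the distinct count of the
// fact-side join key that references that dimension (assumed to be known).
final case class FactTable(rowCount: BigInt, joinKeyDistinctCount: Map[String, BigInt])

object JoinKeyCardinalitySketch {

  /** Textbook equi-join estimate: |F| * |D| / max(ndv(F.k), ndv(D.k)). */
  def estimatedJoinRows(fact: FactTable, dim: DimTable): BigInt = {
    // Fall back to 1 when no statistic is available, to avoid dividing by zero.
    val factNdv = fact.joinKeyDistinctCount.getOrElse(dim.name, BigInt(1))
    val maxNdv = factNdv.max(dim.joinKeyDistinctCount).max(BigInt(1))
    fact.rowCount * dim.rowCount / maxNdv
  }

  /** Orders dimensions by estimated join output, smallest first. */
  def reorder(fact: FactTable, dims: Seq[DimTable]): Seq[DimTable] =
    dims.sortBy(d => estimatedJoinRows(fact, d))
}
```

With a fact table of 1,000,000 rows, a "date" dimension of 30 rows (fact-side key ndv 365) and a "store" dimension of 50 rows (fact-side key ndv 1000), sorting by table size alone puts "date" first, while the join-key estimate puts "store" first because it is expected to reduce the fact table more.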

  was:
Currently, star-schema detection uses TableAccessCardinality to reorder the
dimension tables when the join is a selective star join.

[StarSchemaDetection.scala#L341|https://github.com/apache/spark/blob/98e1a4cea44d7cb2f6d502c0202ad3cac2a1ad8d/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/StarSchemaDetection.scala#L341]

 
{code:java}
if (isSelectiveStarJoin(dimTables, conditions)) {
  val reorderDimTables = dimTables.map { plan =>
    TableAccessCardinality(plan, getTableAccessCardinality(plan))
  }.sortBy(_.size).map {
    case TableAccessCardinality(p1, _) => p1
  }
{code}
 

 

However, the getTableAccessCardinality method doesn't consider the ColumnStats
of the equi-join key. I'm not sure whether we should compute the join
cardinality for each dimension table based on its join key here.

[~ioana-delaney]


>  Using ColumnStats of join key to get TableAccessCardinality when finding 
> star joins in ReorderJoinRule
> -------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-28860
>                 URL: https://issues.apache.org/jira/browse/SPARK-28860
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.3
>            Reporter: Lai Zhou
>            Priority: Minor
>
> Currently, star-schema detection uses TableAccessCardinality to reorder the
> dimension tables when the join is a selective star join.
> [StarSchemaDetection.scala#L341|https://github.com/apache/spark/blob/98e1a4cea44d7cb2f6d502c0202ad3cac2a1ad8d/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/StarSchemaDetection.scala#L341]
> {code:java}
> if (isSelectiveStarJoin(dimTables, conditions)) {
>   val reorderDimTables = dimTables.map { plan =>
>     TableAccessCardinality(plan, getTableAccessCardinality(plan))
>   }.sortBy(_.size).map {
>     case TableAccessCardinality(p1, _) => p1
>   }
> {code}
>  
> However, the getTableAccessCardinality method doesn't consider the
> ColumnStats of the equi-join key. I'm not sure whether we should compute the
> join cardinality for each dimension table based on its join key here.
> [~ioana-delaney]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
