[ https://issues.apache.org/jira/browse/SPARK-28860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lai Zhou updated SPARK-28860:
-----------------------------
    Description: 
Star-schema detection currently uses TableAccessCardinality to reorder the dimension tables (dimTables) when isSelectiveStarJoin holds.

[StarSchemaDetection.scala#L341|https://github.com/apache/spark/blob/98e1a4cea44d7cb2f6d502c0202ad3cac2a1ad8d/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/StarSchemaDetection.scala#L341]
{code:java}
if (isSelectiveStarJoin(dimTables, conditions)) {
  val reorderDimTables = dimTables.map { plan =>
    TableAccessCardinality(plan, getTableAccessCardinality(plan))
  }.sortBy(_.size).map {
    case TableAccessCardinality(p1, _) => p1
  }
{code}
However, the getTableAccessCardinality method doesn't consider the ColumnStats of the equi-join key. I'm not sure whether we should compute the join cardinality for each dimension table based on its join key here.
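To make the question concrete, here is a minimal sketch of what ordering by a join-key-aware estimate could look like. It is a standalone toy model, not Spark code: DimTable, estimateJoinCardinality, reorderBySize and reorderByJoinCardinality are hypothetical names, and the estimate is the textbook equi-join formula (rows of both sides divided by the larger distinct count of the join key), assumed here only for illustration.

```scala
// Toy model only: none of these names exist in Spark; they are hypothetical.
object JoinKeyCardinalitySketch {

  // One dimension table as seen by the star-join reordering step.
  //   rowCount   : rows left after the table's local predicates
  //   dimKeyNdv  : distinct values of the join key on the dimension side
  //   factKeyNdv : distinct values of the matching key on the fact side
  final case class DimTable(
      name: String,
      rowCount: BigInt,
      dimKeyNdv: BigInt,
      factKeyNdv: BigInt)

  // Textbook equi-join estimate: |fact| * |dim| / max(ndv(fact.k), ndv(dim.k)).
  // The divisor is clamped to 1 so that a missing (zero) distinct count
  // degrades to the Cartesian-product upper bound instead of dividing by zero.
  def estimateJoinCardinality(factRows: BigInt, dim: DimTable): BigInt = {
    val maxNdv = dim.factKeyNdv.max(dim.dimKeyNdv).max(BigInt(1))
    factRows * dim.rowCount / maxNdv
  }

  // Ordering by table size alone, as the quoted snippet does via sortBy(_.size).
  def reorderBySize(dims: Seq[DimTable]): Seq[DimTable] =
    dims.sortBy(_.rowCount)

  // Alternative raised in this issue: order by estimated join output.
  def reorderByJoinCardinality(factRows: BigInt, dims: Seq[DimTable]): Seq[DimTable] =
    dims.sortBy(d => estimateJoinCardinality(factRows, d))
}
```

As an illustration with made-up numbers: for a fact table of 1,000,000 rows, a "store" dimension with 10 rows (10 distinct keys, 20 on the fact side) and a "date" dimension with 30 rows (30 distinct keys, 3650 on the fact side), ordering by size puts "store" first, whereas the estimates are 500,000 and 8,219 rows respectively, so ordering by join cardinality puts "date" first.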
[~ioana-delaney]

> Using ColumnStats of join key to get TableAccessCardinality when finding star joins in ReorderJoinRule
> -------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-28860
>                 URL: https://issues.apache.org/jira/browse/SPARK-28860
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.3
>            Reporter: Lai Zhou
>            Priority: Minor

--
This message was sent by Atlassian Jira
(v8.3.2#803003)