dengziming commented on code in PR #51686:
URL: https://github.com/apache/spark/pull/51686#discussion_r2242856210
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2ScanRelationPushDown.scala:
##########
@@ -226,10 +195,54 @@ object V2ScanRelationPushDown extends Rule[LogicalPlan] with PredicateHelper {
     }
   }
 
-  def generateJoinOutputAlias(name: String): String =
-    s"${name}_${java.util.UUID.randomUUID().toString.replace("-", "_")}"
+  private def generateColumnAliasesForDuplicatedName(
+    leftSideRequiredColumnNames: Array[String],
+    rightSideRequiredColumnNames: Array[String]
+  ): (Array[SupportsPushDownJoin.ColumnWithAlias],
+    Array[SupportsPushDownJoin.ColumnWithAlias]) = {
+    // Count occurrences of each column name across both sides to identify duplicates.
+    val allRequiredColumnNames = leftSideRequiredColumnNames ++ rightSideRequiredColumnNames
+    val allNameCounts: Map[String, Int] =
+      allRequiredColumnNames.groupBy(identity).view.mapValues(_.size).toMap

Review Comment:
   After some investigation, I don't think this is necessary. If our SQL is `select * from a(id,sid) join b(id,Sid)`, there are two versions of the SQL we could push down to the database:
   1. `select id, sid, id_1, Sid from (select id, sid from a) join (select id as id_1, Sid from b)`
   2. `select id, sid, id_1, sid_1 from (select id, sid from a) join (select id as id_1, Sid as sid_1 from b)`

   I added this case to my test to show that version 1 also works, and version 2 doesn't make the SQL any clearer. Is it possible that we would hit `AMBIGUOUS_REFERENCE` with version 1?
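
   To make the difference between the two versions concrete, here is a minimal sketch. It is not the code in this PR: `AliasSketch`, its `ColumnWithAlias` case class, and the `_1` suffix are all hypothetical stand-ins, and the sketch does not check that a generated alias is itself unique. The only point is that version 1 corresponds to comparing names case-sensitively, while version 2 corresponds to comparing them case-insensitively.

```scala
import java.util.Locale

// Hypothetical illustration only; not SupportsPushDownJoin.ColumnWithAlias.
object AliasSketch {
  final case class ColumnWithAlias(name: String, alias: Option[String])

  // Alias a right-side column only when its name collides with a left-side name.
  // `caseSensitive` decides whether "sid" and "Sid" count as a collision.
  def aliasRightSide(
      left: Seq[String],
      right: Seq[String],
      caseSensitive: Boolean): Seq[ColumnWithAlias] = {
    def key(n: String): String =
      if (caseSensitive) n else n.toLowerCase(Locale.ROOT)

    val leftKeys = left.map(key).toSet
    right.map { n =>
      // Assumption: a fixed "_1" suffix; a real implementation would also
      // have to make sure the generated alias does not clash with another column.
      if (leftKeys.contains(key(n))) ColumnWithAlias(n, Some(s"${key(n)}_1"))
      else ColumnWithAlias(n, None)
    }
  }
}
```

   With `left = Seq("id", "sid")` and `right = Seq("id", "Sid")`, `caseSensitive = true` aliases only `id` (version 1), and `caseSensitive = false` aliases both `id` and `Sid` (version 2). Whether version 1 is always safe would then depend on how names are resolved when the pushed-down output is mapped back, which is the open question above.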