[ 
https://issues.apache.org/jira/browse/SPARK-32551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17171874#comment-17171874
 ] 

Hyukjin Kwon commented on SPARK-32551:
--------------------------------------

{{spark.sql.analyzer.failAmbiguousSelfJoin}} is not a fully accurate check; it 
only roughly detects ambiguous self joins, so it can flag a query like this 
one. You can turn that configuration off for now, as the error message 
suggests. cc [~cloud_fan] FYI
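
For reference, a minimal sketch of that workaround applied to the snippet from the issue description below. It assumes a local SparkSession created just for the example (the app name and {{local[*]}} master are illustrative, not from the report) and is meant to be pasted into spark-shell or run as a Scala script. Whether the query then passes analysis was not verified here; the expectation comes only from the hint at the end of the error message.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.sum

// Assumption: a throwaway local session; in spark-shell the existing `spark` can be used instead.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("SPARK-32551-workaround")
  .getOrCreate()

// Workaround: disable the ambiguous self-join check for this session.
spark.conf.set("spark.sql.analyzer.failAmbiguousSelfJoin", "false")

// Same query as in the issue description.
val v1 = spark.range(3).toDF("m")
val v2 = spark.range(3).toDF("d")
val v3 = v1.join(v2, v1("m") === v2("d"))
val v4 = v3("d")
val w1 = Window.partitionBy(v4)
val out = v3.select(v4.as("a"), sum(v4).over(w1).as("b"))
out.show()
```

Note that the setting applies to the whole session, so genuinely ambiguous self joins are no longer reported while it is off; setting it back to {{true}} after the affected query restores the check.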

> Ambiguous self join error in non self join with window
> ------------------------------------------------------
>
>                 Key: SPARK-32551
>                 URL: https://issues.apache.org/jira/browse/SPARK-32551
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.0.0
>            Reporter: kanika dhuria
>            Priority: Major
>
> The following code fails the ambiguous self-join analysis even though it 
> contains no self join:
> import org.apache.spark.sql.expressions.Window
> import org.apache.spark.sql.functions.sum
>
> val v1 = spark.range(3).toDF("m")
> val v2 = spark.range(3).toDF("d")
> val v3 = v1.join(v2, v1("m") === v2("d"))
> val v4 = v3("d")
> val w1 = Window.partitionBy(v4)
> val out = v3.select(v4.as("a"), sum(v4).over(w1).as("b"))
> org.apache.spark.sql.AnalysisException: Column a#45L are ambiguous. It's 
> probably because you joined several Datasets together, and some of these 
> Datasets are the same. This column points to one of the Datasets but Spark is 
> unable to figure out which one. Please alias the Datasets with different 
> names via `Dataset.as` before joining them, and specify the column using 
> qualified name, e.g. `df.as("a").join(df.as("b"), $"a.id" > $"b.id")`. You 
> can also set spark.sql.analyzer.failAmbiguousSelfJoin to false to disable 
> this check.;
>  


