[ 
https://issues.apache.org/jira/browse/SPARK-11803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15010315#comment-15010315
 ] 

Xiao Li commented on SPARK-11803:
---------------------------------

I believe this is an urgent issue. This is my first time reading the Dataset 
implementation, and this might be a quick fix for you, so I do not want to 
block your current progress. 

I just want to share an idea: detect the conflicting attributes in the 
joinWith function and then assign new expression ids to 
other.logicalPlan. This should be similar to how the Analyzer handles 
duplicate expression ids for self joins. Then we can use the new 
`other` for joining with `this`.
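
To make the idea concrete, here is a rough, self-contained sketch. It is a toy 
model only: `Attr`, `Plan`, `conflictingIds`, and `dedupRight` are hypothetical 
stand-ins that I made up for illustration. They are not the Catalyst classes 
and not the actual fix.

```scala
// Toy model only: Attr and Plan are hypothetical stand-ins, not Spark classes.
object SelfJoinSketch {
  // An output attribute identified by an expression id.
  final case class Attr(name: String, exprId: Long)
  // A plan reduced to its output attributes.
  final case class Plan(output: Seq[Attr])

  /** Expression ids that appear in the output of both plans. */
  def conflictingIds(left: Plan, right: Plan): Set[Long] =
    left.output.map(_.exprId).toSet intersect right.output.map(_.exprId).toSet

  /** Copy of `right` in which every conflicting attribute gets a fresh id. */
  def dedupRight(left: Plan, right: Plan): Plan = {
    val conflicts = conflictingIds(left, right)
    val used = (left.output ++ right.output).map(_.exprId)
    // Fresh ids start above every id already in use on either side.
    val fresh = Iterator.iterate(if (used.isEmpty) 0L else used.max + 1)(_ + 1)
    Plan(right.output.map { a =>
      if (conflicts.contains(a.exprId)) a.copy(exprId = fresh.next()) else a
    })
  }

  def main(args: Array[String]): Unit = {
    val ds = Plan(Seq(Attr("value", 1L)))
    // Self join: both sides start with identical expression ids.
    val other = dedupRight(ds, ds)
    assert(conflictingIds(ds, ds) == Set(1L))
    assert(conflictingIds(ds, other).isEmpty)
    println(other)
  }
}
```

In this sketch `this` keeps its ids and only `other` is rewritten, so anything 
that already refers to the left side stays valid.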




> Dataset self join returns incorrect result
> ------------------------------------------
>
>                 Key: SPARK-11803
>                 URL: https://issues.apache.org/jira/browse/SPARK-11803
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Reynold Xin
>
> See the test case in https://github.com/apache/spark/pull/9789
> {code}
>   ignore("self join") {
>     val ds = Seq("1", "2").toDS().as("a")
>     val joined = ds.joinWith(ds, lit(true))
>     checkAnswer(joined, ("1", "1"), ("1", "2"), ("2", "1"), ("2", "2"))
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

