[ https://issues.apache.org/jira/browse/SPARK-11803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15010181#comment-15010181 ]
Xiao Li commented on SPARK-11803:
---------------------------------

The optimized plan is wrong:

Project [value#1 AS _1#4,value#1 AS _2#5]
 Join Inner, None
  LocalRelation [value#1], [[0,1000000001,31],[0,1000000001,32]]
  LocalRelation [[empty row],[empty row]]

The correct plan should look like this:

Project [value#1 AS _1#4,value#5 AS _2#5]
 Join Inner, None
  LocalRelation [value#1], [[0,1000000001,31],[0,1000000001,32]]
  LocalRelation [value#5], [[0,1000000001,31],[0,1000000001,32]]

> Dataset self join returns incorrect result
> ------------------------------------------
>
>                 Key: SPARK-11803
>                 URL: https://issues.apache.org/jira/browse/SPARK-11803
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Reynold Xin
>
> See the test case in https://github.com/apache/spark/pull/9789
> {code}
> ignore("self join") {
>   val ds = Seq("1", "2").toDS().as("a")
>   val joined = ds.joinWith(ds, lit(true))
>   checkAnswer(joined, ("1", "1"), ("1", "2"), ("2", "1"), ("2", "2"))
> }
> {code}
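
For reference, a join condition of lit(true) pairs every left row with every right row, so the answer the test expects is the Cartesian product of the dataset with itself. The sketch below reproduces that expected answer with plain Scala collections only. It does not use Spark, and the names SelfJoinExpected and selfCrossJoin are hypothetical, chosen only for illustration.

```scala
object SelfJoinExpected {
  // Pair every element with every element of the same sequence,
  // mirroring what an always-true join condition should produce.
  def selfCrossJoin[A](rows: Seq[A]): Seq[(A, A)] =
    for {
      left  <- rows
      right <- rows
    } yield (left, right)

  def main(args: Array[String]): Unit = {
    // Same input values as the test case in the issue.
    val ds = Seq("1", "2")
    selfCrossJoin(ds).foreach(println)
  }
}
```

Both columns of each tuple vary independently here, which matches the corrected plan above, where the right side of the join carries its own attribute rather than reusing value#1.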