Santiago M. Mola created SPARK-6743:
---------------------------------------

             Summary: Join with empty projection on one side produces invalid results
                 Key: SPARK-6743
                 URL: https://issues.apache.org/jira/browse/SPARK-6743
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.3.0
            Reporter: Santiago M. Mola


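{code:java}
import org.apache.spark.sql.SQLContext

// Assumes an existing SparkContext named sc (e.g. from spark-shell).
val sqlContext = new SQLContext(sc)
val tab0 = sc.parallelize(Seq(
  (83, 0, 38),
  (26, 0, 79),
  (43, 81, 24)
))
sqlContext.registerDataFrameAsTable(sqlContext.createDataFrame(tab0), "tab0")
sqlContext.cacheTable("tab0")

// Query 1: columns are projected from both sides of the join.
val df1 = sqlContext.sql(
  "SELECT tab0._2, cor0._2 FROM tab0, tab0 cor0 GROUP BY tab0._2, cor0._2")
val result1 = df1.collect()

// Query 2: no columns are projected from the left side of the join.
val df2 = sqlContext.sql("SELECT cor0._2 FROM tab0, tab0 cor0 GROUP BY cor0._2")
val result2 = df2.collect()

// Query 3: same projection as query 2, but without the join.
val df3 = sqlContext.sql("SELECT cor0._2 FROM tab0 cor0 GROUP BY cor0._2")
val result3 = df3.collect()
{code}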

Given the code above, result2 equals Row(43), Row(83), Row(26), which is wrong.
These values correspond to cor0._1 instead of cor0._2. The correct results would
be Row(0), Row(81), which is what the third query returns. The first query also
produces valid results; the only difference is that its projection on the left
side of the join is not empty.
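
As a cross-check of the expected values, here is a minimal sketch that evaluates the second and third queries with plain Scala collections instead of Spark. The object name ExpectedResults and its field names are hypothetical and only serve the illustration. The tuples and the reported result2 values are copied from the report.

```scala
// Plain Scala collections only; no Spark dependency.
// ExpectedResults and its members are hypothetical names for illustration.
object ExpectedResults {
  // Same rows as tab0 in the report: (_1, _2, _3).
  val tab0: Seq[(Int, Int, Int)] = Seq((83, 0, 38), (26, 0, 79), (43, 81, 24))

  // "FROM tab0, tab0 cor0" is a cross join of the table with itself.
  val joined: Seq[((Int, Int, Int), (Int, Int, Int))] =
    for (left <- tab0; cor0 <- tab0) yield (left, cor0)

  // Query 2: SELECT cor0._2 ... GROUP BY cor0._2 yields the distinct _2 values.
  val expected2: Set[Int] = joined.map { case (_, cor0) => cor0._2 }.toSet

  // Query 3: the same projection and grouping without the join.
  val expected3: Set[Int] = tab0.map(_._2).toSet

  // Values reported for result2 in the bug description.
  val reported2: Set[Int] = Set(43, 83, 26)

  def main(args: Array[String]): Unit = {
    assert(expected2 == Set(0, 81))            // correct answer for query 2
    assert(expected2 == expected3)             // the join must not change it
    assert(reported2 == tab0.map(_._1).toSet)  // reported rows match column _1
    println("all checks passed")
  }
}
```

The last assertion reflects the observation in the report: the wrong rows are exactly the distinct values of cor0._1, which suggests the column ordinal is resolved incorrectly when nothing is projected from the left side.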


