[ 
https://issues.apache.org/jira/browse/SPARK-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen resolved SPARK-15527.
-------------------------------
    Resolution: Fixed

OK, did you mean this is resolved? Normally we wouldn't make a JIRA for it, 
but I understand the logic.

> Duplicate column names with different case after join of DataFrames
> -------------------------------------------------------------------
>
>                 Key: SPARK-15527
>                 URL: https://issues.apache.org/jira/browse/SPARK-15527
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.4.1
>            Reporter: Ian Hellstrom
>             Fix For: 1.6.0
>
>
> In 1.4.1, a join can produce duplicate column names when the join columns 
> differ only in case (upper/lower/mixed). I have checked 1.6.0, and Spark 
> behaves as expected there: the join columns are matched in a 
> case-sensitive fashion. In 1.4.1, joins appear to be case-insensitive, but 
> the resulting columns are inconsistent.
> I did not find a related ticket, so I'm opening this one even though it is 
> technically fixed, in case the fix happens to be a coincidence.
> Here's a minimal example to check:
> {code}
> case class Test(id: Int, value: String)
> val lhs = sc.parallelize(List(Test(1, "A"), Test(2, "B"), Test(3, "C"))).toDF
> val rhs = sc.parallelize(List(Test(1, "AA"), Test(2, "BB"), Test(4, "D"))).toDF
> val rhsId = rhs.withColumnRenamed("id", "ID")
> val full = lhs.join(rhs, "id")
> val fullId = lhs.join(rhsId, "id") // both id and ID in result in 1.4.1
> val fullID = lhs.join(rhsId, "ID") // only id in result in 1.4.1
> {code}
> The last two joins don't execute on 1.6.0 because "id" is not found in rhsId 
> (first case) and "ID" is not found in lhs (second case). On 1.4.1 both run, 
> but the results differ: the former gives a DataFrame with two id columns 
> (id and ID) even though the rows were clearly matched, while the latter 
> has only one.
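
The duplicate-name symptom described above can be checked mechanically. The following is a minimal sketch in plain Scala, not part of Spark: `caseCollisions` is a hypothetical helper name, and the input list is an illustrative set of column names, not captured output from either Spark version. It groups names that differ only by case.

```scala
import java.util.Locale

// Hypothetical helper (not a Spark API): returns the groups of column names
// that are equal when case is ignored but are spelled differently.
def caseCollisions(columns: Seq[String]): Map[String, Seq[String]] =
  columns
    .groupBy(_.toLowerCase(Locale.ROOT))                  // key: lower-cased name
    .map { case (key, names) => key -> names.distinct }   // drop exact repeats
    .filter { case (_, names) => names.size > 1 }         // keep only case clashes

// Illustrative input: a result that carries both "id" and "ID".
val collisions = caseCollisions(Seq("id", "value", "ID", "value"))
```

In a spark-shell session the same helper could presumably be applied to a join result with `caseCollisions(fullId.columns.toSeq)`, since `DataFrame.columns` returns the column names as an array of strings.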



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
