[ https://issues.apache.org/jira/browse/SPARK-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen reopened SPARK-15527:
-------------------------------
Oops, I think we use "Not A Problem" in this case, but the meaning is
clear here.
> Duplicate column names with different case after join of DataFrames
> -------------------------------------------------------------------
>
> Key: SPARK-15527
> URL: https://issues.apache.org/jira/browse/SPARK-15527
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.4.1
> Reporter: Ian Hellstrom
> Fix For: 1.6.0
>
>
> In 1.4.1, a join can produce duplicate column names that differ only in
> case (upper/lower/mixed). I have checked 1.6.0, and there Spark behaves as
> expected: the join columns are matched in a case-sensitive fashion. In 1.4.1
> joins appear to be case-insensitive, but the results are inconsistent.
> I did not find a related ticket, so I'm opening this one even though it's
> technically fixed, in case the fix happens to be a coincidence.
> Here's a minimal example to check:
> {code}
> case class Test(id: Int, value: String)
> val lhs = sc.parallelize(List(Test(1, "A"), Test(2, "B"), Test(3, "C"))).toDF
> val rhs = sc.parallelize(List(Test(1, "AA"), Test(2, "BB"), Test(4, "D"))).toDF
> val rhsId = rhs.withColumnRenamed("id", "ID")
> val full = lhs.join(rhs, "id")
> val fullId = lhs.join(rhsId, "id") // both id and ID in result in 1.4.1
> val fullID = lhs.join(rhsId, "ID") // only id in result in 1.4.1
> {code}
> The last two joins don't execute on 1.6.0 because "id" is not found in rhsId
> (first case) and "ID" is not found in lhs (second case). On 1.4.1 you can see
> the difference. The former gives a DataFrame with both an id and an ID column
> even though it's clear the rows were matched, and in the latter we see only
> one.
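
To check a join result for the symptom described above, the column names can be grouped case-insensitively. This is only a minimal sketch in plain Scala with no Spark dependency; `caseCollisions` is a hypothetical helper, not a Spark API, and the column list in the usage line is written by hand for illustration rather than copied from a real run.

```scala
import java.util.Locale

// Hypothetical helper (not part of Spark): returns the groups of column
// names that differ only by case, keyed by their lower-cased form.
// Exact duplicates such as "value"/"value" are not reported.
def caseCollisions(columns: Seq[String]): Map[String, Seq[String]] =
  columns
    .groupBy(_.toLowerCase(Locale.ROOT))
    .map { case (key, names) => key -> names.distinct }
    .filter { case (_, names) => names.size > 1 }

// Hand-written column list standing in for a join result.
val collisions = caseCollisions(Seq("id", "value", "ID", "value"))
```

On a real DataFrame the same check could be applied as `caseCollisions(fullId.columns.toSeq)`; an empty map means no two column names differ only in case.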