[
https://issues.apache.org/jira/browse/SPARK-11250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15043647#comment-15043647
]
Narine Kokhlikyan edited comment on SPARK-11250 at 12/6/15 2:04 AM:
--------------------------------------------------------------------
Hi there,
I've created a pull request for the join on the Scala side.
It handles the case where column names that are not part of the join
condition repeat in both DataFrames.
e.g.
Employee
-------------
empid
name
Company
----------
cid
empid
name
and we call
employee.join(company, "empid", "inner")
the resulting DataFrame will have the columns:
empid, cid, name_x, name_y
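To make the naming rule concrete, here is a minimal sketch in plain Scala that only computes the output column names. It is not the code from the pull request and does not use the Spark API: the object JoinColumnNames, the function joinedColumns, and the configurable suffixes are hypothetical names chosen for illustration. The sketch emits the join key first, then the left columns, then the right columns, which may differ from the column order listed above.

```scala
// Hypothetical illustration only: not Spark API and not the pull request's code.
object JoinColumnNames {
  // Returns the column names of the joined result: the join key once,
  // then the remaining left columns, then the remaining right columns.
  // Non-key names present on both sides get a suffix (assumed "_x"/"_y").
  def joinedColumns(
      left: Seq[String],
      right: Seq[String],
      joinKey: String,
      leftSuffix: String = "_x",
      rightSuffix: String = "_y"): Seq[String] = {
    // Names that occur in both inputs, excluding the join key itself.
    val clashes = left.toSet.intersect(right.toSet) - joinKey

    // Drop the join key and suffix every clashing name.
    def rename(cols: Seq[String], suffix: String): Seq[String] =
      cols.filter(_ != joinKey).map(c => if (clashes(c)) c + suffix else c)

    joinKey +: (rename(left, leftSuffix) ++ rename(right, rightSuffix))
  }

  def main(args: Array[String]): Unit = {
    val employee = Seq("empid", "name")
    val company = Seq("cid", "empid", "name")
    // prints: empid, name_x, cid, name_y
    println(joinedColumns(employee, company, "empid").mkString(", "))
  }
}
```

Columns that appear on only one side, such as cid, keep their original names, so only genuinely ambiguous names change.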
What do you think? [~davies] [~shivaram] [~sunrui] I can change the other
joins too if we agree on the logic.
Thanks,
Narine
> Generate different alias for columns with same name during join
> ---------------------------------------------------------------
>
> Key: SPARK-11250
> URL: https://issues.apache.org/jira/browse/SPARK-11250
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Reporter: Davies Liu
> Assignee: Apache Spark
>
> It's confusing to see columns with the same name after joining, and they
> are hard to access. We could generate different aliases for them in the
> joined DataFrame.
> See https://github.com/apache/spark/pull/9012/files#r42696855 for an example.