[ 
https://issues.apache.org/jira/browse/SPARK-17734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15654971#comment-15654971
 ] 

Leonardo Yvens commented on SPARK-17734:
----------------------------------------

Hello, [~dongjoon] and [~pdxleif]. The issue asks for a method that returns a
typed Dataset, but the suggestion in the first comment returns a DataFrame, so
it doesn't fix the issue. Maybe what you want is a
`joinWith(other: Dataset[U], usingColumn: String): Dataset[(T, U)]`? I think
that method should not exist, because it would have to keep the join column
duplicated (once in each element of the tuple), and that is inconsistent with
the `join` method, which does not duplicate the column. If so, this would be
closed as Won't Fix. Is this correct, or am I misunderstanding the issue?
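
For reference, here is a minimal sketch of how a typed inner equi-join can be
written with the existing `joinWith(other, condition)` overload, without going
through `toDF()`. The case classes `Table1Row` and `Table2Row`, the helper
`joinOnFoo`, and the sample data are hypothetical and only for illustration;
only the shared column name `foo` comes from the issue. It assumes the two
Datasets come from different sources (not a self-join), so `table1("foo")` and
`table2("foo")` resolve unambiguously.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

// Hypothetical record types; both carry the common column `foo`.
case class Table1Row(foo: Int, a: String)
case class Table2Row(foo: Int, b: String)

object TypedEquiJoin {
  // Typed inner equi-join on `foo` using the existing joinWith overload.
  // Dataset.apply(colName) returns a Column bound to that Dataset, so the
  // condition does not hit the ambiguity of $"foo" === $"foo".
  def joinOnFoo(table1: Dataset[Table1Row],
                table2: Dataset[Table2Row]): Dataset[(Table1Row, Table2Row)] =
    table1.joinWith(table2, table1("foo") === table2("foo"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("typed-equi-join-sketch")
      .getOrCreate()
    import spark.implicits._

    val table1 = Seq(Table1Row(1, "x"), Table1Row(2, "y")).toDS()
    val table2 = Seq(Table2Row(1, "p"), Table2Row(3, "q")).toDS()

    // Each result is a pair of whole records, so `foo` appears in both halves.
    joinOnFoo(table1, table2).collect().foreach(println)

    spark.stop()
  }
}
```

Because the result is a `Dataset[(Table1Row, Table2Row)]`, the join column is
present in both elements of each tuple, which is the duplication mentioned
above.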

> inner equi-join shorthand that returns Datasets, like DataFrame already has
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-17734
>                 URL: https://issues.apache.org/jira/browse/SPARK-17734
>             Project: Spark
>          Issue Type: Wish
>            Reporter: Leif Warner
>            Priority: Minor
>
> There's an existing ".join(right: Dataset[_], usingColumn: String):
> DataFrame" method on Dataset.
> Would appreciate it if a variant that returns typed Datasets were also
> available.
> If you write a join condition that contains the common column name, you get an
> AnalysisException thrown because that's ambiguous, e.g.:
> $"foo" === $"foo"
> So I wrote table1.toDF()("foo") === table2.toDF()("foo"), but that's a little
> error-prone, and coworkers considered it a hack and didn't want to use it,
> because it "mixes DataFrame and Dataset api".



