[ 
https://issues.apache.org/jira/browse/SPARK-30957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-30957.
----------------------------------
    Resolution: Won't Fix

> Null-safe variant of Dataset.join(Dataset[_], Seq[String])
> ----------------------------------------------------------
>
>                 Key: SPARK-30957
>                 URL: https://issues.apache.org/jira/browse/SPARK-30957
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.1.0
>            Reporter: Enrico Minack
>            Priority: Major
>
> The {{Dataset.join(Dataset, Seq[String])}} method provides extra convenience 
> over {{Dataset.join(Dataset, joinExprs: Column)}} as it does not duplicate 
> the join columns {{Seq[String]}} in the result {{DataFrame}}. Those columns 
> are compared with {{===}}, so rows with null in a join column never match. 
> When the join columns need a null-safe comparison with {{<=>}} instead, the 
> join condition becomes verbose and requires extra {{drop}} operations:
> {code:java}
> df1.join(df2, df1("a") <=> df2("a") && df1("b") <=> df2("b"))
>   .drop(df2("a")).drop(df2("b")).show()
> {code}
> A more elegant alternative would be a null-safe join operation such as:
> {code:java}
> df1.joinNullSafe(df2, joinColumns)
> {code}
> Possible names:
>  - {{Dataset.joinNullSafe(Dataset[_], Seq[String])}}
>  - {{Dataset.joinWithNulls(Dataset[_], Seq[String])}}
>  - {{Dataset.join(Dataset[_], Seq[String], <=>)}}
> *I am happy to provide a PR if this Dataset API extension is appreciated.*
> This request has been sent to the Apache Spark user and 
> [dev|http://apache-spark-developers-list.1001551.n3.nabble.com/Fwd-dataframe-null-safe-joins-given-a-list-of-columns-tt28842.html]
>  mailing lists by Marcelo Valle.
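
Since the issue was resolved as Won't Fix, the behaviour described above has to live in user code. The following is only a minimal sketch of how that might look, built from the calls already shown in the description ({{<=>}}, {{&&}}, {{join}} with a Column condition, and {{drop}}). The object name NullSafeJoinSketch and the helper joinNullSafe are hypothetical and not part of the Spark API. The sketch assumes an inner join and two DataFrames that do not share lineage (a self-join would make df(column) references ambiguous).

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}

object NullSafeJoinSketch {

  // Hypothetical helper, NOT part of Spark: inner join on the given columns
  // using the null-safe equality operator <=>, keeping only the left-hand
  // copy of each join column in the result.
  def joinNullSafe(left: DataFrame, right: DataFrame, joinColumns: Seq[String]): DataFrame = {
    require(joinColumns.nonEmpty, "at least one join column is required")

    // Build left(a) <=> right(a) && left(b) <=> right(b) && ...
    val condition: Column =
      joinColumns.map(c => left(c) <=> right(c)).reduce(_ && _)

    // Drop the right-hand duplicates, one drop per join column.
    joinColumns.foldLeft(left.join(right, condition)) { (joined, c) =>
      joined.drop(right(c))
    }
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("null-safe-join-sketch")
      .getOrCreate()
    import spark.implicits._

    // Example data (made up): one row on each side has a null join key.
    val df1 = Seq[(Option[Int], String)]((Some(1), "l1"), (None, "l2")).toDF("a", "left_value")
    val df2 = Seq[(Option[Int], String)]((Some(1), "r1"), (None, "r2")).toDF("a", "right_value")

    // The built-in using-column join compares with ===, so the null keys do not match.
    df1.join(df2, Seq("a")).show()

    // The sketch also matches the rows whose key is null on both sides.
    joinNullSafe(df1, df2, Seq("a")).show()

    spark.stop()
  }
}
```

For outer joins this sketch would not be equivalent to {{Dataset.join(Dataset, Seq[String], joinType)}}: dropping the right-hand columns loses the key values of right-only rows, so those cases would need extra handling.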


