[ 
https://issues.apache.org/jira/browse/SPARK-17871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-17871:
---------------------------------
    Labels: bulk-closed  (was: )

> Dataset joinWith syntax should support specifying the condition in a 
> compile-time safe way
> ------------------------------------------------------------------------------------------
>
>                 Key: SPARK-17871
>                 URL: https://issues.apache.org/jira/browse/SPARK-17871
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.0.1
>            Reporter: Jamie Hutton
>            Priority: Major
>              Labels: bulk-closed
>
> One of the great things about Datasets is the way they enable compile-time 
> type safety. However, the joinWith method currently only accepts a 
> "condition" specified as a Column, so we have to reference the columns by 
> name, which removes compile-time checking and leads to runtime errors.
> It would be great if spark could support a join method which allowed 
> compile-time checking of the condition. Something like:
> leftDS.joinWith(rightDS, (l, r) => l.id == r.id)
> This would have the added benefit of solving a serialization issue that 
> stops joinWith from working with Kryo (because with Kryo we have just one 
> binary column called "value" representing the entire object, and we need 
> to be able to join on fields within the object; more info here: 
> http://stackoverflow.com/questions/36648128/how-to-store-custom-objects-in-a-dataset).
>  
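
A minimal sketch of the proposed shape of such an API, using plain Scala collections rather than Spark Datasets (the `typedJoinWith` name and its naive nested-loop implementation are hypothetical illustrations, not existing Spark API):

```scala
// Hypothetical sketch: a join whose condition is a (L, R) => Boolean,
// so field references like l.id are checked by the compiler instead of
// being stringly-typed Column expressions. Plain Seqs stand in for
// Datasets; this is not Spark code.
case class Customer(id: Int, name: String)
case class Order(customerId: Int, amount: Double)

object TypedJoinSketch {
  // The condition closes over the element types, giving compile-time safety.
  def typedJoinWith[L, R](left: Seq[L], right: Seq[R])(
      cond: (L, R) => Boolean): Seq[(L, R)] =
    for { l <- left; r <- right if cond(l, r) } yield (l, r)

  def main(args: Array[String]): Unit = {
    val customers = Seq(Customer(1, "a"), Customer(2, "b"))
    val orders    = Seq(Order(1, 9.99), Order(3, 5.00))
    // A typo such as l.idd would fail to compile here, whereas a typo in
    // col("idd") with the current joinWith only fails at runtime.
    val joined = typedJoinWith(customers, orders)((l, r) => l.id == r.customerId)
    println(joined) // List((Customer(1,a),Order(1,9.99)))
  }
}
```

Because the condition never names a Column, this shape would also sidestep the Kryo problem above: the predicate operates on the deserialized objects directly, not on the single binary "value" column.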



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
