[
https://issues.apache.org/jira/browse/SPARK-17871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hyukjin Kwon updated SPARK-17871:
---------------------------------
Labels: bulk-closed (was: )
> Dataset joinWith syntax should support specifying the condition in a
> compile-time safe way
> ------------------------------------------------------------------------------------------
>
> Key: SPARK-17871
> URL: https://issues.apache.org/jira/browse/SPARK-17871
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 2.0.1
> Reporter: Jamie Hutton
> Priority: Major
> Labels: bulk-closed
>
> One of the great things about Datasets is the way they enable compile-time
> type-safety. However, the joinWith method currently only supports a
> "condition" specified as a Column, meaning we have to reference the columns
> by name, which removes compile-time checking and leads to errors.
> It would be great if Spark could support a join method that allowed
> compile-time checking of the condition. Something like:
> leftDS.joinWith(rightDS) { case (l, r) => l.id == r.id }
> This would have the added benefit of solving a serialization issue that
> stops joinWith working when using Kryo (because with Kryo we get just one
> binary column called "value" representing the entire object, and we need
> to be able to join on fields within the object - more info here:
> http://stackoverflow.com/questions/36648128/how-to-store-custom-objects-in-a-dataset).
>
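The type-safety argument can be sketched outside Spark with plain Scala: a condition typed as a function `(L, R) => Boolean` is checked by the compiler, unlike a column name given as a string. The `joinWith` helper below is a hypothetical toy, not Spark's actual API, and the `User`/`Order` case classes are made up for illustration:

```scala
// Toy illustration of a compile-time-checked join condition; this
// joinWith helper is hypothetical, not Spark's actual API.
case class User(id: Long, name: String)
case class Order(id: Long, userId: Long)

// Nested-loop join over plain Seqs: the predicate's parameter types are
// checked by the compiler, so a typo such as `u.idd` fails to compile
// instead of failing at runtime like a misspelled column name would.
def joinWith[L, R](left: Seq[L], right: Seq[R])(cond: (L, R) => Boolean): Seq[(L, R)] =
  for { l <- left; r <- right if cond(l, r) } yield (l, r)

val users  = Seq(User(1, "ann"), User(2, "bob"))
val orders = Seq(Order(10, 1), Order(11, 3))

val joined = joinWith(users, orders)((u, o) => u.id == o.userId)
println(joined) // List((User(1,ann),Order(10,1)))
```

Because the condition never mentions serialized column layout, the same signature would also sidestep the Kryo issue above, where the encoded object is a single binary "value" column.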
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]