Hello Richard,

It looks like the Dataset is a Dataset[(Int, Int)]. My guess is that in the case of "ds.joinWith(other, expr, Outer).map({ case (t, u) => (Option(t), Option(u)) })", we are trying to use null to create a "(Int, Int)", and it somehow ends up as a Tuple2 holding default values instead.
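To make the suspected difference concrete, here is a minimal plain-Scala sketch. It does not call Spark at all, and the object and value names are hypothetical. It only shows why Option(t) can yield None when the missing side of the join really arrives as null, and why it yields Some((0,0)) once a tuple has already been built from default Int values. Whether this is what actually happens inside joinWith is still a guess.

```scala
object OuterJoinNullSketch {
  // Hypothetical stand-ins for the unmatched side of an outer join row.
  // Tuple2 is a reference type, so it can legitimately be null.
  val missingAsNull: (Int, Int) = null
  // Assumption: the null was replaced by a tuple of default Int values.
  val missingAsDefaults: (Int, Int) = (0, 0)

  // Same wrapping as in the map step: Option(x) is None only when x is null.
  def wrap(side: (Int, Int)): Option[(Int, Int)] = Option(side)

  def main(args: Array[String]): Unit = {
    println(wrap(missingAsNull))     // None
    println(wrap(missingAsDefaults)) // Some((0,0))
  }
}
```

So if the second result is what the notebook shows, the null was most likely lost before the map function ran, not inside it.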
Can you create a jira? We will investigate the issue.

Thanks!

Yin

On Mon, Jun 20, 2016 at 8:21 AM, Richard Marscher <rmarsc...@localytics.com> wrote:

> I know recently outer join was changed to preserve actual nulls through
> the join in https://github.com/apache/spark/pull/13425. I am seeing what
> seems like inconsistent behavior though based on how the join is interacted
> with. In one case the default datatype values are still used instead of
> nulls whereas the other case passes the nulls through. I have a small
> databricks notebook showing the case against 2.0 preview:
>
> https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/160347920874755/4268263383756277/673639177603143/latest.html
>
> --
> *Richard Marscher*
> Senior Software Engineer
> Localytics