Github user daniel-shields commented on the issue:
https://github.com/apache/spark/pull/21449
This case can also occur when the datasets are different but share a common
lineage. Consider the following:
```python
df = spark.range(10)
df1 = df.groupby('id').count()
df2 = df.groupby('id').sum('id')
df1.join(df2, df2['id'].eqNullSafe(df1['id'])).collect()
```
This currently fails with `eqNullSafe`, but works with `==`.
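For comparison, a minimal sketch of the same join rewritten with `==` (the form that works, per the above), assuming the same `spark` session and the `df1`/`df2` defined in the snippet:

```python
# Same join condition expressed with ==; this currently resolves without error.
df1.join(df2, df2['id'] == df1['id']).collect()
```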