[GitHub] spark pull request #22326: [SPARK-25314][SQL] Fix Python UDF accessing attib...

xuanyuanking Tue, 04 Sep 2018 00:12:01 -0700

Github user xuanyuanking commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22326#discussion_r214807200
  
    --- Diff: python/pyspark/sql/tests.py ---
    @@ -545,6 +545,15 @@ def test_udf_in_filter_on_top_of_join(self):
             right = self.spark.createDataFrame([Row(b=1)])
             f = udf(lambda a, b: a == b, BooleanType())
             df = left.crossJoin(right).filter(f("a", "b"))
    +
    +    def test_udf_in_join_condition(self):
    +        # regression test for SPARK-25314
    +        from pyspark.sql.functions import udf
    +        left = self.spark.createDataFrame([Row(a=1)])
    +        right = self.spark.createDataFrame([Row(b=1)])
    +        f = udf(lambda a, b: a == b, BooleanType())
    +        df = left.crossJoin(right).filter(f("a", "b"))
    --- End diff --
    
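    Ditto: the correct test is `df = left.join(right, f("a", "b"))`, so that the UDF is the join condition itself rather than a filter on top of a cross join, which the preceding test already covers.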
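
For illustration, the suggested change could be sketched as below. This is a minimal sketch and not the code that was merged: the helper name `udf_in_join_condition` and its `spark` parameter are hypothetical, and it assumes a working `pyspark` installation and an active `SparkSession`. It deliberately makes no assertion on the result.

```python
def udf_in_join_condition(spark):
    """Build a join whose condition is a Python UDF over columns from both sides.

    `spark` is assumed to be an active SparkSession (hypothetical parameter).
    """
    # Imports are kept local, mirroring the test in the diff above.
    from pyspark.sql import Row
    from pyspark.sql.functions import udf
    from pyspark.sql.types import BooleanType

    left = spark.createDataFrame([Row(a=1)])
    right = spark.createDataFrame([Row(b=1)])
    f = udf(lambda a, b: a == b, BooleanType())

    # The UDF is passed as the join condition, instead of being applied
    # as a filter after crossJoin as in the diff under review.
    return left.join(right, f("a", "b"))
```

Inside the test class this would correspond to `df = udf_in_join_condition(self.spark)`. Depending on the Spark version, evaluating such a join may also require cross joins to be enabled (`spark.sql.crossJoin.enabled`); that is an assumption here, not something stated in the comment.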



