viirya opened a new pull request #25168: [SPARK-28276][SQL][PYTHON][TEST] Convert and port 'cross-join.sql' into UDF test base
URL: https://github.com/apache/spark/pull/25168
 
 
   ## What changes were proposed in this pull request?
   
   This PR adds tests converted from `cross-join.sql` to test UDFs.
   
   <details><summary>Diff comparing to 'cross-join.sql'</summary>
   <p>
   
   ```diff
   diff --git a/sql/core/src/test/resources/sql-tests/results/cross-join.sql.out b/sql/core/src/test/resources/sql-tests/results/udf/udf-cross-join.sql.out
   index 3833c42bdf..11c1e01d54 100644
   --- a/sql/core/src/test/resources/sql-tests/results/cross-join.sql.out
   +++ b/sql/core/src/test/resources/sql-tests/results/udf/udf-cross-join.sql.out
   @@ -43,7 +43,7 @@ two   2       two     22
    
    
    -- !query 3
   -SELECT * FROM nt1 cross join nt2 where nt1.k = nt2.k
   +SELECT * FROM nt1 cross join nt2 where udf(nt1.k) = udf(nt2.k)
    -- !query 3 schema
    struct<k:string,v1:int,k:string,v2:int>
    -- !query 3 output
   @@ -53,7 +53,7 @@ two   2       two     22
    
    
    -- !query 4
   -SELECT * FROM nt1 cross join nt2 on (nt1.k = nt2.k)
   +SELECT * FROM nt1 cross join nt2 on (udf(nt1.k) = udf(nt2.k))
    -- !query 4 schema
    struct<k:string,v1:int,k:string,v2:int>
    -- !query 4 output
   @@ -63,7 +63,7 @@ two   2       two     22
    
    
    -- !query 5
   -SELECT * FROM nt1 cross join nt2 where nt1.v1 = 1 and nt2.v2 = 22
   +SELECT * FROM nt1 cross join nt2 where udf(nt1.v1) = "1" and udf(nt2.v2) = "22"
    -- !query 5 schema
    struct<k:string,v1:int,k:string,v2:int>
    -- !query 5 output
   @@ -71,12 +71,12 @@ one 1       two     22
    
    
    -- !query 6
   -SELECT a.key, b.key FROM
   -(SELECT k key FROM nt1 WHERE v1 < 2) a
   +SELECT udf(a.key), udf(b.key) FROM
   +(SELECT udf(k) key FROM nt1 WHERE v1 < 2) a
    CROSS JOIN
   -(SELECT k key FROM nt2 WHERE v2 = 22) b
   +(SELECT udf(k) key FROM nt2 WHERE v2 = 22) b
    -- !query 6 schema
   -struct<key:string,key:string>
   +struct<udf(key):string,udf(key):string>
    -- !query 6 output
    one    two
    
   @@ -114,23 +114,29 @@ struct<>
    
    
    -- !query 11
   -select * from ((A join B on (a = b)) cross join C) join D on (a = d)
   +select * from ((A join B on (udf(a) = udf(b))) cross join C) join D on (udf(a) = udf(d))
    -- !query 11 schema
   -struct<a:string,va:int,b:string,vb:int,c:string,vc:int,d:string,vd:int>
   +struct<>
    -- !query 11 output
   -one    1       one     1       one     1       one     1
   -one    1       one     1       three   3       one     1
   -one    1       one     1       two     2       one     1
   -three  3       three   3       one     1       three   3
   -three  3       three   3       three   3       three   3
   -three  3       three   3       two     2       three   3
   -two    2       two     2       one     1       two     2
   -two    2       two     2       three   3       two     2
   -two    2       two     2       two     2       two     2
   +org.apache.spark.sql.AnalysisException
   +Detected implicit cartesian product for INNER join between logical plans
   +Filter (udf(a#x) = udf(b#x))
   ++- Join Inner
   +   :- Project [k#x AS a#x, v1#x AS va#x]
   +   :  +- LocalRelation [k#x, v1#x]
   +   +- Project [k#x AS b#x, v1#x AS vb#x]
   +      +- LocalRelation [k#x, v1#x]
   +and
   +Project [k#x AS d#x, v1#x AS vd#x]
   ++- LocalRelation [k#x, v1#x]
   +Join condition is missing or trivial.
   +Either: use the CROSS JOIN syntax to allow cartesian products between these
   +relations, or: enable implicit cartesian products by setting the configuration
   +variable spark.sql.crossJoin.enabled=true;
    
    
    -- !query 12
   -SELECT * FROM nt1 CROSS JOIN nt2 ON (nt1.k > nt2.k)
   +SELECT * FROM nt1 CROSS JOIN nt2 ON (udf(nt1.k) > udf(nt2.k))
    -- !query 12 schema
    struct<k:string,v1:int,k:string,v2:int>
    -- !query 12 output
   ```
   
   </p>
   </details>
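   
   Note on query 11: with the join conditions wrapped in `udf(...)`, the golden file records an `AnalysisException` instead of rows. The error text itself names the configuration that allows implicit cartesian products. A minimal sketch for reproducing the query locally with that setting, assuming the views `A`, `B`, `C`, `D` are already defined as in `cross-join.sql` and that `udf` is registered as in the UDF test base (this snippet is an illustration and is not part of the PR):
   
   ```sql
   -- Hypothetical local repro, not part of this PR.
   -- Assumes views A, B, C, D and the function `udf` already exist in the session.
   -- Allow implicit cartesian products, as suggested by the error message.
   SET spark.sql.crossJoin.enabled=true;
   
   -- Query 11 from udf-cross-join.sql, unchanged.
   select * from ((A join B on (udf(a) = udf(b))) cross join C) join D on (udf(a) = udf(d));
   ```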
   
   ## How was this patch tested?
   
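   Added tests. The new golden file is `sql/core/src/test/resources/sql-tests/results/udf/udf-cross-join.sql.out`.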
