HyukjinKwon commented on a change in pull request #25161: [SPARK-28390][SQL][PYTHON][TESTS] Convert and port 'pgSQL/select_having.sql' into UDF test base URL: https://github.com/apache/spark/pull/25161#discussion_r305319557
########## File path: sql/core/src/test/resources/sql-tests/inputs/udf/pgSQL/udf-select_having.sql ########## @@ -0,0 +1,62 @@ +-- +-- Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group +-- +-- +-- SELECT_HAVING +-- https://github.com/postgres/postgres/blob/REL_12_BETA2/src/test/regress/sql/select_having.sql +-- +-- This test file was converted from inputs/pgSQL/select_having.sql + +-- load test data +CREATE TABLE test_having (a int, b int, c string, d string) USING parquet; +INSERT INTO test_having VALUES (0, 1, 'XXXX', 'A'); +INSERT INTO test_having VALUES (1, 2, 'AAAA', 'b'); +INSERT INTO test_having VALUES (2, 2, 'AAAA', 'c'); +INSERT INTO test_having VALUES (3, 3, 'BBBB', 'D'); +INSERT INTO test_having VALUES (4, 3, 'BBBB', 'e'); +INSERT INTO test_having VALUES (5, 3, 'bbbb', 'F'); +INSERT INTO test_having VALUES (6, 4, 'cccc', 'g'); +INSERT INTO test_having VALUES (7, 4, 'cccc', 'h'); +INSERT INTO test_having VALUES (8, 4, 'CCCC', 'I'); +INSERT INTO test_having VALUES (9, 4, 'CCCC', 'j'); + +SELECT udf(b), udf(c) FROM test_having + GROUP BY b, c HAVING udf(count(*)) = 1 ORDER BY b, c; + +-- HAVING is effectively equivalent to WHERE in this case +SELECT udf(b), udf(c) FROM test_having + GROUP BY b, c HAVING udf(b) = 3 ORDER BY b, c; Review comment: Hm, @shivusondur, seems working if we put udf in `ORDER BY` ```python from pyspark.sql.functions import pandas_udf, PandasUDFType @pandas_udf("int", PandasUDFType.SCALAR) def noop(x): return x spark.udf.register("udf", noop) spark.range(10).selectExpr("id as b", "id as c").createOrReplaceTempView("test_having") sql("""SELECT udf(b), udf(c) FROM test_having GROUP BY b, c HAVING udf(count(*)) = 1 ORDER BY udf(b), c""").show() ``` ``` +------+------+ |udf(b)|udf(c)| +------+------+ | 0| 0| | 1| 1| | 2| 2| | 3| 3| | 4| 4| | 5| 5| | 6| 6| | 7| 7| | 8| 8| | 9| 9| +------+------+ ``` can you double check this? ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
