HyukjinKwon opened a new pull request #25128: 
[SPARK-28270][FOLLOW-UP][SQL][PYTHON][TESTS] Avoid cast input of UDF as double in the failed test in udf-aggregate_part1.sql
URL: https://github.com/apache/spark/pull/25128
 
 
   ## What changes were proposed in this pull request?
   
   The test can still be flaky in certain environments because of the floating-point limitation described in https://github.com/apache/spark/pull/25110 . See https://github.com/apache/spark/pull/25110#discussion_r302735905
   
   - https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/6584/testReport/org.apache.spark.sql/SQLQueryTestSuite/udf_pgSQL_udf_aggregates_part1_sql___Regular_Python_UDF/
   
   ```
   Expected "700000000000[6] 1", but got "700000000000[5] 1" Result did not match for query #33
   SELECT CAST(avg(udf(CAST(x AS DOUBLE))) AS long), CAST(udf(var_pop(CAST(x AS DOUBLE))) AS decimal(10,3))
   FROM (VALUES (7000000000005), (7000000000007)) v(x)
   ```
   
   Here's what's going on: https://github.com/apache/spark/pull/25110#discussion_r302791930
   
   ```
   scala> Seq("7000000000004.999", "7000000000006.999").toDF().selectExpr("CAST(avg(value) AS long)").show()
   +--------------------------+
   |CAST(avg(value) AS BIGINT)|
   +--------------------------+
   |             7000000000005|
   +--------------------------+
   ```
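
The truncation shown in the Scala session can also be reproduced outside Spark. The following is a minimal Python sketch, assuming plain IEEE 754 doubles and assuming that the cast to long simply drops the fractional part; `truncate_like_cast_to_long` is a hypothetical helper for illustration, not a Spark API.

```python
def truncate_like_cast_to_long(value: float) -> int:
    """Hypothetical helper: mimic a double-to-long cast by dropping the fraction."""
    return int(value)


# Exact inputs from the failing query. Both are below 2**53, so they are
# represented exactly as doubles and their average is exact as well.
exact_avg = (7000000000005.0 + 7000000000007.0) / 2

# Inputs that arrive slightly low, as in the Scala reproduction.
low_avg = (7000000000004.999 + 7000000000006.999) / 2

print(truncate_like_cast_to_long(exact_avg))  # 7000000000006
print(truncate_like_cast_to_long(low_avg))    # 7000000000005
print(round(low_avg))                         # 7000000000006
```

Under these assumptions, a value that lands just below the expected integer is truncated to the previous integer, which matches the `...5` versus `...6` mismatch in the test report. Rounding instead of truncating would hide the difference in this particular case.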
   
   Therefore, this PR just avoids the cast to double in that specific test.
   
   This is a temporary fix. We need a more robust way to avoid such cases.
   
   ## How was this patch tested?
   
   It passes with Maven in my local environment both before and after this PR. I believe the problem is related to the Python or OS installed on the machine. I should test this against the PR builder with `test-maven` to be sure.
   
   
