[GitHub] [spark] LuciferYang opened a new pull request, #38028: [SPARK-40435][SQL][TESTS][FOLLOWUP] Correct test precondition of `PythonUDFSuite` and `ContinuousSuite`

GitBox Wed, 28 Sep 2022 00:50:48 -0700


LuciferYang opened a new pull request, #38028:
URL: https://github.com/apache/spark/pull/38028


   ### What changes were proposed in this pull request?
   https://github.com/apache/spark/pull/37894 changed the preconditions for the following two tests from `assume(shouldTestGroupedAggPandasUDFs)` to `assume(shouldTestPythonUDFs)`:
   
   - `SPARK-39962: Global aggregation of Pandas UDF should respect the column order` in `PythonUDFSuite`
   - `continuous mode with various UDFs - Scalar Pandas UDF` in `ContinuousSuite`
   
   However, that change causes these tests to fail when `pandas` is not 
installed, so this PR restores the test preconditions from 
`assume(shouldTestPythonUDFs)` to `assume(shouldTestGroupedAggPandasUDFs)`.
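
For context, ScalaTest's `assume` cancels a test when its condition is false rather than failing it, which is why the choice of precondition decides between `canceled` and `FAILED` when a dependency is missing. The sketch below is only a rough analogy in Python's `unittest` (not the Spark test code): the names `module_available`, `make_case`, and `run` are hypothetical, and a made-up module name stands in for a missing `pandas`.

```python
import importlib.util
import unittest


def module_available(name: str) -> bool:
    """Return True if the top-level module `name` can be imported."""
    return importlib.util.find_spec(name) is not None


def make_case(required_module: str, guarded: bool) -> type:
    """Build a test case whose single test needs `required_module`."""

    class DependencyTest(unittest.TestCase):
        def test_needs_module(self):
            # Precondition guard: skip instead of erroring when the
            # dependency is absent (analogous to ScalaTest's `assume`).
            if guarded and not module_available(required_module):
                self.skipTest(f"{required_module} is not installed")
            importlib.import_module(required_module)

    return DependencyTest


def run(required_module: str, guarded: bool) -> unittest.TestResult:
    """Run the generated test case and return the collected result."""
    case = make_case(required_module, guarded)
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    missing = "hypothetical_missing_module_xyz"  # stand-in for an absent pandas
    for guarded in (False, True):
        r = run(missing, guarded)
        print(guarded, len(r.errors), len(r.skipped))
```

Without the guard the missing dependency is recorded as an error; with the guard the same test is recorded as skipped and the run still counts as successful.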
   
   ### Why are the changes needed?
   Fixes the test preconditions of `PythonUDFSuite` and `ContinuousSuite` so that the affected tests are canceled instead of failing when `pandas` is not installed.
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   
   - Pass GitHub Actions
   - Manual test with `pandas` not installed:
   
   ```
   build/sbt clean "sql/testOnly org.apache.spark.sql.execution.python.PythonUDFSuite"
   build/sbt clean "sql/testOnly org.apache.spark.sql.streaming.continuous.ContinuousSuite"
   ```
   
   Before
   
   PythonUDFSuite
   ```
   [info] - SPARK-39962: Global aggregation of Pandas UDF should respect the column order *** FAILED *** (799 milliseconds)
   [info]   java.lang.RuntimeException: Python executable [python3] and/or pyspark are unavailable.
   [info]   at org.apache.spark.sql.IntegratedUDFTestUtils$.pandasGroupedAggFunc$lzycompute(IntegratedUDFTestUtils.scala:236)
   [info]   at org.apache.spark.sql.IntegratedUDFTestUtils$.org$apache$spark$sql$IntegratedUDFTestUtils$$pandasGroupedAggFunc(IntegratedUDFTestUtils.scala:217)
   [info]   at org.apache.spark.sql.IntegratedUDFTestUtils$TestGroupedAggPandasUDF.udf$lzycompute(IntegratedUDFTestUtils.scala:433)
   [info]   at org.apache.spark.sql.IntegratedUDFTestUtils$TestGroupedAggPandasUDF.udf(IntegratedUDFTestUtils.scala:430)
   [info]   at org.apache.spark.sql.IntegratedUDFTestUtils$TestGroupedAggPandasUDF.apply(IntegratedUDFTestUtils.scala:444)
   [info]   at org.apache.spark.sql.execution.python.PythonUDFSuite.$anonfun$new$9(PythonUDFSuite.scala:82)
   [info]   at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
   ```
   
   and 
   
   ContinuousSuite
   ```
   [info] - continuous mode with various UDFs - Scalar Pandas UDF *** FAILED *** (715 milliseconds)
   [info]   java.lang.RuntimeException: Python executable [python3] and/or pyspark are unavailable.
   [info]   at org.apache.spark.sql.IntegratedUDFTestUtils$.pandasFunc$lzycompute(IntegratedUDFTestUtils.scala:214)
   [info]   at org.apache.spark.sql.IntegratedUDFTestUtils$.org$apache$spark$sql$IntegratedUDFTestUtils$$pandasFunc(IntegratedUDFTestUtils.scala:194)
   [info]   at org.apache.spark.sql.IntegratedUDFTestUtils$TestScalarPandasUDF$$anon$2.<init>(IntegratedUDFTestUtils.scala:382)
   [info]   at org.apache.spark.sql.IntegratedUDFTestUtils$TestScalarPandasUDF.udf$lzycompute(IntegratedUDFTestUtils.scala:379)
   [info]   at org.apache.spark.sql.IntegratedUDFTestUtils$TestScalarPandasUDF.udf(IntegratedUDFTestUtils.scala:379)
   [info]   at org.apache.spark.sql.IntegratedUDFTestUtils$TestScalarPandasUDF.apply(IntegratedUDFTestUtils.scala:404)
   [info]   at org.apache.spark.sql.streaming.continuous.ContinuousSuite.$anonfun$new$24(ContinuousSuite.scala:289)
   ```
   
   After
   
   PythonUDFSuite
   ```
   [info] Run completed in 11 seconds, 278 milliseconds.
   [info] Total number of tests run: 4
   [info] Suites: completed 1, aborted 0
   [info] Tests: succeeded 4, failed 0, canceled 1, ignored 0, pending 0
   [info] All tests passed.
   [success] Total time: 72 s (01:12), completed 2022-9-28 15:46:40
   ```
   and
   
   ContinuousSuite
   ```
   [info] Run completed in 33 seconds, 197 milliseconds.
   [info] Total number of tests run: 13
   [info] Suites: completed 1, aborted 0
   [info] Tests: succeeded 13, failed 0, canceled 1, ignored 0, pending 0
   [info] All tests passed.
   [success] Total time: 64 s (01:04), completed 2022-9-28 15:49:45
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

