[PR] [SPARK-47933][PYTHON][TESTS][FOLLOW-UP] Enable doctest `pyspark.sql.connect.column` [spark]

via GitHub Thu, 06 Jun 2024 02:30:21 -0700


zhengruifeng opened a new pull request, #46895:
URL: https://github.com/apache/spark/pull/46895


   ### What changes were proposed in this pull request?
   Enable the doctests for `pyspark.sql.connect.column`.
   
   
   ### Why are the changes needed?
   To improve test coverage.
   
   
   ### Does this PR introduce _any_ user-facing change?
   No, test only.
   
   
   ### How was this patch tested?
   Manually checked.
   
   I manually broke a doctest in `Column` (`Column.contains`) and found that `pyspark.sql.classic.column` failed while `pyspark.sql.connect.column` still passed:
   ```
   (spark_dev_312) ➜  spark git:(master) ✗ python/run-tests -k --python-executables python3 --testnames 'pyspark.sql.classic.column'
   Running PySpark tests. Output is in /Users/ruifeng.zheng/Dev/spark/python/unit-tests.log
   Will test against the following Python executables: ['python3']
   Will test the following Python tests: ['pyspark.sql.classic.column']
   python3 python_implementation is CPython
   python3 version is: Python 3.12.2
   Starting test(python3): pyspark.sql.classic.column (temp output: /Users/ruifeng.zheng/Dev/spark/python/target/4bdd14b8-92ba-43ba-a7fb-655e6769aeb9/python3__pyspark.sql.classic.column__i2_c1zct.log)
   WARNING: Using incubator modules: jdk.incubator.vector
   Setting default log level to "WARN".
   To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
   **********************************************************************
   File "/Users/ruifeng.zheng/Dev/spark/python/pyspark/sql/column.py", line 385, in pyspark.sql.column.Column.contains
   Failed example:
       df.filter(df.name.contains('o')).collect()
   Differences (ndiff with -expected +actual):
       - [Row(age=5, name='Bobx')]
       ?                      -
       + [Row(age=5, name='Bob')]
   **********************************************************************
      1 of   2 in pyspark.sql.column.Column.contains
   ***Test Failed*** 1 failures.
   
   Had test failures in pyspark.sql.classic.column with python3; see logs.
   (spark_dev_312) ➜  spark git:(master) ✗ python/run-tests -k --python-executables python3 --testnames 'pyspark.sql.connect.column'
   Running PySpark tests. Output is in /Users/ruifeng.zheng/Dev/spark/python/unit-tests.log
   Will test against the following Python executables: ['python3']
   Will test the following Python tests: ['pyspark.sql.connect.column']
   python3 python_implementation is CPython
   python3 version is: Python 3.12.2
   Starting test(python3): pyspark.sql.connect.column (temp output: /Users/ruifeng.zheng/Dev/spark/python/target/2acaff3c-ef1d-41eb-b63e-509f3e0192c0/python3__pyspark.sql.connect.column__66td62h9.log)
   Finished test(python3): pyspark.sql.connect.column (3s)
   Tests passed in 3 seconds
   ```
   
   After this PR, it fails as expected:
   ```
   (spark_dev_312) ➜  spark git:(master) ✗ python/run-tests -k --python-executables python3 --testnames 'pyspark.sql.connect.column'
   Running PySpark tests. Output is in /Users/ruifeng.zheng/Dev/spark/python/unit-tests.log
   Will test against the following Python executables: ['python3']
   Will test the following Python tests: ['pyspark.sql.connect.column']
   python3 python_implementation is CPython
   python3 version is: Python 3.12.2
   Starting test(python3): pyspark.sql.connect.column (temp output: /Users/ruifeng.zheng/Dev/spark/python/target/390ff7ae-7683-425c-b0d2-ee336e1ad452/python3__pyspark.sql.connect.column__f69b3smc.log)
   WARNING: Using incubator modules: jdk.incubator.vector
   Setting default log level to "WARN".
   To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
   org.apache.spark.SparkSQLException: [INVALID_CURSOR.DISCONNECTED] The cursor is invalid. The cursor has been disconnected by the server. SQLSTATE: HY109
        at org.apache.spark.sql.connect.execution.ExecuteGrpcResponseSender.execute(ExecuteGrpcResponseSender.scala:281)
        at org.apache.spark.sql.connect.execution.ExecuteGrpcResponseSender$$anon$1.run(ExecuteGrpcResponseSender.scala:101)
   **********************************************************************
   File "/Users/ruifeng.zheng/Dev/spark/python/pyspark/sql/column.py", line 385, in pyspark.sql.column.Column.contains
   Failed example:
       df.filter(df.name.contains('o')).collect()
   Expected:
       [Row(age=5, name='Bobx')]
   Got:
       [Row(age=5, name='Bob')]
   **********************************************************************
      1 of   2 in pyspark.sql.column.Column.contains
   ***Test Failed*** 1 failures.
   
   Had test failures in pyspark.sql.connect.column with python3; see logs.
   ```
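
The silent pass shown in the first log can be reproduced without Spark. The sketch below is only an illustration of one plausible mechanism, not the change made in this PR: `fake_parent` and `fake_connect` are hypothetical stand-in modules, and the broken expectation mirrors the `'Bobx'` edit used in the manual check. It relies on the standard-library behavior that `doctest.testmod` only collects docstrings from objects defined in the module it is given, so a module that merely re-exports a class contributes zero examples and appears to pass.

```python
import doctest
import types

# Hypothetical stand-in for a module that owns the docstrings.
# The expected output is deliberately wrong, like 'Bobx' in the manual check.
parent = types.ModuleType("fake_parent")
exec(
    '''
class Column:
    def contains(self, other):
        """
        >>> [n for n in ["Alice", "Bob"] if "o" in n]
        ['Bobx']
        """
''',
    parent.__dict__,
)

# Hypothetical stand-in for a module that only re-exports the class.
connect = types.ModuleType("fake_connect")
connect.Column = parent.Column

# Column.__module__ is "fake_parent", so doctest skips it here:
# nothing is attempted and the run looks green.
silent = doctest.testmod(connect, verbose=False, report=False)

# Pointing doctest at the module that defines the docstrings
# collects the example, and the broken expectation is reported.
caught = doctest.testmod(parent, verbose=False, report=False)

print("re-exporting module:", silent.failed, "failed of", silent.attempted)
print("defining module:", caught.failed, "failed of", caught.attempted)
```

A run that attempts zero examples is indistinguishable from a passing run unless the attempted count is checked, which is why deliberately breaking a doctest is a useful sanity check.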
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No
   


