LuciferYang commented on PR #42385:
URL: https://github.com/apache/spark/pull/42385#issuecomment-1676854237
@utkarsh39 I found that this PR may have caused some PySpark test cases to fail in the Java 17 daily tests (pyspark-sql and pyspark-connect modules):

- https://github.com/apache/spark/actions/runs/5837423492
- https://github.com/apache/spark/actions/runs/5843658110
- https://github.com/apache/spark/actions/runs/5849761680

<img width="1157" alt="image" src="https://github.com/apache/spark/assets/1475305/bcab0032-5d96-4596-9f03-0aa364f91574">

To verify this, I ran some local tests using Java 17:

```
java -version
openjdk version "17.0.8" 2023-07-18 LTS
OpenJDK Runtime Environment Zulu17.44+15-CA (build 17.0.8+7-LTS)
OpenJDK 64-Bit Server VM Zulu17.44+15-CA (build 17.0.8+7-LTS, mixed mode, sharing)
```

1. Reset to the commit immediately before SPARK-44705 and run the following commands:

```
# [SPARK-44765][CONNECT] Simplify retries of ReleaseExecute
git reset --hard 9bde882fcb39e9fedced0df9702df2a36c1a84e6
export SKIP_UNIDOC=true
export SKIP_MIMA=true
export SKIP_PACKAGING=true
./dev/run-tests --parallelism 1 --modules "pyspark-sql"
```

```
Finished test(python3.9): pyspark.sql.tests.test_udtf (57s) ... 2 tests were skipped
Tests passed in 59 seconds
```

The tests in `pyspark.sql.tests.test_udtf` passed.

2. Reset to SPARK-44705 and run the following commands:

```
# [SPARK-44705][PYTHON] Make PythonRunner single-threaded
git reset --hard 8aaff55839493e80e3ce376f928c04aa8f31d18c
export SKIP_UNIDOC=true
export SKIP_MIMA=true
export SKIP_PACKAGING=true
./dev/run-tests --parallelism 1 --modules "pyspark-sql"
```

```
======================================================================
FAIL: test_udtf_with_analyze_table_argument_adding_columns (pyspark.sql.tests.test_udtf.UDTFTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/yangjie01/SourceCode/git/spark-mine-sbt/python/pyspark/sql/tests/test_udtf.py", line 1340, in test_udtf_with_analyze_table_argument_adding_columns
    assertSchemaEqual(
  File "/Users/yangjie01/SourceCode/git/spark-mine-sbt/python/pyspark/testing/utils.py", line 356, in assertSchemaEqual
    raise PySparkAssertionError(
pyspark.errors.exceptions.base.PySparkAssertionError: [DIFFERENT_SCHEMA] Schemas do not match.
--- actual
+++ expected
- StructType([StructField('a', LongType(), True)])
+ StructType([StructField('id', LongType(), False), StructField('is_even', BooleanType(), True)])

======================================================================
FAIL: test_udtf_with_analyze_table_argument_repeating_rows (pyspark.sql.tests.test_udtf.UDTFTests) (query_no=0)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/yangjie01/SourceCode/git/spark-mine-sbt/python/pyspark/sql/tests/test_udtf.py", line 1394, in test_udtf_with_analyze_table_argument_repeating_rows
    assertSchemaEqual(df.schema, expected_schema)
  File "/Users/yangjie01/SourceCode/git/spark-mine-sbt/python/pyspark/testing/utils.py", line 356, in assertSchemaEqual
    raise PySparkAssertionError(
pyspark.errors.exceptions.base.PySparkAssertionError: [DIFFERENT_SCHEMA] Schemas do not match.
--- actual
+++ expected
- StructType([StructField('id', LongType(), False), StructField('is_even', BooleanType(), True)])
+ StructType([StructField('id', LongType(), False)])

======================================================================
FAIL: test_udtf_with_analyze_table_argument_repeating_rows (pyspark.sql.tests.test_udtf.UDTFTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/yangjie01/SourceCode/git/spark-mine-sbt/python/pyspark/sql/tests/test_udtf.py", line 1400, in test_udtf_with_analyze_table_argument_repeating_rows
    self.spark.sql(
AssertionError: AnalysisException not raised

======================================================================
FAIL: test_udtf_with_analyze_using_accumulator (pyspark.sql.tests.test_udtf.UDTFTests) (query_no=0)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/yangjie01/SourceCode/git/spark-mine-sbt/python/pyspark/sql/tests/test_udtf.py", line 1625, in test_udtf_with_analyze_using_accumulator
    assertSchemaEqual(df.schema, StructType().add("col1", IntegerType()))
  File "/Users/yangjie01/SourceCode/git/spark-mine-sbt/python/pyspark/testing/utils.py", line 356, in assertSchemaEqual
    raise PySparkAssertionError(
pyspark.errors.exceptions.base.PySparkAssertionError: [DIFFERENT_SCHEMA] Schemas do not match.
--- actual
+++ expected
- StructType([StructField('a', IntegerType(), True), StructField('b', IntegerType(), True)])
+ StructType([StructField('col1', IntegerType(), True)])

======================================================================
FAIL: test_udtf_with_analyze_using_accumulator (pyspark.sql.tests.test_udtf.UDTFTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/yangjie01/SourceCode/git/spark-mine-sbt/python/pyspark/sql/tests/test_udtf.py", line 1628, in test_udtf_with_analyze_using_accumulator
    self.assertEqual(test_accum.value, 222)
AssertionError: 111 != 222

----------------------------------------------------------------------
Ran 174 tests in 54.619s

FAILED (failures=34, errors=6, skipped=2)
```

There are 34 test failures (and 6 errors) after this PR was merged.

@utkarsh39 Do you have time to fix these test cases? I have created SPARK-44797 to track this. Or should we revert this PR first to restore the Java 17 daily tests?

@HyukjinKwon @ueshin @dongjoon-hyun
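When comparing the runs at the two commits, the closing unittest summary line can be tallied mechanically rather than read by eye. The following is only a minimal sketch, assuming the log tail looks like the one quoted above; `SUMMARY_RE` and `parse_unittest_summary` are hypothetical helper names and are not part of Spark or PySpark.

```python
import re

# Matches the closing summary that unittest prints for a failed run, e.g.
# "FAILED (failures=34, errors=6, skipped=2)".
SUMMARY_RE = re.compile(r"FAILED \(([^)]*)\)")


def parse_unittest_summary(output: str) -> dict:
    """Return the counters from the last 'FAILED (...)' line, or {} if there is none."""
    matches = SUMMARY_RE.findall(output)
    if not matches:
        return {}
    counts = {}
    # Each item looks like "failures=34"; split it into a name and an integer count.
    for item in matches[-1].split(","):
        key, _, value = item.strip().partition("=")
        counts[key] = int(value)
    return counts


# Tail of the failing run quoted in this comment (hypothetical usage).
log_tail = "Ran 174 tests in 54.619s FAILED (failures=34, errors=6, skipped=2)"
summary = parse_unittest_summary(log_tail)
print(summary)  # {'failures': 34, 'errors': 6, 'skipped': 2}
```

A passing run contains no `FAILED (...)` line, so the helper returns an empty dict for it, which makes a before/after comparison of the two commits straightforward.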
