utkarsh39 commented on PR #42385:
URL: https://github.com/apache/spark/pull/42385#issuecomment-1677405782

   > @utkarsh39
   > 
   > I found that this PR may have caused some PySpark test cases to fail in the Java 17 daily tests (pyspark-sql and pyspark-connect modules):
   > 
   > * https://github.com/apache/spark/actions/runs/5837423492
   > * https://github.com/apache/spark/actions/runs/5843658110
   > * https://github.com/apache/spark/actions/runs/5849761680
   > 
   > <img alt="image" width="1157" src="https://user-images.githubusercontent.com/1475305/260390648-bcab0032-5d96-4596-9f03-0aa364f91574.png">
   > 
   > To verify this, I ran some local tests using Java 17:
   > 
   > ```
   > java -version
   > openjdk version "17.0.8" 2023-07-18 LTS
   > OpenJDK Runtime Environment Zulu17.44+15-CA (build 17.0.8+7-LTS)
   > OpenJDK 64-Bit Server VM Zulu17.44+15-CA (build 17.0.8+7-LTS, mixed mode, sharing)
   > ```
   > 
   > 1. Reset to the commit of the PR merged just before [SPARK-44705](https://issues.apache.org/jira/browse/SPARK-44705) and run the following commands:
   > 
   > ```
   > # [SPARK-44765][CONNECT] Simplify retries of ReleaseExecute
   > git reset --hard 9bde882fcb39e9fedced0df9702df2a36c1a84e6
   > export SKIP_UNIDOC=true
   > export SKIP_MIMA=true
   > export SKIP_PACKAGING=true
   > ./dev/run-tests --parallelism 1 --modules "pyspark-sql"
   > ```
   > 
   > ```
   > Finished test(python3.9): pyspark.sql.tests.test_udtf (57s) ... 2 tests were skipped
   > ```
   > 
   > The tests in `pyspark.sql.tests.test_udtf` passed.
   > 
   > 2. Reset to the [SPARK-44705](https://issues.apache.org/jira/browse/SPARK-44705) commit and run the same commands:
   > 
   > ```
   > # [SPARK-44705][PYTHON] Make PythonRunner single-threaded
   > git reset --hard 8aaff55839493e80e3ce376f928c04aa8f31d18c
   > export SKIP_UNIDOC=true
   > export SKIP_MIMA=true
   > export SKIP_PACKAGING=true
   > ./dev/run-tests --parallelism 1 --modules "pyspark-sql"
   > ```
   > 
   > ```
   > ======================================================================
   > FAIL: test_udtf_with_analyze_table_argument_adding_columns (pyspark.sql.tests.test_udtf.UDTFTests)
   > ----------------------------------------------------------------------
   > Traceback (most recent call last):
   >   File "/Users/yangjie01/SourceCode/git/spark-mine-sbt/python/pyspark/sql/tests/test_udtf.py", line 1340, in test_udtf_with_analyze_table_argument_adding_columns
   >     assertSchemaEqual(
   >   File "/Users/yangjie01/SourceCode/git/spark-mine-sbt/python/pyspark/testing/utils.py", line 356, in assertSchemaEqual
   >     raise PySparkAssertionError(
   > pyspark.errors.exceptions.base.PySparkAssertionError: [DIFFERENT_SCHEMA] Schemas do not match.
   > --- actual
   > +++ expected
   > - StructType([StructField('a', LongType(), True)])
   > + StructType([StructField('id', LongType(), False), StructField('is_even', BooleanType(), True)])
   > 
   > ======================================================================
   > FAIL: test_udtf_with_analyze_table_argument_repeating_rows (pyspark.sql.tests.test_udtf.UDTFTests) (query_no=0)
   > ----------------------------------------------------------------------
   > Traceback (most recent call last):
   >   File "/Users/yangjie01/SourceCode/git/spark-mine-sbt/python/pyspark/sql/tests/test_udtf.py", line 1394, in test_udtf_with_analyze_table_argument_repeating_rows
   >     assertSchemaEqual(df.schema, expected_schema)
   >   File "/Users/yangjie01/SourceCode/git/spark-mine-sbt/python/pyspark/testing/utils.py", line 356, in assertSchemaEqual
   >     raise PySparkAssertionError(
   > pyspark.errors.exceptions.base.PySparkAssertionError: [DIFFERENT_SCHEMA] Schemas do not match.
   > --- actual
   > +++ expected
   > - StructType([StructField('id', LongType(), False), StructField('is_even', BooleanType(), True)])
   > + StructType([StructField('id', LongType(), False)])
   > 
   > ======================================================================
   > FAIL: test_udtf_with_analyze_table_argument_repeating_rows (pyspark.sql.tests.test_udtf.UDTFTests)
   > ----------------------------------------------------------------------
   > Traceback (most recent call last):
   >   File "/Users/yangjie01/SourceCode/git/spark-mine-sbt/python/pyspark/sql/tests/test_udtf.py", line 1400, in test_udtf_with_analyze_table_argument_repeating_rows
   >     self.spark.sql(
   > AssertionError: AnalysisException not raised
   > 
   > ======================================================================
   > FAIL: test_udtf_with_analyze_using_accumulator (pyspark.sql.tests.test_udtf.UDTFTests) (query_no=0)
   > ----------------------------------------------------------------------
   > Traceback (most recent call last):
   >   File "/Users/yangjie01/SourceCode/git/spark-mine-sbt/python/pyspark/sql/tests/test_udtf.py", line 1625, in test_udtf_with_analyze_using_accumulator
   >     assertSchemaEqual(df.schema, StructType().add("col1", IntegerType()))
   >   File "/Users/yangjie01/SourceCode/git/spark-mine-sbt/python/pyspark/testing/utils.py", line 356, in assertSchemaEqual
   >     raise PySparkAssertionError(
   > pyspark.errors.exceptions.base.PySparkAssertionError: [DIFFERENT_SCHEMA] Schemas do not match.
   > --- actual
   > +++ expected
   > - StructType([StructField('a', IntegerType(), True), StructField('b', IntegerType(), True)])
   > + StructType([StructField('col1', IntegerType(), True)])
   > 
   > ======================================================================
   > FAIL: test_udtf_with_analyze_using_accumulator (pyspark.sql.tests.test_udtf.UDTFTests)
   > ----------------------------------------------------------------------
   > Traceback (most recent call last):
   >   File "/Users/yangjie01/SourceCode/git/spark-mine-sbt/python/pyspark/sql/tests/test_udtf.py", line 1628, in test_udtf_with_analyze_using_accumulator
   >     self.assertEqual(test_accum.value, 222)
   > AssertionError: 111 != 222
   > 
   > ----------------------------------------------------------------------
   > Ran 174 tests in 54.619s
   > 
   > FAILED (failures=34, errors=6, skipped=2)
   > ```
   > 
   > There are 34 test failures and 6 errors after this PR was merged.
   > 
   > @utkarsh39 Do you have time to fix these test cases? I have created [SPARK-44797](https://issues.apache.org/jira/browse/SPARK-44797) to track this.
   > 
   > Or should we revert this PR to restore the Java 17 daily tests first? @HyukjinKwon @ueshin @dongjoon-hyun
   
   I will try to get these tests fixed ASAP.
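
For comparing the run before SPARK-44705 with the run after it, a minimal sketch along these lines may help. It assumes the raw `unittest` output of each run has been saved as plain text; the helper names `parse_summary` and `failed_tests` are hypothetical and are not part of Spark or `dev/run-tests`.

```python
import re

# Final unittest summary line, e.g.
# "FAILED (failures=34, errors=6, skipped=2)" or "OK (skipped=2)".
SUMMARY_RE = re.compile(r"^(FAILED|OK)(?: \((.*)\))?$", re.MULTILINE)
# Per-test headers such as "FAIL: test_name (module.Class)".
FAIL_RE = re.compile(r"^(?:FAIL|ERROR): (\S+)", re.MULTILINE)


def parse_summary(log: str) -> dict:
    """Return the status and counters from the unittest summary line.

    Hypothetical helper; returns an empty dict if no summary line is found.
    """
    match = SUMMARY_RE.search(log)
    if match is None:
        return {}
    counters = {"status": match.group(1)}
    for item in (match.group(2) or "").split(", "):
        if "=" in item:
            key, value = item.split("=")
            counters[key] = int(value)
    return counters


def failed_tests(log: str) -> set:
    """Return the names of tests reported as FAIL or ERROR (hypothetical helper)."""
    return set(FAIL_RE.findall(log))


# Shortened sample in the shape of the log quoted above.
sample = """\
======================================================================
FAIL: test_udtf_with_analyze_using_accumulator (pyspark.sql.tests.test_udtf.UDTFTests)
----------------------------------------------------------------------
AssertionError: 111 != 222

----------------------------------------------------------------------
Ran 174 tests in 54.619s

FAILED (failures=34, errors=6, skipped=2)
"""

print(parse_summary(sample))
print(sorted(failed_tests(sample)))
```

Taking the set difference of `failed_tests` between the two saved logs would list the tests that started failing at the second commit.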


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

