LuciferYang opened a new pull request, #37487: URL: https://github.com/apache/spark/pull/37487
### What changes were proposed in this pull request?

https://github.com/apache/spark/blob/d4c58159925133771d305cc7ac4f1248f215812c/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala#L269-L273

The `testingVersions` filter above seems out of date. If we run the following command with Java 8 and Python 2.7.x:

```
build/mvn clean test -Phadoop-3 -Phive -pl sql/hive -Dtest=none -DwildcardSuites=org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite
```

`testingVersions` will include 2.4.8, 3.0.3, 3.1.3, 3.2.2 and 3.3.0, and the test will be `ABORTED` instead of `CANCELED`. So this PR makes the following changes to the version filter:

- Skip testing Spark 2.x, since Spark 2.4.8 is already EOL
- Require Python 3.7+ for the test, since `Spark runs on Java 8/11/17, Scala 2.12/2.13, Python 3.7+ and R 3.5+.`

After this PR, if the test environment's Python version is lower than 3.7, the tests in `HiveExternalCatalogVersionsSuite` will be canceled.

### Why are the changes needed?

`HiveExternalCatalogVersionsSuite` should not test Spark 2.4.8, and should not be `ABORTED` when a suitable Python is unavailable.

### Does this PR introduce _any_ user-facing change?

No, this is test-only.

### How was this patch tested?

- Pass GitHub Actions
- Manual test with Java 8 and python3 unavailable:

```
build/mvn clean test -Phadoop-3 -Phive -pl sql/hive -Dtest=none -DwildcardSuites=org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite
```

**Before**

```
org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite *** ABORTED ***
  Exception encountered when invoking run on a nested suite - spark-submit returned with exit code 1.
Command line: '/home/disk0/spark-source/spark/sql/hive/target/tmp/test-spark-075ffae9-27ce-40bb-82d9-247651ccb6fb/spark-2.4.8/bin/spark-submit' '--name' 'prepare testing tables' '--master' 'local[2]' '--conf' 'spark.ui.enabled=false' '--conf' 'spark.master.rest.enabled=false' '--conf' 'spark.sql.hive.metastore.version=1.2' '--conf' 'spark.sql.hive.metastore.jars=maven' '--conf' 'spark.sql.warehouse.dir=/home/disk0/spark-source/spark/sql/hive/target/tmp/warehouse-f885ee00-27c4-4db2-92a0-3d6131275c8b' '--conf' 'spark.sql.test.version.index=0' '--driver-java-options' '-Dderby.system.home=/home/disk0/spark-source/spark/sql/hive/target/tmp/warehouse-f885ee00-27c4-4db2-92a0-3d6131275c8b -XX:+IgnoreUnrecognizedVMOptions --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.cs=ALL-UNNAMED --add-opens=java.base/sun.security.action=ALL-UNNAMED --add-opens=java.base/sun.util.calendar=ALL-UNNAMED --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED' '/home/disk0/spark-source/spark/sql/hive/target/tmp/test6906586602292435904.py'
2022-08-11 20:10:23.407 - stderr> 22/08/12 11:10:23 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2022-08-11 20:10:23.671 - stdout> Traceback (most recent call last):
2022-08-11 20:10:23.672 - stdout>   File "/home/disk0/spark-source/spark/sql/hive/target/tmp/test6906586602292435904.py", line 2, in <module>
2022-08-11 20:10:23.672 - stdout>     from pyspark.sql import SparkSession
2022-08-11 20:10:23.672 - stdout>   File "/home/disk0/spark-source/spark/sql/hive/target/tmp/test-spark-075ffae9-27ce-40bb-82d9-247651ccb6fb/spark-2.4.8/python/lib/pyspark.zip/pyspark/__init__.py", line 51, in <module>
2022-08-11 20:10:23.672 - stdout>   File "/home/disk0/spark-source/spark/sql/hive/target/tmp/test-spark-075ffae9-27ce-40bb-82d9-247651ccb6fb/spark-2.4.8/python/lib/pyspark.zip/pyspark/context.py", line 31, in <module>
2022-08-11 20:10:23.672 - stdout>   File "/home/disk0/spark-source/spark/sql/hive/target/tmp/test-spark-075ffae9-27ce-40bb-82d9-247651ccb6fb/spark-2.4.8/python/lib/pyspark.zip/pyspark/accumulators.py", line 97, in <module>
2022-08-11 20:10:23.672 - stdout>   File "/home/disk0/spark-source/spark/sql/hive/target/tmp/test-spark-075ffae9-27ce-40bb-82d9-247651ccb6fb/spark-2.4.8/python/lib/pyspark.zip/pyspark/serializers.py", line 72, in <module>
2022-08-11 20:10:23.672 - stdout>   File "/home/disk0/spark-source/spark/sql/hive/target/tmp/test-spark-075ffae9-27ce-40bb-82d9-247651ccb6fb/spark-2.4.8/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 246, in <module>
2022-08-11 20:10:23.672 - stdout>   File "/home/disk0/spark-source/spark/sql/hive/target/tmp/test-spark-075ffae9-27ce-40bb-82d9-247651ccb6fb/spark-2.4.8/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 270, in CloudPickler
2022-08-11 20:10:23.672 - stdout> NameError: name 'memoryview' is not defined
2022-08-11 20:10:23.682 - stderr> log4j:WARN No appenders could be found for logger (org.apache.spark.util.ShutdownHookManager).
2022-08-11 20:10:23.682 - stderr> log4j:WARN Please initialize the log4j system properly.
2022-08-11 20:10:23.682 - stderr> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. (SparkSubmitTestUtils.scala:96)
```

**After**

```
HiveExternalCatalogVersionsSuite:
11:55:34.369 ERROR org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite: Python version < 3.7.0, the running environment is unavailable.
- backward compatibility !!! CANCELED !!!
  PROCESS_TABLES.isPythonVersionAtLeast37 was false (HiveExternalCatalogVersionsSuite.scala:240)
11:55:34.528 WARN org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite: ===== POSSIBLE THREAD LEAK IN SUITE o.a.s.sql.hive.HiveExternalCatalogVersionsSuite, threads: Keep-Alive-Timer (daemon=true) =====
Run completed in 2 seconds, 87 milliseconds.
Total number of tests run: 0
Suites: completed 2, aborted 0
Tests: succeeded 0, failed 0, canceled 1, ignored 0, pending 0
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
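The two gating rules this PR introduces — drop Spark 2.x from `testingVersions` and require Python 3.7+ before running the suite — can be sketched roughly as follows. This is an illustrative Python sketch, not the suite's actual Scala code; the function names are hypothetical.

```python
# Illustrative sketch of the version gating described above (hypothetical
# helpers, not Spark's actual Scala test code).

def filter_testing_versions(versions):
    """Keep only Spark 3.x and newer; Spark 2.4.8 is EOL and no longer tested."""
    return [v for v in versions if int(v.split(".")[0]) >= 3]

def is_python_version_at_least_37(version_output):
    """Parse output like 'Python 3.8.10' and check whether it is >= 3.7."""
    parts = version_output.strip().split()[-1].split(".")
    major, minor = int(parts[0]), int(parts[1])
    return (major, minor) >= (3, 7)

if __name__ == "__main__":
    # The versions the suite would otherwise pick up, per the PR description.
    candidates = ["2.4.8", "3.0.3", "3.1.3", "3.2.2", "3.3.0"]
    print(filter_testing_versions(candidates))
    # With Python 2.7.x the gate is false, so the suite cancels instead of aborting.
    print(is_python_version_at_least_37("Python 2.7.18"))
```

The key behavioral point is the second check: when it is false, the suite is skipped cleanly (`CANCELED`) rather than failing mid-run (`ABORTED`) inside the spawned spark-submit process.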
