gaogaotiantian commented on code in PR #53410:
URL: https://github.com/apache/spark/pull/53410#discussion_r2604594289


##########
python/run-tests.py:
##########
@@ -228,6 +228,10 @@ def get_default_python_executables():
 def split_and_validate_testnames(testnames):
     testnames_to_test = []
 
+    py4j_module_path = os.path.join(SPARK_HOME, "python/lib/py4j-0.10.9.9-src.zip")

Review Comment:
   Yes, `bin/pyspark` already does this, so we need it here too. Without it,
`importlib.util.find_spec(name)` fails because it can't find `py4j`. It can
still find `pyspark`, which is what leads to the errors.
   
   Basically, when the script tries to locate the test module, it can find
`pyspark` but not `pyspark.sql` (because `py4j` is not on the path). It then
concludes that the test name should be split as `pyspark sql.xxx`. This script
runs before `bin/pyspark`, so it needs `py4j` on its own path to check the test
module properly.
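
The lookup behavior described above can be reproduced in isolation. The following is a minimal sketch, not the code from `python/run-tests.py`: `fakepkg` and `fakedep` are hypothetical stand-ins for `pyspark` and `py4j`, and it assumes the parent package imports its dependency at import time. It relies on the documented behavior that `importlib.util.find_spec` imports the parent package when given a dotted name.

```python
import importlib
import importlib.util
import os
import sys
import tempfile
import zipfile

# Hypothetical stand-ins: "fakepkg" plays the role of pyspark and
# "fakedep" plays the role of py4j shipped as a source zip.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "fakepkg"))
with open(os.path.join(root, "fakepkg", "__init__.py"), "w") as f:
    f.write("import fakedep\n")  # the parent package needs the dependency
with open(os.path.join(root, "fakepkg", "sub.py"), "w") as f:
    f.write("")

# Package the dependency as a zip, similar in spirit to a *-src.zip.
dep_zip = os.path.join(root, "fakedep-src.zip")
with zipfile.ZipFile(dep_zip, "w") as zf:
    zf.writestr("fakedep/__init__.py", "")

sys.path.insert(0, root)
importlib.invalidate_caches()

# A top-level lookup only locates the package; it does not import it.
top_level_found = importlib.util.find_spec("fakepkg") is not None

# A submodule lookup imports the parent first, so the missing
# dependency surfaces here as ModuleNotFoundError.
try:
    importlib.util.find_spec("fakepkg.sub")
    missing = None
except ModuleNotFoundError as exc:
    missing = exc.name

# With the zip on sys.path, the parent imports cleanly and the
# submodule can be located.
sys.path.insert(0, dep_zip)
importlib.invalidate_caches()
submodule_found = importlib.util.find_spec("fakepkg.sub") is not None

print(top_level_found, missing, submodule_found)
```

In this sketch the top-level lookup succeeds, the submodule lookup fails until the dependency zip is on `sys.path`, and then it succeeds, which mirrors the `pyspark` / `pyspark.sql` symptom described in the comment.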



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

