Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/20473#discussion_r165573545
--- Diff: python/run-tests.py ---
@@ -151,6 +152,68 @@ def parse_opts():
return opts
+def _check_dependencies(python_exec, modules_to_test):
+ if "COVERAGE_PROCESS_START" in os.environ:
+ # Make sure if coverage is installed.
+ try:
+ subprocess_check_output(
+ [python_exec, "-c", "import coverage"],
+ stderr=open(os.devnull, 'w'))
+ except:
+ print_red("Coverage is not installed in Python executable '%s' "
+ "but 'COVERAGE_PROCESS_START' environment variable is set, "
+ "exiting." % python_exec)
+ sys.exit(-1)
+
+ # If we should test 'pyspark-sql', it checks if PyArrow and Pandas are installed and
+ # explicitly prints out. See SPARK-23300.
+ if pyspark_sql in modules_to_test:
+ # Hyukjin: I think here is not the best place to leave versions for extra dependencies.
--- End diff --
Not sure. I was also thinking of putting this in
`./dev/sparktestsupport/modules.py`, but I believe that should be done
separately. We should replace these too:
https://github.com/apache/spark/blob/12d20dd75b1620da362dbb5345bed58e47ddacb9/python/pyspark/sql/utils.py#L120
https://github.com/apache/spark/blob/12d20dd75b1620da362dbb5345bed58e47ddacb9/python/pyspark/sql/utils.py#L130