WeichenXu123 commented on a change in pull request #25138:
[SPARK-26175][PYSPARK] Closing stdin of the worker process right after fork
URL: https://github.com/apache/spark/pull/25138#discussion_r303716013
##########
File path: python/pyspark/sql/tests/test_udf.py
##########
@@ -616,6 +616,20 @@ def f():
self.spark.range(1).select(f()).collect()
+    def test_worker_original_stdin_closed(self):
+        # Test the original stdin of worker inherit from daemon is closed
+        # and is replaced with '/dev/null'.
+        # See SPARK-26175
+        def task(iterator):
+            import sys
+            res = sys.stdin.read()
+            # Because the stdin is replaced with '/dev/null'
+            # Read data from it will get EOF
+            assert res == '', "Expect read EOF from stdin."
Review comment:
This verifies that reading from stdin returns EOF immediately.
Should we add more tests, such as verifying that the worker process actually exits?
That said, I think the current test is enough: the fact that we can only read EOF
from stdin shows that stdin is a dummy, safe file descriptor, so it won't affect
other file descriptors in the daemon.
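
To illustrate the reasoning above, here is a minimal standalone sketch, not the code from this pull request. It assumes a hypothetical child process (`CHILD_SOURCE`) that replaces file descriptor 0 with `os.devnull` via `os.dup2` and then reads from it. The helper name `read_after_replacing_stdin` and the payload are made up for illustration. The parent writes data to the child's original stdin pipe, but the child should still see only EOF, since fd 0 no longer refers to that pipe.

```python
import os
import subprocess
import sys

# Hypothetical stand-in for a worker process (an assumption, not PySpark code):
# it swaps fd 0 for the null device and then reads from fd 0.
CHILD_SOURCE = """
import os
fd = os.open(os.devnull, os.O_RDONLY)
os.dup2(fd, 0)   # fd 0 now refers to the null device, not the inherited pipe
os.close(fd)
print(repr(os.read(0, 1024)))  # an empty read means EOF
"""


def read_after_replacing_stdin(parent_payload=b"data on the original stdin"):
    """Run the child with a pipe as stdin and return what it reports reading."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        input=parent_payload,      # written to the child's original stdin pipe
        stdout=subprocess.PIPE,
        check=True,
    )
    return proc.stdout.decode().strip()


if __name__ == "__main__":
    print(read_after_replacing_stdin())
```

Under these assumptions the child reports an empty read even though the parent supplied data, which is the same property the test's `assert res == ''` relies on.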