Github user parente commented on the issue:
https://github.com/apache/spark/pull/18339
Oh neat. #17298 looks similar to the approach we took in spylon-kernel: launch
the JVM with its stdout/stderr redirected to pipes in the parent process, with
background threads reading them
(https://github.com/maxpoint/spylon-kernel/blob/master/spylon_kernel/scala_interpreter.py#L73).
That project is built on Calysto/metakernel, which has an API for sending
stdout/stderr back to kernel clients, so we use that API instead of the
`print()` call this PR uses.
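
For reference, the thread-based pattern looks roughly like this (a minimal sketch, not the actual spylon-kernel code; `java -version` and plain `print` are placeholders for the real gateway launch command and the metakernel output API):

```python
import subprocess
import threading

def drain(pipe, emit):
    # Read lines from the child's pipe until EOF and forward each one.
    for line in iter(pipe.readline, b""):
        emit(line.decode("utf-8", errors="replace"))
    pipe.close()

# Launch the JVM child with its stdout/stderr redirected to pipes we own.
proc = subprocess.Popen(
    ["java", "-version"],  # placeholder for the real launch command
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)

# One daemon thread per pipe so blocking reads never stall the main thread.
for pipe in (proc.stdout, proc.stderr):
    threading.Thread(target=drain, args=(pipe, print), daemon=True).start()

proc.wait()
```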
I still think it would be handy to give clients more control over how the
py4j gateway is launched. For instance, to use pyspark in an asyncio
application, I might want to open pipes to the JVM process but then switch
them to non-blocking I/O mode and hook them up to an async reader. If #17298
merges without making the reader threads optional and exposing the pipes for
the caller to use, it's likely to be more harmful than helpful in that async
scenario.
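
To make the async case concrete, here's a sketch of what a client could do if the pipes were exposed (again, `java -version` stands in for the real gateway launch command):

```python
import asyncio

async def drain(stream, emit):
    # Await lines from the child's pipe without blocking the event loop.
    while True:
        line = await stream.readline()
        if not line:
            break
        emit(line.decode("utf-8", errors="replace"))

async def main():
    # The event loop reads the pipes natively; no reader threads required.
    proc = await asyncio.create_subprocess_exec(
        "java", "-version",  # placeholder for the real launch command
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    await asyncio.gather(
        drain(proc.stdout, print),
        drain(proc.stderr, print),
        proc.wait(),
    )

asyncio.run(main())
```

Hard-wiring daemon reader threads would make a setup like this impossible without fighting the library, which is why optional threads plus exposed pipes seem like the safer default.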