GitHub user felixcheung commented on the issue: https://github.com/apache/spark/pull/18320

Does it fail when running just gapply and nothing else? From what you found in your investigation and the code you pointed to, I suspect this isn't limited to gapply. I think this PR only works around the problem, and I am concerned that a user can still run into this issue.

A naive approach might be to change `spark.sparkr.use.daemon` inside gapply when it is called, but I suspect that only shifts the problem around, and it might then fail with other methods that shuffle or call UDFs.

If a long-running daemon process is the problem, either we find and fix the leak (close the pipe, socket, etc.) or we put a count on the number of executions and recycle the daemon process periodically before the leak becomes fatal. Thoughts?
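
To make the second option concrete, here is a minimal sketch of the count-and-recycle idea in Python. It is not the SparkR daemon code: `RecyclingDaemon`, `Daemon`, `factory`, and `max_executions` are hypothetical names used only for illustration, and the real daemon would need its own way to shut down and respawn.

```python
from typing import Callable, Protocol


class Daemon(Protocol):
    """Hypothetical interface for a long-running worker process."""

    def run(self, task: str) -> str: ...

    def close(self) -> None: ...


class RecyclingDaemon:
    """Replace the wrapped daemon after a fixed number of executions."""

    def __init__(self, factory: Callable[[], Daemon], max_executions: int) -> None:
        if max_executions < 1:
            raise ValueError("max_executions must be >= 1")
        self._factory = factory      # creates a fresh daemon (assumed cheap enough)
        self._max = max_executions   # executions allowed per daemon instance
        self._daemon = factory()
        self._count = 0
        self.restarts = 0

    def run(self, task: str) -> str:
        # Recycle before the next task once the current daemon hit its limit,
        # so any leaked pipes or sockets are released along with the old process.
        if self._count >= self._max:
            self._daemon.close()
            self._daemon = self._factory()
            self._count = 0
            self.restarts += 1
        self._count += 1
        return self._daemon.run(task)

    def close(self) -> None:
        self._daemon.close()
```

The limit trades startup cost against how much leaked state can pile up. This only bounds the leak; it does not fix it, so finding the unclosed pipe or socket is still the better outcome.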