[GitHub] spark issue #18320: [SPARK-21093][R] Avoid mcfork in R's daemon in gapply/ga...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18320 This also happens on Mac for me. My assumption is that `mcfork` itself might be fine (opening a pipe between parent and child is okay), but when the child exits, it looks like it fails to release the resources properly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #18320: [SPARK-21093][R] Avoid mcfork in R's daemon in gapply/ga...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/18320 also, perhaps this is a bug in RHEL/CentOS? it seems like mcfork only calls the OS fork function
[GitHub] spark issue #18320: [SPARK-21093][R] Avoid mcfork in R's daemon in gapply/ga...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18320 I was thinking we should not work around this for now if it can't be fixed within `daemon.R` and it is not practically problematic, which we don't know yet. Let me try to deal with `daemon.R`. Probably the best way is to fix this problem together. I will be back after thinking more and taking a look.
[GitHub] spark issue #18320: [SPARK-21093][R] Avoid mcfork in R's daemon in gapply/ga...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/18320 Does it fail by running just gapply and nothing else? From what you have found in your investigations and the code you pointed to, I suspect this isn't limited to gapply. I think this PR only works around the problem. I am concerned that a user can also run into this issue. A naive approach might be to change `spark.sparkr.use.daemon` inside gapply when it is called, but I suspect that only shifts the problem around, and it might then fail with other methods that shuffle or call UDFs. If a long-running daemon process is the problem, either we find and fix the leak (close the pipe, socket etc.) or we put a count on the number of executions and recycle the daemon process periodically before this leak becomes fatal. thoughts?
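The recycling idea at the end of the comment above could be sketched roughly as follows. This is only an illustration in plain R: `run_daemon`, `handle_task` and `max_executions` are hypothetical names that do not appear in `daemon.R`, the limit of 500 is arbitrary, and the sketch assumes that whatever launched the daemon (the JVM, in SparkR's case) would start a fresh one after the old one stops.

```r
# Hypothetical sketch: serve at most `max_executions` tasks, then stop so that
# a fresh daemon process (with a clean set of file descriptors) can take over.
run_daemon <- function(tasks, handle_task, max_executions = 500L) {
  executions <- 0L
  for (task in tasks) {
    handle_task(task)
    executions <- executions + 1L
    if (executions >= max_executions) {
      # A real daemon would exit here; this sketch only stops serving tasks.
      break
    }
  }
  # Number of tasks served before stopping.
  executions
}
```

For example, `run_daemon(seq_len(2000), function(task) NULL)` serves 500 tasks and then stops, leaving the remaining tasks for a replacement daemon. Whether a counter like this is preferable to finding and closing the leaked pipes is exactly the open question in the comment.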
[GitHub] spark issue #18320: [SPARK-21093][R] Avoid mcfork in R's daemon in gapply/ga...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18320 @felixcheung, BTW, is it okay as a PR alone as is?
[GitHub] spark issue #18320: [SPARK-21093][R] Avoid mcfork in R's daemon in gapply/ga...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18320 I suspect this is an issue in R. I will raise this issue in the R community soon and share it.
[GitHub] spark issue #18320: [SPARK-21093][R] Avoid mcfork in R's daemon in gapply/ga...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18320 Yes, I guess it will pass if we reduce `spark.sql.shuffle.partitions` (I didn't look at this carefully or test it, though). Just to make sure (and to share what I investigated ...), from my reading of the code:

With `spark.sparkr.use.daemon` enabled, for each task execution,

1. JVM starts (if not started) --> R daemon
2. JVM sends port --> R daemon forks with the port --> R worker

This path appears to be tested on OSes other than Windows.

With `spark.sparkr.use.daemon` disabled, for each task execution,

1. JVM forks processes from Java (expensive) --> R worker

This path appears to be tested only on Windows.

This PR proposes to switch this test to the latter case (it was the former before), avoiding calls to the R daemon (which is already running from other executions). I am fine with giving reducing the number of partitions a shot if you prefer that.
[GitHub] spark issue #18320: [SPARK-21093][R] Avoid mcfork in R's daemon in gapply/ga...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/18320 that's very interesting. that code has been around for 2 years - to be honest I'm not 100% sure about what it is doing. perhaps this could also be fixed with a lower number of partitions?
[GitHub] spark issue #18320: [SPARK-21093][R] Avoid mcfork in R's daemon in gapply/ga...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18320 For normal use cases, I cautiously suspect it might be fine, because I executed 200 * ~10 tasks quickly on a single machine, but I don't know whether it happens frequently when tasks run slowly in a cluster in a distributed manner. At least, this was not reproduced when the number of fork executions was small. Practically, it might be fine, but it needs more investigation if this issue is important enough to prioritize.
[GitHub] spark issue #18320: [SPARK-21093][R] Avoid mcfork in R's daemon in gapply/ga...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18320 Yes, the issue is still there, and this only fixes (avoids) the test failure. I believe running the code should reproduce the issue on both Mac and CentOS. What I don't get is that, when the number of fork executions is small, this is not reproduced (sometimes the increasing pipes were not even observed). The number of pipes decreases under certain conditions (it did not look related to time but to some events). The issue with `gapply` is exposed and found now because it invokes many forks via `daemon.R`, but I guess this issue might still exist for all other APIs executing R native functions with this daemon. I made several attempts to resolve the root cause within `daemon.R`, but I could not make it work.

Root cause is:

With a terminal executing `watch -n 0.01 "lsof -c R | wc -l"`

With another terminal:

```r
for (i in 0:200) {
  p <- parallel:::mcfork()
  # mcfork() returns a "masterProcess" object in the child process.
  if (inherits(p, "masterProcess")) {
    # Child: signal itself with SIGUSR1, then exit.
    tools::pskill(Sys.getpid(), tools::SIGUSR1)
    parallel:::mcexit(0L)
  }
}
```

The number of open pipes just keeps increasing. I double-checked via `netstat` and `ps` that the processes and sockets are closed. We need to resolve this one.
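The growth described above can also be observed from inside the R session instead of through `watch` and `lsof`. The following is a minimal sketch under stated assumptions: `count_fds` and `sample_fds_while_forking` are hypothetical helper names, and the descriptor count relies on a Linux-style `/proc` filesystem, so it would apply to the CentOS case but not to macOS. The child branch mirrors the reproduction snippet in the comment.

```r
# Hypothetical helper: number of open file descriptors of a process.
# Assumption: a Linux-style /proc filesystem is mounted (not the case on macOS).
count_fds <- function(pid = Sys.getpid()) {
  length(list.files(file.path("/proc", as.character(pid), "fd")))
}

# Fork `n` short-lived children as in the reproduction and record the
# parent's descriptor count after each fork.
sample_fds_while_forking <- function(n) {
  counts <- integer(n)
  for (i in seq_len(n)) {
    p <- parallel:::mcfork()
    if (inherits(p, "masterProcess")) {
      # Child: same exit sequence as in the reproduction.
      tools::pskill(Sys.getpid(), tools::SIGUSR1)
      parallel:::mcexit(0L)
    }
    # Parent: sample its own descriptor count.
    counts[i] <- count_fds()
  }
  counts
}
```

Calling `sample_fds_while_forking(200)` and plotting or printing the result should show whether the parent's descriptor count keeps climbing on a given R version and OS; no particular output is claimed here, since the thread itself reports that the behavior varies.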
[GitHub] spark issue #18320: [SPARK-21093][R] Avoid mcfork in R's daemon in gapply/ga...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/18320 thx - I think more importantly, does the issue manifest when someone manually calls gapply in a similar way on RHEL/CentOS? We could work around the test failure, but if a user can run into this in normal use then we need to address this within gapply
[GitHub] spark issue #18320: [SPARK-21093][R] Avoid mcfork in R's daemon in gapply/ga...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18320 **[Test build #78148 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78148/testReport)** for PR 18320 at commit [`52c8abf`](https://github.com/apache/spark/commit/52c8abf9551e126f75ef0aa0a042f1ebd13e8d47). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18320: [SPARK-21093][R] Avoid mcfork in R's daemon in gapply/ga...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18320 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78148/ Test PASSed.
[GitHub] spark issue #18320: [SPARK-21093][R] Avoid mcfork in R's daemon in gapply/ga...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18320 Merged build finished. Test PASSed.
[GitHub] spark issue #18320: [SPARK-21093][R] Avoid mcfork in R's daemon in gapply/ga...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18320 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78147/ Test PASSed.
[GitHub] spark issue #18320: [SPARK-21093][R] Avoid mcfork in R's daemon in gapply/ga...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18320 **[Test build #78147 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78147/testReport)** for PR 18320 at commit [`505d75f`](https://github.com/apache/spark/commit/505d75f0e9a90481f96d0f1fefd4f9baaa38ee7d). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18320: [SPARK-21093][R] Avoid mcfork in R's daemon in gapply/ga...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18320 **[Test build #78148 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78148/testReport)** for PR 18320 at commit [`52c8abf`](https://github.com/apache/spark/commit/52c8abf9551e126f75ef0aa0a042f1ebd13e8d47).
[GitHub] spark issue #18320: [SPARK-21093][R] Avoid mcfork in R's daemon in gapply/ga...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18320 Yes, I believe you are correct and the daemon is already running, but to my knowledge this change avoids using the problematic daemon - https://github.com/apache/spark/blob/478fbc866fbfdb4439788583281863ecea14e8af/core/src/main/scala/org/apache/spark/api/r/RRunner.scala#L363-L392
[GitHub] spark issue #18320: [SPARK-21093][R] Avoid mcfork in R's daemon in gapply/ga...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/18320 Hmm I'm not sure - I'm pretty sure the session / spark context is already initialized when this test is run. Does changing the setting here affect the existing daemon process that is already running?
[GitHub] spark issue #18320: [SPARK-21093][R] Avoid mcfork in R's daemon in gapply/ga...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18320 **[Test build #78147 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78147/testReport)** for PR 18320 at commit [`505d75f`](https://github.com/apache/spark/commit/505d75f0e9a90481f96d0f1fefd4f9baaa38ee7d).
[GitHub] spark issue #18320: [SPARK-21093][R] Avoid mcfork in R's daemon in gapply/ga...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18320 cc @felixcheung, @shivaram and @MLnick.