Re: [R-pkg-devel] Too many processes spawned on Windows and Debian, although only 2 cores should be used
Hello. As already pointed out, the current R implementation treats any non-empty value of _R_CHECK_LIMIT_CORES_ other than "false" as a true value, e.g. "TRUE", "true", "T", "1", but also "donald duck". Using '--as-cran' sets _R_CHECK_LIMIT_CORES_="TRUE" if unset; if it is already set, it will not touch it. So it could be that a CRAN check server already uses, say, _R_CHECK_LIMIT_CORES_="true". We cannot make assumptions about that.

To make your life, and an end-user's too, easier, I suggest just using num_workers <- 2L without conditioning on whether you are running on CRAN or not. Why? There are many problems with using parallel::detectCores().

First of all, it can return NA_integer_ on some systems, so you cannot assume it gives a valid value (== error). It can also return 1L, which means your 'num_workers - 1' gives zero workers (== error). You need to account for both cases if you rely on detectCores().

Second, detectCores() reports the CPU cores of the whole machine. It is getting more and more common to run in "cgroups"-constrained environments where your R process only gets access to a fraction of those cores. Such constraints are in place in many shared multi-user HPC environments, and sometimes when using Linux containers (e.g. Docker, Apptainer, and Podman). A notable example of this is RStudio Cloud. So, if you use detectCores() on those systems, you will actually over-parallelize, which slows things down and risks running out of memory. For example, you might launch 64 parallel workers when you only have access to four CPU cores, leaving each core clogged up by 16 workers.

Third, if you default to detectCores() and a user runs your code on a machine shared by many users, the other users will not be happy. Note that the user will often not even know they are overusing the machine. So it is a lose-lose for everyone.

Fourth, detectCores() returns *all* CPU cores on the current machine. These days we have machines with 128, 196, and more cores.
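The NA_integer_ and 1L failure modes described above can be guarded against in a few lines. A minimal sketch, for readers who still want a dynamic default (the helper name safe_workers is illustrative, not from any package):

```r
# Defensive wrapper around parallel::detectCores(): it may return NA_integer_
# or 1L, so clamp the result before deriving a worker count.
safe_workers <- function(cap = 2L) {
  n <- parallel::detectCores()
  if (is.na(n) || n < 1L) n <- 1L   # NA on some platforms
  max(1L, min(n, cap))              # never 0 workers, never above the cap
}
```

With cap = 2L this also satisfies the CRAN check limit, since the result can never exceed two workers.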
Are you sure your software will actually run faster when using that many cores? The benefit from parallelization tends to decrease as you add more workers, until there is no longer a speed improvement. If you keep adding parallel workers beyond that point you will see a negative effect, i.e. you are penalized for parallelizing too much. So be aware that when you test on 16 or 24 cores and things run really fast, that might not be the experience of other users, or of users in the future (who will have access to more CPU cores).

So, yes, I suggest not using num_workers <- detectCores(). Pick a fixed number instead; the CRAN policy suggests using two. You can let the user control how many workers they want to use. As a developer, it is really, really ... (read: impossible) to know how many they want to use.

Cheers, Henrik

PS. Note that detectCores() returns a single integer value (possibly NA_integer_). Because of this, there is no need to subset with num_workers[1]. I have seen this used in code; I am not sure where it comes from, but it looks like cut'n'paste behavior.

On Wed, Nov 16, 2022 at 6:38 AM Riko Kelter wrote:
>
> Hi Ivan,
>
> thanks for the info, I changed the check as you pointed out and it
> worked. R CMD build and R CMD check --as-cran run without errors or
> warnings on Linux + MacOS. However, I uploaded the package again to the
> WINBUILDER service and obtained the following weird error:
>
> * checking re-building of vignette outputs ... ERROR
>   Check process probably crashed or hung up for 20 minutes ... killed
>   Most likely this happened in the example checks (?),
>   if not, ignore the following last lines of example output:
>
>   End of example output (where/before crash/hang-up occurred ?)
>
> Strangely, there are no examples included in any .Rd file. Also, I
> checked whether a piece of code spawns new clusters. However, the
> critical lines are inside a function which is repeatedly called in the
> vignettes. The parallelized part looks as copied below. After the code
> is executed the cluster is stopped. I use registerDoSNOW(cl) because
> otherwise my progress bar does not work.
>
> Code:
>
> ### CHECK CORES
>
> chk <- tolower(Sys.getenv("_R_CHECK_LIMIT_CORES_", ""))
> if (nzchar(chk) && (chk != "false")) {  # then limit the workers
>   num_workers <- 2L
> } else {
>   # use all cores
>   num_workers <- parallel::detectCores()
> }
>
> chk <- Sys.getenv("_R_CHECK_LIMIT_CORES_", "")
>
> cl <- parallel::makeCluster(num_workers[1] - 1)  # not to overload your computer
> # doParallel::registerDoParallel(cl)
> doSNOW::registerDoSNOW(cl)
>
> ### SET UP PROGRESS BAR
>
> pb <- progress_bar$new(
>   format = "Iteration = :letter [:bar] :elapsed | expected time till finish: :eta",
>   total = nsim,  # 100
>   width = 120)
>
> progress_letter <- seq(1, nsim)  # token reported in progress bar
>
> # allowing progress bar to
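Henrik's advice above (a fixed default of two workers that the caller may override) can be sketched as follows. The function name, the option name mypkg.workers, and the argument handling are all illustrative, not from any package:

```r
# Illustrative sketch: a fixed, user-controllable worker count.
# Default to 2 workers per the CRAN policy; the caller may raise it
# explicitly (here via an argument or a package option).
choose_workers <- function(requested = getOption("mypkg.workers", 2L)) {
  workers <- suppressWarnings(as.integer(requested))
  if (is.na(workers) || workers < 1L) workers <- 2L  # fall back to the safe default
  workers
}
```

A caller who knows their machine can then do parallel::makeCluster(choose_workers()) after setting options(mypkg.workers = 8), while everyone else gets the safe default of two.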
Re: [R-pkg-devel] Too many processes spawned on Windows and Debian, although only 2 cores should be used
Hi Ivan,

thanks for the info, I changed the check as you pointed out and it worked. R CMD build and R CMD check --as-cran run without errors or warnings on Linux + MacOS. However, I uploaded the package again to the WINBUILDER service and obtained the following weird error:

* checking re-building of vignette outputs ... ERROR
  Check process probably crashed or hung up for 20 minutes ... killed
  Most likely this happened in the example checks (?),
  if not, ignore the following last lines of example output:

  End of example output (where/before crash/hang-up occurred ?)

Strangely, there are no examples included in any .Rd file. Also, I checked whether a piece of code spawns new clusters. However, the critical lines are inside a function which is repeatedly called in the vignettes. The parallelized part looks as copied below. After the code is executed the cluster is stopped. I use registerDoSNOW(cl) because otherwise my progress bar does not work.

Code:

### CHECK CORES

chk <- tolower(Sys.getenv("_R_CHECK_LIMIT_CORES_", ""))
if (nzchar(chk) && (chk != "false")) {  # then limit the workers
  num_workers <- 2L
} else {
  # use all cores
  num_workers <- parallel::detectCores()
}

chk <- Sys.getenv("_R_CHECK_LIMIT_CORES_", "")

cl <- parallel::makeCluster(num_workers[1] - 1)  # not to overload your computer
# doParallel::registerDoParallel(cl)
doSNOW::registerDoSNOW(cl)

### SET UP PROGRESS BAR

pb <- progress_bar$new(
  format = "Iteration = :letter [:bar] :elapsed | expected time till finish: :eta",
  total = nsim,  # 100
  width = 120)

progress_letter <- seq(1, nsim)  # token reported in progress bar

# allowing progress bar to be used in foreach
progress <- function(n) {
  pb$tick(tokens = list(letter = progress_letter[n]))
}
opts <- list(progress = progress)

### MAIN SIMULATION

if (method == "PP") {
  finalMatrix <- foreach::foreach(s = 1:nsim, .combine = rbind,
                                  .packages = c("extraDistr", "fbst"),
                                  .options.snow = opts) %dopar% {
    tempMatrix = singleTrial_PP(s = s, n = nInit, responseMatrix = responseMatrix,
                                nInit = nInit, Nmax = Nmax, batchsize = batchsize,
                                a0 = a0, b0 = b0)
    tempMatrix  # equivalent to finalMatrix = rbind(finalMatrix, tempMatrix)
  }
}

if (method == "PPe") {
  refFunc = refFunc
  nu = nu
  shape1 = shape1
  shape2 = shape2
  if (refFunc == "flat") {
    finalMatrix <- foreach::foreach(s = 1:nsim, .combine = rbind,
                                    .packages = c("extraDistr", "fbst"),
                                    .options.snow = opts) %dopar% {
      tempMatrix = singleTrial_PPe(s = s, n = nInit, responseMatrix = responseMatrix,
                                   nInit = nInit, Nmax = Nmax, batchsize = batchsize,
                                   a0 = a0, b0 = b0, refFunc = "flat")
      tempMatrix  # equivalent to finalMatrix = rbind(finalMatrix, tempMatrix)
    }
  }
  if (refFunc == "beta") {
    finalMatrix <- foreach::foreach(s = 1:nsim, .combine = rbind,
                                    .packages = c("extraDistr", "fbst"),
                                    .options.snow = opts) %dopar% {
      tempMatrix = singleTrial_PPe(s = s, n = nInit, responseMatrix = responseMatrix,
                                   nInit = nInit, Nmax = Nmax, batchsize = batchsize,
                                   a0 = a0, b0 = b0, refFunc = "beta",
                                   shape1 = shape1, shape2 = shape2)
      tempMatrix  # equivalent to finalMatrix = rbind(finalMatrix, tempMatrix)
    }
  }
  if (refFunc == "binaryStep") {
    finalMatrix <- foreach::foreach(s = 1:nsim, .combine = rbind,
                                    .packages = c("extraDistr", "fbst"),
                                    .options.snow = opts) %dopar% {
      tempMatrix = singleTrial_PPe(s = s, n = nInit, responseMatrix = responseMatrix,
                                   nInit = nInit, Nmax = Nmax, batchsize = batchsize,
                                   a0 = a0, b0 = b0, refFunc = "binaryStep",
                                   shape1 = shape1, shape2 = shape2,
                                   truncation = truncation)
      tempMatrix  # equivalent to finalMatrix = rbind(finalMatrix, tempMatrix)
    }
  }
  if (refFunc == "relu") {
    finalMatrix <- foreach::foreach(s = 1:nsim, .combine = rbind,
                                    .packages = c("extraDistr", "fbst"),
                                    .options.snow = opts) %dopar% {
      tempMatrix = singleTrial_PPe(s = s, n = nInit, responseMatrix = responseMatrix,
                                   nInit = nInit, Nmax = Nmax, batchsize = batchsize,
                                   a0 = a0, b0 = b0, refFunc = "relu",
                                   shape1 = shape1, shape2 = shape2,
                                   truncation = truncation)
      tempMatrix  # equivalent to finalMatrix = rbind(finalMatrix, tempMatrix)
    }
  }
  if (refFunc == "palu") {
    finalMatrix <- foreach::foreach(s = 1:nsim, .combine = rbind,
                                    .packages = c("extraDistr", "fbst"),
                                    .options.snow = opts) %dopar% {
      tempMatrix = singleTrial_PPe(s = s, n = nInit, responseMatrix = responseMatrix,
                                   nInit = nInit, Nmax = Nmax, batchsize = batchsize,
                                   a0 = a0, b0 = b0, refFunc = "palu",
                                   shape1 = shape1, shape2 = shape2,
                                   truncation = truncation)
      tempMatrix
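As an aside, the five near-identical refFunc branches in the code above could be collapsed by building the varying arguments once and forwarding them with do.call. A base-R sketch of that idea, where ref_call is a hypothetical stand-in for singleTrial_PPe so the example is self-contained:

```r
# Collapse repeated if-branches: build the extra-argument list once per
# refFunc, then make a single call. ref_call stands in for singleTrial_PPe.
ref_call <- function(s, refFunc, ...) paste(refFunc, s)

run_ref <- function(nsim, refFunc, shape1 = NULL, shape2 = NULL, truncation = NULL) {
  extra <- switch(refFunc,
    flat = list(),
    beta = list(shape1 = shape1, shape2 = shape2),
    list(shape1 = shape1, shape2 = shape2, truncation = truncation))
  sapply(seq_len(nsim), function(s)
    do.call(ref_call, c(list(s = s, refFunc = refFunc), extra)))
}
```

The same forwarding works unchanged inside a foreach loop body; only the argument-list construction moves out of the branches.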
Re: [R-pkg-devel] Too many processes spawned on Windows and Debian, although only 2 cores should be used
On Wed, 16 Nov 2022 07:29:25 +0100, Riko Kelter wrote:

> if (nzchar(chk) && chk == "TRUE") {
>   # use 2 cores in CRAN/Travis/AppVeyor
>   num_workers <- 2L
> }

The check in parallel:::.check_ncores is a bit different:

chk <- tolower(Sys.getenv("_R_CHECK_LIMIT_CORES_", ""))
if (nzchar(chk) && (chk != "false"))  # then limit the workers

Unless you actually set _R_CHECK_LIMIT_CORES_=FALSE on your machine when running the checks, I would perform the more pessimistic check of nzchar(chk) alone (without additionally testing whether the value is "TRUE" or not "false"), though copy-pasting the check from parallel:::.check_ncores should also work.

Can we see the rest of the vignette? Perhaps the problem is not with the check. For example, a piece of code might be implicitly spawning a new cluster, defaulting to all of the cores instead of num_workers.

> [[alternative HTML version deleted]]

Unfortunately, the plain-text version of your message prepared by your mailer has all the code samples mangled: https://stat.ethz.ch/pipermail/r-package-devel/2022q4/008647.html Please compose your messages to R mailing lists in plain text.

--
Best regards,
Ivan

______________________________________________
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel
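Ivan's pessimistic variant, limiting the workers whenever the variable is set at all, can be sketched as follows (the function name limit_workers is illustrative):

```r
# Pessimistic core check: any non-empty value of _R_CHECK_LIMIT_CORES_
# limits the worker count, regardless of its spelling ("TRUE", "true",
# "warn", ...). Only an unset/empty variable falls through to the default.
limit_workers <- function(default = parallel::detectCores()) {
  chk <- Sys.getenv("_R_CHECK_LIMIT_CORES_", "")
  if (nzchar(chk)) 2L else default
}
```

This is strictly more conservative than checking chk == "TRUE": a check server that exports any truthy spelling still gets the two-worker limit.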
[R-pkg-devel] Too many processes spawned on Windows and Debian, although only 2 cores should be used
Hello,

I have a short question on the number of processes which are spawned during parallelization. My package passes R CMD check --as-cran on MacOS and Linux, but the vignettes fail with the following error on Windows and Debian:

--- re-building 'gettingstarted.Rmd' using rmarkdown
Quitting from lines 121-122 (gettingstarted.Rmd)
Error: processing vignette 'gettingstarted.Rmd' failed with diagnostics:
55 simultaneous processes spawned
--- failed re-building 'gettingstarted.Rmd'

The same happens on Debian, where 31 processes are spawned. In all vignettes, I followed a Stack Overflow thread and included the following check for the number of cores on CRAN:

chk <- Sys.getenv("_R_CHECK_LIMIT_CORES_", "")
if (nzchar(chk) && chk == "TRUE") {
  # use 2 cores in CRAN/Travis/AppVeyor
  num_workers <- 2L
} else {
  # use all cores in devtools::test()
  num_workers <- parallel::detectCores()
}

see also https://stackoverflow.com/questions/50571325/r-cran-check-fail-when-using-parallel-functions

Link to the files and check results: https://win-builder.r-project.org/incoming_pretest/brada_1.0_20221115_141147/

Question: Does anyone have a clue why so many processes are spawned on Windows / Debian? There should be only 2 processes spawned if I am correct.

PS: Maybe there is a Windows user who can reproduce the gettingstarted.Rmd vignette and tell me how many processes are spawned on his machine.

PPS: I saw someone recommend putting options(mc.cores = 2) at the top of each vignette, but I think the above code snippet replaces this.

Thanks for any help, all the best,
Riko

[[alternative HTML version deleted]]
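For what it's worth, the condition in the snippet above compares the raw value against the exact string "TRUE", so any other spelling of a true value falls through to the detectCores() branch. A minimal demonstration, with the environment value passed in as an argument so the helper (matches_check, an illustrative name) is self-contained:

```r
# The original vignette check reduced to its condition: only the exact
# string "TRUE" limits the workers; "true", "T", or "warn" do not.
matches_check <- function(value) nzchar(value) && value == "TRUE"
```

On a check machine that sets the variable to, say, "true", this condition is FALSE and the else-branch spawns one worker per detected core, which is consistent with the 55- and 31-process failures reported above.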