Re: [R-pkg-devel] Too many processes spawned on Windows and Debian, although only 2 cores should be used

2022-11-16 Thread Henrik Bengtsson
Hello.

As already pointed out, the current R implementation treats any
non-empty value of _R_CHECK_LIMIT_CORES_ other than "false" (after
lowercasing) as true, e.g. "TRUE", "true", "T", "1", but also "donald duck".
Using '--as-cran' sets _R_CHECK_LIMIT_CORES_="TRUE", if unset.  If
already set, it'll not touch it.  So, it could be that a CRAN check
server already uses, say, _R_CHECK_LIMIT_CORES_="true".  We cannot
make assumptions about that.

To make your life, and an end-user's too, easier, I suggest just using

  num_workers <- 2L

without conditioning on running on CRAN or not.

Why? There are many problems with using parallel::detectCores().

First of all, it can return NA_integer_ on some systems, so you cannot
assume it gives a valid value (== error).  It can also return 1L, in
which case your 'num_workers - 1' gives zero workers (== error).  You
need to account for both cases if you rely on detectCores().
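For example, a defensive version of the "all cores minus one" pattern
could look like the sketch below (the fallback values are my choice,
not a recommendation):

```r
# detectCores() may return NA_integer_ or 1L; guard against both.
n <- parallel::detectCores()
if (is.na(n)) n <- 1L            # no reliable answer: assume a single core
num_workers <- max(1L, n - 1L)   # leave one core free, but never use zero workers
```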

Second, detectCores() reports the number of CPU cores of the machine's
hardware.  It is becoming more and more common to run in
cgroups-constrained environments where your R process only gets access
to a fraction of these cores.  Such constraints are in place in many
shared multi-user HPC environments, and sometimes when using Linux
containers (e.g. Docker, Apptainer, and Podman).  A notable example of
this is RStudio Cloud.  So, if you use detectCores() on those systems,
you'll actually over-parallelize, which slows things down and risks
running out of memory. For example, you might launch 64 parallel
workers when you only have access to four CPU cores; each core then
gets clogged up by 16 workers.
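As an aside, one way to stay within such limits is sketched below.  It
assumes the CRAN package 'parallelly' (not part of base R) is
available; its availableCores() honors cgroups limits,
options(mc.cores), and _R_CHECK_LIMIT_CORES_, and falls back to 1L
instead of returning NA:

```r
# Prefer a limits-aware core count when 'parallelly' is installed;
# otherwise fall back to a conservative fixed value.
if (requireNamespace("parallelly", quietly = TRUE)) {
  num_workers <- parallelly::availableCores()
} else {
  num_workers <- 2L
}
```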

Third, if you default to detectCores() and a user runs your code on a
machine shared by many users, the other users will not be happy.  Note
that the user will often not know they're overusing the machine.  So,
it's a lose-lose for everyone.

Fourth, detectCores() will return *all* physical CPU cores on the
current machine. These days we have machines with 128, 196, and more
cores.  Are you sure your software will actually run faster when using
that many cores?  The benefit of parallelization tends to decrease as
you add more workers, until at some point there is no longer any speed
improvement.  If you keep adding parallel workers beyond that, you'll
see a negative effect, i.e. you're penalized for parallelizing too
much.  So, be aware that when you test on 16 or 24 cores and things
run really fast, that might not be the experience for other users, or
for users in the future (who will have access to more CPU cores).

So, yes, I suggest not using num_workers <- detectCores().  Pick a
fixed number instead; the CRAN policy suggests using two.  Then let
the user control how many workers they want to use.  As a developer,
it's really, really hard (read: impossible) to know how many they will
want to use.
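A minimal sketch of such a user-controlled default is shown below
("mypkg.workers" and "MYPKG_WORKERS" are hypothetical names; use ones
matching your own package):

```r
# Default to two workers; let users override via an R option or an
# environment variable (the option takes precedence here).
num_workers <- getOption("mypkg.workers",
                         as.integer(Sys.getenv("MYPKG_WORKERS", "2")))
num_workers <- max(1L, as.integer(num_workers))  # sanitize user input
```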

Cheers,

Henrik

PS. Note that detectCores() returns a single integer value (possibly
NA_integer_).  Because of this, there is no need to subset with
num_workers[1]. I've seen this used in code; not sure where it comes
from, but it looks like a cut-and-paste artifact.

On Wed, Nov 16, 2022 at 6:38 AM Riko Kelter  wrote:
>
> Hi Ivan,
>
> [Riko's message quoted in full; see the original post below in this thread]
Re: [R-pkg-devel] Too many processes spawned on Windows and Debian, although only 2 cores should be used

2022-11-16 Thread Riko Kelter
Hi Ivan,

thanks for the info, I changed the check as you pointed out and it 
worked. R CMD build and R CMD check --as-cran run without errors or 
warnings on Linux + MacOS. However, I uploaded the package again to the 
WINBUILDER service and obtained the following weird error:

* checking re-building of vignette outputs ... ERROR
Check process probably crashed or hung up for 20 minutes ... killed
Most likely this happened in the example checks (?),
if not, ignore the following last lines of example output:

 End of example output (where/before crash/hang up occured ?) 

Strangely, there are no examples included in any .Rd file. Also, I 
checked whether a piece of code spawns new clusters. However, the 
critical lines are inside a function which is repeatedly called in the 
vignettes. The parallelized part is copied below. After the code 
is executed the cluster is stopped. I use registerDoSNOW(cl) because 
otherwise my progress bar does not work.


Code:

### CHECK CORES

chk <- tolower(Sys.getenv("_R_CHECK_LIMIT_CORES_", ""))
if (nzchar(chk) && (chk != "false")) {  # then limit the workers
  num_workers <- 2L
} else {
  # use all cores
  num_workers <- parallel::detectCores()
}

chk <- Sys.getenv("_R_CHECK_LIMIT_CORES_", "")

cl <- parallel::makeCluster(num_workers[1] - 1)  # not to overload your computer
# doParallel::registerDoParallel(cl)
doSNOW::registerDoSNOW(cl)

### SET UP PROGRESS BAR

pb <- progress_bar$new(
  format = "Iteration = :letter [:bar] :elapsed | expected time till finish: :eta",
  total = nsim,  # 100
  width = 120)

progress_letter <- seq(1, nsim)  # token reported in progress bar

# allowing progress bar to be used in foreach
progress <- function(n) {
  pb$tick(tokens = list(letter = progress_letter[n]))
}

opts <- list(progress = progress)

### MAIN SIMULATION

if (method == "PP") {
  finalMatrix <- foreach::foreach(s = 1:nsim, .combine = rbind,
      .packages = c("extraDistr", "fbst"), .options.snow = opts) %dopar% {
    tempMatrix = singleTrial_PP(s = s, n = nInit,
        responseMatrix = responseMatrix, nInit = nInit, Nmax = Nmax,
        batchsize = batchsize, a0 = a0, b0 = b0)
    tempMatrix  # Equivalent to finalMatrix = cbind(finalMatrix, tempMatrix)
  }
}

if (method == "PPe") {
  refFunc = refFunc
  nu = nu
  shape1 = shape1
  shape2 = shape2
  if (refFunc == "flat") {
    finalMatrix <- foreach::foreach(s = 1:nsim, .combine = rbind,
        .packages = c("extraDistr", "fbst"), .options.snow = opts) %dopar% {
      tempMatrix = singleTrial_PPe(s = s, n = nInit,
          responseMatrix = responseMatrix, nInit = nInit, Nmax = Nmax,
          batchsize = batchsize, a0 = a0, b0 = b0, refFunc = "flat")
      tempMatrix  # Equivalent to finalMatrix = cbind(finalMatrix, tempMatrix)
    }
  }
  if (refFunc == "beta") {
    finalMatrix <- foreach::foreach(s = 1:nsim, .combine = rbind,
        .packages = c("extraDistr", "fbst"), .options.snow = opts) %dopar% {
      tempMatrix = singleTrial_PPe(s = s, n = nInit,
          responseMatrix = responseMatrix, nInit = nInit, Nmax = Nmax,
          batchsize = batchsize, a0 = a0, b0 = b0, refFunc = "beta",
          shape1 = shape1, shape2 = shape2)
      tempMatrix  # Equivalent to finalMatrix = cbind(finalMatrix, tempMatrix)
    }
  }
  if (refFunc == "binaryStep") {
    finalMatrix <- foreach::foreach(s = 1:nsim, .combine = rbind,
        .packages = c("extraDistr", "fbst"), .options.snow = opts) %dopar% {
      tempMatrix = singleTrial_PPe(s = s, n = nInit,
          responseMatrix = responseMatrix, nInit = nInit, Nmax = Nmax,
          batchsize = batchsize, a0 = a0, b0 = b0, refFunc = "binaryStep",
          shape1 = shape1, shape2 = shape2, truncation = truncation)
      tempMatrix  # Equivalent to finalMatrix = cbind(finalMatrix, tempMatrix)
    }
  }
  if (refFunc == "relu") {
    finalMatrix <- foreach::foreach(s = 1:nsim, .combine = rbind,
        .packages = c("extraDistr", "fbst"), .options.snow = opts) %dopar% {
      tempMatrix = singleTrial_PPe(s = s, n = nInit,
          responseMatrix = responseMatrix, nInit = nInit, Nmax = Nmax,
          batchsize = batchsize, a0 = a0, b0 = b0, refFunc = "relu",
          shape1 = shape1, shape2 = shape2, truncation = truncation)
      tempMatrix  # Equivalent to finalMatrix = cbind(finalMatrix, tempMatrix)
    }
  }
  if (refFunc == "palu") {
    finalMatrix <- foreach::foreach(s = 1:nsim, .combine = rbind,
        .packages = c("extraDistr", "fbst"), .options.snow = opts) %dopar% {
      tempMatrix = singleTrial_PPe(s = s, n = nInit,
          responseMatrix = responseMatrix, nInit = nInit, Nmax = Nmax,
          batchsize = batchsize, a0 = a0, b0 = b0, refFunc = "palu",
          shape1 = shape1, shape2 = shape2, truncation = truncation)
      tempMatrix

Re: [R-pkg-devel] Too many processes spawned on Windows and Debian, although only 2 cores should be used

2022-11-15 Thread Ivan Krylov
On Wed, 16 Nov 2022 07:29:25 +0100, Riko Kelter wrote:

> if (nzchar(chk) && chk == "TRUE") {
>  # use 2 cores in CRAN/Travis/AppVeyor
>  num_workers <- 2L
> }

The check in parallel:::.check_ncores is a bit different:

chk <- tolower(Sys.getenv("_R_CHECK_LIMIT_CORES_", ""))
if (nzchar(chk) && (chk != "false")) # then limit the workers

Unless you actually set _R_CHECK_LIMIT_CORES_=FALSE on your machine
when running the checks, I would perform a more pessimistic check of
nzchar(chk) (without additionally checking whether it's TRUE or not
FALSE), though copy-pasting the check from parallel:::.check_ncores
should also work.
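The pessimistic variant described above, as a short sketch:

```r
# Treat ANY non-empty _R_CHECK_LIMIT_CORES_ as "limit the workers",
# regardless of its value.
chk <- Sys.getenv("_R_CHECK_LIMIT_CORES_", "")
num_workers <- if (nzchar(chk)) 2L else parallel::detectCores()
```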

Can we see the rest of the vignette? Perhaps the problem is not with
the check. For example, a piece of code might be implicitly spawning a
new cluster, defaulting to all of the cores instead of num_workers.

>   [[alternative HTML version deleted]]

Unfortunately, the plain text version of your message prepared by your
mailer has all the code samples mangled:
https://stat.ethz.ch/pipermail/r-package-devel/2022q4/008647.html

Please compose your messages to R mailing lists in plain text.

-- 
Best regards,
Ivan

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


[R-pkg-devel] Too many processes spawned on Windows and Debian, although only 2 cores should be used

2022-11-15 Thread Riko Kelter
Hello,

I have a short question on the number of processes which are spawned 
during parallelization. My package passes R CMD check --as-cran on MacOS 
and Linux, but the vignettes fail with the following error on Windows 
and Debian:

--- re-building 'gettingstarted.Rmd' using rmarkdown
Quitting from lines 121-122 (gettingstarted.Rmd)
Error: processing vignette 'gettingstarted.Rmd' failed with diagnostics:
55 simultaneous processes spawned
--- failed re-building 'gettingstarted.Rmd'

The same happens on Debian, where 31 processes are spawned. In all 
vignettes, I followed a stackoverflow thread and included the following 
check for the number of cores on CRAN:

chk <- Sys.getenv("_R_CHECK_LIMIT_CORES_", "")
if (nzchar(chk) && chk == "TRUE") {
  # use 2 cores in CRAN/Travis/AppVeyor
  num_workers <- 2L
} else {
  # use all cores in devtools::test()
  num_workers <- parallel::detectCores()
}

see also 
https://stackoverflow.com/questions/50571325/r-cran-check-fail-when-using-parallel-functions

Link to the files and check results:

https://win-builder.r-project.org/incoming_pretest/brada_1.0_20221115_141147/

Question: Does anyone have a clue why so many processes are spawned on 
Windows / Debian? If I understand correctly, only 2 processes should be spawned.

PS: Maybe there is a Windows user who can rebuild the 
gettingstarted.Rmd vignette and tell me how many processes are spawned 
on their machine.

PPS: I saw someone recommend putting options(mc.cores=2) at the top of 
each vignette, but I think the above code snippet replaces this.

Thanks for any help, all the best,

Riko

[[alternative HTML version deleted]]
