Hi John,
I tried to complicate the example a bit so that it takes longer to evaluate
(see below). All cores seem to be going, but it looks like the environment in
which the evaluation takes place matters. I ran the code twice, once in the
global environment and once in a separately created one. Here are the timings:
Timing in global environment
   user  system elapsed
  0.359   0.153  44.263
Timing in a separate environment
   user  system elapsed
  0.528   0.386  65.376
My original objects are created within a function and are much bigger than in
this example. I have 122^2 expressions to evaluate. The data matrix has more
than 42k rows (10^6 in the example) and 123 columns (n in the example). Since
this is performed within a function, and I have not yet figured out how to
break the data matrix into individual objects stored in the function
environment, I copied everything into a newly created environment (with
list2env) and performed the evaluation there. Could this be the reason some
cores were not being used?
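
One possible way to avoid creating individual objects at all, offered only as a sketch: eval() accepts a named list as its envir argument, so the matrix columns and the coefficients can be bundled into a single list inside the function. The helper name eval_in_data is hypothetical and not part of the code in this thread, and the tiny inputs are made up for illustration.

```r
# Hypothetical helper (not from the thread): evaluate one expression
# against the columns of a matrix and a named list of coefficients.
eval_in_data <- function(e, x, b) {
  # One named list holding every column of x followed by every coefficient
  vals <- c(lapply(seq_len(ncol(x)), function(j) x[, j]), b)
  names(vals) <- c(colnames(x), names(b))
  # eval() looks up x1, b1, ... in the list; functions such as pnorm()
  # are found through the enclosing environment
  mean(eval(e, envir = vals))
}

# Made-up toy inputs
x_small <- cbind(x1 = c(1, 2), x2 = c(3, 4))
b_small <- list(b1 = 2, b2 = 1)
res <- eval_in_data(quote(b1 * x1 + b2 * x2), x_small, b_small)
```

Whether this is faster than list2env() for 122^2 expressions is something I have not measured; it only removes the need for a separate environment object.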
Thank you again for your help.
Yan
#
rm(list = ls())
library(parallel)
set.seed(123)
#
n <- 20
expr <- parse(text = paste("pnorm(", paste(paste("b", 1:n, sep = ""),
paste("x", 1:n, sep = ""), sep = "*", collapse = " + "), ")", sep = ""))
coefs <- paste("b", 1:n, sep = "")
vars <- paste("x", 1:n, sep = "")
G <- lapply(lapply(vars, function(x) D(expr, x)), function(x) lapply(coefs,
function(arg) D(x, arg)))
#
b <- as.list(rnorm(n, 5, 12))
names(b) <- coefs
x <- matrix(rnorm(10^6*(n)), ncol = n)
colnames(x) <- vars
#
ev <- new.env()
list2env(setNames(split(x, col(x)), vars), envir = ev)
list2env(b, envir = ev)
#
nc <- detectCores() - 1
cl <- makeCluster(nc, type = "FORK")
system.time(grad_g <- parLapply(cl, G, function(Z) lapply(Z, function(x)
mean(eval(x, envir = ev)))))
stopCluster(cl)
#
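# Copy the coefficients and the data columns into the global environment.
# b is a list, so its elements are extracted with [[ ]].
sapply(1:length(coefs), function(z) {assign(coefs[z], b[[z]], pos = 1)})
sapply(1:length(vars), function(z) {assign(vars[z], x[, z], pos = 1)})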
nc <- detectCores() - 1
cl <- makeCluster(nc, type = "FORK")
system.time(grad_g <- parLapply(cl, G, function(Z) lapply(Z, function(x)
mean(eval(x)))))
stopCluster(cl)
#
On Dec 11, 2015, at 2:16 PM, John Magnotti wrote:
Hi Yan,
Sorry it wasn't the simple answer. On my machine, your code creates 3 R
processes (#cores -1) and they are all active.
If you try an even simpler example, say
parLapply(cl, 1:8, function_that_takes_a_while)
does that get all the cores going?
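
A minimal sketch of such a check, under a few assumptions: slow_pid is a hypothetical stand-in for function_that_takes_a_while, and the cluster uses the default PSOCK type with 2 workers rather than the FORK cluster from the original code. Each call returns the process ID of the worker that handled it, so tabulating the result shows how the items were spread.

```r
library(parallel)

# Hypothetical stand-in for function_that_takes_a_while:
# pause briefly, then report which worker process handled the item.
slow_pid <- function(i) {
  Sys.sleep(0.2)
  Sys.getpid()
}

cl <- makeCluster(2)  # default PSOCK cluster; the thread uses type = "FORK"
pids <- unlist(parLapply(cl, 1:8, slow_pid))
stopCluster(cl)

# Number of items handled by each worker PID
table(pids)
```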
John
On Fri, Dec 11, 2015 at 6:58 AM, ALPEROVYCH Yan wrote:
Hi John,
Thank you for the reply. In fact, my original list has 122^2 elements in it (I
simplified the code here), and parLapply still behaves in a similar way. Could
it be something else?
Yan
On Dec 11, 2015, at 1:51 PM, John Magnotti wrote:
Hello Yan,
I think parLapply is just assigning a core for every item in the list, G, you
supplied. Because you have more cores than items in the list, some of the cores
won't receive any work.
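
A rough way to look at this, assuming parLapply divides the list with the same rule as the exported helper splitIndices() (my understanding of the parallel package, not something confirmed in this thread): ask how 4 items would be divided among 7 workers, the counts suggested by the 4-element list G and the top snapshot.

```r
library(parallel)

# 4 list items spread over 7 workers, as in the original example
chunks <- splitIndices(4, 7)

lengths(chunks)           # number of items assigned to each worker
sum(lengths(chunks) > 0)  # number of workers that receive any work
```

Workers whose chunk is empty have nothing to evaluate, which would be consistent with some R processes sitting at 0% CPU.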
John
On Fri, Dec 11, 2015 at 4:00 AM, ALPEROVYCH Yan wrote:
Hello,
I have a piece of code that needs parallelization, and it used to work just
fine (about 6 months ago). However, I had to rerun it yesterday and found that
it now behaves in a weird way: not all worker processes are given part of the
computation. I wrote a small example that reproduces the issue. Here is a
snapshot of the top command:
PID  COMMAND  %CPU  TIME
872  R        97.7  00:02.45
871  R         0.0  00:00.03
870  R        98.2  00:02.93
869  R         0.0  00:00.03
868  R        97.7  00:02.46
867  R         0.0  00:00.03
866  R        94.4  00:02.36
862  R         1.0  00:04.64
Interestingly, the mclapply command seems to put all workers to work
immediately.
So is this intended behavior?
Example that reproduces the issue on my machine:
#
rm(list = ls())
library(parallel)
set.seed(123)
#
expr <- expression(pnorm(b0 + b1*x1 + b2*x2 + b3*x3))
G <- list(
D(D(expr, "x2"), "b0"),
D(D(expr, "x2"), "b1"),
D(D(expr, "x2"), "b2"),
D(D(expr, "x2"), "b3"))
#
b0 <- 1.2
b1 <- 0.4
b2 <- 0.2
b3 <- -0.6
x1 <- rnorm(10^7)
x2 <- rnorm(10^7)
x3 <- rnorm(10^7)
#
nc <- detectCores() - 1
cl <- makeCluster(nc, type = "FORK")
grad_g <- parLapply(cl, G, function(Z) lapply(Z, function(x) mean(eval(x))))
stopCluster(cl)
#
sessionInfo()
R version 3.2.3 (2015-12-10)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.2 (El Capitan)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
----
This electronic message and any files attached to it are confidential and
intended exclusively for the use of the person to whom they are addressed. If
you have received this message in error, please return it to its sender. The
ideas and opinions presented in this message are those of its author and do
not necessarily represent those of the institution or affiliated entity by
which the author is employed. Unauthorized publication, use, distribution,
printing, or copying of this message and its attachments is strictly
forbidden.
_______________________________________________
R-SIG-Mac mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mac