I'm using bigkmeans from the 'biganalytics' package to cluster my 60,000 by 600,000 matrix on an 8-core Linux VM. I have registered a parallel backend with

    > registerDoMC()

I then checked how many workers were registered with

    > getDoParWorkers()

It returns 8, which is the number of cores on my machine. I also ran the test below, and the results show a clear speedup from running in parallel.

    check <- function(n) {
      for (i in 1:1000) {
        sme <- matrix(rnorm(100), 10, 10)
        solve(sme)
      }
    }
    times <- 100  # times to run the loop

    > system.time(x <- foreach(j = 1:times) %dopar% check(j))
       user  system elapsed
      -----  ------       4

    > system.time(x <- foreach(j = 1:times) %do% check(j))
       user  system elapsed
      ----- -------      16

But when I run bigkmeans on my data

    > ans <- bigkmeans(data, 200, nstart = 5, iter.max = 20)

I see only one R process in the system monitor, and only one CPU shows high usage. I guess it is not really running in parallel.

I also tried doSNOW, though it is meant for multi-node clusters.

    > cl <- makeCluster(8, type = "SOCK")
    > registerDoSNOW(cl)
    > ans <- bigkmeans(data, 200, nstart = 30)

There are 8 R processes, but only 1 is running.

Is it because I have something misconfigured? Or does bigkmeans not support parallel execution?

Thanks in advance for any advice.

Regards,
Lishu
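
P.S. In case it helps anyone reproduce the symptom independently of bigkmeans, here is a minimal sketch of one way to check whether %dopar% tasks are actually spread over several worker processes. It only assumes that the foreach and doMC packages are installed. The worker count of 2, the task count of 8, and the names pids and n_workers_used are arbitrary choices for this illustration and are not taken from my real session.

```r
library(foreach)
library(doMC)   # fork-based backend, so Linux/macOS only

# Register a small backend; 2 workers is an arbitrary choice for this sketch.
registerDoMC(cores = 2)

# Each task returns the PID of the R process that executed it.
pids <- foreach(j = 1:8, .combine = c) %dopar% Sys.getpid()

# Count how many distinct processes actually did the work.
n_workers_used <- length(unique(pids))
print(pids)
print(n_workers_used)
```

If n_workers_used comes out greater than 1 here while bigkmeans still keeps a single CPU busy, that would suggest the backend itself is working and the question is how bigkmeans uses it.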