Hi Dario -- the most likely explanation, without a reproducible example, is 
that the code used on workers sometimes puts R into a state that it cannot 
recover from.

A first step in debugging this is to run the code serially, e.g., using 
SerialParam(), and perhaps register(SerialParam()) (to make serial evaluation 
the default in any bplapply() invoked without a BPPARAM argument).
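
As a minimal sketch of the serial approach (fun and xs are hypothetical 
stand-ins for the real per-task function and inputs, which I have not seen):

```r
library(BiocParallel)

## Hypothetical stand-ins for the real per-task function and inputs
fun <- function(x) x^2
xs <- 1:3

## Option 1: pass SerialParam explicitly to a single call
res1 <- bplapply(xs, fun, BPPARAM = SerialParam())

## Option 2: make serial evaluation the default for calls without BPPARAM
register(SerialParam())
res2 <- bplapply(xs, fun)
```

With serial evaluation everything runs in the main R session, so the usual 
tools such as traceback() and debug() are available if a task fails.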

BiocParallel 1.5.12 is from the 'devel' branch of Bioconductor, which is 
supposed to be used (currently) with R-devel; please always use the appropriate 
version of R, with packages installed using biocLite(), when reporting problems.
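
For a report, something along these lines shows the versions in use (a sketch 
only; it prints the versions but does not check that they belong together):

```r
## Report the R and BiocParallel versions in use
R.version.string
packageVersion("BiocParallel")

## Full session details, useful to paste into a problem report
sessionInfo()
```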

Probably this belongs on support.bioconductor.org, where others may more easily 
benefit from your experience.

A couple of things have come up while looking into your problem and into how R 
can get into the situation where several processes share a socket connection in 
the CLOSE_WAIT state. I'm still exploring solutions, but it is not obvious that 
these would address whatever your underlying issue might be; at best, R might 
become more helpful in saying that something has gone wrong, without being able 
to say exactly what.

Martin
________________________________________
From: Bioc-devel [[email protected]] on behalf of Dario Strbenac 
[[email protected]]
Sent: Friday, January 01, 2016 9:00 PM
To: [email protected]
Subject: Re: [Bioc-devel] bplapply Processes Sometimes Stall

Good day,

I haven't been able to create a small reproducible example, but I am using 
bpstart and bpstop to run a loop with 25 workers multiple times on a large 
bioinformatics dataset. After the loop has run successfully a few times, a 
small number of the R workers use 100% CPU indefinitely:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 3300 dario     20   0 1190832 837212  17988 R 100.0  0.2   3848:00 R
 5014 dario     20   0 1194528 829084   8224 R  99.8  0.2   3843:44 R
 5015 dario     20   0 1194532 829088   8224 R  99.8  0.2   3843:44 R

There are also three connections belonging to the R processes waiting to close :

~$ lsof -i | grep CLOSE
R          3300 dario 1025u  IPv4 160778259      0t0  TCP localhost:11881->localhost:49379 (CLOSE_WAIT)
R          5014 dario 1025u  IPv4 160778259      0t0  TCP localhost:11881->localhost:49379 (CLOSE_WAIT)
R          5015 dario 1025u  IPv4 160778259      0t0  TCP localhost:11881->localhost:49379 (CLOSE_WAIT)

~$ lsof -i | grep -c R
256

I use :

R version 3.2.3 (2015-12-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 8 (jessie)

with BiocParallel 1.5.12

--------------------------------------
Dario Strbenac
PhD Student
University of Sydney
Camperdown NSW 2050
Australia
_______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

