Hi Marko,

Thank you for reporting this issue. It definitely has to be fixed as soon as possible. That said, biomaRt is best used for batch queries rather than inside loops. Is there any chance you can query for all the annotations you need at once and then loop in R over the result? That way you need only a few biomaRt queries and avoid the TCP connection leak.
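For illustration, here is a minimal sketch of the batch approach. It assumes a single filter type, that the filter's own attribute is among the requested attributes (so rows can be matched back to their input value), and that all.values holds the values you currently pass one at a time. fetch_batch and split_by_key are hypothetical helper names, not part of biomaRt, and the small table at the end is made up so the splitting step can be tried offline.

```r
# Sketch only: one getBM() call for all values, then loop locally.
# 'attributes', 'filter.type', 'all.values' and 'mart' are placeholders
# modelled on the names used in the original loop.
fetch_batch <- function(attributes, filter.type, all.values, mart) {
  # A single request carrying the whole vector of filter values.
  biomaRt::getBM(attributes = attributes,
                 filters    = filter.type,
                 values     = all.values,
                 mart       = mart)
}

# Split the combined result into one data frame per filter value.
# 'key' names the column holding the filter value; it must be one of
# the requested attributes, otherwise rows cannot be matched back.
split_by_key <- function(result, key) {
  split(result, result[[key]], drop = TRUE)
}

# Offline illustration with a made-up result table (not real BioMart data).
example.result <- data.frame(
  gene_id    = c("g1", "g1", "g2"),
  annotation = c("a", "b", "c"),
  stringsAsFactors = FALSE
)
per.gene <- split_by_key(example.result, "gene_id")

# The former per-query loop now runs over local data, with no network calls.
for (id in names(per.gene)) {
  cat(id, ":", nrow(per.gene[[id]]), "row(s)\n")
}
```

With a real mart, per.gene would come from split_by_key(fetch_batch(...), key), so the number of HTTP requests no longer grows with the number of filter values.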
Let me know if you need help converting your 1000+ queries into one batch query.

Cheers,
Steffen

On Tue, Sep 28, 2010 at 11:59 PM, mxlaakso <mxlaa...@cs.helsinki.fi> wrote:
> Hello,
>
> I have been using the biomaRt package from Bioconductor to fetch some
> biological annotations. What I have noticed this week is that getBM() calls
> leak TCP connections (probably via Curl). I have a loop that makes calls
> such as:
>
> annotations <- getBM(attributes=attributes,
>                      filter    =filter.types,
>                      values    =filter.value,
>                      mart      =mart)
>
> I can see each request creating a new open connection when I execute this
> loop and monitor the open connections using the 'lsof' program. The whole loop
> crashes after 1000 iterations because that exceeds the limit of allowed
> parallel connections. Loops with fewer than 1000 iterations complete
> with correct results, although the connections are left open.
>
> I have also tried to use the curl parameter, so that I first call:
> curlHandle <- getCurlHandle()
> and then use this handle for the getBM() call, but that does not change
> anything. Should I apply some kind of close call in each iteration?
>
> Package: biomaRt
> Version: 2.4.0
> Packaged: 2010-04-22 22:52:44 UTC; biocbuild
> Built: R 2.11.0; ; 2010-04-27 12:27:46 UTC; unix
>
>
> Package: RCurl
> Version: 1.4-3
> Date/Publication: 2010-07-25 12:15:39
> Built: R 2.11.1; x86_64-pc-linux-gnu; 2010-09-23 10:54:07 UTC; unix
>
>
>
> Example output for the COSMIC Biomart:
> MART_NAME = "CosmicMart"
> MART_HOST = "www.sanger.ac.uk"
> MART_PATH = "/genetics/CGP/cosmic/biomart/martservice"
> MART_DSET = "COSMIC48"
>
> $ lsof | grep sanger.ac
> .
> .
> .
> R 19974 myname 259u IPv4 137937 0t0 TCP myhost:48226->ssl-slb11a.sanger.ac.uk:www (CLOSE_WAIT)
> R 19974 myname 260u IPv4 137971 0t0 TCP myhost:48228->ssl-slb11a.sanger.ac.uk:www (CLOSE_WAIT)
> R 19974 myname 261u IPv4 137984 0t0 TCP myhost:48230->ssl-slb11a.sanger.ac.uk:www (CLOSE_WAIT)
> R 19974 myname 262u IPv4 138004 0t0 TCP myhost:48233->ssl-slb11a.sanger.ac.uk:www (CLOSE_WAIT)
> R 19974 myname 263u IPv4 138016 0t0 TCP myhost:48235->ssl-slb11a.sanger.ac.uk:www (CLOSE_WAIT)
> R 19974 myname 264u IPv4 138032 0t0 TCP myhost:48239->ssl-slb11a.sanger.ac.uk:www (CLOSE_WAIT)
> R 19974 myname 265u IPv4 138077 0t0 TCP myhost:45214->ssl-slb11b.sanger.ac.uk:www (CLOSE_WAIT)
> R 19974 myname 266u IPv4 138102 0t0 TCP myhost:45228->ssl-slb11b.sanger.ac.uk:www (CLOSE_WAIT)
> R 19974 myname 267u IPv4 138116 0t0 TCP myhost:45230->ssl-slb11b.sanger.ac.uk:www (CLOSE_WAIT)
> R 19974 myname 268u IPv4 138123 0t0 TCP myhost:48263->ssl-slb11a.sanger.ac.uk:www (CLOSE_WAIT)
> R 19974 myname 269u IPv4 138135 0t0 TCP myhost:48265->ssl-slb11a.sanger.ac.uk:www (CLOSE_WAIT)
> R 19974 myname 270u IPv4 138147 0t0 TCP myhost:48267->ssl-slb11a.sanger.ac.uk:www (CLOSE_WAIT)
> R 19974 myname 271u IPv4 138185 0t0 TCP myhost:48272->ssl-slb11a.sanger.ac.uk:www (CLOSE_WAIT)
> R 19974 myname 272u IPv4 138198 0t0 TCP myhost:48274->ssl-slb11a.sanger.ac.uk:www (CLOSE_WAIT)
> R 19974 myname 273u IPv4 138210 0t0 TCP myhost:48276->ssl-slb11a.sanger.ac.uk:www (CLOSE_WAIT)
> R 19974 myname 274u IPv4 138226 0t0 TCP myhost:48282->ssl-slb11a.sanger.ac.uk:www (CLOSE_WAIT)
> R 19974 myname 275u IPv4 138246 0t0 TCP myhost:48284->ssl-slb11a.sanger.ac.uk:www (CLOSE_WAIT)
> R 19974 myname 276u IPv4 138258 0t0 TCP myhost:48286->ssl-slb11a.sanger.ac.uk:www (ESTABLISHED)
> R 19974 myname 277u IPv4 138272 0t0 TCP myhost:48288->ssl-slb11a.sanger.ac.uk:www (ESTABLISHED)
> R 19974 myname 278u IPv4 138527 0t0 TCP myhost:48290->ssl-slb11a.sanger.ac.uk:www (ESTABLISHED)
> R 19974 myname 279u IPv4 138533 0t0 TCP myhost:45259->ssl-slb11b.sanger.ac.uk:www (ESTABLISHED)
> R 19974 myname 280u IPv4 138545 0t0 TCP myhost:45261->ssl-slb11b.sanger.ac.uk:www (ESTABLISHED)
> R 19974 myname 281u IPv4 138557 0t0 TCP myhost:45263->ssl-slb11b.sanger.ac.uk:www (ESTABLISHED)
>
>
> The final error message that I'll obtain after 1000 open connections is:
> [STDERR] Error in value[[3L]](cond) :
> [STDERR] Request to BioMart web service failed. Verify if you are still connected to the internet. Alternatively the BioMart web service is temporarily down.
> [STDERR] Calls: main ... tryCatch -> tryCatchList -> tryCatchOne -> <Anonymous>
> [STDERR] Error during wrapup: cannot open the connection
> [STDERR] Execution halted
>
> My R version is 2.11.1 (2010-05-31).
>
> Best regards,
> Marko Laakso
>