Hi Marko,

Thank you for reporting this issue. It is definitely something that has to
be fixed as soon as possible.
That said, biomaRt is best used for batch queries rather than for queries
issued inside a loop.
Is there any chance you can query for all the annotations you need at once and
then loop in R over the result?
That way you will need only a few biomaRt queries and will avoid the TCP
connection leak.

Let me know if you need help converting your 1000+ queries into one batch
query.
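
A minimal sketch of what such a batch query could look like, assuming your
filter accepts a vector of values and that the mart offers an attribute that
echoes the filter value back in the result. The names fetch_batch and
id_attribute are hypothetical and not part of biomaRt; only getBM() is.

```r
# Hypothetical helper: issue ONE getBM() call for all IDs, then split locally.
# `id_attribute` is assumed to be an attribute that returns the value the
# filter matched on; without it, rows cannot be traced back to their ID.
# `query` defaults to biomaRt::getBM and is only a parameter so that the
# helper can be tried out with a stub instead of a live mart.
fetch_batch <- function(ids, attributes, filter_type, id_attribute, mart,
                        query = biomaRt::getBM) {
  result <- query(attributes = unique(c(id_attribute, attributes)),
                  filters    = filter_type,
                  values     = unique(ids),
                  mart       = mart)
  # One data frame per ID. IDs without any hit are absent from the list.
  split(result, result[[id_attribute]])
}

# Usage (not run here; needs a live mart and real attribute/filter names):
# per_id <- fetch_batch(all.filter.values, attributes, filter.types,
#                       id_attribute, mart)
# for (id in names(per_id)) {
#   annotations <- per_id[[id]]   # same shape as one getBM() result
# }
```

With this shape, the number of HTTP requests no longer depends on the number
of IDs. If a single request gets too large for the server, the IDs could be
cut into a handful of chunks, which still stays far below 1000 connections.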

Cheers,
Steffen

On Tue, Sep 28, 2010 at 11:59 PM, mxlaakso <mxlaa...@cs.helsinki.fi> wrote:

> Hello,
>
> I have been using biomaRt package from Bioconductor to fetch some
> biological annotations. What I have noticed this week is that getBM() calls
> leak TCP connections (probably via Curl). I have a loop that makes calls
> such as:
>
> annotations <- getBM(attributes=attributes,
>                     filter    =filter.types,
>                     values    =filter.value,
>                     mart      =mart)
>
> I can see each request creating a new open connection when I execute this
> loop and monitor the open connections using 'lsof' program. The whole loop
> crashes after 1000 iterations because that exceeds the limit of allowed
> parallel connections. Loops with less than 1000 iterations are completed
> with correct results although the connections are left open.
>
> I have also tried to use the curl parameter: I first call
> curlHandle <- getCurlHandle()
> and then pass this handle to the getBM() call, but that does not change
> anything. Should I apply some kind of close call after each iteration?
>
> Package:        biomaRt
> Version:        2.4.0
> Packaged:       2010-04-22 22:52:44 UTC; biocbuild
> Built:          R 2.11.0; ; 2010-04-27 12:27:46 UTC; unix
>
>
> Package:              RCurl
> Version:              1.4-3
> Date/Publication:     2010-07-25 12:15:39
> Built:                R 2.11.1; x86_64-pc-linux-gnu; 2010-09-23
>                      10:54:07 UTC; unix
>
>
>
> Example output for the COSMIC Biomart:
>  MART_NAME = "CosmicMart"
>  MART_HOST = "www.sanger.ac.uk"
>  MART_PATH = "/genetics/CGP/cosmic/biomart/martservice"
>  MART_DSET = "COSMIC48"
>
> $ lsof | grep sanger.ac
> .
> .
> .
> R         19974       myname  259u     IPv4             137937      0t0
>   TCP myhost:48226->ssl-slb11a.sanger.ac.uk:www (CLOSE_WAIT)
> R         19974       myname  260u     IPv4             137971      0t0
>   TCP myhost:48228->ssl-slb11a.sanger.ac.uk:www (CLOSE_WAIT)
> R         19974       myname  261u     IPv4             137984      0t0
>   TCP myhost:48230->ssl-slb11a.sanger.ac.uk:www (CLOSE_WAIT)
> R         19974       myname  262u     IPv4             138004      0t0
>   TCP myhost:48233->ssl-slb11a.sanger.ac.uk:www (CLOSE_WAIT)
> R         19974       myname  263u     IPv4             138016      0t0
>   TCP myhost:48235->ssl-slb11a.sanger.ac.uk:www (CLOSE_WAIT)
> R         19974       myname  264u     IPv4             138032      0t0
>   TCP myhost:48239->ssl-slb11a.sanger.ac.uk:www (CLOSE_WAIT)
> R         19974       myname  265u     IPv4             138077      0t0
>   TCP myhost:45214->ssl-slb11b.sanger.ac.uk:www (CLOSE_WAIT)
> R         19974       myname  266u     IPv4             138102      0t0
>   TCP myhost:45228->ssl-slb11b.sanger.ac.uk:www (CLOSE_WAIT)
> R         19974       myname  267u     IPv4             138116      0t0
>   TCP myhost:45230->ssl-slb11b.sanger.ac.uk:www (CLOSE_WAIT)
> R         19974       myname  268u     IPv4             138123      0t0
>   TCP myhost:48263->ssl-slb11a.sanger.ac.uk:www (CLOSE_WAIT)
> R         19974       myname  269u     IPv4             138135      0t0
>   TCP myhost:48265->ssl-slb11a.sanger.ac.uk:www (CLOSE_WAIT)
> R         19974       myname  270u     IPv4             138147      0t0
>   TCP myhost:48267->ssl-slb11a.sanger.ac.uk:www (CLOSE_WAIT)
> R         19974       myname  271u     IPv4             138185      0t0
>   TCP myhost:48272->ssl-slb11a.sanger.ac.uk:www (CLOSE_WAIT)
> R         19974       myname  272u     IPv4             138198      0t0
>   TCP myhost:48274->ssl-slb11a.sanger.ac.uk:www (CLOSE_WAIT)
> R         19974       myname  273u     IPv4             138210      0t0
>   TCP myhost:48276->ssl-slb11a.sanger.ac.uk:www (CLOSE_WAIT)
> R         19974       myname  274u     IPv4             138226      0t0
>   TCP myhost:48282->ssl-slb11a.sanger.ac.uk:www (CLOSE_WAIT)
> R         19974       myname  275u     IPv4             138246      0t0
>   TCP myhost:48284->ssl-slb11a.sanger.ac.uk:www (CLOSE_WAIT)
> R         19974       myname  276u     IPv4             138258      0t0
>   TCP myhost:48286->ssl-slb11a.sanger.ac.uk:www (ESTABLISHED)
> R         19974       myname  277u     IPv4             138272      0t0
>   TCP myhost:48288->ssl-slb11a.sanger.ac.uk:www (ESTABLISHED)
> R         19974       myname  278u     IPv4             138527      0t0
>   TCP myhost:48290->ssl-slb11a.sanger.ac.uk:www (ESTABLISHED)
> R         19974       myname  279u     IPv4             138533      0t0
>   TCP myhost:45259->ssl-slb11b.sanger.ac.uk:www (ESTABLISHED)
> R         19974       myname  280u     IPv4             138545      0t0
>   TCP myhost:45261->ssl-slb11b.sanger.ac.uk:www (ESTABLISHED)
> R         19974       myname  281u     IPv4             138557      0t0
>   TCP myhost:45263->ssl-slb11b.sanger.ac.uk:www (ESTABLISHED)
>
>
> The final error message that I get after 1000 open connections is:
> [STDERR] Error in value[[3L]](cond) :
> [STDERR]   Request to BioMart web service failed. Verify if you are still
> connected to the internet.  Alternatively the BioMart web service is
> temporarily down.
> [STDERR] Calls: main ... tryCatch -> tryCatchList -> tryCatchOne ->
> <Anonymous>
> [STDERR] Error during wrapup: cannot open the connection
> [STDERR] Execution halted
>
> My R version is 2.11.1 (2010-05-31).
>
> Best regards,
> Marko Laakso
>
