There may also be a difference in reliability, which would not so easily be measured by an individual user. I've selected the closest geographically until it seemed to be down, then tried the second closest, etc. This could be automated centrally, but then you'd have to deal with the human factor of how to turn the data into commentary to the people who volunteer to provide hardware and support without offending them.

     Spencer


Martin Maechler wrote:
Barry Rowlingson <b.rowling...@lancaster.ac.uk>
    on Thu, 30 Jul 2009 09:59:47 +0100 writes:

    > 2009/7/30 Uwe Ligges <lig...@statistik.tu-dortmund.de>:
    >> Hard to tell, you have to try it out, I fear.
    >> The speed you see highly depends on the connection from your country to
    >> others, but of course, there are also some mirrors that are not the
    >> fastest themselves.

    > I figured you could write a function that got the CRAN mirror list and
    > tested their response. Here's my 'cranometer':

    > cranometer <- function(ms = getCRANmirrors(all = FALSE, local.only = FALSE)){

    > dest = tempfile()

    > nms = dim(ms)[1]
    > ms$t = rep(NA,nms)
    > for(i in 1:nms){
    > m = ms[i,]
    > url = paste(m$URL,"/src/base/NEWS",sep="")
    > t = try(system.time(download.file(url,dest),gcFirst=TRUE))
    > if(file.exists(dest)){
    > file.remove(dest)
    > ms$t[i]=t['elapsed']
    > }else{
    > ms$t[i]=NA
    > }
    > }
    > return(ms)
    > }

    > It works by downloading the latest NEWS file (376Kbytes at the
    > moment, so not huge) from each of the mirror sites in the CRAN mirrors
    > list. If you want to test it on a subset then call getCRANmirrors
    > yourself and subset it somehow.

    > I'm running it now on the full CRAN list and I've yet to find a
    > timeout or error so I'm not sure what will happen if download.file
    > fails. It returns a data frame like you get from getCRANmirrors but
    > with an extra 't' column giving the elapsed time to get the NEWS file.

    > CAVEATS: if your network has any local caching then these results
    > will be wrong, since your computer will probably be getting the
    > locally cached NEWS file and not the one on the server. Especially if
    > you run it twice. Oh, I should have put cacheOK=FALSE in the
    > download.file - but even that might get overruled somewhere. Also,
    > sites may have good days and bad days, good minutes and bad minutes,
    > your network may be congested on a short-term basis, etc etc.
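The two open points in the caveats above (what happens when download.file fails, and the missing cacheOK=FALSE) could be handled roughly as follows. This is only a sketch, not the original cranometer: the name cranometer2, the `path` argument and the `fetch` hook are assumptions introduced here, the hook existing only so the loop can be tried without touching the network.

```r
# Time one download per mirror; a failed download yields NA instead of
# stopping the loop. `fetch` is a hypothetical hook, not part of the
# original function.
cranometer2 <- function(ms = getCRANmirrors(all = FALSE, local.only = FALSE),
                        path = "/src/base/NEWS",
                        fetch = function(url, dest)
                          download.file(url, dest, quiet = TRUE,
                                        cacheOK = FALSE)) {
  ms$t <- rep(NA_real_, nrow(ms))
  for (i in seq_len(nrow(ms))) {
    dest <- tempfile()
    # Drop any trailing slash so the joined URL has exactly one separator.
    url <- paste0(sub("/+$", "", ms$URL[i]), path)
    elapsed <- tryCatch(
      system.time(fetch(url, dest), gcFirst = TRUE)[["elapsed"]],
      error = function(e) NA_real_
    )
    # Record the timing only if a non-empty file actually arrived.
    if (file.exists(dest)) {
      if (!is.na(elapsed) && file.size(dest) > 0) ms$t[i] <- elapsed
      file.remove(dest)
    }
  }
  ms
}
```

How long a dead mirror blocks the loop is governed by options(timeout = ...), and as noted above, cacheOK = FALSE may still be overruled by a proxy along the way.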

    > Other ideas: how about combining the CRAN list with my geonames
    > package to work out distances from where you are to the CRAN site? I
    > might write that later if I get a minute...

Yes!  And visualize the corresponding "nearest neighbourhood"
for each CRAN mirror on a world map,
make this dynamically refresh every few minutes, and put it on a
webserver so people can watch the "CRAN world" in real time!
More seriously, it would be really cool if a "robust" version of
cranometer() could be used automagically in the (typical /
default) case of install.packages() {and its call from the
Windows (or also Mac?) 'Packages' menu} when the user / site
have no CRAN repository specified:
It would choose the CRAN mirror which is closest,
or even better (and more appropriate for a statistics software),
would choose one at random, but with probability inversely
proportional to (a power of ?) the "distance".
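
The random choice described above might be sketched like this. The function name pick_mirror, the `power` default and the treatment of NA as "unreachable" are all assumptions for illustration; `d` can be any positive "distance", for example the elapsed seconds from cranometer().

```r
# Pick one mirror at random with probability proportional to 1 / d^power.
# Mirrors with a missing or non-positive distance are never chosen.
pick_mirror <- function(urls, d, power = 1) {
  ok <- !is.na(d) & d > 0
  if (!any(ok)) stop("no reachable mirror")
  idx <- which(ok)
  w <- 1 / d[idx]^power
  # sample.int() on positions avoids sample()'s special case for a
  # single numeric candidate.
  chosen <- idx[sample.int(length(idx), 1, prob = w)]
  urls[chosen]
}
```

With power = 0 every reachable mirror is equally likely; larger powers concentrate the load on the nearest ones.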

... yes, we should defer this  from R-help to  R-devel ..

Martin

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


