[R] download.file
Dear List-members,

to download a file from the net, the function download.file(..) does the
job. However, before embarking on the download, I would like to find out
how large the file is. Is there a way to know it?

Most likely, this question has been asked before, but I am new to the list.

Regards, with thanks in advance,
Adelchi Azzalini

Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
http://azzalini.stat.unipd.it/

__
[EMAIL PROTECTED] mailing list
http://www.stat.math.ethz.ch/mailman/listinfo/r-help
Re: [R] download.file
Essentially no. Most servers will give you the length if you start the
download, and then R prints it out, but it may be unknown. As in

    update.packages()
    trying URL `http://cran.r-project.org/src/contrib/PACKAGES'
    Content type `text/plain; charset=iso-8859-1' length 95407 bytes
    opened URL
    .. .. .. .. .. .. .. .. .. ...
    downloaded 93Kb

and you can (probably) interrupt during those dots.

On Tue, 4 Feb 2003, Adelchi Azzalini wrote:

> to download a file from the net, the function download.file(..) does the
> job. However, before embarking on the download, I would like to find out
> how large the file is. Is there a way to know it?
>
> Most likely, this question has been asked before, but I am new to the list.

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
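The "start the download and interrupt if it looks too big" idea can also be automated. What follows is only a rough sketch, not something download.file() offers: download_capped, max_bytes and chunk are hypothetical names made up for illustration. It reads the URL in chunks through a base R url() connection and stops once more than max_bytes have arrived.

```r
# Hypothetical helper (not part of R): download in chunks and abort
# as soon as more than max_bytes have been received.
download_capped <- function(src_url, destfile, max_bytes = 1e6, chunk = 65536L) {
    con <- url(src_url, open = "rb")        # binary read connection
    on.exit(close(con), add = TRUE)
    out <- file(destfile, open = "wb")      # binary write connection
    on.exit(close(out), add = TRUE)
    total <- 0
    repeat {
        buf <- readBin(con, what = "raw", n = chunk)
        if (length(buf) == 0) break         # end of stream
        total <- total + length(buf)
        if (total > max_bytes)
            stop("download exceeds max_bytes; aborted")
        writeBin(buf, out)
    }
    invisible(total)                        # number of bytes written
}

# Usage (needs network access, so left commented out):
# download_capped("http://cran.r-project.org/src/contrib/PACKAGES",
#                 tempfile(), max_bytes = 200000)
```

Note that an aborted download leaves a partial file at destfile, which the caller would have to remove.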
Re: [R] download.file
> to download a file from the net, the function download.file(..) does the
> job. However, before embarking on the download, I would like to find out
> how large the file is. Is there a way to know it?

You can send web servers a 'HEAD' request, which can give you some basic
information about the download. I can't see a way to get this from the
current R functions, so here's a little routine to leverage the 'lynx'
web browser:

    head.download <- function (url)
    {
        # check that lynx is available on the PATH
        if (system("lynx -help > /dev/null") == 0) {
            method <- "lynx"
        } else {
            stop("No lynx found")
        }
        if (method == "lynx") {
            # ask lynx for the response headers only
            heads <- system(paste("lynx -head -dump '", url, "'", sep = ""),
                            intern = TRUE)
        }
        # turn name: value lines into named list. prob vectorisable
        ret <- list(status = heads[1])
        for (l in seq_along(heads)[-1]) {
            col <- regexpr(":", heads[l])
            if (col > -1) {
                name <- substr(heads[l], 1, (col - 1))
                value <- substr(heads[l], (col + 1), nchar(heads[l]))
                ret[[name]] <- value
            } else {
                ret <- c(ret, heads[l])
            }
        }
        ret
    }

This borrows bits from download.file(), but it does depend on you having
lynx installed. The return value is a list with names corresponding to
the header titles and values being the values. It looks for a ':' as the
title: value separator, and anything that doesn't have a ':' is just
added verbatim, unnamed.

For example, how big is the R logo on the home page?

    head.download("http://www.r-project.org/Rlogo.jpg")$"Content-Length"
    [1] " 8793"

That's bytes. Yes, I know it's character!

I don't think web servers are under any obligation to provide accurate
Content-Length values. Many dynamic web servers have pages that change
length every time.

This will also not work for ftp:// URLs or local file:// URLs (or
gopher:// URLs?).

Perhaps HEAD-getting functionality can be put in the next release of R?
It would probably have a better name: value -> named list routine than
the one I just hacked up in two minutes above. Oops. Shame.

Baz
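The "name: value" to named list step above is indeed vectorisable. Here is a minimal sketch of one way to do it, assuming the header lines are already in a character vector; parse_headers is a hypothetical name, not an existing R function, and the sample headers are made up for illustration.

```r
# Hypothetical helper (not part of R): turn raw HTTP header lines
# into a named list without an explicit loop.
parse_headers <- function(lines) {
    lines <- sub("[\r\n]+$", "", lines)     # drop trailing CR/LF
    lines <- lines[nzchar(lines)]           # drop empty lines
    status <- lines[1]                      # e.g. "HTTP/1.1 200 OK"
    rest <- lines[-1]
    pos <- regexpr(":", rest, fixed = TRUE) # first ':' in each line
    has_colon <- pos > 0
    values <- trimws(substring(rest[has_colon], pos[has_colon] + 1))
    names(values) <- substring(rest[has_colon], 1, pos[has_colon] - 1)
    # lines without a ':' are appended verbatim, unnamed
    c(list(status = status), as.list(values), as.list(rest[!has_colon]))
}

# Made-up sample headers, so that no network access is needed:
sample_heads <- c("HTTP/1.1 200 OK",
                  "Content-Type: image/jpeg",
                  "Content-Length: 8793",
                  "")
h <- parse_headers(sample_heads)
h[["Content-Length"]]
```

Unlike the loop version, this trims the leading space from each value. Later releases of R (3.2.0 onwards) include curlGetHeaders() in base, which returns the header lines as a character vector when R is built with libcurl support, so something like parse_headers(curlGetHeaders(url)) should work without lynx. Header names may come back in a different case depending on the server.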
Re: [R] download.file
On Tue, 4 Feb 2003, Barry Rowlingson wrote:

> That's bytes. Yes, I know it's character!
>
> I don't think web servers are under any obligation to provide accurate
> Content-Length values. Many dynamic web servers have pages that change
> length every time.
>
> This will also not work for ftp:// URLs or local file:// URLs (or
> gopher:// URLs?).

The HTTP protocol says that a content length SHOULD be provided and MUST
be accurate if it is provided.

-thomas
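Since the length is only a SHOULD, any code relying on it has to cope with the header being absent. A small sketch under that assumption follows; content_length is a hypothetical helper, not an R function. It returns the size in bytes as a number, or NA when no Content-Length header is present.

```r
# Hypothetical helper (not part of R): extract Content-Length in bytes
# from raw header lines; NA if the server did not send one.
content_length <- function(lines) {
    hit <- grep("^content-length:", lines, ignore.case = TRUE, value = TRUE)
    if (length(hit) == 0) return(NA_real_)
    # with redirects there can be several responses; use the last one
    last <- hit[length(hit)]
    as.numeric(trimws(sub("^[^:]*:", "", last)))
}

content_length(c("HTTP/1.1 200 OK", "Content-Length: 8793"))
content_length(c("HTTP/1.1 200 OK", "Transfer-Encoding: chunked"))
```

The first call gives 8793 and the second NA. After a download, the reported value could be compared with file.size(destfile) to see whether the server kept to the MUST.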
Re: [R] download.file
On Tue, 4 Feb 2003, Thomas Lumley wrote:

> On Tue, 4 Feb 2003, Barry Rowlingson wrote:
>
> > That's bytes. Yes, I know it's character!
> >
> > I don't think web servers are under any obligation to provide accurate
> > Content-Length values. Many dynamic web servers have pages that change
> > length every time.
> >
> > This will also not work for ftp:// URLs or local file:// URLs (or
> > gopher:// URLs?).
>
> The HTTP protocol says that a content length SHOULD be provided and MUST
> be accurate if it is provided.

Most proxies of my acquaintance will report unknown unless they are asked
to actually get the file or have it already cached. Further, the IE
internals used under Windows with --internet2 usually seem to get the
wrong length (far too short) when talking to a proxy.

Why is this of interest? There are lots of internet download tools
available apart from R.

--
Brian D. Ripley