Re: [R] RCurl unable to download a particular web page -- what is so special about this web page?

2009-01-27 Thread clair.crossup...@googlemail.com
Thank you Duncan. I remember seeing in your documentation that you have used this 'verbose=TRUE' argument in functions before when trying to see what is going on. This is good. However, I have not been able to get it to work for me. Does the output appear in R or do you use some other external

Re: [R] RCurl unable to download a particular web page -- what is so special about this web page?

2009-01-27 Thread Duncan Temple Lang
clair.crossup...@googlemail.com wrote: Thank you Duncan. I remember seeing in your documentation that you have used this 'verbose=TRUE' argument in functions before when trying to see what is going on. This is good. However, I have not been able to get it to work for me. Does the output

Re: [R] RCurl unable to download a particular web page -- what is so special about this web page?

2009-01-27 Thread clair.crossup...@googlemail.com
Thank you. The output I get from that example is below: > d = debugGatherer() > getURL("http://uk.youtube.com", + debugfunction = d$update, verbose = TRUE ) [1] "" > d$value() text "About to connect() to uk.youtube.com port 80 (#0)\n Trying 208.117.236.72... connected\nConnected to

Re: [R] RCurl unable to download a particular web page -- what is so special about this web page?

2009-01-27 Thread Duncan Temple Lang
Some Web servers are strict. In this case, it won't accept a request without being told who is asking, i.e. the User-Agent. If you use getURL("http://www.youtube.com", httpheader = c("User-Agent" = "R (2.9.0)")) you should get the contents of the page as expected. (Or with URL
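
The suggestion above can be sketched as a small wrapper. This is only an illustration, not part of the original message: the helper names r_user_agent and fetch_page are hypothetical, and whether a given server still insists on a User-Agent header is an assumption that may no longer hold. Only getURL and its httpheader option come from the thread.

```r
# Hypothetical helper: build a User-Agent string such as "R (2.9.0)"
# from the version of R that is currently running.
r_user_agent <- function() {
  sprintf("R (%s)", paste(R.version$major, R.version$minor, sep = "."))
}

# Hypothetical helper: fetch a page with RCurl, sending a User-Agent
# header so that strict servers know who is asking.
# Requires the RCurl package to be installed when the function is called.
fetch_page <- function(url, agent = r_user_agent()) {
  RCurl::getURL(url, httpheader = c("User-Agent" = agent))
}

# Usage (needs network access, so it is left commented out):
# page <- fetch_page("http://www.youtube.com")
# nchar(page)
```

Passing the header through httpheader mirrors the call in the message; the default agent string simply reports the running R version instead of a hard-coded one.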

Re: [R] RCurl unable to download a particular web page -- what is so special about this web page?

2009-01-27 Thread clair.crossup...@googlemail.com
Oops, I meant: toString(readLines("http://uk.youtube.com")) > toString(readLines("http://uk.youtube.com")) [1] "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\" \"http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd\">, , , \t<html lang=\"en\">, , <!-- machid: 302 -->, <head>, , \t,

Re: [R] RCurl unable to download a particular web page -- what is so special about this web page?

2009-01-27 Thread clair.crossup...@googlemail.com
Cheers Duncan, that worked great: > getURL("http://uk.youtube.com", httpheader = c("User-Agent" = "R (2.8.1)")) [1] "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\" \"http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd\">\n\n\ [etc] May I ask if there was a specific manual you read to

Re: [R] RCurl unable to download a particular web page -- what is so special about this web page?

2009-01-27 Thread Duncan Temple Lang
clair.crossup...@googlemail.com wrote: Cheers Duncan, that worked great: > getURL("http://uk.youtube.com", httpheader = c("User-Agent" = "R (2.8.1)")) [1] "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\" \"http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd\">\n\n\ [etc] May I ask

Re: [R] RCurl unable to download a particular web page -- what is so special about this web page?

2009-01-26 Thread Tony Breyal
Hi, I ran your getURL example and had the same problem with downloading the file. ## R Start.. library(RCurl) toString(getURL("http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html?_r=2")) [1] "" ## R end. However, it is interesting that if you manually save the page

Re: [R] RCurl unable to download a particular web page -- what is so special about this web page?

2009-01-26 Thread Duncan Temple Lang
clair.crossup...@googlemail.com wrote: Dear R-help, There seems to be a web page I am unable to download using RCurl. I don't understand why it won't download: library(RCurl) my.url <- "http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html?_r=2" getURL(my.url) [1]

Re: [R] RCurl unable to download a particular web page -- what is so special about this web page?

2009-01-26 Thread Jeffrey Horner
Duncan Temple Lang wrote: clair.crossup...@googlemail.com wrote: Dear R-help, There seems to be a web page I am unable to download using RCurl. I don't understand why it won't download: library(RCurl) my.url <-