Re: [R] scraping with session cookies

2012-09-23 Thread Heramb Gadgil
This may be because the connection to the site from R is taking a long time. I faced the same problem with the site Social-Mention. I tried a very primitive approach: I put an 'if' condition in the loop. if (length(output) == 0) { getURL(site) } else { continue with the code } It might help you. Best,
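The 'if' check described above can be generalized into a small retry helper. This is only a sketch, not code from the thread: the name fetch_with_retry and the max_tries and wait parameters are hypothetical, and it assumes RCurl::getURL signals an R error when a request times out.

```r
# Sketch only: fetch_with_retry, max_tries and wait are hypothetical names.
# The fetcher is a parameter so the retry logic can be tried without a network.
fetch_with_retry <- function(url, fetch = RCurl::getURL, max_tries = 3, wait = 2) {
  for (i in seq_len(max_tries)) {
    # Treat an error (e.g. a timeout) the same as an empty result.
    output <- tryCatch(fetch(url), error = function(e) character(0))
    if (length(output) > 0 && any(nzchar(output))) {
      return(output)
    }
    # Pause before the next attempt, but not after the last one.
    if (i < max_tries) Sys.sleep(wait)
  }
  character(0)  # every attempt failed or came back empty
}
```

A caller inside the loop can then test length(result) == 0 once and skip or log that page, rather than repeating the request by hand.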

Re: [R] scraping with session cookies

2012-09-21 Thread CPV
Thanks for your suggestion. The issue was resolved by Duncan's recommendation. Now I am trying to obtain data from different pages of the same site in a loop; however, getURLContent keeps timing out. The odd part is that I can access the link through a browser with no issues at

Re: [R] scraping with session cookies

2012-09-19 Thread Duncan Temple Lang
Hi, The key is that you want to use the same curl handle for both the postForm() and for getting the data document. site = u = "http://www.wateroffice.ec.gc.ca/graph/graph_e.html?mode=text&stn=05ND012&prm1=3&syr=2012&smo=09&sday=15&eyr=2012&emo=09&eday=18"; library(RCurl) curl = getCurlHandle(cookiefile
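Since the code in this message is cut off, here is a minimal sketch of the shared-handle idea rather than a reconstruction of the original. It assumes the disclaimer form field is disclaimer_action with the value "I Agree" (as quoted elsewhere in this thread), and that an empty cookiefile is enough to switch on libcurl's in-memory cookie handling; the function name fetch_after_disclaimer is hypothetical.

```r
# Sketch only: fetch_after_disclaimer is a hypothetical helper name.
# Assumes the RCurl package is installed and the site accepts this form field.
fetch_after_disclaimer <- function(site, data_url, agree = "I Agree") {
  # cookiefile = "" enables cookie handling on this handle without reading a file.
  curl <- RCurl::getCurlHandle(cookiefile = "")
  # Submit the disclaimer form; the session cookie is stored on the handle.
  RCurl::postForm(site, disclaimer_action = agree, curl = curl)
  # Reuse the same handle so the cookie is sent with the data request.
  RCurl::getURLContent(data_url, curl = curl)
}
```

With site = u as above, the call would be fetch_after_disclaimer(u, u). When looping over several pages, creating the handle once and passing it to every request keeps the session alive across them.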

Re: [R] scraping with session cookies

2012-09-19 Thread CPV
Thank you for your help, Duncan. I have been trying what you suggested; however, I am getting an error when trying to create the function: fun <- createFunction(forms[[1]]) It says: Error in isHidden | hasDefault : operations are possible only for numeric, logical or complex types On Wed, Sep 19, 2012

Re: [R] scraping with session cookies

2012-09-19 Thread Duncan Temple Lang
You don't need to use the getHTMLFormDescription() and createFunction(). Instead, you can use the postForm() call. However, getHTMLFormDescription(), etc. is more general. But you need the very latest version of the package to deal with degenerate forms that have no inputs (other than button

Re: [R] scraping with session cookies

2012-09-19 Thread CPV
Thanks again. I ran the script with postForm(site, disclaimer_action = "I Agree") and it does not seem to do anything; the webpage is still the disclaimer page, so I am getting the error below: Error in function (classes, fdef, mtable) : unable to find an inherited method for function

Re: [R] scraping with session cookies

2012-09-19 Thread Heramb Gadgil
Try this: library(RCurl) library(XML) site <- "http://www.wateroffice.ec.gc.ca/graph/graph_e.html?mode=text&stn=05ND012&prm1=3&syr=2012&smo=09&sday=15&eyr=2012&emo=09&eday=18" URL <- getURL(site) Text = htmlParse(URL, asText = TRUE) This will give you all the web data in an HTML text format. You can use getNodeSet
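The rest of this message is cut off, so the following is only a guess at how getNodeSet might be applied to the parsed page. The XPath "//table//td" and the sample HTML are made up for illustration and do not reflect the actual wateroffice page layout; extract_cells is a hypothetical name.

```r
# Sketch only: extract_cells and sample_html are hypothetical; the XPath is a guess.
# Assumes the XML package is installed.
extract_cells <- function(html, xpath = "//table//td") {
  doc <- XML::htmlParse(html, asText = TRUE)   # parse the HTML string
  nodes <- XML::getNodeSet(doc, xpath)         # select the matching nodes
  vapply(nodes, XML::xmlValue, character(1))   # pull out their text content
}

# Made-up input, so the function can be tried without a network connection.
sample_html <- "<html><body><table><tr><td>a</td><td>b</td></tr></table></body></html>"
```

Calling extract_cells(sample_html) returns the text of each table cell as a character vector; for the real page, the string returned by getURL would be passed in place of sample_html.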

[R] scraping with session cookies

2012-09-18 Thread CPV
Hi, I am starting to code in R, and one of the things that I want to do is scrape some data from the web. The problem that I am having is that I cannot get past the disclaimer page (which produces a session cookie). I have been able to collect some ideas and combine them in the code below but I