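Not sure if my code was attached in that last post:

library(RCurl)
library(XML)

# Fetch the page as a character string
html <- getURL("http://www.omegahat.org/RSXML/index.html")

# Parse with the HTML parser; the empty error handler suppresses
# the parser's messages about malformed markup
html.tree <- htmlTreeParse(html, useInternalNodes = TRUE, error = function(...){})
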
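The errors quoted below come from running a strict XML parser over a page that is HTML rather than well-formed XML (unclosed dt, dd and li tags). The HTML parser tolerates that. Here is a minimal sketch of the difference that needs no network access. The HTML fragment and the names `page`, `doc`, `links` and `items` are made up for illustration; they are not taken from the Omegahat page.

```r
library(XML)

# Hypothetical, deliberately sloppy HTML: <dt>, <dd> and <li> are never closed
page <- "<html><body><dl><dt>Docs<dd><a href='shortIntro.html'>Short intro</a></dl><ul><li>one<li>two</ul></body></html>"

# The HTML parser repairs the missing close tags instead of stopping
doc <- htmlTreeParse(page, asText = TRUE, useInternalNodes = TRUE)

# With an internal document, XPath queries can be run against the tree
links <- xpathSApply(doc, "//a", xmlGetAttr, "href")
items <- xpathSApply(doc, "//li", xmlValue)

print(links)
print(items)
```

Once the tree is parsed with useInternalNodes = TRUE, the same xpathSApply() calls should work on html.tree from the code above.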
On 25 Nov, 16:21, Peng Yu <pengyu...@gmail.com> wrote:
> On Wed, Nov 25, 2009 at 12:19 AM, cls59 <ch...@sharpsteen.net> wrote:
>
> > Peng Yu wrote:
>
> >> I'm interested in parsing an html page. I should use XML, right? Could
> >> you somebody show me some example code? Is there a tutorial for this
> >> package?
>
> > Did you try looking through the help pages for the XML package or browsing
> > the Omegahat website?
>
> > Look at:
>
> >  library(XML)
> >  ?htmlTreeParse
>
> > And the relevant web page for documentation and examples is:
>
> >  http://www.omegahat.org/RSXML/
> >  http://www.omegahat.org/RSXML/shortIntro.html
>
> I'm trying the example on the above webpage. But I'm not sure why I
> got the following error. Would you help to take a look?
>
> $ Rscript main.R
> > library(XML)
> >
> > download.file('http://www.omegahat.org/RSXML/index.html','index.html')
> trying URL 'http://www.omegahat.org/RSXML/index.html'
> Content type 'text/html; charset=ISO-8859-1' length 3021 bytes
> opened URL
> ==================================================
> downloaded 3021 bytes
>
> >
> > doc = xmlInternalTreeParse("index.html")
> Opening and ending tag mismatch: dd line 68 and dl
> Opening and ending tag mismatch: li line 67 and body
> Opening and ending tag mismatch: dt line 66 and html
> Premature end of data in tag dd line 64
> Premature end of data in tag li line 63
> Premature end of data in tag dt line 62
> Premature end of data in tag dl line 61
> Premature end of data in tag body line 5
> Premature end of data in tag html line 1
> Error: 1: Opening and ending tag mismatch: dd line 68 and dl
> 2: Opening and ending tag mismatch: li line 67 and body
> 3: Opening and ending tag mismatch: dt line 66 and html
> 4: Premature end of data in tag dd line 64
> 5: Premature end of data in tag li line 63
> 6: Premature end of data in tag dt line 62
> 7: Premature end of data in tag dl line 61
> 8: Premature end of data in tag body line 5
> 9: Premature end of data in tag html line 1
> Execution halted
>
> ______________________________________________
> r-h...@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.