Perhaps more fun is

> library(XML)
> res = htmlTreeParse("http://www.omegahat.org/RSXML/", useInternalNodes=TRUE)
> xpathApply(res, "//h1", xmlValue)
[[1]]
[1] "An XML package for the S language"

Martin

Quoting Steven McKinney <[EMAIL PROTECTED]>:

>
> >-----Original Message-----
> >From: [EMAIL PROTECTED] on behalf of Am Stat
> >Sent: Wed 8/1/2007 2:19 PM
> >To: r-help@stat.math.ethz.ch
> >Subject: [R] Extracting a website text content using R
> >
> >Dear useR,
> >
> >Just wondering whether there is any function in R that could
> >let me get the text contents of a certain website.
> >
> >Thanks a lot!
> >
> >Best,
> >
> >Leon
>
>
> Is this what you had in mind?
>
> > foo <- scan(url("http://cran.r-project.org/"), what = "character")
> Read 69 items
> > paste(unlist(foo), collapse = " ")
> [1] "<!DOCTYPE HTML PUBLIC -//IETF//DTD HTML//EN
> <html> <head> <title>The
> Comprehensive R Archive Network</title> <link rel=\"icon\"
> href=\"favicon.ico\" type=\"image/x-icon\"> <link rel=\"shortcut icon\"
> href=\"favicon.ico\" type=\"image/x-icon\"> <link rel=\"stylesheet\"
> type=\"text/css\" href=\"R.css\"> </head> <FRAMESET cols=\"1*, 4*\" border=0>
> <FRAMESET rows=\"120, 1*\"> <FRAME src=\"logo.html\" name=\"logo\"
> frameborder=0> <FRAME src=\"navbar.html\" name=\"contents\" frameborder=0>
> </FRAMESET> <FRAME src=\"banner.shtml\" name=\"banner\" frameborder=0>
> <noframes> <h1>The Comprehensive R Archive Network</h1> Your browser seems
> not to support frames, here is the <A href=\"navbar.html\">contents page</A>
> of CRAN. </noframes> </FRAMESET>"
>
>
> Try the search phrase
>
> cran scan url
>
> in Google for more hits on
> info about R functions that
> can deal with URLs.
>
> In R try
>
> > apropos("URL")
> [1] "contourLines" "URLdecode" "URLencode" "browseURL" "contrib.url" "main.help.url" "url.show"
> [8] "loadURL" "read.table.url" "scan.url" "source.url" "url"
>
>
> SteveM
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
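
The scan() result quoted above still contains all of the HTML markup, whereas the original question asked for the text content. A minimal base-R sketch of one way to strip the tags follows. The names html_to_text and page are hypothetical and do not come from the thread. The regular expression is a crude approximation that will mishandle scripts, comments and entities, so a real parser such as the XML package shown at the top of this message is probably the more robust choice.

```r
# Hypothetical helper (not from the thread): crude tag stripping with base R.
# Assumes the input is a character vector of HTML lines.
html_to_text <- function(html_lines) {
  html <- paste(html_lines, collapse = " ")   # join lines into one string
  text <- gsub("<[^>]+>", " ", html)          # replace each tag with a space
  text <- gsub("[[:space:]]+", " ", text)     # collapse runs of whitespace
  trimws(text)                                # drop leading/trailing blanks
}

# Small inline page so the example runs without network access.
page <- c("<html><head><title>Demo</title></head>",
          "<body><h1>Hello</h1><p>Some <b>text</b>.</p></body></html>")
print(html_to_text(page))
```

For a live page, something like html_to_text(readLines("http://cran.r-project.org/", warn = FALSE)) should work the same way, assuming the URL is reachable from your session.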