>-----Original Message-----
>From: [EMAIL PROTECTED] on behalf of Am Stat
>Sent: Wed 8/1/2007 2:19 PM
>To: r-help@stat.math.ethz.ch
>Subject: [R] Extracting a website text content using R
>Dear useR,
>Just wondering whether there is any function in R that could
>let me get the text contents of a certain website.
>Thanks a lot!
>Best,
>Leon

Is this what you had in mind?

> foo <- scan(url("http://cran.r-project.org/"), what = "character")
Read 69 items
> paste(unlist(foo), collapse = " ")
[1] "<!DOCTYPE HTML PUBLIC -//IETF//DTD HTML//EN > <html> <head> <title>The Comprehensive R Archive Network</title> <link rel=\"icon\" href=\"favicon.ico\" type=\"image/x-icon\"> <link rel=\"shortcut icon\" href=\"favicon.ico\" type=\"image/x-icon\"> <link rel=\"stylesheet\" type=\"text/css\" href=\"R.css\"> </head> <FRAMESET cols=\"1*, 4*\" border=0> <FRAMESET rows=\"120, 1*\"> <FRAME src=\"logo.html\" name=\"logo\" frameborder=0> <FRAME src=\"navbar.html\" name=\"contents\" frameborder=0> </FRAMESET> <FRAME src=\"banner.shtml\" name=\"banner\" frameborder=0> <noframes> <h1>The Comprehensive R Archive Network</h1> Your browser seems not to support frames, here is the <A href=\"navbar.html\">contents page</A> of CRAN. </noframes> </FRAMESET>"

Try searching Google for the phrase

cran scan url

for more hits on R functions that can deal with URLs. In R, try

> apropos("URL")
 [1] "contourLines"   "URLdecode"      "URLencode"      "browseURL"      "contrib.url"    "main.help.url"  "url.show"
 [8] "loadURL"        "read.table.url" "scan.url"       "source.url"     "url"

SteveM

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
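
The scan() call above returns the raw HTML, tags and all, whereas the original question asked for the text contents. One possible next step, offered only as a rough sketch, is to strip the markup with regular expressions in base R. The helper name html_to_text and the sample page below are made up for illustration and are not from this thread. Regular expressions are not a real HTML parser, and character entities such as &amp; are left undecoded.

```r
# Hypothetical helper (not from the thread): reduce raw HTML to rough plain text.
html_to_text <- function(html) {
  # Join the lines so that tags spanning several lines sit in one string.
  page <- paste(html, collapse = " ")
  # Drop <script> and <style> blocks together with their contents.
  page <- gsub("(?is)<(script|style)\\b.*?</\\1>", " ", page, perl = TRUE)
  # Drop every remaining tag, the DOCTYPE declaration included.
  page <- gsub("<[^>]*>", " ", page)
  # Collapse runs of whitespace and trim both ends.
  page <- gsub("[[:space:]]+", " ", page)
  trimws(page)
}

# Made-up sample page, so the example runs without network access.
sample_html <- c(
  "<html><head><title>The Comprehensive R Archive Network</title>",
  "<style>h1 { color: red; }</style></head>",
  "<body><h1>CRAN</h1> See the <A href=\"navbar.html\">contents page</A> of CRAN.</body></html>"
)
txt <- html_to_text(sample_html)
print(txt)
```

For a live page, the input could come from readLines(url("http://cran.r-project.org/"), warn = FALSE) in place of sample_html. That page is a frameset, so most of its visible text lives in the separate frame files it points to.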