[R] Scraping a web page

Michael Conklin Thu, 03 Dec 2009 14:31:43 -0800

I would like to submit a list of URLs for various web pages and extract the 
"content" of those pages, i.e. the text rather than the mark-up. I can find 
plenty of examples in the XML library of extracting links from pages, but I 
cannot seem to find a way to extract the text. I will not know the structure 
of the pages behind those URLs in advance. Any help, or suggestions on where 
to look, would be greatly appreciated.


Mike

W. Michael Conklin
Chief Methodologist

MarketTools, Inc. | www.markettools.com
6465 Wayzata Blvd | Suite 170 | St. Louis Park, MN 55426
PHONE: 952.417.4719 | CELL: 612.201.8978



______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
