Thanks for explaining this, Charlie. Just for completeness and to make things a little easier: the XML package has a function named readHTMLTable(). You can call it with a URL and it will attempt to read all the tables in the page.
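
As a minimal offline sketch of that kind of call, assuming the XML package is installed: the HTML string, its made-up values, and the variable names below are purely illustrative and are not taken from the CPI page. htmlParse() with asText = TRUE stands in here for fetching a URL.

```r
library(XML)

# Hypothetical stand-in for a downloaded page: two small tables, made-up values.
html <- '<html><body>
<table>
<tr><th>Item</th><th>Note</th></tr>
<tr><td>a</td><td>layout only</td></tr>
</table>
<div class="dynamicContent">
<table>
<tr><th>Year</th><th>Jan</th><th>Feb</th></tr>
<tr><td>2001</td><td>1.5</td><td>2.5</td></tr>
<tr><td>2002</td><td>3.5</td><td>4.5</td></tr>
</table>
</div>
</body></html>'

# Parse the text into an internal HTML document.
doc <- htmlParse(html, asText = TRUE)

# Without 'which', readHTMLTable() returns a list with one element per <table>.
tbls <- readHTMLTable(doc, stringsAsFactors = FALSE)
length(tbls)

# 'which' selects a single table by position and returns it as a data frame.
cpi <- readHTMLTable(doc, which = 2, stringsAsFactors = FALSE)
print(cpi)
```

Selecting by position is fragile if the page layout changes, so it is worth checking length(tbls) and the dimensions of the chosen table before relying on it.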

  tbls = readHTMLTable('http://www.rateinflation.com/consumer-price-index/usa-cpi.php')

yields a list with 10 elements, and the table of interest with the data is the 10th one:

  tbls[[10]]

The function does the XPath voodoo and sapply() work for you and uses some heuristics. There are various controls one can specify, and also various methods for working with sub-parts of the HTML document directly.

  D.

cls59 wrote:
>
>
> Bogaso wrote:
>> Hi all,
>>
>> I want to download data from those two different sources, directly into R:
>>
>> http://www.rateinflation.com/consumer-price-index/usa-cpi.php
>> http://eaindustry.nic.in/asp2/list_d.asp
>>
>> First one is CPI of US and 2nd one is WPI of India. Can anyone please give
>> any clue how to download them directly into R. I want to make them zoo
>> object for further analysis.
>>
>> Thanks,
>>
>
> The following site did not load for me:
>
> http://eaindustry.nic.in/asp2/list_d.asp
>
> But I was able to extract the table from the US CPI site using Duncan Temple
> Lang's XML package:
>
>   library(XML)
>
>
> First, download the website into R:
>
>   html.raw <- readLines(
>     'http://www.rateinflation.com/consumer-price-index/usa-cpi.php' )
>
> Then, convert to an HTML object using the XML package:
>
>   html.data <- htmlTreeParse( html.raw, asText = T, useInternalNodes = T )
>
> A quick scan of the page source in the browser reveals that the table you
> want is encased in a div with a class of "dynamicContent"-- we will use an
> XPath specification[1] to retrieve all rows in that table:
>
>   table.html <- getNodeSet( html.data,
>     '//d...@class="dynamicContent"]/table/tr' )
>
> Now, the data values can be extracted from the cells in the rows using a
> little sapply and xpathSApply voodoo:
>
>   table.data <- t( sapply( table.html, function( row ){
>
>     row.data <- xpathSApply( row, './td', xmlValue )
>     return( row.data)
>
>   }))
>
>
> Good luck!
>
> -Charlie
>
> [1]: http://www.w3schools.com/XPath/xpath_syntax.asp
>
> -----
> Charlie Sharpsteen
> Undergraduate
> Environmental Resources Engineering
> Humboldt State University

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.