You want to take a look at rvest: https://github.com/hadley/rvest
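
To give a rough idea of what that could look like for the plan quoted below, here is a minimal sketch. It is not a tested solution for WebMD: the CSS selector `p.comment` is a placeholder I made up, and `extract_comments()` and `scrape_reviews()` are hypothetical helper names, so inspect the page source and adjust them. It also assumes you have already built the vector of review-page URLs yourself (for example by following the "next" links by hand once to see how they are formed), and that the site's terms of use allow scraping.

```r
library(rvest)

# Hypothetical CSS selector for the review text. Inspect the real page
# and replace this with whatever actually matches the comment elements.
review_selector <- "p.comment"

# Pull the review comments out of one parsed page as a character vector.
extract_comments <- function(page, selector = review_selector) {
  nodes <- html_nodes(page, selector)
  trimws(html_text(nodes))
}

# Read each page in `page_urls` (assumed to be built by you beforehand)
# and stack the comments into one data frame, one row per review.
scrape_reviews <- function(page_urls, selector = review_selector, delay = 1) {
  per_page <- lapply(page_urls, function(u) {
    comments <- extract_comments(read_html(u), selector)
    Sys.sleep(delay)  # pause between requests to be polite to the server
    data.frame(url = rep(u, length(comments)),
               comment = comments,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, per_page)
}
```

You can try the extraction step without touching the network by parsing a small inline HTML string, e.g. `extract_comments(read_html("<html><body><p class='comment'>Worked well.</p></body></html>"))`, and only then point `scrape_reviews()` at the real review pages.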

On Thu, Feb 5, 2015 at 2:36 PM, Madhuri Maddipatla <[email protected]> wrote:
> Dear R experts,
>
> My requirement for web scraping in R goes like this.
>
> *Step 1* - All the medical conditions from A-Z are listed in the link
> below.
>
> http://www.webmd.com/drugs/index-drugs.aspx?show=conditions
>
> Choose the first condition, say Acid Reflux(GERD-...)
>
> *Step 2* - It lands on this page
>
> http://www.webmd.com/drugs/condition-1999-Acid%20Reflux%20%20GERD-Gastroesophageal%20Reflux%20Disease%20.aspx?diseaseid=1999&diseasename=Acid+Reflux+(GERD-Gastroesophageal+Reflux+Disease)&source=3
>
> with a list of drugs.
>
> Choose the column user reviews of the first drug, say "Nexium Oral"
>
> *Step 3* - Now it lands on the webpage
>
> http://www.webmd.com/drugs/drugreview-20536-Nexium+oral.aspx?drugid=20536&drugname=Nexium+oral
>
> with a list of reviews.
> I would like to scrape review information into a tabular format by scraping
> the html.
> For instance, I would like to fetch the full comment of each review as a
> column in a table.
> Also it should automatically go to the next page and fetch the full comments of
> all reviewers.
>
>
> Please help me in this endeavor and thanks a lot in advance for reading my
> mail and expecting response with your experience and expertise.
>
> Also please suggest me the possibility around my stepwise plan and any
> advice you would like to give me along with the solution.
>
> High Regards,
> *-----------------------------------------------------------------------------------------*
> *Madhuri Maddipatla*
> *-----------------------------------------------------------------------------------------*
>
> ______________________________________________
> [email protected] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Steve Lianoglou
Computational Biologist
Genentech

