[R] Web Scraping

Mohamed Anany Fri, 04 Oct 2013 15:39:30 -0700

Hello everybody,
I just started using R, and I'm presenting a poster for R day at Kennesaw
State University. I really need some help with web scraping.
I'm trying to extract used-car data from www.cars.com, including the
mileage, year, model, make, price, CARFAX availability and Technology
package availability. I've done some research, and everything points to the
XML package and the RCurl package. I also got my hands on a function that
captures all the text in a web page and stores it as one huge character
vector. I've never done data mining before, so when I read the help documents
for the packages I mentioned earlier, it is like reading Chinese. I would
appreciate it if you could guide me through this process of data extraction.
Here's an example of what the data would look like:


Cost     Year   Mileage   Tech   CARFAX   Make   Model
$32000   1999   57,987    1      FREE     Audi   A4

Here's the link to the search:
http://www.cars.com/for-sale/searchresults.action?stkTyp=U&tracktype=usedcc&mkId=20049&AmbMkId=20049&AmbMkNm=Audi&make=Audi&AmbMdNm=A4&model=A4&mdId=20596&AmbMdId=20596&rd=100&zc=30062&searchSource=QUICK_FORM&enableSeo=1

I'm not expecting you to write the whole code for me, just some
guidance on where to start and which functions would be useful in my
situation.
Thanks a lot anyway.

Regards,
M. Samir Anany


______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
