Thank you for your reply.
However, I want to use only BeautifulSoup, in order to improve my programming skills.
Below is my simple Python code, which retrieves the name, price, and link of
each mobile phone listed on flipkart.com:

import MySQLdb
import urllib2
from bs4 import BeautifulSoup

db = MySQLdb.connect("localhost", "root", "", "rate")
cursor = db.cursor()
# Parameterized query: the driver escapes the values (names may contain quotes).
sql = "INSERT INTO flipkart (M_NAME, M_PRICE, ADDRESS) VALUES (%s, %s, %s)"
for i in range(1, 2317):
    url = ("http://www.flipkart.com/mobiles/pr?sid=tyy,4io&start=" + str(i) +
           "&otracker=nmenu_sub_electronics_0_All%20Brands")
    page = urllib2.urlopen(url)
    soup = BeautifulSoup(page)
    # find() returns only the first match, so each request yields one phone.
    links = soup.find("a", {"class": "fk-display-block"})
    prices = soup.find("span", {"class": "fk-font-17 fk-bold"})
    name = links.text.lstrip()
    price = prices.text
    address = "flipkart.com" + links.get('href')
    print '\n\nPhone No = ', i
    print name + '\n' + price + '\n' + address
    try:
        cursor.execute(sql, (name, price, address))
        db.commit()
    except MySQLdb.Error:
        db.rollback()
db.close()

This code retrieves all the details successfully, but it is slow.
Is there any way to make it faster?
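
One possible direction, only as an untested sketch: the loop waits for each
page before requesting the next, so most of the time is probably spent on the
network. The fetches could overlap in a small thread pool. The names
build_url, fetch_all, fetch_one and the default of 8 workers are hypothetical;
fetch_one stands for whatever function downloads and parses a single page.

```python
from multiprocessing.pool import ThreadPool

# Same URL pieces as in the loop above.
BASE_URL = "http://www.flipkart.com/mobiles/pr?sid=tyy,4io&start="
TRACKER = "&otracker=nmenu_sub_electronics_0_All%20Brands"


def build_url(start):
    """Return the listing URL for one `start` offset."""
    return BASE_URL + str(start) + TRACKER


def fetch_all(starts, fetch_one, workers=8):
    """Call fetch_one(url) for every offset, up to `workers` at a time.

    fetch_one is supplied by the caller (hypothetical); results come back
    in the same order as `starts`.
    """
    pool = ThreadPool(workers)
    try:
        return pool.map(fetch_one, [build_url(s) for s in starts])
    finally:
        pool.close()
        pool.join()

# Hypothetical usage: fetch_one(url) would wrap urllib2.urlopen and
# BeautifulSoup and return a (name, price, address) tuple.
# rows = fetch_all(range(1, 2317), fetch_one)
```

The database inserts would stay in the main thread, and committing once per
batch of rows rather than once per row might also save time. I have not
measured either change, and the site may not like many parallel requests.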
Thanks in advance...
With Regards
S. Praveen
http://praveenlearner.wordpress.com
On Thu, Apr 3, 2014 at 11:36 PM, Shrinivasan T <[email protected]> wrote:
> Not a direct answer, but related.
>
> Portia is a tool for visually scraping web sites without any
> programming knowledge. Just annotate web pages with a point and click
> editor to indicate what data you want to extract, and portia will
> learn how to scrape similar pages from the site. Portia has a web
> based UI served by a Twisted server, so you can install it on almost
> any modern platform.
>
> http://blog.scrapinghub.com/2014/04/01/announcing-portia/
> https://github.com/scrapinghub/portia
>
> It may be useful for you.
>
>
> --
> Regards,
> T.Shrinivasan