Thank you for your reply.

However, I want to use only BeautifulSoup, to improve my programming skills.

Below is my simple Python code, which retrieves the name, price, and link
of every mobile phone on flipkart.com.


import MySQLdb
import urllib2
from bs4 import BeautifulSoup

db = MySQLdb.connect("localhost", "root", "", "rate")
cursor = db.cursor()

# Parameterized query: the driver escapes quotes in names and prices.
sql = "INSERT INTO flipkart (M_NAME, M_PRICE, ADDRESS) VALUES (%s, %s, %s)"

for i in range(1, 2317):
    url = ("http://www.flipkart.com/mobiles/pr?sid=tyy,4io&start=" + str(i) +
           "&otracker=nmenu_sub_electronics_0_All%20Brands")
    page = urllib2.urlopen(url)
    soup = BeautifulSoup(page)
    # find() returns only the first match, so each request yields one phone.
    links = soup.find("a", {"class": "fk-display-block"})
    prices = soup.find("span", {"class": "fk-font-17 fk-bold"})
    if links is None or prices is None:
        continue  # nothing to store for this page

    name = links.text.lstrip()
    price = prices.text
    address = "flipkart.com" + links.get('href')

    print '\n\nPhone No = ', i
    print name + '\n' + price + '\n' + address
    try:
        cursor.execute(sql, (name, price, address))
        db.commit()
    except MySQLdb.Error:
        db.rollback()
db.close()



This code retrieves all the details successfully, but it is slow.
Is there any way to make it faster?
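
One possible direction, as a rough sketch that has not been measured against
this site: the loop above waits for each of the 2316 requests to finish before
starting the next one, and it commits after every row. Assuming most of the
time goes to waiting on the network, the pages could be fetched on a small
thread pool and the rows inserted in one batch. The sketch uses only the
Python 3 standard library (on Python 2, concurrent.futures needs the separate
"futures" backport). The names scrape_all, save_rows, and fetch_one, and the
value workers=8, are hypothetical and not part of the original script.

```python
from concurrent.futures import ThreadPoolExecutor


def scrape_all(starts, fetch_one, workers=8):
    """Call fetch_one(start) for each start value on a pool of threads.

    fetch_one is a hypothetical helper supplied by the caller: it should
    download and parse one page and return a (name, price, address) tuple,
    or None if that page should be skipped.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map() yields results in the same order as `starts`.
        results = pool.map(fetch_one, starts)
        return [row for row in results if row is not None]


def save_rows(cursor, sql, rows):
    """Insert all rows with a single executemany() call (DB-API 2.0)."""
    if rows:
        cursor.executemany(sql, rows)
```

Here fetch_one would wrap the existing urlopen / BeautifulSoup / find lines
for a single value of i, and one db.commit() after save_rows(cursor, sql, rows)
would replace the per-row commits. A small worker count is probably wise, to
avoid putting too much load on the server.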

Thanks in advance...

With Regards
S. Praveen

http://praveenlearner.wordpress.com


On Thu, Apr 3, 2014 at 11:36 PM, Shrinivasan T <[email protected]> wrote:

> Not a direct answer, but related.
>
> Portia is a tool for visually scraping web sites without any
> programming knowledge. Just annotate web pages with a point and click
> editor to indicate what data you want to extract, and portia will
> learn how to scrape similar pages from the site. Portia has a web
> based UI served by a Twisted server, so you can install it on almost
> any modern platform.
>
> http://blog.scrapinghub.com/2014/04/01/announcing-portia/
> https://github.com/scrapinghub/portia
>
> It may be useful for you.
>
>
> --
> Regards,
> T.Shrinivasan
>
_______________________________________________
ILUGC Mailing List:
http://www.ae.iitm.ac.in/mailman/listinfo/ilugc
ILUGC Mailing List Guidelines:
http://ilugc.in/mailinglist-guidelines
