Hi Bill,

Thanks for the reply. I know how the urllib module works, and I am not looking for scraping. I am looking to obtain the HTML page that my query is going to return. Just as when you type a query into a site like Amazon and get back a bunch of product listings, the module has to search the website and return the link to the resulting HTML page. I can of course scrape the information from that link.
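
To make that concrete, here is a rough sketch in Python 3 of the first half of what I mean: building the address of the results page for a query. The site address and the parameter name "q" are made up for illustration, and build_search_url is just a name I picked. Every real site uses its own search path and parameters, so this is only the general shape.

```python
from urllib.parse import urlencode


def build_search_url(base_url, params):
    """Return base_url with params appended as an encoded query string.

    Hypothetical helper: real sites each define their own search
    path and parameter names.
    """
    return base_url + "?" + urlencode(params)


# Example with a made-up site and parameter name
search_url = build_search_url("http://example.com/search", {"q": "python book"})
```

The page behind such a URL is the one I would then fetch and scrape. What I am after is a module that already knows these details for a given site.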

Thanks
Vin

On 02/27/2011 12:04 AM, Bill Allen wrote:
On Sat, Feb 26, 2011 at 21:11, vineeth <vineethrak...@gmail.com> wrote:

    Hello all,

    I am looking for a Python module to search a website and
    extract the URL.

    For example, I found a module for Amazon named "amazonproduct".
    Its API extracts the data based on the query and even parses
    the URL data. I am looking for similar query-search Python
    modules for other websites like Amazon.

    Any help is appreciated.

    Thank You
    Vin

I am not sure what url you are trying to extract, or from where, but I can give you an example of basic web scraping if that is your aim.

The following works for Python 2.x.

# This is the one module that gives you the needed methods to read the html from a webpage
import urllib

# set a variable to the needed website
mypath = "http://some_website.com"

# read all the html data from the page into a variable and then parse through it looking for urls
mylines = urllib.urlopen(mypath).readlines()
for item in mylines:
    if "http://" in item:
        # ...do something with the url that was found in the page html...
        # ...etc...
        pass
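
In Python 3, urllib.urlopen has moved to urllib.request.urlopen, so the code above will not run there unchanged. A minimal Python 3 sketch of the same idea follows. It uses the standard library's html.parser to pull out link targets instead of testing each line for "http://". The names LinkCollector, extract_links and fetch_links are made up for illustration, and decoding the page as UTF-8 is an assumption that will not hold for every site.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag (hypothetical helper class)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page address
                self.links.append(urljoin(self.base_url, value))


def extract_links(html_text, base_url):
    """Return all link targets found in html_text, as absolute URLs."""
    parser = LinkCollector(base_url)
    parser.feed(html_text)
    return parser.links


def fetch_links(url):
    """Download url and return its links (assumes a UTF-8 page)."""
    with urlopen(url) as response:
        page = response.read().decode("utf-8", errors="replace")
    return extract_links(page, url)
```

Calling fetch_links(mypath) would then return a list of the URLs found on the page, which you can loop over just like mylines above.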


--Bill
_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
