Re: Website data-mining.

SMERSH009 Fri, 03 Aug 2007 20:23:20 -0700

On Aug 3, 7:50 pm, Coogan <[EMAIL PROTECTED]> wrote:
> Hi--
>
> I'm using Python for the first time to make a plug-in for Firefox.
> The goal of this plug-in is to take the source code from a website
> and use the metadata and body text for different kinds of analysis.
> My question is: How can I retrieve data from a website? I'm not even
> sure if this is possible through Python. Any help?
>
> nieu


How about this? It fetches the HTML source of a page (Python 2, using urllib2).

from urllib import urlencode
from urllib2 import build_opener, HTTPCookieProcessor, Request

# Shared opener that keeps cookies between requests.
opener = build_opener(HTTPCookieProcessor)

def urlopen2(url, data=None, user_agent='urlopen2'):
    """Open url and return the response; POSTs when data is given."""
    # Encode a dict or a sequence of pairs as form data.
    if hasattr(data, "__iter__"):
        data = urlencode(data)
    # Set the headers outside the if, so a plain GET (data=None) works too.
    headers = {'User-Agent': user_agent}
    return opener.open(Request(url, data, headers))

###TESTCASES START HERE###
def publishedNotes():
    # Replace the placeholder with the URL you want to fetch.
    page = urlopen2("http://www.yourURL.com")
    pageRead = page.read()
    print pageRead

if __name__ == '__main__':
    publishedNotes()
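
For anyone on Python 3, where urllib2 was folded into urllib.request, here is a rough sketch of the same idea. It also pulls out the meta tags and body text that the original question asked about, using the standard html.parser module. The names build_request, fetch, PageSummary and summarize are made up for this example and are not part of the original post; treat it as a starting point under those assumptions rather than a finished scraper.

```python
from html.parser import HTMLParser
from http.cookiejar import CookieJar
from urllib.parse import urlencode
from urllib.request import HTTPCookieProcessor, Request, build_opener

# Shared opener that keeps cookies between requests.
opener = build_opener(HTTPCookieProcessor(CookieJar()))


def build_request(url, data=None, user_agent="urlopen2"):
    """Build a Request; form data (a dict or pairs) makes it a POST."""
    if data is not None and not isinstance(data, bytes):
        data = urlencode(data).encode("ascii")
    return Request(url, data, {"User-Agent": user_agent})


def fetch(url, data=None, user_agent="urlopen2"):
    """Return the page source as text (hypothetical helper, needs network)."""
    with opener.open(build_request(url, data, user_agent)) as page:
        charset = page.headers.get_content_charset() or "utf-8"
        return page.read().decode(charset, errors="replace")


class PageSummary(HTMLParser):
    """Collect <meta name=... content=...> pairs and the text inside <body>."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.text = []
        self._in_body = False
        self._skip = 0  # depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and "name" in attrs:
            self.meta[attrs["name"]] = attrs.get("content") or ""
        elif tag == "body":
            self._in_body = True
        elif tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1
        elif tag == "body":
            self._in_body = False

    def handle_data(self, data):
        # Keep only non-empty text that is in the body and not script/style.
        if self._in_body and not self._skip and data.strip():
            self.text.append(data.strip())


def summarize(html):
    """Return (meta dict, body text) for an HTML string."""
    parser = PageSummary()
    parser.feed(html)
    parser.close()
    return parser.meta, " ".join(parser.text)
```

Typical use would be something like meta, text = summarize(fetch("http://www.yourURL.com")), with the placeholder URL swapped for a real one. Building the request is kept separate from opening it so the headers and form data can be checked without touching the network.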

-- 
http://mail.python.org/mailman/listinfo/python-list
