Re: Website data-mining.
Hello,

> >> I'm using Python for the first time to make a plug-in for Firefox.
> >> The goal of this plug-in is to take the source code from a website
> >> and use the metadata and body text for different kinds of analysis.
> >> My question is: How can I retrieve data from a website? I'm not even
> >> sure if this is possible through Python. Any help?
> > Have a look at
> > http://www.myinterestingfiles.com/2007/03/playboy-germany-ads.html
>
> Well, it's certainly interesting, but I'm not sure how it might help the OP
> get data from a website...

Ouch, let there be a lesson to me to *read* my posts before sending them :)

Should have been http://wwwsearch.sourceforge.net/mechanize/.

--
Miki (who can't paste) Tebeka
[EMAIL PROTECTED]
http://pythonwise.blogspot.com
--
http://mail.python.org/mailman/listinfo/python-list
Re: Website data-mining.
Jay Loden wrote:
> Miki wrote:
> > Have a look at
> > http://www.myinterestingfiles.com/2007/03/playboy-germany-ads.html
>
> Well, it's certainly interesting, but I'm not sure how it might help the OP
> get data from a website...

A case of the Freudian clipboard, perhaps? ;-)

Paul
Re: Website data-mining.
Miki wrote:
> Hello,
>
>> I'm using Python for the first time to make a plug-in for Firefox.
>> The goal of this plug-in is to take the source code from a website
>> and use the metadata and body text for different kinds of analysis.
>> My question is: How can I retrieve data from a website? I'm not even
>> sure if this is possible through Python. Any help?
> Have a look at
> http://www.myinterestingfiles.com/2007/03/playboy-germany-ads.html

Well, it's certainly interesting, but I'm not sure how it might help the OP
get data from a website...

> for getting the data and at http://www.crummy.com/software/BeautifulSoup/
> for handling it.
>
> HTH.
>
> --
> Miki Tebeka <[EMAIL PROTECTED]>
> http://pythonwise.blogspot.com
Re: Website data-mining.
Hello,

> I'm using Python for the first time to make a plug-in for Firefox.
> The goal of this plug-in is to take the source code from a website
> and use the metadata and body text for different kinds of analysis.
> My question is: How can I retrieve data from a website? I'm not even
> sure if this is possible through Python. Any help?

Have a look at
http://www.myinterestingfiles.com/2007/03/playboy-germany-ads.html
for getting the data and at http://www.crummy.com/software/BeautifulSoup/
for handling it.

HTH.

--
Miki Tebeka <[EMAIL PROTECTED]>
http://pythonwise.blogspot.com
Re: Website data-mining.
On Aug 3, 7:50 pm, Coogan <[EMAIL PROTECTED]> wrote:
> Hi--
>
> I'm using Python for the first time to make a plug-in for Firefox.
> The goal of this plug-in is to take the source code from a website
> and use the metadata and body text for different kinds of analysis.
> My question is: How can I retrieve data from a website? I'm not even
> sure if this is possible through Python. Any help?
>
> nieu

How about this? It will fetch the HTML source of the page.

import sys
from urllib import urlencode
from urllib2 import build_opener, HTTPCookieProcessor, Request

# Shared opener that keeps cookies between requests.
opener = build_opener(HTTPCookieProcessor)

def urlopen2(url, data=None, user_agent='urlopen2'):
    """Open a URL; form data, if given, is URL-encoded and sent as a POST."""
    if hasattr(data, "__iter__"):
        data = urlencode(data)
    headers = {'User-Agent': user_agent}
    return opener.open(Request(url, data, headers))

###TESTCASES START HERE###
def publishedNotes():
    # No data argument, so the page is requested with a plain GET.
    page = urlopen2("http://www.yourURL.com")
    pageRead = page.read()
    print pageRead

if __name__ == '__main__':
    publishedNotes()
    sys.exit()
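Once the HTML source is in hand, the original question also asks about the metadata and body text. As a rough sketch only, here is one way that could look using nothing but the Python 3 standard library (html.parser and urllib.request) instead of the mechanize and BeautifulSoup packages suggested elsewhere in the thread. The names PageExtractor, extract, and fetch, and the "example-fetcher" user agent, are made up for this illustration and do not come from any post here.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class PageExtractor(HTMLParser):
    """Collects <meta> name/content pairs, the <title>, and body text."""

    SKIPPED = {"script", "style"}  # elements whose contents are not page text

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.title = ""
        self.text_parts = []
        self._in_title = False
        self._in_body = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            # Accept both name="..." and property="..." style meta tags.
            name = attrs.get("name") or attrs.get("property")
            if name and attrs.get("content") is not None:
                self.meta[name.lower()] = attrs["content"]
        elif tag == "title":
            self._in_title = True
        elif tag == "body":
            self._in_body = True
        elif tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "body":
            self._in_body = False
        elif tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_body and not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


def extract(html):
    """Return the title, meta tags, and body text found in an HTML string."""
    parser = PageExtractor()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title.strip(),
        "meta": parser.meta,
        "text": " ".join(parser.text_parts),
    }


def fetch(url, user_agent="example-fetcher"):
    """Download a page and decode it, falling back to UTF-8 if no charset is sent."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Something like extract(fetch("http://www.yourURL.com")) would then return a dictionary with "title", "meta", and "text" keys. Real-world pages are often messy enough that a dedicated parser such as BeautifulSoup may cope better than this.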
Website data-mining.
Hi--

I'm using Python for the first time to make a plug-in for Firefox.
The goal of this plug-in is to take the source code from a website
and use the metadata and body text for different kinds of analysis.
My question is: How can I retrieve data from a website? I'm not even
sure if this is possible through Python. Any help?

nieu