Re: Website data-mining.

2007-08-04 Thread Miki
Hello,

> >> I'm using Python for the first time to make a plug-in for Firefox.
> >> The goal of this plug-in is to take the source code from a website
> >> and use the metadata and body text for different kinds of analysis.
> >> My question is: How can I retrieve data from a website? I'm not even
> >> sure if this is possible through Python. Any help?
> > Have a look at
> > http://www.myinterestingfiles.com/2007/03/playboy-germany-ads.html
>
> Well, it's certainly interesting, but I'm not sure how it might help the OP 
> get data from a website...
Ouch, let that be a lesson to me to *read* my posts before sending
them :)

Should have been http://wwwsearch.sourceforge.net/mechanize/.

--
Miki (who can't paste) Tebeka
[EMAIL PROTECTED]
http://pythonwise.blogspot.com

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Website data-mining.

2007-08-04 Thread Paul Boddie
Jay Loden wrote:
> Miki wrote:
> > Have a look at 
> > http://www.myinterestingfiles.com/2007/03/playboy-germany-ads.html
>
> Well, it's certainly interesting, but I'm not sure how it might help the OP 
> get data from a website...

A case of the Freudian clipboard, perhaps? ;-)

Paul



Re: Website data-mining.

2007-08-03 Thread Jay Loden
Miki wrote:
> Hello,
> 
>> I'm using Python for the first time to make a plug-in for Firefox.
>> The goal of this plug-in is to take the source code from a website
>> and use the metadata and body text for different kinds of analysis.
>> My question is: How can I retrieve data from a website? I'm not even
>> sure if this is possible through Python. Any help?
> Have a look at 
> http://www.myinterestingfiles.com/2007/03/playboy-germany-ads.html

Well, it's certainly interesting, but I'm not sure how it might help the OP get 
data from a website...

> for getting the data and at http://www.crummy.com/software/BeautifulSoup/
> for handling it.
> 
> HTH.
> 
> --
> Miki Tebeka <[EMAIL PROTECTED]>
> http://pythonwise.blogspot.com
> 


Re: Website data-mining.

2007-08-03 Thread Miki
Hello,

> I'm using Python for the first time to make a plug-in for Firefox.
> The goal of this plug-in is to take the source code from a website
> and use the metadata and body text for different kinds of analysis.
> My question is: How can I retrieve data from a website? I'm not even
> sure if this is possible through Python. Any help?
Have a look at 
http://www.myinterestingfiles.com/2007/03/playboy-germany-ads.html
for getting the data and at http://www.crummy.com/software/BeautifulSoup/
for handling it.
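
For the "handling it" half, here is a minimal sketch of pulling the metadata and body text out of a page once its HTML is in a string. It assumes Python 3 and swaps in the standard library's html.parser for BeautifulSoup so that nothing needs installing. The names PageSummary and summarize are made up for this example, and it only looks at <meta name=... content=...> tags and the text inside <body>.

```python
from html.parser import HTMLParser


class PageSummary(HTMLParser):
    """Hypothetical helper: collects <meta name/content> pairs and body text."""

    SKIP = {"script", "style"}  # elements whose contents are not visible text

    def __init__(self):
        super().__init__()
        self.meta = {}          # lower-cased meta name -> content
        self.text = []          # stripped text fragments found inside <body>
        self._in_body = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name") and attrs.get("content") is not None:
            self.meta[attrs["name"].lower()] = attrs["content"]
        elif tag == "body":
            self._in_body = True
        elif tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self._in_body = False
        elif tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is inside <body> and outside script/style.
        if self._in_body and not self._skip_depth and data.strip():
            self.text.append(data.strip())


def summarize(html):
    """Return (meta dict, body text) for an HTML string."""
    parser = PageSummary()
    parser.feed(html)
    parser.close()
    # Fragments are joined with single spaces, so text split across inline
    # tags in the middle of a word would gain a space.
    return parser.meta, " ".join(parser.text)
```

Calling summarize(page_source) gives back a dict of meta tags and one string of body text, which is roughly the input the analysis described in the question would need. For real-world, badly formed pages BeautifulSoup is likely to be more forgiving than this sketch.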

HTH.

--
Miki Tebeka <[EMAIL PROTECTED]>
http://pythonwise.blogspot.com



Re: Website data-mining.

2007-08-03 Thread SMERSH009
On Aug 3, 7:50 pm, Coogan <[EMAIL PROTECTED]> wrote:
> Hi--
>
> I'm using Python for the first time to make a plug-in for Firefox.
> The goal of this plug-in is to take the source code from a website
> and use the metadata and body text for different kinds of analysis.
> My question is: How can I retrieve data from a website? I'm not even
> sure if this is possible through Python. Any help?
>
> nieu

How about this? It will fetch the HTML source of the page.

import sys
from urllib import urlencode
from urllib2 import build_opener, HTTPCookieProcessor, Request

# Python 2 code: urllib2 and the print statement do not exist in Python 3.
# One shared opener, so cookies persist across requests.
opener = build_opener(HTTPCookieProcessor)

def urlopen2(url, data=None, user_agent='urlopen2'):
    """Opens our URLs."""
    # Form-encode dicts and sequences of pairs; strings pass through as-is.
    if hasattr(data, "__iter__"):
        data = urlencode(data)
    headers = {'User-Agent' : user_agent}
    return opener.open(Request(url, data, headers))

###TESTCASES START HERE###
def publishedNotes():
    page = urlopen2("http://www.yourURL.com", ())
    pageRead = page.read()
    print pageRead

if __name__ == '__main__':
    publishedNotes()
    sys.exit()
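
For anyone on Python 3, where urllib2 has been folded into urllib.request, a rough equivalent of the helper above might look like the sketch below. The names build_request and fetch, the 10-second timeout, and the UTF-8 fallback are assumptions made for this example and are not part of the code above.

```python
from http.cookiejar import CookieJar
from urllib.parse import urlencode
from urllib.request import HTTPCookieProcessor, Request, build_opener

# One shared opener, so cookies set by a site are sent back on later requests.
opener = build_opener(HTTPCookieProcessor(CookieJar()))


def build_request(url, data=None, user_agent="urlopen2"):
    """Build a Request; with data=None it is a GET, otherwise a POST."""
    # Dicts and sequences of pairs are form-encoded; str/bytes pass through.
    if data is not None and not isinstance(data, (bytes, str)):
        data = urlencode(data)
    if isinstance(data, str):
        data = data.encode("ascii")  # urllib.request wants bytes for the body
    return Request(url, data, headers={"User-Agent": user_agent})


def fetch(url, data=None, user_agent="urlopen2", timeout=10):
    """Return the page source as text (needs network access when called)."""
    with opener.open(build_request(url, data, user_agent), timeout=timeout) as resp:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

Something like print(fetch("http://www.yourURL.com")) would then print the HTML source, sent as a plain GET because no data is passed.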



Website data-mining.

2007-08-03 Thread Coogan
Hi--


I'm using Python for the first time to make a plug-in for Firefox.
The goal of this plug-in is to take the source code from a website
and use the metadata and body text for different kinds of analysis.
My question is: How can I retrieve data from a website? I'm not even
sure if this is possible through Python. Any help?




nieu