On 10/11/07, Dick Moores <[EMAIL PROTECTED]> wrote:
- I think I could learn a lot about the use of Python with the web by
- writing a script that would look at
- <http://starship.python.net/crew/index.html> and find all the links
- that lead to more than just the default page shown by this one:
- <http://starship.python.net/crew/beazley/>. I think there should be
- about 20 URLs in the list. But I need a start. So give me one?
A start? Start with urllib2 in the standard library.
Load the page source at <http://starship.python.net/crew/index.html> and have your script create a list of all the URLs you wish to visit.
Loop through that list, opening each URL. If the page source differs from the standard "WAITING..." source, add that URL to a new list of "good" URLs.
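
The steps above could be sketched roughly as follows. This is only a sketch under some assumptions: it is written for Python 3, where the urllib2 functionality lives in urllib.request; the names LinkCollector, collect_links, fetch and good_urls are made up for illustration; and it assumes every unfinished crew page returns exactly the same default source, which may not hold on the real site.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen  # Python 3 home of urllib2's urlopen

INDEX_URL = "http://starship.python.net/crew/index.html"


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))


def collect_links(html, base_url):
    """Return the absolute URLs of all links found in the page source."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


def fetch(url):
    """Download a page and return its source as text."""
    with urlopen(url) as response:
        # latin-1 never fails to decode; the real encoding is an assumption.
        return response.read().decode("latin-1")


def good_urls(urls, default_source, fetch=fetch):
    """Return the URLs whose source differs from the default page source."""
    default = default_source.strip()
    return [url for url in urls if fetch(url).strip() != default]
```

To try it, fetch the index page with fetch(INDEX_URL), pass the result to collect_links(), fetch <http://starship.python.net/crew/beazley/> once to obtain the default source, and hand both to good_urls(). If the default pages turn out to differ slightly from one another, testing for a marker string such as "WAITING" would be a more forgiving comparison.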
How about a hint of how to get those ">jcooley<" things from the source? (I'm able to have the script get the source, using urllib2.)
BTW I thought I wouldn't try to use BeautifulSoup right now, but do it the hard way instead.
Dick
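
One possible hint for pulling the ">jcooley<" pieces out of the source without BeautifulSoup is a regular expression over the anchor tags. The sketch below assumes the index page marks up each crew member roughly as <a href="jcooley/">jcooley</a>, which has not been checked against the real page; ANCHOR_RE and anchor_pairs are illustrative names. Regular expressions are fragile on arbitrary HTML, so treat this as a starting point only.

```python
import re

# Matches <a ... href="...">text</a>, capturing the href and the link text.
# Assumes quoted href values and link text with no nested tags.
ANCHOR_RE = re.compile(
    r'<a\s[^>]*?href\s*=\s*["\']([^"\']+)["\'][^>]*>([^<]*)</a>',
    re.IGNORECASE,
)


def anchor_pairs(html):
    """Return a list of (href, link text) tuples found in the page source."""
    return [(href, text.strip()) for href, text in ANCHOR_RE.findall(html)]
```

Each href can then be joined to the index URL with urllib.parse.urljoin to get the full address to open.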
_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor