________________________________
From: Alan Gauld <alan.ga...@btinternet.com>
To: tutor@python.org
Sent: Wed, May 18, 2011 4:40:19 PM
Subject: Re: [Tutor] can I walk or glob a website?

"Dave Angel" <da...@ieee.org> wrote
>> "Albert-Jan Roskam" <fo...@yahoo.com> wrote
>>> How can I walk (as in os.walk) or glob a website?
>>
>> I don't think there is a way to do that via the web.
> It has to be (more or less) possible. That's what google does for their
> search engine.

Google trawls the site following links.
If that's all he wants then it's fairly easy.
I took it he wanted to actually trawl the server, getting *all* the pdf files, not just the published pdfs...

Depends what the real requirement is.

===> No, I meant only the published ones. I would consider it somewhat dodgy/unethical/whatever-you-wanna-call-it to download unpublished stuff. Indeed, I only need published data.

Alan G.
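
For the "published files only" case, the link-following approach mentioned above could be sketched roughly as follows. This is a minimal illustration using only the Python 3 standard library, not a definitive crawler: the names LinkCollector, fetch_html and find_pdfs, the max_pages limit, and the "ends with .pdf" test are all assumptions made for the example. It also does not check robots.txt or pause between requests, which a real crawler should do.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Return the page text, or "" if it cannot be read or is not HTML."""
    try:
        with urlopen(url, timeout=10) as response:
            if response.headers.get_content_type() != "text/html":
                return ""
            charset = response.headers.get_content_charset() or "utf-8"
            return response.read().decode(charset, errors="replace")
    except (OSError, ValueError):
        # URLError/HTTPError and timeouts are OSError subclasses.
        return ""


def find_pdfs(start_url, fetch=fetch_html, max_pages=50):
    """Follow links on start_url's host and return the PDF URLs found."""
    host = urlparse(start_url).netloc
    to_visit, seen, pdfs = [start_url], {start_url}, set()
    visited = 0
    while to_visit and visited < max_pages:
        page = to_visit.pop()
        visited += 1
        parser = LinkCollector()
        parser.feed(fetch(page))
        for href in parser.links:
            # Resolve relative links and drop any #fragment.
            url = urldefrag(urljoin(page, href)).url
            if url in seen or urlparse(url).netloc != host:
                continue  # already handled, or points off-site
            seen.add(url)
            if urlparse(url).path.lower().endswith(".pdf"):
                pdfs.add(url)  # record it, but do not download it
            else:
                to_visit.append(url)
    return sorted(pdfs)
```

Calling find_pdfs("http://www.example.com/") (a placeholder address) would return a sorted list of the PDF links reachable from that page on the same host. Only files that are actually linked from published pages can be found this way, which matches the requirement stated above.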
_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor