FYI, myBBC (if you fiddle it to log you in) has this same spidering problem. I've been killing myself trying to get around it.
kev

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Chris Combs
Sent: 02 October 2002 10:58
To: [EMAIL PROTECTED]
Subject: RE: Spidering Help

> The article is at:
> "The Next Ages of Game Development"
> http://avault.com/developer/getarticle.asp?name=bsawyer1
>
> and each page is linked thus:
> http://avault.com/developer/getarticle.asp?name=bsawyer1&page=2
> http://avault.com/developer/getarticle.asp?name=bsawyer1&page=3

I don't know how the Python code in question works, but could it be that
it's not expanding relative links before filtering with the regexp?

<a href="getarticle.asp?name=bsawyer1&page=2">

Does it work if you change your regexp to
".*getarticle.asp?name=bsawyer1.*" ?

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED]]On Behalf Of Ian Swainson
> Sent: Tuesday, October 01, 2002 4:33 AM
> To: [EMAIL PROTECTED]
> Subject: Spidering Help

_______________________________________________
plucker-list mailing list
[EMAIL PROTECTED]
http://lists.rubberchicken.org/mailman/listinfo/plucker-list
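
For anyone following the thread, Chris's guess (relative links not being expanded before the regexp filter) can be sketched in a few lines of Python. This is not Plucker's spider code, which nobody in the thread has quoted; `filter_links` and the `../index.asp` link are hypothetical and only there for illustration. The sketch also shows that a literal `?` in the pattern has to be escaped, since a bare `?` in a Python regexp makes the preceding character optional.

```python
import re
from urllib.parse import urljoin

# Page the spider starts from (URL taken from the thread).
BASE_URL = "http://avault.com/developer/getarticle.asp?name=bsawyer1"

# hrefs as they might appear in the page source; the last one is made up.
HREFS = [
    "getarticle.asp?name=bsawyer1&page=2",
    "getarticle.asp?name=bsawyer1&page=3",
    "../index.asp",
]

# Pattern written against absolute URLs, with "." and "?" escaped.
PATTERN = r"http://avault\.com/developer/getarticle\.asp\?name=bsawyer1.*"


def filter_links(base_url, hrefs, pattern):
    """Hypothetical helper: expand each href against base_url, then filter."""
    regex = re.compile(pattern)
    absolute = [urljoin(base_url, href) for href in hrefs]
    return [url for url in absolute if regex.match(url)]


# Filtering the raw hrefs finds nothing, because none of them is absolute.
raw_matches = [h for h in HREFS if re.match(PATTERN, h)]

# Expanding first lets the same pattern pick up the two article pages.
expanded_matches = filter_links(BASE_URL, HREFS, PATTERN)

# With the "?" left unescaped, "asp?" means "as" plus an optional "p",
# so the pattern never matches the literal "?" in the URL.
unescaped = ".*getarticle.asp?name=bsawyer1.*"
unescaped_hit = re.match(unescaped, BASE_URL + "&page=2")
```

If the real spider filters before expanding, either expanding first or loosening the pattern so it no longer depends on the host and path should help, as long as the `?` is escaped.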

