FYI:
myBBC (if you tweak it to log you in) has the same spidering problem.
I've been killing myself trying to get around it.

kev 

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]]On Behalf Of Chris Combs
Sent: 02 October 2002 10:58
To: [EMAIL PROTECTED]
Subject: RE: Spidering Help


> The article is at:
> "The Next Ages of Game Development"
> http://avault.com/developer/getarticle.asp?name=bsawyer1
>
> and each page is linked thus:
> http://avault.com/developer/getarticle.asp?name=bsawyer1&page=2
> http://avault.com/developer/getarticle.asp?name=bsawyer1&page=3


I don't know how the Python code in question works, but could it be that
it's not expanding relative links before filtering with the regexp?

<a href="getarticle.asp?name=bsawyer1&page=2">

Does it work if you change your regexp to
".*getarticle\.asp\?name=bsawyer1.*"
(with the "." and "?" escaped so they match literally)?
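
To make the idea concrete, here is a minimal sketch of "expand first, then filter", written in modern Python with only the standard library. It is not the actual Plucker spider code, which I haven't read; the names BASE_URL, ARTICLE_RE and should_follow are made up for illustration. Note that an unescaped "?" in a regexp makes the preceding "p" optional instead of matching a literal question mark, so the pattern below escapes it.

```python
import re
from urllib.parse import urljoin

# URL of the page being parsed (taken from the message above).
BASE_URL = "http://avault.com/developer/getarticle.asp?name=bsawyer1"

# Hypothetical filter: follow only pages of this one article.
# "." and "?" are escaped so they match literally.
ARTICLE_RE = re.compile(r".*getarticle\.asp\?name=bsawyer1.*")


def should_follow(href, base_url=BASE_URL):
    """Resolve href against the page URL, then apply the filter."""
    absolute = urljoin(base_url, href)
    return absolute, bool(ARTICLE_RE.match(absolute))


# The relative link as it appears in the page's HTML.
absolute, ok = should_follow("getarticle.asp?name=bsawyer1&page=2")
print(absolute, ok)
```

If the spider filters the raw href instead, any pattern that begins with "http://avault.com/" can never match, because the href in the page is only "getarticle.asp?...". That would explain pages 2 and 3 being skipped.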


> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED]]On Behalf Of Ian Swainson
> Sent: Tuesday, October 01, 2002 4:33 AM
> To: [EMAIL PROTECTED]
> Subject: Spidering Help

_______________________________________________
plucker-list mailing list
[EMAIL PROTECTED]
http://lists.rubberchicken.org/mailman/listinfo/plucker-list
