Hi All, 

I'm using Plucker Desktop version 1.2.0.0 (built Aug 30 2002) and I have 
a problem with the URL pattern filter.

I'm trying to download a French cinema site. The starting page:

http://www.allocine.com/seance/salleproche.html?codepostal=54000

has many links in it, but I'm only interested in the links to the different 
films. The URL pattern is a very simple one:

www.allocine.com/film/fichefilm_gen_cfilm=xxxxx.htm

where xxxxx is the internal number of the film.

So I have tried the following URL pattern filters (in the "Limits" tab, 
under "'URL pattern' filter"):

.*www\.allocine\.com/film.*
.*www.allocine.com/film.*
.*www.allocine.com/film/fichefilm.*

The second is exactly like the example in the Help for the Limits tab.

However, every time I parse, other pages like

www.allocine.com/service
www.allocine.com/critique
www.allocine.com/article
www.allocine.com/forum

are also included.
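
As a sanity check outside Plucker, the three patterns can be run through
Python's re module. This is only a sketch: it assumes the filter is treated
as a Python-style regular expression matched against the full URL (including
the http:// prefix), which is not confirmed here, and the film number 12345
is a made-up placeholder for xxxxx.

```python
import re

# The three patterns tried in the "Limits" tab.
PATTERNS = [
    r".*www\.allocine\.com/film.*",
    r".*www.allocine.com/film.*",
    r".*www.allocine.com/film/fichefilm.*",
]

# Hypothetical film page; 12345 stands in for the real film number.
WANTED = "http://www.allocine.com/film/fichefilm_gen_cfilm=12345.htm"

# The pages that should have been filtered out.
UNWANTED = [
    "http://www.allocine.com/service",
    "http://www.allocine.com/critique",
    "http://www.allocine.com/article",
    "http://www.allocine.com/forum",
]


def matches(pattern, url):
    """Return True if the pattern matches the URL from its first character."""
    return re.match(pattern, url) is not None


def check(patterns=PATTERNS):
    """For each pattern, report whether WANTED matches and which UNWANTED URLs leak through."""
    results = {}
    for pattern in patterns:
        leaked = [url for url in UNWANTED if matches(pattern, url)]
        results[pattern] = (matches(pattern, WANTED), leaked)
    return results


for pattern, (wanted_ok, leaked) in check().items():
    print(pattern, "| film page matched:", wanted_ok, "| unwanted matched:", leaked)
```

Under that assumption, all three patterns accept the film page and reject
the four unwanted pages, which suggests the patterns themselves are fine.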
It seems that Plucker is completely ignoring the URL pattern I give.
Any idea what I'm doing wrong?

Any hint is very much appreciated.

Thanks in advance

Oliver
