np, you are welcome

Dean Elwood wrote:

Ah, thanks EM - so basically we need to escape the dots. That didn't even occur to me. Many thanks!

Dean

----- Original Message -----
From: "EM" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Sunday, November 06, 2005 9:43 PM
Subject: Re: Not crawling specific pages




Here's the relevant extract from my crawl-urlfilter.txt file:-

# Site to crawl
+^http://([a-z0-9]*\.)*mysite.org/

# ignore error pages
-^http://www.mysite.org/view/.error_page

As you can see, I took a "guess" that I could simply use the minus sign as a means of ignoring the page that I want excluded.

This doesn't seem to work. Any guidance would be greatly appreciated.


Any dot in the URL has to be written as "\." (without the quotes).
A bare dot in the expression matches any single character, not just a literal dot.
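
To illustrate the difference, here is a small sketch using Python's re module rather than the crawler itself. It assumes the dot before "error_page" is meant as a literal dot, and the second URL is a made-up look-alike used only to show the wildcard behaviour. The crawler's own regex engine may differ in other details, but "." and "\." behave this way in the common regex flavours.

```python
import re

# Unescaped: every "." is a wildcard that matches any single character.
loose = re.compile(r"^http://www.mysite.org/view/.error_page")

# Escaped: "\." matches only a literal dot.
strict = re.compile(r"^http://www\.mysite\.org/view/\.error_page")

urls = [
    "http://www.mysite.org/view/.error_page",  # the page from the filter rule
    "http://wwwXmysite-org/view/_error_page",  # hypothetical look-alike, no literal dots
]

for url in urls:
    # Print whether each pattern matches the start of the URL.
    print(url, "loose:", bool(loose.match(url)), "strict:", bool(strict.match(url)))
```

Both patterns match the first URL, but only the unescaped one also matches the look-alike, so escaping the dots makes the rule apply to exactly the host and path that were intended.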

For more, google for "regex"

Hope this helps,
EM


