Hi Matthias,

I hope you don't mind me doing this directly via e-mail instead of the list. I'm a newbie at this and don't wanna appear idiotic ;)
[START OF FILE]
# Creative Commons crawl filter
# Each non-comment, non-blank line contains a regular expression
# prefixed by '+' or '-'. The first matching pattern in the file
# determines whether a URL is included or ignored. If no pattern
# matches, the URL is ignored.
# skip file:, ftp:, & mailto: urls
-^(file|ftp|mailto|https):
-\.(gif|GIF|jpg|JPG|ico|ICO|css|sit|eps|wmf|rtf|zip|ppt|mpg|xls|gz|rpm|tgz|mov|MOV|exe|pdf)$
[EMAIL PROTECTED]
+^http://www\.(.*\.za\.net|.*\.co\.za|.*\.org\.za|.*\.za\.com|.*\.za).* #Only ZA based domains starting with www
-http://.*/.*/.*/.*/.* #max depth is 3 directories after domain
-.*\.\..* #get rid of weird links eg. http://www.coolbananas.co.za/hello/../stuff/something.asp
-.*/.*(print|friend|emailto|register|signin|login|logon|signmenin).* #get rid of common redirects for logins and send to a friend stuff
[END OF FILE]

On Wed, 06 Apr 2005 11:49:32 +0200 Matthias Jaekle <[EMAIL PROTECTED]> wrote:
> > Hi! First post here goes... can anyone provide me with
> > their regular expression entries in the regex-urlfilter
> > files as my +entry works but my -entries don't work or so
> > it seems - can one have more than one - entry in the file?
> Yes,
> can you please post your whole regex file.
>
> Matthias
> --
> http://www.eventax.com - eventax GmbH
> http://www.umkreisfinder.de - Die Suchmaschine für Lokales und Events
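To make the first-match rule from the file's header comment concrete, here is a minimal Python sketch of how I understand the rules to be evaluated. It is not Nutch's actual code: the names `RULES` and `decide` are made up for illustration, the trailing `#...` comments are dropped from the patterns, the redacted `[EMAIL PROTECTED]` line is left out, and I am assuming a pattern counts as matching if it is found anywhere in the URL (hence `re.search`).

```python
import re

# Rules copied from the filter file above, in file order.
# Assumption: trailing "#..." comments are not part of the pattern.
# The redacted "[EMAIL PROTECTED]" line is omitted.
RULES = [
    r"-^(file|ftp|mailto|https):",
    r"-\.(gif|GIF|jpg|JPG|ico|ICO|css|sit|eps|wmf|rtf|zip|ppt|mpg|xls"
    r"|gz|rpm|tgz|mov|MOV|exe|pdf)$",
    r"+^http://www\.(.*\.za\.net|.*\.co\.za|.*\.org\.za|.*\.za\.com|.*\.za).*",
    r"-http://.*/.*/.*/.*/.*",
    r"-.*\.\..*",
    r"-.*/.*(print|friend|emailto|register|signin|login|logon|signmenin).*",
]


def decide(url, rules=RULES):
    """Return True if the URL is included, False if it is ignored.

    The first rule whose pattern is found in the URL decides the outcome;
    if no rule matches, the URL is ignored.
    """
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if re.search(pattern, url):
            return sign == "+"
    return False


if __name__ == "__main__":
    for url in [
        "http://www.coolbananas.co.za/login.asp",
        "http://www.coolbananas.co.za/logo.gif",
        "https://www.example.co.za/",
        "http://www.example.com/",
    ]:
        print(url, "->", "included" if decide(url) else "ignored")
```

Under these assumptions, a URL such as http://www.coolbananas.co.za/login.asp is included by the + entry before any of the - entries below it are ever looked at, so the order of the lines in the file matters.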
