My current regex-urlfilter properties are as follows:

# skip file: ftp: and mailto: urls
#-^(file|ftp|mailto):

# skip image and other suffixes we can't yet parse
# for a more extensive coverage use the urlfilter-suffix plugin
#-\.(gif|GIF|jpg|JPG|png|PNG|ico|ICO|css|CSS|sit|SIT|eps|EPS|wmf|WMF|zip|ZIP|ppt|pdf|PPT|mpg|MPG|xls|XLS|gz|GZ|rpm|RPM|tgz|TGZ|mov|MOV|exe|EXE|jpeg|JPEG|bmp|BMP|js|JS)$

# skip URLs containing certain characters as probable queries, etc.
#-[?*!@=]

# skip URLs with slash-delimited segment that repeats 3+ times, to break loops
#-.*(/[^/]+)/[^/]+\1/[^/]+\1/

# accept anything else
-^(http://up.anv.bz)
+.

# skip URLs longer than 512 characters
-^.{513,}$
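
For reference, the interaction between the uncommented rules above can be sketched in Python. This is only an illustration under stated assumptions: it assumes the filter walks the rules top to bottom and that the first pattern found anywhere in the URL decides the outcome, which is how Nutch's regex URL filter is generally understood to behave but is not confirmed in this thread. The `filter_url` helper and the `example.com` URLs are hypothetical.

```python
import re

# Active (uncommented) rules from the filter file above, in file order.
# "-" rejects a URL, "+" accepts it.
RULES = [
    ("-", r"^(http://up.anv.bz)"),
    ("+", r"."),
    ("-", r"^.{513,}$"),
]


def filter_url(url, rules=RULES):
    """Return (index of the first matching rule, accepted?).

    Assumption: the first rule whose pattern is found in the URL wins
    and later rules are never consulted.
    """
    for index, (sign, pattern) in enumerate(rules):
        if re.search(pattern, url):
            return index, sign == "+"
    # Assumption: a URL that matches no rule is rejected.
    return None, False


if __name__ == "__main__":
    samples = [
        "http://up.anv.bz/page",               # hits the first "-" rule
        "http://example.com/",                 # hypothetical URL, falls through to "+."
        "http://example.com/" + "a" * 600,     # longer than 512 characters
    ]
    for url in samples:
        print(url[:40], filter_url(url))
```

Under that first-match assumption, the over-long URL is decided by `+.` before the 512-character rule is ever consulted, so the order of the rules matters.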

Thanks and Regards,
Shubham Gupta

On Wednesday 05 October 2016 11:29 AM, Sachin Shaju wrote:
my regex-urlfilter properties are as follows:
>>>>
>>>># skip file: ftp: and mailto: urls
>>>>-^(file|ftp|mailto):
>>>>
>>>># skip image and other suffixes we can't yet parse
>>>># for a more extensive coverage use the urlfilter-suffix plugin
>>>>-\.(gif|GIF|jpg|JPG|png|PNG|ico|ICO|css|CSS|sit|SIT|eps|EPS|wmf|WMF|zip|ZIP|ppt|pdf|PPT|mpg|MPG|xls|XLS|gz|GZ|rpm|RPM|tgz|TGZ|mov|MOV|exe|EXE|jpeg|JPEG|bmp|BMP|js|JS)$
>>>>
>>>># skip URLs containing certain characters as probable queries, etc.
>>>>#-[?*!@=]
>>>>
>>>># skip URLs with slash-delimited segment that repeats 3+ times, to break loops
>>>>-.*(/[^/]+)/[^/]+\1/[^/]+\1/
>>>>
>>>># accept anything else
>>>>-^(http://up.anv.bz)
>>>>+.
>>>>
>>>># skip URLs longer than 512 characters
>>>>-^.{513,}$
