[ http://issues.apache.org/jira/browse/NUTCH-305?page=all ]

Stefan Neufeind updated NUTCH-305:
----------------------------------

    Attachment: suffix-urlfilter.txt

Attached is a suffix-urlfilter.txt that might be of interest to some people.
Further contributions are welcome at any time. Maybe we should ship such a list
and use the suffix filter instead of the regex filter to filter by document
extension?
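
To illustrate the difference between the two approaches, here is a minimal
sketch in Python. It is not Nutch's implementation and does not reproduce the
format or contents of the attached suffix-urlfilter.txt; the suffix list and
the function names (accept_by_suffix, accept_by_regex) are assumptions made up
for this example.

```python
import re
from urllib.parse import urlsplit

# Hypothetical suffix list for illustration only; the real attachment
# (suffix-urlfilter.txt) is not reproduced here.
EXCLUDED_SUFFIXES = (".jpeg", ".jpg", ".bmp", ".gif", ".png")

# Roughly equivalent regex rule. IGNORECASE covers both jpeg and JPEG,
# so each extension does not have to be listed twice.
EXCLUDED_REGEX = re.compile(r"\.(jpeg|jpg|bmp|gif|png)$", re.IGNORECASE)


def accept_by_suffix(url):
    """Return True if the URL path does not end in an excluded suffix."""
    # Compare against the path only, so a query string does not hide the suffix.
    path = urlsplit(url).path.lower()
    return not path.endswith(EXCLUDED_SUFFIXES)


def accept_by_regex(url):
    """Return True if the URL path does not match the exclusion regex."""
    path = urlsplit(url).path
    return EXCLUDED_REGEX.search(path) is None


if __name__ == "__main__":
    for url in ("http://example.com/index.html",
                "http://example.com/img/photo.JPEG",
                "http://example.com/pic.bmp?size=large"):
        print(url, accept_by_suffix(url), accept_by_regex(url))
```

With a plain suffix list, adding a new extension is a one-line change and there
is no regex syntax to get wrong, which is the main appeal of the suggestion.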

> Update crawl and url filter lists to exclude jpeg|JPEG|bmp|BMP
> --------------------------------------------------------------
>
>          Key: NUTCH-305
>          URL: http://issues.apache.org/jira/browse/NUTCH-305
>      Project: Nutch
>         Type: Bug

>     Versions: 0.8-dev
>     Reporter: chris finne
>  Attachments: suffix-urlfilter.txt
>


-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira



_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
