Jérôme Charron wrote:
[...]
build a list of file extensions to include (other ones will be excluded) in
the fetch process.
[...]
I would rather not exclude all the others: many extensions are valid for HTML, especially for dynamically generated pages (jsp, asp, cgi, just to name the easy ones, plus a lot of custom ones). But the idea of automatically allowing extensions for which plugins are enabled is good in my opinion. In any case, I will try to find my own list of forbidden extensions, which I prepared from 80 million URLs: I took the most common extensions and went through the list manually. I will try to find it over the weekend so we can combine it with the list discussed in this thread.
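
To make the idea concrete, here is a rough sketch of the check I have in mind, written in Python only for brevity. It is not Nutch code: the names `should_fetch`, `url_extension`, and `DENIED_EXTENSIONS` are made up for illustration, and the deny list holds placeholder values rather than the list I mentioned above. Extensions handled by an enabled plugin are always allowed, known-bad extensions are refused, and everything else (including URLs with no extension) goes through.

```python
from urllib.parse import urlparse
import posixpath

# Placeholder deny list for illustration only; the real list would come
# from the lists discussed in this thread.
DENIED_EXTENSIONS = frozenset({"gif", "jpg", "zip", "exe"})


def url_extension(url):
    """Return the lower-cased extension of the URL path, or "" if none."""
    path = urlparse(url).path  # drops the query string and fragment
    ext = posixpath.splitext(path)[1]  # e.g. ".jsp", or "" when absent
    return ext[1:].lower()


def should_fetch(url, plugin_extensions=frozenset(), denied=DENIED_EXTENSIONS):
    """Decide whether a URL passes the extension check.

    plugin_extensions: extensions handled by enabled plugins (assumed to be
    supplied by the caller); these are always allowed.
    denied: extensions that are refused unless a plugin claims them.
    Anything else, including unknown or missing extensions, is allowed.
    """
    ext = url_extension(url)
    if ext in plugin_extensions:
        return True
    return ext not in denied
```

With this shape, a URL such as `http://example.com/page.jsp?id=1` is fetched without jsp having to appear on any include list, while enabling a plugin for an extension overrides the deny list for it.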
P.




_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
