Jérôme Charron wrote:
[...]
build a list of file extensions to include (other ones will be excluded) in
the fetch process.
[...]
I would not like to exclude all others. Many extensions are valid for
HTML, especially for dynamically generated pages (jsp, asp, cgi, just to
name the easy ones, plus a lot of custom ones). But the idea of
automatically allowing extensions for which plugins are enabled is good
in my opinion.
Anyway, I will try to find my own list of forbidden extensions. I prepared
it from 80 million URLs: I extracted the most common extensions and went
through them manually. I will try to find it over the weekend so we
can combine it with the list discussed in this thread.
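
To make the idea concrete, here is a rough sketch of a deny-list filter
that still lets through any extension an enabled plugin can handle. It is
only an illustration in Python, not Nutch code: the function names and both
extension sets are made up for the example, and the real forbidden list
would come from the lists discussed in this thread.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical deny list; placeholder values, not the list from this thread.
FORBIDDEN_EXTENSIONS = {"gif", "jpg", "png", "zip", "exe", "pdf"}

# Hypothetical set of extensions covered by enabled parse plugins.
PLUGIN_EXTENSIONS = {"pdf"}


def url_extension(url):
    """Return the lower-cased extension of the URL path, or '' if none."""
    path = urlparse(url).path  # drops the query string and fragment
    ext = posixpath.splitext(path)[1]
    return ext[1:].lower()


def should_fetch(url, forbidden=FORBIDDEN_EXTENSIONS,
                 plugin_exts=PLUGIN_EXTENSIONS):
    """Fetch unless the extension is forbidden and no plugin handles it."""
    ext = url_extension(url)
    if ext in plugin_exts:
        # An enabled plugin can parse this type, so allow it automatically.
        return True
    # Unknown extensions (jsp, asp, cgi, custom ones) are allowed.
    return ext not in forbidden
```

With this shape, dynamic pages such as page.jsp pass because they are
simply not on the forbidden list, while a forbidden type is fetched only
when a plugin for it is enabled.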
P.
_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers