I know you can enable language detection during
index-more; however, is there a way to do this
during the crawl?

I'm interested in building an English-only index
right now. What is the theory behind that? Does anyone have
any experience?

Would it mean building a huge blacklist, ignoring TLDs
until you find a computational method, or something else? Thoughts, anyone?
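
To make the question concrete, here is a rough sketch of the two ideas above:
skipping URLs by TLD, and a crude computational check on fetched text. It is
plain Python, not Nutch code. The names SKIPPED_TLDS, ENGLISH_STOPWORDS,
url_allowed and looks_english are hypothetical, and the TLD list, word list
and threshold are arbitrary assumptions chosen only for illustration.

```python
from urllib.parse import urlparse

# Hypothetical TLDs to skip; illustrative only, not a recommended list.
SKIPPED_TLDS = {"de", "fr", "jp", "ru", "cn"}

# A few very common English words, used as a rough language signal.
ENGLISH_STOPWORDS = {
    "the", "and", "of", "to", "in", "is", "that", "for",
    "it", "with", "as", "on", "was", "are", "this",
}


def url_allowed(url):
    """Return False when the URL's host ends in a skipped TLD."""
    host = urlparse(url).hostname or ""
    tld = host.rsplit(".", 1)[-1].lower()
    return tld not in SKIPPED_TLDS


def looks_english(text, threshold=0.15):
    """Guess whether text is English from its share of common English words.

    The threshold is an assumption; a real detector would use something
    stronger, such as character n-gram profiles.
    """
    words = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    words = [w for w in words if w]
    if not words:
        return False
    hits = sum(1 for w in words if w in ENGLISH_STOPWORDS)
    return hits / len(words) >= threshold
```

The TLD check can run before a page is fetched, while the text check needs
the page content, so it could only drop a page after fetching. Neither is
reliable alone: plenty of English pages live under country TLDs, and plenty
of non-English pages live under .com.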


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
