Hello,

Sorry, this is probably in the documentation somewhere, but I couldn't find it.

How can I index and search accented words without their accents? For example, "Portégé" (a Toshiba laptop model) would be indexed as "portege", and a search for "portégé" would be equivalent to a search for "portege" and would find any of "Portégé", "Portegé", "portége", "portege", etc. This is how Google works; maybe Nutch does the same by default?

Currently, by default (0.7.1), "Portégé" is indexed as "portégé" and is found only by a search for "portégé" or "Portégé" (but not "portege").

This is all the more useful considering that users in the US do not have easy access to accented letters on their keyboards...

Thanks,
Frank.

_______________________________________________
Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general
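To make the requested behaviour concrete, here is a minimal sketch in Python of the kind of accent folding described above. It is not Nutch or Lucene code, and the function name `fold_accents` is made up for illustration; it only shows the normalization that would have to be applied to both the indexed terms and the query terms.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Return text lowercased with combining accent marks removed.

    Hypothetical helper for illustration only; not part of Nutch.
    """
    # NFD decomposition splits "é" into "e" plus a combining acute accent.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks (Unicode category "Mn") and keep the base letters.
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    return stripped.lower()


# The spellings from the question should all collapse to a single term.
variants = ["Portégé", "Portegé", "portége", "portege"]
folded = {fold_accents(v) for v in variants}
```

If the same folding were applied at index time and at query time, all four spellings would map to one term, so a search for any of them would match documents containing any other.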
