I am running a Nutch search with an accented word, and documents containing that word show up correctly when I click the search button. However, if I search for the same word without the accent, those pages no longer show up; in fact, I get no results at all. Does anybody know how I can solve this? Most users do not take the time to type accented letters.
Sounds like you'll need to normalize the words (strip accents) at index time. There was a discussion on this topic a few weeks ago, where I'd suggested using ICU to convert words into primary sort keys, which ignore accent and case differences. Of course, you'd have to apply the same normalization on the query side.
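To make the normalization step concrete, here is a minimal sketch. It does not use ICU sort keys; it swaps in the JDK's java.text.Normalizer to decompose each term and drop the combining marks, which gives a similar accent-insensitive form for Latin-script text. The class name AccentFolder and the method fold are hypothetical, not part of Nutch or Lucene, and the same method would have to be applied to both indexed terms and query terms.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper, not a Nutch or Lucene class.
public final class AccentFolder {
    private AccentFolder() {}

    /** Lower-cases a term and removes its combining diacritical marks. */
    public static String fold(String term) {
        // NFD splits an accented letter into a base letter plus combining marks.
        String decomposed = Normalizer.normalize(term, Normalizer.Form.NFD);
        // Category M (Mark) matches the combining marks left by the decomposition.
        String stripped = decomposed.replaceAll("\\p{M}+", "");
        return stripped.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file ASCII-only.
        String accented = "\u00C9l\u00E9phant";
        System.out.println(fold(accented));
        System.out.println(fold(accented).equals(fold("elephant")));
    }
}
```

Letters that have no canonical decomposition (for example "ø" or "ß") pass through this sketch unchanged, which is one reason a collation-based approach such as ICU's may be preferable.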
One downside is that if the person does take the time to enter accents, their search won't be narrowed down. So you could also index the text with accents preserved (but still lower-cased) in a second field, and then if the query word contains accents, search on this alternative, more precise field.
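The two-field idea can be sketched as follows. This is only an illustration under stated assumptions: two in-memory maps stand in for the two index fields, and the class DualFieldIndex and its methods are hypothetical, not Nutch or Lucene APIs. Accent stripping again uses java.text.Normalizer rather than ICU.

```java
import java.text.Normalizer;
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Pattern;

// Hypothetical in-memory stand-in for an index with two fields.
public final class DualFieldIndex {
    private static final Pattern MARKS = Pattern.compile("\\p{M}+");

    // Field 1: accents stripped. Field 2: accents kept. Both are lower-cased.
    private final Map<String, Set<Integer>> folded = new HashMap<>();
    private final Map<String, Set<Integer>> exact = new HashMap<>();

    private static String decompose(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    private static String lower(String s) {
        // NFC gives accented terms one canonical spelling before lower-casing.
        return Normalizer.normalize(s, Normalizer.Form.NFC).toLowerCase(Locale.ROOT);
    }

    private static String fold(String s) {
        return MARKS.matcher(decompose(s)).replaceAll("").toLowerCase(Locale.ROOT);
    }

    private static boolean hasAccents(String s) {
        return MARKS.matcher(decompose(s)).find();
    }

    /** Indexes every whitespace-separated token of a document in both fields. */
    public void add(int docId, String text) {
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            exact.computeIfAbsent(lower(token), k -> new TreeSet<>()).add(docId);
            folded.computeIfAbsent(fold(token), k -> new TreeSet<>()).add(docId);
        }
    }

    /** Uses the exact field when the query word has accents, else the folded field. */
    public Set<Integer> search(String word) {
        if (hasAccents(word)) {
            return exact.getOrDefault(lower(word), Collections.emptySet());
        }
        return folded.getOrDefault(fold(word), Collections.emptySet());
    }
}
```

With one document containing the accented spelling of a word and another containing the plain spelling, an unaccented query matches both documents, while an accented query matches only the first.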
-- Ken

Ken Krugler
TransPac Software, Inc.
<http://www.transpac.com>
+1 530-470-9200

Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general
