2010/7/27 Oleg Burlaca <o...@burlaca.com>

> Actually the situation with Немцов is OK,
> I've just checked how Yandex works with Немцов and Немцова:
> http://nano.yandex.ru/project/inflect/
>
> I think there are two solutions:
> a) manually search for both Немцов and then Немцова
> b) use wildcard query: Немцов*
>
Well, here is one idea for a more general solution. The problem with
"protected words" is that you must have a complete list.

One idea would be to add a filter that protects from stemming any word
that matches a regular expression. In English, someone might want to
leave all capitalized words unstemmed to reduce trouble: [A-Z].*
In your case, a pattern like [A-Я].*ов might prevent problems.

> Robert, thanks for the RussianLightStemFilterFactory info,
> I've found this page
> http://www.mail-archive.com/solr-comm...@lucene.apache.org/msg06857.html
> that somehow describes it. Where can I read more about
> RussianLightStemFilterFactory ?
>

Here is the link:
http://doc.rero.ch/lm.php?url=1000,43,4,20091209094227-CA/Dolamic_Ljiljana_-_Indexing_and_Searching_Strategies_for_the_Russian_20091209.pdf

> Regards,
> Oleg
>
> 2010/7/27 Oleg Burlaca <o...@burlaca.com>
>
> > A similar word is Немцов.
> > The strange thing is that searching for "Немцова" will not find documents
> > containing "Немцов"
> >
> > Немцова: 14 articles
> >
> > http://www.sova-center.ru/search/?lg=1&q=%D0%BD%D0%B5%D0%BC%D1%86%D0%BE%D0%B2%D0%B0
> >
> > Немцов: 74 articles
> >
> > http://www.sova-center.ru/search/?lg=1&q=%D0%BD%D0%B5%D0%BC%D1%86%D0%BE%D0%B2
> >

-- 
Robert Muir
rcm...@gmail.com
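
P.S. The regex-protection idea from the reply above could look roughly like the sketch below. This is not Solr or Lucene code: `toy_stem` is a made-up stand-in for a real Russian stemmer, `stem_unless_protected` is a hypothetical helper name, and the sketch assumes the pattern has to match the whole token and is applied before any lowercasing.

```python
import re

# Pattern suggested in the thread: a capitalized Cyrillic word ending in "ов".
# \u0410-\u042F is the Cyrillic uppercase range (А-Я), written as escapes
# so it cannot be confused with the Latin letter "A".
PROTECT = re.compile("[\u0410-\u042F].*ов")


def toy_stem(word):
    """Toy stand-in for a real stemmer: strip one known ending, if any."""
    for suffix in ("ов", "а"):
        # Keep at least three characters of the word as the stem.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def stem_unless_protected(word, pattern=PROTECT):
    """Return the word unchanged if the whole token matches the pattern,
    otherwise hand it to the stemmer."""
    if pattern.fullmatch(word):
        return word
    return toy_stem(word)


if __name__ == "__main__":
    for token in ("Немцов", "Немцова", "столов"):
        print(token, "->", stem_unless_protected(token))
```

With this toy stemmer, the unprotected surname and its genitive form end up as different terms, which is the kind of mismatch described earlier in the thread. With the pattern in place, the surname is left alone and the genitive form is reduced to it, so both index to the same term. Whether a real stemmer behaves the same way on these words would need to be checked against the analyzer actually in use.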