Right, but your problem is that this is the current output:

Ковров -> Ковр
Коврову -> Ковров
Ковровом -> Ковров
Коврове -> Ковров

So, if Ковров were simply left alone (not stemmed), all your forms would match...
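
To illustrate, here is a minimal sketch in Python, not a Lucene TokenFilter. The suffix list in toy_stem is hypothetical and hand-picked so that it reproduces the four outputs above; it is not the algorithm the Russian stemmer actually uses. The pattern is the kind of regular expression suggested in the quoted message below, and I am assuming it is matched against the whole token.

```python
import re

# Toy suffix stripper, NOT the Lucene/Snowball Russian stemmer: it removes at
# most one ending from a hypothetical, hand-picked list that happens to
# reproduce the four outputs quoted above.
TOY_ENDINGS = ("ом", "ов", "у", "е", "а")

# Assumed protection pattern: a capitalized Cyrillic word ending in "ов".
PROTECTED = re.compile(r"[А-ЯЁ].*ов")


def toy_stem(word):
    """Strip the first matching ending, keeping at least three characters."""
    for ending in TOY_ENDINGS:
        if word.endswith(ending) and len(word) - len(ending) >= 3:
            return word[: -len(ending)]
    return word


def stem_unless_protected(word, pattern=PROTECTED):
    """Return the word untouched if the whole token matches the pattern."""
    if pattern.fullmatch(word):
        return word
    return toy_stem(word)


forms = ["Ковров", "Коврову", "Ковровом", "Коврове"]
unprotected = [toy_stem(w) for w in forms]
protected = [stem_unless_protected(w) for w in forms]
print(unprotected)  # the base form is over-stemmed and no longer matches
print(protected)    # all four forms reduce to the same token
```

Note that a capital-letter pattern like this only works if the tokens still carry their original case when they reach the stemmer, so it assumes stemming runs before any lowercasing.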

2010/7/27 Oleg Burlaca <o...@burlaca.com>

> Thanks Robert for all your help,
>
> The idea of protecting words that match [A-Z].* is ideal for the English
> language, although in Russian nouns are inflected: Борис, Борису, Бориса,
> Борисом
>
> I'll try the RussianLightStemFilterFactory (the article in the PDF
> mentioned it's more accurate).
>
> Once again thanks,
> Oleg Burlaca
>
> On Tue, Jul 27, 2010 at 12:07 PM, Robert Muir <rcm...@gmail.com> wrote:
>
> > 2010/7/27 Oleg Burlaca <o...@burlaca.com>
> >
> > > Actually the situation with Немцов is OK,
> > > I've just checked how Yandex works with Немцов and Немцова:
> > > http://nano.yandex.ru/project/inflect/
> > >
> > > I think there are two solutions:
> > > a) manually search for both Немцов and then Немцова
> > > b) use wildcard query: Немцов*
> > >
> >
> > Well, here is one idea of a more general solution.
> > The problem with "protected words" is you must have a complete list.
> >
> > One idea would be to add a filter that protects any words from stemming
> > that match a regular expression:
> > In English maybe someone wants to avoid any capitalized words to reduce
> > trouble: [A-Z].*
> > In your case, some pattern like [А-Я].*ов (Cyrillic А through Я) might
> > prevent problems.
> >
> >
> > > Robert, thanks for the RussianLightStemFilterFactory info,
> > > I've found this page
> > > http://www.mail-archive.com/solr-comm...@lucene.apache.org/msg06857.html
> > > that somehow describes it. Where can I read more about
> > > RussianLightStemFilterFactory?
> > >
> > >
> > Here is the link:
> >
> > http://doc.rero.ch/lm.php?url=1000,43,4,20091209094227-CA/Dolamic_Ljiljana_-_Indexing_and_Searching_Strategies_for_the_Russian_20091209.pdf
> >
> >
> > > Regards,
> > > Oleg
> > >
> > > 2010/7/27 Oleg Burlaca <o...@burlaca.com>
> > >
> > > > A similar word is Немцов.
> > > > The strange thing is that searching for "Немцова" will not find
> > > > documents containing "Немцов"
> > > >
> > > > Немцова: 14 articles
> > > > http://www.sova-center.ru/search/?lg=1&q=%D0%BD%D0%B5%D0%BC%D1%86%D0%BE%D0%B2%D0%B0
> > > >
> > > > Немцов: 74 articles
> > > > http://www.sova-center.ru/search/?lg=1&q=%D0%BD%D0%B5%D0%BC%D1%86%D0%BE%D0%B2
> >
> > --
> > Robert Muir
> > rcm...@gmail.com
> >
>



-- 
Robert Muir
rcm...@gmail.com
