Yes, I'm sure I've enabled SnowballPorterFilterFactory at both index and
query time, because search works fine
except for names and geographic locations.

I've noticed that searching for
Коврова

also shows documents that contain Коврову, Коврове

Search for Ковров, 7 results:
http://www.sova-center.ru/search/?q=%D0%BA%D0%BE%D0%B2%D1%80%D0%BE%D0%B2

Search for Коврова, 26 results:
http://www.sova-center.ru/search/?lg=1&q=%D0%BA%D0%BE%D0%B2%D1%80%D0%BE%D0%B2%D0%B0

Adding such words to stopwords.txt would be a tedious task, as there are 7
million Russian names :)
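
For reference, the protected-words workaround suggested in the quoted reply
below could be sketched roughly as follows in schema.xml. This is only a
sketch under assumptions: the field type name text_ru, the tokenizer, the
lowercase filter and the file name protwords.txt are illustrative and not
taken from the schema discussed in this thread. The `protected` attribute of
SnowballPorterFilterFactory points at a file of words that should be left
unstemmed.

```xml
<!-- Sketch only: field type name, tokenizer and file name are assumptions. -->
<fieldType name="text_ru" class="solr.TextField" positionIncrementGap="100">
  <!-- A single analyzer with no "type" attribute applies at both index and query time. -->
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Words listed in protwords.txt (one per line) are passed through unstemmed. -->
    <filter class="solr.SnowballPorterFilterFactory" language="Russian"
            protected="protwords.txt"/>
  </analyzer>
</fieldType>
```

With ковров listed in protwords.txt (in lowercase, since the stemmer sees
lowercased tokens in this chain), the base form would presumably stay ковров
and so match the stem that Коврова, Коврову and Коврове are reduced to. The
list still has to be maintained by hand, and documents need to be reindexed
after it changes.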

Kind Regards,
Oleg Burlaca



On Tue, Jul 27, 2010 at 11:35 AM, Robert Muir <rcm...@gmail.com> wrote:

> On another look, your problem is ковров itself... it's mapped to ковр
>
> a workaround might be to use the protected words functionality to
> keep ковров and any other problematic people/geo names as-is.
>
> separately, in trunk there is an alternative Russian stemmer
> (RussianLightStemFilterFactory), which might give you fewer problems on
> average, but I noticed it has this same problem with the example you gave.
>
> On Tue, Jul 27, 2010 at 4:25 AM, Robert Muir <rcm...@gmail.com> wrote:
>
> > All of your examples stem to "ковров":
> >
> >    assertAnalyzesTo(a, "Коврова Коврову Ковровом Коврове",
> >        new String[] { "ковров", "ковров", "ковров", "ковров" });
> >
> > Are you sure you enabled this at *both* index and query time?
> >
> > 2010/7/27 Oleg Burlaca <o...@burlaca.com>
> >
> >> Hello,
> >>
> >> I'm using SnowballPorterFilterFactory with language="Russian".
> >> Stemming works fine except for people's names and geographical places.
> >> Here are some examples:
> >>
> >> searching for Ковров should also find Коврова, Коврову, Ковровом, Коврове.
> >>
> >> Are there other stemming plugins for the Russian language that can handle
> >> this?
> >> If not, what are the options? A simple solution may be to use wildcard
> >> queries in Standard mode instead of the DisMaxQueryHandler:
> >> Ковров*
> >>
> >> but I'd like to avoid it.
> >>
> >> Thanks.
> >>
> >
> >
> >
> > --
> > Robert Muir
> > rcm...@gmail.com
> >
>
>
>
> --
> Robert Muir
> rcm...@gmail.com
>
