Hi,
Is there any API in the KStem filter for configuring its lexicons?
As far as I understand, the original code loads its lexicons from files at
startup (see http://lexicalresearch.com/kstem-doc.txt). The author (Robert
Krovetz) names the possibility to modify lexicons among
Maybe I should show some examples where I think custom configuration would be
useful. Here are two:
1) As of now, KStem conflates both connector and connected to the same
term, connect.
2) Conversely, it does not conflate transaction and transactions to
the same
On Mon, Jun 20, 2011 at 7:19 AM, Lukáš Vlček lukas.vl...@gmail.com wrote:
With an option to modify the internal lexicons, I would be able to adapt
KStem to work better for specific text corpora.
What do you think?
Please use StemmerOverrideFilter for this! It works with all stemmers,
Hi Robert,
This sounds interesting; I will look at it in more detail.
However, I do not think this is really a general solution. If I understand
StemmerOverrideFilter correctly (from a quick glance), it relies on the fact
that you *know* the exact term (the key in the map) in advance. In other words,
if
On Mon, Jun 20, 2011 at 8:23 AM, Lukáš Vlček lukas.vl...@gmail.com wrote:
Hi Robert,
I think the difference between KStem and other stemmers (at least those that
I am aware of, like Snowball or Porter) is that KStem is expected to produce
real, valid words, so other filtering (for example, synonym expansion) can be
applied to the tokens after stemming more easily.
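
To make the exact-term override idea discussed earlier in the thread concrete, here is a minimal sketch in plain Java. It is not Lucene's StemmerOverrideFilter API: the class OverrideStemmerSketch and the method toyStem are hypothetical names invented for illustration, and toyStem is a deliberately crude stand-in for a real stemmer, not KStem. The sketch only assumes the general behavior described above, namely that an override map is keyed by exact terms and consulted before the stemmer runs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Hypothetical illustration only; this is not Lucene's StemmerOverrideFilter. */
final class OverrideStemmerSketch {
    private final Map<String, String> overrides;  // exact term -> forced stem
    private final UnaryOperator<String> stemmer;  // fallback stemmer

    OverrideStemmerSketch(Map<String, String> overrides, UnaryOperator<String> stemmer) {
        this.overrides = new HashMap<>(overrides);
        this.stemmer = stemmer;
    }

    /** Returns the override for an exact term match, otherwise the stemmer's output. */
    String stem(String term) {
        String forced = overrides.get(term);
        return forced != null ? forced : stemmer.apply(term);
    }

    /** Toy stand-in for a real stemmer (NOT KStem): strips a trailing "ed". */
    static String toyStem(String term) {
        return term.endsWith("ed") && term.length() > 4
                ? term.substring(0, term.length() - 2)
                : term;
    }

    public static void main(String[] args) {
        Map<String, String> overrides = new HashMap<>();
        overrides.put("transactions", "transaction"); // force a conflation
        overrides.put("connector", "connector");      // protect a term from stemming

        OverrideStemmerSketch s =
                new OverrideStemmerSketch(overrides, OverrideStemmerSketch::toyStem);

        // "transactional" is not a key in the map, so it falls through to the stemmer.
        for (String t : new String[] {"transactions", "connector", "connected", "transactional"}) {
            System.out.println(t + " -> " + s.stem(t));
        }
    }
}
```

Only terms listed as keys are affected; any variant that was not anticipated falls through to the underlying stemmer unchanged, which is the limitation raised above about having to know the exact terms in advance.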