Hi Geet,
I suggest you do this kind of transformation at query time only. Don't
interfere with the index. This way is more flexible: you can disable or enable
it on the fly and change your list without re-indexing.
Just an imaginary example: When user passes String as International
Businessma
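
A minimal sketch of the query-time idea above, in plain Java with no Lucene
dependency. QueryPhraseRewriter and its methods are hypothetical names, not
Lucene API. The sketch assumes the rewritten string is later handed to a query
parser that treats double-quoted text as a phrase, as Lucene's classic query
parser does, so the index itself is left untouched.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Lucene): rewrites known aliases in the raw
// query string into a quoted canonical phrase before the query is parsed.
class QueryPhraseRewriter {
    // normalized alias -> canonical phrase; can be changed at runtime
    private final Map<String, String> canonicalByAlias = new HashMap<>();
    private boolean enabled = true;

    void addAlias(String alias, String canonicalPhrase) {
        canonicalByAlias.put(normalize(alias), canonicalPhrase);
    }

    void setEnabled(boolean enabled) {
        this.enabled = enabled;
    }

    // Note: this sketch does not handle input that is already quoted.
    String rewrite(String query) {
        if (!enabled || canonicalByAlias.isEmpty()) {
            return query;
        }
        // Longest alias first, so the alternation prefers the longest phrase.
        List<String> aliases = new ArrayList<>(canonicalByAlias.keySet());
        aliases.sort(Comparator.comparingInt(String::length).reversed());
        StringBuilder regex = new StringBuilder("\\b(?:");
        for (int i = 0; i < aliases.size(); i++) {
            if (i > 0) {
                regex.append('|');
            }
            regex.append(toRegex(aliases.get(i)));
        }
        regex.append(")\\b");

        // Single pass over the query, so replaced text is never matched again.
        Matcher m = Pattern.compile(regex.toString(), Pattern.CASE_INSENSITIVE)
                .matcher(query);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String canonical = canonicalByAlias.get(normalize(m.group()));
            m.appendReplacement(out, Matcher.quoteReplacement("\"" + canonical + "\""));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Quote each word literally and allow any run of whitespace between words.
    private static String toRegex(String normalizedAlias) {
        StringBuilder sb = new StringBuilder();
        for (String word : normalizedAlias.split(" ")) {
            if (sb.length() > 0) {
                sb.append("\\s+");
            }
            sb.append(Pattern.quote(word));
        }
        return sb.toString();
    }

    private static String normalize(String s) {
        return s.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
    }
}
```

Because the alias list lives outside the index, it can be edited or switched
off with setEnabled(false) without re-indexing anything.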
If you already know the set of phrases you need to detect then you can
use Lucene's SynonymFilter to spot them and insert a new token.
Mike McCandless
http://blog.mikemccandless.com
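
To illustrate what "spot them and insert a new token" can look like, here is a
plain-Java sketch of greedy longest-match phrase spotting over a token list.
It is not Lucene's SynonymFilter and does not use its API; PhraseSpotter,
Token and positionIncrement are hypothetical names that only loosely mirror
how a token filter can stack an extra token on an existing position.
Splitting on whitespace and lowercasing stand in for a real analyzer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration (not Lucene API): keeps every original word and
// inserts one extra token where a known phrase starts.
class PhraseSpotter {
    // positionIncrement 0 means "same position as the previous token".
    static final class Token {
        final String text;
        final int positionIncrement;

        Token(String text, int positionIncrement) {
            this.text = text;
            this.positionIncrement = positionIncrement;
        }

        @Override
        public String toString() {
            return text + "(+" + positionIncrement + ")";
        }
    }

    private final Map<List<String>, String> phrases = new HashMap<>();
    private int maxWords = 0;

    void add(String phrase, String insertedToken) {
        List<String> words =
                Arrays.asList(phrase.trim().toLowerCase(Locale.ROOT).split("\\s+"));
        phrases.put(words, insertedToken);
        maxWords = Math.max(maxWords, words.size());
    }

    List<Token> analyze(String text) {
        List<Token> out = new ArrayList<>();
        if (text.trim().isEmpty()) {
            return out;
        }
        List<String> words =
                Arrays.asList(text.trim().toLowerCase(Locale.ROOT).split("\\s+"));
        int nextMatchStart = 0; // do not start a new match inside a matched phrase
        for (int i = 0; i < words.size(); i++) {
            out.add(new Token(words.get(i), 1));
            if (i < nextMatchStart) {
                continue;
            }
            // Try the longest candidate phrase starting at word i first.
            for (int len = Math.min(maxWords, words.size() - i); len >= 1; len--) {
                String inserted = phrases.get(words.subList(i, i + len));
                if (inserted != null) {
                    out.add(new Token(inserted, 0)); // stacked on the phrase start
                    nextMatchStart = i + len;
                    break;
                }
            }
        }
        return out;
    }
}
```

With the phrase "international business machine" mapped to the token "IBM",
analyzing "International Business Machine logo" keeps the four original words
and adds "IBM" at the position of "international".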
On Thu, Feb 20, 2014 at 7:21 AM, Benson Margulies wrote:
It sounds like you've been asked to implement Named Entity Recognition.
OpenNLP has some capability here. There are also, um, commercial
alternatives.
On Thu, Feb 20, 2014 at 6:24 AM, Yann-Erwan Perio wrote:
On Thu, Feb 20, 2014 at 10:46 AM, Geet Gangwar wrote:
Hi,
> My requirement is it should have capabilities to match multiple words as
> one token. for example. When user passes String as International Business
> machine logo or IBM logo it should return International Business Machine as
> one tok
Hi,
I have a requirement to write a custom tokenizer using the Lucene framework.
It should be able to match multiple words as one token. For example, when the
user passes the string "International Business machine logo" or "IBM logo",
it should return International Business Machine