On 9/19/07, Pieter Berkel <[EMAIL PROTECTED]> wrote:
> However, I'd like to be able to
> analyze documents more intelligently to recognize phrase keywords such as
> "open source", "Microsoft Office", "Bill Gates" rather than splitting each
> word into separate tokens (the field is never used in search queries so
> matching is not an issue).  I've been looking at SynonymFilterFactory as a
> possible solution to this problem but haven't been able to work out the
> specifics of how to configure it for phrase mappings.

SynonymFilter works out of the box with multi-token synonyms. For example, in the synonyms file:

Microsoft Office => microsoft_office
Bill Gates, William Gates => bill_gates

Just don't use a word-delimiter filter in the same analyzer if you use an
underscore to join words, since it would split microsoft_office back into
separate tokens.
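
For reference, a minimal schema.xml sketch of a field type that could apply mappings like the ones above. The field type name "text_phrases" and the file name "synonyms.txt" are assumptions for illustration, and the whitespace tokenizer is only one possible choice:

```xml
<!-- Hypothetical field type; the name "text_phrases" is an assumption. -->
<fieldType name="text_phrases" class="solr.TextField">
  <analyzer>
    <!-- Split on whitespace only, so the words of a phrase arrive as adjacent tokens. -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- synonyms.txt (assumed file name) holds the "=>" mappings.
         ignoreCase="true" lets "Microsoft Office" match regardless of case.
         expand only affects comma-separated lists that have no "=>". -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="false"/>
    <!-- No WordDelimiterFilterFactory here, so the underscore-joined tokens stay intact. -->
  </analyzer>
</fieldType>
```

With a whitespace tokenizer, punctuation stays attached to words, so a phrase followed directly by a comma or period may not match unless the text is cleaned up first.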

-Yonik
