On 9/19/07, Pieter Berkel <[EMAIL PROTECTED]> wrote:
> However, I'd like to be able to analyze documents more intelligently to
> recognize phrase keywords such as "open source", "Microsoft Office",
> "Bill Gates" rather than splitting each word into separate tokens (the
> field is never used in search queries so matching is not an issue).
> I've been looking at SynonymFilterFactory as a possible solution to this
> problem but haven't been able to work out the specifics of how to
> configure it for phrase mappings.

SynonymFilter works out-of-the-box with multi-token synonyms...

  Microsoft Office => microsoft_office
  Bill Gates, William Gates => bill_gates

Just don't use a word-delimiter filter if you use underscores to join
words, since it would split the joined tokens apart again.

-Yonik
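
To illustrate the effect such phrase mappings have on a token stream, here is a minimal Python sketch. It is not Solr's implementation: the helper names `parse_rules` and `apply_rules` are hypothetical, and the lowercasing, whitespace tokenization, and longest-match-first behaviour are assumptions made for the illustration.

```python
# Illustration only: mimics the effect of explicit "a b => c" synonym rules
# on a whitespace-tokenized stream. This is NOT Solr's SynonymFilter code;
# parse_rules and apply_rules are hypothetical helpers.

def parse_rules(text):
    """Parse 'lhs1, lhs2 => rhs' lines into {tuple_of_tokens: replacement}."""
    rules = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        lhs, rhs = line.split("=>")
        for phrase in lhs.split(","):
            # Assumption: matching is case-insensitive, so keys are lowercased.
            rules[tuple(phrase.lower().split())] = rhs.strip()
    return rules


def apply_rules(tokens, rules):
    """Replace the longest matching phrase at each position, left to right."""
    longest = max((len(key) for key in rules), default=0)
    out, i = [], 0
    while i < len(tokens):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(longest, len(tokens) - i), 0, -1):
            key = tuple(t.lower() for t in tokens[i:i + n])
            if key in rules:
                out.append(rules[key])
                i += n
                break
        else:
            # No rule matched here: emit the single lowercased token.
            out.append(tokens[i].lower())
            i += 1
    return out


RULES = parse_rules("""
Microsoft Office => microsoft_office
Bill Gates, William Gates => bill_gates
""")

tokens = "William Gates founded Microsoft before Microsoft Office shipped".split()
print(apply_rules(tokens, RULES))
```

Each mapped phrase comes out as a single keyword token, while a lone "Microsoft" that is not followed by "Office" passes through unchanged.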