On Tue, Jan 10, 2012 at 5:32 PM, Tanner Postert <tanner.post...@gmail.com> wrote:

> We've had some issues with people searching for a document with the
> search term '200 movies'. The document is actually titled 'two hundred
> movies'.
>
> Do we need to add every number to our synonyms dictionary to
> accomplish this?


That is one way to deal with this, but it depends on a lot of hand
engineering of special cases.  That is good to have for the low-hanging
fruit, but it only takes you so far.  You can also automate the discovery
of such cases to a certain degree by analyzing query logs.
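
If you do go the synonyms route, the number entries need not be typed by
hand.  Here is a minimal sketch that generates them, assuming a
comma-separated "equivalent terms" line format and a range of 0 to 9999.
The function names are made up for illustration, and the exact file
format your synonym filter expects may differ.

```python
# Hypothetical helper: spell out small integers so that synonym entries
# such as "200, two hundred" can be generated instead of hand-written.
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n):
    """Return the English words for an integer in the range 0..9999."""
    if not 0 <= n < 10000:
        raise ValueError("only 0..9999 is supported in this sketch")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("" if ones == 0 else " " + ONES[ones])
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        tail = "" if rest == 0 else " " + number_to_words(rest)
        return ONES[hundreds] + " hundred" + tail
    thousands, rest = divmod(n, 1000)
    tail = "" if rest == 0 else " " + number_to_words(rest)
    return ONES[thousands] + " thousand" + tail


def synonym_lines(numbers):
    """Build one 'digits, words' line per number (assumed file format)."""
    return [f"{n}, {number_to_words(n)}" for n in numbers]


if __name__ == "__main__":
    for line in synonym_lines([200, 21, 1005]):
        print(line)
```

This only covers one spelling per number.  Variants such as "two hundred
and five" or "twenty-one" would still need extra entries or a normalizing
step of their own.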


> Is it best done at index or search time?
>

I would say that opinion is divided on this and, in the end, you probably
have to do versions of this at both times.  This is especially true if you
want to include secondary information like inferred query purpose
(obviously only available at query time) and inferred document
characteristics (best known at indexing time).  Partly, the choice of when
to do this is driven by which trade-offs you are willing to make.  For
instance, some people are constrained by index size but not by query
response time.  They would probably opt for pushing the load to the query.
Others may be bound by response time or query throughput.  They may wish
to minimize query complexity and size, which argues for doing the work at
index time.
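
To make the trade-off concrete, here is a toy sketch of both choices using
a tiny in-memory inverted index.  Everything in it (the EQUIVALENTS table,
the function names, the whitespace tokenizer) is an assumption made for
illustration and is not how any particular search engine implements
synonyms.

```python
import re
from collections import defaultdict

# Hypothetical equivalence table; a real one would be much larger.
EQUIVALENTS = [("200", "two hundred")]


def variants(text):
    """Return the lowercased text plus every rewrite implied by EQUIVALENTS."""
    text = text.lower()
    out = {text}
    for a, b in EQUIVALENTS:
        for src, dst in ((a, b), (b, a)):
            pattern = rf"\b{re.escape(src)}\b"
            if re.search(pattern, text):
                out.add(re.sub(pattern, dst, text))
    return out


def build_index(docs, expand):
    """Map each token to the ids of documents containing it.

    With expand=True the synonym variants are indexed too (larger index).
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        texts = variants(text) if expand else {text.lower()}
        for t in texts:
            for token in t.split():
                index[token].add(doc_id)
    return index


def search(index, query, expand):
    """AND the tokens within each query variant, then OR across variants.

    With expand=True the query is rewritten instead (larger query).
    """
    queries = variants(query) if expand else {query.lower()}
    hits = set()
    for q in queries:
        postings = [index.get(token, set()) for token in q.split()]
        if postings:
            hits |= set.intersection(*postings)
    return hits


docs = {1: "Two Hundred Movies", 2: "Top movies of 2011"}

# Index-time expansion: more terms in the index, plain query.
index_time = build_index(docs, expand=True)
# Query-time expansion: plain index, more clauses in the query.
query_time = build_index(docs, expand=False)

if __name__ == "__main__":
    print(search(index_time, "200 movies", expand=False))
    print(search(query_time, "200 movies", expand=True))
    print(len(index_time), len(query_time))
```

Both searches find the same document.  The difference is where the cost
lands: the index-time version stores extra terms, while the query-time
version evaluates extra query variants on every search.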
