Hi, sounds like the WordDelimiterTokenFilter from Solr, doesn't it?
Regards,
Em

On 13.06.2011 12:06, Denis Bazhenov wrote:
> Some time ago I needed to tune our home-grown search engine, based on Lucene,
> to perform well on product searches. Product search is a search where users
> come with part of a product name and we should find the product.
>
> The problem here is that users don't provide the full model name. For
> instance, if the product model name is "Sony PRS-A9000QF", users frequently
> search for "PRS 9000", "9000QF", etc.
>
> The simple and straightforward solution to this problem is to tokenize model
> names on character-type boundaries. So for "Sony PRS-A9000QF" we will have
> five terms: "sony", "prs", "a", "9000", "qf". This solution can dramatically
> increase search sensitivity (which is not a good thing in a general search),
> but it works well in specialized indexes.
>
> So I developed such a token filter. My question is: is there any interest in
> this solution from the community, and does it make sense to contribute it
> back?
> ---
> Denis Bazhenov <dot...@gmail.com>
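
For comparison, here is a minimal sketch of a filter along the lines described above, assuming the Lucene analysis API (TokenFilter, CharTermAttribute, PositionIncrementAttribute). The class name and splitting rules are illustrative only, offsets are not adjusted, and Solr's WordDelimiterFilter covers the same ground with more configuration options (catenated parts, preserve-original, etc.):

import java.io.IOException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;

/**
 * Splits each incoming token on letter/digit boundaries and drops any other
 * characters, e.g. "PRS-A9000QF" -> "PRS", "A", "9000", "QF".
 * Simplified sketch: offsets are left untouched.
 */
public final class CharTypeBoundaryFilter extends TokenFilter {

  private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);
  private final PositionIncrementAttribute posIncAtt = addAttribute(PositionIncrementAttribute.class);
  private final Deque<String> pending = new ArrayDeque<>();

  public CharTypeBoundaryFilter(TokenStream input) {
    super(input);
  }

  @Override
  public boolean incrementToken() throws IOException {
    if (!pending.isEmpty()) {
      // Emit a buffered sub-token; consecutive positions keep phrase
      // queries such as "prs 9000" working against the split terms.
      termAtt.setEmpty().append(pending.poll());
      posIncAtt.setPositionIncrement(1);
      return true;
    }
    if (!input.incrementToken()) {
      return false;
    }
    List<String> parts = split(termAtt.toString());
    if (parts.isEmpty()) {
      return true; // e.g. a pure-punctuation token: pass it through unchanged
    }
    // Emit the first part now, queue the rest for the following calls.
    termAtt.setEmpty().append(parts.get(0));
    pending.addAll(parts.subList(1, parts.size()));
    return true;
  }

  /** Cuts a term into runs of letters and runs of digits, dropping the rest. */
  private static List<String> split(String term) {
    List<String> parts = new ArrayList<>();
    StringBuilder run = new StringBuilder();
    int prevType = -1; // 0 = letter, 1 = digit, -1 = other/none
    for (int i = 0; i < term.length(); i++) {
      char c = term.charAt(i);
      int type = Character.isLetter(c) ? 0 : Character.isDigit(c) ? 1 : -1;
      if (run.length() > 0 && type != prevType) {
        parts.add(run.toString()); // character type changed: close the run
        run.setLength(0);
      }
      if (type != -1) {
        run.append(c);
      }
      prevType = type;
    }
    if (run.length() > 0) {
      parts.add(run.toString());
    }
    return parts;
  }

  @Override
  public void reset() throws IOException {
    super.reset();
    pending.clear();
  }
}

Chained after a WhitespaceTokenizer and before a LowerCaseFilter, "Sony PRS-A9000QF" would be indexed as sony / prs / a / 9000 / qf, so queries like "PRS 9000" or "9000QF" (run through the same analysis) can match. Whether to also emit catenated forms such as "a9000qf", as WordDelimiterFilter can, is the usual recall/precision trade-off.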