Hi, sounds like the WordDelimiterTokenFilter from Solr, doesn't it?
Regards,
Em

On 13.06.2011 12:06, Denis Bazhenov wrote:
> Some time ago I needed to tune our home-grown search engine, based on Lucene,
> to perform well on product searches. Product search is a search where users
> come with part of a product name and we should find the product.
>
> The problem here is that users don't provide the full model name. For
> instance, if the product model name is "Sony PRS-A9000QF", users frequently
> search for "PRS 9000", "9000QF", etc.
>
> The simple and straightforward solution to this problem is to tokenize model
> names on character-type boundaries. So for "Sony PRS-A9000QF" we will have
> five terms: "sony", "prs", "a", "9000", "qf". This solution can dramatically
> increase search sensitivity (which is not a good thing in a general search),
> but it works well in specialized indexes.
>
> So I developed such a token filter. My question is: is there any interest in
> this solution from the community, and does it make sense to contribute it
> back?
> ---
> Denis Bazhenov <dot...@gmail.com>
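
For comparison, here is a minimal sketch of a filter along the lines described above, assuming the Lucene analysis API (TokenFilter, CharTermAttribute, PositionIncrementAttribute). The class name and splitting rules are illustrative only, offsets are not adjusted, and Solr's WordDelimiterFilter covers the same ground with more configuration options (catenated parts, preserve-original, etc.):

import java.io.IOException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;

/**
 * Splits each incoming token on letter/digit boundaries and drops any other
 * characters, e.g. "PRS-A9000QF" -> "PRS", "A", "9000", "QF".
 * Simplified sketch: offsets are left untouched.
 */
public final class CharTypeBoundaryFilter extends TokenFilter {

  private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);
  private final PositionIncrementAttribute posIncAtt = addAttribute(PositionIncrementAttribute.class);
  private final Deque<String> pending = new ArrayDeque<>();

  public CharTypeBoundaryFilter(TokenStream input) {
    super(input);
  }

  @Override
  public boolean incrementToken() throws IOException {
    if (!pending.isEmpty()) {
      // Emit a buffered sub-token; consecutive positions keep phrase
      // queries such as "prs 9000" working against the split terms.
      termAtt.setEmpty().append(pending.poll());
      posIncAtt.setPositionIncrement(1);
      return true;
    }
    if (!input.incrementToken()) {
      return false;
    }
    List<String> parts = split(termAtt.toString());
    if (parts.isEmpty()) {
      return true; // e.g. a pure-punctuation token: pass it through unchanged
    }
    // Emit the first part now, queue the rest for the following calls.
    termAtt.setEmpty().append(parts.get(0));
    pending.addAll(parts.subList(1, parts.size()));
    return true;
  }

  /** Cuts a term into runs of letters and runs of digits, dropping the rest. */
  private static List<String> split(String term) {
    List<String> parts = new ArrayList<>();
    StringBuilder run = new StringBuilder();
    int prevType = -1; // 0 = letter, 1 = digit, -1 = other/none
    for (int i = 0; i < term.length(); i++) {
      char c = term.charAt(i);
      int type = Character.isLetter(c) ? 0 : Character.isDigit(c) ? 1 : -1;
      if (run.length() > 0 && type != prevType) {
        parts.add(run.toString()); // character type changed: close the run
        run.setLength(0);
      }
      if (type != -1) {
        run.append(c);
      }
      prevType = type;
    }
    if (run.length() > 0) {
      parts.add(run.toString());
    }
    return parts;
  }

  @Override
  public void reset() throws IOException {
    super.reset();
    pending.clear();
  }
}

Chained after a WhitespaceTokenizer and before a LowerCaseFilter, "Sony PRS-A9000QF" would be indexed as sony / prs / a / 9000 / qf, so queries like "PRS 9000" or "9000QF" (run through the same analysis) can match. Whether to also emit catenated forms such as "a9000qf", as WordDelimiterFilter can, is the usual recall/precision trade-off.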