[ http://issues.apache.org/jira/browse/SOLR-14?page=comments#action_12377440 ]
Hoss Man commented on SOLR-14:
------------------------------

It would probably be good to make sure we have some unit tests of the existing WDF behavior prior to applying this patch, and then some tests that use this new feature, just so it's clear how it works in various situations.

As for duplicates: my initial thought was that this could be handled by the proposed Filter in SOLR-11... but then I realized yonik has a point: the common case is probably going to be no intra-word delimiters, so a short-circuit check that doesn't create two of every token would probably be better.

> Add the ability to preserve the original term when using WordDelimiterFilter
> ----------------------------------------------------------------------------
>
>          Key: SOLR-14
>          URL: http://issues.apache.org/jira/browse/SOLR-14
>      Project: Solr
>         Type: Improvement
>   Components: search
>     Reporter: Richard "Trey" Hyde
>  Attachments: TokenizerFactory.java, WordDelimiterFilter.patch
>
> When doing prefix searching, you need to hang on to the original term,
> otherwise you'll miss many matches you should be making.
> Data: ABC-12345
> WordDelimiterFilter may change this into
> ABC 12345 ABC12345
> A user may enter a search such as
> ABC\-123*
> which will fail to find a match given the above scenario.
> The attached patch will allow the use of the "preserveOriginal" option to
> WordDelimiterFilter and will analyse as
> ABC 12345 ABC12345 ABC-12345
> in which case we will get a positive match.
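For illustration, the "preserveOriginal" behavior and the short-circuit check discussed above could be sketched roughly as follows. This is a standalone simplification, not the attached patch and not Solr's actual WordDelimiterFilter code: the class WordDelimiterSketch and the method analyze are hypothetical names, and the sketch assumes that only non-alphanumeric characters count as intra-word delimiters (the real filter has more splitting rules).

```java
import java.util.ArrayList;
import java.util.List;

public class WordDelimiterSketch {

    // Hypothetical helper (not a Solr/Lucene API): returns the tokens that
    // one input token would be analysed into.
    static List<String> analyze(String token, boolean preserveOriginal) {
        List<String> out = new ArrayList<>();

        // Short-circuit check: the common case has no intra-word delimiters,
        // so emit the token once and never create a duplicate of it.
        boolean hasDelimiter = false;
        for (int i = 0; i < token.length(); i++) {
            if (!Character.isLetterOrDigit(token.charAt(i))) {
                hasDelimiter = true;
                break;
            }
        }
        if (!hasDelimiter) {
            out.add(token);
            return out;
        }

        // Split on delimiters, collecting the sub-words and their catenation.
        StringBuilder part = new StringBuilder();
        StringBuilder catenated = new StringBuilder();
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                part.append(c);
            } else if (part.length() > 0) {
                out.add(part.toString());
                catenated.append(part);
                part.setLength(0);
            }
        }
        if (part.length() > 0) {
            out.add(part.toString());
            catenated.append(part);
        }

        // Emit the catenated form only when there was more than one sub-word.
        if (out.size() > 1) {
            out.add(catenated.toString());
        }

        // With preserveOriginal, also keep the untouched term so that a
        // prefix query such as ABC\-123* can still match it.
        if (preserveOriginal) {
            out.add(token);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(analyze("ABC-12345", false)); // [ABC, 12345, ABC12345]
        System.out.println(analyze("ABC-12345", true));  // [ABC, 12345, ABC12345, ABC-12345]
        System.out.println(analyze("ABC12345", true));   // [ABC12345]
    }
}
```

In this sketch a token without delimiters is emitted exactly once even when preserveOriginal is on, which is the duplicate-avoidance point raised in the comment. Unit tests of the existing and new behavior could assert on lists like the ones printed in main.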