Hi Jes,

This is the Lucene developer mailing list (java-dev), not the Java user mailing list. You should probably send this question to the java-user list instead.
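
For what it's worth, the two requirements in the quoted message (substring matches such as BCD against ABCD, and ignoring special characters so that ABC/D, ABC-D and "ABC D" all behave like ABCD) are often handled by normalizing the code and indexing its n-grams rather than running leading-wildcard queries. Below is a minimal sketch of that idea in plain Java, with no Lucene dependency. The class name ProductCodeNgrams, its method names and the length bounds 2 and 8 are all hypothetical, chosen only for illustration. Lucene does ship an NGramTokenizer that could play a similar role inside an analyzer, but it is not used here.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Lucene: illustrates normalize-then-n-gram matching.
class ProductCodeNgrams {

    // Drop everything except ASCII letters and digits and upper-case the rest,
    // so "ABC/D", "ABC-D" and "ABC D" all become "ABCD".
    static String normalize(String code) {
        return code.replaceAll("[^A-Za-z0-9]", "").toUpperCase(Locale.ROOT);
    }

    // All substrings of the normalized code whose length is in [minLen, maxLen].
    // These are the terms that would be written to the index for one product code.
    static Set<String> ngrams(String code, int minLen, int maxLen) {
        String s = normalize(code);
        Set<String> grams = new LinkedHashSet<>();
        for (int len = minLen; len <= maxLen; len++) {
            for (int i = 0; i + len <= s.length(); i++) {
                grams.add(s.substring(i, i + len));
            }
        }
        return grams;
    }

    // A query matches when its normalized form equals one of the indexed grams,
    // which turns a "like '%blah%'" scan into an exact term lookup.
    static boolean matches(String query, Set<String> indexedGrams) {
        return indexedGrams.contains(normalize(query));
    }

    public static void main(String[] args) {
        Set<String> grams = ngrams("ABC/D", 2, 8);
        System.out.println(grams);
        System.out.println(matches("bcd", grams));
        System.out.println(matches("ABC-D", grams));
    }
}
```

The trade-off is a larger index in exchange for cheaper queries. In this sketch a query shorter than the minimum length or longer than the maximum length never matches, so the bounds would have to be chosen to fit the real product codes.
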

On Thu, Jul 16, 2009 at 06:20:57PM +0530, JesL wrote:
>
> Hello,
> Are there any suggestions / best practices for using Lucene for searching
> non-linguistic text? What I mean by non-linguistic is that it's not English
> or any other language, but rather product codes. This is presenting some
> interesting challenges. Among them are the need for pretty lax wildcard
> searches. For example, ABC should match on ABCD, but so should BCD. Also,
> it needs to be agnostic to special characters. So, ABC/D should match ABCD
> as well as ABC-D or "ABC D".
>
> As I write an analyzer to handle these cases, I seem to be pretty quickly
> degrading into a "like '%blah%' search, with rules to treat all special
> characters as single-character, optional wildcards. I'm concerned that the
> performance of this will be disappointing, though.
>
> Any help would be much appreciated. Thanks!
>
> - Jes
> --
> View this message in context:
> http://www.nabble.com/Search-in-non-linguistic-text-tp24515712p24515712.html
> Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-dev-h...@lucene.apache.org

--
Anshum
--
question = ( to ) ? be : ! be;
-- Wm. Shakespeare