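Hi Jes,

Good to see you here. You could try something like an n-gram analyzer: indexing each code as overlapping character n-grams turns a substring match into an ordinary term lookup. You'd have to experiment, though I'm assuming it would be helpful for you.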
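
To make the idea concrete, here is a rough sketch in plain Java. It is not Lucene's API: the class NGramSketch, its method names, and the gram sizes are all made up for illustration. It assumes special characters can simply be dropped before the n-grams are generated. In Lucene itself you would probably reach for the n-gram tokenizers/filters in the contrib analyzers (NGramTokenFilter, for example) rather than rolling your own.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, for illustration only (not part of Lucene).
final class NGramSketch {

    // Keep only letters and digits and upper-case them, so that
    // "ABC/D", "ABC-D" and "ABC D" all normalize to "ABCD".
    static String normalize(String code) {
        StringBuilder sb = new StringBuilder();
        for (char c : code.toCharArray()) {
            if (Character.isLetterOrDigit(c)) {
                sb.append(Character.toUpperCase(c));
            }
        }
        return sb.toString();
    }

    // All character n-grams of the normalized code, for n in [minGram, maxGram].
    // These are the terms that would be written to the index.
    static Set<String> ngrams(String code, int minGram, int maxGram) {
        String s = normalize(code);
        Set<String> grams = new LinkedHashSet<>();
        for (int n = minGram; n <= maxGram; n++) {
            for (int i = 0; i + n <= s.length(); i++) {
                grams.add(s.substring(i, i + n));
            }
        }
        return grams;
    }

    // A query matches when its normalized form equals one of the indexed grams,
    // i.e. an exact term lookup instead of a wildcard scan.
    static boolean matches(String query, String indexedCode, int minGram, int maxGram) {
        return ngrams(indexedCode, minGram, maxGram).contains(normalize(query));
    }
}
```

With gram sizes 2 to 4, both "ABC" and "BCD" match a code indexed as "ABC/D", because each is one of its grams. The trade-off is index size, and a query longer than the largest gram will not match as a single term, so the gram range needs to fit the length of your codes and queries.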

--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com

The facts expressed here belong to everybody, the opinions to me. The distinction is yours to draw............

On Thu, Jul 16, 2009 at 6:34 PM, JesL <jeslefco...@comcast.net> wrote:

> Hello,
> Are there any suggestions / best practices for using Lucene for searching
> non-linguistic text? What I mean by non-linguistic is that it's not English
> or any other language, but rather product codes. This is presenting some
> interesting challenges. Among them are the need for pretty lax wildcard
> searches. For example, ABC should match on ABCD, but so should BCD. Also,
> it needs to be agnostic to special characters. So, ABC/D should match ABCD
> as well as ABC-D or "ABC D".
>
> As I write an analyzer to handle these cases, I seem to be pretty quickly
> degrading into a "like '%blah%' search, with rules to treat all special
> characters as single-character, optional wildcards. I'm concerned that the
> performance of this will be disappointing, though.
>
> Any help would be much appreciated. Thanks!
>
> - Jes
> --
> View this message in context:
> http://www.nabble.com/Search-in-non-linguistic-text-tp24515936p24515936.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.