I'm not sure how useful this reply is, but hey ;)
<aol>me too!</aol>
I do a vaguely similar thing; I have to strip accents from characters
such as e-acute out of both my input data and my incoming search queries
to put them into a standard form. I do this with a custom TokenFilter
subclass. I have an analyzer that includes this filter along with some
of the standard ones (LowerCaseFilter, etc.). I run the same analyzer on
indexing and searching, which has been discussed in other posts.
My point is that I'm happy with this approach and I'd recommend you do a
similar thing, at least as a first attempt.
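To make that concrete, here is a rough sketch of the kind of normalization
I mean, written as plain Java so it runs without Lucene on the classpath.
The class and method names (TermNormalizer, stripAccents, normalizeCode)
are made up for illustration and are not Lucene API; in a real setup the
same logic would sit inside your TokenFilter subclass and be applied to
each token's text. It assumes you want a code like 21-MA-GAB reduced to
lower-case letters and digits only.

```java
import java.text.Normalizer;
import java.util.Locale;

/** Hypothetical helper for illustration only; not part of Lucene. */
final class TermNormalizer {
    private TermNormalizer() {}

    /** Removes accents, e.g. e-acute becomes a plain "e". */
    static String stripAccents(String text) {
        // NFD splits an accented letter into the base letter plus a combining mark.
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        // \p{M} matches the combining marks left behind by the decomposition.
        return decomposed.replaceAll("\\p{M}+", "");
    }

    /** Reduces a product code to lower-case letters and digits only. */
    static String normalizeCode(String code) {
        String plain = stripAccents(code).toLowerCase(Locale.ROOT);
        // Drop hyphens, underscores, spaces and any other non-alphanumeric characters.
        return plain.replaceAll("[^\\p{L}\\p{N}]+", "");
    }
}
```

With this, "21-MA-GAB", "21 MA GAB", "21-MA_GAB" and "21MAGAB" all reduce
to the same string, so no synonyms are needed as long as the same step
runs at index time and at query time. One caveat: this only works if the
whole code reaches the filter as a single token. If the tokenizer has
already split "21 MA GAB" on whitespace, you would get three separate
terms instead.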
Cheers,
Peter Pimley
Aigner, Thomas wrote:
Hello all,
I am VERY new to Lucene and we are trying out Lucene to see if
it will accomplish the vast majority of our search functions.
I have a question about a good way to index some of our product
description codes. We have description codes like 21-MA-GAB and other
punctuation. Our users need to be able to search for "21 MA GAB" or
"21-MA_GAB" or "21MAGAB". Is the best way to accomplish this to create
synonyms for the three variants whenever a code contains punctuation?
I know I can strip punctuation from the index, but what about matching
the parts when they are run together or separated by spaces?
Thanks all in advance,
Tom
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]