Hi Steve,
thanks for your response. Totally makes sense, given that the comma
character is widely used in written number syntax (e.g. 1000 is the same
as 1,000). Thanks also for the notes re the mailing list and nabble.
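
For anyone finding this thread later, here is a rough sketch of the rule as I understand it: a comma is kept inside a token only when it sits directly between two digits, and is otherwise treated as a separator. This is a plain regex approximation in Python, not Lucene's tokenizer, and it ignores every other word-break rule. The tokenize helper and the sample address are made up for illustration.

```python
import re

# Rough stand-in for a single word-break rule, NOT Lucene's actual tokenizer:
# a comma is kept only when it sits directly between two digits.
TOKEN = re.compile(r"[A-Za-z0-9]+(?:(?<=\d),(?=\d)[A-Za-z0-9]+)*")


def tokenize(text):
    """Return lowercased tokens, keeping commas that separate two digits."""
    return [m.group(0).lower() for m in TOKEN.finditer(text)]


# Hypothetical address, invented for this illustration.
print(tokenize("Flat 1,2 Main Street, Dublin"))
```

Under this approximation the comma after "Street" is dropped, while "1,2" stays together as one token, which is the kind of result that looks surprising for address data.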
Colm.
--
I'm using Lucene to search address data, and came across an interesting case
where StandardAnalyzer appears not to remove punctuation (a comma). To
illustrate, the following code snippet uses StandardAnalyzer to analyze an
address, printing out each analyzed token.
The output of the code snippet