Hi Steve,
thanks for your response. That makes sense, given that the comma
character is widely used in written numbers (e.g. 1000 is the same
as 1,000). Thanks also for the notes re the mailing list and Nabble.
Colm.
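
For anyone reading this thread later, here is a rough sketch of the behaviour discussed above: a comma is kept when it sits directly between two digits and is treated as a separator everywhere else. This is not Lucene code and does not call StandardAnalyzer. The class name CommaTokenSketch, the tokenize helper, and the sample address are all made up for illustration, and the rule is a simplification of what a Unicode-aware word tokenizer does.

```java
import java.util.ArrayList;
import java.util.List;

public class CommaTokenSketch {

    // Hypothetical helper (not part of Lucene): splits on anything that is
    // not a letter or digit, except a comma directly between two digits.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean commaBetweenDigits = c == ','
                    && i > 0 && i + 1 < text.length()
                    && Character.isDigit(text.charAt(i - 1))
                    && Character.isDigit(text.charAt(i + 1));
            if (Character.isLetterOrDigit(c) || commaBetweenDigits) {
                // Still inside a token: keep accumulating characters.
                current.append(c);
            } else if (current.length() > 0) {
                // Separator reached: emit the token collected so far.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Made-up address: the comma in "5,6" survives, the one after "St" does not.
        for (String token : tokenize("Unit 5,6 Main St, Dublin")) {
            System.out.println(token);
        }
    }
}
```

For address data, one possible workaround (an assumption on my part, not something suggested in this thread) is to replace commas with spaces before the text is analyzed, so that "5,6" is indexed as two separate tokens.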
I'm using Lucene to search address data, and came across an interesting case
where StandardAnalyzer appears not to remove punctuation (a comma). To
illustrate, the following code snippet uses StandardAnalyzer to analyze an
address, printing out each analyzed token.
The output of the code snippet
v3.5.0 StandardTokenizer, since it uses Unicode
6.0.0).
Steve
-----Original Message-----
From: colm.mchugh [mailto:colm.mch...@mapflow.com]
Sent: Thursday, March 22, 2012 9:23 AM
To: dev@lucene.apache.org
Subject: Case where StandardAnalyzer doesn't remove punctuation
I'm using Lucene to search