RE: Case where StandardAnalyzer doesn't remove punctuation

2012-03-26 Thread colm.mchugh
Hi Steve, thanks for your response. Totally makes sense, given that the comma character is widely used in written number syntax (e.g. 1000 is the same as 1,000). Thanks also for the notes re the mailing list and Nabble. Colm.
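
To illustrate the point about commas in numbers, here is a minimal plain-Java sketch. It is not Lucene's code and does not use StandardAnalyzer. It is a simplified stand-in for the kind of word-break rule that treats a comma directly between two digits as part of a number. The class name `CommaTokenSketch`, the `tokenize` method, and the sample addresses are all hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified tokenizer: NOT Lucene's implementation.
final class CommaTokenSketch {

    // Splits on anything that is not a letter or digit, except that a comma
    // sitting directly between two digits is kept inside the token.
    // Tokens are lowercased.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean commaBetweenDigits = c == ','
                    && i > 0
                    && i + 1 < text.length()
                    && Character.isDigit(text.charAt(i - 1))
                    && Character.isDigit(text.charAt(i + 1));
            if (Character.isLetterOrDigit(c) || commaBetweenDigits) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // Any other character ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // One token per line, so commas inside tokens are easy to see.
        for (String token : tokenize("Apt 3,14 Main St, Dublin")) {
            System.out.println(token);
        }
    }
}
```

Under this rule the comma in "3,14" stays inside the token because it has a digit on each side, while the comma after "St" is dropped. For address data, where a comma between two numbers usually separates fields, one possible workaround is to replace such commas with spaces before analysis.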

Case where StandardAnalyzer doesn't remove punctuation

2012-03-23 Thread colm.mchugh
I'm using Lucene to search address data, and came across an interesting case where StandardAnalyzer appears not to remove punctuation (a comma). To illustrate, the following code snippet uses StandardAnalyzer to analyze an address, printing out each analyzed token. The output of the code snippet