RE: Use of hyphens in StandardAnalyzer

2010-10-24 Thread Steven A Rowe
> A good suggestion. But I'm using Lucene 3.0.2 and the constructor for a > StandardAnalyzer has Version_30 as its highest value. Do you know when 3.1 > is due? > > -Original Message- > From: Steven A Rowe [mailto:sar...@syr.edu] > Sent: 24 Oct 2010 21 31 > T

RE: Use of hyphens in StandardAnalyzer

2010-10-24 Thread Martin O'Shea
A good suggestion. But I'm using Lucene 3.0.2 and the constructor for a StandardAnalyzer has Version_30 as its highest value. -Original Message- From: Steven A Rowe [mailto:sar...@syr.edu] Sent: 24 Oct 2010 21 31 To: java-user@lucene.apache.org Subject: RE: Use of hyphe

RE: Use of hyphens in StandardAnalyzer

2010-10-24 Thread Steven A Rowe
Hi Martin, StandardTokenizer and -Analyzer have been changed, as of the upcoming version 3.1 (the next release), to support the Unicode segmentation rules in UAX#29. My (untested) guess is that your hyphenated word will be kept as a single token if you set the version to 3.1 or higher in the construc