RE: Field Tokenization

2004-03-17 Thread Alexey Lef
You can do it using PerFieldAnalyzerWrapper. See http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/analysis/PerFieldAnalyzerWrapper.html for details. Alexey -Original Message- From: Brandon Lee [mailto:[EMAIL PROTECTED] Sent: Wednesday, March 17, 2004 3:51 PM To: Lucene Users

RE: n-gram indexing for generating spell suggestions

2004-10-18 Thread Alexey Lef
You can also store a phonetic key for the term to find sounds-like matches. I use the Double Metaphone algorithm, which appears to be English-specific. Not sure if there is something out there for Dutch. For the length, I use a relative distance cutoff (distance/length) in addition to the absolute
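
For illustration, here is a minimal sketch of how a relative cutoff (distance/length) might be combined with an absolute one when filtering spelling suggestions. The snippet above is cut off, so this assumes the absolute cutoff is a plain maximum edit distance. The class name, method name, and threshold values are hypothetical and are not taken from the original message.

```java
// Hypothetical sketch: the names and thresholds here are assumptions, not the poster's code.
class SuggestionCutoffSketch {

    // Accept a candidate only if it passes both the absolute and the relative cutoff.
    // distance:    edit distance between the misspelled word and the candidate
    // termLength:  length of the misspelled word (assumed to be the "length" in distance/length)
    static boolean accept(int distance, int termLength, int maxAbsolute, double maxRelative) {
        if (termLength <= 0) {
            return false; // nothing sensible to compare against
        }
        double relative = (double) distance / termLength;
        return distance <= maxAbsolute && relative <= maxRelative;
    }

    public static void main(String[] args) {
        // Two edits on a 4-letter word change half of it; two edits on a 10-letter word do not.
        System.out.println(accept(2, 4, 2, 0.34));
        System.out.println(accept(2, 10, 2, 0.34));
    }
}
```

The relative check matters mostly for short words, where even a small absolute distance can turn the word into something unrelated.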

RE: Spell checker

2004-10-20 Thread Alexey Lef
If you look at the FuzzyQuery code, it is based on computing Levenshtein distance between the original term and every term in the index and keeping the terms that are within the specified relative distance of the original term. This would explain why FuzzyQuery may work well for small indexes but
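
To make the cost concrete, here is a rough sketch of the approach described above: compute the Levenshtein distance from the original term to every term in a list and keep those within a relative distance. This is not Lucene's FuzzyQuery code. The class and method names are hypothetical, a plain list stands in for the index's term dictionary, and relative distance is assumed to mean distance divided by the length of the original term.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; not the FuzzyQuery implementation.
class FuzzyScanSketch {

    // Two-row dynamic-programming Levenshtein distance.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Scans every term: the work grows linearly with the number of terms.
    // Assumption: relative distance = distance / length of the original term.
    static List<String> fuzzyMatches(String original, List<String> indexTerms, double maxRelative) {
        List<String> hits = new ArrayList<>();
        if (original.isEmpty()) {
            return hits;
        }
        for (String term : indexTerms) {
            double relative = (double) levenshtein(original, term) / original.length();
            if (relative <= maxRelative) {
                hits.add(term);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("lucene", "lucent", "license", "search");
        System.out.println(fuzzyMatches("lucene", terms, 0.2));
    }
}
```

Each lookup runs one distance computation per term, so the time per query grows with the size of the term dictionary.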

Unexpected TermEnum behavior

2004-12-08 Thread Alexey Lef
My application needs to enumerate all terms for a specific field. To do that I get the TermEnum using the following code: TermEnum terms = reader.terms(new Term(fieldName, "")); I noticed that initially the TermEnum is positioned at the first term. In other words, I don't have to call

RE: Fatal error on Windows

2005-01-08 Thread Alexey Lef
If my understanding is correct, unless you are using JNI, you should never be able to crash the JVM using only Java code. We've had a lot of crash problems with Sun's JVM, especially in server mode (on Linux, not Windows). We don't have any JNI code (only the JVM itself and the database driver).