Can somebody explain tokenStream() to me?
You are now venturing under the covers of Lucene's API. This is where I give the sage advice to get the Lucene source code and surf around it a bit. (It helps to have a nice IDE where you can click around classes and see the object hierarchy easily.)
TokenStream is used by the Analyzer to split text into terms.
TokenStream in = new WhitespaceAnalyzer().tokenStream("contents", new StringReader(doc.getField("contents").stringValue()));
But what is the first argument (field) of tokenStream() good for? Actually, I
can type whatever I want there...? I don't understand the short description in
the API docs...
The field argument is the field name. No built-in analyzers use it, but a custom analyzer could key off of it to do field-specific analysis. Look at PerFieldAnalyzerWrapper, which makes per-field analysis easier than writing a custom analyzer that keys off the field name.
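To make "keying off the field name" concrete, here is a minimal stand-alone sketch of the idea. It deliberately does not use any Lucene classes: FieldAwareSplitter, its tokens() method, and the "id" field rule are all hypothetical names made up for illustration. It only mirrors the shape of tokenStream(fieldName, reader), where the field name travels along with the text so the analysis can differ per field.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Stand-alone model, NOT Lucene's API: the class and method names are hypothetical.
class FieldAwareSplitter {

    // Mirrors the shape of tokenStream(fieldName, reader): the field name is
    // passed in with the text so the splitting rule can vary per field.
    static List<String> tokens(String fieldName, String text) {
        if ("id".equals(fieldName)) {
            // Hypothetical rule: an "id" field is kept as one untouched token.
            return Collections.singletonList(text);
        }
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        // Every other field is split on runs of whitespace.
        return Arrays.asList(trimmed.split("\\s+"));
    }
}
```

With this sketch, tokens("contents", "hello big world") gives three tokens, while tokens("id", "AB 12") gives the single token "AB 12". An analyzer that ignores the field name, as the built-in ones do, would treat both calls the same way, which is why any string seems to work as the first argument.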
Erik
