On Feb 16, 2004, at 9:50 AM, [EMAIL PROTECTED] wrote:
Can somebody explain tokenStream() to me?

You are now venturing under the covers of Lucene's API. This is where I give the sage advice to get the Lucene source code and surf around it a bit. (It helps to have a nice IDE where you can click around classes and see the object hierarchy easily)


TokenStream is used by the Analyzer to split text into terms.

TokenStream in = new WhitespaceAnalyzer().tokenStream("contents", new
StringReader(doc.getField("contents").stringValue()));

But what is the first argument (field) for tokenStream() good for? Actually I
can type whatever I want...? Don't understand the short description in the
API docs...

The field is the field name. No built-in analyzers use it, but custom analyzers could key off of it to do field-specific analysis. Look at the PerFieldAnalyzerWrapper to make per-field analysis easier than writing a custom one that keys off the field name.


Erik


--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]



Reply via email to