You can only call reset(Reader) on a Tokenizer, not on an arbitrary TokenStream. That's why there is the SavedStreams mess in the Standard/Stop core analyzers and in every analyzer in LUCENE-1794...
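The SavedStreams idea can be sketched with simplified stand-ins (WhitespaceTokenizer, TokenStream, etc. below are toy classes written for this sketch, not the real Lucene API): keep a reference to the innermost Tokenizer alongside the fully-built filter chain, so on reuse you can point the tokenizer at the new Reader instead of rebuilding the chain.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Toy stand-in for Lucene's TokenStream: tokens are plain Strings here.
abstract class TokenStream {
    abstract String next();
}

// Toy tokenizer: the key point is that only the tokenizer has reset(Reader).
class WhitespaceTokenizer extends TokenStream {
    private Reader input;
    private final StringBuilder buf = new StringBuilder();

    WhitespaceTokenizer(Reader input) { this.input = input; }

    // Point the same tokenizer instance at fresh input.
    void reset(Reader input) { this.input = input; }

    @Override
    String next() {
        try {
            int c;
            do { c = input.read(); } while (c != -1 && Character.isWhitespace(c));
            if (c == -1) return null;
            buf.setLength(0);
            do { buf.append((char) c); c = input.read(); }
            while (c != -1 && !Character.isWhitespace(c));
            return buf.toString();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}

// Toy filter: wraps another stream, has no way to accept a new Reader itself.
class LowerCaseFilter extends TokenStream {
    private final TokenStream in;

    LowerCaseFilter(TokenStream in) { this.in = in; }

    @Override
    String next() {
        String t = in.next();
        return t == null ? null : t.toLowerCase(Locale.ROOT);
    }
}

// The SavedStreams workaround: save the source Tokenizer next to the built
// chain, so reuse is just source.reset(reader) rather than reallocation.
class ReusableAnalyzer {
    private WhitespaceTokenizer source; // saved so we can reset it later
    private TokenStream sink;           // the full filter chain

    TokenStream reusableTokenStream(Reader reader) {
        if (sink == null) {
            source = new WhitespaceTokenizer(reader);
            sink = new LowerCaseFilter(source);
        } else {
            source.reset(reader);       // reuse: no new objects allocated
        }
        return sink;
    }
}
```

The awkward part this illustrates is exactly the complaint in the thread: the filter chain's outermost object is a TokenStream with no reset(Reader), so the analyzer has to separately remember the Tokenizer buried at the bottom of the chain.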
On Mon, Aug 10, 2009 at 6:10 PM, Yonik Seeley<[email protected]> wrote:
> I had thought that implementing reusable analyzers in solr was going
> to be cake... but either I'm missing something, or Lucene is missing
> something.
>
> Here's the way that one used to create custom analyzers:
>
> class CustomAnalyzer extends Analyzer {
>   public TokenStream tokenStream(String fieldName, Reader reader) {
>     return new LowerCaseFilter(new NGramTokenFilter(new
>         StandardTokenizer(reader)));
>   }
> }
>
> Now let's try to make this reusable:
>
> class CustomAnalyzer2 extends Analyzer {
>   public TokenStream tokenStream(String fieldName, Reader reader) {
>     return new LowerCaseFilter(new NGramTokenFilter(new
>         StandardTokenizer(reader)));
>   }
>
>   @Override
>   public TokenStream reusableTokenStream(String fieldName, Reader reader)
>       throws IOException {
>     TokenStream ts = getPreviousTokenStream();
>     if (ts == null) {
>       ts = tokenStream(fieldName, reader);
>       setPreviousTokenStream(ts);
>       return ts;
>     } else {
>       // uh... how do I reset a token stream?
>       return ts;
>     }
>   }
> }
>
> See the missing piece? Seems like TokenStream needs a reset(Reader r)
> method or something?
>
> -Yonik
> http://www.lucidimagination.com
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>

-- 
Robert Muir
[email protected]
