> I had thought that implementing reusable analyzers in solr was going
> to be cake... but either I'm missing something, or Lucene is missing
> something.
>
> Here's the way that one used to create custom analyzers:
>
> class CustomAnalyzer extends Analyzer {
>   public TokenStream tokenStream(String fieldName, Reader reader) {
>     return new LowerCaseFilter(new NGramTokenFilter(
>         new StandardTokenizer(reader)));
>   }
> }
>
>
> Now let's try to make this reusable:
>
> class CustomAnalyzer2 extends Analyzer {
>   public TokenStream tokenStream(String fieldName, Reader reader) {
>     return new LowerCaseFilter(new NGramTokenFilter(
>         new StandardTokenizer(reader)));
>   }
>
>   @Override
>   public TokenStream reusableTokenStream(String fieldName, Reader reader)
>       throws IOException {
>     TokenStream ts = getPreviousTokenStream();
>     if (ts == null) {
>       ts = tokenStream(fieldName, reader);
>       setPreviousTokenStream(ts);
>       return ts;
>     } else {
>       // uh... how do I reset a token stream?
>       return ts;
>     }
>   }
> }
>
>
> See the missing piece? Seems like TokenStream needs a reset(Reader r)
> method or something?
I'm just keeping a reference to the Tokenizer, so I can reset it with a
new reader. Though this situation is awkward, TokenStream definitely does
not need a reset(Reader).
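
To make that concrete, here is a rough sketch of the pattern being described,
assuming the Lucene 2.9-era Analyzer API (getPreviousTokenStream /
setPreviousTokenStream and Tokenizer.reset(Reader)); the SavedStreams holder
class and CustomAnalyzer3 name are my own, not from the original code:

```java
import java.io.IOException;
import java.io.Reader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.ngram.NGramTokenFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;

class CustomAnalyzer3 extends Analyzer {

  // Hypothetical holder: keeps both the Tokenizer (so it can be reset
  // with a new Reader) and the already-built filter chain wrapping it.
  private static final class SavedStreams {
    Tokenizer source;    // reset this with the new Reader on reuse
    TokenStream result;  // the full chain: LowerCase(NGram(source))
  }

  public TokenStream tokenStream(String fieldName, Reader reader) {
    return new LowerCaseFilter(new NGramTokenFilter(
        new StandardTokenizer(reader)));
  }

  @Override
  public TokenStream reusableTokenStream(String fieldName, Reader reader)
      throws IOException {
    SavedStreams streams = (SavedStreams) getPreviousTokenStream();
    if (streams == null) {
      // First use on this thread: build the chain once and cache it.
      streams = new SavedStreams();
      streams.source = new StandardTokenizer(reader);
      streams.result =
          new LowerCaseFilter(new NGramTokenFilter(streams.source));
      setPreviousTokenStream(streams);
    } else {
      // Reuse: no TokenStream.reset(Reader) needed -- the cached
      // Tokenizer is simply pointed at the new Reader.
      streams.source.reset(reader);
    }
    return streams.result;
  }
}
```

The key design point is that only the Tokenizer at the bottom of the chain
touches the Reader, so resetting it is enough; the filters above it are
reused unchanged.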
--
Kirill Zakharenko/Кирилл Захаренко ([email protected])
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785