[ 
https://issues.apache.org/jira/browse/LUCENE-3042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-3042:
----------------------------------

          Component/s: Analysis
             Priority: Critical  (was: Major)
    Affects Version/s: 4.0
                       3.2
                       2.9.4
                       3.0.3
                       3.1
        Fix Version/s: 4.0
                       3.2

Just to conclude:
This bug is not as serious as it appears (otherwise someone would have noticed 
it before), as it never happens with run-of-the-mill TokenStreams used the way 
IndexWriter uses them.
The bug only appears if you have TokenFilters and you add Attributes on the 
top-level filter later (after the TokenStream has been used for the first 
time). Using the TokenStream causes the states to be computed, so every 
Filter/Tokenizer gets its own cached state. Adding a new Attribute on the last 
filter then never invalidates the Tokenizer's cache.

This bug could affect:
- Analyzers that partly reuse TokenStreams and plug filters on top in the 
reusableTokenStream() method, reusing the partially cached token stream 
(e.g., those that always add a non-cacheable TokenFilter on top of a base 
TokenStream).
- TokenStreams that add attributes on the fly in one of their filters.
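The failure mode above can be sketched without the real Lucene classes. The following is a minimal, simplified model (MiniAttributeSource and StaleStateDemo are hypothetical names, not Lucene API): the source caches its list of attributes ("computed state") on first use, and an attribute added afterwards is missing from that cache, so clearAttributes() silently skips it.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the pitfall, NOT the real Lucene AttributeSource:
// the computed state is cached lazily and (buggy) never invalidated when
// a new attribute is added later.
class MiniAttributeSource {
    private final List<StringBuilder> attributes = new ArrayList<>();
    private List<StringBuilder> cachedState; // computed on first use

    StringBuilder addAttribute() {
        StringBuilder att = new StringBuilder();
        attributes.add(att);
        // BUG (modeled): cachedState is NOT invalidated here
        return att;
    }

    void clearAttributes() {
        if (cachedState == null) {
            cachedState = new ArrayList<>(attributes); // compute state once
        }
        for (StringBuilder att : cachedState) {
            att.setLength(0); // clears only attributes known to the cache
        }
    }
}

public class StaleStateDemo {
    public static void main(String[] args) {
        MiniAttributeSource src = new MiniAttributeSource();
        StringBuilder term = src.addAttribute();
        term.append("first");
        src.clearAttributes(); // first use: computes and caches the state

        StringBuilder late = src.addAttribute(); // added after first use
        late.append("stale");
        src.clearAttributes(); // stale cache: 'late' is not cleared
        System.out.println("term='" + term + "' late='" + late + "'");
    }
}
```

Running the sketch, 'term' is cleared but 'late' still holds "stale", which mirrors the report: an attribute added to a filter chain after the cached states exist is never reached by clearAttributes().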

We should backport this patch to the 3.x and 3.1.1 branches, and maybe even to 
the 2.9.x and 3.0.x branches (if somebody wants to patch 3.0). In general this 
is a serious issue in the new TokenStream API introduced in 2.9.


> AttributeSource can have an invalid computed state
> --------------------------------------------------
>
>                 Key: LUCENE-3042
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3042
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Analysis
>    Affects Versions: 2.9.4, 3.0.3, 3.1, 3.2, 4.0
>            Reporter: Robert Muir
>            Assignee: Uwe Schindler
>            Priority: Critical
>             Fix For: 3.2, 4.0
>
>         Attachments: LUCENE-3042.patch, LUCENE-3042.patch
>
>
> If you make a TokenStream, consume it, then reuse it and add an attribute to 
> it, the computed state is wrong.
> Thus, for example, clearAttributes() will not actually clear the newly added 
> attribute.
> So in some situations, addAttribute() does not invalidate the computed state 
> when it should.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
