[ 
https://issues.apache.org/jira/browse/LUCENE-2943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13000511#comment-13000511
 ] 

Uwe Schindler edited comment on LUCENE-2943 at 2/28/11 9:07 PM:
----------------------------------------------------------------

I changed my mind a little bit:

The cloning of the Collator should be done in the Analyzer, not in the Filter. 
The same applies to the AttributeImpl: the cloning should not be done in the 
ctor. The problem is not that the TokenStream or the Attribute instance may 
reuse the attribute in different threads; the problem is that the factory class 
(the Analyzer) reuses the Collator in different threads when it produces 
multiple tokenstreams, or the AF (AttributeFactory) when it produces multiple 
attributes.

This is a slight difference, because the following code is always safe:
new CollationFilter(Collator.getInstance(lang)); cloning there would be wrong.

The reason for the whole thing: TokenStream and Attribute instances themselves 
are single-threaded only, but the factory and the analyzer are not.
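
A minimal sketch of that split, not the actual patch: the class names 
SketchAnalyzer and SketchFilter are hypothetical, and the JDK's 
java.text.Collator stands in for the ICU one so that the example runs without 
extra dependencies. The shared factory clones its collator for every stream it 
hands out, while the single-threaded filter uses whatever instance it is given.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

public class CollatorCloningSketch {

    /** Hypothetical stand-in for the filter: single-threaded, never clones. */
    static final class SketchFilter {
        final Collator collator;

        SketchFilter(Collator collator) {
            // No clone here: the caller may already pass a private instance,
            // as in new SketchFilter(Collator.getInstance(locale)).
            this.collator = collator;
        }

        byte[] key(String term) {
            return collator.getCollationKey(term).toByteArray();
        }
    }

    /** Hypothetical stand-in for the analyzer: shared between threads. */
    static final class SketchAnalyzer {
        private final Collator prototype;

        SketchAnalyzer(Collator collator) {
            this.prototype = collator;
        }

        SketchFilter newFilter() {
            // The factory is the shared object, so the clone happens here:
            // every stream it produces gets its own collator.
            return new SketchFilter((Collator) prototype.clone());
        }
    }

    public static void main(String[] args) {
        SketchAnalyzer analyzer =
            new SketchAnalyzer(Collator.getInstance(Locale.ENGLISH));
        SketchFilter a = analyzer.newFilter();
        SketchFilter b = analyzer.newFilter();
        // Distinct collator instances that still produce identical keys.
        System.out.println(a.collator != b.collator);
        System.out.println(Arrays.equals(a.key("lucene"), b.key("lucene")));
    }
}
```

With this layout a caller who builds a filter directly around a fresh collator 
pays for no extra clone, and only the object that is actually shared between 
threads has to care about thread safety.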

      was (Author: thetaphi):
    I changed my mind a little bit:

The cloning of the Filter should be done in the Analyzer not in the Filter. The 
same applies to the AttributeImpl, the cloning should be done in the ctor. The 
problem is not that the TokenStream or the Attribute instance may reuse the 
attribute in different threads, the problem is that the factory class (the 
Analyzer) does reuse the Collator in different threads when it produces 
multiple tokenstreams or the AF multiple attributes.

This is a slight difference, because the following code is always safe:
new CollationFilter(Collator.newInstance(lang)), cloning would be wrong.

The reason for the whole thing: TokenStream and Attribute instances themselves 
are single-threaded only, but the factory and the analyzer are not.
  
> ICU collator thread-safety issues
> ---------------------------------
>
>                 Key: LUCENE-2943
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2943
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Analysis
>            Reporter: Robert Muir
>             Fix For: 3.1, 4.0
>
>         Attachments: LUCENE-2943.patch
>
>
> The ICU Collators (unlike the JDK ones) aren't thread-safe: 
> http://userguide.icu-project.org/collation/architecture
> This is a little non-obvious, since it's not mentioned in the javadocs and 
> it's not clear whether the docs apply only to the C code, but I looked at 
> the source and there is all kinds of internal state.
> So in my opinion, we should clone the ICU collators (which are passed in from 
> the outside) when creating a new TokenStream/AttributeImpl to prevent 
> problems. This shouldn't be a big deal, since everything uses 
> reusableTokenStream anyway.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
