Hello,

Every once in a while, I get an error when using Lucene in a multithreaded 
scenario. It makes no difference whether a single IndexWriter is shared by 
several threads or each thread uses its own IndexWriter.
The exception chain thrown is:

Unhandled Exception: System.ArgumentException: Could not instantiate implementing class for Lucene.Net.Analysis.Tokenattributes.ICharTermAttribute
 ---> System.ArgumentException: Could not find implementing class for ICharTermAttribute
 ---> System.InvalidOperationException: Collection was modified; enumeration operation may not execute.

I could not understand what was going on, especially because it only occurred 
"sometimes". It took me a while to figure out, but I think it's a bug.

Here's the stack trace of the exception when it occurs:

    [External Code]
  > Lucene.Net.dll!Lucene.Net.Support.HashMap<Lucene.Net.Support.WeakDictionary<System.Type, System.WeakReference>.WeakKey<System.Type>, System.WeakReference>.GetEnumerator() Line 229  C#
    [External Code]
    Lucene.Net.dll!Lucene.Net.Support.WeakDictionary<System.Type, System.WeakReference>.Clean() Line 59  C#
    Lucene.Net.dll!Lucene.Net.Support.WeakDictionary<System.Type, System.WeakReference>.CleanIfNeeded() Line 71  C#
    Lucene.Net.dll!Lucene.Net.Support.WeakDictionary<System.Type, System.WeakReference>.Add(System.Type key, System.WeakReference value) Line 134  C#
    Lucene.Net.dll!Lucene.Net.Util.AttributeSource.AttributeFactory.DefaultAttributeFactory.GetClassForInterface<Lucene.Net.Analysis.Tokenattributes.ICharTermAttribute>() Line 90  C#
    Lucene.Net.dll!Lucene.Net.Util.AttributeSource.AttributeFactory.DefaultAttributeFactory.CreateAttributeInstance<Lucene.Net.Analysis.Tokenattributes.ICharTermAttribute>() Line 70  C#
    Lucene.Net.dll!Lucene.Net.Util.AttributeSource.AddAttribute<Lucene.Net.Analysis.Tokenattributes.ICharTermAttribute>() Line 350  C#
    Lucene.Net.dll!Lucene.Net.Documents.Field.StringTokenStream.InitializeInstanceFields() Line 658  C#
    Lucene.Net.dll!Lucene.Net.Documents.Field.StringTokenStream.StringTokenStream() Line 676  C#
    Lucene.Net.dll!Lucene.Net.Documents.Field.GetTokenStream(Lucene.Net.Analysis.Analyzer analyzer) Line 629  C#
    Lucene.Net.dll!Lucene.Net.Index.DocInverterPerField.ProcessFields(Lucene.Net.Index.IndexableField[] fields, int count) Line 105  C#
    Lucene.Net.dll!Lucene.Net.Index.DocFieldProcessor.ProcessDocument(Lucene.Net.Index.FieldInfos.Builder fieldInfos) Line 279  C#
    Lucene.Net.dll!Lucene.Net.Index.DocumentsWriterPerThread.UpdateDocument(System.Collections.Generic.IEnumerable<Lucene.Net.Index.IndexableField> doc, Lucene.Net.Analysis.Analyzer analyzer, Lucene.Net.Index.Term delTerm) Line 287  C#
    Lucene.Net.dll!Lucene.Net.Index.DocumentsWriter.UpdateDocument(System.Collections.Generic.IEnumerable<Lucene.Net.Index.IndexableField> doc, Lucene.Net.Analysis.Analyzer analyzer, Lucene.Net.Index.Term delTerm) Line 574  C#
    Lucene.Net.dll!Lucene.Net.Index.IndexWriter.UpdateDocument(Lucene.Net.Index.Term term, System.Collections.Generic.IEnumerable<Lucene.Net.Index.IndexableField> doc, Lucene.Net.Analysis.Analyzer analyzer) Line 1830  C#
    Lucene.Net.dll!Lucene.Net.Index.IndexWriter.AddDocument(System.Collections.Generic.IEnumerable<Lucene.Net.Index.IndexableField> doc, Lucene.Net.Analysis.Analyzer analyzer) Line 1455  C#
    Lucene.Net.dll!Lucene.Net.Index.IndexWriter.AddDocument(System.Collections.Generic.IEnumerable<Lucene.Net.Index.IndexableField> doc) Line 1436  C#

... and to wit, here are the threads just rushing in to do the same:

    Not Flagged    35428  17  Worker Thread  <No Name>  Lucene.Net.dll!Lucene.Net.Support.WeakDictionary<System.Type, System.WeakReference>.Clean  Normal
    Not Flagged    35444  11  Worker Thread  <No Name>  Lucene.Net.dll!Lucene.Net.Support.WeakDictionary<System.Type, System.WeakReference>.Clean  Normal
    Not Flagged    44124  12  Worker Thread  <No Name>  Lucene.Net.dll!Lucene.Net.Support.WeakDictionary<System.Type, System.WeakReference>.Clean  Normal
    Not Flagged  > 44140  13  Worker Thread  <No Name>  Lucene.Net.dll!Lucene.Net.Support.WeakDictionary<System.Type, System.WeakReference>.Clean  Normal
    Not Flagged    47700  14  Worker Thread  <No Name>  Lucene.Net.dll!Lucene.Net.Support.WeakDictionary<System.Type, System.WeakReference>.Clean  Normal
    Not Flagged    28168  15  Worker Thread  <No Name>  Lucene.Net.dll!Lucene.Net.Support.WeakDictionary<System.Type, System.WeakReference>.Clean  Normal
    Not Flagged    30988  16  Worker Thread  <No Name>  Lucene.Net.dll!Lucene.Net.Support.WeakDictionary<System.Type, System.WeakReference>.Clean  Normal
    Not Flagged    21828  6   Worker Thread  <No Name>  Lucene.Net.dll!Lucene.Net.Support.WeakDictionary<System.Type, System.WeakReference>.Clean  Normal

It only reproduces "sometimes" because of this little nugget of code, which 
runs Clean() only after a new generation 0 garbage collection has happened:

        private void CleanIfNeeded()
        {
            int currentColCount = GC.CollectionCount(0);
            if (currentColCount > _gcCollections)
            {
                Clean();
                _gcCollections = currentColCount;
            }
        }

If one thread runs Clean() while another thread is in the middle of its own 
Clean() on the same collection, and in doing so replaces the object the other 
thread is enumerating, you get the exception. Always.
To make it reproduce reliably, create a bunch of threads like this and remove 
the test "if (currentColCount > _gcCollections)" so that Clean() is always 
executed. You'll get the exception every time.

I will not post the correction, but there's a simple workaround: just make sure 
the static initializers are performed in a single thread.
I.e. before creating your threads, do something like this:

new global::Lucene.Net.Documents.TextField("dummy", "dummyvalue", 
global::Lucene.Net.Documents.Field.Store.NO).GetTokenStream(analyzer);

Here "analyzer" is an instance of any Analyzer class; it doesn't matter which 
one. The call itself is meaningless, but it has the side effect of 
initializing the static fields while only one thread is running.


Vincent



