[ 
https://issues.apache.org/jira/browse/LUCENENET-607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16789779#comment-16789779
 ] 

Laimonas Simutis commented on LUCENENET-607:
--------------------------------------------

Fix is here: https://github.com/apache/lucenenet/pull/224

> InvalidCastException PendingTerm cannot be cast to PendingBlock
> ---------------------------------------------------------------
>
>                 Key: LUCENENET-607
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-607
>             Project: Lucene.Net
>          Issue Type: Bug
>          Components: Lucene.Net Core
>    Affects Versions: Lucene.Net 4.8.0
>            Reporter: Khindikaynen Aleksey
>            Priority: Major
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> Here is exception call stack:
> {code:java}
> at Lucene.Net.Codecs.BlockTreeTermsWriter.TermsWriter.Finish(Int64 sumTotalTermFreq, Int64 sumDocFreq, Int32 docCount, TermsHashPerField termsHashPerField)
> at Lucene.Net.Index.FreqProxTermsWriterPerField.Flush(String fieldName, FieldsConsumer consumer, SegmentWriteState state)
> at Lucene.Net.Index.FreqProxTermsWriter.Flush(IDictionary`2 fieldsToFlush, SegmentWriteState state)
> at Lucene.Net.Index.TermsHash.Flush(IDictionary`2 fieldsToFlush, SegmentWriteState state)
> at Lucene.Net.Index.DocInverter.Flush(IDictionary`2 fieldsToFlush, SegmentWriteState state)
> at Lucene.Net.Index.DocFieldProcessor.Flush(SegmentWriteState state)
> at Lucene.Net.Index.DocumentsWriterPerThread.Flush()
> at Lucene.Net.Index.DocumentsWriter.DoFlush(DocumentsWriterPerThread flushingDWPT)
> at Lucene.Net.Index.DocumentsWriter.FlushAllThreads(IndexWriter indexWriter)
> at Lucene.Net.Index.IndexWriter.GetReader(Boolean applyAllDeletes)
> at Lucene.Net.Index.StandardDirectoryReader.DoOpenFromWriter(IndexCommit commit)
> at Lucene.Net.Search.SearcherManager.RefreshIfNeeded(IndexSearcher referenceToRefresh)
> at Lucene.Net.Search.ReferenceManager`1.DoMaybeRefresh()
> at Lucene.Net.Search.ReferenceManager`1.MaybeRefreshBlocking()
> at Lucene.Net.Search.ControlledRealTimeReopenThread`1.Run()
> {code}
> The issue is hard to reproduce and appears only when documents with the 
> same terms are added concurrently. I have not managed to write a clear test 
> that reproduces it.
> I did some research and found that the cause of the issue is duplicate 
> terms in the BytesRefHash structure. BytesRefHash uses the 
> Murmurhash3_x86_32 hashing algorithm with a random seed (see the 
> StringHelper.GOOD_FAST_HASH_SEED property). The 
> StringHelper.GOOD_FAST_HASH_SEED property is not thread-safe and can return 
> different values if called from several threads at the same moment, which 
> can result in duplicate values in BytesRefHash (the same value yields 
> different hashes because the hashes were calculated with different seeds).
> There is another issue with GOOD_FAST_HASH_SEED. DateTime.Now.Millisecond is 
> used to randomize the seed, but DateTime.Now.Millisecond can return 0, and 
> this value is treated as "uninitialized", so the second GOOD_FAST_HASH_SEED 
> call will return another value.
> The issue could be easily fixed by moving the GOOD_FAST_HASH_SEED 
> initialization to the static ctor of StringHelper. That would make it 
> thread-safe and would fix the 0-value issue.
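
For illustration, here is a minimal C# sketch of the pattern described in the report and of the proposed fix. It is not taken from the Lucene.NET source or from the linked pull request. The class names RacySeed and SafeSeed are hypothetical, and Environment.TickCount is only an assumed source of randomness.

```csharp
using System;

// Hypothetical stand-in for the lazily initialized seed described in the report.
public static class RacySeed
{
    // 0 doubles as the "not yet initialized" marker.
    private static int seed;

    public static int Value
    {
        get
        {
            // Unsynchronized check-then-set: two threads can both see 0 here
            // and each store (and return) a different value.
            if (seed == 0)
            {
                // If this evaluates to 0, the seed still looks uninitialized,
                // so a later call may compute and return a different value.
                seed = DateTime.Now.Millisecond;
            }
            return seed;
        }
    }
}

// Hypothetical sketch of the proposed fix: initialize in the static constructor.
public static class SafeSeed
{
    public static readonly int Value;

    // The runtime executes a static constructor at most once and blocks other
    // threads that touch the type until it completes, so every caller
    // observes the same value. No sentinel is needed, so 0 is a valid seed.
    static SafeSeed()
    {
        Value = Environment.TickCount; // assumed entropy source
    }
}
```

With the second form, every read of SafeSeed.Value in the process returns the same number, so equal terms always hash to the same value regardless of which thread computes the hash.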



