[jira] [Resolved] (LUCENENET-607) InvalidCastException PendingTerm cannot be cast to PendingBlock

Shad Storhaug (JIRA) Mon, 12 Aug 2019 22:25:10 -0700


     [ https://issues.apache.org/jira/browse/LUCENENET-607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Shad Storhaug resolved LUCENENET-607.
-------------------------------------
    Resolution: Fixed

Thanks for the PR.

{quote}There is another issue with GOOD_FAST_HASH_SEED. 
DateTime.Now.Millisecond is used to randomize the seed, but 
DateTime.Now.Millisecond could return 0 and this value is treated as 
"uninitialized" and the second GOOD_FAST_HASH_SEED call will return another 
value.{quote}

This was due to a second bug introduced while translating the code from Java. 
[{{System.currentTimeMillis()}}|https://docs.oracle.com/javase/8/docs/api/java/lang/System.html#currentTimeMillis--]
 returns the number of milliseconds elapsed since midnight, January 1, 1970 UTC, 
not the millisecond component (0 to 999) of the current time. I have replaced 
{{DateTime.Now.Millisecond}} with {{Time.CurrentTimeMilliseconds()}}, which 
relies on {{System.Diagnostics.Stopwatch.GetTimestamp()}} to generate the 
value, making it a number much higher than 999 that rarely repeats.
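
To illustrate the difference described above, here is a minimal sketch in Java (the language the code was translated from). The class name, method names, and the way the time is obtained are assumptions made for illustration; this is not the Lucene.NET implementation. It contrasts a millisecond-of-second component, analogous to {{DateTime.Now.Millisecond}}, with milliseconds since the epoch, which is what {{System.currentTimeMillis()}} returns.

```java
import java.time.Instant;
import java.time.temporal.ChronoField;

// Hypothetical demo class; the names are illustrative only.
final class SeedSourceDemo {

    // Millisecond-of-second component: always in the range 0..999,
    // analogous to DateTime.Now.Millisecond in .NET.
    static int millisecondComponent(Instant now) {
        return now.get(ChronoField.MILLI_OF_SECOND);
    }

    // Milliseconds elapsed since 1970-01-01T00:00:00Z,
    // the same quantity System.currentTimeMillis() reports.
    static long epochMillis(Instant now) {
        return now.toEpochMilli();
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        System.out.println("millisecond component: " + millisecondComponent(now));
        System.out.println("epoch milliseconds:    " + epochMillis(now));
    }
}
```

With only 1000 possible values, the component repeats often and is 0 roughly once in every thousand readings, whereas the epoch value keeps growing and is far larger than 999 for any present-day timestamp.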


> InvalidCastException PendingTerm cannot be cast to PendingBlock
> ---------------------------------------------------------------
>
>                 Key: LUCENENET-607
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-607
>             Project: Lucene.Net
>          Issue Type: Bug
>          Components: Lucene.Net Core
>    Affects Versions: Lucene.Net 4.8.0
>            Reporter: Khindikaynen Aleksey
>            Priority: Major
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> Here is the exception call stack:
> {code:java}
> at Lucene.Net.Codecs.BlockTreeTermsWriter.TermsWriter.Finish(Int64 sumTotalTermFreq, Int64 sumDocFreq, Int32 docCount, TermsHashPerField termsHashPerField)
> at Lucene.Net.Index.FreqProxTermsWriterPerField.Flush(String fieldName, FieldsConsumer consumer, SegmentWriteState state)
> at Lucene.Net.Index.FreqProxTermsWriter.Flush(IDictionary`2 fieldsToFlush, SegmentWriteState state)
> at Lucene.Net.Index.TermsHash.Flush(IDictionary`2 fieldsToFlush, SegmentWriteState state)
> at Lucene.Net.Index.DocInverter.Flush(IDictionary`2 fieldsToFlush, SegmentWriteState state)
> at Lucene.Net.Index.DocFieldProcessor.Flush(SegmentWriteState state)
> at Lucene.Net.Index.DocumentsWriterPerThread.Flush()
> at Lucene.Net.Index.DocumentsWriter.DoFlush(DocumentsWriterPerThread flushingDWPT)
> at Lucene.Net.Index.DocumentsWriter.FlushAllThreads(IndexWriter indexWriter)
> at Lucene.Net.Index.IndexWriter.GetReader(Boolean applyAllDeletes)
> at Lucene.Net.Index.StandardDirectoryReader.DoOpenFromWriter(IndexCommit commit)
> at Lucene.Net.Search.SearcherManager.RefreshIfNeeded(IndexSearcher referenceToRefresh)
> at Lucene.Net.Search.ReferenceManager`1.DoMaybeRefresh()
> at Lucene.Net.Search.ReferenceManager`1.MaybeRefreshBlocking()
> at Lucene.Net.Search.ControlledRealTimeReopenThread`1.Run()
> {code}
> The issue is hard to reproduce and appears only when adding documents 
> with the same terms concurrently. I have not managed to write a clear test 
> that reproduces it.
> I did some research and found that the cause of the issue is 
> duplicate terms in the BytesRefHash structure. BytesRefHash uses the 
> Murmurhash3_x86_32 hashing algorithm with a random seed (see the 
> StringHelper.GOOD_FAST_HASH_SEED property). The 
> StringHelper.GOOD_FAST_HASH_SEED property is not thread-safe and can return 
> different values if called from several threads at the same moment, which 
> can result in duplicate values in BytesRefHash (the same values produce 
> different hashes because the hashes were calculated with different seeds).
> There is another issue with GOOD_FAST_HASH_SEED. DateTime.Now.Millisecond is 
> used to randomize the seed, but DateTime.Now.Millisecond could return 0, and 
> this value is treated as "uninitialized", so the second GOOD_FAST_HASH_SEED 
> call will return another value.
> The issue could be easily fixed by moving the GOOD_FAST_HASH_SEED 
> initialization to the static ctor of StringHelper. That would make it 
> thread-safe and fix the 0-value issue.
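
The fix proposed in the description can be sketched as follows. This is a hedged illustration in Java, not the actual StringHelper code: {{LazySeed}}, {{SeedHolder}}, and {{SeedDemo}} are hypothetical names, and the clock is replaced by an injected supplier so the 0-value case can be shown deterministically. The lazy variant treats 0 as "not initialized yet", so a first reading of 0 is discarded on the next call; it also has no synchronization, so two threads may each store a different seed. The static initializer variant runs once, because the JVM serializes class initialization, and it needs no sentinel value.

```java
import java.util.function.IntSupplier;

// Hypothetical lazy seed: 0 doubles as the "not initialized yet" marker.
final class LazySeed {
    private final IntSupplier source;
    private int seed; // defaults to 0

    LazySeed(IntSupplier source) {
        this.source = source;
    }

    // Not thread-safe: two threads can both observe seed == 0 and each
    // store a different reading. A genuine reading of 0 is also
    // indistinguishable from "uninitialized" and is replaced on the next call.
    int get() {
        if (seed == 0) {
            seed = source.getAsInt();
        }
        return seed;
    }
}

// Hypothetical eager seed: assigned exactly once in a static initializer.
final class SeedHolder {
    static final int SEED;

    static {
        // The narrowing cast keeps the low 32 bits of the epoch milliseconds.
        SEED = (int) System.currentTimeMillis();
    }

    private SeedHolder() {
    }
}

final class SeedDemo {
    public static void main(String[] args) {
        int[] readings = {0, 7}; // the first simulated clock reading happens to be 0
        int[] next = {0};
        LazySeed lazy = new LazySeed(() -> readings[next[0]++]);
        System.out.println(lazy.get() + " then " + lazy.get()); // prints "0 then 7"
        System.out.println(SeedHolder.SEED == SeedHolder.SEED); // prints "true"
    }
}
```

Two hash computations that straddle such a seed change would place the same term in different slots, which matches the duplicate-term symptom described above.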



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
