[ 
https://issues.apache.org/jira/browse/SOLR-17863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18013937#comment-18013937
 ] 

Matthew Biscocho commented on SOLR-17863:
-----------------------------------------

+1, this was leading to a nasty bug in which replicas held leader election 
hostage until a timeout finally triggered on the shards.

> SolrCore's perSegmentFingerprintCache Is Not Threadsafe
> -------------------------------------------------------
>
>                 Key: SOLR-17863
>                 URL: https://issues.apache.org/jira/browse/SOLR-17863
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 9.3, 9.8.1
>            Reporter: Luke Kot-Zaniewski
>            Priority: Critical
>         Attachments: image-2025-08-14-11-29-04-739.png, 
> image-2025-08-14-11-29-42-832.png
>
>
> SOLR-16713 introduced a race condition into the perSegmentFingerprintCache as 
> it changed it from the [thread-safe guava implementation to the thread-UNSAFE 
> WeakHashMap|https://github.com/apache/solr/commit/375ac647ad507638686848c549c30e6c077dca1b#diff-3bf44f923643744a743460bb0f64301618dd7529a602d945fe2f7f193d4cdde0].
>  This went unnoticed for some time because the map previously saw very 
> little contention between threads. However, SOLR-17756 turbo-charged the 
> race by parallelizing at a much more granular level upstream of this call. 
> The net effect is that WeakHashMap.put can get stuck in an infinite loop 
> (probably due to a cycle in the underlying linked list that it modifies):
> !image-2025-08-14-11-29-04-739.png!
> !image-2025-08-14-11-29-42-832.png!
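
For illustration only, here is a minimal sketch of one way to make a 
WeakHashMap-backed cache safe for concurrent writers: wrap it with 
Collections.synchronizedMap so that every put and get takes the same lock. 
The class and method names (FingerprintCacheSketch, newThreadSafeWeakCache, 
concurrentPuts) and the Object/Long key and value types are hypothetical 
stand-ins, not Solr's actual code, and the fix that lands for SOLR-17863 may 
take a different approach.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class FingerprintCacheSketch {

    // Hypothetical stand-in for the cache. Keys remain weakly referenced,
    // but every operation is serialized on the wrapper's internal lock.
    static Map<Object, Long> newThreadSafeWeakCache() {
        return Collections.synchronizedMap(new WeakHashMap<>());
    }

    // Hammers the cache from several threads and returns the final size,
    // or -1 if the size does not match the number of keys inserted.
    static int concurrentPuts(int threads, int keysPerThread) throws InterruptedException {
        Map<Object, Long> cache = newThreadSafeWeakCache();
        // Hold strong references so the weak keys are not collected mid-test.
        List<Object> strongRefs = Collections.synchronizedList(new ArrayList<>());
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < keysPerThread; i++) {
                    Object key = new Object();
                    strongRefs.add(key);
                    cache.put(key, (long) i);
                }
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
            throw new IllegalStateException("puts did not finish in time");
        }
        int size = cache.size();
        // Reading strongRefs after cache.size() keeps the keys reachable until here.
        return strongRefs.size() == size ? size : -1;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("entries after concurrent puts: " + concurrentPuts(4, 1000));
    }
}
```

The synchronized wrapper trades some throughput for correctness, since all 
threads contend on a single lock. If that contention turns out to matter, a 
concurrent cache with weak keys (for example Guava's CacheBuilder.weakKeys()) 
is another option, at the cost of identity-based key comparison.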


