[
https://issues.apache.org/jira/browse/SOLR-17863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Luke Kot-Zaniewski updated SOLR-17863:
--------------------------------------
Description:
SOLR-16713 introduced a race condition into the perSegmentFingerprintCache when
it changed the cache from the [thread-safe Guava implementation to the thread-UNSAFE
WeakHashMap|https://github.com/apache/solr/commit/375ac647ad507638686848c549c30e6c077dca1b#diff-3bf44f923643744a743460bb0f64301618dd7529a602d945fe2f7f193d4cdde0].
This went unnoticed for some time because the map previously saw very little
contention between threads. However, SOLR-17756 turbo-charged the race by
parallelizing at a much more granular level upstream of this call. The net
effect is that WeakHashMap.put can get stuck in an infinite loop (probably due
to a cycle in the underlying linked list that it modifies):
!image-2025-08-14-11-29-04-739.png!
!image-2025-08-14-11-29-42-832.png!
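For illustration only, here is a minimal sketch of one way to make a weak-keyed map safe for concurrent writers using only the JDK. It is not the change made in Solr: the class name FingerprintCacheSketch and the methods newThreadSafeWeakCache and runConcurrentPuts are hypothetical, and plain Object keys stand in for whatever the real cache keys on. It wraps the WeakHashMap in Collections.synchronizedMap so that every put takes the same lock, then hammers the map from several threads.

```java
import java.lang.ref.Reference;
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class FingerprintCacheSketch {

  // Hypothetical stand-in for the per-segment cache: keys are held weakly, so an
  // entry disappears once its key is no longer referenced elsewhere.
  // Collections.synchronizedMap serializes every access on a single lock.
  static <K, V> Map<K, V> newThreadSafeWeakCache() {
    return Collections.synchronizedMap(new WeakHashMap<K, V>());
  }

  // Writes the same keys from several threads and returns the final entry count.
  static int runConcurrentPuts(int threads, int keyCount) {
    Map<Object, Integer> cache = newThreadSafeWeakCache();
    // Hold strong references to the keys so the GC cannot clear entries mid-run.
    Object[] keys = new Object[keyCount];
    for (int i = 0; i < keyCount; i++) {
      keys[i] = new Object();
    }
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    for (int t = 0; t < threads; t++) {
      pool.submit(() -> {
        for (int i = 0; i < keyCount; i++) {
          cache.put(keys[i], i);
        }
      });
    }
    pool.shutdown();
    try {
      if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
        pool.shutdownNow();
        throw new IllegalStateException("writers did not finish; possible hang");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for writers", e);
    }
    int size = cache.size();
    // Keep the keys reachable until after the size has been read.
    Reference.reachabilityFence(keys);
    return size;
  }

  public static void main(String[] args) {
    System.out.println("entries: " + runConcurrentPuts(8, 10_000));
  }
}
```

The single lock is the simplest option but serializes all readers and writers, so under the finer-grained parallelism described above it could itself become a point of contention. A concurrent cache with weak keys, like the Guava one used before SOLR-16713, is the other obvious direction.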
was:
SOLR-16713 introduced a race condition into the perSegmentFingerprintCache as
it changed it from the [thread-safe guava implementation to the thread-UNSAFE
WeakHashMap|https://github.com/apache/solr/commit/375ac647ad507638686848c549c30e6c077dca1b#diff-3bf44f923643744a743460bb0f64301618dd7529a602d945fe2f7f193d4cdde0].
This flew under the radar for some time because this map previously had very
little contention between threads. However after SOLR-17756 we turbo-charged
the race since we know parallelize at a much more granular level upstream of
this call. The net effect is that WeakHashMap:put can get stuck in an
infinite-loop (probably due to a cycle in the underlying linked-list that it
modifies):
!image-2025-08-14-11-29-04-739.png!
!image-2025-08-14-11-29-42-832.png!
> SolrCore's perSegmentFingerprintCache Is Not Threadsafe
> -------------------------------------------------------
>
> Key: SOLR-17863
> URL: https://issues.apache.org/jira/browse/SOLR-17863
> Project: Solr
> Issue Type: Bug
> Affects Versions: 9.3, 9.8.1
> Reporter: Luke Kot-Zaniewski
> Priority: Critical
> Attachments: image-2025-08-14-11-29-04-739.png,
> image-2025-08-14-11-29-42-832.png
>