[
https://issues.apache.org/jira/browse/SOLR-17200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17825732#comment-17825732
]
Houston Putman commented on SOLR-17200:
---------------------------------------
I like your suggestions, [~hossman]. If you have a patch/PR, I'd be happy
to review it.
> "False Positive" Race conditions using "/health?requireHealthyCores=true"
> near startup
> --------------------------------------------------------------------------------------
>
> Key: SOLR-17200
> URL: https://issues.apache.org/jira/browse/SOLR-17200
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Chris M. Hostetter
> Priority: Major
>
> There seem to be at least two possible thread race conditions that can cause
> {{/health?requireHealthyCores=true}} to return a false positive while
> {{CoreContainer}} is in the process of starting up.
> # If the request comes in _after_ {{CoreContainer}} has initialized
> {{healthCheckHandler}} but _before_ initializing & running the
> {{coreLoadExecutor}}
> # A more complex situation where the request comes in _while_
> {{coreLoadExecutor}} is loading cores, and all of the cores that have
> _finished_ initialization are "active" in SolrCloud, but other SolrCores
> remain to be initialized (and may need recovery)
> In both cases, the root of the issue is that {{requireHealthyCores=true}}
> works by checking...
> {code:java}
> Collection<CloudDescriptor> coreDescriptors =
>     coreContainer.getCores().stream()
>         .map(c -> c.getCoreDescriptor().getCloudDescriptor())
>         .collect(Collectors.toList());
> long unhealthyCores = findUnhealthyCores(coreDescriptors, clusterState);
> {code}
> ...but that means the only {{CloudDescriptor}}s that are checked are the ones
> that come from _loaded_ cores (which is what {{coreContainer.getCores()}}
> returns), and any {{currentlyLoadingCores}} (registered by {{CoreContainer}}
> calling {{solrCores.markCoreAsLoading(cd)}} before starting the
> {{coreLoadExecutor}}) are not considered.
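
To make the two races described above concrete, here is a minimal toy model in plain Java. It does not use any Solr classes: {{CoreRegistry}}, {{finishLoading}}, {{healthyLoadedOnly}} and {{healthyIncludingLoading}} are hypothetical names invented for this sketch, and the stricter check is only one assumed way to close the gap, not necessarily the change being discussed on the issue.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Toy model only: none of these types are Solr classes.
class HealthCheckSketch {

    /** Hypothetical stand-in for the loaded/loading bookkeeping described in the issue. */
    static final class CoreRegistry {
        // core name -> true if the replica is "active" in cluster state
        private final Map<String, Boolean> loadedCores = new LinkedHashMap<>();
        // cores that were marked as loading but have not finished initializing
        private final Set<String> loadingCores = new LinkedHashSet<>();

        synchronized void markCoreAsLoading(String name) {
            loadingCores.add(name);
        }

        synchronized void finishLoading(String name, boolean active) {
            loadingCores.remove(name);
            loadedCores.put(name, active);
        }

        /** Mirrors the reported behavior: only already-loaded cores are inspected. */
        synchronized boolean healthyLoadedOnly() {
            return !loadedCores.containsValue(false);
        }

        /** Assumed stricter variant: a core that is still loading counts as not healthy. */
        synchronized boolean healthyIncludingLoading() {
            return loadingCores.isEmpty() && !loadedCores.containsValue(false);
        }
    }

    public static void main(String[] args) {
        CoreRegistry registry = new CoreRegistry();
        registry.markCoreAsLoading("core1");
        registry.markCoreAsLoading("core2");

        // Race 1: the request arrives before any core has finished loading.
        System.out.println("race 1, loaded only:       " + registry.healthyLoadedOnly());
        System.out.println("race 1, including loading: " + registry.healthyIncludingLoading());

        // Race 2: core1 is loaded and active, core2 is still initializing.
        registry.finishLoading("core1", true);
        System.out.println("race 2, loaded only:       " + registry.healthyLoadedOnly());
        System.out.println("race 2, including loading: " + registry.healthyIncludingLoading());
    }
}
```

In both scenarios the loaded-only check reports healthy even though work remains, while the variant that also looks at loading cores reports unhealthy until every core has finished initializing and is active.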