Chris M. Hostetter created SOLR-17200:
-----------------------------------------

             Summary: Race conditions on startup using /health?requireHealthyCores=true
                 Key: SOLR-17200
                 URL: https://issues.apache.org/jira/browse/SOLR-17200
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
            Reporter: Chris M. Hostetter


There seem to be at least two possible thread race conditions that can cause 
{{/health?requireHealthyCores=true}} to return a false positive (i.e. report 
the node as healthy when it is not) while {{CoreContainer}} is in the process 
of starting up.
 # If the request comes in _after_ {{CoreContainer}} has initialized 
{{healthCheckHandler}} but _before_ initializing & running the 
{{coreLoadExecutor}}
 # A more complex situation where the request comes in _while_ 
{{coreLoadExecutor}} is loading cores, and all of the cores that have 
_finished_ initialization are "active" in SolrCloud, but other SolrCores remain 
to be initialized (and may need recovery)

In both cases, the root of the issue is that {{requireHealthyCores=true}} works 
by checking...
{code:java}
      Collection<CloudDescriptor> coreDescriptors =
          coreContainer.getCores().stream()
              .map(c -> c.getCoreDescriptor().getCloudDescriptor())
              .collect(Collectors.toList());
      long unhealthyCores = findUnhealthyCores(coreDescriptors, clusterState);
{code}
...but that means the only {{CloudDescriptor}} s that are checked are the ones 
that come from _loaded_ cores (which is what {{coreContainer.getCores()}} 
returns). Any {{currentlyLoadingCores}} (registered by {{CoreContainer}} 
calling {{solrCores.markCoreAsLoading(cd)}} before starting the 
{{coreLoadExecutor}} ) are not considered.
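
To make the two scenarios concrete, here is a minimal toy model of the problem. It is only a sketch: {{ToyCore}}, {{ToyContainer}}, {{naiveHealthy}} and {{stricterHealthy}} are hypothetical names invented for illustration, not Solr classes or methods, and the "stricter" variant is just one assumed direction, not a fix proposed in this issue.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these types are Solr classes.
class HealthCheckRaceSketch {

    /** Hypothetical stand-in for one core's state as seen in the cluster state. */
    static final class ToyCore {
        final String name;
        final boolean active;

        ToyCore(String name, boolean active) {
            this.name = name;
            this.active = active;
        }
    }

    /** Hypothetical stand-in for the container's bookkeeping during startup. */
    static final class ToyContainer {
        final List<ToyCore> loaded = new ArrayList<>(); // plays the role of getCores()
        final List<String> loading = new ArrayList<>(); // plays the role of currentlyLoadingCores
        boolean loadStarted = false;                    // has core loading been kicked off yet?
    }

    /** Mirrors the shape of the check quoted above: only loaded cores are inspected. */
    static boolean naiveHealthy(ToyContainer c) {
        // allMatch on an empty stream is true, so "no cores loaded yet" looks healthy.
        return c.loaded.stream().allMatch(core -> core.active);
    }

    /** Assumed stricter variant: also refuse while loading is unstarted or pending. */
    static boolean stricterHealthy(ToyContainer c) {
        return c.loadStarted && c.loading.isEmpty() && naiveHealthy(c);
    }

    public static void main(String[] args) {
        // Scenario 1: the handler is reachable, but core loading has not begun.
        ToyContainer beforeLoad = new ToyContainer();
        System.out.println(naiveHealthy(beforeLoad) + " " + stricterHealthy(beforeLoad));

        // Scenario 2: one core finished loading and is active, another is still loading.
        ToyContainer midLoad = new ToyContainer();
        midLoad.loadStarted = true;
        midLoad.loaded.add(new ToyCore("core1", true));
        midLoad.loading.add("core2");
        System.out.println(naiveHealthy(midLoad) + " " + stricterHealthy(midLoad));
    }
}
```

Both scenarios print {{true false}}: the check that looks only at loaded cores reports healthy, while the variant that also accounts for unstarted or in-progress loading does not.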



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
