[jira] [Commented] (SOLR-15727) Split-brain can occur with heavy ingest and violent shutdown

Shawn Heisey (Jira) Tue, 26 Jul 2022 11:14:08 -0700


    [ 
https://issues.apache.org/jira/browse/SOLR-15727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17571567#comment-17571567
 ]


Shawn Heisey commented on SOLR-15727:
-------------------------------------

By default, SolrCloud only guarantees that the transaction logs contain 100 
documents.  With heavy indexing, you could easily need a lot more than 100 docs 
to completely recover from an unclean shutdown.  SolrCloud will replay the 
transaction log when a node starts, but for that to mean anything, the 
transaction logs must contain every document that was not properly committed 
before the unclean shutdown.

How many documents are kept in the transaction logs is controlled by the 
numRecordsToKeep item in the updateLog config.  The example below raises it 
from the default of 100 to 500.

{code}
<updateLog>
  <str name="dir">${solr.ulog.dir:}</str>
  <int name="numRecordsToKeep">500</int>
  <int name="maxNumLogsToKeep">20</int>
  <int name="numVersionBuckets">65536</int>
</updateLog>
{code}

I cannot tell you what value you need for this.  Maybe start at 10000 and 
adjust from there, either up or down?  More data in the transaction logs means 
that node startup will take a little bit longer, but when startup finishes, the 
index should be fully up to date.
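
As a rough sketch of that starting point, here is the same block with 
numRecordsToKeep raised to 10000.  The other values are simply carried over 
from the example above and are not tuned recommendations.  As I understand it, 
maxNumLogsToKeep caps how many log files are retained, so it may need to go up 
as well.

{code}
<updateLog>
  <str name="dir">${solr.ulog.dir:}</str>
  <!-- starting guess only; adjust up or down for your ingest rate -->
  <int name="numRecordsToKeep">10000</int>
  <int name="maxNumLogsToKeep">20</int>
  <int name="numVersionBuckets">65536</int>
</updateLog>
{code}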

A bit of information that all of us involved with the project take for granted 
is that when a hard commit happens, Solr closes the current transaction log, 
opens a new one, and deletes older logs according to the settings in updateLog.
That is why our shipped configs have autoCommit set at 15 seconds with 
openSearcher set to false -- so the commit happens very quickly and the 
transaction logs do not become enormous.
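
For reference, the autoCommit section in the shipped configs looks roughly 
like the following.  This is written from memory of the default configset, so 
treat the solr.autoCommit.maxTime property name as an assumption and check it 
against your own solrconfig.xml.  maxTime is in milliseconds.

{code}
<autoCommit>
  <!-- hard commit every 15 seconds; this rolls over the transaction log -->
  <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
  <!-- do not open a new searcher, so the commit stays cheap -->
  <openSearcher>false</openSearcher>
</autoCommit>
{code}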


> Split-brain can occur with heavy ingest and violent shutdown
> ------------------------------------------------------------
>
>                 Key: SOLR-15727
>                 URL: https://issues.apache.org/jira/browse/SOLR-15727
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 8.9
>            Reporter: Don
>            Priority: Major
>
> Violent shutdown of collection with 2 replicas per shard with heavy ingest 
> causes split-brain where the query results against a collection will yield 
> different results each time.
>   
> Steps to reproduce IPL 2 issue
>  # Deploy a solr cluster with at least 3 instances
>  # Create a solr collection with 3 shards and 2 replicas per shard
>  # Begin heavy ingest to solr
>  # Kill one of the solr instances (we had it running in a docker container, 
> and ran "docker kill" on the container)
>  # Wait for the instance to come back up
>  # Stop ingest to solr
>  # Run a hard commit on the collection
>  # Attempt to do a few queries to the collection
>  
> +*Expected Result:*+
>  * Every query returns the same hit count
> +*Actual Result:*+
>  * As you run queries, the hit count will fluctuate
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

