[
https://issues.apache.org/jira/browse/SOLR-17306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated SOLR-17306:
----------------------------------
Labels: pull-request-available (was: )
> Solr Repeater or Slave loses data after restart when replication is not
> enabled on leader
> -----------------------------------------------------------------------------------------
>
> Key: SOLR-17306
> URL: https://issues.apache.org/jira/browse/SOLR-17306
> Project: Solr
> Issue Type: Bug
> Affects Versions: 9.2, 9.3, 9.4, 9.5, 9.6
> Reporter: Peter Kroiss
> Priority: Major
> Labels: pull-request-available
> Attachments: solr-replication-test.txt
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> We are testing Solr 9.6.2 in a leader-repeater-follower configuration.
> During periods when we write heavily to the leader, replication is disabled
> on the leader to save bandwidth.
> If the repeater restarts for any reason while replication is disabled on the
> leader, the repeater loses all documents and does not recover when
> replication is re-enabled on the leader.
> The documents are deleted, but the indexVersion and generation properties are
> set to the leader's values, so the repeater or follower sees no difference
> and does not fetch the index when replication is re-enabled on the leader.
> It recovers only when there are commits on the leader after replication has
> been re-enabled.
> Log:
> 2024-05-22 06:18:42.186 INFO (qtp16373883-27-null-23) [c: s: r: x:mycore
> t:null-23] o.a.s.c.S.Request webapp=/solr path=/replication
> params={wt=json&command=details} status=0 QTime=10
> 2024-05-22 06:18:46.195 INFO (indexFetcher-43-thread-1) [c: s: r: x:mycore
> t:] o.a.s.h.IndexFetcher Leader's generation: 0
> 2024-05-22 06:18:46.195 INFO (indexFetcher-43-thread-1) [c: s: r: x:mycore
> t:] o.a.s.h.IndexFetcher Leader's version: 0
> 2024-05-22 06:18:46.195 INFO (indexFetcher-43-thread-1) [c: s: r: x:mycore
> t:] o.a.s.h.IndexFetcher Follower's generation: 2913
> 2024-05-22 06:18:46.195 INFO (indexFetcher-43-thread-1) [c: s: r: x:mycore
> t:] o.a.s.h.IndexFetcher Follower's version: 1716300697144
> 2024-05-22 06:18:46.195 INFO (indexFetcher-43-thread-1) [c: s: r: x:mycore
> t:] o.a.s.h.IndexFetcher New index in Leader. Deleting mine...
>
> --> There is no new index on the leader; it is only closed for replication.
>
>
> We think the problem is in IndexFetcher. Adding a forceReplication check to
> the condition will probably fix it:
> old: if (IndexDeletionPolicyWrapper.getCommitTimestamp(commit) != 0L) {
> new: if (forceReplication &&
> IndexDeletionPolicyWrapper.getCommitTimestamp(commit) != 0L) {
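
The effect of the proposed condition can be modelled in isolation. The following is a minimal sketch, not Solr's actual IndexFetcher code: the class name `FetchDecisionSketch` and both method names are hypothetical, and it assumes (inferred from the log above) that this check is reached when the leader reports version 0. Under those assumptions, the old condition wipes any non-empty local index, while the proposed one does so only when replication is forced.

```java
// Simplified model of the condition quoted in the report.
// NOT Solr's real code; all names here are illustrative assumptions.
final class FetchDecisionSketch {

    // Old condition: a local commit with a non-zero timestamp is wiped.
    static boolean wipesLocalIndexOld(long localCommitTimestamp) {
        return localCommitTimestamp != 0L;
    }

    // Proposed condition: the wipe additionally requires forceReplication.
    static boolean wipesLocalIndexNew(boolean forceReplication, long localCommitTimestamp) {
        return forceReplication && localCommitTimestamp != 0L;
    }

    public static void main(String[] args) {
        // Follower's version taken from the log excerpt above.
        long followerVersion = 1716300697144L;
        System.out.println("old condition wipes: " + wipesLocalIndexOld(followerVersion));
        System.out.println("new condition, not forced: " + wipesLocalIndexNew(false, followerVersion));
        System.out.println("new condition, forced: " + wipesLocalIndexNew(true, followerVersion));
    }
}
```

Whether forceReplication is false on the polling path that produced the log is itself an assumption of this sketch and would need to be confirmed against the Solr source.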
>
>
>
>
> While investigating the problem, we also found an inconsistency in the
> response to the details request. The response contains two leader fragments.
> When the leader is closed for replication, the property
> leader.replicationEnabled is still reported as true, whereas the property
> follower.leaderDetails.leader.replicationEnabled is correct.
>
> Example
> curl -s "https://solr9-repeater:8983/solr/mycore/replication?wt=json&command=details" \
>   | jq '.details | {
>       indexSize: .indexSize,
>       indexVersion: .indexVersion,
>       generation: .generation,
>       indexPath: .indexPath,
>       leader: {
>         replicableVersion: .leader.replicableVersion,
>         replicableGeneration: .leader.replicableGeneration,
>         replicationEnabled: .leader.replicationEnabled
>       },
>       follower: {
>         leaderDetails: {
>           indexSize: .follower.leaderDetails.indexSize,
>           generation: .follower.leaderDetails.generation,
>           indexVersion: .follower.leaderDetails.indexVersion,
>           indexPath: .follower.leaderDetails.indexPath,
>           leader: {
>             replicableVersion: .follower.leaderDetails.leader.replicableVersion,
>             replicableGeneration: .follower.leaderDetails.leader.replicableGeneration,
>             replicationEnabled: .follower.leaderDetails.leader.replicationEnabled
>           }
>         }
>       }
>     }'
>
> {
>   "indexSize": "10.34 GB",
>   "indexVersion": 1716358708159,
>   "generation": 2913,
>   "indexPath": "/var/solr/data/mycore/data/index.20240522061946262",
>   "leader": {
>     "replicableVersion": 1716358708159,
>     "replicableGeneration": 2913,
>     "replicationEnabled": "true"
>   },
>   "follower": {
>     "leaderDetails": {
>       "indexSize": "10.34 GB",
>       "generation": 2913,
>       "indexVersion": 1716358708159,
>       "indexPath": "/var/solr/data/mycore/data/restore.20240508131046932",
>       "leader": {
>         "replicableVersion": 1716358708159,
>         "replicableGeneration": 2913,
>         "replicationEnabled": "false"
>       }
>     }
>   }
> }
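
To spot the mismatch described above in a saved details response, the two replicationEnabled values can be pulled out and compared. This is a rough sketch only: the class `ReplicationFlagCheckSketch` and its methods are hypothetical, and it uses a regular expression on the raw text purely to stay dependency-free, assuming the flags appear as quoted "true"/"false" strings as in the output above. A real check would more likely use a JSON parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative helper; not part of Solr.
final class ReplicationFlagCheckSketch {

    // Matches "replicationEnabled": "true" or "false", tolerating whitespace and line breaks.
    private static final Pattern FLAG =
        Pattern.compile("\"replicationEnabled\"\\s*:\\s*\"(true|false)\"");

    // Returns every replicationEnabled value in document order.
    static List<String> replicationFlags(String json) {
        List<String> flags = new ArrayList<>();
        Matcher m = FLAG.matcher(json);
        while (m.find()) {
            flags.add(m.group(1));
        }
        return flags;
    }

    // True when the response contains replicationEnabled values that differ.
    static boolean flagsDisagree(String json) {
        return replicationFlags(json).stream().distinct().count() > 1;
    }
}
```

Applied to the sample output above, `replicationFlags` would yield the value under leader first and the value under follower.leaderDetails.leader second, so `flagsDisagree` reports a mismatch.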