-------- Forwarded Message --------
Subject: RE: [EXTERNAL] [VuFind-General] solr follower empty on fail of leader
Date: Fri, 31 Jan 2025 12:53:28 +0000
From: Demian Katz <demian.k...@villanova.edu>
To: thomas schwaerzler <thomas.schwaerz...@uibk.ac.at>, vufind general <vufind-gene...@lists.sourceforge.net>

Thomas,

I've been running multiple leader/follower Solr instances for more than a decade, and I've never encountered the problem you describe, nor have I heard of anyone else in the VuFind community complaining about it.

Maybe it's worth reaching out to the broader Solr community via the solr-user mailing list to see if anyone there has similar experiences or theories.

If it would help, I can share my specific Solr configurations... but from memory, I think the main differences are that we do not replicate on startup, and we don't have a backupAfter setting at all. Is there a compelling reason to replicate on startup? Maybe turning off that setting would be a worthwhile experiment, in case something weird during the startup process is somehow leading to the problem you describe.
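
For illustration only, a minimal sketch of that experiment. It assumes the leader handler quoted further down in this thread is kept as posted, minus the 'startup' replicateAfter entry and all backupAfter entries. It is not a copy of any production configuration.

```xml
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="leader">
    <!-- Experiment: 'startup' trigger removed; replicate only after commit and optimize -->
    <str name="replicateAfter">commit</str>
    <str name="replicateAfter">optimize</str>

    <!-- Experiment: no backupAfter entries at all -->

    <!-- Unchanged from the configuration posted below -->
    <str name="confFiles">schema.xml,stopwords.txt,elevate.xml</str>
    <str name="commitReserveDuration">00:00:10</str>
  </lst>

  <int name="maxNumberOfBackups">1</int>
</requestHandler>
```

The follower handler would stay as it is; only the leader core needs to be reloaded or restarted for the change to take effect.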

- Demian

-----Original Message-----
From: thomas schwaerzler via VuFind-General <vufind-gene...@lists.sourceforge.net>
Sent: Friday, January 31, 2025 7:12 AM
To: vufind general <vufind-gene...@lists.sourceforge.net>
Subject: [EXTERNAL] [VuFind-General] solr follower empty on fail of leader

we have a simple architecture with two solr indices: one leader that collects the catalog data from our partner, receives a lot of inserts and is permanently updated. the follower index just replicates the leader and serves our vufind page.

sometimes the solr follower index for our production vufind instance turns out to be empty. last time that happened, the follower had somehow reached a version number that was exactly the version number of the leader + 1. when i tried to replicate manually, the follower did nothing, probably because it sees itself as ahead of the leader. i cannot reproduce how the follower got empty. i have a suspicion that this happens when the leader gets an error and is not reachable.

did anyone have a similar experience?


here are my solr configs:

master:
====

solr/vufind/biblio/conf/solrconfig.xml

...
<!-- added request handler: master:
https://solr.apache.org/guide/solr/latest/deployment-guide/user-managed-index-replication.html#index-replication-in-solr
-->
<requestHandler name="/replication" class="solr.ReplicationHandler" >
     <lst name="leader">
<!--Replicate on 'startup' and 'commit'. 'optimize' is also a valid value for replicateAfter. -->
             <str name="replicateAfter">startup</str>
             <str name="replicateAfter">commit</str>
             <str name="replicateAfter">optimize</str>

<!--Create a backup after 'optimize'. Other values can be 'commit', 'startup'. It is possible to have multiple entries of this config string. Note that this is just for backup, replication does not require this. -->
             <str name="backupAfter">startup</str>
<!-- changed: no space left on device. testing remove backup after commit: -->
         <!-- str name="backupAfter">commit</str -->

<!--If configuration files need to be replicated give the names here, separated by comma -->
         <str name="confFiles">schema.xml,stopwords.txt,elevate.xml</str>
<!--The default value of reservation is 10 secs. See the documentation below. Normally, you should not need to specify this -->
         <str name="commitReserveDuration">00:00:10</str>
      </lst>

      <int name="maxNumberOfBackups">1</int>

</requestHandler>




follower:
=======


<!-- slave to "seamast" master: -->
      <requestHandler name="/replication" class="solr.ReplicationHandler" >
          <lst name="follower">

             <str name="leaderUrl">http://ds-seamast.uibk.ac.at:8986/solr/biblio</str>

             <!-- str name="pollInterval">00:13:07</str -->
             <str name="pollInterval">00:17:13</str>
             <str name="compression">internal</str>
             <str name="httpConnTimeout">20000</str>
             <str name="httpReadTimeout">20000</str>

         </lst>
      </requestHandler>
<!-- end replication slave config  -->



best
tom.


--
Thomas Schwaerzler
Digital Services Department
University Innsbruck Library
6020 Innsbruck - Innrain 52 - Austria
Phone: ++43-(0)512-507-25406
Fax: ++43-(0)512-507-25449
Email: <thomas.schwaerz...@uibk.ac.at>
URL: http://www.uibk.ac.at/ulb/ds
OpenPGP key: 0x2BD592C2, Key fingerprint = A17C 26AE FB4B BED6 7907 5DFC
5840 AB43 2BD5 92C2



_______________________________________________
VuFind-General mailing list
vufind-gene...@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/vufind-general
