Thanks, Joe, for such a detailed solution. I think this can help us with the problem.
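For anyone who wants to script the nightly routine Joe describes in the quoted message, here is a rough sketch of how the backup URLs and the CheckIndex command line might be generated. It only builds the strings and does not call Solr or Java. The second node, its port, and the helper names `backup_url` and `check_index_cmd` are my own assumptions for illustration; the core name, paths, jar names, and the `snapshot.<name>` directory come from Joe's example.

```python
from urllib.parse import urlencode

# Hypothetical node list: (Solr base URL, backup directory).
# Only the first entry appears in the quoted message; the second is made up.
NODES = [
    ("http://localhost:18983", r"D:\Solr\backup\node1"),
    ("http://localhost:18984", r"D:\Solr\backup\node2"),
]


def backup_url(base_url, core, location, name):
    """Build a replication-handler backup URL (query values are URL-encoded)."""
    query = urlencode({"command": "backup", "location": location, "name": name})
    return f"{base_url}/solr/{core}/replication?{query}"


def check_index_cmd(classpath, location, name, sep=";"):
    """Build the CheckIndex command as an argument list.

    The snapshot directory is assumed to be <location>\\snapshot.<name>,
    matching the path in the quoted message. `sep` is the classpath
    separator (";" on Windows, ":" elsewhere).
    """
    snapshot_dir = location.rstrip("\\/") + "\\snapshot." + name
    return [
        "java",
        "-cp", sep.join(classpath),
        "-ea:org.apache.lucene...",  # literal JVM syntax: assertions for the package tree
        "org.apache.lucene.index.CheckIndex",
        snapshot_dir,
    ]


if __name__ == "__main__":
    jars = ["lucene-core-9.3.0.jar", "lucene-backward-codecs-9.3.0.jar"]
    for base_url, location in NODES:
        print(backup_url(base_url, "wcrs", location, "bak"))
        print(" ".join(check_index_cmd(jars, location, "bak")))
```

The printed URL could then be passed to curl and the argument list to a process runner. Keeping the two steps separate mirrors Joe's setup, where the integrity check runs later than the backup.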
On Wed, Jun 28, 2023 at 1:47 PM Joe Jones (DHCW - Software Development) <joe.jo...@wales.nhs.uk.invalid> wrote:

> For our small (50 million document) 12-shard real-time index we back up each
> node every night and perform an integrity check on it.
>
> We run a simple batch file (Windows) to loop through the environments and
> generate curl calls to instigate the backup process, such as:
> http://localhost:18983/solr/wcrs/replication?command=backup&location=D:\Solr\backup\node1&name=bak
>
> And at a later point we integrity check with another script which calls:
> java -cp 'lucene-core-9.3.0.jar;lucene-backward-codecs-9.3.0.jar'
> -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex
> D:\\Solr\\backup\\node1\\snapshot.bak
>
> The backup is essentially Solr replicating the indexes to another data
> directory, and then our organisation's backup scheduling backs up the data
> each night for however long we set it to roll over for. As you can
> imagine, if you have large indexes you could, with rolling backups, be
> storing a huge amount of data, so that needs to be balanced.
>
> -----Original Message-----
> From: Saksham Gupta <saksham.gu...@indiamart.com.INVALID>
> Sent: 28 June 2023 06:32
> To: users@solr.apache.org
> Subject: Re: Solr Cloud Backup Strategy and Data Corruption Prevention
>
> Hi All,
> Any help regarding this problem? What is the standard practice for creating
> backups on Solr Cloud?
>
> On Tue, Jun 27, 2023 at 5:57 PM Saksham Gupta <saksham.gu...@indiamart.com>
> wrote:
>
> > Hi Solr Developers,
> > Reaching out to inquire about the best practices for implementing a
> > backup strategy in Solr Cloud. We recently migrated from Solr
> > standalone (Solr 6.5) to Solr 8.10, where we have a collection with
> > data divided among 8 shards using implicit routing. Until now, we have
> > maintained the standalone Solr as a backup in case something goes
> > wrong on Solr Cloud (due to data corruption, deletion, etc.).
> > However, we now wish to discard the standalone Solr and fully
> > transition to Solr Cloud. My concern is what would happen if the data
> > in Solr Cloud were to become corrupted or deleted, necessitating the
> > replacement or reindexing of the entire dataset, which can be a
> > time-consuming process. We aim to minimize downtime as much as possible.
> > I would greatly appreciate any insights or recommendations you could
> > provide to address this concern.
> >
> > Thank you in advance.
> >
> > Best regards,
> > Saksham