RE: SolrCloud: How best to do backups?

Davis, Daniel (NIH/NLM) [C] Thu, 08 Feb 2018 13:52:05 -0800

I would suggest you have a separate EBS volume on each server to hold that 
server's backup. These EBS volumes would be mounted all the time, but only 
modified by a backup.


Then, you can create an AWS Lambda function that runs on a periodic trigger 
from CloudWatch, and does the following:

- run the backup (by calling the Collections API BACKUP action described at 
https://lucene.apache.org/solr/guide/6_6/making-and-restoring-backups.html#solrcloud-backups)
- snapshot the EBS volume
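
The two steps above could be sketched roughly as follows. This is only an 
illustration, not a tested Lambda: the event keys (solr_base_url, collection, 
backup_name, location, volume_id) and the helper names are made up for the 
example, and it assumes the Lambda can reach Solr over HTTP and that the 
backup location is the mounted backup EBS volume.

```python
import urllib.parse
import urllib.request


def build_backup_url(solr_base_url, collection, backup_name, location):
    """Build a Collections API BACKUP request URL for one collection."""
    query = urllib.parse.urlencode({
        "action": "BACKUP",
        "name": backup_name,
        "collection": collection,
        "location": location,
    })
    return f"{solr_base_url.rstrip('/')}/admin/collections?{query}"


def backup_then_snapshot(ec2, volume_id, backup_url, fetch=None):
    """Run the Solr backup, then snapshot the EBS volume it was written to.

    `ec2` is expected to behave like boto3.client("ec2"). `fetch` performs
    the HTTP GET and defaults to urllib (hypothetical hook, handy for tests).
    """
    if fetch is None:
        def fetch(url):
            with urllib.request.urlopen(url) as resp:
                return resp.read()

    # urlopen raises on HTTP error statuses, so a failed backup request
    # stops here and no snapshot is taken.
    fetch(backup_url)

    result = ec2.create_snapshot(
        VolumeId=volume_id,
        Description=f"Solr backup volume {volume_id}",
    )
    return result["SnapshotId"]


def lambda_handler(event, context):
    """Entry point for the periodic trigger; event keys are assumptions."""
    import boto3  # imported here so the helpers above work without it

    url = build_backup_url(
        event["solr_base_url"],
        event["collection"],
        event["backup_name"],
        event["location"],
    )
    snapshot_id = backup_then_snapshot(
        boto3.client("ec2"), event["volume_id"], url
    )
    return {"snapshot_id": snapshot_id}
```

A large index may take longer to back up than a Lambda invocation is allowed 
to run, so the timeout is something to check before relying on this.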

If you lose a node, you provision a new backup EBS volume from the latest 
snapshot. If the node was lost during a backup, the latest snapshot is the one 
from the previous run, so you still have a complete backup.
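
Provisioning the replacement volume might look something like the sketch 
below. Again this is only an illustration: the function names are made up, 
`ec2` is assumed to behave like boto3.client("ec2"), and pagination of the 
snapshot listing is ignored for brevity.

```python
def latest_completed_snapshot(ec2, volume_id):
    """Return the most recent completed snapshot of the given volume."""
    response = ec2.describe_snapshots(
        OwnerIds=["self"],
        Filters=[
            {"Name": "volume-id", "Values": [volume_id]},
            # Skip a snapshot that was still in progress when the node died.
            {"Name": "status", "Values": ["completed"]},
        ],
    )
    snapshots = response["Snapshots"]
    if not snapshots:
        raise LookupError(f"no completed snapshot found for {volume_id}")
    return max(snapshots, key=lambda snap: snap["StartTime"])


def provision_backup_volume(ec2, lost_volume_id, availability_zone):
    """Create a new EBS volume from the latest snapshot of the lost one."""
    snapshot = latest_completed_snapshot(ec2, lost_volume_id)
    volume = ec2.create_volume(
        SnapshotId=snapshot["SnapshotId"],
        AvailabilityZone=availability_zone,
    )
    return volume["VolumeId"]
```

The new volume then has to be attached to the replacement instance and 
mounted before the Solr restore can read from it.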

If you can stop indexing to these three servers periodically, you can also just 
make snapshots of their primary EBS volumes, and use those for the restore 
(because the index is not being updated).
 
Does this make any sense?

-----Original Message-----
From: John Bickerstaff [mailto:j...@johnbickerstaff.com] 
Sent: Thursday, February 8, 2018 4:06 PM
To: solr-user@lucene.apache.org
Subject: Re: SolrCloud: How best to do backups?

This article may be of some use...

What isn't clear is what effect either of the two strategies mentioned would 
have on serving responses to queries...  It would be nice if the backup was a 
"low priority thread" compared to the needs of the server in question, but I've 
never had to dig that deep before...

https://n2ws.com/how-to-guides/automate-amazon-ec2-instance-backup.html

On Thu, Feb 8, 2018 at 2:00 PM, John Bickerstaff <j...@johnbickerstaff.com>
wrote:

> Hmmm...
>
> Can you (fairly quickly) reproduce this AWS environment (including the 
> indexes)?  Or does it require that several-week process to provision 
> new Solr boxes...?
>
> What happens now if one of those ec2 instances gets into trouble?  Do 
> you have autoscaling groups set up?
>
> On Thu, Feb 8, 2018 at 1:44 PM, Kelly, Frank <frank.ke...@here.com> wrote:
>
>> We have a large SolrCloud deployment on AWS (350m documents spread 
>> across 3 collections, each with 3 shards and 3 replicas), running on 
>> 3 x r3.xlarge’s with the data stored on EBS drives with Provisioned IOPS.
>>
>> Currently it’s handling 38m requests per day
>>
>> My question is: how best should we back up the search index?
>> Is there some way to snapshot a backup while Solr remains online that 
>> doesn’t horribly affect performance?
>>
>> Right now, in the event of a catastrophic failure, it would take 
>> several weeks to reindex the data again based on the process we have 
>> now (which is outdated)
>>
>> -Frank
>>
>>
>> *Frank Kelly*
>>
>> *Principal Software Engineer*
>>
>> AAA Identity Profile Team (SCBE / CDA)
>>
>>
>> HERE
>>
>> 5 Wayside Rd, Burlington, MA 01803, USA 
>>
>> *42° 29' 7" N 71° 11' 32" W*
>
>
