I will include my response to the original post:

> Snapshots are at the segment level. The more segments stored in the
> repository, the more segments will have to be compared to those in each
> successive snapshot. With merges taking place continually in an active
> index, you may end up with a considerable number of "orphaned" segments
> stored in your repository, i.e. segments "backed up" but no longer
> directly correlating to a segment in your index. Checking through these
> may be contributing to the increased amount of time between snapshots.
>
> Consider pruning older snapshots. "Orphaned" segments will be deleted,
> and any segments still referenced will be preserved.
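The pruning step recommended above can be automated. A minimal sketch, assuming snapshot names carry a timestamp of the form `snapshot-YYYY.MM.DD-HH:MM:SS` (as in the listings later in this thread); the repository name `s3_backup_repo` and the 7-day retention window are placeholder choices, not from the thread:

```python
# Select snapshots older than a retention window by parsing the
# timestamp embedded in each snapshot name. Names that don't match
# the naming convention are left alone.
from datetime import datetime, timedelta

def snapshots_to_prune(names, now, keep=timedelta(days=7)):
    """Return snapshot names whose embedded timestamp is older than `keep`."""
    cutoff = now - keep
    stale = []
    for name in names:
        try:
            ts = datetime.strptime(name, "snapshot-%Y.%m.%d-%H:%M:%S")
        except ValueError:
            continue  # not one of ours; skip it
        if ts < cutoff:
            stale.append(name)
    return stale

# Each stale snapshot would then be removed through the snapshot API, e.g.:
#   curl -XDELETE "http://<hostname>:9200/_snapshot/s3_backup_repo/<name>"
names = ["snapshot-2014.09.30-10:00:01", "snapshot-2014.11.11-16:00:01"]
print(snapshots_to_prune(names, now=datetime(2014, 11, 12)))
# -> ['snapshot-2014.09.30-10:00:01']
```

Deleting through the API (rather than deleting files in S3 directly) is what lets the repository keep segments that newer snapshots still reference.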
On Thursday, November 20, 2014 7:22:03 AM UTC-5, João Costa wrote:

> Hello,
>
> Sorry for hijacking this thread, but I'm currently also pondering the best
> way to perform periodic snapshots in AWS.
>
> My main concern is that we are using blue-green deployment with ephemeral
> storage on EC2, so if for some reason there is a problem with the cluster,
> we might lose a lot of data. I would therefore rather do frequent snapshots
> (for this reason, we are still using the deprecated S3 gateway).
>
> The thing is, you claim that "Having too many snapshots is problematic"
> and that one should "prune old snapshots". Since snapshots are incremental,
> this will imply data loss, correct?
> Also, is the problem related to the number of snapshots or to the size of
> the data? Is there any way to merge old snapshots into one? Would this
> solve the problem?
>
> Finally, if I create a cron job to make automatic snapshots, can I run into
> problems if two instances attempt to create a snapshot with the same name
> at the same time?
> Also, what's the best way to do a snapshot on shutdown? Should I put a
> script in init.d/rc.0 to run on shutdown before Elasticsearch shuts down?
> I've seen cases where EC2 instances have "not so graceful" shutdowns, so it
> would be wonderful if there were a better way to do this at the cluster
> level (i.e., if node A notices that node B is not responding, it
> automatically makes a snapshot).
>
> Sorry if some of these questions don't make much sense; I'm still quite
> new to Elasticsearch and have not completely understood the new snapshot
> feature.
>
> On Friday, November 14, 2014 8:19:42 AM UTC, Sally Ahn wrote:
>
>> Yes, I am now seeing the snapshots complete in about 2 minutes after
>> switching to a new, empty bucket.
>> I'm not sure why the initial request to snapshot to the empty repo was
>> hanging, because the snapshot did in fact complete in about 2 minutes,
>> according to the S3 timestamp.
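On João's worry about two instances creating a snapshot with the same name at the same time: one way to sidestep the race entirely is to make the name unique per machine. A sketch (the hostname suffix is an assumption to keep two cron jobs from colliding; note that Elasticsearch snapshot names must be lowercase):

```python
# Build a collision-resistant snapshot name for a cron job by combining
# a UTC timestamp with a sanitized, lowercased hostname.
import re
import socket
from datetime import datetime

def snapshot_name(now=None, host=None):
    now = now or datetime.utcnow()
    host = (host or socket.gethostname()).lower()
    host = re.sub(r"[^a-z0-9]", "-", host)  # sanitize for use in a URL path
    return "snapshot-%s-%s" % (now.strftime("%Y.%m.%d-%H:%M:%S"), host)

print(snapshot_name(datetime(2014, 11, 20, 7, 22, 3), host="node-a"))
# -> snapshot-2014.11.20-07:22:03-node-a
```

Two machines firing at the same second now PUT different snapshot names, so neither request fails on a name conflict (though the cluster still runs only one snapshot at a time).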
>> Time to automate deletion of old snapshots. :)
>> Thanks for the response!
>>
>> On Thursday, November 13, 2014 9:35:20 PM UTC-8, Igor Motov wrote:
>>
>>> Having too many snapshots is problematic. Each snapshot is done in an
>>> incremental manner, so in order to figure out what has changed and what
>>> is available, all snapshots in the repository need to be scanned, which
>>> takes longer as the number of snapshots grows. I would recommend pruning
>>> old snapshots as time goes by, or starting snapshots into a new
>>> bucket/directory if you really need to maintain 2-hour resolution for
>>> 2-month-old snapshots. The get command can sometimes hang because it's
>>> throttled by the ongoing snapshot.
>>>
>>> On Wednesday, November 12, 2014 9:02:33 PM UTC-10, Sally Ahn wrote:
>>>
>>>> I am also interested in this topic.
>>>> We were snapshotting our cluster of two nodes every 2 hours (invoked
>>>> via a cron job) to an S3 repository (we were running ES 1.2.2 with
>>>> cloud-aws plugin version 2.2.0, then we upgraded to ES 1.4.0 with
>>>> cloud-aws plugin 2.4.0 but are still seeing the issues described below).
>>>> I've been seeing an increase in the time it takes to complete a
>>>> snapshot with each subsequent snapshot.
>>>> I see a thread
>>>> <https://groups.google.com/forum/?fromgroups#!searchin/elasticsearch/snapshot/elasticsearch/bCKenCVFf2o/TFK-Es0wxSwJ>
>>>> where someone else was seeing the same thing, but that thread seems to
>>>> have died. In my case, snapshots have gone from taking ~5 minutes to
>>>> taking about an hour, even between snapshots where the data does not
>>>> seem to have changed.
>>>>
>>>> For example, you can see below a list of the snapshots stored in my S3
>>>> repo. Each snapshot is named with the timestamp of when my cron job
>>>> invoked the snapshot process.
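Igor's second suggestion, starting snapshots into a new bucket/directory, can be sketched by deriving the repository's `base_path` from the current month, so each month's snapshots live under their own S3 prefix and the per-repository scan stays small. The bucket name and path scheme here are illustrative, not from the thread:

```python
# Build the settings body for registering a month-scoped S3 repository
# (PUT _snapshot/<repo-name> with this JSON), using the elasticsearch-
# cloud-aws plugin's "s3" repository type.
import json
from datetime import date

def monthly_repo_settings(bucket, day):
    return {
        "type": "s3",
        "settings": {
            "bucket": bucket,
            "base_path": "snapshots/%s" % day.strftime("%Y-%m"),
        },
    }

print(json.dumps(monthly_repo_settings("my-es-backups", date(2014, 11, 13))))
```

A cron job would re-register (or register a fresh) repository at the start of each month; old months remain restorable by registering their prefix again.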
The S3 timestamp on the left shows the completion
>>>> time of that snapshot, and it's clear that it's steadily increasing:
>>>>
>>>> 2014-09-30 10:05   686  s3://<bucketname>/snapshot-2014.09.30-10:00:01
>>>> 2014-09-30 12:05   686  s3://<bucketname>/snapshot-2014.09.30-12:00:01
>>>> 2014-09-30 14:05   736  s3://<bucketname>/snapshot-2014.09.30-14:00:01
>>>> 2014-09-30 16:05   736  s3://<bucketname>/snapshot-2014.09.30-16:00:01
>>>> ...
>>>> 2014-11-08 00:52  1488  s3://<bucketname>/snapshot-2014.11.08-00:00:01
>>>> 2014-11-08 02:54  1488  s3://<bucketname>/snapshot-2014.11.08-02:00:01
>>>> ...
>>>> 2014-11-08 14:54  1488  s3://<bucketname>/snapshot-2014.11.08-14:00:01
>>>> 2014-11-08 16:53  1488  s3://<bucketname>/snapshot-2014.11.08-16:00:01
>>>> ...
>>>> 2014-11-11 07:00  1638  s3://<bucketname>/snapshot-2014.11.11-06:00:01
>>>> 2014-11-11 08:58  1638  s3://<bucketname>/snapshot-2014.11.11-08:00:01
>>>> 2014-11-11 10:58  1638  s3://<bucketname>/snapshot-2014.11.11-10:00:01
>>>> 2014-11-11 12:59  1638  s3://<bucketname>/snapshot-2014.11.11-12:00:01
>>>> 2014-11-11 15:00  1638  s3://<bucketname>/snapshot-2014.11.11-14:00:01
>>>> 2014-11-11 17:00  1638  s3://<bucketname>/snapshot-2014.11.11-16:00:01
>>>>
>>>> I suspected that this gradual increase was related to the accumulation
>>>> of old snapshots after I tested the following:
>>>> 1. I created a brand-new cluster with the same hardware specs in the
>>>> same datacenter and restored a snapshot of the problematic cluster taken
>>>> a few days back (i.e., not the latest snapshot).
>>>> 2. I then backed up that restored data to a new, empty bucket in the
>>>> same S3 region, and that was very fast...a minute or less.
>>>> 3. I then restored a later snapshot of the problematic cluster to the
>>>> test cluster and tried backing it up again to the new bucket, and that
>>>> also took about a minute or less.
>>>> However, when I tried deleting the repository full of old snapshots
>>>> from the problematic cluster and registering a brand-new empty bucket, I
>>>> found that my first snapshot to the new repository was also hanging
>>>> indefinitely. I finally had to kill my snapshot curl command. There were
>>>> no errors in the logs (the snapshot logger is very terse...wondering if
>>>> anyone knows how to increase the verbosity for it).
>>>>
>>>> So my theory seems to have been debunked, and I am again at a loss. I
>>>> am wondering whether the hanging snapshot is related to the slow
>>>> snapshots I was seeing before I deleted that old repository. I have seen
>>>> several issues on GitHub regarding hanging snapshots (#5958
>>>> <https://github.com/elasticsearch/elasticsearch/issues/5958>, #7980
>>>> <https://github.com/elasticsearch/elasticsearch/issues/7980>) and have
>>>> tried using the elasticsearch-snapshot-cleanup
>>>> <https://github.com/imotov/elasticsearch-snapshot-cleanup> utility on
>>>> my cluster both before and after I upgraded from version 1.2.2 to 1.4.0
>>>> (I thought upgrading to 1.4.0, which included snapshot improvements,
>>>> might fix my issues, but it did not), and the script is not finding any
>>>> running snapshots:
>>>>
>>>> [2014-11-13 05:37:45,451][INFO ][org.elasticsearch.node] [Golden Archer] started
>>>> [2014-11-13 05:37:45,451][INFO ][org.elasticsearch.org.motovs.elasticsearch.snapshots.AbortedSnapshotCleaner] No snapshots found
>>>> [2014-11-13 05:37:45,452][INFO ][org.elasticsearch.node] [Golden Archer] stopping ...
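On Sally's aside about the terse snapshot logger: in the 1.x series, verbosity could likely be raised in `config/logging.yml`, where keys under `logger:` are shorthand for packages relative to `org.elasticsearch`. A sketch; the exact logger names are an assumption worth verifying against your version:

```yaml
# config/logging.yml (sketch) -- raise snapshot-related loggers to TRACE.
# Keys under "logger:" expand to org.elasticsearch.<key>.
logger:
  snapshots: TRACE       # org.elasticsearch.snapshots
  repositories: TRACE    # assumption: repository-side logging, if present
```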
>>>> Curling _snapshot/REPO/_status also returns no ongoing snapshots:
>>>>
>>>> curl -XGET 'http://<hostname>:9200/_snapshot/s3_backup_repo/_status?pretty=true'
>>>> {
>>>>   "snapshots" : [ ]
>>>> }
>>>>
>>>> I may try bouncing ES on each node to see if that kills whatever
>>>> process is causing my requests to the snapshot module to hang (requests
>>>> to other modules like _cluster/health return fine; cluster health is
>>>> green, and load is low on both nodes: 0.00, 0.06).
>>>>
>>>> I would really appreciate some help/guidance on how to debug/fix this
>>>> issue, and general recommendations on how best to achieve periodic
>>>> snapshots. For example, cleaning up old snapshots seems rather
>>>> difficult, since we have to specify the snapshot name, which we would
>>>> obtain by making a request to the snapshot module, which seems to hang
>>>> often.
>>>>
>>>> Thanks,
>>>> Sally
>>>>
>>>> On Monday, November 10, 2014 12:27:10 AM UTC-8, Pradeep Reddy wrote:
>>>>
>>>>> Hi Vineeth,
>>>>>
>>>>> Thanks for the reply. I am aware of how to create and delete snapshots
>>>>> using cloud-aws.
>>>>>
>>>>> What I wanted to know was: what should the workflow of periodic
>>>>> snapshots be? Especially, how should one deal with old snapshots? Will
>>>>> having too many old snapshots impact something?
>>>>>
>>>>> On Friday, November 7, 2014 8:16:05 PM UTC+5:30, Vineeth Mohan wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> There is an S3 repository plugin:
>>>>>> https://github.com/elasticsearch/elasticsearch-cloud-aws#s3-repository
>>>>>> Use this. The snapshots are incremental, so it should fit your
>>>>>> purpose perfectly.
>>>>>>
>>>>>> Thanks,
>>>>>> Vineeth
>>>>>>
>>>>>> On Fri, Nov 7, 2014 at 3:22 PM, Pradeep Reddy <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> I want to back up the data every 15-30 minutes. I will be storing
>>>>>>> the snapshots in S3.
>>>>>>> DELETE old and then PUT new snapshot may not be the best practice,
>>>>>>> as you may end up with nothing if something goes wrong.
>>>>>>>
>>>>>>> Using timestamps for snapshot names may be one option, but how do I
>>>>>>> delete old snapshots then?
>>>>>>> Does S3 lifecycle management help to delete old snapshots?
>>>>>>>
>>>>>>> Looking forward to getting some opinions on this.
>>>>>>>
>>>>>>> --
>>>>>>> You received this message because you are subscribed to the Google
>>>>>>> Groups "elasticsearch" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>> send an email to [email protected].
>>>>>>> To view this discussion on the web visit
>>>>>>> https://groups.google.com/d/msgid/elasticsearch/0dd81d83-5066-4652-9703-dfce63b46993%40googlegroups.com.
>>>>>>> For more options, visit https://groups.google.com/d/optout.
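On Pradeep's S3 lifecycle question: lifecycle rules are a poor fit, because incremental snapshots share segment files, so a file's age in S3 says nothing about whether a newer snapshot still references it. Only the DELETE snapshot API knows the references. A toy illustration of that reference check (the snapshot and file names are made up):

```python
# Model two snapshots that share a segment file. Deleting the older
# snapshot may only remove files no other snapshot references -- which
# is what DELETE _snapshot/<repo>/<name> computes, and what an S3
# lifecycle rule based on object age cannot.
snapshots = {
    "snapshot-old": {"seg_1", "seg_2"},
    "snapshot-new": {"seg_2", "seg_3"},  # seg_2 is shared, though "old"
}

def files_safe_to_delete(snapshots, victim):
    """Files used only by `victim` -- what deleting it would remove."""
    still_needed = set().union(
        *(files for name, files in snapshots.items() if name != victim)
    )
    return snapshots[victim] - still_needed

print(sorted(files_safe_to_delete(snapshots, "snapshot-old")))
# -> ['seg_1']
```

An age-based lifecycle rule would have deleted `seg_2` along with everything else old, silently corrupting `snapshot-new`; pruning must go through the snapshot API.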
