I like that HyperDex provides direct backup support instead of simply suggesting a stop-filecopy-start-catchup scenario. Are there any plans at Basho to make backups a core function of Riak (or a separate but included utility)? It would certainly be nice to have something Basho provides to help ensure things are done properly each time, all the time.
Cheers,
Dave

On Thu, Jan 23, 2014 at 1:42 PM, Joe Caswell <[email protected]> wrote:

> Apologies, clicked send in the middle of an incomplete thought. It should
> have read:
>
> Backing up the LevelDB data files while the node is stopped would remove
> the necessity of using the LevelDB repair process upon restoring to make
> the vnode self-consistent.
>
> From: Joe Caswell <[email protected]>
> Date: Thursday, January 23, 2014 1:25 PM
> To: Sean McKibben <[email protected]>, Elias Levy <[email protected]>
> Cc: "[email protected]" <[email protected]>
> Subject: Re: Riak Search and Yokozuna Backup Strategy
>
> Backing up LevelDB data files can be accomplished while the node is
> running if the sst_x directories are backed up in numerical order. The
> undesirable side effects of that could be duplicated data, inconsistent
> manifest, or incomplete writes, which necessitates running the leveldb
> repair process upon restoration for any vnode backed up while the node was
> running. Since the data is initially written to the recovery log before
> being appended to level 0, and any compaction operation fully writes the
> data to its new location before removing it from its old location, if any
> of these operations are interrupted, the data can be completely recovered
> by leveldb repair.
>
> The only incomplete write that won't be recovered by the LevelDB repair
> process is the initial write to the recovery log, limiting exposure to the
> key being actively written at the time of the snapshot/backup. As long as
> 2 vnodes in the same preflist are not backed up while simultaneously
> writing the same key to the recovery log (i.e. rolling backups are good),
> this key will be recovered by AAE/read repair after restoration.
>
> Backing up the LevelDB data files while the node is stopped would remove
> the necessity of repairing the
>
> Backing up Riak Search data, on the other hand, is a dicey proposition.
> There are 3 bits to riak search data: the document you store, the output
> of the extractor, and the merge index.
>
> When you put a document in <<"key">> in a <<"bucket">> with search
> enabled, Riak uses the pre-defined extractor to parse the document into
> terms, possibly flattening the structure, and stores the result in
> <<"_rsid_bucket">>/<<"key">>, which is used during update operations to
> remove stale entries before adding new ones, and would most likely be
> stored in a different vnode, possibly on a different node entirely. The
> document id/link is inserted into the merge index entry for each term
> identified by the extractor, any or all of which may reside on different
> nodes. Since the document, its index document, and the term indexes could
> not be guaranteed to be captured in any single backup operation, it is a
> very real probability that these would be out of sync in the event that a
> restore is required.
>
> If restore is only required for a single node, consistency could be
> restored by running a repair operation for each riak_kv vnode and
> riak_search vnode stored on the node, which would repair the data from
> other nodes in the cluster. If more than one node is restored, it is quite
> likely that they both stored replicas of the same data, for some subset of
> the full data set. The only way to ensure consistency is fully restored in
> the latter case is to reindex the data set. This can be accomplished by
> reading and rewriting all of the data, or by reindexing via MapReduce as
> suggested in this earlier mailing list post:
> http://lists.basho.com/pipermail/riak-users_lists.basho.com/2012-October/009861.html
>
> In either restore case, having a backup of the merge_index data files is
> not helpful, so there does not appear to be any point in backing them up.
>
> Joe Caswell
>
> From: Sean McKibben <[email protected]>
> Date: Tuesday, January 21, 2014 1:04 PM
> To: Elias Levy <[email protected]>
> Cc: "[email protected]" <[email protected]>
> Subject: Re: Riak Search and Yokozuna Backup Strategy
>
> +1 LevelDB backup information is important to us
>
> On Jan 20, 2014, at 4:38 PM, Elias Levy <[email protected]> wrote:
>
> Anyone from Basho care to comment?
>
> On Thu, Jan 16, 2014 at 10:19 AM, Elias Levy <[email protected]> wrote:
>
>> Also, while LevelDB appears to be largely an append-only format, the
>> documentation currently does not recommend live backups, presumably because
>> there are some issues that can crop up when restoring a DB that was not
>> cleanly shut down.
>>
>> I am guessing those issues are the ones documented as edge cases here:
>> https://github.com/basho/leveldb/wiki/repair-notes
>>
>> That said, it looks like as of 1.4 those are largely cleared up, at least
>> from what I gather from that page, and that one must only ensure that data
>> is copied in a certain order and that you run the LevelDB repair algorithm
>> when restoring the files.
>>
>> So is the backup documentation on LevelDB still correct? Will Basho
>> enable hot backups on LevelDB backends any time soon?
>>
>> On Thu, Jan 16, 2014 at 10:05 AM, Elias Levy <[email protected]> wrote:
>>
>>> How well does Riak Search play with backups? Can you back up the Riak
>>> Search data without bringing the node down?
>>>
>>> The Riak documentation backup page is completely silent on Riak Search
>>> and its merge_index backend.
>>>
>>> And looking forward, what is the backup strategy for Yokozuna? Will it
>>> make use of Solr's Replication Handler, or something lower level?
>>> Will the node need to be offline to back it up?
>
> _______________________________________________
> riak-users mailing list
> [email protected]
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
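
For anyone wanting to experiment with the live-backup ordering Joe describes in the quoted message (copying the sst_x directories in numerical order), here is a minimal Python sketch. It is only an illustration, not a Basho-provided tool: the function names are hypothetical, the directory naming pattern `sst_<number>` is taken from the thread, and the sketch deliberately leaves out the other files in the vnode directory because the thread does not say how those should be ordered. Per the thread, a vnode copied this way while the node is running would still need the LevelDB repair process on restoration.

```python
import re
import shutil
from pathlib import Path

# Directory naming taken from the thread ("sst_x directories").
SST_DIR = re.compile(r"^sst_(\d+)$")


def sst_dirs_in_order(vnode_dir):
    """Return the sst_N subdirectories of vnode_dir sorted by N.

    The sort is numeric, so sst_10 comes after sst_2 (a plain string
    sort would put it before).
    """
    found = []
    for entry in Path(vnode_dir).iterdir():
        match = SST_DIR.match(entry.name)
        if entry.is_dir() and match:
            found.append((int(match.group(1)), entry))
    return [path for _, path in sorted(found, key=lambda pair: pair[0])]


def backup_sst_dirs(vnode_dir, dest_dir):
    """Copy each sst_N directory into dest_dir, lowest number first.

    Hypothetical helper: it copies only the sst_N directories and
    returns their names in the order they were copied. dest_dir must
    not already contain those directories.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sst_dirs_in_order(vnode_dir):
        shutil.copytree(src, dest / src.name)
        copied.append(src.name)
    return copied
```

Calling `backup_sst_dirs("/path/to/vnode", "/path/to/backup")` (both paths are placeholders) would copy sst_0 first, then sst_1, and so on. Running it one node at a time matches the "rolling backups are good" advice in the thread.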
