Hi Elias,
On Mon, Jan 27, 2014 at 2:40 PM, Elias Levy <[email protected]> wrote:

> Any comments on the backup strategy for Yokozuna? Will it make use of
> Solr's Replication Handler, or something lower level? Will the node
> need to be offline to back it up?

There is no use of any Solr replication code at all. Yokozuna (the new Riak Search; yes, I know the naming is confusing) can be thought of as secondary data to KV. It is a collection of index postings derived from the canonical, authoritative KV data. Therefore, the postings can always be rebuilt from the KV data. AAE provides an automatic integrity check between each KV object and its postings, and it runs constantly in the background.

Given that, there are two ways I see backup/restore working.

1. From a local, file-level perspective. You take a snapshot of your node's local filesystem and use that as a save point in case of future corruption. In this case you don't worry about cluster-wide consistency; it's just a local backup. If you ever have to restore this data, AAE and read-repair can deal with any divergence caused by the restore, although you could end up with resurrected data depending on your delete policy and the age of the backup.

Another issue is that various parts of Riak that write to disk may not be snapshot safe. It's already been discussed how leveldb isn't, and I'm willing to bet Lucene isn't either. In any case where a logical operation requires multiple filesystem writes, you have to worry about the snapshot occurring in the middle of that operation. I have no idea how Lucene deals with snapshots taken at the wrong time, and I'm unsure how good it is at detecting, and more importantly recovering from, corruption. This is one reason why AAE is so important. I do demos at my talks where I literally rm -rf the entire index dir and AAE rebuilds it from scratch.
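To make option 1 concrete, here is a minimal sketch of a file-level backup with plain tar. The paths and directory layout below are stand-ins, not real Riak defaults; on a live node you would point DATA_DIR at the node's platform data dir and stop the node (or use a filesystem-level snapshot) first, because of the mid-write hazard described above.

```shell
# Minimal file-level backup sketch. DATA_DIR/BACKUP_DIR and the
# leveldb/yz subdirectories are illustrative stand-ins for a node's
# real data layout, which varies by platform and configuration.
DATA_DIR=${DATA_DIR:-./demo-riak-data}
BACKUP_DIR=${BACKUP_DIR:-./demo-backups}
mkdir -p "$DATA_DIR/leveldb" "$DATA_DIR/yz" "$BACKUP_DIR"  # create the stand-in layout

STAMP=$(date +%Y%m%d%H%M%S)
ARCHIVE="$BACKUP_DIR/riak-data-$STAMP.tar.gz"

# Archive the whole data dir, preserving its top-level directory name.
tar -czf "$ARCHIVE" -C "$(dirname "$DATA_DIR")" "$(basename "$DATA_DIR")"
echo "wrote $ARCHIVE"
```

Note this deliberately grabs everything under the data dir; since the index postings can be rebuilt from KV anyway, you could just as well exclude the index files and save the space.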
Rebuilding from scratch will not necessarily be a fast operation on a real production database, but it's good to know that the data can always be rebuilt from the KV data. If you can cover the KV data, then you can always rebuild the indexes.

2. Backup/restore as a logical operation in Riak itself. We currently have a backup/restore command, but from what I hear it has various issues and needs to be fixed or replaced. Assuming there were a backup command that worked, I suppose you could try playing games with Yokozuna. Perhaps Yokozuna could freeze an index's segment merging and back up the important files. Perhaps there are replication hooks built into Solr/Lucene that could be used. I'm not sure. I'm handwaving on purpose because I'm sure there are multiple avenues to explore.

However, another option is to punt. As I said above, the indexes can be rebuilt from the KV data. So if you have a backup that only works for KV, then the restore operation would simply re-index the data as it is written. Yokozuna currently uses a low-level hook inside the KV vnode that notices any time KV data is written, so restore should "just work", assuming it goes through the KV code path and doesn't build files directly.

-Z
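P.S. Since the original question mentioned Solr's Replication Handler specifically: stock Solr does expose a backup command on that handler, even though Yokozuna doesn't wire it up. A hedged sketch of invoking it by hand follows; the index name and backup location are assumptions, 8093 is Yokozuna's internal Solr port, and the snippet only builds the request URL rather than assuming a live node.

```shell
# Sketch of Solr's ReplicationHandler backup command. SOLR_HOST, INDEX,
# and LOCATION are illustrative assumptions, not Yokozuna-managed values.
SOLR_HOST=${SOLR_HOST:-localhost:8093}   # Yokozuna's internal Solr port
INDEX=${INDEX:-my_index}                 # hypothetical index/core name
LOCATION=${LOCATION:-/var/backups/solr}  # hypothetical backup destination

URL="http://$SOLR_HOST/solr/$INDEX/replication?command=backup&location=$LOCATION"
echo "$URL"

# Against a live Solr node you would then fire the request:
#   curl "$URL"
```

Whether a backup taken this way behind Yokozuna's back would be coherent with the KV data is exactly the kind of open question I was handwaving about above.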
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
