> Actually, I think a more simple approach will get what we want here:
> Give hbase a custom filesystem which writes to hdfs, then to s3, but
> reads just from hdfs.

+1 :) That would be fantastic for making geographically distributed
live-backups.

-- Jim R. Wilson (jimbojw)

On Mon, May 5, 2008 at 11:41 AM, stack <[EMAIL PROTECTED]> wrote:
> Clint Morgan wrote:
> > Actually, I think a more simple approach will get what we want here:
> > Give hbase a custom filesystem which writes to hdfs, then to s3, but
> > reads just from hdfs.
>
> That's an interesting idea Clint. What would you call it? (hdfs3?)
>
> ...
>
> > On Thu, May 1, 2008 at 5:14 PM, Clint Morgan <[EMAIL PROTECTED]> wrote:
> > > > What do you need for HBASE-50? Is it sufficient forcing the cluster
> > > > to go read-only, flushing all in memory while the copy runs?
> > >
> > > Hopefully we can minimize the time we are read-only. We'd like the
> > > system to behave as close to normally as possible while snapshotting.
> > > Is the only danger of allowing writes that some new hstores may get
> > > written so that we don't get a consistent view? Could we solve that by
> > > only copying files that were created before the time when flushing
> > > completed? Or just taking a listing at that point and only copying
> > > from this listing?
>
> HBase also removes files (for example, after a compaction it'll remove the
> old files, or after a split is done with its parent, the parent is
> removed). You could make a manifest that had all files at the time of
> snapshot, but we'd have to do something like rename files that are up for
> deletion, adding a '.deleted', or use something like hard links -- does
> this exist in hdfs? -- so that deletes would still be available to the copy
> task when it gets around to the copy.
>
> > > It would be nice for hbase to provide the orchestration when we need
> > > to restore from a snapshot. (Taking regions offline, copying over the
> > > appropriate parts of region and META, etc.)
>
> Yes. A tool that ran the 'fix' from backup and that verified and repaired
> the install.
>
> > > At this point I'm still not sure what types of failures to plan for.
> > > Any input on the sort of failures we should expect w.r.t data loss and
> > > corruption? Obviously name node failure, which we would handle with a
> > > secondary name node. We should be able to recover from that just by
> > > bringing hdfs back online. So I guess the main concern is that we get
> > > a corrupted hdfs file.
>
> Can you tolerate empty write-ahead-logs? That is, loss of the in-memory
> content on regionserver crash, because we don't yet have HADOOP-1700?
>
> Otherwise, from what I've seen, failures generally come of our having
> problems writing hdfs.
>
> St.Ack
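
A minimal sketch of the dual-write filesystem Clint describes, assuming
Hadoop's FileSystem/FilterFileSystem API: writes are mirrored to a second
filesystem (e.g. s3), reads and everything else fall through to hdfs. The
class DualWriteFileSystem, the tee stream, and how it would be wired into
hbase's configuration are hypothetical, not existing code.

import java.io.IOException;
import java.io.OutputStream;

import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FilterFileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;
import org.apache.hadoop.util.Progressable;

// Writes go to hdfs and are duplicated to the mirror (s3); reads and all
// other operations are delegated to hdfs by FilterFileSystem.
public class DualWriteFileSystem extends FilterFileSystem {
  private final FileSystem mirror;  // e.g. an already-initialized s3 FileSystem

  public DualWriteFileSystem(FileSystem hdfs, FileSystem s3) {
    super(hdfs);      // open(), listStatus(), etc. go to hdfs only
    this.mirror = s3;
  }

  @Override
  public FSDataOutputStream create(Path f, FsPermission permission,
      boolean overwrite, int bufferSize, short replication, long blockSize,
      Progressable progress) throws IOException {
    FSDataOutputStream primary = fs.create(f, permission, overwrite,
        bufferSize, replication, blockSize, progress);
    FSDataOutputStream secondary = mirror.create(f, permission, overwrite,
        bufferSize, replication, blockSize, progress);
    // Tee every byte to both destinations; closing the stream closes both.
    return new FSDataOutputStream(new TeeOutputStream(primary, secondary), null);
  }

  private static class TeeOutputStream extends OutputStream {
    private final OutputStream a, b;
    TeeOutputStream(OutputStream a, OutputStream b) { this.a = a; this.b = b; }
    @Override public void write(int c) throws IOException { a.write(c); b.write(c); }
    @Override public void write(byte[] buf, int off, int len) throws IOException {
      a.write(buf, off, len); b.write(buf, off, len);
    }
    @Override public void flush() throws IOException { a.flush(); b.flush(); }
    @Override public void close() throws IOException { a.close(); b.close(); }
  }
}

Registering a URI scheme for it so that hbase.rootdir could point at it is
left out here; deletes and renames would also need mirroring if the s3 copy
is meant to stay an exact replica rather than an append-only backup.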

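A rough sketch of the manifest-plus-rename idea stack raises (hdfs has no
hard links, so files slated for removal are renamed instead of deleted so an
in-progress copy can still reach them). Again this assumes Hadoop's
FileSystem API; SnapshotManifest, its method names, and the '.deleted'
suffix handling are illustrative only, not actual hbase code.

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

public class SnapshotManifest {

  // Record everything under the table directory at the moment the flush
  // completes; only these files belong to the snapshot.
  public static List<Path> takeListing(FileSystem fs, Path tableDir)
      throws IOException {
    List<Path> manifest = new ArrayList<Path>();
    for (FileStatus status : fs.listStatus(tableDir)) {
      if (status.isDirectory()) {
        manifest.addAll(takeListing(fs, status.getPath()));
      } else {
        manifest.add(status.getPath());
      }
    }
    return manifest;
  }

  // Instead of deleting a compacted-away or post-split file outright,
  // rename it so the copy task can still find it.
  public static void markDeleted(FileSystem fs, Path file) throws IOException {
    fs.rename(file, file.suffix(".deleted"));
  }

  // The copy task works from the manifest: take the live file if it is
  // still there, otherwise fall back to its '.deleted' rename.
  // (Destination layout is flattened here for brevity.)
  public static void copy(FileSystem src, List<Path> manifest, FileSystem dst,
      Path destDir) throws IOException {
    for (Path p : manifest) {
      Path source = src.exists(p) ? p : p.suffix(".deleted");
      FileUtil.copy(src, source, dst, new Path(destDir, p.getName()),
          false /* don't delete source */, src.getConf());
    }
  }
}

Cleanup of the '.deleted' files once the copy finishes would be the
snapshot's job, not the compaction's.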