We're doing a version of that at Salesforce (we have our own M/R jobs, but the principle is the same). Soon we'll run the backup M/R job over a snapshot for performance reasons, but even then the principle is the same.
Specifically, we're keeping 48h worth of live data in HBase itself (TTL=48h, MIN_VERSIONS=1, KEEP_DELETED_CELLS=true), and every night we run the jobs as of 2h in the past (rounded to an exact hour boundary).

I think it's time I write an updated blog post. We plan to eventually open source the tools we've written.

-- Lars

________________________________
From: Timo Schaepe <[email protected]>
To: "[email protected]" <[email protected]>
Sent: Monday, December 23, 2013 10:53 AM
Subject: Consistent Backup strategy

Hey guys,

we are searching for a consistent backup strategy with the export tool. Is this article still up to date, and can I use it?

http://hadoop-hbase.blogspot.com/2012/04/timestamp-consistent-backups-in-hbase.html

Thanks for answers.

cheers,
Timo
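
For reference, the "as of 2h in the past, rounded to an exact hour boundary" cutoff from the reply can be sketched in a few lines of Python. This is only an illustration, not the Salesforce tooling: the function names are hypothetical, rounding *down* to the hour is an assumption (the reply only says "rounded"), and so is the 24h window between nightly runs.

```python
from datetime import datetime, timedelta, timezone

LAG = timedelta(hours=2)          # run "as of 2h in the past" (from the reply)
RETENTION = timedelta(hours=48)   # TTL=48h (from the reply)
INTERVAL = timedelta(hours=24)    # assumption: one run per night covers 24h


def backup_cutoff(now: datetime) -> datetime:
    """Return now minus the lag, rounded down to an exact hour boundary."""
    lagged = now - LAG
    return lagged.replace(minute=0, second=0, microsecond=0)


def backup_window(now: datetime) -> tuple[datetime, datetime]:
    """Return (start, end) of the window for tonight's run.

    Assumption: the previous run ended exactly INTERVAL before this cutoff.
    """
    end = backup_cutoff(now)
    return end - INTERVAL, end


def to_hbase_millis(ts: datetime) -> int:
    """HBase cell timestamps are milliseconds since the Unix epoch."""
    return round(ts.timestamp() * 1000)


if __name__ == "__main__":
    start, end = backup_window(datetime.now(timezone.utc))
    print(to_hbase_millis(start), to_hbase_millis(end))
```

The two printed values could serve as the start and end time arguments of a time-range export, provided the start of the window is still younger than the 48h retention so that no cells in the range have expired.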
