What’s the name of the utility?
From: Christopher [mailto:[email protected]]
Sent: Tuesday, October 3, 2017 2:01 PM
To: [email protected]
Subject: Re: Backup and Recovery

Oh, sorry, no. That's not the case. I did not mean to mislead. You also need to back up the metadata from ZooKeeper for a complete backup. We have a utility for that, which I believe is mentioned in the documentation. If not, that's a documentation bug and we should add it. (Sorry, unable to check at the moment, but please file a bug if you can't find it.)

On Tue, Oct 3, 2017 at 4:47 PM <[email protected]> wrote:

So if I back up HDFS, I have a backup of Accumulo? There isn't any other data that I'd need to grab?

From: Christopher [mailto:[email protected]]
Sent: Tuesday, October 3, 2017 1:41 PM
To: [email protected]
Subject: Re: Backup and Recovery

Hi Mike.

This is a great question. Accumulo has several options for backup.

Accumulo is backed by HDFS for persisting its data on disk. It may be possible to use S3 directly at this layer. I'm not sure what the current state is for doing something like this, but a brief Googling for "HDFS on S3" shows a few historical projects which may still be active and mature.

Accumulo also has a replication feature to automatically mirror live ingest to a pluggable external receiver, which could be a backup service you've written to store data in S3. Recovery would depend on how you store the data in S3. You could also implement an ingest system which stores data to a backup as well as to Accumulo, to handle both live and bulk ingest.

Accumulo also has an "exporttable" feature, which exports the metadata for a table, along with a list of files in HDFS for you to back up to S3 (or another file system). Recovery involves using the "importtable" feature, which recreates the metadata, and then bulk importing the files after you've moved them from your backup location back onto HDFS.

This is just a rough outline of 3 possible solutions. I don't know which (if any) would match your requirements best. There may be many other solutions as well.

On Tue, Oct 3, 2017 at 4:10 PM <[email protected]> wrote:

Please forgive the newbie question. What options are there for backup and recovery of Accumulo data? Ideally I would like something that would replicate to S3 in real time.
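
For anyone following along, the "exporttable" / "importtable" route mentioned in the thread can be sketched roughly as below. This is only an illustration, not something taken from the thread: it assumes the table is taken offline before export, that exporttable writes a file list named distcp.txt into the export directory, and that the cluster can write to an s3a:// URI. The table name, directories, bucket, and helper function names are all hypothetical. The script only builds the command strings so they can be reviewed before running them by hand.

```python
import shlex


def export_shell_commands(table, export_dir):
    """Accumulo shell commands to take a table offline and export it.

    Assumption: the table must be offline for exporttable to succeed.
    """
    return [
        f"offline -t {table}",
        f"exporttable -t {table} {export_dir}",
    ]


def distcp_command(export_dir, backup_uri):
    """Hadoop command that copies the files listed by the export.

    Assumption: exporttable leaves a distcp.txt file list in export_dir.
    """
    listing = export_dir.rstrip("/") + "/distcp.txt"
    return "hadoop distcp -f " + shlex.quote(listing) + " " + shlex.quote(backup_uri)


def import_shell_command(new_table, import_dir):
    """Accumulo shell command to recreate a table from restored files."""
    return f"importtable {new_table} {import_dir}"


if __name__ == "__main__":
    # Hypothetical names, chosen only for illustration.
    for cmd in export_shell_commands("events", "/tmp/events_export"):
        print("accumulo shell>", cmd)
    print("$", distcp_command("/tmp/events_export", "s3a://example-bucket/events"))
    print("accumulo shell>", import_shell_command("events_restored", "/tmp/events_import"))
```

For recovery, the backed-up files would first be copied from the backup location back onto HDFS (here the hypothetical /tmp/events_import) before running the importtable command. As noted earlier in the thread, this covers table data and table metadata only; the ZooKeeper metadata still needs to be backed up separately.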
