What’s the name of the utility?

 

From: Christopher [mailto:[email protected]] 
Sent: Tuesday, October 3, 2017 2:01 PM
To: [email protected]
Subject: Re: Backup and Recovery

 

Oh, sorry, no. That's not the case. I did not mean to mislead. You also need to 
back up the metadata from ZooKeeper for a complete backup. We have a utility 
for that, which I believe is mentioned in the documentation. If not, that's a 
documentation bug and we should add it. (Sorry, unable to check at the moment, 
but please file a bug if you can't find it.)

On Tue, Oct 3, 2017 at 4:47 PM <[email protected]> wrote:

So if I back up HDFS, I have a backup of Accumulo? There isn't any other data 
that I'd need to grab?

 

From: Christopher [mailto:[email protected]]
Sent: Tuesday, October 3, 2017 1:41 PM
To: [email protected]
Subject: Re: Backup and Recovery

 

Hi Mike. This is a great question. Accumulo has several options for backup.

Accumulo is backed by HDFS for persisting its data on disk. It may be possible 
to use S3 directly at this layer. I'm not sure what the current state is for 
doing something like this, but a brief Googling for "HDFS on S3" shows a few 
historical projects which may still be active and mature.

Accumulo also has a replication feature to automatically mirror live ingest to 
a pluggable external receiver, which could be a backup service you've written 
to store data in S3. Recovery would depend on how you store the data in S3. You 
could also implement an ingest system which stores data to a backup as well as 
to Accumulo, to handle both live and bulk ingest.
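As a rough illustration of that last idea, here is a minimal Python sketch of an ingest wrapper that writes each batch to a backup sink and then to the primary store. Everything in it (Record, Sink, ListSink, DualWriter) is a hypothetical name made up for this example, not part of Accumulo; in practice the two sinks would wrap something like an S3 client and an Accumulo writer.

```python
from typing import Iterable, List, Protocol, Tuple

# Hypothetical record shape: (row, column family, column qualifier, value).
Record = Tuple[str, str, str, str]


class Sink(Protocol):
    """Anything that can accept a batch of records (hypothetical interface)."""

    def write(self, records: List[Record]) -> None: ...


class ListSink:
    """In-memory stand-in for a real sink, used here only for illustration."""

    def __init__(self) -> None:
        self.records: List[Record] = []

    def write(self, records: List[Record]) -> None:
        self.records.extend(records)


class DualWriter:
    """Writes every batch to the backup sink first, then to the primary sink."""

    def __init__(self, backup: Sink, primary: Sink) -> None:
        self.backup = backup
        self.primary = primary

    def write(self, records: Iterable[Record]) -> int:
        batch = list(records)
        # Backup first: if this raises, nothing has reached the primary yet,
        # so the primary never holds data that the backup is missing.
        self.backup.write(batch)
        self.primary.write(batch)
        return len(batch)


backup, primary = ListSink(), ListSink()
writer = DualWriter(backup, primary)
written = writer.write([("row1", "cf", "cq", "v1"), ("row2", "cf", "cq", "v2")])
```

Writing to the backup first is only one possible ordering. It means a failure partway through can leave the backup with records the primary lacks, so a retry of the same batch has to be safe for both sinks.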

Accumulo also has an "exporttable" feature, which exports the metadata for a 
table, along with a list of files in HDFS for you to back up to S3 (or another 
file system). Recovery involves using the "importtable" feature which recreates 
the metadata, and bulk importing the files after you've moved them from your 
backup location back onto HDFS.
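To make the export/import route a bit more concrete, here is a small Python sketch that only builds the command sequence; it does not run anything. The shell commands follow the table export example in the Accumulo documentation as I remember it (the table has to be offline for the export, so a clone is exported instead of the live table). The function names, the "_exp" suffix, and the paths and bucket are made up for illustration, and copying straight to an s3a:// URI assumes your Hadoop installation is set up for S3 access.

```python
from typing import List


def export_backup_commands(table: str, export_dir: str, backup_dir: str) -> List[str]:
    """Build the steps to export one table and copy its files to a backup location."""
    clone = f"{table}_exp"  # hypothetical name for the temporary clone
    return [
        # Accumulo shell: clone the table so the original can stay online.
        f"clonetable {table} {clone}",
        # Accumulo shell: the table being exported must be offline.
        f"offline {clone}",
        # Accumulo shell: writes the metadata and a distcp.txt file list to export_dir.
        f"exporttable -t {clone} {export_dir}",
        # OS shell: copy every file listed in distcp.txt to the backup location.
        f"hadoop distcp -f {export_dir}/distcp.txt {backup_dir}",
    ]


def restore_commands(backup_dir: str, import_dir: str, new_table: str) -> List[str]:
    """Build the steps to copy a backup onto HDFS and import it as a new table."""
    return [
        # OS shell: assumes import_dir does not exist yet, so the backup's
        # contents land directly inside it.
        f"hadoop distcp {backup_dir} {import_dir}",
        # Accumulo shell: recreates the table from the exported metadata and files.
        f"importtable {new_table} {import_dir}",
    ]


backup_steps = export_backup_commands(
    "table1", "/tmp/table1_export", "s3a://my-bucket/table1"
)
restore_steps = restore_commands(
    "s3a://my-bucket/table1", "/tmp/table1_import", "table1_copy"
)
```

The clone can be deleted once the copy has finished. None of this covers the ZooKeeper metadata or any other tables, so treat it as a per-table backup only.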

This is just a rough outline of 3 possible solutions. I don't know which (if 
any) would match your requirements best. There may be many other solutions as 
well.

On Tue, Oct 3, 2017 at 4:10 PM <[email protected]> wrote:

Please forgive the newbie question. What options are there for backup and 
recovery of Accumulo data?

Ideally I would like something that would replicate to S3 in real time.
