Hi All,
We've recently encountered a strange situation on a small test cluster: after
an awkward crash, our ZooKeeper data was erased and we no longer have the
[accumulo] znode. The HDFS accumulo directory is intact, so all the RFiles etc.
are still there, but it's not clear how best to bring Accumulo back up to
its previous state. Simply starting Accumulo as-is complains about the
missing znode ("Waiting for accumulo to be initialized"), whereas
re-initializing is not possible over the existing HDFS directories ("It appears
this location was previously initialized, exiting").
A couple of questions about recovery strategies:
1) Is there any way to re-create the znode for a previous instance-id? My
understanding is that ZK is mostly used to store ephemeral data (such as which
tserver is currently responsible for which tablets) and things like users
(which we could re-create), so perhaps this is plausible?
2) I imagine that I could init Accumulo with a new instance.dfs.dir, then
import the RFiles from the old installation back in. I see Patrick just asked a
related question, so, with the data integrity caveats, I would essentially be
following the last of the steps in ACCUMULO-456.
3) This is a vague question, but have any of you had experience with the
[accumulo] znode being entirely deleted? Aside from stopping/starting ZK
(3.3.5) and Accumulo 1.4.0 (possibly with a force-quit), I'm not sure what we
could have done to actually delete it.
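To make (2) a bit more concrete, here is a rough Python sketch of the first
step I have in mind: grouping the old RFiles by table so they can be imported
one table at a time. It is only an illustration, not a tested procedure. It
assumes the old files are laid out as
<instance.dfs.dir>/tables/<table-id>/<tablet-dir>/<file>.rf and that the path
list comes from a recursive HDFS listing. The function name, the
/accumulo/tables root and the sample paths are all made up for the example.

```python
from collections import defaultdict


def group_rfiles_by_table(paths, tables_root="/accumulo/tables"):
    """Group RFile paths by the table-ID directory they sit under.

    Assumes (unverified) the layout <tables_root>/<table-id>/<tablet-dir>/<file>.rf.
    Paths that do not match that shape are ignored.
    """
    root = tables_root.rstrip("/") + "/"
    grouped = defaultdict(list)
    for path in paths:
        # Skip anything outside the tables directory or that is not an RFile.
        if not path.startswith(root) or not path.endswith(".rf"):
            continue
        parts = path[len(root):].split("/")
        # Expect exactly: table-id, tablet directory, file name.
        if len(parts) != 3:
            continue
        table_id = parts[0]
        grouped[table_id].append(path)
    return dict(grouped)


if __name__ == "__main__":
    # Hypothetical listing, e.g. collected from a recursive HDFS ls.
    sample = [
        "/accumulo/tables/1/default_tablet/F0000001.rf",
        "/accumulo/tables/1/t-0000002/F0000003.rf",
        "/accumulo/tables/2/default_tablet/F0000004.rf",
        "/accumulo/tables/2/default_tablet/notes.txt",
    ]
    for table_id, files in sorted(group_rfiles_by_table(sample).items()):
        print(table_id, len(files))
```

As far as I understand, the mapping from table ID to table name lived in
ZooKeeper, so I would still have to work out by hand which ID corresponds to
which table before importing anything.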
This is just a test instance, and the data could easily be recreated, but I
want to take this opportunity to learn a little more about Accumulo plumbing
and maintenance.
Thanks,
Krishmin