I'm a bit late to this one, but I don't understand the need for a complete backup of ZK data. In my experience, 99% of ZNodes are ephemeral, so it would be wrong to restore them: in a disaster, the client sessions would have expired, and you would not want their ephemeral nodes restored. This is why I took the approach of selective restore in Exhibitor.
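
To make the idea concrete, here is a minimal sketch of how a restore could skip ephemeral nodes. It is not Exhibitor's actual code. It assumes the backup has already been read into simple in-memory records; the `ZNode` class and the `restore_plan` helper are hypothetical names for illustration. The one ZooKeeper fact it relies on is that a node's `ephemeralOwner` stat field is 0 when the node is not ephemeral.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ZNode:
    """Hypothetical in-memory record of one backed-up znode."""
    path: str
    data: bytes
    ephemeral_owner: int  # owning session id; 0 means the node is persistent


def restore_plan(nodes: Iterable[ZNode]) -> List[ZNode]:
    """Return only the persistent nodes, ordered so parents precede children."""
    persistent = [n for n in nodes if n.ephemeral_owner == 0]
    # Shallower paths first, so each parent is created before its children.
    return sorted(persistent, key=lambda n: (n.path.count("/"), n.path))


backup = [
    ZNode("/locks/lock-0001", b"", ephemeral_owner=0x1234),  # dead session
    ZNode("/config/db", b"host=db1", ephemeral_owner=0),
    ZNode("/locks", b"", ephemeral_owner=0),
    ZNode("/config", b"", ephemeral_owner=0),
]

for node in restore_plan(backup):
    print(node.path)
```

Because ephemeral nodes cannot have children, dropping them never leaves a persistent node without its parent.
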
-Jordan

On Jul 19, 2013, at 11:00 AM, jack ma <[email protected]> wrote:

> I asked these questions in the thread
> http://mail-archives.apache.org/mod_mbox/zookeeper-user/201307.mbox/%3cCAB+cfdwhOV0JfB04=MpO_+i-4ou=VbL=eg2xs557+j+698j...@mail.gmail.com%3e,
> but there was no response.
>
> So I am posting the questions again here; hopefully I can get help
> from the community.
>
> I want to make sure I fully understand the procedures for ZooKeeper
> backup and disaster recovery:
>
> For the backup procedure on a ZooKeeper ensemble:
> (1) Log in to any host whose state is "Serving".
>     Question:
>         Do I have to log in to the leader node, or is any node OK?
> (2) Copy the latest snapshot file and transaction log from the version-2 directory.
>     Question:
>         How do we make sure we do not copy corrupt files if the
>         snapshot/transaction log is in the middle of an update? Do we have to
>         shut down the node to make the copy?
>         Besides the transaction log and snapshot, do we have to
>         copy other files, such as the epoch files?
>
> For the disaster recovery procedure on a ZooKeeper ensemble:
> (1) Recreate the machines for the ZooKeeper ensemble.
> (2) Copy the snapshot/transaction log we backed up into the ZooKeeper
>     dataDir\version-2 and logDir\version-2.
>     Question:
>         Do we have to copy the epoch files?
>         Do we have to copy the backed-up snapshot/transaction log to
>         all the ZooKeeper nodes, or just the first node we start?
>
> Appreciate your time and help.
> Jack
