Hi Mahadev,
I had submitted some small fixes to PurgeTxnLog in ZOOKEEPER-872
(https://issues.apache.org/jira/browse/ZOOKEEPER-872). Can you or
someone else take a look at it?
Thanks.
-Vishal
On Mon, Sep 13, 2010 at 5:39 PM, Mahadev Konar maha...@yahoo-inc.com wrote:
Hi Vishal,
The default retention policy is usually safe enough for operations.
The admin guide at
http://hadoop.apache.org/zookeeper/docs/r3.1.1/zookeeperAdmin.html
gives you an overview of how to use the purging library in ZooKeeper.
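For illustration only, the kind of decision a retention policy has to make
can be sketched in Python as below. This is not the PurgeTxnLog
implementation. The names parse_zxid and plan_retention, the keep
parameter, and the rule for which logs to retain are assumptions made for
this sketch. It also assumes that data files are named snapshot.<zxid> and
log.<zxid> with a hexadecimal zxid suffix. The function only plans; it
deletes nothing.

```python
import re

# Assumption: ZooKeeper data files are named snapshot.<zxid> and log.<zxid>,
# where <zxid> is written in hexadecimal. Other names are ignored.
_NAME = re.compile(r"^(snapshot|log)\.([0-9a-fA-F]+)$")


def parse_zxid(name):
    """Return (kind, zxid) for a recognised file name, or None otherwise."""
    m = _NAME.match(name)
    if m is None:
        return None
    return m.group(1), int(m.group(2), 16)


def plan_retention(names, keep=3):
    """Split recognised file names into (retain, purge) lists.

    Hypothetical rule for this sketch: keep the `keep` newest snapshots,
    every log that starts after the oldest retained snapshot, and the one
    newest log that starts at or before it (that log may still hold
    transactions newer than the snapshot).
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")

    parsed = [(n, parse_zxid(n)) for n in names]
    known = [n for n, p in parsed if p is not None]
    snaps = sorted(
        ((p[1], n) for n, p in parsed if p and p[0] == "snapshot"),
        reverse=True,
    )
    logs = sorted((p[1], n) for n, p in parsed if p and p[0] == "log")

    kept_snaps = snaps[:keep]
    if not kept_snaps:
        # Without any snapshot there is no safe cut-off, so purge nothing.
        return sorted(known), []

    oldest = kept_snaps[-1][0]
    retain = {n for _, n in kept_snaps}
    older_logs = [n for z, n in logs if z <= oldest]
    if older_logs:
        retain.add(older_logs[-1])
    retain.update(n for z, n in logs if z > oldest)

    purge = sorted(n for n in known if n not in retain)
    return sorted(retain), purge
```

Called on the file names from a data directory listing, plan_retention
returns the files that would stay and the files that would be candidates
for removal, which can be reviewed before running the real purge tool.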
Thanks
mahadev
On 9/8/10 12:01 PM, Vishal K vishalm...@gmail.com wrote:
Hi All,
Can you please share your experience regarding ZK snapshot retention and
recovery policies?
We have an application where we never need to roll back (i.e., revert to
a previous state by using old snapshots). Given this, I am trying to
understand under what circumstances we would ever need to use old ZK
snapshots. I understand that a lot of these decisions depend on the
application and on the amount of redundancy used at every level of the
product (e.g., the RAID level of the storage where the snapshots are
kept). To simplify the discussion, I would like to rule out any
application characteristics and focus mainly on data consistency.
- Assuming that we have a 3-node cluster, I am trying to figure out when
I would really need to use old snapshot files. With 3 nodes we already
have at least 2 servers with a consistent database. If I lose files on
one of the servers, I can use files from the other. In fact, the ZK
server join will take care of this: I can remove the files from a faulty
node and reboot that node, and the faulty node will sync with the leader.
- The old files will be useful if the current snapshot and/or log files
are lost or corrupted on all 3 servers. If the loss is due to a disaster
(the case where we lose all 3 servers), one would have to keep the
snapshots on some external storage to recover. However, if the current
snapshot file is corrupted on all 3 servers, then the most likely cause
would be a bug in ZK. In that case, how can I trust the consistency of
the old snapshots?
- Given a set of snapshots and log files, how can I verify the
correctness of these files? For example, what if one of the intermediate
snapshot files is corrupt?
- The Admin's guide says: "Using older log and snapshot files, you can
look at the previous state of ZooKeeper servers and even restore that
state. The LogFormatter class allows an administrator to look at the
transactions in a log." Is there a tool that does this for the admin?
The LogFormatter only displays the transactions in the log file.
- Has anyone ever had to play with the snapshot files in production?
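On the question above about verifying a set of snapshot and log files, a
very shallow first check can be sketched in Python as below. It assumes
that uncompressed snapshot files begin with the 4-byte magic "ZKSN" and
transaction log files with "ZKLG"; treat those values, the MAGIC table,
and the check_magic helper as assumptions of this sketch, not as a
documented interface. Passing this check does not show that a file is
consistent. It only catches files that are empty, truncated at the start,
or not ZooKeeper data files at all.

```python
import os

# Assumption: the first 4 bytes of a file identify its type.
MAGIC = {"snapshot": b"ZKSN", "log": b"ZKLG"}


def check_magic(path):
    """Return True if the file starts with the magic expected for its name.

    The expected magic is chosen from the file name prefix
    (snapshot.<zxid> or log.<zxid>). Raises ValueError for other names.
    """
    kind = os.path.basename(path).split(".", 1)[0]
    expected = MAGIC.get(kind)
    if expected is None:
        raise ValueError("not a snapshot or log file name: %r" % path)
    with open(path, "rb") as f:
        # A short or empty file yields fewer than 4 bytes and fails the test.
        return f.read(4) == expected
```

A deeper check would have to read the whole file the way the server does,
which this sketch does not attempt.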
Thanks in advance.
Regards,
-Vishal