Re: Understanding ZooKeeper data file management and LogFormatter

2010-11-01 Thread Vishal Kher
Hi Mahadev,

I submitted some small fixes to PurgeTxnLog in ZOOKEEPER-872. Can you or
someone else take a look at them?

Thanks.
-Vishal



On Mon, Sep 13, 2010 at 5:39 PM, Mahadev Konar wrote:

> Hi Vishal,
> Usually the default retention policy is safe enough for operations. The
> admin guide at
>
> http://hadoop.apache.org/zookeeper/docs/r3.1.1/zookeeperAdmin.html
>
> gives you an overview of how to use the purging library in ZooKeeper.
>
> Thanks
> mahadev


Re: Understanding ZooKeeper data file management and LogFormatter

2010-09-13 Thread Mahadev Konar
Hi Vishal,
Usually the default retention policy is safe enough for operations. The
admin guide at

http://hadoop.apache.org/zookeeper/docs/r3.1.1/zookeeperAdmin.html

gives you an overview of how to use the purging library in ZooKeeper.

Thanks
mahadev
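
As a rough illustration of what a keep-the-last-N retention policy does with
the data directory, here is a minimal Python sketch. It is not the PurgeTxnLog
implementation. It only assumes that snapshots are named snapshot.<zxid> and
logs log.<zxid>, with the zxid in lowercase hex. The function name
files_to_keep and the default retain=3 are made up for the example.

```python
import re


def files_to_keep(names, retain=3):
    """Return the set of file names a keep-last-N policy would retain.

    Hypothetical helper for illustration only; not part of ZooKeeper.
    """
    def zxid(name):
        # The suffix after the first dot is assumed to be a hex zxid.
        return int(name.split(".", 1)[1], 16)

    snaps = sorted(
        (n for n in names if re.fullmatch(r"snapshot\.[0-9a-f]+", n)), key=zxid
    )
    logs = sorted(
        (n for n in names if re.fullmatch(r"log\.[0-9a-f]+", n)), key=zxid
    )

    kept_snaps = snaps[-retain:]
    if not kept_snaps:
        # No snapshot to anchor on, so be conservative and keep everything.
        return set(names)

    oldest = zxid(kept_snaps[0])
    # Logs that start after the oldest retained snapshot are always needed.
    kept_logs = [n for n in logs if zxid(n) > oldest]
    # The newest log that starts at or before that snapshot may still contain
    # transactions committed after it, so it is kept as well.
    older = [n for n in logs if zxid(n) <= oldest]
    if older:
        kept_logs.append(older[-1])
    return set(kept_snaps) | set(kept_logs)
```

Anything in the directory that is not in the returned set would be a candidate
for deletion under this policy. Try it on a copy of the data directory before
trusting it anywhere near production files.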





Re: Understanding ZooKeeper data file management and LogFormatter

2010-09-08 Thread Ted Dunning
Due to issues in my fingers and brain.

On Wed, Sep 8, 2010 at 1:20 PM, Vishal K  wrote:

> Thanks Ted.  Did you have to unwind the cluster due to data consistency
> issues or due to issues in the application?


Re: Understanding ZooKeeper data file management and LogFormatter

2010-09-08 Thread Vishal K
Thanks Ted.  Did you have to unwind the cluster due to data consistency
issues or due to issues in the application?

On Wed, Sep 8, 2010 at 4:06 PM, Ted Dunning  wrote:

> I have used old snapshot files exactly once when I deleted a bunch of
> server state trying to unwind a tangled cluster.
>
> I keep a few around just for backup purposes.


Re: Understanding ZooKeeper data file management and LogFormatter

2010-09-08 Thread Ted Dunning
I have used old snapshot files exactly once when I deleted a bunch of server
state trying to unwind a tangled cluster.

I keep a few around just for backup purposes.



Understanding ZooKeeper data file management and LogFormatter

2010-09-08 Thread Vishal K
Hi All,

Can you please share your experience regarding ZK snapshot retention and
recovery policies?

We have an application where we never need to roll back (i.e., revert to
a previous state by using old snapshots). Given this, I am trying to
understand under what circumstances we would ever need to use old ZK
snapshots. I understand a lot of these decisions depend on the application
and the amount of redundancy used at every level (e.g., the RAID level where
the snapshots are stored) in the product. To simplify the discussion, I
would like to rule out any application characteristics and focus mainly on
data consistency.

- Assuming that we have a 3-node cluster, I am trying to figure out when
I would really need to use old snapshot files. With 3 nodes we already have
at least 2 servers with a consistent database. If I lose files on one of the
servers, I can use files from another. In fact, the ZK server join will take
care of this. I can remove the files from a faulty node and reboot that node.
The faulty node will sync with the leader.

- The old files will be useful if the current snapshot and/or log files are
lost or corrupted on all 3 servers. If the loss is due to a disaster (the case
where we lose all 3 servers), one would have to keep the snapshots on some
external storage to recover. However, if the current snapshot file is
corrupted on all 3 servers, then the most likely cause would be a bug in ZK.
In that case, how can I trust the consistency of the old snapshots?

- Given a set of snapshots and log files, how can I verify the correctness
of these files? For example, what if one of the intermediate snapshot files
is corrupt?

- The Admin's guide says "Using older log and snapshot files, you can look
at the previous state of ZooKeeper servers and even restore that state. The
LogFormatter class allows an administrator to look at the transactions in a
log." Is there a tool that does this for the admin? The LogFormatter
only displays the transactions in the log file.

- Has anyone ever had to play with the snapshot files in production?

Thanks in advance.

Regards,
-Vishal
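
On the question of verifying log files, one possible starting point is to
re-check the per-entry checksums in a transaction log. The Python sketch below
assumes the following on-disk layout, which should be confirmed against the
FileTxnLog source for the ZooKeeper version in use: a 16-byte header (int
magic "ZKLG", int version, long dbid, all big-endian), followed by entries of
the form long Adler-32 checksum, int length, payload bytes, and a trailing
0x42 byte, with zero padding after the last entry. The name check_txn_log is
made up for the example, and snapshot files are not covered.

```python
import struct
import zlib

# Assumed magic number: the ASCII bytes "ZKLG" read as a big-endian int.
MAGIC = struct.unpack(">i", b"ZKLG")[0]
HEADER_SIZE = 16  # int magic + int version + long dbid (assumed layout)


def check_txn_log(data):
    """Return (number of valid entries, error message or None).

    Hypothetical checker for illustration; it only validates framing and
    checksums and does not decode the transactions themselves.
    """
    if len(data) < HEADER_SIZE:
        return 0, "file shorter than the 16-byte header"
    magic, _version, _dbid = struct.unpack_from(">iiq", data, 0)
    if magic != MAGIC:
        return 0, "bad magic number"

    pos, count = HEADER_SIZE, 0
    while pos + 12 <= len(data):
        crc, length = struct.unpack_from(">qi", data, pos)
        if length == 0:
            break  # zero padding: no more written entries
        body_start = pos + 12
        body_end = body_start + length
        if length < 0 or body_end + 1 > len(data):
            return count, "truncated entry at offset %d" % pos
        body = data[body_start:body_end]
        if zlib.adler32(body) & 0xFFFFFFFF != crc:
            return count, "checksum mismatch at offset %d" % pos
        if data[body_end] != 0x42:
            return count, "missing end-of-record byte at offset %d" % pos
        pos = body_end + 1
        count += 1
    return count, None
```

Reading a log file in binary mode and passing its bytes to check_txn_log gives
a count of entries that passed and the first problem found, if any. A clean
result shows only that the entries were not damaged on disk. It says nothing
about whether the transactions are logically consistent with a given snapshot.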