Re: Code to clean up transaction logs needs snapshots before it works?

2019-08-14 Thread Ted Dunning
On Wed, Aug 14, 2019 at 5:35 AM Koen De Groote 
wrote:

> ...
> I've read that the log files are preallocated. I see them as being 65MB a
> piece.
>

Yes. Preallocation of logs is an important performance trick.

The point is that if a file doesn't change length when you write to it,
then the file attributes don't have to change. Writing file attributes can
be as expensive as writing the data itself. In any case it isn't free, and
it forces an ordering on the writes.
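
To make the idea concrete, here is a minimal sketch in Python. It is not
ZooKeeper's code: the 64 KB size, the file name and the helper names are
assumptions chosen for illustration. It creates a fixed-size file up front
and then overwrites bytes inside it, so the file length stays the same
across writes.

```python
import os
import tempfile

# Small stand-in for ZooKeeper's 64 MB block (assumption, for a quick demo).
PREALLOC_BYTES = 64 * 1024


def preallocate(path, size=PREALLOC_BYTES):
    """Create a file of `size` bytes so later writes need not grow it."""
    with open(path, "wb") as f:
        # Extends the file; on most systems the new area is zero-filled
        # and may be sparse rather than physically allocated.
        f.truncate(size)


def write_record(path, offset, payload):
    """Overwrite bytes inside the preallocated region; return the new offset."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())  # push the data to disk
    return offset + len(payload)


def demo():
    """Return the file size before and after two in-place writes."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "log.1")  # hypothetical file name
        preallocate(path)
        before = os.path.getsize(path)
        offset = write_record(path, 0, b"txn-1")
        write_record(path, offset, b"txn-2")
        after = os.path.getsize(path)
        return before, after
```

Calling demo() returns the same size twice, which is the property that
lets the file system skip a length update on every write.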


Re: Code to clean up transaction logs needs snapshots before it works?

2019-08-14 Thread Koen De Groote
All right, I'll try to go forward with that info. Thanks.


Re: Code to clean up transaction logs needs snapshots before it works?

2019-08-14 Thread Norbert Kalmar
Reads don't matter; no txn log entries are written for reads. Only
modifications (state changes to the data tree) are logged.

As for why all txn logs are 64MB:
"preAllocSize
(Java system property: zookeeper.preAllocSize)
To avoid seeks ZooKeeper allocates space in the transaction log file in
blocks of preAllocSize kilobytes. The default block size is 64M. One reason
for changing the size of the blocks is to reduce the block size if
snapshots are taken more often. (Also, see snapCount)."

Note that all transactions are written to one log file at a time. That's
why each file is 64MB by default.
I didn't check which version of the admin guide I quoted, but here is the
one from 3.4.13:

snapCount
ZooKeeper records its transactions using snapshots and a transaction log
(think write-ahead log). The number of transactions recorded in the
transaction log before a snapshot can be taken (and the transaction log
rolled) is determined by snapCount. In order to prevent all of the machines
in the quorum from taking a snapshot at the same time, each ZooKeeper
server will take a snapshot when the number of transactions in the
transaction log reaches a runtime generated random value in the
[snapCount/2+1, snapCount] range. The default snapCount is 100,000.

Without checking the code, I'm not sure what happens when preAllocSize is
filled (a new log file, I think), but after a restart a new log file will
be created. There is no way to continue the last one. That's why there
could be multiple log files, and no transaction. But if you haven't reached
the 100,000 transactions (writes, that is; reads don't count), there will
be no snapshot.
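
As a rough illustration of the snapCount passage quoted above, the sketch
below picks a threshold in the documented [snapCount/2+1, snapCount] range
and estimates how long a server would need to reach it. The function names
and the write rate are assumptions for illustration; this follows the
admin guide's wording and is not the server's actual code.

```python
import random

DEFAULT_SNAP_COUNT = 100_000  # default quoted from the admin guide


def snapshot_threshold(snap_count=DEFAULT_SNAP_COUNT, rng=random):
    """Pick a per-server threshold in [snap_count/2 + 1, snap_count]."""
    return rng.randint(snap_count // 2 + 1, snap_count)


def seconds_until_snapshot(writes_per_second, snap_count=DEFAULT_SNAP_COUNT):
    """Return (earliest, latest) seconds before the threshold can be hit.

    Only writes count; reads never add transactions to the log.
    """
    if writes_per_second <= 0:
        raise ValueError("with no writes the threshold is never reached")
    earliest = (snap_count // 2 + 1) / writes_per_second
    latest = snap_count / writes_per_second
    return earliest, latest
```

For example, at a hypothetical 10 writes per second,
seconds_until_snapshot(10) gives a window of roughly 83 minutes to 2.8
hours. A read-heavy workload with very few writes could run far longer
without producing a snapshot.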

Regards,
Norbert


Re: Code to clean up transaction logs needs snapshots before it works?

2019-08-14 Thread Koen De Groote
>Without a snapshot, we cannot delete the log files, as we would have no
>means of recovery. Applying the txn logs to a snapshot gives us back the
>state. Without a snapshot, all txn logs need to be "replayed" during
>recovery, so you need all the log files created since your last snapshot
>(in this case, all the txn logs, as there have been no snapshots yet).

Makes sense.

In the heaviest environment I'm hitting around 200 requests per second,
all of which have to get data from ZooKeeper. I'm not sure what the impact
is in terms of snapCount; I didn't set up the system myself and don't fully
grasp the internals.

As for snapCount, it hasn't been touched, so it will be at the default
in my environments.
I've read that the log files are preallocated. I see them as being 65MB
apiece.

Which makes me wonder: how many of those until the process hits 10
snaps?

Auto-cleaning is set up: retainCount=3, purgeInterval=1

Restarting the zookeeper process shouldn't affect this count, I think? It
doesn't happen often, though it might on test environments.





Re: Code to clean up transaction logs needs snapshots before it works?

2019-08-14 Thread Norbert Kalmar
Hi,

Without a snapshot, we cannot delete the log files, as we would have no
means of recovery. Applying the txn logs to a snapshot gives us back the
state. Without a snapshot, all txn logs need to be "replayed" during
recovery, so you need all the log files created since your last snapshot
(in this case, all the txn logs, as there have been no snapshots yet).
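
The recovery argument above can be sketched as a toy model in Python. The
data layout (a dict for state, tuples for transactions) and the function
name are assumptions for illustration and do not reflect ZooKeeper's
on-disk formats.

```python
def recover(snapshot, txn_log):
    """Rebuild state from a snapshot plus the transactions logged after it.

    snapshot: (last_zxid, state_dict), or None if no snapshot exists.
    txn_log:  list of (zxid, key, value) in zxid order; value None = delete.
    """
    if snapshot is None:
        last_zxid, state = 0, {}
    else:
        last_zxid, state = snapshot[0], dict(snapshot[1])
    for zxid, key, value in txn_log:
        if zxid <= last_zxid:
            continue  # already reflected in the snapshot
        if value is None:
            state.pop(key, None)
        else:
            state[key] = value
        last_zxid = zxid
    return state


# Three logged writes: create /a, create /b, delete /a.
full_log = [(1, "/a", "x"), (2, "/b", "y"), (3, "/a", None)]
snapshot_at_2 = (2, {"/a": "x", "/b": "y"})
```

With these values, recover(None, full_log) and
recover(snapshot_at_2, full_log[2:]) both yield {"/b": "y"}. Dropping the
first two log entries without a snapshot, as in recover(None, full_log[2:]),
yields an empty state, which is the data loss the purge code guards against.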

As for why there is no snapshot: what is your load? Per the admin guide:

"snapCount
(Java system property: zookeeper.snapCount)
ZooKeeper logs transactions to a transaction log. After snapCount
transactions are written to a log file a snapshot is started and a new
transaction log file is created. The default snapCount is 100,000."

By default there will be no auto-cleaning of the snapshot and log files.
Check the autopurge.snapRetainCount and autopurge.purgeInterval settings
for this.
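
To show how a retain count interacts with the snapshot requirement, here
is a simplified model in Python. It is a sketch under assumptions, not the
actual PurgeTxnLog implementation: files are represented only by the zxid
in their names, and the function name is hypothetical.

```python
def files_to_purge(snapshot_zxids, log_start_zxids, retain_count=3):
    """Return (snapshots, logs) that a retention policy could delete."""
    if retain_count < 1:
        raise ValueError("retain_count must be at least 1")
    if not snapshot_zxids:
        return [], []  # no snapshot: every log is still needed for recovery
    snaps = sorted(snapshot_zxids)
    kept = snaps[-retain_count:]
    oldest_kept = kept[0]
    # The newest log that starts at or before the oldest kept snapshot may
    # still hold transactions written after it, so it has to stay too.
    starts_before = [z for z in log_start_zxids if z <= oldest_kept]
    first_needed_log = max(starts_before) if starts_before else oldest_kept
    old_snaps = snaps[:-retain_count]
    old_logs = sorted(z for z in log_start_zxids if z < first_needed_log)
    return old_snaps, old_logs
```

With no snapshots at all, files_to_purge([], [1, 50, 90]) returns two empty
lists, which matches the behavior described in this thread: logs pile up
until the first snapshot exists.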

Regards,
Norbert



Code to clean up transaction logs needs snapshots before it works?

2019-08-14 Thread Koen De Groote
Greetings all.

I was debugging something and ran into this bit of code:
https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/PurgeTxnLog.java#L81

If I understand it correctly, it seems that this means log.x files will
only get deleted if there's a snapshot.

Which is troublesome, as my dataDir is filling up with log files but not a
single snapshot in sight.

1: Is this correct behavior? Both the logic of needing a snapshot and the
fact that no snapshots are being generated?

2: While there's no fix for this, what would be useful to know is: can
these log files be freely deleted? Or does the most recent one need to be
kept? How should these files be handled?

I thought that running "bin/zkCleanup.sh /data -n 3" would clean up both
the snapshots and the logs, but it appears that if there are no snapshots,
the logs aren't cleaned either.

What are my options here?

Kind regards,
Koen De Groote