Hello!
Thank you Michi and Flavio for all your valuable information!
I'm testing a setup where the transaction logs are on a ramdisk and the snapshots are written directly to disk.
It's working rather poorly.
My configuration is currently:
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
dataLogDir=/var/lib/zookeeper/logs
clientPort=2181
server.1=...
server.2=...
server.3=...
forceSync=no
skipACL=yes
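
For reference, a minimal sketch (plain Python, not part of ZooKeeper) of where a configuration like the one above places its files. It assumes the usual layout in which snapshots go under dataDir/version-2 and transaction logs under dataLogDir/version-2, which matches the snapshot path that shows up in the logs. The helper names parse_cfg and storage_dirs are made up for illustration.

```python
import posixpath

def parse_cfg(text):
    """Parse key=value lines of a zoo.cfg-style file into a dict."""
    cfg = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        cfg[key.strip()] = value.strip()
    return cfg

def storage_dirs(cfg):
    """Return the directories that hold snapshots and transaction logs.

    Assumption: files live in a "version-2" subdirectory, and the
    transaction logs fall back to dataDir when dataLogDir is not set.
    """
    data_dir = cfg["dataDir"]
    log_dir = cfg.get("dataLogDir", data_dir)
    return {
        "snapshots": posixpath.join(data_dir, "version-2"),
        "txn_logs": posixpath.join(log_dir, "version-2"),
    }

# The two directory settings from the configuration above.
cfg_text = """
dataDir=/var/lib/zookeeper
dataLogDir=/var/lib/zookeeper/logs
"""
dirs = storage_dirs(parse_cfg(cfg_text))
print(dirs)
```

Only the transaction-log directory needs to sit on the ramdisk for this split; the script does not check what is actually mounted there.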
ZooKeeper now dies roughly once an hour.
I have one setup with 3 servers running version 3.3.5 and one setup with 3
servers running version 3.4.5. Both setups do pretty much the same work and
both now fail with the same error.
This happens:
Server#1:
2015-04-22 06:25:53,653 - WARN
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when
following the leader
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:146)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at
org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
at
org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
at
org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
at
org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
at
org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
at
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
2015-04-22 06:25:53,654 - INFO
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown called
java.lang.Exception: shutdown Follower
at
org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
at
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)
... AFTER THAT I SEE ...
2015-04-22 06:25:56,076 - ERROR
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FileTxnSnapLog@210] - Parent
/BOSS/ICS_O_182008125_580916217_180121007/_xlock missing for
/BOSS/ICS_O_182008125_580916217_180121007/_xlock/lock-0000000003
2015-04-22 06:25:56,076 - ERROR
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumPeer@453] - Unable to load
database on disk
java.io.IOException: Failed to process transaction type: 1 error:
KeeperErrorCode = NoNode for /BOSS/ICS_O_182008125_580916217_180121007/_xlock
at
org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:153)
at
org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
at
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:417)
at
org.apache.zookeeper.server.quorum.QuorumPeer.getLastLoggedZxid(QuorumPeer.java:546)
at
org.apache.zookeeper.server.quorum.FastLeaderElection.getInitLastLoggedZxid(FastLeaderElection.java:690)
at
org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:737)
at
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:716)
Caused by: org.apache.zookeeper.KeeperException$NoNodeException:
KeeperErrorCode = NoNode for /BOSS/ICS_O_182008125_580916217_180121007/_xlock
at
org.apache.zookeeper.server.persistence.FileTxnSnapLog.processTransaction(FileTxnSnapLog.java:211)
at
org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:151)
... 6 more
2015-04-22 06:25:56,077 - WARN
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumPeer@718] - Unexpected exception
java.lang.RuntimeException: Unable to run quorum server
at
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:454)
at
org.apache.zookeeper.server.quorum.QuorumPeer.getLastLoggedZxid(QuorumPeer.java:546)
at
org.apache.zookeeper.server.quorum.FastLeaderElection.getInitLastLoggedZxid(FastLeaderElection.java:690)
at
org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:737)
at
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:716)
Caused by: java.io.IOException: Failed to process transaction type: 1 error:
KeeperErrorCode = NoNode for /BOSS/ICS_O_182008125_580916217_180121007/_xlock
at
org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:153)
at
org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
at
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:417)
... 4 more
Caused by: org.apache.zookeeper.KeeperException$NoNodeException:
KeeperErrorCode = NoNode for /BOSS/ICS_O_182008125_580916217_180121007/_xlock
at
org.apache.zookeeper.server.persistence.FileTxnSnapLog.processTransaction(FileTxnSnapLog.java:211)
at org.apache
Server#2:
2015-04-22 06:25:53,675 - WARN
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when
following the leader
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:146)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at
org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
at
org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
at
org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
at
org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
at
org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
at
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
2015-04-22 06:25:53,823 - INFO
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown called
java.lang.Exception: shutdown Follower
at
org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
at
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)
... LATER ...
2015-04-22 06:25:53,829 - INFO
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:FollowerZooKeeperServer@139] -
Shutting down
2015-04-22 06:25:53,829 - INFO
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@419] - shutting down
2015-04-22 06:25:53,829 - INFO
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:FollowerRequestProcessor@105] -
Shutting down
2015-04-22 06:25:53,829 - INFO
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:CommitProcessor@181] - Shutting down
2015-04-22 06:25:53,829 - INFO
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:FinalRequestProcessor@415] - shutdown
of request processor complete
2015-04-22 06:25:53,830 - INFO [CommitProcessor:2:CommitProcessor@150] -
CommitProcessor exited loop!
2015-04-22 06:25:53,830 - INFO
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:SyncRequestProcessor@175] - Shutting
down
2015-04-22 06:25:53,830 - WARN
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@218] - Ignoring
unexpected runtime exception
java.nio.channels.CancelledKeyException
at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
at sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:87)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:187)
at java.lang.Thread.run(Thread.java:701)
2015-04-22 06:25:53,830 - INFO [SyncThread:2:SyncRequestProcessor@155] -
SyncRequestProcessor exited!
2015-04-22 06:25:53,830 - INFO
[FollowerRequestProcessor:2:FollowerRequestProcessor@95] -
FollowerRequestProcessor exited loop!
2015-04-22 06:25:53,831 - INFO
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:QuorumPeer@670] - LOOKING
2015-04-22 06:25:53,835 - INFO
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:FileSnap@83] - Reading snapshot
/var/lib/zookeeper/version-2/snapshot.301015ebb
... LATER...
2015-04-22 06:25:56,970 - ERROR
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:FileTxnSnapLog@210] - Parent
/BOSS/ICS_O_182041092_581017710_180121007/_xlock missing for
/BOSS/ICS_O_182041092_581017710_180121007/_xlock/lock-0000000003
2015-04-22 06:25:56,971 - ERROR
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:QuorumPeer@453] - Unable to load
database on disk
java.io.IOException: Failed to process transaction type: 1 error:
KeeperErrorCode = NoNode for /BOSS/ICS_O_182041092_581017710_180121007/_xlock
at
org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:153)
at
org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
at
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:417)
at
org.apache.zookeeper.server.quorum.QuorumPeer.getLastLoggedZxid(QuorumPeer.java:546)
at
org.apache.zookeeper.server.quorum.FastLeaderElection.getInitLastLoggedZxid(FastLeaderElection.java:690)
at
org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:737)
at
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:716)
Caused by: org.apache.zookeeper.KeeperException$NoNodeException:
KeeperErrorCode = NoNode for /BOSS/ICS_O_182041092_581017710_180121007/_xlock
at
org.apache.zookeeper.server.persistence.FileTxnSnapLog.processTransaction(FileTxnSnapLog.java:211)
at
org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:151)
... 6 more
2015-04-22 06:25:56,971 - WARN
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:QuorumPeer@718] - Unexpected exception
java.lang.RuntimeException: Unable to run quorum server
at
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:454)
at
org.apache.zookeeper.server.quorum.QuorumPeer.getLastLoggedZxid(QuorumPeer.java:546)
at
org.apache.zookeeper.server.quorum.FastLeaderElection.getInitLastLoggedZxid(FastLeaderElection.java:690)
at
org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:737)
at
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:716)
Caused by: java.io.IOException: Failed to process transaction type: 1 error:
KeeperErrorCode = NoNode for /BOSS/ICS_O_182041092_581017710_180121007/_xlock
at
org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:153)
at
org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
at
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:417)
... 4 more
Caused by: org.apache.zookeeper.KeeperException$NoNodeException:
KeeperErrorCode = NoNode for /BOSS/ICS_O_182041092_581017710_180121007/_xlock
at
org.apache.zookeeper.server.persistence.FileTxnSnapLog.processTransaction(FileTxnSnapLog.java:211)
at
org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:151)
... 6 more
2015-04-22 06:25:56,972 - INFO
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:QuorumPeer@670] - LOOKING
2015-04-22 06:25:56,973 - INFO
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:FileSnap@83] - Reading snapshot
/var/lib/zookeeper/version-2/snapshot.301015ebb
....
Server#3:
2015-04-22 06:25:53,850 - INFO [ProcessThread(sid:3
cport:-1)::PrepRequestProcessor@476] - Processed session termination for
sessionid: 0x24cdd77c03d0355
2015-04-22 06:25:53,850 - ERROR
[LearnerHandler-/192.168.10.114:60067:LearnerHandler@562] - Unexpected
exception causing shutdown while sock still open
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at
org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
at
org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
at
org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
at
org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:476)
2015-04-22 06:25:53,851 - WARN
[Sender-/192.168.10.113:45055:LearnerHandler@153] - Unexpected exception at
LearnerHandler Socket[addr=/192.168.10.113,port=45055,localport=2888]
tickOfLastAck:15970 synced?:true queuedPacketLength:6
java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
at
java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at
org.apache.zookeeper.server.quorum.LearnerHandler.sendPackets(LearnerHandler.java:136)
at
org.apache.zookeeper.server.quorum.LearnerHandler.access$000(LearnerHandler.java:56)
at
org.apache.zookeeper.server.quorum.LearnerHandler$1.run(LearnerHandler.java:437)
2015-04-22 06:25:53,851 - ERROR
[LearnerHandler-/192.168.10.113:45055:LearnerHandler@562] - Unexpected
exception causing shutdown while sock still open
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at
org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
at
org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
at
org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
at
org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:476)
2015-04-22 06:25:53,852 - WARN
[LearnerHandler-/192.168.10.113:45055:LearnerHandler@575] - ******* GOODBYE
/192.168.10.113:45055 ********
2015-04-22 06:25:53,851 - WARN
[Sender-/192.168.10.114:60067:LearnerHandler@153] - Unexpected exception at
LearnerHandler Socket[addr=/192.168.10.114,port=60067,localport=2888]
tickOfLastAck:15970 synced?:true queuedPacketLength:1
java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
at
java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at
org.apache.zookeeper.server.quorum.LearnerHandler.sendPackets(LearnerHandler.java:136)
at
org.apache.zookeeper.server.quorum.LearnerHandler.access$000(LearnerHandler.java:56)
at
org.apache.zookeeper.server.quorum.LearnerHandler$1.run(LearnerHandler.java:437)
2015-04-22 06:25:53,851 - INFO [ProcessThread(sid:3
cport:-1)::PrepRequestProcessor@476] - Processed session termination for
sessionid: 0x34cdd77c0b30322
2015-04-22 06:25:53,851 - WARN
[LearnerHandler-/192.168.10.114:60067:LearnerHandler@575] - ******* GOODBYE
/192.168.10.114:60067 ********
2015-04-22 06:25:53,853 - INFO [ProcessThread(sid:3
cport:-1)::PrepRequestProcessor@476] - Processed session termination for
sessionid: 0x24cdd77c03d0352
2015-04-22 06:25:53,851 - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted
socket connection from /192.168.10.161:46385
2015-04-22 06:25:53,854 - INFO [ProcessThread(sid:3
cport:-1)::PrepRequestProcessor@476] - Processed session termination for
sessionid: 0x24cdd77c03d034e
2015-04-22 06:25:53,854 - WARN
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@349] - caught end of
stream exception
EndOfStreamException: Unable to read additional data from client sessionid
0x34cdd77c0b3031d, likely client has closed socket
at
org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:701)
Why would the "Broken pipe" occur? It happens every 30-60 minutes. And why
can't ZK read the snapshot? Is it probably due to forceSync=no?
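
To pin down the 30-60 minute interval more precisely, here is a minimal sketch in Python that pulls the timestamps of one message out of a server log and prints the gaps between occurrences. It assumes each entry sits on a single line in the real log file (the wrapping above comes from the mail client) and starts with a timestamp in the format shown in the excerpts. The function names are made up, and the third sample line uses an invented timestamp purely for illustration.

```python
import re
from datetime import datetime

# Timestamp prefix as in the excerpts above, e.g. "2015-04-22 06:25:53,653 - ".
TS = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3}) - ")

def event_times(lines, needle):
    """Return the timestamps of log lines that contain `needle`."""
    times = []
    for line in lines:
        m = TS.match(line)
        if m and needle in line:
            t = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
            times.append(t.replace(microsecond=int(m.group(2)) * 1000))
    return times

def gaps_in_minutes(times):
    """Minutes elapsed between consecutive events."""
    return [(b - a).total_seconds() / 60 for a, b in zip(times, times[1:])]

# Sample input: thread names are shortened and the last timestamp is
# HYPOTHETICAL, it does not come from the real logs.
sample = [
    "2015-04-22 06:25:53,653 - WARN  [QuorumPeer[myid=1]/...:Follower@89] - Exception when following the leader",
    "2015-04-22 06:25:53,654 - INFO  [QuorumPeer[myid=1]/...:Follower@166] - shutdown called",
    "2015-04-22 07:10:53,653 - WARN  [QuorumPeer[myid=1]/...:Follower@89] - Exception when following the leader",
]
gaps = gaps_in_minutes(event_times(sample, "Exception when following the leader"))
print(gaps)
```

Running the same search on all three servers, with `sample` replaced by the lines of each log file, would show whether the followers drop off at the same moments.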
Thank you for all your suggestions.
Kind regards,
Dejan Markic
________________________________________
From: Michi Mutsuzaki [[email protected]]
Sent: Friday, April 17, 2015 9:51 PM
To: [email protected]
Cc: Flavio Junqueira
Subject: Re: Transaction logs and snapshots
Hi Dejan,
I had a similar use case: no durability requirement / virtualized (ESX)
environment. We saw intermittent session expiry, so we ended up
setting forceSync to false. It's been working well since then.
http://zookeeper.apache.org/doc/trunk/zookeeperAdmin.html#Unsafe+Options
On Thu, Apr 16, 2015 at 10:08 PM, Dejan Markic
<[email protected]> wrote:
> Hello Flavio!
>
> When we were testing ZooKeeper, we saw high IOPS, and since we don't care
> about data durability, we simply moved it to a ramdisk. All ZK servers run on
> virtual machines (some Hyper-V, some VMware). So yes, in the end, any high
> IOPS can be problematic.
> So I guess my only solution at the moment is to increase the ramdisk to
> accommodate the logs/snapshots.
> I've just had another idea ... ZK uses only the log file while running, right?
> That's where all the IOPS are happening? Is there a way to put the active log
> on the ramdisk, and snapshots and old logs in another directory?
> I don't know why I put snapshots on the ramdisk ... if I understand correctly,
> snapshots are simply written when needed, right? I know I can put snapshots
> in another directory (e.g. directly on disk) and it will not cause constant
> IOPS, right?
>
> Thank you and kind regards,
> Dejan Markic
> ________________________________________
> From: Flavio Junqueira [[email protected]]
> Sent: Thursday, April 16, 2015 11:26 PM
> To: Dejan Markic
> Cc: [email protected]
> Subject: Re: Transaction logs and snapshots
>
> Distributed locks are indeed part of our bread and butter. Why don't you want
> to write to disk? Your workload doesn't seem to be heavy. Does the IO traffic
> compete with some other traffic you have?
>
> -Flavio
>
>> On 16 Apr 2015, at 22:15, Dejan Markic <[email protected]> wrote:
>>
>> Hello Flavio!
>>
>> Yes, indeed, ZK might not be the best option, but I could not find anything
>> better. What we need is a rather fast, distributed locking "system". ZK was
>> the best option at the time, and after testing it seemed to be the thing
>> we are looking for. Other than snapshots/transaction logs, we have no
>> problems. It easily handles our current load. It has a C library, which makes
>> it fairly easy to port it to other software.
>> What we need (but I cannot find one) is a distributed, in-memory
>> locking system where we can store some small pieces of information.
>> For instance, we use ZK nodes as /SESSION_ID ... we lock it there, and then
>> we use e.g. /SESSION_ID/my_var to store something. After the session is gone,
>> we remove this node and all information about it.
>>
>> If you have any idea about what kind of software we should try, please let
>> me know. You've helped me enough already!
>>
>> Thank you and kind regards,
>> Dejan Markic
>> ________________________________________
>> From: Flavio Junqueira [[email protected]]
>> Sent: Thursday, April 16, 2015 10:29 PM
>> To: Dejan Markic
>> Cc: [email protected]
>> Subject: Re: Transaction logs and snapshots
>>
>> Another thing you could do is to make snapCount very large so that snapshots
>> are created infrequently. But, let me step back and ask you why you think ZK
>> is a good fit for your project. It isn't clear to me that your case is a
>> good one for ZK.
>>
>> -Flavio
>>
>>
>>> On 16 Apr 2015, at 11:01, Dejan Markic <[email protected]> wrote:
>>>
>>> Hello!
>>>
>>> The log always seems to be 67,108,880 bytes.
>>> Snapshots are currently between 30 and 40 MB. A snapshot is created almost
>>> every minute.
>>> Yes, data durability is not important at all. Once the session ends (it may
>>> last between 0 and a few minutes, average around 1-2 minutes maybe), I don't
>>> need it anymore. I regularly remove nodes that have not changed for more
>>> than 10 minutes.
>>> I even receive updates for sessions, so even if ZK loses data, I would get
>>> it back after a few minutes.
>>>
>>> Thanks!
>>>
>>> Kind regards,
>>> Dejan
>>>
>>>
>>> -----Original Message-----
>>> From: Flavio Junqueira [mailto:[email protected]]
>>> Sent: Thursday, April 16, 2015 11:49 AM
>>> To: [email protected]
>>> Subject: Re: Transaction logs and snapshots
>>>
>>> Hi Dejan,
>>> For a typical ZK application, granularity of hours is more than enough,
>>> since it is supposed to be an infrequent background task. In your case, it
>>> sounds like durability isn't an important property because if it is you
>>> shouldn't be getting rid of disk data this fast. I'm also wondering about
>>> the amount of data you're generating. What's the size of your snapshots and
>>> txn logs?
>>> -Flavio
>>>
>>>
>>> On Thursday, April 16, 2015 10:26 AM, Dejan Markic
>>> <[email protected]> wrote:
>>>
>>>
>>>
>>> Hello Flavio!
>>>
>>> Would that mean that zkCleanup.sh would not be needed?
>>> The minimum purgeInterval is 1 hour? Why is it so high?
>>>
>>> Thanks!
>>>
>>> Kind regards,
>>> Dejan Markic
>>>
>>>
>>> -----Original Message-----
>>> From: Flavio Junqueira [mailto:[email protected]]
>>> Sent: Thursday, April 16, 2015 11:15 AM
>>> To: [email protected]
>>> Subject: Re: Transaction logs and snapshots
>>>
>>> Hi Dejan,
>>> Check if the autopurge feature solves your problem:
>>> http://zookeeper.apache.org/doc/r3.4.6/zookeeperAdmin.html#sc_advancedConfiguration
>>>
>>> -Flavio
>>>
>>>
>>> On Thursday, April 16, 2015 9:17 AM, Dejan Markic
>>> <[email protected]> wrote:
>>>
>>>
>>>
>>> Hello all!
>>>
>>> We are running 3 ZK servers in an ensemble, and ZK is processing a lot of
>>> commands per second. There are probably around 300 nodes
>>> created/checked/set/get per second.
>>> Since ZK only holds information about the live sessions we handle, we
>>> don't need any data persistence - e.g. we can stop all nodes, clean all
>>> transaction logs/snapshots, and start them up again without any issues.
>>> Since we have a lot of requests/changes, we have moved dataDir onto
>>> a ramdisk, so we have no problems with disk IOPS, etc.
>>> Is there a way to minimize the usage of snapshots/logs so the ramdisk would
>>> not get filled up? It happens that transaction logs/snapshots grow so large
>>> that we run out of space on the ramdisk.
>>> We issue >/usr/share/zookeeper/bin/zkCleanup.sh -n 3< every 2 minutes, so
>>> this should clean up the dataDir quite often. Why is >count number of
>>> snapshots/logs to keep< limited to 3 and not below?
>>> I assume that, in my setup, I don't even need snapshots/logs to be stored
>>> once they are no longer actively needed?
>>> So my basic questions are:
>>> - Can I somehow get rid of snapshots/logs sooner or more often?
>>> - When is a snapshot created? Can it be created sooner, so it would be
>>> smaller?
>>> - Is it possible to get rid of snapshots/logs altogether?
>>>
>>> Thank you for all your inputs and kind regards, Dejan Markic
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>