Hello!

Thank you Michi and Flavio for all your valuable information!
I'm testing a setup where the transaction logs are on a ramdisk and the snapshots are written directly to disk.
It's working rather poorly.

My setup is like this currently:

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
dataLogDir=/var/lib/zookeeper/logs
clientPort=2181
server.1=...
server.2=...
server.3=...
forceSync=no
skipACL=yes
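
For reference, a minimal sketch of what the tick-based settings above work out to in milliseconds. It relies on the documented meaning of initLimit and syncLimit (both are counted in ticks of tickTime). The helper names parse_cfg and quorum_timeouts_ms are made up for this example, and linking the "Read timed out" in the logs below to the syncLimit window is only my assumption.

```python
# Sketch only: convert tick-based ZooKeeper limits into milliseconds.
# parse_cfg and quorum_timeouts_ms are hypothetical helper names.

def parse_cfg(text):
    """Parse key=value lines in zoo.cfg style, skipping blanks and comments."""
    cfg = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        cfg[key.strip()] = value.strip()
    return cfg


def quorum_timeouts_ms(cfg):
    """initLimit and syncLimit are expressed in ticks of tickTime (ms)."""
    tick = int(cfg["tickTime"])
    return {
        "init_timeout_ms": tick * int(cfg["initLimit"]),
        "sync_timeout_ms": tick * int(cfg["syncLimit"]),
    }


# Values copied from the configuration above.
SAMPLE = """tickTime=2000
initLimit=10
syncLimit=5
"""

timeouts = quorum_timeouts_ms(parse_cfg(SAMPLE))
print(timeouts)  # {'init_timeout_ms': 20000, 'sync_timeout_ms': 10000}
```

So with these values a follower has 20 seconds to connect and sync initially, and 10 seconds of allowed lag afterwards.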

ZooKeeper now dies roughly every hour.
I have one setup with 3 servers running version 3.3.5 and one setup with 3 
servers running version 3.4.5. Both versions are doing pretty much the same 
stuff and both receive the same error now.

This happens:
Server#1:
2015-04-22 06:25:53,653 - WARN  
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when 
following the leader
java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:146)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
        at java.io.DataInputStream.readInt(DataInputStream.java:387)
        at 
org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
        at 
org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
        at 
org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
        at 
org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
        at 
org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
        at 
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
2015-04-22 06:25:53,654 - INFO  
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown called
java.lang.Exception: shutdown Follower
        at 
org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
        at 
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)
... AFTER THAT I SEE ...
2015-04-22 06:25:56,076 - ERROR 
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FileTxnSnapLog@210] - Parent 
/BOSS/ICS_O_182008125_580916217_180121007/_xlock missing for 
/BOSS/ICS_O_182008125_580916217_180121007/_xlock/lock-0000000003
2015-04-22 06:25:56,076 - ERROR 
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumPeer@453] - Unable to load 
database on disk
java.io.IOException: Failed to process transaction type: 1 error: 
KeeperErrorCode = NoNode for /BOSS/ICS_O_182008125_580916217_180121007/_xlock
        at 
org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:153)
        at 
org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
        at 
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:417)
        at 
org.apache.zookeeper.server.quorum.QuorumPeer.getLastLoggedZxid(QuorumPeer.java:546)
        at 
org.apache.zookeeper.server.quorum.FastLeaderElection.getInitLastLoggedZxid(FastLeaderElection.java:690)
        at 
org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:737)
        at 
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:716)
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: 
KeeperErrorCode = NoNode for /BOSS/ICS_O_182008125_580916217_180121007/_xlock
        at 
org.apache.zookeeper.server.persistence.FileTxnSnapLog.processTransaction(FileTxnSnapLog.java:211)
        at 
org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:151)
        ... 6 more
2015-04-22 06:25:56,077 - WARN  
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumPeer@718] - Unexpected exception
java.lang.RuntimeException: Unable to run quorum server
        at 
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:454)
        at 
org.apache.zookeeper.server.quorum.QuorumPeer.getLastLoggedZxid(QuorumPeer.java:546)
        at 
org.apache.zookeeper.server.quorum.FastLeaderElection.getInitLastLoggedZxid(FastLeaderElection.java:690)
        at 
org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:737)
        at 
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:716)
Caused by: java.io.IOException: Failed to process transaction type: 1 error: 
KeeperErrorCode = NoNode for /BOSS/ICS_O_182008125_580916217_180121007/_xlock
        at 
org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:153)
        at 
org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
        at 
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:417)
        ... 4 more
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: 
KeeperErrorCode = NoNode for /BOSS/ICS_O_182008125_580916217_180121007/_xlock
        at 
org.apache.zookeeper.server.persistence.FileTxnSnapLog.processTransaction(FileTxnSnapLog.java:211)
        at org.apache

Server#2:
2015-04-22 06:25:53,675 - WARN  
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when 
following the leader
java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:146)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
        at java.io.DataInputStream.readInt(DataInputStream.java:387)
        at 
org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
        at 
org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
        at 
org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
        at 
org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
        at 
org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
        at 
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
2015-04-22 06:25:53,823 - INFO  
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown called
java.lang.Exception: shutdown Follower
        at 
org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
        at 
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)
... LATER ...
2015-04-22 06:25:53,829 - INFO  
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:FollowerZooKeeperServer@139] - 
Shutting down
2015-04-22 06:25:53,829 - INFO  
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@419] - shutting down
2015-04-22 06:25:53,829 - INFO  
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:FollowerRequestProcessor@105] - 
Shutting down
2015-04-22 06:25:53,829 - INFO  
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:CommitProcessor@181] - Shutting down
2015-04-22 06:25:53,829 - INFO  
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:FinalRequestProcessor@415] - shutdown 
of request processor complete
2015-04-22 06:25:53,830 - INFO  [CommitProcessor:2:CommitProcessor@150] - 
CommitProcessor exited loop!
2015-04-22 06:25:53,830 - INFO  
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:SyncRequestProcessor@175] - Shutting 
down
2015-04-22 06:25:53,830 - WARN  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@218] - Ignoring 
unexpected runtime exception
java.nio.channels.CancelledKeyException
        at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
        at sun.nio.ch.SelectionKeyImpl.readyOps(SelectionKeyImpl.java:87)
        at 
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:187)
        at java.lang.Thread.run(Thread.java:701)
2015-04-22 06:25:53,830 - INFO  [SyncThread:2:SyncRequestProcessor@155] - 
SyncRequestProcessor exited!
2015-04-22 06:25:53,830 - INFO  
[FollowerRequestProcessor:2:FollowerRequestProcessor@95] - 
FollowerRequestProcessor exited loop!
2015-04-22 06:25:53,831 - INFO  
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:QuorumPeer@670] - LOOKING
2015-04-22 06:25:53,835 - INFO  
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:FileSnap@83] - Reading snapshot 
/var/lib/zookeeper/version-2/snapshot.301015ebb
... LATER...
2015-04-22 06:25:56,970 - ERROR 
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:FileTxnSnapLog@210] - Parent 
/BOSS/ICS_O_182041092_581017710_180121007/_xlock missing for 
/BOSS/ICS_O_182041092_581017710_180121007/_xlock/lock-0000000003
2015-04-22 06:25:56,971 - ERROR 
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:QuorumPeer@453] - Unable to load 
database on disk
java.io.IOException: Failed to process transaction type: 1 error: 
KeeperErrorCode = NoNode for /BOSS/ICS_O_182041092_581017710_180121007/_xlock
        at 
org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:153)
        at 
org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
        at 
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:417)
        at 
org.apache.zookeeper.server.quorum.QuorumPeer.getLastLoggedZxid(QuorumPeer.java:546)
        at 
org.apache.zookeeper.server.quorum.FastLeaderElection.getInitLastLoggedZxid(FastLeaderElection.java:690)
        at 
org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:737)
        at 
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:716)
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: 
KeeperErrorCode = NoNode for /BOSS/ICS_O_182041092_581017710_180121007/_xlock
        at 
org.apache.zookeeper.server.persistence.FileTxnSnapLog.processTransaction(FileTxnSnapLog.java:211)
        at 
org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:151)
        ... 6 more
2015-04-22 06:25:56,971 - WARN  
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:QuorumPeer@718] - Unexpected exception
java.lang.RuntimeException: Unable to run quorum server
        at 
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:454)
        at 
org.apache.zookeeper.server.quorum.QuorumPeer.getLastLoggedZxid(QuorumPeer.java:546)
        at 
org.apache.zookeeper.server.quorum.FastLeaderElection.getInitLastLoggedZxid(FastLeaderElection.java:690)
        at 
org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:737)
        at 
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:716)
Caused by: java.io.IOException: Failed to process transaction type: 1 error: 
KeeperErrorCode = NoNode for /BOSS/ICS_O_182041092_581017710_180121007/_xlock
        at 
org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:153)
        at 
org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
        at 
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:417)
        ... 4 more
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: 
KeeperErrorCode = NoNode for /BOSS/ICS_O_182041092_581017710_180121007/_xlock
        at 
org.apache.zookeeper.server.persistence.FileTxnSnapLog.processTransaction(FileTxnSnapLog.java:211)
        at 
org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:151)
        ... 6 more
2015-04-22 06:25:56,972 - INFO  
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:QuorumPeer@670] - LOOKING
2015-04-22 06:25:56,973 - INFO  
[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:FileSnap@83] - Reading snapshot 
/var/lib/zookeeper/version-2/snapshot.301015ebb
....


Server#3:

2015-04-22 06:25:53,850 - INFO  [ProcessThread(sid:3 
cport:-1)::PrepRequestProcessor@476] - Processed session termination for 
sessionid: 0x24cdd77c03d0355
2015-04-22 06:25:53,850 - ERROR 
[LearnerHandler-/192.168.10.114:60067:LearnerHandler@562] - Unexpected 
exception causing shutdown while sock still open
java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:392)
        at 
org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
        at 
org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
        at 
org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
        at 
org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:476)
2015-04-22 06:25:53,851 - WARN  
[Sender-/192.168.10.113:45055:LearnerHandler@153] - Unexpected exception at 
LearnerHandler Socket[addr=/192.168.10.113,port=45055,localport=2888] 
tickOfLastAck:15970 synced?:true queuedPacketLength:6
java.net.SocketException: Broken pipe
        at java.net.SocketOutputStream.socketWrite0(Native Method)
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
        at 
java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
        at 
org.apache.zookeeper.server.quorum.LearnerHandler.sendPackets(LearnerHandler.java:136)
        at 
org.apache.zookeeper.server.quorum.LearnerHandler.access$000(LearnerHandler.java:56)
        at 
org.apache.zookeeper.server.quorum.LearnerHandler$1.run(LearnerHandler.java:437)
2015-04-22 06:25:53,851 - ERROR 
[LearnerHandler-/192.168.10.113:45055:LearnerHandler@562] - Unexpected 
exception causing shutdown while sock still open
java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:392)
        at 
org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
        at 
org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
        at 
org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
        at 
org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:476)
2015-04-22 06:25:53,852 - WARN  
[LearnerHandler-/192.168.10.113:45055:LearnerHandler@575] - ******* GOODBYE 
/192.168.10.113:45055 ********
2015-04-22 06:25:53,851 - WARN  
[Sender-/192.168.10.114:60067:LearnerHandler@153] - Unexpected exception at 
LearnerHandler Socket[addr=/192.168.10.114,port=60067,localport=2888] 
tickOfLastAck:15970 synced?:true queuedPacketLength:1
java.net.SocketException: Broken pipe
        at java.net.SocketOutputStream.socketWrite0(Native Method)
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
        at 
java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
        at 
org.apache.zookeeper.server.quorum.LearnerHandler.sendPackets(LearnerHandler.java:136)
        at 
org.apache.zookeeper.server.quorum.LearnerHandler.access$000(LearnerHandler.java:56)
        at 
org.apache.zookeeper.server.quorum.LearnerHandler$1.run(LearnerHandler.java:437)
2015-04-22 06:25:53,851 - INFO  [ProcessThread(sid:3 
cport:-1)::PrepRequestProcessor@476] - Processed session termination for 
sessionid: 0x34cdd77c0b30322
2015-04-22 06:25:53,851 - WARN  
[LearnerHandler-/192.168.10.114:60067:LearnerHandler@575] - ******* GOODBYE 
/192.168.10.114:60067 ********
2015-04-22 06:25:53,853 - INFO  [ProcessThread(sid:3 
cport:-1)::PrepRequestProcessor@476] - Processed session termination for 
sessionid: 0x24cdd77c03d0352
2015-04-22 06:25:53,851 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted 
socket connection from /192.168.10.161:46385
2015-04-22 06:25:53,854 - INFO  [ProcessThread(sid:3 
cport:-1)::PrepRequestProcessor@476] - Processed session termination for 
sessionid: 0x24cdd77c03d034e
2015-04-22 06:25:53,854 - WARN  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@349] - caught end of 
stream exception
EndOfStreamException: Unable to read additional data from client sessionid 
0x34cdd77c0b3031d, likely client has closed socket
        at 
org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
        at 
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
        at java.lang.Thread.run(Thread.java:701)

Why would the "broken pipe" occur? It happens every 30-60 minutes. And why can't ZK 
read the snapshot? Is it probably due to forceSync=no?
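
To pin down the 30-60 minute interval more precisely, here is a small sketch that pulls the timestamps of a given message out of a ZooKeeper log and prints the gaps between them. It assumes one log entry per line in the timestamp format shown above (the entries in this mail are wrapped by the mail client). The function names and the command-line usage are made up for this example.

```python
# Sketch only: measure the time between recurring log events.
# Assumes unwrapped entries starting with "YYYY-MM-DD HH:MM:SS,mmm - LEVEL".
import re
import sys
from datetime import datetime

STAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} - ")


def event_times(lines, needle):
    """Return the timestamps of entries whose text contains needle."""
    times = []
    for line in lines:
        match = STAMP.match(line)
        if match and needle in line:
            times.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S"))
    return times


def gaps_minutes(times):
    """Return the gap in minutes between each pair of consecutive timestamps."""
    return [(b - a).total_seconds() / 60 for a, b in zip(times, times[1:])]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (the log path is whatever your installation writes to):
    #   python gaps.py /path/to/zookeeper.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        stamps = event_times(handle, "Exception when following the leader")
    for gap in gaps_minutes(stamps):
        print(f"{gap:.1f} min")
```

Running it on each server's log and comparing the gaps against the snapshot times might show whether the failures line up with snapshot writes.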

Thank you for all your suggestions.

Kind regards,
Dejan Markic
________________________________________
From: Michi Mutsuzaki [[email protected]]
Sent: Friday, April 17, 2015 9:51 PM
To: [email protected]
Cc: Flavio Junqueira
Subject: Re: Transaction logs and snapshots

Hi Dejan,

I had a similar use case: no durability requirement / virtualized (ESX)
environment. We saw intermittent session expiry, so we ended up
setting forceSync to false. It's been working well since then.

http://zookeeper.apache.org/doc/trunk/zookeeperAdmin.html#Unsafe+Options

On Thu, Apr 16, 2015 at 10:08 PM, Dejan Markic
<[email protected]> wrote:
> Hello Flavio!
>
> When we were testing ZooKeeper, we saw high IOPS - and since we don't care 
> about data durability, we simply moved it to a ramdisk. All ZK servers are 
> running on virtual machines (some Hyper-V, some VMware). So yes, in the end, 
> any high IOPS can be problematic.
> So I guess my only solution at the moment is to increase the ramdisk to 
> accommodate the logs/snapshots.
> I've just had another idea ... ZK uses only the log file while running, right? 
> That's where all the IOPS are happening? Is there a way to put the active log 
> on the ramdisk, and snapshots and old logs in another directory?
> Don't know why I put snapshots on ramdisk ... if I understand correctly, 
> snapshots are simply written when needed, right? I know I can put snapshots 
> to another directory (eg to disk directly) and it will not cause constant 
> IOPS, right?
>
> Thank you and kind regards,
> Dejan Markic
> ________________________________________
> From: Flavio Junqueira [[email protected]]
> Sent: Thursday, April 16, 2015 11:26 PM
> To: Dejan Markic
> Cc: [email protected]
> Subject: Re: Transaction logs and snapshots
>
> Distributed locks are indeed part of our bread and butter. Why don't you want 
> to write to disk? Your workload doesn't seem to be heavy. Does the IO traffic 
> compete with some other traffic you have?
>
> -Flavio
>
>> On 16 Apr 2015, at 22:15, Dejan Markic <[email protected]> wrote:
>>
>> Hello Flavio!
>>
>> Yes, indeed, ZK might not be the best option - but I could not find anything 
>> better. What we need is a rather fast, distributed locking "system". ZK was 
>> at the moment the best option, and after testing it seemed to be the thing 
>> we are looking for. Other than snapshots/transaction logs, we have no 
>> problems. It easily handles our current load. It has a C library, which makes 
>> it fairly easy to port it to other software.
>> What we need (but I cannot find one) is a distributed in-memory 
>> locking system where we can store some small pieces of information.
>> For instance, we use ZK's nodes as /SESSION_ID ... we lock it here, and then 
>> we use e.g. /SESSION_ID/my_var to store something. After the session is gone, 
>> we remove this node and all information about it.
>>
>> If you have any idea about what kind of software we should try, please let 
>> me know. You've helped me enough already!
>>
>> Thank you and kind regards,
>> Dejan Markic
>> ________________________________________
>> From: Flavio Junqueira [[email protected]]
>> Sent: Thursday, April 16, 2015 10:29 PM
>> To: Dejan Markic
>> Cc: [email protected]
>> Subject: Re: Transaction logs and snapshots
>>
>> Another thing you could do is to make snapCount very large so that snapshots 
>> are created infrequently. But, let me step back and ask you why you think ZK 
>> is a good fit for your project. It isn't clear to me that your case is a 
>> good one for ZK.
>>
>> -Flavio
>>
>>
>>> On 16 Apr 2015, at 11:01, Dejan Markic <[email protected]> wrote:
>>>
>>> Hello!
>>>
>>> The log always seems to be 67,108,880 bytes.
>>> Snapshots are currently between 30-40MB. A snapshot is created almost every 
>>> minute.
>>> Yes, data durability is not important at all. Once the session ends (it may 
>>> last between 0 and a few minutes, average around 1-2 minutes maybe), I don't 
>>> need it anymore. I regularly remove nodes that have not changed for more 
>>> than 10 minutes.
>>> I even receive updates for sessions, so even if ZK loses data, I would get 
>>> it back after a few minutes.
>>>
>>> Thanks!
>>>
>>> Kind regards,
>>> Dejan
>>>
>>>
>>> -----Original Message-----
>>> From: Flavio Junqueira [mailto:[email protected]]
>>> Sent: Thursday, April 16, 2015 11:49 AM
>>> To: [email protected]
>>> Subject: Re: Transaction logs and snapshots
>>>
>>> Hi Dejan,
>>> For a typical ZK application, granularity of hours is more than enough, 
>>> since it is supposed to be an infrequent background task. In your case, it 
>>> sounds like durability isn't an important property because if it is you 
>>> shouldn't be getting rid of disk data this fast. I'm also wondering about 
>>> the amount of data you're generating. What's the size of your snapshots and 
>>> txn logs?
>>> -Flavio
>>>
>>>
>>>    On Thursday, April 16, 2015 10:26 AM, Dejan Markic 
>>> <[email protected]> wrote:
>>>
>>>
>>>
>>> Hello Flavio!
>>>
>>> Would that mean that zkCleanup.sh would not be needed?
>>> The minimum purgeInterval is 1 hour? Why is it so high?
>>>
>>> Thanks!
>>>
>>> Kind regards,
>>> Dejan Markic
>>>
>>>
>>> -----Original Message-----
>>> From: Flavio Junqueira [mailto:[email protected]]
>>> Sent: Thursday, April 16, 2015 11:15 AM
>>> To: [email protected]
>>> Subject: Re: Transaction logs and snapshots
>>>
>>> Hi Dejan,
>>> Check if the autopurge feature solves your problem:
>>> http://zookeeper.apache.org/doc/r3.4.6/zookeeperAdmin.html#sc_advancedConfiguration
>>>
>>> -Flavio
>>>
>>>
>>>    On Thursday, April 16, 2015 9:17 AM, Dejan Markic 
>>> <[email protected]> wrote:
>>>
>>>
>>>
>>> Hello all!
>>>
>>> We are running 3 ZK servers in an ensemble, and ZK is processing a lot of 
>>> commands per second. There are probably around 300 nodes 
>>> created/checked/set/get per second.
>>> Since we only keep information about live sessions in ZK, we 
>>> don't need any data persistence - e.g. we can stop all nodes, clean all 
>>> transaction logs/snapshots, and start them up again, without any issues.
>>> Since we have a lot of requests/changes, we have moved dataDir onto a 
>>> ramdisk, so we have no problems with disk IOPS, etc.
>>> Is there a way to minimize the usage of snapshots/logs so the ramdisk would 
>>> not get filled up? It happens that transaction logs/snapshots grow so large 
>>> that we run out of space on the ramdisk.
>>> We issue >/usr/share/zookeeper/bin/zkCleanup.sh -n 3< every 2 minutes, so 
>>> this should cleanup the dataDir quite often. Why is >count number of 
>>> snapshots/logs to keep< limited to 3 and not below?
>>> I assume, in my setup, I don't even need snapshots/logs to be stored after 
>>> they are not actively needed?
>>> So my basic questions are:
>>> - can I somehow get rid of snapshot/logs sooner, more often ... ?
>>> - when is snapshot created? Can it be created sooner, so it would be 
>>> smaller?
>>> - Is it possible to get rid of snapshot/logs all together?
>>>
>>> Thank you for all your inputs and kind regards, Dejan Markic
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>
