Hi Dejan,

I had a similar use case: no durability requirement and a virtualized (ESX) environment. We saw intermittent session expiry, so we ended up disabling forceSync (setting it to "no"). It's been working well since then.
http://zookeeper.apache.org/doc/trunk/zookeeperAdmin.html#Unsafe+Options

On Thu, Apr 16, 2015 at 10:08 PM, Dejan Markic <[email protected]> wrote:
> Hello Flavio!
>
> When we were testing ZooKeeper, we saw high IOPS - and since we don't care
> about data durability, we simply moved it to ramdisk. All ZKs are running on
> virtual machines (some Hyper-V, some VMware). So yes, in the end, any high
> IOPS can be problematic.
> So I guess my only solution at the moment is to increase the ramdisk to
> accommodate the logs/snapshots.
> I've just had another idea ... ZK uses only the log file while running, right?
> That's where all IOPS are happening? Is there a way to put the active log on
> ramdisk, and snapshots and old logs in another directory?
> Don't know why I put snapshots on ramdisk ... if I understand correctly,
> snapshots are simply written when needed, right? I know I can put snapshots
> in another directory (eg on disk directly) and it will not cause constant
> IOPS, right?
>
> Thank you and kind regards,
> Dejan Markic
> ________________________________________
> From: Flavio Junqueira [[email protected]]
> Sent: Thursday, April 16, 2015 11:26 PM
> To: Dejan Markic
> Cc: [email protected]
> Subject: Re: Transaction logs and snapshots
>
> Distributed locks are indeed part of our bread and butter. Why don't you want
> to write to disk? Your workload doesn't seem to be heavy. Does the IO traffic
> compete with some other traffic you have?
>
> -Flavio
>
>> On 16 Apr 2015, at 22:15, Dejan Markic <[email protected]> wrote:
>>
>> Hello Flavio!
>>
>> Yes, indeed, ZK might not be the best option - but I could not find any
>> better. What we need is a rather fast, distributed locking "system". ZK was
>> at the moment the best option, and after testing it seemed to be the thing
>> we are looking for. Other than snapshots/transaction logs, we have no
>> problems. It easily handles our current load. It has a C library, which makes
>> it fairly easy to port it to other software.
>> What we need (but I cannot find any) is a distributed in-memory
>> locking system where we can store some small information.
>> For instance, we use ZK's nodes as /SESSION_ID ... we lock it here, and then
>> we use eg /SESSION_ID/my_var to store something. After the session is gone, we
>> remove this node and all information about it.
>>
>> If you have any idea about what kind of software we should try, please let
>> me know. You've helped me enough already!
>>
>> Thank you and kind regards,
>> Dejan Markic
>> ________________________________________
>> From: Flavio Junqueira [[email protected]]
>> Sent: Thursday, April 16, 2015 10:29 PM
>> To: Dejan Markic
>> Cc: [email protected]
>> Subject: Re: Transaction logs and snapshots
>>
>> Another thing you could do is to make snapCount very large so that snapshots
>> are created infrequently. But, let me step back and ask you why you think ZK
>> is a good fit for your project. It isn't clear to me that your case is a
>> good one for ZK.
>>
>> -Flavio
>>
>>
>>> On 16 Apr 2015, at 11:01, Dejan Markic <[email protected]> wrote:
>>>
>>> Hello!
>>>
>>> Log seems to be always 67.108.880 bytes.
>>> Snapshots are currently between 30-40MB. A snapshot is created almost every
>>> minute.
>>> Yes, data durability is not important at all. Once the session ends (it may
>>> last between 0 and a few minutes, average around 1-2 minutes maybe), I don't
>>> need it anymore. I regularly remove nodes that have not changed for more
>>> than 10 minutes.
>>> I even receive updates for sessions, so even if ZK loses data, I would get
>>> it back after a few minutes.
>>>
>>> Thanks!
>>>
>>> Kind regards,
>>> Dejan
>>>
>>>
>>> -----Original Message-----
>>> From: Flavio Junqueira [mailto:[email protected]]
>>> Sent: Thursday, April 16, 2015 11:49 AM
>>> To: [email protected]
>>> Subject: Re: Transaction logs and snapshots
>>>
>>> Hi Dejan,
>>> For a typical ZK application, granularity of hours is more than enough,
>>> since it is supposed to be an infrequent background task. In your case, it
>>> sounds like durability isn't an important property because if it is you
>>> shouldn't be getting rid of disk data this fast. I'm also wondering about
>>> the amount of data you're generating. What's the size of your snapshots and
>>> txn logs?
>>> -Flavio
>>>
>>>
>>> On Thursday, April 16, 2015 10:26 AM, Dejan Markic
>>> <[email protected]> wrote:
>>>
>>>
>>>
>>> Hello Flavio!
>>>
>>> Would that mean that zkCleanup.sh would not be needed?
>>> PurgeInterval is minimum 1 hour? Why is it so high?
>>>
>>> Thanks!
>>>
>>> Kind regards,
>>> Dejan Markic
>>>
>>>
>>> -----Original Message-----
>>> From: Flavio Junqueira [mailto:[email protected]]
>>> Sent: Thursday, April 16, 2015 11:15 AM
>>> To: [email protected]
>>> Subject: Re: Transaction logs and snapshots
>>>
>>> Hi Dejan,
>>> Check if the autopurge feature solves your problem:
>>> http://zookeeper.apache.org/doc/r3.4.6/zookeeperAdmin.html#sc_advancedConfiguration
>>>
>>> -Flavio
>>>
>>>
>>> On Thursday, April 16, 2015 9:17 AM, Dejan Markic
>>> <[email protected]> wrote:
>>>
>>>
>>>
>>> Hello all!
>>>
>>> We are running 3 ZK servers in an ensemble, and ZK is processing a lot of
>>> commands per second. There are probably around 300 nodes
>>> created/checked/set/get per second.
>>> Since we have only information about live sessions we handle in ZK, we
>>> don't need any data persistence - eg: we can stop all nodes, clean all
>>> transaction logs/snapshots, and start them up again, without any issues.
>>> Since we have a lot of requests/changes, we have moved dataDir onto
>>> ramdisk, so we have no problems with disk IOPS, etc.
>>> Is there a way to minimize the usage of snapshots/logs so the ramdisk would not
>>> get filled up? It happens that transaction logs/snapshots grow so large
>>> that we run out of space on ramdisk.
>>> We issue >/usr/share/zookeeper/bin/zkCleanup.sh -n 3< every 2 minutes, so
>>> this should clean up the dataDir quite often. Why is >count number of
>>> snapshots/logs to keep< limited to 3 and not below?
>>> I assume, in my setup, I don't even need snapshots/logs to be stored after
>>> they are not actively needed?
>>> So my basic questions are:
>>> - can I somehow get rid of snapshots/logs sooner, more often ... ?
>>> - when is a snapshot created? Can it be created sooner, so it would be
>>> smaller?
>>> - Is it possible to get rid of snapshots/logs altogether?
>>>
>>> Thank you for all your inputs and kind regards, Dejan Markic
>>>
>>
>
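
For anyone sizing a ramdisk from the numbers in this thread, here is a rough Python sketch of the arithmetic. It uses the reported log file size (67,108,880 bytes), the upper end of the reported snapshot size (taken here as 40 MiB), and the minimum retention of 3 that zkCleanup.sh enforces. The retention model is an assumption: it supposes each snapshot comes with one new fixed-size log file, and that about two such pairs appear between cleanup runs (one snapshot per minute, cleanup every 2 minutes). The helper names are hypothetical. The two JVM flags correspond to the snapCount and forceSync options mentioned above, which the admin guide lists as zookeeper.* Java system properties; check them against the documentation for your version before relying on them.

```python
# Sizing sketch for a ZooKeeper dataDir kept on a ramdisk.
# Sizes are the ones reported in the thread; the retention model is an assumption.

LOG_FILE_BYTES = 67_108_880         # reported size of each transaction log file
SNAPSHOT_BYTES = 40 * 1024 * 1024   # upper end of the reported 30-40 MB snapshots
MIN_RETAINED = 3                    # zkCleanup.sh -n does not go below 3


def worst_case_bytes(retained=MIN_RETAINED, created_between_purges=2):
    """Estimate dataDir usage just before a cleanup run.

    Assumes every snapshot is accompanied by one new log file of fixed size,
    and that `created_between_purges` snapshot/log pairs are written between
    two cleanup runs, on top of the `retained` pairs that cleanup keeps.
    """
    if retained < MIN_RETAINED:
        raise ValueError("retained must be at least 3")
    if created_between_purges < 0:
        raise ValueError("created_between_purges must not be negative")
    file_pairs = retained + created_between_purges
    return file_pairs * (LOG_FILE_BYTES + SNAPSHOT_BYTES)


def jvm_flags(snap_count, force_sync=False):
    """Render JVM flags for the two options discussed in the thread.

    Hypothetical helper: it only formats strings and does not validate
    the values against any particular ZooKeeper release.
    """
    if snap_count < 1:
        raise ValueError("snap_count must be positive")
    return [
        f"-Dzookeeper.snapCount={snap_count}",
        f"-Dzookeeper.forceSync={'yes' if force_sync else 'no'}",
    ]


if __name__ == "__main__":
    total = worst_case_bytes()
    print(f"estimated usage before a purge: {total / 2**20:.0f} MiB")
    print(" ".join(jvm_flags(snap_count=1_000_000)))
```

Raising snapCount reduces how many snapshot/log pairs appear between cleanup runs, which is the `created_between_purges` term in this estimate; it does not change the size of an individual file.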
