Disabling forceSync will only make the writes to the txn log asynchronous, but the same volume of data will be written. I still think you could try to reduce the number of snapshots generated by increasing snapCount.
-Flavio

> On 17 Apr 2015, at 20:51, Michi Mutsuzaki <[email protected]> wrote:
>
> Hi Dejan,
>
> I had a similar use case: no durability requirement / virtualized (ESX)
> environment. We saw intermittent session expiry, so we ended up
> setting forceSync to false. It's been working well since then.
>
> http://zookeeper.apache.org/doc/trunk/zookeeperAdmin.html#Unsafe+Options
>
> On Thu, Apr 16, 2015 at 10:08 PM, Dejan Markic
> <[email protected]> wrote:
>> Hello Flavio!
>>
>> When we were testing ZooKeeper, we saw high IOPS - and since we don't care
>> about data durability, we simply moved it to ramdisk. All ZKs are running
>> on virtual machines (some Hyper-V, some VMware). So yes, in the end, any high
>> IOPS can be problematic.
>> So I guess my only solution at the moment is to increase the ramdisk to
>> accommodate the logs/snapshots.
>> I've just had another idea ... ZK uses only the log file while running,
>> right? That's where all IOPS are happening? Is there a way to put the active
>> log on ramdisk, and snapshots and old logs in another directory?
>> Don't know why I put snapshots on ramdisk ... if I understand correctly,
>> snapshots are simply written when needed, right? I know I can put snapshots
>> in another directory (e.g. on disk directly) and it will not cause constant
>> IOPS, right?
>>
>> Thank you and kind regards,
>> Dejan Markic
>> ________________________________________
>> From: Flavio Junqueira [[email protected]]
>> Sent: Thursday, April 16, 2015 11:26 PM
>> To: Dejan Markic
>> Cc: [email protected]
>> Subject: Re: Transaction logs and snapshots
>>
>> Distributed locks are indeed part of our bread and butter. Why don't you want
>> to write to disk? Your workload doesn't seem to be heavy. Does the IO traffic
>> compete with some other traffic you have?
>>
>> -Flavio
>>
>>> On 16 Apr 2015, at 22:15, Dejan Markic <[email protected]> wrote:
>>>
>>> Hello Flavio!
>>>
>>> Yes, indeed, ZK might not be the best option - but I could not find any
>>> better.
>>> What we need is a rather fast, distributed locking "system". ZK was
>>> at the moment the best option, and after testing it seemed to be the thing
>>> we are looking for. Other than snapshots/transaction logs, we have no
>>> problems. It easily handles our current load. It has a C library, which makes
>>> it fairly easy to port it to other software.
>>> What we need (but I cannot find any) is a distributed in-memory
>>> locking system where we can store some small information.
>>> For instance, we use ZK's nodes as /SESSION_ID ... we lock it here, and
>>> then we use e.g. /SESSION_ID/my_var to store something. After the session is
>>> gone, we remove this node and all information about it.
>>>
>>> If you have any idea about what kind of software we should try, please let
>>> me know. You've helped me enough already!
>>>
>>> Thank you and kind regards,
>>> Dejan Markic
>>> ________________________________________
>>> From: Flavio Junqueira [[email protected]]
>>> Sent: Thursday, April 16, 2015 10:29 PM
>>> To: Dejan Markic
>>> Cc: [email protected]
>>> Subject: Re: Transaction logs and snapshots
>>>
>>> Another thing you could do is to make snapCount very large so that
>>> snapshots are created infrequently. But, let me step back and ask you why
>>> you think ZK is a good fit for your project. It isn't clear to me that your
>>> case is a good one for ZK.
>>>
>>> -Flavio
>>>
>>>> On 16 Apr 2015, at 11:01, Dejan Markic <[email protected]> wrote:
>>>>
>>>> Hello!
>>>>
>>>> Log seems to be always 67.108.880 bytes.
>>>> Snapshots are currently between 30-40MB. A snapshot is created almost every
>>>> minute.
>>>> Yes, data durability is not important at all. Once the session ends (it
>>>> may last between 0 and a few minutes, average around 1-2 minutes maybe), I
>>>> don't need it anymore. I regularly remove nodes that are not changed for
>>>> more than 10 minutes.
>>>> I even receive updates for sessions, so even if ZK loses data, I would
>>>> get it back after a few minutes.
>>>>
>>>> Thanks!
>>>>
>>>> Kind regards,
>>>> Dejan
>>>>
>>>> -----Original Message-----
>>>> From: Flavio Junqueira [mailto:[email protected]]
>>>> Sent: Thursday, April 16, 2015 11:49 AM
>>>> To: [email protected]
>>>> Subject: Re: Transaction logs and snapshots
>>>>
>>>> Hi Dejan,
>>>> For a typical ZK application, granularity of hours is more than enough,
>>>> since it is supposed to be an infrequent background task. In your case, it
>>>> sounds like durability isn't an important property because if it is you
>>>> shouldn't be getting rid of disk data this fast. I'm also wondering about
>>>> the amount of data you're generating. What's the size of your snapshots
>>>> and txn logs?
>>>> -Flavio
>>>>
>>>> On Thursday, April 16, 2015 10:26 AM, Dejan Markic
>>>> <[email protected]> wrote:
>>>>
>>>> Hello Flavio!
>>>>
>>>> Would that mean that zkCleanup.sh would not be needed?
>>>> PurgeInterval is minimum 1 hour? Why is it so high?
>>>>
>>>> Thanks!
>>>>
>>>> Kind regards,
>>>> Dejan Markic
>>>>
>>>> -----Original Message-----
>>>> From: Flavio Junqueira [mailto:[email protected]]
>>>> Sent: Thursday, April 16, 2015 11:15 AM
>>>> To: [email protected]
>>>> Subject: Re: Transaction logs and snapshots
>>>>
>>>> Hi Dejan,
>>>> Check if the autopurge feature solves your problem:
>>>> http://zookeeper.apache.org/doc/r3.4.6/zookeeperAdmin.html#sc_advancedConfiguration
>>>>
>>>> -Flavio
>>>>
>>>> On Thursday, April 16, 2015 9:17 AM, Dejan Markic
>>>> <[email protected]> wrote:
>>>>
>>>> Hello all!
>>>>
>>>> We are running 3 ZK servers in an ensemble, and ZK is processing a lot of
>>>> commands per second. There are probably around 300 nodes
>>>> created/checked/set/get per second.
>>>> Since we only handle information about live sessions in ZK, we
>>>> don't need any data persistence - e.g. we can stop all nodes, clean all
>>>> transaction logs/snapshots, and start them up again, without any issues.
>>>> Since we have a lot of requests/changes, we have moved dataDir onto
>>>> ramdisk, so we have no problems with disk IOPS, etc.
>>>> Is there a way to minimize the usage of snapshots/logs so the ramdisk would
>>>> not get filled up? It happens that transaction logs/snapshots grow so
>>>> large that we run out of space on the ramdisk.
>>>> We issue "/usr/share/zookeeper/bin/zkCleanup.sh -n 3" every 2 minutes, so
>>>> this should clean up the dataDir quite often. Why is "count number of
>>>> snapshots/logs to keep" limited to 3 and not below?
>>>> I assume, in my setup, I don't even need snapshots/logs to be stored after
>>>> they are not actively needed?
>>>> So my basic questions are:
>>>> - can I somehow get rid of snapshots/logs sooner, more often ... ?
>>>> - when is a snapshot created? Can it be created sooner, so it would be
>>>> smaller?
>>>> - Is it possible to get rid of snapshots/logs altogether?
>>>>
>>>> Thank you for all your input and kind regards, Dejan Markic
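
For reference, the settings mentioned in this thread (snapCount, forceSync, a separate directory for the transaction log, and autopurge) could be combined roughly as in the zoo.cfg fragment below. This is only a sketch, assuming a ZooKeeper 3.4.x server that picks these keys up from zoo.cfg. The two paths are hypothetical, and the snapCount value is illustrative rather than a tuned recommendation.

```properties
# zoo.cfg fragment (sketch; values are illustrative, not recommendations)

# Hypothetical paths: snapshots on regular disk, active txn log on the ramdisk.
dataDir=/var/lib/zookeeper
dataLogDir=/mnt/ramdisk/zookeeper-txnlog

# Take snapshots less often (the documented default is 100000 transactions).
# Also documented as the Java system property zookeeper.snapCount.
snapCount=1000000

# Unsafe option: do not require the txn log to be synced to media before an
# update is acknowledged. Also documented as zookeeper.forceSync.
forceSync=no

# Built-in cleanup instead of running zkCleanup.sh from cron.
# snapRetainCount has a documented minimum of 3; purgeInterval is in hours.
autopurge.snapRetainCount=3
autopurge.purgeInterval=1
```

As noted at the top of the thread, forceSync=no does not reduce the volume of data written, and a larger snapCount means more transactions accumulate in the log between snapshots, so the ramdisk still needs room for the active log.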
