Hi Dejan,

I had a similar use case: no durability requirement and a virtualized (ESX)
environment. We saw intermittent session expiry, so we ended up
setting forceSync to false. It's been working well since then.

http://zookeeper.apache.org/doc/trunk/zookeeperAdmin.html#Unsafe+Options
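
For reference, here is a minimal zoo.cfg sketch of that setting, together with
the other knobs that come up in the thread below. The paths and numbers are
placeholders made up for illustration, not production values, so please check
each option against the admin guide for your version. Note that the guide
documents the value that disables syncing as "no".

```properties
# Sketch only: all paths and values are illustrative assumptions.

# Snapshots are written to dataDir (hypothetical path on a regular disk).
dataDir=/var/lib/zookeeper
# Transaction logs are written to dataLogDir; this is where the steady
# write load lands (hypothetical ramdisk mount).
dataLogDir=/mnt/ramdisk/zookeeper

# Unsafe option: do not sync the transaction log to media before
# acknowledging an update. As far as I know, keys that zoo.cfg does not
# recognise are passed through as zookeeper.* system properties; if that
# does not hold for your version, use -Dzookeeper.forceSync=no instead.
forceSync=no

# Snapshot (and roll the log) less often; the default is 100000 transactions.
snapCount=500000

# Built-in cleanup instead of a scheduled zkCleanup.sh:
# keep 3 snapshots (the minimum), purge every hour (the smallest interval).
autopurge.snapRetainCount=3
autopurge.purgeInterval=1
```

With forceSync=no the transaction log may not need a ramdisk at all, so the
dataLogDir line is best read as an alternative rather than a requirement.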

On Thu, Apr 16, 2015 at 10:08 PM, Dejan Markic
<[email protected]> wrote:
> Hello Flavio!
>
> When we were testing ZooKeeper, we saw high IOPS, and since we don't care
> about data durability, we simply moved it to a ramdisk. All ZK servers are
> running on virtual machines (some Hyper-V, some VMware). So yes, in the end,
> any high IOPS can be problematic.
> So I guess my only solution at the moment is to increase the ramdisk to
> accommodate the logs/snapshots.
> I've just had another idea ... ZK uses only the log file while running, right?
> That's where all the IOPS are happening? Is there a way to put the active log
> on the ramdisk, and snapshots and old logs in another directory?
> I don't know why I put snapshots on the ramdisk ... if I understand correctly,
> snapshots are simply written when needed, right? I know I can put snapshots
> in another directory (e.g. directly on disk) and it will not cause constant
> IOPS, right?
>
> Thank you and kind regards,
> Dejan Markic
> ________________________________________
> From: Flavio Junqueira [[email protected]]
> Sent: Thursday, April 16, 2015 11:26 PM
> To: Dejan Markic
> Cc: [email protected]
> Subject: Re: Transaction logs and snapshots
>
> Distributed locks are indeed part of our bread and butter. Why don't you want
> to write to disk? Your workload doesn't seem to be heavy. Does the IO traffic
> compete with some other traffic you have?
>
> -Flavio
>
>> On 16 Apr 2015, at 22:15, Dejan Markic <[email protected]> wrote:
>>
>> Hello Flavio!
>>
>> Yes, indeed, ZK might not be the best option, but I could not find anything
>> better. What we need is a rather fast, distributed locking "system". ZK was
>> the best option at the time, and after testing it seemed to be the thing
>> we are looking for. Other than snapshots/transaction logs, we have no
>> problems. It easily handles our current load. It has a C library, which makes
>> it fairly easy to port to other software.
>> What we need (but cannot find) is a distributed in-memory locking system
>> where we can store some small pieces of information.
>> For instance, we use ZK nodes as /SESSION_ID ... we lock it there, and then
>> we use e.g. /SESSION_ID/my_var to store something. After the session is gone,
>> we remove this node and all information about it.
>>
>> If you have any idea about what kind of software we should try, please let 
>> me know. You've helped me enough already!
>>
>> Thank you and kind regards,
>> Dejan Markic
>> ________________________________________
>> From: Flavio Junqueira [[email protected]]
>> Sent: Thursday, April 16, 2015 10:29 PM
>> To: Dejan Markic
>> Cc: [email protected]
>> Subject: Re: Transaction logs and snapshots
>>
>> Another thing you could do is to make snapCount very large so that snapshots
>> are created infrequently. But, let me step back and ask you why you think ZK 
>> is a good fit for your project. It isn't clear to me that your case is a 
>> good one for ZK.
>>
>> -Flavio
>>
>>
>>> On 16 Apr 2015, at 11:01, Dejan Markic <[email protected]> wrote:
>>>
>>> Hello!
>>>
>>> The log always seems to be 67,108,880 bytes.
>>> Snapshots are currently between 30 and 40 MB. A snapshot is created almost
>>> every minute.
>>> Yes, data durability is not important at all. Once the session ends (it may
>>> last between 0 and a few minutes, on average around 1-2 minutes), I don't
>>> need it anymore. I regularly remove nodes that have not changed for more
>>> than 10 minutes.
>>> I even receive updates for sessions, so even if ZK loses data, I would get
>>> it back after a few minutes.
>>>
>>> Thanks!
>>>
>>> Kind regards,
>>> Dejan
>>>
>>>
>>> -----Original Message-----
>>> From: Flavio Junqueira [mailto:[email protected]]
>>> Sent: Thursday, April 16, 2015 11:49 AM
>>> To: [email protected]
>>> Subject: Re: Transaction logs and snapshots
>>>
>>> Hi Dejan,
>>> For a typical ZK application, a granularity of hours is more than enough,
>>> since purging is supposed to be an infrequent background task. In your
>>> case, it sounds like durability isn't an important property, because if it
>>> were, you shouldn't be getting rid of disk data this fast. I'm also
>>> wondering about the amount of data you're generating. What's the size of
>>> your snapshots and txn logs?
>>> -Flavio
>>>
>>>
>>>    On Thursday, April 16, 2015 10:26 AM, Dejan Markic 
>>> <[email protected]> wrote:
>>>
>>>
>>>
>>> Hello Flavio!
>>>
>>> Would that mean that zkCleanup.sh would not be needed?
>>> Is the minimum purgeInterval 1 hour? Why is it so high?
>>>
>>> Thanks!
>>>
>>> Kind regards,
>>> Dejan Markic
>>>
>>>
>>> -----Original Message-----
>>> From: Flavio Junqueira [mailto:[email protected]]
>>> Sent: Thursday, April 16, 2015 11:15 AM
>>> To: [email protected]
>>> Subject: Re: Transaction logs and snapshots
>>>
>>> Hi Dejan,
>>> Check if the autopurge feature solves your problem:
>>> http://zookeeper.apache.org/doc/r3.4.6/zookeeperAdmin.html#sc_advancedConfiguration
>>>
>>> -Flavio
>>>
>>>
>>>    On Thursday, April 16, 2015 9:17 AM, Dejan Markic 
>>> <[email protected]> wrote:
>>>
>>>
>>>
>>> Hello all!
>>>
>>> We are running 3 ZK servers in an ensemble, and ZK is processing a lot of
>>> commands per second. There are probably around 300 nodes
>>> created/checked/set/get per second.
>>> Since ZK only holds information about the live sessions we handle, we
>>> don't need any data persistence, e.g. we can stop all nodes, clean all
>>> transaction logs/snapshots, and start them up again without any issues.
>>> Since we have a lot of requests/changes, we have moved dataDir onto a
>>> ramdisk, so we have no problems with disk IOPS, etc.
>>> Is there a way to minimize the usage of snapshots/logs so the ramdisk does
>>> not fill up? It happens that transaction logs/snapshots grow so large
>>> that we run out of space on the ramdisk.
>>> We issue >/usr/share/zookeeper/bin/zkCleanup.sh -n 3< every 2 minutes, so
>>> this should clean up the dataDir quite often. Why is >count number of
>>> snapshots/logs to keep< limited to 3 and not below?
>>> I assume that, in my setup, I don't even need snapshots/logs to be stored
>>> once they are no longer actively needed?
>>> So my basic questions are:
>>> - Can I somehow get rid of snapshots/logs sooner, more often ... ?
>>> - When is a snapshot created? Can it be created sooner, so it would be
>>> smaller?
>>> - Is it possible to get rid of snapshots/logs altogether?
>>>
>>> Thank you for all your inputs and kind regards, Dejan Markic
>>>
>>
>
