On Tue, May 27, 2014 at 7:56 AM, Jan Kotek <[email protected]> wrote:

> Zookeeper in current form stores all keys in memory and snapshots them
> periodically. I think it limits the size of data Zookeeper can store.

Yes. It does.

> I am investigating the feasibility of patching ZK to support a large
> number of keys. It would use an off-heap storage engine with incremental
> snapshotting. I think this way Zookeeper could store around 100 million
> key-value pairs without negative impact on performance.

ZK already has a progressive snapshot in some form. What I mean by this is
that the engine does not pause during the creation of a snapshot. The
reason this works is that all transactions written to the transaction log
are idempotent. This means that parts of the memory table written later in
the snapshot may actually have been updated after the snapshot started.
This isn't a problem because, after a snapshot is reloaded, all
transactions received after the start of the snapshot are replayed. If an
element was written after a transaction updated it, replaying that
transaction has no effect.

> Technically it is feasible (I have already done something similar for
> Hazelcast). My question is whether someone would actually use this
> improvement. Current ZK is probably just fine for most uses; it only has
> problems when you put an excessive amount of data inside.

And that is, of course, outside the mission that ZK *intends* to support.

> So my question is: Do you use ZK as a database? And do you have problems
> with long crash recovery time?

This is a question for users. As a user, and as somebody who supports
users, I don't, and I don't know of anybody using ZK as a large data
store.
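To make the replay argument concrete, here is a toy sketch (not ZK's actual
code; real ZK transactions carry zxids and are read back from the on-disk
log) of how a snapshot taken without pausing can capture a mix of old and
new values, and how replaying the idempotent transaction log over the
reloaded snapshot still converges to a consistent state:

```python
def take_fuzzy_snapshot(store, pending_txns):
    """Copy the store key by key while transactions keep committing.

    Keys copied before a concurrent transaction keep their old value;
    keys copied afterwards reflect the new one, so the snapshot alone
    may be internally inconsistent ("fuzzy").
    """
    snapshot = {}
    for key in list(store):
        snapshot[key] = store[key]
        # A transaction may commit mid-snapshot.
        while pending_txns:
            k, v = pending_txns.pop(0)
            store[k] = v
    return snapshot

def recover(snapshot, txn_log):
    """Reload the snapshot, then replay every logged transaction.

    Setting a key to a value is idempotent: replaying a transaction
    whose effect is already in the snapshot changes nothing.
    """
    store = dict(snapshot)
    for k, v in txn_log:
        store[k] = v
    return store

# A transaction on "a" commits while the snapshot is being written.
store = {"a": 1, "b": 1}
txn_log = [("a", 2)]
snap = take_fuzzy_snapshot(store, list(txn_log))
assert snap["a"] == 1                 # snapshot missed the update
assert recover(snap, txn_log) == store  # replay restores consistency
```

Replaying the log a second time leaves the store unchanged, which is
exactly why the snapshot writer never needs to block writers.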
