Re: Use compression to store data in ZK

kishore g Sun, 08 Mar 2015 12:24:20 -0700

Yeah, we still need to support it but we can go a long way without
bucketing if we compress it. We know we can support 1k partitions with raw
JSON and no bucketing. By adding compression, we can probably go up to 10k
partitions (need to validate this) per resource without bucketing.
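
One rough way to validate the 10k estimate is to build a synthetic payload
and compare its raw and GZIP sizes against ZooKeeper's znode size limit
(jute.maxbuffer, about 1 MB by default). The sketch below is only an
illustration: the JSON shape, the resource name "myResource", the 50-node
cluster, the 3 replicas and the MASTER/SLAVE states are all assumptions, and
a real IdealState would need to be measured to draw any conclusion.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

class ZnodeSizeEstimate {
    // Builds JSON shaped roughly like a mapFields section: one entry per
    // partition, each mapping node names to a state. All names are made up.
    static String syntheticMapFields(int partitions, int replicas) {
        StringBuilder sb = new StringBuilder("{\"mapFields\":{");
        for (int p = 0; p < partitions; p++) {
            if (p > 0) sb.append(',');
            sb.append("\"myResource_").append(p).append("\":{");
            for (int r = 0; r < replicas; r++) {
                if (r > 0) sb.append(',');
                int node = (p + r) % 50; // hypothetical 50-node cluster
                sb.append("\"node_").append(node).append("\":\"")
                  .append(r == 0 ? "MASTER" : "SLAVE").append('"');
            }
            sb.append('}');
        }
        return sb.append("}}").toString();
    }

    // Returns the number of bytes the input occupies after GZIP compression.
    static int gzipSize(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.size();
    }

    public static void main(String[] args) {
        for (int partitions : new int[] {1000, 10000}) {
            byte[] raw = syntheticMapFields(partitions, 3)
                .getBytes(StandardCharsets.UTF_8);
            System.out.printf("%d partitions: raw=%d bytes, gzip=%d bytes%n",
                partitions, raw.length, gzipSize(raw));
        }
    }
}
```

Synthetic data like this is far more regular than a production assignment,
so the ratio it prints is likely optimistic.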

I plan to use GZIP to compress/uncompress. Let me know if there is
something better.

This is what I am planning to do. We have a common ZNRecordSerializer to
serialize/deserialize the data. We can simply check for an
"enableCompression" flag in the simpleFields and, if it's true, apply
compression. On deserializing we can check for the magic header of GZIP and
if it matches, we automatically decompress the data.
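
A minimal sketch of that round trip, using only java.util.zip. The class
GzipZnodeCodec and its method names are hypothetical and are not the actual
ZNRecordSerializer code; the boolean parameter stands in for the
"enableCompression" simple field. Detection relies on the fact that every
GZIP stream starts with the bytes 0x1f 0x8b, which JSON text cannot start
with.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipZnodeCodec {
    // True if the payload begins with the two-byte GZIP magic header.
    static boolean isGzipped(byte[] data) {
        return data != null && data.length >= 2
            && (data[0] & 0xff) == 0x1f && (data[1] & 0xff) == 0x8b;
    }

    // Serialize: compress only when the record asked for it.
    static byte[] encode(String json, boolean enableCompression) {
        byte[] raw = json.getBytes(StandardCharsets.UTF_8);
        if (!enableCompression) {
            return raw;
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Deserialize: decompress when the magic header is present, otherwise
    // treat the bytes as plain JSON, so existing znodes keep working.
    static String decode(byte[] stored) {
        if (!isGzipped(stored)) {
            return new String(stored, StandardCharsets.UTF_8);
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(stored))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = gzip.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Because the reader decides by inspecting the bytes rather than a config
value, readers do not need to know whether the writer had compression
enabled.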

The advantage of this is that we don't need to change the API of
ZNRecordSerializer or how it is set in various places. When a resource is
created with compression turned on, we set enableCompression=true in the
IdealState. This takes care of compressing the IdealState. We then have to
copy this flag when the CurrentState and External View are created. We
should carry it over to the External View since the controller creates it.
For the CurrentState it's not straightforward, since it is created by the
participants and they don't read the IdealState. We can punt on the
CurrentState, hoping that the size of the CurrentState is inversely
proportional to the number of nodes in the system, and that if there are a
large number of partitions, the number of nodes will also be large (this is
not necessarily true). The other option is to set enableCompression=true
the first time the CurrentState ZNode is created by the participant.

Let me know what you think.

On Sun, Mar 8, 2015 at 11:09 AM, Kanak Biscuitwala <[email protected]>
wrote:

> I like this idea, but we would still need to support bucketizing either
> way because we cannot guarantee that the compressed version will be compact
> enough for every use case.
>
> What types of compression schemes are you planning to support?
>
> ----------------------------------------
> > Date: Sat, 7 Mar 2015 22:30:15 -0800
> > Subject: Use compression to store data in ZK
> > From: [email protected]
> > To: [email protected]
> >
> > Hi,
> >
> > Currently we have bucketing as one of the options when the number of
> > partitions is large. We have a couple of bugs with the handling of
> > bucketized resources (one of them is fatal).
> >
> > One of the reasons to split the znode is that we use JSON to store the
> > data in the ZNode. While JSON is good for debugging, it is space
> > inefficient.
> >
> > A better option before going to bucketing is to support compression of
> > Ideal state, current state and External View. This also gives good
> > performance.
> >
> > I plan to add this support and make it configurable. Feedback/suggestions
> >
> > thanks,
> > Kishore G
>
>
