RE: Use compression to store data in ZK

2015-03-08 Thread Kanak Biscuitwala
I like this idea, but we would still need to support bucketizing either way 
because we cannot guarantee that the compressed version will be compact enough 
for every use case.

What types of compression schemes are you planning to support?


 Date: Sat, 7 Mar 2015 22:30:15 -0800
 Subject: Use compression to store data in ZK
 From: g.kish...@gmail.com
 To: dev@helix.apache.org

 Hi,

 Currently we have bucketing as one of the options when the number of
 partitions is large. We have a couple of bugs in the handling of
 bucketized resources (one of them is fatal).

 One of the reasons to split the znode is that we use JSON to store the
 data in the ZNode. While JSON is good for debugging, it is space-inefficient.

 A better option before going to bucketing is to support compression of
 Ideal state, current state and External View. This also gives good
 performance.

 I plan to add this support and make it configurable. Feedback/suggestions

 thanks,
 Kishore G
  

Review Request 31835: [HELIX-572] Fixing External View update logic for bucketized resource

2015-03-08 Thread Kishore Gopalakrishna

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31835/
---

Review request for helix.


Bugs: HELIX-572


Repository: helix-git


Description
---

commit 6aae15d77ce123b7dc83bc39fccd5c7c210bd972
Author: Kishore Gopalakrishna g.kish...@gmail.com
Date:   Sun Mar 8 14:20:08 2015 -0700

[HELIX-572] Fixing External View update logic for bucketized resource

:100644 100644 169c993... 8c9fc8d... M  helix-core/src/main/java/org/apache/helix/manager/zk/ZKHelixDataAccessor.java
:100644 100644 207a318... 2b5e2bc... M  helix-core/src/test/java/org/apache/helix/integration/TestBucketizedResource.java
:100755 100755 82cbcf9... e869f25... M  hpost-review.sh


Diffs
---

  helix-core/src/main/java/org/apache/helix/manager/zk/ZKHelixDataAccessor.java 169c993
  helix-core/src/test/java/org/apache/helix/integration/TestBucketizedResource.java 207a318
  hpost-review.sh 82cbcf9

Diff: https://reviews.apache.org/r/31835/diff/


Testing
---


Thanks,

Kishore Gopalakrishna



Re: Review Request 31832: [HELIX-572] Fixing External View update logic for bucketized resource

2015-03-08 Thread Kanak Biscuitwala

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31832/#review75644
---

Ship it!



helix-core/src/test/java/org/apache/helix/integration/TestBucketizedResource.java
https://reviews.apache.org/r/31832/#comment122854

Remove indent


- Kanak Biscuitwala


On March 8, 2015, 12:19 a.m., Kishore Gopalakrishna wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/31832/
 ---
 
 (Updated March 8, 2015, 12:19 a.m.)
 
 
 Review request for helix.
 
 
 Repository: helix-git
 
 
 Description
 ---
 
 commit 1cd1327f45d5bda9e6ee8d371353e43a65cae743
 Author: Kishore Gopalakrishna g.kish...@gmail.com
 Date:   Sat Mar 7 23:45:45 2015 -0800
 
 [HELIX-572] Fixing External View update logic for bucketized resource
 
 :100644 100644 38c1417... 358971d... M  helix-core/src/main/java/org/apache/helix/manager/zk/ZKHelixDataAccessor.java
 :100644 100644 7234658... f83d14e... M  helix-core/src/test/java/org/apache/helix/integration/TestBucketizedResource.jav
 
 
 Diffs
 ---
 
   helix-core/src/main/java/org/apache/helix/manager/zk/ZKHelixDataAccessor.java 38c1417
   helix-core/src/test/java/org/apache/helix/integration/TestBucketizedResource.java 7234658
 
 Diff: https://reviews.apache.org/r/31832/diff/
 
 
 Testing
 ---
 
 Added the check for version of external view in the TestBucketizedResource 
 integration test case
 
 
 Thanks,
 
 Kishore Gopalakrishna
 




Re: Use compression to store data in ZK

2015-03-08 Thread kishore g
Yeah, we still need to support it, but we can go a long way without
bucketing if we compress the data. We know we can support 1k partitions with
raw JSON and no bucketing. By adding compression, we can probably go up to
10k partitions per resource without bucketing (this needs to be validated).
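
One rough way to validate the estimate is to compare raw and gzipped sizes for a synthetic partition map, keeping in mind that ZooKeeper's default znode size limit is about 1 MB. The sketch below is only an illustration: the class name, the resource and node names, and the JSON layout are all assumptions and are not taken from Helix, so real IdealState or ExternalView data may compress differently.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of Helix: estimates how well a JSON-like
// partition map compresses with GZIP.
class CompressionEstimate {

  // Builds a synthetic mapFields-style JSON document with three replicas per
  // partition. The field layout and names are assumptions for illustration.
  static String syntheticMapFields(int partitions) {
    StringBuilder sb = new StringBuilder("{\"mapFields\":{");
    for (int p = 0; p < partitions; p++) {
      if (p > 0) {
        sb.append(',');
      }
      sb.append("\"MyResource_").append(p).append("\":{")
          .append("\"node_").append(p % 10).append("\":\"MASTER\",")
          .append("\"node_").append((p + 1) % 10).append("\":\"SLAVE\",")
          .append("\"node_").append((p + 2) % 10).append("\":\"SLAVE\"}");
    }
    return sb.append("}}").toString();
  }

  // Returns the number of bytes the input occupies after GZIP compression.
  static int gzippedSize(byte[] raw) {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (GZIPOutputStream gzip = new GZIPOutputStream(bos)) {
      gzip.write(raw);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bos.size();
  }

  public static void main(String[] args) {
    for (int partitions : new int[] {1000, 10000}) {
      byte[] raw = syntheticMapFields(partitions).getBytes(StandardCharsets.UTF_8);
      System.out.println(partitions + " partitions: raw=" + raw.length
          + " bytes, gzip=" + gzippedSize(raw) + " bytes");
    }
  }
}
```

Synthetic data like this is highly repetitive, so the ratio it reports is likely optimistic; running the same measurement on a dump of a real znode would give a more trustworthy number.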

I plan to use GZIP to compress/uncompress. Let me know if there is
something better.

This is what I am planning to do. We have a common ZNRecordSerializer to
serialize/deserialize the data. We can simply check for an
enableCompression entry in the simpleFields and, if it is true, apply
compression. On deserializing, we can check for the GZIP magic header and,
if it matches, automatically decompress the data.
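
A minimal sketch of that round trip, using only java.util.zip, could look like the following. The class and method names are hypothetical and are not taken from the actual patch; the sketch only shows the compress step and the magic-header check on the read path.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Illustrative helper only; names are hypothetical, not from the Helix patch.
class GzipCodecSketch {

  // Compresses the serialized bytes with GZIP.
  static byte[] compress(byte[] raw) {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (GZIPOutputStream gzip = new GZIPOutputStream(bos)) {
      gzip.write(raw);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bos.toByteArray();
  }

  // GZIP streams start with the bytes 0x1f 0x8b. GZIPInputStream.GZIP_MAGIC
  // holds them as a little-endian int (0x8b1f).
  static boolean isGzip(byte[] bytes) {
    if (bytes == null || bytes.length < 2) {
      return false;
    }
    int header = (bytes[0] & 0xff) | ((bytes[1] & 0xff) << 8);
    return header == GZIPInputStream.GZIP_MAGIC;
  }

  // Decompresses only when the magic header is present; plain JSON bytes
  // (which start with '{') are returned unchanged.
  static byte[] decompressIfNeeded(byte[] bytes) {
    if (!isGzip(bytes)) {
      return bytes;
    }
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(bytes))) {
      byte[] buf = new byte[4096];
      int n;
      while ((n = gzip.read(buf)) != -1) {
        bos.write(buf, 0, n);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bos.toByteArray();
  }
}
```

Because the read path keys off the header rather than a flag, readers can handle compressed and uncompressed znodes side by side, which matters when only some resources have compression turned on.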

The advantage of this is that we don't need to change the API of
ZNRecordSerializer or how it is set in various places. When a resource is
created with compression turned on, we set enableCompression=true in the
IdealState. This takes care of compressing the IdealState. We then have to
copy this flag when creating the CurrentState and the External View. We
should carry it with the External View, since the controller creates it. For
the CurrentState it's not straightforward, since it is created by the
participants and they don't read the IdealState. We can punt on the
CurrentState, hoping that its size is inversely proportional to the number
of nodes in the system, and that if there is a large number of partitions,
the number of nodes is also large (this is not necessarily true). The other
option is to set enableCompression=true the first time the CurrentState
ZNode is created by the participant.
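
To make the flag propagation concrete, here is a small sketch in which plain maps stand in for the simpleFields of the records. The class and method are hypothetical; only the enableCompression key comes from the proposal above.

```java
import java.util.Map;

// Plain maps stand in for the simpleFields of a ZNRecord; this is a sketch of
// the idea, not Helix code.
class CompressionFlagSketch {
  static final String ENABLE_COMPRESSION = "enableCompression";

  // Copies the flag from the source record's simple fields (for example the
  // IdealState) to a derived record (for example the External View), but only
  // when the source actually sets it.
  static void propagate(Map<String, String> source, Map<String, String> derived) {
    String flag = source.get(ENABLE_COMPRESSION);
    if (flag != null) {
      derived.put(ENABLE_COMPRESSION, flag);
    }
  }
}
```

The controller could apply something like this when it builds the External View; the CurrentState case would still need one of the two options described above.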

Let me know what you think.









Review Request 31832: [HELIX-572] Fixing External View update logic for bucketized resource

2015-03-08 Thread Kishore Gopalakrishna

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31832/
---

Review request for helix.


Repository: helix-git


Description
---

commit 1cd1327f45d5bda9e6ee8d371353e43a65cae743
Author: Kishore Gopalakrishna g.kish...@gmail.com
Date:   Sat Mar 7 23:45:45 2015 -0800

[HELIX-572] Fixing External View update logic for bucketized resource

:100644 100644 38c1417... 358971d... M  helix-core/src/main/java/org/apache/helix/manager/zk/ZKHelixDataAccessor.java
:100644 100644 7234658... f83d14e... M  helix-core/src/test/java/org/apache/helix/integration/TestBucketizedResource.jav


Diffs
---

  helix-core/src/main/java/org/apache/helix/manager/zk/ZKHelixDataAccessor.java 38c1417
  helix-core/src/test/java/org/apache/helix/integration/TestBucketizedResource.java 7234658

Diff: https://reviews.apache.org/r/31832/diff/


Testing
---

Added the check for version of external view in the TestBucketizedResource 
integration test case


Thanks,

Kishore Gopalakrishna



[jira] [Created] (HELIX-573) Add support to compress/uncompress data on ZK

2015-03-08 Thread kishore gopalakrishna (JIRA)
kishore gopalakrishna created HELIX-573:
---

 Summary: Add support to compress/uncompress data on ZK
 Key: HELIX-573
 URL: https://issues.apache.org/jira/browse/HELIX-573
 Project: Apache Helix
  Issue Type: Improvement
Reporter: kishore gopalakrishna


Currently we have bucketing as one of the options when the number of partitions 
is large. We have a couple of bugs in the handling of bucketized resources 
(one of them is fatal). 

One of the reasons to split the znode is that we use JSON to store the data 
in the ZNode. While JSON is good for debugging, it is space-inefficient.

A better option before going to bucketing is to support compression of Ideal 
state, current state and External View. This also gives good performance.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HELIX-573) Add support to compress/uncompress data on ZK

2015-03-08 Thread kishore gopalakrishna (JIRA)

[ https://issues.apache.org/jira/browse/HELIX-573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14352346#comment-14352346 ]

kishore gopalakrishna commented on HELIX-573:
-

From Kanak's email:

I like this idea, but we would still need to support bucketizing either way 
because we cannot guarantee that the compressed version will be compact enough 
for every use case.

What types of compression schemes are you planning to support?

 Add support to compress/uncompress data on ZK
 -

 Key: HELIX-573
 URL: https://issues.apache.org/jira/browse/HELIX-573
 Project: Apache Helix
  Issue Type: Improvement
Reporter: kishore gopalakrishna
Assignee: kishore gopalakrishna

 Currently we have bucketing as one of the options when the number of 
 partitions is large. We have a couple of bugs in the handling of bucketized 
 resources (one of them is fatal). 
 One of the reasons to split the znode is that we use JSON to store the 
 data in the ZNode. While JSON is good for debugging, it is space-inefficient.
 A better option before going to bucketing is to support compression of Ideal 
 state, current state and External View. This also gives good performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 31836: [HELIX-573] Add support to automatically compress/uncompress data in Zookeeper

2015-03-08 Thread Kishore Gopalakrishna

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31836/
---

Review request for helix.


Repository: helix-git


Description
---

commit 1ef44c2e9a132df3513a51e3a8dac658236a2263
Author: Kishore Gopalakrishna g.kish...@gmail.com
Date:   Sun Mar 8 16:40:29 2015 -0700

[HELIX-573] Add support to automatically compress/uncompress data in 
Zookeeper

:100644 100644 4419fdd... 1f34529... M  helix-core/src/main/java/org/apache/helix/manager/zk/ZNRecordSerializer.java
:100644 100644 2d7cb3c... 26d7e2b... M  helix-core/src/main/java/org/apache/helix/manager/zk/ZNRecordStreamingSerializer.java
:000000 100644 0000000... 90c1e8e... A  helix-core/src/test/java/org/apache/helix/manager/zk/TestZNRecordSerializer.java
:100644 100644 e4b0b25... 95064f8... M  helix-core/src/test/java/org/apache/helix/manager/zk/TestZNRecordStreamingSerializer.java
:100755 100755 e869f25... 99ef81c... M  hpost-review.sh


Diffs
---

  helix-core/src/main/java/org/apache/helix/manager/zk/ZKHelixDataAccessor.java 169c993
  helix-core/src/main/java/org/apache/helix/manager/zk/ZNRecordSerializer.java 4419fdd
  helix-core/src/main/java/org/apache/helix/manager/zk/ZNRecordStreamingSerializer.java 2d7cb3c
  helix-core/src/test/java/org/apache/helix/integration/TestBucketizedResource.java 207a318
  helix-core/src/test/java/org/apache/helix/manager/zk/TestZNRecordSerializer.java PRE-CREATION
  helix-core/src/test/java/org/apache/helix/manager/zk/TestZNRecordStreamingSerializer.java e4b0b25
  hpost-review.sh 82cbcf9

Diff: https://reviews.apache.org/r/31836/diff/


Testing
---

Added basic test for compress/uncompress


Thanks,

Kishore Gopalakrishna



Re: Review Request 31836: [HELIX-573] Add support to automatically compress/uncompress data in Zookeeper

2015-03-08 Thread Kanak Biscuitwala

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31836/#review75661
---



helix-core/src/main/java/org/apache/helix/manager/zk/ZNRecordSerializer.java
https://reviews.apache.org/r/31836/#comment122869

As currently written, compression will be on by default because valueOf 
returns true if the string is null. I'm not sure if you want that; at the 
very least, the default should be made explicit, i.e.:

```
if (record.getBooleanField(enableCompression, defaultValue)) {
```
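
To illustrate the explicit-default pattern, here is a minimal sketch that uses a plain map in place of the record's simple fields. The helper is hypothetical and only mirrors the shape of the getBooleanField call above; it is not the ZNRecord implementation.

```java
import java.util.Map;

// Hypothetical stand-in for reading a boolean simple field with an explicit
// default; not the actual ZNRecord code.
class BooleanFieldSketch {

  // Returns defaultValue when the key is absent, so the behavior for records
  // that never set the flag is spelled out at the call site.
  static boolean getBooleanField(Map<String, String> simpleFields, String key,
      boolean defaultValue) {
    String value = simpleFields.get(key);
    return value == null ? defaultValue : Boolean.parseBoolean(value);
  }
}
```

With a helper of this shape, a call such as getBooleanField(fields, "enableCompression", false) keeps compression off for records that do not mention the flag.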



helix-core/src/main/java/org/apache/helix/manager/zk/ZNRecordSerializer.java
https://reviews.apache.org/r/31836/#comment122870

Lines 91 and 92 can be combined into 1 line.



helix-core/src/main/java/org/apache/helix/manager/zk/ZNRecordSerializer.java
https://reviews.apache.org/r/31836/#comment122868

This log message won't be very useful if the bytes are compressed.



helix-core/src/main/java/org/apache/helix/manager/zk/ZNRecordStreamingSerializer.java
https://reviews.apache.org/r/31836/#comment122876

Same comment as above. Use getBooleanField. Also, consider putting this 
code in a common area since it's copy-pasted from the other serializer class.



helix-core/src/main/java/org/apache/helix/manager/zk/ZNRecordStreamingSerializer.java
https://reviews.apache.org/r/31836/#comment122877

Same comment about code duplication.



helix-core/src/main/java/org/apache/helix/manager/zk/ZNRecordStreamingSerializer.java
https://reviews.apache.org/r/31836/#comment122878

This method too.


- Kanak Biscuitwala


On March 8, 2015, 4:48 p.m., Kishore Gopalakrishna wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/31836/
 ---
 
 (Updated March 8, 2015, 4:48 p.m.)
 
 
 Review request for helix.
 
 
 Repository: helix-git
 
 
 Description
 ---
 
 commit 1ef44c2e9a132df3513a51e3a8dac658236a2263
 Author: Kishore Gopalakrishna g.kish...@gmail.com
 Date:   Sun Mar 8 16:40:29 2015 -0700
 
 [HELIX-573] Add support to automatically compress/uncompress data in 
 Zookeeper
 
 :100644 100644 4419fdd... 1f34529... M  helix-core/src/main/java/org/apache/helix/manager/zk/ZNRecordSerializer.java
 :100644 100644 2d7cb3c... 26d7e2b... M  helix-core/src/main/java/org/apache/helix/manager/zk/ZNRecordStreamingSerializer.java
 :000000 100644 0000000... 90c1e8e... A  helix-core/src/test/java/org/apache/helix/manager/zk/TestZNRecordSerializer.java
 :100644 100644 e4b0b25... 95064f8... M  helix-core/src/test/java/org/apache/helix/manager/zk/TestZNRecordStreamingSerializer.java
 :100755 100755 e869f25... 99ef81c... M  hpost-review.sh
 
 
 Diffs
 ---
 
   helix-core/src/main/java/org/apache/helix/manager/zk/ZKHelixDataAccessor.java 169c993
   helix-core/src/main/java/org/apache/helix/manager/zk/ZNRecordSerializer.java 4419fdd
   helix-core/src/main/java/org/apache/helix/manager/zk/ZNRecordStreamingSerializer.java 2d7cb3c
   helix-core/src/test/java/org/apache/helix/integration/TestBucketizedResource.java 207a318
   helix-core/src/test/java/org/apache/helix/manager/zk/TestZNRecordSerializer.java PRE-CREATION
   helix-core/src/test/java/org/apache/helix/manager/zk/TestZNRecordStreamingSerializer.java e4b0b25
   hpost-review.sh 82cbcf9
 
 Diff: https://reviews.apache.org/r/31836/diff/
 
 
 Testing
 ---
 
 Added basic test for compress/uncompress
 
 
 Thanks,
 
 Kishore Gopalakrishna