tutububug commented on code in PR #207:
URL: https://github.com/apache/kvrocks-website/pull/207#discussion_r1547513290


##########
community/data-structure-on-rocksdb.md:
##########
@@ -314,3 +315,28 @@ where the `payload` is a string encoded in the 
corresponding `format`:
 | CBOR       |          1 |
 
 Also, if we decide to add a more IO-friendly format to avoid reading all 
payload to the memory before searching an element via JSONPath or seperate a 
relatively large JSON to multiple key-values, we can take advantage of the 
`format` field.
+
+## Hyperloglog
+
+Redis hyperloglog can be thought of as a static array with a length of 16384. 
The array elements are called registers, which are used to store the maximum 
count of consecutive 0s. This register array is the input parameter for the 
hyperloglog algorithm. 
+In Kvrocks, the hyperloglog data structure is stored in following two parts:
+ 
+#### hyperloglog metadata
+
+```text
+        +----------+------------+-----------+-----------+
+key =>  |  flags   |  expire    |  version  |  size     |
+        | (1byte)  | (Ebyte)    |  (8byte)  | (Sbyte)   |
+        +----------+------------+-----------+-----------+
+```
+- `size` records the number of registers, which is a constant, even no element 
is added.
+
+#### hyperloglog sub keys-values

Review Comment:
   I had changed my previous implementation of the storage format, currently, 
the format is the same as the bitmap's, and each HLL subkey represents a 
segment with 1024 registers. 
   Also, I tried to use bitfield, but finally, I found that it requires byte 
alignment because of 'memcpy' (it should only be able to operate bits within a 
byte). The implementation of Redis only uses 6 bits, not aligned, to save more 
space, so it seems not suitable. 
   However, another problem arises. When the register is set to 6-bit, unit 
test runs fail on a few OS releases in CI, but there is no problem when it is 
set to 8-bit, maybe it has something to do with byte alignment, but I don’t 
know the reason yet. Therefore, the current implementation still maintains 8 
bits without using bitfield.
   If I understand something wrong, correct me, thanks.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to