PragmaTwice commented on code in PR #207:
URL: https://github.com/apache/kvrocks-website/pull/207#discussion_r1576375575

##########
community/data-structure-on-rocksdb.md:
##########
@@ -314,3 +315,27 @@ where the `payload` is a string encoded in the corresponding `format`:
 | CBOR | 1 |
 
 Also, if we decide to add a more IO-friendly format to avoid reading all payload to the memory before searching an element via JSONPath or seperate a relatively large JSON to multiple key-values, we can take advantage of the `format` field.
+
+## Hyperloglog
+
+Redis hyperloglog can be thought of as a static array with a length of 16384. The array elements are called registers, which are used to store the maximum count of consecutive 0s. This register array is the input parameter for the hyperloglog algorithm.
+In Kvrocks, the hyperloglog data structure is stored in following two parts:
+
+#### hyperloglog metadata
+
+```text
+        +----------+------------+-----------+-----------+
+key =>  |  flags   |  expire    |  version  |  size     |
+        | (1byte)  | (Ebyte)    |  (8byte)  | (Sbyte)   |
+        +----------+------------+-----------+-----------+
+```
+#### hyperloglog sub keys-values
+
+```text
+                              +-----------------------+-----+
+key|version|register_index => |  0s count (1byte)     | ... |
+                              +-----------------------+-----+
+```
+The register index is calculated using the first 14 bits of the user element's hash value (64 bits), which is why the register array length is 16384.
+The length of consecutive zeros is calculated using the last 50 digits of the hash value of the user key.
+Inspired by the bitmap implementation, hyperloglog divides the register array into 16 segments, each with 1024 registers.

Review Comment:
   So the sub-key suffix is no longer actually a `register_index`, but something like a segment index? If each sub key holds a segment of 1024 registers, the diagram should name it accordingly.
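To make the question concrete, here is a minimal sketch of how a 64-bit hash would map to a register and then to a segment under the layout described in the diff. It is an illustration, not the PR's implementation: the function name `locate_register` is hypothetical, and it assumes the Redis convention that the "first 14 bits" are the low-order bits and that the stored value is the zero-run length plus one, bounded by a sentinel bit.

```python
HLL_P = 14                       # bits of the hash used for the register index
HLL_Q = 64 - HLL_P               # remaining 50 bits used for the zero run
HLL_REGISTERS = 1 << HLL_P       # 16384 registers in total
SEGMENT_COUNT = 16               # from the diff: 16 segments
REGISTERS_PER_SEGMENT = HLL_REGISTERS // SEGMENT_COUNT  # 1024


def locate_register(hash64):
    """Return (register_index, segment_index, offset_in_segment, count).

    Assumption: the index comes from the low 14 bits and the count is the
    number of trailing zeros in the other 50 bits plus one (Redis style).
    """
    hash64 &= (1 << 64) - 1                          # clamp to 64 bits
    register_index = hash64 & (HLL_REGISTERS - 1)    # low 14 bits
    rest = (hash64 >> HLL_P) | (1 << HLL_Q)          # sentinel bit ends the loop
    count = 1
    while rest & 1 == 0:
        count += 1
        rest >>= 1
    segment_index = register_index // REGISTERS_PER_SEGMENT
    offset_in_segment = register_index % REGISTERS_PER_SEGMENT
    return register_index, segment_index, offset_in_segment, count


if __name__ == "__main__":
    print(locate_register((0b1000 << 14) | 5000))
```

Under this reading, the value appended to `key|version|` would be `segment_index` (0 to 15), and `offset_in_segment` would select the byte inside that sub key's 1024-byte value.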
