GitHub user mapleFU added a comment to the discussion: Any plan to support the JSON in kvrocks?
## RedisJSON

### Type System

Complex Types:
- Array
- Object

Basic Types:
- Int: a 64-bit signed integer
- Double: a float64 number
- Bool
- String

Bottom Type:
- Null

### Format

#### Storage

RedisJSON just stores "Raw IJSON" values for JSON. This is a CBOR-like typed binary JSON format. When stored in rdb, the "Raw IJSON" format is cast to "JSON Text".

#### Memory

- serde_json: the **user input** value is wrapped in SerdeJson
- ijson: ijson is a drop-in replacement for serde_json that provides a lightweight in-memory format. The stored value is deserialized into the ijson format.

### Commands

The commands are listed here:
- https://github.com/RedisJSON/RedisJSON/blob/master/redis_json/src/lib.rs#L190-L214

#### Path

- JsonPath
  - Requires Path and Filter Expression
- Legacy Path Syntax
  - Uses `a.b` and `a["b"]`

# Kvrocks Design

## Method 1: Integrate or Reuse RedisJSON

RedisJSON cannot be well optimized if the "string" representation is used. Instead, we would have to put some effort into re-implementing the old Legacy Path Syntax.

If we just reuse RedisJSON, we can:
1. Make RedisJSON support Redis modules
2. Maintain a fork of RedisJSON, and replace the redismodules interface with kvrocks' callbacks

### Pros:

- Can always stay compatible with RedisJSON
- Easy to integrate RediSearch
- No need to maintain the "Legacy Path Syntax" in kvrocks' code

### Cons:

- It's hard to encapsulate kvrocks's content into a Redis context.
- If we fork RedisJSON, we need to maintain that fork. Currently we would only need to maintain get/set/event/redis-protocol-version and rdb, but if more features are required, maintaining a fork might be hard.
- The underlying storage format might change if RedisJSON or ijson changes its underlying implementation, and RedisJSON doesn't have KeyMeta in storage.

## Method 2: Re-implement JSON in kvrocks

Re-implementing JSON in kvrocks is not hard, but we need to consider the points below:
- JSON Path implementations
- Storage format
- In-memory format
- Third-party libraries

## Third-party libraries

- Jsoncons: https://github.com/danielaparker/jsoncons
  - Supports JSON Path, and has CBOR and BSON extensions
  - Does not support querying objects directly on binary / text
- SIMDJson:
  - Queries JSON directly on raw text

## Storage Format

### Raw Text Data

Pros:
1. Easy to use
2. Can be well compacted
3. Human readable

Cons:
- May consume lots of memory

### Cbor

Pros:
1. Easy to use
2. Usually smaller than JSON text
3. Strongly typed

Cons:
- Parsing may be complex

### Jsonb

Pros:
1. It can be fast when reading, because it supports fast seek
2. Strongly typed

Cons:
- Parsing may be complex
- Might be larger than JSON text

## In-memory format

- The memory format can be the same as the storage format, or `jsoncons::json`. The latter might be memory-consuming.
- When updating or querying:
  - When updating, we might need to support casting the value to `jsoncons::json`, updating it, and then casting it back to the storage format
  - When querying, we will use the storage format directly if possible

### Corner cases

#### Huge JSON Value

If the JSON value is small, any of these formats is OK. If the JSON value is huge, we need to consider:
- Would parsing or updating it be heavy?
- Should it be split up in kvrocks/rocksdb?

### Pros:

- Might have better performance if we can use jsonb without parsing the object
- If the JSON is huge, we're able to separate it into different parts

### Cons:

- Needs lots of compatibility work
- Might not have good performance at first, and the underlying storage format might change

## References

1. PostgreSQL JSON / JSONB
2. RedisJSON
3. Redis Modules
4. Jsoncons

GitHub link: https://github.com/apache/incubator-kvrocks/discussions/1164#discussioncomment-6205300
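
To make the Legacy Path Syntax from the "Path" section (`a.b`, `a["b"]`) more concrete, here is a minimal sketch of how such a path could be tokenized and resolved against a decoded JSON value. It is not RedisJSON's parser: `parse_legacy_path` and `resolve` are hypothetical names, support for array indices such as `[0]` is an assumption beyond the two forms listed above, and Python is used only for brevity.

```python
import json
import re

# One token per path step: .name, ["name"] / ['name'], or [index].
_TOKEN = re.compile(
    r"""\.?(?P<name>[A-Za-z_][A-Za-z0-9_]*)
      | \[(?P<quote>["'])(?P<key>.*?)(?P=quote)\]
      | \[(?P<index>-?\d+)\]
    """,
    re.VERBOSE,
)


def parse_legacy_path(path):
    """Split a legacy path such as a.b[0]["c"] into a list of keys and indices."""
    if path in ("", "."):
        return []  # the root of the document
    steps, pos = [], 0
    while pos < len(path):
        m = _TOKEN.match(path, pos)
        if m is None:
            raise ValueError(f"invalid path at offset {pos}: {path!r}")
        if m.group("name") is not None:
            steps.append(m.group("name"))        # a or .a
        elif m.group("key") is not None:
            steps.append(m.group("key"))         # ["a"] or ['a']
        else:
            steps.append(int(m.group("index")))  # [0] (assumed extension)
        pos = m.end()
    return steps


def resolve(doc, path):
    """Follow the parsed steps through a decoded JSON value."""
    node = doc
    for step in parse_legacy_path(path):
        node = node[step]  # dict lookup for str steps, list index for int steps
    return node


doc = json.loads('{"a": {"b": [10, 20, {"c": "x"}]}}')
print(parse_legacy_path('a.b[2]["c"]'))
print(resolve(doc, 'a["b"][1]'))
```

The tokenizer is deliberately lenient (for example, it does not require a dot between `]` and a following name). A real implementation in kvrocks would need to pin down the exact grammar RedisJSON accepts.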
