dashjay opened a new pull request, #3346:
URL: https://github.com/apache/kvrocks/pull/3346

   This PR aims to close the original issue: #2269
   
   After reading all the code in these PRs:
   - #2717
   - #2402
   - #3269
   
   I came up with an approach that differs from these three PRs. I learned a lot from #3269, which encodes the expire time into the value, but its compatibility is limited. My comment at https://github.com/apache/kvrocks/pull/3269#issuecomment-3709830181 asked:
   
   > If a user's value happened to be set to [0xFF][0xFE][1-byte flags][8-byte timestamp if flag set]..... before the update, will this key be treated as expired at once?
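
The collision raised in that comment can be sketched as follows. The layout is the one quoted above; the byte order, the flag bit, and the name `decode_inline_ttl` are assumptions made for illustration only and are not taken from #3269.

```python
import struct

MAGIC = b"\xff\xfe"        # marker bytes quoted in the comment above
FLAG_HAS_EXPIRE = 0x01     # assumed flag bit, for illustration only


def decode_inline_ttl(raw: bytes):
    """Return (expire_ms or None, payload) for a hypothetical inline-TTL layout."""
    if len(raw) >= 3 and raw[:2] == MAGIC:
        flags = raw[2]
        if flags & FLAG_HAS_EXPIRE and len(raw) >= 11:
            (expire_ms,) = struct.unpack(">Q", raw[3:11])  # assumed big-endian
            return expire_ms, raw[11:]
        return None, raw[3:]
    return None, raw  # value without the marker is returned untouched


# A value written before the update that happens to start with the marker.
legacy_value = MAGIC + bytes([FLAG_HAS_EXPIRE]) + struct.pack(">Q", 1) + b"user data"
expire_ms, payload = decode_inline_ttl(legacy_value)
now_ms = 1_700_000_000_000
misread_as_expired = expire_ms is not None and expire_ms <= now_ms
```

In this sketch the decoder cannot tell the old user value from a new encoded one, so the old value is read as already expired and its first 11 bytes are dropped from the payload.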
   
   Having read all of these PRs, I see these key problems with the new feature:
   
       some time/space complexities go from O(1) to O(N)
       how the expire time is encoded into the value
       garbage-collecting expired keys in the background is costly
       ...
   
   So the key problem in adding hash field expiration commands to kvrocks is: **let some (or every) field in a hash carry a TTL while keeping the read/write hot path O(1) and keeping every existing function as it was, correct and fast.**
   
   Sorry, I can't agree that "hlen" or any other Redis command may return an incorrect result at any time; kvrocks should be 100% compatible with Redis. I can only agree that if a command such as "hrangebylex" does not originally exist in Redis, we may modify its API (though better not to).
   
   The following is the proposal and its trade-offs:
   
   # INTRODUCTION
   
   Redis commands that need to be implemented:
   - HEXPIRE – set TTL for a field (in seconds) [Redis HEXPIRE | 
Docs](https://redis.io/docs/latest/commands/hexpire/)
   - HPEXPIRE – set TTL for a field (in milliseconds) [Redis HPEXPIRE | 
Docs](https://redis.io/docs/latest/commands/hpexpire/)
   - HEXPIREAT – set expiration timestamp for a field (in seconds) [Redis 
HEXPIREAT | Docs](https://redis.io/docs/latest/commands/hexpireat/)
   - HPEXPIREAT – set expiration timestamp for a field (in milliseconds) [Redis 
HPEXPIREAT | Docs](https://redis.io/docs/latest/commands/hpexpireat/)
   - HTTL – get remaining TTL for a field (in seconds) [Redis HTTL | 
Docs](https://redis.io/docs/latest/commands/httl/)
   - HPTTL – get remaining TTL for a field (in milliseconds) [Redis HPTTL | 
Docs](https://redis.io/docs/latest/commands/hpttl/)
   - HEXPIRETIME – get expiration timestamp for a field (in seconds) [Redis 
HEXPIRETIME | Docs](https://redis.io/docs/latest/commands/hexpiretime/)
   - HPEXPIRETIME – get expiration timestamp for a field (in milliseconds) 
[Redis HPEXPIRETIME | Docs](https://redis.io/docs/latest/commands/hpexpiretime/)
   - HSETEX – set a field and its TTL together [Redis HSETEX | Docs](https://redis.io/docs/latest/commands/hsetex/)
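
The four setter commands differ only in unit and in whether the argument is relative or absolute, so they can all be reduced to one absolute millisecond timestamp before storage. A minimal sketch, assuming a hypothetical helper `to_expire_at_ms` and an explicit `now_ms` parameter (neither is part of this PR):

```python
def to_expire_at_ms(command: str, arg: int, now_ms: int) -> int:
    """Map the argument of a field-expire command to an absolute ms timestamp."""
    name = command.upper()
    if name == "HEXPIRE":       # relative, seconds
        return now_ms + arg * 1000
    if name == "HPEXPIRE":      # relative, milliseconds
        return now_ms + arg
    if name == "HEXPIREAT":     # absolute, seconds
        return arg * 1000
    if name == "HPEXPIREAT":    # absolute, milliseconds
        return arg
    raise ValueError(f"not a field-expire command: {command}")


def remaining_ms(expire_at_ms: int, now_ms: int) -> int:
    """Remaining lifetime in milliseconds, clamped at zero once the field has expired."""
    return max(expire_at_ms - now_ms, 0)
```

Storing a single absolute value would let the four getter commands share one code path as well, each converting on the way out.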
   
   
   <details>
   
   <summary> Old commands (need compatibility) </summary>
   
   - hget: return not found when expired
   - hincrby: hincrby from 0 when expired
   - hincrbyfloat: hincrbyfloat from 0 when expired
   - hset: how the expire time work ?
   - hsetexpire: expire the entire hash
   - hsetnx: ""
   - hdel: return 0 if expired
   - hstrlen: check expired
   - hexists: check expired
   - hlen: big problem, the size should exclude the count of expired fields
   - hmget: check expired
   - hmset: ""
   - hkeys: filter out the expired keys
   - hvals: filter out the expired values
   - hgetall: filter out the expired key-value pairs
   - hscan: a large number of expired keys will exhaust the CPU
   - hrangebylex: ""
   - hrandfield: a large number of expired keys will exhaust the CPU
   
   </details>
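
The read-side rules in the list above can be sketched on a toy in-memory model. The class `HashWithFieldTTL`, the setup helper `put`, and the explicit `now_ms` argument are all hypothetical; clearing the expiry when `hincrby` restarts from 0 is an assumption, not something this proposal specifies.

```python
class HashWithFieldTTL:
    """Toy in-memory hash; `expire_at` maps field -> absolute ms timestamp."""

    def __init__(self):
        self.values = {}
        self.expire_at = {}

    def put(self, field, value, expire_at_ms=None):
        # Setup helper for the sketch, not a Redis command.
        self.values[field] = value
        if expire_at_ms is None:
            self.expire_at.pop(field, None)
        else:
            self.expire_at[field] = expire_at_ms

    def _alive(self, field, now_ms):
        if field not in self.values:
            return False
        deadline = self.expire_at.get(field)
        return deadline is None or deadline > now_ms

    def hget(self, field, now_ms):
        # Expired fields read as "not found".
        return self.values[field] if self._alive(field, now_ms) else None

    def hincrby(self, field, delta, now_ms):
        # An expired field restarts from 0 (expiry cleared: an assumption).
        if self._alive(field, now_ms):
            base = int(self.values[field])
        else:
            base = 0
            self.expire_at.pop(field, None)
        self.values[field] = str(base + delta)
        return base + delta

    def hdel(self, field, now_ms):
        # Returns 0 for an expired field, but still removes the stale entry.
        existed = self._alive(field, now_ms)
        self.values.pop(field, None)
        self.expire_at.pop(field, None)
        return 1 if existed else 0

    def hkeys(self, now_ms):
        # O(N) filter: this is where many expired fields cost CPU.
        return [f for f in self.values if self._alive(f, now_ms)]
```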
   
   # Where to store the timestamp ?
   
   
   | implementation | pros | cons |
   |----------------|------|------|
   | store all timestamp metadata, including the expire time of every field, in metadata_cf | backward compatible | updates cost too much (write amplification) |
   | encode the expire time in every value | lower cost | compatibility problems |
   | encode the expire time in the key | lower cost | compatibility problems |
   | store the expire time in another encoded key | lower cost, good compatibility | every read needs a second lookup |
   
   There must be some field count N at which one implementation starts to outperform another; a lot of trade-off analysis still needs to be done.
   
   If I go with `store the expire time in another encoded key`, the existing commands are affected as follows:
   
   All read operations will execute twice:
   
   > rocksdb:get -> rocksdb:get * 2, so p99 may go from about 25us to about 50us
   
   The following commands will be affected:
   
   - hget, hmget, hkeys, hvals, hgetall, hscan, `hrangebylex`, hrandfield
   
   How to encode the key is also a problem. I came up with two ways:
   - store the expire times in another column family (this adds one more cf for only a small feature)
   - use metadata.version + magic_number as the prefix for storing the TTLs (I think this is better)
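
One possible reading of the second option, sketched with a plain `dict` standing in for RocksDB. The key layout is deliberately simplified and is not the real kvrocks subkey encoding; `TTL_MAGIC`, `field_key`, `ttl_key`, and the big-endian 8-byte timestamp are all assumptions for illustration.

```python
import struct

TTL_MAGIC = 0x5454_4C00          # hypothetical magic number
U64_MASK = 0xFFFF_FFFF_FFFF_FFFF


def field_key(user_key: bytes, version: int, field: bytes) -> bytes:
    # Simplified layout: key length | key | version | field.
    return struct.pack(">I", len(user_key)) + user_key + struct.pack(">Q", version) + field


def ttl_key(user_key: bytes, version: int, field: bytes) -> bytes:
    # Same layout, but under a version derived as metadata.version + magic.
    return field_key(user_key, (version + TTL_MAGIC) & U64_MASK, field)


def hset_with_ttl(db: dict, user_key, version, field, value, expire_at_ms):
    db[field_key(user_key, version, field)] = value            # value stored unchanged
    db[ttl_key(user_key, version, field)] = struct.pack(">Q", expire_at_ms)


def hget(db: dict, user_key, version, field, now_ms):
    value = db.get(field_key(user_key, version, field))        # first lookup
    if value is None:
        return None
    raw = db.get(ttl_key(user_key, version, field))            # second lookup
    if raw is not None and struct.unpack(">Q", raw)[0] <= now_ms:
        return None
    return value
```

Because the field value itself is never rewritten, fields written before the feature existed simply have no TTL entry and read back as before, at the price of the second lookup.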
   
   
   # HLen problem
   
   OK, the final problem is this:
   
   The "hlen" command was originally O(1) but now needs to be O(N).
   
   
   I came up with two ways to solve this problem:
   
   - metadata.size remains the total element count, and hlen takes O(N) to count the live fields
   - provide a new command "hlenrelax" that is not Redis-compatible and returns an approximate count, which is corrected over time by the compaction_filter
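
The difference between the two options can be sketched as below. Here `fields` maps each field to its expire timestamp (or `None`), and `compact` is only a stand-in for whatever the compaction filter would do; all names are hypothetical.

```python
def hlen_exact(fields: dict, now_ms: int) -> int:
    """Option 1: O(N) scan that counts only live fields."""
    return sum(1 for deadline in fields.values()
               if deadline is None or deadline > now_ms)


def hlen_relaxed(metadata_size: int) -> int:
    """Option 2: O(1), returns the stored counter, which may include expired fields."""
    return metadata_size


def compact(fields: dict, metadata_size: int, now_ms: int):
    """Stand-in for compaction: drop expired fields and correct the counter."""
    alive = {f: d for f, d in fields.items() if d is None or d > now_ms}
    return alive, metadata_size - (len(fields) - len(alive))
```

Between compactions the relaxed count can overstate the number of live fields; after one it matches the exact count again.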
   
   

