DCjanus commented on issue #1658:
URL: https://github.com/apache/kvrocks/issues/1658#issuecomment-2873214854

   Here's my approach to enhancing the RANDOMKEY command:
   
   I'm considering implementing a more uniformly random key selection using the 
underlying storage structure:
   
   1. Use RocksDB's `GetLiveFilesMetaData()` to retrieve all active SST files 
and their key counts
   2. Select an SST file randomly with weighting based on each file's key count
   3. Implement a custom `TablePropertiesCollector` that records a sample of 
keys while each SST file is built (e.g. every 128th key) and stores them in 
the file's table properties
   4. To serve RANDOMKEY, pick one of the chosen file's sampled keys as a 
starting point, scan ~128 keys from there, and return one of them at random
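
   The four steps above can be sketched as a small simulation. This is only 
an illustration under stated assumptions, not kvrocks or RocksDB code: 
`SstFile`, `random_key`, and `SAMPLE_INTERVAL` are hypothetical names, each 
SST file is modelled as a sorted in-memory list, and the sampled keys are 
derived on the fly instead of being read from table properties.

```python
import bisect
import random
from dataclasses import dataclass

# Assumed sampling stride, taken from the "every 128 keys" example above.
SAMPLE_INTERVAL = 128


@dataclass
class SstFile:
    """Hypothetical in-memory stand-in for one SST file."""
    keys: list  # sorted keys stored in this file

    @property
    def samples(self):
        # What the collector would have stored: every SAMPLE_INTERVAL-th key.
        return self.keys[::SAMPLE_INTERVAL]


def random_key(files, rng=random):
    """Pick a key using the file-weighted, sample-then-scan scheme."""
    candidates = [f for f in files if f.keys]
    if not candidates:
        return None
    # Step 2: choose a file with probability proportional to its key count.
    weights = [len(f.keys) for f in candidates]
    chosen = rng.choices(candidates, weights=weights, k=1)[0]
    # Step 4: choose a sampled key, locate it, then scan up to
    # SAMPLE_INTERVAL keys from there and return one of them.
    start_key = rng.choice(chosen.samples)
    start = bisect.bisect_left(chosen.keys, start_key)
    window = chosen.keys[start:start + SAMPLE_INTERVAL]
    return rng.choice(window)
```

   Note that in this sketch the last window of a file can be shorter than 
`SAMPLE_INTERVAL`, so keys at the tail of a file are slightly more likely 
to be returned than the rest.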
   
   For SST files without these properties (files written before the collector 
is introduced), we could:
   - Store a cursor for each SST file that lacks the properties
   - Record in it the position where the last sampling of that file ended
   - On subsequent calls, iterate over the next N keys from that position
   - Randomly select one key from those N keys
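
   A minimal sketch of this cursor fallback, again as a simulation rather 
than real storage code. `CursorSampler`, `SCAN_LENGTH`, and the use of a 
list index as the cursor are assumptions made for illustration; an actual 
implementation would more likely remember the last key visited.

```python
import random

SCAN_LENGTH = 128  # hypothetical value for N


class CursorSampler:
    """Keeps one resume position per SST file that has no sampled keys."""

    def __init__(self, scan_length=SCAN_LENGTH, rng=None):
        self.scan_length = scan_length
        self.rng = rng or random.Random()
        self.cursors = {}  # file name -> index where the next scan starts

    def pick(self, file_name, keys):
        """Scan the next window of `keys` and return one key from it."""
        if not keys:
            return None
        start = self.cursors.get(file_name, 0)
        if start >= len(keys):
            start = 0  # stale cursor: restart from the beginning
        window = keys[start:start + self.scan_length]
        # Remember where this scan stopped; wrap around at the end of file.
        next_pos = start + len(window)
        self.cursors[file_name] = next_pos if next_pos < len(keys) else 0
        return self.rng.choice(window)
```

   Each call only touches N keys, but successive calls walk through the 
file in order, so the result is random within a window rather than across 
the whole file.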
   
   Limitations:
   1. Frequently modified keys may exist in multiple SST files, so they are 
more likely to be selected, which skews the distribution
   2. Additional storage overhead for sampled keys in table properties
   3. Many edge cases to handle in the implementation
   
   I'm not sure this much complexity is justified for a command that may not 
be heavily used. The current implementation (scanning 60 keys and selecting 
one) might be sufficient for most use cases, even though its selection is not 
uniformly random.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
