DCjanus commented on issue #1658: URL: https://github.com/apache/kvrocks/issues/1658#issuecomment-2873214854
Here's my approach to enhancing the RANDOMKEY command. I'm considering a more uniformly random key selection that uses the underlying storage structure:

1. Use RocksDB's `GetLiveFilesMetaData()` to retrieve all live SST files and their key counts.
2. Select an SST file at random, weighted by each file's key count.
3. Implement a custom `TablePropertiesCollector` that samples keys during SST construction (e.g. every 128 keys).
4. When selecting a random key, choose one of these sampled keys as a starting point, scan ~128 keys from there, and randomly select one.

For SST files without these properties (files written before this change), we could:

- Store a cursor for each such SST file
- Record the last sampling result for that file
- Sequentially iterate N times from that position in subsequent calls
- Randomly select one key from this iteration

Limitations:

1. Keys that are frequently modified may exist in multiple SST files, skewing randomness.
2. Additional storage overhead for sampled keys in table properties.
3. Many edge cases to handle in the implementation.

I'm not sure if introducing this much complexity is justified for a command that may not be heavily used. The current implementation (scanning 60 keys and selecting one) might be sufficient for most use cases despite not providing true randomness.
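To make steps 2 and 4 concrete, here is a minimal sketch using only the C++ standard library. It is not kvrocks or RocksDB code: `FakeSst`, `PickFile`, `PickInFile`, and `RandomKey` are hypothetical names, and each SST file is modeled as an in-memory vector of keys in which every 128th key plays the role of a sampled key.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <random>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for an SST file; not a RocksDB type.
struct FakeSst {
  std::vector<std::string> keys;  // assumed sorted, as keys in an SST file are
};

// Sampling interval from the proposal ("every 128 keys").
constexpr std::size_t kSampleInterval = 128;

// Step 2: choose a file with probability proportional to its key count.
// Returns std::nullopt when every file is empty.
std::optional<std::size_t> PickFile(const std::vector<FakeSst>& files,
                                    std::mt19937_64& rng) {
  std::vector<double> weights;
  weights.reserve(files.size());
  double total = 0;
  for (const auto& f : files) {
    weights.push_back(static_cast<double>(f.keys.size()));
    total += weights.back();
  }
  if (total == 0) return std::nullopt;  // the distribution needs a positive sum
  std::discrete_distribution<std::size_t> dist(weights.begin(), weights.end());
  return dist(rng);
}

// Step 4: choose a sampled anchor (every kSampleInterval-th key), then choose
// uniformly among the keys from that anchor up to the next anchor.
const std::string& PickInFile(const FakeSst& file, std::mt19937_64& rng) {
  assert(!file.keys.empty());
  const std::size_t n = file.keys.size();
  const std::size_t anchors = (n + kSampleInterval - 1) / kSampleInterval;
  std::uniform_int_distribution<std::size_t> pick_anchor(0, anchors - 1);
  const std::size_t start = pick_anchor(rng) * kSampleInterval;
  const std::size_t len = std::min(kSampleInterval, n - start);
  // A short final block makes its keys slightly more likely than the others.
  std::uniform_int_distribution<std::size_t> pick_offset(0, len - 1);
  return file.keys[start + pick_offset(rng)];
}

// Steps 2 and 4 combined: weighted file choice, then anchor-based key choice.
std::optional<std::string> RandomKey(const std::vector<FakeSst>& files,
                                     std::mt19937_64& rng) {
  const auto idx = PickFile(files, rng);
  if (!idx) return std::nullopt;
  return PickInFile(files[*idx], rng);
}
```

In a real implementation the key counts would come from the live-file metadata and the anchors from the collector's table properties, and the scan would use an iterator seek rather than vector indexing. The sketch also ignores the first limitation above: a key present in several files is counted once per file.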
