PokIsemaine opened a new pull request, #2332:
URL: https://github.com/apache/kvrocks/pull/2332

   issue: https://github.com/apache/kvrocks/issues/2310
   This is just a draft PR to demonstrate ideas, so there are a lot of 
imperfections.
   
   # **A. How:**
   
![image](https://github.com/apache/kvrocks/assets/46661603/37eed141-02a0-4743-b792-6546955f6b22)
   
   Add a Context parameter to each DB API. The Context contains the fixed 
snapshot during the entire call process and the resulting operations (stored 
using rocksdb::WriteBatchWithIndex).
   **Start of call:** Construct `Context`, use snapshot for fixed call, and 
pass it to DB API as a parameter
   
   **During the call:**
   
   Read operation:
     - Use rocksdb::WriteBatchWithIndex's `GetFromBatchAndDB` to read data. 
Only two parts of data can be seen in one call: the write operation data in 
this call stored in `rocksdb::WriteBatchWithIndex`; the snapshot is 
`Context.snapshot` DB data at the time. (`NewIterator` and `MultiGet` use the 
same method)
     - You will not see data from other concurrent API calls.
     
   Write operations:
     - Kvrocks current write operation: `GetBatchBase` gets a `WriteBatch`, 
then adds operations to `WriteBatch`, and then writes the operations in 
`WriteBatch` to DB.
     - Modify the `writeToDB` part of Write and implement a 
`WriteBatch::Handler` (tentatively named `WriteBatchIndexer`) for iterating the 
operations in `WriteBatch` and appending them to `WriteBatchWithIndex` of 
`Context`.
   
   # **B. Possible codes to pay attention to:**
   
   **storage:**
     - Structure of Context
     - writeToDB and Get
     - batch_indexer.h: WriteBatchIndexer
   
   **redis_db:** As an example, other APIs are derived from the redis_db API
   
   **cmd_xx:** Construct a new Context when executing
   
   # **C. Description: This section is used to explain the modifications that 
you may feel strange.**
   
   **CI part:** I temporarily added the `-v` parameter to observe the running 
of golang integration tests in GitHub Action in more detail
   
   **Test part:**
     - C++ unit test: I added `engine::Context` to `TestBase` to simplify test 
initialization. During the test process, I used `SetLatestSnapshot` of 
`Context` to get the latest `snapshot` and clear the `batch` in `Context`. This 
is because some tests A large number of loops are used. If the same ctx is used 
all the time, the data of all loops will be accumulated on it, and lead to 
extremely long time consuming. This is unnecessary because for the test case we 
do not require multiple APIs to be guaranteed to read the same `snapshot`.
     - Go integration test: I added Sleep to the BlockCommand test of list and 
zset, because I found that sometimes the execution order of multiple client 
requests would be inconsistent with the order of code writing due to 
fluctuations, resulting in infinite waiting or errors. As a result, for 
reference to other parts of the code and for convenience, I added some sleep to 
ensure the order.
   
   **TODO:** TODO: ctx in the implementation code means that I am not sure 
whether the ctx setting here is correct. Should it come from the upper 
parameters or create a Context myself? Where is the boundary of the Context. 
I'm not familiar with the search, slot, etc. parts yet.
   
   # **D. Improvement: The following are the points that I currently consider 
need to be improved.**
   
   - `Context` should be able to replace the original `LatestSnapshot`, 
`GetOptions`, and merge part of the API's `ReadOptions`
   - Improve comments and code
   - Correctness verification:
     - Unit test for WriteBatch::Handler
     - jepsen, sync point, and script tools verify the isolation level 
(expected isolation level of snapshot) and possible concurrency exceptions
   Performance testing, especially testing of DeleteRange related operations 
(if not ideal, try disabling it)
   
   # **E. suggestion:**
   * Because it is a draft PR, it may be more efficient to discuss rationality 
first and then discuss optimization details.
   * When discussing, you can use A-E to indicate which part is being discussed


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to