dracoooooo opened a new issue, #1553:
URL: https://github.com/apache/horaedb/issues/1553

   ### Describe This Problem
   
   To implement a WAL based on the local disk, in addition to using segment 
files to record logs, it is also necessary to use another file to record the 
metadata of the WAL. This is because the current WAL `Delete` interface 
includes a tableId as a parameter, and logs for multiple tables are recorded in 
the same segment file. This means that it is not possible to simply mark all 
logs before a certain sequence number as deletable. Therefore, a manifest file 
is needed to maintain this information.
   
   ### Proposal
   
   ### Format
   
   Using protobuf as the file format for WAL manifest:
   
   ```proto
   syntax = "proto3";
   
   message Manifest {
     map<string, uint64> latest_mark_deleted = 1;
   }
   ```
   
   The key in the map is `<regionId>:<tableId>`, and the value is the highest 
sequence number marked as deleted for this table in the WAL.
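   
   As a rough illustration of the key layout described above, the map can be 
modelled as a plain dictionary. The helper name `manifest_key` and the sample 
IDs are hypothetical and only serve the example:

```python
def manifest_key(region_id: int, table_id: int) -> str:
    """Build the map key in the `<regionId>:<tableId>` form."""
    return f"{region_id}:{table_id}"


# In-memory stand-in for `Manifest.latest_mark_deleted` (not the protobuf type).
latest_mark_deleted = {}

# Table 42 in region 1: sequence 100 is the highest number marked as deleted.
latest_mark_deleted[manifest_key(1, 42)] = 100
```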
   
   The manifest file deliberately records nothing else, so that it does not 
need to be updated when logs are appended, which reduces I/O overhead.
   
   ### Append Logs
   
   Do not update the manifest file.
   
   ### Read Logs
   
   Use the manifest file to skip logs that have already been deleted.
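   
   A minimal sketch of this filtering step, assuming a hypothetical `LogEntry` 
record and assuming the stored value is inclusive (an entry whose sequence 
number equals the stored value counts as deleted):

```python
from typing import Dict, Iterable, Iterator, NamedTuple


class LogEntry(NamedTuple):
    # Hypothetical record layout, used only for this illustration.
    region_id: int
    table_id: int
    sequence: int
    payload: bytes


def read_live_entries(
    entries: Iterable[LogEntry], latest_mark_deleted: Dict[str, int]
) -> Iterator[LogEntry]:
    """Yield only the entries that the manifest has not marked as deleted."""
    for entry in entries:
        key = f"{entry.region_id}:{entry.table_id}"
        deleted_up_to = latest_mark_deleted.get(key)
        # Assumption: the mark is inclusive, so `sequence <= mark` is skipped.
        if deleted_up_to is not None and entry.sequence <= deleted_up_to:
            continue
        yield entry
```

   Tables without an entry in the map have nothing marked as deleted, so all 
of their logs are returned.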
   
   ### Delete Logs
   
   Update the values in the map, create a new manifest file, and overwrite the 
old file.
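   
   One possible way to overwrite the file safely is to write the new manifest 
to a temporary file in the same directory and then rename it over the old one. 
This is an assumption about how the overwrite could be done, not something the 
proposal specifies. The function name `overwrite_manifest` is hypothetical, and 
`serialized` stands for the already encoded protobuf bytes:

```python
import os
import tempfile


def overwrite_manifest(path: str, serialized: bytes) -> None:
    """Replace the manifest at `path` with `serialized` via a temp file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so the rename stays on one
    # filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".manifest-", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(serialized)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the new contents to disk first
        os.replace(tmp_path, path)  # swap the new file in place of the old one
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

   With this approach a reader sees either the old manifest or the new one, 
never a partially written file.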
   
   For each segment file, keep in memory the maximum sequence number of every 
table that has logs in it. Once every one of those tables has a marked-deleted 
sequence number greater than its maximum sequence number in the segment, 
delete the segment.
   
   When an old segment is deleted, if a table’s log exists only in this old 
segment, then remove this table from the manifest’s map.
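   
   The bookkeeping above could look roughly like the following sketch. The 
class `SegmentTracker` and its method names are hypothetical, sequence numbers 
are assumed to be tracked per table, and the comparison is strict ("greater 
than") as stated in the proposal:

```python
class SegmentTracker:
    """In-memory bookkeeping for segment deletion (illustrative only)."""

    def __init__(self):
        # segment_id -> {table_key -> max sequence of that table in the segment}
        self.max_seq = {}
        # table_key -> highest sequence number marked as deleted
        self.latest_mark_deleted = {}

    def record_append(self, segment_id, table_key, sequence):
        per_table = self.max_seq.setdefault(segment_id, {})
        per_table[table_key] = max(sequence, per_table.get(table_key, sequence))

    def mark_deleted(self, table_key, sequence):
        """Raise the table's mark and return the ids of segments now removable."""
        current = self.latest_mark_deleted.get(table_key, sequence)
        self.latest_mark_deleted[table_key] = max(current, sequence)

        removed = []
        for segment_id, per_table in list(self.max_seq.items()):
            marks = self.latest_mark_deleted
            # Strict comparison, following the wording of the proposal.
            if all(k in marks and marks[k] > m for k, m in per_table.items()):
                del self.max_seq[segment_id]
                removed.append(segment_id)
                # Drop tables whose logs existed only in the removed segment.
                for k in per_table:
                    if not any(k in other for other in self.max_seq.values()):
                        self.latest_mark_deleted.pop(k, None)
        return removed
```

   The caller would then delete the returned segment files and persist the 
updated map to the manifest.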
   
   ### Potential Risks
   
   1. If the number of tables is very large, the overhead of overwriting this 
manifest file each time could be significant.
   
   
   ### Additional Context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

