dracoooooo commented on issue #1553:
URL: https://github.com/apache/horaedb/issues/1553#issuecomment-2282824406

   
   
   > > The key in the map is `<regionId>:<tableId>`,
   >
   > I think we can encode `regionId` in the WAL directory path, so the key 
could contain only `tableId`.
   
   Indeed.
   
   > > Use the manifest file to skip logs that have already been deleted.
   >
   > How will you skip WAL files? Which strategy will you use?
   
   This manifest exists both in the file system and in memory, where it is 
represented as a map. Since we also keep, in memory, the min and max sequence 
numbers of each table in each segment, we can skip the segments that are not 
needed. While iterating through the remaining segments, we might still 
encounter logs that have already been deleted. In such cases, we can skip them 
based on the information in the map.
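
   A minimal Rust sketch of the two-level skip described above. All names here 
(`SegmentMeta`, `Manifest`, `LogEntry`, `deleted_up_to`) are hypothetical, and 
it assumes the value stored per table in the map is the highest sequence number 
already marked as deleted, which is not settled in this thread.

```rust
use std::collections::HashMap;
use std::ops::RangeInclusive;

type TableId = u64;
type SequenceNumber = u64;

/// In-memory index for one segment: min..=max sequence number per table.
struct SegmentMeta {
    table_ranges: HashMap<TableId, RangeInclusive<SequenceNumber>>,
}

/// In-memory view of the manifest.
/// Assumption: the value is the highest deleted sequence number of the table.
struct Manifest {
    deleted_up_to: HashMap<TableId, SequenceNumber>,
}

struct LogEntry {
    table: TableId,
    seq: SequenceNumber,
}

impl Manifest {
    /// A log is considered deleted if its sequence number is at or below
    /// the table's recorded deletion point.
    fn is_deleted(&self, table: TableId, seq: SequenceNumber) -> bool {
        self.deleted_up_to.get(&table).map_or(false, |&d| seq <= d)
    }

    /// Level 1: skip a whole segment if it holds no logs for the table,
    /// or if even its newest log for the table is already deleted.
    fn segment_needed(&self, seg: &SegmentMeta, table: TableId) -> bool {
        match seg.table_ranges.get(&table) {
            None => false,
            Some(range) => !self.is_deleted(table, *range.end()),
        }
    }
}

/// Level 2: inside a needed segment, drop individual deleted logs.
fn live_entries<'a>(manifest: &Manifest, entries: &'a [LogEntry]) -> Vec<&'a LogEntry> {
    entries
        .iter()
        .filter(|e| !manifest.is_deleted(e.table, e.seq))
        .collect()
}
```

   With a deletion point of 10 for table 1, a segment covering sequences 1 to 
10 of that table is skipped entirely, while a segment covering 8 to 15 is read 
and only entries 11 to 15 are replayed.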
   
   > > Update the values in the map, create a new manifest file, and overwrite 
the old file.
   >
   > This is the normal case. What if a partial error occurs, such as a failed 
overwrite? You need to document this in more detail; pseudocode or a sequence 
diagram may help.
   
   The general steps for overwriting are: acquire the write lock for the 
manifest, create a new temporary file, write to this temporary file, call 
`fsync` to ensure the content has been written to disk, and then use `rename` 
to replace the original file.
   
   If an error occurs in the steps above, I don’t think it can be handled, and 
we would have to panic.
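
   The overwrite steps above could look roughly like this in Rust. This is 
only a sketch: `ManifestFile`, its fields, and the `.tmp` suffix are 
hypothetical, and a plain `Mutex` stands in for the manifest write lock. The 
function returns the I/O error so that the caller can decide to panic.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::PathBuf;
use std::sync::Mutex;

/// Hypothetical handle for the on-disk manifest.
struct ManifestFile {
    path: PathBuf,
    /// Stand-in for the manifest write lock.
    lock: Mutex<()>,
}

impl ManifestFile {
    /// Replace the manifest with `contents` using write-to-temp plus rename.
    fn overwrite(&self, contents: &[u8]) -> io::Result<()> {
        // 1. Acquire the write lock (a poisoned lock panics here).
        let _guard = self.lock.lock().unwrap();

        // 2. Create a new temporary file next to the manifest.
        let tmp_path = self.path.with_extension("tmp");
        let mut tmp = File::create(&tmp_path)?;

        // 3. Write the new manifest content to the temporary file.
        tmp.write_all(contents)?;

        // 4. fsync: flush file data and metadata to disk.
        tmp.sync_all()?;
        drop(tmp);

        // 5. Rename over the original file.
        fs::rename(&tmp_path, &self.path)?;
        Ok(())
    }
}
```

   If any step before the rename fails, the old manifest is still intact on 
disk and only a stale temporary file is left behind. On POSIX systems an 
additional `fsync` on the parent directory is commonly used to make the rename 
itself durable; that step is not part of the list above and is omitted here.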
   
   > > Record the maximum sequence number of all tables in each segment file in 
memory
   >
   > How will you recover this info when the server starts up? Do we need to 
iterate over all the WAL files?
   
   Yes. InfluxDB does the same thing. I think this is a trade-off to avoid 
writing the manifest file during every WAL write operation.
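
   As a rough illustration of that startup scan, the per-segment index can be 
rebuilt by reading each segment's entries once. The `LogEntry` type and the 
function name are hypothetical, and decoding entries from the segment file is 
left out.

```rust
use std::collections::HashMap;

type TableId = u64;
type SequenceNumber = u64;

/// Hypothetical decoded WAL entry; only the fields the index needs.
struct LogEntry {
    table: TableId,
    seq: SequenceNumber,
}

/// Rebuild the (min, max) sequence numbers per table for one segment
/// by iterating over all of its entries.
fn rebuild_segment_index(
    entries: impl IntoIterator<Item = LogEntry>,
) -> HashMap<TableId, (SequenceNumber, SequenceNumber)> {
    let mut index: HashMap<TableId, (SequenceNumber, SequenceNumber)> = HashMap::new();
    for entry in entries {
        // First entry for a table seeds both bounds; later ones widen them.
        let range = index.entry(entry.table).or_insert((entry.seq, entry.seq));
        range.0 = range.0.min(entry.seq);
        range.1 = range.1.max(entry.seq);
    }
    index
}
```

   The cost is one full pass over the WAL files at startup, in exchange for 
not touching the manifest on the write path.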


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

