zealchen opened a new pull request, #1604:
URL: https://github.com/apache/horaedb/pull/1604

   ## Rationale
   #1600 
   
   ## Detailed Changes
   Create a new data format to represent the manifest snapshot file.
   ```text
   | magic(u32) | version(u8) | flags(u8) | length(u64) | Record(N) ... |
   
   The magic field (u32) is used to verify that the data source is a valid snapshot.
   The flags field (u8) is reserved for future extensions, such as enabling compression or supporting additional features.
   The length field (u64) is the total length of the records that follow and serves as a simple integrity check (length = record_length * record_count).
   
   # Each Record is a self-descriptive message
   | id(u64) | time_range(i64*2) | size(u32) | num_rows(u32) |
   ```
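A minimal sketch of how this layout could be encoded and validated, for illustration only. The names `Record`, `encode_snapshot`, `decode_snapshot`, the `MAGIC` value, and the little-endian byte order are assumptions, not the definitions used in this PR (which introduces SnapshotHeader and SnapshotRecordV1).

```rust
// Hypothetical constants: the description gives field widths only, so the
// magic value and the little-endian byte order are assumptions.
const MAGIC: u32 = 0x5348_4F54;
const VERSION: u8 = 1;
const HEADER_LEN: usize = 4 + 1 + 1 + 8; // magic + version + flags + length
const RECORD_LEN: usize = 8 + 8 + 8 + 4 + 4; // id + time_range + size + num_rows

#[derive(Debug, Clone, PartialEq)]
struct Record {
    id: u64,
    start: i64, // time_range start
    end: i64,   // time_range end
    size: u32,
    num_rows: u32,
}

fn read_u32(buf: &[u8], at: usize) -> u32 {
    let mut b = [0u8; 4];
    b.copy_from_slice(&buf[at..at + 4]);
    u32::from_le_bytes(b)
}

fn read_u64(buf: &[u8], at: usize) -> u64 {
    let mut b = [0u8; 8];
    b.copy_from_slice(&buf[at..at + 8]);
    u64::from_le_bytes(b)
}

/// Serialize a header followed by fixed-size records.
fn encode_snapshot(records: &[Record]) -> Vec<u8> {
    let length = (records.len() * RECORD_LEN) as u64;
    let mut out = Vec::with_capacity(HEADER_LEN + records.len() * RECORD_LEN);
    out.extend_from_slice(&MAGIC.to_le_bytes());
    out.push(VERSION);
    out.push(0); // flags: reserved, unused in this sketch
    out.extend_from_slice(&length.to_le_bytes());
    for r in records {
        out.extend_from_slice(&r.id.to_le_bytes());
        out.extend_from_slice(&r.start.to_le_bytes());
        out.extend_from_slice(&r.end.to_le_bytes());
        out.extend_from_slice(&r.size.to_le_bytes());
        out.extend_from_slice(&r.num_rows.to_le_bytes());
    }
    out
}

/// Validate the header, then decode every record.
fn decode_snapshot(buf: &[u8]) -> Result<Vec<Record>, String> {
    if buf.len() < HEADER_LEN {
        return Err("buffer is shorter than the header".to_string());
    }
    if read_u32(buf, 0) != MAGIC {
        return Err("magic mismatch".to_string());
    }
    if buf[4] != VERSION {
        return Err("unsupported version".to_string());
    }
    let length = read_u64(buf, 6);
    let body = &buf[HEADER_LEN..];
    // length must equal record_length * record_count.
    if length != body.len() as u64 || body.len() % RECORD_LEN != 0 {
        return Err("length field does not match the record bytes".to_string());
    }
    let records = body
        .chunks(RECORD_LEN)
        .map(|c| Record {
            id: read_u64(c, 0),
            start: read_u64(c, 8) as i64,
            end: read_u64(c, 16) as i64,
            size: read_u32(c, 24),
            num_rows: read_u32(c, 28),
        })
        .collect();
    Ok(records)
}
```

With these widths the header occupies 14 bytes and each record 32 bytes, so a reader can reject a truncated or foreign file before decoding any record.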
   
   In do_merge, the snapshot data is handled as follows:
   
   ```text
   Old data flow in do_merge:
                                         delta_sstmetas
                                                | (extend vec)
                                                V
   object_store -> org_bytes -> org_pb -> Vec<sstmeta> -> dst_pb -> dst_bytes -> object_store
   
   New data flow in do_merge:
                  delta_sstmetas -> bytes
                                     | (append)
                                     V
   object_store -> org_bytes -> dst_bytes -> object_store
   ```
   
   Specifically, I create SnapshotHeader and SnapshotRecordV1 to represent the corresponding data in the snapshot bytes. Before merging the delta SST files into the new bytes, we allocate a larger `Vec<u8>` and copy each segment (header, old records, new records) into it.
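The append-based merge described above could look roughly like the following sketch. The function name `merge_snapshot`, the offsets, and the little-endian length field are assumptions for illustration; the actual do_merge in this PR may differ.

```rust
// Assumed sizes, derived from the field widths in the layout:
// header = u32 + u8 + u8 + u64, record = u64 + i64*2 + u32 + u32.
const HEADER_LEN: usize = 14;
const RECORD_LEN: usize = 32;
const LENGTH_OFFSET: usize = 6; // length follows magic, version and flags

/// Build the new snapshot bytes from the old snapshot and already-encoded
/// delta records, without decoding the old records.
fn merge_snapshot(old: &[u8], delta_records: &[u8]) -> Result<Vec<u8>, String> {
    if old.len() < HEADER_LEN {
        return Err("old snapshot is shorter than the header".to_string());
    }
    if delta_records.len() % RECORD_LEN != 0 {
        return Err("delta is not a whole number of records".to_string());
    }
    let old_records = &old[HEADER_LEN..];

    // Read the old length field and check it against the actual record bytes.
    let mut len_bytes = [0u8; 8];
    len_bytes.copy_from_slice(&old[LENGTH_OFFSET..HEADER_LEN]);
    let old_len = u64::from_le_bytes(len_bytes);
    if old_len != old_records.len() as u64 {
        return Err("length field does not match the old record bytes".to_string());
    }
    let new_len = old_len + delta_records.len() as u64;

    // Allocate the larger buffer once, then copy each segment into it.
    let mut dst = Vec::with_capacity(old.len() + delta_records.len());
    dst.extend_from_slice(&old[..LENGTH_OFFSET]); // magic, version, flags
    dst.extend_from_slice(&new_len.to_le_bytes()); // updated length
    dst.extend_from_slice(old_records); // old records
    dst.extend_from_slice(delta_records); // new records
    Ok(dst)
}
```

Because the old records are copied as raw bytes, this path skips the protobuf decode and re-encode steps that appear in the old data flow.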
   
   This PR does not address format upgrade logic, which can be handled in another PR. For an upgrade, we could define a new SnapshotRecord format and migrate the data before Manifest::try_new.
   
   
   ## Test Plan
   Unit tests (UT)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

