zealchen opened a new pull request, #1604:
URL: https://github.com/apache/horaedb/pull/1604
## Rationale
#1600
## Detailed Changes
This PR introduces a new binary format for the manifest snapshot file:
```text
| magic(u32) | version(u8) | flags(u8) | length(u64) | Record(N) ... |
The Magic field (u32) is used to ensure the validity of the data source.
The Flags field (u8) is reserved for future extensibility, such as enabling
compression or supporting additional features.
The length field (u64) represents the total length of the subsequent records
and serves as a straightforward method for verifying their integrity. (length =
record_length * record_count)
# Record is a self-descriptive message
| id(u64) | time_range(i64*2)| size(u32) | num_rows(u32)|
```
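The layout above can be sketched as a small encoder and decoder. This is only an illustration, not the code in this PR: the magic value, the little-endian byte order, and the names `Record`, `take`, `encode_snapshot`, and `decode_snapshot` are all assumptions made for the example.

```rust
// Hypothetical values: the PR text does not state the magic number or byte order.
const MAGIC: u32 = 0xCAFE_1234;
const VERSION_V1: u8 = 1;
const HEADER_LEN: usize = 4 + 1 + 1 + 8; // magic + version + flags + length
const RECORD_LEN: usize = 8 + 8 * 2 + 4 + 4; // id + time_range + size + num_rows

#[derive(Debug, Clone, PartialEq)]
struct Record {
    id: u64,
    time_range: (i64, i64),
    size: u32,
    num_rows: u32,
}

// Copies N bytes starting at `at` into a fixed-size array.
fn take<const N: usize>(buf: &[u8], at: usize) -> [u8; N] {
    let mut out = [0u8; N];
    out.copy_from_slice(&buf[at..at + N]);
    out
}

// Writes the header followed by fixed-size records (little-endian assumed).
fn encode_snapshot(records: &[Record]) -> Vec<u8> {
    let body_len = records.len() * RECORD_LEN;
    let mut out = Vec::with_capacity(HEADER_LEN + body_len);
    out.extend_from_slice(&MAGIC.to_le_bytes());
    out.push(VERSION_V1);
    out.push(0); // flags: reserved, none set
    out.extend_from_slice(&(body_len as u64).to_le_bytes());
    for r in records {
        out.extend_from_slice(&r.id.to_le_bytes());
        out.extend_from_slice(&r.time_range.0.to_le_bytes());
        out.extend_from_slice(&r.time_range.1.to_le_bytes());
        out.extend_from_slice(&r.size.to_le_bytes());
        out.extend_from_slice(&r.num_rows.to_le_bytes());
    }
    out
}

// Validates magic, version, and length before decoding the records.
fn decode_snapshot(buf: &[u8]) -> Result<Vec<Record>, String> {
    if buf.len() < HEADER_LEN {
        return Err("buffer is shorter than the header".to_string());
    }
    if u32::from_le_bytes(take::<4>(buf, 0)) != MAGIC {
        return Err("bad magic".to_string());
    }
    if buf[4] != VERSION_V1 {
        return Err(format!("unsupported version {}", buf[4]));
    }
    let length = u64::from_le_bytes(take::<8>(buf, 6));
    let body = &buf[HEADER_LEN..];
    // length must equal record_length * record_count
    if length != body.len() as u64 || body.len() % RECORD_LEN != 0 {
        return Err("length field does not match record bytes".to_string());
    }
    Ok(body
        .chunks_exact(RECORD_LEN)
        .map(|c| Record {
            id: u64::from_le_bytes(take::<8>(c, 0)),
            time_range: (
                i64::from_le_bytes(take::<8>(c, 8)),
                i64::from_le_bytes(take::<8>(c, 16)),
            ),
            size: u32::from_le_bytes(take::<4>(c, 24)),
            num_rows: u32::from_le_bytes(take::<4>(c, 28)),
        })
        .collect())
}
```

With these field widths the header is 14 bytes and each record is 32 bytes, so the length check catches truncated or partially written record data.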
In do_merge, the snapshot data is now handled as follows:
```text
Old data flow in do_merge:
delta_sstmetas
| (extend vec)
V
object_store -> org_bytes -> org_pb -> Vec<sstmeta> -> dst_pb -> dst_bytes
-> object_store
New data flow in do_merge:
delta_sstmetas -> bytes
| (append)
V
object_store -> org_bytes -> dst_bytes -> object_store
```
Specifically, I added SnapshotHeader and SnapshotRecordV1 to represent
the corresponding data in the snapshot bytes. Before merging the delta SST files
into the new bytes, we allocate a larger `Vec<u8>` and copy each segment (header,
old records, new records) into it.
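A minimal sketch of this append-style merge, assuming the 14-byte header and 32-byte records described above, a little-endian length field, and delta records that are already encoded. The function name `merge_snapshot` and the constants are hypothetical and do not come from the PR.

```rust
const HEADER_LEN: usize = 14; // magic(4) + version(1) + flags(1) + length(8)
const RECORD_LEN: usize = 32; // id(8) + time_range(16) + size(4) + num_rows(4)
const LENGTH_OFFSET: usize = 6; // byte offset of the length field in the header

// Builds the new snapshot bytes by copying segments, without decoding old records.
// `delta` is assumed to hold already-encoded records.
fn merge_snapshot(old: &[u8], delta: &[u8]) -> Result<Vec<u8>, String> {
    if old.len() < HEADER_LEN {
        return Err("old snapshot is shorter than the header".to_string());
    }
    if delta.len() % RECORD_LEN != 0 {
        return Err("delta is not a whole number of records".to_string());
    }
    let (header, old_records) = old.split_at(HEADER_LEN);

    // Read the old length (little-endian assumed) and check it against the body.
    let mut len_bytes = [0u8; 8];
    len_bytes.copy_from_slice(&header[LENGTH_OFFSET..HEADER_LEN]);
    let old_len = u64::from_le_bytes(len_bytes);
    if old_len != old_records.len() as u64 {
        return Err("length field does not match old record bytes".to_string());
    }
    let new_len = old_len + delta.len() as u64;

    // One allocation sized for the result, then three copies.
    let mut dst = Vec::with_capacity(old.len() + delta.len());
    dst.extend_from_slice(&header[..LENGTH_OFFSET]); // magic, version, flags
    dst.extend_from_slice(&new_len.to_le_bytes()); // updated length
    dst.extend_from_slice(old_records);
    dst.extend_from_slice(delta);
    Ok(dst)
}
```

Only the length field is rewritten, and the old record bytes are copied through untouched, which is what removes the protobuf decode and re-encode step from the old flow.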
This PR DOES NOT address format upgrade logic, which can be handled in
another PR. For the upgrade, we could define a new SnapshotRecord format and
migrate the data before Manifest::try_new.
## Test Plan
Unit tests.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]