hongzhi-gao opened a new pull request, #731:
URL: https://github.com/apache/tsfile/pull/731

   ## Summary
   
   This PR adds **TsFile recovery and continued writing** in C++: open an 
incomplete or corrupted TsFile, truncate bad tail data, and continue appending 
with the existing tree and table writers. The behavior is aligned with the Java 
`RestorableTsFileIOWriter` and enables safe recovery after crashes or 
interrupted writes.
   
   ## Motivation
   
   - Support **recoverable TsFile writes**: detect incomplete/corrupted files, 
truncate to the last valid offset, and resume writing without losing previously 
committed data.
   - Allow **tail metadata generation** after recovery: chunk group metadata 
and chunk statistics are rebuilt during recovery so that `close()` can write a 
correct metadata index and file tail.
   - Unify the **reader metadata API**: introduce `DeviceTimeseriesMetadataMap` 
and overloads `get_timeseries_metadata(device_ids)` / 
`get_timeseries_metadata()` that return a map; add `get_all_devices()` where 
appropriate. When querying by device list, only devices that exist in the file 
appear in the result.
   
   ## Implementation
   
   ### RestorableTsFileIOWriter (new)
   
   - **Open**: `open(file_path, truncate_corrupted)` opens with 
`O_RDWR|O_CREAT` (no `O_TRUNC`) and runs a **self-check**:
     - Empty file → treat as crashed, allow writing from scratch.
     - Valid header + tail magic → **complete file**; close write handle, 
`can_write() = false`.
     - Otherwise → **recovery path**: scan from header to find the last valid 
truncation point, optionally truncate, then re-init the base writer and 
**restore write_stream_** from file content so `cur_file_position()` is correct 
when generating tail metadata later.
   - **Recovery details**:
     - Rebuild **chunk group meta** and **chunk statistics**. For multi-page 
chunks, merge the per-page statistics; for single-page chunks, decompress and 
decode the page to fill the statistics; for aligned value chunks, use the time 
batch from the time chunk in the same group.
     - **write_stream_** is refilled by reading the current file content into 
the stream; `flush_skip_leading_` is set so that `flush_stream_to_file()` skips 
these leading bytes and only writes new data. No change to the normal write 
path.
     - Recovered `ChunkGroupMeta` entries are pushed via 
`push_chunk_group_meta()` and marked with `chunk_group_meta_from_recovery_` so 
`destroy()` does not free them (they live in the recovery arena).
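
The three self-check outcomes listed under **Open** can be sketched as a small classifier. This is an illustration only, not the code in this PR: `self_check`, `SelfCheckResult`, and the in-memory `file_bytes` argument are hypothetical, and the layout (a `TsFile` magic string plus one version byte at the head, the same magic string at the tail) is an assumption about the format.

```cpp
#include <cstddef>
#include <string>

// Assumed markers: "TsFile" magic + 1 version byte at the head,
// and the same magic string again at the very end of a sealed file.
const std::string kMagic = "TsFile";
const std::size_t kHeaderLen = kMagic.size() + 1;

// Hypothetical result type mirroring the three cases in the description.
enum class SelfCheckResult { kEmpty, kComplete, kNeedsRecovery };

// Classify a file image. Anything that is neither empty nor
// "valid header + tail magic" falls through to the recovery path.
SelfCheckResult self_check(const std::string& file_bytes) {
    if (file_bytes.empty()) {
        return SelfCheckResult::kEmpty;  // crashed before writing anything
    }
    const bool header_ok =
        file_bytes.size() >= kHeaderLen &&
        file_bytes.compare(0, kMagic.size(), kMagic) == 0;
    // A sealed file must be long enough to hold both header and tail magic.
    const bool tail_ok =
        file_bytes.size() >= kHeaderLen + kMagic.size() &&
        file_bytes.compare(file_bytes.size() - kMagic.size(), kMagic.size(),
                           kMagic) == 0;
    if (header_ok && tail_ok) {
        return SelfCheckResult::kComplete;  // can_write() would be false
    }
    return SelfCheckResult::kNeedsRecovery;  // scan, truncate, re-init
}
```

In the real writer the recovery branch would then scan chunk groups to find the truncation point; that part is omitted here.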
   
   ### TsFileIOWriter (base)
   
   - **Recovery-only hooks** (defaults keep normal behavior unchanged):
     - `flush_skip_leading_`: when > 0, `flush_stream_to_file()` skips that 
many leading bytes in the stream (already on disk) and writes the rest.
     - `chunk_group_meta_from_recovery_`: when true, `destroy()` skips freeing 
chunk group meta (owned by recovery).
   - Added `push_chunk_group_meta()`, `set_flush_skip_leading()` (protected) 
and `friend RestorableTsFileIOWriter`.
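
A minimal model of the `flush_skip_leading_` hook, assuming the stream is a plain byte buffer and the file is an in-memory string. `SketchWriter` and its members are hypothetical stand-ins; in particular, clearing the stream and resetting the skip count after a flush is a simplification of this sketch and may differ from what `TsFileIOWriter` does.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the base writer; not the real class.
struct SketchWriter {
    std::string write_stream;            // models write_stream_
    std::size_t flush_skip_leading = 0;  // models flush_skip_leading_
    std::string file;                    // models the bytes already on disk

    // Position used when generating tail metadata: it counts the recovered
    // bytes too, which is why the stream is refilled from the file.
    std::size_t cur_file_position() const { return write_stream.size(); }

    // Write only the bytes that are not on disk yet.
    void flush_stream_to_file() {
        const std::size_t skip =
            std::min(flush_skip_leading, write_stream.size());
        file.append(write_stream, skip, std::string::npos);
        // Simplification: drop the buffer and the skip count after flushing.
        write_stream.clear();
        flush_skip_leading = 0;
    }
};
```

With the default `flush_skip_leading == 0` the flush writes the whole stream, which matches the statement that the normal write path is unchanged.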
   
   ### WriteFile
   
   - Added `truncate(size)`, `seek_to_end()`, `get_position()`, and `get_fd()` 
for recovery (truncate, append position, and reading file content to restore 
`write_stream_`). `close()` is idempotent when already closed.
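
For readers unfamiliar with the underlying calls, the helpers above could map onto POSIX roughly as follows. This is a sketch under assumptions: `SketchWriteFile` is hypothetical, it targets POSIX only, and the `bool`/`off_t` return types are chosen for brevity rather than taken from the PR.

```cpp
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#include <string>

// Hypothetical POSIX-only stand-in for WriteFile.
class SketchWriteFile {
public:
    // O_RDWR | O_CREAT without O_TRUNC, so existing content survives.
    bool open(const std::string& path) {
        fd_ = ::open(path.c_str(), O_RDWR | O_CREAT, 0644);
        return fd_ >= 0;
    }
    bool write(const std::string& data) {
        return fd_ >= 0 &&
               ::write(fd_, data.data(), data.size()) ==
                   static_cast<ssize_t>(data.size());
    }
    // Cut the file back to `size` bytes (the last valid offset).
    bool truncate(off_t size) {
        return fd_ >= 0 && ::ftruncate(fd_, size) == 0;
    }
    // Move the write offset to the end so new data is appended.
    off_t seek_to_end() { return fd_ < 0 ? -1 : ::lseek(fd_, 0, SEEK_END); }
    off_t get_position() const {
        return fd_ < 0 ? -1 : ::lseek(fd_, 0, SEEK_CUR);
    }
    int get_fd() const { return fd_; }
    // Idempotent: a second call is a no-op.
    void close() {
        if (fd_ >= 0) {
            ::close(fd_);
            fd_ = -1;
        }
    }

private:
    int fd_ = -1;
};
```

Note that `ftruncate` does not move the file offset, which is why a separate `seek_to_end()` is needed before appending.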
   
   ### TsFileWriter / TsFileTableWriter / TsFileTreeWriter
   
   - **TsFileWriter**: `init(RestorableTsFileIOWriter* rw)` initializes from a 
recovered writer: takes schema from the file, does not own `io_writer_` 
(`io_writer_owned_ = false`). Ensures `time_chunk_writer_` is created when 
appending after recovery.
   - **TsFileTableWriter** / **TsFileTreeWriter**: New constructors that take 
`RestorableTsFileIOWriter*` for appending after recovery (schema/alignment from 
restored file).
   
   ### Reader API refactor
   
   - **DeviceTimeseriesMetadataMap** type alias: `IDeviceID` → 
`vector<shared_ptr<ITimeseriesIndex>>`.
   - **TsFileReader**: `get_timeseries_metadata(device_ids)` returns a map 
(only existing devices); `get_timeseries_metadata()` returns metadata for all 
devices; `get_all_devices()` returns the same as `get_all_device_ids()`.
   - **TsFileTreeReader**: Same metadata API; `get_all_devices()` returns 
`vector<shared_ptr<IDeviceID>>` (underlying reader).
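
The "only existing devices" rule for `get_timeseries_metadata(device_ids)` can be illustrated with stand-in types. Everything here is an assumption for the sake of a runnable example: `DeviceId` is a plain string instead of `IDeviceID`, `ITimeseriesIndex` is a stub, and the free function takes the full map as an argument instead of reading a file.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Stand-ins for the real reader types (hypothetical, simplified).
struct ITimeseriesIndex {
    std::string measurement_name;
};
using DeviceId = std::string;
using DeviceTimeseriesMetadataMap =
    std::map<DeviceId, std::vector<std::shared_ptr<ITimeseriesIndex>>>;

// Return metadata for the requested devices, silently skipping any
// device that is not present in `all_metadata`.
DeviceTimeseriesMetadataMap get_timeseries_metadata(
    const DeviceTimeseriesMetadataMap& all_metadata,
    const std::vector<DeviceId>& device_ids) {
    DeviceTimeseriesMetadataMap result;
    for (const DeviceId& id : device_ids) {
        auto it = all_metadata.find(id);
        if (it != all_metadata.end()) {
            result.emplace(it->first, it->second);
        }
    }
    return result;
}
```

A caller can therefore detect a missing device by checking `result.count(id) == 0` rather than by handling an error code.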
   
   ## Testing
   
   - **RestorableTsFileIOWriterTest**: open empty file, bad magic, complete 
file, truncated file, header-only file; recovery then continue write with 
**TsFileWriter**, **TsFileTreeWriter** (multi-device), **TsFileTableWriter**, 
and aligned timeseries; read-back with tree/table reader and assert row counts 
and device counts.
   - **WriteFile**: `TruncateFile` test for `truncate(size)`.
   - **TsFileReaderTest / TsFileReaderTreeTest**: Updated to use new 
`get_timeseries_metadata()` / `get_timeseries_metadata(device_ids)` and 
`DeviceTimeseriesMetadataMap`; assertions unchanged.
   
   All new behavior is guarded by the recovery path or new APIs; existing write 
and read behavior is unchanged.

