hongzhi-gao opened a new pull request, #731:
URL: https://github.com/apache/tsfile/pull/731
## Summary
This PR adds **TsFile recovery and continued writing** in C++: open an
incomplete or corrupted TsFile, truncate bad tail data, and continue appending
with the existing tree and table writers. The behavior is aligned with the Java
`RestorableTsFileIOWriter` and enables safe recovery after crashes or
interrupted writes.
## Motivation
- Support **recoverable TsFile writes**: detect incomplete/corrupted files,
truncate to the last valid offset, and resume writing without losing previously
committed data.
- Allow **tail metadata generation** after recovery: chunk group meta and
chunk statistics are rebuilt from the recovered file so that `close()` can write
a correct metadata index and file tail.
- Unify **reader metadata API**: introduce `DeviceTimeseriesMetadataMap` and
overloads of `get_timeseries_metadata(device_ids)` /
`get_timeseries_metadata()` returning a map; add `get_all_devices()` where
appropriate. Only existing devices are included when querying by device list.
## Implementation
### RestorableTsFileIOWriter (new)
- **Open**: `open(file_path, truncate_corrupted)` opens with
`O_RDWR|O_CREAT` (no `O_TRUNC`) and runs a **self-check**:
- Empty file → treat as crashed, allow writing from scratch.
- Valid header + tail magic → **complete file**; close write handle,
`can_write() = false`.
- Otherwise → **recovery path**: scan from header to find the last valid
truncation point, optionally truncate, then re-init the base writer and
**restore write_stream_** from file content so `cur_file_position()` is correct
when generating tail metadata later.
- **Recovery details**:
- Rebuild **chunk group meta** and **chunk statistics**: for multi-page
chunks, merge the page statistics; for single-page chunks, decompress and decode
the page to fill in the statistics; aligned value chunks use the time batch from
the time chunk in the same group.
- **write_stream_** is refilled by reading the current file content into
the stream; `flush_skip_leading_` is set so that `flush_stream_to_file()` skips
these leading bytes and only writes new data. No change to the normal write
path.
- Recovered `ChunkGroupMeta` entries are pushed via
`push_chunk_group_meta()` and marked with `chunk_group_meta_from_recovery_` so
`destroy()` does not free them (they live in the recovery arena).
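
The three-way self-check on open can be sketched as below. This is a simplified illustration, not the code in this PR: `SelfCheckResult`, `self_check`, the `"TsFile"` magic string, and the one-byte version field after it are assumptions made for the example, and the check runs on an in-memory buffer rather than a file handle.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical names for illustration only.
enum class SelfCheckResult { kEmpty, kComplete, kNeedsRecovery };

// Assumed layout: magic string followed by a 1-byte version at the head,
// and the same magic string again at the very end of a complete file.
constexpr char kMagic[] = "TsFile";
constexpr std::size_t kMagicLen = sizeof(kMagic) - 1;
constexpr std::size_t kHeaderLen = kMagicLen + 1;

inline SelfCheckResult self_check(const std::string& bytes) {
    // Empty file: nothing was ever flushed, so writing starts from scratch.
    if (bytes.empty()) return SelfCheckResult::kEmpty;

    const bool head_ok =
        bytes.size() >= kHeaderLen &&
        std::memcmp(bytes.data(), kMagic, kMagicLen) == 0;

    // Require room for a separate header and tail so that a header-only
    // file is never mistaken for a complete one.
    const bool tail_ok =
        bytes.size() >= kHeaderLen + kMagicLen &&
        std::memcmp(bytes.data() + bytes.size() - kMagicLen, kMagic,
                    kMagicLen) == 0;

    if (head_ok && tail_ok) return SelfCheckResult::kComplete;

    // Everything else goes to the recovery path (scan, optionally truncate).
    return SelfCheckResult::kNeedsRecovery;
}
```

A real check would presumably also validate the tail metadata before declaring a file complete; the sketch only shows how the three outcomes are separated.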
### TsFileIOWriter (base)
- **Recovery-only hooks** (defaults keep normal behavior unchanged):
- `flush_skip_leading_`: when > 0, `flush_stream_to_file()` skips that
many leading bytes in the stream (already on disk) and writes the rest.
- `chunk_group_meta_from_recovery_`: when true, `destroy()` skips freeing
chunk group meta (owned by recovery).
- Added `push_chunk_group_meta()`, `set_flush_skip_leading()` (protected)
and `friend RestorableTsFileIOWriter`.
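
As a rough illustration of the `flush_skip_leading_` idea, the sketch below appends only the part of an in-memory stream that is not yet on disk. The function name `flush_skipping_prefix` and the use of `std::string` for both the stream and the file are assumptions made for the example; the actual writer operates on its own stream and file types.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: `stream` stands in for write_stream_, `file` for the
// bytes already on disk. The first `skip_leading` bytes of `stream` mirror
// what the file already contains, so only the remainder is appended.
// Returns the number of bytes appended.
inline std::size_t flush_skipping_prefix(const std::string& stream,
                                         std::size_t skip_leading,
                                         std::string& file) {
    if (skip_leading >= stream.size()) return 0;  // nothing new to write
    file.append(stream, skip_leading, std::string::npos);
    return stream.size() - skip_leading;
}
```

With `skip_leading` equal to 0 this reduces to a plain append, which matches the statement that the default leaves the normal write path unchanged.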
### WriteFile
- Added `truncate(size)`, `seek_to_end()`, `get_position()`, and `get_fd()`
for recovery (truncate, append position, and reading file content to restore
`write_stream_`). `close()` is idempotent when already closed.
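
For readers unfamiliar with the underlying calls, here is a minimal POSIX sketch of what such helpers could look like. `FdFile` is a hypothetical stand-in, not the `WriteFile` class from this PR, and the error handling is reduced to boolean results.

```cpp
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical stand-in for a file wrapper with recovery helpers.
class FdFile {
public:
    // Open for read/write, creating the file if needed. O_TRUNC is
    // deliberately absent so existing content survives for recovery.
    bool open(const char* path) {
        fd_ = ::open(path, O_RDWR | O_CREAT, 0644);
        return fd_ >= 0;
    }

    // Cut the file down to `size` bytes (drops a corrupted tail).
    bool truncate(off_t size) {
        return fd_ >= 0 && ::ftruncate(fd_, size) == 0;
    }

    // Move the file offset to the end so later writes append.
    bool seek_to_end() {
        return fd_ >= 0 && ::lseek(fd_, 0, SEEK_END) != static_cast<off_t>(-1);
    }

    // Current file offset, or -1 if the file is not open.
    off_t get_position() const {
        return fd_ >= 0 ? ::lseek(fd_, 0, SEEK_CUR) : static_cast<off_t>(-1);
    }

    int get_fd() const { return fd_; }

    // Idempotent: a second call is a no-op.
    void close() {
        if (fd_ >= 0) {
            ::close(fd_);
            fd_ = -1;
        }
    }

private:
    int fd_ = -1;
};
```

Note that `ftruncate` does not move the file offset, which is why a separate `seek_to_end()` is needed before appending.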
### TsFileWriter / TsFileTableWriter / TsFileTreeWriter
- **TsFileWriter**: `init(RestorableTsFileIOWriter* rw)` initializes from a
recovered writer: takes schema from the file, does not own `io_writer_`
(`io_writer_owned_ = false`). Ensures `time_chunk_writer_` is created when
appending after recovery.
- **TsFileTableWriter** / **TsFileTreeWriter**: New constructors that take
`RestorableTsFileIOWriter*` for appending after recovery (schema/alignment from
restored file).
### Reader API refactor
- **DeviceTimeseriesMetadataMap** type alias: `IDeviceID` →
`vector<shared_ptr<ITimeseriesIndex>>`.
- **TsFileReader**: `get_timeseries_metadata(device_ids)` returns a map
containing only the requested devices that exist in the file;
`get_timeseries_metadata()` returns metadata for all devices;
`get_all_devices()` returns the same result as `get_all_device_ids()`.
- **TsFileTreeReader**: Same metadata API; `get_all_devices()` returns
`vector<shared_ptr<IDeviceID>>` (underlying reader).
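
The "only existing devices" behavior of the map-returning overload can be illustrated with stand-in types. Everything named `*Sketch` below is hypothetical: a `std::string` replaces `IDeviceID`, a small struct replaces `ITimeseriesIndex`, and `select_existing` is not a function from this PR.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Stand-in types for illustration only.
struct TimeseriesIndexSketch {
    std::string measurement;
};
using DeviceIdSketch = std::string;
using MetadataMapSketch =
    std::map<DeviceIdSketch,
             std::vector<std::shared_ptr<TimeseriesIndexSketch>>>;

// Build the result for a query by device list: devices that are not
// present in the file are skipped rather than mapped to empty vectors.
inline MetadataMapSketch select_existing(
    const MetadataMapSketch& all,
    const std::vector<DeviceIdSketch>& requested) {
    MetadataMapSketch out;
    for (const auto& id : requested) {
        auto it = all.find(id);
        if (it != all.end()) out.emplace(it->first, it->second);
    }
    return out;
}
```

A caller can then tell a missing device from one with no timeseries by checking whether its key is present in the returned map.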
## Testing
- **RestorableTsFileIOWriterTest**: open empty file, bad magic, complete
file, truncated file, header-only file; recovery then continue write with
**TsFileWriter**, **TsFileTreeWriter** (multi-device), **TsFileTableWriter**,
and aligned timeseries; read-back with tree/table reader and assert row counts
and device counts.
- **WriteFile**: `TruncateFile` test for `truncate(size)`.
- **TsFileReaderTest / TsFileReaderTreeTest**: Updated to use new
`get_timeseries_metadata()` / `get_timeseries_metadata(device_ids)` and
`DeviceTimeseriesMetadataMap`; assertions unchanged.
All new behavior is guarded by the recovery path or new APIs; existing write
and read behavior is unchanged.