mrproliu opened a new pull request, #1174:
URL: https://github.com/apache/skywalking-banyandb/pull/1174
## What
The `migration` tool now supports **stream** data in addition to **measure**,
across all three subcommands: `copy`, `verify`, and `analyze`. A single plan
may mix measure and stream groups; each group is routed to its catalog
automatically.
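
The routing described above could look roughly like the following. This is a minimal sketch, not code from this pull request: `PlanGroup`, `RunCopy`, `recordingExecutor`, and the method set shown on `CatalogExecutor` are hypothetical, and in the real tool the catalog is detected through the offline schema reader rather than stored in the plan entry.

```go
package main

import "fmt"

// Catalog names the kind of data a group holds (hypothetical type).
type Catalog string

const (
	CatalogMeasure Catalog = "measure"
	CatalogStream  Catalog = "stream"
)

// PlanGroup is a hypothetical plan entry: a group name plus its catalog.
// Assumption: the catalog has already been detected before dispatch.
type PlanGroup struct {
	Name    string
	Catalog Catalog
}

// CatalogExecutor is an assumed shape for the shared interface;
// the actual method set in the PR may differ.
type CatalogExecutor interface {
	Copy(group string) error
	Verify(group string) error
}

// recordingExecutor is a stand-in that only records the groups it was asked to copy.
type recordingExecutor struct {
	copied []string
}

func (e *recordingExecutor) Copy(group string) error {
	e.copied = append(e.copied, group)
	return nil
}

func (e *recordingExecutor) Verify(group string) error { return nil }

// RunCopy sends each group in the plan to the executor registered for its catalog,
// so one plan can mix measure and stream groups.
func RunCopy(plan []PlanGroup, executors map[Catalog]CatalogExecutor) error {
	for _, g := range plan {
		exec, ok := executors[g.Catalog]
		if !ok {
			return fmt.Errorf("group %q: no executor for catalog %q", g.Name, g.Catalog)
		}
		if err := exec.Copy(g.Name); err != nil {
			return fmt.Errorf("group %q: copy failed: %w", g.Name, err)
		}
	}
	return nil
}
```

Keying the executors by catalog keeps the orchestration loop unaware of measure or stream specifics; a group with an unregistered catalog fails fast instead of being silently skipped.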
## Why
Previously the tool could only migrate measure groups. Operators migrating a
cluster's data (across hot/warm/cold tiers, or into a new layout) had no
offline path for stream groups.
## How
- The orchestration core (plan loading, catalog detection via the offline
schema reader, per-group union secondary index build, and the
per-(entry, group) copy/verify loop) is shared behind a `CatalogExecutor`
interface; `measure` and `stream` each provide their own executor.
- **Stream copy** keeps exact row parity (streams never deduplicate), rebuilds
the per-shard element index, and broadcasts the group's union secondary index
into every aligned target segment. Parts already aligned to the target
stage's `SegmentInterval` are byte-copied; parts crossing a grid boundary are
re-bucketed row by row.
- **verify** reports per-(node, group) source-vs-target row coverage and flags
any misaligned target segment.
## Tests
- Added measure and stream end-to-end copy/verify tests (fast and slow paths,
element-index rebuild, union-sidx broadcast, query-back validation).
- `make test` and `make lint` pass.
- [ ] If this pull request closes/resolves/fixes an existing issue, replace
the issue number. Fixes apache/skywalking#<issue number>.
- [x] Update the [`CHANGES` log](https://github.com/apache/skywalking-banyandb/blob/main/CHANGES.md).