[PR] Refactor migration tool and support stream catalog data [skywalking-banyandb]

via GitHub Sat, 13 Jun 2026 00:11:10 -0700


mrproliu opened a new pull request, #1174:
URL: https://github.com/apache/skywalking-banyandb/pull/1174


   ## What
   
   The `migration` tool now supports **stream** data in addition to **measure**,
   across all three subcommands: `copy`, `verify`, and `analyze`. A single plan
   may mix measure and stream groups; each group is routed to its catalog
   automatically.
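
To illustrate the routing described above, here is a minimal sketch in Go. It is not taken from the PR: the `Catalog` type, its two constants, and `RouteGroups` are hypothetical names, and the sketch assumes the catalog of each group has already been detected and is passed in as a plain map.

```go
package main

import (
	"fmt"
	"sort"
)

// Catalog is a hypothetical label for the kind of data a group holds.
// The real tool detects this from the schema; here it is given as input.
type Catalog string

const (
	CatalogMeasure Catalog = "measure"
	CatalogStream  Catalog = "stream"
)

// RouteGroups buckets group names by catalog so that each bucket can be
// handed to the matching executor. Unknown catalogs are rejected up front.
func RouteGroups(groups map[string]Catalog) (map[Catalog][]string, error) {
	routed := make(map[Catalog][]string)
	for name, c := range groups {
		switch c {
		case CatalogMeasure, CatalogStream:
			routed[c] = append(routed[c], name)
		default:
			return nil, fmt.Errorf("group %q: unsupported catalog %q", name, c)
		}
	}
	// Sort each bucket so the processing order is deterministic.
	for c := range routed {
		sort.Strings(routed[c])
	}
	return routed, nil
}

func main() {
	// A mixed plan: group names here are made up for the example.
	plan := map[string]Catalog{
		"metrics-minute": CatalogMeasure,
		"records-log":    CatalogStream,
		"metrics-hour":   CatalogMeasure,
	}
	routed, err := RouteGroups(plan)
	if err != nil {
		fmt.Println("plan rejected:", err)
		return
	}
	fmt.Println("measure groups:", routed[CatalogMeasure])
	fmt.Println("stream groups:", routed[CatalogStream])
}
```

Rejecting an unknown catalog before any copy starts is a design choice of this sketch; the actual tool may handle that case differently.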
   
   ## Why
   
   Previously the tool could only migrate measure groups. Operators migrating a
   cluster's data (across hot/warm/cold tiers, or into a new layout) had no
   offline path for stream groups.
   
   ## How
   
   - The orchestration core (plan loading, catalog detection via the offline
     schema reader, per-group union secondary index build, and the
     per-(entry, group) copy/verify loop) is shared behind a `CatalogExecutor`
     interface; `measure` and `stream` each provide their own executor.
   - **Stream copy** keeps exact row parity (streams never deduplicate), rebuilds
     the per-shard element index, and broadcasts the group's union secondary index
     into every aligned target segment. Parts already aligned to the target
     stage's `SegmentInterval` are byte-copied; parts crossing a grid boundary are
     re-bucketed row by row.
   - **verify** reports per-(node, group) source-vs-target row coverage and flags
     any misaligned target segment.
   
   ## Tests
   
   - Added measure and stream end-to-end copy/verify tests (fast and slow paths,
     element-index rebuild, union-sidx broadcast, query-back validation).
   - `make test` and `make lint` pass.
   
   - [ ] If this pull request closes/resolves/fixes an existing issue, replace the issue number. Fixes apache/skywalking#<issue number>.
   - [x] Update the [`CHANGES` log](https://github.com/apache/skywalking-banyandb/blob/main/CHANGES.md).
   


