mrproliu opened a new pull request, #1138:
URL: https://github.com/apache/skywalking-banyandb/pull/1138

   This PR introduces a new `banyand-migration` CLI under 
`banyand/cmd/migration/` with three subcommands.
   
   ## `copy`
   
   Realigns measure parts from the source (a backup snapshot or live PVC 
mounts) into the target directory. Every row is routed to the grid-aligned 
target segment determined by each entry stage's `SegmentInterval`.
   
   Demo output:
   
   ```
   DONE in 38m54s
      target segments   : 31
      source parts      : 1248
      target mem-parts  : 1306 (pre-merge; banyandb's merge loop will compact)
      rows copied       : 588213170
      bytes written     : 47129058304
      fast-path parts   : 1197
      slow-path parts   : 51
      slow-path rows    : 1812440
   ```
   
   ## `verify`
   
   Read-only inspection that walks the source and target for every (entry, 
group) pair, reporting per-pair row counts, target segment grid alignment, and 
sidx doc counts.
   
   Demo output:
   
   ```
   == SUMMARY ==
     data coverage per (node, group):
       ┌────────┬──────────────────┬────────────────┬───────────────┐
       │ node   │ sw_metricsMinute │ sw_metricsHour │ sw_metricsDay │
       ├────────┼──────────────────┼────────────────┼───────────────┤
       │ hot-0  │ ✓                │ --             │ ✓             │
       │ hot-1  │ ✓                │ ✓              │ --            │
       │ cold-0 │ ✓                │ ✓              │ ✓             │
       └────────┴──────────────────┴────────────────┴───────────────┘
   
     target segments                : 31
     target segments misaligned     : 0
     source rows total              : 588213170
     target rows total              : 588212817
   ```
   
   ## `analyze`
   
   Row-level (seriesID, timestamp) duplicate analysis for one (entry, group), 
pinpointing exactly which rows account for the src/tgt diff that `verify` 
surfaces.
   
   Demo output:
   
   ```
   == analyze entry [4/5] stage=warm nodes=[warm-1] group=sw_metricsHour ==
     parts scanned                : 17
     total rows on disk           : 2980213
     distinct (sid, ts) (global)  : 2401206
     cross-part dup rows          : 579007  (NOT dropped by copy — slow path is per-part)
     WITHIN-part dup rows         : 110  (← MATCH this against verify src-tgt diff)
   
   == src-vs-target multiset diff ==
     src rows           : 2980213
     tgt rows           : 2980121
     missing rows total : 92  (← MATCHES verify src-tgt diff for this entry+group)
     missing rows (showing 5 / 74 keys):
       sid=3766501475917424574 ts=2026-05-06T15:00:00Z
         src    version=511194984324783  part=…/seg-20260506/shard-0/0000000000000013
         tgt    version=514535618046101  part=…/data-copy/.../seg-20260430/shard-0/0000000000000034
         MISSING version=511194984324783  part=…/seg-20260506/shard-0/0000000000000013
   ```
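
To make the within-part versus cross-part distinction concrete, here is a minimal sketch that classifies duplicate (seriesID, timestamp) keys across a set of parts. It is illustrative only: the types `rowKey` and `dupStats`, the function `countDups`, and the counting rule (a repeat inside one part is "within-part"; the first appearance of a key in a later part is "cross-part") are assumptions, and the tool's own accounting may differ.

```go
package main

import "fmt"

// rowKey identifies a row by series ID and timestamp (hypothetical type).
type rowKey struct {
	sid uint64
	ts  int64 // timestamp in nanoseconds
}

// dupStats holds the counters produced by countDups (hypothetical type).
type dupStats struct {
	total      int // every row seen
	distinct   int // distinct keys across all parts
	withinPart int // repeats of a key inside the same part
	crossPart  int // keys already seen in an earlier part
}

// countDups takes one slice of keys per part and classifies duplicates.
func countDups(parts [][]rowKey) dupStats {
	var st dupStats
	global := make(map[rowKey]struct{})
	for _, part := range parts {
		local := make(map[rowKey]struct{}, len(part))
		for _, k := range part {
			st.total++
			if _, seen := local[k]; seen {
				st.withinPart++
				continue
			}
			local[k] = struct{}{}
			if _, seen := global[k]; seen {
				st.crossPart++
				continue
			}
			global[k] = struct{}{}
		}
	}
	st.distinct = len(global)
	return st
}

func main() {
	a := rowKey{sid: 1, ts: 10}
	b := rowKey{sid: 2, ts: 10}
	c := rowKey{sid: 3, ts: 10}
	// Part 0 repeats a; part 1 repeats b from part 0.
	st := countDups([][]rowKey{{a, a, b}, {b, c}})
	fmt.Printf("%+v\n", st) // {total:5 distinct:3 withinPart:1 crossPart:1}
}
```

With this rule, `total` always equals `distinct + withinPart + crossPart`, which gives a quick sanity check on the counters.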
   
   ## Docs
   
   The full runbook lives in `banyand/cmd/migration/MIGRATION.md` (including 
the in-cluster workflow), with command reference and config schema in 
`banyand/cmd/migration/README.md`. Example YAML plans are under 
`banyand/cmd/migration/example/`.
   
   
   - [ ] If this pull request closes/resolves/fixes an existing issue, replace 
the issue number. Fixes apache/skywalking#<issue number>.
   - [x] Update the [`CHANGES` 
log](https://github.com/apache/skywalking-banyandb/blob/main/CHANGES.md).
   

