hanahmily opened a new pull request, #1141:
URL: https://github.com/apache/skywalking-banyandb/pull/1141
## Summary
Lands the vectorized (columnar) execution path for `measure` queries, covering
both the standalone (data-node-local) and the distributed (liaison + N data
nodes) plans, alongside an opt-in tracing system for diagnostics and bottleneck
detection. With the flag on, the proto-per-row wire format is replaced by a
columnar raw-frame v3 binary format; the row path is preserved byte-for-byte as
the rollback rail behind `--measure-vectorized-enabled=false`.
76 commits, 472 files, +62k / −2.7k.
## What's shipped
- **Storage bridge** — typed columnar pull interface returned by storage,
with a single-block fast path, multi-block fall-through, and per-type
batch + column recycling pools.
- **Vec operator library** — 1024-row record batches with zero-copy uint16
selection vectors, typed columns over Go generics, passthrough
TagValue / FieldValue columns that avoid the decode/re-encode round
trip on scan egress.
- **Vec operators** — Scan, BatchAggregation (All / Map / Reduce modes),
BatchGroupByFirst, BatchTop (two-stage extraction: cheap sort key per
row, full copy only on heap displacement), fusible BatchLimit, and
per-iterator FrameEmitter that encodes directly to the wire.
- **Wire protocol — frame v3** — columnar binary with magic-byte
fail-loud discriminator, packed fixed-width payloads, uvarint-prefixed
variable-width payloads, and proto-bytes-per-cell for cross-group type
divergence. Codec dispatch on response Go type and incoming leading
byte — no new proto message; one `bytes` field added to the existing
internal response.
- **Standalone vec plan** — analyzer + dispatch path that owns hidden
criteria tag projection, egress-strip, and the structural plan tree
(Scan → [GroupByAgg] → [Top] → Limit).
- **Distributed vec plan** — broadcast strategy with rewritten node /
liaison query templates, Limit + Top push-down with a calibrated
per-node Limit formula, single-group and multi-group fan-out, k-way
heap merge with `(sid, ts, version)` dedup, mid-flight schema-divergence
fallback, and reuse of the standalone Top / GroupBy / Limit operators
on the merged stream.
- **Opt-in tracing system** — span tree reusing the existing
`common.v1.Trace` contract, fixed vocabulary of stable tag-key
constants, CI ast-grep rules enforcing the vocabulary and `Stop` /
`ToProto` ordering, fanout cap synthesis with percentile rollups,
bottleneck microbenchmark gate, and a runbook with worked examples.
- **Defaults & rollback** — vec is the production default; flag-off
reverts every request to the legacy row plan. The row path is left
untouched on this branch with `nolint:staticcheck` markers identifying
the rollback-rail call sites.
- **E2E + bench wiring** — OAP e2e workflow now a 2-D `(suite × engine)`
matrix so every existing suite runs on both rails. A new distributed
querybench exercises `scan_all` and `top_with_filter` at 1 k / 10 k /
100 k cardinality and emits JSON / Markdown / static HTML reports.
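
To make the distributed merge step concrete, here is a minimal sketch of a k-way heap merge with `(sid, ts, version)` dedup. It is not the code in this PR: the `row`, `cursor`, and `mergeDedup` names are hypothetical, and it assumes each node returns rows sorted by `(sid, ts)` ascending and that the highest `version` wins for a duplicate `(sid, ts)`.

```go
package main

import (
	"container/heap"
	"fmt"
)

// row is a hypothetical merged-stream element; field names are illustrative.
type row struct {
	sid     uint64
	ts      int64
	version int64
	value   int64
}

// cursor walks one node's already-sorted result stream.
type cursor struct {
	rows []row
	pos  int
}

func (c *cursor) head() row { return c.rows[c.pos] }

// mergeHeap orders cursors by (sid, ts) ascending, then version descending,
// so the newest version of a duplicate key surfaces first.
type mergeHeap []*cursor

func (h mergeHeap) Len() int { return len(h) }
func (h mergeHeap) Less(i, j int) bool {
	a, b := h[i].head(), h[j].head()
	if a.sid != b.sid {
		return a.sid < b.sid
	}
	if a.ts != b.ts {
		return a.ts < b.ts
	}
	return a.version > b.version
}
func (h mergeHeap) Swap(i, j int) { h[i], h[j] = h[j], h[i] }
func (h *mergeHeap) Push(x any)   { *h = append(*h, x.(*cursor)) }
func (h *mergeHeap) Pop() any {
	old := *h
	n := len(old)
	c := old[n-1]
	*h = old[:n-1]
	return c
}

// mergeDedup merges k sorted streams and keeps one row per (sid, ts):
// the one with the highest version (an assumption of this sketch).
func mergeDedup(streams [][]row) []row {
	h := &mergeHeap{}
	for _, s := range streams {
		if len(s) > 0 {
			*h = append(*h, &cursor{rows: s})
		}
	}
	heap.Init(h)
	var out []row
	for h.Len() > 0 {
		c := (*h)[0]
		r := c.head()
		// Emit only the first row seen per (sid, ts); later ones are older versions.
		if n := len(out); n == 0 || out[n-1].sid != r.sid || out[n-1].ts != r.ts {
			out = append(out, r)
		}
		c.pos++
		if c.pos < len(c.rows) {
			heap.Fix(h, 0) // the cursor advanced, restore heap order
		} else {
			heap.Pop(h) // the cursor is exhausted, drop it
		}
	}
	return out
}

func main() {
	nodeA := []row{{1, 100, 1, 10}, {1, 200, 1, 20}}
	nodeB := []row{{1, 100, 2, 11}, {2, 100, 1, 30}}
	for _, r := range mergeDedup([][]row{nodeA, nodeB}) {
		fmt.Println(r.sid, r.ts, r.version, r.value)
	}
}
```

In this sketch the two rows for `(sid=1, ts=100)` collapse into the `version=2` row from `nodeB`; the merged output can then feed any downstream Top / GroupBy / Limit stage.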
## Performance
Vec / Row ratio measured by the new distributed querybench (lower is better for every column except QPS, where higher is better):
| Workload | Cardinality | p50 | p99 | QPS | CPU | Mallocs | Bytes |
|---|---:|---:|---:|---:|---:|---:|---:|
| scan_all | 1 k | 0.61× | 0.59× | 1.22× | 0.81× | 0.54× | 0.78× |
| scan_all | 10 k | 0.53× | 0.56× | 1.23× | 0.80× | 0.54× | 0.80× |
| scan_all | 100 k | 0.54× | 0.54× | 1.21× | 0.82× | 0.55× | 0.84× |
| top_with_filter | 1 k | 0.54× | 0.71× | 1.91× | 0.50× | 0.21× | 0.38× |
| top_with_filter | 10 k | 0.38× | 0.45× | 2.35× | 0.43× | 0.12× | 0.19× |
| top_with_filter | 100 k | 0.36× | 0.43× | 2.64× | 0.38× | **0.11×** | 0.17× |
Headline: **about 1.2× scan throughput with scan p50 latency at 0.53× to 0.61×
of the row path, 2.6× top throughput at 100 k, and up to 9× fewer allocations
on top_with_filter.**
## Compatibility
- Wire format under `Trace=false` is byte-identical to the pre-tracing
flag-on stream.
- Trace-enabled queries require every data node on the new release;
un-upgraded nodes return their existing hard error, surfaced through
the standard partial-failure machinery.
- Flag-off behavior is unchanged — every fixture and integration test
runs green on both rails.
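
For readers reasoning about wire compatibility, the following is a rough sketch of how a magic-byte, fail-loud discriminator with uvarint-prefixed variable-width cells might look. It is illustrative only: the magic value `0xBD`, the header layout, and the `encodeStrings` / `decodeStrings` names are assumptions of this sketch, not the frame v3 format defined in this PR.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// frameMagic and frameVersion are hypothetical values chosen for this sketch.
const (
	frameMagic   byte = 0xBD
	frameVersion byte = 3
)

var errNotFrame = errors.New("payload does not start with the frame magic byte")

// encodeStrings writes magic, version, a uvarint cell count, then each cell
// as a uvarint length followed by its raw bytes.
func encodeStrings(cells []string) []byte {
	buf := []byte{frameMagic, frameVersion}
	buf = binary.AppendUvarint(buf, uint64(len(cells)))
	for _, c := range cells {
		buf = binary.AppendUvarint(buf, uint64(len(c)))
		buf = append(buf, c...)
	}
	return buf
}

// decodeStrings rejects any payload without the expected header instead of
// guessing at its format.
func decodeStrings(buf []byte) ([]string, error) {
	if len(buf) < 2 || buf[0] != frameMagic {
		return nil, errNotFrame
	}
	if buf[1] != frameVersion {
		return nil, fmt.Errorf("unsupported frame version %d", buf[1])
	}
	buf = buf[2:]
	n, w := binary.Uvarint(buf)
	if w <= 0 {
		return nil, errors.New("truncated cell count")
	}
	buf = buf[w:]
	var out []string
	for i := uint64(0); i < n; i++ {
		l, w := binary.Uvarint(buf)
		if w <= 0 || uint64(len(buf)-w) < l {
			return nil, errors.New("truncated cell")
		}
		out = append(out, string(buf[w:w+int(l)]))
		buf = buf[w+int(l):]
	}
	return out, nil
}

func main() {
	frame := encodeStrings([]string{"svc-a", "svc-b"})
	cells, err := decodeStrings(frame)
	fmt.Println(cells, err)

	// A payload with a different leading byte is rejected outright.
	_, err = decodeStrings([]byte{0x0a, 0x03})
	fmt.Println(err)
}
```

The point of checking the leading byte first is that a receiver never silently misinterprets a payload in the other format; it either decodes a frame or returns an explicit error.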