hanahmily opened a new pull request, #1141:
URL: https://github.com/apache/skywalking-banyandb/pull/1141

   ## Summary
   
   Lands the vectorized (columnar) execution path for `measure` queries, covering
   both the standalone (data-node-local) and the distributed (liaison + N data
   nodes) plans, alongside an opt-in tracing system for diagnostics and bottleneck
   detection. Replaces the proto-per-row wire under flag-on with a columnar
   raw-frame v3 binary; the row path is preserved byte-for-byte as the rollback
   rail behind `--measure-vectorized-enabled=false`.
   
   76 commits, 472 files, +62k / −2.7k.
   
   ## What's shipped
   
   - **Storage bridge** — typed columnar pull interface returned by storage,
     with a single-block fast path, multi-block fall-through, and per-type
     batch + column recycling pools.
   - **Vec operator library** — 1024-row record batches with zero-copy uint16
     selection vectors, typed columns over Go generics, passthrough
     TagValue / FieldValue columns that avoid the decode/re-encode round
     trip on scan egress.
   - **Vec operators** — Scan, BatchAggregation (All / Map / Reduce modes),
     BatchGroupByFirst, BatchTop (two-stage extraction: cheap sort key per
     row, full copy only on heap displacement), fusible BatchLimit, and
     per-iterator FrameEmitter that encodes directly to the wire.
   - **Wire protocol — frame v3** — columnar binary with magic-byte
     fail-loud discriminator, packed fixed-width payloads, uvarint-prefixed
     variable-width payloads, and proto-bytes-per-cell for cross-group type
     divergence. Codec dispatch on response Go type and incoming leading
     byte — no new proto message; one `bytes` field added to the existing
     internal response.
   - **Standalone vec plan** — analyzer + dispatch path that owns hidden
     criteria tag projection, egress-strip, and the structural plan tree
     (Scan → [GroupByAgg] → [Top] → Limit).
   - **Distributed vec plan** — broadcast strategy with rewritten node /
     liaison query templates, Limit + Top push-down with a calibrated
     per-node Limit formula, single-group and multi-group fan-out, k-way
     heap merge with `(sid, ts, version)` dedup, mid-flight schema-divergence
     fallback, and reuse of the standalone Top / GroupBy / Limit operators
     on the merged stream.
   - **Opt-in tracing system** — span tree reusing the existing
     `common.v1.Trace` contract, fixed vocabulary of stable tag-key
     constants, CI ast-grep rules enforcing the vocabulary and `Stop` /
     `ToProto` ordering, fanout cap synthesis with percentile rollups,
     bottleneck microbenchmark gate, and a runbook with worked examples.
   - **Defaults & rollback** — vec is the production default; flag-off
     reverts every request to the legacy row plan. The row path is left
     untouched on this branch with `nolint:staticcheck` markers identifying
     the rollback-rail call sites.
   - **E2E + bench wiring** — OAP e2e workflow now a 2-D `(suite × engine)`
     matrix so every existing suite runs on both rails. A new distributed
     querybench exercises `scan_all` and `top_with_filter` at 1 k / 10 k /
     100 k cardinality and emits JSON / Markdown / static HTML reports.
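
To make the frame v3 bullet above concrete, here is a minimal Go sketch of a magic-byte discriminator followed by uvarint-prefixed variable-width cells. It is an illustration under stated assumptions, not the codec in this PR: the magic value `0xB3`, the layout (cell count, then length-prefixed cells), and the names `encodeColumn` / `decodeColumn` are all hypothetical. Only the `encoding/binary` uvarint helpers are real standard-library calls.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// frameMagic is a hypothetical discriminator byte; the real v3 value
// is not stated in this PR description.
const frameMagic byte = 0xB3

var errBadMagic = errors.New("frame: unexpected leading byte")

// encodeColumn writes the magic byte, a uvarint cell count, and then
// each cell as a uvarint length followed by its raw bytes.
func encodeColumn(cells [][]byte) []byte {
	buf := []byte{frameMagic}
	buf = binary.AppendUvarint(buf, uint64(len(cells)))
	for _, c := range cells {
		buf = binary.AppendUvarint(buf, uint64(len(c)))
		buf = append(buf, c...)
	}
	return buf
}

// decodeColumn fails loudly when the leading byte is not frameMagic,
// so a payload in another format is never misread as a frame.
func decodeColumn(buf []byte) ([][]byte, error) {
	if len(buf) == 0 || buf[0] != frameMagic {
		return nil, errBadMagic
	}
	buf = buf[1:]
	n, w := binary.Uvarint(buf)
	if w <= 0 {
		return nil, errors.New("frame: bad cell count")
	}
	buf = buf[w:]
	// Every cell needs at least one length byte, so a count larger than
	// the remaining input is corrupt; reject it before allocating.
	if n > uint64(len(buf)) {
		return nil, errors.New("frame: cell count exceeds payload")
	}
	cells := make([][]byte, 0, n)
	for i := uint64(0); i < n; i++ {
		l, w := binary.Uvarint(buf)
		if w <= 0 || l > uint64(len(buf)-w) {
			return nil, errors.New("frame: truncated cell")
		}
		cells = append(cells, buf[w:w+int(l)])
		buf = buf[w+int(l):]
	}
	return cells, nil
}

func main() {
	wire := encodeColumn([][]byte{[]byte("svc-a"), []byte(""), []byte("svc-b")})
	cells, err := decodeColumn(wire)
	fmt.Println(len(cells), err)

	// A payload with a different leading byte is rejected up front.
	_, err = decodeColumn([]byte{0x0a, 0x01})
	fmt.Println(err)
}
```

Checking the leading byte before any other parsing is what lets a receiver dispatch between two wire formats on one `bytes` field without guessing; the decoded cells alias the input buffer, so nothing is copied on the read side.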
   
   ## Performance
   
   Vec / Row ratio measured by the new distributed querybench (lower is better
   for every column except QPS, where higher is better):
   
   | Workload | Cardinality | p50 | p99 | QPS | CPU | Mallocs | Bytes |
   |---|---:|---:|---:|---:|---:|---:|---:|
   | scan_all | 1 k | 0.61× | 0.59× | 1.22× | 0.81× | 0.54× | 0.78× |
   | scan_all | 10 k | 0.53× | 0.56× | 1.23× | 0.80× | 0.54× | 0.80× |
   | scan_all | 100 k | 0.54× | 0.54× | 1.21× | 0.82× | 0.55× | 0.84× |
   | top_with_filter | 1 k | 0.54× | 0.71× | 1.91× | 0.50× | 0.21× | 0.38× |
   | top_with_filter | 10 k | 0.38× | 0.45× | 2.35× | 0.43× | 0.12× | 0.19× |
   | top_with_filter | 100 k | 0.36× | 0.43× | 2.64× | 0.38× | **0.11×** | 0.17× |
   
   Headline: **about 1.2× scan throughput with scan p50 latency cut by 39 to
   47%, 2.6× top throughput at 100 k, and up to 9× fewer allocations on
   top_with_filter.**
   
   ## Compatibility
   
   - Wire format under `Trace=false` is byte-identical to the pre-tracing
     flag-on stream.
   - Trace-enabled queries require every data node on the new release;
     un-upgraded nodes return their existing hard error, surfaced through
     the standard partial-failure machinery.
   - Flag-off behavior is unchanged — every fixture and integration test
     runs green on both rails.

