Felix-wave opened a new issue, #13860: URL: https://github.com/apache/skywalking/issues/13860
### Search before asking

- [x] I had searched in the [issues](https://github.com/apache/skywalking/issues?q=is%3Aissue) and found no similar issues. (No matches for the panic message in either `apache/skywalking` or `apache/skywalking-banyandb`.)

### Apache SkyWalking Component

BanyanDB

### What happened

`banyand/trace/block_writer.go` panics when blocks for the same `traceID` arrive with `tm.min < minTimestampLast`. On a real SkyWalking OAP 10.4.0 production workload this fires roughly once per minute, with each panic discarding one trace-write batch.

The offending check (`apache/skywalking-banyandb` master branch):

https://github.com/apache/skywalking-banyandb/blob/master/banyand/trace/block_writer.go#L259-L262

```go
if isSeenTid && tm.min < bw.minTimestampLast {
	logger.Panicf("the block for tid=%s cannot contain timestamp smaller than %d, but it contains timestamp %d", tid, bw.minTimestampLast, tm.min)
}
```

The same pattern exists at line 320 of the same file and in `banyand/{stream,measure}/block_writer.go`.

#### Sample panics from a production cluster (banyandb 0.10.1, OAP 10.4.0)

```
the block for tid=6359c73a002c425785500f958cdc4007.661.17779941290647127 cannot contain timestamp smaller than 1777994129833000000, but it contains timestamp 1777994129064000000
the block for tid=6359c73a002c425785500f958cdc4007.661.17779941958867471 cannot contain timestamp smaller than 1777994196635000000, but it contains timestamp 1777994196633000000
the block for tid=6359c73a002c425785500f958cdc4007.661.17779944351678775 cannot contain timestamp smaller than 1777994435391000000, but it contains timestamp 1777994435191000000
```

The skew between `minTimestampLast` and the incoming `tm.min` ranges from ~2ms to ~770ms (2ms, 200ms, and 769ms in the three samples above). All three samples share the `tid` prefix `6359c73a002c425785500f958cdc4007.661` and differ only in the final component, so they appear to be three distinct traces started by the same agent instance and thread rather than a single trace.

#### Why this fires on normal traffic

In SkyWalking, a single trace is composed of segments produced by multiple Java agents on different services.
Wall clocks across services are not strictly monotonic relative to each other (NTP drift, container clocks, sub-millisecond inter-service hops). When OAP forwards segments belonging to one traceID to BanyanDB, segments can arrive in batches whose `min(timestamp)` is slightly earlier than a previously-flushed block for the same traceID. The `block_writer` treats this as a programming invariant violation and panics; in practice it is normal upstream input.

#### Impact

- Every panic is recovered by the gRPC stream interceptor, so the server keeps running, but the **single in-flight trace-write batch is dropped**.
- On 0.9.0 the same panic also fires; in our environment it left the gRPC server in a degraded state and pods restarted ~406 times over 7 days. On 0.10.1 the recovery is clean (pod stays up, 0 restarts), but trace data loss continues at ~1 batch/minute.
- Net effect: under any non-trivial trace volume, BanyanDB silently sheds a small fraction of traces and produces a continuous stream of stack traces.

### What you expected to happen

BanyanDB should accept slightly out-of-order timestamps within the same traceID without panicking and without dropping the batch.

Suggested directions (the maintainers likely know better which is appropriate):

1. **Sort incoming blocks by `tm.min` per traceID before writing**, instead of asserting input order.
2. **Demote to a warning + drop only the offending block** rather than `Panicf`, so the rest of the batch is durable.
3. If strict ordering is required for an internal index, **track per-traceID `minTimestampLast` and tolerate small backward skew** within a configurable window.

### How to reproduce

Steady-state SkyWalking deployment with multiple Java agents reporting traces to OAP, OAP backed by BanyanDB.
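
The triggering condition can also be modelled outside BanyanDB. The following is a minimal sketch, not the actual `block_writer` code: the `block` type, `firstViolation`, and `sortPerTrace` are hypothetical names introduced for illustration, and it assumes the check compares each block's minimum timestamp against the previous block with the same `tid`. It replays the two timestamps from the first sample panic in arrival order, then shows that sorting per trace ID (the first suggested direction) would satisfy the check for blocks that arrive in the same batch.

```go
package main

import (
	"fmt"
	"sort"
)

// block is a hypothetical stand-in for a trace block; only the trace ID and
// the block's minimum timestamp (in nanoseconds) matter for the check.
type block struct {
	tid string
	min int64
}

// firstViolation models the ordering check (assumed semantics): it returns the
// index of the first block whose min timestamp is smaller than the min of the
// preceding block with the same tid, or -1 if no block violates the order.
func firstViolation(blocks []block) int {
	var lastTid string
	var lastMin int64
	for i, b := range blocks {
		if i > 0 && b.tid == lastTid && b.min < lastMin {
			return i
		}
		lastTid, lastMin = b.tid, b.min
	}
	return -1
}

// sortPerTrace orders blocks by tid, then by min timestamp within each tid.
func sortPerTrace(blocks []block) {
	sort.SliceStable(blocks, func(i, j int) bool {
		if blocks[i].tid != blocks[j].tid {
			return blocks[i].tid < blocks[j].tid
		}
		return blocks[i].min < blocks[j].min
	})
}

func main() {
	const tid = "6359c73a002c425785500f958cdc4007.661.17779941290647127"
	// Arrival order taken from the first sample panic: the later timestamp
	// is seen first, the earlier one second.
	blocks := []block{
		{tid: tid, min: 1777994129833000000},
		{tid: tid, min: 1777994129064000000},
	}

	fmt.Println("violation index, arrival order:", firstViolation(blocks))

	sortPerTrace(blocks)
	fmt.Println("violation index, after sorting:", firstViolation(blocks))
}
```

Sorting only helps when both blocks are present in the same batch. If the earlier-timestamped block arrives after the later one has already been flushed, a tolerance window or a non-panicking path (directions 2 and 3) would still be needed.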
We see this on:

- BanyanDB: `apache/skywalking-banyandb:0.10.1` (also reproduced on 0.9.0)
- SkyWalking OAP: `apache/skywalking-oap-server:10.4.0`
- ~30+ services, `apache-skywalking-java-agent` 9.5.0, JDK 21
- Standalone BanyanDB on Kubernetes (Aliyun ACK), `--trace-root-path=/data/trace`
- No special configuration; typical SkyWalking trace volume (GiBs/day)

Within ~3 minutes of OAP starting, the first panic appears in BanyanDB logs; cadence stabilizes at roughly one panic per minute.

### Anything else

Relevant code references in `apache/skywalking-banyandb`:

- `banyand/trace/block_writer.go:261`: first `Panicf`
- `banyand/trace/block_writer.go:322`: second `Panicf`
- Same pattern in `banyand/stream/block_writer.go` and `banyand/measure/block_writer.go`

Happy to provide more samples (full stack traces, debug logs, longer time series) or test a candidate fix on our cluster.

### Are you willing to submit a pull request to fix on your own

- [ ] Yes, I am willing to submit a pull request on my own!

### Code of Conduct

- [x] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
