yannan-wyn opened a new pull request, #3270:
URL: https://github.com/apache/brpc/pull/3270
### What problem does this PR solve?
Problem Summary:
The original priority queue implementation misuses `WorkStealingQueue`:
multiple producers call `push()` concurrently, but `push()` is designed for
single-owner use only. This can cause data races under contention.
See also: #2819, #3055, #3078, #3096
### What is changed and the side effects?
Changed:
- Replace the single `WorkStealingQueue` per tag with a sharded design
  (`PriorityShard`), each shard containing:
  - `butil::MPSCQueue` inbound (lock-free) for external producers
  - `WorkStealingQueue` for owner flush/pop and non-owner steal
- Add owner lifecycle management (bind/unbind/draining) tied to `TaskGroup`
  creation/destruction
- Add owner-preferred shard selection (thread-local round-robin, skipping
  ownerless/draining shards)
- Add fallback path: clear `BTHREAD_GLOBAL_PRIORITY` and re-enqueue to the
  normal `_remote_rq` during shard teardown
- Add `_priority_shard_index` field to `TaskGroup` for O(1) owner shard
  lookup
- Add unit tests (correctness, concurrent producers, owner dynamic
  changes, stress) and microbenchmarks
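
For readers unfamiliar with the shard layout, the data flow listed above can be pictured roughly as follows. This is a minimal standalone sketch, not the code in this PR: `ShardSketch`, `TaskId`, `Submit`, `Flush`, `Pop` and `Steal` are hypothetical names, a CAS-based intrusive list stands in for `butil::MPSCQueue`, and a mutex-guarded `std::deque` stands in for `WorkStealingQueue`.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>

// Hypothetical task handle for this sketch; not brpc's bthread id type.
using TaskId = std::uint64_t;

// Illustrative shard: a lock-free multi-producer inbound list feeding an
// owner-side run queue. The run queue here is a mutex-guarded deque, used
// only as a stand-in for a real work-stealing deque.
class ShardSketch {
public:
    ShardSketch() : inbound_(nullptr) {}
    ShardSketch(const ShardSketch&) = delete;
    ShardSketch& operator=(const ShardSketch&) = delete;
    ~ShardSketch() { FreeList(inbound_.exchange(nullptr)); }

    // Any thread: link a new node at the head of the inbound list (CAS loop).
    void Submit(TaskId id) {
        Node* n = new Node{id, inbound_.load(std::memory_order_relaxed)};
        while (!inbound_.compare_exchange_weak(n->next, n,
                                               std::memory_order_release,
                                               std::memory_order_relaxed)) {
            // On failure n->next was refreshed with the current head; retry.
        }
    }

    // Owner only: detach the whole inbound list, restore submission order,
    // and append the tasks to the run queue. Returns how many tasks moved.
    std::size_t Flush() {
        Node* n = inbound_.exchange(nullptr, std::memory_order_acquire);
        Node* fifo = nullptr;
        while (n != nullptr) {  // the detached list is newest-first; reverse it
            Node* next = n->next;
            n->next = fifo;
            fifo = n;
            n = next;
        }
        std::size_t moved = 0;
        {
            std::lock_guard<std::mutex> guard(mu_);
            for (Node* p = fifo; p != nullptr; p = p->next) {
                run_.push_back(p->id);
                ++moved;
            }
        }
        FreeList(fifo);
        return moved;
    }

    // Owner side: take the most recently flushed task.
    bool Pop(TaskId* out) {
        assert(out != nullptr);
        std::lock_guard<std::mutex> guard(mu_);
        if (run_.empty()) return false;
        *out = run_.back();
        run_.pop_back();
        return true;
    }

    // Non-owner side: take the oldest task from the opposite end.
    bool Steal(TaskId* out) {
        assert(out != nullptr);
        std::lock_guard<std::mutex> guard(mu_);
        if (run_.empty()) return false;
        *out = run_.front();
        run_.pop_front();
        return true;
    }

private:
    struct Node {
        TaskId id;
        Node* next;
    };

    static void FreeList(Node* n) {
        while (n != nullptr) {
            Node* next = n->next;
            delete n;
            n = next;
        }
    }

    std::atomic<Node*> inbound_;  // written by producers, drained by the owner
    std::mutex mu_;               // guards run_ in this sketch only
    std::deque<TaskId> run_;
};
```

The point of the split is that external producers only ever touch the inbound side, so the single-owner `push()` contract of the run queue is never violated; only the owner moves tasks across in `Flush()`.
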
Side effects:
- Performance effects:
  - Owner hot path: ~32ns (same as baseline WSQ, zero overhead)
  - Multi-producer inbound (8 threads): ~260ns vs baseline `_remote_rq`
    ~1050ns (3-4x faster)
  - Full pipeline (8 producers): ~155ns vs baseline ~1050ns (5-10x faster)
- Breaking backward compatibility:
  - None. Gated by `FLAGS_enable_bthread_priority_queue` (default false).
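
To make the gating and fallback behaviour concrete, here is a rough sketch of how routing could look from the producer side. It is an illustration under assumptions, not the PR's implementation: `ShardState`, `Route` and `PickShard` are hypothetical names, the flag is passed in as a plain `bool` rather than read from gflags, and "owner-preferred" is simplified to "only shards that currently have a bound owner are eligible".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical lifecycle states. The PR mentions bind/unbind/draining; this
// enum and its value names are assumptions made for the sketch.
enum class ShardState { kOwnerless, kOwned, kDraining };

// Where a submitted task is routed in this sketch.
struct Route {
    bool use_priority_shard;  // false: use the normal remote queue instead
    std::size_t shard;        // meaningful only when use_priority_shard is true
};

// Round-robin over the shards starting at *cursor, skipping ownerless and
// draining shards. Falls back when the feature is disabled or no shard is
// eligible. In real use *cursor would be a thread-local variable.
inline Route PickShard(bool priority_enabled,
                       const std::vector<ShardState>& shards,
                       std::size_t* cursor) {
    assert(cursor != nullptr);
    if (!priority_enabled || shards.empty()) {
        return Route{false, 0};
    }
    for (std::size_t i = 0; i < shards.size(); ++i) {
        const std::size_t idx = (*cursor + i) % shards.size();
        if (shards[idx] == ShardState::kOwned) {
            *cursor = (idx + 1) % shards.size();  // next call starts after idx
            return Route{true, idx};
        }
    }
    return Route{false, 0};  // nothing eligible: caller takes the normal path
}

// Convenience wrapper that keeps the round-robin cursor per thread.
inline Route PickShardForThisThread(bool priority_enabled,
                                    const std::vector<ShardState>& shards) {
    thread_local std::size_t cursor = 0;
    return PickShard(priority_enabled, shards, &cursor);
}
```

With the flag off, `PickShard` never selects a shard, which is the sense in which the default behaviour is unchanged. When every shard is ownerless or draining, the caller is expected to take the normal queue path, loosely mirroring the teardown fallback described above.
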
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]