mrproliu opened a new pull request, #1151:
URL: https://github.com/apache/skywalking-banyandb/pull/1151

     ### Problem
   
     A segment's refCount used to mean two things at once: resources loaded
     ("open") and actively in use ("referenced") — an open segment always 
carried
     a baseline refCount >= 1, and the last DecRef to 0 closed it.
   
     Because the two were the same counter, a reopened cold segment (refCount 
== 1)
     looked identical to a baseline-only one. So a snapshot/inspect that 
reopened an
     idle-closed segment could be torn down mid-operation by the idle reclaimer 
or
     retention, leaving index == nil → cold-node nil-pointer panic and bluge
     exclusive-lock (bluge.pid) churn.
   
     ### Fix
   
     Make the two concepts orthogonal:
   
     - refCount now counts active users only;
     - index != nil is the sole "open" signal.
   
     This adds a dormant state:
   
   | State | index | refCount |
   |-------|-----|-----------|
   | CLOSED | `nil` | 0 |
   | DORMANT (open, unused) | `!=nil` | 0|
   | ACTIVE (open, in use) | `!=nil` | `>=1` |
   
     - New/loaded segments start dormant; DecRef to 0 leaves them dormant (no 
longer closes).
     - Teardown (closeIfIdle idle reclaim / performDelete retention) happens 
only at refCount == 0 under s.mu, and the 0 -> 1 reopen (acquire) takes the 
same lock — so an active reference always pins the
     segment.
     - acquire refuses to reopen a segment flagged for deletion 
(ErrSegmentClosed).
   
     An active reference can no longer be reclaimed mid-use (panic fixed), 
while idle
     segments still release their bluge writers.
   
     ### Notes
   
     - No struct fields added/removed — only refCount's meaning changed.
     - Caller contract unchanged (still one DecRef per incRef); no API / wire / 
on-disk format change.
     - Only intentional semantic change: the totalSegRefs gauge now reports the 
open-segment count instead of the sum of ref counts.
   
     ### Testing
   
     - Unit tests for every state/transition + an end-to-end integration test 
(create → dormant → query-reopen → snapshot → idle reclaim → restart → 
retention delete).
     - Concurrency tests under -race (incRef vs reclaim, inspect/snapshot vs 
delete), deterministic and validated under GOMAXPROCS=1.
     - Full storage suite green under -race; lint clean.
   
   - [ ] If this pull request closes/resolves/fixes an existing issue, replace 
the issue number. Fixes apache/skywalking#<issue number>.
   - [x] Update the [`CHANGES` 
log](https://github.com/apache/skywalking-banyandb/blob/main/CHANGES.md).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to