hanahmily commented on code in PR #1124:
URL:
https://github.com/apache/skywalking-banyandb/pull/1124#discussion_r3213031043
##########
pkg/schema/cache.go:
##########
@@ -108,7 +109,29 @@ func (sr *schemaRepo) SendMetadataEvent(event
MetadataEvent) {
return
}
defer sr.closer.SenderDone()
+ // SendMetadataEvent runs synchronously on the caller goroutine (e.g. an
+ // etcd-watch handler such as (*schemaRepo).OnAddOrUpdate). A panic from
+ // processEvent — string-typed today, error-typed once Step 1b lands —
+ // would otherwise propagate to a caller that has no recover() of its
own.
+ // The classifier branch escalates permanent errors to .Fatal() so
+ // runtime registration of an incompatible group exits the process the
+ // same way the boot path does; non-permanent panics are logged and
+ // swallowed (no regression vs status quo).
+ defer func() {
+ if recovered := recover(); recovered != nil {
+ if recErr, isErr := recovered.(error); isErr &&
initerror.IsPermanent(recErr) {
+ sr.l.Fatal().Err(recErr).Interface("event",
event).Str("stack", string(debug.Stack())).
+ Msg("SendMetadataEvent hit a permanent
error, refusing to continue")
+ }
+ sr.l.Warn().Interface("err",
recovered).Interface("event", event).Str("stack", string(debug.Stack())).
+ Msg("recovered from SendMetadataEvent panic")
Review Comment:
Addressed in 35a5f8f7 — `SendMetadataEvent`'s defer-recover now requeues
non-permanent panics onto `eventCh`, mirroring the existing transient
error-return path (bumps `totalErrs`, pushes the event with a
`closer.CloseNotify` escape). Permanent panics still escalate via `.Fatal()`.
Comment updated to reflect the new behavior.
##########
banyand/internal/storage/tsdb.go:
##########
@@ -239,15 +240,31 @@ func OpenTSDB[T TSTable, O any](ctx context.Context, opts
TSDBOpts[T, O], cache
lockPath := filepath.Join(opts.Location, lockFilename)
lock, err := tsdbLfs.CreateLockFile(lockPath, FilePerm)
if err != nil {
- logger.Panicf("cannot create lock file %s: %s", lockPath, err)
+ panic(fmt.Errorf("cannot create lock file %s: %w", lockPath,
err))
}
db.lock = lock
+ // Release the lock fd if any subsequent step in OpenTSDB returns an
error.
+ // Otherwise the file descriptor leaks until process exit and a retry
on the
+ // same data dir hits "resource temporarily unavailable" from the next
+ // CreateLockFile call. Ownership transfers to (*database).Close() only
on
+ // success — released = true is set on the success return below.
+ released := false
+ defer func() {
+ if !released {
+ _ = lock.Close()
+ _ = tsdbLfs.DeleteFile(lockPath)
+ }
+ }()
if err := db.segmentController.open(); err != nil {
return nil, err
}
obsservice.MetricsCollector.Register(location, db.collect)
db.disableRotation = opts.DisableRotation
- return db, db.startRotationTask()
+ if err := db.startRotationTask(); err != nil {
+ return nil, err
+ }
+ released = true
+ return db, nil
Review Comment:
Addressed in 9527293a — restored the original "return db on rotation-task
error" contract so the caller can `db.Close()` to unregister the metrics
collector and clean up the segment controller. `released = true` is set before
`startRotationTask()` runs because once the db is handed to the caller, the
lock-fd cleanup becomes the caller's responsibility (via `db.Close()`).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]