Copilot commented on code in PR #247:
URL: https://github.com/apache/skywalking-go/pull/247#discussion_r3362482310


##########
plugins/core/correlation.go:
##########
@@ -0,0 +1,127 @@
+// Licensed to Apache Software Foundation (ASF) under one or more contributor
+// license agreements. See the NOTICE file distributed with
+// this work for additional information regarding copyright
+// ownership. Apache Software Foundation (ASF) licenses this file to you under
+// the Apache License, Version 2.0 (the "License"); you may
+// not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package core
+
+import "sync"
+
+// CorrelationContext is the synchronized correlation key/value storage shared
+// by every span of a segment. It replaces the bare map[string]string that used
+// to live on SegmentContext: that map is legitimately reachable from multiple
+// goroutines (snapshot-continued child spans share it with the segment root),
+// so an unsynchronized write racing the exit-span header encoding or the
+// snapshot copy would crash the process with the unrecoverable
+// "concurrent map iteration and map write" fatal error.
+//
+// SegmentContext is copied by value between the spans of a segment, therefore
+// the lock cannot be embedded there; the pointer wrapper keeps a single lock
+// per logical correlation store. RWMutex is the right tool here (unlike the
+// span opLock): correlation is read on every propagation encode by potentially
+// concurrent goroutines while writes are rare.
+type CorrelationContext struct {
+       mu   sync.RWMutex
+       data map[string]string
+}
+
+// newCorrelationContext returns an empty store. The inner map is allocated
+// lazily on the first Set: most spans never touch correlation, and the empty
+// map would be one extra allocation on every segment.
+func newCorrelationContext() *CorrelationContext {
+       return &CorrelationContext{}
+}
+
+// newCorrelationContextFrom builds a store pre-filled with a copy of m
+// (typically the correlation decoded from the inbound propagation headers).
+func newCorrelationContextFrom(m map[string]string) *CorrelationContext {
+       c := &CorrelationContext{data: make(map[string]string, len(m))}
+       for k, v := range m {
+               c.data[k] = v
+       }
+       return c

Review Comment:
   newCorrelationContextFrom eagerly allocates an empty inner map even when the 
inbound correlation header is absent (m is nil/empty). This defeats the 
lazy-allocation design noted above and introduces an extra allocation on every 
inbound span that has refs but no correlation values.



##########
plugins/core/span_tracing.go:
##########
@@ -363,10 +382,24 @@ func newSegmentRoot(segmentSpan *SegmentSpanImpl) 
*RootSegmentSpan {
        s.notify = ch
        s.segment = make([]reporter.ReportedSpan, 0, 10)
        s.doneCh = make(chan int32)
+       s.collectorDone = make(chan struct{})
        go func() {
                total := -1
-               defer close(ch)
-               defer close(s.doneCh)
+               // Closing collectorDone (instead of the data channels) lets 
late
+               // senders exit safely through their select; the unclosed 
channels are
+               // reclaimed by the GC. It must stay in a defer so that even a 
panic
+               // below cannot leave senders blocked forever.
+               defer close(s.collectorDone)
+               defer func() {
+                       // Defense in depth: a panic here would kill the 
process since this
+                       // goroutine has no other recover.

Review Comment:
   collectorDone is only closed when the collector goroutine returns, but after 
the loop breaks it no longer receives from the collect channel while 
Reporter.SendTracing runs. If a late end0 send happens in this window (e.g., 
due to a miscount/bug or a blocking reporter), the sender goroutine can still 
block indefinitely because collectorDone won’t be closed until SendTracing 
finishes. Closing collectorDone immediately after the collect loop completes 
avoids that head-of-line blocking and matches the intent of letting late 
senders bail out safely.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to