Copilot commented on code in PR #1127:
URL:
https://github.com/apache/skywalking-banyandb/pull/1127#discussion_r3227096687
##########
banyand/property/db/repair_gossip.go:
##########
@@ -307,13 +357,15 @@ func (r *repairGossipClient) Rev(ctx context.Context,
tracer gossip.Trace, nextN
syncSpan := tracer.CreateSpan(startSyncSpan, "repair
property")
syncSpan.Tag(gossip.TraceTagOperateType,
gossip.TraceTagOperateRepairProperty)
syncSpan.Tag(gossip.TraceTagPropertyID,
string(sync.Property.Id))
- updated, newer, err := syncShard.repair(ctx,
sync.Property.Id, sync.Property.Property, sync.Property.DeleteTime)
+ updated, newer, err := r.executeRepairWithBudget(ctx,
syncShard, sync.Property.Id, sync.Property.Property, sync.Property.DeleteTime,
request.Group)
syncSpan.Tag("updated", fmt.Sprintf("%t", updated))
syncSpan.Tag("has_newer", fmt.Sprintf("%t", newer !=
nil))
if err != nil {
Review Comment:
In the PropertySync handling block, `syncSpan.End()` isn’t guaranteed to
run. If `updated == false` and either `newer == nil` or `sync.From ==
...MISSING`, the code falls through without ending the span, which can leak
spans and distort tracing. Consider deferring `syncSpan.End()` immediately
after creation (or ensuring it’s called on every exit path of this case).
##########
banyand/property/gossip/server.go:
##########
@@ -328,23 +336,31 @@ func (q *protocolHandler) addToProcess(request
*propertyv1.PropagationRequest, t
// if the latest round is out of ttl, then needs to change to current
node to executing
if time.Since(groupShard.latestTime) > q.s.scheduleInterval/2 {
groupShard.originalNodeID = request.Context.OriginNode
- select {
- case groupShard.channel <- handlingRequestData:
- q.notifyNewRequest()
- default:
- q.s.log.Error().Msgf("ready to added propagation into
group shard %s(%d) in a new round, but it's full", request.Group,
request.ShardId)
+ if groupShard.pending != nil {
+ q.s.serverMetrics.totalCoalesced.Inc(1, request.Group)
+ q.s.log.Info().
+ Str("group", request.Group).
+ Uint32("shardID", request.ShardId).
+ Str("originNode", request.Context.OriginNode).
+ Msg("propagation request coalesced into pending
(TTL takeover)")
}
+ groupShard.pending = handlingRequestData
+ q.notifyNewRequest()
return true
Review Comment:
`groupShard.latestTime` is only initialized on first creation and never
updated when new requests are accepted into `pending`. This makes the TTL check
`time.Since(groupShard.latestTime) > q.s.scheduleInterval/2` use a stale
timestamp, so a different originator can be accepted immediately after a recent
request (because the TTL condition stays true forever after the first expiry).
Update `latestTime` whenever you enqueue/overwrite `pending` so the TTL window
is based on the most recent accepted round/activity.
##########
test/property_repair/shared_utils.go:
##########
@@ -334,6 +354,106 @@ func parseRepairSuccessCount(content string) int64 {
return totalCount
}
+// parseRepairFailureCount parses the repair_failure_count from prometheus
metrics text.
+func parseRepairFailureCount(content string) int64 {
+ // Look for metric lines like:
banyandb_property_scheduler_property_repair_failure_count{group="perf-test-group",shard="0"}
5
+ re :=
regexp.MustCompile(`banyandb_property_scheduler_property_repair_failure_count\{[^}]+\}\s+(\d+(?:\.\d+)?)`)
+ matches := re.FindAllStringSubmatch(content, -1)
+
+ var totalCount int64
+ for _, match := range matches {
+ if len(match) >= 2 {
Review Comment:
The Prometheus value regexes (e.g., `\d+(?:\.\d+)?`) won’t match valid
exposition formats like scientific notation (e.g., `1.23e+06`) or `+Inf`
(histograms). This can cause these parsers to silently return 0 and make the
integration/perf checks flaky. Consider using a more complete Prometheus float
pattern (optionally including exponent and `Inf`/`NaN`) and reusing it across
all `parse*` helpers.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]