Copilot commented on code in PR #65:
URL: 
https://github.com/apache/skywalking-banyandb-helm/pull/65#discussion_r3502977355


##########
chart/values.yaml:
##########
@@ -1067,6 +1067,23 @@ cluster:
           maxArtifacts: 10
           ## @param 
cluster.fodc.agent.config.crashCollection.diagnosisMemoryPercent Set banyand 
GOMEMLIMIT to this percent of the cgroup memory limit, reserving headroom for 
post-panic diagnostics (0 disables)
           diagnosisMemoryPercent: 50
+      ## Memory-pressure pprof capture. When a data container's RSS approaches 
its
+      ## cgroup memory limit, the agent pulls heap+goroutine pprof from the 
data
+      ## container's pprof endpoint (port 6060) onto a shared emptyDir, served 
by the
+      ## proxy's HTTP API. Requires a data-container memory limit to be 
effective.
+      pressureProfiler:
+        ## @param cluster.fodc.agent.pressureProfiler.enabled Enable automatic 
heap+goroutine pprof capture under memory pressure
+        enabled: true
+        ## @param cluster.fodc.agent.pressureProfiler.triggerPercent Capture 
when RSS / cgroup_limit reaches this percentage
+        triggerPercent: 75
+        ## @param cluster.fodc.agent.pressureProfiler.cooldown Minimum 
interval between two captures
+        cooldown: 5m
+        ## @param cluster.fodc.agent.pressureProfiler.dir Directory (on the 
writable volume) where captured profiles are stored; must equal the 
pressure-profiles mount path
+        dir: /tmp/pressure-profiles
+        ## @param cluster.fodc.agent.pressureProfiler.maxArtifacts Maximum 
number of capture events to retain (lowest-RSS evicted first)
+        maxArtifacts: 16
+        ## @param cluster.fodc.agent.pressureProfiler.maxDiskSize Maximum 
total on-disk size for retained events; case-insensitive, all suffixes are 
1024-based (K/M/G/T, KB/MB/GB/TB, Ki/Mi/Gi/Ti), or a plain byte count; 0 
disables the disk bound
+        maxDiskSize: 512MB

Review Comment:
   The default maxDiskSize value uses "512MB", but the parameter 
documentation/examples use IEC units like "Mi"/"Gi". Using "512Mi" here would 
better match the docs and the surrounding chart conventions (most memory sizes 
in this chart use Mi/Gi).



##########
doc/parameters.md:
##########
@@ -396,6 +396,12 @@ The content of this document describes the parameters that 
can be configured in
 | `cluster.fodc.agent.config.crashCollection.dir`                    | Shared 
path where banyand writes panic.json and fodc-agent reads it                    
                                       | `/tmp/crash`                           
         |
 | `cluster.fodc.agent.config.crashCollection.maxArtifacts`           | Max 
crash artifact directories banyand retains (oldest removed first; 0 disables 
pruning)                                     | `10`                             
               |
 | `cluster.fodc.agent.config.crashCollection.diagnosisMemoryPercent` | Set 
banyand GOMEMLIMIT to this percent of the cgroup memory limit, reserving 
headroom for post-panic diagnostics (0 disables) | `50`                         
                   |
+| `cluster.fodc.agent.pressureProfiler.enabled`                      | Enable 
automatic heap pprof capture under memory pressure                              
                                       | `true`                                 
         |

Review Comment:
   This parameter description mentions only heap pprof capture, but the 
values.yaml documentation describes capturing heap+goroutine profiles. Aligning 
the wording here will avoid user confusion about what gets captured.



##########
chart/templates/_helpers.tpl:
##########
@@ -367,3 +367,25 @@ Resolve discovery file data key
 {{- $cm := $file.configMap | default dict }}
 {{- default "nodes.yaml" $cm.key }}
 {{- end }}
+
+{{/*
+Convert a human-readable size to an integer number of bytes.
+Case-insensitive. All suffixes are 1024-based (binary): K/M/G/T, KB/MB/GB/TB,
+Ki/Mi/Gi/Ti (also KiB/MiB/GiB/TiB). No suffix -> plain byte count. Empty or 0 
-> 0.
+*/}}
+{{- define "banyandb.toBytes" -}}
+{{- $s := . | toString | trim -}}
+{{- if or (eq $s "") (eq $s "0") -}}
+0
+{{- else -}}
+{{- $num := $s | regexFind "^[0-9]+" | int64 -}}
+{{- $unit := lower (regexReplaceAll "^[0-9]+" $s "" | trim) -}}
+{{- $mult := int64 1 -}}
+{{- if or (eq $unit "k") (eq $unit "kb") (eq $unit "ki") (eq $unit "kib") 
-}}{{- $mult = int64 1024 -}}
+{{- else if or (eq $unit "m") (eq $unit "mb") (eq $unit "mi") (eq $unit "mib") 
-}}{{- $mult = int64 1048576 -}}
+{{- else if or (eq $unit "g") (eq $unit "gb") (eq $unit "gi") (eq $unit "gib") 
-}}{{- $mult = int64 1073741824 -}}
+{{- else if or (eq $unit "t") (eq $unit "tb") (eq $unit "ti") (eq $unit "tib") 
-}}{{- $mult = int64 1099511627776 -}}
+{{- end -}}
+{{- mul $num $mult -}}

Review Comment:
   The new banyandb.toBytes helper silently accepts invalid inputs (e.g. 
unknown suffixes or decimal values like "1.5Gi") and will return an incorrect 
byte count instead of failing fast. This can lead to surprising max-disk-byte 
limits and hard-to-debug behavior; consider validating that the value starts 
with digits and that the unit (if present) is one of the supported suffixes, 
otherwise Helm should fail with a clear error message.



##########
chart/values.yaml:
##########
@@ -1067,6 +1067,23 @@ cluster:
           maxArtifacts: 10
           ## @param 
cluster.fodc.agent.config.crashCollection.diagnosisMemoryPercent Set banyand 
GOMEMLIMIT to this percent of the cgroup memory limit, reserving headroom for 
post-panic diagnostics (0 disables)
           diagnosisMemoryPercent: 50
+      ## Memory-pressure pprof capture. When a data container's RSS approaches 
its
+      ## cgroup memory limit, the agent pulls heap+goroutine pprof from the 
data
+      ## container's pprof endpoint (port 6060) onto a shared emptyDir, served 
by the
+      ## proxy's HTTP API. Requires a data-container memory limit to be 
effective.

Review Comment:
   The description says profiles are written to a "shared emptyDir", but the 
templates only mount this volume into the fodc-agent container. Consider 
rewording to avoid implying the volume is shared with other containers in the 
pod.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to