CRZbulabula commented on code in PR #17765:
URL: https://github.com/apache/iotdb/pull/17765#discussion_r3301105581


##########
scripts/conf/confignode-env.sh:
##########
@@ -68,6 +69,24 @@ calculate_memory_sizes()
     case "`uname`" in
         Linux)
             system_memory_in_mb=`free -m| sed -n '2p' | awk '{print $2}'`
+            # When running in a container/pod, use cgroup memory limit instead 
of host memory
+            if [ -f /sys/fs/cgroup/memory.max ]; then

Review Comment:
   Scope clarification — suggest expanding this code comment. The check reads 
the *root* cgroup, which is what containers see (Docker / Kubernetes bind-mount 
the container's own cgroup at `/sys/fs/cgroup/...`). For a cgroup-restricted 
process running on a bare-metal host (i.e. not in a container, but in a 
sub-cgroup with a memory limit), this won't detect the limit because 
`/proc/self/cgroup` is not consulted. That's a reasonable trade-off, but please 
make it explicit so future readers don't expect it to work on a host. For 
example:
   
   ```sh
   # When running in a container/pod, use cgroup memory limit instead of host 
memory.
   # This reads the root cgroup, which is what containers see; it does NOT walk
   # /proc/self/cgroup, so cgroup-restricted processes on bare metal are not 
covered.
   ```



##########
scripts/conf/datanode-env.sh:
##########
@@ -69,6 +70,24 @@ calculate_memory_sizes()
     case "`uname`" in
         Linux)
             system_memory_in_mb=`free -m| sed -n '2p' | awk '{print $2}'`
+            # When running in a container/pod, use cgroup memory limit instead 
of host memory
+            if [ -f /sys/fs/cgroup/memory.max ]; then

Review Comment:
   Mirror of the suggestion on `confignode-env.sh` — please add the same 
scope-clarification comment here (root cgroup only; doesn't walk 
`/proc/self/cgroup`) so the two files stay in sync. Ideally these would share a 
helper in `scripts/conf/iotdb-common.sh`.



##########
scripts/conf/windows/confignode-env.bat:
##########
@@ -20,7 +20,8 @@
 @echo off
 
 @REM You can set datanode memory size, example '2G' or '2048M'

Review Comment:
   Drive-by nit (pre-existing, but you're touching the next line anyway): the 
comment says `datanode memory size` but this is the **confignode** env file. 
Free win to fix while here.



##########
scripts/conf/confignode-env.sh:
##########
@@ -68,6 +69,24 @@ calculate_memory_sizes()
     case "`uname`" in
         Linux)
             system_memory_in_mb=`free -m| sed -n '2p' | awk '{print $2}'`
+            # When running in a container/pod, use cgroup memory limit instead 
of host memory
+            if [ -f /sys/fs/cgroup/memory.max ]; then
+                # cgroup v2
+                cgroup_mem=`cat /sys/fs/cgroup/memory.max`
+                if [ "$cgroup_mem" != "max" ]; then
+                    cgroup_mem_in_mb=`expr $cgroup_mem / 1024 / 1024`
+                    if [ "$cgroup_mem_in_mb" -lt "$system_memory_in_mb" ]; then
+                        system_memory_in_mb=$cgroup_mem_in_mb
+                    fi
+                fi
+            elif [ -f /sys/fs/cgroup/memory/memory.limit_in_bytes ]; then
+                # cgroup v1
+                cgroup_mem=`cat /sys/fs/cgroup/memory/memory.limit_in_bytes`
+                cgroup_mem_in_mb=`expr $cgroup_mem / 1024 / 1024`

Review Comment:
   Unlike cgroup v2 (which has the explicit `"max"` string sentinel handled on 
line 77), cgroup v1 represents *no limit* with a kernel sentinel integer 
(typically `PAGE_COUNTER_MAX`, e.g. `9223372036854771712`). This branch doesn't 
filter it; instead the downstream `-lt` comparison on line 87 rejects it 
because the converted value is enormously larger than `system_memory_in_mb`. 
The logic is correct under bash, but the reasoning is non-obvious. Suggest a 
one-line code comment, e.g.:
   
   ```sh
   # v1 has no "max" sentinel — when unset, this file holds a huge integer 
(~PAGE_COUNTER_MAX);
   # the -lt check below naturally rejects it, so no explicit guard is needed.
   ```
   
   Side note: `expr` overflow on a 32-bit-only `expr` could be a theoretical 
concern, but the shebang is `#!/bin/bash` so this is fine in practice.



##########
scripts/conf/datanode-env.sh:
##########
@@ -69,6 +70,24 @@ calculate_memory_sizes()
     case "`uname`" in
         Linux)
             system_memory_in_mb=`free -m| sed -n '2p' | awk '{print $2}'`
+            # When running in a container/pod, use cgroup memory limit instead 
of host memory
+            if [ -f /sys/fs/cgroup/memory.max ]; then
+                # cgroup v2
+                cgroup_mem=`cat /sys/fs/cgroup/memory.max`
+                if [ "$cgroup_mem" != "max" ]; then
+                    cgroup_mem_in_mb=`expr $cgroup_mem / 1024 / 1024`
+                    if [ "$cgroup_mem_in_mb" -lt "$system_memory_in_mb" ]; then
+                        system_memory_in_mb=$cgroup_mem_in_mb
+                    fi
+                fi
+            elif [ -f /sys/fs/cgroup/memory/memory.limit_in_bytes ]; then
+                # cgroup v1
+                cgroup_mem=`cat /sys/fs/cgroup/memory/memory.limit_in_bytes`
+                cgroup_mem_in_mb=`expr $cgroup_mem / 1024 / 1024`

Review Comment:
   Mirror of the suggestion on `confignode-env.sh` line 85 — please add a code 
comment noting that the cgroup v1 "no limit" sentinel (`PAGE_COUNTER_MAX`) is 
rejected by the `-lt` check rather than being filtered explicitly. Otherwise 
it's not obvious why there's no parallel to the v2 `!= "max"` guard.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to