Re: [PR] [CASSANDRA-20457] Limit number of OOM heap dumps to avoid full disk issue [cassandra]

via GitHub Fri, 30 May 2025 10:02:51 -0700


gowa commented on code in PR #4166:
URL: https://github.com/apache/cassandra/pull/4166#discussion_r2116272907



##########
conf/cassandra-env.sh:
##########
@@ -199,6 +199,45 @@ if [ "x$CASSANDRA_HEAPDUMP_DIR" = "x" ]; then
 fi
 JVM_OPTS="$JVM_OPTS -XX:HeapDumpPath=$CASSANDRA_HEAPDUMP_DIR/cassandra-`date +%s`-pid$$.hprof"
 
+# by default, enable cassandra heapdump files clean up, keeping 2 latest files
+# and 1 oldest heap dump file (this may help identify the earliest OOM issue)
+if [ "x$CASSANDRA_HEAPDUMP_CLEAN" = "x" ]; then
+    CASSANDRA_HEAPDUMP_CLEAN=1
+fi
+if [ "x$CASSANDRA_HEAPDUMP_KEEP_LAST_N_FILES" = "x" ]; then
+    CASSANDRA_HEAPDUMP_KEEP_LAST_N_FILES=2
+fi
+if [ "x$CASSANDRA_HEAPDUMP_KEEP_FIRST_N_FILES" = "x" ]; then
+    CASSANDRA_HEAPDUMP_KEEP_FIRST_N_FILES=1
+fi
+
+# this flag identifies that 'cassandra-env.sh' function
+# clean_heap_dump_files has been loaded and should be called.
+# this flag can be reset in bin/cassandra if -H option is passed.
+call_clean_heap_dump_files=1
+
+clean_heap_dump_files()
+{
+    if [ "x$CASSANDRA_HEAPDUMP_CLEAN" = "x1" ] && \
+           [ "$CASSANDRA_HEAPDUMP_KEEP_LAST_N_FILES" -ge 0 ] && \
+           [ "$CASSANDRA_HEAPDUMP_KEEP_FIRST_N_FILES" -ge 0 ] && \
+           [ -d "$CASSANDRA_HEAPDUMP_DIR" ]; then
+        # find heap dump files, take not more than 100 of them (in order not to overload xargs),
+        # sort by last modification date descending
+        # print those, that need to be removed
+        find "$CASSANDRA_HEAPDUMP_DIR" -name "cassandra-*-pid*.hprof" -type f | \

Review Comment:
   Again, I tried to find a reasonable balance between portability and 
correctness here. The approach I suggested ensures that `find` first selects 
only regular files matching the expected mask. Only then is `ls -t1` applied. 
The usual advice on the Internet is not to parse the output of `ls`, but since 
the files have already been filtered by `find`, I considered it safe to parse 
the `ls` output in this case.
   If I change this to `ls -t1 | grep cassandra-*-pid*.hprof`, then I will need 
to check that the matching lines are regular files, and only then send them to 
awk. I think I will do this.
   
   At the same time, finding more than 100 heap dumps does not look realistic. 
And even if that happens, each run removes the excess among the files it sees, 
so after several iterations everything will converge to the proper limits.
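
For illustration, the approach described above (filter with `find`, cap the batch at 100, sort with `ls -t1`, select removals with awk) could be sketched roughly as follows. This is a minimal sketch and not the code from the PR: the function name `list_heap_dumps_to_remove` and its three positional parameters are hypothetical, and it assumes that heap dump paths contain no whitespace or quotes, because `xargs` splits its input on them.

```shell
# Hypothetical helper (not part of the PR): print the heap dump files that
# fall outside the retention window, newest first.
#   $1 = heap dump directory
#   $2 = number of newest files to keep
#   $3 = number of oldest files to keep
list_heap_dumps_to_remove()
{
    hd_dir=$1
    hd_keep_last=$2
    hd_keep_first=$3

    # find restricts candidates to regular files matching the expected mask;
    # head caps the batch at 100 names so that xargs runs ls only once
    hd_candidates=`find "$hd_dir" -name "cassandra-*-pid*.hprof" -type f | head -n 100`

    # without this guard some xargs implementations would run a bare `ls`
    # and list the current directory instead
    if [ -z "$hd_candidates" ]; then
        return 0
    fi

    # ls -t1 sorts by modification time, newest first, one name per line;
    # awk skips the newest $2 entries and the oldest $3 entries
    printf '%s\n' "$hd_candidates" | xargs ls -t1 | \
        awk -v last="$hd_keep_last" -v first="$hd_keep_first" '
            { files[NR] = $0 }
            END {
                for (i = last + 1; i <= NR - first; i++)
                    print files[i]
            }'
}
```

With the defaults from the diff this would be called as `list_heap_dumps_to_remove "$CASSANDRA_HEAPDUMP_DIR" 2 1`, and the printed names could then be passed to `rm -f`. Because the function only prints names, it can be checked on a test directory before anything is deleted.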



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

