Hi!

I want to keep you updated: the problem still isn't fixed, so I'm running this simple script via cron to avoid an uncontrolled kernel panic:

---snip---
#!/usr/bin/sh
# Detect RAM corruption. If detected, log a message and reboot
# to prevent a kernel panic

# cron jobs need a PATH
PATH=/sbin:/usr/sbin:/usr/bin:/bin
if journalctl -b -g 'Code: Bad RIP value|BUG: Bad rss-counter state mm:' >/dev/null
then
    MSG='RAM corruption detected, starting pro-active reboot'
    logger -t reboot-before-panic -p local0.notice "$MSG"
    shutdown -r +1 "$MSG"
fi
---

I still suspect it might be related to snapshots being made. After a few days of running, the problems started again like this:

Mar 26 23:00:01 h19 systemd[1]: Started Timeline of Snapper Snapshots.
Mar 26 23:00:01 h19 dbus-daemon[5700]: [system] Activating via systemd: service name='org.opensuse.Snapper' unit='snapperd.service' requested by ':1.343' (uid=0 pid=11200 comm="/usr/lib/snapper/systemd-helper --timeline ")
Mar 26 23:00:01 h19 systemd[1]: Starting DBus interface for snapper...
Mar 26 23:00:01 h19 dbus-daemon[5700]: [system] Successfully activated service 'org.opensuse.Snapper'
Mar 26 23:00:01 h19 systemd[1]: Started DBus interface for snapper.
Mar 26 23:00:01 h19 systemd[1]: snapper-timeline.service: Succeeded.
Mar 26 23:00:01 h19 systemd[1]: Created slice Slice /system/systemd-coredump.
Mar 26 23:00:01 h19 systemd[1]: Started Process Core Dump (PID 11227/UID 0).
Mar 26 23:00:01 h19 systemd-coredump[11231]: Process 11226 (run-crons) of user 0 dumped core.

    Stack trace of thread 11226:
    #0  0x00007f89ff9dacdb raise (libc.so.6 + 0x4acdb)
    #1  0x00007f89ff9dc324 abort (libc.so.6 + 0x4c324)
    #2  0x00007f89ffa20b07 __libc_message (libc.so.6 + 0x90b07)
    #3  0x00007f89ffa28b8a malloc_printerr (libc.so.6 + 0x98b8a)
    #4  0x00007f89ffa2a634 _int_free (libc.so.6 + 0x9a634)
    #5  0x000055c998de3963 command_substitute (bash + 0x9f963)
    #6  0x000055c998ddb380 n/a (bash + 0x97380)
    #7  0x000055c998ddda57 n/a (bash + 0x99a57)
    #8  0x000055c998ddcb94 n/a (bash + 0x98b94)
    #9  0x000055c998dc8955 n/a (bash + 0x84955)
    #10 0x000055c998dc756d execute_command_internal (bash + 0x8356d)
    #11 0x000055c998dc86e1 execute_command (bash + 0x846e1)
    #12 0x000055c998dc76fd execute_command_internal (bash + 0x836fd)
    #13 0x000055c998dc86e1 execute_command (bash + 0x846e1)
    #14 0x000055c998dc8516 execute_command_internal (bash + 0x84516)
    #15 0x000055c998dc773c execute_command_internal (bash + 0x8373c)
    #16 0x000055c998dc86e1 execute_command (bash + 0x846e1)
    #17 0x000055c998dc8007 execute_command_internal (bash + 0x84007)
    #18 0x000055c998dc86e1 execute_command (bash + 0x846e1)
    #19 0x000055c998dbce2b reader_loop (bash + 0x78e2b)
    #20 0x000055c998dbcabc main (bash + 0x78abc)
    #21 0x00007f89ff9c52bd __libc_start_main (libc.so.6 + 0x352bd)
    #22 0x000055c998df729a _start (bash + 0xb329a)
Mar 26 23:00:01 h19 systemd[1]: systemd-coredump@0-11227-0.service: Succeeded.
Mar 26 23:00:01 h19 kernel: BUG: Bad rss-counter state mm:00000000acc74328 idx:1 val:14
Mar 26 23:01:01 h19 systemd[1]: snapperd.service: Succeeded.
Mar 26 23:05:01 h19 reboot-before-panic[12356]: RAM corruption detected, starting pro-active reboot

Regards,
Ulrich
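
P.S. For anyone who wants to try the detection pattern without waiting for the next incident, here is a minimal sketch that runs the same expression against text on stdin instead of the live journal. The helper name has_corruption_marker is made up for illustration, and the sketch assumes that grep -E treats this simple alternation the same way journalctl -g does; it is not a replacement for the cron script.

```shell
#!/bin/sh
# Sketch only: check the detection pattern against sample text on stdin.
# has_corruption_marker is a hypothetical helper, not part of the cron script.
# Assumption: grep -E handles this alternation like journalctl -g does.
PATTERN='Code: Bad RIP value|BUG: Bad rss-counter state mm:'

has_corruption_marker() {
    # Exit status 0 if any input line matches, non-zero otherwise.
    grep -Eq "$PATTERN"
}

# The kernel line from the log above should be detected ...
if printf '%s\n' 'h19 kernel: BUG: Bad rss-counter state mm:00000000acc74328 idx:1 val:14' \
    | has_corruption_marker
then
    echo "match"
fi

# ... while an ordinary systemd message should not be.
if ! printf '%s\n' 'h19 systemd[1]: Started DBus interface for snapper.' \
    | has_corruption_marker
then
    echo "no match"
fi
```

Saved journal output can be checked the same way by piping it into the function.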