Your message dated Wed, 28 Apr 2021 18:18:02 +0200
with message-id <[email protected]>
and subject line Closing this bug
has caused the Debian Bug report #719948,
regarding Kernel BUG in cgroup freezer when repeatedly freezing/thawing a group
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact [email protected]
immediately.)

-- 
719948: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=719948
Debian Bug Tracking System
Contact [email protected] with problems
--- Begin Message ---
Package: src:linux
Version: 3.2.46-1
Severity: important

Dear Debian Linux Kernel Maintainers,

If I create a cgroup freezer container on an SMP machine and repeatedly
freeze/thaw it in a loop, the kernel freezes with a BUG.

To reproduce, create a cgroups freezer container with a single process in
it on an SMP machine with wheezy standard kernel 3.2.46-1:

    mkdir /dev/cgroups-freezer
    mount -t cgroup -o freezer freezer /dev/cgroups-freezer
    mkdir /dev/cgroups-freezer/crashtest
    cd /dev/cgroups-freezer/crashtest
    sleep 3600 &
    echo $! > tasks

Then run this ugly perl one-liner from within the same "crashtest"
directory:

    perl -e 'while (1) { open FILE, ">freezer.state" or die; print FILE "FROZEN" or die; close FILE or die; open FILE, ">freezer.state" or die; print FILE "THAWED" or die; close FILE or die; };'

On my test machines, the following BUG reproducibly happens in less than
a second, and the machine locks up:

[ 2703.254372] ------------[ cut here ]------------
[ 2703.254530] kernel BUG at /build/linux-dJLVDt/linux-3.2.46/kernel/cgroup_freezer.c:241!
[ 2703.254769] invalid opcode: 0000 [#1] SMP
[ 2703.254917] Modules linked in: netconsole nfnetlink_log nfnetlink configfs nfsd nfs nfs_acl auth_rpcgss fscache lockd sunrpc loop snd_intel8x0 snd_ac97_codec snd_pcm snd_page_alloc snd_timer snd soundcore ac97_bus ac battery processor parport_pc parport power_supply thermal_sys button psmouse serio_raw pcspkr joydev evdev i2c_piix4 i2c_core vboxguest(O) ext4 crc16 jbd2 mbcache usbhid hid sg sr_mod sd_mod cdrom crc_t10dif ata_generic ata_piix ohci_hcd ehci_hcd ahci libahci usbcore e1000 libata scsi_mod usb_common [last unloaded: netconsole]
[ 2703.256018]
[ 2703.256018] Pid: 2835, comm: perl Tainted: G O 3.2.0-4-686-pae #1 Debian 3.2.46-1 innotek GmbH VirtualBox/VirtualBox
[ 2703.256018] EIP: 0060:[<c106dc6f>] EFLAGS: 00010002 CPU: 0
[ 2703.256018] EIP is at update_if_frozen.isra.1+0x47/0x73
[ 2703.256018] EAX: 00000000 EBX: 00000001 ECX: df2ef4c0 EDX: dd265ee4
[ 2703.256018] ESI: 00000001 EDI: dd6a6350 EBP: 00000000 ESP: dd265edc
[ 2703.256018] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 2703.256018] Process perl (pid: 2835, ti=dd264000 task=df248ee0 task.ti=dd264000)
[ 2703.256018] Stack:
[ 2703.256018] dd265ee4 df2ef4c0 00000000 de2b1284 df2ef4c0 dd6a6340 dd265f28 00000002
[ 2703.256018] c106dd5a c12c271a c1165b6c c106dd01 c13e892c dd265f28 0916b860 c106b49d
[ 2703.256018] 00000006 df2ef4c0 00001000 5a4f5246 00004e45 520eb4b9 2fb866f6 520eb4bf
[ 2703.256018] Call Trace:
[ 2703.256018] [<c106dd5a>] ? freezer_write+0x59/0x13c
[ 2703.256018] [<c12c271a>] ? _cond_resched+0x5/0x18
[ 2703.256018] [<c1165b6c>] ? _copy_from_user+0x28/0x47
[ 2703.256018] [<c106dd01>] ? freezer_read+0x66/0x66
[ 2703.256018] [<c106b49d>] ? cgroup_file_write+0x18f/0x1e1
[ 2703.256018] [<c10ccddf>] ? rw_verify_area+0xc6/0xe7
[ 2703.256018] [<c106b30e>] ? cgroup_file_open+0x87/0x87
[ 2703.256018] [<c10cd07f>] ? vfs_write+0x83/0xd4
[ 2703.256018] [<c10cd23f>] ? sys_write+0x3d/0x61
[ 2703.256018] [<c12c7f5f>] ? sysenter_do_call+0x12/0x28
[ 2703.256018] Code: e8 2b f6 ff ff eb 0b e8 2d ff ff ff 46 3c 01 83 db ff 8b 44 24 04 8d 54 24 08 e8 fe f6 ff ff 85 c0 75 e4 85 ed 75 06 85 db 74 17 <0f> 0b 4d 75 0c 39 f3 75 0e c7 07 02 00 00 00 eb 06 39 f3 74 02
[ 2703.256018] EIP: [<c106dc6f>] update_if_frozen.isra.1+0x47/0x73 SS:ESP 0068:dd265edc
[ 2703.256018] ---[ end trace 29c9f3fc0f436abe ]---

I have duplicated this on wheezy with this kernel:

    Linux [hostname] 3.2.0-4-686-pae #1 SMP Debian 3.2.46-1 i686 GNU/Linux

And on squeeze with the same kernel backported, but on different amd64
(non-virtual) hardware:

    Linux [hostname] 3.2.0-0.bpo.4-amd64 #1 SMP Debian 3.2.46-1~bpo60+1 x86_64 GNU/Linux

In my testing, the BUG only happens on SMP machines, and not on single
CPU machines.

Also, if you include a slight delay before the freeze, the problem
doesn't happen reproducibly, at least to me:

    perl -e 'while (1) { select (undef, undef, undef, 0.01); open FILE, ">freezer.state" or die; print FILE "FROZEN" or die; close FILE or die; open FILE, ">freezer.state" or die; print FILE "THAWED" or die; close FILE or die; };'
    # does not BUG due to the select() delay

Looking at line 241 of kernel/cgroup_freezer.c in version 3.2.46,
something is clearly wrong: the code believes the state of the group is
CGROUP_THAWED, and yet it contains a frozen task. The fact that it's both
timing- and SMP-dependent suggests a race condition of some kind.

-- System Information:
Debian Release: 7.1
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: i386 (i686)

Kernel: Linux 3.2.0-4-686-pae (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

-- 
Robert L Mathews, Tiger Technologies
--- End Message ---
--- Begin Message ---
This bug was filed for a very old kernel. If you can reproduce it with
- the current version in unstable/testing
- the latest kernel from buster.backports
please reopen the bug, see https://www.debian.org/Bugs/server-control
--- End Message ---

