Your message dated Sat, 24 Apr 2021 14:34:34 -0700 (PDT) with message-id <[email protected]> and subject line Closing this bug (BTS maintenance for src:linux bugs) has caused the Debian Bug report #741239, regarding task kjournald blocked for more than 120 seconds (very slow I/O, reboot required) to be marked as done.
This means that you claim that the problem has been dealt with. If this is not the case it is now your responsibility to reopen the Bug report if necessary, and/or fix the problem forthwith. (NB: If you are a system administrator and have no idea what this message is talking about, this may indicate a serious mail system misconfiguration somewhere. Please contact [email protected] immediately.) -- 741239: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=741239 Debian Bug Tracking System Contact [email protected] with problems
--- Begin Message ---Package: linux-image-3.2.0-4-amd64 Version: 3.2.51-1 I had an image file mounted loopback. A few minutes after running rsync on the mountpoint some bug was triggered and all I/O became extremely slow after this: Mar 9 11:21:20 st1 kernel: [1702551.406424] INFO: task kjournald:8705 blocked for more than 120 seconds. Mar 9 11:21:20 st1 kernel: [1702551.406428] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Mar 9 11:21:20 st1 kernel: [1702551.406431] kjournald D ffff88041fb93780 0 8705 2 0x00000000 Mar 9 11:21:20 st1 kernel: [1702551.406436] ffff880403ff2fe0 0000000000000046 ffffffff00000000 ffff88040f7a4f60 Mar 9 11:21:20 st1 kernel: [1702551.406442] 0000000000013780 ffff880406069fd8 ffff880406069fd8 ffff880403ff2fe0 Mar 9 11:21:20 st1 kernel: [1702551.406446] ffffffff810135d2 0000000181066245 ffff8804048144b0 ffff88041fb93fd0 Mar 9 11:21:20 st1 kernel: [1702551.406451] Call Trace: Mar 9 11:21:20 st1 kernel: [1702551.406459] [<ffffffff810135d2>] ? read_tsc+0x5/0x14 Mar 9 11:21:20 st1 kernel: [1702551.406464] [<ffffffff8111d315>] ? wait_on_buffer+0x28/0x28 Mar 9 11:21:20 st1 kernel: [1702551.406469] [<ffffffff8134e141>] ? io_schedule+0x59/0x71 Mar 9 11:21:20 st1 kernel: [1702551.406473] [<ffffffff8111d31b>] ? sleep_on_buffer+0x6/0xa Mar 9 11:21:20 st1 kernel: [1702551.406476] [<ffffffff8134e584>] ? __wait_on_bit+0x3e/0x71 Mar 9 11:21:20 st1 kernel: [1702551.406480] [<ffffffff8134e626>] ? out_of_line_wait_on_bit+0x6f/0x78 Mar 9 11:21:20 st1 kernel: [1702551.406483] [<ffffffff8111d315>] ? wait_on_buffer+0x28/0x28 Mar 9 11:21:20 st1 kernel: [1702551.406489] [<ffffffff8105fcad>] ? autoremove_wake_function+0x2a/0x2a Mar 9 11:21:20 st1 kernel: [1702551.406497] [<ffffffffa05b156c>] ? journal_commit_transaction+0x912/0xdce [jbd] Mar 9 11:21:20 st1 kernel: [1702551.406502] [<ffffffff81070fc1>] ? arch_local_irq_save+0x11/0x17 Mar 9 11:21:20 st1 kernel: [1702551.406506] [<ffffffff8134f209>] ? _raw_spin_lock_irqsave+0x9/0x25 Mar 9 11:21:20 st1 kernel: [1702551.406510] [<ffffffff8134f247>] ? _raw_spin_unlock_irqrestore+0xe/0xf Mar 9 11:21:20 st1 kernel: [1702551.406516] [<ffffffffa05b4643>] ? kjournald+0xe0/0x21e [jbd] Mar 9 11:21:20 st1 kernel: [1702551.406521] [<ffffffff8105fc83>] ? add_wait_queue+0x3c/0x3c Mar 9 11:21:20 st1 kernel: [1702551.406526] [<ffffffffa05b4563>] ? commit_timeout+0x5/0x5 [jbd] Mar 9 11:21:20 st1 kernel: [1702551.406530] [<ffffffff8105f631>] ? kthread+0x76/0x7e Mar 9 11:21:20 st1 kernel: [1702551.406535] [<ffffffff81356374>] ? kernel_thread_helper+0x4/0x10 Mar 9 11:21:20 st1 kernel: [1702551.406540] [<ffffffff8105f5bb>] ? kthread_worker_fn+0x139/0x139 Mar 9 11:21:20 st1 kernel: [1702551.406544] [<ffffffff81356370>] ? gs_change+0x13/0x13 Mar 9 11:21:20 st1 kernel: [1702551.406548] INFO: task rsync:4668 blocked for more than 120 seconds. Mar 9 11:21:20 st1 kernel: [1702551.406550] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Mar 9 11:21:20 st1 kernel: [1702551.406552] rsync D ffff88041fb13780 0 4668 8750 0x00000004 Mar 9 11:21:20 st1 kernel: [1702551.406557] ffff880404a9e340 0000000000000086 ffff880100000000 ffff88040f775610 Mar 9 11:21:20 st1 kernel: [1702551.406561] 0000000000013780 ffff880406293fd8 ffff880406293fd8 ffff880404a9e340 Mar 9 11:21:20 st1 kernel: [1702551.406566] 0000000000000246 000000018134f209 ffff88041fdd5c18 ffff8803c84a7978 Mar 9 11:21:20 st1 kernel: [1702551.406570] Call Trace: Mar 9 11:21:20 st1 kernel: [1702551.406575] [<ffffffffa05b0513>] ? do_get_write_access+0x1ad/0x367 [jbd] Mar 9 11:21:20 st1 kernel: [1702551.406580] [<ffffffff8105fcad>] ? autoremove_wake_function+0x2a/0x2a Mar 9 11:21:20 st1 kernel: [1702551.406589] [<ffffffffa05df078>] ? ext3_set_acl+0x87/0x1fc [ext3] Mar 9 11:21:20 st1 kernel: [1702551.406594] [<ffffffffa05b07e5>] ? journal_get_write_access+0x21/0x38 [jbd] Mar 9 11:21:20 st1 kernel: [1702551.406600] [<ffffffffa05cac8f>] ? __ext3_get_inode_loc+0xf4/0x2cf [ext3] Mar 9 11:21:20 st1 kernel: [1702551.406607] [<ffffffffa05dd459>] ? __ext3_journal_get_write_access+0x1c/0x49 [ext3] Mar 9 11:21:20 st1 kernel: [1702551.406614] [<ffffffffa05df078>] ? ext3_set_acl+0x87/0x1fc [ext3] Mar 9 11:21:20 st1 kernel: [1702551.406620] [<ffffffffa05cc2ed>] ? ext3_reserve_inode_write+0x39/0x7b [ext3] Mar 9 11:21:20 st1 kernel: [1702551.406625] [<ffffffffa05af372>] ? start_this_handle+0x2b8/0x31c [jbd] Mar 9 11:21:20 st1 kernel: [1702551.406631] [<ffffffffa05cc389>] ? ext3_mark_inode_dirty+0x5a/0x79 [ext3] Mar 9 11:21:20 st1 kernel: [1702551.406638] [<ffffffffa05df078>] ? ext3_set_acl+0x87/0x1fc [ext3] Mar 9 11:21:20 st1 kernel: [1702551.406645] [<ffffffffa05df2c8>] ? ext3_xattr_set_acl+0xdb/0x11d [ext3] Mar 9 11:21:20 st1 kernel: [1702551.406650] [<ffffffff811130ce>] ? generic_setxattr+0x69/0x6f Mar 9 11:21:20 st1 kernel: [1702551.406654] [<ffffffff8111375d>] ? __vfs_setxattr_noperm+0x66/0xcd Mar 9 11:21:20 st1 kernel: [1702551.406657] [<ffffffff811138e5>] ? setxattr+0x95/0xf0 Mar 9 11:21:20 st1 kernel: [1702551.406662] [<ffffffff81036628>] ? should_resched+0x5/0x23 Mar 9 11:21:20 st1 kernel: [1702551.406666] [<ffffffff81113831>] ? vfs_setxattr+0x6d/0x8c Mar 9 11:21:20 st1 kernel: [1702551.406669] [<ffffffff81036628>] ? should_resched+0x5/0x23 Mar 9 11:21:20 st1 kernel: [1702551.406672] [<ffffffff8111390d>] ? setxattr+0xbd/0xf0 Mar 9 11:21:20 st1 kernel: [1702551.406678] [<ffffffff810eb124>] ? kmem_cache_free+0x2d/0x69 Mar 9 11:21:20 st1 kernel: [1702551.406683] [<ffffffff81105d33>] ? user_path_at_empty+0x53/0x7b Mar 9 11:21:20 st1 kernel: [1702551.406688] [<ffffffff81113aed>] ? sys_setxattr+0x6a/0x8f Mar 9 11:21:20 st1 kernel: [1702551.406692] [<ffffffff81354212>] ? system_call_fastpath+0x16/0x1b The process 'rsync' became unkillable. These kernel messages continued for the next 10 minutes or so with some variations. Even after the messages stopped, I/O was still extremely slow... freezing up all the time. With slow/freezing I/O I don't mean just to this mountpoint (which actually seemed totally frozen) but also to/from the other disks. This box is acting as a NFS server. Remote systems saw a continuing 'nfs: server .. not respondig'.. followed by 'nfs: .. OK' a while later, and then not responding again. The exact situation was: (rsync on..) ext3 -> loopback/DM -> ext4 -> DRBD -> RAID -> disks I used kpartx to create loopback/DM devices based on an .img file which was mounted in /data (ext4 on DRBD), then mounted /dev/mapper/loop1p1 on /mnt (ext3), and ran an rsync on this /mnt. All data was previously synchronized without ACL/EA's and I re-ran rsync with -AX added. This means the rsync job was mostly synchronizing ACL's and Extended Attributes, maybe that is a significant clue. The machine has 16GB of RAM, of which typically 15G is available for caching/buffers. At the time of the hang this was also the case, I checked with 'free'. Some possibly relevant vm sysctl settings: vm.dirty_background_ratio = 10 vm.dirty_background_bytes = 0 vm.dirty_ratio = 0 vm.dirty_bytes = 15000000 vm.dirty_writeback_centisecs = 500 vm.dirty_expire_centisecs = 3000 As you can see dirty bytes is very low (just 15M), so that rules out kjournald blocking due to a very large dirty buffer. Hopefully this report will help. Of course, if this is a known issue and there's a solution available then I'd be eager to hear about it. If you need any additional details or if I misfiled this bug, let me know. ii linux-image-3.2.0-4-amd64 3.2.51-1 amd64 Linux 3.2 for 64-bit PCs
--- End Message ---
--- Begin Message ---Hi This bug was filed for a very old kernel or the bug is old itself without resolution. If you can reproduce it with - the current version in unstable/testing - the latest kernel from backports please reopen the bug, see https://www.debian.org/Bugs/server-control for details. Regards, Salvatore
--- End Message ---

