[Bug 208551] Re: mdadm with Raid5 stuck in uninterruptable sleep
I have a system exhibiting the same or similar symptoms, running a fresh install of Ubuntu 9.04 (Jaunty).

uname -a: Linux ServerX 2.6.28-11-generic #42-Ubuntu SMP Fri Apr 17 01:58:03 UTC 2009 x86_64 GNU/Linux
Motherboard: SUPERMICRO MBD-H8DME-2-O
SATA card: SUPERMICRO AOC-SAT2-MV8 (Marvell Technology Group Ltd. MV88SX6081 8-port SATA II PCI-X Controller (rev 09))

The system has a software RAID6 array made of four 1TB disks. Currently the array is degraded and only has 3 disks to work with.

    md1 : active raid6 sde1[4] sdd1[0] sdc1[2]
          1953519872 blocks level 6, 64k chunk, algorithm 2 [4/2] [U_U_]
          [==..]  recovery = 10.8% (105874700/976759936) finish=142.0min speed=102195K/sec

With the array on the PCI-X card I can reproduce the crash by failing a drive and re-adding it to the array. Some time after the rebuild passes 50% it hangs and the system becomes unresponsive.

The system boots from a RAID1 array (md0) of two 500GB drives on the motherboard's controller. I was able to add a disk plugged into the PCI-X card to md0 and it synced without problems. With the RAID6 array moved to the motherboard's controller, the rebuild also completes without problems.

kpolberg mentioned adjusting stripe_cache_size. The command he posted:

    echo 16384 > /sys/block/md1/md/stripe_cache_size

Looks like it helps: no crash for 24 hours. If it remains stable I will try with a larger array. Will post more info if needed.

-- 
mdadm with Raid5 stuck in uninterruptable sleep
https://bugs.launchpad.net/bugs/208551
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
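The stripe_cache_size workaround trades memory for a larger stripe cache. As a rough guide, the kernel md documentation gives the memory used as page size times number of disks times number of cache entries. The following is a minimal sketch under that assumption; the function names and the `sysfs_root` parameter are hypothetical helpers for illustration, and the 4 KiB page size is assumed for x86_64.

```python
from pathlib import Path

PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on x86_64


def stripe_cache_bytes(entries: int, nr_disks: int, page_size: int = PAGE_SIZE) -> int:
    """Approximate stripe cache memory: page_size * nr_disks * entries."""
    return page_size * nr_disks * entries


def set_stripe_cache_size(md_name: str, entries: int, sysfs_root: str = "/sys") -> Path:
    """Write `entries` to <sysfs_root>/block/<md_name>/md/stripe_cache_size.

    Equivalent to the echo command quoted in the thread; needs root on a
    real system. sysfs_root is a hypothetical knob so the function can be
    tried against a scratch directory.
    """
    path = Path(sysfs_root) / "block" / md_name / "md" / "stripe_cache_size"
    path.write_text(f"{entries}\n")
    return path


if __name__ == "__main__":
    # The 4-disk RAID6 from the report with 16384 cache entries.
    mib = stripe_cache_bytes(16384, 4) / 2**20
    print(f"stripe cache would use about {mib:.0f} MiB")
```

For the four-disk array above, 16384 entries works out to about 256 MiB, so the setting is worth checking against available RAM before applying it to a wider array.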
[Bug 208551] Re: mdadm with Raid5 stuck in uninterruptable sleep
Forgot to mention: the RAID6 array is using reiserfs.
[Bug 208551] Re: mdadm with Raid5 stuck in uninterruptable sleep
This bug was fixed in the package linux - 2.6.24-23.46

---
linux (2.6.24-23.46) hardy-proposed; urgency=low

  [Alessio Igor Bogani]

  * rt: Updated PREEMPT_RT support to rt21 - LP: #302138

  [Amit Kucheria]

  * SAUCE: Update lpia patches from moblin tree - LP: #291457

  [Andy Whitcroft]

  * SAUCE: replace gfs2_bitfit with upstream version to prevent oops - LP: #276641

  [Colin Ian King]

  * isdn: Do not validate ISDN net device address prior to interface-up - LP: #237306
  * hwmon: (coretemp) Add Penryn CPU to coretemp - LP: #235119
  * USB: add support for Motorola ROKR Z6 cellphone in mass storage mode - LP: #263217
  * md: fix an occasional deadlock in raid5 - LP: #208551

  [Stefan Bader]

  * SAUCE: buildenv: Show CVE entries in printchanges
  * SAUCE: buildenv: Send git-ubuntu-log informational message to stderr
  * Xen: dma: avoid unnecessarily SWIOTLB bounce buffering - LP: #247148
  * Update openvz patchset to apply to latest stable tree. - LP: #301634
  * XEN: Fix FTBS with stable updates - LP: #301634

  [Steve Conklin]

  * Add HID quirk for dual USB gamepad - LP: #140608

  [Tim Gardner]

  * Enable CONFIG_AX25_DAMA_SLAVE=y - LP: #257684
  * SAUCE: Correctly blacklist Thinkpad r40e in ACPI - LP: #278794
  * SAUCE: ALPS touchpad for Dell Latitude E6500/E6400 - LP: #270643

  [Upstream Kernel Changes]

  * Revert [Bluetooth] Eliminate checks for impossible conditions in IRQ handler - LP: #217659
  * KVM: VMX: Clear CR4.VMXE in hardware_disable - LP: #268981
  * iov_iter_advance() fix - LP: #231746
  * Fix off-by-one error in iov_iter_advance() - LP: #231746
  * USB: serial: ch341: New VID/PID for CH341 USB-serial - LP: #272485
  * x86: Fix 32-bit x86 MSI-X allocation leakage - LP: #273103
  * b43legacy: Fix failure in rate-adjustment mechanism - LP: #273143
  * x86: Reserve FIRST_DEVICE_VECTOR in used_vectors bitmap. - LP: #276334
  * openvz: merge missed fixes from vanilla 2.6.24 openvz branch - LP: #298059
  * openvz: some autofs related fixes - LP: #298059
  * openvz: fix ve stop deadlock after nfs connect - LP: #298059
  * openvz: fix netlink and rtnl inside container - LP: #298059
  * openvz: fix wrong size of ub0_percpu - LP: #298059
  * openvz: fix OOPS while stopping VE started before binfmt_misc.ko loaded - LP: #298059
  * x86-64: Fix bytes left to copy return value for copy_from_user()
  * NET: Fix race in dev_close(). (Bug 9750) - LP: #301608
  * IPV6: Fix IPsec datagram fragmentation - LP: #301608
  * IPV6: dst_entry leak in ip4ip6_err. - LP: #301608
  * IPV4: Remove IP_TOS setting privilege checks. - LP: #301608
  * IPCONFIG: The kernel gets no IP from some DHCP servers - LP: #301608
  * IPCOMP: Disable BH on output when using shared tfm - LP: #301608
  * IRQ_NOPROBE helper functions - LP: #301608
  * MIPS: Mark all but i8259 interrupts as no-probe. - LP: #301608
  * ub: fix up the conversion to sg_init_table() - LP: #301608
  * x86: adjust enable_NMI_through_LVT0() - LP: #301608
  * SCSI ips: handle scsi_add_host() failure, and other err cleanups - LP: #301608
  * CRYPTO xcbc: Fix crash with IPsec - LP: #301608
  * CRYPTO xts: Use proper alignment - LP: #301608
  * SCSI ips: fix data buffer accessors conversion bug - LP: #301608
  * SCSI aic94xx: fix REQ_TASK_ABORT and REQ_DEVICE_RESET - LP: #301608
  * x86: replace LOCK_PREFIX in futex.h - LP: #301608
  * ARM pxa: fix clock lookup to find specific device clocks - LP: #301608
  * futex: fix init order - LP: #301608
  * futex: runtime enable pi and robust functionality - LP: #301608
  * file capabilities: simplify signal check - LP: #301608
  * hugetlb: ensure we do not reference a surplus page after handing it to buddy - LP: #301608
  * ufs: fix parenthesisation in ufs_set_fs_state() - LP: #301608
  * spi: pxa2xx_spi clock polarity fix - LP: #301608
  * NETFILTER: Fix incorrect use of skb_make_writable - LP: #301608
  * NETFILTER: fix ebtable targets return - LP: #301608
  * SCSI advansys: fix overrun_buf aligned bug - LP: #301608
  * pata_hpt*, pata_serverworks: fix UDMA masking - LP: #301608
  * moduleparam: fix alpha, ia64 and ppc64 compile failures - LP: #301608
  * PCI x86: always use conf1 to access config space below 256 bytes - LP: #301608
  * e1000e: Fix CRC stripping in hardware context bug - LP: #301608
  * atmel_spi: fix clock polarity - LP: #301608
  * x86: move out tick_nohz_stop_sched_tick() call from the loop - LP: #301608
  * macb: Fix speed setting - LP: #301608
  * ioat: fix 'ack' handling, driver must ensure that 'ack' is zero - LP: #301608
  * VT notifier fix for VT switch - LP: #301608
  * USB: ftdi_sio: Workaround for broken Matrix Orbital serial port - LP: #301608
  * USB: ftdi_sio - really enable EM1010PC - LP: #301608
  *
[Bug 208551] Re: mdadm with Raid5 stuck in uninterruptable sleep
DesktopMan, since you are the original bug reporter, it would be great to get confirmation from you that this newer kernel does indeed fix the bug you had reported here. Thanks.
[Bug 208551] Re: mdadm with Raid5 stuck in uninterruptable sleep
I do not have this setup anymore (9+ months, needed it operational), but there have been other people reporting the same problem more recently. Hopefully one of them will be able to confirm.
[Bug 208551] Re: mdadm with Raid5 stuck in uninterruptable sleep
I upgraded to 2.6.27-10. There were other upgrades that occurred at that time too. I reformatted my array to XFS and tried to copy a large amount of data. It failed in the same manner. After a reboot the array is rebuilding, but I found this in the log:

Dec 6 10:44:54 Server kernel: [44205.953002] Call Trace:
Dec 6 10:44:54 Server kernel: [44205.953002] [802ac083] ? find_get_pages+0x43/0x110
Dec 6 10:44:54 Server kernel: [44205.953002] [802b6c74] ? pagevec_lookup+0x24/0x30
Dec 6 10:44:54 Server kernel: [44205.953002] [a0d9302d] ? xfs_cluster_write+0xad/0x180 [xfs]
Dec 6 10:44:54 Server kernel: [44205.953002] [a0d93598] ? xfs_page_state_convert+0x498/0x760 [xfs]
Dec 6 10:44:54 Server kernel: [44205.953002] [a0d939c1] ? xfs_vm_writepage+0x71/0x120 [xfs]
Dec 6 10:44:54 Server kernel: [44205.953002] [802b9554] ? pageout+0x124/0x280
Dec 6 10:44:54 Server kernel: [44205.953002] [802ab1da] ? page_waitqueue+0xa/0x90
Dec 6 10:44:54 Server kernel: [44205.953002] [802b9b5d] ? shrink_page_list+0x34d/0x530
Dec 6 10:44:54 Server kernel: [44205.953002] [802b9ee2] ? shrink_inactive_list+0x1a2/0x4b0
Dec 6 10:44:54 Server kernel: [44205.953002] [802ba26b] ? shrink_zone+0x7b/0x160
Dec 6 10:44:54 Server kernel: [44205.953002] [802ba3dd] ? shrink_zones+0x8d/0x150
Dec 6 10:44:54 Server kernel: [44205.953002] [802ba526] ? do_try_to_free_pages+0x86/0x2e0
Dec 6 10:44:54 Server kernel: [44205.953002] [802ba877] ? try_to_free_pages+0x67/0x70
Dec 6 10:44:54 Server kernel: [44205.953002] [802b9380] ? isolate_pages_global+0x0/0x50
Dec 6 10:44:54 Server kernel: [44205.953002] [802b2b49] ? __alloc_pages_internal+0x239/0x520
Dec 6 10:44:54 Server kernel: [44205.953002] [802d5c6d] ? alloc_pages_current+0xad/0x110
Dec 6 10:44:54 Server kernel: [44205.953002] [802ac617] ? __page_cache_alloc+0x67/0x80
Dec 6 10:44:54 Server kernel: [44205.953002] [802ad253] ? __grab_cache_page+0x63/0xb0
Dec 6 10:44:54 Server kernel: [44205.953002] [803171a9] ? block_write_begin+0x89/0xf0
Dec 6 10:44:54 Server kernel: [44205.953002] [a0d9248a] ? xfs_vm_write_begin+0x2a/0x30 [xfs]
Dec 6 10:44:54 Server kernel: [44205.953002] [a0d92050] ? xfs_get_blocks+0x0/0x20 [xfs]
Dec 6 10:44:54 Server kernel: [44205.953002] [802ab93c] ? generic_perform_write+0xbc/0x1c0
Dec 6 10:44:54 Server kernel: [44205.953002] [802ad6a2] ? generic_file_buffered_write+0x92/0x170
Dec 6 10:44:54 Server kernel: [44205.953002] [a0d9b2f3] ? xfs_write+0x6b3/0x9b0 [xfs]
Dec 6 10:44:54 Server kernel: [44205.953002] [a0d96ca8] ? xfs_file_aio_write+0x58/0x60 [xfs]
Dec 6 10:44:54 Server kernel: [44205.953002] [802e9b79] ? do_sync_write+0xf9/0x140
Dec 6 10:44:54 Server kernel: [44205.953002] [80267050] ? autoremove_wake_function+0x0/0x40
Dec 6 10:44:54 Server kernel: [44205.953002] [80387071] ? aa_file_permission+0x21/0xf0
Dec 6 10:44:54 Server kernel: [44205.953002] [80387198] ? apparmor_file_permission+0x28/0x30
Dec 6 10:44:54 Server kernel: [44205.953002] [80361c46] ? security_file_permission+0x16/0x20
Dec 6 10:44:54 Server kernel: [44205.953002] [802ea23b] ? vfs_write+0xcb/0x130
Dec 6 10:44:54 Server kernel: [44205.953002] [802ea395] ? sys_write+0x55/0x90
Dec 6 10:44:54 Server kernel: [44205.953002] [8021285a] ? system_call_fastpath+0x16/0x1b
Dec 6 10:44:54 Server kernel: [44205.953002]
Dec 6 11:00:09 Server syslogd 1.5.0#2ubuntu6: restart.
Dec 6 11:00:09 Server kernel: Inspecting /boot/System.map-2.6.27-10-generic
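When a copy hangs like this, it can help to record which tasks are sitting in uninterruptible sleep (state D in /proc/<pid>/stat), which is the condition the bug title refers to. The following is a minimal sketch, not part of any report in this thread; the function names and the `proc_root` parameter are hypothetical, and it assumes the usual Linux /proc layout.

```python
from pathlib import Path


def parse_stat(text: str) -> tuple[int, str, str]:
    """Split a /proc/<pid>/stat line into (pid, comm, state).

    comm is wrapped in parentheses and may itself contain spaces or ')',
    so split on the last ')' rather than on whitespace.
    """
    left, right = text.rsplit(")", 1)
    pid, comm = left.split("(", 1)
    return int(pid), comm, right.split()[0]


def uninterruptible_tasks(proc_root: str = "/proc") -> list[tuple[int, str]]:
    """Return (pid, comm) for every process currently in state D."""
    found = []
    for stat in Path(proc_root).glob("[0-9]*/stat"):
        try:
            pid, comm, state = parse_stat(stat.read_text())
        except (OSError, ValueError, IndexError):
            continue  # process exited mid-scan or the line was malformed
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in uninterruptible_tasks():
        print(pid, comm)
```

Running it from a second shell or over SSH while the copy is stuck would show whether the md thread, the filesystem writer, or both are blocked. This only works while the machine still responds.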
Re: [Bug 208551] Re: mdadm with Raid5 stuck in uninterruptable sleep
Tony [2008-12-03 23:36 -]:
> I followed the instruction to enable proposed, but I don’t know what I
> need to update to test this fix.

A normal system upgrade should pull in the new 2.6.27-10 kernel, i.e. you should get a couple of linux-image and linux-restricted-modules packages with version 2.6.27-10.20.
[Bug 208551] Re: mdadm with Raid5 stuck in uninterruptable sleep
Martin,

Hope this doesn't repeat. If you’re willing to help a newbie, I am willing to test this on my fresh intrepid install. I have a system very similar to the one described above, showing the same kind of problems. I posted my problems in another bug (147464), but after reading this bug I think this is closer to what I am seeing.

I have a fresh install of 8.10 (mythbuntu). I was able to stop this problem, but only if I set rsize=8092 in fstab. This killed throughput! I finally reformatted the array to ext3 and that fixed the problem. Right now I have been testing the array with JFS and so far have not had the system lock up.

I followed the instruction to enable proposed, but I don’t know what I need to update to test this fix. I am willing to do any testing you need; my system is not a production system and there is no important data on the machine.

Tony
[Bug 208551] Re: mdadm with Raid5 stuck in uninterruptable sleep
New SRU fixes it harder apparently.

** Changed in: linux (Ubuntu Hardy)
       Status: Fix Released => In Progress
[Bug 208551] Re: mdadm with Raid5 stuck in uninterruptable sleep
Accepted into intrepid-proposed, please test and give feedback here. Please see https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you in advance!

** Tags added: verification-needed
[Bug 208551] Re: mdadm with Raid5 stuck in uninterruptable sleep
I am still having deadlocks. The only thing that fixes it is setting stripe_cache_size on the md device higher:

    echo 16384 > /sys/block/md0/md/stripe_cache_size

    Linux sarah 2.6.24-21-generic #1 SMP Mon Aug 25 16:57:51 UTC 2008 x86_64 GNU/Linux

    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
    md0 : active raid5 sde1[0] sdc1[7] sdd1[6] sdf1[5] sdg1[4] sdb1[3] sda1[2] sdh1[1]
          3418686208 blocks level 5, 256k chunk, algorithm 2 [8/8] []
          []  resync = 44.4% (216937632/488383744) finish=84.5min speed=53527K/sec

    unused devices: none

    [EMAIL PROTECTED]:~# xfs_info /dev/md0
    meta-data=/dev/md0               isize=256    agcount=75, agsize=11446528 blks
             =                       sectsz=4096  attr=1
    data     =                       bsize=4096   blocks=854671552, imaxpct=25
             =                       sunit=64     swidth=192 blks, unwritten=1
    naming   =version 2              bsize=4096
    log      =internal               bsize=4096   blocks=32768, version=2
             =                       sectsz=4096  sunit=1 blks, lazy-count=0
    realtime =none                   extsz=786432 blocks=0, rtextents=0

If you need some more information, please ask.
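Since the hangs in this thread show up partway through a rebuild or resync, logging the mdstat progress line at intervals gives a record of how far the array got before stalling. The following is a minimal sketch that parses such a line; the regular expression and the `parse_progress` name are assumptions made for illustration, written against the line format quoted in this thread only.

```python
import re

# Matches the progress line md prints during a recovery or resync,
# e.g. "resync = 44.4% (216937632/488383744) finish=84.5min speed=53527K/sec".
PROGRESS = re.compile(
    r"(?P<action>recovery|resync)\s*=\s*(?P<pct>[\d.]+)%\s*"
    r"\((?P<done>\d+)/(?P<total>\d+)\)\s*"
    r"finish=(?P<finish>[\d.]+)min\s*speed=(?P<speed>\d+)K/sec"
)


def parse_progress(line: str):
    """Return progress details from an mdstat line, or None if absent."""
    m = PROGRESS.search(line)
    if m is None:
        return None
    done, total, speed = int(m["done"]), int(m["total"]), int(m["speed"])
    return {
        "action": m["action"],
        # Block counts are in KiB and speed is in KiB/s, so units cancel.
        "percent": 100.0 * done / total,
        "eta_min": (total - done) / speed / 60.0 if speed else None,
        "reported_finish_min": float(m["finish"]),
    }


if __name__ == "__main__":
    sample = "[]  resync = 44.4% (216937632/488383744) finish=84.5min speed=53527K/sec"
    print(parse_progress(sample))
```

On the sample line the recomputed estimate agrees with the finish= field to within rounding. A speed of zero is reported with no estimate instead of raising a division error.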
[Bug 208551] Re: mdadm with Raid5 stuck in uninterruptable sleep
linux 2.6.24-21 copied to hardy-updates.

** Changed in: linux (Ubuntu Hardy)
       Status: Fix Committed => Fix Released