RE: raid5:md3: kernel BUG , followed by , Silent halt .
> From: Mr. James W. Laferriere [mailto:[EMAIL PROTECTED]]
>
> 	You said to watch here & I have .
> 	Is there any hope of digging this out ?
> 	Anything further I can provide ?  Please just say so .
> 	Tia ,  JimL

Apologies, I expect to have time for a deeper dive on this issue next
week.  Since I can reproduce it here I should have all I need to be able
to track it down.  I will update you when I have something for you to
test.

The testing you have done is very valuable and appreciated, thank you
for being patient.

Regards,
Dan
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Re: raid5:md3: kernel BUG , followed by , Silent halt .
	Hello Dan ,

On Mon, 27 Aug 2007, Dan Williams wrote:
> On 8/25/07, Mr. James W. Laferriere <[EMAIL PROTECTED]> wrote:
>> Hello Dan ,
>>
>> On Mon, 20 Aug 2007, Dan Williams wrote:
>>> On 8/18/07, Mr. James W. Laferriere <[EMAIL PROTECTED]> wrote:
>>>> Hello All , Here we go again .  Again attempting to do bonnie++
>>>> testing on a small array .
>>>> Kernel 2.6.22.1
>>>> Patches involved ,
>>>> IOP1 , 2.6.22.1-iop1 for improved sequential write performance
>>>> (stripe-queue) , Dan Williams <[EMAIL PROTECTED]>
>>>
>>> Hello James,
>>>
>>> Thanks for the report.
>>>
>>> I tried to reproduce this on my system, no luck.
>>
>> Possibly because there are significant hardware differences ?
>> See 'lspci -v' below .sig .
>>
>>> However it looks like there is a potential race between
>>> 'handle_queue' and 'add_queue_bio'.  The attached patch moves these
>>> critical sections under spin_lock(&sq->lock), and adds some
>>> debugging output if this BUG triggers.  It also includes a fix for
>>> retry_aligned_read which is unrelated to this debug.
>>> --
>>> Dan
>>
>> Applied your patch .  The same 'kernel BUG at
>> drivers/md/raid5.c:3689!' messages appear (see attached) .  The
>> system is still responsive with your patch , the kernel crashed last
>> time .  Tho the bonnie++ run is stuck in 'D' .  And doing a
>> '> /md3/asdf' stays hung even after passing the parent process a
>> 'kill -9' .
>> Any further info You can think of I can/should , I will try to
>> acquire .  But I'll have to repeat these steps to attempt to get the
>> same results .  I'll be shutting the system down after sending this
>> off .
>> Fyi , the previous 'BUG' without your patch was quite repeatable .
>> I might have time over the next couple of weeks to be able to see if
>> it is as repeatable as the last one .
>>
>> Contents of /proc/mdstat for md3 .
>>
>> md3 : active raid6 sdx1[3] sdw1[2] sdv1[1] sdu1[0] sdt1[7](S) sds1[6] sdr1[5] sdq1[4]
>>       717378560 blocks level 6, 1024k chunk, algorithm 2 [7/7] [UUUUUUU]
>>       bitmap: 2/137 pages [8KB], 512KB chunk
>>
>> Commands I ran that led to the 'BUG' .
>>
>> bonniemd3() { /root/bonnie++-1.03a/bonnie++ -u0:0 -d /md3 -s 131072 -f; }
>> bonniemd3 > 131072MB-bonnie++-run-md3-xfs.log-20070825 2>&1 &
>
> Ok, the 'bitmap' and 'raid6' details were the missing pieces of my
> testing.  I can now reproduce this bug in handle_queue.  I'll keep you
> posted on what I find.
>
> Thank you for tracking this.
>
> Regards,
> Dan

	You said to watch here & I have .
	Is there any hope of digging this out ?
	Anything further I can provide ?  Please just say so .
	Tia ,  JimL
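[Editor's note: Dan says the 'bitmap' and 'raid6' details were the missing
pieces of his testing.  For anyone else trying to reproduce, the array in
the /proc/mdstat output quoted above could plausibly be recreated along
the following lines.  This is a hypothetical sketch, not a command from
the thread: the device names are read off the mdstat line, the flags are
ordinary mdadm usage, and the 'run' echo wrapper keeps the script from
touching real disks.]

```shell
# Sketch of the reproduction setup: raid6, 1024k chunk, 7 active
# devices plus one spare, internal write-intent bitmap with 512KB
# bitmap chunks, xfs on top -- matching the mdstat output and the
# bonnie++ log name above.  'run' only echoes each command, so this
# is safe to execute as-is; drop the echo to really run it.
run() { echo "+ $*"; }

setup_md3() {
    run mdadm --create /dev/md3 --level=6 --chunk=1024 \
        --raid-devices=7 --spare-devices=1 \
        --bitmap=internal --bitmap-chunk=512 \
        /dev/sdu1 /dev/sdv1 /dev/sdw1 /dev/sdx1 \
        /dev/sdq1 /dev/sdr1 /dev/sds1 /dev/sdt1
    run mkfs.xfs /dev/md3
    run mount /dev/md3 /md3
}

setup_md3
```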
Re: raid5:md3: kernel BUG , followed by , Silent halt .
On 8/25/07, Mr. James W. Laferriere <[EMAIL PROTECTED]> wrote:
> Hello Dan ,
>
> On Mon, 20 Aug 2007, Dan Williams wrote:
>> On 8/18/07, Mr. James W. Laferriere <[EMAIL PROTECTED]> wrote:
>>> Hello All , Here we go again .  Again attempting to do bonnie++
>>> testing on a small array .
>>> Kernel 2.6.22.1
>>> Patches involved ,
>>> IOP1 , 2.6.22.1-iop1 for improved sequential write performance
>>> (stripe-queue) , Dan Williams <[EMAIL PROTECTED]>
>>
>> Hello James,
>>
>> Thanks for the report.
>>
>> I tried to reproduce this on my system, no luck.
>
> Possibly because there are significant hardware differences ?
> See 'lspci -v' below .sig .
>
>> However it looks like there is a potential race between
>> 'handle_queue' and 'add_queue_bio'.  The attached patch moves these
>> critical sections under spin_lock(&sq->lock), and adds some debugging
>> output if this BUG triggers.  It also includes a fix for
>> retry_aligned_read which is unrelated to this debug.
>> --
>> Dan
>
> Applied your patch .  The same 'kernel BUG at
> drivers/md/raid5.c:3689!' messages appear (see attached) .  The system
> is still responsive with your patch , the kernel crashed last time .
> Tho the bonnie++ run is stuck in 'D' .  And doing a '> /md3/asdf'
> stays hung even after passing the parent process a 'kill -9' .
> Any further info You can think of I can/should , I will try to
> acquire .  But I'll have to repeat these steps to attempt to get the
> same results .  I'll be shutting the system down after sending this
> off .
> Fyi , the previous 'BUG' without your patch was quite repeatable .
> I might have time over the next couple of weeks to be able to see if
> it is as repeatable as the last one .
>
> Contents of /proc/mdstat for md3 .
>
> md3 : active raid6 sdx1[3] sdw1[2] sdv1[1] sdu1[0] sdt1[7](S) sds1[6] sdr1[5] sdq1[4]
>       717378560 blocks level 6, 1024k chunk, algorithm 2 [7/7] [UUUUUUU]
>       bitmap: 2/137 pages [8KB], 512KB chunk
>
> Commands I ran that led to the 'BUG' .
>
> bonniemd3() { /root/bonnie++-1.03a/bonnie++ -u0:0 -d /md3 -s 131072 -f; }
> bonniemd3 > 131072MB-bonnie++-run-md3-xfs.log-20070825 2>&1 &

Ok, the 'bitmap' and 'raid6' details were the missing pieces of my
testing.  I can now reproduce this bug in handle_queue.  I'll keep you
posted on what I find.

Thank you for tracking this.

Regards,
Dan
Re: raid5:md3: kernel BUG , followed by , Silent halt .
	Hello Dan ,

On Mon, 20 Aug 2007, Dan Williams wrote:
> On 8/18/07, Mr. James W. Laferriere <[EMAIL PROTECTED]> wrote:
>> Hello All , Here we go again .  Again attempting to do bonnie++
>> testing on a small array .
>> Kernel 2.6.22.1
>> Patches involved ,
>> IOP1 , 2.6.22.1-iop1 for improved sequential write performance
>> (stripe-queue) , Dan Williams <[EMAIL PROTECTED]>
>
> Hello James,
>
> Thanks for the report.
>
> I tried to reproduce this on my system, no luck.

	Possibly because there are significant hardware differences ?
	See 'lspci -v' below .sig .

> However it looks like there is a potential race between 'handle_queue'
> and 'add_queue_bio'.  The attached patch moves these critical sections
> under spin_lock(&sq->lock), and adds some debugging output if this BUG
> triggers.  It also includes a fix for retry_aligned_read which is
> unrelated to this debug.
> --
> Dan

	Applied your patch .  The same 'kernel BUG at
drivers/md/raid5.c:3689!' messages appear (see attached) .  The system
is still responsive with your patch , the kernel crashed last time .
Tho the bonnie++ run is stuck in 'D' .  And doing a '> /md3/asdf' stays
hung even after passing the parent process a 'kill -9' .
	Any further info You can think of I can/should , I will try to
acquire .  But I'll have to repeat these steps to attempt to get the
same results .  I'll be shutting the system down after sending this
off .
	Fyi , the previous 'BUG' without your patch was quite repeatable .
	I might have time over the next couple of weeks to be able to see
if it is as repeatable as the last one .

	Contents of /proc/mdstat for md3 .

md3 : active raid6 sdx1[3] sdw1[2] sdv1[1] sdu1[0] sdt1[7](S) sds1[6] sdr1[5] sdq1[4]
      717378560 blocks level 6, 1024k chunk, algorithm 2 [7/7] [UUUUUUU]
      bitmap: 2/137 pages [8KB], 512KB chunk

	Commands I ran that led to the 'BUG' .

bonniemd3() { /root/bonnie++-1.03a/bonnie++ -u0:0 -d /md3 -s 131072 -f; }
bonniemd3 > 131072MB-bonnie++-run-md3-xfs.log-20070825 2>&1 &

[EMAIL PROTECTED]:~ # top
top - 02:22:09 up 3:39, 2 users, load average: 3.09, 2.89, 2.48
Tasks: 155 total, 1 running, 154 sleeping, 0 stopped, 0 zombie
Cpu0 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu4 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu5 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem:  8256156k total, 7995044k used, 261112k free, 7480k buffers
Swap:  987896k total, 2784k used, 985112k free, 7787320k cached

  PID USER  P PR NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
 4073 root  0 18  0     0    0    0 D    0  0.0 15:10.86 [bonnie++]
 4076 root  5 15  0  3000 1828 1252 D    0  0.0  0:00.05 -bash
 4308 root  5 15  0     0    0    0 D    0  0.0  0:01.40 [pdflush]
 4422 root  6 18  0  2212 1168  860 R    0  0.0  0:00.13 top

[EMAIL PROTECTED]:~ # ps -auxww | grep -C3 bonni
Warning: bad ps syntax, perhaps a bogus '-'? See http://procps.sf.net/faq.html
root  4029  0.0  0.0     0    0 ?     S<   Aug25  0:01 [xfsbufd]
root  4030  0.0  0.0     0    0 ?     S<   Aug25  0:00 [xfssyncd]
root  4072  0.0  0.0  2992  848 ?     S    Aug25  0:00 -bash
root  4073  7.3  0.0     0    0 ?     D    Aug25 15:10 [bonnie++]
root  4074  0.1  0.0  6412 1980 ?     Ss   Aug25  0:12 sshd: [EMAIL PROTECTED]/1
root  4076  0.0  0.0  3000 1828 pts/1 Ds+  Aug25  0:00 -bash
root  4302  0.1  0.0     0    0 ?     S    00:50  0:08 [pdflush]

--
+---------------------------------------------------------------+
| James W. Laferriere | System Techniques   | Give me VMS       |
| Network Engineer    | 663 Beaumont Blvd   | Give me Linux     |
| [EMAIL PROTECTED]   | Pacifica, CA. 94044 | only on AXP       |
+---------------------------------------------------------------+

[EMAIL PROTECTED]:~ # lspci -v
00:00.0 Host bridge: Intel Corporation 5000P Chipset Memory Controller Hub (rev b1)
	Subsystem: Super Micro Computer Inc Unknown device 8080
	Flags: bus master, fast devsel, latency 0
	Capabilities: [50] Power Management version 2
	Capabilities: [58] Message Signalled Interrupts: 64bit- Queue=0/1 Enable-
	Capabilities: [6c] Express Root Port (Slot-) IRQ 0
	Capabilities: [100] Advanced Error Reporting

00:02.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x8 Port 2-3 (rev b1) (prog-if 00 [Normal decode])
	Flags: bus master, f
Re: raid5:md3: kernel BUG , followed by , Silent halt .
On 8/18/07, Mr. James W. Laferriere <[EMAIL PROTECTED]> wrote:
> Hello All , Here we go again .  Again attempting to do bonnie++
> testing on a small array .
> Kernel 2.6.22.1
> Patches involved ,
> IOP1 , 2.6.22.1-iop1 for improved sequential write performance
> (stripe-queue) , Dan Williams <[EMAIL PROTECTED]>

Hello James,

Thanks for the report.

I tried to reproduce this on my system, no luck.  However it looks like
there is a potential race between 'handle_queue' and 'add_queue_bio'.
The attached patch moves these critical sections under
spin_lock(&sq->lock), and adds some debugging output if this BUG
triggers.  It also includes a fix for retry_aligned_read which is
unrelated to this debug.
--
Dan

--- raid5-fix-sq-locking.patch ---

raid5: address potential sq->to_write race

From: Dan Williams <[EMAIL PROTECTED]>

synchronize reads and writes to the sq->to_write bit

Signed-off-by: Dan Williams <[EMAIL PROTECTED]>
---

 drivers/md/raid5.c |   12 ++++++++----
 1 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 02e313b..688b8d3 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -2289,10 +2289,14 @@ static int add_queue_bio(struct stripe_queue *sq, struct bio *bi, int dd_idx,
 	sh = sq->sh;
 	if (forwrite) {
 		bip = &sq->dev[dd_idx].towrite;
+		set_bit(dd_idx, sq->to_write);
 		if (*bip == NULL && (!sh || (sh && !sh->dev[dd_idx].written)))
 			firstwrite = 1;
-	} else
+	} else {
 		bip = &sq->dev[dd_idx].toread;
+		set_bit(dd_idx, sq->to_read);
+	}
+
 	while (*bip && (*bip)->bi_sector < bi->bi_sector) {
 		if ((*bip)->bi_sector + ((*bip)->bi_size >> 9) > bi->bi_sector)
 			goto overlap;
@@ -2324,7 +2328,6 @@ static int add_queue_bio(struct stripe_queue *sq, struct bio *bi, int dd_idx,
 		/* check if page is covered */
 		sector_t sector = sq->dev[dd_idx].sector;
-		set_bit(dd_idx, sq->to_write);
 		for (bi = sq->dev[dd_idx].towrite;
 		     sector < sq->dev[dd_idx].sector + STRIPE_SECTORS &&
 			     bi && bi->bi_sector <= sector;
@@ -2334,8 +2337,7 @@ static int add_queue_bio(struct stripe_queue *sq, struct bio *bi, int dd_idx,
 		}
 		if (sector >= sq->dev[dd_idx].sector + STRIPE_SECTORS)
 			set_bit(dd_idx, sq->overwrite);
-	} else
-		set_bit(dd_idx, sq->to_read);
+	}

 	return 1;

@@ -3656,6 +3658,7 @@ static void handle_queue(struct stripe_queue *sq, int disks, int data_disks)
 	struct stripe_head *sh = NULL;

 	/* continue to process i/o while the stripe is cached */
+	spin_lock(&sq->lock);
 	if (test_bit(STRIPE_QUEUE_HANDLE, &sq->state)) {
 		if (io_weight(sq->overwrite, disks) == data_disks) {
 			set_bit(STRIPE_QUEUE_IO_HI, &sq->state);
@@ -3678,6 +3681,7 @@ static void handle_queue(struct stripe_queue *sq, int disks, int data_disks)
 		 */
 		BUG_ON(!(sq->sh && sq->sh == sh));
 	}
+	spin_unlock(&sq->lock);
 	release_queue(sq);

 	if (sh) {

--- raid5-debug-init_queue-bugs.patch ---

raid5: printk instead of BUG in init_queue

From: Dan Williams <[EMAIL PROTECTED]>

Signed-off-by: Dan Williams <[EMAIL PROTECTED]>
---

 drivers/md/raid5.c |   19 +++++++++++++------
 1 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 688b8d3..7164011 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -557,12 +557,19 @@ static void init_queue(struct stripe_queue *sq, sector_t sector,
 		__FUNCTION__, (unsigned long long) sq->sector,
 		(unsigned long long) sector, sq);

-	BUG_ON(atomic_read(&sq->count) != 0);
-	BUG_ON(io_weight(sq->to_read, disks));
-	BUG_ON(io_weight(sq->to_write, disks));
-	BUG_ON(io_weight(sq->overwrite, disks));
-	BUG_ON(test_bit(STRIPE_QUEUE_HANDLE, &sq->state));
-	BUG_ON(sq->sh);
+	if ((atomic_read(&sq->count) != 0) || io_weight(sq->to_read, disks) ||
+	    io_weight(sq->to_write, disks) || io_weight(sq->overwrite, disks) ||
+	    test_bit(STRIPE_QUEUE_HANDLE, &sq->state) || sq->sh) {
+		printk(KERN_ERR "%s: sector=%llx count: %d to_read: %lu "
+		       "to_write: %lu overwrite: %lu state: %lx "
+		       "sq->sh: %p\n", __FUNCTION__,
+		       (unsigned long long) sq->sector,
+		       atomic_read(&sq->count),
+		       io_weight(sq->to_read, disks),
+		       io_weight(sq->to_write, disks),
+		       io_weight(sq->overwrite, disks),
+		       sq->state, sq->sh);
+	}

 	sq->state = (1 << STRIPE_QUEUE_HANDLE);
 	sq->sector = sector;

--- raid5-fix-get_active_queue-bug.patch ---

raid5: fix get_active_queue bug in retry_aligned_read

From: Dan Williams <[EMAIL PROTECTED]>

Check for a potential null return
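[Editor's note: the locking pattern the first patch applies can be
illustrated stand-alone: the bitmap updates in add_queue_bio() and the
io_weight() test in handle_queue() must run under the same lock, or the
reader can observe a half-updated bitmap.  The sketch below is purely
illustrative user-space code, not the kernel driver: a pthread mutex
stands in for spin_lock(&sq->lock), and the struct and function names
merely echo the driver's.]

```c
/* Stand-alone illustration of the locking pattern from the patch:
 * the bit set in add_queue_bio() and the io_weight() check in
 * handle_queue() both run under the same lock, so handle_queue()
 * never sees a half-updated to_write bitmap.  Hypothetical code. */
#include <assert.h>
#include <pthread.h>

#define DISKS 8                         /* disks in the demo array */

struct queue {
	pthread_mutex_t lock;           /* stands in for sq->lock */
	unsigned long to_write;         /* one bit per pending write */
};

/* number of bits set -- analogous to io_weight(sq->to_write, disks) */
static int io_weight(unsigned long bits, int disks)
{
	int i, w = 0;

	for (i = 0; i < disks; i++)
		if (bits & (1UL << i))
			w++;
	return w;
}

/* record a pending write for disk dd_idx, under the lock */
static void add_queue_bio(struct queue *q, int dd_idx)
{
	pthread_mutex_lock(&q->lock);
	q->to_write |= 1UL << dd_idx;   /* set_bit(dd_idx, sq->to_write) */
	pthread_mutex_unlock(&q->lock);
}

/* true when every data disk has a pending write (a full stripe) */
static int handle_queue(struct queue *q, int data_disks)
{
	int full;

	pthread_mutex_lock(&q->lock);   /* the pairing the patch adds */
	full = io_weight(q->to_write, DISKS) == data_disks;
	pthread_mutex_unlock(&q->lock);
	return full;
}
```

Without the lock in handle_queue(), a concurrent add_queue_bio() could
flip a bit between the read of to_write and the decision based on it,
which is the kind of inconsistent queue state the init_queue() checks in
the second patch are designed to report.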
raid5:md3: kernel BUG , followed by , Silent halt .
	Hello All , Here we go again .  Again attempting to do bonnie++
testing on a small array .

Kernel 2.6.22.1
Patches involved ,
	IOP1 , 2.6.22.1-iop1 for improved sequential write performance
	(stripe-queue) , Dan Williams <[EMAIL PROTECTED]>
	[SCSI] Addition to pci_ids.h for ATTO Technology, Inc.
	linux/kernel/git/jejb/scsi-misc-2.6.git ,
	commit: d43639784817e4c9acaffd0952a1dc8ab6ecc076
	[SCSI] mpt fusion: Add support for ATTO 4LD: Rebranded LSI 53C1030
	linux/kernel/git/jejb/scsi-misc-2.6.git ,
	commit: 5e394a08b72f40b059d05c44268769ce87ded11a

root@(none):~ #
------------[ cut here ]------------
kernel BUG at drivers/md/raid5.c:562!
invalid opcode: [#1]
SMP
Modules linked in:
CPU:    7
EIP:    0060:[]    Not tainted VLI
EFLAGS: 00010002   (2.6.22.1-iop1 #2)
EIP is at init_queue+0x164/0x190
eax: 0001       ebx: f61cda30   ecx: 0001       edx: 0010
esi: f70fde00   edi: f61cda30   ebp: f7605b08   esp: f7605ad8
ds: 007b   es: 007b   fs: 00d8   gs:   ss: 0068
Process pdflush (pid: 377, ti=f7604000 task=f7cd6030 task.ti=f7604000)
Stack: f61cda30 f7605af8 c0405cdc 0001 f70fde04 f70fde00 f70fde00 000b78e8
       f61cda30 f70fde00 f70fdfa0 f7605b50 c06e8ad7 0007 0003 f70fdf1c
       000b78e8 0286 f7605b38 0286 f7605bbc f70fdf40
Call Trace:
 [] show_trace_log_lvl+0x1a/0x30
 [] show_stack_log_lvl+0x9a/0xc0
 [] show_registers+0x1d6/0x2e0
 [] die+0x10e/0x210
 [] do_trap+0x91/0xd0
 [] do_invalid_op+0x89/0xa0
 [] error_code+0x72/0x78
 [] get_active_queue+0x147/0x1a0
 [] make_request+0x1a0/0x3f0
 [] generic_make_request+0x1d5/0x2c0
 [] submit_bio+0x4f/0xf0
 [] xfs_submit_ioend_bio+0x1e/0x30
 [] xfs_submit_ioend+0xf2/0x100
 [] xfs_page_state_convert+0x430/0x640
 [] xfs_vm_writepage+0x63/0x100
 [] __writepage+0xb/0x30
 [] write_cache_pages+0x206/0x300
 [] generic_writepages+0x2f/0x40
 [] xfs_vm_writepages+0x24/0x60
 [] do_writepages+0x2e/0x50
 [] __sync_single_inode+0x5b/0x1a0
 [] __writeback_single_inode+0x44/0x1e0
 [] sync_sb_inodes+0x12d/0x250
 [] writeback_inodes+0xcc/0xe0
 [] background_writeout+0x7e/0xb0
 [] __pdflush+0xcb/0x1a0
 [] pdflush+0x2c/0x30
 [] kthread+0x5c/0xa0
 [] kernel_thread_helper+0x7/0x10
===
Code: 8b 45 ec c7 04 24 6e 53 9b c0 89 44 24 04 e8 74 b0 a3 ff 0f 0b eb
fe 0f 0b eb fe 0f 0b eb fe 0f 0b eb fe 0f 0b eb fe 0f 0b eb fe <0f> 0b
eb fe 8b 55 e8 89 f8 83 c2 04 e8 fb d0 d1 ff 85 db 0f 85
EIP: [] init_queue+0x164/0x190 SS:ESP 0068:f7605ad8

	Build .config .
http://www.baby-dragons.com/linux-2.6.22.1d-reiser-xfs-lmsensors-EMMENSE-DEBUGGING_minus-I_oat-W_iop-patches-atto-ul4d-patch.config
	Build log .
http://www.baby-dragons.com/linux-2.6.22.1d-reiser-xfs-lmsensors-EMMENSE-DEBUGGING_minus-I_oat-W_iop-patches-atto-ul4d-patch.log

[EMAIL PROTECTED]:~ # /usr/src/linux/scripts/ver_linux
If some fields are empty or look unusual you may have an old version.
Compare to the current minimal requirements in Documentation/Changes.

Linux filesrv2 2.6.22.1-iop1 #2 SMP Sat Aug 18 19:52:15 UTC 2007 i686 pentium4 i386 GNU/Linux

Gnu C                  3.4.6
Gnu make               3.81
binutils               2.15.92.0.2
util-linux             2.12r
mount                  2.12r
module-init-tools      3.2.2
e2fsprogs              1.38
jfsutils               1.1.11
reiserfsprogs          3.6.19
xfsprogs               2.8.10
pcmciautils            014
pcmcia-cs              3.2.8
quota-tools            3.13.
PPP                    2.4.4
Linux C Library        2.3.6
Dynamic linker (ldd)   2.3.6
Linux C++ Library      6.0.3
Procps                 3.2.7
Net-tools              1.60
Kbd                    1.12
oprofile               0.9.1
Sh-utils               5.97
udev                   097
Modules Loaded

--
+---------------------------------------------------------------+
| James W. Laferriere | System Techniques   | Give me VMS       |
| Network Engineer    | 663 Beaumont Blvd   | Give me Linux     |
| [EMAIL PROTECTED]   | Pacifica, CA. 94044 | only on AXP       |
+---------------------------------------------------------------+