Re: mfi timeouts
----- Original Message -----
From: rihad ri...@mail.ru
To: John Baldwin j...@freebsd.org
Cc: freebsd-stable@freebsd.org; vi...@unsane.co.uk
Sent: Thursday, February 28, 2013 5:36 AM
Subject: Re: mfi timeouts

> On 02/27/2013 08:59 PM, John Baldwin wrote:
>> On Wednesday, February 27, 2013 12:58:11 am rihad wrote:
>>> Now about this part taken from here
>>> http://lists.freebsd.org/pipermail/freebsd-scsi/2011-March/004839.html
>>>
>>>   By issuing a dummy read operation (thus forcing a flush of data buffers), this issue is largely averted.
>>>
>>> Does this mean that battery-backed cache (BBU) is effectively rendered useless, as all write operations are forced on to the disk platters on every interrupt?
>>
>> No, this is a very different level. This is forcing pending PCI DMA transactions on the PCI bus to flush by doing a read, not forcing I/O buffers to be flushed to disk.
>
> Thanks for clarifying. After applying the dummy read patch mfi timeouts don't appear in dmesg output any more, but i/o stalls still occurred 2-3 times during periods of high activity, for no more than 10-20 seconds. I guess the only way to fix that is to choose another hardware RAID implementation, or try Steven Hartland's patch? Does 8.3 or 9.1 include more fixes in this area, is upgrading recommended?

8.3 and 9.1 are way behind head and I'm not aware of any significant changes to them which may help you there, rihad.

I would recommend using mfi from HEAD even if you stick with 8.3 or 9.1 as a base; it shouldn't require much work to back port, just avoid the busdma changes.

Regards
Steve

This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmas...@multiplay.co.uk.
___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: mfi timeouts
On 02/28/2013 09:36 AM, rihad wrote:
> On 02/27/2013 08:59 PM, John Baldwin wrote:
>> On Wednesday, February 27, 2013 12:58:11 am rihad wrote:
>>> Now about this part taken from here
>>> http://lists.freebsd.org/pipermail/freebsd-scsi/2011-March/004839.html
>>>
>>>   By issuing a dummy read operation (thus forcing a flush of data buffers), this issue is largely averted.
>>>
>>> Does this mean that battery-backed cache (BBU) is effectively rendered useless, as all write operations are forced on to the disk platters on every interrupt?
>>
>> No, this is a very different level. This is forcing pending PCI DMA transactions on the PCI bus to flush by doing a read, not forcing I/O buffers to be flushed to disk.
>
> Thanks for clarifying. After applying the dummy read patch mfi timeouts don't appear in dmesg output any more, but i/o stalls still occurred 2-3 times during periods of high activity, for no more than 10-20 seconds. I guess the only way to fix that is to choose another hardware RAID implementation, or try Steven Hartland's patch? Does 8.3 or 9.1 include more fixes in this area, is upgrading recommended?

Oops, still same errors occurring even after kernel rebuild... I/O stalled for around 100 seconds as per application logs... I wonder why the patch didn't help? :(

mfi0: COMMAND 0xff80010ccda0 TIMEOUT AFTER 59 SECONDS
mfi0: COMMAND 0xff80010ccaf8 TIMEOUT AFTER 57 SECONDS
mfi0: COMMAND 0xff80010cc520 TIMEOUT AFTER 57 SECONDS
mfi0: COMMAND 0xff80010ca6d8 TIMEOUT AFTER 57 SECONDS
mfi0: COMMAND 0xff80010cc410 TIMEOUT AFTER 57 SECONDS
mfi0: COMMAND 0xff80010ca7e8 TIMEOUT AFTER 57 SECONDS
mfi0: COMMAND 0xff80010caa08 TIMEOUT AFTER 57 SECONDS
mfi0: COMMAND 0xff80010cbfd0 TIMEOUT AFTER 57 SECONDS
mfi0: COMMAND 0xff80010cc0e0 TIMEOUT AFTER 57 SECONDS
mfi0: COMMAND 0xff80010c9880 TIMEOUT AFTER 57 SECONDS
mfi0: COMMAND 0xff80010cca70 TIMEOUT AFTER 57 SECONDS
mfi0: COMMAND 0xff80010c94c8 TIMEOUT AFTER 57 SECONDS
mfi0: COMMAND 0xff80010cae48 TIMEOUT AFTER 57 SECONDS
mfi0: COMMAND 0xff80010ccb80 TIMEOUT AFTER 57 SECONDS
mfi0: COMMAND 0xff80010ca320 TIMEOUT AFTER 57 SECONDS
mfi0: COMMAND 0xff80010c9990 TIMEOUT AFTER 57 SECONDS
mfi0: COMMAND 0xff80010cb178 TIMEOUT AFTER 57 SECONDS
mfi0: COMMAND 0xff80010cac28 TIMEOUT AFTER 57 SECONDS
mfi0: COMMAND 0xff80010ca8f8 TIMEOUT AFTER 57 SECONDS
mfi0: COMMAND 0xff80010cd0d0 TIMEOUT AFTER 57 SECONDS
mfi0: COMMAND 0xff80010c9f68 TIMEOUT AFTER 57 SECONDS
mfi0: COMMAND 0xff80010c9440 TIMEOUT AFTER 57 SECONDS
mfi0: COMMAND 0xff80010ca430 TIMEOUT AFTER 57 SECONDS
mfi0: COMMAND 0xff80010cc058 TIMEOUT AFTER 56 SECONDS
mfi0: COMMAND 0xff80010cb288 TIMEOUT AFTER 52 SECONDS
mfi0: COMMAND 0xff80010cb6c8 TIMEOUT AFTER 52 SECONDS
mfi0: COMMAND 0xff80010cbc18 TIMEOUT AFTER 51 SECONDS
mfi0: COMMAND 0xff80010cbdb0 TIMEOUT AFTER 51 SECONDS
mfi0: COMMAND 0xff80010c97f8 TIMEOUT AFTER 47 SECONDS
mfi0: COMMAND 0xff80010ccc90 TIMEOUT AFTER 41 SECONDS
mfi0: COMMAND 0xff80010cd048 TIMEOUT AFTER 37 SECONDS
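For anyone who wants to quantify stalls like the ones logged above, here is a minimal sketch (not from any of the posters) that scans log text for the driver's "mfiN: COMMAND 0x... TIMEOUT AFTER N SECONDS" lines and reports, per controller, how many distinct commands timed out and the longest wait seen. The function name summarize_mfi_timeouts is made up for illustration, and the line pattern is assumed from the messages quoted in this thread only.

```python
import re

# Assumed line shape, taken from the log quoted in this thread, e.g.
#   mfi0: COMMAND 0xff80010ccda0 TIMEOUT AFTER 59 SECONDS
TIMEOUT_RE = re.compile(
    r"(mfi\d+): COMMAND (0x[0-9a-fA-F]+) TIMEOUT AFTER (\d+) SECONDS",
    re.IGNORECASE,
)


def summarize_mfi_timeouts(log_text):
    """Summarize mfi command timeouts found in log_text.

    Returns a dict keyed by controller name; each value holds the number
    of distinct command addresses that timed out and the longest timeout
    (in seconds) reported for that controller.
    (Hypothetical helper, not part of FreeBSD or mfiutil.)
    """
    seen = {}
    for ctrl, addr, secs in TIMEOUT_RE.findall(log_text):
        entry = seen.setdefault(ctrl, {"commands": set(), "max_seconds": 0})
        entry["commands"].add(addr.lower())
        entry["max_seconds"] = max(entry["max_seconds"], int(secs))
    return {
        ctrl: {"commands": len(e["commands"]), "max_seconds": e["max_seconds"]}
        for ctrl, e in seen.items()
    }
```

It could be fed the saved output of dmesg or a copy of the system log; because the same command is reported again as its wait grows, counting distinct addresses avoids inflating the number of stuck commands.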
Re: mfi timeouts
----- Original Message -----
From: rihad ri...@mail.ru

>>> Did the patch help? We're having the same issues; running mfiutil show volumes every minute doesn't make the freezes go away. Will this small patch be ok on 8.2-RELEASE-p4? Thanks.
>>
>> I'm about to commit a major patch to mfi, which fixes many problems with the current driver. Given info in that PR it could well fix this problem.
>>
>> Unfortunately due to the number of changes, it's likely to sit in head for a while before it gets MFC, just to be safe.
>>
>> We've been running it on 8.3-RELEASE on top of the driver from head for some months and no issues so far. So if anyone is interested in a patchset for that I'll be able to provide.
>>
>> I'll post a link to the commit when I'm done.
>
> Thanks, if the trivial dummy read proves insufficient, sure I'll need to look into that :) Thanks.

Ok, the patch is in, in two parts; the second, as I mentioned, is a bit of a monster I'm afraid.

http://svnweb.freebsd.org/base?view=revision&revision=247367
http://svnweb.freebsd.org/base?view=revision&revision=247369

With the exception of the recent busdma changes:
http://svnweb.freebsd.org/base?view=revision&revision=246713
the entire driver back ports trivially to 8 so should be good with 9 too.

Regards
Steve
Re: mfi timeouts
On Wednesday, February 27, 2013 12:58:11 am rihad wrote:
> Now about this part taken from here
> http://lists.freebsd.org/pipermail/freebsd-scsi/2011-March/004839.html
>
>   By issuing a dummy read operation (thus forcing a flush of data buffers), this issue is largely averted.
>
> Does this mean that battery-backed cache (BBU) is effectively rendered useless, as all write operations are forced on to the disk platters on every interrupt?

No, this is a very different level. This is forcing pending PCI DMA transactions on the PCI bus to flush by doing a read, not forcing I/O buffers to be flushed to disk.

--
John Baldwin
Re: mfi timeouts
On 02/27/2013 08:59 PM, John Baldwin wrote:
> On Wednesday, February 27, 2013 12:58:11 am rihad wrote:
>> Now about this part taken from here
>> http://lists.freebsd.org/pipermail/freebsd-scsi/2011-March/004839.html
>>
>>   By issuing a dummy read operation (thus forcing a flush of data buffers), this issue is largely averted.
>>
>> Does this mean that battery-backed cache (BBU) is effectively rendered useless, as all write operations are forced on to the disk platters on every interrupt?
>
> No, this is a very different level. This is forcing pending PCI DMA transactions on the PCI bus to flush by doing a read, not forcing I/O buffers to be flushed to disk.

Thanks for clarifying. After applying the dummy read patch mfi timeouts don't appear in dmesg output any more, but i/o stalls still occurred 2-3 times during periods of high activity, for no more than 10-20 seconds. I guess the only way to fix that is to choose another hardware RAID implementation, or try Steven Hartland's patch? Does 8.3 or 9.1 include more fixes in this area, is upgrading recommended?
Re: mfi timeouts
> On 28/10/2011 04:14, Jan Mikkelsen wrote:
>> Hi,
>>
>> There is a patch linked to from this PR, which seems very similar:
>>
>> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/140416
>>
>> http://lists.freebsd.org/pipermail/freebsd-scsi/2011-March/004839.html
>>
>> The problem is also consistent with running mfiutil clearing the problem.
>>
>> I'm about to deploy mfi controllers in a similar configuration, so I'd be very curious about whether the patch fixes the problem for you.
>
> This looks promising, I'll give a try when I get a moment.

Hi,
Did the patch help? We're having the same issues; running mfiutil show volumes every minute doesn't make the freezes go away. Will this small patch be ok on 8.2-RELEASE-p4? Thanks.
Re: mfi timeouts
On 26/02/2013 18:31, rihad wrote:
>> On 28/10/2011 04:14, Jan Mikkelsen wrote:
>>> Hi,
>>>
>>> There is a patch linked to from this PR, which seems very similar:
>>>
>>> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/140416
>>>
>>> http://lists.freebsd.org/pipermail/freebsd-scsi/2011-March/004839.html
>>>
>>> The problem is also consistent with running mfiutil clearing the problem.
>>>
>>> I'm about to deploy mfi controllers in a similar configuration, so I'd be very curious about whether the patch fixes the problem for you.
>>
>> This looks promising, I'll give a try when I get a moment.
>
> Hi,
> Did the patch help? We're having the same issues; running mfiutil show volumes every minute doesn't make the freezes go away. Will this small patch be ok on 8.2-RELEASE-p4? Thanks.

From what I remember this patch did help. That server has been running happily on 8.3 release for the last 300 days or so with no timeouts.

Vince
Re: mfi timeouts
On Tuesday, February 26, 2013 1:31:44 pm rihad wrote:
>> On 28/10/2011 04:14, Jan Mikkelsen wrote:
>>> Hi,
>>>
>>> There is a patch linked to from this PR, which seems very similar:
>>>
>>> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/140416
>>>
>>> http://lists.freebsd.org/pipermail/freebsd-scsi/2011-March/004839.html
>>>
>>> The problem is also consistent with running mfiutil clearing the problem.
>>>
>>> I'm about to deploy mfi controllers in a similar configuration, so I'd be very curious about whether the patch fixes the problem for you.
>>
>> This looks promising, I'll give a try when I get a moment.
>
> Hi,
> Did the patch help? We're having the same issues; running mfiutil show volumes every minute doesn't make the freezes go away. Will this small patch be ok on 8.2-RELEASE-p4? Thanks.

You can use the patch on 8.2.

--
John Baldwin
Re: mfi timeouts
----- Original Message -----
From: rihad ri...@mail.ru

>> On 28/10/2011 04:14, Jan Mikkelsen wrote:
>>> Hi,
>>>
>>> There is a patch linked to from this PR, which seems very similar:
>>>
>>> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/140416
>>>
>>> http://lists.freebsd.org/pipermail/freebsd-scsi/2011-March/004839.html
>>>
>>> The problem is also consistent with running mfiutil clearing the problem.
>>>
>>> I'm about to deploy mfi controllers in a similar configuration, so I'd be very curious about whether the patch fixes the problem for you.
>>
>> This looks promising, I'll give a try when I get a moment.
>
> Hi,
> Did the patch help? We're having the same issues; running mfiutil show volumes every minute doesn't make the freezes go away. Will this small patch be ok on 8.2-RELEASE-p4? Thanks.

I'm about to commit a major patch to mfi, which fixes many problems with the current driver. Given info in that PR it could well fix this problem.

Unfortunately due to the number of changes, it's likely to sit in head for a while before it gets MFC, just to be safe.

We've been running it on 8.3-RELEASE on top of the driver from head for some months and no issues so far. So if anyone is interested in a patchset for that I'll be able to provide.

I'll post a link to the commit when I'm done.

Regards
Steve
Re: mfi timeouts
On 02/27/2013 12:48 AM, John Baldwin wrote:
> On Tuesday, February 26, 2013 1:31:44 pm rihad wrote:
>>> On 28/10/2011 04:14, Jan Mikkelsen wrote:
>>>> Hi,
>>>>
>>>> There is a patch linked to from this PR, which seems very similar:
>>>>
>>>> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/140416
>>>>
>>>> http://lists.freebsd.org/pipermail/freebsd-scsi/2011-March/004839.html
>>>>
>>>> The problem is also consistent with running mfiutil clearing the problem.
>>>>
>>>> I'm about to deploy mfi controllers in a similar configuration, so I'd be very curious about whether the patch fixes the problem for you.
>>>
>>> This looks promising, I'll give a try when I get a moment.
>>
>> Hi,
>> Did the patch help? We're having the same issues; running mfiutil show volumes every minute doesn't make the freezes go away. Will this small patch be ok on 8.2-RELEASE-p4? Thanks.
>
> You can use the patch on 8.2.

Thanks, I was forced to apply the patch to 8.2-p4 and rebuild the kernel last night, as the server sometimes locks up both interfaces in periods of high disk/net activity (around 4-5 gbit/s passing through). Has anyone had full system lock-ups besides the i/o stall mfi timeout errors? I hope those issues are related. In such cases sometimes one of the interfaces lives, sometimes both are down. Happened 2-3 times during a little over a month.

Now about this part taken from here
http://lists.freebsd.org/pipermail/freebsd-scsi/2011-March/004839.html

  By issuing a dummy read operation (thus forcing a flush of data buffers), this issue is largely averted.

Does this mean that battery-backed cache (BBU) is effectively rendered useless, as all write operations are forced on to the disk platters on every interrupt?

# mfiutil show adapter
mfi0 Adapter:
    Product Name: Integrated Intel(R) RAID Controller SROMBSASMP2
   Serial Number:
        Firmware: 8.0.1-0033
     RAID Levels: JBOD, RAID0, RAID1, RAID5, RAID6, RAID10, RAID50
  Battery Backup: present
           NVRAM: 32K
  Onboard Memory: 512M
  Minimum Stripe: 8K
  Maximum Stripe: 1M
Re: mfi timeouts
On 02/27/2013 03:39 AM, Steven Hartland wrote:
> ----- Original Message -----
> From: rihad ri...@mail.ru
>
>>> On 28/10/2011 04:14, Jan Mikkelsen wrote:
>>>> Hi,
>>>>
>>>> There is a patch linked to from this PR, which seems very similar:
>>>>
>>>> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/140416
>>>>
>>>> http://lists.freebsd.org/pipermail/freebsd-scsi/2011-March/004839.html
>>>>
>>>> The problem is also consistent with running mfiutil clearing the problem.
>>>>
>>>> I'm about to deploy mfi controllers in a similar configuration, so I'd be very curious about whether the patch fixes the problem for you.
>>>
>>> This looks promising, I'll give a try when I get a moment.
>>
>> Hi,
>> Did the patch help? We're having the same issues; running mfiutil show volumes every minute doesn't make the freezes go away. Will this small patch be ok on 8.2-RELEASE-p4? Thanks.
>
> I'm about to commit a major patch to mfi, which fixes many problems with the current driver. Given info in that PR it could well fix this problem.
>
> Unfortunately due to the number of changes, it's likely to sit in head for a while before it gets MFC, just to be safe.
>
> We've been running it on 8.3-RELEASE on top of the driver from head for some months and no issues so far. So if anyone is interested in a patchset for that I'll be able to provide.
>
> I'll post a link to the commit when I'm done.

Thanks, if the trivial dummy read proves insufficient, sure I'll need to look into that :) Thanks.
Re: mfi timeouts
On 14/11/2011 19:42, John Baldwin wrote:
> On Thursday, November 10, 2011 5:59:28 am Vincent Hoffman wrote:
>> Well the dell has been up for about 19 hours now using MSI, I ran bonnie++ a few times on it and have now stuck it in a permanent loop (will look in from time to time.) Are there any tests you'd like run/info you'd like?
>
> Actually, can you please test www.freebsd.org/~jhb/patches/mfi_msi.patch? You will have to set the hw.mfi.msi=1 tunable to enable MSI support. This is a commit candidate if it works. Thanks.

Applied and running with bonnie++ overnight. All good for me at least.

Vince
Re: mfi timeouts
On 16/11/2011, at 9:43 PM, Vincent Hoffman wrote:
> On 14/11/2011 19:42, John Baldwin wrote:
>> On Thursday, November 10, 2011 5:59:28 am Vincent Hoffman wrote:
>>> Well the dell has been up for about 19 hours now using MSI, I ran bonnie++ a few times on it and have now stuck it in a permanent loop (will look in from time to time.) Are there any tests you'd like run/info you'd like?
>>
>> Actually, can you please test www.freebsd.org/~jhb/patches/mfi_msi.patch? You will have to set the hw.mfi.msi=1 tunable to enable MSI support. This is a commit candidate if it works. Thanks.
>
> Applied and running with bonnie++ overnight. All good for me at least.

Boots for me with hw.mfi.msi=1, fails to boot with hw.mfi.msi=0, giving repeated timeout messages pretty much as expected.

Won't be able to put load on it until later tomorrow or next week.

Regards,

Jan.
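Since booting or not booting here hinges on whether hw.mfi.msi is set, a quick way to double-check what a given /boot/loader.conf actually assigns may be handy. The following is only a sketch under simplifying assumptions: it treats the file as plain name=value lines with optional double quotes and # comments, ignores includes and the other loader.conf files, and the helper name tunable_value is invented for this example.

```python
def tunable_value(conf_text, name):
    """Return the last value assigned to `name` in loader.conf-style text.

    Simplified parsing (an assumption, not the loader's real parser):
    one name=value per line, optional double quotes around the value,
    and everything after '#' treated as a comment.
    Returns None when the tunable is not set at all.
    """
    value = None
    for raw_line in conf_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and padding
        if "=" not in line:
            continue
        key, _, val = line.partition("=")
        if key.strip() == name:
            value = val.strip().strip('"')  # a later assignment wins
    return value
```

For example, calling tunable_value(open("/boot/loader.conf").read(), "hw.mfi.msi") on a machine set up as described above would be expected to return "1"; a result of None means the tunable is absent and the driver default applies.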
Re: mfi timeouts
On Monday, November 14, 2011 7:27:18 pm Jan Mikkelsen wrote:
> Hi,
>
> Sorry about being unclear. They all failed in the same way.
>
> So, these combinations have continuous timeout errors and fail to completely boot:
>
> - Plain 9.0-RC1
> - 9-stable with 1.62 of mfi.c
> - 9-stable with www.freebsd.org/~jhb/patches/mfi.patch, pci_alloc_msi instead of pci_alloc_msix and hw.mfi.msix=0
>
> This boots, but gets "mfi0: Cannot allocate interrupt" and there are no /dev/mfi* devices:
>
> - 9-stable with www.freebsd.org/~jhb/patches/mfi.patch and hw.mfi.msix=1
>
> This seems to work, but I have not put any load on it yet:
>
> - 9-stable with www.freebsd.org/~jhb/patches/mfi.patch, pci_alloc_msi instead of pci_alloc_msix and hw.mfi.msix=1

Ok, so MSI interrupts seem to work for you.

> I see you have a new patch, www.freebsd.org/~jhb/patches/mfi_msi.patch. This patch doesn't seem to include the dummy read from your earlier patch, or the one in 1.62 of mfi.c. I assume I need to apply the 1.62 mfi.c diff to my 9-stable sources as well. Is that correct?

You can just apply this patch, no need to backport the fix in 1.62 as that fix should not be needed if you are using MSI.

--
John Baldwin
Re: mfi timeouts
Hi,

I have just tested mfi on a machine that has just arrived. I'm seeing the command timeout problem on boot with 9.0-RC1. The message is "mfi0: COMMAND addr TIMEOUT AFTER 59 seconds", and then repeats every 30 seconds (with the time changed, obviously).

I have tested the 9.0-RC1 ISO, a 9-stable kernel patched with the patch from the PR I referenced below (also in FreeBSD cvs in revision 1.62 of src/sys/dev/mfi/mfi.c), and a 9-stable kernel with the patch at www.freebsd.org/~jhb/patches/mfi.patch with the 'pci_alloc_msix' call changed to 'pci_alloc_msi'.

I'm setting up a test with the pci_alloc_msix call unchanged at the moment, but I don't have anything in /boot/loader.conf, so I'm not expecting that to make a difference.

This is on a Supermicro X8DTi-F, 48GB memory and an LSI MegaRAID 9261-8i.

8.2-RELEASE boots fine, dmesg and some output from mfiutil below.

Any suggestions gratefully received!

Thanks,

Jan Mikkelsen

On 09/11/2011, at 10:39 AM, Vincent Hoffman wrote:
> On 08/11/2011 22:24, Vincent Hoffman wrote:
>> On 08/11/2011 19:50, John Baldwin wrote:
>>> On Wednesday, November 02, 2011 5:47:38 pm Vincent Hoffman wrote:
>>>> On 28/10/2011 04:14, Jan Mikkelsen wrote:
>>>>> Hi,
>>>>>
>>>>> There is a patch linked to from this PR, which seems very similar:
>>>>>
>>>>> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/140416
>>>>>
>>>>> http://lists.freebsd.org/pipermail/freebsd-scsi/2011-March/004839.html
>>>>>
>>>>> The problem is also consistent with running mfiutil clearing the problem.
>>>>>
>>>>> I'm about to deploy mfi controllers in a similar configuration, so I'd be very curious about whether the patch fixes the problem for you.
>>>>
>>>> The patch you linked to seems to have removed the stalls, although I have only had it running for a day. I'll post if it stalls again though.
>>>>
>>>> I did manage to scrounge the use of a Dell r410 with a LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05), badged as Dell PERC H700 Adapter, to test out the patch I originally found but had the same issue as this post
>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2011-September/063821.html
>>>> I couldn't get the dell to stall in the first place either though, so it could be a specific firmware version that has the issue.
>>>>
>>>> Anyway thanks for the pointers.
>>>
>>> Hmm, did you try the patch I had posted from that earlier thread? It had two changes in it, one was similar to the patch in the PR, the second added MSI-X support. I've since tweaked it to make the MSI-X support off by default but possible to enable via loader.conf. Would you be willing to try the updated patch at www.freebsd.org/~jhb/patches/mfi.patch?
>>
>> Hi, yes I tried the patch you posted originally, unfortunately the dell never finished booting either. The Supermicro is now in production but I'll take the dell up to 9-STABLE and try your updated patch.
>
> The patch didn't apply quite cleanly for 9-STABLE, 1 reject as it had already been applied.
> I have rebooted the dell and it seems happy with the new patch (msi disabled.)
> Booting with hw.mfi.msix=1 in /boot/loader.conf causes the timeouts again and stops the boot from completing.
> I can give root access to the machine if this would be helpful, I can't give KVM access though unfortunately. (but can look in from time to time if needed.)
>
> Vince

Copyright (c) 1992-2011 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.2-RELEASE #0: Thu May 26 14:33:45 EST 2011
    r...@valhalla.transactionware.com:/home/janm/p4/freebsd-image-std-2008.2/work/base-freebsd/home/janm/p4/freebsd-image-std-2008.2/FreeBSD/src/sys/GENERIC amd64
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(R) CPU E5645 @ 2.40GHz (2409.70-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x206c2  Family = 6  Model = 2c  Stepping = 2
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x9ee3fd<SSE3,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,POPCNT>
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x1<LAHF>
  TSC: P-state invariant
real memory  = 51543801856 (49156 MB)
avail memory = 49664753664 (47364 MB)
ACPI APIC Table: <SUPERM APIC1519>
FreeBSD/SMP: Multiprocessor System Detected: 24 CPUs
FreeBSD/SMP: 2 package(s) x 6 core(s) x 2 SMT threads
 cpu0 (BSP): APIC ID: 0
 cpu1 (AP): APIC ID: 1
 cpu2 (AP): APIC ID: 2
 cpu3 (AP): APIC ID: 3
 cpu4 (AP): APIC ID: 4
 cpu5 (AP): APIC ID: 5
 cpu6 (AP): APIC ID: 16
 cpu7 (AP): APIC ID: 17
 cpu8 (AP): APIC ID: 18
 cpu9 (AP): APIC ID: 19
 cpu10 (AP): APIC ID: 20
 cpu11 (AP): APIC ID: 21
 cpu12 (AP): APIC ID: 32
 cpu13 (AP): APIC ID: 33
 cpu14 (AP): APIC ID: 34
 cpu15 (AP): APIC ID: 35
 cpu16 (AP): APIC ID: 36
 cpu17 (AP): APIC ID: 37
 cpu18 (AP):
Re: mfi timeouts
Following up ...

Booting the 9-stable kernel with the patch from http://www.freebsd.org/~jhb/patches/mfi.patch, modified to use pci_alloc_msi, with hw.mfi.msix=1 boots OK. Haven't put any load on it yet.

Will try the plain patch without the pci_alloc_msi change.

On 14/11/2011, at 7:03 PM, Jan Mikkelsen wrote:
> Hi,
>
> I have just tested mfi on a machine that has just arrived. I'm seeing the command timeout problem on boot with 9.0-RC1. The message is "mfi0: COMMAND addr TIMEOUT AFTER 59 seconds", and then repeats every 30 seconds (with the time changed, obviously).
>
> I have tested the 9.0-RC1 ISO, a 9-stable kernel patched with the patch from the PR I referenced below (also in FreeBSD cvs in revision 1.62 of src/sys/dev/mfi/mfi.c), and a 9-stable kernel with the patch at www.freebsd.org/~jhb/patches/mfi.patch with the 'pci_alloc_msix' call changed to 'pci_alloc_msi'.
>
> I'm setting up a test with the pci_alloc_msix call unchanged at the moment, but I don't have anything in /boot/loader.conf, so I'm not expecting that to make a difference.
>
> This is on a Supermicro X8DTi-F, 48GB memory and an LSI MegaRAID 9261-8i.
>
> 8.2-RELEASE boots fine, dmesg and some output from mfiutil below.
>
> Any suggestions gratefully received!
>
> Thanks,
>
> Jan Mikkelsen
>
> On 09/11/2011, at 10:39 AM, Vincent Hoffman wrote:
>> On 08/11/2011 22:24, Vincent Hoffman wrote:
>>> On 08/11/2011 19:50, John Baldwin wrote:
>>>> On Wednesday, November 02, 2011 5:47:38 pm Vincent Hoffman wrote:
>>>>> On 28/10/2011 04:14, Jan Mikkelsen wrote:
>>>>>> Hi,
>>>>>>
>>>>>> There is a patch linked to from this PR, which seems very similar:
>>>>>>
>>>>>> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/140416
>>>>>>
>>>>>> http://lists.freebsd.org/pipermail/freebsd-scsi/2011-March/004839.html
>>>>>>
>>>>>> The problem is also consistent with running mfiutil clearing the problem.
>>>>>>
>>>>>> I'm about to deploy mfi controllers in a similar configuration, so I'd be very curious about whether the patch fixes the problem for you.
Re: mfi timeouts
On Thursday, November 10, 2011 5:59:28 am Vincent Hoffman wrote:
> On 09/11/2011 14:39, John Baldwin wrote:
>> On Tuesday, November 08, 2011 6:39:07 pm Vincent Hoffman wrote:
>>> On 08/11/2011 22:24, Vincent Hoffman wrote:
>>>> On 08/11/2011 19:50, John Baldwin wrote:
>>>>> Hmm, did you try the patch I had posted from that earlier thread? It had two changes in it, one was similar to the patch in the PR, the second added MSI-X support. I've since tweaked it to make the MSI-X support off by default but possible to enable via loader.conf. Would you be willing to try the updated patch at www.freebsd.org/~jhb/patches/mfi.patch?
>>>>
>>>> Hi, yes I tried the patch you posted originally, unfortunately the dell never finished booting either. The Supermicro is now in production but I'll take the dell up to 9-STABLE and try your updated patch.
>>>
>>> The patch didn't apply quite cleanly for 9-STABLE, 1 reject as it had already been applied.
>>
>> Odd, it's against stock head, so I don't know why it would have failed to apply.
>>
>>> I have rebooted the dell and it seems happy with the new patch (msi disabled.)
>>
>> Okay, good. I'll commit the non-MSI bits at least and get them merged into 9.0 if possible.
>>
>>> Booting with hw.mfi.msix=1 in /boot/loader.conf causes the timeouts again and stops the boot from completing.
>>
>> Ok. Can you try changing it to use MSI instead of MSI-X? Just edit the mfi_pci.c call and replace 'pci_alloc_msix' with 'pci_alloc_msi'.
>
> Well the dell has been up for about 19 hours now using MSI, I ran bonnie++ a few times on it and have now stuck it in a permanent loop (will look in from time to time.) Are there any tests you'd like run/info you'd like?

No, this looks good. I'll probably commit something to mfi to just enable MSI only (but with a tunable that defaults to off) so people can do broader testing.

--
John Baldwin
Re: mfi timeouts
On Monday, November 14, 2011 3:03:42 am Jan Mikkelsen wrote:
> Hi,
>
> I have just tested mfi on a machine that has just arrived. I'm seeing the command timeout problem on boot with 9.0-RC1. The message is mfi0: COMMAND addr TIMEOUT AFTER 59 seconds, and then repeats every 30 seconds (with the time changed, obviously).
>
> I have tested the 9.0-RC1 ISO, a 9-stable kernel patched with the patch from the PR I referenced below (also in FreeBSD cvs in revision 1.62 of src/sys/dev/mfi/mfi.c), and 9-stable kernel with the patch at www.freebsd.org/~jhb/patches/mfi.patch with the 'pci_alloc_msix' call changed to 'pci_alloc_msi'.

You forgot to mention what happened from those tests, did any of them work?

--
John Baldwin
Re: mfi timeouts
On Thursday, November 10, 2011 5:59:28 am Vincent Hoffman wrote:
> On 09/11/2011 14:39, John Baldwin wrote:
>> On Tuesday, November 08, 2011 6:39:07 pm Vincent Hoffman wrote:
>>> On 08/11/2011 22:24, Vincent Hoffman wrote:
>>>> On 08/11/2011 19:50, John Baldwin wrote:
>>>>> Hmm, did you try the patch I had posted from that earlier thread? It had two changes in it, one was similar to the patch in the PR, the second added MSI-X support. I've since tweaked it to make the MSI-X support off by default but possible to enable via loader.conf. Would you be willing to try the updated patch at www.freebsd.org/~jhb/patches/mfi.patch?
>>>> Hi, yes I tried the patch you posted originally, unfortunately the dell never finished booting either. The Supermicro is now in production but I'll take the dell up to 9-STABLE and try your updated patch.
>>> The patch didn't apply quite cleanly for 9-STABLE, 1 reject as it had already been applied.
>> Odd, it's against stock head, so I don't know why it would have failed to apply.
>>> I have rebooted the dell and it seems happy with the new patch (msi disabled.)
>> Okay, good. I'll commit the non-MSI bits at least and get them merged into 9.0 if possible.
>>> Booting with hw.mfi.msix=1 in /boot/loader.conf causes the timeouts again and stops the boot from completing.
>> Ok. Can you try changing it to use MSI instead of MSI-X? Just edit the mfi_pci.c call and replace 'pci_alloc_msix' with 'pci_alloc_msi'.
> Well the dell has been up for about 19 hours now using MSI, I ran bonnie++ a few times on it and have now stuck it in a permanent loop (will look in from time to time.) Are there any tests you'd like run/info you'd like?

Actually, can you please test www.freebsd.org/~jhb/patches/mfi_msi.patch? You will have to set the hw.mfi.msi=1 tunable to enable MSI support. This is a commit candidate if it works. Thanks.

--
John Baldwin
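For anyone testing the patch, the tunable is a boot-time setting in /boot/loader.conf, as with hw.mfi.msix earlier in the thread. The following is only a sketch of one way to set it repeatably: the helper name set_loader_tunable is made up for illustration, and it assumes the usual name="value" line format of loader.conf. It rewrites the file in place, so try it on a copy first.

```sh
#!/bin/sh
# Hypothetical helper (not part of FreeBSD): set NAME="VALUE" in a
# loader.conf-style file, replacing any earlier assignment of NAME.
set_loader_tunable() {
    conf=$1
    name=$2
    value=$3
    tmp=$(mktemp) || return 1
    # Keep every existing line except a previous assignment of this tunable.
    # index() does a literal prefix match, so hw.mfi.msi does not hit hw.mfi.msix.
    if [ -f "$conf" ]; then
        awk -v n="$name=" 'index($0, n) != 1' "$conf" > "$tmp"
    fi
    printf '%s="%s"\n' "$name" "$value" >> "$tmp"
    # Overwrite the contents rather than renaming, so file ownership and mode stay as they were.
    cat "$tmp" > "$conf" && rm -f "$tmp"
}
```

Usage on a scratch copy would look like `cp /boot/loader.conf /tmp/loader.conf.test; set_loader_tunable /tmp/loader.conf.test hw.mfi.msi 1`. The tunable only takes effect after a reboot with a kernel that includes the patch.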
Re: mfi timeouts
Hi,

Sorry about being unclear. They all failed in the same way.

So, these combinations have continuous timeout errors and fail to completely boot:

  Plain 9.0-RC1
  9-stable with 1.62 of mfi.c
  9-stable with www.freebsd.org/~jhb/patches/mfi.patch, pci_alloc_msi instead of pci_alloc_msix and hw.mfi.msix=0

This boots, but gets mfi0: Cannot allocate interrupt and there are no /dev/mfi* devices:

  9-stable with www.freebsd.org/~jhb/patches/mfi.patch and hw.mfi.msix=1

This seems to work, but I have not put any load on it yet:

  9-stable with www.freebsd.org/~jhb/patches/mfi.patch, pci_alloc_msi instead of pci_alloc_msix and hw.mfi.msix=1

I see you have a new patch, www.freebsd.org/~jhb/patches/mfi_msi.patch. This patch doesn't seem to include the dummy read from your earlier patch, or the one in 1.62 of mfi.c. I assume I need to apply the 1.62 mfi.c diff to my 9-stable sources as well. Is that correct?

Will test out the new patch.

Thanks,

Jan Mikkelsen

On 15/11/2011, at 3:36 AM, John Baldwin wrote:
> On Monday, November 14, 2011 3:03:42 am Jan Mikkelsen wrote:
>> Hi,
>>
>> I have just tested mfi on a machine that has just arrived. I'm seeing the command timeout problem on boot with 9.0-RC1. The message is mfi0: COMMAND addr TIMEOUT AFTER 59 seconds, and then repeats every 30 seconds (with the time changed, obviously).
>>
>> I have tested the 9.0-RC1 ISO, a 9-stable kernel patched with the patch from the PR I referenced below (also in FreeBSD cvs in revision 1.62 of src/sys/dev/mfi/mfi.c), and 9-stable kernel with the patch at www.freebsd.org/~jhb/patches/mfi.patch with the 'pci_alloc_msix' call changed to 'pci_alloc_msi'.
>
> You forgot to mention what happened from those tests, did any of them work?
>
> --
> John Baldwin
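When comparing kernels like this, it can help to reduce the repeating timeout messages to one line per stuck command. The following is a rough sketch, assuming the log lines have the "mfi0: COMMAND <addr> TIMEOUT AFTER <n> SECONDS" shape quoted elsewhere in this thread; the function name summarize_timeouts is made up for illustration.

```sh
#!/bin/sh
# Hypothetical helper: read dmesg or /var/log/messages text on stdin and
# print, for each command address, how many timeout reports were seen and
# the longest reported stall in seconds.
summarize_timeouts() {
    awk '
        /mfi[0-9]+: COMMAND .* TIMEOUT AFTER [0-9]+ SECONDS/ {
            for (i = 1; i < NF; i++) {
                if ($i == "COMMAND") addr = $(i + 1)
                if ($i == "AFTER")   secs = $(i + 1) + 0
            }
            if (!(addr in count)) order[++n] = addr   # remember first-seen order
            count[addr]++
            if (secs > longest[addr]) longest[addr] = secs
        }
        END {
            for (j = 1; j <= n; j++)
                printf "%s reports=%d longest=%ds\n", order[j], count[order[j]], longest[order[j]]
        }
    '
}
```

It could be fed with something like `dmesg | summarize_timeouts`. A command whose longest stall keeps growing across runs is still stuck, while one that stops at a small value has completed.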
Re: mfi timeouts
On 09/11/2011 14:39, John Baldwin wrote:
> On Tuesday, November 08, 2011 6:39:07 pm Vincent Hoffman wrote:
>> On 08/11/2011 22:24, Vincent Hoffman wrote:
>>> On 08/11/2011 19:50, John Baldwin wrote:
>>>> Hmm, did you try the patch I had posted from that earlier thread? It had two changes in it, one was similar to the patch in the PR, the second added MSI-X support. I've since tweaked it to make the MSI-X support off by default but possible to enable via loader.conf. Would you be willing to try the updated patch at www.freebsd.org/~jhb/patches/mfi.patch?
>>> Hi, yes I tried the patch you posted originally, unfortunately the dell never finished booting either. The Supermicro is now in production but I'll take the dell up to 9-STABLE and try your updated patch.
>> The patch didn't apply quite cleanly for 9-STABLE, 1 reject as it had already been applied.
> Odd, it's against stock head, so I don't know why it would have failed to apply.
>> I have rebooted the dell and it seems happy with the new patch (msi disabled.)
> Okay, good. I'll commit the non-MSI bits at least and get them merged into 9.0 if possible.
>> Booting with hw.mfi.msix=1 in /boot/loader.conf causes the timeouts again and stops the boot from completing.
> Ok. Can you try changing it to use MSI instead of MSI-X? Just edit the mfi_pci.c call and replace 'pci_alloc_msix' with 'pci_alloc_msi'.

Well the dell has been up for about 19 hours now using MSI, I ran bonnie++ a few times on it and have now stuck it in a permanent loop (will look in from time to time.) Are there any tests you'd like run/info you'd like?

Vince
Re: mfi timeouts
On Tuesday, November 08, 2011 6:39:07 pm Vincent Hoffman wrote:
> On 08/11/2011 22:24, Vincent Hoffman wrote:
>> On 08/11/2011 19:50, John Baldwin wrote:
>>> On Wednesday, November 02, 2011 5:47:38 pm Vincent Hoffman wrote:
>>>> On 28/10/2011 04:14, Jan Mikkelsen wrote:
>>>>> Hi, There is a patch linked to from this PR, which seems very similar:
>>>>> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/140416
>>>>> http://lists.freebsd.org/pipermail/freebsd-scsi/2011-March/004839.html
>>>>> The problem is also consistent with running mfiutil clearing the problem. I'm about to deploy mfi controllers in a similar configuration, so I'd be very curious about whether the patch fixes the problem for you.
>>>> The patch you linked to seems to have removed the stalls, although I have only had it running for a day. I'll post if it stalls again though. I did manage to scrounge the use of a Dell r410 with a LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05) badged as Dell PERC H700 Adapter to test out the patch I originally found but had the same issue as this post http://lists.freebsd.org/pipermail/freebsd-stable/2011-September/063821.html I couldn't get the dell to stall in the first place either though so it could be a specific firmware version that has the issue. Anyway thanks for the pointers.
>>> Hmm, did you try the patch I had posted from that earlier thread? It had two changes in it, one was similar to the patch in the PR, the second added MSI-X support. I've since tweaked it to make the MSI-X support off by default but possible to enable via loader.conf. Would you be willing to try the updated patch at www.freebsd.org/~jhb/patches/mfi.patch?
>> Hi, yes I tried the patch you posted originally, unfortunately the dell never finished booting either. The Supermicro is now in production but I'll take the dell up to 9-STABLE and try your updated patch.
> The patch didn't apply quite cleanly for 9-STABLE, 1 reject as it had already been applied.

Odd, it's against stock head, so I don't know why it would have failed to apply.

> I have rebooted the dell and it seems happy with the new patch (msi disabled.)

Okay, good. I'll commit the non-MSI bits at least and get them merged into 9.0 if possible.

> Booting with hw.mfi.msix=1 in /boot/loader.conf causes the timeouts again and stops the boot from completing.

Ok. Can you try changing it to use MSI instead of MSI-X? Just edit the mfi_pci.c call and replace 'pci_alloc_msix' with 'pci_alloc_msi'.

--
John Baldwin
Re: mfi timeouts
On 09/11/2011 14:39, John Baldwin wrote:
> On Tuesday, November 08, 2011 6:39:07 pm Vincent Hoffman wrote:
>> On 08/11/2011 22:24, Vincent Hoffman wrote:
>>> On 08/11/2011 19:50, John Baldwin wrote:
>>>> On Wednesday, November 02, 2011 5:47:38 pm Vincent Hoffman wrote:
>>>>> On 28/10/2011 04:14, Jan Mikkelsen wrote:
>>>>>> Hi, There is a patch linked to from this PR, which seems very similar:
>>>>>> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/140416
>>>>>> http://lists.freebsd.org/pipermail/freebsd-scsi/2011-March/004839.html
>>>>>> The problem is also consistent with running mfiutil clearing the problem. I'm about to deploy mfi controllers in a similar configuration, so I'd be very curious about whether the patch fixes the problem for you.
>>>>> The patch you linked to seems to have removed the stalls, although I have only had it running for a day. I'll post if it stalls again though. I did manage to scrounge the use of a Dell r410 with a LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05) badged as Dell PERC H700 Adapter to test out the patch I originally found but had the same issue as this post http://lists.freebsd.org/pipermail/freebsd-stable/2011-September/063821.html I couldn't get the dell to stall in the first place either though so it could be a specific firmware version that has the issue. Anyway thanks for the pointers.
>>>> Hmm, did you try the patch I had posted from that earlier thread? It had two changes in it, one was similar to the patch in the PR, the second added MSI-X support. I've since tweaked it to make the MSI-X support off by default but possible to enable via loader.conf. Would you be willing to try the updated patch at www.freebsd.org/~jhb/patches/mfi.patch?
>>> Hi, yes I tried the patch you posted originally, unfortunately the dell never finished booting either. The Supermicro is now in production but I'll take the dell up to 9-STABLE and try your updated patch.
>> The patch didn't apply quite cleanly for 9-STABLE, 1 reject as it had already been applied.
> Odd, it's against stock head, so I don't know why it would have failed to apply.

I think it was http://svnweb.freebsd.org/base?view=revision&revision=227309 no big deal.

>> I have rebooted the dell and it seems happy with the new patch (msi disabled.)
> Okay, good. I'll commit the non-MSI bits at least and get them merged into 9.0 if possible.
>> Booting with hw.mfi.msix=1 in /boot/loader.conf causes the timeouts again and stops the boot from completing.
> Ok. Can you try changing it to use MSI instead of MSI-X? Just edit the mfi_pci.c call and replace 'pci_alloc_msix' with 'pci_alloc_msi'.

I'll have a try.

Vince
Re: mfi timeouts
On 09/11/2011 14:39, John Baldwin wrote:
> On Tuesday, November 08, 2011 6:39:07 pm Vincent Hoffman wrote:
>> On 08/11/2011 22:24, Vincent Hoffman wrote:
>>> On 08/11/2011 19:50, John Baldwin wrote:
>>>> On Wednesday, November 02, 2011 5:47:38 pm Vincent Hoffman wrote:
>>>>> On 28/10/2011 04:14, Jan Mikkelsen wrote:
>>>>>> Hi, There is a patch linked to from this PR, which seems very similar:
>>>>>> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/140416
>>>>>> http://lists.freebsd.org/pipermail/freebsd-scsi/2011-March/004839.html
>>>>>> The problem is also consistent with running mfiutil clearing the problem. I'm about to deploy mfi controllers in a similar configuration, so I'd be very curious about whether the patch fixes the problem for you.
>>>>> The patch you linked to seems to have removed the stalls, although I have only had it running for a day. I'll post if it stalls again though. I did manage to scrounge the use of a Dell r410 with a LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05) badged as Dell PERC H700 Adapter to test out the patch I originally found but had the same issue as this post http://lists.freebsd.org/pipermail/freebsd-stable/2011-September/063821.html I couldn't get the dell to stall in the first place either though so it could be a specific firmware version that has the issue. Anyway thanks for the pointers.
>>>> Hmm, did you try the patch I had posted from that earlier thread? It had two changes in it, one was similar to the patch in the PR, the second added MSI-X support. I've since tweaked it to make the MSI-X support off by default but possible to enable via loader.conf. Would you be willing to try the updated patch at www.freebsd.org/~jhb/patches/mfi.patch?
>>> Hi, yes I tried the patch you posted originally, unfortunately the dell never finished booting either. The Supermicro is now in production but I'll take the dell up to 9-STABLE and try your updated patch.
>> The patch didn't apply quite cleanly for 9-STABLE, 1 reject as it had already been applied.
> Odd, it's against stock head, so I don't know why it would have failed to apply.
>> I have rebooted the dell and it seems happy with the new patch (msi disabled.)
> Okay, good. I'll commit the non-MSI bits at least and get them merged into 9.0 if possible.
>> Booting with hw.mfi.msix=1 in /boot/loader.conf causes the timeouts again and stops the boot from completing.
> Ok. Can you try changing it to use MSI instead of MSI-X? Just edit the mfi_pci.c call and replace 'pci_alloc_msix' with 'pci_alloc_msi'.

Much better. It boots and says:

Nov 9 15:25:45 zfstest kernel: mfi0: Dell PERC H700 Adapter port 0xfc00-0xfcff mem 0xdf1bc000-0xdf1b,0xdf1c-0xdf1f irq 38 at device 0.0 on pci3
Nov 9 15:25:45 zfstest kernel: mfi0: Using MSI-X
Nov 9 15:25:45 zfstest kernel: mfi0: Megaraid SAS driver Ver 3.00
Nov 9 15:25:45 zfstest kernel: mfi0: 2004 (374167405s/0x0020/info) - Shutdown command received from host
Nov 9 15:25:45 zfstest kernel: mfi0: 2005 (boot + 34s/0x0020/info) - Firmware initialization started (PCI ID 0079/1000/1f16/1028)
Nov 9 15:25:45 zfstest kernel: mfi0: 2006 (boot + 34s/0x0020/info) - Firmware version 2.100.03-1046
Nov 9 15:25:45 zfstest kernel: mfi0: 2007 (boot + 36s/0x0008/info) - Battery Present
Nov 9 15:25:45 zfstest kernel: mfi0: 2008 (boot + 36s/0x0020/info) - Package version 12.10.0-0025
Nov 9 15:25:45 zfstest kernel: mfi0: 2009 (boot + 36s/0x0020/info) - Board Revision A00
Nov 9 15:25:45 zfstest kernel: mfi0: 2010 (boot + 61s/0x0002/info) - Inserted: PD 00(e0xff/s0)
Nov 9 15:25:45 zfstest kernel: mfi0: 2011 (boot + 61s/0x0002/info) - Inserted: PD 00(e0xff/s0) Info: enclPd=, scsiType=0, portMap=01, sasAddr=443322110700,
Nov 9 15:25:45 zfstest kernel: mfi0: 2012 (boot + 61s/0x0002/info) - Inserted: PD 01(e0xff/s1)
Nov 9 15:25:45 zfstest kernel: mfi0: 2013 (boot + 61s/0x0002/info) - Inserted: PD 01(e0xff/s1) Info: enclPd=, scsiType=0, portMap=00, sasAddr=443322110600,
Nov 9 15:25:45 zfstest kernel: mfi0: 2014 (374167491s/0x0020/info) - Time established as 11/09/11 15:24:51; (63 seconds since power on)
Nov 9 15:25:45 zfstest kernel: mfi0: 2015 (374167529s/0x0008/info) - Battery temperature is normal
Nov 9 15:25:45 zfstest kernel: mfi0: 2016 (374167529s/0x0008/info) - Battery started charging

More info as required.

Vince
Re: mfi timeouts
On Wednesday, November 02, 2011 5:47:38 pm Vincent Hoffman wrote:
> On 28/10/2011 04:14, Jan Mikkelsen wrote:
>> Hi, There is a patch linked to from this PR, which seems very similar:
>> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/140416
>> http://lists.freebsd.org/pipermail/freebsd-scsi/2011-March/004839.html
>> The problem is also consistent with running mfiutil clearing the problem. I'm about to deploy mfi controllers in a similar configuration, so I'd be very curious about whether the patch fixes the problem for you.
> The patch you linked to seems to have removed the stalls, although I have only had it running for a day. I'll post if it stalls again though. I did manage to scrounge the use of a Dell r410 with a LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05) badged as Dell PERC H700 Adapter to test out the patch I originally found but had the same issue as this post http://lists.freebsd.org/pipermail/freebsd-stable/2011-September/063821.html I couldn't get the dell to stall in the first place either though so it could be a specific firmware version that has the issue. Anyway thanks for the pointers.

Hmm, did you try the patch I had posted from that earlier thread? It had two changes in it, one was similar to the patch in the PR, the second added MSI-X support. I've since tweaked it to make the MSI-X support off by default but possible to enable via loader.conf. Would you be willing to try the updated patch at www.freebsd.org/~jhb/patches/mfi.patch?

> Vince
>> Regards,
>> Jan Mikkelsen
>>
>> On 28/10/2011, at 10:39 AM, Vincent Hoffman wrote:
>>> On 28/10/2011 00:04, Jeremy Chadwick wrote:
>>>> On Thu, Oct 27, 2011 at 11:52:51PM +0100, Vincent Hoffman wrote:
>>>>> I've recently installed a new NAS at work which uses a rebranded LSI megaraid sas
>>>>>
>>>>> [root@banshee ~]# mfiutil show adapter
>>>>> mfi0 Adapter:
>>>>>   Product Name: Supermicro SMC2108
>>>>>   Serial Number:
>>>>>   Firmware: 12.12.0-0047
>>>>>   RAID Levels: JBOD, RAID0, RAID1, RAID5, RAID6, RAID10, RAID50
>>>>>   Battery Backup: present
>>>>>   NVRAM: 32K
>>>>>   Onboard Memory: 512M
>>>>>   Minimum Stripe: 8k
>>>>>   Maximum Stripe: 1M
>>>>>
>>>>> I'm running 8-STABLE as of 2011-10-23 (for zfs v28 as it's got 26 3TB drives). I'm seeing a lot of messages like
>>>>>
>>>>> mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 60 SECONDS
>>>>> mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 90 SECONDS
>>>>> mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 120 SECONDS
>>>>> mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 150 SECONDS
>>>>> mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 180 SECONDS
>>>>> mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 210 SECONDS
>>>>> mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 240 SECONDS
>>>>> mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 271 SECONDS
>>>>> mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 301 SECONDS
>>>>> mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 331 SECONDS
>>>>> mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 361 SECONDS
>>>>> mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 391 SECONDS
>>>>> mfi0: COMMAND 0xff8000b21b08 TIMEOUT AFTER 55 SECONDS
>>>>> mfi0: COMMAND 0xff8000b21b08 TIMEOUT AFTER 85 SECONDS
>>>>>
>>>>> At which time I'm seeing IO stall on the array connected to the mfi adapter, this can continue for 20 minutes or so resuming randomly (or so it seems although a little more on this later on)
>>>>>
>>>>> From pciconf -lv
>>>>> mfi0@pci0:5:0:0: class=0x010400 card=0x070015d9 chip=0x00791000 rev=0x04 hdr=0x00
>>>>>   vendor = 'LSI Logic (Was: Symbios Logic, NCR)'
>>>>>   class = mass storage
>>>>>   subclass = RAID
>>>>>
>>>>> From dmesg
>>>>> mfi0: LSI MegaSAS Gen2 port 0xe000-0xe0ff mem 0xfbd9c000-0xfbd9,0xfbdc-0xfbdf irq 32 at device 0.0 on pci5
>>>>> mfi0: Megaraid SAS driver Ver 3.00
>>>>> mfi0: 12330 (372962922s/0x0020/info) - Shutdown command received from host
>>>>> mfi0: 12331 (boot + 4s/0x0020/info) - Firmware initialization started (PCI ID 0079/1000/0700/15d9)
>>>>> mfi0: 12332 (boot + 4s/0x0020/info) - Firmware version 2.120.53-1235
>>>>> mfi0: 12333 (boot + 7s/0x0008/info) - Battery Present
>>>>> mfi0: 12334 (boot + 7s/0x0020/info) - Package version 12.12.0-0047
>>>>> mfi0: 12335 (boot + 7s/0x0020/info) - Board Revision
>>>>>
>>>>> I have found this thread from a bit of googling but it doesn't end too well. http://lists.freebsd.org/pipermail/freebsd-stable/2011-September/063821.html Was this ever taken further?
>>>>>
>>>>> One thing I have noticed is that the stall (and timeout messages) seem to go away if I query the card using mfiutil, I currently have a cron doing this every 2 minutes to see if this has been coincidence or not.
>>>>>
>>>>> Any suggestions welcome and I'm happy to provide more info if I can but I don't have a duplicate to do too much debugging on, I'm happy to try patches though. Is this worth filing a PR?
>>>> Can you please provide uname -a output? The version of FreeBSD you're using matters greatly here.
>>> Sure
>>> FreeBSD banshee.foobar.net 8.2-STABLE FreeBSD 8.2-STABLE #2: Wed Oct 26 16:14:09 BST 2011
Re: mfi timeouts
On 08/11/2011 19:50, John Baldwin wrote:
> On Wednesday, November 02, 2011 5:47:38 pm Vincent Hoffman wrote:
>> On 28/10/2011 04:14, Jan Mikkelsen wrote:
>>> Hi, There is a patch linked to from this PR, which seems very similar:
>>> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/140416
>>> http://lists.freebsd.org/pipermail/freebsd-scsi/2011-March/004839.html
>>> The problem is also consistent with running mfiutil clearing the problem. I'm about to deploy mfi controllers in a similar configuration, so I'd be very curious about whether the patch fixes the problem for you.
>> The patch you linked to seems to have removed the stalls, although I have only had it running for a day. I'll post if it stalls again though. I did manage to scrounge the use of a Dell r410 with a LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05) badged as Dell PERC H700 Adapter to test out the patch I originally found but had the same issue as this post http://lists.freebsd.org/pipermail/freebsd-stable/2011-September/063821.html I couldn't get the dell to stall in the first place either though so it could be a specific firmware version that has the issue. Anyway thanks for the pointers.
> Hmm, did you try the patch I had posted from that earlier thread? It had two changes in it, one was similar to the patch in the PR, the second added MSI-X support. I've since tweaked it to make the MSI-X support off by default but possible to enable via loader.conf. Would you be willing to try the updated patch at www.freebsd.org/~jhb/patches/mfi.patch?

Hi, yes I tried the patch you posted originally, unfortunately the dell never finished booting either. The Supermicro is now in production but I'll take the dell up to 9-STABLE and try your updated patch.

Vince

>> Vince
Re: mfi timeouts
On 08/11/2011 22:24, Vincent Hoffman wrote:
> On 08/11/2011 19:50, John Baldwin wrote:
>> On Wednesday, November 02, 2011 5:47:38 pm Vincent Hoffman wrote:
>>> On 28/10/2011 04:14, Jan Mikkelsen wrote:
>>>> Hi, There is a patch linked to from this PR, which seems very similar:
>>>> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/140416
>>>> http://lists.freebsd.org/pipermail/freebsd-scsi/2011-March/004839.html
>>>> The problem is also consistent with running mfiutil clearing the problem. I'm about to deploy mfi controllers in a similar configuration, so I'd be very curious about whether the patch fixes the problem for you.
>>> The patch you linked to seems to have removed the stalls, although I have only had it running for a day. I'll post if it stalls again though. I did manage to scrounge the use of a Dell r410 with a LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05) badged as Dell PERC H700 Adapter to test out the patch I originally found but had the same issue as this post http://lists.freebsd.org/pipermail/freebsd-stable/2011-September/063821.html I couldn't get the dell to stall in the first place either though so it could be a specific firmware version that has the issue. Anyway thanks for the pointers.
>> Hmm, did you try the patch I had posted from that earlier thread? It had two changes in it, one was similar to the patch in the PR, the second added MSI-X support. I've since tweaked it to make the MSI-X support off by default but possible to enable via loader.conf. Would you be willing to try the updated patch at www.freebsd.org/~jhb/patches/mfi.patch?
> Hi, yes I tried the patch you posted originally, unfortunately the dell never finished booting either. The Supermicro is now in production but I'll take the dell up to 9-STABLE and try your updated patch.

The patch didn't apply quite cleanly for 9-STABLE, 1 reject as it had already been applied. I have rebooted the dell and it seems happy with the new patch (msi disabled.) Booting with hw.mfi.msix=1 in /boot/loader.conf causes the timeouts again and stops the boot from completing.

I can give root access to the machine if this would be helpful, I can't give KVM access though unfortunately (but can look in from time to time if needed.)

Vince

> Vince
Re: mfi timeouts
On 28/10/2011 04:14, Jan Mikkelsen wrote:
> Hi, There is a patch linked to from this PR, which seems very similar:
> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/140416
> http://lists.freebsd.org/pipermail/freebsd-scsi/2011-March/004839.html
> The problem is also consistent with running mfiutil clearing the problem. I'm about to deploy mfi controllers in a similar configuration, so I'd be very curious about whether the patch fixes the problem for you.

The patch you linked to seems to have removed the stalls, although I have only had it running for a day. I'll post if it stalls again though. I did manage to scrounge the use of a Dell r410 with a LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05) badged as Dell PERC H700 Adapter to test out the patch I originally found but had the same issue as this post http://lists.freebsd.org/pipermail/freebsd-stable/2011-September/063821.html I couldn't get the dell to stall in the first place either though so it could be a specific firmware version that has the issue. Anyway thanks for the pointers.

Vince

> Regards,
> Jan Mikkelsen
>
> On 28/10/2011, at 10:39 AM, Vincent Hoffman wrote:
>> On 28/10/2011 00:04, Jeremy Chadwick wrote:
>>> On Thu, Oct 27, 2011 at 11:52:51PM +0100, Vincent Hoffman wrote:
>>>> I've recently installed a new NAS at work which uses a rebranded LSI megaraid sas
>>>>
>>>> [root@banshee ~]# mfiutil show adapter
>>>> mfi0 Adapter:
>>>>   Product Name: Supermicro SMC2108
>>>>   Serial Number:
>>>>   Firmware: 12.12.0-0047
>>>>   RAID Levels: JBOD, RAID0, RAID1, RAID5, RAID6, RAID10, RAID50
>>>>   Battery Backup: present
>>>>   NVRAM: 32K
>>>>   Onboard Memory: 512M
>>>>   Minimum Stripe: 8k
>>>>   Maximum Stripe: 1M
>>>>
>>>> I'm running 8-STABLE as of 2011-10-23 (for zfs v28 as it's got 26 3TB drives). I'm seeing a lot of messages like
>>>>
>>>> mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 60 SECONDS
>>>> mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 90 SECONDS
>>>> mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 120 SECONDS
>>>> mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 150 SECONDS
>>>> mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 180 SECONDS
>>>> mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 210 SECONDS
>>>> mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 240 SECONDS
>>>> mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 271 SECONDS
>>>> mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 301 SECONDS
>>>> mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 331 SECONDS
>>>> mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 361 SECONDS
>>>> mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 391 SECONDS
>>>> mfi0: COMMAND 0xff8000b21b08 TIMEOUT AFTER 55 SECONDS
>>>> mfi0: COMMAND 0xff8000b21b08 TIMEOUT AFTER 85 SECONDS
>>>>
>>>> At which time I'm seeing IO stall on the array connected to the mfi adapter, this can continue for 20 minutes or so resuming randomly (or so it seems although a little more on this later on)
>>>>
>>>> From pciconf -lv
>>>> mfi0@pci0:5:0:0: class=0x010400 card=0x070015d9 chip=0x00791000 rev=0x04 hdr=0x00
>>>>   vendor = 'LSI Logic (Was: Symbios Logic, NCR)'
>>>>   class = mass storage
>>>>   subclass = RAID
>>>>
>>>> From dmesg
>>>> mfi0: LSI MegaSAS Gen2 port 0xe000-0xe0ff mem 0xfbd9c000-0xfbd9,0xfbdc-0xfbdf irq 32 at device 0.0 on pci5
>>>> mfi0: Megaraid SAS driver Ver 3.00
>>>> mfi0: 12330 (372962922s/0x0020/info) - Shutdown command received from host
>>>> mfi0: 12331 (boot + 4s/0x0020/info) - Firmware initialization started (PCI ID 0079/1000/0700/15d9)
>>>> mfi0: 12332 (boot + 4s/0x0020/info) - Firmware version 2.120.53-1235
>>>> mfi0: 12333 (boot + 7s/0x0008/info) - Battery Present
>>>> mfi0: 12334 (boot + 7s/0x0020/info) - Package version 12.12.0-0047
>>>> mfi0: 12335 (boot + 7s/0x0020/info) - Board Revision
>>>>
>>>> I have found this thread from a bit of googling but it doesn't end too well. http://lists.freebsd.org/pipermail/freebsd-stable/2011-September/063821.html Was this ever taken further?
>>>>
>>>> One thing I have noticed is that the stall (and timeout messages) seem to go away if I query the card using mfiutil, I currently have a cron doing this every 2 minutes to see if this has been coincidence or not.
>>>>
>>>> Any suggestions welcome and I'm happy to provide more info if I can but I don't have a duplicate to do too much debugging on, I'm happy to try patches though. Is this worth filing a PR?
>>> Can you please provide uname -a output? The version of FreeBSD you're using matters greatly here.
>> Sure
>> FreeBSD banshee.foobar.net 8.2-STABLE FreeBSD 8.2-STABLE #2: Wed Oct 26 16:14:09 BST 2011 t...@banshee.foobar.net:/usr/obj/usr/src/sys/BANSHEE amd64
>>
>> [root@banshee /usr/src]# svn info
>> Path: .
>> Working Copy Root Path: /usr/src
>> URL: http://svn.freebsd.org/base/stable/8
>> Repository Root: http://svn.freebsd.org/base
>> Repository UUID: ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
>> Revision: 226708
>> Node Kind: directory
>> Schedule: normal
>> Last Changed Author: brueffer
>> Last Changed Rev: 226671
>> Last Changed Date: 2011-10-23 19:37:57 +0100 (Sun, 23 Oct 2011)
>>
>> It's looking like the mfiutil query stopping the stall is not a coincidence the last 2
Re: mfi timeouts
On 28/10/2011 04:14, Jan Mikkelsen wrote:
> Hi,
>
> There is a patch linked to from this PR, which seems very similar:
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/140416
> http://lists.freebsd.org/pipermail/freebsd-scsi/2011-March/004839.html
>
> The problem is also consistent with running mfiutil clearing the problem.
>
> I'm about to deploy mfi controllers in a similar configuration, so I'd be
> very curious about whether the patch fixes the problem for you.

This looks promising, I'll give it a try when I get a moment.

Thanks,
Vince
mfi timeouts
Hi,

I've recently installed a new NAS at work which uses a rebranded LSI MegaRAID SAS:

[root@banshee ~]# mfiutil show adapter
mfi0 Adapter:
    Product Name: Supermicro SMC2108
   Serial Number:
        Firmware: 12.12.0-0047
     RAID Levels: JBOD, RAID0, RAID1, RAID5, RAID6, RAID10, RAID50
  Battery Backup: present
           NVRAM: 32K
  Onboard Memory: 512M
  Minimum Stripe: 8k
  Maximum Stripe: 1M

I'm running 8-STABLE as of 2011-10-23 (for ZFS v28, as it's got 26 3TB drives).

I'm seeing a lot of messages like

mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 60 SECONDS
mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 90 SECONDS
mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 120 SECONDS
mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 150 SECONDS
mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 180 SECONDS
mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 210 SECONDS
mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 240 SECONDS
mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 271 SECONDS
mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 301 SECONDS
mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 331 SECONDS
mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 361 SECONDS
mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 391 SECONDS
mfi0: COMMAND 0xff8000b21b08 TIMEOUT AFTER 55 SECONDS
mfi0: COMMAND 0xff8000b21b08 TIMEOUT AFTER 85 SECONDS

While these are being logged, I/O stalls on the array connected to the mfi adapter. This can continue for 20 minutes or so, resuming randomly (or so it seems, although a little more on this later on).

From pciconf -lv:

mfi0@pci0:5:0:0: class=0x010400 card=0x070015d9 chip=0x00791000 rev=0x04 hdr=0x00
    vendor     = 'LSI Logic (Was: Symbios Logic, NCR)'
    class      = mass storage
    subclass   = RAID

From dmesg:

mfi0: LSI MegaSAS Gen2 port 0xe000-0xe0ff mem 0xfbd9c000-0xfbd9,0xfbdc-0xfbdf irq 32 at device 0.0 on pci5
mfi0: Megaraid SAS driver Ver 3.00
mfi0: 12330 (372962922s/0x0020/info) - Shutdown command received from host
mfi0: 12331 (boot + 4s/0x0020/info) - Firmware initialization started (PCI ID 0079/1000/0700/15d9)
mfi0: 12332 (boot + 4s/0x0020/info) - Firmware version 2.120.53-1235
mfi0: 12333 (boot + 7s/0x0008/info) - Battery Present
mfi0: 12334 (boot + 7s/0x0020/info) - Package version 12.12.0-0047
mfi0: 12335 (boot + 7s/0x0020/info) - Board Revision

I found this thread after a bit of googling, but it doesn't end too well:
http://lists.freebsd.org/pipermail/freebsd-stable/2011-September/063821.html
Was this ever taken further?

One thing I have noticed is that the stall (and the timeout messages) seem to go away if I query the card using mfiutil. I currently have a cron job doing this every 2 minutes to see whether this has been a coincidence or not.

Any suggestions are welcome and I'm happy to provide more info if I can, but I don't have a duplicate system to do much debugging on. I'm happy to try patches though. Is this worth filing a PR?

Vince
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
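For anyone wanting to gauge how long each stalled command hung from log lines like the ones quoted above, here is a minimal sketch in sh/awk. It assumes the messages keep the "COMMAND <address> TIMEOUT AFTER <n> SECONDS" shape shown in this message; the function name summarize_mfi_timeouts is made up for illustration and is not part of mfiutil or the mfi driver.

```shell
#!/bin/sh
# Hypothetical helper: read log text on stdin and print, for each command
# address, the longest timeout reported for it (in order of first appearance).
summarize_mfi_timeouts() {
    awk '
        # Matches lines such as:
        #   mfi0: COMMAND 0xff8000b216c8 TIMEOUT AFTER 391 SECONDS
        # with or without a leading syslog timestamp and hostname.
        /COMMAND 0x[0-9a-fA-F]+ TIMEOUT AFTER [0-9]+ SECONDS/ {
            for (i = 1; i <= NF; i++) {
                if ($i == "COMMAND") addr = $(i + 1)
                if ($i == "AFTER")   secs = $(i + 1) + 0
            }
            # Remember the order in which addresses first appear.
            if (!(addr in max)) order[++n] = addr
            # Keep only the largest timeout seen for this address.
            if (!(addr in max) || secs > max[addr]) max[addr] = secs
        }
        END {
            for (i = 1; i <= n; i++)
                printf "%s %d\n", order[i], max[order[i]]
        }
    '
}
```

It could be fed from the kernel message buffer or a log file, for example `dmesg | summarize_mfi_timeouts` or `summarize_mfi_timeouts < /var/log/messages`. Each output line is a command address followed by the largest "TIMEOUT AFTER" value seen for it.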
Re: mfi timeouts
On Thu, Oct 27, 2011 at 11:52:51PM +0100, Vincent Hoffman wrote:
> Any suggestions are welcome and I'm happy to provide more info if I can,
> but I don't have a duplicate system to do much debugging on. I'm happy to
> try patches though. Is this worth filing a PR?

Can you please provide uname -a output? The version of FreeBSD you're using matters greatly here.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, US |
| Making life hard for others since 1977.               PGP 4BD6C0CB |
Re: mfi timeouts
On 28/10/2011 00:04, Jeremy Chadwick wrote:
> Can you please provide uname -a output? The version of FreeBSD you're
> using matters greatly here.

Sure.

FreeBSD banshee.foobar.net 8.2-STABLE FreeBSD 8.2-STABLE #2: Wed Oct 26 16:14:09 BST 2011 t...@banshee.foobar.net:/usr/obj/usr/src/sys/BANSHEE amd64

[root@banshee /usr/src]# svn info
Path: .
Working Copy Root Path: /usr/src
URL: http://svn.freebsd.org/base/stable/8
Repository Root: http://svn.freebsd.org/base
Repository UUID: ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
Revision: 226708
Node Kind: directory
Schedule: normal
Last Changed Author: brueffer
Last Changed Rev: 226671
Last Changed Date: 2011-10-23 19:37:57 +0100 (Sun, 23 Oct 2011)

It's looking like the mfiutil query stopping the stall is not a coincidence: the last two stalls have lasted less than the 2-minute interval I set the cron job to run at, much less than previously. The cron job is a simple

/usr/sbin/mfiutil show volumes | grep -v OPTIMAL

so I at least get an email if the volume breaks ;)

Oct 28 00:01:06 banshee mfi0: COMMAND 0xff8000b22d18 TIMEOUT AFTER 59 SECONDS
Oct 28 00:01:36 banshee mfi0: COMMAND 0xff8000b22d18 TIMEOUT AFTER 89 SECONDS
Oct 28 00:13:09 banshee mfi0: COMMAND 0xff8000b205c8 TIMEOUT AFTER 50 SECONDS
Oct 28 00:13:39 banshee mfi0: COMMAND 0xff8000b205c8 TIMEOUT AFTER 80 SECONDS

I'm guessing this must kick something on the card.

Vince
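The polling workaround described in this message could be wrapped in a small script along the following lines. This is only a sketch of that workaround, not a fix: the function name filter_non_optimal and the script path in the comment are hypothetical, and the only command taken from the thread is the `/usr/sbin/mfiutil show volumes | grep -v OPTIMAL` pipeline. Going by the observation above, it is the query itself that appears to clear the stall; the filtering only decides what ends up in the mail.

```shell
#!/bin/sh
# Sketch of the polling workaround. Intended to be run from cron every
# 2 minutes, e.g. with a crontab entry such as
#   */2 * * * * /usr/local/sbin/mfi-poll.sh     (hypothetical path)
# By default cron mails any output to the owner of the crontab.

# Run the given command and print only the lines that lack "OPTIMAL".
filter_non_optimal() {
    "$@" | grep -v OPTIMAL
}

# Query the controller only where mfiutil is actually installed.
if [ -x /usr/sbin/mfiutil ]; then
    filter_non_optimal /usr/sbin/mfiutil show volumes
fi
```

Passing the command as arguments keeps the filter separate from the controller query, so the filter can be tried on a machine without an mfi controller.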
Re: mfi timeouts
Hi,

There is a patch linked to from this PR, which seems very similar:

http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/140416
http://lists.freebsd.org/pipermail/freebsd-scsi/2011-March/004839.html

The problem is also consistent with running mfiutil clearing the problem.

I'm about to deploy mfi controllers in a similar configuration, so I'd be very curious about whether the patch fixes the problem for you.

Regards,

Jan Mikkelsen