Gave that a shot. No dice. Still getting the 8-second lag per drive. It reminds me of RAID cards that do staggered spin-up, spinning up one drive at a time. Only, this is happening after the kernel loads, and of course the LSI 9200s are flashed in IT mode with v.19 firmware and the BIOS disabled.

Jul 29 14:57:12 store2 genunix: [ID 583861 kern.info] sd10 at mpt_sas2: unit-address w50000c0f0401c20f,0: w50000c0f0401c20f,0
Jul 29 14:57:12 store2 genunix: [ID 936769 kern.info] sd10 is /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w50000c0f0401c20f,0
Jul 29 14:57:12 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w50000c0f0401c20f,0 (sd10) online
Jul 29 14:57:20 store2 genunix: [ID 583861 kern.info] sd11 at mpt_sas2: unit-address w50000c0f040075db,0: w50000c0f040075db,0
Jul 29 14:57:20 store2 genunix: [ID 936769 kern.info] sd11 is /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w50000c0f040075db,0
Jul 29 14:57:21 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w50000c0f040075db,0 (sd11) online
Jul 29 14:57:29 store2 genunix: [ID 583861 kern.info] sd12 at mpt_sas2: unit-address w50000c0f042c684b,0: w50000c0f042c684b,0
Jul 29 14:57:29 store2 genunix: [ID 936769 kern.info] sd12 is /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w50000c0f042c684b,0
Jul 29 14:57:29 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w50000c0f042c684b,0 (sd12) online
Jul 29 14:57:38 store2 genunix: [ID 583861 kern.info] sd13 at mpt_sas2: unit-address w50000c0f0457149f,0: w50000c0f0457149f,0
Jul 29 14:57:38 store2 genunix: [ID 936769 kern.info] sd13 is /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w50000c0f0457149f,0
Jul 29 14:57:38 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w50000c0f0457149f,0 (sd13) online
Jul 29 14:57:47 store2 genunix: [ID 583861 kern.info] sd14 at mpt_sas2: unit-address w50000c0f042b1c6f,0: w50000c0f042b1c6f,0
Jul 29 14:57:47 store2 genunix: [ID 936769 kern.info] sd14 is /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w50000c0f042b1c6f,0
Jul 29 14:57:47 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w50000c0f042b1c6f,0 (sd14) online

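For anyone who wants to put a number on the lag, the gap between drives can be read straight off the timestamps. The following is only a minimal Python sketch, not part of any OmniOS tooling: the helper names (`online_times`, `gap_seconds`) are made up for illustration, the year is assumed to be 2015 because syslog lines do not carry one, and it looks only at the "(sdN) online" lines copied from the log above.

```python
import re
from datetime import datetime

# "(sdN) online" lines copied verbatim from the log above.
LOG = """\
Jul 29 14:57:12 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w50000c0f0401c20f,0 (sd10) online
Jul 29 14:57:21 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w50000c0f040075db,0 (sd11) online
Jul 29 14:57:29 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w50000c0f042c684b,0 (sd12) online
Jul 29 14:57:38 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w50000c0f0457149f,0 (sd13) online
Jul 29 14:57:47 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w50000c0f042b1c6f,0 (sd14) online
"""

# Captures the syslog timestamp and the sd instance of an "online" message.
ONLINE_RE = re.compile(r"^(\w{3} +\d+ \d\d:\d\d:\d\d) .*\((sd\d+)\) online$")


def online_times(log_text, year=2015):
    """Return [(device, datetime), ...] for every '(sdN) online' line.

    The year is an assumption: syslog timestamps omit it.
    """
    events = []
    for line in log_text.splitlines():
        match = ONLINE_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
            events.append((match.group(2), stamp))
    return events


def gap_seconds(events):
    """Return [(device, seconds since the previous device came online), ...]."""
    return [
        (cur[0], (cur[1] - prev[1]).total_seconds())
        for prev, cur in zip(events, events[1:])
    ]


events = online_times(LOG)
gaps = gap_seconds(events)
for device, gap in gaps:
    print(f"{device}: +{gap:.0f}s after the previous drive")

average = sum(gap for _, gap in gaps) / len(gaps)
print(f"average gap: {average:.2f}s")
print(f"projected for 90 drives: {average * 90 / 60:.1f} min")
```

On the five entries above the gaps come out at 9, 8, 9 and 9 seconds. Feeding it the full /var/adm/messages output from a boot would show whether every drive pays the same penalty or only some of them.
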
________________________
Michael Talbott
Systems Administrator
La Jolla Institute

> On Jul 29, 2015, at 1:50 PM, Schweiss, Chip <[email protected]> wrote:
>
> I have an OmniOS box with all the same hardware except the server and hard
> disks. I would wager this something to do with the WD disks and something
> different happening in the init.
>
> This is a stab in the dark, but try adding "power-condition:false" in
> /kernel/drv/sd.conf for the WD disks.
>
> -Chip
>
> On Wed, Jul 29, 2015 at 12:48 PM, Michael Talbott <[email protected]> wrote:
> Here's the specs of that server.
>
> Fujitsu RX300S8
>  - http://www.fujitsu.com/fts/products/computing/servers/primergy/rack/rx300/
> 128G ECC DDR3 1600 RAM
> 2 x Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz
> 2 x LSI 9200-8e
> 2 x 10Gb Intel NICs
> 2 x SuperMicro 847E26-RJBOD1 45 bay JBOD enclosures
>  - http://www.supermicro.com/products/chassis/4U/847/SC847E26-RJBOD1.cfm
>
> The enclosures are not currently set up for multipathing. The front and rear
> backplane each have a single independent SAS connection to one of the LSI
> 9200s.
>
> The two enclosures are fully loaded with 45 x 4TB WD4001FYYG-01SL3 drives
> each (90 total).
> http://www.newegg.com/Product/Product.aspx?Item=N82E16822236353
>
> Booting the server up in Ubuntu or CentOS does not have that 8 second delay.
> Each drive is found in a fraction of a second (activity LEDs on the enclosure
> flash on and off really quick as the drives are scanned). On OmniOS, the
> drives seem to be scanned in the same order, but, instead of it spending a
> fraction of a second on each drive, it spends 8 seconds on 1 drive (led of
> only one drive rapidly flashing during that process) before moving on to the
> next x 90 drives.
>
> Is there anything I can do to get more verbosity in the boot messages that
> might just reveal the root issue?
>
> Any suggestions appreciated.
>
> Thanks
>
> ________________________
> Michael Talbott
> Systems Administrator
> La Jolla Institute
>
>> On Jul 29, 2015, at 7:51 AM, Schweiss, Chip <[email protected]> wrote:
>>
>> On Fri, Jul 24, 2015 at 5:03 PM, Michael Talbott <[email protected]> wrote:
>> Hi,
>>
>> I've downgraded the cards (LSI 9211-8e) to v.19 and disabled their boot
>> bios. But I'm still getting the 8 second per drive delay after the kernel
>> loads. Any other ideas?
>>
>>
>> 8 seconds is way too long. What JBODs and disks are you using? Could it
>> be they are powered off and the delay in waiting for the power on command to
>> complete? This could be accelerated by using lsiutils to send them all
>> power on commands first.
>>
>> While I still consider it slow, however, my OmniOS systems with LSI HBAs
>> discover about 2 disks per second. With systems with LOTS of disk all
>> multipathed it still stacks up to a long time to discover them all.
>>
>> -Chip
>>
>>
>> ________________________
>> Michael Talbott
>> Systems Administrator
>> La Jolla Institute
>>
>> > On Jul 20, 2015, at 11:27 PM, Floris van Essen ..:: House of Ancients
>> > Amstafs ::.. <[email protected]>
>> > wrote:
>> >
>> > Michael,
>> >
>> > I know v20 does cause lots of issue's.
>> > V19 , to the best of my knowledge doesn't contain any, so I would
>> > downgrade to v19
>> >
>> >
>> > Kr,
>> >
>> >
>> > Floris
>> > -----Original Message-----
>> > From: OmniOS-discuss [mailto:[email protected]] On Behalf Of Michael Talbott
>> > Sent: Tuesday, July 21, 2015 4:57
>> > To: Marion Hakanson <[email protected]>
>> > CC: omnios-discuss <[email protected]>
>> > Subject: Re: [OmniOS-discuss] Slow Drive Detection and boot-archive
>> >
>> > Thanks for the reply. The bios for the card is disabled already. The 8
>> > second per drive scan happens after the kernel has already loaded and it
>> > is scanning for devices. I wonder if it's due to running newer firmware. I
>> > did update the cards to fw v.20.something before I moved to omnios. Is
>> > there a particular firmware version on the cards I should run to match
>> > OmniOS's drivers?
>> >
>> >
>> > ________________________
>> > Michael Talbott
>> > Systems Administrator
>> > La Jolla Institute
>> >
>> >> On Jul 20, 2015, at 6:06 PM, Marion Hakanson <[email protected]> wrote:
>> >>
>> >> Michael,
>> >>
>> >> I've not seen this; I do have one system with 120 drives and it
>> >> definitely does not have this problem. A couple with 80+ drives are
>> >> also free of this issue, though they are still running OpenIndiana.
>> >>
>> >> One thing I pretty much always do here, is to disable the boot option
>> >> in the LSI HBA's config utility (accessible from the during boot after
>> >> the BIOS has started up). I do this because I don't want the BIOS
>> >> thinking it can boot from any of the external JBOD disks; And also
>> >> because I've had some system BIOS crashes when they tried to enumerate
>> >> too many drives. But, this all happens at the BIOS level, before the
>> >> OS has even started up, so in theory it should not affect what you are
>> >> seeing.
>> >>
>> >> Regards,
>> >>
>> >> Marion
>> >>
>> >>
>> >> ================================================================
>> >> Subject: Re: [OmniOS-discuss] Slow Drive Detection and boot-archive
>> >> From: Michael Talbott <[email protected]>
>> >> Date: Fri, 17 Jul 2015 16:15:47 -0700
>> >> To: omnios-discuss <[email protected]>
>> >>
>> >> Just realized my typo. I'm using this on my 90 and 180 drive systems:
>> >>
>> >> # svccfg -s boot-archive setprop start/timeout_seconds=720
>> >> # svccfg -s boot-archive setprop start/timeout_seconds=1440
>> >>
>> >> Seems like 8 seconds to detect each drive is pretty excessive.
>> >>
>> >> Any ideas on how to speed that up?
>> >>
>> >>
>> >> ________________________
>> >> Michael Talbott
>> >> Systems Administrator
>> >> La Jolla Institute
>> >>
>> >>> On Jul 17, 2015, at 4:07 PM, Michael Talbott <[email protected]> wrote:
>> >>>
>> >>> I have multiple NAS servers I've moved to OmniOS and each of them have
>> >>> 90-180 4T disks. Everything has worked out pretty well for the most
>> >>> part. But I've come into an issue where when I reboot any of them, I'm
>> >>> getting boot-archive service timeouts happening. I found a workaround of
>> >>> increasing the timeout value which brings me to the following. As you
>> >>> can see below in a dmesg output, it's taking the kernel about 8 seconds
>> >>> to detect each of the drives. They're connected via a couple SAS2008
>> >>> based LSI cards.
>> >>>
>> >>> Is this normal?
>> >>> Is there a way to speed that up?
>> >>>
>> >>> I've fixed my frustrating boot-archive timeout problem by adjusting the
>> >>> timeout value from the default of 60 seconds (I guess that'll work ok on
>> >>> systems with less than 8 drives?) to 8 seconds * 90 drives + a little
>> >>> extra time = 280 seconds (for the 90 drive systems). Which means it
>> >>> takes between 12-24 minutes to boot those machines up.
>> >>>
>> >>> # svccfg -s boot-archive setprop start/timeout_seconds=280
>> >>>
>> >>> I figure I can't be the only one. A little googling also revealed:
>> >>> https://www.illumos.org/issues/4614
>> >>>
>> >>> Jul 17 15:40:15 store2 genunix: [ID 583861 kern.info] sd29 at mpt_sas3: unit-address w50000c0f0401bd43,0: w50000c0f0401bd43,0
>> >>> Jul 17 15:40:15 store2 genunix: [ID 936769 kern.info] sd29 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f0401bd43,0
>> >>> Jul 17 15:40:16 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f0401bd43,0 (sd29) online
>> >>> Jul 17 15:40:24 store2 genunix: [ID 583861 kern.info] sd30 at mpt_sas3: unit-address w50000c0f045679c3,0: w50000c0f045679c3,0
>> >>> Jul 17 15:40:24 store2 genunix: [ID 936769 kern.info] sd30 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f045679c3,0
>> >>> Jul 17 15:40:24 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f045679c3,0 (sd30) online
>> >>> Jul 17 15:40:33 store2 genunix: [ID 583861 kern.info] sd31 at mpt_sas3: unit-address w50000c0f045712b3,0: w50000c0f045712b3,0
>> >>> Jul 17 15:40:33 store2 genunix: [ID 936769 kern.info] sd31 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f045712b3,0
>> >>> Jul 17 15:40:33 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f045712b3,0 (sd31) online
>> >>> Jul 17 15:40:42 store2 genunix: [ID 583861 kern.info] sd32 at mpt_sas3: unit-address w50000c0f04571497,0: w50000c0f04571497,0
>> >>> Jul 17 15:40:42 store2 genunix: [ID 936769 kern.info] sd32 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f04571497,0
>> >>> Jul 17 15:40:42 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f04571497,0 (sd32) online
>> >>> Jul 17 15:40:50 store2 genunix: [ID 583861 kern.info] sd33 at mpt_sas3: unit-address w50000c0f042ac8eb,0: w50000c0f042ac8eb,0
>> >>> Jul 17 15:40:50 store2 genunix: [ID 936769 kern.info] sd33 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f042ac8eb,0
>> >>> Jul 17 15:40:50 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f042ac8eb,0 (sd33) online
>> >>> Jul 17 15:40:59 store2 genunix: [ID 583861 kern.info] sd34 at mpt_sas3: unit-address w50000c0f04571473,0: w50000c0f04571473,0
>> >>> Jul 17 15:40:59 store2 genunix: [ID 936769 kern.info] sd34 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f04571473,0
>> >>> Jul 17 15:40:59 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f04571473,0 (sd34) online
>> >>> Jul 17 15:41:08 store2 genunix: [ID 583861 kern.info] sd35 at mpt_sas3: unit-address w50000c0f042c636f,0: w50000c0f042c636f,0
>> >>> Jul 17 15:41:08 store2 genunix: [ID 936769 kern.info] sd35 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f042c636f,0
>> >>> Jul 17 15:41:08 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f042c636f,0 (sd35) online
>> >>> Jul 17 15:41:17 store2 genunix: [ID 583861 kern.info] sd36 at mpt_sas3: unit-address w50000c0f0401bf2f,0: w50000c0f0401bf2f,0
>> >>> Jul 17 15:41:17 store2 genunix: [ID 936769 kern.info] sd36 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f0401bf2f,0
>> >>> Jul 17 15:41:17 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f0401bf2f,0 (sd36) online
>> >>> Jul 17 15:41:25 store2 genunix: [ID 583861 kern.info] sd38 at mpt_sas3: unit-address w50000c0f0401bc1f,0: w50000c0f0401bc1f,0
>> >>> Jul 17 15:41:25 store2 genunix: [ID 936769 kern.info] sd38 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f0401bc1f,0
>> >>> Jul 17 15:41:26 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f0401bc1f,0 (sd38) online
>> >>>
>> >>>
>> >>> ________________________
>> >>> Michael Talbott
>> >>> Systems Administrator
>> >>> La Jolla Institute
>> >>>
>> >>
>> >> _______________________________________________
>> >> OmniOS-discuss mailing list
>> >> [email protected]
>> >> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>> >>
>> >
>> > _______________________________________________
>> > OmniOS-discuss mailing list
>> > [email protected]
>> > http://lists.omniti.com/mailman/listinfo/omnios-discuss
>> > ...:: House of Ancients ::...
>> > American Staffordshire Terriers
>> >
>> > +31-628-161-350
>> > +31-614-198-389
>> > Het Perk 48
>> > 4903 RB
>> > Oosterhout
>> > Netherlands
>> > www.houseofancients.nl
>>
>> _______________________________________________
>> OmniOS-discuss mailing list
>> [email protected]
>> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>>
>
>
