The only other thing that comes to mind is that you mentioned you have only a single SAS path to these disks. Have you disabled multipath? (stmsboot -d)
-Chip

On Wed, Jul 29, 2015 at 5:02 PM, Michael Talbott <[email protected]> wrote:

> Gave that a shot. No dice. Still getting the 8 second lag. It reminds me
> of raid cards that do staggered spinups that sequentially spin up 1 drive
> at a time. Only, this is happening after the kernel loads and of course,
> the LSI 9200s are flashed in IT mode with v.19 firmware and bios disabled.
>
>
> Jul 29 14:57:12 store2 genunix: [ID 583861 kern.info] sd10 at mpt_sas2: unit-address w50000c0f0401c20f,0: w50000c0f0401c20f,0
> Jul 29 14:57:12 store2 genunix: [ID 936769 kern.info] sd10 is /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w50000c0f0401c20f,0
> Jul 29 14:57:12 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w50000c0f0401c20f,0 (sd10) online
> Jul 29 14:57:20 store2 genunix: [ID 583861 kern.info] sd11 at mpt_sas2: unit-address w50000c0f040075db,0: w50000c0f040075db,0
> Jul 29 14:57:20 store2 genunix: [ID 936769 kern.info] sd11 is /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w50000c0f040075db,0
> Jul 29 14:57:21 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w50000c0f040075db,0 (sd11) online
> Jul 29 14:57:29 store2 genunix: [ID 583861 kern.info] sd12 at mpt_sas2: unit-address w50000c0f042c684b,0: w50000c0f042c684b,0
> Jul 29 14:57:29 store2 genunix: [ID 936769 kern.info] sd12 is /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w50000c0f042c684b,0
> Jul 29 14:57:29 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w50000c0f042c684b,0 (sd12) online
> Jul 29 14:57:38 store2 genunix: [ID 583861 kern.info] sd13 at mpt_sas2: unit-address w50000c0f0457149f,0: w50000c0f0457149f,0
> Jul 29 14:57:38 store2 genunix: [ID 936769 kern.info] sd13 is /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w50000c0f0457149f,0
> Jul 29 14:57:38 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w50000c0f0457149f,0 (sd13) online
> Jul 29 14:57:47 store2 genunix: [ID 583861 kern.info] sd14 at mpt_sas2: unit-address w50000c0f042b1c6f,0: w50000c0f042b1c6f,0
> Jul 29 14:57:47 store2 genunix: [ID 936769 kern.info] sd14 is /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w50000c0f042b1c6f,0
> Jul 29 14:57:47 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w50000c0f042b1c6f,0 (sd14) online
>
>
> ________________________
> Michael Talbott
> Systems Administrator
> La Jolla Institute
>
> On Jul 29, 2015, at 1:50 PM, Schweiss, Chip <[email protected]> wrote:
>
> I have an OmniOS box with all the same hardware except the server and hard
> disks. I would wager this something to do with the WD disks and something
> different happening in the init.
>
> This is a stab in the dark, but try adding "power-condition:false" in
> /kernel/drv/sd.conf for the WD disks.
>
> -Chip
>
>
>
> On Wed, Jul 29, 2015 at 12:48 PM, Michael Talbott <[email protected]>
> wrote:
>
>> Here's the specs of that server.
>>
>> Fujitsu RX300S8
>> -
>> http://www.fujitsu.com/fts/products/computing/servers/primergy/rack/rx300/
>> 128G ECC DDR3 1600 RAM
>> 2 x Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz
>> 2 x LSI 9200-8e
>> 2 x 10Gb Intel NICs
>> 2 x SuperMicro 847E26-RJBOD1 45 bay JBOD enclosures
>> - http://www.supermicro.com/products/chassis/4U/847/SC847E26-RJBOD1.cfm
>>
>> The enclosures are not currently set up for multipathing. The front and
>> rear backplane each have a single independent SAS connection to one of the
>> LSI 9200s.
>>
>> The two enclosures are fully loaded with 45 x 4TB WD4001FYYG-01SL3 drives
>> each (90 total).
>> http://www.newegg.com/Product/Product.aspx?Item=N82E16822236353
>>
>> Booting the server up in Ubuntu or CentOS does not have that 8 second
>> delay. Each drive is found in a fraction of a second (activity LEDs on the
>> enclosure flash on and off really quick as the drives are scanned). On
>> OmniOS, the drives seem to be scanned in the same order, but, instead of it
>> spending a fraction of a second on each drive, it spends 8 seconds on 1
>> drive (led of only one drive rapidly flashing during that process) before
>> moving on to the next x 90 drives.
>>
>> Is there anything I can do to get more verbosity in the boot messages
>> that might just reveal the root issue?
>>
>> Any suggestions appreciated.
>>
>> Thanks
>>
>> ________________________
>> Michael Talbott
>> Systems Administrator
>> La Jolla Institute
>>
>> On Jul 29, 2015, at 7:51 AM, Schweiss, Chip <[email protected]> wrote:
>>
>>
>>
>> On Fri, Jul 24, 2015 at 5:03 PM, Michael Talbott <[email protected]>
>> wrote:
>>
>>> Hi,
>>>
>>> I've downgraded the cards (LSI 9211-8e) to v.19 and disabled their boot
>>> bios. But I'm still getting the 8 second per drive delay after the kernel
>>> loads. Any other ideas?
>>>
>>>
>> 8 seconds is way too long. What JBODs and disks are you using? Could
>> it be they are powered off and the delay in waiting for the power on
>> command to complete? This could be accelerated by using lsiutils to send
>> them all power on commands first.
>>
>> While I still consider it slow, however, my OmniOS systems with LSI HBAs
>> discover about 2 disks per second. With systems with LOTS of disk all
>> multipathed it still stacks up to a long time to discover them all.
>>
>> -Chip
>>
>>
>>>
>>> ________________________
>>> Michael Talbott
>>> Systems Administrator
>>> La Jolla Institute
>>>
>>> > On Jul 20, 2015, at 11:27 PM, Floris van Essen ..:: House of Ancients
>>> > Amstafs ::.. <[email protected]> wrote:
>>> >
>>> > Michael,
>>> >
>>> > I know v20 does cause lots of issue's.
>>> > V19 , to the best of my knowledge doesn't contain any, so I would
>>> > downgrade to v19
>>> >
>>> >
>>> > Kr,
>>> >
>>> >
>>> > Floris
>>> > -----Original message-----
>>> > From: OmniOS-discuss [mailto:[email protected]]
>>> > On Behalf Of Michael Talbott
>>> > Sent: Tuesday, July 21, 2015 4:57
>>> > To: Marion Hakanson <[email protected]>
>>> > CC: omnios-discuss <[email protected]>
>>> > Subject: Re: [OmniOS-discuss] Slow Drive Detection and boot-archive
>>> >
>>> > Thanks for the reply. The bios for the card is disabled already. The 8
>>> > second per drive scan happens after the kernel has already loaded and it is
>>> > scanning for devices. I wonder if it's due to running newer firmware. I did
>>> > update the cards to fw v.20.something before I moved to omnios. Is there a
>>> > particular firmware version on the cards I should run to match OmniOS's
>>> > drivers?
>>> >
>>> >
>>> > ________________________
>>> > Michael Talbott
>>> > Systems Administrator
>>> > La Jolla Institute
>>> >
>>> >> On Jul 20, 2015, at 6:06 PM, Marion Hakanson <[email protected]>
>>> >> wrote:
>>> >>
>>> >> Michael,
>>> >>
>>> >> I've not seen this; I do have one system with 120 drives and it
>>> >> definitely does not have this problem. A couple with 80+ drives are
>>> >> also free of this issue, though they are still running OpenIndiana.
>>> >>
>>> >> One thing I pretty much always do here, is to disable the boot option
>>> >> in the LSI HBA's config utility (accessible from the during boot after
>>> >> the BIOS has started up). I do this because I don't want the BIOS
>>> >> thinking it can boot from any of the external JBOD disks; And also
>>> >> because I've had some system BIOS crashes when they tried to enumerate
>>> >> too many drives. But, this all happens at the BIOS level, before the
>>> >> OS has even started up, so in theory it should not affect what you are
>>> >> seeing.
>>> >>
>>> >> Regards,
>>> >>
>>> >> Marion
>>> >>
>>> >>
>>> >> ================================================================
>>> >> Subject: Re: [OmniOS-discuss] Slow Drive Detection and boot-archive
>>> >> From: Michael Talbott <[email protected]>
>>> >> Date: Fri, 17 Jul 2015 16:15:47 -0700
>>> >> To: omnios-discuss <[email protected]>
>>> >>
>>> >> Just realized my typo. I'm using this on my 90 and 180 drive systems:
>>> >>
>>> >> # svccfg -s boot-archive setprop start/timeout_seconds=720
>>> >> # svccfg -s boot-archive setprop start/timeout_seconds=1440
>>> >>
>>> >> Seems like 8 seconds to detect each drive is pretty excessive.
>>> >>
>>> >> Any ideas on how to speed that up?
>>> >>
>>> >>
>>> >> ________________________
>>> >> Michael Talbott
>>> >> Systems Administrator
>>> >> La Jolla Institute
>>> >>
>>> >>> On Jul 17, 2015, at 4:07 PM, Michael Talbott <[email protected]>
>>> >>> wrote:
>>> >>>
>>> >>> I have multiple NAS servers I've moved to OmniOS and each of them
>>> >>> have 90-180 4T disks. Everything has worked out pretty well for the most
>>> >>> part. But I've come into an issue where when I reboot any of them, I'm
>>> >>> getting boot-archive service timeouts happening. I found a workaround of
>>> >>> increasing the timeout value which brings me to the following. As you can
>>> >>> see below in a dmesg output, it's taking the kernel about 8 seconds to
>>> >>> detect each of the drives. They're connected via a couple SAS2008 based LSI
>>> >>> cards.
>>> >>>
>>> >>> Is this normal?
>>> >>> Is there a way to speed that up?
>>> >>>
>>> >>> I've fixed my frustrating boot-archive timeout problem by adjusting
>>> >>> the timeout value from the default of 60 seconds (I guess that'll work ok
>>> >>> on systems with less than 8 drives?) to 8 seconds * 90 drives + a little
>>> >>> extra time = 280 seconds (for the 90 drive systems). Which means it takes
>>> >>> between 12-24 minutes to boot those machines up.
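
The timeout arithmetic in that paragraph can be sketched in a few lines of Python. This is only an illustration under assumptions: `boot_archive_timeout` is a made-up helper name, the 8 seconds per drive is the figure observed in this thread, and the script only prints the svccfg command quoted in the thread rather than running it. Note that 8 x 90 works out to 720 seconds, not 280, which agrees with the corrected values in the follow-up message.

```python
def boot_archive_timeout(drives, seconds_per_drive=8, margin=0):
    """Estimate start/timeout_seconds for the boot-archive service.

    seconds_per_drive=8 is the delay observed in this thread (an assumption
    for any other system); margin is optional extra slack in seconds.
    """
    return drives * seconds_per_drive + margin


for drives in (90, 180):
    timeout = boot_archive_timeout(drives)
    # Print the command instead of executing it, so this is safe to run anywhere.
    print(f"svccfg -s boot-archive setprop start/timeout_seconds={timeout}")
```

With no margin this prints timeouts of 720 and 1440 seconds, i.e. the 12 and 24 minutes mentioned above.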
>>> >>>
>>> >>> # svccfg -s boot-archive setprop start/timeout_seconds=280
>>> >>>
>>> >>> I figure I can't be the only one. A little googling also revealed:
>>> >>> https://www.illumos.org/issues/4614
>>> >>> <https://www.illumos.org/issues/4614>
>>> >>>
>>> >>> Jul 17 15:40:15 store2 genunix: [ID 583861 kern.info] sd29 at mpt_sas3: unit-address w50000c0f0401bd43,0: w50000c0f0401bd43,0
>>> >>> Jul 17 15:40:15 store2 genunix: [ID 936769 kern.info] sd29 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f0401bd43,0
>>> >>> Jul 17 15:40:16 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f0401bd43,0 (sd29) online
>>> >>> Jul 17 15:40:24 store2 genunix: [ID 583861 kern.info] sd30 at mpt_sas3: unit-address w50000c0f045679c3,0: w50000c0f045679c3,0
>>> >>> Jul 17 15:40:24 store2 genunix: [ID 936769 kern.info] sd30 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f045679c3,0
>>> >>> Jul 17 15:40:24 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f045679c3,0 (sd30) online
>>> >>> Jul 17 15:40:33 store2 genunix: [ID 583861 kern.info] sd31 at mpt_sas3: unit-address w50000c0f045712b3,0: w50000c0f045712b3,0
>>> >>> Jul 17 15:40:33 store2 genunix: [ID 936769 kern.info] sd31 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f045712b3,0
>>> >>> Jul 17 15:40:33 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f045712b3,0 (sd31) online
>>> >>> Jul 17 15:40:42 store2 genunix: [ID 583861 kern.info] sd32 at mpt_sas3: unit-address w50000c0f04571497,0: w50000c0f04571497,0
>>> >>> Jul 17 15:40:42 store2 genunix: [ID 936769 kern.info] sd32 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f04571497,0
>>> >>> Jul 17 15:40:42 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f04571497,0 (sd32) online
>>> >>> Jul 17 15:40:50 store2 genunix: [ID 583861 kern.info] sd33 at mpt_sas3: unit-address w50000c0f042ac8eb,0: w50000c0f042ac8eb,0
>>> >>> Jul 17 15:40:50 store2 genunix: [ID 936769 kern.info] sd33 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f042ac8eb,0
>>> >>> Jul 17 15:40:50 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f042ac8eb,0 (sd33) online
>>> >>> Jul 17 15:40:59 store2 genunix: [ID 583861 kern.info] sd34 at mpt_sas3: unit-address w50000c0f04571473,0: w50000c0f04571473,0
>>> >>> Jul 17 15:40:59 store2 genunix: [ID 936769 kern.info] sd34 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f04571473,0
>>> >>> Jul 17 15:40:59 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f04571473,0 (sd34) online
>>> >>> Jul 17 15:41:08 store2 genunix: [ID 583861 kern.info] sd35 at mpt_sas3: unit-address w50000c0f042c636f,0: w50000c0f042c636f,0
>>> >>> Jul 17 15:41:08 store2 genunix: [ID 936769 kern.info] sd35 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f042c636f,0
>>> >>> Jul 17 15:41:08 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f042c636f,0 (sd35) online
>>> >>> Jul 17 15:41:17 store2 genunix: [ID 583861 kern.info] sd36 at mpt_sas3: unit-address w50000c0f0401bf2f,0: w50000c0f0401bf2f,0
>>> >>> Jul 17 15:41:17 store2 genunix: [ID 936769 kern.info] sd36 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f0401bf2f,0
>>> >>> Jul 17 15:41:17 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f0401bf2f,0 (sd36) online
>>> >>> Jul 17 15:41:25 store2 genunix: [ID 583861 kern.info] sd38 at mpt_sas3: unit-address w50000c0f0401bc1f,0: w50000c0f0401bc1f,0
>>> >>> Jul 17 15:41:25 store2 genunix: [ID 936769 kern.info] sd38 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f0401bc1f,0
>>> >>> Jul 17 15:41:26 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f0401bc1f,0 (sd38) online
>>> >>>
>>> >>>
>>> >>> ________________________
>>> >>> Michael Talbott
>>> >>> Systems Administrator
>>> >>> La Jolla Institute
>>> >>>
>>> >>
>>> >> _______________________________________________
>>> >> OmniOS-discuss mailing list
>>> >> [email protected]
>>> >> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>>> >>
>>> >>
>>> >
>>> > _______________________________________________
>>> > OmniOS-discuss mailing list
>>> > [email protected]
>>> > http://lists.omniti.com/mailman/listinfo/omnios-discuss
>>> > ...:: House of Ancients ::...
>>> > American Staffordshire Terriers
>>> >
>>> > +31-628-161-350
>>> > +31-614-198-389
>>> > Het Perk 48
>>> > 4903 RB
>>> > Oosterhout
>>> > Netherlands
>>> > www.houseofancients.nl
>>>
>>> _______________________________________________
>>> OmniOS-discuss mailing list
>>> [email protected]
>>> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>>>
>>
>>
>
>
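
To put a number on the per-drive delay instead of eyeballing the dmesg output, something like the following Python sketch might help. It is an illustration only: `attach_gaps` is a hypothetical helper, the regular expression keys on the `[ID 583861 kern.info] sdN at ...` lines as they appear in the logs quoted in this thread, the year is an assumed parameter because syslog timestamps do not carry one, and month names are assumed to be English abbreviations.

```python
import re
from datetime import datetime

# Matches the "sdN at mpt_sasX" attach lines seen in the quoted logs.
ATTACH = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ genunix: "
    r"\[ID 583861 kern\.info\] (sd\d+) at "
)


def attach_gaps(lines, year=2015):
    """Return (disk, seconds since the previous disk attached) pairs."""
    events = []
    for line in lines:
        match = ATTACH.match(line)
        if match:
            stamp = datetime.strptime(
                f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S"
            )
            events.append((match.group(2), stamp))
    return [
        (name, int((stamp - prev).total_seconds()))
        for (_, prev), (name, stamp) in zip(events, events[1:])
    ]


# Three attach lines copied from the Jul 17 log in this thread.
sample = [
    "Jul 17 15:40:15 store2 genunix: [ID 583861 kern.info] sd29 at mpt_sas3: "
    "unit-address w50000c0f0401bd43,0: w50000c0f0401bd43,0",
    "Jul 17 15:40:24 store2 genunix: [ID 583861 kern.info] sd30 at mpt_sas3: "
    "unit-address w50000c0f045679c3,0: w50000c0f045679c3,0",
    "Jul 17 15:40:33 store2 genunix: [ID 583861 kern.info] sd31 at mpt_sas3: "
    "unit-address w50000c0f045712b3,0: w50000c0f045712b3,0",
]

for name, gap in attach_gaps(sample):
    print(name, gap)
```

For those three lines the gaps come out as 9 seconds each (sd30 and sd31), in line with the roughly 8 seconds per drive reported above.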
_______________________________________________
OmniOS-discuss mailing list
[email protected]
http://lists.omniti.com/mailman/listinfo/omnios-discuss
