Re: [OmniOS-discuss] Slow Drive Detection and boot-archive
I have an OmniOS box with all the same hardware except the server and hard disks. I would wager this has something to do with the WD disks and something different happening in the init. This is a stab in the dark, but try adding power-condition:false in /kernel/drv/sd.conf for the WD disks.

-Chip

On Wed, Jul 29, 2015 at 12:48 PM, Michael Talbott <mtalb...@lji.org> wrote:

> Here's the specs of that server.
>
> Fujitsu RX300S8 - http://www.fujitsu.com/fts/products/computing/servers/primergy/rack/rx300/
> 128G ECC DDR3 1600 RAM
> 2 x Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz
> 2 x LSI 9200-8e
> 2 x 10Gb Intel NICs
> 2 x SuperMicro 847E26-RJBOD1 45-bay JBOD enclosures - http://www.supermicro.com/products/chassis/4U/847/SC847E26-RJBOD1.cfm
>
> The enclosures are not currently set up for multipathing. The front and rear backplanes each have a single independent SAS connection to one of the LSI 9200s. The two enclosures are fully loaded with 45 x 4TB WD4001FYYG-01SL3 drives each (90 total). http://www.newegg.com/Product/Product.aspx?Item=N82E16822236353
>
> Booting the server up in Ubuntu or CentOS does not have that 8-second delay. Each drive is found in a fraction of a second (activity LEDs on the enclosure flash on and off really quickly as the drives are scanned). On OmniOS, the drives seem to be scanned in the same order, but instead of spending a fraction of a second on each drive, it spends 8 seconds on one drive (the LED of only that drive rapidly flashing during the process) before moving on to the next, times 90 drives.
>
> Is there anything I can do to get more verbosity in the boot messages that might reveal the root issue? Any suggestions appreciated.
>
> Thanks
>
> Michael Talbott
> Systems Administrator
> La Jolla Institute
Re: [OmniOS-discuss] Slow Drive Detection and boot-archive
Here's the specs of that server.

Fujitsu RX300S8 - http://www.fujitsu.com/fts/products/computing/servers/primergy/rack/rx300/
128G ECC DDR3 1600 RAM
2 x Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz
2 x LSI 9200-8e
2 x 10Gb Intel NICs
2 x SuperMicro 847E26-RJBOD1 45-bay JBOD enclosures - http://www.supermicro.com/products/chassis/4U/847/SC847E26-RJBOD1.cfm

The enclosures are not currently set up for multipathing. The front and rear backplanes each have a single independent SAS connection to one of the LSI 9200s. The two enclosures are fully loaded with 45 x 4TB WD4001FYYG-01SL3 drives each (90 total). http://www.newegg.com/Product/Product.aspx?Item=N82E16822236353

Booting the server up in Ubuntu or CentOS does not have that 8-second delay. Each drive is found in a fraction of a second (activity LEDs on the enclosure flash on and off really quickly as the drives are scanned). On OmniOS, the drives seem to be scanned in the same order, but instead of spending a fraction of a second on each drive, it spends 8 seconds on one drive (the LED of only that drive rapidly flashing during the process) before moving on to the next, times 90 drives.

Is there anything I can do to get more verbosity in the boot messages that might reveal the root issue? Any suggestions appreciated.

Thanks

Michael Talbott
Systems Administrator
La Jolla Institute
Re: [OmniOS-discuss] Slow Drive Detection and boot-archive
On Wed, 29 Jul 2015 17:07:52 -0500, Schweiss, Chip <c...@innovates.com> wrote:

The only other thing that comes to mind is that you mentioned you have only a single SAS path to these disks. Have you disabled multipath? (stmsboot -d)

What about driver firmware? http://www.dell.com/support/home/us/en/19/Drivers/DriversDetails?driverId=CKTR9

--
Hilsen/Regards
Michael Rasmussen

Get my public GnuPG keys:
michael at rasmussen dot cc
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E
mir at datanom dot net
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C
mir at miras dot org
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917
--
/usr/games/fortune -es says:
Specifications subject to change without notice.

___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss
Re: [OmniOS-discuss] Slow Drive Detection and boot-archive
After some more trial and error, disabling/enabling multipath did not help. But after a little googling, I ran across this:

https://syneto.eu/knowledgebase/cannot-use-some-models-of-enterprise-wd-or-seagate-disks/

I guess these guys use the illumos kernel too, and it became obvious I didn't use the right syntax in sd.conf with your previous suggestion. So I added:

sd-config-list = "WD      WD4001FYYG", "power-condition:false";

to /kernel/drv/sd.conf, and it seems to have cut my cold boot time in half :) Still, 4-5 seconds per drive seems pretty high for device detection, but it's a vast improvement over the 8-9 seconds per drive.

Jul 29 17:05:25 store2 genunix: [ID 583861 kern.info] sd15 at mpt_sas2: unit-address w5c0f042ba1bf,0: w5c0f042ba1bf,0
Jul 29 17:05:25 store2 genunix: [ID 936769 kern.info] sd15 is /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w5c0f042ba1bf,0
Jul 29 17:05:25 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w5c0f042ba1bf,0 (sd15) online
Jul 29 17:05:30 store2 genunix: [ID 583861 kern.info] sd17 at mpt_sas2: unit-address w5c0f01c06d37,0: w5c0f01c06d37,0
Jul 29 17:05:30 store2 genunix: [ID 936769 kern.info] sd17 is /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w5c0f01c06d37,0
Jul 29 17:05:30 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w5c0f01c06d37,0 (sd17) online
Jul 29 17:05:34 store2 genunix: [ID 583861 kern.info] sd18 at mpt_sas2: unit-address w5c0f045717af,0: w5c0f045717af,0
Jul 29 17:05:34 store2 genunix: [ID 936769 kern.info] sd18 is /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w5c0f045717af,0
Jul 29 17:05:34 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w5c0f045717af,0 (sd18) online
Jul 29 17:05:38 store2 genunix: [ID 583861 kern.info] sd19 at mpt_sas2: unit-address w5c0f011a48fb,0: w5c0f011a48fb,0
Jul 29 17:05:38 store2 genunix: [ID 936769 kern.info] sd19 is /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w5c0f011a48fb,0
Jul 29 17:05:38 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w5c0f011a48fb,0 (sd19) online

Thanks.

Michael Talbott
Systems Administrator
La Jolla Institute
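For reference, sd-config-list entries in illumos pair an inquiry vendor/product string with a property list, and the vendor field is padded to 8 characters before the product ID. A sketch of the relevant /kernel/drv/sd.conf fragment (the comments and exact padding shown are my additions; verify the drive's inquiry strings, e.g. with iostat -En, before copying):

```
# /kernel/drv/sd.conf (excerpt)
# Match string is the INQUIRY vendor ID padded to 8 characters,
# followed by the product ID.
# power-condition:false tells sd not to use the power-condition
# field in START STOP UNIT commands, which this WD firmware
# appears to service slowly.
sd-config-list =
    "WD      WD4001FYYG", "power-condition:false";
```

A reboot (or a forced reload of the sd driver configuration) is needed for the change to take effect.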
Re: [OmniOS-discuss] Slow Drive Detection and boot-archive
On Fri, Jul 24, 2015 at 5:03 PM, Michael Talbott <mtalb...@lji.org> wrote:

> Hi, I've downgraded the cards (LSI 9211-8e) to v.19 and disabled their boot BIOS. But I'm still getting the 8 second per drive delay after the kernel loads. Any other ideas?

8 seconds is way too long. What JBODs and disks are you using? Could it be they are powered off, and the delay is in waiting for the power-on command to complete? That could be accelerated by using lsiutil to send them all power-on commands first.

While I still consider it slow, my OmniOS systems with LSI HBAs discover about 2 disks per second. On systems with LOTS of disks, all multipathed, it still stacks up to a long time to discover them all.

-Chip
Re: [OmniOS-discuss] Slow Drive Detection and boot-archive
Hi, I've downgraded the cards (LSI 9211-8e) to v.19 and disabled their boot BIOS. But I'm still getting the 8 second per drive delay after the kernel loads. Any other ideas?

Michael Talbott
Systems Administrator
La Jolla Institute
Re: [OmniOS-discuss] Slow Drive Detection and boot-archive
Michael,

I know v20 does cause lots of issues. V19, to the best of my knowledge, doesn't contain any, so I would downgrade to v19.

Kr,
Floris

-----Original message-----
From: OmniOS-discuss [mailto:omnios-discuss-boun...@lists.omniti.com] On behalf of Michael Talbott
Sent: Tuesday, 21 July 2015 4:57
To: Marion Hakanson <hakan...@ohsu.edu>
CC: omnios-discuss <omnios-discuss@lists.omniti.com>
Subject: Re: [OmniOS-discuss] Slow Drive Detection and boot-archive

Thanks for the reply. The BIOS for the card is disabled already. The 8 second per drive scan happens after the kernel has already loaded and is scanning for devices. I wonder if it's due to running newer firmware. I did update the cards to fw v.20.something before I moved to OmniOS. Is there a particular firmware version on the cards I should run to match OmniOS's drivers?

Michael Talbott
Systems Administrator
La Jolla Institute
Re: [OmniOS-discuss] Slow Drive Detection and boot-archive
On Jul 20, 2015, at 7:56 PM, Michael Talbott <mtalb...@lji.org> wrote:

> Thanks for the reply. The BIOS for the card is disabled already. The 8 second per drive scan happens after the kernel has already loaded and is scanning for devices. I wonder if it's due to running newer firmware. I did update the cards to fw v.20.something before I moved to OmniOS. Is there a particular firmware version on the cards I should run to match OmniOS's drivers?

Google "LSI P20 firmware" for many tales of woe on many different OSes. Be aware that getting the latest version of firmware from Avago might not be obvious... the latest version is 20.00.04.00 for Windows.

-- richard
Re: [OmniOS-discuss] Slow Drive Detection and boot-archive
Michael,

I've not seen this; I do have one system with 120 drives, and it definitely does not have this problem. A couple with 80+ drives are also free of this issue, though they are still running OpenIndiana.

One thing I pretty much always do here is disable the boot option in the LSI HBA's config utility (accessible during boot, after the BIOS has started up). I do this because I don't want the BIOS thinking it can boot from any of the external JBOD disks, and also because I've had some system BIOS crashes when they tried to enumerate too many drives. But this all happens at the BIOS level, before the OS has even started up, so in theory it should not affect what you are seeing.

Regards,
Marion

Subject: Re: [OmniOS-discuss] Slow Drive Detection and boot-archive
From: Michael Talbott mtalb...@lji.org
Date: Fri, 17 Jul 2015 16:15:47 -0700
To: omnios-discuss omnios-discuss@lists.omniti.com

Just realized my typo. I'm using this on my 90 and 180 drive systems:

# svccfg -s boot-archive setprop start/timeout_seconds=720
# svccfg -s boot-archive setprop start/timeout_seconds=1440

Seems like 8 seconds to detect each drive is pretty excessive. Any ideas on how to speed that up?

Michael Talbott
Systems Administrator
La Jolla Institute

On Jul 17, 2015, at 4:07 PM, Michael Talbott mtalb...@lji.org wrote:

I have multiple NAS servers I've moved to OmniOS, and each of them has 90-180 4TB disks. Everything has worked out pretty well for the most part, but I've run into an issue where, when I reboot any of them, I get boot-archive service timeouts. I found a workaround of increasing the timeout value, which brings me to the following. As you can see in the dmesg output below, it takes the kernel about 8 seconds to detect each of the drives. They're connected via a couple of SAS2008-based LSI cards. Is this normal? Is there a way to speed that up?
Re: [OmniOS-discuss] Slow Drive Detection and boot-archive
Thanks for the reply. The BIOS for the card is disabled already. The 8-second-per-drive scan happens after the kernel has already loaded and is scanning for devices. I wonder if it's due to running newer firmware; I did update the cards to fw v.20.something before I moved to OmniOS. Is there a particular firmware version I should run on the cards to match OmniOS's drivers?

Michael Talbott
Systems Administrator
La Jolla Institute

On Jul 20, 2015, at 6:06 PM, Marion Hakanson hakan...@ohsu.edu wrote:

Michael,

I've not seen this; I do have one system with 120 drives, and it definitely does not have this problem. A couple with 80+ drives are also free of this issue, though they are still running OpenIndiana.

One thing I pretty much always do here is disable the boot option in the LSI HBA's config utility (accessible during boot, after the BIOS has started up). I do this because I don't want the BIOS thinking it can boot from any of the external JBOD disks, and also because I've had some system BIOS crashes when they tried to enumerate too many drives. But this all happens at the BIOS level, before the OS has even started up, so in theory it should not affect what you are seeing.

Regards,
Marion

Subject: Re: [OmniOS-discuss] Slow Drive Detection and boot-archive
From: Michael Talbott mtalb...@lji.org
Date: Fri, 17 Jul 2015 16:15:47 -0700
To: omnios-discuss omnios-discuss@lists.omniti.com

Just realized my typo. I'm using this on my 90 and 180 drive systems:

# svccfg -s boot-archive setprop start/timeout_seconds=720
# svccfg -s boot-archive setprop start/timeout_seconds=1440

Seems like 8 seconds to detect each drive is pretty excessive. Any ideas on how to speed that up?

Michael Talbott
Systems Administrator
La Jolla Institute

On Jul 17, 2015, at 4:07 PM, Michael Talbott mtalb...@lji.org wrote:

I have multiple NAS servers I've moved to OmniOS, and each of them has 90-180 4TB disks. Everything has worked out pretty well for the most part.
[OmniOS-discuss] Slow Drive Detection and boot-archive
I have multiple NAS servers I've moved to OmniOS, and each of them has 90-180 4TB disks. Everything has worked out pretty well for the most part, but I've run into an issue where, when I reboot any of them, I get boot-archive service timeouts. I found a workaround of increasing the timeout value, which brings me to the following. As you can see in the dmesg output below, it takes the kernel about 8 seconds to detect each of the drives. They're connected via a couple of SAS2008-based LSI cards. Is this normal? Is there a way to speed that up?

I've fixed my frustrating boot-archive timeout problem by adjusting the timeout value from the default of 60 seconds (I guess that'll work ok on systems with fewer than 8 drives?) to 8 seconds * 90 drives + a little extra time = 280 seconds (for the 90-drive systems). That means it takes 12-24 minutes to boot those machines.

# svccfg -s boot-archive setprop start/timeout_seconds=280

I figure I can't be the only one. A little googling also revealed: https://www.illumos.org/issues/4614

Jul 17 15:40:15 store2 genunix: [ID 583861 kern.info] sd29 at mpt_sas3: unit-address w5c0f0401bd43,0: w5c0f0401bd43,0
Jul 17 15:40:15 store2 genunix: [ID 936769 kern.info] sd29 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f0401bd43,0
Jul 17 15:40:16 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f0401bd43,0 (sd29) online
Jul 17 15:40:24 store2 genunix: [ID 583861 kern.info] sd30 at mpt_sas3: unit-address w5c0f045679c3,0: w5c0f045679c3,0
Jul 17 15:40:24 store2 genunix: [ID 936769 kern.info] sd30 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f045679c3,0
Jul 17 15:40:24 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f045679c3,0 (sd30) online
Jul 17 15:40:33 store2 genunix: [ID 583861 kern.info] sd31 at mpt_sas3: unit-address w5c0f045712b3,0: w5c0f045712b3,0
Jul 17 15:40:33 store2 genunix: [ID 936769 kern.info] sd31 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f045712b3,0
Jul 17 15:40:33 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f045712b3,0 (sd31) online
Jul 17 15:40:42 store2 genunix: [ID 583861 kern.info] sd32 at mpt_sas3: unit-address w5c0f04571497,0: w5c0f04571497,0
Jul 17 15:40:42 store2 genunix: [ID 936769 kern.info] sd32 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f04571497,0
Jul 17 15:40:42 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f04571497,0 (sd32) online
Jul 17 15:40:50 store2 genunix: [ID 583861 kern.info] sd33 at mpt_sas3: unit-address w5c0f042ac8eb,0: w5c0f042ac8eb,0
Jul 17 15:40:50 store2 genunix: [ID 936769 kern.info] sd33 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f042ac8eb,0
Jul 17 15:40:50 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f042ac8eb,0 (sd33) online
Jul 17 15:40:59 store2 genunix: [ID 583861 kern.info] sd34 at mpt_sas3: unit-address w5c0f04571473,0: w5c0f04571473,0
Jul 17 15:40:59 store2 genunix: [ID 936769 kern.info] sd34 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f04571473,0
Jul 17 15:40:59 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f04571473,0 (sd34) online
Jul 17 15:41:08 store2 genunix: [ID 583861 kern.info] sd35 at mpt_sas3: unit-address w5c0f042c636f,0: w5c0f042c636f,0
Jul 17 15:41:08 store2 genunix: [ID 936769 kern.info] sd35 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f042c636f,0
Jul 17 15:41:08 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f042c636f,0 (sd35) online
Jul 17 15:41:17 store2 genunix: [ID 583861 kern.info] sd36 at mpt_sas3: unit-address w5c0f0401bf2f,0: w5c0f0401bf2f,0
Jul 17 15:41:17 store2 genunix: [ID 936769 kern.info] sd36 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f0401bf2f,0
Jul 17 15:41:17 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f0401bf2f,0 (sd36) online
Jul 17 15:41:25 store2 genunix: [ID 583861 kern.info] sd38 at mpt_sas3: unit-address w5c0f0401bc1f,0: w5c0f0401bc1f,0
Jul 17 15:41:25 store2 genunix: [ID 936769 kern.info] sd38 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f0401bc1f,0
Jul 17 15:41:26 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f0401bc1f,0 (sd38) online

Michael Talbott
Systems Administrator
La Jolla Institute
___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
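The per-drive delay can be read straight off timestamps like the ones in the log above by diffing successive "online" events. A rough sketch, not from the original mails; the sample lines are abridged from the log, and it assumes syslog's "Mon DD HH:MM:SS host ..." layout, so field 3 is the timestamp:

```shell
# Print the gap, in seconds, between successive "(sdNN) online" events.
printf '%s\n' \
  'Jul 17 15:40:16 store2 (sd29) online' \
  'Jul 17 15:40:24 store2 (sd30) online' \
  'Jul 17 15:40:33 store2 (sd31) online' |
awk '{
  split($3, t, ":")                  # HH:MM:SS -> t[1..3]
  s = t[1]*3600 + t[2]*60 + t[3]     # seconds since midnight
  if (NR > 1) print s - prev         # gap since the previous drive
  prev = s
}'
# prints 8 and then 9 for the three sample lines
```

In practice one would feed it `grep ' online' /var/adm/messages` (path assumed) instead of the inline sample; a healthy enclosure should show sub-second gaps rather than ~8 s.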
Re: [OmniOS-discuss] Slow Drive Detection and boot-archive
Just realized my typo. I'm using this on my 90 and 180 drive systems:

# svccfg -s boot-archive setprop start/timeout_seconds=720
# svccfg -s boot-archive setprop start/timeout_seconds=1440

Seems like 8 seconds to detect each drive is pretty excessive. Any ideas on how to speed that up?

Michael Talbott
Systems Administrator
La Jolla Institute

On Jul 17, 2015, at 4:07 PM, Michael Talbott mtalb...@lji.org wrote:

I have multiple NAS servers I've moved to OmniOS, and each of them has 90-180 4TB disks. Everything has worked out pretty well for the most part, but I've run into an issue where, when I reboot any of them, I get boot-archive service timeouts. I found a workaround of increasing the timeout value, which brings me to the following. As you can see in the dmesg output below, it takes the kernel about 8 seconds to detect each of the drives. They're connected via a couple of SAS2008-based LSI cards. Is this normal? Is there a way to speed that up?

I've fixed my frustrating boot-archive timeout problem by adjusting the timeout value from the default of 60 seconds (I guess that'll work ok on systems with fewer than 8 drives?) to 8 seconds * 90 drives + a little extra time = 280 seconds (for the 90-drive systems). That means it takes 12-24 minutes to boot those machines.

# svccfg -s boot-archive setprop start/timeout_seconds=280

I figure I can't be the only one.
A little googling also revealed: https://www.illumos.org/issues/4614
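The 720 s and 1440 s timeouts in the typo correction are exactly the 8-second scan cost scaled by the drive count, which also matches the quoted 12-24 minute boot times. A quick sanity check (not in the original mails):

```shell
# 8 s per drive, for the 90- and 180-drive systems in this thread.
for drives in 90 180; do
  secs=$(( drives * 8 ))
  echo "$drives drives: $secs s (~$(( secs / 60 )) min)"
done
# prints:
#   90 drives: 720 s (~12 min)
#   180 drives: 1440 s (~24 min)
```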