Re: [OmniOS-discuss] Slow Drive Detection and boot-archive

2015-07-29 Thread Schweiss, Chip
I have an OmniOS box with all the same hardware except the server and hard
disks.  I would wager this has something to do with the WD disks and
something different happening during init.

This is a stab in the dark, but try adding power-condition:false in
/kernel/drv/sd.conf for the WD disks.
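For reference, power-condition overrides in sd.conf go through the
sd-config-list property; a sketch of the form (the vendor field is
blank-padded to eight characters, and the product ID here is the WD model
discussed in this thread):

```
sd-config-list = "WD      WD4001FYYG", "power-condition:false";
```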

-Chip



Re: [OmniOS-discuss] Slow Drive Detection and boot-archive

2015-07-29 Thread Michael Talbott
Here's the specs of that server.

Fujitsu RX300S8
 - http://www.fujitsu.com/fts/products/computing/servers/primergy/rack/rx300/
128G ECC DDR3 1600 RAM
2 x Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz
2 x LSI 9200-8e
2 x 10Gb Intel NICs
2 x SuperMicro 847E26-RJBOD1 45 bay JBOD enclosures
 - http://www.supermicro.com/products/chassis/4U/847/SC847E26-RJBOD1.cfm

The enclosures are not currently set up for multipathing. The front and rear 
backplane each have a single independent SAS connection to one of the LSI 9200s.

The two enclosures are fully loaded with 45 x 4TB WD4001FYYG-01SL3 drives each 
(90 total).
http://www.newegg.com/Product/Product.aspx?Item=N82E16822236353

Booting the server up in Ubuntu or CentOS does not have that 8-second delay.
Each drive is found in a fraction of a second (the activity LEDs on the
enclosure flash on and off really quickly as the drives are scanned). On
OmniOS, the drives seem to be scanned in the same order, but instead of
spending a fraction of a second on each drive, it spends 8 seconds on one
drive (the LED of only one drive rapidly flashing during that process) before
moving on to the next, for all 90 drives.

Is there anything I can do to get more verbosity in the boot messages that 
might just reveal the root issue?
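On the verbosity question: one standard illumos option is to boot the kernel
with -v, so module-load and device-attach messages print to the console as
they happen. A sketch, assuming the legacy-GRUB menu.lst layout of OmniOS of
this vintage (exact path and entry may differ on your install):

```
# /rpool/boot/grub/menu.lst -- append -v to the kernel$ line:
kernel$ /platform/i86pc/kernel/amd64/unix -B $ZFS-BOOTFS -v
```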

Any suggestions appreciated.

Thanks


Michael Talbott
Systems Administrator
La Jolla Institute


Re: [OmniOS-discuss] Slow Drive Detection and boot-archive

2015-07-29 Thread Michael Rasmussen
On Wed, 29 Jul 2015 17:07:52 -0500
Schweiss, Chip c...@innovates.com wrote:

 The only other thing that comes to mind is that you mentioned you have only
 a single SAS path to these disks.   Have you disabled multipath?  (stmsboot
 -d)
 
What about driver firmware?
http://www.dell.com/support/home/us/en/19/Drivers/DriversDetails?driverId=CKTR9

-- 
Hilsen/Regards
Michael Rasmussen

Get my public GnuPG keys:
michael at rasmussen dot cc
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E
mir at datanom dot net
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C
mir at miras dot org
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917
--
/usr/games/fortune -es says:
Specifications subject to change without notice.


___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


Re: [OmniOS-discuss] Slow Drive Detection and boot-archive

2015-07-29 Thread Michael Talbott
After some more trial and error, disabling/enabling multipath did not help.

But after a little googling, I ran across this: 
https://syneto.eu/knowledgebase/cannot-use-some-models-of-enterprise-wd-or-seagate-disks/
I guess these guys use the Illumos kernel too, and it became obvious that I
hadn't used the right syntax in sd.conf with your previous suggestion.

so I added

sd-config-list = "WD      WD4001FYYG", "power-condition:false";

to /kernel/drv/sd.conf

and it seems to have cut my cold boot time in half :) Still, 4-5 seconds per
drive seems pretty high for device detection, but it's a vast improvement
over the previous 8-9 seconds per drive.
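At these rates the boot-archive timeout still has to be sized to the drive
count; a quick sketch of the arithmetic (the 60-second margin is an
illustrative assumption, and the script only prints the svccfg command
rather than running it):

```shell
# Size the boot-archive start timeout from the observed per-drive probe time.
per_drive=5   # seconds per drive observed after the sd.conf fix
drives=90
margin=60     # illustrative headroom
timeout=$(( per_drive * drives + margin ))
echo "svccfg -s boot-archive setprop start/timeout_seconds=$timeout"
```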

Jul 29 17:05:25 store2 genunix: [ID 583861 kern.info] sd15 at mpt_sas2: 
unit-address w5c0f042ba1bf,0: w5c0f042ba1bf,0
Jul 29 17:05:25 store2 genunix: [ID 936769 kern.info] sd15 is 
/pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w5c0f042ba1bf,0
Jul 29 17:05:25 store2 genunix: [ID 408114 kern.info] 
/pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w5c0f042ba1bf,0 (sd15) 
online
Jul 29 17:05:30 store2 genunix: [ID 583861 kern.info] sd17 at mpt_sas2: 
unit-address w5c0f01c06d37,0: w5c0f01c06d37,0
Jul 29 17:05:30 store2 genunix: [ID 936769 kern.info] sd17 is 
/pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w5c0f01c06d37,0
Jul 29 17:05:30 store2 genunix: [ID 408114 kern.info] 
/pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w5c0f01c06d37,0 (sd17) 
online
Jul 29 17:05:34 store2 genunix: [ID 583861 kern.info] sd18 at mpt_sas2: 
unit-address w5c0f045717af,0: w5c0f045717af,0
Jul 29 17:05:34 store2 genunix: [ID 936769 kern.info] sd18 is 
/pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w5c0f045717af,0
Jul 29 17:05:34 store2 genunix: [ID 408114 kern.info] 
/pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w5c0f045717af,0 (sd18) 
online
Jul 29 17:05:38 store2 genunix: [ID 583861 kern.info] sd19 at mpt_sas2: 
unit-address w5c0f011a48fb,0: w5c0f011a48fb,0
Jul 29 17:05:38 store2 genunix: [ID 936769 kern.info] sd19 is 
/pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w5c0f011a48fb,0
Jul 29 17:05:38 store2 genunix: [ID 408114 kern.info] 
/pci@0,0/pci8086,e04@2/pci1000,3080@0/iport@f0/disk@w5c0f011a48fb,0 (sd19) 
online
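The per-drive delay can be read straight off timestamps like the ones above;
a sketch, assuming a log excerpt saved to ./boot.log in the syslog format
shown (keying on the drive-attach lines that contain "at mpt_sas"):

```shell
# Print the gap, in seconds, between consecutive drive-attach log lines.
# Assumes syslog-style timestamps ("Jul 29 17:05:25 ...") in ./boot.log.
awk '/ at mpt_sas/ {
    split($3, t, ":")                 # HH:MM:SS -> seconds since midnight
    now = t[1]*3600 + t[2]*60 + t[3]
    if (prev) printf "%s %s %s: +%ds\n", $1, $2, $3, now - prev
    prev = now
}' boot.log
```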

Thanks.




Michael Talbott
Systems Administrator
La Jolla Institute

 On Jul 29, 2015, at 4:03 PM, Michael Rasmussen m...@miras.org wrote:
 
 On Wed, 29 Jul 2015 17:07:52 -0500
 Schweiss, Chip c...@innovates.com wrote:
 
 The only other thing that comes to mind is that you mentioned you have only
 a single SAS path to these disks.   Have you disabled multipath?  (stmsboot
 -d)
 



Re: [OmniOS-discuss] Slow Drive Detection and boot-archive

2015-07-29 Thread Schweiss, Chip
On Fri, Jul 24, 2015 at 5:03 PM, Michael Talbott mtalb...@lji.org wrote:

 Hi,

 I've downgraded the cards (LSI 9211-8e) to v.19 and disabled their boot
 bios. But I'm still getting the 8 second per drive delay after the kernel
 loads. Any other ideas?


8 seconds is way too long.  What JBODs and disks are you using?  Could it be
that they are powered off, and the delay is in waiting for the power-on
command to complete?  This could be accelerated by using lsiutil to send them
all power-on commands first.

While I still consider it slow, my OmniOS systems with LSI HBAs discover
about 2 disks per second.  On systems with lots of disks, all multipathed, it
still adds up to a long time to discover them all.

-Chip



 
Re: [OmniOS-discuss] Slow Drive Detection and boot-archive

2015-07-24 Thread Michael Talbott
Hi,

I've downgraded the cards (LSI 9211-8e) to v.19 and disabled their boot bios. 
But I'm still getting the 8 second per drive delay after the kernel loads. Any 
other ideas?



Michael Talbott
Systems Administrator
La Jolla Institute


Re: [OmniOS-discuss] Slow Drive Detection and boot-archive

2015-07-21 Thread Floris van Essen ..:: House of Ancients Amstafs ::..
Michael,

I know v20 does cause lots of issues.
V19, to the best of my knowledge, doesn't contain any, so I would downgrade
to v19.


Kr,


Floris

Re: [OmniOS-discuss] Slow Drive Detection and boot-archive

2015-07-21 Thread Richard Elling

 On Jul 20, 2015, at 7:56 PM, Michael Talbott mtalb...@lji.org wrote:
 
 Thanks for the reply. The bios for the card is disabled already. The 8 second 
 per drive scan happens after the kernel has already loaded and it is scanning 
 for devices. I wonder if it's due to running newer firmware. I did update the 
 cards to fw v.20.something before I moved to omnios. Is there a particular 
 firmware version on the cards I should run to match OmniOS's drivers?

Google "LSI P20 firmware" for many tales of woe across many different OSes.
Be aware that getting the latest version of the firmware from Avago might not
be obvious... the latest version is 20.00.04.00 for Windows.
 -- richard

 
 
 
 Michael Talbott
 Systems Administrator
 La Jolla Institute
 
 On Jul 20, 2015, at 6:06 PM, Marion Hakanson hakan...@ohsu.edu wrote:
 
 Michael,
 
 I've not seen this;  I do have one system with 120 drives and it
 definitely does not have this problem.  A couple with 80+ drives
 are also free of this issue, though they are still running OpenIndiana.
 
 One thing I pretty much always do here is to disable the boot option
 in the LSI HBA's config utility (accessible during boot, after the
 BIOS has started up).  I do this because I don't want the BIOS
 thinking it can boot from any of the external JBOD disks;  And also
 because I've had some system BIOS crashes when they tried to enumerate
 too many drives.  But, this all happens at the BIOS level, before the
 OS has even started up, so in theory it should not affect what
 you are seeing.
 
 Regards,
 
 Marion
 
 
 

Re: [OmniOS-discuss] Slow Drive Detection and boot-archive

2015-07-20 Thread Marion Hakanson
Michael,

I've not seen this;  I do have one system with 120 drives and it
definitely does not have this problem.  A couple with 80+ drives
are also free of this issue, though they are still running OpenIndiana.

One thing I pretty much always do here is to disable the boot option
in the LSI HBA's config utility (accessible during boot, after
the BIOS has started up).  I do this because I don't want the BIOS
thinking it can boot from any of the external JBOD disks;  And also
because I've had some system BIOS crashes when they tried to enumerate
too many drives.  But, this all happens at the BIOS level, before the
OS has even started up, so in theory it should not affect what
you are seeing.

Regards,

Marion




Re: [OmniOS-discuss] Slow Drive Detection and boot-archive

2015-07-20 Thread Michael Talbott
Thanks for the reply. The BIOS for the card is already disabled. The 8-second 
per-drive scan happens after the kernel has already loaded, while it is scanning 
for devices. I wonder if it's due to running newer firmware; I did update the 
cards to fw v.20.something before I moved to OmniOS. Is there a particular 
firmware version I should run on the cards to match OmniOS's drivers?



Michael Talbott
Systems Administrator
La Jolla Institute


[OmniOS-discuss] Slow Drive Detection and boot-archive

2015-07-17 Thread Michael Talbott
I have multiple NAS servers I've moved to OmniOS, and each of them has 90-180 
4T disks. Everything has worked out pretty well for the most part. But I've run 
into an issue: when I reboot any of them, I get boot-archive service timeouts. 
I found a workaround of increasing the timeout value, which brings me to the 
following. As you can see in the dmesg output below, it's taking the kernel 
about 8 seconds to detect each of the drives. They're connected via a couple 
of SAS2008-based LSI cards.

Is this normal?
Is there a way to speed that up?

I've fixed my frustrating boot-archive timeout problem by adjusting the timeout 
value from the default of 60 seconds (I guess that'll work ok on systems with 
fewer than 8 drives?) to 8 seconds * 90 drives + a little extra time = 280 
seconds (for the 90-drive systems). That means it takes between 12 and 24 
minutes to boot those machines up.

# svccfg -s boot-archive setprop start/timeout_seconds=280
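
As a sketch, that sizing rule (per-drive delay * drive count + a margin) can be 
scripted. The 8 s/drive rate and the 60-second margin here are assumptions taken 
from this thread's observations, so tune them to what dmesg actually shows:

```shell
#!/bin/sh
# Sketch: derive a boot-archive SMF timeout from the drive count.
# Assumes ~8 s of scan time per drive plus a safety margin (both tunable).
DRIVES=90
PER_DRIVE_SECS=8
MARGIN_SECS=60

TIMEOUT=$(( DRIVES * PER_DRIVE_SECS + MARGIN_SECS ))
echo "start/timeout_seconds=$TIMEOUT"    # prints start/timeout_seconds=780

# Apply and push into the running snapshot (run as root on the OmniOS box):
# svccfg -s boot-archive setprop start/timeout_seconds=$TIMEOUT
# svcadm refresh boot-archive
```

Note that a property set with svccfg setprop only reaches the running service 
after an svcadm refresh.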

I figure I can't be the only one. A little googling also revealed: 
https://www.illumos.org/issues/4614

Jul 17 15:40:15 store2 genunix: [ID 583861 kern.info] sd29 at mpt_sas3: 
unit-address w5c0f0401bd43,0: w5c0f0401bd43,0
Jul 17 15:40:15 store2 genunix: [ID 936769 kern.info] sd29 is 
/pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f0401bd43,0
Jul 17 15:40:16 store2 genunix: [ID 408114 kern.info] 
/pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f0401bd43,0 (sd29) 
online
Jul 17 15:40:24 store2 genunix: [ID 583861 kern.info] sd30 at mpt_sas3: 
unit-address w5c0f045679c3,0: w5c0f045679c3,0
Jul 17 15:40:24 store2 genunix: [ID 936769 kern.info] sd30 is 
/pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f045679c3,0
Jul 17 15:40:24 store2 genunix: [ID 408114 kern.info] 
/pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f045679c3,0 (sd30) 
online
Jul 17 15:40:33 store2 genunix: [ID 583861 kern.info] sd31 at mpt_sas3: 
unit-address w5c0f045712b3,0: w5c0f045712b3,0
Jul 17 15:40:33 store2 genunix: [ID 936769 kern.info] sd31 is 
/pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f045712b3,0
Jul 17 15:40:33 store2 genunix: [ID 408114 kern.info] 
/pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f045712b3,0 (sd31) 
online
Jul 17 15:40:42 store2 genunix: [ID 583861 kern.info] sd32 at mpt_sas3: 
unit-address w5c0f04571497,0: w5c0f04571497,0
Jul 17 15:40:42 store2 genunix: [ID 936769 kern.info] sd32 is 
/pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f04571497,0
Jul 17 15:40:42 store2 genunix: [ID 408114 kern.info] 
/pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f04571497,0 (sd32) 
online
Jul 17 15:40:50 store2 genunix: [ID 583861 kern.info] sd33 at mpt_sas3: 
unit-address w5c0f042ac8eb,0: w5c0f042ac8eb,0
Jul 17 15:40:50 store2 genunix: [ID 936769 kern.info] sd33 is 
/pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f042ac8eb,0
Jul 17 15:40:50 store2 genunix: [ID 408114 kern.info] 
/pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f042ac8eb,0 (sd33) 
online
Jul 17 15:40:59 store2 genunix: [ID 583861 kern.info] sd34 at mpt_sas3: 
unit-address w5c0f04571473,0: w5c0f04571473,0
Jul 17 15:40:59 store2 genunix: [ID 936769 kern.info] sd34 is 
/pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f04571473,0
Jul 17 15:40:59 store2 genunix: [ID 408114 kern.info] 
/pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f04571473,0 (sd34) 
online
Jul 17 15:41:08 store2 genunix: [ID 583861 kern.info] sd35 at mpt_sas3: 
unit-address w5c0f042c636f,0: w5c0f042c636f,0
Jul 17 15:41:08 store2 genunix: [ID 936769 kern.info] sd35 is 
/pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f042c636f,0
Jul 17 15:41:08 store2 genunix: [ID 408114 kern.info] 
/pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f042c636f,0 (sd35) 
online
Jul 17 15:41:17 store2 genunix: [ID 583861 kern.info] sd36 at mpt_sas3: 
unit-address w5c0f0401bf2f,0: w5c0f0401bf2f,0
Jul 17 15:41:17 store2 genunix: [ID 936769 kern.info] sd36 is 
/pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f0401bf2f,0
Jul 17 15:41:17 store2 genunix: [ID 408114 kern.info] 
/pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f0401bf2f,0 (sd36) 
online
Jul 17 15:41:25 store2 genunix: [ID 583861 kern.info] sd38 at mpt_sas3: 
unit-address w5c0f0401bc1f,0: w5c0f0401bc1f,0
Jul 17 15:41:25 store2 genunix: [ID 936769 kern.info] sd38 is 
/pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f0401bc1f,0
Jul 17 15:41:26 store2 genunix: [ID 408114 kern.info] 
/pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w5c0f0401bc1f,0 (sd38) 
online



Michael Talbott
Systems Administrator
La Jolla Institute

___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com

Re: [OmniOS-discuss] Slow Drive Detection and boot-archive

2015-07-17 Thread Michael Talbott
Just realized my typo. I'm using this on my 90 and 180 drive systems:

# svccfg -s boot-archive setprop start/timeout_seconds=720
# svccfg -s boot-archive setprop start/timeout_seconds=1440
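
Those values line up with the per-drive arithmetic: at roughly 8 s per drive, 
90 drives come to 720 s (12 minutes) and 180 drives to 1440 s (24 minutes). A 
quick check of that arithmetic (the 8 s/drive rate is the observed figure from 
the dmesg output, not a fixed constant):

```shell
#!/bin/sh
# Sanity-check the timeout values against the observed ~8 s/drive scan rate.
for drives in 90 180; do
  secs=$(( drives * 8 ))
  echo "$drives drives -> $secs s ($(( secs / 60 )) min)"
done
# prints:
# 90 drives -> 720 s (12 min)
# 180 drives -> 1440 s (24 min)
```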

Seems like 8 seconds to detect each drive is pretty excessive.

Any ideas on how to speed that up?



Michael Talbott
Systems Administrator
La Jolla Institute
