Hello,

I have a problem with the following OpenSolaris server: after adding four new 
hard drives and one hot spare (all HDDs are Samsung HD203WI), the server hangs 
when the I/O load rises. 

Server:
Mainboard: Supermicro X8DT3-F
SAS Controller: 3Ware 9690SA-4I
CPU: Intel Xeon 5500

Here is some information about the drive configuration:

format: (the drives at c7 contain the rpool)
AVAILABLE DISK SELECTIONS:
       0. c7t2d0 <DEFAULT cyl 18238 alt 2 hd 255 sec 63>
          /p...@0,0/pci15d9,1...@1f,2/d...@2,0
       1. c7t3d0 <DEFAULT cyl 18239 alt 2 hd 255 sec 63>
          /p...@0,0/pci15d9,1...@1f,2/d...@3,0
       2. c9t0d0 <AMCC-9690SA-4I  DISK-4.10-1.82TB>
          /p...@0,0/pci8086,3...@9/pci13c1,1...@0/s...@0,0
       3. c9t1d0 <AMCC-9690SA-4I  DISK-4.10-1.82TB>
          /p...@0,0/pci8086,3...@9/pci13c1,1...@0/s...@1,0
       4. c9t2d0 <AMCC-9690SA-4I  DISK-4.10-1.82TB>
          /p...@0,0/pci8086,3...@9/pci13c1,1...@0/s...@2,0
       5. c9t3d0 <AMCC-9690SA-4I  DISK-4.10-1.82TB>
          /p...@0,0/pci8086,3...@9/pci13c1,1...@0/s...@3,0
       6. c9t4d0 <AMCC-9690SA-4I  DISK-4.10-1.82TB>
          /p...@0,0/pci8086,3...@9/pci13c1,1...@0/s...@4,0
       7. c9t5d0 <AMCC-9690SA-4I  DISK-4.10-1.82TB>
          /p...@0,0/pci8086,3...@9/pci13c1,1...@0/s...@5,0
       8. c9t6d0 <AMCC-9690SA-4I  DISK-4.10-1.82TB>
          /p...@0,0/pci8086,3...@9/pci13c1,1...@0/s...@6,0
       9. c9t7d0 <AMCC-9690SA-4I  DISK-4.10-1.82TB>
          /p...@0,0/pci8086,3...@9/pci13c1,1...@0/s...@7,0
      10. c9t8d0 <AMCC-9690SA-4I  DISK-4.10-1.82TB>
          /p...@0,0/pci8086,3...@9/pci13c1,1...@0/s...@8,0

zpool status:

stor...@sodom:~# zpool status storage
  pool: storage
 state: ONLINE
 scrub: scrub in progress for 1h49m, 31,49% done, 3h59m to go
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c9t0d0  ONLINE       0     0     0
            c9t1d0  ONLINE       0     0     0
            c9t2d0  ONLINE       0     0     0
            c9t3d0  ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c9t4d0  ONLINE       0     0     0
            c9t5d0  ONLINE       0     0     0
            c9t6d0  ONLINE       0     0     0
            c9t7d0  ONLINE       0     0     0
        spares
          c9t8d0    AVAIL

It seems that a scrub does not make the server crash...

Last but not least, the output generated by the SAS controller CLI:

Unit  UnitType  Status         %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    SINGLE    OK             -       -       -       1862.63   RiW    ON
u1    SINGLE    OK             -       -       -       1862.63   RiW    ON
u2    SINGLE    OK             -       -       -       1862.63   RiW    ON
u3    SINGLE    OK             -       -       -       1862.63   RiW    ON
u4    SINGLE    OK             -       -       -       1862.63   RiW    ON
u5    SINGLE    OK             -       -       -       1862.63   RiW    ON
u6    SINGLE    OK             -       -       -       1862.63   RiW    ON
u7    SINGLE    OK             -       -       -       1862.63   RiW    ON
u8    SINGLE    OK             -       -       -       1862.63   RiW    ON

VPort Status         Unit Size      Type  Phy Encl-Slot    Model
------------------------------------------------------------------------------
p8    OK             u0   1.82 TB   SATA  -   /c9/e0/slt1  SAMSUNG HD203WI
p9    OK             u1   1.82 TB   SATA  -   /c9/e0/slt3  SAMSUNG HD203WI
p10   OK             u2   1.82 TB   SATA  -   /c9/e0/slt5  SAMSUNG HD203WI
p11   OK             u4   1.82 TB   SATA  -   /c9/e0/slt6  SAMSUNG HD203WI
p12   OK             u5   1.82 TB   SATA  -   /c9/e0/slt8  SAMSUNG HD203WI
p13   OK             u3   1.82 TB   SATA  -   /c9/e0/slt10 SAMSUNG HD203WI
p14   OK             u6   1.82 TB   SATA  -   /c9/e0/slt13 SAMSUNG HD203WI
p15   OK             u7   1.82 TB   SATA  -   /c9/e0/slt15 SAMSUNG HD203WI
p16   OK             u8   1.82 TB   SATA  -   /c9/e0/slt17 SAMSUNG HD203WI

/var/adm/messages told me:
...
Jul 20 17:28:37 sodom ahci: [ID 517647 kern.warning] WARNING: ahci0: watchdog port 3 satapkt 0xffffff019a8a0480 timed out
Jul 20 17:29:37 sodom ahci: [ID 517647 kern.warning] WARNING: ahci0: watchdog port 0 satapkt 0xffffff018db7a7a8 timed out
Jul 20 17:29:42 sodom ahci: [ID 517647 kern.warning] WARNING: ahci0: watchdog port 2 satapkt 0xffffff019b3dfbd8 timed out
...
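
To see which controller and ports the watchdog timeouts hit, the messages can 
be tallied per port. This is only a minimal sketch in Python, assuming the 
warnings keep the format shown above; the function name count_timeouts and the 
default log path are illustrative choices, not part of any Solaris tool:

```python
import re
import sys
from collections import Counter

# Matches the ahci watchdog warning format quoted above. \s+ after "watchdog"
# also tolerates a line break inserted by mail wrapping.
TIMEOUT_RE = re.compile(
    r"WARNING: (ahci\d+): watchdog\s+port (\d+) satapkt \S+ timed out"
)


def count_timeouts(log_text):
    """Return a Counter keyed by (controller, port) for each timeout found."""
    counts = Counter()
    for controller, port in TIMEOUT_RE.findall(log_text):
        counts[(controller, int(port))] += 1
    return counts


if __name__ == "__main__":
    # Assumption: the log lives at /var/adm/messages unless a path is given.
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/adm/messages"
    with open(path, errors="replace") as handle:
        totals = count_timeouts(handle.read())
    for (controller, port), n in sorted(totals.items()):
        print(f"{controller} port {port}: {n} timeout(s)")
```

Running it against the full log would show whether the timeouts are spread 
over all ports or concentrated on a few.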

After spending some time on Google, I tried the following "workarounds", which 
didn't work:
Options in /etc/system:
set cpupm_enabled = 0
set idle_cpu_no_deep_c = 1
and /etc/power.conf:
cpu_deep_idle          disable

I hope you guys can help me solve this problem ;)
-- 
This message posted from opensolaris.org
_______________________________________________
storage-discuss mailing list
storage-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/storage-discuss