Hi,
here is the output. Looks sane:
sonicle@xstorage:~# echo "::walk spa | ::print spa_t spa_name spa_suspended" | mdb -k
spa_name = [ "adaptec" ]
spa_suspended = 0
spa_name = [ "areca" ]
spa_suspended = 0
spa_name = [ "rpool1" ]
spa_suspended = 0
From:
George Wilson
To:
[email protected]
Cc:
Gabriele Bulfon
Date:
23 November 2012 14:46:55 CET
Subject:
Re: [discuss] Again: illumos based ZFS storage failure
It's possible that the adaptec pool has suspended because of some error on
the storage. Can you run the following as root and provide the output:
echo "::walk spa | ::print spa_t spa_name spa_suspended" | mdb -k
- George
On 11/23/12 5:43 AM, Gabriele Bulfon wrote:
Hi, I got the same problem this morning.
Thanks to Alasdair's suggestion, I commented out the "quota" command
in /etc/profile, so I could enter the bash prompt and investigate.
As a summary of the problem:
- 3 zfs pools (rpool, areca, adaptec), each on a different controller:
rpool as a mirror of internal disks, areca as raidz on 7 SATA disks of
the areca controller plus half the space of the SD as slog, adaptec as
raidz on 8 disks of the adaptec controller plus half the space of the SD
as slog.
- areca space is used for NFS sharing to unix servers, and always
responds.
- adaptec space is used for CIFS sharing and an iSCSI volume for the
Windows PDC.
Then we have a vmware server with an iSCSI resource store, given to the
virtualized PDC as a secondary disk for sqlserver data and some more.
The PDC boots directly from the vmware server disks.
All at once, both CIFS from the storage and the PDC iSCSI disk stopped
responding.
CIFS probably fails because the PDC AD is not responding, probably busy
checking the iSCSI disk in a loop.
Going into the storage bash:
- zpool status shows everything fine; every pool is correct.
- /var/adm/messages shows only smbd timeouts with the PDC, no hardware
or zfs problem.
- at the time of failure, fmdump -eVvm showed the same previously found
errors, from 3 days earlier
- after rebooting all the infrastructure, fmdump -eVvm showed the same
previously found errors around the time of rebooting the storage, not at
the time of the experienced failures. We found one such error for each
disk of the adaptec controller (cut & paste at the end)
- a zfs list areca/* showed all the areca filesystems
- a zfs list adaptec blocked, never returning
- any access to the zfs structure of the adaptec pool would block
- as suggested by Alasdair, I ran savecore -L (I checked that I have a
dump device with enough space)
- the savecore command ran for some time until reaching 100%, then
blocked, never returning
- I could not "init 5" the storage; it never returned
- I tried sending the poweroff signal with the power button; the console
showed its intention to power off, but it never did
- I forced power off via the power button
- Once everything was powered on again, everything ran fine
- I looked for the dump in /var/crash, but I had no /var/crash at all
How can I investigate this problem further?
Do I have any chance of finding the savecore output in the dump device,
even though I have no /var/crash?
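On the last question: savecore normally extracts a crash image from the
configured dump device into a directory, so if the savecore -L image
survived the forced power-off, something may still be retrievable. A
hedged sketch (the target directory is an assumption, run as root;
whether the image survived the hard power cycle is uncertain):

```shell
# Show the configured dump device and savecore directory
dumpadm

# Create the savecore directory if it was never made
mkdir -p /var/crash/`uname -n`

# Try to extract any dump still present on the dump device;
# -v is verbose, the directory argument overrides dumpadm's default
savecore -v /var/crash/`uname -n`
```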
Here is the fmdump output:
Nov 23 2012 09:25:04.422282821 ereport.io.scsi.cmd.disk.dev.uderr
nvlist version: 0
        class = ereport.io.scsi.cmd.disk.dev.uderr
        ena = 0x2b08d604c300001
        detector = (embedded nvlist)
        nvlist version: 0
                version = 0x0
                scheme = dev
                device-path = /pci@0,0/pci8086,3595@2/pci8086,370@0/pci9005,2bc@e/disk@7,0
                devid = id1,sd@TAdaptec_3805____________8366BAF3
        (end detector)
        devid = id1,sd@TAdaptec_3805____________8366BAF3
        driver-assessment = fail
        op-code = 0x1a
        cdb = 0x1a 0x0 0x8 0x0 0x18 0x0
        pkt-reason = 0x0
        pkt-state = 0x1f
        pkt-stats = 0x0
        stat-code = 0x0
        un-decode-info = sd_get_write_cache_enabled: Mode Sense caching page code mismatch 0
        __ttl = 0x1
----------------------------------------------------------------------------------
From: Alasdair Lumsden
To:
[email protected]
Date: 10 November 2012 14:37:20 CET
Subject: Re: [discuss] illumos based ZFS storage failure
I haven't read the whole thread, but the next time it happens you'll
want to invoke a panic and make the dump file available. You'll want to
ensure that:
1. Multithreaded dump is disabled in /etc/system with:
* Disable MT dump
set dump_plat_mincpu=0
Without this there is a risk of your dump not saving correctly.
2. That you have a dump device and that it's big enough to capture your
kernel size (zfs set volsize=X rpool/dump)
3. That dumpadm is happy and set to save cores etc:
dumpadm -y -z on -c kernel -d /dev/zvol/dsk/rpool/dump
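Taken together, steps 1-3 might be applied as the sketch below (the 16G
volsize is an assumption; size the dump zvol to at least your kernel
memory footprint, and the /etc/system change needs a reboot):

```shell
# 1. Disable multithreaded dump (takes effect after a reboot)
echo "* Disable MT dump" >> /etc/system
echo "set dump_plat_mincpu=0" >> /etc/system

# 2. Grow the dump zvol; 16G is an assumed size, use at least the
#    kernel memory footprint of your machine
zfs set volsize=16G rpool/dump

# 3. Run savecore on reboot (-y), compress dumps (-z on), dump kernel
#    pages only (-c kernel), onto the rpool/dump zvol
dumpadm -y -z on -c kernel -d /dev/zvol/dsk/rpool/dump

# Verify the resulting configuration
dumpadm
```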
There's lots of good info here:
http://wiki.illumos.org/display/illumos/How+To+Report+Problems
You can also inspect things with mdb while the system is up, but if it's
a production system you normally want to get it rebooted and into
production again ASAP. So in that situation, you can take a dump of the
running system with:
savecore -L
One thing to keep in mind is that /etc/profile runs /usr/sbin/quota, which
can screw over logins when the zfs subsystem is unhappy. I really think
it should be removed by default since on most systems quotas aren't even
used. So comment it out - we do so on all our systems. This will give
you a better chance of logging in when things go wrong.
I think there's a way to SSH in bypassing /etc/profile but I can't
remember what it is - perhaps someone can chime in.
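One possible approach (an assumption, not verified on illumos
specifically): bash reads /etc/profile only for login shells, so asking
ssh to start a non-login bash with no startup files should skip the
quota call. The hostname is a placeholder:

```shell
# -t allocates a tty for an interactive session; --noprofile/--norc
# stop bash from reading /etc/profile and rc files on startup
ssh -t root@storage /usr/bin/bash --noprofile --norc
```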
Good luck. Centralised storage is difficult to do, and when it goes wrong
everything that depends on it goes down. It's "all your eggs in one
giant failbasket". Doing it homebrew with ZFS is cost effective and can
be fast, but it is also risky. This is why there are companies like
Nexenta out there with certified combinations of hardware and software
engineered to work together. This extends to validating firmware
combinations of disks/HBAs/etc.
Cheers,
Alasdair
-------------------------------------------
illumos-discuss
Archives:
https://www.listbox.com/member/archive/182180/=now
RSS Feed:
https://www.listbox.com/member/archive/rss/182180/21175541-02f10c6f
Modify Your Subscription:
https://www.listbox.com/member/?&;
Powered by Listbox:
http://www.listbox.com