So this tells us that the pool is not suspended because of any errors. The next thing to look at is whether the I/Os are simply not returning. Are you able to generate a crash dump?

Here are some other things to try:

# echo "::stacks -m zfs"  | mdb -k > zfs_threads.out
# echo "::stacks -c spa_sync" | mdb -k
# echo "::zio_state -r" | mdb -k > zios.out

- George

On 12/11/12 6:15 AM, Gabriele Bulfon wrote:
I got the problem again today. I had moved the iSCSI volume to the Areca and left only CIFS on the Adaptec card, and now only CIFS is not running. This is probably a problem with the Adaptec card itself, but I don't know
whether it's the combination of the illumos kernel and this card.
I see no sign of an error anywhere, but zfs commands do not respond on the filesystems over the Adaptec.
Even df -h does not return.

I tried the "walk spa" command during the failure, but it output the same:

sonicle@xstorage:~# echo "::walk spa | ::print spa_t spa_name spa_suspended" | mdb -k
spa_name = [ "adaptec" ]
spa_suspended = 0
spa_name = [ "areca" ]
spa_suspended = 0
spa_name = [ "rpool1" ]
spa_suspended = 0


------------------------------------------------------------------------


*From:* George Wilson <[email protected]>
*To:* Gabriele Bulfon <[email protected]>
*Cc:* [email protected]
*Date:* 27 November 2012 15:10:45 CET
*Subject:* Re: [discuss] Again: illumos based ZFS storage failure


    Was this data you provided below from a time when the server was
    hung? If not, try running this next time you see the issue.

    - George

    On 11/23/12 9:44 AM, Gabriele Bulfon wrote:
    Hi,


    here is the output. Looks sane:

    sonicle@xstorage:~# echo "::walk spa | ::print spa_t spa_name
    spa_suspended" | mdb -k
    spa_name = [ "adaptec" ]
    spa_suspended = 0
    spa_name = [ "areca" ]
    spa_suspended = 0
    spa_name = [ "rpool1" ]
    spa_suspended = 0




    ------------------------------------------------------------------------


    *From:* George Wilson <[email protected]>
    *To:* [email protected]
    *Cc:* Gabriele Bulfon <[email protected]>
    *Date:* 23 November 2012 14:46:55 CET
    *Subject:* Re: [discuss] Again: illumos based ZFS storage failure


        It's possible that the adaptec pool has suspended because of
        some error on the storage. Can you run the following as root
        and provide the output:

        echo "::walk spa | ::print spa_t spa_name spa_suspended" | mdb -k

        - George

        On 11/23/12 5:43 AM, Gabriele Bulfon wrote:
        Hi, I got the same problem this morning.

        Thanks to Alasdair's suggestion, I commented out the
        "quota" command in /etc/profile, so I could get to the
        bash prompt and investigate.

        As a summary of the problem:
        - 3 ZFS pools (rpool, areca, adaptec), each on a different
        controller: rpool is a mirror of internal disks; areca is a
        raidz on 7 SATA disks of the Areca controller plus half the
        space of the SD as log device; adaptec is a raidz on 8 disks
        of the Adaptec controller plus half the space of the SD as
        log device.
        - the areca space is used for NFS sharing to unix servers,
        and always responds.
        - the adaptec space is used for CIFS sharing and an iSCSI
        volume for the Windows PDC.

        Then we have a VMware server with an iSCSI resource store,
        given to the virtualized PDC as a secondary disk for SQL
        Server data and more. The PDC boots directly from the
        VMware server's disks.

        All at once, both CIFS from the storage and the PDC's iSCSI
        disk stop responding.
        CIFS probably fails because the PDC's AD is not responding,
        itself probably busy checking the iSCSI disk in a loop.

        From the bash prompt on the storage:

        - zpool status shows everything fine; every pool is correct.
        - /var/adm/messages shows only smbd timeouts with the PDC,
        no hardware or ZFS problems.
        - at the time of failure, fmdump -eVvm showed the same
        previously found errors, from 3 days earlier
        - after rebooting all the infrastructure, fmdump -eVvm
        showed the same previously found errors around the time the
        storage rebooted, not at the time of the experienced
        failures. There is one such error for each disk of the
        Adaptec controller (cut & paste at the end)
        - a zfs list areca/* showed all the areca filesystems
        - a zfs list adaptec blocked and never returned
        - any access to the zfs structure of the adaptec would block
        - as suggested by Alasdair, I ran savecore -L (I checked
        that the dump device exists and has enough space)
        - the savecore command ran for some time until reaching
        100%, then blocked and never returned.
        - I could not "init 5" the storage; it never returned
        - I tried sending the power-off signal with the power
        button; the console showed its intention to power off,
        but it never did
        - I forced power off via the power button.
        - Once everything was powered on again, everything ran fine.
        - I looked for the dump in /var/crash, but there was no
        /var/crash at all.

        How can I investigate this problem further?
        Do I have any chance of finding the savecore output in the
        dump device, even though I have no /var/crash?
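
        (A sketch of something that might be worth trying, though
        whether anything usable survived the forced power-off is
        uncertain: savecore writes into the directory configured by
        dumpadm, so check that configuration, create the directory
        if it is missing, and ask savecore to extract from the dump
        device even if the header is no longer marked valid. The
        hostname-based path below is just the usual default:)

        dumpadm
        mkdir -p /var/crash/$(hostname)
        savecore -vd /var/crash/$(hostname)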

        Here is the fmdump output:

        Nov 23 2012 09:25:04.422282821
        ereport.io.scsi.cmd.disk.dev.uderr
        nvlist version: 0
        class = ereport.io.scsi.cmd.disk.dev.uderr
        ena = 0x2b08d604c300001
        detector = (embedded nvlist)
        nvlist version: 0
        version = 0x0
        scheme = dev
        device-path =
        /pci@0,0/pci8086,3595@2/pci8086,370@0/pci9005,2bc@e/disk@7,0
        devid = id1,sd@TAdaptec_3805____________8366BAF3
        (end detector)

        devid = id1,sd@TAdaptec_3805____________8366BAF3
        driver-assessment = fail
        op-code = 0x1a
        cdb = 0x1a 0x0 0x8 0x0 0x18 0x0
        pkt-reason = 0x0
        pkt-state = 0x1f
        pkt-stats = 0x0
        stat-code = 0x0
        un-decode-info = sd_get_write_cache_enabled: Mode Sense
        caching page code mismatch 0


        __ttl = 0x1





        
----------------------------------------------------------------------------------

        From: Alasdair Lumsden <[email protected]>
        To: [email protected]
        Date: 10 November 2012 14:37:20 CET
        Subject: Re: [discuss] illumos based ZFS storage failure

            I haven't read the whole thread, but the next time it
            happens you'll want to invoke a panic and make the dump
            file available. You'll want to ensure that:

            1. Multithreaded dump is disabled in /etc/system with:

            * Disable MT dump
            set dump_plat_mincpu=0

            Without this there is a risk of your dump not saving
            correctly.

            2. That you have a dump device and that it's big enough to
            capture your kernel size (zfs set volsize=X rpool/dump)

            3. That dumpadm is happy and set to save cores etc:

            dumpadm -y -z on -c kernel -d /dev/zvol/dsk/rpool/dump
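
            A quick way to double-check all three before the next
            hang (a sketch, assuming the default rpool/dump zvol):

            grep dump_plat_mincpu /etc/system
            zfs get volsize rpool/dump
            dumpadm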

            There's lots of good info here:

            http://wiki.illumos.org/display/illumos/How+To+Report+Problems

            You can also inspect things with mdb while the system is
            up, but if it's a production system normally you want to
            get it rebooted and into production again ASAP. So in that
            situation, you can take a dump of the running system with:

            savecore -L

            One thing to keep in mind is that /etc/profile runs
            /usr/sbin/quota, which can screw over logins when the ZFS
            subsystem is unhappy. I really think it should be removed
            by default, since on most systems quotas aren't even used.
            So comment it out - we do so on all our systems. This will
            give you a better chance of logging in when things go
            wrong.
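
            On our boxes the edit ends up looking roughly like this
            (a sketch; the exact line in /etc/profile varies by
            distribution):

            # quota check disabled: it hangs logins when ZFS is wedged
            # /usr/sbin/quota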

            I think there's a way to SSH in bypassing /etc/profile,
            but I can't remember what it is - perhaps someone can
            chime in.
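
            (One approach that should work, though I'm noting it here
            as an assumption rather than something confirmed: run a
            non-login shell directly over ssh, since /etc/profile is
            only read by login shells. The host name is a placeholder:)

            ssh -t root@storage-host /usr/bin/bash --noprofile --norc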

            Good luck. Centralised storage is difficult to do, and
            when it goes wrong everything that depends on it goes
            down. It's an "all your eggs in one giant failbasket"
            situation. Doing it homebrew with ZFS is cost-effective
            and can be fast, but it is also risky. This is why there
            are companies like Nexenta out there with certified
            combinations of hardware and software engineered to work
            together. This extends to validating firmware combinations
            of disks/HBAs/etc.

            Cheers,

            Alasdair










