Hi,
Since no one on the ZFS list seems to have any clues and my problem seems to be
driver-related, I'm reposting to driver-discuss in the hope that someone
can help.
Thanks and have a nice day,
Arnaud
From: Arnaud Brand
Sent: Friday, January 8, 2010 4:55 PM
To: [email protected]
Cc: Arnaud Brand
Subject: ZFS partially hangs when removing an rpool mirrored disk while having
some IO on another pool on another partition of the same disk
Hello,
Sorry for the (very) long subject, but I've pinpointed the problem to this
exact situation.
I know about the other threads related to hangs, but in my case there was no
"zfs destroy" involved, nor any compression or deduplication.
To make a long story short, when
- a disk contains 2 partitions (p1 = 32 GB, p2 = 1800 GB), and
- p1 is used as part of a ZFS mirror of rpool, and
- p2 is used as part of a raidz (tested raidz1 and raidz2) of tank, and
- some serious work is underway on tank (tested write, copy, scrub),
then physically removing the disk makes ZFS partially hang. Putting the
physical disk back does not help.
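For reference, here's roughly how I trigger it from a shell (the device and
file names below are just examples, not my exact ones):

  # generate sustained write load on tank (a scrub works too)
  dd if=/dev/zero of=/tank/bigfile bs=1024k count=50000 &
  # ...then physically pull a disk whose p1 is in the rpool mirror
  # and whose p2 is in tank; within seconds:
  zpool iostat -v tank 5    # freezes
  zpool status rpool        # never returns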
For the long story:
About the hardware:
1 x Intel X25-E (64 GB SSD), 15 x 2 TB SATA drives (7 x WD, 8 x Hitachi),
2 x quad-core Xeon, 12 GB RAM, 2 x Areca 1680 (8-port SAS controllers),
Tyan S7002 mainboard.
About the software/firmware:
OpenSolaris b130 installed on the SSD drive, on the first 32 GB.
The Areca cards are configured as JBOD and are running the latest release
firmware.
Initial setup:
We created a 32 GB partition on all of the 2 TB drives and mirrored the system
partition, giving us a 16-way rpool mirror.
The rest of each 2 TB drive's space was put in a second partition and used for
a raidz2 pool (named tank).
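From memory, the setup was done roughly like this (illustrative device names,
and I may be off on the exact syntax):

  # attach each 2TB drive's 32 GB first partition to the rpool mirror
  # (repeated for each of the 15 drives, plus installgrub on each
  # to make them bootable):
  zpool attach -f rpool c4t0d0s0 c4t0d1s0
  # create tank as a raidz2 of the large second partitions
  # (all 15 p2 partitions were listed in the real command):
  zpool create tank raidz2 c4t0d1p2 c4t0d2p2 c4t0d3p2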
Problem:
Whenever we physically removed a disk from its tray while doing some speed
testing on the tank pool, the system hung.
At that time I hadn't read all the threads about ZFS hangs and couldn't
determine whether the whole system was hung or just ZFS.
In order to pinpoint the problem, we tried a second setup.
Second setup:
I reduced the number of partitions in the rpool mirror down to 3 (p1 from the
SSD, p1 from a 2TB drive on the same controller as the SSD and p1 from a 2TB
drive on the other controller).
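In practice that just meant detaching the extra sides, something like
(illustrative device names):

  zpool detach rpool c4t0d2s0
  zpool detach rpool c4t0d3s0
  # ...and so on, down to the SSD plus one drive per controller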
Problem:
When the system is quiet, I am able to physically remove any disk, plug it back
and resilver it.
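The plug-it-back part is the usual dance (again an illustrative device name):

  cfgadm -al                      # check that the disk re-attached
  zpool online rpool c4t0d3s0     # or zpool clear, to bring the side back
  zpool status rpool              # watch the resilver progress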
When I am putting some load on the tank pool, I can remove any disk that does
*not* contain the rpool mirror (I can plug it back and resilver it while the
load keeps running without noticeable performance impact).
When I am putting some load on the tank pool, I cannot physically remove a disk
that also holds part of the rpool mirror: if I do, ZFS partially hangs.
When I say partially, I mean that:
- zpool iostat -v tank 5 freezes
- any zpool command related to rpool gets stuck (for example zpool clear rpool c4t0d7s0, or zpool status rpool)
I can't launch new programs, but already-launched programs continue to run (at
least in an ssh session, since GNOME becomes more and more frozen as you move
from window to window).
From ssh sessions:
- prstat shows that only gnome-system-monitor, Xorg, ssh, bash and various
*stat utils (prstat, fsstat, iostat, mpstat) are consuming some CPU.
- zpool iostat -v tank 5 is frozen (it freezes when I issue a zpool clear rpool
c4t0d7s0 in another session)
- iostat -xn is not stuck but has shown all zeroes since the very moment zpool
iostat froze (which is quite strange if you look at the fsstat output below).
NB: when I say all zeroes, I really mean it; it's not zero dot something, it's
zero dot zero.
- mpstat shows normal activity (almost nothing since this is a test machine, so
only a few percent are used, but it still shows some activity and refreshes
correctly)
CPU minf mjf xcal intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0    0   0  125  428  109  113    1    4    0    0   251    2   0   0  98
  1    0   0   20   56   16   44    2    2    1    0   277   11   1   0  88
  2    0   0  163  152   13  309    1    3    0    0  1370    4   0   0  96
  3    0   0   19  111   41   90    0    4    0    0    80    0   0   0 100
  4    0   0   69  192   17   66    0    3    0    0    20    0   0   0 100
  5    0   0   10   61    7   92    0    4    0    0   167    0   0   0 100
  6    0   0   96  191   25   74    0    4    1    0     5    0   0   0 100
  7    0   0   16   58    6   63    0    3    1    0    59    0   0   0 100
- fsstat -F 5 shows all zeroes except for the zfs line (the figures below stay
almost the same over time)
 new  name  name   attr attr lookup rddir read  read write write
file remov  chng    get  set    ops   ops  ops bytes   ops bytes
   0     0     0  1,25K    0  2,51K     0  803 11,0M   473 11,0M zfs
- disk LEDs show no activity
- I cannot run any other command (neither from ssh, nor from gnome)
- I cannot open another ssh session (I don't even get the login prompt in PuTTY)
- I can successfully ping the machine
- I cannot establish a new CIFS session (the login prompt should not appear
since the machine is in an Active Directory domain, but when it's stuck the
prompt appears and I cannot authenticate; I guess this is because LDAP or
Kerberos or whatever cannot be read from rpool), but an already-active session
stays open (last time I even managed to create a text file with a few lines
in it that was still there after I hard-rebooted).
- after some time (an hour or so) my ssh sessions eventually got stuck too.
Having worked for quite some time with OpenSolaris/ZFS (though not with so many
disks), I doubted the problem came from OpenSolaris, and I had already opened a
case with Areca's tech support, who are trying (at least they told me so) to
reproduce the problem. That was before I read about the ZFS hang issues.
We've found a workaround: we're going to add one internal 2.5" disk to mirror
rpool and dedicate the whole of the 2 TB disks to tank, but:
- I thought it might somehow be related to the other hang issues (and so it
might help developers to hear about other similar issues)
- I would really like to rule out an OpenSolaris bug so that I can bring proof
to Areca that their driver has a problem, and either request them to correct it
or request my supplier to replace the cards with working ones.
I think their driver is at fault because I found a message in /var/adm/messages
saying "WARNING: arcmsr duplicate scsi_hba_pkt_comp(9F) on same scsi_pkt(9S)"
and, immediately after that, "WARNING: kstat rcnt == 0 when exiting runq,
please check". Then the system was hung. The comments in the code that
introduced these warnings tell us that drivers behaving this way could panic
the system (or at least that's how I understood those comments).
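In case it helps, here's a rough DTrace sketch I've been thinking of running to
catch the double completion in the act (I'm assuming the fbt probes on these
DDI routines fire as expected; packets completed through paths other than
scsi_transport would show up as false positives):

  pfexec dtrace -qn '
      /* mark packets handed to the HBA */
      fbt::scsi_transport:entry { inflight[arg0] = 1; }
      /* a completion for a packet not in flight is suspicious */
      fbt::scsi_hba_pkt_comp:entry /!inflight[arg0]/ {
          printf("possible duplicate completion: pkt 0x%x at %Y\n",
              arg0, walltimestamp);
      }
      /* normal completion: clear the flag */
      fbt::scsi_hba_pkt_comp:entry /inflight[arg0]/ {
          inflight[arg0] = 0;
      }'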
I've reached my limits here.
If anyone has ideas on how to determine what's going on, on other information I should publish, or on other things to run or check, I'll be happy to try them.
If I can be of any help in troubleshooting the ZFS hang problems that others are experiencing, I'd be happy to give a hand.
Thanks and have a nice day,
Regards,
Arnaud Brand