Hi Garrett,

Thanks for your help.
What do you mean by "unbundled driver"? It's integrated in OpenSolaris. The code was initially provided by Areca but has been integrated and modified since. Please see:
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/intel/io/scsi/adapters/arcmsr/arcmsr.c

I don't know whether this is the SCSI timeout you refer to, but here are the messages found in /var/adm/messages when we remove a drive:

Jan 8 14:29:34 nc-tanktsm arcmsr: [ID 474728 kern.notice] arcmsr1: raid volume was kicked out
Jan 8 14:29:34 nc-tanktsm arcmsr: [ID 832694 kern.warning] WARNING: arcmsr1: tran reset (level 0x1) called for target 0 lun 7
Jan 8 14:29:34 nc-tanktsm arcmsr: [ID 169690 kern.notice] arcmsr1: block read/write command while raidvolume missing (cmd 0a for target 0 lun 7)
Jan 8 14:29:34 nc-tanktsm arcmsr: [ID 832694 kern.warning] WARNING: arcmsr1: tran reset (level 0x1) called for target 0 lun 7
Jan 8 14:29:35 nc-tanktsm arcmsr: [ID 989415 kern.notice] NOTICE: arcmsr1: T0L7 offlined
Jan 8 14:29:36 nc-tanktsm arcmsr: [ID 169690 kern.notice] arcmsr1: block read/write command while raidvolume missing (cmd 0a for target 0 lun 7)
Jan 8 14:29:36 nc-tanktsm arcmsr: [ID 832694 kern.warning] WARNING: arcmsr1: tran reset (level 0x1) called for target 0 lun 7
Jan 8 14:29:36 nc-tanktsm arcmsr: [ID 182018 kern.warning] WARNING: arcmsr1: target reset not supported
Jan 8 14:29:36 nc-tanktsm last message repeated 1 time
Jan 8 14:29:36 nc-tanktsm arcmsr: [ID 832694 kern.warning] WARNING: arcmsr1: tran reset (level 0x0) called for target 0 lun 7
Jan 8 14:29:36 nc-tanktsm last message repeated 1 time
Jan 8 14:29:36 nc-tanktsm arcmsr: [ID 182018 kern.warning] WARNING: arcmsr1: target reset not supported
Jan 8 14:29:36 nc-tanktsm arcmsr: [ID 832694 kern.warning] WARNING: arcmsr1: tran reset (level 0x0) called for target 0 lun 7
Jan 8 14:29:36 nc-tanktsm arcmsr: [ID 169690 kern.notice] arcmsr1: block read/write command while raidvolume missing (cmd 0a for target 0 lun 7)
Jan 8 14:29:36 nc-tanktsm arcmsr: [ID 832694 kern.warning] WARNING: arcmsr1: tran reset (level 0x1) called for target 0 lun 7
Jan 8 14:29:36 nc-tanktsm arcmsr: [ID 182018 kern.warning] WARNING: arcmsr1: target reset not supported
Jan 8 14:29:36 nc-tanktsm arcmsr: [ID 832694 kern.warning] WARNING: arcmsr1: tran reset (level 0x0) called for target 0 lun 7
Jan 8 14:29:36 nc-tanktsm arcmsr: [ID 474728 kern.notice] arcmsr1: raid volume was kicked out

And then it goes on repeating:

Jan 8 14:29:36 nc-tanktsm arcmsr: [ID 169690 kern.notice] arcmsr1: block read/write command while raidvolume missing (cmd 0a for target 0 lun 7)
Jan 8 14:29:36 nc-tanktsm arcmsr: [ID 832694 kern.warning] WARNING: arcmsr1: tran reset (level 0x1) called for target 0 lun 7
Jan 8 14:29:36 nc-tanktsm arcmsr: [ID 182018 kern.warning] WARNING: arcmsr1: target reset not supported
Jan 8 14:29:36 nc-tanktsm arcmsr: [ID 832694 kern.warning] WARNING: arcmsr1: tran reset (level 0x0) called for target 0 lun 7

Sometimes I get:

Jan 8 14:29:46 nc-tanktsm fmd: [ID 377184 daemon.error] syslog-msgs-message-template

And this one too:

Jan 8 14:29:55 nc-tanktsm scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,3...@7/pci17d3,1...@0/d...@0,7 (sd8):
Jan 8 14:29:55 nc-tanktsm 	SYNCHRONIZE CACHE command failed (5)

Or this one:

Jan 7 14:25:54 nc-tanktsm scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,3...@7/pci17d3,1...@0/d...@0,7 (sd8):
Jan 7 14:25:54 nc-tanktsm 	i/o to invalid geometry

And then it continues to repeat the four lines shown above (roughly as many times as they appear before the SCSI message; I can't really tell, there are really a lot of them).
As a side note, since I've read about it on some other forums: I can't see the disk drives in cfgadm, I can only see the controllers. Does this mean the driver is not well integrated with the system and/or that it doesn't support hotplug?

Thanks again,
Arnaud

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Garrett D'Amore
Sent: Sunday, January 10, 2010 18:20
To: Arnaud Brand
Cc: [email protected]
Subject: Re: [driver-discuss] Problem with Areca 1680 SAS Controller

It does sound like a driver bug to me.  Removing the disks should cause
SCSI timeouts, which ZFS ought to be able to cope with.  The problem is
that if the timeouts don't happen and the jobs are left permanently
outstanding, then you get into a hang situation.

I'm not familiar with the Areca driver -- is this an unbundled driver?

 - Garrett

Arnaud Brand wrote:
> Hi,
>
> Since no one seems to have any clues on the ZFS list and my problem
> seems to be driver-related, I'm reposting to driver-discuss in the hope
> that someone can help.
>
> Thanks and have a nice day,
> Arnaud
>
>
> From: Arnaud Brand
> Sent: Friday, January 8, 2010 16:55
> To: [email protected]
> Cc: Arnaud Brand
> Subject: ZFS partially hangs when removing an rpool mirrored disk while
> having some IO on another pool on another partition of the same disk
>
> Hello,
>
> Sorry for the (very) long subject, but I've pinpointed the problem to
> this exact situation.
> I know about the other threads related to hangs, but in my case there
> was no "zfs destroy" involved, nor any compression or deduplication.
>
> To make a long story short, when
> - a disk contains 2 partitions (p1 = 32 GB, p2 = 1800 GB) and
> - p1 is used as part of a ZFS mirror of rpool and
> - p2 is used as part of a raidz (tested raidz1 and raidz2) of tank and
> - some serious work is underway on tank (tested write, copy, scrub),
> then physically removing the disk makes ZFS partially hang. Putting the
> physical disk back does not help.
>
> For the long story:
>
> About the hardware:
> 1 x Intel X25-E (64 GB SSD), 15 x 2 TB SATA drives (7 x WD, 8 x Hitachi),
> 2 x quad-core Xeon, 12 GB RAM, 2 x Areca-1680 (8-port SAS controller),
> Tyan S7002 mainboard.
>
> About the software / firmware:
> OpenSolaris b130 installed on the SSD drive, on the first 32 GB.
> The Areca cards are configured as JBOD and are running the latest
> release firmware.
>
> Initial setup:
> We created a 32 GB partition on all of the 2 TB drives and mirrored the
> system partition, giving us a 16-way rpool mirror.
> The rest of the 2 TB drives' space was put in a second partition and
> used for a raidz2 pool (named tank).
>
> Problem:
> Whenever we physically removed a disk from its tray while doing some
> speed testing on the tank pool, the system hung.
> At that time I hadn't read all the threads about ZFS hangs and couldn't
> determine whether the system was hung or just ZFS.
> In order to pinpoint the problem, we made another setup.
>
> Second setup:
> I reduced the number of partitions in the rpool mirror down to 3 (p1
> from the SSD, p1 from a 2 TB drive on the same controller as the SSD,
> and p1 from a 2 TB drive on the other controller).
>
> Problem:
> When the system is quiet, I am able to physically remove any disk, plug
> it back and resilver it.
> When I am putting some load on the tank pool, I can remove any disk
> that does *not* contain the rpool mirror (I can plug it back and
> resilver it while the load keeps running, without noticeable
> performance impact).
>
> When I am putting some load on the tank pool, I cannot physically
> remove a disk that also contains a mirror of the rpool, or ZFS
> partially hangs.
> When I say partially, I mean that:
> - zpool iostat -v tank 5 freezes
> - if I run any zpool command related to rpool, I'm stuck (zpool clear
>   rpool c4t0d7s0 for example, or zpool status rpool)
>
> I can't launch new programs, but already launched programs continue to
> run (at least in an ssh session, since GNOME becomes more and more
> frozen as you move from window to window).
>
> From ssh sessions:
>
> - prstat shows that only gnome-system-monitor, Xorg, ssh, bash and
>   various *stat utilities (prstat, fsstat, iostat, mpstat) are
>   consuming some CPU.
>
> - zpool iostat -v tank 5 is frozen (it freezes when I issue a zpool
>   clear rpool c4t0d7s0 in another session).
>
> - iostat -xn is not stuck but shows all zeroes since the very moment
>   zpool iostat froze (which is quite strange if you look at the fsstat
>   output hereafter).
>   NB: when I say all zeroes, I really mean it; it's not zero dot
>   something, it's zero dot zero.
>
> - mpstat shows normal activity (almost nothing since this is a test
>   machine, so only a few percent are used, but it still shows some
>   activity and refreshes correctly):
> CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
>   0    0   0  125   428  109  113    1    4    0    0   251    2   0   0  98
>   1    0   0   20    56   16   44    2    2    1    0   277   11   1   0  88
>   2    0   0  163   152   13  309    1    3    0    0  1370    4   0   0  96
>   3    0   0   19   111   41   90    0    4    0    0    80    0   0   0 100
>   4    0   0   69   192   17   66    0    3    0    0    20    0   0   0 100
>   5    0   0   10    61    7   92    0    4    0    0   167    0   0   0 100
>   6    0   0   96   191   25   74    0    4    1    0     5    0   0   0 100
>   7    0   0   16    58    6   63    0    3    1    0    59    0   0   0 100
>
> - fsstat -F 5 shows all zeroes except for the zfs line (the figures
>   hereunder stay almost the same over time):
>  new  name  name  attr  attr lookup rddir  read  read write write
> file remov  chng   get   set    ops   ops   ops bytes   ops bytes
>    0     0     0 1,25K     0  2,51K     0   803 11,0M   473 11,0M zfs
>
> - Disk LEDs show no activity.
>
> - I cannot run any other command (neither from ssh, nor from GNOME).
>
> - I cannot open another ssh session (I don't even get the login prompt
>   in PuTTY).
>
> - I can successfully ping the machine.
>
> - I cannot establish a new CIFS session (the login prompt should not
>   appear since the machine is in an Active Directory domain, but when
>   it's stuck the prompt appears and I cannot authenticate; I guess it's
>   because LDAP or Kerberos or whatever cannot be read from rpool), but
>   an already active session will stay open (last time I even managed to
>   create a text file with a few lines in it that was still there after
>   I hard-rebooted).
>
> - After some time (an hour or so) my ssh sessions eventually got stuck
>   too.
>
> Having worked for quite some time with OpenSolaris/ZFS (though not with
> so many disks), I doubted the problem came from OpenSolaris, and I had
> already opened a case with Areca's tech support, which is trying (at
> least they told me so) to reproduce the problem. That was until I read
> about the ZFS hang issues.
>
> We've found a workaround: we're going to put in one internal 2.5" disk
> to mirror rpool and dedicate the whole of the 2 TB disks to tank, but:
> - I thought it might somehow be related to the other hang issues (and
>   so it might help developers to hear about other similar issues).
> - I would really like to rule out an OpenSolaris bug so that I can
>   bring proof to Areca that their driver has a problem, and either
>   request them to correct it or request my supplier to replace the
>   cards with working ones.
>
> I think their driver is at fault because I found a message in
> /var/adm/messages saying "WARNING: arcmsr duplicate scsi_hba_pkt_comp(9F)
> on same scsi_pkt(9S)" and immediately after that "WARNING: kstat
> rcnt == 0 when exiting runq, please check". Then the system was hung.
> The comments in the code introducing these changes tell us that drivers
> behaving this way could panic the system (or at least that's how I
> understood those comments).
>
> I've reached my limits here.
> If anyone has ideas on how to determine what's going on, on other
> information I should publish, or on other things to run or check, I'll
> be happy to test them.
>
> If I can be of any help in troubleshooting the ZFS hang problems that
> others are experiencing, I'd be happy to give a hand.
>
> Thanks and have a nice day,
>
> Regards,
> Arnaud Brand
>
> _______________________________________________
> driver-discuss mailing list
> [email protected]
> http://mail.opensolaris.org/mailman/listinfo/driver-discuss

_______________________________________________
driver-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/driver-discuss
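
PS: To illustrate what I understand the "duplicate scsi_hba_pkt_comp(9F) on
same scsi_pkt(9S)" warning from my original post (quoted above) to mean,
here is a minimal sketch of how an HBA driver can make sure a command is
completed exactly once, even when both its interrupt handler and its
reset/timeout path try to finish the same command. This is my own
illustration, not the actual arcmsr code: the acb_pkt_t structure, the
ACB_PKT_DONE flag and the acb_pkt_complete() function are made-up names,
and only scsi_hba_pkt_comp(9F), the scsi_pkt(9S) fields and the mutex
routines are real DDI interfaces.

    #include <sys/types.h>
    #include <sys/ddi.h>
    #include <sys/sunddi.h>
    #include <sys/scsi/scsi.h>

    /* Hypothetical per-command context kept by the HBA driver. */
    typedef struct acb_pkt {
            struct scsi_pkt *ap_pkt;        /* the scsi_pkt(9S) being serviced */
            kmutex_t        ap_lock;        /* protects ap_flags */
            uint32_t        ap_flags;
    } acb_pkt_t;

    #define ACB_PKT_DONE    0x01            /* already handed back to the target driver */

    /*
     * Complete a command exactly once.  Without the ACB_PKT_DONE check, a
     * driver whose interrupt handler and reset/timeout path both finish
     * the same command ends up calling scsi_hba_pkt_comp(9F) twice on the
     * same packet, which is what the warning above complains about.
     */
    static void
    acb_pkt_complete(acb_pkt_t *ap, uchar_t reason)
    {
            mutex_enter(&ap->ap_lock);
            if (ap->ap_flags & ACB_PKT_DONE) {
                    mutex_exit(&ap->ap_lock);
                    return;                 /* someone already completed this packet */
            }
            ap->ap_flags |= ACB_PKT_DONE;
            mutex_exit(&ap->ap_lock);

            ap->ap_pkt->pkt_reason = reason;        /* e.g. CMD_CMPLT or CMD_TIMEOUT */
            scsi_hba_pkt_comp(ap->ap_pkt);          /* hand it back, exactly once */
    }

If arcmsr lacks this kind of guard, a command completed once from the
interrupt path and a second time from the reset/timeout path would trigger
that very warning and, as I understand the comments in the framework code,
could throw the kstat run-queue accounting off or even panic the system.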
