On Mon, 14 Jan 2013, Brian Buhrow wrote:

> Hello. I'm working on some patches to make the LSI Fusion SCSI driver
> (mpt(4)) more robust. I'm making good progress, but I've run into an
> issue that has momentarily baffled me. If I get a bunch of concurrent jobs
> running on a filesystem mounted on a raid set using disks across two
> mpt(4) instances, they get into a state where they become deadlocked and
> all but one of the processes is stuck in tstile, and the other remaining
> process is in uvn_fp2. All the processes are trying to read the same file
> in the filesystem, not write it, but read it. I have a debug version of
> the kernel, and the machine is running, and other operations against the
> filesystem work fine and complete successfully. I'm assuming the problem is
> something I've introduced into the mpt(4) driver, though I'm not sure how
> at the moment, since I've not been able to reproduce it in an alternative
> environment.
>
> When a process gets into uvn_fp2 state, it's waiting for something to
> find its pages. Is there a way to figure out what it's waiting for and
> which underlying kernel process the uvn_fp2 call is expecting to wake it
> up?
>
> Any help on this issue would be greatly appreciated. I can give a lot more
> details if someone is interested.
If you take a look in uvm_findpage() you'll see that the wait address for
uvn_fp2 should be the page structure itself. You can dump the page
structure and look at the flags and the lock structure to figure out what
state it's in.

Given that you're fiddling around with mpt, the most likely reason for this
sort of behavior is that a disk transaction has been lost. The operation
may have been lost because of some locking issue in the completion
callback, but most likely the firmware lost track of the operation.

If you're writing a SCSI driver properly, you should have a list of all
outstanding operations, and each should have a timeout associated with it
so the driver can determine that an operation has been dropped somewhere
and can abort and retry it. The NetBSD mpt driver does not appear to do
that.

This tends to be a problem with LSI's drivers. They like to assume that the
firmware is faultless, something that is usually not the case.

I generally allocate an array for outstanding commands and use the array
index for the identifier I give to the firmware. Of course, this does put a
hard limit on the number of outstanding commands at any one time. But if
the array fills up it can be reallocated on the fly without losing
outstanding command IDs.

You also need to be careful with command timeouts on certain devices. While
a one- or two-minute timeout should be plenty for a disk-type device, some
operations on SCSI tape drives can take hours to complete.

Eduardo
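
To make the outstanding-command array described above concrete, here is a
minimal user-space sketch in C. It is not the mpt(4) code: every name in it
(cmd_table, cmd_start, cmd_complete, cmd_expire) is hypothetical, time is
passed in by the caller as plain seconds, and the locking a real driver
would need is left out. It only shows the array index doubling as the
command ID, a per-command deadline, and growing the array without
disturbing IDs already handed to the firmware.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; a sketch, not the mpt(4) driver. */
struct cmd_slot {
    int      in_use;    /* nonzero while the command is outstanding */
    uint64_t deadline;  /* absolute time (seconds) after which it counts as lost */
    void    *req;       /* caller's request, handed back on completion or timeout */
};

struct cmd_table {
    struct cmd_slot *slots;
    size_t           nslots;
};

int cmd_table_init(struct cmd_table *t, size_t n)
{
    t->slots = calloc(n, sizeof(*t->slots));
    if (t->slots == NULL)
        return -1;
    t->nslots = n;
    return 0;
}

/* Double the table. Old entries keep their index, so IDs stay valid. */
static int cmd_table_grow(struct cmd_table *t)
{
    size_t n = t->nslots ? t->nslots * 2 : 8;
    struct cmd_slot *p = realloc(t->slots, n * sizeof(*p));

    if (p == NULL)
        return -1;
    memset(p + t->nslots, 0, (n - t->nslots) * sizeof(*p));
    t->slots = p;
    t->nslots = n;
    return 0;
}

/* Record a new command; returns the index to use as its ID, or -1. */
long cmd_start(struct cmd_table *t, void *req, uint64_t now, uint64_t timeout)
{
    size_t i;

    for (i = 0; i < t->nslots; i++)
        if (!t->slots[i].in_use)
            break;
    /* Table full: grow it; i then names the first newly added slot. */
    if (i == t->nslots && cmd_table_grow(t) != 0)
        return -1;
    t->slots[i].in_use = 1;
    t->slots[i].deadline = now + timeout;
    t->slots[i].req = req;
    return (long)i;
}

/* Completion path: returns the request, or NULL for a stale or bogus ID. */
void *cmd_complete(struct cmd_table *t, long id)
{
    if (id < 0 || (size_t)id >= t->nslots || !t->slots[id].in_use)
        return NULL;
    t->slots[id].in_use = 0;
    return t->slots[id].req;
}

/*
 * Periodic sweep: release every command whose deadline has passed and
 * report it through on_lost (may be NULL). A real driver would abort the
 * command in the firmware before reusing the ID; that step is omitted.
 */
size_t cmd_expire(struct cmd_table *t, uint64_t now,
                  void (*on_lost)(long id, void *req))
{
    size_t i, lost = 0;

    for (i = 0; i < t->nslots; i++) {
        if (t->slots[i].in_use && now >= t->slots[i].deadline) {
            t->slots[i].in_use = 0;
            if (on_lost != NULL)
                on_lost((long)i, t->slots[i].req);
            lost++;
        }
    }
    return lost;
}
```

The timeout is a per-command argument so that the caller can pick it by
device type, for example a minute or two for a disk and several hours for
a tape operation.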
