On Tue, Jan 11, 2005 at 10:03:03AM -0600, Dave Kleikamp wrote:
> On Tue, 2005-01-11 at 08:20 -0600, Dave Kleikamp wrote:
> > I also noticed that several of the static functions called by diAlloc do
> > show up in this latest stack trace, so I believe I was mistaken about
> > the cause of the earlier deadlock.  I now think that the thread in
> > diAlloc was trying to grab the AG_LOCK, and never made it down into
> > diNewIAG.  I'm afraid there may still be two different problems that
> > still need to be figured out.
>
> No, I think it's the same problem.  I only saw a subset of the blocked
> threads in the earlier case, and I believe there probably was a thread
> blocked in diNewIAG.
>
> This patch simply blocks new transactions earlier when the tlocks are
> starting to get scarce.
>
> When I implemented the multiple commit threads, I split TxLockVHWM out
> of TxLockHWM, which was 80% of nTxLock.  So this patch puts TxLockVHWM
> back to the old value of TxLockHWM.  Maybe I was a little too bold
> waiting until 90% of the tlocks were in use before blocking new
> transactions.  (jfsSync thread is awakened when TxLockHWM is hit, and
> transactions are actually blocked when TxLockVHWM is hit.)

Yeah, that seems to fix it for this particular case.  I wonder if we're
just delaying the inevitable though?  Would it make sense to pre-allocate
tlocks somehow before holding important semaphores?  Admittedly, my
understanding of the txnmgr is a bit limited, so this may be total
nonsense.

Sonny
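
For reference, here is a minimal sketch of the two thresholds described in
the quoted text, written as a standalone model and not taken from the actual
jfs_txnmgr.c code.  The names tx_watermarks, make_watermarks and classify are
hypothetical, the strict ">" comparison is an assumption, and the 70% value
used for TxLockHWM in the usage note is an assumption as well (the thread only
states that TxLockVHWM moves from 90% back to 80% of nTxLock).

```c
/* Illustrative model only; not the JFS transaction manager code. */

/* What the (hypothetical) model decides for a given tlock usage level. */
enum tx_pressure {
    TX_OK,         /* below both watermarks */
    TX_WAKE_SYNC,  /* above TxLockHWM: wake the jfsSync thread */
    TX_BLOCK_NEW   /* above TxLockVHWM: block new transactions */
};

struct tx_watermarks {
    int hwm;   /* models TxLockHWM */
    int vhwm;  /* models TxLockVHWM */
};

/* Derive both watermarks as integer percentages of nTxLock. */
static struct tx_watermarks make_watermarks(int n_tx_lock,
                                            int hwm_pct, int vhwm_pct)
{
    struct tx_watermarks w;

    w.hwm = n_tx_lock * hwm_pct / 100;
    w.vhwm = n_tx_lock * vhwm_pct / 100;
    return w;
}

/* Check the higher watermark first so blocking takes precedence. */
static enum tx_pressure classify(const struct tx_watermarks *w, int in_use)
{
    if (in_use > w->vhwm)
        return TX_BLOCK_NEW;
    if (in_use > w->hwm)
        return TX_WAKE_SYNC;
    return TX_OK;
}
```

With a hypothetical nTxLock of 1000 and 850 tlocks in use,
make_watermarks(1000, 70, 90) classifies the load as TX_WAKE_SYNC, while
make_watermarks(1000, 70, 80) classifies it as TX_BLOCK_NEW.  That is the
effect of the patch in this model: new transactions are held back while a
larger share of the tlocks is still free for threads that already hold
semaphores.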
