> Actually, there's only one lock for the entire hash table.
> There is code which occasionally must traverse the *entire* hash table,
> which is probably the source of the occasional hangs. This algorithm
> will improve somewhat, though it remains to be seen how well it will
> handle 27000 stat structures. That's some working set.

This sounds familiar. :-)

Q: Why do you traverse the entire hash table? Would it be possible to
split this traversal up into several pieces?

Background: System V UNIX once had (some versions may still have) such a
"traverse all of a table" scan (in this case, of the buffer cache) that ran
once every 30 seconds or so. On a system with a large working set of
buffers, as well as a large buffer cache, this caused a major disturbance
of interactive response. Eric Bina and I changed this to traverse 1/30th
of the table every second (actually, the fraction is tunable). The problem
went away.

Could you adopt this sort of incremental scan of the hash table?
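
To make the suggestion concrete, here is a minimal sketch in C of such an incremental scan. Every name in it (`NBUCKETS`, `struct table`, `scan_slice`, the `visited` counter) is hypothetical and chosen for illustration; none of it is taken from the code under discussion or from the System V change. It assumes a chained hash table and shows only the idea of keeping a cursor so that each call resumes where the previous one stopped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; not the actual code being discussed. */
#define NBUCKETS 1024           /* number of hash buckets (assumed) */

struct entry {
    struct entry *next;         /* chain within one bucket */
    int visited;                /* stand-in for the real per-entry work */
};

struct table {
    struct entry *bucket[NBUCKETS];
    size_t cursor;              /* next bucket the scan will visit */
};

/*
 * Visit about 1/slices of the buckets, resuming at the saved cursor.
 * After `slices` consecutive calls every bucket has been visited at
 * least once (a few are visited twice when NBUCKETS is not a multiple
 * of slices, because the per-call count is rounded up).
 * Returns the number of entries visited by this call.
 *
 * In a table protected by a single lock, the lock would be taken and
 * released inside each call, so other users wait for one slice at most.
 */
size_t scan_slice(struct table *t, size_t slices)
{
    assert(slices > 0);
    size_t per_call = (NBUCKETS + slices - 1) / slices;  /* round up */
    size_t visited = 0;

    for (size_t i = 0; i < per_call; i++) {
        for (struct entry *e = t->bucket[t->cursor]; e != NULL; e = e->next) {
            e->visited++;       /* the real scan's work would go here */
            visited++;
        }
        t->cursor = (t->cursor + 1) % NBUCKETS;  /* wrap at end of table */
    }
    return visited;
}
```

Calling `scan_slice(&t, 30)` from a once-per-second timer would correspond to the 1/30th-per-second scheme; `slices` plays the role of the tunable. Whether this is safe depends on whether the full traversal needs a consistent snapshot of the whole table, which the sketch does not attempt to provide.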
