Robert Banz wrote:
>>
>> Could you do some rxdebug calls to the fileserver next time? So we
>> know why it's getting unresponsive.
>> It could be running out of threads. I don't expect that, but it could
>> be ...
>
> The 'symptoms' seem to be, for the most part, volume-specific. Slow
> response to accessing that volume, followed by the clients seeing a
> timeout on it. So, guts-o-the-fileserver folk, is there a volume-wide
> lock that gets set by a particular fileserver thread when it's being
> acted upon? Since deleting a whole-bunch-of-files (a /bin/rm -fr <dir>)
> is happening, that's a whole lot of requests coming in in-series to that
> volume, being taken care of on (probably) a first-come, first-served
> basis, leaving little room for other clients to get an op in on that
> volume?

Let's say that there are N clients who are all attempting to use the contents of directory "D" in the read-write volume "V". Client 1 is making changes to the contents of the directory and clients 2 to N are reading the directory.

In order for each of the N clients to read the directory, they need to perform a FetchData RPC, which registers a callback with the file server on "D". Now each time that client 1 makes a change to the contents of "D", each of the callbacks currently registered with the file server must be broken in order to notify clients 2 to N that the data they have cached is no longer valid. If clients 2 to N are actively using "D", then when the callback break is received from the file server they will in turn attempt a new FetchData operation to obtain the latest data. This in turn registers a new callback.

Now if client 1 is performing 30,000 individual RemoveFile RPCs, it is going to be extremely hard for any of the other clients to maintain a callback until all 30,000 RPCs are completed. As soon as a FetchData operation completes, the callback will be broken and the cache contents will be invalidated.

I don't believe there is a bug here; it's just a negative side effect of client-side caching and the fact that the file system is only given one file name at a time to act on.

Jeffrey Altman
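
The callback churn described above can be sketched with a toy model. This is only an illustration, not OpenAFS code: the class ToyFileServer, the function simulate, and the client names are hypothetical, and it assumes that every reader re-fetches immediately after each callback break and that the client making the change does not have its own callback broken.

```python
class ToyFileServer:
    """Toy model of one directory "D" and its callbacks (not OpenAFS code)."""

    def __init__(self, entries):
        self.entries = set(entries)
        self.callbacks = set()  # clients currently holding a callback on "D"
        self.fetches = 0        # FetchData RPCs served
        self.breaks = 0         # callback breaks sent

    def fetch_data(self, client):
        """Return the directory contents and register a callback for client."""
        self.fetches += 1
        self.callbacks.add(client)
        return frozenset(self.entries)

    def remove_file(self, client, name):
        """Remove one entry, then break every other client's callback."""
        self.entries.discard(name)
        broken = self.callbacks - {client}
        self.callbacks -= broken
        self.breaks += len(broken)
        return broken


def simulate(num_files, num_readers):
    """Client 1 removes num_files entries one RPC at a time while readers poll."""
    server = ToyFileServer(f"file{i}" for i in range(num_files))
    readers = [f"client{i}" for i in range(2, num_readers + 2)]
    for reader in readers:
        server.fetch_data(reader)  # initial read registers a callback
    for i in range(num_files):
        broken = server.remove_file("client1", f"file{i}")
        for reader in sorted(broken):
            server.fetch_data(reader)  # active readers re-fetch at once
    return server.fetches, server.breaks


if __name__ == "__main__":
    print(simulate(30000, 3))  # (90003, 90000)
```

Under these assumptions, with three active readers, the 30,000 removals cost 90,000 callback breaks and 90,003 FetchData RPCs, and no reader holds a callback for longer than the gap between two consecutive removals.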
