While I have no objection to discussing extensions to the RX abort packet format, I agree with Jeffrey Hutzelman that we need to understand what we intend to do with them. I do not believe that extending the packet format to describe "throttling" is a good idea.
Throttling is neither an error code nor a reason. The RX abort packet already returns the error code that describes why the request failed. Throttling is an OpenAFS-specific implementation behavior used to prevent a client from consuming too many server-side resources, since OpenAFS at present has no better mechanism for prioritizing client requests and can process only a limited number of transactions simultaneously.

From my perspective, knowing that a packet was delayed due to throttling is useful only to humans who are attempting to debug what is believed to be poor OpenAFS file server performance to a client. I do not believe that a cache manager aware it is being throttled would do anything differently. The cache manager issues RPCs to the file server because an application is performing an action that the cache manager cannot satisfy locally. If the cache manager knew it was being throttled, what should it do differently? Should it begin failing requests locally without sending them to the file server? Should it locally begin throttling the "bad" application?

The most common cause of throttling is a cache manager that issues individual FetchStatus RPCs, which can fail, instead of InlineBulkStatus RPCs, which never fail because they return per-entry error codes inline rather than aborting the call (see the sketch below). Personally, I do not see why the client that issues individual FetchStatus RPCs should be punished for a directory full of access-denied errors when the InlineBulkStatus issuer is not.

Throttling is a very blunt instrument, and instead of making it part of the RX wire protocol, I believe OpenAFS should find an alternative method of prioritizing its resources.
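To make the distinction concrete, here is a rough sketch of the two error-return patterns. The function names are hypothetical stand-ins, not the real generated RXAFS_FetchStatus and RXAFS_InlineBulkStatus signatures, and EACCES merely stands in for whatever access-denied code the file server would return; only the control flow matters here.

    /*
     * Illustration only: these are NOT the generated OpenAFS stubs.
     * fetch_status_rpc() mimics the failure behavior described above
     * for individual FetchStatus calls; inline_bulk_status_rpc()
     * mimics InlineBulkStatus, which reports per-entry errors inline
     * instead of failing the call.
     */
    #include <errno.h>
    #include <stdio.h>

    #define NFILES 3

    /* One RPC per file: an access error fails the whole call. */
    static int fetch_status_rpc(int fid, int *status_out)
    {
        if (fid == 1)         /* pretend fid 1 is unreadable */
            return EACCES;    /* the RPC itself fails */
        *status_out = 0;
        return 0;
    }

    /*
     * One RPC for the whole directory: the call itself succeeds and
     * per-entry error codes come back inline in the output array.
     */
    static int inline_bulk_status_rpc(const int *fids, int n,
                                      int *errors_out)
    {
        int i;
        for (i = 0; i < n; i++)
            errors_out[i] = (fids[i] == 1) ? EACCES : 0;
        return 0;             /* never fails on access errors */
    }

    int main(void)
    {
        int fids[NFILES] = { 0, 1, 2 };
        int status = 0, errors[NFILES], i, code;

        /* Pattern 1: NFILES separate RPCs, one of which fails. */
        for (i = 0; i < NFILES; i++) {
            code = fetch_status_rpc(fids[i], &status);
            printf("FetchStatus(fid %d) -> %d\n", fids[i], code);
        }

        /* Pattern 2: a single RPC; errors are reported inline. */
        inline_bulk_status_rpc(fids, NFILES, errors);
        for (i = 0; i < NFILES; i++)
            printf("InlineBulkStatus entry fid %d -> %d\n",
                   fids[i], errors[i]);

        return 0;
    }

In the first pattern, each access-denied entry fails its own RPC, so a directory full of them produces a burst of failed calls, which is what the throttling described above counts against the client. In the second pattern, the single call always succeeds and the errors ride back inline.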
Jeffrey Altman
