On 10/26/2011 11:26 AM, Stephan Wiesand wrote:
> Good point. This problem will vanish with IPv6, though.
>
> Containers could be another solution.
>
> Running multiple fileservers on different ports on the same system would be
> even more efficient. Is this possible or could be implemented (in theory)?

YFS has already contributed much of the work necessary to run multiple
file servers on the same system on different ports. What is still missing
in OpenAFS are the database changes to track file servers by port number
in addition to address, the RPCs to tell the clients about those port
numbers, etc. The requirements have been discussed on
afs3-standardization and at the last couple of AFS hackathons. However,
there is not yet an agreed-upon set of RPCs for the implementation.

>> This also doesn't help you much if the server is getting bogged down due
>> to the I/O in servicing the relevant requests, unless you separate the
>> user volumes physically.
>
> A single 1.4 fileserver is not able to make a decent contemporary fileserver
> sweat. I don't have much data on 1.6 servers.

The 1.6 file servers are not significantly different. The bottlenecks
throughout the file server stack prevent the file server from making full
use of either the available CPU processing or the I/O bandwidth. A
substantial re-architecture of the file server and the vol package is
needed in order to obtain the desired throughput.

>> But for the simple (and presumably common) case
>> of running out of fileserver threads or the fileserver not being
>> mp-scalable, sure.
>
> We're suffering from the same problem as SLAC, and are working around it by
> keeping home directories small enough to make them unusable for use from the
> compute farm and providing extra volumes on fileservers dedicated to
> workgroups. User education tends to be more effective if careless use
> penalizes familiar co-workers only ;-)
>
> What would be a great feature to have is a way to keep the server from using
> more than, say, half of the available threads for a single volume. Would this
> be feasible to implement at all?

Unfortunately, by the time the file server has queued the request onto a
worker thread, and the RPC has been evaluated and categorized so that
volume- or partition-specific limits can be applied, the file server
might as well just answer the request. 80% of the work has already been
done.

The current file server architecture uses one worker thread per request.
Even if the file server were going to delay processing of a particular
request, the desirable approach would be to re-queue the request with
additional tag information. A worker thread that is completing a request
on the limited resource could then pick it up, while other workers remain
free to service unrelated requests.

Issuing RPCs to the client and forcing the client to retry (as is done
with VBUSY responses) will increase the overall load on the file server
when there are no replicas on alternate file servers for the client to
fail over to.

Jeffrey Altman
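
For illustration only, the tagged re-queue idea described above could be
modeled roughly as follows. This is a toy, single-threaded sketch in
Python, not OpenAFS code: the class TaggedDispatcher, its methods, the
limit of 2 and the volume names are all hypothetical. It also sidesteps
the cost noted above, namely that categorizing a request by volume already
requires most of the work of answering it.

```python
from collections import defaultdict, deque


class TaggedDispatcher:
    """Toy model: requests are tagged by volume, with a per-volume worker cap."""

    def __init__(self, per_volume_limit):
        self.per_volume_limit = per_volume_limit
        self.active = defaultdict(int)      # volume -> requests being serviced
        self.deferred = defaultdict(deque)  # volume -> requests parked behind the cap

    def submit(self, volume, request):
        """Return True if a worker may start now, False if the request was parked."""
        if self.active[volume] < self.per_volume_limit:
            self.active[volume] += 1
            return True
        self.deferred[volume].append(request)
        return False

    def complete(self, volume):
        """Called by a worker that has finished a request on `volume`.

        Returns the oldest parked request for the same volume, which that
        worker picks up directly, or None if nothing is parked.
        """
        if self.deferred[volume]:
            # The slot is handed over to the parked request; count unchanged.
            return self.deferred[volume].popleft()
        self.active[volume] -= 1
        return None


dispatcher = TaggedDispatcher(per_volume_limit=2)
# Four requests hit the same volume; only two may occupy workers at once.
started = [dispatcher.submit("user.alice", f"req{i}") for i in range(4)]
# A request for an unrelated volume is not held up by the cap.
other_started = dispatcher.submit("user.bob", "req-b")
# A worker finishing on user.alice picks up the oldest parked request.
picked_up = dispatcher.complete("user.alice")
```

In this model the parked requests never tie up a worker and the client is
never told to retry, so workers stay available for other volumes.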
