Here's an idea I had been kicking around for a while but never got to.
All systems that want to make distcc requests need at least one local
server process. This server would be responsible for talking with the
other servers that were around (configured in some manner), keeping
track of each one's load (they would have either an open socket or get
regular UDP updates) and also how fast they have executed jobs (per K)
at various loads. The idea is that this current status is maintained on
each local server (so no central server going down is a single point of
failure), but not by every client when it starts up.
Yes, that's the sort of thing I've been thinking about as the next step past "just drop connection if load too high".
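
A rough sketch of the bookkeeping described above, in Python for brevity. Nothing here is existing distcc code: the names PeerStatus, PeerTable and pick_host are made up for illustration, and the scoring rule (smoothed seconds per KB scaled by reported load) is an assumed simplification of "speed per K at various loads".

```python
import time
from dataclasses import dataclass


@dataclass
class PeerStatus:
    """What the local server remembers about one peer (hypothetical fields)."""
    load: float = 0.0          # last load figure the peer reported
    updated_at: float = 0.0    # when that report arrived, in seconds
    secs_per_kb: float = 1.0   # smoothed compile time per KB of preprocessed input


class PeerTable:
    def __init__(self, stale_after=10.0):
        self.peers = {}
        self.stale_after = stale_after  # ignore peers not heard from for this long

    def on_load_update(self, host, load, now=None):
        """Record a load report (e.g. from a UDP update or an open socket)."""
        now = time.time() if now is None else now
        peer = self.peers.setdefault(host, PeerStatus())
        peer.load = load
        peer.updated_at = now

    def on_job_done(self, host, kilobytes, compile_secs, alpha=0.3):
        """Fold one finished job into the per-KB speed estimate."""
        peer = self.peers.setdefault(host, PeerStatus())
        sample = compile_secs / max(kilobytes, 1e-9)
        peer.secs_per_kb = (1 - alpha) * peer.secs_per_kb + alpha * sample

    def pick_host(self, now=None):
        """Choose the peer with the lowest assumed cost, else run locally."""
        now = time.time() if now is None else now
        fresh = {h: p for h, p in self.peers.items()
                 if now - p.updated_at <= self.stale_after}
        if not fresh:
            return "localhost"
        return min(fresh, key=lambda h: fresh[h].secs_per_kb * (1.0 + fresh[h].load))
```

Because every local server keeps its own table, a peer that stops reporting simply ages out and jobs fall back to localhost.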
> I imagined this server being the same one that handles the job requests from other machines (if configured to accept remote jobs). So, the local distcc executable would just connect to the server at localhost, ask it to run a job, and send it the output of the preprocessed input. The local server would be free to either send the job off to another host, or run it locally.
I don't think that's needed. In fact, it's probably better if the local server is connected to via a unix domain socket. That's slightly faster and more secure. Also, it lets us do tricky things like passing an open socket from the local server to the distcc program, so the bytes don't have to get relayed through the local server. (I've been waiting ten years for a reason to use fd passing!) It even lets us pass credentials, so the local server could even know for sure which unix user was making the request; that could come in handy if we want to restrict status info about jobs to the user who submitted the jobs.
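
To make the fd-passing and credentials idea concrete, here is a minimal sketch in Python (3.9 or later, Unix only). It is not distcc code. The helper names are hypothetical, a socketpair stands in for both the unix domain socket and the connection to the remote host, and the credentials lookup uses SO_PEERCRED, which is Linux-specific.

```python
import os
import socket
import struct


def send_connection(channel, remote_sock):
    """Local-server side: hand an already open socket to the client."""
    socket.send_fds(channel, [b"fd"], [remote_sock.fileno()])


def receive_connection(channel):
    """Client side: receive the descriptor and wrap it in a socket object."""
    _msg, fds, _flags, _addr = socket.recv_fds(channel, 16, 1)
    return socket.socket(fileno=fds[0])


def peer_uid(channel):
    """Linux only: uid of the process at the other end of a unix socket."""
    if not hasattr(socket, "SO_PEERCRED"):
        return None
    # struct ucred is (pid, uid, gid)
    raw = channel.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED,
                             struct.calcsize("iII"))
    _pid, uid, _gid = struct.unpack("iII", raw)
    return uid


def demo():
    # Channel between the local server and the distcc client.
    server_end, client_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    # Stand-in for a connection the local server opened to a remote host.
    remote_a, remote_b = socket.socketpair()

    send_connection(server_end, remote_a)
    passed = receive_connection(client_end)

    # The client now writes to the remote end directly; nothing is relayed.
    passed.sendall(b"preprocessed source")
    data = remote_b.recv(64)
    uid = peer_uid(server_end)

    for s in (server_end, client_end, remote_a, remote_b, passed):
        s.close()
    return data, uid
```

Calling demo() returns the bytes as seen by the "remote" end plus the uid the server observed for its client, or None on platforms without SO_PEERCRED.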
The stats kept could start simple (with job balancing and overloading protection) and get as complicated as desired. For instance, a future complicated algorithm could take into account transfer time in addition to compile speed per-K and current load. (I imagine every job that a remote server performs coming back with the elapsed compile time as measured on the remote system so that the sending server could figure out how fast/slow the transfer part of the equation was. I forget if distcc already has this or not.)
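
The transfer-time arithmetic suggested above could look something like the following Python sketch. It assumes the remote server does report its elapsed compile time with each result, which the text leaves open. HostEstimate, split_timing, predicted_secs and cheapest are illustrative names, and the cost formula is only one plausible choice.

```python
from dataclasses import dataclass


@dataclass
class HostEstimate:
    compile_secs_per_kb: float    # measured on the remote system
    transfer_secs_per_kb: float   # derived on the sending side
    load: float                   # most recent reported load


def split_timing(round_trip_secs, remote_compile_secs):
    """Return (compile, transfer) seconds for one finished job.

    Transfer time is whatever part of the round trip the remote
    compile does not account for.
    """
    transfer = max(round_trip_secs - remote_compile_secs, 0.0)
    return remote_compile_secs, transfer


def predicted_secs(est, kilobytes):
    """Assumed cost model: compile time scaled by load, plus transfer time."""
    compile_part = est.compile_secs_per_kb * kilobytes * (1.0 + est.load)
    transfer_part = est.transfer_secs_per_kb * kilobytes
    return compile_part + transfer_part


def cheapest(estimates, kilobytes):
    """Pick the host name with the lowest predicted total time."""
    return min(estimates, key=lambda h: predicted_secs(estimates[h], kilobytes))
```

With numbers like these, a fast compiler behind a slow link can lose to a slower machine nearby, which is the case the simple load-only balancing would miss.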
To get status, a status tool would connect to the local server and ask for a summary.
Absolutely.
- Dan
__ distcc mailing list http://distcc.samba.org/
