Martin Pool wrote:
1. Since the list of hosts read from $prefix/etc/distcc/hosts is
the same for all workstations, every workstation will sometimes
issue large compile jobs to itself, even though it'd be better
off handling only preprocessing and linking (right?)

Linking counts against jobs running on the local machine too. If it's linking in parallel with compiling then it should try to do the compiles remotely.

Ah, so it notices that "localhost" is the same machine as "zytor" when running on zytor?

It would probably be good to finish off support for DNS multi-A
records, and use that to spread work across machines.  I don't think
much more needs to be done.

That's just a direct replacement for the hosts file, though, isn't it? I'm not sure I want IT to be involved in this; it's a lot easier for me to modify distcc/hosts than it is to create DNS entries! [BTW, I hear x at xman dot org is working on the Rendezvous patch for distcc on Linux. I'm not interested in that myself, but maybe others are.]

2. Distcc won't currently check the load average of each compile server,
so workstations busy with non-distcc jobs will get slammed with
distcc jobs, negatively impacting normal use of the workstations.

If the workstations have a reasonable amount of memory then running a couple of low-priority daemons should not hurt too much. Remember it will only accept about 2*NCPUS jobs at a time.

Yes, but that means the compile jobs (which could run faster on some other compile server which is ready and waiting) will execute slower.

We could check the load average before accepting jobs but that is
actually a pretty poor measure for modern machines.

Oh, I dunno, the number of processes in 'R' state seems like it'd be a pretty good measure of load if there's plenty of RAM and the distcc job wouldn't cause any disk I/O. Or the number of processes in 'R' or 'D' state if jobs do tend to do disk I/O, maybe.
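
To make the 'R' / 'D' idea concrete, here is a rough sketch of counting processes by state on Linux. It is not anything distcc does; the function names are made up for illustration, and it assumes the usual /proc/<pid>/stat layout where the state letter follows the parenthesised command name.

```python
import os

def parse_state(stat_text):
    """Return the one-letter state field from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST ')' rather than on whitespace.
    after_comm = stat_text[stat_text.rindex(")") + 1:]
    return after_comm.split()[0]

def count_in_states(states=("R",), proc_root="/proc"):
    """Count processes whose state letter is in `states` (Linux only)."""
    count = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                if parse_state(f.read()) in states:
                    count += 1
        except (OSError, ValueError, IndexError):
            # The process exited mid-scan, or its stat line was unreadable.
            continue
    return count
```

A server could compare count_in_states(("R",)) or count_in_states(("R", "D")) against its CPU count before taking a job. Note that the process doing the counting is itself running during the scan, so it will normally show up in the 'R' total.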

3. If more than one user is issuing distcc jobs, their distcc clients
will sometimes issue jobs to the same machine by chance
(fairly often, if distcc assigns jobs in order of the etc/distcc/hosts file).


Right, so those jobs will just stall for a bit.

Again, reducing the performance of the cluster.
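
A toy model of point 3, for illustration only. It assumes each user simply hands jobs to hosts in list order, which is the "if" in the original question and not a statement about how distcc really schedules. The shuffled variant is a hypothetical comparison, and the host names in the usage note are invented.

```python
from collections import Counter
import random

def assign_in_order(hosts, jobs_per_user, users):
    """Every user walks the same host list from the top."""
    load = Counter()
    for _ in range(users):
        for j in range(jobs_per_user):
            load[hosts[j % len(hosts)]] += 1
    return load

def assign_shuffled(hosts, jobs_per_user, users, seed=0):
    """Each user walks a privately shuffled copy of the host list."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    load = Counter()
    for _ in range(users):
        order = list(hosts)
        rng.shuffle(order)
        for j in range(jobs_per_user):
            load[order[j % len(order)]] += 1
    return load
```

With six hosts named a to f, three users and two jobs each, assign_in_order puts three jobs on a and three on b and leaves the other four idle, which is the pile-up described above. How much a shuffle helps in practice is exactly what the benchmarks would have to show.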


Has anyone looked at these issues?   I suppose a first step I
might take if nobody else has might be to run a few benchmarks
to see if these potential problems actually happen in the real
world.

That would be good.

OK, benchmarks coming up... I only have six machines in the cluster at the moment, but I can probably coax a few more coworkers into joining for the good of science :-) Maybe we'll try running 1 to N/2 copies of the same job on an N machine cluster (simulating various numbers of different users doing normal work) and plot the compile time for each.

- Dan
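
A minimal sketch of how the 1 to N/2 sweep might be driven from one machine. The function names and the placeholder command are assumptions for illustration; a real run would substitute the actual build command and give each copy its own build tree.

```python
import subprocess
import sys
import time

# Placeholder command that does nothing; swap in the real build command.
DEFAULT_CMD = [sys.executable, "-c", "pass"]

def time_concurrent(cmd, copies):
    """Start `copies` instances of cmd at once; return seconds until all finish."""
    start = time.monotonic()
    procs = [subprocess.Popen(cmd) for _ in range(copies)]
    for p in procs:
        p.wait()
    return time.monotonic() - start

def sweep(cmd, n_machines):
    """Time 1 .. n_machines // 2 simultaneous copies of the same job."""
    return [(k, time_concurrent(cmd, k))
            for k in range(1, n_machines // 2 + 1)]
```

sweep(cmd, 6) would return three (copies, seconds) pairs, ready to plot. Launching every copy from a single workstation only approximates several users on separate machines, so treat the numbers as a first look.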

--
My technical stuff: http://kegel.com
My politics: see http://www.misleader.org for examples of why I'm for regime change
__ distcc mailing list http://distcc.samba.org/
To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/distcc
