On 26 Sep 2003, Jeff <[EMAIL PROTECTED]> wrote:

> Last month there was an interesting thread entitled "distcc over
> slow net links".  I have a similar problem in that two of the
> "animals" on my farm have very old CPUs (although they are on a
> local 100baseT network).  On a real farm these animals would
> probably be turned into glue or hamburger, but on a distcc farm I
> think that's animal cruelty! :)
In fact the same problem can happen even if all of the machines are
the same speed.  I can see it pretty clearly when building
OpenPegasus, which uses a recursive Make through many directories and
has some C++ files that are very slow to build.  The build spends a
lot of time blocked waiting for the large files to complete.

> Here's a quote from the original thread posted by Timothee:
>
> > What I'm thinking is that once local hosts are starved, distcc
> > should find out that there is stuff running on slow hosts, and
> > dupe the compile work on the local hosts, sending back whatever
> > finishes first.
>
> The thread went into some interesting discussions on how to choose
> a faster machine, and how distcc might be able to keep track of the
> speeds of each server.  In the case of this email, I believe "local
> hosts" meant the hosts on his fast local network.  But what I'd
> really like to do is simply to keep "localhost" busy.  That
> wouldn't require any of the additional complexity of tracking the
> power of each server, and it would also be a little more friendly
> if more than one person was trying to use the farm at a time.
>
> The project I'm working on consists of over 80 libraries, and some
> of them are quite large.  As make gets towards the end of each
> library, I often see those two slow machines drag out the end of
> the build for 20 or 30 seconds.  With this many libraries, each one
> of those pauses really starts to hurt.  Using distcc's excellent
> new graph tool, it becomes especially obvious when the fast hosts
> have all "scrolled off to white" and you see the two green bars
> remaining.

Maybe you should build more than one library at a time, so that even
when any one of them is reaching the end of its build, there are
still several other jobs that can be run?

> I haven't yet looked at the distcc source, so I'm not sure how
> complex it might be to implement a solution to keep the localhost
> busy.  Because I'm not familiar with the architecture I'm probably
> not the best person to design the solution, but I'd be more than
> happy to help try to implement it if there's interest.

You don't need to look at the source.  Just tell me if you can think
of a better algorithm.  Once you can describe it in English or
pseudocode, translating it into C is fairly simple.
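For concreteness, here is a rough standalone sketch of the
"duplicate and race" idea: start the same compile both locally and
through distcc, keep whichever copy finishes first, and kill the
other.  This has nothing to do with the actual distcc internals; the
command lines and file names are invented, and a real implementation
would need to splice the winner's output file back into the build.

    /* Sketch only -- not distcc code.  Race a local compile against
     * a remote one and keep the first to finish.  Note the two
     * copies must write different output files so they don't
     * clobber each other. */

    #include <signal.h>
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    static pid_t spawn(char **argv)
    {
        pid_t pid = fork();
        if (pid == 0) {
            execvp(argv[0], argv);
            _exit(127);                     /* exec failed */
        }
        return pid;
    }

    int main(void)
    {
        /* Hypothetical command lines for one slow source file. */
        char *local_cmd[] = { "cc", "-c", "slow.cpp",
                              "-o", "slow-local.o", NULL };
        char *remote_cmd[] = { "distcc", "cc", "-c", "slow.cpp",
                               "-o", "slow-remote.o", NULL };

        pid_t local = spawn(local_cmd);
        pid_t remote = spawn(remote_cmd);

        int status;
        pid_t winner = waitpid(-1, &status, 0);   /* first to exit */
        pid_t loser = (winner == local) ? remote : local;

        kill(loser, SIGTERM);           /* abandon the slower copy */
        waitpid(loser, NULL, 0);        /* reap it */

        printf("%s copy won\n", winner == local ? "local" : "remote");
        return WIFEXITED(status) ? WEXITSTATUS(status) : 1;
    }

Even in this toy form you can see the cost: the losing compile is
pure wasted work, which matters when the "slow" host turns out not to
be slow at all.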
At least in the example I'm looking at: a little while before the end
of the build, Make starts five distcc processes to build five C++
files.  So we start two locally, two on machine A, and one on machine
B.  It turns out that the file sent to machine B takes a very long
time to build because the code is more complex.  Until all five tasks
complete, Make doesn't start any more jobs, so localhost and A sit
idle.

Timothee suggested killing the job on B and re-running it on
localhost, but in at least this case that would be wasteful, because
B is just as fast as localhost.  For C++ code, transit time is
relatively small.

I think the real problem here is that recursive Make is harmful.  The
correct fix would be for Make to start additional jobs while it is
waiting for B.

-- 
Martin