Hi Todd/Arthur,

Thanks for the insights. Unfortunately for me, I am that typical "young configuration manager" that software companies can easily afford when they don't have the patience or money to look for an expert. But with sound advice I intend to learn quickly and do my job well. Thanks a lot for your help!
The exact background of the problem is below (I hope this answers your questions):

1. All 40 machines run different operating systems. Technically they are mostly Linux, but they differ in OS base version, update/service-pack level, processor architecture, 32/64-bit, and distribution family (Red Hat, SUSE, Debian; sorry if "family" is the wrong word here). From our and our customers' perspective that makes them different platforms, because our product has several drivers (.ko) that have to be compiled specifically for each update/service-pack level, using the native glibc version, kernel headers and kernel Makefile of that distro. Anything else makes the kernel module version magic check fail at modprobe time. Some of our development experts also noticed malfunctions when user-space code compiled against a base version of a platform (like RHEL 5) was used on an update (like RHEL 5 U1), so we were forced to start compiling both the user-space and the kernel-space code (in other words, the entire product) separately on each platform and update level we offer in our platform compatibility matrix. In effect, if RHEL 5 U3 is released in the market, it has to be treated as its own platform and built strictly separately.

2. As a result of point 1, the third-party codebase we use (over 2 GB of thousands of small source, header and make files, with no pre-built, checked-in libraries) has to be compiled on each platform to create the set of libraries that our product links against when building on that platform. That is why the 40 build machines need the third-party sandbox updated each morning before the build begins (if the makefile detects updated third-party code, it builds the libraries first and then proceeds to build our product). I use a simple "cd <thirdparty-sandbox>; cvs update -d -P ." in a shell script that is cron-ned at roughly the same time on all machines.
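A minimal sketch of such a cron-ned update script follows. The SANDBOX and LOGFILE paths, the function name update_sandbox, the "run" argument, the -q option and the elapsed-time logging are assumptions added for illustration; they are not taken from the real script.

```shell
#!/bin/sh
# Sketch of a morning third-party sandbox refresh.
# SANDBOX and LOGFILE are hypothetical paths; adjust to the real layout.
SANDBOX="${SANDBOX:-$HOME/thirdparty-sandbox}"
LOGFILE="${LOGFILE:-$HOME/thirdparty-update.log}"

update_sandbox() {
    cd "$SANDBOX" || return 1
    start=$(date +%s)
    # -q: quieter output (an addition, not in the original command)
    # -d: create directories newly added to the repository
    # -P: prune directories that have become empty
    cvs -q update -d -P . >>"$LOGFILE" 2>&1
    rc=$?
    end=$(date +%s)
    # Record how long the update took, to compare machines and days.
    echo "update finished: rc=$rc elapsed=$((end - start))s" >>"$LOGFILE"
    return "$rc"
}

# Run the update only when invoked with "run" as the first argument,
# so the file can also be sourced without side effects.
if [ "${1:-}" = "run" ]; then
    update_sandbox
fi
```

From cron this would be invoked as something like "/path/to/update-thirdparty.sh run" (a hypothetical path). Logging the elapsed time per run makes it easier to see whether the duration depends on how many machines update at once.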
A constraint here is that, to "grab" the code changes of the US developers, I cannot start the cvs updates and builds before 6 AM my time (IST), and the builds have to be ready for the Indian testing team by 9 AM. (The ~30 minute skew you suggested doesn't work in this case, because I have 40 machines and definitely not 1200 minutes at my disposal :)

I use the "-d -P" switches to make sure the sandbox gets all the new directories and loses the ones that were removed in CVS. But that update alone takes 40 minutes. I don't really know why... maybe because of the large number of files and directories? Is that likely? The CVS client/server version is 1.11.17. The protocol I use is :pserver.

3. The cvs update command is not part of the build process, so the "touch"ing factor is out of the question. As for skew between local time and server time: I don't really know how this works across time zones. The clients are in IST (Indian Standard Time) and the server is in Pacific Time (US, California). How does that affect things? Are both converted to UTC or some such common standard and then compared? I do observe the client boxes losing time daily. These Linux boxes are connected to a Windows NT domain, and to keep correct time I have configured ntpd on them, pointing to some servers in nasa.gov. In spite of that the boxes lose time regularly; it's a nuisance I have not been able to fix in the last 2 years (but that's for another thread in another forum, I guess). Does it sound ominous to you?

4. All 40 machines are in India, where I work, and the CVS server is in the US. Please don't ask me why it's designed that way :) Given this geographical fact, would the idea of sending changed-file patches across my really fast LAN sound any different now?

5. "when one machine is running update by itself, i.e., the other 39 are not updating while it is, how long does an update take? how long does that update take if the sandbox is already up to date"
------ Almost the same: 30 minutes plus, just for a silent check of the sandbox and exit, even if no file was updated on the server.

6. I might say in closing that this notorious third-party code is not even updated regularly, but since one cannot predict when it will be updated, I have to do the cvs update daily. Is there some way I can query the server to find out whether any change took place, so that if nothing changed I don't even bother to run the cvs update command? Does that make sense?

I hope this gives you more info. Thanks for your time and patience.

-Chaitanya
