Hi all,

I spoke with Jason earlier today and he asked me to type up how we are 
currently using CruiseControl at SAS. The hope is to share the issues that 
we've run into here so they can be addressed in Continuum.

First, the scope of our code base. We have ~five million lines of code, nearly 
300 projects, and currently more than 50 branches. Unlike a SourceForge-style 
code base, our code is very tightly coupled between projects. We have low-level 
projects, then our mid-tier, and finally our end-user solutions. Each 
level contains products that we sell.

The first level of Continuous Integration we rolled out was just for compiles. 
We were covering 3 branches for all 300 projects. The CVS server couldn't take 
the load of diffing 5 million lines of code every five minutes. Actually, it 
could, but the CVS admins noticed us, so we had to find an alternative because 
we were slowing down the entire company. 

So the biggest issue was CVS load. 

We set up CVS triggers that create a text file (we call them trigger files) for 
each project. If any file within a given project tree changes, the text file 
gets touched (I think it writes the date/time stamp). CruiseControl then 
monitors the trigger file for changes. If a change has occurred, CC then goes 
to CVS to get the changes.
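
The trigger-file idea above can be sketched roughly as follows. This is only an illustration in Python using the standard library, not our actual CVS trigger or CruiseControl configuration; the function names, the ".trigger" suffix, and the one-directory layout are all made up for the example.

```python
import os
import time


def touch_trigger(trigger_dir, project):
    """Commit side (hypothetical hook): rewrite the project's trigger file."""
    path = os.path.join(trigger_dir, project + ".trigger")
    with open(path, "w") as f:
        # The file content is just a date/time stamp; the CI side only
        # looks at the file's modification time.
        f.write(time.strftime("%Y-%m-%d %H:%M:%S"))
    return path


def changed_projects(trigger_dir, last_seen):
    """CI side: list projects whose trigger file changed since the last poll.

    `last_seen` maps project name -> modification time already handled.
    It is updated in place.
    """
    changed = []
    for name in sorted(os.listdir(trigger_dir)):
        if not name.endswith(".trigger"):
            continue
        project = name[: -len(".trigger")]
        mtime = os.stat(os.path.join(trigger_dir, name)).st_mtime_ns
        if last_seen.get(project) != mtime:
            last_seen[project] = mtime
            changed.append(project)
    return changed
```

Only the projects returned by changed_projects() would need a trip to CVS, so polling costs a handful of stat calls instead of a repository-wide diff. In this sketch last_seen lives in memory; it would have to be saved to disk for the poller to pick up where it left off after a restart.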

This keeps the load on the CVS servers to a minimum. It also keeps the CVS 
commits decoupled from the CI process. If the CI server is down, it will see 
the trigger files and start processing the appropriate projects when it 
restarts.

We considered "live" notification (via sockets, for instance), but that would 
have made for a much more brittle system. Especially in the first few months, we took the CC 
box down to redeploy with new options, etc. When the box was down, build 
notifications would have been missed. Not having the build notifications 
tightly coupled turned out to be a very robust way to handle the problem. The 
CVS triggers pile up regardless of whether or not the CI box is available to 
consume them. There are now other processes at SAS that use the trigger files 
as well.

Distributed Builds...

We already have an in-house build system that can cluster builds. You ask the 
system to perform a build and it'll find a box to run your build on. The 
parallelism is awesome. CruiseControl was able to drive that system via Ant 
scripts, so we were able to take advantage of that system. 

In looking at Maven 1, we were hoping to be able to cluster a group of Maven 
servers and let CC distribute the builds to boxes as needed. A JavaSpaces-based 
plugin is in development (on the CC mailing list) that we had hoped to use.

When jobs are sent to a JavaSpace, the client machines consume them when they 
are ready. This type of model is a very elegant form of load balancing. Faster 
machines consume build requests faster and so are ready to request another 
build sooner than a slow machine. Over the course of the day, the faster 
machine will process many more builds than the slow machine. Builds are "slow" 
enough that you don't need to queue them up on the client box or try to predict 
who should get more. Just let them consume them when they are ready.
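
To make the "consume when ready" point concrete, here is a small simulation. It is not JavaSpaces code and not the CC plugin; it is a plain Python sketch in which the worker names and per-build times are invented, and each worker simply takes the next build from a shared pool the moment it becomes idle.

```python
import heapq


def simulate_pull(build_count, seconds_per_build):
    """Simulate workers that each pull the next build as soon as they are idle.

    `seconds_per_build` maps worker name -> time that worker needs per build.
    Returns a dict of worker name -> number of builds it ended up taking.
    """
    # Heap of (time the worker becomes idle, worker name); all idle at t=0.
    idle_at = [(0.0, name) for name in sorted(seconds_per_build)]
    heapq.heapify(idle_at)
    taken = {name: 0 for name in seconds_per_build}
    for _ in range(build_count):
        # Whoever frees up first takes the next build from the shared pool.
        now, name = heapq.heappop(idle_at)
        taken[name] += 1
        heapq.heappush(idle_at, (now + seconds_per_build[name], name))
    return taken


print(simulate_pull(30, {"fast": 1.0, "slow": 4.0}))
```

With one box four times faster than the other, the fast box ends up with 24 of the 30 builds and the slow one with 6, without any scheduler deciding who should get more.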

JavaSpaces has concepts like transactions, so it's fairly mature. I also 
understand that Sun recently released Jini/JavaSpaces under a more acceptable 
license.

Let me know if you've got any questions about these.

Jared

-----------------------------------------
Jared Richardson
[EMAIL PROTECTED]
919-531-9136
http://www.sas.com
SAS... The Power to Know(r)
-----------------------------------------

 
"The plan is nothing; the planning is everything."
 
Dwight Eisenhower
 
