My apologies. I didn't phrase my question properly. Most of the software necessary was pulled down via svn, but I saw no such behaviour for AHP. After looking at it some more, I imagine the software was just manually installed on the machine. It was kind of a silly question to begin with, I suppose.

On Thu, Oct 9, 2008 at 4:16 AM, Jason Dillon <[EMAIL PROTECTED]> wrote:

> On Oct 8, 2008, at 11:05 PM, Jason Warner wrote:
>
>> Here's a quick question. Where does AHP come from?
>
> http://www.anthillpro.com
>
> (ever heard of google :-P)
>
> --jason
>
>> On Mon, Oct 6, 2008 at 1:18 PM, Jason Dillon <[EMAIL PROTECTED]> wrote:
>>
>> Sure np, took me a while to get around to writing it too ;-)
>>
>> --jason
>>
>> On Oct 6, 2008, at 10:24 PM, Jason Warner wrote:
>>
>> Just got around to reading this. Thanks for the brain dump, Jason. No questions as of yet, but I'm sure I'll need a few more reads before I understand it all.
>>
>> On Thu, Oct 2, 2008 at 2:34 PM, Jason Dillon <[EMAIL PROTECTED]> wrote:
>>
>>> On Oct 1, 2008, at 11:20 PM, Jason Warner wrote:
>>>
>>>> Is the GBuild stuff in svn the same as the anthill-based code or is that something different? GBuild seems to have scripts for running tck and that leads me to think they're the same thing, but I see no mention of anthill in the code.
>>>
>>> The Anthill stuff is completely different from the GBuild stuff. I started out trying to get the TCK automated using GBuild, but decided that the system lacked too many features to perform as I desired, and went ahead with Anthill, as it did pretty much everything, though it had some stability problems.
>>>
>>> One of the main reasons why I chose Anthill (AHP, Anthill Pro that is) was its build agent and code repository systems. This allowed me to ensure that each build used exactly the desired artifacts. Another was the configurable workflow, which allowed me to create a custom chain of events to handle running builds on remote agents and control what data gets sent to them, what it will collect, and what logic to execute once all distributed work has been completed for a particular build. And the kicker which helped facilitate bringing it all together was its concept of a build life.
>>>
>>> At the time I could find *no other* build tool which could meet all of these needs, and so I went with AHP instead of spending months building/testing features in GBuild.
>>>
>>> While AHP supports configuring a lot of stuff via its web interface, I found that it was very cumbersome, so I opted to write some glue, which was stored in svn here:
>>>
>>> https://svn.apache.org/viewvc/geronimo/sandbox/build-support/?pathrev=632245
>>>
>>> It's been a while, so I have to refresh my memory on how this stuff actually worked. First let me explain about the code repository (what it calls Codestation) and why it was critical to the TCK testing IMO. When we use Maven normally, it pulls data from a set of external repositories, picks up more repositories from the stuff it downloads, and quickly we lose control over where stuff comes from. After it pulls down all that stuff, it churns through a build and spits out the stuff we care about, normally stuffing it (via mvn install) into the local repository.
>>>
>>> AHP supports by default tasks to publish artifacts (really just a set of files controlled by an Ant-like include/exclude path) from a build agent into Codestation, as well as tasks to resolve artifacts (i.e. download them from Codestation to the local working directory on the build agent's system).
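>>>
>>> Just to illustrate what I mean by an Ant-like include/exclude path, the selection is conceptually the familiar fileset idiom; this is only the pattern style, not AHP's actual task syntax, and the directory and patterns here are made up:
>>>
>>>     <fileset dir="target/local-repository">
>>>         <include name="**/*.jar"/>
>>>         <include name="**/*.pom"/>
>>>         <exclude name="**/*.sha1"/>
>>>     </fileset>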
>>>
>>> Each top-level build in AHP gets assigned a new (empty) build life, and artifacts are always published to and resolved from a build life, either that of the current build or that of a dependency build.
>>>
>>> So what I did was set up builds for Geronimo Server (the normal server/trunk stuff), which did the normal mvn install thingy, but I always gave it a custom -Dmaven.local.repository which resolved to something inside the working directory for the running build (a sketch of the kind of invocation is a few paragraphs down). The build was still online, so it pulled down a bunch of stuff into an empty local repository (so it was a clean build wrt the repository, as well as the source code, which was always fetched for each new build). Once the build had finished, I used the artifact publisher task to push *all* of the stuff in the local repository into Codestation, labeled as something like "Maven repository artifacts" for the current build life.
>>>
>>> Then I set up another build for Apache Geronimo CTS Server (the porting/branches/* stuff). This build was dependent upon the "Maven repository artifacts" of the Geronimo Server build, and I configured those artifacts to get installed on the build agent's system in the same directory that I configured the CTS Server build to use for its local Maven repository. So again the repo started out empty, then got populated with all of the outputs from the normal G build, and then the cts-server build was started. The build of the components and assemblies is normally fairly quick, and aside from some stuff in the private tck repo it won't download much more stuff, because it already had most of its dependencies installed via the Codestation dependency resolution. Once the build finished, I published the cts-server assembly artifacts back to Codestation under something like "CTS Server Assemblies".
>>>
>>> Up until this point it's normal builds, but now we have built the G server, then built the CTS server (using the *exact* artifacts from the G server build, even though each might have happened on a different build agent). And now we need to go and run a bunch of tests, using the *exact* CTS server assemblies, produce some output, collect it, and once all of the tests are done render some nice reports, etc.
>>>
>>> AHP supports setting up builds which contain "parallel" tasks; each of those tasks is then performed by a build agent. They have fancy build-agent selection stuff, but for my needs I had basically 2 groups: one for running the server builds, and another for running the tests. I only set aside like 2 agents for builds and the rest for tests. Oh, I forgot to mention that I had 2 16-way, 16 GB AMD beasts, both running CentOS 5, each with about 10-12 Xen virtual machines running internally to run build agents. Each system also had a RAID-0 array set up over 4 disks to help reduce disk I/O wait, which, as I found out, was the limiting factor when trying to run a ton of builds that all check out and download artifacts and such.
>>>
>>> I helped the AHP team add a new feature, a parallel iterator task: you define *one* task that internally fires off n parallel tasks, each of which sets an iteration number and leaves it up to the build logic to pick what to do based on that index. The alternative was an unwieldy set of something like 200 tasks in their UI, which simply didn't work at all.
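>>>
>>> Coming back to the Maven side of the server builds for a second, the invocation was roughly along these lines; I may be misremembering the property name above, the stock Maven flag for relocating the local repository is -Dmaven.repo.local, and the paths and goals here are just examples:
>>>
>>>     # clean, per-build local repository inside the build's working directory
>>>     mvn clean install -Dmaven.repo.local="$WORK_DIR/maven/repository"
>>>
>>>     # the later TCK runs reused the same trick against the Codestation-populated
>>>     # repository, but with Maven forced offline
>>>     mvn --offline <goals> -Dmaven.repo.local="$WORK_DIR/maven/repository"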
>>>
>>> You might have noticed an "iterations.xml" file in the tck-testsuite directory; this was used to take an iteration number and turn it into the tests we actually run. The <iteration> bits are order-sensitive in that file. (I'll sketch its rough shape a bit further down.)
>>>
>>> Soooo, after we have a CTS Server for a particular G Server build, we can now go and do "runtests" for a specific set of tests (defined by an iteration)... this differed from the other builds above a little, but still pulled down artifacts: the CTS Server assemblies (only the assemblies and the required bits to run the geronimo-maven-plugin, which was used for geronimo:install, as well as used by the tck itself to fire up the server and so on). The key thing here, with regards to the Maven configuration (besides using that custom Codestation-populated repository), was that the builds were run *offline*.
>>>
>>> After runtests completed, the results are then soaked up (the stuff that javatest pukes out with icky details, as well as the full log files and other stuff I can't recall) and then pushed back into Codestation.
>>>
>>> Once all of the iterations were finished, another task fires off which generates a report. It does this by downloading from Codestation all of the runtests outputs (each was zipped I think), unzipping them one by one, running some custom goo I wrote (based on some of the concepts from the original GBuild-based TCK automation), and generating a nice Javadoc-like report that includes all of the gory details.
>>>
>>> I can't remember how long I spent working on this... too long (not the reports I mean, the whole system). But in the end I recall something like running an entire TCK testsuite for a single server configuration (like jetty) in about 4-6 hours... I sent mail to the list with the results, so if you are curious what the real number is, instead of my guess, you can look for it there. But anyway, it was damn quick running on just those 2 machines. And I *knew* exactly that each of the distributed tests was actually testing a known build that I could trace back to its artifacts and then back to its SVN revision, without worrying about mvn downloading something new when midnight rolled over, or about a new G server or CTS server build that might be in progress compromising the testing by polluting the local repository.
>>>
>>> * * *
>>>
>>> So, about the sandbox/build-support stuff...
>>>
>>> First there is the 'harness' project, which is rather small but contains the basic stuff, like a version of Ant and Maven which all of these builds would use, some other internal glue, and a fix for an evil Maven problem causing erroneous build failures due to some internal thread-state corruption or gremlins, not sure which. I kinda used this project to help manage the software needed by normal builds, which is why Ant and Maven were in there... i.e. so I didn't have to go install it on each agent each time it changed; I just let the AHP system deal with it for me.
>>>
>>> This was set up as a normal AHP project, built using AHP's internal Ant builder (though that builder was still configured to use the local version of Ant it pulled from SVN, to ensure it always works).
>>>
>>> Each other build was set up to depend on the output artifacts from the build-harness build, using the latest in a range, like say "3.*" for the latest 3.x build (which looks like that was 3.7). This let me work on new stuff w/o breaking the current builds as I hacked things up.
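>>>
>>> As promised, here is the rough shape of that iterations.xml; I'm going from memory, so the element layout and attribute names are only illustrative, not the real file:
>>>
>>>     <iterations>
>>>         <!-- order matters: the parallel iterator task hands each agent an index,
>>>              and the Nth <iteration> entry decides which tests that agent runs -->
>>>         <iteration name="jms"     tests="com/sun/ts/tests/jms/**"/>
>>>         <iteration name="servlet" tests="com/sun/ts/tests/servlet/**"/>
>>>     </iterations>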
>>>
>>> So, in addition to all of the stuff I mentioned above wrt the G and CTS builds, each also had a step which resolved the build-harness artifacts to that working directory, and the Maven builds were always run via the version of Maven included in the harness. But AHP didn't actually run that version of Maven directly; it used its internal Ant task to execute the version of Ant from the harness *and* use the harness.xml buildfile.
>>>
>>> The harness.xml stuff is some more goo which I wrote to help manage AHP configurations. With AHP (at that time, not sure if it has changed) you had to do most everything via the web UI, which sucked, and it was hard to refactor sets of projects and so on. So I came up with a standard set of tasks to execute for a project, put all of the custom muck I needed into what I called a _library_, and then had AHP, via harness.xml, invoke it with some configuration about what project it was and other build details.
>>>
>>> The actual harness.xml is not very big; it simply makes sure that */bin/* is executable (Codestation couldn't preserve execute bits), then uses the Codestation command-line client (invoking the Java class directly though) to ask the repository to resolve artifacts from the "Build Library" to the local repository. I kept this artifact resolution separate from the normal dependency (or harness) artifact resolution so that it was easier for me to fix problems with the library while a huge set of TCK iterations was still queued up to run. Basically, if I noticed a problem due to a code or configuration issue in an early build, I could fix it and use the existing builds to verify the fix, instead of wasting an hour (sometimes more, depending on networking problems accessing remote repos while building the servers) to rebuild and start over.
>>>
>>> This brings us to the 'libraries' project. In general the idea of a _library_ was just a named/versioned collection of files which could be used by a project. The main (er, only) library defined in this SVN is system/. This is the Groovy glue which made everything work. This is where the entry-point class is located, i.e. the guy who gets invoked from harness.xml via:
>>>
>>>     <target name="harness" depends="init">
>>>         <groovy>
>>>             <classpath>
>>>                 <pathelement location="${library.basedir}/groovy"/>
>>>             </classpath>
>>>
>>>             gbuild.system.BuildHarness.bootstrap(this)
>>>         </groovy>
>>>     </target>
>>>
>>> I won't go into too much detail on this stuff now; take a look at it and ask questions. But basically there is stuff in gbuild.system.* which is harness support muck, and stuff in gbuild.config.* which contains configuration. I was kinda mid-refactoring of some things, starting to add new features, and I'm not sure where I left off actually. But the key bits are in gbuild.config.project.*; this contains a package for each project, with the package name being the same as the AHP project name (with " " -> "_"). And then in each of those packages there is at least a Controller.groovy class (plus other classes if special muck was needed, like for the report generation in Geronimo_CTS, etc).
>>>
>>> The controller defines a set of actions, implemented as Groovy closures bound to properties of the Controller class.
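>>>
>>> To make that concrete, a Controller boils down to something like the following; this is a from-memory sketch of the idea rather than the actual code in svn, and the package and closure names are made up:
>>>
>>>     package gbuild.config.projects.Geronimo_Server  // package name mirrors the AHP project name
>>>
>>>     class Controller
>>>     {
>>>         // each action is just a Groovy closure bound to a property of the class
>>>         def build = {
>>>             // resolve dependencies, run Maven via the harness-supplied Ant/Maven, etc.
>>>         }
>>>
>>>         def runtests = {
>>>             // map the iteration index to a chunk of the TCK and fire it off
>>>         }
>>>     }
>>>
>>> The harness side then just grabs the requested closure by name and calls it, something along the lines of controller."$action".call().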
>>>
>>> One of the properties passed in from the AHP configuration (configured via the Web UI, passed to the harness.xml build, and then on to the Groovy harness) was the name of the _action_ to execute. Most of that stuff should be fairly straightforward.
>>>
>>> So after a build is started (maybe from a Web UI click, or SVN change detection, or a TCK runtests iteration) the following happens (in simplified terms):
>>>
>>> * Agent starts build
>>> * Agent cleans its working directory
>>> * Agent downloads the build harness
>>> * Agent downloads any dependencies
>>> * Agent invokes Ant on harness.xml, passing in some details
>>> * harness.xml downloads the system/1 library
>>> * harness.xml runs gbuild.system.BuildHarness
>>> * BuildHarness tries to construct a Controller instance for the project
>>> * BuildHarness tries to find the Controller action to execute
>>> * BuildHarness executes the Controller action
>>> * Agent publishes output artifacts
>>> * Agent completes build
>>>
>>> A few extra notes on libraries: the JavaEE TCK requires a bunch of stuff we get from Sun to execute. This stuff isn't small, but is for the most part read-only. So I set up a location on each build agent where these files were installed. I created AHP projects to manage them and treated them like a special "library", one which tried really hard not to go fetch its content unless the local content was out of date. This helped speed up the entire build process... cause that delete/download of all that muck really slows down 20 agents running in parallel on 2 big machines with striped arrays. For legal reasons this stuff was not kept in svn.apache.org's main repository, and for logistical reasons it wasn't kept in the private tck repo on svn.apache.org either. Because there were so many files, and because the httpd configuration on svn.apache.org kicks out requests that it thinks are *bunk* to help save resources for the community, I had set up a private, SSL-secured svn repository on the old gbuild.org machines to hold the full muck required, then set up some goo in the harness to resolve them. This goo is all in gbuild.system.library.* See gbuild.config.projects.Geronimo_CTS.Controller for more on how it was actually used.
>>>
>>> * * *
>>>
>>> Okay, that is about all the brain dump for TCK muck I have in me for tonight. Reply with questions if you have any.
>>>
>>> Cheers,
>>>
>>> --jason
>>
>> --
>> ~Jason Warner
>
> --
> ~Jason Warner

--
~Jason Warner
