I don't think the problem is related to the number of configurations, since all build tasks are queued and executed sequentially.
The problem with Hackystat-ALL is related to the TempXXXX.csv file. I am not sure what went wrong with Hackystat-HPC, since the old information was overwritten by the result of the manual build you invoked, but I guess it's the same problem.
In kernel.admin.TemporaryFileReaper, the reap interval is 1 hour. If the build time is around 1 hour, then it's possible for the reaper to delete the temporary files at the same time that CruiseControl is trying to copy them.
I increased the interval to 5 hours (the public server should be OK with this number), which should solve the build problem.
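To make the suspected race concrete, here is a minimal Java sketch. It is not the actual TemporaryFileReaper code: it assumes the reaper deletes temporary files whose age exceeds the reap interval, and the names ReaperSketch, isExpired, and expiredFiles are hypothetical.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only; this is NOT the real kernel.admin.TemporaryFileReaper.
// Assumption: a temporary file is reaped once its age exceeds the reap interval.
class ReaperSketch {

  /** Returns true if a file last modified at lastModifiedMillis is older than maxAgeMillis at nowMillis. */
  static boolean isExpired(long lastModifiedMillis, long nowMillis, long maxAgeMillis) {
    return nowMillis - lastModifiedMillis > maxAgeMillis;
  }

  /** Lists the regular files in dir that the assumed age-based policy would delete. */
  static List<File> expiredFiles(File dir, long nowMillis, long maxAgeMillis) {
    List<File> expired = new ArrayList<>();
    File[] files = dir.listFiles();
    if (files == null) {
      return expired; // dir does not exist or is not a directory
    }
    for (File f : files) {
      if (f.isFile() && isExpired(f.lastModified(), nowMillis, maxAgeMillis)) {
        expired.add(f);
      }
    }
    return expired;
  }
}
```

Under that assumption, a temporary file written at the start of a 65-minute build is already expired with a 1-hour interval when the copy step runs, but not with a 5-hour interval.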
Cheers,
Cedric
Philip Johnson wrote:
--On Saturday, May 7, 2005 5:40 AM -1000 [EMAIL PROTECTED] wrote:
Hackystat build (configuration Hackystat-HPC) failed. Build report is available at http://xenia.ics.hawaii.edu/hackyDevSite/configurationBuildReport.do?year=2005&month=5&day=7&configuration=Hackystat-HPC
Build Time Stamp: Sat May 07 05:40:00 HST 2005
Both the HPC and ALL configurations failed last night, neither seemingly due to developer-related activities. The errors were both build system related (e.g., not finding a file). I just rebuilt Hackystat-HPC without changing anything and this time it passed.
It appears that, whether because of the number of configurations we are building, the length of time it takes to build things, or some other reason, our build system has become unstable, and we need to invest some effort into figuring out what's going on and what to do about it.
Cedric, I'd like to ask you to take the lead on this. Here are some thoughts:
(a) is the problem related to the length of time it takes to build each configuration? One way to check would be to emit a timestamp with each hackystat <echo> and see where the bottlenecks are. (One simple fix: the docbook step will go much faster if resolver.jar is in the ant/lib directory.)
(b) is the problem related to the number of configurations we're building? We're up to six, and we didn't see this stuff happening at 2 or 3.
(c) is there something happening on the particular machine we're using? fragmentation? etc.
Cheers, Philip
