Hi everyone,
More anecdotes, scheduler-related this time.
As you may recall, I attempted to fix our problem with collapsing
behavior by appending the master name to each scheduler (except the
force schedulers). The idea was to force the schedulers to be master
specific so that the builder and scheduler would be on the same master,
allowing the default (mostly, we changed the behavior to not regard
revision as significant) collapsing behavior to work. This appears to
have at least mostly worked.
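The renaming scheme above can be sketched as a small helper (a hypothetical illustration only; the function and names here are made up, not our actual master.cfg code):

```python
# Hypothetical sketch of the renaming described above: append the
# master's name to every scheduler name except the force schedulers,
# so each scheduler (and thus its builds) is tied to one master.

def master_specific_name(scheduler_name, master_name, is_force=False):
    """Return a scheduler name unique to this master.

    Force schedulers keep their original, master-independent name so
    that a force from any master's UI still works.
    """
    if is_force:
        return scheduler_name
    return "%s-%s" % (scheduler_name, master_name)
```

With that, a scheduler defined as `master_specific_name("nightly", "master1")` ends up named "nightly-master1", so the default collapsing logic only ever compares requests on the same master.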
But there's always a snag, isn't there? A few days ago, we switched the
branch that most of our work is on. The way we have our master.cfg set
up, this is a one-line change. But it changes nearly everything:
builder names, scheduler names, etc.
Now I'm seeing some odd anomalies, such as builds being scheduled by
schedulers that no longer exist on any master and are not in our
master.cfg, but are still in the database.
I am also seeing builders in current schedulers that never seem to get
builds in their queues. We have to force them to see anything happen.
And builders with builds in their queues that never seem to start.
Could this be part of the result of schedulers not being particularly
reconfigurable?
And on that note, there seem to be three schemes in 0.9.x for
checkConfig/reconfigService.
Number 1 is how the schedulers do it, which is that they don't; they
just have largish __init__() functions.
Number 2 is how the workers do it. checkConfig looks a lot like __init__
might, and reconfigService looks a lot like checkConfig, except that it
doesn't raise.
Number 3 is how things like reporters do it. checkConfig only does
checks (and the occasional null-ish initialization), and reconfigService
copies its arguments into itself.
Which is the proper way, since I'm likely to have a go at updating the
schedulers? Number 1 is right out. Number 2 is pretty easy: mostly
moving the __init__ code to checkConfig, mostly copying in
reconfigService, and making sure to call the base classes' methods
properly.
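For what it's worth, the reporter-style split (Number 3) can be sketched with a toy class, independent of Buildbot's actual service base classes (which this deliberately does not reproduce):

```python
# Toy sketch of scheme Number 3 (reporter style), using a made-up base
# rather than Buildbot's real service classes. checkConfig only
# validates (and raises on bad config); reconfigService copies the
# possibly-new arguments onto the instance, so it is safe to call
# again when the master reconfigures.

class ConfiguredService:
    def __init__(self, *args, **kwargs):
        self.checkConfig(*args, **kwargs)     # fail fast on bad config
        self.reconfigService(*args, **kwargs) # then apply it

    def checkConfig(self, name, interval):
        # checks only (plus the occasional null-ish initialization)
        if not name:
            raise ValueError("service needs a name")
        if interval <= 0:
            raise ValueError("interval must be positive")

    def reconfigService(self, name, interval):
        # copy arguments into self; no other side effects
        self.name = name
        self.interval = interval
```

The point of the split is that on reconfig the master can validate the new arguments first, then call reconfigService on the live instance, instead of tearing the service down and rebuilding it.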
One slightly happier anecdote...
We ended up with a situation where there were two builders for a
particular worker. Both had current builds marked as acquiring locks.
(Remember that we use locks to keep it to one build per worker, except
for a special builder that should always run, even if there's another
build running. That's why we don't restrict builds at the worker level.)
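For anyone curious, that lock arrangement looks roughly like this master.cfg fragment (builder and factory names are invented for the example; only the WorkerLock/access pattern reflects what we actually do):

```python
# Config fragment: one counting lock with maxCount=1 keeps each worker
# to one build at a time; the special always-run builder simply omits
# the lock, so it can start even while another build is running.
from buildbot.plugins import util

build_lock = util.WorkerLock("worker_builds", maxCount=1)

c['builders'] = [
    util.BuilderConfig(name="normal-build",          # invented name
                       workernames=["worker1"],
                       factory=build_factory,        # defined elsewhere
                       locks=[build_lock.access('counting')]),
    util.BuilderConfig(name="always-run",            # invented name
                       workernames=["worker1"],
                       factory=special_factory),     # no locks= here
]
```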
I did manage to go in through the manhole and release the lock from
whoever was holding it. By the time I got far enough to do that, I
wasn't interested in figuring out which build was actually holding onto it.
The first builder's build completed, and the second builder picked up
after that.
Yay.
As always, thanks for your assistance.
Neil Gilmore
grammatech.com
_______________________________________________
users mailing list
users@buildbot.net
https://lists.buildbot.net/mailman/listinfo/users