Since this isn't really the place for bug reports, consider this an anecdote.

When I came in this morning, there was a message that buildbot's UI was unresponsive, which it was. Looking at the most recent twistd.log, the only thing I saw was that the IRC client had been bouncing up and down for the last 8 hours or so. Buildbot stop failed to stop the process. Killing it did. But the restart wasn't very successful -- every worker came up as an 'unauthorized login'. Again buildbot stop failed to stop the process, so I killed it again. Then I deleted state.sqlite, ran buildbot upgrade-master, and started again. This appears to have worked.

One problem we see whenever we have to delete the database is that worker indexes shift around. This makes any URLs in build emails that indexed into the previous database invalid. Not the worst problem ever, but inconvenient. Then again, I don't think buildbot is exactly designed to handle the vagaries of being reconfigured as often as ours is, or of having its database cleared as often.

Our current production system has 0.9.0rc1 for the master, and mostly the same for workers. Twisted is 16.2.0. The only local changes I know of are a couple of custom build steps we've added, a fix for a problem with __cmp__() in ComparableMixin (we'd somehow been getting a set in an object, and __cmp__() would fail during some reconfigs), and a slight change to the UI to always show the last 4 builds instead of the default behavior.

When we do multi-master, it'll be rc2, at least to start. The troubles I'm having with that currently are not buildbot problems.

