Conceptually I like the approach a lot.
*However*, this will generate an incomprehensible amount of emails
when a new build slave gets added or when some revbumps are done etc.
In particular from the 10.5/ppc machine. The individual emails will be
a lot more helpful, but the amount of them ...
This sounds totally hacky, but one way around that I can see is to
write a special-purpose mailer. Either a job on the buildbot or a
special external script that iterates through all the jobs from the
last hour from all the builders and creates build failure summaries
for each individual developer (author/committer/maintainer). Maybe
this could be a build job on the builder master, but I don't know how
tricky it would get to do it.
How would you handle duplicate entries in the build list? Dependency
order could also become semi-obsolete, but that's probably a price to
I tried to play with buildNext, but I couldn't figure out how to read
properties of the build request.
On 11 March 2018 at 10:25, Ryan Schmidt wrote:
> The current buildbot setup has a number of problems that I believe could be
> solved by combining the currently separate portwatcher and portbuilder
> schedulers into a single ports scheduler.
> I am not suggesting that we return to the behavior of the ports scheduler on
> the old macOS forge buildbot system in which a single build would build all
> the specified ports. We will keep the current method of building only one
> port (and its dependencies) per build.
> The problems I want to solve are the following:
> 1. Currently, portwatcher is responsible for updating a copy of mpbb,
> MacPorts base and a ports tree that it shares with portbuilder. Having
> portbuilder maintain its own copy would waste a lot of time. If someone makes
> one commit that changes 100 ports and then no further commits occur for
> hours, we only want to update mpbb, MacPorts base and the ports tree once,
> not 100 times. But the fact that it's shared means that portwatcher must (and
> is configured to) wait for all triggered portbuilder builds to finish before
> it processes the next commit. This works fine, unless the buildmaster is
> stopped while portwatcher builds are pending. This has happened several times
> when the servers lost power during a power outage. (The servers are on a UPS,
> but the UPS does not provide as much instantaneous power as I expected, so if
> the servers are busy building, they draw more power than the UPS can
> instantaneously provide and the servers shut down immediately. I might remove
> the buildworker machines from the UPS and leave only the buildmaster, modem
> and router on it.) When buildmaster comes back online, it sees the
> portbuilder build that was in progress and starts it again, but it also sees
> the portwatcher build that was in progress and starts it again. Now we have a
> portwatcher running (updating mpbb, updating MacPorts, updating the ports
> tree, and updating its portindex) while portbuilder is trying to install a
> port. The portbuilder can fail if it is trying to install ports at the moment
> that portwatcher is updating the index (see
> 2. If a single portwatcher build "X" triggers many portbuilder builds, and
> while those portbuilder builds are in progress another commit comes in that
> would affect those ports, it don't notice until all portbuilder builds
> triggered by "X" are finished. This can waste time building ports that are
> already superseded by newer versions or revisions. An extreme example of this
> is if we were to force a portwatcher build for all ports (which we might want
> to do when a new version of macOS is released). mpbb, MacPorts base and the
> ports tree would be updated once, and then it would schedule a portbuilder
> for each port that had not yet been built. Building all ports will take
> weeks. During that time, a commit may come through that updates a port to a
> new version. But if the build of the old version of the port was still
> pending at that time, the buildbot will build the old version, because it
> can't update the ports tree until the current portwatcher build is done
> waiting for its triggered portbuilder builds.
> 3. When there are portwatcher builds pending, we have no idea how many
> portbuilder builds are pending. It may say there are e.g. 3 portbuilder
> builds pending, but the pending portwatchers could trigger any number of
> additional portbuilders.
> An objection to this proposal was that buildbot 0.8 does not have the
> capability to dynamically create scheduler steps at runtime. But that's not
> required and that's not what I'm proposing.
> Buildbot has the ability to call a function for each step to determine if
> that step should run, by specifying the doStepIf property. I recently started
> using this feature in portwatcher to skip the two trigger steps if there are
> no ports in the port list:
> Here is an example of what that looks like when it runs:
> The only port that was committed there had already been built so it was
> excluded from the port list, leaving the list empty, so the portbuilder
> trigger step was skipped (to save time) and the mirror trigger step was
> skipped (to prevent it from printing an error that no ports were specified).
> The skipped steps are still shown in the web interface, but if that's not
> desired, they can be hidden by also using the hideStepIf property.
> So my proposed combined ports scheduler would still contain all of the steps
> of the current portwatcher and portbuilder schedulers, but each build would
> still conceptually "be" either a portwatcher or a portbuilder, and for each
> build, the steps that don't relate to that conceptual function would be
> skipped and hidden.
> Buildbot has the ability to associate custom properties with a build. We use
> this to set a "portname" property when we trigger portbuilder builds.
> portwatcher builds don't have that property. (They have a "portlist" property
> if they were forced from the web interface, and always have a "fullportlist"
> property which is the portlist plus the computed list of ports that were
> modified by the commit.) So if a build has the "portname" property, it is a
> portbuilder, otherwise it is a portwatcher. We can then write some simple
> def is_portbuilder(step):
> return step.hasProperty('portname')
> def is_portwatcher(step):
> return not step.hasProperty('portname')
> def is_skipped(result, step):
> return (result == buildbot.status.results.SKIPPED)
> The steps of the combined ports scheduler would then be:
> * update mpbb (doStepIf=is_portwatcher, hideStepIf=is_skipped)
> * clean (doStepIf=is_portwatcher, hideStepIf=is_skipped)
> * selfupdate MacPorts (doStepIf=is_portwatcher, hideStepIf=is_skipped)
> * sync ports tree (doStepIf=is_portwatcher, hideStepIf=is_skipped)
> * get the list of subports (doStepIf=is_portwatcher, hideStepIf=is_skipped)
> * trigger mirror scheduler (doStepIf=is_portwatcher, hideStepIf=is_skipped,
> * trigger ports scheduler for each subport (doStepIf=is_portwatcher,
> hideStepIf=True, waitForFinish=False)
> * install dependencies (doStepIf=is_portbuilder, hideStepIf=is_skipped)
> * install port (doStepIf=is_portbuilder, hideStepIf=is_skipped)
> * gather archives (doStepIf=is_portbuilder, hideStepIf=is_skipped)
> * upload archives to buildmaster (doStepIf=is_portbuilder,
> * deploy archives (doStepIf=is_portbuilder, hideStepIf=is_skipped)
> * clean (doStepIf=is_portbuilder, hideStepIf=is_skipped)
> (I'm advocating here for always hiding the ports trigger step. I don't find
> its output in the waterfall to be very useful and it can take up a lot of
> space if a lot of builds were triggered.)
> This should solve problem (1). By having portwatcher and portbuilder tasks in
> the same queue, they can't run simultaneously so they can't cause the
> problems that happen when they run simultaneously.
> Solving (2) and (3) requires an additional step. Buildbot allows you to
> define a nextBuild property when you create the builder, and pass it a
> function that determines which of the pending builds should go next.
> We would write a function that looks through the pending builds in the order
> in which they were scheduled, and picks the first one that is a portwatcher
> (the first one that doesn't have a "portname" property). If none are
> portwatchers, it picks the first* one. (* This is where we would later
> improve the situation to pick the next port in the correct dependency order,
> but I have another email about that.)
> This solves (2) because now when a new commit comes in, a new "portwatcher"
> gets added to the end of the ports scheduler's queue, but it will be the next
> build picked immediately after the current "portbuilder" finishes, no matter
> how many other "portbuilders" are still pending. That will update the ports
> tree, so any pending builds for ports that were subsequently updated will
> build the now-current version, not the old version.
> It also solves (3) because now if 100 "portbuilder" builds are pending, we
> don't have to wait until all of them are finished before we know how many
> builds all the pending "portwatcher" builds will schedule; we only have to
> wait until the current "portbuilder" finishes.
> One drawback, depending on how you look at it, is that we would no longer be
> able to send a single combined email for all of the failed builds of a single
> commit, or of a single forced build. We would have to send an individual
> email for each failed build. Personally, I would prefer that, as the subject
> line of the email would make clear which port failed to build, rather than
> requiring me to open it to see what happened.
> If we give the new combined scheduler a new name like "ports", that will
> invalidate all old links to build logs. That's not a deal-breaker for me, but
> since we often paste build log URLs into tickets, it would be nice to keep
> them alive if we can. Maybe defining empty portbuilder and portwatcher
> schedulers would be enough.