Re: Buildbot proposal: combine portwatcher and portbuilder

2018-03-13 Thread Ryan Schmidt

On Mar 13, 2018, at 09:50, Mojca Miklavec wrote:

> Dear Ryan,
> 
> Please take a look at the discussion summary we just had at the meeting:
> 
> https://trac.macports.org/wiki/Meetings/MacPortsMeeting2018/BuildbotRestructuring


So...

> Ryan's proposal
> 
> Ryan asked us to get rid of the separate portwatcher & portbuilder jobs and 
> re-configure the remaining job to interleave the two types of actions. As we 
> understood it, this was to solve the following problem:
> 
>   • A commit for portA comes in
>   • portwatcher for this commit schedules a portbuilder job for portA
>   • This build takes a long time
>   • While the build is still running, a new commit for the same port 
> arrives, which queues a portwatcher job
>   • Another commit for the same port arrives, which queues another 
> portwatcher job
>   • When the portbuilder job finishes, a useless build is scheduled for 
> portA.

Well, one of the several problems it would solve.

> He proposed the following to solve this:
> 
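>  +---------------+
>  |    commit1    |
>  +---------------+
>  |    prepare    |
>  |   resources   |
>  +---------------+
>  |   scheduler1  |
>  +---------------+
>  +---------------+
>  |   port1 dep1  |
>  +---------------+
>  +---------------+
>  |   port1 dep2  |
>  +---------------+
>  +---------------+
>  |    commit2    |
>  +---------------+
>  |    prepare    |
>  |   resources   |
>  +---------------+
>  |   scheduler2  |
>  +---------------+
>  +---------------+
>  |   port1 dep2  |
>  +---------------+
>  +---------------+
>  |     port1     |
>  +---------------+
>  +---------------+
>  |   port2 dep1  |
>  +---------------+
>  +---------------+
>  |   port2 dep2  |
>  +---------------+
>  +---------------+
>  |     port2     |
>  +---------------+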

If those boxes for "port2 dep1", "port2 dep2", "port2" are meant to all be 
separate builds, then that's not what I proposed. I proposed that each build in 
the combined scheduler would either perform the steps of the current 
portwatcher, or the steps of the current portbuilder. So what you've shown as 
separate "port2 dep1" and "port2 dep2" builds or steps, I was proposing would 
be our current "install deps of port2" portbuilder step, to be followed in the 
same build by an "install port2" step.


> Unfortunately that introduces the problem with the prepared shared resources 
> (like the portindex, mpbb checkout, ports tree), because we really cannot 
> change the ports tree while we have builds scheduled in a dependency order that 
> was computed from the old ports tree. This would cause seemingly random and 
> hard-to-debug problems if a follow-up commit changes dependencies of ports.

I don't think this causes any problems; on the contrary, it solves them. If on 
April 1 I schedule a build of all ports, and it takes a month to finish them 
all, I don't want it to spend the rest of the month building ports as they were 
on April 1. I want it to build ports as they are at the time that it gets to 
the build. If a commit comes in on April 2, I want that to be reflected in the 
previously-scheduled but still pending builds.

For a specific example on my proposed setup, let's say I:

* force a build of zlib @1.2.11, libpng @1.6.34_1, pngpp @0.2.5_2, and pigz 
@2.4. (I only specify the port names of course, but those are the versions and 
revisions that are current at the time I force the build.) This schedules build 
b1, which "is" a "portwatcher".
* b1(portwatcher) updates the shared resources, and schedules four builds: 
b2(portbuilder:zlib), b3(portbuilder:libpng), b4(portbuilder:pngpp), 
b5(portbuilder:pigz).
* b2(portbuilder:zlib) installs dependencies of zlib, installs zlib @1.2.11, 
cleans up.
* b3(portbuilder:libpng) installs dependencies of libpng, installs libpng 
@1.6.34_1, cleans up.
* While libpng was building, commit abcd came in that updates zlib to the 
long-awaited version 2.0! It also revbumps all dependents because zlib 2 has a 
new library version. This schedules b6(portwatcher) at the end of the queue.
* b6(portwatcher) is selected to run next, because it is a portwatcher and has 
precedence over all portbuilders. It updates shared resources, schedules 
b7(portbuilder:zlib) and lots of other builds for the dependents that got 
revbumped, which includes libpng and pigz.
* b4(portbuilder:pngpp) installs dependencies of pngpp, which include libpng 
and zlib. Since the shared resources have already been updated by 
b6(portwatcher), this step of this build is where zlib @2.0 and the revbumped 
libpng @1.6.34_2 end up getting built. It then installs pngpp @0.2.5_2 and 
cleans up.
* b5(portbuilder:pigz) installs dependencies of pigz, installs pigz @2.4_1, 
cleans up. This build was scheduled before commit abcd came in, yet thanks to 
the shared resources being updated by b6(portwatcher) earlier, it is able to 
build the latest version of the port available now, and not waste time building 
an already-superseded version.
* 
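
The ordering in the example above can be checked with a small simulation. This is only a sketch: the Request tuple, next_build() and simulate() are made-up names, a request without a portname is treated as a portwatcher, and of the many builds that b6 would schedule only b7 is modelled. In a real configuration the selection logic would live in whatever hook picks the next build request.

```python
from collections import namedtuple

# Hypothetical stand-in for a build request; portname=None means "portwatcher".
Request = namedtuple("Request", ["name", "portname"])


def next_build(pending):
    """Pick the oldest pending portwatcher if there is one, else the oldest request."""
    for req in pending:
        if req.portname is None:
            return req
    return pending[0] if pending else None


def simulate():
    """Replay the zlib/libpng/pngpp/pigz example and return the run order."""
    pending = [Request("b1", None)]  # the forced build
    order = []
    while pending:
        req = next_build(pending)
        pending.remove(req)
        order.append(req.name)
        if req.name == "b1":
            # b1(portwatcher) schedules one portbuilder per requested port.
            pending += [
                Request("b2", "zlib"),
                Request("b3", "libpng"),
                Request("b4", "pngpp"),
                Request("b5", "pigz"),
            ]
        elif req.name == "b3":
            # Commit abcd arrives while libpng builds; b6 joins the end of the queue.
            pending.append(Request("b6", None))
        elif req.name == "b6":
            # b6(portwatcher) schedules the new zlib build (dependents omitted here).
            pending.append(Request("b7", "zlib"))
    return order


if __name__ == "__main__":
    print(" ".join(simulate()))
```

Because b6 is a portwatcher it jumps ahead of b4 and b5, so both of those builds see the updated shared resources.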

Re: Buildbot proposal: combine portwatcher and portbuilder

2018-03-13 Thread Mojca Miklavec
Dear Ryan,

Please take a look at the discussion summary we just had at the meeting:

https://trac.macports.org/wiki/Meetings/MacPortsMeeting2018/BuildbotRestructuring

Regarding emails: we figured out that it makes no sense to make our
design decisions based on how annoying it would be to write emails. It
will probably be easier to write a separate set of scripts to send
individual emails separately.

Regarding the width of the waterfall: buildbot one solves that, the
waterfall is much leaner there :)

I have another request: could you please package buildbot 1.1 in a
Portfile? Ideally those who have buildbot installed now should get the
port replaced_by buildbot-0.8 and those who install it from scratch
should get buildbot version 1.0, but I'm not sure if MacPorts
currently supports such a migration scheme. It would make more sense to
have buildbot 1.1 named "buildbot" and the old one named
"buildbot-0.8" though. Perhaps the new one should be called buildbot-1
after all, at least for a while, I don't know.

Mojca


Re: Buildbot proposal: combine portwatcher and portbuilder

2018-03-12 Thread Ryan Schmidt

On Mar 11, 2018, at 17:48, Rainer Müller wrote:
> On 2018-03-11 10:25, Ryan Schmidt wrote:
>> The current buildbot setup has a number of problems that I believe could be 
>> solved by combining the currently separate portwatcher and portbuilder 
>> schedulers into a single ports scheduler.
>> 
>> I am not suggesting that we return to the behavior of the ports scheduler on 
>> the old macOS forge buildbot system in which a single build would build all 
>> the specified ports. We will keep the current method of building only one 
>> port (and its dependencies) per build.
>> 
>> The problems I want to solve are the following:
>> 
>> 1. Currently, portwatcher is responsible for updating a copy of mpbb, 
>> MacPorts base and a ports tree that it shares with portbuilder. Having 
>> portbuilder maintain its own copy would waste a lot of time. If someone 
>> makes one commit that changes 100 ports and then no further commits occur 
>> for hours, we only want to update mpbb, MacPorts base and the ports tree 
>> once, not 100 times. But the fact that it's shared means that portwatcher 
>> must (and is configured to) wait for all triggered portbuilder builds to 
>> finish before it processes the next commit. This works fine, unless the 
>> buildmaster is stopped while portwatcher builds are pending. This has 
>> happened several times when the servers lost power during a power outage. 
>> (The servers are on a UPS, but the UPS does not provide as much 
>> instantaneous power as I expected, so if the servers are busy building, they 
>> draw more power than the UPS can instantaneously provide and the servers 
>> shut down immediately. I might remove the buildworker machines from the UPS 
>> and leave only the buildmaster, modem and router on it.) When buildmaster 
>> comes back online, it sees the portbuilder build that was in progress and 
>> starts it again, but it also sees the portwatcher build that was in progress 
>> and starts it again. Now we have a portwatcher running (updating mpbb, 
>> updating MacPorts, updating the ports tree, and updating its portindex) 
>> while portbuilder is trying to install a port. The portbuilder can fail if 
>> it is trying to install ports at the moment that portwatcher is updating the 
>> index (see https://trac.macports.org/ticket/53587).
> 
> Do the steps for selfupdate/sync really hurt that much that we cannot just
> run them on every portbuilder run? Looking at the portwatcher build you
> linked as an example, these steps only took a few seconds in total. Why
> can we not just move these steps to the portbuilder?

I stand by my proposal. I want to update the ports tree clone on the buildbot 
workers only when it changes. I don't want to waste cycles updating a tree that 
has not changed.

> At the moment, portwatcher and portbuilder are sharing resources, but
> buildbot assumes each builder is isolated and that leads to these
> problems when resuming builds. I guess we should not do that...

If you want a separate portwatcher and portbuilder that each have their own 
separate copy of mpbb, MacPorts, the ports tree, and the portindex, not only is 
that more disk space used, but then a simple single port update commit will 
cause the buildworkers to have to do twice as much updating, which will take 
longer than what we're doing now. I want to optimize, not de-optimize! :)

>> 2. If a single portwatcher build "X" triggers many portbuilder builds, and 
>> while those portbuilder builds are in progress another commit comes in that 
>> would affect those ports, it doesn't notice until all portbuilder builds 
>> triggered by "X" are finished. This can waste time building ports that are 
>> already superseded by newer versions or revisions. An extreme example of 
>> this is if we were to force a portwatcher build for all ports (which we 
>> might want to do when a new version of macOS is released). mpbb, MacPorts 
>> base and the ports tree would be updated once, and then it would schedule a 
>> portbuilder for each port that had not yet been built. Building all ports 
>> will take weeks. During that time, a commit may come through that updates a 
>> port to a new version. But if the build of the old version of the port was 
>> still pending at that time, the buildbot will build the old version, because 
>> it can't update the ports tree until the current portwatcher build is done 
>> waiting for its triggered portbuilder builds.
> 
> To me it seems like this is only an issue for forced builds, not for the
> builds scheduled by commits.

No, it is relevant for all situations that cause lots of ports to build. My 
extreme example was forcing a build of all ports, but the same problem would 
happen if you committed a change to many ports, such as a commit that occurred 
in the past to remove $Id$ lines from all ports, or commits that will happen in 
the future to batch-add GitHub maintainer handles and add file size to 
checksums.

> So maybe for this use case the force scheduler should 

Re: Buildbot proposal: combine portwatcher and portbuilder

2018-03-11 Thread Rainer Müller
On 2018-03-11 10:25, Ryan Schmidt wrote:
> The current buildbot setup has a number of problems that I believe could be 
> solved by combining the currently separate portwatcher and portbuilder 
> schedulers into a single ports scheduler.
> 
> I am not suggesting that we return to the behavior of the ports scheduler on 
> the old macOS forge buildbot system in which a single build would build all 
> the specified ports. We will keep the current method of building only one 
> port (and its dependencies) per build.
> 
> The problems I want to solve are the following:
> 
> 1. Currently, portwatcher is responsible for updating a copy of mpbb, 
> MacPorts base and a ports tree that it shares with portbuilder. Having 
> portbuilder maintain its own copy would waste a lot of time. If someone makes 
> one commit that changes 100 ports and then no further commits occur for 
> hours, we only want to update mpbb, MacPorts base and the ports tree once, 
> not 100 times. But the fact that it's shared means that portwatcher must (and 
> is configured to) wait for all triggered portbuilder builds to finish before 
> it processes the next commit. This works fine, unless the buildmaster is 
> stopped while portwatcher builds are pending. This has happened several times 
> when the servers lost power during a power outage. (The servers are on a UPS, 
> but the UPS does not provide as much instantaneous power as I expected, so if 
> the servers are busy building, they draw more power than the UPS can 
> instantaneously provide and the servers shut down immediately. I might remove 
> the buildworker machines from the UPS and leave only the buildmaster, modem 
> and router on it.) When buildmaster comes back online, it sees the 
> portbuilder build that was in progress and starts it again, but it also sees 
> the portwatcher build that was in progress and starts it again. Now we have a 
> portwatcher running (updating mpbb, updating MacPorts, updating the ports 
> tree, and updating its portindex) while portbuilder is trying to install a 
> port. The portbuilder can fail if it is trying to install ports at the moment 
> that portwatcher is updating the index (see 
> https://trac.macports.org/ticket/53587).

Do the steps for selfupdate/sync really hurt that much that we cannot just
run them on every portbuilder run? Looking at the portwatcher build you
linked as an example, these steps only took a few seconds in total. Why
can we not just move these steps to the portbuilder?

At the moment, portwatcher and portbuilder are sharing resources, but
buildbot assumes each builder is isolated and that leads to these
problems when resuming builds. I guess we should not do that...

> 2. If a single portwatcher build "X" triggers many portbuilder builds, and 
> while those portbuilder builds are in progress another commit comes in that 
> would affect those ports, it doesn't notice until all portbuilder builds 
> triggered by "X" are finished. This can waste time building ports that are 
> already superseded by newer versions or revisions. An extreme example of this 
> is if we were to force a portwatcher build for all ports (which we might want 
> to do when a new version of macOS is released). mpbb, MacPorts base and the 
> ports tree would be updated once, and then it would schedule a portbuilder 
> for each port that had not yet been built. Building all ports will take 
> weeks. During that time, a commit may come through that updates a port to a 
> new version. But if the build of the old version of the port was still 
> pending at that time, the buildbot will build the old version, because it 
> can't update the ports tree until the current portwatcher build is done 
> waiting for its triggered portbuilder builds.

To me it seems like this is only an issue for forced builds, not for the
builds scheduled by commits.

So maybe for this use case the force scheduler should be on a level one
higher in the hierarchy. Then the force scheduler would get a list of
ports and for each schedule a portwatcher with only one port name (or
use some other way of partitioning).
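
As a rough illustration of that partitioning idea, the helper below splits a forced port list into chunks, each of which would become its own portwatcher build. The function name and the size parameter are assumptions made for this sketch; nothing like it exists in the current configuration.

```python
def partition_ports(ports, size=1):
    """Split a list of port names into consecutive chunks of at most `size` ports."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [ports[i:i + size] for i in range(0, len(ports), size)]


if __name__ == "__main__":
    # One portwatcher per port name, as suggested above.
    for chunk in partition_ports(["zlib", "libpng", "pigz"]):
        print("schedule portwatcher for:", " ".join(chunk))
```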

> 3. When there are portwatcher builds pending, we have no idea how many 
> portbuilder builds are pending. It may say there are e.g. 3 portbuilder 
> builds pending, but the pending portwatchers could trigger any number of 
> additional portbuilders.

Why does it matter how many portbuilder builds will be scheduled later?
I do not see the problem...?


Overall, my immediate thought was that with the current portwatcher, we
could just not wait for the triggered builds to finish. That seems to
solve (1) and (2), but also lose the ability to send summary emails.
Although I have not thought enough about whether this would really work.

Am I missing something why we would definitely need to merge portwatcher
and portbuilder?

> An objection to this proposal was that buildbot 0.8 does not have the 
> capability to dynamically create scheduler steps at runtime. But that's not 
> 

Re: Buildbot proposal: combine portwatcher and portbuilder

2018-03-11 Thread Mojca Miklavec
Dear Ryan,

Conceptually I like the approach a lot.

*However*, this will generate an incomprehensible amount of emails
when a new build slave gets added or when some revbumps are done etc.
In particular from the 10.5/ppc machine. The individual emails will be
a lot more helpful, but the amount of them ...

This sounds totally hacky, but one way around that I can see is to
write a special-purpose mailer. Either a job on the buildbot or a
special external script that iterates through all the jobs from the
last hour from all the builders and creates build failure summaries
for each individual developer (author/committer/maintainer). Maybe
this could be a build job on the builder master, but I don't know how
tricky it would get to do it.
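
To make the idea a bit more concrete, here is a minimal sketch of the grouping step such a mailer would need. The shape of the build records (dicts with "port", "builder", "result", "finished" and "recipients" keys) and the function name are assumptions for this example; fetching the records from the buildmaster and actually sending mail are left out.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def failure_summaries(builds, now, window=timedelta(hours=1)):
    """Group recent failed builds by recipient.

    Returns a dict mapping each recipient to a list of "port on builder" strings
    for builds that failed within `window` before `now`.
    """
    summaries = defaultdict(list)
    for build in builds:
        if build["result"] != "failure":
            continue
        if now - build["finished"] > window:
            continue  # older than the reporting window
        for person in build["recipients"]:
            summaries[person].append("%s on %s" % (build["port"], build["builder"]))
    return dict(summaries)


if __name__ == "__main__":
    now = datetime(2018, 3, 11, 12, 0)
    builds = [
        {"port": "zlib", "builder": "10.5_ppc", "result": "failure",
         "finished": now - timedelta(minutes=10), "recipients": ["alice", "bob"]},
        {"port": "pigz", "builder": "10.13_x86_64", "result": "success",
         "finished": now - timedelta(minutes=5), "recipients": ["alice"]},
    ]
    for person, lines in sorted(failure_summaries(builds, now).items()):
        print(person, lines)
```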

How would you handle duplicate entries in the build list? Dependency
order could also become semi-obsolete, but that's probably a price to
be paid.

I tried to play with nextBuild, but I couldn't figure out how to read
properties of the build request.

Mojca

On 11 March 2018 at 10:25, Ryan Schmidt wrote:
> The current buildbot setup has a number of problems that I believe could be 
> solved by combining the currently separate portwatcher and portbuilder 
> schedulers into a single ports scheduler.
>
> I am not suggesting that we return to the behavior of the ports scheduler on 
> the old macOS forge buildbot system in which a single build would build all 
> the specified ports. We will keep the current method of building only one 
> port (and its dependencies) per build.
>
> The problems I want to solve are the following:
>
> 1. Currently, portwatcher is responsible for updating a copy of mpbb, 
> MacPorts base and a ports tree that it shares with portbuilder. Having 
> portbuilder maintain its own copy would waste a lot of time. If someone makes 
> one commit that changes 100 ports and then no further commits occur for 
> hours, we only want to update mpbb, MacPorts base and the ports tree once, 
> not 100 times. But the fact that it's shared means that portwatcher must (and 
> is configured to) wait for all triggered portbuilder builds to finish before 
> it processes the next commit. This works fine, unless the buildmaster is 
> stopped while portwatcher builds are pending. This has happened several times 
> when the servers lost power during a power outage. (The servers are on a UPS, 
> but the UPS does not provide as much instantaneous power as I expected, so if 
> the servers are busy building, they draw more power than the UPS can 
> instantaneously provide and the servers shut down immediately. I might remove 
> the buildworker machines from the UPS and leave only the buildmaster, modem 
> and router on it.) When buildmaster comes back online, it sees the 
> portbuilder build that was in progress and starts it again, but it also sees 
> the portwatcher build that was in progress and starts it again. Now we have a 
> portwatcher running (updating mpbb, updating MacPorts, updating the ports 
> tree, and updating its portindex) while portbuilder is trying to install a 
> port. The portbuilder can fail if it is trying to install ports at the moment 
> that portwatcher is updating the index (see 
> https://trac.macports.org/ticket/53587).
>
> 2. If a single portwatcher build "X" triggers many portbuilder builds, and 
> while those portbuilder builds are in progress another commit comes in that 
> would affect those ports, it doesn't notice until all portbuilder builds 
> triggered by "X" are finished. This can waste time building ports that are 
> already superseded by newer versions or revisions. An extreme example of this 
> is if we were to force a portwatcher build for all ports (which we might want 
> to do when a new version of macOS is released). mpbb, MacPorts base and the 
> ports tree would be updated once, and then it would schedule a portbuilder 
> for each port that had not yet been built. Building all ports will take 
> weeks. During that time, a commit may come through that updates a port to a 
> new version. But if the build of the old version of the port was still 
> pending at that time, the buildbot will build the old version, because it 
> can't update the ports tree until the current portwatcher build is done 
> waiting for its triggered portbuilder builds.
>
> 3. When there are portwatcher builds pending, we have no idea how many 
> portbuilder builds are pending. It may say there are e.g. 3 portbuilder 
> builds pending, but the pending portwatchers could trigger any number of 
> additional portbuilders.
>
> An objection to this proposal was that buildbot 0.8 does not have the 
> capability to dynamically create scheduler steps at runtime. But that's not 
> required and that's not what I'm proposing.
>
> Buildbot has the ability to call a function for each step to determine if 
> that step should run, by specifying the doStepIf property. I recently started 
> using this feature in portwatcher to skip the two trigger steps if there are 
> no ports 

Buildbot proposal: combine portwatcher and portbuilder

2018-03-11 Thread Ryan Schmidt
The current buildbot setup has a number of problems that I believe could be 
solved by combining the currently separate portwatcher and portbuilder 
schedulers into a single ports scheduler.

I am not suggesting that we return to the behavior of the ports scheduler on 
the old macOS forge buildbot system in which a single build would build all the 
specified ports. We will keep the current method of building only one port (and 
its dependencies) per build.

The problems I want to solve are the following:

1. Currently, portwatcher is responsible for updating a copy of mpbb, MacPorts 
base and a ports tree that it shares with portbuilder. Having portbuilder 
maintain its own copy would waste a lot of time. If someone makes one commit 
that changes 100 ports and then no further commits occur for hours, we only 
want to update mpbb, MacPorts base and the ports tree once, not 100 times. But 
the fact that it's shared means that portwatcher must (and is configured to) 
wait for all triggered portbuilder builds to finish before it processes the 
next commit. This works fine, unless the buildmaster is stopped while 
portwatcher builds are pending. This has happened several times when the 
servers lost power during a power outage. (The servers are on a UPS, but the 
UPS does not provide as much instantaneous power as I expected, so if the 
servers are busy building, they draw more power than the UPS can 
instantaneously provide and the servers shut down immediately. I might remove 
the buildworker machines from the UPS and leave only the buildmaster, modem and 
router on it.) When buildmaster comes back online, it sees the portbuilder 
build that was in progress and starts it again, but it also sees the 
portwatcher build that was in progress and starts it again. Now we have a 
portwatcher running (updating mpbb, updating MacPorts, updating the ports tree, 
and updating its portindex) while portbuilder is trying to install a port. The 
portbuilder can fail if it is trying to install ports at the moment that 
portwatcher is updating the index (see https://trac.macports.org/ticket/53587).

2. If a single portwatcher build "X" triggers many portbuilder builds, and 
while those portbuilder builds are in progress another commit comes in that 
would affect those ports, it doesn't notice until all portbuilder builds 
triggered by "X" are finished. This can waste time building ports that are 
already superseded by newer versions or revisions. An extreme example of this 
is if we were to force a portwatcher build for all ports (which we might want 
to do when a new version of macOS is released). mpbb, MacPorts base and the 
ports tree would be updated once, and then it would schedule a portbuilder for 
each port that had not yet been built. Building all ports will take weeks. 
During that time, a commit may come through that updates a port to a new 
version. But if the build of the old version of the port was still pending at 
that time, the buildbot will build the old version, because it can't update the 
ports tree until the current portwatcher build is done waiting for its 
triggered portbuilder builds.

3. When there are portwatcher builds pending, we have no idea how many 
portbuilder builds are pending. It may say there are e.g. 3 portbuilder builds 
pending, but the pending portwatchers could trigger any number of additional 
portbuilders.

An objection to this proposal was that buildbot 0.8 does not have the 
capability to dynamically create scheduler steps at runtime. But that's not 
required and that's not what I'm proposing.

Buildbot has the ability to call a function for each step to determine if that 
step should run, by specifying the doStepIf property. I recently started using 
this feature in portwatcher to skip the two trigger steps if there are no ports 
in the port list:

https://github.com/macports/macports-infrastructure/commit/18135d6c75698f88b48698473c9364063fb6fba9

Here is an example of what that looks like when it runs:

https://build.macports.org/builders/ports-10.13_x86_64-watcher/builds/3989

The only port that was committed there had already been built so it was 
excluded from the port list, leaving the list empty, so the portbuilder trigger 
step was skipped (to save time) and the mirror trigger step was skipped (to 
prevent it from printing an error that no ports were specified). The skipped 
steps are still shown in the web interface, but if that's not desired, they can 
be hidden by also using the hideStepIf property.
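
For illustration, the two predicates might look roughly like the sketch below. It does not import buildbot: FakeStep is a stand-in for a real build step, the "portlist" property name and both function names are assumptions for this example, and SKIPPED = 3 mirrors the value of buildbot's SKIPPED result constant. In buildbot, a doStepIf callable receives the step, and a hideStepIf callable receives the results and the step.

```python
SKIPPED = 3  # same numeric value as buildbot's SKIPPED result constant


class FakeStep:
    """Stand-in for a buildbot step; only getProperty() is modelled."""

    def __init__(self, properties):
        self._properties = properties

    def getProperty(self, name, default=None):
        return self._properties.get(name, default)


def has_ports(step):
    """doStepIf predicate: run the step only if the port list is non-empty."""
    portlist = step.getProperty("portlist", "")
    return bool(portlist and portlist.strip())


def hide_if_skipped(results, step):
    """hideStepIf predicate: hide the step in the web interface if it was skipped."""
    return results == SKIPPED


# In master.cfg these would be passed to a step roughly as
#   SomeStep(..., doStepIf=has_ports, hideStepIf=hide_if_skipped)
# where SomeStep is whichever trigger step should be conditional.

if __name__ == "__main__":
    print(has_ports(FakeStep({"portlist": "zlib libpng"})))
    print(has_ports(FakeStep({"portlist": ""})))
```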

So my proposed combined ports scheduler would still contain all of the steps of 
the current portwatcher and portbuilder schedulers, but each build would still 
conceptually "be" either a portwatcher or a portbuilder, and for each build, 
the steps that don't relate to that conceptual function would be skipped and 
hidden.

Buildbot has the ability to associate custom properties with a build. We use 
this to set a "portname" property when we trigger portbuilder builds.
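
A rough sketch of how that property could select a build's role in the combined scheduler: a build that carries a portname acts as a portbuilder, and one without it acts as a portwatcher. The STEPS table, the step labels and the helper functions are made up for this example and are not the actual configuration.

```python
# (step label, role the step belongs to); labels are illustrative only.
STEPS = [
    ("update mpbb", "portwatcher"),
    ("update MacPorts base", "portwatcher"),
    ("update ports tree", "portwatcher"),
    ("update portindex", "portwatcher"),
    ("trigger portbuilders", "portwatcher"),
    ("install dependencies", "portbuilder"),
    ("install port", "portbuilder"),
    ("clean up", "portbuilder"),
]


def build_role(properties):
    """A build with a non-empty 'portname' property is a portbuilder."""
    return "portbuilder" if properties.get("portname") else "portwatcher"


def steps_to_run(properties):
    """Return the labels of the steps that would not be skipped for this build."""
    role = build_role(properties)
    return [label for label, step_role in STEPS if step_role == role]


if __name__ == "__main__":
    print(steps_to_run({}))                    # a portwatcher-type build
    print(steps_to_run({"portname": "zlib"}))  # a portbuilder-type build
```

In a real setup each step's doStepIf would make the equivalent check against the build's properties, and hideStepIf would hide the steps belonging to the other role.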