On 4/22/13 9:54 PM, Kartikaya Gupta wrote:
TL;DR:
* Inbound is closed 25% of the time
* Turning off coalescing could increase resource usage by up to 60% (but
probably less than this).
* We spend 24% of our machine resources on changes that are later backed
out, or changes that are doing the backout
* The vast majority of changesets that are backed out from inbound are
detectable on a try push

Do we know how many of these had been pushed to try, and passed or
compiled there on what they later failed?

I expect some of the cost of regressions to come from merging/rebasing, and
it'd be interesting to know how much of that you can see in the data window
you looked at.

"has been pushed to try" is obviously tricky to find out, in particular on rebases, and possibly modified patches during the rebase.

Axel


Because of the large effect from coalescing, any changes to the current
process must not require running the full set of tests on every push.
(In my proposal this is easily accomplished with trychooser syntax, but
other proposals include rotating through T-runs on pushes, etc.).

--- Long version below ---

Following up from the infra load meeting we had last week, I spent some
time this weekend crunching various pieces of data on mozilla-inbound to
get a sense of how much coalescing actually helps us, how much backouts
hurt us, and generally to get some data on the impact of my previous
proposal for using a multi-headed tree. I didn't get all the data that I
wanted but as I probably won't get back to this for a bit, I thought I'd
share what I found so far and see if anybody has other specific pieces
of data they would like to see gathered.

-- Inbound uptime --

I looked at a ~9 day period from April 7th to April 16th. During this time:
* inbound was closed for 24.9587% of the total time
* inbound was closed for 15.3068% of the total time due to "bustage".
* inbound was closed for 11.2059% of the total time due to "infra".

Notes:
1) "bustage" and "infra" were determined by grep -i on the data from
treestatus.mozilla.org.
2) There is some overlap so bustage + infra != total.
3) I also weighted the downtime using the checkins-per-hour histogram from
joduinn's blog at [1], but this didn't have a significant impact: the
total, bustage, and infra downtime percentages moved to 25.5392%,
15.7285%, and 11.3748% respectively.
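For clarity, here is a rough Python sketch of the kind of calculation I
mean. The closure entries and the pushes-per-hour histogram in it are
made-up placeholders rather than the real treestatus/joduinn data, so the
numbers it prints are meaningless; it only illustrates the unweighted vs.
weighted computation.

# Rough sketch of the downtime calculation. The closure entries and the
# pushes-per-hour histogram are made-up placeholders, not the real data.

WINDOW_DAYS = 9
WINDOW_HOURS = 24.0 * WINDOW_DAYS

# (start_hour, end_hour, reason), in hours from the start of the window.
closures = [
    (30, 34, "closed: bustage in mochitests"),
    (50, 53, "closed: infra problems with slaves"),
]

# Hypothetical checkins-per-hour histogram, indexed by hour of day.
pushes_per_hour = [2, 1, 1, 1, 2, 3, 5, 8, 10, 12, 14, 15,
                   15, 14, 13, 12, 11, 10, 9, 7, 6, 5, 4, 3]

def downtime_pct(pred):
    # Unweighted: every hour of closure counts the same.
    hours = sum(end - start for start, end, reason in closures
                if pred(reason.lower()))
    return 100.0 * hours / WINDOW_HOURS

def weighted_downtime_pct(pred):
    # Weighted: each closed hour counts by how many pushes usually land then.
    weight = sum(pushes_per_hour[h % 24]
                 for start, end, reason in closures if pred(reason.lower())
                 for h in range(start, end))
    return 100.0 * weight / (sum(pushes_per_hour) * WINDOW_DAYS)

print("closed:           %.2f%%" % downtime_pct(lambda r: True))
print("closed (bustage): %.2f%%" % downtime_pct(lambda r: "bustage" in r))
print("closed (infra):   %.2f%%" % downtime_pct(lambda r: "infra" in r))
print("weighted closed:  %.2f%%" % weighted_downtime_pct(lambda r: True))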

-- Backout changes --

Next I did an analysis of the changes that landed on inbound during that
time period. The exact pushlog that I looked at (corresponding to the
same April 7 - April 16 time period) is at [2]. I removed all of the
merge changesets from this range, since I wanted to look at inbound in
as much isolation as possible.

In this range:
* there were a total of 916 changesets
* there were a total of 553 "pushes"
* 74 of the 916 changesets (8.07%) were backout changesets
* 116 of the 916 changesets (12.66%) were backed out
* removing all backout changesets and the changesets they backed out
eliminated 114 pushes (20.6%)

Of the 116 changesets that were backed out:
* 37 belonged to single-changeset pushes
* 65 belonged to multi-changeset pushes where the entire push was
backed out
* 14 belonged to multi-changeset pushes where the changesets were
selectively backed out

Of the 74 backout changesets:
* 4 were for commit message problems
* 25 were for build failures
* 36 were for test failures
* 5 were for leaks/talos regressions
* 1 was for premature landing
* 3 were for unknown reasons

Notes:
1) There were actually 79 backouts, but I ignored 5 of them because they
backed out changes that happened prior to the start of my range.
2) Additional changes at the end of my range may have been backed out,
but the backouts were not in my range so I didn't include them in my
analysis.
3) The 14 csets that were selectively backed out are interesting to me
because they imply that somebody did some work to identify which changes
in the push were bad, and this naturally means that there is room to
save on doing that work.
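For anyone who wants to poke at the same range, something along these
lines will pull the raw numbers out of the pushlog. The json-pushes query
and the "changesets"/"desc" fields are the public pushlog API as far as I
know, but the backout regex and the reason keywords are rough
approximations of my (partly manual) classification, so don't expect it to
reproduce the buckets exactly.

# Sketch of extracting and classifying backouts from the pushlog [2].
# The regex and keyword buckets are approximations, not the exact rules.
import json
import re
from collections import Counter
from urllib.request import urlopen

URL = ("https://hg.mozilla.org/integration/mozilla-inbound/json-pushes"
       "?fromchange=74354f979ea8&tochange=cad82c3b69bc&full=1")
pushes = json.loads(urlopen(URL).read().decode("utf-8"))

backout_re = re.compile(r"\bback(ed|ing)?[ -]?out\b", re.IGNORECASE)

# Flatten the pushes into changesets, dropping merge changesets.
changesets = [cset
              for push in pushes.values()
              for cset in push["changesets"]
              if not cset["desc"].lower().startswith("merge")]

backouts = [c for c in changesets if backout_re.search(c["desc"])]
print("%d changesets, %d backout changesets" % (len(changesets), len(backouts)))

def reason(desc):
    # Rough bucketing of backout reasons by keywords in the commit message.
    d = desc.lower()
    if "bustage" in d or "build" in d:
        return "build failure"
    if "leak" in d or "talos" in d:
        return "leak/talos regression"
    if "orange" in d or "test" in d or "failure" in d:
        return "test failure"
    return "unknown/other"

print(Counter(reason(c["desc"]) for c in backouts))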

-- Merge conflicts --

I also wanted to determine how many of these changes conflicted with
each other, and how far away the conflicting changes were. I got a
partial result here but I need to do more analysis before I have numbers
worth posting.

-- Build farm resources --

Finally, I used a combination of gps' mozilla-build-analyzer tool [3]
and some custom tools to determine how much machine time was spent on
building all of these pushes and changes.

I looked at all the build.json files [4] from the 6th of April to the
17th of April and pulled out all the jobs that corresponding to the
"push" changesets in my range above. For this set of 553 changesets,
there were 500 (exactly!) distinct "builders". 111 of these had "-pgo"
or "_pgo" in the name, and I excluded them. I created a 553x389 matrix
with the remaining builders and filled in how much time was spent on
each changeset for each builder (in case of multiple jobs, I added the
times).
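A sketch of how that matrix gets built is below. Caveat: the build.json
field names in it ("builds", "properties", "buildername", "revision",
"starttime", "endtime") are from memory and may not exactly match the
dump format, so treat it as pseudocode that happens to run.

# Sketch of the (changeset x builder) machine-time matrix, built from the
# build.json dumps [4] (assumed downloaded and gunzipped locally). Field
# names are from memory and may not match the dumps exactly.
import json
from collections import defaultdict

# The 553 push changesets (12-char hashes) from the pushlog range above.
push_changesets = set()

matrix = defaultdict(float)   # (changeset, builder) -> seconds
builders = set()

for day in range(6, 18):      # April 6th through April 17th
    with open("builds-2013-04-%02d.js" % day) as f:
        data = json.load(f)
    for build in data["builds"]:
        props = build.get("properties", {})
        rev = (props.get("revision") or "")[:12]
        builder = props.get("buildername") or ""
        if rev not in push_changesets:
            continue
        if "-pgo" in builder or "_pgo" in builder:
            continue          # drop the PGO builders
        start, end = build.get("starttime"), build.get("endtime")
        if start is None or end is None:
            continue
        builders.add(builder)
        # Multiple jobs for the same changeset/builder pair are summed.
        matrix[(rev, builder)] += end - start

print("%d builders, %d filled matrix entries" % (len(builders), len(matrix)))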

Then I assumed that any empty field in the 553x389 matrix was a result
of coalescing. This is a grossly simplifying assumption that I would
like to revisit - I know that for Android-only changes we can detect
this in some cases and run only the relevant tests; my assumption means
the rest of the platforms are counted as "coalesced" for those changes.
I filled in these empty fields in the matrix with the average time spent
on all the other builds for that builder.
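Continuing from the matrix sketch above, the de-coalescing step is
roughly the following (again only an approximation of the real script):

# De-coalescing assumption: every empty (changeset, builder) cell is
# treated as a coalesced job and charged that builder's average time.
actual_total = sum(matrix.values())

builder_avg = {}
for b in builders:
    times = [matrix[(c, b)] for c in push_changesets if (c, b) in matrix]
    builder_avg[b] = sum(times) / len(times) if times else 0.0

decoalesced_total = actual_total
entries = len(matrix)
for c in push_changesets:
    for b in builders:
        if (c, b) not in matrix:
            decoalesced_total += builder_avg[b]
            entries += 1

print("actual:       %d seconds over %d entries" % (actual_total, len(matrix)))
print("de-coalesced: %d seconds over %d entries (+%.0f%%)"
      % (decoalesced_total, entries,
         100.0 * (decoalesced_total - actual_total) / max(actual_total, 1.0)))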

* A total of 228717299 seconds were spent on the 128777 entries in the
matrix
* After de-coalescing, a total of 373751505 seconds would have been
spent on the 215117 entries in the matrix (an increase of 63%)
* With coalescing, but removing all the backout pushes and pushes that
were completely backed out, a total of 173027517 seconds were spent on
97623 entries (down 24% from actual usage)
* With de-coalescing AND stripping backouts, a total of 292634211
seconds would have been spent on the 168437 entries (an increase of 27%
over actual usage)

Notes:
1) I tried a minor variation where I excluded the 21 builders that ran
on fewer than 50% of the changesets, on the assumption that these were
not coalesced out but are actually run on demand (the filter is sketched
below). This brought the increase down from 63% to 58%.
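The filter for that variation is essentially the following (using the
matrix and push_changesets from the sketches above):

# Keep only builders that ran on at least 50% of the push changesets;
# the rest are assumed to be run on demand rather than coalesced out.
threshold = 0.5 * len(push_changesets)
coalesced_builders = {
    b for b in builders
    if sum(1 for c in push_changesets if (c, b) in matrix) >= threshold
}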

-- Conclusions --

See TL;DR up top.

Cheers,
kats

[1] http://oduinn.com/images/2013/blog_2013_02_pushes_per_hour.png
[2]
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=74354f979ea8&tochange=cad82c3b69bc

[3]
http://gregoryszorc.com/blog/2013/04/01/bulk-analysis-of-mozilla%27s-build-and-test-data/

[4] http://builddata.pub.build.mozilla.org/buildjson/
