How did we do in our pursuit of operational excellence last month? Read on to
find out!

Read on Phabricator at https://phabricator.wikimedia.org/phame/post/view/227

📈 Incidents

1 documented incident last month. That's the third month in a row that we
are at or near zero major incidents – not bad! [1] [2]

Learn about recent incidents at Incident status
<https://wikitech.wikimedia.org/wiki/Incident_status> on Wikitech, or
Preventive measures <https://phabricator.wikimedia.org/project/view/4758/>
in Phabricator.

💡 *Did you know: Our Incident status
<https://wikitech.wikimedia.org/wiki/Incident_status> page provides a
green-yellow status indicator covering the past ten days, with a link to the
most recent incident doc if there was one during that time.*

-------

📊 Trends

This January saw a small recovery in our otherwise upward (that is,
worsening) trend of unresolved reports. For the first time in twelve months,
more reports were closed than new reports survived the previous month
without resolution. What happened twelve months ago? That was January 2020,
which also saw a small recovery amid the upward trend before and after it.

Perhaps it's something about the post-December holidays that temporarily
improves the quality, and/or reduces the quantity, of code changes. Only
time will tell if this is the start of a new trend, or merely a
post-holiday dip. [3]

While our month-to-month trend might not (yet) be improving, we do see
persistent improvements in our overall backlog of pre-2019 reports. This is
in part because we generally don't file new reports there, so it makes sense
that it doesn't go up, but it's still good to see downward progress every
month, unlike with reports from more recent months which often see no
change month-to-month (see "Outstanding errors" below, for example).

This positive trend on our "Old" backlog started in October 2020 and has
consistently progressed every month since then (refer to the "Old" numbers
in red on the chart below, or the same column in the spreadsheet
<https://docs.google.com/spreadsheets/d/1tRCh8aB0UYyLlhftvcHvhWH4-e7cF5V01XvRObTVgUI/edit?usp=sharing>).
[3][4]
Figure 1, Figure 2: Unresolved error reports stacked by month.
<https://phabricator.wikimedia.org/phame/post/view/227/production_excellence_28_january_2021/#trends>

-------
📖 Outstanding errors

Summary over recent months:

   - ⚠️ July 2019 (2 of 18 issues left): *no change*.
   - ⚠️ August 2019 (1 of 14 issues): *no change*.
   - ✅ September 2019 (0 of 12 issues): Last two tasks were resolved (-2).
   - ⚠️ October 2019 (4 of 12 issues): One task resolved (-1).
   - ⚠️ November 2019 (1 of 5 issues): *no change*.
   - ⚠️ December 2019 (2 of 9 issues): Two tasks resolved (-2).
   - ⚠️ January 2020 (2 of 7 issues): *no change*.
   - ⚠️ February 2020 (1 of 7 issues left): One task resolved (-1).
   - March 2020 (2 of 2 issues left): *no change*.
   - April 2020 (9 of 14 issues left): *no change*.
   - May 2020 (6 of 14 issues left): One task resolved (-1).
   - June 2020 (7 of 14 issues left): *no change*.
   - July 2020 (9 of 24 new issues
   <https://phabricator.wikimedia.org/maniphest/query/s__D8Kd0xuQH/#R>): *no
   change*.
   - August 2020 (22 of 53 new issues
   <https://phabricator.wikimedia.org/maniphest/query/hu1yhWu4sXkP/#R>):
   One task resolved (-1).
   - September 2020 (13 of 33 new issues
   <https://phabricator.wikimedia.org/maniphest/query/CGFQViLShnOY/#R>):
   One task resolved (-1).
   - October 2020 (31 of 69 new issues
   <https://phabricator.wikimedia.org/maniphest/query/MYnnBybPTYpd/#R>):
   Four tasks fixed (-4).
   - November 2020 (14 of 38 new issues
   <https://phabricator.wikimedia.org/maniphest/query/CkC_VqQq5VC0/#R>): *no
   change*.
   - December 2020 (19 of 33 new issues
   <https://phabricator.wikimedia.org/maniphest/query/10NQy74iKaZJ/#R>):
   Three tasks resolved (-3).
   - *January 2021*: 7 of 50 new issues
   <https://phabricator.wikimedia.org/maniphest/query/WIP9W8q54HB6/#R>
   survived the month and remained unresolved (+50; -43).

Recent tally
160 issues open, as of Excellence #27
<https://phabricator.wikimedia.org/phame/post/view/219/production_excellence_27_december_2020/>
(4 Feb 2021).
-15 issues closed since, of the previous 160 open issues.
+7 new issues that survived January 2021.
152 issues open, as of today (16 Feb 2021).
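
The tally above is simple bookkeeping. As a minimal sketch in Python, it can be reproduced as follows; the function and variable names are hypothetical and not part of any Wikimedia tooling, and the figures are the ones quoted in this post:

```python
def tally(previously_open, closed_since, new_survivors):
    """Open issues now = previously open - closed since + new survivors."""
    return previously_open - closed_since + new_survivors


# Figures from the "Recent tally" above.
open_now = tally(previously_open=160, closed_since=15, new_survivors=7)
print(open_now)  # 152

# January 2021: 50 issues reported, 43 of them resolved within the month.
january_survivors = 50 - 43
print(january_survivors)  # 7
```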

January saw 50 new production errors reported in a single month, which is
an unfortunate all-time high. However, we also did remarkably well in
addressing 43 of them within the month, while the potential root cause and
diagnostics data were still fresh in our minds. Well done!

For the ongoing month of February, there have been 16 new issues
<https://phabricator.wikimedia.org/maniphest/query/xjFr73QLJYlE/#R>
reported so far.

Take a look at the workboard
<https://phabricator.wikimedia.org/tag/wikimedia-production-error/> and
look for tasks that could use your help!

-------

🎉 Thanks!

Thank you to everyone who helped by reporting, investigating, or
resolving problems in Wikimedia production.

Until next time,

– Timo Tijhof

-------

Footnotes:

[1] Incident status Wikitech
<https://wikitech.wikimedia.org/wiki/Incident_status>.
[2] Wikimedia incident stats by Krinkle, CodePen
<https://codepen.io/Krinkle/full/wbYMZK>.
[3] Month-over-month, Production Excellence spreadsheet
<https://docs.google.com/spreadsheets/d/1tRCh8aB0UYyLlhftvcHvhWH4-e7cF5V01XvRObTVgUI/edit>.
[4] Open tasks, Wikimedia-prod-error, Phabricator
<https://phabricator.wikimedia.org/maniphest/query/Fw3RdXt1Sdxp/#R>.