📘 Read on Phabricator at
https://phabricator.wikimedia.org/phame/post/view/180/
-------

How’d we do in our striving for operational excellence last month? Read on to
find out!

## 📊 Month in numbers

* 3 documented incidents. [1]
* 26 new Wikimedia-prod-error reports. [2]
* 26 Wikimedia-prod-error reports closed. [3]
* 198 currently open Wikimedia-prod-error reports in total. [4]

To read more about these incidents and pending actionables, check
<https://wikitech.wikimedia.org/wiki/Incident_documentation#2020>, or
explore the Wikimedia incident stats (interactive).

-------

## 📖 Paradoxical array key

During the upgrade from HHVM to PHP 7.2, Wikimedia encountered several Zend
engine bugs that could corrupt a PHP program at run-time. (Some of these
bugs are still being worked on.) One of the bugs we fixed last month was
particularly mysterious. The investigation was led by Antoine (Hashar) and
Tim Starling.

MediaWiki would create an array in PHP and add a key-value pair to it. We
could iterate this array, and see that our key was there. Moments later, if
we tried to retrieve the value from that same array, sometimes the key
would no longer exist!

After many ad-hoc debug logs, core dumps, and GDB sessions, the problem was
tracked down to the string interning system of Zend PHP. String interning
is a memory reduction technique. It means we only store one copy of a
character sequence in RAM, even if many parts of the code use the same
character sequence. For example, the words “user” and “edit” are frequently
used in the MediaWiki codebase. One of those sequences is the empty string
(“”), which is also used a lot in our code. This is the string we found
disappearing most often from our PHP arrays. This bug affected several
components, including Wikibase, the wikimedia/rdbms library, and
ResourceLoader.
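
To make the idea of interning concrete, here is a minimal sketch in Python rather than PHP. It uses Python's own `sys.intern`, which is only an analogy and not how the Zend engine implements interning. The helper names `build` and `intern_demo` are made up for this example.

```python
import sys


def build(parts):
    # Build the string at run time so it is a freshly allocated object
    # rather than a compile-time constant.
    return "".join(parts)


def intern_demo():
    a = build(["us", "er"])
    b = build(["us", "er"])

    # Both variables hold the same character sequence.
    same_content = a == b

    # After interning, both names refer to one shared copy in memory.
    shared_copy = sys.intern(a) is sys.intern(b)

    return same_content, shared_copy


print(intern_demo())
```

The point of the sketch is that, once interned, every user of the sequence holds a reference to the same copy, so nobody can safely release it on their own.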

Tim used a hardware watchpoint in GDB, and traced the root cause to the
Memcached client for PHP. The php-memcached client would “free” a string
directly from the internal memory manager after doing some work. It did
this even for “interned” strings that other parts of the program may still
be depending on.
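
The effect can be imitated with a toy model. The sketch below is an assumption-laden simulation in Python, not the Zend memory manager or the php-memcached code: `ToyHeap`, `InternPool`, `ToyArray` and `simulate_bad_free` are all hypothetical names, and "memory" is just a dictionary of numbered slots. It shows how a key can still be listed when iterating, yet no longer be found by lookup, once its shared string has been freed and the slot reused.

```python
class ToyHeap:
    """Simulated heap: slot id -> string contents (not real memory)."""

    def __init__(self):
        self.slots = {}
        self.free_list = []
        self.next_id = 0

    def alloc(self, text):
        # Reuse a freed slot if one is available, otherwise take a new one.
        if self.free_list:
            slot = self.free_list.pop()
        else:
            slot = self.next_id
            self.next_id += 1
        self.slots[slot] = text
        return slot

    def free(self, slot):
        # No ownership check: any caller can release any slot.
        self.free_list.append(slot)


class InternPool:
    """Keeps one slot per distinct string."""

    def __init__(self, heap):
        self.heap = heap
        self.by_text = {}

    def intern(self, text):
        if text not in self.by_text:
            self.by_text[text] = self.heap.alloc(text)
        return self.by_text[text]


class ToyArray:
    """Array whose keys are slot ids pointing at shared strings."""

    def __init__(self, heap):
        self.heap = heap
        self.entries = []  # list of (key_slot, value)

    def set(self, key_slot, value):
        self.entries.append((key_slot, value))

    def keys(self):
        return [slot for slot, _ in self.entries]

    def get(self, text):
        # Lookup compares against whatever the slot holds right now.
        for slot, value in self.entries:
            if self.heap.slots[slot] == text:
                return value
        return None


def simulate_bad_free():
    heap = ToyHeap()
    pool = InternPool(heap)
    arr = ToyArray(heap)

    empty = pool.intern("")
    arr.set(empty, "value for empty key")
    before = arr.get("")

    heap.free(empty)              # a client frees the shared string
    heap.alloc("unrelated data")  # the slot is reused for something else

    after = arr.get("")
    return before, len(arr.keys()), after


print(simulate_bad_free())
```

In this model the array still has one key when iterated, but looking up the empty string returns nothing, because the slot the key points at now holds other data.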

Effie and Giuseppe backported the upstream fix to our php-memcached package
and deployed it to production. Thanks! See
https://phabricator.wikimedia.org/T232613

-------

## 📉 Outstanding reports

Take a look at the workboard and look for tasks that might need your help.
The workboard lists error reports, grouped by the month in which they were
first observed.

→ https://phabricator.wikimedia.org/tag/wikimedia-production-error/

Breakdown of recent months (past two weeks not included):

* March: 3 of 10 reports left (unchanged). ⚠️
* April: Two reports closed, 4 of 14 left.
* May: (All clear!)
* June: Two reports closed, 4 of 11 left.
* July: Four reports closed, 8 of 18 left.
* August: 4 of 14 reports left (unchanged).
* September: One report closed, 8 of 12 left.
* October: 8 of 12 reports left (unchanged).
* November: 5 of 5 reports left (unchanged).
* December: Three reports closed, 6 of 9 left.
* January: 7 new reports survived the month of January.

There are a total of 57 reports filed in recent months that remain open.
This is down from 62 last month.

-------

## 🎉 Thanks!

Thank you to everyone who helped by reporting, investigating, or resolving
problems in Wikimedia production.

Until next time,

– Timo Tijhof

-------

Footnotes:

[1] Incidents. –
https://wikitech.wikimedia.org/wiki/Incident_documentation#2020

[2] Tasks created. –
https://phabricator.wikimedia.org/maniphest/query/qfCVpWqGX0tJ/#R

[3] Tasks closed. –
https://phabricator.wikimedia.org/maniphest/query/ndeCQjeJ6UNr/#R

[4] Open tasks. –
https://phabricator.wikimedia.org/maniphest/query/47MGY8BUDvRD/#R
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
