(added [email protected] to recipients list)

On Fri, Mar 7, 2014 at 5:27 PM, Bryan Davis <[email protected]> wrote:
> On Thursday every week a new WMF branch is cut to deploy the group0
> wikis (test* and wm.o). On the following Tuesday it is promoted to the
> group1 wikis (all wikis other than the wikipedias). Finally on
> Thursday it is promoted to group2 (wikipedias) while the group0 wikis
> start using another new version. At the current release cadence (one
> new branch a week) after 2 weeks in production a branch is no longer
> used. There can be minor exceptions to this due to major difficulties
> with a branch and/or holiday conflicts, but for the sake of this
> discussion those differences can be mostly ignored.
>
> A branch can't be deleted from the server cluster immediately after it
> is removed from the last wiki, however. For better or worse, each
> branch contains static assets from core (resources & skins) and
> extensions that are served by the apaches. These assets are served
> using versioned URLs such as
> https://bits.wikimedia.org/static-1.23wmf17/skins/common/images/poweredby_mediawiki_88x31.png.
> Varnish caches pages containing these URLs for anons for up to 30
> days. That means that static content contained in the 1.23wmf17 branch
> could be needed to satisfy an apache request for up to 30 days after
> that branch is no longer being used to satisfy PHP backed requests.
> Assuming the weekly release cadence, this means that the static assets
> from a branch are needed on the cluster for at least 45 days (14 days
> of active branch use + 31 days of cached page use).
>
> At the moment we don't have a well documented procedure for cleaning
> up old branches on tin and servers that rsync with tin (directly and
> indirectly). It seems to be a process that Sam does occasionally. The
> last commits that cleaned up old branches were merged on 2014-02-15:
> https://gerrit.wikimedia.org/r/#/c/113640/,
> https://gerrit.wikimedia.org/r/#/c/113641/.
> These commits cleaned up some truly ancient branches.
>
> A slightly different but related problem is the amount of disk space
> consumed by the l10n cache files for unused MW versions. The combined
> json and CDB files for the current 1.23 branches consume ~1.7G per
> version. It looks like Sam has been pruning these at some point as
> well, as the cache/l10n directories for version 1.23wmf12 and earlier
> are empty.
>
> I recommend that we add two new weekly cleanup steps:
>
> * When we deploy a new branch to group0 (Thursdays), all branches
> retired more than 5 weeks ago should be removed. This should really
> only include multiple branches the first time it's done to catch up.
> After that it will be an "add a branch, kill a branch" situation. With
> the current release cadence this will keep us at 7 checked out
> branches on tin, 2 versions in active use and 5 waiting for potential
> cache references to expire.
>
> * When we move group1 to the newest branch (Tuesdays), the cache/l10n
> directory of all non-active branches should be purged. By this point
> there is little chance that we will be reverting the wikipedias to the
> N-2 branch and thus the l10n cache is just taking up disk space and
> slowing down rsync comparisons.
>
> Are there any objections to adding these procedures to the MW deploy
> process?

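To make the retention arithmetic in the proposal concrete, here is a
rough sketch of the "retired more than 5 weeks ago" check for the
Thursday cleanup step. It is only an illustration: the function names
and the retirement dates are made up for the example, and nothing like
this exists in the deploy tooling as far as this thread says.

```python
from datetime import date, timedelta

ACTIVE_DAYS = 14   # a branch serves PHP backed requests for about 2 weeks
CACHE_DAYS = 31    # upper bound on how long cached anon pages may reference it
RETENTION_AFTER_RETIRE = timedelta(weeks=5)  # the proposed cleanup threshold


def min_days_on_cluster():
    """Minimum days a branch's static assets are needed on the cluster."""
    return ACTIVE_DAYS + CACHE_DAYS


def branches_to_remove(retired_on, today):
    """Return names of branches retired more than 5 weeks before `today`.

    `retired_on` maps a branch name to the date it was removed from the
    last wiki. Names are returned in plain string-sorted order.
    """
    return sorted(
        name
        for name, retired in retired_on.items()
        if today - retired > RETENTION_AFTER_RETIRE
    )


if __name__ == "__main__":
    # Hypothetical retirement dates, for illustration only.
    retired = {
        "1.23wmf10": date(2014, 1, 23),
        "1.23wmf12": date(2014, 2, 6),
        "1.23wmf15": date(2014, 2, 27),
    }
    print(min_days_on_cluster())
    print(branches_to_remove(retired, today=date(2014, 3, 13)))
```

Five weeks (35 days) is a little more than the 31 day cache window, so a
branch that passes this check should no longer be referenced by any
cached page. A branch retired exactly 35 days ago is kept until the next
weekly run, because the comparison is strictly "more than".
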
Minor content correction: mentions of "30 days" should really have been
"31 days". Apparently I changed it in some places before I hit send, but
I didn't get them all. The 31 day upper limit comes from the
$wgSquidMaxage setting in InitialiseSettings.php [0].

[0]: https://git.wikimedia.org/blob/operations%2Fmediawiki-config/87e36518db5644f15748fbfc36c4d1bf3b2f65e8/wmf-config%2FInitialiseSettings.php#L10276

Bryan
-- 
Bryan Davis              Wikimedia Foundation    <[email protected]>
[[m:User:BDavis_(WMF)]]  Sr Software Engineer    Boise, ID USA
irc: bd808               v:415.839.6885 x6855
