[Wikitech-l] Migrating to Scap from Trebuchet, Timelines and Such

2016-04-05 Thread Tyler Cipriani
The Release Engineering team's goal for the April-June 2016 quarter is
to move everything that is currently deployed with Trebuchet over to
Scap. With the release of Scap 3.1.0, everything that is deployed via
Trebuchet can be ported to Scap—it supports git-fat, restarting
services, and there is even a puppet provider.

== What is the timeline? ==

We have made tasks for all of the existing projects that are deployed
via Trebuchet[0], and the goal is to move these all by **2016-06-30**
(AKA the End of The Quarter™).

If we missed your project that is deployed via Trebuchet, please add a
task with the #scap3 tag in phabricator.

== What is Scap? Why are we moving to it? ==

Scap is a tool that the Release Engineering team has been working on
as a successor to the salt-based Trebuchet deployment system.

* Stable and secure SSH-based command and control
* Detailed error logs available from every target node
* Built on tools that everyone is familiar with—SSH and Git

== What does it mean to move to Scap? ==

To assist with the move from Trebuchet to Scap:
* We have written documentation, including a quick start setup guide [1]
* Release Engineering folks are available to help with migrations,
questions, concerns in the #scap3 channel on freenode

== Why isn't Release Engineering just porting everything all by their lonesome? ==

We're hoping that by the end of the quarter, not only will we get all
projects migrated, but we also want folks to be familiar with Scap and
know how to troubleshoot their own deployments. In the end, the goal
is to scale knowledge of how to use Wikimedia's deployment tooling to
the wider organization instead of a handful of people. "Teach a person
to fish"[2] and all...

<3,
Tyler Cipriani
WMF Release Engineering

[0] https://phabricator.wikimedia.org/project/view/1824/
[1] https://doc.wikimedia.org/mw-tools-scap/scap3/quickstart/setup.html
[2] 
https://en.wiktionary.org/wiki/give_a_man_a_fish_and_you_feed_him_for_a_day;_teach_a_man_to_fish_and_you_feed_him_for_a_lifetime

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] [Breaking Change] Scap change for deployers

2016-05-10 Thread Tyler Cipriani
tl;dr: Our beloved scap is changing to use subcommands rather than a
bunch of scripts, but the existing scripts will work for a short time.

Starting with the 3.2.0 release[0], which will hit production in the
next day or so, scap will use subcommands rather than using many
different scripts that all call the same underlying code. The scripts
(e.g., deploy, sync-file, sync-dir, sync-wikiversions) will continue
to work as usual, but they will issue a deprecation warning until the
next release when they will disappear.

The most notable exception is the `scap` command which must be invoked
as `scap sync [message]`.

The docs are updated[1] and you can see new help output there or on
phabricator[2].

Long story short, you will now run:

scap sync-file <file> [message]

Instead of:

sync-file <file> [message]

This change has been cherry-picked on beta cluster and is currently live there.
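
For anyone updating wrapper scripts, the renaming described above can be sketched as a small lookup table. This is an illustrative Python sketch, not part of scap: the names `OLD_TO_NEW` and `to_subcommand` are hypothetical, and the table lists only commands mentioned in these announcements (the `mwversionsinuse` / `wikiversions-inuse` pair comes from a later reply in this archive).

```python
# Hypothetical helper: translate an old-style scap script invocation into the
# subcommand form. Only commands named in these announcements are listed.
OLD_TO_NEW = {
    "deploy": ["scap", "deploy"],
    "sync-file": ["scap", "sync-file"],
    "sync-dir": ["scap", "sync-dir"],
    "sync-wikiversions": ["scap", "sync-wikiversions"],
    "mwversionsinuse": ["scap", "wikiversions-inuse"],
    # The bare `scap` command is the exception: it becomes `scap sync`.
    "scap": ["scap", "sync"],
}

KNOWN_SUBCOMMANDS = {new[1] for new in OLD_TO_NEW.values()}


def to_subcommand(argv):
    """Return the new-style argument list for an old-style command line."""
    if not argv:
        raise ValueError("empty command line")
    command, args = argv[0], list(argv[1:])
    if command == "scap" and args and args[0] in KNOWN_SUBCOMMANDS:
        return [command] + args  # already in subcommand form
    if command not in OLD_TO_NEW:
        raise KeyError(f"unknown scap script: {command}")
    return OLD_TO_NEW[command] + args
```

For example, `to_subcommand(["sync-file", "README", "msg"])` yields `["scap", "sync-file", "README", "msg"]`.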

<3,
Tyler Cipriani and the Deployment Working Group

[0]. https://gerrit.wikimedia.org/r/#/c/287918
[1]. https://doc.wikimedia.org/mw-tools-scap/
[2]. https://phabricator.wikimedia.org/P3027


[Wikitech-l] [Breaking Change] Scap change for deployers now live!

2016-05-11 Thread Tyler Cipriani
tl;dr starting today:

Use:
  scap sync 'message for posterity'
instead of:
  scap 'message for posterity'

Scap 3.2.0-1 is now alive and well in production which means scap
subcommands are live.

All subcommands are documented[0]. Additional documentation can be
seen by running `scap --help` (or `scap [subcommand] --help`). If you
have any questions feel free to ask them on-list or in IRC on #scap3
or #wikimedia-releng.

Thanks!
Tyler Cipriani and the Deployment Working Group

[0]. MediaWiki: https://doc.wikimedia.org/mw-tools-scap/scap2/commands.html
     Scap3: https://doc.wikimedia.org/mw-tools-scap/scap3/deploy_commands.html


Re: [Wikitech-l] [Breaking Change] Scap change for deployers

2016-05-14 Thread Tyler Cipriani

On 16-05-13 20:32:37, Legoktm wrote:

Would it be possible to have tab completion for the new scap
subcommands? "mwv" → mwversionsinuse versus typing out all of "scap
wikiversions-inuse" ;) Same with "sync-f", etc.


Hadn't thought about command expansion. Bash autocompletion is now
tracked in a ticket in Phabricator[0] and is definitely needed.

The plan is to get rid of the old commands like `mwversionsinuse` in
future in favor of the `scap [command]` format. Any thoughts on how to
make this transition easier for deployers are welcome—the best place for
the discussion is probably the phabricator ticket mentioned above
(although the ticket may need to be retitled as it evolves :)).

Thanks!
Tyler

[0]. https://phabricator.wikimedia.org/T135317


[Wikitech-l] [Breaking Change] Scap3 stage changes will break custom checks

2016-05-18 Thread Tyler Cipriani
This is a change that affects services that have moved to deployment
via Scap3 (not MediaWiki deployments).

The 3.2.0-1 release that is currently live makes an important change
to the stages in which custom checks may be run. There is now a new
stage called `restart_service` that occurs after the `promote` stage.
The `promote` stage no longer does a service restart. This change is
outlined in the Scap3 docs[0].

This change likely means that you need to move any custom checks (in
scap/checks.yaml) that were intended to run post-service restart to
use the stage `restart_service` rather than `promote`.

For example this check, which depends on a service restart to work correctly:

  checks:
    service_responds:
      type: command
      stage: promote
      command: curl -Ss localhost:1234

Should now be written as:

  checks:
    service_responds:
      type: command
      stage: restart_service
      command: curl -Ss localhost:1234
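
The same migration can be sketched programmatically. The Python below is a minimal illustration that operates on an already-parsed checks.yaml mapping (a plain dict); the function name `move_checks_to_restart_stage` is hypothetical, and the caller must name which checks actually depend on a restart, since not every `promote` check should move.

```python
import copy


def move_checks_to_restart_stage(config, post_restart_checks):
    """Return a copy of a parsed checks.yaml mapping in which the named
    checks are moved from the `promote` stage to `restart_service`."""
    updated = copy.deepcopy(config)
    for name in post_restart_checks:
        check = updated.get("checks", {}).get(name)
        # Only touch checks that are still attached to `promote`.
        if check is not None and check.get("stage") == "promote":
            check["stage"] = "restart_service"
    return updated


config = {
    "checks": {
        "service_responds": {
            "type": "command",
            "stage": "promote",
            "command": "curl -Ss localhost:1234",
        }
    }
}
migrated = move_checks_to_restart_stage(config, ["service_responds"])
```

The input mapping is left unmodified, so the old and new configurations can be compared before anything is written back.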

Sorry for any inconvenience. For future releases, changelog highlights
will be sent to the list prior to release.

-- Tyler

[0]. 
https://doc.wikimedia.org/mw-tools-scap/scap3/quickstart/setup.html#service-restarts-and-checks


[Wikitech-l] Canary Deploys for MediaWiki

2016-07-25 Thread Tyler Cipriani

tl;dr: Scap will deploy to canary servers and check for error-log spikes in the 
next version (to be released Soon™).

In light of recent incidents[0] which have created outages accompanied by 
large, easily detectable, error-rate spikes, a patch has recently landed in 
Scap[1] that will:

   1. Push changes to a set of canary servers[2] before syncing to proxy servers
   2. Wait a configurable length of time (currently 20 seconds[3]) for any 
errors to have time to make themselves known
   3. Query Logstash (using a script written by Gabriel Wicke[4]) to determine 
if the error rate has increased over a configurable threshold (currently 
10-fold[5])
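
The three steps above can be sketched roughly as follows. This is a simplified Python illustration, not scap's implementation: `sync` and `error_rate` are hypothetical callables supplied by the caller (the real check queries Logstash), and the handling of a zero baseline is an assumption made so the comparison is well defined.

```python
import time


def canary_deploy(sync, error_rate, canaries, others, wait_seconds=20,
                  threshold=10.0, force=False, sleep=time.sleep):
    """Sync to canaries first; continue only if errors did not spike.

    `sync(hosts)` pushes the change and `error_rate(hosts)` returns the
    current error rate; both are hypothetical stand-ins.
    """
    baseline = error_rate(canaries)
    sync(canaries)
    if not force:
        sleep(wait_seconds)  # give errors time to make themselves known
        current = error_rate(canaries)
        # Assumption: a baseline below 1 is treated as 1.
        if current > threshold * max(baseline, 1.0):
            return False  # spike detected: leave the rest of the fleet alone
    sync(others)
    return True
```

Passing `force=True` skips the wait and the comparison entirely, which mirrors the `--force` escape hatch.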

Big thanks to the folks that helped in this effort: Gabriel Wicke, Filippo 
Giunchedi and Giuseppe Lavagetto, Bryan Davis and Erik Bernhardson (for their 
mad Logstash skillz)!

Note that when expedience is required (we're in the middle of an outage and 
who cares what Logstash has to say), the `--force` flag can be added to skip 
canary checks altogether (e.g., `scap sync-file --force 
wmf-config/InitialiseSettings 'Panic!!'`).

The RelEng team's eventual goal is still to move MediaWiki deployments to the 
more robust and resilient Scap3 deployment framework. There is some 
high-priority work that has to happen before the Scap3 move. In the interim, we 
are taking steps (like this one) to respond to incidents and keep deployments 
safe.

Hopefully, this work and the error-rate alert work from Ori last week[6] will 
allow everyone to be more conscientious and more keenly aware of deployments 
that cause large aberrations in the rate of errors.

<3,
Your Friendly Neighborhood Release Engineering Team

[0]. 
https://wikitech.wikimedia.org/wiki/Incident_documentation/20160601-MediaWiki 
is the most recent example I could find, but there have been others.
[1]. https://phabricator.wikimedia.org/D248
[2]. https://gerrit.wikimedia.org/r/#/c/294742/
[3]. https://github.com/wikimedia/scap/blob/master/scap/config.py#L19
[4]. https://gerrit.wikimedia.org/r/#/c/292505/
[5]. https://github.com/wikimedia/scap/blob/master/scap/config.py#L18
[6]. https://gerrit.wikimedia.org/r/#/c/300327/


[Wikitech-l] New scap version is live

2016-08-04 Thread Tyler Cipriani

tl;dr: The new scap version is live in production. It has canary
deploys.

Scap v.3.2.2-1 was deployed to production today. There are some new
internal improvements as well as some that are user-facing.

The improvements you'll probably notice are:

   * Tab completion works for scap subcommands(!)
   * Canary checks for MediaWiki deployments

Canary deployments:

   1. Sync your change(s) to the api and appserver canary hosts
   2. Wait (20 seconds) for traffic to hit those hosts
   3. If there isn't a large increase in the error rate on those hosts
      (10x), release your changes to the remainder of the fleet.

If, for whatever reason, you find yourself in a position where you don't
care about the error-rate change on canary nodes, use the --force flag,
i.e.:

   scap sync-file --force README 'Important README update'

Please report any problems in a phab ticket tagged with "Scap3" or in
#wikimedia-releng in IRC on freenode.

<3,
Your Hometown Release Engineers


Re: [Wikitech-l] New scap version is live

2016-08-08 Thread Tyler Cipriani

On 16-08-07 13:50:33, Legoktm wrote:

Does scap/whatever make any requests against those hosts? Or is it just
depending upon normal traffic to those hosts to possibly cause errors?


The script that scap is using to query logstash is
logstash_checker.py[0]. There are no requests being generated as part of
the deployment process, scap relies wholly on normal traffic to spot
errors.

There was some discussion on a couple phabricator tickets[1][2] about a
pre-canary check step that would still be nice to implement.

While the canary check script was a good step, I still feel that a
pre-canary deploy sanity check that consists of requests to known
end-points on unpooled servers would be a boon to the prevention of
catastrophic deploys.

-- Tyler

[0]. 

[1]. 
[2]. 


[Wikitech-l] Wikis paused at 1.28.0-wmf.18 (was: Re: Upgrade of 1.28.0-wmf.19 to group 1 is on hold)

2016-09-20 Thread Tyler Cipriani

tl;dr: All wikis are staying at 1.28.0-wmf.18 for now

Last week all wikis were rolled back to MediaWiki version 1.28.0-wmf.18
due to several problems that were spotted on Friday (2016-09-16)[0][1].

The problems with wmf.19 seemed resolved by Monday (2016-09-19). The
plan was to roll wmf.19 out to all wikis yesterday afternoon and continue
with the wmf.20 branch-cut today as scheduled.

Late yesterday a large performance regression was discovered in
wmf.18[2].

We've paused the rollout of wmf.19 and the branching of wmf.20 to allow
time to investigate the nature of this performance regression.

After there is a better understanding of the performance regression, we
will reevaluate our plans for branching and rollout of wmf.19 and wmf.20.

-- Tyler

[0]. https://phabricator.wikimedia.org/T111441
[1]. https://phabricator.wikimedia.org/T145819
[2]. https://phabricator.wikimedia.org/T146099


Re: [Wikitech-l] Wikis paused at 1.28.0-wmf.18 (was: Re: Upgrade of 1.28.0-wmf.19 to group 1 is on hold)

2016-09-20 Thread Tyler Cipriani

I have cut 1.28.0-wmf.20[0] for MediaWiki and extensions as the branch
cut was blocking merges to master for developers.

The state of deployed code has not changed – all wikis are running the
1.28.0-wmf.18 branch of MediaWiki and extensions.

Plans for moving forward are still being discussed on Phabricator[1].

-- Tyler

[0]. 

[1]. 


[Wikitech-l] Group0 on wmf.20, all to wmf.20 tomorrow (was: Re: Upgrade of 1.28.0-wmf.19 to group 1 is on hold)

2016-09-21 Thread Tyler Cipriani

Group0 wikis (mediawikiwiki, test2wiki, testwiki, testwikidatawiki, and
zerowiki) are running version 1.28.0-wmf.20 of MediaWiki and extensions.
All other wikis are running 1.28.0-wmf.18.

Tomorrow there will be a shortened train schedule in the normal train
deployment window during which 1.28.0-wmf.20 will be pushed to all
wikis.

Any blockers to this plan are tracked on Phabricator[0].

Thank you for all your help and patience while we get the train
schedule[1] back on track!

-- Tyler

[0]. 
[1]. 


[Wikitech-l] CI maintenance on Thursday 3rd Nov 16:00 UTC

2016-10-27 Thread Tyler Cipriani

The Wikimedia continuous integration system will be unavailable while some
scheduled maintenance is done.

When: Thursday 3rd November 2016, for two hours from 16:00 UTC to 18:00 UTC.

Impact: During that time, you will still be able to send patches to Gerrit but
no CI jobs will be run nor will patches be automatically merged when someone
votes "Code-Review +2". All patches sent during the operations will be sent to
the CI system for you as a convenience.

Why: The maintenance will move the core of the CI system (Jenkins and Zuul)
from an aged server to a fresh new machine.

More info: It will be done by Antoine Musso, Tyler Cipriani and Daniel Zahn.
You will be able to watch progress on IRC in the #wikimedia-operations channel.
See also: https://phabricator.wikimedia.org/T95757


[Wikitech-l] MediaWiki and extensions .gitreview now uses track=1

2016-10-31 Thread Tyler Cipriani

Hi all!

Last week .gitreview for MediaWiki branches and extensions switched from
targeting a specific branch to using track=1[0].

This is a change that, going forward, should make it easier to do weekly
branching and releases without being too disruptive for developer
workflows.

The git-review version that allows for this change is 1.25.0. I have
updated the docs to reflect the use of this version[1].

It is also important to note that to use git-review with track=1 you
must be on a local branch that tracks an upstream – detached-head states
and branches that do not track upstream will cause strange errors.
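
To check a checkout before running git-review, something like the following Python sketch could be used. The example .gitreview contents are illustrative values rather than a copy of any particular repository's file, the helper names are hypothetical, and reading `track` with `configparser`'s boolean parsing is an assumption about which values git-review accepts.

```python
import configparser
import subprocess

# Illustrative .gitreview contents; host and project are example values.
EXAMPLE_GITREVIEW = """\
[gerrit]
host=gerrit.wikimedia.org
port=29418
project=mediawiki/core.git
track=1
"""

# Prints the upstream of the current branch; exits non-zero when HEAD is
# detached or the branch has no upstream configured.
UPSTREAM_CHECK = ["git", "rev-parse", "--abbrev-ref",
                  "--symbolic-full-name", "@{u}"]


def uses_tracking(gitreview_text):
    """Return True when the [gerrit] section enables `track`."""
    parser = configparser.ConfigParser()
    parser.read_string(gitreview_text)
    return parser.getboolean("gerrit", "track", fallback=False)


def current_branch_has_upstream(repo_dir="."):
    """Return True when the checked-out branch tracks an upstream branch."""
    result = subprocess.run(UPSTREAM_CHECK, cwd=repo_dir,
                            capture_output=True, text=True)
    return result.returncode == 0
```

If `uses_tracking` is true and `current_branch_has_upstream` is false, setting an upstream (for example with `git branch --set-upstream-to`) before running git-review should avoid the strange errors mentioned above.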

-- Tyler

[0]. 
[1]. 


Re: [Wikitech-l] 1.28 branching release dates and such :)

2016-11-01 Thread Tyler Cipriani

On 16-10-24 21:55:04, Chad wrote:

Tyler Cipriani's assisting me with this release, so expect to see some RCs
with his name
(and signatures) on them :)


As noted in Chad's email, I will be creating 1.28.0-rc.0 tomorrow.

I've gotten all the backports from 1.28.0-wmf.23 into the REL1_28
branch. If you have additional backports for the release, please let me
know so we can get them squared-away for rc.0.

Thanks!

-- Tyler



[Wikitech-l] First release candidate for 1.28 (1.28.0-rc.0)

2016-11-02 Thread Tyler Cipriani

Hi all!

I am pleased to announce that the first release candidate for MediaWiki 1.28 is
now available.

Full release notes:
* 
https://phabricator.wikimedia.org/diffusion/MW/browse/REL1_28/RELEASE-NOTES-1.28
* https://www.mediawiki.org/wiki/Release_notes/1.28

Known issues and final release blockers can be found in Phabricator:
https://phabricator.wikimedia.org/project/board/1982/

-- Tyler

**
Download:
https://releases.wikimedia.org/mediawiki/1.28/mediawiki-1.28.0-rc.0.tar.gz

GPG signatures:
https://releases.wikimedia.org/mediawiki/1.28/mediawiki-core-1.28.0-rc.0.tar.gz.sig
https://releases.wikimedia.org/mediawiki/1.28/mediawiki-1.28.0-rc.0.tar.gz.sig

Public keys:
https://www.mediawiki.org/keys/keys.html



[Wikitech-l] Scap 3.4.0-1 is live

2016-11-28 Thread Tyler Cipriani

Hi all,

A new version of Scap has been released and with it come a few changes.

tl;dr highlights:

* Old scap bin stubs (e.g., /usr/bin/sync-file, /usr/bin/sync-dir,
 /usr/bin/mwversionsinuse, etc) will now exit 1.

 Subcommands are now the only way to interact with scap, i.e., `sync-file` is
 now `scap sync-file`.

* Scap3 (non-mediawiki) deploys will now announce deploys in IRC -- you
 can specify a message for IRC via:

   scap deploy 'A message for the SAL'

* Scap lockfile errors now show you (a) who has the lockfile and (b) their
 deploy message. The output you'll see if another person is deploying
 looks like:

   sync-file failed:  Failed to acquire lock "/var/lock/scap"; owner is "thcipriani"; reason is "scap 3.4 sync file"

You can see a full changelog for both the 3.4.0-1 and the 3.3.1-1 release on
phabricator[0]. If you spot any issues, file a phabricator task tagged with the
`Scap3` project[1].
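
As an illustration of the lockfile behavior in the highlights above, a lock that records its owner and reason might look like the Python sketch below. This is not scap's implementation: storing the metadata as JSON in the lock file, and the names `acquire_lock`, `release_lock`, and `LockFailedError`, are all assumptions made for the example.

```python
import json
import os


class LockFailedError(Exception):
    """Raised when another deployer already holds the lock."""


def acquire_lock(path, owner, reason):
    """Create `path` exclusively, recording who holds the lock and why."""
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        # Someone else holds the lock: report their name and message.
        # (A sketch: a reader racing the writer could see an empty file.)
        with open(path) as f:
            holder = json.load(f)
        raise LockFailedError(
            'Failed to acquire lock "{}"; owner is "{}"; reason is "{}"'.format(
                path, holder["owner"], holder["reason"]))
    with os.fdopen(fd, "w") as f:
        json.dump({"owner": owner, "reason": reason}, f)


def release_lock(path):
    """Remove the lock file so the next deployer can proceed."""
    os.remove(path)
```

The point of storing the reason alongside the owner is that a blocked deployer can see immediately what the other deployment is, without having to ask in IRC.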

-- Your Humble Scap Toilers

[0]. 
[1]. 



[Wikitech-l] Scap 3.5.1-1 is live

2017-01-30 Thread Tyler Cipriani

Hi all,

Scap version 3.5.1-1 has been released and with it come a few changes.

tl;dr highlights:

== MediaWiki Deploys ==

* Subcommands are the only way to scap (e.g., scap sync-file vs. sync-file). Old
 stub entry points for scap (e.g., sync-file, sync-dir, mwversionsinuse, etc)
 are gone. Previously, the old binstubs simply exited with a non-zero exit code.

* MediaWiki canary deploys now check for both hhvm and mediawiki errors

* scap sync-file and scap sync-dir are now the same command internally.  scap
 sync-dir is now deprecated.

== Scap3/Service Deploys ==

* Scap's rollback behavior has been greatly improved. Scap supports a global
 `failure_limit` and a per-group `failure_limit` -- if a deployment exceeds the
 number or percentage of failures specified by this limit, the deploy will fail and
 you will be prompted to roll back. Also, if you opt to *not* continue a
 deployment on remaining deploy groups, you will receive the option to roll back.
 (Fixes T149008)

* Scap3: This scap release has some rollback logic fixes. First, if there is
 initial ssh failure for a host, scap will no longer attempt a rollback on that
 host (since the same ssh failure will likely cause a rollback failure). Next,
 all previously deployed groups of servers will now be rolled-back -- not just
 the group of servers that had failures. (Fixes T150267, T145460)
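
The `failure_limit` check described above can be sketched as follows. This Python is illustrative only: the function names are hypothetical, the accepted formats (an integer count or a string such as "10%") are an assumption about how a number or percentage might be written, and letting a per-group value take precedence over the global one is likewise assumed.

```python
def exceeds_failure_limit(failed, total, failure_limit):
    """Return True when `failed` hosts out of `total` exceed the limit.

    `failure_limit` is either an absolute count (int) or a percentage
    string such as "10%". Both formats are assumptions for this sketch.
    """
    if isinstance(failure_limit, str) and failure_limit.endswith("%"):
        allowed = total * float(failure_limit[:-1]) / 100.0
    else:
        allowed = int(failure_limit)
    return failed > allowed


def limit_for_group(group, global_limit, group_limits):
    """Assumed precedence: a per-group limit overrides the global one."""
    return group_limits.get(group, global_limit)
```

With a limit of "10%" on a group of 10 hosts, one failure is tolerated and a second one would trigger the rollback prompt.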

You can see the full changelog in the repo[0].

<3 -- The Scap Folks

[0]. 




Re: [Wikitech-l] [Engineering] Announcement: Tyler Cipriani joins Wikimedia as Release Engineer

2015-02-09 Thread Tyler Cipriani
Thanks everybody for the warm welcome!

After a whirlwind first day, it's time to retreat to Chipotle and Seinfeld
re-runs to help digest this massive information intake.

Thanks again!
Tyler

[Wikitech-l] This week in logspam

2017-02-16 Thread Tyler Cipriani

Hi all!

Logspam makes it difficult to glance at error logs after a deployment and
reason about a deployment's impact [0]. Release Engineering is making a
conscious effort (in Scrum of Scrums, in Phabricator, and on mailing lists) to
connect logspam tasks with folks that can make an impact on these tasks (and,
consequently, make an impact on deployments). Sometimes, as now, our effort
takes the form of a broad appeal to help investigate high impact logspam tasks.

This week in logspam:

* wfShellExec errors end up in HHVM log [1]
 These errors are the noisiest of the noisy recently. They take various forms,
 but many relate to PDF handling and start with "SyntaxError"
* Warning: Cannot modify header information - headers already sent [2]
 The latest iteration of this error seems to have started with the
 release of 1.29.0-wmf.10
* Warning: timed out after 0.2 seconds when connecting to rdb1001.eqiad.wmnet 
[110]: Connection timed out [3]
 saw a bit of movement last week, but there are some unanswered questions and
 the message is still going strong.
* Couple of session related ones
** Session "{session}": Unverified user provided and no metadata to auth it [4]
** Session "{session}": Metadata merge failed: {exception} [5]
* Throttler: throttler data not found for {user} [6]

If you or anyone you know has information that can lead to the cessation of
these errors, please add that person or a comment on the tasks listed here.

Thus concludes another exciting week in logspam.

<3
-- RelEngers

[0] 
[1] 
[2] 
[3] 
[4] 
[5] 
[6] 


[Wikitech-l] [scap] Scap3-deployed repo owners

2017-04-13 Thread Tyler Cipriani

Hiya!

tl;dr: if your repo has a patch from me[0], please merge it :)

The longer explanation for these patches is that the deployment server 
from which your code is fetched by targets is set via the git_server 
configuration variable. This variable will be updated in Puppet when the 
primary deployment server changes; however, updating it in every repo 
would be time consuming. Yesterday, I made a bunch of patches to remove 
this configuration variable from any repo where it is set. By removing 
this configuration variable from individual repos, all repos will 
respect the global value for git_server that is set in Puppet meaning 
that repo owners shouldn't have to worry about making updates when a 
deployment server is changed.
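
The effect of removing the per-repo setting can be sketched as a simple override rule. The Python below is a minimal illustration, not scap's configuration loader: the host names and the `some_other_key` entry are placeholders, and the function name `effective_config` is hypothetical.

```python
def effective_config(global_config, repo_config):
    """Repo-level keys override global ones; absent keys fall through."""
    merged = dict(global_config)
    merged.update(repo_config)
    return merged


# Placeholder values for illustration only.
global_config = {"git_server": "deploy-new.example"}
repo_with_override = {"git_server": "deploy-old.example", "some_other_key": 1}
repo_without_override = {"some_other_key": 1}

stale = effective_config(global_config, repo_with_override)
current = effective_config(global_config, repo_without_override)
```

Here `stale["git_server"]` still points at the old server, while `current["git_server"]` follows the global value, which is why dropping the variable from each repo means future server changes need no per-repo patch.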


If you have any questions let me know via email or IRC in 
#wikimedia-releng.


Thank you for your help!

-- Tyler

[0].  



[Wikitech-l] 1.30.0-wmf.9 train halted due to new log messages

2017-07-13 Thread Tyler Cipriani

Hello all,

There are a few new log messages that have crept their way into the 
1.30.0-wmf.9 train release currently winding its way down the track - on 
group0 and group1 wikis[0]. I've halted the train for the time being due 
to these new messages[1].


1. T170599[2] - Wikibase: $idSerialization must match /^Q[1-9]\d{0,9}\z/i
2. T170596[3] - Could not acquire lock 'LinksUpdate:job:pageid:xxx'
3. T170597[4] - Wikidata/extensions/Constraints: 
InvalidArgumentException:$itemId must be either ItemId or string


RelEng makes an attempt during train to keep the introduction of new log 
messages per-release to a minimum. Logspam masks real problems and can 
make the job of deployers and developers unpleasant.


Any help or guidance on any of these three tasks would be very much 
appreciated!


Thanks!

-- Tyler

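[0]. https://tools.wmflabs.org/versions/
[1]. https://wikitech.wikimedia.org/wiki/Deployments/Holding_the_train#Logspam
[2]. https://phabricator.wikimedia.org/T170599
[3]. https://phabricator.wikimedia.org/T170596
[4]. https://phabricator.wikimedia.org/T170597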


Re: [Wikitech-l] 1.30.0-wmf.9 train halted due to new log messages

2017-07-13 Thread Tyler Cipriani
The previous blockers have been resolved. Thanks to everyone for their 
quick work and replies.


However, when I rolled forward 1.30.0-wmf.9 to the wikipedia wikis I hit 
a new problem:


T170648[1] - Timeout reached waiting for an available pooled curl 
connection!


Any input on that task would be appreciated. We will get through this 
train. Together.


Thanks!

-- Tyler

[1] <https://phabricator.wikimedia.org/T170648>


On 17-07-13 12:32:31, Tyler Cipriani wrote:

Hello all,

There are a few new log messages that have crept their way into the 
1.30.0-wmf.9 train release currently winding its way down the track - 
on group0 and group1 wikis[0]. I've halted the train for the time 
being due to these new messages[1].


1. T170599[2] - Wikibase: $idSerialization must match /^Q[1-9]\d{0,9}\z/i
2. T170596[3] - Could not acquire lock 'LinksUpdate:job:pageid:xxx'
3. T170597[4] - Wikidata/extensions/Constraints: 
InvalidArgumentException:$itemId must be either ItemId or string


RelEng makes an attempt during train to keep the introduction of new 
log messages per-release to a minimum. Logspam masks real problems and 
can make the job of deployers and developers unpleasant.


Any help or guidance on any of these three tasks would be very much 
appreciated!


Thanks!

-- Tyler

[0]. <https://tools.wmflabs.org/versions/>
[1].  
<https://wikitech.wikimedia.org/wiki/Deployments/Holding_the_train#Logspam>
[2]. <https://phabricator.wikimedia.org/T170599>
[3]. <https://phabricator.wikimedia.org/T170596>
[4]. <https://phabricator.wikimedia.org/T170597>



[Wikitech-l] MediaWiki and extensions 1.30.0-wmf.14 group1 deployment blocked

2017-08-16 Thread Tyler Cipriani
The deployment of MediaWiki and extensions version 1.30.0-wmf.14 is 
blocked as a message in the error log gradually worsened following the 
rollout of 1.30.0-wmf.14 to group1 wikis. The error:


   Cannot flush pre-lock snapshot because writes are pending

is detailed on phabricator[0].

As of right now, group0 wikis are on php-1.30.0-wmf.14, group1 wikis are 
on php-1.30.0-wmf.13 (excluding wikidatawiki which is on 
php-1.30.0-wmf.11), and the remaining wikis are on php-1.30.0-wmf.13.


The types of issues that will halt the train and the process and 
procedures for when the train is halted are detailed on Wikitech[1].  
The current version deployed per wiki is available on the wikiversions 
toollabs page[2].


-- Tyler

[0]. 
[1]. 
[2]. 


[Wikitech-l] 1.31.0-wmf.23 MediaWiki train rollback

2018-02-28 Thread Tyler Cipriani

Hi all!

Wednesday Train update: we're currently running 1.31.0-wmf.23 on group0 
only, so we're a day behind schedule.


I decided to roll back based on a particularly noisy notice[0]. The right 
folks are already aware and working on it (thanks all!).


Reminder that we have a task for each train rollout to track (potential) 
blockers[1]. You can follow that task to track future progress.


Additional reminder that you can use the "Wikimedia MediaWiki versions"[2] 
tool on Toolforge to know which wikis have which version at any time.


Choo Choo,
-- Tyler

[0]. 
[1]. 
[2]. 


Re: [Wikitech-l] [Ops] Changes to SWAT deployment policies, effective Monday April 30th

2018-04-27 Thread Tyler Cipriani

On 18-04-27 10:49:28, Stas Malyshev wrote:

Hi!


First, we now disallow multi-sync patch deployments. See T187761[0].
This means that the sync order of files is determined by git commit
parent relationships (or Gerrit's "depends-on"). This is to prevent SWAT
deployers from accidentally syncing two patches in the wrong order.


Question about this: if there's a patch that requires files to land in
specific order, e.g. one that part of the config is moved into another
file (example: https://gerrit.wikimedia.org/r/c/419367) is this handled
automatically by scap (i.e. all changes in the same patch land at the
same time atomically and scap takes care of nothing ever seeing the
intermediate states) or has to be managed manually, and if so, how?


Scap doesn't currently handle this since for MediaWiki deploys it's 
still using rsync at a basic level.


Currently, syncing changes in a way that avoids bad intermediate states 
is handled by the SWAT deployer and is determined on the fly at the time 
of deployment.


That is, the deployer figures out how to sync stuff on the fly. And 
deployers are pretty good at it, mostly. If I were deploying that change 
today I'd split it up and sync one at a time:


   - wmf-config/WikibaseSearchSettings.php
   - wmf-config/InitialiseSettings.php
   - wmf-config/Wikibase.php
   - wmf-config/Wikibase-production.php

(or maybe I'd combine the last two into a sync-dir wmf-config).

The new policy asks the folks submitting patches to split up patches to 
avoid bad intermediate states ahead of time.


So instead of me syncing that change one file at a time, maybe that 
change becomes two changes and I can pull a change that adds the 
WikibaseSearchSettings.php file and the variables in 
InitialiseSettings.php -- sync-dir wmf-config, and then a second patch 
that could be synced all at once as well. Two syncs rather than 3 or 4 
in this case.


The hope is that this will be more efficient, less error-prone, and 
lower the difficulty factor for deployers. Ideally, this makes it more 
and more difficult to earn a t-shirt[0] :)


-- Tyler

[0]. 


Re: [Wikitech-l] [Ops] Changes to SWAT deployment policies, effective Monday April 30th

2018-04-27 Thread Tyler Cipriani

On 18-04-27 21:00:35, Gergo Tisza wrote:

On Fri, Apr 27, 2018 at 7:05 PM, Niharika Kohli 
wrote:


Also, I think dropping the limit to 4 patches per window is extreme,
especially if we are asking people to start splitting their patches now.
Very often we can +2 multiple patches in one go if they don't affect each
other, or sync out changes together if they happen to the same file. I've
deployed 8 patches in a window often, with people asking if they can add
more yet. Due to timezone limitations, most people can only attend one of
the SWAT windows and if they can't get it out in that window, they have to
wait a whole day or more to get it out.



FWIW, here are some stats on the patch counts for the first three months of
2018 (might contain errors, tried not to spend too much time on it):

EU mid: 8, 6, 2, 10, 1, 6, 1, 2, 1, 2, 2, 6, 2, 9, 6, 10, 2, 1, 1, 8, 7, 5,
5, 5, 2, 7, 6, 8, 4, 7, 4, 4, 4, 1, 1, 4, 3, 3, 2, 4, 4, 5, 5, 1
(average: 4.25, max of 2-week rolling average: 5.7)
Morning: 7, 1, 1, 5, 1, 3, 4, 1, 6, 3, 4, 3, 3, 5, 6, 4, 3, 5, 8, 4, 5, 4,
3, 0, 3, 1, 5, 9, 2, 1, 3
(average: 3.6, max of 2-week rolling average: 5)
Evening: 8, 1, 1, 1, 1, 3, 3, 2, 1, 3, 4, 2, 1, 1, 2, 1, 1, 5, 1, 6, 1, 3,
3, 1, 2, 3, 1, 0, 9, 10, 5, 4, 1, 6, 2, 1, 2, 4, 3
(average: 2.8, max of 2-week rolling average: 4.75)
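
The quoted averages can be reproduced from the listed counts with a few lines of Python. The rolling-average figures are not recomputed here because the number of windows in a "2-week" span is not stated; the names `PATCH_COUNTS` and `averages` are only for this sketch.

```python
from statistics import mean

# Patch counts per SWAT window, copied from the lists above.
PATCH_COUNTS = {
    "EU mid": [8, 6, 2, 10, 1, 6, 1, 2, 1, 2, 2, 6, 2, 9, 6, 10, 2, 1, 1, 8,
               7, 5, 5, 5, 2, 7, 6, 8, 4, 7, 4, 4, 4, 1, 1, 4, 3, 3, 2, 4, 4,
               5, 5, 1],
    "Morning": [7, 1, 1, 5, 1, 3, 4, 1, 6, 3, 4, 3, 3, 5, 6, 4, 3, 5, 8, 4,
                5, 4, 3, 0, 3, 1, 5, 9, 2, 1, 3],
    "Evening": [8, 1, 1, 1, 1, 3, 3, 2, 1, 3, 4, 2, 1, 1, 2, 1, 1, 5, 1, 6,
                1, 3, 3, 1, 2, 3, 1, 0, 9, 10, 5, 4, 1, 6, 2, 1, 2, 4, 3],
}

# Mean number of patches per window for each SWAT slot.
averages = {window: mean(counts) for window, counts in PATCH_COUNTS.items()}
```

Rounded, these come out to 4.25, 3.6, and 2.8, matching the figures given.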



These stats are really cool and they made me want to dig a little more. There 
have been a few times where having data about the actual syncs that make up a 
given SWAT window would be nice[0] (this being another one of those times).


As of now, to get information about the number of syncs that make up a 
given SWAT window -- or to see how long a SWAT window actually takes -- there 
is some digging in the SAL required (and even then it can be hard to 
figure out what happened if there is a window with patches, but no 
syncs, or just one sync, etc). Anyway, I spent some time digging in the 
SAL[1] to correlate SWAT windows on Wikitech to actual syncs and 
deployments.


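One thing I found is that the number of patches listed on Wikitech isn't 
necessarily the number of patches that make it out in a given window -- which 
makes sense -- sometimes we run out of time in the window, or people don't show 
up, or something breaks and we have to stop. In the list below, the columns are 
the date and window, the number of patches listed, the time the window took 
(h:mm), and patches deployed/patches listed.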


2018-01-02 Europe:  8 patches  1:05 6/8
2018-01-03 Evening: 8 patches  1:01 8/8
2018-01-08 Europe:  8 patches  1:03 8/8
2018-01-29 Europe:  9 patches  0:58 4/9
2018-02-06 Europe:  10 patches 1:02 7/10
2018-02-13 Europe:  8 patches  1:01 5/8
2018-02-28 Europe:  8 patches  1:16 7/8

The other thing I found is that, between 2018-01-02 and 2018-03-09, no SWAT 
window with more than 6 patches both stayed within the allotted 1-hour time 
limit and deployed all of its patches (although we were very close a couple of 
times).


2018-01-02 Europe:  8 patches  1:05 6/8
2018-01-03 Evening: 8 patches  1:01
2018-01-03 Morning: 7 patches  1:19
2018-01-08 Europe:  8 patches  1:03
2018-01-29 Europe:  9 patches  0:58 4/9
2018-02-06 Europe:  10 patches 1:02 7/10
2018-02-13 Europe:  8 patches  1:01 5/8
2018-02-14 Europe:  7 patches  1:31 5/7
2018-02-26 Europe:  7 patches  1:32
2018-02-28 Europe:  8 patches  1:16 7/8
2018-03-05 Europe:  7 patches  0:57 5/7

Looking at this info maybe 6 is the magic number?
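As a rough cross-check of the claim above, here is a small sketch that parses the rows from the list and looks for any window that finished within the hour and deployed everything. The row format is taken from the list itself; the regular expression, the helper name `parse_row`, and the treatment of rows without a deployed/listed fraction (assumed to mean everything was deployed) are my own assumptions.

```python
import re

# Rows copied from the list above: date, window, patches listed,
# window duration (h:mm), and optionally deployed/listed.
ROWS = """\
2018-01-02 Europe:  8 patches  1:05 6/8
2018-01-03 Evening: 8 patches  1:01
2018-01-03 Morning: 7 patches  1:19
2018-01-08 Europe:  8 patches  1:03
2018-01-29 Europe:  9 patches  0:58 4/9
2018-02-06 Europe:  10 patches 1:02 7/10
2018-02-13 Europe:  8 patches  1:01 5/8
2018-02-14 Europe:  7 patches  1:31 5/7
2018-02-26 Europe:  7 patches  1:32
2018-02-28 Europe:  8 patches  1:16 7/8
2018-03-05 Europe:  7 patches  0:57 5/7
"""

ROW_RE = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) (?P<window>\w+):\s+"
    r"(?P<listed>\d+) patches\s+(?P<hours>\d+):(?P<minutes>\d{2})"
    r"(?:\s+(?P<deployed>\d+)/\d+)?"
)


def parse_row(line):
    """Turn one row of the list into a dict of typed fields."""
    m = ROW_RE.match(line)
    if m is None:
        raise ValueError(f"unrecognised row: {line!r}")
    listed = int(m["listed"])
    # ASSUMPTION: a row without a deployed/listed fraction means that
    # every listed patch was deployed.
    deployed = int(m["deployed"]) if m["deployed"] else listed
    return {
        "date": m["date"],
        "window": m["window"],
        "listed": listed,
        "deployed": deployed,
        "minutes": int(m["hours"]) * 60 + int(m["minutes"]),
    }


windows = [parse_row(line) for line in ROWS.splitlines()]

# Windows with more than 6 patches that stayed within 60 minutes
# and deployed every listed patch.
on_time_and_complete = [
    w for w in windows
    if w["listed"] > 6 and w["minutes"] <= 60 and w["deployed"] == w["listed"]
]
print(on_time_and_complete)
```

Lowering the `> 6` threshold would need rows for the smaller windows, which are not included in the list above.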

FWIW, I feel like I struggle to get out 8 patches in an hour (depending on the 
patches).


That said, requiring more patches per change while also allowing fewer patches
in a given window may not be the best course of action. As Chad said elsewhere
in the thread, maybe we should focus on "changes" per window, where a "change"
is the equivalent of what a patch is currently.

-- Tyler

[0]. 
[1].  


Re: [Wikitech-l] [Engineering] Phabricator spam - account approval requirement enabled

2018-07-01 Thread Tyler Cipriani
I wrote a short, quick fixer script that is terrible, but is saving me 
some time in fixing some tasks today.


I figured I'd share the script[0] on this list even though it's very 
very (very) alpha and was written very quickly.


Thank you to everyone looking at and thinking about this issue.

-- Tyler

[0].  

On 18-07-01 03:51:15, Leon Ziemba wrote:

I wrote a rollback script, currently running as CommunityTechBot and
previously as Community Tech bot. It
seems to work, aside from setting the triage level, which hopefully isn't a
huge deal. I can try to fix that later. It is also being slowed down by
rate limiting. The script isn't quite shareable yet but when it is I'll
publish it. Going to sleep now :)

~Leon

On Sun, Jul 1, 2018 at 2:58 AM Amir E. Aharoni wrote:


Thanks to all the people who are working (on a weekend!) to fix this.


--
Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
http://aharoni.wordpress.com
‪“We're living in pieces,
I want to live in peace.” – T. Moore‬

2018-07-01 5:53 GMT+02:00 Greg Grossmeier :

> Hello,
>
> Unfortunately we are experiencing spam in our Phabricator instance
> again and have decided to turn on the requirement for new account
> approval by Phabricator admins as a mitigation step.
>
> I'm sorry for the inconvenience. We are actively working to address this
> issue.
>
> Greg
>
> --
> | Greg Grossmeier            GPG: B2FA 27B1 F7EB D327 6B8E |
> | Release Team Manager            A18D 1138 8E47 FAC8 1C7D |
>
___
Engineering mailing list
engineer...@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/engineering

[Wikitech-l] New Beta Cluster Deployment Servers

2018-08-02 Thread Tyler Cipriani

Hi All!

There are new deployment servers in beta cluster: deployment-deploy01 
and deployment-deploy02.


These servers replace deployment-tin and deployment-mira (which I just 
shut down today).


If you had a home directory on either deployment-tin or deployment-mira 
I've moved it to the new machines at:


   deployment-deploy01:~/deployment-tin-home
   deployment-deploy02:~/deployment-mira-home

If you see anything amiss that you feel may be a result of this move, 
please file a task on phab or find me on IRC.


Thanks to Daniel Zahn and Alex Monk for all their work on this task!

Thanks!
-- Tyler


[Wikitech-l] This Week's Log Health

2018-08-24 Thread Tyler Cipriani

Hi all!

The state of the Wikimedia error logs makes determining the health of a 
deployment difficult. This week in particular, a number of log messages 
made it hard to judge the health of the train.


This email is a request for help with a couple of troubling messages 
currently showing up in our error logs; please help to investigate these 
log messages if you are able:


* Exception thrown for failure to save settings appears ~ 1000 
times/day[0]

* "Falling back to DifferenceEngineSlotDiffRenderer" Logspam[1]

Thank you for your help!

<3
-- Tyler

[0] 
[1] 


Re: [Wikitech-l] [Engineering] Gerrit now automatically adds reviewers

2019-01-18 Thread Tyler Cipriani

Hi all,

Gerrit no longer automatically adds reviewers[0]. Unfortunately, this 
plugin appears (given the replies on this thread) to be missing key 
features needed to be useful for us at this time. Apologies to those 
folks whose inboxes were destroyed.


I would like to re-enable this plugin at some point, provided the 
features identified in this thread are added (perhaps also an 
"X-Gerrit-reviewers-by-blame: 1" email header, or subject line to make 
filtering these messages easier).
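To illustrate why such a header would help, here is a minimal sketch of a client-side filter. The header is only the one proposed above; Gerrit does not send it today as far as this thread shows, so the header name, the sample messages, and the helper `added_by_blame` are all hypothetical.

```python
from email import message_from_string

# HYPOTHETICAL header: this is the one proposed above, not something
# Gerrit is known to emit.
BLAME_HEADER = "X-Gerrit-reviewers-by-blame"


def added_by_blame(raw_message):
    """Return True if the message carries the proposed header set to 1."""
    msg = message_from_string(raw_message)
    return msg.get(BLAME_HEADER, "").strip() == "1"


# Made-up sample messages, for illustration only.
auto_added = (
    "From: gerrit@example.org\n"
    "Subject: Change in example/repo: Fix typo\n"
    "X-Gerrit-reviewers-by-blame: 1\n"
    "\n"
    "You were added as a reviewer.\n"
)
manually_added = (
    "From: gerrit@example.org\n"
    "Subject: Change in example/repo: Fix typo\n"
    "\n"
    "You were added as a reviewer.\n"
)

print(added_by_blame(auto_added))
print(added_by_blame(manually_added))
```

A mail client rule matching on the same header would do the equivalent without any code.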


In the interim, project owners are able to opt in to using the 
reviewers-by-blame plugin on a per-project basis on their project admin 
page in Gerrit.


Also, the Git Reviewer Bot[1] provides folks an opt-in method of 
volunteering to review a subset of files in a particular repo.


Getting code review as a new contributor is hard. Thanks for bearing 
with us.


-- Tyler

[0]. 
[1]. 

On 19-01-17 13:51:58, Greg Grossmeier wrote:

Hello,

Yesterday we (the Release Engineering team) enabled a Gerrit plugin that
will automatically add reviewers to your changes based on who previously
has committed changes to the file.

For more, please read the blog post at:
https://phabricator.wikimedia.org/phame/post/view/139/gerrit_now_automatically_adds_reviewers/

NOTE: There are a couple requests from us open upstream to improve the
plugin[0], we'll incorporate those improvements when they are released.

On behalf of the rest of the Release Engineering Team[1],

Greg

[0] https://phabricator.wikimedia.org/T101131#4890023
[1] As well as Paladox, a Wikimedia volunteer with strong ties to
upstream Gerrit.

--
| Greg Grossmeier            GPG: B2FA 27B1 F7EB D327 6B8E |
| Release Team Manager            A18D 1138 8E47 FAC8 1C7D |


Re: [Wikitech-l] [Engineering] Gerrit now automatically adds reviewers

2019-01-19 Thread Tyler Cipriani

On 19-01-18 22:12:22, Pine W wrote:

I'm glad that this problematic change to communications was reverted.


Clarification: Enabling this plugin wasn't reverted; rather, a 
configuration change was made to the default settings of the plugin.


Thanks to the helpful suggestions on this thread, it's my hope that the 
upstream plugin (in future) may contain additional configuration options 
to improve the usability of this plugin for everyone, including 
Wikimedia technical contributors.



I would like to suggest that this is the type of change that, when being
planned, should get a design review from a third party before coding
starts, should go through at least one RFC before coding starts, and be
widely communicated before coding starts and again a week or two before
deployment. Involving TechCom might also be appropriate. It appears that
none of those happened here. In terms of process this situation looks to me
like it's inexcusable.


As Chad mentioned this is a plugin developed by upstream Gerrit.

Enabling this plugin was tracked in Wikimedia's public Phabricator[0].

As is now well understood in hindsight, the default configuration of 
this plugin (as designed by Gerrit upstream) is far from optimal 
for Wikimedia technical contributors.


[0]. 


In the English Wikipedia community, doing something like this would have a
reasonable likelihood of costing an administrator their tools, and I hope
that a similar degree of accountability is enforced in the engineering
community. In particular, I expect engineering supervisors to follow
established technical processes for changes that impact others' workflows,
and if they decide to skip those processes without a compelling reason
(such as a site stability problem) then I hope that they will be held
accountable. Again, from my perspective, the failure to follow process here
is inexcusable.


As was pointed out by others: it's difficult to make a comparison 
between the English Wikipedia community and the Wikimedia technical 
contributors community (although many folks belong to both groups). I 
don't believe holding individuals to a post hoc set of standards creates 
a healthy community in any case.


I do agree that technical contributors should be accountable. That is, 
technical contributors should strive to be responsive to issues when 
they arise (as issues will arise when attempting to accomplish goals in 
a technical space).


-- Tyler



Re: [Wikitech-l] [Engineering] Gerrit now automatically adds reviewers

2019-01-22 Thread Tyler Cipriani

Hi all!

This plugin has been removed entirely from Wikimedia Gerrit[0]. I know 
of no one who intended to experiment with the plugin in its current form 
so it is now removed.


I have created a task to track suggestions for this plugin's 
improvement[1]. This task's scope is to track suggestions for 
improvements to the reviewers-by-blame plugin so they are not lost in 
this thread. Implementation details about individual suggestions should 
(likely) become separate tasks.


Any discussion about what is needed to re-enable this plugin for 
Wikimedia's Gerrit is a different discussion for another task and time. 
We won't re-enable this plugin without notice or without further 
discussion.



On 19-01-21 18:20:13, Paladox via Wikitech-l wrote:
FYI, I have a working prototype of a "Suggest Reviewer" button.


On Monday, 21 January 2019, 16:32:35 GMT, Paladox via Wikitech-l wrote:

I’m currently working on addressing all the feedback as fast as I can.
I honestly think this extension is great, especially for new users who
do not know they need reviewers or who would review their change.
Granted, this extension has some problems, hence why feature requests
were filed against the extension.


+1 to what Niharika and others have said: thank you for all your work on 
Gerrit, Paladox!


You have made maintaining and keeping Gerrit secure easier for me 
personally.


Thanks all for your thoughts on this thread, and thank you in advance for 
ensuring the task for suggestions for improvement is accurate.


-- Tyler

[0]. 
[1]. 



[Wikitech-l] [Train] 1.33.0-wmf.18 status update

2019-02-19 Thread Tyler Cipriani

Hello all!

I have not yet started the 1.33.0-wmf.18 train; however, at the end of 
last week, I noticed some errors that (AFAICT) are regressions in 
1.33.0-wmf.17.


There were two new errors that started showing up in 1.33.0-wmf.17:

1. ErrorException from includes/HeaderCallback.php: PHP Notice: 
Undefined offset: 1[0]
2. includes/specials/pagers/ActiveUsersPager.php: PHP Notice: Undefined 
index: dir[1]


Neither of these errors was happening at a high enough rate, or with 
enough user impact, to trigger a rollback of 1.33.0-wmf.17; however, 
I added them as blockers for wmf.18 in the hopes that we could address 
regressions introduced in wmf.17 before rolling out wmf.18.


If folks could take a look at these tasks and help me resolve these 
regressions before we start rollout of a new version that'd be great!


Thanks in advance for your help and attention!

-- Tyler

[0]. 
[1]. 



Re: [Wikitech-l] Gerrit outage

2019-03-18 Thread Tyler Cipriani

Hello,

As part of cleanup and response, Gerrit's use of HTTP tokens has been 
disabled. You should still be able to use the HTTP REST API using your 
LDAP password.
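For folks with scripts that talk to the REST API, here is a minimal sketch of an authenticated call using only the Python standard library. It relies on two general Gerrit conventions: authenticated endpoints live under the `/a/` prefix, and JSON responses start with a `)]}'` line that has to be stripped before parsing. The helper names, the example username, and the choice of the `accounts/self` endpoint are assumptions for illustration, not something specific to our setup.

```python
import base64
import json
import urllib.request

GERRIT = "https://gerrit.wikimedia.org/r"


def build_request(path, username, password):
    """Build an HTTP basic auth request for an authenticated endpoint."""
    # Gerrit serves authenticated REST endpoints under the /a/ prefix.
    url = f"{GERRIT}/a/{path.lstrip('/')}"
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def parse_gerrit_json(body):
    """Strip Gerrit's )]}' prefix line and decode the JSON that follows."""
    prefix = ")]}'"
    if body.startswith(prefix):
        body = body[len(prefix):]
    return json.loads(body)


def get(path, username, password):
    """Fetch and decode one REST resource (performs a network call)."""
    with urllib.request.urlopen(build_request(path, username, password)) as resp:
        return parse_gerrit_json(resp.read().decode("utf-8"))


# Example usage (not run here; "example-user" is a placeholder):
#   account = get("accounts/self", "example-user", "your LDAP password")
#   print(account["name"])
```

Keeping the request building and the response parsing in separate functions makes both easy to check without touching the network.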


Gerrit's command-line tools [0] that operate via SSH are also still 
available.


-- Tyler

[0]. 

On 19-03-16 10:26:52, John Bennett wrote:

Hello,


On 16 March 2019, Wikimedia Foundation staff observed suspicious activity
associated with Gerrit and as a precautionary step has taken Gerrit offline
pending investigation.


The Wikimedia Foundation's Security, Site Reliability Engineering and
Release Engineering teams are investigating this incident as well as
potential improvements to prevent future incidents. More information will
be posted on Phabricator (https://phabricator.wikimedia.org/T218472 ) as it
becomes available and is confirmed. If you have any questions, please
contact the Security (secur...@wikimedia.org 
).


Thanks

[Wikitech-l] Gerrit ACL updates/phab mirroring

2019-04-05 Thread Tyler Cipriani

Hi All,

I wanted to send a heads-up that I've reverted some recent Gerrit ACL 
changes that I believe have been problematic from a Gerrit operational 
stability standpoint.


These ACL changes have been blocking some Phabricator mirroring, and you 
may receive emails from Phabricator about old changes being mirrored 
into Phabricator.


Sorry for the noise.

Your Gerrit Toiler,
-- Tyler


Re: [Wikitech-l] Uploading new versions of other people's patches to gerrit

2019-04-09 Thread Tyler Cipriani

On 19-04-09 05:17:17, Isarra Yos wrote:
This seems to be broken, or something. It's causing problems for 
collaboration. Please fix.


This is the addPatchSet permission in Gerrit.

That particular permission was recently abused, and at that time the 
permission was modified. The current status will remain for at least the 
next week or two while we sort out some other Gerrit problems.


The current status is that "Project Owners" can use addPatchSet for any 
projects they own: this does not necessarily map to +2/-2 permissions 
for a given project, but should mostly align.


For any projects in the "mediawiki" hierarchy, the "mediawiki" group has 
the ability to addPatchSet.


After our current problems are resolved, I'd like to talk both with the 
folks impacted and with the security team about finding some middle 
ground between the current status and anyone being able to modify any 
patchset at any time.


-- Tyler



Re: [Wikitech-l] Uploading new versions of other people's patches to gerrit

2019-04-09 Thread Tyler Cipriani

Hello,

On 19-04-09 17:50:12, Isarra Yos wrote:
To clarify, I get what you're trying to do, but there has got to be a 
better solution besides denying key features to and thus impairing 
long-term contributors, because disabling this for everyone (but 
apparently WMDE (?!)) does exactly that. On the wikis, for instance, 
we have specific groups for this sort of thing (rollback, file moving, 
bypass captcha, bypass rate limits), to let established users do the 
things they need to do, without it being allowed of absolutely 
everyone from the start, and without requiring them to be admins, 
either, to do it. Perhaps actually using one of our more general 
existing groups for this here would make sense?


Bawolff has suggested Trusted-Contributors as a group that might make 
sense to add. That seems sane to me, so I've added that group in 
addition to the list above. Hopefully that unblocks your work.


I agree: having specific groups that are granted specific permissions 
that allow established users to continue their work unobstructed should 
be achievable in the near-term.


-- Tyler



[Wikitech-l] Gerrit HTTP Token Auth re-enabled

2019-06-24 Thread Tyler Cipriani

Hi all!

tl;dr: Gerrit HTTP token auth has been re-enabled. To use it you'll need to
generate a token via your preferences page[0].

Gerrit HTTP token auth was disabled in mid-March due to concerns about its
implementation[1].

Thanks to the work of Paladox and Gerrit upstream in Gerrit 2.15.14[2] we've
re-enabled HTTP token authentication.

I previously removed all HTTP auth tokens, so in order to use HTTP token auth
you'll need to generate a fresh token via your preferences page[0].

Your Lowly Gerrit Fiddler,
-- Tyler

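[0]. <https://gerrit.wikimedia.org/r/#/settings/http-password>
[1]. <https://phabricator.wikimedia.org/T218750>
[2]. <https://www.gerritcodereview.com/2.15.html#21514>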



Re: [Wikitech-l] Gerrit HTTP Token Auth re-enabled

2019-06-25 Thread Tyler Cipriani

I spoke too soon :(

Frustratingly, Gerrit 2.15.14 exacerbated an existing bug[0] to the 
extent that I feel like we have no choice but to roll back to 2.15.13.


I have re-disabled HTTP token auth for the time being.

Apologies for the false hope,
-- Tyler

[0]. <https://phabricator.wikimedia.org/T224448>


On 19-06-24 15:53:25, Tyler Cipriani wrote:

Hi all!

tl;dr: Gerrit HTTP token auth has been re-enabled. To use it you'll need to
generate a token via your preferences page[0].

Gerrit HTTP token auth was disabled in mid-March due to concerns about its
implementation[1].

Thanks to the work of Paladox and Gerrit upstream in Gerrit 2.15.14[2] we've
re-enabled HTTP token authentication.

I previously removed all HTTP auth tokens, so in order to use HTTP token auth
you'll need to generate a fresh token via your preferences page[0].

Your Lowly Gerrit Fiddler,
-- Tyler

[0]. <https://gerrit.wikimedia.org/r/#/settings/http-password>
[1]. <https://phabricator.wikimedia.org/T218750>
[2]. <https://www.gerritcodereview.com/2.15.html#21514>






[Wikitech-l] Fwd: [RelEng] deployment-prep outage this Friday

2019-07-18 Thread Tyler Cipriani
The beta cluster will be intermittently unavailable tomorrow due to WMCS 
maintenance. See forwarded message for details.


Thanks!
-- Tyler

- Forwarded message from Andrew Bogott  -


Date: Wed, 17 Jul 2019 15:51:57 -0500
From: Andrew Bogott 
To: Release Engineering 
Subject: [RelEng] deployment-prep outage this Friday

On Friday I'm going to move many of the VMs in deployment-prep to new 
hardware.  The effect of this will be rolling, intermittent outages 
throughout the project as different pieces are offline.


I'll be starting in my morning, around 14:00 UTC (7AM Pacific Time).  The 
total process will take several hours.



-Andrew


___
RelEng mailing list
rel...@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/releng


- End forwarded message -



[Wikitech-l] Blubber v0.8.0/blubber.yaml update

2019-07-25 Thread Tyler Cipriani

Hi all!

CAVEAT EMPTOR: If you don't use the Deployment Pipeline[0] or Blubber[1] 
then this email may mean nothing to you.


A new version of Blubber has been released -- v0.8.0[2].

The main change is that we've eliminated the "artifacts" command for use 
in multistage builds. The details of this change are available in the 
Blubber User Guide[3].


I've created patches that preserve the current functionality of Blubber 
for the Deployment Pipeline and tagged all of them "blubber-v4" in 
Gerrit[4].


Thanks!
-- Tyler

[0]. 
[1]. 
[2].  
[3].  
[4]. 



[Wikitech-l] Github: WMFGerrit closing pull requests

2019-08-05 Thread Tyler Cipriani
tl;dr: If your team doesn't do any development on GitHub then this email 
likely doesn't affect you.


As you may or may not know there is now a read-only replica of Gerrit 
available at https://gerrit-replica.wikimedia.org/ (hooray); however, 
over the weekend we noticed some missing tags from that mirror (boo).


To fix the missing tags for the replica I forced replication to run for 
all repos in Gerrit today as part of a configuration restart. After a 
replication sync I was able to ensure that all repos on the new replica 
were now up-to-date; however, it also closed all the pull requests that 
were made via pushing branches to wikimedia-org GitHub repos (which is 
the work flow of several apps teams and possibly others).


Apologies for the inconvenience and thanks to Dmitry Brant and Joe 
Walsh for pinging me about the problem.


I've since removed GitHub as a "mirror" -- meaning Gerrit will not 
delete branches there. Paladox has filed a task upstream to allow us to 
specify a full replication for a particular remote (i.e., gerrit-replica 
but not GitHub) instead of all remotes[0]. As suspenders to go with that 
belt, I've made a patch set that should exclude these projects from 
replicating from Gerrit to GitHub in the future[1].


I think all of the fallout of this change is taken care of (judging from 
my GitHub search): 



But if your project was affected, please either reach out to me or add 
your project to the GitHub exclusion list in Puppet like in my 
patchset[1] and add me as a reviewer.


Thanks and sorry
-- Tyler

[0]. 
[1]. 



Re: [Wikitech-l] Being logged out of gerrit

2019-10-13 Thread Tyler Cipriani
On Sun, Oct 13, 2019 at 10:11 AM Amir Sarabadani  wrote:
>
> The ticket implies this has been fixed for a while, but I got logged out three
> times last week. I'm not sure if it's the only reason.

That's probably still the right ticket in my view. That's the task
where I tracked my investigation of this same issue previously. I
never felt I found the root cause, but the folks reporting the issue
originally said that it stopped happening after I raised the
web_session cache size so we closed out the task.

It looks like that task is reopened and we can follow up there.

Thanks for reporting!

-- Tyler

>
> On Sat, Oct 12, 2019 at 6:55 AM Andre Klapper 
> wrote:
>
> > On Sat, 2019-10-12 at 10:43 +0700, Andre Klapper wrote:
> > > On Sat, 2019-10-12 at 08:09 +0530, Niharika Kohli wrote:
> > > > I constantly face this too. Would really like for this to be fixed.
> > >
> > > I cannot find an open task under
> > > https://phabricator.wikimedia.org/maniphest/query/Pv3ucr952tSH/#R
> > >
> > > Would someone like to file a task in Phab? :)
> >
> > Ah, https://phabricator.wikimedia.org/T222472 might be related?
> >
> > andre
> > --
> > Andre Klapper (he/him) | Bugwrangler / Developer Advocate
> > https://blogs.gnome.org/aklapper/
> >
> >
>
>
>
> --
> Amir (he/him)

[Wikitech-l] December Deployment Freeze

2019-10-16 Thread Tyler Cipriani
Hi All!

tl;dr: December deployment freeze 2019-12-19–2020-01-02. Train resumes
2020-01-06.

We're coming up to the time of year where we freeze our production
deployments to allow for the holiday season's limited availability as
well as to ensure the stability of our holiday fundraising.

After talking with the fundraising team we've decided to stick with
what we've done in years past[0][1][2][3][4] and freeze for the final
week of the year and the first week of the new year. For this year that
means the final deployment of the calendar year will be on 2019-12-19 and
the first deployment of the new year will be on 2020-01-02.

Train will resume running the week of 2020-01-06.
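If you want to encode these dates in your own tooling, a minimal sketch might look like the following. It assumes the two boundary dates are themselves deploy days (the last deployment of 2019 and the first of 2020), so only the days strictly between them count as frozen; the function and constant names are made up for illustration. The deployment calendar on Wikitech remains the source of truth.

```python
from datetime import date

LAST_DEPLOY_OF_2019 = date(2019, 12, 19)
FIRST_DEPLOY_OF_2020 = date(2020, 1, 2)


def is_frozen(day):
    """Return True if `day` falls inside the December deployment freeze.

    ASSUMPTION: the boundary dates are deploy days, so the freeze covers
    only the days strictly between them.
    """
    return LAST_DEPLOY_OF_2019 < day < FIRST_DEPLOY_OF_2020


for day in (date(2019, 12, 19), date(2019, 12, 25), date(2020, 1, 2)):
    print(day.isoformat(), "frozen" if is_frozen(day) else "open")
```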

<3
-- Tyler

[0]. 
[1]. 
[2]. 
[3]. 
[4]. 


Re: [Wikitech-l] Gerrit: Server maintenance

2019-10-21 Thread Tyler Cipriani
On Thu, Oct 17, 2019 at 3:43 PM Patrick Mulhall wrote:
>
> Just a reminder that this will be happening on Monday next week.

This is happening in a few minutes. Progress will be logged in
#wikimedia-operations

-- Tyler


Re: [Wikitech-l] Gerrit: Server maintenance

2019-10-21 Thread Tyler Cipriani
On Mon, Oct 21, 2019 at 11:52 AM Tyler Cipriani  wrote:
>
> On Thu, Oct 17, 2019 at 3:43 PM Patrick Mulhall
>  wrote:
> >
> > Just a reminder that this will be happening on Monday next week.
>
> This is happening in a few minutes. Progress will be logged in
> #wikimedia-operations

Correction, in 1 hour and a few minutes :)

Post-Morning SWAT.

-- Tyler

>

Re: [Wikitech-l] Gerrit: Server maintenance

2019-10-22 Thread Tyler Cipriani
After quite a bit of thought and experimentation we've managed to, I
believe, resolve the issue of reappearing patchsets on the new Gerrit
server. All the gory details are available on the Phabricator task[0].

Thank you everyone, as always, for your patience and support.

-- Tyler

[0]. 

On Mon, Oct 21, 2019 at 4:33 PM Daniel Zahn  wrote:
>
> This has been completed.
>
> Gerrit is now running on Debian Buster and a new server, gerrit1001, with
> 64GB RAM.
>
> Next will be tuning the config to make use of it.

Re: [Wikitech-l] New, simple Docker development environment for MediaWiki core

2020-02-24 Thread Tyler Cipriani
On Mon, Feb 24, 2020 at 9:44 AM Brennen Bearnes  wrote:
> TL;DR: `docker-compose up` gets you a Docker environment with which
> to develop.
>
> The Engineering Productivity group is happy to announce the
> availability of a new, official Docker environment for MediaWiki
> core. [0] This is a component of our work on improving developer
> productivity, as part of the Wikimedia Foundation's "Platform
> Evolution" [1] multi-year priority, looking to support faster, more
> reliable technical change for our communities. We've been exploring
> options for a year now, and we had a great deal of input,
> particularly at the TechConf 2019, where Kosta Harlan worked
> closely with us to move this forward. [2]

Very cool! I'm really happy to see projects that grew out of TechConf
coming to fruition!

Kudos and thanks to everyone involved in this work!

-- Tyler


[Wikitech-l] COVID-19 Deployment Guidance

2020-03-20 Thread Tyler Cipriani
Hi all!

In response to COVID-19, we are putting in place stricter guidelines around
deployments with an emphasis on site reliability.

To support this change, we have the following guidelines for software
development:


   - While we are not going to go into full emergency or holiday mode (i.e., no
     releases), we do think it is necessary to de-risk the deployment train by
     adding some additional scrutiny into the process. Our ask is that you take
     extra precautions as outlined in our deployment guidelines below. Most
     importantly, if you know you have limited availability to support a
     deployment, don’t put your code on the train. When in doubt, ask.

   - Please review the COVID-19 deployment guidelines at
     https://wikitech.wikimedia.org/wiki/Deployments/Covid-19

   - SWAT (emergency hot-fix) deploys will continue as is

   - We are limiting the frequency of onsite data center work to help
     minimize the exposure of our team members who travel in and out of our
     data center facilities. This will result in the general delay of hardware
     installations and repairs, though we will continue being immediately
     available for emergencies associated with uptime and critical
     redundancies. We are still finalizing what this means and will provide
     additional guidance when we have it.


Please err on the side of caution with the changes you merge.

Considerations (from the wikitech page)


   - Can you roll back this change without lasting impact?
      - A recovery plan is required as this will help identify our capacity
        for recovering from the failure
      - THIS IS A KEY QUESTION, if you can’t answer it, you shouldn’t deploy

   - Is specialized knowledge required to support this change in production?
      - Are there multiple people with this knowledge?

   - Is there a way to increase confidence about the correctness of this
     change?
      - Reviews (Design, Code, etc)
      - Testing coverage (unit tests, integration tests)
      - Manual testing (e.g. Beta, vagrant, docker)


We’re hosting office hours on Mondays at 17:00 UTC in #wikimedia-office
where you can ask questions regarding what is a good choice vs not.

Thank you all in advance for your understanding and empathy over the next
few weeks.

<3

-- Your Local (Internet) Neighborhood Release Engineers

[Wikitech-l] Deployment Guidance Office Hours

2020-04-07 Thread Tyler Cipriani
Hi all!

Just a friendly reminder:

We’re hosting office hours on Mondays at 17:00 UTC in #wikimedia-office on
freenode where you can ask questions regarding the COVID-19 deployment
guidelines (https://wikitech.wikimedia.org/wiki/Deployments/Covid-19).

This week's train deployment branch was cut today[0]. If you have any
concerns about patches that landed this week, feel free to reach out to me
or Release Engineering via email or IRC.

<3!
-- Tyler

[0]. 

[Wikitech-l] Deployments next week: no train, no deploys Tuesday–Friday

2020-04-15 Thread Tyler Cipriani
Hi All,

Reminder that there is no train next week and there are no deployments
Tuesday–Friday.

A large number of folks will be unavailable Wednesday through Friday, so
we're cancelling the deployment train next week and treating next Tuesday
like a Friday (which means only deploying in cases of emergency[0]).

As always, the deployment schedule on Wikitech should be up-to-date and
canonical[1].

Thanks all!
-- Tyler

[0]. 
[1]. 

Re: [Wikitech-l] Namespace Localisations & Updates

2020-05-18 Thread Tyler Cipriani

Hi Samuel

On 20-05-18 09:57:54, RhinosF1 - wrote:

> On https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/596424/, it was
> raised correctly that namespaceDupes.php would need to be run.
>
> https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/596424/ and
> https://gerrit.wikimedia.org/r/#/q/bug:%20T251287 can mostly run with the
> train (except the mediawiki config patch) but all require namespaceDupes to
> be run, and the patches on https://gerrit.wikimedia.org/r/#/q/bug:%20T251287
> could do with being deployed as close together as possible to avoid
> inconsistently translated namespaces.
>
> Would SWAT, as mentioned, be better for all 5 patches, or should we let what
> can ride with the train and deploy the one config patch shortly after and
> run for both wikis in that window? Or could we ask the train runners to do
> that?


What you're describing sounds like it would be a good candidate for SWAT
deployment. My reasoning is that (1) it is atypical to run maintenance
scripts as part of the train and (2) there are no guarantees that a
train won't roll back.

That is, backporting to a version that is stable ensures that we don't
end up having rolled forward to all wikis, run the maintenance script,
and then having to roll back due to an unrelated problem. Additionally,
the log triage that follows a train window may mean that we can't
guarantee a timely deploy of the configuration change following train.

To me, this feels safer/faster/easier as a SWAT deployment, even though
this might make for a particularly long SWAT window.


Thanks!
-- Tyler



Re: [Wikitech-l] Namespace Localisations & Updates

2020-05-18 Thread Tyler Cipriani

On 20-05-18 16:32:59, RhinosF1 - wrote:

> My only concern with that is that between the 2 tasks it would be 5/6
> patches in the SWAT window.

Yes, this work looks like it will consume a whole window easily :(

> It would also be my first mediawiki core + extensions SWAT so are the
> patches safe to +2 during / just before SWAT or should I get that done
> before?

One process I think might work:

* Merge core changes to master early in the week you plan to SWAT (but
  after branch cut for the week)
* Prepare cherry-picks to backport to current stable + branch to go out
  that week
* Add cherry-picks to deploy window
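
As a rough sketch of the second step above (preparing the cherry-picks),
the helper below only builds the git commands for backporting one merged
change to each deployed branch; it does not run them. The function name,
the commit hash, and the branch names are hypothetical placeholders, and
uploading the resulting commits to Gerrit for review is left out.

```python
def backport_commands(change_sha, branches, remote="origin"):
    """Build (but do not run) git commands that cherry-pick one merged
    commit onto each target branch.

    change_sha and branches are supplied by the caller; nothing here
    checks that they exist.
    """
    # Fetch once so the remote-tracking branches are current.
    commands = [f"git fetch {remote}"]
    for branch in branches:
        # Create or reset a local branch at the remote branch tip.
        commands.append(f"git checkout -B {branch} {remote}/{branch}")
        # -x records the original commit hash in the new commit message.
        commands.append(f"git cherry-pick -x {change_sha}")
    return commands


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    for command in backport_commands(
        "0123abc", ["wmf/1.35.0-wmf.32", "wmf/1.35.0-wmf.33"]
    ):
        print(command)
```

Printing the commands first makes it easy to review the plan with a
SWATter before anything is actually cherry-picked.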

It's likely this is more than 6 patches :\

Syncing with a SWATter prior to SWAT and ensuring that you pick a window 
with some time after it (should you need more time to deploy) OR making 
a special window on the deployment calendar[0] for this set of changes 
would be best.


-- Tyler

[0]: <https://wikitech.wikimedia.org/wiki/Deployments>



Re: [Wikitech-l] Subject: New train branch time: Tuesday 02:00UTC

2020-06-15 Thread Tyler Cipriani
Hi all!

Reminder that 1.35.0-wmf.37 will be branched automatically at 02:00
UTC tomorrow — Tue, 16 Jun.

Thanks!
-- Tyler

On Tue, Jun 2, 2020 at 3:11 PM Mukunda Modell  wrote:
>
> The Branch cut for our weekly MediaWiki release train is moving to full 
> automation, starting with 1.35.0-wmf.37[0] at 02:00 UTC, next Tuesday, June 
> 16th 2020.
>
> This is a slight change to the branch cut timing which usually happens at 
> approximately 17:00 UTC on Tuesdays. Previously, this was at the discretion 
> of the train deployer and will instead be at a deterministic time going 
> forward.
>
> If you have concerns about the timing of the branch cut, there is a 
> Phabricator Task[1] for that discussion. There is still flexibility in the 
> timing of the branch cut and we will consider further feedback as we fine 
> tune this process to best suit the needs of everyone involved.
>
> <3
> -- Release Engineering!
>
> [0]: 
> [1]: 


[Wikitech-l] [Train] Risky change template

2020-06-17 Thread Tyler Cipriani
Hi all!

A pattern of adding a note on the train-blocker phab task[0] for risky
changes has recently emerged.

Folks on RelEng have found this information useful, and we would like
to codify this, encouraging developers to continue this pattern. We've
added a page to Wikitech, with a template to use for these kinds of
changes[1].

If you're worried about the production impact of a patch you've
written and would like to ensure that folks who will be deploying it
are aware, please use our fancy new template and reply on the train
blocker task.

Thanks!
-- Tyler

[0]:  (<3 this tool)
[1]: 


[Wikitech-l] Deployments next week (2020-06-29)

2020-06-25 Thread Tyler Cipriani
Hi All,

Next week is a bit of a shortened week as US Independence Day is observed
on 2020-07-03 (next Friday).

I've updated the Wikitech deployment calendar to reflect that we
normally discourage deploying on the last working day of any week (aside
from emergencies).

The deployment train for 1.35.0-wmf.39 will also be running on a
shortened schedule:

   - Tue, 30 Jun, EU Train Window, 13:00 UTC: Group0
   - Wed, 01 Jul, EU Train Window, 13:00 UTC: Group1
   - Wed, 01 Jul, US Train Window, 19:00 UTC: All Wikis

Thanks!
-- Tyler

Re: [Wikitech-l] Replacement for Helm chart repository

2020-08-03 Thread Tyler Cipriani
On Mon, Aug 3, 2020 at 9:02 AM Janis Meybohm  wrote:
> Developers may now stop the process of packaging helm charts manually,
> rebuilding the index and pushing all that to git. As of now, increasing
> the charts version number in Chart.yaml is sufficient to have the chart
> being packaged and uploaded to ChartMuseum automatically.

This is great news! Thanks for this and the docs on wikitech :)

-- Tyler


[Wikitech-l] Deployments for (short) week of 2020-08-10

2020-08-09 Thread Tyler Cipriani
Hi All

tl;dr: Don't deploy on Thursday except for emergencies; the deployment
calendar on Wikitech is up-to-date.

Friday of next week (2020-08-14) is a WMF holiday. As such, Thursday should
be treated as Friday for the purposes of deployment; that is, no
deployments on Thursday except for emergencies.

This has implications for the (1.36.0-wmf.4) train. We'll be going to all
wikis on Wednesday evening UTC rather than Thursday evening UTC:

   - Tue, 11 Aug, EU Train Window, 13:00 UTC: Group0
   - Wed, 12 Aug, EU Train Window, 13:00 UTC: Group1
   - Wed, 12 Aug, US Train Window, 19:00 UTC: All Wikis

Hopefully, this will allow time for manual testing on Group0 wikis, and
allow us to complete the train on time.

<3
-- Tyler

Re: [Wikitech-l] realtime notifications disabled in Phabricator

2020-08-12 Thread Tyler Cipriani

On 20-08-11 15:04:56, Daniel Zahn wrote:

> re: > "aphlict" service had been disabled on Phabricator because it
> caused stability issues.
>
> I am happy to announce that aphlict, the notification service for
> Phabricator using websockets, is now finally back again.


\o/ This is great news!

Thank you for working to get this restored.

Realtime notifications are super useful for us; particularly for train
blocker tasks: very happy this service is back!

-- Tyler




[Wikitech-l] No train next week; no deployments next Tuesday

2020-08-24 Thread Tyler Cipriani
Hi all

There is a planned switchover to our secondary datacenter scheduled for
Tuesday, September 1st 2020[0].

To avoid creating problems for our SREs we'll be skipping the train for
next week -- the week of 2020-08-31 -- and not doing any deployments the
day of the switchover -- 2020-09-01.

The deployment calendar is up-to-date[1].

Thanks!
-- Tyler


[0]: 
[1]: 


[Wikitech-l] Wikimedia's GitHub org help

2020-08-24 Thread Tyler Cipriani
Hi all!

If you've never created a repo or fork on the Wikimedia GitHub
organization you can skip this email.

I know that some repos are developed on our GitHub org for reasons.
What is developed on our GitHub org? How many things are actively
being developed on our GitHub org? I have no idea :)

I recently realized that there's not a great way to figure this
out[0], but I've been able to narrow the scope a bit. Now I have a
list of repos that are (a) in our GitHub org and (b) not in our Gerrit
that I could use some help sorting through[1].

== Help, please ==

* Look through repos on The List™[1]

If your repos are on the list, for each of your repos either:

* Archive or Delete it if it's no longer maintained or empty/useless,
respectively (and remove them from the list on mw.org)[2]

Or:

* put a "{{tick}}" in the "Active" column on the list on mw.org

== Why ==

In a more perfect future we could add the "mirror"[3] tag to repos on
GitHub that are mirrored from Gerrit (with a link to their canonical
repo locations; for example, gnome-desktop has this[4] and I'm very
jealous).

Hopefully, this will help folks wanting to contribute -- either a
Wikimedia GitHub repo is a mirror (in which case there's a link to
Gerrit in the description) or it's actively being developed on GitHub.

<3
-- Tyler

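[0]: <https://phabricator.wikimedia.org/T237470#6407509>
[1]: <https://www.mediawiki.org/wiki/Gerrit/GitHub#Projects_on_GitHub>
[2]: <https://docs.github.com/en/github/creating-cloning-and-archiving-repositories/archiving-a-github-repository>
[3]: <https://docs.github.com/en/github/getting-started-with-github/finding-ways-to-contribute-to-open-source-on-github#open-source-projects-with-mirrors-on-github>
[4]: <https://github.com/GNOME/gnome-desktop>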



Re: [Wikitech-l] Wikimedia's GitHub org help

2020-08-25 Thread Tyler Cipriani
On Tue, Aug 25, 2020 at 1:17 AM Addshore  wrote:
> Is the main source of mirrors gerrit?

Yes, as of yesterday there are 2,493 Gerrit repos, 2,175 GitHub repos,
and 296 of those GitHub repos have no corresponding Gerrit repo from
which they are mirrored. The remaining 1,879 repos in github are
mirrored from gerrit.

> If so could we not write a script looking for .gitreview files and looking
> at the URL in there?

You mean to find the repo from which it originated? That's possible;
it's got some caveats. For example, there are ".gitreview" files on
github pointing to non-existent gerrit repos[0]. These just have to be
cleaned up manually, I think.
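
For what it's worth, a minimal sketch of the ".gitreview" idea might look
like the following. It assumes the file uses the usual INI layout with a
"[gerrit]" section containing "host" and "project" keys; the function name
and the sample project are made up for illustration. It only extracts what
the file claims, so it would not catch the caveat above: checking that the
Gerrit repo actually exists needs a separate lookup that is not shown.

```python
import configparser


def gerrit_project_from_gitreview(text):
    """Return (host, project) from the contents of a .gitreview file,
    or None if the file has no usable [gerrit] section."""
    parser = configparser.ConfigParser()
    try:
        parser.read_string(text)
    except configparser.Error:
        # Not parseable as INI at all.
        return None
    if not parser.has_section("gerrit"):
        return None
    host = parser.get("gerrit", "host", fallback=None)
    project = parser.get("gerrit", "project", fallback=None)
    if not host or not project:
        return None
    # Project names are often written with a trailing ".git"; strip it.
    if project.endswith(".git"):
        project = project[: -len(".git")]
    return host, project


# Hypothetical file contents, for illustration only.
SAMPLE = """\
[gerrit]
host=gerrit.wikimedia.org
port=29418
project=mediawiki/extensions/Example.git
"""

if __name__ == "__main__":
    print(gerrit_project_from_gitreview(SAMPLE))
```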

If it's useful, I wrote a handful of messy shell scripts (as is my
wont) that invoke the github api to come up with the list of 296 repos
that are on github but have no corresponding gerrit repo[1].
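
The comparison itself boils down to a set difference. The sketch below is
not those shell scripts; it is a small Python illustration that assumes a
mirror's GitHub name is the Gerrit project name with "/" replaced by "-"
(as in mediawiki-extensions-AddMetaAndTitle). That naming rule and the
sample names are assumptions, and fetching the two lists from the GitHub
and Gerrit APIs is left out.

```python
def github_only_repos(github_names, gerrit_projects):
    """Return GitHub repo names that have no matching Gerrit project.

    Assumes a mirror is named after its Gerrit project with "/"
    replaced by "-", compared case-insensitively.
    """
    expected_mirrors = {p.replace("/", "-").lower() for p in gerrit_projects}
    return sorted(n for n in github_names if n.lower() not in expected_mirrors)


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    gerrit = ["mediawiki/core", "mediawiki/extensions/Example"]
    github = ["mediawiki-core", "mediawiki-extensions-Example", "some-github-only-tool"]
    print(github_only_repos(github, gerrit))
```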

> I imagine there is also some API for marking things as mirrored? (or is it
> more manual than that?)

I talked to GitHub support about getting the "mirrored" tag for our
repos; I was told it's totally manual and has to go through folks at
GitHub support[2] :(

> Another thought would be adding some .wmgithub file with structured info
> about repos that are on github.
> Then rather than maintaining a manual list that is likely to get out of
> date we could write a thin UI infront of the data in these files and the
> GitHub API?

Making a UI/tool that monitors github repo creation seems like a good
idea rather than this list. My hope is that after some overdue manual
cleanup our github org will be clean enough to be able to make
inferences based on heuristics without having to add exogenous
metadata.

-- Tyler

[0]: <https://github.com/wikimedia/mediawiki-extensions-AddMetaAndTitle/blob/master/.gitreview>
[1]: <https://github.com/thcipriani/wikimedia-github-projects>
[2]: <https://phabricator.wikimedia.org/T237470#6406876>



[Wikitech-l] Wikimedia GitHub OAuth restrictions enabled

2020-08-25 Thread Tyler Cipriani
Hi!

Today we have enabled OAuth app access restrictions on the Wikimedia
GitHub organization[0]. As a result, any attempt to add an OAuth app
requires the approval of the Wikimedia organization's owners[1]. This
restriction was enabled to prevent accidentally granting OAuth
permissions to the organization's resources when that was not the
intention.

As a side effect, if your SSH key was uploaded before February
2014, your next attempt to push over SSH to a Wikimedia project (for
example: g...@github.com/wikimedia/*) will result in a prompt asking
you to manually validate the key. Just click the link, verify the
fingerprint, and approve it if it is the correct one.

Thanks!
-- Tyler

[0]: 
[1]: 



Re: [Wikitech-l] Wikimedia's GitHub org help

2020-09-01 Thread Tyler Cipriani
Hi all!

Thank you for reviewing my list of the Wikimedia org's GitHub projects!

A *lot* of repos were archived or deleted. Additionally, I archived
all projects that contained nothing but a ".gitreview" without a
corresponding Gerrit repository.

As a result, we've gone from 296 repositories exclusively on GitHub to
154 repositories exclusively on GitHub.

I believe this removed a lot of cruft and now all 2,038 repos on
Wikimedia's GitHub org are either:
* A mirror of a Gerrit repo (1,881 repos)
* A mirror of a Differential repo (3 repos)
* A project developed on GitHub or a fork of another GitHub project (154 repos)

I've updated the list of projects[0] that are exclusive to GitHub
(GitHub fork or a project developed on GitHub) if you'd like to take a
look.

Thanks again for all your help!

<3
-- Tyler

[0]. <https://www.mediawiki.org/wiki/Gerrit/GitHub#Projects_on_GitHub>



Re: [Wikitech-l] Wikimedia's GitHub org help

2020-09-01 Thread Tyler Cipriani
On Tue, Sep 1, 2020 at 12:11 PM Isaac Johnson  wrote:
> Thanks Tyler for doing this work! Is there an easy way (if not, no big
> deal) to also see the list of repos that were archived/deleted just to make
> sure lack of response didn't mean something disappeared that would have
> been useful to keep active?

Sure, the list of repos removed or archived (plus a repo that was
added and a repo that was renamed during this process :)) can be found
in this diff:
https://github.com/thcipriani/wikimedia-github-projects/compare/13aa797..7cdb83b

I don't have a way to differentiate between repos that were archived
vs deleted in that list.

Thanks!
-- Tyler



[Wikitech-l] 📈 Wikimedia production errors help

2020-09-14 Thread Tyler Cipriani
Hello all!

Over the past few months we've reached the ignominious milestone of
the most open tasks of all time on the wikimedia-production-error
dashboard[0].

Background: The wikimedia-production-error dashboard is a workboard of
tasks created while digging through the Wikimedia production error
logs. All tasks there are log messages that have originated on
production servers.

The number of new tasks being created with this tag in a given week is
outpacing the number of tasks being closed in a given week: this past
week we added 41 tasks and only closed 22.

This is beginning to be unsustainable :(

There are currently 281 open tasks filed for errors in production.
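
To put numbers on the trend, here is a small worked sketch. The weekly
figures and the open-task count come from this message; the assumption
that the same rate holds in later weeks is purely illustrative and is not
a forecast.

```python
def net_weekly_growth(added, closed):
    """Net change in open tasks for one week."""
    return added - closed


def projected_open_tasks(open_now, added, closed, weeks):
    """Open tasks after `weeks` weeks, assuming (hypothetically) that
    the weekly added/closed counts stay constant."""
    return open_now + weeks * net_weekly_growth(added, closed)


if __name__ == "__main__":
    print(net_weekly_growth(41, 22))             # 19 more open tasks this week
    print(projected_open_tasks(281, 41, 22, 4))  # 357 if the rate held for 4 weeks
```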

Although we're triaging this workboard weekly, we rely on the
expertise of developers most familiar with the error messages to
triage them, prioritize them, and "fix" them (for whatever value of
"fix" is appropriate).

Below is a smattering of selected issues that could use some attention:

  1. PHP Fatal error: Out of memory in cdb/src/Reader/DBA.php[1]
  2. Uncaught ReferenceError: collectionCall is not defined[2]
  3. Flow: PHP Notice: Undefined index: flow-workflow-change[3]
  4. PHP Warning: unpack(): Type H: not enough input, need 4, have 0[4]
  5. TypeError: undefined is not an object (evaluating 'this.getMIMEType')[5]
  6. Elastica\Exception\ResponseException from line 56 of
     GeoData/includes/Searcher.php[6]
  7. Wikimedia\CSS\Objects\ComponentValueList may not contain tokens
     of type "[".[7]

Please help to triage or resolve these problems or any of the other
166 tasks needing triage[8] if you are able.

<3
-- Tyler

[0]: 
[1]: 
[2]: 
[3]: 
[4]: 
[5]: 
[6]: 
[7]: 
[8]: 



Re: [Wikitech-l] 📈 Wikimedia production errors help

2020-09-15 Thread Tyler Cipriani
Hi!

Thanks for the feedback, this is useful information.

On Tue, Sep 15, 2020 at 3:00 AM Niklas Laxström wrote:
> On Mon, 14 Sep 2020 at 23:49, Tyler Cipriani (tcipri...@wikimedia.org)
> wrote:
> If there is an increase in the amount of real new issues and/or
> decrease in the amount of issues fixed, then I would be worried. Given
> what I said above, it's difficult to see if this is the case.

Indeed, a trendline for production quality is difficult to compare if
a large backlog is being added.

> Regardless, I do agree that we should aim to minimize production
> errors to make it easier to spot any new issues. I would encourage all
> maintainers and development teams to ensure that they have a regular
> process to check if they have and triage any production issues in code
> they maintain.

+100 to checking for production errors. It's my hope that folks who
have code that is going out on a train are:

1. Aware their code is going to production that week
2. Watching for related logs and alerts (where possible)
3. Performing other software quality assurance activities on their
code as it rolls out (manual testing, for example)

My assessment of risk as a person deploying software to production is
necessarily linked to my view into quality assurance activities. If
production errors are growing, I worry about sustainability. The
production error dashboard's past stability has provided assurances
about shared awareness and priority of a given week's deployment.

That is, I know there are software quality activities that take place
sometime after code hits group0 or group1 or group2; however, much of
that activity remains opaque. This is why this dashboard is crucial
for deployment.

Having the explicit assurances of folks whose code is going to
production that week would be preferable to any inference I can make
from this dashboard. It's my hope that maintainers and teams triaging
and grooming this dashboard will create an emergent process that can
be used to provide real insight. That is, if we all are keeping this
dashboard up-to-date collectively, it will be easier to see when
quality assurance activities have taken place. Further, if we
collectively fret over this dashboard then we'll share a collective
awareness of anomalies.

> Ending with a question: do we want to have both frontend and backend
> errors on the same tag/board, or should they be on separate ones?

That's a good question. I think that having a single workboard is nice
as there are reporting features[0] that provide some insights about
the overall health of production. Those insights are, as evidenced,
only as good as their inputs, but they remain valuable to me.
Additionally, a single tag may be used in saved searches and custom
dashboards to make it easy to stay on top of issues seen in production
(that is my hope, though it may not align with how folks triage in practice).

Thanks for the feedback. This anomaly makes more sense to me than it did :)

-- Tyler

[0]: <https://phabricator.wikimedia.org/project/reports/1055/>



Re: [Wikitech-l] 📈 Wikimedia production errors help

2020-09-15 Thread Tyler Cipriani
On Tue, Sep 15, 2020 at 5:24 AM Derk-Jan Hartman
 wrote:
>
> In particular I count 13 frontend problems with the old TMH kaltura player.
> There is clearly no intent to fix those (volunteer or employee), as the
> Kaltura player has been unmaintained for 8 years.
> The choices as far as I can tell are to ignore them, undeploy a/v playback
> or to direct C-level management to get the audio and video stuff together.

The tasks that I mentioned in my original message are, likewise, tasks
that I'm not sure belong to any team or any particular person.

I have been using the phab tag/milestone "Release Engineering
(Logspam)" to ensure that we don't lose track of tasks that are:

1. problems in production
2. tagged in phabricator with a team or component (in contrast to
problems with unknown components/team tags)
3. no longer resourced or maintained in a discernible way

Feel free to apply that tag if those 3 conditions apply to these
tasks. Tracking these will make it easier to raise awareness later.

Thanks!
-- Tyler



Re: [Wikitech-l] 📈 Wikimedia production errors help

2020-09-16 Thread Tyler Cipriani
On Tue, Sep 15, 2020 at 11:06 AM Brennen Bearnes  wrote:
> On 9/15/20 9:43 AM, Alex Ezell wrote:
> > Do we use levels for any of these error log outputs? That is, are they
> > classified on output as High, Medium, Low, Info, or something like that?
>
> Teasing out more detail about reported error severity could be a useful
> exercise, but I'm not sure it would result in much more meaningful
> signals than we currently have about production health.  Serious
> problems can manifest as trivial-seeming notices, some issues start out
> that way and cascade over time, and generally any form of recurring
> logspam needs human evaluation before we can easily say much more than
> "this is a problem".

This aligns with my view of our team's ability to assign meaningful
priorities. High-level general knowledge about our deployment, errors,
and error logging can't substitute for domain expertise. Teams with
expertise in a particular codebase are best positioned to understand the
impact of a particular message and derive a useful priority.

> it would be most helpful if we
> just had more eyes _routinely_ on the logs and the workboard.  (See
> Tyler's earlier and much more detailed/thoughtful response to this thread.)

+1 A missing component of assigning priorities is an interface that
connects the log triage workboard and process with team/maintainer
workflows.

There is a long developer feedback loop past integration. Hopefully,
this process helps to shorten the feedback loop to developers and
reduce the opacity of the process beyond integration through release
and monitoring. Having the expertise of developers writing the code be
a part of the deployment and monitoring of that code in production is
the goal of this process and the key to its utility.

-- Tyler



[Wikitech-l] No train next week; no deployments next Tuesday

2020-10-21 Thread Tyler Cipriani
Hi all

There is a planned datacenter switchback next week scheduled for Tuesday,
October 27th 2020[0].

To avoid creating problems for our SREs we'll be skipping the train for
next week -- the week of 2020-10-26. Additionally, we want to avoid deploys
for a full 24 hours following the switchback. This means no deploys next
Tuesday (2020-10-27) and no deploys Wednesday (2020-10-28) until after
15:00 UTC.

The deployment calendar is up-to-date[1].

As we head into the holiday season, I've added a list of foreseeable train
disruptions to the bottom of the Deployments Calendar[2].

Thanks!
-- Tyler

[0]: 
[1]: 
[2]: <https://wikitech.wikimedia.org/wiki/Deployments#Upcoming_Release_Train_disruptions>


[Wikitech-l] Deployment train next two weeks (2020-11-02, 2020-11-09)

2020-10-29 Thread Tyler Cipriani
Hi all,

We've entered that special time of year where the weekly deployment train
has a few disruptions coming up.

I've added a list of train disruptions through the end of the year to the
Deployments page on Wikitech[0].

Next week's train will happen, but the schedule will be different to allow
for a Tuesday holiday:

* Wed, 04 Nov 2020 noon PST: 1.36.0-wmf.16 Group0
* Thu, 05 Nov 2020 noon PST: 1.36.0-wmf.16 Group1
* Mon, 09 Nov 2020 noon PST: 1.36.0-wmf.16 Group2

The train will resume the week of 2020-11-17 with 1.36.0-wmf.18.

As always, the Deployment Calendar on Wikitech[0] is the best source for
this information.

Thanks all

-- Tyler

[0]: <https://wikitech.wikimedia.org/wiki/Deployments#Upcoming_Release_Train_disruptions>


[Wikitech-l] [Train] 1.36.0-wmf.20 status update

2020-12-03 Thread Tyler Cipriani
MediaWiki and extensions 1.36.0-wmf.20[0] is only deployed to testwikis[1].

We rolled back[2] today due to:

* T269396 Parser cache serving old results[3]

If these issues are resolved we can roll the train forward Monday.[4]

Thanks all!
-- Tyler

[0]: 
[1]: 
[2]: 
[3]: 
[4]: <https://wikitech.wikimedia.org/wiki/Deployments/Holding_the_train#What_happens_next?>


[Wikitech-l] Last deployments of 2020 next week

2020-12-09 Thread Tyler Cipriani
Hi All,

Every year we stop deployments for the last full week of the year.

As we enter the last couple weeks of the year, I wanted to send out a
reminder that next week is the final deployment week of the year and that
wmf/1.36.0-wmf.22 will be the last train release of the year.

The deployment calendar on Wikitech is up-to-date and is the
canonical source for the deployment schedule.

Thank you!
-- Tyler


[Wikitech-l] [Train] 1.36.0-wmf.30 status -- wmf.28, wmf.29: abandoned

2021-02-09 Thread Tyler Cipriani
Hi all

All wikis except testwikis are on 1.36.0-wmf.27; testwikis are running
1.36.0-wmf.30.

We've rolled back to wmf.27 so that we have a stable base version from
which to roll out wmf.30.

We will proceed with rollout of wmf.30 once this cherry-pick for wmf.30 is
code-reviewed: https://gerrit.wikimedia.org/r/662965

Assistance appreciated!

Thank you!
-- Tyler


[Wikitech-l] [Train] 1.36.0-wmf.30 is in danger

2021-02-11 Thread Tyler Cipriani
Hiya all

tl;dr:
help us fix (or convince us not to care about):

* No atomic section is open (got LocalFile::lockingTransaction)[0]

Longer:
We've been on 1.36.0-wmf.27 for two weeks, which is 1,038 changes behind
1.36.0-wmf.30 (the latest branch).

The remaining issue is: "No atomic section is open (got
LocalFile::lockingTransaction)"[0]

We're not sure about the impact of this log message. Holding the train is
our most effective tool to ensure that the log messages that we see don't
hurt users.

We need to solve this by Monday or we will remain on wmf.27 for another
week.

Any help you can provide is sincerely appreciated. Any feedback on how to
communicate about log messages more clearly so they get the attention they
need in a timely manner is also appreciated!

<3
-- Tyler

[0]:


[Wikitech-l] Train vs Backport for Wikimedia Deployment

2021-03-18 Thread Tyler Cipriani
Hi All!

tl;dr: if you're a developer looking for guidance on how to deploy changes
to Wikimedia's MediaWiki cluster read https://w.wiki/36nY ; if you have
thoughts on our existing deployment documentation comment on
https://w.wiki/36nZ

---

Over time The Train™ has become the default way to deploy changes to
Wikimedia's MediaWiki cluster -- for some patches that may not always be
the right path. If a developer needs a change deployed *now*, or if there
is a desire to deploy a change in isolation then backports might be a
better path.

As with all things, some exceptions may apply. The Release Engineering team
has created some guidelines[0] that will hopefully help explain when
something MUST, SHOULD, or MAY[1] be deployed via the train or via backport.

This documentation is a bookmarkable quick reference for developers. It
does not change our backport window guidelines[2] or special deployment
window guidelines[3]; those documents should not be in conflict with the
advice in the new guidelines. The new guidelines target a different
use-case.

Our deployment documentation is up-to-date but sprawling. The same
information is in multiple places and multiple audiences and use-cases are
often mixed into the same documents. Work to improve deployment
documentation is tracked on Phabricator[4].

Please reach out in #wikimedia-releng on freenode in IRC or attend the
Deployment Office Hours meeting (weekly on Mondays at 17:00 UTC in
#wikimedia-office on freenode in IRC) if you have questions.

Thank you!
-- Tyler

[0]: 
[1]: 
[2]: 
[3]: 
[4]: 


Re: [Wikitech-l] Gerrit authentication fails

2021-03-22 Thread Tyler Cipriani
Hi Swathi!

Looking at the Gerrit logs, it appears you have two Wikitech developer
accounts and sign-in is failing with the most recent one. This is because
Gerrit uses a single primary email per account, and since you've already
signed in with your previous username, that email is "claimed". I'll
follow up with you off-list to determine which account you're trying to use.

Thanks!
-- Tyler

On Sat, Mar 20, 2021 at 9:39 AM Swathi Kasikala wrote:

> Hello,
> This is Swathi new to this MediaWiki environment and really excited to
> contribute to it for which I already created a Wikimedia developer account
> using this https://wikitech.wikimedia.org/wiki/Special:CreateAccount and
> now when I am trying to login to Gerrit (https://gerrit.wikimedia.org/)
> using the same credentials the message is been displayed that the
> authentication failed. Can someone help me out to solve this issue, please--
>
> Regards,
>
> Swathi Kasikala.
> amFOSS 
>


Re: [Wikitech-l] Train vs Backport for Wikimedia Deployment

2021-03-23 Thread Tyler Cipriani
Hi!

On Tue, Mar 23, 2021 at 2:59 PM Kunal Mehta wrote:

> > As with all things, some exceptions may apply. The Release Engineering
> > team has created some guidelines[0] that will hopefully help explain
> > when something MUST, SHOULD, or MAY[1] be deployed via the train or via
> > backport.
>
> Is this supposed to codify existing practice, or suggest/recommend people
> should be deploying things more frequently outside of the normal train?
> Tgr has raised roughly the same question on the talk page:
> .
>

Arg! Missed that there was discussion there :(

I'll follow up here and try to answer the additional questions there as
well.

The intent of this page is to encourage more deployments outside of the
normal train by developers and code authors.

This page was meant to make clear the types of changes that would be
difficult to deploy outside of the train for historical reasons, and to
encourage the authors of other types of patches to consider deploying the
change themselves or with a deployer in a backport window.

There are a few reasons that releng wants to encourage backports:
1. Smaller deployments are easier to reason about
2. The code author is often in the best position to reason about the effect
of their changes
3. The use of mwdebug as a manual testing platform, while similar to the
group-by-group rollout of the train, can ensure that a change is working on
several wikis in multiple groups with no impact to users.

Not every patch can be backported, simply because of the hours in a day:
this train has 420 changes[0], and at 5 minutes per deploy that is 35 hours
of deploying. Backport windows are also finite, and deployers' time is
limited. With those
constraints in mind, we'd still like to move to a more continuous model of
delivery and the hope is that these guidelines are a step in that
direction. I'm also open to other ideas about ways to make the train a
lighter and more consistent process.
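
The arithmetic behind the 35-hour figure can be checked with a minimal
sketch. The 420 changes and the 5 minutes per deploy come from the
paragraph above; the function name is made up for illustration, and the
calculation assumes every change is deployed on its own, one after another,
with no batching.

```python
def hours_to_backport(changes: int, minutes_per_deploy: int) -> float:
    """Total deploy time in hours if every change is deployed separately."""
    return changes * minutes_per_deploy / 60


# Figures quoted above: 420 changes this train, about 5 minutes per deploy.
print(hours_to_backport(420, 5))  # 35.0
```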

Thanks!
-- Tyler

[0]: 


[Wikitech-l] Deployment calendar format change

2021-03-26 Thread Tyler Cipriani
tl;dr: The deployment calendar format will change in 2 weeks (2021-04-05)
to make it easier to edit with visual editor https://w.wiki/a3b

I updated the deployment calendar for the week of 2021-04-05[0] to use a
different format than in the past (compare to next week[1]). My hope is
that this new format will make it much easier to schedule deployment
windows and to schedule patches for backports using Visual Editor.

Also, selfishly, less squinting at Wikitext for me :)

All credit for the new format goes to Timo Tijhof. Thank you Timo!

Thanks all!
-- Tyler

[0]: 
[1]: 


[Wikitech-l] MediaWiki 1.36-beta has been branched

2021-04-13 Thread Tyler Cipriani
Hey all,

This is a quick note to highlight that the REL1_36 branch has now been
created for MediaWiki core and each of the extensions and skins in
Wikimedia git[0]. This is the first step in the release process for
MediaWiki 1.36, which should be out in late May 2021, approximately nine
months after MediaWiki 1.35.

The branches reflect (or are at least very close to) the code as of the
last 'alpha' branch for the release, 1.36.0-wmf.38, which was deployed to
Wikimedia wikis last week for MediaWiki itself and those extensions and
skins available there.

From now on, patches that land in the main development branch of MediaWiki
and its bundled extensions and skins will be slated for the MediaWiki 1.37
release unless specifically backported[1].

If you are working on a critical bug fix that will affect the code in the
release, once the patch has been merged into the development branch, you
should propose it for backporting by cherry-picking to the REL1_36 branch.

If you are working on a new feature, that should now not be backported. If
you have an urgent case where the work should block release for everyone
else, please file a task against the `mw-1.36-release` project on
Phabricator.[2]

If you have tickets that are tagged for `mw-1.36-release`, please finish
them, untag them, or reach out to get them resolved in the next few days.

We hope to issue the first release candidate, 1.36.0-rc.0, in two weeks'
time, and if all goes well, to release MediaWiki 1.36.0 a few weeks after
that.

Thanks!
– Tyler

[0]: 
[1]: 
[2]: 


[Wikitech-l] No train the week of 2021-04-19

2021-04-14 Thread Tyler Cipriani
Hi All

There will be no train next week (the week of 2021-04-19) due to a WMF
holiday on 2021-04-22.

There is a long-term calendar of upcoming deployment disruptions available
on Wikitech: https://wikitech.wikimedia.org/wiki/Deployments/Yearly_calendar

Thanks!
– Tyler


Re: [Wikitech-l] MediaWiki 1.36-beta has been branched

2021-04-23 Thread Tyler Cipriani
Hi All

On Tue, Apr 13, 2021 at 5:36 PM Tyler Cipriani wrote:

> We hope to issue the first release candidate, 1.36.0-rc.0, in two weeks'
> time, and if all goes well, to release MediaWiki 1.36.0 a few weeks after
> that.
>

With the "two weeks" deadline of April 27th approaching, the remaining
blockers to creating a release candidate are:

* Remove non-critical path interface styles out of legacy feature into more
appropriate homes: https://phabricator.wikimedia.org/T278576
* Zero Config Install of VE + Parsoid for MW 1.36:
https://phabricator.wikimedia.org/T261220

The above blockers will delay the release of 1.36.0-rc.0 and are the only
blockers we are aware of at this time. No additional new features,
deprecations, or non-bug-fixes will be included in the 1.36 release.

Once these issues are resolved (both issues are in-progress as far as I'm
aware), rc.0 can be released, and the 1.36 release can resume.

Thanks!
– Tyler


[Wikitech-l] Backport deployment training!

2021-04-27 Thread Tyler Cipriani
Hello potential deployers!

We're starting backport deployment training for people interested in
learning how to deploy safely.

Training happens in the #wikimedia-operations IRC channel as well as in a
Google Meet hangout every Thursday at both 11:00 and 23:00 UTC.

If you're interested in becoming a deployer and joining the hangout please
fill out our deployment training request form on Phabricator to sign up.


Attendance at multiple training sessions is welcome and encouraged—the goal
is for you to be comfortable doing deployments yourself.

Everyone interested in having their code run in Wikimedia's production
should learn how to deploy! Knowing how to deploy is important to unblock
yourself and to help others in the technical community. The training will
guide you step-by-step through queuing up patches, (in)validating on
MWDebug, rolling back, and pushing live.

For more information see the deployment training guide on Wikitech.

💖

– Tyler


[Wikitech-l] MediaWiki 1.36.0-rc.0 is ready for testing

2021-04-29 Thread Tyler Cipriani
I'm pleased to announce the immediate availability of MediaWiki 1.36.0-rc.0,
the first release candidate for 1.36.x. Download links are at the end of the
e-mail. The tag has been signed and pushed to Git.

Please note that MediaWiki 1.36 now requires the PHP internationalization
extension, commonly referred to as Intl, ext-intl, or php-intl.

This is not a final release, and should not be used for production websites.
Known issues are tracked in Phabricator on the release workboard [1].

As always, please try out the release candidate in a test environment and do
report any issues that you discover. Please use the #MW-1.36-Release [2] tag
in Phabricator when reporting issues specific to this release, to make sure
that we find them as quickly as possible.

It is expected that MediaWiki 1.36 will become final in May 2021, though the
date may slip if blockers are identified.

Preliminary release notes:
https://gerrit.wikimedia.org/g/mediawiki/core/+/REL1_36/RELEASE-NOTES-1.36
https://www.mediawiki.org/wiki/Release_notes/1.36

Public keys:
https://www.mediawiki.org/keys/keys.html

Open Bugs:
[1] https://phabricator.wikimedia.org/project/board/3386/

Bug report form:
[2] https://phabricator.wikimedia.org/maniphest/task/edit/form/1/?tags=MW-1.36-Release

**
Download:
https://releases.wikimedia.org/mediawiki/1.36/mediawiki-1.36.0-rc.0.tar.gz
https://releases.wikimedia.org/mediawiki/1.36/mediawiki-1.36.0-rc.0.zip

Download without bundled extensions:
https://releases.wikimedia.org/mediawiki/1.36/mediawiki-core-1.36.0-rc.0.tar.gz
https://releases.wikimedia.org/mediawiki/1.36/mediawiki-core-1.36.0-rc.0.zip

GPG signatures:
https://releases.wikimedia.org/mediawiki/1.36/mediawiki-core-1.36.0-rc.0.tar.gz.sig
https://releases.wikimedia.org/mediawiki/1.36/mediawiki-core-1.36.0-rc.0.zip.sig
https://releases.wikimedia.org/mediawiki/1.36/mediawiki-1.36.0-rc.0.tar.gz.sig
https://releases.wikimedia.org/mediawiki/1.36/mediawiki-1.36.0-rc.0.zip.sig
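
The download and signature links above follow a regular pattern: each
signature URL is the artifact URL with `.sig` appended. A minimal sketch of
that pairing, for anyone scripting a download-and-verify step; the
`release_urls` helper is hypothetical, and the URL pattern is inferred only
from the links listed in this email, so it may not hold for other releases.

```python
BASE = "https://releases.wikimedia.org/mediawiki"


def release_urls(version: str) -> dict:
    """Map each artifact URL to its signature URL, following the links above."""
    # "1.36.0-rc.0" -> "1.36": the release series is used as the directory name.
    series = ".".join(version.split(".")[:2])
    urls = {}
    for name in ("mediawiki", "mediawiki-core"):
        for ext in ("tar.gz", "zip"):
            artifact = f"{BASE}/{series}/{name}-{version}.{ext}"
            urls[artifact] = artifact + ".sig"
    return urls


for artifact, signature in release_urls("1.36.0-rc.0").items():
    print(artifact, "->", signature)
```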



[Wikitech-l] MediaWiki 1.36 Release Bugfix Bugging

2021-05-10 Thread Tyler Cipriani
👋

This is a quick plea to add bugfixes that should go out with the MediaWiki
1.36 release to the Phabricator board
<https://phabricator.wikimedia.org/project/view/4555/> so that we can get
them backported and tested.

Important links:

   - Currently reported bugs
   <https://phabricator.wikimedia.org/project/view/4555/>
   - Report a new bug form
   <https://phabricator.wikimedia.org/maniphest/task/edit/form/1/?tags=MW-1.36-Release&title=Backport%20Bugfix%20for%20%3CBUG%3E&description=%23%20%7Bicon%20bug%7D%20MediaWiki%201.36%20Bugfix%0aGerrit%20patch:%20%3CLINK%3E>
   - Preliminary Release notes
   <https://gerrit.wikimedia.org/g/mediawiki/core/+/REL1_36/RELEASE-NOTES-1.36>

🙏

Tyler Cipriani (he/him)
Engineering Manager, Release Engineering
Wikimedia Foundation


[Wikitech-l] Summary of *last week*'s deployment of 1.37.0-wmf.4

2021-05-12 Thread Tyler Cipriani
This is a belated summary of *last week*'s deployment of the 1.37.0-wmf.4
train. The primary train conductor for the week was Brennen Bearnes, with
Lars Wirzenius as backup in European timezones and considerable assistance
from Ahmon Dancy.

The blocker task for the week was: https://phabricator.wikimedia.org/T281145

== Stats ==
* 422 patches
* 2 risky patches identified
* 1 rollback
* 19 hours spent rolled back
* group2 was delayed by 1 day
* 10 train blockers were added, 6 were resolved, 4 were determined to not
be blockers

== 🚂😻 ==
It takes a village[0] to release a train. Thank you to the folks that
helped us this past week:
* Marius Hoch
* DannyS712
* Petr Pchelko
* Tim Starling
* Eric Gardner
* The inimitable Timo Tijhof
* The one and only James Forrester
* Majavah
* RhinosF1
* Urbanecm
* AntiCompositeNumber

And anyone else whose brain was occupied by thoughts of the train this past
week: thanks and sorry.

XOXO
– Train Gang

[0]: 
___
Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org
To unsubscribe send an email to wikitech-l-le...@lists.wikimedia.org
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/

[Wikitech-l] No train the week of 2021-05-31

2021-05-18 Thread Tyler Cipriani
Hi All

There will be no train the week of 2021-05-31 (2021-05-31–2021-06-04)
because of our Engineering Productivity offsite.

There is a long-term calendar of upcoming known deployment disruptions
available on Wikitech:
https://wikitech.wikimedia.org/wiki/Deployments/Yearly_calendar

Thanks!

Tyler Cipriani (he/him)
Engineering Manager, Release Engineering
Wikimedia Foundation

[Wikitech-l] Re: No train the week of 2021-05-31

2021-05-26 Thread Tyler Cipriani
A quick reminder ICYMI that there is no train next week.

<3
– Tyler

On Tue, May 18, 2021 at 5:44 PM Tyler Cipriani wrote:

> Hi All
>
> There will be no train 2021-05-31 (2021-05-31–2021-06-04) — having our
> Engineering Productivity offsite.
>
> There is a long-term calendar of upcoming known deployment disruptions
> available on Wikitech:
> https://wikitech.wikimedia.org/wiki/Deployments/Yearly_calendar
>
> Thanks!
>
> Tyler Cipriani (he/him)
> Engineering Manager, Release Engineering
> Wikimedia Foundation
>

[Wikitech-l] 🌠The More You Know*: Wikimedia’s Kubernetes Pipeline

2021-05-27 Thread Tyler Cipriani
Oh, hi there

tl;dr: we’re still moving MediaWiki into Kubernetes in Wikimedia
Production; there are resources on Wikitech[0] for folks interested in
learning more about the specifics of our Kubernetes pipeline.

⏳ Progress

Kubernetes is the future deployment platform for most Wikimedia code.
The developer and deployment experience under Kubernetes will
(hopefully) be easier and safer and developers will have more
autonomy.

The first Wikimedia production service (Mathoid) moved to Kubernetes
in July, 2018. Today there are 25 services using our production
Kubernetes infrastructure serving 30,000 requests per second, about
67-ish% of our overall traffic. All services are deployed by their
service owners without much oversight.

Our Docker registry[1] contains 311 Docker images with 7,690 tags. Our
production images are built via Continuous Integration (CI), using our
deployment pipeline. Blubber[2] and PipelineLib[3] were introduced in
2018 to allow developers to specify how they wanted their CI to run
and when to publish images to our shared registry.

Now we’re in the process of applying all we’ve learned from our
service migration to the MediaWiki migration to Kubernetes.

🏆 Goal

* Open Container Initiative (OCI)-compatible MediaWiki images
* Built, tested, and promoted by the image pipeline
* Deployed to Kubernetes via Helm
* Serving production traffic

🧐 Why

* Development - Standard platform for development and production
* Deployment - Safer, simpler, industry standard
* Production - Increased capacity, redundancy, reliability

👏 Now

* We’re building Docker images and Helm charts
** Release Engineering crammed all 184 branched extensions, 4 skins,
vendor, and core MediaWiki code into a single image alongside our
massive localization of 435 languages
** ServiceOps is working on php-fpm, apache, nutcracker, mcrouter and
envoy images and their corresponding helm charts

* We’re making our Kubernetes better
** ServiceOps has upgraded our base images and Kubernetes cluster

🔜 Next

* Figuring out how deployments work
** Backports, security releases, and train
* Figuring out how image upgrades work
** Is a php-fpm upgrade a deployment now? What does that look like?
* Figuring out how our Wikimedia configuration changes
** Currently a change to configuration requires a code change; does
this code change kick off a whole image build?

🎓 Resources

Kubernetes will be very impactful in our production services and we
would like to encourage those interested in this change to become
familiar with its concepts.

Please have a look at our collected Wikitech Kubernetes education
resources, tutorials, and guides[0]. Add material if you have had a
good experience with a class or tutorial.

Otherwise stay tuned and watch that page for additional resources to
be added. You will hear from us again as we have additional things to
report.

– <3
Tyler Cipriani (he/him) (On behalf of all the fine folks working on
Kubernetes for MediaWiki)
Engineering Manager, Release Engineering
Wikimedia Foundation

[0]: <https://wikitech.wikimedia.org/wiki/Kubernetes/Kubernetes_Education>
[1]: <https://docker-registry.wikimedia.org>
[2]: <https://wikitech.wikimedia.org/wiki/Blubber>
[3]: <https://wikitech.wikimedia.org/wiki/PipelineLib>

*subject line reference: https://en.wikipedia.org/wiki/The_More_You_Know

[Wikitech-l] No deploys next week (week of 2021-06-14)

2021-06-07 Thread Tyler Cipriani
Hi All

There will be no deployments to Wikimedia production the week of 2021-06-14
(2021-06-14–2021-06-18) for Wikimedia Foundation's annual All Hands.

There is a long-term calendar of upcoming known deployment disruptions
available on Wikitech:
https://wikitech.wikimedia.org/wiki/Deployments/Yearly_calendar

Thanks!
Tyler Cipriani (he/him)
Engineering Manager, Release Engineering
Wikimedia Foundation

[Wikitech-l] 📊 2021 Developer satisfaction survey results

2021-06-14 Thread Tyler Cipriani
👋

Results: https://www.mediawiki.org/wiki/Developer_Satisfaction_Survey/2021

Thanks to everyone who took the time to respond to the survey! 🥰

The developer satisfaction survey results show what's important to our
developer community, and the parts of our developer experience that
need more resources to improve.

➡️ Folks should read it, talk about it, and ask questions (please ;))!

<3
-
Tyler Cipriani (he/him)
Engineering Manager, Release Engineering
Wikimedia Foundation


[Wikitech-l] Re: Why does the train start on Tuesday?

2021-07-12 Thread Tyler Cipriani
(Late reply: I was out the week this was sent, then another week of
vacation happened)

On Wed, Jun 23, 2021 at 2:59 AM Jaime Crespo wrote:

> * How often are issues surfaced in the group0 -> group1 vs group1 ->
> group2, are there any stats to back the need for a change there?
>

The closest number I have to issues/group is the count of new "blocker"
tasks filed in phabricator per group.

Progressive rollout to each group gives us more confidence in the code
being deployed, so for each group we should see progressively fewer
blockers.

🚂📈 *Group vs blocker discovery over the past 153 trains:*
[image: README_10_0.png]

   - *Before group0*: 233
   - *Group0*: 180
   - *Group1*: 230
   - *Group2*: 91

"Before group0" means that before we've rolled out the train to any wiki in
wikiprod, there's a blocker on the train task (just like today
1.37.0-wmf.14 is not deployed anywhere, but there's a blocker on the train
task: https://phabricator.wikimedia.org/T281155 ).

If we want each successive group to surface progressively fewer blockers,
then the data shows that group0 is too small and/or group1 is too big.
There are other considerations. Deployers have a lot of work to do on
group0 day vs group1 day: so making group0 bigger/more useful for
developers makes the lives of deployers harder.
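
To make the "progressively fewer blockers" check concrete, here is a
minimal sketch using only the four counts listed above. The dictionary
keys and the `increases` helper are names invented for this illustration;
nothing here comes from the actual train tooling.

```python
# Blocker counts per stage over the past 153 trains, from the list above.
blockers = {
    "before group0": 233,
    "group0": 180,
    "group1": 230,
    "group2": 91,
}

total = sum(blockers.values())


def increases(counts):
    """Return (earlier, later) stage pairs where the later stage found more blockers."""
    stages = list(counts)
    return [(a, b) for a, b in zip(stages, stages[1:]) if counts[b] > counts[a]]


for stage, count in blockers.items():
    print(f"{stage}: {count} blockers ({count / total:.0%} of {total})")
print("stages that break the downward trend:", increases(blockers))
```

With these counts the only pair flagged is group0 followed by group1, which
is the basis for saying group0 is too small and/or group1 is too big.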

> * Without changing the actual deploying days or the frequency, would there
> be any benefit of shifting the deploy over multiple weeks? (random example
> Tu: group1->group2, (new branch) We: group0, Th: group0-> group1) or would
> that make things worse?
>

I wonder what impact this change would have on blocker reports. For
instance, is it a function of the time left in the week that group2
surfaces relatively few blockers?


> * You mention commons. I am guessing Commons, and Wikidata, to some
> extent- are both large sites with a lot of visibility but also very
> different from the core features that are similar to most other wikis, but
> the test version of those on group0 may not be enough to catch all issues.
> Is there something that could be improved specifically for those sites?
>

This is a subset of a question I've been asking folks: why does the train
give us confidence? What does a train give us that a testing environment
like beta or a local environment can't give us? I think some of the magic
of train is the amount of traffic, but if that were the case then
artificial traffic should suffice. I think the other aspect of the train is
Hyrum's law[0]—all observable properties of a system are hammered with
traffic: even observable properties that were not built intentionally.


> * Can we do something to improve the speed from "a user notices an issue
> with the site" to "the right team/owner is aware of it and acts on it"?
>

Or can we do something to improve how many issues users notice? :)

Thanks for all of these great questions.
– Tyler

 [0]: 

[Wikitech-l] Re: Why does the train start on Tuesday?

2021-07-14 Thread Tyler Cipriani
On Mon, Jul 12, 2021 at 10:47 PM Risker wrote:

> On Mon, 12 Jul 2021 at 19:26, Tyler Cipriani 
> wrote:
>
>> * Can we do something to improve the speed from "a user notices an issue
>>> with the site" to "the right team/owner is aware of it and acts on it"?
>>>
>>
>> Or can we do something to improve how many issues users notice? :)
>>
>>
> As someone who's been around for a long time as an editor, I can say
> honestly that having most of the issues addressed before they hit the
> really big projects has resulted in a huge improvement.  The train really
> works, and the only challenge I really see is what Jon mentions in his
> original post.  Some of those issues aren't really that significant in the
> great scheme of things, but there's a big leap when something takes two
> business days to fix from the Tuesday deployment and two business days to
> fix from the Thursday deployment.
>
> It's not always possible for even the best developer and the best testing
> systems to catch an issue that will be spotted by a hands-on user, several
> of whom are much more familiar with the purpose, expected outcomes and
> change impact on extensions than the people who have written them or QA'd
> them.  That's why there will always be plenty of issues that are identified
> by users, and it is in no way a problem that a small number of them
> (compared to what we saw 10-15 years ago) get through to the end of the
> train before being identified as needing to be addressed (for different
> values of "addressed").
>

Thank you for this response! The train existed before I started thinking
about MediaWiki-software deployment. The impression that it has had a
positive impact on the number of problems seen by users is important
information. Your response is a fantastic answer to a different question
I wonder about a lot: why does the train process give us confidence in the
code being released?

The next part of that question is: are there ways we can gain this
confidence with less disruption? I'd be interested in trying to catalog the
types of problems that are only spotted by hands-on users in the interest
of seeing if patterns emerge.

Thank you again!
– Tyler

[Wikitech-l] 🚂🌈 Summary of last week's deployment of 1.37.0-wmf.15: Another Successful Trainbow*!

2021-07-26 Thread Tyler Cipriani
This is a belated summary of *last week*'s deployment of the 1.37.0-wmf.15
branch of MediaWiki, extensions, and skins (also known as "the train").

The primary person in charge last week was Antoine (hashar) Musso, with
Ahmon Dancy as backup, both from the Wikimedia Foundation Release
Engineering team.

The summary/blocker task for this week is:
https://phabricator.wikimedia.org/T281156

The new version is running on all sites: https://versions.toolforge.org/

== 📈 Stats ==
* 229 patches (15th smallest train since 1.31)
* 0 risky patches
* 0 time spent rolled back
* 1 day of delay
* 2 blockers were added, 0 were resolved, 2 blockers were removed

== 🚂🌈 ==
Everyone who deployed, triaged, and had code riding the train this week:
thank you. We've deployed another trainbow*!

Thanks especially to the folks who added and removed blockers this week:
* Timo Tijhof
* Daniel Kinzler
* Brennen Bearnes

❤️‍🔥
– Train Troop

* trainbow – a neologism meaning "happy and successful train" — a
portmanteau of "train" and "rainbow"!

[Wikitech-l] 🚂🌈 Summary of 1.37.0-wmf.16 train deployment

2021-08-02 Thread Tyler Cipriani
This email is a summary of the Wikimedia production deployment of
1.37.0-wmf.16.

Mukunda Modell was the train conductor last week. Antoine Musso (hashar)
was the backup conductor.

Blocker task: https://phabricator.wikimedia.org/T281157

The new version is live on all sites: https://versions.toolforge.org/

== 📊 Stats ==
* 365 patches ↑
* 1 risky patch ↑
* 1 rollback (for ~2 hours) ↑
* 0 days of delay ↓
* 2 blockers added, 0 resolved, 2 removed — same as last week
** OPEN: https://phabricator.wikimedia.org/T287704 by @mmodell
** RESOLVED: https://phabricator.wikimedia.org/T286490 by @dancy

== 🚂🌈 ==
A couple of tasks were mentioned but not added as blockers:

* OPEN: https://phabricator.wikimedia.org/T287642 by @brennen
* RESOLVED: https://phabricator.wikimedia.org/T287649 by @IKhitron
* RESOLVED: https://phabricator.wikimedia.org/T191021 by @Trizek-WMF

Thank you to everyone who filed tasks, triaged bugs, talked publicly, added
risky patch notifications, and braved yet another production deployment. To
quote Winnie the Pooh: "You are braver than you believe, stronger than you
seem, and smarter than you think."

Many thanks go to:

* Addshore
* alexhollender
* daniel
* DannyS712
* Etonkovidova
* IKhitron
* Jdlrobson
* Krinkle
* Ladsgroup
* Legoktm
* Majavah
* ovasileva
* Pchelolo
* Zabe

– The Trainbow Bunch

[Wikitech-l] 🚂🌈 Summary of 1.37.0-wmf.18 train deployment

2021-08-16 Thread Tyler Cipriani
This email is a summary of the Wikimedia production deployment of
1.37.0-wmf.18.

   - Conductor: Jeena Huneidi
   - Backup: Mukunda Modell
   - Blocker task: T281159 
   - Status: Live on all wikis 


*📊 Stats*

   - 244 patches ▁▄█▅▄
   - 1 rollback ▁▁███
   - 1 day of delay (going to group0) ██▁▁█
   - 6 blockers ▄▁▁▆█

🚂🌈

Big WikiLove to the folks who helped out:

   - Timo Tijhof
   - James D. Forrester
   - Tacsipacsi
   - Lucas Werkmeister
   - Jon Robson
   - RhinosF1
   - KartikMistry
   - abi_
   - Zabe

Until the next train,
– Your local WikiMotive congregants

[Wikitech-l] Re: (no subject)

2021-08-18 Thread Tyler Cipriani
Hi Siddhi

Here's the link to the Wikimedia organization on GitHub:
https://github.com/wikimedia

Here's a link to MediaWiki on GitHub in case that's helpful:
https://github.com/wikimedia/mediawiki

These repositories are (mostly) mirrored from our Gerrit instance where we
do code reviews and development: https://gerrit.wikimedia.org/g/

I hope that's helpful!
– Tyler

On Wed, Aug 18, 2021 at 6:39 AM Siddhi Bhanushali <
siddhibhanushali1...@gmail.com> wrote:

> Can I get github link of Wikimedia.

[Wikitech-l] No train the week of 2021-09-06

2021-08-24 Thread Tyler Cipriani
Hi All

There will be no deployment train the week of 2021-09-06 (Mon, 06 Sep –
Fri, 10 Sep). Release Engineering will be having a team-focused work week.
Backport and config deployments should continue as planned.

There is a long-term calendar of upcoming known deployment disruptions
available on Wikitech:
https://wikitech.wikimedia.org/wiki/Deployments/Yearly_calendar

Thank you!

Tyler Cipriani (he/him)
Engineering Manager, Release Engineering
Wikimedia Foundation

[Wikitech-l] MediaWiki 1.37-alpha will be branched as a beta on 14 September 2021

2021-08-24 Thread Tyler Cipriani
Hey all,

This is a quick note to highlight that in three weeks' time, the REL1_37
branch will be created for MediaWiki core and each of the extensions and
skins in Wikimedia git, with some (the 'tarball') included as sub-modules
of MediaWiki itself[0]. This is the first step in the release process for
MediaWiki 1.37, which should be out in late November 2021, approximately
six months after MediaWiki 1.36.

The branches will reflect the code as of the last 'alpha' branch for the
release, 1.37.0-wmf.23, which will be deployed to Wikimedia wikis in the
week beginning 13 September 2021 for MediaWiki itself and those extensions
and skins available there. (Note that there will not be a 1.37.0-wmf.22
deployment, so there are only two Wikimedia production train deployments
inclusive between now and the branch point.)

After that point, patches that land in the main development branch of
MediaWiki and its bundled extensions and skins will be slated for the
MediaWiki 1.38 release unless specifically backported[1].

If you are working on a new feature that you wish to land for the release,
you now have three weeks to finish your work and land it in the development
branch; feature changes should not be backported except in an urgent case.
If your work might not be complete in time, and yet should block release
for everyone else, please file a task against the `mw-1.37-release` project
on Phabricator.[2]

If you have tickets that are already tagged for `mw-1.37-release`, please
finish them, untag them, or reach out to get them resolved in the next few
weeks.

We hope to issue the first release candidate, 1.37.0-rc.0, two weeks after
the branch point, and if all goes well, to release MediaWiki 1.37.0 a few
weeks after that.
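
For anyone planning around these dates, here is a rough sketch of the
arithmetic, assuming the announcement date of 2021-08-24 and the "three
weeks" and "two weeks" offsets mentioned above. The variable names are
only illustrative, and the release candidate date is a hope rather than
a commitment.

```python
from datetime import date, timedelta

# Date this announcement was sent (from the thread header).
announcement = date(2021, 8, 24)

# "In three weeks' time" the REL1_37 branch is created.
branch_point = announcement + timedelta(weeks=3)

# The first release candidate is hoped for two weeks after the branch point.
first_rc = branch_point + timedelta(weeks=2)

print(branch_point.isoformat())  # 2021-09-14
print(first_rc.isoformat())      # 2021-09-28
```

The final 1.37.0 date is left out on purpose, since "a few weeks after
that" is not a fixed offset.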

[0]: <https://www.mediawiki.org/wiki/Bundled_extensions_and_skins>
[1]: <https://www.mediawiki.org/wiki/Backporting_fixes>
[2]: <https://phabricator.wikimedia.org/tag/mw-1.37-release/>

Tyler Cipriani (he/him)
Engineering Manager, Release Engineering
Wikimedia Foundation

[Wikitech-l] How we deploy code

2021-09-27 Thread Tyler Cipriani
Last week, I spoke to a few of my Wikimedia Foundation colleagues about how
we deploy code—I completely botched it.

At the end of the conversation, I was pretty sure I'd only succeeded in
making a complex process more opaque. I decided to write a blog post to
redeem myself: How We Deploy Code
<https://phabricator.wikimedia.org/phame/post/view/253/how_we_deploy_code/>

My goal was to write a very high-level overview of the process we use to
deploy code to Wikimedia production.

Hopefully, this is helpful.

<3
– Tyler Cipriani (he/him)
Engineering Manager, Release Engineering
Wikimedia Foundation
