[Wikitech-l] Re: maybe somebody can just review and approve this commit of script conversion?

2022-10-24 Thread bawolff
It seems like you've gotten reviews but you just don't like the answer.

It is extremely unlikely that anyone is going to merge it in over the
objection of other reviewers. The code needs to be structured so the logic
of what is going on is clear. 3000 lines of regexes with half of them
commented out just doesn't meet expectations for how MediaWiki code should
be written.

>it has been said that the file is too big and the commit is too big.
>the code is nearly 3500 lines. i think it is not worth dividing it
>into a separate library or separate files. as a library it would fit
>only this conversion and would not be usable for other things. i see
>making a library as "premature optimisation". there is also laziness:
>it feels like useless work. the same goes for just dividing it into
>files and classes. 3500 lines is not very hard for me to browse.

You can feel however you want, but if you want to have the code merged into
MediaWiki and deployed to Wikipedia, you're going to have to put aside your
feelings, or be content with it staying as an unmerged patchset forever.

--
bawolff

___
Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org
To unsubscribe send an email to wikitech-l-le...@lists.wikimedia.org
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/

[Wikitech-l] Re: Infrastructure diagrams

2022-10-24 Thread Selena Deckelmann
Thanks so much, Timo!  Having these snapshots of our systems and
infrastructure, and also capturing how it evolves over time, is invaluable.

-selena


[Wikitech-l] Re: Filtered lists with checkboxes

2022-10-24 Thread Strainu
On Monday, 24 October 2022, Bináris wrote:
> Hi,
[...]
> In the meantime, do you know of such a service on the Toolserver? Or can I
> do it myself somehow with Lua?

Hi Binaris,

For Commons files, there is https://pagepile-visual-filter.toolforge.org/
It shouldn't be too complicated to extend it to any list, but it's not
there yet. Maybe ask the maintainer for an extended version?

There are several tools working with PagePiles that can achieve the same
result, but they are all basically equivalent to editing the wikipage and
deleting the unwanted titles, which doesn't seem to be what you want.

HTH,
 Strainu



[Wikitech-l] Re: Filtered lists with checkboxes

2022-10-24 Thread Bináris
Good idea. Thank you, I am not very familiar with JS, but I will try it.

-- 
Bináris

[Wikitech-l] Re: Filtered lists with checkboxes

2022-10-24 Thread Marcin Szwarc
Hi, Bináris

Client-side JavaScript might help with that. This would remove the need to
deploy any code to all users of the wiki while retaining the page previews.

An overview of a solution:
- find all links on the current page
- create a checkbox for every link (and tag each one with the page title, for
example using the dataset property)
- when you finish marking articles, query the selected boxes
- display the output in any format you're comfortable with
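
The four steps above could be sketched roughly as follows. This is only a minimal illustration, not a finished user script: the function names (addCheckboxes, selectedTitles, formatTitles) are made up for this example, and it assumes that each article link carries its page title in a title attribute.

```javascript
// Sketch only: all function names here are hypothetical.

// Steps 1 and 2: put a checkbox in front of every link inside `root`
// and store the link's page title in the checkbox's dataset.
// Assumption: each link exposes the page title in its `title` attribute.
function addCheckboxes(doc, root) {
  const boxes = [];
  for (const link of root.querySelectorAll('a[title]')) {
    const box = doc.createElement('input');
    box.type = 'checkbox';
    box.dataset.title = link.getAttribute('title');
    link.parentNode.insertBefore(box, link);
    boxes.push(box);
  }
  return boxes;
}

// Step 3: collect the titles of the boxes that are checked.
function selectedTitles(boxes) {
  return Array.from(boxes)
    .filter((box) => box.checked)
    .map((box) => box.dataset.title);
}

// Step 4: one title per line, which is easy to paste into a bot's input.
function formatTitles(titles) {
  return titles.join('\n');
}
```

In a browser this might be called as addCheckboxes(document, someContainer), where someContainer is whichever element holds the list, followed later by formatTitles(selectedTitles(boxes)) once the marking is done.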

Cheers,
Msz2001




[Wikitech-l] maybe somebody can just review and approve this commit of script conversion?

2022-10-24 Thread dinar qurbanov
hello

can somebody help with reviewing the technical aspects of a cyrillic <->
latin converter?

it has been said that the file is too big and the commit is too big.
the code is nearly 3500 lines. i think it is not worth dividing it
into a separate library or separate files. as a library it would fit
only this conversion and would not be usable for other things. i see
making a library as "premature optimisation". there is also laziness:
it feels like useless work. the same goes for just dividing it into
files and classes. 3500 lines is not very hard for me to browse.

if anybody wants to separate it into classes and files, feel free to
edit, if you think it is good for mediawiki. i do not want to make
such a change myself. i can help with proper names (for classes,
variables).

i said variables because it is possible to convert some lists of php
commands into for loops over arrays, just to make the code look shorter.
i think that is also a kind of premature optimisation: i have more
freedom for further editing when the commands stay as they are. i do
not want to make such a
change by myself. feel free to edit.
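
For readers unfamiliar with the "arrays and loops" shape mentioned here, a minimal sketch follows. It is written in JavaScript purely for illustration (the patch itself is PHP), the names are hypothetical, and the three rules are a toy example, not the actual rule set in TtConverter.php.

```javascript
// Hypothetical rule table: each entry is [pattern, replacement].
// Illustrative only; these are not the real conversion rules from the patch.
const cyrillicToLatinRules = [
  [/ш/g, 'ş'],
  [/б/g, 'b'],
  [/а/g, 'a'],
];

// Apply every rule in order. Order matters when rules can overlap,
// so the table is an ordered list rather than an unordered map.
function applyRules(text, rules) {
  let result = text;
  for (const [pattern, replacement] of rules) {
    result = result.replace(pattern, replacement);
  }
  return result;
}
```

With this shape, adding or reordering a rule is a one-line change to the table, while the loop that applies the rules stays the same.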

quick link to the newest version of the code:
https://gerrit.wikimedia.org/r/c/mediawiki/core/+/164049/219/includes/language/converters/TtConverter.php
.

it was also suggested to get rid of regex lookahead and lookbehind,
and to use strtr instead. i think i probably cannot easily make
those changes. and is strtr really faster than preg_replace?
i googled for benchmark results and did not see any on the first
page of about 10 results.

the code is big, but it just modifies a string.

tatar wikipedians asked me about this code yesterday, so now i am
trying this approach, writing to this mailing list, before trying other
ways. maybe there is just a lack of people who can review this code
and approve it, and maybe this mailing list can help.

other ways to reach the pages about this problem:
https://phabricator.wikimedia.org/T27537 ,
https://gerrit.wikimedia.org/r/c/mediawiki/core/+/164049 .


[Wikitech-l] Filtered lists with checkboxes

2022-10-24 Thread Bináris
Hi,

I work with large categories. My dream is to upload a list of articles to a
wikipage where there are checkboxes next to the titles that I can click.
The page would return a filtered list which I can use with my bot to
change the category.
On an advanced level, several columns of checkboxes could exist to divide
the category into subcats.

Checkboxes are not the aim; they are only a way to explain what I want to do.
Of course, I can do it in Excel, but 1. on a wikipage I can move my mouse
over a title and see part of the content, and in Excel I cannot, 2. I
can offer a wikipage to another user to do the filtering, and then do the
work with the bot.

As far as I can see, the most natural way is to code this in PHP, which needs
a MediaWiki extension, if anyone feels like it and likes my idea...
In the meantime, do you know of such a service on the Toolserver? Or can I do
it myself somehow with Lua?

-- 
Bináris

[Wikitech-l] Re: Infrastructure diagrams

2022-10-24 Thread Neil Shah-Quinn
Agreed—these diagrams are genuine masterworks which synthesize what would
otherwise take dozens (probably hundreds) of hours to learn. Thank you
very, very much, Timo.


[Wikitech-l] Re: Infrastructure diagrams

2022-10-24 Thread Johan Jönsson
Huge thanks for this. As someone who comes into contact with, for example,
the WMF infrastructure only irregularly, having these diagrams somewhat up
to date, just to be able to follow along in conversations, is very helpful.

//Johan Jönsson
--


[Wikitech-l] Re: Infrastructure diagrams

2022-10-24 Thread Sammy Tarling
This is incredible, thank you so much!


--
Sammy Tarling
Software Engineer
Wikimedia Foundation



[Wikitech-l] Infrastructure diagrams

2022-10-24 Thread Krinkle
I've done a major update to a number of diagrams on Wikitech.

Usually, I don't mention an update here, but I'm highlighting it now as it's 
been a while since we mentioned them on-list and the community and foundation 
have grown a lot so some of these may be new to you.

Given how much has changed in recent years, I also included a changelog and a
link to where in the docs you'd normally discover each diagram on-wiki:

*== 1. File:Wikipedia_webrequest_2022.png (Updated) ==*

This is a highly simplified diagram, covering the general shape of our stack
through the example of a typical Wikipedia webrequest.

Previous:
https://upload.wikimedia.org/wikipedia/commons/b/b3/Wikipedia_webrequest_flow_2020.png
New:
https://upload.wikimedia.org/wikipedia/commons/4/4d/Wikipedia_webrequest_2022.png
Documentation: wikitech:MediaWiki_at_WMF and wikitech:Caching_overview.
Notable changes:
* Change edge TLS termination ("HTTPS") from ats-tls to HAProxy. I wrote a
"Caching overview § History" section.
* Change appserver TLS from Nginx to Envoy.
* Add new MainStash DB.
* Include storage ExternalStore DB, ParserCache DB, and Swift media.
* Include services Shellbox, Mathoid, and Kask.

*== 2. File:WMF_infrastructure_2022.png (Updated) ==*

This is a continuous attempt at an overview of tier-1/user-facing
infrastructure. It will likely never be complete from every point of view, but
it is more accurate and complete than it has been. Thanks to all who
contributed by entertaining my many questions over the years.

Previous (2016 by Elukey): 
https://upload.wikimedia.org/wikipedia/labs/4/4d/Infrastructure_overview.png
New: 
https://upload.wikimedia.org/wikipedia/commons/4/48/WMF_infrastructure_2022.png
Documentation: wikitech:Wikimedia_infrastructure and wikitech:Purged
Notable changes:
* Add new Drmrs data center in Marseille, France.
* Add new services: purged.go, EventStreams, Thumbor, mcrouter, Envoy, etcd.
* Add new distinction for Multi-DC between primary and secondary data center.
* Change sessionstore from Redis to Kask/Cassandra.
* Change jobqueue from Redis to EventGate/Kafka.
* Include distinct MediaWiki server roles and clusters.
* Include high-level MediaWiki platform components.
* Include example flow for "JobQueue job" and "CDN purge".

*== 3. File:MediaWiki_infrastructure_2022.png (New) ==*

Similar to the WMF infrastructure diagram, but more abstract around data
centers and services, and more detailed within the platform. It includes more
core services and recognises extensions as their own layer.

New:
https://upload.wikimedia.org/wikipedia/commons/e/ee/MediaWiki_infrastructure_2022.png
Documentation: wikitech:MediaWiki_at_WMF

*== 4. File:Wikipedia_Memcached_flow_2022.png (Updated) ==*

Previous: 
https://upload.wikimedia.org/wikipedia/commons/d/db/Wikipedia_Memcached_flow_2020.png
New: 
https://upload.wikimedia.org/wikipedia/commons/4/45/Wikipedia_Memcached_flow_2022.png
Documentation: wikitech:Memcached_for_MediaWiki
Notable changes:
* Include the three tiers of ParserCache.
* Add WANCache legend to explain different keytypes you may encounter on the 
network.
* Add full name of the mcrouter-with-onhost-tier service for greppability.
* Add new WRStats service (T310662). This was part of Multi-DC work to reduce
primary DB writes and (not bidirectionally replicated) Redis use in
AbuseFilter. This service also replaces the old "User ping limiter" in core and
is now able to serve both use cases.
* Remove "on-host: soon" labels. Adopting on-host memc for WANCache was
considered not worth the added runtime complexity (T264604). Note that SRE's
work on adding 10G network links for memcached hosts, and the addition of
mcrouter-managed gutter pools, take care of the general use case that we were
exploring on-host for. We kept it for ParserCache, however (T244340).

*== Edit link ==*

As before, each diagram file page has an "Edit" link in the description that 
takes you directly to the open-source Diagrams.net web app (loading file 
read-only from Google Drive). You can fork by using "Save as" in the web app. 
See also: