[Wikitech-l] Re: API issue
Hi,

without further details on what requests you make, what user-agent your script uses, and the times at which you've seen this issue, it's going to be hard to help you. My bet is that you've run into one of our throttling/anti-abuse rules.

If you open a task on Phabricator with details (if possible, an IP would also help; in that case feel free to make the task private) and point me to it, I can investigate what is going on. I'm pretty confident the problem is not the servers being overloaded.

Cheers,
G.

On Wed, Jan 10, 2024 at 6:28 AM ovskmendov--- via Wikitech-l <wikitech-l@lists.wikimedia.org> wrote:

> I know. As I said before, this literally started on the first request I made.
>
> Sent with Proton Mail <https://proton.me/> secure email.
>
> On Wednesday, January 10th, 2024 at 12:23 AM, Dalba wrote:
>
> Maybe this is not it, but worth checking: "Make your requests in series rather than in parallel, by waiting for one request to finish before sending a new request."[1]
> [1]: https://www.mediawiki.org/wiki/API:Etiquette
>
> On Wed, Jan 10, 2024 at 8:45 AM ovskmendov--- via Wikitech-l <wikitech-l@lists.wikimedia.org> wrote:
>
>> > Maybe someone else is doing it from the same IP, or maybe it's a bug. You could file a report <https://phabricator.wikimedia.org/maniphest/task/edit/form/43/> in Phabricator with the details of what you are doing and the details from that error page.
>>
>> Given that I have a clean residential IPv6, that is very unlikely. Plus, it was working just a few days ago. And it still doesn't work with a VPN.
>>
>> I don't think it's a bug, as a few requests go through fine; based on the behavior, I think it's server overload or something like that.
>> On Wednesday, January 10th, 2024 at 12:06 AM, Gergo Tisza <gti...@wikimedia.org> wrote:
>>
>> On Tue, Jan 9, 2024 at 8:53 PM ovskmendov--- via Wikitech-l <wikitech-l@lists.wikimedia.org> wrote:
>>
>>> I am not sending very many requests. I haven't sent any requests in several days and yet I get the error message.
>>
>> Maybe someone else is doing it from the same IP, or maybe it's a bug. You could file a report <https://phabricator.wikimedia.org/maniphest/task/edit/form/43/> in Phabricator with the details of what you are doing and the details from that error page.

--
Giuseppe Lavagetto
Principal Site Reliability Engineer, Wikimedia Foundation
___
Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org
To unsubscribe send an email to wikitech-l-le...@lists.wikimedia.org
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/
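The API:Etiquette advice quoted in this thread — identify your client with a descriptive User-Agent and make requests serially rather than in parallel — can be sketched as below. This is an illustrative sketch, not an official client: the endpoint, bot name, and contact address are placeholders; `maxlag` is the standard MediaWiki API parameter for backing off when the servers are lagged.

```python
# Sketch of the etiquette advice: descriptive User-Agent, maxlag, and
# strictly serial requests. Names and values here are placeholders.
import time
import urllib.parse
import urllib.request

API = "https://en.wikipedia.org/w/api.php"
UA = "ExampleBot/0.1 (contact: you@example.org)"  # placeholder contact

def build_request(params):
    """Build a GET request carrying an identifying User-Agent and maxlag."""
    params = dict(params, format="json", maxlag="5")
    url = API + "?" + urllib.parse.urlencode(params)
    return urllib.request.Request(url, headers={"User-Agent": UA})

def fetch_serially(param_list, opener=urllib.request.urlopen, pause=1.0):
    """Issue requests one at a time, waiting for each to finish."""
    results = []
    for params in param_list:
        results.append(opener(build_request(params)).read())
        time.sleep(pause)  # breathe between requests; don't parallelize
    return results
```

In a real script, `opener` would stay as `urllib.request.urlopen` (or an equivalent HTTP client); it is parameterized here so the serial logic can be exercised without network access.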
[Wikitech-l] Re: March 2023 Datacenter Switchover
that the datacenter switchover will happen on *Wednesday March 1st* starting at *14:00 UTC*.
>>
>> Please refer to the original email for any additional information. As always, you can reach out to me directly or the SRE team in #wikimedia-sre on IRC with any question, or through Phabricator.
>>
>> Thank you,
>>
>> On Tue, Feb 14, 2023 at 1:58 PM Clément Goubert <cgoub...@wikimedia.org> wrote:
>>
>>> Dear Wikitechians,
>>>
>>> On *Wednesday March 1st*, the SRE team will run a planned data center switchover, moving all wikis from our primary data center in Virginia to the secondary data center in Texas. This is an important periodic test of our tools and procedures, to ensure the wikis will continue to be available even in the event of major technical issues in our primary home. It also gives all our SRE and ops teams a chance to do maintenance and upgrades on systems in Virginia that normally run 24 hours a day.
>>>
>>> The switchover process requires a *brief read-only period for all Foundation-hosted wikis*, which will start at *14:00 UTC on Wednesday March 1st*, and will last for a few minutes while we execute the migration as efficiently as possible. All our public and private wikis will be continuously available for reading as usual, but no one will be able to save edits during the process. Users will see a notification of the upcoming maintenance, and anyone still editing will be asked to try again in a few minutes.
>>>
>>> CommRel has already begun notifying communities of the read-only window. A similar event will follow a few weeks later, when we move back to Virginia. This is currently scheduled for *Wednesday, April 26th*.
>>> If you like, you can follow along on the day in the public #wikimedia-operations channel on IRC (instructions for joining here <https://meta.wikimedia.org/wiki/IRC/Instructions>). To report any issues, you can reach us in #wikimedia-sre on IRC, or file a Phabricator ticket with the *datacenter-switchover* tag (pre-filled form here <https://phabricator.wikimedia.org/maniphest/task/edit/form/1/?projects=Datacenter-Switchover=Clement_Goubert>); we'll be monitoring closely for reports of trouble during and after the switchover. (If you're new to Phab, there's more information at Phabricator/Help.) The switchover and its preparation are tracked in Phabricator Task T327920 <https://phabricator.wikimedia.org/T327920>
>>>
>>> On behalf of the SRE team, please excuse the disruption, and our thanks to everyone in a number of departments who've been involved in planning this work for the past weeks. Feel free to reply directly to me with any questions.
>>> Thank you,
>>>
>>> --
>>> Clément Goubert (they/them)
>>> Senior SRE
>>> Wikimedia Foundation

--
Amir (he/him)

--
Giuseppe Lavagetto
Principal Site Reliability Engineer, Wikimedia Foundation
___
Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org
To unsubscribe send an email to wikitech-l-le...@lists.wikimedia.org
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/
[Wikitech-l] Re: ClassCrawler – extremely fast and structured code search engine
On Sat, Feb 5, 2022 at 10:19 PM Daniel Kinzler wrote:

> On 05.02.22 at 21:38, Amir Sarabadani wrote:
>
> Codesearch has been working fine in the past couple of years. There is a new frontend being built and I hope we can deploy it soon to provide a better user experience, and I personally don't see a value in re-implementing codesearch, especially using non-open-source software.
>
> While I agree with several points that have been raised, in particular about licensing and building on top of existing tools, I'd like to point out that the idea is not to re-implement codesearch, but to overcome some of its limitations. What we use codesearch for most is finding usages of methods (and sometimes classes). This works fine if the method name is fairly unique. But if the method name is generic, or you are moving a method from one class to another and you want to find callers of the old method, but not the new method, then regular expressions just don't cut it.

Ok, why do you think symbol search can't be integrated in the current codesearch? That's what Amir was proposing. Sadly I don't think much of the current code of ClassCrawler can be reused for that goal, and it's a pity.

Cheers,
Giuseppe

--
Giuseppe Lavagetto
Principal Site Reliability Engineer, Wikimedia Foundation
___
Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org
To unsubscribe send an email to wikitech-l-le...@lists.wikimedia.org
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/
[Wikitech-l] Re: New experimental backend for the Wikimedia Debug browser extension: what to expect
A quick update: it was decided that, while we get to the point where we can keep releases in sync with production with ease, we will limit public access to the k8s installation to group0 and test wikis. The restriction will be lifted once we feel confident we'll run the same code in k8s and physical servers all the time.

Thanks,
Giuseppe

On Tue, Jul 27, 2021 at 6:16 PM Giuseppe Lavagetto wrote:

> Hi all,
>
> This email is of interest to you only if you're a user of the "Wikimedia Debug" browser extension. If you're not, you can safely skip it.
>
> As the more attentive might have noticed, the Wikimedia Debug browser extension started offering a new option in the drop-down menu, besides the usual mwdebug servers, labeled "k8s-experimental". That is, as the name suggests, a very experimental setup of mediawiki running on kubernetes and is not *yet* a place where you will be able to test your releases.
>
> Right now, that installation is a work in progress, but nonetheless it seemed important to us to have a way to browse the wikis from the installation running on kubernetes while we iron out bugs in preparation for the actual migration of production traffic.
>
> This installation can thus:
> - run on outdated versions of mediawiki (although we're trying to follow train releases).
> - be down for extended periods of time while we debug something, without warning.
> It also doesn't support (yet) profiling via xhprof.
>
> So while we welcome the curious to poke around at the performance and bugs of the installation, it is not a suitable tool (yet) to debug your releases on.
>
> I will add filters where appropriate to avoid logs from this installation from polluting your dashboards in the coming days, but in the meantime, if you see a log line coming from a server with a strange name like "mediawiki-pinkunicorn"... that's mediawiki running on kubernetes and you can mostly ignore it!
> You can follow our progress at https://phabricator.wikimedia.org/T283056
>
> Cheers,
>
> Giuseppe
>
> --
> Giuseppe Lavagetto
> Principal Site Reliability Engineer, Wikimedia Foundation

--
Giuseppe Lavagetto
Principal Site Reliability Engineer, Wikimedia Foundation
___
Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org
To unsubscribe send an email to wikitech-l-le...@lists.wikimedia.org
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/
[Wikitech-l] New experimental backend for the Wikimedia Debug browser extension: what to expect
Hi all,

This email is of interest to you only if you're a user of the "Wikimedia Debug" browser extension. If you're not, you can safely skip it.

As the more attentive might have noticed, the Wikimedia Debug browser extension started offering a new option in the drop-down menu, besides the usual mwdebug servers, labeled "k8s-experimental". That is, as the name suggests, a very experimental setup of mediawiki running on kubernetes and is not *yet* a place where you will be able to test your releases.

Right now, that installation is a work in progress, but nonetheless it seemed important to us to have a way to browse the wikis from the installation running on kubernetes while we iron out bugs in preparation for the actual migration of production traffic.

This installation can thus:
- run on outdated versions of mediawiki (although we're trying to follow train releases).
- be down for extended periods of time while we debug something, without warning.
It also doesn't support (yet) profiling via xhprof.

So while we welcome the curious to poke around at the performance and bugs of the installation, it is not a suitable tool (yet) to debug your releases on.

I will add filters where appropriate to avoid logs from this installation from polluting your dashboards in the coming days, but in the meantime, if you see a log line coming from a server with a strange name like "mediawiki-pinkunicorn"... that's mediawiki running on kubernetes and you can mostly ignore it!

You can follow our progress at https://phabricator.wikimedia.org/T283056

Cheers,
Giuseppe

--
Giuseppe Lavagetto
Principal Site Reliability Engineer, Wikimedia Foundation
___
Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org
To unsubscribe send an email to wikitech-l-le...@lists.wikimedia.org
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/
[Wikitech-l] Re: Stream of recent changes diffs
On Thu, Jul 1, 2021 at 3:10 PM Andrew Otto wrote:

> This isn't helpful now, but your use case is relevant to something I hope to pursue in the future: comprehensive mediawiki change events, including content. I don't have a great place yet for collecting these use cases, so I added it to the Modern Event Platform parent ticket <https://phabricator.wikimedia.org/T185233> so I don't forget. :)

I don't think this is the use-case at all. As someone else already pointed out, diffs don't always give you the context and might be unparsable wikitext. So what you can do is either:

1) Always send the full content of the changed page in the stream, along with the diff. This is IMHO extremely wasteful, but it's also easy to implement.
2) Find a way to analyze the edits and emit specialized event tags that define what has changed. This is the correct way to go forward, IMHO, but it requires much more engineering time.

I don't think there is really a big value in adding the full content of the page to every edit event. I'd rather suggest that people fetch the Parsoid HTML from the API, and ensure we do good edge-side caching.

Cheers,
Giuseppe

P.S. Please note that I'm only referring to streams offered to tools and in general to the public internet. Internally to the production cluster, the use of content in events might (or might not) prove directly useful in some cases.

--
Giuseppe Lavagetto
Principal Site Reliability Engineer, Wikimedia Foundation
___
Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org
To unsubscribe send an email to wikitech-l-le...@lists.wikimedia.org
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/
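The approach suggested in the message above — consume lightweight change events and fetch the rendered Parsoid HTML from the API on demand — can be sketched as follows. The stream and REST URL shapes reflect the public Wikimedia endpoints; the processing logic and function names are illustrative.

```python
# Sketch: react to edit events by fetching the page's Parsoid HTML from the
# REST API, instead of expecting full content inside every event.
import json
import urllib.parse

# Public server-sent-events stream of recent changes (for reference).
STREAM_URL = "https://stream.wikimedia.org/v2/stream/recentchange"

def html_url(domain, title):
    """REST endpoint serving the Parsoid-rendered HTML of a page."""
    return "https://%s/api/rest_v1/page/html/%s" % (
        domain, urllib.parse.quote(title.replace(" ", "_"), safe=""))

def handle_event(raw_event, fetch):
    """Parse one event payload; for edits, fetch the page HTML via `fetch`."""
    event = json.loads(raw_event)
    if event.get("type") != "edit":
        return None  # ignore log entries, categorization events, etc.
    return fetch(html_url(event["server_name"], event["title"]))
```

Here `fetch` would be an HTTP GET in a real consumer; it is passed in so the routing logic can be tested without network access, which also makes it easy to add the edge-side caching the message recommends.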
[Wikitech-l] Change to how we build the wikimedia base container images.
[X-posting from ops-l]

Hi all,

Starting today, we are building our base container images using debuerreotype instead of bootstrap-vz, which is unmaintained [1]. This is the same tool that is used for the dockerhub Debian images, and our images are now completely equivalent to the Debian base images, plus our own apt configuration [2].

With this change, we're also introducing a simpler nomenclature for our base images: we will tag our images with "$codename" instead of "wikimedia-$codename". Thus:

- the base stretch image is now docker-registry.wikimedia.org/stretch
- the base buster image is now docker-registry.wikimedia.org/buster

We have also added a new image based on the (yet unreleased, caveat emptor) Debian bullseye.

We will keep tagging the latest version of those images as "wikimedia-stretch" and "wikimedia-buster" for the time being, in order to allow for backwards compatibility, but we encourage everyone to eventually migrate to the new naming.

Cheers,
Giuseppe

[1] https://phabricator.wikimedia.org/T281984
[2] Our very simple build script is here: https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/modules/docker/files/build-bare-slim.sh

--
Giuseppe Lavagetto
Principal Site Reliability Engineer, Wikimedia Foundation
___
Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org
To unsubscribe send an email to wikitech-l-le...@lists.wikimedia.org
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/
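For consumers of these images, the renaming above only changes the image reference in build files; a hypothetical Dockerfile using the names from the announcement would change like this (config sketch only):

```dockerfile
# Before the renaming (still works for now, for backwards compatibility):
#   FROM docker-registry.wikimedia.org/wikimedia-buster
# After the renaming announced above:
FROM docker-registry.wikimedia.org/buster
```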
[Wikitech-l] TechCom meeting 2020-10-28
This is the weekly TechCom board review in preparation of our meeting on Wednesday. If there are additional topics for TechCom to review, please let us know by replying to this email. However, please keep discussion about individual RFCs to the Phabricator tickets.

Activity since Monday 2020-10-19 on the following boards:
https://phabricator.wikimedia.org/tag/techcom/
https://phabricator.wikimedia.org/tag/techcom-rfc/

Committee board activity:
- T239742 <https://phabricator.wikimedia.org/T239742>: Should npm packages maintained by Wikimedia be scoped or unscoped? Support for the idea of namespacing our npm packages has been expressed.

RFCs:

Phase progression:
- T262946 <https://phabricator.wikimedia.org/T262946>: Bump Firefox version in basic support to 3.6 or newer. Started the last call process. Some clarification requested, but no new opposition.

IRC meeting request:
- T263841 <https://phabricator.wikimedia.org/T263841>: RFC: Expand API title generator to support other generated data. Some guidance through the process has been requested from TechCom.

Other RFC activity:
- T119173 <https://phabricator.wikimedia.org/T119173>: RFC: Discourage use of MySQL's ENUM type. A question was asked to the DBAs about using ENUMs in maintenance.

Cheers,
Giuseppe

--
Giuseppe Lavagetto
Principal Site Reliability Engineer, Wikimedia Foundation
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
[Wikitech-l] TechCom meeting 2020-10-06
This is the weekly TechCom board review in preparation of our meeting on Wednesday. If there are additional topics for TechCom to review, please let us know by replying to this email. However, please keep discussion about individual RFCs to the Phabricator tickets.

Activity since Monday 2020-09-28 on the following boards:
https://phabricator.wikimedia.org/tag/techcom/
https://phabricator.wikimedia.org/tag/techcom-rfc/

Committee inbox:
- T264334 <https://phabricator.wikimedia.org/T264334>: Could the registered module manifest be removed from the client?
  - New task about the possibility of removing the huge module registry from the js sent to the client. The idea is being discussed.

Committee board activity: nothing to report, besides the inbox.

New RFCs: none.

Phase progression:
- T262946 <https://phabricator.wikimedia.org/T262946>: Bump Firefox version in basic support to 3.6 or newer
  - Moves to P3 (explore)
  - It is pointed out that we dropped support in production for TLS 1.0/1.1 in January, so de facto only Firefox 27+ is able to connect to the Wikimedia sites
  - In light of that, it's suggested that we might bump the minimum supported versions of browsers further.

IRC meeting request: none

Other RFC activity:
- T260714 <https://phabricator.wikimedia.org/T260714>: Parsoid Extension API.
  - Last call to be approved, which will end on October 7 (tomorrow)
- T487 <https://phabricator.wikimedia.org/T487>: RfC: Associated namespaces.
  - On last call to be declined; there is some opposition to marking it as declined on Phabricator. Last call should end on October 7 (tomorrow)
- T263841 <https://phabricator.wikimedia.org/T263841>: RFC: Expand API title generator to support other generated data.
  - Erik asks if this is going to be generally applied to all generators or not.
Cheers, Giuseppe -- Giuseppe Lavagetto Principal Site Reliability Engineer, Wikimedia Foundation ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] [Ops] CI downtime Monday May 11th 12:00 UTC
On Thu, May 7, 2020 at 5:20 PM Antoine Musso wrote:

[CUT]

> If a change has to happen on the configuration repositories:
>
> * operations/puppet:
>   locally run: bundle update && bundle exec rake test

If you have docker installed, you can also run the script $puppet_dir/utils/run_ci_locally.sh which, unsurprisingly, runs the same tests as CI does, in the same container.

Cheers,
Giuseppe

--
Giuseppe Lavagetto
Principal Site Reliability Engineer, Wikimedia Foundation
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] For title normalization, what characters are converted to uppercase ?
On Sun, Aug 4, 2019 at 11:34 AM Nicolas Vervelle wrote:

> Thanks Brian,
>
> Great for the link to Php72ToUpper.php! I think I understand with it: for example, the first line says 'ƀ' => 'ƀ', which should mean that this letter shouldn't be converted to uppercase by MW? That's one of the letters I found that wasn't converted to uppercase and that was generating a false positive in my code: so it's because specific MW code is preventing the conversion :-)

Hi!

No, that file is a temporary measure during a transition between two versions of PHP. In HHVM and PHP 5.x, calling mb_strtoupper("ƀ") would give the erroneous result "ƀ". In PHP 7.x, the result is the correct capitalization. The issue is that the titles of wiki articles get normalized, so under PHP 7 we would have ƀar => Ƀar, which would prevent you from being able to reach the page.

Once we're done with the transition and we go through the process of converting the (several hundred) pages/users that have the wrong title normalization, we will remove that table, and obtain the correct behaviour.

You just need to subscribe to https://phabricator.wikimedia.org/T219279 and wait for its resolution, I think - most unicode horrors are fixed in recent versions of PHP, including the one you were citing.

Cheers,
Giuseppe

--
Giuseppe Lavagetto
Principal Site Reliability Engineer, Wikimedia Foundation
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
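The case-mapping change discussed in this thread is easy to reproduce with any Unicode-aware uppercase function. The sketch below uses Python, whose str.upper() follows the same full Unicode mappings as PHP 7's mb_strtoupper(); the normalize_title helper is a simplified stand-in for MediaWiki's first-letter title normalization, not its actual code.

```python
# Unicode 5.0 added an uppercase counterpart Ƀ (U+0243) for ƀ (U+0180),
# so runtimes with a current Unicode database capitalize it, while
# HHVM/PHP 5.x left it unchanged.
def normalize_title(title):
    """Uppercase the first character of a title, roughly as MediaWiki does
    on wikis with first-letter capitalization (simplified sketch)."""
    return title[:1].upper() + title[1:] if title else title

# Under a current Unicode database, normalize_title("ƀar") yields "Ƀar",
# while the old behavior left it as "ƀar" - so pages created under the old
# normalization become unreachable under the new one, which is exactly
# what the Php72ToUpper.php table papers over during the transition.
```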
[Wikitech-l] PHP 7 is now a beta feature
Hi all,

As some of you might know, the HHVM project decided some time ago to drop support for PHP, choosing to only support Hack (Facebook's own PHP-derived language)[1]. This forced us to consider alternatives. In particular, the last major upgrade to PHP, PHP 7, was supposed to have greatly improved the performance of the runtime, guaranteeing performance on par with HHVM. Given that early tests[2] showed promising performance, we decided to work on PHP 7 support and on its rollout in production.

I'm happy to announce that PHP 7 is now available as a beta feature on all wikis, and I encourage everyone to try it out and report bugs using the #php7.2-support tag. After this period of beta testing, we will proceed with a progressive rollout to a growing percentage of users, and hopefully we'll complete the transition in the next four months.

A huge thank you to all the people who worked hard to reach this goal!

Thanks,
Giuseppe

[1] https://hhvm.com/blog/2017/09/18/the-future-of-hhvm.html
[2] https://lists.wikimedia.org/pipermail/wikitech-l/2017-September/088854.html

--
Giuseppe Lavagetto
Principal Site Reliability Engineer, Wikimedia Foundation
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] [Engineering] Gerrit now automatically adds reviewers
On Thu, Jan 17, 2019 at 10:52 PM Greg Grossmeier wrote:

> Hello,
>
> Yesterday we (the Release Engineering team) enabled a Gerrit plugin that will automatically add reviewers to your changes based on who previously has committed changes to the file.

While I commend the intention, this means I will get pinged for virtually any change in a couple of very busy repositories. The amount of noise will prevent me from being able to notice anyone's review request. I think it's going to be the same for other developers - I don't want to imagine what the inbox of a long-time mediawiki-core contributor must look like!

What I fear is that the flood of reviews will make everyone numb to notifications, obtaining the exact opposite effect of what was intended. I say this because I auto-added myself to all reviews in operations/puppet[1] in the past, which resulted in me ignoring all code review requests.

I think a good compromise would be to modify the plugin so that it adds reviewers automatically only if you're a new contributor (so you have - say - less than N patches submitted).

While this gets improved, is there a way to opt out from the feature, individually or as a project?

Thanks,
Giuseppe

[1] We already have a way to "monitor" all changes to a repository, to a directory within a repository, or even to individual files, which I was using extensively. Should we remove that?

--
Giuseppe Lavagetto
Principal Site Reliability Engineer, Wikimedia Foundation
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] [Wmfall] Datacenter Switchover recap
Sorry for the copy/paste fail, I meant:

> So I want to congratulate everyone who was involved in the process; that includes most of the people on the core platform, performance, search and SRE teams, but a special personal thanks goes to Alexandros and Riccardo for driving most of the process and allowing me to care about the switchover for less than a week before it happened and, yes, to take the time to fix that bug too :)

Cheers,
Giuseppe

--
Giuseppe Lavagetto
Principal Site Reliability Engineer, Wikimedia Foundation
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] [Wmfall] Datacenter Switchover recap
On Thu, Sep 13, 2018 at 7:49 AM Bryan Davis wrote:

> Everyone involved worked hard to make this happen, but I'd like to give a special shout out to Giuseppe Lavagetto for taking the time to follow up on a VisualEditor problem that affected Wikitech (<https://phabricator.wikimedia.org/T163438>). We noticed during the April 2017 switchover that the client side code for VE was failing to communicate with the backend component while the wikis were being served from the Dallas datacenter. We guessed that this was a configuration error of some sort, but did not take the time to debug in depth. When the issue reoccurred during the current datacenter switch, Giuseppe took a deep dive into the code and configuration, identified the configuration difference that triggered the problem, and made a patch for the Parsoid backend that fixes Wikitech.

While I'm flattered by the compliments, I think it's fair to underline that the problem was partly caused by a patch I made to Parsoid some time ago. So I mostly cleaned up a problem I caused - does this count for getting a new t-shirt, even if I fixed it with more than one year of delay? :P

On the other hand, I want to join the choir praising the work that has been done for the switchover, and take the time to list all the things we've done collectively to make it as uneventful and fast (read-only time was less than 8 minutes this time) as it was:

- MediaWiki now fetches its read-only state, and which datacenter is the master, from etcd, eliminating the need for a code deployment
- We now connect to our per-datacenter distributed cache via mcrouter, which allows us to keep the caches in the various datacenters consistent. This eliminated the need to wipe the cache during the read-only phase, resulting in a big reduction in the time we were read-only
- Our old jobqueue not only gave me innumerable debugging nightmares, but was hard and tricky to handle in a multi-datacenter environment.
We have substituted it with a more modern system which needed no intervention during the switchover
- Our media storage system (Swift + Thumbor) is now active-active, and we write and read from both datacenters
- We created a framework for easily automating complex orchestration tasks (like a switchover) called "spicerack", which will benefit our operations in general and has the potential to reduce the toil on the SRE team, as proven, automated procedures can be coded for most events.
- Last but not least, the Dallas datacenter (codenamed "codfw") needed little to no tuning when we moved all traffic, and we had to fix virtually nothing that went out of sync during the last 1.4 years. I know this might sound unimpressive, but keeping a datacenter that's not really used in good shape and in sync is a huge accomplishment in itself; I've never seen such a show of flawless execution and collective discipline before.

So I want to congratulate everyone who was involved in the process; that includes most of the people on the core platform, performance, search and SRE teams, but a special personal thanks goes to:

- The whole SRE team, and really anyone working on our production environment, for keeping the Dallas datacenter in good shape for more than a year, so that we didn't need to adjust almost anything pre- or post-switchover
- Alexandros and Riccardo, for driving most of the process and allowing me to care about the switchover for less than a week before it happened and, yes, to take the time to fix that bug too :)

Cheers,
Giuseppe

P.S. I'm sure I forgot someone / something amazing we've done; I apologize in advance.

--
Giuseppe Lavagetto
Principal Site Reliability Engineer, Wikimedia Foundation
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] [Engineering] Phabricator spam - account approval requirement enabled
On Sun, Jul 1, 2018 at 7:16 AM Niharika Kohli wrote:

> On Sat, Jun 30, 2018 at 8:53 PM Greg Grossmeier wrote:
>
>> Hello,
>>
>> Unfortunately we are experiencing spam in our Phabricator instance again and have decided to turn on the requirement for new account approval by Phabricator admins as a mitigation step.
>
> I'd request that it please be kept on until we have some spam mitigation tools. At the very least, easier revert actions.

Indeed. We should *not* remove the approval process until a better anti-vandalism system is available for Phabricator. Repairing the damage that has been done will require a ton of man-hours.

Cheers,
Giuseppe

--
Giuseppe Lavagetto
Senior Technical Operations Engineer, Wikimedia Foundation
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Gerrit login oddities
On Tue, Feb 13, 2018 at 5:56 AM, Chad wrote:

> Hi,
>
> Two quick things:
>
> * First, we updated the cookie path for Gerrit to enable sharing authentication with the new Gitiles repo browser--for most users this has been transparent but a few people have reported problems with being able to login. If this happens, try clearing out any cookies you have for gerrit.wikimedia.org and try logging in anew.

Just to help people not lose their settings and other things - you only need to remove the GerritAccount cookie for gerrit.wikimedia.org, then you'll be able to log in again.

Cheers,
Giuseppe

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] HHVM vs. Zend divergence
On Wed, Sep 20, 2017 at 10:56 PM, Brion Vibber <bvib...@wikimedia.org> wrote:

> On Mon, Sep 18, 2017 at 1:58 PM, Max Semenik <maxsem.w...@gmail.com> wrote:
>
>> 3) Revert WMF to Zend and forget about HHVM. This will result in performance degradation, however it will not be that dramatic: when we upgraded, we switched to HHVM from PHP 5.3 which was really outdated, while 5.6 and 7 provided nice performance improvements.
>
> Migrating WMF's implementation to PHP 7 is probably the way to go. I leave it up to ops to figure out how to make the change. :)

I think this is the more viable option too, mostly in order not to cause huge issues to non-Wikimedia users. PHP is far easier to install and operate on most platforms than HHVM. Just to make an example, with PHP (or even PHP-FPM) I don't remember ever having to check the code of the VM to understand what an ini setting does, while with HHVM that has been a recurring pain, as options come and go without any warning between versions, not to mention the online docs, which can even be plainly misleading. At the moment, there is no doubt PHP is a much friendlier environment for any third-party wiki.

But I think there is other value in going the PHP 7 way; I had actually pitched the idea of switching to PHP 7 for some time now, for various reasons, namely:

- We use HHVM differently than Facebook does. For instance, we use the fcgi server mode, and not the pure http one, and we don't run in repoAuth mode. This has brought us new bugs to solve every time we upgrade, as there is no battle testing in production for those code patterns at scale before we use them.
- The recent, prolonged difficulty of interaction with the FLOSS community, although acknowledged by the HHVM team as something they're willing to fix, is worrying in itself.
- PHP 7 has shown performance comparable to HHVM for most PHP shops that migrated.
So the single most compelling reason for which we migrated (performance) might not be a factor anymore. Using a runtime readily available (and security-patched) from the upstream distribution would make the ops team's lives easier as well. As for the actual migration: I don't think there is any need to panic or rush to a decision, but the timeline is pretty set: by the end of 2018, when official support for HHVM 3.24 ends, any migration should be well underway within the WMF infrastructure. I expect a migration from HHVM to PHP 7 to be a less formidable undertaking than the switch from PHP 5.3 to HHVM - we repaid a good deal of the tech debt in the Wikimedia Foundation installation back then, and we won't have to radically change the way we serve MediaWiki, as PHP 7 works as a FastCGI server as well. Still, it will take time and resources, and it needs to be planned in advance. One important consequence of the announcement for Wikimedia is that until we transition to PHP 7 we will only be able to run MediaWiki versions that remain compatible with PHP 5.x (unless we decide to support both HHVM and PHP 7). This might be important in steering the timing of the change in MediaWiki itself. Cheers, Giuseppe -- Giuseppe Lavagetto, Ph.d. Senior Technical Operations Engineer, Wikimedia Foundation
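Since PHP 7, like HHVM's fcgi mode, can sit behind a web server as a FastCGI backend, the serving topology would stay largely the same. As a rough illustration only (pool name, paths and values below are examples, not the Wikimedia production configuration), a PHP-FPM pool looks like this:

```ini
; Illustrative PHP-FPM pool - example values, not our actual config.
[www]
; The web server (Apache/nginx) talks FastCGI to this socket,
; just as it does with HHVM's fcgi server mode today.
listen = /run/php/php7.0-fpm.sock
; A fixed set of worker processes, sized to the machine.
pm = static
pm.max_children = 50
; Kill requests that run too long; unlike many HHVM ini options,
; PHP-FPM settings like this one are clearly documented upstream.
request_terminate_timeout = 60s
```

The same model - a pool of workers behind a Unix socket - is what we already run with HHVM, which is part of why the switch looks tractable.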
Re: [Wikitech-l] [Ops] [Engineering] The train will resume tomorrow (was Re: All wikis reverted to wmf.8 last night due to T119736)
On Wed, Jul 13, 2016 at 2:15 AM, aude <aude.w...@gmail.com> wrote: > On Tue, Jul 12, 2016 at 7:56 PM, Ori Livneh <o...@wikimedia.org> wrote: >> Our failure to react to this swiftly and comprehensively is appalling and >> embarrassing. It represents failure of process at multiple levels and a lack >> of accountability. > > > This (unbreak now) bug has been open since November. I wonder how this has > been allowed to remain open and not addressed for this long? > I am sure we could've done way better even in our current structure, but it's pretty clear to me that the absence of a team dedicated to MediaWiki itself all but invites such things to happen. Which is pretty absurd, when you remember that 99% of our traffic is still served by it. Cheers G. -- Giuseppe Lavagetto, Ph.d. Senior Technical Operations Engineer, Wikimedia Foundation
[Wikitech-l] Upgrading HHVM to libicu52
Hi all, tomorrow, Thursday May 25th, I will be upgrading our HHVM servers to use a recent version of the ICU[1] library, a long-needed change that we are finally ready to perform: it allows us to stop maintaining an older version ourselves, including having to patch it for any security issue. For details about the rationale and the long process involved, see https://phabricator.wikimedia.org/T86096 While the upgrade should be smooth, the ICU maintainers do not guarantee backward compatibility for collation, so to be sure that is addressed, we will need to run a maintenance script on all wikis that have $wgCategoryCollation set to anything including 'uca', see https://phabricator.wikimedia.org/diffusion/OMWC/browse/master/wmf-config/InitialiseSettings.php;2f61ae1bcffe0f7b8626d544a98eea3c4a7d7905$13676 Since this script takes quite a long time to run[2], there will be some user-facing effects during the transition period; citing what MatmaRex says on the ticket: "After ICU is upgraded, but before the updateCollation script finishes, articles newly added to categories may appear out-of-order on category listing pages. The headings on them might be wrong in funny ways, too. Nothing else should be affected." If no last-minute showstopper blocks the process, I will be starting the procedure around 8:00 UTC, and will log every step of the process in the SAL[3]. Don't hesitate to contact me on IRC (#wikimedia-operations on freenode, user _joe_) if you see some strange behaviour. Thanks in advance for your patience Giuseppe [1] ICU stands for International Components for Unicode [2] It is actually much, much faster to run now than it ever was, thanks to the amazing work others have done to improve it, see https://phabricator.wikimedia.org/T58041 and https://phabricator.wikimedia.org/T130692 [3] https://wikitech.wikimedia.org/wiki/Server_Admin_Log -- Giuseppe Lavagetto, Ph.d.
Senior Technical Operations Engineer, Wikimedia Foundation
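Why a collation change forces re-sorting is easy to see with a toy example. The sketch below is purely illustrative (MediaWiki's category sort keys come from ICU's UCA collations, not from Python): the same set of titles legitimately comes out in different orders under different comparison rules, so every stored sort key has to be recomputed when the rules change.

```python
# Illustrative only: plain codepoint order vs. a case-insensitive,
# locale-style order disagree on the very same titles.
titles = ["Zebra", "apple", "Éclair"]

codepoint_order = sorted(titles)                    # raw Unicode codepoint sort
caseless_order = sorted(titles, key=str.casefold)   # closer to a UCA-style sort

print(codepoint_order)  # ['Zebra', 'apple', 'Éclair']
print(caseless_order)   # ['apple', 'Zebra', 'Éclair']
```

A category page that stored its entries sorted under the first rule would look out of order under the second - exactly the transient symptom MatmaRex describes above.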
Re: [Wikitech-l] Reducing the environmental impact of the Wikimedia movement
On Thu, Mar 31, 2016 at 12:39 AM, Tim Starling <tstarl...@wikimedia.org> wrote: > I think it's stretching the metaphor to call ops a "tight ship". We > could switch off spare servers in codfw for a substantial power > saving, in exchange for a ~10 minute penalty in failover time. But it > would probably cost a week or two of engineer time to set up suitable > automation for failover and periodic updates. > Just a small clarification: I don't think periodically turning servers off and on would be a feasible option, because servers (and computers in general) tend to have a pretty high failure rate when they are power-cycled regularly. We see this with some servers failing every time we do a mass reboot due to some security issue. On the other hand, we could surely do better in terms of idle-server power consumption. In terms of costs and time spent (and probably also natural-resource consumption, though I did no calculation whatsoever) it would probably not be sustainable. > Or we could have avoided a hot spare colo altogether, with smarter > disaster recovery plans, as I argued at the time. Another small clarification: our codfw datacenter is _not_ just a hot spare for disaster recovery, and a lot of work has been done to make the two facilities mostly active-active (and a lot more will be done in the coming year). Cheers, Giuseppe P.S. The server energy footprint of the WMF is negligible compared to the big internet players; even a small-to-medium local ISP probably has a larger footprint than ours. This doesn't mean we should not try to get better, but we should always put things in perspective. -- Giuseppe Lavagetto Senior Technical Operations Engineer, Wikimedia Foundation
Re: [Wikitech-l] [Services] [ANNOUNCEMENT] RESTBase and related services DC switch-over test
On Mon, Mar 14, 2016 at 10:54 PM, Marko Obrovac <mobro...@wikimedia.org> wrote: > Hello, > > The WMF’s technology department has for this quarter the goal of testing and > temporarily switching the main operational data centre from Eqiad (located > in Chicago) to Codfw (located in Dallas)~[1,2]. This includes both > back-end-processing as well as serving live traffic from it. > > As a part of this effort, we are scheduling a switch-over for RESTBase and > its back-end services, including: Parsoid, the Mobile Content Service, > CXServer, Mathoid, Citoid, Apertium and Zotero~[3]. Technically, it will not > be a real switch-over per se, because we will keep all of those services > active in both DCs. However, external traffic will be directed to the Dallas > DC only. > Hi all, just a quick heads-up: given the small issues we experienced last time, which we've found to be unrelated to the switch itself, we have scheduled a new 24-hour switch-over test, starting tomorrow (April 5th) at 14:00 UTC. We don't expect any significant user impact. Anyway, should you have any questions or concerns, don’t hesitate to contact us here or on IRC (#wikimedia-services / #wikimedia-operations @ freenode). Cheers, Giuseppe -- Giuseppe Lavagetto, Ph.d. Senior Technical Operations Engineer, Wikimedia Foundation
Re: [Wikitech-l] [Engineering] Announcing mediawiki-containers, a Docker-based MediaWiki installer
On Sat, Jan 30, 2016 at 9:59 AM, Gabriel Wicke wrote: > Right now, Yuvi is evaluating the Kubernetes cluster manager in labs. Just a clarification: Yuvi has already evaluated Kubernetes, and it is being actively used to build an awesome replacement for at least part of what toollabs does right now. A handful of tools have already been running successfully on it for quite a while. > Its features include scheduling of "pods" (groups of containers) to > hardware nodes, networking, rolling deploys and more. While all these > features provide a very high degree of automation, they also mean that > failures in Kubernetes can have grave consequences. I think operations > are wise to wait for Kubernetes to mature a bit further before > considering it for critical production use cases. > Failures in any complex system are surely scary, but Kubernetes seems stable enough to be evaluated for production use. We also had an unconference session at the WMDS about this - or rather, about what we want to achieve by using Kubernetes as a tool. I will also stress that there are more "mature" cluster/container frameworks like Apache Mesos/Aurora/Marathon, but after taking a hard look at them, Yuvi and I concluded that Kubernetes is far more promising for our use cases. This is still a bit further away in the future, anyway. There is already a phabricator task for this, which is in any case sitting idle at the moment as it's not on our immediate roadmap. The task, by the way, tries to be independent of the specific technology in describing what we actually want to achieve. Kubernetes, like any other product we might use, is just a means to an end, and we should never be in love with any specific technology. https://phabricator.wikimedia.org/T122822 > There is > also some support to run docker images in systemd, which could be an > alternative if we want to avoid the dependency on the docker runtime > in production.
I guess you mean containers can run within systemd, but I don't think just running containers instead of firejail would give us any practical advantage at the moment from an operational perspective - though I might be missing the point. > Lets get together and figure out a plan. Let's do it! Maybe next quarter, when ops are not mostly focused on the datacenter switch, it will be easier :) Cheers, Giuseppe
Re: [Wikitech-l] Port mw-vagrant to Raspberry Pi ( arm )
On Tue, Sep 29, 2015 at 7:04 PM, Tony Thomas <01tonytho...@gmail.com> wrote: > Hello, > [CUT] >3. hhvm is too ram hungry > If I'm not mistaken, HHVM won't compile on anything but an x86-64 architecture, so you definitely need to fall back to Zend. Cheers, G.
Re: [Wikitech-l] Fwd: [Tools] Kubernetes picked to provide alternative to GridEngine
On Thu, Sep 17, 2015 at 4:00 PM, Brian Gerstle wrote: > Congrats on moving forward with a big decision! I'm very optimistic about > containers, so it's exciting to see movement in this area. > > Is there a larger arc of using this for our own services (Mediawiki, > RESTBase, etc.), potentially in production? > Hi Brian, I think we said this somewhere before, but yes, we will consider whether Kubernetes is a viable platform for running services in production. But Kubernetes is much more than just "containers": it is a distributed computing environment directly derived from Google's own Borg system. I think it's a good candidate platform for running some services, but probably not MediaWiki or RESTBase for the time being. Cheers, Giuseppe
[Wikitech-l] Evaluation of clustering solutions (continued)
Hi all, as previously announced, we've been evaluating a clustering solution for use as an alternative to GridEngine for toollabs https://lists.wikimedia.org/pipermail/wikitech-l/2015-August/082853.html Our goal is also to find a suitable, modern, stable tool to run not only toollabs webservices, but also - in the longer term - our microservices in production: a clusterized environment that will allow us to enhance single-service availability and to scale applications more easily, further reducing the friction surface and the direct ops involvement in the day-to-day setup and deployment of services. Our evaluation of the available solutions is ongoing, and while we're mostly done filling up an evaluation spreadsheet (https://docs.google.com/spreadsheets/d/1YkVsd8Y5wBn9fvwVQmp9Sf8K9DZCqmyJ-ew-PAOb4R4/edit?usp=sharing), we would welcome and encourage further involvement/suggestions. You can provide these easily on the tracking ticket for the evaluation, https://phabricator.wikimedia.org/T106475 We received some interesting feedback already, and we look forward to incorporating more! We are considering two solutions - Mesosphere's Marathon (which is based on Mesos) - https://mesosphere.github.io/marathon/ - and Google's Kubernetes https://kubernetes.io. Now let us summarize a bit our findings so far: MESOS/MARATHON: Pros: - Mesos is stable and battle tested, although Marathon is quite young and mostly used in Mesosphere's commercial offering - Supports overcommitting resources (which is important in toollabs, probably less so in production) - Has a nice, clean API and is fully distributed with no potential SPOFs - Chronos is another framework that can run on Mesos and is a great distributed cron Cons: - The multitenancy story is non-existent; it was not designed to be a public PaaS offering. This is an issue even in production if we want to grant independence to single teams.
- Container support seems experimental at best (though getting better in newer versions) - Marathon adoption seems limited and the community is not very lively - Discovery/scaling logic is somewhat limited KUBERNETES Pros: - The design seems to be very well thought out, based on the experience of running Google's internal Borg system (see http://research.google.com/pubs/pub43438.html for details of Google's Borg clustering system). - A pretty refined security model is already implemented, so that single users/teams could be given access to individual namespaces and act independently - The community is very lively, and adoption is gaining momentum: kubernetes is the default way to deploy apps on Google Compute Engine, it's used by Red Hat for its own cloud solution (and they contribute patches to it), and it has a clear roadmap to overcome most of its limitations - Container support is native and technology-agnostic, allowing (for now) Docker and Rkt containers to be used - The API is quite nice - Documentation is decently complete - Google engineers are actively supporting us in evaluating its usage Cons: - The master node is not highly available, although our cluster survived a pretty serious outage in labs that froze the master and wiped out one worker - No overcommitting allowed; it will be possible to mimic it with QoS (coming in the next version) - The ability to schedule one-off jobs is offered, but there is no distributed cron facility - In general it's a younger project with some outstanding bugs As you can see there are pretty big pros/cons for both these technologies, due to the fact that they are still not quite boring - although one could argue that Mesos and Chronos, at least, have entered their boring stage. Our spreadsheet slightly favours Kubernetes at the moment, but that might change drastically if we evaluate that some limitations are absolute showstoppers for us.
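To make the comparison above a bit less abstract, this is roughly what describing a workload looks like on the Kubernetes side: a minimal pod manifest (all names here are placeholders, not anything we actually run). The namespace field is the hook for the per-user/per-team isolation mentioned in the pros.

```yaml
# Minimal Kubernetes pod manifest - placeholder names only.
apiVersion: v1
kind: Pod
metadata:
  name: example-webservice
  # Namespaces are what would let each tool/team act independently.
  namespace: tools-exampleuser
spec:
  containers:
    - name: webservice
      image: example/webservice:latest
      ports:
        - containerPort: 8000
```

The scheduler then picks a worker node for the pod; with Marathon the equivalent description would be a JSON app definition submitted to its REST API.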
In the remainder of this week and the next few ones, we will keep stress-testing both our test installations to flush out surprises and bugs. Let us know what you think - or reach out to us if you want to help with this evaluation process. We will keep you posted! Cheers, Giuseppe & Yuvi
Re: [Wikitech-l] Search errors on multiple Wikimedia projects
On Mon, Jun 15, 2015 at 3:28 PM, Chad innocentkil...@gmail.com wrote: On Mon, Jun 15, 2015 at 12:12 AM Pine W wiki.p...@gmail.com wrote: I'm getting this same error on multiple Wikimedia projects: An error has occurred while searching: Search is currently too busy. Please try again later. Help? The elasticsearch cluster exploded completely. It's mostly recovered now (just waiting for full redundancy) and searches should now be back on for users. Full incident report to follow I imagine. You are correct. I'm not making promises on how soon I'll get to write it, though. Also, we'd probably need to do some root-cause investigation (I think we do have a candidate, but I'd like an ElasticSearch expert to take a more thorough look). I will update this thread with a link to the outage report as soon as it's ready. Cheers Giuseppe
Re: [Wikitech-l] Multimedia team?
While I know it doesn't sound fancy or attractive, I think the multimedia team should have, as one of its focuses, helping with transitioning the imagescalers to HHVM. There seem to be a few issues with that, and support offered as a team commitment, rather than out of goodwill alone, would greatly help. Cheers, Giuseppe On Mon, May 11, 2015 at 4:33 PM, Brian Gerstle bgers...@wikimedia.org wrote: I'm also curious what our audio/video storage/transcoding/playback roadmap is. IMO it's a pretty fundamental feature that isn't well supported in all the clients (especially mobile). Could probably do some interesting audio stuff (e.g. narration in many languages) for the visually impaired. On Mon, May 11, 2015 at 7:42 AM, Jean-Frédéric jeanfrederic.w...@gmail.com wrote: 2015-05-11 10:29 GMT+01:00 Antoine Musso hashar+...@free.fr: On 11/05/15 02:18, Tim Starling wrote: On 10/05/15 07:06, Brian Wolff wrote: People have been talking about vr for a long time. I think there are more pressing concerns (e.g. video). I suspect VR will stay in the video-game or gimmick realm for a while yet Maybe VR is a gimmick, but VRML, or X3D as it is now called, could be a useful way to present 3D diagrams embedded in pages. Like SVG, we could use it with or without browser support. Hello, A potential use case for the encyclopedia would be to display models of chemistry molecules.
An example: http://wiki.jmol.org/index.php/Jmol_MediaWiki_Extension See https://phabricator.wikimedia.org/project/profile/16/ and https://phabricator.wikimedia.org/project/profile/804/ -- Jean-Frédéric -- EN Wikipedia user page: https://en.wikipedia.org/wiki/User:Brian.gerstle IRC: bgerstle
Re: [Wikitech-l] Tor proxy with blinded tokens
Hi Chris, I like the idea in general, in particular the fact that only established editors can ask for the tokens. What I don't get is why this proxy should be run by someone other than the WMF, given - I guess - that it would be exposed as a TOR hidden service, which would effectively mask the user's IP from us and secure their communication from snooping by exit-node operators, and so on. I guess the legitimate traffic on such a proxy would be so low (as getting a token is /not/ going to be automated/immediate even for logged-in users) that it could work without using up a lot of resources. Cheers, Giuseppe
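For readers unfamiliar with the proposal, the core mechanic is an issue/verify split: an established editor obtains a token from the issuer, and the proxy later accepts edits that carry a valid token. The sketch below (Python, purely illustrative) shows only that split using a plain HMAC; note loudly that real blinded tokens use blind signatures precisely so the issuer *cannot* link a token back to the account it was issued to - a property this simplified HMAC version does not have.

```python
import hashlib
import hmac
import secrets

# Toy issue/verify flow for tokens handed out to established editors.
# NOT the actual blinded-token scheme: a blind signature would prevent
# the issuer from linking token to account; HMAC is used here only to
# keep the sketch short and runnable.
SERVER_KEY = secrets.token_bytes(32)

def issue_token() -> str:
    """Issuer side: mint a random nonce and authenticate it."""
    nonce = secrets.token_hex(16)
    mac = hmac.new(SERVER_KEY, nonce.encode(), hashlib.sha256).hexdigest()
    return f"{nonce}.{mac}"

def verify_token(token: str) -> bool:
    """Proxy side: accept an edit only if the token authenticates."""
    nonce, mac = token.split(".")
    expected = hmac.new(SERVER_KEY, nonce.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(mac, expected)

token = issue_token()
print(verify_token(token))         # True
print(verify_token("bogus.cafe"))  # False
```

The rate-limiting property Giuseppe relies on falls out of issuance being slow and manual, not out of anything in the verification step.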
Re: [Wikitech-l] 503 errors in Phabricator
Hi, I'm using phabricator regularly this morning (including doing pretty advanced searches) and I cannot reproduce the problem, but I am surely no expert. Is it still ongoing? Which URLs in particular are giving you 503s? Cheers Giuseppe On Fri, Mar 6, 2015 at 9:02 AM, Pine W wiki.p...@gmail.com wrote: I'm repeatedly getting 503 errors when attempting to search Phabricator. Can someone check into this please? Pine
Re: [Wikitech-l] Investigating building an apps content service using RESTBase and Node.js
On Wed, Feb 4, 2015 at 6:59 AM, Marko Obrovac mobro...@wikimedia.org wrote: On Tue, Feb 3, 2015 at 8:42 PM, Tim Starling tstarl...@wikimedia.org wrote: I don't really understand why you want it to be integrated with RESTBase. As far as I can tell (it is hard to pin these things down), RESTBase is a revision storage backend and possibly a public API for that backend. Actually, RESTBase's logic applies to the Mobile Apps case quite naturally. When a page is fetched and transformed, it can be stored so that subsequent requests can simply retrieve the transformed document from storage. Ok, so in this vision RESTBase is a caching layer for revisions. I thought the idea of SOA was to separate concerns. Wouldn't monitoring, caching and authorization be best done as a node.js library which RESTBase and other services use? Good point. Ideally, what we would need to do is provide the right tools to developers to create services, which can then be placed strategically around DCs (in cooperation with Ops, ofc). For v1, however, we plan to provide only logical separation (to a certain extent) via modules which can be dynamically loaded/unloaded from RESTBase. In return, RESTBase will provide them with routing, monitoring, caching and authorisation out of the box. The good point here is that this 'modularisation' eases the transition to a more-decomposed orchestration SOA model. Going in that direction, however, requires some prerequisites to be fulfilled, such as [1]. So, now RESTBase has become a router and an auth provider as well? (Gabriel already clarified to me that monitoring means RESTBase will expose its own metrics for that specific service, so this is not monitoring of the service at all, but rather accounting.) I need some clarification at this point - what is RESTBase really going to do? I'm asking because when I read "RESTBase will provide them with routing, [...] and authorisation" I immediately think of a request router and a general on-wiki auth provider.
And we already have both, and re-doing them in RESTBase would be plainly wrong. Maybe you mean something very specific when you say routing, and not request routing, which is what everybody here will think of. And when you say auth, maybe you mean that RESTBase implements an auth scheme for its clients, so that no client can access data from another one. If this is the case, I have some further questions: is this going to be RBAC? Which permission models are you implementing? Are we sure it is what we will need? And foremost: will this be exposable to external consumers? Will it be able to hook up to our traditional wiki auth scheme? Can you please expand a bit on those concepts? Otherwise a lot of confusion, uncertainty and doubt will spread amongst your fellow engineers, resulting in a hostile view of what may be a perfectly well designed piece of software. Cheers, Giuseppe
[Wikitech-l] Microservices/SOA: let's continue the discussion
Hi all, ever since the Dev Summit discussions on SOA/microservices[1], I have been pondering the outcomes, and I want to post some afterthoughts to these lists. Having been one of the most vocal in raising concerns about microservices, and having had experience with a heavily service-oriented web platform before, I think I owe my fellow engineers some lengthier explanations. Also, let me say that I am very happy with both the discussions we had at the Dev Summit and their outcomes - including the fact that the Ops and Services teams both share the desire to work closely together on this. I tried to write down some thoughts about this, and ended up with a way-too-long email. So I decided to put up a page on wikitech here: https://wikitech.wikimedia.org/wiki/User:Giuseppe_Lavagetto/MicroServices Apart from my blabbing, I have three questions on our strategy: How, when, what? None of this is clear to me as of today, and I wonder if anyone has a clear picture of where we want to be in 6-to-12 months with microservices. If someone has a clear plan, please speak up so that we can tackle the challenges ahead of us on a practical basis, and not just based on some grand principles :) Cheers Giuseppe [1] I prefer the latter term, probably because SOA sounds bloated to me, and reminds me of enterprise software architectures that I don’t like.
Re: [Wikitech-l] Investigating building an apps content service using RESTBase and Node.js
On Wed, Feb 4, 2015 at 5:42 AM, Tim Starling tstarl...@wikimedia.org wrote: On 04/02/15 12:46, Dan Garry wrote: To address these challenges, we are considering performing some or all of these tasks in a service developed by the Mobile Apps Team with help from Services. This service will hit the APIs we currently hit on the client, aggregate the content we need on the server side, perform transforms we're currently doing on the client on the server instead, and serve the full response to the user via RESTBase. In addition to providing a public API end point, RESTBase would help with common tasks like monitoring, caching and authorisation. I don't really understand why you want it to be integrated with RESTBase. As far as I can tell (it is hard to pin these things down), RESTBase is a revision storage backend and possibly a public API for that backend. I thought the idea of SOA was to separate concerns. Wouldn't monitoring, caching and authorization be best done as a node.js library which RESTBase and other services use? I agree with Tim. Using RESTBase as an integration layer for everything is SOA done wrong. If we need an authorization system which is different from our APIs, we need to build it separately, not add levels of indirection. Doing four things from one single service is basically rebuilding the MediaWiki monolith, only in a different language :) What you need, IMO, is a thin proxy layer in front of all the separate APIs you have to call, including RESTBase for caching/revision storage. It may be built into the app or, if it is consumed by multiple apps, built as a thin proxy service itself. (I also don't get what monitoring means here, but someone could probably explain it to me) Cheers, Giuseppe
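For what it's worth, the "thin proxy layer" in that last paragraph is not much more than a fan-out-and-merge step in front of the existing APIs. A minimal sketch (Python, with stand-in backend names - the real thing would do HTTP calls, timeouts and error handling, in node.js or whatever the apps team prefers):

```python
# Toy "thin aggregation proxy": fan out to several backend fetchers and
# merge the pieces into one response for the app. The fetchers here are
# hypothetical stand-ins for real calls to the MW API, RESTBase, etc.
def aggregate(page, fetchers):
    """Call each backend fetcher for `page` and merge results by name."""
    response = {"page": page}
    for name, fetch in fetchers.items():
        response[name] = fetch(page)
    return response

# Hypothetical backends, keyed by the section of the merged response
# they populate.
backends = {
    "summary": lambda page: {"extract": f"Summary of {page}"},
    "media": lambda page: [f"{page}.jpg"],
}

print(aggregate("Rome", backends))
```

The point of keeping it this thin is that it owns no storage, no auth and no routing policy of its own - those concerns stay with the services it fronts.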
Re: [Wikitech-l] All non-api traffic is now served by HHVM
On 09/12/14 23:10, Brian Wolff wrote: Awesome. Any chance the video scalers could be put near the top of the list for servers to upgrade Ubuntu on? The really old version of libav on those servers is causing problems for people uploading videos in certain formats. Since the API and appservers are done, we're left with the jobrunners (for which the conversion is already done), the imagescalers and the videoscalers. We are working right now on the imagescaler conversion; it will require some preparation work and some testing, but hopefully it won't take too long. Cheers, Giuseppe -- Giuseppe Lavagetto Wikimedia Foundation - TechOps Team
Re: [Wikitech-l] All non-api traffic is now served by HHVM
On 03/12/14 18:03, Giuseppe Lavagetto wrote: Hi all, [CUT] The API traffic is still being partially served by mod_php, but that will not be for long! As promised, all our API traffic is now on HHVM as well. The effects on CPU usage have been quite drastic on this cluster, where the load is higher: http://bit.ly/1Abwwzi Cheers, Giuseppe -- Giuseppe Lavagetto Wikimedia Foundation - TechOps Team
[Wikitech-l] All non-api traffic is now served by HHVM
Hi all, it's been quite a journey since we started working on HHVM, and last week (November 25th) HHVM was finally enabled for all users who hadn't opted in to the beta feature. Starting on Monday, we began reinstalling all 150 remaining servers that were running Zend's mod_php, upgrading them from Ubuntu precise to Ubuntu trusty in the process. It seemed like an enormous task that would take me weeks to complete, even with the improved automation we built lately. Thanks to the incredible work by Yuvi and Alex, who helped me basically around the clock, today around 16:00 UTC we removed the last of the mod_php servers from our application server pool: all non-API traffic is now being served by HHVM. This new PHP runtime has already halved our backend latency and page save times, and it has also significantly reduced the load on our cluster (as I write this email, the average CPU load on the application servers is around 16%, while it was easily above 50% in the pre-HHVM era). The API traffic is still partially served by mod_php, but that will not be for long! Cheers, Giuseppe -- Giuseppe Lavagetto Wikimedia Foundation - TechOps Team
Re: [Wikitech-l] Tor and Anonymous Users (I know, we've had this discussion a million times)
On 30/09/14 23:02, Marc A. Pelletier wrote:
> On 09/30/2014 09:08 AM, Derric Atzrott wrote:
>> [H]ow can we quantify the loss to Wikipedia, and to society at large,
>> from turning away anonymous contributors? Wikipedians say 'we have to
>> blacklist all these IP addresses because of trolls' and 'Wikipedia is
>> rotting because nobody wants to edit it anymore' in the same breath,
>> and we believe these points are related.
>
> I've been doing admin work on enwiki since 2007 and I can give you two
> anecdotal data points:
>
> (a) Previously unknown Tor endpoints get found out because they are
> invariably the source of vandalism and/or spam.
>
> (b) I have never seen a good edit from a Tor endpoint. Ever.
>
> A third one I can add since I have held checkuser (2009):
>
> (c) I have never seen accounts created via Tor, or that edited through
> Tor, that weren't demonstrably block evasion, vandalism or (most often)
> spamming.
>
> None of this is Tor-specific: the same observations apply to open
> proxies in general, and to almost all hosted servers. Long blocks of
> open proxies or co-lo ranges that expire after *years* of being blocked
> invariably start spewing spam and vandalism, often the very day the
> block expires.

Hi Marc :)

I know I don't need to convince you that Tor is a good thing in general. Still, I don't see how the abusive nature of what is being done via Tor makes it less valuable to our community, particularly in the post-Snowden era. Even leaving aside countries where freedom of speech is not legally guaranteed, it is reasonable to assume that someone making an edit that may look 'unfriendly' to the US or UK governments will feel uncomfortable doing so without Tor.

If, as it seems right now, the problem is technical (weeding out the bots and vandals) rather than ideological (we do allow anonymous contributions, after all), we can find a way to let people edit any Wikipedia via Tor while minimizing the amount of vandalism allowed.
Of course, let's not kid ourselves: it will probably require some special measures, and editing via Tor would probably end up not being as easy as editing from a public-facing IP. We could, for example, restrict publishing via Tor to users who have logged in and have made 5 good edits reviewed by others, or we could apply modern bot-detection techniques in that case; those are just ideas.

Cheers,

Giuseppe
--
Giuseppe Lavagetto
Wikimedia Foundation - TechOps Team
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
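To make the first idea concrete, here is a minimal sketch of such a gate: allow publishing via Tor only for logged-in users with at least 5 edits reviewed as good by others. All names and the threshold are hypothetical illustrations of the proposal, not an actual MediaWiki API:

```python
from dataclasses import dataclass


@dataclass
class User:
    """Hypothetical minimal view of an account for this sketch."""
    name: str
    reviewed_good_edits: int  # edits marked as good by other editors


# Threshold taken from the proposal above ("5 good edits reviewed by others").
MIN_REVIEWED_GOOD_EDITS = 5


def may_edit_via_tor(user):
    """Return True if this request, known to come from a Tor exit,
    should be allowed to publish an edit."""
    # Anonymous (not logged-in) publishing from Tor stays blocked.
    if user is None:
        return False
    # Logged-in users qualify once enough of their edits have been
    # reviewed as good by other editors.
    return user.reviewed_good_edits >= MIN_REVIEWED_GOOD_EDITS
```

A real implementation would also need the Tor-exit detection itself (e.g. matching the request IP against a published exit list) and could layer bot-detection on top, as suggested above.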