Joe added a comment.
Well, it turns out the issue was simpler: we even had a TODO in the code:
# TODO: add mw-on-k8s once we think of moving wikidata or partial traffic.
Sigh. Thanks @Lucas_Werkmeister_WMDE for noticing, this will be fixed as soon
as I get a review.
Joe added a comment.
I tried restarting ATS on a backend, cp1081, then made requests for
Wikidata's Special:Random to trafficserver directly: still all going to
appservers on bare metal.
So the problem isn't in `mw-on-k8s.lua`, apparently...
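For context, `mw-on-k8s.lua` is a Lua remap plugin running inside ATS. As a toy sketch only, host-based gating in such a script might look like the following, using the Traffic Server Lua plugin API; the hostnames, routing target, and port are invented for illustration and not taken from the real script:

```
-- Toy sketch of host-based routing in an ATS Lua remap script.
-- Hostnames, target, and port below are invented for illustration.
local K8S_HOSTS = {
  ["test.wikipedia.org"] = true,
  -- per the TODO above, wikidata would be added here once we decide
  -- to move it (or a share of its traffic) to mw-on-k8s
}

function do_remap()
  local host = ts.client_request.header['Host']
  if host ~= nil and K8S_HOSTS[host] then
    -- route this request to the kubernetes ingress instead of the
    -- bare-metal appservers
    ts.client_request.set_url_host('mw-web.discovery.wmnet')
    ts.client_request.set_url_port(4450)
    return TS_LUA_REMAP_DID_REMAP
  end
  return TS_LUA_REMAP_NO_REMAP
end
```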
Joe added a comment.
Interestingly, I do get correct results for m.wikidata.org, but somehow not
for www.wikidata.org (also, please grep for `mw-web` as we've repooled eqiad in
the meantime).
This makes the whole thing even more puzzling tbh.
Joe added a comment.
@Jdforrester-WMF no, this task is actually about that patch not having the
effect we expected.
Joe added a comment.
Yes, localhost:6008 is pointing to `termbox.discovery.wmnet:4004` in
production.
The problem doesn't seem to be in termbox, as we could both fetch the data
from the service without issues. So the issue doesn't seem to be related to the
switch to use me
Joe closed this task as "Resolved".
Joe added a comment.
Termbox has been migrated
Joe added a comment.
Just deployed the change to termbox-test, and I still see my test url
`http://termbox-test.staging.svc.eqiad.wmnet:3031/termbox?entity=Q229877&revision=630197&language=en&editLink=%2Fwiki%2FSpecial%3ASetLabelDescriptionAliases%2FQ229877&prefe
Joe changed the status of subtask T172497: Fix mediawiki heartbeat model,
change pt-heartbeat model to not use super-user, avoid SPOF and switch
automatically to the real master without puppet dependency from
"Open" to "Stalled".
Joe added a comment.
Re-thinking about this: what we're really interested in is the maximum lag
of a server that is receiving user traffic.
So I crafted the following metric in prometheus:
`max(time() - blazegraph_lastupdated and
rate(blazegraph_queries_done_total
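The expression above is cut off in the digest. A plausible completed form, matching the stated intent (only consider servers whose query counter is still moving, i.e. servers actually receiving traffic), might be:

```
# Hypothetical completion of the truncated expression: the highest update
# lag among servers that answered queries in the last 5 minutes.
max(
    (time() - blazegraph_lastupdated)
  and
    (rate(blazegraph_queries_done_total[5m]) > 0)
)
```

The `and` operator keeps a lag value only for instances where the right-hand condition also holds, so depooled servers with an idle updater drop out of the max.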
Joe renamed this task from "Depooled servers may still be taken into account
for query service maxlag" to "Query service maxlag calculation should exclude
datacenters that don't receive traffic and where the updater is turned off".
Joe updated the task description.
Joe edited projects, added serviceops; removed Sustainability (Incident
Followup), SRE.
Joe triaged this task as "Medium" priority.
Joe added a comment.
In T331405#8672360 <https://phabricator.wikimedia.org/T331405#8672360>,
@dcausse wrote:
> In T331405#8672341 <https://phabricator.wikimedia.org/T331405#8672341>,
@Joe wrote:
>
>> Updates shouldn't depend on where the discovery dns
Joe added a comment.
To ensure I understood your problem correctly: why were those servers not
getting updated anymore?
Updates shouldn't depend on where the discovery dns record points to, but
rather go to the local datacenter directly.
I think the bug here is with wdqs-up
Joe closed subtask T318918: Undeploy patch to use old PHP serialization in PHP
7.4 as "Resolved".
Joe changed the task status from "Open" to "Stalled".
Joe removed Joe as the assignee of this task.
Joe added a comment.
Hi, any news on this front? I'll release this bug as its completion doesn't
depend on me right now. When the functionality has been merged,
Joe changed the status of subtask T238751: Only generate maxlag from pooled
query service servers. from "Open" to "Stalled".
Joe added a comment.
Sadly I had to revert, because the `--lb` and `--lb-pool` options are not
recognized by the script.
mwmaint1002:~$ /usr/local/bin/mwscript
extensions/Wikidata.org/maintenance/updateQueryServiceLag.php --wiki
wikidatawiki --cluster wdqs --prometheus
Joe added a comment.
Given the changes we've made to puppet in the meantime, I am now able to feed
the right parameters to the script if we want to.
The following patch
https://gerrit.wikimedia.org/r/c/operations/puppet/+/797077
will cause these corresponding changes in p
Joe added a comment.
In T301471#7726097 <https://phabricator.wikimedia.org/T301471#7726097>,
@Michaelcochez wrote:
> @Joe for the base image, would you recommend our current approach of
starting from an 'empty' image and downloading the latest go distribution
ours
Joe added a comment.
Ok so a few requirements:
1. We need the repository to be on Gerrit, and to include a `.pipeline`
directory so it can be built using blubber/the deployment pipeline.
2. You should probably base your image on Debian bullseye rather than Debian
stretch, but that can be done
Joe added a comment.
Hi, if this service is to be used in the WMF production environment (and
given the call graph, it will), it needs to run on kubernetes, and thus it
will need to be built using our deployment pipeline first and use the
deployment-charts repository to define the
Joe added a comment.
Given my opposition to the plan as proposed in this task, I've been asked to
explain it in more detail here.
First of all, I want to say that IMHO things would have gone more smoothly if
you had asked SRE for an opinion about the plan before it was put in motion. Keep
Joe added a comment.
I think there are two options, depending on the level of security we want to
achieve and the urgency of bringing this to production:
1. We just point to the current installation and it should just work(TM). But
we'd need to perform a small migration afterwards
Joe added a comment.
I would say this needs a more thorough change of how we use envoy - I'm
specifically thinking of doing something more transparent like istio does. But
given I don't think we'll get around doing that soon enough for your timeline,
I'd advise to skip
Joe closed this task as "Resolved".
Joe added a comment.
Data on the number of apcu gets/s normalized after the release:
https://grafana-rw.wikimedia.org/goto/4-r5Lhk7z
I'm going to optimistically resolve this bug.
Joe added a comment.
In T285634#7183188 <https://phabricator.wikimedia.org/T285634#7183188>,
@daniel wrote:
>> Scavenging the production logs, we found that Special:EntityData requests
for rdf documents were possibly the culprit.
>
> Did the code change, or is
Joe added a comment.
Scavenging the production logs, we found that `Special:EntityData` requests
for rdf documents were possibly the culprit.
This is the result of profiling
http://www.wikidata.org/wiki/Special:EntityData/Q146190.rdf :
https://performance.wikimedia.org/xhgui
Joe added a comment.
I think what @Addshore just found is a good candidate for being the source of
the issue.
I'll try and get some more info from apcu on servers, although they've all
been recently restarted to ease the pressure. I will take several snapshots of
the apc
Joe added a comment.
There is definitely something going very wrong with memcached:
https://grafana.wikimedia.org/d/00316/memcache?viewPanel=60&orgId=1&from=now-30d&to=now
shows misses increasing across the board
Joe added a subscriber: Pchelolo.
Joe added a comment.
In T281480#7046160 <https://phabricator.wikimedia.org/T281480#7046160>, @Joe
wrote:
> Given we only make requests to external storage when parsercache has a
miss, it seemed sensible to look for corresponding patterns in pa
Joe added a comment.
Given we only make requests to external storage when parsercache has a miss,
it seemed sensible to look for corresponding patterns in parsercache.
I see we introduced a new category of misses on the same date
"miss_absent_metadata", see
https
Joe added a comment.
To clarify a bit - restbase has hourly spikes of requests for the `feed`
endpoint, which go back to wikifeeds, which calls both restbase and the action
api.
From the graphs of calls from wikifeeds it's clear we have hourly peaks
happening at :00 in the numb
Joe closed this task as "Resolved".
Joe added a comment.
Reporting here in brief:
- We confirmed the problem had to do with activating firejail for all
executions of external programs. That triggered a kernel bug
- This kernel bug can be bypassed by disabling kernel memory
Joe claimed this task.
Joe closed subtask Restricted Task as "Resolved".
Joe added a comment.
To test the hypothesis that this is related to firejail use, we're sending 1
req/s to one appserver to use pygments, to see if that has any particularly ill
effect.
Joe added a comment.
In T260329#6382296 <https://phabricator.wikimedia.org/T260329#6382296>,
@Ladsgroup wrote:
> For the wikibase part, I highly doubt it, the php entry point calls
`wfLoadExtension` internally.
I strongly doubt it has any effect as well, but I'd pref
Joe updated the task description.
Joe added a comment.
The list of software updated that day on the appservers is at P12221
<https://phabricator.wikimedia.org/P12221>
Joe created this task.
Joe added projects: serviceops, Operations, Sustainability (Incident Followup),
Platform Engineering, Wikidata.
TASK DESCRIPTION
Something induced a progressive memory leak on all servers serving MediaWiki
starting on August 4th between 08:00 and 13:00 UTC.
The problem is
Joe triaged this task as "Unbreak Now!" priority.
Joe added projects: Platform Engineering, Wikidata.
Joe added a comment.
I'm not 100% sure that slabs are the problem here, but I'll try to follow up
later.
In the meantime, the servers we rebooted yesterday ar
Joe triaged this task as "High" priority.
Joe claimed this task.
Joe added a comment.
So, while I find the idea of using poolcounter to limit the editing
**concurrency** (it's not rate-limiting, which is different) a good proposal,
and in general something desirable to have (including the possibility we tune
it down to zero if we're in a
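For illustration, such a concurrency cap could be expressed through MediaWiki's existing `$wgPoolCounterConf` mechanism; the pool name and all numbers below are hypothetical:

```
// Hypothetical pool limiting how many Wikibase edits may be processed
// at once (a concurrency cap, not a per-second rate limit). Tuning
// 'workers' down to 0 would effectively pause edit processing.
$wgPoolCounterConf['WikibaseEdit'] = [
	'class'    => 'PoolCounter_Client', // backed by the poolcounterd daemon
	'timeout'  => 2,   // seconds an edit may wait for a free slot
	'workers'  => 8,   // concurrent holders per pool key
	'maxqueue' => 20,  // beyond this, fail fast instead of queueing
];
```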
Joe added a comment.
> - "The above suggests that the current rate limit is too high," this is not
correct, the problem is that there is no rate limit for bots at all. The group
explicitly doesn't have a rate limit. Adding such ratelimit was tried and
caused lots of
Joe added a comment.
I think the right way to do this would be to emit an htmlCacheUpdate job for
every Wikidata edit in the interval (see the sketch after this list).
These will:
- recursively find all linked pages (not sure this works for wikibase items
though - you might know that better)
- invalidate their
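A minimal sketch of what emitting those jobs could look like, using `HTMLCacheUpdateJob` from MediaWiki core; the loop and variable names are hypothetical, and, as noted in the list above, whether `pagelinks` is the right backlink table for Wikibase items is an open question:

```
// Hypothetical backfill: for each entity page edited in the affected
// window, queue a job that fans out to the pages linking to it.
foreach ( $editedEntityTitles as $title ) {
	JobQueueGroup::singleton()->push(
		// Recursively enumerates linking pages via the given backlink
		// table and invalidates their parser cache / CDN entries.
		HTMLCacheUpdateJob::newForBacklinks( $title, 'pagelinks' )
	);
}
```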
Joe added a comment.
I would like to read an assessment of why our current event processing
platform, change-propagation, is not suited for this purpose and we need to
introduce new software. I suppose this has been done at some point in another
task; if so, a quick link would suffice
Joe added a comment.
In T240884#5813174 <https://phabricator.wikimedia.org/T240884#5813174>,
@Daimona wrote:
> In T240884#5810160 <https://phabricator.wikimedia.org/T240884#5810160>,
@sbassett wrote:
>
>> In T240884#5810094 <https://phabricator.wi
Joe added a comment.
I think the main question to answer is "does it make sense to create a safe
regex evaluation service?".
I think in a vacuum the answer is "no". It could make sense to create a small
C++ program wrapping the main re2 functionality and shell out to
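To give an idea of how small such a wrapper could be, here is a sketch against the real re2 C++ API; the command-line interface and exit codes are invented:

```
// Sketch of a tiny re2 wrapper: exit 0 on match, 1 on no match, 2 on a
// pattern that fails to compile. re2 matches in linear time, so a
// hostile pattern cannot trigger catastrophic backtracking.
#include <re2/re2.h>
#include <iostream>

int main( int argc, char** argv ) {
	if ( argc != 3 ) {
		std::cerr << "usage: " << argv[0] << " <pattern> <text>\n";
		return 2;
	}
	RE2 pattern( argv[1] );
	if ( !pattern.ok() ) { // invalid or unsupported regex
		std::cerr << "bad pattern: " << pattern.error() << "\n";
		return 2;
	}
	return RE2::PartialMatch( argv[2], pattern ) ? 0 : 1;
}
```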
Joe added a comment.
In T240884#5789392 <https://phabricator.wikimedia.org/T240884#5789392>,
@Ladsgroup wrote:
>> Though this is mainly an implementation detail and not significant in
terms of requirements or pros/cons.
>
> I disagree for a couple of reasons: gRPC is
Joe added a comment.
In T237319#5681384 <https://phabricator.wikimedia.org/T237319#5681384>,
@darthmon_wmde wrote:
> Is there anything that we can quickly do on wikibase to fix this?
> if so, please advise what concretely.
> Thanks!
In general, whenever your
Joe added a comment.
In T237319#5677665 <https://phabricator.wikimedia.org/T237319#5677665>,
@Vgutierrez wrote:
> I find this pretty worrisome for the following reasons:
>
> 1. right now we have one remap rule that catches all the requests handled
by appservers-rw.
Joe closed this task as "Resolved".
Joe added a comment.
I have just tested and I can easily run `helmfile diff` on termbox now, in
all environments. Resolving for now
Joe added a comment.
@Tarrow @Pablo-WMDE can someone try the release to staging? I should have
fixed the rbac roles there, which should have fixed your issues.
I am proceeding with releasing the change on the main clusters too, in the
meanwhile.
Joe added a comment.
@Tarrow if it's an urgent bugfix we can just revert the change to let you
deploy immediately. Please let's coordinate on IRC, and sorry for the
inconvenience :)
Joe claimed this task.
Joe triaged this task as "High" priority.
Joe added a comment.
@Jakob_WMDE this is a result of our temporary fix for a CVE affecting
kubernetes. We will try to revert to the previous setup tomorrow. Thanks for
your patience.
Joe added a comment.
In T231089#5470160 <https://phabricator.wikimedia.org/T231089#5470160>,
@Krinkle wrote:
> Smells like T229433 <https://phabricator.wikimedia.org/T229433>. Which is
also about `''` array index, and PHP 7.2. It's obviously a bug in PHP 7
Joe added a comment.
In T231089#5445485 <https://phabricator.wikimedia.org/T231089#5445485>,
@Ladsgroup wrote:
> Probably we can just merge this to T224491: PHP 7 corruption during
deployment (was: PHP 7 fatals on mw1262)
<https://phabricator.wikimedia.org/T224491>
Joe added a comment.
So the real issue was:
- termbox **correctly** uses the `api-ro.discovery.wmnet` host
- the discovery record was **incorrectly** set to active-active
- so requests from termbox would just go to the nearest dc, meaning that in
codfw it would face super-cold caches
Joe added a comment.
In T232035#5467309 <https://phabricator.wikimedia.org/T232035#5467309>,
@Tarrow wrote:
> I think this is probably the same as T229313
<https://phabricator.wikimedia.org/T229313>. We suspected it might be related
to T231011 <https://phabricator.wik
Joe added subscribers: Pchelolo, Joe.
Joe added a comment.
One comment I can add is that if recentchanges jobs are being slow/delayed
because of queueing, you should see the effect on all wikis and not just one,
given how the jobqueue is configured.
Pinging @Pchelolo about the status of
Joe added a comment.
So, after turning off php7 this morning we saw no change in the rate of
requests to mc1033.
It seems extremely probable that the switch of larger wikis to .wmf3 is what
caused this regression.
Joe added a comment.
@kostajh for now I'm switching off php7 for other investigations, so we will
know immediately if the additional traffic is due to that or not.
Joe added a comment.
I think I know what happened here - and it's possibly related to T223180
<https://phabricator.wikimedia.org/T223180>.
PHP7's APC memory was perfectly ok when I looked into it (and we just had the
beta feature enabled), but it's not suffic
Joe closed this task as "Resolved".
Joe added a comment.
I fixed the configuration of cpjobqueue in deployment-prep, restarted the
service, and verified requests are not getting through to the jobrunner:
2019-04-19T10:59:07 10234170172.16.4.124
proxy:fcgi://127.
Joe claimed this task.
Joe added a comment.
FWIW, I don't think we need the TLS configuration in beta. I can try to
simplify things. Sorry for not noticing this bug earlier, but adding
#operations <https://phabricator.wikimedia.org/tag/operations/> or better
#serviceops <https://phabricator.wik
Joe added a comment.
In T212189#5053087 <https://phabricator.wikimedia.org/T212189#5053087>,
@RazShuty wrote:
> Hey @akosiaris, not sure I see it in there, maybe I'm lost a bit... can you
point me out to where the SSR is in
https://www.mediawiki.org/w/i
Joe added a comment.
In T214362#4967944 <https://phabricator.wikimedia.org/T214362#4967944>,
@Addshore wrote:
>
> We currently still want to be able to compute the check on demand, either
because the user wants to purge the current constraint check data, or i
Joe added a comment.
In T213318#4954262, @dbarratt wrote:
> In T213318#4953461, @Smalyshev wrote:
>
>> I frankly have a bit of a hard time imagining an IT person of the kind that commonly installs smaller wikis being able to efficiently maintain a zoo of services that we're now running in WMF. I
Joe added a comment.
In order to better understand your needs, let me ask you a few questions:
- Do we need/want just the constraint check for the latest version of the item, or one for each revision?
- How will we access such constraints? Always by key and/or full dump, or other access patterns can
Joe added a comment.
In T213318#4888367, @Nikerabbit wrote:
> Do I understand this correctly, that this would add a mandatory Nodejs service to run a Wikibase installation? Is there no client-side rendering support planned initially? As a sysadmin for a couple of third-party wikis (some of which use
Joe added a comment.
In T213318#4885332, @daniel wrote:
> So, in conclusion, Wikidata has a lot of edits, but several orders of magnitude fewer views than a Wikipedia of comparable size. So, while MediaWiki generally optimizes for heavy read loads, the Wikidata UI should be optimized for frequent edits, but
Joe added a comment.
Moving (even part of) the presentation layer outside of MediaWiki raises quite a few questions we have to make important decisions about.
But in the case of Wikidata, I can see how an exception could be made:
- It's not part of the core functionality of MediaWiki
- Its
Joe added a comment.
In T212189#4840039, @WMDE-leszek wrote:
> To avoid misunderstandings: I was not questioning MediaWiki's action API being performant. By "lightweight" I was referring to the "PHP has high startup time" point @daniel made above as one of the reasons why no s
Joe added a comment.
In T212189#4839848, @WMDE-leszek wrote:
> The intention of introducing the service is not to have a service that calls MediaWiki. As discussed above, the service needs to ask for some data, and this data shall be provided by some API. Currently, the only API that
Joe added a comment.
Also, if we're going to build microservices, I'd like to not see applications that "grow", at least in terms of what they can do. A microservice should do one thing and do it well. In this case, it's using data from mediawiki to render an HTML fragment
Joe added a comment.
In T212189#4838090, @Addshore wrote:
> The "termbox" is more of an application than a template.
> Only it knows which data it needs - actively "sending" data to it requires knowledge of which information is needed.
While seemingly trivial in the beginni
Joe added a comment.
Let me state it again: the SSR service should not need to call the MediaWiki API. It should accept all the information needed to render the termbox in the call from MediaWiki.
So we should have something like (sketched below):
- MediaWiki makes a POST request to SSR, sending the entity data
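A sketch of that call shape from the MediaWiki side; the payload fields, the variables, and the use of `Http::post()` are illustrative assumptions, not the actual termbox contract:

```
// Illustrative only: MediaWiki pushes everything the SSR service needs,
// so the service never has to call back into api.php.
$payload = [
	'entity'             => $entityData,      // labels, descriptions, aliases
	'language'           => $displayLanguage, // e.g. "en"
	'preferredLanguages' => $languageFallbackChain,
];
$html = Http::post( $ssrServiceUrl, [
	'postData' => json_encode( $payload ),
	'timeout'  => 2, // fail fast; client-side rendering is the fallback
] );
if ( $html !== false ) {
	// Embed the rendered termbox fragment into the page output.
	$out->addHTML( $html );
}
```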
Joe added a comment.
In T212189#4833482, @daniel wrote:
> I agree with Joe that it would be better to have the service be internal, and be called from MW. It doesn't have to be that way, but it's preferable because:
> - we would not expose a new endpoint
> - we should in general avoid (more
Joe added a comment.
In T212189#4831959, @mobrovac wrote:
> In T212189#4831314, @daniel wrote:
>
>> @mobrovac Please note that the term box is shown based on user preferences (languages spoken); the initially served DOM however needs to be the same for all users, so it can be cached. Also note that the
Joe added a comment.
Also: it is stated in https://wikitech.wikimedia.org/wiki/WMDE/Wikidata/SSR_Service that "In case of no configured server-side rendering service or a malfunctioning of it, the client-side code will act as a fallback". This is a bit the other way around with respect
Joe added a comment.
Looking at the attached diagrams, it seems that the flow of a request is as follows:
- the page gets requested from MediaWiki
- MW sends a request to the rendering service
- the rendering service sends request(s) to MediaWiki via api.php to fetch the data, and sends back the rendered
Joe added a project: serviceops.
Joe added a comment.
In T205865#4630573, @Joe wrote:
> Have you checked if the latest changes didn't just switch execution from HHVM to PHP7? That could explain the better performance.
I can answer this: no, they did not. We launch mwscript for the dispatcher with PHP='hhvm -vEval.Jit=1
Joe added a comment.
Have you checked if the latest changes didn't just switch execution from HHVM to PHP7? That could explain the better performance.
Also, can I ask which redis servers are interacted with? I guess the ones for the locking system, right?
Joe added a comment.
In T188045#4007098, @Smalyshev wrote:
> I wonder if it's possible to use one of the new servers we're getting in T187766 to restore full capacity if debugging what is going on with 1004 takes time. Would it be a good thing to do?
If losing one server out of 4 i
Joe added a project: User-Joe.
Joe added a comment.
oblivian@terbium:~$ /usr/local/bin/foreachwikiindblist /srv/mediawiki/dblists/group1.dblist showJobs.php --group | awk '{if ($3 > 1) print $_}'
cawiki: refreshLinks: 104355 queued; 3 claimed (3 active, 0 abandoned); 0 delayed
commonswiki: refreshLinks: 20731
Joe added a comment.
I think re2 is an interesting candidate. I would argue we still want to have a separate microservice running on a separate cluster from MediaWiki, for security reasons, and I would think it could be used to run the regular expression validations as well.
AIUI, this
Joe added a comment.
FWIW we're seeing another almost-uncontrollable growth of jobs on commons and probably other wikis. I might decide to raise the concurrency of those jobs.
Joe added a comment.
I did some more number crunching on the instances of runJob.php I'm running on terbium, and found the following:
Wikibase refreshLinks jobs might benefit from being in smaller batches, as many of those are taking a long time to execute. Out of 33.4k wikibase jobs, we ha
Joe added a comment.
In T173710#3584505, @Krinkle wrote:
> In T173710#3583445, @Joe wrote:
>
>> As a side comment: this is one of the cases where I would've loved to have an elastic environment to run MediaWiki-related applications: I could've spun up 10 instances of jobrunner dedicated to re
Joe added a comment.
In T173710#3581849, @aaron wrote:
> Those refreshLinks jobs (from wikibase) are the only ones that use multiple titles per job, so they will be a lot slower (seems to be 50 pages/job) than the regular ones from MediaWiki core. That is a bit on the slow side for a run time of a
Joe added a comment.
We still have around 1.4 million items in queue for commons, evenly divided between htmlCacheUpdate jobs and refreshLinks jobs.
I've started a few runs of the refreshLinks job and since yesterday most jobs are just processing the same root job from August 26th.
Those
Joe added a subscriber: ema.
Joe added a comment.
Correcting myself after a discussion with @ema: since we have up to 4 cache layers, we should only process jobs whose root timestamp is newer than 4 times the cache TTL cap. So anything older than 4 days should be safely discardable.
This
Joe added a comment.
@aaron so you're saying that when we have someone editing a lot of pages with a lot of backlinks we will see the jobqueue growing basically for quite a long time, as the divided jobs will be executed at a later time, and as long as the queue is long enough, we'l