BBlack added a comment.
In T331356#8718619 <https://phabricator.wikimedia.org/T331356#8718619>,
@MisterSynergy wrote:
> Some remarks:
>
> - We should consider these canonical HTTP URIs to be //names// in the first
place, which are unique worldwide and issued by the W
BBlack closed this task as "Resolved".
BBlack added a comment.
The redirects are neither //good// nor //bad//, they're instead both
necessary (although that necessity is waning) and insecure. We thought we had
standardized on all canonical URIs being of the secure variant ~8
BBlack created this task.
BBlack triaged this task as "High" priority.
BBlack added projects: Wikidata, Traffic.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: wdwb-tech.
TASK DESCRIPTION
It has come to our attention via T3309
BBlack reopened this task as "Open".
BBlack added a comment.
In T330906#8661013 <https://phabricator.wikimedia.org/T330906#8661013>,
@Ennomeijers wrote:
> As I already mentioned earlier, the SPARQL endpoint and the RDF serialized
data all use the HTTP version as the c
BBlack added a comment.
In T330906#8657917 <https://phabricator.wikimedia.org/T330906#8657917>,
@Ennomeijers wrote:
> Thanks for the replies! Advising to use HTTPS over HTTP makes sense.
>
> But not supporting redirection from HTTP to HTTPS will in my opinion
introduc
BBlack added a comment.
We chose S:BP for those queries on the assumption that, by its nature, it
would be a cheap page to monitor. Is there a better option we should be using,
or is this ticket more about fixing inefficiencies in it?
TASK DETAIL
https://phabricator.wikimedia.org/T284981
BBlack added a comment.
We can route different URI subspaces differently at the edge layer, based on
URI regexes, as shown here for the split of the API namespace of the primary
wiki sites:
https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production
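The first-match regex routing described above can be modeled outside VCL; here is a minimal Python sketch (the rule patterns and backend names are invented placeholders for illustration, not the actual production config linked above):

```python
import re

# Ordered rules: the first pattern that matches the request URI wins,
# mirroring how an edge layer can split URI subspaces across backends.
ROUTING_RULES = [
    (re.compile(r"^/api/rest_v1/"), "restbase-backend"),
    (re.compile(r"^/w/api\.php"), "api-backend"),
]
DEFAULT_BACKEND = "appserver-backend"

def route(uri: str) -> str:
    """Return the backend for the first matching URI pattern."""
    for pattern, backend in ROUTING_RULES:
        if pattern.search(uri):
            return backend
    return DEFAULT_BACKEND
```

Ordering matters: more specific subspaces must be listed before broader ones, since evaluation stops at the first match.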
BBlack added a comment.
I think you ran into a temporary blip in some unrelated DNS work (which is
already dealt with), not this bug (502 errors can happen for real infra failure
reasons, too!)
TASK DETAIL
https://phabricator.wikimedia.org/T237319
EMAIL PREFERENCES
https
BBlack added a comment.
We'll also need to normalize the incoming `Accept` headers up in the edge
cache layer to avoid pointless vary explosions. Ideally the normalization
should exactly match the application-layer logic that chooses the output
content type. Do you have some pseudo-code
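A hedged sketch of what such Accept normalization could look like (the canonical types and default below are invented placeholders, not the application's real negotiation logic, which this must mirror exactly):

```python
# Collapse arbitrary client Accept headers onto the small set of content
# types the application can actually emit, so the cache varies on a handful
# of values instead of one variant per distinct client string.
CANONICAL_TYPES = ("application/json", "text/html")
DEFAULT_TYPE = "text/html"

def normalize_accept(accept_header: str) -> str:
    """Map an arbitrary Accept header to one canonical value.

    If this diverges from the application's own content negotiation,
    cached variants and live responses will disagree."""
    accept = (accept_header or "").lower()
    for ctype in CANONICAL_TYPES:
        if ctype in accept:
            return ctype
    return DEFAULT_TYPE
```

A real implementation would also honor q-values; this substring check is only the shape of the idea.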
BBlack added a comment.
As noted in T155359 <https://phabricator.wikimedia.org/T155359> - WMDE has
moved the hosting of this to some other platform, including the DNS hosting
(and we never had the whois entry). So this task can resolve as Decline I
think (or whatever), but we shou
BBlack added a comment.
@WMDE-leszek Thanks for looking into it! I believe @CRoslof is who you want
to coordinate with on our end, whose last statement on this topic back in
January was:
In T99531#4878798 <https://phabricator.wikimedia.org/T99531#4878798>,
@CRoslof
BBlack added a comment.
Re: `wikibase.org`, adding it as a non-canonical redirection to catch
confusion from those that manually type URLs is fine, but we should make sure
everyone is clear on which domainname is canonical for this project (I assume
`https://wikiba.se/`) and make sure
BBlack added a comment.
I think it would be better, from my perspective, to really understand the
use-cases better (which I don't). Why do these remote clients need "realtime"
(no staleness) fetches of Q items? What I hear sounds like all clients
expect everything to be
BBlack added a comment.
Looking at an internal version of the flavor=dump outputs of an entity,
related observations:
Test request from the inside: `curl -v
'https://www.wikidata.org/wiki/Special:EntityData/Q15223487.ttl?flavor=dump'
--resolve www.wikidata.org:443:10.2.2.1
BBlack added a comment.
There are different layers of "handing off" DNS management which are being conflated, but to run through them in order:
"Point the A record to the right place" - We don't support this, and can't realistically. We need control of the zone data directl
BBlack added a comment.
There's still a couple of things that can be done serially at present, one of which is necessary for the cert issuance later:
Switch the nameservers for wikiba.se to ns[012].wikimedia.org with your current registrar (United Domains). We have to have this to later issue
BBlack added a comment.
Thanks for the data and the patch! We'll dig into the DNS patch next week and get it merged in so we're serving wikiba.se from our DNS as-is (as in, pointing at your existing server IPs). Then we can do handoff of the domain ownership/registration without causing any
BBlack added a comment.
Yes, let's look at this today. I think we need better tg3 ethernet card support in interface::rps for one of our authdnses anyway, which you'll need here too.
TASK DETAIL
https://phabricator.wikimedia.org/T206105
BBlack added a comment.
There are plans underway at this point to support multiple LE certs on our standard cache terminators via the work in T199711 due by EOQ (end of Sept), which would make this whole thing simpler and zero cert cost. I couldn't say for sure how fast we'll shake out all
BBlack added a comment.
It's a complicated topic I think, on our end. There are ways to make it work today, but when I try to write down generic steps any internal service could take to talk to any other (esp MW or RB), it bogs down in complications that are probably less than ideal in various
BBlack added a comment.
This raises some questions that are probably unrelated to the problem at hand, but might affect things indirectly:
Why is an internal service (wdqs) querying a public endpoint? It should probably use private internal endpoints like appservers.svc or api.svc
BBlack added a comment.
It's a pain any direction we slice this, and I'm not fond of adding new canonical domains outside the known set for individual low-traffic projects. We didn't add new domains for a variety of other public-facing efforts (e.g. wdqs, ORES, maps, etc).
We don't have clear
BBlack added a comment.
No, we never made an incident report on this one, and I don't think it would be fair at this time to implicate ORES as a cause. We can't really say that ORES was directly involved at all (or any of the other services investigated here). Because the cause was so unknown
BBlack lowered the priority of this task from "High" to "Normal".
BBlack changed the task status from "Open" to "Stalled".
BBlack added a comment.
The timeout changes above will offer some insulation, and as time passes we're not seeing evidence of this pr
BBlack added a comment.
In T179156#3720392, @BBlack wrote:
In T179156#3719995, @BBlack wrote:
We have an obvious case of normal slow chunked uploads of large files to commons to look at for examples to observe, though.
Rewinding a little: this is false, I was just getting confused
BBlack added a comment.
In T179156#3719995, @BBlack wrote:
We have an obvious case of normal slow chunked uploads of large files to commons to look at for examples to observe, though.
Rewinding a little: this is false, I was just getting confused by terminology. Commons "chunked&quo
BBlack lowered the priority of this task from "Unbreak Now!" to "High".
BBlack added a comment.
Reducing this from UBN->High, because current best-working-theory is this problem is gone so long as we keep the VCL do_stream=false change reverted. Obviously, there's still some
BBlack added a comment.
In T179156#3719928, @daniel wrote:
In any case, this would consume front-edge client connections, but wouldn't trigger anything deeper into the stack
That's assuming varnish always caches the entire request, and never "streams" to the backend, even for file upl
BBlack added a comment.
Trickled-in POST on the client side would be something else. Varnish's timeout_idle, which is set to 5s on our frontends, acts as the limit for receiving all client request headers, but I'm not sure that it has such a limitation that applies to client-sent bodies. In any
BBlack added a comment.
In T179156#3718772, @ema wrote:
There's a timeout limiting the total amount of time varnish is allowed to spend on a single request, send_timeout, defaulting to 10 minutes. Unfortunately there's no counter tracking when the timer kicks in, although a debug line is logged
BBlack added a comment.
Now that I'm digging deeper, it seems there are one or more projects in progress built around Push-like things, in particular T113125 . I don't see any evidence that there's been live deploy of them yet, but maybe I'm missing something or other. If we have a live deploy
BBlack added a comment.
Does Echo have any kind of push notification going on, even in light testing yet?
TASK DETAIL
https://phabricator.wikimedia.org/T179156
BBlack added a comment.
A while after the above, @hoo started focusing on a different aspect of this we've been somewhat ignoring as more of a side-symptom: that there tend to be a lot of sockets in a strange state on the "target" varnish, to various MW nodes. They look strange on
BBlack added a comment.
Updates from the Varnish side of things today (since I've been bad about getting commits/logs tagged onto this ticket):
18:15 - I took over looking at today's outburst on the Varnish side
The current target at the time was cp1053 (after elukey's earlier restart of cp1055
BBlack added a comment.
In T179156#3715432, @hoo wrote:
I think I found the root cause now; seems it's actually related to the WikibaseQualityConstraints extension:
Isn't that the same extension referenced in the suspect commits mentioned above?
18:51 ladsgroup@tin: Synchronized php-1.31.0
BBlack added a comment.
Unless anyone objects, I'd like to start with reverting our emergency varnish max_connections changes from https://gerrit.wikimedia.org/r/#/c/386756 . Since the end of the log above, connection counts have returned to normal, which is ~100, which is 1/10th the normal 1K
BBlack added a comment.
My gut instinct remains what it was at the end of the log above. I think something in the revert of wikidatawiki to wmf.4 fixed this. And I think given the timing alignment of the Fix sorting of NullResults changes + the initial ORES->wikidata fatals makes th
BBlack added a comment.
Copying this in from etherpad (this is less awful than 6 hours of raw IRC+SAL logs, but still pretty verbose):
# cache servers work ongoing here, ethtool changes that require short depooled downtimes around short ethernet port outages:
17:49 bblack: ulsfo cp servers
BBlack added a comment.
Can you explain in more detail? Is the subject of this ticket what was shown as an error in your browser window? I doubt this is related to varnish and/or "mailbox lag".
TASK DETAIL
https://phabricator.wikimedia.org/T175588
BBlack removed parent tasks: T174932: Recurrent 'mailbox lag' critical alerts and 500s, T175473: Multiple 503 Errors.
TASK DETAIL
https://phabricator.wikimedia.org/T175588
BBlack added a project: Traffic.
TASK DETAIL
https://phabricator.wikimedia.org/T99531
BBlack removed a parent task: T104681: HTTPS Plans (tracking / high-level info).
TASK DETAIL
https://phabricator.wikimedia.org/T153563
BBlack added a comment.
Yeah that was the plan, for XKey to help here by consolidating that down to a single HTCP / PURGE per article touched. It's not useful for the mass-scale case (e.g. template/link references), as it doesn't scale well in that direction. But for the case like "1 ar
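The consolidation described above can be sketched as a toy surrogate-key cache (class and key names invented for illustration; this is the shape of XKey-style tagging, not its actual implementation): every cached URL variant is tagged with the article it renders, so one purge per key replaces one PURGE per URL.

```python
from collections import defaultdict

class TaggedCache:
    """Toy cache whose objects carry surrogate keys for group purging."""

    def __init__(self):
        self.objects = {}               # url -> cached body
        self.by_key = defaultdict(set)  # surrogate key -> set of urls

    def store(self, url, body, keys):
        self.objects[url] = body
        for key in keys:
            self.by_key[key].add(url)

    def purge_key(self, key):
        """Invalidate every URL tagged with `key` in one operation."""
        for url in self.by_key.pop(key, set()):
            self.objects.pop(url, None)

cache = TaggedCache()
# Desktop, history, and mobile variants of one article share a tag:
for url in ("/wiki/X", "/wiki/X?action=history", "/m/wiki/X"):
    cache.store(url, "<html>...</html>", keys=["article:X"])
cache.purge_key("article:X")  # one purge invalidates all three variants
```

As the comment notes, this helps the "1 article, N variants" case but not mass-scale invalidation such as template or link references.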
BBlack added a comment.
We can get broader averages by dividing the values seen in the aggregate client status code graphs using eqiad's text cluster (the remote sites would expect fewer due to some of the bursts being more likely to be dropped by the network)
This shows the past week's average
BBlack added a comment.
The lack of graph data from falling off the history is a sad commentary on how long this has remained unresolved :(
Some salient points from earlier within this ticket, to recap:
In T124418#1985526, @BBlack wrote:
Continuing with some stuff I was saying in IRC the other
BBlack reopened this task as "Open".
BBlack added a comment.
Not resolved, as the purge graphs can attest!
TASK DETAIL
https://phabricator.wikimedia.org/T124418
BBlack added a comment.
I clicked Submit too soon :) Continuing:
We'd expect content to be at minimum a day, if not significantly longer. MW currently emits 2-week cache headers (with plans to eventually bring that down closer to a day, but those plans are still further off). Cache invalidation
BBlack added a comment.
Nothing was ever resolved here. 30 minutes seems like an arbitrary number with no formal basis or reasoning, and is way shorter than we'd like for anything article-like.
TASK DETAIL
https://phabricator.wikimedia.org/T142944
BBlack closed this task as "Resolved".
BBlack claimed this task.
TASK DETAIL
https://phabricator.wikimedia.org/T132457
BBlack added a parent task: T147844: Standardize varnish applayer backend definitions.
TASK DETAIL
https://phabricator.wikimedia.org/T132457
BBlack added a comment.
I think I'm lacking a lot of context here about these special pages and placeholders. But my bottom line thoughts are currently along these lines:
How do actual, real-world, anonymous users interact with these placeholders and special pages? What value is it providing
BBlack added a comment.
30 minutes isn't really reasonable, and neither is spamming more purge traffic. If there's a constant risk of the page content breaking without invalidation, how is even 30 minutes acceptable? Doesn't this mean that on average they'll be broken for 15 minutes after
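The "15 minutes" figure follows from a simple expectation: if a page breaks at a uniformly random moment within a fixed TTL window and is only corrected at expiry, it serves broken content for half the TTL on average. A back-of-envelope sketch:

```python
def expected_staleness_minutes(ttl_minutes: float) -> float:
    """Expected time a broken page stays broken, assuming the break
    occurs at a uniformly random point inside the TTL window and the
    page is only refreshed when the TTL expires."""
    return ttl_minutes / 2

# A 30-minute invalidation window implies ~15 minutes of average breakage.
assert expected_staleness_minutes(30) == 15.0
```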
BBlack added a subscriber: GWicke.
BBlack added a comment.
@aaron and @GWicke - both patches sound promising, thanks for digging into this topic!
TASK DETAIL
https://phabricator.wikimedia.org/T124418
BBlack added a comment.
cache_maps cluster switched to the new varnish package today
TASK DETAIL
https://phabricator.wikimedia.org/T134989
BBlack added a comment.
Current State:
- cp3007 and cp1045 are depooled from user traffic, icinga-downtimed for
several days, and have puppet disabled. Please do not re-enable puppet on
these! They also have confd shut down, and are running custom configs to
continue debugging
BBlack reopened blocking task T131501: Convert misc cluster to Varnish 4 as
"Open".
TASK DETAIL
https://phabricator.wikimedia.org/T133490
BBlack added a blocked task: T131501: Convert misc cluster to Varnish 4.
TASK DETAIL
https://phabricator.wikimedia.org/T134989
BBlack added a comment.
I forgot one of our temporary hacks in the list above in
https://phabricator.wikimedia.org/T134989#2290254:
4. https://gerrit.wikimedia.org/r/#/c/288656/ - we also enabled a critical
small bit here in v4 vcl_hit. I reverted this for now during the varnish3
BBlack added a comment.
Has anyone been able to reproduce any of the problems in the tickets merged
into here, since roughly the timestamp of the above message?
TASK DETAIL
https://phabricator.wikimedia.org/T134989
BBlack added a comment.
So we currently have several experiments in play trying to figure this out:
1. We've got 2x upstream bugfixes applied to our varnishd on cache_misc:
https://github.com/varnishcache/varnish-cache/commit/d828a042b3fc2c2b4f1fea83021f0d5508649e50
+
https
BBlack added a comment.
In the merged ticket above, it's browser access to status.wm.o, and the
browser's getting a 304 Not Modified and complaining about it (due to missing
character encoding supposedly, but it's entirely likely it's missing everything
and that's just the first thing
BBlack added a subscriber: Kghbln.
BBlack merged a task: T135121: stats.wikimedia.org down.
TASK DETAIL
https://phabricator.wikimedia.org/T134989
BBlack added a comment.
Status update: we've been debugging this off and on all day. It's some kind
of bug fallout from cache_misc's upgrade to Varnish 4. It's a very complicated
bug, and we don't really understand it yet. We've made some band-aid fixes to
VCL for now which should keep
BBlack added a comment.
Assuming there was no transient issue (which became cached) on the wdqs end
of things, then this was likely a transient thing from nginx experiments or the
cache_misc varnish4 upgrade. I banned all wdqs objects from cache_misc and now
your test URL works fine. Can
BBlack closed this task as "Resolved".
BBlack added a comment.
My test cases on cache_text work now, should be resolved!
TASK DETAIL
https://phabricator.wikimedia.org/T133866
BBlack closed this task as "Resolved".
BBlack claimed this task.
BBlack added a comment.
This works now. There's a significant pause at the start of the transfer
from the user's perspective if it's not a cache hit, because streaming is
disabled as a workaround (so it has to compl
BBlack added a comment.
We now have some understanding of the mechanism of this bug (
https://phabricator.wikimedia.org/T133866#2275985 ). It should go away in the
imminent varnish 4 upgrade of the misc cluster in
https://phabricator.wikimedia.org/T131501.
TASK DETAIL
https
BBlack added a blocking task: T131501: Convert misc cluster to Varnish 4.
TASK DETAIL
https://phabricator.wikimedia.org/T133490
BBlack added a comment.
So, as it turns out, this is a general varnishd bug in our specific varnishd
build. For purposes of this bug, our varnishd code is essentially 3.0.7 plus a
bunch of ancient forward-ported 'plus' patches related to streaming, and we're
missing
https://github.com
BBlack closed blocking task Restricted Task as "Resolved".
TASK DETAIL
https://phabricator.wikimedia.org/T133866
BBlack triaged this task as "High" priority.
TASK DETAIL
https://phabricator.wikimedia.org/T133866
BBlack added a blocking task: Restricted Task.
TASK DETAIL
https://phabricator.wikimedia.org/T133866
BBlack added a comment.
Thanks for merging in the probably-related tasks. I had somehow missed
really noticing T123159 earlier... So probably digging into gunzip itself
isn't a fruitful path. I'm going to open a separate blocker for this that's
private, so we can keep merging public
BBlack added a comment.
Did some further testing on an isolated test machine, using our current
varnish3 package.
- Got 2833-byte test file from uncorrupted (--compressed) output on prod.
This is the exact compressed content bytes emitted by MW/Apache for the broken
test URL
BBlack added a comment.
Just jotting down the things I know so far from investigating this morning.
I still don't have a good answer yet.
Based on just the test URL, debugging it extensively at various layers:
1. The response size of that URL is in the ballpark of 32KB uncompressed
BBlack added a comment.
Do you know if some normal traffic is affected, such that we'd know a start
date for a recent change in behavior? Or is it suspected that it was always
this way?
I've been digging through some debugging on this URL (which is an applayer
chunked-response
BBlack added a comment.
I really don't think it's specifically Wikidata-related either at this point.
Wikidata might be a significant driver of update jobs in general, but the code
changes driving the several large rate increases were probably generic to all
update jobs.
TASK DETAIL
BBlack added a blocked task: T133821: Content purges are unreliable.
TASK DETAIL
https://phabricator.wikimedia.org/T124418
BBlack added a blocked task: T133821: Content purges are unreliable.
TASK DETAIL
https://phabricator.wikimedia.org/T102476
BBlack edited projects, added Traffic; removed Varnish.
TASK DETAIL
https://phabricator.wikimedia.org/T133490
BBlack added a comment.
F3845100: Screen Shot 2016-04-07 at 7.47.28 PM.png
<https://phabricator.wikimedia.org/F3845100>
TASK DETAIL
https://phabricator.wikimedia.org/T124418
BBlack edited the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T124418
BBlack added a comment.
In https://phabricator.wikimedia.org/T121135#1910435, @Atsirlin wrote:
> @Legoktm: Frankly speaking, for a small project like Wikivoyage the cache
brings no obvious benefits, but triggers many serious issues including the
problem of page banners and
BBlack added a blocking task: T128813: cache_misc's misc_fetch_large_objects
has issues.
TASK DETAIL
https://phabricator.wikimedia.org/T127014
BBlack edited projects, added Traffic; removed Varnish.
TASK DETAIL
https://phabricator.wikimedia.org/T127014
BBlack added a comment.
I did some live experimentation with manual edits to the VCL. It is the
`between_bytes_timeout`, but the situation is complex. The timeout that's
failing is on the varnish frontend fetching from the varnish backend. These
are fixed at 2s, but because this is all
BBlack added a comment.
This is probably due to backend timeouts, I would guess? The default
applayer settings being applied to wdqs include `between_bytes_timeout` at only
4s, whereas `first_byte_timeout` is 185s. So if wdqs delayed all output, it
would have 3 minutes or so, but once
BBlack added a comment.
In https://phabricator.wikimedia.org/T126730#2034900, @Christopher wrote:
> I may be wrong, but the headers that are returned from a request to the nginx
> server wdqs1002 say that varnish 1.1 is already being used there.
It's varnish 3.0.6 currently (4.x is
BBlack added a comment.
In https://phabricator.wikimedia.org/T125392#1994242, @Milimetric wrote:
> @BBlack - so you think cache_status is not even close to accurate? Do we
> have other accurate measurements of it so we could compare to what extent
> it's misleading? I'm happy
BBlack added a comment.
IIRC, the problem we've beat our heads against in past SPARQL-related tickets
is the fact that SPARQL clients are using `POST` method for readonly queries,
due to argument length issues and whatnot. On the surface, that's a
dealbreaker for caching them as `POST` isn't
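One common workaround (sketched here under assumptions; the endpoint URL and length cap are invented placeholders, and a real version must also verify the query is actually read-only) is to rewrite short read-only SPARQL POSTs as GETs, which standard HTTP caches can then handle:

```python
from urllib.parse import urlencode

MAX_URL_LEN = 2000  # conservative cap; real limits vary by client and proxy

def as_cacheable_get(endpoint: str, sparql: str):
    """Rewrite a read-only SPARQL query as a cacheable GET URL if it fits,
    else return None to signal it must remain a POST (bypassing the cache)."""
    url = endpoint + "?" + urlencode({"query": sparql})
    return url if len(url) <= MAX_URL_LEN else None

short = as_cacheable_get("https://query.example.org/sparql",
                         "SELECT * WHERE { ?s ?p ?o } LIMIT 10")
assert short is not None
# Very long queries (the "argument length issues" above) don't fit in a URL:
assert as_cacheable_get("https://query.example.org/sparql", "x" * 5000) is None
```

This only helps queries small enough to fit in a URL, which is exactly the argument-length constraint that pushes clients toward POST in the first place.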
BBlack added a comment.
So, current thinking is that at least one (maybe two?) of the bumps is from
moving what used to be a synchronous HTCP purge during requests into JobRunner jobs
which should be doing the same thing. However, assuming it's that alone (or
even just investigating that part
BBlack added a comment.
Well then apparently the 10/s edits to all projects number I found before is
complete bunk :)
http://wikipulse.herokuapp.com/ has numbers for wikidata edits that
approximately line up with yours, and then shows Wikipedias at about double
that rate (which might
BBlack added a comment.
heh so: https://phabricator.wikimedia.org/T113192 ->
https://gerrit.wikimedia.org/r/#/c/258365/5 is probably the Jan 20 bump.
TASK DETAIL
https://phabricator.wikimedia.org/T124418
BBlack added a comment.
FYI - "cache_status" is not an accurate reflection of anything. I'm not sure
why we really even log it for analytics. The problem is that it only reflects
some varnish state about the first of up to 3 layers of caching, and even then
it does so poorly.
T
BBlack added a comment.
@ori - yeah that makes sense for the initial bump, and I think there may have
even been a followup to do deferred purges, which may be one of the other
multipliers, but I haven't found it yet (as in, insert an immediate job and
also somehow insert one that fires
BBlack added a comment.
Another data point from the weekend: In one sample I took Saturday morning,
when I sampled for 300s, the top site being purged was srwiki, and something
like 98% of the purges flowing for srwiki were all Talk: pages (well, with
Talk: as %-encoded something in Serbian
BBlack added a comment.
Regardless, the average rate of HTCP these days is normally-flat-ish (a few
scary spikes aside), and is mostly throttled by the jobqueue. The question
still remains: what caused permanent, large bumps in the jobqueue
htmlCacheUpdate insertion rate on ~Dec4, ~Dec11
BBlack added a comment.
@daniel - Sorry I should have linked this earlier, I made a paste at the time:
https://phabricator.wikimedia.org/P2547 . Note that
`/%D0%A0%D0%B0%D0%B7%D0%B3%D0%BE%D0%B2%D0%BE%D1%80:` is the Serbian srwiki
version of `/Talk:`.
TASK DETAIL
https
BBlack added a comment.
Well, we have 3 different stages of rate-increase in the insert graph, so it
could well be that we have 3 independent causes to look at here. And it's not
necessarily true that any of them are buggy, but we need to understand what
they're doing and why, because maybe
BBlack added a comment.
Continuing with some stuff I was saying in IRC the other day. At the "new
normal", we're seeing something in the approximate ballpark of 400/s articles
purged (which is then multiplied commonly for ?action=history and mobile and
ends up more like ~1600/s a