[Wikidata-bugs] [Maniphest] T331356: Wikidata seems to still be utilizing insecure HTTP URIs

2023-04-13 Thread BBlack
BBlack added a comment. In T331356#8718619 <https://phabricator.wikimedia.org/T331356#8718619>, @MisterSynergy wrote: > Some remarks: > > - We should consider these canonical HTTP URIs to be //names// in the first place, which are unique worldwide and issued by the W

[Wikidata-bugs] [Maniphest] T330906: HTTP URIs do not resolve from NL and DE?

2023-03-06 Thread BBlack
BBlack closed this task as "Resolved". BBlack added a comment. The redirects are neither //good// nor //bad//, they're instead both necessary (although that necessity is waning) and insecure. We thought we had standardized on all canonical URIs being of the secure variant ~8
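
A minimal client-side sketch of how the two concerns above can coexist (the entity URI and Accept type below are illustrative, not guidance from the thread): treat the canonical http:// URI purely as a name, and rewrite the scheme to https:// before dereferencing it, so the insecure redirect hop is never needed.

    from urllib.parse import urlsplit, urlunsplit
    import urllib.request

    # The canonical http:// URI stays untouched as an identifier/name.
    entity_uri = "http://www.wikidata.org/entity/Q42"  # illustrative entity

    # For actual retrieval, request the secure variant directly instead of
    # relying on an http -> https redirect.
    parts = urlsplit(entity_uri)
    fetch_url = urlunsplit(("https", parts.netloc, parts.path, parts.query, parts.fragment))

    req = urllib.request.Request(fetch_url, headers={"Accept": "text/turtle"})
    with urllib.request.urlopen(req) as resp:
        print(resp.geturl(), resp.status)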

[Wikidata-bugs] [Maniphest] T331356: Wikidata seems to still be utilizing insecure HTTP URIs

2023-03-06 Thread BBlack
BBlack created this task. BBlack triaged this task as "High" priority. BBlack added projects: Wikidata, Traffic. Restricted Application added a subscriber: Aklapper. Restricted Application added a project: wdwb-tech. TASK DESCRIPTION It has come to our attention via T3309

[Wikidata-bugs] [Maniphest] T330906: HTTP URIs do not resolve from NL and DE?

2023-03-06 Thread BBlack
BBlack reopened this task as "Open". BBlack added a comment. In T330906#8661013 <https://phabricator.wikimedia.org/T330906#8661013>, @Ennomeijers wrote: > As I already mentioned earlier, the SPARQL endpoint and the RDF serialized data all use the HTTP version as the c

[Wikidata-bugs] [Maniphest] T330906: HTTP URIs do not resolve from NL and DE?

2023-03-02 Thread BBlack
BBlack added a comment. In T330906#8657917 <https://phabricator.wikimedia.org/T330906#8657917>, @Ennomeijers wrote: > Thanks for the replies! Advising to use HTTPS over HTTP makes sense. > > But not supporting redirection from HTTP to HTTPS will in my opinion introduc

[Wikidata-bugs] [Maniphest] T284981: SELECT query arriving to wikidatawiki db codfw hosts causing pile ups during schema change

2021-10-08 Thread BBlack
BBlack added a comment. We chose S:BP for those queries on the assumption that, by its nature, it would be a cheap page to monitor. Is there a better option we should be using, or is this ticket more about fixing inefficiencies in it?

[Wikidata-bugs] [Maniphest] T266702: Move WDQS UI to microsites

2020-10-29 Thread BBlack
BBlack added a comment. We can route different URI subspaces differently at the edge layer, based on URI regexes, as shown here for the split of the API namespace of the primary wiki sites: https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production
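
Purely as an illustration of the first-match-by-regex idea (the real routing lives in VCL in the puppet repo linked above; the patterns and backend names here are made up):

    import re

    # Hypothetical (pattern, backend) table; first match wins, with a catch-all default.
    ROUTES = [
        (re.compile(r"^/sparql"), "wdqs-backend"),
        (re.compile(r"^/(bigdata|ldf)"), "wdqs-backend"),
        (re.compile(r"^/"), "microsite-backend"),   # everything else
    ]

    def pick_backend(uri_path: str) -> str:
        for pattern, backend in ROUTES:
            if pattern.search(uri_path):
                return backend
        return "microsite-backend"

    print(pick_backend("/sparql?query=..."))   # -> wdqs-backend
    print(pick_backend("/index.html"))         # -> microsite-backend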

[Wikidata-bugs] [Maniphest] [Commented On] T237319: 502 errors on ATS/8.0.5

2019-11-26 Thread BBlack
BBlack added a comment. I think you ran into a temporary blip in some unrelated DNS work (which is already dealt with), not this bug (502 errors can happen for real infra failure reasons, too!)

[Wikidata-bugs] [Maniphest] [Commented On] T232006: LDF service does not Vary responses by Accept, sending incorrect cached responses to clients

2019-09-18 Thread BBlack
BBlack added a comment. We'll also need to normalize the incoming `Accept` headers up in the edge cache layer to avoid pointless vary explosions. Ideally the normalization should exactly match the application-layer logic that chooses the output content type. Do you have some pseudo-code
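
Roughly this shape, sketched in Python rather than VCL (the canonical content types and their precedence are assumptions; the real list would have to mirror the LDF application's negotiation exactly, which is what the pseudo-code request above is about):

    # Collapse arbitrary client Accept headers onto the few values the
    # application actually distinguishes, so Vary: Accept yields only a
    # handful of cache variants instead of one per unique header string.
    CANONICAL_TYPES = ["text/turtle", "application/ld+json", "application/n-triples"]
    DEFAULT_TYPE = "text/turtle"

    def normalize_accept(accept_header: str) -> str:
        accept = accept_header.lower()
        for ctype in CANONICAL_TYPES:
            if ctype in accept:
                return ctype
        return DEFAULT_TYPE

    # Both of these requests end up as the same cache variant:
    print(normalize_accept("text/turtle;q=0.9, */*;q=0.1"))  # text/turtle
    print(normalize_accept("Text/Turtle"))                    # text/turtle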

[Wikidata-bugs] [Maniphest] [Commented On] T99531: [Task] move wikiba.se webhosting to wikimedia cluster

2019-08-14 Thread BBlack
BBlack added a comment. As noted in T155359 <https://phabricator.wikimedia.org/T155359> - WMDE has moved the hosting of this to some other platform, including the DNS hosting (and we never had the whois entry). So this task can resolve as Decline I think (or whatever), but we shou

[Wikidata-bugs] [Maniphest] [Commented On] T99531: [Task] move wikiba.se webhosting to wikimedia cluster

2019-04-25 Thread BBlack
BBlack added a comment. @WMDE-leszek Thanks for looking into it! I believe @CRoslof is who you want to coordinate with on our end, whose last statement on this topic back in January was: In T99531#4878798 <https://phabricator.wikimedia.org/T99531#4878798>, @CRoslof

[Wikidata-bugs] [Maniphest] [Commented On] T99531: [Task] move wikiba.se webhosting to wikimedia cluster

2019-04-25 Thread BBlack
BBlack added a comment. Re: `wikibase.org`, adding it as a non-canonical redirection to catch confusion from those that manually type URLs is fine, but we should make sure everyone is clear on which domainname is canonical for this project (I assume `https://wikiba.se/`) and make sure

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-03-12 Thread BBlack
BBlack added a comment. I think it would be better, from my perspective, to really understand the use-cases (which I currently don't). Why do these remote clients need "realtime" (no staleness) fetches of Q items? What I hear sounds like all clients expect everything to be

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-03-08 Thread BBlack
BBlack added a comment. Looking at an internal version of the flavor=dump outputs of an entity, related observations: Test request from the inside: `curl -v 'https://www.wikidata.org/wiki/Special:EntityData/Q15223487.ttl?flavor=dump' --resolve www.wikidata.org:443:10.2.2.1

[Wikidata-bugs] [Maniphest] [Commented On] T99531: [Task] move wikiba.se webhosting to wikimedia cluster

2019-02-20 Thread BBlack
BBlack added a comment. There are different layers of "handing off" DNS management which are being conflated, but to run through them in order: "Point the A record to the right place" - We don't support this, and can't realistically. We need control of the zone data directl

[Wikidata-bugs] [Maniphest] [Commented On] T99531: [Task] move wikiba.se webhosting to wikimedia cluster

2018-12-13 Thread BBlack
BBlack added a comment. There are still a couple of things that can be done serially at present, one of which is necessary for the cert issuance later: Switch the nameservers for wikiba.se to ns[012].wikimedia.org with your current registrar (United Domains). We have to have this to later issue

[Wikidata-bugs] [Maniphest] [Updated] T99531: [Task] move wikiba.se webhosting to wikimedia cluster

2018-11-21 Thread BBlack
BBlack added a comment. Thanks for the data and the patch! We'll dig into the DNS patch next week and get it merged in so we're serving wikiba.se from our DNS as-is (as in, pointing at your existing server IPs). Then we can do handoff of the domain ownership/registration without causing any

[Wikidata-bugs] [Maniphest] [Commented On] T206105: Optimize networking configuration for WDQS

2018-10-15 Thread BBlack
BBlack added a comment. Yes, let's look at this today. I think we need better tg3 ethernet card support in interface::rps for one of our authdnses anyways, which you'll need here too.

[Wikidata-bugs] [Maniphest] [Updated] T99531: [Task] move wikiba.se webhosting to wikimedia misc-cluster

2018-08-18 Thread BBlack
BBlack added a comment. There are plans underway at this point to support multiple LE certs on our standard cache terminators via the work in T199711 due by EOQ (end of Sept), which would make this whole thing simpler, with zero cert cost. I couldn't say for sure how fast we'll shake out all

[Wikidata-bugs] [Maniphest] [Commented On] T199219: WDQS should use internal endpoint to communicate to Wikidata

2018-07-11 Thread BBlack
BBlack added a comment. It's a complicated topic I think, on our end. There are ways to make it work today, but when I try to write down generic steps any internal service could take to talk to any other (esp MW or RB), it bogs down in complications that are probably less than ideal in various

[Wikidata-bugs] [Maniphest] [Commented On] T199146: "Blocked" response when trying to access constraintsrdf action from production host

2018-07-09 Thread BBlack
BBlack added a comment. This raises some questions that are probably unrelated to the problem at hand, but might affect things indirectly: Why is an internal service (wdqs) querying a public endpoint? It should probably use private internal endpoints like appservers.svc or api.svc

[Wikidata-bugs] [Maniphest] [Commented On] T99531: [Task] move wikiba.se webhosting to wikimedia misc-cluster

2017-12-11 Thread BBlack
BBlack added a comment. It's a pain any direction we slice this, and I'm not fond of adding new canonical domains outside the known set for individual low-traffic projects. We didn't add new domains for a variety of other public-facing efforts (e.g. wdqs, ORES, maps, etc). We don't have clear

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-11-22 Thread BBlack
BBlack added a comment. No, we never made an incident report on this one, and I don't think it would be fair at this time to implicate ORES as a cause. We can't really say that ORES was directly involved at all (or any of the other services investigated here). Because the cause was so unknown

[Wikidata-bugs] [Maniphest] [Changed Status] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-11-06 Thread BBlack
BBlack lowered the priority of this task from "High" to "Normal". BBlack changed the task status from "Open" to "Stalled". BBlack added a comment. The timeout changes above will offer some insulation, and as time passes we're not seeing evidence of this pr

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-30 Thread BBlack
BBlack added a comment. In T179156#3720392, @BBlack wrote: In T179156#3719995, @BBlack wrote: We have an obvious case of normal slow chunked uploads of large files to commons to look at for examples to observe, though. Rewinding a little: this is false, I was just getting confused

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-30 Thread BBlack
BBlack added a comment. In T179156#3719995, @BBlack wrote: We have an obvious case of normal slow chunked uploads of large files to commons to look at for examples to observe, though. Rewinding a little: this is false, I was just getting confused by terminology. Commons "chunked"

[Wikidata-bugs] [Maniphest] [Lowered Priority] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-30 Thread BBlack
BBlack lowered the priority of this task from "Unbreak Now!" to "High". BBlack added a comment. Reducing this from UBN->High, because the current best-working-theory is that this problem is gone so long as we keep the VCL do_stream=false change reverted. Obviously, there's still some

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-30 Thread BBlack
BBlack added a comment. In T179156#3719928, @daniel wrote: In any case, this would consume front-edge client connections, but wouldn't trigger anything deeper into the stack. That's assuming varnish always caches the entire request, and never "streams" to the backend, even for file upl

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-30 Thread BBlack
BBlack added a comment. Trickled-in POST on the client side would be something else. Varnish's timeout_idle, which is set to 5s on our frontends, acts as the limit for receiving all client request headers, but I'm not sure that it has such a limitation that applies to client-sent bodies. In any

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-30 Thread BBlack
BBlack added a comment. In T179156#3718772, @ema wrote: There's a timeout limiting the total amount of time varnish is allowed to spend on a single request, send_timeout, defaulting to 10 minutes. Unfortunately there's no counter tracking when the timer kicks in, although a debug line is logged

[Wikidata-bugs] [Maniphest] [Updated] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-29 Thread BBlack
BBlack added a comment. Now that I'm digging deeper, it seems there are one or more projects in progress built around Push-like things, in particular T113125 . I don't see any evidence that there's been live deploy of them yet, but maybe I'm missing something or other. If we have a live deploy

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-29 Thread BBlack
BBlack added a comment. Does Echo have any kind of push notification going on, even in light testing yet?

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-28 Thread BBlack
BBlack added a comment. A while after the above, @hoo started focusing on a different aspect of this we've been somewhat ignoring as more of a side-symptom: that there tend to be a lot of sockets in a strange state on the "target" varnish, to various MW nodes. They look strange on

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-28 Thread BBlack
BBlack added a comment. Updates from the Varnish side of things today (since I've been bad about getting commits/logs tagged onto this ticket): 18:15 - I took over looking at today's outburst on the Varnish side. The current target at the time was cp1053 (after elukey's earlier restart of cp1055

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-27 Thread BBlack
BBlack added a comment. In T179156#3715432, @hoo wrote: I think I found the root cause now, seems it's actually related to the WikibaseQualityConstraints extension: Isn't that the same extension referenced in the suspect commits mentioned above? 18:51 ladsgroup@tin: Synchronized php-1.31.0

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-27 Thread BBlack
BBlack added a comment. Unless anyone objects, I'd like to start with reverting our emergency varnish max_connections changes from https://gerrit.wikimedia.org/r/#/c/386756 . Since the end of the log above, connection counts have returned to normal, which is ~100, which is 1/10th the normal 1K

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-27 Thread BBlack
BBlack added a comment. My gut instinct remains what it was at the end of the log above. I think something in the revert of wikidatawiki to wmf.4 fixed this. And I think given the timing alignment of the Fix sorting of NullResults changes + the initial ORES->wikidata fatals makes th

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-27 Thread BBlack
BBlack added a comment. Copying this in from etherpad (this is less awful than 6 hours of raw IRC+SAL logs, but still pretty verbose): # cache servers work ongoing here, ethtool changes that require short depooled downtimes around short ethernet port outages: 17:49 bblack: ulsfo cp servers

[Wikidata-bugs] [Maniphest] [Commented On] T175588: Server overloaded .. can't save (only remove or cancel)

2017-09-11 Thread BBlack
BBlack added a comment. Can you explain in more detail? Is the subject of this ticket what was shown as an error in your browser window? I doubt this is related to varnish and/or "mailbox lag".

[Wikidata-bugs] [Maniphest] [Updated] T175588: Server overloaded .. can't save (only remove or cancel)

2017-09-11 Thread BBlack
BBlack removed parent tasks: T174932: Recurrent 'mailbox lag' critical alerts and 500s, T175473: Multiple 503 Errors.

[Wikidata-bugs] [Maniphest] [Updated] T99531: [Task] move wikiba.se webhosting to wikimedia misc-cluster

2017-07-27 Thread BBlack
BBlack added a project: Traffic.

[Wikidata-bugs] [Maniphest] [Updated] T153563: Consider switching to HTTPS for Wikidata query service links

2017-06-26 Thread BBlack
BBlack removed a parent task: T104681: HTTPS Plans (tracking / high-level info).

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2017-06-01 Thread BBlack
BBlack added a comment. Yeah that was the plan, for XKey to help here by consolidating that down to a single HTCP / PURGE per article touched. It's not useful for the mass-scale case (e.g. template/link references), as it doesn't scale well in that direction. But for the case like "1 ar

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2017-05-31 Thread BBlack
BBlack added a comment. We can get broader averages by dividing the values seen in the aggregate client status code graphs using eqiad's text cluster (the remote sites would expect fewer due to some of the bursts being more likely to be dropped by the network). This shows the past week's average

[Wikidata-bugs] [Maniphest] [Updated] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2017-05-30 Thread BBlack
BBlack added a comment. The lack of graph data from falling off the history is a sad commentary on how long this has remained unresolved :( Some salient points from earlier within this ticket, to recap: In T124418#1985526, @BBlack wrote: Continuing with some stuff I was saying in IRC the other

[Wikidata-bugs] [Maniphest] [Reopened] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2017-05-19 Thread BBlack
BBlack reopened this task as "Open". BBlack added a comment. Not resolved, as the purge graphs can attest!

[Wikidata-bugs] [Maniphest] [Commented On] T142944: Performance and caching considerations for article placeholders accesses

2016-11-08 Thread BBlack
BBlack added a comment. I clicked Submit too soon :) Continuing: We'd expect content to be at minimum a day, if not significantly longer. MW currently emits 2-week cache headers (with plans to eventually bring that down closer to a day, but those plans are still further off). Cache invalidation

[Wikidata-bugs] [Maniphest] [Commented On] T142944: Performance and caching considerations for article placeholders accesses

2016-11-08 Thread BBlack
BBlack added a comment. Nothing was ever resolved here. 30 minutes seems like an arbitrary number with no formal basis or reasoning, and is way shorter than we'd like for anything article-like.

[Wikidata-bugs] [Maniphest] [Closed] T132457: Move wdqs to an LVS service

2016-10-12 Thread BBlack
BBlack closed this task as "Resolved". BBlack claimed this task.

[Wikidata-bugs] [Maniphest] [Updated] T132457: Move wdqs to an LVS service

2016-10-11 Thread BBlack
BBlack added a parent task: T147844: Standardize varnish applayer backend definitions.

[Wikidata-bugs] [Maniphest] [Commented On] T142944: Performance and caching considerations for article placeholders accesses

2016-08-17 Thread BBlack
BBlack added a comment. I think I'm lacking a lot of context here about these special pages and placeholders. But my bottom line thoughts are currently along these lines: How do actual, real-world, anonymous users interact with these placeholders and special pages? What value is it providing

[Wikidata-bugs] [Maniphest] [Commented On] T142944: Performance and caching considerations for article placeholders accesses

2016-08-16 Thread BBlack
BBlack added a comment. 30 minutes isn't really reasonable, and neither is spamming more purge traffic. If there's a constant risk of the page content breaking without invalidation, how is even 30 minutes acceptable? Doesn't this mean that on average they'll be broken for 15 minutes after

[Wikidata-bugs] [Maniphest] [Changed Subscribers] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-06-17 Thread BBlack
BBlack added a subscriber: GWicke. BBlack added a comment. @aaron and @GWicke - both patches sound promising, thanks for digging into this topic!

[Wikidata-bugs] [Maniphest] [Commented On] T134989: WDQS empty response - transfer clsoed with 15042 bytes remaining to read

2016-05-16 Thread BBlack
BBlack added a comment. cache_maps cluster switched to the new varnish package today.

[Wikidata-bugs] [Maniphest] [Commented On] T134989: WDQS empty response - transfer clsoed with 15042 bytes remaining to read

2016-05-13 Thread BBlack
BBlack added a comment. Current State: - cp3007 and cp1045 are depooled from user traffic, icinga-downtimed for several days, and have puppet disabled. Please do not re-enable puppet on these! They also have confd shut down, and are running custom configs to continue debugging

[Wikidata-bugs] [Maniphest] [Block] T133490: Wikidata Query Service REST endpoint returns truncated results

2016-05-13 Thread BBlack
BBlack reopened blocking task T131501: Convert misc cluster to Varnish 4 as "Open".

[Wikidata-bugs] [Maniphest] [Updated] T134989: WDQS empty response - transfer clsoed with 15042 bytes remaining to read

2016-05-13 Thread BBlack
BBlack added a blocked task: T131501: Convert misc cluster to Varnish 4.

[Wikidata-bugs] [Maniphest] [Commented On] T134989: WDQS empty response - transfer clsoed with 15042 bytes remaining to read

2016-05-13 Thread BBlack
BBlack added a comment. I forgot one of our temporary hacks in the list above in https://phabricator.wikimedia.org/T134989#2290254: 4. https://gerrit.wikimedia.org/r/#/c/288656/ - we also enabled a critical small bit here in v4 vcl_hit. I reverted this for now during the varnish3

[Wikidata-bugs] [Maniphest] [Commented On] T134989: WDQS empty response - transfer clsoed with 15042 bytes remaining to read

2016-05-12 Thread BBlack
BBlack added a comment. Has anyone been able to reproduce any of the problems in the tickets merged into here, since roughly the timestamp of the above message?

[Wikidata-bugs] [Maniphest] [Commented On] T134989: WDQS empty response - transfer clsoed with 15042 bytes remaining to read

2016-05-12 Thread BBlack
BBlack added a comment. So we currently have several experiments in play trying to figure this out: 1. We've got 2x upstream bugfixes applied to our varnishd on cache_misc: https://github.com/varnishcache/varnish-cache/commit/d828a042b3fc2c2b4f1fea83021f0d5508649e50 + https

[Wikidata-bugs] [Maniphest] [Commented On] T134989: WDQS empty response - transfer clsoed with 15042 bytes remaining to read

2016-05-12 Thread BBlack
BBlack added a comment. In the merged ticket above, it's browser access to status.wm.o, and the browser's getting a 304 Not Modified and complaining about it (due to missing character encoding supposedly, but it's entirely likely it's missing everything and that's just the first thing

[Wikidata-bugs] [Maniphest] [Merged] T134989: WDQS empty response - transfer clsoed with 15042 bytes remaining to read

2016-05-12 Thread BBlack
BBlack added a subscriber: Kghbln. BBlack merged a task: T135121: stats.wikimedia.org down.

[Wikidata-bugs] [Maniphest] [Commented On] T134989: WDQS empty response - transfer clsoed with 15042 bytes remaining to read

2016-05-11 Thread BBlack
BBlack added a comment. Status update: we've been debugging this off and on all day. It's some kind of bug fallout from cache_misc's upgrade to Varnish 4. It's a very complicated bug, and we don't really understand it yet. We've made some band-aid fixes to VCL for now which should keep

[Wikidata-bugs] [Maniphest] [Commented On] T134989: WDQS empty response - transfer clsoed with 15042 bytes remaining to read

2016-05-11 Thread BBlack
BBlack added a comment. Assuming there was no transient issue (which became cached) on the wdqs end of things, then this was likely a transient thing from nginx experiments or the cache_misc varnish4 upgrade. I banned all wdqs objects from cache_misc and now your test URL works fine. Can

[Wikidata-bugs] [Maniphest] [Closed] T133866: Varnish seems to sometimes mangle uncompressed API results

2016-05-09 Thread BBlack
BBlack closed this task as "Resolved". BBlack added a comment. My test cases on cache_text work now, should be resolved!

[Wikidata-bugs] [Maniphest] [Closed] T133490: Wikidata Query Service REST endpoint returns truncated results

2016-05-09 Thread BBlack
BBlack closed this task as "Resolved". BBlack claimed this task. BBlack added a comment. This works now. There's a significant pause at the start of the transfer from the user's perspective if it's not a cache hit, because streaming is disabled as a workaround (so it has to compl

[Wikidata-bugs] [Maniphest] [Updated] T133490: Wikidata Query Service REST endpoint returns truncated results

2016-05-09 Thread BBlack
BBlack added a comment. We now have some understanding of the mechanism of this bug ( https://phabricator.wikimedia.org/T133866#2275985 ). It should go away in the imminent varnish 4 upgrade of the misc cluster in https://phabricator.wikimedia.org/T131501. TASK DETAIL https

[Wikidata-bugs] [Maniphest] [Updated] T133490: Wikidata Query Service REST endpoint returns truncated results

2016-05-09 Thread BBlack
BBlack added a blocking task: T131501: Convert misc cluster to Varnish 4.

[Wikidata-bugs] [Maniphest] [Updated] T133866: Varnish seems to sometimes mangle uncompressed API results

2016-05-09 Thread BBlack
BBlack added a comment. So, as it turns out, this is a general varnishd bug in our specific varnishd build. For purposes of this bug, our varnishd code is essentially 3.0.7 plus a bunch of ancient forward-ported 'plus' patches related to streaming, and we're missing https://github.com

[Wikidata-bugs] [Maniphest] [Unblock] T133866: Varnish seems to sometimes mangle uncompressed API results

2016-05-09 Thread BBlack
BBlack closed blocking task Restricted Task as "Resolved".

[Wikidata-bugs] [Maniphest] [Triaged] T133866: Varnish seems to sometimes mangle uncompressed API results

2016-05-07 Thread BBlack
BBlack triaged this task as "High" priority.

[Wikidata-bugs] [Maniphest] [Updated] T133866: Varnish seems to sometimes mangle uncompressed API results

2016-05-07 Thread BBlack
BBlack added a blocking task: Restricted Task.

[Wikidata-bugs] [Maniphest] [Updated] T133866: Varnish seems to sometimes mangle uncompressed API results

2016-05-07 Thread BBlack
BBlack added a comment. Thanks for merging in the probably-related tasks. I had somehow missed really noticing T123159 earlier... So probably digging into gunzip itself isn't a fruitful path. I'm going to open a separate blocker for this that's private, so we can keep merging public

[Wikidata-bugs] [Maniphest] [Commented On] T133866: Varnish seems to sometimes mangle uncompressed API results

2016-05-05 Thread BBlack
BBlack added a comment. Did some further testing on an isolated test machine, using our current varnish3 package. - Got 2833-byte test file from uncorrupted (--compressed) output on prod. This is the exact compressed content bytes emitted by MW/Apache for the broken test URL
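
A minimal version of that kind of check, assuming the captured compressed body was saved to a local file (the filename is hypothetical): gunzip the exact bytes and compare sizes, which separates "corrupted on the wire" from "corrupted during decompression".

    import gzip

    # Hypothetical capture of the raw compressed response body.
    with open("broken-response.gz", "rb") as f:
        compressed = f.read()

    # gzip.decompress raises an error if the stream itself is damaged.
    plain = gzip.decompress(compressed)
    print(f"compressed: {len(compressed)} bytes, uncompressed: {len(plain)} bytes")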

[Wikidata-bugs] [Maniphest] [Commented On] T133866: Varnish seems to sometimes mangle uncompressed API results

2016-05-03 Thread BBlack
BBlack added a comment. Just jotting down the things I know so far from investigating this morning. I still don't have a good answer yet. Based on just the test URL, debugging it extensively at various layers: 1. The response size of that URL is in the ballpark of 32KB uncompressed

[Wikidata-bugs] [Maniphest] [Commented On] T133866: Varnish seems to sometimes mangle uncompressed API results

2016-05-03 Thread BBlack
BBlack added a comment. Do you know if some normal traffic is affected, such that we'd know a start date for a recent change in behavior? Or is it suspected that it was always this way? I've been digging through some debugging on this URL (which is an applayer chunked-response

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-05-03 Thread BBlack
BBlack added a comment. I really don't think it's specifically Wikidata-related either at this point. Wikidata might be a significant driver of update jobs in general, but the code changes driving the several large rate increases were probably generic to all update jobs.

[Wikidata-bugs] [Maniphest] [Updated] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-04-27 Thread BBlack
BBlack added a blocked task: T133821: Content purges are unreliable.

[Wikidata-bugs] [Maniphest] [Updated] T102476: RFC: Requirements for change propagation

2016-04-27 Thread BBlack
BBlack added a blocked task: T133821: Content purges are unreliable.

[Wikidata-bugs] [Maniphest] [Updated] T133490: Wikidata Query Service REST endpoint returns truncated results

2016-04-24 Thread BBlack
BBlack edited projects, added Traffic; removed Varnish.

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-04-07 Thread BBlack
BBlack added a comment. F3845100: Screen Shot 2016-04-07 at 7.47.28 PM.png <https://phabricator.wikimedia.org/F3845100>

[Wikidata-bugs] [Maniphest] [Edited] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-04-07 Thread BBlack
BBlack edited the task description.

[Wikidata-bugs] [Maniphest] [Commented On] T121135: Banners fail to show up occassionally on Russian Wikivoyage

2016-04-05 Thread BBlack
BBlack added a comment. In https://phabricator.wikimedia.org/T121135#1910435, @Atsirlin wrote: > @Legoktm: Frankly speaking, for a small project like Wikivoyage the cache brings no obvious benefits, but triggers many serious issues including the problem of page banners and

[Wikidata-bugs] [Maniphest] [Updated] T127014: Empty result on a tree query

2016-03-23 Thread BBlack
BBlack added a blocking task: T128813: cache_misc's misc_fetch_large_objects has issues.

[Wikidata-bugs] [Maniphest] [Updated] T127014: Empty result on a tree query

2016-03-23 Thread BBlack
BBlack edited projects, added Traffic; removed Varnish.

[Wikidata-bugs] [Maniphest] [Commented On] T127014: Empty result on a tree query

2016-03-23 Thread BBlack
BBlack added a comment. I did some live experimentation with manual edits to the VCL. It is the `between_bytes_timeout`, but the situation is complex. The timeout that's failing is on the varnish frontend fetching from the varnish backend. These are fixed at 2s, but because this is all

[Wikidata-bugs] [Maniphest] [Updated] T127014: Empty result on a tree query

2016-03-23 Thread BBlack
BBlack added a comment. This is probably due to backend timeouts, I would guess? The default applayer settings being applied to wdqs include `between_bytes_timeout` at only 4s, whereas `first_byte_timeout` is 185s. So if wdqs delayed all output, it would have 3 minutes or so, but once
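
To make the distinction concrete, here is a rough sketch of the two timeout semantics (not of Varnish's implementation; the default values are taken from the comment above, host and path are illustrative): the first byte of the response gets one long allowance, and every subsequent gap between bytes gets its own much shorter one.

    import socket

    def fetch(host, path, first_byte_timeout=185.0, between_bytes_timeout=4.0):
        # Connect and send a bare HTTP/1.0 request.
        s = socket.create_connection((host, 80), timeout=first_byte_timeout)
        s.sendall(f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode())

        s.settimeout(first_byte_timeout)       # long wait allowed before the first byte
        chunks = [s.recv(65536)]

        s.settimeout(between_bytes_timeout)    # each later gap is bounded separately
        while True:
            data = s.recv(65536)
            if not data:
                break
            chunks.append(data)
        s.close()
        return b"".join(chunks)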

[Wikidata-bugs] [Maniphest] [Commented On] T126730: [RFC] Caching for results of wikidata Sparql queries

2016-02-17 Thread BBlack
BBlack added a comment. In https://phabricator.wikimedia.org/T126730#2034900, @Christopher wrote: > I may be wrong, but the headers that are returned from a request to the nginx > server wdqs1002 say that varnish 1.1 is already being used there. It's varnish 3.0.6 currently (4.x is

[Wikidata-bugs] [Maniphest] [Commented On] T125392: [Task] figure out the ratio of page views by logged-in vs. logged-out users

2016-02-16 Thread BBlack
BBlack added a comment. In https://phabricator.wikimedia.org/T125392#1994242, @Milimetric wrote: > @BBlack - so you think cache_status is not even close to accurate? Do we > have other accurate measurements of it so we could compare to what extent > it's misleading? I'm happy

[Wikidata-bugs] [Maniphest] [Commented On] T126730: [RFC] Caching for results of wikidata Sparql queries

2016-02-15 Thread BBlack
BBlack added a comment. IIRC, the problem we've beat our heads against in past SPARQL-related tickets is the fact that SPARQL clients are using `POST` method for readonly queries, due to argument length issues and whatnot. On the surface, that's a dealbreaker for caching them as `POST` isn't
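
For illustration (the endpoint URL and query are just examples, and whether any given cache actually stores the GET depends on its configuration): the same read-only query sent as a GET is addressable by its URL and therefore cacheable, while the POST form carries the query in the body and generally is not.

    import urllib.parse
    import urllib.request

    ENDPOINT = "https://query.wikidata.org/sparql"
    QUERY = "SELECT ?item WHERE { ?item wdt:P31 wd:Q146 } LIMIT 1"
    HEADERS = {"Accept": "application/sparql-results+json",
               "User-Agent": "caching-example/0.1"}

    # Cache-friendly: the whole request is identified by its URL.
    url = ENDPOINT + "?" + urllib.parse.urlencode({"query": QUERY})
    with urllib.request.urlopen(urllib.request.Request(url, headers=HEADERS)) as resp:
        print("GET", resp.status, len(resp.read()))

    # Cache-hostile: same query in a POST body; intermediaries normally won't store it.
    body = urllib.parse.urlencode({"query": QUERY}).encode()
    with urllib.request.urlopen(urllib.request.Request(ENDPOINT, data=body, headers=HEADERS)) as resp:
        print("POST", resp.status, len(resp.read()))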

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-02-03 Thread BBlack
BBlack added a comment. So, current thinking is that at least one of (maybe two of?) the bumps are from moving what used to be synchronous HTCP purge during requests to JobRunner jobs which should be doing the same thing. However, assuming it's that alone (or even just investigating that part

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-02-03 Thread BBlack
BBlack added a comment. Well then apparently the 10/s edits to all projects number I found before is complete bunk :) http://wikipulse.herokuapp.com/ has numbers for wikidata edits that approximately line up with yours, and then shows Wikipedias at about double that rate (which might

[Wikidata-bugs] [Maniphest] [Updated] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-02-03 Thread BBlack
BBlack added a comment. heh so: https://phabricator.wikimedia.org/T113192 -> https://gerrit.wikimedia.org/r/#/c/258365/5 is probably the Jan 20 bump.

[Wikidata-bugs] [Maniphest] [Commented On] T125392: [Task] figure out the ratio of page views by logged-in vs. logged-out users

2016-02-03 Thread BBlack
BBlack added a comment. FYI - "cache_status" is not an accurate reflection of anything. I'm not sure why we really even log it for analytics. The problem is that it only reflects some varnish state about the first of up to 3 layers of caching, and even then it does so poorly. T

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-02-01 Thread BBlack
BBlack added a comment. @ori - yeah that makes sense for the initial bump, and I think there may have even been a followup to do deferred purges, which may be one of the other multipliers, but I haven't found it yet (as in, insert an immediate job and also somehow insert one that fires

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-02-01 Thread BBlack
BBlack added a comment. Another data point from the weekend: In one sample I took Saturday morning, when I sampled for 300s, the top site being purged was srwiki, and something like 98% of the purges flowing for srwiki were all Talk: pages (well, with Talk: as %-encoded something in Serbian

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-02-01 Thread BBlack
BBlack added a comment. Regardless, the average rate of HTCP these days is normally-flat-ish (a few scary spikes aside), and is mostly throttled by the jobqueue. The question still remains: what caused permanent, large bumps in the jobqueue htmlCacheUpdate insertion rate on ~Dec4, ~Dec11

[Wikidata-bugs] [Maniphest] [Updated] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-02-01 Thread BBlack
BBlack added a comment. @daniel - Sorry I should have linked this earlier, I made a paste at the time: https://phabricator.wikimedia.org/P2547 . Note that `/%D0%A0%D0%B0%D0%B7%D0%B3%D0%BE%D0%B2%D0%BE%D1%80:` is the Serbian srwiki version of `/Talk:`.
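
Decoding that prefix is a one-liner and confirms the reading above (a small sketch; Python's urllib simply reverses the percent-encoding):

    from urllib.parse import unquote

    prefix = "/%D0%A0%D0%B0%D0%B7%D0%B3%D0%BE%D0%B2%D0%BE%D1%80:"
    print(unquote(prefix))  # -> /Разговор:  (srwiki's "Talk:" namespace)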

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-01-31 Thread BBlack
BBlack added a comment. Well, we have 3 different stages of rate-increase in the insert graph, so it could well be that we have 3 independent causes to look at here. And it's not necessarily true that any of them are buggy, but we need to understand what they're doing and why, because maybe

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-01-31 Thread BBlack
BBlack added a comment. Continuing with some stuff I was saying in IRC the other day. At the "new normal", we're seeing something in the approximate ballpark of 400/s articles purged (which is then multiplied commonly for ?action=history and mobile and ends up more like ~1600/s a
