[Wikidata-bugs] [Maniphest] [Commented On] T109038: [Bug] Users are unable to login on wikidata.org until they clear their cookies

2015-08-21 Thread BBlack
BBlack added a comment. Ah, that makes some logical sense. We should probably strip the duplicate _User cookie the way we do for the duplicate _Token to address the bulk of it.

[Wikidata-bugs] [Maniphest] [Commented On] T109038: [Bug] Users are unable to login on wikidata.org until they clear their cookies

2015-08-21 Thread BBlack
BBlack added a comment. Pushed a fix for deleting the duplicate centralauth_User cookie for wikidata.org; it should be in effect globally now.

[Wikidata-bugs] [Maniphest] [Commented On] T107602: Set up a public interface to the wikidata query service

2015-08-04 Thread BBlack
BBlack added a comment. In https://phabricator.wikimedia.org/T107602#1507676, @JanZerebecki wrote: > If we put it in misc then this would be the first that has another level behind misc instead of one named server. I have no preference. You or whoever wants to merge it chooses? Another

[Wikidata-bugs] [Maniphest] [Updated] T107602: Set up a public interface to the wikidata query service

2015-08-04 Thread BBlack
BBlack added a comment. Bringing this conversation back here from the comments in https://gerrit.wikimedia.org/r/#/c/228411/. The short summary of what this does: a read-only mirror of Wikidata.org (only the public information) in a special database, for anyone to run queries against

[Wikidata-bugs] [Maniphest] [Changed Subscribers] T109038: [Bug] Users are unable to login on wikidata.org until they clear their cookies

2015-08-14 Thread BBlack
BBlack added a subscriber: JanZerebecki. BBlack added a comment. We think the workaround deployed via https://gerrit.wikimedia.org/r/231556 should fix this up well enough. It worked for @JanZerebecki, who still had the old bad cookie, which the fixup wiped out. Can others confirm?

[Wikidata-bugs] [Maniphest] [Commented On] T107601: Assign an LVS service to the wikidata query service

2015-08-04 Thread BBlack
BBlack added a subscriber: BBlack. BBlack added a comment. Do we actually need an internal service endpoint like `wdqs.svc.eqiad.wmnet` for this, or are we just doing this as part of a standard construction, where services with multiple hosts behind public varnish/LVS need another layer of LVS

[Wikidata-bugs] [Maniphest] [Commented On] T109038: [Bug] Users are unable to login on wikidata.org until they clear their cookies

2015-08-14 Thread BBlack
BBlack added a comment. Does someone have the specific details here on what cookie name to wipe, on requests to which domain name(s)?

[Wikidata-bugs] [Maniphest] [Commented On] T109038: [Bug] Users are unable to login on wikidata.org until they clear their cookies

2015-08-14 Thread BBlack
BBlack added a comment. As best I can tell from my own testing (but I think someone with deeper insight into the CORS change for (www|query).wikidata.org and S:UL and such would need to confirm this sounds sane): I was able to reproduce the issue, and I was able to apparently perma-fix

[Wikidata-bugs] [Maniphest] [Commented On] T112151: Support POST for SPARQL query endpoint

2015-10-21 Thread BBlack
BBlack added a comment. In https://phabricator.wikimedia.org/T112151#1739918, @Smalyshev wrote: >> As Andrew said above, why not support this directly in WDQS if you have to support it at all? > Because in Blazegraph, allowing POST means allowing write requests

[Wikidata-bugs] [Maniphest] [Commented On] T112151: Support POST for SPARQL query endpoint

2015-10-20 Thread BBlack
BBlack added a subscriber: BBlack. BBlack added a comment. I'm not a fan of this on a few levels: 1. As Andrew said above, why not support this directly in WDQS if you have to support it at all? As in, let the POSTs come through the rest of the stack unmolested, and deal with it inside WDQS

[Wikidata-bugs] [Maniphest] [Commented On] T109072: [Task] Revert https://gerrit.wikimedia.org/r/#/c/231556/3 on 2015-09-14

2015-09-15 Thread BBlack
BBlack added a comment. The commit is staged above, but we should probably hold until the 16th or so just in case.

[Wikidata-bugs] [Maniphest] [Changed Project Column] T107602: Set up a public interface to the wikidata query service

2015-09-22 Thread BBlack
BBlack moved this task to Done on the Traffic workboard.

[Wikidata-bugs] [Maniphest] [Updated] T119917: Set up backend per-IP limits on varnish for WDQS

2015-12-01 Thread BBlack
BBlack added a project: Traffic.

[Wikidata-bugs] [Maniphest] [Commented On] T119917: Set up backend per-IP limits on varnish for WDQS

2015-12-01 Thread BBlack
BBlack added a subscriber: BBlack. BBlack added a comment. It would be best to use the `X-Client-IP` header as the notion of the client IP address for these sorts of purposes. It is intended to resolve trusted XFF, but has a much shorter list (intended to be improved on), whereas TrustedXFF
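
For illustration only (not from the thread): a per-IP limit keyed on X-Client-IP could look roughly like the VCL 4 sketch below, assuming the vsthrottle vmod from varnish-modules (argument order and optional parameters vary by version), with made-up key prefix, limit, and window values.

    vcl 4.0;
    import vsthrottle;

    sub vcl_recv {
        # Key on the trusted client IP resolved at the edge, not the raw
        # XFF chain; allow at most 10 requests per 10s per client.
        if (vsthrottle.is_denied("wdqs:" + req.http.X-Client-IP, 10, 10s)) {
            return (synth(429, "Too Many Requests"));
        }
    }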

[Wikidata-bugs] [Maniphest] [Changed Subscribers] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-06-17 Thread BBlack
BBlack added a subscriber: GWicke. BBlack added a comment. @aaron and @GWicke - both patches sound promising, thanks for digging into this topic!

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-02-01 Thread BBlack
BBlack added a comment. @ori - yeah that makes sense for the initial bump, and I think there may have even been a followup to do deferred purges, which may be one of the other multipliers, but I haven't found it yet (as in, insert an immediate job and also somehow insert one that fires

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-01-31 Thread BBlack
BBlack added a comment. Well, we have 3 different stages of rate-increase in the insert graph, so it could well be that we have 3 independent causes to look at here. And it's not necessarily true that any of them are buggy, but we need to understand what they're doing and why, because maybe

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-01-31 Thread BBlack
BBlack added a comment. Continuing with some stuff I was saying in IRC the other day. At the "new normal", we're seeing something in the approximate ballpark of 400/s articles purged (which is then multiplied commonly for ?action=history and mobile and ends up more like ~1600/s a

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-02-01 Thread BBlack
BBlack added a comment. Another data point from the weekend: In one sample I took Saturday morning, when I sampled for 300s, the top site being purged was srwiki, and something like 98% of the purges flowing for srwiki were all Talk: pages (well, with Talk: as %-encoded something in Serbian

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-02-01 Thread BBlack
BBlack added a comment. Regardless, the average rate of HTCP these days is normally-flat-ish (a few scary spikes aside), and is mostly throttled by the jobqueue. The question still remains: what caused permanent, large bumps in the jobqueue htmlCacheUpdate insertion rate on ~Dec4, ~Dec11

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-02-03 Thread BBlack
BBlack added a comment. So, current thinking is that at least one of (maybe two of?) the bumps are from moving what used to be synchronous HTCP purge during requests to JobRunner jobs which should be doing the same thing. However, assuming it's that alone (or even just investigating that part

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-02-03 Thread BBlack
BBlack added a comment. Well then apparently the 10/s edits to all projects number I found before is complete bunk :) http://wikipulse.herokuapp.com/ has numbers for wikidata edits that approximately line up with yours, and then shows Wikipedias at about double that rate (which might

[Wikidata-bugs] [Maniphest] [Updated] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-02-03 Thread BBlack
BBlack added a comment. heh so: https://phabricator.wikimedia.org/T113192 -> https://gerrit.wikimedia.org/r/#/c/258365/5 is probably the Jan 20 bump.

[Wikidata-bugs] [Maniphest] [Commented On] T125392: [Task] figure out the ratio of page views by logged-in vs. logged-out users

2016-02-03 Thread BBlack
BBlack added a comment. FYI - "cache_status" is not an accurate reflection of anything. I'm not sure why we really even log it for analytics. The problem is that it only reflects some varnish state about the first of up to 3 layers of caching, and even then it does so poorly.

[Wikidata-bugs] [Maniphest] [Updated] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-02-01 Thread BBlack
BBlack added a comment. @daniel - Sorry, I should have linked this earlier; I made a paste at the time: https://phabricator.wikimedia.org/P2547. Note that `/%D0%A0%D0%B0%D0%B7%D0%B3%D0%BE%D0%B2%D0%BE%D1%80:` is the Serbian srwiki version of `/Talk:`.

[Wikidata-bugs] [Maniphest] [Updated] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-01-25 Thread BBlack
BBlack added a project: Wikidata.

[Wikidata-bugs] [Maniphest] [Changed Subscribers] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-01-25 Thread BBlack
BBlack added a subscriber: JanZerebecki.

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-01-25 Thread BBlack
BBlack added a comment. Yeah, but the rate increase we're looking at is actually in the htmlCacheUpdate job insertion rate, regardless of magnification due to pages-affected-per-update. I'm surprised that we don't have any logs/data as to the source of those jobs.

[Wikidata-bugs] [Maniphest] [Commented On] T126730: [RFC] Caching for results of wikidata Sparql queries

2016-02-15 Thread BBlack
BBlack added a comment. IIRC, the problem we've beat our heads against in past SPARQL-related tickets is the fact that SPARQL clients use the `POST` method for readonly queries, due to argument-length issues and whatnot. On the surface, that's a dealbreaker for caching them, as `POST` isn't
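
For context on why POST is a dealbreaker by default: Varnish's builtin vcl_recv never looks up non-GET/HEAD requests in the cache, roughly as in this VCL 4 sketch (a simplification of the builtin logic, not the production config).

    sub vcl_recv {
        # Non-idempotent methods bypass the cache entirely, so two identical
        # POSTed SPARQL queries both hit the applayer.
        if (req.method != "GET" && req.method != "HEAD") {
            return (pass);
        }
    }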

[Wikidata-bugs] [Maniphest] [Commented On] T125392: [Task] figure out the ratio of page views by logged-in vs. logged-out users

2016-02-16 Thread BBlack
BBlack added a comment. In https://phabricator.wikimedia.org/T125392#1994242, @Milimetric wrote: > @BBlack - so you think cache_status is not even close to accurate? Do we > have other accurate measurements of it so we could compare to what extent > it's misleading? I'm happy

[Wikidata-bugs] [Maniphest] [Commented On] T126730: [RFC] Caching for results of wikidata Sparql queries

2016-02-17 Thread BBlack
BBlack added a comment. In https://phabricator.wikimedia.org/T126730#2034900, @Christopher wrote: > I may be wrong, but the headers that are returned from a request to the nginx > server wdqs1002 say that varnish 1.1 is already being used there. It's varnish 3.0.6 currently (4.x is

[Wikidata-bugs] [Maniphest] [Edited] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-04-07 Thread BBlack
BBlack edited the task description.

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-04-07 Thread BBlack
BBlack added a comment. F3845100: Screen Shot 2016-04-07 at 7.47.28 PM.png <https://phabricator.wikimedia.org/F3845100>

[Wikidata-bugs] [Maniphest] [Commented On] T121135: Banners fail to show up occasionally on Russian Wikivoyage

2016-04-05 Thread BBlack
BBlack added a comment. In https://phabricator.wikimedia.org/T121135#1910435, @Atsirlin wrote: > @Legoktm: Frankly speaking, for a small project like Wikivoyage the cache brings no obvious benefits, but triggers many serious issues including the problem of page banners and

[Wikidata-bugs] [Maniphest] [Updated] T127014: Empty result on a tree query

2016-03-23 Thread BBlack
BBlack added a blocking task: T128813: cache_misc's misc_fetch_large_objects has issues.

[Wikidata-bugs] [Maniphest] [Commented On] T127014: Empty result on a tree query

2016-03-23 Thread BBlack
BBlack added a comment. I did some live experimentation with manual edits to the VCL. It is the `between_bytes_timeout`, but the situation is complex. The timeout that's failing is on the varnish frontend fetching from the varnish backend. These are fixed at 2s, but because this is all

[Wikidata-bugs] [Maniphest] [Updated] T127014: Empty result on a tree query

2016-03-23 Thread BBlack
BBlack edited projects, added Traffic; removed Varnish.

[Wikidata-bugs] [Maniphest] [Updated] T127014: Empty result on a tree query

2016-03-23 Thread BBlack
BBlack added a comment. This is probably due to backend timeouts, I would guess? The default applayer settings being applied to wdqs include `between_bytes_timeout` at only 4s, whereas `first_byte_timeout` is 185s. So if wdqs delayed all output, it would have 3 minutes or so, but once
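
An illustrative backend definition showing the two timeouts being contrasted (values taken from the comment above; the real definitions live in puppet, so treat this as a sketch):

    backend wdqs {
        .host = "wdqs.svc.eqiad.wmnet";  # internal service endpoint
        .port = "80";
        .first_byte_timeout = 185s;      # long wait allowed before any output
        .between_bytes_timeout = 4s;     # but mid-stream gaps must stay short
    }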

[Wikidata-bugs] [Maniphest] [Updated] T133490: Wikidata Query Service REST endpoint returns truncated results

2016-04-24 Thread BBlack
BBlack edited projects, added Traffic; removed Varnish.

[Wikidata-bugs] [Maniphest] [Updated] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-04-27 Thread BBlack
BBlack added a blocked task: T133821: Content purges are unreliable.

[Wikidata-bugs] [Maniphest] [Updated] T102476: RFC: Requirements for change propagation

2016-04-27 Thread BBlack
BBlack added a blocked task: T133821: Content purges are unreliable.

[Wikidata-bugs] [Maniphest] [Commented On] T134989: WDQS empty response - transfer closed with 15042 bytes remaining to read

2016-05-11 Thread BBlack
BBlack added a comment. Assuming there was no transient issue (which became cached) on the wdqs end of things, then this was likely a transient thing from nginx experiments or the cache_misc varnish4 upgrade. I banned all wdqs objects from cache_misc and now your test URL works fine. Can

[Wikidata-bugs] [Maniphest] [Commented On] T134989: WDQS empty response - transfer closed with 15042 bytes remaining to read

2016-05-11 Thread BBlack
BBlack added a comment. Status update: we've been debugging this off and on all day. It's some kind of bug fallout from cache_misc's upgrade to Varnish 4. It's a very complicated bug, and we don't really understand it yet. We've made some band-aid fixes to VCL for now which should keep

[Wikidata-bugs] [Maniphest] [Merged] T134989: WDQS empty response - transfer closed with 15042 bytes remaining to read

2016-05-12 Thread BBlack
BBlack added a subscriber: Kghbln. BBlack merged a task: T135121: stats.wikimedia.org down.

[Wikidata-bugs] [Maniphest] [Commented On] T134989: WDQS empty response - transfer closed with 15042 bytes remaining to read

2016-05-12 Thread BBlack
BBlack added a comment. In the merged ticket above, it's browser access to stats.wm.o, and the browser's getting a 304 Not Modified and complaining about it (due to missing character encoding supposedly, but it's entirely likely it's missing everything and that's just the first thing

[Wikidata-bugs] [Maniphest] [Commented On] T134989: WDQS empty response - transfer closed with 15042 bytes remaining to read

2016-05-12 Thread BBlack
BBlack added a comment. So we currently have several experiments in play trying to figure this out: 1. We've got 2x upstream bugfixes applied to our varnishd on cache_misc: https://github.com/varnishcache/varnish-cache/commit/d828a042b3fc2c2b4f1fea83021f0d5508649e50 + https

[Wikidata-bugs] [Maniphest] [Commented On] T134989: WDQS empty response - transfer closed with 15042 bytes remaining to read

2016-05-16 Thread BBlack
BBlack added a comment. cache_maps cluster switched to the new varnish package today.

[Wikidata-bugs] [Maniphest] [Commented On] T134989: WDQS empty response - transfer closed with 15042 bytes remaining to read

2016-05-13 Thread BBlack
BBlack added a comment. I forgot one of our temporary hacks in the list above in https://phabricator.wikimedia.org/T134989#2290254: 4. https://gerrit.wikimedia.org/r/#/c/288656/ - we also enabled a critical small bit here in v4 vcl_hit. I reverted this for now during the varnish3

[Wikidata-bugs] [Maniphest] [Commented On] T134989: WDQS empty response - transfer closed with 15042 bytes remaining to read

2016-05-13 Thread BBlack
BBlack added a comment. Current State: - cp3007 and cp1045 are depooled from user traffic, icinga-downtimed for several days, and have puppet disabled. Please do not re-enable puppet on these! They also have confd shut down, and are running custom configs to continue debugging

[Wikidata-bugs] [Maniphest] [Updated] T134989: WDQS empty response - transfer closed with 15042 bytes remaining to read

2016-05-13 Thread BBlack
BBlack added a blocked task: T131501: Convert misc cluster to Varnish 4.

[Wikidata-bugs] [Maniphest] [Block] T133490: Wikidata Query Service REST endpoint returns truncated results

2016-05-13 Thread BBlack
BBlack reopened blocking task T131501: Convert misc cluster to Varnish 4 as "Open".

[Wikidata-bugs] [Maniphest] [Commented On] T134989: WDQS empty response - transfer closed with 15042 bytes remaining to read

2016-05-12 Thread BBlack
BBlack added a comment. Has anyone been able to reproduce any of the problems in the tickets merged into here, since roughly the timestamp of the above message?

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2016-05-03 Thread BBlack
BBlack added a comment. I really don't think it's specifically Wikidata-related either at this point. Wikidata might be a significant driver of update jobs in general, but the code changes driving the several large rate increases were probably generic to all update jobs.

[Wikidata-bugs] [Maniphest] [Commented On] T133866: Varnish seems to sometimes mangle uncompressed API results

2016-05-03 Thread BBlack
BBlack added a comment. Do you know if some normal traffic is affected, such that we'd know a start date for a recent change in behavior? Or is it suspected that it was always this way? I've been digging through some debugging on this URL (which is an applayer chunked-response

[Wikidata-bugs] [Maniphest] [Commented On] T133866: Varnish seems to sometimes mangle uncompressed API results

2016-05-03 Thread BBlack
BBlack added a comment. Just jotting down the things I know so far from investigating this morning. I still don't have a good answer yet. Based on just the test URL, debugging it extensively at various layers: 1. The response size of that URL is in the ballpark of 32KB uncompressed

[Wikidata-bugs] [Maniphest] [Commented On] T133866: Varnish seems to sometimes mangle uncompressed API results

2016-05-05 Thread BBlack
BBlack added a comment. Did some further testing on an isolated test machine, using our current varnish3 package. - Got 2833-byte test file from uncorrupted (--compressed) output on prod. This is the exact compressed content bytes emitted by MW/Apache for the broken test URL

[Wikidata-bugs] [Maniphest] [Triaged] T133866: Varnish seems to sometimes mangle uncompressed API results

2016-05-07 Thread BBlack
BBlack triaged this task as "High" priority.

[Wikidata-bugs] [Maniphest] [Updated] T133866: Varnish seems to sometimes mangle uncompressed API results

2016-05-07 Thread BBlack
BBlack added a comment. Thanks for merging in the probably-related tasks. I had somehow missed really noticing T123159 earlier... So probably digging into gunzip itself isn't a fruitful path. I'm going to open a separate blocker for this that's private, so we can keep merging public

[Wikidata-bugs] [Maniphest] [Updated] T133866: Varnish seems to sometimes mangle uncompressed API results

2016-05-07 Thread BBlack
BBlack added a blocking task: Restricted Task.

[Wikidata-bugs] [Maniphest] [Unblock] T133866: Varnish seems to sometimes mangle uncompressed API results

2016-05-09 Thread BBlack
BBlack closed blocking task Restricted Task as "Resolved".

[Wikidata-bugs] [Maniphest] [Updated] T133490: Wikidata Query Service REST endpoint returns truncated results

2016-05-09 Thread BBlack
BBlack added a blocking task: T131501: Convert misc cluster to Varnish 4.

[Wikidata-bugs] [Maniphest] [Updated] T133866: Varnish seems to sometimes mangle uncompressed API results

2016-05-09 Thread BBlack
BBlack added a comment. So, as it turns out, this is a general varnishd bug in our specific varnishd build. For purposes of this bug, our varnishd code is essentially 3.0.7 plus a bunch of ancient forward-ported 'plus' patches related to streaming, and we're missing https://github.com

[Wikidata-bugs] [Maniphest] [Updated] T133490: Wikidata Query Service REST endpoint returns truncated results

2016-05-09 Thread BBlack
BBlack added a comment. We now have some understanding of the mechanism of this bug (https://phabricator.wikimedia.org/T133866#2275985). It should go away in the imminent varnish 4 upgrade of the misc cluster in https://phabricator.wikimedia.org/T131501.

[Wikidata-bugs] [Maniphest] [Closed] T133490: Wikidata Query Service REST endpoint returns truncated results

2016-05-09 Thread BBlack
BBlack closed this task as "Resolved". BBlack claimed this task. BBlack added a comment. This works now. There's a significant pause at the start of the transfer from the user's perspective if it's not a cache hit, because streaming is disabled as a workaround (so it has to compl

[Wikidata-bugs] [Maniphest] [Closed] T133866: Varnish seems to sometimes mangle uncompressed API results

2016-05-09 Thread BBlack
BBlack closed this task as "Resolved". BBlack added a comment. My test cases on cache_text work now, should be resolved!

[Wikidata-bugs] [Maniphest] [Commented On] T142944: Performance and caching considerations for article placeholders accesses

2016-08-16 Thread BBlack
BBlack added a comment. 30 minutes isn't really reasonable, and neither is spamming more purge traffic. If there's a constant risk of the page content breaking without invalidation, how is even 30 minutes acceptable? Doesn't this mean that on average they'll be broken for 15 minutes after

[Wikidata-bugs] [Maniphest] [Commented On] T142944: Performance and caching considerations for article placeholders accesses

2016-08-17 Thread BBlack
BBlack added a comment. I think I'm lacking a lot of context here about these special pages and placeholders. But my bottom line thoughts are currently along these lines: How do actual, real-world, anonymous users interact with these placeholders and special pages? What value is it providing

[Wikidata-bugs] [Maniphest] [Commented On] T142944: Performance and caching considerations for article placeholders accesses

2016-11-08 Thread BBlack
BBlack added a comment. Nothing was ever resolved here. 30 minutes seems like an arbitrary number with no formal basis or reasoning, and is way shorter than we'd like for anything article-like.

[Wikidata-bugs] [Maniphest] [Commented On] T142944: Performance and caching considerations for article placeholders accesses

2016-11-08 Thread BBlack
BBlack added a comment. I clicked Submit too soon :) Continuing: We'd expect content to be at minimum a day, if not significantly longer. MW currently emits 2-week cache headers (with plans to eventually bring that down closer to a day, but those plans are still further off). Cache invalidation
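
A minimal sketch of the kind of TTL cap being described, mirroring the 2-week headers MW currently emits (hypothetical, not the production VCL):

    sub vcl_backend_response {
        # Honor shorter applayer TTLs, but never cache article-like content
        # beyond the 2-week Cache-Control ceiling.
        if (beresp.ttl > 14d) {
            set beresp.ttl = 14d;
        }
    }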

[Wikidata-bugs] [Maniphest] [Updated] T132457: Move wdqs to an LVS service

2016-10-11 Thread BBlack
BBlack added a parent task: T147844: Standardize varnish applayer backend definitions.

[Wikidata-bugs] [Maniphest] [Closed] T132457: Move wdqs to an LVS service

2016-10-12 Thread BBlack
BBlack closed this task as "Resolved". BBlack claimed this task.

[Wikidata-bugs] [Maniphest] [Updated] T99531: [Task] move wikiba.se webhosting to wikimedia misc-cluster

2017-07-27 Thread BBlack
BBlack added a project: Traffic.

[Wikidata-bugs] [Maniphest] [Updated] T153563: Consider switching to HTTPS for Wikidata query service links

2017-06-26 Thread BBlack
BBlack removed a parent task: T104681: HTTPS Plans (tracking / high-level info).

[Wikidata-bugs] [Maniphest] [Reopened] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2017-05-19 Thread BBlack
BBlack reopened this task as "Open". BBlack added a comment. Not resolved, as the purge graphs can attest!

[Wikidata-bugs] [Maniphest] [Updated] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2017-05-30 Thread BBlack
BBlack added a comment. The lack of graph data from falling off the history is a sad commentary on how long this has remained unresolved :( Some salient points from earlier within this ticket, to recap: In T124418#1985526, @BBlack wrote: Continuing with some stuff I was saying in IRC the other

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2017-05-31 Thread BBlack
BBlack added a comment. We can get broader averages by dividing the values seen in the aggregate client status code graphs using eqiad's text cluster (the remote sites would expect fewer due to some of the bursts being more likely to be dropped by the network). This shows the past week's average

[Wikidata-bugs] [Maniphest] [Commented On] T124418: Investigate massive increase in htmlCacheUpdate jobs in Dec/Jan

2017-06-01 Thread BBlack
BBlack added a comment. Yeah that was the plan, for XKey to help here by consolidating that down to a single HTCP / PURGE per article touched. It's not useful for the mass-scale case (e.g. template/link references), as it doesn't scale well in that direction. But for the case like "1 ar
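
The XKey idea in VCL terms, closely following the xkey vmod's documented PURGE example (a sketch, not the deployed config): the applayer tags every variant of an article with one surrogate key, so a single purge request invalidates the desktop, mobile, and ?action=history copies together.

    import xkey;

    sub vcl_recv {
        if (req.method == "PURGE") {
            # req.http.xkey-purge carries the surrogate key (e.g. a page ID);
            # one call removes every cached object tagged with it.
            set req.http.n-gone = xkey.purge(req.http.xkey-purge);
            return (synth(200, "Invalidated " + req.http.n-gone + " objects"));
        }
    }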

[Wikidata-bugs] [Maniphest] [Updated] T175588: Server overloaded .. can't save (only remove or cancel)

2017-09-11 Thread BBlack
BBlack removed parent tasks: T174932: Recurrent 'mailbox lag' critical alerts and 500s, T175473: Multiple 503 Errors.

[Wikidata-bugs] [Maniphest] [Commented On] T175588: Server overloaded .. can't save (only remove or cancel)

2017-09-11 Thread BBlack
BBlack added a comment. Can you explain in more detail? Is the subject of this ticket what was shown as an error in your browser window? I doubt this is related to varnish and/or "mailbox lag".

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-11-22 Thread BBlack
BBlack added a comment. No, we never made an incident report on this one, and I don't think it would be fair at this time to implicate ORES as a cause. We can't really say that ORES was directly involved at all (or any of the other services investigated here). Because the cause was so unknown

[Wikidata-bugs] [Maniphest] [Changed Status] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-11-06 Thread BBlack
BBlack lowered the priority of this task from "High" to "Normal". BBlack changed the task status from "Open" to "Stalled". BBlack added a comment. The timeout changes above will offer some insulation, and as time passes we're not seeing evidence of this pr

[Wikidata-bugs] [Maniphest] [Commented On] T99531: [Task] move wikiba.se webhosting to wikimedia misc-cluster

2017-12-11 Thread BBlack
BBlack added a comment. It's a pain any direction we slice this, and I'm not fond of adding new canonical domains outside the known set for individual low-traffic projects. We didn't add new domains for a variety of other public-facing efforts (e.g. wdqs, ORES, maps, etc). We don't have clear

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-28 Thread BBlack
BBlack added a comment. Updates from the Varnish side of things today (since I've been bad about getting commits/logs tagged onto this ticket): 18:15 - I took over looking at today's outburst on the Varnish side. The current target at the time was cp1053 (after elukey's earlier restart of cp1055

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-28 Thread BBlack
BBlack added a comment. A while after the above, @hoo started focusing on a different aspect of this we've been somewhat ignoring as more of a side-symptom: that there tend to be a lot of sockets in a strange state on the "target" varnish, to various MW nodes. They look strange on

[Wikidata-bugs] [Maniphest] [Updated] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-29 Thread BBlack
BBlack added a comment. Now that I'm digging deeper, it seems there are one or more projects in progress built around Push-like things, in particular T113125 . I don't see any evidence that there's been live deploy of them yet, but maybe I'm missing something or other. If we have a live deploy

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-29 Thread BBlack
BBlack added a comment. Does Echo have any kind of push notification going on, even in light testing yet?

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-30 Thread BBlack
BBlack added a comment. In T179156#3718772, @ema wrote: > There's a timeout limiting the total amount of time varnish is allowed to spend on a single request, send_timeout, defaulting to 10 minutes. Unfortunately there's no counter tracking when the timer kicks in, although a debug line is logged

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-30 Thread BBlack
BBlack added a comment. In T179156#3719928, @daniel wrote: > In any case, this would consume front-edge client connections, but wouldn't trigger anything deeper into the stack. That's assuming varnish always caches the entire request, and never "streams" to the backend, even for file upl

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-30 Thread BBlack
BBlack added a comment. Trickled-in POST on the client side would be something else. Varnish's timeout_idle, which is set to 5s on our frontends, acts as the limit for receiving all client request headers, but I'm not sure that it has such a limitation that applies to client-sent bodies. In any

[Wikidata-bugs] [Maniphest] [Lowered Priority] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-30 Thread BBlack
BBlack lowered the priority of this task from "Unbreak Now!" to "High". BBlack added a comment. Reducing this from UBN->High, because current best-working-theory is this problem is gone so long as we keep the VCL do_stream=false change reverted. Obviously, there's still some

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-30 Thread BBlack
BBlack added a comment. In T179156#3720392, @BBlack wrote: > In T179156#3719995, @BBlack wrote: >> We have an obvious case of normal slow chunked uploads of large files to commons to look at for examples to observe, though. > Rewinding a little: this is false, I was just getting confused

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-30 Thread BBlack
BBlack added a comment. In T179156#3719995, @BBlack wrote: > We have an obvious case of normal slow chunked uploads of large files to commons to look at for examples to observe, though. Rewinding a little: this is false, I was just getting confused by terminology. Commons "chunked

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-27 Thread BBlack
BBlack added a comment. Unless anyone objects, I'd like to start with reverting our emergency varnish max_connections changes from https://gerrit.wikimedia.org/r/#/c/386756. Since the end of the log above, connection counts have returned to normal: ~100, about 1/10th of the normal 1K
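
For reference, max_connections is a per-backend knob in VCL; the emergency change being reverted would have looked something like this (backend name/host hypothetical, numbers from the comment):

    backend api {
        .host = "api.svc.eqiad.wmnet";  # hypothetical internal applayer endpoint
        .port = "80";
        .max_connections = 1000;        # the "normal 1K" cap; the emergency
                                        # patch had lowered this sharply
    }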

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-27 Thread BBlack
BBlack added a comment. Copying this in from etherpad (this is less awful than 6 hours of raw IRC+SAL logs, but still pretty verbose): # cache servers work ongoing here, ethtool changes that require short depooled downtimes around short ethernet port outages: 17:49 bblack: ulsfo cp servers

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-27 Thread BBlack
BBlack added a comment. My gut instinct remains what it was at the end of the log above. I think something in the revert of wikidatawiki to wmf.4 fixed this. And I think given the timing alignment of the Fix sorting of NullResults changes + the initial ORES->wikidata fatals makes th

[Wikidata-bugs] [Maniphest] [Commented On] T179156: 503 spikes and resulting API slowness starting 18:45 October 26

2017-10-27 Thread BBlack
BBlack added a comment. In T179156#3715432, @hoo wrote: > I think I found the root cause now, seems it's actually related to the WikibaseQualityConstraints extension: Isn't that the same extension referenced in the suspect commits mentioned above? 18:51 ladsgroup@tin: Synchronized php-1.31.0

[Wikidata-bugs] [Maniphest] [Updated] T99531: [Task] move wikiba.se webhosting to wikimedia misc-cluster

2018-08-18 Thread BBlack
BBlack added a comment. There are plans underway at this point to support multiple LE certs on our standard cache terminators via the work in T199711, due by EOQ (end of Sept), which would make this whole thing simpler, with zero cert cost. I couldn't say for sure how fast we'll shake out all

[Wikidata-bugs] [Maniphest] [Commented On] T199146: "Blocked" response when trying to access constraintsrdf action from production host

2018-07-09 Thread BBlack
BBlack added a comment. This raises some questions that are probably unrelated to the problem at hand, but might affect things indirectly: Why is an internal service (wdqs) querying a public endpoint? It should probably use private internal endpoints like appservers.svc or api.svc

[Wikidata-bugs] [Maniphest] [Commented On] T199219: WDQS should use internal endpoint to communicate to Wikidata

2018-07-11 Thread BBlack
BBlack added a comment. It's a complicated topic I think, on our end. There are ways to make it work today, but when I try to write down generic steps any internal service could take to talk to any other (esp MW or RB), it bogs down in complications that are probably less than ideal in various

[Wikidata-bugs] [Maniphest] [Commented On] T206105: Optimize networking configuration for WDQS

2018-10-15 Thread BBlack
BBlack added a comment. Yes, let's look at this today. I think we need better tg3 ethernet card support in interface::rps for one of our authdnses anyway, which you'll need here too.
