[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2020-01-15 Thread gerritbot
gerritbot added a comment. Change 499363 abandoned by Addshore: Add caching of Special:EntityData results https://gerrit.wikimedia.org/r/499363 TASK DETAIL https://phabricator.wikimedia.org/T217897 EMAIL PREFERENCES

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-06-30 Thread Smalyshev
Smalyshev added a comment. Eventually, yes.

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-06-30 Thread Addshore
Addshore added a comment. I guess this will eventually be in wdqs 0.3.3?

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-06-25 Thread Smalyshev
Smalyshev added a comment. No release yet, but if you check out Updater or WDQS build, you get the same behavior.

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-06-25 Thread Addshore
Addshore added a comment. Will this change also get rolled out to 3rd parties using the updater? / Is it in a certain release?

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-05-13 Thread Addshore
Addshore added a comment. So, I did some really crappy analysis of the hit rate in Varnish before and after this change, looking at the 5th of April and the 5th of May (one before and one after as far as I can tell). | SUMMARY | April 5th | May 5th | | hit-front | 1132 |

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-04-30 Thread gerritbot
gerritbot added a comment. Change 504990 **merged** by Gehel: [operations/puppet@production] Enable revision fetches in production https://gerrit.wikimedia.org/r/504990

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-04-23 Thread gerritbot
gerritbot added a comment. Change 502655 **merged** by jenkins-bot: [mediawiki/extensions/Wikibase@master] Allow revision dump for redirects https://gerrit.wikimedia.org/r/502655

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-04-19 Thread Smalyshev
Smalyshev added a comment. Results of caching can be seen here: https://grafana.wikimedia.org/d/00489/wikidata-query-service?orgId=1=now-30d=now=24_name=wdqs-internal Deploy date is Apr 11, fetch time drops from 135/195 ms (eqiad/codfw) to 90/150 ms when requests are cached.

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-04-18 Thread gerritbot
gerritbot added a comment. Change 504990 had a related patch set uploaded (by Smalyshev; owner: Smalyshev): [operations/puppet@production] Enable revision fetches in production https://gerrit.wikimedia.org/r/504990

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-04-11 Thread gerritbot
gerritbot added a comment. Change 502909 **merged** by Gehel: [operations/puppet@production] Enable revisions support on internal clusters https://gerrit.wikimedia.org/r/502909

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-04-10 Thread gerritbot
gerritbot added a comment. Change 502909 had a related patch set uploaded (by Smalyshev; owner: Smalyshev): [operations/puppet@production] Enable revisions support on internal clusters https://gerrit.wikimedia.org/r/502909

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-04-09 Thread gerritbot
gerritbot added a comment. Change 502655 had a related patch set uploaded (by Smalyshev; owner: Smalyshev): [mediawiki/extensions/Wikibase@master] Allow revision dump for redirects https://gerrit.wikimedia.org/r/502655

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-04-05 Thread gerritbot
gerritbot added a comment. Change 501450 **merged** by jenkins-bot: [wikidata/query/rdf@master] Work around status 400 on redirect revision fetch https://gerrit.wikimedia.org/r/501450

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-04-04 Thread gerritbot
gerritbot added a comment. Change 501450 had a related patch set uploaded (by Smalyshev; owner: Smalyshev): [wikidata/query/rdf@master] Work around status 400 on redirect revision fetch https://gerrit.wikimedia.org/r/501450

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-04-03 Thread Smalyshev
Smalyshev added a comment. I've also made a counter to check how many "forward skips" we get, i.e. cases where we load a later revision than the one named in the change. The averages are between 0.1 and 0.5, sometimes going to 1 - i.e. we're saving up to one item fetch/update per second, or since we're

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-04-03 Thread gerritbot
gerritbot added a comment. Change 501056 **merged** by Gehel: [operations/puppet@production] wdqs: expose revision-fetch mechanism https://gerrit.wikimedia.org/r/501056

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-04-03 Thread gerritbot
gerritbot added a comment. Change 501056 had a related patch set uploaded (by Gehel; owner: Smalyshev): [operations/puppet@production] wdqs: expose revision-fetch mechanism https://gerrit.wikimedia.org/r/501056

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-04-03 Thread gerritbot
gerritbot added a comment. Change 500359 **merged** by Gehel: [operations/puppet@production] Enable using revision-fetch mechanism for test & internal clusters https://gerrit.wikimedia.org/r/500359

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-04-02 Thread gerritbot
gerritbot added a comment. Change 499951 **merged** by jenkins-bot: [wikidata/query/rdf@master] Implement more cache-friendly Wikibase fetch strategy https://gerrit.wikimedia.org/r/499951

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-03-31 Thread gerritbot
gerritbot added a comment. Change 500359 had a related patch set uploaded (by Smalyshev; owner: Smalyshev): [operations/puppet@production] Enable using revision-fetch mechanism for test & internal clusters https://gerrit.wikimedia.org/r/500359

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-03-29 Thread Addshore
Addshore added a comment. In T217897#5066900, @Smalyshev wrote: > > WDQS does know what the latest version of the entity that it is trying to get updates for is, > > But "last version that WDQS knows of" can be very different from

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-03-28 Thread gerritbot
gerritbot added a comment. Change 499951 had a related patch set uploaded (by Smalyshev; owner: Smalyshev): [wikidata/query/rdf@master] Implement more cache-friendly Wikibase fetch strategy https://gerrit.wikimedia.org/r/499951

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-03-28 Thread Smalyshev
Smalyshev added a comment. @Addshore btw do I understand right that constraints cannot be fetched per-revision? In this case, do we still need cache-busting there? Or do constraints manage their own caches? I am not sure what to do here.

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-03-28 Thread Smalyshev
Smalyshev added a comment. > WDQS does know what the latest version of the entity that it is trying to get updates for is, But "last version that WDQS knows of" can be very different from "last version that Wikidata has". That's the whole issue. I had an idea recently though. Maybe

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-03-28 Thread Addshore
Addshore added a comment. In T217897#5062728, @Smalyshev wrote: > > the cache we are talking about there would be unnecessary if the wdqs just hit varnish. > > It is problematic for WDQS to "just hit varnish", because varnish does

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-03-27 Thread Smalyshev
Smalyshev added a comment. > the cache we are talking about there would be unnecessary if the wdqs just hit varnish. It is problematic for WDQS to "just hit varnish", because Varnish does not know whether a certain revision is the latest one available or not. Wikidata, on the other hand, does.
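To illustrate the trade-off discussed here, the following is a minimal sketch of the two fetch styles, not the updater's actual code. The base URL and `flavor=dump` come from the curl example quoted in this thread; the query parameter names `nocache` and `revision`, and both helper functions, are assumptions made for illustration. A URL that changes on every request can never be served from Varnish, while a URL pinned to a revision is identical for every client asking for that revision and can therefore be cached.

```python
import time
from urllib.parse import urlencode

# Base URL taken from the curl example quoted in this thread.
BASE = "https://www.wikidata.org/wiki/Special:EntityData"


def cache_busting_url(entity_id, now=None):
    """Build a URL that differs on every call (parameter name `nocache` is an assumption)."""
    stamp = int(time.time() if now is None else now)
    return f"{BASE}/{entity_id}.ttl?" + urlencode({"flavor": "dump", "nocache": stamp})


def revision_url(entity_id, revision):
    """Build a URL pinned to one revision (parameter name `revision` is an assumption)."""
    return f"{BASE}/{entity_id}.ttl?" + urlencode({"flavor": "dump", "revision": revision})


if __name__ == "__main__":
    # Hypothetical revision number, used only to show the URL shape.
    print(cache_busting_url("Q15223487"))
    print(revision_url("Q15223487", 123))
```

The revision-pinned form only helps if the client knows which revision to ask for, which is the open question in the comment above.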

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-03-27 Thread Addshore
Addshore added a comment. In T217897#5060748, @Smalyshev wrote: > Looking at the distribution of Special:EntityData fetches, if we cache entities under 10K, we will capture about 90% of them. Most frequent sizes are 1 to 4K. So

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-03-27 Thread Addshore
Addshore added a comment. In T217897#5056213, @Smalyshev wrote: > > I'm still a bit confused about this logic inside the updater, especially with this id validation checking if we have the revision already etc? > > Not sure what

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-03-26 Thread gerritbot
gerritbot added a comment. Change 499363 had a related patch set uploaded (by Smalyshev; owner: Smalyshev): [mediawiki/extensions/Wikibase@master] Add caching of Special:EntityData results https://gerrit.wikimedia.org/r/499363

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-03-26 Thread Smalyshev
Smalyshev added a comment. Looking at the distribution of Special:EntityData fetches, if we cache entities under 10K, we will capture about 90% of them. Most frequent sizes are 1 to 4K. So caching is probably worth trying.

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-03-25 Thread Smalyshev
Smalyshev added a comment. > I'm still a bit confused about this logic inside the updater, especially with this id validation checking if we have the revision already etc? Not sure what you mean by "already". You can have a revision ID in the change, and a revision ID in Wikidata, but you

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-03-22 Thread Addshore
Addshore added a comment. In T217897#5026499, @Smalyshev wrote: > > I guess the wdqs internal machines would have comparable response times? > > You can see response times for RDF loading in the dashboard:

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-03-13 Thread Addshore
Addshore added a comment. The only other thing I was going to add (forgot before I hit submit on the last post): within the cluster, Varnish-cached results for entities return much faster than the PHP-returned results (of course) | entity | varnish result | php result |

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-03-13 Thread Addshore
Addshore added a comment. >> In the case that led to this ticket, it was a remote client at Orange issuing a very high rate of these uncacheable queries It's not just Orange, it would seem. I took a quick look at the webrequest data for the WDQS updater UA and there are other

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-03-12 Thread Smalyshev
Smalyshev added a comment. > Why do these remote clients need "realtime" (no staleness) fetches of Q items? Because that's what Query Service is - realtime (well, near-realtime, given update times) queryable representation of Wikidata content in RDF form. > What I hear is it sounds

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-03-12 Thread BBlack
BBlack added a comment. I think it would be better, from my perspective, to really understand the use-cases better (which I don't). Why do these remote clients need "realtime" (no staleness) fetches of Q items? What I hear is it sounds like all clients expect everything to be perfectly

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-03-12 Thread Smalyshev
Smalyshev added a comment. > it at least make sense to raise the issue so that the WDQS use case is addressed if it can be addressed For that, we better define "WDQS case". The best I have right now is the event aggregation idea described above. In theory, it could also be combined

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-03-12 Thread Smalyshev
Smalyshev added a comment. > I'm not sure I understand why an If-Modified-Since: would not work. How would you see it working? In Varnish, it would be useless since Varnish has no way of knowing if a Wikidata item changed since being cached. If we go to the backend, first we are already

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-03-12 Thread Gehel
Gehel added a comment. Given the discussion above, I'm not sure I understand why an `If-Modified-Since:` would not work. What am I missing?

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-03-08 Thread Smalyshev
Smalyshev added a comment. > don't do cache busting on events older than X This however gave me an idea. If we kept a map of all latest revision IDs for all items we've recently updated, we could eliminate a lot of stale updates - especially when we're catching up after the lag. The
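The revision-map idea in this comment could look roughly like the sketch below. It is only an illustration under stated assumptions: the `Change` type, `record_loaded`, and `drop_stale_changes` are hypothetical names, not part of the updater, and the map is a plain in-memory dict. The point is that once a fetch has loaded revision N of an item, any queued change for that item with a revision of N or lower carries no new information and can be skipped.

```python
from typing import Dict, Iterable, List, NamedTuple


class Change(NamedTuple):
    """A change event as assumed here: which entity changed, and to which revision."""
    entity_id: str
    revision: int


def record_loaded(latest_loaded: Dict[str, int], entity_id: str, revision: int) -> None:
    """Remember the highest revision actually loaded for an entity."""
    latest_loaded[entity_id] = max(revision, latest_loaded.get(entity_id, -1))


def drop_stale_changes(changes: Iterable[Change], latest_loaded: Dict[str, int]) -> List[Change]:
    """Keep only changes whose revision is newer than what was already loaded."""
    return [c for c in changes if c.revision > latest_loaded.get(c.entity_id, -1)]


if __name__ == "__main__":
    latest: Dict[str, int] = {}
    record_loaded(latest, "Q1", 10)  # an earlier fetch already returned revision 10
    backlog = [Change("Q1", 8), Change("Q1", 10), Change("Q1", 11), Change("Q2", 3)]
    print(drop_stale_changes(backlog, latest))
```

Such a map pays off most while catching up after lag, because a single fetch of the current revision can make several older queued events for the same item redundant.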

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-03-08 Thread Smalyshev
Smalyshev added a comment. We've been around this topic a number of times, so I'll write a summary of where we are so far. I'm sorry it's going to be long, there's a bunch of issues at play. Also, if after reading this you think it's utter nonsense and I'm missing an obvious solution to this

[Wikidata-bugs] [Maniphest] [Commented On] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-03-08 Thread BBlack
BBlack added a comment. Looking at an internal version of the flavor=dump outputs of an entity, related observations: Test request from the inside: `curl -v 'https://www.wikidata.org/wiki/Special:EntityData/Q15223487.ttl?flavor=dump' --resolve www.wikidata.org:443:10.2.2.1` -