[Wikidata-bugs] [Maniphest] [Updated] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-04-26 Thread Smalyshev
Smalyshev removed a project: Patch-For-Review.

TASK DETAIL
  https://phabricator.wikimedia.org/T217897

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Smalyshev
Cc: Lucas_Werkmeister_WMDE, Addshore, Smalyshev, BBlack, Aklapper, Gehel, 
alaa_wmde, Legado_Shulgin, Nandana, thifranc, AndyTan, Davinaclare77, Qtn1293, 
Lahi, Gq86, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, EBjune, merbst, 
LawExplorer, Zppix, _jensen, rosalieper, Jonas, Xmlizer, Wong128hk, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, faidon, Mbch331, Jay8g, 
fgiunchedi, joker88john, CucyNoiD, NebulousIris, Gaboe420, Versusxo, 
Majesticalreaper22, Giuliamocci, Adrian1985, Cpaulf30, Baloch007, 
Darkminds3113, Bsandipan, Lordiis, Adik2382, Ramalepe, Liugev6, WSH1906, 
Lewizho99, Maathavan
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Updated] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-04-23 Thread ReleaseTaggerBot
ReleaseTaggerBot added a project: MW-1.34-notes (1.34.0-wmf.3; 2019-04-30).



[Wikidata-bugs] [Maniphest] [Updated] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-03-26 Thread gerritbot
gerritbot added a project: Patch-For-Review.



[Wikidata-bugs] [Maniphest] [Updated] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-03-17 Thread Smalyshev
Smalyshev added a project: User-Smalyshev.



[Wikidata-bugs] [Maniphest] [Updated] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-03-15 Thread Smalyshev
Smalyshev added a comment.


  > After talking with Stas this apparently makes updating within the updater 
harder etc as it might result in more writes to sparql? (
  
  Yes, because it would have to run a SPARQL Update for each individual revision.
  
  > I guess the wdqs internal machines would have comparable response times?
  
  You can see response times for RDF loading in the dashboard: 
https://grafana.wikimedia.org/d/00489/wikidata-query-service?orgId=1=now-24h=now=26
  
  > but 11 hours in a 24 hour period is still pretty significant
  
  I'm not sure I understand how this figure was obtained, but there's absolutely 
no way the Updater spends half its time waiting for RDF loading. In reality, it 
spends most of its time in SPARQL Update.
  
  > I hope the Java updater does some amount of async work
  
  All RDF is loaded in parallel, of course (10 threads, if I remember correctly). 
It should be relatively easy to see the timings yourself: just run the Updater 
with verbose logging (DEBUG level, I think; the `-v` option should do that).
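
  As an illustration only, the parallel-fetch-and-time idea can be sketched as
  follows. This is not the Updater's code (the Updater is written in Java);
  `fetch_rdf`, the simulated latency, and the entity IDs are all hypothetical
  stand-ins, and the pool size of 10 simply mirrors the figure recalled above.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_rdf(entity_id):
    """Hypothetical stand-in for an HTTP fetch of one entity's RDF."""
    time.sleep(0.05)  # simulated network latency, not a measured value
    return f"<rdf for {entity_id}>"


def fetch_all(entity_ids, workers=10):
    """Fetch RDF for all entities using a fixed pool of worker threads."""
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input IDs.
        results = list(pool.map(fetch_rdf, entity_ids))
    elapsed = time.monotonic() - start
    return results, elapsed


ids = [f"Q{i}" for i in range(1, 21)]
docs, seconds = fetch_all(ids)
print(f"fetched {len(docs)} documents in {seconds:.2f}s")
```

  Timing the fetch phase separately like this is one way to compare it against
  the time spent in the update phase.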
  
  > writing to blazegraph while getting the next data ready?
  
  That could be possible, but it doesn't happen now. It may be a good idea to try. 
However, since SPARQL Update dominates the timings pretty heavily, it's unlikely 
we'd save much. And since we need to validate IDs against the database (to 
ensure we don't already have the revision we're about to fetch), we cannot 
fetch RDF before the previous update has finished, which reduces the 
parallelizable part to essentially only Kafka data loading, and that doesn't seem 
to be worth it.
  
  > Another thing to consider here is in theory even when using the cache 
buster method the data the wdqs updater currently gets when passing nocache=ts 
may not be up to date due to maxlag, not sure if that has been considered in 
the updater process at all?
  
  Yes, see the discussion at T210901 and T212550. TL;DR: we know it 
happens, we have a stopgap measure to counter it, but we haven't implemented the 
real solution yet.



[Wikidata-bugs] [Maniphest] [Updated] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-03-12 Thread Gehel
Gehel added a comment.


  In the end, this looks like a more generic issue of processing events with 
some level of transactionality. It looks like some of this might be addressed 
in T185233 (more specifically in T105766?). I don't fully 
understand the exact goal of those tickets, but it at least makes sense to raise 
the issue so that the WDQS use case is addressed if it can be addressed.



[Wikidata-bugs] [Maniphest] [Updated] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-03-12 Thread Gehel
Gehel added a comment.


  Somewhat related to this issue: T207837. This ticket is about sharing the 
common workload of updating, including the fetches from Wikidata, which would 
also reduce the load on Wikidata.



[Wikidata-bugs] [Maniphest] [Updated] T217897: Reduce / remove the aggessive cache busting behaviour of wdqs-updater

2019-03-08 Thread Smalyshev
Smalyshev added a comment.


  > disable cache busting by default, enable it internally
  
  This would immediately break all external updaters. They'd just pick up the 
first update in a bunch and ignore the rest, because of the caching.
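
  A toy model of the failure mode described above, assuming a cache keyed by
  the full request URL. `CachingFrontend`, the `/entity/Q42` path, and the
  revision strings are hypothetical; only the `nocache` parameter name comes
  from this thread.

```python
class CachingFrontend:
    """Toy HTTP cache: responses are keyed by the full URL."""

    def __init__(self, backend):
        self.backend = backend  # dict: entity ID -> current data
        self.cache = {}

    def get(self, url):
        if url not in self.cache:
            # Strip the query string, then take the last path segment.
            entity_id = url.split("?")[0].rsplit("/", 1)[-1]
            self.cache[url] = self.backend[entity_id]
        return self.cache[url]


backend = {"Q42": "revision 1"}
frontend = CachingFrontend(backend)

# First edit event: the updater fetches and the response gets cached.
first = frontend.get("/entity/Q42")

# A second edit lands, so the backend now holds newer data.
backend["Q42"] = "revision 2"
stale = frontend.get("/entity/Q42")            # same URL: served from cache
fresh = frontend.get("/entity/Q42?nocache=2")  # new URL: reaches the backend
```

  Here `stale` still carries the data for the first edit, which is how an
  updater without cache busting would miss the later edits in a bunch.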
  
  > use the event date instead of the current date as timestamp (would enable 
caching the fetch for the same event from multiple clients)
  
  Timestamps are very bad identifiers, since they don't have enough resolution 
- many edits can happen in a second. Also, AFAIK there's no easy way to fetch 
a revision by timestamp, only by revision ID. Also, see above about why we don't 
want to fetch by revision ID - this applies to timestamps too, even if they 
worked.
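
  The resolution problem can be shown with a few made-up events. The revision
  IDs and timestamps below are hypothetical and assume one-second timestamp
  resolution.

```python
from collections import defaultdict

# Hypothetical edit events: (revision_id, timestamp at one-second resolution)
events = [
    (1001, "2019-03-08T12:00:00Z"),
    (1002, "2019-03-08T12:00:00Z"),
    (1003, "2019-03-08T12:00:00Z"),
    (1004, "2019-03-08T12:00:01Z"),
]

by_timestamp = defaultdict(list)
for rev_id, ts in events:
    by_timestamp[ts].append(rev_id)

# A timestamp that maps to more than one revision is ambiguous as a key.
ambiguous = {ts: revs for ts, revs in by_timestamp.items() if len(revs) > 1}
print(ambiguous)
```

  Revision IDs stay unique per edit, while the timestamp key collapses several
  edits into one.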
  
  > don't do cache busting on events older than X
  
  This could work only if we knew that there's an edit for the same item newer 
than X. Otherwise, of two edits older than X, you'd get the data for the first, 
and the second would be ignored since it would fetch the data from the cache. 
Of course, if you knew there's a new edit, you could just skip all the events 
before that edit altogether :)
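
  The "skip all the events before that edit" idea, sketched for a single batch
  of events. This is an illustration, not the Updater's logic: the
  `(entity_id, revision_id)` event shape is hypothetical, and a higher revision
  ID is assumed to mean a newer edit.

```python
def latest_per_entity(events):
    """Keep only the newest event for each entity in a batch.

    `events` is an iterable of (entity_id, revision_id) pairs; a higher
    revision ID is assumed to mean a newer edit.
    """
    latest = {}
    for entity_id, rev_id in events:
        if entity_id not in latest or rev_id > latest[entity_id]:
            latest[entity_id] = rev_id
    return latest


batch = [("Q1", 10), ("Q2", 11), ("Q1", 12), ("Q1", 15), ("Q2", 14)]
to_fetch = latest_per_entity(batch)
```

  Only one fetch per entity remains, and it targets the newest known revision.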
  
  > back off when data received is older than the event
  
  We already do; see T210901 and the discussion around it. It's not fully 
implemented yet, but that's the workaround we're using against inconsistent 
chronology.
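
  A minimal sketch of backing off when the fetched data is older than the
  event. It is not the implementation discussed in T210901; `fetch`, the retry
  count, the delay, and the exponential schedule are all assumptions made for
  illustration.

```python
import time


def fetch_with_backoff(fetch, entity_id, expected_rev, retries=3, delay=0.01):
    """Re-fetch until the returned revision is at least `expected_rev`.

    `fetch` is a hypothetical callable returning (revision_id, data).
    The last result is returned even if it is still stale after all retries.
    """
    rev, data = fetch(entity_id)
    for attempt in range(retries):
        if rev >= expected_rev:
            break
        time.sleep(delay * (2 ** attempt))  # wait longer before each retry
        rev, data = fetch(entity_id)
    return rev, data


# Simulated lagging server: serves revision 5 twice, then catches up to 7.
responses = iter([(5, "old"), (5, "old"), (7, "new")])
calls = []


def lagging_fetch(entity_id):
    calls.append(entity_id)
    return next(responses)


rev, data = fetch_with_backoff(lagging_fetch, "Q42", expected_rev=7)
```

  The caller still has to decide what to do when the data remains stale after
  the last retry; this sketch simply returns it.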
