[Wikidata-bugs] [Maniphest] [Assigned] T204531: Wikidata dumps creating large amounts of log spam

2018-12-10 Thread Imarlier
Imarlier assigned this task to aaron.Imarlier moved this task from Inbox to Doing on the Performance-Team board.Imarlier added a comment.
@aaron is going to see what else can be done to reduce spam, will then assign back to @ArielGlennTASK DETAILhttps://phabricator.wikimedia.org/T204531WORKBOARDhttps://phabricator.wikimedia.org/project/board/1212/EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: aaron, ImarlierCc: Imarlier, gerritbot, aaron, Joe, akosiaris, hoo, Agabi10, Marostegui, ArielGlenn, Krinkle, fgiunchedi, jcrespo, CucyNoiD, Nandana, NebulousIris, jijiki, Gaboe420, Versusxo, Majesticalreaper22, Giuliamocci, Adrian1985, Cpaulf30, Lahi, Gq86, Baloch007, Darkminds3113, Bsandipan, Lordiis, GoranSMilovanovic, Adik2382, Lunewa, Th3d3v1ls, Ramalepe, Liugev6, QZanden, LawExplorer, Vali.matei, Lewizho99, Volans, Maathavan, _jensen, D3r1ck01, MGChecker, gnosygnu, Wikidata-bugs, aude, Jdforrester-WMF, Mbch331, Jay8g, Krenair___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Reassigned] T207718: Errors trying to fetch RDF from Wikidata

2018-12-03 Thread Imarlier
Imarlier reassigned this task from Imarlier to Smalyshev.
TASK DETAILhttps://phabricator.wikimedia.org/T207718EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: Smalyshev, ImarlierCc: Imarlier, BBlack, ema, Gehel, Aklapper, Smalyshev, Legado_Shulgin, Nandana, thifranc, AndyTan, Davinaclare77, Qtn1293, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, EBjune, merbst, LawExplorer, Vali.matei, Zppix, _jensen, D3r1ck01, Jonas, Xmlizer, Wong128hk, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, faidon, Mbch331, Jay8g, fgiunchedi___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T207718: Errors trying to fetch RDF from Wikidata

2018-12-03 Thread Imarlier
Imarlier added a comment.
@Smalyshev Guessing this should go back to you for followup?TASK DETAILhttps://phabricator.wikimedia.org/T207718EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: ImarlierCc: BBlack, ema, Gehel, Aklapper, Smalyshev, Legado_Shulgin, Nandana, thifranc, AndyTan, Davinaclare77, Qtn1293, Imarlier, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, EBjune, merbst, LawExplorer, Vali.matei, Zppix, _jensen, D3r1ck01, Jonas, Xmlizer, Wong128hk, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, faidon, Mbch331, Jay8g, fgiunchedi___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T207718: Errors trying to fetch RDF from Wikidata

2018-12-03 Thread Imarlier
Imarlier added a comment.
I've been running this in a tmux session on a few of the wdqs servers:  while :; do DSTAMP=$(date); CW=$(sudo netstat -anet | grep 208.80.154.224 | grep -c CLOSE_WAIT); echo "${DSTAMP}: ${CW}"; sleep 1; done >> ~/close_waits.txt.  (154.224 is the edge for the text cache cluster.)

It's a very small sample at this point, but there are a few things that make me think that this is the issue, or one of the issues:

wdqs1006 had three errors thrown at 2018-12-03T20:09:56.  This is the close_wait count around that time:

Mon Dec  3 20:09:30 UTC 2018: 0
Mon Dec  3 20:09:31 UTC 2018: 0
Mon Dec  3 20:09:32 UTC 2018: 0
Mon Dec  3 20:09:33 UTC 2018: 0
Mon Dec  3 20:09:34 UTC 2018: 0
Mon Dec  3 20:09:35 UTC 2018: 0
Mon Dec  3 20:09:36 UTC 2018: 0
Mon Dec  3 20:09:37 UTC 2018: 0
Mon Dec  3 20:09:38 UTC 2018: 10
Mon Dec  3 20:09:39 UTC 2018: 10
Mon Dec  3 20:09:40 UTC 2018: 10
Mon Dec  3 20:09:41 UTC 2018: 10
Mon Dec  3 20:09:42 UTC 2018: 10
Mon Dec  3 20:09:43 UTC 2018: 10
Mon Dec  3 20:09:44 UTC 2018: 10
Mon Dec  3 20:09:45 UTC 2018: 10
Mon Dec  3 20:09:46 UTC 2018: 10
Mon Dec  3 20:09:47 UTC 2018: 10
Mon Dec  3 20:09:48 UTC 2018: 10
Mon Dec  3 20:09:49 UTC 2018: 10
Mon Dec  3 20:09:50 UTC 2018: 10
Mon Dec  3 20:09:51 UTC 2018: 10
Mon Dec  3 20:09:52 UTC 2018: 10
Mon Dec  3 20:09:53 UTC 2018: 10
Mon Dec  3 20:09:54 UTC 2018: 10
Mon Dec  3 20:09:55 UTC 2018: 10
Mon Dec  3 20:09:57 UTC 2018: 0
Mon Dec  3 20:09:58 UTC 2018: 0
Mon Dec  3 20:09:59 UTC 2018: 0
Mon Dec  3 20:10:00 UTC 2018: 0

wdqs1004 had three exceptions thrown at 20:08:23.  This is the close_wait count around that time:

Mon Dec  3 20:08:10 UTC 2018: 0
Mon Dec  3 20:08:11 UTC 2018: 0
Mon Dec  3 20:08:12 UTC 2018: 0
Mon Dec  3 20:08:13 UTC 2018: 0
Mon Dec  3 20:08:14 UTC 2018: 0
Mon Dec  3 20:08:15 UTC 2018: 10
Mon Dec  3 20:08:16 UTC 2018: 10
Mon Dec  3 20:08:17 UTC 2018: 10
Mon Dec  3 20:08:18 UTC 2018: 10
Mon Dec  3 20:08:19 UTC 2018: 10
Mon Dec  3 20:08:20 UTC 2018: 10
Mon Dec  3 20:08:21 UTC 2018: 10
Mon Dec  3 20:08:22 UTC 2018: 10
Mon Dec  3 20:08:23 UTC 2018: 0
Mon Dec  3 20:08:24 UTC 2018: 0
Mon Dec  3 20:08:25 UTC 2018: 0
Mon Dec  3 20:08:26 UTC 2018: 0
Mon Dec  3 20:08:27 UTC 2018: 0
Mon Dec  3 20:08:28 UTC 2018: 0
Mon Dec  3 20:08:29 UTC 2018: 0

wdqs1005 had three exceptions thrown at 20:07:51.  This is the close_wait count around that time:

Mon Dec  3 20:07:20 UTC 2018: 0
Mon Dec  3 20:07:21 UTC 2018: 0
Mon Dec  3 20:07:22 UTC 2018: 10
Mon Dec  3 20:07:23 UTC 2018: 10
Mon Dec  3 20:07:24 UTC 2018: 10
Mon Dec  3 20:07:25 UTC 2018: 10
Mon Dec  3 20:07:26 UTC 2018: 10
Mon Dec  3 20:07:27 UTC 2018: 10
Mon Dec  3 20:07:28 UTC 2018: 10
Mon Dec  3 20:07:29 UTC 2018: 10
Mon Dec  3 20:07:30 UTC 2018: 10
Mon Dec  3 20:07:31 UTC 2018: 10
Mon Dec  3 20:07:32 UTC 2018: 10
Mon Dec  3 20:07:33 UTC 2018: 10
Mon Dec  3 20:07:35 UTC 2018: 10
Mon Dec  3 20:07:36 UTC 2018: 10
Mon Dec  3 20:07:37 UTC 2018: 10
Mon Dec  3 20:07:38 UTC 2018: 10
Mon Dec  3 20:07:39 UTC 2018: 10
Mon Dec  3 20:07:40 UTC 2018: 10
Mon Dec  3 20:07:41 UTC 2018: 10
Mon Dec  3 20:07:42 UTC 2018: 10
Mon Dec  3 20:07:43 UTC 2018: 10
Mon Dec  3 20:07:44 UTC 2018: 10
Mon Dec  3 20:07:45 UTC 2018: 10
Mon Dec  3 20:07:46 UTC 2018: 10
Mon Dec  3 20:07:47 UTC 2018: 10
Mon Dec  3 20:07:48 UTC 2018: 10
Mon Dec  3 20:07:49 UTC 2018: 10
Mon Dec  3 20:07:50 UTC 2018: 10
Mon Dec  3 20:07:51 UTC 2018: 0
Mon Dec  3 20:07:52 UTC 2018: 0
Mon Dec  3 20:07:53 UTC 2018: 0
Mon Dec  3 20:07:54 UTC 2018: 0
Mon Dec  3 20:07:55 UTC 2018: 0
Mon Dec  3 20:07:56 UTC 2018: 0
Mon Dec  3 20:07:57 UTC 2018: 0
Mon Dec  3 20:07:58 UTC 2018: 0
Mon Dec  3 20:07:59 UTC 2018: 0

wdqs1006 at 20:07:06:

Mon Dec  3 20:06:46 UTC 2018: 0
Mon Dec  3 20:06:47 UTC 2018: 0
Mon Dec  3 20:06:48 UTC 2018: 9
Mon Dec  3 20:06:49 UTC 2018: 9
Mon Dec  3 20:06:50 UTC 2018: 9
Mon Dec  3 20:06:51 UTC 2018: 9
Mon Dec  3 20:06:52 UTC 2018: 9
Mon Dec  3 20:06:53 UTC 2018: 9
Mon Dec  3 20:06:54 UTC 2018: 9
Mon Dec  3 20:06:55 UTC 2018: 9
Mon Dec  3 20:06:56 UTC 2018: 9
Mon Dec  3 20:06:57 UTC 2018: 9
Mon Dec  3 20:06:58 UTC 2018: 9
Mon Dec  3 20:07:00 UTC 2018: 9
Mon Dec  3 20:07:01 UTC 2018: 9
Mon Dec  3 20:07:02 UTC 2018: 9
Mon Dec  3 20:07:03 UTC 2018: 9
Mon Dec  3 20:07:04 UTC 2018: 9
Mon Dec  3 20:07:05 UTC 2018: 9
Mon Dec  3 20:07:06 UTC 2018: 0
Mon Dec  3 20:07:07 UTC 2018: 0
Mon Dec  3 20:07:08 UTC 2018: 0
Mon Dec  3 20:07:09 UTC 2018: 0
Mon Dec  3 20:07:10 UTC 2018: 0
Mon Dec  3 20:07:11 UTC 2018: 0

The pattern is consistent enough that I feel pretty comfortable concluding that this is the cause.

https://jira.apache.org/jira/browse/HTTPCLIENT-1918 has a bunch of suggestions around this.  It also notes that HttpClient has connection management logging that can be used to detect cases where connections aren't being released back to the pool correctly.TASK DETAILhttps://phabricator.wikimedia.org/T207718EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: Imarlie

[Wikidata-bugs] [Maniphest] [Commented On] T207718: Errors trying to fetch RDF from Wikidata

2018-12-03 Thread Imarlier
Imarlier added a comment.
@Smalyshev Yes, it would be slower, but it would also be diagnostic -- if persistent connections are disabled and the errors stop, we can be pretty confident that something about the way that they're configured is what's resulting in this issue.TASK DETAILhttps://phabricator.wikimedia.org/T207718EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: ImarlierCc: BBlack, ema, Gehel, Aklapper, Smalyshev, Legado_Shulgin, Nandana, thifranc, AndyTan, Davinaclare77, Qtn1293, Imarlier, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, EBjune, merbst, LawExplorer, Vali.matei, Zppix, _jensen, D3r1ck01, Jonas, Xmlizer, Wong128hk, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, faidon, Mbch331, Jay8g, fgiunchedi___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T207718: Errors trying to fetch RDF from Wikidata

2018-12-03 Thread Imarlier
Imarlier added a comment.
@Smalyshev Another thought: why not just disable pooling, and have the client close each connection after each request?TASK DETAILhttps://phabricator.wikimedia.org/T207718EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: ImarlierCc: BBlack, ema, Gehel, Aklapper, Smalyshev, Legado_Shulgin, Nandana, thifranc, AndyTan, Davinaclare77, Qtn1293, Imarlier, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, EBjune, merbst, LawExplorer, Vali.matei, Zppix, _jensen, D3r1ck01, Jonas, Xmlizer, Wong128hk, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, faidon, Mbch331, Jay8g, fgiunchedi___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T207718: Errors trying to fetch RDF from Wikidata

2018-12-03 Thread Imarlier
Imarlier added a comment.
@BBlack @ema Couple of questions for you about Nginx:


Do we have nginx configured to handle a specific number of requests on a given worker process/thread, and then shut that down?
If it possible for nginx to be restarted (interrupting existing persistent connections) due to config updates or the like, and if so, is there a record of times when that has happened?


What I'm poking at is the issue described here: https://williamsbdev.com/posts/no-http-response-exceptions/.

The Wikidata updater process has a pool of connections to the text cache nodes, eg

imarlier@wdqs1003:~$ sudo netstat -anet | grep 208.80.154.224
tcp6   0  0 10.64.0.14:50536208.80.154.224:443  ESTABLISHED 499283749063
tcp6   0  0 10.64.0.14:50278208.80.154.224:443  ESTABLISHED 499283748769
tcp6   0  0 10.64.0.14:51154208.80.154.224:443  ESTABLISHED 499283750591
tcp6   0  0 10.64.0.14:50566208.80.154.224:443  ESTABLISHED 499283733664
tcp6   0  0 10.64.0.14:50564208.80.154.224:443  ESTABLISHED 499283749076
tcp6   0  0 10.64.0.14:51184208.80.154.224:443  ESTABLISHED 499283738772
tcp6   0  0 10.64.0.14:51648208.80.154.224:443  ESTABLISHED 499283750982
tcp6   0  0 10.64.0.14:50794208.80.154.224:443  ESTABLISHED 499283749269
tcp6   0  0 10.64.0.14:51662208.80.154.224:443  ESTABLISHED 499283750987
tcp6   0  0 10.64.0.14:50796208.80.154.224:443  ESTABLISHED 499283738743

The implication of the article linked above is that if the process that represents the remote side of these connections were to shut down while the connection itself is idle, it's possible that the socket would enter a CLOSE_WAIT state on the local side, and that in turn would result in the NoHttpResponseException that we're seeing in this case.

I'm doing some super super naive testing of this hypothesis on one of the wdqs hosts, but I'm curious whether it strikes you as being possible based on what we see here.TASK DETAILhttps://phabricator.wikimedia.org/T207718EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: ImarlierCc: BBlack, ema, Gehel, Aklapper, Smalyshev, Legado_Shulgin, Nandana, thifranc, AndyTan, Davinaclare77, Qtn1293, Imarlier, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, EBjune, merbst, LawExplorer, Vali.matei, Zppix, _jensen, D3r1ck01, Jonas, Xmlizer, Wong128hk, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, faidon, Mbch331, Jay8g, fgiunchedi___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T207718: Errors trying to fetch RDF from Wikidata

2018-11-14 Thread Imarlier
Imarlier added a comment.
Hrm.  In that case, very likely that you're right, and what I'm seeing is
the retry.

Out of curiosity, have you examined GC behavior around the times that these
issues crop up?TASK DETAILhttps://phabricator.wikimedia.org/T207718EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: ImarlierCc: BBlack, ema, Gehel, Aklapper, Smalyshev, Legado_Shulgin, Nandana, thifranc, AndyTan, Davinaclare77, Qtn1293, Imarlier, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, EBjune, merbst, LawExplorer, Vali.matei, Zppix, D3r1ck01, Jonas, Xmlizer, Wong128hk, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, faidon, Mbch331, Jay8g, fgiunchedi___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T207718: Errors trying to fetch RDF from Wikidata

2018-11-14 Thread Imarlier
Imarlier added a comment.

In T207718#4748289, @Smalyshev wrote:
So, an interesting thing: in at least some of these cases, there is a web request that is making it to wikidata, and that is returning a 200.

The request is retried if it fails, are you sure it's not the retry that you are seeing with 200?


It could be -- how quickly does it retry?  Immediately?  Or is there a delay?TASK DETAILhttps://phabricator.wikimedia.org/T207718EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: ImarlierCc: BBlack, ema, Gehel, Aklapper, Smalyshev, Legado_Shulgin, Nandana, thifranc, AndyTan, Davinaclare77, Qtn1293, Imarlier, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, EBjune, merbst, LawExplorer, Vali.matei, Zppix, D3r1ck01, Jonas, Xmlizer, Wong128hk, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, faidon, Mbch331, Jay8g, fgiunchedi___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T207718: Errors trying to fetch RDF from Wikidata

2018-11-14 Thread Imarlier
Imarlier added a comment.
Doesn't appear to have solved the issue, but I need to verify that the patches have actually been deployed: https://logstash.wikimedia.org/goto/d72dcf3ef04eb8d02cb6a4c602754cfdTASK DETAILhttps://phabricator.wikimedia.org/T207718EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: ImarlierCc: BBlack, ema, Gehel, Aklapper, Smalyshev, Legado_Shulgin, Nandana, thifranc, AndyTan, Davinaclare77, Qtn1293, Imarlier, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, EBjune, merbst, LawExplorer, Vali.matei, Zppix, D3r1ck01, Jonas, Xmlizer, Wong128hk, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, faidon, Mbch331, Jay8g, fgiunchedi___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T202764: Wikidata produces a lot of failed requests for recentchanges API

2018-11-14 Thread Imarlier
Imarlier added a comment.
@Gehel have the patches referenced above been deployed?TASK DETAILhttps://phabricator.wikimedia.org/T202764EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: ImarlierCc: Stashbot, Marostegui, Banyek, Reedy, gerritbot, Krinkle, Addshore, Yurik, jcrespo, Imarlier, Ladsgroup, Lydia_Pintscher, WMDE-leszek, Aklapper, Gehel, Smalyshev, Legado_Shulgin, Nandana, thifranc, AndyTan, Davinaclare77, Qtn1293, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, EBjune, merbst, LawExplorer, Zppix, D3r1ck01, Jonas, Xmlizer, Wong128hk, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, faidon, Mbch331, Jay8g, fgiunchedi___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T207718: Errors trying to fetch RDF from Wikidata

2018-11-05 Thread Imarlier
Imarlier added a comment.
Given what we're seeing so far, I suggest that we let https://gerrit.wikimedia.org/r/471737 land and see if that changes behavior.  It seems reasonable to me that it might be the culprit.TASK DETAILhttps://phabricator.wikimedia.org/T207718EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: ImarlierCc: BBlack, ema, Gehel, Aklapper, Smalyshev, Legado_Shulgin, Nandana, thifranc, AndyTan, Davinaclare77, Qtn1293, Imarlier, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, EBjune, merbst, LawExplorer, Vali.matei, Zppix, Jonas, Xmlizer, Wong128hk, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, faidon, Mbch331, Jay8g, fgiunchedi___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T207718: Errors trying to fetch RDF from Wikidata

2018-11-04 Thread Imarlier
Imarlier added a comment.
So, an interesting thing: in at least some of these cases, there is a web request that is making it to wikidata, and that is returning a 200.  I put together a jupyter notebook that pulls down a list of errors from logstash, and then queries the webrequest table in Hive for matches.  Looking at a couple of hours of output, there's a mix of requests that actually did have a match and those that did not:

[2018-10-31 21:47:18] [/wiki/Special:EntityData/Q47409824.ttl?nocache=1540842786911=dump] Error received at 2018-10-29 19:53:06.912000+00:00
[2018-10-31 21:47:18] [/wiki/Special:EntityData/Q47409824.ttl?nocache=1540842786911=dump] executing Hive query looking for a matching web request
[2018-10-31 21:55:50] [/wiki/Special:EntityData/Q47409824.ttl?nocache=1540842786911=dump] No matching web requests found
[2018-10-31 21:55:50] [/wiki/Special:EntityData/Q57954108.ttl?nocache=1540842620660=dump] Error received at 2018-10-29 19:50:20.661000+00:00
[2018-10-31 21:55:50] [/wiki/Special:EntityData/Q57954108.ttl?nocache=1540842620660=dump] executing Hive query looking for a matching web request
[2018-10-31 21:57:28] [/wiki/Special:EntityData/Q57954108.ttl?nocache=1540842620660=dump] No matching web requests found
[2018-10-31 21:57:28] [/wiki/Special:EntityData/Q43203379.ttl?nocache=1540842620660=dump] Error received at 2018-10-29 19:50:20.661000+00:00
[2018-10-31 21:57:28] [/wiki/Special:EntityData/Q43203379.ttl?nocache=1540842620660=dump] executing Hive query looking for a matching web request
[2018-10-31 21:58:04] [/wiki/Special:EntityData/Q43203379.ttl?nocache=1540842620660=dump] No matching web requests found
[2018-10-31 21:58:04] [/wiki/Special:EntityData/Q42486692.ttl?nocache=1540842396024=dump] Error received at 2018-10-29 19:46:36.025000+00:00
[2018-10-31 21:58:04] [/wiki/Special:EntityData/Q42486692.ttl?nocache=1540842396024=dump] executing Hive query looking for a matching web request
[2018-10-31 21:58:40] [/wiki/Special:EntityData/Q42486692.ttl?nocache=1540842396024=dump] No matching web requests found
[2018-10-31 21:58:40] [/wiki/Special:EntityData/Q33890407.ttl?nocache=1540842396024=dump] Error received at 2018-10-29 19:46:36.025000+00:00
[2018-10-31 21:58:40] [/wiki/Special:EntityData/Q33890407.ttl?nocache=1540842396024=dump] executing Hive query looking for a matching web request
[2018-10-31 21:59:12] [/wiki/Special:EntityData/Q33890407.ttl?nocache=1540842396024=dump] No matching web requests found
[2018-10-31 21:59:12] [/wiki/Special:EntityData/Q50092000.ttl?nocache=1540841863246=dump] Error received at 2018-10-29 19:37:43.247000+00:00
[2018-10-31 21:59:12] [/wiki/Special:EntityData/Q50092000.ttl?nocache=1540841863246=dump] executing Hive query looking for a matching web request
[2018-10-31 21:59:44] [/wiki/Special:EntityData/Q50092000.ttl?nocache=1540841863246=dump] No matching web requests found
[2018-10-31 21:59:44] [/wiki/Special:EntityData/Q47410266.ttl?nocache=1540841271671=dump] Error received at 2018-10-29 19:27:51.672000+00:00
[2018-10-31 21:59:44] [/wiki/Special:EntityData/Q47410266.ttl?nocache=1540841271671=dump] executing Hive query looking for a matching web request
[2018-10-31 22:00:14] [/wiki/Special:EntityData/Q47410266.ttl?nocache=1540841271671=dump] No matching web requests found
[2018-10-31 22:00:14] [/wiki/Special:EntityData/Q12261543.ttl?nocache=1540841271667=dump] Error received at 2018-10-29 19:27:51.671000+00:00
[2018-10-31 22:00:14] [/wiki/Special:EntityData/Q12261543.ttl?nocache=1540841271667=dump] executing Hive query looking for a matching web request
[2018-10-31 22:00:54] [/wiki/Special:EntityData/Q12261543.ttl?nocache=1540841271667=dump] No matching web requests found
[2018-10-31 22:00:54] [/wiki/Special:EntityData/Q47621848.ttl?nocache=1540841210186=dump] Error received at 2018-10-29 19:26:50.187000+00:00
[2018-10-31 22:00:54] [/wiki/Special:EntityData/Q47621848.ttl?nocache=1540841210186=dump] executing Hive query looking for a matching web request
[2018-10-31 22:01:34] [/wiki/Special:EntityData/Q47621848.ttl?nocache=1540841210186=dump] No matching web requests found
[2018-10-31 22:01:34] [/wiki/Special:EntityData/Q47621848.ttl?nocache=1540841210182=dump] Error received at 2018-10-29 19:26:50.186000+00:00
[2018-10-31 22:01:34] [/wiki/Special:EntityData/Q47621848.ttl?nocache=1540841210182=dump] executing Hive query looking for a matching web request
[2018-10-31 22:02:08] [/wiki/Special:EntityData/Q47621848.ttl?nocache=1540841210182=dump] No matching web requests found
[2018-10-31 22:02:08] [/wiki/Special:EntityData/Q47621848.ttl?nocache=1540841210175=dump] Error received at 2018-10-29 19:26:50.182000+00:00
[2018-10-31 22:02:08] [/wiki/Special:EntityData/Q47621848.ttl?nocache=1540841210175=dump] executing Hive query looking for a matching web request
[2018-10-31 22:02:40] [/wiki/Special:EntityData/Q47621848.ttl?nocache=1540841210175=dump] No matching web requests found
[2018-10-31 22:02:40] [/wiki/Special:EntityData/Q33861879.ttl?nocache

[Wikidata-bugs] [Maniphest] [Commented On] T202764: Wikidata produces a lot of failed requests for recentchanges API

2018-11-01 Thread Imarlier
Imarlier added a comment.
Can someone add me to T202765, mentioned above by @Gehel?  The issue that @Smalyshev created last week seems to involve requests that are never getting to wikidata, which suggests a wdqs related issue of some sort (and thus potentially related to spikes in resource use or something like that.)TASK DETAILhttps://phabricator.wikimedia.org/T202764EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: ImarlierCc: Stashbot, Marostegui, Banyek, Reedy, gerritbot, Krinkle, Addshore, Yurik, jcrespo, Imarlier, Ladsgroup, Lydia_Pintscher, WMDE-leszek, Aklapper, Gehel, Smalyshev, Legado_Shulgin, Nandana, thifranc, AndyTan, Davinaclare77, Qtn1293, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, EBjune, merbst, LawExplorer, Zppix, Jonas, Xmlizer, Wong128hk, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, faidon, Mbch331, Jay8g, fgiunchedi___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Changed Subscribers] T207718: Errors trying to fetch RDF from Wikidata

2018-10-31 Thread Imarlier
Imarlier added subscribers: ema, BBlack.Imarlier added a comment.
Actually, as I think about it, the lack of a matching webrequest is itself a potentially interesting clue.  Varnish should have recorded the request even if there were timeouts from the backend, but there's no indication of that.

@BBlack @ema Question: Do varnish error logs get aggregated in to logstash?  If not, how do I access them?TASK DETAILhttps://phabricator.wikimedia.org/T207718EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: ImarlierCc: BBlack, ema, Gehel, Aklapper, Smalyshev, Legado_Shulgin, Nandana, thifranc, AndyTan, Davinaclare77, Qtn1293, Imarlier, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, EBjune, merbst, LawExplorer, Vali.matei, Zppix, Jonas, Xmlizer, Wong128hk, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, faidon, Mbch331, Jay8g, fgiunchedi___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Updated] T198946: Add Schema property 'sameAs' pointing to Wikidata entries

2018-10-29 Thread Imarlier
Imarlier edited projects, added Performance-Team (Radar); removed Performance-Team.
TASK DETAILhttps://phabricator.wikimedia.org/T198946EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: Niedzielski, ImarlierCc: Gilles, nray, Smalyshev, WMDE-leszek, Tarrow, BBlack, Addshore, Lydia_Pintscher, mpopov, pmiazga, Jdrewniak, gerritbot, Denny, ovasileva, Tbayer, Elitre, CKoerner_WMF, Jdlrobson, Niedzielski, phuedx, dr0ptp4kt, Imarlier, cscott, Krenair, Aklapper, Deskana, CucyNoiD, Nandana, NebulousIris, Gaboe420, Versusxo, Majesticalreaper22, Giuliamocci, Adrian1985, Cpaulf30, Lahi, Gq86, Baloch007, Darkminds3113, Bsandipan, Lordiis, GoranSMilovanovic, Adik2382, Chicocvenancio, Th3d3v1ls, Ramalepe, Liugev6, QZanden, LawExplorer, Winter, Vali.matei, Lewizho99, Maathavan, Jonas, Wikidata-bugs, aude, Mbch331___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Claimed] T207718: Errors trying to fetch RDF from Wikidata

2018-10-29 Thread Imarlier
Imarlier moved this task from Inbox to Doing on the Performance-Team board.Imarlier claimed this task.
TASK DETAILhttps://phabricator.wikimedia.org/T207718WORKBOARDhttps://phabricator.wikimedia.org/project/board/1212/EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: ImarlierCc: Gehel, Aklapper, Smalyshev, Legado_Shulgin, Nandana, thifranc, AndyTan, Davinaclare77, Qtn1293, Imarlier, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, EBjune, merbst, LawExplorer, Vali.matei, Zppix, Jonas, Xmlizer, Wong128hk, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, faidon, Mbch331, Jay8g, fgiunchedi___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Updated] T97368: Investigate more efficient memcached solution for CacheAwarePropertyInfoStore

2018-09-17 Thread Imarlier
Imarlier edited projects, added Performance-Team (Radar); removed Performance-Team.
TASK DETAILhttps://phabricator.wikimedia.org/T97368EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: ImarlierCc: Stashbot, gerritbot, Jdforrester-WMF, Joe, mark, Addshore, jijiki, akosiaris, Krinkle, JanZerebecki, thiemowmde, aude, daniel, Aklapper, hoo, AndyTan, Gaboe420, Versusxo, Majesticalreaper22, Giuliamocci, Davinaclare77, Adrian1985, Qtn1293, Cpaulf30, Imarlier, Lahi, Gq86, Baloch007, Darkminds3113, Bsandipan, Lordiis, GoranSMilovanovic, Adik2382, Th3d3v1ls, Hfbn0, Ramalepe, Liugev6, QZanden, LawExplorer, Vali.matei, Lewizho99, Zppix, Maathavan, Wong128hk, Wikidata-bugs, ArielGlenn, faidon, He7d3r, Mbch331, Jay8g, fgiunchedi, greg___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T202764: Wikidata produces a lot of failed requests for recentchanges API

2018-08-27 Thread Imarlier
Imarlier added a comment.
Hey, @Smalyshev -- Did you tag perf team on this because you're hoping that we can help with the investigation?TASK DETAILhttps://phabricator.wikimedia.org/T202764EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: ImarlierCc: Imarlier, Ladsgroup, Lydia_Pintscher, WMDE-leszek, Aklapper, Gehel, Smalyshev, AndyTan, Davinaclare77, Qtn1293, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, EBjune, merbst, LawExplorer, Vali.matei, Zppix, Jonas, Xmlizer, Wong128hk, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, faidon, Mbch331, Jay8g, fgiunchedi___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Updated] T110022: Create a library with HTTP related functions/code

2018-08-20 Thread Imarlier
Imarlier added a comment.
Made this suggestion in T201409 but figured it should be repeated here: whatever the implementation ends up being, it should be easy (or possibly required?) to set the User Agent string, with the goal that any caller will use that to specify what job/service/etc is making a call.  It makes it much, much easier to debug cross-service calls when you have that info.TASK DETAILhttps://phabricator.wikimedia.org/T110022EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: ImarlierCc: Imarlier, BPirkle, Nikerabbit, Dereckson, Ladsgroup, Krinkle, PokestarFan, Tgr, Lydia_Pintscher, Ricordisamoa, Legoktm, Addshore, Aklapper, Arch2all, Lahi, Gq86, dmaza, GoranSMilovanovic, DFelten, QZanden, LawExplorer, HakanIST, Abbe98, Hannolans, Samwilson, Wikidata-bugs, aude, Dinoguy1000, fbstj, Tpt, Mbch331, bd808___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T110022: Create a library with HTTP related functions/code

2018-08-09 Thread Imarlier
Imarlier added a comment.
FWIW, I'd maybe offer a soft poke in the direction of Guzzle, mostly because I find that having REST-y functionality integrated can be helpful.  But it's not something that I feel strongly about.TASK DETAILhttps://phabricator.wikimedia.org/T110022EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: ImarlierCc: Imarlier, BPirkle, Nikerabbit, Dereckson, Ladsgroup, Krinkle, PokestarFan, Tgr, Lydia_Pintscher, Ricordisamoa, Legoktm, Addshore, Aklapper, Arch2all, Lahi, Gq86, dmaza, GoranSMilovanovic, DFelten, QZanden, LawExplorer, HakanIST, Abbe98, Hannolans, Samwilson, Wikidata-bugs, aude, Dinoguy1000, fbstj, Tpt, Mbch331, bd808___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Changed Project Column] T152185: Map code throwing JS exception on IE11 Windows 7 causing Pagebanner not to be shown and Common.js not running due to JS error thrown by le

2018-06-11 Thread Imarlier
Imarlier moved this task from Inbox to Radar on the Performance-Team board.Imarlier edited projects, added Performance-Team (Radar); removed Performance-Team.
TASK DETAILhttps://phabricator.wikimedia.org/T152185WORKBOARDhttps://phabricator.wikimedia.org/project/board/1212/EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: Jdlrobson, ImarlierCc: Krinkle, gerritbot, JGirault, Massimo_Telo, Nicholas.tsg, Jdlrobson, ovasileva, Andyrom75, Aklapper, Imarlier, Winter, Vali.matei, Wikidata-bugs, Lydia_Pintscher___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T117534: DCAT-AP: XML produces invalid output with HHVM

2018-01-09 Thread Imarlier
Imarlier added a comment.
This was reverted back to using php5 in https://phabricator.wikimedia.org/T182348TASK DETAILhttps://phabricator.wikimedia.org/T117534EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: hoo, ImarlierCc: Imarlier, nichtich, MoritzMuehlenhoff, gerritbot, StudiesWorld, hoo, ArielGlenn, Lokal_Profil, Aklapper, Adrian1985, Cpaulf30, Lahi, Gq86, Baloch007, Darkminds3113, Lordiis, GoranSMilovanovic, Adik2382, Th3d3v1ls, Ramalepe, Liugev6, QZanden, LawExplorer, Litanwei, Lewizho99, Maathavan, Wikidata-bugs, aude, Svick, Mbch331, Krenair, Joe, jeremyb___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T183101: Items missing from Wikidata index due to LinksUpdate lock failures

2018-01-05 Thread Imarlier
Imarlier added a comment.
Just noticed this in SAL:

15:00	<jynus@tin>	Synchronized wmf-config/db-eqiad.php: Depool db1100 (duration: 00m 58s)

Potentially interesting, given that timeouts has started to be reported slightly before that.

The commit that was being synced just refers to the host as being broken: https://phabricator.wikimedia.org/rOMWC6b3f4c9b5be8f231fb4c687e8cdcff2468796905

This does explain the big surge in exceptions around that time, but doesn't help to clear up what connection (if any) this issue had to the failed LinksUpdate jobs.TASK DETAILhttps://phabricator.wikimedia.org/T183101EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: ImarlierCc: Legoktm, Imarlier, aaron, Aklapper, EBernhardson, dcausse, Smalyshev, Lahi, Gq86, GoranSMilovanovic, QZanden, EBjune, LawExplorer, Vali.matei, Jdrewniak, Wikidata-bugs, aude, Mbch331, Jay8g, Krenair___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Claimed] T183101: Items missing from Wikidata index due to LinksUpdate lock failures

2018-01-02 Thread Imarlier
Imarlier claimed this task.
TASK DETAILhttps://phabricator.wikimedia.org/T183101EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: ImarlierCc: Legoktm, Imarlier, aaron, Aklapper, EBernhardson, dcausse, Smalyshev, Lahi, Gq86, GoranSMilovanovic, QZanden, EBjune, LawExplorer, Vali.matei, Jdrewniak, Wikidata-bugs, aude, Mbch331, Jay8g, Krenair___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T183101: Items missing from Wikidata index due to LinksUpdate lock failures

2018-01-02 Thread Imarlier
Imarlier added a comment.
@aaron -- would love your thoughts on this, when you get a chance.TASK DETAILhttps://phabricator.wikimedia.org/T183101EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: ImarlierCc: Legoktm, Imarlier, aaron, Aklapper, EBernhardson, dcausse, Smalyshev, Lahi, Gq86, GoranSMilovanovic, QZanden, EBjune, LawExplorer, Vali.matei, Jdrewniak, Wikidata-bugs, aude, Mbch331, Jay8g, Krenair___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T183101: Items missing from Wikidata index due to LinksUpdate lock failures

2017-12-22 Thread Imarlier
Imarlier added a comment.
Alternate theory that I think is wrong: there's a chance that something wonky happened with db1070 around this time, which caused replication to stop and may also have caused issues getting locks.  If that were the case I would expect to see far more write errors here (like on the original submit, and not just on the LinksUpdate job), but I can't rule it out right now.TASK DETAILhttps://phabricator.wikimedia.org/T183101EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: ImarlierCc: Legoktm, Imarlier, aaron, Aklapper, EBernhardson, dcausse, Smalyshev, Lahi, Gq86, GoranSMilovanovic, QZanden, EBjune, LawExplorer, Vali.matei, Jdrewniak, Wikidata-bugs, aude, Mbch331, Jay8g, Krenair___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T183101: Items missing from Wikidata index due to LinksUpdate lock failures

2017-12-22 Thread Imarlier
Imarlier added a comment.
First appearance of a timeout on db1100:

2017-12-16T14:37:46	mw1311	WARNING	Wikimedia\Rdbms\LoadBalancer::doWait: Timed out waiting on db1100 pos db1070-bin.001655/808461384

First exception trying to get the lock:

2017-12-16T14:38:16	mw1290	ERROR	[WjUvyQpAEDcAACARVmoAAACM] /w/api.php   RuntimeException from line 205 of /srv/mediawiki/php-1.31.0-wmf.12/includes/deferred/LinksUpdate.php: Could not acquire lock 'LinksUpdate:atomicity:pageid:1524206'.

Prior to this exception being recorded, there hadn't been any exceptions received for a period of nearly 2 minutes -- in particular, there hadn't been any database timeouts or slave replication lag issues during that time.

The first indication of replication lag is recorded 27 seconds later:

2017-12-16T14:38:43	mw1308	ERROR	[8e59bd7e779c680d140f8366] /rpc/RunJobs.php?wiki=dawiki=refreshLinks=30=300M   Wikimedia\Rdbms\DBReplicationWaitError from line 373 of /srv/mediawiki/php-1.31.0-wmf.12/includes/libs/rdbms/lbfactory/LBFactory.php: Could not wait for replica DBs to catch up to db1070

Last set of timeouts waiting on replication to db1100:

2017-12-16T15:02:02	mw1239	WARNING	Wikimedia\Rdbms\LoadBalancer::doWait: Timed out waiting on db1100 pos db1070-bin.001655/825818703
	2017-12-16T15:02:06	mw1288	WARNING	Wikimedia\Rdbms\LoadBalancer::doWait: Timed out waiting on db1100 pos db1070-bin.001655/826052699
	2017-12-16T15:02:12	terbium	WARNING	Wikimedia\Rdbms\LoadBalancer::doWait: Timed out waiting on db1100 pos db1070-bin.001655/826518643
	2017-12-16T15:02:25	mw1286	WARNING	Wikimedia\Rdbms\LoadBalancer::doWait: Timed out waiting on db1100 pos db1070-bin.001655/827446248
	2017-12-16T15:03:25	mw1286	WARNING	Wikimedia\Rdbms\LoadBalancer::doWait: Timed out waiting on db1100 pos db1070-bin.001655/831205807

This is right around when everything recovered, and I don't think that it's coincidence.   I'm not sure yet how this is all overlapping, but I'm pretty certain that it is.TASK DETAILhttps://phabricator.wikimedia.org/T183101EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: ImarlierCc: Legoktm, Imarlier, aaron, Aklapper, EBernhardson, dcausse, Smalyshev, Lahi, Gq86, GoranSMilovanovic, QZanden, EBjune, LawExplorer, Vali.matei, Jdrewniak, Wikidata-bugs, aude, Mbch331, Jay8g, Krenair___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T183101: Items missing from Wikidata index due to LinksUpdate lock failures

2017-12-22 Thread Imarlier
Imarlier added a comment.
Just throwing a couple of notes in here:

Q45825741 and Q45825750 were both updated at very close to the same time (2017-12-16 14:54:00-15:02:00); both had a number of updates that happened within a very tight window (<1 minute).  Both also have log entries indicating that they experienced failures getting a lock:

2017-12-16T14:55:09	mw1283	INFO	Expectation (readQueryTime <= 5) by MediaWiki::restInPeace not met (actual: 15.000407934189):
query-m: SELECT GET_LOCK('X', N) AS lockstatus [TRX#43742a]
#0 /srv/mediawiki/php-1.31.0-wmf.12/includes/libs/rdbms/TransactionProfiler.php(223): Wikimedia\Rdbms\TransactionProfiler->reportExpectationViolated()
#1 /srv/mediawiki/php-1.31.0-wmf.12/includes/libs/rdbms/database/Database.php(1049): Wikimedia\Rdbms\TransactionProfiler->recordQueryCompletion()
#2 /srv/mediawiki/php-1.31.0-wmf.12/includes/libs/rdbms/database/Database.php(948): Wikimedia\Rdbms\Database->doProfiledQuery()
#3 /srv/mediawiki/php-1.31.0-wmf.12/includes/libs/rdbms/database/DatabaseMysqlBase.php(1039): Wikimedia\Rdbms\Database->query()
#4 /srv/mediawiki/php-1.31.0-wmf.12/includes/libs/rdbms/database/Database.php(3559): Wikimedia\Rdbms\DatabaseMysqlBase->lock()



2017-12-16T15:00:38	mw1235	INFO	Expectation (readQueryTime <= 5) by MediaWiki::restInPeace not met (actual: 15.000345230103):
query-m: SELECT GET_LOCK('X', N) AS lockstatus [TRX#aa6d60]
#0 /srv/mediawiki/php-1.31.0-wmf.12/includes/libs/rdbms/TransactionProfiler.php(223): Wikimedia\Rdbms\TransactionProfiler->reportExpectationViolated()
#1 /srv/mediawiki/php-1.31.0-wmf.12/includes/libs/rdbms/database/Database.php(1049): Wikimedia\Rdbms\TransactionProfiler->recordQueryCompletion()
#2 /srv/mediawiki/php-1.31.0-wmf.12/includes/libs/rdbms/database/Database.php(948): Wikimedia\Rdbms\Database->doProfiledQuery()
#3 /srv/mediawiki/php-1.31.0-wmf.12/includes/libs/rdbms/database/DatabaseMysqlBase.php(1039): Wikimedia\Rdbms\Database->query()
#4 /srv/mediawiki/php-1.31.0-wmf.12/includes/libs/rdbms/database/Database.php(3559): Wikimedia\Rdbms\DatabaseMysqlBase->lock()
#5 /srv/mediawiki/php-1.31.0-wmf.12/includes/deferred/LinksUpdate.php(203): Wikimedia\Rdbms\Database->getScopedLockAndFlush()
#6 /srv/mediawiki/php-1.31.0-wmf.12/includes/deferred/LinksUpdate.php(170): LinksUpdate::acquirePageLock()
#7 /srv/mediawiki/php-1.31.0-wmf.12/includes/deferred/DeferredUpdates.php(259): LinksUpdate->doUpdate()
#8 /srv/mediawiki/php-1.31.0-wmf.12/includes/deferred/DeferredUpdates.php(210): DeferredUpdates::runUpdate()
#9 /srv/mediawiki/php-1.31.0-wmf.12/includes/deferred/DeferredUpdates.php(131): DeferredUpdates::execute()
#10 /srv/mediawiki/php-1.31.0-wmf.12/includes/MediaWiki.php(897): DeferredUpdates::doUpdates()
#11 /srv/mediawiki/php-1.31.0-wmf.12/includes/MediaWiki.php(719): MediaWiki->restInPeace()
#12 (): Closure$MediaWiki::doPostOutputShutdown()
#13 {main}

Both of those records also have one or more log entries showing that db1100 was significantly lagged at this time:

2017-12-16T14:54:54	mw1283	WARNING	Wikimedia\Rdbms\LoadBalancer::doWait: Timed out waiting on db1100 pos db1070-bin.001655/818130407



2017-12-16T15:00:44	mw1314	WARNING	Wikimedia\Rdbms\LoadBalancer::doWait: Timed out waiting on db1100 pos db1070-bin.001655/821972877

This doesn't necessarily mean anything, but it does immediately bring to mind the possibility of a bad interaction between two simultaneous transactions -- a process that gets the update lock, then tries to read from a slave that's lagged, times out waiting on that, and ultimately bails out without releasing the lock?

This code is new to me, but that seems like an angle that's worth looking at.

Q45825730 and Q45825742 were both created around the same time.  However, neither of them shows any indication of similar database timeouts or lock failures.  Here's 42:

Time 	host  	level  	message 
	2017-12-16T14:55:07	mw1286	INFO	AbuseFilter::filterAction: cache miss for 'Q45825742' (key wikidatawiki:abusefilter:check-stash:default:19082eae700e0352ac30604486024ca3:v1).
	2017-12-16T14:54:03	wdqs1003	INFO	Got 5 changes, from Q45825742@609888642@20171216145333|645260053 to Q17580698@609888669@20171216145400|645260055
	2017-12-16T14:53:33	mw1316	INFO	AbuseFilter::filterAction: cache miss for 'Q45825742' (key wikidatawiki:abusefilter:check-stash:default:75754391037fe534c51178ac2603dc26:v1).
	2017-12-16T14:53:23	wdqs1005	INFO	Got 2 changes, from Q2492695@609888609@20171216145249|645260020 to Q45825742@609888612@20171216145251|645260022
	2017-12-16T14:53:23	wdqs2002	INFO	Got 2 changes, from Q2492695@609888609@20171216145249|645260020 to Q45825742@609888612@20171216145251|645260022
	2017-12-16T14:53:22	wdqs1004	INFO	Got 2 changes, from Q2492695@609888609@20171216145249|645260020 to Q45825742@609888612@20171216145251|645260022
	2017-12-16T14:53:22	wdqs2001	INFO	Got 2 changes, from Q2492695@609888609@20171216145249|645260020 to Q45825742@609888612@20171216145251|64

[Wikidata-bugs] [Maniphest] [Commented On] T183101: Items missing from Wikidata index

2017-12-20 Thread Imarlier
Imarlier added a comment.
Aaron is on vacation this week, likely not paying any attention to notifications. We'll try to take a look at it without him.TASK DETAILhttps://phabricator.wikimedia.org/T183101EMAIL PREFERENCEShttps://phabricator.wikimedia.org/settings/panel/emailpreferences/To: ImarlierCc: Imarlier, aaron, Aklapper, EBernhardson, dcausse, Smalyshev, Lahi, Gq86, GoranSMilovanovic, QZanden, EBjune, LawExplorer, Vali.matei, Jdrewniak, Wikidata-bugs, aude, Mbch331, Jay8g, Krenair___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs