[Wikidata-bugs] [Maniphest] T269693: mediawiki_job_wikidata-updateQueryServiceLag failing

2020-12-14 Thread RLazarus
RLazarus closed this task as "Resolved".
RLazarus added a comment.


  Yep, the alert has cleared. Thanks!

TASK DETAIL
  https://phabricator.wikimedia.org/T269693

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: RKemper, RLazarus
Cc: dcausse, Dzahn, RLazarus, RKemper, Aklapper, Ramtin0071, Devnull, lmata, 
Muchiri124, CBogen, Akuckartz, Legado_Shulgin, Nandana, Namenlos314, 
Davinaclare77, Qtn1293, Techguru.pc, Lahi, Gq86, Lucas_Werkmeister_WMDE, 
GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, EBjune, merbst, LawExplorer, 
Zppix, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, Wong128hk, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, faidon, Mbch331, Rxy, 
Jay8g, fgiunchedi
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] T269693: mediawiki_job_wikidata-updateQueryServiceLag failing

2020-12-14 Thread dcausse
dcausse assigned this task to RKemper.
dcausse moved this task from Ready for Development to Needs Reporting on the 
Discovery-Search (Current work) board.
dcausse added a comment.


  Seems to be resolved now:
  
dcausse@mwmaint1002:~$ mwscript extensions/Wikidata.org/maintenance/updateQueryServiceLag.php --wiki wikidatawiki --cluster wdqs --prometheus prometheus.svc.eqiad.wmnet --prometheus prometheus.svc.codfw.wmnet
Done.

WORKBOARD
  https://phabricator.wikimedia.org/project/board/1227/



[Wikidata-bugs] [Maniphest] T269693: mediawiki_job_wikidata-updateQueryServiceLag failing

2020-12-14 Thread CBogen
CBogen set the point value for this task to "3".



[Wikidata-bugs] [Maniphest] T269693: mediawiki_job_wikidata-updateQueryServiceLag failing

2020-12-14 Thread CBogen
CBogen moved this task from All WDQS-related tasks to Current work on the 
Wikidata-Query-Service board.
CBogen added a project: Discovery-Search (Current work).

WORKBOARD
  https://phabricator.wikimedia.org/project/board/891/



[Wikidata-bugs] [Maniphest] T269693: mediawiki_job_wikidata-updateQueryServiceLag failing

2020-12-09 Thread RKemper
RKemper added a comment.


  Job lives here: 
https://github.com/wikimedia/mediawiki-extensions-Wikidata.org/blob/60c5f96ebf424b792077bb7c6b533a68702e7aea/maintenance/updateQueryServiceLag.php#L70
  
  I have a patch open here: 
https://gerrit.wikimedia.org/r/c/operations/puppet/+/646888 that addresses a 
different manifestation of this same issue with `blazegraph_lastupdated`. 
Currently it looks like we'll want to switch `blazegraph_lastupdated` from a 
`Counter` to a `Gauge`, but I need to figure out if changing the metric type 
will break the behavior in `updateQueryServiceLag.php` or elsewhere.
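
  For context on the `Counter` vs `Gauge` question: a last-updated timestamp is
a value that gets set rather than incremented, which is what a gauge models,
and OpenMetrics-style exposition gives counter samples a `_total` suffix. The
Python sketch below only mimics that naming convention. `render_sample` is a
hypothetical helper, not the exporter's code, and the timestamp is made up.

```python
def render_sample(name: str, metric_type: str, value: float) -> str:
    """Render one sample line using OpenMetrics-style sample naming.

    Hypothetical helper for illustration only: counter samples get a
    `_total` suffix, gauge samples keep the bare metric name.
    """
    if metric_type not in ("counter", "gauge"):
        raise ValueError(f"unsupported metric type: {metric_type}")
    sample_name = name + "_total" if metric_type == "counter" else name
    return f"{sample_name} {value}"


# Made-up timestamp value, purely illustrative.
as_counter = render_sample("blazegraph_lastupdated", "counter", 1607370060.0)
as_gauge = render_sample("blazegraph_lastupdated", "gauge", 1607370060.0)
print(as_counter)
print(as_gauge)
```

  If the naming works this way on the re-imaged hosts, a consumer that queries
the bare name would keep working only while the metric is typed as a gauge.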



[Wikidata-bugs] [Maniphest] T269693: mediawiki_job_wikidata-updateQueryServiceLag failing

2020-12-09 Thread Dzahn
Dzahn added a comment.


  I ran the command manually as the same user, www-data. The error is simply 
"Failed to get lag from prometheus".
  
@mwmaint1002:~# sudo -u www-data /usr/local/bin/mwscript extensions/Wikidata.org/maintenance/updateQueryServiceLag.php --wiki wikidatawiki --cluster wdqs --prometheus prometheus.svc.eqiad.wmnet --prometheus prometheus.svc.codfw.wmnet
Failed to get lag from prometheus



[Wikidata-bugs] [Maniphest] T269693: mediawiki_job_wikidata-updateQueryServiceLag failing

2020-12-09 Thread Dzahn
Dzahn added a comment.


ExecStart=/usr/local/bin/mw-cli-wrapper /usr/local/bin/mwscript extensions/Wikidata.org/maintenance/updateQueryServiceLag.php --wiki wikidatawiki --cluster wdqs --prometheus prometheus.svc.eqiad.wmnet --prometheus prometheus.svc.codfw.wmnet



[Wikidata-bugs] [Maniphest] T269693: mediawiki_job_wikidata-updateQueryServiceLag failing

2020-12-09 Thread RKemper
RKemper added a comment.


  There's some context in the description of 
https://phabricator.wikimedia.org/T269204 mentioning that the counter metric 
`blazegraph_lastupdated` is now `blazegraph_lastupdated_total`, so if the 
`mediawiki_job_wikidata-updateQueryServiceLag` job uses that metric, then the 
re-image is likely the source of the problem.
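
  To illustrate how a rename could surface as "Failed to get lag from
prometheus": in the Prometheus HTTP API, an instant query for a metric name
that no longer exists still returns `"status": "success"`, just with an empty
result vector. The Python sketch below is not the logic in
`updateQueryServiceLag.php`. `newest_sample` is a hypothetical helper, both
responses are hand-written, and the instance label and values are invented.

```python
import json
from typing import Optional


def newest_sample(response_text: str) -> Optional[float]:
    """Return the largest sample value in an instant-query response.

    Returns None when the query failed or matched no series, which is
    the case a caller would have to report as "no lag available".
    """
    payload = json.loads(response_text)
    if payload.get("status") != "success":
        return None
    results = payload.get("data", {}).get("result", [])
    # Each vector entry carries "value": [evaluation_time, "sample_as_string"].
    values = [float(item["value"][1]) for item in results]
    return max(values) if values else None


# Hand-written responses; the instance label and numbers are invented.
old_name = (
    '{"status":"success","data":{"resultType":"vector","result":['
    '{"metric":{"__name__":"blazegraph_lastupdated",'
    '"instance":"example-host:9193"},'
    '"value":[1607370100.0,"1607370060"]}]}}'
)
renamed = '{"status":"success","data":{"resultType":"vector","result":[]}}'

print(newest_sample(old_name))
print(newest_sample(renamed))
```

  The second call returns `None` even though the query itself succeeded, so a
caller has to check for an empty result rather than rely on the status field.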



[Wikidata-bugs] [Maniphest] T269693: mediawiki_job_wikidata-updateQueryServiceLag failing

2020-12-09 Thread Dzahn
Dzahn added a comment.


[20:17:19] ryankemper: I guess it makes sense that "job_wikidata-updateQueryServiceLag" could not run during current work
[20:19:46] mutante: yeah, I'm not familiar with how the job works specifically but that would make sense
[20:20:42] if it's supposed to alert at <50% availability then that might be a bit unexpected because only one node in each `dc x [internal, external]` is being re-imaged at a time
[20:21:12] So for codfw for example there's one codfw wdqs-internal host that would be unable to report and one codfw external wdqs host
[20:21:48] ryankemper: no, this case is not about availability, it's "one of the mediawiki maintenance 'crons' (that are now systemd timers) failed to run on the maintenance servers
[20:26:03] ryankemper: what happens is: "maintenance job tries to update what the current lag is like.. tries to get lag data from prometheus and that fails.  now since it's a systemd timer and not a cron it means it's a failed service which then turns into an Icinga alert about "systemd state is bad on a mwmaint server". and nothing clears it.. but this job runs every minute.. so it's more alert than
[20:26:09] is appropriate
[20:26:35] let me just clear that failed service and wait a minute
[20:27:14] !log mwmaint1002 - systemctl reset-failed to clear icinga alert



[Wikidata-bugs] [Maniphest] T269693: mediawiki_job_wikidata-updateQueryServiceLag failing

2020-12-09 Thread jbond
jbond triaged this task as "Medium" priority.



[Wikidata-bugs] [Maniphest] T269693: mediawiki_job_wikidata-updateQueryServiceLag failing

2020-12-08 Thread RLazarus
RLazarus created this task.
RLazarus added projects: Wikidata-Query-Service, Operations.
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Wikidata.

TASK DESCRIPTION
  This systemd timer runs every minute, but the last time it succeeded was Dec 
7, 19:41 UTC. Since then it's been failing consistently; journalctl looks like:
  
Dec 08 16:52:00 mwmaint1002 systemd[1]: Started MediaWiki periodic job wikidata-updateQueryServiceLag.
Dec 08 16:52:01 mwmaint1002 mediawiki_job_wikidata-updateQueryServiceLag[131972]: Failed to get lag from prometheus
Dec 08 16:52:01 mwmaint1002 systemd[1]: mediawiki_job_wikidata-updateQueryServiceLag.service: Main process exited, code=exited, status=1/FAILURE
Dec 08 16:52:01 mwmaint1002 systemd[1]: mediawiki_job_wikidata-updateQueryServiceLag.service: Unit entered failed state.
Dec 08 16:52:01 mwmaint1002 systemd[1]: mediawiki_job_wikidata-updateQueryServiceLag.service: Failed with result 'exit-code'.
  
  SAL shows @RKemper was reimaging WDQS hosts at the time it started failing, 
not sure if that's related or coincidence.
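
  When checking a longer journal excerpt for the same pattern, it can help to
pull out the exit status of each run. The Python sketch below runs over two of
the journal entries quoted in this description, each joined onto a single line.
`failed_runs` is a hypothetical helper, not an existing tool, and the regular
expression only covers the "Main process exited" wording seen here.

```python
import re

# Matches systemd's "Main process exited" entries; captures time, unit, status.
EXIT_RE = re.compile(
    r"^(?P<when>\w{3} \d{2} \d{2}:\d{2}:\d{2}) (?P<host>\S+) systemd\[1\]: "
    r"(?P<unit>\S+): Main process exited, code=exited, status=(?P<status>\d+)/"
)


def failed_runs(journal_text: str) -> list:
    """Return (timestamp, unit, exit status) for each non-zero exit."""
    runs = []
    for line in journal_text.splitlines():
        match = EXIT_RE.match(line)
        if match and int(match.group("status")) != 0:
            runs.append(
                (match.group("when"), match.group("unit"), int(match.group("status")))
            )
    return runs


# Two entries from the excerpt in this task description.
sample = "\n".join([
    "Dec 08 16:52:00 mwmaint1002 systemd[1]: Started MediaWiki periodic job "
    "wikidata-updateQueryServiceLag.",
    "Dec 08 16:52:01 mwmaint1002 systemd[1]: "
    "mediawiki_job_wikidata-updateQueryServiceLag.service: "
    "Main process exited, code=exited, status=1/FAILURE",
])
print(failed_runs(sample))
```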
