Smalyshev added a comment.
@Izno yes, pasted in a wrong window :)
TASK DETAIL
https://phabricator.wikimedia.org/T119915
EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: Gehel, Smalyshev
Cc: Izno, Dzahn, gerritbot, Gehel, Ricordisamoa, hoo, Addshore, Aklapper,
Izno added a comment.
In T119915#3421698, @Smalyshev wrote:
Implemented as geof:globe, geof:latitude & geof:longitude
@Smalyshev Did you mean to close and put this resolution on the globe task you closed today?
Gehel added a comment.
The UNKNOWN disappeared now that we are active/active. Previously, when no traffic was sent to codfw, we had no meaningful data about response time. This can be closed again.
Smalyshev added a comment.
Implemented as geof:globe, geof:latitude & geof:longitude
Dzahn added a comment.
The Icinga/graphite check "Response time for WDQS" is in status "UNKNOWN" because there are "No valid datapoints found".
https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2=einsteinium=Response+time+of+WDQS
This is a common problem with Icinga/Graphite checks
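To illustrate why such a check ends up in "UNKNOWN", here is a minimal sketch of a threshold check over Graphite-style datapoints. It is not the actual Icinga/Graphite plugin used here; the function name `check_state` and its threshold parameters are hypothetical. The only fixed facts it relies on are the standard Nagios/Icinga plugin return codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) and that Graphite reports missing datapoints as nulls.

```python
from typing import Optional, Sequence

# Standard Nagios/Icinga plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_state(
    values: Sequence[Optional[float]],
    warning: float,
    critical: float,
) -> int:
    """Map a series of datapoints to a plugin state (hypothetical helper).

    None stands for a missing datapoint, as Graphite returns null when
    nothing was recorded for an interval.
    """
    valid = [v for v in values if v is not None]
    if not valid:
        # Nothing to evaluate: the check cannot say OK or CRITICAL.
        return UNKNOWN
    worst = max(valid)
    if worst >= critical:
        return CRITICAL
    if worst >= warning:
        return WARNING
    return OK
```

With no traffic there is no response time to record, so every datapoint is missing and the sketch returns UNKNOWN rather than OK.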
gerritbot added a comment.
Change 286992 merged by Gehel:
Add response time checks to WDQS
https://gerrit.wikimedia.org/r/286992
gerritbot added a comment.
Change 286992 had a related patch set uploaded (by Gehel):
Add response time checks to WDQS
https://gerrit.wikimedia.org/r/286992
Addshore added a comment.
Just looking at the other things I am recording right now, but it may in fact
make sense to put a monitor on the Done Rate or the Queries Per Second.
A Done Rate of 0 or a QPS of 0 for longer than the normal query timeout
should shout at us, as it means no
Smalyshev added a comment.
I think we need to at least put a monitor on whatever "varnish latency" counts,
and alert if, say, it is over 30 s.
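The two alert conditions proposed in this thread (latency over 30 s, and a rate stuck at 0 for longer than the query timeout) could be sketched roughly as below. This is only an illustration under assumptions: the sample format, the function names `latency_exceeded` and `stalled`, and the `timeout_s` parameter are all hypothetical, and the actual query timeout is not stated in this thread.

```python
from typing import Sequence, Tuple

# Assumed sample format: (unix timestamp in seconds, metric value).
Sample = Tuple[float, float]


def latency_exceeded(samples: Sequence[Sample], limit_s: float = 30.0) -> bool:
    """True if the most recent latency sample is above limit_s."""
    return bool(samples) and samples[-1][1] > limit_s


def stalled(samples: Sequence[Sample], timeout_s: float) -> bool:
    """True if the rate has been 0 for longer than timeout_s.

    Only the trailing run of zero values counts; any non-zero sample
    resets the run. Samples are assumed to be in chronological order.
    """
    if not samples:
        return False
    zero_since = None
    for ts, value in samples:
        if value == 0:
            if zero_since is None:
                zero_since = ts
        else:
            zero_since = None
    last_ts = samples[-1][0]
    return zero_since is not None and last_ts - zero_since > timeout_s
```

Requiring the zero run to outlast the timeout avoids alerting on a single quiet interval, which is the point of comparing against the normal query timeout rather than alerting on any 0 reading.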
Addshore added a comment.
It should be noted we just had a partial outage for 6/7 hours without us
noticing ;)
wdqs1002 seemed to totally die but nothing pinged anyone.
wdqs1001 seemed to stop updating (resulting in a warning in icinga) but again
it doesn't seem to have pinged anyone.