[Wikidata-bugs] [Maniphest] T274270: WDQS servers taking up to 30 minutes to reboot

2022-03-29 Thread bking
bking added a comment. This is still happening. @RKemper found some interesting links that could explain this behavior: https://wiki.freedesktop.org/www/Software/systemd/Debugging/#diagnosingshutdownproblems https://old.reddit.com/r/archlinux/comments/ba3zec

[Wikidata-bugs] [Maniphest] T242453: Detect and alert and/or remediate Blazegraph deadlocks

2022-03-29 Thread bking
bking added a comment. Per conversation with dcausse, we could potentially run jstack on a timer and grep the output for errors as shown above, then alert and/or remediate. TASK DETAIL https://phabricator.wikimedia.org/T242453 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings
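The jstack idea in the comment above could be sketched roughly as follows. This is only an illustration under stated assumptions: the error strings "shown above" in the task are not part of this excerpt, so the default pattern here is the "Java-level deadlock" line that HotSpot's own deadlock detector prints in a jstack dump, which may not be what the task actually greps for. The function names and the alert hook are hypothetical, and scheduling (for example a systemd timer or cron job) is left out.

```python
import re
import subprocess
from typing import Callable, Iterable, List

# Assumption: the JVM's built-in detector reports lock cycles with this phrase.
# The patterns used in the actual task may differ and can be passed in instead.
DEFAULT_PATTERNS = (r"Java-level deadlock",)


def capture_thread_dump(pid: int, timeout_s: float = 30.0) -> str:
    """Run `jstack <pid>` and return its standard output as text."""
    result = subprocess.run(
        ["jstack", str(pid)],
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=True,
    )
    return result.stdout


def find_suspicious_lines(
    dump: str, patterns: Iterable[str] = DEFAULT_PATTERNS
) -> List[str]:
    """Return the stripped dump lines that match any of the given regexes."""
    compiled = [re.compile(p) for p in patterns]
    return [
        line.strip()
        for line in dump.splitlines()
        if any(rx.search(line) for rx in compiled)
    ]


def check_once(
    pid: int,
    dump_fn: Callable[[int], str] = capture_thread_dump,
    alert_fn: Callable[[str], None] = print,
) -> bool:
    """Take one thread dump; call alert_fn and return True if anything matched."""
    hits = find_suspicious_lines(dump_fn(pid))
    if hits:
        alert_fn(f"possible deadlock in JVM {pid}: {hits[0]}")
    return bool(hits)
```

A timer would call check_once with the Blazegraph JVM's process ID; alert_fn is the place to plug in a real alert or a remediation step such as a service restart. Passing dump_fn separately keeps the matching logic testable without a running JVM.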

[Wikidata-bugs] [Maniphest] T242453: Detect and alert and/or remediate Blazegraph deadlocks

2022-03-29 Thread bking
bking renamed this task from "Deadlock in blazegraph blocking all queries and updates" to "Detect and alert and/or remediate Blazegraph deadlocks". TASK DETAIL https://phabricator.wikimedia.org/T242453

[Wikidata-bugs] [Maniphest] T302494: The WDQS Streaming Updater should use S3 to access thanos-swift instead of the native swift protocol

2022-03-14 Thread bking
bking added a comment. Per messages above, we have completely failed over the wdqs and wdqs-internal services from eqiad to codfw. TASK DETAIL https://phabricator.wikimedia.org/T302494

[Wikidata-bugs] [Maniphest] T303134: Should wdqs LVS checks page

2022-03-14 Thread bking
bking claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T303134

[Wikidata-bugs] [Maniphest] T301953: Investigate wdqs1013 stability issues

2022-03-14 Thread bking
bking added a comment. Suggestions:
- Data reload
- Server reimage
- Hardware tests
- Close observation over a limited time
TASK DETAIL https://phabricator.wikimedia.org/T301953

[Wikidata-bugs] [Maniphest] T301953: Investigate wdqs1013 stability issues

2022-03-14 Thread bking
bking claimed this task. TASK DETAIL https://phabricator.wikimedia.org/T301953

[Wikidata-bugs] [Maniphest] T293862: Investigate using jvmquake to limit the time a JVM is unusable due to GC overhead

2022-03-14 Thread bking
bking added a comment. Manually installed on wdqs1010 TASK DETAIL https://phabricator.wikimedia.org/T293862

[Wikidata-bugs] [Maniphest] T296470: Initialize WCQS production servers

2022-01-11 Thread bking
bking added a comment. Started data load via tmux session on cumin1001 at approximately `Tue Jan 11 16:53:46 2022`. Expected to take at least 24 hours. TASK DETAIL https://phabricator.wikimedia.org/T296470

[Wikidata-bugs] [Maniphest] T298525: Tune "BlazegraphFreeAllocatorsDecreasingRapidly" alerts

2022-01-04 Thread bking
bking added a comment. Related commits here <https://gerrit.wikimedia.org/r/plugins/gitiles/operations/alerts/+log/refs/heads/master/team-search-platform/blazegraph.yaml> TASK DETAIL https://phabricator.wikimedia.org/T298525

[Wikidata-bugs] [Maniphest] T298525: Tune "BlazegraphFreeAllocatorsDecreasingRapidly" alerts

2022-01-04 Thread bking
bking renamed this task from "Tune "BlazegraphFreeAllocatorsDecreasingRapidly"" to "Tune "BlazegraphFreeAllocatorsDecreasingRapidly" alerts". TASK DETAIL https://phabricator.wikimedia.org/T298525

[Wikidata-bugs] [Maniphest] T298525: Tune "BlazegraphFreeAllocatorsDecreasingRapidly"

2022-01-04 Thread bking
bking added a subscriber: dcausse. bking added a comment. More context from @dcausse: The alert is managed by Alertmanager, and its code is stored in Gerrit <https://gerrit.wikimedia.org/r/plugins/gitiles/operations/alerts/+/refs/heads/master/team-search-platform/blazegraph.yaml>
