For Tool Labs, the plan is as follows:

- tomorrow, we will disable the queue so no new tasks will be distributed
  to the affected hosts
- we will send an e-mail with the tasks that are still running an hour later
Unfortunately, there is currently no host that can run jobs that take
longer than a few days, because the other virt* hosts will also be rebooted
this week.

For reference, the current long-running jobs on these hosts are listed
below, grouped by user name. Please take a look and consider whether the
jobs are still doing something useful -- and if not, please kill them
(qdel <job id>).

Merlijn

Columns: job id, name, start date/time

aka
---------------
1317747  start         Sat Aug  1 19:17:12 2015

tools.checkwiki
---------------
145845   eswiki-munch  Thu Jun 25 05:00:13 2015
818559   arwiki-munch  Sat Jul 18 05:00:16 2015

tools.dexbot
---------------
1236997  del           Thu Jul 30 13:36:09 2015
1341699  kian_new2     Sun Aug  2 11:03:18 2015

tools.gpy
---------------
527733   gpy           Thu Jul  9 01:14:28 2015

tools.luke081515bot
---------------
1346744  queue         Sun Aug  2 14:24:31 2015

tools.mjbmrbot
---------------
209254   lgdcp2_1      Sat Jun 27 15:35:04 2015
273994   lgdcp2_2      Tue Jun 30 02:00:07 2015
345013   lgdcp2_3      Thu Jul  2 15:00:05 2015
807548   lsdcp2_3      Fri Jul 17 21:00:12 2015
1092477  lgdcp1_4      Sun Jul 26 14:00:07 2015
1093960  lsdcp1_4      Sun Jul 26 15:00:10 2015

tools.shuaib-bot
---------------
1622344  translator    Mon Aug 10 02:10:09 2015

tools.wikidata-exports
---------------
694469   create_dumps  Tue Jul 14 08:40:22 2015
735030   create_dumps  Wed Jul 15 14:31:25 2015
768842   create_dumps  Thu Jul 16 16:12:52 2015

On 10 August 2015 at 21:20, Andrew Bogott <[email protected]> wrote:
> On Wednesday I'll be rebooting labvirt1001. This will cause downtime for
> about 10% of labs instances, and this downtime may last as long as 60
> minutes (although the average downtime will be much less.)
>
> We will do our best to juggle and reschedule ToolLabs jobs, but persistent
> jobs that cannot gracefully restart may be interrupted and require your
> personal attention.
>
> Here is the list of instances that will be affected by this reboot:
>
> | citoidtest                    | ACTIVE  | - | Running  | public=10.68.16.182 |
> | conf                          | ACTIVE  | - | Running  | public=10.68.18.87, 208.80.155.233 |
> | deployment-bastion            | ACTIVE  | - | Running  | public=10.68.16.58, 208.80.155.191 |
> | deployment-cache-text02       | ACTIVE  | - | Running  | public=10.68.16.16 |
> | deployment-elastic08          | ACTIVE  | - | Running  | public=10.68.17.188 |
> | deployment-memc03             | ACTIVE  | - | Running  | public=10.68.16.15 |
> | deployment-parsoid05          | ACTIVE  | - | Running  | public=10.68.16.120 |
> | deployment-pdf01              | ACTIVE  | - | Running  | public=10.68.16.73 |
> | deployment-restbase01         | ACTIVE  | - | Running  | public=10.68.17.227 |
> | deployment-salt               | ACTIVE  | - | Running  | public=10.68.16.99 |
> | deployment-urldownloader      | ACTIVE  | - | Running  | public=10.68.16.135 |
> | diffengine                    | ACTIVE  | - | Running  | public=10.68.17.127 |
> | educationdashboard-i18n       | SHUTOFF | - | Shutdown | public=10.68.16.235 |
> | ee-flow-extra                 | ACTIVE  | - | Running  | public=10.68.16.102 |
> | etcd01                        | ACTIVE  | - | Running  | public=10.68.16.130 |
> | etcd03                        | ACTIVE  | - | Running  | public=10.68.16.132 |
> | firstinstance                 | SHUTOFF | - | NOSTATE  | public=10.68.16.212 |
> | graphite-trusty               | ACTIVE  | - | Running  | public=10.68.17.181 |
> | huggle-d2                     | ACTIVE  | - | Running  | public=10.68.17.194 |
> | icinga                        | ACTIVE  | - | Running  | public=10.68.16.195 |
> | integration-raita             | ACTIVE  | - | Running  | public=10.68.16.53 |
> | integration-slave-trusty-1013 | ACTIVE  | - | Running  | public=10.68.18.28 |
> | integration-slave-trusty-1015 | ACTIVE  | - | Running  | public=10.68.18.30 |
> | k8s-worker-02                 | ACTIVE  | - | Running  | public=10.68.18.91 |
> | kartotherian1                 | ACTIVE  | - | Running  | public=10.68.16.117 |
> | language-replag-slave         | SHUTOFF | - | Shutdown | public=10.68.16.248 |
> | maps-tiles2                   | ACTIVE  | - | Running  | public=10.68.17.110 |
> | mobile-browser-tests          | ACTIVE  | - | Running  | public=10.68.16.149 |
> | mwreview-proxy-test           | ACTIVE  | - | Running  | public=10.68.16.83 |
> | osmit-cruncher1               | ACTIVE  | - | Running  | public=10.68.17.92 |
> | puppet-jmm-debdeploy-precise  | ACTIVE  | - | Running  | public=10.68.18.106 |
> | puppet-mailman                | ACTIVE  | - | Running  | public=10.68.17.177 |
> | sentry-builder                | ACTIVE  | - | Running  | public=10.68.18.82 |
> | staging-eventlogging          | ACTIVE  | - | Running  | public=10.68.16.199 |
> | staging-ms-be03               | ACTIVE  | - | Running  | public=10.68.17.249 |
> | staging-rdb01                 | ACTIVE  | - | Running  | public=10.68.17.193 |
> | staging-tin                   | ACTIVE  | - | Running  | public=10.68.16.110 |
> | stashbot-logstash             | ACTIVE  | - | Running  | public=10.68.18.101 |
> | tools-bastion-02              | ACTIVE  | - | Running  | public=10.68.16.44, 208.80.155.132 |
> | tools-exec-1201               | ACTIVE  | - | Running  | public=10.68.17.49, 208.80.155.203 |
> | tools-exec-1202               | ACTIVE  | - | Running  | public=10.68.16.57, 208.80.155.211 |
> | tools-exec-1204               | ACTIVE  | - | Running  | public=10.68.17.88, 208.80.155.213 |
> | tools-exec-1206               | ACTIVE  | - | Running  | public=10.68.17.105, 208.80.155.215 |
> | tools-exec-1209               | ACTIVE  | - | Running  | public=10.68.17.129, 208.80.155.218 |
> | tools-exec-1213               | ACTIVE  | - | Running  | public=10.68.17.252, 208.80.155.222 |
> | tools-exec-1217               | ACTIVE  | - | Running  | public=10.68.18.20, 208.80.155.226 |
> | tools-exec-1218               | ACTIVE  | - | Running  | public=10.68.18.19, 208.80.155.227 |
> | tools-exec-1408               | ACTIVE  | - | Running  | public=10.68.18.14, 208.80.155.152 |
> | tools-exec-cyberbot           | ACTIVE  | - | Running  | public=10.68.16.39 |
> | tools-webgrid-generic-1404    | ACTIVE  | - | Running  | public=10.68.18.53 |
> | tools-webgrid-lighttpd-1409   | ACTIVE  | - | Running  | public=10.68.18.43 |
> | tools-webgrid-lighttpd-1410   | ACTIVE  | - | Running  | public=10.68.18.44 |
> | toolsbeta-exec-101            | ACTIVE  | - | Running  | public=10.68.16.7 |
> | toolsbeta-exec-201            | ACTIVE  | - | Running  | public=10.68.16.250 |
> | wikidata-mobile               | ACTIVE  | - | Running  | public=10.68.18.41 |
> | wikispy                       | ACTIVE  | - | Running  | public=10.68.17.119 |
> | wlmjurytool2014               | ACTIVE  | - | Running  | public=10.68.17.134 |
> | wmt-exec                      | ACTIVE  | - | Running  | public=10.68.17.236 |
>
> _______________________________________________
> Labs-announce mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/labs-announce
>
> _______________________________________________
> Labs-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/labs-l
