btw I also ran free on bastion1 and memory was almost entirely unused, so the theories about OOM don't apply here
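
On the cron sync idea quoted below, here is a rough sketch of what I have in mind, in Python. It is only an illustration under assumptions: the function name sync_keys and the local cache location are made up, and sshd would still have to be pointed at the local copy, which is not shown. The point is that when /public/keys is unreadable (for example the IO error I saw) or comes back empty, the last good copy is left alone instead of being wiped.

```python
import os
import shutil
import tempfile


def sync_keys(source, cache):
    """Mirror `source` into `cache`; return True if the cache was refreshed.

    Hypothetical helper. If `source` cannot be listed (e.g. an I/O error
    while the shared storage is down) or is empty, the existing cache is
    left untouched and False is returned.
    """
    try:
        entries = os.listdir(source)
    except OSError:
        return False
    if not entries:
        # An empty tree most likely means a broken mount; keep the old copy.
        return False

    parent = os.path.dirname(os.path.abspath(cache))
    os.makedirs(parent, exist_ok=True)
    # Stage next to the cache so the final rename stays on one filesystem.
    staging = tempfile.mkdtemp(prefix=".keys-staging-", dir=parent)
    staged_copy = os.path.join(staging, "keys")
    try:
        # Copy into staging first, so a failure halfway through never
        # leaves a partially written cache behind.
        shutil.copytree(source, staged_copy)
    except (OSError, shutil.Error):
        shutil.rmtree(staging, ignore_errors=True)
        return False

    # Swap the fresh copy into place, then drop the previous one.
    old = cache + ".old"
    shutil.rmtree(old, ignore_errors=True)
    if os.path.exists(cache):
        os.rename(cache, old)
    os.rename(staged_copy, cache)
    shutil.rmtree(staging, ignore_errors=True)
    shutil.rmtree(old, ignore_errors=True)
    return True
```

A cron entry could call something like sync_keys("/public/keys", "/var/local/keys-cache") every few minutes; the second path is just a placeholder. There is a short window between the two renames where the cache does not exist, so this is a sketch rather than something to deploy as is.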

On Wed, Mar 6, 2013 at 6:04 PM, Petr Bena <[email protected]> wrote:
> On Wed, Mar 6, 2013 at 6:01 PM, Jeremy Baron <[email protected]> wrote:
>> On Wed, Mar 6, 2013 at 4:54 PM, Petr Bena <[email protected]> wrote:
>>> okay this is third time when we have same outage... bastion2 and 3
>>> were accessible for short time after bastion1's gluster died, then
>>> they died as well. public keys weren't accessible on any of them so
>>> basically labs were inaccessible for anyone.
>>
>> citation needed. I was just able to log in to both of
>> bastion[23].wmflabs.org on the first try.
>
> yes - I said they were working, now they don't. When bastion1 died I
> was still there, so I could check that /public/keys wasn't readable at
> all (some IO error) I don't have root on bastion so I couldn't read
> logs though, no idea why they aren't readable for everyone
>
>>
>> [removed garbage about password auth being wonderful...]
>>
>
> HOW DARE YOU
>
>>> Set up a cron script that sync a local folder on bastion with
>>> /public/keys so that when gluster is down or that folder isn't working
>>> login to bastion's still works.
>>
>> That might be feasible. But really the solution is don't let people
>> kill the bastion. idk how we do that. and idk why the past social
>> restrictions aren't sufficient. maybe we need ulimit or cgroups or
>> something. :-(
>>
>
> it weren't people who kill them it was gluster or something like that
> - we need reliable storage for keys if it's only way to login
>
>> -Jeremy

_______________________________________________
Labs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/labs-l
