On Sun, Mar 3, 2013 at 7:51 AM, Petr Bena <[email protected]> wrote:
> Hi,
>
> today is the second time that bastion was inaccessible:
>
> If you are having access problems, please see:
>
> https://wikitech.wikimedia.org/wiki/Access#Accessing_public_and_private_instances
> debug1: Authentications that can continue: publickey
> debug1: Next authentication method: publickey
> debug1: Offering RSA public key: /home/petanb/.ssh/id_rsa
> debug2: we sent a publickey packet, wait for reply
>
>
> If we can't have a different way to authenticate than using public
> keys, WHICH ARE broken often, can we at least have a second stable
> login server?
>
> BTW I assume that logins didn't work because of gluster, so it
> wouldn't have worked anyway, but if gluster sucks so hard, can we at
> least have password auth until you fix it? Bad authentication is
> better than no working authentication.
>

Though I'm usually more than happy to blame gluster, this was not caused by gluster. It was because someone OOM'd the instance. We've actually finally stabilized gluster to a point where we shouldn't be having complete outages any more:

https://ganglia.wikimedia.org/latest/?r=month&cs=&ce=&m=cpu_report&s=by+name&c=Glusterfs+cluster+pmtpa&h=&host_regex=&max_graphs=0&tab=m&vn=&sh=1&z=small&hc=4

Note in the above graph that for the past week and a half the memory usage has been mostly flat. There was one spot where the memory ballooned, then a spot where it dropped. That last memory balloon was before the changes we put in place, and the drop was where I restarted the glusterd processes (which doesn't affect filesystem access).

There are some split-brain issues still around from the most recent round of instability, but the SSH keys are perfectly fine.

I will not enable password authentication. It's incredibly insecure.

So, to get a little more back on point, I've just created bastion2.wmflabs.org and bastion3.wmflabs.org, in case the bastion instances OOM again.

- Ryan
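
For anyone who wants to script the fallback between bastions, here is a minimal sketch. It assumes the original host is reachable as bastion.wmflabs.org (an assumption; only bastion2.wmflabs.org and bastion3.wmflabs.org are named above), and the helper names `port_open` and `first_reachable` are hypothetical. It only checks that TCP port 22 accepts connections, so an instance that is out of memory but still accepting connections may pass the check and still fail at login.

```python
import socket
from typing import Callable, Iterable, Optional

# Assumed host list: the first entry is a guess at the original bastion's
# name; the other two are the new instances named in the message.
BASTIONS = [
    "bastion.wmflabs.org",
    "bastion2.wmflabs.org",
    "bastion3.wmflabs.org",
]


def port_open(host: str, port: int = 22, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def first_reachable(
    hosts: Iterable[str],
    probe: Callable[[str], bool] = port_open,
) -> Optional[str]:
    """Return the first host for which probe(host) is true, else None."""
    for host in hosts:
        if probe(host):
            return host
    return None

# Example use (performs real network connections):
#   host = first_reachable(BASTIONS)
#   print(host or "no bastion reachable")
```

The probe is passed in as a parameter so the selection logic can be exercised without touching the network, and so a stricter check (for example, one that actually attempts an SSH login) can be swapped in later.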
_______________________________________________
Labs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/labs-l
