Are the clocks dramatically out of sync? Basically any bug in signing could
cause that kind of log message, but I think simple time skew, so that client
and cluster end up using different keys, is the most common cause.
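A quick way to check is to compare the affected client's clock against the
MONs. This is only a sketch: the MON-side reading is faked locally here, and
in practice you would fetch it over SSH (the `ssh mon1` line in the comment is
a placeholder for your own MON host). On a MON, `ceph time-sync-status` also
reports mon-to-mon skew.

```shell
#!/bin/sh
# Sketch: estimate skew between this client's clock and a MON's clock.
# 'ref' is a stand-in for the MON's reading; in practice something like:
#   ref=$(ssh mon1 date +%s)
ref=$(date +%s)          # placeholder: substitute the MON's clock reading
now=$(date +%s)
skew=$((now - ref))
echo "clock skew vs MON: ${skew}s"
# cephx tickets are time-limited; a skew of more than a few minutes can
# make ticket validation fail and surface as "bad authorize reply".
```

If the skew is more than a few minutes, fixing NTP/chrony on the client is
the first thing to try.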
On Mon, Jul 24, 2017 at 9:36 AM <[email protected]> wrote:

> Hi,
>
> I'm running a Ceph cluster which I started back in bobtail age and kept it
> running/upgrading over the years. It has three nodes, each running one MON,
> 10 OSDs and one MDS. The cluster has one MDS active and two standby.
> Machines are 8-core Opterons with 32GB of ECC RAM each. I'm using it to
> host our clients (about 25) /home using CephFS and as a RBD Backend for a
> couple of libvirt VMs (about 5).
>
> Currently I'm running 11.2.0 (kraken), and a couple of months ago I started
> experiencing some strange behaviour. Exactly two of my ~25 CephFS clients
> (always the same two) keep freezing their /home about one or two hours after
> first boot in the morning. At the moment of the freeze, syslog starts
> reporting loads of:
>
> _hostname_ kernel: libceph: osdXX 172.16.0.XXX:68XX bad authorize reply
>
> On one of the clients I replaced every single piece of hardware, including
> the NIC, switch and network cabling, so that machine is completely new now,
> and I also did a complete OS reinstall. But the user still sees the same
> behaviour. As far as I can tell, key renegotiation is failing and the client
> keeps trying to connect with its old cephx key, but I cannot find a reason
> why this is happening or how to fix it.
>
> The biggest problem: the second affected machine is our CEO's, and if we
> don't fix it I will have a hard time arguing that Ceph is the way to go.
>
> The two affected machines do not share any common network segment other
> than the TOR switch in the Ceph rack, while other clients that do share a
> network segment with the affected machines aren't affected at all.
>
> Google doesn't help me on this one either; it seems no one else is
> experiencing anything similar.
>
> All clients run Debian Jessie with the 4.9 backports kernel, using the
> kernel client to mount CephFS. I think the whole thing started with a
> kernel upgrade from one 4.x series to another, but I cannot reconstruct
> exactly when.
>
> Any help greatly appreciated.
>
> Best regards,
> Tobi
>
>
> _______________________________________________
> ceph-users mailing list
> [email protected]
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>