Re: [ceph-users] lease_timeout - new election

2017-08-25 Thread Webert de Souza Lima
Oh god
root@bhs1-mail03-ds03:~# zgrep "lease" /var/log/ceph/*.gz
/var/log/ceph/ceph-mon.bhs1-mail03-ds03.log.2.gz:2017-08-24 06:39:22.384112 7f44c60f1700 1 mon.bhs1-mail03-ds03@2(peon).paxos(paxos updating c 8973251..8973960) lease_timeout -- calling new election
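
For anyone wanting to collect these events across several rotated logs, the zgrep above could be approximated with a small script. This is only a rough sketch, not anything shipped with Ceph: the function names (parse_lease_timeout, scan_logs) and the regular expression are my own assumptions, modelled solely on the single log line quoted above, and the default log directory is taken from the same command.

```python
import gzip
import re
from pathlib import Path

# Assumed line layout, based only on the monitor log line quoted above:
# <date> <time> <thread id> <level> mon.<name>@<rank>(<role>).paxos(...) lease_timeout ...
LEASE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+\S+\s+\d+\s+"
    r"mon\.(?P<mon>[^@\s]+)@(?P<rank>\d+)\((?P<role>\w+)\).*lease_timeout"
)


def parse_lease_timeout(line):
    """Return timestamp, monitor name, rank and role, or None if no match."""
    match = LEASE_RE.match(line)
    return match.groupdict() if match else None


def scan_logs(log_dir="/var/log/ceph"):
    """Yield (file name, parsed fields) for each lease_timeout line in rotated mon logs."""
    for path in sorted(Path(log_dir).glob("ceph-mon.*.log*.gz")):
        # Open as text; replace undecodable bytes instead of failing on them.
        with gzip.open(path, "rt", errors="replace") as handle:
            for line in handle:
                hit = parse_lease_timeout(line)
                if hit:
                    yield path.name, hit


if __name__ == "__main__":
    for name, hit in scan_logs():
        print(name, hit["ts"], hit["mon"], hit["role"])
```

Running it on each monitor host and comparing the timestamps should show whether the timeouts cluster around the same time of day.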

Re: [ceph-users] lease_timeout - new election

2017-08-21 Thread Webert de Souza Lima
I really need some help with this. This is happening very frequently and I can't figure out why. My services rely on cephfs, and when this happens the mds suicides. It's always the same; see the logs from the last occurrence: host bhs1-mail03-ds03: 2017-08-19 06:35:54.072809 7f44c60f1700 1

Re: [ceph-users] lease_timeout - new election

2017-08-09 Thread Webert de Souza Lima
Hi David, thanks for your feedback. With that in mind, I did remove a 15TB RBD pool about an hour before this happened. I wouldn't think it was related, because nothing different was going on after I removed it. Not even high system load. But considering what you said, I

Re: [ceph-users] lease_timeout - new election

2017-08-09 Thread David Turner
I just want to point out that there are many different types of network issues that don't involve entire networks: a bad NIC, a bad or loose cable, a service on a server restarting or modifying the network stack, etc. That said, there are other things that can prevent an mds service, or any service from

[ceph-users] lease_timeout - new election

2017-08-09 Thread Webert de Souza Lima
Hi, I recently had an mds outage because the mds suicided due to "dne in the mds map". I've asked about it here before, and I know that happens because the monitors took this mds out of the mds map even though it was alive. The weird thing is, there were no network-related issues happening at the time, which