Re: [ceph-users] mon sudden crash loop - pinned map

2019-10-10 Thread Philippe D'Anjou
After trying to disable the paxos service trim temporarily (since that seemed to trigger it initially), we now see this:     "assert_condition": "from != to",     "assert_func": "void PaxosService::trim(MonitorDBStore::TransactionRef, version_t, version_t)",     "assert_file":

Re: [ceph-users] mon sudden crash loop - pinned map

2019-10-10 Thread Philippe D'Anjou
How do I import an osdmap in Nautilus? I saw documentation for older versions, but it seems one can now only export and no longer import? On Thursday, 10 October 2019, 08:52:03 EEST, Philippe D'Anjou wrote: I don't think this has anything to do with CephFS, the

Re: [ceph-users] mon sudden crash loop - pinned map

2019-10-09 Thread Philippe D'Anjou
I don't think this has anything to do with CephFS, the mon crashes for the same reason even without the mds running. I still have the old rocksdb files, but they had a corruption issue; not sure if that's easier to fix. There haven't been any changes on the cluster in between. This is a disaster

Re: [ceph-users] mon sudden crash loop - pinned map

2019-10-09 Thread Gregory Farnum
On Mon, Oct 7, 2019 at 11:11 PM Philippe D'Anjou wrote: > > Hi, > unfortunately it's a single mon, because we had a major outage on this cluster > and it's just being used to copy off data now. We weren't able to add more > mons because once a second mon was added it crashed the first one (there's a

Re: [ceph-users] mon sudden crash loop - pinned map

2019-10-08 Thread Philippe D'Anjou
Hi, unfortunately it's a single mon, because we had a major outage on this cluster and it's just being used to copy off data now. We weren't able to add more mons because once a second mon was added it crashed the first one (there's a bug tracker ticket). I still have old rocksdb files before I ran

Re: [ceph-users] mon sudden crash loop - pinned map

2019-10-07 Thread Gregory Farnum
On Sun, Oct 6, 2019 at 1:08 AM Philippe D'Anjou wrote: > > I had to use rocksdb repair tool before because the rocksdb files got > corrupted, for another reason (another bug possibly). Maybe that is why now > it crash loops, although it ran fine for a day. Yeah looks like it lost a bit of

Re: [ceph-users] mon sudden crash loop - pinned map

2019-10-06 Thread Philippe D'Anjou
I had to use the rocksdb repair tool before because the rocksdb files got corrupted, for another reason (possibly another bug). Maybe that is why it now crash loops, although it ran fine for a day. What is meant by "turn it off and rebuild from remainder"? On Saturday, 5 October 2019,

Re: [ceph-users] mon sudden crash loop - pinned map

2019-10-04 Thread Gregory Farnum
Hmm, that assert means the monitor tried to grab an OSDMap it had on disk but it didn't work. (In particular, a "pinned" full map which we kept around after trimming the others to save on disk space.) That *could* be a bug where we didn't have the pinned map and should have (or incorrectly

[ceph-users] mon sudden crash loop - pinned map

2019-10-04 Thread Philippe D'Anjou
Hi, our mon is acting up all of a sudden and dying in a crash loop with the following: 2019-10-04 14:00:24.339583 lease_expire=0.00 has v0 lc 4549352     -3> 2019-10-04 14:00:24.335 7f6e5d461700  5 mon.km-fsn-1-dc4-m1-797678@0(leader).paxos(paxos active c 4548623..4549352) is_readable = 1 -