Hi

I'm in a bit of a panic :-(

Recently we started trying to configure a radosgw for our Ceph cluster, which until now was only doing CephFS (and RBD was working as well). We were messing about with ceph-ansible, as that is how we originally installed the cluster. Anyway, it installed Nautilus 14.2.18 on the radosgw host, and I thought it would be good to pull the rest of the cluster up to that level as well, using our tried and tested Ceph upgrade script (it basically updates all Ceph nodes one by one and checks whether Ceph is OK again before doing the next one).
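
For context, the script boils down to roughly this (a simplified sketch; the hostnames and the update command here are placeholders, and the real script does a bit more checking):

#!/usr/bin/env python3
# Simplified sketch of our rolling-upgrade script (placeholders, not the real thing):
# update one node, wait until the cluster is healthy again, then move on to the next.
import subprocess
import time

NODES = ["cephmon1", "cephmon2", "cephmon3"]   # mons/mgrs first, then mds/osd nodes

def cluster_healthy():
    # 'ceph health' prints e.g. "HEALTH_OK" or "HEALTH_WARN ..."
    out = subprocess.check_output(["ceph", "health"]).decode()
    return out.startswith("HEALTH_OK")

for node in NODES:
    # placeholder update command; the real script updates the ceph packages
    # and restarts the daemons on that node
    subprocess.check_call(["ssh", node, "yum -y update 'ceph*' && systemctl restart ceph.target"])
    while not cluster_healthy():
        time.sleep(30)   # don't touch the next node until Ceph reports OK again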

After the 3rd mon/mgr was done, all PGs became unavailable :-(
Obviously the script stopped at that point, but Ceph is also broken now...

The deceptively mild message is: HEALTH_WARN Reduced data availability: 5568 pgs inactive

That's all PGs!

As a desperate measure I tried upgrading one Ceph OSD node, but that broke as well: the OSD service on that node gets an interrupt from the kernel...

The versions are now:
20:29 [root@cephmon1 ~]# ceph versions
{
    "mon": {
        "ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9) nautilus (stable)": 3
    },
    "mgr": {
        "ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9) nautilus (stable)": 3
    },
    "osd": {
        "ceph version 14.2.15 (afdd217ae5fb1ed3f60e16bd62357ca58cc650e5) nautilus (stable)": 156
    },
    "mds": {
        "ceph version 14.2.15 (afdd217ae5fb1ed3f60e16bd62357ca58cc650e5) nautilus (stable)": 2
    },
    "overall": {
        "ceph version 14.2.15 (afdd217ae5fb1ed3f60e16bd62357ca58cc650e5) nautilus (stable)": 158,
        "ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9) nautilus (stable)": 6
    }
}


12 OSDs are down

# ceph -s
  cluster:
    id:     b489547c-ba50-4745-a914-23eb78e0e5dc
    health: HEALTH_WARN
            Reduced data availability: 5568 pgs inactive

  services:
    mon: 3 daemons, quorum cephmon3,cephmon1,cephmon2 (age 50m)
    mgr: cephmon1(active, since 53m), standbys: cephmon3, cephmon2
    mds: cephfs:1 {0=cephmds2=up:active} 1 up:standby
    osd: 168 osds: 156 up (since 28m), 156 in (since 18m); 1722 remapped pgs

  data:
    pools:   12 pools, 5568 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:     100.000% pgs unknown
             5568 unknown

  progress:
    Rebalancing after osd.103 marked in
      [..............................]
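
(The 12 down OSDs are just the difference between the 168 total and the 156 up; if it helps I can post which ones they are, I believe "ceph osd tree down" lists only the down ones.)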

