Could it be that 'ceph osd require-osd-release luminous' was never run on your cluster?

When I check a luminous cluster I get this:

host1:~ # ceph osd dump | grep recovery
flags sortbitwise,recovery_deletes,purged_snapdirs

The flags in the code you quote seem related to that.
Can you check that output on your cluster?
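
To check for both flags from the quoted code in one go, something like the following might help. It is only a sketch: the function name missing_upgrade_flags is made up for illustration, and it assumes the "flags" line of 'ceph osd dump' has the same comma-separated form as the output above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ceph): reads `ceph osd dump` output on
# stdin and prints every flag the monitor check needs that is NOT present.
# Assumes a line of the form: "flags sortbitwise,recovery_deletes,..."
missing_upgrade_flags() {
    # Extract the comma-separated flag list from the "flags" line.
    flags=$(grep '^flags ' | sed 's/^flags //')
    for want in recovery_deletes purged_snapdirs; do
        # Wrap in commas so only whole flag names match.
        case ",$flags," in
            *",$want,"*) ;;          # flag is set, nothing to report
            *) echo "$want" ;;       # flag is missing
        esac
    done
}
```

Run it as 'ceph osd dump | missing_upgrade_flags'. No output means both flags are set; otherwise the missing flag names are printed one per line.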

Found this in a thread from last year [1].


[1] https://www.spinics.net/lists/ceph-devel/msg40191.html

Quoting Andrew Bruce <dbmail1...@gmail.com>:

Hello All! Yesterday I started the upgrade from luminous to mimic with one of my 3 MONs.

After applying the mimic yum repo and updating, a restart reports the following error in the MON log file:

==> /var/log/ceph/ceph-mon.lvtncephx121.log <==
2019-02-07 10:02:40.110 7fc8283ed700 -1 mon.lvtncephx121@0(probing) e4 handle_probe_reply existing cluster has not completed a full luminous scrub to purge legacy snapdir objects; please scrub before upgrading beyond luminous.

My question is simply: What exactly does this require?

Yesterday afternoon I did a manual:

ceph osd scrub all

But that had no effect. I still get the same message when restarting the MON.

I have no errors in the cluster except for the single MON (lvtncephx121) that I'm trying to migrate to mimic first:

[root@lvtncephx110 ~]# ceph status
    id:     5fabf1b2-cfd0-44a8-a6b5-fb3fd0545517
    health: HEALTH_WARN
            1/3 mons down, quorum lvtncephx122,lvtncephx123

    mon: 3 daemons, quorum lvtncephx122,lvtncephx123, out of quorum: lvtncephx121
    mgr: lvtncephx122(active), standbys: lvtncephx123, lvtncephx121
    mds: cephfs-1/1/1 up  {0=lvtncephx151=up:active}, 1 up:standby
    osd: 18 osds: 18 up, 18 in
    rgw: 2 daemons active

    pools:   23 pools, 2016 pgs
    objects: 2608k objects, 10336 GB
    usage:   20689 GB used, 39558 GB / 60247 GB avail
    pgs:     2016 active+clean

    client:   5612 B/s rd, 3756 kB/s wr, 1350 op/s rd, 412 op/s wr

FWIW: The source code has the following:

// Monitor.cc
    if (!osdmon()->osdmap.test_flag(CEPH_OSDMAP_PURGED_SNAPDIRS) ||
        !osdmon()->osdmap.test_flag(CEPH_OSDMAP_RECOVERY_DELETES)) {
      derr << __func__ << " existing cluster has not completed a full luminous"
           << " scrub to purge legacy snapdir objects; please scrub before"
           << " upgrading beyond luminous." << dendl;

So two questions:
How can I show the current flags in the OSD map that the monitor checks?
How do I get these flags set so the MON will actually start?


