I found one mistake in my procedure; I didn't read the docs as carefully as I should have:

# If your monitors' ids are not sorted by ip address, please specify them in order.
# For example, if mon 'a' is 10.0.0.3, mon 'b' is 10.0.0.2, and mon 'c' is 10.0.0.4,
# please pass "--mon-ids b a c".
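The required ordering can be derived mechanically. A small sketch, using the example id/IP pairs from the docs quote above (not real cluster data):

```shell
# Sort mon ids by their IP address to build the --mon-ids argument.
# -V (GNU version sort) orders dotted IPs numerically rather than lexically.
printf '%s\n' 'a 10.0.0.3' 'b 10.0.0.2' 'c 10.0.0.4' \
  | sort -k2,2V \
  | awk '{print $1}' \
  | xargs
# prints: b a c
```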

I also didn't recreate the auth keys, to limit the test to the rebuild of the mon store only, so I can't really comment on that. So from my perspective, there's no need to write anything new. The only improvement to the docs would be an adaptation of the process for cephadm deployments. I'm not sure if I will have the time for that, though.


Zitat von Eugen Block <ebl...@nde.ag>:

Alright, I just went through the procedure myself on a test cluster I set up this morning. It took a bit more than the procedure mentioned in the docs. I'm not entirely sure if I made mistakes along the way, but I have the cluster back up, so in general it (still) works.

I'll try to write up something to summarize what I've been facing, but my time is currently quite limited.

Zitat von Vinícius Barreto <viniciuschag...@gmail.com>:

I had a similar problem last year and created this thread on the proxmox
forum to help anyone going through the same situation:

Note: The explanation and documentation are quite long. If you read and
study them calmly, you will be able to solve it.

Note 2: At the time, it took 2 to 3 nights of sleep away from me.

Good luck

[SOLVED] - ceph cluster rebuild - Import bluestore OSDs from old cluster
(bad fsid) - OSD doesn't start, it only stays in down state [1]

Sent from my iPhone

On 20 Aug 2025, at 14:37, Eugen Block <ebl...@nde.ag> wrote:

I feel like there's still a misunderstanding here.

The mentioned procedure is:

ms=/root/mon-store
mkdir $ms

# collect the cluster map from stopped OSDs
for host in $hosts; do
  rsync -avz $ms/. user@$host:$ms.remote
  rm -rf $ms
  ssh user@$host <<EOF
    for osd in /var/lib/ceph/osd/ceph-*; do
      ceph-objectstore-tool --data-path \$osd --no-mon-config \
        --op update-mon-db --mon-store-path $ms.remote
    done
EOF
  rsync -avz user@$host:$ms.remote/. $ms
done

It collects the clustermap on each host, querying each OSD, and "merges"
everything into one store, the local $ms store. That store is then used
to start up the first monitor. So however you do this, make sure you
have all the clustermaps in one store. Did you stop the newly created
mon first? And the ceph-mon.target doesn't tell you anything here;
that's always enabled to ensure the MONs start automatically after boot.

Can you clarify that you really have all the clustermaps in one store?
If not, you'll need to repeat the steps. In theory the steps should work
exactly as they're described.
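To tie the thread's commands together, once the collection loop above has merged all clustermaps into one store, the rebuild and store swap look roughly like this. This is a hedged sketch, not a verified runbook: the keyring path and mon ids are the Proxmox examples from this thread, $mon is a placeholder, and the point from above applies, the monitor must be stopped before the swap.

```shell
ms=/root/mon-store
mon=pve01   # placeholder: id of the first monitor to bring back

# rebuild the mon store from the collected cluster maps
# (keyring path and --mon-ids are the Proxmox examples from this thread)
ceph-monstore-tool $ms rebuild -- \
  --keyring /etc/pve/priv/ceph.client.admin.keyring \
  --mon-ids pve01 pve02 pve03

# stop the monitor before replacing its store, then swap and fix ownership
systemctl stop ceph-mon@$mon
mv /var/lib/ceph/mon/ceph-$mon/store.db /var/lib/ceph/mon/ceph-$mon/store.db-bkp
cp -r $ms/store.db /var/lib/ceph/mon/ceph-$mon/
chown -R ceph:ceph /var/lib/ceph/mon/ceph-$mon/store.db
systemctl start ceph-mon@$mon
```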

Zitat von Gilberto Ferreira <gilberto.nune...@gmail.com>:

That's strange.

Now I have only the ceph-mon.target available:



systemctl status ceph-mon.target

● ceph-mon.target - ceph target allowing to start/stop all ceph-mon@.service instances at once
     Loaded: loaded (/usr/lib/systemd/system/ceph-mon.target; enabled; preset: enabled)
     Active: active since Wed 2025-08-20 14:07:12 -03; 1min 47s ago
 Invocation: 1fcbb21af715460294bd6d8549557ed9

Notice: journal has been rotated since unit was started, output may be incomplete.



And you did rebuild the store from all OSDs as I mentioned, correct?

Yes...

Like that:



ceph-volume lvm activate --all
mkdir /root/mon-store
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --no-mon-config \
  --op update-mon-db --mon-store-path mon-store/
ceph-monstore-tool mon-store/ rebuild -- --keyring \
  /etc/pve/priv/ceph.client.admin.keyring --mon-ids pve01 pve02 pve03
mv /var/lib/ceph/mon/ceph-pve01/store.db/ /var/lib/ceph/mon/ceph-pve01/store.db-bkp
cp -rf mon-store/store.db/ /var/lib/ceph/mon/ceph-pve01/
chown -R ceph:ceph /var/lib/ceph/mon/ceph-pve01/store.db

On each node.

---
Gilberto Nunes Ferreira
+55 (47) 99676-7530 - Whatsapp / Telegram

On Wed, 20 Aug 2025 at 13:49, Eugen Block <ebl...@nde.ag> wrote:



What does the monitor log say? Does it at least start successfully? And
you did rebuild the store from all OSDs as I mentioned, correct?



Zitat von Gilberto Ferreira <gilberto.nune...@gmail.com>:



Hi again...

I have reinstalled all Proxmox nodes and installed Ceph on each node.
Created the mons and mgr on each node.
I issued the command ceph-volume lvm activate --all on each node, in
order to bring up /var/lib/ceph/osd/<node>.
After that I ran these commands:

ceph-volume lvm activate --all
mkdir /root/mon-store
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --no-mon-config \
  --op update-mon-db --mon-store-path mon-store/
ceph-monstore-tool mon-store/ rebuild -- --keyring \
  /etc/pve/priv/ceph.client.admin.keyring --mon-ids pve01 pve02 pve03
mv /var/lib/ceph/mon/ceph-pve01/store.db/ /var/lib/ceph/mon/ceph-pve01/store.db-bkp
cp -rf mon-store/store.db/ /var/lib/ceph/mon/ceph-pve01/
chown -R ceph:ceph /var/lib/ceph/mon/ceph-pve01/store.db

But now I got nothing!
No monitor, no manager, no OSD, none!

Perhaps somebody can point out what I did wrong.

Thanks



On Wed, 20 Aug 2025 at 11:32, Gilberto Ferreira <gilberto.nune...@gmail.com> wrote:



I can see the content of the mentioned folders just after issuing the
ceph-volume... command.

Thanks anyway.







On Wed, 20 Aug 2025 at 11:26, Eugen Block <ebl...@nde.ag> wrote:



I assume you're right. Do you see the OSD contents in
/var/lib/ceph/osd/ceph-pve01 after activating?
And remember to collect the clustermap from all OSDs for this
procedure to succeed.



Zitat von Gilberto Ferreira <gilberto.nune...@gmail.com>:



I see...

But I had another problem.
The script from [0] indicates that a /var/lib/ceph/osd folder should
exist, like:
/var/lib/ceph/osd/ceph-pve01
/var/lib/ceph/osd/ceph-pve02
and so on.

But this folder appears only if I run ceph-volume lvm activate --all.
So my question is: when should I run this command, before or after
using the script?
I think I need to run ceph-volume lvm activate --all first, right?
Just to clarify.

Thanks



On Wed, 20 Aug 2025 at 11:08, Eugen Block <ebl...@nde.ag> wrote:



Yes, you need a monitor. The mgr is not required and can be deployed
later. After you have created the monitor, replace the mon store
contents with the collected clustermaps from the mentioned procedure.
Keep the ownerships of the directories/files in mind. If the monitor
starts successfully (with the original FSID), you can try to start one
of the OSDs. If that works, start the rest of them, wait for the
peering storm to settle, then create two more monitors and two mgr
daemons.

Note that if you lose the mon store and you had a CephFS, you'll need
to recreate that from the existing pools.
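For the CephFS part, the disaster-recovery docs describe recreating the filesystem entry from the surviving pools. A hedged sketch: the filesystem and pool names below are placeholders (not from this thread), and the --recover flag requires a reasonably recent Ceph release.

```shell
# Recreate the FSMap entry from the existing metadata/data pools after a
# mon store rebuild. Names are placeholders.
ceph fs new myfs cephfs_metadata cephfs_data --force --recover
# --recover leaves the fs not joinable, so MDS daemons stay down until
# you've verified the state; then allow MDS daemons to join:
ceph fs set myfs joinable true
```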



Zitat von Gilberto Ferreira <gilberto.nune...@gmail.com>:



> Hi
>
> Do I need to create any mon and/or mgr in the new ceph cluster?
>
> On Mon, 18 Aug 2025 at 13:03, Eugen Block <ebl...@nde.ag> wrote:
>

>> Hi,
>>
>> this sounds like you created a new cluster (new fsid), while the OSDs
>> still have the previous fsid configured. I'd rather recommend
>> following this procedure [0] to restore the mon store utilizing the
>> OSDs than trying to manipulate otherwise intact OSDs to fit into the
>> "new" cluster. That way you'll have "your" cluster back. I don't know
>> if there are any specifics to using Proxmox, though. But the mentioned
>> procedure seems to work just fine, I've read multiple reports on this
>> list. Luckily, I haven't had to use it myself.
>>
>> Regards,
>> Eugen
>>
>> [0]
>> https://docs.ceph.com/en/latest/rados/troubleshooting/troubleshooting-mon/#recovery-using-osds
>>

>> Zitat von Gilberto Ferreira <gilberto.nune...@gmail.com>:

>>

>> > Hi
>> >
>> > I have a 3-node Proxmox cluster with Ceph, and after a crash I had
>> > to reinstall Proxmox from scratch, along with Ceph.
>> > The OSDs are intact.
>> > I already did ceph-volume lvm activate --all; the OSDs appear in
>> > ceph-volume lvm list, and I got a folder with the name of each OSD
>> > under /var/lib/ceph/osd.
>> > However, they do not appear in ceph osd tree or ceph -s, or even in
>> > the web GUI.
>> > Is there any way to re-add these OSDs to Proxmox Ceph?
>> >
>> > Thanks a lot for any help.
>> >
>> > Best Regards
>> > ---
>> > Gilbert
>> > _______________________________________________
>> > ceph-users mailing list -- ceph-users@ceph.io
>> > To unsubscribe send an email to ceph-users-le...@ceph.io


Links:
------
[1] https://forum.proxmox.com/threads/ceph-cluster-rebuild-import-bluestore-osds-from-old-cluster-bad-fsid-osd-dont-start-he-only-stays-in-down-state.151349/#post-687214

