Hi,
I still think the best approach would be to rebuild the MON store from
the OSDs as described here [2]. Just creating new MONs with the same
IDs might not be sufficient because they would miss all the OSD
keyrings etc., so you'd still have to do some work to get it up. It
might be easi
Hi,
Yes, the old mon daemons are removed. In the first post the mon daemons were
started with mon data created from scratch. After some code searching, I suspect
I could restore the cluster from all the OSDs even without the original mon
data, but I may be wrong about that. For now, it would take less reconfiguration
if I could get a few suggestions and resolve my issue that way.
Anthony D'Atri suggested mclock (newer than my Nautilus version), adding
"--osd_recovery_max_single_start 1" (which didn't seem to take effect),
"osd_op_queue_cut_off=high" (which I didn't get around to checking), and
pgremapper (from GitHub).
Pgremapper did th
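(For the record, on Nautilus those options can usually be applied centrally,
e.g. the following untested sketch; whether osd_op_queue_cut_off takes effect
without an OSD restart depends on the release:)

  ceph config set osd osd_recovery_max_single_start 1
  ceph config set osd osd_op_queue_cut_off high
  # verify what a given OSD actually runs with
  ceph config show osd.0 | egrep 'osd_recovery_max_single_start|osd_op_queue_cut_off'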
Hi,
I'm not familiar with Rook, so the required steps may vary. If you try
to reuse the old mon stores you'll have the mentioned mismatch between
the new daemons and the old monmap (which still contains the old mon
daemons). It's not entirely clear what went wrong in the first place
and wh
Hi Eugen,
Thank you for your help on this.
Forget the log. A little progress: the monitor store was restored. I
created a new Ceph cluster to use the restored monitor store, but the
monitor log complains:
debug 2023-03-09T11:00:31.233+ 7fe95234f880 0 starting mon.a rank -1
at public addrs [v2
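(Rank -1 usually means mon.a is not listed in the monmap the daemon loaded. A
hedged sketch for inspecting, and if needed editing, the monmap in the restored
store; the mon id "a", the address and the paths are placeholders:)

  # dump the monmap from the restored store (mon daemon stopped)
  ceph-mon -i a --extract-monmap /tmp/monmap
  monmaptool --print /tmp/monmap
  # if mon.a is missing, add it and inject the edited map back
  monmaptool --add a <v2-address> /tmp/monmap
  ceph-mon -i a --inject-monmap /tmp/monmap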
Thanks for the hint, I did run some short tests, all fine. I am not sure
it's a drive issue.
After some more digging, the file with bad performance has these segments:
[root@afsvos01 vicepa]# hdparm --fibmap $PWD/0
/vicepa/0:
filesystem blocksize 4096, begins at LBA 2048; assuming 512 byte sectors.
by
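(As a cross-check, the extent layout and a direct sequential read of the same
file can be compared like this; just a sketch, using the path from above:)

  filefrag -v /vicepa/0
  dd if=/vicepa/0 of=/dev/null bs=4M iflag=direct status=progress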
Hi all,
we seem to have hit a bug in the ceph fs kernel client and I just want to
confirm what action to take. We get the error "wrong peer at address" in dmesg
and some jobs on that server seem to get stuck in fs access; log extract below.
I found these 2 tracker items https://tracker.ceph.com
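(In case a client session really is wedged, one workaround that is sometimes
suggested is to find and evict the affected session; hedged sketch only, the
MDS name and session id are placeholders, the exact syntax varies by release,
and eviction blocklists the client, so the mount needs remounting afterwards:)

  # list sessions and match the address reported in dmesg
  ceph tell mds.<name> session ls
  # evict the stuck session (disruptive: the client gets blocklisted)
  ceph tell mds.<name> session evict id=<session-id>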
Hi,
we've observed HTTP 500 errors when uploading files to a single bucket, but the
problem went away after around 2 hours.
We checked the logs and saw the following error message:
2023-03-08T17:55:58.778+ 7f8062f15700 0 WARNING: set_req_state_err
err_no=125 resorting to 500 2023-03-08T17:55:58.778+
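(err_no=125 is ECANCELED. If it recurs, temporarily raising RGW log verbosity
around the next occurrence may show which operation got cancelled; hedged
sketch, the instance name is a placeholder:)

  ceph config set client.rgw.<instance> debug_rgw 10
  ceph config set client.rgw.<instance> debug_ms 1
  # revert once enough has been captured
  ceph config rm client.rgw.<instance> debug_rgw
  ceph config rm client.rgw.<instance> debug_ms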
Hi,
I haven't had the chance to play with LRC yet, so I can't really
comment on that. But can you share your osd tree as well? I assume you
already did, but can you verify that the crush rule works as expected
and the chunks are distributed correctly?
Regards,
Eugen
Quoting steve.bake.
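(One way to verify the rule and the chunk placement, as a hedged sketch; rule
name/id, k+m, pool and pgid are placeholders:)

  ceph osd crush rule dump <rule-name>
  ceph osd getcrushmap -o /tmp/crushmap
  crushtool -i /tmp/crushmap --test --rule <rule-id> --num-rep <k+m> --show-mappings | head
  # the UP/ACTING sets show where the chunks actually land
  ceph pg ls-by-pool <pool>
  ceph pg map <pgid>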
Hi,
there's no attachment in your email; please use something like
pastebin to provide the OSD logs.
Thanks
Eugen
Quoting Ben:
Hi,
I ended up with the whole set of OSDs to get the original Ceph cluster back.
I figured out how to get the cluster running. However, its status is
something as bel