[ceph-users] Re: mds.0.journaler.pq(ro) _finish_read got error -2 [solved]

2023-12-11 Thread Eugen Block
Just for posterity, we made the CephFS available again. We walked through the disaster recovery steps where one of the steps was to reset the journal. I was under the impression that the specified command 'cephfs-journal-tool [--rank=N] journal reset' would simply reset all the journals
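As a minimal sketch of the distinction discussed here (assuming a single-rank filesystem whose name, <fs_name>, is a placeholder), the --journal switch of cephfs-journal-tool selects which journal is acted on, so mdlog and purge_queue can be exported and reset individually:

  # back up the rank 0 MDS log journal before touching anything
  cephfs-journal-tool --rank=<fs_name>:0 --journal=mdlog journal export /root/mdlog-backup.bin
  # reset only the MDS log journal of rank 0
  cephfs-journal-tool --rank=<fs_name>:0 --journal=mdlog journal reset
  # reset the purge queue journal of rank 0 separately
  cephfs-journal-tool --rank=<fs_name>:0 --journal=purge_queue journal reset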

[ceph-users] Re: MDS recovery with existing pools

2023-12-11 Thread Konstantin Shalygin
Good to hear that, Eugen! CC'ed Zac for your docs mention. k > On Dec 11, 2023, at 23:28, Eugen Block wrote: > > Update: apparently, we did it! > We walked through the disaster recovery steps where one of the steps was to > reset the journal. I was under the impression that the specified

[ceph-users] Is there any way to merge an rbd image's full backup and a diff?

2023-12-11 Thread Satoru Takeuchi
Hi, I'm developing a backup system for RBD images. In my case, backup data must be stored for at least two weeks. To meet this requirement, I'd like to take backups as follows: 1. Take a full backup with rbd export first. 2. Take a differential backup every day. 3. Merge the full backup and the oldest
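A rough sketch of the workflow described above, with pool, image, and snapshot names as placeholders; note that rbd merge-diff takes two diff files produced by rbd export-diff, which is why merging the initial full export with a diff is the open question here:

  # 1. initial full backup taken from a snapshot
  rbd snap create rbd/vm1@base
  rbd export rbd/vm1@base /backup/vm1-full
  # 2. daily differential backups against the previous snapshot
  rbd snap create rbd/vm1@day1
  rbd export-diff --from-snap base rbd/vm1@day1 /backup/vm1-diff-day1
  rbd snap create rbd/vm1@day2
  rbd export-diff --from-snap day1 rbd/vm1@day2 /backup/vm1-diff-day2
  # 3. two diff files can be merged into one
  rbd merge-diff /backup/vm1-diff-day1 /backup/vm1-diff-day2 /backup/vm1-diff-day1-2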

[ceph-users] Re: MDS recovery with existing pools

2023-12-11 Thread Eugen Block
Update: apparently, we did it! We walked through the disaster recovery steps where one of the steps was to reset the journal. I was under the impression that the specified command 'cephfs-journal-tool [--rank=N] journal reset' would simply reset all the journals (mdlog and purge_queue), but

[ceph-users] Deleting files from lost+found in 18.2.0

2023-12-11 Thread Thomas Widhalm
Hi, I saw in the changelogs that it should now finally be possible to delete files from lost+found with 18.x. I upgraded recently, but I still can't delete or move files from there. I tried to change permissions, but every time I try it I get "read only filesystem", though only when

[ceph-users] Re: Ceph 17.2.7 to 18.2.0 issues

2023-12-11 Thread pclark6063
Thanks for this, I've replied above but sadly a client eviction and remount didn't help.

[ceph-users] Re: Ceph 17.2.7 to 18.2.0 issues

2023-12-11 Thread pclark6063
Hi, Thank you very much for the reply. So I evicted all my clients and still no luck. Checking for blocked ops returns 0 from each MDS service. Each MDS service serves a different pool suffering the same issue. If I write any recent files I can both stat and pull them, so I have zero issues
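For context, a hedged example of how blocked ops and client sessions can be checked on an MDS; the daemon name and client id are placeholders, and the same commands are also reachable via ceph daemon on the MDS host:

  # list operations currently blocked on this MDS
  ceph tell mds.<daemon_name> dump_blocked_ops
  # list connected client sessions
  ceph tell mds.<daemon_name> client ls
  # evict a single client by id
  ceph tell mds.<daemon_name> client evict id=<client_id>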

[ceph-users] Re: reef 18.2.1 QE Validation status

2023-12-11 Thread Yuri Weinstein
Per Guillaume it was tested and does not have any impact on other areas, so I will cherry-pick it for the release. Thx On Mon, Dec 11, 2023 at 7:53 AM Guillaume Abrioux wrote: > > Hi Yuri, > > > > Any chance we can include [1]? This patch fixes mpath device deployments, > the PR missed a

[ceph-users] Re: reef 18.2.1 QE Validation status

2023-12-11 Thread Guillaume Abrioux
Hi Yuri, Any chance we can include [1]? This patch fixes mpath device deployments; the PR missed a merge and was only backported onto reef this morning. Thanks, [1] https://github.com/ceph/ceph/pull/53539/commits/1e7223281fa044c9653633e305c0b344e4c9b3a4 -- Guillaume Abrioux Software

[ceph-users] mds.0.journaler.pq(ro) _finish_read got error -2

2023-12-11 Thread Eugen Block
Hi, I'm trying to help someone with a broken CephFS. We managed to recover basic ceph functionality but the CephFS is still inaccessible (currently read-only). We went through the disaster recovery steps but to no avail. Here's a snippet from the startup logs: ---snip--- mds.0.41
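A small sketch of how the purge queue journal can be examined separately from the mdlog, which may help narrow down where the -2 (ENOENT) comes from; <fs_name> is a placeholder for the filesystem name:

  # show the purge queue journal header of rank 0
  cephfs-journal-tool --rank=<fs_name>:0 --journal=purge_queue header get
  # check the purge queue journal for missing or damaged objects
  cephfs-journal-tool --rank=<fs_name>:0 --journal=purge_queue journal inspect
  # compare with the regular MDS log journal
  cephfs-journal-tool --rank=<fs_name>:0 --journal=mdlog journal inspect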

[ceph-users] Re: MDS recovery with existing pools

2023-12-11 Thread Eugen Block
So we did walk through the advanced recovery page but didn't really succeed. The CephFS still goes read-only because of the purge_queue error. Is there any chance to recover from that, or should we try to recover with an empty metadata pool next? I'd still appreciate any comments. ;-)

[ceph-users] Re: Osd full

2023-12-11 Thread David C.
Hi Mohamed, Changing weights is no longer good practice; the balancer is supposed to do that job. The number of PGs per OSD is really tight on your infrastructure. Can you show the output of the ceph osd tree command? Regards, *David CASIER*
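To illustrate the advice above, a few read-only commands for checking the PG distribution and the balancer state (stock Ceph CLI, nothing cluster-specific assumed):

  # per-OSD utilisation and PG counts, laid out along the CRUSH tree
  ceph osd df tree
  # current balancer state and mode
  ceph balancer status
  # the topology output requested above
  ceph osd tree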

[ceph-users] Re: Ceph 16.2.14: ceph-mgr getting oom-killed

2023-12-11 Thread Zakhar Kirpichenko
Hi, Another update: after 2 more weeks the mgr process grew to ~1.5 GB, which again was expected: mgr.ceph01.vankui ceph01 *:8443,9283 running (2w) 102s ago 2y 1519M - 16.2.14 fc0182d6cda5 3451f8c6c07e mgr.ceph02.shsinf ceph02 *:8443,9283 running (2w) 102s ago 7M

[ceph-users] Re: OSD CPU and write latency increase after upgrade from 15.2.16 to 17.2.6

2023-12-11 Thread Tony Yao
After analysis: our cluster uses compression, but ISAL was not used during compression. The compilation of ISAL compression in the current code depends on the macro HAVE_NASM_X64_AVX2. However, that macro has been removed, resulting in compression not using ISAL even if

[ceph-users] Osd full

2023-12-11 Thread Mohamed LAMDAOUAR
Hello team, We initially had a cluster of 3 machines with 4 OSDs on each machine, and we added 4 machines to the cluster (each machine with 4 OSDs). We launched the rebalancing but it never finished; it is still in progress. The big issue: we have a full OSD and all the pools on this OSD are read-only.
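As a hedged sketch of the usual first steps when an OSD hits the full threshold (the ratio below is only an example value, and raising it is a temporary measure to regain write access while data is moved off the OSD):

  # identify the full OSD(s) and the current thresholds
  ceph health detail
  ceph osd df
  ceph osd dump | grep ratio
  # temporarily raise the full ratio (example value), then rebalance
  ceph osd set-full-ratio 0.97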

[ceph-users] Re: ceph fs (meta) data inconsistent

2023-12-11 Thread Xiubo Li
Hi Frank, Sure, just take your time. Thanks - Xiubo On 12/8/23 19:54, Frank Schilder wrote: Hi Xiubo, I will update the case. I'm afraid this will have to wait a little bit though. I'm too occupied for a while and also don't have a test cluster that would help speed things up. I will