[ceph-users] Re: ceph-volume / encrypted OSD issues with functionalities

2020-12-04 Thread Panayiotis Gotsis
Responding partially to my own query, I have decided on the following structure, in order to have encrypted OSDs/Bluestore journals and not wait for proper ceph-volume support. 1) SSD(s), fully encrypted, acting as PV(s) for VG(s) to store LVs for the Block DBs. My current setup is 1 SSD for 4
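
A minimal sketch of the layout described above, assuming hypothetical names chosen for illustration (/dev/sdb as the SSD, vg-db-ssd/osd0-db as one DB LV); sizes and the later ceph-volume invocation depend on the actual setup:

    cryptsetup luksFormat /dev/sdb                 # encrypt the whole SSD
    cryptsetup open /dev/sdb db-ssd-crypt          # open the LUKS container
    pvcreate /dev/mapper/db-ssd-crypt              # PV on top of the crypt device
    vgcreate vg-db-ssd /dev/mapper/db-ssd-crypt    # VG holding the block DB LVs
    lvcreate -L 60G -n osd0-db vg-db-ssd           # one LV per OSD block DB

Each such LV can then presumably be handed to ceph-volume via --block.db when the corresponding OSD is created.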

[ceph-users] Re: MDS lost, Filesystem degraded and wont mount

2020-12-04 Thread Janek Bevendorff
This is a very common issue. Deleting mdsX_openfiles.Y has become part of my standard maintenance repertoire. As soon as you have a few more clients and one of them starts opening and closing files in rapid succession (or does other metadata-heavy things), it becomes very likely that the MDS
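
For reference, a hedged sketch of that maintenance step; the metadata pool name ("cephfs_metadata") and the rank/index in the object name are placeholders, and the MDS should be stopped first:

    rados -p cephfs_metadata ls | grep openfiles      # find the mdsX_openfiles.Y objects
    rados -p cephfs_metadata rm mds0_openfiles.0      # remove the object for rank 0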

[ceph-users] Re: block.db/block.wal device performance dropped after upgrade to 14.2.10

2020-12-04 Thread Seena Fallah
Hi, I'm facing this issue too and I see the same RocksDB log pattern Mark attached in my cluster, which means there is a burst read on my block.db. I've sent some information about my issue in this thread[1]. Hope you can help me with what's going on in my cluster. Thanks. [1]:

[ceph-users] ceph-volume / encrypted OSD issues with functionalities

2020-12-04 Thread Panayiotis Gotsis
Hello, I have made some tests with creating OSDs and I have found out that there are big issues with the ceph-volume functionality. 1) If using dmcrypt and separate data and DB block devices, ceph-volume creates crypt devices/PVs/VGs/LVs for both devices. This might seem normal, until one
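
The behaviour described corresponds to an invocation along these lines (device paths are placeholders):

    ceph-volume lvm create --bluestore --dmcrypt \
        --data /dev/sdb --block.db /dev/nvme0n1p1
    # according to the post, ceph-volume then sets up crypt devices and
    # PVs/VGs/LVs for both the data device and the DB device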

[ceph-users] Re: High read throughput on BlueFS

2020-12-04 Thread Seena Fallah
I found that bluefs_max_prefetch is set to 1048576, which equals 1 MiB! So why is it reading about 1 GiB/s? On Thu, Dec 3, 2020 at 8:03 PM Seena Fallah wrote: > My first question is about this metric: ceph_bluefs_read_prefetch_bytes > and I want to know what operation is related to this metric?
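
A hedged way to inspect both the option and the counter on a running OSD (osd.0 is a placeholder):

    ceph daemon osd.0 config get bluefs_max_prefetch         # current prefetch limit
    ceph daemon osd.0 perf dump bluefs | grep read_prefetch  # prefetch bytes/count counters

Note that bluefs_max_prefetch presumably bounds how much a single prefetch read can fetch, not the aggregate read rate, which would explain seeing GiB/s totals despite the 1 MiB setting.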

[ceph-users] Re: [Suspicious newsletter] Re: PG_DAMAGED

2020-12-04 Thread Szabo, Istvan (Agoda)
This is a completely new cluster with full SSD and NVMe :/ -Original Message- From: Eugen Block Sent: Friday, December 4, 2020 4:32 PM To: ceph-users@ceph.io Subject: [Suspicious newsletter] [ceph-users] Re: PG_DAMAGED

[ceph-users] PG_DAMAGED

2020-12-04 Thread Szabo, Istvan (Agoda)
Hi, Not sure if it is related to my 15.2.7 update, but today I got this issue many times: 2020-12-04T15:14:23.910799+0700 osd.40 (osd.40) 11 : cluster [DBG] 11.2 deep-scrub starts 2020-12-04T15:14:23.947255+0700 osd.40 (osd.40) 12 : cluster [ERR] 11.2 soid
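
A hedged example of the usual follow-up for a scrub error like the one above (PG 11.2 taken from the log; repair only once the cause is understood):

    rados list-inconsistent-obj 11.2 --format=json-pretty   # show what exactly is inconsistent
    ceph pg repair 11.2                                     # trigger a repair of the PG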

[ceph-users] bucket radoslist stuck in a loop while listing objects

2020-12-04 Thread James, GleSYS
Hi, I recently attempted to run the 'rgw-orphan-list' tool against our cluster (octopus 15.2.7) to identify any orphans and noticed that the 'radosgw-admin bucket radoslist' command appeared to be stuck in a loop. I saw in the 'radosgw-admin-XX.intermediate' output file the same sequence
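
For context, the command in question is run per bucket by the rgw-orphan-list wrapper; a hedged standalone invocation (bucket name is a placeholder) would be:

    radosgw-admin bucket radoslist --bucket=mybucket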

[ceph-users] Re: atime with cephfs

2020-12-04 Thread Filippo Stenico
Hi all, We would need the same feature in our HPC cluster. I guess this is not an infrequent problem; I was wondering if you guys found an alternative solution. Best -- Filippo Stenico Services and Support for Science IT (S3IT) Office Y11 F 52 University of Zürich Winterthurerstrasse 190,

[ceph-users] Re: MDS lost, Filesystem degraded and wont mount

2020-12-04 Thread Dan van der Ster
Excellent! For the record, this PR is the plan to fix this: https://github.com/ceph/ceph/pull/36089 (nautilus, octopus PRs here: https://github.com/ceph/ceph/pull/37382 https://github.com/ceph/ceph/pull/37383) Cheers, Dan On Fri, Dec 4, 2020 at 11:35 AM Anton Aleksandrov wrote: > > Thank you

[ceph-users] Re: MDS lost, Filesystem degraded and wont mount

2020-12-04 Thread Anton Aleksandrov
Thank you very much! This solution helped: Stop all MDS, then: # rados -p cephfs_metadata_pool rm mds0_openfiles.0 then start one MDS. We are back online. Amazing!!! :) On 04.12.2020 12:20, Dan van der Ster wrote: Please also make sure the mds_beacon_grace is high on the mons too. It

[ceph-users] Re: PG_DAMAGED

2020-12-04 Thread Dan van der Ster
In my experience inconsistencies caused by IO errors always have a SCSI Medium Error showing up in the kernel logs. (dmesg, journalctl -k, /v/l/messages, ...) (Except in the case of one very bad non-enterprise SMR drive I run at home, not at work). -- dan On Fri, Dec 4, 2020 at 11:03 AM Hans van

[ceph-users] Re: MDS lost, Filesystem degraded and wont mount

2020-12-04 Thread Dan van der Ster
Please also make sure the mds_beacon_grace is high on the mons too. It doesn't matter which MDS you select to be the running one. Is the process getting killed or restarted? If you're confident that the MDS is getting OOM-killed during the rejoin step, then you might find this useful:

[ceph-users] Re: MDS lost, Filesystem degraded and wont mount

2020-12-04 Thread Anton Aleksandrov
Yes, the MDS eats all memory+swap, stays like this for a moment and then frees the memory. mds_beacon_grace was already set to 1800. Also, on another MDS this message is seen: "Map has assigned me to become a standby". Does it matter which MDS we stop and which we leave running? Anton On 04.12.2020

[ceph-users] Re: PG_DAMAGED

2020-12-04 Thread Hans van den Bogert
Interesting, your comment implies that it is a replication issue, which does not stem from a faulty disk. But couldn't the disk have a bit flip? Or would you argue that would have shown up as a disk read error somewhere (because of the ECC on the disk)? On 12/4/20 10:51 AM, Dan van der Ster wrote:

[ceph-users] Re: MDS lost, Filesystem degraded and wont mount

2020-12-04 Thread Dan van der Ster
How many active MDSs did you have? (max_mds == 1, right?) Stop the other two MDSs so you can focus on getting exactly one running. Tail the log file and see what it is reporting. Increase mds_beacon_grace to 600 so that the mon doesn't fail this MDS while it is rejoining. Is that single MDS
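
A hedged sketch of those two checks/settings (the filesystem name "cephfs" is a placeholder, and the exact syntax may differ per release):

    ceph fs get cephfs | grep max_mds                         # confirm max_mds == 1
    ceph tell 'mon.*' injectargs '--mds_beacon_grace 600'     # raise the grace on the mons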

[ceph-users] Re: PG_DAMAGED

2020-12-04 Thread Dan van der Ster
Note that in this case the inconsistencies are not coming from object reads, but from comparing the omap digests of an rgw index shard. This seems to be a result of a replication issue sometime in the past on this cluster. On Fri, Dec 4, 2020 at 10:32 AM Eugen Block wrote: > > Hi, > > this is

[ceph-users] MDS lost, Filesystem degraded and wont mount

2020-12-04 Thread Anton Aleksandrov
Hello community, we are on ceph 13.2.8 - today something happened to one MDS and ceph status says that the filesystem is degraded. It won't mount either. I have taken the server with the MDS that was not working down. There are 2 more MDS servers, but they stay in the "rejoin" state. Also only 1 is

[ceph-users] Re: [Suspicious newsletter] Re: PG_DAMAGED

2020-12-04 Thread Eugen Block
There's no guarantee that new disks can't be faulty. We had this last year when we expanded our cluster with brand new servers and disks; one of the new OSDs failed almost immediately. You can wait and see how often this appears and whether it's always the same disk. Just keep it in mind.

[ceph-users] Re: PG_DAMAGED

2020-12-04 Thread Eugen Block
Hi, this is not necessarily, but most likely, a hint at a (slowly) failing disk. Check all OSDs for this PG for disk errors in dmesg and smartctl. Regards, Eugen Zitat von "Szabo, Istvan (Agoda)": Hi, Not sure if it is related to my 15.2.7 update, but today I got this issue many times:
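
A hedged example of those checks on the OSD host (osd.40 appears in the quoted log; /dev/sdX is a placeholder):

    ceph pg ls inconsistent             # which PGs (and acting OSDs) are affected
    dmesg -T | grep -i 'medium error'   # kernel-level read errors
    smartctl -a /dev/sdX                # SMART health of the suspect drive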