[ceph-users] Re: cephfs - blacklisted client coming back?

2020-11-09 Thread Andras Pataki
Hi Dan, That makes sense - the time between blacklist and magic comeback was around 1 hour - thanks for the explanation.  Is this a safe default?  At eviction, the MDS takes all caps from the client away, so if it comes back in an hour, doesn't it then write to files that it perhaps

[ceph-users] Re: The feasibility of mixed SSD and HDD replicated pool

2020-11-09 Thread 胡 玮文
For a read-only workload, this should make no difference, since all reads are normally served from SSD. But I think it is still beneficial for writes, backfilling and recovery. I will also have some HDD-only pools, so WAL/DB on SSD will definitely improve performance for these pools. I will always put
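For context, a minimal sketch of the kind of hybrid CRUSH rule being discussed (assuming device classes ssd and hdd under a root named default; the rule name and id are illustrative). It places the primary replica on an SSD host and the remaining replicas on HDD hosts, so reads are normally served from SSD:

  rule hybrid_ssd_hdd {
      id 10
      type replicated
      min_size 1
      max_size 10
      # first replica (the primary, which serves reads) on an SSD host
      step take default class ssd
      step chooseleaf firstn 1 type host
      step emit
      # remaining replicas on HDD hosts
      step take default class hdd
      step chooseleaf firstn -1 type host
      step emit
  }

A rule like this is added by decompiling and recompiling the CRUSH map (ceph osd getcrushmap, crushtool -d, crushtool -c, ceph osd setcrushmap -i) and then assigned to the pool with: ceph osd pool set <pool> crush_rule hybrid_ssd_hdd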

[ceph-users] Re: The feasibility of mixed SSD and HDD replicated pool

2020-11-09 Thread 胡 玮文
On 2020-11-10 at 02:26, Dave Hall wrote:  This thread caught my attention. I have a smaller cluster with a lot of OSDs sharing the same SSD on each OSD node. I mentioned in an earlier post that I found a statement in

[ceph-users] cephfs - blacklisted client coming back?

2020-11-09 Thread Andras Pataki
We had some network problems (high packet drops) to some cephfs client nodes that run ceph-fuse (14.2.13) against a Nautilus cluster (on version 14.2.8).  As a result a couple of clients got evicted (as one would expect).  What was really odd is that the clients were trying to flush data they

[ceph-users] Re: NoSuchKey on key that is visible in s3 list/radosgw bk

2020-11-09 Thread Rafael Lopez
Hi Mariusz, all We have seen this issue as well, on redhat ceph 4 (I have an unresolved case open). In our case, `radosgw-admin stat` is not a sufficient check to guarantee that there are rados objects. You have to do a `rados stat` to know that. In your case, the object is ~48M in size, appears
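A rough sketch of the kind of check described above (bucket, object and pool names are illustrative; the exact head-object name depends on the bucket marker and whether the object is multipart):

  # RGW-level view: this can succeed even when the rados objects are gone
  radosgw-admin object stat --bucket=mybucket --object=myobject
  # find the bucket marker/id used to prefix rados object names
  radosgw-admin bucket stats --bucket=mybucket | grep -E '"(id|marker)"'
  # RADOS-level check of the head object in the data pool
  rados -p default.rgw.buckets.data stat '<marker>_myobject'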

[ceph-users] Re: cephfs - blacklisted client coming back?

2020-11-09 Thread Dan van der Ster
Hi Andras, The osd blocklist entries expire after 1hr by default: Option("mon_osd_blacklist_default_expire", Option::TYPE_FLOAT, Option::LEVEL_ADVANCED) .set_default(1_hr) .add_service("mon") .set_description("Duration in seconds that blacklist entries for clients "
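For reference, a sketch of how that expiry can be inspected and adjusted on a Nautilus cluster (the value in seconds is illustrative; in Pacific and later the option is spelled mon_osd_blocklist_default_expire):

  # current expiry for new blacklist entries (default 3600 seconds)
  ceph config get mon mon_osd_blacklist_default_expire
  # e.g. extend it to 24 hours
  ceph config set mon mon_osd_blacklist_default_expire 86400
  # list currently blacklisted client addresses and their expiry times
  ceph osd blacklist ls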

[ceph-users] cephfs forward scrubbing docs

2020-11-09 Thread Dan van der Ster
Hi, Today while debugging something we had a few questions that might lead to improving the cephfs forward scrub docs: https://docs.ceph.com/en/latest/cephfs/scrub/ tldr: 1. Should we document which sorts of issues that the forward scrub is able to fix? 2. Can we make it more visible (in docs)
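For anyone landing here, a sketch of the forward scrub commands (assuming a filesystem named cephfs with an active rank 0 MDS; on older releases the MDS may need to be addressed by its daemon name, e.g. ceph tell mds.a ...):

  # start a recursive forward scrub from the root, repairing what it can
  ceph tell mds.cephfs:0 scrub start / recursive,repair
  # check scrub progress
  ceph tell mds.cephfs:0 scrub status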

[ceph-users] move rgw bucket to different pool

2020-11-09 Thread Frank Ritchie
Hi All, Are there any methods for relocating an entire RGW bucket to a different storage pool other than copying the contents? Thanks, Frank

[ceph-users] Re: Mon went down and won't come back

2020-11-09 Thread Paul Mezzanini
Problem Resolved: Reason - NoDangClue I had the broken monitor sitting there trying to join and failing, just watching the debug log scroll. I then stopped ceph-mon-01 and started it in debug to watch the messages and also see if debug on ceph-mon-02 was able to read it all. Not only did it

[ceph-users] Re: high latency after maintenance

2020-11-09 Thread Marcel Kuiper
Yes the OSDs are all bluestore. So does this mean that we can assign most of the memory to the OSD processes by setting the osd_memory_target? > If your OSDs are all BlueStore, page cache isn't nearly as important as with Filestore.
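A sketch of how that is typically done (the 12 GiB value is purely illustrative and has to fit the node's RAM and OSD count):

  # give each OSD a larger memory target, in bytes (default is 4 GiB)
  ceph config set osd osd_memory_target 12884901888
  # verify what a particular OSD is actually using as its target
  ceph config show osd.0 osd_memory_target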

[ceph-users] Re: Dovecot and fnctl locks

2020-11-09 Thread Dan van der Ster
Hi, Yeah the negative pid is interesting. AFAICT we use a negative pid to indicate that the lock was taken on another host: https://github.com/torvalds/linux/blob/master/fs/ceph/locks.c#L119 https://github.com/torvalds/linux/commit/9d5b86ac13c573795525ecac6ed2db39ab23e2a8 "Finally, we convert

[ceph-users] Re: OverlayFS with Cephfs to mount a snapshot read/write

2020-11-09 Thread Nathan Fish
Updating kernel versions has improved many Ceph-related things for me. I don't use CentOS, but at a glance I see that you can get newer kernels via "Elrepo". I would seriously consider doing so. Even sticking with an LTS kernel, you can still get much newer kernels (up to 5.4.75).
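For reference, a sketch of the ELRepo route on CentOS 7 mentioned above (package names as of that time; verify against elrepo.org before use):

  rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
  yum install https://www.elrepo.org/elrepo-release-7.el7.elrepo.noarch.rpm
  # kernel-lt is the long-term branch (5.4.x at the time), kernel-ml the mainline branch
  yum --enablerepo=elrepo-kernel install kernel-lt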

[ceph-users] Re: The feasibility of mixed SSD and HDD replicated pool

2020-11-09 Thread Dave Hall
This thread caught my attention. I have a smaller cluster with a lot of OSDs sharing the same SSD on each OSD node. I mentioned in an earlier post that I found a statement in https://docs.ceph.com/en/latest/rados/configuration/bluestore-config-ref/ indicating that if the SSD/NVMe in a node is

[ceph-users] Re: OverlayFS with Cephfs to mount a snapshot read/write

2020-11-09 Thread Frédéric Nass
I feel lucky to have you on this one. ;-) Do you mean applying a specific patch on the 3.10 kernel? Or is this one too old to have it working anyway? Frédéric. On 09/11/2020 at 19:07, Luis Henriques wrote: Frédéric Nass writes: Hi Luis, Thanks for your help. Sorry I forgot about the

[ceph-users] Re: OverlayFS with Cephfs to mount a snapshot read/write

2020-11-09 Thread Frédéric Nass
Luis, I gave RHEL 8 and kernel 4.18 a try and it's working perfectly! \o/ Same commands, same mount options. Does anyone know why, and whether there's any chance I can have this working with CentOS/RHEL 7 and the 3.10 kernel? Best regards, Frédéric. On 09/11/2020 at 15:04, Frédéric Nass wrote:

[ceph-users] Re: OverlayFS with Cephfs to mount a snapshot read/write

2020-11-09 Thread Luis Henriques
Frédéric Nass writes: > Hi Luis, > > Thanks for your help. Sorry I forgot about the kernel details. This is latest > RHEL 7.9. > > ~/ uname -r > 3.10.0-1160.2.2.el7.x86_64 > > ~/ grep CONFIG_TMPFS_XATTR /boot/config-3.10.0-1160.2.2.el7.x86_64 > CONFIG_TMPFS_XATTR=y > > upper directory /upperdir

[ceph-users] Re: pg xyz is stuck undersized for long time

2020-11-09 Thread Frank Schilder
My PGs are healthy now, but the underlying problem itself is not fixed. I was interested in whether someone knew a fast fix to get the PGs complete right away. The down OSDs were shut down a long time ago and are sitting in a different crush root. It was 1 OSD in an HDD pool that I'm re-organising

[ceph-users] Re: [External Email] Re: OverlayFS with Cephfs to mount a snapshot read/write

2020-11-09 Thread Dave Hall
All, I'm not sure if this is relevant here, but I recently tried to use OverlayFS with an NFS share. It wouldn't work because NFS does not present to the kernel as a block device. OverlayFS requires a block device abstraction. If CephFS doesn't present as a block device you won't get it to

[ceph-users] Re: Mon went down and won't come back

2020-11-09 Thread Eugen Block
I thought it might be related to reported issues where the MONs were specified with IP:PORT, but that can be ruled out. Does the current monmap match your actual setup? You wrote the keys are correct, but maybe there's still a keyring left in 'ceph auth ls'? Quoting Paul Mezzanini:
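A sketch of how the monmap and mon keys can be compared against the actual setup (the monitor id is illustrative):

  # dump the monmap the quorum is currently using
  ceph mon getmap -o /tmp/monmap
  monmaptool --print /tmp/monmap
  # compare against what the broken mon has on disk (with that daemon stopped)
  ceph-mon -i ceph-mon-01 --extract-monmap /tmp/monmap.local
  monmaptool --print /tmp/monmap.local
  # check for stale or duplicate mon. keyring entries
  ceph auth ls | grep -A4 '^mon\.'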

[ceph-users] Re: Cephfs Kernel client not working properly without ceph cluster IP

2020-11-09 Thread Nathan Fish
It sounds like your client is able to reach the mon but not the OSDs? It needs to be able to reach all mons and all OSDs. On Sun, Nov 8, 2020 at 4:29 AM Amudhan P wrote: > Hi, > I have mounted my cephfs (ceph octopus) through the kernel client in Debian. > I get the following error in "dmesg" when I try
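A sketch of what to check from the client (addresses, user name and paths are illustrative); every mon address and every OSD public address must be reachable from the client:

  # the OSD public addresses the client must be able to reach
  ceph osd dump | grep '^osd\.'
  # the kernel mount itself only needs the public-network addresses of the mons
  mount -t ceph 192.168.1.11:6789,192.168.1.12:6789,192.168.1.13:6789:/ /mnt/cephfs \
      -o name=myuser,secretfile=/etc/ceph/myuser.secret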

[ceph-users] Re: Cephfs Kernel client not working properly without ceph cluster IP

2020-11-09 Thread Eugen Block
Clients don't need the cluster IP because that's only for OSD <--> OSD replication, no client traffic. But of course to be able to communicate with Ceph the clients need a public IP, how else would they contact the MON? Or did I misunderstand your setup? Quoting Amudhan P: Hi, I

[ceph-users] ceph command on cephadm install stuck

2020-11-09 Thread Oliver Weinmann
Hi, on my freshly deployed cephadm bootstrap node I can no longer run the ceph command. It just hangs: [root@gedasvl02 ~]# ceph orch dev ls INFO:cephadm:Inferring fsid c7879f24-1f90-11eb-8ba2-005056b703af INFO:cephadm:Using recent ceph image docker.io/ceph/ceph:v15 [root@gedasvl02 ~]#
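Not from the original thread, but a few generic things one might try on a cephadm bootstrap node in this situation (the fsid is the one shown in the output above):

  # run the CLI inside the cephadm container instead of the host wrapper
  cephadm shell -- ceph -s
  # list the daemons cephadm thinks it has deployed on this host
  cephadm ls
  # check whether the bootstrap mon container is actually running
  systemctl status 'ceph-c7879f24-1f90-11eb-8ba2-005056b703af@mon.*'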

[ceph-users] Re: [Suspicious newsletter] Re: Multisite sync not working - permission denied

2020-11-09 Thread Michael Breen
Thank you, Istvan, and Amit, who also replied. I had tried three Ceph versions, including (after your suggestion) the latest, but in my case it wasn't that. What the problem was, I don't know, but I changed from using VMs to test this functionality last week to using clusters of real hardware

[ceph-users] Multisite mechanism deeper understanding

2020-11-09 Thread Szabo, Istvan (Agoda)
Hi, A couple of questions came up which are not really documented anywhere; hopefully someone knows the answers: 1. Is there a way to see the replication queue? I want to create metrics, e.g. whether there is any delay in the replication, etc. 2. Is the replication FIFO? 3. Actually, how a replication
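On the observability side, a sketch of the existing commands that expose the sync/replication state (the source zone name is illustrative):

  # overall metadata + data sync position relative to the master zone
  radosgw-admin sync status
  # per-shard detail of the data sync progress from a given source zone
  radosgw-admin data sync status --source-zone=primary
  # entries that failed to sync and were queued for retry
  radosgw-admin sync error list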

[ceph-users] Re: OverlayFS with Cephfs to mount a snapshot read/write

2020-11-09 Thread Frédéric Nass
Hi Luis, Thanks for your help. Sorry I forgot about the kernel details. This is latest RHEL 7.9. ~/ uname -r 3.10.0-1160.2.2.el7.x86_64 ~/ grep CONFIG_TMPFS_XATTR /boot/config-3.10.0-1160.2.2.el7.x86_64 CONFIG_TMPFS_XATTR=y upper directory /upperdir is using xattrs ~/ ls -l

[ceph-users] Re: NoSuchKey on key that is visible in s3 list/radosgw bk

2020-11-09 Thread Janek Bevendorff
We are having the exact same problem (also Octopus). The object is listed by s3cmd, but trying to download it results in a 404 error. radosgw-admin object stat shows that the object still exists. Any further ideas how I can restore access to this object? (Sorry if this is a duplicate, but it

[ceph-users] Dovecot and fnctl locks

2020-11-09 Thread Dan van der Ster
Hi all, MDS version v14.2.11 Client kernel 3.10.0-1127.19.1.el7.x86_64 We are seeing a strange issue with a dovecot use-case on cephfs. Occasionally we have dovecot reporting a file locked, such as: Nov 09 13:55:00 dovecot-backend-00.cern.ch dovecot[27710]: imap(reguero)<23945>: Error: Mailbox

[ceph-users] Re: high latency after maintenance

2020-11-09 Thread Marcel Kuiper
Hi Anthony > Did you add a bunch of data since then, or change the Ceph release? Do you have bluefs_buffered_io set to false? > We did not change the Ceph release in the meantime. It is very well possible that the delays were just not noticed during our previous maintenances.
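For completeness, a sketch of how the setting Anthony mentions can be checked (osd.0 is illustrative):

  # value recorded in the central config database
  ceph config get osd bluefs_buffered_io
  # value a running OSD is actually using
  ceph config show osd.0 bluefs_buffered_io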

[ceph-users] Re: OverlayFS with Cephfs to mount a snapshot read/write

2020-11-09 Thread Luis Henriques
Frédéric Nass writes: > Hello, > > I would like to use a cephfs snapshot as a read/write volume without having to > clone it first as the cloning operation is - if I'm not mistaken - still > inefficient as of now. This is for a data restore use case with Moodle > application needing a writable

[ceph-users] OverlayFS with Cephfs to mount a snapshot read/write

2020-11-09 Thread Frédéric Nass
Hello, I would like to use a cephfs snapshot as a read/write volume without having to clone it first as the cloning operation is - if I'm not mistaken - still inefficient as of now. This is for a data restore use case with Moodle application needing a writable data directory to start. The
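For readers of the thread, a sketch of the overlay mount being attempted (paths and snapshot name are illustrative; per the follow-up messages it works with a 4.18+ kernel, and upperdir/workdir must live on the same local filesystem, with workdir empty):

  # CephFS snapshot used read-only as the lower layer
  lower=/mnt/cephfs/moodledata/.snap/before-restore
  # writable upper layer and overlay work directory on local storage
  mount -t overlay overlay \
      -o lowerdir=$lower,upperdir=/data/upper,workdir=/data/work /mnt/merged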