[ceph-users] Re: How to run ceph_osd_dump

2020-11-11 Thread Denis Krienbühl
Hi Eugen, that works. Apart from the release notes, there’s also documentation that has this wrong: https://docs.ceph.com/en/latest/rados/operations/monitoring/#network-performance-checks Thank you!

[ceph-users] Re: How to run ceph_osd_dump

2020-11-11 Thread Eugen Block
Hi, although the Nautilus v14.2.5 release notes [1] state that this command is available for both mgr and osd, it doesn't seem to apply to mgr. But you should be able to run it for an osd daemon. Regards, Eugen [1] https://docs.ceph.com/en/latest/releases/nautilus/ Quoting Denis
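As a sketch of the osd-side invocation (assuming OSD id 2, taken from the warning in the original mail, and admin-socket access on that OSD's host; the millisecond threshold argument is optional):

$ ceph daemon osd.2 dump_osd_network        # entries above the default 1000 ms threshold
$ ceph daemon osd.2 dump_osd_network 0      # all recorded heartbeat ping times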

[ceph-users] Re: (Ceph Octopus) Repairing a neglected Ceph cluster - Degraded Data Redundancy, all PGs degraded, undersized, not scrubbed in time

2020-11-11 Thread Anthony D'Atri
> On 11.11.20 at 11:20, Hans van den Bogert wrote: >> Hoping to learn from this myself, why will the current setup never work? That was a bit harsh to have said. Without seeing your EC profile and the topology, it’s hard to say for sure, but I suspect that adding another node with at least

[ceph-users] question about rgw delete speed

2020-11-11 Thread Adrian Nicolae
Hey guys, I'm in charge of a local cloud-storage service. Our primary object storage is a vendor-based one and I want to replace it in the near future with Ceph, with the following setup: - 6 OSD servers with 36 SATA 16TB drives each and 3 big NVMe per server (1 big NVMe for every 12

[ceph-users] Re: How to use ceph-volume to create multiple OSDs per NVMe disk, and with fixed WAL/DB partition on another device?

2020-11-11 Thread Anthony D'Atri
Quoting in your message looks kind of messy so forgive me if I’m propagating that below. Honestly I agree that the Optanes will give diminishing returns at best for all but the most extreme workloads (which will probably want to use NVMoF natively anyway). >>> >>> This does split up the

[ceph-users] Unable to clarify error using vfs_ceph (Samba gateway for CephFS)

2020-11-11 Thread Matt Larson
I am getting an error in log.smbd from the Samba gateway that I don’t understand, and am looking for help from anyone who has gotten vfs_ceph working. Background: I am trying to get a Samba gateway with CephFS working with the vfs_ceph module. I observed that the default Samba package on
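For reference, a minimal smb.conf share sketch along the lines of the vfs_ceph manpage (the share name, path and the "samba" ceph user are placeholders; the corresponding cephx keyring must be readable by Samba):

[cephfs]
    path = /
    vfs objects = ceph
    ceph:config_file = /etc/ceph/ceph.conf
    ceph:user_id = samba
    kernel share modes = no
    read only = no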

[ceph-users] Re: OverlayFS with Cephfs to mount a snapshot read/write

2020-11-11 Thread Frédéric Nass
Hi Jeff, I understand the idea behind patch [1] but it breaks the operation of overlayfs with cephfs. Should the patch be abandoned and tests be modified or should overlayfs code be adapted to work with cephfs, if that's possible? Either way, it'd be nice if overlayfs could work again with
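For context, this is the kind of mount being discussed; a sketch assuming a kernel-mounted CephFS at /mnt/cephfs and a snapshot named snap1 (all paths hypothetical):

# read-only CephFS snapshot as the lower layer, writable local upper/work dirs
$ mkdir -p /tmp/upper /tmp/work /mnt/merged
$ mount -t overlay overlay \
    -o lowerdir=/mnt/cephfs/.snap/snap1,upperdir=/tmp/upper,workdir=/tmp/work \
    /mnt/merged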

[ceph-users] Re: newbie question: direct objects of different sizes to different pools?

2020-11-11 Thread Bill Anderson
Thanks so much! On Wed, Nov 11, 2020 at 11:24 AM Void Star Nill wrote: > Sorry, I didn't realize the question was for S3 RGW interface. > > I haven't used RGW, but from what I can see from the documentation, you > can create multiple zones and each zone can be configured with different >

[ceph-users] Re: newbie question: direct objects of different sizes to different pools?

2020-11-11 Thread Void Star Nill
Sorry, I didn't realize the question was for the S3 RGW interface. I haven't used RGW, but from what I can see in the documentation, you can create multiple zones and each zone can be configured with different pools. Check out the documentation at https://docs.ceph.com/en/latest/radosgw/placement/
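As a rough sketch of the placement-target approach from that page (the placement id and pool names are hypothetical, and the default zonegroup/zone names are assumed):

$ radosgw-admin zonegroup placement add --rgw-zonegroup default --placement-id small-objects
$ radosgw-admin zone placement add --rgw-zone default --placement-id small-objects \
      --data-pool small.rgw.buckets.data \
      --index-pool small.rgw.buckets.index \
      --data-extra-pool small.rgw.buckets.non-ec
$ radosgw-admin period update --commit   # only needed if a realm/period is configured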

[ceph-users] Re: newbie question: direct objects of different sizes to different pools?

2020-11-11 Thread Bill Anderson
Thank you for that info. Is it possible for an S3 RGW client to choose a pool, though? On Wed, Nov 11, 2020 at 10:40 AM Void Star Nill wrote: > You can do this by creating 2 different pools with different replication > settings. But your users/clients need to choose the right pool
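On the client-side question: an S3 client can select a placement target when creating a bucket via the LocationConstraint field ("<zonegroup>:<placement-id>"). A sketch with the AWS CLI (endpoint, bucket and placement id are examples):

$ aws --endpoint-url http://rgw.example.com:8080 s3api create-bucket \
      --bucket mybucket \
      --create-bucket-configuration LocationConstraint=default:small-objects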

[ceph-users] Re: newbie question: direct objects of different sizes to different pools?

2020-11-11 Thread Void Star Nill
You can do this by creating 2 different pools with different replication settings. But your users/clients need to choose the right pool while writing the files. -Shridhar On Tue, 10 Nov 2020 at 12:58, wrote: > Hi All, > > I'm exploring deploying Ceph at my organization for use as an object >
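For example, a sketch of two replicated pools with different redundancy (names and PG counts are arbitrary):

$ ceph osd pool create objects-hot 64 64 replicated
$ ceph osd pool set objects-hot size 3
$ ceph osd pool create objects-cold 128 128 replicated
$ ceph osd pool set objects-cold size 2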

[ceph-users] Re: How to use ceph-volume to create multiple OSDs per NVMe disk, and with fixed WAL/DB partition on another device?

2020-11-11 Thread Void Star Nill
I have a similar setup and have been running some large concurrent benchmarks and I am seeing that running multiple OSDs per NVME doesn't really make a lot of difference. In fact, it actually increases the write amplification if you have write-heavy workloads, so performance degrades over time.

[ceph-users] How to run ceph_osd_dump

2020-11-11 Thread Denis Krienbühl
Hi, we’ve recently encountered the following errors:

[WRN] OSD_SLOW_PING_TIME_BACK: Slow OSD heartbeats on back (longest 2752.832ms)
    Slow OSD heartbeats on back from osd.2 [nvme-a] to osd.290 [nvme-c] 2752.832 msec
    ...
    Truncated long network list. Use ceph

[ceph-users] Re: Nautilus - osdmap not trimming

2020-11-11 Thread Dan van der Ster
Hi, v14.2.13 has an important fix in this area: https://tracker.ceph.com/issues/47290 Without this fix, your cluster will not trim if there are any *down* osds in the cluster. On our clusters we are running v14.2.11 patched with commit "mon/OSDMonitor: only take in osd into consideration when
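A quick way to see whether the osdmaps are trimming is to compare the first and last committed epochs the mons keep, e.g. (a sketch; requires jq, and the gap normally stays at a few hundred epochs):

$ ceph report 2>/dev/null | jq '.osdmap_first_committed, .osdmap_last_committed'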

[ceph-users] Re: _get_class not permitted to load rgw_gc

2020-11-11 Thread Dan van der Ster
Found it -- there was an octopus machine running radosgw-admin to collect user stats. Rolled that back to luminous and no more _get_class warnings. Sorry for the noise. - dan On Wed, Nov 11, 2020 at 3:31 PM Dan van der Ster wrote: > > Well I won't do that, because it seems the new rgw_gc class

[ceph-users] Re: _get_class not permitted to load rgw_gc

2020-11-11 Thread Dan van der Ster
Well I won't do that, because it seems the new rgw_gc class hasn't even been backported to nautilus https://tracker.ceph.com/issues/42409. I'm still looking to find what is trying to call it. -- dan On Wed, Nov 11, 2020 at 10:59 AM Dan van der Ster wrote: > > Would this be a bad idea? > >

[ceph-users] Re: (Ceph Octopus) Repairing a neglected Ceph cluster - Degraded Data Redundancy, all PGs degraded, undersized, not scrubbed in time

2020-11-11 Thread Hans van den Bogert
And also the erasure coded profile, so an example on my cluster would be:

$ ceph osd pool get objects.rgw.buckets.data erasure_code_profile
erasure_code_profile: objects_ecprofile
$ ceph osd erasure-code-profile get objects_ecprofile
crush-device-class=
crush-failure-domain=host

[ceph-users] Re: (Ceph Octopus) Repairing a neglected Ceph cluster - Degraded Data Redundancy, all PGs degraded, undersized, not scrubbed in time

2020-11-11 Thread Hans van den Bogert
Can you show a `ceph osd tree` ? On 11/7/20 1:14 AM, seffyr...@gmail.com wrote: I've inherited a Ceph Octopus cluster that seems like it needs urgent maintenance before data loss begins to happen. I'm the guy with the most Ceph experience on hand and that's not saying much. I'm experiencing

[ceph-users] Re: Is there a way to make Cephfs kernel client to write data to ceph osd smoothly with buffer io

2020-11-11 Thread Frank Schilder
These kernel parameters influence the flushing of data, and also performance:

vm.dirty_bytes
vm.dirty_background_bytes

A smaller vm.dirty_background_bytes will make the transfer smoother, and the Ceph cluster will like that. However, it reduces the chances of merge operations in cache and the
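As an illustration only (the values are arbitrary examples, not recommendations), the sysctls in question are set like this; note that setting the *_bytes variants overrides the corresponding *_ratio settings:

$ sysctl -w vm.dirty_background_bytes=67108864    # start background writeback at 64 MiB of dirty data
$ sysctl -w vm.dirty_bytes=536870912              # block writers at 512 MiB of dirty data
# persist via a drop-in in /etc/sysctl.d/ once the values have proven themselves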

[ceph-users] Re: (Ceph Octopus) Repairing a neglected Ceph cluster - Degraded Data Redundancy, all PGs degraded, undersized, not scrubbed in time

2020-11-11 Thread Robert Sander
Hi, On 11.11.20 at 11:20, Hans van den Bogert wrote: > Hoping to learn from this myself, why will the current setup never work? There are only 4 OSDs in the cluster, with a mix of HDD and SSD. And they try to use erasure coding on that small setup. Erasure coding starts to work with at least 7
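The constraint behind that: with crush-failure-domain=host, an EC pool needs at least k+m hosts just to place all shards, and more than that to re-heal after a host failure. Purely as a hypothetical sizing sketch:

# e.g. a 2+2 profile needs at least 4 hosts; something like 4+2 plus a spare host would need 7
$ ceph osd erasure-code-profile set ec-2-2 k=2 m=2 crush-failure-domain=host
$ ceph osd pool create ecpool 32 32 erasure ec-2-2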

[ceph-users] Re: (Ceph Octopus) Repairing a neglected Ceph cluster - Degraded Data Redundancy, all PGs degraded, undersized, not scrubbed in time

2020-11-11 Thread Hans van den Bogert
Hoping to learn from this myself, why will the current setup never work? On 11/11/20 10:29 AM, Robert Sander wrote: On 07.11.20 at 01:14, seffyr...@gmail.com wrote: I've inherited a Ceph Octopus cluster that seems like it needs urgent maintenance before data loss begins to happen. I'm the

[ceph-users] Re: _get_class not permitted to load rgw_gc

2020-11-11 Thread Dan van der Ster
Would this be a bad idea?

Option("osd_class_load_list", Option::TYPE_STR, Option::LEVEL_ADVANCED)
-  .set_default("cephfs hello journal lock log numops " "otp rbd refcount rgw timeindex user version cas")
+  .set_default("cephfs hello journal lock log numops " "otp rbd refcount rgw rgw_gc timeindex
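For comparison, the same list can also be overridden at runtime instead of patching the default; a sketch (whether the rgw_gc class is actually shipped on Nautilus OSDs is a separate question, as discussed earlier in the thread):

$ ceph config set osd osd_class_load_list "cephfs hello journal lock log numops otp rbd refcount rgw rgw_gc timeindex user version cas"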

[ceph-users] Re: disable / remove multisite sync RGW (Ceph Nautilus)

2020-11-11 Thread Eugen Block
Hi, I haven't done that myself yet but I would assume that stopping the second gateway and removing the secondary zone [1] should do no harm to the master zone, assuming that it was always synced from master to secondary, not bidirectional. Regards, Eugen [1]
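Roughly, the removal steps from the multisite documentation look like this (zonegroup/zone names are placeholders; stop the secondary radosgw first and run the commands against the master side):

$ radosgw-admin zonegroup remove --rgw-zonegroup=us --rgw-zone=us-secondary
$ radosgw-admin period update --commit
$ radosgw-admin zone delete --rgw-zone=us-secondary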

[ceph-users] disable / remove multisite sync RGW (Ceph Nautilus)

2020-11-11 Thread Markus Gans
Hello everybody, we are running a multisite (active/active) gateway on 2 Ceph clusters: one production and one backup cluster. We now make a backup with rclone from the master and no longer need the second gateway. What is the best way to shut down the second gateway and remove the multisite

[ceph-users] Re: (Ceph Octopus) Repairing a neglected Ceph cluster - Degraded Data Redundancy, all PGs degraded, undersized, not scrubbed in time

2020-11-11 Thread Robert Sander
On 07.11.20 at 01:14, seffyr...@gmail.com wrote: > I've inherited a Ceph Octopus cluster that seems like it needs urgent > maintenance before data loss begins to happen. I'm the guy with the most Ceph > experience on hand and that's not saying much. I'm experiencing most of the > ops and

[ceph-users] Re: NoSuchKey on key that is visible in s3 list/radosgw bk

2020-11-11 Thread Janek Bevendorff
Yeah, that seems to be it. There are 239 objects prefixed .8naRUHSG2zfgjqmwLnTPvvY1m6DZsgh in my dump. However, none of the multiparts from the other file are to be found, and the head object is 0 bytes. I checked another multipart object with an end pointer of 11. Surprisingly, it had
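For anyone following along, the kind of check being described can be done by listing the RADOS objects that share a marker prefix (the pool name is an example, and listing a large pool takes a while):

$ rados -p default.rgw.buckets.data ls | grep '8naRUHSG2zfgjqmwLnTPvvY1m6DZsgh'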

[ceph-users] _get_class not permitted to load rgw_gc

2020-11-11 Thread Dan van der Ster
Hi, We have this "not permitted to load rgw_gc" error on some of our osds. Does anyone know what this is and how to fix it? Nautilus 14.2.11 and CentOS 7 / 8:

2020-11-11 09:48:15.914 7f665c1ea700  0 _get_class not permitted to load rgw_gc
2020-11-11 09:48:15.914 7f665c1ea700 -1 osd.874 163331 class

[ceph-users] Re: Cephfs Kernel client not working properly without ceph cluster IP

2020-11-11 Thread Eugen Block
> Do you find any issue in the below commands I have used to set cluster IP in cluster.

Yes I do:

> ### adding public IP for ceph cluster ###
> ceph config set global cluster_network 10.100.4.0/24

I'm still not convinced that your setup is as you want it to be. Can you share your actual config?
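The inconsistency being pointed at: the comment says "public IP" while the command sets the cluster network. The two options are distinct; as a sketch (subnets are examples):

$ ceph config set global public_network 10.100.3.0/24     # client/MON-facing traffic
$ ceph config set global cluster_network 10.100.4.0/24    # OSD replication and heartbeat traffic
$ ceph config get osd cluster_network                      # verify what is actually set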

[ceph-users] Re: How to use ceph-volume to create multiple OSDs per NVMe disk, and with fixed WAL/DB partition on another device?

2020-11-11 Thread Jan Fajerski
On Fri, Nov 06, 2020 at 10:15:52AM -, victorh...@yahoo.com wrote: I'm building a new 4-node Proxmox/Ceph cluster, to hold disk images for our VMs. (Ceph version is 15.2.5). Each node has 6 x NVMe SSDs (4TB), and 1 x Optane drive (960GB). CPU is AMD Rome 7442, so there should be plenty of

[ceph-users] Re: How to use ceph-volume to create multiple OSDs per NVMe disk, and with fixed WAL/DB partition on another device?

2020-11-11 Thread Eugen Block
There's an option for the block.db size in the ceph-volume batch command:

--block-db-size BLOCK_DB_SIZE
    Set (or override) the "bluestore_block_db_size" value, in bytes

Quoting victorh...@yahoo.com: I'm building a new 4-node Proxmox/Ceph
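Putting it together, a hypothetical ceph-volume invocation for the setup described above (device paths, the per-NVMe OSD count and the DB size are examples, not recommendations):

$ ceph-volume lvm batch --osds-per-device 2 \
      --block-db-size 161061273600 \
      --db-devices /dev/nvme0n1 \
      /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1 /dev/nvme5n1 /dev/nvme6n1
# 161061273600 bytes = 150 GiB per DB; nvme0n1 stands in for the Optane device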