[ceph-users] Re: CEPH 16.2.x: disappointing I/O performance

2021-10-05 Thread Zakhar Kirpichenko
Got it. I don't have any specific throttling set up for RBD-backed storage. I also previously tested several different backends and found that virtio consistently produced better performance than virtio-scsi in different scenarios, thus my VMs run virtio. /Z On Wed, Oct 6, 2021 at 7:10 AM

[ceph-users] Re: CEPH 16.2.x: disappointing I/O performance

2021-10-05 Thread Anthony D'Atri
To be clear, I’m suspecting explicit throttling as described here: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/virtualization_tuning_and_optimization_guide/sect-virtualization_tuning_optimization_guide-blockio-techniques not impact from virtualization as such,
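For reference, one quick way to check whether such throttling is actually applied to a guest disk (domain and device names below are placeholders, not taken from this thread):

  # Query the per-device throttle values for a guest disk; all-zero
  # output means no explicit blkio throttling is configured.
  virsh blkdeviotune myguest vda

  # Or look for an <iotune> element in the domain XML.
  virsh dumpxml myguest | grep -A 6 '<iotune>'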

[ceph-users] Re: CEPH 16.2.x: disappointing I/O performance

2021-10-05 Thread Zakhar Kirpichenko
Hi! The clients are KVM VMs, there's QEMU/libvirt impact for sure. I will test with a baremetal client and see whether it performs much better. /Z On Wed, 6 Oct 2021, 01:29 Anthony D'Atri, wrote: > The lead PG handling ops isn’t a factor, with RBD your volumes touch > dozens / hundreds of
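A minimal bare-metal test could use the built-in RBD benchmark against a throwaway image (pool and image names below are placeholders, not from this thread):

  rbd create rbd/benchtest --size 10G
  # 4K random writes with 16 threads, then a read pass over the same image.
  rbd bench --io-type write --io-size 4K --io-threads 16 --io-total 1G --io-pattern rand rbd/benchtest
  rbd bench --io-type read  --io-size 4K --io-threads 16 --io-total 1G --io-pattern rand rbd/benchtest
  rbd rm rbd/benchtest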

[ceph-users] Re: ceph-iscsi issue after upgrading from nautilus to octopus

2021-10-05 Thread icy chan
Hi, This issue also happens on another platform that runs with Pacific (v16.2.6) on Ubuntu 20.04. # ceph health detail HEALTH_WARN 20 stray daemon(s) not managed by cephadm [WRN] CEPHADM_STRAY_DAEMON: 20 stray daemon(s) not managed by cephadm stray daemon
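A sketch of the usual way to compare which daemons cephadm considers stray versus managed (run on an affected host):

  ceph health detail | grep -i stray   # names the stray daemons
  ceph orch ps                         # daemons the orchestrator manages
  cephadm ls                           # daemons actually found on this host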

[ceph-users] Re: radosgw breaking because of too many open files

2021-10-05 Thread shubjero
Found the issue. Upgrading to Octopus did replace /etc/init.d/radosgw which contained some changes to the distribution detection and setting ulimits. New radosgw init script: -snip- echo "Starting $name..." if [ $DEBIAN -eq 1 ]; then start-stop-daemon
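Roughly, the relevant part of such an init script raises the open-files limit just before launching the daemon; the sketch below is illustrative only (variable names other than $name and $DEBIAN are assumptions, not the actual script):

  DAEMON_ULIMIT=131072
  echo "Starting $name..."
  if [ $DEBIAN -eq 1 ]; then
      ulimit -n $DAEMON_ULIMIT    # raise the fd limit for the child process
      start-stop-daemon --start --quiet --exec "$RADOSGW" -- -n "$name"
  fi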

[ceph-users] MDS replay questions

2021-10-05 Thread Brian Kim
Dear ceph-users, We have a ceph cluster with 3 MDS's and recently had to replay our cache which is taking an extremely long time to complete. Is there some way to speed up this process as well as apply some checkpoint so it doesn't have to start all the way from the beginning? -- Best Wishes,
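For watching where replay stands, these standard commands show each rank's state (up:replay, up:resolve, up:active); they won't speed it up but make progress visible:

  ceph fs status          # shows each MDS rank and its current state
  watch -n 10 ceph status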

[ceph-users] Re: CEPH 16.2.x: disappointing I/O performance

2021-10-05 Thread Zakhar Kirpichenko
Hi! I can post the crush map tomorrow morning, but it definitely isn't targeting the NVME drives. I'm having a performance issue specifically with the HDD-backed pool, where each OSD is an NVME-backed WAL/DB + HDD-backed storage. /Z On Tue, 5 Oct 2021, 22:43 Tor Martin Ølberg, wrote: > Hi
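A few standard commands to confirm which device classes and rules a pool actually maps to (pool name is a placeholder):

  ceph osd crush rule dump
  ceph osd pool get mypool crush_rule
  ceph osd df tree          # shows the device class (hdd/ssd/nvme) per OSD
  ceph osd crush class ls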

[ceph-users] Re: CEPH 16.2.x: disappointing I/O performance

2021-10-05 Thread Zakhar Kirpichenko
I'm not sure, fio might be showing some bogus values in the summary, I'll check the readings again tomorrow. Another thing I noticed is that writes seem bandwidth-limited and don't scale well with block size and/or number of threads. I.e. one client writes at about the same speed regardless of
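One way to make the scaling behaviour explicit is a small fio sweep over block sizes with the librbd engine (requires fio built with rbd support; pool/image/client names are placeholders):

  for bs in 4k 64k 1m; do
    fio --name=rbdwrite --ioengine=rbd --clientname=admin --pool=rbd --rbdname=benchtest \
        --rw=write --bs=$bs --iodepth=32 --direct=1 \
        --runtime=60 --time_based --group_reporting
  done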

[ceph-users] Re: Orchestrator is internally ignoring applying a spec against SSDs, apparently determining they're rotational.

2021-10-05 Thread Chris
Hi! So I nuked the cluster, zapped all the disks, and redeployed. Then I applied this osd spec (this time via the dashboard since I was full of hope): service_type: osd service_id: osd_spec_default placement: host_pattern: '*' data_devices: rotational: 1 db_devices: rotational: 0
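For readability, the same spec laid out as YAML (content exactly as pasted above):

  service_type: osd
  service_id: osd_spec_default
  placement:
    host_pattern: '*'
  data_devices:
    rotational: 1
  db_devices:
    rotational: 0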

[ceph-users] Re: radosgw breaking because of too many open files

2021-10-05 Thread Marc
> In Ceph Nautilus we used to set in ceph.conf the following which I > think helped us avoid the situation: > > [global] > max open files = 131072 > > This config option seems to be no longer recognized by ceph. > ceph config set ??? (I would not know, I am still Nautilus)
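On systemd-based Octopus installs the per-daemon fd ceiling usually comes from the unit file rather than ceph.conf, so a drop-in override is one option (a sketch only; the exact unit name on your system may differ):

  mkdir -p /etc/systemd/system/ceph-radosgw@.service.d
  printf '[Service]\nLimitNOFILE=131072\n' > /etc/systemd/system/ceph-radosgw@.service.d/limits.conf
  systemctl daemon-reload
  systemctl restart ceph-radosgw.target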

[ceph-users] radosgw breaking because of too many open files

2021-10-05 Thread shubjero
Just upgraded from Ceph Nautilus to Ceph Octopus on Ubuntu 18.04 using standard ubuntu packages from the Ceph repo. Upgrade has gone OK but we are having issues with our radosgw service, eventually failing after some load, here's what we see in the logs: 2021-10-05T15:55:16.328-0400 7fa47700
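A quick way to see how close the running radosgw is to its file-descriptor limit while it is under load:

  pid=$(pidof radosgw)
  ls /proc/$pid/fd | wc -l                 # fds currently open
  grep 'Max open files' /proc/$pid/limits  # current soft/hard limits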

[ceph-users] Re: is it possible to remove the db+wal from an external device (nvme)

2021-10-05 Thread Szabo, Istvan (Agoda)
This 'Unable to load table properties' message is also interesting right before it caught the signal: -16> 2021-10-05T20:31:28.484+0700 7f310cce5f00 2 rocksdb: [db/version_set.cc:1362] Unable to load table properties for file 247222 --- NotFound: -15> 2021-10-05T20:31:28.484+0700 7f310cce5f00 2 rocksdb:

[ceph-users] Re: 1 MDS report slow metadata IOs

2021-10-05 Thread Abdelillah Asraoui
The OSDs are continuously flapping up/down due to the slow MDS metadata IOs... what is causing the slow MDS metadata IOs? Currently, there are 2 MDS and 3 monitors deployed... would it help to run just one MDS and one monitor? Thanks! On Tue, Oct 5, 2021 at 1:42 PM Eugen Block wrote: > All your

[ceph-users] Re: CEPH 16.2.x: disappointing I/O performance

2021-10-05 Thread Tor Martin Ølberg
Hi Zakhar, Out of curiosity, what does your crush map look like? Probably a long shot, but are you sure your crush map is targeting the NVMes for the rados bench you are performing? Tor Martin Ølberg On Tue, Oct 5, 2021 at 9:31 PM Christian Wuerdig < christian.wuer...@gmail.com> wrote: > Maybe

[ceph-users] Re: CEPH 16.2.x: disappointing I/O performance

2021-10-05 Thread Christian Wuerdig
Maybe some info is missing, but 7k write IOPS at 4k block size seems fairly decent (as you also state) - the bandwidth automatically follows from that, so not sure what you're expecting? I am a bit puzzled though - by my math 7k IOPS at 4k should only be 27MiB/sec - not sure how the 120MiB/sec was
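The back-of-the-envelope math:

  7,000 IOPS x 4 KiB  = 28,000 KiB/s ≈ 27.3 MiB/s
  ~120 MiB/s at 4 KiB ≈ 30,700 IOPS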

[ceph-users] Re: is it possible to remove the db+wal from an external device (nvme)

2021-10-05 Thread Szabo, Istvan (Agoda)
Hmm, I’ve removed it from the cluster, now the data is rebalancing; I’ll try with the next one ☹ Istvan Szabo Senior Infrastructure Engineer --- Agoda Services Co., Ltd. e: istvan.sz...@agoda.com

[ceph-users] Re: is it possible to remove the db+wal from an external device (nvme)

2021-10-05 Thread Szabo, Istvan (Agoda)
This one is in messages: https://justpaste.it/3x08z Buffered_io is turned on by default in 15.2.14 octopus FYI. Istvan Szabo Senior Infrastructure Engineer --- Agoda Services Co., Ltd. e: istvan.sz...@agoda.com

[ceph-users] Re: Erasure coded pool chunk count k

2021-10-05 Thread Christian Wuerdig
A couple of notes to this: Ideally you should have at least 2 more failure domains than your base resilience (K+M for EC or size=N for replicated) - reasoning: Maintenance needs to be performed so chances are every now and then you take a host down for a few hours or possibly days to do some
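As an illustrative example: a k=4, m=2 EC pool with host as the failure domain can place all six shards with exactly six hosts, but then any single host taken down for maintenance leaves no spare host to backfill onto; with eight hosts (k+m+2) the cluster can still recover to full redundancy while one host is down for maintenance and another fails unexpectedly.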

[ceph-users] Re: 1 MDS report slow metadata IOs

2021-10-05 Thread Eugen Block
All your PGs are inactive; if two of four OSDs are down and you probably have a pool size of 3, then no IO can be served. You’d need at least three up OSDs to resolve that. Zitat von Abdelillah Asraoui : Ceph is reporting a warning about slow metadata IOs on one of the MDS servers, this is a new
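The pool settings that drive this behaviour can be checked directly (pool name is a placeholder):

  ceph osd pool get mypool size        # replica count, e.g. 3
  ceph osd pool get mypool min_size    # replicas required to serve IO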

[ceph-users] 1 MDS report slow metadata IOs

2021-10-05 Thread Abdelillah Asraoui
Ceph is reporting a warning about slow metadata IOs on one of the MDS servers; this is a new cluster with no upgrade... Has anyone encountered this, and is there a workaround? ceph -s cluster: id: 801691e6xx-x-xx-xx-xx health: HEALTH_WARN 1 MDSs report slow metadata IOs

[ceph-users] Re: *****SPAM***** Re: CEPH 16.2.x: disappointing I/O performance

2021-10-05 Thread Zakhar Kirpichenko
Aren't all writes to bluestore turned into sequential writes? /Z On Tue, 5 Oct 2021, 20:05 Marc, wrote: > > Hi Zakhar, > > > using 16 threads) are not. Literally every storage device in my setup > > can read and write at least 200+ MB/s sequentially, so I'm trying to > > find an explanation

[ceph-users] Re: is it possible to remove the db+wal from an external device (nvme)

2021-10-05 Thread Szabo, Istvan (Agoda)
Hmm, tried another one whose disk hasn’t been spilled over to, and it still core-dumped ☹ Is there any special thing that we need to do before we migrate the db next to the block device? Our OSDs are using dmcrypt, is that an issue? { "backtrace": [ "(()+0x12b20) [0x7f310aa49b20]", "(gsignal()+0x10f)
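For the record, this kind of migration is usually done with ceph-volume's lvm migrate subcommand on a stopped OSD - a sketch only, with placeholder OSD id, fsid and LV names; please verify the subcommand exists in your exact release before relying on it:

  systemctl stop ceph-osd@12
  ceph-volume lvm migrate --osd-id 12 --osd-fsid <osd-fsid> \
      --from db wal --target ceph-block-vg/osd-block-lv
  systemctl start ceph-osd@12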

[ceph-users] Re: *****SPAM***** Re: CEPH 16.2.x: disappointing I/O performance

2021-10-05 Thread Marc
Hi Zakhar, > using 16 threads) are not. Literally every storage device in my setup > can read and write at least 200+ MB/s sequentially, so I'm trying to > find an explanation for this behavior. All writes in ceph are random afaik ___ ceph-users

[ceph-users] Re: CEPH 16.2.x: disappointing I/O performance

2021-10-05 Thread Zakhar Kirpichenko
Hi Marc, Many thanks for your comment! As I mentioned, rados bench results are more or less acceptable and explainable. RBD clients writing at ~120 MB/s tops (regardless of the number of threads or block size btw) and reading ~50 MB/s in a single thread (I managed to read over 500 MB/s using 16

[ceph-users] Re: CEPH 16.2.x: disappointing I/O performance

2021-10-05 Thread Marc
You are aware of this: https://yourcmc.ru/wiki/Ceph_performance I am having these results with ssd and 2.2GHz xeon and no cpu state/freq/cpugovernor optimization, so your results with hdd look quite ok to me. [@c01 ~]# rados -p rbd.ssd bench 30 write Maintaining 16 concurrent writes of

[ceph-users] CEPH 16.2.x: disappointing I/O performance

2021-10-05 Thread Zakhar Kirpichenko
Hi, I built a CEPH 16.2.x cluster with relatively fast and modern hardware, and its performance is kind of disappointing. I would very much appreciate any advice and/or pointers :-) The hardware is 3 x Supermicro SSG-6029P nodes, each equipped with: 2 x Intel(R) Xeon(R) Gold 5220R CPUs 384 GB

[ceph-users] Re: is it possible to remove the db+wal from an external device (nvme)

2021-10-05 Thread Igor Fedotov
Not sure dmcrypt is a culprit here. Could you please set debug-bluefs to 20 and collect an OSD startup log? On 10/5/2021 4:43 PM, Szabo, Istvan (Agoda) wrote: Hmm, tried another one whose disk hasn’t been spilled over to, and it still core-dumped ☹ Is there any special thing that we need to do before
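Something along these lines should produce the requested log (OSD id and log path are placeholders; container deployments log elsewhere):

  ceph config set osd.12 debug_bluefs 20/20
  systemctl restart ceph-osd@12
  # collect /var/log/ceph/ceph-osd.12.log covering the startup, then reset:
  ceph config rm osd.12 debug_bluefs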

[ceph-users] Re: is it possible to remove the db+wal from an external device (nvme)

2021-10-05 Thread Eugen Block
Do you see OOM killers in dmesg on this host? This line indicates it: "(tcmalloc::allocate_full_cpp_throw_oom(unsigned long)+0x146) [0x7f310b7d8c96]", Zitat von "Szabo, Istvan (Agoda)" : Hmm, tried another one whose disk hasn’t been spilled over to, and it still core-dumped ☹ Is there any
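A quick check for OOM-killer activity on the host:

  dmesg -T | grep -iE 'out of memory|oom-kill|killed process'
  journalctl -k | grep -i oom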

[ceph-users] Re: Can't join new mon - lossy channel, failing

2021-10-05 Thread Konstantin Shalygin
As a last resort we changed the IP address of this host, and the mon successfully joined the quorum. When we revert the IP address back, the mon can't join; we think there is something on the switch side or on the old mons' side. From the old mons I checked connectivity to the new mon process via telnet - all works. It's good to make a

[ceph-users] Re: Daemon Version Mismatch (But Not Really?) After Deleting/Recreating OSDs

2021-10-05 Thread Edward R Huyer
Gotcha. Thanks for the input regardless. I suppose I'll continue what I'm doing, and plan on doing an upgrade via quay.io in the near future. -Original Message- From: Gregory Farnum Sent: Monday, October 4, 2021 7:14 PM To: Edward R Huyer Cc: ceph-users@ceph.io Subject: Re:
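For reference, a cephadm upgrade against a quay.io image typically looks like the sketch below (the tag is only an example):

  ceph orch upgrade start --image quay.io/ceph/ceph:v16.2.6
  ceph orch upgrade status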

[ceph-users] Broken mon state after (attempted) 16.2.5 -> 16.2.6 upgrade

2021-10-05 Thread Jonathan D. Proulx
In the middle of a normal cephadm upgrade from 16.2.5 to 16.2.6, after the mgrs had successfully upgraded, 2/5 mons didn’t come back up (and the upgrade stopped at that point). Attempting to manually restart the crashed mons resulted in **all** of the other mons crashing too, usually with:

[ceph-users] Re: MDS not becoming active after migrating to cephadm

2021-10-05 Thread Petr Belyaev
Just tried it, stopped all mds nodes and created one using orch. Result: 0/1 daemons up (1 failed), 1 standby. Same as before, and logs don’t show any errors as well. I’ll probably try upgrading the orch-based setup to 16.2.6 over the weekend to match the exact non-dockerized MDS version,