[ceph-users] Re: Unexpected slow read for HDD cluster (good write speed)

2023-03-27 Thread Arvid Picciani
Yes, during my last adventure of trying to get any reasonable performance out of Ceph, I realized my testing methodology was wrong. Both the kernel client and QEMU have queues everywhere that make the numbers hard to interpret. fio has rbd support, which gives more useful values. https://subscri
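
For reference, a minimal fio invocation using the rbd engine might look like the sketch below; pool, image and client names are placeholders and need to point at an existing test image.

    # hypothetical example - adjust pool/image/client names to your setup
    fio --name=rbd-randread-4k --ioengine=rbd --clientname=admin \
        --pool=testpool --rbdname=testimage \
        --rw=randread --bs=4k --iodepth=32 \
        --time_based --runtime=60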

[ceph-users] Re: ln: failed to create hard link 'file name': Read-only file system

2023-03-27 Thread Xiubo Li
On 22/03/2023 23:41, Gregory Farnum wrote: On Wed, Mar 22, 2023 at 8:27 AM Frank Schilder wrote: Hi Gregory, thanks for your reply. First a quick update. Here is how I get ln to work after it failed; there seems to be no timeout: $ ln envs/satwindspy/include/ffi.h mambaforge/pkgs/libffi-3.3-h5852

[ceph-users] Re: Almalinux 9

2023-03-27 Thread Arvid Picciani
On Rocky, which should be identical to Alma (?), I had to do this: https://almalinux.discourse.group/t/nothing-provides-python3-pecan-in-almalinux-9/2017/4 because the RPM has a broken dependency on pecan. But switching from Debian to the official Ceph RPM packages was worth it. The systemd unit

[ceph-users] Re: deploying Ceph using FQDN for MON / MDS Services

2023-03-27 Thread Lokendra Rathour
Hi All, Any help with this issue would be appreciated. Thanks once again. On Tue, Jan 24, 2023 at 7:32 PM Lokendra Rathour wrote: > Hi Team, > > > > We have a ceph cluster with 3 storage nodes: > > 1. storagenode1 - abcd:abcd:abcd::21 > > 2. storagenode2 - abcd:abcd:abcd::22 > > 3. storagenode3 -
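
In case it helps others hitting the same question: monitors can be addressed by name in ceph.conf, either as a plain mon_host list of FQDNs or via DNS SRV records. A hedged sketch (hostnames are hypothetical):

    # ceph.conf fragment - resolve MONs by DNS name instead of raw IPv6 addresses
    [global]
    mon_host = storagenode1.example.com, storagenode2.example.com, storagenode3.example.com
    # alternatively, publish _ceph-mon._tcp SRV records and rely on:
    # mon_dns_srv_name = ceph-mon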

[ceph-users] Adding new server to existing ceph cluster - with separate block.db on NVME

2023-03-27 Thread Robert W. Eckert
Hi, I am trying to add a new server to an existing cluster, but cannot get the OSDs to create correctly. When I try cephadm ceph-volume lvm create, it returns nothing but the container info. [root@hiho ~]# cephadm ceph-volume lvm create --bluestore --data /dev/sdd --block.db /dev/nvme0n1p3 Infer
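
One thing worth trying when cephadm manages the OSDs: instead of calling ceph-volume by hand, describe the layout in an OSD service spec and let the orchestrator create it. A rough sketch under the assumption that the whole NVMe device (rather than a partition) is handed to db_devices, since the batch logic carves DB LVs out of it; the host name is taken from the prompt above and the paths are placeholders:

    # osd-spec.yaml (hypothetical)
    service_type: osd
    service_id: osd-nvme-db
    placement:
      hosts:
        - hiho
    spec:
      data_devices:
        paths:
          - /dev/sdd
      db_devices:
        paths:
          - /dev/nvme0n1

    # apply it with:
    ceph orch apply -i osd-spec.yaml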

[ceph-users] Re: ln: failed to create hard link 'file name': Read-only file system

2023-03-27 Thread Xiubo Li
Hi Frank, Thanks very much for your logs. I will check it. - Xiubo On 28/03/2023 06:35, Frank Schilder wrote: Dear Xiubo, I managed to collect logs and uploaded them to: ceph-post-file: 3d4d1419-a11e-4937-b0b1-bd99234d4e57 By the way, if you run the test with the conda.tgz at the link loca

[ceph-users] Re: Question about adding SSDs

2023-03-27 Thread Kyriazis, George
Thanks, I’ll try adjusting mds_cache_memory_limit. I did get some messages about the MDS being slow trimming the cache, which implies that it was over its cache size. I never had any problems with the kernel mount, fortunately. I am running 17.2.5 (Quincy). My metadata pool size is about 15GB wit
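
For anyone searching the archive later, the cache limit can be raised at runtime; the 8 GiB value below is only an example:

    # set and verify the MDS cache memory limit (example value: 8 GiB)
    ceph config set mds mds_cache_memory_limit 8589934592
    ceph config get mds mds_cache_memory_limit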

[ceph-users] Re: ln: failed to create hard link 'file name': Read-only file system

2023-03-27 Thread Frank Schilder
Dear Xiubo, I managed to collect logs and uploaded them to: ceph-post-file: 3d4d1419-a11e-4937-b0b1-bd99234d4e57 By the way, if you run the test with the conda.tgz at the link location, be careful: it contains a .bashrc file to activate the conda environment. Un-tar it only in a dedicated loca

[ceph-users] Re: rbd cp vs. rbd clone + rbd flatten

2023-03-27 Thread Tony Liu
Thank you Ilya! Tony From: Ilya Dryomov Sent: March 27, 2023 10:28 AM To: Tony Liu Cc: ceph-users@ceph.io; d...@ceph.io Subject: Re: [ceph-users] rbd cp vs. rbd clone + rbd flatten On Wed, Mar 22, 2023 at 10:51 PM Tony Liu wrote: > > Hi, > > I want > 1)

[ceph-users] orphan multipart objects in Ceph cluster

2023-03-27 Thread Ramin Najjarbashi
I hope this email finds you well. I wanted to share a recent experience I had with our Ceph cluster and get your feedback on a solution I came up with. Recently, we had some orphan objects stuck in our cluster that were not visible to any client such as s3cmd, boto3, or mc. This caused some confusio
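
For context, one commonly suggested way to hunt for leftover RGW objects is to compare what RGW knows about a bucket with what actually sits in the data pool; bucket and pool names below are placeholders and this is only a sketch, not the exact procedure used in this case:

    # objects RGW believes belong to the bucket
    radosgw-admin bucket radoslist --bucket=mybucket > known-objects.txt
    # scan the data pool and report RADOS objects no bucket claims
    rgw-orphan-list default.rgw.buckets.data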

[ceph-users] Re: Ceph cluster out of balance after adding OSDs

2023-03-27 Thread Pat Vaughan
Looking at the pools, there are 2 crush rules. Only one pool has a meaningful amount of data, the charlotte.rgw.buckets.data pool. This is the crush rule for that pool. { "rule_id": 1, "rule_name": "charlotte.rgw.buckets.data", "type": 3, "steps": [ { "op": "se
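
For readers who want to reproduce the check, the rule and the pool-to-rule mapping can be dumped like this:

    # dump the rule applied to the data pool
    ceph osd crush rule dump charlotte.rgw.buckets.data
    # confirm which rule each pool actually uses
    ceph osd pool get charlotte.rgw.buckets.data crush_rule
    ceph osd pool ls detail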

[ceph-users] ceph orch ps shows version, container and image id as unknown

2023-03-27 Thread Adiga, Anantha
Hi, Has anybody noticed this? ceph orch ps shows version, container and image id as unknown only for mon, mgr and osds. Ceph health is OK and all daemons are running fine. cephadm ls shows values for version, container and image id. root@cr21meg16ba0101:~# cephadm shell ceph orch ps Inferring
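
Since the orchestrator gathers those fields through the active mgr, a metadata refresh or a mgr failover is often enough to repopulate them; a hedged first step could be:

    # force the orchestrator to refresh daemon metadata
    ceph orch ps --refresh
    # if version/container/image id stay unknown, fail over the active mgr
    ceph mgr fail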

[ceph-users] Re: Ceph Mgr/Dashboard Python dependencies: a new approach

2023-03-27 Thread Ken Dreyer
Yeah, unfortunately we had all of these in the Copr, and some infrastructure change deleted them: https://bugzilla.redhat.com/show_bug.cgi?id=2143742 So the quickest route back will be to rebuild the missing-from-EPEL packages with the newer Copr settings, and I have written notes for that in http

[ceph-users] Re: Ceph Mgr/Dashboard Python dependencies: a new approach

2023-03-27 Thread Casey Bodley
I would hope that packaging for epel9 would be relatively easy, given that the epel8 packages already exist. As a first step, we'd need to build a full list of the missing packages. The tracker issue only complains about python3-asyncssh, python3-pecan and python3-routes, but some of their dependenc

[ceph-users] Re: Ceph Mgr/Dashboard Python dependencies: a new approach

2023-03-27 Thread Ken Dreyer
I hope we don't backport such a big change to Quincy. That will have a large impact on how we build in restricted environments with no internet access. We could get the missing packages into EPEL. - Ken On Fri, Mar 24, 2023 at 7:32 AM Ernesto Puerta wrote: > > Hi Casey, > > The original idea wa

[ceph-users] Re: Question about adding SSDs

2023-03-27 Thread Marc
> > We have a ceph cluster (Proxmox based) which is HDD-based. We’ve had > some performance and “slow MDS” issues while doing VM/CT backups from > the Proxmox cluster, especially when rebalancing is going on at the same > time. I also had to increase the mds cache quite a lot to get rid of 'slow'

[ceph-users] Re: rbd cp vs. rbd clone + rbd flatten

2023-03-27 Thread Ilya Dryomov
On Wed, Mar 22, 2023 at 10:51 PM Tony Liu wrote: > > Hi, > > I want > 1) copy a snapshot to an image, > 2) no need to copy snapshots, > 3) no dependency after copy, > 4) all same image format 2. > In that case, is rbd cp the same as rbd clone + rbd flatten? > I ran some tests, seems like it, but w
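
For anyone comparing the two approaches, the command sequences look roughly like this (pool/image/snapshot names are placeholders; depending on the clone format the snapshot may need to be protected first):

    # variant 1: one-step copy of the snapshot into a standalone image
    rbd cp pool/image@snap1 pool/newimage

    # variant 2: clone the snapshot, then detach it from the parent
    rbd snap protect pool/image@snap1
    rbd clone pool/image@snap1 pool/newimage
    rbd flatten pool/newimage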

[ceph-users] Re: quincy v17.2.6 QE Validation status

2023-03-27 Thread Laura Flores
Rados review, second round: Failures: 1. https://tracker.ceph.com/issues/58560 2. https://tracker.ceph.com/issues/58476 3. https://tracker.ceph.com/issues/58475 -- pending Q backport 4. https://tracker.ceph.com/issues/49287 5. https://tracker.ceph.com/issues/58585 Details:

[ceph-users] Question about adding SSDs

2023-03-27 Thread Kyriazis, George
Hello ceph community, We have a Ceph cluster (Proxmox based) which is HDD-based. We’ve had some performance and “slow MDS” issues while doing VM/CT backups from the Proxmox cluster, especially when rebalancing is going on at the same time. My thought is that one of the following is going to improve
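
If the plan ends up being a separate SSD tier, a common pattern is to pin the CephFS metadata pool to a CRUSH rule restricted to the ssd device class; a hedged sketch (rule and pool names are placeholders):

    # replicated rule that only selects OSDs with device class "ssd"
    ceph osd crush rule create-replicated replicated_ssd default host ssd
    # move the metadata pool onto it
    ceph osd pool set cephfs_metadata crush_rule replicated_ssd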

[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-03-27 Thread Igor Fedotov
On 3/27/2023 12:19 PM, Boris Behrens wrote: Nonetheless the IOPS the bench command generates are still VERY low compared to the Nautilus cluster (~150 vs ~250). But this is something I would pin to this bug: https://tracker.ceph.com/issues/58530 I've just run "ceph tell bench" against main, oct
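
For reference, the bench command takes the total bytes and the write size as arguments; a small-block run that approximates IOPS might look like this (osd.0 is just an example target):

    # write 3000 x 4 KiB objects to osd.0 and report the resulting rate
    ceph tell osd.0 bench 12288000 4096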

[ceph-users] Re: ln: failed to create hard link 'file name': Read-only file system

2023-03-27 Thread Frank Schilder
> Sorry for the late reply. No worries. > The ceph qa teuthology test cases already have one similar test, which > will untar a kernel tarball, but I have never seen this yet. > > I will try this again tomorrow without the NFS client. Great. In case you would like to use the archive I sent you a link for, please

[ceph-users] Re: quincy v17.2.6 QE Validation status

2023-03-27 Thread Casey Bodley
On Fri, Mar 24, 2023 at 3:46 PM Yuri Weinstein wrote: > > Details of this release are updated here: > > https://tracker.ceph.com/issues/59070#note-1 > Release Notes - TBD > > The slowness we experienced seemed to be self-cured. > Neha, Radek, and Laura please provide any findings if you have them.

[ceph-users] Re: Ceph cluster out of balance after adding OSDs

2023-03-27 Thread Robert Sander
On 27.03.23 16:34, Pat Vaughan wrote: Yes, all the OSDs are using the SSD device class. Do you have multiple CRUSH rules by chance? Are all pools using the same CRUSH rule? Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin https://www.heinlein-support.de Tel
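
One quick way to answer that for every pool at once:

    # list all CRUSH rules and the rule assigned to each pool
    ceph osd crush rule ls
    ceph osd pool ls detail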

[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-03-27 Thread Marc
> And for your reference - IOPS numbers I'm getting in my lab with data/DB colocated: > 1) OSD on top of Intel S4600 (SATA SSD) - ~110 IOPS SATA SSDs on Nautilus: Micron 5100 - 117, MZ7KM1T9HMJP-5 - 122

[ceph-users] Re: Ceph cluster out of balance after adding OSDs

2023-03-27 Thread Robert Sander
On 27.03.23 16:04, Pat Vaughan wrote: we looked at the number of PGs for that pool, and found that there was only 1 for the rgw.data and rgw.log pools, and "osd pool autoscale-status" doesn't return anything, so it looks like that hasn't been working. If you are in this situation, have a look
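
For completeness, a sketch of re-enabling the autoscaler and, if it stays silent, bumping the PG count by hand; enabling the module is usually a no-op on recent releases and the pg_num value is only an example:

    ceph mgr module enable pg_autoscaler
    ceph osd pool set charlotte.rgw.buckets.data pg_autoscale_mode on
    # manual fallback if autoscale-status still returns nothing
    ceph osd pool set charlotte.rgw.buckets.data pg_num 128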

[ceph-users] Re: ln: failed to create hard link 'file name': Read-only file system

2023-03-27 Thread Xiubo Li
Frank, Sorry for the late reply. On 24/03/2023 01:56, Frank Schilder wrote: Hi Xiubo and Gregory, sorry for the slow reply, I did some more debugging and didn't have too much time. First some questions about collecting logs, but please see also below for reproducing the issue yourselves. I can reproduce

[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-03-27 Thread Boris Behrens
Hey Igor, we are currently using these disks - all SATA attached (is it normal to have some OSDs without a wear counter?): # ceph device ls | awk '{print $1}' | cut -f 1,2 -d _ | sort | uniq -c 18 SAMSUNG_MZ7KH3T8 (4TB) 126 SAMSUNG_MZ7KM1T9 (2TB) 24 SAMSUNG_MZ7L37T6 (8TB) 1 TOSHI

[ceph-users] Ceph cluster out of balance after adding OSDs

2023-03-27 Thread Pat Vaughan
We set up a small Ceph cluster about 6 months ago with just 6x 200GB OSDs and one EC 4x2 pool. When we created that pool, we enabled pg_autoscale. The OSDs stayed pretty well balanced. After our developers released a new "feature" that caused the storage to balloon up to over 80%, we added another
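
After adding OSDs, the usual first checks are the PG distribution per OSD and the balancer state; a hedged sketch:

    # per-OSD utilization and PG counts
    ceph osd df tree
    # let the upmap balancer even out the distribution
    ceph balancer mode upmap
    ceph balancer on
    ceph balancer status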

[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-03-27 Thread Marc
>>> What I also see is that I have three OSDs that have quite a lot of OMAP data, compared to other OSDs (~20 times higher). I don't know if this is an issue: >> I have 2TB ssd's with 2GB - 4GB omap data, while on 8TB hdd's the omap data is only 53MB - 100MB.

[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-03-27 Thread Anthony D'Atri
>> What I also see is that I have three OSDs that have quite a lot of OMAP data, compared to other OSDs (~20 times higher). I don't know if this is an issue: > I have 2TB ssd's with 2GB - 4GB omap data, while on 8TB hdd's the omap data is only 53MB - 100MB. > Should I manu

[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-03-27 Thread Marc
> What I also see is that I have three OSDs that have quite a lot of OMAP data, compared to other OSDs (~20 times higher). I don't know if this is an issue: I have 2TB ssd's with 2GB - 4GB omap data, while on 8TB hdd's the omap data is only 53MB - 100MB. Should I manually clean this? (
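
For anyone wanting to compare their own numbers: per-OSD omap usage shows up in the OMAP column of ceph osd df, and omap space is normally reclaimed by RocksDB compaction rather than cleaned by hand; a hedged sketch:

    # OMAP column shows per-OSD omap usage
    ceph osd df
    # trigger an online compaction on a single OSD (osd.12 is an example)
    ceph tell osd.12 compact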

[ceph-users] Re: EC profiles where m>k (EC 8+12)

2023-03-27 Thread Clyso GmbH - Ceph Foundation Member
Hi Fabien, we have also used it several times for 2 DC setups. However, we always try to use as few chunks as possible, as it is very inefficient when storing small files (min alloc size) and it can also lead to quite some problems with backfill and recovery in large ceph clusters. Joachim
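
For readers unfamiliar with m>k profiles, creating one is straightforward; the sketch below uses host as the failure domain, and a real 2-DC layout would still need a custom CRUSH rule on top (names and PG counts are placeholders):

    ceph osd erasure-code-profile set ec-8-12 k=8 m=12 crush-failure-domain=host
    ceph osd pool create ec-8-12-test 128 128 erasure ec-8-12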

[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-03-27 Thread Igor Fedotov
Hi Boris, I wouldn't recommend taking absolute "osd bench" numbers too seriously. It's definitely not a full-scale quality benchmark tool. The idea was just to make a brief comparison of OSDs from c1 and c2. And for your reference - IOPS numbers I'm getting in my lab with data/DB colocated: 1

[ceph-users] Re: quincy v17.2.6 QE Validation status

2023-03-27 Thread Venky Shankar
On Sat, Mar 25, 2023 at 1:17 AM Yuri Weinstein wrote: > > Details of this release are updated here: > > https://tracker.ceph.com/issues/59070#note-1 > Release Notes - TBD > > The slowness we experienced seemed to be self-cured. > Neha, Radek, and Laura please provide any findings if you have them.

[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-03-27 Thread Boris Behrens
Hello together, I've redeployed all OSDs in the cluster and did a blkdiscard before deploying them again. It now looks a lot better, even better than before the Octopus upgrade. I am waiting for confirmation from the dev and customer teams, as the value over all OSDs can be misleading, and we still have some OS
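
For the archive, the redeploy cycle described above roughly amounts to the following per device; /dev/sdX is a placeholder and both of the first two commands are destructive:

    # wipe the old OSD volumes, then discard all blocks on the SSD
    ceph-volume lvm zap --destroy /dev/sdX
    blkdiscard /dev/sdX
    # recreate the OSD on the freshly discarded device
    ceph-volume lvm create --bluestore --data /dev/sdX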

[ceph-users] Re: quincy v17.2.6 QE Validation status

2023-03-27 Thread Nizamudeen A
Dashboard LGTM! On Sat, Mar 25, 2023 at 1:16 AM Yuri Weinstein wrote: > Details of this release are updated here: > > https://tracker.ceph.com/issues/59070#note-1 > Release Notes - TBD > > The slowness we experienced seemed to be self-cured. > Neha, Radek, and Laura please provide any findings i