Hello,
we've noticed a similar issue after upgrading our 3-node test cluster from
15.2.14-1~bpo10+1 to 16.2.5-1~bpo10+1.
Quick tests using rados bench:
16.2.5-1~bpo10+1:
Total time run: 133.28
Total writes made: 576
Write size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 17.2869
Stddev Bandwidth: 34.1485
Max bandwidth (MB/sec): 204
Min bandwidth (MB/sec): 0
Average IOPS: 4
Stddev IOPS: 8.55426
Max IOPS: 51
Min IOPS: 0
Average Latency(s): 3.59873
Stddev Latency(s): 5.99964
Max latency(s): 30.6307
Min latency(s): 0.0865062
After downgrading the OSDs:
15.2.14-1~bpo10+1:
Total time run: 120.135
Total writes made: 16324
Write size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 543.524
Stddev Bandwidth: 21.7548
Max bandwidth (MB/sec): 580
Min bandwidth (MB/sec): 436
Average IOPS: 135
Stddev IOPS: 5.43871
Max IOPS: 145
Min IOPS: 109
Average Latency(s): 0.117646
Stddev Latency(s): 0.0391269
Max latency(s): 0.544229
Min latency(s): 0.0602735
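As a sanity check (our own arithmetic, not part of the original report), the bandwidth and IOPS figures above are consistent with the raw counters: bandwidth = writes × object size / total time. A quick recomputation:

```python
# Re-derive bandwidth and average IOPS from the raw rados bench counters
# quoted above. 4194304 bytes = 4 MB objects (rados bench reports MB/s
# using MiB-sized megabytes).

def bench_stats(total_writes, object_size_bytes, total_time_s):
    """Return (bandwidth in MB/s, average IOPS) the way rados bench does."""
    mb_written = total_writes * object_size_bytes / (1024 * 1024)
    return mb_written / total_time_s, total_writes / total_time_s

# 16.2.5 (pacific) run: 576 writes in 133.28 s
bw_pacific, iops_pacific = bench_stats(576, 4194304, 133.28)

# 15.2.14 (octopus) run: 16324 writes in 120.135 s
bw_octopus, iops_octopus = bench_stats(16324, 4194304, 120.135)

print(f"pacific: {bw_pacific:.2f} MB/s, {iops_pacific:.1f} IOPS")
print(f"octopus: {bw_octopus:.2f} MB/s, {iops_octopus:.1f} IOPS")
print(f"slowdown: {bw_octopus / bw_pacific:.1f}x")
```

Both derived bandwidths match the reported figures (17.29 and 543.52 MB/s), so the roughly 31x slowdown is real throughput loss, not a reporting artifact.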
We currently run this setup (output of `ceph versions`):
{
    "mon": {
        "ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)": 2
    },
    "mgr": {
        "ceph version 15.2.14 (cd3bb7e87a2f62c1b862ff3fd8b1eec13391a5be) octopus (stable)": 3
    },
    "osd": {
        "ceph version 15.2.14 (cd3bb7e87a2f62c1b862ff3fd8b1eec13391a5be) octopus (stable)": 35
    },
    "mds": {},
    "overall": {
        "ceph version 15.2.14 (cd3bb7e87a2f62c1b862ff3fd8b1eec13391a5be) octopus (stable)": 38,
        "ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)": 2
    }
}
Downgrading solved the performance issue. All OSDs were newly created and
fully resynced from the other nodes both when upgrading and when downgrading
back to 15.2.
Best Regards,
Martin
On 05. 09. 21 at 19:45, Luis Domingues wrote:
> Hello,
>
> I run a test cluster of 3 machines with 24 HDDs each, running bare-metal on
> CentOS 8. Long story short, I can have a bandwidth of ~ 1'200 MB/s when I do
> a rados bench, writing objects of 128k, when the cluster is installed with
> Nautilus.
>
> When I upgrade the cluster to Pacific, (using ceph-ansible to deploy and/or
> upgrade), my performances drop to ~400 MB/s of bandwidth doing the same rados
> bench.
>
> I am kind of clueless on what makes the performance drop so much. Does
> someone have some ideas where I can dig to find the root of this difference?
>
> Thanks,
> Luis Domingues
--
Martin Mlynář
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]