Hi Rainer,

El 17/3/20 a las 16:58, Rainer Krienke escribió:
thanks for your answer,
Take into account that I haven't used iothreads myself; what I told you is what I have learned here and elsewhere. Alexandre and Alwin are the experts on this ;)
if I understand you correctly, then iothreads can only help if the VM
has more than one disk, hence your proposal to build a raid0 on two rbd
devices. The disadvantage of this solution would of course be that disk
usage would be doubled.
Not necessarily, just create more, smaller disks. Create a striped raid0 and add it as a PV to LVM, then create the LVs you need.
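
For example, with four extra RBD-backed disks in the VM it could look
roughly like this (untested sketch, device names and sizes are only
placeholders, adapt to your setup):

   # sdb..sde are the additional VM disks backed by RBD
   mdadm --create /dev/md0 --level=0 --raid-devices=4 /dev/sdb /dev/sdc /dev/sdd /dev/sde
   pvcreate /dev/md0
   vgcreate vg_data /dev/md0
   lvcreate -L 200G -n lv_data vg_data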

Alwin is right that this will make disk management more complex...
A fileserver VM I manage (not yet in production) could profit from this. I
use LVM on it anyway and could use striped LVs, so those volumes would
read from more of the VM's PV disks. Should help, I guess.
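
A striped LV would also avoid the extra mdadm layer; roughly like this
(again only a sketch, devices and sizes are placeholders):

   pvcreate /dev/sdb /dev/sdc /dev/sdd /dev/sde
   vgcreate vg_data /dev/sdb /dev/sdc /dev/sdd /dev/sde
   # -i = number of stripes (one per PV), -I = stripe size in KiB
   lvcreate -i 4 -I 64 -L 200G -n lv_data vg_data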

The host's CPU is an AMD EPYC 7402 24-core processor. Does it make sense
to select a specific CPU type for the VM? My test machines have the
default kvm64 processor. The number of vCPUs should then probably be at
least equal to the number of disks (number of iothreads)?
If all hosts have the same CPU, then use "host" type CPU.
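On the CLI that would be something like (the VM ID 100 is just an example):
   qm set 100 --cpu host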
Do you know if it makes any difference whether I use the VirtIO SCSI
controller versus the VirtIO SCSI single controller?
I haven't tried -single, maybe others can comment on this.
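If you want to experiment: as far as I understand, the -single variant
gives each disk its own controller, which is what lets an iothread work
per disk. Roughly (VM ID, storage and volume names are placeholders):
   qm set 100 --scsihw virtio-scsi-single
   qm set 100 --scsi0 ceph-rbd:vm-100-disk-1,iothread=1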

Cheers
Eneko

Thank you very much
Rainer

Am 17.03.20 um 14:10 schrieb Eneko Lacunza:
Hi,

You can try to enable IO threads and assign multiple Ceph disks to the
VM, then build some kind of raid0 to increase performance.

Generally speaking, an SSD-based Ceph cluster is considered to perform
well when a VM gets about 2000 IOPS, and factors like CPU single-thread
performance, network and disks have to be selected with care. The
servers' energy-saving features should also be disabled, etc.
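For example, on the PVE hosts you can check and force the CPU frequency
governor (on top of the BIOS settings; requires the cpupower tool to be
installed):
   cpupower frequency-info
   cpupower frequency-set -g performance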

What CPUs are in those 9 nodes?

Ceph is built for parallel access and scaling. You're only using 1
thread of your VM for disk IO currently.
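You can see what parallelism does with fio, something like this (example
parameters only, and careful: this writes to the device, so use a scratch
disk):
   fio --name=randwrite --filename=/dev/sdb --direct=1 --ioengine=libaio \
       --rw=randwrite --bs=4k --iodepth=32 --numjobs=4 \
       --runtime=60 --time_based --group_reporting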

Cheers
Eneko

El 17/3/20 a las 14:04, Rainer Krienke escribió:
Hello,

I run a PVE 6.1-7 cluster with 5 nodes that is attached (via a 10 Gbit
network) to a Ceph Nautilus cluster with 9 Ceph nodes and 144 magnetic
disks. The pool with RBD images for disk storage is erasure coded with a
4+2 profile.

I ran some performance tests because I noticed that there seems to be a
strange limit on the disk read/write rate of a single VM, even though the
physical machine hosting the VM, as well as the cluster as a whole, is
capable of doing much more.

So what I did was run a bonnie++ as well as a dd read/write test, first
in parallel on 10 VMs, then on 5 VMs and finally on a single one.

A value of "75" for "bo++rd" in the first line below means that each of
the 10 bonnie++ processes, running in parallel on 10 different Proxmox
VMs, reported on average 75 MBytes/sec for "block read". The ceph values
are the peaks measured by Ceph itself during the test run (all rd/wr
values in MBytes/sec):

VM-count:  bo++rd:  bo++wr:  ceph(rd/wr):  dd-rd:  dd-wr:  ceph(rd/wr):
    10        75       42      540/485       55      58      698/711
     5        90       62      310/338       47      80      248/421
     1       108      114      111/120      130     145      337/165


What I find a little strange is that when running many VMs doing IO in
parallel I reach a write rate of about 485-711 MBytes/sec, whereas with a
single VM the maximum is 120-165 MBytes/sec. Since the whole network is
based on a 10 Gbit infrastructure and an iperf test between a VM and a
Ceph node reported nearly 10 Gbit/s, I would expect a higher rate for the
single VM. Even if I run a test with 5 VMs on *one* physical host (values
not shown above), the results are not far behind the values for 5 VMs on
5 hosts shown above. So the single host does not seem to be the limiting
factor; the VM itself is limiting IO.
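To rule out the cluster itself, the pool can also be benchmarked directly
from one of the PVE hosts, for example (pool name is a placeholder):
   rados bench -p testpool 60 write -b 4M -t 16 --no-cleanup
   rados bench -p testpool 60 seq -t 16
   rados -p testpool cleanup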

What rates do you find on your proxmox/ceph cluster for single VMs?
Does any one have any explanation for this rather big difference or
perhaps an idea what to try in order to get higher IO-rates from a
single VM?

Thank you very much in advance
Rainer



---------------------------------------------
Here are the more detailed test results for anyone interested:

Using bonnie++:
10 VMs (two on each of the 5 hosts), each with 4GB RAM, BTRFS; cd /root;
bonnie++ -u root
    Average for each VM:
    block write: ~42MByte/sec, block read: ~75MByte/sec
    ceph: total peak: 485MByte/sec write, 540MByte/sec read

5 VMs (one on each of the 5 hosts) 4GB RAM, BTRFS, cd /root; bonnie++ -u
root
    Average for each VM:
    block write: ~62MByte/sec, block read: ~90MByte/sec
    ceph: total peak: 338MByte/sec write, 310MByte/sec read

1 VM  4GB RAM, BTRFS, cd /root; bonnie++ -u root
    Average for VM:
    block write: ~114 MByte/sec, block read: ~108MByte/sec
    ceph: total peak: 120 MByte/sec write, 111MByte/sec read


Using dd:
10 VMs (two on each of the 5 hosts), each with 4GB RAM, writing to a ceph
based vm-disk "sdb" (rbd)
    write: dd if=/dev/zero of=/dev/sdb bs=nnn count=kkk conv=fsync
status=progress
    read:  dd of=/dev/null if=/dev/sdb bs=nnn count=kkk  status=progress
    Average for each VM:
    bs=1024k count=12000: dd write: ~58MByte/sec, dd read: ~48MByte/sec
    bs=4096k count=3000:  dd write: ~59MByte/sec, dd read: ~55MByte/sec
    ceph: total peak: 711MByte/sec write, 698 MByte/sec read

5 VMs (one on each of the 5 hosts), each with 4GB RAM, writing to a ceph
based vm-disk "sdb" (rbd)
    write: dd if=/dev/zero of=/dev/sdb bs=4096k count=3000 conv=fsync
status=progress
    read:  dd of=/dev/null if=/dev/sdb bs=4096k count=3000
status=progress
    Average for each VM:
    bs=4096k count=3000:  dd write: ~80 MByte/sec, dd read: ~47MByte/sec
    ceph: total peak: 421MByte/sec write, 248 MByte/sec read

1 VM: 4GB RAM, write on a ceph based vm-disk "sdb" (rbd-device)
    write: dd if=/dev/zero of=/dev/sdb bs=4096k count=3000 conv=fsync
status=progress
    read:  dd of=/dev/null if=/dev/sdb bs=4096k count=3000
status=progress
    Average for the VM:
    bs=4096k count=3000:  dd write: ~145 MByte/sec, dd read: ~130
MByte/sec
    ceph: total peak: 165 MByte/sec write, 337 MByte/sec read



--
Zuzendari Teknikoa / Director Técnico
Binovo IT Human Project, S.L.
Telf. 943569206
Astigarragako bidea 2, 2º izq. oficina 11; 20180 Oiartzun (Gipuzkoa)
www.binovo.es

_______________________________________________
pve-user mailing list
pve-user@pve.proxmox.com
https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
