Thank you again, Alex. It makes a lot of sense now with this detailed explanation.
On Mon, Apr 15, 2019, 20:25 Alex McWhirter <[email protected]> wrote:

> On 2019-04-15 13:08, Leo David wrote:
>
> Thank you Alex !
> I will try these performance settings.
> If someone from the dev guys could validate and recommend those as a
> good standard configuration, it would be just great.
> If they are ok, wouldn't it be nice to have them applied from within
> the UI with the "Optimize for VirtStore" button ?
> Thank you !
>
> On Mon, Apr 15, 2019 at 7:39 PM Alex McWhirter <[email protected]> wrote:
>
>> On 2019-04-14 23:22, Leo David wrote:
>>
>> Hi,
>> Thank you Alex, I was looking for some optimisation settings as well,
>> since I am pretty much in the same boat, using ssd based
>> replicate-distributed volumes across 12 hosts.
>> Could anyone else (maybe even from the oVirt or RHEV team) validate
>> these settings or add some other tweaks as well, so we can use them
>> as standard ?
>> Thank you very much again !
>>
>> On Mon, Apr 15, 2019, 05:56 Alex McWhirter <[email protected]> wrote:
>>
>>> On 2019-04-14 20:27, Jim Kusznir wrote:
>>>
>>> Hi all:
>>> I've had I/O performance problems pretty much since the beginning of
>>> using oVirt. I've applied several upgrades as time went on, but
>>> strangely, none of them have alleviated the problem. VM disk I/O is
>>> still so slow that running VMs is often painful; it notably affects
>>> nearly all my VMs, and makes me leery of starting any more. I'm
>>> currently running 12 VMs and the hosted engine on the stack.
>>> My configuration started out with 1Gbps networking and
>>> hyperconverged gluster running on a single SSD on each node. It
>>> worked, but I/O was painfully slow. I also started running out of
>>> space, so I added an SSHD on each node, created another gluster
>>> volume, and moved VMs over to it. I also ran that on a dedicated
>>> 1Gbps network. I had recurring disk failures (it seems that disks
>>> only lasted about 3-6 months; I warrantied all three at least once,
>>> and some twice, before giving up). I suspect the Dell PERC 6/i was
>>> partly to blame; the raid card refused to see/acknowledge the disk,
>>> but plugging it into a normal PC showed no signs of problems. In any
>>> case, performance on that storage was notably bad, even though the
>>> gig-e interface was rarely taxed.
>>> I put in 10Gbps ethernet and moved all the storage onto it
>>> nonetheless, as several people here said that 1Gbps just wasn't fast
>>> enough. Some aspects improved a bit, but disk I/O is still slow. And
>>> I was still having problems with the SSHD data gluster volume eating
>>> disks, so I bought a dedicated NAS server (Supermicro 12-disk
>>> dedicated FreeNAS NFS storage system on 10Gbps ethernet) and set
>>> that up. I found that it was actually FASTER than the SSD-based
>>> gluster volume, but still slow. Lately it's been getting slower,
>>> too... I don't know why. The FreeNAS server reports network loads
>>> around 4MB/s on its 10GbE interface, so it's not network
>>> constrained. At 4MB/s, I'd sure hope the 12-spindle SAS interface
>>> wasn't constrained either..... (and disk I/O operations on the NAS
>>> itself complete much faster).
>>> So, running a test on my NAS against an ISO file I haven't accessed
>>> in months:
>>> # dd
>>> if=en_windows_server_2008_r2_standard_enterprise_datacenter_and_web_x64_dvd_x15-59754.iso
>>> of=/dev/null bs=1024k count=500
>>> 500+0 records in
>>> 500+0 records out
>>> 524288000 bytes transferred in 2.459501 secs (213168465 bytes/sec)
>>> Running it on one of my hosts:
>>> root@unifi:/home/kusznir# time dd if=/dev/sda of=/dev/null bs=1024k
>>> count=500
>>> 500+0 records in
>>> 500+0 records out
>>> 524288000 bytes (524 MB, 500 MiB) copied, 7.21337 s, 72.7 MB/s
>>> (I don't know if this is a true apples-to-apples comparison, as I
>>> don't have a large file inside this VM's image.) Even this is faster
>>> than I often see.
>>> I have a VoIP phone server running as a VM. Voicemail and other
>>> recordings usually fail due to I/O issues opening and writing the
>>> files. Often the first 4 or so seconds of the recording are missed;
>>> sometimes the entire thing just fails. I didn't use to have this
>>> problem, but it's definitely been getting worse. I finally bit the
>>> bullet and ordered a physical server dedicated to my VoIP system...
>>> but I still want to figure out why I'm having all these I/O
>>> problems. I read on the list of people running 30+ VMs... I feel
>>> that my I/O can't take any more VMs with any semblance of
>>> reliability. We have a QuickBooks server on here too (Windows), and
>>> the performance is abysmal; my CPA is charging me extra because of
>>> all the lost staff time waiting on the system to respond and
>>> generate reports.....
>>> I'm at my wits' end... I started with gluster on SSD with a 1Gbps
>>> network, migrated to a 10Gbps network, and now to a dedicated
>>> high-performance NAS box over NFS, and I still have performance
>>> issues..... I don't know how to troubleshoot the issue any further,
>>> but I've never had these kinds of issues when I was playing with
>>> other VM technologies. I'd like to get to the point where I can
>>> resell virtual servers to customers, but I can't do so with my
>>> current performance levels.
>>> I'd greatly appreciate help troubleshooting this further.
>>> --Jim
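A side note on the dd comparisons above: on Linux, reads served from the
page cache can report far higher numbers than the storage can actually
deliver, so repeat runs are not comparable to first runs. A minimal
sketch of a more cache-resistant test, assuming GNU dd on a Linux host
and a placeholder path /path/to/testfile:

# Hypothetical test file path; use a file at least a few GB in size.
# iflag=direct bypasses the page cache on reads.
dd if=/path/to/testfile of=/dev/null bs=1M count=500 iflag=direct

# For writes, oflag=direct bypasses the cache and conv=fsync forces a
# flush to stable storage before dd reports its throughput.
dd if=/dev/zero of=/path/to/testfile bs=1M count=500 oflag=direct conv=fsync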
>>> Been working on optimizing the same. This is where I'm at currently.
>>>
>>> Gluster volume settings:
>>>
>>> diagnostics.count-fop-hits: on
>>> diagnostics.latency-measurement: on
>>> performance.write-behind-window-size: 64MB
>>> performance.flush-behind: on
>>> performance.stat-prefetch: on
>>> server.event-threads: 4
>>> client.event-threads: 8
>>> performance.io-thread-count: 32
>>> network.ping-timeout: 30
>>> cluster.granular-entry-heal: enable
>>> performance.strict-o-direct: on
>>> storage.owner-gid: 36
>>> storage.owner-uid: 36
>>> features.shard: on
>>> cluster.shd-wait-qlength: 10000
>>> cluster.shd-max-threads: 8
>>> cluster.locking-scheme: granular
>>> cluster.data-self-heal-algorithm: full
>>> cluster.server-quorum-type: server
>>> cluster.quorum-type: auto
>>> cluster.eager-lock: enable
>>> network.remote-dio: off
>>> performance.low-prio-threads: 32
>>> performance.io-cache: off
>>> performance.read-ahead: off
>>> performance.quick-read: off
>>> auth.allow: *
>>> user.cifs: off
>>> transport.address-family: inet
>>> nfs.disable: off
>>> performance.client-io-threads: on
>>>
>>> sysctl options:
>>>
>>> net.core.rmem_max = 134217728
>>> net.core.wmem_max = 134217728
>>> net.ipv4.tcp_rmem = 4096 87380 134217728
>>> net.ipv4.tcp_wmem = 4096 65536 134217728
>>> net.core.netdev_max_backlog = 300000
>>> net.ipv4.tcp_moderate_rcvbuf = 1
>>> net.ipv4.tcp_no_metrics_save = 1
>>> net.ipv4.tcp_congestion_control = htcp
>>>
>>> Custom /sbin/ifup-local file; Storage is the bridge name, which ==
>>> ens3f0/1 in bond2:
>>>
>>> #!/bin/bash
>>> case "$1" in
>>>   Storage)
>>>     /sbin/ethtool -K ens3f0 tx off rx off tso off gso off
>>>     /sbin/ethtool -K ens3f1 tx off rx off tso off gso off
>>>     /sbin/ip link set dev ens3f0 txqueuelen 10000
>>>     /sbin/ip link set dev ens3f1 txqueuelen 10000
>>>     /sbin/ip link set dev bond2 txqueuelen 10000
>>>     /sbin/ip link set dev Storage txqueuelen 10000
>>>     ;;
>>>   *)
>>>     ;;
>>> esac
>>> exit 0
>>>
>>> I still have some latency issues, but my writes are up to 264MB/s
>>> sequential on HDDs.
>>>
>>> Output of CrystalDiskMark on a Windows 10 VM:
>>>
>>> Sequential Read  (Q= 32,T= 1) : 688.536 MB/s
>>> Sequential Write (Q= 32,T= 1) : 264.254 MB/s
>>> Random Read 4KiB (Q=  8,T= 8) : 176.069 MB/s [ 42985.6 IOPS]
>>> Random Write 4KiB (Q= 8,T= 8) :  63.217 MB/s [ 15433.8 IOPS]
>>> Random Read 4KiB (Q= 32,T= 1) : 159.598 MB/s [ 38964.4 IOPS]
>>> Random Write 4KiB (Q= 32,T= 1) : 54.212 MB/s [ 13235.4 IOPS]
>>> Random Read 4KiB (Q=  1,T= 1) :   3.488 MB/s [   851.6 IOPS]
>>> Random Write 4KiB (Q= 1,T= 1) :   3.006 MB/s [   733.9 IOPS]
>>>
>>> Also, enabling libgfapi on the engine was the best performance
>>> option I ever tweaked; it easily doubled reads / writes.
>>
>> Also, with all of that said, I've mostly solved the rest of my issues
>> by enabling performance.read-ahead on the gluster volume. I am
>> saturating my 10G network, which translates to 700MB/s reads, 350MB/s
>> writes (replica 2).
>>
>> Just make sure your local read-ahead settings on the bricks are sane,
>> i.e. "blockdev --getra /dev/sdx"; mine is 8192.
>
> --
> Best regards, Leo David
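For anyone wanting to try the above, the pieces land in three different
places: gluster options are set per volume, sysctl values go in a
persistent drop-in, and brick read-ahead is set with blockdev. A minimal
sketch, assuming a hypothetical volume named "data" and a hypothetical
brick disk /dev/sdb:

# Hypothetical volume name ("data") and brick device (/dev/sdb);
# adjust both to your environment before running anything.

# Gluster options are applied per volume:
gluster volume set data performance.read-ahead on
gluster volume set data performance.write-behind-window-size 64MB

# Persist the TCP tuning so it survives reboots, then load it now:
cat > /etc/sysctl.d/90-gluster-storage.conf <<'EOF'
net.core.rmem_max = 134217728
net.core.wmem_max = 134217728
net.ipv4.tcp_congestion_control = htcp
EOF
sysctl -p /etc/sysctl.d/90-gluster-storage.conf

# Check, then raise, the brick device's read-ahead (units are 512-byte
# sectors; note --setra does not persist across reboots on its own):
blockdev --getra /dev/sdb
blockdev --setra 8192 /dev/sdb

As for the libgfapi remark, on oVirt 4.2-era releases that was an
engine-config toggle, along the lines of the sketch below; check the
documentation for your exact version and cluster level:

engine-config -s LibgfApiSupported=true --cver=4.2
systemctl restart ovirt-engine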
> To be fair, most of these are defaults. The ones I have changed from
> the defaults are:
>
> performance.read-ahead: on
> performance.stat-prefetch: on
> performance.flush-behind: on (pretty sure this was on by default, but
> I explicitly set it)
> performance.client-io-threads: on
> performance.write-behind-window-size: 64MB (this was set to 1MB, but I
> set it to 64MB, which is the size of a single shard in distributed
> replicate mode)
>
> These are env specific; I have 48 cores / host, so adding a few
> threads for this helped make things more consistent:
>
> server.event-threads: 4
> client.event-threads: 8
>
> As far as NIC tuning goes, with gluster working almost exclusively
> with large files, you want some big buffers, and the HTCP congestion
> protocol was basically designed for this use case. In my case TCP
> offload on the NICs was hurting me, so I disabled it, then upped the
> txqueuelen, again because we are working with exclusively large files.
>
> The NIC tuning stuff is pretty hardware specific; I can't see the
> ovirt devs using these as defaults, especially since they would be
> really bad to do on 1Gbps networks. The gluster settings also have
> some valid points. stat-prefetch is off because at one point it used
> to corrupt data on live migration. This was fixed in gluster, but the
> default appears to be a bit of a leftover now. read-ahead can slow you
> down on 1Gbps networks. client-io-threads may be a bad idea if you are
> really packing the hosts with VMs or have low core counts / no SMT.
> Write-behind windows are dangerous on power loss, etc.
>
> The defaults from ovirt are fairly sane, and really only needed
> minimal tweaking to get optimal performance.
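Since most of the long list above really is defaults, a sanity check
before copying anything wholesale is to ask gluster what a volume is
currently using. A minimal sketch, again assuming a hypothetical volume
named "data":

# Show one option's current value on the (hypothetical) "data" volume:
gluster volume get data performance.write-behind-window-size

# Dump every option and filter for the handful Alex changed:
gluster volume get data all | \
  grep -E 'read-ahead|stat-prefetch|flush-behind|client-io-threads|write-behind-window|event-threads'

# Change a value only after noting what it was, so it can be reverted:
gluster volume set data client.event-threads 8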

