One more interesting thing to note: as a test I just re-ran dd on the engine VM and got around 3-5MB/sec average writes. I had not previously set this volume to optimize for virt store, so I went ahead and set that option, and now I'm getting 50MB/sec writes.

However, if I compare my gluster engine volume info now vs. what I posted in the reply above (before I made the optimize change), all the gluster options are identical; not one value has changed as far as I can see. What exactly is the "optimize for virt store" option in the admin GUI doing?
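(For reference: "gluster volume info" only lists options that have been explicitly reconfigured, while "gluster volume get <volume> all" prints the effective value of every option, defaults included, so it can reveal changes the info output hides. A minimal sketch for diffing the two states, assuming the gluster CLI is available on one of the hosts:

    # Snapshot every effective option before toggling the GUI setting
    gluster volume get engine all > /tmp/engine-opts-before.txt

    # ... toggle "Optimize for Virt Store" in the admin GUI ...

    gluster volume get engine all > /tmp/engine-opts-after.txt

    # Anything the GUI changed on the gluster side will show up here
    diff /tmp/engine-opts-before.txt /tmp/engine-opts-after.txt

An empty diff would suggest the GUI change took effect somewhere other than the gluster volume options.)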
On Sat, Aug 4, 2018 at 10:29 AM, Jayme <[email protected]> wrote:

> One more note on this. I only set optimize for virt on the data volumes; I did not set it on the engine volume, and wasn't sure if I should. My dd tests on the engine VM are writing at ~8Mb/sec (like my test VM on the data volume was before I made the change). Is it recommended to use optimize for virt on the engine volume as well?
>
> On Sat, Aug 4, 2018 at 10:26 AM, Jayme <[email protected]> wrote:
>
>> Interesting that it should have been set by cockpit but seemingly wasn't (at least it did not appear so in my case, as setting optimize for virt increased performance dramatically). I did indeed use the cockpit to deploy. I was using oVirt Node on all three hosts, a recent download/burn of 4.2.5. Here is my current gluster volume info, if it's helpful to anyone:
>>
>> Volume Name: data
>> Type: Replicate
>> Volume ID: 1428c3d3-8a51-4e45-a7bb-86b3bde8b6ea
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x 3 = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: MASKED:/gluster_bricks/data/data
>> Brick2: MASKED:/gluster_bricks/data/data
>> Brick3: MASKED:/gluster_bricks/data/data
>> Options Reconfigured:
>> features.barrier: disable
>> server.allow-insecure: on
>> cluster.granular-entry-heal: enable
>> performance.strict-o-direct: on
>> network.ping-timeout: 30
>> storage.owner-gid: 36
>> storage.owner-uid: 36
>> user.cifs: off
>> features.shard: on
>> cluster.shd-wait-qlength: 10000
>> cluster.shd-max-threads: 8
>> cluster.locking-scheme: granular
>> cluster.data-self-heal-algorithm: full
>> cluster.server-quorum-type: server
>> cluster.quorum-type: auto
>> cluster.eager-lock: enable
>> network.remote-dio: enable
>> performance.low-prio-threads: 32
>> performance.io-cache: off
>> performance.read-ahead: off
>> performance.quick-read: off
>> transport.address-family: inet
>> nfs.disable: on
>> performance.client-io-threads: off
>>
>> Volume Name: data2
>> Type: Replicate
>> Volume ID: e97a2e9c-cd47-4f18-b2c2-32d917a8c016
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x 3 = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: MASKED:/gluster_bricks/data2/data2
>> Brick2: MASKED:/gluster_bricks/data2/data2
>> Brick3: MASKED:/gluster_bricks/data2/data2
>> Options Reconfigured:
>> server.allow-insecure: on
>> cluster.granular-entry-heal: enable
>> performance.strict-o-direct: on
>> network.ping-timeout: 30
>> storage.owner-gid: 36
>> storage.owner-uid: 36
>> user.cifs: off
>> features.shard: on
>> cluster.shd-wait-qlength: 10000
>> cluster.shd-max-threads: 8
>> cluster.locking-scheme: granular
>> cluster.data-self-heal-algorithm: full
>> cluster.server-quorum-type: server
>> cluster.quorum-type: auto
>> cluster.eager-lock: enable
>> network.remote-dio: enable
>> performance.low-prio-threads: 32
>> performance.io-cache: off
>> performance.read-ahead: off
>> performance.quick-read: off
>> transport.address-family: inet
>> nfs.disable: on
>> performance.client-io-threads: off
>> Volume Name: engine
>> Type: Replicate
>> Volume ID: ae465791-618c-4075-b68c-d4972a36d0b9
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x 3 = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: MASKED:/gluster_bricks/engine/engine
>> Brick2: MASKED:/gluster_bricks/engine/engine
>> Brick3: MASKED:/gluster_bricks/engine/engine
>> Options Reconfigured:
>> cluster.granular-entry-heal: enable
>> performance.strict-o-direct: on
>> network.ping-timeout: 30
>> storage.owner-gid: 36
>> storage.owner-uid: 36
>> user.cifs: off
>> features.shard: on
>> cluster.shd-wait-qlength: 10000
>> cluster.shd-max-threads: 8
>> cluster.locking-scheme: granular
>> cluster.data-self-heal-algorithm: full
>> cluster.server-quorum-type: server
>> cluster.quorum-type: auto
>> cluster.eager-lock: enable
>> network.remote-dio: off
>> performance.low-prio-threads: 32
>> performance.io-cache: off
>> performance.read-ahead: off
>> performance.quick-read: off
>> transport.address-family: inet
>> nfs.disable: on
>> performance.client-io-threads: off
>>
>> Volume Name: vmstore
>> Type: Replicate
>> Volume ID: 7065742b-c09d-410b-9e89-174ade4fc3f5
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x 3 = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: MASKED:/gluster_bricks/vmstore/vmstore
>> Brick2: MASKED:/gluster_bricks/vmstore/vmstore
>> Brick3: MASKED:/gluster_bricks/vmstore/vmstore
>> Options Reconfigured:
>> cluster.granular-entry-heal: enable
>> performance.strict-o-direct: on
>> network.ping-timeout: 30
>> storage.owner-gid: 36
>> storage.owner-uid: 36
>> user.cifs: off
>> features.shard: on
>> cluster.shd-wait-qlength: 10000
>> cluster.shd-max-threads: 8
>> cluster.locking-scheme: granular
>> cluster.data-self-heal-algorithm: full
>> cluster.server-quorum-type: server
>> cluster.quorum-type: auto
>> cluster.eager-lock: enable
>> network.remote-dio: off
>> performance.low-prio-threads: 32
>> performance.io-cache: off
>> performance.read-ahead: off
>> performance.quick-read: off
>> transport.address-family: inet
>> nfs.disable: on
>> performance.client-io-threads: off
>>
>> Volume Name: vmstore2
>> Type: Replicate
>> Volume ID: 6f9a1c51-c0bc-46ad-b94a-fc2989a36e0c
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x 3 = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: MASKED:/gluster_bricks/vmstore2/vmstore2
>> Brick2: MASKED:/gluster_bricks/vmstore2/vmstore2
>> Brick3: MASKED:/gluster_bricks/vmstore2/vmstore2
>> Options Reconfigured:
>> cluster.granular-entry-heal: enable
>> performance.strict-o-direct: on
>> network.ping-timeout: 30
>> storage.owner-gid: 36
>> storage.owner-uid: 36
>> user.cifs: off
>> features.shard: on
>> cluster.shd-wait-qlength: 10000
>> cluster.shd-max-threads: 8
>> cluster.locking-scheme: granular
>> cluster.data-self-heal-algorithm: full
>> cluster.server-quorum-type: server
>> cluster.quorum-type: auto
>> cluster.eager-lock: enable
>> network.remote-dio: off
>> performance.low-prio-threads: 32
>> performance.io-cache: off
>> performance.read-ahead: off
>> performance.quick-read: off
>> transport.address-family: inet
>> nfs.disable: on
>> performance.client-io-threads: off
>>
>> On Fri, Aug 3, 2018 at 6:53 AM, Sahina Bose <[email protected]> wrote:
>>
>>> On Fri, 3 Aug 2018 at 3:07 PM, Jayme <[email protected]> wrote:
>>>
>>>> Hello,
>>>>
>>>> The option to optimize for virt store is tough to find (in my opinion): you have to go to Volumes > volume name and then click the two dots in the top right to expand further options before you see it. No one would know to find it (or that it even exists) if they weren't specifically looking.
>>>>
>>>> I don't know enough about it, but my assumption is that there are reasons why it's not set by default (it might not, or should not, need to apply to every volume created). My suggestion would be to include it in the cockpit as a selectable option next to each volume you create, with a hint suggesting that, for best performance, it be selected for any volume that is going to be a data volume for VMs.
>>>
>>> If you have installed via Cockpit, the options are set. Can you provide the "gluster volume info" output after you optimised for virt?
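(For anyone who wants to apply or verify this from the command line: gluster ships a predefined option group named "virt", and the GUI option is commonly described as applying that group plus the vdsm/kvm ownership. That equivalence is an assumption worth checking against your own version, but a rough sketch:

    # List the options the predefined "virt" group would set on this install
    cat /var/lib/glusterd/groups/virt

    # Apply the group and the ownership oVirt expects (vdsm uid/gid = 36)
    gluster volume set data group virt
    gluster volume set data storage.owner-uid 36
    gluster volume set data storage.owner-gid 36

Comparing that group file against the "Options Reconfigured" dumps above is a quick way to see whether a volume already carries the virt profile.)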
>>>> I simply installed using the latest node ISO / default cockpit deployment.
>>>>
>>>> Hope this helps!
>>>>
>>>> - Jayme
>>>>
>>>> On Fri, Aug 3, 2018 at 5:15 AM, Sahina Bose <[email protected]> wrote:
>>>>
>>>>> On Fri, Aug 3, 2018 at 5:25 AM, Jayme <[email protected]> wrote:
>>>>>
>>>>>> Bill,
>>>>>>
>>>>>> I thought I'd let you (and others) know this, as it might save you some headaches. I found that my performance problem was resolved by clicking the "optimize for virt store" option in the volume settings of the hosted engine (for the data volume). This one change alone has increased my I/O performance by 10x. I don't know why this would not be set or recommended by default, but I'm glad I found it!
>>>>>
>>>>> Thanks for the feedback. Could you log a bug to make it the default, describing the user flow that you used?
>>>>>
>>>>> Also, I would be interested to know how you prepared the gluster volume for use: if it was via the Cockpit deployment UI, the volume options would have been set by default.
>>>>>
>>>>>> - James
>>>>>>
>>>>>> On Thu, Aug 2, 2018 at 2:32 PM, William Dossett <[email protected]> wrote:
>>>>>>
>>>>>>> Yeah, I am just ramping up here, but this project is mostly on my own time and money, hence no SSDs for Gluster… I've already blown close to $500 of my own money on 10Gb ethernet cards and SFPs on eBay, as my company frowns on us getting good deals for equipment on eBay and would rather go to their preferred supplier, where $500 wouldn't even buy half a 10Gb CNA ☹ But I believe in this project, and it feels like it is getting ready for showtime. If I can demo this in a few weeks and get some interest, I'll be asking them to reimburse me, that's for sure!
>>>>>>>
>>>>>>> Hopefully going to get some of the other work off my plate and work on this later this afternoon; will let you know any findings.
>>>>>>>
>>>>>>> Regards
>>>>>>> Bill
>>>>>>>
>>>>>>> *From:* Jayme <[email protected]>
>>>>>>> *Sent:* Thursday, August 2, 2018 11:07 AM
>>>>>>> *To:* William Dossett <[email protected]>
>>>>>>> *Cc:* users <[email protected]>
>>>>>>> *Subject:* Re: [ovirt-users] Tuning and testing GlusterFS performance
>>>>>>>
>>>>>>> Bill,
>>>>>>>
>>>>>>> Appreciate the feedback; I would be interested to hear some of your results. I'm a bit worried about what I'm seeing so far on a very stock 3-node HCI setup: 8mb/sec on that dd test mentioned in the original post, from within a VM (which may be explained by bad testing methods or some other configuration considerations).
>>>>>>> What is more worrisome to me is that I tried another dd test, to time creating a 32GB file. It was taking a long time, so I killed the process, and the VM basically locked up on me; I couldn't access it or the console, and eventually had to do a hard shutdown of the VM to recover.
>>>>>>>
>>>>>>> I don't plan to host many VMs, probably around 15. They aren't super demanding servers, but some do read/write big directories: working with github repos and large node_modules folders, rsyncs of fairly large dirs, etc. I'm definitely going to have to do a lot more testing before I can be assured enough to put any important VMs on this cluster.
>>>>>>>
>>>>>>> - James
>>>>>>>
>>>>>>> On Thu, Aug 2, 2018 at 1:54 PM, William Dossett <[email protected]> wrote:
>>>>>>>
>>>>>>> I usually look at IOPs using IOMeter… you usually want several workers running reads and writes in different threads at the same time. You can run Dynamo on a Linux instance and then connect it to a Windows GUI running IOMeter to give you stats. I was getting around 250 IOPs on JBOD SATA 7200rpm drives, which isn't bad for cheap and cheerful SATA drives.
>>>>>>>
>>>>>>> As I said, I've worked with HCI in VMware now for a couple of years, intensely this last year when we had some defective Dell hardware and were trying to diagnose the problem. Since then the hardware has been completely replaced with an all-flash solution. When I got the all-flash solution I used IOMeter on it and was only getting around 3000 IOPs on enterprise flash disks… not exactly stellar, but OK for one VM. The trick there was the scale-out. There is a VMware Fling called HCIBench; it's very cool in that you spin up one VM and it then spawns 40 more VMs across the cluster. I could then use VSAN Observer, and it showed my hosts were actually doing 30K IOPs on average, which is absolutely stellar performance.
>>>>>>>
>>>>>>> Anyway, the moral of the story there was that your one VM may seem quick, but not what you would expect from flash… yet as you add more VMs to the cluster, all doing workloads, it scales out beautifully, and the read/write speed does not slow down as you add more load. I'm hoping that's what we are going to see with Gluster.
>>>>>>>
>>>>>>> Also, you are using mb nomenclature below: is that Mb, or MB? I am sort of assuming MB, megabytes per second… it does not seem very fast. I'm probably not going to get to work more on my cluster today, as I've got other projects that I need to get done on time, but I want to try to get some templates up and running and do some more testing either tomorrow or this weekend, see what I get in just basic write MB/s, and let you know.
>>>>>>>
>>>>>>> Regards
>>>>>>> Bill
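(A scriptable alternative to the IOMeter/Dynamo setup described above is fio, which likewise runs several workers in parallel; a rough sketch, assuming fio is installed in the test VM, with parameters chosen purely for illustration:

    # Four parallel workers doing 70/30 random read/write with O_DIRECT,
    # roughly mimicking a multi-worker IOMeter profile
    fio --name=iops-test --filename=/tmp/fiotest --size=1g \
        --ioengine=libaio --direct=1 --rw=randrw --rwmixread=70 \
        --bs=4k --iodepth=16 --numjobs=4 --time_based --runtime=60 \
        --group_reporting

The aggregate IOPS figures it reports are broadly comparable to what IOMeter shows per worker.)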
>>>>>>> *From:* Jayme <[email protected]>
>>>>>>> *Sent:* Thursday, August 2, 2018 8:12 AM
>>>>>>> *To:* users <[email protected]>
>>>>>>> *Subject:* [ovirt-users] Tuning and testing GlusterFS performance
>>>>>>>
>>>>>>> So I've finally completed my first HCI build, using the below configuration:
>>>>>>>
>>>>>>> 3x Dell PowerEdge R720
>>>>>>> 2x 2.9 GHz 8-core E5-2690
>>>>>>> 256GB RAM
>>>>>>> 2x 250GB SSD RAID 1 (boot/OS)
>>>>>>> 2x 2TB SSD JBOD passthrough (used for gluster bricks)
>>>>>>> 1GbE NIC for management, 10GbE NIC for Gluster
>>>>>>>
>>>>>>> Using Replica 3 with no arbiter.
>>>>>>>
>>>>>>> Installed the latest version of oVirt available at the time, 4.2.5. Created the recommended volumes (with an additional data volume on the second SSD). Not using VDO.
>>>>>>>
>>>>>>> The first thing I did was set up the glusterFS network on the 10GbE NIC and set it to be used for glusterFS and migration traffic.
>>>>>>>
>>>>>>> I've set up a single test VM using CentOS 7 minimal on the default "x-large instance" profile.
>>>>>>>
>>>>>>> Within this VM, if I do a very basic write test using something like:
>>>>>>>
>>>>>>> dd bs=1M count=256 if=/dev/zero of=test conv=fdatasync
>>>>>>>
>>>>>>> I'm seeing quite slow speeds, only 8mb/sec.
>>>>>>>
>>>>>>> If I do the same from one of the hosts' gluster mounts, i.e.:
>>>>>>>
>>>>>>> host1: /rhev/data-center/mnt/glusterSD/HOST:data
>>>>>>>
>>>>>>> I get about 30mb/sec (which still seems fairly low?).
>>>>>>>
>>>>>>> Am I testing incorrectly here? Is there anything I should be tuning on the Gluster volumes to increase performance with SSDs? Where can I find out where the bottleneck is here, or is this expected performance of Gluster?
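(A note on the dd methodology above: conv=fdatasync writes through the page cache and flushes once at the end, so the reported rate includes the final commit to storage. A variant sometimes run alongside it, assuming the mount supports O_DIRECT, bypasses the client page cache on every block and can help separate caching effects from backend throughput:

    # Every 1MB block goes straight through to storage, no page cache
    dd if=/dev/zero of=test bs=1M count=256 oflag=direct

Running both from the VM disk and from the gluster fuse mount, as done above, can help localize where the bottleneck sits.)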
_______________________________________________
Users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/[email protected]/message/DJ5E34NTGCV6XRWJFFDBWKFSGYSKKUCE/

