One more interesting thing to note: as a test I just re-ran dd on the engine VM and got around 3-5MB/sec average writes. I had not previously set this volume to optimize for virt store, so I went ahead and set that option, and now I'm getting 50MB/sec writes.

However, if I compare my gluster engine volume info now vs. what I posted in the reply above (before I made the optimize change), all the gluster options are identical; not one value has changed as far as I can see. What exactly is the "optimize for virt store" option in the admin GUI doing?
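(For reference: "gluster volume info" only lists options that have been explicitly reconfigured, while "gluster volume get <volume> all" prints the effective value of every option, defaults included, so it can reveal changes the info output hides. A minimal sketch for diffing the two states, assuming the gluster CLI is available on one of the hosts:

    # Snapshot every effective option before toggling the GUI setting
    gluster volume get engine all > /tmp/engine-opts-before.txt

    # ... toggle "Optimize for Virt Store" in the admin GUI ...

    gluster volume get engine all > /tmp/engine-opts-after.txt

    # Anything the GUI changed on the gluster side will show up here
    diff /tmp/engine-opts-before.txt /tmp/engine-opts-after.txt

An empty diff would suggest the GUI change took effect somewhere other than the gluster volume options.)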
On Sat, Aug 4, 2018 at 10:29 AM, Jayme <[email protected]> wrote:

> One more note on this. I only set optimize for virt on the data volumes; I did not set it on the engine volume, and wasn't sure if I should. My dd tests on the engine VM are writing at ~8Mb/sec (like my test VM on the data volume was before I made the change). Is it recommended to use optimize for virt on the engine volume as well?
>
> On Sat, Aug 4, 2018 at 10:26 AM, Jayme <[email protected]> wrote:
>
>> Interesting that it should have been set by cockpit but seemingly wasn't (at least it did not appear so in my case, as setting optimize for virt increased performance dramatically). I did indeed use the cockpit to deploy. I was using oVirt Node on all three hosts, a recent download/burn of 4.2.5. Here is my current gluster volume info, if it's helpful to anyone:
>>
>> Volume Name: data
>> Type: Replicate
>> Volume ID: 1428c3d3-8a51-4e45-a7bb-86b3bde8b6ea
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x 3 = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: MASKED:/gluster_bricks/data/data
>> Brick2: MASKED:/gluster_bricks/data/data
>> Brick3: MASKED:/gluster_bricks/data/data
>> Options Reconfigured:
>> features.barrier: disable
>> server.allow-insecure: on
>> cluster.granular-entry-heal: enable
>> performance.strict-o-direct: on
>> network.ping-timeout: 30
>> storage.owner-gid: 36
>> storage.owner-uid: 36
>> user.cifs: off
>> features.shard: on
>> cluster.shd-wait-qlength: 10000
>> cluster.shd-max-threads: 8
>> cluster.locking-scheme: granular
>> cluster.data-self-heal-algorithm: full
>> cluster.server-quorum-type: server
>> cluster.quorum-type: auto
>> cluster.eager-lock: enable
>> network.remote-dio: enable
>> performance.low-prio-threads: 32
>> performance.io-cache: off
>> performance.read-ahead: off
>> performance.quick-read: off
>> transport.address-family: inet
>> nfs.disable: on
>> performance.client-io-threads: off
>>
>> Volume Name: data2
>> Type: Replicate
>> Volume ID: e97a2e9c-cd47-4f18-b2c2-32d917a8c016
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x 3 = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: MASKED:/gluster_bricks/data2/data2
>> Brick2: MASKED:/gluster_bricks/data2/data2
>> Brick3: MASKED:/gluster_bricks/data2/data2
>> Options Reconfigured:
>> server.allow-insecure: on
>> cluster.granular-entry-heal: enable
>> performance.strict-o-direct: on
>> network.ping-timeout: 30
>> storage.owner-gid: 36
>> storage.owner-uid: 36
>> user.cifs: off
>> features.shard: on
>> cluster.shd-wait-qlength: 10000
>> cluster.shd-max-threads: 8
>> cluster.locking-scheme: granular
>> cluster.data-self-heal-algorithm: full
>> cluster.server-quorum-type: server
>> cluster.quorum-type: auto
>> cluster.eager-lock: enable
>> network.remote-dio: enable
>> performance.low-prio-threads: 32
>> performance.io-cache: off
>> performance.read-ahead: off
>> performance.quick-read: off
>> transport.address-family: inet
>> nfs.disable: on
>> performance.client-io-threads: off
>> Volume Name: engine
>> Type: Replicate
>> Volume ID: ae465791-618c-4075-b68c-d4972a36d0b9
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x 3 = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: MASKED:/gluster_bricks/engine/engine
>> Brick2: MASKED:/gluster_bricks/engine/engine
>> Brick3: MASKED:/gluster_bricks/engine/engine
>> Options Reconfigured:
>> cluster.granular-entry-heal: enable
>> performance.strict-o-direct: on
>> network.ping-timeout: 30
>> storage.owner-gid: 36
>> storage.owner-uid: 36
>> user.cifs: off
>> features.shard: on
>> cluster.shd-wait-qlength: 10000
>> cluster.shd-max-threads: 8
>> cluster.locking-scheme: granular
>> cluster.data-self-heal-algorithm: full
>> cluster.server-quorum-type: server
>> cluster.quorum-type: auto
>> cluster.eager-lock: enable
>> network.remote-dio: off
>> performance.low-prio-threads: 32
>> performance.io-cache: off
>> performance.read-ahead: off
>> performance.quick-read: off
>> transport.address-family: inet
>> nfs.disable: on
>> performance.client-io-threads: off
>>
>> Volume Name: vmstore
>> Type: Replicate
>> Volume ID: 7065742b-c09d-410b-9e89-174ade4fc3f5
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x 3 = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: MASKED:/gluster_bricks/vmstore/vmstore
>> Brick2: MASKED:/gluster_bricks/vmstore/vmstore
>> Brick3: MASKED:/gluster_bricks/vmstore/vmstore
>> Options Reconfigured:
>> cluster.granular-entry-heal: enable
>> performance.strict-o-direct: on
>> network.ping-timeout: 30
>> storage.owner-gid: 36
>> storage.owner-uid: 36
>> user.cifs: off
>> features.shard: on
>> cluster.shd-wait-qlength: 10000
>> cluster.shd-max-threads: 8
>> cluster.locking-scheme: granular
>> cluster.data-self-heal-algorithm: full
>> cluster.server-quorum-type: server
>> cluster.quorum-type: auto
>> cluster.eager-lock: enable
>> network.remote-dio: off
>> performance.low-prio-threads: 32
>> performance.io-cache: off
>> performance.read-ahead: off
>> performance.quick-read: off
>> transport.address-family: inet
>> nfs.disable: on
>> performance.client-io-threads: off
>>
>> Volume Name: vmstore2
>> Type: Replicate
>> Volume ID: 6f9a1c51-c0bc-46ad-b94a-fc2989a36e0c
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x 3 = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: MASKED:/gluster_bricks/vmstore2/vmstore2
>> Brick2: MASKED:/gluster_bricks/vmstore2/vmstore2
>> Brick3: MASKED:/gluster_bricks/vmstore2/vmstore2
>> Options Reconfigured:
>> cluster.granular-entry-heal: enable
>> performance.strict-o-direct: on
>> network.ping-timeout: 30
>> storage.owner-gid: 36
>> storage.owner-uid: 36
>> user.cifs: off
>> features.shard: on
>> cluster.shd-wait-qlength: 10000
>> cluster.shd-max-threads: 8
>> cluster.locking-scheme: granular
>> cluster.data-self-heal-algorithm: full
>> cluster.server-quorum-type: server
>> cluster.quorum-type: auto
>> cluster.eager-lock: enable
>> network.remote-dio: off
>> performance.low-prio-threads: 32
>> performance.io-cache: off
>> performance.read-ahead: off
>> performance.quick-read: off
>> transport.address-family: inet
>> nfs.disable: on
>> performance.client-io-threads: off
>>
>> On Fri, Aug 3, 2018 at 6:53 AM, Sahina Bose <[email protected]> wrote:
>>
>>> On Fri, 3 Aug 2018 at 3:07 PM, Jayme <[email protected]> wrote:
>>>
>>>> Hello,
>>>>
>>>> The option to optimize for virt store is tough to find (in my opinion): you have to go to Volumes > volume name and then click the two dots in the top right to expand further options before you see it. No one would know to find it (or that it even exists) if they weren't specifically looking.
>>>>
>>>> I don't know enough about it, but my assumption is that there are reasons why it's not set by default (it might not, or should not, need to apply to every volume created). My suggestion would be to include it in the cockpit as a selectable option next to each volume you create, with a hint suggesting that, for best performance, it be selected for any volume that is going to be a data volume for VMs.
>>>
>>> If you have installed via Cockpit, the options are set. Can you provide the "gluster volume info" output after you optimised for virt?
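(For anyone who wants to apply or verify this from the command line: gluster ships a predefined option group named "virt", and the GUI option is commonly described as applying that group plus the vdsm/kvm ownership. That equivalence is an assumption worth checking against your own version, but a rough sketch:

    # List the options the predefined "virt" group would set on this install
    cat /var/lib/glusterd/groups/virt

    # Apply the group and the ownership oVirt expects (vdsm uid/gid = 36)
    gluster volume set data group virt
    gluster volume set data storage.owner-uid 36
    gluster volume set data storage.owner-gid 36

Comparing that group file against the "Options Reconfigured" dumps above is a quick way to see whether a volume already carries the virt profile.)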
>>>> I simply installed using the latest node ISO / default cockpit deployment.
>>>>
>>>> Hope this helps!
>>>>
>>>> - Jayme
>>>>
>>>> On Fri, Aug 3, 2018 at 5:15 AM, Sahina Bose <[email protected]> wrote:
>>>>
>>>>> On Fri, Aug 3, 2018 at 5:25 AM, Jayme <[email protected]> wrote:
>>>>>
>>>>>> Bill,
>>>>>>
>>>>>> I thought I'd let you (and others) know this, as it might save you some headaches. I found that my performance problem was resolved by clicking the "optimize for virt store" option in the volume settings of the hosted engine (for the data volume). This one change alone has increased my I/O performance by 10x. I don't know why this would not be set or recommended by default, but I'm glad I found it!
>>>>>
>>>>> Thanks for the feedback. Could you log a bug to make it the default, describing the user flow that you used?
>>>>>
>>>>> Also, I would be interested to know how you prepared the gluster volume for use: if it was via the Cockpit deployment UI, the volume options would have been set by default.
>>>>>
>>>>>> - James
>>>>>>
>>>>>> On Thu, Aug 2, 2018 at 2:32 PM, William Dossett <[email protected]> wrote:
>>>>>>
>>>>>>> Yeah, I am just ramping up here, but this project is mostly on my own time and money, hence no SSDs for Gluster… I've already blown close to $500 of my own money on 10Gb ethernet cards and SFPs on eBay, as my company frowns on us getting good deals for equipment on eBay and would rather go to their preferred supplier, where $500 wouldn't even buy half a 10Gb CNA ☹ But I believe in this project, and it feels like it is getting ready for showtime. If I can demo this in a few weeks and get some interest, I'll be asking them to reimburse me, that's for sure!
>>>>>>>
>>>>>>> Hopefully going to get some of the other work off my plate and work on this later this afternoon; will let you know any findings.
>>>>>>>
>>>>>>> Regards
>>>>>>> Bill
>>>>>>>
>>>>>>> *From:* Jayme <[email protected]>
>>>>>>> *Sent:* Thursday, August 2, 2018 11:07 AM
>>>>>>> *To:* William Dossett <[email protected]>
>>>>>>> *Cc:* users <[email protected]>
>>>>>>> *Subject:* Re: [ovirt-users] Tuning and testing GlusterFS performance
>>>>>>>
>>>>>>> Bill,
>>>>>>>
>>>>>>> Appreciate the feedback; I would be interested to hear some of your results. I'm a bit worried about what I'm seeing so far on a very stock 3-node HCI setup: 8mb/sec on that dd test mentioned in the original post, from within a VM (which may be explained by bad testing methods or some other configuration considerations).
>>>>>>> What is more worrisome to me is that I tried another dd test, to time creating a 32GB file. It was taking a long time, so I killed the process, and the VM basically locked up on me; I couldn't access it or the console, and eventually had to do a hard shutdown of the VM to recover.
>>>>>>>
>>>>>>> I don't plan to host many VMs, probably around 15. They aren't super demanding servers, but some do read/write big directories: working with github repos and large node_modules folders, rsyncs of fairly large dirs, etc. I'm definitely going to have to do a lot more testing before I can be assured enough to put any important VMs on this cluster.
>>>>>>>
>>>>>>> - James
>>>>>>>
>>>>>>> On Thu, Aug 2, 2018 at 1:54 PM, William Dossett <[email protected]> wrote:
>>>>>>>
>>>>>>> I usually look at IOPs using IOMeter… you usually want several workers running reads and writes in different threads at the same time. You can run Dynamo on a Linux instance and then connect it to a Windows GUI running IOMeter to give you stats. I was getting around 250 IOPs on JBOD SATA 7200rpm drives, which isn't bad for cheap and cheerful SATA drives.
>>>>>>>
>>>>>>> As I said, I've worked with HCI in VMware now for a couple of years, intensely this last year when we had some defective Dell hardware and were trying to diagnose the problem. Since then the hardware has been completely replaced with an all-flash solution. When I got the all-flash solution I used IOMeter on it and was only getting around 3000 IOPs on enterprise flash disks… not exactly stellar, but OK for one VM. The trick there was the scale-out. There is a VMware Fling called HCIBench; it's very cool in that you spin up one VM and it then spawns 40 more VMs across the cluster. I could then use VSAN Observer, and it showed my hosts were actually doing 30K IOPs on average, which is absolutely stellar performance.
>>>>>>>
>>>>>>> Anyway, the moral of the story there was that your one VM may seem quick, but not what you would expect from flash… yet as you add more VMs to the cluster, all doing workloads, it scales out beautifully, and the read/write speed does not slow down as you add more load. I'm hoping that's what we are going to see with Gluster.
>>>>>>>
>>>>>>> Also, you are using mb nomenclature below: is that Mb, or MB? I am sort of assuming MB, megabytes per second… it does not seem very fast. I'm probably not going to get to work more on my cluster today, as I've got other projects that I need to get done on time, but I want to try to get some templates up and running and do some more testing either tomorrow or this weekend, see what I get in just basic write MB/s, and let you know.
>>>>>>>
>>>>>>> Regards
>>>>>>> Bill
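(A scriptable alternative to the IOMeter/Dynamo setup described above is fio, which likewise runs several workers in parallel; a rough sketch, assuming fio is installed in the test VM, with parameters chosen purely for illustration:

    # Four parallel workers doing 70/30 random read/write with O_DIRECT,
    # roughly mimicking a multi-worker IOMeter profile
    fio --name=iops-test --filename=/tmp/fiotest --size=1g \
        --ioengine=libaio --direct=1 --rw=randrw --rwmixread=70 \
        --bs=4k --iodepth=16 --numjobs=4 --time_based --runtime=60 \
        --group_reporting

The aggregate IOPS figures it reports are broadly comparable to what IOMeter shows per worker.)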
>>>>>>> *From:* Jayme <[email protected]>
>>>>>>> *Sent:* Thursday, August 2, 2018 8:12 AM
>>>>>>> *To:* users <[email protected]>
>>>>>>> *Subject:* [ovirt-users] Tuning and testing GlusterFS performance
>>>>>>>
>>>>>>> So I've finally completed my first HCI build, using the below configuration:
>>>>>>>
>>>>>>> 3x Dell PowerEdge R720
>>>>>>> 2x 2.9 GHz 8-core E5-2690
>>>>>>> 256GB RAM
>>>>>>> 2x 250GB SSD RAID 1 (boot/OS)
>>>>>>> 2x 2TB SSD JBOD passthrough (used for gluster bricks)
>>>>>>> 1GbE NIC for management, 10GbE NIC for Gluster
>>>>>>>
>>>>>>> Using Replica 3 with no arbiter.
>>>>>>>
>>>>>>> Installed the latest version of oVirt available at the time, 4.2.5. Created the recommended volumes (with an additional data volume on the second SSD). Not using VDO.
>>>>>>>
>>>>>>> The first thing I did was set up the glusterFS network on the 10GbE NIC and set it to be used for glusterFS and migration traffic.
>>>>>>>
>>>>>>> I've set up a single test VM using CentOS 7 minimal on the default "x-large instance" profile.
>>>>>>>
>>>>>>> Within this VM, if I do a very basic write test using something like:
>>>>>>>
>>>>>>> dd bs=1M count=256 if=/dev/zero of=test conv=fdatasync
>>>>>>>
>>>>>>> I'm seeing quite slow speeds, only 8mb/sec.
>>>>>>>
>>>>>>> If I do the same from one of the hosts' gluster mounts, i.e.:
>>>>>>>
>>>>>>> host1: /rhev/data-center/mnt/glusterSD/HOST:data
>>>>>>>
>>>>>>> I get about 30mb/sec (which still seems fairly low?).
>>>>>>>
>>>>>>> Am I testing incorrectly here? Is there anything I should be tuning on the Gluster volumes to increase performance with SSDs? Where can I find out where the bottleneck is here, or is this expected performance of Gluster?
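(A note on the dd methodology above: conv=fdatasync writes through the page cache and flushes once at the end, so the reported rate includes the final commit to storage. A variant sometimes run alongside it, assuming the mount supports O_DIRECT, bypasses the client page cache on every block and can help separate caching effects from backend throughput:

    # Every 1MB block goes straight through to storage, no page cache
    dd if=/dev/zero of=test bs=1M count=256 oflag=direct

Running both from the VM disk and from the gluster fuse mount, as done above, can help localize where the bottleneck sits.)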
_______________________________________________
Users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/[email protected]/message/DJ5E34NTGCV6XRWJFFDBWKFSGYSKKUCE/

