I wish I knew, or was able to get, a detailed description of those
options myself. Here is one discussion of direct-io-mode:
https://serverfault.com/questions/517775/glusterfs-direct-i-o-mode

Same as you, I ran tests on a large volume of files and found that the
main delays were in attribute calls, which is how I ended up with those
mount options to improve performance. I discovered them basically by
googling this users list for people sharing their test results.

I'm not sure I share your optimism about 4.0: rather than going up, I
downgraded to 3.12 and have no directory-view issue now, though I had to
recreate the cluster and re-add bricks with the existing data.
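For reference, with those options a localhost mount line in /etc/fstab
would look something like this (volume name and mount point are just
examples taken from your earlier messages, adjust to your setup):

    localhost:/apkmirror_data1 /mnt/apkmirror_data1 glusterfs defaults,_netdev,negative-timeout=10,attribute-timeout=30,fopen-keep-cache,direct-io-mode=enable,fetch-attempts=5 0 0

After editing fstab, unmount and remount the volume to pick the options
up; the active options should show in the mount output.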
On Tue, Apr 10, 2018 at 1:47 AM, Artem Russakovskii <archon...@gmail.com> wrote:
> Hi Vlad,
>
> I'm using only localhost: mounts.
>
> Can you please explain what effect each option has on the performance
> issues shown in my posts?
> "negative-timeout=10,attribute-timeout=30,fopen-keep-cache,direct-io-mode=enable,fetch-attempts=5"
> From what I remember, direct-io-mode=enable didn't make a difference in
> my tests, but I suppose I can try again. The explanations of
> direct-io-mode in various guides on the web are quite confusing, saying
> enabling it could make performance worse in some situations and better
> in others due to the OS file cache.
>
> There are also these gluster volume settings, adding to the confusion:
>
> Option: performance.strict-o-direct
> Default Value: off
> Description: This option when set to off, ignores the O_DIRECT flag.
>
> Option: performance.nfs.strict-o-direct
> Default Value: off
> Description: This option when set to off, ignores the O_DIRECT flag.
>
> Re: 4.0. I moved to 4.0 after finding out that it fixes the disappearing
> dirs bug related to cluster.readdir-optimize, if you remember
> (http://lists.gluster.org/pipermail/gluster-users/2018-April/033830.html).
> I was already on 3.13 by then, and 4.0 resolved the issue. It's been
> stable for me so far, thankfully.
>
> Sincerely,
> Artem
>
> --
> Founder, Android Police <http://www.androidpolice.com>, APK Mirror
> <http://www.apkmirror.com/>, Illogical Robot LLC
> beerpla.net | +ArtemRussakovskii <https://plus.google.com/+ArtemRussakovskii>
> | @ArtemR <http://twitter.com/ArtemR>
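On the O_DIRECT confusion: as I read it, direct-io-mode=enable only
affects the FUSE client, while performance.strict-o-direct and
network.remote-dio decide whether the O_DIRECT flag survives on its way
to the bricks. Note that your config below has network.remote-dio:
enable, which filters O_DIRECT out at the client protocol level, so it
can cancel out the mount option. If you want to experiment with honoring
O_DIRECT end to end, something like this should do it - a sketch I have
not verified on 4.0:

    gluster volume set apkmirror_data1 performance.strict-o-direct on
    gluster volume set apkmirror_data1 network.remote-dio disable

and then remount with direct-io-mode=enable.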
> On Mon, Apr 9, 2018 at 10:38 PM, Vlad Kopylov <vladk...@gmail.com> wrote:
>> you definitely need mount options in /etc/fstab
>> use the ones from here:
>> http://lists.gluster.org/pipermail/gluster-users/2018-April/033811.html
>>
>> I went on with using local mounts to achieve performance as well
>>
>> Also, the 3.12 or 3.10 branches would be preferable for production
>>
>> On Fri, Apr 6, 2018 at 4:12 AM, Artem Russakovskii <archon...@gmail.com> wrote:
>>> Hi again,
>>>
>>> I'd like to expand on the performance issues and plead for help. Here's
>>> one case which shows these odd hiccups:
>>> https://i.imgur.com/CXBPjTK.gifv
>>>
>>> In this GIF, where I switch back and forth between copy operations on
>>> 2 servers, I'm copying a 10GB dir full of .apk and image files.
>>>
>>> On server "hive" I'm copying straight from the main disk to an attached
>>> block volume (xfs). As you can see, the transfers are relatively speedy
>>> and don't hiccup.
>>>
>>> On server "citadel" I'm copying the same set of data to a 4-replica
>>> gluster volume which uses block storage as a brick. As you can see,
>>> performance is much worse, and there are frequent pauses of many
>>> seconds where nothing seems to be happening - it just freezes.
>>>
>>> All 4 servers have the same specs; all of them have performance issues
>>> with gluster and no such issues when raw xfs block storage is used.
>>>
>>> hive has long finished copying the data, while citadel is barely
>>> chugging along and is expected to take half an hour to an hour. I have
>>> over 1TB of data to migrate, and at that scale, if we went live, I'm
>>> not even sure gluster would be able to keep up rather than bringing
>>> the machines and services down.
>>>
>>> Here's the cluster config, though it didn't seem to make any difference
>>> performance-wise before I applied the customizations vs. after:
>>>
>>> Volume Name: apkmirror_data1
>>> Type: Replicate
>>> Volume ID: 11ecee7e-d4f8-497a-9994-ceb144d6841e
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 1 x 4 = 4
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: nexus2:/mnt/nexus2_block1/apkmirror_data1
>>> Brick2: forge:/mnt/forge_block1/apkmirror_data1
>>> Brick3: hive:/mnt/hive_block1/apkmirror_data1
>>> Brick4: citadel:/mnt/citadel_block1/apkmirror_data1
>>> Options Reconfigured:
>>> cluster.quorum-count: 1
>>> cluster.quorum-type: fixed
>>> network.ping-timeout: 5
>>> network.remote-dio: enable
>>> performance.rda-cache-limit: 256MB
>>> performance.readdir-ahead: on
>>> performance.parallel-readdir: on
>>> network.inode-lru-limit: 500000
>>> performance.md-cache-timeout: 600
>>> performance.cache-invalidation: on
>>> performance.stat-prefetch: on
>>> features.cache-invalidation-timeout: 600
>>> features.cache-invalidation: on
>>> cluster.readdir-optimize: on
>>> performance.io-thread-count: 32
>>> server.event-threads: 4
>>> client.event-threads: 4
>>> performance.read-ahead: off
>>> cluster.lookup-optimize: on
>>> performance.cache-size: 1GB
>>> cluster.self-heal-daemon: enable
>>> transport.address-family: inet
>>> nfs.disable: on
>>> performance.client-io-threads: on
>>>
>>> The mounts are done as follows in /etc/fstab:
>>> /dev/disk/by-id/scsi-0Linode_Volume_citadel_block1 /mnt/citadel_block1 xfs defaults 0 2
>>> localhost:/apkmirror_data1 /mnt/apkmirror_data1 glusterfs defaults,_netdev 0 0
>>>
>>> I'm really not sure whether direct-io-mode mount tweaks would do
>>> anything here, what the value should be set to, or what it is by
>>> default.
>>>
>>> The OS is OpenSUSE 42.3, 64-bit. 80GB of RAM, 20 CPUs, hosted by
>>> Linode.
>>>
>>> I'd really appreciate any help in the matter.
>>>
>>> Thank you.
>>>
>>> Sincerely,
>>> Artem
>>>
>>> --
>>> Founder, Android Police <http://www.androidpolice.com>, APK Mirror
>>> <http://www.apkmirror.com/>, Illogical Robot LLC
>>> beerpla.net | +ArtemRussakovskii <https://plus.google.com/+ArtemRussakovskii>
>>> | @ArtemR <http://twitter.com/ArtemR>
>>>
>>> On Thu, Apr 5, 2018 at 11:13 PM, Artem Russakovskii <archon...@gmail.com> wrote:
>>>> Hi,
>>>>
>>>> I'm trying to squeeze performance out of gluster on four 80GB-RAM,
>>>> 20-CPU machines where gluster runs on attached block storage (Linode)
>>>> in 4 replicated bricks, and so far everything I've tried results in
>>>> sub-optimal performance.
>>>>
>>>> There are many files - mostly images, several million - and many
>>>> operations take minutes; copying multiple files (even small ones)
>>>> suddenly freezes up for seconds at a time, then continues; iostat
>>>> frequently shows large r_await and w_await values with 100%
>>>> utilization of the attached block device; etc.
>>>>
>>>> There are many guides out there for small-file performance
>>>> improvements, but more explanation is needed, and I think more tweaks
>>>> should be possible.
>>>>
>>>> My question today is about performance.cache-size. Is this the size
>>>> of a cache in RAM? If so, how do I view the current cache size to see
>>>> whether it gets full and I should increase it? Is it advisable to
>>>> bump it up if I have many tens of gigs of RAM free?
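Replying inline on performance.cache-size: as far as I understand it,
that is a RAM cache held by each client that mounts the volume (it is
consumed by the caching translators, io-cache among them). I don't know
of a command that shows how full it is, but you can at least confirm and
change the configured value per volume - a sketch using your volume name:

    gluster volume get apkmirror_data1 performance.cache-size
    gluster volume set apkmirror_data1 performance.cache-size 2GB

With tens of gigs of RAM free, raising it should be safe; just keep in
mind that each client mount keeps its own cache, so the memory cost is
per mount.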
>>>> More generally, in the last 2 months since I first started working
>>>> with gluster and set a production system live, I've been feeling
>>>> frustrated, because Gluster has a lot of poorly documented and
>>>> confusing options. I really wish the documentation could be improved
>>>> with examples and better explanations.
>>>>
>>>> Specifically, it'd be absolutely amazing if the docs offered a
>>>> strategy for setting each value and ways of determining more optimal
>>>> values. For example, for performance.cache-size, if the docs said
>>>> something like "run command abc to see your current cache size, and
>>>> if it's hurting, raise it, but be aware that it's limited by RAM,"
>>>> that would already be a huge improvement. And so on with the other
>>>> options.
>>>>
>>>> The gluster team is quite helpful on this mailing list, but in a
>>>> reactive rather than proactive way. Perhaps it's tunnel vision once
>>>> you've worked on a project for so long, where less technical
>>>> explanations and even proper documentation of options take a back
>>>> seat, but I encourage you to be more proactive about helping us
>>>> understand and optimize Gluster.
>>>>
>>>> Thank you.
>>>>
>>>> Sincerely,
>>>> Artem
>>>>
>>>> --
>>>> Founder, Android Police <http://www.androidpolice.com>, APK Mirror
>>>> <http://www.apkmirror.com/>, Illogical Robot LLC
>>>> beerpla.net | +ArtemRussakovskii <https://plus.google.com/+ArtemRussakovskii>
>>>> | @ArtemR <http://twitter.com/ArtemR>
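If you want to see the attribute-call overhead for yourself, timing a
few stat-heavy operations on the mount before and after changing the
mount options is usually enough - plain shell, paths are placeholders:

    # metadata-heavy: mostly LOOKUP/STAT traffic, little data transfer
    time ls -lR /mnt/apkmirror_data1 > /dev/null

    # mixed create/write, similar to your copy test
    time cp -a /path/to/sample_dir /mnt/apkmirror_data1/perftest

    # watch r_await/w_await on the brick's block device while it runs
    iostat -x 1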