OK, thank you, I'll try that. The reason I was confused about its status was these entries in the doc:
    How To Test: TBD
    Documentation: TBD
    Status: Design complete. Implementation done. The only thing pending is the compounding of two fops in shd code.

Sincerely,
Artem

--
Founder, Android Police <http://www.androidpolice.com>, APK Mirror <http://www.apkmirror.com/>, Illogical Robot LLC
beerpla.net | +ArtemRussakovskii <https://plus.google.com/+ArtemRussakovskii> | @ArtemR <http://twitter.com/ArtemR>


On Tue, Apr 17, 2018 at 11:49 PM, Ravishankar N <ravishan...@redhat.com> wrote:

On 04/18/2018 11:59 AM, Artem Russakovskii wrote:

> Btw, I've now noticed at least 5 variations in toggling binary option values. Are they all interchangeable, or will using the wrong value not work in some cases?
>
> yes/no
> true/false
> True/False
> on/off
> enable/disable
>
> It's quite a confusing/inconsistent practice, especially given that many options will accept any value without erroring out/validation.

All these options are okay.


On Tue, Apr 17, 2018 at 11:22 PM, Artem Russakovskii <archon...@gmail.com> wrote:

> Thanks for the link. Looking at the status of that doc, it isn't quite ready yet, and there's no mention of the option.

No, this is a completed feature available since 3.8 IIRC. You can use it safely. There is a difference in how to enable it though. Instead of using 'gluster volume set ...', you need to use 'gluster volume heal <volname> granular-entry-heal enable' to turn it on. If there are no pending heals, it will run successfully. Otherwise you need to wait until heals are over (i.e. heal info shows zero entries). Just follow what the CLI says and you should be fine.

-Ravi

> Does it mean that whatever is ready now in 4.0.1 is incomplete but can be enabled via granular-entry-heal=on, and when it is complete, it'll become the default and the flag will simply go away?
>
> Is there any risk enabling the option now in 4.0.1?
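Spelled out as commands, the enable sequence described above would look something like this (the androidpolice_data3 volume that appears later in the thread is used purely as an example):

    # First confirm that no heals are pending; every brick should report zero entries.
    gluster volume heal androidpolice_data3 info

    # Then switch that volume to granular entry self-heal.
    gluster volume heal androidpolice_data3 granular-entry-heal enable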
The "diff" algorithm copies to sink >>>> only those blocks whose checksums don't match with those of source. If no >>>> option is configured the option is chosen dynamically as follows: If the >>>> file does not exist on one of the sinks or empty file exists or if the >>>> source file size is about the same as page size the entire file will be >>>> read and written i.e "full" algo, otherwise "diff" algo is chosen. >>> >>> >>> I really have no idea what this means and how/why it would help. Any >>> more info on this option? >>> >>> >>> https://github.com/gluster/glusterfs-specs/blob/master/done/ >>> GlusterFS%203.8/granular-entry-self-healing.md should help. >>> Regards, >>> Ravi >>> >>> >>> Option: cluster.granular-entry-heal >>>> Default Value: no >>>> Description: If this option is enabled, self-heal will resort to >>>> granular way of recording changelogs and doing entry self-heal. >>> >>> >>> Thank you. >>> >>> >>> Sincerely, >>> Artem >>> >>> -- >>> Founder, Android Police <http://www.androidpolice.com>, APK Mirror >>> <http://www.apkmirror.com/>, Illogical Robot LLC >>> beerpla.net | +ArtemRussakovskii >>> <https://plus.google.com/+ArtemRussakovskii> | @ArtemR >>> <http://twitter.com/ArtemR> >>> >>> On Tue, Apr 17, 2018 at 9:58 PM, Ravishankar N <ravishan...@redhat.com> >>> wrote: >>> >>>> >>>> On 04/18/2018 10:14 AM, Artem Russakovskii wrote: >>>> >>>> Following up here on a related and very serious for us issue. >>>> >>>> I took down one of the 4 replicate gluster servers for maintenance >>>> today. There are 2 gluster volumes totaling about 600GB. Not that much >>>> data. After the server comes back online, it starts auto healing and pretty >>>> much all operations on gluster freeze for many minutes. >>>> >>>> For example, I was trying to run an ls -alrt in a folder with 7300 >>>> files, and it took a good 15-20 minutes before returning. >>>> >>>> During this time, I can see iostat show 100% utilization on the brick, >>>> heal status takes many minutes to return, glusterfsd uses up tons of CPU (I >>>> saw it spike to 600%). gluster already has massive performance issues for >>>> me, but healing after a 4-hour downtime is on another level of bad perf. >>>> >>>> For example, this command took many minutes to run: >>>> >>>> gluster volume heal androidpolice_data3 info summary >>>> Brick nexus2:/mnt/nexus2_block4/androidpolice_data3 >>>> Status: Connected >>>> Total Number of entries: 91 >>>> Number of entries in heal pending: 90 >>>> Number of entries in split-brain: 0 >>>> Number of entries possibly healing: 1 >>>> >>>> Brick forge:/mnt/forge_block4/androidpolice_data3 >>>> Status: Connected >>>> Total Number of entries: 87 >>>> Number of entries in heal pending: 86 >>>> Number of entries in split-brain: 0 >>>> Number of entries possibly healing: 1 >>>> >>>> Brick hive:/mnt/hive_block4/androidpolice_data3 >>>> Status: Connected >>>> Total Number of entries: 87 >>>> Number of entries in heal pending: 86 >>>> Number of entries in split-brain: 0 >>>> Number of entries possibly healing: 1 >>>> >>>> Brick citadel:/mnt/citadel_block4/androidpolice_data3 >>>> Status: Connected >>>> Total Number of entries: 0 >>>> Number of entries in heal pending: 0 >>>> Number of entries in split-brain: 0 >>>> Number of entries possibly healing: 0 >>>> >>>> >>>> Statistics showed a diminishing number of failed heals: >>>> ... >>>> Ending time of crawl: Tue Apr 17 21:13:08 2018 >>>> >>>> Type of crawl: INDEX >>>> No. of entries healed: 2 >>>> No. of entries in split-brain: 0 >>>> No. 
On Tue, Apr 17, 2018 at 9:58 PM, Ravishankar N <ravishan...@redhat.com> wrote:

On 04/18/2018 10:14 AM, Artem Russakovskii wrote:

> Following up here on a related and, for us, very serious issue.
>
> I took down one of the 4 replicate gluster servers for maintenance today. There are 2 gluster volumes totaling about 600GB, so not that much data. After the server comes back online, it starts auto healing, and pretty much all operations on gluster freeze for many minutes.
>
> For example, I was trying to run an ls -alrt in a folder with 7300 files, and it took a good 15-20 minutes before returning.
>
> During this time, I can see iostat show 100% utilization on the brick, heal status takes many minutes to return, and glusterfsd uses up tons of CPU (I saw it spike to 600%). gluster already has massive performance issues for me, but healing after a 4-hour downtime is on another level of bad perf.
>
> For example, this command took many minutes to run:
>
>     gluster volume heal androidpolice_data3 info summary
>     Brick nexus2:/mnt/nexus2_block4/androidpolice_data3
>     Status: Connected
>     Total Number of entries: 91
>     Number of entries in heal pending: 90
>     Number of entries in split-brain: 0
>     Number of entries possibly healing: 1
>
>     Brick forge:/mnt/forge_block4/androidpolice_data3
>     Status: Connected
>     Total Number of entries: 87
>     Number of entries in heal pending: 86
>     Number of entries in split-brain: 0
>     Number of entries possibly healing: 1
>
>     Brick hive:/mnt/hive_block4/androidpolice_data3
>     Status: Connected
>     Total Number of entries: 87
>     Number of entries in heal pending: 86
>     Number of entries in split-brain: 0
>     Number of entries possibly healing: 1
>
>     Brick citadel:/mnt/citadel_block4/androidpolice_data3
>     Status: Connected
>     Total Number of entries: 0
>     Number of entries in heal pending: 0
>     Number of entries in split-brain: 0
>     Number of entries possibly healing: 0
>
> Statistics showed a diminishing number of failed heals:
>
>     ...
>     Ending time of crawl: Tue Apr 17 21:13:08 2018
>     Type of crawl: INDEX
>     No. of entries healed: 2
>     No. of entries in split-brain: 0
>     No. of heal failed entries: 102
>
>     Starting time of crawl: Tue Apr 17 21:13:09 2018
>     Ending time of crawl: Tue Apr 17 21:14:30 2018
>     Type of crawl: INDEX
>     No. of entries healed: 4
>     No. of entries in split-brain: 0
>     No. of heal failed entries: 91
>
>     Starting time of crawl: Tue Apr 17 21:14:31 2018
>     Ending time of crawl: Tue Apr 17 21:15:34 2018
>     Type of crawl: INDEX
>     No. of entries healed: 0
>     No. of entries in split-brain: 0
>     No. of heal failed entries: 88
>     ...
>
> Eventually, everything heals and goes back to at least where the roof isn't on fire anymore.
>
> The server stats and volume options were given in one of the previous replies to this thread.
>
> Any ideas or things I could run and show the output of to help diagnose? I'm also very open to working with someone on the team on a live debugging session if there's interest.
>
> Thank you.

It is likely that self-heal is causing the CPU spike due to the flood of lookups, locks and checksum fops that the self-heal daemon sends to the bricks.

There's a script to control shd's CPU usage using cgroups; that should help in regulating self-heal traffic: https://review.gluster.org/#/c/18404/ (see extras/control-cpu-load.sh).

Other self-heal related volume options that you could change are setting 'cluster.data-self-heal-algorithm' to 'full' and 'granular-entry-heal' to 'enable'. `gluster volume set help` should give you more information about these options.

Thanks,
Ravi
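For reference, the cgroup approach comes down to capping how much CPU the shd process may use. A hand-rolled sketch of the idea (cgroup v1; the group name and the 25% quota are made-up illustration values, and the linked script should be preferred since it handles the details):

    # Find the self-heal daemon's PID (assumes a single match).
    shd_pid=$(pgrep -f glustershd)

    # Create a cpu cgroup and cap it at roughly 25% of one core
    # (quota is in microseconds per 100 ms period).
    mkdir -p /sys/fs/cgroup/cpu/glustershd_limit
    echo 25000 > /sys/fs/cgroup/cpu/glustershd_limit/cpu.cfs_quota_us

    # Move the daemon into that cgroup.
    echo "$shd_pid" > /sys/fs/cgroup/cpu/glustershd_limit/cgroup.procs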
On Tue, Apr 10, 2018 at 9:56 AM, Artem Russakovskii <archon...@gmail.com> wrote:

Hi Vlad,

I actually saw that post already and even asked a question 4 days ago (https://serverfault.com/questions/517775/glusterfs-direct-i-o-mode#comment1172497_540917). The accepted answer also seems to go against your suggestion to enable direct-io-mode, as it says it should be disabled for better performance when used just for file accesses.

It'd be great if someone from the Gluster team chimed in about this thread.


On Tue, Apr 10, 2018 at 7:01 AM, Vlad Kopylov <vladk...@gmail.com> wrote:

Wish I knew or was able to get a detailed description of those options myself.

Here is direct-io-mode: https://serverfault.com/questions/517775/glusterfs-direct-i-o-mode

Same as you, I ran tests on a large volume of files and found that the main delays are in attribute calls, ending up with those mount options to add performance. I discovered those options basically by googling this user list, with people sharing their tests.

Not sure I would share your optimism; rather than going up, I downgraded to 3.12 and have no dir view issue now, though I had to recreate the cluster and re-add bricks with existing data.


On Tue, Apr 10, 2018 at 1:47 AM, Artem Russakovskii <archon...@gmail.com> wrote:

Hi Vlad,

I'm using only localhost: mounts.

Can you please explain what effect each option has on the performance issues shown in my posts? "negative-timeout=10,attribute-timeout=30,fopen-keep-cache,direct-io-mode=enable,fetch-attempts=5"

From what I remember, direct-io-mode=enable didn't make a difference in my tests, but I suppose I can try again. The explanations about direct-io-mode in various guides on the web are quite confusing, saying enabling it could make performance worse in some situations and better in others due to the OS file cache.

There are also these gluster volume settings, adding to the confusion:

    Option: performance.strict-o-direct
    Default Value: off
    Description: This option when set to off, ignores the O_DIRECT flag.

    Option: performance.nfs.strict-o-direct
    Default Value: off
    Description: This option when set to off, ignores the O_DIRECT flag.

Re: 4.0. I moved to 4.0 after finding out that it fixes the disappearing dirs bug related to cluster.readdir-optimize, if you remember (http://lists.gluster.org/pipermail/gluster-users/2018-April/033830.html). I was already on 3.13 by then, and 4.0 resolved the issue. It's been stable for me so far, thankfully.


On Mon, Apr 9, 2018 at 10:38 PM, Vlad Kopylov <vladk...@gmail.com> wrote:

You definitely need mount options in /etc/fstab; use the ones from here: http://lists.gluster.org/pipermail/gluster-users/2018-April/033811.html

I went on with using local mounts to achieve performance as well.

Also, the 3.12 or 3.10 branches would be preferable for production.
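Taken together with the option string Artem quoted above, a full fstab entry would look roughly like this (one long line; the exact combination is only a sketch assembled from values mentioned in this thread, not a tested recommendation):

    localhost:/apkmirror_data1 /mnt/apkmirror_data1 glusterfs defaults,_netdev,negative-timeout=10,attribute-timeout=30,fopen-keep-cache,direct-io-mode=enable,fetch-attempts=5 0 0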
On Fri, Apr 6, 2018 at 4:12 AM, Artem Russakovskii <archon...@gmail.com> wrote:

Hi again,

I'd like to expand on the performance issues and plead for help. Here's one case which shows these odd hiccups: https://i.imgur.com/CXBPjTK.gifv.

In this GIF, where I switch back and forth between copy operations on 2 servers, I'm copying a 10GB dir full of .apk and image files.

On server "hive" I'm copying straight from the main disk to an attached block volume (xfs). As you can see, the transfers are relatively speedy and don't hiccup.

On server "citadel" I'm copying the same set of data to a 4-replica gluster volume which uses block storage as a brick. As you can see, performance is much worse, and there are frequent pauses for many seconds where nothing seems to be happening - just freezes.

All 4 servers have the same specs, and all of them have performance issues with gluster and no such issues when raw xfs block storage is used.

hive has long finished copying the data, while citadel is barely chugging along and is expected to take probably half an hour to an hour. I have over 1TB of data to migrate, at which point, if we went live, I'm not even sure gluster would be able to keep up instead of bringing the machines and services down.

Here's the cluster config, though it didn't seem to make any difference performance-wise before I applied the customizations vs. after:

    Volume Name: apkmirror_data1
    Type: Replicate
    Volume ID: 11ecee7e-d4f8-497a-9994-ceb144d6841e
    Status: Started
    Snapshot Count: 0
    Number of Bricks: 1 x 4 = 4
    Transport-type: tcp
    Bricks:
    Brick1: nexus2:/mnt/nexus2_block1/apkmirror_data1
    Brick2: forge:/mnt/forge_block1/apkmirror_data1
    Brick3: hive:/mnt/hive_block1/apkmirror_data1
    Brick4: citadel:/mnt/citadel_block1/apkmirror_data1
    Options Reconfigured:
    cluster.quorum-count: 1
    cluster.quorum-type: fixed
    network.ping-timeout: 5
    network.remote-dio: enable
    performance.rda-cache-limit: 256MB
    performance.readdir-ahead: on
    performance.parallel-readdir: on
    network.inode-lru-limit: 500000
    performance.md-cache-timeout: 600
    performance.cache-invalidation: on
    performance.stat-prefetch: on
    features.cache-invalidation-timeout: 600
    features.cache-invalidation: on
    cluster.readdir-optimize: on
    performance.io-thread-count: 32
    server.event-threads: 4
    client.event-threads: 4
    performance.read-ahead: off
    cluster.lookup-optimize: on
    performance.cache-size: 1GB
    cluster.self-heal-daemon: enable
    transport.address-family: inet
    nfs.disable: on
    performance.client-io-threads: on

The mounts are done as follows in /etc/fstab:

    /dev/disk/by-id/scsi-0Linode_Volume_citadel_block1 /mnt/citadel_block1 xfs defaults 0 2
    localhost:/apkmirror_data1 /mnt/apkmirror_data1 glusterfs defaults,_netdev 0 0

I'm really not sure if direct-io-mode mount tweaks would do anything here, what the value should be set to, and what it is by default.

The OS is OpenSUSE 42.3, 64-bit. 80GB of RAM, 20 CPUs, hosted by Linode.

I'd really appreciate any help in the matter.

Thank you.
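One way to answer the direct-io-mode question empirically is to mount the same volume twice, once with each value, and rerun the same copy test against both mounts (the test mount points below are made up for illustration):

    mkdir -p /mnt/test_dio_on /mnt/test_dio_off
    mount -t glusterfs -o direct-io-mode=enable localhost:/apkmirror_data1 /mnt/test_dio_on
    mount -t glusterfs -o direct-io-mode=disable localhost:/apkmirror_data1 /mnt/test_dio_off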
On Thu, Apr 5, 2018 at 11:13 PM, Artem Russakovskii <archon...@gmail.com> wrote:

Hi,

I'm trying to squeeze performance out of gluster on four 80GB-RAM, 20-CPU machines where Gluster runs on attached block storage (Linode) in 4 replicated bricks, and so far everything I've tried results in sub-optimal performance.

There are many files - mostly images, several million - and many operations take minutes; copying multiple files (even if they're small) suddenly freezes up for seconds at a time, then continues; iostat frequently shows large r_await and w_await values with 100% utilization for the attached block device, etc.

But anyway, there are many guides out there for small-file performance improvements; more explanation is needed, though, and I think more tweaks should be possible.

My question today is about performance.cache-size. Is this the size of a cache in RAM? If so, how do I view the current cache size to see if it gets full and whether I should increase it? Is it advisable to bump it up if I have many tens of gigs of RAM free?

More generally, in the last 2 months since I first started working with gluster and set a production system live, I've been feeling frustrated because Gluster has a lot of poorly documented and confusing options. I really wish the documentation could be improved with examples and better explanations.

Specifically, it'd be absolutely amazing if the docs offered a strategy for setting each value and ways of determining more optimal values. For example, for performance.cache-size, if the docs said something like "run command abc to see your current cache size, and if it's hurting, up it, but be aware that it's limited by RAM," it'd already be a huge improvement. And so on with other options.

The gluster team is quite helpful on this mailing list, but in a reactive rather than proactive way. Perhaps it's tunnel vision once you've worked on a project for so long, where less technical explanations and even proper documentation of options take a back seat, but I encourage you to be more proactive about helping us understand and optimize Gluster.

Thank you.
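Regarding performance.cache-size: the configured limit (though not the live usage) can at least be inspected and changed with the usual volume commands, for example:

    # Show the value currently configured for the volume.
    gluster volume get apkmirror_data1 performance.cache-size

    # Raise it, keeping well within free RAM (4GB here is only an example value).
    gluster volume set apkmirror_data1 performance.cache-size 4GB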
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users