One thing I noticed (and reported in another email to the mailing list) is that when the really slow dir listings happen the first time I ls, the log file is filled with hundreds or even thousands of messages like these:

[2020-04-30 17:49:04.844167] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-SNIP_data1-dht: Found anomalies in (null) (gfid = c86e39a1-32ef-4eaf-b5a7-a90d73239c5a). Holes=1 overlaps=0
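In case it helps anyone chasing the same thing: a quick way to quantify these is to count the anomaly lines and the distinct gfids they mention. The snippet below is a sketch that runs against a self-contained sample excerpt; in practice you would point it at the real FUSE mount log (typically under /var/log/glusterfs/, named after the mount point — that path is an assumption, adjust to your setup).

```shell
# Sample excerpt standing in for the real FUSE mount log
# (typically /var/log/glusterfs/<mount-point>.log -- adjust to your setup).
cat > /tmp/dht-sample.log <<'EOF'
[2020-04-30 17:49:04.844167] I [MSGID: 109063] [dht-layout.c:659:dht_layout_normalize] 0-SNIP_data1-dht: Found anomalies in (null) (gfid = c86e39a1-32ef-4eaf-b5a7-a90d73239c5a). Holes=1 overlaps=0
EOF

# How many anomaly messages were logged:
grep -c 'Found anomalies' /tmp/dht-sample.log

# Which distinct directories (by gfid) they refer to:
grep -o 'gfid = [0-9a-f-]*' /tmp/dht-sample.log | sort -u
```

If needed, a gfid can then be traced back to a path via the .glusterfs directory on a brick.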
The subsequent ls is much faster, and there are no such log messages. Could
whatever is causing those contribute to the massive slowdown? Even the
subsequent ls of 20-40s is still several orders of magnitude slower than ls
on the xfs brick itself, which takes only ~0.2s.

Sincerely,
Artem

--
Founder, Android Police <http://www.androidpolice.com>, APK Mirror
<http://www.apkmirror.com/>, Illogical Robot LLC
beerpla.net | @ArtemR <http://twitter.com/ArtemR>


On Thu, Apr 30, 2020 at 10:54 AM Artem Russakovskii <archon...@gmail.com>
wrote:

> getfattr -d -m. -e hex .
> # file: .
> trusted.afr.SNIP_data1-client-0=0x000000000000000000000000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.gfid=0x44b2db00267a47508b2a8a921f20e0f5
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
> trusted.glusterfs.dht.mds=0x00000000
>
> Sincerely,
> Artem
>
> --
> Founder, Android Police <http://www.androidpolice.com>, APK Mirror
> <http://www.apkmirror.com/>, Illogical Robot LLC
> beerpla.net | @ArtemR <http://twitter.com/ArtemR>
>
>
> On Thu, Apr 30, 2020 at 9:05 AM Felix Kölzow <felix.koel...@gmx.de> wrote:
>
>> Dear Artem,
>>
>> Sorry for the noise, since you already provided the xfs_info.
>>
>> Could you provide the output of
>>
>> getfattr -d -m. -e hex /DirectoryPathOfInterest_onTheBrick/
>>
>> Felix
>>
>> On 30/04/2020 18:01, Felix Kölzow wrote:
>>
>> Dear Artem,
>>
>> Can you also provide some information w.r.t. your xfs filesystem, i.e.
>> the xfs_info of your block device?
>>
>> Regards,
>>
>> Felix
>>
>> On 30/04/2020 17:27, Artem Russakovskii wrote:
>>
>> Hi Strahil, in the original email I included the times for both the
>> first and subsequent reads on the FUSE-mounted gluster volume as well as
>> on the xfs filesystem the gluster data resides on (this is the brick,
>> right?).
>>
>> On Thu, Apr 30, 2020, 7:44 AM Strahil Nikolov <hunter86...@yahoo.com>
>> wrote:
>>
>>> On April 30, 2020 4:24:23 AM GMT+03:00, Artem Russakovskii <
>>> archon...@gmail.com> wrote:
>>> >Hi all,
>>> >
>>> >We have 500GB and 10TB 4x1 replicate xfs-based gluster volumes, and the
>>> >10TB one especially is extremely slow to do certain things with (and
>>> >has been since gluster 3.x when we started). We're currently on 5.13.
>>> >
>>> >The number of files isn't even what I'd consider that great - under
>>> >100k per dir.
>>> >
>>> >Here are some numbers to look at:
>>> >
>>> >On the gluster volume, in a dir of 45k files:
>>> >The first time:
>>> >
>>> >time find | wc -l
>>> >45423
>>> >real 8m44.819s
>>> >user 0m0.459s
>>> >sys 0m0.998s
>>> >
>>> >And again:
>>> >
>>> >time find | wc -l
>>> >45423
>>> >real 0m34.677s
>>> >user 0m0.291s
>>> >sys 0m0.754s
>>> >
>>> >If I run the same operation on the xfs block device itself:
>>> >The first time:
>>> >
>>> >time find | wc -l
>>> >45423
>>> >real 0m13.514s
>>> >user 0m0.144s
>>> >sys 0m0.501s
>>> >
>>> >And again:
>>> >
>>> >time find | wc -l
>>> >45423
>>> >real 0m0.197s
>>> >user 0m0.088s
>>> >sys 0m0.106s
>>> >
>>> >I'd expect a performance difference here, but just as it was several
>>> >years ago when we started with gluster, it's still huge, and simple
>>> >file listings are incredibly slow.
>>> >
>>> >At the time, the team was looking to do some optimizations, but I'm
>>> >not sure this has happened.
>>> >
>>> >What can we do to try to improve performance?
>>> >
>>> >Thank you.
>>> >
>>> >Some setup values follow.
>>> >
>>> >xfs_info /mnt/SNIP_block1
>>> >meta-data=/dev/sdc          isize=512    agcount=103, agsize=26214400 blks
>>> >         =                  sectsz=512   attr=2, projid32bit=1
>>> >         =                  crc=1        finobt=1, sparse=0, rmapbt=0
>>> >         =                  reflink=0
>>> >data     =                  bsize=4096   blocks=2684354560, imaxpct=25
>>> >         =                  sunit=0      swidth=0 blks
>>> >naming   =version 2         bsize=4096   ascii-ci=0, ftype=1
>>> >log      =internal log      bsize=4096   blocks=51200, version=2
>>> >         =                  sectsz=512   sunit=0 blks, lazy-count=1
>>> >realtime =none              extsz=4096   blocks=0, rtextents=0
>>> >
>>> >Volume Name: SNIP_data1
>>> >Type: Replicate
>>> >Volume ID: SNIP
>>> >Status: Started
>>> >Snapshot Count: 0
>>> >Number of Bricks: 1 x 4 = 4
>>> >Transport-type: tcp
>>> >Bricks:
>>> >Brick1: nexus2:/mnt/SNIP_block1/SNIP_data1
>>> >Brick2: forge:/mnt/SNIP_block1/SNIP_data1
>>> >Brick3: hive:/mnt/SNIP_block1/SNIP_data1
>>> >Brick4: citadel:/mnt/SNIP_block1/SNIP_data1
>>> >Options Reconfigured:
>>> >cluster.quorum-count: 1
>>> >cluster.quorum-type: fixed
>>> >network.ping-timeout: 5
>>> >network.remote-dio: enable
>>> >performance.rda-cache-limit: 256MB
>>> >performance.readdir-ahead: on
>>> >performance.parallel-readdir: on
>>> >network.inode-lru-limit: 500000
>>> >performance.md-cache-timeout: 600
>>> >performance.cache-invalidation: on
>>> >performance.stat-prefetch: on
>>> >features.cache-invalidation-timeout: 600
>>> >features.cache-invalidation: on
>>> >cluster.readdir-optimize: on
>>> >performance.io-thread-count: 32
>>> >server.event-threads: 4
>>> >client.event-threads: 4
>>> >performance.read-ahead: off
>>> >cluster.lookup-optimize: on
>>> >performance.cache-size: 1GB
>>> >cluster.self-heal-daemon: enable
>>> >transport.address-family: inet
>>> >nfs.disable: on
>>> >performance.client-io-threads: on
>>> >cluster.granular-entry-heal: enable
>>> >cluster.data-self-heal-algorithm: full
>>> >
>>> >Sincerely,
>>> >Artem
>>> >
>>> >--
>>> >Founder, Android Police <http://www.androidpolice.com>, APK Mirror
>>> ><http://www.apkmirror.com/>,
>>> >Illogical Robot LLC
>>> >beerpla.net | @ArtemR <http://twitter.com/ArtemR>
>>>
>>> Hi Artem,
>>>
>>> Have you checked the same on brick level? How big is the difference?
>>>
>>> Best Regards,
>>> Strahil Nikolov
________


Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users
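P.S. For anyone who wants to reproduce the cold-vs-warm listing comparison from this thread, a rough sketch follows. Both paths are placeholders, not the actual mount points discussed above; substitute your own FUSE mount and brick directory.

```shell
# Time `find` on the FUSE mount vs. directly on the brick, twice each:
# the first pass is the cold (uncached) case, the second the warm case.
FUSE_DIR="${FUSE_DIR:-/mnt/SNIP_data1_fuse}"      # placeholder path
BRICK_DIR="${BRICK_DIR:-/mnt/SNIP_block1/SNIP_data1}"  # placeholder path

for d in "$FUSE_DIR" "$BRICK_DIR"; do
    [ -d "$d" ] || continue   # skip paths that don't exist on this host
    echo "== $d =="
    time find "$d" | wc -l    # cold pass
    time find "$d" | wc -l    # warm pass
done
```

On the FUSE mount the gap between the two passes reflects the volume's caching options (md-cache, readdir-ahead); on the brick it is just the kernel's dentry/inode cache.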