I did another test with an inode size of 1024 bytes on the XFS bricks, but it also had no effect. Here is the measurement:
(All values in MiB/s)
64KiB   1MiB    10MiB
0,16    2,52    76,58

Apart from that, I was not able to set the xattr trusted.io-stats-dump, and I am wondering why it does not work.

Regards
David Spisla

On Wed, Nov 6, 2019 at 11:16 AM RAFI KC <rkavu...@redhat.com> wrote:

>
> On 11/6/19 3:42 PM, David Spisla wrote:
>
> Hello Rafi,
>
> I tried to set the xattr via
>
> setfattr -n trusted.io-stats-dump -v '/tmp/iostat.log' /gluster/repositories/repo1/
>
> but it had no effect. There is no such xattr via getfattr and no
> logfile. The command setxattr is not available. What am I doing wrong?
>
>
> I will check it out and get back to you.
>
>
> By the way, did you mean to increase the inode size of the XFS layer from 512
> bytes to 1024KB(!)? I think it should be 1024 bytes, because 2048 bytes is
> the maximum.
>
> It was a typo, I meant 1024 bytes, sorry for that.
>
>
> Regards
> David
>
> On Wed, Nov 6, 2019 at 4:10 AM RAFI KC <rkavu...@redhat.com> wrote:
>
>> I will take a look at the profile info shared. Since there is a huge
>> difference in the performance numbers between FUSE and Samba, it would be
>> great if we could get the profile info of FUSE (on v7). This will help to
>> compare the number of calls for each fop. There should be some fops that
>> Samba repeats, and we can find them by comparing with FUSE.
>>
>> Also, if possible, can you please get the client profile info from the FUSE
>> mount using the command `setxattr -n trusted.io-stats-dump -v <logfile
>> /tmp/iostat.log> </mnt/fuse(mount point)>`.
>>
>>
>> Regards
>>
>> Rafi KC
>>
>> On 11/5/19 11:05 PM, David Spisla wrote:
>>
>> I did the test with Gluster 7.0 and ctime disabled, but it had no effect:
>> (All values in MiB/s)
>> 64KiB   1MiB    10MiB
>> 0,16    2,60    54,74
>>
>> Attached is now the complete profile file, also with the results
>> from the last test. I will not repeat it with a higher inode size because
>> I don't think this will have an effect.
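For anyone hitting the same problem: the dump is normally triggered with setfattr (from the attr package) on the FUSE mount point, not with a `setxattr` command. trusted.io-stats-dump is a virtual xattr, so it will not show up in getfattr output afterwards. A rough sketch, with example paths:

```shell
# Sketch: trigger an io-stats dump on a GlusterFS FUSE mount (paths are examples).
# trusted.io-stats-dump is a virtual xattr: setting it triggers the dump,
# so getfattr will not list it afterwards.
setfattr -n trusted.io-stats-dump -v /tmp/iostat.log /mnt/fuse

# Depending on the Gluster version, the dump is written to the given path
# or under /var/run/gluster/ with the value embedded in the file name.
ls -l /tmp/iostat.log /var/run/gluster/ 2>/dev/null
```

Requires root (trusted.* namespace) and must be run against the mount point, not a brick path.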
>> There must be another cause for the low performance.
>>
>>
>> Yes. No need to try with a higher inode size.
>>
>>
>>
>> Regards
>> David Spisla
>>
>> On Tue, Nov 5, 2019 at 4:25 PM David Spisla <spisl...@gmail.com> wrote:
>>
>>>
>>>
>>> On Tue, Nov 5, 2019 at 12:06 PM RAFI KC <rkavu...@redhat.com> wrote:
>>>
>>>>
>>>> On 11/4/19 8:46 PM, David Spisla wrote:
>>>>
>>>> Dear Gluster Community,
>>>>
>>>> I also have an issue concerning performance. In the last few days I
>>>> updated our test cluster from GlusterFS v5.5 to v7.0. The setup in general:
>>>>
>>>> 2 HP DL380 servers with 10Gbit NICs, 1 Distribute-Replica 2 volume with
>>>> 2 replica pairs. The client is Samba (access via vfs_glusterfs). I did
>>>> several tests to ensure that Samba doesn't cause the drop.
>>>> The setup is completely the same except for the Gluster version.
>>>> Here are my results:
>>>>
>>>> 64KiB    1MiB     10MiB    (file size)
>>>> 3,49     47,41    300,50   (values in MiB/s with GlusterFS v5.5)
>>>> 0,16     2,61     76,63    (values in MiB/s with GlusterFS v7.0)
>>>>
>>>>
>>>> Can you please share the profile information [1] for both versions?
>>>> Also, it would be really helpful if you could mention the I/O patterns
>>>> used for these tests.
>>>>
>>>> [1]:
>>>> https://docs.gluster.org/en/latest/Administrator%20Guide/Monitoring%20Workload/
>>>>
>>> Hello Rafi,
>>> thank you for your help.
>>>
>>> * First, more information about the I/O patterns: As a client we use a
>>> DL360 Windows Server 2017 machine with a 10Gbit NIC connected to the
>>> storage machines. The share is mounted via SMB and the tests write with fio.
>>> We use these job files (see attachment). Each job file is executed
>>> separately, with a sleep of about 60s between test runs to calm
>>> down the system before starting a new test.
>>>
>>> * Attached below you find the profile output from the tests with v5.5
>>> (ctime enabled) and v7.0 (ctime enabled).
>>>
>>> * Besides the tests with Samba, I also did some fio tests directly on
>>> the FUSE mounts (locally on one of the storage nodes). The results show
>>> only a small decrease in performance between v5.5 and v7.0
>>> (all values in MiB/s):
>>>
>>> 64KiB    1MiB      10MiB
>>> 50,09    679,96    1023,02    (v5.5)
>>> 47,00    656,46    977,60     (v7.0)
>>>
>>> It seems that the combination of Samba + Gluster 7.0 has a lot of
>>> problems, doesn't it?
>>>
>>>
>>>>
>>>> We use these volume options (GlusterFS 7.0):
>>>>
>>>> Volume Name: archive1
>>>> Type: Distributed-Replicate
>>>> Volume ID: 44c17844-0bd4-4ca2-98d8-a1474add790c
>>>> Status: Started
>>>> Snapshot Count: 0
>>>> Number of Bricks: 2 x 2 = 4
>>>> Transport-type: tcp
>>>> Bricks:
>>>> Brick1: fs-dl380-c1-n1:/gluster/brick1/glusterbrick
>>>> Brick2: fs-dl380-c1-n2:/gluster/brick1/glusterbrick
>>>> Brick3: fs-dl380-c1-n1:/gluster/brick2/glusterbrick
>>>> Brick4: fs-dl380-c1-n2:/gluster/brick2/glusterbrick
>>>> Options Reconfigured:
>>>> performance.client-io-threads: off
>>>> nfs.disable: on
>>>> storage.fips-mode-rchecksum: on
>>>> transport.address-family: inet
>>>> user.smb: disable
>>>> features.read-only: off
>>>> features.worm: off
>>>> features.worm-file-level: on
>>>> features.retention-mode: enterprise
>>>> features.default-retention-period: 120
>>>> network.ping-timeout: 10
>>>> features.cache-invalidation: on
>>>> features.cache-invalidation-timeout: 600
>>>> performance.nl-cache: on
>>>> performance.nl-cache-timeout: 600
>>>> client.event-threads: 32
>>>> server.event-threads: 32
>>>> cluster.lookup-optimize: on
>>>> performance.stat-prefetch: on
>>>> performance.cache-invalidation: on
>>>> performance.md-cache-timeout: 600
>>>> performance.cache-samba-metadata: on
>>>> performance.cache-ima-xattrs: on
>>>> performance.io-thread-count: 64
>>>> cluster.use-compound-fops: on
>>>> performance.cache-size: 512MB
>>>> performance.cache-refresh-timeout: 10
>>>> performance.read-ahead: off
>>>> performance.write-behind-window-size: 4MB
>>>> performance.write-behind: on
>>>> storage.build-pgfid: on
>>>> features.ctime: on
>>>> cluster.quorum-type: fixed
>>>> cluster.quorum-count: 1
>>>> features.bitrot: on
>>>> features.scrub: Active
>>>> features.scrub-freq: daily
>>>>
>>>> For GlusterFS 5.5 it is nearly the same, except that there were two
>>>> options to enable the ctime feature.
>>>>
>>>>
>>>>
>>>> Ctime stores additional metadata information as extended attributes,
>>>> which sometimes exceeds the default inode size. In such scenarios the
>>>> additional xattrs won't fit into the default size. This results in
>>>> additional blocks being used to store the xattrs outside the inode,
>>>> which affects latency. This depends purely on the I/O operations and
>>>> the total xattr size stored in the inode.
>>>>
>>>> Is it possible for you to repeat the test with ctime disabled, or with
>>>> the inode size increased to a higher value, say 1024KB?
>>>>
>>> I will do so, but today I could not finish the tests with ctime disabled
>>> (or a higher inode value) because they take a lot of time with v7.0 due
>>> to the low performance; I will run them tomorrow. As soon as possible I
>>> will give you the results.
>>> By the way: do you really mean an inode size of 1024KB on the XFS layer?
>>> Or do you mean 1024 bytes? We use 512 bytes by default, because this has
>>> been the recommended size until now. But it seems there is a need for a
>>> new recommendation when using the ctime feature by default. I cannot
>>> imagine that this is the real cause of the low performance, because in
>>> v5.5 we also use the ctime feature with an inode size of 512 bytes.
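For context: the XFS inode size is fixed at mkfs time and cannot be changed on an existing filesystem, so testing a larger inode size means recreating the brick filesystem. A sketch, with an example device path:

```shell
# Sketch: create an XFS brick filesystem with a 1024-byte inode size
# (XFS defaults to 512 bytes; 2048 bytes is the maximum; device path is an example).
mkfs.xfs -f -i size=1024 /dev/sdb1

# Verify the inode size of a mounted XFS filesystem:
xfs_info /gluster/brick1 | grep isize
```

This destroys existing data on the device, so the brick has to be rebuilt and healed afterwards.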
>>>
>>> Regards
>>> David
>>>
>>>>
>>>> Our optimizations for Samba look like this (for every version):
>>>>
>>>> [global]
>>>> workgroup = SAMBA
>>>> netbios name = CLUSTER
>>>> kernel share modes = no
>>>> aio read size = 1
>>>> aio write size = 1
>>>> kernel oplocks = no
>>>> max open files = 100000
>>>> nt acl support = no
>>>> security = user
>>>> server min protocol = SMB2
>>>> store dos attributes = no
>>>> strict locking = no
>>>> full_audit:failure = pwrite_send pwrite_recv pwrite offload_write_send
>>>> offload_write_recv create_file open unlink connect disconnect rename chown
>>>> fchown lchown chmod fchmod mkdir rmdir ntimes ftruncate fallocate
>>>> full_audit:success = pwrite_send pwrite_recv pwrite offload_write_send
>>>> offload_write_recv create_file open unlink connect disconnect rename chown
>>>> fchown lchown chmod fchmod mkdir rmdir ntimes ftruncate fallocate
>>>> full_audit:facility = local5
>>>> durable handles = yes
>>>> posix locking = no
>>>> log level = 2
>>>> max log size = 100000
>>>> debug pid = yes
>>>>
>>>> What can be the cause of this rapid drop in performance for
>>>> small files? Are some of our volume options no longer recommended?
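As an aside, an smb.conf like the one above can be sanity-checked before reloading Samba; testparm validates the syntax and prints the effective settings:

```shell
# Sketch: validate smb.conf and dump the effective (non-default) settings.
testparm -s /etc/samba/smb.conf
```

This catches typos in option names, which smbd would otherwise only log at startup.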
>>>> There were some patches concerning performance for small files in v6.0
>>>> and v7.0:
>>>>
>>>> #1670031 <https://bugzilla.redhat.com/1670031>: performance regression
>>>> seen with smallfile workload tests
>>>>
>>>> #1659327 <https://bugzilla.redhat.com/1659327>: 43% regression in
>>>> small-file sequential read performance
>>>>
>>>> And one patch for the io-cache:
>>>>
>>>> #1659869 <https://bugzilla.redhat.com/1659869>: improvements to
>>>> io-cache
>>>>
>>>> Regards
>>>>
>>>> David Spisla
>>>>
>>>>
>>>> ________
>>>>
>>>> Community Meeting Calendar:
>>>>
>>>> APAC Schedule -
>>>> Every 2nd and 4th Tuesday at 11:30 AM IST
>>>> Bridge: https://bluejeans.com/118564314
>>>>
>>>> NA/EMEA Schedule -
>>>> Every 1st and 3rd Tuesday at 01:00 PM EDT
>>>> Bridge: https://bluejeans.com/118564314
>>>>
>>>> Gluster-users mailing list
>>>> Gluster-users@gluster.org
>>>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>>>
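For reference, the brick-side profile information requested in [1] above is collected with the `gluster volume profile` commands; a sketch using the volume name from this thread:

```shell
# Sketch: collect server-side profile info for the volume discussed above.
gluster volume profile archive1 start      # enable profiling on the volume

# ... run the fio workload against the SMB share or FUSE mount ...

gluster volume profile archive1 info > /tmp/profile-v7.txt   # per-brick fop counts and latencies
gluster volume profile archive1 stop       # disable profiling when done
```

Comparing the fop counts in this output between the Samba and FUSE runs is how repeated fops (as suspected above) would show up.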