Re: [Gluster-users] Performance is falling rapidly when updating from v5.5 to v7.0

2019-11-06 Thread David Spisla
I ran another test with the inode size on the XFS bricks set to 1024 bytes,
but it also had no effect. Here is the measurement:

(All values in MiB/s)
64KiB   1MiB   10MiB
0,16    2,52   76,58

Besides that, I was not able to set the xattr trusted.io-stats-dump. I am
wondering why it is not working.

Regards
David Spisla

On Wed, 6 Nov 2019 at 11:16, RAFI KC wrote:

>
> On 11/6/19 3:42 PM, David Spisla wrote:
>
> Hello Rafi,
>
> I tried to set the xattr via
>
> setfattr -n trusted.io-stats-dump -v '/tmp/iostat.log'
> /gluster/repositories/repo1/
>
> but it had no effect. getfattr shows no such xattr, and no log file was
> written. The setxattr command is not available. What am I doing wrong?
>
>
> I will check it out and get back to you.
>
>
> By the way, do you mean increasing the inode size of the XFS layer from 512
> bytes to 1024 KB(!)? I think it should be 1024 bytes, because 2048 bytes is
> the maximum.
>
> It was a typo; I meant 1024 bytes, sorry for that.
>
>
> Regards
> David
>
> On Wed, 6 Nov 2019 at 04:10, RAFI KC wrote:
>
>> I will take a look at the profile info you shared. Since there is a huge
>> difference in the performance numbers between FUSE and Samba, it would be
>> great if we could get the profile info for FUSE (on v7). This will help to
>> compare the number of calls for each fop. There should be some fops that
>> Samba repeats, and we can find them by comparing with FUSE.
>>
>> Also, if possible, can you please get the client profile info from the FUSE
>> mount using the command
>> `setxattr -n trusted.io-stats-dump -v <log file, e.g. /tmp/iostat.log> <mount point>`.
>>
>>
>> Regards
>>
>> Rafi KC
>>
>> On 11/5/19 11:05 PM, David Spisla wrote:
>>
>> I ran the test with Gluster 7.0 and ctime disabled, but it had no effect:
>> (All values in MiB/s)
>> 64KiB   1MiB   10MiB
>> 0,16    2,60   54,74
>>
>> Attached is the complete profile file, now also including the results
>> from the last test. I will not repeat it with a higher inode size because
>> I don't think this will have an effect.
>> There must be another cause for the low performance.
>>
>>
>> Yes. No need to try with higher inode size
>>
>>
>>
>> Regards
>> David Spisla
>>
>> On Tue, 5 Nov 2019 at 16:25, David Spisla <spisl...@gmail.com> wrote:
>>
>>>
>>>
>>> On Tue, 5 Nov 2019 at 12:06, RAFI KC wrote:
>>>

 On 11/4/19 8:46 PM, David Spisla wrote:

 Dear Gluster Community,

 I also have an issue concerning performance. Over the last few days I
 updated our test cluster from GlusterFS v5.5 to v7.0. The setup in general:

 2 HP DL380 servers with 10Gbit NICs, 1 Distribute-Replica 2 volume with
 2 replica pairs. The client is SMB Samba (access via vfs_glusterfs). I did
 several tests to ensure that Samba doesn't cause the drop.
 The setup is completely the same except for the Gluster version.
 Here are my results:
 64KiB   1MiB    10MiB    (Filesize)
 3,49    47,41   300,50   (Values in MiB/s with GlusterFS v5.5)
 0,16    2,61    76,63    (Values in MiB/s with GlusterFS v7.0)


 Can you please share the profile information [1] for both versions?
 Also, it would be really helpful if you could describe the I/O patterns
 used for these tests.

 [1] :
 https://docs.gluster.org/en/latest/Administrator%20Guide/Monitoring%20Workload/

>>> Hello Rafi,
>>> thank you for your help.
>>>
>>> * First, more information about the I/O patterns: As a client we use a
>>> DL360 Windows Server 2017 machine with a 10Gbit NIC connected to the
>>> storage machines. The share is mounted via SMB and the tests write with fio.
>>> We use these job files (see attachment). Each job file is executed
>>> separately, and there is a sleep of about 60 s between test runs to let
>>> the system calm down before starting a new test.
>>>
>>> * Attached below you will find the profile output from the tests with v5.5
>>> (ctime enabled) and v7.0 (ctime enabled).
>>>
>>> * Besides the tests with Samba, I also did some fio tests directly on
>>> the FUSE mounts (locally on one of the storage nodes). The results show
>>> that there is only a small decrease in performance between v5.5 and v7.0
>>> (All values in MiB/s)
>>> 64KiB   1MiB     10MiB
>>> 50,09   679,96   1023,02   (v5.5)
>>> 47,00   656,46   977,60    (v7.0)
>>>
>>> It seems that the combination of Samba + Gluster 7.0 has a lot of
>>> problems, doesn't it?
>>>
>>>

 We use these volume options (GlusterFS 7.0):

 Volume Name: archive1
 Type: Distributed-Replicate
 Volume ID: 44c17844-0bd4-4ca2-98d8-a1474add790c
 Status: Started
 Snapshot Count: 0
 Number of Bricks: 2 x 2 = 4
 Transport-type: tcp
 Bricks:
 Brick1: fs-dl380-c1-n1:/gluster/brick1/glusterbrick
 Brick2: fs-dl380-c1-n2:/gluster/brick1/glusterbrick
 Brick3: fs-dl380-c1-n1:/gluster/brick2/glusterbrick
 Brick4: 

Re: [Gluster-users] Performance is falling rapidly when updating from v5.5 to v7.0

2019-11-06 Thread RAFI KC


On 11/6/19 3:42 PM, David Spisla wrote:

Hello Rafi,

I tried to set the xattr via

setfattr -n trusted.io-stats-dump -v '/tmp/iostat.log' 
/gluster/repositories/repo1/


but it had no effect. getfattr shows no such xattr, and no log file was
written. The setxattr command is not available. What am I doing wrong?



I will check it out and get back to you.


By the way, do you mean increasing the inode size of the XFS layer from 512
bytes to 1024 KB(!)? I think it should be 1024 bytes, because 2048 bytes
is the maximum.

It was a typo; I meant 1024 bytes, sorry for that.


Regards
David

On Wed, 6 Nov 2019 at 04:10, RAFI KC wrote:


I will take a look at the profile info you shared. Since there is a
huge difference in the performance numbers between FUSE and Samba,
it would be great if we could get the profile info for FUSE (on v7).
This will help to compare the number of calls for each fop. There
should be some fops that Samba repeats, and we can find them by
comparing with FUSE.

Also, if possible, can you please get the client profile info from the
FUSE mount using the command
`setxattr -n trusted.io-stats-dump -v <log file, e.g. /tmp/iostat.log> <mount point>`.


Regards

Rafi KC


On 11/5/19 11:05 PM, David Spisla wrote:

I ran the test with Gluster 7.0 and ctime disabled, but it had no effect:
(All values in MiB/s)
64KiB   1MiB   10MiB
0,16    2,60   54,74

Attached is the complete profile file, now also including the
results from the last test. I will not repeat it with a higher
inode size because I don't think this will have an effect.
There must be another cause for the low performance.



Yes. No need to try with higher inode size




Regards
David Spisla

On Tue, 5 Nov 2019 at 16:25, David Spisla <spisl...@gmail.com> wrote:



On Tue, 5 Nov 2019 at 12:06, RAFI KC <rkavu...@redhat.com> wrote:


On 11/4/19 8:46 PM, David Spisla wrote:

Dear Gluster Community,

I also have an issue concerning performance. Over the last few
days I updated our test cluster from GlusterFS v5.5 to
v7.0. The setup in general:

2 HP DL380 servers with 10Gbit NICs, 1
Distribute-Replica 2 volume with 2 replica pairs. The client
is SMB Samba (access via vfs_glusterfs). I did several
tests to ensure that Samba doesn't cause the drop.
The setup is completely the same except for the Gluster version.
Here are my results:
64KiB   1MiB    10MiB    (Filesize)
3,49    47,41   300,50   (Values in MiB/s with GlusterFS v5.5)
0,16    2,61    76,63    (Values in MiB/s with GlusterFS v7.0)



Can you please share the profile information [1] for both
versions? Also, it would be really helpful if you could
describe the I/O patterns used for these tests.

[1] :

https://docs.gluster.org/en/latest/Administrator%20Guide/Monitoring%20Workload/

Hello Rafi,
thank you for your help.

* First, more information about the I/O patterns: As a client
we use a DL360 Windows Server 2017 machine with a 10Gbit NIC
connected to the storage machines. The share is mounted
via SMB and the tests write with fio. We use these job files
(see attachment). Each job file is executed separately,
and there is a sleep of about 60 s between test runs to let
the system calm down before starting a new test.

* Attached below you will find the profile output from the tests
with v5.5 (ctime enabled) and v7.0 (ctime enabled).

* Besides the tests with Samba, I also did some fio tests
directly on the FUSE mounts (locally on one of the storage
nodes). The results show that there is only a small decrease
in performance between v5.5 and v7.0
(All values in MiB/s)
64KiB   1MiB     10MiB
50,09   679,96   1023,02   (v5.5)
47,00   656,46   977,60    (v7.0)

It seems that the combination of Samba + Gluster 7.0 has
a lot of problems, doesn't it?




We use these volume options (GlusterFS 7.0):

Volume Name: archive1
Type: Distributed-Replicate
Volume ID: 44c17844-0bd4-4ca2-98d8-a1474add790c
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: fs-dl380-c1-n1:/gluster/brick1/glusterbrick
Brick2: fs-dl380-c1-n2:/gluster/brick1/glusterbrick
Brick3: fs-dl380-c1-n1:/gluster/brick2/glusterbrick
Brick4: fs-dl380-c1-n2:/gluster/brick2/glusterbrick
Options Reconfigured:
performance.client-io-threads: off
 

Re: [Gluster-users] Performance is falling rapidly when updating from v5.5 to v7.0

2019-11-06 Thread David Spisla
Hello Rafi,

I tried to set the xattr via

setfattr -n trusted.io-stats-dump -v '/tmp/iostat.log' /gluster/repositories/repo1/

but it had no effect. getfattr shows no such xattr, and no log file was
written. The setxattr command is not available. What am I doing wrong?

By the way, do you mean increasing the inode size of the XFS layer from 512
bytes to 1024 KB(!)? I think it should be 1024 bytes, because 2048 bytes is
the maximum.

Regards
David

On Wed, 6 Nov 2019 at 04:10, RAFI KC wrote:

> I will take a look at the profile info you shared. Since there is a huge
> difference in the performance numbers between FUSE and Samba, it would be
> great if we could get the profile info for FUSE (on v7). This will help to
> compare the number of calls for each fop. There should be some fops that
> Samba repeats, and we can find them by comparing with FUSE.
>
> Also, if possible, can you please get the client profile info from the FUSE
> mount using the command
> `setxattr -n trusted.io-stats-dump -v <log file, e.g. /tmp/iostat.log> <mount point>`.
>
>
> Regards
>
> Rafi KC
>
> On 11/5/19 11:05 PM, David Spisla wrote:
>
> I ran the test with Gluster 7.0 and ctime disabled, but it had no effect:
> (All values in MiB/s)
> 64KiB   1MiB   10MiB
> 0,16    2,60   54,74
>
> Attached is the complete profile file, now also including the results from
> the last test. I will not repeat it with a higher inode size because I
> don't think this will have an effect.
> There must be another cause for the low performance.
>
>
> Yes. No need to try with higher inode size
>
>
>
> Regards
> David Spisla
>
> On Tue, 5 Nov 2019 at 16:25, David Spisla wrote:
>
>>
>>
>> On Tue, 5 Nov 2019 at 12:06, RAFI KC wrote:
>>
>>>
>>> On 11/4/19 8:46 PM, David Spisla wrote:
>>>
>>> Dear Gluster Community,
>>>
>>> I also have an issue concerning performance. Over the last few days I
>>> updated our test cluster from GlusterFS v5.5 to v7.0. The setup in general:
>>>
>>> 2 HP DL380 servers with 10Gbit NICs, 1 Distribute-Replica 2 volume with
>>> 2 replica pairs. The client is SMB Samba (access via vfs_glusterfs). I did
>>> several tests to ensure that Samba doesn't cause the drop.
>>> The setup is completely the same except for the Gluster version.
>>> Here are my results:
>>> 64KiB   1MiB    10MiB    (Filesize)
>>> 3,49    47,41   300,50   (Values in MiB/s with GlusterFS v5.5)
>>> 0,16    2,61    76,63    (Values in MiB/s with GlusterFS v7.0)
>>>
>>>
>>> Can you please share the profile information [1] for both versions?
>>> Also, it would be really helpful if you could describe the I/O patterns
>>> used for these tests.
>>>
>>> [1] :
>>> https://docs.gluster.org/en/latest/Administrator%20Guide/Monitoring%20Workload/
>>>
>> Hello Rafi,
>> thank you for your help.
>>
>> * First, more information about the I/O patterns: As a client we use a
>> DL360 Windows Server 2017 machine with a 10Gbit NIC connected to the
>> storage machines. The share is mounted via SMB and the tests write with fio.
>> We use these job files (see attachment). Each job file is executed
>> separately, and there is a sleep of about 60 s between test runs to let
>> the system calm down before starting a new test.
>>
>> * Attached below you will find the profile output from the tests with v5.5
>> (ctime enabled) and v7.0 (ctime enabled).
>>
>> * Besides the tests with Samba, I also did some fio tests directly on
>> the FUSE mounts (locally on one of the storage nodes). The results show
>> that there is only a small decrease in performance between v5.5 and v7.0
>> (All values in MiB/s)
>> 64KiB   1MiB     10MiB
>> 50,09   679,96   1023,02   (v5.5)
>> 47,00   656,46   977,60    (v7.0)
>>
>> It seems that the combination of Samba + Gluster 7.0 has a lot of
>> problems, doesn't it?
>>
>>
>>>
>>> We use these volume options (GlusterFS 7.0):
>>>
>>> Volume Name: archive1
>>> Type: Distributed-Replicate
>>> Volume ID: 44c17844-0bd4-4ca2-98d8-a1474add790c
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 2 x 2 = 4
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: fs-dl380-c1-n1:/gluster/brick1/glusterbrick
>>> Brick2: fs-dl380-c1-n2:/gluster/brick1/glusterbrick
>>> Brick3: fs-dl380-c1-n1:/gluster/brick2/glusterbrick
>>> Brick4: fs-dl380-c1-n2:/gluster/brick2/glusterbrick
>>> Options Reconfigured:
>>> performance.client-io-threads: off
>>> nfs.disable: on
>>> storage.fips-mode-rchecksum: on
>>> transport.address-family: inet
>>> user.smb: disable
>>> features.read-only: off
>>> features.worm: off
>>> features.worm-file-level: on
>>> features.retention-mode: enterprise
>>> features.default-retention-period: 120
>>> network.ping-timeout: 10
>>> features.cache-invalidation: on
>>> features.cache-invalidation-timeout: 600
>>> performance.nl-cache: on
>>> performance.nl-cache-timeout: 600
>>> client.event-threads: 32
>>> server.event-threads: 32
>>> cluster.lookup-optimize: on
>>> performance.stat-prefetch: on
>>> performance.cache-invalidation: on

Re: [Gluster-users] Performance is falling rapidly when updating from v5.5 to v7.0

2019-11-05 Thread RAFI KC
I will take a look at the profile info you shared. Since there is a huge
difference in the performance numbers between FUSE and Samba, it would
be great if we could get the profile info for FUSE (on v7). This will help
to compare the number of calls for each fop. There should be some fops
that Samba repeats, and we can find them by comparing with FUSE.
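
The comparison described above can be sketched in Python. This is only an illustration: it assumes the per-fop call counts have already been extracted from the two profile outputs into dictionaries, and the counts below are invented placeholders, not values from this thread.

```python
def compare_fop_calls(fuse_calls, samba_calls):
    """Return (fop, fuse_count, samba_count, ratio) tuples, largest ratio first.

    ratio is samba_count / fuse_count; fops with no calls on the FUSE side
    get float("inf") so that they sort to the top.
    """
    rows = []
    for fop in sorted(set(fuse_calls) | set(samba_calls)):
        fuse = fuse_calls.get(fop, 0)
        samba = samba_calls.get(fop, 0)
        ratio = samba / fuse if fuse else float("inf")
        rows.append((fop, fuse, samba, ratio))
    return sorted(rows, key=lambda row: row[3], reverse=True)


# Invented placeholder counts, NOT measurements from this thread.
fuse_profile = {"WRITE": 1000, "LOOKUP": 200, "FSTAT": 100}
samba_profile = {"WRITE": 1000, "LOOKUP": 4000, "FSTAT": 900, "GETXATTR": 50}

for fop, fuse, samba, ratio in compare_fop_calls(fuse_profile, samba_profile):
    print(f"{fop:10s} fuse={fuse:6d} samba={samba:6d} ratio={ratio:.1f}")
```

Fops at the top of the list are the ones Samba issues far more often than FUSE for the same workload, which makes them the first candidates to look at.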


Also, if possible, can you please get the client profile info from the FUSE
mount using the command
`setxattr -n trusted.io-stats-dump -v <log file, e.g. /tmp/iostat.log> <mount point>`.
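
For illustration, the request above can be sketched as a small Python helper that builds and optionally runs the command. It assumes the `setfattr` tool, since the follow-up in this thread reports that a `setxattr` command is not available. The mount point and dump-file path are hypothetical placeholders, and where the dump actually ends up may depend on the GlusterFS version.

```python
import shlex
import subprocess


def io_stats_dump_cmd(mount_point, dump_file):
    """Build the setfattr argv that requests an io-stats dump on a FUSE mount."""
    return ["setfattr", "-n", "trusted.io-stats-dump", "-v", dump_file, mount_point]


def request_io_stats_dump(mount_point, dump_file, dry_run=True):
    """Print the command; run it only when dry_run is False (usually needs root)."""
    cmd = io_stats_dump_cmd(mount_point, dump_file)
    print(shlex.join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd


# Hypothetical paths; replace them with the real FUSE mount point and file name.
request_io_stats_dump("/mnt/glusterfs", "/tmp/iostat.log")
```

The dry run only prints the command, so it can be checked before anything is executed against the volume.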



Regards

Rafi KC


On 11/5/19 11:05 PM, David Spisla wrote:

I ran the test with Gluster 7.0 and ctime disabled, but it had no effect:
(All values in MiB/s)
64KiB   1MiB   10MiB
0,16    2,60   54,74

Attached is the complete profile file, now also including the results
from the last test. I will not repeat it with a higher inode size
because I don't think this will have an effect.

There must be another cause for the low performance.



Yes. No need to try with higher inode size




Regards
David Spisla

On Tue, 5 Nov 2019 at 16:25, David Spisla <spisl...@gmail.com> wrote:




On Tue, 5 Nov 2019 at 12:06, RAFI KC <rkavu...@redhat.com> wrote:


On 11/4/19 8:46 PM, David Spisla wrote:

Dear Gluster Community,

I also have an issue concerning performance. Over the last few days I
updated our test cluster from GlusterFS v5.5 to v7.0. The
setup in general:

2 HP DL380 servers with 10Gbit NICs, 1 Distribute-Replica 2
volume with 2 replica pairs. The client is SMB Samba (access via
vfs_glusterfs). I did several tests to ensure that Samba
doesn't cause the drop.
The setup is completely the same except for the Gluster version.
Here are my results:
64KiB   1MiB    10MiB    (Filesize)
3,49    47,41   300,50   (Values in MiB/s with GlusterFS v5.5)
0,16    2,61    76,63    (Values in MiB/s with GlusterFS v7.0)



Can you please share the profile information [1] for both
versions? Also, it would be really helpful if you could describe
the I/O patterns used for these tests.

[1] :

https://docs.gluster.org/en/latest/Administrator%20Guide/Monitoring%20Workload/

Hello Rafi,
thank you for your help.

* First, more information about the I/O patterns: As a client we use
a DL360 Windows Server 2017 machine with a 10Gbit NIC connected to
the storage machines. The share is mounted via SMB and the
tests write with fio. We use these job files (see attachment).
Each job file is executed separately, and there is a sleep of
about 60 s between test runs to let the system calm down before
starting a new test.

* Attached below you will find the profile output from the tests with
v5.5 (ctime enabled) and v7.0 (ctime enabled).

* Besides the tests with Samba, I also did some fio tests
directly on the FUSE mounts (locally on one of the storage nodes).
The results show that there is only a small decrease in
performance between v5.5 and v7.0
(All values in MiB/s)
64KiB   1MiB     10MiB
50,09   679,96   1023,02   (v5.5)
47,00   656,46   977,60    (v7.0)

It seems that the combination of Samba + Gluster 7.0 has a
lot of problems, doesn't it?




We use these volume options (GlusterFS 7.0):

Volume Name: archive1
Type: Distributed-Replicate
Volume ID: 44c17844-0bd4-4ca2-98d8-a1474add790c
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: fs-dl380-c1-n1:/gluster/brick1/glusterbrick
Brick2: fs-dl380-c1-n2:/gluster/brick1/glusterbrick
Brick3: fs-dl380-c1-n1:/gluster/brick2/glusterbrick
Brick4: fs-dl380-c1-n2:/gluster/brick2/glusterbrick
Options Reconfigured:
performance.client-io-threads: off
nfs.disable: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
user.smb: disable
features.read-only: off
features.worm: off
features.worm-file-level: on
features.retention-mode: enterprise
features.default-retention-period: 120
network.ping-timeout: 10
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.nl-cache: on
performance.nl-cache-timeout: 600
client.event-threads: 32
server.event-threads: 32
cluster.lookup-optimize: on
performance.stat-prefetch: on
performance.cache-invalidation: on
performance.md-cache-timeout: 600
performance.cache-samba-metadata: on
performance.cache-ima-xattrs: on
performance.io-thread-count: 64
cluster.use-compound-fops: on
performance.cache-size: 512MB
performance.cache-refresh-timeout: 10
   

Re: [Gluster-users] Performance is falling rapidly when updating from v5.5 to v7.0

2019-11-05 Thread David Spisla
I ran the test with Gluster 7.0 and ctime disabled, but it had no effect:
(All values in MiB/s)
64KiB   1MiB   10MiB
0,16    2,60   54,74

Attached is the complete profile file, now also including the results from
the last test. I will not repeat it with a higher inode size because I
don't think this will have an effect.
There must be another cause for the low performance.

Regards
David Spisla

On Tue, 5 Nov 2019 at 16:25, David Spisla wrote:

>
>
> On Tue, 5 Nov 2019 at 12:06, RAFI KC wrote:
>
>>
>> On 11/4/19 8:46 PM, David Spisla wrote:
>>
>> Dear Gluster Community,
>>
>> I also have an issue concerning performance. Over the last few days I
>> updated our test cluster from GlusterFS v5.5 to v7.0. The setup in general:
>>
>> 2 HP DL380 servers with 10Gbit NICs, 1 Distribute-Replica 2 volume with 2
>> replica pairs. The client is SMB Samba (access via vfs_glusterfs). I did
>> several tests to ensure that Samba doesn't cause the drop.
>> The setup is completely the same except for the Gluster version.
>> Here are my results:
>> 64KiB   1MiB    10MiB    (Filesize)
>> 3,49    47,41   300,50   (Values in MiB/s with GlusterFS v5.5)
>> 0,16    2,61    76,63    (Values in MiB/s with GlusterFS v7.0)
>>
>>
>> Can you please share the profile information [1] for both versions? Also,
>> it would be really helpful if you could describe the I/O patterns used for
>> these tests.
>>
>> [1] :
>> https://docs.gluster.org/en/latest/Administrator%20Guide/Monitoring%20Workload/
>>
> Hello Rafi,
> thank you for your help.
>
> * First, more information about the I/O patterns: As a client we use a DL360
> Windows Server 2017 machine with a 10Gbit NIC connected to the storage
> machines. The share is mounted via SMB and the tests write with fio.
> We use these job files (see attachment). Each job file is executed
> separately, and there is a sleep of about 60 s between test runs to let
> the system calm down before starting a new test.
>
> * Attached below you will find the profile output from the tests with v5.5
> (ctime enabled) and v7.0 (ctime enabled).
>
> * Besides the tests with Samba, I also did some fio tests directly on the
> FUSE mounts (locally on one of the storage nodes). The results show that
> there is only a small decrease in performance between v5.5 and v7.0
> (All values in MiB/s)
> 64KiB   1MiB     10MiB
> 50,09   679,96   1023,02   (v5.5)
> 47,00   656,46   977,60    (v7.0)
>
> It seems that the combination of Samba + Gluster 7.0 has a lot of
> problems, doesn't it?
>
>
>>
>> We use these volume options (GlusterFS 7.0):
>>
>> Volume Name: archive1
>> Type: Distributed-Replicate
>> Volume ID: 44c17844-0bd4-4ca2-98d8-a1474add790c
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 2 x 2 = 4
>> Transport-type: tcp
>> Bricks:
>> Brick1: fs-dl380-c1-n1:/gluster/brick1/glusterbrick
>> Brick2: fs-dl380-c1-n2:/gluster/brick1/glusterbrick
>> Brick3: fs-dl380-c1-n1:/gluster/brick2/glusterbrick
>> Brick4: fs-dl380-c1-n2:/gluster/brick2/glusterbrick
>> Options Reconfigured:
>> performance.client-io-threads: off
>> nfs.disable: on
>> storage.fips-mode-rchecksum: on
>> transport.address-family: inet
>> user.smb: disable
>> features.read-only: off
>> features.worm: off
>> features.worm-file-level: on
>> features.retention-mode: enterprise
>> features.default-retention-period: 120
>> network.ping-timeout: 10
>> features.cache-invalidation: on
>> features.cache-invalidation-timeout: 600
>> performance.nl-cache: on
>> performance.nl-cache-timeout: 600
>> client.event-threads: 32
>> server.event-threads: 32
>> cluster.lookup-optimize: on
>> performance.stat-prefetch: on
>> performance.cache-invalidation: on
>> performance.md-cache-timeout: 600
>> performance.cache-samba-metadata: on
>> performance.cache-ima-xattrs: on
>> performance.io-thread-count: 64
>> cluster.use-compound-fops: on
>> performance.cache-size: 512MB
>> performance.cache-refresh-timeout: 10
>> performance.read-ahead: off
>> performance.write-behind-window-size: 4MB
>> performance.write-behind: on
>> storage.build-pgfid: on
>> features.ctime: on
>> cluster.quorum-type: fixed
>> cluster.quorum-count: 1
>> features.bitrot: on
>> features.scrub: Active
>> features.scrub-freq: daily
>>
>> For GlusterFS 5.5 it's nearly the same, except that there were 2
>> options to enable the ctime feature.
>>
>>
>>
>> Ctime stores additional metadata as extended attributes, which
>> sometimes exceed the default inode size. In such scenarios the
>> additional xattrs won't fit into the default size. This results in
>> additional blocks being used to store the inode's xattrs, which will
>> affect latency. This depends purely on the I/O operations and the
>> total xattr size stored in the inode.
>>
>> Would it be possible for you to repeat the test with ctime disabled, or
>> with the inode size increased to a higher value, say 1024KB?
>>
> I will do so but for today I could not 

Re: [Gluster-users] Performance is falling rapidly when updating from v5.5 to v7.0

2019-11-05 Thread David Spisla
On Tue, 5 Nov 2019 at 12:06, RAFI KC wrote:

>
> On 11/4/19 8:46 PM, David Spisla wrote:
>
> Dear Gluster Community,
>
> I also have an issue concerning performance. Over the last few days I
> updated our test cluster from GlusterFS v5.5 to v7.0. The setup in general:
>
> 2 HP DL380 servers with 10Gbit NICs, 1 Distribute-Replica 2 volume with 2
> replica pairs. The client is SMB Samba (access via vfs_glusterfs). I did
> several tests to ensure that Samba doesn't cause the drop.
> The setup is completely the same except for the Gluster version.
> Here are my results:
> 64KiB   1MiB    10MiB    (Filesize)
> 3,49    47,41   300,50   (Values in MiB/s with GlusterFS v5.5)
> 0,16    2,61    76,63    (Values in MiB/s with GlusterFS v7.0)
>
>
> Can you please share the profile information [1] for both versions? Also,
> it would be really helpful if you could describe the I/O patterns used for
> these tests.
>
> [1] :
> https://docs.gluster.org/en/latest/Administrator%20Guide/Monitoring%20Workload/
>
Hello Rafi,
thank you for your help.

* First, more information about the I/O patterns: As a client we use a DL360
Windows Server 2017 machine with a 10Gbit NIC connected to the storage
machines. The share is mounted via SMB and the tests write with fio.
We use these job files (see attachment). Each job file is executed
separately, and there is a sleep of about 60 s between test runs to let
the system calm down before starting a new test.

* Attached below you will find the profile output from the tests with v5.5
(ctime enabled) and v7.0 (ctime enabled).

* Besides the tests with Samba, I also did some fio tests directly on the
FUSE mounts (locally on one of the storage nodes). The results show that
there is only a small decrease in performance between v5.5 and v7.0
(All values in MiB/s)
64KiB   1MiB     10MiB
50,09   679,96   1023,02   (v5.5)
47,00   656,46   977,60    (v7.0)

It seems that the combination of Samba + Gluster 7.0 has a lot of
problems, doesn't it?
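
To put the two result tables side by side, the slowdown per file size can be computed with a short Python sketch. It uses only the MiB/s values posted in this thread; the helper names are made up for the example, and the decimal commas are converted before dividing.

```python
def parse_mib(value):
    """Convert a decimal-comma string such as '47,41' to a float."""
    return float(value.replace(",", "."))


def slowdown(old, new):
    """Return how many times slower 'new' is than 'old' for each file size."""
    return [round(parse_mib(o) / parse_mib(n), 1) for o, n in zip(old, new)]


sizes = ["64KiB", "1MiB", "10MiB"]
# Values in MiB/s as posted in this thread.
samba_v55 = ["3,49", "47,41", "300,50"]
samba_v70 = ["0,16", "2,61", "76,63"]
fuse_v55 = ["50,09", "679,96", "1023,02"]
fuse_v70 = ["47,00", "656,46", "977,60"]

for label, old, new in (("Samba", samba_v55, samba_v70), ("FUSE", fuse_v55, fuse_v70)):
    print(label, dict(zip(sizes, slowdown(old, new))))
```

With these numbers the Samba path is roughly 21.8x, 18.2x and 3.9x slower on v7.0 for 64KiB, 1MiB and 10MiB files, while the FUSE path stays at about 1.0x to 1.1x.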


>
> We use these volume options (GlusterFS 7.0):
>
> Volume Name: archive1
> Type: Distributed-Replicate
> Volume ID: 44c17844-0bd4-4ca2-98d8-a1474add790c
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 2 x 2 = 4
> Transport-type: tcp
> Bricks:
> Brick1: fs-dl380-c1-n1:/gluster/brick1/glusterbrick
> Brick2: fs-dl380-c1-n2:/gluster/brick1/glusterbrick
> Brick3: fs-dl380-c1-n1:/gluster/brick2/glusterbrick
> Brick4: fs-dl380-c1-n2:/gluster/brick2/glusterbrick
> Options Reconfigured:
> performance.client-io-threads: off
> nfs.disable: on
> storage.fips-mode-rchecksum: on
> transport.address-family: inet
> user.smb: disable
> features.read-only: off
> features.worm: off
> features.worm-file-level: on
> features.retention-mode: enterprise
> features.default-retention-period: 120
> network.ping-timeout: 10
> features.cache-invalidation: on
> features.cache-invalidation-timeout: 600
> performance.nl-cache: on
> performance.nl-cache-timeout: 600
> client.event-threads: 32
> server.event-threads: 32
> cluster.lookup-optimize: on
> performance.stat-prefetch: on
> performance.cache-invalidation: on
> performance.md-cache-timeout: 600
> performance.cache-samba-metadata: on
> performance.cache-ima-xattrs: on
> performance.io-thread-count: 64
> cluster.use-compound-fops: on
> performance.cache-size: 512MB
> performance.cache-refresh-timeout: 10
> performance.read-ahead: off
> performance.write-behind-window-size: 4MB
> performance.write-behind: on
> storage.build-pgfid: on
> features.ctime: on
> cluster.quorum-type: fixed
> cluster.quorum-count: 1
> features.bitrot: on
> features.scrub: Active
> features.scrub-freq: daily
>
> For GlusterFS 5.5 it's nearly the same, except that there were 2
> options to enable the ctime feature.
>
>
>
> Ctime stores additional metadata as extended attributes, which
> sometimes exceed the default inode size. In such scenarios the
> additional xattrs won't fit into the default size. This results in
> additional blocks being used to store the inode's xattrs, which will
> affect latency. This depends purely on the I/O operations and the
> total xattr size stored in the inode.
>
> Would it be possible for you to repeat the test with ctime disabled, or
> with the inode size increased to a higher value, say 1024KB?
>
I will do so, but today I could not finish the tests with ctime disabled (or
a higher inode size) because they take a lot of time with v7.0 due to the low
performance. I will run them tomorrow and send you the results as soon as
possible.
By the way: do you really mean an inode size of 1024 KB on the XFS layer? Or do
you mean 1024 bytes? We use 512 bytes by default, because that has been the
recommended size until now. But it seems there is a need for a new
recommendation when the ctime feature is used by default. I cannot imagine that
this is the real cause of the low performance, because in v5.5 we also use
the ctime feature with an inode size of 512 bytes.
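
The unit question above can be made concrete with a small check. This Python sketch assumes the inode size must be a power of two and uses 2048 bytes as the upper limit, as mentioned elsewhere in this thread. The 256-byte lower bound is an assumption taken from the mkfs.xfs documentation and may be higher depending on the filesystem features in use.

```python
def is_valid_xfs_inode_size(size_bytes, minimum=256, maximum=2048):
    """Check a candidate inode size against the assumed mkfs.xfs limits."""
    is_power_of_two = size_bytes > 0 and (size_bytes & (size_bytes - 1)) == 0
    return is_power_of_two and minimum <= size_bytes <= maximum


for label, size in (("512 bytes", 512), ("1024 bytes", 1024), ("1024 KB", 1024 * 1024)):
    verdict = "valid" if is_valid_xfs_inode_size(size) else "not valid"
    print(f"{label}: {verdict}")
```

Under these assumptions 512 and 1024 bytes pass the check and 1024 KB does not, which is why the value is read as 1024 bytes.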


Re: [Gluster-users] Performance is falling rapidly when updating from v5.5 to v7.0

2019-11-05 Thread RAFI KC


On 11/4/19 8:46 PM, David Spisla wrote:

Dear Gluster Community,

I also have an issue concerning performance. Over the last few days I
updated our test cluster from GlusterFS v5.5 to v7.0. The setup in general:


2 HP DL380 servers with 10Gbit NICs, 1 Distribute-Replica 2 volume
with 2 replica pairs. The client is SMB Samba (access via vfs_glusterfs).
I did several tests to ensure that Samba doesn't cause the drop.

The setup is completely the same except for the Gluster version.
Here are my results:
64KiB   1MiB    10MiB    (Filesize)
3,49    47,41   300,50   (Values in MiB/s with GlusterFS v5.5)
0,16    2,61    76,63    (Values in MiB/s with GlusterFS v7.0)



Can you please share the profile information [1] for both versions?
Also, it would be really helpful if you could describe the I/O patterns
used for these tests.


[1] : 
https://docs.gluster.org/en/latest/Administrator%20Guide/Monitoring%20Workload/




We use these volume options (GlusterFS 7.0):

Volume Name: archive1
Type: Distributed-Replicate
Volume ID: 44c17844-0bd4-4ca2-98d8-a1474add790c
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: fs-dl380-c1-n1:/gluster/brick1/glusterbrick
Brick2: fs-dl380-c1-n2:/gluster/brick1/glusterbrick
Brick3: fs-dl380-c1-n1:/gluster/brick2/glusterbrick
Brick4: fs-dl380-c1-n2:/gluster/brick2/glusterbrick
Options Reconfigured:
performance.client-io-threads: off
nfs.disable: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
user.smb: disable
features.read-only: off
features.worm: off
features.worm-file-level: on
features.retention-mode: enterprise
features.default-retention-period: 120
network.ping-timeout: 10
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.nl-cache: on
performance.nl-cache-timeout: 600
client.event-threads: 32
server.event-threads: 32
cluster.lookup-optimize: on
performance.stat-prefetch: on
performance.cache-invalidation: on
performance.md-cache-timeout: 600
performance.cache-samba-metadata: on
performance.cache-ima-xattrs: on
performance.io-thread-count: 64
cluster.use-compound-fops: on
performance.cache-size: 512MB
performance.cache-refresh-timeout: 10
performance.read-ahead: off
performance.write-behind-window-size: 4MB
performance.write-behind: on
storage.build-pgfid: on
features.ctime: on
cluster.quorum-type: fixed
cluster.quorum-count: 1
features.bitrot: on
features.scrub: Active
features.scrub-freq: daily

For GlusterFS 5.5 it's nearly the same, except that there were
2 options to enable the ctime feature.
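
A difference between two option sets like this can be listed mechanically. The Python sketch below assumes both "Options Reconfigured" lists have been saved as plain `key: value` text. The v5.5 list is not shown in this thread, so the example input is a stand-in and `example.hypothetical-option` is a made-up placeholder name.

```python
def parse_options(text):
    """Parse 'key: value' lines into a dict, ignoring lines without a key."""
    options = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            options[key.strip()] = value.strip()
    return options


def diff_options(old, new):
    """Return options that were removed, added, or changed between two dicts."""
    removed = {k: old[k] for k in old.keys() - new.keys()}
    added = {k: new[k] for k in new.keys() - old.keys()}
    changed = {k: (old[k], new[k]) for k in old.keys() & new.keys() if old[k] != new[k]}
    return removed, added, changed


# Stand-in input; 'example.hypothetical-option' is a placeholder, not a real option.
v55 = parse_options("features.ctime: on\nexample.hypothetical-option: on\nnfs.disable: on")
v70 = parse_options("features.ctime: on\nnfs.disable: on")
print(diff_options(v55, v70))
```

Running the same comparison on the real option lists would show exactly which settings differ between the two clusters, rather than relying on "nearly the same".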




Ctime stores additional metadata as extended attributes, which
sometimes exceed the default inode size. In such scenarios the
additional xattrs won't fit into the default size. This results in
additional blocks being used to store the inode's xattrs, which will
affect latency. This depends purely on the I/O operations and the
total xattr size stored in the inode.
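
As a rough way to check this on a brick file, the total xattr payload can be summed and compared with the inode size. The Python sketch below is an assumption-laden illustration: it relies on `os.listxattr` and `os.getxattr`, which are Linux-only, reading `trusted.*` attributes normally needs root, and the demo uses an invented attribute set (`trusted.example` is a made-up name). The space actually available for inline xattrs is smaller than the whole inode, so this only detects the clear-cut case.

```python
import os


def xattr_payload_bytes(path, lister=None, getter=None):
    """Sum the name and value lengths of all extended attributes on path."""
    lister = lister or os.listxattr
    getter = getter or os.getxattr
    total = 0
    for name in lister(path):
        total += len(name.encode()) + len(getter(path, name))
    return total


# Invented attribute set for demonstration only; not taken from a real brick.
fake = {"trusted.gfid": b"\x00" * 16, "trusted.example": b"\x00" * 600}
size = xattr_payload_bytes(
    "unused-path",
    lister=lambda path: list(fake),
    getter=lambda path, name: fake[name],
)
print(size, "exceeds 512:", size > 512, "exceeds 1024:", size > 1024)
```

On a real brick the function would be called with just the file path, and the result compared with the inode size chosen when the XFS filesystem was created.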


Would it be possible for you to repeat the test with ctime disabled, or
with the inode size increased to a higher value, say 1024KB?




Our optimization for Samba looks like this (for every version):

[global]
workgroup = SAMBA
netbios name = CLUSTER
kernel share modes = no
aio read size = 1
aio write size = 1
kernel oplocks = no
max open files = 10
nt acl support = no
security = user
server min protocol = SMB2
store dos attributes = no
strict locking = no
full_audit:failure = pwrite_send pwrite_recv pwrite offload_write_send offload_write_recv create_file open unlink connect disconnect rename chown fchown lchown chmod fchmod mkdir rmdir ntimes ftruncate fallocate
full_audit:success = pwrite_send pwrite_recv pwrite offload_write_send offload_write_recv create_file open unlink connect disconnect rename chown fchown lchown chmod fchmod mkdir rmdir ntimes ftruncate fallocate

full_audit:facility = local5
durable handles = yes
posix locking = no
log level = 2
max log size = 10
debug pid = yes

What could be the cause of this rapid drop in performance for
small files? Are some of our volume options no longer recommended?
There were some patches concerning small-file performance in v6.0
and v7.0:

#1670031: performance regression seen with smallfile workload tests

#1659327: 43% regression in small-file sequential read performance

And one patch for the io-cache:

#1659869: improvements to io-cache

Regards

David Spisla




Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/118564314

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/118564314

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

