Re: [Gluster-users] Slow write times to gluster disk

2018-07-14 Thread Raghavendra Gowdappa
On 6/30/18, Raghavendra Gowdappa  wrote:
> On Fri, Jun 29, 2018 at 10:38 PM, Pat Haley  wrote:
>
>>
>> Hi Raghavendra,
>>
>> We ran the tests (write tests) and I copied the log files for both the
>> server and the client to http://mseas.mit.edu/download/phaley/GlusterUsers/2018/Jun29/ .  Is there any additional trace
>> information you need?  (If so, where should I look for it?)
>>
>
> Nothing for now. I can see from the logs that the workaround is not helping:
> fstat requests are not absorbed by md-cache, so read-ahead is witnessing them
> and flushing its read-ahead cache. I am investigating md-cache further (it
> also seems to be invalidating inodes quite frequently, which might actually
> be the root cause of so many fstat requests coming from the kernel). Will
> post when I find anything relevant.

+Poornima.

@Poornima,

Can you investigate why the fstats sent by the kernel are not absorbed by
md-cache in sequential read tests? Note that md-cache doesn't flush
its metadata cache on reads (which can be a bug for applications
requiring strict atime consistency). So, I am expecting that the fstats
should've been absorbed by it.

regards,
Raghavendra
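One way anyone reproducing this could count the fstats actually reaching the bricks is gluster's io-stats profiling. This is a sketch, not something stated in the thread: the profiling subcommands are standard gluster CLI, the volume name comes from the `volume info` output below, and the commands need a live cluster to run.

```shell
# Hypothetical observation sketch: use gluster volume profiling to compare
# FSTAT vs READ counts on the bricks during a sequential read test.
# Requires a running glusterd and the data-volume volume; illustrative only.
gluster volume profile data-volume start
# ... run the sequential read workload on a client mount here ...
gluster volume profile data-volume info    # per-brick FOP counts and latencies
gluster volume profile data-volume stop
```

If md-cache were absorbing the fstats, the FSTAT count in the profile output would stay small relative to READ.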

>
>
>> Also the volume information you requested
>>
>> [root@mseas-data2 ~]# gluster volume info data-volume
>>
>> Volume Name: data-volume
>> Type: Distribute
>> Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
>> Status: Started
>> Number of Bricks: 2
>> Transport-type: tcp
>> Bricks:
>> Brick1: mseas-data2:/mnt/brick1
>> Brick2: mseas-data2:/mnt/brick2
>> Options Reconfigured:
>> diagnostics.client-log-level: TRACE
>> network.inode-lru-limit: 5
>> performance.md-cache-timeout: 60
>> performance.open-behind: off
>> disperse.eager-lock: off
>> auth.allow: *
>> server.allow-insecure: on
>> nfs.exports-auth-enable: on
>> diagnostics.brick-sys-log-level: WARNING
>> performance.readdir-ahead: on
>> nfs.disable: on
>> nfs.export-volumes: off
>> [root@mseas-data2 ~]#
>>
>>
>> On 06/29/2018 12:28 PM, Raghavendra Gowdappa wrote:
>>
>>
>>
>> On Fri, Jun 29, 2018 at 8:24 PM, Pat Haley  wrote:
>>
>>>
>>> Hi Raghavendra,
>>>
>>> Our technician was able to try the manual setting today.  He found that
>>> our upper limit for performance.md-cache-timeout was 60 not 600, so he
>>> used that value, along with the network.inode-lru-limit=5.
>>>
>>> The result was another small (~1%) increase in speed.  Does this suggest
>>> some addition tests/changes we could try?
>>>
>>
>> Can you set the gluster option diagnostics.client-log-level to TRACE and run
>> the sequential read tests again (with the md-cache-timeout value of 60)?
>>
>> # gluster volume set <volname> diagnostics.client-log-level TRACE
>>
>> Also, are you sure that open-behind was turned off? Can you give the
>> output of:
>>
>> # gluster volume info <volname>
>>
>>
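Put together, the capture run being asked for might look like the sketch below. The volume name comes from the thread; the mount point, test-file name, and log location are assumptions, not details from this exchange.

```shell
# Hypothetical capture sketch: raise the client log level, run one
# sequential read, then lower the level again so the log stays manageable.
# Requires a live cluster; paths and file names are assumptions.
gluster volume set data-volume diagnostics.client-log-level TRACE
dd if=/gluster-mount/bigfile.dat of=/dev/null bs=1M   # sequential read test
gluster volume set data-volume diagnostics.client-log-level INFO
# The client log to share is typically under /var/log/glusterfs/ on the
# client, named after the mount point.
```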
>>> Thanks
>>>
>>> Pat
>>>
>>>
>>>
>>>
>>> On 06/25/2018 09:39 PM, Raghavendra Gowdappa wrote:
>>>
>>>
>>>
>>> On Tue, Jun 26, 2018 at 3:21 AM, Pat Haley  wrote:
>>>

 Hi Raghavendra,

 Setting the performance.write-behind off had a small improvement on the
 write speed (~3%),

 We were unable to turn on "group metadata-cache".  When we try, we get
 errors like

 # gluster volume set data-volume group metadata-cache
 '/var/lib/glusterd/groups/metadata-cache' file format not valid.

 Was metadata-cache available for gluster 3.7.11? We ask because the
 release notes for 3.11 mention “Feature for metadata-caching/small file
 performance is production ready.”
 (https://gluster.readthedocs.io/en/latest/release-notes/3.11.0/).

 Do any of these results suggest anything?  If not, what further tests
 would be useful?

>>>
>>> Group metadata-cache is just a bunch of options one sets on a volume. So,
>>> you can set them manually using the gluster CLI. Following are the options
>>> and their values:
>>>
>>> performance.md-cache-timeout=600
>>> network.inode-lru-limit=5
>>>
>>>
>>>
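As concrete commands, that could look like the sketch below. The `volume set` lines mirror the options just listed (note: the stock metadata-cache group file shipped with newer gluster releases sets network.inode-lru-limit=50000, so the limit shown in this archived thread looks truncated); the group-file workaround in the second half is an assumption based on those newer releases, not something suggested in the thread.

```shell
# Option A (hypothetical): apply the two metadata-cache options directly.
#   gluster volume set data-volume performance.md-cache-timeout 600
#   gluster volume set data-volume network.inode-lru-limit 50000

# Option B (hypothetical): recreate the group file that 3.7.11 lacks, so
# "gluster volume set data-volume group metadata-cache" stops failing with
# "file format not valid". Staged under GROUP_DIR (default /tmp) for a dry
# run; the real location is /var/lib/glusterd/groups/ on each server.
GROUP_DIR="${GROUP_DIR:-/tmp}"
cat > "$GROUP_DIR/metadata-cache" <<'EOF'
performance.md-cache-timeout=600
network.inode-lru-limit=50000
EOF
```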
 Thanks

 Pat




 On 06/22/2018 07:51 AM, Raghavendra Gowdappa wrote:



 On Thu, Jun 21, 2018 at 8:41 PM, Pat Haley  wrote:

>
> Hi Raghavendra,
>
> Thanks for the suggestions.  Our technician will be in on Monday.
> We'll test then and let you know the results.
>
> One question I have, is the "group metadata-cache" option supposed to
> directly impact the performance or is it to help collect data?  If the
> latter, where will the data be located?
>

 It impacts performance.


> Thanks again.
>
> Pat
>
>
>
> On 06/21/2018 01:01 AM, Raghavendra Gowdappa wrote:
>
>
>
> On Thu, Jun 21, 2018 at 10:24 AM, Raghavendra Gowdappa <
> rgowd...@redhat.com> wrote:
>
>> For the case of writes to glusterfs mount,
>>
>> I saw in earlier conversations that there are too many lookups, but
>> small number of writes. Since writes cached in write-behind would
>> invalidate metadata 

Re: [Gluster-users] Slow write times to gluster disk

2018-07-13 Thread Raghavendra Gowdappa
On Fri, Jul 13, 2018 at 5:00 AM, Pat Haley  wrote:

>
> Hi Raghavendra,
>
> We were wondering if you have had a chance to look at this again, and if
> so, did you have any further suggestions?
>

Sorry Pat. Too much work :). I'll be working on a patch today to make
read-ahead's flushing of its cache on fstats optional, so that this behavior
no longer hurts sequential reads. You can try this patch and let us know
about the results. Will let you know when the patch is ready.


> Thanks
>
> Pat
>
>
>
> On 07/06/2018 01:27 AM, Raghavendra Gowdappa wrote:
>
>
>
> On Fri, Jul 6, 2018 at 5:29 AM, Pat Haley  wrote:
>
>>
>> Hi Raghavendra,
>>
>> Our technician may have some time to look at this issue tomorrow.  Are
>> there any tests that you'd like to see?
>>
>
> Sorry. I've been busy with other things and was away from work for couple
> of days. It'll take me another 2 days to work on this issue again. So, most
> likely you'll have an update on this next week.
>
>
>> Thanks
>>
>> Pat
>>
>>
>>
>> On 06/29/2018 11:25 PM, Raghavendra Gowdappa wrote:
>>
>>
>>
>> On Fri, Jun 29, 2018 at 10:38 PM, Pat Haley  wrote:
>>
>>>
>>> Hi Raghavendra,
>>>
>>> We ran the tests (write tests) and I copied the log files for both the
>>> server and the client to http://mseas.mit.edu/download/phaley/GlusterUsers/2018/Jun29/ .  Is there any additional trace
>>> information you need?  (If so, where should I look for it?)
>>>
>>
>> Nothing for now. I can see from logs that workaround is not helping.
>> fstat requests are not absorbed by md-cache and read-ahead is witnessing
>> them and flushing its read-ahead cache. I am investigating more on md-cache
>> (It also seems to be invalidating inodes quite frequently which actually
>> might be the root cause of seeing so many fstat requests from kernel). Will
>> post when I find anything relevant.
>>
>>
>>> Also the volume information you requested
>>>
>>> [root@mseas-data2 ~]# gluster volume info data-volume
>>>
>>> Volume Name: data-volume
>>> Type: Distribute
>>> Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
>>> Status: Started
>>> Number of Bricks: 2
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: mseas-data2:/mnt/brick1
>>> Brick2: mseas-data2:/mnt/brick2
>>> Options Reconfigured:
>>> diagnostics.client-log-level: TRACE
>>> network.inode-lru-limit: 5
>>> performance.md-cache-timeout: 60
>>> performance.open-behind: off
>>> disperse.eager-lock: off
>>> auth.allow: *
>>> server.allow-insecure: on
>>> nfs.exports-auth-enable: on
>>> diagnostics.brick-sys-log-level: WARNING
>>> performance.readdir-ahead: on
>>> nfs.disable: on
>>> nfs.export-volumes: off
>>> [root@mseas-data2 ~]#
>>>
>>>
>>> On 06/29/2018 12:28 PM, Raghavendra Gowdappa wrote:
>>>
>>>
>>>
>>> On Fri, Jun 29, 2018 at 8:24 PM, Pat Haley  wrote:
>>>

 Hi Raghavendra,

 Our technician was able to try the manual setting today.  He found that
 our upper limit for performance.md-cache-timeout was 60 not 600, so he
 used that value, along with the network.inode-lru-limit=5.

 The result was another small (~1%) increase in speed.  Does this
 suggest some addition tests/changes we could try?

>>>
>>> Can you set gluster option diagnostics.client-log-level to TRACE  and
>>> run sequential read tests again (with md-cache-timeout value of 60)?
>>>
>>> #gluster volume set  diagnostics.client-log-level TRACE
>>>
>>> Also are you sure that open-behind was turned off? Can you give the
>>> output of,
>>>
>>> # gluster volume info 
>>>
>>>
 Thanks

 Pat




 On 06/25/2018 09:39 PM, Raghavendra Gowdappa wrote:



 On Tue, Jun 26, 2018 at 3:21 AM, Pat Haley  wrote:

>
> Hi Raghavendra,
>
> Setting the performance.write-behind off had a small improvement on
> the write speed (~3%),
>
> We were unable to turn on "group metadata-cache".  When we try get
> errors like
>
> # gluster volume set data-volume group metadata-cache
> '/var/lib/glusterd/groups/metadata-cache' file format not valid.
>
> Was metadata-cache available for gluster 3.7.11? We ask because the
> release notes for 3.11 mentions “Feature for metadata-caching/small file
> performance is production ready.” (https://gluster.readthedocs.io/en/latest/release-notes/3.11.0/).
>
> Do any of these results suggest anything?  If not, what further tests
> would be useful?
>

 Group metadata-cache is just a bunch of options one sets on a volume.
 So, You can set them manually using gluster cli. Following are the options
 and their values:

 performance.md-cache-timeout=600
 network.inode-lru-limit=5



> Thanks
>
> Pat
>
>
>
>
> On 06/22/2018 07:51 AM, Raghavendra Gowdappa wrote:
>
>
>
> On Thu, Jun 21, 2018 at 8:41 PM, Pat Haley  wrote:
>
>>
>> Hi Raghavendra,
>>
>> Thanks for the suggestions.  Our technician will be in on Monday.

Re: [Gluster-users] Slow write times to gluster disk

2018-07-12 Thread Pat Haley


Hi Raghavendra,

We were wondering if you have had a chance to look at this again, and if 
so, did you have any further suggestions?


Thanks

Pat


On 07/06/2018 01:27 AM, Raghavendra Gowdappa wrote:



On Fri, Jul 6, 2018 at 5:29 AM, Pat Haley <pha...@mit.edu> wrote:



Hi Raghavendra,

Our technician may have some time to look at this issue tomorrow. 
Are there any tests that you'd like to see?


Sorry. I've been busy with other things and was away from work for 
couple of days. It'll take me another 2 days to work on this issue 
again. So, most likely you'll have an update on this next week.



Thanks

Pat



On 06/29/2018 11:25 PM, Raghavendra Gowdappa wrote:



On Fri, Jun 29, 2018 at 10:38 PM, Pat Haley <pha...@mit.edu> wrote:


Hi Raghavendra,

We ran the tests (write tests) and I copied the log files for
both the server and the client to
http://mseas.mit.edu/download/phaley/GlusterUsers/2018/Jun29/ .
Is there any additional trace information you need?  (If so,
where should I look for it?)


Nothing for now. I can see from logs that workaround is not
helping. fstat requests are not absorbed by md-cache and
read-ahead is witnessing them and flushing its read-ahead cache.
I am investigating more on md-cache (It also seems to be
invalidating inodes quite frequently which actually might be the
root cause of seeing so many fstat requests from kernel). Will
post when I find anything relevant.


Also the volume information you requested

[root@mseas-data2 ~]# gluster volume info data-volume

Volume Name: data-volume
Type: Distribute
Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: mseas-data2:/mnt/brick1
Brick2: mseas-data2:/mnt/brick2
Options Reconfigured:
diagnostics.client-log-level: TRACE
network.inode-lru-limit: 5
performance.md-cache-timeout: 60
performance.open-behind: off
disperse.eager-lock: off
auth.allow: *
server.allow-insecure: on
nfs.exports-auth-enable: on
diagnostics.brick-sys-log-level: WARNING
performance.readdir-ahead: on
nfs.disable: on
nfs.export-volumes: off
[root@mseas-data2 ~]#


On 06/29/2018 12:28 PM, Raghavendra Gowdappa wrote:



On Fri, Jun 29, 2018 at 8:24 PM, Pat Haley <pha...@mit.edu> wrote:


Hi Raghavendra,

Our technician was able to try the manual setting
today.  He found that our upper limit for
performance.md-cache-timeout was 60 not 600, so he used
that value, along with the network.inode-lru-limit=5.

The result was another small (~1%) increase in speed. 
Does this suggest some addition tests/changes we could try?


Can you set gluster option diagnostics.client-log-level to
TRACE  and run sequential read tests again (with
md-cache-timeout value of 60)?

#gluster volume set  diagnostics.client-log-level TRACE

Also are you sure that open-behind was turned off? Can you
give the output of,

# gluster volume info 


Thanks

Pat




On 06/25/2018 09:39 PM, Raghavendra Gowdappa wrote:



On Tue, Jun 26, 2018 at 3:21 AM, Pat Haley
<pha...@mit.edu> wrote:


Hi Raghavendra,

Setting the performance.write-behind off had a
small improvement on the write speed (~3%),

We were unable to turn on "group metadata-cache".
When we try get errors like

# gluster volume set data-volume group metadata-cache
'/var/lib/glusterd/groups/metadata-cache' file
format not valid.

Was metadata-cache available for gluster 3.7.11? We
ask because the release notes for 3.11 mentions
“Feature for metadata-caching/small file
performance is production ready.”
(https://gluster.readthedocs.io/en/latest/release-notes/3.11.0/).

Do any of these results suggest anything?  If not,
what further tests would be useful?


Group metadata-cache is just a bunch of options one
sets on a volume. So, You can set them manually using
gluster cli. Following are the options and their values:

performance.md-cache-timeout=600
network.inode-lru-limit=5



Thanks

Pat




On 06/22/2018 07:51 AM, Raghavendra 

Re: [Gluster-users] Slow write times to gluster disk

2018-07-05 Thread Raghavendra Gowdappa
On Fri, Jul 6, 2018 at 5:29 AM, Pat Haley  wrote:

>
> Hi Raghavendra,
>
> Our technician may have some time to look at this issue tomorrow.  Are
> there any tests that you'd like to see?
>

Sorry. I've been busy with other things and was away from work for a couple
of days. It'll take me another 2 days to work on this issue again. So, most
likely you'll have an update on this next week.


> Thanks
>
> Pat
>
>
>
> On 06/29/2018 11:25 PM, Raghavendra Gowdappa wrote:
>
>
>
> On Fri, Jun 29, 2018 at 10:38 PM, Pat Haley  wrote:
>
>>
>> Hi Raghavendra,
>>
>> We ran the tests (write tests) and I copied the log files for both the
>> server and the client to http://mseas.mit.edu/download/phaley/GlusterUsers/2018/Jun29/ .  Is there any additional trace
>> information you need?  (If so, where should I look for it?)
>>
>
> Nothing for now. I can see from logs that workaround is not helping. fstat
> requests are not absorbed by md-cache and read-ahead is witnessing them and
> flushing its read-ahead cache. I am investigating more on md-cache (It also
> seems to be invalidating inodes quite frequently which actually might be
> the root cause of seeing so many fstat requests from kernel). Will post
> when I find anything relevant.
>
>
>> Also the volume information you requested
>>
>> [root@mseas-data2 ~]# gluster volume info data-volume
>>
>> Volume Name: data-volume
>> Type: Distribute
>> Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
>> Status: Started
>> Number of Bricks: 2
>> Transport-type: tcp
>> Bricks:
>> Brick1: mseas-data2:/mnt/brick1
>> Brick2: mseas-data2:/mnt/brick2
>> Options Reconfigured:
>> diagnostics.client-log-level: TRACE
>> network.inode-lru-limit: 5
>> performance.md-cache-timeout: 60
>> performance.open-behind: off
>> disperse.eager-lock: off
>> auth.allow: *
>> server.allow-insecure: on
>> nfs.exports-auth-enable: on
>> diagnostics.brick-sys-log-level: WARNING
>> performance.readdir-ahead: on
>> nfs.disable: on
>> nfs.export-volumes: off
>> [root@mseas-data2 ~]#
>>
>>
>> On 06/29/2018 12:28 PM, Raghavendra Gowdappa wrote:
>>
>>
>>
>> On Fri, Jun 29, 2018 at 8:24 PM, Pat Haley  wrote:
>>
>>>
>>> Hi Raghavendra,
>>>
>>> Our technician was able to try the manual setting today.  He found that
>>> our upper limit for performance.md-cache-timeout was 60 not 600, so he
>>> used that value, along with the network.inode-lru-limit=5.
>>>
>>> The result was another small (~1%) increase in speed.  Does this suggest
>>> some addition tests/changes we could try?
>>>
>>
>> Can you set gluster option diagnostics.client-log-level to TRACE  and run
>> sequential read tests again (with md-cache-timeout value of 60)?
>>
>> #gluster volume set  diagnostics.client-log-level TRACE
>>
>> Also are you sure that open-behind was turned off? Can you give the
>> output of,
>>
>> # gluster volume info 
>>
>>
>>> Thanks
>>>
>>> Pat
>>>
>>>
>>>
>>>
>>> On 06/25/2018 09:39 PM, Raghavendra Gowdappa wrote:
>>>
>>>
>>>
>>> On Tue, Jun 26, 2018 at 3:21 AM, Pat Haley  wrote:
>>>

 Hi Raghavendra,

 Setting the performance.write-behind off had a small improvement on the
 write speed (~3%),

 We were unable to turn on "group metadata-cache".  When we try get
 errors like

 # gluster volume set data-volume group metadata-cache
 '/var/lib/glusterd/groups/metadata-cache' file format not valid.

 Was metadata-cache available for gluster 3.7.11? We ask because the
 release notes for 3.11 mentions “Feature for metadata-caching/small file
 performance is production ready.” (https://gluster.readthedocs.io/en/latest/release-notes/3.11.0/).

 Do any of these results suggest anything?  If not, what further tests
 would be useful?

>>>
>>> Group metadata-cache is just a bunch of options one sets on a volume.
>>> So, You can set them manually using gluster cli. Following are the options
>>> and their values:
>>>
>>> performance.md-cache-timeout=600
>>> network.inode-lru-limit=5
>>>
>>>
>>>
 Thanks

 Pat




 On 06/22/2018 07:51 AM, Raghavendra Gowdappa wrote:



 On Thu, Jun 21, 2018 at 8:41 PM, Pat Haley  wrote:

>
> Hi Raghavendra,
>
> Thanks for the suggestions.  Our technician will be in on Monday.
> We'll test then and let you know the results.
>
> One question I have, is the "group metadata-cache" option supposed to
> directly impact the performance or is it to help collect data?  If the
> latter, where will the data be located?
>

 It impacts performance.


> Thanks again.
>
> Pat
>
>
>
> On 06/21/2018 01:01 AM, Raghavendra Gowdappa wrote:
>
>
>
> On Thu, Jun 21, 2018 at 10:24 AM, Raghavendra Gowdappa <
> rgowd...@redhat.com> wrote:
>
>> For the case of writes to glusterfs mount,
>>
>> I saw in earlier conversations that there are too many lookups, but
>> small number of 

Re: [Gluster-users] Slow write times to gluster disk

2018-07-05 Thread Pat Haley


Hi Raghavendra,

Our technician may have some time to look at this issue tomorrow. Are 
there any tests that you'd like to see?


Thanks

Pat


On 06/29/2018 11:25 PM, Raghavendra Gowdappa wrote:



On Fri, Jun 29, 2018 at 10:38 PM, Pat Haley <pha...@mit.edu> wrote:



Hi Raghavendra,

We ran the tests (write tests) and I copied the log files for both
the server and the client to
http://mseas.mit.edu/download/phaley/GlusterUsers/2018/Jun29/ .
Is there any additional trace information you need? (If so, where
should I look for it?)


Nothing for now. I can see from logs that workaround is not helping. 
fstat requests are not absorbed by md-cache and read-ahead is 
witnessing them and flushing its read-ahead cache. I am investigating 
more on md-cache (It also seems to be invalidating inodes quite 
frequently which actually might be the root cause of seeing so many 
fstat requests from kernel). Will post when I find anything relevant.



Also the volume information you requested

[root@mseas-data2 ~]# gluster volume info data-volume

Volume Name: data-volume
Type: Distribute
Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: mseas-data2:/mnt/brick1
Brick2: mseas-data2:/mnt/brick2
Options Reconfigured:
diagnostics.client-log-level: TRACE
network.inode-lru-limit: 5
performance.md-cache-timeout: 60
performance.open-behind: off
disperse.eager-lock: off
auth.allow: *
server.allow-insecure: on
nfs.exports-auth-enable: on
diagnostics.brick-sys-log-level: WARNING
performance.readdir-ahead: on
nfs.disable: on
nfs.export-volumes: off
[root@mseas-data2 ~]#


On 06/29/2018 12:28 PM, Raghavendra Gowdappa wrote:



On Fri, Jun 29, 2018 at 8:24 PM, Pat Haley <pha...@mit.edu> wrote:


Hi Raghavendra,

Our technician was able to try the manual setting today.  He
found that our upper limit for performance.md-cache-timeout
was 60 not 600, so he used that value, along with the
network.inode-lru-limit=5.

The result was another small (~1%) increase in speed.  Does
this suggest some addition tests/changes we could try?


Can you set gluster option diagnostics.client-log-level to TRACE 
and run sequential read tests again (with md-cache-timeout value
of 60)?

#gluster volume set  diagnostics.client-log-level TRACE

Also are you sure that open-behind was turned off? Can you give
the output of,

# gluster volume info 


Thanks

Pat




On 06/25/2018 09:39 PM, Raghavendra Gowdappa wrote:



On Tue, Jun 26, 2018 at 3:21 AM, Pat Haley <pha...@mit.edu> wrote:


Hi Raghavendra,

Setting the performance.write-behind off had a small
improvement on the write speed (~3%),

We were unable to turn on "group metadata-cache".  When
we try get errors like

# gluster volume set data-volume group metadata-cache
'/var/lib/glusterd/groups/metadata-cache' file format
not valid.

Was metadata-cache available for gluster 3.7.11? We ask
because the release notes for 3.11 mentions “Feature for
metadata-caching/small file performance is production
ready.”
(https://gluster.readthedocs.io/en/latest/release-notes/3.11.0/).

Do any of these results suggest anything?  If not, what
further tests would be useful?


Group metadata-cache is just a bunch of options one sets on
a volume. So, You can set them manually using gluster cli.
Following are the options and their values:

performance.md-cache-timeout=600
network.inode-lru-limit=5



Thanks

Pat




On 06/22/2018 07:51 AM, Raghavendra Gowdappa wrote:



On Thu, Jun 21, 2018 at 8:41 PM, Pat Haley
<pha...@mit.edu> wrote:


Hi Raghavendra,

Thanks for the suggestions. Our technician will be
in on Monday.  We'll test then and let you know the
results.

One question I have, is the "group metadata-cache"
option supposed to directly impact the performance
or is it to help collect data? If the latter, where
will the data be located?


It impacts performance.


Thanks again.

Pat



On 06/21/2018 01:01 AM, Raghavendra Gowdappa wrote:



On Thu, Jun 21, 2018 at 10:24 AM, Raghavendra
Gowdappa <rgowd...@redhat.com> wrote:

  

Re: [Gluster-users] Slow write times to gluster disk

2018-06-29 Thread Raghavendra Gowdappa
On Fri, Jun 29, 2018 at 10:38 PM, Pat Haley  wrote:

>
> Hi Raghavendra,
>
> We ran the tests (write tests) and I copied the log files for both the
> server and the client to http://mseas.mit.edu/download/phaley/GlusterUsers/2018/Jun29/ .  Is there any additional trace
> information you need?  (If so, where should I look for it?)
>

Nothing for now. I can see from the logs that the workaround is not helping:
fstat requests are not absorbed by md-cache, so read-ahead is witnessing them
and flushing its read-ahead cache. I am investigating md-cache further (it
also seems to be invalidating inodes quite frequently, which might actually
be the root cause of so many fstat requests coming from the kernel). Will
post when I find anything relevant.


> Also the volume information you requested
>
> [root@mseas-data2 ~]# gluster volume info data-volume
>
> Volume Name: data-volume
> Type: Distribute
> Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
> Status: Started
> Number of Bricks: 2
> Transport-type: tcp
> Bricks:
> Brick1: mseas-data2:/mnt/brick1
> Brick2: mseas-data2:/mnt/brick2
> Options Reconfigured:
> diagnostics.client-log-level: TRACE
> network.inode-lru-limit: 5
> performance.md-cache-timeout: 60
> performance.open-behind: off
> disperse.eager-lock: off
> auth.allow: *
> server.allow-insecure: on
> nfs.exports-auth-enable: on
> diagnostics.brick-sys-log-level: WARNING
> performance.readdir-ahead: on
> nfs.disable: on
> nfs.export-volumes: off
> [root@mseas-data2 ~]#
>
>
> On 06/29/2018 12:28 PM, Raghavendra Gowdappa wrote:
>
>
>
> On Fri, Jun 29, 2018 at 8:24 PM, Pat Haley  wrote:
>
>>
>> Hi Raghavendra,
>>
>> Our technician was able to try the manual setting today.  He found that
>> our upper limit for performance.md-cache-timeout was 60 not 600, so he
>> used that value, along with the network.inode-lru-limit=5.
>>
>> The result was another small (~1%) increase in speed.  Does this suggest
>> some addition tests/changes we could try?
>>
>
> Can you set the gluster option diagnostics.client-log-level to TRACE and run
> the sequential read tests again (with the md-cache-timeout value of 60)?
>
> # gluster volume set <volname> diagnostics.client-log-level TRACE
>
> Also, are you sure that open-behind was turned off? Can you give the output
> of:
>
> # gluster volume info <volname>
>
>
>> Thanks
>>
>> Pat
>>
>>
>>
>>
>> On 06/25/2018 09:39 PM, Raghavendra Gowdappa wrote:
>>
>>
>>
>> On Tue, Jun 26, 2018 at 3:21 AM, Pat Haley  wrote:
>>
>>>
>>> Hi Raghavendra,
>>>
>>> Setting the performance.write-behind off had a small improvement on the
>>> write speed (~3%),
>>>
>>> We were unable to turn on "group metadata-cache".  When we try get
>>> errors like
>>>
>>> # gluster volume set data-volume group metadata-cache
>>> '/var/lib/glusterd/groups/metadata-cache' file format not valid.
>>>
>>> Was metadata-cache available for gluster 3.7.11? We ask because the
>>> release notes for 3.11 mentions “Feature for metadata-caching/small file
>>> performance is production ready.” (https://gluster.readthedocs.io/en/latest/release-notes/3.11.0/).
>>>
>>> Do any of these results suggest anything?  If not, what further tests
>>> would be useful?
>>>
>>
>> Group metadata-cache is just a bunch of options one sets on a volume. So,
>> You can set them manually using gluster cli. Following are the options and
>> their values:
>>
>> performance.md-cache-timeout=600
>> network.inode-lru-limit=5
>>
>>
>>
>>> Thanks
>>>
>>> Pat
>>>
>>>
>>>
>>>
>>> On 06/22/2018 07:51 AM, Raghavendra Gowdappa wrote:
>>>
>>>
>>>
>>> On Thu, Jun 21, 2018 at 8:41 PM, Pat Haley  wrote:
>>>

 Hi Raghavendra,

 Thanks for the suggestions.  Our technician will be in on Monday.
 We'll test then and let you know the results.

 One question I have, is the "group metadata-cache" option supposed to
 directly impact the performance or is it to help collect data?  If the
 latter, where will the data be located?

>>>
>>> It impacts performance.
>>>
>>>
 Thanks again.

 Pat



 On 06/21/2018 01:01 AM, Raghavendra Gowdappa wrote:



 On Thu, Jun 21, 2018 at 10:24 AM, Raghavendra Gowdappa <
 rgowd...@redhat.com> wrote:

> For the case of writes to glusterfs mount,
>
> I saw in earlier conversations that there are too many lookups, but
> small number of writes. Since writes cached in write-behind would
> invalidate metadata cache, lookups won't be absorbed by md-cache. I am
> wondering what would results look like if we turn off
> performance.write-behind.
>
> @Pat,
>
> Can you set,
>
> # gluster volume set <volname> performance.write-behind off
>

 Please turn on "group metadata-cache" for write tests too.


> and redo the tests writing to glusterfs mount? Let us know about the
> results you see.
>
> regards,
> Raghavendra
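For reference, a write-test run of the kind being requested could be sketched as below. The mount point, file name, and size are assumptions (the thread never states the exact test commands); MOUNT defaults to /tmp so the sketch can be dry-run off-cluster.

```shell
# Hypothetical sequential write test: dd reports a MB/s figure on completion,
# giving a number to compare before/after performance.write-behind is toggled.
set -e
MOUNT="${MOUNT:-/tmp}"             # point at the glusterfs fuse mount
OUT="$MOUNT/gluster-writetest.dat"
# conv=fsync makes dd flush data before reporting, so the rate is honest.
dd if=/dev/zero of="$OUT" bs=1M count=64 conv=fsync
rm -f "$OUT"
```

On a real run one would repeat this a few times and drop client caches between repetitions before comparing numbers.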
>
> On Thu, Jun 21, 2018 at 8:33 AM, Raghavendra Gowdappa <
> rgowd...@redhat.com> 

Re: [Gluster-users] Slow write times to gluster disk

2018-06-29 Thread Pat Haley


Hi Raghavendra,

We ran the tests (write tests) and I copied the log files for both the 
server and the client to 
http://mseas.mit.edu/download/phaley/GlusterUsers/2018/Jun29/ .  Is 
there any additional trace information you need?  (If so, where should I 
look for it?)


Also the volume information you requested

[root@mseas-data2 ~]# gluster volume info data-volume

Volume Name: data-volume
Type: Distribute
Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: mseas-data2:/mnt/brick1
Brick2: mseas-data2:/mnt/brick2
Options Reconfigured:
diagnostics.client-log-level: TRACE
network.inode-lru-limit: 5
performance.md-cache-timeout: 60
performance.open-behind: off
disperse.eager-lock: off
auth.allow: *
server.allow-insecure: on
nfs.exports-auth-enable: on
diagnostics.brick-sys-log-level: WARNING
performance.readdir-ahead: on
nfs.disable: on
nfs.export-volumes: off
[root@mseas-data2 ~]#


On 06/29/2018 12:28 PM, Raghavendra Gowdappa wrote:



On Fri, Jun 29, 2018 at 8:24 PM, Pat Haley <pha...@mit.edu> wrote:



Hi Raghavendra,

Our technician was able to try the manual setting today.  He found
that our upper limit for performance.md-cache-timeout was 60 not
600, so he used that value, along with the
network.inode-lru-limit=5.

The result was another small (~1%) increase in speed. Does this
suggest some addition tests/changes we could try?


Can you set gluster option diagnostics.client-log-level to TRACE  and 
run sequential read tests again (with md-cache-timeout value of 60)?


#gluster volume set  diagnostics.client-log-level TRACE

Also are you sure that open-behind was turned off? Can you give the 
output of,


# gluster volume info 


Thanks

Pat




On 06/25/2018 09:39 PM, Raghavendra Gowdappa wrote:



On Tue, Jun 26, 2018 at 3:21 AM, Pat Haley <pha...@mit.edu> wrote:


Hi Raghavendra,

Setting the performance.write-behind off had a small
improvement on the write speed (~3%),

We were unable to turn on "group metadata-cache".  When we
try get errors like

# gluster volume set data-volume group metadata-cache
'/var/lib/glusterd/groups/metadata-cache' file format not valid.

Was metadata-cache available for gluster 3.7.11? We ask
because the release notes for 3.11 mentions “Feature for
metadata-caching/small file performance is production ready.”
(https://gluster.readthedocs.io/en/latest/release-notes/3.11.0/).

Do any of these results suggest anything?  If not, what
further tests would be useful?


Group metadata-cache is just a bunch of options one sets on a
volume. So, You can set them manually using gluster cli.
Following are the options and their values:

performance.md-cache-timeout=600
network.inode-lru-limit=5



Thanks

Pat




On 06/22/2018 07:51 AM, Raghavendra Gowdappa wrote:



On Thu, Jun 21, 2018 at 8:41 PM, Pat Haley <pha...@mit.edu> wrote:


Hi Raghavendra,

Thanks for the suggestions.  Our technician will be in
on Monday.  We'll test then and let you know the results.

One question I have, is the "group metadata-cache"
option supposed to directly impact the performance or is
it to help collect data?  If the latter, where will the
data be located?


It impacts performance.


Thanks again.

Pat



On 06/21/2018 01:01 AM, Raghavendra Gowdappa wrote:



On Thu, Jun 21, 2018 at 10:24 AM, Raghavendra Gowdappa
<rgowd...@redhat.com> wrote:

For the case of writes to glusterfs mount,

I saw in earlier conversations that there are too
many lookups, but small number of writes. Since
writes cached in write-behind would invalidate
metadata cache, lookups won't be absorbed by
md-cache. I am wondering what would results look
like if we turn off performance.write-behind.

@Pat,

Can you set,

# gluster volume set <volname> performance.write-behind off


Please turn on "group metadata-cache" for write tests too.


and redo the tests writing to glusterfs mount? Let
us know about the results you see.

regards,
Raghavendra

On Thu, Jun 21, 2018 at 8:33 AM, Raghavendra
Gowdappa <rgowd...@redhat.com> wrote:



On Thu, Jun 21, 2018 at 8:32 AM, Raghavendra
Gowdappa <rgowd...@redhat.com> wrote:

For the case of reading from Glusterfs
  

Re: [Gluster-users] Slow write times to gluster disk

2018-06-29 Thread Raghavendra Gowdappa
On Fri, Jun 29, 2018 at 8:24 PM, Pat Haley  wrote:

>
> Hi Raghavendra,
>
> Our technician was able to try the manual setting today.  He found that
> our upper limit for performance.md-cache-timeout was 60 not 600, so he
> used that value, along with the network.inode-lru-limit=5.
>
> The result was another small (~1%) increase in speed.  Does this suggest
> any additional tests/changes we could try?
>

Can you set the gluster option diagnostics.client-log-level to TRACE and run
sequential read tests again (with the md-cache-timeout value of 60)?

# gluster volume set <volname> diagnostics.client-log-level TRACE

Also, are you sure that open-behind was turned off? Can you give the output
of,

# gluster volume info <volname>
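For the sequential read test itself, plain dd is enough. A minimal sketch, assuming MOUNT points at the glusterfs fuse mount under test (run it once with TRACE enabled, then collect the client log):

```shell
#!/bin/sh
# MOUNT is a placeholder; point it at the glusterfs fuse mount under test.
MOUNT="${MOUNT:-/tmp}"
FILE="$MOUNT/seq-read-test.dat"

# Write a 64 MiB test file first (conv=fsync so it really hits the volume)...
dd if=/dev/zero of="$FILE" bs=1M count=64 conv=fsync 2>/dev/null

# ...then read it back sequentially; dd reports the throughput on stderr.
dd if="$FILE" of=/dev/null bs=1M
rm -f "$FILE"
```

Larger counts give steadier throughput numbers; the point is just a repeatable sequential read against the mount.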


> Thanks
>
> Pat
>
>
>
>
> On 06/25/2018 09:39 PM, Raghavendra Gowdappa wrote:
>
>
>
> On Tue, Jun 26, 2018 at 3:21 AM, Pat Haley  wrote:
>
>>
>> Hi Raghavendra,
>>
>> Setting performance.write-behind off gave a small (~3%) improvement in
>> the write speed.
>>
>> We were unable to turn on "group metadata-cache".  When we try, we get
>> errors like
>>
>> # gluster volume set data-volume group metadata-cache
>> '/var/lib/glusterd/groups/metadata-cache' file format not valid.
>>
>> Was metadata-cache available in gluster 3.7.11? We ask because the
>> release notes for 3.11 mention “Feature for metadata-caching/small file
>> performance is production ready.”
>> (https://gluster.readthedocs.io/en/latest/release-notes/3.11.0/).
>>
>> Do any of these results suggest anything?  If not, what further tests
>> would be useful?
>>
>
> Group metadata-cache is just a bunch of options one sets on a volume, so
> you can set them manually using the gluster CLI. Following are the options
> and their values:
>
> performance.md-cache-timeout=600
> network.inode-lru-limit=5
>
>
>
>> Thanks
>>
>> Pat
>>
>>
>>
>>
>> On 06/22/2018 07:51 AM, Raghavendra Gowdappa wrote:
>>
>>
>>
>> On Thu, Jun 21, 2018 at 8:41 PM, Pat Haley  wrote:
>>
>>>
>>> Hi Raghavendra,
>>>
>>> Thanks for the suggestions.  Our technician will be in on Monday.  We'll
>>> test then and let you know the results.
>>>
>>> One question I have, is the "group metadata-cache" option supposed to
>>> directly impact the performance or is it to help collect data?  If the
>>> latter, where will the data be located?
>>>
>>
>> It impacts performance.
>>
>>
>>> Thanks again.
>>>
>>> Pat
>>>
>>>
>>>
>>> On 06/21/2018 01:01 AM, Raghavendra Gowdappa wrote:
>>>
>>>
>>>
>>> On Thu, Jun 21, 2018 at 10:24 AM, Raghavendra Gowdappa <
>>> rgowd...@redhat.com> wrote:
>>>
 For the case of writes to glusterfs mount,

 I saw in earlier conversations that there are too many lookups but only a
 small number of writes. Since writes cached in write-behind would
 invalidate the metadata cache, lookups won't be absorbed by md-cache. I am
 wondering what the results would look like if we turn off
 performance.write-behind.

 @Pat,

 Can you set,

 # gluster volume set <volname> performance.write-behind off

>>>
>>> Please turn on "group metadata-cache" for write tests too.
>>>
>>>
 and redo the tests writing to glusterfs mount? Let us know about the
 results you see.

 regards,
 Raghavendra

 On Thu, Jun 21, 2018 at 8:33 AM, Raghavendra Gowdappa <
 rgowd...@redhat.com> wrote:

>
>
> On Thu, Jun 21, 2018 at 8:32 AM, Raghavendra Gowdappa <
> rgowd...@redhat.com> wrote:
>
>> For the case of reading from a Glusterfs mount, read-ahead should help.
>> However, we have known issues with read-ahead [1][2]. To work around
>> these, can you try with,
>>
>> 1. Turn off performance.open-behind
>> # gluster volume set <volname> performance.open-behind off
>>
>> 2. Enable group metadata-cache
>> # gluster volume set <volname> group metadata-cache
>>
>
> [1]  https://bugzilla.redhat.com/show_bug.cgi?id=1084508
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1214489
>
>
>>
>>
>> On Thu, Jun 21, 2018 at 5:00 AM, Pat Haley  wrote:
>>
>>>
>>> Hi,
>>>
>>> We were recently revisiting our problems with the slowness of
>>> gluster writes (http://lists.gluster.org/pipermail/gluster-users/2017-April/030529.html).
>>> Specifically we were testing the suggestions in a recent post
>>> (http://lists.gluster.org/pipermail/gluster-users/2018-March/033699.html).
>>> The first two suggestions (specifying a
>>> negative-timeout in the mount settings or adding 
>>> rpc-auth-allow-insecure to
>>> glusterd.vol) did not improve our performance, while setting
>>> "disperse.eager-lock off" provided a tiny (5%) speed-up.
>>>
>>> Some of the various tests we have tried earlier can be seen in the
>>> links below.  Do any of the above observations suggest what we could try
>>> next to either improve the speed or debug the issue?  Thanks
>>>
>>> http://lists.gluster.org/pipermail/gluster-users/2017-June/0

Re: [Gluster-users] Slow write times to gluster disk

2018-06-29 Thread Pat Haley


Hi Raghavendra,

Our technician was able to try the manual setting today.  He found that 
our upper limit for performance.md-cache-timeout was 60 not 600, so he 
used that value, along with the network.inode-lru-limit=5.


The result was another small (~1%) increase in speed.  Does this suggest
any additional tests/changes we could try?


Thanks

Pat




On 06/25/2018 09:39 PM, Raghavendra Gowdappa wrote:



On Tue, Jun 26, 2018 at 3:21 AM, Pat Haley <pha...@mit.edu> wrote:



Hi Raghavendra,

Setting performance.write-behind off gave a small (~3%) improvement
in the write speed.

We were unable to turn on "group metadata-cache".  When we try, we get
errors like

# gluster volume set data-volume group metadata-cache
'/var/lib/glusterd/groups/metadata-cache' file format not valid.

Was metadata-cache available in gluster 3.7.11? We ask because
the release notes for 3.11 mention “Feature for
metadata-caching/small file performance is production ready.”
(https://gluster.readthedocs.io/en/latest/release-notes/3.11.0/).

Do any of these results suggest anything?  If not, what further
tests would be useful?


Group metadata-cache is just a bunch of options one sets on a volume,
so you can set them manually using the gluster CLI. Following are the
options and their values:


performance.md-cache-timeout=600
network.inode-lru-limit=5



Thanks

Pat




On 06/22/2018 07:51 AM, Raghavendra Gowdappa wrote:



On Thu, Jun 21, 2018 at 8:41 PM, Pat Haley <pha...@mit.edu> wrote:


Hi Raghavendra,

Thanks for the suggestions.  Our technician will be in on
Monday.  We'll test then and let you know the results.

One question I have, is the "group metadata-cache" option
supposed to directly impact the performance or is it to help
collect data?  If the latter, where will the data be located?


It impacts performance.


Thanks again.

Pat



On 06/21/2018 01:01 AM, Raghavendra Gowdappa wrote:



On Thu, Jun 21, 2018 at 10:24 AM, Raghavendra Gowdappa
<rgowd...@redhat.com> wrote:

For the case of writes to glusterfs mount,

I saw in earlier conversations that there are too many lookups
but only a small number of writes. Since writes cached in
write-behind would invalidate the metadata cache, lookups won't
be absorbed by md-cache. I am wondering what the results would
look like if we turn off performance.write-behind.

@Pat,

Can you set,

# gluster volume set <volname> performance.write-behind off


Please turn on "group metadata-cache" for write tests too.


and redo the tests writing to glusterfs mount? Let us
know about the results you see.

regards,
Raghavendra

On Thu, Jun 21, 2018 at 8:33 AM, Raghavendra Gowdappa
<rgowd...@redhat.com> wrote:



On Thu, Jun 21, 2018 at 8:32 AM, Raghavendra
Gowdappa <rgowd...@redhat.com> wrote:

For the case of reading from a Glusterfs mount, read-ahead
should help. However, we have known issues with
read-ahead [1][2]. To work around these, can you try with,

1. Turn off performance.open-behind
# gluster volume set <volname> performance.open-behind off

2. Enable group metadata-cache
# gluster volume set <volname> group metadata-cache


[1]
https://bugzilla.redhat.com/show_bug.cgi?id=1084508

[2]
https://bugzilla.redhat.com/show_bug.cgi?id=1214489





On Thu, Jun 21, 2018 at 5:00 AM, Pat Haley
<pha...@mit.edu> wrote:


Hi,

We were recently revisiting our problems
with the slowness of gluster writes

(http://lists.gluster.org/pipermail/gluster-users/2017-April/030529.html).
Specifically we were testing the suggestions
in a recent post

(http://lists.gluster.org/pipermail/gluster-users/2018-March/033699.html).
The first two suggestions (specifying a
negative-timeout in the mount settings or

Re: [Gluster-users] Slow write times to gluster disk

2018-06-25 Thread Raghavendra Gowdappa
On Tue, Jun 26, 2018 at 3:21 AM, Pat Haley  wrote:

>
> Hi Raghavendra,
>
> Setting performance.write-behind off gave a small (~3%) improvement in
> the write speed.
>
> We were unable to turn on "group metadata-cache".  When we try, we get
> errors like
>
> # gluster volume set data-volume group metadata-cache
> '/var/lib/glusterd/groups/metadata-cache' file format not valid.
>
> Was metadata-cache available in gluster 3.7.11? We ask because the
> release notes for 3.11 mention “Feature for metadata-caching/small file
> performance is production ready.”
> (https://gluster.readthedocs.io/en/latest/release-notes/3.11.0/).
>
> Do any of these results suggest anything?  If not, what further tests
> would be useful?
>

Group metadata-cache is just a bunch of options one sets on a volume, so
you can set them manually using the gluster CLI. Following are the options
and their values:

performance.md-cache-timeout=600
network.inode-lru-limit=5



> Thanks
>
> Pat
>
>
>
>
> On 06/22/2018 07:51 AM, Raghavendra Gowdappa wrote:
>
>
>
> On Thu, Jun 21, 2018 at 8:41 PM, Pat Haley  wrote:
>
>>
>> Hi Raghavendra,
>>
>> Thanks for the suggestions.  Our technician will be in on Monday.  We'll
>> test then and let you know the results.
>>
>> One question I have, is the "group metadata-cache" option supposed to
>> directly impact the performance or is it to help collect data?  If the
>> latter, where will the data be located?
>>
>
> It impacts performance.
>
>
>> Thanks again.
>>
>> Pat
>>
>>
>>
>> On 06/21/2018 01:01 AM, Raghavendra Gowdappa wrote:
>>
>>
>>
>> On Thu, Jun 21, 2018 at 10:24 AM, Raghavendra Gowdappa <
>> rgowd...@redhat.com> wrote:
>>
>>> For the case of writes to glusterfs mount,
>>>
>>> I saw in earlier conversations that there are too many lookups but
>>> only a small number of writes. Since writes cached in write-behind
>>> would invalidate the metadata cache, lookups won't be absorbed by
>>> md-cache. I am wondering what the results would look like if we turn
>>> off performance.write-behind.
>>>
>>> @Pat,
>>>
>>> Can you set,
>>>
>>> # gluster volume set <volname> performance.write-behind off
>>>
>>
>> Please turn on "group metadata-cache" for write tests too.
>>
>>
>>> and redo the tests writing to glusterfs mount? Let us know about the
>>> results you see.
>>>
>>> regards,
>>> Raghavendra
>>>
>>> On Thu, Jun 21, 2018 at 8:33 AM, Raghavendra Gowdappa <
>>> rgowd...@redhat.com> wrote:
>>>


 On Thu, Jun 21, 2018 at 8:32 AM, Raghavendra Gowdappa <
 rgowd...@redhat.com> wrote:

> For the case of reading from a Glusterfs mount, read-ahead should help.
> However, we have known issues with read-ahead [1][2]. To work around
> these, can you try with,
>
> 1. Turn off performance.open-behind
> # gluster volume set <volname> performance.open-behind off
>
> 2. Enable group metadata-cache
> # gluster volume set <volname> group metadata-cache
>

 [1]  https://bugzilla.redhat.com/show_bug.cgi?id=1084508
 [2] https://bugzilla.redhat.com/show_bug.cgi?id=1214489


>
>
> On Thu, Jun 21, 2018 at 5:00 AM, Pat Haley  wrote:
>
>>
>> Hi,
>>
>> We were recently revisiting our problems with the slowness of gluster
>> writes (http://lists.gluster.org/pipermail/gluster-users/2017-April
>> /030529.html). Specifically we were testing the suggestions in a
>> recent post (http://lists.gluster.org/pipermail/gluster-users/2018-March/033699.html). The first two
>> suggestions (specifying a negative-timeout in the mount settings or 
>> adding
>> rpc-auth-allow-insecure to glusterd.vol) did not improve our performance,
>> while setting "disperse.eager-lock off" provided a tiny (5%) speed-up.
>>
>> Some of the various tests we have tried earlier can be seen in the
>> links below.  Do any of the above observations suggest what we could try
>> next to either improve the speed or debug the issue?  Thanks
>>
>> http://lists.gluster.org/pipermail/gluster-users/2017-June/031565.html
>> http://lists.gluster.org/pipermail/gluster-users/2017-May/030937.html
>>
>> Pat
>>
>> --
>>
>> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>> Pat Haley  Email:  pha...@mit.edu
>> Center for Ocean Engineering   Phone:  (617) 253-6824
>> Dept. of Mechanical EngineeringFax:(617) 253-8125
>> MIT, Room 5-213http://web.mit.edu/phaley/www/
>> 77 Massachusetts Avenue
>> Cambridge, MA  02139-4301
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-users
>
>
>

>>>
>>
>> --
>>
>> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>> Pat Haley  Email:  pha...@mit.edu
>> Center for Ocean Engineering   Phone:  (617) 

Re: [Gluster-users] Slow write times to gluster disk

2018-06-25 Thread Pat Haley


Hi Raghavendra,

Setting performance.write-behind off gave a small (~3%) improvement in the
write speed.


We were unable to turn on "group metadata-cache".  When we try, we get
errors like


# gluster volume set data-volume group metadata-cache
'/var/lib/glusterd/groups/metadata-cache' file format not valid.

Was metadata-cache available in gluster 3.7.11? We ask because the
release notes for 3.11 mention “Feature for metadata-caching/small file
performance is production ready.” 
(https://gluster.readthedocs.io/en/latest/release-notes/3.11.0/).


Do any of these results suggest anything?  If not, what further tests 
would be useful?


Thanks

Pat



On 06/22/2018 07:51 AM, Raghavendra Gowdappa wrote:



On Thu, Jun 21, 2018 at 8:41 PM, Pat Haley <pha...@mit.edu> wrote:



Hi Raghavendra,

Thanks for the suggestions.  Our technician will be in on Monday. 
We'll test then and let you know the results.

One question I have, is the "group metadata-cache" option supposed
to directly impact the performance or is it to help collect data? 
If the latter, where will the data be located?


It impacts performance.


Thanks again.

Pat



On 06/21/2018 01:01 AM, Raghavendra Gowdappa wrote:



On Thu, Jun 21, 2018 at 10:24 AM, Raghavendra Gowdappa
<rgowd...@redhat.com> wrote:

For the case of writes to glusterfs mount,

I saw in earlier conversations that there are too many lookups
but only a small number of writes. Since writes cached in
write-behind would invalidate the metadata cache, lookups won't
be absorbed by md-cache. I am wondering what the results would
look like if we turn off performance.write-behind.

@Pat,

Can you set,

# gluster volume set <volname> performance.write-behind off


Please turn on "group metadata-cache" for write tests too.


and redo the tests writing to glusterfs mount? Let us know
about the results you see.

regards,
Raghavendra

On Thu, Jun 21, 2018 at 8:33 AM, Raghavendra Gowdappa
<rgowd...@redhat.com> wrote:



On Thu, Jun 21, 2018 at 8:32 AM, Raghavendra Gowdappa
mailto:rgowd...@redhat.com>> wrote:

For the case of reading from a Glusterfs mount,
read-ahead should help. However, we have known issues
with read-ahead [1][2]. To work around these, can you
try with,

1. Turn off performance.open-behind
# gluster volume set <volname> performance.open-behind off

2. Enable group metadata-cache
# gluster volume set <volname> group metadata-cache


[1] https://bugzilla.redhat.com/show_bug.cgi?id=1084508

[2] https://bugzilla.redhat.com/show_bug.cgi?id=1214489





On Thu, Jun 21, 2018 at 5:00 AM, Pat Haley
<pha...@mit.edu> wrote:


Hi,

We were recently revisiting our problems with the
slowness of gluster writes

(http://lists.gluster.org/pipermail/gluster-users/2017-April/030529.html).
Specifically we were testing the suggestions in a
recent post

(http://lists.gluster.org/pipermail/gluster-users/2018-March/033699.html).
The first two suggestions (specifying a
negative-timeout in the mount settings or adding
rpc-auth-allow-insecure to glusterd.vol) did not
improve our performance, while setting
"disperse.eager-lock off" provided a tiny (5%)
speed-up.

Some of the various tests we have tried earlier
can be seen in the links below. Do any of the
above observations suggest what we could try next
to either improve the speed or debug the issue?
Thanks


http://lists.gluster.org/pipermail/gluster-users/2017-June/031565.html



http://lists.gluster.org/pipermail/gluster-users/2017-May/030937.html



Pat

-- 



-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley       Email: pha...@mit.edu


Re: [Gluster-users] Slow write times to gluster disk

2018-06-22 Thread Raghavendra Gowdappa
On Thu, Jun 21, 2018 at 8:41 PM, Pat Haley  wrote:

>
> Hi Raghavendra,
>
> Thanks for the suggestions.  Our technician will be in on Monday.  We'll
> test then and let you know the results.
>
> One question I have, is the "group metadata-cache" option supposed to
> directly impact the performance or is it to help collect data?  If the
> latter, where will the data be located?
>

It impacts performance.


> Thanks again.
>
> Pat
>
>
>
> On 06/21/2018 01:01 AM, Raghavendra Gowdappa wrote:
>
>
>
> On Thu, Jun 21, 2018 at 10:24 AM, Raghavendra Gowdappa <
> rgowd...@redhat.com> wrote:
>
>> For the case of writes to glusterfs mount,
>>
>> I saw in earlier conversations that there are too many lookups but only a
>> small number of writes. Since writes cached in write-behind would
>> invalidate the metadata cache, lookups won't be absorbed by md-cache. I am
>> wondering what the results would look like if we turn off
>> performance.write-behind.
>>
>> @Pat,
>>
>> Can you set,
>>
>> # gluster volume set <volname> performance.write-behind off
>>
>
> Please turn on "group metadata-cache" for write tests too.
>
>
>> and redo the tests writing to glusterfs mount? Let us know about the
>> results you see.
>>
>> regards,
>> Raghavendra
>>
>> On Thu, Jun 21, 2018 at 8:33 AM, Raghavendra Gowdappa <
>> rgowd...@redhat.com> wrote:
>>
>>>
>>>
>>> On Thu, Jun 21, 2018 at 8:32 AM, Raghavendra Gowdappa <
>>> rgowd...@redhat.com> wrote:
>>>
 For the case of reading from a Glusterfs mount, read-ahead should help.
 However, we have known issues with read-ahead [1][2]. To work around
 these, can you try with,

 1. Turn off performance.open-behind
 # gluster volume set <volname> performance.open-behind off

 2. Enable group metadata-cache
 # gluster volume set <volname> group metadata-cache

>>>
>>> [1]  https://bugzilla.redhat.com/show_bug.cgi?id=1084508
>>> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1214489
>>>
>>>


 On Thu, Jun 21, 2018 at 5:00 AM, Pat Haley  wrote:

>
> Hi,
>
> We were recently revisiting our problems with the slowness of gluster
> writes (http://lists.gluster.org/pipermail/gluster-users/2017-April/030529.html).
> Specifically we were testing the suggestions in a recent post
> (http://lists.gluster.org/pipermail/gluster-users/2018-March/033699.html).
> The first two
> suggestions (specifying a negative-timeout in the mount settings or adding
> rpc-auth-allow-insecure to glusterd.vol) did not improve our performance,
> while setting "disperse.eager-lock off" provided a tiny (5%) speed-up.
>
> Some of the various tests we have tried earlier can be seen in the
> links below.  Do any of the above observations suggest what we could try
> next to either improve the speed or debug the issue?  Thanks
>
> http://lists.gluster.org/pipermail/gluster-users/2017-June/031565.html
> http://lists.gluster.org/pipermail/gluster-users/2017-May/030937.html
>
> Pat
>
> --
>
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
> Pat Haley  Email:  pha...@mit.edu
> Center for Ocean Engineering   Phone:  (617) 253-6824
> Dept. of Mechanical EngineeringFax:(617) 253-8125
> MIT, Room 5-213http://web.mit.edu/phaley/www/
> 77 Massachusetts Avenue
> Cambridge, MA  02139-4301
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users



>>>
>>
>
> --
>
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
> Pat Haley  Email:  pha...@mit.edu
> Center for Ocean Engineering   Phone:  (617) 253-6824
> Dept. of Mechanical EngineeringFax:(617) 253-8125
> MIT, Room 5-213http://web.mit.edu/phaley/www/
> 77 Massachusetts Avenue
> Cambridge, MA  02139-4301
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Slow write times to gluster disk

2018-06-21 Thread Pat Haley


Hi Raghavendra,

Thanks for the suggestions.  Our technician will be in on Monday. We'll 
test then and let you know the results.


One question I have, is the "group metadata-cache" option supposed to 
directly impact the performance or is it to help collect data? If the 
latter, where will the data be located?


Thanks again.

Pat


On 06/21/2018 01:01 AM, Raghavendra Gowdappa wrote:



On Thu, Jun 21, 2018 at 10:24 AM, Raghavendra Gowdappa 
<rgowd...@redhat.com> wrote:


For the case of writes to glusterfs mount,

I saw in earlier conversations that there are too many lookups
but only a small number of writes. Since writes cached in
write-behind would invalidate the metadata cache, lookups won't
be absorbed by md-cache. I am wondering what the results would
look like if we turn off performance.write-behind.

@Pat,

Can you set,

# gluster volume set <volname> performance.write-behind off


Please turn on "group metadata-cache" for write tests too.


and redo the tests writing to glusterfs mount? Let us know about
the results you see.

regards,
Raghavendra

On Thu, Jun 21, 2018 at 8:33 AM, Raghavendra Gowdappa
<rgowd...@redhat.com> wrote:



On Thu, Jun 21, 2018 at 8:32 AM, Raghavendra Gowdappa
<rgowd...@redhat.com> wrote:

For the case of reading from a Glusterfs mount, read-ahead
should help. However, we have known issues with
read-ahead [1][2]. To work around these, can you try with,

1. Turn off performance.open-behind
# gluster volume set <volname> performance.open-behind off

2. Enable group metadata-cache
# gluster volume set <volname> group metadata-cache


[1] https://bugzilla.redhat.com/show_bug.cgi?id=1084508

[2] https://bugzilla.redhat.com/show_bug.cgi?id=1214489





On Thu, Jun 21, 2018 at 5:00 AM, Pat Haley <pha...@mit.edu> wrote:


Hi,

We were recently revisiting our problems with the
slowness of gluster writes

(http://lists.gluster.org/pipermail/gluster-users/2017-April/030529.html).
Specifically we were testing the suggestions in a
recent post

(http://lists.gluster.org/pipermail/gluster-users/2018-March/033699.html).
The first two suggestions (specifying a
negative-timeout in the mount settings or adding
rpc-auth-allow-insecure to glusterd.vol) did not
improve our performance, while setting
"disperse.eager-lock off" provided a tiny (5%) speed-up.

Some of the various tests we have tried earlier can be
seen in the links below.  Do any of the above
observations suggest what we could try next to either
improve the speed or debug the issue?  Thanks


http://lists.gluster.org/pipermail/gluster-users/2017-June/031565.html



http://lists.gluster.org/pipermail/gluster-users/2017-May/030937.html



Pat

-- 



-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley Email: pha...@mit.edu 
Center for Ocean Engineering  Phone:  (617) 253-6824
Dept. of Mechanical Engineering Fax:    (617) 253-8125
MIT, Room 5-213 http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA 02139-4301

___
Gluster-users mailing list
Gluster-users@gluster.org

http://lists.gluster.org/mailman/listinfo/gluster-users








--

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley  Email:  pha...@mit.edu
Center for Ocean Engineering   Phone:  (617) 253-6824
Dept. of Mechanical EngineeringFax:(617) 253-8125
MIT, Room 5-213http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA  02139-4301

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Slow write times to gluster disk

2018-06-20 Thread Raghavendra Gowdappa
On Thu, Jun 21, 2018 at 10:24 AM, Raghavendra Gowdappa 
wrote:

> For the case of writes to glusterfs mount,
>
> I saw in earlier conversations that there are too many lookups but only a
> small number of writes. Since writes cached in write-behind would
> invalidate the metadata cache, lookups won't be absorbed by md-cache. I am
> wondering what the results would look like if we turn off
> performance.write-behind.
>
> @Pat,
>
> Can you set,
>
> # gluster volume set <volname> performance.write-behind off
>

Please turn on "group metadata-cache" for write tests too.


> and redo the tests writing to glusterfs mount? Let us know about the
> results you see.
>
> regards,
> Raghavendra
>
> On Thu, Jun 21, 2018 at 8:33 AM, Raghavendra Gowdappa <rgowd...@redhat.com> wrote:
>
>>
>>
>> On Thu, Jun 21, 2018 at 8:32 AM, Raghavendra Gowdappa <
>> rgowd...@redhat.com> wrote:
>>
>>> For the case of reading from a Glusterfs mount, read-ahead should help.
>>> However, we have known issues with read-ahead [1][2]. To work around
>>> these, can you try with,
>>>
>>> 1. Turn off performance.open-behind
>>> # gluster volume set <volname> performance.open-behind off
>>>
>>> 2. Enable group metadata-cache
>>> # gluster volume set <volname> group metadata-cache
>>>
>>
>> [1]  https://bugzilla.redhat.com/show_bug.cgi?id=1084508
>> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1214489
>>
>>
>>>
>>>
>>> On Thu, Jun 21, 2018 at 5:00 AM, Pat Haley  wrote:
>>>

 Hi,

 We were recently revisiting our problems with the slowness of gluster
 writes (http://lists.gluster.org/pipermail/gluster-users/2017-April/030529.html).
 Specifically we were testing the suggestions in a recent post
 (http://lists.gluster.org/pipermail/gluster-users/2018-March/033699.html).
 The first two suggestions
 (specifying a negative-timeout in the mount settings or adding
 rpc-auth-allow-insecure to glusterd.vol) did not improve our performance,
 while setting "disperse.eager-lock off" provided a tiny (5%) speed-up.

 Some of the various tests we have tried earlier can be seen in the
 links below.  Do any of the above observations suggest what we could try
 next to either improve the speed or debug the issue?  Thanks

 http://lists.gluster.org/pipermail/gluster-users/2017-June/031565.html
 http://lists.gluster.org/pipermail/gluster-users/2017-May/030937.html

 Pat

 --

 -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
 Pat Haley  Email:  pha...@mit.edu
 Center for Ocean Engineering   Phone:  (617) 253-6824
 Dept. of Mechanical EngineeringFax:(617) 253-8125
 MIT, Room 5-213http://web.mit.edu/phaley/www/
 77 Massachusetts Avenue
 Cambridge, MA  02139-4301

 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://lists.gluster.org/mailman/listinfo/gluster-users
>>>
>>>
>>>
>>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Slow write times to gluster disk

2018-06-20 Thread Raghavendra Gowdappa
Please note that these suggestions are for the native fuse mount.

On Thu, Jun 21, 2018 at 10:24 AM, Raghavendra Gowdappa 
wrote:

> For the case of writes to glusterfs mount,
>
> I saw in earlier conversations that there are too many lookups but only a
> small number of writes. Since writes cached in write-behind would
> invalidate the metadata cache, lookups won't be absorbed by md-cache. I am
> wondering what the results would look like if we turn off
> performance.write-behind.
>
> @Pat,
>
> Can you set,
>
> # gluster volume set <volname> performance.write-behind off
>
> and redo the tests writing to glusterfs mount? Let us know about the
> results you see.
>
> regards,
> Raghavendra
>
> On Thu, Jun 21, 2018 at 8:33 AM, Raghavendra Gowdappa <rgowd...@redhat.com> wrote:
>
>>
>>
>> On Thu, Jun 21, 2018 at 8:32 AM, Raghavendra Gowdappa <
>> rgowd...@redhat.com> wrote:
>>
>>> For the case of reading from a Glusterfs mount, read-ahead should help.
>>> However, we have known issues with read-ahead [1][2]. To work around
>>> these, can you try with,
>>>
>>> 1. Turn off performance.open-behind
>>> # gluster volume set <volname> performance.open-behind off
>>>
>>> 2. Enable group metadata-cache
>>> # gluster volume set <volname> group metadata-cache
>>>
>>
>> [1]  https://bugzilla.redhat.com/show_bug.cgi?id=1084508
>> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1214489
>>
>>
>>>
>>>
>>> On Thu, Jun 21, 2018 at 5:00 AM, Pat Haley  wrote:
>>>

 Hi,

 We were recently revisiting our problems with the slowness of gluster
 writes (http://lists.gluster.org/pipermail/gluster-users/2017-April/030529.html).
 Specifically we were testing the suggestions in a recent post
 (http://lists.gluster.org/pipermail/gluster-users/2018-March/033699.html).
 The first two suggestions
 (specifying a negative-timeout in the mount settings or adding
 rpc-auth-allow-insecure to glusterd.vol) did not improve our performance,
 while setting "disperse.eager-lock off" provided a tiny (5%) speed-up.

 Some of the various tests we have tried earlier can be seen in the
 links below.  Do any of the above observations suggest what we could try
 next to either improve the speed or debug the issue?  Thanks

 http://lists.gluster.org/pipermail/gluster-users/2017-June/031565.html
 http://lists.gluster.org/pipermail/gluster-users/2017-May/030937.html

 Pat

 --

 -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
 Pat Haley  Email:  pha...@mit.edu
 Center for Ocean Engineering   Phone:  (617) 253-6824
 Dept. of Mechanical EngineeringFax:(617) 253-8125
 MIT, Room 5-213http://web.mit.edu/phaley/www/
 77 Massachusetts Avenue
 Cambridge, MA  02139-4301

 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://lists.gluster.org/mailman/listinfo/gluster-users
>>>
>>>
>>>
>>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Slow write times to gluster disk

2018-06-20 Thread Raghavendra Gowdappa
For the case of writes to glusterfs mount,

I saw in earlier conversations that there are too many lookups but only a
small number of writes. Since writes cached in write-behind would invalidate
the metadata cache, lookups won't be absorbed by md-cache. I am wondering
what the results would look like if we turn off performance.write-behind.

@Pat,

Can you set,

# gluster volume set <volname> performance.write-behind off

and redo the tests writing to glusterfs mount? Let us know about the
results you see.

regards,
Raghavendra
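The suggested A/B experiment — the same sequential write with write-behind on and then off — can be scripted. A hedged sketch: VOL and MOUNT are placeholders, and RUN defaults to `echo` so nothing executes until you clear it.

```shell
# A/B sketch: time the same sequential write with write-behind on and off.
# Placeholders: VOL (volume name), MOUNT (fuse mount point). RUN=echo makes
# this a dry run; set RUN= (empty) to execute against a real cluster.
RUN="${RUN:-echo}"
VOL="${VOL:-data-volume}"
MOUNT="${MOUNT:-/gdata}"

for wb in on off; do
  $RUN gluster volume set "$VOL" performance.write-behind "$wb"
  # conv=fsync ensures dd's reported time includes flushing to the bricks.
  $RUN dd if=/dev/zero of="$MOUNT/wb-test.dat" bs=1M count=256 conv=fsync
  $RUN rm -f "$MOUNT/wb-test.dat"
done
```

Comparing the two dd throughput figures directly shows whether write-behind is helping or hurting on this workload.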

On Thu, Jun 21, 2018 at 8:33 AM, Raghavendra Gowdappa 
wrote:

>
>
> On Thu, Jun 21, 2018 at 8:32 AM, Raghavendra Gowdappa <rgowd...@redhat.com> wrote:
>
>> For the case of reading from a Glusterfs mount, read-ahead should help.
>> However, we have known issues with read-ahead [1][2]. To work around
>> these, can you try with,
>>
>> 1. Turn off performance.open-behind
>> # gluster volume set <volname> performance.open-behind off
>>
>> 2. Enable group metadata-cache
>> # gluster volume set <volname> group metadata-cache
>>
>
> [1]  https://bugzilla.redhat.com/show_bug.cgi?id=1084508
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1214489
>
>
>>
>>
>> On Thu, Jun 21, 2018 at 5:00 AM, Pat Haley  wrote:
>>
>>>
>>> Hi,
>>>
>>> We were recently revisiting our problems with the slowness of gluster
>>> writes (http://lists.gluster.org/pipermail/gluster-users/2017-April/030529.html).
>>> Specifically we were testing the suggestions in a recent
>>> post (http://lists.gluster.org/pipermail/gluster-users/2018-March/033699.html).
>>> The first two suggestions (specifying a negative-timeout
>>> in the mount settings or adding rpc-auth-allow-insecure to glusterd.vol)
>>> did not improve our performance, while setting "disperse.eager-lock off"
>>> provided a tiny (5%) speed-up.
>>>
>>> Some of the various tests we have tried earlier can be seen in the links
>>> below.  Do any of the above observations suggest what we could try next to
>>> either improve the speed or debug the issue?  Thanks
>>>
>>> http://lists.gluster.org/pipermail/gluster-users/2017-June/031565.html
>>> http://lists.gluster.org/pipermail/gluster-users/2017-May/030937.html
>>>
>>> Pat
>>>
>>> --
>>>
>>> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>>> Pat Haley  Email:  pha...@mit.edu
>>> Center for Ocean Engineering   Phone:  (617) 253-6824
>>> Dept. of Mechanical EngineeringFax:(617) 253-8125
>>> MIT, Room 5-213http://web.mit.edu/phaley/www/
>>> 77 Massachusetts Avenue
>>> Cambridge, MA  02139-4301
>>>
>>> ___
>>> Gluster-users mailing list
>>> Gluster-users@gluster.org
>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>
>>
>>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Slow write times to gluster disk

2018-06-20 Thread Raghavendra Gowdappa
On Thu, Jun 21, 2018 at 8:32 AM, Raghavendra Gowdappa 
wrote:

> For the case of reading from a Glusterfs mount, read-ahead should help.
> However, there are known issues with read-ahead [1][2]. To work around
> these, can you try the following:
>
> 1. Turn off performance.open-behind
> # gluster volume set <volname> performance.open-behind off
>
> 2. Enable the metadata-cache group
> # gluster volume set <volname> group metadata-cache
>

[1]  https://bugzilla.redhat.com/show_bug.cgi?id=1084508
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1214489


>
>
> On Thu, Jun 21, 2018 at 5:00 AM, Pat Haley  wrote:
>
>>
>> Hi,
>>
>> We were recently revisiting our problems with the slowness of gluster
>> writes (http://lists.gluster.org/pipermail/gluster-users/2017-April
>> /030529.html). Specifically we were testing the suggestions in a recent
>> post (http://lists.gluster.org/pipermail/gluster-users/2018-March
>> /033699.html). The first two suggestions (specifying a negative-timeout
>> in the mount settings or adding rpc-auth-allow-insecure to glusterd.vol)
>> did not improve our performance, while setting "disperse.eager-lock off"
>> provided a tiny (5%) speed-up.
>>
>> Some of the various tests we have tried earlier can be seen in the links
>> below.  Do any of the above observations suggest what we could try next to
>> either improve the speed or debug the issue?  Thanks
>>
>> http://lists.gluster.org/pipermail/gluster-users/2017-June/031565.html
>> http://lists.gluster.org/pipermail/gluster-users/2017-May/030937.html
>>
>> Pat
>>
>> --
>>
>> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>> Pat Haley  Email:  pha...@mit.edu
>> Center for Ocean Engineering   Phone:  (617) 253-6824
>> Dept. of Mechanical EngineeringFax:(617) 253-8125
>> MIT, Room 5-213http://web.mit.edu/phaley/www/
>> 77 Massachusetts Avenue
>> Cambridge, MA  02139-4301
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-users
>
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Slow write times to gluster disk

2018-06-20 Thread Raghavendra Gowdappa
For the case of reading from a Glusterfs mount, read-ahead should help.
However, there are known issues with read-ahead [1][2]. To work around
these, can you try the following:

1. Turn off performance.open-behind
# gluster volume set <volname> performance.open-behind off

2. Enable the metadata-cache group
# gluster volume set <volname> group metadata-cache
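For reference, "group metadata-cache" applies a predefined set of md-cache
options stored server-side in /var/lib/glusterd/groups/metadata-cache. The
exact option list varies by release, so the values sketched below are an
assumption; inspect the group file on your install before enabling it:

```shell
# Inspect what the group would apply, then enable it for the volume.
cat /var/lib/glusterd/groups/metadata-cache   # lists option=value pairs
gluster volume set data-volume group metadata-cache
# Options the group typically sets on 3.x (verify against the file above):
#   features.cache-invalidation=on
#   features.cache-invalidation-timeout=600
#   performance.md-cache-timeout=600
#   network.inode-lru-limit=50000
```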


On Thu, Jun 21, 2018 at 5:00 AM, Pat Haley  wrote:

>
> Hi,
>
> We were recently revisiting our problems with the slowness of gluster
> writes (http://lists.gluster.org/pipermail/gluster-users/2017-April
> /030529.html). Specifically we were testing the suggestions in a recent
> post (http://lists.gluster.org/pipermail/gluster-users/2018-March
> /033699.html). The first two suggestions (specifying a negative-timeout
> in the mount settings or adding rpc-auth-allow-insecure to glusterd.vol)
> did not improve our performance, while setting "disperse.eager-lock off"
> provided a tiny (5%) speed-up.
>
> Some of the various tests we have tried earlier can be seen in the links
> below.  Do any of the above observations suggest what we could try next to
> either improve the speed or debug the issue?  Thanks
>
> http://lists.gluster.org/pipermail/gluster-users/2017-June/031565.html
> http://lists.gluster.org/pipermail/gluster-users/2017-May/030937.html
>
> Pat
>
> --
>
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
> Pat Haley  Email:  pha...@mit.edu
> Center for Ocean Engineering   Phone:  (617) 253-6824
> Dept. of Mechanical EngineeringFax:(617) 253-8125
> MIT, Room 5-213http://web.mit.edu/phaley/www/
> 77 Massachusetts Avenue
> Cambridge, MA  02139-4301
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Slow write times to gluster disk

2018-06-20 Thread Pat Haley


Hi,

We were recently revisiting our problems with the slowness of gluster 
writes 
(http://lists.gluster.org/pipermail/gluster-users/2017-April/030529.html). 
Specifically we were testing the suggestions in a recent post 
(http://lists.gluster.org/pipermail/gluster-users/2018-March/033699.html). 
The first two suggestions (specifying a negative-timeout in the mount 
settings or adding rpc-auth-allow-insecure to glusterd.vol) did not 
improve our performance, while setting "disperse.eager-lock off" 
provided a tiny (5%) speed-up.


Some of the various tests we have tried earlier can be seen in the links 
below.  Do any of the above observations suggest what we could try next 
to either improve the speed or debug the issue?  Thanks


http://lists.gluster.org/pipermail/gluster-users/2017-June/031565.html
http://lists.gluster.org/pipermail/gluster-users/2017-May/030937.html

Pat

--

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley  Email:  pha...@mit.edu
Center for Ocean Engineering   Phone:  (617) 253-6824
Dept. of Mechanical EngineeringFax:(617) 253-8125
MIT, Room 5-213http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA  02139-4301

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Slow write times to gluster disk

2017-08-08 Thread Steve Postma
Soumya,

its


[root@mseas-data2 ~]# glusterfs --version
glusterfs 3.7.11 built on Apr 27 2016 14:09:20
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2013 Red Hat, Inc. <http://www.redhat.com/>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
It is licensed to you under your choice of the GNU Lesser
General Public License, version 3 or any later version (LGPLv3
or later), or the GNU General Public License, version 2 (GPLv2),
in all cases as published by the Free Software Foundation.
[root@mseas-data2 ~]#


Thanks,

Steve



From: Soumya Koduri <skod...@redhat.com>
Sent: Tuesday, August 8, 2017 1:37 AM
To: Pat Haley; Niels de Vos
Cc: gluster-users@gluster.org; Pranith Kumar Karampuri; Ben Turner; Ravishankar 
N; Raghavendra Gowdappa; Niels de Vos; Steve Postma
Subject: Re: [Gluster-users] Slow write times to gluster disk



- Original Message -
> From: "Pat Haley" <pha...@mit.edu>
> To: "Soumya Koduri" <skod...@redhat.com>, gluster-users@gluster.org,
> "Pranith Kumar Karampuri" <pkara...@redhat.com>
> Cc: "Ben Turner" <btur...@redhat.com>, "Ravishankar N"
> <ravishan...@redhat.com>, "Raghavendra Gowdappa" <rgowd...@redhat.com>,
> "Niels de Vos" <nde...@redhat.com>, "Steve Postma" <spos...@ztechnet.com>
> Sent: Monday, August 7, 2017 9:52:48 PM
> Subject: Re: [Gluster-users] Slow write times to gluster disk
>
>
> Hi Soumya,
>
> We just had the opportunity to try the option of disabling the
> kernel-NFS and restarting glusterd to start gNFS. However the gluster
> demon crashes immediately on startup. What additional information
> besides what we provide below would help debugging this?
>

Which version of glusterfs are you using? There were a few regressions
(all fixed now, at least in the master branch) caused by recent changes in
the mount codepath.

Request Niels to comment.

Thanks,
Soumya


> Thanks,
>
> Pat
>
>
>  Forwarded Message 
> Subject: gluster-nfs crashing on start
> Date: Mon, 7 Aug 2017 16:05:09 +
> From: Steve Postma <spos...@ztechnet.com>
> To: Pat Haley <pha...@mit.edu>
>
>
>
> *To disable kernel-NFS and enable NFS through Gluster we:*
>
>
> gluster volume set data-volume nfs.export-volumes on
> gluster volume set data-volume nfs.disable off
>
> /etc/init.d/glusterd stop
>
>
> service nfslock stop
>
> service rpcgssd stop
>
> service rpcidmapd stop
>
> service portmap stop
>
> service nfs stop
>
>
> /etc/init.d/glusterd stop
>
>
>
>
> *the /var/log/glusterfs/nfs.log immediately reports a crash:*
>
>
> [root@mseas-data2 glusterfs]# cat nfs.log
>
> [2017-08-07 15:20:16.327026] I [MSGID: 100030] [glusterfsd.c:2332:main]
> 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version
> 3.7.11 (args: /usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs
> -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S
> /var/run/gluster/7db74f19472511d20849e471bf224c1a.socket)
>
> [2017-08-07 15:20:16.345166] I [MSGID: 101190]
> [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread
> with index 1
>
> [2017-08-07 15:20:16.351290] I
> [rpcsvc.c:2215:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service:
> Configured rpc.outstanding-rpc-limit with value 16
>
> pending frames:
>
> frame : type(0) op(0)
>
> patchset: git://git.gluster.com/glusterfs.git
>
> signal received: 11
>
> time of crash:
>
> 2017-08-07 15:20:17
>
> configuration details:
>
> argp 1
>
> backtrace 1
>
> dlfcn 1
>
> libpthread 1
>
> llistxattr 1
>
> setfsid 1
>
> spinlock 1
>
> epoll.h 1
>
> xattr.h 1
>
> st_atim.tv_nsec 1
>
> package-string: glusterfs 3.7.11
>
> /usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb8)[0x3889625a18]
>
> /usr/lib64/libglusterfs.so.0(gf_print_trace+0x32f)[0x38896456af]
>
> /lib64/libc.so.6[0x34a1c32660]
>
> /lib64/libc.so.6[0x34a1d3382f]
>
> /usr/lib64/glusterfs/3.7.11/xlator/nfs/server.so(+0x53307)[0x7f8d071b3307]
>
> /usr/lib64/glusterfs/3.7.11/xlator/nfs/server.so(exp_file_parse+0x302)[0x7f8d071b3742

Re: [Gluster-users] Slow write times to gluster disk

2017-08-07 Thread Soumya Koduri


- Original Message -
> From: "Pat Haley" <pha...@mit.edu>
> To: "Soumya Koduri" <skod...@redhat.com>, gluster-users@gluster.org, "Pranith 
> Kumar Karampuri" <pkara...@redhat.com>
> Cc: "Ben Turner" <btur...@redhat.com>, "Ravishankar N" 
> <ravishan...@redhat.com>, "Raghavendra Gowdappa"
> <rgowd...@redhat.com>, "Niels de Vos" <nde...@redhat.com>, "Steve Postma" 
> <spos...@ztechnet.com>
> Sent: Monday, August 7, 2017 9:52:48 PM
> Subject: Re: [Gluster-users] Slow write times to gluster disk
> 
> 
> Hi Soumya,
> 
> We just had the opportunity to try the option of disabling the
> kernel-NFS and restarting glusterd to start gNFS.  However the gluster
> demon crashes immediately on startup.  What additional information
> besides what we provide below would help debugging this?
> 

Which version of glusterfs are you using? There were a few regressions
(all fixed now, at least in the master branch) caused by recent changes in
the mount codepath.

Request Niels to comment.

Thanks,
Soumya


> Thanks,
> 
> Pat
> 
> 
>  Forwarded Message 
> Subject:  gluster-nfs crashing on start
> Date: Mon, 7 Aug 2017 16:05:09 +
> From: Steve Postma <spos...@ztechnet.com>
> To:   Pat Haley <pha...@mit.edu>
> 
> 
> 
> *To disable kernel-NFS and enable NFS through Gluster we:*
> 
> 
> gluster volume set data-volume nfs.export-volumes on
> gluster volume set data-volume nfs.disable off
> 
> /etc/init.d/glusterd stop
> 
> 
> service nfslock stop
> 
> service rpcgssd stop
> 
> service rpcidmapd stop
> 
> service portmap stop
> 
> service nfs stop
> 
> 
> /etc/init.d/glusterd stop
> 
> 
> 
> 
> *the /var/log/glusterfs/nfs.log immediately reports a crash:*
> 
> 
> [root@mseas-data2 glusterfs]# cat nfs.log
> 
> [2017-08-07 15:20:16.327026] I [MSGID: 100030] [glusterfsd.c:2332:main]
> 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version
> 3.7.11 (args: /usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs
> -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S
> /var/run/gluster/7db74f19472511d20849e471bf224c1a.socket)
> 
> [2017-08-07 15:20:16.345166] I [MSGID: 101190]
> [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread
> with index 1
> 
> [2017-08-07 15:20:16.351290] I
> [rpcsvc.c:2215:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service:
> Configured rpc.outstanding-rpc-limit with value 16
> 
> pending frames:
> 
> frame : type(0) op(0)
> 
> patchset: git://git.gluster.com/glusterfs.git
> 
> signal received: 11
> 
> time of crash:
> 
> 2017-08-07 15:20:17
> 
> configuration details:
> 
> argp 1
> 
> backtrace 1
> 
> dlfcn 1
> 
> libpthread 1
> 
> llistxattr 1
> 
> setfsid 1
> 
> spinlock 1
> 
> epoll.h 1
> 
> xattr.h 1
> 
> st_atim.tv_nsec 1
> 
> package-string: glusterfs 3.7.11
> 
> /usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb8)[0x3889625a18]
> 
> /usr/lib64/libglusterfs.so.0(gf_print_trace+0x32f)[0x38896456af]
> 
> /lib64/libc.so.6[0x34a1c32660]
> 
> /lib64/libc.so.6[0x34a1d3382f]
> 
> /usr/lib64/glusterfs/3.7.11/xlator/nfs/server.so(+0x53307)[0x7f8d071b3307]
> 
> /usr/lib64/glusterfs/3.7.11/xlator/nfs/server.so(exp_file_parse+0x302)[0x7f8d071b3742]
> 
> /usr/lib64/glusterfs/3.7.11/xlator/nfs/server.so(mnt3_auth_set_exports_auth+0x45)[0x7f8d071b47a5]
> 
> /usr/lib64/glusterfs/3.7.11/xlator/nfs/server.so(_mnt3_init_auth_params+0x91)[0x7f8d07183e41]
> 
> /usr/lib64/glusterfs/3.7.11/xlator/nfs/server.so(mnt3svc_init+0x218)[0x7f8d07184228]
> 
> /usr/lib64/glusterfs/3.7.11/xlator/nfs/server.so(nfs_init_versions+0xd7)[0x7f8d07174a37]
> 
> /usr/lib64/glusterfs/3.7.11/xlator/nfs/server.so(init+0x77)[0x7f8d071767c7]
> 
> /usr/lib64/libglusterfs.so.0(xlator_init+0x52)[0x3889622a82]
> 
> /usr/lib64/libglusterfs.so.0(glusterfs_graph_init+0x31)[0x3889669aa1]
> 
> /usr/lib64/libglusterfs.so.0(glusterfs_graph_activate+0x57)[0x3889669bd7]
> 
> /usr/sbin/glusterfs(glusterfs_process_volfp+0xed)[0x405c0d]
> 
> /usr/sbin/glusterfs(mgmt_getspec_cbk+0x312)[0x40dbd2]
> 
> /usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)[0x3889e0f7b5]
> 
> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x1a1)[0x3889e10891]
> 
> /usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x28)[0x3889e0bbd8]
> 
> /usr/lib64/glusterfs/3.7.11/rpc-transport/socket.so(+0x94cd)[0x7f8d088e04cd]
> 
> /usr/lib64/glusterfs/3.7.11/rpc-transport/socket.so(+0xa79d)[0x7f8d088e179d]
> 
> /usr/l

Re: [Gluster-users] Slow write times to gluster disk

2017-08-07 Thread Pat Haley


Hi Soumya,

We just had the opportunity to try the option of disabling the 
kernel-NFS and restarting glusterd to start gNFS.  However the gluster 
demon crashes immediately on startup.  What additional information 
besides what we provide below would help debugging this?


Thanks,

Pat


 Forwarded Message 
Subject:gluster-nfs crashing on start
Date:   Mon, 7 Aug 2017 16:05:09 +
From:   Steve Postma 
To: Pat Haley 



*To disable kernel-NFS and enable NFS through Gluster we:*


gluster volume set data-volume nfs.export-volumes on
gluster volume set data-volume nfs.disable off

/etc/init.d/glusterd stop


service nfslock stop

service rpcgssd stop

service rpcidmapd stop

service portmap stop

service nfs stop


/etc/init.d/glusterd stop
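One likely issue with the transcript above: glusterd is stopped twice and
never started, so gNFS (which glusterd spawns at startup) would never come
up. A hedged sketch of the intended order, assuming the second stop was
meant to be a start, is:

```shell
# Mark the volume for gNFS export before touching the services.
gluster volume set data-volume nfs.export-volumes on
gluster volume set data-volume nfs.disable off

# Stop the kernel NFS stack so gNFS can bind the NFS ports.
service nfs stop
service nfslock stop
service rpcgssd stop
service rpcidmapd stop

# Note: gNFS registers with the portmapper, so portmap/rpcbind should be
# left running (or started again) before glusterd comes up.
service portmap start

# Restart glusterd; it spawns the gNFS process, logged in nfs.log.
/etc/init.d/glusterd restart
tail -f /var/log/glusterfs/nfs.log   # watch for gNFS startup errors
```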




*the /var/log/glusterfs/nfs.log immediately reports a crash:*


[root@mseas-data2 glusterfs]# cat nfs.log

[2017-08-07 15:20:16.327026] I [MSGID: 100030] [glusterfsd.c:2332:main] 
0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 
3.7.11 (args: /usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs 
-p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S 
/var/run/gluster/7db74f19472511d20849e471bf224c1a.socket)


[2017-08-07 15:20:16.345166] I [MSGID: 101190] 
[event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread 
with index 1


[2017-08-07 15:20:16.351290] I 
[rpcsvc.c:2215:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service: 
Configured rpc.outstanding-rpc-limit with value 16


pending frames:

frame : type(0) op(0)

patchset: git://git.gluster.com/glusterfs.git

signal received: 11

time of crash:

2017-08-07 15:20:17

configuration details:

argp 1

backtrace 1

dlfcn 1

libpthread 1

llistxattr 1

setfsid 1

spinlock 1

epoll.h 1

xattr.h 1

st_atim.tv_nsec 1

package-string: glusterfs 3.7.11

/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb8)[0x3889625a18]

/usr/lib64/libglusterfs.so.0(gf_print_trace+0x32f)[0x38896456af]

/lib64/libc.so.6[0x34a1c32660]

/lib64/libc.so.6[0x34a1d3382f]

/usr/lib64/glusterfs/3.7.11/xlator/nfs/server.so(+0x53307)[0x7f8d071b3307]

/usr/lib64/glusterfs/3.7.11/xlator/nfs/server.so(exp_file_parse+0x302)[0x7f8d071b3742]

/usr/lib64/glusterfs/3.7.11/xlator/nfs/server.so(mnt3_auth_set_exports_auth+0x45)[0x7f8d071b47a5]

/usr/lib64/glusterfs/3.7.11/xlator/nfs/server.so(_mnt3_init_auth_params+0x91)[0x7f8d07183e41]

/usr/lib64/glusterfs/3.7.11/xlator/nfs/server.so(mnt3svc_init+0x218)[0x7f8d07184228]

/usr/lib64/glusterfs/3.7.11/xlator/nfs/server.so(nfs_init_versions+0xd7)[0x7f8d07174a37]

/usr/lib64/glusterfs/3.7.11/xlator/nfs/server.so(init+0x77)[0x7f8d071767c7]

/usr/lib64/libglusterfs.so.0(xlator_init+0x52)[0x3889622a82]

/usr/lib64/libglusterfs.so.0(glusterfs_graph_init+0x31)[0x3889669aa1]

/usr/lib64/libglusterfs.so.0(glusterfs_graph_activate+0x57)[0x3889669bd7]

/usr/sbin/glusterfs(glusterfs_process_volfp+0xed)[0x405c0d]

/usr/sbin/glusterfs(mgmt_getspec_cbk+0x312)[0x40dbd2]

/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)[0x3889e0f7b5]

/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x1a1)[0x3889e10891]

/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x28)[0x3889e0bbd8]

/usr/lib64/glusterfs/3.7.11/rpc-transport/socket.so(+0x94cd)[0x7f8d088e04cd]

/usr/lib64/glusterfs/3.7.11/rpc-transport/socket.so(+0xa79d)[0x7f8d088e179d]

/usr/lib64/libglusterfs.so.0[0x388968b2f0]

/lib64/libpthread.so.0[0x34a2007aa1]

/lib64/libc.so.6(clone+0x6d)[0x34a1ce8aad]






[root@mseas-data2 glusterfs]# gluster volume info

Volume Name: data-volume

Type: Distribute

Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18

Status: Started

Number of Bricks: 2

Transport-type: tcp

Bricks:

Brick1: mseas-data2:/mnt/brick1

Brick2: mseas-data2:/mnt/brick2

Options Reconfigured:

nfs.export-volumes: on

nfs.disable: off

performance.readdir-ahead: on

diagnostics.brick-sys-log-level: WARNING

nfs.exports-auth-enable: on



"/var/lib/glusterd/nfs/exports"

/gdata 172.16.1.0/255.255.255.0(rw)




*What else can we do to identify why this is failing?*

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Slow write times to gluster disk

2017-07-13 Thread Soumya Koduri



On 07/14/2017 06:40 AM, Pat Haley wrote:


Hi Soumya,

I just noticed some of the notes at the bottom.  In particular

  * Till glusterfs-3.7, gluster-NFS (gNFS) gets enabled by default. The
only requirement is that kernel-NFS has to be disabled for
gluster-NFS to come up. Please disable kernel-NFS server and restart
glusterd to start gNFS. In case of any issues with starting gNFS
server, please look at /var/log/glusterfs/nfs.log.

If we disable the kernel-NFS on our server and restart glusterd to start
gNFS, will that affect the NFS file system also being served by that
server (i.e. the single server serves both a glusterFS area and an NFS
area)?

That's right. When you restart glusterd, it tries to spawn (provided the
nfs.disable option is set to off for any volume) a new glusterfs client
process which acts as an NFS server as well.


Would we also have to disable the kernel-NFS for NFS-ganesha?

yes.



My second question concerns NFS-ganesha (v 2.3.x) for CentOS 6.8 and
gluster 3.7.11.  I think I see a couple of possibilities

 1. I see one possible rpm for version 2.3.3 in

https://mirror.chpc.utah.edu/pub/vault.centos.org/centos/6.8/storage/Source/gluster-3.8/
The other RPMs seem to be for gluster 3.8 packages, so I'm
wondering if there is a risk of conflict


AFAIK, nfs-ganesha-2.3.3 should work with both 3.8 & 3.7 gluster.


 2. In one of the links you sent
(https://buildlogs.centos.org/centos/6/storage/x86_64/gluster-3.7/)
I see an rpm for glusterfs-ganesha-3.7.11 .  Is this a specific
gluster package for compatibility with ganesha or a ganesha package
for gluster?

This is to be compatible with the gluster-3.7* packages.



Does either possibility seem more likely to be what I need than the other?


The current stable/maintained/tested combination is nfs-ganesha-2.4/2.5 +
glusterfs-3.8/3.10. However, in case you cannot upgrade, you can still
use nfs-ganesha-2.3* with glusterfs-3.7/3.8.


Hope it is clear.

Thanks,
Soumya



Pat


On 07/07/2017 01:31 PM, Soumya Koduri wrote:

Hi,

On 07/07/2017 06:16 AM, Pat Haley wrote:


Hi All,

A follow-up question.  I've been looking at various pages on nfs-ganesha
& gluster.  Is there a version of nfs-ganesha that is recommended for
use with

glusterfs 3.7.11 built on Apr 27 2016 14:09:22
CentOS release 6.8 (Final)


For glusterfs 3.7, nfs-ganesha-2.3-* version can be used.

I see the packages built in centos7 storage sig [1] but not for
centos6. Request Niels to comment.




Thanks

Pat


On 07/05/2017 11:36 AM, Pat Haley wrote:


Hi Soumya,

(1) In http://mseas.mit.edu/download/phaley/GlusterUsers/TestNFSmount/
I've placed the following 2 log files

etc-glusterfs-glusterd.vol.log
gdata.log

The first has repeated messages about nfs disconnects.  The second had
the .log name (but not much information).



Hmm, yeah.. weird.. there are not many logs in the fuse-mnt log file.


(2) About the gluster-NFS native server:  do you know where we can
find documentation on how to use/install it?  We haven't had success
in our searches.



Till glusterfs-3.7, gluster-NFS (gNFS) gets enabled by default. The
only requirement is that kernel-NFS has to be disabled for gluster-NFS
to come up. Please disable kernel-NFS server and restart glusterd to
start gNFS. In case of any issues with starting gNFS server, please
look at /var/log/glusterfs/nfs.log.

Thanks,
Soumya


[1] https://buildlogs.centos.org/centos/7/storage/x86_64/gluster-3.7/
[2] https://buildlogs.centos.org/centos/6/storage/x86_64/gluster-3.7/


Thanks

Pat


On 07/04/2017 05:01 AM, Soumya Koduri wrote:



On 07/03/2017 09:01 PM, Pat Haley wrote:


Hi Soumya,

When I originally did the tests I ran tcpdump on the client.

I have rerun the tests, doing tcpdump on the server

tcpdump -i any -nnSs 0 host 172.16.1.121 -w
/root/capture_nfsfail.pcap

The results are in the same place

http://mseas.mit.edu/download/phaley/GlusterUsers/TestNFSmount/

capture_nfsfail.pcap   has the results from the failed touch
experiment
capture_nfssucceed.pcap  has the results from the successful touch
experiment

The brick log files are there too.


Thanks for sharing. Looks like the error is not generated on the
gluster-server side. The permission-denied error was caused either by
kNFS or by the fuse-mnt process, or perhaps by the combination.

To check the fuse-mnt logs, please look at
/var/log/glusterfs/<mount-path>.log

For e.g.: if you have fuse-mounted the gluster volume at /mnt/fuse-mnt
and exported it via kNFS, the log location for that fuse mount shall be
at /var/log/glusterfs/mnt-fuse-mnt.log
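The naming convention in the example above (mount path with its leading
slash dropped and the remaining slashes turned into dashes) can be sketched
as a one-liner; the mount point below is the one from this thread:

```shell
# Derive the fuse client log filename from a mount point: strip the
# leading slash, then replace the remaining slashes with dashes.
mnt="/mnt/fuse-mnt"
log="/var/log/glusterfs/$(echo "${mnt#/}" | tr '/' '-').log"
echo "$log"   # /var/log/glusterfs/mnt-fuse-mnt.log
```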


Also why not switch to either gluster-NFS native server or
NFS-Ganesha instead of using kNFS, as they are recommended NFS
servers to use with gluster?

Thanks,
Soumya



I believe we are using kernel-NFS exporting a fuse mounted gluster
volume.  I am having Steve confirm this.  I tried to find the
fuse-mnt
logs but failed.  Where should I look for them?

Thanks

Pat



On 07/03/2017 07:58 AM, Soumya Koduri wrote:




Re: [Gluster-users] Slow write times to gluster disk

2017-07-13 Thread Pat Haley


Hi Soumya,

I just noticed some of the notes at the bottom.  In particular

 * Till glusterfs-3.7, gluster-NFS (gNFS) gets enabled by default. The
   only requirement is that kernel-NFS has to be disabled for
   gluster-NFS to come up. Please disable kernel-NFS server and restart
   glusterd to start gNFS. In case of any issues with starting gNFS
   server, please look at /var/log/glusterfs/nfs.log.

If we disable the kernel-NFS on our server and restart glusterd to start 
gNFS will that affect the NFS file system also being served by that 
server (i.e. the single server serves both a glusterFS area and an NFS  
area)?  Would we also have to disable the kernel-NFS for NFS-ganesha?


My second question concerns NFS-ganesha (v 2.3.x) for CentOS 6.8 and 
gluster 3.7.11.  I think I see a couple of possibilities


1. I see one possible rpm for version 2.3.3 in
   
https://mirror.chpc.utah.edu/pub/vault.centos.org/centos/6.8/storage/Source/gluster-3.8/
   The other RPMs seem to be for gluster 3.8 packages, so I'm
   wondering if there is a risk of conflict
2. In one of the links you sent
   (https://buildlogs.centos.org/centos/6/storage/x86_64/gluster-3.7/)
   I see an rpm for glusterfs-ganesha-3.7.11 .  Is this a specific
   gluster package for compatibility with ganesha or a ganesha package
   for gluster?

Does either possibility seem more likely to be what I need than the other?

Pat


On 07/07/2017 01:31 PM, Soumya Koduri wrote:

Hi,

On 07/07/2017 06:16 AM, Pat Haley wrote:


Hi All,

A follow-up question.  I've been looking at various pages on nfs-ganesha
& gluster.  Is there a version of nfs-ganesha that is recommended for
use with

glusterfs 3.7.11 built on Apr 27 2016 14:09:22
CentOS release 6.8 (Final)


For glusterfs 3.7, nfs-ganesha-2.3-* version can be used.

I see the packages built in centos7 storage sig [1] but not for 
centos6. Request Niels to comment.





Thanks

Pat


On 07/05/2017 11:36 AM, Pat Haley wrote:


Hi Soumya,

(1) In http://mseas.mit.edu/download/phaley/GlusterUsers/TestNFSmount/
I've placed the following 2 log files

etc-glusterfs-glusterd.vol.log
gdata.log

The first has repeated messages about nfs disconnects.  The second had
the .log name (but not much information).



Hmm, yeah.. weird.. there are not many logs in the fuse-mnt log file.


(2) About the gluster-NFS native server:  do you know where we can
find documentation on how to use/install it?  We haven't had success
in our searches.



Till glusterfs-3.7, gluster-NFS (gNFS) gets enabled by default. The 
only requirement is that kernel-NFS has to be disabled for gluster-NFS 
to come up. Please disable kernel-NFS server and restart glusterd to 
start gNFS. In case of any issues with starting gNFS server, please 
look at /var/log/glusterfs/nfs.log.


Thanks,
Soumya


[1] https://buildlogs.centos.org/centos/7/storage/x86_64/gluster-3.7/
[2] https://buildlogs.centos.org/centos/6/storage/x86_64/gluster-3.7/


Thanks

Pat


On 07/04/2017 05:01 AM, Soumya Koduri wrote:



On 07/03/2017 09:01 PM, Pat Haley wrote:


Hi Soumya,

When I originally did the tests I ran tcpdump on the client.

I have rerun the tests, doing tcpdump on the server

tcpdump -i any -nnSs 0 host 172.16.1.121 -w 
/root/capture_nfsfail.pcap


The results are in the same place

http://mseas.mit.edu/download/phaley/GlusterUsers/TestNFSmount/

capture_nfsfail.pcap   has the results from the failed touch 
experiment

capture_nfssucceed.pcap  has the results from the successful touch
experiment

The brick log files are there too.


> Thanks for sharing. Looks like the error is not generated on the
> gluster-server side. The permission-denied error was caused either by
> kNFS or by the fuse-mnt process, or perhaps by the combination.
>
> To check the fuse-mnt logs, please look at
> /var/log/glusterfs/<mount-path>.log
>
> For e.g.: if you have fuse-mounted the gluster volume at /mnt/fuse-mnt
> and exported it via kNFS, the log location for that fuse mount shall be
> at /var/log/glusterfs/mnt-fuse-mnt.log


Also why not switch to either gluster-NFS native server or
NFS-Ganesha instead of using kNFS, as they are recommended NFS
servers to use with gluster?

Thanks,
Soumya



I believe we are using kernel-NFS exporting a fuse mounted gluster
volume.  I am having Steve confirm this.  I tried to find the 
fuse-mnt

logs but failed.  Where should I look for them?

Thanks

Pat



On 07/03/2017 07:58 AM, Soumya Koduri wrote:



On 06/30/2017 07:56 PM, Pat Haley wrote:


Hi,

I was wondering if there were any additional tests we could perform to
help debug the group write-permissions issue?


Sorry for the delay. Please find response inline --



Thanks

Pat


On 06/27/2017 12:29 PM, Pat Haley wrote:


Hi Soumya,

One example, we have a common working directory dri_fleat in the
gluster volume

drwxrwsr-x 22 root dri_fleat 4.0K May  1 15:14 dri_fleat

my user (phaley) does not own that directory but is a member of 
the
group  dri_fleat and should have write permissions. When I go 
to the

nfs-mounted version and 

Re: [Gluster-users] Slow write times to gluster disk

2017-07-07 Thread Soumya Koduri

Hi,

On 07/07/2017 06:16 AM, Pat Haley wrote:


Hi All,

A follow-up question.  I've been looking at various pages on nfs-ganesha
& gluster.  Is there a version of nfs-ganesha that is recommended for
use with

glusterfs 3.7.11 built on Apr 27 2016 14:09:22
CentOS release 6.8 (Final)


For glusterfs 3.7, nfs-ganesha-2.3-* version can be used.

I see the packages built in centos7 storage sig [1] but not for centos6. 
Request Niels to comment.





Thanks

Pat


On 07/05/2017 11:36 AM, Pat Haley wrote:


Hi Soumya,

(1) In http://mseas.mit.edu/download/phaley/GlusterUsers/TestNFSmount/
I've placed the following 2 log files

etc-glusterfs-glusterd.vol.log
gdata.log

The first has repeated messages about nfs disconnects.  The second had
the .log name (but not much information).



Hmm, yeah.. weird.. there are not many logs in the fuse-mnt log file.


(2) About the gluster-NFS native server:  do you know where we can
find documentation on how to use/install it?  We haven't had success
in our searches.



Till glusterfs-3.7, gluster-NFS (gNFS) gets enabled by default. The only 
requirement is that kernel-NFS has to be disabled for gluster-NFS to 
come up. Please disable kernel-NFS server and restart glusterd to start 
gNFS. In case of any issues with starting gNFS server, please look at 
/var/log/glusterfs/nfs.log.


Thanks,
Soumya


[1] https://buildlogs.centos.org/centos/7/storage/x86_64/gluster-3.7/
[2] https://buildlogs.centos.org/centos/6/storage/x86_64/gluster-3.7/


Thanks

Pat


On 07/04/2017 05:01 AM, Soumya Koduri wrote:



On 07/03/2017 09:01 PM, Pat Haley wrote:


Hi Soumya,

When I originally did the tests I ran tcpdump on the client.

I have rerun the tests, doing tcpdump on the server

tcpdump -i any -nnSs 0 host 172.16.1.121 -w /root/capture_nfsfail.pcap

The results are in the same place

http://mseas.mit.edu/download/phaley/GlusterUsers/TestNFSmount/

capture_nfsfail.pcap   has the results from the failed touch experiment
capture_nfssucceed.pcap  has the results from the successful touch
experiment

The brick log files are there too.


Thanks for sharing. Looks like the error is not generated
@gluster-server side. The permission denied error was caused by
either kNFS or by fuse-mnt process or probably by the combination.

To check fuse-mnt logs, please look at
/var/log/glusterfs/.log

For eg.: if you have fuse mounted the gluster volume at /mnt/fuse-mnt
and exported it via kNFS, the log location for that fuse_mnt shall be
at /var/log/glusterfs/mnt-fuse-mnt.log
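Consistent with the example above, the log name is derived from the mount path by dropping the leading "/" and replacing the remaining "/" with "-"; a small sketch (my illustration, not a gluster tool):

```shell
# Derive the fuse client log file name from the mount path:
# strip the leading "/", replace remaining "/" with "-".
mount_path=/mnt/fuse-mnt
name=$(echo "$mount_path" | sed 's|^/||; s|/|-|g')
echo "/var/log/glusterfs/${name}.log"
# prints: /var/log/glusterfs/mnt-fuse-mnt.log
```

A volume fuse-mounted at /gdata would, by the same rule, log to /var/log/glusterfs/gdata.log, which matches the gdata.log file mentioned earlier in this thread.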


Also why not switch to either gluster-NFS native server or
NFS-Ganesha instead of using kNFS, as they are recommended NFS
servers to use with gluster?

Thanks,
Soumya



I believe we are using kernel-NFS exporting a fuse mounted gluster
volume.  I am having Steve confirm this.  I tried to find the fuse-mnt
logs but failed.  Where should I look for them?

Thanks

Pat



On 07/03/2017 07:58 AM, Soumya Koduri wrote:



On 06/30/2017 07:56 PM, Pat Haley wrote:


Hi,

I was wondering if there were any additional tests we could perform to
help debug the group write-permissions issue?


Sorry for the delay. Please find response inline --



Thanks

Pat


On 06/27/2017 12:29 PM, Pat Haley wrote:


Hi Soumya,

One example, we have a common working directory dri_fleat in the
gluster volume

drwxrwsr-x 22 root dri_fleat 4.0K May  1 15:14 dri_fleat

my user (phaley) does not own that directory but is a member of the
group  dri_fleat and should have write permissions. When I go to the
nfs-mounted version and try to use the touch command I get the
following

ibfdr-compute-0-4(dri_fleat)% touch dum
touch: cannot touch `dum': Permission denied

One of the sub-directories under dri_fleat is "test" which phaley
owns

drwxrwsr-x  2 phaley   dri_fleat 4.0K May  1 15:16 test

Under this directory (mounted via nfs) user phaley can write

ibfdr-compute-0-4(test)% touch dum
ibfdr-compute-0-4(test)%
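The directory modes involved can be reproduced locally to confirm that POSIX itself permits the write, so the denial must be introduced somewhere along the NFS path. A throwaway-directory sketch (illustration only; it does not touch the gluster volume):

```shell
# Reproduce the "drwxrwsr-x" mode (02775: group-writable + setgid)
# on a temporary directory and confirm a write succeeds.
demo=$(mktemp -d)
chmod 2775 "$demo"
stat -c '%A %a' "$demo"      # prints: drwxrwsr-x 2775
touch "$demo/dum" && echo "write ok"
rm -rf "$demo"
```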

I have put the packet captures in

http://mseas.mit.edu/download/phaley/GlusterUsers/TestNFSmount/

capture_nfsfail.pcap   has the results from the failed touch
experiment
capture_nfssucceed.pcap  has the results from the successful touch
experiment

The command I used for these was

tcpdump -i ib0 -nnSs 0 host 172.16.1.119 -w
/root/capture_nfstest.pcap


I hope these pkts were captured on the node where the NFS server is
running. Could you please use '-i any' as I do not see glusterfs
traffic in the tcpdump.

Also looks like NFS v4 is used between client & nfs server. Are you
using kernel-NFS here (i.e, kernel-NFS exporting fuse mounted gluster
volume)?
If that is the case please capture fuse-mnt logs as well. This error
may well be coming from kernel-NFS itself before the request is sent
to fuse-mnt process.
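For reference, the setup being asked about (kernel-NFS re-exporting a fuse mount) typically looks like the sketch below; the mount point, client network, and export options are illustrative assumptions, not details confirmed in this thread:

```shell
# Sketch: fuse-mount the gluster volume, then re-export it via kernel NFS.
mount -t glusterfs mseas-data2:/data-volume /gdata

# FUSE filesystems generally need an explicit fsid to be exportable by kNFS:
echo '/gdata 172.16.1.0/24(rw,sync,no_root_squash,fsid=14)' >> /etc/exports
exportfs -ra
```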

FWIW, we have below option -

Option: server.manage-gids
Default Value: off
Description: Resolve groups on the server-side.

I haven't looked into what this option exactly does. But it may be worth
testing with this option on.
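The option is set per volume; a hedged example using the volume name from this thread (this option is commonly used to have the server resolve a user's full group list itself, e.g. for users in more than ~16 auxiliary groups, though that does not appear to be the case here):

```shell
# Sketch: turn on server-side group resolution for the volume.
gluster volume set data-volume server.manage-gids on
```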

Thanks,
Soumya




The brick log files 

Re: [Gluster-users] Slow write times to gluster disk

2017-07-06 Thread Pat Haley


Hi All,

A follow-up question.  I've been looking at various pages on nfs-ganesha 
& gluster.  Is there a version of nfs-ganesha that is recommended for 
use with


glusterfs 3.7.11 built on Apr 27 2016 14:09:22
CentOS release 6.8 (Final)

Thanks

Pat


On 07/05/2017 11:36 AM, Pat Haley wrote:


Hi Soumya,

(1) In http://mseas.mit.edu/download/phaley/GlusterUsers/TestNFSmount/ 
I've placed the following 2 log files


etc-glusterfs-glusterd.vol.log
gdata.log

The first has repeated messages about nfs disconnects.  The second had 
the .log name (but not much information).


(2) About the gluster-NFS native server:  do you know where we can 
find documentation on how to use/install it?  We haven't had success 
in our searches.


Thanks

Pat


On 07/04/2017 05:01 AM, Soumya Koduri wrote:



On 07/03/2017 09:01 PM, Pat Haley wrote:


Hi Soumya,

When I originally did the tests I ran tcpdump on the client.

I have rerun the tests, doing tcpdump on the server

tcpdump -i any -nnSs 0 host 172.16.1.121 -w /root/capture_nfsfail.pcap

The results are in the same place

http://mseas.mit.edu/download/phaley/GlusterUsers/TestNFSmount/

capture_nfsfail.pcap   has the results from the failed touch experiment
capture_nfssucceed.pcap  has the results from the successful touch
experiment

The brick log files are there too.


Thanks for sharing. Looks like the error is not generated 
@gluster-server side. The permission denied error was caused by 
either kNFS or by fuse-mnt process or probably by the combination.


To check fuse-mnt logs, please look at 
/var/log/glusterfs/<mount-path>.log


For eg.: if you have fuse mounted the gluster volume at /mnt/fuse-mnt 
and exported it via kNFS, the log location for that fuse_mnt shall be 
at /var/log/glusterfs/mnt-fuse-mnt.log



Also why not switch to either gluster-NFS native server or 
NFS-Ganesha instead of using kNFS, as they are recommended NFS 
servers to use with gluster?


Thanks,
Soumya



I believe we are using kernel-NFS exporting a fuse mounted gluster
volume.  I am having Steve confirm this.  I tried to find the fuse-mnt
logs but failed.  Where should I look for them?

Thanks

Pat



On 07/03/2017 07:58 AM, Soumya Koduri wrote:



On 06/30/2017 07:56 PM, Pat Haley wrote:


Hi,

I was wondering if there were any additional tests we could perform to
help debug the group write-permissions issue?


Sorry for the delay. Please find response inline --



Thanks

Pat


On 06/27/2017 12:29 PM, Pat Haley wrote:


Hi Soumya,

One example, we have a common working directory dri_fleat in the
gluster volume

drwxrwsr-x 22 root dri_fleat 4.0K May  1 15:14 dri_fleat

my user (phaley) does not own that directory but is a member of the
group  dri_fleat and should have write permissions. When I go to the
nfs-mounted version and try to use the touch command I get the
following

ibfdr-compute-0-4(dri_fleat)% touch dum
touch: cannot touch `dum': Permission denied

One of the sub-directories under dri_fleat is "test" which phaley 
owns


drwxrwsr-x  2 phaley   dri_fleat 4.0K May  1 15:16 test

Under this directory (mounted via nfs) user phaley can write

ibfdr-compute-0-4(test)% touch dum
ibfdr-compute-0-4(test)%

I have put the packet captures in

http://mseas.mit.edu/download/phaley/GlusterUsers/TestNFSmount/

capture_nfsfail.pcap   has the results from the failed touch 
experiment

capture_nfssucceed.pcap  has the results from the successful touch
experiment

The command I used for these was

tcpdump -i ib0 -nnSs 0 host 172.16.1.119 -w 
/root/capture_nfstest.pcap


I hope these pkts were captured on the node where the NFS server is
running. Could you please use '-i any' as I do not see glusterfs
traffic in the tcpdump.

Also looks like NFS v4 is used between client & nfs server. Are you
using kernel-NFS here (i.e, kernel-NFS exporting fuse mounted gluster
volume)?
If that is the case please capture fuse-mnt logs as well. This error
may well be coming from kernel-NFS itself before the request is sent
to fuse-mnt process.

FWIW, we have below option -

Option: server.manage-gids
Default Value: off
Description: Resolve groups on the server-side.

I haven't looked into what this option exactly does. But it may be worth
testing with this option on.

Thanks,
Soumya




The brick log files are also in the above link.  If I read them
correctly they both have funny times.  Specifically I see entries from
around 2017-06-27 14:02:37.404865  even though the system time was
2017-06-27 12:00:00.

One final item, another reply to my post had a link for possible
problems that could arise from users belonging to too many groups. We
have seen the above problem even with a user belonging to only 4
groups.

Let me know what additional information I can provide.




Thanks

Pat


On 06/27/2017 02:45 AM, Soumya Koduri wrote:



On 06/27/2017 10:17 AM, Pranith Kumar Karampuri wrote:

The only problem with using gluster mounted via NFS is that it
does not
respect the group write permissions which we need.

We have an exercise 

Re: [Gluster-users] Slow write times to gluster disk

2017-07-05 Thread Pat Haley


Hi Soumya,

(1) In http://mseas.mit.edu/download/phaley/GlusterUsers/TestNFSmount/ 
I've placed the following 2 log files


etc-glusterfs-glusterd.vol.log
gdata.log

The first has repeated messages about nfs disconnects.  The second had 
the .log name (but not much information).


(2) About the gluster-NFS native server:  do you know where we can find 
documentation on how to use/install it?  We haven't had success in our 
searches.


Thanks

Pat


On 07/04/2017 05:01 AM, Soumya Koduri wrote:



On 07/03/2017 09:01 PM, Pat Haley wrote:


Hi Soumya,

When I originally did the tests I ran tcpdump on the client.

I have rerun the tests, doing tcpdump on the server

tcpdump -i any -nnSs 0 host 172.16.1.121 -w /root/capture_nfsfail.pcap

The results are in the same place

http://mseas.mit.edu/download/phaley/GlusterUsers/TestNFSmount/

capture_nfsfail.pcap   has the results from the failed touch experiment
capture_nfssucceed.pcap  has the results from the successful touch
experiment

The brick log files are there too.


Thanks for sharing. Looks like the error is not generated 
@gluster-server side. The permission denied error was caused by either 
kNFS or by fuse-mnt process or probably by the combination.


To check fuse-mnt logs, please look at 
/var/log/glusterfs/<mount-path>.log


For eg.: if you have fuse mounted the gluster volume at /mnt/fuse-mnt 
and exported it via kNFS, the log location for that fuse_mnt shall be 
at /var/log/glusterfs/mnt-fuse-mnt.log



Also why not switch to either gluster-NFS native server or NFS-Ganesha 
instead of using kNFS, as they are recommended NFS servers to use with 
gluster?


Thanks,
Soumya



I believe we are using kernel-NFS exporting a fuse mounted gluster
volume.  I am having Steve confirm this.  I tried to find the fuse-mnt
logs but failed.  Where should I look for them?

Thanks

Pat



On 07/03/2017 07:58 AM, Soumya Koduri wrote:



On 06/30/2017 07:56 PM, Pat Haley wrote:


Hi,

I was wondering if there were any additional tests we could perform to
help debug the group write-permissions issue?


Sorry for the delay. Please find response inline --



Thanks

Pat


On 06/27/2017 12:29 PM, Pat Haley wrote:


Hi Soumya,

One example, we have a common working directory dri_fleat in the
gluster volume

drwxrwsr-x 22 root dri_fleat 4.0K May  1 15:14 dri_fleat

my user (phaley) does not own that directory but is a member of the
group  dri_fleat and should have write permissions.  When I go to the
nfs-mounted version and try to use the touch command I get the
following

ibfdr-compute-0-4(dri_fleat)% touch dum
touch: cannot touch `dum': Permission denied

One of the sub-directories under dri_fleat is "test" which phaley 
owns


drwxrwsr-x  2 phaley   dri_fleat 4.0K May  1 15:16 test

Under this directory (mounted via nfs) user phaley can write

ibfdr-compute-0-4(test)% touch dum
ibfdr-compute-0-4(test)%

I have put the packet captures in

http://mseas.mit.edu/download/phaley/GlusterUsers/TestNFSmount/

capture_nfsfail.pcap   has the results from the failed touch 
experiment

capture_nfssucceed.pcap  has the results from the successful touch
experiment

The command I used for these was

tcpdump -i ib0 -nnSs 0 host 172.16.1.119 -w 
/root/capture_nfstest.pcap


I hope these pkts were captured on the node where the NFS server is
running. Could you please use '-i any' as I do not see glusterfs
traffic in the tcpdump.

Also looks like NFS v4 is used between client & nfs server. Are you
using kernel-NFS here (i.e, kernel-NFS exporting fuse mounted gluster
volume)?
If that is the case please capture fuse-mnt logs as well. This error
may well be coming from kernel-NFS itself before the request is sent
to fuse-mnt process.

FWIW, we have below option -

Option: server.manage-gids
Default Value: off
Description: Resolve groups on the server-side.

I haven't looked into what this option exactly does. But it may be worth
testing with this option on.

Thanks,
Soumya




The brick log files are also in the above link.  If I read them
correctly they both have funny times.  Specifically I see entries from
around 2017-06-27 14:02:37.404865  even though the system time was
2017-06-27 12:00:00.

One final item, another reply to my post had a link for possible
problems that could arise from users belonging to too many groups. We
have seen the above problem even with a user belonging to only 4
groups.

Let me know what additional information I can provide.




Thanks

Pat


On 06/27/2017 02:45 AM, Soumya Koduri wrote:



On 06/27/2017 10:17 AM, Pranith Kumar Karampuri wrote:

The only problem with using gluster mounted via NFS is that it
does not
respect the group write permissions which we need.

We have an exercise coming up in a couple of weeks. It seems
to me
that in order to improve our write times before then, it would be
good
to solve the group write permissions for gluster mounted via NFS now.
We can then revisit gluster mounted via FUSE afterwards.

What information would you need to help us force 

Re: [Gluster-users] Slow write times to gluster disk

2017-07-04 Thread Soumya Koduri



On 07/03/2017 09:01 PM, Pat Haley wrote:


Hi Soumya,

When I originally did the tests I ran tcpdump on the client.

I have rerun the tests, doing tcpdump on the server

tcpdump -i any -nnSs 0 host 172.16.1.121 -w /root/capture_nfsfail.pcap

The results are in the same place

http://mseas.mit.edu/download/phaley/GlusterUsers/TestNFSmount/

capture_nfsfail.pcap   has the results from the failed touch experiment
capture_nfssucceed.pcap  has the results from the successful touch
experiment

The brick log files are there too.


Thanks for sharing. Looks like the error is not generated 
@gluster-server side. The permission denied error was caused by either 
kNFS or by fuse-mnt process or probably by the combination.


To check fuse-mnt logs, please look at 
/var/log/glusterfs/<mount-path>.log


For eg.: if you have fuse mounted the gluster volume at /mnt/fuse-mnt 
and exported it via kNFS, the log location for that fuse_mnt shall be at 
/var/log/glusterfs/mnt-fuse-mnt.log



Also why not switch to either gluster-NFS native server or NFS-Ganesha 
instead of using kNFS, as they are recommended NFS servers to use with 
gluster?


Thanks,
Soumya



I believe we are using kernel-NFS exporting a fuse mounted gluster
volume.  I am having Steve confirm this.  I tried to find the fuse-mnt
logs but failed.  Where should I look for them?

Thanks

Pat



On 07/03/2017 07:58 AM, Soumya Koduri wrote:



On 06/30/2017 07:56 PM, Pat Haley wrote:


Hi,

I was wondering if there were any additional tests we could perform to
help debug the group write-permissions issue?


Sorry for the delay. Please find response inline --



Thanks

Pat


On 06/27/2017 12:29 PM, Pat Haley wrote:


Hi Soumya,

One example, we have a common working directory dri_fleat in the
gluster volume

drwxrwsr-x 22 root dri_fleat 4.0K May  1 15:14 dri_fleat

my user (phaley) does not own that directory but is a member of the
group  dri_fleat and should have write permissions.  When I go to the
nfs-mounted version and try to use the touch command I get the
following

ibfdr-compute-0-4(dri_fleat)% touch dum
touch: cannot touch `dum': Permission denied

One of the sub-directories under dri_fleat is "test" which phaley owns

drwxrwsr-x  2 phaley   dri_fleat 4.0K May  1 15:16 test

Under this directory (mounted via nfs) user phaley can write

ibfdr-compute-0-4(test)% touch dum
ibfdr-compute-0-4(test)%

I have put the packet captures in

http://mseas.mit.edu/download/phaley/GlusterUsers/TestNFSmount/

capture_nfsfail.pcap   has the results from the failed touch experiment
capture_nfssucceed.pcap  has the results from the successful touch
experiment

The command I used for these was

tcpdump -i ib0 -nnSs 0 host 172.16.1.119 -w /root/capture_nfstest.pcap


I hope these pkts were captured on the node where the NFS server is
running. Could you please use '-i any' as I do not see glusterfs
traffic in the tcpdump.

Also looks like NFS v4 is used between client & nfs server. Are you
using kernel-NFS here (i.e, kernel-NFS exporting fuse mounted gluster
volume)?
If that is the case please capture fuse-mnt logs as well. This error
may well be coming from kernel-NFS itself before the request is sent
to fuse-mnt process.

FWIW, we have below option -

Option: server.manage-gids
Default Value: off
Description: Resolve groups on the server-side.

I haven't looked into what this option exactly does. But it may be worth
testing with this option on.

Thanks,
Soumya




The brick log files are also in the above link.  If I read them
correctly they both have funny times.  Specifically I see entries from
around 2017-06-27 14:02:37.404865  even though the system time was
2017-06-27 12:00:00.

One final item, another reply to my post had a link for possible
problems that could arise from users belonging to too many groups. We
have seen the above problem even with a user belonging to only 4
groups.

Let me know what additional information I can provide.




Thanks

Pat


On 06/27/2017 02:45 AM, Soumya Koduri wrote:



On 06/27/2017 10:17 AM, Pranith Kumar Karampuri wrote:

The only problem with using gluster mounted via NFS is that it
does not
respect the group write permissions which we need.

We have an exercise coming up in a couple of weeks. It seems
to me
that in order to improve our write times before then, it would be
good
to solve the group write permissions for gluster mounted via NFS now.
We can then revisit gluster mounted via FUSE afterwards.

What information would you need to help us force gluster mounted via
NFS
to respect the group write permissions?


Is this owning group or one of the auxiliary groups whose write
permissions are not considered? AFAIK, there are no special
permission checks done by gNFS server when compared to gluster native
client.

Could you please provide simple steps to reproduce the issue and
collect pkt trace and nfs/brick logs as well.

Thanks,
Soumya







___
Gluster-users mailing list
Gluster-users@gluster.org

Re: [Gluster-users] Slow write times to gluster disk

2017-07-03 Thread Pat Haley


Hi Soumya,

When I originally did the tests I ran tcpdump on the client.

I have rerun the tests, doing tcpdump on the server

tcpdump -i any -nnSs 0 host 172.16.1.121 -w /root/capture_nfsfail.pcap

The results are in the same place

http://mseas.mit.edu/download/phaley/GlusterUsers/TestNFSmount/

capture_nfsfail.pcap   has the results from the failed touch experiment
capture_nfssucceed.pcap  has the results from the successful touch 
experiment


The brick log files are there too.

I believe we are using kernel-NFS exporting a fuse mounted gluster 
volume.  I am having Steve confirm this.  I tried to find the fuse-mnt 
logs but failed.  Where should I look for them?


Thanks

Pat



On 07/03/2017 07:58 AM, Soumya Koduri wrote:



On 06/30/2017 07:56 PM, Pat Haley wrote:


Hi,

I was wondering if there were any additional tests we could perform to
help debug the group write-permissions issue?


Sorry for the delay. Please find response inline --



Thanks

Pat


On 06/27/2017 12:29 PM, Pat Haley wrote:


Hi Soumya,

One example, we have a common working directory dri_fleat in the
gluster volume

drwxrwsr-x 22 root dri_fleat 4.0K May  1 15:14 dri_fleat

my user (phaley) does not own that directory but is a member of the
group  dri_fleat and should have write permissions.  When I go to the
nfs-mounted version and try to use the touch command I get the 
following


ibfdr-compute-0-4(dri_fleat)% touch dum
touch: cannot touch `dum': Permission denied

One of the sub-directories under dri_fleat is "test" which phaley owns

drwxrwsr-x  2 phaley   dri_fleat 4.0K May  1 15:16 test

Under this directory (mounted via nfs) user phaley can write

ibfdr-compute-0-4(test)% touch dum
ibfdr-compute-0-4(test)%

I have put the packet captures in

http://mseas.mit.edu/download/phaley/GlusterUsers/TestNFSmount/

capture_nfsfail.pcap   has the results from the failed touch experiment
capture_nfssucceed.pcap  has the results from the successful touch
experiment

The command I used for these was

tcpdump -i ib0 -nnSs 0 host 172.16.1.119 -w /root/capture_nfstest.pcap


I hope these pkts were captured on the node where the NFS server is
running. Could you please use '-i any' as I do not see glusterfs 
traffic in the tcpdump.


Also looks like NFS v4 is used between client & nfs server. Are you 
using kernel-NFS here (i.e, kernel-NFS exporting fuse mounted gluster 
volume)?
If that is the case please capture fuse-mnt logs as well. This error 
may well be coming from kernel-NFS itself before the request is sent 
to fuse-mnt process.


FWIW, we have below option -

Option: server.manage-gids
Default Value: off
Description: Resolve groups on the server-side.

I haven't looked into what this option exactly does. But it may be worth 
testing with this option on.


Thanks,
Soumya




The brick log files are also in the above link.  If I read them
correctly they both have funny times.  Specifically I see entries from
around 2017-06-27 14:02:37.404865  even though the system time was
2017-06-27 12:00:00.

One final item, another reply to my post had a link for possible
problems that could arise from users belonging to too many groups. We
have seen the above problem even with a user belonging to only 4 
groups.


Let me know what additional information I can provide.




Thanks

Pat


On 06/27/2017 02:45 AM, Soumya Koduri wrote:



On 06/27/2017 10:17 AM, Pranith Kumar Karampuri wrote:
The only problem with using gluster mounted via NFS is that it does not
respect the group write permissions which we need.

We have an exercise coming up in a couple of weeks. It seems 
to me
that in order to improve our write times before then, it would be good
to solve the group write permissions for gluster mounted via NFS now.
We can then revisit gluster mounted via FUSE afterwards.

What information would you need to help us force gluster mounted via
NFS
to respect the group write permissions?


Is this owning group or one of the auxiliary groups whose write
permissions are not considered? AFAIK, there are no special
permission checks done by gNFS server when compared to gluster native
client.

Could you please provide simple steps to reproduce the issue and
collect pkt trace and nfs/brick logs as well.

Thanks,
Soumya






--

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley  Email:  pha...@mit.edu
Center for Ocean Engineering   Phone:  (617) 253-6824
Dept. of Mechanical EngineeringFax:(617) 253-8125
MIT, Room 5-213http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA  02139-4301



Re: [Gluster-users] Slow write times to gluster disk

2017-07-03 Thread Soumya Koduri



On 06/30/2017 07:56 PM, Pat Haley wrote:


Hi,

I was wondering if there were any additional tests we could perform to
help debug the group write-permissions issue?


Sorry for the delay. Please find response inline --



Thanks

Pat


On 06/27/2017 12:29 PM, Pat Haley wrote:


Hi Soumya,

One example, we have a common working directory dri_fleat in the
gluster volume

drwxrwsr-x 22 root dri_fleat 4.0K May  1 15:14 dri_fleat

my user (phaley) does not own that directory but is a member of the
group  dri_fleat and should have write permissions.  When I go to the
nfs-mounted version and try to use the touch command I get the following

ibfdr-compute-0-4(dri_fleat)% touch dum
touch: cannot touch `dum': Permission denied

One of the sub-directories under dri_fleat is "test" which phaley owns

drwxrwsr-x  2 phaley   dri_fleat 4.0K May  1 15:16 test

Under this directory (mounted via nfs) user phaley can write

ibfdr-compute-0-4(test)% touch dum
ibfdr-compute-0-4(test)%

I have put the packet captures in

http://mseas.mit.edu/download/phaley/GlusterUsers/TestNFSmount/

capture_nfsfail.pcap   has the results from the failed touch experiment
capture_nfssucceed.pcap  has the results from the successful touch
experiment

The command I used for these was

tcpdump -i ib0 -nnSs 0 host 172.16.1.119 -w /root/capture_nfstest.pcap


I hope these pkts were captured on the node where the NFS server is running. 
Could you please use '-i any' as I do not see glusterfs traffic in the 
tcpdump.


Also looks like NFS v4 is used between client & nfs server. Are you 
using kernel-NFS here (i.e, kernel-NFS exporting fuse mounted gluster 
volume)?
If that is the case please capture fuse-mnt logs as well. This error may 
well be coming from kernel-NFS itself before the request is sent to 
fuse-mnt process.


FWIW, we have below option -

Option: server.manage-gids
Default Value: off
Description: Resolve groups on the server-side.

I haven't looked into what this option exactly does. But it may be worth 
testing with this option on.


Thanks,
Soumya




The brick log files are also in the above link.  If I read them
correctly they both have funny times.  Specifically I see entries from
around 2017-06-27 14:02:37.404865  even though the system time was
2017-06-27 12:00:00.

One final item, another reply to my post had a link for possible
problems that could arise from users belonging to too many groups. We
have seen the above problem even with a user belonging to only 4 groups.

Let me know what additional information I can provide.




Thanks

Pat


On 06/27/2017 02:45 AM, Soumya Koduri wrote:



On 06/27/2017 10:17 AM, Pranith Kumar Karampuri wrote:

The only problem with using gluster mounted via NFS is that it does not
respect the group write permissions which we need.

We have an exercise coming up in a couple of weeks.  It seems to me
that in order to improve our write times before then, it would be good
to solve the group write permissions for gluster mounted via NFS now.
We can then revisit gluster mounted via FUSE afterwards.

What information would you need to help us force gluster mounted via
NFS
to respect the group write permissions?


Is this owning group or one of the auxiliary groups whose write
permissions are not considered? AFAIK, there are no special
permission checks done by gNFS server when compared to gluster native
client.

Could you please provide simple steps to reproduce the issue and
collect pkt trace and nfs/brick logs as well.

Thanks,
Soumya







Re: [Gluster-users] Slow write times to gluster disk

2017-06-30 Thread Pat Haley


Hi,

I was wondering if there were any additional tests we could perform to 
help debug the group write-permissions issue?


Thanks

Pat


On 06/27/2017 12:29 PM, Pat Haley wrote:


Hi Soumya,

One example, we have a common working directory dri_fleat in the 
gluster volume


drwxrwsr-x 22 root dri_fleat 4.0K May  1 15:14 dri_fleat

my user (phaley) does not own that directory but is a member of the 
group  dri_fleat and should have write permissions.  When I go to the 
nfs-mounted version and try to use the touch command I get the following


ibfdr-compute-0-4(dri_fleat)% touch dum
touch: cannot touch `dum': Permission denied

One of the sub-directories under dri_fleat is "test" which phaley owns

drwxrwsr-x  2 phaley   dri_fleat 4.0K May  1 15:16 test

Under this directory (mounted via nfs) user phaley can write

ibfdr-compute-0-4(test)% touch dum
ibfdr-compute-0-4(test)%

I have put the packet captures in

http://mseas.mit.edu/download/phaley/GlusterUsers/TestNFSmount/

capture_nfsfail.pcap   has the results from the failed touch experiment
capture_nfssucceed.pcap  has the results from the successful touch 
experiment


The command I used for these was

tcpdump -i ib0 -nnSs 0 host 172.16.1.119 -w /root/capture_nfstest.pcap

The brick log files are also in the above link.  If I read them 
correctly they both have funny times.  Specifically I see entries from 
around 2017-06-27 14:02:37.404865  even though the system time was 
2017-06-27 12:00:00.


One final item, another reply to my post had a link for possible 
problems that could arise from users belonging to too many groups. We 
have seen the above problem even with a user belonging to only 4 groups.


Let me know what additional information I can provide.

Thanks

Pat


On 06/27/2017 02:45 AM, Soumya Koduri wrote:



On 06/27/2017 10:17 AM, Pranith Kumar Karampuri wrote:

The only problem with using gluster mounted via NFS is that it does not
respect the group write permissions which we need.

We have an exercise coming up in a couple of weeks.  It seems to me
that in order to improve our write times before then, it would be good
to solve the group write permissions for gluster mounted via NFS now.
We can then revisit gluster mounted via FUSE afterwards.

What information would you need to help us force gluster mounted via NFS
to respect the group write permissions?


Is this owning group or one of the auxiliary groups whose write 
permissions are not considered? AFAIK, there are no special 
permission checks done by gNFS server when compared to gluster native 
client.


Could you please provide simple steps to reproduce the issue and 
collect pkt trace and nfs/brick logs as well.


Thanks,
Soumya




--

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley  Email:  pha...@mit.edu
Center for Ocean Engineering   Phone:  (617) 253-6824
Dept. of Mechanical EngineeringFax:(617) 253-8125
MIT, Room 5-213http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA  02139-4301



Re: [Gluster-users] Slow write times to gluster disk

2017-06-27 Thread Pat Haley


Hi Soumya,

One example, we have a common working directory dri_fleat in the gluster 
volume


drwxrwsr-x 22 root dri_fleat 4.0K May  1 15:14 dri_fleat

my user (phaley) does not own that directory but is a member of the 
group  dri_fleat and should have write permissions.  When I go to the 
nfs-mounted version and try to use the touch command I get the following


ibfdr-compute-0-4(dri_fleat)% touch dum
touch: cannot touch `dum': Permission denied

One of the sub-directories under dri_fleat is "test" which phaley owns

drwxrwsr-x  2 phaley   dri_fleat 4.0K May  1 15:16 test

Under this directory (mounted via nfs) user phaley can write

ibfdr-compute-0-4(test)% touch dum
ibfdr-compute-0-4(test)%

I have put the packet captures in

http://mseas.mit.edu/download/phaley/GlusterUsers/TestNFSmount/

capture_nfsfail.pcap   has the results from the failed touch experiment
capture_nfssucceed.pcap  has the results from the successful touch 
experiment


The command I used for these was

tcpdump -i ib0 -nnSs 0 host 172.16.1.119 -w /root/capture_nfstest.pcap

The brick log files are also in the above link.  If I read them 
correctly they both have funny times.  Specifically I see entries from around 
2017-06-27 14:02:37.404865  even though the system time was 2017-06-27 
12:00:00.


One final item, another reply to my post had a link for possible 
problems that could arise from users belonging to too many groups. We 
have seen the above problem even with a user belonging to only 4 groups.


Let me know what additional information I can provide.

Thanks

Pat


On 06/27/2017 02:45 AM, Soumya Koduri wrote:



On 06/27/2017 10:17 AM, Pranith Kumar Karampuri wrote:

The only problem with using gluster mounted via NFS is that it does not
respect the group write permissions which we need.

We have an exercise coming up in a couple of weeks.  It seems to me
that in order to improve our write times before then, it would be good
to solve the group write permissions for gluster mounted via NFS now.
We can then revisit gluster mounted via FUSE afterwards.

What information would you need to help us force gluster mounted via NFS
to respect the group write permissions?


Is this owning group or one of the auxiliary groups whose write 
permissions are not considered? AFAIK, there are no special permission 
checks done by gNFS server when compared to gluster native client.


Could you please provide simple steps to reproduce the issue and 
collect pkt trace and nfs/brick logs as well.


Thanks,
Soumya


--

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley  Email:  pha...@mit.edu
Center for Ocean Engineering   Phone:  (617) 253-6824
Dept. of Mechanical EngineeringFax:(617) 253-8125
MIT, Room 5-213http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA  02139-4301

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Slow write times to gluster disk

2017-06-27 Thread Niels de Vos
> >>> http://lists.gluster.org/pipermail/gluster-users/2017-April/030529.html
> >>>- The specific model of the hard disks is SeaGate ENTERPRISE
> >>>CAPACITY V.4 6TB (ST6000NM0024). The rated speed is 6Gb/s.
> >>>- Note: there is one physical server that hosts both the NFS and the
> >>>GlusterFS areas
> >>>
> >>> Latest tests
> >>>
> >>> I have had time to run the tests for one of the dd tests you requested
> >>> to the underlying XFS FS.  The median rate was 170 MB/s.  The dd results
> >>> and iostat record are in
> >>>
> >>> http://mseas.mit.edu/download/phaley/GlusterUsers/TestXFS/
> >>>
> >>> I'll add tests for the other brick and to the NFS area later.
> >>>
> >>> Thanks
> >>>
> >>> Pat
> >>>
> >>>
> >>> On 06/12/2017 06:06 PM, Ben Turner wrote:
> >>>
> >>> Ok you are correct, you have a pure distributed volume.  IE no 
> >>> replication overhead.  So normally for pure dist I use:
> >>>
> >>> throughput = slowest of disks / NIC * .6-.7
> >>>
> >>> In your case we have:
> >>>
> >>> 1200 * .6 = 720
> >>>
> >>> So you are seeing a little less throughput than I would expect in your 
> >>> configuration.  What I like to do here is:
> >>>
> >>> -First tell me more about your back end storage, will it sustain 1200 MB 
> >>> / sec?  What kind of HW?  How many disks?  What type and specs are the 
> >>> disks?  What kind of RAID are you using?
> >>>
> >>> -Second can you refresh me on your workload?  Are you doing reads / 
> >>> writes or both?  If both what mix?  Since we are using DD I assume you 
> >>> are working with large file sequential I/O, is this correct?
> >>>
> >>> -Run some DD tests on the back end XFS FS.  I normally have 
> >>> /xfs-mount/gluster-brick, if you have something similar just mkdir on the 
> >>> XFS -> /xfs-mount/my-test-dir.  Inside the test dir run:
> >>>
> >>> If you are focusing on a write workload run:
> >>>
> >>> # dd if=/dev/zero of=/xfs-mount/file bs=1024k count=1 conv=fdatasync
> >>>
> >>> If you are focusing on a read workload run:
> >>>
> >>> # echo 3 > /proc/sys/vm/drop_caches
> >>> # dd if=/gluster-mount/file of=/dev/null bs=1024k count=1
> >>>
> >>> ** MAKE SURE TO DROP CACHE IN BETWEEN READS!! **
> >>>
> >>> Run this in a loop similar to how you did in:
> >>> http://mseas.mit.edu/download/phaley/GlusterUsers/TestVol/dd_testvol_gluster.txt
> >>>
> >>> Run this on both servers one at a time and if you are running on a SAN 
> >>> then run again on both at the same time.  While this is running gather 
> >>> iostat for me:
> >>>
> >>> # iostat -c -m -x 1 > iostat-$(hostname).txt
> >>>
> >>> Let's see how the back end performs on both servers while capturing 
> >>> iostat, then see how the same workload / data looks on gluster.
> >>>
> >>> -Last thing, when you run your kernel NFS tests are you using the same 
> >>> filesystem / storage you are using for the gluster bricks?  I want to be 
> >>> sure we have an apples to apples comparison here.
> >>>
> >>> -b
> >>>
> >>>
> >>>
> >>> - Original Message -
> >>>
> >>> From: "Pat Haley" <pha...@mit.edu>
> >>> To: "Ben Turner" <btur...@redhat.com>
> >>> Sent: Monday, June 12, 2017 5:18:07 PM
> >>> Subject: Re: [Gluster-users] Slow write times to gluster disk
> >>>
> >>>
> >>> Hi Ben,
> >>>
> >>> Here is the output:
> >>>
> >>> [root@mseas-data2 ~]# gluster volume info
> >>>
> >>> Volume Name: data-volume
> >>> Type: Distribute
> >>> Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
> >>> Status: Started
> >>> Number of Bricks: 2
> >>> Transport-type: tcp
> >>> Bricks:
> >>> Brick1: mseas-data2:/mnt/brick1
> >>> Brick2: mseas-data2:/mnt/brick2
> >>> Options Reconfigured:
> >>> nfs.exports-auth-enable: on
>

Re: [Gluster-users] Slow write times to gluster disk

2017-06-27 Thread Soumya Koduri



On 06/27/2017 10:17 AM, Pranith Kumar Karampuri wrote:

The only problem with using gluster mounted via NFS is that it does not
respect the group write permissions which we need.

We have an exercise coming up in a couple of weeks.  It seems to me
that in order to improve our write times before then, it would be good
to solve the group write permissions for gluster mounted via NFS now.
We can then revisit gluster mounted via FUSE afterwards.

What information would you need to help us force gluster mounted via NFS
to respect the group write permissions?


Is this owning group or one of the auxiliary groups whose write 
permissions are not considered? AFAIK, there are no special permission 
checks done by gNFS server when compared to gluster native client.


Could you please provide simple steps to reproduce the issue and collect 
pkt trace and nfs/brick logs as well.


Thanks,
Soumya


Re: [Gluster-users] Slow write times to gluster disk

2017-06-26 Thread Pranith Kumar Karampuri
orrect, you have a pure distributed volume.  IE no replication 
>>> overhead.  So normally for pure dist I use:
>>>
>>> throughput = slowest of disks / NIC * .6-.7
>>>
>>> In your case we have:
>>>
>>> 1200 * .6 = 720
>>>
>>> So you are seeing a little less throughput than I would expect in your 
>>> configuration.  What I like to do here is:
>>>
>>> -First tell me more about your back end storage, will it sustain 1200 MB / 
>>> sec?  What kind of HW?  How many disks?  What type and specs are the disks? 
>>>  What kind of RAID are you using?
>>>
>>> -Second can you refresh me on your workload?  Are you doing reads / writes 
>>> or both?  If both what mix?  Since we are using DD I assume you are working 
>>> with large file sequential I/O, is this correct?
>>>
>>> -Run some DD tests on the back end XFS FS.  I normally have 
>>> /xfs-mount/gluster-brick, if you have something similar just mkdir on the 
>>> XFS -> /xfs-mount/my-test-dir.  Inside the test dir run:
>>>
>>> If you are focusing on a write workload run:
>>>
>>> # dd if=/dev/zero of=/xfs-mount/file bs=1024k count=1 conv=fdatasync
>>>
>>> If you are focusing on a read workload run:
>>>
>>> # echo 3 > /proc/sys/vm/drop_caches
>>> # dd if=/gluster-mount/file of=/dev/null bs=1024k count=1
>>>
>>> ** MAKE SURE TO DROP CACHE IN BETWEEN READS!! **
>>>
>>> Run this in a loop similar to how you did in:
>>> http://mseas.mit.edu/download/phaley/GlusterUsers/TestVol/dd_testvol_gluster.txt
>>>
>>> Run this on both servers one at a time and if you are running on a SAN then 
>>> run again on both at the same time.  While this is running gather iostat 
>>> for me:
>>>
>>> # iostat -c -m -x 1 > iostat-$(hostname).txt
>>>
>>> Let's see how the back end performs on both servers while capturing iostat, 
>>> then see how the same workload / data looks on gluster.
>>>
>>> -Last thing, when you run your kernel NFS tests are you using the same 
>>> filesystem / storage you are using for the gluster bricks?  I want to be 
>>> sure we have an apples to apples comparison here.
>>>
>>> -b
>>>
>>>
>>>
>>> - Original Message -
>>>
>>> From: "Pat Haley" <pha...@mit.edu>
>>> To: "Ben Turner" <btur...@redhat.com>
>>> Sent: Monday, June 12, 2017 5:18:07 PM
>>> Subject: Re: [Gluster-users] Slow write times to gluster disk
>>>
>>>
>>> Hi Ben,
>>>
>>> Here is the output:
>>>
>>> [root@mseas-data2 ~]# gluster volume info
>>>
>>> Volume Name: data-volume
>>> Type: Distribute
>>> Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
>>> Status: Started
>>> Number of Bricks: 2
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: mseas-data2:/mnt/brick1
>>> Brick2: mseas-data2:/mnt/brick2
>>> Options Reconfigured:
>>> nfs.exports-auth-enable: on
>>> diagnostics.brick-sys-log-level: WARNING
>>> performance.readdir-ahead: on
>>> nfs.disable: on
>>> nfs.export-volumes: off
>>>
>>>
>>> On 06/12/2017 05:01 PM, Ben Turner wrote:
>>>
>>> What is the output of gluster v info?  That will tell us more about your
>>> config.
>>>
>>> -b
>>>
>>> - Original Message -
>>>
>>> From: "Pat Haley" <pha...@mit.edu>
>>> To: "Ben Turner" <btur...@redhat.com>
>>> Sent: Monday, June 12, 2017 4:54:00 PM
>>> Subject: Re: [Gluster-users] Slow write times to gluster disk
>>>
>>>
>>> Hi Ben,
>>>
>>> I guess I'm confused about what you mean by replication.  If I look at
>>> the underlying bricks I only ever have a single copy of any file.  It
>>> either resides on one brick or the other  (directories exist on both
>>> bricks but not files).  We are not using gluster for redundancy (or at
>>> least that wasn't our intent).   Is that what you meant by replication
>>> or is it something else?
>>>
>>> Thanks
>>>
>>> Pat
>>>
>>> On 06/12/2017 04:28 PM, Ben Turner wrote:
>>>
>>> - Original Message -
>>>

Re: [Gluster-users] Slow write times to gluster disk

2017-06-26 Thread Pat Haley
 I normally have 
/xfs-mount/gluster-brick, if you have something similar just mkdir on the XFS 
-> /xfs-mount/my-test-dir.  Inside the test dir run:

If you are focusing on a write workload run:

# dd if=/dev/zero of=/xfs-mount/file bs=1024k count=1 conv=fdatasync

If you are focusing on a read workload run:

# echo 3 > /proc/sys/vm/drop_caches
# dd if=/gluster-mount/file of=/dev/null bs=1024k count=1

** MAKE SURE TO DROP CACHE IN BETWEEN READS!! **

Run this in a loop similar to how you did in:


http://mseas.mit.edu/download/phaley/GlusterUsers/TestVol/dd_testvol_gluster.txt


Run this on both servers one at a time and if you are running on a SAN 
then run again on both at the same time.  While this is running gather iostat 
for me:

# iostat -c -m -x 1 > iostat-$(hostname).txt

Let's see how the back end performs on both servers while capturing 
iostat, then see how the same workload / data looks on gluster.

-Last thing, when you run your kernel NFS tests are you using the same 
filesystem / storage you are using for the gluster bricks?  I want to be sure 
we have an apples to apples comparison here.

-b



- Original Message -

From: "Pat Haley" <pha...@mit.edu>
To: "Ben Turner" <btur...@redhat.com>
Sent: Monday, June 12, 2017 5:18:07 PM
Subject: Re: [Gluster-users] Slow write times to gluster disk


Hi Ben,

Here is the output:

[root@mseas-data2 ~]# gluster volume info

Volume Name: data-volume
Type: Distribute
Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: mseas-data2:/mnt/brick1
Brick2: mseas-data2:/mnt/brick2
Options Reconfigured:
nfs.exports-auth-enable: on
diagnostics.brick-sys-log-level: WARNING
performance.readdir-ahead: on
nfs.disable: on
nfs.export-volumes: off


On 06/12/2017 05:01 PM, Ben Turner wrote:

What is the output of gluster v info?  That will tell us more about your
config.

-b

- Original Message -

From: "Pat Haley" <pha...@mit.edu>
To: "Ben Turner" <btur...@redhat.com>
Sent: Monday, June 12, 2017 4:54:00 PM
Subject: Re: [Gluster-users] Slow write times to gluster disk


Hi Ben,

I guess I'm confused about what you mean by replication.  If I look at
the underlying bricks I only ever have a single copy of any file.  It
either resides on one brick or the other  (directories exist on both
bricks but not files).  We are not using gluster for redundancy (or at
least that wasn't our intent).   Is that what you meant by replication
or is it something else?

Thanks

Pat

On 06/12/2017 04:28 PM, Ben Turner wrote:

- Original Message -

From: "Pat Haley" <pha...@mit.edu>
To: "Ben Turner" <btur...@redhat.com>, "Pranith Kumar Karampuri" <pkara...@redhat.com>
Cc: "Ravishankar N" <ravishan...@redhat.com>, gluster-users@gluster.org,
"Steve Postma" <spos...@ztechnet.com>
Sent: Monday, June 12, 2017 2:35:41 PM
Subject: Re: [Gluster-users] Slow write times to gluster disk


Hi Guys,

I was wondering what our next steps should be to solve the slow write
times.

Recently I was debugging a large code and writing a lot of output at
every time step.  When I tried writing to our gluster disks, it was
taking over a day to do a single time step whereas if I had the same
program (same hardware, network) write to our nfs disk the time per
time-step was about 45 minutes. What we are shooting for here would be
to have similar times to either gluster or nfs.

I can see in your test:


http://mseas.mit.edu/download/phaley/GlusterUsers/TestVol/dd_testvol_gluster.txt


You averaged ~600 MB / sec(expected for replica 2 with 10G, {~1200 MB /
sec} / #replicas{2} = 600).  Gluster does client side replication so 
with
replica 2 you 

Re: [Gluster-users] Slow write times to gluster disk

2017-06-23 Thread Pranith Kumar Karampuri
>
>> Run this on both servers one at a time and if you are running on a SAN then 
>> run again on both at the same time.  While this is running gather iostat for 
>> me:
>>
>> # iostat -c -m -x 1 > iostat-$(hostname).txt
>>
>> Let's see how the back end performs on both servers while capturing iostat, 
>> then see how the same workload / data looks on gluster.
>>
>> -Last thing, when you run your kernel NFS tests are you using the same 
>> filesystem / storage you are using for the gluster bricks?  I want to be 
>> sure we have an apples to apples comparison here.
>>
>> -b
>>
>>
>>
>> - Original Message -
>>
>> From: "Pat Haley" <pha...@mit.edu>
>> To: "Ben Turner" <btur...@redhat.com>
>> Sent: Monday, June 12, 2017 5:18:07 PM
>> Subject: Re: [Gluster-users] Slow write times to gluster disk
>>
>>
>> Hi Ben,
>>
>> Here is the output:
>>
>> [root@mseas-data2 ~]# gluster volume info
>>
>> Volume Name: data-volume
>> Type: Distribute
>> Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
>> Status: Started
>> Number of Bricks: 2
>> Transport-type: tcp
>> Bricks:
>> Brick1: mseas-data2:/mnt/brick1
>> Brick2: mseas-data2:/mnt/brick2
>> Options Reconfigured:
>> nfs.exports-auth-enable: on
>> diagnostics.brick-sys-log-level: WARNING
>> performance.readdir-ahead: on
>> nfs.disable: on
>> nfs.export-volumes: off
>>
>>
>> On 06/12/2017 05:01 PM, Ben Turner wrote:
>>
>> What is the output of gluster v info?  That will tell us more about your
>> config.
>>
>> -b
>>
>> - Original Message -
>>
>> From: "Pat Haley" <pha...@mit.edu>
>> To: "Ben Turner" <btur...@redhat.com>
>> Sent: Monday, June 12, 2017 4:54:00 PM
>> Subject: Re: [Gluster-users] Slow write times to gluster disk
>>
>>
>> Hi Ben,
>>
>> I guess I'm confused about what you mean by replication.  If I look at
>> the underlying bricks I only ever have a single copy of any file.  It
>> either resides on one brick or the other  (directories exist on both
>> bricks but not files).  We are not using gluster for redundancy (or at
>> least that wasn't our intent).   Is that what you meant by replication
>> or is it something else?
>>
>> Thanks
>>
>> Pat
>>
>> On 06/12/2017 04:28 PM, Ben Turner wrote:
>>
>> - Original Message -
>>
>> From: "Pat Haley" <pha...@mit.edu>
>> To: "Ben Turner" <btur...@redhat.com>, "Pranith Kumar Karampuri" <pkara...@redhat.com>
>> Cc: "Ravishankar N" <ravishan...@redhat.com>, gluster-users@gluster.org,
>> "Steve Postma" <spos...@ztechnet.com>
>> Sent: Monday, June 12, 2017 2:35:41 PM
>> Subject: Re: [Gluster-users] Slow write times to gluster disk
>>
>>
>> Hi Guys,
>>
>> I was wondering what our next steps should be to solve the slow write
>> times.
>>
>> Recently I was debugging a large code and writing a lot of output at
>> every time step.  When I tried writing to our gluster disks, it was
>> taking over a day to do a single time step whereas if I had the same
>> program (same hardware, network) write to our nfs disk the time per
>> time-step was about 45 minutes. What we are shooting for here would be
>> to have similar times to either gluster or nfs.
>>
>> I can see in your test:
>> http://mseas.mit.edu/download/phaley/GlusterUsers/TestVol/dd_testvol_gluster.txt
>>
>> You averaged ~600 MB / sec(expected for replica 2 with 10G, {~1200 MB /
>> sec} / #replicas{2} = 600).  Gluster does client side replication so with
>> replica 2 you will only ever see 1/2 the speed of your slowest part of
>> the
>> stack(NW, disk, RAM, CPU).  This is usually NW or disk and 600 is
>> normally
>> a best case.  Now in your output I do see the instances where you went
>> down to 200 MB / sec.  I can only explain this in three ways:
>>
>> 1.  You are not using conv=fdatasync and writes are actually going to
>> page
>> cache and then being flushed to disk.  During the fsync the memory is not
>> yet available and the disks are busy flushing dirty pages.
>&g

Re: [Gluster-users] Slow write times to gluster disk

2017-06-22 Thread Pranith Kumar Karampuri
On Fri, Jun 23, 2017 at 2:23 AM, Pat Haley <pha...@mit.edu> wrote:

>
> Hi,
>
> Today we experimented with some of the FUSE options that we found in the
> list.
>
> Changing these options had no effect:
>
> gluster volume set test-volume performance.cache-max-file-size 2MB
> gluster volume set test-volume performance.cache-refresh-timeout 4
> gluster volume set test-volume performance.cache-size 256MB
> gluster volume set test-volume performance.write-behind-window-size 4MB
> gluster volume set test-volume performance.write-behind-window-size 8MB
>
>
This is a good coincidence, I am meeting with write-behind
maintainer(+Raghavendra G) today for the same doubt. I think we will have
something by EOD IST. I will update you.

> Changing the following option from its default value made the speed slower
>
> gluster volume set test-volume performance.write-behind off (on by default)
>
> Changing the following options initially appeared to give a 10% increase
> in speed, but this vanished in subsequent tests (we think the apparent
> increase may have been due to a lighter workload on the computer from other
> users)
>
> gluster volume set test-volume performance.stat-prefetch on
> gluster volume set test-volume client.event-threads 4
> gluster volume set test-volume server.event-threads 4
>
> Can anything be gleaned from these observations?  Are there other things
> we can try?
>
> Thanks
>
> Pat
>
>
>
> On 06/20/2017 12:06 PM, Pat Haley wrote:
>
>
> Hi Ben,
>
> Sorry this took so long, but we had a real-time forecasting exercise last
> week and I could only get to this now.
>
> Backend Hardware/OS:
>
>- Much of the information on our back end system is included at the
>top of  http://lists.gluster.org/pipermail/gluster-users/2017-
>April/030529.html
>- The specific model of the hard disks is SeaGate ENTERPRISE CAPACITY
>V.4 6TB (ST6000NM0024). The rated speed is 6Gb/s.
>- Note: there is one physical server that hosts both the NFS and the
>GlusterFS areas
>
> Latest tests
>
> I have had time to run the tests for one of the dd tests you requested to
> the underlying XFS FS.  The median rate was 170 MB/s.  The dd results and
> iostat record are in
>
> http://mseas.mit.edu/download/phaley/GlusterUsers/TestXFS/
>
> I'll add tests for the other brick and to the NFS area later.
>
> Thanks
>
> Pat
>
>
> On 06/12/2017 06:06 PM, Ben Turner wrote:
>
> Ok you are correct, you have a pure distributed volume.  IE no replication 
> overhead.  So normally for pure dist I use:
>
> throughput = slowest of disks / NIC * .6-.7
>
> In your case we have:
>
> 1200 * .6 = 720
>
> So you are seeing a little less throughput than I would expect in your 
> configuration.  What I like to do here is:
>
> -First tell me more about your back end storage, will it sustain 1200 MB / 
> sec?  What kind of HW?  How many disks?  What type and specs are the disks?  
> What kind of RAID are you using?
>
> -Second can you refresh me on your workload?  Are you doing reads / writes or 
> both?  If both what mix?  Since we are using DD I assume you are working with 
> large file sequential I/O, is this correct?
>
> -Run some DD tests on the back end XFS FS.  I normally have 
> /xfs-mount/gluster-brick, if you have something similar just mkdir on the XFS 
> -> /xfs-mount/my-test-dir.  Inside the test dir run:
>
> If you are focusing on a write workload run:
>
> # dd if=/dev/zero of=/xfs-mount/file bs=1024k count=1 conv=fdatasync
>
> If you are focusing on a read workload run:
>
> # echo 3 > /proc/sys/vm/drop_caches
> # dd if=/gluster-mount/file of=/dev/null bs=1024k count=1
>
> ** MAKE SURE TO DROP CACHE IN BETWEEN READS!! **
>
> Run this in a loop similar to how you did in:
> http://mseas.mit.edu/download/phaley/GlusterUsers/TestVol/dd_testvol_gluster.txt
>
> Run this on both servers one at a time and if you are running on a SAN then 
> run again on both at the same time.  While this is running gather iostat for 
> me:
>
> # iostat -c -m -x 1 > iostat-$(hostname).txt
>
> Let's see how the back end performs on both servers while capturing iostat, 
> then see how the same workload / data looks on gluster.
>
> -Last thing, when you run your kernel NFS tests are you using the same 
> filesystem / storage you are using for the gluster bricks?  I want to be sure 
> we have an apples to apples comparison here.
>
> -b
>
>
>
> - Original Message -
>
> From: "Pat Haley" <pha...@mit.edu>
> To: "Ben Turner" <btur...@redhat.com>
> Sent: Monday

Re: [Gluster-users] Slow write times to gluster disk

2017-06-22 Thread Pat Haley


Hi,

Today we experimented with some of the FUSE options that we found in the 
list.


Changing these options had no effect:

gluster volume set test-volume performance.cache-max-file-size 2MB
gluster volume set test-volume performance.cache-refresh-timeout 4
gluster volume set test-volume performance.cache-size 256MB
gluster volume set test-volume performance.write-behind-window-size 4MB
gluster volume set test-volume performance.write-behind-window-size 8MB

Changing the following option from its default value made the speed slower

gluster volume set test-volume performance.write-behind off (on by default)

Changing the following options initially appeared to give a 10% increase 
in speed, but this vanished in subsequent tests (we think the apparent 
increase may have been due to a lighter workload on the computer from other 
users)


gluster volume set test-volume performance.stat-prefetch on
gluster volume set test-volume client.event-threads 4
gluster volume set test-volume server.event-threads 4


Can anything be gleaned from these observations?  Are there other things 
we can try?
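For the record, here is roughly how we have been timing each option in isolation (a sketch only; the volume name, mount path, and file size below are just the examples used in this thread, not our production values):

```shell
# Hypothetical A/B check: toggle one option, then time the same fdatasync'd write.
gluster volume set test-volume performance.write-behind on
time dd if=/dev/zero of=/gluster-mount/probe bs=1024k count=1024 conv=fdatasync

gluster volume set test-volume performance.write-behind off
time dd if=/dev/zero of=/gluster-mount/probe bs=1024k count=1024 conv=fdatasync
```

Changing only one option between timed runs keeps the comparison clean.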


Thanks

Pat


On 06/20/2017 12:06 PM, Pat Haley wrote:


Hi Ben,

Sorry this took so long, but we had a real-time forecasting exercise 
last week and I could only get to this now.


Backend Hardware/OS:

  * Much of the information on our back end system is included at the
top of
http://lists.gluster.org/pipermail/gluster-users/2017-April/030529.html
  * The specific model of the hard disks is SeaGate ENTERPRISE
CAPACITY V.4 6TB (ST6000NM0024). The rated speed is 6Gb/s.
  * Note: there is one physical server that hosts both the NFS and the
GlusterFS areas

Latest tests

I have had time to run the tests for one of the dd tests you requested 
to the underlying XFS FS.  The median rate was 170 MB/s.  The dd 
results and iostat record are in


http://mseas.mit.edu/download/phaley/GlusterUsers/TestXFS/

I'll add tests for the other brick and to the NFS area later.

Thanks

Pat


On 06/12/2017 06:06 PM, Ben Turner wrote:

Ok you are correct, you have a pure distributed volume.  IE no replication 
overhead.  So normally for pure dist I use:

throughput = slowest of disks / NIC * .6-.7

In your case we have:

1200 * .6 = 720
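That rule of thumb can be sketched as a one-liner (the 1200 MB/s bottleneck figure is the assumption from above):

```shell
# Expected throughput band for a pure distribute volume:
#   expected ~= (slowest of disk / NIC) * 0.6 to 0.7
awk -v b=1200 'BEGIN { printf "expected: %.0f-%.0f MB/s\n", b*0.6, b*0.7 }'
# prints: expected: 720-840 MB/s
```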

So you are seeing a little less throughput than I would expect in your 
configuration.  What I like to do here is:

-First tell me more about your back end storage, will it sustain 1200 MB / sec? 
 What kind of HW?  How many disks?  What type and specs are the disks?  What 
kind of RAID are you using?

-Second can you refresh me on your workload?  Are you doing reads / writes or 
> both?  If both what mix?  Since we are using DD I assume you are working with 
large file sequential I/O, is this correct?

-Run some DD tests on the back end XFS FS.  I normally have 
/xfs-mount/gluster-brick, if you have something similar just mkdir on the XFS 
-> /xfs-mount/my-test-dir.  Inside the test dir run:

If you are focusing on a write workload run:

# dd if=/dev/zero of=/xfs-mount/file bs=1024k count=1 conv=fdatasync

If you are focusing on a read workload run:

# echo 3 > /proc/sys/vm/drop_caches
# dd if=/gluster-mount/file of=/dev/null bs=1024k count=1

** MAKE SURE TO DROP CACHE IN BETWEEN READS!! **

Run this in a loop similar to how you did in:

http://mseas.mit.edu/download/phaley/GlusterUsers/TestVol/dd_testvol_gluster.txt
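A minimal sketch of such a loop (the temporary directory and the small count=4 are placeholders so the sketch itself runs anywhere; the real runs should target the XFS brick mount with a file large enough to exceed RAM):

```shell
# Run the fdatasync'd write several times and keep only dd's summary line.
TESTDIR=$(mktemp -d)
for i in 1 2 3; do
    dd if=/dev/zero of="$TESTDIR/file$i" bs=1024k count=4 conv=fdatasync 2>&1 | tail -n 1
done
rm -rf "$TESTDIR"
```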

Run this on both servers one at a time and if you are running on a SAN then run 
again on both at the same time.  While this is running gather iostat for me:

# iostat -c -m -x 1 > iostat-$(hostname).txt

Let's see how the back end performs on both servers while capturing iostat, then 
see how the same workload / data looks on gluster.

-Last thing, when you run your kernel NFS tests are you using the same 
filesystem / storage you are using for the gluster bricks?  I want to be sure 
we have an apples to apples comparison here.

-b



- Original Message -

From: "Pat Haley"<pha...@mit.edu>
To: "Ben Turner"<btur...@redhat.com>
Sent: Monday, June 12, 2017 5:18:07 PM
Subject: Re: [Gluster-users] Slow write times to gluster disk


Hi Ben,

Here is the output:

[root@mseas-data2 ~]# gluster volume info

Volume Name: data-volume
Type: Distribute
Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: mseas-data2:/mnt/brick1
Brick2: mseas-data2:/mnt/brick2
Options Reconfigured:
nfs.exports-auth-enable: on
diagnostics.brick-sys-log-level: WARNING
performance.readdir-ahead: on
nfs.disable: on
nfs.export-volumes: off


On 06/12/2017 05:01 PM, Ben Turner wrote:

What is the output of gluster v info?  That will tell us more about your
config.

-b

- Original Message -

From: "Pat Haley"<pha...@mit.edu>
To: "Ben Turner"<btur...@redhat.com>
Sent: Monday, June 12, 2017 4:54:00 PM
Subject: Re: [Gluster-users] Slow write times to gluste

Re: [Gluster-users] Slow write times to gluster disk

2017-06-20 Thread Pat Haley


Hi Ben,

Sorry this took so long, but we had a real-time forecasting exercise 
last week and I could only get to this now.


Backend Hardware/OS:

 * Much of the information on our back end system is included at the
   top of
   http://lists.gluster.org/pipermail/gluster-users/2017-April/030529.html
 * The specific model of the hard disks is SeaGate ENTERPRISE CAPACITY
   V.4 6TB (ST6000NM0024). The rated speed is 6Gb/s.
 * Note: there is one physical server that hosts both the NFS and the
   GlusterFS areas

Latest tests

I have had time to run the tests for one of the dd tests you requested 
to the underlying XFS FS.  The median rate was 170 MB/s. The dd results 
and iostat record are in


http://mseas.mit.edu/download/phaley/GlusterUsers/TestXFS/

I'll add tests for the other brick and to the NFS area later.

Thanks

Pat


On 06/12/2017 06:06 PM, Ben Turner wrote:

Ok you are correct, you have a pure distributed volume.  IE no replication 
overhead.  So normally for pure dist I use:

throughput = slowest of disks / NIC * .6-.7

In your case we have:

1200 * .6 = 720

So you are seeing a little less throughput than I would expect in your 
configuration.  What I like to do here is:

-First tell me more about your back end storage, will it sustain 1200 MB / sec? 
 What kind of HW?  How many disks?  What type and specs are the disks?  What 
kind of RAID are you using?

-Second can you refresh me on your workload?  Are you doing reads / writes or 
both?  If both what mix?  Since we are using DD I assume you are working with 
large file sequential I/O, is this correct?

-Run some DD tests on the back end XFS FS.  I normally have 
/xfs-mount/gluster-brick, if you have something similar just mkdir on the XFS 
-> /xfs-mount/my-test-dir.  Inside the test dir run:

If you are focusing on a write workload run:

# dd if=/dev/zero of=/xfs-mount/file bs=1024k count=1 conv=fdatasync

If you are focusing on a read workload run:

# echo 3 > /proc/sys/vm/drop_caches
# dd if=/gluster-mount/file of=/dev/null bs=1024k count=1

** MAKE SURE TO DROP CACHE IN BETWEEN READS!! **

Run this in a loop similar to how you did in:

http://mseas.mit.edu/download/phaley/GlusterUsers/TestVol/dd_testvol_gluster.txt

Run this on both servers one at a time and if you are running on a SAN then run 
again on both at the same time.  While this is running gather iostat for me:

# iostat -c -m -x 1 > iostat-$(hostname).txt

Let's see how the back end performs on both servers while capturing iostat, then 
see how the same workload / data looks on gluster.

-Last thing, when you run your kernel NFS tests are you using the same 
filesystem / storage you are using for the gluster bricks?  I want to be sure 
we have an apples to apples comparison here.

-b



- Original Message -

From: "Pat Haley" <pha...@mit.edu>
To: "Ben Turner" <btur...@redhat.com>
Sent: Monday, June 12, 2017 5:18:07 PM
Subject: Re: [Gluster-users] Slow write times to gluster disk


Hi Ben,

Here is the output:

[root@mseas-data2 ~]# gluster volume info

Volume Name: data-volume
Type: Distribute
Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: mseas-data2:/mnt/brick1
Brick2: mseas-data2:/mnt/brick2
Options Reconfigured:
nfs.exports-auth-enable: on
diagnostics.brick-sys-log-level: WARNING
performance.readdir-ahead: on
nfs.disable: on
nfs.export-volumes: off


On 06/12/2017 05:01 PM, Ben Turner wrote:

What is the output of gluster v info?  That will tell us more about your
config.

-b

- Original Message -

From: "Pat Haley" <pha...@mit.edu>
To: "Ben Turner" <btur...@redhat.com>
Sent: Monday, June 12, 2017 4:54:00 PM
Subject: Re: [Gluster-users] Slow write times to gluster disk


Hi Ben,

I guess I'm confused about what you mean by replication.  If I look at
the underlying bricks I only ever have a single copy of any file.  It
either resides on one brick or the other  (directories exist on both
bricks but not files).  We are not using gluster for redundancy (or at
least that wasn't our intent).   Is that what you meant by replication
or is it something else?

Thanks

Pat

On 06/12/2017 04:28 PM, Ben Turner wrote:

- Original Message -

From: "Pat Haley" <pha...@mit.edu>
To: "Ben Turner" <btur...@redhat.com>, "Pranith Kumar Karampuri"
<pkara...@redhat.com>
Cc: "Ravishankar N" <ravishan...@redhat.com>, gluster-users@gluster.org,
"Steve Postma" <spos...@ztechnet.com>
Sent: Monday, June 12, 2017 2:35:41 PM
Subject: Re: [Gluster-users] Slow write times to gluster disk


Hi Guys,

I was wondering what our next steps should be to solve the slow write
times.

Recently I was debugging a large code and writing a lot of output at
every time step.  When I tried writing to our gluster disks, it was
taking over a day to do a single tim

Re: [Gluster-users] Slow write times to gluster disk

2017-06-12 Thread Ben Turner
- Original Message -
> From: "Pat Haley" <pha...@mit.edu>
> To: "Ben Turner" <btur...@redhat.com>, "Pranith Kumar Karampuri" 
> <pkara...@redhat.com>
> Cc: "Ravishankar N" <ravishan...@redhat.com>, gluster-users@gluster.org, 
> "Steve Postma" <spos...@ztechnet.com>
> Sent: Monday, June 12, 2017 2:35:41 PM
> Subject: Re: [Gluster-users] Slow write times to gluster disk
> 
> 
> Hi Guys,
> 
> I was wondering what our next steps should be to solve the slow write times.
> 
> Recently I was debugging a large code and writing a lot of output at
> every time step.  When I tried writing to our gluster disks, it was
> taking over a day to do a single time step whereas if I had the same
> program (same hardware, network) write to our nfs disk the time per
> time-step was about 45 minutes. What we are shooting for here would be
> to have similar times with either gluster or nfs.

I can see in your test:

http://mseas.mit.edu/download/phaley/GlusterUsers/TestVol/dd_testvol_gluster.txt

You averaged ~600 MB/sec (expected for replica 2 with 10G: {~1200 MB/sec} / 
#replicas{2} = 600).  Gluster does client-side replication, so with replica 2 
you will only ever see 1/2 the speed of the slowest part of your stack (NW, 
disk, RAM, CPU).  This is usually NW or disk, and 600 is normally a best case.  
Now, in your output I do see instances where you went down to 200 MB/sec.  
I can only explain this in three ways:

1.  You are not using conv=fdatasync, so writes are actually going to page 
cache and then being flushed to disk.  During the fsync the memory is not yet 
available and the disks are busy flushing dirty pages.
2.  Your storage RAID group is shared across multiple LUNs (like in a SAN) and 
when write times are slow the RAID group is busy servicing other LUNs.
3.  Gluster bug / config issue / some other unknown unknown.

So I see 2 issues here:

1.  NFS does in 45 minutes what gluster can do in 24 hours.
2.  Sometimes your throughput drops dramatically.

WRT #1 - have a look at my estimates above.  My formula for guesstimating 
gluster perf is: throughput = NIC throughput or storage (whichever is slower) / 
# replicas * overhead (figure .7 or .8).  Also, the larger the record size the 
better for glusterfs mounts; I normally like to be at LEAST 64k, up to 1024k:

# dd if=/dev/zero of=/gluster-mount/file bs=1024k count=1 conv=fdatasync
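That guesstimate formula fits in a tiny helper (a sketch with illustrative numbers only; the 0.75 overhead below is just the midpoint of Ben's .7-.8 rule of thumb, not a measured value):

```shell
# Rough gluster client write-throughput estimate (MB/s):
# take the slower of NIC and storage, divide by the replica count
# (the client writes every copy itself), apply an overhead factor.
estimate() {  # args: nic_mb_s storage_mb_s replicas overhead
    awk -v n="$1" -v s="$2" -v r="$3" -v o="$4" \
        'BEGIN { m = (n < s ? n : s); printf "%.1f\n", m / r * o }'
}

# 10GbE front end (~1200 MB/s), storage faster than the NIC, replica 2:
estimate 1200 1400 2 0.75
```

With these inputs the helper prints 450.0, in the same ballpark as the ~600 MB/s figure above once overhead is applied.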

WRT #2 - Again, I question your testing and your storage config.  Try using 
conv=fdatasync for your DDs, use a larger record size, and make sure that your 
back end storage is not causing your slowdowns.  Also remember that with 
replica 2 you will take ~50% hit on writes because the client uses 50% of its 
bandwidth to write to one replica and 50% to the other.

-b



> 
> Thanks
> 
> Pat
> 
> 
> On 06/02/2017 01:07 AM, Ben Turner wrote:
> > Are you sure using conv=sync is what you want?  I normally use
> > conv=fdatasync, I'll look up the difference between the two and see if it
> > affects your test.
> >
> >
> > -b
> >

Re: [Gluster-users] Slow write times to gluster disk

2017-06-12 Thread Pat Haley


Hi Guys,

I was wondering what our next steps should be to solve the slow write times.

Recently I was debugging a large code and writing a lot of output at 
every time step.  When I tried writing to our gluster disks, it was 
taking over a day to do a single time step whereas if I had the same 
program (same hardware, network) write to our nfs disk the time per 
time-step was about 45 minutes. What we are shooting for here would be 
to have similar times with either gluster or nfs.


Thanks

Pat


On 06/02/2017 01:07 AM, Ben Turner wrote:

Are you sure using conv=sync is what you want?  I normally use conv=fdatasync, 
I'll look up the difference between the two and see if it affects your test.


-b
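For reference, the practical difference between the two flags can be seen with a tiny local run (small sizes here so it finishes instantly; a real benchmark would use something like bs=1024k count=4096): conv=sync only pads short input blocks to the block size and implies no flushing, while conv=fdatasync makes dd call fdatasync(2) before reporting its timing, so the flush is included in the measured rate.

```shell
# conv=sync: pads each input block to ibs with NULs; data may still
# sit in page cache when dd prints its rate.
dd if=/dev/zero of=/tmp/dd_sync.bin bs=1M count=4 conv=sync 2>/dev/null

# conv=fdatasync: physically syncs file data before dd reports timing,
# so page-cache effects don't inflate the measured throughput.
dd if=/dev/zero of=/tmp/dd_fdatasync.bin bs=1M count=4 conv=fdatasync 2>/dev/null

ls -l /tmp/dd_sync.bin /tmp/dd_fdatasync.bin
```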

- Original Message -

From: "Pat Haley" <pha...@mit.edu>
To: "Pranith Kumar Karampuri" <pkara...@redhat.com>
Cc: "Ravishankar N" <ravishan...@redhat.com>, gluster-users@gluster.org, "Steve Postma" 
<spos...@ztechnet.com>, "Ben
Turner" <btur...@redhat.com>
Sent: Tuesday, May 30, 2017 9:40:34 PM
Subject: Re: [Gluster-users] Slow write times to gluster disk


Hi Pranith,

The "dd" command was:

  dd if=/dev/zero count=4096 bs=1048576 of=zeros.txt conv=sync

There were 2 instances where dd reported 22 seconds. The output from the
dd tests are in

http://mseas.mit.edu/download/phaley/GlusterUsers/TestVol/dd_testvol_gluster.txt

Pat

On 05/30/2017 09:27 PM, Pranith Kumar Karampuri wrote:

Pat,
What is the command you used? As per the following output, it
seems like at least one write operation took 16 seconds. Which is
really bad.
   96.39    1165.10 us    89.00 us    *16487014.00 us*    393212    WRITE


On Tue, May 30, 2017 at 10:36 PM, Pat Haley <pha...@mit.edu
<mailto:pha...@mit.edu>> wrote:


 Hi Pranith,

 I ran the same 'dd' test both in the gluster test volume and in
 the .glusterfs directory of each brick.  The median results (12 dd
 trials in each test) are similar to before

   * gluster test volume: 586.5 MB/s
   * bricks (in .glusterfs): 1.4 GB/s

 The profile for the gluster test-volume is in

 
http://mseas.mit.edu/download/phaley/GlusterUsers/TestVol/profile_testvol_gluster.txt
 
<http://mseas.mit.edu/download/phaley/GlusterUsers/TestVol/profile_testvol_gluster.txt>

 Thanks

 Pat




 On 05/30/2017 12:10 PM, Pranith Kumar Karampuri wrote:

 Let's start with the same 'dd' test we were testing with to see,
 what the numbers are. Please provide profile numbers for the
 same. From there on we will start tuning the volume to see what
 we can do.

 On Tue, May 30, 2017 at 9:16 PM, Pat Haley <pha...@mit.edu
 <mailto:pha...@mit.edu>> wrote:


 Hi Pranith,

 Thanks for the tip.  We now have the gluster volume mounted
 under /home.  What tests do you recommend we run?

 Thanks

 Pat



 On 05/17/2017 05:01 AM, Pranith Kumar Karampuri wrote:


 On Tue, May 16, 2017 at 9:20 PM, Pat Haley <pha...@mit.edu
 <mailto:pha...@mit.edu>> wrote:


 Hi Pranith,

 Sorry for the delay.  I never received your reply
 (but I did receive Ben Turner's follow-up to your
 reply).  So we tried to create a gluster volume under
 /home using different variations of

 gluster volume create test-volume
 mseas-data2:/home/gbrick_test_1
 mseas-data2:/home/gbrick_test_2 transport tcp

 However we keep getting errors of the form

 Wrong brick type: transport, use
 :

 Any thoughts on what we're doing wrong?


 You should give transport tcp at the beginning, I think.
 Anyway, transport tcp is the default, so there is no need to
 specify it; just remove those two words from the CLI.
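If that advice is right, the corrected invocation would look like this (an untested sketch against the hostnames from the thread; run one form or the other, since in the gluster CLI option clauses such as "transport" precede the brick list):

```shell
# Option clauses come before the bricks, not after them:
gluster volume create test-volume transport tcp \
    mseas-data2:/home/gbrick_test_1 \
    mseas-data2:/home/gbrick_test_2

# Or, equivalently, omit the clause: tcp is the default transport.
gluster volume create test-volume \
    mseas-data2:/home/gbrick_test_1 \
    mseas-data2:/home/gbrick_test_2
```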


 Also, do you have a list of the tests we should be running
 once we get this volume created?  Given the time-zone
 difference it might help if we can run a small battery
 of tests and post the results rather than test-post-new
 test-post... .


 This is the first time I am doing performance analysis on
 users as far as I remember. In our team there are separate
 engineers who do these tests. Ben who replied earlier is one
 such engineer.

 Ben,
 Have any suggestions?


 Thanks

 Pat



 On 05/11/2017 12:06 PM, Pranith Kumar Karampuri wrote:


 On Thu, May 11, 2017 at 9:32 PM, Pat Haley
 <pha...@mit.edu <mailto:pha...@mit.edu>> wrote:


 Hi Pranith,

 The /home partition is mounted as ext4
 /home ext4 defaults,usrquota,grpquota   1 2

 The brick partitions are mounte

Re: [Gluster-users] Slow write times to gluster disk

2017-06-01 Thread Ben Turner
Are you sure using conv=sync is what you want?  I normally use conv=fdatasync, 
I'll look up the difference between the two and see if it affects your test.


-b


Re: [Gluster-users] Slow write times to gluster disk

2017-05-31 Thread Pat Haley


Hi Soumya,

What pattern should we be trying to view with the tcpdump? Is a 
one-minute capture of a copy operation sufficient, or are you looking 
for something else?


Pat


On 05/31/2017 06:56 AM, Soumya Koduri wrote:



On 05/31/2017 07:24 AM, Pranith Kumar Karampuri wrote:

Thanks this is good information.

+Soumya

Soumya,
   We are trying to find why kNFS is performing way better than
plain distribute glusterfs+fuse. What information do you think will
benefit us to compare the operations with kNFS vs gluster+fuse? We
already have profile output from fuse.

Could be because all operations done by kNFS are local to the system. 
The operations done by FUSE mount over network could be more in number 
and time-consuming than the ones sent by NFS-client. We could compare 
and examine the pattern from tcpdump taken over fuse-mount and 
NFS-mount. Also nfsstat [1] may give some clue.


Sorry I hadn't followed this mail from the beginning. But is this 
comparison between single brick volume and kNFS exporting that brick? 
Otherwise it's not a fair comparison if the volume is replicated or 
distributed.


Thanks,
Soumya

[1] https://linux.die.net/man/8/nfsstat
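A minimal capture recipe along the lines Soumya describes might look like this (a hedged sketch: the interface name, test file, and mount points are placeholders; both commands assume root, and the port filter assumes gluster's default ports of 24007 for management and 49152+ for bricks):

```shell
# FUSE mount side: capture the on-the-wire conversation during a copy.
tcpdump -i eth0 -w /tmp/fuse-copy.pcap 'port 24007 or portrange 49152-49251' &
CAP_PID=$!
time cp /path/to/testfile /gluster-mount/
kill "$CAP_PID"

# kNFS mount side: per-op client counter deltas around the same copy
# show how many GETATTR/WRITE/COMMIT calls the copy cost.
nfsstat -c > /tmp/nfs-before.txt
time cp /path/to/testfile /nfs-mount/
nfsstat -c > /tmp/nfs-after.txt
diff /tmp/nfs-before.txt /tmp/nfs-after.txt
```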





Re: [Gluster-users] Slow write times to gluster disk

2017-05-31 Thread Soumya Koduri



On 05/31/2017 07:24 AM, Pranith Kumar Karampuri wrote:

Thanks this is good information.

+Soumya

Soumya,
   We are trying to find why kNFS is performing way better than
plain distribute glusterfs+fuse. What information do you think will
benefit us to compare the operations with kNFS vs gluster+fuse? We
already have profile output from fuse.

Could be because all operations done by kNFS are local to the system. 
The operations done by FUSE mount over network could be more in number 
and time-consuming than the ones sent by NFS-client. We could compare 
and examine the pattern from tcpdump taken over fuse-mount and NFS-mount. 
Also nfsstat [1] may give some clue.


Sorry I hadn't followed this mail from the beginning. But is this 
comparison between single brick volume and kNFS exporting that brick? 
Otherwise it's not a fair comparison if the volume is replicated or 
distributed.


Thanks,
Soumya

[1] https://linux.die.net/man/8/nfsstat




Re: [Gluster-users] Slow write times to gluster disk

2017-05-30 Thread Pranith Kumar Karampuri
Thanks this is good information.

+Soumya

Soumya,
   We are trying to find why kNFS is performing way better than plain
distribute glusterfs+fuse. What information do you think will benefit us to
compare the operations with kNFS vs gluster+fuse? We already have profile
output from fuse.


 On Thu, May 11, 2017 at 9:32 PM, Pat Haley  wrote:

>
> Hi Pranith,
>
> The /home partition is mounted as ext4
> /home  ext4  defaults,usrquota,grpquota  1 2
>
> The brick partitions are mounted as xfs
> /mnt/brick1  xfs  defaults  0 0
> /mnt/brick2  xfs  defaults  0 0
>
> Will this cause a problem with creating a volume under /home?
>

 I don't think the bottleneck is disk. You can do the same tests you did
 on your new volume to confirm?


>
> Pat
>
>
>
> On 05/11/2017 11:32 AM, Pranith Kumar Karampuri wrote:
>
>
>
> On Thu, May 11, 2017 at 8:57 PM, Pat Haley  wrote:
>
>>
>> Hi Pranith,
>>
>> Unfortunately, we don't have similar hardware for a small scale
>> test.  All we have is our production hardware.
>>
>
> You said something about /home partition which has lesser disks, we
> can create plain distribute volume inside one of those directories. After
> we are done, we can remove the setup. What do you say?
>
>
>>
>> Pat
>>
>>
>>
>>
>> On 05/11/2017 07:05 AM, Pranith Kumar Karampuri wrote:
>>
>>
>>
>> On Thu, May 11, 2017 at 2:48 AM, Pat Haley  wrote:
>>
>>>
>>> Hi Pranith,
>>>
>>> Since we are mounting the partitions as the 

Re: [Gluster-users] Slow write times to gluster disk

2017-05-30 Thread Pat Haley


Hi Pranith,

The "dd" command was:

dd if=/dev/zero count=4096 bs=1048576 of=zeros.txt conv=sync

There were 2 instances where dd reported 22 seconds. The output from the 
dd tests are in


http://mseas.mit.edu/download/phaley/GlusterUsers/TestVol/dd_testvol_gluster.txt

Pat


Re: [Gluster-users] Slow write times to gluster disk

2017-05-30 Thread Pranith Kumar Karampuri
Pat,
   What is the command you used? As per the following output, it seems
like at least one write operation took 16 seconds. Which is really bad.

 96.39    1165.10 us    89.00 us    *16487014.00 us*    393212    WRITE
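For reference, the columns in a `gluster volume profile <vol> info` data row are %-latency, average latency, minimum latency, maximum latency (all in microseconds), call count, and fop name. A small field extraction (a sketch against the row quoted above, not any official gluster tooling) makes the 16.5-second worst-case WRITE explicit:

```shell
line='96.39  1165.10 us  89.00 us  16487014.00 us  393212  WRITE'
# Strip the *...* emphasis markers, then field 6 is the max latency in us.
max_us=$(printf '%s\n' "$line" | tr -d '*' | awk '{ print $6 }')
awk -v m="$max_us" 'BEGIN { printf "worst WRITE: %.2f s\n", m / 1e6 }'
```

This prints "worst WRITE: 16.49 s", i.e. the max-latency column already tells us a single write hung for over 16 seconds.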



> On 05/11/2017 07:05 AM, Pranith Kumar Karampuri wrote:
>
>
>
> On Thu, May 11, 2017 at 2:48 AM, Pat Haley  wrote:
>
>>
>> Hi Pranith,
>>
>> Since we are mounting the partitions as the bricks, I tried the dd
>> test writing to /.glusterfs/.
>> The results without oflag=sync were 1.6 Gb/s (faster than gluster but not
>> as fast as I was expecting given the 1.2 Gb/s to the no-gluster area w/
>> fewer disks).
>>
>
> Okay, then 1.6Gb/s is what we need to target for, considering your
> volume is just distribute. Is there any way you can do tests on similar
> hardware but at a small scale? Just so we can run the workload to learn
> more about the bottlenecks in the system? We can probably try to get the
> speed to 1.2Gb/s on your /home partition you were telling me yesterday. 
> Let
> me know if that is something you are okay to do.
>
>
>>
>> Pat
>>
>>
>>
>> On 05/10/2017 01:27 PM, Pranith Kumar Karampuri wrote:
>>
>>
>>
>> On Wed, May 10, 2017 at 10:15 PM, Pat Haley 

Re: [Gluster-users] Slow write times to gluster disk

2017-05-30 Thread Pat Haley


Hi Pranith,

I ran the same 'dd' test both in the gluster test volume and in the 
.glusterfs directory of each brick.  The median results (12 dd trials in 
each test) are similar to before


 * gluster test volume: 586.5 MB/s
 * bricks (in .glusterfs): 1.4 GB/s
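For reference, the dd runs discussed in this thread can be reproduced with something like the sketch below; TARGET is a placeholder (defaulting to a temp file here), and the real measurements point it at the fuse mount or a brick's .glusterfs directory. dd reports the throughput on the summary line it prints at the end.

```shell
#!/bin/sh
# TARGET is a placeholder: use the gluster mount (e.g. /gdata/zero1) or a
# path inside a brick's .glusterfs directory for the brick-only numbers.
TARGET=${TARGET:-/tmp/ddtest.bin}

# Buffered write, as in the "without oflag=sync" runs
dd if=/dev/zero of="$TARGET" bs=1M count=32 2>&1 | tail -n 1

# Synchronous write, as in the "with oflag=sync" runs
dd if=/dev/zero of="$TARGET" bs=1M count=32 oflag=sync 2>&1 | tail -n 1

BYTES=$(wc -c < "$TARGET")
rm -f "$TARGET"
echo "each run wrote $BYTES bytes"
```

Running several trials and taking the median, as done above, smooths out page-cache effects between the buffered runs.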

The profile for the gluster test-volume is in

http://mseas.mit.edu/download/phaley/GlusterUsers/TestVol/profile_testvol_gluster.txt

Thanks

Pat



On 05/30/2017 12:10 PM, Pranith Kumar Karampuri wrote:
Let's start with the same 'dd' test we were testing with to see, what 
the numbers are. Please provide profile numbers for the same. From 
there on we will start tuning the volume to see what we can do.


On Tue, May 30, 2017 at 9:16 PM, Pat Haley > wrote:



Hi Pranith,

Thanks for the tip.  We now have the gluster volume mounted under
/home.  What tests do you recommend we run?

Thanks

Pat



On 05/17/2017 05:01 AM, Pranith Kumar Karampuri wrote:



On Tue, May 16, 2017 at 9:20 PM, Pat Haley > wrote:


Hi Pranith,

Sorry for the delay.  I never received your reply (but I
did receive Ben Turner's follow-up to your reply).  So we
tried to create a gluster volume under /home using different
variations of

gluster volume create test-volume
mseas-data2:/home/gbrick_test_1
mseas-data2:/home/gbrick_test_2 transport tcp

However we keep getting errors of the form

Wrong brick type: transport, use :

Any thoughts on what we're doing wrong?


You should give "transport tcp" at the beginning, I think. In any
case, tcp is the default transport, so there is no need to specify
it; just remove those two words from the CLI.


Also do you have a list of the test we should be running once
we get this volume created?  Given the time-zone difference
it might help if we can run a small battery of tests and post
the results rather than test-post-new test-post... .


This is the first time I am doing performance analysis on users
as far as I remember. In our team there are separate engineers
who do these tests. Ben who replied earlier is one such engineer.

Ben,
Have any suggestions?


Thanks

Pat



On 05/11/2017 12:06 PM, Pranith Kumar Karampuri wrote:



On Thu, May 11, 2017 at 9:32 PM, Pat Haley > wrote:


Hi Pranith,

The /home partition is mounted as ext4
/home ext4 defaults,usrquota,grpquota   1 2

The brick partitions are mounted as xfs
/mnt/brick1  xfs defaults0 0
/mnt/brick2  xfs defaults0 0

Will this cause a problem with creating a volume under
/home?


I don't think the bottleneck is disk. You can do the same
tests you did on your new volume to confirm?


Pat



On 05/11/2017 11:32 AM, Pranith Kumar Karampuri wrote:



On Thu, May 11, 2017 at 8:57 PM, Pat Haley
> wrote:


Hi Pranith,

Unfortunately, we don't have similar hardware for a
small scale test.  All we have is our production
hardware.


You said something about /home partition which has
fewer disks, we can create plain distribute volume
inside one of those directories. After we are done, we
can remove the setup. What do you say?


Pat




On 05/11/2017 07:05 AM, Pranith Kumar Karampuri wrote:



On Thu, May 11, 2017 at 2:48 AM, Pat Haley
> wrote:


Hi Pranith,

Since we are mounting the partitions as the
bricks, I tried the dd test writing to
/.glusterfs/.
The results without oflag=sync were 1.6 Gb/s
(faster than gluster but not as fast as I was
expecting given the 1.2 Gb/s to the no-gluster
area w/ fewer disks).


Okay, then 1.6Gb/s is what we need to target for,
considering your volume is just distribute. Is
there any way you can do tests on similar hardware
but at a small scale? Just so we can run the
workload to learn more about the bottlenecks in
the system? We can probably try to get the speed
to 1.2Gb/s on your /home partition you were
telling me yesterday. Let me know if that is
something you are okay to do.


Pat



On 05/10/2017 01:27 PM, Pranith Kumar
Karampuri wrote:



   

Re: [Gluster-users] Slow write times to gluster disk

2017-05-30 Thread Pranith Kumar Karampuri
Let's start with the same 'dd' test we were testing with to see, what the
numbers are. Please provide profile numbers for the same. From there on we
will start tuning the volume to see what we can do.
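For completeness, the profile numbers are normally gathered with the `gluster volume profile` subcommands, roughly as below (volume name taken from this thread; this needs the live cluster, so it is shown for reference only):

```shell
# Enable per-brick FOP profiling on the volume
gluster volume profile test-volume start

# ... run the dd workload on the mount point ...

# Print cumulative latency / call-count statistics, then stop profiling
gluster volume profile test-volume info
gluster volume profile test-volume stop
```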

On Tue, May 30, 2017 at 9:16 PM, Pat Haley  wrote:

>
> Hi Pranith,
>
> Thanks for the tip.  We now have the gluster volume mounted under /home.
> What tests do you recommend we run?
>
> Thanks
>
> Pat
>
>
>
> On 05/17/2017 05:01 AM, Pranith Kumar Karampuri wrote:
>
>
>
> On Tue, May 16, 2017 at 9:20 PM, Pat Haley  wrote:
>
>>
>> Hi Pranith,
>>
>> Sorry for the delay.  I never received your reply (but I did receive
>> Ben Turner's follow-up to your reply).  So we tried to create a gluster
>> volume under /home using different variations of
>>
>> gluster volume create test-volume mseas-data2:/home/gbrick_test_1
>> mseas-data2:/home/gbrick_test_2 transport tcp
>>
>> However we keep getting errors of the form
>>
>> Wrong brick type: transport, use :
>>
>> Any thoughts on what we're doing wrong?
>>
>
> You should give "transport tcp" at the beginning, I think. In any case, tcp
> is the default transport, so there is no need to specify it; just remove
> those two words from the CLI.
>
>>
>> Also do you have a list of the test we should be running once we get this
>> volume created?  Given the time-zone difference it might help if we can run
>> a small battery of tests and post the results rather than test-post-new
>> test-post... .
>>
>
> This is the first time I am doing performance analysis on users as far as
> I remember. In our team there are separate engineers who do these tests.
> Ben who replied earlier is one such engineer.
>
> Ben,
> Have any suggestions?
>
>
>>
>> Thanks
>>
>> Pat
>>
>>
>>
>> On 05/11/2017 12:06 PM, Pranith Kumar Karampuri wrote:
>>
>>
>>
>> On Thu, May 11, 2017 at 9:32 PM, Pat Haley  wrote:
>>
>>>
>>> Hi Pranith,
>>>
>>> The /home partition is mounted as ext4
>>> /home  ext4  defaults,usrquota,grpquota  1 2
>>>
>>> The brick partitions are mounted as xfs
>>> /mnt/brick1  xfs defaults0 0
>>> /mnt/brick2  xfs defaults0 0
>>>
>>> Will this cause a problem with creating a volume under /home?
>>>
>>
>> I don't think the bottleneck is disk. You can do the same tests you did
>> on your new volume to confirm?
>>
>>
>>>
>>> Pat
>>>
>>>
>>>
>>> On 05/11/2017 11:32 AM, Pranith Kumar Karampuri wrote:
>>>
>>>
>>>
>>> On Thu, May 11, 2017 at 8:57 PM, Pat Haley  wrote:
>>>

 Hi Pranith,

 Unfortunately, we don't have similar hardware for a small scale test.
 All we have is our production hardware.

>>>
>>> You said something about /home partition which has fewer disks, we can
>>> create plain distribute volume inside one of those directories. After we
>>> are done, we can remove the setup. What do you say?
>>>
>>>

 Pat




 On 05/11/2017 07:05 AM, Pranith Kumar Karampuri wrote:



 On Thu, May 11, 2017 at 2:48 AM, Pat Haley  wrote:

>
> Hi Pranith,
>
> Since we are mounting the partitions as the bricks, I tried the dd
> test writing to /.glusterfs/.
> The results without oflag=sync were 1.6 Gb/s (faster than gluster but not
> as fast as I was expecting given the 1.2 Gb/s to the no-gluster area w/
> fewer disks).
>

 Okay, then 1.6Gb/s is what we need to target for, considering your
 volume is just distribute. Is there any way you can do tests on similar
 hardware but at a small scale? Just so we can run the workload to learn
 more about the bottlenecks in the system? We can probably try to get the
 speed to 1.2Gb/s on your /home partition you were telling me yesterday. Let
 me know if that is something you are okay to do.


>
> Pat
>
>
>
> On 05/10/2017 01:27 PM, Pranith Kumar Karampuri wrote:
>
>
>
> On Wed, May 10, 2017 at 10:15 PM, Pat Haley  wrote:
>
>>
>> Hi Pranith,
>>
>> Not entirely sure (this isn't my area of expertise).  I'll run your
>> answer by some other people who are more familiar with this.
>>
>> I am also uncertain about how to interpret the results when we also
>> add the dd tests writing to the /home area (no gluster, still on the same
>> machine)
>>
>>- dd test without oflag=sync (rough average of multiple tests)
>>   - gluster w/ fuse mount:  570 Mb/s
>>   - gluster w/ nfs mount:  390 Mb/s
>>   - nfs (no gluster):  1.2 Gb/s
>>- dd test with oflag=sync (rough average of multiple tests)
>>   - gluster w/ fuse mount:  5 Mb/s
>>   - gluster w/ nfs mount:  200 Mb/s
>>   - nfs (no gluster): 20 Mb/s
>>
>> Given that the non-gluster area is a RAID-6 of 4 disks while each
>> brick of the gluster area is a RAID-6 of 32 disks, I would naively expect
>> the writes to the 

Re: [Gluster-users] Slow write times to gluster disk

2017-05-30 Thread Pat Haley


Hi Pranith,

Thanks for the tip.  We now have the gluster volume mounted under 
/home.  What tests do you recommend we run?


Thanks

Pat


On 05/17/2017 05:01 AM, Pranith Kumar Karampuri wrote:



On Tue, May 16, 2017 at 9:20 PM, Pat Haley > wrote:



Hi Pranith,

Sorry for the delay.  I never received your reply (but I did
receive Ben Turner's follow-up to your reply).  So we tried to
create a gluster volume under /home using different variations of

gluster volume create test-volume mseas-data2:/home/gbrick_test_1
mseas-data2:/home/gbrick_test_2 transport tcp

However we keep getting errors of the form

Wrong brick type: transport, use :

Any thoughts on what we're doing wrong?


You should give "transport tcp" at the beginning, I think. In any 
case, tcp is the default transport, so there is no need to specify 
it; just remove those two words from the CLI.



Also do you have a list of the test we should be running once we
get this volume created?  Given the time-zone difference it might
help if we can run a small battery of tests and post the results
rather than test-post-new test-post... .


This is the first time I am doing performance analysis on users as far 
as I remember. In our team there are separate engineers who do these 
tests. Ben who replied earlier is one such engineer.


Ben,
Have any suggestions?


Thanks

Pat



On 05/11/2017 12:06 PM, Pranith Kumar Karampuri wrote:



On Thu, May 11, 2017 at 9:32 PM, Pat Haley > wrote:


Hi Pranith,

The /home partition is mounted as ext4
/home  ext4 defaults,usrquota,grpquota  1 2

The brick partitions are mounted as xfs
/mnt/brick1  xfs defaults0 0
/mnt/brick2  xfs defaults0 0

Will this cause a problem with creating a volume under /home?


I don't think the bottleneck is disk. You can do the same tests
you did on your new volume to confirm?


Pat



On 05/11/2017 11:32 AM, Pranith Kumar Karampuri wrote:



On Thu, May 11, 2017 at 8:57 PM, Pat Haley > wrote:


Hi Pranith,

Unfortunately, we don't have similar hardware for a
small scale test. All we have is our production hardware.


You said something about /home partition which has fewer
disks, we can create plain distribute volume inside one of
those directories. After we are done, we can remove the
setup. What do you say?


Pat




On 05/11/2017 07:05 AM, Pranith Kumar Karampuri wrote:



On Thu, May 11, 2017 at 2:48 AM, Pat Haley
> wrote:


Hi Pranith,

Since we are mounting the partitions as the bricks,
I tried the dd test writing to
/.glusterfs/.
The results without oflag=sync were 1.6 Gb/s
(faster than gluster but not as fast as I was
expecting given the 1.2 Gb/s to the no-gluster area
w/ fewer disks).


Okay, then 1.6Gb/s is what we need to target for,
considering your volume is just distribute. Is there
any way you can do tests on similar hardware but at a
small scale? Just so we can run the workload to learn
more about the bottlenecks in the system? We can
probably try to get the speed to 1.2Gb/s on your /home
partition you were telling me yesterday. Let me know if
that is something you are okay to do.


Pat



On 05/10/2017 01:27 PM, Pranith Kumar Karampuri wrote:



On Wed, May 10, 2017 at 10:15 PM, Pat Haley
> wrote:


Hi Pranith,

Not entirely sure (this isn't my area of
expertise). I'll run your answer by some other
people who are more familiar with this.

I am also uncertain about how to interpret the
results when we also add the dd tests writing
to the /home area (no gluster, still on the
same machine)

  * dd test without oflag=sync (rough average
of multiple tests)
  o gluster w/ fuse mount : 570 Mb/s
  o gluster w/ nfs mount: 390 Mb/s
  o nfs (no gluster):  1.2 Gb/s
  * dd test with oflag=sync (rough average of
multiple tests)
  o gluster w/ fuse mount:  5 Mb/s
  o gluster w/ nfs mount: 200 Mb/s
  o nfs (no gluster): 20 

Re: [Gluster-users] Slow write times to gluster disk

2017-05-17 Thread Pranith Kumar Karampuri
On Wed, May 17, 2017 at 9:54 PM, Joe Julian  wrote:

> On 05/17/17 02:02, Pranith Kumar Karampuri wrote:
>
> On Tue, May 16, 2017 at 9:38 PM, Joe Julian  wrote:
>
>> On 04/13/17 23:50, Pranith Kumar Karampuri wrote:
>>
>>
>>
>> On Sat, Apr 8, 2017 at 10:28 AM, Ravishankar N 
>> wrote:
>>
>>> Hi Pat,
>>>
>>> I'm assuming you are using gluster native (fuse mount). If it helps, you
>>> could try mounting it via gluster NFS (gnfs) and then see if there is an
>>> improvement in speed. Fuse mounts are slower than gnfs mounts but you get
>>> the benefit of avoiding a single point of failure. Unlike fuse mounts, if
>>> the gluster node containing the gnfs server goes down, all mounts done
>>> using that node will fail. For fuse mounts, you could try tweaking the
>>> write-behind xlator settings to see if it helps. See the
>>> performance.write-behind and performance.write-behind-window-size
>>> options in `gluster volume set help`. Of course, even for gnfs mounts, you
>>> can achieve fail-over by using CTDB.
>>>
>>
>> Ravi,
>>   Do you have any data that suggests fuse mounts are slower than gNFS
>> servers?
>>
>> Pat,
>>   I see that I am late to the thread, but do you happen to have
>> "profile info" of the workload?
>>
>>
>> I have done actual testing. For directory ops, NFS is faster due to the
>> default cache settings in the kernel. For raw throughput, or ops on an open
>> file, fuse is faster.
>>
>> I have yet to test this but I expect with the newer caching features in
>> 3.8+, even directory op performance should be similar to nfs and more
>> accurate.
>>
>
> We are actually comparing fuse+gluster and kernel NFS on the same brick.
> Did you get a chance to do this test at any point?
>
>
> No, that's not comparing like to like and I've rarely had a use case to
> which a single-store NFS was the answer.
>

Exactly. Why is it so bad compared to kNFS, and is there any scope for
improvement? That is the question we are trying to answer. If there is,
everyone wins :-)

PS: I may not respond till tomorrow. Will go to sleep now.


>
>
>
>
>>
>>
>> You can follow https://gluster.readthedocs.io
>> /en/latest/Administrator%20Guide/Monitoring%20Workload/ to get the
>> information.
>>
>>
>>>
>>> Thanks,
>>> Ravi
>>>
>>>
>>> On 04/08/2017 12:07 AM, Pat Haley wrote:
>>>
>>>
>>> Hi,
>>>
>>> We noticed a dramatic slowness when writing to a gluster disk when
>>> compared to writing to an NFS disk. Specifically when using dd (data
>>> duplicator) to write a 4.3 GB file of zeros:
>>>
>>>- on NFS disk (/home): 9.5 Gb/s
>>>- on gluster disk (/gdata): 508 Mb/s
>>>
>>> The gluster disk is 2 bricks joined together, no replication or anything
>>> else. The hardware is (literally) the same:
>>>
>>>- one server with 70 hard disks  and a hardware RAID card.
>>>- 4 disks in a RAID-6 group (the NFS disk)
>>>- 32 disks in a RAID-6 group (the max allowed by the card,
>>>/mnt/brick1)
>>>- 32 disks in another RAID-6 group (/mnt/brick2)
>>>- 2 hot spare
>>>
>>> Some additional information and more tests results (after changing the
>>> log level):
>>>
>>> glusterfs 3.7.11 built on Apr 27 2016 14:09:22
>>> CentOS release 6.8 (Final)
>>> RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS-3 3108
>>> [Invader] (rev 02)
>>>
>>>
>>>
>>> *Create the file to /gdata (gluster)*
>>> [root@mseas-data2 gdata]# dd if=/dev/zero of=/gdata/zero1 bs=1M
>>> count=1000
>>> 1000+0 records in
>>> 1000+0 records out
>>> 1048576000 bytes (1.0 GB) copied, 1.91876 s, *546 MB/s*
>>>
>>> *Create the file to /home (ext4)*
>>> [root@mseas-data2 gdata]# dd if=/dev/zero of=/home/zero1 bs=1M
>>> count=1000
>>> 1000+0 records in
>>> 1000+0 records out
>>> 1048576000 bytes (1.0 GB) copied, 0.686021 s, *1.5 GB/s - *3 times as
>>> fast
>>>
>>>
>>>
>>> * Copy from /gdata to /gdata (gluster to gluster) *[root@mseas-data2
>>> gdata]# dd if=/gdata/zero1 of=/gdata/zero2
>>> 2048000+0 records in
>>> 2048000+0 records out
>>> 1048576000 bytes (1.0 GB) copied, 101.052 s, *10.4 MB/s* - realllyyy
>>> slooowww
>>>
>>>
>>> *Copy from /gdata to /gdata* *2nd time (gluster to gluster)*
>>> [root@mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
>>> 2048000+0 records in
>>> 2048000+0 records out
>>> 1048576000 bytes (1.0 GB) copied, 92.4904 s, *11.3 MB/s* - realllyyy
>>> slooowww again
>>>
>>>
>>>
>>> *Copy from /home to /home (ext4 to ext4)*
>>> [root@mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero2
>>> 2048000+0 records in
>>> 2048000+0 records out
>>> 1048576000 bytes (1.0 GB) copied, 3.53263 s, *297 MB/s *30 times as fast
>>>
>>>
>>> *Copy from /home to /home (ext4 to ext4)*
>>> [root@mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero3
>>> 2048000+0 records in
>>> 2048000+0 records out
>>> 1048576000 bytes (1.0 GB) copied, 4.1737 s, *251 MB/s* - 30 times as
>>> fast
>>>
>>>
>>> As a test, can we copy data directly to the xfs mountpoint (/mnt/brick1)
>>> and bypass 

Re: [Gluster-users] Slow write times to gluster disk

2017-05-17 Thread Joe Julian

On 05/17/17 02:02, Pranith Kumar Karampuri wrote:

On Tue, May 16, 2017 at 9:38 PM, Joe Julian > wrote:


On 04/13/17 23:50, Pranith Kumar Karampuri wrote:



On Sat, Apr 8, 2017 at 10:28 AM, Ravishankar N
> wrote:

Hi Pat,

I'm assuming you are using gluster native (fuse mount). If it
helps, you could try mounting it via gluster NFS (gnfs) and
then see if there is an improvement in speed. Fuse mounts are
slower than gnfs mounts but you get the benefit of avoiding a
single point of failure. Unlike fuse mounts, if the gluster
node containing the gnfs server goes down, all mounts done
using that node will fail. For fuse mounts, you could try
tweaking the write-behind xlator settings to see if it helps.
See the performance.write-behind and
performance.write-behind-window-size options in `gluster
volume set help`. Of course, even for gnfs mounts, you can
achieve fail-over by using CTDB.
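A hedged sketch of what Ravi describes above; hostnames, mount points, and option values are placeholders rather than recommendations, and the commands need the live cluster:

```shell
# Mount the same volume both ways to compare fuse vs gnfs (gnfs is NFSv3)
mount -t glusterfs server1:/data-volume /mnt/fuse
mount -t nfs -o vers=3 server1:/data-volume /mnt/gnfs

# The write-behind knobs mentioned above; see `gluster volume set help`
gluster volume set data-volume performance.write-behind on
gluster volume set data-volume performance.write-behind-window-size 4MB
```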


Ravi,
  Do you have any data that suggests fuse mounts are slower
than gNFS servers?

Pat,
  I see that I am late to the thread, but do you happen to
have "profile info" of the workload?



I have done actual testing. For directory ops, NFS is faster due
to the default cache settings in the kernel. For raw throughput,
or ops on an open file, fuse is faster.

I have yet to test this but I expect with the newer caching
features in 3.8+, even directory op performance should be similar
to nfs and more accurate.


We are actually comparing fuse+gluster and kernel NFS on the same 
brick. Did you get a chance to do this test at any point?


No, that's not comparing like to like and I've rarely had a use case to 
which a single-store NFS was the answer.






You can follow

https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Monitoring%20Workload/


to get the information.


Thanks,
Ravi


On 04/08/2017 12:07 AM, Pat Haley wrote:


Hi,

We noticed a dramatic slowness when writing to a gluster
disk when compared to writing to an NFS disk. Specifically
when using dd (data duplicator) to write a 4.3 GB file of zeros:

  * on NFS disk (/home): 9.5 Gb/s
  * on gluster disk (/gdata): 508 Mb/s

The gluster disk is 2 bricks joined together, no replication
or anything else. The hardware is (literally) the same:

  * one server with 70 hard disks  and a hardware RAID card.
  * 4 disks in a RAID-6 group (the NFS disk)
  * 32 disks in a RAID-6 group (the max allowed by the card,
/mnt/brick1)
  * 32 disks in another RAID-6 group (/mnt/brick2)
  * 2 hot spare

Some additional information and more tests results (after
changing the log level):

glusterfs 3.7.11 built on Apr 27 2016 14:09:22
CentOS release 6.8 (Final)
RAID bus controller: LSI Logic / Symbios Logic MegaRAID
SAS-3 3108 [Invader] (rev 02)



*Create the file to /gdata (gluster)*
[root@mseas-data2 gdata]# dd if=/dev/zero of=/gdata/zero1
bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 1.91876 s, *546 MB/s*

*Create the file to /home (ext4)*
[root@mseas-data2 gdata]# dd if=/dev/zero of=/home/zero1
bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 0.686021 s, *1.5 GB/s - *3
times as fast*


Copy from /gdata to /gdata (gluster to gluster)
*[root@mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
2048000+0 records in
2048000+0 records out
1048576000 bytes (1.0 GB) copied, 101.052 s, *10.4 MB/s* -
realllyyy slooowww


*Copy from /gdata to /gdata* *2nd time *(gluster to gluster)**
[root@mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
2048000+0 records in
2048000+0 records out
1048576000 bytes (1.0 GB) copied, 92.4904 s, *11.3 MB/s* -
realllyyy slooowww again



*Copy from /home to /home (ext4 to ext4)*
[root@mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero2
2048000+0 records in
2048000+0 records out
1048576000 bytes (1.0 GB) copied, 3.53263 s, *297 MB/s *30
times as fast


*Copy from /home to /home (ext4 to ext4)*
[root@mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero3
2048000+0 records in
2048000+0 records out
1048576000 bytes (1.0 GB) copied, 4.1737 s, *251 MB/s* - 30
times as fast


As a test, can we copy data directly to 

Re: [Gluster-users] Slow write times to gluster disk

2017-05-17 Thread Pranith Kumar Karampuri
On Tue, May 16, 2017 at 9:38 PM, Joe Julian  wrote:

> On 04/13/17 23:50, Pranith Kumar Karampuri wrote:
>
>
>
> On Sat, Apr 8, 2017 at 10:28 AM, Ravishankar N 
> wrote:
>
>> Hi Pat,
>>
>> I'm assuming you are using gluster native (fuse mount). If it helps, you
>> could try mounting it via gluster NFS (gnfs) and then see if there is an
>> improvement in speed. Fuse mounts are slower than gnfs mounts but you get
>> the benefit of avoiding a single point of failure. Unlike fuse mounts, if
>> the gluster node containing the gnfs server goes down, all mounts done
>> using that node will fail. For fuse mounts, you could try tweaking the
>> write-behind xlator settings to see if it helps. See the
>> performance.write-behind and performance.write-behind-window-size
>> options in `gluster volume set help`. Of course, even for gnfs mounts, you
>> can achieve fail-over by using CTDB.
>>
>
> Ravi,
>   Do you have any data that suggests fuse mounts are slower than gNFS
> servers?
>
> Pat,
>   I see that I am late to the thread, but do you happen to have
> "profile info" of the workload?
>
>
> I have done actual testing. For directory ops, NFS is faster due to the
> default cache settings in the kernel. For raw throughput, or ops on an open
> file, fuse is faster.
>
> I have yet to test this but I expect with the newer caching features in
> 3.8+, even directory op performance should be similar to nfs and more
> accurate.
>

We are actually comparing fuse+gluster and kernel NFS on the same brick.
Did you get a chance to do this test at any point?


>
>
> You can follow https://gluster.readthedocs.io/en/latest/Administrator%
> 20Guide/Monitoring%20Workload/ to get the information.
>
>
>>
>> Thanks,
>> Ravi
>>
>>
>> On 04/08/2017 12:07 AM, Pat Haley wrote:
>>
>>
>> Hi,
>>
>> We noticed a dramatic slowness when writing to a gluster disk when
>> compared to writing to an NFS disk. Specifically when using dd (data
>> duplicator) to write a 4.3 GB file of zeros:
>>
>>- on NFS disk (/home): 9.5 Gb/s
>>- on gluster disk (/gdata): 508 Mb/s
>>
>> The gluster disk is 2 bricks joined together, no replication or anything
>> else. The hardware is (literally) the same:
>>
>>- one server with 70 hard disks  and a hardware RAID card.
>>- 4 disks in a RAID-6 group (the NFS disk)
>>- 32 disks in a RAID-6 group (the max allowed by the card,
>>/mnt/brick1)
>>- 32 disks in another RAID-6 group (/mnt/brick2)
>>- 2 hot spare
>>
>> Some additional information and more tests results (after changing the
>> log level):
>>
>> glusterfs 3.7.11 built on Apr 27 2016 14:09:22
>> CentOS release 6.8 (Final)
>> RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS-3 3108
>> [Invader] (rev 02)
>>
>>
>>
>> *Create the file to /gdata (gluster)*
>> [root@mseas-data2 gdata]# dd if=/dev/zero of=/gdata/zero1 bs=1M
>> count=1000
>> 1000+0 records in
>> 1000+0 records out
>> 1048576000 bytes (1.0 GB) copied, 1.91876 s, *546 MB/s*
>>
>> *Create the file to /home (ext4)*
>> [root@mseas-data2 gdata]# dd if=/dev/zero of=/home/zero1 bs=1M count=1000
>> 1000+0 records in
>> 1000+0 records out
>> 1048576000 bytes (1.0 GB) copied, 0.686021 s, *1.5 GB/s - *3 times as
>> fast
>>
>>
>>
>> * Copy from /gdata to /gdata (gluster to gluster) *[root@mseas-data2
>> gdata]# dd if=/gdata/zero1 of=/gdata/zero2
>> 2048000+0 records in
>> 2048000+0 records out
>> 1048576000 bytes (1.0 GB) copied, 101.052 s, *10.4 MB/s* - realllyyy
>> slooowww
>>
>>
>> *Copy from /gdata to /gdata* *2nd time (gluster to gluster)*
>> [root@mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
>> 2048000+0 records in
>> 2048000+0 records out
>> 1048576000 bytes (1.0 GB) copied, 92.4904 s, *11.3 MB/s* - realllyyy
>> slooowww again
>>
>>
>>
>> *Copy from /home to /home (ext4 to ext4)*
>> [root@mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero2
>> 2048000+0 records in
>> 2048000+0 records out
>> 1048576000 bytes (1.0 GB) copied, 3.53263 s, *297 MB/s *30 times as fast
>>
>>
>> *Copy from /home to /home (ext4 to ext4)*
>> [root@mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero3
>> 2048000+0 records in
>> 2048000+0 records out
>> 1048576000 bytes (1.0 GB) copied, 4.1737 s, *251 MB/s* - 30 times as fast
>>
>>
>> As a test, can we copy data directly to the xfs mountpoint (/mnt/brick1)
>> and bypass gluster?
>>
>>
>> Any help you could give us would be appreciated.
>>
>> Thanks
>>
>> --
>>
>> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>> Pat Haley  Email:  pha...@mit.edu
>> Center for Ocean Engineering   Phone:  (617) 253-6824
>> Dept. of Mechanical EngineeringFax:(617) 253-8125
>> MIT, Room 5-213http://web.mit.edu/phaley/www/
>> 77 Massachusetts Avenue
>> Cambridge, MA  02139-4301
>>
>> ___
>> Gluster-users mailing 
>> 

Re: [Gluster-users] Slow write times to gluster disk

2017-05-17 Thread Pranith Kumar Karampuri
On Tue, May 16, 2017 at 9:20 PM, Pat Haley  wrote:

>
> Hi Pranith,
>
> Sorry for the delay.  I never received your reply (but I did receive
> Ben Turner's follow-up to your reply).  So we tried to create a gluster
> volume under /home using different variations of
>
> gluster volume create test-volume mseas-data2:/home/gbrick_test_1
> mseas-data2:/home/gbrick_test_2 transport tcp
>
> However we keep getting errors of the form
>
> Wrong brick type: transport, use :
>
> Any thoughts on what we're doing wrong?
>

You should give "transport tcp" at the beginning, I think. In any case, tcp
is the default transport, so there is no need to specify it; just remove
those two words from the CLI.
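In other words, either drop those two words or move them ahead of the brick list. A sketch with the brick paths from the failing command; this obviously needs the live cluster, so it is shown for reference only:

```shell
# Variant 1: omit "transport tcp" entirely (tcp is the default transport)
gluster volume create test-volume \
    mseas-data2:/home/gbrick_test_1 mseas-data2:/home/gbrick_test_2

# Variant 2: keep it explicit; the transport option precedes the bricks
gluster volume create test-volume transport tcp \
    mseas-data2:/home/gbrick_test_1 mseas-data2:/home/gbrick_test_2
```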

>
> Also do you have a list of the test we should be running once we get this
> volume created?  Given the time-zone difference it might help if we can run
> a small battery of tests and post the results rather than test-post-new
> test-post... .
>

This is the first time I am doing performance analysis on users as far as I
remember. In our team there are separate engineers who do these tests. Ben
who replied earlier is one such engineer.

Ben,
Have any suggestions?


>
> Thanks
>
> Pat
>
>
>
> On 05/11/2017 12:06 PM, Pranith Kumar Karampuri wrote:
>
>
>
> On Thu, May 11, 2017 at 9:32 PM, Pat Haley  wrote:
>
>>
>> Hi Pranith,
>>
>> The /home partition is mounted as ext4
>> /home  ext4  defaults,usrquota,grpquota  1 2
>>
>> The brick partitions are mounted as xfs
>> /mnt/brick1  xfs defaults0 0
>> /mnt/brick2  xfs defaults0 0
>>
>> Will this cause a problem with creating a volume under /home?
>>
>
> I don't think the bottleneck is disk. You can do the same tests you did on
> your new volume to confirm?
>
>
>>
>> Pat
>>
>>
>>
>> On 05/11/2017 11:32 AM, Pranith Kumar Karampuri wrote:
>>
>>
>>
>> On Thu, May 11, 2017 at 8:57 PM, Pat Haley  wrote:
>>
>>>
>>> Hi Pranith,
>>>
>>> Unfortunately, we don't have similar hardware for a small scale test.
>>> All we have is our production hardware.
>>>
>>
>> You said something about /home partition which has fewer disks, we can
>> create plain distribute volume inside one of those directories. After we
>> are done, we can remove the setup. What do you say?
>>
>>
>>>
>>> Pat
>>>
>>>
>>>
>>>
>>> On 05/11/2017 07:05 AM, Pranith Kumar Karampuri wrote:
>>>
>>>
>>>
>>> On Thu, May 11, 2017 at 2:48 AM, Pat Haley  wrote:
>>>

 Hi Pranith,

 Since we are mounting the partitions as the bricks, I tried the dd test
 writing to /.glusterfs/.
 The results without oflag=sync were 1.6 Gb/s (faster than gluster but not
 as fast as I was expecting given the 1.2 Gb/s to the no-gluster area w/
 fewer disks).

>>>
>>> Okay, then 1.6Gb/s is what we need to target for, considering your
>>> volume is just distribute. Is there any way you can do tests on similar
>>> hardware but at a small scale? Just so we can run the workload to learn
>>> more about the bottlenecks in the system? We can probably try to get the
>>> speed to 1.2Gb/s on your /home partition you were telling me yesterday. Let
>>> me know if that is something you are okay to do.
>>>
>>>

 Pat



 On 05/10/2017 01:27 PM, Pranith Kumar Karampuri wrote:



 On Wed, May 10, 2017 at 10:15 PM, Pat Haley  wrote:

>
> Hi Pranith,
>
> Not entirely sure (this isn't my area of expertise).  I'll run your
> answer by some other people who are more familiar with this.
>
> I am also uncertain about how to interpret the results when we also
> add the dd tests writing to the /home area (no gluster, still on the same
> machine)
>
>- dd test without oflag=sync (rough average of multiple tests)
>   - gluster w/ fuse mount:  570 Mb/s
>   - gluster w/ nfs mount:  390 Mb/s
>   - nfs (no gluster):  1.2 Gb/s
>- dd test with oflag=sync (rough average of multiple tests)
>   - gluster w/ fuse mount:  5 Mb/s
>   - gluster w/ nfs mount:  200 Mb/s
>   - nfs (no gluster): 20 Mb/s
>
> Given that the non-gluster area is a RAID-6 of 4 disks while each
> brick of the gluster area is a RAID-6 of 32 disks, I would naively expect
> the writes to the gluster area to be roughly 8x faster than to the
> non-gluster.
>

 I think a better test is to try and write to a file using nfs without
 any gluster to a location that is not inside the brick but someother
 location that is on same disk(s). If you are mounting the partition as the
 brick, then we can write to a file inside .glusterfs directory, something
 like /.glusterfs/.


> I still think we have a speed issue, I can't tell if fuse vs nfs is
> part of the problem.
>

 I got interested in the post because I read that fuse speed is lower
 than nfs speed, which is counter-intuitive 

Re: [Gluster-users] Slow write times to gluster disk

2017-05-16 Thread Joe Julian

On 04/13/17 23:50, Pranith Kumar Karampuri wrote:



On Sat, Apr 8, 2017 at 10:28 AM, Ravishankar N > wrote:


Hi Pat,

I'm assuming you are using gluster native (fuse mount). If it
helps, you could try mounting it via gluster NFS (gnfs) and then
see if there is an improvement in speed. Fuse mounts are slower
than gnfs mounts but you get the benefit of avoiding a single
point of failure. (Unlike fuse mounts, if the gluster node
containing the gnfs server goes down, all mounts done using that
node will fail.) For fuse mounts, you could try tweaking the
write-behind xlator settings to see if it helps. See the
performance.write-behind and performance.write-behind-window-size
options in `gluster volume set help`. Of course, even for gnfs
mounts, you can achieve fail-over by using CTDB.
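A sketch of that tweaking, using the volume name from this thread; the window size below is purely illustrative, not a recommendation, so check `gluster volume set help` for the exact semantics on your version:

```shell
# Show the documented write-behind options and their defaults
gluster volume set help | grep -A 3 write-behind

# Illustrative values only; measure before and after changing them
gluster volume set data-volume performance.write-behind on
gluster volume set data-volume performance.write-behind-window-size 4MB
```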


Ravi,
  Do you have any data that suggests fuse mounts are slower than 
gNFS servers?


Pat,
  I see that I am late to the thread, but do you happen to have 
"profile info" of the workload?




I have done actual testing. For directory ops, NFS is faster due to the 
default cache settings in the kernel. For raw throughput, or ops on an 
open file, fuse is faster.


I have yet to test this but I expect with the newer caching features in 
3.8+, even directory op performance should be similar to nfs and more 
accurate.


You can follow 
https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Monitoring%20Workload/ 
to get the information.
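In outline, the profiling steps from that guide look like this (volume name as used elsewhere in the thread):

```shell
gluster volume profile data-volume start
# ... run the workload (e.g. the dd tests) on a mount ...
gluster volume profile data-volume info   # per-fop latency and call counts
gluster volume profile data-volume stop
```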



Thanks,
Ravi


On 04/08/2017 12:07 AM, Pat Haley wrote:


Hi,

We noticed a dramatic slowness when writing to a gluster disk
when compared to writing to an NFS disk. Specifically when using
dd (data duplicator) to write a 4.3 GB file of zeros:

  * on NFS disk (/home): 9.5 Gb/s
  * on gluster disk (/gdata): 508 Mb/s

The gluster disk is 2 bricks joined together, no replication or
anything else. The hardware is (literally) the same:

  * one server with 70 hard disks and a hardware RAID card.
  * 4 disks in a RAID-6 group (the NFS disk)
  * 32 disks in a RAID-6 group (the max allowed by the card,
/mnt/brick1)
  * 32 disks in another RAID-6 group (/mnt/brick2)
  * 2 hot spares

Some additional information and more tests results (after
changing the log level):

glusterfs 3.7.11 built on Apr 27 2016 14:09:22
CentOS release 6.8 (Final)
RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS-3
3108 [Invader] (rev 02)



*Create the file to /gdata (gluster)*
[root@mseas-data2 gdata]# dd if=/dev/zero of=/gdata/zero1 bs=1M
count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 1.91876 s, *546 MB/s*

*Create the file to /home (ext4)*
[root@mseas-data2 gdata]# dd if=/dev/zero of=/home/zero1 bs=1M
count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 0.686021 s, *1.5 GB/s* - 3 times as fast


*Copy from /gdata to /gdata (gluster to gluster)*
[root@mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
2048000+0 records in
2048000+0 records out
1048576000 bytes (1.0 GB) copied, 101.052 s, *10.4 MB/s* -
realllyyy slooowww


*Copy from /gdata to /gdata, 2nd time (gluster to gluster)*
[root@mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
2048000+0 records in
2048000+0 records out
1048576000 bytes (1.0 GB) copied, 92.4904 s, *11.3 MB/s* -
realllyyy slooowww again



*Copy from /home to /home (ext4 to ext4)*
[root@mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero2
2048000+0 records in
2048000+0 records out
1048576000 bytes (1.0 GB) copied, 3.53263 s, *297 MB/s* - 30 times as fast


*Copy from /home to /home (ext4 to ext4)*
[root@mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero3
2048000+0 records in
2048000+0 records out
1048576000 bytes (1.0 GB) copied, 4.1737 s, *251 MB/s* - 30 times
as fast


As a test, can we copy data directly to the xfs mountpoint
(/mnt/brick1) and bypass gluster?
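A minimal sketch of that bypass test (brick path and filename are illustrative; writing under the brick's .glusterfs directory keeps the test file out of gluster's namespace):

```shell
BRICK=/mnt/brick1   # illustrative brick mountpoint
if [ -d "$BRICK/.glusterfs" ]; then
    # Measures only the local RAID group, with gluster out of the path
    dd if=/dev/zero of="$BRICK/.glusterfs/ddtest.tmp" bs=1M count=1000 conv=fsync
    rm -f "$BRICK/.glusterfs/ddtest.tmp"
else
    echo "brick mountpoint not present; skipping"
fi
```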


Any help you could give us would be appreciated.

Thanks

-- 


-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley  Email:pha...@mit.edu 

Center for Ocean Engineering   Phone:  (617) 253-6824
Dept. of Mechanical EngineeringFax:(617) 253-8125
MIT, Room 5-213http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA  02139-4301

___
Gluster-users mailing list
Gluster-users@gluster.org 
http://lists.gluster.org/mailman/listinfo/gluster-users



  

Re: [Gluster-users] Slow write times to gluster disk

2017-05-16 Thread Joe Julian

On 05/10/17 14:18, Pat Haley wrote:


Hi Pranith,

Since we are mounting the partitions as the bricks, I tried the dd 
test writing to 
/.glusterfs/. The results 
without oflag=sync were 1.6 Gb/s (faster than gluster but not as fast 
as I was expecting given the 1.2 Gb/s to the no-gluster area w/ fewer 
disks).


Pat



Is that true for every disk? If you're choosing the same filename every 
time for your dd test, you're likely only doing that test against one 
disk. If that disk is slow, you would get the same results every time 
despite other disks performing normally.




On 05/10/2017 01:27 PM, Pranith Kumar Karampuri wrote:



On Wed, May 10, 2017 at 10:15 PM, Pat Haley > wrote:



Hi Pranith,

Not entirely sure (this isn't my area of expertise). I'll run
your answer by some other people who are more familiar with this.

I am also uncertain about how to interpret the results when we
also add the dd tests writing to the /home area (no gluster,
still on the same machine)

  * dd test without oflag=sync (rough average of multiple tests)
  o gluster w/ fuse mount : 570 Mb/s
  o gluster w/ nfs mount:  390 Mb/s
  o nfs (no gluster):  1.2 Gb/s
  * dd test with oflag=sync (rough average of multiple tests)
  o gluster w/ fuse mount:  5 Mb/s
  o gluster w/ nfs mount:  200 Mb/s
  o nfs (no gluster): 20 Mb/s

Given that the non-gluster area is a RAID-6 of 4 disks while each
brick of the gluster area is a RAID-6 of 32 disks, I would
naively expect the writes to the gluster area to be roughly 8x
faster than to the non-gluster.


I think a better test is to try and write to a file using nfs without 
any gluster to a location that is not inside the brick but some other 
location that is on the same disk(s). If you are mounting the partition 
as the brick, then we can write to a file inside .glusterfs 
directory, something like 
/.glusterfs/.



I still think we have a speed issue, I can't tell if fuse vs nfs
is part of the problem.


I got interested in the post because I read that fuse speed is lower 
than nfs speed, which is counter-intuitive to my understanding. So 
wanted clarifications. Now that I got my clarifications where fuse 
outperformed nfs without sync, we can resume testing as described 
above and try to find what it is. Based on your email-id I am 
guessing you are from Boston and I am from Bangalore so if you are 
okay with doing this debugging for multiple days because of 
timezones, I will be happy to help. Please be a bit patient with me, 
I am under a release crunch but I am very curious with the problem 
you posted.


  Was there anything useful in the profiles?


Unfortunately profiles didn't help me much, I think we are collecting 
the profiles from an active volume, so it has a lot of information 
that is not pertaining to dd so it is difficult to find the 
contributions of dd. So I went through your post again and found 
something I didn't pay much attention to earlier i.e. oflag=sync, so 
did my own tests on my setup with FUSE so sent that reply.



Pat



On 05/10/2017 12:15 PM, Pranith Kumar Karampuri wrote:

Okay good. At least this validates my doubts. Handling O_SYNC in
gluster NFS and fuse is a bit different.
When application opens a file with O_SYNC on fuse mount then
each write syscall has to be written to disk as part of the
syscall where as in case of NFS, there is no concept of open.
NFS performs write though a handle saying it needs to be a
synchronous write, so write() syscall is performed first then it
performs fsync(). so an write on an fd with O_SYNC becomes
write+fsync. I am suspecting that when multiple threads do this
write+fsync() operation on the same file, multiple writes are
batched together to be written do disk so the throughput on the
disk is increasing is my guess.

Does it answer your doubts?
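The two flush styles can be compared locally with dd, independent of gluster (paths illustrative):

```shell
# O_SYNC on every write, as a fuse mount sees it with oflag=sync
dd if=/dev/zero of=/tmp/sync_demo.tmp bs=1M count=8 oflag=sync

# Plain writes followed by a single fsync at the end, closer to the
# NFS write+fsync pattern that allows batching before the flush
dd if=/dev/zero of=/tmp/sync_demo.tmp bs=1M count=8 conv=fsync

rm -f /tmp/sync_demo.tmp
```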

On Wed, May 10, 2017 at 9:35 PM, Pat Haley > wrote:


Without the oflag=sync and only a single test of each, the
FUSE is going faster than NFS:

FUSE:
mseas-data2(dri_nascar)% dd if=/dev/zero count=4096
bs=1048576 of=zeros.txt conv=sync
4096+0 records in
4096+0 records out
4294967296 bytes (4.3 GB) copied, 7.46961 s, 575 MB/s


NFS
mseas-data2(HYCOM)% dd if=/dev/zero count=4096 bs=1048576
of=zeros.txt conv=sync
4096+0 records in
4096+0 records out
4294967296 bytes (4.3 GB) copied, 11.4264 s, 376 MB/s



On 05/10/2017 11:53 AM, Pranith Kumar Karampuri wrote:

Could you let me know the speed without oflag=sync on both
the mounts? No need to collect profiles.

On Wed, May 10, 2017 at 9:17 PM, Pat Haley > wrote:


Here is what I see 

Re: [Gluster-users] Slow write times to gluster disk

2017-05-16 Thread Pat Haley


Hi Pranith,

Sorry for the delay.  I never received your reply (but I did receive 
Ben Turner's follow-up to your reply).  So we tried to create a gluster 
volume under /home using different variations of


gluster volume create test-volume mseas-data2:/home/gbrick_test_1 
mseas-data2:/home/gbrick_test_2 transport tcp


However we keep getting errors of the form

Wrong brick type: transport, use :

Any thoughts on what we're doing wrong?
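One guess, offered tentatively: the gluster CLI expects `transport` (like `replica`/`stripe`) before the brick list, so the ordering above may be the issue; an untested sketch:

```shell
# transport and other volume options precede the bricks
gluster volume create test-volume transport tcp \
    mseas-data2:/home/gbrick_test_1 \
    mseas-data2:/home/gbrick_test_2
```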

Also do you have a list of the test we should be running once we get 
this volume created?  Given the time-zone difference it might help if we 
can run a small battery of tests and post the results rather than 
test-post-new test-post... .


Thanks

Pat


On 05/11/2017 12:06 PM, Pranith Kumar Karampuri wrote:



On Thu, May 11, 2017 at 9:32 PM, Pat Haley > wrote:



Hi Pranith,

The /home partition is mounted as ext4
/home  ext4 defaults,usrquota,grpquota  1 2

The brick partitions are mounted as xfs
/mnt/brick1  xfs defaults0 0
/mnt/brick2  xfs defaults0 0

Will this cause a problem with creating a volume under /home?


I don't think the bottleneck is disk. You can do the same tests you 
did on your new volume to confirm?



Pat



On 05/11/2017 11:32 AM, Pranith Kumar Karampuri wrote:



On Thu, May 11, 2017 at 8:57 PM, Pat Haley > wrote:


Hi Pranith,

Unfortunately, we don't have similar hardware for a small
scale test.  All we have is our production hardware.


You said something about /home partition which has lesser disks,
we can create plain distribute volume inside one of those
directories. After we are done, we can remove the setup. What do
you say?


Pat




On 05/11/2017 07:05 AM, Pranith Kumar Karampuri wrote:



On Thu, May 11, 2017 at 2:48 AM, Pat Haley > wrote:


Hi Pranith,

Since we are mounting the partitions as the bricks, I
tried the dd test writing to
/.glusterfs/.
The results without oflag=sync were 1.6 Gb/s (faster
than gluster but not as fast as I was expecting given
the 1.2 Gb/s to the no-gluster area w/ fewer disks).


Okay, then 1.6Gb/s is what we need to target for,
considering your volume is just distribute. Is there any way
you can do tests on similar hardware but at a small scale?
Just so we can run the workload to learn more about the
bottlenecks in the system? We can probably try to get the
speed to 1.2Gb/s on your /home partition you were telling me
yesterday. Let me know if that is something you are okay to do.


Pat



On 05/10/2017 01:27 PM, Pranith Kumar Karampuri wrote:



On Wed, May 10, 2017 at 10:15 PM, Pat Haley
> wrote:


Hi Pranith,

Not entirely sure (this isn't my area of
expertise). I'll run your answer by some other
people who are more familiar with this.

I am also uncertain about how to interpret the
results when we also add the dd tests writing to
the /home area (no gluster, still on the same machine)

  * dd test without oflag=sync (rough average of
multiple tests)
  o gluster w/ fuse mount : 570 Mb/s
  o gluster w/ nfs mount: 390 Mb/s
  o nfs (no gluster):  1.2 Gb/s
  * dd test with oflag=sync (rough average of
multiple tests)
  o gluster w/ fuse mount:  5 Mb/s
  o gluster w/ nfs mount: 200 Mb/s
  o nfs (no gluster): 20 Mb/s

Given that the non-gluster area is a RAID-6 of 4
disks while each brick of the gluster area is a
RAID-6 of 32 disks, I would naively expect the
writes to the gluster area to be roughly 8x faster
than to the non-gluster.


I think a better test is to try and write to a file
using nfs without any gluster to a location that is not
inside the brick but some other location that is on the same
disk(s). If you are mounting the partition as the
brick, then we can write to a file inside .glusterfs
directory, something like
/.glusterfs/.


I still think we have a speed issue, I can't tell
if fuse vs nfs is part of the problem.


I got interested in the post because I read that fuse
speed is lesser than nfs speed which is
counter-intuitive to my understanding. So wanted

Re: [Gluster-users] Slow write times to gluster disk

2017-05-14 Thread Ben Turner
- Original Message -
> From: "Pranith Kumar Karampuri" <pkara...@redhat.com>
> To: "Pat Haley" <pha...@mit.edu>
> Cc: gluster-users@gluster.org, "Steve Postma" <spos...@ztechnet.com>
> Sent: Friday, May 12, 2017 11:17:11 PM
> Subject: Re: [Gluster-users] Slow write times to gluster disk
> 
> 
> 
> On Sat, May 13, 2017 at 8:44 AM, Pranith Kumar Karampuri <
> pkara...@redhat.com > wrote:
> 
> 
> 
> 
> 
> On Fri, May 12, 2017 at 8:04 PM, Pat Haley < pha...@mit.edu > wrote:
> 
> 
> 
> 
> Hi Pranith,
> 
> My question was about setting up a gluster volume on an ext4 partition. I
> thought we had the bricks mounted as xfs for compatibility with gluster?
> 
> Oh that should not be a problem. It works fine.
> 
> Just that xfs doesn't have limits for anything, whereas ext4 does for things
> like hardlinks etc. (at least the last time I checked :-) ). So it is better to
> have xfs.

One of the biggest reasons to use XFS IMHO is that most of the testing / large 
scale deployments(at least that I know of) / etc are done using XFS as a 
backend.  While EXT4 should work I don't think that it has the same level of 
testing as XFS.

-b 



> 
> 
> 
> 
> 
> 
> 
> Pat
> 
> 
> 
> On 05/11/2017 12:06 PM, Pranith Kumar Karampuri wrote:
> 
> 
> 
> 
> 
> On Thu, May 11, 2017 at 9:32 PM, Pat Haley < pha...@mit.edu > wrote:
> 
> 
> 
> 
> Hi Pranith,
> 
> The /home partition is mounted as ext4
> /home ext4 defaults,usrquota,grpquota 1 2
> 
> The brick partitions are mounted as xfs
> /mnt/brick1 xfs defaults 0 0
> /mnt/brick2 xfs defaults 0 0
> 
> Will this cause a problem with creating a volume under /home?
> 
> I don't think the bottleneck is disk. You can do the same tests you did on
> your new volume to confirm?
> 
> 
> 
> 
> Pat
> 
> 
> 
> On 05/11/2017 11:32 AM, Pranith Kumar Karampuri wrote:
> 
> 
> 
> 
> 
> On Thu, May 11, 2017 at 8:57 PM, Pat Haley < pha...@mit.edu > wrote:
> 
> 
> 
> 
> Hi Pranith,
> 
> Unfortunately, we don't have similar hardware for a small scale test. All we
> have is our production hardware.
> 
> You said something about /home partition which has lesser disks, we can
> create plain distribute volume inside one of those directories. After we are
> done, we can remove the setup. What do you say?
> 
> 
> 
> 
> 
> Pat
> 
> 
> 
> 
> On 05/11/2017 07:05 AM, Pranith Kumar Karampuri wrote:
> 
> 
> 
> 
> 
> On Thu, May 11, 2017 at 2:48 AM, Pat Haley < pha...@mit.edu > wrote:
> 
> 
> 
> 
> Hi Pranith,
> 
> Since we are mounting the partitions as the bricks, I tried the dd test
> writing to /.glusterfs/. The
> results without oflag=sync were 1.6 Gb/s (faster than gluster but not as
> fast as I was expecting given the 1.2 Gb/s to the no-gluster area w/ fewer
> disks).
> 
> Okay, then 1.6Gb/s is what we need to target for, considering your volume is
> just distribute. Is there any way you can do tests on similar hardware but
> at a small scale? Just so we can run the workload to learn more about the
> bottlenecks in the system? We can probably try to get the speed to 1.2Gb/s
> on your /home partition you were telling me yesterday. Let me know if that
> is something you are okay to do.
> 
> 
> 
> 
> Pat
> 
> 
> 
> On 05/10/2017 01:27 PM, Pranith Kumar Karampuri wrote:
> 
> 
> 
> 
> 
> On Wed, May 10, 2017 at 10:15 PM, Pat Haley < pha...@mit.edu > wrote:
> 
> 
> 
> 
> Hi Pranith,
> 
> Not entirely sure (this isn't my area of expertise). I'll run your answer by
> some other people who are more familiar with this.
> 
> I am also uncertain about how to interpret the results when we also add the
> dd tests writing to the /home area (no gluster, still on the same machine)
> 
> 
> * dd test without oflag=sync (rough average of multiple tests)
>   * gluster w/ fuse mount: 570 Mb/s
>   * gluster w/ nfs mount: 390 Mb/s
>   * nfs (no gluster): 1.2 Gb/s
> * dd test with oflag=sync (rough average of multiple tests)
>   * gluster w/ fuse mount: 5 Mb/s
>   * gluster w/ nfs mount: 200 Mb/s
>   * nfs (no gluster): 20 Mb/s
> 
> Given that the non-gluster area is a RAID-6 of 4 disks while each brick of
> the gluster area is a RAID-6 of 32 disks, I would naively expect the writes
> to the gluster area to be roughly 8x faster than to the non-gluster.
> 
> I think a better test is to try and write to a file using nfs without any
> gluster to a location that is not inside the brick

Re: [Gluster-users] Slow write times to gluster disk

2017-05-12 Thread Pranith Kumar Karampuri
On Sat, May 13, 2017 at 8:44 AM, Pranith Kumar Karampuri <
pkara...@redhat.com> wrote:

>
>
> On Fri, May 12, 2017 at 8:04 PM, Pat Haley  wrote:
>
>>
>> Hi Pranith,
>>
>> My question was about setting up a gluster volume on an ext4 partition.
>> I thought we had the bricks mounted as xfs for compatibility with gluster?
>>
>
> Oh that should not be a problem. It works fine.
>

Just that xfs doesn't have limits for anything, whereas ext4 does for
things like hardlinks etc. (at least the last time I checked :-) ). So it is
better to have xfs.


>
>
>>
>> Pat
>>
>>
>>
>> On 05/11/2017 12:06 PM, Pranith Kumar Karampuri wrote:
>>
>>
>>
>> On Thu, May 11, 2017 at 9:32 PM, Pat Haley  wrote:
>>
>>>
>>> Hi Pranith,
>>>
>>> The /home partition is mounted as ext4
>>> /home  ext4  defaults,usrquota,grpquota  1 2
>>>
>>> The brick partitions are mounted as xfs
>>> /mnt/brick1  xfs defaults0 0
>>> /mnt/brick2  xfs defaults0 0
>>>
>>> Will this cause a problem with creating a volume under /home?
>>>
>>
>> I don't think the bottleneck is disk. You can do the same tests you did
>> on your new volume to confirm?
>>
>>
>>>
>>> Pat
>>>
>>>
>>>
>>> On 05/11/2017 11:32 AM, Pranith Kumar Karampuri wrote:
>>>
>>>
>>>
>>> On Thu, May 11, 2017 at 8:57 PM, Pat Haley  wrote:
>>>

 Hi Pranith,

 Unfortunately, we don't have similar hardware for a small scale test.
 All we have is our production hardware.

>>>
>>> You said something about /home partition which has lesser disks, we can
>>> create plain distribute volume inside one of those directories. After we
>>> are done, we can remove the setup. What do you say?
>>>
>>>

 Pat




 On 05/11/2017 07:05 AM, Pranith Kumar Karampuri wrote:



 On Thu, May 11, 2017 at 2:48 AM, Pat Haley  wrote:

>
> Hi Pranith,
>
> Since we are mounting the partitions as the bricks, I tried the dd
> test writing to /.glusterfs/.
> The results without oflag=sync were 1.6 Gb/s (faster than gluster but not
> as fast as I was expecting given the 1.2 Gb/s to the no-gluster area w/
> fewer disks).
>

 Okay, then 1.6Gb/s is what we need to target for, considering your
 volume is just distribute. Is there any way you can do tests on similar
 hardware but at a small scale? Just so we can run the workload to learn
 more about the bottlenecks in the system? We can probably try to get the
 speed to 1.2Gb/s on your /home partition you were telling me yesterday. Let
 me know if that is something you are okay to do.


>
> Pat
>
>
>
> On 05/10/2017 01:27 PM, Pranith Kumar Karampuri wrote:
>
>
>
> On Wed, May 10, 2017 at 10:15 PM, Pat Haley  wrote:
>
>>
>> Hi Pranith,
>>
>> Not entirely sure (this isn't my area of expertise).  I'll run your
>> answer by some other people who are more familiar with this.
>>
>> I am also uncertain about how to interpret the results when we also
>> add the dd tests writing to the /home area (no gluster, still on the same
>> machine)
>>
>>- dd test without oflag=sync (rough average of multiple tests)
>>   - gluster w/ fuse mount:  570 Mb/s
>>   - gluster w/ nfs mount:  390 Mb/s
>>   - nfs (no gluster):  1.2 Gb/s
>>- dd test with oflag=sync (rough average of multiple tests)
>>   - gluster w/ fuse mount:  5 Mb/s
>>   - gluster w/ nfs mount:  200 Mb/s
>>   - nfs (no gluster): 20 Mb/s
>>
>> Given that the non-gluster area is a RAID-6 of 4 disks while each
>> brick of the gluster area is a RAID-6 of 32 disks, I would naively expect
>> the writes to the gluster area to be roughly 8x faster than to the
>> non-gluster.
>>
>
> I think a better test is to try and write to a file using nfs without
> any gluster to a location that is not inside the brick but some other
> location that is on the same disk(s). If you are mounting the partition as the
> brick, then we can write to a file inside .glusterfs directory, something
> like /.glusterfs/.
>
>
>> I still think we have a speed issue, I can't tell if fuse vs nfs is
>> part of the problem.
>>
>
> I got interested in the post because I read that fuse speed is lesser
> than nfs speed which is counter-intuitive to my understanding. So wanted
> clarifications. Now that I got my clarifications where fuse outperformed
> nfs without sync, we can resume testing as described above and try to find
> what it is. Based on your email-id I am guessing you are from Boston and I
> am from Bangalore so if you are okay with doing this debugging for 
> multiple
> days because of timezones, I will be happy to help. Please be a bit 
> patient
> with me, I am under a release 

Re: [Gluster-users] Slow write times to gluster disk

2017-05-12 Thread Pranith Kumar Karampuri
On Fri, May 12, 2017 at 8:04 PM, Pat Haley  wrote:

>
> Hi Pranith,
>
> My question was about setting up a gluster volume on an ext4 partition.  I
> thought we had the bricks mounted as xfs for compatibility with gluster?
>

Oh that should not be a problem. It works fine.


>
> Pat
>
>
>
> On 05/11/2017 12:06 PM, Pranith Kumar Karampuri wrote:
>
>
>
> On Thu, May 11, 2017 at 9:32 PM, Pat Haley  wrote:
>
>>
>> Hi Pranith,
>>
>> The /home partition is mounted as ext4
>> /home  ext4defaults,usrquota,grpquota  1 2
>>
>> The brick partitions are mounted as xfs
>> /mnt/brick1  xfs defaults0 0
>> /mnt/brick2  xfs defaults0 0
>>
>> Will this cause a problem with creating a volume under /home?
>>
>
> I don't think the bottleneck is disk. You can do the same tests you did on
> your new volume to confirm?
>
>
>>
>> Pat
>>
>>
>>
>> On 05/11/2017 11:32 AM, Pranith Kumar Karampuri wrote:
>>
>>
>>
>> On Thu, May 11, 2017 at 8:57 PM, Pat Haley  wrote:
>>
>>>
>>> Hi Pranith,
>>>
>>> Unfortunately, we don't have similar hardware for a small scale test.
>>> All we have is our production hardware.
>>>
>>
>> You said something about /home partition which has lesser disks, we can
>> create plain distribute volume inside one of those directories. After we
>> are done, we can remove the setup. What do you say?
>>
>>
>>>
>>> Pat
>>>
>>>
>>>
>>>
>>> On 05/11/2017 07:05 AM, Pranith Kumar Karampuri wrote:
>>>
>>>
>>>
>>> On Thu, May 11, 2017 at 2:48 AM, Pat Haley  wrote:
>>>

 Hi Pranith,

 Since we are mounting the partitions as the bricks, I tried the dd test
 writing to /.glusterfs/.
 The results without oflag=sync were 1.6 Gb/s (faster than gluster but not
 as fast as I was expecting given the 1.2 Gb/s to the no-gluster area w/
 fewer disks).

>>>
>>> Okay, then 1.6Gb/s is what we need to target for, considering your
>>> volume is just distribute. Is there any way you can do tests on similar
>>> hardware but at a small scale? Just so we can run the workload to learn
>>> more about the bottlenecks in the system? We can probably try to get the
>>> speed to 1.2Gb/s on your /home partition you were telling me yesterday. Let
>>> me know if that is something you are okay to do.
>>>
>>>

 Pat



 On 05/10/2017 01:27 PM, Pranith Kumar Karampuri wrote:



 On Wed, May 10, 2017 at 10:15 PM, Pat Haley  wrote:

>
> Hi Pranith,
>
> Not entirely sure (this isn't my area of expertise).  I'll run your
> answer by some other people who are more familiar with this.
>
> I am also uncertain about how to interpret the results when we also
> add the dd tests writing to the /home area (no gluster, still on the same
> machine)
>
>- dd test without oflag=sync (rough average of multiple tests)
>   - gluster w/ fuse mount:  570 Mb/s
>   - gluster w/ nfs mount:  390 Mb/s
>   - nfs (no gluster):  1.2 Gb/s
>- dd test with oflag=sync (rough average of multiple tests)
>   - gluster w/ fuse mount:  5 Mb/s
>   - gluster w/ nfs mount:  200 Mb/s
>   - nfs (no gluster): 20 Mb/s
>
> Given that the non-gluster area is a RAID-6 of 4 disks while each
> brick of the gluster area is a RAID-6 of 32 disks, I would naively expect
> the writes to the gluster area to be roughly 8x faster than to the
> non-gluster.
>

 I think a better test is to try and write to a file using nfs without
 any gluster to a location that is not inside the brick but some other
 location that is on the same disk(s). If you are mounting the partition as the
 brick, then we can write to a file inside .glusterfs directory, something
 like /.glusterfs/.


> I still think we have a speed issue, I can't tell if fuse vs nfs is
> part of the problem.
>

 I got interested in the post because I read that fuse speed is lesser
 than nfs speed which is counter-intuitive to my understanding. So wanted
 clarifications. Now that I got my clarifications where fuse outperformed
 nfs without sync, we can resume testing as described above and try to find
 what it is. Based on your email-id I am guessing you are from Boston and I
 am from Bangalore so if you are okay with doing this debugging for multiple
 days because of timezones, I will be happy to help. Please be a bit patient
 with me, I am under a release crunch but I am very curious with the problem
 you posted.

   Was there anything useful in the profiles?
>

 Unfortunately profiles didn't help me much, I think we are collecting
 the profiles from an active volume, so it has a lot of information that is
 not pertaining to dd so it is difficult to find the contributions of dd. So
 I went through your post again and found something I 

Re: [Gluster-users] Slow write times to gluster disk

2017-05-12 Thread Pat Haley


Hi Pranith,

My question was about setting up a gluster volume on an ext4 partition.  
I thought we had the bricks mounted as xfs for compatibility with gluster?


Pat


On 05/11/2017 12:06 PM, Pranith Kumar Karampuri wrote:



On Thu, May 11, 2017 at 9:32 PM, Pat Haley > wrote:



Hi Pranith,

The /home partition is mounted as ext4
/home  ext4 defaults,usrquota,grpquota  1 2

The brick partitions are mounted as xfs
/mnt/brick1  xfs defaults0 0
/mnt/brick2  xfs defaults0 0

Will this cause a problem with creating a volume under /home?


I don't think the bottleneck is disk. You can do the same tests you 
did on your new volume to confirm?



Pat



On 05/11/2017 11:32 AM, Pranith Kumar Karampuri wrote:



On Thu, May 11, 2017 at 8:57 PM, Pat Haley > wrote:


Hi Pranith,

Unfortunately, we don't have similar hardware for a small
scale test.  All we have is our production hardware.


You said something about /home partition which has lesser disks,
we can create plain distribute volume inside one of those
directories. After we are done, we can remove the setup. What do
you say?


Pat




On 05/11/2017 07:05 AM, Pranith Kumar Karampuri wrote:



On Thu, May 11, 2017 at 2:48 AM, Pat Haley > wrote:


Hi Pranith,

Since we are mounting the partitions as the bricks, I
tried the dd test writing to
/.glusterfs/.
The results without oflag=sync were 1.6 Gb/s (faster
than gluster but not as fast as I was expecting given
the 1.2 Gb/s to the no-gluster area w/ fewer disks).


Okay, then 1.6Gb/s is what we need to target for,
considering your volume is just distribute. Is there any way
you can do tests on similar hardware but at a small scale?
Just so we can run the workload to learn more about the
bottlenecks in the system? We can probably try to get the
speed to 1.2Gb/s on your /home partition you were telling me
yesterday. Let me know if that is something you are okay to do.


Pat



On 05/10/2017 01:27 PM, Pranith Kumar Karampuri wrote:



On Wed, May 10, 2017 at 10:15 PM, Pat Haley
> wrote:


Hi Pranith,

Not entirely sure (this isn't my area of
expertise). I'll run your answer by some other
people who are more familiar with this.

I am also uncertain about how to interpret the
results when we also add the dd tests writing to
the /home area (no gluster, still on the same machine)

  * dd test without oflag=sync (rough average of
multiple tests)
  o gluster w/ fuse mount : 570 Mb/s
  o gluster w/ nfs mount: 390 Mb/s
  o nfs (no gluster):  1.2 Gb/s
  * dd test with oflag=sync (rough average of
multiple tests)
  o gluster w/ fuse mount:  5 Mb/s
  o gluster w/ nfs mount: 200 Mb/s
  o nfs (no gluster): 20 Mb/s

Given that the non-gluster area is a RAID-6 of 4
disks while each brick of the gluster area is a
RAID-6 of 32 disks, I would naively expect the
writes to the gluster area to be roughly 8x faster
than to the non-gluster.


I think a better test is to try and write to a file
using nfs without any gluster to a location that is not
inside the brick but someother location that is on same
disk(s). If you are mounting the partition as the
brick, then we can write to a file inside .glusterfs
directory, something like
/.glusterfs/.


I still think we have a speed issue, I can't tell
if fuse vs nfs is part of the problem.


I got interested in the post because I read that fuse
speed is lesser than nfs speed which is
counter-intuitive to my understanding. So wanted
clarifications. Now that I got my clarifications where
fuse outperformed nfs without sync, we can resume
testing as described above and try to find what it is.
Based on your email-id I am guessing you are from
Boston and I am from Bangalore so if you are okay with
doing this debugging for multiple days because of
timezones, I will be happy to help. Please be a bit
patient with me, I am under a release crunch but I am
very curious with the problem 

Re: [Gluster-users] Slow write times to gluster disk

2017-05-11 Thread Pat Haley


Hi Pranith,

The /home partition is mounted as ext4
/home  ext4  defaults,usrquota,grpquota  1 2

The brick partitions are mounted as xfs
/mnt/brick1  xfs defaults0 0
/mnt/brick2  xfs defaults0 0

Will this cause a problem with creating a volume under /home?

Pat


On 05/11/2017 11:32 AM, Pranith Kumar Karampuri wrote:



On Thu, May 11, 2017 at 8:57 PM, Pat Haley > wrote:



Hi Pranith,

Unfortunately, we don't have similar hardware for a small scale
test.  All we have is our production hardware.


You said something about /home partition which has lesser disks, we 
can create plain distribute volume inside one of those directories. 
After we are done, we can remove the setup. What do you say?



Pat




On 05/11/2017 07:05 AM, Pranith Kumar Karampuri wrote:



On Thu, May 11, 2017 at 2:48 AM, Pat Haley > wrote:


Hi Pranith,

Since we are mounting the partitions as the bricks, I tried
the dd test writing to
/.glusterfs/. The
results without oflag=sync were 1.6 Gb/s (faster than gluster
but not as fast as I was expecting given the 1.2 Gb/s to the
no-gluster area w/ fewer disks).


Okay, then 1.6 Gb/s is what we need to target, considering
your volume is just distribute. Is there any way you can do tests
on similar hardware but at a small scale? Just so we can run the
workload to learn more about the bottlenecks in the system? We
can probably try to get the speed to 1.2Gb/s on your /home
partition you were telling me yesterday. Let me know if that is
something you are okay to do.


Pat



On 05/10/2017 01:27 PM, Pranith Kumar Karampuri wrote:



On Wed, May 10, 2017 at 10:15 PM, Pat Haley wrote:


Hi Pranith,

Not entirely sure (this isn't my area of expertise).
I'll run your answer by some other people who are more
familiar with this.

I am also uncertain about how to interpret the results
when we also add the dd tests writing to the /home area
(no gluster, still on the same machine)

  * dd test without oflag=sync (rough average of
multiple tests)
  o gluster w/ fuse mount : 570 Mb/s
  o gluster w/ nfs mount:  390 Mb/s
  o nfs (no gluster):  1.2 Gb/s
  * dd test with oflag=sync (rough average of multiple
tests)
  o gluster w/ fuse mount:  5 Mb/s
  o gluster w/ nfs mount:  200 Mb/s
  o nfs (no gluster): 20 Mb/s

Given that the non-gluster area is a RAID-6 of 4 disks
while each brick of the gluster area is a RAID-6 of 32
disks, I would naively expect the writes to the gluster
area to be roughly 8x faster than to the non-gluster.
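As a sketch, the buffered-versus-synchronous numbers above come from dd invocations of roughly this shape. The output path is a placeholder; in the tests it would point at the fuse mount, the nfs mount, or the plain /home area in turn.

```shell
# Sketch of the two dd variants behind the numbers above.
# OUT is a placeholder path, not one of the actual mounts.
OUT=${OUT:-/tmp/ddtest.bin}
dd if=/dev/zero of="$OUT" bs=1M count=64 2>&1 | tail -n1             # buffered writes
dd if=/dev/zero of="$OUT" bs=1M count=64 oflag=sync 2>&1 | tail -n1  # O_SYNC: each write hits disk
rm -f "$OUT"
```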


I think a better test is to try and write to a file using
nfs without any gluster to a location that is not inside the
brick but some other location that is on the same disk(s). If you
are mounting the partition as the brick, then we can write
to a file inside .glusterfs directory, something like
/.glusterfs/.
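A minimal sketch of that baseline test. The helper name is ours, not a gluster tool, and the brick path in the example invocation is an assumption; the point is to write into the brick's backing filesystem but outside the exported data, e.g. under the .glusterfs directory.

```shell
# Baseline write into the brick's backing disk, outside the data tree.
# brick_baseline is our helper name; the path argument is an example.
brick_baseline() {
  dir="$1/.glusterfs"
  [ -d "$dir" ] || { echo "missing $dir" >&2; return 1; }
  dd if=/dev/zero of="$dir/ddtest.bin" bs=1M count=64 oflag=sync 2>&1 | tail -n1
  rm -f "$dir/ddtest.bin"
}
# example: brick_baseline /mnt/brick1
```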


I still think we have a speed issue, I can't tell if
fuse vs nfs is part of the problem.


I got interested in the post because I read that fuse speed
is less than nfs speed, which is counter-intuitive to my
understanding, so I wanted clarification. Now that I have my
clarification (fuse outperformed nfs without sync), we can
resume testing as described above and try to find what it
is. Based on your email-id I am guessing you are from Boston
and I am from Bangalore, so if you are okay with doing this
debugging over multiple days because of the timezones, I
will be happy to help. Please be a bit patient with me; I am
under a release crunch, but I am very curious about the
problem you posted.

  Was there anything useful in the profiles?


Unfortunately the profiles didn't help me much. I think we
are collecting the profiles from an active volume, so they
contain a lot of information that does not pertain to dd,
which makes it difficult to isolate dd's contribution. So I
went through your post again and found something I didn't
pay much attention to earlier, i.e. oflag=sync, so I did my
own tests on my setup with FUSE and sent that reply.
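One way around that noise, sketched below under our assumptions: each `gluster volume profile VOLNAME info` call prints cumulative stats plus stats for the interval since the previous `info`, so calling it immediately before and after a lone dd brackets just that dd. The helper name and the mount path in the example are ours; only the volume name comes from this thread.

```shell
# Bracket a single dd in a fresh profile interval so unrelated volume
# traffic doesn't drown it out. profile_dd_window is our helper name.
profile_dd_window() {
  vol=$1 mnt=$2
  gluster volume profile "$vol" start
  gluster volume profile "$vol" info > /dev/null   # resets the interval stats
  dd if=/dev/zero of="$mnt/profiled-dd.bin" bs=1M count=1024 oflag=sync
  gluster volume profile "$vol" info > "/tmp/$vol-dd-profile.txt"  # read the Interval section
  gluster volume profile "$vol" stop
  rm -f "$mnt/profiled-dd.bin"
}
# example (mount path is an assumption): profile_dd_window data-volume /gdata
```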


Pat



On 05/10/2017 12:15 PM, Pranith Kumar Karampuri wrote:

Okay good. At least this validates my doubts. Handling
O_SYNC in gluster NFS and fuse is a bit different.
When application opens a file 

Re: [Gluster-users] Slow write times to gluster disk

2017-05-11 Thread Pranith Kumar Karampuri
On Thu, May 11, 2017 at 8:57 PM, Pat Haley  wrote:

>
> Hi Pranith,
>
> Unfortunately, we don't have similar hardware for a small scale test.  All
> we have is our production hardware.
>

You said something about the /home partition, which has fewer disks; we can
create a plain distribute volume inside one of those directories. After we
are done, we can remove the setup. What do you say?


>
> Pat
>
>
>
>
> On 05/11/2017 07:05 AM, Pranith Kumar Karampuri wrote:
>
>
>
> On Thu, May 11, 2017 at 2:48 AM, Pat Haley  wrote:
>
>>
>> Hi Pranith,
>>
>> Since we are mounting the partitions as the bricks, I tried the dd test
>> writing to /.glusterfs/. The
>> results without oflag=sync were 1.6 Gb/s (faster than gluster but not as
>> fast as I was expecting given the 1.2 Gb/s to the no-gluster area w/ fewer
>> disks).
>>
>
> Okay, then 1.6 Gb/s is what we need to target, considering your volume
> is just distribute. Is there any way you can do tests on similar hardware
> but at a small scale? Just so we can run the workload to learn more about
> the bottlenecks in the system? We can probably try to get the speed to
> 1.2Gb/s on your /home partition you were telling me yesterday. Let me know
> if that is something you are okay to do.
>
>
>>
>> Pat
>>
>>
>>
>> On 05/10/2017 01:27 PM, Pranith Kumar Karampuri wrote:
>>
>>
>>
>> On Wed, May 10, 2017 at 10:15 PM, Pat Haley  wrote:
>>
>>>
>>> Hi Pranith,
>>>
>>> Not entirely sure (this isn't my area of expertise).  I'll run your
>>> answer by some other people who are more familiar with this.
>>>
>>> I am also uncertain about how to interpret the results when we also add
>>> the dd tests writing to the /home area (no gluster, still on the same
>>> machine)
>>>
>>>- dd test without oflag=sync (rough average of multiple tests)
>>>- gluster w/ fuse mount : 570 Mb/s
>>>   - gluster w/ nfs mount:  390 Mb/s
>>>   - nfs (no gluster):  1.2 Gb/s
>>>- dd test with oflag=sync (rough average of multiple tests)
>>>   - gluster w/ fuse mount:  5 Mb/s
>>>   - gluster w/ nfs mount:  200 Mb/s
>>>   - nfs (no gluster): 20 Mb/s
>>>
>>> Given that the non-gluster area is a RAID-6 of 4 disks while each brick
>>> of the gluster area is a RAID-6 of 32 disks, I would naively expect the
>>> writes to the gluster area to be roughly 8x faster than to the non-gluster.
>>>
>>
>> I think a better test is to try and write to a file using nfs without any
>> gluster to a location that is not inside the brick but some other location
>> that is on the same disk(s). If you are mounting the partition as the brick,
>> then we can write to a file inside .glusterfs directory, something like
>> /.glusterfs/.
>>
>>
>>> I still think we have a speed issue, I can't tell if fuse vs nfs is part
>>> of the problem.
>>>
>>
>> I got interested in the post because I read that fuse speed is less
>> than nfs speed, which is counter-intuitive to my understanding, so I
>> wanted clarification. Now that I have my clarification (fuse outperformed
>> nfs without sync), we can resume testing as described above and try to find
>> what it is. Based on your email-id I am guessing you are from Boston and I
>> am from Bangalore, so if you are okay with doing this debugging over
>> multiple days because of the timezones, I will be happy to help. Please be
>> a bit patient with me; I am under a release crunch, but I am very curious
>> about the problem you posted.
>>
>>   Was there anything useful in the profiles?
>>>
>>
>> Unfortunately profiles didn't help me much, I think we are collecting the
>> profiles from an active volume, so it has a lot of information that is not
>> pertaining to dd so it is difficult to find the contributions of dd. So I
>> went through your post again and found something I didn't pay much
>> attention to earlier i.e. oflag=sync, so did my own tests on my setup with
>> FUSE so sent that reply.
>>
>>
>>>
>>> Pat
>>>
>>>
>>>
>>> On 05/10/2017 12:15 PM, Pranith Kumar Karampuri wrote:
>>>
>>> Okay, good. At least this validates my doubts. Handling O_SYNC in gluster
>>> NFS and fuse is a bit different.
>>> When an application opens a file with O_SYNC on a fuse mount, each write
>>> syscall has to be written to disk as part of the syscall, whereas in the
>>> case of NFS there is no concept of open. NFS performs the write through a
>>> handle saying it needs to be a synchronous write, so the write() syscall
>>> is performed first and then it performs fsync(); a write on an fd with
>>> O_SYNC thus becomes write+fsync. I suspect that when multiple threads do
>>> this write+fsync() operation on the same file, multiple writes are
>>> batched together to be written to disk, which is my guess for why the
>>> throughput on the disk increases.
>>>
>>> Does it answer your doubts?
>>>
>>> On Wed, May 10, 2017 at 9:35 PM, Pat Haley  wrote:
>>>

 Without the oflag=sync and only a single test of each, the FUSE is
 going faster than NFS:

 FUSE:

Re: [Gluster-users] Slow write times to gluster disk

2017-05-11 Thread Pat Haley


Hi Pranith,

Unfortunately, we don't have similar hardware for a small scale test.  
All we have is our production hardware.


Pat



On 05/11/2017 07:05 AM, Pranith Kumar Karampuri wrote:



On Thu, May 11, 2017 at 2:48 AM, Pat Haley wrote:



Hi Pranith,

Since we are mounting the partitions as the bricks, I tried the dd
test writing to
/.glusterfs/. The
results without oflag=sync were 1.6 Gb/s (faster than gluster but
not as fast as I was expecting given the 1.2 Gb/s to the
no-gluster area w/ fewer disks).


Okay, then 1.6 Gb/s is what we need to target, considering your 
volume is just distribute. Is there any way you can do tests on 
similar hardware but at a small scale? Just so we can run the workload 
to learn more about the bottlenecks in the system? We can probably try 
to get the speed to 1.2Gb/s on your /home partition you were telling 
me yesterday. Let me know if that is something you are okay to do.



Pat



On 05/10/2017 01:27 PM, Pranith Kumar Karampuri wrote:



On Wed, May 10, 2017 at 10:15 PM, Pat Haley wrote:


Hi Pranith,

Not entirely sure (this isn't my area of expertise).  I'll
run your answer by some other people who are more familiar
with this.

I am also uncertain about how to interpret the results when
we also add the dd tests writing to the /home area (no
gluster, still on the same machine)

  * dd test without oflag=sync (rough average of multiple tests)
  o gluster w/ fuse mount : 570 Mb/s
  o gluster w/ nfs mount:  390 Mb/s
  o nfs (no gluster):  1.2 Gb/s
  * dd test with oflag=sync (rough average of multiple tests)
  o gluster w/ fuse mount:  5 Mb/s
  o gluster w/ nfs mount:  200 Mb/s
  o nfs (no gluster): 20 Mb/s

Given that the non-gluster area is a RAID-6 of 4 disks while
each brick of the gluster area is a RAID-6 of 32 disks, I
would naively expect the writes to the gluster area to be
roughly 8x faster than to the non-gluster.


I think a better test is to try and write to a file using nfs
without any gluster to a location that is not inside the brick
but some other location that is on the same disk(s). If you are
mounting the partition as the brick, then we can write to a file
inside .glusterfs directory, something like
/.glusterfs/.


I still think we have a speed issue, I can't tell if fuse vs
nfs is part of the problem.


I got interested in the post because I read that fuse speed is
less than nfs speed, which is counter-intuitive to my
understanding, so I wanted clarification. Now that I have my
clarification (fuse outperformed nfs without sync), we can
resume testing as described above and try to find what it is.
Based on your email-id I am guessing you are from Boston and I am
from Bangalore, so if you are okay with doing this debugging over
multiple days because of the timezones, I will be happy to help.
Please be a bit patient with me; I am under a release crunch, but
I am very curious about the problem you posted.

Was there anything useful in the profiles?


Unfortunately the profiles didn't help me much. I think we are
collecting the profiles from an active volume, so they contain a
lot of information that does not pertain to dd, which makes it
difficult to isolate dd's contribution. So I went through your
post again and found something I didn't pay much attention to
earlier, i.e. oflag=sync, so I did my own tests on my setup with
FUSE and sent that reply.


Pat



On 05/10/2017 12:15 PM, Pranith Kumar Karampuri wrote:

Okay, good. At least this validates my doubts. Handling
O_SYNC in gluster NFS and fuse is a bit different.
When an application opens a file with O_SYNC on a fuse mount,
each write syscall has to be written to disk as part of the
syscall, whereas in the case of NFS there is no concept of
open. NFS performs the write through a handle saying it needs
to be a synchronous write, so the write() syscall is performed
first and then it performs fsync(); a write on an fd with
O_SYNC thus becomes write+fsync. I suspect that when multiple
threads do this write+fsync() operation on the same file,
multiple writes are batched together to be written to disk,
which is my guess for why the throughput on the disk increases.

Does it answer your doubts?
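The distinction can be roughly illustrated with dd flags (GNU dd assumed): `oflag=sync` opens the output with O_SYNC, so every write() must reach disk before returning, while `conv=fsync` does buffered writes with a single fsync() at the end. The fsync variant is not identical to the gnfs per-write write+fsync path described above, but it shows how deferring and batching the sync work raises disk throughput.

```shell
# Rough illustration of the two write paths described above.
F=/tmp/osync_demo.bin
dd if=/dev/zero of="$F" bs=4k count=2048 oflag=sync 2>&1 | tail -n1  # O_SYNC: sync on every write()
dd if=/dev/zero of="$F" bs=4k count=2048 conv=fsync 2>&1 | tail -n1  # buffered writes + one fsync()
rm -f "$F"
```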

On Wed, May 10, 2017 at 9:35 PM, Pat Haley wrote:


Without the oflag=sync and only a single test of each,
the FUSE is going faster than NFS:

FUSE:
mseas-data2(dri_nascar)% dd if=/dev/zero count=4096

Re: [Gluster-users] Slow write times to gluster disk

2017-05-11 Thread Pranith Kumar Karampuri
On Thu, May 11, 2017 at 2:48 AM, Pat Haley  wrote:

>
> Hi Pranith,
>
> Since we are mounting the partitions as the bricks, I tried the dd test
> writing to /.glusterfs/. The
> results without oflag=sync were 1.6 Gb/s (faster than gluster but not as
> fast as I was expecting given the 1.2 Gb/s to the no-gluster area w/ fewer
> disks).
>

Okay, then 1.6 Gb/s is what we need to target, considering your volume
is just distribute. Is there any way you can do tests on similar hardware
but at a small scale? Just so we can run the workload to learn more about
the bottlenecks in the system? We can probably try to get the speed to
1.2Gb/s on your /home partition you were telling me yesterday. Let me know
if that is something you are okay to do.


>
> Pat
>
>
>
> On 05/10/2017 01:27 PM, Pranith Kumar Karampuri wrote:
>
>
>
> On Wed, May 10, 2017 at 10:15 PM, Pat Haley  wrote:
>
>>
>> Hi Pranith,
>>
>> Not entirely sure (this isn't my area of expertise).  I'll run your
>> answer by some other people who are more familiar with this.
>>
>> I am also uncertain about how to interpret the results when we also add
>> the dd tests writing to the /home area (no gluster, still on the same
>> machine)
>>
>>- dd test without oflag=sync (rough average of multiple tests)
>>- gluster w/ fuse mount : 570 Mb/s
>>   - gluster w/ nfs mount:  390 Mb/s
>>   - nfs (no gluster):  1.2 Gb/s
>>- dd test with oflag=sync (rough average of multiple tests)
>>   - gluster w/ fuse mount:  5 Mb/s
>>   - gluster w/ nfs mount:  200 Mb/s
>>   - nfs (no gluster): 20 Mb/s
>>
>> Given that the non-gluster area is a RAID-6 of 4 disks while each brick
>> of the gluster area is a RAID-6 of 32 disks, I would naively expect the
>> writes to the gluster area to be roughly 8x faster than to the non-gluster.
>>
>
> I think a better test is to try and write to a file using nfs without any
> gluster to a location that is not inside the brick but some other location
> that is on the same disk(s). If you are mounting the partition as the brick,
> then we can write to a file inside .glusterfs directory, something like
> /.glusterfs/.
>
>
>> I still think we have a speed issue, I can't tell if fuse vs nfs is part
>> of the problem.
>>
>
> I got interested in the post because I read that fuse speed is less than
> nfs speed, which is counter-intuitive to my understanding, so I wanted
> clarification. Now that I have my clarification (fuse outperformed nfs
> without sync), we can resume testing as described above and try to find
> what it is. Based on your email-id I am guessing you are from Boston and I
> am from Bangalore, so if you are okay with doing this debugging over
> multiple days because of the timezones, I will be happy to help. Please be
> a bit patient with me; I am under a release crunch, but I am very curious
> about the problem you posted.
>
>   Was there anything useful in the profiles?
>>
>
> Unfortunately the profiles didn't help me much. I think we are collecting
> the profiles from an active volume, so they contain a lot of information
> that does not pertain to dd, which makes it difficult to isolate dd's
> contribution. So I went through your post again and found something I
> didn't pay much attention to earlier, i.e. oflag=sync, so I did my own
> tests on my setup with FUSE and sent that reply.
>
>
>>
>> Pat
>>
>>
>>
>> On 05/10/2017 12:15 PM, Pranith Kumar Karampuri wrote:
>>
>> Okay, good. At least this validates my doubts. Handling O_SYNC in gluster
>> NFS and fuse is a bit different.
>> When an application opens a file with O_SYNC on a fuse mount, each write
>> syscall has to be written to disk as part of the syscall, whereas in the
>> case of NFS there is no concept of open. NFS performs the write through a
>> handle saying it needs to be a synchronous write, so the write() syscall
>> is performed first and then it performs fsync(); a write on an fd with
>> O_SYNC thus becomes write+fsync. I suspect that when multiple threads do
>> this write+fsync() operation on the same file, multiple writes are batched
>> together to be written to disk, which is my guess for why the throughput
>> on the disk increases.
>>
>> Does it answer your doubts?
>>
>> On Wed, May 10, 2017 at 9:35 PM, Pat Haley  wrote:
>>
>>>
>>> Without the oflag=sync and only a single test of each, the FUSE is going
>>> faster than NFS:
>>>
>>> FUSE:
>>> mseas-data2(dri_nascar)% dd if=/dev/zero count=4096 bs=1048576
>>> of=zeros.txt conv=sync
>>> 4096+0 records in
>>> 4096+0 records out
>>> 4294967296 bytes (4.3 GB) copied, 7.46961 s, 575 MB/s
>>>
>>>
>>> NFS
>>> mseas-data2(HYCOM)% dd if=/dev/zero count=4096 bs=1048576 of=zeros.txt
>>> conv=sync
>>> 4096+0 records in
>>> 4096+0 records out
>>> 4294967296 bytes (4.3 GB) copied, 11.4264 s, 376 MB/s
>>>
>>>
>>>
>>> On 05/10/2017 11:53 AM, Pranith Kumar Karampuri wrote:
>>>
>>> Could you let me know the speed without oflag=sync on both the mounts?
>>> No need to collect profiles.
>>>
>>> On Wed, 

Re: [Gluster-users] Slow write times to gluster disk

2017-05-10 Thread Pat Haley


Hi Pranith,

Since we are mounting the partitions as the bricks, I tried the dd test 
writing to /.glusterfs/. The 
results without oflag=sync were 1.6 Gb/s (faster than gluster but not as 
fast as I was expecting given the 1.2 Gb/s to the no-gluster area w/ 
fewer disks).


Pat


On 05/10/2017 01:27 PM, Pranith Kumar Karampuri wrote:



On Wed, May 10, 2017 at 10:15 PM, Pat Haley wrote:



Hi Pranith,

Not entirely sure (this isn't my area of expertise). I'll run your
answer by some other people who are more familiar with this.

I am also uncertain about how to interpret the results when we
also add the dd tests writing to the /home area (no gluster, still
on the same machine)

  * dd test without oflag=sync (rough average of multiple tests)
  o gluster w/ fuse mount : 570 Mb/s
  o gluster w/ nfs mount:  390 Mb/s
  o nfs (no gluster):  1.2 Gb/s
  * dd test with oflag=sync (rough average of multiple tests)
  o gluster w/ fuse mount:  5 Mb/s
  o gluster w/ nfs mount:  200 Mb/s
  o nfs (no gluster): 20 Mb/s

Given that the non-gluster area is a RAID-6 of 4 disks while each
brick of the gluster area is a RAID-6 of 32 disks, I would naively
expect the writes to the gluster area to be roughly 8x faster than
to the non-gluster.


I think a better test is to try and write to a file using nfs without 
any gluster to a location that is not inside the brick but some other 
location that is on the same disk(s). If you are mounting the partition as 
the brick, then we can write to a file inside .glusterfs directory, 
something like /.glusterfs/.



I still think we have a speed issue, I can't tell if fuse vs nfs
is part of the problem.


I got interested in the post because I read that fuse speed is less 
than nfs speed, which is counter-intuitive to my understanding, so I 
wanted clarification. Now that I have my clarification (fuse 
outperformed nfs without sync), we can resume testing as described 
above and try to find what it is. Based on your email-id I am guessing 
you are from Boston and I am from Bangalore, so if you are okay with 
doing this debugging over multiple days because of the timezones, I 
will be happy to help. Please be a bit patient with me; I am under a 
release crunch, but I am very curious about the problem you posted.


  Was there anything useful in the profiles?


Unfortunately the profiles didn't help me much. I think we are 
collecting the profiles from an active volume, so they contain a lot 
of information that does not pertain to dd, which makes it difficult 
to isolate dd's contribution. So I went through your post again and 
found something I didn't pay much attention to earlier, i.e. 
oflag=sync, so I did my own tests on my setup with FUSE and sent that 
reply.



Pat



On 05/10/2017 12:15 PM, Pranith Kumar Karampuri wrote:

Okay, good. At least this validates my doubts. Handling O_SYNC in
gluster NFS and fuse is a bit different.
When an application opens a file with O_SYNC on a fuse mount, each
write syscall has to be written to disk as part of the syscall,
whereas in the case of NFS there is no concept of open. NFS
performs the write through a handle saying it needs to be a
synchronous write, so the write() syscall is performed first and
then it performs fsync(); a write on an fd with O_SYNC thus becomes
write+fsync. I suspect that when multiple threads do this
write+fsync() operation on the same file, multiple writes are
batched together to be written to disk, which is my guess for why
the throughput on the disk increases.

Does it answer your doubts?

On Wed, May 10, 2017 at 9:35 PM, Pat Haley wrote:


Without the oflag=sync and only a single test of each, the
FUSE is going faster than NFS:

FUSE:
mseas-data2(dri_nascar)% dd if=/dev/zero count=4096
bs=1048576 of=zeros.txt conv=sync
4096+0 records in
4096+0 records out
4294967296 bytes (4.3 GB) copied, 7.46961 s, 575 MB/s


NFS
mseas-data2(HYCOM)% dd if=/dev/zero count=4096 bs=1048576
of=zeros.txt conv=sync
4096+0 records in
4096+0 records out
4294967296 bytes (4.3 GB) copied, 11.4264 s, 376 MB/s
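One caveat worth flagging on the transcripts above (editor's note, assuming GNU dd): `conv=sync` does not make writes synchronous; it only pads short *input* blocks with zeros up to `bs`. Reading from /dev/zero always yields full blocks, so here it is effectively a no-op and both runs are plain buffered I/O; truly synchronous writes need `oflag=sync` (or `oflag=dsync`).

```shell
# conv=sync pads short input blocks to bs; it does not sync writes.
# With if=/dev/zero the blocks are always full, so these two runs
# produce identical files and both are plain buffered I/O.
F=/tmp/convsync_demo.bin
dd if=/dev/zero of="$F" bs=1M count=4 conv=sync 2>/dev/null
dd if=/dev/zero of="$F" bs=1M count=4 2>/dev/null
rm -f "$F"
```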



On 05/10/2017 11:53 AM, Pranith Kumar Karampuri wrote:

Could you let me know the speed without oflag=sync on both
the mounts? No need to collect profiles.

On Wed, May 10, 2017 at 9:17 PM, Pat Haley wrote:


Here is what I see now:

[root@mseas-data2 ~]# gluster volume info

Volume Name: data-volume
Type: Distribute
Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
  

Re: [Gluster-users] Slow write times to gluster disk

2017-05-10 Thread Pranith Kumar Karampuri
On Wed, May 10, 2017 at 10:15 PM, Pat Haley  wrote:

>
> Hi Pranith,
>
> Not entirely sure (this isn't my area of expertise).  I'll run your answer
> by some other people who are more familiar with this.
>
> I am also uncertain about how to interpret the results when we also add
> the dd tests writing to the /home area (no gluster, still on the same
> machine)
>
>- dd test without oflag=sync (rough average of multiple tests)
>- gluster w/ fuse mount : 570 Mb/s
>   - gluster w/ nfs mount:  390 Mb/s
>   - nfs (no gluster):  1.2 Gb/s
>- dd test with oflag=sync (rough average of multiple tests)
>   - gluster w/ fuse mount:  5 Mb/s
>   - gluster w/ nfs mount:  200 Mb/s
>   - nfs (no gluster): 20 Mb/s
>
> Given that the non-gluster area is a RAID-6 of 4 disks while each brick of
> the gluster area is a RAID-6 of 32 disks, I would naively expect the writes
> to the gluster area to be roughly 8x faster than to the non-gluster.
>

I think a better test is to try and write to a file using nfs without any
gluster to a location that is not inside the brick but some other location
that is on the same disk(s). If you are mounting the partition as the brick,
then we can write to a file inside .glusterfs directory, something like
/.glusterfs/.


> I still think we have a speed issue, I can't tell if fuse vs nfs is part
> of the problem.
>

I got interested in the post because I read that fuse speed is less than
nfs speed, which is counter-intuitive to my understanding, so I wanted
clarification. Now that I have my clarification (fuse outperformed nfs
without sync), we can resume testing as described above and try to find
what it is. Based on your email-id I am guessing you are from Boston and I
am from Bangalore, so if you are okay with doing this debugging over
multiple days because of the timezones, I will be happy to help. Please be
a bit patient with me; I am under a release crunch, but I am very curious
about the problem you posted.

  Was there anything useful in the profiles?
>

Unfortunately the profiles didn't help me much. I think we are collecting
the profiles from an active volume, so they contain a lot of information
that does not pertain to dd, which makes it difficult to isolate dd's
contribution. So I went through your post again and found something I
didn't pay much attention to earlier, i.e. oflag=sync, so I did my own
tests on my setup with FUSE and sent that reply.


>
> Pat
>
>
>
> On 05/10/2017 12:15 PM, Pranith Kumar Karampuri wrote:
>
> Okay, good. At least this validates my doubts. Handling O_SYNC in gluster
> NFS and fuse is a bit different.
> When an application opens a file with O_SYNC on a fuse mount, each write
> syscall has to be written to disk as part of the syscall, whereas in the
> case of NFS there is no concept of open. NFS performs the write through a
> handle saying it needs to be a synchronous write, so the write() syscall
> is performed first and then it performs fsync(); a write on an fd with
> O_SYNC thus becomes write+fsync. I suspect that when multiple threads do
> this write+fsync() operation on the same file, multiple writes are
> batched together to be written to disk, which is my guess for why the
> throughput on the disk increases.
>
> Does it answer your doubts?
>
> On Wed, May 10, 2017 at 9:35 PM, Pat Haley  wrote:
>
>>
>> Without the oflag=sync and only a single test of each, the FUSE is going
>> faster than NFS:
>>
>> FUSE:
>> mseas-data2(dri_nascar)% dd if=/dev/zero count=4096 bs=1048576
>> of=zeros.txt conv=sync
>> 4096+0 records in
>> 4096+0 records out
>> 4294967296 bytes (4.3 GB) copied, 7.46961 s, 575 MB/s
>>
>>
>> NFS
>> mseas-data2(HYCOM)% dd if=/dev/zero count=4096 bs=1048576 of=zeros.txt
>> conv=sync
>> 4096+0 records in
>> 4096+0 records out
>> 4294967296 bytes (4.3 GB) copied, 11.4264 s, 376 MB/s
>>
>>
>>
>> On 05/10/2017 11:53 AM, Pranith Kumar Karampuri wrote:
>>
>> Could you let me know the speed without oflag=sync on both the mounts? No
>> need to collect profiles.
>>
>> On Wed, May 10, 2017 at 9:17 PM, Pat Haley  wrote:
>>
>>>
>>> Here is what I see now:
>>>
>>> [root@mseas-data2 ~]# gluster volume info
>>>
>>> Volume Name: data-volume
>>> Type: Distribute
>>> Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
>>> Status: Started
>>> Number of Bricks: 2
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: mseas-data2:/mnt/brick1
>>> Brick2: mseas-data2:/mnt/brick2
>>> Options Reconfigured:
>>> diagnostics.count-fop-hits: on
>>> diagnostics.latency-measurement: on
>>> nfs.exports-auth-enable: on
>>> diagnostics.brick-sys-log-level: WARNING
>>> performance.readdir-ahead: on
>>> nfs.disable: on
>>> nfs.export-volumes: off
>>>
>>>
>>>
>>> On 05/10/2017 11:44 AM, Pranith Kumar Karampuri wrote:
>>>
>>> Is this the volume info you have?
>>>
>>> > [root@mseas-data2 ~]# gluster volume info
>>> > Volume Name: data-volume
>>> > Type: Distribute

Re: [Gluster-users] Slow write times to gluster disk

2017-05-10 Thread Pat Haley


Hi Pranith,

Not entirely sure (this isn't my area of expertise).  I'll run your 
answer by some other people who are more familiar with this.


I am also uncertain about how to interpret the results when we also add 
the dd tests writing to the /home area (no gluster, still on the same 
machine)


 * dd test without oflag=sync (rough average of multiple tests)
 o gluster w/ fuse mount : 570 Mb/s
 o gluster w/ nfs mount:  390 Mb/s
 o nfs (no gluster):  1.2 Gb/s
 * dd test with oflag=sync (rough average of multiple tests)
 o gluster w/ fuse mount:  5 Mb/s
 o gluster w/ nfs mount:  200 Mb/s
 o nfs (no gluster): 20 Mb/s

Given that the non-gluster area is a RAID-6 of 4 disks while each brick 
of the gluster area is a RAID-6 of 32 disks, I would naively expect the 
writes to the gluster area to be roughly 8x faster than to the non-gluster.


I still think we have a speed issue, I can't tell if fuse vs nfs is part 
of the problem.  Was there anything useful in the profiles?


Pat


On 05/10/2017 12:15 PM, Pranith Kumar Karampuri wrote:
Okay, good. At least this validates my doubts. Handling O_SYNC in 
gluster NFS and fuse is a bit different.
When an application opens a file with O_SYNC on a fuse mount, each 
write syscall has to be written to disk as part of the syscall, 
whereas in the case of NFS there is no concept of open. NFS performs 
the write through a handle saying it needs to be a synchronous write, 
so the write() syscall is performed first and then it performs 
fsync(); a write on an fd with O_SYNC thus becomes write+fsync. I 
suspect that when multiple threads do this write+fsync() operation on 
the same file, multiple writes are batched together to be written to 
disk, which is my guess for why the throughput on the disk increases.


Does it answer your doubts?

On Wed, May 10, 2017 at 9:35 PM, Pat Haley wrote:



Without the oflag=sync and only a single test of each, the FUSE is
going faster than NFS:

FUSE:
mseas-data2(dri_nascar)% dd if=/dev/zero count=4096 bs=1048576
of=zeros.txt conv=sync
4096+0 records in
4096+0 records out
4294967296 bytes (4.3 GB) copied, 7.46961 s, 575 MB/s


NFS
mseas-data2(HYCOM)% dd if=/dev/zero count=4096 bs=1048576
of=zeros.txt conv=sync
4096+0 records in
4096+0 records out
4294967296 bytes (4.3 GB) copied, 11.4264 s, 376 MB/s



On 05/10/2017 11:53 AM, Pranith Kumar Karampuri wrote:

Could you let me know the speed without oflag=sync on both the
mounts? No need to collect profiles.

On Wed, May 10, 2017 at 9:17 PM, Pat Haley wrote:


Here is what I see now:

[root@mseas-data2 ~]# gluster volume info

Volume Name: data-volume
Type: Distribute
Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: mseas-data2:/mnt/brick1
Brick2: mseas-data2:/mnt/brick2
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
nfs.exports-auth-enable: on
diagnostics.brick-sys-log-level: WARNING
performance.readdir-ahead: on
nfs.disable: on
nfs.export-volumes: off



On 05/10/2017 11:44 AM, Pranith Kumar Karampuri wrote:

Is this the volume info you have?

> [root@mseas-data2 ~]# gluster volume info
>
> Volume Name: data-volume
> Type: Distribute
> Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
> Status: Started
> Number of Bricks: 2
> Transport-type: tcp
> Bricks:
> Brick1: mseas-data2:/mnt/brick1
> Brick2: mseas-data2:/mnt/brick2
> Options Reconfigured:
> performance.readdir-ahead: on
> nfs.disable: on
> nfs.export-volumes: off
​I copied this from old thread from 2016. This is distribute
volume. Did you change any of the options in between?


-- 


-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley  Email:pha...@mit.edu 

Center for Ocean Engineering   Phone:  (617) 253-6824
Dept. of Mechanical EngineeringFax:(617) 253-8125
MIT, Room 5-213http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA  02139-4301

-- 
Pranith
-- 


-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley  Email:pha...@mit.edu 

Center for Ocean Engineering   Phone:  (617) 253-6824
Dept. of Mechanical EngineeringFax:(617) 253-8125
MIT, Room 5-213http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA  02139-4301

--
Pranith

--


Re: [Gluster-users] Slow write times to gluster disk

2017-05-10 Thread Pranith Kumar Karampuri
Okay, good. At least this validates my doubts. Handling O_SYNC in gluster
NFS and fuse is a bit different.
When an application opens a file with O_SYNC on a fuse mount, each write
syscall has to be written to disk as part of the syscall, whereas in the
case of NFS there is no concept of open. NFS performs the write through a
handle saying it needs to be a synchronous write, so the write() syscall is
performed first and then it performs fsync(); a write on an fd with O_SYNC
thus becomes write+fsync. I suspect that when multiple threads do this
write+fsync() operation on the same file, multiple writes are batched
together to be written to disk, which is my guess for why the throughput on
the disk increases.

Does it answer your doubts?
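The two behaviours can be approximated from the shell with dd, since `oflag=sync` opens the output with O_SYNC (per-write sync, the FUSE case) while `conv=fsync` does buffered writes plus a single fsync() at the end (closer to the NFS write+fsync pattern). This is only a sketch: the temp directory stands in for a gluster mount point, and the sizes are arbitrary.

```shell
# Sketch: per-write O_SYNC vs. buffered-write+fsync, using dd.
# TESTDIR is a stand-in for a gluster FUSE mount point (assumption).
TESTDIR=$(mktemp -d)

# oflag=sync opens the output file with O_SYNC: every write() must reach
# stable storage before the syscall returns (the FUSE-mount behaviour).
dd if=/dev/zero of="$TESTDIR/o_sync" bs=1M count=16 oflag=sync 2>/dev/null

# conv=fsync issues buffered writes and one fsync() at the end, closer to
# the write-then-fsync pattern described for gluster NFS above.
dd if=/dev/zero of="$TESTDIR/fsync" bs=1M count=16 conv=fsync 2>/dev/null

SYNC_SIZE=$(stat -c %s "$TESTDIR/o_sync")
FSYNC_SIZE=$(stat -c %s "$TESTDIR/fsync")
echo "O_SYNC file: $SYNC_SIZE bytes; fsync file: $FSYNC_SIZE bytes"
rm -rf "$TESTDIR"
```

On a real gluster FUSE mount the oflag=sync variant is the one that pays a round trip to disk per write, which is why the multi-threaded NFS write+fsync pattern can show higher aggregate throughput.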

On Wed, May 10, 2017 at 9:35 PM, Pat Haley  wrote:

>
> Without the oflag=sync and only a single test of each, the FUSE is going
> faster than NFS:
>
> FUSE:
> mseas-data2(dri_nascar)% dd if=/dev/zero count=4096 bs=1048576
> of=zeros.txt conv=sync
> 4096+0 records in
> 4096+0 records out
> 4294967296 bytes (4.3 GB) copied, 7.46961 s, 575 MB/s
>
>
> NFS
> mseas-data2(HYCOM)% dd if=/dev/zero count=4096 bs=1048576 of=zeros.txt
> conv=sync
> 4096+0 records in
> 4096+0 records out
> 4294967296 bytes (4.3 GB) copied, 11.4264 s, 376 MB/s
>
>
>
> On 05/10/2017 11:53 AM, Pranith Kumar Karampuri wrote:
>
> Could you let me know the speed without oflag=sync on both the mounts? No
> need to collect profiles.
>
> On Wed, May 10, 2017 at 9:17 PM, Pat Haley  wrote:
>
>>
>> Here is what I see now:
>>
>> [root@mseas-data2 ~]# gluster volume info
>>
>> Volume Name: data-volume
>> Type: Distribute
>> Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
>> Status: Started
>> Number of Bricks: 2
>> Transport-type: tcp
>> Bricks:
>> Brick1: mseas-data2:/mnt/brick1
>> Brick2: mseas-data2:/mnt/brick2
>> Options Reconfigured:
>> diagnostics.count-fop-hits: on
>> diagnostics.latency-measurement: on
>> nfs.exports-auth-enable: on
>> diagnostics.brick-sys-log-level: WARNING
>> performance.readdir-ahead: on
>> nfs.disable: on
>> nfs.export-volumes: off
>>
>>
>>
>> On 05/10/2017 11:44 AM, Pranith Kumar Karampuri wrote:
>>
>> Is this the volume info you have?
>>
>> > [root@mseas-data2 ~]# gluster volume info
>> >
>> > Volume Name: data-volume
>> > Type: Distribute
>> > Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
>> > Status: Started
>> > Number of Bricks: 2
>> > Transport-type: tcp
>> > Bricks:
>> > Brick1: mseas-data2:/mnt/brick1
>> > Brick2: mseas-data2:/mnt/brick2
>> > Options Reconfigured:
>> > performance.readdir-ahead: on
>> > nfs.disable: on
>> > nfs.export-volumes: off
>>
>> I copied this from an old thread from 2016. This is a distribute volume. Did
>> you change any of the options in between?
>>
>> --
>>
>> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>> Pat Haley  Email:  pha...@mit.edu
>> Center for Ocean Engineering   Phone:  (617) 253-6824
>> Dept. of Mechanical EngineeringFax:(617) 253-8125
>> MIT, Room 5-213http://web.mit.edu/phaley/www/
>> 77 Massachusetts Avenue
>> Cambridge, MA  02139-4301
>>
>> --
> Pranith
>
> --
>
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
> Pat Haley  Email:  pha...@mit.edu
> Center for Ocean Engineering   Phone:  (617) 253-6824
> Dept. of Mechanical EngineeringFax:(617) 253-8125
> MIT, Room 5-213http://web.mit.edu/phaley/www/
> 77 Massachusetts Avenue
> Cambridge, MA  02139-4301
>
>


-- 
Pranith
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Slow write times to gluster disk

2017-05-10 Thread Pat Haley


Without the oflag=sync and only a single test of each, the FUSE is going 
faster than NFS:


FUSE:
mseas-data2(dri_nascar)% dd if=/dev/zero count=4096 bs=1048576 
of=zeros.txt conv=sync

4096+0 records in
4096+0 records out
4294967296 bytes (4.3 GB) copied, 7.46961 s, 575 MB/s


NFS
mseas-data2(HYCOM)% dd if=/dev/zero count=4096 bs=1048576 of=zeros.txt 
conv=sync

4096+0 records in
4096+0 records out
4294967296 bytes (4.3 GB) copied, 11.4264 s, 376 MB/s


On 05/10/2017 11:53 AM, Pranith Kumar Karampuri wrote:
Could you let me know the speed without oflag=sync on both the mounts? 
No need to collect profiles.


On Wed, May 10, 2017 at 9:17 PM, Pat Haley > wrote:



Here is what I see now:

[root@mseas-data2 ~]# gluster volume info

Volume Name: data-volume
Type: Distribute
Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: mseas-data2:/mnt/brick1
Brick2: mseas-data2:/mnt/brick2
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
nfs.exports-auth-enable: on
diagnostics.brick-sys-log-level: WARNING
performance.readdir-ahead: on
nfs.disable: on
nfs.export-volumes: off



On 05/10/2017 11:44 AM, Pranith Kumar Karampuri wrote:

Is this the volume info you have?

> [root@mseas-data2 ~]# gluster volume info
>
> Volume Name: data-volume
> Type: Distribute
> Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
> Status: Started
> Number of Bricks: 2
> Transport-type: tcp
> Bricks:
> Brick1: mseas-data2:/mnt/brick1
> Brick2: mseas-data2:/mnt/brick2
> Options Reconfigured:
> performance.readdir-ahead: on
> nfs.disable: on
> nfs.export-volumes: off
I copied this from an old thread from 2016. This is a distribute
volume. Did you change any of the options in between?


-- 


-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley  Email:pha...@mit.edu 

Center for Ocean Engineering   Phone:  (617) 253-6824
Dept. of Mechanical EngineeringFax:(617) 253-8125
MIT, Room 5-213http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA  02139-4301

--
Pranith

--

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley  Email:  pha...@mit.edu
Center for Ocean Engineering   Phone:  (617) 253-6824
Dept. of Mechanical EngineeringFax:(617) 253-8125
MIT, Room 5-213http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA  02139-4301

Re: [Gluster-users] Slow write times to gluster disk

2017-05-10 Thread Pranith Kumar Karampuri
Could you let me know the speed without oflag=sync on both the mounts? No
need to collect profiles.

On Wed, May 10, 2017 at 9:17 PM, Pat Haley  wrote:

>
> Here is what I see now:
>
> [root@mseas-data2 ~]# gluster volume info
>
> Volume Name: data-volume
> Type: Distribute
> Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
> Status: Started
> Number of Bricks: 2
> Transport-type: tcp
> Bricks:
> Brick1: mseas-data2:/mnt/brick1
> Brick2: mseas-data2:/mnt/brick2
> Options Reconfigured:
> diagnostics.count-fop-hits: on
> diagnostics.latency-measurement: on
> nfs.exports-auth-enable: on
> diagnostics.brick-sys-log-level: WARNING
> performance.readdir-ahead: on
> nfs.disable: on
> nfs.export-volumes: off
>
>
>
> On 05/10/2017 11:44 AM, Pranith Kumar Karampuri wrote:
>
> Is this the volume info you have?
>
> > [root@mseas-data2 ~]# gluster volume info
> >
> > Volume Name: data-volume
> > Type: Distribute
> > Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
> > Status: Started
> > Number of Bricks: 2
> > Transport-type: tcp
> > Bricks:
> > Brick1: mseas-data2:/mnt/brick1
> > Brick2: mseas-data2:/mnt/brick2
> > Options Reconfigured:
> > performance.readdir-ahead: on
> > nfs.disable: on
> > nfs.export-volumes: off
>
> I copied this from an old thread from 2016. This is a distribute volume. Did
> you change any of the options in between?
>
>
> --
>
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
> Pat Haley  Email:  pha...@mit.edu
> Center for Ocean Engineering   Phone:  (617) 253-6824
> Dept. of Mechanical EngineeringFax:(617) 253-8125
> MIT, Room 5-213http://web.mit.edu/phaley/www/
> 77 Massachusetts Avenue
> Cambridge, MA  02139-4301
>
>


-- 
Pranith

Re: [Gluster-users] Slow write times to gluster disk

2017-05-10 Thread Pat Haley


Here is what I see now:

[root@mseas-data2 ~]# gluster volume info

Volume Name: data-volume
Type: Distribute
Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: mseas-data2:/mnt/brick1
Brick2: mseas-data2:/mnt/brick2
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
nfs.exports-auth-enable: on
diagnostics.brick-sys-log-level: WARNING
performance.readdir-ahead: on
nfs.disable: on
nfs.export-volumes: off



On 05/10/2017 11:44 AM, Pranith Kumar Karampuri wrote:

Is this the volume info you have?

> [root@mseas-data2 ~]# gluster volume info
>
> Volume Name: data-volume
> Type: Distribute
> Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
> Status: Started
> Number of Bricks: 2
> Transport-type: tcp
> Bricks:
> Brick1: mseas-data2:/mnt/brick1
> Brick2: mseas-data2:/mnt/brick2
> Options Reconfigured:
> performance.readdir-ahead: on
> nfs.disable: on
> nfs.export-volumes: off
I copied this from an old thread from 2016. This is a distribute volume.
Did you change any of the options in between?


--

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley  Email:  pha...@mit.edu
Center for Ocean Engineering   Phone:  (617) 253-6824
Dept. of Mechanical EngineeringFax:(617) 253-8125
MIT, Room 5-213http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA  02139-4301


Re: [Gluster-users] Slow write times to gluster disk

2017-05-10 Thread Pranith Kumar Karampuri
Is this the volume info you have?

> [root@mseas-data2 ~]# gluster volume info
>
> Volume Name: data-volume
> Type: Distribute
> Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
> Status: Started
> Number of Bricks: 2
> Transport-type: tcp
> Bricks:
> Brick1: mseas-data2:/mnt/brick1
> Brick2: mseas-data2:/mnt/brick2
> Options Reconfigured:
> performance.readdir-ahead: on
> nfs.disable: on
> nfs.export-volumes: off

I copied this from an old thread from 2016. This is a distribute volume. Did
you change any of the options in between?

Re: [Gluster-users] Slow write times to gluster disk

2017-05-10 Thread Pat Haley


Hi,

We finally managed to do the dd tests for an NFS-mounted gluster file 
system.  The profile results during that test are in


http://mseas.mit.edu/download/phaley/GlusterUsers/profile_gluster_nfs_test

The summary of the dd tests are

 * writing to gluster disk mounted with fuse:  5 Mb/s
 * writing to gluster disk mounted with nfs:  200 Mb/s

Pat


On 05/05/2017 08:11 PM, Pat Haley wrote:


Hi,

We redid the dd tests (this time using conv=sync oflag=sync to avoid 
caching questions).  The profile results are in


http://mseas.mit.edu/download/phaley/GlusterUsers/profile_gluster_fuse_test


On 05/05/2017 12:47 PM, Ravishankar N wrote:

On 05/05/2017 08:42 PM, Pat Haley wrote:


Hi Pranith,

I presume you are asking for some version of the profile data that 
just shows the dd test (or a repeat of the dd test).  If yes, how do 
I extract just that data?
Yes, that is what he is asking for. Just clear the existing profile 
info using `gluster volume profile volname clear` and run the dd test 
once. Then when you run profile info again, it should just give you 
the stats for the dd test.


Thanks

Pat



On 05/05/2017 10:58 AM, Pranith Kumar Karampuri wrote:

hi Pat,
  Let us concentrate on the performance numbers part for now. 
We will look at the permissions one after this?


As per the profile info, only 2.6% of the work-load is writes. 
There are too many Lookups.


Would it be possible to get the data for just the dd test you were 
doing earlier?



On Fri, May 5, 2017 at 8:14 PM, Pat Haley > wrote:



Hi Pranith & Ravi,

A couple of quick questions

We have profile turned on. Are there specific queries we should
make that would help debug our configuration?  (The default
profile info was previously sent in
http://lists.gluster.org/pipermail/gluster-users/2017-May/030840.html

but I'm not sure if that is what you were looking for.)

We also started to do a test on serving gluster over NFS.  We
rediscovered an issue we previously reported (
http://lists.gluster.org/pipermail/gluster-users/2016-September/028289.html


) in that the NFS mounted version was ignoring the group write
permissions.  What specific information would be useful in
debugging this?

Thanks

Pat



On 04/14/2017 03:01 AM, Ravishankar N wrote:

On 04/14/2017 12:20 PM, Pranith Kumar Karampuri wrote:



On Sat, Apr 8, 2017 at 10:28 AM, Ravishankar N
> wrote:

Hi Pat,

I'm assuming you are using gluster native (fuse mount).
If it helps, you could try mounting it via gluster NFS
(gnfs) and then see if there is an improvement in speed.
Fuse mounts are slower than gnfs mounts but you get the
benefit of avoiding a single point of failure. Unlike
fuse mounts, if the gluster node containing the gnfs
server goes down, all mounts done using that node will
fail). For fuse mounts, you could try tweaking the
write-behind xlator settings to see if it helps. See the
performance.write-behind and
performance.write-behind-window-size options in `gluster
volume set help`. Of course, even for gnfs mounts, you
can achieve fail-over by using CTDB.
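As a concrete sketch of the tuning Ravi suggests, the snippet below prints (dry-run, via echo, since it needs a live gluster trusted pool) the `gluster volume set` commands for the write-behind xlator; the volume name and the 4MB window size are illustrative assumptions, and `gluster volume set help` shows the documented defaults.

```shell
# Dry-run sketch of write-behind tuning for a FUSE-mounted volume.
# VOLNAME and the 4MB window size are illustrative assumptions; run the
# printed commands on a node in the trusted storage pool.
VOLNAME=data-volume

tune_write_behind() {
    # Enable the write-behind translator (it is on by default).
    echo "gluster volume set $1 performance.write-behind on"
    # Grow the per-file write-behind buffer from the default.
    echo "gluster volume set $1 performance.write-behind-window-size 4MB"
    # Check option descriptions and defaults before committing changes.
    echo "gluster volume set help"
}

tune_write_behind "$VOLNAME"
```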


Ravi,
  Do you have any data that suggests fuse mounts are
slower than gNFS servers?

I have heard anecdotal evidence time and again on the ML and
IRC, which is why I wanted to compare it with NFS numbers on
his setup.


Pat,
  I see that I am late to the thread, but do you happen
to have "profile info" of the workload?

You can follow

https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Monitoring%20Workload/


to get the information.

Yeah, Let's see if profile info shows up anything interesting.
-Ravi



Thanks,
Ravi


On 04/08/2017 12:07 AM, Pat Haley wrote:


Hi,

We noticed a dramatic slowness when writing to a gluster
disk when compared to writing to an NFS disk.
Specifically when using dd (data duplicator) to write a
4.3 GB file of zeros:

  * on NFS disk (/home): 9.5 Gb/s
  * on gluster disk (/gdata): 508 Mb/s

The gluster disk is 2 bricks joined together, no
replication or anything else. The hardware is
(literally) the same:

  * one server with 70 hard disks  and a hardware RAID card.
  * 4 disks in a RAID-6 group (the NFS disk)
  * 32 disks in a RAID-6 group (the max allowed by the
card, /mnt/brick1)
  * 32 disks in another RAID-6 group (/mnt/brick2)
  

Re: [Gluster-users] Slow write times to gluster disk

2017-05-05 Thread Pat Haley


Hi,

We redid the dd tests (this time using conv=sync oflag=sync to avoid 
caching questions).  The profile results are in


http://mseas.mit.edu/download/phaley/GlusterUsers/profile_gluster_fuse_test


On 05/05/2017 12:47 PM, Ravishankar N wrote:

On 05/05/2017 08:42 PM, Pat Haley wrote:


Hi Pranith,

I presume you are asking for some version of the profile data that 
just shows the dd test (or a repeat of the dd test).  If yes, how do 
I extract just that data?
Yes, that is what he is asking for. Just clear the existing profile 
info using `gluster volume profile volname clear` and run the dd test 
once. Then when you run profile info again, it should just give you 
the stats for the dd test.


Thanks

Pat



On 05/05/2017 10:58 AM, Pranith Kumar Karampuri wrote:

hi Pat,
  Let us concentrate on the performance numbers part for now. We 
will look at the permissions one after this?


As per the profile info, only 2.6% of the work-load is writes. There 
are too many Lookups.


Would it be possible to get the data for just the dd test you were 
doing earlier?



On Fri, May 5, 2017 at 8:14 PM, Pat Haley > wrote:



Hi Pranith & Ravi,

A couple of quick questions

We have profile turned on. Are there specific queries we should
make that would help debug our configuration?  (The default
profile info was previously sent in
http://lists.gluster.org/pipermail/gluster-users/2017-May/030840.html

but I'm not sure if that is what you were looking for.)

We also started to do a test on serving gluster over NFS.  We
rediscovered an issue we previously reported (
http://lists.gluster.org/pipermail/gluster-users/2016-September/028289.html


) in that the NFS mounted version was ignoring the group write
permissions.  What specific information would be useful in
debugging this?

Thanks

Pat



On 04/14/2017 03:01 AM, Ravishankar N wrote:

On 04/14/2017 12:20 PM, Pranith Kumar Karampuri wrote:



On Sat, Apr 8, 2017 at 10:28 AM, Ravishankar N
> wrote:

Hi Pat,

I'm assuming you are using gluster native (fuse mount). If
it helps, you could try mounting it via gluster NFS (gnfs)
and then see if there is an improvement in speed. Fuse
mounts are slower than gnfs mounts but you get the benefit
of avoiding a single point of failure. Unlike fuse mounts,
if the gluster node containing the gnfs server goes down,
all mounts done using that node will fail). For fuse
mounts, you could try tweaking the write-behind xlator
settings to see if it helps. See the
performance.write-behind and
performance.write-behind-window-size options in `gluster
volume set help`. Of course, even for gnfs mounts, you can
achieve fail-over by using CTDB.


Ravi,
  Do you have any data that suggests fuse mounts are
slower than gNFS servers?

I have heard anecdotal evidence time and again on the ML and
IRC, which is why I wanted to compare it with NFS numbers on
his setup.


Pat,
  I see that I am late to the thread, but do you happen to
have "profile info" of the workload?

You can follow

https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Monitoring%20Workload/


to get the information.

Yeah, Let's see if profile info shows up anything interesting.
-Ravi



Thanks,
Ravi


On 04/08/2017 12:07 AM, Pat Haley wrote:


Hi,

We noticed a dramatic slowness when writing to a gluster
disk when compared to writing to an NFS disk.
Specifically when using dd (data duplicator) to write a
4.3 GB file of zeros:

  * on NFS disk (/home): 9.5 Gb/s
  * on gluster disk (/gdata): 508 Mb/s

The gluster disk is 2 bricks joined together, no
replication or anything else. The hardware is (literally)
the same:

  * one server with 70 hard disks  and a hardware RAID card.
  * 4 disks in a RAID-6 group (the NFS disk)
  * 32 disks in a RAID-6 group (the max allowed by the
card, /mnt/brick1)
  * 32 disks in another RAID-6 group (/mnt/brick2)
  * 2 hot spare

Some additional information and more tests results (after
changing the log level):

glusterfs 3.7.11 built on Apr 27 2016 14:09:22
CentOS release 6.8 (Final)
RAID bus controller: LSI Logic / Symbios Logic MegaRAID
SAS-3 3108 [Invader] (rev 02)



*Create the file to /gdata (gluster)*
[root@mseas-data2 gdata]# dd 

Re: [Gluster-users] Slow write times to gluster disk

2017-05-05 Thread Ravishankar N

On 05/05/2017 08:42 PM, Pat Haley wrote:


Hi Pranith,

I presume you are asking for some version of the profile data that 
just shows the dd test (or a repeat of the dd test).  If yes, how do I 
extract just that data?
Yes, that is what he is asking for. Just clear the existing profile info 
using `gluster volume profile volname clear` and run the dd test once. 
Then when you run profile info again, it should just give you the stats 
for the dd test.
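The clear-run-report sequence above can be sketched as a small script (dry-run, via echo, since it needs a live cluster; the volume name, mount path, and dd parameters are illustrative assumptions):

```shell
# Dry-run sketch: isolate volume profile stats to a single dd run.
# VOLNAME, the /gdata mount path, and the dd sizes are assumptions.
VOLNAME=data-volume

profile_dd_only() {
    # 1. Reset the accumulated counters so the next report covers only dd.
    echo "gluster volume profile $1 clear"
    # 2. Run the workload once, from a client mount.
    echo "dd if=/dev/zero of=/gdata/zeros.txt bs=1M count=4096 oflag=sync"
    # 3. Read back per-fop counts and latencies for just that run.
    echo "gluster volume profile $1 info"
}

profile_dd_only "$VOLNAME"
```

Clearing first matters because `profile info` otherwise reports counters accumulated since profiling was started, which buries the dd test in unrelated Lookup traffic.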


Thanks

Pat



On 05/05/2017 10:58 AM, Pranith Kumar Karampuri wrote:

hi Pat,
  Let us concentrate on the performance numbers part for now. We 
will look at the permissions one after this?


As per the profile info, only 2.6% of the work-load is writes. There 
are too many Lookups.


Would it be possible to get the data for just the dd test you were 
doing earlier?



On Fri, May 5, 2017 at 8:14 PM, Pat Haley > wrote:



Hi Pranith & Ravi,

A couple of quick questions

We have profile turned on. Are there specific queries we should
make that would help debug our configuration? (The default
profile info was previously sent in
http://lists.gluster.org/pipermail/gluster-users/2017-May/030840.html

but I'm not sure if that is what you were looking for.)

We also started to do a test on serving gluster over NFS.  We
rediscovered an issue we previously reported (
http://lists.gluster.org/pipermail/gluster-users/2016-September/028289.html


) in that the NFS mounted version was ignoring the group write
permissions.  What specific information would be useful in
debugging this?

Thanks

Pat



On 04/14/2017 03:01 AM, Ravishankar N wrote:

On 04/14/2017 12:20 PM, Pranith Kumar Karampuri wrote:



On Sat, Apr 8, 2017 at 10:28 AM, Ravishankar N
> wrote:

Hi Pat,

I'm assuming you are using gluster native (fuse mount). If
it helps, you could try mounting it via gluster NFS (gnfs)
and then see if there is an improvement in speed. Fuse
mounts are slower than gnfs mounts but you get the benefit
of avoiding a single point of failure. Unlike fuse mounts,
if the gluster node containing the gnfs server goes down,
all mounts done using that node will fail). For fuse
mounts, you could try tweaking the write-behind xlator
settings to see if it helps. See the
performance.write-behind and
performance.write-behind-window-size options in `gluster
volume set help`. Of course, even for gnfs mounts, you can
achieve fail-over by using CTDB.


Ravi,
  Do you have any data that suggests fuse mounts are slower
than gNFS servers?

I have heard anecdotal evidence time and again on the ML and
IRC, which is why I wanted to compare it with NFS numbers on his
setup.


Pat,
  I see that I am late to the thread, but do you happen to
have "profile info" of the workload?

You can follow

https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Monitoring%20Workload/


to get the information.

Yeah, Let's see if profile info shows up anything interesting.
-Ravi



Thanks,
Ravi


On 04/08/2017 12:07 AM, Pat Haley wrote:


Hi,

We noticed a dramatic slowness when writing to a gluster
disk when compared to writing to an NFS disk. Specifically
when using dd (data duplicator) to write a 4.3 GB file of
zeros:

  * on NFS disk (/home): 9.5 Gb/s
  * on gluster disk (/gdata): 508 Mb/s

The gluster disk is 2 bricks joined together, no
replication or anything else. The hardware is (literally)
the same:

  * one server with 70 hard disks  and a hardware RAID card.
  * 4 disks in a RAID-6 group (the NFS disk)
  * 32 disks in a RAID-6 group (the max allowed by the
card, /mnt/brick1)
  * 32 disks in another RAID-6 group (/mnt/brick2)
  * 2 hot spare

Some additional information and more tests results (after
changing the log level):

glusterfs 3.7.11 built on Apr 27 2016 14:09:22
CentOS release 6.8 (Final)
RAID bus controller: LSI Logic / Symbios Logic MegaRAID
SAS-3 3108 [Invader] (rev 02)



*Create the file to /gdata (gluster)*
[root@mseas-data2 gdata]# dd if=/dev/zero of=/gdata/zero1
bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 1.91876 s, *546 MB/s*

*Create the file to /home (ext4)*
[root@mseas-data2 gdata]# dd 

Re: [Gluster-users] Slow write times to gluster disk

2017-05-05 Thread Pat Haley


Hi Pranith,

I presume you are asking for some version of the profile data that just 
shows the dd test (or a repeat of the dd test).  If yes, how do I 
extract just that data?


Thanks

Pat



On 05/05/2017 10:58 AM, Pranith Kumar Karampuri wrote:

hi Pat,
  Let us concentrate on the performance numbers part for now. We 
will look at the permissions one after this?


As per the profile info, only 2.6% of the work-load is writes. There 
are too many Lookups.


Would it be possible to get the data for just the dd test you were 
doing earlier?



On Fri, May 5, 2017 at 8:14 PM, Pat Haley > wrote:



Hi Pranith & Ravi,

A couple of quick questions

We have profile turned on. Are there specific queries we should
make that would help debug our configuration?  (The default
profile info was previously sent in
http://lists.gluster.org/pipermail/gluster-users/2017-May/030840.html

but I'm not sure if that is what you were looking for.)

We also started to do a test on serving gluster over NFS. We
rediscovered an issue we previously reported (
http://lists.gluster.org/pipermail/gluster-users/2016-September/028289.html


) in that the NFS mounted version was ignoring the group write
permissions.  What specific information would be useful in
debugging this?

Thanks

Pat



On 04/14/2017 03:01 AM, Ravishankar N wrote:

On 04/14/2017 12:20 PM, Pranith Kumar Karampuri wrote:



On Sat, Apr 8, 2017 at 10:28 AM, Ravishankar N
> wrote:

Hi Pat,

I'm assuming you are using gluster native (fuse mount). If
it helps, you could try mounting it via gluster NFS (gnfs)
and then see if there is an improvement in speed. Fuse
mounts are slower than gnfs mounts but you get the benefit
of avoiding a single point of failure. Unlike fuse mounts,
if the gluster node containing the gnfs server goes down,
all mounts done using that node will fail). For fuse mounts,
you could try tweaking the write-behind xlator settings to
see if it helps. See the performance.write-behind and
performance.write-behind-window-size options in `gluster
volume set help`. Of course, even for gnfs mounts, you can
achieve fail-over by using CTDB.


Ravi,
  Do you have any data that suggests fuse mounts are slower
than gNFS servers?

I have heard anecdotal evidence time and again on the ML and IRC,
which is why I wanted to compare it with NFS numbers on his setup.


Pat,
  I see that I am late to the thread, but do you happen to
have "profile info" of the workload?

You can follow

https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Monitoring%20Workload/


to get the information.

Yeah, Let's see if profile info shows up anything interesting.
-Ravi



Thanks,
Ravi


On 04/08/2017 12:07 AM, Pat Haley wrote:


Hi,

We noticed a dramatic slowness when writing to a gluster
disk when compared to writing to an NFS disk. Specifically
when using dd (data duplicator) to write a 4.3 GB file of
zeros:

  * on NFS disk (/home): 9.5 Gb/s
  * on gluster disk (/gdata): 508 Mb/s

The gluster disk is 2 bricks joined together, no replication
or anything else. The hardware is (literally) the same:

  * one server with 70 hard disks  and a hardware RAID card.
  * 4 disks in a RAID-6 group (the NFS disk)
  * 32 disks in a RAID-6 group (the max allowed by the
card, /mnt/brick1)
  * 32 disks in another RAID-6 group (/mnt/brick2)
  * 2 hot spare

Some additional information and more tests results (after
changing the log level):

glusterfs 3.7.11 built on Apr 27 2016 14:09:22
CentOS release 6.8 (Final)
RAID bus controller: LSI Logic / Symbios Logic MegaRAID
SAS-3 3108 [Invader] (rev 02)



*Create the file to /gdata (gluster)*
[root@mseas-data2 gdata]# dd if=/dev/zero of=/gdata/zero1
bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 1.91876 s, *546 MB/s*

*Create the file to /home (ext4)*
[root@mseas-data2 gdata]# dd if=/dev/zero of=/home/zero1
bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 0.686021 s, *1.5 GB/s -
*3 times as fast*


Copy from /gdata to /gdata (gluster to gluster)
*[root@mseas-data2 gdata]# dd 

Re: [Gluster-users] Slow write times to gluster disk

2017-05-05 Thread Pranith Kumar Karampuri
hi Pat,
  Let us concentrate on the performance numbers part for now. We will
look at the permissions one after this?

As per the profile info, only 2.6% of the work-load is writes. There are
too many Lookups.

Would it be possible to get the data for just the dd test you were doing
earlier?


On Fri, May 5, 2017 at 8:14 PM, Pat Haley  wrote:

>
> Hi Pranith & Ravi,
>
> A couple of quick questions
>
> We have profile turned on. Are there specific queries we should make that
> would help debug our configuration?  (The default profile info was
> previously sent in http://lists.gluster.org/pipermail/gluster-users/2017-
> May/030840.html but I'm not sure if that is what you were looking for.)
>
> We also started to do a test on serving gluster over NFS.  We rediscovered
> an issue we previously reported ( http://lists.gluster.org/
> pipermail/gluster-users/2016-September/028289.html ) in that the NFS
> mounted version was ignoring the group write permissions.  What specific
> information would be useful in debugging this?
>
> Thanks
>
> Pat
>
>
>
> On 04/14/2017 03:01 AM, Ravishankar N wrote:
>
> On 04/14/2017 12:20 PM, Pranith Kumar Karampuri wrote:
>
>
>
> On Sat, Apr 8, 2017 at 10:28 AM, Ravishankar N 
> wrote:
>
>> Hi Pat,
>>
>> I'm assuming you are using gluster native (fuse mount). If it helps, you
>> could try mounting it via gluster NFS (gnfs) and then see if there is an
>> improvement in speed. Fuse mounts are slower than gnfs mounts but you get
>> the benefit of avoiding a single point of failure. Unlike fuse mounts, if
>> the gluster node containing the gnfs server goes down, all mounts done
>> using that node will fail). For fuse mounts, you could try tweaking the
>> write-behind xlator settings to see if it helps. See the
>> performance.write-behind and performance.write-behind-window-size
>> options in `gluster volume set help`. Of course, even for gnfs mounts, you
>> can achieve fail-over by using CTDB.
>>
>
> Ravi,
>   Do you have any data that suggests fuse mounts are slower than gNFS
> servers?
>
> I have heard anecdotal evidence time and again on the ML and IRC, which is
> why I wanted to compare it with NFS numbers on his setup.
>
>
> Pat,
>   I see that I am late to the thread, but do you happen to have
> "profile info" of the workload?
>
> You can follow https://gluster.readthedocs.io/en/latest/Administrator%
> 20Guide/Monitoring%20Workload/ to get the information.
>
> Yeah, Let's see if profile info shows up anything interesting.
> -Ravi
>
>
>
>>
>> Thanks,
>> Ravi
>>
>>
>> On 04/08/2017 12:07 AM, Pat Haley wrote:
>>
>>
>> Hi,
>>
>> We noticed a dramatic slowness when writing to a gluster disk when
>> compared to writing to an NFS disk. Specifically when using dd (data
>> duplicator) to write a 4.3 GB file of zeros:
>>
>>- on NFS disk (/home): 9.5 Gb/s
>>- on gluster disk (/gdata): 508 Mb/s
>>
>> The gluster disk is 2 bricks joined together, no replication or anything
>> else. The hardware is (literally) the same:
>>
>>- one server with 70 hard disks  and a hardware RAID card.
>>- 4 disks in a RAID-6 group (the NFS disk)
>>- 32 disks in a RAID-6 group (the max allowed by the card,
>>/mnt/brick1)
>>- 32 disks in another RAID-6 group (/mnt/brick2)
>>- 2 hot spare
>>
>> Some additional information and more tests results (after changing the
>> log level):
>>
>> glusterfs 3.7.11 built on Apr 27 2016 14:09:22
>> CentOS release 6.8 (Final)
>> RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS-3 3108
>> [Invader] (rev 02)
>>
>>
>>
>> *Create the file to /gdata (gluster)*
>> [root@mseas-data2 gdata]# dd if=/dev/zero of=/gdata/zero1 bs=1M
>> count=1000
>> 1000+0 records in
>> 1000+0 records out
>> 1048576000 bytes (1.0 GB) copied, 1.91876 s, *546 MB/s*
>>
>> *Create the file to /home (ext4)*
>> [root@mseas-data2 gdata]# dd if=/dev/zero of=/home/zero1 bs=1M count=1000
>> 1000+0 records in
>> 1000+0 records out
>> 1048576000 bytes (1.0 GB) copied, 0.686021 s, *1.5 GB/s - *3 times as
>> fast
>>
>>
>>
>> * Copy from /gdata to /gdata (gluster to gluster) *[root@mseas-data2
>> gdata]# dd if=/gdata/zero1 of=/gdata/zero2
>> 2048000+0 records in
>> 2048000+0 records out
>> 1048576000 bytes (1.0 GB) copied, 101.052 s, *10.4 MB/s* - realllyyy
>> slooowww
>>
>>
>> *Copy from /gdata to /gdata* *2nd time (gluster to gluster)*
>> [root@mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
>> 2048000+0 records in
>> 2048000+0 records out
>> 1048576000 bytes (1.0 GB) copied, 92.4904 s, *11.3 MB/s* - realllyyy
>> slooowww again
>>
>>
>>
>> *Copy from /home to /home (ext4 to ext4)*
>> [root@mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero2
>> 2048000+0 records in
>> 2048000+0 records out
>> 1048576000 bytes (1.0 GB) copied, 3.53263 s, *297 MB/s *30 times as fast
>>
>>
>> *Copy from /home to /home (ext4 to ext4)*
>> [root@mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero3
>> 2048000+0 records in
>> 2048000+0 records out

Re: [Gluster-users] Slow write times to gluster disk

2017-05-05 Thread Pat Haley


Hi Pranith & Ravi,

A couple of quick questions

We have profile turned on. Are there specific queries we should make 
that would help debug our configuration?  (The default profile info was 
previously sent in 
http://lists.gluster.org/pipermail/gluster-users/2017-May/030840.html 
but I'm not sure if that is what you were looking for.)


We also started to do a test on serving gluster over NFS.  We 
rediscovered an issue we previously reported ( 
http://lists.gluster.org/pipermail/gluster-users/2016-September/028289.html 
) in that the NFS mounted version was ignoring the group write 
permissions.  What specific information would be useful in debugging this?


Thanks

Pat


On 04/14/2017 03:01 AM, Ravishankar N wrote:

On 04/14/2017 12:20 PM, Pranith Kumar Karampuri wrote:



On Sat, Apr 8, 2017 at 10:28 AM, Ravishankar N 
> wrote:


Hi Pat,

I'm assuming you are using gluster native (fuse mount). If it
helps, you could try mounting it via gluster NFS (gnfs) and then
see if there is an improvement in speed. Fuse mounts are slower
than gnfs mounts but you get the benefit of avoiding a single
point of failure. Unlike fuse mounts, if the gluster node
containing the gnfs server goes down, all mounts done using that
node will fail. For fuse mounts, you could try tweaking the
write-behind xlator settings to see if it helps. See the
performance.write-behind and performance.write-behind-window-size
options in `gluster volume set help`. Of course, even for gnfs
mounts, you can achieve fail-over by using CTDB.
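The write-behind tweaks mentioned above can be sketched as a few CLI commands (a sketch only: `data-volume` is the volume name used elsewhere in this thread, and the 4MB window size is purely an illustrative value, not a tested recommendation):

```shell
# List the write-behind tunables and their descriptions
gluster volume set help | grep -A3 write-behind

# Enable write-behind and enlarge its per-file flush window
# (4MB is an illustrative value, not a tested recommendation)
gluster volume set data-volume performance.write-behind on
gluster volume set data-volume performance.write-behind-window-size 4MB
```

These are cluster-admin commands, so they only make sense on a node that is part of the trusted pool.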


Ravi,
  Do you have any data that suggests fuse mounts are slower than 
gNFS servers?
I have heard anecdotal evidence time and again on the ML and IRC, 
which is why I wanted to compare it with NFS numbers on his setup.


Pat,
  I see that I am late to the thread, but do you happen to have 
"profile info" of the workload?


You can follow 
https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Monitoring%20Workload/ 
to get the information.
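Following that guide, collecting profile info boils down to a few commands (a sketch; `data-volume` is the volume name used elsewhere in this thread):

```shell
# Enable per-brick I/O statistics for the volume
gluster volume profile data-volume start

# ... run the workload to be measured (e.g. the dd tests) ...

# Dump cumulative FOP latencies and call counts, per brick
gluster volume profile data-volume info

# Turn profiling back off when done
gluster volume profile data-volume stop
```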

Yeah, let's see if profile info turns up anything interesting.
-Ravi



Thanks,
Ravi


On 04/08/2017 12:07 AM, Pat Haley wrote:


Hi,

We noticed a dramatic slowness when writing to a gluster disk
when compared to writing to an NFS disk. Specifically when using
dd (data duplicator) to write a 4.3 GB file of zeros:

  * on NFS disk (/home): 9.5 Gb/s
  * on gluster disk (/gdata): 508 Mb/s

The gluster disk is 2 bricks joined together, no replication or
anything else. The hardware is (literally) the same:

  * one server with 70 hard disks  and a hardware RAID card.
  * 4 disks in a RAID-6 group (the NFS disk)
  * 32 disks in a RAID-6 group (the max allowed by the card,
/mnt/brick1)
  * 32 disks in another RAID-6 group (/mnt/brick2)
  * 2 hot spares

Some additional information and more tests results (after
changing the log level):

glusterfs 3.7.11 built on Apr 27 2016 14:09:22
CentOS release 6.8 (Final)
RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS-3
3108 [Invader] (rev 02)



*Create the file to /gdata (gluster)*
[root@mseas-data2 gdata]# dd if=/dev/zero of=/gdata/zero1 bs=1M
count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 1.91876 s, *546 MB/s*

*Create the file to /home (ext4)*
[root@mseas-data2 gdata]# dd if=/dev/zero of=/home/zero1 bs=1M
count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 0.686021 s, *1.5 GB/s* - 3
times as fast


*Copy from /gdata to /gdata (gluster to gluster)*
[root@mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
2048000+0 records in
2048000+0 records out
1048576000 bytes (1.0 GB) copied, 101.052 s, *10.4 MB/s* -
realllyyy slooowww


*Copy from /gdata to /gdata, 2nd time (gluster to gluster)*
[root@mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
2048000+0 records in
2048000+0 records out
1048576000 bytes (1.0 GB) copied, 92.4904 s, *11.3 MB/s* -
realllyyy slooowww again



*Copy from /home to /home (ext4 to ext4)*
[root@mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero2
2048000+0 records in
2048000+0 records out
1048576000 bytes (1.0 GB) copied, 3.53263 s, *297 MB/s* - 30 times
as fast


*Copy from /home to /home (ext4 to ext4)*
[root@mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero3
2048000+0 records in
2048000+0 records out
1048576000 bytes (1.0 GB) copied, 4.1737 s, *251 MB/s* - 30
times as fast
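As a sanity check, the rates dd prints can be recomputed from the byte count and elapsed time (dd's "MB" means 10^6 bytes); for the slow gluster-to-gluster copy above:

```shell
# Recompute dd's reported rate: bytes / seconds / 10^6 = MB/s
bytes=1048576000
secs=101.052
awk -v b="$bytes" -v s="$secs" 'BEGIN { printf "%.1f MB/s\n", b / s / 1000000 }'
# prints: 10.4 MB/s
```

The same arithmetic applied to the ext4 runs reproduces the other figures quoted in this thread.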


As a test, can we copy data directly to the xfs mountpoint
(/mnt/brick1) and bypass gluster?
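Such a direct-to-brick test could be sketched as follows (the filenames are throwaway choices for illustration; files written under a brick path are invisible to gluster, so only scratch files should ever be created this way, and they must be removed afterwards):

```shell
# Measure raw RAID/XFS write speed by writing straight to each brick,
# bypassing the gluster stack entirely (throwaway files only!)
dd if=/dev/zero of=/mnt/brick1/speedtest.tmp bs=1M count=1000
dd if=/dev/zero of=/mnt/brick2/speedtest.tmp bs=1M count=1000

# Clean up so gluster never sees these stray files
rm -f /mnt/brick1/speedtest.tmp /mnt/brick2/speedtest.tmp
```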


Any help you could give us would be appreciated.

Thanks

-- 


-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley  

Re: [Gluster-users] Slow write times to gluster disk

2017-04-17 Thread Soumya Koduri



On 04/14/2017 10:27 AM, Ravishankar N wrote:

I'm not sure if the version you are running (glusterfs 3.7.11 ) works
with NFS-Ganesha, as the link seems to suggest version >=3.8 as a
pre-requisite. Adding Soumya for help. If it is not supported, then you
might have to go the plain glusterNFS way.


Even gluster 3.7.x should work with NFS-Ganesha, but the steps to 
configure it changed in 3.8, which is why the pre-requisite was added to 
the doc. IIUC, from your mail below, you would like to try NFS 
(preferably gNFS but not NFS-Ganesha) which may perform better compared 
to fuse mount. In that case, gNFS server comes up by default (till 
release-3.7.x) and there are additional steps needed to export volume 
via gNFS. Let me know if you have any issues accessing volumes via gNFS.
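Based on the volume options shown elsewhere in this thread (`nfs.disable: on`, `nfs.export-volumes: off`), those additional steps might look like the following (a sketch; `data-volume` and `mseas-data2` are the volume and server names from this thread):

```shell
# Re-enable the built-in gNFS server for this volume and export it
gluster volume set data-volume nfs.disable off
gluster volume set data-volume nfs.export-volumes on

# Verify the export is now visible
showmount -e mseas-data2
```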


Regards,
Soumya


Regards,
Ravi

On 04/14/2017 03:48 AM, Pat Haley wrote:


Hi Ravi (and list),

We are planning on testing the NFS route to see what kind of speed-up
we get.  A little research led us to the following:

https://gluster.readthedocs.io/en/latest/Administrator%20Guide/NFS-Ganesha%20GlusterFS%20Integration/

Is this the correct path to take to mount 2 xfs volumes as a single
gluster file system volume?  If not, what would be a better path?


Pat



On 04/11/2017 12:21 AM, Ravishankar N wrote:

On 04/11/2017 12:42 AM, Pat Haley wrote:


Hi Ravi,

Thanks for the reply.  And yes, we are using the gluster native
(fuse) mount.  Since this is not my area of expertise I have a few
questions (mostly clarifications)

Is a factor of 20 slow-down typical when comparing a fuse-mounted
filesystem with an NFS-mounted filesystem, or should we also be
looking for additional issues?  (Note the first dd test described
below was run on the server that hosts the file-systems so no
network communication was involved).


Though both the gluster bricks and the mounts are on the same
physical machine in your setup, the I/O still passes through
different layers of kernel/user-space fuse stack although I don't
know if 20x slow down on gluster vs NFS share is normal. Why don't
you try doing a gluster NFS mount on the machine and try the dd test
and compare it with the gluster fuse mount results?
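That suggested comparison might look like this (a sketch; the server and volume names come from this thread, the mount point is an arbitrary choice, and the options assume gluster's built-in NFSv3 server is running):

```shell
# Mount the volume via gluster's built-in NFS (NFSv3) server
mkdir -p /mnt/gnfs
mount -t nfs -o vers=3,nolock mseas-data2:/data-volume /mnt/gnfs

# Repeat the dd write test on the NFS mount, then compare with the fuse numbers
dd if=/dev/zero of=/mnt/gnfs/zero_nfs bs=1M count=1000
```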



You also mention tweaking " write-behind xlator settings".  Would
you expect better speed improvements from switching the mounting
from fuse to gnfs or from tweaking the settings?  Also are these
mutually exclusive, or would there be additional benefits from both
switching to gnfs and tweaking?

You should test these out and find the answers yourself. :-)



My next question is to make sure I'm clear on the comment " if the
gluster node containing the gnfs server goes down, all mounts done
using that node will fail".  If you have 2 servers, each 1 brick in
the over-all gluster FS, and one server fails, then for gnfs nothing
on either server is visible to other nodes while under fuse only the
files on the dead server are not visible.  Is this what you meant?

Yes, for gnfs mounts, all I/O from various mounts go to the gnfs
server process (on the machine whose IP was used at the time of
mounting) which then sends the I/O to the brick processes. For fuse,
the gluster fuse mount itself talks directly to the bricks.


Finally, you mention "even for gnfs mounts, you can achieve
fail-over by using CTDB".  Do you know if CTDB would have any
performance impact (i.e. in a worst-case scenario could adding CTDB
to gnfs erase the speed benefits of going to gnfs in the first place)?

I don't think it would. You can even achieve load balancing via CTDB
to use different gnfs servers for different clients. But I don't know
if this is needed/ helpful in your current setup where everything
(bricks and clients) seem to be on just one server.

-Ravi

Thanks

Pat


On 04/08/2017 12:58 AM, Ravishankar N wrote:

Hi Pat,

I'm assuming you are using gluster native (fuse mount). If it
helps, you could try mounting it via gluster NFS (gnfs) and then
see if there is an improvement in speed. Fuse mounts are slower
than gnfs mounts but you get the benefit of avoiding a single point
of failure. Unlike fuse mounts, if the gluster node containing the
gnfs server goes down, all mounts done using that node will fail.
For fuse mounts, you could try tweaking the write-behind xlator
settings to see if it helps. See the performance.write-behind and
performance.write-behind-window-size options in `gluster volume set
help`. Of course, even for gnfs mounts, you can achieve fail-over
by using CTDB.

Thanks,
Ravi

On 04/08/2017 12:07 AM, Pat Haley wrote:


Hi,

We noticed a dramatic slowness when writing to a gluster disk when
compared to writing to an NFS disk. Specifically when using dd
(data duplicator) to write a 4.3 GB file of zeros:

  * on NFS disk (/home): 9.5 Gb/s
  * on gluster disk (/gdata): 508 Mb/s

The gluster disk is 2 bricks joined together, no replication or
anything else. The hardware is (literally) the same:

  * one server with 70 hard disks  and a hardware RAID card.
  * 4 disks in 

Re: [Gluster-users] Slow write times to gluster disk

2017-04-14 Thread Ravishankar N

On 04/14/2017 12:20 PM, Pranith Kumar Karampuri wrote:



On Sat, Apr 8, 2017 at 10:28 AM, Ravishankar N > wrote:


Hi Pat,

I'm assuming you are using gluster native (fuse mount). If it
helps, you could try mounting it via gluster NFS (gnfs) and then
see if there is an improvement in speed. Fuse mounts are slower
than gnfs mounts but you get the benefit of avoiding a single
point of failure. Unlike fuse mounts, if the gluster node
containing the gnfs server goes down, all mounts done using that
node will fail. For fuse mounts, you could try tweaking the
write-behind xlator settings to see if it helps. See the
performance.write-behind and performance.write-behind-window-size
options in `gluster volume set help`. Of course, even for gnfs
mounts, you can achieve fail-over by using CTDB.


Ravi,
  Do you have any data that suggests fuse mounts are slower than 
gNFS servers?
I have heard anecdotal evidence time and again on the ML and IRC, which 
is why I wanted to compare it with NFS numbers on his setup.


Pat,
  I see that I am late to the thread, but do you happen to have 
"profile info" of the workload?


You can follow 
https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Monitoring%20Workload/ 
to get the information.

Yeah, let's see if profile info turns up anything interesting.
-Ravi



Thanks,
Ravi


On 04/08/2017 12:07 AM, Pat Haley wrote:


Hi,

We noticed a dramatic slowness when writing to a gluster disk
when compared to writing to an NFS disk. Specifically when using
dd (data duplicator) to write a 4.3 GB file of zeros:

  * on NFS disk (/home): 9.5 Gb/s
  * on gluster disk (/gdata): 508 Mb/s

The gluster disk is 2 bricks joined together, no replication or
anything else. The hardware is (literally) the same:

  * one server with 70 hard disks  and a hardware RAID card.
  * 4 disks in a RAID-6 group (the NFS disk)
  * 32 disks in a RAID-6 group (the max allowed by the card,
/mnt/brick1)
  * 32 disks in another RAID-6 group (/mnt/brick2)
  * 2 hot spares

Some additional information and more tests results (after
changing the log level):

glusterfs 3.7.11 built on Apr 27 2016 14:09:22
CentOS release 6.8 (Final)
RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS-3
3108 [Invader] (rev 02)



*Create the file to /gdata (gluster)*
[root@mseas-data2 gdata]# dd if=/dev/zero of=/gdata/zero1 bs=1M
count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 1.91876 s, *546 MB/s*

*Create the file to /home (ext4)*
[root@mseas-data2 gdata]# dd if=/dev/zero of=/home/zero1 bs=1M
count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 0.686021 s, *1.5 GB/s* - 3
times as fast


*Copy from /gdata to /gdata (gluster to gluster)*
[root@mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
2048000+0 records in
2048000+0 records out
1048576000 bytes (1.0 GB) copied, 101.052 s, *10.4 MB/s* -
realllyyy slooowww


*Copy from /gdata to /gdata, 2nd time (gluster to gluster)*
[root@mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
2048000+0 records in
2048000+0 records out
1048576000 bytes (1.0 GB) copied, 92.4904 s, *11.3 MB/s* -
realllyyy slooowww again



*Copy from /home to /home (ext4 to ext4)*
[root@mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero2
2048000+0 records in
2048000+0 records out
1048576000 bytes (1.0 GB) copied, 3.53263 s, *297 MB/s* - 30 times
as fast


*Copy from /home to /home (ext4 to ext4)*
[root@mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero3
2048000+0 records in
2048000+0 records out
1048576000 bytes (1.0 GB) copied, 4.1737 s, *251 MB/s* - 30 times
as fast


As a test, can we copy data directly to the xfs mountpoint
(/mnt/brick1) and bypass gluster?


Any help you could give us would be appreciated.

Thanks

-- 


-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley  Email:pha...@mit.edu 

Center for Ocean Engineering   Phone:  (617) 253-6824
Dept. of Mechanical EngineeringFax:(617) 253-8125
MIT, Room 5-213http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA  02139-4301

___
Gluster-users mailing list
Gluster-users@gluster.org 
http://lists.gluster.org/mailman/listinfo/gluster-users





Re: [Gluster-users] Slow write times to gluster disk

2017-04-14 Thread Pranith Kumar Karampuri
On Sat, Apr 8, 2017 at 10:28 AM, Ravishankar N 
wrote:

> Hi Pat,
>
> I'm assuming you are using gluster native (fuse mount). If it helps, you
> could try mounting it via gluster NFS (gnfs) and then see if there is an
> improvement in speed. Fuse mounts are slower than gnfs mounts but you get
> the benefit of avoiding a single point of failure. Unlike fuse mounts, if
> the gluster node containing the gnfs server goes down, all mounts done
> using that node will fail. For fuse mounts, you could try tweaking the
> write-behind xlator settings to see if it helps. See the
> performance.write-behind and performance.write-behind-window-size options
> in `gluster volume set help`. Of course, even for gnfs mounts, you can
> achieve fail-over by using CTDB.
>

Ravi,
  Do you have any data that suggests fuse mounts are slower than gNFS
servers?

Pat,
  I see that I am late to the thread, but do you happen to have
"profile info" of the workload?

You can follow
https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Monitoring%20Workload/
to get the information.


>
> Thanks,
> Ravi
>
>
> On 04/08/2017 12:07 AM, Pat Haley wrote:
>
>
> Hi,
>
> We noticed a dramatic slowness when writing to a gluster disk when
> compared to writing to an NFS disk. Specifically when using dd (data
> duplicator) to write a 4.3 GB file of zeros:
>
>- on NFS disk (/home): 9.5 Gb/s
>- on gluster disk (/gdata): 508 Mb/s
>
> The gluster disk is 2 bricks joined together, no replication or anything
> else. The hardware is (literally) the same:
>
>- one server with 70 hard disks  and a hardware RAID card.
>- 4 disks in a RAID-6 group (the NFS disk)
>- 32 disks in a RAID-6 group (the max allowed by the card, /mnt/brick1)
>- 32 disks in another RAID-6 group (/mnt/brick2)
>- 2 hot spares
>
> Some additional information and more tests results (after changing the log
> level):
>
> glusterfs 3.7.11 built on Apr 27 2016 14:09:22
> CentOS release 6.8 (Final)
> RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS-3 3108
> [Invader] (rev 02)
>
>
>
> *Create the file to /gdata (gluster)*
> [root@mseas-data2 gdata]# dd if=/dev/zero of=/gdata/zero1 bs=1M count=1000
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes (1.0 GB) copied, 1.91876 s, *546 MB/s*
>
> *Create the file to /home (ext4)*
> [root@mseas-data2 gdata]# dd if=/dev/zero of=/home/zero1 bs=1M count=1000
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes (1.0 GB) copied, 0.686021 s, *1.5 GB/s* - 3 times as fast
>
>
>
> *Copy from /gdata to /gdata (gluster to gluster)*
> [root@mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
> 2048000+0 records in
> 2048000+0 records out
> 1048576000 bytes (1.0 GB) copied, 101.052 s, *10.4 MB/s* - realllyyy
> slooowww
>
>
> *Copy from /gdata to /gdata* *2nd time (gluster to gluster)*
> [root@mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
> 2048000+0 records in
> 2048000+0 records out
> 1048576000 bytes (1.0 GB) copied, 92.4904 s, *11.3 MB/s* - realllyyy
> slooowww again
>
>
>
> *Copy from /home to /home (ext4 to ext4)*
> [root@mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero2
> 2048000+0 records in
> 2048000+0 records out
> 1048576000 bytes (1.0 GB) copied, 3.53263 s, *297 MB/s* - 30 times as fast
>
>
> *Copy from /home to /home (ext4 to ext4)*
> [root@mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero3
> 2048000+0 records in
> 2048000+0 records out
> 1048576000 bytes (1.0 GB) copied, 4.1737 s, *251 MB/s* - 30 times as fast
>
>
> As a test, can we copy data directly to the xfs mountpoint (/mnt/brick1)
> and bypass gluster?
>
>
> Any help you could give us would be appreciated.
>
> Thanks
>
> --
>
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
> Pat Haley  Email:  pha...@mit.edu
> Center for Ocean Engineering   Phone:  (617) 253-6824
> Dept. of Mechanical EngineeringFax:(617) 253-8125
> MIT, Room 5-213http://web.mit.edu/phaley/www/
> 77 Massachusetts Avenue
> Cambridge, MA  02139-4301
>
>
>
>
>
>
>



-- 
Pranith
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Slow write times to gluster disk

2017-04-13 Thread Ravishankar N
I'm not sure if the version you are running (glusterfs 3.7.11 ) works 
with NFS-Ganesha, as the link seems to suggest version >=3.8 as a 
pre-requisite. Adding Soumya for help. If it is not supported, then you 
might have to go the plain glusterNFS way.

Regards,
Ravi

On 04/14/2017 03:48 AM, Pat Haley wrote:


Hi Ravi (and list),

We are planning on testing the NFS route to see what kind of speed-up 
we get.  A little research led us to the following:


https://gluster.readthedocs.io/en/latest/Administrator%20Guide/NFS-Ganesha%20GlusterFS%20Integration/

Is this the correct path to take to mount 2 xfs volumes as a single 
gluster file system volume?  If not, what would be a better path?



Pat



On 04/11/2017 12:21 AM, Ravishankar N wrote:

On 04/11/2017 12:42 AM, Pat Haley wrote:


Hi Ravi,

Thanks for the reply.  And yes, we are using the gluster native 
(fuse) mount.  Since this is not my area of expertise I have a few 
questions (mostly clarifications)


Is a factor of 20 slow-down typical when comparing a fuse-mounted 
filesystem with an NFS-mounted filesystem, or should we also be 
looking for additional issues?  (Note the first dd test described 
below was run on the server that hosts the file-systems so no 
network communication was involved).


Though both the gluster bricks and the mounts are on the same 
physical machine in your setup, the I/O still passes through 
different layers of kernel/user-space fuse stack although I don't 
know if 20x slow down on gluster vs NFS share is normal. Why don't 
you try doing a gluster NFS mount on the machine and try the dd test 
and compare it with the gluster fuse mount results?




You also mention tweaking " write-behind xlator settings". Would you 
expect better speed improvements from switching the mounting from 
fuse to gnfs or from tweaking the settings? Also are these mutually 
exclusive, or would there be additional benefits from both switching to 
gnfs and tweaking?

You should test these out and find the answers yourself. :-)



My next question is to make sure I'm clear on the comment " if the 
gluster node containing the gnfs server goes down, all mounts done 
using that node will fail".  If you have 2 servers, each 1 brick in 
the over-all gluster FS, and one server fails, then for gnfs nothing 
on either server is visible to other nodes while under fuse only the 
files on the dead server are not visible.  Is this what you meant?
Yes, for gnfs mounts, all I/O from various mounts go to the gnfs 
server process (on the machine whose IP was used at the time of 
mounting) which then sends the I/O to the brick processes. For fuse, 
the gluster fuse mount itself talks directly to the bricks.


Finally, you mention "even for gnfs mounts, you can achieve 
fail-over by using CTDB".  Do you know if CTDB would have any 
performance impact (i.e. in a worst-case scenario could adding CTDB 
to gnfs erase the speed benefits of going to gnfs in the first place)?
I don't think it would. You can even achieve load balancing via CTDB 
to use different gnfs servers for different clients. But I don't know 
if this is needed/ helpful in your current setup where everything 
(bricks and clients) seem to be on just one server.


-Ravi

Thanks

Pat


On 04/08/2017 12:58 AM, Ravishankar N wrote:

Hi Pat,

I'm assuming you are using gluster native (fuse mount). If it 
helps, you could try mounting it via gluster NFS (gnfs) and then 
see if there is an improvement in speed. Fuse mounts are slower 
than gnfs mounts but you get the benefit of avoiding a single point 
of failure. Unlike fuse mounts, if the gluster node containing the 
gnfs server goes down, all mounts done using that node will fail. 
For fuse mounts, you could try tweaking the write-behind xlator 
settings to see if it helps. See the performance.write-behind and 
performance.write-behind-window-size options in `gluster volume set 
help`. Of course, even for gnfs mounts, you can achieve fail-over 
by using CTDB.


Thanks,
Ravi

On 04/08/2017 12:07 AM, Pat Haley wrote:


Hi,

We noticed a dramatic slowness when writing to a gluster disk when 
compared to writing to an NFS disk. Specifically when using dd 
(data duplicator) to write a 4.3 GB file of zeros:


  * on NFS disk (/home): 9.5 Gb/s
  * on gluster disk (/gdata): 508 Mb/s

The gluster disk is 2 bricks joined together, no replication or 
anything else. The hardware is (literally) the same:


  * one server with 70 hard disks  and a hardware RAID card.
  * 4 disks in a RAID-6 group (the NFS disk)
  * 32 disks in a RAID-6 group (the max allowed by the card,
/mnt/brick1)
  * 32 disks in another RAID-6 group (/mnt/brick2)
  * 2 hot spares

Some additional information and more tests results (after changing 
the log level):


glusterfs 3.7.11 built on Apr 27 2016 14:09:22
CentOS release 6.8 (Final)
RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS-3 3108 
[Invader] (rev 02)




*Create the file to /gdata (gluster)*
[root@mseas-data2 gdata]# dd if=/dev/zero 

Re: [Gluster-users] Slow write times to gluster disk

2017-04-13 Thread Pat Haley


Hi Ravi (and list),

We are planning on testing the NFS route to see what kind of speed-up we 
get.  A little research led us to the following:


https://gluster.readthedocs.io/en/latest/Administrator%20Guide/NFS-Ganesha%20GlusterFS%20Integration/

Is this the correct path to take to mount 2 xfs volumes as a single gluster 
file system volume?  If not, what would be a better path?



Pat



On 04/11/2017 12:21 AM, Ravishankar N wrote:

On 04/11/2017 12:42 AM, Pat Haley wrote:


Hi Ravi,

Thanks for the reply.  And yes, we are using the gluster native 
(fuse) mount.  Since this is not my area of expertise I have a few 
questions (mostly clarifications)


Is a factor of 20 slow-down typical when comparing a fuse-mounted 
filesystem with an NFS-mounted filesystem, or should we also be 
looking for additional issues?  (Note the first dd test described 
below was run on the server that hosts the file-systems so no network 
communication was involved).


Though both the gluster bricks and the mounts are on the same physical 
machine in your setup, the I/O still passes through different layers 
of kernel/user-space fuse stack although I don't know if 20x slow down 
on gluster vs NFS share is normal. Why don't you try doing a gluster 
NFS mount on the machine and try the dd test and compare it with the 
gluster fuse mount results?




You also mention tweaking " write-behind xlator settings". Would you 
expect better speed improvements from switching the mounting from 
fuse to gnfs or from tweaking the settings?  Also are these mutually 
exclusive, or would there be additional benefits from both switching to 
gnfs and tweaking?

You should test these out and find the answers yourself. :-)



My next question is to make sure I'm clear on the comment " if the 
gluster node containing the gnfs server goes down, all mounts done 
using that node will fail".  If you have 2 servers, each 1 brick in 
the over-all gluster FS, and one server fails, then for gnfs nothing 
on either server is visible to other nodes while under fuse only the 
files on the dead server are not visible.  Is this what you meant?
Yes, for gnfs mounts, all I/O from various mounts go to the gnfs 
server process (on the machine whose IP was used at the time of 
mounting) which then sends the I/O to the brick processes. For fuse, 
the gluster fuse mount itself talks directly to the bricks.


Finally, you mention "even for gnfs mounts, you can achieve fail-over 
by using CTDB".  Do you know if CTDB would have any performance 
impact (i.e. in a worst-case scenario could adding CTDB to gnfs erase 
the speed benefits of going to gnfs in the first place)?
I don't think it would. You can even achieve load balancing via CTDB 
to use different gnfs servers for different clients. But I don't know 
if this is needed/ helpful in your current setup where everything 
(bricks and clients) seem to be on just one server.


-Ravi

Thanks

Pat


On 04/08/2017 12:58 AM, Ravishankar N wrote:

Hi Pat,

I'm assuming you are using gluster native (fuse mount). If it helps, 
you could try mounting it via gluster NFS (gnfs) and then see if 
there is an improvement in speed. Fuse mounts are slower than gnfs 
mounts but you get the benefit of avoiding a single point of 
failure. Unlike fuse mounts, if the gluster node containing the gnfs 
server goes down, all mounts done using that node will fail. For 
fuse mounts, you could try tweaking the write-behind xlator settings 
to see if it helps. See the performance.write-behind and 
performance.write-behind-window-size options in `gluster volume set 
help`. Of course, even for gnfs mounts, you can achieve fail-over by 
using CTDB.


Thanks,
Ravi

On 04/08/2017 12:07 AM, Pat Haley wrote:


Hi,

We noticed a dramatic slowness when writing to a gluster disk when 
compared to writing to an NFS disk. Specifically when using dd 
(data duplicator) to write a 4.3 GB file of zeros:


  * on NFS disk (/home): 9.5 Gb/s
  * on gluster disk (/gdata): 508 Mb/s

The gluster disk is 2 bricks joined together, no replication or 
anything else. The hardware is (literally) the same:


  * one server with 70 hard disks  and a hardware RAID card.
  * 4 disks in a RAID-6 group (the NFS disk)
  * 32 disks in a RAID-6 group (the max allowed by the card,
/mnt/brick1)
  * 32 disks in another RAID-6 group (/mnt/brick2)
  * 2 hot spares

Some additional information and more tests results (after changing 
the log level):


glusterfs 3.7.11 built on Apr 27 2016 14:09:22
CentOS release 6.8 (Final)
RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS-3 3108 
[Invader] (rev 02)




*Create the file to /gdata (gluster)*
[root@mseas-data2 gdata]# dd if=/dev/zero of=/gdata/zero1 bs=1M 
count=1000

1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 1.91876 s, *546 MB/s*

*Create the file to /home (ext4)*
[root@mseas-data2 gdata]# dd if=/dev/zero of=/home/zero1 bs=1M 
count=1000

1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) 

Re: [Gluster-users] Slow write times to gluster disk

2017-04-10 Thread Ravishankar N

On 04/11/2017 12:42 AM, Pat Haley wrote:


Hi Ravi,

Thanks for the reply.  And yes, we are using the gluster native (fuse) 
mount.  Since this is not my area of expertise I have a few questions 
(mostly clarifications)


Is a factor of 20 slow-down typical when comparing a fuse-mounted 
filesystem with an NFS-mounted filesystem, or should we also be 
looking for additional issues?  (Note the first dd test described 
below was run on the server that hosts the file-systems so no network 
communication was involved).


Though both the gluster bricks and the mounts are on the same physical 
machine in your setup, the I/O still passes through different layers of 
kernel/user-space fuse stack although I don't know if 20x slow down on 
gluster vs NFS share is normal. Why don't you try doing a gluster NFS 
mount on the machine and try the dd test and compare it with the gluster 
fuse mount results?




You also mention tweaking " write-behind xlator settings".  Would you 
expect better speed improvements from switching the mounting from fuse 
to gnfs or from tweaking the settings?  Also are these mutually 
exclusive, or would there be additional benefits from both switching to 
gnfs and tweaking?

You should test these out and find the answers yourself. :-)



My next question is to make sure I'm clear on the comment " if the 
gluster node containing the gnfs server goes down, all mounts done 
using that node will fail".  If you have 2 servers, each 1 brick in 
the over-all gluster FS, and one server fails, then for gnfs nothing 
on either server is visible to other nodes while under fuse only the 
files on the dead server are not visible.  Is this what you meant?
Yes, for gnfs mounts, all I/O from various mounts go to the gnfs server 
process (on the machine whose IP was used at the time of mounting) which 
then sends the I/O to the brick processes. For fuse, the gluster fuse 
mount itself talks directly to the bricks.


Finally, you mention "even for gnfs mounts, you can achieve fail-over 
by using CTDB".  Do you know if CTDB would have any performance impact 
(i.e. in a worst-case scenario could adding CTDB to gnfs erase the 
speed benefits of going to gnfs in the first place)?
I don't think it would. You can even achieve load balancing via CTDB to 
use different gnfs servers for different clients. But I don't know if 
this is needed/ helpful in your current setup where everything (bricks 
and clients) seem to be on just one server.


-Ravi

Thanks

Pat


On 04/08/2017 12:58 AM, Ravishankar N wrote:

Hi Pat,

I'm assuming you are using gluster native (fuse mount). If it helps, 
you could try mounting it via gluster NFS (gnfs) and then see if 
there is an improvement in speed. Fuse mounts are slower than gnfs 
mounts but you get the benefit of avoiding a single point of failure. 
Unlike fuse mounts, if the gluster node containing the gnfs server 
goes down, all mounts done using that node will fail. For fuse 
mounts, you could try tweaking the write-behind xlator settings to 
see if it helps. See the performance.write-behind and 
performance.write-behind-window-size options in `gluster volume set 
help`. Of course, even for gnfs mounts, you can achieve fail-over by 
using CTDB.


Thanks,
Ravi

On 04/08/2017 12:07 AM, Pat Haley wrote:


Hi,

We noticed a dramatic slowness when writing to a gluster disk when 
compared to writing to an NFS disk. Specifically when using dd (data 
duplicator) to write a 4.3 GB file of zeros:


  * on NFS disk (/home): 9.5 Gb/s
  * on gluster disk (/gdata): 508 Mb/s

The gluser disk is 2 bricks joined together, no replication or 
anything else. The hardware is (literally) the same:


  * one server with 70 hard disks  and a hardware RAID card.
  * 4 disks in a RAID-6 group (the NFS disk)
  * 32 disks in a RAID-6 group (the max allowed by the card,
/mnt/brick1)
  * 32 disks in another RAID-6 group (/mnt/brick2)
  * 2 hot spare

Some additional information and more tests results (after changing 
the log level):


glusterfs 3.7.11 built on Apr 27 2016 14:09:22
CentOS release 6.8 (Final)
RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS-3 3108 
[Invader] (rev 02)




*Create the file to /gdata (gluster)*
[root@mseas-data2 gdata]# dd if=/dev/zero of=/gdata/zero1 bs=1M 
count=1000

1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 1.91876 s, *546 MB/s*

*Create the file to /home (ext4)*
[root@mseas-data2 gdata]# dd if=/dev/zero of=/home/zero1 bs=1M 
count=1000

1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 0.686021 s, *1.5 GB/s - *3 times 
as fast*



Copy from /gdata to /gdata (gluster to gluster)
*[root@mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
2048000+0 records in
2048000+0 records out
1048576000 bytes (1.0 GB) copied, 101.052 s, *10.4 MB/s* - realllyyy 
slooowww



*Copy from /gdata to /gdata* *2nd time *(gluster to gluster)**
[root@mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
2048000+0 records in

Re: [Gluster-users] Slow write times to gluster disk

2017-04-10 Thread Pat Haley


Hi Ravi,

Thanks for the reply.  And yes, we are using the gluster native (fuse) 
mount.  Since this is not my area of expertise I have a few questions 
(mostly clarifications):


Is a factor-of-20 slowdown typical when comparing a fuse-mounted 
filesystem versus an NFS-mounted filesystem, or should we also be looking 
for additional issues?  (Note the first dd test described below was run 
on the server that hosts the file systems, so no network communication 
was involved.)


You also mention tweaking "write-behind xlator settings".  Would you 
expect better speed improvements from switching the mounting from fuse 
to gnfs or from tweaking the settings?  Also, are these mutually 
exclusive, or would there be additional benefits from both switching to 
gnfs and tweaking?


My next question is to make sure I'm clear on the comment "if the 
gluster node containing the gnfs server goes down, all mounts done using 
that node will fail".  If you have 2 servers, each contributing 1 brick 
to the overall gluster FS, and one server fails, then for gnfs nothing on 
either server is visible to other nodes, while under fuse only the files 
on the dead server are not visible.  Is this what you meant?


Finally, you mention "even for gnfs mounts, you can achieve fail-over by 
using CTDB".  Do you know if CTDB would have any performance impact 
(i.e. in a worst-case scenario, could adding CTDB to gnfs erase the speed 
benefits of going to gnfs in the first place)?


Thanks

Pat



Re: [Gluster-users] Slow write times to gluster disk

2017-04-07 Thread Ravishankar N

Hi Pat,

I'm assuming you are using the gluster native (fuse) mount. If it helps, you 
could try mounting it via gluster NFS (gnfs) and then see if there is an 
improvement in speed. Fuse mounts are slower than gnfs mounts, but you 
get the benefit of avoiding a single point of failure (unlike fuse 
mounts, if the gluster node containing the gnfs server goes down, all 
mounts done using that node will fail). For fuse mounts, you could try 
tweaking the write-behind xlator settings to see if it helps. See the 
performance.write-behind and performance.write-behind-window-size 
options in `gluster volume set help`. Of course, even for gnfs mounts, 
you can achieve fail-over by using CTDB.
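For reference, the tweaks mentioned above would look something like this (the volume name `data-volume` is taken from this thread; the 4MB window size is only an illustrative starting point, not a recommendation):

```
# List the write-behind options with their defaults and descriptions.
gluster volume set help | grep -A 3 write-behind

# Enable write-behind (it is on by default) and enlarge its per-file
# aggregation window; re-run the dd tests after each change.
gluster volume set data-volume performance.write-behind on
gluster volume set data-volume performance.write-behind-window-size 4MB
```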


Thanks,
Ravi




___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Slow write times to gluster disk

2017-04-07 Thread Pat Haley


Hi,

We noticed a dramatic slowness when writing to a gluster disk when 
compared to writing to an NFS disk. Specifically when using dd (data 
duplicator) to write a 4.3 GB file of zeros:


 * on NFS disk (/home): 9.5 Gb/s
 * on gluster disk (/gdata): 508 Mb/s

The gluster disk is 2 bricks joined together, no replication or anything 
else. The hardware is (literally) the same:


 * one server with 70 hard disks and a hardware RAID card.
 * 4 disks in a RAID-6 group (the NFS disk)
 * 32 disks in a RAID-6 group (the max allowed by the card, /mnt/brick1)
 * 32 disks in another RAID-6 group (/mnt/brick2)
 * 2 hot spares

Some additional information and more tests results (after changing the 
log level):


glusterfs 3.7.11 built on Apr 27 2016 14:09:22
CentOS release 6.8 (Final)
RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS-3 3108 
[Invader] (rev 02)




Create the file to /gdata (gluster):
[root@mseas-data2 gdata]# dd if=/dev/zero of=/gdata/zero1 bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 1.91876 s, 546 MB/s

Create the file to /home (ext4):
[root@mseas-data2 gdata]# dd if=/dev/zero of=/home/zero1 bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 0.686021 s, 1.5 GB/s - 3 times as fast

Copy from /gdata to /gdata (gluster to gluster):
[root@mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
2048000+0 records in
2048000+0 records out
1048576000 bytes (1.0 GB) copied, 101.052 s, 10.4 MB/s - realllyyy slooowww

Copy from /gdata to /gdata, 2nd time (gluster to gluster):
[root@mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
2048000+0 records in
2048000+0 records out
1048576000 bytes (1.0 GB) copied, 92.4904 s, 11.3 MB/s - realllyyy slooowww again

Copy from /home to /home (ext4 to ext4):
[root@mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero2
2048000+0 records in
2048000+0 records out
1048576000 bytes (1.0 GB) copied, 3.53263 s, 297 MB/s - 30 times as fast

Copy from /home to /home (ext4 to ext4):
[root@mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero3
2048000+0 records in
2048000+0 records out
1048576000 bytes (1.0 GB) copied, 4.1737 s, 251 MB/s - 30 times as fast
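A side observation (mine, not from the thread): the create tests above pass bs=1M, but the copy tests omit bs, so dd falls back to its default 512-byte blocks (hence the 2048000 records for ~1 GB), and tiny writes are especially punishing through a FUSE mount. The copies are worth repeating with a larger block size. A self-contained sketch of the comparison (on this system, substitute /gdata/zero1 and /gdata/zero2 for the temp files):

```shell
# Compare dd's default 512-byte blocks against 1 MiB blocks on a scratch
# file; dd reports its own throughput on stderr after each run.
SRC=$(mktemp); DST=$(mktemp)
dd if=/dev/zero of="$SRC" bs=1M count=64 2>/dev/null   # 64 MiB test file

dd if="$SRC" of="$DST"                          # default bs=512: many tiny writes
dd if="$SRC" of="$DST" bs=1M conv=fdatasync     # 1 MiB writes, flushed to disk

rm -f "$SRC" "$DST"
```

conv=fdatasync also forces the data to disk before dd reports its timing, so the page cache does not inflate the numbers.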


As a test, can we copy data directly to the xfs mountpoint (/mnt/brick1) 
and bypass gluster?
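A note on that idea (mine, not from the thread): writing a new file directly on a brick is fine as a one-off throughput test, but gluster will not know about files created behind its back, so use a throwaway name and remove it immediately. A sketch:

```shell
# Raw-brick write test; BRICK would be /mnt/brick1 on this system (a temp
# dir here so the sketch runs anywhere). The scratch file is deleted right
# away so it never shows up as a stray entry on the brick.
BRICK=${BRICK:-$(mktemp -d)}
dd if=/dev/zero of="$BRICK/__speedtest" bs=1M count=64 conv=fdatasync
rm -f "$BRICK/__speedtest"
```

Comparing this number against the fuse-mount number isolates how much of the slowdown is gluster/FUSE overhead versus the underlying RAID group.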



Any help you could give us would be appreciated.

Thanks

--

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley  Email:  pha...@mit.edu
Center for Ocean Engineering   Phone:  (617) 253-6824
Dept. of Mechanical EngineeringFax:(617) 253-8125
MIT, Room 5-213http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA  02139-4301

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users