Re: [Gluster-devel] Throttling xlator on the bricks

2016-01-24 Thread Ravishankar N

On 01/25/2016 12:56 PM, Venky Shankar wrote:

Also, it would be beneficial to have the core TBF implementation as part of
libglusterfs so as to be consumable by the server-side xlator component to
throttle dispatched FOPs and for daemons to throttle anything that's outside
the "brick" boundary (such as CPU, etc.).
That makes sense. We were initially thinking of overloading
posix_rchecksum() to do the SHA256 sums for the signer.



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-users] GlusterFS FUSE client hangs on rsyncing lots of file

2016-01-24 Thread baul jianguo
3.5.7 also hangs; only the flush op hung. Yes, with performance.client-io-threads
off, there is no hang.

The hang does not relate to the client kernel version.

Here is one client statedump of the flush op; does anything look abnormal?

[global.callpool.stack.12]
uid=0
gid=0
pid=14432
unique=16336007098
lk-owner=77cb199aa36f3641
op=FLUSH
type=1
cnt=6

[global.callpool.stack.12.frame.1]
ref_count=1
translator=fuse
complete=0

[global.callpool.stack.12.frame.2]
ref_count=0
translator=datavolume-write-behind
complete=0
parent=datavolume-read-ahead
wind_from=ra_flush
wind_to=FIRST_CHILD (this)->fops->flush
unwind_to=ra_flush_cbk

[global.callpool.stack.12.frame.3]
ref_count=1
translator=datavolume-read-ahead
complete=0
parent=datavolume-open-behind
wind_from=default_flush_resume
wind_to=FIRST_CHILD(this)->fops->flush
unwind_to=default_flush_cbk

[global.callpool.stack.12.frame.4]
ref_count=1
translator=datavolume-open-behind
complete=0
parent=datavolume-io-threads
wind_from=iot_flush_wrapper
wind_to=FIRST_CHILD(this)->fops->flush
unwind_to=iot_flush_cbk

[global.callpool.stack.12.frame.5]
ref_count=1
translator=datavolume-io-threads
complete=0
parent=datavolume
wind_from=io_stats_flush
wind_to=FIRST_CHILD(this)->fops->flush
unwind_to=io_stats_flush_cbk

[global.callpool.stack.12.frame.6]
ref_count=1
translator=datavolume
complete=0
parent=fuse
wind_from=fuse_flush_resume
wind_to=xl->fops->flush
unwind_to=fuse_err_cbk



On Sun, Jan 24, 2016 at 5:35 AM, Oleksandr Natalenko
 wrote:
> With "performance.client-io-threads" set to "off" no hangs occurred in 3
> rsync/rm rounds. Could that be some fuse-bridge lock race? Will bring that
> option back to "on" again and try to get a full statedump.
>
>> On Thursday, 21 January 2016, 14:54:47 EET Raghavendra G wrote:
>> On Thu, Jan 21, 2016 at 10:49 AM, Pranith Kumar Karampuri <
>>
>> pkara...@redhat.com> wrote:
>> > On 01/18/2016 02:28 PM, Oleksandr Natalenko wrote:
>> >> XFS. Server side works OK, I'm able to mount volume again. Brick is 30%
>> >> full.
>> >
>> > Oleksandr,
>> >
>> >   Will it be possible to get the statedump of the client, bricks
>> >
>> > output next time it happens?
>> >
>> > https://github.com/gluster/glusterfs/blob/master/doc/debugging/statedump.md#how-to-generate-statedump
>> We also need to dump inode information. To do that you have to add "all=yes"
>> to /var/run/gluster/glusterdump.options before you issue commands to get
>> statedump.
>>
>> > Pranith
>> >
>> >> On Monday, 18 January 2016, 15:07:18 EET baul jianguo wrote:
>> >>> What is your brick file system? And what is the status of the glusterfsd
>> >>> process and all its threads?
>> >>> I met the same issue when a client app such as rsync stays in D status, and
>> >>> the brick process and related threads are also in D status.
>> >>> And the brick device disk utilization is 100%.
>> >>>
>> >>> On Sun, Jan 17, 2016 at 6:13 AM, Oleksandr Natalenko
>> >>>
>> >>>  wrote:
>>  Wrong assumption, rsync hung again.
>> 
>>  On Saturday, 16 January 2016, 22:53:04 EET Oleksandr Natalenko wrote:
>> > One possible reason:
>> >
>> > cluster.lookup-optimize: on
>> > cluster.readdir-optimize: on
>> >
>> > I've disabled both optimizations, and at least as of now rsync still does
>> > its job with no issues. I would like to find out what option causes such a
>> > behavior and why. Will test more.
>> >
>> > On Friday, 15 January 2016, 16:09:51 EET Oleksandr Natalenko wrote:
>> >> Another observation: if rsyncing is resumed after a hang, rsync itself
>> >> hangs a lot faster because it does stat of already copied files. So the
>> >> reason may not be writing itself, but massive stat on the GlusterFS
>> >> volume as well.
>> >>
>> >> 15.01.2016 09:40, Oleksandr Natalenko написав:
>> >>> While doing rsync over millions of files from ordinary partition to
>> >>> GlusterFS volume, just after approx. first 2 million rsync hang
>> >>> happens, and the following info appears in dmesg:
>> >>>
>> >>> ===
>> >>> [17075038.924481] INFO: task rsync:10310 blocked for more than 120
>> >>> seconds.
>> >>> [17075038.931948] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>> >>> disables this message.
>> >>> [17075038.940748] rsync   D 88207fc13680 0 10310
>> >>> 10309 0x0080
>> >>> [17075038.940752]  8809c578be18 0086
>> >>> 8809c578bfd8
>> >>> 00013680
>> >>> [17075038.940756]  8809c578bfd8 00013680
>> >>> 880310cbe660
>> >>> 881159d16a30
>> >>> [17075038.940759]  881e3aa25800 8809c578be48
>> >>> 881159d16b10
>> >>> 88087d553980
>> >>> [17075038.940762] Call Trace:
>> >>> [17075038.940770]  [] schedule+0x29/0x70
>> >>> [17075038.940797]  []
>> >>> __fuse_request_send+0x13d/0x2c0
>> >>> [fuse]
>> >>> [17075038.940801]  [] ?
>> 

Re: [Gluster-devel] Throttling xlator on the bricks

2016-01-24 Thread Venky Shankar
On Mon, Jan 25, 2016 at 11:06:26AM +0530, Ravishankar N wrote:
> Hi,
> 
> We are planning to introduce a throttling xlator on the server (brick)
> process to regulate FOPS. The main motivation is to solve complaints about
> AFR selfheal taking up too much CPU (due to too many fops for entry
> self-heal, rchecksums for data self-heal, etc.).
> 
> The throttling is achieved using the Token Bucket Filter (TBF) algorithm.
> TBF is already used by bitrot's bitd signer (which is a client process) in
> gluster to regulate the CPU-intensive checksum calculation. By putting the
> logic on the brick side, multiple clients (selfheal, bitrot, rebalance, or
> even the mounts themselves) can avail the benefits of throttling.

  [Providing current TBF implementation link for completeness]

  
https://github.com/gluster/glusterfs/blob/master/xlators/features/bit-rot/src/bitd/bit-rot-tbf.c

Also, it would be beneficial to have the core TBF implementation as part of
libglusterfs so as to be consumable by the server-side xlator component to
throttle dispatched FOPs and for daemons to throttle anything that's outside
the "brick" boundary (such as CPU, etc.).

> 
> The TBF algorithm in a nutshell is as follows: there is a bucket which is
> filled at a steady (configurable) rate with tokens. Each FOP will need a fixed
> amount of tokens to be processed. If the bucket has that many tokens, the FOP
> is allowed and that many tokens are removed from the bucket. If not, the FOP
> is queued until the bucket is filled.
> 
> The xlator will need to reside above io-threads and can have different
> buckets, one per client. There has to be a communication mechanism between
> the client and the brick (IPC?) to tell which FOPS need to be regulated from
> it, the number of tokens needed, etc. These need to be reconfigurable via
> appropriate mechanisms. Each bucket will have a token-filler thread which
> will fill the tokens in it. The main thread will enqueue heals in a list in
> the bucket if there aren't enough tokens. Once the token filler detects that
> some FOPS can be serviced, it will send a cond-broadcast to a dequeue thread
> which will process (stack wind) all the FOPS that have the required number
> of tokens, from all buckets.
> 
> This is just a high-level abstraction; requesting feedback on any aspect of
> this feature. What kind of mechanism is best between the client/bricks for
> tuning various parameters? What other requirements do you foresee?
> 
> Thanks,
> Ravi

> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Throttling xlator on the bricks

2016-01-24 Thread Ravishankar N

Hi,

We are planning to introduce a throttling xlator on the server (brick)
process to regulate FOPS. The main motivation is to solve complaints about
AFR selfheal taking up too much CPU (due to too many fops for entry
self-heal, rchecksums for data self-heal, etc.).

The throttling is achieved using the Token Bucket Filter (TBF) algorithm.
TBF is already used by bitrot's bitd signer (which is a client process) in
gluster to regulate the CPU-intensive checksum calculation. By putting the
logic on the brick side, multiple clients (selfheal, bitrot, rebalance, or
even the mounts themselves) can avail the benefits of throttling.

The TBF algorithm in a nutshell is as follows: there is a bucket which is
filled at a steady (configurable) rate with tokens. Each FOP will need a fixed
amount of tokens to be processed. If the bucket has that many tokens, the FOP
is allowed and that many tokens are removed from the bucket. If not, the FOP
is queued until the bucket is filled.
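
[A minimal sketch of that bucket accounting, with made-up names and plain
pthreads, just to pin down the idea; the real implementation lives in the
bit-rot TBF code referenced elsewhere in this thread.]

#include <stdbool.h>
#include <stdint.h>
#include <pthread.h>

struct tbf_bucket {
        pthread_mutex_t lock;
        uint64_t        tokens;     /* tokens currently available */
        uint64_t        capacity;   /* cap so an idle bucket cannot hoard */
        uint64_t        fill_rate;  /* tokens added on every refill tick */
};

/* Refill step, run periodically by a filler thread. */
static void
tbf_fill (struct tbf_bucket *b)
{
        pthread_mutex_lock (&b->lock);
        b->tokens += b->fill_rate;
        if (b->tokens > b->capacity)
                b->tokens = b->capacity;
        pthread_mutex_unlock (&b->lock);
}

/* Returns true if the FOP may be processed now (its cost is deducted);
 * false means the FOP has to be queued until the bucket fills up. */
static bool
tbf_try_consume (struct tbf_bucket *b, uint64_t cost)
{
        bool ok = false;

        pthread_mutex_lock (&b->lock);
        if (b->tokens >= cost) {
                b->tokens -= cost;
                ok = true;
        }
        pthread_mutex_unlock (&b->lock);
        return ok;
}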

The xlator will need to reside above io-threads and can have different
buckets, one per client. There has to be a communication mechanism between the
client and the brick (IPC?) to tell which FOPS need to be regulated from it,
the number of tokens needed, etc. These need to be reconfigurable via
appropriate mechanisms. Each bucket will have a token-filler thread which will
fill the tokens in it. The main thread will enqueue heals in a list in the
bucket if there aren't enough tokens. Once the token filler detects that some
FOPS can be serviced, it will send a cond-broadcast to a dequeue thread which
will process (stack wind) all the FOPS that have the required number of
tokens, from all buckets.
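
[Roughly, a per-client bucket with its filler and dequeue threads could be
wired up as below. This is a sketch with invented names and plain pthreads,
not actual xlator code; the STACK_WIND of the queued frame is only indicated
by a comment.]

#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>
#include <pthread.h>

struct queued_fop {
        struct queued_fop *next;
        uint64_t           cost;    /* tokens this FOP needs */
        /* the real xlator would also carry the call frame to wind */
};

struct client_bucket {
        pthread_mutex_t    lock;
        pthread_cond_t     tokens_added;
        uint64_t           tokens;
        uint64_t           fill_rate;   /* tokens added per tick */
        struct queued_fop *queue;       /* FOPs waiting for tokens */
        int                stop;
};

/* Token-filler thread: top up the bucket once per tick and wake the
 * dequeue thread so it can re-check the queue. */
static void *
filler_loop (void *arg)
{
        struct client_bucket *b = arg;

        for (;;) {
                sleep (1);
                pthread_mutex_lock (&b->lock);
                if (b->stop) {
                        pthread_mutex_unlock (&b->lock);
                        break;
                }
                b->tokens += b->fill_rate;
                pthread_cond_broadcast (&b->tokens_added);
                pthread_mutex_unlock (&b->lock);
        }
        return NULL;
}

/* Dequeue thread: after every broadcast, wind all queued FOPs whose token
 * cost is now covered. */
static void *
dequeue_loop (void *arg)
{
        struct client_bucket *b = arg;

        pthread_mutex_lock (&b->lock);
        while (!b->stop) {
                pthread_cond_wait (&b->tokens_added, &b->lock);
                struct queued_fop **pp = &b->queue;
                while (*pp) {
                        struct queued_fop *f = *pp;
                        if (b->tokens >= f->cost) {
                                b->tokens -= f->cost;
                                *pp = f->next;
                                /* STACK_WIND f's saved frame here (in real
                                 * code, outside the lock) */
                                free (f);
                        } else {
                                pp = &f->next;
                        }
                }
        }
        pthread_mutex_unlock (&b->lock);
        return NULL;
}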

This is just a high-level abstraction; requesting feedback on any aspect of
this feature. What kind of mechanism is best between the client/bricks for
tuning various parameters? What other requirements do you foresee?

Thanks,
Ravi
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Feature: Automagic lock-revocation for features/locks xlator (v3.7.x)

2016-01-24 Thread Venky Shankar
On Jan 25, 2016 08:12, "Pranith Kumar Karampuri" 
wrote:
>
>
>
> On 01/25/2016 02:17 AM, Richard Wareing wrote:
>>
>> Hello all,
>>
>> Just gave a talk at SCaLE 14x today and I mentioned our new locks
revocation feature which has had a significant impact on our GFS cluster
reliability.  As such I wanted to share the patch with the community, so
here's the bugzilla report:
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1301401
>>
>> =
>> Summary:
>> Mis-behaving brick clients (gNFSd, FUSE, gfAPI) can cause cluster
instability and eventual complete unavailability due to failures in
releasing entry/inode locks in a timely manner.
>>
>> Classic symptoms on this are increased brick (and/or gNFSd) memory usage
due to the high number of (lock request) frames piling up in the processes.
The failure-mode results in bricks eventually slowing down to a crawl due
to swapping, or OOMing due to complete memory exhaustion; during this
period the entire cluster can begin to fail.  End-users will experience
this as hangs on the filesystem, first in a specific region of the
file-system and ultimately the entire filesystem as the offending brick
begins to turn into a zombie (i.e. not quite dead, but not quite alive
either).
>>
>> Currently, these situations must be handled by an administrator
detecting & intervening via the "clear-locks" CLI command.  Unfortunately
this doesn't scale for large numbers of clusters, and it depends on the
correct (external) detection of the locks piling up (for which there is
little signal other than state dumps).
>>
>> This patch introduces two features to remedy this situation:
>>
>> 1. Monkey-unlocking - This is a feature targeted at developers (only!)
to help track down crashes due to stale locks, and prove the utility of the
lock revocation feature.  It does this by silently dropping 1% of unlock
requests; simulating bugs or mis-behaving clients.
>>
>> The feature is activated via:
>> features.locks-monkey-unlocking 
>>
>> You'll see the message
>> "[] W [inodelk.c:653:pl_inode_setlk] 0-groot-locks: MONKEY
LOCKING (forcing stuck lock)!" ... in the logs indicating a request has
been dropped.
>>
>> 2. Lock revocation - Once enabled, this feature will revoke a
*contended* lock (i.e. if nobody else asks for the lock, we will not revoke
it) either by the amount of time the lock has been held, how many other
lock requests are waiting on the lock to be freed, or some combination of
both.  Clients which are losing their locks will be notified by receiving
EAGAIN (sent back to their callback function).
>>
>> The feature is activated via these options:
>> features.locks-revocation-secs 
>> features.locks-revocation-clear-all [on/off]
>> features.locks-revocation-max-blocked 
>>
>> Recommended settings are: 1800 seconds for a time-based timeout (give
clients the benefit of the doubt); choosing a max-blocked value requires some
experimentation depending on your workload, but generally values of
hundreds to low thousands work (it's normal for many tens of locks to be taken
out when files are being written @ high throughput).
>
>
> I really like this feature. One question though: self-heal and rebalance
domain locks are active until self-heal/rebalance is complete, which can
take more than 30 minutes if the files are in TBs. I will try to see what
we can do to handle these without increasing the revocation-secs too much.
Maybe we can come up with per-domain revocation timeouts. Comments are
welcome.

[
I've not gone through the design or the patch,
hence this might be a shot in the air.
]

Maybe give clients a second (or more) chance to "refresh" their locks: in
the sense that when a lock is about to be revoked, notify the client, which
can then call for a refresh to confirm the validity of the locks it holds.
This would require some maintenance work on the client to keep track of
locked regions.

>
> Pranith
>>
>>
>> =
>>
>> The patch supplied will apply cleanly to the v3.7.6 release tag, and
probably to any 3.7.x release & master (posix locks xlator is rarely
touched).
>>
>> Richard
>>
>>
>>
>>
>>
>> ___
>> Gluster-devel mailing list
>> Gluster-devel@gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-devel
>
>
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Feature: Automagic lock-revocation for features/locks xlator (v3.7.x)

2016-01-24 Thread Raghavendra Gowdappa


- Original Message -
> From: "Richard Wareing" 
> To: "Pranith Kumar Karampuri" 
> Cc: gluster-devel@gluster.org
> Sent: Monday, January 25, 2016 8:17:11 AM
> Subject: Re: [Gluster-devel] Feature: Automagic lock-revocation for 
> features/locks xlator (v3.7.x)
> 
> Yup per domain would be useful, the patch itself currently honors domains as
> well. So locks in different domains will not be touched during revocation.
> 
> In our cases we actually prefer to pull the plug on SHD/DHT domains to ensure
> clients do not hang; this is important for DHT self-heals, which cannot be
> disabled via any option. We've found in most cases that once we reap the lock,
> another properly behaving client comes along and completes the DHT heal
> properly.

Flushing waiting locks of DHT can affect application continuity too. Though 
locks requested by the rebalance process can be flushed to a certain extent 
without applications noticing any failures, there is no guarantee that locks 
requested in DHT_LAYOUT_HEAL_DOMAIN and DHT_FILE_MIGRATE_DOMAIN are issued 
only by the rebalance process. These two domains are used for locks to 
synchronize among and between rebalance process(es) and client(s). So, there 
is an equal probability that these locks are requests from clients, and hence 
applications can see some file operations failing.

In case of pulling the plug on DHT_LAYOUT_HEAL_DOMAIN, dentry operations that 
depend on the layout can fail. These operations include create, link, unlink, 
symlink, mknod, mkdir, and rename for files/directories within the directory 
on which the lock request failed.

In case of pulling the plug on DHT_FILE_MIGRATE_DOMAIN, renames of immediate 
subdirectories/files can fail.


> 
> Richard
> 
> 
> Sent from my iPhone
> 
> On Jan 24, 2016, at 6:42 PM, Pranith Kumar Karampuri < pkara...@redhat.com >
> wrote:
> 
> 
> 
> 
> 
> 
> On 01/25/2016 02:17 AM, Richard Wareing wrote:
> 
> 
> 
> Hello all,
> 
> Just gave a talk at SCaLE 14x today and I mentioned our new locks revocation
> feature which has had a significant impact on our GFS cluster reliability.
> As such I wanted to share the patch with the community, so here's the
> bugzilla report:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1301401
> 
> =
> Summary:
> Mis-behaving brick clients (gNFSd, FUSE, gfAPI) can cause cluster instability
> and eventual complete unavailability due to failures in releasing
> entry/inode locks in a timely manner.
> 
> Classic symptoms on this are increased brick (and/or gNFSd) memory usage due to
> the high number of (lock request) frames piling up in the processes. The
> failure-mode results in bricks eventually slowing down to a crawl due to
> swapping, or OOMing due to complete memory exhaustion; during this period
> the entire cluster can begin to fail. End-users will experience this as
> hangs on the filesystem, first in a specific region of the file-system and
> ultimately the entire filesystem as the offending brick begins to turn into
> a zombie (i.e. not quite dead, but not quite alive either).
> 
> Currently, these situations must be handled by an administrator detecting &
> intervening via the "clear-locks" CLI command. Unfortunately this doesn't
> scale for large numbers of clusters, and it depends on the correct
> (external) detection of the locks piling up (for which there is little
> signal other than state dumps).
> 
> This patch introduces two features to remedy this situation:
> 
> 1. Monkey-unlocking - This is a feature targeted at developers (only!) to
> help track down crashes due to stale locks, and prove the utility of the lock
> revocation feature. It does this by silently dropping 1% of unlock requests;
> simulating bugs or mis-behaving clients.
> 
> The feature is activated via:
> features.locks-monkey-unlocking 
> 
> You'll see the message
> "[] W [inodelk.c:653:pl_inode_setlk] 0-groot-locks: MONKEY LOCKING
> (forcing stuck lock)!" ... in the logs indicating a request has been
> dropped.
> 
> 2. Lock revocation - Once enabled, this feature will revoke a *contended* lock
> (i.e. if nobody else asks for the lock, we will not revoke it) either by the
> amount of time the lock has been held, how many other lock requests are
> waiting on the lock to be freed, or some combination of both. Clients which
> are losing their locks will be notified by receiving EAGAIN (sent back to
> their callback function).
> 
> The feature is activated via these options:
> features.locks-revocation-secs 
> features.locks-revocation-clear-all [on/off]
> features.locks-revocation-max-blocked 
> 
> Recommended settings are: 1800 seconds for a time-based timeout (give clients
> the benefit of the doubt); choosing a max-blocked value requires some
> experimentation depending on your workload, but generally values of hundreds
> to low thousands work (it's normal for many tens of locks to be taken out when
> files are being written @ high throughput).
> 
> I really like this feature. One question though: self-heal and rebalance domain
> locks are ac

Re: [Gluster-devel] Feature: Automagic lock-revocation for features/locks xlator (v3.7.x)

2016-01-24 Thread Richard Wareing
Yup per domain would be useful, the patch itself currently honors domains as 
well.  So locks in different domains will not be touched during revocation.

In our cases we actually prefer to pull the plug on SHD/DHT domains to ensure 
clients do not hang; this is important for DHT self-heals, which cannot be 
disabled via any option. We've found in most cases that once we reap the lock, 
another properly behaving client comes along and completes the DHT heal 
properly.

Richard


Sent from my iPhone

On Jan 24, 2016, at 6:42 PM, Pranith Kumar Karampuri <pkara...@redhat.com> wrote:



On 01/25/2016 02:17 AM, Richard Wareing wrote:
Hello all,

Just gave a talk at SCaLE 14x today and I mentioned our new locks revocation 
feature which has had a significant impact on our GFS cluster reliability.  As 
such I wanted to share the patch with the community, so here's the bugzilla 
report:

https://bugzilla.redhat.com/show_bug.cgi?id=1301401

=
Summary:
Mis-behaving brick clients (gNFSd, FUSE, gfAPI) can cause cluster instability 
and eventual complete unavailability due to failures in releasing entry/inode 
locks in a timely manner.

Classic symptoms on this are increased brick (and/or gNFSd) memory usage due to 
the high number of (lock request) frames piling up in the processes.  The 
failure-mode results in bricks eventually slowing down to a crawl due to 
swapping, or OOMing due to complete memory exhaustion; during this period the 
entire cluster can begin to fail.  End-users will experience this as hangs on 
the filesystem, first in a specific region of the file-system and ultimately 
the entire filesystem as the offending brick begins to turn into a zombie (i.e. 
not quite dead, but not quite alive either).

Currently, these situations must be handled by an administrator detecting & 
intervening via the "clear-locks" CLI command.  Unfortunately this doesn't 
scale for large numbers of clusters, and it depends on the correct (external) 
detection of the locks piling up (for which there is little signal other than 
state dumps).

This patch introduces two features to remedy this situation:

1. Monkey-unlocking - This is a feature targeted at developers (only!) to help 
track down crashes due to stale locks, and prove the utility of the lock 
revocation feature.  It does this by silently dropping 1% of unlock requests; 
simulating bugs or mis-behaving clients.

The feature is activated via:
features.locks-monkey-unlocking 

You'll see the message
"[] W [inodelk.c:653:pl_inode_setlk] 0-groot-locks: MONKEY LOCKING 
(forcing stuck lock)!" ... in the logs indicating a request has been dropped.

2. Lock revocation - Once enabled, this feature will revoke a *contended* lock 
(i.e. if nobody else asks for the lock, we will not revoke it) either by the 
amount of time the lock has been held, how many other lock requests are waiting 
on the lock to be freed, or some combination of both.  Clients which are losing 
their locks will be notified by receiving EAGAIN (sent back to their callback 
function).

The feature is activated via these options:
features.locks-revocation-secs 
features.locks-revocation-clear-all [on/off]
features.locks-revocation-max-blocked 

Recommended settings are: 1800 seconds for a time-based timeout (give clients 
the benefit of the doubt); choosing a max-blocked value requires some experimentation 
depending on your workload, but generally values of hundreds to low thousands work 
(it's normal for many tens of locks to be taken out when files are being 
written @ high throughput).

I really like this feature. One question though: self-heal and rebalance domain 
locks are active until self-heal/rebalance is complete, which can take more than 
30 minutes if the files are in TBs. I will try to see what we can do to handle 
these without increasing the revocation-secs too much. Maybe we can come up 
with per-domain revocation timeouts. Comments are welcome.

Pranith

=

The patch supplied will apply cleanly to the v3.7.6 release tag, and probably to 
any 3.7.x release & master (posix locks xlator is rarely touched).

Richard






___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Tips and Tricks for Gluster Developer

2016-01-24 Thread Richard Wareing
Here's my tips:

1. General C tricks
- learn to use vim or emacs & read their manuals; customize to suit your style
- use vim w/ pathogen plugins for auto formatting (don't use tabs!) & syntax
- use ctags to jump around functions
- Use ASAN & valgrind to check for memory leaks and heap corruption
- learn to use "git bisect" to quickly find where regressions were introduced & 
revert them
- Use a window manager like tmux or screen

2. Gluster specific tricks
- Alias "ggrep" to grep through all Gluster source files for some string and 
show you the line numbers
- Alias "gvim" or "gemacs" to open any source file without full path, eg. "gvim 
afr.c"
- GFS specific gdb macros to dump out pretty formatting of various structs 
(Jeff Darcy has some of these IIRC)
- Write prove tests...for everything you write, and any bug you fix.  Make them 
deterministic (timing/races shouldn't matter).
- Bugs/races and/or crashes which are hard or impossible to repro often require 
the creation of a developer specific feature to simulate the failure and 
efficiently code/test a fix.  Example: "monkey-unlocking" in the lock 
revocation patch I just posted.
- That edge case you are ignoring because you think it's impossible/unlikely?  
We will find/hit it in 48hrs at large scale (seriously, we will); handle it 
correctly or, at a minimum, write a (kernel-style) "OOPS"-type log message.

That's all I have off the top of my head.  I'll give example aliases in another 
reply.

Richard

Sent from my iPhone

> On Jan 22, 2016, at 6:14 AM, Raghavendra Talur  wrote:
> 
> HI All,
> 
> I am sure there are many tricks hidden under sleeves of many Gluster 
> developers.
> I realized this when speaking to new developers. It would be good have a 
> searchable thread of such tricks.
> 
> Just reply back on this thread with the tricks that you have and I promise I 
> will collate them and add them to developer guide.
> 
> 
> Looking forward to be amazed!
> 
> Thanks,
> Raghavendra Talur
> 
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Feature: Automagic lock-revocation for features/locks xlator (v3.7.x)

2016-01-24 Thread Pranith Kumar Karampuri



On 01/25/2016 02:17 AM, Richard Wareing wrote:

Hello all,

Just gave a talk at SCaLE 14x today and I mentioned our new locks 
revocation feature which has had a significant impact on our GFS 
cluster reliability.  As such I wanted to share the patch with the 
community, so here's the bugzilla report:


https://bugzilla.redhat.com/show_bug.cgi?id=1301401

=
Summary:
Mis-behaving brick clients (gNFSd, FUSE, gfAPI) can cause cluster 
instability and eventual complete unavailability due to failures in 
releasing entry/inode locks in a timely manner.


Classic symptoms on this are increased brick (and/or gNFSd) memory 
usage due to the high number of (lock request) frames piling up in the 
processes.  The failure-mode results in bricks eventually slowing down 
to a crawl due to swapping, or OOMing due to complete memory 
exhaustion; during this period the entire cluster can begin to fail. 
 End-users will experience this as hangs on the filesystem, first in a 
specific region of the file-system and ultimately the entire 
filesystem as the offending brick begins to turn into a zombie (i.e. 
not quite dead, but not quite alive either).


Currently, these situations must be handled by an administrator 
detecting & intervening via the "clear-locks" CLI command. 
 Unfortunately this doesn't scale for large numbers of clusters, and 
it depends on the correct (external) detection of the locks piling up 
(for which there is little signal other than state dumps).


This patch introduces two features to remedy this situation:

1. Monkey-unlocking - This is a feature targeted at developers (only!) 
to help track down crashes due to stale locks, and prove the utility 
of the lock revocation feature.  It does this by silently dropping 1% 
of unlock requests; simulating bugs or mis-behaving clients.


The feature is activated via:
features.locks-monkey-unlocking 

You'll see the message
"[] W [inodelk.c:653:pl_inode_setlk] 0-groot-locks: MONKEY 
LOCKING (forcing stuck lock)!" ... in the logs indicating a request 
has been dropped.


2. Lock revocation - Once enabled, this feature will revoke a 
*contended* lock (i.e. if nobody else asks for the lock, we will not 
revoke it) either by the amount of time the lock has been held, how 
many other lock requests are waiting on the lock to be freed, or some 
combination of both.  Clients which are losing their locks will be 
notified by receiving EAGAIN (sent back to their callback function).


The feature is activated via these options:
features.locks-revocation-secs 
features.locks-revocation-clear-all [on/off]
features.locks-revocation-max-blocked 

Recommended settings are: 1800 seconds for a time-based timeout (give 
clients the benefit of the doubt); choosing a max-blocked value requires some 
experimentation depending on your workload, but generally values of 
hundreds to low thousands work (it's normal for many tens of locks to be 
taken out when files are being written @ high throughput).


I really like this feature. One question though: self-heal and rebalance 
domain locks are active until self-heal/rebalance is complete, which can 
take more than 30 minutes if the files are in TBs. I will try to see 
what we can do to handle these without increasing the revocation-secs 
too much. Maybe we can come up with per-domain revocation timeouts. 
Comments are welcome.


Pranith


=

The patch supplied will apply cleanly to the v3.7.6 release tag, and 
probably to any 3.7.x release & master (posix locks xlator is rarely 
touched).


Richard





___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client

2016-01-24 Thread Oleksandr Natalenko
Also, I've repeated the same "find" test again, but with glusterfs process 
launched under valgrind. And here is valgrind output:

https://gist.github.com/097afb01ebb2c5e9e78d

On Sunday, 24 January 2016, 09:33:00 EET Mathieu Chateau wrote:
> Thanks for all your tests and times, it looks promising :)
> 
> 
> Cordialement,
> Mathieu CHATEAU
> http://www.lotp.fr
> 
> 2016-01-23 22:30 GMT+01:00 Oleksandr Natalenko :
> > OK, now I'm re-performing tests with rsync + GlusterFS v3.7.6 + the
> > following
> > patches:
> > 
> > ===
> > 
> > Kaleb S KEITHLEY (1):
> >   fuse: use-after-free fix in fuse-bridge, revisited
> > 
> > Pranith Kumar K (1):
> >   mount/fuse: Fix use-after-free crash
> > 
> > Soumya Koduri (3):
> >   gfapi: Fix inode nlookup counts
> >   inode: Retire the inodes from the lru list in inode_table_destroy
> >   upcall: free the xdr* allocations
> > 
> > ===
> > 
> > I run rsync from one GlusterFS volume to another. While memory started
> > from
> > under 100 MiBs, it stalled at around 600 MiBs for source volume and does
> > not
> > grow further. As for target volume it is ~730 MiBs, and that is why I'm
> > going
> > to do several rsync rounds to see if it grows more (with no patches bare
> > 3.7.6
> > could consume more than 20 GiBs).
> > 
> > No "kernel notifier loop terminated" message so far for both volumes.
> > 
> > Will report more in several days. I hope current patches will be
> > incorporated
> > into 3.7.7.
> > 
> > On Friday, 22 January 2016, 12:53:36 EET Kaleb S. KEITHLEY wrote:
> > > On 01/22/2016 12:43 PM, Oleksandr Natalenko wrote:
> > > > On Friday, 22 January 2016, 12:32:01 EET Kaleb S. KEITHLEY wrote:
> > > >> I presume by this you mean you're not seeing the "kernel notifier
> > > >> loop
> > > >> terminated" error in your logs.
> > > > 
> > > > Correct, but only with simple traversing. Have to test under rsync.
> > > 
> > > Without the patch I'd get "kernel notifier loop terminated" within a few
> > > minutes of starting I/O.  With the patch I haven't seen it in 24 hours
> > > of beating on it.
> > > 
> > > >> Hmmm.  My system is not leaking. Last 24 hours the RSZ and VSZ are
> > 
> > >> stable:
> > >> http://download.gluster.org/pub/gluster/glusterfs/dynamic-analysis/longevity/client.out
> > > > 
> > > > What ops do you perform on mounted volume? Read, write, stat? Is that
> > > > 3.7.6 + patches?
> > > 
> > > I'm running an internally developed I/O load generator written by a guy
> > > on our perf team.
> > > 
> > > it does, create, write, read, rename, stat, delete, and more.


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] Feature: Automagic lock-revocation for features/locks xlator (v3.7.x)

2016-01-24 Thread Richard Wareing
Hello all,

Just gave a talk at SCaLE 14x today and I mentioned our new locks revocation 
feature which has had a significant impact on our GFS cluster reliability.  As 
such I wanted to share the patch with the community, so here's the bugzilla 
report:

https://bugzilla.redhat.com/show_bug.cgi?id=1301401

=
Summary:
Mis-behaving brick clients (gNFSd, FUSE, gfAPI) can cause cluster instability 
and eventual complete unavailability due to failures in releasing entry/inode 
locks in a timely manner.

Classic symptoms on this are increased brick (and/or gNFSd) memory usage due to 
the high number of (lock request) frames piling up in the processes.  The 
failure-mode results in bricks eventually slowing down to a crawl due to 
swapping, or OOMing due to complete memory exhaustion; during this period the 
entire cluster can begin to fail.  End-users will experience this as hangs on 
the filesystem, first in a specific region of the file-system and ultimately 
the entire filesystem as the offending brick begins to turn into a zombie (i.e. 
not quite dead, but not quite alive either).

Currently, these situations must be handled by an administrator detecting & 
intervening via the "clear-locks" CLI command.  Unfortunately this doesn't 
scale for large numbers of clusters, and it depends on the correct (external) 
detection of the locks piling up (for which there is little signal other than 
state dumps).

This patch introduces two features to remedy this situation:

1. Monkey-unlocking - This is a feature targeted at developers (only!) to help 
track down crashes due to stale locks, and prove the utility of the lock 
revocation feature.  It does this by silently dropping 1% of unlock requests; 
simulating bugs or mis-behaving clients.

The feature is activated via:
features.locks-monkey-unlocking 

You'll see the message
"[] W [inodelk.c:653:pl_inode_setlk] 0-groot-locks: MONKEY LOCKING 
(forcing stuck lock)!" ... in the logs indicating a request has been dropped.

2. Lock revocation - Once enabled, this feature will revoke a *contended* lock 
(i.e. if nobody else asks for the lock, we will not revoke it) either by the 
amount of time the lock has been held, how many other lock requests are waiting 
on the lock to be freed, or some combination of both.  Clients which are losing 
their locks will be notified by receiving EAGAIN (sent back to their callback 
function).

The feature is activated via these options:
features.locks-revocation-secs 
features.locks-revocation-clear-all [on/off]
features.locks-revocation-max-blocked 

Recommended settings are: 1800 seconds for a time-based timeout (give clients 
the benefit of the doubt); choosing a max-blocked value requires some experimentation 
depending on your workload, but generally values of hundreds to low thousands work 
(it's normal for many tens of locks to be taken out when files are being 
written @ high throughput).
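
[For readers skimming the thread, the decision being described boils down to
roughly the following. The names and types here are invented for illustration
and are not the actual posix locks xlator code; the clear-all behaviour is not
modelled.]

#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Illustrative stand-in for a granted lock. */
struct held_lock {
        time_t   granted_at;     /* when the lock was granted */
        uint32_t blocked_count;  /* lock requests currently waiting on it */
};

/* revocation_secs / max_blocked correspond to the volume options above;
 * a value of 0 means "not configured". */
static bool
should_revoke (const struct held_lock *l, time_t now,
               uint32_t revocation_secs, uint32_t max_blocked)
{
        if (l->blocked_count == 0)
                return false;   /* only *contended* locks are ever revoked */
        if (revocation_secs && (now - l->granted_at) >= revocation_secs)
                return true;
        if (max_blocked && l->blocked_count >= max_blocked)
                return true;
        return false;
}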

=

The patch supplied will apply cleanly to the v3.7.6 release tag, and probably to 
any 3.7.x release & master (posix locks xlator is rarely touched).

Richard



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Bugs with incorrect status

2016-01-24 Thread Niels de Vos
On Sun, Jan 24, 2016 at 05:43:39PM +0100, Niels de Vos wrote:
> Hi all,
> 
> below is the current list of bugs that have an incorrect status. Until
> we have the tools that automatically update the status of bugs,
> developers are expected to update their bugs when they post patches, and
> when all patches have been merged. The release engineer that handles the
> minor update will then close the bugs once the release is available.

The script that generates the output from the previous email is
available in our release-tools repository:

  https://github.com/gluster/release-tools/blob/master/check-bugs.py

It is pretty heavy on our Gerrit instance, and currently takes 90
minutes (!) to run on my laptop. If you intend to improve the output,
please modify the script and have the 'bzs' list (at the bottom of the
script) contain only a few bugs.

Niels


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] Bugs with incorrect status

2016-01-24 Thread Niels de Vos
Hi all,

below is the current list of bugs that have an incorrect status. Until
we have the tools that automatically update the status of bugs,
developers are expected to update their bugs when they post patches, and
when all patches have been merged. The release engineer that handles the
minor update will then close the bugs once the release is available.

Please have a look and correct the status of the bugs where you are
assigned (just search for your email address that you use for bugzilla).

Thanks,
Niels


892808 (mainline) ASSIGNED: [FEAT] Bring subdirectory mount option with native 
client
  [master] I4f542e fuse: support subdirectory mounts (NEW)
  ** vbel...@redhat.com: Bug 892808 should be in POST, change I4f542e under 
review **

1008839 (mainline) POST: Certain blocked entry lock info not retained after the 
lock is granted
  [master] Ie37837 features/locks : Certain blocked entry lock info not 
retained after the lock is granted (ABANDONED)
  ** ata...@redhat.com: Bug 1008839 is in POST, but all changes have been 
abandoned **

1074947 (mainline) ON_QA: add option to build rpm without server
  [master] Iaa1498 build: add option to build rpm without server (NEW)
  ** b...@gluster.org: Bug 1074947 should be in POST, change Iaa1498 under 
review **

1089642 (mainline) POST: Quotad doesn't load io-stats xlator, which implies 
none of the logging options have any effect on it.
  [master] Iccc033 glusterd: add io-stats to all quotad's sub-graphs (ABANDONED)
  ** spa...@redhat.com: Bug 1089642 is in POST, but all changes have been 
abandoned **

1092414 (mainline) ASSIGNED: Disable NFS by default
  [master] If25502 [WIP] glusterd: make Gluster/NFS an optional component (NEW)
  ** nde...@redhat.com: Bug 1092414 should be in POST, change If25502 under 
review **

1093768 (3.5.0) POST: Comment typo in gf-history.changelog.c
  ** kschi...@redhat.com: No change posted, but bug 1093768 is in POST **

1094478 (3.5.0) POST: Bad macro in changelog-misc.h
  ** kschi...@redhat.com: No change posted, but bug 1094478 is in POST **

1099294 (3.5.0) POST: Incorrect error message in 
/features/changelog/lib/src/gf-history-changelog.c
  ** kschi...@redhat.com: No change posted, but bug 1099294 is in POST **

1099460 (3.5.0) NEW: file locks are not released within an acceptable time when 
a fuse-client uncleanly disconnects
  [release-3.5] I5e5f54 socket: use TCP_USER_TIMEOUT to detect client failures 
quicker (NEW)
  ** nde...@redhat.com: Bug 1099460 should be in POST, change I5e5f54 under 
review **

1099683 (3.5.0) POST: Silent error from call to realpath in 
features/changelog/lib/src/gf-history-changelog.c
  ** vshan...@redhat.com: No change posted, but bug 1099683 is in POST **

110 (mainline) ASSIGNED: [RFE] Add regression tests for the component 
geo-replication
  [master] Ie27848 tests/geo-rep: Automated configuration for geo-rep 
regression. (NEW)
  [master] I9c9ae8 geo-rep: Regression tests improvements (ABANDONED)
  [master] I433dd8 Geo-rep: Adding regression tests for geo-rep (MERGED)
  [master] Ife8201 Geo-rep: Adding regression tests for geo-rep (ABANDONED)
  ** khire...@redhat.com: Bug 110 should be in POST, change Ie27848 under 
review **

020 (3.5.0) POST: Unused code changelog_entry_length
  ** kschi...@redhat.com: No change posted, but bug 020 is in POST **

031 (3.5.0) POST: CHANGELOG_FILL_HTIME_DIR macro fills buffer without size 
limits
  ** kschi...@redhat.com: No change posted, but bug 031 is in POST **

1114415 (mainline) MODIFIED: There is no way to monitor if the healing is 
successful when the brick is erased
  ** pkara...@redhat.com: No change posted, but bug 1114415 is in MODIFIED **

1116714 (3.5.0) POST: indices/xattrop directory contains stale entries
  [release-3.5] I470cf8 afr : Added xdata flags to indicate probable existence 
of stale index. (ABANDONED)
  ** ata...@redhat.com: Bug 1116714 is in POST, but all changes have been 
abandoned **

1122120 (3.5.1) MODIFIED: Bricks crashing after disable and re-enabled quota on 
a volume
  ** b...@gluster.org: No change posted, but bug 1122120 is in MODIFIED **

1126289 (mainline) POST: [SNAPSHOT]: Deletion of a snapshot in a volume or 
system fails if some operation which acquires the volume lock comes in between.
  [master] I708b72 glusterd/snapshot : Differentiating between various error 
scenario. (ABANDONED)
  ** kaus...@redhat.com: Bug 1126289 is in POST, but all changes have been 
abandoned **

1131846 (mainline) POST: remove-brick - once you stop remove-brick using stop 
command,  status says '  failed: remove-brick not started.'
  ** gg...@redhat.com: No change posted, but bug 1131846 is in POST **

1132074 (mainline) POST: Document steps to perform for replace-brick
  [master] Ic7292b doc: Steps for Replacing brick in gluster volume (ABANDONED)
  ** pkara...@redhat.com: Bug 1132074 is in POST, but all changes have been 
abandoned **

1134305 (mainline) POST: rpc actor failed to complete successfully messa

Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client

2016-01-24 Thread Oleksandr Natalenko
BTW, am I the only one who sees in

max_size=4294965480

almost 2^32? Could that be integer overflow?
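
[Quick arithmetic, assuming that counter is an unsigned 32-bit value:
2^32 = 4,294,967,296, and 4,294,967,296 - 4,294,965,480 = 1,816, so max_size
sits less than 2 KiB below the 32-bit boundary; that is what makes a
wrap-around look plausible.]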

On Sunday, 24 January 2016, 13:23:55 EET Oleksandr Natalenko wrote:
> The leak definitely remains. I did "find /mnt/volume -type d" over GlusterFS
> volume, with mentioned patches applied and without "kernel notifier loop
> terminated" message, but "glusterfs" process consumed ~4GiB of RAM after
> "find" finished.
> 
> Here is statedump:
> 
> https://gist.github.com/10cde83c63f1b4f1dd7a
> 
> I see the following:
> 
> ===
> [mount/fuse.fuse - usage-type gf_fuse_mt_iov_base memusage]
> size=4235109959
> num_allocs=2
> max_size=4294965480
> max_num_allocs=3
> total_allocs=4533524
> ===
> 
> ~4GiB, right?
> 
> Pranith, Kaleb?
> 
> > On Sunday, 24 January 2016, 09:33:00 EET Mathieu Chateau wrote:
> > Thanks for all your tests and times, it looks promising :)
> > 
> > 
> > Cordialement,
> > Mathieu CHATEAU
> > http://www.lotp.fr
> > 
> > 2016-01-23 22:30 GMT+01:00 Oleksandr Natalenko :
> > > OK, now I'm re-performing tests with rsync + GlusterFS v3.7.6 + the
> > > following
> > > patches:
> > > 
> > > ===
> > > 
> > > Kaleb S KEITHLEY (1):
> > >   fuse: use-after-free fix in fuse-bridge, revisited
> > > 
> > > Pranith Kumar K (1):
> > >   mount/fuse: Fix use-after-free crash
> > > 
> > > Soumya Koduri (3):
> > >   gfapi: Fix inode nlookup counts
> > >   inode: Retire the inodes from the lru list in inode_table_destroy
> > >   upcall: free the xdr* allocations
> > > 
> > > ===
> > > 
> > > I run rsync from one GlusterFS volume to another. While memory started
> > > from
> > > under 100 MiBs, it stalled at around 600 MiBs for source volume and does
> > > not
> > > grow further. As for target volume it is ~730 MiBs, and that is why I'm
> > > going
> > > to do several rsync rounds to see if it grows more (with no patches bare
> > > 3.7.6
> > > could consume more than 20 GiBs).
> > > 
> > > No "kernel notifier loop terminated" message so far for both volumes.
> > > 
> > > Will report more in several days. I hope current patches will be
> > > incorporated
> > > into 3.7.7.
> > > 
> > > > On Friday, 22 January 2016, 12:53:36 EET Kaleb S. KEITHLEY wrote:
> > > > On 01/22/2016 12:43 PM, Oleksandr Natalenko wrote:
> > > > > On Friday, 22 January 2016, 12:32:01 EET Kaleb S. KEITHLEY wrote:
> > > > >> I presume by this you mean you're not seeing the "kernel notifier
> > > > >> loop
> > > > >> terminated" error in your logs.
> > > > > 
> > > > > Correct, but only with simple traversing. Have to test under rsync.
> > > > 
> > > > Without the patch I'd get "kernel notifier loop terminated" within a
> > > > few
> > > > minutes of starting I/O.  With the patch I haven't seen it in 24 hours
> > > > of beating on it.
> > > > 
> > > > >> Hmmm.  My system is not leaking. Last 24 hours the RSZ and VSZ are
> > > 
> > > > >> stable:
> > > > >> http://download.gluster.org/pub/gluster/glusterfs/dynamic-analysis/longevity/client.out
> > > > > 
> > > > > What ops do you perform on mounted volume? Read, write, stat? Is
> > > > > that
> > > > > 3.7.6 + patches?
> > > > 
> > > > I'm running an internally developed I/O load generator written by a
> > > > guy
> > > > on our perf team.
> > > > 
> > > > it does, create, write, read, rename, stat, delete, and more.
> 
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client

2016-01-24 Thread Oleksandr Natalenko
The leak definitely remains. I did "find /mnt/volume -type d" over GlusterFS 
volume, with mentioned patches applied and without "kernel notifier loop 
terminated" message, but "glusterfs" process consumed ~4GiB of RAM after 
"find" finished.

Here is statedump:

https://gist.github.com/10cde83c63f1b4f1dd7a

I see the following:

===
[mount/fuse.fuse - usage-type gf_fuse_mt_iov_base memusage]
size=4235109959
num_allocs=2
max_size=4294965480
max_num_allocs=3
total_allocs=4533524
===

~4GiB, right?

Pranith, Kaleb?

On Sunday, 24 January 2016, 09:33:00 EET Mathieu Chateau wrote:
> Thanks for all your tests and times, it looks promising :)
> 
> 
> Cordialement,
> Mathieu CHATEAU
> http://www.lotp.fr
> 
> 2016-01-23 22:30 GMT+01:00 Oleksandr Natalenko :
> > OK, now I'm re-performing tests with rsync + GlusterFS v3.7.6 + the
> > following
> > patches:
> > 
> > ===
> > 
> > Kaleb S KEITHLEY (1):
> >   fuse: use-after-free fix in fuse-bridge, revisited
> > 
> > Pranith Kumar K (1):
> >   mount/fuse: Fix use-after-free crash
> > 
> > Soumya Koduri (3):
> >   gfapi: Fix inode nlookup counts
> >   inode: Retire the inodes from the lru list in inode_table_destroy
> >   upcall: free the xdr* allocations
> > 
> > ===
> > 
> > I run rsync from one GlusterFS volume to another. While memory started
> > from
> > under 100 MiBs, it stalled at around 600 MiBs for source volume and does
> > not
> > grow further. As for target volume it is ~730 MiBs, and that is why I'm
> > going
> > to do several rsync rounds to see if it grows more (with no patches bare
> > 3.7.6
> > could consume more than 20 GiBs).
> > 
> > No "kernel notifier loop terminated" message so far for both volumes.
> > 
> > Will report more in several days. I hope current patches will be
> > incorporated
> > into 3.7.7.
> > 
> > > On Friday, 22 January 2016, 12:53:36 EET Kaleb S. KEITHLEY wrote:
> > > On 01/22/2016 12:43 PM, Oleksandr Natalenko wrote:
> > > > On Friday, 22 January 2016, 12:32:01 EET Kaleb S. KEITHLEY wrote:
> > > >> I presume by this you mean you're not seeing the "kernel notifier
> > > >> loop
> > > >> terminated" error in your logs.
> > > > 
> > > > Correct, but only with simple traversing. Have to test under rsync.
> > > 
> > > Without the patch I'd get "kernel notifier loop terminated" within a few
> > > minutes of starting I/O.  With the patch I haven't seen it in 24 hours
> > > of beating on it.
> > > 
> > > >> Hmmm.  My system is not leaking. Last 24 hours the RSZ and VSZ are
> > 
> > > >> stable:
> > > >> http://download.gluster.org/pub/gluster/glusterfs/dynamic-analysis/longevity/client.out
> > > > 
> > > > What ops do you perform on mounted volume? Read, write, stat? Is that
> > > > 3.7.6 + patches?
> > > 
> > > I'm running an internally developed I/O load generator written by a guy
> > > on our perf team.
> > > 
> > > it does, create, write, read, rename, stat, delete, and more.


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client

2016-01-24 Thread Mathieu Chateau
Thanks for all your tests and times, it looks promising :)


Cordialement,
Mathieu CHATEAU
http://www.lotp.fr

2016-01-23 22:30 GMT+01:00 Oleksandr Natalenko :

> OK, now I'm re-performing tests with rsync + GlusterFS v3.7.6 + the
> following
> patches:
>
> ===
> Kaleb S KEITHLEY (1):
>   fuse: use-after-free fix in fuse-bridge, revisited
>
> Pranith Kumar K (1):
>   mount/fuse: Fix use-after-free crash
>
> Soumya Koduri (3):
>   gfapi: Fix inode nlookup counts
>   inode: Retire the inodes from the lru list in inode_table_destroy
>   upcall: free the xdr* allocations
> ===
>
> I run rsync from one GlusterFS volume to another. While memory started from
> under 100 MiBs, it stalled at around 600 MiBs for source volume and does
> not
> grow further. As for target volume it is ~730 MiBs, and that is why I'm
> going
> to do several rsync rounds to see if it grows more (with no patches bare
> 3.7.6
> could consume more than 20 GiBs).
>
> No "kernel notifier loop terminated" message so far for both volumes.
>
> Will report more in several days. I hope current patches will be
> incorporated
> into 3.7.7.
>
> On Friday, 22 January 2016, 12:53:36 EET Kaleb S. KEITHLEY wrote:
> > On 01/22/2016 12:43 PM, Oleksandr Natalenko wrote:
> > > On Friday, 22 January 2016, 12:32:01 EET Kaleb S. KEITHLEY wrote:
> > >> I presume by this you mean you're not seeing the "kernel notifier loop
> > >> terminated" error in your logs.
> > >
> > > Correct, but only with simple traversing. Have to test under rsync.
> >
> > Without the patch I'd get "kernel notifier loop terminated" within a few
> > minutes of starting I/O.  With the patch I haven't seen it in 24 hours
> > of beating on it.
> >
> > >> Hmmm.  My system is not leaking. Last 24 hours the RSZ and VSZ are
> > >> stable:
> > >> http://download.gluster.org/pub/gluster/glusterfs/dynamic-analysis/longevity/client.out
> > >
> > > What ops do you perform on mounted volume? Read, write, stat? Is that
> > > 3.7.6 + patches?
> >
> > I'm running an internally developed I/O load generator written by a guy
> > on our perf team.
> >
> > it does, create, write, read, rename, stat, delete, and more.
>
>
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel