Re: [Gluster-devel] [RFC] inode table locking contention reduction experiment

2019-11-03 Thread Changwei Ge



On 2019/11/4 12:39 PM, Amar Tumballi wrote:

Thanks for this, github works for review right now :-)

I am occupied till Wednesday, and will review them by this week. A 
glance on the changes looks good to me.


Few tests which can run for validations are :

tests/bugs/shard/bug-1696136-lru-limit-equals-deletion-rate.t

tests/features/fuse-lru-limit.t

tests/bugs/shard/shard-inode-refcount-test.t


Ideal is to run the full regression with `./run-tests.sh -c`


Sure, I will run the entire regression.
Actually, there are still some small problems with this large patch. I am
still working on improving it, including adding some debug/trace methods.


Or is there an existing trace/dump method for the GlusterFS FUSE client? I
know there is something like that for the brick process, but I can't find
any for the client.


I am planning to join the GlusterFS video conference on the 12th of this
month and discuss my idea then.


Thanks,
Changwei




Regards,

Amar


On Mon, Nov 4, 2019 at 9:21 AM Changwei Ge wrote:


Hi Amar,

On 2019/10/31 6:30 PM, Amar Tumballi wrote:
 >
 >
 > On Wed, Oct 30, 2019 at 4:32 PM Xavi Hernandez <jaher...@redhat.com> wrote:
 >
 >     Hi Changwei,
 >
 >     On Tue, Oct 29, 2019 at 7:56 AM Changwei Ge <c...@linux.alibaba.com> wrote:
 >
 >         Hi,
 >
 >         I have recently been working on reducing inode_[un]ref()
 >         locking contention by getting rid of the inode table lock,
 >         using only the inode lock to protect the inode REF. I have
 >         already discussed this over a couple of rounds with several
 >         GlusterFS developers via email and Gerrit, and I basically
 >         understand the major logic involved.
 >
 >         Currently, an inode's REF can reach ZERO and the inode can
 >         then be reused by increasing it back to ONE. This is, IMO,
 >         why we have to do so much inode table work on REF/UNREF: it
 >         makes inode_[un]ref(), inode table searches, and dentry
 >         (alias) searches hard to run concurrently.
 >
 >         So my question is: in what cases, and how, can we find an
 >         inode whose REF is ZERO?
 >
 >         As GlusterFS stores its inode memory address in the kernel
 >         (FUSE), can we conclude that only fuse_ino_to_inode() can
 >         bring back a REF=0 inode?
 >
 >
 > Xavi's answer below provides some insights. At the same time, assuming
 > that only fuse_ino_to_inode() can bring an inode back from the ref=0
 > state (for now) is a good start.
 >
 >
 >     Yes, when an inode gets refs = 0, it means that gluster code is
 >     not using it anywhere, so it cannot be referenced again unless
 >     the kernel sends new requests on the same inode. Once refs=0 and
 >     nlookup=0, the inode can be destroyed.
 >
 >     Inode code is quite complex right now and I haven't had time to
 >     investigate this further, but I think we could simplify inode
 >     management significantly (especially unref) if we add a reference
 >     when nlookup becomes > 0, and remove a reference when nlookup
 >     becomes 0 again. Maybe with this approach we could avoid the
 >     inode table lock in many cases. However, we need to make sure we
 >     correctly handle the invalidation logic to keep the inode table
 >     size under control.
 >
 >
 > My suggestion is: don't wait for a complete solution before posting
 > the patch. Let us get a chance to look at work-in-progress patches,
 > so we can have discussions on the code itself. It would help us reach
 > better solutions sooner.

Agree.

I have almost finished implementing my draft design for this experiment.
The immature code has been pushed to my personal GlusterFS repo [1].

Right now it is a single large patch; I will split it into smaller
patches when I push it to Gerrit, for review convenience. If you would
prefer that I push it to Gerrit for an early review and discussion, I
can do that :-). But I am still doing some debugging.

My work includes:

1. Move logic unrelated to inode ref/unref out of `__inode_[un]ref()`,
which also reduces their arguments.
2. Add a dedicated `ref_lock` to the inode to keep ref/unref atomic.
3. As `inode_table::active_size` is only used for debugging, convert it
to an atomic variable.
4. Factor out inode pruning.
5. To let inode search and grep run concurrently, first take a RDLOCK
and convert it to a WRLOCK only if necessary.
6. The inode table lock is no longer needed for inode ref/unref unless
we have to move the inode between table lists.

etc...
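Items 2 and 6 in the list above amount to making ref/unref a purely
per-inode operation. A minimal sketch of that idea follows; the
`simple_inode_*` names and types are hypothetical and heavily
simplified (the real `inode_t` in libglusterfs also carries the gfid,
dentries, contexts, and table-list membership, and the actual patch may
use a spinlock rather than a mutex):

```c
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified inode. */
typedef struct simple_inode {
    pthread_mutex_t ref_lock; /* protects only `ref` (work item 2) */
    uint64_t        ref;
} simple_inode_t;

static simple_inode_t *
simple_inode_new(void)
{
    simple_inode_t *inode = calloc(1, sizeof(*inode));
    if (!inode)
        return NULL;
    pthread_mutex_init(&inode->ref_lock, NULL);
    inode->ref = 1; /* a new inode is born referenced */
    return inode;
}

/* ref/unref touch only per-inode state, so the table lock is not
 * needed on this path (work item 6); moving the inode between the
 * table's active/lru/purge lists would still happen under the table
 * lock, outside this fast path. */
static void
simple_inode_ref(simple_inode_t *inode)
{
    pthread_mutex_lock(&inode->ref_lock);
    inode->ref++;
    pthread_mutex_unlock(&inode->ref_lock);
}

/* Returns 1 when the caller dropped the last reference. */
static int
simple_inode_unref(simple_inode_t *inode)
{
    uint64_t ref;

    pthread_mutex_lock(&inode->ref_lock);
    ref = --inode->ref;
    pthread_mutex_unlock(&inode->ref_lock);
    return ref == 0;
}
```

Contending threads now serialize only on the single inode they touch,
instead of on one lock shared by the whole table.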

Any comments, ideas, suggestions are kindly welcomed.

Thanks,
Changwei

[1]:
https://github.com/changweige/glusterfs/commit/d7226d2458281212af19ec8c2ca3d8c8caae1330


Re: [Gluster-devel] [RFC] inode table locking contention reduction experiment

2019-11-03 Thread Changwei Ge

Hi Amar,

On 2019/10/31 6:30 PM, Amar Tumballi wrote:



On Wed, Oct 30, 2019 at 4:32 PM Xavi Hernandez wrote:


Hi Changwei,

On Tue, Oct 29, 2019 at 7:56 AM Changwei Ge <c...@linux.alibaba.com> wrote:

Hi,

I have recently been working on reducing inode_[un]ref() locking
contention by getting rid of the inode table lock, using only the inode
lock to protect the inode REF. I have already discussed this over a
couple of rounds with several GlusterFS developers via email and
Gerrit, and I basically understand the major logic involved.

Currently, an inode's REF can reach ZERO and the inode can then be
reused by increasing it back to ONE. This is, IMO, why we have to do so
much inode table work on REF/UNREF: it makes inode_[un]ref(), inode
table searches, and dentry (alias) searches hard to run concurrently.

So my question is: in what cases, and how, can we find an inode whose
REF is ZERO?

As GlusterFS stores its inode memory address in the kernel (FUSE), can
we conclude that only fuse_ino_to_inode() can bring back a REF=0 inode?


Xavi's answer below provides some insights. At the same time, assuming
that only fuse_ino_to_inode() can bring an inode back from the ref=0
state (for now) is a good start.



Yes, when an inode gets refs = 0, it means that gluster code is not
using it anywhere, so it cannot be referenced again unless the kernel
sends new requests on the same inode. Once refs=0 and nlookup=0, the
inode can be destroyed.

Inode code is quite complex right now and I haven't had time to
investigate this further, but I think we could simplify inode
management significantly (especially unref) if we add a reference when
nlookup becomes > 0, and remove a reference when nlookup becomes 0
again. Maybe with this approach we could avoid the inode table lock in
many cases. However, we need to make sure we correctly handle the
invalidation logic to keep the inode table size under control.


My suggestion is: don't wait for a complete solution before posting the
patch. Let us get a chance to look at work-in-progress patches, so we
can have discussions on the code itself. It would help us reach better
solutions sooner.


Agree.

I have almost finished implementing my draft design for this experiment.
The immature code has been pushed to my personal GlusterFS repo [1].

Right now it is a single large patch; I will split it into smaller
patches when I push it to Gerrit, for review convenience. If you would
prefer that I push it to Gerrit for an early review and discussion, I
can do that :-). But I am still doing some debugging.


My work includes:

1. Move logic unrelated to inode ref/unref out of `__inode_[un]ref()`,
which also reduces their arguments.
2. Add a dedicated `ref_lock` to the inode to keep ref/unref atomic.
3. As `inode_table::active_size` is only used for debugging, convert it
to an atomic variable.
4. Factor out inode pruning.
5. To let inode search and grep run concurrently, first take a RDLOCK
and convert it to a WRLOCK only if necessary.
6. The inode table lock is no longer needed for inode ref/unref unless
we have to move the inode between table lists.


etc...

Any comments, ideas, suggestions are kindly welcomed.

Thanks,
Changwei

[1]:
https://github.com/changweige/glusterfs/commit/d7226d2458281212af19ec8c2ca3d8c8caae1330



Regards,

Xavi



Thanks,
Changwei
___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/118564314

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/118564314

Gluster-devel mailing list
Gluster-devel@gluster.org 
https://lists.gluster.org/mailman/listinfo/gluster-devel





Re: [Gluster-devel] [RFC] inode table locking contention reduction experiment

2019-11-03 Thread Changwei Ge

Hi Xavi,


On 2019/10/30 7:01 PM, Xavi Hernandez wrote:

Hi Changwei,

On Tue, Oct 29, 2019 at 7:56 AM Changwei Ge wrote:


Hi,

I have recently been working on reducing inode_[un]ref() locking
contention by getting rid of the inode table lock, using only the inode
lock to protect the inode REF. I have already discussed this over a
couple of rounds with several GlusterFS developers via email and
Gerrit, and I basically understand the major logic involved.

Currently, an inode's REF can reach ZERO and the inode can then be
reused by increasing it back to ONE. This is, IMO, why we have to do so
much inode table work on REF/UNREF: it makes inode_[un]ref(), inode
table searches, and dentry (alias) searches hard to run concurrently.

So my question is: in what cases, and how, can we find an inode whose
REF is ZERO?

As GlusterFS stores its inode memory address in the kernel (FUSE), can
we conclude that only fuse_ino_to_inode() can bring back a REF=0 inode?


Yes, when an inode gets refs = 0, it means that gluster code is not
using it anywhere, so it cannot be referenced again unless the kernel
sends new requests on the same inode. Once refs=0 and nlookup=0, the
inode can be destroyed.




Your answer is quite a useful clue for my ongoing work. :-)
Basically, I think I now understand the underlying intention of the
inode REF/UNREF code.


Inode code is quite complex right now and I haven't had time to
investigate this further, but I think we could simplify inode
management significantly (especially unref) if we add a reference when
nlookup becomes > 0, and remove a reference when nlookup becomes 0
again. Maybe with this approach we could avoid the inode table lock in
many cases. However, we need to make sure we correctly handle the
invalidation logic to keep the inode table size under control.




That sounds like a good idea. If we can make the inode ref the *only*
condition for deciding whether to free an inode, and make sure we never
use an inode with REF=0 again, inode management will be much simpler.
We have to ensure that after an inode's ref drops to ZERO, no code path
can find or reference that inode; then we can directly free it.
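The invariant discussed above — an inode is freed only when its ref
count reaches zero, with the kernel's outstanding nlookup itself
pinning one reference, as Xavi suggested — could look roughly like
this. Locks are omitted for brevity, the `toy_*` names are
hypothetical, and the real inode_t/inode_unref() in libglusterfs do
much more; this only illustrates the counting invariant:

```c
#include <stdint.h>

/* Hypothetical, heavily simplified inode. */
typedef struct toy_inode {
    uint64_t refs;    /* in-process users */
    uint64_t nlookup; /* outstanding kernel/FUSE lookups */
} toy_inode_t;

/* Xavi's proposal: the first kernel lookup takes one ref on behalf of
 * the kernel; forgetting the last lookup drops it.  Then refs alone
 * decides destruction. */
static void
toy_inode_lookup(toy_inode_t *in)
{
    if (in->nlookup++ == 0)
        in->refs++; /* nlookup 0 -> 1 pins a reference */
}

static void
toy_inode_forget(toy_inode_t *in, uint64_t count)
{
    in->nlookup -= count;
    if (in->nlookup == 0)
        in->refs--; /* last lookup gone: drop the pinned ref */
}

/* Returns 1 when the inode may be destroyed: refs==0 now implies
 * nlookup==0 as well, so refs is the single condition to check. */
static int
toy_inode_unref(toy_inode_t *in)
{
    return --in->refs == 0;
}
```

With this scheme, unref never needs to consult nlookup or the table to
decide whether the inode is still reachable from the kernel.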



Thanks,
Changwei



Regards,

Xavi



Thanks,
Changwei


___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/118564314

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/118564314

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



[Gluster-devel] Weekly Untriaged Bugs

2019-11-03 Thread jenkins
[...truncated 7 lines...]
https://bugzilla.redhat.com/1762311 / build: build-aux/pkg-version has bashisms
https://bugzilla.redhat.com/1762175 / core: (glusterfs-5.11)- GlusterFS 5.11 
tracker
https://bugzilla.redhat.com/1763987 / libgfapi: libgfapi: ls -R cifs 
/mount_point/.snaps directory failed with "ls: reading directory 
/root/dir/.snaps/test2: Not a directory"
https://bugzilla.redhat.com/1761365 / libgfapi: libgfapi: glfs_init() gets 
stuck in an infinite loop in pthread_spin_lock()
https://bugzilla.redhat.com/1767264 / libglusterfsclient: glusterfs client 
process coredump
https://bugzilla.redhat.com/1767305 / posix-acl: READDIRP incorrectly updates 
posix-acl inode ctx
https://bugzilla.redhat.com/1761526 / project-infrastructure: Duplicate 
reception of specific email repeatedly from gluster-users list
https://bugzilla.redhat.com/1761350 / replicate: Directories are not healed, 
when dirs are created on the backend bricks and performed lookup from mount 
path.
https://bugzilla.redhat.com/1761088 / replicate: Git fetch fails when fetching 
on gluster storage
https://bugzilla.redhat.com/1759829 / write-behind: write-behind xlator 
generate coredump,when run "glfs_vol_set_IO_ERR.t"
[...truncated 2 lines...]

___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/118564314

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/118564314

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel