Re: [Cluster-devel] About locking granularity of gfs2

2018-04-24 Thread Guoqing Jiang



On 04/24/2018 08:54 PM, Steven Whitehouse wrote:

Hi,


On 24/04/18 04:54, Guoqing Jiang wrote:

Hi Steve,

Thanks for your reply.

On 04/24/2018 11:03 AM, Steven Whitehouse wrote:

Hi,


On 24/04/18 03:52, Guoqing Jiang wrote:

Hi,

gfs2 can "allow parallel allocation from different nodes simultaneously
as the locking granularity is one lock per resource group", per section
3.2 of [1].

Would it be possible to make this locking granularity also apply to R/W
IO? Then, with the help of "sunit" and "swidth", we could basically lock
a stripe, so that all nodes can write to different stripes in parallel,
with one stripe as the basic IO unit. Since I don't know gfs2 well, I am
wondering whether this is possible, or whether the idea doesn't make
sense at all for some reason. Any thoughts would be appreciated, thanks.

I am asking because, if people want to add cluster support for md/raid5,
it would be better to get help from the filesystem level to ensure that
only one node can access a stripe at a time; otherwise we have to lock
stripes in the md layer, which could cause performance issues.

[1] https://www.kernel.org/doc/ols/2007/ols2007v2-pages-253-260.pdf

Regards,
Guoqing



It is not just performance, it would be correctness too, since there 
is no guarantee that two nodes are not writing to the same stripe at 
the same time.
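
To illustrate the correctness concern, here is a minimal Python sketch of two nodes doing an unsynchronized read-modify-write on the same RAID5 stripe. It is a toy model, not md code: the chunk size, the values, and the helper names (xor_bytes, parity_of, rmw_parity) are all assumptions made for illustration.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length chunks."""
    return bytes(x ^ y for x, y in zip(a, b))


def parity_of(chunks) -> bytes:
    """RAID5 parity is the XOR of all data chunks in the stripe."""
    return reduce(xor_bytes, chunks)


def rmw_parity(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """Read-modify-write: new parity = old parity ^ old data ^ new data."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


# One stripe with three data chunks (toy 4-byte chunks) and its parity.
data = [bytes([1] * 4), bytes([2] * 4), bytes([3] * 4)]
parity = parity_of(data)

# Node A rewrites chunk 0 and node B rewrites chunk 1. Both read the
# same old parity before either of them has written anything back.
new_a = bytes([9] * 4)
new_b = bytes([7] * 4)
parity_from_a = rmw_parity(data[0], parity, new_a)
parity_from_b = rmw_parity(data[1], parity, new_b)

# What a serialized update would have produced: B starts from A's parity.
parity_serialized = rmw_parity(data[1], parity_from_a, new_b)

# Both data writes land, and node B's parity write happens to land last,
# silently discarding node A's parity update.
data[0] = new_a
data[1] = new_b
parity = parity_from_b

stripe_consistent = parity == parity_of(data)
print("stripe consistent after racing writes:", stripe_consistent)
print("serialized parity would be correct:", parity_serialized == parity_of(data))
```

In this model neither node touched the other's data chunk, yet the parity no longer matches the data, so a later rebuild of that stripe would reconstruct wrong contents.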


Yes, no fs can guarantee it. I am wondering: if GFS2 is used as a local
filesystem running on top of raid5, is it possible for gfs2 to write to
two places simultaneously when both places belong to one stripe?


Yes
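
As a rough illustration of why two unrelated writes can land in one stripe, the following Python sketch maps byte offsets on the array to a stripe number. The function stripe_of and the geometry (64 KiB stripe unit, three data disks, 4 KiB filesystem blocks) are assumptions chosen for the example; parity rotation is ignored, and the stripe unit is given in bytes for simplicity.

```python
def stripe_of(offset_bytes: int, sunit_bytes: int, data_disks: int):
    """Return (stripe number, data chunk index within that stripe).

    sunit_bytes is the stripe unit (chunk) size; the stripe width is
    sunit_bytes * data_disks. Parity placement is not modelled.
    """
    swidth_bytes = sunit_bytes * data_disks
    stripe = offset_bytes // swidth_bytes
    chunk = (offset_bytes % swidth_bytes) // sunit_bytes
    return stripe, chunk


# Hypothetical geometry: 4-disk raid5 (3 data + 1 parity), 64 KiB chunks.
SUNIT = 64 * 1024
DATA_DISKS = 3
BLOCK = 4096

# Filesystem blocks 0 and 20 may belong to different files, yet both
# fall into stripe 0; block 48 is the first block of stripe 1.
for block_no in (0, 20, 48):
    print(block_no, stripe_of(block_no * BLOCK, SUNIT, DATA_DISKS))
```

With this geometry a stripe covers 48 filesystem blocks, so any two writers whose blocks fall within the same 48-block window touch the same stripe and therefore the same parity chunk.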



Given that possibility, I guess running gfs2 on raid5 is not recommended.

The locking granularity is per-inode generally, but also per-rgrp in 
case of rgrps, but that refers only to the header/bitmap since the 
allocated blocks are subject to the per-inode glocks in general.


Please correct me if I am wrong: does this mean there are two types of
locking granularity, per-rgrp for allocating from an rgrp and per-inode
for R/W IO? Thanks.

It depends what operation is being undertaken. The per-inode glock
covers all the blocks related to the inode, but during allocation and
deallocation, the responsibility for the allocated and deallocated
blocks passes between the rgrp and the inode to which they relate. So
the situation is more complicated than when no allocation/deallocation
is involved.
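
The distinction can be sketched with a deliberately simplified Python model. It is not GFS2 code: the Glock class, locks_needed, and conflicts are hypothetical names, every lock is treated as exclusive, and the only assumption taken from the discussion is that plain R/W IO needs the inode's glock while allocation additionally involves the rgrp's glock.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Glock:
    kind: str    # "inode" or "rgrp"
    number: int  # hypothetical identifier of the locked object


def locks_needed(inode_no: int, allocating: bool,
                 rgrp_no: Optional[int] = None) -> set:
    """Locks one operation would take in this simplified model."""
    locks = {Glock("inode", inode_no)}
    if allocating:
        if rgrp_no is None:
            raise ValueError("an allocating write must name its rgrp")
        locks.add(Glock("rgrp", rgrp_no))
    return locks


def conflicts(a: set, b: set) -> bool:
    """Two operations conflict if they need any glock in common."""
    return bool(a & b)


# Overwrites of two different inodes: no shared glock, so they can run
# in parallel, even if their blocks happen to share a raid stripe.
overwrite_1 = locks_needed(inode_no=100, allocating=False)
overwrite_2 = locks_needed(inode_no=200, allocating=False)
print(conflicts(overwrite_1, overwrite_2))

# Allocating writes to different inodes from the same rgrp: they
# serialize on the rgrp glock, not on anything stripe-shaped.
alloc_1 = locks_needed(inode_no=100, allocating=True, rgrp_no=7)
alloc_2 = locks_needed(inode_no=200, allocating=True, rgrp_no=7)
print(conflicts(alloc_1, alloc_2))
```

Nothing in this model refers to stripe boundaries, which is the point: the units being locked are inodes and resource groups, so the locking says nothing about whether two writers share a stripe.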


Thanks a lot for your explanation.

Regards,
Guoqing



Re: [Cluster-devel] About locking granularity of gfs2

2018-04-24 Thread Steven Whitehouse

Hi,


On 24/04/18 04:54, Guoqing Jiang wrote:

Hi Steve,

Thanks for your reply.

On 04/24/2018 11:03 AM, Steven Whitehouse wrote:

Hi,


On 24/04/18 03:52, Guoqing Jiang wrote:

Hi,

gfs2 can "allow parallel allocation from different nodes simultaneously
as the locking granularity is one lock per resource group", per section
3.2 of [1].

Would it be possible to make this locking granularity also apply to R/W
IO? Then, with the help of "sunit" and "swidth", we could basically lock
a stripe, so that all nodes can write to different stripes in parallel,
with one stripe as the basic IO unit. Since I don't know gfs2 well, I am
wondering whether this is possible, or whether the idea doesn't make
sense at all for some reason. Any thoughts would be appreciated, thanks.

I am asking because, if people want to add cluster support for md/raid5,
it would be better to get help from the filesystem level to ensure that
only one node can access a stripe at a time; otherwise we have to lock
stripes in the md layer, which could cause performance issues.

[1] https://www.kernel.org/doc/ols/2007/ols2007v2-pages-253-260.pdf

Regards,
Guoqing



It is not just performance, it would be correctness too, since there 
is no guarantee that two nodes are not writing to the same stripe at 
the same time.


Yes, no fs can guarantee it. I am wondering: if GFS2 is used as a local
filesystem running on top of raid5, is it possible for gfs2 to write to
two places simultaneously when both places belong to one stripe?


Yes

The locking granularity is per-inode generally, but also per-rgrp in 
case of rgrps, but that refers only to the header/bitmap since the 
allocated blocks are subject to the per-inode glocks in general.


Please correct me if I am wrong: does this mean there are two types of
locking granularity, per-rgrp for allocating from an rgrp and per-inode
for R/W IO? Thanks.

It depends what operation is being undertaken. The per-inode glock
covers all the blocks related to the inode, but during allocation and
deallocation, the responsibility for the allocated and deallocated
blocks passes between the rgrp and the inode to which they relate. So
the situation is more complicated than when no allocation/deallocation
is involved.


Steve.


Guoqing






Re: [Cluster-devel] About locking granularity of gfs2

2018-04-24 Thread Guoqing Jiang



On 04/24/2018 01:13 PM, Gang He wrote:

The stripe unit is a logical volume concept; the file system should not
need to know about it. For a file system, the access unit is the block
(a power-of-two multiple of the disk sector size).


IMHO, that is true for a typical fs, but I think zfs and btrfs are well
aware of it, though no cluster fs supports it now.

Thanks,
Guoqing



Re: [Cluster-devel] About locking granularity of gfs2

2018-04-23 Thread Guoqing Jiang

Hi Steve,

Thanks for your reply.

On 04/24/2018 11:03 AM, Steven Whitehouse wrote:

Hi,


On 24/04/18 03:52, Guoqing Jiang wrote:

Hi,

gfs2 can "allow parallel allocation from different nodes simultaneously
as the locking granularity is one lock per resource group", per section
3.2 of [1].

Would it be possible to make this locking granularity also apply to R/W
IO? Then, with the help of "sunit" and "swidth", we could basically lock
a stripe, so that all nodes can write to different stripes in parallel,
with one stripe as the basic IO unit. Since I don't know gfs2 well, I am
wondering whether this is possible, or whether the idea doesn't make
sense at all for some reason. Any thoughts would be appreciated, thanks.

I am asking because, if people want to add cluster support for md/raid5,
it would be better to get help from the filesystem level to ensure that
only one node can access a stripe at a time; otherwise we have to lock
stripes in the md layer, which could cause performance issues.

[1] https://www.kernel.org/doc/ols/2007/ols2007v2-pages-253-260.pdf

Regards,
Guoqing



It is not just performance, it would be correctness too, since there 
is no guarantee that two nodes are not writing to the same stripe at 
the same time.


Yes, no fs can guarantee it. I am wondering: if GFS2 is used as a local
filesystem running on top of raid5, is it possible for gfs2 to write to
two places simultaneously when both places belong to one stripe?

The locking granularity is per-inode generally, but also per-rgrp in 
case of rgrps, but that refers only to the header/bitmap since the 
allocated blocks are subject to the per-inode glocks in general.


Please correct me if I am wrong: does this mean there are two types of
locking granularity, per-rgrp for allocating from an rgrp and per-inode
for R/W IO? Thanks.

I don't think it would be easy to try and make them correspond with
raid stripes, and getting gfs2 to work with md would be non-trivial.


Totally agree :-).

Regards,
Guoqing



Re: [Cluster-devel] About locking granularity of gfs2

2018-04-23 Thread Steven Whitehouse

Hi,


On 24/04/18 03:52, Guoqing Jiang wrote:

Hi,

gfs2 can "allow parallel allocation from different nodes simultaneously
as the locking granularity is one lock per resource group", per section
3.2 of [1].

Would it be possible to make this locking granularity also apply to R/W
IO? Then, with the help of "sunit" and "swidth", we could basically lock
a stripe, so that all nodes can write to different stripes in parallel,
with one stripe as the basic IO unit. Since I don't know gfs2 well, I am
wondering whether this is possible, or whether the idea doesn't make
sense at all for some reason. Any thoughts would be appreciated, thanks.

I am asking because, if people want to add cluster support for md/raid5,
it would be better to get help from the filesystem level to ensure that
only one node can access a stripe at a time; otherwise we have to lock
stripes in the md layer, which could cause performance issues.

[1] https://www.kernel.org/doc/ols/2007/ols2007v2-pages-253-260.pdf

Regards,
Guoqing



It is not just performance, it would be correctness too, since there is 
no guarantee that two nodes are not writing to the same stripe at the 
same time. The locking granularity is per-inode generally, but also 
per-rgrp in case of rgrps, but that refers only to the header/bitmap 
since the allocated blocks are subject to the per-inode glocks in general.


I don't think it would be easy to try and make them correspond with raid
stripes, and getting gfs2 to work with md would be non-trivial.


Steve.