Re: [Cluster-devel] [PATCH 09/11] gfs2: Read bitmap buffers on demand

2019-01-14 Thread Andreas Gruenbacher
On Mon, 14 Jan 2019 at 19:41, Andreas Gruenbacher  wrote:
>
> Bob,
>
> On Mon, 14 Jan 2019 at 18:12, Bob Peterson  wrote:
> > - Original Message -
> > > Before this patch, when locking a resource group, gfs2 would read in the
> > > resource group header and all the bitmap buffers of the resource group.
> > > Those buffers would then be locked into memory until the resource group
> > > is unlocked, which will happen when the filesystem is unmounted or when
> > > transferring the resource group lock to another node, but not due to
> > > memory pressure.  Larger resource groups lock more buffers into memory,
> > > and cause more unnecessary I/O when resource group locks are transferred
> > > between nodes.
> > >
> > > With this patch, when locking a resource group, only the resource group
> > > header is read in.  The other bitmap buffers (the resource group header
> > > contains part of the bitmap) are only read in on demand.
> > >
> > > It would probably make sense to also only read in the resource group
> > > header on demand, when the resource group is modified, but since the
> > > header contains the number of free blocks in the resource group, there
> > > is a higher incentive to keep the header locked in memory after that.
> > >
> > > Signed-off-by: Andreas Gruenbacher 
> > > ---
> >
> > Hi,
> >
> > This patch looks good. However, I'm concerned about its performance,
> > at least until we get some bitmap read-ahead in place.
> > (see "/* FIXME: Might want to do read-ahead here. */")
> >
> > In particular, I'm concerned that it does the exact opposite of this
> > performance enhancement:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2.git/commit/fs/gfs2?id=39b0f1e9290880a6c905f639e7db6b646e302a4f
> >
> > So we need to test that 2GB rgrps (33 bitmap blocks) that are fragmented
> > don't have a disproportionate slowdown compared to, say, 128MB rgrps
> > because we end up spending so much time fetching pages out of page cache,
> > only to find them inadequate and doing brelse() again. And by fragmented,
> > I mean all 33 bitmap blocks have a small number of free blocks, thus
> > disqualifying them from having "full" status but still having a large
> > enough number of free blocks that we might consider searching them.
> > Some of the patches we did after that one may have mitigated the problem
> > to some extent, but we need to be sure we're not regressing performance.
>
> Yes, there are a few ways this patch can still be tuned depending on
> the performance impact. This should speed up "normal" workloads by
> requiring fewer reads from disk when acquiring a resource group glock,
> and less memory is pinned for filesystem metadata, but we do end up
> with more page cache lookups during allocations, and depending on
> read-ahead, possibly more smaller reads at inconvenient times.
>
> rd_last_alloc should help limit the performance impact when bitmaps
> become mostly full. (Forms of rd_last_alloc have been in gfs2 since
> the mainline merge.)

Hmm, rd_extfail_pt actually, not rd_last_alloc, and that predates the
optimization in commit 39b0f1e929 by less than two years.

> One way to further reduce the number of page cache lookups would be to
> cache the buffer head of the last allocation, but it's unclear if this
> is even an issue, so let's do some performance testing first.

Andreas



Re: [Cluster-devel] [PATCH 09/11] gfs2: Read bitmap buffers on demand

2019-01-14 Thread Andreas Gruenbacher
Bob,

On Mon, 14 Jan 2019 at 18:12, Bob Peterson  wrote:
> - Original Message -
> > Before this patch, when locking a resource group, gfs2 would read in the
> > resource group header and all the bitmap buffers of the resource group.
> > Those buffers would then be locked into memory until the resource group
> > is unlocked, which will happen when the filesystem is unmounted or when
> > transferring the resource group lock to another node, but not due to
> > memory pressure.  Larger resource groups lock more buffers into memory,
> > and cause more unnecessary I/O when resource group locks are transferred
> > between nodes.
> >
> > With this patch, when locking a resource group, only the resource group
> > header is read in.  The other bitmap buffers (the resource group header
> > contains part of the bitmap) are only read in on demand.
> >
> > It would probably make sense to also only read in the resource group
> > header on demand, when the resource group is modified, but since the
> > header contains the number of free blocks in the resource group, there
> > is a higher incentive to keep the header locked in memory after that.
> >
> > Signed-off-by: Andreas Gruenbacher 
> > ---
>
> Hi,
>
> This patch looks good. However, I'm concerned about its performance,
> at least until we get some bitmap read-ahead in place.
> (see "/* FIXME: Might want to do read-ahead here. */")
>
> In particular, I'm concerned that it does the exact opposite of this
> performance enhancement:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2.git/commit/fs/gfs2?id=39b0f1e9290880a6c905f639e7db6b646e302a4f
>
> So we need to test that 2GB rgrps (33 bitmap blocks) that are fragmented
> don't have a disproportionate slowdown compared to, say, 128MB rgrps
> because we end up spending so much time fetching pages out of page cache,
> only to find them inadequate and doing brelse() again. And by fragmented,
> I mean all 33 bitmap blocks have a small number of free blocks, thus
> disqualifying them from having "full" status but still having a large
> enough number of free blocks that we might consider searching them.
> Some of the patches we did after that one may have mitigated the problem
> to some extent, but we need to be sure we're not regressing performance.

Yes, there are a few ways this patch can still be tuned depending on
the performance impact. This should speed up "normal" workloads by
requiring fewer reads from disk when acquiring a resource group glock,
and less memory is pinned for filesystem metadata, but we do end up
with more page cache lookups during allocations, and depending on
read-ahead, possibly more smaller reads at inconvenient times.

rd_last_alloc should help limit the performance impact when bitmaps
become mostly full. (Forms of rd_last_alloc have been in gfs2 since
the mainline merge.)

One way to further reduce the number of page cache lookups would be to
cache the buffer head of the last allocation, but it's unclear if this
is even an issue, so let's do some performance testing first.

Thanks,
Andreas



Re: [Cluster-devel] [PATCH 09/11] gfs2: Read bitmap buffers on demand

2019-01-14 Thread Bob Peterson
- Original Message -
> Before this patch, when locking a resource group, gfs2 would read in the
> resource group header and all the bitmap buffers of the resource group.
> Those buffers would then be locked into memory until the resource group
> is unlocked, which will happen when the filesystem is unmounted or when
> transferring the resource group lock to another node, but not due to
> memory pressure.  Larger resource groups lock more buffers into memory,
> and cause more unnecessary I/O when resource group locks are transferred
> between nodes.
> 
> With this patch, when locking a resource group, only the resource group
> header is read in.  The other bitmap buffers (the resource group header
> contains part of the bitmap) are only read in on demand.
> 
> It would probably make sense to also only read in the resource group
> header on demand, when the resource group is modified, but since the
> header contains the number of free blocks in the resource group, there
> is a higher incentive to keep the header locked in memory after that.
> 
> Signed-off-by: Andreas Gruenbacher 
> ---

Hi,

This patch looks good. However, I'm concerned about its performance,
at least until we get some bitmap read-ahead in place.
(see "/* FIXME: Might want to do read-ahead here. */")

In particular, I'm concerned that it does the exact opposite of this
performance enhancement:

https://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2.git/commit/fs/gfs2?id=39b0f1e9290880a6c905f639e7db6b646e302a4f

So we need to test that 2GB rgrps (33 bitmap blocks) that are fragmented
don't have a disproportionate slowdown compared to, say, 128MB rgrps
because we end up spending so much time fetching pages out of page cache,
only to find them inadequate and doing brelse() again. And by fragmented,
I mean all 33 bitmap blocks have a small number of free blocks, thus
disqualifying them from having "full" status but still having a large
enough number of free blocks that we might consider searching them.
Some of the patches we did after that one may have mitigated the problem
to some extent, but we need to be sure we're not regressing performance.

Regards,

Bob Peterson