Re: [PATCH] gfs2: Revert "Fix loop in gfs2_rbm_find"

2019-01-31 Thread Linus Torvalds
On Thu, Jan 31, 2019 at 11:12 AM Andreas Gruenbacher
 wrote:
>
> That patch just hasn't seen enough testing to make me comfortable
> sending it yet.

Ok, I'll just apply the revert.

Thanks,

  Linus


Re: [PATCH] gfs2: Revert "Fix loop in gfs2_rbm_find"

2019-01-31 Thread Andreas Gruenbacher
On Thu, 31 Jan 2019 at 19:41, Linus Torvalds
 wrote:
> On Wed, Jan 30, 2019 at 12:30 PM Andreas Gruenbacher  
> wrote:
> >
> > This reverts commit 2d29f6b96d8f80322ed2dd895bca590491c38d34.
> >
> > It turns out that the fix can lead to a ~20 percent performance regression
> > in initial writes to the page cache according to iozone.  Let's revert this
> > for now to have more time for a proper fix.
>
> Ugh. I think part of the problem is that the
>
> n += (rbm->bii - initial_bii);
>
> doesn't make any sense when we just set rbm->bii to 0 a couple of
> lines earlier. So it's basically a really odd way to write
>
> n -= initial_bii;
>
> which in turn really doesn't make any sense _either_.

Right, that code clearly doesn't make sense. The only positive about
it is that it hasn't caused any obvious issues so far.

> So I'l all for reverting the commit because it caused a performance
> regression, but the end result really is all kinds of odd.
>
> Is the bug as simple as "we incremented the iterator counter twice"?
> Because in the -E2BIG case, we first increment it by the difference in
> bii, but then we *also* increment it in res_covered_end_of_rgrp()
> (which we do *not* do for the "ret > 0" case, which goes to
> next_iter).

In the "ret > 0" case, rbm->bii should have already been incremented
in gfs2_reservation_check_and_update.

> So if somebody can run the performance test again, how does it look if
> *instead* of the revert, we do what looks at least _slightly_ more
> sensible, and move the "increment iteration count" up to the
> next-bitmap case instead?
>
> At that point, it will actually match the bii updates (because
> next_bitmap also updates rbm->bii). Hmm?
>
> Something like the whitespace-damaged thinig below. Completely
> untested. But *if* this works, it would make more sense than the
> revert..
>
> Hmm?

My idea was to fix that whole loop properly as in this patch instead:

  https://www.redhat.com/archives/cluster-devel/2019-January/msg00111.html

That patch just hasn't seen enough testing to make me comfortable
sending it yet. We're testing it right now, and my plan was to get it
into the next merge window. I don't think an intermediate workaround
would make much sense. Does that sound acceptable? Would you rather
have that fix sent to you sometime next week instead?

Thanks a lot,
Andreas


Re: [PATCH] gfs2: Revert "Fix loop in gfs2_rbm_find"

2019-01-31 Thread Linus Torvalds
On Wed, Jan 30, 2019 at 12:30 PM Andreas Gruenbacher
 wrote:
>
> This reverts commit 2d29f6b96d8f80322ed2dd895bca590491c38d34.
>
> It turns out that the fix can lead to a ~20 percent performance regression
> in initial writes to the page cache according to iozone.  Let's revert this
> for now to have more time for a proper fix.

Ugh. I think part of the problem is that the

n += (rbm->bii - initial_bii);

doesn't make any sense when we just set rbm->bii to 0 a couple of
lines earlier. So it's basically a really odd way to write

n -= initial_bii;

which in turn really doesn't make any sense _either_.

So I'l all for reverting the commit because it caused a performance
regression, but the end result really is all kinds of odd.

Is the bug as simple as "we incremented the iterator counter twice"?
Because in the -E2BIG case, we first increment it by the difference in
bii, but then we *also* increment it in res_covered_end_of_rgrp()
(which we do *not* do for the "ret > 0" case, which goes to
next_iter).

So if somebody can run the performance test again, how does it look if
*instead* of the revert, we do what looks at least _slightly_ more
sensible, and move the "increment iteration count" up to the
next-bitmap case instead?

At that point, it will actually match the bii updates (because
next_bitmap also updates rbm->bii). Hmm?

Something like the whitespace-damaged thinig below. Completely
untested. But *if* this works, it would make more sense than the
revert..

Hmm?

   Linus

--- snip snip ---
  diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
  index 831d7cb5a49c..5b1006d5344f 100644
  --- a/fs/gfs2/rgrp.c
  +++ b/fs/gfs2/rgrp.c
  @@ -1796,10 +1796,10 @@ static int gfs2_rbm_find(struct gfs2_rbm
*rbm, u8 state, u32 *minext,
rbm->bii++;
if (rbm->bii == rbm->rgd->rd_length)
rbm->bii = 0;
  + n++;
   res_covered_end_of_rgrp:
if ((rbm->bii == 0) && nowrap)
break;
  - n++;
   next_iter:
if (n >= iters)
break;


[PATCH] gfs2: Revert "Fix loop in gfs2_rbm_find"

2019-01-30 Thread Andreas Gruenbacher
This reverts commit 2d29f6b96d8f80322ed2dd895bca590491c38d34.

It turns out that the fix can lead to a ~20 percent performance regression
in initial writes to the page cache according to iozone.  Let's revert this
for now to have more time for a proper fix.

Cc: sta...@vger.kernel.org # v3.13+
Signed-off-by: Andreas Gruenbacher 
Signed-off-by: Bob Peterson 

---
 fs/gfs2/rgrp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
index 831d7cb5a49c4..17a8d3b439905 100644
--- a/fs/gfs2/rgrp.c
+++ b/fs/gfs2/rgrp.c
@@ -1780,9 +1780,9 @@ static int gfs2_rbm_find(struct gfs2_rbm *rbm, u8 state, 
u32 *minext,
goto next_iter;
}
if (ret == -E2BIG) {
-   n += rbm->bii - initial_bii;
rbm->bii = 0;
rbm->offset = 0;
+   n += (rbm->bii - initial_bii);
goto res_covered_end_of_rgrp;
}
return ret;
-- 
2.20.1