Andriy Gapon wrote this message on Tue, Nov 05, 2013 at 12:12 +0200:
> on 08/10/2013 04:38 Xin LI said the following:
> > Author: delphij
> > Date: Tue Oct  8 01:38:24 2013
> > New Revision: 256132
> > URL: http://svnweb.freebsd.org/changeset/base/256132
> > 
> > Log:
> >   Improve lzjb decompress performance by reorganizing the code
> >   to tighten the copy loop.
> >   
> >   Submitted by:     Denis Ahrens <denis h3q com>
> >   MFC after:        2 weeks
> >   Approved by:      re (gjb)
> > 
> > Modified:
> >   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/lzjb.c
> > 
> > Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/lzjb.c
> > ==============================================================================
> > --- head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/lzjb.c      Mon Oct 
> >  7 22:30:03 2013        (r256131)
> > +++ head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/lzjb.c      Tue Oct 
> >  8 01:38:24 2013        (r256132)
> > @@ -117,7 +117,9 @@ lzjb_decompress(void *s_start, void *d_s
> >                     src += 2;
> >                     if ((cpy = dst - offset) < (uchar_t *)d_start)
> >                             return (-1);
> > -                   while (--mlen >= 0 && dst < d_end)
> > +                   if (mlen > (d_end - dst))
> > +                           mlen = d_end - dst;
> > +                   while (--mlen >= 0)
> >                             *dst++ = *cpy++;
> >             } else {
> >                     *dst++ = *src++;
> > 
> 
> Intuitively it might seem that this change is indeed an improvement.
> But given how "not dumb" (not sure if that always amounts to smart) the modern
> compilers are, has anyone actually measured if this change indeed improves the
> performance?
> 
> Judging from the conversations on the ZFS mailing lists this change event hurt
> performance in some environments.
> Looks like the upstream is not going to take this change.
> 
> So does it make any sense for us to make the code more obscure and different
> from upstream?  Unless performance benefits could indeed be demonstrated.

The old code compiles to:
   62c13:       49 f7 de                neg    %r14
   62c16:       44 8d 58 ff             lea    -0x1(%rax),%r11d
   62c1a:       4c 89 d3                mov    %r10,%rbx
   62c1d:       0f 1f 00                nopl   (%rax)
   62c20:       42 8a 14 33             mov    (%rbx,%r14,1),%dl
   62c24:       88 13                   mov    %dl,(%rbx)
   62c26:       48 ff c3                inc    %rbx
   62c29:       ff c8                   dec    %eax
   62c2b:       85 c0                   test   %eax,%eax
   62c2d:       7f f1                   jg     62c20 <lzjb_decompress+0xa0>

The load/store line is addresses 62c20-62c29...  So it looks like clang
is already doing the optimization that was added...

-- 
  John-Mark Gurney                              Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"

Reply via email to