Hello! On Fri, Nov 27, 2020 at 01:22:46PM -0500, [email protected] wrote:
> # HG changeset patch > # User Noah Goldstein <[email protected]> > # Date 1606497081 18000 > # Fri Nov 27 12:11:21 2020 -0500 > # Node ID 7ec2fc7b29d6614df28152dd4a895e6139138890 > # Parent 90cc7194e993f8d722347e9f46a00f65dffc3935 > Replaced loop with __builtin_ctzl for detecting first 0 in bitnamp > No particular pressing reason for this change other than the performance > benefit it yields. Converts a loop that could take 63 iterations (and had a > branch) to 5 instructions > > diff -r 90cc7194e993 -r 7ec2fc7b29d6 src/core/ngx_slab.c > --- a/src/core/ngx_slab.c Fri Nov 27 00:01:20 2020 +0300 > +++ b/src/core/ngx_slab.c Fri Nov 27 12:11:21 2020 -0500 > @@ -235,36 +235,33 @@ > > if (bitmap[n] != NGX_SLAB_BUSY) { > > - for (m = 1, i = 0; m; m <<= 1, i++) { > - if (bitmap[n] & m) { > - continue; > + m = ~bitmap[n]; > + i = __builtin_ctzl(m); Thank you for the patch. Unfortunately, __builtin_ctzl() is not portable and hence the patch cannot be committed for obvious reasons. Further, even if __builtin_ctzl() is available, there no guarantees that it can be used on an uintptr_t variable, as uintptr_t can be larger than long (notably, 64bit Windows uses LLP64 data model). Also note that there are other similar loops in the various places of the code, and changing just one would certainly confuse readers. Note well that slab allocations are expected to be rare, and this loop is not expected to be performance-critical. Most performance impact on slab allocations are expected to be from shared mutex locking. If you have practical reasons to assume this code needs to be optimized, please share the details. If it indeed needs to, we can consider making a portable solution. -- Maxim Dounin http://mdounin.ru/ _______________________________________________ nginx-devel mailing list [email protected] http://mailman.nginx.org/mailman/listinfo/nginx-devel
