Hi Simon,

JFTR, you seem to be after the trailing zeros in the code you commented.
If the bitmap is *really* that sparse then it might be profitable to
rewrite it in terms of __builtin_ctz (when present). Some CPUs even
have instructions for this.

http://hardwarebug.org/2010/01/14/beware-the-builtins/

Possibly one could even switch to checking *leading* zeros by
reformulating the algorithm and eliminate a few more instructions.

http://www.hackersdelight.org/ might be another source for inspiration.

Cheers,

    Gabor

On 1/20/15, [email protected] <[email protected]> wrote:
> Repository : ssh://[email protected]/ghc
>
> On branch  : master
> Link       :
> http://ghc.haskell.org/trac/ghc/changeset/9894f6a5b4883ea87fd5f280a2eb4a8abfbd2a6b/ghc
>
>>---------------------------------------------------------------
>
> commit 9894f6a5b4883ea87fd5f280a2eb4a8abfbd2a6b
> Author: Simon Marlow <[email protected]>
> Date:   Wed Jan 14 08:45:07 2015 +0000
>
>     comments only
>
>
>>---------------------------------------------------------------
>
> 9894f6a5b4883ea87fd5f280a2eb4a8abfbd2a6b
>  rts/sm/Scav.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/rts/sm/Scav.c b/rts/sm/Scav.c
> index 2ecb23b..781840c 100644
> --- a/rts/sm/Scav.c
> +++ b/rts/sm/Scav.c
> @@ -285,6 +285,8 @@ scavenge_large_srt_bitmap( StgLargeSRT *large_srt )
>
>      for (i = 0; i < size / BITS_IN(W_); i++) {
>          bitmap = large_srt->l.bitmap[i];
> +        // skip zero words: bitmaps can be very sparse, and this helps
> +        // performance a lot in some cases.
>          if (bitmap != 0) {
>              for (j = 0; j < BITS_IN(W_); j++) {
>                  if ((bitmap & 1) != 0) {
>
> _______________________________________________
> ghc-commits mailing list
> [email protected]
> http://www.haskell.org/mailman/listinfo/ghc-commits
>
_______________________________________________
ghc-devs mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/ghc-devs
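
To make the trailing-zeros suggestion concrete, here is a rough sketch of
what such a loop could look like. It is not the actual RTS code: word_t,
WORD_BITS, ctz_word and set_bit_indices are hypothetical names made up for
illustration, and the sketch only records the indices of the set bits at
the point where the real loop would process the corresponding SRT entry.
Note that __builtin_ctz itself takes an unsigned int, so a word-sized
bitmap would need __builtin_ctzl or __builtin_ctzll depending on the
platform, and the result is undefined for a zero argument.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for the RTS word type (the quoted code uses W_). */
typedef unsigned long word_t;

#define WORD_BITS (sizeof(word_t) * CHAR_BIT)

/* Count trailing zeros of a NONZERO word. */
static unsigned ctz_word(word_t w)
{
#if defined(__GNUC__) || defined(__clang__)
    return (unsigned)__builtin_ctzl(w);   /* undefined for w == 0 */
#else
    /* Portable fallback: shift until the lowest bit is set. */
    unsigned n = 0;
    while ((w & 1) == 0) {
        w >>= 1;
        n++;
    }
    return n;
#endif
}

/* Store the indices of the set bits among the first size_in_bits bits of
 * bitmap into out (at most cap entries), in ascending order.
 * Returns the number of indices stored. */
static size_t set_bit_indices(const word_t *bitmap, size_t size_in_bits,
                              size_t *out, size_t cap)
{
    size_t n = 0;
    size_t nwords = (size_in_bits + WORD_BITS - 1) / WORD_BITS;

    for (size_t i = 0; i < nwords; i++) {
        word_t w = bitmap[i];
        /* Zero words fall straight through; otherwise the inner loop
         * runs once per set bit instead of once per bit position. */
        while (w != 0) {
            size_t bit = i * WORD_BITS + ctz_word(w);
            if (bit >= size_in_bits || n == cap)
                return n;
            out[n++] = bit;
            w &= w - 1;   /* clear the lowest set bit */
        }
    }
    return n;
}
```

With a bitmap whose first word has bits 0 and 5 set, the call
set_bit_indices(bm, WORD_BITS, out, 8) would store 0 and 5 and return 2.
Whether this beats the plain shift-and-test loop depends on how sparse the
bitmaps really are and on the target CPU, so it would need measuring.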
