On 7 September 2012 07:28, Sven Torvinger <[email protected]> wrote:
> On Thursday, 6 September 2012 at 20:44:29 UTC, Walter Bright wrote:
>>
>> On 9/6/2012 10:50 AM, Benjamin Thaut wrote:
>>>
>>> I just tried profiling it with Very Sleepy but basically it only tells me
>>> for both versions that most of the time is spend in gcx.fullcollect.
>>> Just that the GDC version spends less time in gcx.fullcollect then the
>>> DMD version.
>>
>> Even so, that in itself is a good clue.
>
> my bet is on, cross-module-inlining of bitop.btr failing...
>
> https://github.com/D-Programming-Language/druntime/blob/master/src/gc/gcbits.d
>
>     version (DigitalMars)
>     {
>         version = bitops;
>     }
>     else version (GNU)
>     {
>         // use the unoptimized version
>     }
>     else version (D_InlineAsm_X86)
>     {
>         version = Asm86;
>     }
>
>     wordtype testClear(size_t i)
>     {
>         version (bitops)
>         {
>             return core.bitop.btr(data + 1, i); // this is faster!
>         }
>

You would be wrong. btr is a compiler intrinsic, so it is *always* inlined!

I'm leaning towards Walter here: I would very much like to see hard evidence of your claims. :-)

On a side note, though, GDC has bt, btr, bts, etc. as intrinsics in its compiler front-end, so it would be no problem to switch to version = bitops for version (GNU).

-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';
