Hi Bulat,
This seems quite reasonable to me. Have you eyeballed the assembly
GCC produces to see that the hotpath is improved? If you can submit
a patch that would be great!
Cheers,
Edward
Excerpts from Bulat Ziganshin's message of 2014-10-14 10:08:59 -0700:
Hello Glasgow-haskell-users,
i'm looking at
https://github.com/ghc/ghc/blob/23bb90460d7c963ee617d250fa0a33c6ac7bbc53/rts/sm/Storage.c#L680
if i understand correctly, it's a speed-critical routine?
i think that it may be improved in this way:
StgPtr allocate (Capability *cap, W_ n)
{
    bdescr *bd;
    StgPtr p;

    TICK_ALLOC_HEAP_NOCTR(WDS(n));
    CCS_ALLOC(cap->r.rCCCS,n);

    /// here starts new improved code:
    bd = cap->r.rCurrentAlloc;
    if (bd == NULL || bd->free + n > bd->end) {
        if (n >= LARGE_OBJECT_THRESHOLD/sizeof(W_)) {
            /* large-object case: body not shown in this message */
        }
        if (bd->free + n <= bd->start + BLOCK_SIZE_W) {
            bd->end = min (bd->start + BLOCK_SIZE_W,
                           bd->free + LARGE_OBJECT_THRESHOLD);
            goto usual_alloc;
        }
    }
    /// and here it stops

usual_alloc:
    p = bd->free;
    bd->free += n;
    IF_DEBUG(sanity, ASSERT(*((StgWord8*)p) == 0xaa));
    return p;
}
i think it's obvious - we consolidate two if's on the critical path
into a single one, plus avoid one ADD by keeping the highly useful
bd->end pointer
further improvements may include removing the bd==NULL check by
initializing bd->free = bd->end = NULL, and moving the entire if body
into a separate slow_allocate() procedure marked noinline, with
allocate() probably marked as forceinline:
StgPtr allocate (Capability *cap, W_ n)
{
    bdescr *bd;
    StgPtr p;

    TICK_ALLOC_HEAP_NOCTR(WDS(n));
    CCS_ALLOC(cap->r.rCCCS,n);

    bd = cap->r.rCurrentAlloc;
    if (bd->free + n > bd->end)
        return slow_allocate(cap,n);

    p = bd->free;
    bd->free += n;
    IF_DEBUG(sanity, ASSERT(*((StgWord8*)p) == 0xaa));
    return p;
}
this change will greatly simplify the optimizer's work. in my
experience, current C++ compilers are weak at optimizing large
functions with complex execution paths, and such transformations really
improve the generated code
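
For readers who want to try the fast-path/slow-path split outside the RTS, here is a minimal standalone sketch of the same idea. It is not GHC code: the names arena, arena_alloc, arena_alloc_slow, ARENA_EMPTY and BLOCK_WORDS are all hypothetical, the slow path just calls malloc, and abandoned blocks are never freed. Instead of NULL, the empty arena points free and end at the same dummy word, which is an assumption made here so that the pointer subtraction stays well defined in C.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uintptr_t word;

enum { BLOCK_WORDS = 1024 };   /* hypothetical block size, in words */

typedef struct {
    word *free;   /* next unused word in the current block */
    word *end;    /* one past the last usable word */
} arena;

/* Empty arena: free == end, so the very first request fails the
   fast-path test and goes to the slow path, with no NULL check. */
static word arena_sentinel;
#define ARENA_EMPTY { &arena_sentinel, &arena_sentinel }

#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE
#endif

/* Slow path, kept out of line: fetch a fresh block big enough for n
   words. The previous block is simply abandoned in this toy version. */
static NOINLINE word *arena_alloc_slow(arena *a, size_t n)
{
    size_t words = n > (size_t)BLOCK_WORDS ? n : (size_t)BLOCK_WORDS;
    word *block = malloc(words * sizeof(word));
    if (block == NULL)
        return NULL;
    a->free = block + n;
    a->end  = block + words;
    return block;
}

/* Fast path: one comparison against the cached end pointer, one add. */
static inline word *arena_alloc(arena *a, size_t n)
{
    if (n > (size_t)(a->end - a->free))
        return arena_alloc_slow(a, n);
    word *p = a->free;
    a->free += n;
    return p;
}
```

A caller would start from `arena a = ARENA_EMPTY;` and then call `arena_alloc(&a, n)`. The test compares n with the remaining space rather than computing free + n, so it cannot form a pointer past the end of the block. Whether this actually produces better code than a single large function is something to confirm by reading the generated assembly.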