Re: optimizing StgPtr allocate (Capability *cap, W_ n)

2014-10-16 Thread Edward Z. Yang
Hi Bulat,

This seems quite reasonable to me. Have you eyeballed the assembly
GCC produces to see that the hot path is improved? If you can submit
a patch, that would be great!

Cheers,
Edward

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


optimizing StgPtr allocate (Capability *cap, W_ n)

2014-10-14 Thread Bulat Ziganshin
Hello Glasgow-haskell-users,

I'm looking at
https://github.com/ghc/ghc/blob/23bb90460d7c963ee617d250fa0a33c6ac7bbc53/rts/sm/Storage.c#L680

If I understand correctly, this is a speed-critical routine?

I think it may be improved in this way:

StgPtr allocate (Capability *cap, W_ n)
{
    bdescr *bd;
    StgPtr p;

    TICK_ALLOC_HEAP_NOCTR(WDS(n));
    CCS_ALLOC(cap->r.rCCCS,n);

    /// here starts new improved code:

    bd = cap->r.rCurrentAlloc;
    if (bd == NULL || bd->free + n > bd->end) {
        if (n >= LARGE_OBJECT_THRESHOLD/sizeof(W_)) {
            ....
        }
        if (bd != NULL && bd->free + n <= bd->start + BLOCK_SIZE_W) {
            bd->end = min (bd->start + BLOCK_SIZE_W,
                           bd->free + LARGE_OBJECT_THRESHOLD/sizeof(W_));
            goto usual_alloc;
        }
        ....
    }

    /// and here it stops

usual_alloc:
    p = bd->free;
    bd->free += n;

    IF_DEBUG(sanity, ASSERT(*((StgWord8*)p) == 0xaa));
    return p;
}


I think the benefit is obvious: we consolidate the two ifs on the critical
path into a single one, and avoid one ADD by keeping a precomputed bd->end
pointer.
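
To make the single-comparison idea concrete, here is a minimal standalone
sketch in C. It is not the RTS code: block_t, limit (standing in for the
proposed bd->end), BLOCK_WORDS, LARGE_WORDS, alloc_words and alloc_slow are
all hypothetical names, and the large-object and new-block paths are stubbed
out as NULL returns. The cached limit is min(block end, free + threshold - 1),
so one comparison rejects both oversized requests and requests that do not
fit. The limit goes stale as free advances, but only in the conservative
direction, so the slow path refreshes it and retries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t word_t;           /* stand-in for the RTS word type W_ */

enum {
    BLOCK_WORDS = 512,              /* hypothetical block size, in words */
    LARGE_WORDS = 400               /* hypothetical large-object threshold, in words */
};

/* Hypothetical block descriptor, not GHC's bdescr. */
typedef struct {
    word_t *start;                  /* first word of the block */
    word_t *free;                   /* next unallocated word */
    word_t *limit;                  /* cached bound, the role bd->end plays above */
} block_t;

static unsigned slow_calls = 0;     /* instrumentation only: counts slow-path entries */

/* limit = free + min(words left in block, largest "small" request). */
static void block_set_limit(block_t *b)
{
    assert(b->free >= b->start && b->free <= b->start + BLOCK_WORDS);
    const size_t small_max = (size_t)LARGE_WORDS - 1;  /* n >= LARGE_WORDS counts as large */
    size_t room = (size_t)(b->start + BLOCK_WORDS - b->free);
    b->limit = b->free + (room < small_max ? room : small_max);
}

static void block_init(block_t *b, word_t *storage)
{
    b->start = storage;
    b->free  = storage;
    block_set_limit(b);
}

/* Slow path: everything that is not a plain pointer bump. */
static word_t *alloc_slow(block_t *b, size_t n)
{
    slow_calls++;
    if (n >= (size_t)LARGE_WORDS)
        return NULL;                /* large-object path elided in this sketch */
    if (n > (size_t)(b->start + BLOCK_WORDS - b->free))
        return NULL;                /* block full; fetching a new block elided */
    block_set_limit(b);             /* the cached limit was merely stale */
    word_t *p = b->free;
    b->free += n;
    return p;
}

/* Fast path: a single comparison against the cached limit.
 * "n > limit - free" is the overflow-safe spelling of "free + n > limit". */
static word_t *alloc_words(block_t *b, size_t n)
{
    if (n > (size_t)(b->limit - b->free))
        return alloc_slow(b, n);
    word_t *p = b->free;
    b->free += n;
    return p;
}
```

With these constants, a 100-word request on a fresh block takes the fast
path, a following 300-word request enters the slow path once to refresh the
limit, and a 400-word request is treated as large.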

Further improvements may include removing the bd==NULL check by
initializing bd->free=bd->end=NULL, and moving the entire if body
into a separate slow_allocate() procedure marked noinline, with
allocate() probably marked as forceinline:

StgPtr allocate (Capability *cap, W_ n)
{
    bdescr *bd;
    StgPtr p;

    TICK_ALLOC_HEAP_NOCTR(WDS(n));
    CCS_ALLOC(cap->r.rCCCS,n);

    bd = cap->r.rCurrentAlloc;
    if (bd->free + n > bd->end)
        return slow_allocate(cap,n);

    p = bd->free;
    bd->free += n;

    IF_DEBUG(sanity, ASSERT(*((StgWord8*)p) == 0xaa));
    return p;
}

This change will greatly simplify the optimizer's work. In my
experience, current C++ compilers are weak at optimizing large
functions with complex execution paths, and such transformations really
improve the generated code.
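
The hot/cold split described above can be sketched as follows. This is a
standalone illustration under assumptions, not a patch: arena_t, arena_alloc,
arena_alloc_slow and the NOINLINE/FORCEINLINE macros are hypothetical names,
and the attributes used are the GCC/Clang spellings of "noinline" and
"forceinline". Instead of free=end=NULL, the sketch starts with an empty dummy
range, which has the same effect (any nonzero request fails the fast check)
while keeping the pointer subtraction well defined in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__GNUC__)
#  define NOINLINE     __attribute__((noinline))
#  define FORCEINLINE  static inline __attribute__((always_inline))
#else
#  define NOINLINE
#  define FORCEINLINE  static inline
#endif

typedef uintptr_t word_t;           /* stand-in for the RTS word type W_ */

/* Hypothetical allocator state: a current chunk plus a pool to refill from. */
typedef struct {
    word_t  *free, *end;            /* current chunk: [free, end) is unallocated */
    word_t  *pool_next;             /* start of the next unused chunk */
    size_t   pool_left;             /* words remaining in the pool */
    size_t   chunk_words;           /* size of each chunk, in words */
    unsigned slow_calls;            /* instrumentation only */
} arena_t;

static word_t empty_area[1];        /* sentinel: free == end, so there is no room */

static void arena_init(arena_t *a, word_t *pool, size_t pool_words, size_t chunk_words)
{
    assert(chunk_words > 0 && chunk_words <= pool_words);
    a->free = a->end = empty_area;  /* no NULL check needed on the fast path */
    a->pool_next   = pool;
    a->pool_left   = pool_words;
    a->chunk_words = chunk_words;
    a->slow_calls  = 0;
}

/* Cold path, kept out of line so the fast path stays small.
 * It abandons whatever is left of the current chunk and starts a new one. */
static NOINLINE word_t *arena_alloc_slow(arena_t *a, size_t n)
{
    a->slow_calls++;
    if (n > a->chunk_words || a->pool_left < a->chunk_words)
        return NULL;                /* too large for a chunk, or pool exhausted */
    a->free = a->pool_next;
    a->end  = a->free + a->chunk_words;
    a->pool_next += a->chunk_words;
    a->pool_left -= a->chunk_words;
    word_t *p = a->free;
    a->free += n;
    return p;
}

/* Hot path: one comparison, then a pointer bump. */
FORCEINLINE word_t *arena_alloc(arena_t *a, size_t n)
{
    if (n > (size_t)(a->end - a->free))
        return arena_alloc_slow(a, n);
    word_t *p = a->free;
    a->free += n;
    return p;
}
```

One way to check whether a split like this actually helps is to compile a
caller with gcc -O2 -S and read the generated assembly for the inlined fast
path.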


-- 
Best regards,
 Bulat  mailto:bulat.zigans...@gmail.com
