Re: marking overhead, and on the cost of conditionals in hot code

2009-01-17 Thread Ludovic Courtès
Hello!

Andy Wingo wi...@pobox.com writes:

 I dropped into cachegrind, and it tells me this about scm_gc_mark in a
 simple guile -c 1 run:

         .  void
         .  scm_gc_mark (SCM ptr)
   794,344  {
   155,170  => ???:0x00024917 (77585x)
   198,586    if (SCM_IMP (ptr))
         .      return;
         .
   513,038    if (SCM_GC_MARK_P (ptr))
         .      return;
         .
    84,580    if (!scm_i_marking)
         .      {
         .        static const char msg[]
         .          = "Should only call scm_gc_mark() during GC.";
         .        scm_c_issue_deprecation_warning (msg);
         .      }
         .
    42,290    SCM_SET_GC_MARK (ptr);
    63,435    scm_gc_mark_dependencies (ptr);
 2,666,432  => /home/wingo/src/guile/vm/libguile/gc-mark.c:scm_gc_mark_dependencies (5222x)
       704  => /usr/src/debugglibc-20081113T2206/elf/../sysdeps/i386/dl-trampoline.S:_dl_runtime_resolve (1x)
   595,758  }


 I think that the items on the left are cycle counts, and are of relative
 importance. The => lines are the cumulative costs of the subroutines.

This is actually the output of Callgrind, and the left column is
instruction reads (Ir), which is not directly equivalent to the cycle
count, especially on a CISC arch (it's nevertheless a good
approximation, I'm just nitpicking ;-)).

 The salient point for me is that the scm_i_marking check slows down
 this function by about 10%! Also, that the majority of the time in this
 function is in the SCM_GC_MARK_P line.

 If I thought that we'd keep our GC, I would work at inlining this
 function, I think.

But it's a macro, isn't it?

Thanks,
Ludo'.





Re: marking overhead, and on the cost of conditionals in hot code

2009-01-17 Thread Ludovic Courtès
Neil Jerram neiljer...@googlemail.com writes:

 It seems like a lot of things are starting to depend on whether or not
 we move to BDW-GC.  (This, the fix I just did for NetBSD,
 scm_init_guile, forthcoming work on threads and mutex locking
 inconsistencies, ...)  We should aim to reach a definitive decision on
 this soon!

Right.  Here's my (small) roadmap:

  1. Experiment a bit more with static allocation, notably for subrs,
     and see whether it's worth it.

  2. Provide additional benchmarking results, based on those by Clinger,
     Hansen et al., which are in the repo.  I'd like to have a
     reasonable understanding of what they do, though.

Additional feedback from interested parties could also be helpful in
trying to reach a decision.

Thanks,
Ludo'.





marking overhead, and on the cost of conditionals in hot code

2009-01-16 Thread Andy Wingo
I dropped into cachegrind, and it tells me this about scm_gc_mark in a
simple guile -c 1 run:

        .  void
        .  scm_gc_mark (SCM ptr)
  794,344  {
  155,170  => ???:0x00024917 (77585x)
  198,586    if (SCM_IMP (ptr))
        .      return;
        .
  513,038    if (SCM_GC_MARK_P (ptr))
        .      return;
        .
   84,580    if (!scm_i_marking)
        .      {
        .        static const char msg[]
        .          = "Should only call scm_gc_mark() during GC.";
        .        scm_c_issue_deprecation_warning (msg);
        .      }
        .
   42,290    SCM_SET_GC_MARK (ptr);
   63,435    scm_gc_mark_dependencies (ptr);
2,666,432  => /home/wingo/src/guile/vm/libguile/gc-mark.c:scm_gc_mark_dependencies (5222x)
      704  => /usr/src/debugglibc-20081113T2206/elf/../sysdeps/i386/dl-trampoline.S:_dl_runtime_resolve (1x)
  595,758  }


I think that the items on the left are cycle counts, and are of relative
importance. The => lines are the cumulative costs of the subroutines.

The salient point for me is that the scm_i_marking check slows down
this function by about 10%! Also, that the majority of the time in this
function is in the SCM_GC_MARK_P line.

If I thought that we'd keep our GC, I would work at inlining this
function, I think.

Andy
-- 
http://wingolog.org/