Re: marking overhead, and on the cost of conditionals in hot code
Hello!

Andy Wingo wi...@pobox.com writes:

> I dropped into cachegrind, and it tells me this about scm_gc_mark in a
> simple guile -c 1 run:
>
>           .  void
>           .  scm_gc_mark (SCM ptr)
>     794,344  {
>     155,170  => ???:0x00024917 (77585x)
>     198,586    if (SCM_IMP (ptr))
>           .      return;
>           .
>     513,038    if (SCM_GC_MARK_P (ptr))
>           .      return;
>           .
>      84,580    if (!scm_i_marking)
>           .      {
>           .        static const char msg[]
>           .          = "Should only call scm_gc_mark() during GC.";
>           .        scm_c_issue_deprecation_warning (msg);
>           .      }
>           .
>      42,290    SCM_SET_GC_MARK (ptr);
>      63,435    scm_gc_mark_dependencies (ptr);
>   2,666,432  => /home/wingo/src/guile/vm/libguile/gc-mark.c:scm_gc_mark_dependencies (5222x)
>         704  => /usr/src/debugglibc-20081113T2206/elf/../sysdeps/i386/dl-trampoline.S:_dl_runtime_resolve (1x)
>     595,758  }
>
> I think that the items on the left are cycle counts, and are of relative
> importance. The => lines are the cumulative costs of the subroutines.

This is actually the output of Callgrind, and the left column is
instruction reads (Ir), which is not directly equivalent to the cycle
count, especially on a CISC arch (it's nevertheless a good approximation,
I'm just nitpicking ;-)).

> The salient point for me is that the scm_i_marking check slows down this
> function by about 10%! Also, that the majority of the time in this
> function is in the SCM_GC_MARK_P line.
>
> If I thought that we'd keep our GC, I would work at inlining this
> function, I think.

But it's a macro, isn't it?

Thanks,
Ludo'.
Re: marking overhead, and on the cost of conditionals in hot code
Neil Jerram neiljer...@googlemail.com writes:

> It seems like a lot of things are starting to depend on whether or not we
> move to BDW-GC. (This, the fix I just did for NetBSD, scm_init_guile,
> forthcoming work on threads and mutex locking inconsistencies, ...)
>
> We should aim to reach a definitive decision on this soon!

Right. Here's my (small) roadmap:

  1. Experiment a bit more with static allocation, notably for subrs, and
     see whether it's worth it.

  2. Provide additional benchmarking results, based on those by Clinger,
     Hansen et al., which are in the repo. I'd like to have a reasonable
     understanding of what they do, though.

Additional feedback from interested parties could also be helpful in
trying to reach a decision.

Thanks,
Ludo'.
marking overhead, and on the cost of conditionals in hot code
I dropped into cachegrind, and it tells me this about scm_gc_mark in a
simple guile -c 1 run:

          .  void
          .  scm_gc_mark (SCM ptr)
    794,344  {
    155,170  => ???:0x00024917 (77585x)
    198,586    if (SCM_IMP (ptr))
          .      return;
          .
    513,038    if (SCM_GC_MARK_P (ptr))
          .      return;
          .
     84,580    if (!scm_i_marking)
          .      {
          .        static const char msg[]
          .          = "Should only call scm_gc_mark() during GC.";
          .        scm_c_issue_deprecation_warning (msg);
          .      }
          .
     42,290    SCM_SET_GC_MARK (ptr);
     63,435    scm_gc_mark_dependencies (ptr);
  2,666,432  => /home/wingo/src/guile/vm/libguile/gc-mark.c:scm_gc_mark_dependencies (5222x)
        704  => /usr/src/debugglibc-20081113T2206/elf/../sysdeps/i386/dl-trampoline.S:_dl_runtime_resolve (1x)
    595,758  }

I think that the items on the left are cycle counts, and are of relative
importance. The => lines are the cumulative costs of the subroutines.

The salient point for me is that the scm_i_marking check slows down this
function by about 10%! Also, that the majority of the time in this
function is in the SCM_GC_MARK_P line.

If I thought that we'd keep our GC, I would work at inlining this
function, I think.

Andy
-- 
http://wingolog.org/