Hello, I’m still working on issues of performance vs. sharing; I must assume some of the people here on the list must have seen my "dup"-paper¹ as referees.
I’m now wondering about a approach where the compiler (either automatically or by user annotation; I’ll leave that question for later) would mark some thunks as reentrant, i.e. simply skip the blackholing and update frame pushing. A quick test showed that this should work quite well, take the usual example: import System.Environment main = do a <- getArgs let n = length a print n let l = [n..30000000] print $ last l + last l This obviously leaks memory: $ ./Test +RTS -t 0 60000000 <<ghc: 2400054760 bytes, 4596 GCs, 169560494/935354240 avg/max bytes residency (11 samples), 2121M in use, 0.00 INIT (0.00 elapsed), 0.63 MUT (0.63 elapsed), 4.28 GC (4.29 elapsed) :ghc>> I then modified the the assembly (a crude but effective way of testing this ;-)) to not push a stack frame: $ diff -u Test.s Test-modified.s --- Test.s 2012-08-01 11:30:00.000000000 +0200 +++ Test-modified.s 2012-08-01 11:29:40.000000000 +0200 @@ -56,20 +56,20 @@ leaq -40(%rbp),%rax cmpq %r15,%rax jb .LcpZ - addq $16,%r12 - cmpq 144(%r13),%r12 - ja .Lcq1 - movq $stg_upd_frame_info,-16(%rbp) - movq %rbx,-8(%rbp) + //addq $16,%r12 + //cmpq 144(%r13),%r12 + //ja .Lcq1 + //movq $stg_upd_frame_info,-16(%rbp) + //movq %rbx,-8(%rbp) movq $ghczmprim_GHCziTypes_Izh_con_info,-8(%r12) movq $30000000,0(%r12) leaq -7(%r12),%rax - movq %rax,-24(%rbp) + movq %rax,-8(%rbp) movq 16(%rbx),%rax - movq %rax,-32(%rbp) - movq $stg_ap_pp_info,-40(%rbp) + movq %rax,-16(%rbp) + movq $stg_ap_pp_info,-24(%rbp) movl $base_GHCziEnum_zdfEnumInt_closure,%r14d - addq $-40,%rbp + addq $-24,%rbp jmp base_GHCziEnum_enumFromTo_info .Lcq1: movq $16,192(%r13) Now it runs fast and slim (and did not crash on the first try, which I find surprising after hand-modifying the assembly code): $ ./Test +RTS -t 0 60000000 <<ghc: 4800054840 bytes, 9192 GCs, 28632/28632 avg/max bytes residency (1 samples), 1M in use, 0.00 INIT (0.00 elapsed), 0.73 MUT (0.73 elapsed), 0.04 GC (0.04 elapsed) :ghc>> My question is: Has anybody worked in that direction? And are there any fundamental problems with the current RTS implementation and such closures? Greetings, Joachim ¹ http://arxiv.org/abs/1207.2017 currently not about to appear anywhere else, but I have not given up hope yet :-) -- Dipl.-Math. Dipl.-Inform. Joachim Breitner Wissenschaftlicher Mitarbeiter http://pp.info.uni-karlsruhe.de/~breitner
signature.asc
Description: This is a digitally signed message part
_______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users