On 01/08/2012 11:38, Joachim Breitner wrote:
Hello,

I’m still working on issues of performance vs. sharing; I assume that
some of the people here on the list have seen my "dup"-paper¹ as
referees.

I’m now wondering about an approach where the compiler (either
automatically or by user annotation; I’ll leave that question for later)
would mark some thunks as reentrant, i.e. simply skip the blackholing
and update frame pushing. A quick test showed that this should work
quite well. Take the usual example:

         import System.Environment
         main = do
             a <- getArgs
             let n = length a
             print n
             let l = [n..30000000]
             print $ last l + last l

This obviously leaks memory:

         $ ./Test +RTS -t
         0
         60000000
         <<ghc: 2400054760 bytes, 4596 GCs, 169560494/935354240 avg/max
         bytes residency (11 samples), 2121M in use, 0.00 INIT (0.00
         elapsed), 0.63 MUT (0.63 elapsed), 4.28 GC (4.29 elapsed) :ghc>>


I then modified the assembly (a crude but effective way of testing
this ;-)) to not push a stack frame:

$ diff -u Test.s Test-modified.s
--- Test.s      2012-08-01 11:30:00.000000000 +0200
+++ Test-modified.s     2012-08-01 11:29:40.000000000 +0200
@@ -56,20 +56,20 @@
        leaq -40(%rbp),%rax
        cmpq %r15,%rax
        jb .LcpZ
-       addq $16,%r12
-       cmpq 144(%r13),%r12
-       ja .Lcq1
-       movq $stg_upd_frame_info,-16(%rbp)
-       movq %rbx,-8(%rbp)
+       //addq $16,%r12
+       //cmpq 144(%r13),%r12
+       //ja .Lcq1
+       //movq $stg_upd_frame_info,-16(%rbp)
+       //movq %rbx,-8(%rbp)
        movq $ghczmprim_GHCziTypes_Izh_con_info,-8(%r12)
        movq $30000000,0(%r12)
        leaq -7(%r12),%rax
-       movq %rax,-24(%rbp)
+       movq %rax,-8(%rbp)
        movq 16(%rbx),%rax
-       movq %rax,-32(%rbp)
-       movq $stg_ap_pp_info,-40(%rbp)
+       movq %rax,-16(%rbp)
+       movq $stg_ap_pp_info,-24(%rbp)
        movl $base_GHCziEnum_zdfEnumInt_closure,%r14d
-       addq $-40,%rbp
+       addq $-24,%rbp
        jmp base_GHCziEnum_enumFromTo_info
  .Lcq1:
        movq $16,192(%r13)

Now it runs fast and slim (and did not crash on the first try, which I
find surprising after hand-modifying the assembly code):

         $ ./Test +RTS -t
         0
         60000000
         <<ghc: 4800054840 bytes, 9192 GCs, 28632/28632 avg/max bytes
         residency (1 samples), 1M in use, 0.00 INIT (0.00 elapsed), 0.73
         MUT (0.73 elapsed), 0.04 GC (0.04 elapsed) :ghc>>
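
For comparison, the effect of the assembly edit can be approximated at source level by turning the shared thunk into a function of a unit argument, so that each use rebuilds the list. The sketch below is only an illustration: the names range and sumOfLasts are hypothetical and not part of the original program, and with optimisation enabled GHC may still float the list out or merge the two calls, which would change the sharing behaviour.

```haskell
-- Hypothetical helper, not part of the original program: the unit argument
-- turns the shared thunk into a function, so each call builds a fresh list.
range :: Int -> Int -> () -> [Int]
range lo hi () = [lo .. hi]
{-# NOINLINE range #-}

-- Traverses two independently built lists, so neither has to stay in memory
-- while the other is consumed (assuming the optimiser does not re-share them).
sumOfLasts :: Int -> Int -> Int
sumOfLasts lo hi =
    let l = range lo hi
    in last (l ()) + last (l ())
```

Replacing `last l + last l` in the test program with `sumOfLasts n 30000000` would be one way to try this; whether it matches the residency of the hand-edited binary depends on the GHC version and flags.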


My question is: Has anybody worked in that direction? And are there any
fundamental problems with the current RTS implementation and such
closures?

Long ago GHC used to have an "update analyser" which would detect some thunks that would never be re-entered and omit the update frame on them. I wrote a paper about this many years ago, and there were other people working on similar ideas, some using types (e.g. linear types) - google for "update avoidance". As I understand it, you want to omit some updates in order to avoid space leaks, which is slightly different.

The StgSyn abstract syntax has an UpdateFlag on each StgRhs which lets you turn off the update, and I believe the code generator will respect it, although it is never actually turned off at the moment.
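
To make the role of such a flag concrete, here is a toy model, not GHC's actual code. The constructor names mirror what GHC's StgSyn module is believed to use and should be checked against the source; Thunk, newThunk, force and countEvaluations are hypothetical names invented for this sketch. An Updatable thunk caches its value after the first evaluation, while the other flags leave the thunk untouched, so it is recomputed on every entry.

```haskell
import Control.Monad (when)
import Data.IORef

-- Simplified stand-in for the per-closure flag described above.
data UpdateFlag = ReEntrant | Updatable | SingleEntry
    deriving (Eq, Show)

-- Toy thunk: a flag, the suspended computation, and a slot for the cached value.
data Thunk a = Thunk UpdateFlag (IO a) (IORef (Maybe a))

newThunk :: UpdateFlag -> IO a -> IO (Thunk a)
newThunk flag compute = Thunk flag compute <$> newIORef Nothing

-- Only Updatable thunks overwrite themselves with their value; for the other
-- flags the "update frame" is skipped and the computation runs again next time.
force :: Thunk a -> IO a
force (Thunk flag compute cache) = do
    cached <- readIORef cache
    case cached of
        Just v -> pure v
        Nothing -> do
            v <- compute
            when (flag == Updatable) $ writeIORef cache (Just v)
            pure v

-- Forces the same thunk twice and reports how often its body actually ran.
countEvaluations :: UpdateFlag -> IO Int
countEvaluations flag = do
    runs <- newIORef (0 :: Int)
    t <- newThunk flag (modifyIORef' runs (+ 1) >> pure (42 :: Int))
    _ <- force t
    _ <- force t
    readIORef runs
```

In this model the trade-off from the thread is visible directly: skipping the update costs a second evaluation, but nothing keeps the first result alive between the two uses.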

Cheers,
        Simon

_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
