#4334: Make lines stricter to fix space leak
----------------------------------+-----------------------------------------
    Reporter:  daniel.is.fischer  |       Owner:                   
        Type:  proposal           |      Status:  new              
    Priority:  normal             |   Component:  libraries/base   
     Version:  6.12.3             |    Keywords:  lines, space leak
    Testcase:                     |   Blockedby:                   
          Os:  Unknown/Multiple   |    Blocking:                   
Architecture:  Unknown/Multiple   |     Failure:  Other            
----------------------------------+-----------------------------------------
 I propose changing the implementation of lines to be stricter.

 The current implementation causes (in GHC) a space leak, so it is better
 to sacrifice a little laziness to fix that leak.

 Period of discussion: Three weeks, until 15^th^ October (because of ICFP).

 Current implementation:
 {{{
 lines                   :: String -> [String]
 lines ""                =  []
 lines s                 =  let (l, s') = break (== '\n') s
                            in  l : case s' of
                                         []      -> []
                                         (_:s'') -> lines s''
 }}}
 The relevant part of the core showing the leak (-O or -O2, similar for -O0
 but without inlining `break`):
 {{{
         let {
           ds1_sjM :: ([GHC.Types.Char], [GHC.Types.Char])
           LclId
           [Str: DmdType]
           ds1_sjM =
             case GHC.List.$wbreak @ GHC.Types.Char lvl1_rjW wild_B1
             of _ { (# ww1_aj3, ww2_aj4 #) ->
             (ww1_aj3, ww2_aj4)
             } } in
         GHC.Types.:
           @ GHC.Base.String
           (case ds1_sjM of _ { (l_aif, _) -> l_aif })
           (case ds1_sjM of _ { (_, s'_aih) ->
            case s'_aih of _ {
              [] -> GHC.Types.[] @ GHC.Base.String;
              : _ s''_adj -> Lines.lines s''_adj
            }
            })
 }}}
 `ds1_sjM` keeps a reference to the first line until the second is
 demanded, preventing it from being GCed.

 Current strictness properties:
 {{{
 lines _|_       = _|_
 lines (_|_:_|_) = _|_ : _|_
 }}}
 Proposed implementation:
 {{{
 lines                   :: String -> [String]
 lines ""                =  []
 lines s                 = case break (== '\n') s of
                             (l, s') -> l : case s' of
                                             []      -> []
                                             (_:s'') -> lines s''
 }}}
 Due to the pattern matching on `break`'s result, the pair is never
 considered a single entity, allowing the first line to be GCed before the
 second is demanded.

 Strictness properties of the proposed implementation:
 {{{
 lines _|_       = _|_
 lines (_|_:_|_) = _|_
 }}}
 Generally, a `_|_` as the first `Char` of a line would produce a `_|_`
 instead of the `_|_ : _|_` which the current implementation produces.
 `_|_` in any other place would be treated identically.

-- 
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/4334>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
_______________________________________________
Glasgow-haskell-bugs mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs

Reply via email to