#4334: Make lines stricter to fix space leak
----------------------------------+-----------------------------------------
Reporter: daniel.is.fischer | Owner:
Type: proposal | Status: new
Priority: normal | Component: libraries/base
Version: 6.12.3 | Keywords: lines, space leak
Testcase: | Blockedby:
Os: Unknown/Multiple | Blocking:
Architecture: Unknown/Multiple | Failure: Other
----------------------------------+-----------------------------------------
I propose changing the implementation of lines to be stricter.
The current implementation causes (in GHC) a space leak, so it is better
to sacrifice a little laziness to fix that leak.
Period of discussion: Three weeks, until 15^th^ October (because of ICFP).
Current implementation:
{{{
lines :: String -> [String]
lines "" = []
lines s = let (l, s') = break (== '\n') s
in l : case s' of
[] -> []
(_:s'') -> lines s''
}}}
The relevant part of the core showing the leak (-O or -O2, similar for -O0
but without inlining `break`):
{{{
let {
ds1_sjM :: ([GHC.Types.Char], [GHC.Types.Char])
LclId
[Str: DmdType]
ds1_sjM =
case GHC.List.$wbreak @ GHC.Types.Char lvl1_rjW wild_B1
of _ { (# ww1_aj3, ww2_aj4 #) ->
(ww1_aj3, ww2_aj4)
} } in
GHC.Types.:
@ GHC.Base.String
(case ds1_sjM of _ { (l_aif, _) -> l_aif })
(case ds1_sjM of _ { (_, s'_aih) ->
case s'_aih of _ {
[] -> GHC.Types.[] @ GHC.Base.String;
: _ s''_adj -> Lines.lines s''_adj
}
})
}}}
`ds1_sjM` keeps a reference to the first line until the second is
demanded, preventing it from being GCed.
Current strictness properties:
{{{
lines _|_ = _|_
lines (_|_:_|_) = _|_ : _|_
}}}
Proposed implementation:
{{{
lines :: String -> [String]
lines "" = []
lines s = case break (== '\n') s of
(l, s') -> l : case s' of
[] -> []
(_:s'') -> lines s''
}}}
Due to the pattern matching on `break`'s result, the pair is never
considered a single entity, allowing the first line to be GCed before the
second is demanded.
Strictness properties of the proposed implementation:
{{{
lines _|_ = _|_
lines (_|_:_|_) = _|_
}}}
Generally, a `_|_` as the first `Char` of a line would produce a `_|_`
instead of the `_|_ : _|_` which the current implementation produces.
`_|_` in any other place would be treated identically.
--
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/4334>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
_______________________________________________
Glasgow-haskell-bugs mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs