#7556: build/fold causes with ByteString unpack causes huge memory leak
------------------------------------+---------------------------------------
Reporter:  glguy                    |          Owner:                  
    Type:  bug                      |         Status:  new             
Priority:  normal                   |      Component:  libraries/base  
 Version:  7.6.1                    |       Keywords:                  
      Os:  Unknown/Multiple         |   Architecture:  Unknown/Multiple
 Failure:  Runtime performance bug  |      Blockedby:                  
Blocking:                           |        Related:                  
------------------------------------+---------------------------------------
Changes (by shachaf):

 * cc: shachaf@… (added)


Comment:

 The issue is
 [http://hackage.haskell.org/packages/archive/bytestring/0.10.2.0/doc/html/src
 /Data-ByteString.html#unpackFoldr here]:

 {{{
 unpack ps = build (unpackFoldr ps)
 {-# INLINE unpack #-}

 --
 -- Have unpack fuse with good list consumers
 --
 -- critical this isn't strict in the acc
 -- as it will break in the presence of list fusion. this is a known
 -- issue with seq and build/foldr rewrite rules, which rely on lazy
 -- demanding to avoid bottoms in the list.
 --
 unpackFoldr :: ByteString -> (Word8 -> a -> a) -> a -> a
 unpackFoldr (PS fp off len) f ch = withPtr fp $ \p -> do
     let loop q n    _   | q `seq` n `seq` False = undefined -- n.b.
         loop _ (-1) acc = return acc
         loop q n    acc = do
            a <- peekByteOff q n
            loop q (n-1) (a `f` acc)
     loop (p `plusPtr` off) (len-1) ch
 {-# INLINE [0] unpackFoldr #-}

 {-# RULES
 "ByteString unpack-list" [1]  forall p  .
     unpackFoldr p (:) [] = unpackBytes p
 }}}

 When we use `foldr`, `foldr/build` fusion turns the whole expression into
 an application of `unpackFoldr`, which is tail recursive and therefore not
 sufficiently lazy -- but also not strict in the accumulator, so it builds
 up a big thunk. In `example1`, fusion doesn't happen, so the fold happens
 over `unpackBytes` instead, which generates list in small chunks that can
 be processed lazily.

 This looks like a `bytestring` bug to me.

-- 
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/7556#comment:2>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
Glasgow-haskell-bugs mailing list
Glasgow-haskell-bugs@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs

Reply via email to