claus.reinke:
> >Not necessarily so, since you are making assumptions about the
> >timeliness of garbage collection. I was similarly sceptical of Claus'
> >suggestion:
> >
> >Claus Reinke:
> >>in order to keep the overall structure, one could move readFile backwards
> >>and parseEmail forwards in the pipeline, until the two meet. then make
> >>sure that parseEmail completely constructs the internal representation
> >>of each email, thereby keeping no implicit references to the external
> >>representation.
>
> you are quite right to be skeptical!-) indeed, in the latest Handle
> documentation, we still find the following excuse for GHC:
>
> http://www.haskell.org/ghc/docs/latest/html/libraries/base/System-IO.html#t%3AHandle
>
>     GHC note: a Handle will be automatically closed when the garbage
>     collector detects that it has become unreferenced by the program.
>     However, relying on this behaviour is not generally recommended: the
>     garbage collector is unpredictable. If possible, use an explicit
>     hClose to close Handles when they are no longer required. GHC does
>     not currently attempt to free up file descriptors when they have
>     run out, it is your responsibility to ensure that this doesn't happen.
>
> this issue has been discussed in the past, and i consider it a bug if the
> memory manager tells me to handle memory myself;-) so i do hope that this
> infelicity will be removed in the future (run out of file descriptors ->
> run a garbage collection and try again, before giving up entirely).
>
> in fact, my local version had two variants of processFile - the one i
> posted and one with explicit file handle handling (the code was
> restructured this way exactly to hide this implementation decision in a
> single function). i did test both variants on a directory with lots of
> copies of a few emails (>2000 files), and both worked on my system, so i
> hoped -rather than checked- that the handle collection issue had finally
> been fixed, and made the mistake of removing the more complex variant
> before posting. thanks for pointing out that error - as the documentation
> above demonstrates, it isn't good to rely on assumptions, nor on tests.
>
> so here is the alternate variant of processFile (for which i imported
> System.IO):
>
> >processFile path = do
> >  f <- openFile path ReadMode
> >  text <- hGetContents f
> >  let email = parseEmail text
> >  email `seq` hClose f
> >  return email
>
> all this hassle to expose a file handle to call hClose on, just so that the
> GC does not have to..

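One caveat with the variant above: `seq` evaluates `email` only to weak head normal form, so closing the handle at that point is safe only if parseEmail really does construct the whole value, as suggested earlier in the thread. What follows is a minimal sketch of the same idea, using `withFile` (which also closes the handle if an exception is thrown) and an explicit forcing step. `Email`, `parseEmail` and `forceEmail` are hypothetical stand-ins for illustration, not the code from the thread.

```haskell
import Control.Exception (evaluate)
import System.IO

-- Hypothetical stand-in for the thread's email type.
data Email = Email
  { headers :: [String]
  , body    :: String
  } deriving (Show, Eq)

-- Hypothetical parser: header lines up to the first blank line, then the body.
parseEmail :: String -> Email
parseEmail text = Email hs (unlines (drop 1 rest))
  where
    (hs, rest) = break null (lines text)

-- Traverse every string to its end, which pulls the whole file
-- through the lazy read before we let go of the handle.
forceEmail :: Email -> ()
forceEmail (Email hs b) = length (concat hs) `seq` length b `seq` ()

processFile :: FilePath -> IO Email
processFile path =
  withFile path ReadMode $ \h -> do
    text <- hGetContents h
    let email = parseEmail text
    -- Force the parse now; withFile closes the handle when this block ends.
    evaluate (forceEmail email)
    return email
```

Without the forcing step, a lazy parser could leave part of `text` unread when the handle is closed, and the unread part would be silently lost.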
Are we at the point where we should consider adding some documentation on how
to deal with this issue? And would the recommendation be either to use strict
IO (should we have a package for System.IO.Strict?), or to rely on strictness
in the consumer of the data?

-- Don
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
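
For reference, the "strict IO" option can be sketched in a few lines using only System.IO and Control.Exception from base. The name `readFileStrict` is made up for this illustration and is not an existing library function.

```haskell
import Control.Exception (evaluate)
import System.IO

-- Hypothetical helper: read a whole file and close its handle before returning.
readFileStrict :: FilePath -> IO String
readFileStrict path =
  withFile path ReadMode $ \h -> do
    text <- hGetContents h
    -- length walks the entire string, so the file is fully read
    -- while the handle is still open.
    _ <- evaluate (length text)
    return text
```

With a helper like this, processFile could be written as `fmap parseEmail (readFileStrict path)`. The trade-off is that each file's contents are fully read into memory before parsing starts.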