On 01 August 2005 16:05, Cale Gibbard wrote: > Your problem is, as you pointed out, that readFile does lazy IO. > Although the semantics of it can be a bit confusing at times, it is > useful for applications where you have a large file which is being > consumed, and you don't want to allocate all of the memory for it > before doing any processing. Laziness lets you read the file as needed > -- you may not even need it all, depending on what is being done. This > is quite helpful when you have something like a couple gigabytes of > data on disk which needs processing. It can however be confusing at > first that it may not finish reading the file before the file is > altered, or, in situations involving handles, before the handle is > closed. > > You can write a strict IO version of readFile in Haskell as follows: > > import IO > > hGetContents' hdl = do e <- hIsEOF hdl > if e then return [] > else do c <- hGetChar hdl > cs <- hGetContents' hdl > return (c:cs) > > readFile' fn = do hdl <- openFile fn ReadMode > xs <- hGetContents' hdl > hClose hdl > return xs > > If you use readFile', it will ensure that the entire file is read and > memory for the string is allocated before continuing. This ought to > solve your problem.
Note that hGetContents' is likely to be slow, because it does repeated hGetChar. Also it is not tail-recursive, so it will run out of stack for a large file. It would be better to read the whole file into memory but lazilly convert it to a String, like so: import System.IO import System.IO.Unsafe import Foreign import Data.Char readFile' f = do h <- openFile f ReadMode s <- hFileSize h fp <- mallocForeignPtrBytes (fromIntegral s) len <- withForeignPtr fp $ \buf -> hGetBuf h buf (fromIntegral s) lazySlurp fp 0 len lazySlurp :: ForeignPtr Word8 -> Int -> Int -> IO String lazySlurp fp ix len | ix == len = return [] | otherwise = do c <- withForeignPtr fp $ \p -> peekElemOff p ix cs <- unsafeInterleaveIO (lazySlurp fp (ix+1) len) return (chr (fromIntegral c) : cs) Actually, I recommend always using this version unless you need to deal with really large files. Performance won't be quite as good as readFile, but I've attached a tweaked version that performs much better. Cheers, Simon
readfile.hs
Description: readfile.hs
_______________________________________________ Haskell mailing list Haskell@haskell.org http://www.haskell.org/mailman/listinfo/haskell