pete-expires-20070513:
> [EMAIL PROTECTED] (Donald Bruce Stewart) writes:
>
> > pete-expires-20070513:
> >> When using readFile to process a large number of files, I am exceeding
> >> the resource limits for the maximum number of open file descriptors on
> >> my system. How can I enhance my program to deal with this situation
> >> without making significant changes?
> >
> > Read in data strictly, and there are two obvious ways to do that:
> >
> >     -- Via strings:
> >     readFileStrict f = do
> >         s <- readFile f
> >         length s `seq` return s
> >
> >     -- Via ByteStrings
> >     readFileStrict       = Data.ByteString.readFile
> >     readFileStrictString = liftM Data.ByteString.unpack . Data.ByteString.readFile
> >
> > If you're reading more than, say, 100k of data, I'd use strict
> > ByteStrings without hesitation. More than 10M, and I'd use lazy
> > bytestrings.
>
> Correct me if I'm wrong, but isn't this exactly what I wanted to
> avoid? Reading the entire file into memory? In my previous email, I
> was trying to say that I wanted to read the file lazily, because some
> of the files are quite large and there is no reason to read beyond the
> small set of headers. If I read the entire file into memory, that
> design goal is no longer met.
>
> Nevertheless, I was benchmarking with ByteStrings (both lazy and
> strict), and in both cases the ByteString versions of readFile yield
> the same error regarding max open files. Incidentally, the lazy
> bytestring version of my program was by far the fastest and used the
> least memory, but it still crapped out regarding max open files.
>
> So I'm back to square one. Any other ideas?
Hmm. Ok. So we need to have more hClose's happen somehow. Can you
process files one at a time?

-- Don
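
One way to get those hClose's without giving up on reading only the
headers is to read a bounded prefix of each file strictly and close the
handle before touching the next file. A minimal sketch, assuming the
headers fit in the first 4096 bytes (the byte count, the file names, and
the readHeader helper are all hypothetical, not from the original
program):

    import qualified Data.ByteString as B
    import Control.Exception (bracket)
    import Control.Monad (forM_)
    import System.IO (IOMode(ReadMode), hClose, openFile)

    -- Strictly read at most the first n bytes of a file, closing the
    -- handle before returning so the descriptor is released immediately.
    readHeader :: Int -> FilePath -> IO B.ByteString
    readHeader n path =
        bracket (openFile path ReadMode) hClose (\h -> B.hGet h n)

    main :: IO ()
    main = forM_ ["a.dat", "b.dat"] $ \f -> do  -- hypothetical file names
        hdr <- readHeader 4096 f
        print (B.length hdr)

Because bracket runs hClose as soon as the header has been read, at most
one descriptor is open at any time, and no file is ever read past its
first 4096 bytes.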
