Christian Höner writes: > welcome ;-)
Thanks! > What you still need to do is to make @stats@ and @foldl@ more strict. > For this you should check out @foldl'@ (note the prime) and bang > patterns for @stats@. I'm intentionally not giving the solution! Ah, thanks for the hints - especially the second one; I was suspecting the first :-) > Once you have done that the program should work with arbitrarily long > inputs. Indeed, this is what I changed it into: {-# LANGUAGE BangPatterns #-} import System.Environment import Data.List import Bio.Sequence.FastQ import Bio.Core.Sequence main = do [f] <- getArgs putStrLn . output . average . foldl' stats (0, 0) =<< readIllumina f where stats (!count, !totalLength) s = (count+1, totalLength+toInteger(seqlength s)) average (count, totallength) = (count, totallength, t/c) where t = fromIntegral totallength :: Float c = fromIntegral count :: Float output (count, length, average) = "Count " ++ show count ++ "\n" ++ "Total " ++ show count ++ " records " ++ show length ++ " length " ++ show average ++ " average" And now it works fine on a fastq-file of 5.1 GB on my desktop with 16GB RAM. Let me see if I can find a bigger fastq-file to run it on... Yes, it works fine on the 33 GB file I just located. Thanks for the tips! Do you gradually get an intuitive feeling for when strictness is "necessary", is it something you'll handle when running into a problem, or do you do measurements? Best regards, Adam -- "A cat has nine lives, but a bullfrog croaks Adam Sjøgren every day." a...@koldfront.dk