Hi Adam,

Yes, that is how it is done! You can also append +RTS -s -RTS to your program's command line to see run-time statistics. The fifth line is "total memory in use", and it should be only a couple of MBytes.
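For reference, here is a minimal, self-contained sketch of that pattern which you can use to try out the statistics output. It is not your actual program: fakeLengths is a hypothetical stand-in for the sequence lengths that readIllumina and seqlength would give you, and average is only my guess at a simple mean.

```haskell
{-# LANGUAGE BangPatterns #-}

import Data.List (foldl')

-- Strict accumulator step: the bang patterns force both counters on every
-- step, so no chain of unevaluated (+) thunks builds up inside the tuple.
stats :: (Integer, Integer) -> Int -> (Integer, Integer)
stats (!count, !totalLength) len = (count + 1, totalLength + toInteger len)

-- Mean length; returns 0 for empty input to avoid dividing by zero.
-- (Assumption: the real 'average' may be defined differently.)
average :: (Integer, Integer) -> Double
average (0, _) = 0
average (count, totalLength) = fromInteger totalLength / fromInteger count

-- Hypothetical stand-in for the lengths of the reads in a fastq file.
fakeLengths :: Int -> [Int]
fakeLengths n = take n (cycle [36, 50, 75, 100])

main :: IO ()
main = print (average (foldl' stats (0, 0) (fakeLengths 5000000)))
```

If you save this as, say, Sketch.hs, compile it with something like "ghc -O2 -rtsopts Sketch.hs" and run "./Sketch +RTS -s -RTS", it should print the mean length followed by the statistics. The -rtsopts flag is needed at link time because -s is not among the RTS options enabled by default.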
The book "Real World Haskell" has a chapter on optimization. In general you get a good intuition after a while. The basic idea is to make the streaming part lazy enough that only a single block is allocated at a time, then make each calculation strict enough that no thunks are retained.

Haskell has a bunch of libraries that help with writing efficient code for all kinds of problems (numerics: vector, repa; streaming: conduit, pipes, ...). This makes many things automagical.

My method of choice for improving performance is "benchmarking" via +RTS -s -RTS to see if I'm leaking space or have totally screwed up the algorithm design. Then I hunt for the usual stuff: strictness annotations. If that fails, I'll just read the intermediate Core language, but that is only necessary for the kind of programs I write; you won't need the last part. ;-)

Best regards,
Christian

* Adam Sjøgren <a...@koldfront.dk> [23.07.2015 17:02]:
> Indeed, this is what I changed it into:
>
>   putStrLn . output . average . foldl' stats (0, 0) =<< readIllumina f
>     where stats (!count, !totalLength) s = (count+1,
>                                             totalLength+toInteger(seqlength s))
>
> And now it works fine on a fastq-file of 5.1 GB on my desktop with 16GB
> RAM.
>
> Thanks for the tips!
>
> Do you gradually get an intuitive feeling for when strictness is
> "necessary", is it something you'll handle when running into a problem,
> or do you do measurements?
>
>
>   Best regards,
>
>     Adam
>
> --
> "A cat has nine lives, but a bullfrog croaks          Adam Sjøgren
>  every day."                                     a...@koldfront.dk