I was interested to see if I could determine what was happening with this. After some playing around, I noticed the code was running significantly faster if I *didn't* compile it, but ran it with 'runghc' instead (running under ghci was also fast).
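For context, here is a minimal sketch of the two listing styles discussed in the quoted thread: taking the first entry via zEntries, and printing every name via filesInArchive. This is not the Zip.hs that ships with the package, and it is not Pieter's script. It assumes the zip-archive API (toArchive, zEntries, eRelativePath, filesInArchive from Codec.Archive.Zip); the helper name firstEntryPath is my own invention for illustration.

```haskell
-- Sketch only: assumes the zip-archive package is installed.
-- Names used from Codec.Archive.Zip: Archive, toArchive, zEntries,
-- eRelativePath, filesInArchive.
import Codec.Archive.Zip
import qualified Data.ByteString.Lazy as B
import System.Environment (getArgs)

-- Hypothetical helper: path of the first entry, or Nothing for an
-- empty archive (avoids calling the partial 'head' directly).
firstEntryPath :: Archive -> Maybe FilePath
firstEntryPath archive =
  case zEntries archive of
    []      -> Nothing
    (e : _) -> Just (eRelativePath e)

main :: IO ()
main = do
  args <- getArgs
  case args of
    [path] -> do
      -- The whole file is read as a lazy ByteString and parsed in one go.
      archive <- fmap toArchive (B.readFile path)
      putStrLn ("first entry: " ++ maybe "(empty archive)" id (firstEntryPath archive))
      -- Same approach as the 'List' case quoted from Zip.hs.
      mapM_ putStrLn (filesInArchive archive)
    _ -> putStrLn "usage: ZipListSketch FILE.zip"
```

A program of this shape can be run either with runghc or as a compiled binary, which is the comparison timed below.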
Here are the running times I found. The 'Zip.hs' program comes with the zip-archive package. The runtime of the compiled version didn't seem to be affected by optimisations. Regardless, I'm quite surprised that running interpreted was significantly faster than compiled.

> time runghc ./Zip.hs -l ~/jdk1.6.0_05-src.zip
1.48s user 0.17s system 97% cpu 1.680 total

> time ./dist/build/Zip/Zip -l ~/jdk1.6.0_05-src.zip
89.00s user 1.06s system 98% cpu 1:31.84 total

The file 'jdk1.6.0_05-src.zip' was just an 18MB zip file I had lying around. I'm using ghc 6.12.1.

Cheers,

--
David Powell

On Tue, Aug 10, 2010 at 12:10 PM, Jason Dagit <da...@codersbase.com> wrote:
>
> On Mon, Aug 9, 2010 at 4:29 PM, Pieter Laeremans <pie...@laeremans.org> wrote:
>
>> Hello,
>>
>> I'm trying some Haskell scripting. I'm writing a script to print some
>> information from a zip archive. The zip-archive library does look nice,
>> but the performance of zip-archive/lazy bytestring doesn't seem to scale.
>>
>> Executing:
>>
>> eRelativePath $ head $ zEntries archive
>>
>> on an archive of around 12 MB with around 20 files yields
>>
>> Stack space overflow: current size 8388608 bytes.
>>
>
> So it's a stack overflow at about 8 megs. I don't have a strong sense of
> what is normal, but that seems like a small stack to me. Oh, actually I
> just checked and that is the default stack size :)
>
> I looked at Zip.hs (included as an example). The closest I see to your
> example is some code for listing the files in the archive. Perhaps you
> should try the supplied program on your archive and see if it too has a
> stack overflow.
>
> The line the author uses to list files is:
> List -> mapM_ putStrLn $ filesInArchive archive
>
> But you're taking the head of the entries, so I don't see how you'd be
> holding on to too much data. I just don't see anything wrong with your
> program. Did you remember to compile with optimizations? Perhaps try the
> author's way of listing entries and see if performance changes?
>
>>
>> The script in question can be found at:
>>
>> http://github.com/plaeremans/HaskellSnipplets/blob/master/ZipList.hs
>>
>> I'm using the latest version of the Haskell Platform. Are these libraries
>> not production ready, or am I doing something terribly wrong?
>>
>
> Not production ready would be my assumption. I think an iteratee style
> might be more appropriate for these sorts of nested streams of potentially
> large size anyway. I'm skeptical of anything that depends on lazy
> bytestrings or lazy IO. In this case, the performance would appear to
> depend on lazy bytestrings.
>
> You might want to experiment with increasing the stack size. Something
> like this:
> ./ZipList +RTS -K100M -RTS foo.zip
>
> Jason
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe