On Wed, Jun 18, 2008 at 08:19:10PM +0200, Georg Sauthoff wrote: > Hi, > > I played a bit around with the nice bytestring package. At some point I > implemented a simple sorting program, because I needed line-sorting a file > with a custom line-compare function. I was a bit surprised, that the > resulting code is very fast. A lot of faster than sorting via GNU sort > (with the standard line-compare function). > > Then I got suspicious and tested standard GNU sort against a trivial > standard sort implementation in Haskell using bytestring. The Haskell > implementation looks like: > [snip] > Ok, strane ... Well, let's test with some 'normal' text: > > time ./sort < bible > /dev/null # ~ 0.4 s > time sort < bible > /dev/null # ~ 0.56 s > > Ok, not that different. But with Haskell you often expect to get very > slow code compared to an implementation in C. > > And I am surprised, that the Haskell is fast _and_ nice to read - because > for example the ultra fast 'wc -l' Haskell implementation from the > Haskell-Wiki uses some insider-knowledge about bytestring, and looks to > a beginner not that intuitive, I guess. > > ./sort is the shown Haskell implementation. I used ghc 6.8.2, installed > bytestring 0.9.1.0 as a user-local package (don't now I this superseeds > the global bytestring package), compiled via ghc -O2. The tests are run at a > Pentium M 1.3 GHz computer. As GNU sort I tested the version from > coreutils 6.9. > > You can get the test files from: > http://www.techfak.uni-bielefeld.de/~gsauthof/stuff/brackets.gz > http://www.techfak.uni-bielefeld.de/~gsauthof/stuff/bible.gz > (obviously, you have to gunzip them after downloading ...) > > Of course, the naive Haskell implementation doesn't care about locales > (i.e. collating locale sequences), but this shouldn't explain the > observed differences sorting the brackets file.
GNU 'sort' uses an external sort algorithm. You can, with 200M of memory, give it a 50G input file, and it will work. This might explain the difference.. Stefan
signature.asc
Description: Digital signature
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe