2009/11/9 Michael Alipio <daem0n...@yahoo.com>: > Hi, >> >> Do you need the fastest possible sort? > > I'm not even sure if I really need to worry about all these > sorting techniques. My program just reads a text file > (wordlist). It might be megabyte-sized or probably few > gigabytes (i might also add size checking on this to be > safe with File::Slurp). Then I will give the user an option > of sorting it in various ways, like length, alphabetical, > numerical, frequency of letters, etc. > > I see, so it all boils down to how expensive the > comparison you're going to implement to fully benefit > from these techniques.
Having a cheap comparison is important, but it's just one part of a very large story. Donald Knuth, one of the masters of computer science, wrote an entire textbook on searching and sorting: http://www.amazon.com/Art-Computer-Programming-Sorting-Searching/dp/0201896850 This is a huge subject and we've barely scratched the surface. There is no "it all boils down to". > Now comes another question > for me to find the answer to, how expensive the > comparisons in my sorting function would be... I > guess there's no other way for me to find this out > than to try it out myself. Actually, there is a certain amount of reasoning possible with respect to comparison functions: for example, a Schwartzian Transform will be a win if the key calculation is more expensive than the comparison of keys. > What's worse is that > there's also a "depends on the system" factor to > consider as well. Sometimes I wish perl's motto > is "there's only one best way to do it" so everyone > would just agree on one way of doing something, > so everyone would have the same beautiful > and efficient code. There is no programming language which will automatically optimise the best possible sort for you. Not Perl, not Python, not Java, not C++. If you want the best, you will have to learn the theory. A compsci course or a thorough read of Knuth will go far. > For now, I will probably just stick > to using the built-in sort (just for sorting length, > numbers, and letters), until I have gained enough > knowledge about why it's necessary to use the > other techniques, or how to do the benchmark > myself. Oh, it's not too hard to do benchmarking, use cmpthese from Benchmark: http://search.cpan.org/~dapm/perl-5.10.1/lib/Benchmark.pm Phil -- To unsubscribe, e-mail: beginners-unsubscr...@perl.org For additional commands, e-mail: beginners-h...@perl.org http://learn.perl.org/