2009/11/9 Michael Alipio <daem0n...@yahoo.com>:
> Hi,
>>
>>  Do you need the fastest possible sort?
>
> I'm not even sure if I really need to worry about all these
> sorting techniques. My program just reads a text file
> (wordlist). It might be megabyte-sized or probably few
> gigabytes (i might also add size checking on this to be
> safe with File::Slurp). Then I will give the user an option
> of sorting it in various ways, like length, alphabetical,
> numerical, frequency of letters, etc.
>
> I see, so it all boils down to how expensive the
> comparison you're going to implement to fully benefit
> from these techniques.

Having a cheap comparison is important, but it's just one part of a
very large story. Donald Knuth, one of the masters of computer
science, wrote an entire textbook on searching and sorting:
http://www.amazon.com/Art-Computer-Programming-Sorting-Searching/dp/0201896850

This is a huge subject and we've barely scratched the surface. There
is no "it all boils down to".

> Now comes another question
> for me to find the answer to, how expensive the
> comparisons in my sorting function would be... I
> guess there's no other way for me to find this out
> than to try it out  myself.

Actually, there is a certain amount of reasoning possible with respect
to comparison functions: for example, a Schwartzian Transform will be
a win if the key calculation is more expensive than the comparison of
keys.

> What's worse is that
> there's also a "depends on the system" factor to
> consider as well. Sometimes I wish perl's motto
> is "there's only one best way to do it" so everyone
> would just agree on one way of doing something,
> so everyone would have the same beautiful
> and efficient code.

There is no programming language which will automatically optimise the
best possible sort for you. Not Perl, not Python, not Java, not C++.
If you want the best, you will have to learn the theory. A compsci
course or a thorough read of Knuth will go far.

> For now, I will probably just stick
> to using the built-in sort (just for sorting length,
> numbers, and letters), until I have gained enough
> knowledge about why it's necessary to use the
> other techniques, or how to do the benchmark
> myself.

Oh, it's not too hard to do benchmarking, use cmpthese from Benchmark:
http://search.cpan.org/~dapm/perl-5.10.1/lib/Benchmark.pm

Phil

--
To unsubscribe, e-mail: beginners-unsubscr...@perl.org
For additional commands, e-mail: beginners-h...@perl.org
http://learn.perl.org/


Reply via email to