--On Donnerstag, Februar 16, 2006 10:39:45 -0800 Dann Corbit <[EMAIL PROTECTED]> wrote:

He refers to counting sort and radix sort (which comes in most
significant digit and least significant digit format).  These are also
called distribution (as opposed to comparison) sorts.

These sorts are O(n) as a function of the data size, but really they are
O(M*n) where M is the average key length and n is the data set size.
(In the case of MSD radix sort M is the average length to completely
differentiate strings)

So if you have an 80 character key, then 80*log(n) will only be faster
I suppose you meant 80 * n here?

than n*log(n) when you have 2^80th elements -- in other words -- never.
I think this is wrong. You can easily improve Radix sort by a constant if you don't take single bytes as the digits but rather k-byte values. At least 2 byte should be possible without problems. This would give you 40 * n time, not 80 * n. And you assume that the comparision of an 80-byte wide value only costs 1, which might (and in many cases will be imho) wrong. Actually it migh mean to compare 80 bytes as well, but I might be wrong.

What I think as the biggest problem is the digit representation necessary for Radix-Sort in cases of locales which sort without looking at spaces. I assume that would be hard to implement. The same goes for the proposed mapping of string values onto 4/8-byte values.

Mit freundlichem Gruß
Jens Schicke
--
Jens Schicke                  [EMAIL PROTECTED]
asco GmbH                     http://www.asco.de
Mittelweg 7                   Tel 0531/3906-127
38106 Braunschweig            Fax 0531/3906-400

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

              http://archives.postgresql.org

Reply via email to