Re: comparative study of iteration and ordered collections

Dave Long Mon, 09 Apr 2007 07:28:08 -0700

random_shuffle looks pretty hairy.

random_shuffle, as usually implemented, is based on swaps (establishing a random permutation) -- but one can also think of it as sorting with a "broken" comparison function which returns random results, so there may be a clever alternative to doing a real sort with an appended random key?
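
For reference, the non-clever baseline (a real sort on an appended random key) might look like the sketch below. The name shuffle_by_sort is made up for illustration, and this is the decorate-sort-undecorate idiom, not how random_shuffle itself is usually implemented.

```python
import random

def shuffle_by_sort(items, rng=random):
    """Shuffle by sorting on an appended random key (hypothetical helper)."""
    # Decorate: pair every item with an independent random key.
    keyed = [(rng.random(), item) for item in items]
    # Sort on the random key only, so the items themselves are never compared.
    keyed.sort(key=lambda pair: pair[0])
    # Undecorate: drop the keys again.
    return [item for _, item in keyed]
```

This only needs sequential access to the data plus whatever the sort needs, which is the point of asking whether it can be done without swaps.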

There's a non-clever way to make any sort stable, which is to add a
secondary sort key that is the original position in the list.  There
may be a clever way to do a stable mergesort without knowing the list
lengths in advance, but I don't know it.
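
A minimal sketch of that non-clever trick, assuming Python and using heapq (a heap-based sort is not stable on its own); stable_heap_sort is a hypothetical name:

```python
import heapq

def stable_heap_sort(items, key=lambda x: x):
    """Heap-based sort made stable by a secondary key: the original position."""
    # (key, position, item): positions are unique, so ties on key are broken
    # by original order and the items themselves are never compared.
    heap = [(key(item), pos, item) for pos, item in enumerate(items)]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]
```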

The idea of radix-sort is that one can reduce the problem of sorting on a wide key to several sorts on narrow keys -- adding a secondary sort key is a bit self-defeating for this use.
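
To make the wide-key/narrow-key reduction concrete, here is a rough least-significant-digit sketch. It assumes non-negative integers below 2**key_bits; the function name and parameters are made up for illustration. Each pass is a stable distribution on one narrow digit, which is why the passes compose without any extra position key.

```python
def radix_sort(values, key_bits=32, digit_bits=8):
    """LSD radix sort: one stable bucket pass per narrow digit of a wide key."""
    mask = (1 << digit_bits) - 1
    for shift in range(0, key_bits, digit_bits):
        buckets = [[] for _ in range(mask + 1)]
        for v in values:
            # Appending keeps equal digits in their current order (stable).
            buckets[(v >> shift) & mask].append(v)
        # Concatenate the buckets; only sequential access is needed.
        values = [v for bucket in buckets for v in bucket]
    return values
```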

The brute-force way to preserve stability is to reverse the reversed list in between passes, as used here with integers providing lists (and a pair of integers providing a tape):

http://en.literateprograms.org/Merge_sort_(dc)

The clever way (as used with actual tape drives) was to reverse the sense of comparison on each pass.
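
One way to read that, sketched under my own assumptions: lists are cons cells built as (head, tail) tuples, each merge conses onto an accumulator (so its output comes out reversed), and the next pass compensates by flipping both the comparison and the tie-break. All names here are hypothetical, and this is not the dc program linked above.

```python
def merge_reversing(a, b, descending, key):
    """Merge run a (earlier) with run b (later); the output is built reversed."""
    acc = None
    while a is not None or b is not None:
        if b is None:
            take_a = True
        elif a is None:
            take_a = False
        elif descending:
            # Inputs are descending with ties in reversed original order:
            # take the larger head, and on ties take from b first.
            take_a = key(a[0]) > key(b[0])
        else:
            # Inputs are ascending: take the smaller head, ties from a first.
            take_a = key(a[0]) <= key(b[0])
        if take_a:
            acc, a = (a[0], acc), a[1]
        else:
            acc, b = (b[0], acc), b[1]
    return acc

def merge_sort(items, key=lambda x: x):
    """Stable bottom-up mergesort that flips the comparison on each pass."""
    runs = [(x, None) for x in items]  # singleton runs
    descending = False                 # sense of the runs we currently hold
    while len(runs) > 1:
        merged = []
        for i in range(0, len(runs), 2):
            b = runs[i + 1] if i + 1 < len(runs) else None
            merged.append(merge_reversing(runs[i], b, descending, key))
        runs, descending = merged, not descending
    result = runs[0] if runs else None
    if descending:  # odd number of passes: one final reversal
        result = merge_reversing(result, None, True, key)
    out = []
    while result is not None:  # unpack the cons cells into a Python list
        out.append(result[0])
        result = result[1]
    return out
```

The only reversal left is the possible one at the very end; a leftover run in a pass is "merged" with an empty run so that it flips along with the others.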

That leaves only the heap and permutation operations.  I really don't
know about them.

I'd guess heaps and permutations are essentially swap-based, and that swaps imply random access. (permutations are closely related to insertion/selection-sort, which are not so far away from heap sort)

Knuth Vol 3, "Sorting and Searching", might be a good reference. Anything that was good for external sorting with tapes ought to be good with lists. (indeed, being good with tapes implies that one should be able to find more cache-friendly implementations than the standard singly-linked-list)

A remarkably large share of the uses of array indexing by numerical
variables I found in both my Python code and my JavaScript code were
for some variant of the string.join problem --- given an array of N
strings, connect them with N-1 commas, except when N=0, in which case
we should connect them with N=0 commas instead of N-1 = -1 commas.
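
The shape of the problem without any index arithmetic, as a small sketch: treat the first element specially and put the separator in front of every later one. join_with is a made-up name (Python's own str.join already does this); the point is only the structure.

```python
def join_with(strings, sep=","):
    """Join N strings with N-1 separators, and with none at all when N=0."""
    it = iter(strings)
    try:
        pieces = [next(it)]   # first element: no separator before it
    except StopIteration:
        return ""             # N=0: nothing to connect
    for s in it:
        pieces.append(sep)    # every later element is preceded by a separator
        pieces.append(s)
    return "".join(pieces)
```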

Squint at it properly, and tail-call optimization becomes an instance of string.join: instead of "a,b,c," (where x is a jump and ',' the equivalent return) we want to generate "a,b,c". The dual (context preservation when building argument lists) follows the same pattern: we only need to save/restore contexts if there are at least two overlapping evaluations.

This may be a practical reason for the last couple of decades' interest in monads: we prefer to program in an algebraic, tree-like form (code as well as data), while the iron prefers to handle linear sequences of code operating on contiguous chunks of data (op x state -> state') -- and monads provide machinery to go from tree inputs to flat results in a theoretically nice manner.


-Dave
