Laurence Reeves writes:

> P Witte wrote:
>> Quicksort is so much faster than any stable sort I know
>> of, even when "cheating", that it's worth it ;o)
>>
> Yeah, yeah. I, of course, agree totally. I cheat /all/ the time.
Cheats I don't easily agree with are those that employ particular quirks
of particular hardware or software, and that end up being a nightmare to
debug or port years after the only programmer who ever understood them
OD'ed on Jolt..

> I did get a bit carried away... and perused the current net thinking on
> sort algorithms, and it would seem that the ancient "mergesort" has a
> lot going for it.
>
> You add O(n) extra memory to extend the key and make "quicksort"
> effectively stable. This adds cost to key comparisons.
>
> With "mergesort", a similar O(n) extra memory /may/ be needed. However,
> and this is a /big/ however, the worst case (admittedly low probability)
> time for "quicksort" is O(n*n), whereas "mergesort" stays at
> O(n*ln(n))... indeed, its worst case is only twice as slow as its best
> case.

To be frank, I don't understand my own implementation of quicksort in
QUICKSORT any more. Here are some timings (QPC2 on a 1.2 GHz PC):

CLS
FOR d% = 10000 TO 30000 STEP 10000
  DIM a%(d%), b%(d%)              : rem All data same
  t = DATE
  FOR i% = 1 TO 60
    QUICKSORT a%; 1
  END FOR i%
  PRINT DATE - t,                 : rem Data already sorted
  MATFILL a%, 0, 1: REMark a%() => 0, 1, .. d%
  t = DATE
  FOR i% = 1 TO 60
    QUICKSORT a%; 1
  END FOR i%
  PRINT DATE - t,                 : rem Random data
  FOR i% = 0 TO d%: a%(i%) = RND(0 TO d%)
  t = DATE
  FOR i% = 1 TO 60
    MATEQU b% TO a%: REMark b%() => a%()
  END FOR i%
  dt = DATE - t: PRINT dt,
  t = DATE
  FOR i% = 1 TO 60
    MATEQU b% TO a%: QUICKSORT a%; 1
  END FOR i%
  PRINT DATE - t - dt
  :
END FOR d%

Results (seconds for 60 runs; the four columns are: all data the same,
data already sorted, the copy alone, and random data less the copy):

 10   6   0    6
 21  18   0   13
 32  18   1   19

Doing something similar with strings (DIM a$(d%, 8), and all strings used
are 8 chars long) I get:

 32  17   0   26    (10000)
 68  38   1   56    (20000)

That's pretty close to n*ln(n) for the worst-case scenario? All data is
sorted in place. Only pointers are stacked. No sentinels. Don't know how
I did it, but it looks OK to me ;o)

> I said above that extra memory /may/ be needed, because it depends on
> how the data is held. There is /no/ extra memory (barring O(ln(n))) if
> the data starts off as a linked list. The algorithm is also highly
> effective if the data is stored on disc, etc, rather than in memory.
>
> Algorithm for "mergesort":
>
> If more than one element, split (arbitrarily) into two equal (or differ
> by one) chunks, recursively mergesort each, merge the two sorted chunks.
>
>       3767376
> split 376 7376
> split 3 76 7376
> split 3 7 6 7376
> merge 3 67 7376
> merge 367 7376
> split 367 73 76
> split 367 7 3 76
> merge 367 37 76
> split 367 37 7 6
> merge 367 37 67
> merge 367 3677
> merge 3366777

Nice! I used a version of mergesort to append a new sub-array to an
already sorted array. Faster than inserting each new item individually.
That was while I was still in the binary insertion sort era... (A small,
untested SBASIC sketch of the textbook mergesort is at the end of this
post.)

> And now... my fertile mind... what if the "split into two chunks"
> actually was the odd and even indices within the array? Curiously, this
> could be done as a SuperBasic slice. The merge operation becomes
> /horrible/, and I haven't thought it through at all (except that the
> worst case looks to be O(n*n)!), but the whole thing becomes (close to)
> in-place.
>
>
> PS. I always feel "shellsort" is magic.
> PPS. <http://en.wikipedia.org/wiki/Patience_sorting> looks rather
> interesting.

I did a shellsort implementation, but it didn't do it for me. In fact I
wrote SuperBasic versions of most of the algorithms in Knuth's TAOCP.
Only three of them made it to assembler. This one beats them all hands
down.
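Btw, to put the "extend the key to make quicksort stable" idea quoted
above into code: here is a minimal, untested SBASIC sketch. It is not the
machine-code QUICKSORT used in the timings; QSI and ix%() are made-up
names, it uses a plain Lomuto-style partition just to keep it short, and
it assumes SBASIC recursion as on QPC2/SMSQ/E. The idea is to sort an
index array rather than the data, breaking ties on the original position;
ix%() is the O(n) extra memory and the tie-break is the added comparison
cost.

REMark Untested sketch: stable "quicksort" by sorting indices and
REMark breaking ties on the original position. ix%() is the O(n) of
REMark extra memory; the extra compare is the added cost per comparison.
n% = 15
DIM a%(n%), ix%(n%)
FOR i% = 0 TO n%: a%(i%) = RND(0 TO 3): ix%(i%) = i%
QSI 0, n%
FOR i% = 0 TO n%: PRINT a%(ix%(i%)), ix%(i%)

DEFine PROCedure QSI (lo%, hi%)
  LOCal j%, s%, t%, pk%, pi%
  IF hi% <= lo% THEN RETurn
  pk% = a%(ix%(hi%)): pi% = ix%(hi%): REMark pivot key and its index
  s% = lo%
  FOR j% = lo% TO hi% - 1
    REMark "before the pivot", with ties broken on the original index,
    REMark which is what keeps equal keys in their original order
    IF a%(ix%(j%)) < pk% OR (a%(ix%(j%)) = pk% AND ix%(j%) < pi%) THEN
      t% = ix%(s%): ix%(s%) = ix%(j%): ix%(j%) = t%
      s% = s% + 1
    END IF
  END FOR j%
  t% = ix%(s%): ix%(s%) = ix%(hi%): ix%(hi%) = t%
  QSI lo%, s% - 1
  QSI s% + 1, hi%
END DEFine QSI

The same trick works with any unstable sort; the only price is the index
array and the extra compare whenever two keys are equal.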
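And here is a minimal, untested SBASIC sketch of the textbook top-down
mergesort Laurence describes (split in half, sort each half, merge). The
work array w%() is the "similar O(n) extra memory" he mentions; MSORT and
MERGE2 are made-up names, and again SBASIC recursion is assumed. It is
stable as it stands, because the merge takes from the left half on equal
keys.

REMark Untested sketch of the mergesort recipe quoted above.
REMark a%() is sorted in place; w%() is the O(n) work array.
n% = 20
DIM a%(n%), w%(n%)
FOR i% = 0 TO n%: a%(i%) = RND(0 TO n%)
MSORT 0, n%
FOR i% = 0 TO n%: PRINT a%(i%),
PRINT

DEFine PROCedure MSORT (lo%, hi%)
  LOCal m%
  IF hi% <= lo% THEN RETurn
  m% = (lo% + hi%) DIV 2
  MSORT lo%, m%
  MSORT m% + 1, hi%
  MERGE2 lo%, m%, hi%
END DEFine MSORT

DEFine PROCedure MERGE2 (lo%, m%, hi%)
  LOCal i%, j%, k%
  FOR k% = lo% TO hi%: w%(k%) = a%(k%)
  i% = lo%: j% = m% + 1
  FOR k% = lo% TO hi%
    IF i% > m% THEN
      REMark left half used up - take from the right
      a%(k%) = w%(j%): j% = j% + 1
    ELSE
      IF j% > hi% THEN
        REMark right half used up - take from the left
        a%(k%) = w%(i%): i% = i% + 1
      ELSE
        IF w%(i%) <= w%(j%) THEN
          REMark <= keeps equal keys in original order (stable)
          a%(k%) = w%(i%): i% = i% + 1
        ELSE
          a%(k%) = w%(j%): j% = j% + 1
        END IF
      END IF
    END IF
  END FOR k%
END DEFine MERGE2

The MERGE2 step on its own is essentially what I used above to append a
newly sorted batch of items to an already sorted array.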
Per

_______________________________________________
QL-Users Mailing List
http://www.q-v-d.demon.co.uk/smsqe.htm