Hi Alex, > What user-defined less-than relation do you have in mind?

I am implementing multi-methods in picolisp, see http://logand.com/picoWiki/multimethods What I need there is to order methods by their prototype, so that more specialized methods are called first. This "more specialized" thing is my lessThan predicate for sorting. > Built-in list functions can be used better and more efficiently to > do the pre- and post-processing. I think there are the following cases: 1) the built-in comparison (as implemented by compare()) is suitable for the problem at hand and then 'sort' and 'by' is fine. This saves overhead of calling 'apply' for comparison but introduces some consing when 'by' is used. It also assumes "absolute" ordering of list elements as understood by compare(). 2) a custom comparison is needed (no absolute ordering available). Then using 'sort' is not efficient because the pre-processing step is at least O(N^2). In this case, overhead of pre-processing (my function 'order' mentioned earlier) and/or consing is bigger than overhead of calling 'apply' to do the custom comparison. If I knew how to do the preprocessing step better, I would agree with you, but it seems to me that there isn't a better way in general. My reasoning here is that a sort function "knows" how to invoke the "custom" lessThan predicate less than O(N^2) times. If not, I have to do the pre-processing and compute *absolute* ordering values the way my 'order' function does, but that is not that great. What the 'order' function does is to compute those absolute values which the 'sort' function understands. There are situations though where these absolute values are not available for the problem at hand and then I need to pre-compute them. And computing this absolute ordering cannot be implemented "efficiently". I imagine other problems where the built-in 'sort' function is not "good enough", e.g. sorting strings by application specific rules like http://www.davekoelle.com/alphanum.html > (when (Lt X Y) > > would do directly what you intend. Yes, thanks, that's what I meant:-) > However, there is another problem with 'order', probably just a wrong > parenthesis: > >> (let S 0 >> (for Y Lst >> (when (apply Lt NIL X Y) >> (inc 'S) ) ) ) >> (push 'Q (cons S X)) ) > > 'S' is unbound when 'cons' is called, so the CARs of all elements end up > with NIL (or whatever 'S' was before). Oops, sorry for the bug:-( The 'push' expression should have been under the 'let' expression... >> That is quite simple but not efficient. > > True, as the list is iterated proportional to the square of the length. > Why do you think this is necessary? Not sure what do you mean? Why it has to be O(N^2)? Because only a sort function can possibly know how to use a lessThan predicate less than O(N^2) times as this depends on the sorting strategy I think. >> Could the existing 'sort' function be extended for the scenario with a >> user-defined less-than relation? > > I decided back then to write 'sort' without a functional argument, as I > believe that it is more efficient this way. I understand that this choice is useful for many practical tasks. However, it also restricts the problems where I can use it. Sometimes a programmer builds more complicated data structures or assigns application specific meanings to things and then the built-in compare() is "not enough", it simply does not have the right meaning. > If 'sort' calls a function twice each time two elements need to be > compared, A good sorting algorithm should minimize the total number of comparisons. Ideally two elements would be compared once at most. If it happens that they are compared twice (occassionaly), overall the number of comparisons should be still as small as possible (some elements do not get compared at all...). > it will involve more overhead than simply applying this function > once to each argument before the call to 'sort'. Only if I have the absolute ordering expected by compare() readily at hand. Otherwise I have to do O(N^2) preprocessing in which case applying the function less times than O(N^2) should be cheeper. > 'sort' is much simpler this way and does not have to do any > bookkeeping. It certainly is (only slightly) simpler but not necessarily as useful. I have attached 'sort2' function which "extends" the existing 'sort' function and allows a custom lessThan predicate to be passed in. What do you think of it? You can see the difference running: (de lt (L R) (push '*Lt (list L R)) (> L R) ) : (off *Lt) -> NIL : (sort2 '(9 3 0 6 2 8 1 7 5 4 9 3 0 6 2 8 1 7 5 4) lt) -> (9 9 8 8 7 7 6 6 5 5 4 4 3 3 2 2 1 1 0 0) : (length *Lt) -> 183 : (off *Lt) -> NIL : (order '(9 3 0 6 2 8 1 7 5 4 9 3 0 6 2 8 1 7 5 4) lt) -> (9 9 8 8 7 7 6 6 5 5 4 4 3 3 2 2 1 1 0 0) : (length *Lt) -> 400 Of course that could be run as: : (by - sort '(9 3 0 6 2 8 1 7 5 4 9 3 0 6 2 8 1 7 5 4)) -> (9 9 8 8 7 7 6 6 5 5 4 4 3 3 2 2 1 1 0 0) but that assumes the absolute ordering (something like readily available mapping to integer numbers). If the elements weren't numbers but arbitrary partial ordered set data, it would not work while the 'sort2' and 'order' examples would still work. Thank you, Tomas

(load "lib/gcc.l") (gcc "sort2" NIL 'sort2) static int cmp(any lt, any l, any r) { cell c[2] = {*l, *r}; any z = apply(NULL, lt, NO, 2, c); return z == T ? -1 : 1; } #define compare(l, r) (lt == Nil ? compare(l, r) : cmp(lt, l, r)) /* List Merge Sort: Bill McDaniel, DDJ Jun99 */ // (sort2 'lst 'lt) -> lst any sort2(any x) { int i; any p, in[2], out[2], last; any *tail[2]; any lt; x = cdr(x); if (!isCell(out[0] = EVAL(car(x)))) return out[0]; lt = EVAL(cadr(x)); out[1] = Nil; do { in[0] = out[0]; in[1] = out[1]; i = isCell(in[1]) && compare(in[0], in[1]) >= 0; if (isCell(p = in[i])) in[i] = cdr(in[i]); out[0] = p; tail[0] = &cdr(p); last = out[0]; cdr(p) = Nil; i = 0; out[1] = Nil; tail[1] = &out[1]; while (isCell(in[0]) || isCell(in[1])) { if (!isCell(in[1])) { if (isCell(p = in[0])) in[0] = cdr(in[0]); if (compare(p,last) < 0) i = 1-i; } else if (!isCell(in[0])) { p = in[1], in[1] = cdr(in[1]); if (compare(p,last) < 0) i = 1-i; } else if (compare(in[0],last) < 0) { if (compare(in[1],last) >= 0) p = in[1], in[1] = cdr(in[1]); else { if (compare(in[0],in[1]) < 0) p = in[0], in[0] = cdr(in[0]); else p = in[1], in[1] = cdr(in[1]); i = 1-i; } } else { if (compare(in[1],last) < 0) p = in[0], in[0] = cdr(in[0]); else { if (compare(in[0],in[1]) < 0) p = in[0], in[0] = cdr(in[0]); else p = in[1], in[1] = cdr(in[1]); } } *tail[i] = p; tail[i] = &cdr(p); cdr(p) = Nil; last = p; } } while (isCell(out[1])); return out[0]; } /**/