Uri Guttman writes:
> >>>>> "DC" == Damian Conway <[EMAIL PROTECTED]> writes:
> DC>     # Modtimewise numerically ascending...
> DC>     @sorted = sort {-M $^a <=> -M $^b} @unsorted;
>
> DC>     # Fuzz-ifically...
> DC>     sub fuzzy_cmp($x, $y) returns Int;
> DC>     @sorted = sort &fuzzy_cmp, @unsorted;
>
> ok, so that is recognizes as a compare sub due to the 2 arg sig. so does
> the sub must be defined/declared before the sort code is compiled?

Nope.  C<sort> is declared as a multimethod.  This works, too:

    $code = sub ($a, $b) { -M $a <=> -M $b };
    @sorted = sort $code, @unsorted;

> DC> or with a single one-argument block/closure (to sort according
> DC> whatever the specified key extractor returns):
>
> DC>     # Numerically ascending...
> DC>     @sorted = sort {+ $^elem} @unsorted;
> DC>     @sorted = sort {+ $_} @unsorted;
>
> is $^elem special? or just a regular place holder? i see $_ will be set
> to each record as we discussed.

Those two statements are exactly the same in every way.  Well, except
how they're written.  $^elem is indeed a regular placeholder.  $_
becomes an implicit parameter when it is referred to, in the absence of
placeholders or another type of signature.

> DC>     # Key-ifically...
> DC>     sub get_key($elem) {...}
> DC>     @sorted = sort &get_key, @unsorted;
>
> and that is parsed as an extracter code call due to the single arg
> sig. again, it appears that it has to be seen before the sort code for
> that to work.

Nope.  Runtime dispatch as before.

> DC> or with a single extractor/comparator pair (to sort according to the
> DC> extracted key, using the specified comparator):
>
> DC>     # Modtimewise stringifically descending...
> DC>     @sorted = sort {-M}=>{$^b cmp $^a} @unsorted;
>
> so that is a single pair of extractor/comparator. but there is no comma
> before @unsorted. is that correct? see below for why i ask that.

Yes.  Commas may be omitted on either side of a block when used as an
argument.  I would argue that they may only be omitted on the right
side, so that this is unambiguous:

    if some_function { ... } { ... }

Which might otherwise be parsed as either:

    if (some_function { ... }) { ... }

Or:

    if (some_function()) {...} {...}   # Bare block

> DC> or with an array of comparators and/or key extractors and/or
> DC> extractor-comparator pairs (to sort according to a cascading list of
> DC> criteria):
>
> DC>     # Numerically ascending
> DC>     # or else namewise stringifically descending case-insensitive
> DC>     # or else modtimewise numerically ascending
> DC>     # or else namewise fuzz-ifically
> DC>     # or else fuzz-ifically...
> DC>     @sorted = sort [ {+ $^elem},
> DC>                      {$^b.name cmp $^a.name} is insensitive,
> DC>                      {-M},
> DC>                      {.name}=>&fuzzy_cmp,
> DC>                      &fuzzy_cmp,
>
> i see the need for commas in here as it is a list of criteria.
>
> DC>                    ],
>
> but what about that comma? no other example seems to have one before the
> @unsorted stuff.

It's not a closure, so you need a comma.

> DC>                    @unsorted;
>
> DC> If a key-extractor block returns number, then C<< <=> >> is used to
> DC> compare those keys. Otherwise C<cmp> is used. In either case, the keys
> DC> extracted by the block are cached within the call to C<sort>, to
> DC> optimize subsequent comparisons against the same element. That is, a
> DC> key-extractor block is only ever called once for each element being
> DC> sorted.
>
> where does the int optimizer come in? just as you had it before in the
> extractor code? that will need to be accessible to the optimizer if the
> GRT is to work correctly.

If the block provably returns an int, C<sort> might be able to optimize
for ints.  Several ways to provably return an int:

    my $extractor = an int sub($arg) { $arg.num }
    @sorted = sort $extractor, @unsorted;

Or with a smarter compiler:

    @sorted = sort { int .num } @unsorted;

Or C<sort> might even check whether all the return values are ints and
then optimize that way.  No guarantees: it's not a language-level issue.

> i like that the key caching is defined here.

Yeah.  This is a language-level issue, as the blocks might have
side-effects.
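
To make the once-per-element guarantee concrete, here is a rough model
of it in Python rather than Perl 6, since the syntax proposed above
can't be run yet.  The names sort_by_key and noisy_length are made up
for illustration, and this only sketches the semantics described above,
not how C<sort> would actually be implemented.  The extractor records
every call, so the side-effect is visible:

```python
from functools import cmp_to_key


def sort_by_key(items, extractor, comparator=None):
    """Sort items by extractor(item), calling extractor once per item.

    Hypothetical helper: models an extractor (or extractor=>comparator
    pair) being handed to sort, with the extracted keys cached.
    """
    # Extract each key exactly once, up front, and keep it with its item.
    decorated = [(extractor(item), item) for item in items]
    if comparator is None:
        # No comparator given: use the keys' natural ordering.
        decorated.sort(key=lambda pair: pair[0])
    else:
        # Comparator given: it only ever sees the cached keys.
        decorated.sort(
            key=cmp_to_key(lambda p, q: comparator(p[0], q[0]))
        )
    # Throw the cached keys away and return the items.
    return [item for _, item in decorated]


calls = []  # side-effect log: one entry per extractor call


def noisy_length(word):
    calls.append(word)
    return len(word)


words = ["pear", "fig", "banana", "kiwi", "plum"]

# Key extractor only: ascending by length.
by_length = sort_by_key(words, noisy_length)

# Extractor plus comparator: descending by length.
longest_first = sort_by_key(words, len, lambda a, b: (b > a) - (b < a))
```

After the first call, calls holds exactly one entry per element, however
many comparisons the sort performed.  Without the cache that count would
depend on the sorting algorithm, which is why the guarantee has to be
made at the language level.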

> DC> Note that ambiguous cases like:
>
> DC>     @sorted = sort {-M}, {-M}, {-M};
>
> DC> will be dispatched according to the normal multiple dispatch semantics
> DC> (which will mean that they will mean):
>
> DC>     @sorted = sort {-M} <== {-M}, {-M};
>
> DC> and so one would need to write:
>
> DC>     @sorted = sort <== {-M}, {-M}, {-M};
>
> that clears up that one for me.
>
> this is very good overall (notwithstanding my few nits and
> questions). it will satisfy all sorts of sort users, even those who are
> out of sorts.

Agreed.  I'm very fond of it.

Luke