Uri Guttman writes:
> >>>>> "DC" == Damian Conway <[EMAIL PROTECTED]> writes:
>   DC>      # Modtimewise numerically ascending...
>   DC>      @sorted = sort {-M $^a <=> -M $^b} @unsorted;
> 
>   DC>      # Fuzz-ifically...
>   DC>      sub fuzzy_cmp($x, $y) returns Int;
>   DC>      @sorted = sort &fuzzy_cmp, @unsorted;
> 
> ok, so that is recognizes as a compare sub due to the 2 arg sig. so does
> the sub must be defined/declared before the sort code is compiled?

Nope.  C<sort> is declared as a multimethod.  This works, too:

    $code = sub ($a, $b) { -M $a <=> -M $b };
    @sorted = sort $code, @unsorted;
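
To make the runtime dispatch concrete, here is a rough sketch in Python rather than Perl 6, since the syntax under discussion is still being designed. The helper name `sort_by` is hypothetical. It counts the callable's parameters at call time, so it does not matter whether the callable was defined before the call site was compiled:

```python
import inspect
from functools import cmp_to_key

def sort_by(criterion, items):
    """Hypothetical helper: choose sort behaviour from the callable's arity."""
    arity = len(inspect.signature(criterion).parameters)
    if arity == 1:
        # One parameter: treat it as a key extractor.
        return sorted(items, key=criterion)
    if arity == 2:
        # Two parameters: treat it as a comparator returning <0, 0 or >0.
        return sorted(items, key=cmp_to_key(criterion))
    raise TypeError(f"expected a 1- or 2-argument callable, got arity {arity}")

# The callable is stored in a variable, so it is only inspected at run time.
code = lambda a, b: (a > b) - (a < b)
by_cmp = sort_by(code, [10, 9, 100])                          # comparator path
by_key = sort_by(lambda s: len(s), ["ccc", "a", "bb"])        # key-extractor path
```

This only models the arity check; real multimethod dispatch would also consider argument types.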

>   DC> or with a single one-argument block/closure (to sort according
>   DC> whatever the specified key extractor returns):
> 
>   DC>      # Numerically ascending...
>   DC>      @sorted = sort {+ $^elem} @unsorted;
>   DC>      @sorted = sort {+ $_} @unsorted;
> 
> is $^elem special? or just a regular place holder? i see $_ will be set
> to each record as we discussed.

Those two statements are exactly the same in every way.  Well, except
how they're written.  $^elem is indeed a regular placeholder.  $_
becomes an implicit parameter when it is referred to, in the absence of
placeholders or another type of signature.

>   DC>      # Key-ifically...
>   DC>      sub get_key($elem) {...}
>   DC>      @sorted = sort &get_key, @unsorted;
> 
> and that is parsed as an extracter code call due to the single arg
> sig. again, it appears that it has to be seen before the sort code for
> that to work.

Nope.  Runtime dispatch as before.

>   DC> or with a single extractor/comparator pair (to sort according to the
>   DC> extracted key, using the specified comparator):
> 
>   DC>      # Modtimewise stringifically descending...
>   DC>      @sorted = sort {-M}=>{$^b cmp $^a} @unsorted;
> 
> so that is a single pair of extractor/comparator. but there is no comma
> before @unsorted. is that correct? see below for why i ask that.

Yes.  Commas may be omitted on either side of a block when used as an
argument.  I would argue that they should only be omitted on the right
side, so that this is unambiguous:

    if some_function { ... }  
    { ... }

Which might be parsed as either:

    if (some_function { ... }) { ... }

Or:

    if (some_function()) {...}
    {...}  # Bare block

>   DC> or with an array of comparators and/or key extractors and/or
>   DC> extractor-comparator pairs (to sort according to a cascading list of
>   DC> criteria):
> 
>   DC>      # Numerically ascending
>   DC>      # or else namewise stringifically descending case-insensitive
>   DC>      # or else modtimewise numerically ascending
>   DC>      # or else namewise fuzz-ifically
>   DC>      # or else fuzz-ifically...
>   DC>      @sorted = sort [ {+ $^elem},
>   DC>                       {$^b.name cmp $^a.name} is insensitive,
>   DC>                       {-M},
>   DC>                       {.name}=>&fuzzy_cmp,
>   DC>                       &fuzzy_cmp,
> 
> i see the need for commas in here as it is a list of criteria.
> 
>   DC>                     ],
> 
> but what about that comma? no other example seems to have one before the
> @unsorted stuff.

It's not a closure, so you need a comma.

>   DC>                     @unsorted;
> 
>   DC> If a key-extractor block returns number, then C<< <=> >> is used to
>   DC> compare those keys. Otherwise C<cmp> is used. In either case, the keys
>   DC> extracted by the block are cached within the call to C<sort>, to
>   DC> optimize subsequent comparisons against the same element. That is, a
>   DC> key-extractor block is only ever called once for each element being
>   DC> sorted.
> 
> where does the int optimizer come in? just as you had it before in the
> extractor code? that will need to be accessible to the optimizer if the
> GRT is to work correctly.

If the block provably returns an int, C<sort> might be able to optimize
for ints.  Several ways to provably return an int:

    my $extractor = sub ($arg) returns int { $arg.num };
    @sorted = sort $extractor, @unsorted;

Or with a smarter compiler:

    @sorted = sort { int .num } @unsorted;

Or C<sort> might even check whether all the return values are ints and
then optimize that way.  No guarantees: it's not a language-level issue.
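
The last option, checking the extracted keys at run time, might look roughly like this Python sketch. The name `sort_maybe_int` is hypothetical, and the packing trick is only a stand-in for whatever int-specific path an implementation might choose (it folds key and original index into one integer, loosely in the spirit of the GRT):

```python
def sort_maybe_int(extract, items):
    """Hypothetical sketch: use an int-specialised path when every key is an int."""
    keys = [extract(item) for item in items]   # each key extracted exactly once
    n = len(items)
    if all(type(k) is int for k in keys):
        # Pack key and original index into a single integer, so a plain
        # integer sort with no callback orders by (key, index).
        packed = sorted(k * n + i for i, k in enumerate(keys))
        order = [p % n for p in packed]        # recover the original index
        path = "int"
    else:
        # Generic path: order indices by their keys' ordinary comparison.
        order = sorted(range(n), key=keys.__getitem__)
        path = "generic"
    return path, [items[i] for i in order]

path, result = sort_maybe_int(lambda s: len(s), ["ccc", "a", "bb"])
```

Either path gives the same ordering; only the strategy differs, which is why this can stay an implementation detail.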

> i like that the key caching is defined here. 

Yeah.  This is a language-level issue, as the blocks might have
side-effects.
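
To see why this has to be specified rather than left to the optimizer, consider an extractor with a visible side effect. This is a small Python sketch; the counter and function names are illustrative only. Python's `sorted` with `key=` also computes each key once, which makes it a convenient stand-in:

```python
from functools import cmp_to_key

calls = {"cached": 0, "uncached": 0}

def make_extractor(label):
    def extract(word):
        calls[label] += 1      # side effect: count every invocation
        return len(word)
    return extract

words = ["pear", "fig", "banana", "kiwi", "plum"]

# Key-extractor style: each key is computed once and reused.
cached = make_extractor("cached")
by_key = sorted(words, key=cached)

# Comparator style: the extractor runs inside every comparison.
uncached = make_extractor("uncached")
by_cmp = sorted(words, key=cmp_to_key(lambda a, b: uncached(a) - uncached(b)))
```

Both calls produce the same order, but the observable number of extractor invocations differs, so a program could tell the two strategies apart.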

>   DC> Note that ambiguous cases like:
> 
>   DC>      @sorted = sort {-M}, {-M}, {-M};
> 
>   DC> will be dispatched according to the normal multiple dispatch semantics
>   DC> (which will mean that they will mean):
> 
>   DC>      @sorted = sort {-M}          <== {-M}, {-M};
> 
>   DC> and so one would need to write:
> 
>   DC>      @sorted = sort <== {-M}, {-M}, {-M};
> 
> that clears up that one for me.
> 
> this is very good overall (notwithstanding my few nits and
> questions). it will satisfy all sorts of sort users, even those who are
> out of sorts.

Agreed.  I'm very fond of it.

Luke
