Re: The Sort Problem: a definitive ruling

2004-02-21 Thread Gordon Henriksen
On Friday, February 20, 2004, at 05:48 , Damian Conway wrote:

Joe Gottman asked:

   How do you decide whether a key-extractor block returns number?  Do 
you look at the signature,  or do you simply evaluate the result of 
the key-extractor for each element in the unsorted list?  For example, 
what is the result of the following code?
  sort {$_.key} (1 => 'a', 10 => 'b', 2 => 'c');
   There is nothing in the signature of the key-extractor to suggest 
that all the keys are numbers, but as it turns out they all are.  Will 
the sort end up being numerical or alphabetic?
Whilst I'd very much like it to analyse the keys, detect that they're 
all numbers, and use C<< <=> >>
Eek! Please don't even TRY to do that. It'd be creepy if the same call 
to sort could swap at runtime between numeric and string comparisons 
based upon its input. I would hope that the determination be made at 
compile time.

Consider the poor schmuck sorting new objects in preparation for a merge 
sort, only to find that his new array isn't sorted the same as his old 
array was, even though they came back from the exact same call to 
sort... Blech.
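
The hazard described above can be modelled outside Perl. What follows is a minimal Python sketch, assuming a hypothetical `guessing_sort` that inspects its input at run time and switches between numeric and string comparison. It is not how any Perl 6 sort was specified to behave; it only shows why the same call can order two batches by different rules.

```python
def guessing_sort(items):
    """Hypothetical sort that picks its comparison by inspecting the data."""
    def looks_numeric(text):
        try:
            float(text)
            return True
        except ValueError:
            return False

    if all(looks_numeric(text) for text in items):
        return sorted(items, key=float)   # every key parsed: numeric comparison
    return sorted(items)                  # otherwise: string comparison


old_batch = ["10", "9", "100"]
new_batch = ["10", "9", "100", "n/a"]     # one non-numeric key arrives

print(guessing_sort(old_batch))  # ['9', '10', '100']
print(guessing_sort(new_batch))  # ['10', '100', '9', 'n/a']
```

The two results cannot be merged safely, because they were not produced under the same ordering.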

But if sort's arguments were specifically typed, i.e.:

my @array of Int;
@array = sort @array;
Does this meet the "key extractor returns number" qualification?



Gordon Henriksen
[EMAIL PROTECTED]


Re: The Sort Problem: a definitive ruling

2004-02-20 Thread Smylers
Luke Palmer writes:

 Uri Guttman writes:
 
   DC == Damian Conway [EMAIL PROTECTED] writes:
 
DC  @sorted = sort {-M}=>{$^b cmp $^a} @unsorted;
  
  but there is no comma before @unsorted. is that correct?
 
 Yes.  Commas may be omitted on either side of a block when used as an
 argument.

That's what I thought too.  But Damian gave exactly the opposite answer
to Uri's question, claiming he'd made a typo and a comma would be
required.

So which is it -- is Luke right in saying that Damian was right in the
first place?  Or is Damian right in saying that his example was wrong?

Smylers



Re: The Sort Problem: a definitive ruling

2004-02-20 Thread Luke Palmer
Smylers writes:
 Luke Palmer writes:
 
  Uri Guttman writes:
  
DC == Damian Conway [EMAIL PROTECTED] writes:
  
 DC  @sorted = sort {-M}=>{$^b cmp $^a} @unsorted;
   
   but there is no comma before @unsorted. is that correct?
  
  Yes.  Commas may be omitted on either side of a block when used as an
  argument.
 
 That's what I thought too.  But Damian gave exactly the opposite answer
 to Uri's question, claiming he'd made a typo and a comma would be
 required.
 
 So which is it -- is Luke right in saying that Damian was right in the
 first place?  Or is Damian right in saying that his example was wrong?

I was wrong in saying that he was right.  Those aren't simple blocks, as
Damian said, so you need the comma.

Luke

 Smylers
 


Re: The Sort Problem: a definitive ruling

2004-02-20 Thread Smylers
Joe Gottman writes:

   sort {$_.key} (1 => 'a', 10 => 'b', 2 => 'c');
 
 There is nothing in the signature of the key-extractor to suggest that
 all the keys are numbers, but as it turns out they all are.

Are they?  I'd been presuming that pair keys would always be strings (as
for hashes in Perl 5), and that the C<< => >> operator would
automatically quote a preceding word, stringifying it (as in Perl 5).
So the keys above are strings, albeit ones composed only of digits.

Of course that doesn't actually help with your question, since there are
other data structures of which the same could be asked.

Smylers



Re: The Sort Problem: a definitive ruling

2004-02-20 Thread Luke Palmer
Smylers writes:
 Joe Gottman writes:
 
   sort {$_.key} (1 => 'a', 10 => 'b', 2 => 'c');
  
  There is nothing in the signature of the key-extractor to suggest that
  all the keys are numbers, but as it turns out they all are.
 
 Are they?  I'd been presuming that pair keys would always be strings (as
 for hashes in Perl 5), and that the C<< => >> operator would
 automatically quote a preceding word, stringifying it (as in Perl 5).
 So the keys above are strings, albeit ones composed only of digits.
 
 Of course that doesn't actually help with your question, since there are
 other data structures of which the same could be asked.

I think you're forgetting what language you're talking about.  Those are
numbers.  After this statement:

$x = '345';

C<$x> is a number.  I should hope it would be treated as one during
multimethod dispatch.

However, I'm not saying this with authority.  I'm just extrapolating.
If it's not correct, I'd appreciate that someone who knows correct me.

Luke


Re: The Sort Problem: a definitive ruling

2004-02-20 Thread Smylers
Luke Palmer writes:

 After this statement:
 
 $x = '345';
 
 C<$x> is a number.

Oh.  I'd been assuming that quote marks indicated strings, and that,
while a string containing only digits could obviously be treated as a
number (as in Perl 5), it wouldn't be one without being provoked.

 I should hope it would be treated as one during multimethod dispatch.

What about:

  $x = '0345';

Is that a number?  And if so, which of these is it the same as?

  $x = 345;
  $x = 0345;

What about if the variable contains a line read from user input?  As a
programmer I'd expect that to be a string -- and if a user happens to
type only digits then it'd be surprising to find the variable is
considered to be of a different type.

User input comes with a trailing line-break character, which would make
it not a number.  But would it suddenly become a number after
C<chomp>ing it?  Or if the input stream was auto-chomping?

Smylers


Re: The Sort Problem: a definitive ruling

2004-02-20 Thread Dan Sugalski
At 2:49 PM -0700 2/20/04, Luke Palmer wrote:
After this statement:

$x = '345';

C<$x> is a number.
No, it isn't. It's a string. Or, rather, it's a PerlScalar.

I should hope it would be treated as one during
multimethod dispatch.
I should certainly hope *not*. If so, it's a bug. We ought to go add 
some tests to the test suite once we expose this bit of the engine.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: The Sort Problem: a definitive ruling

2004-02-20 Thread Luke Palmer
Smylers writes:
 Luke Palmer writes:
 
  After this statement:
  
  $x = '345';
  
  C<$x> is a number.
 
 Oh.  I'd been assuming that quote marks indicated strings, and that,
 while a string containing only digits could obviously be treated as a
 number (as in Perl 5), it wouldn't be one without being provoked.
 
  I should hope it would be treated as one during multimethod dispatch.
 
 What about:
 
   $x = '0345';
 
 Is that a number?  And if so, which of these is it the same as?
 
   $x = 345;
   $x = 0345;

Well, since those are the same number, I imagine the, um, first?

Don't forget that octal numbers look like 0o345.

 What about if the variable contains a line read from user input?  As a
 programmer I'd expect that to be a string -- and if a user happens to
 type only digits then it'd be surprising to find the variable is
 considered to be of a different type.

Yeah, that's a tough question.  I'd want it to be a number if it were
only digits, unless I wanted it to be a string.  Since numbers and
strings are polymorphic with one another, maybe it's wrong to think that
we can differentiate.

But C<sort> has to know when to use C<< <=> >>.

Maybe you're right.  In the presence of multimethod dispatch, it might
be simpler just to tag something with either a num or str marker (I'm
completely neglecting Parrot's implementation for this discussion), and
treat it as its tag.  That wouldn't change the behavior of adding two
strings together, or concatenating two numbers, of course.

Luke


Re: The Sort Problem: a definitive ruling

2004-02-20 Thread Damian Conway
Uri wondered:

  DC No. C<< infix:<=> >> is the name of the binary C<< <=> >> operator.

so how is that allowed there without a block?
A Code object in a scalar context yields a Code reference.

Damian


Re: The Sort Problem: a definitive ruling

2004-02-20 Thread Damian Conway
Smylers wrote:

 sort {$_.key} (1 => 'a', 10 => 'b', 2 => 'c');

There is nothing in the signature of the key-extractor to suggest that
all the keys are numbers, but as it turns out they all are.
Are they?  I'd been presuming that pair keys would always be strings 
Nope.

 and that the C<< => >> operator would
automatically quote a preceding word, stringifying it (as in Perl 5).
Yes. But numbers aren't words. C<< => >> will continue to autostringify 
*identifiers*, as in Perl 6, and those keys aren't identifiers.

Of course, if you used the pairs to populate a hash, the *hash* will convert 
the non-stringific pair keys to stringific hash keys (unless the hash is 
defined to take non-string keys).

Damian


Re: The Sort Problem: a definitive ruling

2004-02-20 Thread Damian Conway
Luke wrote:

I think you're forgetting what language you're talking about.  Those are
numbers.  After this statement:
$x = '345';

C<$x> is a number.  

I don't think so. C<$x> is, of course, a variable. And what it contains after 
that statement will depend on whether the variable is explicitly typed or not.
If C<$x> is explicitly typed, the rvalue will have been converted to that type 
(if possible). If C<$x> is not explicitly typed (i.e. implicitly typed to 
C<Any>), then it will contain a string, since that's what the rvalue 
inherently is.

I should hope it would be treated as one during
multimethod dispatch.
I would expect that the multiple dispatch mechanism would allow the C<Str> in 
C<$x> to be coerced to a C<Num>, if that's the parameter type it was seeking 
to match. And I would expect that the distance of that coercion would be 1.


However, I'm not saying this with authority.  I'm just extrapolating.
If it's not correct, I'd appreciate that someone who knows correct me.
Ditto. ;-)

Damian


Re: The Sort Problem: a definitive ruling

2004-02-20 Thread Damian Conway
Smylers wrote:

Oh.  I'd been assuming that quote marks indicated strings, and that,
while a string containing only digits could obviously be treated as a
number (as in Perl 5), it wouldn't be one without being provoked.
Correct.


What about:

  $x = '0345';

Is that a number?  
Nope. A string (unless C<$x> is otherwise typed).


What about if the variable contains a line read from user input?  As a
programmer I'd expect that to be a string
You'd be right (unless, of course, the variable's storage type forced a 
coercion during the assignment).

Damian


The Sort Problem: a definitive ruling

2004-02-19 Thread Damian Conway
The design team discussed The Sort Problem during yesterday's 
teleconference. Here is Larry's decision: final, definitive, and unalterable 
(well...for this week at least ;-)

-cut-cut-cut-cut-cut-cut

C<sort> in Perl6 is a global multisub:

    multi sub *sort(Criterion @by: [EMAIL PROTECTED]) {...}
    multi sub *sort(Criterion $by: [EMAIL PROTECTED]) {...}
    multi sub *sort( : [EMAIL PROTECTED]) {...}

where:

    type KeyExtractor ::= Code(Any) returns Any;

    type Comparator   ::= Code(Any, Any) returns Int;

    type Criterion    ::= KeyExtractor
                        | Comparator
                        | Pair(KeyExtractor, Comparator)
                        ;

That means that we can call C<sort> without a block (to sort stringifically 
ascending with C<cmp>):

    # Stringifically ascending...
    @sorted = sort @unsorted;

or with a single two-argument block/closure (to sort by whatever the specified 
comparator is):

    # Numerically ascending...
    @sorted = sort {$^a <=> $^b} @unsorted;

    # Namewise stringifically descending case-insensitive...
    @sorted = sort {lc $^b.name cmp lc $^a.name}
                   @unsorted;
    # or...
    @sorted = sort {$^b.name cmp $^a.name} is insensitive
                   @unsorted;
    # or...
    @sorted = sort {$^a.name cmp $^b.name} is descending is insensitive
                   @unsorted;

    # Modtimewise numerically ascending...
    @sorted = sort {-M $^a <=> -M $^b} @unsorted;

    # Fuzz-ifically...
    sub fuzzy_cmp($x, $y) returns Int;
    @sorted = sort fuzzy_cmp, @unsorted;

or with a single one-argument block/closure (to sort according to whatever the 
specified key extractor returns):

    # Numerically ascending...
    @sorted = sort {+ $^elem} @unsorted;
    @sorted = sort {+ $_} @unsorted;

    # Namewise stringifically descending case-insensitive...
    @sorted = sort {~ $^elem.name} is descending is insensitive @unsorted;
    @sorted = sort {lc $^elem.name} is descending @unsorted;
    @sorted = sort {lc .name} is descending @unsorted;

    # Modtimewise numerically ascending...
    @sorted = sort {-M} @unsorted;

    # Key-ifically...
    sub get_key($elem) {...}
    @sorted = sort get_key, @unsorted;

or with a single extractor/comparator pair (to sort according to the extracted 
key, using the specified comparator):

    # Modtimewise stringifically descending...
    @sorted = sort {-M}=>{$^b cmp $^a} @unsorted;

    # Namewise fuzz-ifically...
    @sorted = sort {.name}=>fuzzy_cmp @unsorted;

or with an array of comparators and/or key extractors and/or 
extractor-comparator pairs (to sort according to a cascading list of criteria):

    # Numerically ascending
    # or else namewise stringifically descending case-insensitive
    # or else modtimewise numerically ascending
    # or else namewise fuzz-ifically
    # or else fuzz-ifically...
    @sorted = sort [ {+ $^elem},
                     {$^b.name cmp $^a.name} is insensitive,
                     {-M},
                     {.name}=>fuzzy_cmp,
                     fuzzy_cmp,
                   ],
                   @unsorted;
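
The cascading list of criteria above amounts to trying each comparison in turn until one of them separates the two elements. A rough Python model follows, assuming every criterion has already been reduced to a two-argument comparator. The names `cmp`, `cascade`, and the sample records are illustrative and not part of the proposal.

```python
from functools import cmp_to_key


def cmp(a, b):
    """Three-way comparison returning -1, 0 or 1."""
    return (a > b) - (a < b)


def cascade(*comparators):
    """Combine comparators: later ones only break ties left by earlier ones."""
    def compare(a, b):
        for comparator in comparators:
            result = comparator(a, b)
            if result != 0:
                return result
        return 0
    return compare


records = [("pear", 3), ("fig", 3), ("apple", 1)]
by_count = lambda a, b: cmp(a[1], b[1])        # numerically ascending
by_name_desc = lambda a, b: cmp(b[0], a[0])    # stringwise descending

result = sorted(records, key=cmp_to_key(cascade(by_count, by_name_desc)))
print(result)  # [('apple', 1), ('pear', 3), ('fig', 3)]
```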


If a key-extractor block returns number, then C<< <=> >> is used to compare 
those keys. Otherwise C<cmp> is used. In either case, the keys extracted by 
the block are cached within the call to C<sort>, to optimize subsequent 
comparisons against the same element. That is, a key-extractor block is only 
ever called once for each element being sorted.
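
The caching guarantee can be pictured as a decorate, sort, undecorate pass. This is a minimal Python sketch of that idea, not the specified implementation: `sort_by_key` and `noisy_length` are hypothetical names, and the call log exists only to show that the extractor runs once per element.

```python
def sort_by_key(extract, items):
    """Sort items by extract(item), calling extract exactly once per element."""
    # Decorate: cache each key next to its element (index keeps ties stable).
    decorated = [(extract(item), index, item) for index, item in enumerate(items)]
    # Sort on the cached key, then on original position.
    decorated.sort(key=lambda triple: triple[:2])
    # Undecorate: drop the cached keys again.
    return [item for _key, _index, item in decorated]


calls = []

def noisy_length(word):
    calls.append(word)        # record every invocation of the extractor
    return len(word)


words = ["pear", "fig", "banana", "kiwi"]
result = sort_by_key(noisy_length, words)
print(result)       # ['fig', 'pear', 'kiwi', 'banana']
print(len(calls))   # 4
```

Python's built-in `sorted(..., key=...)` already calls its key function once per element; the explicit version is shown only to make the caching step visible.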

If a key-extractor/comparator pair is specified, the key-extractor is the key 
of the pair and the comparator the value. The extractor is used to retrieve 
keys, which are then passed to the comparator.
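
As a sketch of how the three kinds of criterion might be told apart, the Python model below dispatches on the number of parameters a callable declares, and treats a two-element tuple as the extractor/comparator pair. This is an assumption made for illustration: `sort_with` and `cmp` are hypothetical names, and arity inspection merely stands in for Perl 6 multiple dispatch on signatures.

```python
import inspect
from functools import cmp_to_key


def cmp(a, b):
    """Three-way comparison returning -1, 0 or 1."""
    return (a > b) - (a < b)


def sort_with(criterion, items):
    if isinstance(criterion, tuple):            # (extractor, comparator) pair
        extract, compare = criterion
    else:
        arity = len(inspect.signature(criterion).parameters)
        if arity == 1:                          # key extractor
            extract, compare = criterion, cmp
        elif arity == 2:                        # comparator on the elements
            extract, compare = (lambda item: item), criterion
        else:
            raise TypeError("criterion must take one or two arguments")
    # Extract each key once, then compare only the extracted keys.
    keyed = [(extract(item), item) for item in items]
    keyed.sort(key=cmp_to_key(lambda x, y: compare(x[0], y[0])))
    return [item for _key, item in keyed]


names = ["bob", "Alice", "carol"]
print(sort_with(lambda s: s.lower(), names))            # ['Alice', 'bob', 'carol']
print(sort_with(lambda a, b: cmp(b, a), names))         # ['carol', 'bob', 'Alice']
print(sort_with((len, lambda a, b: cmp(b, a)), names))  # ['Alice', 'carol', 'bob']
```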

The C<is descending> and C<is insensitive> traits on a key extractor or a 
comparator are detected within the call to C<sort> (or possibly by the 
compiler) and used to modify the case-sensitivity and direction of any 
comparison operators used for the corresponding key or in the corresponding 
comparator.
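
One way to picture the two traits is as flags that adjust the comparison rather than the block itself. The Python sketch below assumes keyword arguments in place of traits; `sort_with_traits` is a hypothetical name, and `str.casefold` is used as a stand-in for whatever case folding the real trait would apply.

```python
def sort_with_traits(extract, items, descending=False, insensitive=False):
    """Sort by extract(item); the flags play the role of the two traits."""
    def key(item):
        value = extract(item)
        # 'insensitive' folds case on string keys only.
        if insensitive and isinstance(value, str):
            return value.casefold()
        return value
    # 'descending' reverses the direction of the comparison.
    return sorted(items, key=key, reverse=descending)


names = ["bob", "Alice", "carol"]
print(sort_with_traits(lambda s: s, names))
# ['Alice', 'bob', 'carol']
print(sort_with_traits(lambda s: s, names, descending=True, insensitive=True))
# ['carol', 'bob', 'Alice']
```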

Note that ambiguous cases like:

    @sorted = sort {-M}, {-M}, {-M};
    @sorted = sort {$^a <=> $^b}, {$^a <=> $^b}, {$^a <=> $^b};
    @sorted = sort [...], [...], [...];
    # etc.

will be dispatched according to the normal multiple dispatch semantics
(which will mean that they will mean):

    @sorted = sort {-M}  <== {-M}, {-M};
    @sorted = sort {$^a <=> $^b} <== {$^a <=> $^b}, {$^a <=> $^b};
    @sorted = sort [...] <== [...], [...];
    # etc.

and so one would need to write:

    @sorted = sort <== {-M}, {-M}, {-M};
    @sorted = sort <== {$^a <=> $^b}, {$^a <=> $^b}, {$^a <=> $^b};
    @sorted = sort <== [...], [...], [...];
    # etc.

to get C<cmp> comparison on all the arguments.

-cut-cut-cut-cut-cut-cut

Thanks to everyone who contributed to this discussion (especially Uri). As you 
see, the result is a sort facility that is simultaneously much more powerful, 
much easier-to-use in the simple cases, has the potential to 

Re: The Sort Problem: a definitive ruling

2004-02-19 Thread Dave Whipp
Damian Conway [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]
  type KeyExtractor ::= Code(Any) returns Any;

  # Modtimewise numerically ascending...
  @sorted = sort {-M} @unsorted;


One thing I've been trying to figure out reading this: what is the signature
of prefix:-M ? i.e. how does it tell the outer block that it (the
outer-block) needs a parameter? There seems to be some transitive magic
going on here. Could similar magic be used to have infix:<=> require two
higher-order variables (e.g. could sort { <=> } @unsorted be made to
work?)


Dave.




Re: The Sort Problem: a definitive ruling

2004-02-19 Thread Luke Palmer
Dave Whipp writes:
 Damian Conway [EMAIL PROTECTED] wrote in message
 news:[EMAIL PROTECTED]
   type KeyExtractor ::= Code(Any) returns Any;
 
   # Modtimewise numerically ascending...
   @sorted = sort {-M} @unsorted;
 
 
 One thing I've been trying to figure out reading this: what is the signature
 of prefix:-M ? 

Presumably something like:

sub prefix:-M (?$file = $CALLER::_) {...}

 i.e. how does it tell the outer block that it (the outer-block) needs
 a parameter? 

Because it operates on $_.  It tells it the same way:

map { .name } @objects

Does.  Of course, this is going to be tough on the compiler, who will
have to take the C<= $CALLER::_> part into account.

 There seems to be some transitive magic going on here.  Could similar
 magic be used to have infix:<=> require two higher-order variables
 (e.g. could sort { <=> } @unsorted be made to work?)

No.  Although you could do such a thing with:

sort infix:<=>, @unsorted;

Luke


Re: The Sort Problem: a definitive ruling

2004-02-19 Thread Uri Guttman
 DC == Damian Conway [EMAIL PROTECTED] writes:

  DC Once again the Iron Designer rises to the supreme challenge of
  DC the Mailinglist Stadium and expresses the true spirit of Perl
  DC 6!!!

and the challenge for next week is slicing squid with noodles!
(or cutting down the mightiest tree in the forest with a herring)

good job all.

uri

-- 
Uri Guttman  --  [EMAIL PROTECTED]   http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs    http://jobs.perl.org


Re: The Sort Problem: a definitive ruling

2004-02-19 Thread Damian Conway
Dave Whipp wondered:

@sorted = sort {-M} @unsorted;
One thing I've been trying to figure out reading this: what is the signature
of prefix:-M ? i.e. how does it tell the outer block that it (the
outer-block) needs a parameter?
It doesn't. As A6 explained:

	http://dev.perl.org/perl6/apocalypse/A06.html#Bare_subs

any block that doesn't have placeholder-specified parameters but which refers 
(even implicitly) to $_ will automatically have the signature of ($_).

That's why:

	@odd = grep { $_ % 2 } @nums;

will still work in Perl 6.

Since a bare C<-M> implicitly refers to $_, the surrounding block 
automagically gets a one-parameter signature and hence is (correctly!) 
interpreted as a key extractor.

Don't you just love it when a plan^H^H^H^Hdesign comes together? ;-)


There seems to be some transitive magic going on here. 
There is. Kinda. Just not the type of magic you thought.


Could similar magic be used to have infix:<=> require two
higher-order variables (e.g. could sort { <=> } @unsorted be made to
work?)
No. But this will work:

	sort infix:<=> @unsorted

Damian


Re: The Sort Problem: a definitive ruling

2004-02-19 Thread Uri Guttman
 DC == Damian Conway [EMAIL PROTECTED] writes:

  DC  # Stringifically ascending...
  DC  @sorted = sort @unsorted;

  DC or with a single two-argument block/closure (to sort by whatever the
  DC specified comparator is):

  DC  # Numerically ascending...
  DC  @sorted = sort {$^a <=> $^b} @unsorted;

so because that has 2 placeholders, it will match this signature:

 type Comparator   ::= Code(Any, Any) returns Int;

i have to remember that placeholders are really implied args to a code
block and not just in the expression

  DC  # Namewise stringifically descending case-insensitive...
  DC  @sorted = sort {lc $^b.name cmp lc $^a.name}
  DC @unsorted;
  DC  # or...
  DC  @sorted = sort {$^b.name cmp $^a.name} is insensitive
  DC @unsorted;
  DC  # or...
  DC  @sorted = sort {$^a.name cmp $^b.name} is descending is insensitive
  DC @unsorted;

TIMTOWTDI lives on!

  DC  # Modtimewise numerically ascending...
  DC  @sorted = sort {-M $^a <=> -M $^b} @unsorted;

  DC  # Fuzz-ifically...
  DC  sub fuzzy_cmp($x, $y) returns Int;
  DC  @sorted = sort fuzzy_cmp, @unsorted;

ok, so that is recognized as a compare sub due to the 2 arg sig. so must
the sub be defined/declared before the sort code is compiled?

  DC or with a single one-argument block/closure (to sort according
  DC whatever the specified key extractor returns):

  DC  # Numerically ascending...
  DC  @sorted = sort {+ $^elem} @unsorted;
  DC  @sorted = sort {+ $_} @unsorted;

is $^elem special? or just a regular place holder? i see $_ will be set
to each record as we discussed.

  DC  # Namewise stringifically descending case-insensitive...
  DC  @sorted = sort {~ $^elem.name} is descending is insensitive @unsorted;
  DC  @sorted = sort {lc $^elem.name} is descending @unsorted;
  DC  @sorted = sort {lc .name} is descending @unsorted;

just getting my p6 chops back. .name is really $_.name so that makes
sense. and $^elem is just a named placeholder for $_ as before?

  DC  # Key-ifically...
  DC  sub get_key($elem) {...}
  DC  @sorted = sort get_key, @unsorted;

and that is parsed as an extractor code call due to the single arg
sig. again, it appears that it has to be seen before the sort code for
that to work.

  DC or with a single extractor/comparator pair (to sort according to the
  DC extracted key, using the specified comparator):

  DC  # Modtimewise stringifically descending...
  DC  @sorted = sort {-M}=>{$^b cmp $^a} @unsorted;

so that is a single pair of extractor/comparator. but there is no comma
before @unsorted. is that correct? see below for why i ask that.

  DC  # Namewise fuzz-ifically...
  DC  @sorted = sort {.name}=>fuzzy_cmp @unsorted;

i first parsed that as being wrong and the {} should wrap the whole
thing. so that is a pair again of extractor/comparator.

  DC or with an array of comparators and/or key extractors and/or
  DC extractor-comparator pairs (to sort according to a cascading list of
  DC criteria):

  DC  # Numerically ascending
  DC  # or else namewise stringifically descending case-insensitive
  DC  # or else modtimewise numerically ascending
  DC  # or else namewise fuzz-ifically
  DC  # or else fuzz-ifically...
  DC  @sorted = sort [ {+ $^elem},
  DC   {$^b.name cmp $^a.name} is insensitive,
  DC   {-M},
  DC   {.name}=>fuzzy_cmp,
  DC   fuzzy_cmp,

i see the need for commas in here as it is a list of criteria.

  DC ],

but what about that comma? no other example seems to have one before the
@unsorted stuff.

  DC @unsorted;

  DC If a key-extractor block returns number, then C<< <=> >> is used to
  DC compare those keys. Otherwise C<cmp> is used. In either case, the keys
  DC extracted by the block are cached within the call to C<sort>, to
  DC optimize subsequent comparisons against the same element. That is, a
  DC key-extractor block is only ever called once for each element being
  DC sorted.

where does the int optimizer come in? just as you had it before in the
extractor code? that will need to be accessible to the optimizer if the
GRT is to work correctly.

i like that the key caching is defined here. we can implement it in
several different ways depending on optimization hints and such. we
could support the ST, GRT and orchish and select the best one for each
sort. or we could have one basic sort and load the others as pragmas or
modules.

  DC The C<is descending> and C<is insensitive> traits on a key extractor
  DC or a comparator are detected within the call to C<sort> (or possibly
  DC by the compiler) and used to modify the case-sensitivity and
  DC direction of any comparison operators used for the corresponding key
  DC or in the corresponding comparator.

or by reversing the order of the 

Re: The Sort Problem: a definitive ruling

2004-02-19 Thread Luke Palmer
Uri Guttman writes:
  DC == Damian Conway [EMAIL PROTECTED] writes:
   DC  # Modtimewise numerically ascending...
   DC  @sorted = sort {-M $^a <=> -M $^b} @unsorted;
 
   DC  # Fuzz-ifically...
   DC  sub fuzzy_cmp($x, $y) returns Int;
   DC  @sorted = sort fuzzy_cmp, @unsorted;
 
 ok, so that is recognized as a compare sub due to the 2 arg sig. so must
 the sub be defined/declared before the sort code is compiled?

Nope.  C<sort> is declared as a multimethod.  This works, too:

$code = sub ($a, $b) { -M $a <=> -M $b };
@sorted = sort $code, @unsorted;

   DC or with a single one-argument block/closure (to sort according
   DC whatever the specified key extractor returns):
 
   DC  # Numerically ascending...
   DC  @sorted = sort {+ $^elem} @unsorted;
   DC  @sorted = sort {+ $_} @unsorted;
 
 is $^elem special? or just a regular place holder? i see $_ will be set
 to each record as we discussed.

Those two statements are exactly the same in every way.  Well, except
how they're written.  $^elem is indeed a regular placeholder.  $_
becomes an implicit parameter when it is referred to, in the absence of
placeholders or another type of signature.

   DC  # Key-ifically...
   DC  sub get_key($elem) {...}
   DC  @sorted = sort get_key, @unsorted;
 
 and that is parsed as an extractor code call due to the single arg
 sig. again, it appears that it has to be seen before the sort code for
 that to work.

Nope.  Runtime dispatch as before.

   DC or with a single extractor/comparator pair (to sort according to the
   DC extracted key, using the specified comparator):
 
   DC  # Modtimewise stringifically descending...
   DC  @sorted = sort {-M}=>{$^b cmp $^a} @unsorted;
 
 so that is a single pair of extractor/comparator. but there is no comma
 before @unsorted. is that correct? see below for why i ask that.

Yes.  Commas may be omitted on either side of a block when used as an
argument.  I would argue that they only be omitted on the right side, so
that this is unambiguous:

if some_function { ... }  
{ ... }

Which might be parsed as either:

if (some_function { ... }) { ... }

Or:

if (some_function()) {...}
{...}  # Bare block

   DC or with an array of comparators and/or key extractors and/or
   DC extractor-comparator pairs (to sort according to a cascading list of
   DC criteria):
 
   DC  # Numerically ascending
   DC  # or else namewise stringifically descending case-insensitive
   DC  # or else modtimewise numerically ascending
   DC  # or else namewise fuzz-ifically
   DC  # or else fuzz-ifically...
   DC  @sorted = sort [ {+ $^elem},
   DC   {$^b.name cmp $^a.name} is insensitive,
   DC   {-M},
   DC   {.name}=>fuzzy_cmp,
   DC   fuzzy_cmp,
 
 i see the need for commas in here as it is a list of criteria.
 
   DC ],
 
 but what about that comma? no other example seems to have one before the
 @unsorted stuff.

It's not a closure, so you need a comma.

   DC @unsorted;
 
   DC If a key-extractor block returns number, then C<< <=> >> is used to
   DC compare those keys. Otherwise C<cmp> is used. In either case, the keys
   DC extracted by the block are cached within the call to C<sort>, to
   DC optimize subsequent comparisons against the same element. That is, a
   DC key-extractor block is only ever called once for each element being
   DC sorted.
 
 where does the int optimizer come in? just as you had it before in the
 extractor code? that will need to be accessible to the optimizer if the
 GRT is to work correctly.

If the block provably returns an int, C<sort> might be able to optimize
for ints.  Several ways to provably return an int:

my $extractor = an int sub($arg) { $arg.num }
@sorted = sort $extractor, @unsorted;

Or with a smarter compiler:

@sorted = sort { int .num } @unsorted;

Or C<sort> might even check whether all the return values are ints and
then optimize that way.  No guarantees: it's not a language-level issue.

 i like that the key caching is defined here. 

Yeah.  This is a language-level issue, as the blocks might have
side-effects.

   DC Note that ambiguous cases like:
 
   DC  @sorted = sort {-M}, {-M}, {-M};
 
   DC will be dispatched according to the normal multiple dispatch semantics
   DC (which will mean that they will mean):
 
   DC  @sorted = sort {-M}  <== {-M}, {-M};
 
   DC and so one would need to write:
 
   DC  @sorted = sort <== {-M}, {-M}, {-M};
 
 that clears up that one for me.
 
 this is very good overall (notwithstanding my few nits and
 questions). it will satisfy all sorts of sort users, even those who are
 out of sorts.

Agreed.  I'm very fond of it.

Luke



Re: The Sort Problem: a definitive ruling

2004-02-19 Thread Luke Palmer
Luke Palmer writes:
 Yes.  Commas may be omitted on either side of a block when used as an
 argument.  I would argue that they only be omitted on the right side, so
 that this is unambiguous:
 
 if some_function { ... }  
 { ... }
 
 Which might be parsed as either:
 
 if (some_function { ... }) { ... }
 
 Or:
 
 if (some_function()) {...}
 {...}  # Bare block

Silly me.  That doesn't solve anything.  I don't know why I thought it
did.  I still think that this looks weird:

foo $bar { ... } $baz;

But that's just preference.

Luke



Re: The Sort Problem: a definitive ruling

2004-02-19 Thread Joe Gottman

- Original Message - 
From: Damian Conway [EMAIL PROTECTED]
To: Perl 6 Language [EMAIL PROTECTED]
Sent: Thursday, February 19, 2004 8:29 PM
Subject: [perl] The Sort Problem: a definitive ruling


 C<sort> in Perl6 is a global multisub:

  multi sub *sort(Criterion @by: [EMAIL PROTECTED]) {...}
  multi sub *sort(Criterion $by: [EMAIL PROTECTED]) {...}
  multi sub *sort( : [EMAIL PROTECTED]) {...}

 where:

  type KeyExtractor ::= Code(Any) returns Any;

  type Comparator   ::= Code(Any, Any) returns Int;

  type Criterion::= KeyExtractor
  | Comparator
  | Pair(KeyExtractor, Comparator)
  ;
   snip

 If a key-extractor block returns number, then C<< <=> >> is used to compare
 those keys. Otherwise C<cmp> is used. In either case, the keys extracted by
 the block are cached within the call to C<sort>, to optimize subsequent
 comparisons against the same element. That is, a key-extractor block is only
 ever called once for each element being sorted.



   How do you decide whether a key-extractor block returns number?  Do you
look at the signature,  or do you simply evaluate the result of the
key-extractor for each element in the unsorted list?  For example, what is
the result of the following code?

  sort {$_.key} (1 => 'a', 10 => 'b', 2 => 'c');

   There is nothing in the signature of the key-extractor to suggest that
all the keys are numbers, but as it turns out they all are.  Will the sort
end up being numerical or alphabetic?


Joe Gottman




Re: The Sort Problem: a definitive ruling

2004-02-19 Thread Damian Conway
Uri checked:


  DC  @sorted = sort {$^a <=> $^b} @unsorted;

so because that has 2 placeholders, it will match this signature:

 type Comparator   ::= Code(Any, Any) returns Int;
Correct.


i have to remember that placeholders are really implied args to a code
block and not just in the expression
Indeed.


  DC  sub fuzzy_cmp($x, $y) returns Int;
  DC  @sorted = sort fuzzy_cmp, @unsorted;
ok, so that is recognized as a compare sub due to the 2 arg sig. so must
the sub be defined/declared before the sort code is compiled?
Yes. And, yes, declaration is sufficient.


  DC  @sorted = sort {+ $^elem} @unsorted;
  DC  @sorted = sort {+ $_} @unsorted;
is $^elem special? or just a regular place holder?
Regular.


just getting my p6 chops back. .name is really $_.name so that makes
sense. and $^elem is just a named placeholder for $_ as before?
Yes and yes.


  DC  @sorted = sort get_key, @unsorted;

and that is parsed as an extractor code call due to the single arg
sig. 
Yes.


again, it appears that it has to be seen before the sort code for
that to work.
Correct.


  DC or with a single extractor/comparator pair (to sort according to the
  DC extracted key, using the specified comparator):
  DC  # Modtimewise stringifically descending...
  DC  @sorted = sort {-M}=>{$^b cmp $^a} @unsorted;
so that is a single pair of extractor/comparator. but there is no comma
before @unsorted. 
Typo. I'm pretty sure it would actually need a comma there.


  DC  @sorted = sort {.name}=>fuzzy_cmp @unsorted;
Comma required there too. :-(



  DC ],

but what about that comma? 
It's required.


no other example seems to have one before the @unsorted stuff.
The comma exception only applies to simple blocks. I messed up those two 
examples. :-(


where does the int optimizer come in? 
When the key extractor is known to return an Int. Which would occur either 
when it's explicitly declared to do that, or when the compiler can intuit a 
block's return type from the type of value returned by the block (i.e. if the 
block always returns the result of a call to C<int>).
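
The point here is that the choice of comparison is made from what is declared about the extractor, not from the values it happens to return. A loose Python analogy, assuming return annotations play the role of a declared return type (pick_comparison and sort_with are hypothetical names, and Python reads the annotation at call time rather than at compile time):

```python
import inspect

def pick_comparison(extractor):
    """Hypothetical helper: choose a comparison mode from the declared return type."""
    declared = inspect.signature(extractor).return_annotation
    # Decide once, from the declaration, never by inspecting the data.
    return "numeric" if declared in (int, float) else "string"

def sort_with(extractor, items):
    if pick_comparison(extractor) == "numeric":
        return sorted(items, key=extractor)
    # No numeric declaration: fall back to comparing keys as strings.
    return sorted(items, key=lambda item: str(extractor(item)))

def key_as_int(pair) -> int:
    return pair[0]

def key_untyped(pair):
    return pair[0]

pairs = [(1, "a"), (10, "b"), (2, "c")]
numeric = sort_with(key_as_int, pairs)
stringy = sort_with(key_untyped, pairs)
```

The two extractors return identical values, yet only the annotated one gets the numeric ordering, so the same call never changes behaviour depending on its input.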


just as you had it before in the
extractor code? 
Yup.



so are those traits only allowed/meaningful on comparison blocks?
or will an extraction block take them 
Both. That's why I wrote:

The C<is descending> and C<is insensitive> traits
on a key extractor or a comparator...
                   ^
you have examples which show the
traits on either the extractor or comparator code blocks. that implies
that the guts can get those flags from either and use them as needed.
Yep. Inside the body of C<sort> you'd access them as:

$by.trait{descending}
$by.trait{insensitive}
(unless Larry's changed the trait accessor syntax since last I looked).

Damian


Re: The Sort Problem: a definitive ruling

2004-02-19 Thread Uri Guttman
 DC == Damian Conway [EMAIL PROTECTED] writes:

  DC No. But this will work:

  DC   sort infix:<=> @unsorted

my brane hertz!!

so that declares (creates?) an infix op as a code block? and since <=>
is known to take 2 args it is parsed (or multidispatched) as a
comparator block for sort?

amazing how you and luke both came up with the exact same answer. p6
syntax like that is killing me slowly!

uri

-- 
Uri Guttman  --  [EMAIL PROTECTED]   http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs    http://jobs.perl.org


Re: The Sort Problem: a definitive ruling

2004-02-19 Thread Uri Guttman
 JG == Joe Gottman [EMAIL PROTECTED] writes:


  JG How do you decide whether a key-extractor block returns number?  Do you
  JG look at the signature,  or do you simply evaluate the result of the
  JG key-extractor for each element in the unsorted list?  For example, what is
  JG the result of the following code?

  JG   sort {$_.key} (1 => 'a', 10 => 'b', 2 => 'c');

  JG There is nothing in the signature of the key-extractor to suggest that
  JG all the keys are numbers, but as it turns out they all are.  Will the sort
  JG end up being numerical or alphabetic?

my take is that either <=> or cmp (my pref) would be the default
comparator. if you want to force one, you need to use prefix ~ or + or
int.

oh, another reason to make cmp the default is BACKWARDS COMPATIBILITY!
we have support now for the old simple @sorted = sort @unsorted syntax
so the cmp should still be the default.
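
For what a string-comparison default with explicit coercion looks like in practice, here is a small Python sketch using the keys from the question. It assumes the keys arrive as strings; int and str stand in for prefix + and prefix ~, which is an analogy rather than Perl 6 semantics.

```python
# Keys as they might arrive from the pair list in the question (assumption).
keys = ["1", "10", "2"]

# Default string comparison, analogous to cmp.
as_strings = sorted(keys)

# Explicit numeric coercion in the key extractor, analogous to prefix +.
as_numbers = sorted(keys, key=int)

# Explicit string coercion, analogous to prefix ~, matches the default.
forced_strings = sorted(keys, key=str)
```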

hey, i am remembering p6 syntax now! but give me a week and i will
forget it again :)

uri



Re: The Sort Problem: a definitive ruling

2004-02-19 Thread Damian Conway
Uri bemoaned:


  DC 	sort infix:<=> @unsorted

my brane hertz!!

so that declares (creates?) an infix op as a code block? 
No. C<< infix:<=> >> is the name of the binary C<< <=> >> operator.


amazing how you and luke both came up with the exact same answer.
Great minds... etc. ;-)

 p6 syntax like that is killing me slowly!

No, it's gradually making you *stronger*. ;-)

Damian


Re: The Sort Problem: a definitive ruling

2004-02-19 Thread Larry Wall
On Fri, Feb 20, 2004 at 02:47:55PM +1100, Damian Conway wrote:
: Yep. Inside the body of C<sort> you'd access them as:
: 
:   $by.trait{descending}
:   $by.trait{insensitive}
: 
: (unless Larry's changed the trait accessor syntax since last I looked).

Well, if traits are just compile-time properties, and properties
are just mixed-in roles (usually enums), then it's more likely that
something like:

$by.Direction == descending
$by.Case == insensitive

would be the incantation.  Or maybe if enums auto-booleanize, then
you could say

$by.Direction::descending
$by.Case::insensitive

And then

$by.descending
$by.insensitive

might be allowed as abbreviations when unambiguous.  Or maybe we
require matching:

$by ~~ descending
$by ~~ insensitive

But there is no such thing as a true property or false property.
There's a Boolean role that can have the value true or false.  Traits
are mixed in to declared objects at compile time, and can do weird
things to such objects at mixin time.

Likewise there's no such thing as a descending property.  There's
a Direction property which defaults to ascending.  And a Case property
that defaults to sensitive.

To do otherwise is to set ourselves up for objects that can be both
true and false simultaneously.  Only junctions should be allowed
to do that...
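
The model described above, where a property is an enumerated value with a default rather than a true/false flag, can be sketched in Python. This is only an analogy under assumptions: SortBy and its apply method are hypothetical, and only the property names Direction and Case and their values come from the text.

```python
from dataclasses import dataclass
from enum import Enum

class Direction(Enum):
    ASCENDING = "ascending"
    DESCENDING = "descending"

class Case(Enum):
    SENSITIVE = "sensitive"
    INSENSITIVE = "insensitive"

@dataclass
class SortBy:
    """Hypothetical sort criterion carrying exactly one value per property."""
    direction: Direction = Direction.ASCENDING  # defaults to ascending
    case: Case = Case.SENSITIVE                 # defaults to sensitive

    def apply(self, items):
        key = str.lower if self.case is Case.INSENSITIVE else None
        descending = self.direction is Direction.DESCENDING
        return sorted(items, key=key, reverse=descending)

default_by = SortBy()
custom_by = SortBy(direction=Direction.DESCENDING, case=Case.INSENSITIVE)
words = ["b", "A", "c"]
default_sorted = default_by.apply(words)
custom_sorted = custom_by.apply(words)
```

Each criterion holds one Direction and one Case, so it cannot be both ascending and descending at once, which is the inconsistency the enum approach is meant to rule out.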

Larry


Re: The Sort Problem: a definitive ruling

2004-02-19 Thread Uri Guttman
 DC == Damian Conway [EMAIL PROTECTED] writes:

  DC Uri bemoaned:

cause you agonize me head!

  DC sort infix:<=> @unsorted
   my brane hertz!!
   so that declares (creates?) an infix op as a code block?

  DC No. C<< infix:<=> >> is the name of the binary C<< <=> >> operator.

so how is that allowed there without a block? is it because the
name is in the style of a sub? that makes sense to me but i want to make
sure i get it.

and now back to my advil addiction. :)

uri
