Re: [S29] uniq

2005-05-27 Thread TSa (Thomas Sandlaß)

Luke Palmer wrote:

So I suppose that's my proposal.  Allow, even encourage, overloading
of =:=, but only for value types.  I've been thinking that we ought to
provide a standard role for making something a value type.  Maybe it
would require definition of =:=.


I would like to propose something slightly different:
1) Let =:= be non-overloadable purely macro/grammar based
   --- same could be applied to ::= and := I think
2) the implementation of =:= then checks type homogenity
   as a precondition and calls the type specific EQUAL
   submethod or some such

The rational is that type homogenity is a necessary condition
for identity and thus allows to return false when violated.

The EQUAL class/type submethod could be used as the default for
the == and eq operators as long as subtype homogenity for both
sides holds. Well, or the operator name is a parameter to the
parametric role:

role Identity[ eq_op:( ::?CLASS, ::?CLASS: -- bit ) = $::CLASS::EQUAL ]
does Identity[ ::?CLASS ]  # F-bounded
{
   ...
}

which is implicitly forced into every class definition by virtue of

role Any does Identity;

In particular we have:

class Num does Identity[ infix:{'=='} ] {...}
class Str does Identity[ infix:{'eq'} ] {...}

To what exteent in canbe auto-generated with dispatching to
the respective methods of all elements defined into the class'
value environment, I can't say.
--
TSa (Thomas Sandlaß)



[S29] uniq

2005-05-19 Thread Ingo Blechschmidt
Hi,

three quick questions:

Is it intentional that there's no uniq in the current S29[1] draft?
See [2] for Damian saying that uniq is probably in.

I wondered what uniq's default comparator should be, =:=?

Should it be possible to give an own comparator block, similar as with
grep? E.g.
  uniq a b a a c d;   # a b a c d

  uniq:{ abs $^a == abs $^b } 42, 23, -23, 23, 42
# 42, 23, 42
  

--Ingo

[1] http://svn.openfoundry.org/pugs/docs/AES/S29draft.pod
[2] http://groups.google.com/groups?selm=420DB295.3000902%40conway.org

-- 
Linux, the choice of a GNU | The future is here. It's just not widely
generation on a dual AMD   | distributed yet.
Athlon!| -- William Gibson



Re: [S29] uniq

2005-05-19 Thread Rod Adams
Ingo Blechschmidt wrote:
Hi,
three quick questions:
 

Since Aaron is still getting up to speed, I'll take a stab at these.
Is it intentional that there's no uniq in the current S29[1] draft?
See [2] for Damian saying that uniq is probably in.
 

Just hasn't been entered.
I wondered what uniq's default comparator should be, =:=?
 

I'd have gone with ~~
Should it be possible to give an own comparator block, similar as with
grep? E.g.
 uniq a b a a c d;   # a b a c d
 uniq:{ abs $^a == abs $^b } 42, 23, -23, 23, 42
   # 42, 23, 42
 

I could see that happening. But I'd have to stop and wonder if wrapping 
it inside a map would be more natural. If it does happen, it'd likely 
need to copy the key generation style of the new sort.

-- Rod Adams


Re: [S29] uniq

2005-05-19 Thread Ingo Blechschmidt
Hi,

Rod Adams wrote:
I wondered what uniq's default comparator should be, =:=?
 I'd have gone with ~~

Even better. :)


--Ingo

-- 
Linux, the choice of a GNU | Row, row, row your bits, gently down the
generation on a dual AMD   | stream...  
Athlon!| 



Re: [S29] uniq

2005-05-19 Thread Mark Overmeer
* Ingo Blechschmidt ([EMAIL PROTECTED]) [050519 16:52]:
 Should it be possible to give an own comparator block, similar as with
 grep? E.g.
   uniq a b a a c d;   # a b a c d
 
   uniq:{ abs $^a == abs $^b } 42, 23, -23, 23, 42
 # 42, 23, 42

'uniq' differs from 'sort' because there is no order relationship between
the elements.  A quick algorithm for finding the unique elements in perl5
is
   sub uniq(@)
   {  my %h = map { ($_ = 1) } @elements;
  keys %h;
   }
or
   sub uniq(@)
   {  my %h;
  $h{$_}++ for @elements;
  keys %h;
   }

anyway: O(n), which is much better than sort can do.

For your proposal, the requested code only requires this
uniq:{ abs $^a } 42, 23, -23, 23, 42, 42, 23, 42

uniq:{lc} @words
-- 
   MarkOv


drs Mark A.C.J. OvermeerMARKOV Solutions
   [EMAIL PROTECTED]  [EMAIL PROTECTED]
http://Mark.Overmeer.net   http://solutions.overmeer.net


Re: [S29] uniq

2005-05-19 Thread Ingo Blechschmidt
Hi,

Mark Overmeer wrote:
 * Ingo Blechschmidt ([EMAIL PROTECTED]) [050519 16:52]:
 Should it be possible to give an own comparator block, similar as
 with grep? E.g.
   uniq a b a a c d;   # a b a c d
 
   uniq:{ abs $^a == abs $^b } 42, 23, -23, 23, 42
 # 42, 23, 42
 
 'uniq' differs from 'sort' because there is no order relationship
 between
 the elements.

quoting Damian's original mail[1]:
  uniq  - remove duplicates without reordering
 ^^

 A quick algorithm for finding the unique elements in 
 perl5 is
[...]

Yeah, or the Perl 6:
  sub uniq([EMAIL PROTECTED]) {
my %h;
[EMAIL PROTECTED];
return keys %h;
  }
  # Or

  sub uniq([EMAIL PROTECTED]) {
my %h;
[EMAIL PROTECTED] = (undef) xx Inf;
return keys %h;
  }
  # thousand other ways ommitted :)


--Ingo

[1] http://groups.google.com/groups?selm=420DB295.3000902%40conway.org

-- 
Linux, the choice of a GNU | The future is here. It's just not widely
generation on a dual AMD   | distributed yet. -- William Gibson  
Athlon!| 



Re: [S29] uniq

2005-05-19 Thread Adriano Ferreira
quoting Damian's original mail[1]:
  uniq  - remove duplicates without reordering
^^

Would not that mean the original order of the first ocurrence is
preserved? This is what Ruby Array#uniq does:

[4,1,2,4,2,3,5].uniq  = [ 4, 1, 2, 3, 5]

The behavior is consistent with the description given ri Array#uniq

- Array#uniq
 array.uniq   - an_array

 Returns a new array by removing duplicate values in _self_.

a = [ a, a, b, b, c ]
a.uniq   #= [a, b, c]

A naïve Perl 5 implementation could be:

sub uniq {
  my (@u, %h);
  for (@_) {
push(@u, $_) unless $h{$_};
$h{$_}++;
  }
  return @u;
}

Adriano.


Re: [S29] uniq

2005-05-19 Thread Adriano Ferreira
The former implementation can be shortened:

sub uniq {
  my %h;
  return grep { ! $h{$_}++ } @_;
}

But realize that none of the proposed solutions (which are based on
hashes for computing the return) is amenable to the extension Ingo
called for with comparator blocks.

Adriano.


Re: [S29] uniq

2005-05-19 Thread Ingo Blechschmidt
Hi,

Adriano Ferreira wrote:
 quoting Damian's original mail[1]:
  uniq  - remove duplicates without reordering
 ^^
 
 Would not that mean the original order of the first ocurrence is
 preserved? This is what Ruby Array#uniq does:
 
 [4,1,2,4,2,3,5].uniq  = [ 4, 1, 2, 3, 5]

I read this as that uniq should behave like Unix's uniq(1), i.e.
removing only successive duplicates, e.g.:
  uniq [3,3,3,4,3]   =   [3,4,3]   # what I meant
  uniq [3,3,3,4,3]   =   [3,4] # what you meant

(But I'd be fine with either behaviours.)


--Ingo

-- 
Linux, the choice of a GNU | Row, row, row your bits, gently down the
generation on a dual AMD   | stream...  
Athlon!|



Re: [S29] uniq

2005-05-19 Thread Juerd
Ingo Blechschmidt skribis 2005-05-19 21:07 (+0200):
 I read this as that uniq should behave like Unix's uniq(1), i.e.
 removing only successive duplicates, e.g.:
   uniq [3,3,3,4,3]   =   [3,4,3]   # what I meant
   uniq [3,3,3,4,3]   =   [3,4] # what you meant

Which leads to lots of |sort|uniq, or just sort -u.

Are there many practical uses for removing successive duplicates?

I personally have never used tr///'s capability of removing successively
duplicate characters, except in golf.


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html


Re: [S29] uniq

2005-05-19 Thread Adriano Ferreira
On 5/19/05, Ingo Blechschmidt [EMAIL PROTECTED] wrote:
 I read this as that uniq should behave like Unix's uniq(1), i.e.
 removing only successive duplicates, e.g.:
  uniq [3,3,3,4,3]   =   [3,4,3]   # what I meant
  uniq [3,3,3,4,3]   =   [3,4] # what you meant

That has been discussed a few days ago at ruby-talk mailing list.
While standard 'uniq' in Ruby works by removing duplicates, unlike
Unix utility, the action of compressing a run of elements into a
single element was named 'squeeze'.

In P5:
sub squeeze {
 my ($y, $last);
 return grep { ($y, $last) = ($_ eq $last, $_); !$y } @_;
}

In Perl 6, 'eq' would be replaced by '~~' as said before. As Juerd
have said, not sure about how useful this may be.

If 'uniq' does not respect the original order of the elements, it is
not more than (1) convert list to a set, (2) convert back to a list.
If that's the intention, maybe we're better using only hashes or any
set-like implementation. There is also the variant of eliminating
duplicates in-place or constructing a new list.

Adriano.


Re: [S29] uniq

2005-05-19 Thread Damian Conway
Ingo Blechschmidt wrote:
Is it intentional that there's no uniq in the current S29[1] draft?
See [2] for Damian saying that uniq is probably in.
It still probably is.

I wondered what uniq's default comparator should be, =:=?
infix:~~

Should it be possible to give an own comparator block, similar as with
grep? E.g.
  uniq a b a a c d;   # a b a c d
  uniq:{ abs $^a == abs $^b } 42, 23, -23, 23, 42
# 42, 23, 42
Yes, I'd expect that it would be implemented as a multisub, equivalent to:
multi sub uniq (key_for: [EMAIL PROTECTED]) {
my %seen is shape(Any);
return grep { %seen{key_for $datum}++ } @data;
}
multi sub uniq (: [EMAIL PROTECTED]) {
return uniq { $^self } @data;
}
Note that this implementation is order-preserving, does not require repeated 
elements to appear consecutively, and keeps only the first occurrence.

The usage would be:
uniq a b a a c d;  # a b c d
uniq { abs $^value } 42, 23, -23, 23, 42;# 42, 23
BTW, I am *sorely* tempted to suggest the following implementation instead:
multi sub uniq (key_for: [EMAIL PROTECTED] is copy) {
my %seen_at_index is shape(Any);
my @uniq;
for @data - $datum {
my $prev_index_for_datum := %seen_at_index{key_for $datum};
if defined $prev_index_for_datum {
@uniq[$prev_index_for_datum] |= $datum;
}
else {
$prev_index_for_datum = [EMAIL PROTECTED];
push @uniq, $datum;
}
}
return [EMAIL PROTECTED];
}
multi sub uniq (: [EMAIL PROTECTED]) {
return uniq { $^self } @data;
}
which would produce:
uniq a b a a c d;  # a b c d
uniq { lc } a b C A a c d; # 'a'|'A', 'b', 'C'|'c', 'd'
uniq { abs $^value } 42, 23, -23, 23, 42;# 42, 23|-23
But I'd want to think through the ramifications a little more carefully before 
actually advocating something that correct. ;-)

Damian


Re: [S29] uniq

2005-05-19 Thread Ingo Blechschmidt
Hi,

Damian Conway wrote:
 BTW, I am *sorely* tempted to suggest the following implementation
 instead:
[...]
 which would produce:
 
  uniq a b a a c d;  # a b c d
 
  uniq { lc } a b C A a c d; # 'a'|'A', 'b',
  'C'|'c', 'd'
 
  uniq { abs $^value } 42, 23, -23, 23, 42;# 42, 23|-23

I like that *very* much! :)

And it's a nice example why Junctions Are Good :)


--Ingo

-- 
Linux, the choice of a GNU | 'The box said, 'Requires Windows 95 or
generation on a dual AMD   | better', so i installed Linux - TKK 5  
Athlon!| 



Re: [S29] uniq

2005-05-19 Thread Rod Adams
Damian Conway wrote:
BTW, I am *sorely* tempted to suggest the following implementation 
instead:

snip
which would produce:
uniq a b a a c d;  # a b c d
uniq { lc } a b C A a c d; # 'a'|'A', 'b', 
'C'|'c', 'd'

uniq { abs $^value } 42, 23, -23, 23, 42;# 42, 23|-23
But I'd want to think through the ramifications a little more 
carefully before actually advocating something that correct. ;-)

uhm... okay. I can see the utility in having something like that around. 
But I think it's the kind of thing someone has to ask for, and that it 
likely shouldn't carry the name 'uniq'.

-- Rod Adams


Re: [S29] uniq

2005-05-19 Thread Luke Palmer
On 5/19/05, Damian Conway [EMAIL PROTECTED] wrote:
 Ingo Blechschmidt wrote:
  I wondered what uniq's default comparator should be, =:=?
 
 infix:~~

Woah there.  ~~ is a good comparator and all, but it's not the right
one here.  ~~ compares an object and a pattern to see if they match. 
That makes it the right choice for when and grep.  But we're trying to
remove elements that are *the same*, not elements that *match*.

uniq hello, 13, /\w+/;# hello, 13  (!)

In fact, I think this brings up a good point.  It's one that we've
already addressed, but I think it needs a little revisiting.  The
distinction between == and eq is very nice in its place.  However, as
projects become bigger and more object oriented, we're working with
objects more than strings and numbers.   And yet, eq compares two
objects' stringifications, and == compares their numifications. 
Neither of those tests whether the objects are the same.

Larry has called it equal, but that was when we thought it was just
for generics.  I think it's for much more than that: it's for
comparing any two things that aren't specifically strings or numbers,
even if we know what they are.  For types with reference semantics, it
ought to behave just like =:=.  For types with value semantics, it
ought to give a reasonable value of same, something like as long as
you don't look at the addresses, there's nothing you could do to tell
these apart.

And maybe that's what we call =:=.  Value types really should behave
as though identical objects are inherently indistinguishable.  There
is, of course, the problem of the .id with those.  It is, in the
general case, impossible to come up with an id that is the same
between two objects iff they =:= each other.  Maybe that's how you get
around the should in my second sentence.

So I suppose that's my proposal.  Allow, even encourage, overloading
of =:=, but only for value types.  I've been thinking that we ought to
provide a standard role for making something a value type.  Maybe it
would require definition of =:=.

Luke


Re: [S29] uniq

2005-05-19 Thread Damian Conway
Luke wrote:
I wondered what uniq's default comparator should be, =:=?
infix:~~
 
Woah there.  ~~ is a good comparator and all, but it's not the right
one here.  ~~ compares an object and a pattern to see if they match. 
That makes it the right choice for when and grep.  But we're trying to
remove elements that are *the same*, not elements that *match*.
Ah, I missed Larry's message last week where he mused:
 : 3 =:= 3; # always true?
 : 3.id ~~ 3.id; # ditto?

 I think immutable values could work that way, especially if we
 want to store only a single representation of each immutable
 string.
If that's now the behaviour of =:= on non-referential values, then I agree 
that Cuniq should compare with the =:= operator.

Of course, that presupposes quite a bit of smarts on the part of the operator.
For example, we'd also need to decide what these evaluate to:
3 =:= 3.0  # 
3 =:= '3'  # 
They're surely not identical, but you'd probably want Cuniq to think so.
Or maybe you just have to SWYM and write:
@uniq_nums = uniq [EMAIL PROTECTED];
or:
@uniq_strs = uniq [EMAIL PROTECTED];
if you have heterogeneous data.
Damian



Re: [S29] uniq

2005-05-19 Thread Sam Vilain
Mark Overmeer wrote:
'uniq' differs from 'sort' because there is no order relationship between
the elements.  A quick algorithm for finding the unique elements in perl5
is
   sub uniq(@)
   {  my %h = map { ($_ = 1) } @elements;
  keys %h;
   }
...and an even quicker one is:
 use Set::Object;
 sub uniq(@)
 {
 set(@_)-members;
 }
or
 use v6;
 use Set;
 sub uniq([EMAIL PROTECTED])
 {
 set(@items).members;
 }
Sam.