Re: [S29] uniq
Luke Palmer wrote: So I suppose that's my proposal. Allow, even encourage, overloading of =:=, but only for value types. I've been thinking that we ought to provide a standard role for making something a value type. Maybe it would require definition of =:=. I would like to propose something slightly different: 1) Let =:= be non-overloadable purely macro/grammar based --- same could be applied to ::= and := I think 2) the implementation of =:= then checks type homogenity as a precondition and calls the type specific EQUAL submethod or some such The rational is that type homogenity is a necessary condition for identity and thus allows to return false when violated. The EQUAL class/type submethod could be used as the default for the == and eq operators as long as subtype homogenity for both sides holds. Well, or the operator name is a parameter to the parametric role: role Identity[ eq_op:( ::?CLASS, ::?CLASS: -- bit ) = $::CLASS::EQUAL ] does Identity[ ::?CLASS ] # F-bounded { ... } which is implicitly forced into every class definition by virtue of role Any does Identity; In particular we have: class Num does Identity[ infix:{'=='} ] {...} class Str does Identity[ infix:{'eq'} ] {...} To what exteent in canbe auto-generated with dispatching to the respective methods of all elements defined into the class' value environment, I can't say. -- TSa (Thomas Sandlaß)
[S29] uniq
Hi, three quick questions: Is it intentional that there's no uniq in the current S29[1] draft? See [2] for Damian saying that uniq is probably in. I wondered what uniq's default comparator should be, =:=? Should it be possible to give an own comparator block, similar as with grep? E.g. uniq a b a a c d; # a b a c d uniq:{ abs $^a == abs $^b } 42, 23, -23, 23, 42 # 42, 23, 42 --Ingo [1] http://svn.openfoundry.org/pugs/docs/AES/S29draft.pod [2] http://groups.google.com/groups?selm=420DB295.3000902%40conway.org -- Linux, the choice of a GNU | The future is here. It's just not widely generation on a dual AMD | distributed yet. Athlon!| -- William Gibson
Re: [S29] uniq
Ingo Blechschmidt wrote: Hi, three quick questions: Since Aaron is still getting up to speed, I'll take a stab at these. Is it intentional that there's no uniq in the current S29[1] draft? See [2] for Damian saying that uniq is probably in. Just hasn't been entered. I wondered what uniq's default comparator should be, =:=? I'd have gone with ~~ Should it be possible to give an own comparator block, similar as with grep? E.g. uniq a b a a c d; # a b a c d uniq:{ abs $^a == abs $^b } 42, 23, -23, 23, 42 # 42, 23, 42 I could see that happening. But I'd have to stop and wonder if wrapping it inside a map would be more natural. If it does happen, it'd likely need to copy the key generation style of the new sort. -- Rod Adams
Re: [S29] uniq
Hi, Rod Adams wrote: I wondered what uniq's default comparator should be, =:=? I'd have gone with ~~ Even better. :) --Ingo -- Linux, the choice of a GNU | Row, row, row your bits, gently down the generation on a dual AMD | stream... Athlon!|
Re: [S29] uniq
* Ingo Blechschmidt ([EMAIL PROTECTED]) [050519 16:52]: Should it be possible to give an own comparator block, similar as with grep? E.g. uniq a b a a c d; # a b a c d uniq:{ abs $^a == abs $^b } 42, 23, -23, 23, 42 # 42, 23, 42 'uniq' differs from 'sort' because there is no order relationship between the elements. A quick algorithm for finding the unique elements in perl5 is sub uniq(@) { my %h = map { ($_ = 1) } @elements; keys %h; } or sub uniq(@) { my %h; $h{$_}++ for @elements; keys %h; } anyway: O(n), which is much better than sort can do. For your proposal, the requested code only requires this uniq:{ abs $^a } 42, 23, -23, 23, 42, 42, 23, 42 uniq:{lc} @words -- MarkOv drs Mark A.C.J. OvermeerMARKOV Solutions [EMAIL PROTECTED] [EMAIL PROTECTED] http://Mark.Overmeer.net http://solutions.overmeer.net
Re: [S29] uniq
Hi, Mark Overmeer wrote: * Ingo Blechschmidt ([EMAIL PROTECTED]) [050519 16:52]: Should it be possible to give an own comparator block, similar as with grep? E.g. uniq a b a a c d; # a b a c d uniq:{ abs $^a == abs $^b } 42, 23, -23, 23, 42 # 42, 23, 42 'uniq' differs from 'sort' because there is no order relationship between the elements. quoting Damian's original mail[1]: uniq - remove duplicates without reordering ^^ A quick algorithm for finding the unique elements in perl5 is [...] Yeah, or the Perl 6: sub uniq([EMAIL PROTECTED]) { my %h; [EMAIL PROTECTED]; return keys %h; } # Or sub uniq([EMAIL PROTECTED]) { my %h; [EMAIL PROTECTED] = (undef) xx Inf; return keys %h; } # thousand other ways ommitted :) --Ingo [1] http://groups.google.com/groups?selm=420DB295.3000902%40conway.org -- Linux, the choice of a GNU | The future is here. It's just not widely generation on a dual AMD | distributed yet. -- William Gibson Athlon!|
Re: [S29] uniq
quoting Damian's original mail[1]: uniq - remove duplicates without reordering ^^ Would not that mean the original order of the first ocurrence is preserved? This is what Ruby Array#uniq does: [4,1,2,4,2,3,5].uniq = [ 4, 1, 2, 3, 5] The behavior is consistent with the description given ri Array#uniq - Array#uniq array.uniq - an_array Returns a new array by removing duplicate values in _self_. a = [ a, a, b, b, c ] a.uniq #= [a, b, c] A naïve Perl 5 implementation could be: sub uniq { my (@u, %h); for (@_) { push(@u, $_) unless $h{$_}; $h{$_}++; } return @u; } Adriano.
Re: [S29] uniq
The former implementation can be shortened: sub uniq { my %h; return grep { ! $h{$_}++ } @_; } But realize that none of the proposed solutions (which are based on hashes for computing the return) is amenable to the extension Ingo called for with comparator blocks. Adriano.
Re: [S29] uniq
Hi, Adriano Ferreira wrote: quoting Damian's original mail[1]: uniq - remove duplicates without reordering ^^ Would not that mean the original order of the first ocurrence is preserved? This is what Ruby Array#uniq does: [4,1,2,4,2,3,5].uniq = [ 4, 1, 2, 3, 5] I read this as that uniq should behave like Unix's uniq(1), i.e. removing only successive duplicates, e.g.: uniq [3,3,3,4,3] = [3,4,3] # what I meant uniq [3,3,3,4,3] = [3,4] # what you meant (But I'd be fine with either behaviours.) --Ingo -- Linux, the choice of a GNU | Row, row, row your bits, gently down the generation on a dual AMD | stream... Athlon!|
Re: [S29] uniq
Ingo Blechschmidt skribis 2005-05-19 21:07 (+0200): I read this as that uniq should behave like Unix's uniq(1), i.e. removing only successive duplicates, e.g.: uniq [3,3,3,4,3] = [3,4,3] # what I meant uniq [3,3,3,4,3] = [3,4] # what you meant Which leads to lots of |sort|uniq, or just sort -u. Are there many practical uses for removing successive duplicates? I personally have never used tr///'s capability of removing successively duplicate characters, except in golf. Juerd -- http://convolution.nl/maak_juerd_blij.html http://convolution.nl/make_juerd_happy.html http://convolution.nl/gajigu_juerd_n.html
Re: [S29] uniq
On 5/19/05, Ingo Blechschmidt [EMAIL PROTECTED] wrote: I read this as that uniq should behave like Unix's uniq(1), i.e. removing only successive duplicates, e.g.: uniq [3,3,3,4,3] = [3,4,3] # what I meant uniq [3,3,3,4,3] = [3,4] # what you meant That has been discussed a few days ago at ruby-talk mailing list. While standard 'uniq' in Ruby works by removing duplicates, unlike Unix utility, the action of compressing a run of elements into a single element was named 'squeeze'. In P5: sub squeeze { my ($y, $last); return grep { ($y, $last) = ($_ eq $last, $_); !$y } @_; } In Perl 6, 'eq' would be replaced by '~~' as said before. As Juerd have said, not sure about how useful this may be. If 'uniq' does not respect the original order of the elements, it is not more than (1) convert list to a set, (2) convert back to a list. If that's the intention, maybe we're better using only hashes or any set-like implementation. There is also the variant of eliminating duplicates in-place or constructing a new list. Adriano.
Re: [S29] uniq
Ingo Blechschmidt wrote: Is it intentional that there's no uniq in the current S29[1] draft? See [2] for Damian saying that uniq is probably in. It still probably is. I wondered what uniq's default comparator should be, =:=? infix:~~ Should it be possible to give an own comparator block, similar as with grep? E.g. uniq a b a a c d; # a b a c d uniq:{ abs $^a == abs $^b } 42, 23, -23, 23, 42 # 42, 23, 42 Yes, I'd expect that it would be implemented as a multisub, equivalent to: multi sub uniq (key_for: [EMAIL PROTECTED]) { my %seen is shape(Any); return grep { %seen{key_for $datum}++ } @data; } multi sub uniq (: [EMAIL PROTECTED]) { return uniq { $^self } @data; } Note that this implementation is order-preserving, does not require repeated elements to appear consecutively, and keeps only the first occurrence. The usage would be: uniq a b a a c d; # a b c d uniq { abs $^value } 42, 23, -23, 23, 42;# 42, 23 BTW, I am *sorely* tempted to suggest the following implementation instead: multi sub uniq (key_for: [EMAIL PROTECTED] is copy) { my %seen_at_index is shape(Any); my @uniq; for @data - $datum { my $prev_index_for_datum := %seen_at_index{key_for $datum}; if defined $prev_index_for_datum { @uniq[$prev_index_for_datum] |= $datum; } else { $prev_index_for_datum = [EMAIL PROTECTED]; push @uniq, $datum; } } return [EMAIL PROTECTED]; } multi sub uniq (: [EMAIL PROTECTED]) { return uniq { $^self } @data; } which would produce: uniq a b a a c d; # a b c d uniq { lc } a b C A a c d; # 'a'|'A', 'b', 'C'|'c', 'd' uniq { abs $^value } 42, 23, -23, 23, 42;# 42, 23|-23 But I'd want to think through the ramifications a little more carefully before actually advocating something that correct. ;-) Damian
Re: [S29] uniq
Hi, Damian Conway wrote: BTW, I am *sorely* tempted to suggest the following implementation instead: [...] which would produce: uniq a b a a c d; # a b c d uniq { lc } a b C A a c d; # 'a'|'A', 'b', 'C'|'c', 'd' uniq { abs $^value } 42, 23, -23, 23, 42;# 42, 23|-23 I like that *very* much! :) And it's a nice example why Junctions Are Good :) --Ingo -- Linux, the choice of a GNU | 'The box said, 'Requires Windows 95 or generation on a dual AMD | better', so i installed Linux - TKK 5 Athlon!|
Re: [S29] uniq
Damian Conway wrote: BTW, I am *sorely* tempted to suggest the following implementation instead: snip which would produce: uniq a b a a c d; # a b c d uniq { lc } a b C A a c d; # 'a'|'A', 'b', 'C'|'c', 'd' uniq { abs $^value } 42, 23, -23, 23, 42;# 42, 23|-23 But I'd want to think through the ramifications a little more carefully before actually advocating something that correct. ;-) uhm... okay. I can see the utility in having something like that around. But I think it's the kind of thing someone has to ask for, and that it likely shouldn't carry the name 'uniq'. -- Rod Adams
Re: [S29] uniq
On 5/19/05, Damian Conway [EMAIL PROTECTED] wrote: Ingo Blechschmidt wrote: I wondered what uniq's default comparator should be, =:=? infix:~~ Woah there. ~~ is a good comparator and all, but it's not the right one here. ~~ compares an object and a pattern to see if they match. That makes it the right choice for when and grep. But we're trying to remove elements that are *the same*, not elements that *match*. uniq hello, 13, /\w+/;# hello, 13 (!) In fact, I think this brings up a good point. It's one that we've already addressed, but I think it needs a little revisiting. The distinction between == and eq is very nice in its place. However, as projects become bigger and more object oriented, we're working with objects more than strings and numbers. And yet, eq compares two objects' stringifications, and == compares their numifications. Neither of those tests whether the objects are the same. Larry has called it equal, but that was when we thought it was just for generics. I think it's for much more than that: it's for comparing any two things that aren't specifically strings or numbers, even if we know what they are. For types with reference semantics, it ought to behave just like =:=. For types with value semantics, it ought to give a reasonable value of same, something like as long as you don't look at the addresses, there's nothing you could do to tell these apart. And maybe that's what we call =:=. Value types really should behave as though identical objects are inherently indistinguishable. There is, of course, the problem of the .id with those. It is, in the general case, impossible to come up with an id that is the same between two objects iff they =:= each other. Maybe that's how you get around the should in my second sentence. So I suppose that's my proposal. Allow, even encourage, overloading of =:=, but only for value types. I've been thinking that we ought to provide a standard role for making something a value type. Maybe it would require definition of =:=. Luke
Re: [S29] uniq
Luke wrote: I wondered what uniq's default comparator should be, =:=? infix:~~ Woah there. ~~ is a good comparator and all, but it's not the right one here. ~~ compares an object and a pattern to see if they match. That makes it the right choice for when and grep. But we're trying to remove elements that are *the same*, not elements that *match*. Ah, I missed Larry's message last week where he mused: : 3 =:= 3; # always true? : 3.id ~~ 3.id; # ditto? I think immutable values could work that way, especially if we want to store only a single representation of each immutable string. If that's now the behaviour of =:= on non-referential values, then I agree that Cuniq should compare with the =:= operator. Of course, that presupposes quite a bit of smarts on the part of the operator. For example, we'd also need to decide what these evaluate to: 3 =:= 3.0 # 3 =:= '3' # They're surely not identical, but you'd probably want Cuniq to think so. Or maybe you just have to SWYM and write: @uniq_nums = uniq [EMAIL PROTECTED]; or: @uniq_strs = uniq [EMAIL PROTECTED]; if you have heterogeneous data. Damian
Re: [S29] uniq
Mark Overmeer wrote: 'uniq' differs from 'sort' because there is no order relationship between the elements. A quick algorithm for finding the unique elements in perl5 is sub uniq(@) { my %h = map { ($_ = 1) } @elements; keys %h; } ...and an even quicker one is: use Set::Object; sub uniq(@) { set(@_)-members; } or use v6; use Set; sub uniq([EMAIL PROTECTED]) { set(@items).members; } Sam.