2016-11-18 15:36 GMT+01:00 Alexander Ilin <[email protected]>:
> Hello, all!
>
> I have an interesting little task for you today.
>
> Let's say you have a sequence of tuples, and you want to remove all tuples
> with duplicate ids, so that in the new sequence there is only one tuple with
> each id.
>
> Here's my solution:
>
> TYPED: dedupe-by-hash ( seq: sequence -- seq: sequence )
>     dup [ hash>> ] map >hash-set [
>         [ hash>> ] dip
>         [ in? ] [ delete ] 2bi
>     ] curry filter ;
>
> This is not the first time I'm solving this task, and I began to wonder -
> is there something similar in the Factor library?
Everything is in the Factor library. :) What you are describing is
like a GROUP BY operation in SQL. So if you have:
TUPLE: person name id ;
You can use either:
USE: sequences.extras
[ id>> ] sort-with [ id>> ] group-by [ second first ] map
Or
USE: math.statistics
[ id>> ] collect-by [ nip first ] { } assoc>map
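To make the collect-by variant concrete, here is a minimal self-contained
sketch. It assumes a Factor version where collect-by lives in
math.statistics, as in the snippet above; the names dedupe-by-id and
sample-people are made up for illustration.

```factor
USING: accessors arrays assocs kernel math.statistics sequences ;
IN: scratchpad

TUPLE: person name id ;

! Hypothetical sample data: two people share id 1.
: sample-people ( -- seq )
    "Alice" 1 person boa
    "Bob" 2 person boa
    "Carol" 1 person boa
    3array ;

! collect-by builds a hashtable mapping each id to a vector of the
! people with that id, in their original order. Taking the first
! element of each vector keeps the first person seen per id.
! The result comes from a hashtable, so its order is unspecified.
: dedupe-by-id ( seq -- seq' )
    [ id>> ] collect-by [ nip first ] { } assoc>map ;
```

With this data, sample-people dedupe-by-id should keep Alice and Bob and
drop Carol, in no particular order.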
If you want tiebreakers, like choosing the person with the
alphabetically first name if more than one share an id, you can
implement it like this:
USE: slots.syntax
[ slots{ id name } ] sort-with [ id>> ] group-by [ second first ] map
It's not as efficient as what John committed though. :) Maybe we
should try and clean it up somehow? If we put all the group
by/aggregation/uniquifying words in the same vocab, they would be
easier to discover.
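
For the tiebreaker case, the same idea can also be sketched with
collect-by: sort by name first, then take the first person in each id
group. This assumes sort-with from the sorting vocab and collect-by from
math.statistics (both may have moved or been renamed in other Factor
versions); dedupe-by-id-min-name and tied-people are hypothetical names.

```factor
USING: accessors arrays assocs kernel math.statistics sequences
sorting ;
IN: scratchpad

TUPLE: person name id ;

! Hypothetical sample data: Carol and Alice share id 1.
: tied-people ( -- seq )
    "Carol" 1 person boa
    "Alice" 1 person boa
    "Bob" 2 person boa
    3array ;

! After sorting by name, each id's vector built by collect-by is in
! name order, so its first element has the alphabetically first name.
! The result comes from a hashtable, so its order is unspecified.
: dedupe-by-id-min-name ( seq -- seq' )
    [ name>> ] sort-with
    [ id>> ] collect-by
    [ nip first ] { } assoc>map ;
```

Here tied-people dedupe-by-id-min-name should keep Alice (not Carol) for
id 1, plus Bob for id 2.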
--
mvh Björn Lindqvist
_______________________________________________
Factor-talk mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/factor-talk