Re: [julia-users] Re: What's the difference between unique(a) and union(a)

James Fairbanks Thu, 14 Jan 2016 08:00:25 -0800

On Wednesday, January 13, 2016 at 7:40:58 PM UTC-5, Bob Nnamtrop wrote:
>
> The unique(itr, dim) function seems broken to me. ... 
>
> For example:
>
> julia> a=[[3,1,2,3,1] [1,1,2,1,1]]
> 5x2 Array{Int64,2}:
>  3  1
>  1  1
>  2  2
>  3  1
>  1  1
>
> julia> unique(a, 1)
> 3x2 Array{Int64,2}:
>  3  1
>  1  1
>  2  2
>
> This is incorrect for the second column. Shouldn't it produce something 
> like:
>
>
It is producing unique rows. The docs say:


help?> unique
search: unique

  unique(itr[, dim])

  Returns an array containing only the unique elements of the iterable itr,
  in the order that the first of each set of equivalent elements originally
  appears. If dim is specified, returns unique regions of the array itr
  along dim.

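I think that "unique regions of the array along dim" could be clearer, but 
it is accurate: the regions are the slices you get by indexing along dim, 
so dim=1 compares whole rows and dim=2 compares whole columns.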

For example, unique(a, 2) gives the unique columns:

julia> a=[[3,1,2,3,1] [1,1,2,1,1] [3,1,2,3,1]]
5x3 Array{Int64,2}:
 3  1  3
 1  1  1
 2  2  2
 3  1  3
 1  1  1

julia> unique(a, 2)
5x2 Array{Int64,2}:
 3  1
 1  1
 2  2
 3  1
 1  1
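
To see why unique(a, 1) keeps three rows of the original 5x2 matrix, it may 
help to treat each row as a single value. The following is only a rough 
sketch that avoids the dim argument altogether; the names rows and col2 
are mine, just for illustration:

```julia
# The 5x2 matrix from the quoted message.
a = [[3,1,2,3,1] [1,1,2,1,1]]

# Turn every row into a tuple so that a whole row is compared as one item.
rows = [tuple(a[i, :]...) for i in 1:size(a, 1)]

# Distinct rows, in order of first appearance: (3,1), (1,1), (2,2)
distinct_rows = unique(rows)

# Deduplicating a single column is a different question: column 2 on its
# own has only the two distinct values 1 and 2.
col2 = unique(a[:, 2])
```

The value 1 shows up twice in the second column of the unique(a, 1) result 
because the rows (3,1) and (1,1) differ as rows, even though their second 
entries are equal.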


The difference between unique and union is:

- Use unique when you have one collection with repeats and want to get a 
collection of the distinct items.
- Use union when you have multiple collections and want the collection 
containing the distinct items that appear in any of the collections.

They happen to do the same thing on one argument because union(x) is 
effectively the union of x with the empty collection, and the empty set 
is the identity element for union. For the same reason +(1) == 1: a sum 
with a single term is just that term, since 0 is the identity for addition.
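
A small sketch of the distinction, using two made-up vectors x and y (they 
are not from the thread):

```julia
x = [3, 1, 2, 3, 1]   # one collection with repeats
y = [2, 5, 5]         # a second collection

# unique: distinct items of a single collection, in order of first appearance.
u = unique(x)

# union: distinct items that appear in any of the collections.
both = union(x, y)

# With one argument, union agrees with unique ...
same_single = union(x) == unique(x)

# ... just as it does when the other argument is an empty collection.
same_empty = union(x, Int[]) == unique(x)

# The analogous fact for addition.
plus_one = +(1) == 1
```

Here u is [3, 1, 2] and both is [3, 1, 2, 5]; the three comparisons all 
evaluate to true.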
