In data.table 1.7.7: 

The function unique() works on data.tables (without keys) that have factor
columns, but not on ones that have character columns: in that case the
duplicate rows are returned unchanged. Setting the key converts the character
column to a factor, after which unique() works. I can't tell from the
documentation whether this is the intended behavior. (The documentation does
say that key columns can't be character.) It would be nice if unique() worked
without converting strings to factors, because the conversion is costly on very
large data.tables, but maybe this isn't possible.

--Steve

> library(data.table)
> foo1=as.data.table(data.frame(a=c("1", "1"), b=c(2,2)))
> foo1
     a b
[1,] 1 2
[2,] 1 2
> str(foo1)
Classes ‘data.table’ and 'data.frame':  2 obs. of  2 variables:
 $ a: Factor w/ 1 level "1": 1 1
 $ b: num  2 2
> unique(foo1)
     a b
[1,] 1 2
> foo2=as.data.table(data.frame(a=c("1", "1"), b=c(2,2), 
+ stringsAsFactors=FALSE))
> foo2
     a b
[1,] 1 2
[2,] 1 2
> str(foo2)
Classes ‘data.table’ and 'data.frame':  2 obs. of  2 variables:
 $ a: chr  "1" "1"
 $ b: num  2 2
> unique(foo2)
     a b
[1,] 1 2
[2,] 1 2
> setkey(foo2, a)
> str(foo2)
Classes ‘data.table’ and 'data.frame':  2 obs. of  2 variables:
 $ a: Factor w/ 1 level "1": 1 1
 $ b: num  2 2
 - attr(*, "sorted")= chr "a"
> unique(foo2)
     a b
[1,] 1 2
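
A possible workaround that leaves the character column alone is sketched below. It is only a sketch, under the assumption that base R's duplicated() applied to a data.frame copy of the table is acceptable. The names `keep` and `deduped` are illustrative. The copy and the row comparison may themselves be expensive on very large tables, so this is not necessarily cheaper than the factor conversion.

```r
library(data.table)

# Same table as foo2 above: column a stays character, not factor.
foo2 <- as.data.table(data.frame(a = c("1", "1"), b = c(2, 2),
                                 stringsAsFactors = FALSE))

# Base R's duplicated() flags every row that repeats an earlier row.
# It is run on a plain data.frame copy, so no key is set on foo2.
keep <- !duplicated(as.data.frame(foo2))

# Keep only the first occurrence of each row.
deduped <- foo2[keep, ]

str(deduped)
```

Because no key is set, column a should remain character in both foo2 and the result.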
_______________________________________________
datatable-help mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
