In data.table 1.7.7:
unique() works on data.tables (without keys) that have factor columns, but
not on ones that have character (string) columns. In the latter case, setting
the key converts the strings to factors, and unique() then works. I can't tell
from the documentation whether this is the intended behavior. (The
documentation does say that keys can't be character columns.) It would be nice
if unique() worked without having to convert strings to factors, because of
the conversion cost in very large data.tables, but maybe this isn't possible.
--Steve
> library(data.table)
> foo1=as.data.table(data.frame(a=c("1", "1"), b=c(2,2)))
> foo1
     a b
[1,] 1 2
[2,] 1 2
> str(foo1)
Classes ‘data.table’ and 'data.frame': 2 obs. of 2 variables:
$ a: Factor w/ 1 level "1": 1 1
$ b: num 2 2
> unique(foo1)
     a b
[1,] 1 2
> foo2=as.data.table(data.frame(a=c("1", "1"), b=c(2,2),
+ stringsAsFactors=FALSE))
> foo2
     a b
[1,] 1 2
[2,] 1 2
> str(foo2)
Classes ‘data.table’ and 'data.frame': 2 obs. of 2 variables:
$ a: chr "1" "1"
$ b: num 2 2
> unique(foo2)
     a b
[1,] 1 2
[2,] 1 2
> setkey(foo2, a)
> str(foo2)
Classes ‘data.table’ and 'data.frame': 2 obs. of 2 variables:
$ a: Factor w/ 1 level "1": 1 1
$ b: num 2 2
- attr(*, "sorted")= chr "a"
> unique(foo2)
     a b
[1,] 1 2
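
A possible workaround that leaves the column as character, sketched here and not verified against 1.7.7: run the duplicate check through base R's duplicated() method for data.frames, then subset the data.table with the result. The names keep and foo2_unique are only illustrative.

```r
library(data.table)

# Same unkeyed table with a character column as foo2 above
foo2 <- as.data.table(data.frame(a = c("1", "1"), b = c(2, 2),
                                 stringsAsFactors = FALSE))

# Assumption: base R's duplicated() for data.frames compares character
# columns by value, so no conversion to factor is needed.
keep <- !duplicated(as.data.frame(foo2))

# Subset the data.table with the logical vector; column a stays character
foo2_unique <- foo2[keep]

str(foo2_unique)
```

as.data.frame() probably copies the table, so on very large data.tables this may cost about as much as the factor conversion it avoids.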
_______________________________________________
datatable-help mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help