All,
I was wondering how setkey orders a factor: does it observe the order of the
factor levels, or does it just order the factor alphabetically by label?
I would like the key to observe the order of a factor. For example, a course
taken field may run from 1 to 5 with 1=Basic Math, 2=Calculus, 3=Geometry,
4=Algebra I and 5=Algebra II. I would like the sort imposed by data.table to
"respect" the canonical ordering of the classes, not an alphabetical ordering.
I can't, however, seem to get the key to behave the way I want.
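To make the desired ordering concrete, here is a minimal sketch in base R (no data.table involved). It assumes a small hand-written vector of courses, which is hypothetical data and not the sample used in this post, and contrasts sorting by factor level with sorting the labels as plain strings.

```r
# Canonical course order, using the labels from this post.
course.levels <- c("Basic Math", "Calculus", "Geometry", "Algebra I", "Algebra II")

# Hypothetical hand-written sample, one row per course, in arbitrary order.
courses <- factor(c("Algebra II", "Geometry", "Basic Math", "Algebra I", "Calculus"),
                  levels=course.levels)

# order() on a factor sorts by the underlying integer codes, i.e. by level order.
by.level <- as.character(courses[order(courses)])

# Sorting the labels as plain character strings gives alphabetical order instead.
by.label <- sort(as.character(courses))

print(by.level)
print(by.label)
```

The first result is the ordering the key should ideally reproduce; the second is the ordering that shows up after setkey in the example.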
Here's an example:
library(data.table)
set.seed(123)
my.course.sample <- sample(1:5, 10, replace=TRUE)
X <- 1:10
Y <- factor(my.course.sample, levels=1:5, labels=c("Basic Math", "Calculus",
"Geometry", "Algebra I", "Algebra II"))
my.dt <- data.table(ID=X, COURSE=Y)
> my.dt
ID COURSE
[1,] 1 Algebra II
[2,] 2 Algebra I
[3,] 3 Algebra I
[4,] 4 Algebra II
[5,] 5 Geometry
[6,] 6 Algebra I
[7,] 7 Geometry
[8,] 8 Calculus
[9,] 9 Algebra I
[10,] 10 Geometry
setkey(my.dt, COURSE)
> my.dt
ID COURSE
[1,] 2 Algebra I
[2,] 3 Algebra I
[3,] 6 Algebra I
[4,] 9 Algebra I
[5,] 1 Algebra II
[6,] 4 Algebra II
[7,] 8 Calculus
[8,] 5 Geometry
[9,] 7 Geometry
[10,] 10 Geometry
###
### The COURSE key is alphabetizing based upon the labels
###
###
### Now try to impose a different ordering
###
Y <- factor(my.course.sample, levels=c(1,4,3,5,2), labels=c("Basic Math",
"Calculus", "Geometry", "Algebra I", "Algebra II"))
my.dt <- data.table(ID=X, COURSE=Y)
> my.dt
ID COURSE
[1,] 1 Algebra I
[2,] 2 Calculus
[3,] 3 Calculus
[4,] 4 Algebra I
[5,] 5 Geometry
[6,] 6 Calculus
[7,] 7 Geometry
[8,] 8 Algebra II
[9,] 9 Calculus
[10,] 10 Geometry
setkey(my.dt, COURSE)
> my.dt
ID COURSE
[1,] 1 Algebra I
[2,] 3 Algebra I
[3,] 9 Algebra I
[4,] 2 Algebra II
[5,] 4 Algebra II
[6,] 8 Algebra II
[7,] 7 Basic Math
[8,] 5 Calculus
[9,] 6 Calculus
[10,] 10 Geometry
Y <- factor(my.course.sample, levels=c(1,4,3,5,2), labels=c("Basic Math",
"Calculus", "Geometry", "Algebra I", "Algebra II"), ordered=TRUE)
my.dt <- data.table(ID=X, COURSE=Y)
> my.dt
ID COURSE
[1,] 1 Algebra I
[2,] 2 Calculus
[3,] 3 Calculus
[4,] 4 Algebra I
[5,] 5 Geometry
[6,] 6 Calculus
[7,] 7 Geometry
[8,] 8 Algebra II
[9,] 9 Calculus
[10,] 10 Geometry
setkey(my.dt, COURSE)
> my.dt
ID COURSE
[1,] 1 Algebra I
[2,] 4 Algebra I
[3,] 8 Algebra II
[4,] 2 Calculus
[5,] 3 Calculus
[6,] 6 Calculus
[7,] 9 Calculus
[8,] 5 Geometry
[9,] 7 Geometry
[10,] 10 Geometry
### Setting COURSE as the key for an ordered factor seems to override the
### ordering associated with the factor and impose an alphabetical order.
I'd like the key to respect the order associated with the factor.
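One possible workaround, offered only as a sketch: store the factor's integer level codes in an extra column and key on that column instead of on COURSE. The column name COURSE.CODE is an assumption made for illustration and is not part of the example above. Whether setkey on the factor itself honors level order may depend on the data.table version, which this sketch deliberately sidesteps.

```r
library(data.table)

# Canonical course order, using the labels from this post.
course.levels <- c("Basic Math", "Calculus", "Geometry", "Algebra I", "Algebra II")

set.seed(123)
my.course.sample <- sample(1:5, 10, replace=TRUE)
Y <- factor(my.course.sample, levels=1:5, labels=course.levels)

# COURSE.CODE (hypothetical column name) holds the integer level codes,
# so sorting on it follows the level order rather than the label text.
my.dt <- data.table(ID=1:10, COURSE=Y, COURSE.CODE=as.integer(Y))

# Key on the integer codes rather than on the factor itself.
setkey(my.dt, COURSE.CODE)
print(my.dt)
```

The cost is one redundant integer column; the labels stay available in COURSE for display.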
Any help with this is greatly appreciated.
Best regards,
Damian Betebenner
Center for Assessment
PO Box 351
Dover, NH 03821-0351
Phone (office): (603) 516-7900
Phone (cell): (857) 234-2474
Fax: (603) 516-7910
www.nciea.org