On 25.02.2012 19:16, Paul Johnson wrote:
Hello, Everybody:

This may not be a "bug", but for me it is an unexpected outcome. A factor
variable's levels do not retain their ordering after the levels function is
used. I supply an example in which a factor with values "BC" "AD" (in that
order) is unintentionally re-alphabetized by the levels function.

To me, this is very bad behavior. Would you agree?


# Paul Johnson 2012-02-05

x <- c("AD", "BC", "AD", "BC", "AD", "BC")
xf <- factor(x, levels = c("BC", "AD"), labels = c("Before Christ", "After Christ"))
y <- rnorm(6)

m1 <- lm(y ~ xf)

plot(y ~ xf)

abline(m1)
## Just a little problem: the line does not "go through" the box
## plot in the right spot because contrasts(xf) codes the levels as 0,1
## but the plot places xf at 1,2.

xlevels <- levels(xf)
newdf <- data.frame(xf = xlevels)

ypred <- predict(m1, newdata = newdf)

##Watch now: the plot comes out "reversed", AC before BC
plot(ypred ~ newdf$xf)

## Ah. Now I see:

levels(newdf$xf)
## Why doesn't newdf$xf respect the ordering of the levels?


Because xlevels is a character vector, and calling data.frame(xf=xlevels) coerced it to a factor without any information about the ordering, so the levels were sorted lexicographically.
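
A minimal sketch of one way to keep the ordering, assuming the goal is a prediction data frame whose factor has the same level order as xf. The names sorted_df and kept_df are made up for illustration. Since R 4.0.0, data.frame() defaults to stringsAsFactors = FALSE, so the argument is spelled out to reproduce the coercion described above.

```r
x  <- c("AD", "BC", "AD", "BC", "AD", "BC")
xf <- factor(x, levels = c("BC", "AD"),
             labels = c("Before Christ", "After Christ"))

# levels() returns a plain character vector; it is no longer a factor.
xlevels <- levels(xf)

# Coercing that character vector builds a new factor with sorted levels.
sorted_df <- data.frame(xf = xlevels, stringsAsFactors = TRUE)
levels(sorted_df$xf)   # "After Christ"  "Before Christ"

# Building the factor explicitly keeps the original level order.
kept_df <- data.frame(xf = factor(xlevels, levels = xlevels))
levels(kept_df$xf)     # "Before Christ" "After Christ"
```

Passing kept_df as newdata to predict() and plotting against kept_df$xf should then show the levels in the same order as the original factor.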

Uwe Ligges

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel