[R] factor with numeric names

Saiwing Yeung Sat, 21 Mar 2009 14:04:20 -0700

Hi all,

I have a pretty basic question about categorical variables, but I can't seem to find an answer, so I am hoping someone here can help. I found that if the factor level names are all numbers, fitting the model in lm returns coefficient labels that are not very recognizable.


# Example: let's just assume that we want to fit this model
fit <- lm(height ~ age + Seed, data=Loblolly)

# See the category names are all mangled up here
fit


Call:
lm(formula = height ~ age + Seed, data = Loblolly)

Coefficients:

(Intercept)          age       Seed.L       Seed.Q       Seed.C       Seed^4
   -1.31240      2.59052      4.86941      0.87307      0.37894     -0.46853
     Seed^5       Seed^6       Seed^7       Seed^8       Seed^9      Seed^10
    0.55237      0.39659     -0.06507      0.35074     -0.83442      0.42085
    Seed^11      Seed^12      Seed^13
    0.53906     -0.29803     -0.77254
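
The .L, .Q, .C and ^k suffixes are the column labels that polynomial contrasts (contr.poly) produce, and R applies those by default to ordered factors. So the labels may have more to do with how Seed is stored than with its level names being numbers. A minimal check, assuming only the built-in Loblolly data (the names seed and poly_labels are made up for illustration):

```r
# Sketch: inspect how the Seed factor is stored and which contrast
# columns it would contribute to a model matrix.
seed <- Loblolly$Seed

# Is the factor ordered? Ordered factors get polynomial contrasts by default.
print(is.ordered(seed))
print(class(seed))

# Default contrast functions for unordered and ordered factors.
print(options("contrasts"))

# Column labels of the contrast matrix; lm() appends these to the
# variable name to build the coefficient labels.
poly_labels <- colnames(contrasts(seed))
print(poly_labels)
```

If is.ordered() returns TRUE here, the coefficients above are polynomial trend terms across the ordered levels rather than per-level effects.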



One possible solution I found is to rename the categorical variables

seed.str <- paste("S", Loblolly$Seed, sep="")
seed.str <- factor(seed.str)
fit <- lm(height ~ age + seed.str, data=Loblolly)
fit



Call:
lm(formula = height ~ age + seed.str, data = Loblolly)

Coefficients:
 (Intercept)           age  seed.strS303  seed.strS305  seed.strS307
     -0.4301        2.5905        0.8600        1.8683       -1.9183
seed.strS309  seed.strS311  seed.strS315  seed.strS319  seed.strS321
      0.5350       -1.5933       -0.8867       -0.3650       -2.0350
seed.strS323  seed.strS325  seed.strS327  seed.strS329  seed.strS331
      0.3067       -1.3233       -2.6400       -2.9333       -2.2267

Now it is actually possible to see which one is which, but it is kind of lame. Can someone point me to a more elegant solution? Thank you so much.


Saiwing Yeung

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
