Re: [R] factor with numeric names

2011-03-25 Thread agent dunham
Dear all, 

According to the post I was trying:

factorA = c(2,2,3,3,4,4,3,4,2,2)
levels(factor <- c("lv1","lv2","lv3") )

But this returns NULL and doesn't change factor names. 

Actually, my factor is included in a data.frame, so I also tried: 

levels(df$factorA)[levels(df$factorA)==2] <- "lv1"  Also
levels(df$factorA)[levels(df$factorA)==2] <- "lv1"
levels(df$factorA)[levels(df$factorA)==3] <- "lv2"
levels(df$factorA)[levels(df$factorA)==4] <- "lv3"

then I type table(df$factorA) and it doesn't work either. 

Any help would be appreciated. Thanks, 
u...@host.com



--
View this message in context: 
http://r.789695.n4.nabble.com/factor-with-numeric-names-tp882535p3404942.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] factor with numeric names

2011-03-25 Thread agent dunham
Dear all, 

According to the post I was trying: 

factorA = c(2,2,3,3,4,4,3,4,2,2)
levels(factorA <- c("lv1","lv2","lv3") )

But this returns NULL and doesn't change factor names. 

Actually, my factor is included in a data.frame, so I also tried: 

levels(df$factorA)[levels(df$factorA)==2] <- "lv1"  Also
levels(df$factorA)[levels(df$factorA)==2] <- "lv1"
levels(df$factorA)[levels(df$factorA)==3] <- "lv2"
levels(df$factorA)[levels(df$factorA)==4] <- "lv3"

then I type table(df$factorA) and it doesn't work either. 

After attach(df), I tried also this 

for(i in 1:10)
   { if(factor(factorA)[i]==2) factorA[i]="lv2" else {;
  if(factor(factorA)[i]==3) factorA[i]="lv3" else {;
  factorA[i]= "lv4"


Any help would be appreciated. Thanks, u...@host.com

--
View this message in context: 
http://r.789695.n4.nabble.com/factor-with-numeric-names-tp882535p3405247.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] factor with numeric names

2011-03-25 Thread David Winsemius


On Mar 25, 2011, at 8:30 AM, agent dunham wrote:


> Dear all,
>
> According to the post I was trying:
>
> factorA = c(2,2,3,3,4,4,3,4,2,2)
> levels(factorA <- c("lv1","lv2","lv3") )


Well, this is wrong. Try:

levels(factorA) <- c("lv1","lv2","lv3")

> factorA
 [1] 2 2 3 3 4 4 3 4 2 2
attr(,"levels")
[1] "lv1" "lv2" "lv3"
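
A small sketch of why the parenthesis placement matters, using the vector from the post and base R only. The names wrong and res are introduced here for illustration. Note that the corrected form attaches a "levels" attribute but leaves the vector numeric; it does not turn it into a factor.

```r
factorA <- c(2, 2, 3, 3, 4, 4, 3, 4, 2, 2)

# Misplaced parenthesis: the assignment happens inside levels(), so the
# variable is overwritten with the character vector, and levels() of a
# plain character vector is NULL.
wrong <- factorA
res <- levels(wrong <- c("lv1", "lv2", "lv3"))
is.null(res)          # TRUE
wrong                 # now the three labels, not the data

# Corrected form: sets a "levels" attribute on the numeric vector.
levels(factorA) <- c("lv1", "lv2", "lv3")
is.factor(factorA)    # FALSE, still a numeric vector
```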


> But this returns NULL and doesn't change factor names.
>
> Actually, my factor is included in a data.frame, so I also tried:
>
> levels(df$factorA)[levels(df$factorA)==2] <- "lv1"  Also
> levels(df$factorA)[levels(df$factorA)==2] <- "lv1"
> levels(df$factorA)[levels(df$factorA)==3] <- "lv2"
> levels(df$factorA)[levels(df$factorA)==4] <- "lv3"
>
> then I type table(df$factorA) and it doesn't work either.
>
> After attach(df), I tried also this
>
> for(i in 1:10)
>   { if(factor(factorA)[i]==2) factorA[i]="lv2" else {;
>  if(factor(factorA)[i]==3) factorA[i]="lv3" else {;
>  factorA[i]= "lv4"


That looks painful. What book or resource are you using?

--
David Winsemius, MD
West Hartford, CT
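
For the data.frame case, one possible loop-free approach is to convert the column to a factor and relabel it in a single factor() call. This is only a sketch: the data frame below is a hypothetical stand-in for the poster's df, and the labels lv1 to lv3 are assumed to pair with the values 2, 3, 4 in that order.

```r
# Hypothetical data frame standing in for the poster's df
df <- data.frame(factorA = c(2, 2, 3, 3, 4, 4, 3, 4, 2, 2))

# levels = the values as stored, labels = the names to show instead;
# the two vectors are matched position by position
df$factorA <- factor(df$factorA,
                     levels = c(2, 3, 4),
                     labels = c("lv1", "lv2", "lv3"))

counts <- table(df$factorA)
counts
```

Because the column is a real factor afterwards, table(df$factorA) reports counts under the new names, and no attach() or element-by-element loop is needed.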



Re: [R] factor with numeric names

2009-03-25 Thread Saiwing Yeung


Thank you both so much for the answers. I think I have a better handle
on this now. Yes, Loblolly$Seed is an ordered factor, but I didn't
realize that the default contrast for an ordered factor is contr.poly.
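
A quick way to check this, assuming a session where the contrasts option has not been changed from R's defaults (the variable names below are illustrative only):

```r
# Default contrasts: one entry for unordered factors, one for ordered ones
default_contrasts <- getOption("contrasts")
default_contrasts

# contr.poly labels its columns .L, .Q, .C and then ^4, ^5, ...,
# which is where coefficient names such as Seed.L and Seed^4 come from
poly_names <- colnames(contr.poly(6))
poly_names
```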


And then I was further confused because I didn't realize the  
coefficient names generated (not just the model) are different  
depending on whether there is an intercept term (even though they were  
both contr.poly).


> lm(formula = height ~ age + Seed, data = Loblolly)

Call:
lm(formula = height ~ age + Seed, data = Loblolly)

Coefficients:
(Intercept)          age       Seed.L       Seed.Q       Seed.C       Seed^4
   -1.31240      2.59052      4.86941      0.87307      0.37894     -0.46853
     Seed^5       Seed^6       Seed^7       Seed^8       Seed^9      Seed^10
    0.55237      0.39659     -0.06507      0.35074     -0.83442      0.42085
    Seed^11      Seed^12      Seed^13
    0.53906     -0.29803     -0.77254

> lm(formula = height ~ age + Seed - 1, data = Loblolly)

Call:
lm(formula = height ~ age + Seed - 1, data = Loblolly)

Coefficients:
    age  Seed329  Seed327  Seed325  Seed307  Seed331  Seed311  Seed315  Seed321
 2.5905  -3.3635  -3.0701  -1.7535  -2.3485  -2.6568  -2.0235  -1.3168  -2.4651
Seed319  Seed301  Seed323  Seed309  Seed303  Seed305
-0.7951  -0.4301  -0.1235   0.1049   0.4299   1.4382


This should have been obvious to me...


(for the sake of completeness) I think factor() doesn't change the  
ordered-ness


# as.factor(Loblolly$Seed) doesn't remove the ordered-ness
> str(Loblolly$Seed)
 Ord.factor w/ 14 levels "329"<"327"<"325"<..: 10 10 10 10 10 10 13 13 13 13 ...

> str(as.factor(Loblolly$Seed))
 Ord.factor w/ 14 levels "329"<"327"<"325"<..: 10 10 10 10 10 10 13 13 13 13 ...


# this works though
> str(factor(Loblolly$Seed, ordered=F))
 Factor w/ 14 levels "329","327","325",..: 10 10 10 10 10 10 13 13 13 13 ...
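
Putting this together, here are two possible ways to get per-level coefficient names from lm() for Seed. This is a sketch only: Lob, fit1 and fit2 are names made up here, and the second option relies on the contrasts argument of lm() to override the coding for one term.

```r
data(Loblolly)  # from the datasets package that ships with R

# Option 1: drop the ordering explicitly on a copy, then fit as usual
Lob <- as.data.frame(Loblolly)
Lob$Seed <- factor(Lob$Seed, ordered = FALSE)
fit1 <- lm(height ~ age + Seed, data = Lob)

# Option 2: keep the ordered factor, but ask for treatment contrasts
fit2 <- lm(height ~ age + Seed, data = Loblolly,
           contrasts = list(Seed = "contr.treatment"))

names(coef(fit1))
```

Both fits use the first level as the baseline, so their coefficients should agree; which one is preferable depends on whether the ordering of Seed is needed elsewhere.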



Saiwing




[R] factor with numeric names

2009-03-21 Thread Saiwing Yeung

Hi all,

I have a pretty basic question about categorical variables, but I can't
seem to find an answer, so I am hoping someone here can help. I
found that if the factor level names are all numbers, fitting the model
in lm returns labels that are not very recognizable.


# Example: let's just assume that we want to fit this model
fit <- lm(height ~ age + Seed, data=Loblolly)

# See the category names are all mangled up here
fit


Call:
lm(formula = height ~ age + Seed, data = Loblolly)

Coefficients:
(Intercept)          age       Seed.L       Seed.Q       Seed.C       Seed^4
   -1.31240      2.59052      4.86941      0.87307      0.37894     -0.46853
     Seed^5       Seed^6       Seed^7       Seed^8       Seed^9      Seed^10
    0.55237      0.39659     -0.06507      0.35074     -0.83442      0.42085
    Seed^11      Seed^12      Seed^13
    0.53906     -0.29803     -0.77254



One possible solution I found is to rename the categorical variables

seed.str <- paste("S", Loblolly$Seed, sep="")
seed.str <- factor(seed.str)
fit <- lm(height ~ age + seed.str, data=Loblolly)
fit



Call:
lm(formula = height ~ age + seed.str, data = Loblolly)

Coefficients:
 (Intercept)           age  seed.strS303  seed.strS305  seed.strS307
     -0.4301        2.5905        0.8600        1.8683       -1.9183
seed.strS309  seed.strS311  seed.strS315  seed.strS319  seed.strS321
      0.5350       -1.5933       -0.8867       -0.3650       -2.0350
seed.strS323  seed.strS325  seed.strS327  seed.strS329  seed.strS331
      0.3067       -1.3233       -2.6400       -2.9333       -2.2267


Now it is actually possible to see which one is which, but it is kind of
lame. Can someone point me to a more elegant solution? Thank you so
much.


Saiwing Yeung



Re: [R] factor with numeric names

2009-03-21 Thread John Fox
Dear Saiwing Yeung,

You appear to be using orthogonal-polynomial contrasts (generated by
contr.poly) for Seed, which suggests that Seed is either an ordered factor
or that you've assigned these contrasts to it. Because Seed has 14 levels,
you end up fitting a degree-13 polynomial. If Seed is indeed an ordered
factor and you want to use contr.treatment instead then you could, e.g., set
Loblolly$Seed <- as.factor(Loblolly$Seed). (If I'm right about Seed being an
ordered factor, your solution worked because it changed Seed to a factor,
not because it used non-numeric level names.)

I hope this helps,
 John
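
The points above can be checked directly in base R. A brief sketch, reusing the seed.str name from the original post:

```r
data(Loblolly)

is.ordered(Loblolly$Seed)   # TRUE: Seed is an ordered factor
nlevels(Loblolly$Seed)      # 14 levels
ncol(contr.poly(14))        # 13 polynomial terms, up to degree 13

# paste() returns a character vector, and factor() on a character
# vector builds a plain (unordered) factor, hence treatment contrasts
seed.str <- factor(paste("S", Loblolly$Seed, sep = ""))
is.ordered(seed.str)        # FALSE
```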



Re: [R] factor with numeric names

2009-03-21 Thread Tal Galili
Hi Saiwing,
If all you are asking is how to rename a factor vector, the easiest way
would be to use:
levels(Loblolly$Seed) <- c(a vector of the level names you would like to use
for the factor, separated by commas)
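
As a concrete sketch of this renaming, working on a copy so the built-in data set is left alone (the variable seed and the "S" prefix are chosen here for illustration):

```r
data(Loblolly)

seed <- Loblolly$Seed                      # work on a copy
levels(seed) <- paste0("S", levels(seed))  # prefix every level name with "S"

levels(seed)[1:3]
is.ordered(seed)   # renaming levels keeps the factor ordered
```

Renaming alone changes the labels only; the factor stays ordered, so lm() would still use polynomial contrasts for it.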

If you are asking how to make your output look better, I am not sure I have
an idea (except for using summary(fit) - but I guess that is not what you
mean)

Best,
Tal







