Re: [R] cbind() and factors.

2004-12-11 Thread Frank E Harrell Jr
Gabor Grothendieck wrote:
michael watson (IAH-C) <michael.watson at bbsrc.ac.uk> writes:
: 
: Hi
: 
: I'm seeing some odd behaviour with cbind().  My code is:
: 
: > cat <- read.table("cogs_category.txt", sep="\t", header=TRUE,
: +     quote=NULL, colClasses="character")
: > colnames(cat)
: [1] "Code"        "Description"
: > is.factor(cat$Code)
: [1] FALSE
: > is.factor(cat$Description)
: [1] FALSE
: > is.factor(rainbow(nrow(cat)))
: [1] FALSE
: > cat <- cbind(cat,"Color"=rainbow(nrow(cat)))
: > is.factor(cat$Color)
: [1] TRUE
: > ?cbind
: 
: I read in a text file that has two columns, Code and Description.
: Neither of these is a factor.  I want to add a column of colours to the
: data frame using rainbow().  The rainbow function also does not return a
: factor.  However, if I cbind my data frame (which has no factors in it)
: and the results of rainbow() (which is a vector, not a factor), then for
: some reason the new column is a factor...??

Others have already explained the problem and given what is likely
the best solution, but here is one other idea, just in case.

You may require a data frame, depending on what you want to do, but
if you don't, you could alternatively use a character matrix,
since that won't result in any conversions to factor.

Let's call the data frame from read.table Cat.df, and our
matrix Cat.m.  The name cat is not wrong, but it's confusing
since there is a common R function called cat.  Now we can
write the following and don't have to worry about factors:

Cat.df <- read.table(...)
# create a character matrix and cbind Colors to it
Cat.m <- cbind(as.matrix(Cat.df), Color = rainbow(nrow(Cat.df)))

If you do find you need a data frame later, you can convert it back
like this:

Cat.df <- as.data.frame(Cat.m)
Cat.df[] <- Cat.m  # clobber factors with character data

For speed, the mApply function in the Hmisc package (used by the Hmisc 
summarize function) does looping for stratified statistical summaries by 
operating on matrices rather than data frames.  Factors are converted 
to numerics, and service routines can save and restore the levels and 
other attributes.  Here is an example from the summarize help file, plus 
related examples:

# To run mApply on a data frame:
m <- mApply(asNumericMatrix(x), race, h)
# Here assume h is a function that returns a matrix similar to x
at <- subsAttr(x)  # get original attributes and storage modes
matrix2dataFrame(m, at)

# Get stratified weighted means
g <- function(y) wtd.mean(y[,1],y[,2])
summarize(cbind(y, wts), llist(sex,race), g, stat.name='y')
mApply(cbind(y,wts), llist(sex,race), g)

# Compare speed of mApply vs. by for computing
d <- data.frame(sex=sample(c('female','male'),10,TRUE),
                country=sample(letters,10,TRUE),
                y1=runif(10), y2=runif(10))
g <- function(x) {
  y <- c(median(x[,'y1']-x[,'y2']),
         med.sum=median(x[,'y1']+x[,'y2']))
  names(y) <- c('med.diff','med.sum')
  y
}
system.time(by(d, llist(sex=d$sex,country=d$country), g))
system.time({
  x <- asNumericMatrix(d)
  a <- subsAttr(d)
  m <- mApply(x, llist(sex=d$sex,country=d$country), g)
})
system.time({
  x <- asNumericMatrix(d)
  summarize(x, llist(sex=d$sex, country=d$country), g)
})
--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University
__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] cbind() and factors.

2004-12-10 Thread Stephane DRAY
cat is a data.frame,
so the data.frame method of cbind is used,
and
?data.frame tells us that:
 Character variables passed to 'data.frame' are converted
 to factor columns unless protected by 'I'.
PS: it is not a good idea to call your data.frame cat, as there is a cat
function.


Stéphane DRAY
-- 

Département des Sciences Biologiques
Université de Montréal, C.P. 6128, succursale centre-ville
Montréal, Québec H3C 3J7, Canada
Tel : (514) 343-6111 poste 1233 Fax : (514) 343-2293
E-mail : [EMAIL PROTECTED]
-- 

Web  http://www.steph280.freesurf.fr/


Re: [R] cbind() and factors.

2004-12-10 Thread Rolf Turner

This is of the nature of an FAQ.  Data frames coerce character
vectors into factors.  If you want a character vector to stay
that way (and not become a factor) wrap it up in ``I()'':

cat <- cbind(cat,Color=I(rainbow(nrow(cat))))

(There's no need to quote the name ``Color'' in the foregoing.)
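
A minimal self-contained sketch of the behaviour described above may help.
The data frame `cats` and its values are made up here to stand in for the
contents of cogs_category.txt.  One caveat for readers on a newer R: since
R 4.0.0 the default is stringsAsFactors = FALSE, so the conversion no longer
happens by default.  The sketch therefore passes stringsAsFactors = TRUE
explicitly to reproduce what this thread describes.

```r
# Hypothetical stand-in for the data read from cogs_category.txt
cats <- data.frame(Code = c("A", "B", "C"),
                   Description = c("first", "second", "third"),
                   stringsAsFactors = FALSE)

# Behaviour described in the thread, requested explicitly here:
# the new character column is converted to a factor
as_factor <- cbind(cats, Color = rainbow(nrow(cats)), stringsAsFactors = TRUE)
is.factor(as_factor$Color)

# Protected by I(): the new column stays a character vector
protected <- cbind(cats, Color = I(rainbow(nrow(cats))))
is.factor(protected$Color)
is.character(protected$Color)

# Alternative: switch the conversion off for this one call
plain <- cbind(cats, Color = rainbow(nrow(cats)), stringsAsFactors = FALSE)
is.character(plain$Color)
```

The I() form works regardless of the stringsAsFactors default, which is
probably why it is the usual advice; the column it produces carries the
extra class "AsIs".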

cheers,

Rolf Turner
[EMAIL PROTECTED]



Re: [R] cbind() and factors.

2004-12-10 Thread Gabor Grothendieck

Others have already explained the problem and given what is likely
the best solution but here is one other idea, just in case.

You may require a data frame depending on what you want to do but
if you don't then you could alternately use a character matrix
since that won't result in any conversions to factor.

Let's call the data frame from read.table Cat.df, and our
matrix Cat.m.  The name cat is not wrong, but it's confusing
since there is a common R function called cat.  Now we can
write the following and don't have to worry about factors:

Cat.df <- read.table(...)
# create a character matrix and cbind Colors to it
Cat.m <- cbind(as.matrix(Cat.df), Color = rainbow(nrow(Cat.df)))

If you do find you need a data frame later, you can convert it back
like this:

Cat.df <- as.data.frame(Cat.m)
Cat.df[] <- Cat.m  # clobber factors with character data
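
For anyone who wants to try the matrix route end to end, here is a small
runnable sketch.  The two-row data frame is invented to replace the
read.table(...) call, and the stringsAsFactors = FALSE argument to
as.data.frame is an assumption about a newer R than the one current when
this was posted; it is shown only as an alternative to the Cat.df[] <- Cat.m
step, not as part of the original suggestion.

```r
# Hypothetical stand-in for the result of read.table(...)
Cat.df <- data.frame(Code = c("A", "B"),
                     Description = c("first", "second"),
                     stringsAsFactors = FALSE)

# Character matrix plus a colour column; a matrix cannot hold factors,
# so nothing is converted
Cat.m <- cbind(as.matrix(Cat.df), Color = rainbow(nrow(Cat.df)))
colnames(Cat.m)

# Back to a data frame, keeping every column as character
Cat.df2 <- as.data.frame(Cat.m, stringsAsFactors = FALSE)
sapply(Cat.df2, class)
```

Bear in mind that a matrix holds a single type, so this only suits data in
which every column can reasonably be treated as character.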



Re: [R] cbind() and factors.

2004-12-10 Thread Dieter Menne
Probably you called the built-in rainbow function, which returns a string.

> str(rainbow(10))
 chr [1:10] "#FF0000" ...

Dieter Menne
