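Factors are stored internally as integers (think enums, if you have used other programming languages) together with a set of labels, the levels -- it's more memory efficient than storing the whole string over and over. Calling as.numeric() on a factor returns those integer codes, not the numbers spelled out in the labels, which is why you see 4 5 7 5 ... instead of your pulse rates.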
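
A minimal sketch of that behaviour, for illustration only. The pulse values and the two-row matrix below are made up (they are not the poster's data), and the column name "Type" is hypothetical:

```r
# Hypothetical pulse-rate values held as strings (not the data from the thread).
pulse <- c("68.0", "72.0", "66.0", "72.0", "81.0")

f <- factor(pulse)

levels(f)      # the label set: "66.0" "68.0" "72.0" "81.0"
as.integer(f)  # the stored codes: 2 3 1 3 4 (positions within levels(f))

# as.numeric() on a factor returns the codes, not the values.
codes <- as.numeric(f)

# Going through the labels recovers the actual numbers.
values <- as.numeric(as.character(f))
# Equivalent form that converts each level only once:
values2 <- as.numeric(levels(f))[f]

mean(codes)    # mean of the codes, not meaningful
mean(values)   # mean of the pulse rates

# Avoiding the factor in the first place, as suggested in the quoted reply:
m <- matrix(c("bp", "68.0",
              "bp", "72.0"),
            ncol = 2, byrow = TRUE,
            dimnames = list(NULL, c("Type", "Pulse_rate")))
df <- as.data.frame(m, stringsAsFactors = FALSE)
df$Pulse_rate <- as.numeric(df$Pulse_rate)
mean(df$Pulse_rate)
```

If the column is already a factor, the as.numeric(as.character(...)) form is the usual way out; stringsAsFactors=FALSE simply keeps the strings as character so that as.numeric() behaves as expected.
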

Michael

On Wed, Feb 29, 2012 at 5:49 AM, Aniruddha Mukherjee
<aniruddha.mukher...@tcs.com> wrote:
> Hello Berend.
>
> Many thanks for your prompt reply and that helped me a lot. One more
> thing, if you please explain, I shall be highly obliged.
> Why in my case (i.e. when stringsAsFactors was TRUE by default),
>> as.numeric(matr1$Pulse_rate)
> displays the following
> [1] 4 5 7 5 9 8 6 10 3 2 5 1 10 10
> ?
>
> Best regards.
>
>
> From: Berend Hasselman <b...@xs4all.nl>
> To: Aniruddha Mukherjee <aniruddha.mukher...@tcs.com>
> Cc: R-help <r-help@r-project.org>
> Date: 02/29/2012 03:57 PM
> Subject: Re: [R] Error occurred during mean calculation of a column of a
> data frame, which is apparently contents numeric data
>
>
> On 29-02-2012, at 09:45, Aniruddha Mukherjee wrote:
>
>> Hello R people,
>>
>> How can I compute the mean of the "Pulse_rate" column of the data frame
>> or matrix from the following character object called "str_got". It has 14
>> entries and each entry has 8 values, separated by commas. Please go thru
>> the following R commands to know how I tried to unstring and unlist the
>> values to form a data frame.
>>> str_got
>> [1] "bp,67,2011-12-09T19:59:44.044+05:30,9830576102,68.0,124.0,58.0,66.0"
>> "bp,67,2011-12-09T20:19:31.031+05:30,9830576102,72.0,133.0,93.0,40.0"
>> .....
>>> matr<-matrix(unlist(strsplit(str_got, ",")), nrows, byrow=T)
>
> nrows?
> I assume this was set somewhere in your script and not shown.
> Is it length(str_got)?
>
>>> matr
>>      [,1] [,2] [,3]                            [,4]         [,5]   [,6] [,7] [,8]
>> [1,] "bp" "67" "2011-12-09T19:59:44.044+05:30" "9830576102" "68.0"
>> ......
>
>> Note column names must be inserted before computing the desired mean
>> value.
>> matr1<-as.data.frame(matr)
>
> Use matr1 <- as.data.frame(matr, stringsAsFactors=FALSE)
>
> If you don't do stringsAsFactors=FALSE the column will be a factor and
> that is not equivalent with numeric.
>
> What's wrong with
>
> matr1$Pulse_rate <- as.numeric(matr1$Pulse_rate)
>
> Then you can calculate the desired mean with
>
> mean(matr1$Pulse_rate)
>
> or
>
> mean(matr1[,"Pulse_rate"])
>
> Berend

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.