Thanks a lot to all who responded. This is a little less confusing now,
although it's hard for me to fathom the (practical) use of a data frame
within a data frame. If one mixes different notations or, put
differently, different underlying classes (data.frame vs. numeric),
these rather unintuitive results appear. So I'll use either of these:

df$pct <- df$weight / ave(df$weight, df$sex, FUN=sum)*100
df["pct"] <- df["weight"] / ave(df["weight"], df["sex"], FUN=sum)*100
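For anyone wanting to check which class each notation hands back, here is a minimal sketch. The data are made up, and the column names `half` and `nested` are hypothetical, chosen only for illustration; they are not from the thread.

```r
# Made-up toy data; `df`, `weight` and `sex` follow the names used in the thread.
set.seed(1)
df <- data.frame(weight = round(runif(10, 10, 100)),
                 sex    = round(runif(10, 0, 1)))

# `$` and `[[` return the column itself; `[` with one index keeps a data.frame.
class(df$weight)       # "numeric"
class(df[["weight"]])  # "numeric"
class(df["weight"])    # "data.frame"

# Atomic vectors on both sides: the new column is a plain numeric vector.
df$pct <- df$weight / ave(df$weight, df$sex, FUN = sum) * 100

# Single brackets on both sides: `[<-` unpacks the one-column data.frame,
# so the new column is again a plain numeric vector.
# (`half` is a hypothetical column name.)
df["half"] <- df["weight"] / 2

# Mixed notation: `$<-` stores the one-column data.frame as is,
# which produces the nested column discussed in this thread.
# (`nested` is a hypothetical column name.)
df$nested <- df["weight"] / 2

sapply(df, class)
```

The last call should report `data.frame` only for the column created with the mixed notation.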
Using str() is very insightful, as is using class(). I'd prefer it if R
simply generated an error when one attempts to nest a data.frame within
a data.frame.

Thanks again!

Cheers!!
Albert-Jan

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All right, but apart from the sanitation, the medicine, education, wine,
public order, irrigation, roads, a fresh water system, and public
health, what have the Romans ever done for us?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

________________________________
From: Brian Diggs <dig...@ohsu.edu>
To: R-help@r-project.org
Sent: Fri, June 17, 2011 11:58:44 PM
Subject: Re: [R] is this a bug?

On 6/17/2011 2:24 PM, (Ted Harding) wrote:
> And the extra twist in the tale is exemplified by this
> mini-version of Albert-Jan's first example:
>
>   DF<- data.frame(A=c(1,2,3))
>   DF$B<- c(4,5,6)
>   DF$C<- c(7,8,9)
>   DF
> #   A B C
> # 1 1 4 7
> # 2 2 5 8
> # 3 3 6 9
>
>   DF$D<- DF["A"]/DF["B"]
>   DF
> #   A B C    A
> # 1 1 4 7 0.25
> # 2 2 5 8 0.40
> # 3 3 6 9 0.50
>
> ##And why:
>
>   DF["A"]/DF["B"]
> #      A
> # 1 0.25
> # 2 0.40
> # 3 0.50
>
> ##So the ratio DF["A"]/DF["B"] comes out with the name of
> ##the numerator, "A". This is then the name given to DF$D

It's even slightly weirder than that:

str(DF)
#'data.frame':   3 obs. of  4 variables:
# $ A: num  1 2 3
# $ B: num  4 5 6
# $ C: num  7 8 9
# $ D:'data.frame':      3 obs. of  1 variable:
#  ..$ A: num  0.25 0.4 0.5

There is a column D in DF which is itself a data frame with a single
column whose name is A (because of what Ted said). When formatted for
printing out, the column name of the inner data frame is used (as a
result of how data.frame() itself handles named arguments when the
argument is itself a data.frame: "If a list or data frame or matrix is
passed to data.frame it is as if each component or column had been
passed as a separate argument...").
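The nesting described above can be spotted and undone along these lines. This is only a sketch: it reuses Ted's DF, while the names `nested` and `E` are hypothetical and not part of the thread.

```r
# Ted's mini example, rebuilt in one call.
DF <- data.frame(A = c(1, 2, 3), B = c(4, 5, 6), C = c(7, 8, 9))

# `$<-` stores the one-column data.frame as column D without unpacking it.
DF$D <- DF["A"] / DF["B"]

# The top-level name really is "D"; only the printed header shows "A".
names(DF)

# Flag columns that are themselves data frames
# (`nested` is a hypothetical helper variable).
nested <- vapply(DF, is.data.frame, logical(1))
nested

# One way to flatten: replace the nested column by its single inner vector.
DF$D <- DF$D[[1]]

# Or avoid the nesting from the start by dividing atomic columns
# (`E` is a hypothetical column name).
DF$E <- DF[["A"]] / DF[["B"]]

str(DF)
```

After the flattening step, str(DF) should list every column as `num`.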
So not a bug, but a convoluted set of circumstances that can happen when
non-atomic vectors are assigned to columns of a data.frame. That's one
of those /you shouldn't do that even though it is technically legal, or
at least you shouldn't be surprised when things don't work the way you
thought they would/ things.

> Thus Albert-Jan's
>   df["weight"] / ave(df["weight"], df["sex"], FUN=sum)*100
> comes through with name "weight".
>
> Ted.
>
>
> On 17-Jun-11 21:06:42, William Dunlap wrote:
>> df$varname is a column of df.
>>
>> df["varname"] is a one-column df containing that column.
>>
>> df[["varname"]] is a column of df (same as df$varname).
>>
>> df[,"varname"] is a column of df (same as df$varname).
>>
>> df[,"varname",drop=FALSE] is a one-column df (same as df["varname"]).
>>
>> df$newVarname<- df["varname"] inserts a new component
>> into df, the component being a one-column data.frame,
>> not the column in that data.frame.
>>
>> Bill Dunlap
>> Spotfire, TIBCO Software
>> wdunlap tibco.com
>>
>>> -----Original Message-----
>>> From: r-help-boun...@r-project.org
>>> [mailto:r-help-boun...@r-project.org] On Behalf Of Albert-Jan Roskam
>>> Sent: Friday, June 17, 2011 1:49 PM
>>> To: R Mailing List
>>> Subject: [R] is this a bug?
>>>
>>> Hello,
>>>
>>> Is the following a bug?
>>> I always thought that df$varname<-
>>> does the same as
>>> df["varname"]<-
>>>
>>> > df<- data.frame(weight=round(runif(10, 10, 100)), sex=round(runif(100, 0, 1)))
>>> > df$pct<- df["weight"] / ave(df["weight"], df["sex"], FUN=sum)*100
>>> > names(df)
>>> [1] "weight" "sex"    "pct"    ### ----------> ok
>>> > head(df)
>>> [[elided Yahoo spam]]
>>> 1     86   0 2.4002233
>>> 2     19   1 0.5643006
>>> 3     32   0 0.8931063
>>> 4     87   0 2.4281328
>>> 5     45   0 1.2559308
>>> 6     95   0 2.6514094
>>> > rm(df)
>>> > df<- data.frame(weight=round(runif(10, 10, 100)), sex=round(runif(100, 0, 1)))
>>> > df["pct"]<- df["weight"] / ave(df["weight"], df["sex"], FUN=sum)*100 ### -----> this does work
>>> > names(df)
>>> [1] "weight" "sex"    "pct"
>>> > head(df)
>>>   weight sex       pct
>>> 1     15   0 0.5246590
>>> 2     43   0 1.5040224
>>> 3     17   1 0.9284544
>>> 4     44   1 2.4030584
>>> 5     76   1 4.1507373
>>> 6     59   0 2.0636586
>>> > do.call(c, R.Version())
>>>                        platform                            arch
>>>             "i686-pc-linux-gnu"                          "i686"
>>>                              os                          system
>>>                     "linux-gnu"               "i686, linux-gnu"
>>>                          status                           major
>>>                              ""                             "2"
>>>                           minor                            year
>>>                          "11.1"                          "2010"
>>>                           month                             day
>>>                            "05"                            "31"
>>>                         svn rev                        language
>>>                         "52157"                             "R"
>>>                  version.string
>>> "R version 2.11.1 (2010-05-31)"
>>> > # Thanks!
>>>
>>> Cheers!!
>>> Albert-Jan
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>
> --------------------------------------------------------------------
> E-Mail: (Ted Harding)<ted.hard...@wlandres.net>
> Fax-to-email: +44 (0)870 094 0861
> Date: 17-Jun-11                                       Time: 22:24:41
> ------------------------------ XFMail ------------------------------
>

--
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University