Hi everyone, I have a data.frame named "eva" that looks like this:

    IND  PARTNO  VC1  EO1  EO2  EO3  EO4  EO5
    114  114001    2    5    4    4    5    4
    114  114001    2    4    4    4    4    4
    114  114001    2    4   NA   NA   NA   NA
    112  112002    2    3    3    6    2    6
    112  112002    2    1    1    3    4    4
    112  112003    2    6    6    6    5    6
    112  112003    2    5    7    6    6    6
    112  112003    2    6    6    6    4    5
    114  114004    2    2    3    3    2    4
    114  114004    2    5    3    4    4    2
    114  114004    2   NA   NA   NA   NA   NA
    113  113005    2    5    5    6    6    5
    113  113005    2    7    7    4    7    6
    111  111006    2    5    7    7    7    7
    112  112007    2    7    7    7    2    2
    112  112007    2    6    6    6    1    2
    112  112007    2    7    6    6    2    2
    111  111008    2    4    1    3    1    4
    111  111008    2    3    1    5    3    2

This is only a small part of the whole data set. "PARTNO" is a numeric variable, and I want to use it as the grouping variable to aggregate the other variables. What I want to get looks like this:

    IND  PARTNO  NUM  VC1  EO1  EO2  EO3  EO4  EO5
    114  114001    3    2  4.3    4    4  4.5    4
    112  112002    2    2    2    2  4.5    3    5
    112  112003    3    2  5.7  6.3    6    5  5.7
    114  114004    3    2  3.5    3  3.5    3    3
    113  113005    2    2    6    6    5  6.5  5.5
    111  111006    1    2    5    7    7    7    7
    112  112007    3    2  6.7  6.3  6.3  1.7    2
    111  111008    2    2  3.5    1    4    2    3

"NUM" is a newly added variable that gives the number of cases (rows) in each group defined by "PARTNO".

I have two questions about this manipulation.

The first is how to get the newly added variable "NUM". I have no idea how to do this.

The second is how to average the other variables by group. Where there are "NA" values, I want the average to be computed over the remaining cases. For example, the variable "EO1" has the values 2, 5, and "NA" for PARTNO 114004, so its average should be 3.5.

What I have done is

> aggregate(eva[,-2], by=eva[,-2], mean)

But it seems that, because of the "NA"s, "aggregate" cannot process the data. Because the "NA" values are not a small share of the data, I cannot use imputation methods. I'm not sure whether my call is right. Does anyone have any suggestions on these two problems?

Thanks in advance!

______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
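
For reference, here is a minimal sketch of one way both questions might be approached in base R. It is not a definitive answer. It assumes that "IND" and "VC1" are constant within each "PARTNO" (as they are in the sample rows), that "NUM" should count every row including those that contain "NA", and the helper names num, means and res are made up for illustration. Note that by=eva[,-2] in the call quoted in the post groups by every column except "PARTNO", so the sketch groups by "PARTNO" alone and passes na.rm = TRUE through to mean().

```r
# Rebuild the sample rows from the post.
eva <- read.table(header = TRUE, text = "
IND PARTNO VC1 EO1 EO2 EO3 EO4 EO5
114 114001 2 5 4 4 5 4
114 114001 2 4 4 4 4 4
114 114001 2 4 NA NA NA NA
112 112002 2 3 3 6 2 6
112 112002 2 1 1 3 4 4
112 112003 2 6 6 6 5 6
112 112003 2 5 7 6 6 6
112 112003 2 6 6 6 4 5
114 114004 2 2 3 3 2 4
114 114004 2 5 3 4 4 2
114 114004 2 NA NA NA NA NA
113 113005 2 5 5 6 6 5
113 113005 2 7 7 4 7 6
111 111006 2 5 7 7 7 7
112 112007 2 7 7 7 2 2
112 112007 2 6 6 6 1 2
112 112007 2 7 6 6 2 2
111 111008 2 4 1 3 1 4
111 111008 2 3 1 5 3 2
")

# Question 1: NUM = number of rows per PARTNO (rows with NA are counted too).
num <- aggregate(data.frame(NUM = eva$PARTNO),
                 by = list(PARTNO = eva$PARTNO),
                 FUN = length)

# Question 2: per-group means of the remaining columns.
# Extra arguments to aggregate() are passed on to FUN, so na.rm = TRUE
# makes mean() skip missing values column by column.
# Assumption: IND and VC1 are constant within a PARTNO, so their mean
# is simply their value.
value_cols <- c("IND", "VC1", "EO1", "EO2", "EO3", "EO4", "EO5")
means <- aggregate(eva[, value_cols],
                   by = list(PARTNO = eva$PARTNO),
                   FUN = mean, na.rm = TRUE)

# Combine both results and restore the column layout asked for in the post.
res <- merge(num, means, by = "PARTNO")
res <- res[, c("IND", "PARTNO", "NUM", "VC1",
               "EO1", "EO2", "EO3", "EO4", "EO5")]

# merge() sorts by PARTNO; put the groups back in order of first appearance.
res <- res[match(unique(eva$PARTNO), res$PARTNO), ]
rownames(res) <- NULL

print(round(res, 1))
```

If a group had only "NA" values in some column, mean(..., na.rm = TRUE) would return NaN for that cell, which may need separate handling in the full data.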
