Sorry, that should look like this:

  NAME                 ID                a   b   c   d
1 Control_1            probe~B01R01C01 381 213 345 653
2 Control_2            probe~B01R01C02 574 629 563 783
3 Control_1            probe~B01R09C01 673 511 521 967
4 Control_3            probe~B01R09C02  53 809 999  50
5 MM0289~RFU:11810.15  probe~B29R13C06 681  34 115 587
6 MM0289~RFU:9238.41   probe~B29R13C05 784 443  20 784
7 MM16597~RFU:36765.38 probe~B44R15C20 719 251 790 445
8 MM16597~RFU:41258.94 probe~B44R15C19 677 363 268 686

  NAME                 ID                a   b   c   d
1 Control_1            probe~B01R01C01   3  22 926 774
2 Control_2            probe~B01R01C02 712  13  32 179
3 Control_1            probe~B01R09C01 937 824 898 668
4 Control_3            probe~B01R09C02 464 836 508  53
5 MM0289~RFU:11810.15  probe~B29R13C06  99 544 607 984
6 MM0289~RFU:9238.41   probe~B29R13C05 605 603 862 575
7 MM16597~RFU:36765.38 probe~B44R15C20 700 923 219 582
8 MM16597~RFU:41258.94 probe~B44R15C19 132 777 497 995

--- On Thu, 1/12/11, Jabez Wilson <jabez...@yahoo.co.uk> wrote:

From: Jabez Wilson <jabez...@yahoo.co.uk>
Subject: calculate mean of multiple rows in a data frame
To: "R-Help" <r-h...@stat.math.ethz.ch>
Date: Thursday, 1 December, 2011, 20:45

Dear all,

I have a data frame (DF) in the following format:

  NAME                 ID                a   b   c   d
1 Control_1            probe~B01R01C01 381 213 345 653
2 Control_2            probe~B01R01C02 574 629 563 783
3 Control_1            probe~B01R09C01 673 511 521 967
4 Control_3            probe~B01R09C02  53 809 999  50
5 MM0289~RFU:11810.15  probe~B29R13C06 681  34 115 587
6 MM0289~RFU:9238.41   probe~B29R13C05 784 443  20 784
7 MM16597~RFU:36765.38 probe~B44R15C20 719 251 790 445
8 MM16597~RFU:41258.94 probe~B44R15C19 677 363 268 686
.....

I would like to consolidate the data frame by parsing through the rows, and where the NAME is identical, consolidate into one row and return the mean. I can do this for the first lines (Control_1 etc) by using aggregate()

    aggregate(DF[,-c(1:2)], by=list(DF$NAME), mean)

but since aggregate looks for unique lines it won't consolidate e.g. lines 5/6 and 7/8. Is there a way of telling aggregate to grep just the first part of the name (i.e. up to "~") and consolidate those? I could pre-grep the file before importing into R, but I'd like to do it within R if possible.

Thanks for any suggestions
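One possible way to do the consolidation described above, offered as a minimal sketch and not as the list's answer: strip everything from the first "~" onward with sub() and pass the shortened names to aggregate() as the grouping variable. The data frame below is retyped from the example in the post; the variable names `group` and `res` are hypothetical, and the sketch assumes that the part of NAME before "~" is the intended grouping key.

```r
# Rebuild the example data frame from the post.
DF <- data.frame(
  NAME = c("Control_1", "Control_2", "Control_1", "Control_3",
           "MM0289~RFU:11810.15", "MM0289~RFU:9238.41",
           "MM16597~RFU:36765.38", "MM16597~RFU:41258.94"),
  ID = c("probe~B01R01C01", "probe~B01R01C02", "probe~B01R09C01",
         "probe~B01R09C02", "probe~B29R13C06", "probe~B29R13C05",
         "probe~B44R15C20", "probe~B44R15C19"),
  a = c(381, 574, 673, 53, 681, 784, 719, 677),
  b = c(213, 629, 511, 809, 34, 443, 251, 363),
  c = c(345, 563, 521, 999, 115, 20, 790, 268),
  d = c(653, 783, 967, 50, 587, 784, 445, 686),
  stringsAsFactors = FALSE
)

# Drop everything from the first "~" onward; names without "~" are unchanged.
# 'group' is a hypothetical helper variable, not part of the original data.
group <- sub("~.*$", "", DF$NAME)

# Average the numeric columns (everything except NAME and ID) per group.
res <- aggregate(DF[, -c(1:2)], by = list(NAME = group), FUN = mean)
print(res)
```

With the example data this should give five rows: Control_1, Control_2, Control_3, MM0289 and MM16597, with rows 5/6 and 7/8 each collapsed into a single row of means. The ID column is dropped, as in the original aggregate() call, because it has no meaningful mean.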
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.