[R] Add a dim to an array
Dear list,

I'm trying to add a new dim to a multidimensional array. My array looks like this:

a1 <- array(1:8, c(2, 2, 2))
dimnames(a1) <- list(A = c("A1", "A2"),
                     B = c("B1", "B2"),
                     D = c("D1", "D2"))

I would like to add a new dim 'group' with the value "low". Right now I'm using this, but I think there are better ways...

a2 <- as.data.frame(as.table(a1))
a2$group <- "low"
a2 <- xtabs(Freq ~ A + B + D + group, data = a2)
a2

Thanks for any help!
Patrick

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
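For the question above, a minimal base-R sketch of one alternative (not taken from the thread): since the new dimension has length 1, the data vector does not change, so it should be enough to rebuild the array with one more entry in dim and dimnames. The name a2 follows the post; everything else is plain base R.

```r
a1 <- array(1:8, c(2, 2, 2))
dimnames(a1) <- list(A = c("A1", "A2"),
                     B = c("B1", "B2"),
                     D = c("D1", "D2"))

# Append a length-1 dimension called 'group' with the single level "low".
# The cell values keep their order; only the shape metadata grows.
a2 <- array(a1,
            dim = c(dim(a1), 1),
            dimnames = c(dimnames(a1), list(group = "low")))

dim(a2)
a2["A2", "B1", "D1", "low"]
```

Unlike the xtabs() route, this keeps a plain array rather than an object of class "xtabs"/"table".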
[R] Create a new Vector based on two columns
Hello,

I am trying to get a new vector 'x1' based on the non-NA values in columns 'a' and 'b'. I found a way, but I am sure this is not the best solution. So any ideas on how to optimize this would be great!

m <- factor(c("a1", "a1", "a2", "b1", "b2", "b3", "d1", "d1"), ordered = TRUE)
df <- data.frame(a = m, b = m)
df[1, 1] <- NA
df[4, 2] <- NA
df[2, 2] <- NA
df[6, 1] <- NA
df

w <- !apply(df, 2, is.na)
v <- apply(w, 1, FUN = function(L) which(L == TRUE)[[1]])
g <- df$a  # initialise 'g' with the right type and length
for (i in 1:nrow(df)) {
  g[i] <- df[i, v[i]]
}
df$x1 <- g

Thanks for any help
Patrick
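One possible vectorised alternative for the question above, as a sketch only (not from the thread). It assumes both columns are factors with the same levels, as in the example, and it returns NA for a row where both columns are NA instead of failing.

```r
m <- factor(c("a1", "a1", "a2", "b1", "b2", "b3", "d1", "d1"), ordered = TRUE)
df <- data.frame(a = m, b = m)
df[1, 1] <- NA
df[4, 2] <- NA
df[2, 2] <- NA
df[6, 1] <- NA

# Start from column 'a' and fill its gaps from column 'b'.
df$x1 <- df$a
missing_a <- is.na(df$a)
df$x1[missing_a] <- df$b[missing_a]
df
```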
Re: [R] Calculate the difference using ave
Thanks Dimitris,

but I would like to bind the result to the data frame, so the length should be equal to nrow(df1). BTW, sorry for the example, it wasn't very clear. Next try:

# options(stringsAsFactors = FALSE)
set.seed(123)
df1 <- data.frame(id = rep(LETTERS[1:6], 3),
                  yr = rep(c(2009:2011), each = 6),
                  water = sample(c(100:500), 18),
                  salt = sample(c(10:40), 18))

CalcDiffPct <- function(xdf) {
  n <- length(unique(xdf[["id"]]))
  n.NA <- rep(NA, n)
  w <- seq_len(nrow(xdf) - n)
  diff_pct <- xdf$salt / c(n.NA, xdf$water[w]) * 100
  diff_pct
}

# The order is important
df1 <- df1[order(df1$yr, df1$id), ]

# This works, as long as each
# combination of yr / id exists
with(df1, table(id, yr))
df1$salt_pct <- CalcDiffPct(df1)
df1

# But if I drop any row the result will be wrong
# (or 'correct', as the function doesn't handle this case)
df2 <- df1
df2 <- df2[-15, ]
with(df2, table(id, yr))
df2$salt_pct2 <- CalcDiffPct(df2)
df2

Thanks for any help!
Patrick

On 26.10.2011 14:00, Dimitris Rizopoulos wrote:
> Maybe one approach could be:
>
> set.seed(123)
> df1 <- data.frame(measure = rep(c("A1", "A2", "A3"), each = 3),
>                   water = sample(c(100:200), 9),
>                   tide = sample(c(-10:+10), 9))
>
> 100 * tail(df1$tide, -3) / head(df1$water, -3)
>
> I hope it helps.
>
> Best,
> Dimitris
>
> On 10/26/2011 12:02 PM, Patrick Hausmann wrote:
>> Dear R users,
>>
>> It may be very simple but it is being difficult for me.
>> I'd like to calculate the difference in percent between two measures.
>> My data looks like this:
>>
>> set.seed(123)
>> df1 <- data.frame(measure = rep(c("A1", "A2", "A3"), each = 3),
>>                   water = sample(c(100:200), 9),
>>                   tide = sample(c(-10:+10), 9))
>> df1
>>
>> # What I want to calculate is:
>> # tide_[A2] / water_[A1],
>> # tide_[A3] / water_[A2]
>>
>> # This 'works' for the example, but I am
>> # looking for a more general solution.
>> df1$tide_diff <- ave(df1$tide, FUN = function(L) L / c(NA, NA, NA, df1$water)) * 100
>> df1
>>
>> Thanks for any help!
>> Patrick
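A sketch of one way to make the calculation above independent of row order and of missing id/yr combinations (not from the thread): instead of shifting by position, look up the previous year's row for the same id with match(). The column name salt_pct follows the post; the key built with paste() is an illustrative choice.

```r
set.seed(123)
df1 <- data.frame(id = rep(LETTERS[1:6], 3),
                  yr = rep(2009:2011, each = 6),
                  water = sample(100:500, 18),
                  salt = sample(10:40, 18))
df2 <- df1[-15, ]  # drop one id/yr combination, as in the post

# For every row, find the row holding the same id in the previous year.
# match() returns NA when there is no such row, which propagates to the result.
prev <- match(paste(df2$id, df2$yr - 1), paste(df2$id, df2$yr))
df2$salt_pct <- df2$salt / df2$water[prev] * 100
df2
```

Rows of the first year, and rows whose previous year is missing, end up as NA rather than being paired with the wrong id.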
[R] Calculate the difference using ave
Dear R users,

It may be very simple but it is being difficult for me. I'd like to calculate the difference in percent between two measures. My data looks like this:

set.seed(123)
df1 <- data.frame(measure = rep(c("A1", "A2", "A3"), each = 3),
                  water = sample(c(100:200), 9),
                  tide = sample(c(-10:+10), 9))
df1

# What I want to calculate is:
# tide_[A2] / water_[A1],
# tide_[A3] / water_[A2]

# This 'works' for the example, but I am
# looking for a more general solution.
df1$tide_diff <- ave(df1$tide, FUN = function(L) L / c(NA, NA, NA, df1$water)) * 100
df1

Thanks for any help!
Patrick
[R] MARGIN in sweep refers to a specific column in a second df
Dear R folks,

I am doing some calculations over an array using sweep and apply.

# Sample data (from help 'addmargins')
Aye <- sample(c("Yes", "Si", "Oui"), 177, replace = TRUE)
Bee <- sample(c("Hum", "Buzz"), 177, replace = TRUE)
Sea <- sample(c("White", "Black", "Red", "Dead"), 177, replace = TRUE)
(A <- table(Aye, Bee, Sea))

apply(A, c(1, 2), sum)

## ok, sweep with fixed STATS
round(sweep(apply(A, c(1, 2), sum), 1, c(111, 333, 444), FUN = "/"), 2)

# DF with values for sweep
DF <- data.frame(answer = c(111, 333, 444),
                 Aye = c("Oui", "Si", "Yes"))

## ok, values in correct order
round(sweep(apply(A, c(1, 2), sum), 1, DF[["answer"]], FUN = "/"), 2)

## But if I change the order in DF the result is not what I want...
DF.s <- DF[order(DF$Aye, decreasing = TRUE), ]
DF.s
round(sweep(apply(A, c(1, 2), sum), 1, DF.s[["answer"]], FUN = "/"), 2)

So I would like to know how to make sweep pick the values from DF according to the Aye column, regardless of the row order in DF.

Thanks for any help
Patrick
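As far as I can tell, sweep() itself only matches by position, so one way to handle the question above is to reorder the values by name before sweeping. A minimal sketch (not from the thread); the helper name divide_by_answer and the seed are illustrative assumptions.

```r
set.seed(1)
Aye <- sample(c("Yes", "Si", "Oui"), 177, replace = TRUE)
Bee <- sample(c("Hum", "Buzz"), 177, replace = TRUE)
Sea <- sample(c("White", "Black", "Red", "Dead"), 177, replace = TRUE)
A <- table(Aye, Bee, Sea)
counts <- apply(A, c(1, 2), sum)

DF <- data.frame(answer = c(111, 333, 444),
                 Aye = c("Oui", "Si", "Yes"))
DF.s <- DF[order(DF$Aye, decreasing = TRUE), ]

# Hypothetical helper: align the lookup values to the row names of
# 'counts' with match(), then sweep along the first margin.
divide_by_answer <- function(counts, lookup) {
  divisor <- lookup$answer[match(rownames(counts), lookup$Aye)]
  sweep(counts, 1, divisor, FUN = "/")
}

round(divide_by_answer(counts, DF), 2)
round(divide_by_answer(counts, DF.s), 2)  # same result, whatever the row order
```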
[R] Expand DF with all levels of a variable
Dear list,

I would like to expand a DF with all the missing levels of a variable.

a <- c(2, 2, 3, 4, 5, 6, 7, 8, 9)
a.cut <- cut(a, breaks = c(0, 2, 6, 9, 12), right = FALSE)
(x <- data.frame(a, a.cut))

# In 'x' the level [0,2) is missing.

AddMissingLevel <- function(xdf) {
  xfac <- factor(c("[0,2)", "[2,6)", "[6,9)", "[9,12)"))
  xlevels <- levels(xfac)
  if (length(xlevels) != nlevels(factor(xdf$a.cut))) {
    v <- setdiff(xlevels, factor(xdf$a.cut))
    u <- data.frame(a = 0, a.cut = v)
    # use the argument 'xdf' here, not the global 'x'
    xdf <- rbind(u, xdf)
  }
  return(xdf)
}

AddMissingLevel(x)

Does a more general approach exist, e.g. using expand.grid?

Thanks for any help!!
Patrick
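One more general route for the question above is an outer join against the full set of levels, sketched here with merge() (not from the thread). Unlike the function in the post, which inserts a = 0, this leaves a = NA for levels that have no observations; the names all_levels and x_full are illustrative.

```r
a <- c(2, 2, 3, 4, 5, 6, 7, 8, 9)
a.cut <- cut(a, breaks = c(0, 2, 6, 9, 12), right = FALSE)
x <- data.frame(a, a.cut)

# One row per level, including levels that never occur in the data.
all_levels <- data.frame(a.cut = levels(x$a.cut))

# Left join: unobserved levels are kept and get NA in column 'a'.
x_full <- merge(all_levels, x, by = "a.cut", all.x = TRUE)
x_full$a.cut <- factor(x_full$a.cut, levels = levels(x$a.cut))
x_full <- x_full[order(x_full$a.cut, x_full$a), ]
x_full
```

With several grouping variables, expand.grid() over their levels could play the role of all_levels.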
[R] More flexible aggregate / eval
Dear list,

I would like to do some calculations using different grouping variables. My 'df' looks like this:

# Some data
set.seed(345)
id <- seq(200, 400, by = 10)
ids <- sample(substr(id, 1, 1))
group1 <- rep(1:3, each = 7)
group2 <- rep(1:2, c(10, 11))
group3 <- rep(1:4, c(5, 5, 5, 6))
df <- data.frame(id, ids, group1, group2, group3)
df <- rbind(df, df, df)
df$time <- seq(2009, 2011, each = 3)
df$x1 <- sample(0:100, 63)
df$x2 <- sample(44:234, 63)
head(df)

## For group1
d1 <- aggregate(cbind(x1, x2) ~ group1 + ids + time, data = df, sum)
d1$l_pct <- with(d1, ave(x1, list(group1, time),
                         FUN = function(x) round(prop.table(x) * 100, 1)))
op1 <- xtabs(l_pct ~ group1 + ids + time, data = d1)
ftable(op1, row.vars = c(1, 3))

## For group2
d2 <- aggregate(cbind(x1, x2) ~ group2 + ids + time, data = df, sum)
d2$l_pct <- with(d2, ave(x1, list(group2, time),
                         FUN = function(x) round(prop.table(x) * 100, 1)))
op2 <- xtabs(l_pct ~ group2 + ids + time, data = d2)
ftable(op2, row.vars = c(1, 3))

## and for group3...

## To have a more flexible solution I wrote this function:
myfun <- function(xdf, xvar) {
  fo1 <- "cbind(x1, x2) ~ "
  fo2 <- paste(fo1, xvar, " + ids + time", sep = "")
  formular <- as.formula(fo2)
  d2 <- do.call(aggregate, list(formular, data = xdf, FUN = sum))
  d2$l_pct <- with(d2, ave(x1, list(eval(as.name(xvar)), time),
                           FUN = function(x) round(prop.table(x) * 100, 1)))
  op2 <- xtabs(l_pct ~ eval(as.name(xvar)) + ids + time, data = d2)
  fop2 <- ftable(op2, row.vars = c(1, 3))
  out <- list(d2, fop2)
  return(out)
}

( out_gr1 <- myfun(df, "group1") )
( out_gr2 <- myfun(df, "group2") )
( out_gr3 <- myfun(df, "group3") )

This seems to work ok, but I am not really familiar with 'as.formula', 'eval' and 'as.name'. So I would like to know if my solution is ok or if there are better ways to solve this task.

Thanks for any help!!
Patrick
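A sketch of a variant of the function above that avoids pasting strings and eval(as.name()) (not from the thread): reformulate() builds the formula from a character vector, and d[[xvar]] picks the grouping column by name. The function name pct_by_group is illustrative, and the time column is built with rep(..., each = 21), which is an assumption about what the post intended.

```r
set.seed(345)
id <- seq(200, 400, by = 10)  # 21 ids
df <- data.frame(id = id,
                 ids = sample(substr(id, 1, 1)),
                 group1 = rep(1:3, each = 7),
                 group2 = rep(1:2, c(10, 11)))
df <- rbind(df, df, df)
df$time <- rep(2009:2011, each = 21)  # assumption: one year per stacked copy
df$x1 <- sample(0:100, 63)
df$x2 <- sample(44:234, 63)

# Hypothetical helper: aggregate x1/x2 by (xvar, ids, time) and add the
# share of x1 within each (xvar, time) cell, in percent.
pct_by_group <- function(xdf, xvar) {
  fo <- reformulate(c(xvar, "ids", "time"),
                    response = quote(cbind(x1, x2)))
  d <- aggregate(fo, data = xdf, FUN = sum)
  d$l_pct <- ave(d$x1, d[[xvar]], d$time,
                 FUN = function(x) round(prop.table(x) * 100, 1))
  d
}

out_gr1 <- pct_by_group(df, "group1")
out_gr2 <- pct_by_group(df, "group2")
head(out_gr1)
```

The xtabs()/ftable() step from the post can be built the same way, for example with reformulate(c(xvar, "ids", "time"), response = "l_pct").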
Re: [R] apply mean to a three-dimension data
Hi,

I think you could also do it this way (via array, see http://r.789695.n4.nabble.com/apply-over-list-of-data-frames-td3057968.html):

b <- list()
b[[1]] = matrix(1:4, 2, 2)
b[[2]] = matrix(10:13, 2, 2)
b[[3]] = matrix(20:23, 2, 2)

b.a <- array(unlist(b), dim = c(2, 2, 3))
(b.mean <- apply(X = b.a, MARGIN = c(1, 2), FUN = mean))
(b.sum <- apply(X = b.a, MARGIN = c(1, 2), FUN = sum))

HTH
Patrick

On 24.03.2011 16:07, Hui Du wrote:
> Hi All,
>
> Suppose I have data like
>
> b[[1]] = matrix(1:4, 2, 2)
> b[[2]] = matrix(10:13, 2, 2)
> b[[3]] = matrix(20:23, 2, 2)
>
> [[1]]
>      [,1] [,2]
> [1,]    1    3
> [2,]    2    4
>
> [[2]]
>      [,1] [,2]
> [1,]   10   12
> [2,]   11   13
>
> [[3]]
>      [,1] [,2]
> [1,]   20   22
> [2,]   21   23
>
> Now I want to calculate the mean of each cell across the list. For example
> mean of (b[[1]][1,1], b[[2]][1,1], b[[3]][1,1]),
> mean of (b[[1]][1,2], b[[2]][1,2], b[[3]][1,2]) etc.
> e.g. mean of (1, 10, 20), mean of (3, 12, 22).
>
> Could somebody tell me how to do it? Thank you in advance.
>
> HXD
[R] fraction with timelag
Dear r-help,

I have this DF

df <- data.frame(id = 1:6,
                 xout = c(1234, 2134, 234, 456, 324, 345),
                 xin = c(NA, 34, 67, 87, 34, NA))

and would like to calculate the fraction (xin_t / xout_t-1).

The result should be:
# NA, 2.76, 3.14, 37.18, 7.46, NA

I am sure there is a solution using zoo... but I don't know how...

Thanks for any help!
Patrick
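For the question above, a base-R sketch without zoo (not from the thread): shift xout down by one position and divide. The expected values in the post are percentages, so the ratio is multiplied by 100 here; the column name frac is illustrative.

```r
df <- data.frame(id = 1:6,
                 xout = c(1234, 2134, 234, 456, 324, 345),
                 xin = c(NA, 34, 67, 87, 34, NA))

# Lag xout by one row: the first row has no predecessor, hence NA.
xout_lag <- c(NA, head(df$xout, -1))
df$frac <- round(100 * df$xin / xout_lag, 2)
df$frac
# NA 2.76 3.14 37.18 7.46 NA
```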
[R] list multiplied by a factor / mapply
Dear list,

this works fine:

x <- split(iris, iris$Species)
x1 <- lapply(x, function(L) transform(L, g = L[, 1:4] * 3))

but I would like to multiply each Species by a different factor: setosa by 2, versicolor by 3 and virginica by 4. I've tried mapply, but without success. Any thoughts?

Thanks for any idea!
Patrick
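A sketch for the question above using Map(), which is mapply() with SIMPLIFY = FALSE (not from the thread). The multipliers are stored in a named vector and aligned to the list by name; the names mult and the g. prefix for the new columns are illustrative choices.

```r
x <- split(iris, iris$Species)

# One multiplier per species, looked up by name so the order does not matter.
mult <- c(versicolor = 3, setosa = 2, virginica = 4)

x1 <- Map(function(d, k) {
  # add four new columns g.<name> holding the scaled measurements
  d[paste0("g.", names(d)[1:4])] <- d[1:4] * k
  d
}, x, mult[names(x)])

head(x1$virginica, 3)
```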
Re: [R] manipulate dataframe
Hi André,

try this:

df1 <- data.frame(x1 = rep(1:3, each = 3), x2 = letters[1:9])
dfs <- split(df1, df1$x1)
df2 <- data.frame(sapply(dfs, FUN = "[[", "x2"))
colnames(df2) <- paste("d", unique(df1$x1), sep = "")
df2

HTH
Patrick

On 06.02.2011 12:13, André de Boer wrote:
> Hello,
>
> Can someone give me a hint on how to change a data.frame?
> I want to split a column into more columns depending on the value of another column.
> Thanks for the reaction,
> Andre
>
> Example:
>
> dat
>   x1 x2
> 1  1  a
> 2  1  b
> 3  1  c
> 4  2  d
> 5  2  e
> 6  2  f
> 7  3  g
> 8  3  h
> 9  3  i
>
> in
>
> dur
>   d1 d2 d3
> 1  a  d  g
> 2  b  e  h
> 3  c  f  i
[R] transform a df with a condition
Dear all,

for each A == 3 in 'df' I would like to change the variables B and K. My result should be the whole df and not the subset (A == 3)...

df <- data.frame(A = c(1, 1, 3, 2, 2, 3, 3),
                 B = c(2, 1, 1, 2, 7, 8, 7),
                 K = c("a.1", "d.2", "f.3", "a.1", "k.4", "f.9", "f.5"))

x1 <- within(df[df$A == 3, ], {
  B1 <- 5
  K1 <- gsub("f", "m", K)
})

x2 <- transform(df[df$A == 3, ], B1 = 5, K1 = gsub("f", "m", K))

Thanks for any help!
Patrick
Re: [R] transform a df with a condition
Arrg, sorry - of course I don't want *new* variables. So this is my correct example:

df <- data.frame(A = c(1, 1, 3, 2, 2, 3, 3),
                 B = c(2, 1, 1, 2, 7, 8, 7),
                 K = c("a.1", "d.2", "f.3", "a.1", "k.4", "f.9", "f.5"))

x1 <- within(df[df$A == 3, ], {
  B <- 5
  K <- gsub("f", "m", K)
})

x2 <- transform(df[df$A == 3, ], B = 5, K = gsub("f", "m", K))

Thanks
Patrick

On 16.01.2011 15:13, Patrick Hausmann wrote:
> Dear all,
>
> for each A == 3 in 'df' I would like to change the variables B and K.
> My result should be the whole df and not the subset (A == 3)...
>
> df <- data.frame(A = c(1, 1, 3, 2, 2, 3, 3),
>                  B = c(2, 1, 1, 2, 7, 8, 7),
>                  K = c("a.1", "d.2", "f.3", "a.1", "k.4", "f.9", "f.5"))
>
> x1 <- within(df[df$A == 3, ], {
>   B1 <- 5
>   K1 <- gsub("f", "m", K)
> })
>
> x2 <- transform(df[df$A == 3, ], B1 = 5, K1 = gsub("f", "m", K))
>
> Thanks for any help!
> Patrick
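A minimal sketch of one way to get the whole data frame back, as asked above (not from the thread): build a logical index once and assign into the selected rows only. stringsAsFactors = FALSE is set explicitly so that K is a character column regardless of the R version; the name sel is illustrative.

```r
df <- data.frame(A = c(1, 1, 3, 2, 2, 3, 3),
                 B = c(2, 1, 1, 2, 7, 8, 7),
                 K = c("a.1", "d.2", "f.3", "a.1", "k.4", "f.9", "f.5"),
                 stringsAsFactors = FALSE)

sel <- df$A == 3                        # rows to modify
df$B[sel] <- 5                          # overwrite B in those rows only
df$K[sel] <- gsub("f", "m", df$K[sel])  # rename f -> m in those rows only
df
```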
[R] Using combn
Dear list,

I want to apply the table function to every pair of variables in a data frame, and the result should be a list.

set.seed(123)
asd <- data.frame(a1 = sample(1:4, 20, replace = TRUE),
                  a2 = sample(1:4, 20, replace = TRUE),
                  a3 = sample(1:4, 20, replace = TRUE),
                  a4 = sample(1:4, 20, replace = TRUE))

with(asd, table(a1, a2))
with(asd, table(a1, a3))
with(asd, table(a1, a4))
...

I'm sure there is a solution using combn - but I don't get it...

combn(colnames(asd), 2)
...

Thanks for any help!
Patrick
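A sketch for the question above (not from the thread): ask combn() for the column-name pairs as a list, then tabulate each pair. The names pairs and tabs are illustrative.

```r
set.seed(123)
asd <- data.frame(a1 = sample(1:4, 20, replace = TRUE),
                  a2 = sample(1:4, 20, replace = TRUE),
                  a3 = sample(1:4, 20, replace = TRUE),
                  a4 = sample(1:4, 20, replace = TRUE))

# All pairs of column names, one character vector of length 2 per pair.
pairs <- combn(names(asd), 2, simplify = FALSE)

# table() on a two-column data frame gives the two-way cross table.
tabs <- lapply(pairs, function(p) table(asd[p]))
names(tabs) <- vapply(pairs, paste, character(1), collapse = ".")

names(tabs)
tabs[["a1.a3"]]
```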
Re: [R] Projecting data on a world map using long/lat
Hi Mathijs,

this should work:

library(maptools)
library(ggplot2)
gpclibPermit()
theme_set(theme_bw())

# setwd("C:\\foo")  # point to your local dir
# Data: http://thematicmapping.org/downloads/world_borders.php
world.shp <- readShapeSpatial("TM_WORLD_BORDERS-0.3.shp")

# check for region-id - use "FIPS"
head(world.shp@data)

## see licence, not GPL
world.shp.p <- fortify.SpatialPolygonsDataFrame(world.shp, region = "FIPS")
world <- merge(world.shp.p, world.shp, by.x = "id", by.y = "FIPS")
head(world)
dim(world)

# only the world map
p <- ggplot(data = world, aes(x = long, y = lat, group = group)) +
  geom_polygon(fill = "#63D1F4")
p <- p + geom_path(color = "white") + coord_equal()
ggsave(p, width = 11.69, height = 8.27, file = "world_map.jpg")

## Add some locations
cities <- read.table(textConnection("
long lat city pop
-58.381944 -34.599722 'Buenos Aires' 11548541
14.25 40.83 Neapel 962940"), header = TRUE)

p1 <- p + geom_point(data = cities, aes(group = NULL), shape = 5, color = 'black')
ggsave(p1, width = 11.69, height = 8.27, file = "world_map_2.jpg")

Regards,
Patrick

On 10.12.2010 17:53, mathijsdevaan wrote:
> Thanks for the suggestions, but I am not there yet (I'm a real novice). In
> the code provided by Patrick (see below), I changed the shape input (from
> sids to world) which I downloaded here:
> http://thematicmapping.org/downloads/world_borders.php. As a result I also
> need to change the CNTY_ID and id in the code, but I have no idea what to
> put there. Could you please help me? Thanks!
>
> Mathijs
>
> library(maptools)
> library(ggplot2)
> gpclibPermit()
>
> myshp <- readShapeSpatial(system.file("shapes/sids.shp", package = "maptools"))
>
> ## see licence, not GPL
> myshp.points <- fortify.SpatialPolygonsDataFrame(myshp, region = "CNTY_ID")
> shpm <- merge(myshp.points, myshp, by.x = "id", by.y = "CNTY_ID")
> head(shpm)
>
> p <- ggplot(shpm, aes(long, lat, group = group, fill = NWBIR74))
> p <- p + geom_polygon() + geom_path(color = "white") + coord_equal()
>
> ## Add some locations
> cities <- read.table(textConnection("
> long lat city val
> -78.644722 35.818889 Raleigh 323
> -80.84 35.226944 Charlotte 510
> -82.555833 35.58 Asheville 400"), header = TRUE)
>
> p <- p + geom_point(aes(fill = NULL, group = NULL, size = val),
>                     data = cities, color = 'black')
> p
Re: [R] LaTeX, MiKTeX, LyX: A Guide for the Perplexed
Hi Paul,

I am using Sweave and MiKTeX and the results are really impressive, but it's often quite complicated (or impossible) to share the rnw-files with my colleagues/clients. So it depends on whom you are working with or for.

Perhaps as an alternative you could use a simpler markup format, e.g. Markdown (with the ascii package). To convert between different markup languages, Pandoc [1] looks very promising.

HTH
Patrick

[1] http://johnmacfarlane.net/pandoc

On 08.12.2010 00:29, Paul Miller wrote:
> Hello Everyone,
>
> Been learning R over the past several months. Read several books and have
> learned a great deal about data manipulation, statistical analysis, and
> graphics.
>
> Now I want to learn how to make nice looking documents and about literate
> programming. My understanding is that R users normally do this using LaTeX,
> MiKTeX, LyX, etc. in conjunction with Sweave. An alternative might be to use
> the R2wd package to create Word documents.
>
> So I guess I have four questions:
>
> 1. How do I choose between the various options? Why would someone decide to
> use LaTeX instead of MiKTeX or vice versa for example?
>
> 2. What are the best sources of information about LaTeX, MiKTeX, LyX, etc.?
>
> 3. What is the learning curve like for each of these? What do you get for
> the time you put in learning something that is more difficult?
>
> 4. How do people who use LaTeX, MiKTeX, LyX, etc. share documents with
> people who are just using Word? How difficult does using LaTeX, MiKTeX, LyX,
> etc. make it to collaborate on projects with others?
>
> Thanks,
>
> Paul
Re: [R] Help summarizing R data frame
Here are some examples with tapply, aggregate, ddply:

x <- read.table("clipboard", head = TRUE)

with(x, tapply(quantity, identifier, sum))

aggregate(x$quantity, by = list(x$identifier), sum)

aggregate(quantity ~ identifier, data = x, sum)

library(plyr)
ddply(x, .(identifier), summarise, quantity = sum(quantity))

HTH
Patrick

On 02.12.2010 17:24, chris99 wrote:
> I am trying to aggregate data in column 2 to identifiers in col 1
>
> e.g. take this
>
> identifier quantity
> 1          10
> 1          20
> 2          30
> 1          15
> 2          10
> 3          20
>
> and make this
>
> identifier quantity
> 1          45
> 2          40
> 3          20
>
> Thanks in advance for your help!
[R] more flexible ave
Hi all,

I would like to calculate the percent of the total per group for this data.frame:

df <- data.frame(site = c("a", "a", "a", "b", "b", "b"),
                 gr = c("total", "x1", "x2", "x1", "total", "x2"),
                 value1 = c(212, 56, 87, 33, 456, 213))
df

calcPercent <- function(df) {
  df <- transform(df, pct_val1 = ave(df[, -c(1:2)], df$gr,
                  FUN = function(x) x / df[df$gr == "total", "value1"]))
}

# This works as intended...
w <- lapply(split(df, df$site), calcPercent)
w <- do.call(rbind, w)
w

# ... but when I add a new column
df$value2 <- c(1546, 560, 543, 234, 654, 312)

# the result is not what I want...
w <- lapply(split(df, df$site), calcPercent)
w <- do.call(rbind, w)
w

Clearly I have to change the function (particularly "value1") - but how... I've also played around with apply, but without any success.

Thanks for any help!
Patrick
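A sketch of one way to make the calculation above work for any number of value columns (not from the thread): pull out the "total" rows, align them to every row by site with match(), and divide the value columns in one step. It assumes exactly one "total" row per site and that all value columns start with "value"; the names val_cols, tot and pct are illustrative.

```r
df <- data.frame(site = c("a", "a", "a", "b", "b", "b"),
                 gr = c("total", "x1", "x2", "x1", "total", "x2"),
                 value1 = c(212, 56, 87, 33, 456, 213),
                 value2 = c(1546, 560, 543, 234, 654, 312))

val_cols <- grep("^value", names(df), value = TRUE)

# One 'total' row per site, then repeat it for every row of that site.
tot <- df[df$gr == "total", c("site", val_cols)]
idx <- match(df$site, tot$site)

# Element-wise division of two data frames of identical shape.
pct <- df[val_cols] / tot[idx, val_cols, drop = FALSE]
names(pct) <- paste0("pct_", val_cols)

df <- cbind(df, pct)
df
```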
Re: [R] stacking consecutive columns
Hi Gregory,

is this what you want? Ok, not the most elegant way...

# using 'melt' from the 'reshape' package
library(reshape)
Data <- data.frame(month = 1:12,
                   x2002 = runif(12),
                   x2003 = runif(12),
                   x2004 = runif(12),
                   x2005 = runif(12))

v <- list()
for (i in 2:4) {
  kk <- melt(Data[, c(1, i, i + 1)], id.vars = "month", variable_name = "year")
  v[[i - 1]] <- kk[order(kk$year, decreasing = TRUE), ]
}

out <- do.call(cbind, v)

HTH
Patrick

On 17.11.2010 15:03, Graves, Gregory wrote:
> I have a file, each column of which is a separate year, and each row of each
> column is mean precipitation for that month. Looks like this (except it
> goes back to 1964).
>
> month  X2000  X2001  X2002  X2003  X2004  X2005  X2006  X2007  X2008  X2009
>  1     1.600  1.010  4.320  2.110  0.925  3.275  3.460  0.675  1.315  2.920
>  2     2.960  3.905  3.230  2.380  2.720  1.880  2.430  1.380  2.480  2.380
>  3     1.240  1.815  1.755  1.785  1.250  3.940 10.025  0.420  2.845  2.460
>  4     3.775  1.350  2.745  0.170  0.710  2.570  0.255  0.425  4.470  1.250
>  5     4.050  1.385  5.650  1.515 12.005  6.895  7.020  4.060  7.725  2.775
>  6     8.635  8.900 15.715 12.680 16.270 12.605  7.095  7.025 10.465  7.345
>  7     5.475  7.955  7.880  6.670  7.955  7.355  5.475  5.650  7.255  7.985
>  8     8.435  5.525  7.120  6.250  7.150  7.610  5.525  6.500  6.275 10.405
>  9     5.855  7.830  7.250  7.355  9.715  7.850  6.385  7.960  4.485  7.250
> 10     7.965 11.915  6.735  8.125  7.855 10.465  4.340  6.165  2.400  3.240
> 11     1.705  1.525  0.905  1.670  1.840  2.100  0.255  2.830  4.425  1.645
> 12     2.335  0.840  0.795  1.890  0.145  1.700  0.260  2.160  2.300  2.220
>
> What I want to do is to stack 2008 data underneath 2009 data, 2007 under
> 2008, 2006 under 2007, etc. I have so far figured out that I can do this
> with the following clumsy approach:
>
> a=stack(yearmonth,select=c(X2009,X2008))
> b=stack(yearmonth,select=c(X2008,X2007))
> x=as.data.frame(c(a,b))
> write.table(x,"clipboard",sep="\t",col.names=NA) #then paste this back into Excel to get this
>
>    values  ind    values.1  ind.1
>  1  0.275  X2009   1.285    X2008
>  2  0.41   X2009   3.85     X2008
>  3  1.915  X2009   3.995    X2008
>  4  1.25   X2009   3.845    X2008
>  5  8.76   X2009   2.095    X2008
>  6  8.65   X2009   8.29     X2008
>  7  7.175  X2009   9.405    X2008
>  8  7.19   X2009  13.44     X2008
>  9  8.13   X2009   7.245    X2008
> 10  1.46   X2009   5.645    X2008
> 11  2.56   X2009   0.535    X2008
> 12  5.01   X2009   1.225    X2008
> 13  1.285  X2008   0.72     X2007
> 14  3.85   X2008   1.89     X2007
> 15  3.995  X2008   1.035    X2007
> 16  3.845  X2008   2.86     X2007
> 17  2.095  X2008   3.785    X2007
> 18  8.29   X2008   9.62     X2007
> 19  9.405  X2008   9.245    X2007
> 20 13.44   X2008   5.595    X2007
> 21  7.245  X2008   8.4      X2007
> 22  5.645  X2008   6.705    X2007
> 23  0.535  X2008   1.47     X2007
> 24  1.225  X2008   1.665    X2007
>
> Is there an easier, cleaner way to do this? Thanks.
>
> Gregory A. Graves, Lead Scientist
> Everglades REstoration COoordination and VERification (RECOVER)
> Restoration Sciences Department
> South Florida Water Management District
[R] replace NA-values
Dear list,

I'm trying to replace NA-values with the preceding values in that column. This code works, but I am sure there is a more elegant way...

# 'id' must be a factor, because levels(df$id) is used below
df <- data.frame(id = c("A1", NA, NA, NA, "B1", NA, NA, "C1", NA, NA, NA, NA),
                 value = c(1:12),
                 stringsAsFactors = TRUE)

rn <- c(rownames(df[!is.na(df$id), ]), nrow(df) + 1)
rn <- diff(as.numeric(rn))
df$id2 <- rep(levels(df$id), rn)

Thanks for any help
Patrick
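A base-R sketch for the question above (not from the thread): count the non-NA entries seen so far with cumsum() and use that count as an index into the non-NA values. Leading NAs, if any, stay NA. As far as I know, na.locf() in the zoo package does the same job for those who prefer a packaged solution.

```r
df <- data.frame(id = c("A1", NA, NA, NA, "B1", NA, NA, "C1", NA, NA, NA, NA),
                 value = 1:12,
                 stringsAsFactors = FALSE)

seen <- cumsum(!is.na(df$id))    # 1 1 1 1 2 2 2 3 3 3 3 3
vals <- df$id[!is.na(df$id)]     # "A1" "B1" "C1"

# Prepend NA so that seen == 0 (before the first value) maps to NA.
df$id2 <- c(NA, vals)[seen + 1]
df
```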
Re: [R] apply a function on elements of a list two by two
On 08.05.2010 15:43, Joris Meys wrote:
> Dear all,
>
> I want to apply a function to list elements, two by two. I hoped that combn
> would help me out, but I can't get it to work. A nested for-loop works, but
> seems highly inefficient when you have large lists. Is there a more
> efficient way of approaching this?
>
> # Make some toy data
> data(iris)
> test <- vector("list", 3)
> for (i in 1:3) {
>   x <- levels(iris$Species)[i]
>   tmp <- dist(iris[iris$Species == x, -5])
>   test[[i]] <- tmp
> }
> names(test) <- levels(iris$Species)

# Using 'lapply' and 'split' is a little bit more flexible:
test <- lapply(split(iris[, -5], iris$Species), function(x) dist(x))

> # nested for loop works
> for (i in 1:2) {
>   for (j in (i + 1):3) {
>     print(all.equal(test[[i]], test[[j]]))
>   }
> }
>
> # combn doesn't work
> combn(test, 2, all.equal)

Sorry, no answer

HTH
Patrick

> Cheers
> Joris
Re: [R] Find the three best values in every row
Hello Alfred,

I found the solution from S. Ellison (https://stat.ethz.ch/pipermail/r-help/2010-May/238158.html) really inspiring. Here I am using tail and the library 'plyr':

set.seed(17 * 11)
d <- data.frame(africa = sample(50, 10),
                europe = sample(50, 10),
                n.america = sample(50, 10),
                s.america = sample(50, 10),
                antarctica = sample((1:50) / 20, 10))

# using tail
t(apply(d, 1, function(x, n) tail(sort(x), n), n = 3))

lapply(split(d, rownames(d)), function(x, n) sort(x)[n:ncol(x)], n = 3)

# with plyr from Hadley Wickham
library(plyr)
ldply(split(d, rownames(d)), function(x, n) sort(x)[n:ncol(x)], n = 3)

HTH
Patrick

On 07.05.2010 15:43, Alfred Schulze wrote:
> Hello,
>
> I have a dataframe with the GDP for different countries (in the columns) and
> years (in the rows).
>
> Now I want for every year the best three values, if possible with the names
> of the countries (column names).
>
> For the best it's no problem, but for the other two values it is.
>
> Thanks,
>
> Alfred
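If the country names are wanted rather than the values, a small sketch along the same lines (not from the thread): sort each row in decreasing order and keep the names of the first three entries. The data here are made up for illustration, and top3 is an illustrative name.

```r
# Toy data: rows are years, columns are countries/regions.
d <- data.frame(africa = c(5, 40),
                europe = c(30, 10),
                n.america = c(20, 35),
                s.america = c(25, 15),
                antarctica = c(1, 2))

# apply() hands each row over as a named numeric vector, so sort()
# keeps the column names attached to the values.
top3 <- t(apply(d, 1, function(r) names(sort(r, decreasing = TRUE))[1:3]))
top3
```

Each row of top3 lists the three largest columns of the corresponding row of d, best first.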
Re: [R] Data frame pivoting
Hi Angelo,

try

x <- structure(list(ID = c("A1", "A1", "A1", "A1", "A1", "A2", "A2",
                           "A3", "A3", "A3", "A3", "A3"),
                    YEAR = c(2007, 2007, 2007, 2008, 2008, 2007, 2008,
                             2007, 2007, 2008, 2008, 2008),
                    PROPERTY = c("P1", "P2", "P3", "P1", "P2", "P5", "P6",
                                 "P1", "P3", "P1", "P2", "P6"),
                    VALUE = c(1, 2, 3, 10, 20, 50, 20, 1, 30, 10, 4, 25)),
               .Names = c("ID", "YEAR", "PROPERTY", "VALUE"),
               row.names = c(NA, 12L), class = "data.frame")

# package reshape
library(reshape)
xm <- melt(x, id.var = c("ID", "YEAR", "PROPERTY"))

# with cast (reshape)
cast(xm, ID ~ YEAR ~ PROPERTY)
ftable(cast(xm, ID ~ YEAR ~ PROPERTY))

# with xtabs - 0 != NA
xtabs(value ~ ID + YEAR + PROPERTY, data = xm)
ftable(xtabs(value ~ ID + YEAR + PROPERTY, data = xm))
ftable(addmargins(xtabs(value ~ ID + YEAR + PROPERTY, data = xm)))

HTH
Patrick

On 06.05.2010 09:06, angelo.lina...@bancaditalia.it wrote:
> Dear R experts,
>
> I am trying to solve this problem, related to the possibility of changing the
> shape of a data frame using a pivoting-like function.
> I have a dataframe df of observations as follows:
>
> ID  VALIDITY YEAR  PROPERTY  PROPERTY VALUE
> A1  2007           P1        V1
> A1  2007           P2        V2
> A1  2007           P3        V3
> A1  2008           P1        V10
> A1  2008           P2        V20
> A2  2007           P5        V50
> A2  2008           P6        V20
> A3  2007           P1        V1
> A3  2007           P3        V30
> A3  2008           P1        V10
> A3  2008           P2        V4
> A3  2008           P6        V25
>
> (you can imagine that this data is collected every year from a sample of people
> with several measures - weight, number of children, income... It can happen
> that some properties could be missing from some IDs).
>
> I have to obtain a data frame like this:
>
> ID  VALIDITY YEAR  P1   P2   P3   P4  P5   P6
> A1  2007           V1   V2   V3   -   -    -
> A1  200            V10  V20  -    -   -    -
> A2  2007           -    -    -    -   V50  -
> A2  2008           -    -    -    -   -    V60
> A3  2007           V1   -    V30  -   -    -
> A3  2008           V10  V4   -    -   -    V25
>
> I started using the operator by, obtaining the different slices of data:
>
> by(df, df$PROPERTY, list)
>
> but then? I also tried using tapply:
>
> tapply(df$CID, df$PROPERTY, list)
>
> obtaining a list, but I am not able to go on.
>
> Can you help me?
>
> Thank you in advance
>
> Angelo Linardi
Re: [R] transpose? reshape? flipping? challenge with data frame
Hi David,

you could use a mix of plyr and reshape:

    # Example datasets
    # Input
    propsum <- data.frame(coverClass = c("C", "G", "L", "O", "S"),
                          R209120812 = c(NA, 0.49, 0.38, 0.04, 0.09),
                          R209122212 = c(0.05, 0.35, 0.41, 0.09, 0.10))

    library(plyr)
    xpropsum <- melt(propsum, id.var = "coverClass", variable_name = "Image")
    tpropsum <- reshape(xpropsum, timevar = "coverClass", idvar = "Image",
                        direction = "wide")
    colnames(tpropsum) <- sub("value.", "", colnames(tpropsum))
    tpropsum

Cheers
Patrick

On 23.04.2010 06:43, david.gobb...@csiro.au wrote:
> Greetings all,
>
> I am having difficulty transposing, reshaping, flipping (not sure
> which) a data frame which is read from a DBF file. I have tried using
> t(), reshape() and other approaches without success. Can anyone please
> suggest a way (elegant or not) of flipping this data around?
>
> The initial data is like propsum (defined below), and I want it to
> look like tpropsum once reformed.
>
>   propsum
>     coverClass R209120812 R209122212
>   1          C         NA       0.05
>   2          G       0.49       0.35
>   3          L       0.38       0.41
>   4          O       0.04       0.09
>   5          S       0.09       0.10
>
>   tpropsum
>          Image    C    G    L    O  L.1
>   1 R209120812   NA 0.49 0.38 0.04 0.09
>   2 R209122212 0.05 0.35 0.41 0.09 0.10
>
>   # Example datasets
>   # Input
>   propsum <- data.frame(coverClass = c("C", "G", "L", "O", "S"),
>                         R209120812 = c(NA, 0.49, 0.38, 0.04, 0.09),
>                         R209122212 = c(0.05, 0.35, 0.41, 0.09, 0.10))
>
>   # Desired output
>   tpropsum <- data.frame(Image = c("R209120812", "R209122212"),
>                          C = c(NA, 0.05), G = c(0.49, 0.35),
>                          L = c(0.38, 0.41), O = c(0.04, 0.09),
>                          L = c(0.09, 0.10))
>
> Thanks,
> David
Re: [R] transpose? reshape? flipping? challenge with data frame
Oops, I meant library(reshape), not plyr. Sorry.

    # Example datasets
    # Input
    propsum <- data.frame(coverClass = c("C", "G", "L", "O", "S"),
                          R209120812 = c(NA, 0.49, 0.38, 0.04, 0.09),
                          R209122212 = c(0.05, 0.35, 0.41, 0.09, 0.10))

    library(reshape)
    xpropsum <- melt(propsum, id.var = "coverClass", variable_name = "Image")
    tpropsum <- reshape(xpropsum, timevar = "coverClass", idvar = "Image",
                        direction = "wide")
    colnames(tpropsum) <- sub("value.", "", colnames(tpropsum))
    tpropsum

HTH,
Patrick
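Because every column except coverClass is numeric, a plain transpose may already be enough here. The following base R sketch assumes exactly that; the intermediate name `m` is introduced only for illustration.

```r
propsum <- data.frame(coverClass = c("C", "G", "L", "O", "S"),
                      R209120812 = c(NA, 0.49, 0.38, 0.04, 0.09),
                      R209122212 = c(0.05, 0.35, 0.41, 0.09, 0.10))

# Transpose the numeric part; the old column names become row names
m <- t(propsum[, -1])
colnames(m) <- as.character(propsum$coverClass)

# Turn the row names into a regular column called Image
tpropsum <- data.frame(Image = rownames(m), m, row.names = NULL)
tpropsum
```

If the data frame held columns of mixed types, t() would coerce everything to character, so the melt/reshape route would be the safer choice in that case.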
Re: [R] How to use tapply for quantile
Hi James,

I don't know how to solve it with tapply (something with split, I think), but you could use plyr (from Hadley Wickham).

    library(plyr)

    # Generate some data
    set.seed(321)
    myD <- data.frame(
        Place = sample(c("AWQ", "DFR", "WEQ"), 10, replace = TRUE),
        Light = sample(LETTERS[1:2], 15, replace = TRUE),
        value = rnorm(30))
    myD[c(3, 12, 29), "value"] <- NA

    # data.frame to data.frame
    ddply(myD, .(Place, Light), summarise,
          quan_value = quantile(value, na.rm = TRUE))

    # data.frame to list
    quant <- function(df) quantile(df$value, na.rm = TRUE)
    dlply(myD, .(Place, Light), quant)

Cheers
Patrick

On 09.04.2010 03:24, James Rome wrote:
> I am trying to calculate quantiles of a data frame column split up by
> two factors:
>
>   # Calculate the quantiles
>   quarts = tapply(gdf$tt, list(gdf$Runway, gdf$OnHour), FUN=quantile,
>                   na.rm = TRUE)
>
> This does not work:
>
>   quarts
>     04L  04R       15R  22L       22R  27        32   33L       33R
>   0 NULL Numeric,5 NULL Numeric,5 NULL Numeric,5 NULL Numeric,5 NULL
>   1 NULL Numeric,5 NULL Numeric,5 NULL NULL      NULL Numeric,5 NULL
>   2 NULL NULL      NULL Numeric,5 NULL NULL      NULL NULL      NULL
>   3 NULL NULL      NULL NULL      NULL NULL      NULL Numeric,5 NULL
>   4 NULL NULL      NULL NULL      NULL NULL      NULL NULL      NULL
>   5 NULL NULL      NULL NULL      NULL NULL      NULL NULL      NULL
>   6 NULL NULL      NULL NULL      NULL NULL      NULL NULL      NULL
>   7 NULL Numeric,5 NULL NULL      NULL Numeric,5 NULL Numeric,5 NULL
>   8 NULL Numeric,5 NULL Numeric,5 NULL Numeric,5 NULL Numeric,5 NULL
>   . . .
>
> But if I leave out either of the two factors, it does work:
>
>   quarts = tapply(gdf$tt, list(gdf$Runway), FUN=quantile, na.rm = TRUE)
>   quarts
>   $`04L`
>     0%  25%  50%  75% 100%
>      4    8    9   10   20
>
>   $`04R`
>     0%  25%  50%  75% 100%
>      0    9   10   11   28
>   . . .
>
> How can I get this to work?
>
> Thanks,
> Jim Rome
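The two-factor tapply() call in the question arguably does work: when FUN returns more than one value per cell, tapply() returns a matrix whose cells are list elements, which prints as "Numeric,5" and "NULL". The sketch below uses small deterministic data made up for illustration (the names `myD`, `quarts` and `qmat` are not from the thread) to show how to pull out one cell, and a split()/sapply() variant that gives an ordinary numeric matrix.

```r
# Made-up, fully crossed example data: 3 places x 2 light levels, 5 values each
myD <- data.frame(
  Place = rep(c("AWQ", "DFR", "WEQ"), each = 10),
  Light = rep(c("A", "B"), times = 15),
  value = 1:30
)

# tapply with two factors: a 3 x 2 matrix whose cells are list elements
quarts <- tapply(myD$value, list(myD$Place, myD$Light), quantile, na.rm = TRUE)

# Extract the five quantiles of a single cell with [[row, column]]
quarts[["AWQ", "A"]]

# Alternative: one column of quantiles per Place.Light combination
qmat <- sapply(split(myD$value, list(myD$Place, myD$Light)),
               quantile, na.rm = TRUE)
qmat
```

With real data some combinations may be empty; those cells stay NULL in the tapply() result, and in the split() variant they would need to be dropped (for example with drop = TRUE in split()) before calling quantile().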
Re: [R] write.csv and header
Hi Alex,

try this:

    mfile <- "c:\\ex01.txt"
    nperm <- 12

    sDate <- paste("date: ", "2009-12-13", sep = "")
    sFile <- paste("filename: ", mfile, sep = "")
    sPerm <- paste("number of permutations: ", nperm, sep = "")

    mt <- matrix(1:10, 2)

    # Redirect console output to the file, write the header, then the data
    sink(mfile)
    cat(sDate, "\n")
    cat(sFile, "\n")
    cat(sPerm, "\n")
    cat("-", "\n\n")
    print(mt)
    sink()

Best
Patrick

Walther, Alexander wrote:
> Dear list,
>
> I would like to export a matrix to a TXT file by using write.csv (not
> necessarily). Is there a way to add a header (with additional
> information concerning the project) spanning multiple lines to this
> file before the actual data are listed? It should look like this:
>
>   date:
>   filename:
>   number of permutations:
>
>   data (as a matrix)
>
> Any suggestions? Thanks in advance.
>
> Cheers
> Alex
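If the data part should stay machine-readable (comma-separated rather than the printed matrix layout), one possible variation is to write the header with writeLines() and then append the matrix with write.table(). This is a sketch only; the temporary file path and the variable names `out_file`, `header` and `res` are assumptions made for the example.

```r
out_file <- tempfile(fileext = ".txt")  # hypothetical output path
nperm <- 12
mt <- matrix(1:10, nrow = 2)

# Header lines, followed by one blank separator line
header <- c(
  paste0("date: ", "2009-12-13"),
  paste0("filename: ", out_file),
  paste0("number of permutations: ", nperm),
  ""
)
writeLines(header, out_file)

# Append the matrix as comma-separated values, without row or column names
write.table(mt, out_file, append = TRUE, sep = ",",
            row.names = FALSE, col.names = FALSE)

res <- readLines(out_file)
res
```

To read such a file back, read.csv(out_file, skip = 4, header = FALSE) should skip the four header lines.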
[R] xtabs - missing combination
Dear list,

I am trying to make a contingency table with xtabs, but I am getting a 0 where I expect an 'NA'. Here is a simple example:

    options(stringsAsFactors = FALSE)

    rn <- LETTERS[1:4]
    df1 <- data.frame(r07 = rep(rn, each = 4), r08 = rep(rn, 4), value = 1:16)
    xtabs(value ~ r07 + r08, df1)

    # Delete the combination [A, C]
    df1 <- df1[-3, ]
    # Set 'value' for this combination to 0
    df1[13, 3] <- 0

    # This is the output I want
    tapply(df1[, "value"], df1[, c("r07", "r08")], c)

    # but using 'xtabs' I get a 0 for [A, C]
    xtabs(value ~ r07 + r08, df1)

Hmm, what have I missed?

Thanks for any help!

Best,
Patrick
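xtabs() sums the values in each cell, so a combination with no rows sums to 0 and cannot be told apart from a genuine 0. One possible workaround, sketched here under the assumption that a second table of cell counts is acceptable, is to blank out the cells that have no observations. The names `sums` and `counts` are introduced for this example.

```r
rn <- LETTERS[1:4]
df1 <- data.frame(r07 = rep(rn, each = 4), r08 = rep(rn, 4), value = 1:16,
                  stringsAsFactors = FALSE)
df1 <- df1[-3, ]   # drop the combination [A, C]
df1[13, 3] <- 0    # row 13 is now [D, B]: a genuine zero

sums   <- xtabs(value ~ r07 + r08, df1)  # cell sums (0 for empty cells)
counts <- xtabs(~ r07 + r08, df1)        # number of rows per cell

# Cells without any observation become NA; real zeros are kept
sums[counts == 0] <- NA
sums
```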
[R] ave and grouping
Dear list,

    # I have a DF like this:
    sleep$b <- c(rep(8, 10), rep(9, 10))
    sleep$me <- with(sleep, ave(extra, group, FUN = mean))
    sleep

    # I would like to create a new variable
    # holding the b-th value of group 1 and 2.

    # This is not what I want: it always takes the '8' from group '1'
    # and not the '9'
    sleep$gr <- with(sleep, ave(extra, group, FUN = function(x) x[ b[1] ]))
    sleep

Thanks for any help!
Patrick
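The attempt fails because ave() only hands the values of extra to FUN, while b[1] is looked up in the whole data frame and is therefore always 8. A possible approach, sketched here with the hypothetical names `sleep2`, `pieces` and `picked`, is to split the complete data frame so that each piece carries its own b.

```r
sleep2 <- datasets::sleep
sleep2$b <- c(rep(8, 10), rep(9, 10))

# One data frame per group, each with its own 'extra' and 'b' columns
pieces <- split(sleep2, sleep2$group)

# For each group, take the b-th value of 'extra' (b is constant within a group)
picked <- sapply(pieces, function(d) d$extra[d$b[1]])

# Map the per-group result back onto the rows
sleep2$gr <- unname(picked[as.character(sleep2$group)])
sleep2
```

For the built-in sleep data this should give 0.8 for group 1 (its 8th value) and 4.6 for group 2 (its 9th value).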
[R] lapply and aggregate function
Dear list,

there are two things I am struggling with.

    # First
    set.seed(123)
    myD <- data.frame(Light = sample(LETTERS[1:2], 10, replace = TRUE),
                      Feed = sample(letters[1:5], 20, replace = TRUE),
                      value = rnorm(20))

    # Mean for Light
    myD$meanLight <- unlist(lapply(myD$Light,
        function(x) mean(myD$value[myD$Light == x])))

    # Mean for Feed
    myD$meanFeed <- unlist(lapply(myD$Feed,
        function(x) mean(myD$value[myD$Feed == x])))
    myD

    # I would like to get a new variable meanLightFeed
    # holding the group mean for each combination (e.g. A:a = 0.821581)
    # by(myD$value, list(myD$Light, myD$Feed), mean)[[1]]

    # Second
    set.seed(321)
    myD <- data.frame(Light = sample(LETTERS[1:2], 10, replace = TRUE),
                      value = rnorm(20))

    w1 <- tapply(myD$value, myD$Light, mean)
    w1
    # w1
    #          A          B
    #  0.4753412 -0.2108387

    myfun <- function(x) (myD$value > w1[x] & myD$value < w1[x] * 1.5)

I would like to have a TRUE/FALSE variable that depends on the constraint in myfun for each level of Light.

As always, thanks for any help!
Patrick
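For the first part, ave() accepts several grouping variables and returns one value per row, so it may be all that is needed. The sketch below uses a tiny hand-made data frame (not the random data from the question) so that the group means are easy to check by eye.

```r
# Small illustrative data set; values chosen so the means are obvious
myD <- data.frame(
  Light = c("A", "A", "B", "B", "A"),
  Feed  = c("a", "a", "a", "b", "b"),
  value = c(1, 3, 5, 7, 9)
)

# ave() defaults to FUN = mean and groups by every combination
# of the grouping variables passed to it
myD$meanLight     <- ave(myD$value, myD$Light)
myD$meanFeed      <- ave(myD$value, myD$Feed)
myD$meanLightFeed <- ave(myD$value, myD$Light, myD$Feed)
myD
```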
[R] NA-values and logical operation
Dear list,

as the result of a logical operation I want to assign a new variable to a DF that contains NA values.

    z <- data.frame(x = c(5, 6, 5, NA, 7, 5, 4, NA),
                    y = c(1, 2, 2, 2, 2, 2, 2, 2))

    # [comparison operator garbled in the archive; '>=' assumed]
    p <- (z$x >= 5) & (z$y == 1)
    p
    z[p, "p1"] <- 5
    z
    # ok, this works fine

    z <- z[, -3]
    p <- (z$x >= 5) & (z$y == 2)
    p
    z[p, "p2"] <- 5
    z
    # this failed... how can I assign the value '5' to the new var p2?

Thanks for any help!
Patrick
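The second assignment fails because p contains NA (NA & TRUE is NA, whereas NA & FALSE is FALSE), and data frames reject NA in a logical row index on assignment. A common way around this is which(), which keeps only the positions that are TRUE. In the sketch below the comparison `>=` is an assumption, since the operator did not survive in the archived message.

```r
z <- data.frame(x = c(5, 6, 5, NA, 7, 5, 4, NA),
                y = c(1, 2, 2, 2, 2, 2, 2, 2))

# '>=' is assumed here; p is NA wherever x is NA and y == 2
p <- (z$x >= 5) & (z$y == 2)

# which() drops both FALSE and NA, leaving only the TRUE positions
z[which(p), "p2"] <- 5
z
```

Rows where the condition is FALSE or unknown end up with NA in p2.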
[R] lattice: Color in Barchart legend
Dear list,

the code below produces the right graph, but the colours in the legend differ from the colours in the graph. The colours in the graph are the ones I want.

Thanks for any help.
Patrick

    library(lattice)

    pal1 <- rgb(196, 255, 255, max = 255)
    pal2 <- rgb(  0,  35, 196, max = 255)

    df <- data.frame(Gruppe = c("A", "B", "A", "B"),
                     Kat = c("x1", "x1", "w1", "w1"),
                     value = c(1, 2, 4, 5))

    barchart(value ~ Kat, groups = Gruppe,
             panel = function(y, x, ...) {
                 panel.barchart(x, y, ..., col = c(pal1, pal2))
             },
             data = df,
             auto.key = list(points = FALSE, rectangles = TRUE,
                             columns = 2, space = "bottom"))
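The legend built by auto.key takes its colours from the trellis settings, not from the col argument passed inside a custom panel function, which is the likely reason for the mismatch. A sketch of one way to keep both in sync is to set the colours through par.settings; the object name `p` is introduced here only so the plot can be printed explicitly.

```r
library(lattice)

pal1 <- rgb(196, 255, 255, maxColorValue = 255)
pal2 <- rgb(  0,  35, 196, maxColorValue = 255)

df <- data.frame(Gruppe = c("A", "B", "A", "B"),
                 Kat    = c("x1", "x1", "w1", "w1"),
                 value  = c(1, 2, 4, 5))

# Bars and legend both read their fill colours from superpose.polygon
p <- barchart(value ~ Kat, groups = Gruppe, data = df,
              par.settings = list(superpose.polygon = list(col = c(pal1, pal2))),
              auto.key = list(points = FALSE, rectangles = TRUE,
                              columns = 2, space = "bottom"))
print(p)
```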
[R] strsplit and regexp
Dear list,

I am trying to split a string using a regexp:

    x <- "2 Value 34 a-c 45 t"
    strsplit(x, "[0-9]")
    [[1]]
    [1] ""        " Value " ""        " a-c "   ""        " t"

But I don't want to lose the digits (the pattern). The result should be:

    [[1]]
    [1] "2 Value" "34 a-c"  "45 t"

Thanks for any tip
Patrick
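strsplit() always discards whatever the pattern matches, so the trick is to match something other than the digits. One possible pattern, assuming the pieces are always separated by a single space, splits on a space that is followed by a digit, using a Perl-style lookahead. The name `parts` is introduced for this sketch.

```r
x <- "2 Value 34 a-c 45 t"

# Split on a space only when the next character is a digit;
# the lookahead (?=[0-9]) checks for the digit without consuming it
parts <- strsplit(x, " (?=[0-9])", perl = TRUE)[[1]]
parts
```

The lookahead requires perl = TRUE; the default regular expression engine does not support it.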
[R] tapply and grouping
Hello all,

I have a df like this:

    w <- c(1.20, 1.34, 2.34, 3.12, 2.89, 4.67, 2.43,
           2.89, 1.99, 3.45, 2.01, 2.23, 1.45, 1.59)
    g <- rep(c("a", "b"), each = 7)
    df <- data.frame(g, w)
    df

    # 1. Mean for each group
    tapply(df$w, df$g, function(x) mean(x))

    # 2. Range for each group - fixed value 0.15
    tapply(df$w, df$g,
           function(x) x[(x > mean(x) - 0.15) & (x < mean(x) + (1 - 0.15))])

Now my question: how can I use a different value instead of 0.15 for each group? As the result of a calculation I have an object vari:

    vari
       a    b
    0.41 0.08

    str(vari)
     num [, 1:2] 0.41 0.08
     - attr(*, "dimnames")=List of 1
      ..$ : chr [1:2] "a" "b"

So I want to use 0.41 for group a and 0.08 for group b instead of 0.15.

Thanks for any help!
Patrick
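One way to pair each group with its own value is to split the data and walk over the groups and the values in parallel with Map(). This is a sketch under two assumptions: the comparison operators are `>` and `<` (they are not legible in the archived message), and vari can be treated as a plain named vector. The names `groups` and `in_range` are made up for the example.

```r
w <- c(1.20, 1.34, 2.34, 3.12, 2.89, 4.67, 2.43,
       2.89, 1.99, 3.45, 2.01, 2.23, 1.45, 1.59)
g <- rep(c("a", "b"), each = 7)

# Assumed stand-in for the 'vari' object from the question
vari <- c(a = 0.41, b = 0.08)

groups <- split(w, g)

# Apply the range rule to each group with its own value v;
# vari is indexed by name so the pairing does not depend on order
in_range <- Map(function(x, v) x[x > mean(x) - v & x < mean(x) + (1 - v)],
                groups, vari[names(groups)])
in_range
```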
[R] Using tapply
Dear list,

I have a data frame like this:

    w <- c(1.2, 1.34, 2.34, 3.12, 2.43, 1.99, 2.01, 2.23, 1.45, 1.59)
    g <- rep(c("a", "b"), each = 5)
    df <- data.frame(g, w)
    df

       g    w
    1  a 1.20
    2  a 1.34
    3  a 2.34
    4  a 3.12
    5  a 2.43
    6  b 1.99
    7  b 2.01
    8  b 2.23
    9  b 1.45
    10 b 1.59

Using tapply to get the mean for each group:

    vk <- tapply(df$w, df$g, mean)
    vk
    #     a     b
    # 2.086 1.854

Now I would like to get, for each group, the first value *greater* than the mean. So for a it should be 2.34 and for b 1.99.

Thanks for any help
Patrick
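Since tapply() passes each group's values to the function, the mean can be computed inside it and used to filter right away. A minimal sketch (the name `first_above` is not from the original post):

```r
w <- c(1.2, 1.34, 2.34, 3.12, 2.43, 1.99, 2.01, 2.23, 1.45, 1.59)
g <- rep(c("a", "b"), each = 5)

# Within each group: keep the values above the group mean, take the first one
first_above <- tapply(w, g, function(x) x[x > mean(x)][1])
first_above
```

A group with all values equal has no value above its mean; the `[1]` then yields NA for that group.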
[R] ave and sd
Dear list,

I'm still trying to calculate the sd of V2 for each group in V1 where V3 is '0':

    x
       V1   V2 V3
    1 A01 2.40  0
    2 A01 3.40  1
    3 A01 2.80  0
    4 A02 3.20  0
    5 A02 4.20  0
    6 A03 2.98  1
    7 A03 2.31  0
    8 A04 4.20  0

    # Works
    x$vmean <- ave(x$V2, x$V1, x$V3 == 0, FUN = mean)

    # Works
    x$vsd2 <- ave(x$V2, x$V1, FUN = sd)

    # Doesn't work
    x$vsd <- ave(x$V2, x$V1, x$V3 == 0, FUN = sd)

Thank you for any help!
Patrick
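With two grouping variables, ave() builds every combination of their levels, including combinations that have no rows, and sd() on an empty vector is a plausible source of the failure. One way to sidestep this, sketched with the hypothetical helper name `sel`, is to select the V3 == 0 rows first and group by V1 only.

```r
x <- data.frame(
  V1 = c("A01", "A01", "A01", "A02", "A02", "A03", "A03", "A04"),
  V2 = c(2.40, 3.40, 2.80, 3.20, 4.20, 2.98, 2.31, 4.20),
  V3 = c(0, 1, 0, 0, 0, 1, 0, 0)
)

sel <- x$V3 == 0

# Rows with V3 != 0 stay NA; the others get the sd of their V1 group,
# computed from the V3 == 0 rows only
x$vsd <- NA_real_
x$vsd[sel] <- ave(x$V2[sel], as.character(x$V1[sel]), FUN = sd)
x
```

Groups with a single selected row (A03 and A04 here) get NA, because the standard deviation of one value is undefined.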
[R] ave and levels
Dear list,

I want to calculate the standard deviation using 'ave' on two different DFs. In the first DF, M1 has only 1 level:

    str(x)
    'data.frame': 18 obs. of 3 variables:
     $ M1: Factor w/ 1 level "A03": 1 1 1 1 ...
     $ M2: num 2.76 2.93 3.06 3.07 3.12 ...
     $ M3: Factor w/ 2 levels "Ausgewählt","Nicht ausgewählt": 1 1 1 1 ...

and I am getting a correct 'NA' for the last value:

    ave(x$M2, x$M1, factor(x$M3), FUN = sd)
    # [1] 0.1810123 0.1810123 0.1810123 0.1810123
    #     0.1810123 0.1810123 0.1810123 0.1810123 0.1810123
    #     0.1810123 0.1810123 0.1810123 0.1810123
    #     0.1810123 0.1810123 0.1810123 0.1810123        NA

This is the second DF (here M1 has 138 levels):

    str(k)
    'data.frame': 18 obs. of 3 variables:
     $ M1: Factor w/ 138 levels "A01","A02","A03",..: 3 3 3 3 ...
     $ M2: num 2.76 2.93 3.06 3.07 3.12 3.12 3.15 3.17 3.17 3.17 ...
     $ M3: Factor w/ 2 levels "Ausgewählt","Nicht ausgewählt": 1 1 1 1 ...

and I am getting this error:

    ave(k$M2, k$M1, factor(k$M3), FUN = sd)
    # Error in var(x, na.rm = na.rm) : 'x' is empty

So, what have I missed?

Thank you for any help!
Patrick