Re: [R] List of Variables in Original Order
Hi,

Maybe this helps you:

    set.seed(1)
    mat1<-matrix(rnorm(60,5),nrow=5,ncol=12)
    colnames(mat1)<-paste0("Var",1:12)
    vec2<-format(c(1,cor(mat1[,1],mat1[,2:12])),digits=4)
    vec3<-colnames(mat1)
    arr2<-array(rbind(vec3,vec2),dim=c(2,3,4))
    res<-data.frame(do.call(rbind,lapply(1:dim(arr2)[3],function(i) arr2[,,i])))
    res
    #        X1       X2       X3
    #1     Var1     Var2     Var3
    #2      1.0  0.27890 -0.61497
    #3     Var4     Var5     Var6
    #4  0.24916 -0.76155  0.30853
    #5     Var7     Var8     Var9
    #6 -0.46413  0.79287  0.05191
    #7    Var10    Var11    Var12
    #8 -0.06940 -0.53251  0.06766

A.K.

----- Original Message -----
From: rkulp rk...@charter.net
To: r-help@r-project.org
Cc:
Sent: Thursday, September 27, 2012 6:26 PM
Subject: [R] List of Variables in Original Order

I am trying to Sweave the output of calculating correlations between one variable and several others. I wanted to print a table where the odd-numbered rows contain the variable names and the even-numbered rows contain the correlations. So if VarA is correlated with all the variables in mydata.df, then it would look like

    var1  var2  var3
    corr1 corr2 corr3
    var4  var5  var6
    corr4 corr5 corr6
    .
    .
    etc.

I tried using a matrix for the correlations and another one for the variable names. I built the correlation matrix using

    x = matrix(format(cor(mydata.df[,1],mydata.df[,c(2:79)]),digits=4),nc=3)

and the variable names matrix using

    y = matrix(ls(mydata.df[c(2:79)]),nc=3)

The problem is that the function ls returns the names in alphabetical order, not in columnar order. How do I get the names in columnar order? Is there a better way to display the correlation of a single variable with a large number of other variables? If there is, how do I do it?

I appreciate any help I can get. This is my first project in R, so I don't know much about it yet.

--
View this message in context: http://r.789695.n4.nabble.com/List-of-Variables-in-Original-Order-tp4644436.html
Sent from the R help mailing list archive at Nabble.com.
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
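For the question about column order in this thread, names() on a data frame returns the columns in their stored order, whereas an alphabetical listing is what ls() was reported to give. The following is only a sketch on made-up data: the column names, the three-per-row layout and the helper variables (cols, vars, interleave, tab) are assumptions for illustration, not the poster's data.

```r
set.seed(1)
# Hypothetical data frame with deliberately non-alphabetical column names
cols <- c("zinc", "apple", "mango", "berry", "kiwi", "fig", "date")
mydata.df <- as.data.frame(matrix(rnorm(20 * length(cols)), ncol = length(cols)))
names(mydata.df) <- cols

sort(names(mydata.df))        # alphabetical order, not what is wanted
vars <- names(mydata.df)[-1]  # original column order, first variable dropped

# Correlation of the first column with every other column, as formatted text
cors <- format(as.vector(cor(mydata.df[[1]], mydata.df[vars])), digits = 4)

# Three variables per row: names on odd rows, correlations on even rows
k <- length(vars) / 3         # assumes the count is a multiple of 3
name_rows <- matrix(vars, ncol = 3, byrow = TRUE)
cor_rows  <- matrix(cors, ncol = 3, byrow = TRUE)
interleave <- as.vector(rbind(seq_len(k), k + seq_len(k)))
tab <- rbind(name_rows, cor_rows)[interleave, ]
print(tab, quote = FALSE)
```

Filling the matrices with byrow = TRUE keeps each name directly above its own correlation.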
Re: [R] How to test if there is a subvector in a longer vector
Hi!

28.09.2012 08:41, Atte Tenkanen wrote:
> Sorry. I should have mentioned that the order of the components is important.
> So c(1,4,6) is accepted as a subvector of c(2,1,1,4,6,3), but not of c(2,1,1,6,4,3).
> How to test this?

How about this:

--- code ---
g1 <- c(2,1,1,4,6,3)
g2 <- c(2,1,1,6,4,3)
t1 <- c(1,4,6)
t2 <- c(9,8)
!is.na(sum(match(t1,g1)))
[1] TRUE
!is.na(sum(match(t1,g2)))
[1] TRUE
!is.na(sum(match(t2,g1)))
[1] FALSE
--- code ---

Kind regards,
Kimmo
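Note that the match()-based check above returns TRUE for g2 as well, so it tests membership rather than order. A minimal sketch of an order-aware test follows, assuming "subvector" means a contiguous run; is_subvector is a hypothetical helper name, not something from the thread or from base R.

```r
# Returns TRUE if `needle` occurs in `haystack` as a contiguous run, in order.
is_subvector <- function(needle, haystack) {
  n <- length(needle)
  h <- length(haystack)
  if (n == 0L) return(TRUE)
  if (n > h) return(FALSE)
  starts <- seq_len(h - n + 1L)
  # Compare the needle against every window of the same length
  any(vapply(starts,
             function(s) all(haystack[s:(s + n - 1L)] == needle),
             logical(1)))
}

g1 <- c(2, 1, 1, 4, 6, 3)
g2 <- c(2, 1, 1, 6, 4, 3)
is_subvector(c(1, 4, 6), g1)
is_subvector(c(1, 4, 6), g2)
is_subvector(c(9, 8), g1)
```

If gaps between the matched elements should be allowed, the window comparison would need to be replaced by a sequential search; that case is not covered here.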
[R] Simple Question
Hi Everyone,

I am trying a very simple task: appending the timestamp to a variable name, so something like

    a_2012_09_27_00_12_30 <- rnorm(1,2,1)

I tried some commands, but it doesn't work out well. I hope someone has an answer.

Session Info

R version 2.15.1 (2012-06-22)
Platform: i386-apple-darwin9.8.0/i386 (32-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] chron_2.3-42 twitteR_0.99.19 rjson_0.2.9 RCurl_1.91-1 bitops_1.0-4.1 tm_0.5-7.1 RMySQL_0.9-3 DBI_0.2-5

loaded via a namespace (and not attached):
[1] slam_0.1-24 tools_2.15.1

Statement I tried:

    b <- unclass(Sys.time())
    b = 1348812597
    c_b <- rnorm(1,2,1)

This runs fine, but it doesn't give me c_1348812597.

Best Regards,
Bhupendrasinh Thakre
Re: [R] Simple Question
Hi!

28.09.2012 09:13, Bhupendrasinh Thakre wrote:
> Statement I tried :
> b <- unclass(Sys.time())
> b = 1348812597
> c_b <- rnorm(1,2,1)

Do you mean this:

--- code ---
df <- data.frame(x=0,y=0)
colnames(df)
[1] "x" "y"
colnames(df)[2] <- paste("b",unclass(Sys.time()),sep="_")
colnames(df)
[1] "x" "b_1348813791.55393"
--- code ---

HTH,
Kimmo
Re: [R] Simple Question
Hello,

Try the following:

    b <- unclass(Sys.time())
    eval(parse(text=paste("c_", b, " <- rnorm(1,2,1)", sep="")))
    ls()

Regards,
Pascal

On 28/09/2012 15:13, Bhupendrasinh Thakre wrote:
> Hi Everyone,
>
> I am trying a very simple task to append the Timestamp with a variable name so something like a_2012_09_27_00_12_30 <- rnorm(1,2,1). Tried some commands but it doesn't work out well. Hope someone has some answer on it.
>
> Statement I tried :
>
> b <- unclass(Sys.time())
> b = 1348812597
> c_b <- rnorm(1,2,1)
>
> Works perfect but doesn't show me c_1348812597.
>
> Best Regards,
> Bhupendrasinh Thakre
Re: [R] Simple Question
On Sep 27, 2012, at 11:13 PM, Bhupendrasinh Thakre wrote:

> Hi Everyone,
>
> I am trying a very simple task to append the Timestamp with a variable name so something like a_2012_09_27_00_12_30 <- rnorm(1,2,1).

If you want to assign a value to a character name, you need to use ... `assign`. You cannot just stick a numeric value, which is what you get with Sys.time(), on the LHS of a `<-` and expect R to intuit what you intend.

    ?assign
    assign( "a_2012_09_27_00_12_30" , rnorm(1,2,1) )
    assign( as.character(unclass(Sys.time())) , rnorm(1,2,1) )

(I would have thought you wanted to format that Sys.time() result:)

    format(Sys.time(), "%Y_%m_%d_%H_%M_%S")
    [1] "2012_09_27_23_32_40"
    assign(format(Sys.time(), "%Y_%m_%d_%H_%M_%S"), rnorm(1,2,1) )
    grep("^2012", ls(), value=TRUE)
    [1] "2012_09_27_23_33_45"

> Tried some commands but it doesn't work out well. Hope someone has some answer on it.
>
> Statement I tried :
>
> b <- unclass(Sys.time())
> b = 1348812597
> c_b <- rnorm(1,2,1)
>
> Works perfect but doesn't show me c_1348812597.
>
> Best Regards,
> Bhupendrasinh Thakre

BT; Please learn to post in plain text. It's really very simple with gmail.

--
David Winsemius, MD
Alameda, CA, USA
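As an alternative to creating timestamped variables with assign(), the values can be kept in a named list, which is often easier to loop over later. This is a sketch of that design choice, not something proposed in the thread; the names stamp, key and draws are hypothetical.

```r
# Build a name such as "a_2012_09_27_23_32_40" from the current time
stamp <- format(Sys.time(), "%Y_%m_%d_%H_%M_%S")
key <- paste0("a_", stamp)

# Store the value under that name in a list instead of the workspace
draws <- list()
draws[[key]] <- rnorm(1, 2, 1)

names(draws)   # the timestamped name
draws[[key]]   # the stored value
```

The list keeps the workspace free of generated names, and lapply() or names() can then work on all stored values at once.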
Re: [R] Annotate a segmented linear regression plot
On Sep 27, 2012, at 9:07 PM, Ben Harrison wrote:

> Hello,
>
> I have produced some segmented regressions with the segmented package by Vito Muggeo. I have some example data and code below. I want to annotate the individual segments with the slope parameter (actually it would be nicer to annotate with 1000*slope and add some small amount of text as well).
>
> How can I do it? Reading the docs for segmented, I can access all of the slope parameters via a named vector of the coefficients. How can I access the slope segments or locations?
>
> I have never tried to annotate an R plot before, so I don't even know how to 'pin' a bit of text to an x,y location on a plot.

    ?text   # should be fairly clear.

> dput(bullard)
> structure(list(Rt = c(14.4477, 23.6752, 26.723, 33.8508, 37.9628,
> 47.0804, 49.7232, 54.6395, 59.9251, 64.7518, 81.1629, 85.7209,
> 88.0334, 98.366, 102.6563, 105.6953, 134.8691, 137.3795, 155.0056,
> 158.6707, 162.0671, 206.7413, 248.701, 255.9407, 265.5201, 283.1462,
> 288.8939, 299.8356, 311.0788, 323.2355, 366.9049, 379.3662, 384.3869,
> 392.3246, 436.0853, 439.1246, 454.6023, 458.6247, 464.1744, 479.9764,
> 486.5171, 489.5564, 507.5925, 524.7894, 544.0806, 558.7642, 562.4293,
> 577.9268, 650.8613, 658.6664, 669.6996, 692.7172, 694.6993),
> Tem = c(14.6189, 15.2877, 15.3106, 15.3536, 15.3665, 15.3764,
> 15.3928, 15.4182, 15.4671, 15.528, 15.5921, 15.7066, 15.7806,
> 15.8747, 16.0244, 16.146, 16.481, 16.6098, 16.8581, 17.0339,
> 17.2242, 17.8379, 19.2747, 19.7184, 19.9621, 20.0953, 20.4838,
> 20.578, 20.774, 21.0112, 23.01, 23.3897, 24.1697, 24.4176, 27.0874,
> 27.3597, 28.0178, 28.4026, 28.909, 29.7406, 30.532, 30.8734,
> 32.216, 32.8198, 34.0339, 34.7553, 35.2611, 35.8303, 41.1202,
> 41.5027, 42.0578, 42.6597, 42.656)), .Names = c("Rt", "Tem"),
> class = "data.frame", row.names = c(NA, -53L))
>
> library(segmented)
> out.lm <- lm(Tem ~ Rt, data=bullard)
> o <- segmented(out.lm, seg.Z=~Rt, psi=NA, control=seg.control(display=FALSE, K=2))
> plot(o, lwd=1, col=2:6, main='Plot title')
> points(bullard)
> abline(out.lm, col="red", lwd=2)
>
> Ben.

Please learn to post in plain text.

--
David Winsemius, MD
Alameda, CA, USA
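To make the ?text pointer concrete, here is a small base-R sketch of pinning slope labels to (x, y) positions. It does not use the segmented package: the data are made up, the breakpoint at x = 5 is assumed known, and the two lm() fits stand in for the segment slopes purely for illustration.

```r
# Hypothetical two-segment data; slope 1 up to x = 5, slope 3 afterwards
x <- 1:10
y <- c(1, 2, 3, 4, 5, 8, 11, 14, 17, 20)
left  <- lm(y ~ x, subset = x <= 5)
right <- lm(y ~ x, subset = x >= 5)

slopes <- c(coef(left)[["x"]], coef(right)[["x"]])
labels <- sprintf("1000*slope = %.0f", 1000 * slopes)

pdf(NULL)  # null device, so the sketch also runs non-interactively
plot(x, y, main = "Plot title")
abline(left, lty = 2)
abline(right, lty = 3)
# text() places each label at an (x, y) position given in data coordinates;
# pos = 4 puts the label to the right of that position
text(x = c(3, 7.5), y = c(6, 12), labels = labels, pos = 4)
invisible(dev.off())
```

With a fitted segmented model, the label positions would presumably be chosen from the estimated breakpoints instead of the fixed values used here.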
Re: [R] How to test if there is a subvector in a longer vector
On 28-09-2012, at 07:41, Atte Tenkanen atte...@utu.fi wrote:

> Sorry. I should have mentioned that the order of the components is important.
> So c(1,4,6) is accepted as a subvector of c(2,1,1,4,6,3), but not of c(2,1,1,6,4,3).
> How to test this?

See this discussion for a variety of solutions.

http://r.789695.n4.nabble.com/matching-a-sequence-in-a-vector-td4389523.html#a4393453

Berend
Re: [R] Running different Regressions using for loops
Hi Rui,

Excellent!! This is what I was looking for. Thanks for the help.

So, now I have stored the results of the 10 regressions in

    summ.list <- lapply(lm.list2, summary)

and once I enter summ.list it gives me the output for all 10 regressions.

I wanted to access a beta coefficient of one of the regressions, say Price2+Media1+Trend+Seasonality, the result of which is stored in summ.list[2].

I entered the statement below to access the beta coefficient for Price2:

    summ.list[2]$coefficients[2]
    NULL

But this gives me NULL as the output.

What I am looking for is a way to access the beta value of a particular variable from a particular regression output and use it for further analysis.

Can you please help me out with this? I greatly appreciate your efforts.

Thanks & Regards,
Krunal Nanavati
9769-919198

-----Original Message-----
From: Rui Barradas [mailto:ruipbarra...@sapo.pt]
Sent: 27 September 2012 21:55
To: Krunal Nanavati
Cc: David Winsemius; r-help@r-project.org
Subject: Re: [R] Running different Regressions using for loops

Hello,

Inline.

On 27-09-2012 13:52, Krunal Nanavati wrote:
> Hi,
>
> Thanks for all your help. I am stuck again, but with a new problem, on similar lines.
>
> I have taken the problem to the next step now. I have added 2 for loops: one for the Price variable and another for the Media variable.
>
> I have taken 5 price variables and 2 media variables, with the trend and seasonality appearing in all of them, so in all there will be 10 regressions to run now:
>
> Price 1, Media 1
> Price 1, Media 2
> Price 2, Media 1
> Price 2, Media 2
> ...and so on
>
> I have built up a code for it...
>
> tryout=read.table("C:\\Users\\Krunal\\Desktop\\R tryout.csv",header=T,sep=",")
> cnames <- names(tryout)
> price <- cnames[grep("Price", cnames)]
> media <- cnames[grep("Media", cnames)]
> resp <- cnames[1]
> regr <- cnames[7:8]
> lm.list <- vector("list", 10)
> for(i in 1:5)
> {
>   regress <- paste(price[i], paste(regr, collapse = "+"), sep = "+")
>   for(j in 1:2)
>   {
>     regress1 <- paste(media[j],regress,sep="+")
>     fmla <- paste(resp, regress1, sep = "~")
>     lm.list[[i]] <- lm(as.formula(fmla), data = tryout)
>   }
> }
> summ.list <- lapply(lm.list, summary)
> summ.list
>
> But it is only running 5 regressions: only Media 1, along with the 5 Price variables, Trend & Seasonality, is regressed on Volume, giving only 5 outputs.
>
> I feel there is something wrong with the
> lm.list[[i]] <- lm(as.formula(fmla), data = tryout)
> statement.

No, I don't think so. If it's giving you only 5 outputs, the error is probably in the fmla construction. Put print statements to see the results of those paste() instructions.

Supposing your data.frame is now called tryout2,

    price <- paste("Price", 1:5, sep = "")
    media <- paste("Media", 1:2, sep = "")
    pricemedia <- apply(expand.grid(price, media, stringsAsFactors = FALSE), 1, paste, collapse="+")
    response <- "Volume"
    trendseason <- "Trend+Seasonality"  # do this only once
    lm.list2 <- list()
    for(i in seq_along(pricemedia)){
        regr <- paste(pricemedia[i], trendseason, sep = "+")
        fmla <- paste(response, regr, sep = "~")
        lm.list2[[i]] <- lm(as.formula(fmla), data = tryout2)
    }

The trick is to use ?expand.grid

Hope this helps,

Rui Barradas

> I am not sure about its placement...whether it should be in loop 2 or in loop 1
>
> Can you please help me out??
>
> Thanks & Regards,
> Krunal Nanavati
> 9769-919198
>
> -----Original Message-----
> From: Rui Barradas [mailto:ruipbarra...@sapo.pt]
> Sent: 27 September 2012 16:22
> To: David Winsemius
> Cc: Krunal Nanavati; r-help@r-project.org
> Subject: Re: [R] Running different Regressions using for loops
>
> Hello,
>
> Just to add that you can also
>
>     lapply(lm.list, coef)
>
> with a different output.
>
> Rui Barradas
>
> On 27-09-2012 09:24, David Winsemius wrote:
> > On Sep 26, 2012, at 10:31 PM, Krunal Nanavati wrote:
> > > Dear Rui,
> > >
> > > Thanks for your time.
> > >
> > > I have a question though: when I run the 5 regressions, whose outputs are stored in lm.list[i], I only get the coefficients for the Intercept, Price, Trend & Seasonality, as below
> > >
> > > lm.list[1]
> > > [[1]]
> > >
> > > Call:
> > > lm(formula = as.formula(fmla), data = tryout)
> > >
> > > Coefficients:
> > > (Intercept) Price4 Trend Seasonality
> > > 9923123 -260682664616 551392
> >
> >     summ.list <- lapply(lm.list, summary)
> >     coef.list <- lapply(summ.list, coef)
> >     coef.list
> >
> > > I am also looking out for t stats and p value and R squared.
> >
> > For the r.squared
> >
> >     rsq.vec <- sapply(summ.list, "$", "r.squared")
> >     adj.rsq <- sapply(summ.list, "$", "adj.r.squared")
> >
> > > Do you know how I can get all these statistics? Also, why is as.formula used in the lm function? It should work without that as well, right?
> >
> > No.
> >
> > > Can you please tell me why the code that I had written does not work with R? I thought it should work perfectly.
> >
> > In R there is a difference between expression objects and character objects.
> >
> > > Thanks & Regards,
> > > Krunal Nanavati
> > > 9769-919198
> > >
> > > *From:* Rui Barradas [mailto:ruipbarra...@sapo.pt]
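The expand.grid() approach quoted in this thread can be tried end to end on simulated data. The sketch below is an illustration only: tryout2 here is a made-up stand-in for the poster's file, and building each formula with reformulate() and naming the list elements are choices of this example, not part of the original code.

```r
set.seed(42)
n <- 60
# Hypothetical stand-in data; column names follow the thread
tryout2 <- data.frame(Volume = rnorm(n, 100, 10),
                      Trend = seq_len(n),
                      Seasonality = rep(1:12, length.out = n))
for (v in c(paste0("Price", 1:5), paste0("Media", 1:2))) tryout2[[v]] <- rnorm(n)

# One row per Price/Media combination: 5 x 2 = 10 models
combos <- expand.grid(price = paste0("Price", 1:5),
                      media = paste0("Media", 1:2),
                      stringsAsFactors = FALSE)

fits <- Map(function(p, m) {
  # reformulate() builds Volume ~ p + m + Trend + Seasonality
  lm(reformulate(c(p, m, "Trend", "Seasonality"), response = "Volume"),
     data = tryout2)
}, combos$price, combos$media)
names(fits) <- paste(combos$price, combos$media, sep = "+")

length(fits)
names(coef(fits[["Price2+Media1"]]))
```

Naming the list by combination makes it possible to pick a model by its label rather than by position.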
[R] blank plot----how do I make symbols appear
Hi,

I am trying to create a scatterplot, coding each point to one of 5 populations. I was successful when I did this for one set of data, yet when I try plotting other data a blank plot appears (although the axes are labelled and I can fit the regression lines from each population). I have tried a variety of things to fix this, but nothing seems to work. I can plot the points if I do not specify that I want each population to have a particular symbol. However, once I add the command [grip$Morph] to my symbol parameter (e.g., pch=c(2,6,5,19,15) [grip$morph] ), I lose all the points. As I mentioned above, I was able to create a plot successfully using other data points from the same table (different columns), so I know the data are fine.

Has anyone come across this before?

R-script used:

    HAND<-AllMal[,c(2,4,5)]
    na.omit(HAND)->HAND
    write.csv(HAND, "grip.csv")
    read.csv("grip.csv")->grip
    grip
    class(grip)
    class(HAND)
    grip$morph<-as.character(grip$Morph)
    morph<- grip$morph
    BML<-grip$BML
    grip$MCF->MCF
    reg1<-lm(BML~MCF,data=subset(grip,morph=="mel"));reg1
    reg2<-lm(BML~MCF,data=subset(grip,morph=="tham"));reg2
    reg3<-lm(BML~MCF,data=subset(grip,morph=="A"));reg3
    reg4<-lm(BML~MCF,data=subset(grip,morph=="B"));reg4
    reg5<-lm(BML~MCF,data=subset(grip,morph=="C"));reg5
    plot(MCF,BML,pch=c(2,6,5,19,15)[grip$morph],xlab="Residual Metacarpal Length",ylab="Residual Hand Strength (Broad Dowel)", main="Males")
    abline(reg1,lty=1)
    abline(reg2,lty=2)
    abline(reg3,lty=3)
    abline(reg4,lty=4)
    abline(reg5,lty=6)

--
Jessica da Silva
PhD Candidate
Molecular Ecology Evolution Program
Applied Biodiversity Research
Kirstenbosch Research Centre
South African National Biodiversity Institute

Postal address: 3 Sangster Road, Howick, KZN 3290
Home/Fax: +27 33 330 2230
Cell: +27 79 045 1781
Email: jessica.m.dasi...@gmail.com, j.dasi...@sanbi.org.za
Website: http://jmdasilva.doodlekit.com/home/home
Re: [R] blank plot----how do I make symbols appear
Jessica da Silva jessica.m.dasilva at gmail.com writes:

> I am trying to create a scatterplot, coding each point to one of 5 populations. I was successful when I did this for one set of data, yet when I try plotting other data a blank plot appears (although the axes are labelled and I can fit the regression lines from each population).
>
> However, once I add the command [grip$Morph] to my symbol parameter (e.g., pch=c(2,6,5,19,15) [grip$morph] ), I lose all the points. As I mentioned above, I was able to create a plot successfully using other data points from the same table (different columns), so I know the data are fine.

Try

    grip$morph<-unclass(grip$Morph)

instead. Look at what as.character(factor(letters[1:3])) gives you.
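The effect behind this advice can be reproduced without any plotting. In the sketch below the factor levels mirror the thread, while the vector names (symbols, pch_map) and the named-lookup variant at the end are additions of this example. Indexing an unnamed vector by character strings yields NA, and points with pch = NA are not drawn, which would explain the blank plot.

```r
morph <- factor(c("mel", "tham", "A", "B", "C", "mel"))
symbols <- c(2, 6, 5, 19, 15)

# Character subscripts look up names; `symbols` has none, so every result is NA
symbols[as.character(morph)]

# The integer level codes index by position, so this returns real symbols
symbols[unclass(morph)]

# A named lookup makes the population-to-symbol mapping explicit
pch_map <- c(mel = 2, tham = 6, A = 5, B = 19, C = 15)
pch_by_name <- unname(pch_map[as.character(morph)])
pch_by_name
```

With the positional version, the symbol each population gets depends on the order of the factor levels; the named lookup avoids that dependence.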
Re: [R] Running different Regressions using for loops
Hello, Krunal,

try

    summ.list[[2]]$coefficients[2]

Note the double square brackets (as summ.list is a list)!

Hth,
Gerrit

On Fri, 28 Sep 2012, Krunal Nanavati wrote:

> Hi Rui,
>
> Excellent!! This is what I was looking for. Thanks for the help.
>
> So, now I have stored the result of the 10 regressions in
>
> summ.list <- lapply(lm.list2, summary)
>
> And now once I enter summ.list it gives me the output for all the 10 regressions...
>
> I wanted to access a beta coefficient of one of the regressions, say Price2+Media1+Trend+Seasonality, the result of which is stored in summ.list[2]
>
> I entered the below statement for accessing the Beta coefficient for Price2...
>
> summ.list[2]$coefficients[2]
> NULL
>
> But this is giving me NULL as the output...
>
> What I am looking for, is to access a beta value of a particular variable from a particular regression output and use it for further analysis.
>
> [snip]
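The difference between single and double brackets can be checked on built-in data. This is a sketch using the cars and mtcars datasets that ship with R, not the poster's regressions; the object names fits and wt_est are made up for the example.

```r
fits <- list(lm(dist ~ speed, data = cars),
             lm(mpg ~ wt + hp, data = mtcars))
summ.list <- lapply(fits, summary)

class(summ.list[2])        # "list": `[` returns a list of length one
summ.list[2]$coefficients  # NULL, because that one-element list has no such component

class(summ.list[[2]])      # "summary.lm": `[[` returns the element itself
wt_est <- coef(summ.list[[2]])["wt", "Estimate"]
wt_est
```

Indexing the coefficient table by row and column name, as in the last step, avoids relying on the position of a term in the model.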
Re: [R] Running different Regressions using for loops
Hello,

To access list elements you need `[[`, like this:

    summ.list[[2]]$coefficients

Or use the extractor function,

    coef(summ.list[[2]])

Rui Barradas

On 28-09-2012 07:23, Krunal Nanavati wrote:
> I entered the below statement for accessing the Beta coefficient for Price2...
>
> summ.list[2]$coefficients[2]
> NULL
>
> But this is giving me NULL as the output...
>
> What I am looking for, is to access a beta value of a particular variable from a particular regression output and use it for further analysis.
Re: [R] Drawing asymmetric error bars
On 09/27/2012 08:59 PM, Alexandra Howe wrote:
> Hello,
>
> I have data which I have arcsin transformed to analyse. I want to plot my data with error bars; however, as my data is back-transformed, my standard errors are uneven. Is there a simple way to draw these asymmetric error bars in R?

Hi Alexandra,

Have a look at the dispersion function in the plotrix package.

Jim
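Besides the plotrix suggestion, asymmetric bars can also be drawn with base graphics by giving arrows() separate lower and upper ends. The sketch below assumes an arcsine square-root transform and uses made-up means and standard errors; the thread does not say which arcsine variant was used, so the back-transformation is an assumption.

```r
# Hypothetical group means and standard errors on the transformed scale
m_t  <- c(0.5, 0.8, 1.1)
se_t <- c(0.10, 0.10, 0.10)

back <- function(z) sin(z)^2   # inverse of asin(sqrt(p)), assumed transform
centre <- back(m_t)
lower  <- back(m_t - se_t)     # back-transformed limits are not symmetric
upper  <- back(m_t + se_t)

pdf(NULL)  # null device, so the sketch also runs non-interactively
xpos <- seq_along(centre)
plot(xpos, centre, ylim = c(0, 1), pch = 19, xaxt = "n",
     xlab = "Group", ylab = "Proportion")
axis(1, at = xpos, labels = c("A", "B", "C"))
# code = 3 draws a flat cap (angle = 90) at both ends of each bar
arrows(xpos, lower, xpos, upper, angle = 90, code = 3, length = 0.05)
invisible(dev.off())
```

Computing the limits on the transformed scale and back-transforming each end separately is what produces the uneven bar lengths.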
[R] Crosstable-like analysis (ks test) of dataframe
Hi,

I have a dataframe with multiple (approx. 20) columns containing vectors of different values (different distributions). Now I'd like to create a crosstable where I compare the distribution of each vector (df column) with each other. For the comparison I want to use ks.test(). The result should have the column names of the input dataframe as its row and column names, and the cells should be populated with the p-value of the ks.test for each pairwise analysis.

My data.frame looks like:

    df <- data.frame(X=rnorm(1000,2),Y=rnorm(1000,1),Z=rnorm(1000,2))

And the test for one single case is:

    ks <- ks.test(df$X,df$Z)

where the p value is:

    ks[2]

How can I automate this pairwise analysis? Any suggestions? I guess that this is quite a common analysis (probably with other tests).

cheers,
Johannes
Re: [R] Crosstable-like analysis (ks test) of dataframe
Hello,

Try the following.

    f <- function(x, y, ..., alternative = c("two.sided", "less", "greater"), exact = NULL){
        #w <- getOption("warn")
        #options(warn = -1)  # ignore warnings
        p <- ks.test(x, y, ..., alternative = alternative, exact = exact)$p.value
        #options(warn = w)
        p
    }

    n <- 1e1
    dat <- data.frame(X=rnorm(n), Y=runif(n), Z=rchisq(n, df=3))

    apply(dat, 2, function(x) apply(dat, 2, function(y) f(x, y)))

Hope this helps,

Rui Barradas

On 28-09-2012 11:10, Johannes Radinger wrote:
> Hi,
>
> I have a dataframe with multiple (appr. 20) columns containing vectors of different values (different distributions). Now I'd like to create a crosstable where I compare the distribution of each vector (df-column) with each other. For the comparison I want to use the ks.test(). The result should contain as row and column names the column names of the input dataframe and the cells should be populated with the p-value of the ks.test for each pairwise analysis.
>
> My data.frame looks like:
> df <- data.frame(X=rnorm(1000,2),Y=rnorm(1000,1),Z=rnorm(1000,2))
>
> And the test for one single case is:
> ks <- ks.test(df$X,df$Z)
>
> where the p value is:
> ks[2]
>
> How can I create an automatized way of this pairwise analysis? Any suggestions? I guess that is a quite common analysis (probably with other tests).
>
> cheers,
> Johannes
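A variation on the nested apply() above is to run each pair only once with combn() and fill a symmetric matrix. This is a sketch on simulated data with hypothetical object names (vars, pmat); it leaves the diagonal as NA rather than testing a column against itself, which is a choice of this example.

```r
set.seed(1)
dat <- data.frame(X = rnorm(50, 2), Y = rnorm(50, 1), Z = rnorm(50, 2))

vars <- names(dat)
pmat <- matrix(NA_real_, length(vars), length(vars),
               dimnames = list(vars, vars))

# combn() enumerates each unordered pair of columns exactly once
for (pair in combn(vars, 2, simplify = FALSE)) {
  p <- ks.test(dat[[pair[1]]], dat[[pair[2]]])$p.value
  pmat[pair[1], pair[2]] <- p
  pmat[pair[2], pair[1]] <- p   # the two-sided test is symmetric in its arguments
}
round(pmat, 3)
```

For about 20 columns this runs 190 tests instead of 400, and the row and column names come directly from the data frame.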
Re: [R] Running different Regressions using for loops
Hello,

Try

    names(lm.list2[[2]]$coefficient[2])

Rui Barradas

On 28-09-2012 11:29, Krunal Nanavati wrote:

Ok... this solves a part of my problem.

When I type lm.list2[2] I get the following output:

    [[1]]

    Call:
    lm(formula = as.formula(fmla), data = tryout2)

    Coefficients:
    (Intercept) Price2 Media1 Distri1Trend Seasonality
    13491232 -5759030-15203437048628 445351

When I enter lm.list2[[2]]$coefficient[2] it gives me the output below:

    Price2
    -5759030

And when I enter lm.list2[[2]]$coefficient[[2]] I get just the number, which is -5759030.

I am looking for a way to get just the name, Price2. Is there a statement for that?

Thanks and regards,
Krunal Nanavati
9769-919198

-----Original Message-----
From: Rui Barradas [mailto:ruipbarra...@sapo.pt]
Sent: 28 September 2012 15:18
To: Krunal Nanavati
Cc: David Winsemius; r-help@r-project.org
Subject: Re: [R] Running different Regressions using for loops

Hello,

To access list elements you need `[[`, like this:

    summ.list[[2]]$coefficients

Or use the extractor function,

    coef(summ.list[[2]])

Rui Barradas

On 28-09-2012 07:23, Krunal Nanavati wrote:

Hi Rui,

Excellent!! This is what I was looking for. Thanks for the help.

So, now I have stored the result of the 10 regressions in

    summ.list <- lapply(lm.list2, summary)

And now once I enter summ.list it gives me the output for all the 10 regressions.

I wanted to access a beta coefficient of one of the regressions, say Price2+Media1+Trend+Seasonality, the result of which is stored in summ.list[2].

I entered the statement below to access the beta coefficient for Price2:

    summ.list[2]$coefficients[2]
    NULL

But this is giving me NULL as the output.

What I am looking for is to access a beta value of a particular variable from a particular regression output and use it for further analysis. Can you please help me out with this? I greatly appreciate your efforts.

Thanks and regards,
Krunal Nanavati
9769-919198

-----Original Message-----
From: Rui Barradas [mailto:ruipbarra...@sapo.pt]
Sent: 27 September 2012 21:55
To: Krunal Nanavati
Cc: David Winsemius; r-help@r-project.org
Subject: Re: [R] Running different Regressions using for loops

Hello,

Inline.

On 27-09-2012 13:52, Krunal Nanavati wrote:
> Hi,
>
> Thanks for all your help. I am stuck again, but with a new problem, on similar lines.
>
> I have taken the problem to the next step now. I have added 2 for loops: one for the Price variable and another for the Media variable.
>
> I have taken 5 price variables and 2 media variables, with the trend and seasonality appearing in all of them, so in all there will be 10 regressions to run now:
>
> Price 1, Media 1
> Price 1, Media 2
> Price 2, Media 1
> Price 2, Media 2
> ...and so on
>
> I have built up a code for it:
>
>     tryout = read.table("C:\\Users\\Krunal\\Desktop\\R tryout.csv", header=T, sep=",")
>     cnames <- names(tryout)
>     price <- cnames[grep("Price", cnames)]
>     media <- cnames[grep("Media", cnames)]
>     resp <- cnames[1]
>     regr <- cnames[7:8]
>     lm.list <- vector("list", 10)
>     for(i in 1:5)
>     {
>       regress <- paste(price[i], paste(regr, collapse = "+"), sep = "+")
>       for(j in 1:2)
>       {
>         regress1 <- paste(media[j], regress, sep = "+")
>         fmla <- paste(resp, regress1, sep = "~")
>         lm.list[[i]] <- lm(as.formula(fmla), data = tryout)
>       }
>     }
>     summ.list <- lapply(lm.list, summary)
>     summ.list
>
> But it is only running 5 regressions: only Media 1, along with the 5 Price variables, Trend and Seasonality, is regressed on Volume, giving only 5 outputs.
>
> I feel there is something wrong with the
> lm.list[[i]] <- lm(as.formula(fmla), data = tryout)
> statement.

No, I don't think so. If it's giving you only 5 outputs the error is probably in the fmla construction. Put print statements to see the results of those paste() instructions.

Supposing your data.frame is now called tryout2,

    price <- paste("Price", 1:5, sep = "")
    media <- paste("Media", 1:2, sep = "")
    pricemedia <- apply(expand.grid(price, media, stringsAsFactors = FALSE), 1, paste, collapse = "+")

    response <- "Volume"
    trendseason <- "Trend+Seasonality"  # do this only once

    lm.list2 <- list()
    for(i in seq_along(pricemedia)){
        regr <- paste(pricemedia[i], trendseason, sep = "+")
        fmla <- paste(response, regr, sep = "~")
        lm.list2[[i]] <- lm(as.formula(fmla), data = tryout2)
    }

The trick is to use ?expand.grid

Hope this helps,

Rui Barradas

> I am not sure about its placement, whether it should be in loop 2 or in loop 1.
>
> Can you please help me out??
>
> Thanks and regards,
> Krunal Nanavati
> 9769-919198

-----Original Message-----
From: Rui Barradas [mailto:ruipbarra...@sapo.pt]
Sent: 27 September 2012 16:22
To: David Winsemius
Cc: Krunal Nanavati; r-help@r-project.org
Subject: Re: [R] Running different Regressions using for loops

Hello,

Just to add that you can also

    lapply(lm.list, coef)

with a different output.

Rui Barradas

On 27-09-2012 09:24, David Winsemius wrote:

On Sep 26, 2012, at
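For anyone who wants to run the expand.grid idea end to end, here is a minimal sketch. The data frame is simulated and purely hypothetical (the poster's file was never shared); only the column names follow the thread. It builds the ten price/media combinations and fits one model per combination.

```r
# Hypothetical stand-in for tryout2: simulated data, not the poster's file.
set.seed(1)
n <- 50
tryout2 <- data.frame(
  Volume = rnorm(n, mean = 100, sd = 10),
  Trend = seq_len(n),
  Seasonality = sin(2 * pi * seq_len(n) / 12)
)
for (v in c(paste0("Price", 1:5), paste0("Media", 1:2))) {
  tryout2[[v]] <- rnorm(n)
}

price <- paste0("Price", 1:5)
media <- paste0("Media", 1:2)

# expand.grid() returns one row per (price, media) pair,
# with the first argument varying fastest.
combos <- expand.grid(price = price, media = media, stringsAsFactors = FALSE)
pricemedia <- paste(combos$price, combos$media, sep = "+")

# Fit one model per combination and keep the combination as the list name.
lm.list2 <- lapply(pricemedia, function(pm) {
  fmla <- as.formula(paste("Volume ~", pm, "+ Trend + Seasonality"))
  lm(fmla, data = tryout2)
})
names(lm.list2) <- pricemedia

length(lm.list2)
pricemedia[1:3]
```

Because there is a single index over the combinations, no model overwrites another, which is the risk when a nested loop stores its result under the outer index only.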
Re: [R] changing outlier shapes of boxplots using lattice
I would guess that if you find the bit that says pch="|" and change it to pch=1 it will solve your question, and that reading ?par will tell you why.

Sarah

On Thursday, September 27, 2012, Elaine Kuo wrote:

Hello,

This is Elaine.

I am using the lattice package to generate boxplots. Using Richard's code, the display was almost perfect except for the outlier shape. With the following code, the outliers are vertical lines. However, I want the outliers to be empty circles. Please advise how to modify the code to change the outlier shapes.

Thank you.

code:

    library(lattice)
    library(HH)  # provides panel.bwplot.intermediate.hh

    dataN <- data.frame(GE_distance=rnorm(260),
                        Diet_B=factor(rep(1:13, each=20)))

    Diet.colors <- c("forestgreen", "darkgreen", "chocolate1", "darkorange2",
                     "sienna2", "red2", "firebrick3", "saddlebrown", "coral4",
                     "chocolate4", "darkblue", "navy", "grey38")

    levels(dataN$Diet_B) <- Diet.colors

    bwplot(GE_distance ~ Diet_B, data=dataN,
           xlab=list("Diet of Breeding Ground", cex = 1.4),
           ylab = list("Distance between Centers of B and NB Range (1000 km)",
                       cex = 1.4),
           panel=panel.bwplot.intermediate.hh,
           col=Diet.colors,
           pch=rep("|",13),
           scales=list(x=list(rot=90)),
           par.settings=list(box.umbrella=list(lty=1)))

--
Sarah Goslee
http://www.stringpage.com
http://www.sarahgoslee.com
http://www.functionaldiversity.org
Re: [R] Running different Regressions using for loops
Ok, if I'm understanding it well, you want the mean value of the coefficients of Price1, ..., Price5? I don't know if it makes any sense, the coefficients already are mean values, but see if this is it.

    price.coef <- sapply(lm.list, function(x) coef(x)[2])
    mean(price.coef)

Rui Barradas
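The sapply() call above picks the coefficient by position. A small sketch of the same pattern follows, using the built-in mtcars data purely for illustration (it is not the poster's data) and selecting the coefficient by name, which does not depend on where the term sits in the formula.

```r
# Illustration on the built-in mtcars data (not the poster's data).
fits <- list(
  lm(mpg ~ wt + hp, data = mtcars),
  lm(mpg ~ wt + qsec, data = mtcars)
)

# Pull the same coefficient out of every model, selecting it by name.
wt.coef <- sapply(fits, function(m) coef(m)[["wt"]])
wt.coef

# Average of that coefficient across the fitted models.
mean(wt.coef)
```

When the variable name changes from model to model, as with Price1 to Price5 in the thread, positional indexing or a pattern match on names(coef(m)) would be needed instead.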
Re: [R] What to use for ti in back-transforming summary statistics from F-T double square-root transformation in 'metafor'
Dear Chunyan,

One possibility would be to use the harmonic mean of the person-time at risk values. You will have to do this manually though at the moment. Here is an example:

    library(metafor)

    ### let's just use the treatment group data from dat.warfarin
    data(dat.warfarin)
    dat <- escalc(xi=x1i, ti=t1i, measure="IRFT", data=dat.warfarin, append=TRUE)
    dat

    ### check if back-transformation of individual IRFT values works
    transf.iirft(dat$yi, ti=dat$t1i)
    escalc(xi=x1i, ti=t1i, measure="IR", data=dat.warfarin)$yi

    ### random-effects model
    res <- rma(yi, vi, data=dat)
    res

    ### harmonic mean of the ti's
    ti.hm <- 1/(mean(1/dat$t1i))

    ### back-transformation using the harmonic mean
    transf.iirft(res$b, ti=ti.hm)
    transf.iirft(res$ci.lb, ti=ti.hm)
    transf.iirft(res$ci.ub, ti=ti.hm)

Best,
Wolfgang

--
Wolfgang Viechtbauer, Ph.D., Statistician
Department of Psychiatry and Psychology
School for Mental Health and Neuroscience
Faculty of Health, Medicine, and Life Sciences
Maastricht University, P.O. Box 616 (VIJV1)
6200 MD Maastricht, The Netherlands
+31 (43) 388-4170 | http://www.wvbauer.com

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of Liu, Chunyan [chunyan@cchmc.org]
Sent: Thursday, September 27, 2012 10:48 PM
To: r-help@R-project.org
Subject: [R] What to use for ti in back-transforming summary statistics from F-T double square-root transformation in 'metafor'

Hi Dr. Viechtbauer,

I'm doing a meta-analysis using your package 'metafor'. I used 'IRFT' to transform the incidence rate. But when I tried to back-transform the summary estimates from the function rma, I didn't know what the appropriate ti is to feed into the function transf.iirft. I searched and found your post about using the harmonic mean for ni to back-transform the double arcsine transformation. I'm hoping I can get your help on ti too.

Thanks.

Chunyan Liu
513-636-9763
Biostatistician II
Department of Biostatistics and Epidemiology
Cincinnati Children's Hospital Medical Center
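As a small aside on the harmonic mean used in the reply, the sketch below computes it in base R for a few made-up person-time values (they are hypothetical, not taken from dat.warfarin) and compares it with the arithmetic mean. The helper name harmonic.mean is also just an illustration.

```r
# Hypothetical person-time values; not taken from dat.warfarin.
ti <- c(100, 200, 400)

# Harmonic mean: reciprocal of the mean of the reciprocals.
harmonic.mean <- function(x) 1 / mean(1 / x)

ti.hm <- harmonic.mean(ti)
ti.hm      # harmonic mean
mean(ti)   # arithmetic mean, for comparison
```

The harmonic mean is pulled towards the smaller values, so studies with little person-time weigh more heavily in it than they would in an arithmetic mean.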
[R] Anova and tukey-grouping
Hello,

I am really new to R and it's still a challenge to me. Currently I'm working on my Master's thesis. My supervisor works with SAS and is not familiar with R at all.

I want to run an ANOVA and a Tukey test, and as a result I want to have the Tukey grouping (something like A - AB - B).

I came across HSD.test in the agricolae package, but unfortunately I do not get any output (like the one shown in the answer here: http://stats.stackexchange.com/questions/31547/how-to-obtain-the-results-of-a-tukey-hsd-post-hoc-test-in-a-table-showing-groupe )

I did it like this:

    ## ANOVA
    anova.typabunmit <- aov(ds.typabunmit$abun ~ ds.typabunmit$typ)
    summary(anova.typabunmit)
    summary.lm(anova.typabunmit)

    ## post hoc
    tukey.typabunmit <- TukeyHSD(anova.typabunmit)
    tukey.typabunmit

    ## HSD
    HSD.test(anova.typabunmit, "abun", group=TRUE)

and the ONLY output is this:

    Name:  abun
    ds.typabunmit$typ

I would be very pleased about some ideas!
[R] Is it possible to enter in a function wich is within a library ?
Hello,

I'd like to know whether it is possible to step into a function which is included in a library.

I know how to debug a function which is in an R file (but not in a library), but that does not work when the function is included in a library. I want to go step by step through this function in order to inspect the objects' values.

I tried debug(the_function) but the program does not stop at the_function (it only shows the body of the function).

Thanks for your help.
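One thing that may be worth checking here is whether the debug flag is actually set on the packaged function. The sketch below uses stats::sd only as a convenient stand-in for the_function, which was not named in the post; debug(), isdebugged() and undebug() are base R.

```r
# Flag a package function for debugging; stats::sd is only a stand-in
# for the (unnamed) function from the question.
debug(stats::sd)
isdebugged(stats::sd)   # TRUE: the next interactive call to sd() opens the browser

# Remove the flag again when done.
undebug(stats::sd)
isdebugged(stats::sd)   # FALSE
```

For a function that is not exported from its package, the same calls can be written with the triple-colon form, pkg:::fun. The browser only opens in an interactive session, and only when the flagged function is really the one being called.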
Re: [R] Running different Regressions using for loops
Ok... this solves a part of my problem.

When I type lm.list2[2] I get the following output:

    [[1]]

    Call:
    lm(formula = as.formula(fmla), data = tryout2)

    Coefficients:
    (Intercept) Price2 Media1 Distri1Trend Seasonality
    13491232 -5759030-15203437048628 445351

When I enter lm.list2[[2]]$coefficient[2] it gives me the output below:

    Price2
    -5759030

And when I enter lm.list2[[2]]$coefficient[[2]] I get just the number, which is -5759030.

I am looking for a way to get just the name, Price2. Is there a statement for that?

Thanks and regards,
Krunal Nanavati
9769-919198
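The difference between `[`, `[[` and the names of a coefficient vector can be seen on a small, self-contained case. The sketch below uses the built-in mtcars data and a list called fits, both chosen only for illustration; fits plays the role of lm.list2.

```r
# Illustration on built-in data; 'fits' plays the role of lm.list2.
fits <- list(
  lm(mpg ~ hp, data = mtcars),
  lm(mpg ~ wt + hp, data = mtcars)
)

class(fits[2])                  # "list": single brackets return a sub-list
is.null(fits[2]$coefficients)   # TRUE: the sub-list has no such element
class(fits[[2]])                # "lm": double brackets return the model itself

coef(fits[[2]])[2]              # name and value of the second coefficient
coef(fits[[2]])[[2]]            # the bare number
names(coef(fits[[2]]))[2]       # "wt": just the name
```

This is why summ.list[2]$coefficients returned NULL earlier in the thread while summ.list[[2]]$coefficients did not.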
Re: [R] Running different Regressions using for loops
Hi,

Yes, the thing that you provided works fine, but probably I should have asked for something else. Here is what I am trying to do.

I am trying to get the mean of the Price variable, so I am entering the function below:

    mean(names(lm.list2[[2]]$coefficient[2]))

but this gives me an error:

    [1] NA
    Warning message:
    In mean.default(names(lm.list2[[2]]$coefficient[2])) :
      argument is not numeric or logical: returning NA

I thought that getting the text from the list variable would help me generate the mean for that text, which is a variable in the data, say Price 1, Media 2 and so on.

Is this a proper approach? If it is, then something more needs to be done with the function that you provided.

If not, is there a better way to generate the mean of a particular variable inside the for loop used earlier, given below:

    lm.list2 <- list()
    for(i in seq_along(pricemedia)){
        regr <- paste(pricemedia[i], trendseason, sep = "+")
        fmla <- paste(response, regr, sep = "~")
        lm.list2[[i]] <- lm(as.formula(fmla), data = tryout2)
    }

Thanks and regards,
Krunal Nanavati
9769-919198
Re: [R] Running different Regressions using for loops
Ok... I am sorry for the misunderstanding.

What I am trying to do is:

    lm.list2 <- list()
    for(i in seq_along(pricemedia)){
        regr <- paste(pricemedia[i], trendseason, sep = "+")
        fmla <- paste(response, regr, sep = "~")
        lm.list2[[i]] <- lm(as.formula(fmla), data = tryout2)
    }

When I run this set of statements, the 1st regression to be run will have Price 1 and Media 1 as X variables, and in the second pass of the loop it will have Price 1 and Media 2.

So, what I was thinking is whether I can generate, inside the for loop, the mean for Price 1 and Media 1 during the 1st pass, then the mean for Price 1 and Media 2 during the second pass, and so on for all the 10 regressions.

Is the method that I was trying appropriate, or is there a better method? I am sorry for the earlier explanation; I hope this one makes it more understandable.

Thanks for your time and all the quick replies.

Thanks and regards,
Krunal Nanavati
9769-919198
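One way this might be approached is to keep the variable names as character strings and use them to index the data frame, since mean() needs the numeric column rather than its name. The sketch below is only an illustration: the data are simulated and hypothetical, the grid is cut down to two price and two media variables, and mean.list is a name introduced here.

```r
# Simulated, hypothetical data; only the column names follow the thread.
set.seed(42)
tryout2 <- data.frame(Volume = rnorm(30),
                      Price1 = runif(30), Price2 = runif(30),
                      Media1 = runif(30), Media2 = runif(30))

combos <- expand.grid(price = c("Price1", "Price2"),
                      media = c("Media1", "Media2"),
                      stringsAsFactors = FALSE)

lm.list2 <- vector("list", nrow(combos))
mean.list <- vector("list", nrow(combos))   # hypothetical container for the means
for (i in seq_len(nrow(combos))) {
  vars <- c(combos$price[i], combos$media[i])
  fmla <- as.formula(paste("Volume ~", paste(vars, collapse = " + ")))
  lm.list2[[i]] <- lm(fmla, data = tryout2)
  # The names select the numeric columns; colMeans() averages each of them.
  mean.list[[i]] <- colMeans(tryout2[, vars])
}

mean.list[[1]]
```

Each element of mean.list then holds the means of the two predictors used in the regression with the same index.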
[R] RES: Generating an autocorrelated binary variable
I think the package binarySimCLF can help. See http://cran.r-project.org/web/packages/binarySimCLF/binarySimCLF.pdf.

André Gabriel.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Rolf Turner
Sent: Friday, 28 September 2012 00:02
To: Simon Zehnder
Cc: r help
Subject: Re: [R] Generating an autocorrelated binary variable

I have no idea what your code is doing, nor why you want correlated binary variables. Correlation makes little or no sense in the context of binary random variables (or, more generally, in the context of discrete random variables).

Be that as it may, it is an easy calculation to show that if X and Y are binary random variables, both with success probability 0.5, then cor(X,Y) = 0.2 if and only if Pr(X=1 | Y=1) = 0.6. So just generate X and Y using that fact:

    set.seed(42)
    X <- numeric(1000)
    Y <- numeric(1000)
    for(i in 1:1000) {
        Y[i] <- rbinom(1,1,0.5)
        X[i] <- if(Y[i]==1) rbinom(1,1,0.6) else rbinom(1,1,0.4)
    }

    # Check:
    cor(X,Y) # Get 0.2012336

Looks about right. Note that the sample proportions are 0.484 and 0.485 for X and Y respectively. These values do not differ significantly from 0.5.

cheers,

Rolf Turner

On 28/09/12 08:26, Simon Zehnder wrote:

Hi R-fellows,

I am trying to simulate a multivariate correlated sample via the Gaussian copula method. One variable is a binary variable that should be autocorrelated. The autocorrelation should be rho = 0.2. Furthermore, the overall probability of getting either outcome of the binary variable should be 0.5.

Below you can see the R code (for simplicity I use a diagonal matrix in rmvnorm even though it produces no correlated sample):

    sampleCop <- function(n = 1000, rho = 0.2) {
        require(splus2R)
        mvrs <- rmvnorm(n + 1, mean = rep(0, 3), cov = diag(3))
        pmvrs <- pnorm(mvrs, 0, 1)
        var1 <- matrix(0, nrow = n + 1, ncol = 1)
        var1[1] <- qbinom(pmvrs[1, 1], 1, 0.5)
        if(var1[1] == 0) var1[nrow(mvrs)] <- -1
        for(i in 1:(nrow(pmvrs) - 1)) {
            if(pmvrs[i + 1, 1] >= rho) var1[i + 1] <- var1[i]
            else var1[i + 1] <- var1[i] * (-1)
        }
        sample <- matrix(0, nrow = n, ncol = 4)
        sample[, 1] <- var1[1:nrow(var1) - 1]
        sample[, 2] <- var1[2:nrow(var1)]
        sample[, 3] <- qnorm(pmvrs[1:nrow(var1) - 1, 2], 0, 1, 1, 0)
        sample[, 4] <- qnorm(pmvrs[1:nrow(var1) - 1, 3], 0, 1, 1, 0)
        sample
    }

Now, the code is fine, everything runs. But when I compute the autocorrelation of the binary variable, it is not 0.2 but 0.6. Does anyone know why this happens?
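The "easy calculation" mentioned in the reply can be checked numerically. The sketch below is an illustration under the stated assumptions (both success probabilities equal to 0.5, so that Pr(X=1 | Y=0) = 1 - q); the function name binary.cor is made up for this example. It computes the exact correlation from the joint probabilities and then repeats the simulation in vectorised form.

```r
# Exact correlation of two binary variables with P(Y=1) = 0.5,
# P(X=1 | Y=1) = q and P(X=1 | Y=0) = 1 - q (so that P(X=1) = 0.5 as well).
binary.cor <- function(q) {
  pxy <- 0.5 * q                  # P(X = 1, Y = 1)
  px <- 0.5 * q + 0.5 * (1 - q)   # P(X = 1), equal to 0.5
  py <- 0.5                       # P(Y = 1)
  (pxy - px * py) / sqrt(px * (1 - px) * py * (1 - py))
}

binary.cor(0.6)   # the case discussed in the thread
binary.cor(0.8)

# Vectorised version of the simulation loop.
set.seed(42)
Y <- rbinom(1000, 1, 0.5)
X <- rbinom(1000, 1, ifelse(Y == 1, 0.6, 0.4))
cor(X, Y)
```

Algebraically the expression reduces to 2q - 1, so a correlation of 0.2 needs q = 0.6. By the same formula, a value that is kept with probability 0.8 from one step to the next gives 0.6, which may be related to the figure reported in the question, although that depends on details of the posted code that cannot be confirmed here.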
Re: [R] How to write R package
On 27/09/2012 5:15 PM, Dr. Alireza Zolfaghari wrote:
> Hi List,
> Would you please send me a good link to talk me through how to write an R package?

See the ?package.skeleton help page. After you have run it, follow the instructions in the Read-and-delete-me file that it will create. For full details, see the Writing R Extensions manual.

For modifying the package after you've finished the Read-and-delete-me instructions, just manually add *.R files where the rest of them are, and use the prompt() function to produce skeleton documentation.

That's about it, but you can read more if you like in a tutorial I gave a few years ago at a UseR meeting in Dortmund:

http://www.statistik.uni-dortmund.de/useR-2008/slides/Murdoch.pdf

Duncan Murdoch
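As a concrete starting point, the sketch below runs package.skeleton() on a single toy function. The function hello and the package name demoPkg are hypothetical and exist only for this illustration; the skeleton is written to a temporary directory so nothing lands in the working directory.

```r
# Hypothetical function and package name, used only for illustration.
env <- new.env()
assign("hello", function(name = "world") paste("Hello,", name), envir = env)

# Write the skeleton into a temporary directory rather than the working directory.
pkg.dir <- tempdir()
package.skeleton(name = "demoPkg", list = "hello", environment = env,
                 path = pkg.dir, force = TRUE)

# Inspect what was generated, including the Read-and-delete-me file.
list.files(file.path(pkg.dir, "demoPkg"), recursive = TRUE)
```

The generated directory contains a DESCRIPTION file, an R/ folder with one file per object, and man/ stubs that still need to be filled in before the package will pass R CMD check.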
Re: [R] changing outlier shapes of boxplots using lattice
Elaine, For panel.bwplot you see that the central dot and the outlier dots are controlled by the same pch argument. I initially set the pch=| to match your first example with the horizontal indicator for the median. I would be inclined to use the default circle for the outliers and therefore also for the median. Rich On Fri, Sep 28, 2012 at 7:13 AM, Sarah Goslee sarah.gos...@gmail.comwrote: I would guess that if you find the bit that says pch=| and change it to pch=1 it will solve your question, and that reading ?par will tell you why. Sarah On Thursday, September 27, 2012, Elaine Kuo wrote: Hello This is Elaine. I am using package lattice to generate boxplots. Using Richard's code, the display was almost perfect except the outlier shape. Based on the following code, the outliers are vertical lines. However, I want the outliers to be empty circles. Please kindly help how to modify the code to change the outlier shapes. Thank you. code package (lattice) dataN - data.frame(GE_distance=rnorm(260), Diet_B=factor(rep(1:13, each=20))) Diet.colors - c(forestgreen, darkgreen,chocolate1,darkorange2, sienna2,red2,firebrick3,saddlebrown,coral4, chocolate4,darkblue,navy,grey38) levels(dataN$Diet_B) - Diet.colors bwplot(GE_distance ~ Diet_B, data=dataN, xlab=list(Diet of Breeding Ground, cex = 1.4), ylab = list( Distance between Centers of B and NB Range (1000 km), cex = 1.4), panel=panel.bwplot.intermediate.hh, col=Diet.colors, pch=rep(|,13), scales=list(x=list(rot=90)), par.settings=list(box.umbrella=list(lty=1))) [[alternative HTML version deleted]] __ R-help@r-project.org javascript:; mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
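A possible way to get a line for the median and open circles for the outliers at the same time, sketched with the stock panel.bwplot from lattice rather than the panel.bwplot.intermediate.hh function used in the thread (so this is an assumption about what would be acceptable, not Rich's code). In the stock panel, pch styles the median and the plot.symbol setting styles the outliers. The data below are simulated.

```r
library(lattice)

# Simulated data with a few forced outliers (hypothetical values).
set.seed(1)
dataN <- data.frame(
  GE_distance = c(rnorm(255), rep(6, 5)),
  Diet_B      = factor(rep(1:13, each = 20))
)

# pch = "|" draws the median as a line in the stock panel.bwplot;
# outliers are drawn with the plot.symbol settings, here open circles.
p <- bwplot(GE_distance ~ Diet_B, data = dataN,
            pch = "|",
            par.settings = list(plot.symbol  = list(pch = 1, col = "black"),
                                box.umbrella = list(lty = 1)))
# print(p) draws the plot on the current device.
```

Whether the same par.settings are honoured by the HH panel function has not been checked here.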
Re: [R] Anova and tukey-grouping
HI,

I guess there is a mistake in your code. You should have used "typ" instead of "abun", as abun is the dependent variable.

    summary(fm1 <- aov(breaks ~ wool + tension, data = warpbreaks))
    myresults <- TukeyHSD(fm1, "tension", ordered = TRUE)
    library(agricolae)
    HSD.test(fm1, "wool", group=TRUE)
    #Study:
    #HSD Test for breaks
    #Mean Square Error:  134.9578
    #wool,  means
    #    breaks  std.err replication
    #A 31.03704 3.050609          27
    #B 25.25926 1.789963          27
    #alpha: 0.05 ; Df Error: 50
    #Critical Value of Studentized Range: 2.840532
    #Honestly Significant Difference: 6.350628
    #Means with the same letter are not significantly different.
    #Groups, Treatments and means
    #a    A    31.037037037037
    #a    B    25.2592592592593

A.K.

----- Original Message -----
From: Landi ent-ar...@gmx.de
To: r-help@r-project.org
Cc:
Sent: Friday, September 28, 2012 5:41 AM
Subject: [R] Anova and tukey-grouping

> Hello,
>
> I am really new to R and it's still a challenge to me. Currently I'm working on my Master's Thesis. My supervisor works with SAS and is not familiar with R at all.
>
> I want to run an Anova, a tukey-test and as a result I want to have the tukey-grouping (something like A - AB - B).
>
> I came across the HSD.test in the agricolae-package, but... unfortunately I do not get an output (like here in the answer http://stats.stackexchange.com/questions/31547/how-to-obtain-the-results-of-a-tukey-hsd-post-hoc-test-in-a-table-showing-groupe )
>
> I did it like this:
>
>     ## ANOVA
>     anova.typabunmit <- aov(ds.typabunmit$abun ~ ds.typabunmit$typ)
>     summary(anova.typabunmit)
>     summary.lm(anova.typabunmit)
>     ## post HOC
>     tukey.typabunmit <- TukeyHSD(anova.typabunmit)
>     tukey.typabunmit
>     ## HSD
>     HSD.test(anova.typabunmit, "abun", group=TRUE)
>
> and the ONLY output is this:
>
>     Name:  abun
>      ds.typabunmit$typ
>
> I would be very pleased about some ideas!
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Anova-and-tukey-grouping-tp4644485.html
> Sent from the R help mailing list archive at Nabble.com.
Re: [R] Crosstable-like analysis (ks test) of dataframe
Thank you Rui!

That works as I want it... :)

/Johannes

On Fri, Sep 28, 2012 at 12:30 PM, Rui Barradas ruipbarra...@sapo.pt wrote:
> Hello,
>
> Try the following.
>
>     f <- function(x, y, ..., alternative = c("two.sided", "less", "greater"), exact = NULL){
>         #w <- getOption("warn")
>         #options(warn = -1)  # ignore warnings
>         p <- ks.test(x, y, ..., alternative = alternative, exact = exact)$p.value
>         #options(warn = w)
>         p
>     }
>
>     n <- 1e1
>     dat <- data.frame(X=rnorm(n), Y=runif(n), Z=rchisq(n, df=3))
>     apply(dat, 2, function(x) apply(dat, 2, function(y) f(x, y)))
>
> Hope this helps,
>
> Rui Barradas
>
> On 28-09-2012 11:10, Johannes Radinger wrote:
> > Hi,
> >
> > I have a dataframe with multiple (approx. 20) columns containing vectors of different values (different distributions). Now I'd like to create a crosstable where I compare the distribution of each vector (df-column) with each other. For the comparison I want to use the ks.test(). The result should contain as row and column names the column names of the input dataframe, and the cells should be populated with the p-value of the ks.test for each pairwise analysis.
> >
> > My data.frame looks like:
> >
> >     df <- data.frame(X=rnorm(1000,2),Y=rnorm(1000,1),Z=rnorm(1000,2))
> >
> > And the test for one single case is:
> >
> >     ks <- ks.test(df$X,df$Z)
> >
> > where the p value is:
> >
> >     ks[2]
> >
> > How can I create an automated way of doing this pairwise analysis? Any suggestions? I guess that is a quite common analysis (probably with other tests).
> >
> > cheers,
> > Johannes
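For the archive, a variant of the same idea using sapply() on the data frame directly, which keeps the columns as vectors instead of going through apply()'s matrix conversion. This is only a sketch with simulated data; ks_p is a made-up helper name, and the warnings that ks.test() may give when a column is compared with itself are silenced here, which may or may not be what you want.

```r
set.seed(42)
dat <- data.frame(X = rnorm(50), Y = runif(50), Z = rchisq(50, df = 3))

# Helper (hypothetical name): p-value of a two-sample KS test.
ks_p <- function(x, y) suppressWarnings(ks.test(x, y)$p.value)

# Outer sapply runs over the first sample (columns of the result),
# inner sapply over the second sample (rows of the result).
p_mat <- sapply(dat, function(x) sapply(dat, function(y) ks_p(x, y)))
round(p_mat, 4)
```

With about 20 columns that is 400 tests, so some adjustment for multiple comparisons (see ?p.adjust) is probably worth considering.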
[R] Lattice bwplot(): Conditioning on one factor
I'm not able to create the proper syntax to specify a lattice bwplot() for only one of two conditioning factors.

The syntax that produces a box plot of each of the two conditioning factors is:

    bwplot(quant ~ param | era, data=mg.d, main='Dissolved Magnesium',
           ylab='Concentration (mg/L)')

What I've tried unsuccessfully are:

    bwplot(quant ~ param | factor(era=='Pre-mining'), data=mg.d,
           main='Magnesium', ylab='Concentration (mg/L))

    bwplot(quant ~ param | era, data=mg.d, main='Magnesium',
           ylab='Concentration (mg/L)', subset=era('Pre-mining'))

plus slight variations of the above. None work.

Please point me to what I've missed in specifying only one of two conditioning factors for the plot.

Rich
Re: [R] Simple Question
Many thanks Dr. Winsemius, Kimmo and Pascal.

All of them are working and really beautiful...

Best Regards,

Bhupendrasinh Thakre

*Disclaimer :* The information contained in this communication is confidential and may be legally privileged. It is intended solely for the use of the individual or entity to whom it is addressed. If you are not the intended recipient you are hereby (a) notified that any disclosure, copying, distribution or taking any action with respect to the content of this information is strictly prohibited and may be unlawful, and (b) kindly requested to inform the sender immediately and destroy any copies.

On Fri, Sep 28, 2012 at 1:36 AM, David Winsemius dwinsem...@comcast.net wrote:
> On Sep 27, 2012, at 11:13 PM, Bhupendrasinh Thakre wrote:
> > Hi Everyone,
> >
> > I am trying a very simple task to append the Timestamp with a variable name so something like a_2012_09_27_00_12_30 <- rnorm(1,2,1).
>
> If you want to assign a value to a character-name you need to use ... `assign`. You cannot just stick a numeric value, which is what you get with Sys.time(), on the LHS of a <- and expect R to intuit what you intend.
>
>     ?assign
>     assign( "a_2012_09_27_00_12_30" , rnorm(1,2,1) )
>     assign( as.character(unclass(Sys.time())) , rnorm(1,2,1) )
>
> (I would have thought you wanted to format that Sys.time result:)
>
>     > format(Sys.time(), "%Y_%m_%d_%H_%M_%S")
>     [1] "2012_09_27_23_32_40"
>     > assign(format(Sys.time(), "%Y_%m_%d_%H_%M_%S"), rnorm(1,2,1) )
>     > grep("^2012", ls(), value=TRUE)
>     [1] "2012_09_27_23_33_45"
>
> > Tried some commands but it doesn't work out well. Hope someone has some answer on it.
> >
> > Session Info
> >
> >     R version 2.15.1 (2012-06-22)
> >     Platform: i386-apple-darwin9.8.0/i386 (32-bit)
> >
> >     locale:
> >     [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
> >
> >     attached base packages:
> >     [1] stats graphics grDevices utils datasets methods base
> >
> >     other attached packages:
> >     [1] chron_2.3-42 twitteR_0.99.19 rjson_0.2.9 RCurl_1.91-1
> >     [5] bitops_1.0-4.1 tm_0.5-7.1 RMySQL_0.9-3 DBI_0.2-5
> >
> >     loaded via a namespace (and not attached):
> >     [1] slam_0.1-24 tools_2.15.1
> >
> > Statement I tried :
> >
> >     b <- unclass(Sys.time())
> >     b = 1348812597
> >     c_b <- rnorm(1,2,1)
> >
> > Works perfect but doesn't show me c_1348812597.
> >
> > Best Regards,
> >
> > Bhupendrasinh Thakre
>
> BT; Please learn to post in plain text. It's really very simple with gmail.
>
> --
> David Winsemius, MD
> Alameda, CA, USA
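To put the pieces of the reply together in one runnable snippet (a sketch; the prefix "a_" and the names var_name and results are only illustrative). The second half shows a named list as an alternative, which is not from the thread but is often easier to work with than many generated variable names.

```r
# Build a name such as "a_2012_09_27_23_32_40" from the current time.
stamp    <- format(Sys.time(), "%Y_%m_%d_%H_%M_%S")
var_name <- paste0("a_", stamp)

# assign() creates the variable whose name is held in var_name;
# get() retrieves it again by name.
assign(var_name, rnorm(1, 2, 1))
get(var_name)

# Alternative (an addition, not from the thread): keep the values in a
# named list instead of creating new variables.
results <- list()
results[[var_name]] <- rnorm(1, 2, 1)
names(results)
```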
Re: [R] [R-sig-hpc] Quickest way to make a large empty file on disk?
Jonathan,

ff has a utility function file.resize() which allows you to give a new file size in bytes using doubles. See ?file.resize

Regards

Jens Oehlschlägel

Sent: Thursday, 27 September 2012 at 21:17
From: Jonathan Greenberg j...@illinois.edu
To: r-help r-help@r-project.org, r-sig-...@r-project.org
Subject: Re: [R-sig-hpc] Quickest way to make a large empty file on disk?

> Folks:
>
> Asked this question some time ago, and found what appeared (at first) to be the best solution, but I'm now finding a new problem. First off, it seemed like ff as Jens suggested worked:
>
>     # outdata_ncells = the number of rows * number of columns * number of bands in an image:
>     out <- ff(vmode="double",length=outdata_ncells,filename=filename)
>     finalizer(out) <- "close"
>     close(out)
>
> This was working fine until I attempted to set length to a VERY large number: outdata_ncells = 17711913600. This would create a file that is 131.964GB. Big, but not obscenely so (and certainly not larger than the filesystem can handle). However, length appears to be restricted by .Machine$integer.max (I'm on a 64-bit windows box):
>
>     > .Machine$integer.max
>     [1] 2147483647
>
> Any suggestions on how to solve this problem for much larger file sizes?
>
> --j
>
> On Thu, May 3, 2012 at 10:44 AM, Jonathan Greenberg j...@illinois.edu wrote:
> > Thanks, all! I'll try these out. I'm trying to work up something that is platform independent (if possible) for use with mmap. I'll do some tests on these suggestions and see which works best. I'll try to report back in a few days. Cheers!
> >
> > --j
> >
> > 2012/5/3 Jens Oehlschlägel jens.oehlschlae...@truecluster.com
> > > Jonathan,
> > >
> > > On some filesystems (e.g. NTFS, see below) it is possible to create 'sparse' memory-mapped files, i.e. reserving the space without the cost of actually writing initial values. Package 'ff' does this automatically and also allows access to the file in parallel. Check the example below and see how big file creation is immediate.
> > >
> > > Jens Oehlschlägel
> > >
> > >     > library(ff)
> > >     > library(snowfall)
> > >     > ncpus <- 2
> > >     > n <- 1e8
> > >     > system.time(
> > >     + x <- ff(vmode="double", length=n, filename="c:/Temp/x.ff")
> > >     + )
> > >        user  system elapsed
> > >        0.01    0.00    0.02
> > >     > # check finalizer, with an explicit filename we should have a 'close' finalizer
> > >     > finalizer(x)
> > >     [1] "close"
> > >     > # if not, set it to 'close' in order to not let slaves delete x on slave shutdown
> > >     > finalizer(x) <- "close"
> > >     > sfInit(parallel=TRUE, cpus=ncpus, type="SOCK")
> > >     R Version: R version 2.15.0 (2012-03-30)
> > >     snowfall 1.84 initialized (using snow 0.3-9): parallel execution on 2 CPUs.
> > >     > sfLibrary(ff)
> > >     Library ff loaded.
> > >     Library ff loaded in cluster.
> > >     Warning message:
> > >     In library(package = "ff", character.only = TRUE, pos = 2, warn.conflicts = TRUE, :
> > >       'keep.source' is deprecated and will be ignored
> > >     > sfExport("x") # note: do not export the same ff multiple times
> > >     > # explicitly opening avoids a gc problem
> > >     > sfClusterEval(open(x, caching="mmeachflush")) # opening with 'mmeachflush' instead of 'mmnoflush' is a bit slower but prevents OS write storms when the file is larger than RAM
> > >     [[1]]
> > >     [1] TRUE
> > >     [[2]]
> > >     [1] TRUE
> > >     > system.time(
> > >     + sfLapply( chunk(x, length=ncpus), function(i){
> > >     +   x[i] <- runif(sum(i))
> > >     +   invisible()
> > >     + })
> > >     + )
> > >        user  system elapsed
> > >        0.00    0.00   30.78
> > >     > system.time(
> > >     + s <- sfLapply( chunk(x, length=ncpus), function(i) quantile(x[i], c(0.05, 0.95)) )
> > >     + )
> > >        user  system elapsed
> > >        0.00    0.00    4.38
> > >     > # for completeness
> > >     > sfClusterEval(close(x))
> > >     [[1]]
> > >     [1] TRUE
> > >     [[2]]
> > >     [1] TRUE
> > >     > csummary(s)
> > >                  5%  95%
> > >     Min.    0.04998 0.95
> > >     1st Qu. 0.04999 0.95
> > >     Median  0.05001 0.95
> > >     Mean    0.05001 0.95
> > >     3rd Qu. 0.05002 0.95
> > >     Max.    0.05003 0.95
> > >     > # stop slaves
> > >     > sfStop()
> > >     Stopping cluster
> > >     > # with the close finalizer we are responsible for deleting the file explicitly (unless we want to keep it)
> > >     > delete(x)
> > >     [1] TRUE
> > >     > # remove r-side metadata
> > >     > rm(x)
> > >     > # truly free memory
> > >     > gc()
> > >
> > > *Sent:* Thursday, 03 May 2012 at 00:23
> > > *From:* Jonathan Greenberg j...@illinois.edu
> > > *To:* r-help r-help@r-project.org, r-sig-...@r-project.org
> > > *Subject:* [R-sig-hpc] Quickest way to make a large empty file on disk?
> > >
> > > > R-helpers:
> > > >
> > > > What would be the absolute fastest way to make a large empty file (e.g. filled with all zeroes) on disk, given a byte size and a given number of empty values. I know I can use writeBin, but the object in this case may be far too large to store in main memory. I'm asking because I'm going to use this file in conjunction with mmap to do parallel writes to this file. Say, I want to create a blank file of 10,000
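As a base R fallback for the memory concern raised in the original question, the zeros can be written in fixed-size chunks so the whole object never has to exist in RAM. This is a sketch only: write_zero_file and its arguments are made-up names, the file is fully written (not sparse), and it will be much slower than the ff approach described above. The size is held as a double, so it is not limited by .Machine$integer.max.

```r
# Write n_bytes zero bytes to 'path' in chunks of at most chunk_bytes.
# (Hypothetical helper, not part of ff or base R.)
write_zero_file <- function(path, n_bytes, chunk_bytes = 1e6) {
  con <- file(path, open = "wb")
  on.exit(close(con))
  zeros     <- raw(chunk_bytes)   # one reusable chunk of zero bytes
  remaining <- n_bytes            # a double, so > 2^31 - 1 is fine
  while (remaining > 0) {
    k <- min(remaining, chunk_bytes)
    writeBin(zeros[seq_len(k)], con)
    remaining <- remaining - k
  }
  invisible(path)
}

# Small demonstration: 2.5 MB file in a temporary location.
tmp <- tempfile(fileext = ".bin")
write_zero_file(tmp, n_bytes = 2.5e6)
file.size(tmp)
```

For the 17711913600-cell case in the thread, n_bytes would be 17711913600 * 8, which is representable exactly as a double.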
Re: [R] Anova and tukey-grouping
Hello!

Thanks for your advice. I tried it, but the output is the same:

    HSD.test(anova.typabunmit, "typ", group=TRUE)

    Name:  typ
     ds.typabunmit$typ

I don't get the values...!?!?

--
View this message in context: http://r.789695.n4.nabble.com/Anova-and-tukey-grouping-tp4644485p4644513.html
Sent from the R help mailing list archive at Nabble.com.
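One plausible explanation, offered as a guess about agricolae rather than a confirmed diagnosis: the model was fitted with ds.typabunmit$abun ~ ds.typabunmit$typ, so the term stored in the model is literally named "ds.typabunmit$typ", and a lookup for "typ" finds nothing. The base R part of this can be checked with the built-in warpbreaks data:

```r
# Fitted with $ in the formula: the model frame keeps the full expressions
# as column names.
fm_dollar <- aov(warpbreaks$breaks ~ warpbreaks$tension)
names(fm_dollar$model)

# Fitted with bare names plus data=: the columns are simply "breaks" and
# "tension", which is what a lookup by factor name needs.
fm_data <- aov(breaks ~ tension, data = warpbreaks)
names(fm_data$model)

# Base R post-hoc test on the second fit, addressed by the plain name.
tk <- TukeyHSD(fm_data, "tension", ordered = TRUE)
tk
```

If that is the cause, refitting as aov(abun ~ typ, data = ds.typabunmit) and then calling HSD.test(..., "typ", group=TRUE) would be the thing to try.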
Re: [R] List of Variables in Original Order
AK:

Thanks, that was very helpful. It led me to think of the function names() from base, which provided the vector of names in the correct order. I then used the same matrix formatting and everything worked out exactly as planned.

Dick

On 9/28/2012 1:09 AM, arun kirshna [via R] wrote:
> HI,
> May be this helps you:
>
>     set.seed(1)
>     mat1 <- matrix(rnorm(60,5),nrow=5,ncol=12)
>     colnames(mat1) <- paste0("Var",1:12)
>     vec2 <- format(c(1,cor(mat1[,1],mat1[,2:12])),digits=4)
>     vec3 <- colnames(mat1)
>     arr2 <- array(rbind(vec3,vec2),dim=c(2,3,4))
>     res <- data.frame(do.call(rbind,lapply(1:dim(arr2)[3],function(i) arr2[,,i])))
>     res
>     #        X1       X2       X3
>     #1     Var1     Var2     Var3
>     #2      1.0  0.27890 -0.61497
>     #3     Var4     Var5     Var6
>     #4  0.24916 -0.76155  0.30853
>     #5     Var7     Var8     Var9
>     #6 -0.46413  0.79287  0.05191
>     #7    Var10    Var11    Var12
>     #8 -0.06940 -0.53251  0.06766
>
> A.K.
>
> ----- Original Message -----
> From: rkulp [hidden email]
> To: [hidden email]
> Cc:
> Sent: Thursday, September 27, 2012 6:26 PM
> Subject: [R] List of Variables in Original Order
>
> > I am trying to Sweave the output of calculating correlations between one variable and several others. I wanted to print a table where the odd-numbered rows contain the variable names and the even-numbered rows contain the correlations. So if VarA is correlated with all the variables in mydata.df, then it would look like
> >
> >     var1   var2   var3
> >     corr1  corr2  corr3
> >     var4   var5   var6
> >     corr4  corr5  corr6
> >     . . etc.
> >
> > I tried using a matrix for the correlations and another one for the variable names. I built the correlation matrix using
> >
> >     x = matrix(format(cor(mydata.df[,1],mydata.df[,c(2:79)]),digits=4),nc=3)
> >
> > and the variable names matrix using
> >
> >     y = matrix(ls(mydata.df[c(2:79)]),nc=3)
> >
> > The problem is the function ls returns the names in alphabetical order, columnar order. How do I get the names in columnar order? Is there a better way to display the correlation of a single variable with a large number of other variables? If there is, how do I do it?
> >
> > I appreciate any help I can get. This is my first project in R so I don't know much about it yet.
> >
> > --
> > View this message in context: http://r.789695.n4.nabble.com/List-of-Variables-in-Original-Order-tp4644436.html
> > Sent from the R help mailing list archive at Nabble.com.

--
View this message in context: http://r.789695.n4.nabble.com/List-of-Variables-in-Original-Order-tp4644436p4644516.html
Sent from the R help mailing list archive at Nabble.com.
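A small sketch of the names() approach Dick describes, for readers who land on this thread later. The data frame and its column names are invented, and the way the table is assembled (interleaving two matrices by row index) is just one possible layout, not necessarily what Dick used.

```r
# Toy data whose column names are deliberately not in alphabetical order.
set.seed(1)
mydata <- as.data.frame(matrix(rnorm(35), nrow = 5))
names(mydata) <- c("zeta", "alpha", "mid", "beta", "yak", "gamma", "delta")

# names() keeps the original column order; sorting would not.
vars <- names(mydata)[-1]
cors <- format(as.vector(cor(mydata[, 1], mydata[, -1])), digits = 4)

# Fill by row so that each name sits directly above its correlation.
name_mat <- matrix(vars, ncol = 3, byrow = TRUE)
cor_mat  <- matrix(cors, ncol = 3, byrow = TRUE)

tab <- matrix(NA_character_, nrow = 2 * nrow(name_mat), ncol = 3)
tab[seq(1, nrow(tab), by = 2), ] <- name_mat   # odd rows: names
tab[seq(2, nrow(tab), by = 2), ] <- cor_mat    # even rows: correlations
tab
```

This assumes the number of variables is a multiple of 3, as it is with columns 2:79 in the original question.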
Re: [R] Running different Regressions using for loops
On Sep 28, 2012, at 4:35 AM, Krunal Nanavati wrote:
> Ok...I am sorry for the misunderstanding. What I am trying to do is

Perhaps (and that is a really large 'perhaps'):

    lm.list2 <- list()
    lm.means <- list()
    for(i in seq_along(pricemedia)){
        regr <- paste(pricemedia[i], trendseason, sep = "+")
        fmla <- paste(response, regr, sep = "~")
        lm.list2[[i]] <- lm(as.formula(fmla), data = tryout2)
        lm.means[[i]] <- mean(lm.list2[[i]]$coefficients[c("Price1", "Media1")])
    }

> When I run this set of statements, the 1st regression to be run will have Price 1, Media 1 as X variables, and in the second loop it will have Price 1 and Media 2.
>
> So, what I was thinking is: if I can generate inside the for loop the mean for Price 1 and Media 1 during the 1st loop, and then the mean for Price 1 and Media 2 during the second loop, and so on, for all the 10 regressions.
>
> Is the method that I was trying appropriate, or is there a better method there? I am sorry for the earlier explanation, I hope this one makes it more understandable.

One generally wants one's methods to be determinate while allowing the results to be approximate. Had you followed the posting guide and offered a reproducible example it would have been much more understandable.

> Thanks for your time...and all the quick replies
>
> -----Original Message-----
> From: Rui Barradas [mailto:ruipbarra...@sapo.pt]
> Sent: 28 September 2012 16:49
> To: Krunal Nanavati
> Cc: David Winsemius; r-help@r-project.org
> Subject: Re: [R] Running different Regressions using for loops
>
> Ok, if I'm understanding it well, you want the mean value of Price1, ..., Price5? I don't know if it makes any sense, the coefficients already are mean values, but see if this is it.
>
>     price.coef <- sapply(lm.list, function(x) coef(x)[2])
>     mean(price.coef)
>
> Rui Barradas
>
> On 28-09-2012 12:07, Krunal Nanavati wrote:
> > Hi,
> >
> > Yes, the thing that you provided works fine, but probably I should have asked for some other thing.
> > Here is what I am trying to do.
> >
> > I am trying to get the mean of the Price variable, so I am entering the below function:
> >
> >     mean(names(lm.list2[[2]]$coefficient[2] ))
> >
> > but this gives me an error:
> >
> >     [1] NA
> >     Warning message:
> >     In mean.default(names(lm.list2[[2]]$coefficient[2])) :
> >       argument is not numeric or logical: returning NA
> >
> > I thought that getting the text from the list variable would help me generate the mean for that text, which is a variable in the data, say Price 1, Media 2, and so on.
> >
> > Is this a proper approach? If it is, then something more needs to be done with the function that you provided. If not, is there a better way to generate the mean of a particular variable inside the for loop used earlier, given below:
> >
> >     lm.list2 <- list()
> >     for(i in seq_along(pricemedia)){
> >         regr <- paste(pricemedia[i], trendseason, sep = "+")
> >         fmla <- paste(response, regr, sep = "~")
> >         lm.list2[[i]] <- lm(as.formula(fmla), data = tryout2)
> >     }
> >
> > Thanks Regards,
> > Krunal Nanavati
> >
> > -----Original Message-----
> > From: Rui Barradas [mailto:ruipbarra...@sapo.pt]
> > Sent: 28 September 2012 16:02
> > To: Krunal Nanavati
> > Cc: David Winsemius; r-help@r-project.org
> > Subject: Re: [R] Running different Regressions using for loops
> >
> > Hello,
> >
> > Try
> >
> >     names(lm.list2[[2]]$coefficient[2] )
> >
> > Rui Barradas
> >
> > On 28-09-2012 11:29, Krunal Nanavati wrote:
> > > Ok...this solves a part of my problem.
> > >
> > > When I type lm.list2[2] I get the following output:
> > >
> > >     [[1]]
> > >     Call:
> > >     lm(formula = as.formula(fmla), data = tryout2)
> > >     Coefficients:
> > >     (Intercept)  Price2  Media1  Distri1  Trend  Seasonality
> > >     13491232  -5759030  -15203437048628  445351
> > >
> > > When I enter lm.list2[[2]]$coefficient[2] it gives me the below output:
> > >
> > >       Price2
> > >     -5759030
> > >
> > > And when I enter lm.list2[[2]]$coefficient[[2]] I get the number, which is -5759030.
> > >
> > > I am looking for a way to get just the "Price2". Is there a statement for that??
> > > Thanks Regards,
> > > Krunal Nanavati
> > >
> > > -----Original Message-----
> > > From: Rui Barradas [mailto:ruipbarra...@sapo.pt]
> > > Sent: 28 September 2012 15:18
> > > To: Krunal Nanavati
> > > Cc: David Winsemius; r-help@r-project.org
> > > Subject: Re: [R] Running different Regressions using for loops
> > >
> > > Hello,
> > >
> > > To access list elements you need `[[`, like this:
> > >
> > >     summ.list[[2]]$coefficients
> > >
> > > Or use the extractor function:
> > >
> > >     coef(summ.list[[2]])
> > >
> > > Rui Barradas
> > >
> > > On 28-09-2012 07:23, Krunal Nanavati wrote:
> > > > Hi Rui,
> > > >
> > > > Excellent!! This is what I was looking for. Thanks for the help.
> > > >
> > > > So, now I have stored the result of the 10 regressions in
> > > >
> > > >     summ.list <- lapply(lm.list2, summary)
> > > >
> > > > And now once I enter summ.list it gives me the output for all the 10 regressions...
> > > >
> > > > I wanted to access a beta coefficient of one of the regressions...say
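Since no reproducible example was posted, here is a self-contained sketch of the pattern discussed in this thread, using the built-in mtcars data as a stand-in for tryout2. The variable choices (mpg as response, wt/hp/disp playing the role of the price/media terms, cyl + gear as the fixed part) are assumptions made purely for illustration.

```r
response   <- "mpg"
predictors <- c("wt", "hp", "disp")   # stand-ins for the Price/Media terms
controls   <- "cyl + gear"            # stand-in for trend and seasonality

# One regression per varying predictor, stored in a list.
lm.list <- list()
for (i in seq_along(predictors)) {
  fmla <- as.formula(paste(response, "~", predictors[i], "+", controls))
  lm.list[[i]] <- lm(fmla, data = mtcars)
}

# Name and value of the second coefficient of the second model.
coef_name  <- names(coef(lm.list[[2]]))[2]
coef_value <- coef(lm.list[[2]])[[2]]

# mean() of a name is NA; to average the variable itself, use the name
# to index the data.
var_mean <- mean(mtcars[[coef_name]])

# Second coefficient from every model, as in Rui's sapply() suggestion.
slopes <- sapply(lm.list, function(m) coef(m)[2])
```

The key step for the NA problem is mtcars[[coef_name]]: the character string selects the column, and mean() then works on the numbers.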
[R] max summary contradict each other
why does summary report max 27600 and not 27603?

    > x <- c(27603, 1)
    > max(x)
    [1] 27603
    > summary(x)
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
          1    6902   13800   13800   20700   27600

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://memri.org http://pmw.org.il http://dhimmi.com http://iris.org.il http://mideasttruth.com
Vegetarians eat Vegetables, Humanitarians are scary.
Re: [R] max summary contradict each other
On 28/09/2012 12:14 PM, Sam Steingold wrote:
> why does summary report max 27600 and not 27603?
>
>     > x <- c(27603, 1)
>     > max(x)
>     [1] 27603
>     > summary(x)
>        Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>           1    6902   13800   13800   20700   27600

Because you asked for 3 digit accuracy. See ?summary.

Duncan Murdoch
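The rounding can be reproduced directly with signif(). Judging from the 6902 in the output above, the default in this R version looks like four significant digits rather than three, but that is an inference from the printed values; ?summary has the authoritative default, and newer R versions may print this summary differently.

```r
x <- c(27603, 1)

max(x)              # the exact maximum
signif(max(x), 4)   # rounded to four significant digits: 27600

# Asking summary() for more digits keeps the maximum intact.
s <- summary(x, digits = 7)
s
```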
Re: [R] Lattice bwplot(): Conditioning on one factor
On Sep 28, 2012, at 7:49 AM, Rich Shepard wrote:
> I'm not able to create the proper syntax to specify a lattice bwplot() for only one of two conditioning factors.

Wouldn't that involve specifying the 'subset' parameter (if bwplot accepts a subset argument) or using the 'subset' function to pass the desired rows to the data argument if it doesn't?

> The syntax that produces a box plot of each of the two conditioning factors is:
>
>     bwplot(quant ~ param | era, data=mg.d, main='Dissolved Magnesium',
>            ylab='Concentration (mg/L)')
>
> What I've tried unsuccessfully are:
>
>     bwplot(quant ~ param | factor(era=='Pre-mining'), data=mg.d,
>            main='Magnesium', ylab='Concentration (mg/L))
>
>     bwplot(quant ~ param | era, data=mg.d, main='Magnesium',
>            ylab='Concentration (mg/L)', subset=era('Pre-mining'))
>
> plus slight variations of the above. None work.
>
> Please point me to what I've missed in specifying only one of two conditioning factors for the plot.

--
David Winsemius, MD
Alameda, CA, USA
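Both routes mentioned above can be sketched as follows. The mg.d data here are simulated, since the real data were not posted, so the column contents are assumptions; the point is only the form of the subset expression, which is a logical condition (era == 'Pre-mining') rather than a call like era('Pre-mining').

```r
library(lattice)

# Simulated stand-in for the poster's data (hypothetical values).
set.seed(1)
mg.d <- data.frame(
  quant = rlnorm(80),
  param = factor(rep(c("Mg", "Ca"), each = 40)),
  era   = factor(rep(c("Pre-mining", "Mining"), times = 40))
)

# Route 1: bwplot()'s own subset argument, evaluated inside 'data'.
p_subset <- bwplot(quant ~ param | era, data = mg.d,
                   subset = era == "Pre-mining",
                   main = "Magnesium", ylab = "Concentration (mg/L)")

# Route 2: subset the rows first, then plot without conditioning.
pre    <- subset(mg.d, era == "Pre-mining")
p_data <- bwplot(quant ~ param, data = pre,
                 main = "Magnesium", ylab = "Concentration (mg/L)")
# print(p_subset) or print(p_data) draws the plot.
```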
Re: [R] Simple Question
Hi Everyone,

Sorry for coming back again with a new problem. Editing question, session info and data so you don't have to scroll till the end of page.

*Situation :*

I have a data frame and its name is df. Now I want to add a time stamp to the end of the *name of the data frame, i.e. df_system_time*. Previously it was running great, thanks to Dr. Winsemius, Kimmo and Pascal, I believe because the value I was assigning was a scalar.

*Data :*

    > dput(df)
    structure(list(x = 1:10, y = 1:10), .Names = c("x", "y"),
              row.names = c(NA, -10L), class = "data.frame")

*Session Info :*

    R version 2.15.1 (2012-06-22)
    Platform: i386-pc-mingw32/i386 (32-bit)

    locale:
    [1] LC_COLLATE=English_United States.1252
    [2] LC_CTYPE=English_United States.1252
    [3] LC_MONETARY=English_United States.1252
    [4] LC_NUMERIC=C
    [5] LC_TIME=English_United States.1252

    attached base packages:
    [1] stats graphics grDevices datasets utils methods
    [7] base

    other attached packages:
    [1] rcom_2.2-5 rscproxy_2.0-5

    loaded via a namespace (and not attached):
     [1] colorspace_1.1-1   dichromat_1.2-4    digest_0.5.2
     [4] ggplot2_0.9.2.1    grid_2.15.1        gtable_0.1.1
     [7] labeling_0.1       MASS_7.3-18        memoise_0.1
    [10] munsell_0.3        plyr_1.7.1         proto_0.3-9.2
    [13] RColorBrewer_1.0-5 reshape2_1.2.1     scales_0.2.2
    [16] stringr_0.6.1      tools_2.15.1

It's kind of very easy in SQL but I love doing all the work in R so don't want to leave for just changing the name.

Best Regards,

Bhupendrasinh Thakre

*Disclaimer :* The information contained in this communication is confidential and may be legally privileged. It is intended solely for the use of the individual or entity to whom it is addressed. If you are not the intended recipient you are hereby (a) notified that any disclosure, copying, distribution or taking any action with respect to the content of this information is strictly prohibited and may be unlawful, and (b) kindly requested to inform the sender immediately and destroy any copies.
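As far as base R is concerned, assign() does not care whether the value is a scalar or a data frame, so the earlier recipe should carry over unchanged. A short sketch using the posted data (stamped_name is an illustrative name; the remove step is optional and only there to mimic a rename):

```r
df <- data.frame(x = 1:10, y = 1:10)

# Name of the form df_2012_09_28_11_05_30, built from the current time.
stamped_name <- paste0("df_", format(Sys.time(), "%Y_%m_%d_%H_%M_%S"))

# Create the time-stamped copy and confirm it holds the same data.
assign(stamped_name, df)
same <- identical(get(stamped_name), df)

# Optional: drop the original so that only the stamped name remains.
rm(df)
```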
Re: [R] Lattice bwplot(): Conditioning on one factor
A small reproducible example, as requested by the posting guide, would have been very helpful here (if you provide one, use ?dput to provide the data). You have also not told us what you mean by "unsuccessful", so we are left to guess what sort of problems you experienced. "None work" is completely useless to help diagnose the problem. This means we waste time going back and forth trying to elucidate what you mean. Please consider these things if/when you post in future.

In any case, my guess is that param is numeric and it should be a factor, so, e.g.

    bwplot(quant ~ factor(param) | era, data=mg.d, main='Dissolved Magnesium',
           ylab='Concentration (mg/L)')

might be what you want. But of course, it may be completely wrong.

Cheers,
Bert

On Fri, Sep 28, 2012 at 9:25 AM, David Winsemius dwinsem...@comcast.net wrote:
> On Sep 28, 2012, at 7:49 AM, Rich Shepard wrote:
> > I'm not able to create the proper syntax to specify a lattice bwplot() for only one of two conditioning factors.
>
> Wouldn't that involve specifying the 'subset' parameter (if bwplot accepts a subset argument) or using the 'subset' function to pass the desired rows to the data argument if it doesn't?
>
> > The syntax that produces a box plot of each of the two conditioning factors is:
> >
> >     bwplot(quant ~ param | era, data=mg.d, main='Dissolved Magnesium',
> >            ylab='Concentration (mg/L)')
> >
> > What I've tried unsuccessfully are:
> >
> >     bwplot(quant ~ param | factor(era=='Pre-mining'), data=mg.d,
> >            main='Magnesium', ylab='Concentration (mg/L))
> >
> >     bwplot(quant ~ param | era, data=mg.d, main='Magnesium',
> >            ylab='Concentration (mg/L)', subset=era('Pre-mining'))
> >
> > plus slight variations of the above. None work.
> >
> > Please point me to what I've missed in specifying only one of two conditioning factors for the plot.
>
> --
> David Winsemius, MD
> Alameda, CA, USA

--
Bert Gunter
Genentech Nonclinical Biostatistics
Internal Contact Info: Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
Re: [R] Simple Question
Hi Everyone, Sorry for coming back again with a new problem. Editing question, session info and data so you don't have to scroll till the end of page. *Situation :* I have a data frame and it's name is df. Now I want to add Time Stamp to the end of *name of data Frame i.e. df_system_time*. Previously it was running great and thanks to Dr. Winsemius , Kimmo and Pascal and I believe as the function which i used was scalar. *Data :* dput(df)structure(list(x = 1:10, y = 1:10), .Names = c(x, y), row.names = c(NA, -10L), class = data.frame) *Session Info :* R version 2.15.1 (2012-06-22) Platform: i386-pc-mingw32/i386 (32-bit) locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices datasets utils methods [7] base other attached packages: [1] rcom_2.2-5 rscproxy_2.0-5 loaded via a namespace (and not attached): [1] colorspace_1.1-1 dichromat_1.2-4digest_0.5.2 [4] ggplot2_0.9.2.1grid_2.15.1gtable_0.1.1 [7] labeling_0.1 MASS_7.3-18memoise_0.1 [10] munsell_0.3plyr_1.7.1 proto_0.3-9.2 [13] RColorBrewer_1.0-5 reshape2_1.2.1 scales_0.2.2 [16] stringr_0.6.1 tools_2.15.1 It's kind of very easy in SQL but I love doing all the work in R so don't want to leave for just changing the name. Best Regards, Bhupendrasinh Thakre Best Regards, Bhupendrasinh Thakre *Disclaimer :* The information contained in this communication is confidential and may be legally privileged. It is intended solely for the use of the individual or entity to whom it is adressed. If you are not the intended recipient you are hereby (a) notified that any disclosure, copying, distribution or taking any action with respect to the content of this information is strictly prohibited and may be unlawful, and (b) kindly requested to inform the sender immediately and destroy any copies. 
On Fri, Sep 28, 2012 at 10:13 AM, Bhupendrasinh Thakre vickytha...@gmail.com wrote:

Many thanks Dr. Winsemius, Kimmo and Pascal. All of them are working and really beautiful...

Best Regards, Bhupendrasinh Thakre

On Fri, Sep 28, 2012 at 1:36 AM, David Winsemius dwinsem...@comcast.net wrote:

On Sep 27, 2012, at 11:13 PM, Bhupendrasinh Thakre wrote:

Hi Everyone, I am trying a very simple task to append the Timestamp with a variable name so something like a_2012_09_27_00_12_30 <- rnorm(1,2,1).

If you want to assign a value to a character-name you need to use ... `assign`. You cannot just stick a numeric value, which is what you get with Sys.time(), on the LHS of a <- and expect R to intuit what you intend.

    ?assign
    assign( "a_2012_09_27_00_12_30" , rnorm(1,2,1) )
    assign( as.character(unclass(Sys.time())) , rnorm(1,2,1) )

(I would have thought you wanted to format that Sys.time result:)

    format(Sys.time(), "%Y_%m_%d_%H_%M_%S")
    [1] "2012_09_27_23_32_40"
    assign(format(Sys.time(), "%Y_%m_%d_%H_%M_%S"), rnorm(1,2,1) )
    grep("^2012", ls(), value=TRUE)
    [1] "2012_09_27_23_33_45"

Tried some commands but it doesn't work out well. Hope someone has some answer on it.

Session Info

    R version 2.15.1 (2012-06-22)
    Platform: i386-apple-darwin9.8.0/i386 (32-bit)
    locale:
    [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
    attached base packages:
    [1] stats graphics grDevices utils datasets methods base
    other attached packages:
    [1] chron_2.3-42 twitteR_0.99.19 rjson_0.2.9 RCurl_1.91-1 bitops_1.0-4.1 tm_0.5-7.1 RMySQL_0.9-3 DBI_0.2-5
    loaded via a namespace (and not attached):
    [1] slam_0.1-24 tools_2.15.1

Statement I tried:

    b <- unclass(Sys.time())
    b = 1348812597
    c_b <- rnorm(1,2,1)

Works perfectly but doesn't show me c_1348812597.

Best Regards, Bhupendrasinh Thakre

BT; Please learn to post in plain text. It's really very simple with gmail.

-- David Winsemius, MD Alameda, CA, USA
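A minimal sketch of the point made in the reply above: `c_b <- ...` always creates an object literally named `c_b`, so the name has to be built as a string and handed to assign(). The fixed time stamp, the scratch environment `env` and the value 42 are illustrative assumptions, not taken from the thread.

```r
# Illustrative only: a fixed time stamp and a scratch environment (both assumptions)
env <- new.env()
stamp <- format(as.POSIXct("2012-09-27 23:32:40", tz = "UTC"), "%Y_%m_%d_%H_%M_%S")
var_name <- paste0("a_", stamp)        # build the name as a character string
assign(var_name, 42, envir = env)      # create the object under the computed name
value <- get(var_name, envir = env)    # retrieve it again by name
```

Here get() is the counterpart of assign(): both take the object name as a string, which is what makes computed names possible.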
Re: [R] max summary contradict each other
Hi, Try this:

    summary(x,digits=max(5))
    #    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    #     1.0  6901.5 13802.0 13802.0 20702.0 27603.0

A.K.

----- Original Message -----
From: Sam Steingold s...@gnu.org
To: r-help@r-project.org
Cc:
Sent: Friday, September 28, 2012 12:14 PM
Subject: [R] max summary contradict each other

why does summary report max 27600 and not 27603?

    x <- c(27603, 1)
    max(x)
    [1] 27603
    summary(x)
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
          1    6902   13800   13800   20700   27600

-- Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000 http://www.childpsy.net/ http://memri.org http://pmw.org.il http://dhimmi.com http://iris.org.il http://mideasttruth.com Vegetarians eat Vegetables, Humanitarians are scary.
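One likely explanation, offered as a hedged sketch rather than a statement about the internals: in R 2.15.x, summary() on a numeric vector rounds its results to max(3, getOption("digits") - 3) significant digits, which is 4 with the default of 7. The variable names below are illustrative; signif() is used only to reproduce that rounding.

```r
x <- c(27603, 1)
default_digits <- max(3, 7 - 3)                # 7 is the default getOption("digits")
rounded_max <- signif(max(x), default_digits)  # 27603 at 4 significant digits
exact_max <- signif(max(x), 5)                 # 5 significant digits keep every digit
```

With 4 significant digits the maximum is shown as 27600, while asking for 5 digits (as in the reply above) shows 27603. max(x) itself is never changed, only its display in the summary.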
Re: [R] Lattice bwplot(): Conditioning on one factor
On Fri, 28 Sep 2012, David Winsemius wrote:

Wouldn't that involve specifying the 'subset' parameter (if bwplot accepts a subset argument) or using the 'subset' function to pass the desired rows to the data argument if it doesn't?

David,

That's what I tried:

    bwplot(quant ~ param | era, data=mg.d, main='Magnesium', ylab='Concentration (mg/L)', subset=era('Pre-mining'))

Perhaps I didn't write it correctly.

Thanks, Rich
Re: [R] Simple Question
On 28-09-2012, at 18:40, Bhupendrasinh Thakre vickytha...@gmail.com wrote:

Hi Everyone, Sorry for coming back again with a new problem. Editing question, session info and data so you don't have to scroll till the end of page. *Situation:* I have a data frame and its name is df. Now I want to add a time stamp to the end of the *name of the data frame, i.e. df_system_time*. Previously it was running great, thanks to Dr. Winsemius, Kimmo and Pascal, and I believe that was because the function I used was scalar.

*Data:*

    dput(df)
    structure(list(x = 1:10, y = 1:10), .Names = c("x", "y"), row.names = c(NA, -10L), class = "data.frame")

You have been given the answer. It only needs a minor variation:

    newname.df <- paste0("df_", format(Sys.time(), "%Y_%m_%d_%H_%M_%S"))
    assign(newname.df, df)

and if you wish

    rm(list=c('df','newname.df'))

Or install package memisc (found by doing findFn("rename") from package sos) and use function rename(); I have not tried this.

Berend
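The steps in the answer above can be sketched end to end as follows. This is a minimal illustration using the data frame from the question; the name `new_name` is an assumption, and only the old object is removed here so that the new one can still be looked up by name afterwards.

```r
df <- data.frame(x = 1:10, y = 1:10)   # the data from the question
new_name <- paste0("df_", format(Sys.time(), "%Y_%m_%d_%H_%M_%S"))
assign(new_name, df)                   # copy df under the time-stamped name
rm(df)                                 # drop the old name
renamed <- get(new_name)               # fetch the object via its computed name
```

Keeping `new_name` around is a deliberate choice in this sketch: once it is removed, the time-stamped object can only be found again by searching ls().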
Re: [R] [R-sig-hpc] Quickest way to make a large empty file on disk?
Rui:

Quick follow-up -- it looks like seek does do what I want (I see Simon suggested it some time ago) -- what do you mean by "trash your disk"?

What I'm trying to accomplish is getting parallel, asynchronous writes to a large binary image (just a binary file) working. Each node writes to a different sector of the file via mmap, filling in the values as the process runs, but the file needs to be pre-created before I can mmap it. Running a writeBin with a bunch of 0s would mean I'd basically have to write the file twice, but the seek/ff trick seems to be much faster.

Do I risk doing some damage to my filesystem if I use seek? I see there is a strongly worded warning in the help for ?seek:

"Use of seek on Windows is discouraged. We have found so many errors in the Windows implementation of file positioning that users are advised to use it only at their own risk, and asked not to waste the *R* developers' time with bug reports on Windows' deficiencies."

-- there's no detail here on which errors people have experienced, so I'm not sure if doing something as simple as just creating a file using seek falls under the "discouraging" category.

As a note, we are trying to work this up on both Windows and *nix systems, hence our wanting to have a single approach that works on both OSs.

--j

On Thu, Sep 27, 2012 at 3:49 PM, Rui Barradas ruipbarra...@sapo.pt wrote:

Hello, If you really need to trash your disk, why not use seek()?

    fl <- file("Test.txt", open = "wb")
    seek(fl, where = 1024, origin = "start", rw = "write")
    [1] 0
    writeChar(character(1), fl, nchars = 1, useBytes = TRUE)
    Warning message:
    In writeChar(character(1), fl, nchars = 1, useBytes = TRUE) :
      writeChar: more characters requested than are in the string - will zero-pad
    close(fl)

File Test.txt is now 1Kb in size.

Hope this helps, Rui Barradas

On 27-09-2012 20:17, Jonathan Greenberg wrote:

Folks: Asked this question some time ago, and found what appeared (at first) to be the best solution, but I'm now finding a new problem.
First off, it seemed like ff as Jens suggested worked:

    # outdata_ncells = the number of rows * number of columns * number of bands in an image:
    out <- ff(vmode="double", length=outdata_ncells, filename=filename)
    finalizer(out) <- "close"
    close(out)

This was working fine until I attempted to set length to a VERY large number: outdata_ncells = 17711913600. This would create a file that is 131.964GB. Big, but not obscenely so (and certainly not larger than the filesystem can handle). However, length appears to be restricted by .Machine$integer.max (I'm on a 64-bit windows box):

    .Machine$integer.max
    [1] 2147483647

Any suggestions on how to solve this problem for much larger file sizes?

--j

On Thu, May 3, 2012 at 10:44 AM, Jonathan Greenberg j...@illinois.edu wrote:

Thanks, all! I'll try these out. I'm trying to work up something that is platform independent (if possible) for use with mmap. I'll do some tests on these suggestions and see which works best. I'll try to report back in a few days. Cheers!

--j

2012/5/3 Jens Oehlschlägel jens.oehlschlae...@truecluster.com

Jonathan,

On some filesystems (e.g. NTFS, see below) it is possible to create 'sparse' memory-mapped files, i.e. reserving the space without the cost of actually writing initial values. Package 'ff' does this automatically and also allows accessing the file in parallel. Check the example below and see how big file creation is immediate.
Jens Oehlschlägel

    library(ff)
    library(snowfall)
    ncpus <- 2
    n <- 1e8
    system.time(
    + x <- ff(vmode="double", length=n, filename="c:/Temp/x.ff")
    + )
       user  system elapsed
       0.01    0.00    0.02
    # check finalizer, with an explicit filename we should have a 'close' finalizer
    finalizer(x)
    [1] "close"
    # if not, set it to 'close' in order to not let slaves delete x on slave shutdown
    finalizer(x) <- "close"
    sfInit(parallel=TRUE, cpus=ncpus, type="SOCK")
    R Version: R version 2.15.0 (2012-03-30)
    snowfall 1.84 initialized (using snow 0.3-9): parallel execution on 2 CPUs.
    sfLibrary(ff)
    Library ff loaded.
    Library ff loaded in cluster.
    Warning message:
    In library(package = "ff", character.only = TRUE, pos = 2, warn.conflicts = TRUE, :
      'keep.source' is deprecated and will be ignored
    sfExport("x") # note: do not export the same ff multiple times
    # explicitly opening avoids a gc problem
    sfClusterEval(open(x, caching="mmeachflush")) # opening with 'mmeachflush' instead of 'mmnoflush' is a bit slower but prevents OS write storms when the file is larger than RAM
    [[1]]
    [1] TRUE
    [[2]]
    [1] TRUE
    system.time(
    + sfLapply( chunk(x, length=ncpus), function(i){
    + x[i] <- runif(sum(i))
    + invisible()
    + })
    + )
       user  system elapsed
       0.00    0.00   30.78
    system.time(
    + s <- sfLapply( chunk(x, length=ncpus), function(i)
Re: [R] Lattice bwplot(): Conditioning on one factor
On Sep 28, 2012, at 9:56 AM, Rich Shepard wrote:

On Fri, 28 Sep 2012, David Winsemius wrote:

Wouldn't that involve specifying the 'subset' parameter (if bwplot accepts a subset argument) or using the 'subset' function to pass the desired rows to the data argument if it doesn't?

David, That's what I tried:

    bwplot(quant ~ param | era, data=mg.d, main='Magnesium', ylab='Concentration (mg/L)', subset=era('Pre-mining'))

Sigh. If I were testing that strategy (which I did not try because you were too busy to have included a working example) I would have written it:

    bwplot(quant ~ param , data=mg.d, main='Magnesium', ylab='Concentration (mg/L)', subset= era=='Pre-mining' )

That passes a logical vector, which will work only if bwplot creates a local environment where column names of the 'data' argument have been added to the local namespace. I do not know if that is true. I just looked at the bwplot help page and do not see a subset argument documented there.

The other suggestion, which it seems you were also too busy to have tried, was:

    bwplot(quant ~ param , main='Magnesium', ylab='Concentration (mg/L)', data = subset( mg.d, era=='Pre-mining' ) )

Wrapping a factor level in parentheses after a column name (which R takes to mean there is a function named 'era' to be applied) and expecting R to understand that you want a subset seems doomed to failure. It makes no sense to me to condition on a factor that you know for certain has only one level in the data being offered.

-- David Winsemius, MD Alameda, CA, USA
Re: [R] Lattice bwplot(): Conditioning on one factor
On Fri, 28 Sep 2012, David Winsemius wrote:

    bwplot(quant ~ param , data=mg.d, main='Magnesium', ylab='Concentration (mg/L)', subset= era=='Pre-mining' )

David, Don:

Thank you. I tried subset= and era== separately, not together. Now I know.

Much appreciated, Rich
Re: [R] Lattice bwplot(): Conditioning on one factor
Yes. Now I understand what was wanted.

1. The subset argument is certainly documented on the Help page:

subset: An expression that evaluates to a logical or integer indexing vector. Like groups, it is evaluated in data. Only the resulting rows of data are used for the plot. If subscripts is TRUE, the subscripts provided to the panel function will be indices referring to the rows of data prior to the subsetting. Whether levels of factors in the data frame that are unused after the subsetting will be dropped depends on the drop.unused.levels argument.

Had the OP read this carefully, he would presumably have recognized the errors in his specification.

2. Here is a small reproducible example to show how it should be done (probably unnecessary now):

    df <- expand.grid(a = letters[1:3], b = LETTERS[1:2])
    df <- df[rep(1:6,10),]
    df$y <- runif(60)
    bwplot(y~a|b, data=df, subset = (b=="A")) ## The logical condition is parenthesized only for clarity

Cheers, Bert

On Fri, Sep 28, 2012 at 10:10 AM, David Winsemius dwinsem...@comcast.net wrote:

On Sep 28, 2012, at 9:56 AM, Rich Shepard wrote:

On Fri, 28 Sep 2012, David Winsemius wrote:

Wouldn't that involve specifying the 'subset' parameter (if bwplot accepts a subset argument) or using the 'subset' function to pass the desired rows to the data argument if it doesn't?

David, That's what I tried:

    bwplot(quant ~ param | era, data=mg.d, main='Magnesium', ylab='Concentration (mg/L)', subset=era('Pre-mining'))

Sigh. If I were testing that strategy (which I did not try because you were too busy to have included a working example) I would have written it:

    bwplot(quant ~ param , data=mg.d, main='Magnesium', ylab='Concentration (mg/L)', subset= era=='Pre-mining' )

That passes a logical vector, which will work only if bwplot creates a local environment where column names of the 'data' argument have been added to the local namespace. I do not know if that is true. I just looked at the bwplot help page and do not see a subset argument documented there.

The other suggestion, which it seems you were also too busy to have tried, was:

    bwplot(quant ~ param , main='Magnesium', ylab='Concentration (mg/L)', data = subset( mg.d, era=='Pre-mining' ) )

Wrapping a factor level in parentheses after a column name (which R takes to mean there is a function named 'era' to be applied) and expecting R to understand that you want a subset seems doomed to failure. It makes no sense to me to condition on a factor that you know for certain has only one level in the data being offered.

-- David Winsemius, MD Alameda, CA, USA

-- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
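The two routes discussed in this thread can be put side by side in a short sketch. The data frame below is a made-up stand-in for mg.d (only the column names quant, param and era come from the thread; the values and levels are assumptions), and the sketch assumes the lattice package is installed.

```r
library(lattice)
set.seed(1)
# Hypothetical stand-in for mg.d with the column names used in the thread
mg.d <- data.frame(
  quant = runif(40),
  param = factor(rep(c("Mg", "Ca"), times = 20)),
  era   = factor(rep(c("Pre-mining", "Post-mining"), each = 20))
)
# Route 1: bwplot() evaluates the logical expression inside 'data'
p1 <- bwplot(quant ~ param, data = mg.d, subset = era == "Pre-mining")
# Route 2: subset the data frame first, then plot what is left
pre <- subset(mg.d, era == "Pre-mining")
p2 <- bwplot(quant ~ param, data = pre)
```

Both calls plot the same 20 rows. The original attempt, era('Pre-mining'), fails because R reads it as a call to a function named era rather than as a comparison.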
Re: [R] install.packages on windows
On 28.09.2012 00:32, Duncan Murdoch wrote:

On 12-09-27 2:53 PM, Anju R wrote:

Sometimes when I try to install certain packages I get a warning message. For example, I tried to install the package "Imtest" on Windows R version 2.15.1 and got the following message:

    Warning message: package ‘Imtest’ is not available (for R version 2.15.1)

How can I install the above package? Why do I get the above warning message?

It probably means exactly what it says, except that the information is about the mirror you are using. I would try another mirror. If that doesn't solve it, then it probably means that the package is really not available for 2.15.1. You can look on the cran.r-project.org website for information about it, and probably download the source from there, but you will probably need to fix whatever is wrong with it before it will work.

Or in other words: There is no such package "Imtest" on CRAN, perhaps you are looking for "lmtest" (with a lower-case L)?

Uwe Ligges

Duncan Murdoch
Re: [R] Anova and tukey-grouping
Hi,

As I mentioned earlier, these are just guesswork until you provide a subset of your data with dput(). Also, please check the structure of the data with str().

A.K.

----- Original Message -----
From: Landi ent-ar...@gmx.de
To: r-help@r-project.org
Cc:
Sent: Friday, September 28, 2012 10:35 AM
Subject: Re: [R] Anova and tukey-grouping

Hello! Thanks for your advice. I tried it, but the output is the same:

    HSD.test(anova.typabunmit, "typ", group=TRUE)
    Name: typ
    ds.typabunmit$typ

I don't get the values...!?!?

-- View this message in context: http://r.789695.n4.nabble.com/Anova-and-tukey-grouping-tp4644485p4644513.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] How to write R package
On 28.09.2012 14:22, Duncan Murdoch wrote:

On 27/09/2012 5:15 PM, Dr. Alireza Zolfaghari wrote:

Hi List, Would you please send me a good link to talk me through how to write an R package?

See the ?package.skeleton help page. After you have run it, follow the instructions in the Read-and-delete-me file that it will create. For full details, see the Writing R Extensions manual. For modifying the package after you've finished the Read-and-delete-me instructions, just manually add *.R files where the rest of them are, and use the prompt() function to produce skeleton documentation. That's about it, but you can read more if you like in a tutorial I gave a few years ago at a UseR meeting in Dortmund: http://www.statistik.uni-dortmund.de/useR-2008/slides/Murdoch.pdf

... and there are others who gave talks or tutorials about it (including myself). Nevertheless, I'd recommend looking into the manual Writing R Extensions, which is updated with R and with the changes in the package-related mechanisms, while all our talks and tutorials won't get updated. Probably Duncan's is still correct, but I want to make this remark for the list's archives.

Best, Uwe Ligges

Duncan Murdoch
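As a rough illustration of the first step described above, the sketch below runs package.skeleton() on a single made-up function. The package name demoPkg, the function add_one and the use of tempdir() are all assumptions chosen so the example does not touch the working directory.

```r
# Hypothetical one-function package; the names demoPkg and add_one are made up
env <- new.env()
env$add_one <- function(x) x + 1
pkg_parent <- tempdir()
package.skeleton(name = "demoPkg", list = "add_one", environment = env,
                 path = pkg_parent, force = TRUE)
# Inspect what was generated (DESCRIPTION, R/, man/, Read-and-delete-me, ...)
pkg_files <- list.files(file.path(pkg_parent, "demoPkg"), recursive = TRUE)
```

From there the Read-and-delete-me file lists the remaining manual steps, and Writing R Extensions stays the authoritative reference.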
Re: [R] Simple Question
Thanks a ton Berend. That worked like a charm.. R comes with thousands of Sweet Surprises every day.

Bhupendrasinh Thakre

On Sep 28, 2012, at 12:00 PM, Berend Hasselman b...@xs4all.nl wrote:

On 28-09-2012, at 18:40, Bhupendrasinh Thakre vickytha...@gmail.com wrote:

Hi Everyone, Sorry for coming back again with a new problem. Editing question, session info and data so you don't have to scroll till the end of page. *Situation:* I have a data frame and its name is df. Now I want to add a time stamp to the end of the *name of the data frame, i.e. df_system_time*. Previously it was running great, thanks to Dr. Winsemius, Kimmo and Pascal, and I believe that was because the function I used was scalar.

*Data:*

    dput(df)
    structure(list(x = 1:10, y = 1:10), .Names = c("x", "y"), row.names = c(NA, -10L), class = "data.frame")

You have been given the answer. It only needs a minor variation:

    newname.df <- paste0("df_", format(Sys.time(), "%Y_%m_%d_%H_%M_%S"))
    assign(newname.df, df)

and if you wish

    rm(list=c('df','newname.df'))

Or install package memisc (found by doing findFn("rename") from package sos) and use function rename(); I have not tried this.

Berend
Re: [R] Simple Question
On Fri, Sep 28, 2012 at 11:15 AM, Bhupendrasinh Thakre vickytha...@gmail.com wrote:

Thanks a ton Berend. That worked like a charm.. R comes with thousands of Sweet Surprises every day.

-- Not for those who read the docs. :-o

-- Bert

Bhupendrasinh Thakre

On Sep 28, 2012, at 12:00 PM, Berend Hasselman b...@xs4all.nl wrote:

On 28-09-2012, at 18:40, Bhupendrasinh Thakre vickytha...@gmail.com wrote:

Hi Everyone, Sorry for coming back again with a new problem. Editing question, session info and data so you don't have to scroll till the end of page. *Situation:* I have a data frame and its name is df. Now I want to add a time stamp to the end of the *name of the data frame, i.e. df_system_time*. Previously it was running great, thanks to Dr. Winsemius, Kimmo and Pascal, and I believe that was because the function I used was scalar.

*Data:*

    dput(df)
    structure(list(x = 1:10, y = 1:10), .Names = c("x", "y"), row.names = c(NA, -10L), class = "data.frame")

You have been given the answer. It only needs a minor variation:

    newname.df <- paste0("df_", format(Sys.time(), "%Y_%m_%d_%H_%M_%S"))
    assign(newname.df, df)

and if you wish

    rm(list=c('df','newname.df'))

Or install package memisc (found by doing findFn("rename") from package sos) and use function rename(); I have not tried this.

Berend

-- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
Re: [R] How to test if there is a subvector in a longer vector
Thank you!

___
From: Berend Hasselman [b...@xs4all.nl]
Sent: 28 September 2012 10:47
To: Atte Tenkanen
Cc: R help
Subject: Re: [R] How to test if there is a subvector in a longer vector

On 28-09-2012, at 07:41, Atte Tenkanen atte...@utu.fi wrote:

Sorry. I should have mentioned that the order of the components is important. So c(1,4,6) is accepted as a subvector of c(2,1,1,4,6,3), but not of c(2,1,1,6,4,3). How to test this?

See this discussion for a variety of solutions. http://r.789695.n4.nabble.com/matching-a-sequence-in-a-vector-td4389523.html#a4393453

Berend
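For readers without access to the linked discussion, here is one possible sketch of such a test. It assumes that "subvector" means a contiguous run in the same order, which fits both examples in the question but is an interpretation; the function name is_subvector is made up and this is not one of the solutions from the linked thread.

```r
# Hypothetical helper: TRUE if 'needle' occurs as a contiguous run in 'haystack'
is_subvector <- function(needle, haystack) {
  n <- length(needle)
  m <- length(haystack)
  if (n == 0) return(TRUE)      # the empty vector is trivially contained
  if (n > m) return(FALSE)      # a longer vector cannot be contained
  starts <- seq_len(m - n + 1)  # every position where a match could begin
  any(vapply(starts,
             function(s) all(haystack[s:(s + n - 1)] == needle),
             logical(1)))
}

accepted <- is_subvector(c(1, 4, 6), c(2, 1, 1, 4, 6, 3))
rejected <- is_subvector(c(1, 4, 6), c(2, 1, 1, 6, 4, 3))
```

The first call finds 1, 4, 6 at positions 3 to 5; the second does not, because 6 and 4 are swapped. The sliding-window comparison is simple rather than fast, which should be fine for short vectors.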
Re: [R] Arules - predict function issues - subscript out of bounds
Hi Ankur,

I am running into the exact same issue you have described above. Were you able to find out why it didn't work on your data set and resolve it? If yes, could you share?

Many thanks and regards, Alice

-- View this message in context: http://r.789695.n4.nabble.com/Arules-predict-function-issues-subscript-out-of-bounds-tp4634422p4644546.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] [R-sig-hpc] Quickest way to make a large empty file on disk?
Hello,

I've written a function to try to answer your original request, but I've run into a problem. See the end of this message. In the meantime, my comments are inline.

On 28-09-2012 17:44, Jonathan Greenberg wrote:

Rui: Quick follow-up -- it looks like seek does do what I want (I see Simon suggested it some time ago) -- what do you mean by "trash your disk"?

Nothing special, just that sometimes there are good ways of doing so. mmap seems to be safe.

What I'm trying to accomplish is getting parallel, asynchronous writes to a large binary image (just a binary file) working. Each node writes to a different sector of the file via mmap, filling in the values as the process runs, but the file needs to be pre-created before I can mmap it. Running a writeBin with a bunch of 0s would mean I'd basically have to write the file twice, but the seek/ff trick seems to be much faster. Do I risk doing some damage to my filesystem if I use seek? I see there is a strongly worded warning in the help for ?seek: "Use of seek on Windows is discouraged. We have found so many errors in the Windows implementation of file positioning that users are advised to use it only at their own risk, and asked not to waste the *R* developers' time with bug reports on Windows' deficiencies." -- there's no detail here on which errors people have experienced, so I'm not sure if doing something as simple as just creating a file using seek falls under the "discouraging" category.

I'm not a great system programmer, but 20+ years of using seek on Windows have shown nothing of the sort. In fact, I've just found a problem with ubuntu 12.04: where seek gives the expected result on Windows, on ubuntu it goes up to a certain point and then stops seeking, or whatever is happening. I installed ubuntu very recently, so I really don't know the reason for the behavior that you can see in the example run below. But I do know that Windows 7 is causing no problem, as expected.
As a note, we are trying to work this up on both Windows and *nix systems, hence our wanting to have a single approach that works on both OSs. --j

    #
    # Function: creates a file of ascii nulls using seek/writeBin. File size can be big.
    #
    createBig <- function(filename, size){
        if(size == 0) return(0)
        chunk <- .Machine$integer.max
        nchunks <- as.integer(size / chunk)
        rest <- size - as.double(nchunks)*as.double(chunk)
        fl <- file(filename, open = "wb")
        for(i in seq_len(nchunks)){
            seek(fl, where = chunk - 1, origin = "current", rw = "write")
            writeBin(raw(1), fl)
            # -- debug --
            print(seek(fl, where = NA))
        }
        if(rest > 0){
            seek(fl, where = rest - 1, origin = "current", rw = "write")
            writeBin(raw(1), fl)
        }
        close(fl)
    }

As you can see from the debug prints, on Windows 7 everything works as planned, while on ubuntu 12.04 seek stops seeking when it reaches 17Gb. The increments in file size become 1 byte at a time, explained by the writeBin instruction. (The different, slightly larger, size is irrelevant; the code was run several times, all with the same result: at 17179869176 bytes it no longer works.)
    #
    # System: Windows 7 / R 2.15.1
    #
    size <- 10*.Machine$integer.max + sample(.Machine$integer.max, 1)
    size
    [1] 22195364413
    createBig("Test.txt", size)
    [1] 2147483647
    [1] 4294967294
    [1] 6442450941
    [1] 8589934588
    [1] 10737418235
    [1] 12884901882
    [1] 15032385529
    [1] 17179869176
    [1] 19327352823
    [1] 21474836470
    file.info("Test.txt")$size
    [1] 22195364413
    file.info("Test.txt")$size %/% .Machine$integer.max
    [1] 10
    file.info("Test.txt")$size %% .Machine$integer.max
    [1] 720527943
    sessionInfo()
    R version 2.15.1 (2012-06-22)
    Platform: i386-pc-mingw32/i386 (32-bit)
    locale:
    [1] LC_COLLATE=Portuguese_Portugal.1252 LC_CTYPE=Portuguese_Portugal.1252
    [3] LC_MONETARY=Portuguese_Portugal.1252 LC_NUMERIC=C
    [5] LC_TIME=Portuguese_Portugal.1252
    attached base packages:
    [1] stats graphics grDevices utils datasets methods base
    loaded via a namespace (and not attached):
    [1] fortunes_1.5-0

    #
    # System: ubuntu 12.04 Precise Pangolin / R 2.15.1
    #
    size <- 10*.Machine$integer.max + sample(.Machine$integer.max, 1)
    size
    [1] 23091487381
    createBig("Test.txt", size)
    [1] 2147483647
    [1] 4294967294
    [1] 6442450941
    [1] 8589934588
    [1] 10737418235
    [1] 12884901882
    [1] 15032385529
    [1] 17179869176
    [1] 17179869177
    [1] 17179869178
    file.info("Test.txt")$size
    [1] 17179869179
    file.info("Test.txt")$size %/% .Machine$integer.max
    [1] 8
    file.info("Test.txt")$size %% .Machine$integer.max
    [1] 3
    sessionInfo()
    R version 2.15.1 (2012-06-22)
    Platform: x86_64-pc-linux-gnu (64-bit)
    locale:
    [1] LC_CTYPE=pt_PT.UTF-8 LC_NUMERIC=C
    [3] LC_TIME=pt_PT.UTF-8 LC_COLLATE=pt_PT.UTF-8
    [5] LC_MONETARY=pt_PT.UTF-8 LC_MESSAGES=pt_PT.UTF-8
    [7] LC_PAPER=C LC_NAME=C
    [9] LC_ADDRESS=C LC_TELEPHONE=C
    [11]
Re: [R] changing outlier shapes of boxplots using lattice
On Fri, Sep 28, 2012 at 6:57 AM, Richard M. Heiberger r...@temple.edu wrote:

Elaine, For panel.bwplot you see that the central dot and the outlier dots are controlled by the same pch argument.

??? I don't think so...

    bwplot(rgamma(20,.1,1)~gl(2,10), pch=rep(17,2), panel = lattice::panel.bwplot)

I think you mean panel.bwplot.intermediate.hh? BTW thank you for the useful HH package, but in this case the OP is using it with no "at" argument, so why not

    Diet.colors <- c("forestgreen", "darkgreen", "chocolate1", "darkorange2", "sienna2", "red2", "firebrick3", "saddlebrown", "coral4", "chocolate4", "darkblue", "navy", "grey38")
    bwplot(rgamma(20*13,1,.1)~gl(13,20), fill = Diet.colors, pch = "|", par.settings = list(box.umbrella=list(lty=1)))

cheers

I initially set the pch="|" to match your first example with the horizontal indicator for the median. I would be inclined to use the default circle for the outliers and therefore also for the median. Rich

On Fri, Sep 28, 2012 at 7:13 AM, Sarah Goslee sarah.gos...@gmail.com wrote:

I would guess that if you find the bit that says pch="|" and change it to pch=1 it will solve your question, and that reading ?par will tell you why. Sarah

On Thursday, September 27, 2012, Elaine Kuo wrote:

Hello, This is Elaine. I am using package lattice to generate boxplots. Using Richard's code, the display was almost perfect except for the outlier shape. Based on the following code, the outliers are vertical lines. However, I want the outliers to be empty circles. Please kindly advise how to modify the code to change the outlier shapes. Thank you.

code

    library(lattice)
    library(HH)  # provides panel.bwplot.intermediate.hh
    dataN <- data.frame(GE_distance=rnorm(260), Diet_B=factor(rep(1:13, each=20)))
    Diet.colors <- c("forestgreen", "darkgreen", "chocolate1", "darkorange2", "sienna2", "red2", "firebrick3", "saddlebrown", "coral4", "chocolate4", "darkblue", "navy", "grey38")
    levels(dataN$Diet_B) <- Diet.colors
    bwplot(GE_distance ~ Diet_B, data=dataN,
           xlab=list("Diet of Breeding Ground", cex = 1.4),
           ylab = list("Distance between Centers of B and NB Range (1000 km)", cex = 1.4),
           panel=panel.bwplot.intermediate.hh,
           col=Diet.colors, pch=rep("|",13),
           scales=list(x=list(rot=90)),
           par.settings=list(box.umbrella=list(lty=1)))

-- Sarah Goslee http://www.stringpage.com http://www.sarahgoslee.com http://www.functionaldiversity.org
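A hedged sketch of one way to get a line for the median and open circles for the outliers. It relies on lattice's own panel.bwplot, where pch styles the median and the outliers take their symbol from the plot.symbol settings; panel.bwplot.intermediate.hh from HH may behave differently, so treat this as an assumption to check against that panel function. The data are simulated, as in the thread.

```r
library(lattice)
set.seed(1)
# Simulated stand-in for the poster's data (13 diet groups, 20 values each)
dataN <- data.frame(GE_distance = rgamma(260, 1, 0.1), Diet_B = gl(13, 20))
p <- bwplot(GE_distance ~ Diet_B, data = dataN,
            pch = "|",  # median drawn as a line instead of a dot
            par.settings = list(
              plot.symbol  = list(pch = 1, col = "black"),  # outliers as open circles
              box.umbrella = list(lty = 1)))                # solid whiskers
```

Printing `p` draws the plot. Keeping the two symbols in separate settings is what lets the median and the outliers differ.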
[R] Better way of Grouping?
Hello R users,

This is more of a convenience question that I hope others might find useful if there is a better answer. I work with large datasets that require multiple parsing stages for different analyses. For example, compare group 3 vs. group 4. A more complicated comparison would be time B in group 3 of group L with B in group 4 of group L. I normally subset each group with the following type of code.

    data = read(...)
    # L v D
    L = data[LvD %in% c("L"), ]
    D = data[LvD %in% c("D"), ]
    # Groups 3 and 4 within L and D
    group3L = L[group %in% c(3), ]
    group4L = L[group %in% c(3), ]
    group3D = D[group %in% c(3), ]
    group4D = D[group %in% c(3), ]
    # Times B, S45, FR2, FR8; you get the idea

Is there a more efficient way to subset groups? Thanks for any insight.

Regards,
Charles
Re: [R] [R-sig-hpc] Quickest way to make a large empty file on disk?
On Sep 28, 2012, at 12:44 PM, Jonathan Greenberg wrote:

> Rui:
>
> Quick follow-up -- it looks like seek does do what I want (I see Simon
> suggested it some time ago) -- what do you mean by "trash your disk"?

I can't speak for Rui, but the difference between seeking and an explicit write is that the FS can optimize the former by not actually writing anything to disk (which is why it's so fast on some OS/FS combos). However, this means that the layout on the disk may not be sequential, depending on the write patterns of the actual data blocks, because the FS may keep a mask of unused blocks and not write them. But that is just a FS issue and thus varies vastly by OS and FS. For your use this probably doesn't matter, as you probably don't need to stream the resulting file at the end.

> What I'm trying to accomplish is getting parallel, asynchronous writes
> to a large binary image (just a binary file) working. Each node writes
> to a different sector of the file via mmap, filling in the values as the
> process runs, but the file needs to be pre-created before I can mmap it.
> Running a writeBin with a bunch of 0s would mean I'd basically have to
> write the file twice, but the seek/ff trick seems to be much faster.
>
> Do I risk doing some damage to my filesystem if I use seek? I see there
> is a strongly worded warning in the help for ?seek:
>
> "Use of seek on Windows is discouraged. We have found so many errors in
> the Windows implementation of file positioning that users are advised to
> use it only at their own risk, and asked not to waste the R developers'
> time with bug reports on Windows' deficiencies."
>
> -- there's no detail here on which errors people have experienced, so
> I'm not sure if doing something as simple as just creating a file using
> seek falls under the "discouraging" category.

A quick search in my mail shows issues that were related to what Windows reports as the seek location on text files when querying. AFAICS it did not affect the side-effect of seek, which is what you're interested in.

Cheers,
Simon

> As a note, we are trying to work this up on both Windows and *nix
> systems, hence our wanting to have a single approach that works on both
> OSs.
>
> --j
>
> On Thu, Sep 27, 2012 at 3:49 PM, Rui Barradas ruipbarra...@sapo.pt wrote:
>
>> Hello,
>>
>> If you really need to trash your disk, why not use seek()?
>>
>>     > fl <- file("Test.txt", open = "wb")
>>     > seek(fl, where = 1024, origin = "start", rw = "write")
>>     [1] 0
>>     > writeChar(character(1), fl, nchars = 1, useBytes = TRUE)
>>     Warning message:
>>     In writeChar(character(1), fl, nchars = 1, useBytes = TRUE) :
>>       writeChar: more characters requested than are in the string - will zero-pad
>>     > close(fl)
>>
>> File Test.txt is now 1Kb in size.
>>
>> Hope this helps,
>> Rui Barradas
>>
>> Em 27-09-2012 20:17, Jonathan Greenberg wrote:
>>
>>> Folks:
>>>
>>> Asked this question some time ago, and found what appeared (at first)
>>> to be the best solution, but I'm now finding a new problem. First off,
>>> it seemed like ff as Jens suggested worked:
>>>
>>>     # outdata_ncells = the number of rows * number of columns *
>>>     # number of bands in an image:
>>>     out <- ff(vmode="double", length=outdata_ncells, filename=filename)
>>>     finalizer(out) <- close
>>>     close(out)
>>>
>>> This was working fine until I attempted to set length to a VERY large
>>> number: outdata_ncells = 17711913600. This would create a file that is
>>> 131.964GB. Big, but not obscenely so (and certainly not larger than
>>> the filesystem can handle). However, length appears to be restricted
>>> by .Machine$integer.max (I'm on a 64-bit windows box):
>>>
>>>     > .Machine$integer.max
>>>     [1] 2147483647
>>>
>>> Any suggestions on how to solve this problem for much larger file
>>> sizes?
>>>
>>> --j
>>>
>>> On Thu, May 3, 2012 at 10:44 AM, Jonathan Greenberg j...@illinois.edu wrote:
>>>
>>>> Thanks, all! I'll try these out. I'm trying to work up something that
>>>> is platform independent (if possible) for use with mmap. I'll do some
>>>> tests on these suggestions and see which works best. I'll try to
>>>> report back in a few days. Cheers!
>>>>
>>>> --j
>>>>
>>>> 2012/5/3 Jens Oehlschlägel jens.oehlschlae...@truecluster.com
>>>>
>>>>> Jonathan,
>>>>>
>>>>> On some filesystems (e.g. NTFS, see below) it is possible to create
>>>>> 'sparse' memory-mapped files, i.e. reserving the space without the
>>>>> cost of actually writing initial values. Package 'ff' does this
>>>>> automatically and also allows accessing the file in parallel. Check
>>>>> the example below and see how big file creation is immediate.
>>>>>
>>>>> Jens Oehlschlägel
>>>>>
>>>>>     > library(ff)
>>>>>     > library(snowfall)
>>>>>     > ncpus <- 2
>>>>>     > n <- 1e8
>>>>>     > system.time(
>>>>>     +   x <- ff(vmode="double", length=n, filename="c:/Temp/x.ff")
>>>>>     + )
>>>>>        User  System verstrichen
>>>>>        0.01    0.00    0.02
>>>>>     > # check finalizer, with an explicit filename we should have a
>>>>>     > # 'close' finalizer
>>>>>     > finalizer(x)
>>>>>     [1] "close"
>>>>>     > # if not, set it to 'close' in order to not let slaves delete x
>>>>>     > # on slave shutdown
>>>>>     > finalizer(x) <- "close"
>>>>>     > sfInit(parallel=TRUE, cpus=ncpus, type="SOCK")
>>>>>     R
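The seek-then-write idea discussed in this thread can be sketched in base R as follows. This is only a minimal illustration, not code from any of the posters: the helper name `preallocate_file` is made up here, it writes a single zero byte with `writeBin()` rather than using `writeChar()`, and whether the resulting file is actually sparse on disk depends on the OS and filesystem, as Simon notes. The warning in ?seek about Windows still applies.

```r
# Hypothetical helper (not from the thread): create a file of `size_bytes`
# bytes by seeking to the position of the last byte and writing one zero.
preallocate_file <- function(path, size_bytes) {
  stopifnot(size_bytes >= 1)
  con <- file(path, open = "wb")
  on.exit(close(con))
  # `where` is an ordinary numeric (double), so it is not limited to
  # .Machine$integer.max the way an integer length would be.
  seek(con, where = size_bytes - 1, origin = "start", rw = "write")
  writeBin(raw(1), con)  # a single 0x00 byte at the final offset
  invisible(path)
}

tmp <- tempfile(fileext = ".bin")
preallocate_file(tmp, 1024)
file.size(tmp)  # size in bytes as reported by the filesystem
unlink(tmp)
```

The file can then be opened by whatever memory-mapping tool the workers use; that part is outside this sketch.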
[R] Select Original and Duplicates
I would like to select all the duplicate rows of a data frame, including the original. Any help would be much appreciated. This is where I'm at so far. Thanks.

    # Sample data frame:
    df <- read.table(header=TRUE, con <- textConnection('
     label value
     A 4
     B 3
     C 6
     B 3
     B 1
     A 2
     A 4
     A 4
    '))
    close(con)

    # Duplicate entries
    df[duplicated(df),]
    # label value
    #  B 3
    #  A 4
    #  A 4

    # I want to select all the rows that are duplicated including the original
    # This is the output I want
    # label value
    #  B 3
    #  B 3
    #  A 4
    #  A 4
    #  A 4
Re: [R] Select Original and Duplicates
Hello,

Try the following.

    idx <- duplicated(df) | duplicated(df, fromLast = TRUE)
    df[idx, ]

Note that they are returned in their original order in the df.

Hope this helps,

Rui Barradas
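To see why combining the two calls works, here is a small self-contained sketch built on the sample data from the question (the data frame is rebuilt with `data.frame()` instead of `read.table()`, and the intermediate names are only illustrative). `duplicated()` flags every copy after the first, while `fromLast = TRUE` flags every copy before the last, so their union covers all members of a duplicated set.

```r
df <- data.frame(label = c("A", "B", "C", "B", "B", "A", "A", "A"),
                 value = c(4, 3, 6, 3, 1, 2, 4, 4))

dup_forward  <- duplicated(df)                   # second and later copies
dup_backward <- duplicated(df, fromLast = TRUE)  # all copies except the last

res <- df[dup_forward | dup_backward, ]
res  # rows 1, 2, 4, 7 and 8, in their original order
```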
Re: [R] Better way of Grouping?
You have not specified the objective function you are trying to optimize with your term "efficient", or what you do with all of these subsets once you have them.

For notational simplification and completeness of coverage (not necessarily computational speedup) you might want to look at tapply or ddply/dlply from the plyr package. If you build lists of subsets you can index into them according to grouping value. You can use expand.grid to build all permutations of grouping values to use as indexes into those lists of subsets.

To reiterate, you have not indicated what you want to do with these subsets, so there could be special-purpose functions that do what you want. As always, reproducible code leads to reproducible answers. :)

Jeff Newmiller
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
Sent from my phone. Please excuse my brevity.
[R] Merging multiple columns into one column
Good Evening-

I have a dataframe that has 10 columns, each with a header and 7306 rows, and I want to combine these columns into one. I utilized the stack function but it only returned 3/4 of the data. My code is below, where nfcuy_bw is the dataframe with 7305 obs. and 10 variables. Once I apply this code I only receive a data frame with 58440 obs. of 2 variables, when there should be 73,050 obs. of 2 variables. Just wondering what is happening here?

    View(nfcuy_bw)
    attach(nfcuy_bw)
    cuyahoga_nf <- data.frame(s5, s10, s25, s27, s33, s41, s51, his_c)
    cuy_nf <- stack(cuyahoga_nf)

Thanks
Meredith

--
Doctoral Candidate
Department of Civil and Environmental Engineering
Michigan Technological University
Re: [R] Better way of Grouping?
On Sep 28, 2012, at 11:59 AM, Charles Determan Jr wrote:

> Hello R users,
>
> This is more of a convenience question that I hope others might find
> useful if there is a better answer. I work with large datasets that
> requires multiple parsing stages for different analysis. For example,
> compare group 3 vs. group 4. A more complicated comparison would be time
> B in group 3 of group L with B in group 4 of group L. I normally subset
> each group with the following type of code.
>
>     data=read(...)
>     #L v D
>     L=data[LvD %in% c("L"),]
>     D=data[LvD %in% c("D"),]
>     #Groups 3 and 4 within L and D
>     group3L=L[group %in% c(3),]
>     group4L=L[group %in% c(3),]

Assume you meant to have a 4 there.

>     group3D=D[group %in% c(3),]
>     group4D=D[group %in% c(3),]

Ditto. Only makes sense with a 4.

The usual way is to use:

    lapply( split(data, interaction(data$LvD, data$group)),
            function(subdf) {do something with subdf} )

That way you do not end up littering your workspace with subsidiary subsets of your main data object.

>     #Times B, S45, FR2, FR8 you get the idea
>
> Is there a more efficient way to subset groups? Thanks for any insight.

--
David Winsemius, MD
Alameda, CA, USA
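As a concrete illustration of the split()/lapply() pattern, here is a minimal sketch on made-up data. The column names `LvD` and `group` follow the question; the `value` column and the per-subset mean are placeholders standing in for whatever analysis is actually needed.

```r
set.seed(1)
# Made-up data: two LvD levels crossed with two groups, four rows each.
dat <- data.frame(
  LvD   = rep(c("L", "D"), each = 8),
  group = rep(c(3, 4), times = 8),
  value = rnorm(16)
)

# One list element per LvD-by-group combination, named like "L.3", "D.4".
pieces <- split(dat, interaction(dat$LvD, dat$group))
names(pieces)

# Apply the same function to every subset instead of creating
# group3L, group4L, ... by hand.
subset_means <- sapply(pieces, function(subdf) mean(subdf$value))

# A single subset can still be pulled out by name when needed.
pieces[["L.3"]]
```

Keeping the subsets in one named list makes it easy to loop over comparisons later, for example by indexing with combinations produced by `expand.grid()`.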
Re: [R] Merging multiple columns into one column
On Sep 28, 2012, at 2:51 PM, Meredith Ballard LaBeau wrote:

> Good Evening- I have a dataframe that has 10 columns that has a header
> and 7306 rows in each column, I want to combine these columns into one.
> I utilized the stack function but it only returned 3/4 of the data...my
> code is: where nfcuy_bw is the dataframe with 7305 obs. and 10 variables
> Once I apply this code I only receive a data frame with 58440 obs. of 2
> variables, of which there should be 73,050 obs. of 2 variables, just
> wondering what is happening here?
>
>     View(nfcuy_bw)
>     attach(nfcuy_bw)

Using 'attach' is a great way to produce confusing errors.

>     cuyahoga_nf <- data.frame(s5, s10, s25, s27, s33, s41, s51, his_c)
>     cuy_nf <- stack(cuyahoga_nf)

Unable to do much else in the absence of a dataset, much less a summary of these objects, whose creation is your responsibility, not ours.

--
David Winsemius, MD
Alameda, CA, USA
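Without the data this cannot be confirmed, but the numbers in the post are consistent with one simple explanation: the data.frame() call lists only eight columns (s5 through his_c), and 8 x 7305 = 58440, whereas ten columns would give 73050. The sketch below uses made-up numeric data (the default V1..V10 column names are placeholders, not the poster's variables) to show that stack() returns one row per value in the columns it is given.

```r
n <- 7305
# Made-up stand-in for nfcuy_bw: 10 numeric columns named V1..V10.
full <- as.data.frame(matrix(rnorm(n * 10), nrow = n))
part <- full[, 1:8]   # only eight of the ten columns

nrow(stack(full))  # 73050 = 10 * 7305
nrow(stack(part))  # 58440 =  8 * 7305
```

If that is what happened, passing all ten columns, for example `stack(nfcuy_bw)` directly, would be one way to get the expected number of rows, assuming every column is a plain vector.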
[R] Heatmap Colors
Hello R-Users!

I'm using a heatmap to visualize a matrix of values between -1 and 3. How can I set the colors so that white is zero, below zero is blue of increasing intensity towards -1, and above zero is red of increasing intensity towards 3?

I tried like this (using the marray and gplots packages from bioconductor):

    mcol <- maPalette(low="blue", mid="white", high="red", k=100)
    heatmap.2(my_matrix, col=mcol)

But white does not correspond to zero, because the value distribution is not symmetrical, so that zero is not in the middle. Is it somehow possible to create a color palette with white centered at zero?

Nick Fankhauser
Re: [R] Merging multiple columns into one column
?unlist

(A data frame is a list, as ?data.frame explains. Also the Intro to R tutorial, which should be read by everyone beginning with R.)

-- Bert

--
Bert Gunter
Genentech Nonclinical Biostatistics
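A tiny sketch of the unlist() suggestion, using a made-up three-column data frame. Because a data frame is a list of columns, unlist() concatenates the columns one after another into a single vector; the `ind` column rebuilt here is only an illustration of how to keep track of which column each value came from.

```r
d <- data.frame(a = 1:3, b = 4:6, c = 7:9)

# Concatenate all columns, column by column, into one vector.
v <- unlist(d, use.names = FALSE)
v

# Optionally keep the source column alongside each value.
long <- data.frame(values = v,
                   ind = rep(names(d), each = nrow(d)))
long
```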
Re: [R] changing outlier shapes of boxplots using lattice
Hello Ilai,

Thank you for the response. It did help a lot. However, a beginner to lattice has three questions.

Q1. Please kindly explain why "in this case OP is using it with no at argument", so it is possible to display the median and the outliers with different pch?

Q2. What is the relationship between package HH and graphic-drawing? I checked ??HH and found little explanation on its function of graphic-drawing.

Q3. Please kindly advise how to make outliers empty circle (pch=2) in this case as the code below.

Thank you.

code

    Diet.colors <- c("forestgreen", "darkgreen", "chocolate1", "darkorange2",
                     "sienna2", "red2", "firebrick3", "saddlebrown", "coral4",
                     "chocolate4", "darkblue", "navy", "grey38")
    levels(dataN$Diet_B) <- diet.code
    bwplot(MS_midpoint_lat ~ Diet_B, data = dataN,
           xlab = list("Diet of Breeding Ground", cex = 1.4),
           ylab = list("Latitudinal Midpoint Breeding Ground", cex = 1.4),
           lwd = 1.5, cex.lab = 1.4, cex.axis = 1.2, font.axis = 2,
           cex = 1.5, las = 1,
           panel = panel.bwplot.intermediate.hh, bty = "l",
           col = Diet.colors, pch = rep("l", 13),
           scales = list(x = list(rot = 90)),
           par.settings = list(plot.symbol = list(pch = 2, cex = 2),
                               box.umbrella = list(lty = 1)))

Elaine
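For the default lattice panel function (not HH's panel.bwplot.intermediate.hh, whose behaviour may differ), the `pch` argument sets the median marker, while the outlier symbol comes from the `plot.symbol` graphical settings. The sketch below therefore assumes the plain `panel.bwplot` and simulated data as in the original question; note that in R `pch = 1` is the open circle, while `pch = 2` is an open triangle.

```r
library(lattice)

set.seed(1)
dataN <- data.frame(GE_distance = rnorm(260),
                    Diet_B = factor(rep(1:13, each = 20)))

p <- bwplot(GE_distance ~ Diet_B, data = dataN,
            pch = "|",                                    # median drawn as a line
            par.settings = list(
              plot.symbol  = list(pch = 1, col = "black"),  # outliers: open circles
              box.umbrella = list(lty = 1)))                # solid whiskers

# print(p)  # uncomment to draw the plot on the current graphics device
```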
Re: [R] Heatmap Colors
On Sep 28, 2012, at 3:16 PM, Nick Fankhauser wrote:

> Is it somehow possible to create a color palette with white centered at
> zero?

The way you stated it at the beginning, I thought you should want the palette centered at 1 rather than 0:

    test <- seq(-1, 3, len=20)
    shift.BR <- colorRamp(c("blue", "white", "red"), bias=2)((1:16)/16)
    tpal <- rgb(shift.BR, maxColorValue=255)
    barplot(test, col = tpal)

Perhaps I was being led astray by a somewhat similar question on StackOverflow.

--
David Winsemius, MD
Alameda, CA, USA
Re: [R] Heatmap Colors
On Sep 28, 2012, at 4:52 PM, David Winsemius wrote:

> The way you stated it at the beginning, I thought you should want the
> palette centered at 1 rather than 0:

Oops ... should have the number of breaks match the number of colors:

    test <- seq(-1, 3, len=20)
    shift.BR <- colorRamp(c("blue", "white", "red"), bias=2)((1:20)/20)
    tpal <- rgb(shift.BR, maxColorValue=255)
    barplot(test, col = tpal)

--
David Winsemius, MD
Alameda, CA, USA
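Another way to pin white at zero for an asymmetric range such as -1 to 3 is to build the breaks and the colors in two pieces that meet at zero. This is a sketch using only base grDevices; the bin counts (25 below zero, 75 above) are an arbitrary choice proportional to the range, and the commented heatmap.2() call assumes gplots is installed and that `my_matrix` exists.

```r
# Breaks: evenly spaced within each side, with 0 as an explicit break point.
neg_breaks <- seq(-1, 0, length.out = 26)   # 25 bins below zero
pos_breaks <- seq(0, 3, length.out = 76)    # 75 bins above zero
breaks <- c(neg_breaks, pos_breaks[-1])     # 101 breaks, 100 bins

# Colors: blue fading to white below zero, white deepening to red above.
neg_cols <- colorRampPalette(c("blue", "white"))(25)
pos_cols <- colorRampPalette(c("white", "red"))(75)
cols <- c(neg_cols, pos_cols)               # one color per bin

# With gplots this could be passed as, for example:
# heatmap.2(my_matrix, col = cols, breaks = breaks)
```

The number of colors must be one less than the number of breaks, which is the same constraint noted in the correction above.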
[R] Errors in if statement
Hi guys,

I have many rows (1000) and columns (30) of geno matrix. I use the following loop and condition statement (adapted from someone else's code). I always get the error below. I was wondering if anyone knows what the problem is and how to fix it.

Thanks,
Zhengyu

    ### geno matrix
    P1 P2 P3 P4 1 2 2 3 2 2 2 2 1 1 1 2 1 2 NANA 2 3 4 5
    ###
    for(i in 1:4) {
      cat(i,)
      if(sum(geno[i,]!=2)3 sum(geno[i,]==1)=1 sum(geno[i,]==3)=1){
        tmp = 1
      }
    }
    ###
    1 2 Error in if (sum(geno[i, ] != 2) 3 sum(geno[i, ] == 1) = 1 sum(geno[i, :
      missing value where TRUE/FALSE needed
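The error message "missing value where TRUE/FALSE needed" is what R reports when the condition of `if` evaluates to NA. Comparisons involving NA give NA, and sum() of a vector containing NA is NA unless `na.rm = TRUE` is set. The sketch below uses a small made-up matrix, and because the comparison operators were lost in the post, the ones used here (`<`, `>=`, `&&`) are assumptions rather than the poster's actual condition.

```r
# Made-up genotype matrix with one missing value in row 2.
geno <- matrix(c(1,  2, 2, 3,
                 2, NA, 2, 2,
                 1,  1, 3, 2), nrow = 3, byrow = TRUE)

sum(geno[2, ] != 2)                # NA: this is what breaks if()
sum(geno[2, ] != 2, na.rm = TRUE)  # 0: missing entries are skipped

flag <- logical(nrow(geno))
for (i in seq_len(nrow(geno))) {
  g <- geno[i, ]
  n_not2  <- sum(g != 2, na.rm = TRUE)
  n_one   <- sum(g == 1, na.rm = TRUE)
  n_three <- sum(g == 3, na.rm = TRUE)
  # Assumed operators; substitute the ones from the original code.
  flag[i] <- n_not2 < 3 && n_one >= 1 && n_three >= 1
}
flag
```

Whether skipping missing genotypes is the right treatment depends on the analysis; an alternative is to test `anyNA(g)` first and handle those rows separately.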
Re: [R] Better way of Grouping?
Hi,

You can also use grep() to subset:

    LD <- paste0(rep(rep(c(3,4), each=4), 2), c(rep("L",8), rep("D",8)))
    set.seed(1)
    dat1 <- data.frame(LD=LD, value=sample(1:15, 16, replace=TRUE))
    dat2 <- within(dat1, {LD <- as.character(LD)})
    dat2[grepl(".*L", dat2$LD),]   # subset all L values
    dat2[grepl(".*D", dat2$LD),]   # subset all D values
    dat2[grepl("3D", dat2$LD),]
    dat2[grepl("4D", dat2$LD),]

A.K.
[R] Text mining? Text manipulation? Both? Predicting KRAS test results in cancer patients
Happy Friday Everyone, Hope Friday afternoon doesn't turn out to be a terrible time to post a question. I've been doing a little data mining of patient text medical records as of late. I started out trying to predict whether or not cancer patients had received KRAS mutation testing and did quite well with that. Now I'm trying to predict the results of KRAS testing (mutated vs. wild type). This is proving to be a little more difficult. With the first classification task, I created counts of terms (e.g., kras, mutated) in the text medical records using the tm package and then used those counts to predict whether or not patients had had KRAS mutation testing. I tried a few different analyses here, but found that random forests worked the best. Predicting the results of testing is harder though because of the way physicians and other healthcare professionals write about testing. For example, I'm finding phrases like KRAS mutation returned wild-type. In this example, if we're counting, we get 1 instance of kras, 1 instance of mutated, and one instance of wild. So you can see how it might be difficult to accurately predict the results of testing based on counts alone. My question is how best to deal with this. Are there any R text mining packages or related software that would be particularly suited to my problem? I took a look at the CRAN Task View: Natural Language Processing and there were so many options I didn't really know where to start (and it's not even clear that an R-based solution will work best for my problem). Alternatively, is there any real chance one could simply write code that would be able to identify true references to the results of KRAS testing and then create counts only of what are likely to be true references? I'd greatly appreciate it if someone could point me in the right direction. 
Thanks,
Paul
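On the question of whether simple code could separate true result statements from mentions of the test name, here is a deliberately naive rule-based sketch in base R. Everything in it is hypothetical: the function name `classify_kras`, the patterns, and the example notes are invented for illustration, and real clinical text would need far more careful handling (negation, multiple tests per note, abbreviations). The idea is only to strip the name of the test first, so that "KRAS mutation returned wild-type" is not counted as a mention of a mutated result.

```r
# Hypothetical rule-based classifier; patterns are illustrative only.
classify_kras <- function(txt) {
  txt <- tolower(txt)
  if (!grepl("kras", txt, fixed = TRUE)) return(NA_character_)
  # Remove the test name so "kras mutation", "kras mutation testing",
  # "kras mutational analysis" etc. do not count as a result.
  txt <- gsub("kras mutation(al)?( test(ing)?| analysis| status)?", "kras", txt)
  wild    <- grepl("wild[ -]?type", txt)
  mutated <- grepl("mutat", txt)
  if (wild && !mutated) {
    "wild type"
  } else if (mutated && !wild) {
    "mutated"
  } else {
    "unclear"
  }
}

# Made-up example notes.
notes <- c("KRAS mutation returned wild-type.",
           "KRAS testing showed the tumor is mutated (G12D).",
           "Patient seen for routine follow-up.")
res <- vapply(notes, classify_kras, character(1), USE.NAMES = FALSE)
res
```

Labels produced this way could be used directly, or added as extra features alongside the term counts already fed to the random forest.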
Re: [R] Select Original and Duplicates
That works. Thank you!
Re: [R] Select Original and Duplicates
Hi,

You can also try:

    idx <- data.frame(t(sapply(df, function(x) !is.na(match(x, x[duplicated(x)])))))
    df1 <- df[sapply(idx, function(x) all(x == TRUE)), ]
    df1
    #  label value
    #1     A     4
    #2     B     3
    #4     B     3
    #7     A     4
    #8     A     4

A.K.
[R] Converting array to matrix
Hi,

I have a 3d array as below. I want to turn this array into a matrix with p = 50 rows and n = 20 columns holding the coverage values. The code before the array is:

    library(binom)
    ## Loading required package: lattice
    pi.seq <- seq(from = 0.01, to = 0.5, by = 0.01)
    no.seq <- seq(from = 5, to = 100, by = 5)
    cp.all <- binom.coverage(p = pi.seq, n = no.seq, conf.level = 0.95, method = "exact")

I basically want to plot this probability with filled.contour.

Many thanks.

    method    p   n  coverage
1    exact 0.01   5 0.9990199
2    exact 0.01  10 0.9957338
3    exact 0.01  15 0.9903702
4    exact 0.01  20 0.9831407
5    exact 0.01  25 0.9980493
6    exact 0.01  30 0.9966823
7    exact 0.01  35 0.9948463
8    exact 0.01  40 0.9925026
9    exact 0.01  45 0.9896219
10   exact 0.01  50 0.9861827
11   exact 0.01  55 0.9821712
12   exact 0.01  60 0.9775798
13   exact 0.01  65 0.9958308
14   exact 0.01  70 0.9945711
15   exact 0.01  75 0.9930800
16   exact 0.01  80 0.9913408
17   exact 0.01  85 0.9893386
18   exact 0.01  90 0.9870598
19   exact 0.01  95 0.9844924
20   exact 0.01 100 0.9816260
21   exact 0.02   5 0.9961576
22   exact 0.02  10 0.9838224
23   exact 0.02  15 0.9969606
24   exact 0.02  20 0.9929313
25   exact 0.02  25 0.9867566
26   exact 0.02  30 0.9782822
27   exact 0.02  35 0.9948918
28   exact 0.02  40 0.9917591
29   exact 0.02  45 0.9875780
30   exact 0.02  50 0.9822419
31   exact 0.02  55 0.9756698
32   exact 0.02  60 0.9929754
33   exact 0.02  65 0.9902072
34   exact 0.02  70 0.9867702
35   exact 0.02  75 0.9826010
36   exact 0.02  80 0.9776446
37   exact 0.02  85 0.9927058
38   exact 0.02  90 0.9904482
39   exact 0.02  95 0.9877327
40   exact 0.02 100 0.9845164
41   exact 0.03   5 0.9915279
42   exact 0.03  10 0.9972351
43   exact 0.03  15 0.9906286
44   exact 0.03  20 0.9789916
45   exact 0.03  25 0.9938142
46   exact 0.03  30 0.9880954
47   exact 0.03  35 0.9797802
48   exact 0.03  40 0.9933299
49   exact 0.03  45 0.9890462
50   exact 0.03  50 0.9831894
51   exact 0.03  55 0.9755598
52   exact 0.03  60 0.9908560
53   exact 0.03  65 0.9866943
54   exact 0.03  70 0.9813629
55   exact 0.03  75 0.9926775
56   exact 0.03  80 0.9896911
57   exact 0.03  85 0.9859049
58   exact 0.03  90 0.9812172
59   exact 0.03  95 0.9755343
60   exact 0.03 100 0.9893762
61   exact 0.04   5 0.9852420
62   exact 0.04  10 0.9937863
63   exact 0.04  15 0.9797082
64   exact 0.04  20 0.9925871
65   exact 0.04  25 0.9834784
66   exact 0.04  30 0.9936800
67   exact 0.04  35 0.9877867
68   exact 0.04  40 0.9789777
69   exact 0.04  45 0.9912599
70   exact 0.04  50 0.9855896
71   exact 0.04  55 0.9777638
72   exact 0.04  60 0.9901122
73   exact 0.04  65 0.9849824
74   exact 0.04  70 0.9781965
75   exact 0.04  75 0.9897956
76   exact 0.04  80 0.9852643
77   exact 0.04  85 0.9794261
78   exact 0.04  90 0.9899813
79   exact 0.04  95 0.9653302
80   exact 0.04 100 0.9641378
81   exact 0.05   5 0.9774075
82   exact 0.05  10 0.9884964
83   exact 0.05  15 0.9945327
84   exact 0.05  20 0.9840985
85   exact 0.05  25 0.9928351
86   exact 0.05  30 0.9843645
87   exact 0.05  35 0.9927483
88   exact 0.05  40 0.9861231
89   exact 0.05  45 0.9761385
90   exact 0.05  50 0.9882136
91   exact 0.05  55 0.9806825
92   exact 0.05  60 0.9902109
93   exact 0.05  65 0.9844774
94   exact 0.05  70 0.9766393
95   exact 0.05  75 0.9662306
96   exact 0.05  80 0.9650815
97   exact 0.05  85 0.9772934
98   exact 0.05  90 0.9755923
99   exact 0.05  95 0.9718140
100  exact 0.05 100 0.9826071
101  exact 0.06   5 0.9980297
102  exact 0.06  10 0.9811622
103  exact 0.06  15 0.9896401
104  exact 0.06  20 0.9943659
105  exact 0.06  25 0.9849507
106  exact 0.06  30 0.9920548
107  exact 0.06  35 0.9831689
108  exact 0.06  40 0.9909419
109  exact 0.06  45 0.9829932
110  exact 0.06  50 0.9906217
111  exact 0.06  55 0.9836566
112  exact 0.06  60 0.9663670
113  exact 0.06  65 0.9668145
114  exact 0.06  70 0.9630279
115  exact 0.06  75 0.9763348
116  exact 0.06  80 0.9716289
117  exact 0.06  85 0.9820840
118  exact 0.06  90 0.9772655
119  exact 0.06  95 0.9687703
120  exact 0.06 100 0.9680765
121  exact 0.07   5 0.9969201
122  exact 0.07  10 0.9964239
123  exact 0.07  15 0.9824673
124  exact 0.07  20 0.9892932
125  exact 0.07  25 0.9934691
126  exact 0.07  30 0.9837683
127  exact 0.07  35 0.9902956
128  exact 0.07  40 0.9801496
129  exact 0.07  45 0.9879752
130  exact 0.07  50 0.9779901
131  exact 0.07  55 0.9679391
132  exact 0.07  60 0.9640110
133  exact 0.07  65 0.9765091
134  exact 0.07  70 0.9702320
135  exact 0.07  75 0.9806132
136  exact 0.07  80 0.9553953
137  exact 0.07  85 0.9692733
138  exact 0.07  90 0.9656231
139  exact 0.07  95 0.9765780
140  exact 0.07 100 0.9715796
141  exact 0.08   5 0.9954747
142  exact 0.08  10 0.9941987
143  exact 0.08  15 0.9950303
144  exact 0.08  20 0.9816556
145  exact 0.08  25 0.9877073
146  exact
Re: [R] Converting array to matrix
On Sep 28, 2012, at 3:59 PM, farnoosh sheikhi wrote:

> Hi, I have a 3d array as below. I want to turn this array into a matrix with p = 50 rows and n = 20 columns holding the coverage values. The code before the array is:

?matrix

    # byrow = TRUE because n varies fastest within each p in the listing
    mat <- matrix(datfrm$coverage, 50, 20, byrow = TRUE)
    filled.contour(mat) # untested

-- David

> library(binom)
> ## Loading required package: lattice
> pi.seq <- seq(from = 0.01, to = 0.5, by = 0.01)
> no.seq <- seq(from = 5, to = 100, by = 5)
> cp.all <- binom.coverage(p = pi.seq, n = no.seq, conf.level = 0.95, method = "exact")
>
> I basically want to plot this probability with filled.contour. Many thanks.
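A minimal sketch of the reshaping step, assuming the data frame is laid out as in the post, with n varying fastest within each p. It uses six coverage values copied from the posted table (p = 0.01 and 0.02, n = 5, 10, 15) as a small stand-in for cp.all; the names cp.small, p.vals, n.vals and chk are made up for the illustration. Since matrix() fills column by column by default, byrow = TRUE is what puts each p on its own row here.

```r
# Small stand-in for cp.all: values copied from the posted table,
# ordered the same way (n varies fastest within each p).
cp.small <- data.frame(
  p = rep(c(0.01, 0.02), each = 3),
  n = rep(c(5, 10, 15), times = 2),
  coverage = c(0.9990199, 0.9957338, 0.9903702,
               0.9961576, 0.9838224, 0.9969606)
)

p.vals <- unique(cp.small$p)
n.vals <- unique(cp.small$n)

# One row per p, one column per n. byrow = TRUE because consecutive
# values in the data frame belong to the same p.
mat <- matrix(cp.small$coverage,
              nrow = length(p.vals), ncol = length(n.vals),
              byrow = TRUE,
              dimnames = list(p = p.vals, n = n.vals))

# Cross-check against a reshape that does not depend on row order.
chk <- xtabs(coverage ~ p + n, data = cp.small)
stopifnot(isTRUE(all.equal(as.vector(mat), as.vector(chk))))

# filled.contour() wants increasing x and y and a z matrix of
# dimension length(x) by length(y).
filled.contour(x = p.vals, y = n.vals, z = mat, xlab = "p", ylab = "n")
```

For the full problem the same call would presumably take nrow = 50 and ncol = 20, with pi.seq and no.seq as the x and y axes. The xtabs() form is the safer choice if the row order of the data frame is ever in doubt.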
Re: [R] Errors in if statement
On Sep 28, 2012, at 1:16 PM, JiangZhengyu wrote:

> Hi guys, I have many rows (1000) and columns (30) of geno matrix. I use the following loop and condition statement (adapted from someone else's code). I always get the error below. I was wondering if anyone knows what the problem is and how to fix it.

Boy, it surely looks like missing values are the problem. Have you read:

?sum

-- David.

> Thanks, Zhengyu
>
> ### geno matrix
> P1 P2 P3 P4 1 2 2 3 2 2 2 2 1 1 1 2 1 2 NA NA 2 3 4 5
>
> ###
> for(i in 1:4) {
>   cat(i,)
>   if(sum(geno[i,]!=2)3 sum(geno[i,]==1)=1 sum(geno[i,]==3)=1){
>     tmp = 1
>   }
> }
>
> ###
> 1 2 Error in if (sum(geno[i, ] != 2) 3 sum(geno[i, ] == 1) = 1 sum(geno[i, : missing value where TRUE/FALSE needed

David Winsemius, MD
Alameda, CA, USA
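A small sketch of what the pointer to ?sum is getting at: when a row contains NA, sum() of a comparison on that row is itself NA, and if() cannot branch on NA. Passing na.rm = TRUE drops the missing entries before counting. The matrix below is a made-up example, and the comparison operators and thresholds in the condition are placeholders, since the ones in the original post did not survive.

```r
# Hypothetical genotype matrix; row 2 contains missing values.
geno <- matrix(c(1, 2, 2, 3,
                 1, 2, NA, NA,
                 2, 2, 2, 2),
               nrow = 3, byrow = TRUE,
               dimnames = list(NULL, paste0("P", 1:4)))

# The cause of the error: this count is NA, so if() has nothing to test.
is.na(sum(geno[2, ] != 2))

flag <- logical(nrow(geno))
for (i in seq_len(nrow(geno))) {
  g <- geno[i, ]
  # na.rm = TRUE counts only the non-missing entries of the row.
  n.not2  <- sum(g != 2, na.rm = TRUE)
  n.one   <- sum(g == 1, na.rm = TRUE)
  n.three <- sum(g == 3, na.rm = TRUE)
  # Placeholder condition (assumed operators and thresholds).
  if (n.not2 < 3 && n.one >= 1 && n.three >= 1) {
    flag[i] <- TRUE
  }
}
flag
```

Whether dropping the missing entries is the right treatment depends on the analysis; skipping rows with anyNA(g) would be an alternative if incomplete rows should not be scored at all.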