Re: [R] problem in getVarianceStabilizedData
Please don't cross-post, per the R-help posting guide.

The data function is not itself data. Provide your data to the function using the name you assigned to it, according to the getVarianceStabilizedData function documentation.

---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

Suparna Mitra suparna.mitra...@gmail.com wrote:

Hi All,
I am having a problem while running getVarianceStabilizedData in the DESeq2 package.

> data.vsd <- getVarianceStabilizedData(data)
Error in (function (classes, fdef, mtable) :
  unable to find an inherited method for function 'dispersionFunction' for signature '"CountDataSet"'

Though the function looks okay:

> dispersionFunction
standardGeneric for "dispersionFunction" defined from package "DESeq2"

function (object)
standardGeneric("dispersionFunction")
<environment: 0x7fe7a9c5d140>
Methods may be defined for arguments: object
Use showMethods("dispersionFunction") for currently available ones.

Can anybody please help?
Thanks,
Mitra.

______
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] Overlaying two graphs using ggplot2 in R
Hi R Users,

I am struggling to overlay two graphs created from two different datasets using ggplot2. Furthermore, I could not join the means of the box plots. I tried it this way, but it did not work. Any suggestions?

dat1 <- structure(list(site = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L),
    layer = structure(c(2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L),
        .Label = c("bottom", "top"), class = "factor"),
    Present = c(120L, 125L, 123L, 23L, 21L, 19L, 131L, 124L, 127L, 24L, 27L, 25L, 145L, 143L, 184L, 29L, 14L, 17L, 38L)),
    .Names = c("site", "layer", "Present"), row.names = c(NA, 19L), class = "data.frame")
dat1

dat2 <- structure(list(site = 1:3, present = c(-3L, 2L, 5L)),
    .Names = c("site", "present"), row.names = c(NA, 3L), class = "data.frame")
dat2

library(plyr)
library(ggplot2)

A <- ggplot(dat1, aes(x = factor(site), y = Present, colour = layer, fill = layer)) +
    geom_boxplot(outlier.shape = 16, outlier.size = 1) +
    theme_bw() +
    ylim(0, 185)
# Here I wanted to join the means of the boxplots among the sites, but I could not join it.

B <- ggplot(dat2, aes(x = factor(site), y = present, colour = "blue") + geom_line() + geom_point())
# wanted to plot it using second y axis.

A + B

Thanks for your help.
KG
[R] Statistics courses
There are a few remaining places on the following three statistics courses in Coimbra, Lisbon and Elche (Alicante).

Course: Data exploration, linear regression, GLM & GAM in R. With introduction to R.
Where: University of Coimbra, Coimbra, Portugal
When: 3-7 February, 2014

Course: Introduction to Linear Mixed Effects Models, GLMM and MCMC with R
Where: University of Lisbon, Lisbon, Portugal
When: 10-14 February, 2014

Course: Beginner's Guide to GAM and GAMM with R
Where: Elche, Alicante, Spain
When: 10-14 March, 2014

For full details, flyers, prices, etc. see: http://www.highstat.com/statscourse.htm

Kind regards,
Alain

--
Dr. Alain F. Zuur
First author of:
1. Beginner's Guide to GAMM with R (2014).
2. Beginner's Guide to GLM and GLMM with R (2013).
3. Beginner's Guide to GAM with R (2012).
4. Zero Inflated Models and GLMM with R (2012).
5. A Beginner's Guide to R (2009).
6. Mixed effects models and extensions in ecology with R (2009).
7. Analysing Ecological Data (2007).

Highland Statistics Ltd.
9 St Clair Wynd
UK - AB41 6DZ Newburgh
Tel: 0044 1358 788177
Email: highs...@highstat.com
URL: www.highstat.com
[R] Using apply function
Hi all R-users,

I'm trying to use the apply function to feed a range of values into a function I wrote. The function needs 4 pieces of information. I would like to make 2 of them fixed and the other 2 random (but with specified values). I would like to replicate the function 1 times. I was thinking about using a loop, but that is really slow, so I switched to the apply function. But I got stuck. Could anyone help me?

The question can be illustrated as follows.

Target function: fun(F1,F2,R1,R2)
R1 has values 1, 2, 3, 4, 5
R2 has values -1, -2, -3, -4, -5
F1=10
F2=100

There would be 25 conditions. I would like to avoid using a loop to get the result. Could anyone give me a suggestion?

Thank you
Best,
Yen
Re: [R] Using apply function
Is this what you want:

> random <- expand.grid(R1 = 1:5, R2 = -(1:5))
> result <- cbind(F1 = 10, F2 = 100, random)
> result
   F1  F2 R1 R2
1  10 100  1 -1
2  10 100  2 -1
3  10 100  3 -1
4  10 100  4 -1
5  10 100  5 -1
6  10 100  1 -2
7  10 100  2 -2
8  10 100  3 -2
9  10 100  4 -2
10 10 100  5 -2
11 10 100  1 -3
12 10 100  2 -3
13 10 100  3 -3
14 10 100  4 -3
15 10 100  5 -3
16 10 100  1 -4
17 10 100  2 -4
18 10 100  3 -4
19 10 100  4 -4
20 10 100  5 -4
21 10 100  1 -5
22 10 100  2 -5
23 10 100  3 -5
24 10 100  4 -5
25 10 100  5 -5

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Mon, Jan 27, 2014 at 8:56 AM, Yen Lee b88207...@ntu.edu.tw wrote:

Hi all R-users,

I'm trying to using apply function to input a range of values into a function I wrote. I wrote a function with 4 information needed. I would like to make 2 of them fixed and the other 2 random (but with specified values). I would like to replicate the function 1 times. I was thinking about using loop function but which is really slow, therefore I transfer to apply function. But I got stucked. Could anyone help me?

The question could be illustrated as follows,

Target function: fun(F1,F2,R1,R2)
R1 has values 1, 2, 3, 4, 5
R2 has values -1, -2, -3, -4, -5
F1=10
F2=100

There would be 25 conditions. I would like to avoid using loop to get the result. Could anyone give me some precious suggestion?

Thank you
Best,
Yen
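Going one step beyond the grid itself, one way to evaluate a four-argument function on all 25 combinations without an explicit loop is mapply(), with the fixed values supplied through MoreArgs. This is only a sketch: fun() below is a hypothetical stand-in, since the poster's actual function was not shown.

```r
# Hypothetical stand-in for the poster's function (its real body was not given)
fun <- function(F1, F2, R1, R2) F1 + F2 * R1 + R2

# All 25 combinations of the two varying arguments
grid <- expand.grid(R1 = 1:5, R2 = -(1:5))

# mapply() walks R1 and R2 in parallel; MoreArgs holds the fixed arguments
grid$value <- mapply(fun, R1 = grid$R1, R2 = grid$R2,
                     MoreArgs = list(F1 = 10, F2 = 100))

head(grid)
```

If fun() is already vectorised over R1 and R2, the plain call fun(10, 100, grid$R1, grid$R2) would do the same job and is likely faster.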
[R] passing variable names to dplyr
All,

I would like to figure out how to pass variable names to the dplyr function mutate. For example, this works because hp is one of the variable names in mtcars:

mutate(mtcars, scale(hp))

Let's say I want to pass in the target variable instead of hard-coding the name, as follows:

target <- "hp"
mutate(mtcars, scale(target))

That doesn't work. I read somewhere about using lapply, but that suggestion didn't work for me either:

target <- lapply("hp", as.symbol)
mutate(mtcars, scale(target))

Does anyone know how to do this?

Thanks,
Roger
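For readers hitting the same problem today, a minimal sketch of one approach is to index the .data pronoun with the string. This assumes a reasonably recent dplyr release (the pronoun did not exist when the question was posted), and the column name "scaled" is just an illustrative choice.

```r
library(dplyr)

target <- "hp"  # column name held in a string

# .data[[target]] looks the column up by name inside the data frame;
# as.numeric() drops the one-column matrix that scale() returns
out <- mutate(mtcars, scaled = as.numeric(scale(.data[[target]])))

head(out[, c("hp", "scaled")])
```

Without dplyr, mtcars[[target]] retrieves the same column in base R.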
[R] problem (un)detecting changepoints
Dear List,

I am using the cpt.mean() function in the changepoint package to detect change-points in my data, and noticed that when there are no visible changes, the function returns the last point as the point of change. The following script illustrates this:

table(unlist(replicate(500, cpt.mean(rnorm(50), method="PELT")@cpts)))

The result is a roughly uniform distribution from 1 to 49 (with ca 20 cpts located at each), and then 500 cases where the cpts is located at the last element of the vector. Clearly, cpt.mean returns the index of the last vector value (here 50) as a change in the time-series. I wonder if I am doing something wrong here, but I think the function should return NA...

Many thanks in advance,
Enrico
Re: [R] Calculating group means
Hi Bert,

Thanks for the reply. Here is the snippet from the R shell running on top of a Kerberos-secured Hadoop cluster:

=====
> Sys.setenv(HADOOP_CMD="/usr/bin/hadoop")
> library(rhdfs)
Loading required package: rJava

HADOOP_CMD=/usr/bin/hadoop

Be sure to run hdfs.init()
> hdfs.init()
> hdfs.ls("/")
14/01/27 06:26:48 ERROR security.UserGroupInformation: PriviledgedActionException as:root (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
14/01/27 06:26:48 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
14/01/27 06:26:48 ERROR security.UserGroupInformation: PriviledgedActionException as:root (auth:KERBEROS) cause:java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, :
  java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "host1.com"; destination host is: "host.com":8020;
=====

Thanks & Regards
Anoop Kumar K M
TCS Digital Enterprise-Analytics And BigData
Tata Consultancy Services Limited
TCS Centre-SEZ Infopark Special Economic Zone, Kakkanad, Kusumagiri Post
Kochi - 682030, Kerala, India
Ph:- +91 4846187171
Buzz:- 6187171
Mailto: anoop.kuma...@tcs.com
Website: http://www.tcs.com

-----r-help-boun...@r-project.org wrote: -----
To: Laura Bethan Thomas [lbt1] l...@aber.ac.uk, r-help@r-project.org
From: Bert Gunter
Sent by: r-help-boun...@r-project.org
Date: 01/27/2014 07:54PM
Subject: Re: [R] Calculating group means

1. Please cc anything but personal remarks to the list, not to me. That will assure better answers.

2. Your query is too vague for me to be sure -- a small reproducible example of what you'd like would be very helpful here -- but I am guessing that you want the ?ave function instead of by().

Cheers,
Bert
[R] Simplifying matrix computation
Dear R-users,

I would like to know whether you know some trick for skipping some of the steps in the example below (especially the last step), in a way that would make it easier to write succinctly in a text. I could try to explain the whole process in words, but I'm sure the code below is clearer.

Thanks in advance for your help,
Giancarlo

## data in matrices
D <- matrix(1:15, 3, 5)
T <- matrix(0, 3, 3)
T[c(2,4,6,8)] <- 1

## how to place the diag matrices of each row
M0 <- matrix(0, nrow(T), sum(T))
wr <- which(T==1, arr.ind=TRUE)[,2]
wc <- 1:ncol(M0)
M0[cbind(wr,wc)] <- 1

## number of columns
m <- ncol(D)

## non-zero positions
M <- kronecker(M0, diag(m))

## which rows to take
pos <- which(T==1, arr.ind=TRUE)[,1]

## filling up with data
M[M!=0] <- t(D[wr,])
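One possible shortcut, offered as a sketch only: compute the diagonal positions of each block directly and fill the matrix in a single assignment, skipping M0 and kronecker(). It assumes the matrix wanted is exactly the M produced by the posted code (which indexes D by wr and never uses pos); the names rows, cols and M2 are introduced here for illustration.

```r
D <- matrix(1:15, 3, 5)
T <- matrix(0, 3, 3)
T[c(2, 4, 6, 8)] <- 1

# Original route, condensed from the post, kept for comparison
wr <- which(T == 1, arr.ind = TRUE)[, 2]
M0 <- matrix(0, nrow(T), sum(T))
M0[cbind(wr, seq_along(wr))] <- 1
M <- kronecker(M0, diag(ncol(D)))
M[M != 0] <- t(D[wr, ])

# Direct route: block k sits in block-row wr[k] and block-column k,
# and its diagonal carries row wr[k] of D
m <- ncol(D)
k <- seq_along(wr)
rows <- as.vector(outer(1:m, (wr - 1) * m, "+"))
cols <- as.vector(outer(1:m, (k - 1) * m, "+"))
M2 <- matrix(0, nrow(T) * m, length(wr) * m)
M2[cbind(rows, cols)] <- t(D[wr, ])

all.equal(M, M2)
```

A side benefit of the direct fill is that it does not rely on M != 0 to find the positions, so zeros in D would not shift the values.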
Re: [R] Calculating group means
1. Please cc anything but personal remarks to the list, not to me. That will assure better answers.

2. Your query is too vague for me to be sure -- a small reproducible example of what you'd like would be very helpful here -- but I am guessing that you want the ?ave function instead of by().

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

Data is not information. Information is not knowledge. And knowledge is certainly not wisdom.
H. Gilbert Welch

On Mon, Jan 27, 2014 at 4:27 AM, Laura Bethan Thomas [lbt1] l...@aber.ac.uk wrote:

Hi Bert,

Thank you very much for your help with my R issue. The code you suggested has worked. Do you know of a way I can extract the averages this gives me into a data frame or table?

Many thanks for your help,
Laura

On 24 Dec 2013, at 07:28, Bert Gunter gunter.ber...@gene.com wrote:

Jim:

Did you forget about with()?

Instead of:

by(lbtdat$latency, list(lbtdat$subject, lbtdat$condition, lbtdat$state), mean)

## do
with(lbtdat, by(latency, list(subject, condition, state), mean))

Bert Gunter

On Mon, Dec 23, 2013 at 6:37 PM, Jim Lemon j...@bitwrit.com.au wrote:

On 12/23/2013 11:31 PM, Laura Bethan Thomas [lbt1] wrote:

Hi All,

Sorry for what I imagine is quite a basic question. What I have been trying to do is create latency averages for each state (1-8) for each participant (n=13) in each condition (1-10). I'm not sure what function I would need, or what the most efficient way of calculating this would be. If you have any help with that I would be very grateful.

structure(list(subject = c(1L, 1L, 1L, 1L, 1L, 1L), conditionNo = c(1L, 1L, 1L, 1L, 1L, 1L), state = c(5L, 8L, 7L, 8L, 1L, 7L), latency = c(869L, 864L, 1004L, 801L, 611L, 679L)), .Names = c("subject", "conditionNo", "state", "latency"), row.names = 3:8, class = "data.frame")

Hi Laura,

You can do it like this:

# make up enough data to do the calculation
lbtdat <- data.frame(subject=rep(1:13,each=160),
  condition=rep(rep(rep(1:10,each=8),2),13),
  state=rep(rep(1:8,20),13),
  latency=sample(600:1100,2080,TRUE))
by(lbtdat$latency, list(lbtdat$subject, lbtdat$condition, lbtdat$state), mean)

but you are going to get a rather long list of means.

Jim
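On the follow-up question of getting the averages into a data frame, a minimal base-R sketch is shown here using Jim's made-up data. aggregate() returns one row per subject/condition/state cell, while ave() (Bert's guess) returns a mean aligned with every original row. The column name cell_mean and the seed are illustrative choices, not part of the thread.

```r
set.seed(1)  # only so the made-up latencies are repeatable

# Made-up data with the same layout as in Jim's reply
lbtdat <- data.frame(subject   = rep(1:13, each = 160),
                     condition = rep(rep(rep(1:10, each = 8), 2), 13),
                     state     = rep(rep(1:8, 20), 13),
                     latency   = sample(600:1100, 2080, TRUE))

# One row per cell: the group means as an ordinary data frame
means <- aggregate(latency ~ subject + condition + state,
                   data = lbtdat, FUN = mean)

# Alternative: attach each row's cell mean to the original data
lbtdat$cell_mean <- ave(lbtdat$latency, lbtdat$subject,
                        lbtdat$condition, lbtdat$state)

head(means)
```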
Re: [R] testing if xts date exists ?
You can use the which.i argument to [.xts:

> is.null(SPY["2009-01-18", which.i=TRUE])
[1] TRUE

Best,
--
Joshua Ulrich | about.me/joshuaulrich
FOSS Trading | www.fosstrading.com

On Sat, Jan 25, 2014 at 9:27 AM, ce zadi...@excite.com wrote:

Dear all,

How can I test whether an xts date exists? is.null doesn't work. SPY["2009-01-18"] doesn't exist, but I can't catch it in my script.

> library(quantmod)
> getSymbols("SPY")
> SPY["2009-01-16"]
           SPY.Open SPY.High SPY.Low SPY.Close SPY.Volume SPY.Adjusted
2009-01-16    85.86    85.99   83.05     85.06  399237200        76.58
> SPY["2009-01-18"]
     SPY.Open SPY.High SPY.Low SPY.Close SPY.Volume SPY.Adjusted
> is.null(SPY["2009-01-18"])
[1] FALSE
Re: [R] Calculating group means
That is not a small reproducible example. There's no R code. Please read the posting guide to learn how to post to r-help.

-- Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

Data is not information. Information is not knowledge. And knowledge is certainly not wisdom.
H. Gilbert Welch

On Mon, Jan 27, 2014 at 6:29 AM, Anoop Kumarkm anoop.kuma...@tcs.com wrote:

Hi Bert,

Thanks for the reply. Here is the snippet from R shell running on top of a kerberos secured hadoop cluster

=====
> Sys.setenv(HADOOP_CMD="/usr/bin/hadoop")
> library(rhdfs)
Loading required package: rJava

HADOOP_CMD=/usr/bin/hadoop

Be sure to run hdfs.init()
> hdfs.init()
> hdfs.ls("/")
14/01/27 06:26:48 ERROR security.UserGroupInformation: PriviledgedActionException as:root (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
14/01/27 06:26:48 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
14/01/27 06:26:48 ERROR security.UserGroupInformation: PriviledgedActionException as:root (auth:KERBEROS) cause:java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, :
  java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "host1.com"; destination host is: "host.com":8020;
=====

Thanks & Regards
Anoop Kumar K M
Re: [R] memory use of copies
Hi Ross --

On 01/23/2014 05:53 PM, Ross Boylan wrote:

[Apologies if a duplicate; we are having mail problems.]

I am trying to understand the circumstances under which R makes a copy of an object, as opposed to simply referring to it. I'm talking about what goes on under the hood, not the user semantics. I'm doing things that take a lot of memory, and am trying to minimize my use.

I thought that R was clever so that copies were created lazily. For example, if a is matrix, then

b <- a

b and a referred to the same object underneath, so that a complete duplicate (deep copy) wasn't made until it was necessary, e.g.,

b[3, 1] <- 4

would duplicate the contents of a to b, and then overwrite them.

Compiling your R with --enable-memory-profiling gives access to the tracemem() function, showing that your understanding above is correct

> b = matrix(0, 3, 2)
> tracemem(b)
[1] "<0x7054020>"
> a = b        ## no copy
> b[3, 1] = 2  ## copy
tracemem[0x7054020 -> 0x7053fc8]:
> b = matrix(0, 3, 2)
> tracemem(b)
[1] "<0x680e258>"
> b[3, 1] = 2  ## no copy

The same is apparent using .Internal(inspect()), where the first information @7053ec0 is the address of the data. The other relevant part is the 'NAM()' field, which indicates whether there are 0, 1 or (have been) at least 2 symbols referring to the data. NAM() increments from 1 (no duplication on modify required) on original creation to 2 when a = b (duplicate on modify)

> b = matrix(0, 3, 2)
> .Internal(inspect(b))
@7053ec0 14 REALSXP g0c4 [NAM(1),ATT] (len=6, tl=0) 0,0,0,0,0,...
ATTRIB:
  @7057528 02 LISTSXP g0c0 []
    TAG: @21c5fb8 01 SYMSXP g0c0 [LCK,gp=0x4000] "dim" (has value)
    @7056858 13 INTSXP g0c1 [NAM(2)] (len=2, tl=0) 3,2
> b[3, 1] = 2
> .Internal(inspect(b))
@7053ec0 14 REALSXP g0c4 [NAM(1),ATT] (len=6, tl=0) 0,0,2,0,0,...
ATTRIB:
  @7057528 02 LISTSXP g0c0 []
    TAG: @21c5fb8 01 SYMSXP g0c0 [LCK,gp=0x4000] "dim" (has value)
    @7056858 13 INTSXP g0c1 [NAM(2)] (len=2, tl=0) 3,2
> a = b
> .Internal(inspect(b))  ## data address unchanged
@7053ec0 14 REALSXP g0c4 [NAM(2),ATT] (len=6, tl=0) 0,0,0,0,0,...
ATTRIB:
  @7057528 02 LISTSXP g0c0 []
    TAG: @21c5fb8 01 SYMSXP g0c0 [LCK,gp=0x4000] "dim" (has value)
    @7056858 13 INTSXP g0c1 [NAM(2)] (len=2, tl=0) 3,2
> b[3, 1] = 2
> .Internal(inspect(b))  ## data address changed
@7232910 14 REALSXP g0c4 [NAM(1),ATT] (len=6, tl=0) 0,0,2,0,0,...
ATTRIB:
  @7239d28 02 LISTSXP g0c0 []
    TAG: @21c5fb8 01 SYMSXP g0c0 [LCK,gp=0x4000] "dim" (has value)
    @7237b48 13 INTSXP g0c1 [NAM(2)] (len=2, tl=0) 3,2

The following log, from R 3.0.1, does not seem to act that way; I get the same amount of memory used whether I copy the same object repeatedly or create new objects of the same size.

Can anyone explain what is going on? Am I just wrong that copies are initially shallow? Or perhaps that behavior only applies for function arguments? Or doesn't apply for class slots or reference class variables?

> foo <- setRefClass("foo", fields=list(x="ANY"))
> bar <- setClass("bar", slots=c("x"))

using the approach above, we can see that creating an S4 or reference object in the way you've indicated (validity checks or other initialization might change this) does not copy the data although it is marked for duplication

> x = 1:2; .Internal(inspect(x))
@7553868 13 INTSXP g0c1 [NAM(1)] (len=2, tl=0) 1,2
> .Internal(inspect(foo(x=x)$x))
@7553868 13 INTSXP g0c1 [NAM(2)] (len=2, tl=0) 1,2
> .Internal(inspect(bar(x=x)@x))
@7553868 13 INTSXP g0c1 [NAM(2)] (len=2, tl=0) 1,2

On the other hand, lapply is creating copies

> x = 1:2; .Internal(inspect(x))
@757b5a8 13 INTSXP g0c1 [NAM(1)] (len=2, tl=0) 1,2
> .Internal(inspect(lapply(1:2, function(i) x)))
@7551f88 19 VECSXP g0c2 [] (len=2, tl=0)
  @757b428 13 INTSXP g0c1 [] (len=2, tl=0) 1,2
  @757b3f8 13 INTSXP g0c1 [] (len=2, tl=0) 1,2

One can construct a list without copies

> x = 1:2; .Internal(inspect(x))
@7677c18 13 INTSXP g0c1 [NAM(1)] (len=2, tl=0) 1,2
> .Internal(inspect(list(x)[rep(1, 2)]))
@767b080 19 VECSXP g0c2 [NAM(2)] (len=2, tl=0)
  @7677c18 13 INTSXP g0c1 [NAM(2)] (len=2, tl=0) 1,2
  @7677c18 13 INTSXP g0c1 [NAM(2)] (len=2, tl=0) 1,2

but that (creating a list of identical elements) doesn't seem to be a likely real-world scenario and the gain is transient

> x = 1:2; y = list(x)[rep(1, 4)]
> .Internal(inspect(y))
@507bef8 19 VECSXP g0c3 [NAM(2)] (len=4, tl=0)
  @514ff98 13 INTSXP g0c1 [NAM(2)] (len=2, tl=0) 1,2
  @514ff98 13 INTSXP g0c1 [NAM(2)] (len=2, tl=0) 1,2
  @514ff98 13 INTSXP g0c1 [NAM(2)] (len=2, tl=0) 1,2
  @514ff98 13 INTSXP g0c1 [NAM(2)] (len=2, tl=0) 1,2
> y[[1]][1] = 2L  ## everybody copied
> .Internal(inspect(y))
@507bf40 19 VECSXP g0c3 [NAM(1)] (len=4, tl=0)
  @51502c8 13 INTSXP g0c1 [] (len=2, tl=0) 2,2
  @51502f8 13 INTSXP g0c1 [] (len=2, tl=0) 1,2
  @5150328 13 INTSXP g0c1 [] (len=2, tl=0) 1,2
  @5150358 13 INTSXP g0c1 [] (len=2, tl=0) 1,2

Probably it is more helpful to think of reducing the number of times an object is _modified_, e.g.,
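The copy-on-modify behaviour described in this message can be reproduced with a short script. This is a sketch only: the tracemem() calls are guarded because they require an R build with memory profiling, and the exact messages printed (and whether extra copies appear) can differ between R versions and front ends.

```r
b <- matrix(0, 3, 2)
a <- b                    # no data copied yet; a and b share the same memory

if (capabilities("profmem")) {
  tracemem(b)             # ask R to report when b's data is duplicated
}

b[3, 1] <- 4              # first modification: b gets its own copy, a is untouched
b[1, 1] <- 7              # b is no longer shared, so no further copy is expected

if (capabilities("profmem")) {
  untracemem(b)
}

a[3, 1]                   # still the original value
b[3, 1]                   # the modified value
```

Whatever happens under the hood, the user-level semantics stay the same: changing b never changes a.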
[R] Prediction Intervals predict.Arima
I would like to ask how exactly the prediction intervals are calculated by the function predict.Arima in R. I suppose that the method is the same as for the function forecast (which I am actually using). Unfortunately, I cannot find it anywhere. I am particularly interested in how it works for ARIMA models, SARIMA models and ARIMA models which include external regressors (argument xreg is not null).

Best wishes
Monika Novackova
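As a partial illustration rather than a full answer: predict() on a fit from stats::arima() returns the point forecasts (pred) and their standard errors (se), and a normal-theory interval can be built from those two pieces. The sketch below assumes Gaussian errors and uses the built-in lh series with an AR(1) model purely as an example; whether the forecast package computes its intervals in exactly this way for every model type is not confirmed here.

```r
# Example fit: AR(1) on the built-in 'lh' series (illustrative choice)
fit <- arima(lh, order = c(1, 0, 0))

# predict() dispatches to predict.Arima and returns pred and se
p <- predict(fit, n.ahead = 5)

level <- 0.95
z <- qnorm(1 - (1 - level) / 2)   # about 1.96 for a 95% interval

lower <- p$pred - z * p$se
upper <- p$pred + z * p$se

cbind(forecast = p$pred, lower = lower, upper = upper)
```

For models with external regressors, the same call would also need the future regressor values via the newxreg argument of predict().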
[R] R and kerberos
Dear Team,

I have R and rhdfs installed in a Kerberos-secured cluster. Will R and rhdfs work with a Kerberos-secured cluster? Are there any reported issues?

Thanks & Regards
Anoop Kumar K M
Re: [R] problem (un)detecting changepoints
Hi,

I don't think you are doing anything wrong; the routine is doing what it is documented to do. From ?cpt.mean:

cpt: Vector containing the changepoint locations for the penalty supplied. This always ends with n.

i.e. as your series is of length 50, the last value returned in cpts will always be 50.

Martyn

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Enrico R. Crema
Sent: 27 January 2014 13:01
To: R help
Subject: [R] problem (un)detecting changepoints

Dear List,

I am using the cpt.mean() function in the changepoint package to detect change-points in my data and noticed that when there are no visible changes, the function returns the last point as the point of change. The following script can illustrate this:

table(unlist(replicate(500,cpt.mean(rnorm(50),method="PELT")@cpts)))

the result will return a uniform distribution from 1 to 49 (with ca 20 cpts located for each), and then 500 cases where the cpts is located on the last vector. Clearly, cpt.mean returns the index of the last vector value (here 50) for change in the time-series. I wonder if I am doing something wrong here, but I think the function should return a NA...

Many thanks in advance,
Enrico
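Given that documented behaviour, one simple way to recover "no changepoint found" is to drop the trailing n from the returned vector. The helper below is a hypothetical base-R sketch (drop_end is not part of the changepoint package); it works on plain integer vectors standing in for the contents of the cpts slot.

```r
# Hypothetical helper: remove the series length n, which always closes
# the changepoint vector, leaving only genuine changepoints
drop_end <- function(cpts, n) cpts[cpts != n]

n <- 50

no_change  <- drop_end(50L, n)          # stand-in for a fit that found nothing
one_change <- drop_end(c(12L, 50L), n)  # stand-in for a fit with one change at 12

length(no_change)   # zero-length result means no changepoint was detected
one_change
```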
Re: [R] problem (un)detecting changepoints
Hi Martyn,

Thanks! Should have checked the doc more thoroughly!

Enrico
[R] Problem in overlying two figures in ggplot2
Hi R Users,

I was struggling to overlay two graphs created from two different datasets using ggplot2. I could not overlay the two figures. I wanted to plot the second graph using a second y axis, but there was no provision for it. Furthermore, I could not join the means of the box plots. I tried it this way, but it did not work. Any suggestions?

    dat1 <- structure(list(site = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L),
        layer = structure(c(2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L),
            .Label = c("bottom", "top"), class = "factor"),
        Present = c(120L, 125L, 123L, 23L, 21L, 19L, 131L, 124L, 127L, 24L, 27L, 25L, 145L, 143L, 184L, 29L, 14L, 17L, 38L)),
        .Names = c("site", "layer", "Present"), row.names = c(NA, 19L), class = "data.frame")
    dat1

    dat2 <- structure(list(site = 1:3, present = c(-3L, 2L, 5L)),
        .Names = c("site", "present"), row.names = c(NA, 3L), class = "data.frame")
    dat2

    library(plyr)
    library(ggplot2)

    A <- ggplot(dat1, aes(x = factor(site), y = Present, colour = layer, fill = layer)) +
        geom_boxplot(outlier.shape = 16, outlier.size = 1) +
        theme_bw() +
        ylim(0, 185)
    # Here I wanted to join the means of the boxplots among the sites, but I could not join them.

    B <- ggplot(dat2, aes(x = factor(site), y = present, colour = "blue") + geom_line() + geom_point())
    # I wanted to plot this using a second y axis.

    A + B

Thanks for your help.
KG
[R] Bug or my misunderstanding?
Folks:

Before I waste someone's time with a stupid bug report, could I get feedback as to whether the following really appears to be a (minor) bug?

Summary: get_all_vars does not seem to handle multiple responses correctly.

Example:

    y <- matrix(runif(12), nc = 3)
    x <- 1:4
    lmfit <- lm(y ~ x)

    model.frame(lmfit)  ## OK
             y.1        y.2        y.3 x
    1 0.02159809 0.15593110 0.59007262 1
    2 0.91169201 0.30725236 0.41035328 2
    3 0.45079051 0.29174545 0.18771042 3
    4 0.07983415 0.37301448 0.70319143 4

    get_all_vars(lmfit)  ## not OK ?
               y         x        NA NA
    1 0.02159809 0.1559311 0.5900726  1
    2 0.91169201 0.3072524 0.4103533  2
    3 0.45079051 0.2917455 0.1877104  3
    4 0.07983415 0.3730145 0.7031914  4

    ## model.frame() gives correct response variable names; get_all_vars() does not.

R version 3.0.2 (2013-09-25)
Platform: i386-w64-mingw32/i386 (32-bit)

Many thanks.

-- Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge is certainly not wisdom."
H. Gilbert Welch
Re: [R] Bug or my misunderstanding?
It is! I apologize for the noise.

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics

On Mon, Jan 27, 2014 at 9:48 AM, Brian Ripley rip...@stats.ox.ac.uk wrote:
> I think it is already a bug report. But I am away from base and cannot check easily.
Re: [R] problem (un)detecting changepoints
Although I got similar results using AMOC, which, according to the documentation, should return:

    cpt: The most probable location of a changepoint if a change was identified or NA if no changepoint.

Enrico

---
Dr Enrico R. Crema
ERC EUROEVOL Research Associate
UCL Institute of Archaeology
31-34 Gordon Square, WC1H 0PY London
+44(0)20-7679-1031
e.cr...@ucl.ac.uk
Re: [R] Problem in overlying two figures in ggplot2
Hi Kristi,

There is a separate ggplot2 mailing list at https://groups.google.com/forum/#!forum/ggplot2; please post future ggplot2 questions there.

On Mon, Jan 27, 2014 at 11:52 AM, Kristi Glover kristi.glo...@hotmail.com wrote:
> A <- ggplot(dat1, aes(x = factor(site), y = Present, colour = layer, fill = layer)) + geom_boxplot(outlier.shape = 16, outlier.size = 1) + theme_bw() + ylim(0, 185)
> # Here I wanted to join the means of the boxplots among the sites, but I could not join it.

It would have been useful to show us what you tried, but here is one way:

    A + geom_line(aes(group = layer),                       # ignore the fact that the x-axis is categorical
                  position = position_dodge(width = 0.75),  # match position of boxes
                  stat = "summary",                         # summarize raw data
                  fun.y = mean,                             # by taking the mean
                  color = "black")

> B <- ggplot(dat2, aes(x = factor(site), y = present, colour = "blue") + geom_line() + geom_point())
> # wanted to plot it using second y axis.

ggplot2 doesn't really support multiple y axes. You can fake it (check the ggplot2 mailing list and/or stackoverflow for examples) but it's not easy.

Best,
Ista
[R] using substitute with multiple parameters
Dear All,

I can't figure out how to pass multiple arguments to substitute to build up a call statement. One argument works fine:

    target <- "val1"
    call <- substitute(select(zidx_df, datadate, target), list(target = as.name(target)))
    call
    select(zidx_df, datadate, val1)

Now I would like to pass multiple arguments to substitute so that I get select(zidx_df, datadate, val1, val2, val3), but I only get the first argument:

    target <- c("val1", "val2", "val3")
    call <- substitute(select(zidx_df, datadate, target), list(target = as.name(target)))
    call
    select(zidx_df, datadate, val1)

I have tried multiple variations, but none of them are even close, so I won't embarrass myself by posting them here.

Thanks in advance,
Roger
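One possible approach, offered as a sketch and not as the only way to do it: as.name() uses only the first element of a character vector, which is why only val1 appears above. Building the call from a list with as.call() sidesteps that, because each element of target can be converted to a symbol separately. The names sel_call and fixed_args below are illustrative; select, zidx_df and datadate are taken from the question and are never evaluated here.

```r
target <- c("val1", "val2", "val3")

# The function name and the fixed arguments, all as symbols.
fixed_args <- list(as.name("select"), as.name("zidx_df"), as.name("datadate"))

# Convert every element of target to a symbol and append it to the call.
sel_call <- as.call(c(fixed_args, lapply(target, as.name)))
sel_call
# select(zidx_df, datadate, val1, val2, val3)
```

The resulting object is an unevaluated call, so it can later be passed to eval() in an environment where select and zidx_df exist.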
Re: [R] Handlig large SAS file in R
For that you need to purchase Stat/Transfer.

Frank

hans012 wrote:
> Hey guys,
> I have a .sas7bdat file of 1.79 GB that I want to read. I am using the sas7bdat package to read the file, and after I typed the command read.sas7bdat('filename.sas7bdat') it has been 3 hours with no result so far. Is there a way that I can see the progress of the read? Or is there another way to read the file with less computing time? I do not have access to SAS; the file was sent to me. Let me know what you guys think.
> KR
> Hans

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
[R] Numeric Column Labels in Excel Function
Hi all,

I frequently get requests to do data analysis where the person references an Excel column, e.g., "I want to analyze [insert complex variable name], located at column AAQ in Excel." What I've been doing is using gsub with a part of the string for the complex variable name, then going from there. But I was trying to make a function that returns the following vector:

    excelVector = "A", "B", "C", "D", ... "AA", "AB", "AC" ... "ZA", "ZB", "ZC", ... "AAA", "AAB", "AAC", etc.

In other words, the function would have one argument (n, the number of columns), and it would return a vector like that shown above. Then all I would have to do is

    column.of.interest = which(excelVector == "AAQ")

But I'm a bit stumped. The first part is easy:

    LETTERS[1:26]

The next would probably use expand.grid, but all my potential solutions are pretty clunky. Any ideas?
Re: [R] Numeric Column Labels in Excel Function
Hi,

Maybe you can try:

    fun1 <- function(n){
      if(n <= 26){
        res <- LETTERS[seq_len(n)]
      } else if(n > 26 & n <= 702){
        res <- c(LETTERS, apply(expand.grid(vec1, vec1)[, 2:1], 1, paste, collapse = ""))[1:n]
      } else if(n > 702 & n <= 18278){
        res <- c(LETTERS,
                 apply(expand.grid(vec1, vec1)[, 2:1], 1, paste, collapse = ""),
                 apply(expand.grid(vec1, vec1, vec1)[, 3:1], 1, paste, collapse = ""))[1:n]
      } else {
        NA
      }
      res
    }

    fun1(0)
    # character(0)
    fun1(2)
    # [1] "A" "B"
    fun1(28)

A.K.
Re: [R] Numeric Column Labels in Excel Function
There seems to be a problem with that function: object 'vec1' not found.

On Mon, Jan 27, 2014 at 4:05 PM, arun smartpink...@yahoo.com wrote:
> Maybe you can try: fun1 <- function(n){ ... }
Re: [R] Numeric Column Labels in Excel Function
Sorry, this should work:

    fun1 <- function(n){
      vec1 <- LETTERS
      if(n <= 26){
        res <- vec1[seq_len(n)]
      } else if(n > 26 & n <= 702){
        res <- c(LETTERS, apply(expand.grid(vec1, vec1)[, 2:1], 1, paste, collapse = ""))[1:n]
      } else if(n > 702 & n <= 18278){
        res <- c(LETTERS,
                 apply(expand.grid(vec1, vec1)[, 2:1], 1, paste, collapse = ""),
                 apply(expand.grid(vec1, vec1, vec1)[, 3:1], 1, paste, collapse = ""))[1:n]
      } else {
        res <- NA
      }
      res
    }

    fun1(0)
    character(0)
    fun1(8)
    [1] "A" "B" "C" "D" "E" "F" "G" "H"
    fun1(40)
     [1] "A"  "B"  "C"  "D"  "E"  "F"  "G"  "H"  "I"  "J"  "K"  "L"  "M"  "N"  "O"
    [16] "P"  "Q"  "R"  "S"  "T"  "U"  "V"  "W"  "X"  "Y"  "Z"  "AA" "AB" "AC" "AD"
    [31] "AE" "AF" "AG" "AH" "AI" "AJ" "AK" "AL" "AM" "AN"
    fun1(18279)
    [1] NA

A.K.

On Monday, January 27, 2014 5:11 PM, Dustin Fife fife.dus...@gmail.com wrote:
> There seems to be a problem with that function: object 'vec1' not found.
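If the end goal is only the column number for a label such as AAQ, an alternative to generating the whole vector is to treat the label as a number in bijective base 26 (A = 1, ..., Z = 26). This is a sketch of a different technique from the expand.grid approach above; excel_col_to_index is a hypothetical helper name, and it assumes the label contains only the letters A to Z.

```r
# Hypothetical helper: convert an Excel column label to its column number
# by reading the letters as digits in bijective base 26.
excel_col_to_index <- function(col) {
  chars  <- strsplit(toupper(col), "")[[1]]       # split label into letters
  digits <- match(chars, LETTERS)                 # A = 1, ..., Z = 26
  if (anyNA(digits)) stop("label must contain only the letters A-Z")
  powers <- rev(seq_along(digits)) - 1            # highest power first
  sum(digits * 26^powers)
}

excel_col_to_index("Z")     # 26
excel_col_to_index("AA")    # 27
excel_col_to_index("AAQ")   # 1*676 + 1*26 + 17 = 719
```

Because it works letter by letter, this has no upper limit on the label length, unlike a lookup vector built for a fixed n.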
Re: [R] Problem in overlying two figures in ggplot2
On 01/28/2014 03:52 AM, Kristi Glover wrote:
> I was struggling to overlay two graphs created from the two different datasets using ggplot2. I could not overlay two figures. I wanted to plot second graph using second Y axis, but there was no provision. Furthermore, I could not join means of the box plots.

Hi Kristi,

It's not ggplot, but this might help.

    par(mar = c(5, 4, 4, 4))
    boxplot(Present ~ layer + site, data = dat1, staplewex = 0,
            col = c(2, 3, 2, 3, 2, 3), border = c(2, 3, 2, 3, 2, 3),
            ylab = "Present", xlab = "factor(site)", xaxt = "n")
    boxat <- c(1.5, 3.5, 5.5)  # midpoints of each pair of boxes
    axis(1, at = boxat, labels = 1:3)
    library(plotrix)  # or library(prettyR)
    newpresent <- rescale(c(-5, dat2$present), c(50, 100))[-1]
    points(boxat, newpresent, type = "b", col = "blue")
    axis(4, at = c(50, 75, 100), labels = c(-5, 0, 5), col = "blue")
    mtext("Means of sites", side = 4, at = 150, line = 0.5, col = "blue")
    legend(1, 170, c("Top", "Bottom"), fill = c("green", "red"))
    par(mar = c(5, 4, 4, 2))

Jim
Re: [R] Numeric Column Labels in Excel Function
If you use XLConnect, you can reference the column symbolically to retrieve the data.

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Mon, Jan 27, 2014 at 4:30 PM, Dustin Fife fife.dus...@gmail.com wrote:
> I frequently get requests to do data analysis where the person references an excel column. e.g., I want to analyze [insert complex variable name], located at column AAQ in Excel.
[R] Predictor Importance in Random Forests and bootstrap
Hello!

Below, I:

1. Create a data set with a bunch of factors. All of them are predictors and 'y' is the dependent variable.
2. Run a classification Random Forests run with predictor importance. I look at 2 measures of importance: MeanDecreaseAccuracy and MeanDecreaseGini.
3. Run 2 bootstrap runs for the 2 Random Forests measures of importance mentioned above.

Question: Could anyone please explain why I am getting such a huge positive bias across the board (for all predictors) for MeanDecreaseAccuracy?

Thanks a lot!
Dimitri

    #-------------------------------------------------
    # Creating a data set:
    #-------------------------------------------------
    N <- 1000
    myset1 <- c(1, 2, 3, 4, 5)
    probs1a <- c(.05, .10, .15, .40, .30)
    probs1b <- c(.05, .15, .10, .30, .40)
    probs1c <- c(.05, .05, .10, .15, .65)
    myset2 <- c(1, 2, 3, 4, 5, 6, 7)
    probs2a <- c(.02, .03, .10, .15, .20, .30, .20)
    probs2b <- c(.02, .03, .10, .15, .20, .20, .30)
    probs2c <- c(.02, .03, .10, .10, .10, .25, .40)
    myset.y <- c(1, 2)
    probs.y <- c(.65, .30)

    set.seed(1)
    y <- as.factor(sample(myset.y, N, replace = TRUE, probs.y))
    set.seed(2)
    a <- as.factor(sample(myset1, N, replace = TRUE, probs1a))
    set.seed(3)
    b <- as.factor(sample(myset1, N, replace = TRUE, probs1b))
    set.seed(4)
    c <- as.factor(sample(myset1, N, replace = TRUE, probs1c))
    set.seed(5)
    d <- as.factor(sample(myset2, N, replace = TRUE, probs2a))
    set.seed(6)
    e <- as.factor(sample(myset2, N, replace = TRUE, probs2b))
    set.seed(7)
    f <- as.factor(sample(myset2, N, replace = TRUE, probs2c))
    mydata <- data.frame(a, b, c, d, e, f, y)

    #-------------------------------------------------
    # Single Random Forests run with predictor importance.
    #-------------------------------------------------
    library(randomForest)
    set.seed(123)
    rf1 <- randomForest(y ~ ., data = mydata, importance = TRUE)
    importance(rf1)[, c(3:4)]

    #-------------------------------------------------
    # Bootstrapping run
    #-------------------------------------------------
    library(boot)

    ### Defining two functions to be used for bootstrapping:

    # myrf3 returns MeanDecreaseAccuracy:
    myrf3 <- function(usedata, idx){
      set.seed(123)
      out <- randomForest(y ~ ., data = usedata[idx, ], importance = TRUE)
      return(importance(out)[, 3])
    }

    # myrf4 returns MeanDecreaseGini:
    myrf4 <- function(usedata, idx){
      set.seed(123)
      out <- randomForest(y ~ ., data = usedata[idx, ], importance = TRUE)
      return(importance(out)[, 4])
    }

    ### 2 bootstrap runs:
    rfboot3 <- boot(mydata, myrf3, R = 10)
    rfboot4 <- boot(mydata, myrf4, R = 10)

    ### Results
    rfboot3  # for MeanDecreaseAccuracy
    colMeans(rfboot3$t) - importance(rf1)[, 3]
    rfboot4  # for MeanDecreaseGini
    colMeans(rfboot4$t) - importance(rf1)[, 4]

--
Dimitri Liakhovitski
Re: [R] Predictor Importance in Random Forests and bootstrap
I **think** this kind of methodological issue might be better at Cross Validated (stats.stackexchange.com). It's not really about R programming, which is the main focus of this list. And yes, I know they do intersect. Nevertheless...

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics

On Mon, Jan 27, 2014 at 3:47 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote:
> Question: Could anyone please explain why I am getting such a huge positive bias across the board (for all predictors) for MeanDecreaseAccuracy?
[R] KnitR/RMarkdown: Is there a way to not print a section of the document?
I've been looking through the R documents to see if there's a way to not output certain chunks of code. I'm trying to present a document to a team of folks that won't necessarily be interested in the line-by-line code, though they are interested in the charts, etc. Thus, I'd like to not output certain chunks of code. Is there a way to suppress sections?

Thank you.

--
Jeff
Re: [R] KnitR/RMarkdown: Is there a way to not print a section of the document?
In the chunk options, you can use the argument echo = FALSE to suppress display of the R code in the output.

<<echo = FALSE>>=
# R code
@

This will still print out results from R that would be sent to the command line (like print() statements, cat() statements, results from summary(), etc.), but the rest of the code in the chunk would be hidden from the output document.

To make it easier to turn many code chunks on and off at once, define a variable as TRUE/FALSE early in the document, and then use that variable as the argument in the subsequent chunk options:

<<echo = FALSE>>=
showcode = FALSE
@

Every other chunk gets this argument:

<<echo = showcode>>=
# R code. If showcode is FALSE, this code would be run, but not displayed in the output document.
@

On Mon, Jan 27, 2014 at 4:49 PM, Jeff Johnson mrjeffto...@gmail.com wrote:
> I've been looking through the R documents to see if there's a way to not output certain chunks of code. I'm trying to present a document to a team of folks that won't necessarily be interested in the line-by-line code, though they are interested in the charts, etc. Thus, I'd like to not output certain chunks of code. Is there a way to suppress sections?
>
> Thank you.
>
> --
> Jeff

--
Luke Miller
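The per-chunk switch described above can also be set once from R, so that individual chunk headers do not have to repeat it. This is only a minimal sketch, assuming the knitr package is installed and that the code sits in an early setup chunk; `showcode` is the variable name from the message above, and `opts_chunk$set()` is knitr's function for changing chunk-option defaults.

```r
library(knitr)

# One flag controls whether code is echoed in the output document.
showcode <- FALSE

# Change the default for every chunk that follows; an individual chunk
# can still override it in its own header.
opts_chunk$set(echo = showcode)

# Inspect the default that later chunks will inherit.
opts_chunk$get("echo")
```

Flipping `showcode` to TRUE and re-knitting should then show the code again for readers who want it.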
[R] Error msg while trying to install package ncdf4
Hi all,

I'm unable to install the package ncdf4 and below is the message that I'm getting. Any thoughts on how to fix the problem will be very much appreciated.

* installing *source* package ncdf4 ...
** package ncdf4 successfully unpacked and MD5 sums checked
configure.in: starting
checking for nc-config... no
---
Error, nc-config not found or not executable. This is a script that comes with the netcdf library, version 4.1-beta2 or later, and must be present for configuration to succeed.

If you installed the netcdf library (and nc-config) in a standard location, nc-config should be found automatically. Otherwise, you can specify the full path and name of the nc-config script by passing the --with-nc-config=/full/path/nc-config argument flag to the configure script. For example:

./configure --with-nc-config=/sw/dist/netcdf4/bin/nc-config

Special note for R users:
-
To pass the configure flag to R, use something like this:

R CMD INSTALL --configure-args=--with-nc-config=/home/joe/bin/nc-config ncdf4

where you should replace /home/joe/bin etc. with the location where you have installed the nc-config script that came with the netcdf 4 distribution.
---
ERROR: configuration failed for package ncdf4
* removing /home/armel/R/x86_64-pc-linux-gnu-library/3.0/ncdf4

The downloaded source packages are in
        /tmp/RtmpKkBbop/downloaded_packages
Warning message:
In install.packages("ncdf4") :
  installation of package ncdf4 had non-zero exit status

--
Armel KAPTUE, Ph.D.
GIScCE SDSU
1021 Medary Av., Wecota Hall Box 506 B
Brookings, SD 57007 USA
Tel: +1 605 688 6255
Fax: +1 605 688 5227
Email: armel.kap...@sdstate.edu
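The configure message above boils down to two steps: find out whether nc-config is visible to R, and if it lives somewhere unusual, hand its path to the configure script. A rough sketch of that from inside R follows; the helper names `find_nc_config` and `install_ncdf4` are made up for illustration, while `Sys.which()` and the `configure.args` argument of `install.packages()` are base R. On many Linux systems nc-config comes with the netCDF development package (for example libnetcdf-dev on Debian/Ubuntu), which has to be installed first.

```r
# Hypothetical helper: look for nc-config on the PATH.
# Sys.which() returns "" when the program is not found.
find_nc_config <- function() {
  unname(Sys.which("nc-config"))
}

# Hypothetical helper: install ncdf4 from source, passing the location of
# nc-config on to the package's configure script.
install_ncdf4 <- function(nc_config = find_nc_config()) {
  if (!nzchar(nc_config)) {
    stop("nc-config not found; install the netCDF library (with nc-config) first")
  }
  install.packages(
    "ncdf4",
    configure.args = paste0("--with-nc-config=", nc_config)
  )
}

# Only the check is run here; call install_ncdf4() yourself once it succeeds.
find_nc_config()
```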
[R] Use calc function to do RKT over raster stack
I am trying to do the Mann-Kendall, Sen's Slope and Regional Kendall Test (rkt package) over a 30 year time period spatially. I have created a raster stack but am unable to figure out how to use the calc function to do rkt. This is what I have...

#create vector of years for tests
Years <- c(1978:2007)

#create stack of rasters
ThirtyYrStack <- stack(Map1978, Map1979, Map1980, Map1981, Map1982, Map1983,
                       Map1984, Map1985, Map1986, Map1987, Map1988, Map1989,
                       Map1990, Map1991, Map1992, Map1993, Map1994, Map1995,
                       Map1996, Map1997, Map1998, Map1999, Map2000, Map2001,
                       Map2002, Map2003, Map2004, Map2005, Map2006, Map2007)

##create function of rkt package
MK_test <- function(x, y) { rkt(x, y) }
ThirtyYr.mk <- calc(ThirtyYrStack, MK_test(Years, ThirtyYrStack))

I guess I am confused on what to pass into calc. I don't think the problem is with the stack itself.

Alyssa

--
View this message in context: http://r.789695.n4.nabble.com/Use-calc-function-to-do-RKT-over-raster-stack-tp4684265.html
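One likely source of the confusion: calc() takes a function as its second argument, not the result of calling one, and that function is handed the values of a single cell across all layers. The sketch below illustrates the shape of such a per-cell function using a hand-written Sen's slope in base R, so it runs without the raster or rkt packages; the name `sen_slope`, the lower-case `years` vector and the small `cells` matrix are stand-ins invented for the illustration, not part of the original post.

```r
years <- 1978:2007

# Per-cell function: v is the vector of 30 yearly values for ONE cell.
# Sen's slope is the median of the slopes between all pairs of years.
sen_slope <- function(v) {
  if (all(is.na(v))) return(NA_real_)
  idx <- combn(length(v), 2)
  slopes <- (v[idx[2, ]] - v[idx[1, ]]) / (years[idx[2, ]] - years[idx[1, ]])
  median(slopes, na.rm = TRUE)
}

# Stand-in for a raster stack: one row per cell, one column per year.
cells <- rbind(2 * years + 1,   # steadily increasing cell
               rep(5, 30))      # constant cell
slopes <- apply(cells, 1, sen_slope)
slopes

# With the raster package loaded, the equivalent call would presumably be
#   slope_map <- calc(ThirtyYrStack, fun = sen_slope)
# and an rkt-based version would wrap rkt(years, v) in the same way,
# returning one number per cell.
```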
Re: [R] KnitR/RMarkdown: Is there a way to not print a section of the document?
Hi

I use Sweave and have a master Rnw file and parent files. If there are large chunks, I split them up and then just put a % in front of the \SweaveInput if unwanted. Otherwise I split up the tex files with \input and \includeonly.

You could get into the chunk options and change things there, but that is fiddly if you want to reuse.

Regards

Duncan

Duncan Mackay
Department of Agronomy and Soil Science
University of New England
Armidale NSW 2351
Email: home: mac...@northnet.com.au

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jeff Johnson
Sent: Tuesday, 28 January 2014 10:49
To: R help
Subject: [R] KnitR/RMarkdown: Is there a way to not print a section of the document?

I've been looking through the R documents to see if there's a way to not output certain chunks of code. I'm trying to present a document to a team of folks that won't necessarily be interested in the line-by-line code, though they are interested in the charts, etc. Thus, I'd like to not output certain chunks of code. Is there a way to suppress sections?

Thank you.

--
Jeff
Re: [R] Overlaying two graphs using ggplot2 in R
On Jan 27, 2014, at 3:13 AM, Kristi Glover wrote:

> Hi R Users,
> I was struggling to overlay two graphs created from the two different dataset using ggplot2. Furthermore, I could not join means of the box plots.
> I tried this way but did not work. Any suggestions?
>
> dat1 <- structure(list(site = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L),
>   layer = structure(c(2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L),
>     .Label = c("bottom", "top"), class = "factor"),
>   Present = c(120L, 125L, 123L, 23L, 21L, 19L, 131L, 124L, 127L, 24L, 27L, 25L, 145L, 143L, 184L, 29L, 14L, 17L, 38L)),
>   .Names = c("site", "layer", "Present"), row.names = c(NA, 19L), class = "data.frame")
> dat1
> dat2 <- structure(list(site = 1:3, present = c(-3L, 2L, 5L)),
>   .Names = c("site", "present"), row.names = c(NA, 3L), class = "data.frame")
> dat2
> library(plyr)
> library(ggplot2)
> A <- ggplot(dat1, aes(x = factor(site), y = Present, colour = layer, fill = layer)) +
>   geom_boxplot(outlier.shape = 16, outlier.size = 1) + theme_bw() + ylim(0, 185)
> # Here I wanted to join the means of the boxplots among the sites, but I could not join it.
> B <- ggplot(dat2, aes(x = factor(site), y = present, colour = "blue") + geom_line() + geom_point())
> # wanted to plot it using second y axis.
> A + B
> Thanks for your help.

I believe you may be able to find 'ggplot2' hacks that deliver double ordinate plots, but Hadley Wickham has an intense aversion to such plots, so you won't find them in standard `ggplot2` functions. Do a search on StackOverflow and in the archives. I cannot remember the names of the functions or the authors, but I do remember experiencing the heresy of double ordinates within the vaulted arches of the Church of GGplot.

> [[alternative HTML version deleted]]

And you should learn to post in plain text.

--
David Winsemius
Alameda, CA, USA
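For the part of the question that does not need a second axis, two things usually get the job done: `stat_summary()` can draw and join the group means, and a layer can be given its own `data` argument so that both data sets share one plot instead of adding two plots together. The following is only a sketch, assuming a reasonably recent ggplot2 (the `fun` argument replaced `fun.y` in version 3.3.0); the data are retyped from the post, and the second series is drawn on the same y axis rather than on a secondary one.

```r
library(ggplot2)

# Data retyped from the original post.
dat1 <- data.frame(
  site    = rep(1:3, times = c(6, 6, 7)),
  layer   = factor(rep(c("top", "bottom", "top", "bottom", "top", "bottom"),
                       times = c(3, 3, 3, 3, 3, 4))),
  Present = c(120, 125, 123, 23, 21, 19, 131, 124, 127, 24, 27, 25,
              145, 143, 184, 29, 14, 17, 38)
)
dat2 <- data.frame(site = 1:3, present = c(-3, 2, 5))

# Same dodge for boxes and means, so the means sit on their boxes.
dodge <- position_dodge(width = 0.75)

p <- ggplot(dat1, aes(x = factor(site), y = Present, colour = layer)) +
  geom_boxplot(position = dodge) +
  # join the per-site means of each layer
  stat_summary(aes(group = layer), fun = mean, geom = "line", position = dodge) +
  stat_summary(aes(group = layer), fun = mean, geom = "point", position = dodge) +
  # second data set as extra layers on the same panel
  geom_line(data = dat2, aes(x = factor(site), y = present, group = 1),
            colour = "blue", inherit.aes = FALSE) +
  geom_point(data = dat2, aes(x = factor(site), y = present),
             colour = "blue", inherit.aes = FALSE) +
  theme_bw()

print(p)
```

The original `ylim(0, 185)` is left out here because it would drop the negative value in `dat2`.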
Re: [R] KnitR/RMarkdown: Is there a way to not print a section of the document?
Similarly, you can split a large input document into child documents in knitr, e.g.

<<chap1, child='chap1.Rnw'>>=
@

You can comment out this chunk when you do not need it. Or control it programmatically,

<<setup, include=FALSE>>=
include_me = TRUE # or FALSE
@

<<chap1, child=if (include_me) 'chap1.Rnw'>>=
@

Regards,
Yihui
--
Yihui Xie xieyi...@gmail.com
Web: http://yihui.name

On Mon, Jan 27, 2014 at 8:20 PM, Duncan Mackay dulca...@bigpond.com wrote:
> Hi
> I use Sweave and have a master Rnw file and parent files. If there are large chunks I split them up and then just put a % in front of the \SweaveInput if unwanted. Otherwise I split up the tex files with \input and \includeonly
> You could get into the chunk options and change things there but that is fiddly if you want to reuse
> Regards
> Duncan
>
> Duncan Mackay
> Department of Agronomy and Soil Science
> University of New England
> Armidale NSW 2351
> Email: home: mac...@northnet.com.au
>
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jeff Johnson
> Sent: Tuesday, 28 January 2014 10:49
> To: R help
> Subject: [R] KnitR/RMarkdown: Is there a way to not print a section of the document?
>
> I've been looking through the R documents to see if there's a way to not output certain chunks of code. I'm trying to present a document to a team of folks that won't necessarily be interested in the line-by-line code, though they are interested in the charts, etc. Thus, I'd like to not output certain chunks of code. Is there a way to suppress sections?
>
> Thank you.
>
> --
> Jeff
Re: [R] subset a data frame into multiple data frames
Hi,
Try ?split()

If `dat1` is the dataset:

lst1 <- split(dat1, dat1$ID)
lst1$an1
#    ID V1     mean       SD       SE
# 1 an1  5 72.21719 22.27118 9.092172
# 2 an1  6 100.0          NA       NA
lst1$an2
#    ID V1     mean       SD       SE
# 3 an2  5 79.27999 25.08938  10.2427
# 4 an2  6 100.0          NA       NA

A.K.

I have a data frame with an ID column and multiple IDs with rows of data that correspond to them. I need to subset the data frame such that each ID becomes its own data frame, and the name of the data frame should be its ID. I have quite a few IDs so doing this one subset at a time would be very tedious, but I have not been able to figure out how to loop it. I read about using write.table in a for loop to generate multiple excel files, but I want these data frames to remain in R. Here is a small example of my data frame (called data).

ID  V1 mean     SD       SE
an1 5  72.21719 22.27118 9.092172
an1 6  100.0    NA       NA
an2 5  79.27999 25.08938 10.242698
an2 6  100.0    NA       NA

After subsetting it, I want to be able to enter print(an1) and get

ID  V1 mean     SD       SE
an1 5  72.21719 22.27118 9.092172
an1 6  100.0    NA       NA

Similarly, if I enter print(an2) I should get

ID  V1 mean     SD       SE
an2 5  79.27999 25.08938 10.242698
an2 6  100.0    NA       NA

I've looked around online, but I haven't been able to find the answer. Thanks for the help.
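The split() answer leaves the pieces inside a list, whereas the question asks for objects that can be printed by name. If that is really wanted, `list2env()` can copy the list elements into the workspace. A small sketch with the example data retyped as `dat` (a name chosen here to avoid masking the base function `data`):

```r
# Example data retyped from the question.
dat <- data.frame(
  ID   = c("an1", "an1", "an2", "an2"),
  V1   = c(5, 6, 5, 6),
  mean = c(72.21719, 100, 79.27999, 100),
  SD   = c(22.27118, NA, 25.08938, NA),
  SE   = c(9.092172, NA, 10.242698, NA)
)

# One data frame per ID, held in a named list.
lst <- split(dat, dat$ID)

# Copy each list element into the global environment under its ID.
list2env(lst, envir = globalenv())

print(an1)
print(an2)
```

Keeping the data frames in the list (`lst$an1`, or `lapply(lst, ...)`) is often easier to work with once there are many IDs, which is presumably why the answer stops at split().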
Re: [R] Overlaying two graphs using ggplot2 in R
Hi Kristi

Jim has given you one non-ggplot2 solution; here is one from lattice. panel.average is a line, so I added a line for points.

library(lattice)

# convert site to a factor (stored in a new column, Site)
dat1$Site <- factor(dat1$site)
datav <- aggregate(Present ~ Site, dat1, mean)
datav
diff(datav[, 2])

# test
bw1 <- bwplot(Present ~ Site, data = subset(dat1, layer == "top"),
              ylim = c(0, 200),
              panel = function(x, y, ...) {
                panel.bwplot(x, y, ...)
                panel.average(x = x, y = y, ...)
              })
bw2 <- bwplot(Present ~ Site, data = subset(dat1, layer == "bottom"),
              ylim = c(0, 200),
              panel = function(x, y, ...) {
                panel.bwplot(x, y, ...)
                # overall mean if needed
                panel.average(dat1[, "Site"], dat1[, "Present"], ...)
              })
print(bw1, more = TRUE)
print(bw2, more = FALSE)

Duncan

Duncan Mackay
Department of Agronomy and Soil Science
University of New England
Armidale NSW 2351
Email: home: mac...@northnet.com.au

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Kristi Glover
Sent: Monday, 27 January 2014 21:13
To: R-help
Subject: [R] Overlaying two graphs using ggplot2 in R

Hi R Users,
I was struggling to overlay two graphs created from the two different dataset using ggplot2. Furthermore, I could not join means of the box plots.
I tried this way but did not work. Any suggestions?

dat1 <- structure(list(site = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L),
  layer = structure(c(2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L),
    .Label = c("bottom", "top"), class = "factor"),
  Present = c(120L, 125L, 123L, 23L, 21L, 19L, 131L, 124L, 127L, 24L, 27L, 25L, 145L, 143L, 184L, 29L, 14L, 17L, 38L)),
  .Names = c("site", "layer", "Present"), row.names = c(NA, 19L), class = "data.frame")
dat1
dat2 <- structure(list(site = 1:3, present = c(-3L, 2L, 5L)),
  .Names = c("site", "present"), row.names = c(NA, 3L), class = "data.frame")
dat2
library(plyr)
library(ggplot2)
A <- ggplot(dat1, aes(x = factor(site), y = Present, colour = layer, fill = layer)) +
  geom_boxplot(outlier.shape = 16, outlier.size = 1) + theme_bw() + ylim(0, 185)
# Here I wanted to join the means of the boxplots among the sites, but I could not join it.
B <- ggplot(dat2, aes(x = factor(site), y = present, colour = "blue") + geom_line() + geom_point())
# wanted to plot it using second y axis.
A + B

Thanks for your help.
KG
[R] Assigning a factor to a data frame
I created an empty data frame this way:

forsentest <- data.frame(matrix(nrow = nod, ncol = f))

Then I tried to assign one row of another data frame, forsen, to it:

forsentest[1, ] <- forsen[1, ]

But the factors in forsen get converted to numbers in forsentest, which is not what I want. Is there another way around it?

Tjun Kiat
Re: [R] [BioC] problem in getVarianceStabilizedData
Thanks a lot Michael,
That's a great help.
Best wishes,
Suparna

Dr. Suparna Mitra
Department of Molecular and Clinical Pharmacology
Institute of Translational Medicine
University of Liverpool
Block A: Waterhouse Buildings
1-5 Brownlow Street
Liverpool L69 3GL
Tel. +44 (0)151 795 5414
M: +44 (0) 7523228621
Internal ext: 55401

On 27 January 2014 21:18, Michael Love michaelisaiahl...@gmail.com wrote:
> hi Suparna,
>
> CountDataSet is the class used in DESeq, while DESeqDataSet is the class used in DESeq2. You can convert a CountDataSet to a DESeqDataSet using the steps outlined in the vignette, 1.2.3 Count matrix input.
>
> Mike
>
> On Sun, Jan 26, 2014 at 11:28 PM, Suparna Mitra suparna.mitra...@gmail.com wrote:
>> Hi All,
>> I am having a problem while running getVarianceStabilizedData in DESeq2 package.
>>
>> data.vsd <- getVarianceStabilizedData(data)
>> Error in (function (classes, fdef, mtable) :
>>   unable to find an inherited method for function 'dispersionFunction' for signature '"CountDataSet"'
>>
>> Though the function looks okay
>>
>> dispersionFunction
>> standardGeneric for "dispersionFunction" defined from package "DESeq2"
>>
>> function (object)
>> standardGeneric("dispersionFunction")
>> <environment: 0x7fe7a9c5d140>
>> Methods may be defined for arguments: object
>> Use showMethods("dispersionFunction") for currently available ones.
>>
>> Can anybody please help?
>> Thanks,
>> Mitra.
>>
>> ___
>> Bioconductor mailing list
>> bioconduc...@r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
Re: [R] Assigning a factor to a data frame
Hi,
Try:

Either:
forsentest[1, ] <- unlist(forsen[1, ])
#or
forsen[] <- lapply(forsen, as.character)
forsentest[1, ] <- forsen[1, ]

A.K.

On Monday, January 27, 2014 10:38 PM, Tjun Kiat Teo teotj...@gmail.com wrote:
> I created an empty data frame this way:
>
> forsentest <- data.frame(matrix(nrow = nod, ncol = f))
>
> Then I tried to assign one row of another data frame, forsen, to it:
>
> forsentest[1, ] <- forsen[1, ]
>
> But the factors in forsen get converted to numbers in forsentest, which is not what I want. Is there another way around it?
>
> Tjun Kiat
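To make the behaviour concrete, here is a small self-contained sketch with a made-up two-column `forsen`; the column names and the value of `nod` are invented for the illustration. It shows the character-conversion route from the answer above on a copy of the data, plus an alternative that pre-allocates the empty data frame from `forsen` itself so that the factor columns stay factors.

```r
# Made-up stand-in for the poster's data.
forsen <- data.frame(id = 1:2, grp = factor(c("a", "b")))
nod <- 3

# Route 1: convert factor columns to character on a copy, then copy a row.
forsen_chr <- forsen
is_fac <- vapply(forsen_chr, is.factor, logical(1))
forsen_chr[is_fac] <- lapply(forsen_chr[is_fac], as.character)

forsentest <- data.frame(matrix(nrow = nod, ncol = ncol(forsen)))
names(forsentest) <- names(forsen)
forsentest[1, ] <- forsen_chr[1, ]

# Route 2: build the empty frame by indexing forsen with NA rows, which
# keeps each column's class (and the factor levels).
forsentest2 <- forsen[rep(NA_integer_, nod), ]
rownames(forsentest2) <- NULL
forsentest2[1, ] <- forsen[1, ]

str(forsentest)
str(forsentest2)
```

Route 2 only accepts values that are already among the factor's levels, which may or may not suit the real data.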
[R] How do you install cran mac binaries
Sorry if the question is stupid, but how do you install Mac OS binaries like this one:

http://cran.r-project.org/bin/macosx/contrib/r-release/forecast_5.0.tgz
Re: [R] How do you install cran mac binaries
As you install basically all CRAN packages on all OSes:

install.packages("forecast")

/Henrik

On Mon, Jan 27, 2014 at 8:18 PM, ce zadi...@excite.com wrote:
> Sorry if the question is stupid, but how do you install Mac OS binaries like this one:
>
> http://cran.r-project.org/bin/macosx/contrib/r-release/forecast_5.0.tgz
[R] Markov chain simulation
Hi there,

I wonder if there is a way in R to simulate a discrete Markov chain with a specific number of occurrences of a state, knowing the transition matrix. For example, how to simulate a Markov chain of length n with p occurrences (p < n) of the state '0' for a transition matrix defined by:

TransitionMatrix <- matrix(c(0.7, 0.3, 0.4, 0.6), byrow = TRUE, nrow = 2)
colnames(TransitionMatrix) <- c('0', '1')
row.names(TransitionMatrix) <- c('0', '1')

Thanks,

--
Armel
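One simple, if not the most efficient, way to approach this is rejection sampling: simulate ordinary chains from the transition matrix and keep the first one that happens to contain exactly p occurrences of the state. The sketch below does that in base R; the function names, the random choice of the initial state and the `max_tries` limit are assumptions made for the illustration, not something specified in the question.

```r
TransitionMatrix <- matrix(c(0.7, 0.3, 0.4, 0.6), byrow = TRUE, nrow = 2)
colnames(TransitionMatrix) <- c("0", "1")
row.names(TransitionMatrix) <- c("0", "1")

# Simulate one chain of length n; each step samples the next state from
# the row of P that belongs to the current state.
simulate_chain <- function(P, n, init = rownames(P)[1]) {
  states <- rownames(P)
  x <- character(n)
  x[1] <- init
  for (t in seq_len(n - 1)) {
    x[t + 1] <- sample(states, 1, prob = P[x[t], ])
  }
  x
}

# Rejection sampling: repeat until a chain has exactly p visits to `state`.
simulate_with_count <- function(P, n, p, state = "0", max_tries = 10000) {
  for (i in seq_len(max_tries)) {
    x <- simulate_chain(P, n, init = sample(rownames(P), 1))
    if (sum(x == state) == p) return(x)
  }
  stop("no chain with the requested count found in max_tries attempts")
}

set.seed(1)
chain <- simulate_with_count(TransitionMatrix, n = 20, p = 12)
table(chain)
```

This gets slow when p is far from the number of visits the chain typically makes, in which case a smarter conditional sampler would be needed.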
[R] Arguments in functions when packaging
Hi everybody,

I have a question about function arguments when packaging: does it make sense to have dots as an argument when it is the only argument? I mean, you have a package, and a function that will be used directly by the user has dots as an argument; for example:

ReadData <- function (...) {
  InPutParams(kValidate, ...)
  ReadAll(...)
}

Does it make sense for ReadData to have dots as its argument, and no other argument? I ask this question because I have created a package and I use dots as the argument (only dots) in several visible functions, because this makes validating the values passed by the user easier for me (within the InPutParams function). But I have consulted several packages and I don't see visible functions with dots as the only argument.

Thank you in advance.

Regards.

Eva
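A dots-only signature is legal R, but it moves all argument checking into the function body. The sketch below shows what that hand-written checking might look like; `ReadDataSketch` and the allowed names `file` and `sep` are hypothetical and are not taken from Eva's package.

```r
ReadDataSketch <- function(...) {
  args <- list(...)
  allowed <- c("file", "sep")   # hypothetical parameter names
  arg_names <- names(args)

  # With only `...`, R does no argument matching for us, so unnamed or
  # misspelled arguments have to be caught by hand.
  if (length(args) > 0 && (is.null(arg_names) || any(arg_names == ""))) {
    stop("all arguments must be named")
  }
  unknown <- setdiff(arg_names, allowed)
  if (length(unknown) > 0) {
    stop("unknown argument(s): ", paste(unknown, collapse = ", "))
  }
  args
}

res <- ReadDataSketch(file = "data.csv", sep = ";")
bad <- tryCatch(ReadDataSketch(fiel = "data.csv"),
                error = function(e) conditionMessage(e))
bad
```

With explicit formal arguments, R reports unused arguments itself, and `args()` and the help page show users what the function accepts, which may be why dots-only signatures are uncommon in exported functions.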