[R] bayesm - question about 'rscaleUsage function'
Hi all,

I have managed to get the rscaleUsage algorithm to work, but I need to obtain the final results from it. As I understand it, this code is designed to generate a matrix after processing and store it somewhere. Here is the code. I follow this part, it all makes sense:

## if(nchar(Sys.getenv("LONG_TEST")) != 0) {R=1000} else {R=5}
{
data(customerSat)
surveydat = list(k=10, x=as.matrix(customerSat))
Mcmc1 = list(R=R)
set.seed(66)
out = rscaleUsage(Data=surveydat, Mcmc=Mcmc1)
summary(out$mudraw)
}

My question is: how do I retrieve the results from this in matrix format? I want to extract the matrix in its complete form and save it as a file for further processing. What I want to get is:

     V1   V2   V3   V4
V1 1.47 0.99 1.14 1.56
V2 1.67 0.93 1.35 1.21

If I type 'out' (see below) in the GUI interface I get the following:

> out
     [,1288] [,1289] [,1290] [,1291] ...
[3,]    1.47    0.99    1.14    1.56
[4,]    1.17    0.96    0.63    1.32
[5,]    1.65    0.51    0.39    1.62
...

> summary(out)
           Length Class      Mode
Sigmadraw   500   bayesm.var numeric
mudraw       50   bayesm.mat numeric
taudraw    9055   bayesm.mat numeric
sigmadraw  9055   bayesm.mat numeric
Lambdadraw   20   bayesm.mat numeric
edraw         5   bayesm.mat numeric

Thanks
Paul

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] how to convert decimal date to its equivalent date format(YYYY.mm.dd.hr.min.sec)
Hello R Users,

How do I convert a decimal date to a date of the form YYYY.mm.dd.hr.min.sec? I have decimal dates in one column, and want to convert each one and write the equivalent date (YYYY.mm.dd.hr.min.sec) into the next six columns, e.g.:

1979.00
1979.020833
1979.041667
1979.062500

Is it possible in R? Kindly help.

Regards,
Yogesh

--
Dr. Yogesh K. Tiwari,
Scientist,
Indian Institute of Tropical Meteorology,
Homi Bhabha Road, Pashan, Pune-411008, INDIA
Phone: 0091-99 2273 9513 (Cell); 0091-20-258 93 600 (O) (Ext. 250)
Fax: 0091-20-258 93 825
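One possible approach (a sketch, not from the thread): treat the fractional part of the decimal date as a fraction of that year's total seconds. The values below are the ones from the question; each year's actual length is measured, so leap years come out right.

```r
dec <- c(1979.00, 1979.020833, 1979.041667, 1979.062500)
yr  <- floor(dec)
start <- as.POSIXct(paste(yr,     "-01-01", sep = ""), tz = "UTC")
end   <- as.POSIXct(paste(yr + 1, "-01-01", sep = ""), tz = "UTC")
secs  <- as.numeric(difftime(end, start, units = "secs"))  # length of each year in seconds
dt <- start + (dec - yr) * secs
format(dt, "%Y.%m.%d.%H.%M.%S")
```

The six components can then be written into separate columns with, e.g., format(dt, "%m"), format(dt, "%d"), and so on.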
Re: [R] About grep
Shao wrote:
> Hi, everyone. I have a problem when using grep. For example:
>
> a <- c("aa", "aba", "abac")
> b <- c("ab", "aba")
>
> I want to match the whole word, so grep("^aba$", a) returns 2. But when I used it in a more useful way, grep("^b[2]$", a), it doesn't work at all; it can't find it, returning integer(0). How can I change the format in the second way? Thanks.

The regexp ^b[2]$ matches only two words: "b2" and "b" (in your present grep call with the defaults extended = TRUE, perl = FALSE, fixed = FALSE). They are not present in a, which is why grep returns integer(0). If you want to find the word "b[2]" (the expression that looks like an array-indexing operator), either use fixed = TRUE and remove any regexp markup (^ and $):

a <- c("aa", "aba", "abac", "b[2]")
grep("b[2]", a, fixed = TRUE)
[1] 4

or use escapes:

grep("^b\\[2\\]$", a)
[1] 4
Re: [R] About grep
On 07-Aug-07 04:20:27, Shao wrote:
> Hi, everyone. I have a problem when using grep. For example:
>
> a <- c("aa", "aba", "abac")
> b <- c("ab", "aba")
>
> I want to match the whole word, so grep("^aba$", a) returns 2. But when I used it in a more useful way, grep("^b[2]$", a), it doesn't work at all; it can't find it, returning integer(0). How can I change the format in the second way? Thanks.
> --
> Shao

The problem is that in the string "^b[2]$" the element b[2] of b is not evaluated; the successive characters ^ b [ 2 ] $ are simply passed to grep as a character string, which of course is not found. You can construct a character string with the value of b[2] in it by using paste():

grep(paste("^", b[2], "$", sep = ""), a)
[1] 2

Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 07-Aug-07  Time: 07:43:14
-- XFMail --
Re: [R] bayesm - question about 'rscaleUsage function'
paulandpen at optusnet.com.au writes:

> I get this part of the code, it all makes sense.
> ## if(nchar(Sys.getenv("LONG_TEST")) != 0) {R=1000} else {R=5}
> My question is how do I retrieve the results from this in a matrix format

str() is your friend...

> if(nchar(Sys.getenv("LONG_TEST")) != 0) {R=1000} else {R=5}
> {
+ data(customerSat)
+ surveydat = list(k=10, x=as.matrix(customerSat))
+ Mcmc1 = list(R=R)
+ set.seed(66)
+ out = rscaleUsage(Data=surveydat, Mcmc=Mcmc1)
+ summary(out$mudraw)
+ }
> str(out)
List of 6
 $ Sigmadraw : bayesm.var [1:5, 1:100] 7.37 7.04 5.56 5.00 4.50 ...
  ..- attr(*, "class")= chr [1:3] "bayesm.var" "bayesm.mat" "mcmc"
  ..- attr(*, "mcpar")= num [1:3] 1 5 1
 $ mudraw    : bayesm.mat [1:5, 1:10] 6.21 6.54 6.49 6.56 6.59 ...
  ..- attr(*, "class")= chr [1:2] "bayesm.mat" "mcmc"
  ..- attr(*, "mcpar")= num [1:3] 1 5 1
 $ taudraw   : bayesm.mat [1:5, 1:1811] 0.1862 -1.4988 -1.0348 0.0754 1.0258 ...
  ..- attr(*, "class")= chr [1:2] "bayesm.mat" "mcmc"
  ..- attr(*, "mcpar")= num [1:3] 1 5 1
 $ sigmadraw : bayesm.mat [1:5, 1:1811] 1.11 1.29 1.47 1.2 1.95 0.72 0.63 0.69 1.2 1.14 ...
  ..- attr(*, "class")= chr [1:2] "bayesm.mat" "mcmc"
  ..- attr(*, "mcpar")= num [1:3] 1 5 1
 $ Lambdadraw: bayesm.mat [1:5, 1:4] 3.35 2.73 2.58 2.49 2.73 ...
  ..- attr(*, "class")= chr [1:2] "bayesm.mat" "mcmc"
  ..- attr(*, "mcpar")= num [1:3] 1 5 1
 $ edraw     : Classes 'bayesm.mat', 'mcmc' atomic [1:5] 0.000 0.001 0.001 0.001 0.001 ...
  .. ..- attr(*, "mcpar")= num [1:3] 1 5 1

Looks like you want sigmadraw (lowercase!)? So try out$sigmadraw

Dieter
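To then save one of these draw matrices to a file for further processing, as the original question asks, one sketch is to drop the classes bayesm attaches and use write.table(). The structure() call below is a stand-in for out$sigmadraw, so the snippet runs without bayesm installed:

```r
# stand-in for out$sigmadraw: a plain matrix carrying the extra classes bayesm attaches
sig <- structure(matrix(rnorm(20), nrow = 5), class = c("bayesm.mat", "mcmc"))

m <- unclass(sig)   # drop the classes; an ordinary numeric matrix remains
write.table(m, file = "sigmadraw.txt", row.names = FALSE, col.names = FALSE)

m2 <- as.matrix(read.table("sigmadraw.txt"))   # read it back later
```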
[R] cannot add lines to plot
Hello,

I want to plot a time series and add lines to the plot later on. However, this seems to work only as long as I plot the series against the default index. As soon as I plot against an object of class chron or POSIXt (i.e. I want to add a date/time axis), the lines do not appear any more. The command to add the lines is executed without an error message.

(THIS DOES NOT ADD THE LINES)

plot(datum2[(3653):(3653+i)], dlindus[(3653):(3653+i)],
     col = hcl(h=60, c=35, l=60), ylim = c(-8,8), type = "l",
     xlab = (""), ylab = ("Return"), main = ("Industry"))
lines(gvarindus, type = "l", lwd = 2)
lines(quantindustlow, col = "black", type = "l", lty = 3)
lines(quantindusthigh, col = "black", type = "l", lty = 3)

(THIS ADDS THE LINES, but then I don't have a date axis)

plot(dlindus[(3653):(3653+i)],
     col = hcl(h=60, c=35, l=60), ylim = c(-8,8), type = "l",
     xlab = (""), ylab = ("Return"), main = ("Industry"))
lines(gvarindus, type = "l", lwd = 2)
lines(quantindustlow, col = "black", type = "l", lty = 3)
lines(quantindusthigh, col = "black", type = "l", lty = 3)

This sounds like a fairly simple problem, but I cannot find any answer in the R-help archives.

Thanks a lot.
Zeno
[R] help on glmmML
Hello!

I am using glmmML for a logistic regression with a random effect. I use the posterior.mode as an estimate for the random effects. These can be very different from the estimates obtained using SAS PROC NLMIXED via the RANDOM statement's out= option (all the fixed effects and the standard error of the random effect are almost identical). Can someone explain to me why that is?

The code I use:

R:

glmm1 <- glmmML(mort30 ~ x, data = dat2, cluster = hospital, family = binomial)
print(sort(glmm1$posterior.mode))

SAS:

proc nlmixed data=dat;
  eta = b0 + b1*x + u;
  expeta = exp(eta);
  p = expeta/(1+expeta);
  model mort30 ~ binomial(1,p);
  random u ~ normal(0,s2) subject=hospital out=blue;
run;
proc sort data=blue; by estimate;
proc print data=blue; run;

THANKS FOR THE HELP
RON
Re: [R] cannot add lines to plot
On Tue, 7 Aug 2007, Zeno Adams wrote:

> Hello,
> I want to plot a time series and add lines to the plot later on. However, this seems to work only as long as I plot the series against the default index. As soon as I plot against an object of class chron or POSIXt (i.e. I want to add a date/time axis), the lines do not appear any more. The command to add the lines is executed without an error message.

1. Read the posting guide!

2. Provide a reproducible example (as suggested in the posting guide).

3. Use a time series object for your time series, see e.g. package zoo. If you set up a multivariate time series, you can get what you want by

   plot(z, plot.type = "single")

   See the zoo vignettes for more information.

4. If you really want to modify the code below, you probably need to repeat (the right elements of) datum2 in lines(), e.g.,

   lines(datum2[...], gvarindus, lwd = 2)

   and you should really drop unnecessary brackets, as in ylab = ("Return").

hth,
Z

> (THIS DOES NOT ADD THE LINES)
> plot(datum2[(3653):(3653+i)], dlindus[(3653):(3653+i)],
>      col = hcl(h=60, c=35, l=60), ylim = c(-8,8), type = "l",
>      xlab = (""), ylab = ("Return"), main = ("Industry"))
> lines(gvarindus, type = "l", lwd = 2)
> lines(quantindustlow, col = "black", type = "l", lty = 3)
> lines(quantindusthigh, col = "black", type = "l", lty = 3)
>
> (THIS ADDS THE LINES, but then I don't have a date axis)
> plot(dlindus[(3653):(3653+i)],
>      col = hcl(h=60, c=35, l=60), ylim = c(-8,8), type = "l",
>      xlab = (""), ylab = ("Return"), main = ("Industry"))
> lines(gvarindus, type = "l", lwd = 2)
> lines(quantindustlow, col = "black", type = "l", lty = 3)
> lines(quantindusthigh, col = "black", type = "l", lty = 3)
>
> This sounds like a fairly simple problem, but I cannot find any answer in the R-help archives. Thanks a lot.
> Zeno
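A minimal sketch of point 3, with made-up data standing in for the series in the thread (datum2, dlindus, gvarindus are the poster's object names; the values here are invented):

```r
library(zoo)
datum2    <- seq(as.Date("2007-01-01"), by = "day", length = 100)  # invented dates
dlindus   <- zoo(rnorm(100), datum2)                               # invented series
gvarindus <- zoo(rnorm(100), datum2)
z <- merge(dlindus, gvarindus)                # multivariate zoo series
plot(z, plot.type = "single", lwd = c(1, 2))  # both series on one date axis
```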
Re: [R] Opening a script with the R editor via file association (on Windows)
Does ?Rscript help?

Christopher Green [EMAIL PROTECTED] 06/08/2007 23:13:47:

On Fri, 3 Aug 2007, Duncan Murdoch wrote:
> On 03/08/2007 7:16 PM, Christopher Green wrote:
>> Is there an easy way to open an R script file in the R Editor found in more recent versions of R on Windows via a file association? I am looking for functionality akin to how the .ssc file extension works for S-PLUS: upon double-clicking on a .R file, Rgui.exe starts up and loads the script file in the R Editor. As far as I can tell, Rgui.exe does not have a command line option to load a file (so associating .R with Rgui.exe %1 won't work). I can get Windows to start Rgui.exe when I double-click on a script file, but that's about it. I also don't see any way to have Rgui.exe evaluate an expression provided on the command line (which would allow file.edit("%1")). I also thought about creating a custom .First calling 'file.edit' in a batch file, but then I'd have to start R, load the .First file, then quit and restart R, so that's out. Is there an easy way to do this without resorting to some other R GUI?
>
> Easy? Not sure. But the following works: set up the association as
>
>   F:\R\R-2.5.1\bin\Rgui.exe --args %1
>
> (with obvious modifications to the path) and put this line in your RHOME/etc/Rprofile.site file:
>
>   utils::file.edit(commandArgs(TRUE))
>
> You could make things more sophisticated if you don't want a blank edit window to open in case you're not using this shortcut.
>
> Duncan Murdoch

Thanks to Duncan for pointing me towards commandArgs()... I was able to modify things so that the script editor only opens when there is a file to edit. Essentially, there are two steps:

1. Add the line

   if (length(z <- commandArgs(TRUE))) utils::file.edit(z[1])

   to the Rprofile.site file; and

2. Edit the registry to associate the .R extension with the command

   C:\Program Files\R\R-2.5.1\bin\Rgui.exe --args %1

I have written up some instructions on how to do this for the faculty member who asked me about it; others may find them useful:

http://www.stat.washington.edu/software/statsoft/R/Rwinassoc.shtml

Chris Green
[R] Naming Lists
Hi,

I'm pretty new to R and I have run into a problem. How do I name all the levels in this list?

Lev1 <- c("A1", "A2")
Lev2 <- c("B1", "B2")
Lev3 <- c("C1", "C2")

MyList <- lapply(TEST1, function(x){
  assign(x, x, where=1)
  lapply(TEST2, function(y){
    assign(y, y, where=1)
    lapply(TEST3, function(z){
      paste(unlist(x), unlist(y), unlist(z))
    })})})

I would like to name the different levels in the list with the different inputs, so that the first level is named by the inputs in Lev1, the second level by the inputs in Lev2, and so on. I would then like to use this call:

> MyList$A1$B1$C1
A1 B1 C1

Regards
Tom
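One way to get named levels (a sketch, not from the thread) is to wrap each lapply() in setNames(), so every level of the nested list is named by the corresponding Lev vector; no assign() calls are needed:

```r
Lev1 <- c("A1", "A2")
Lev2 <- c("B1", "B2")
Lev3 <- c("C1", "C2")

MyList <- setNames(lapply(Lev1, function(x)
            setNames(lapply(Lev2, function(y)
              setNames(lapply(Lev3, function(z)
                paste(x, y, z)), Lev3)), Lev2)), Lev1)

MyList$A1$B1$C1   # "A1 B1 C1"
```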
Re: [R] cannot add lines to plot
On Tue, 7 Aug 2007, Zeno Adams wrote:

> Hello,
> I want to plot a time series and add lines to the plot later on. However, this seems to work only as long as I plot the series against the default index. As soon as I plot against an object of class chron or POSIXt (i.e. I want to add a date/time axis), the lines do not appear any more. The command to add the lines is executed without an error message.
>
> (THIS DOES NOT ADD THE LINES)
> plot(datum2[(3653):(3653+i)], dlindus[(3653):(3653+i)],
>      col = hcl(h=60, c=35, l=60), ylim = c(-8,8), type = "l",
>      xlab = (""), ylab = ("Return"), main = ("Industry"))
> lines(gvarindus, type = "l", lwd = 2)
> lines(quantindustlow, col = "black", type = "l", lty = 3)
> lines(quantindusthigh, col = "black", type = "l", lty = 3)
>
> (THIS ADDS THE LINES, but then I don't have a date axis)
> plot(dlindus[(3653):(3653+i)],
>      col = hcl(h=60, c=35, l=60), ylim = c(-8,8), type = "l",
>      xlab = (""), ylab = ("Return"), main = ("Industry"))
> lines(gvarindus, type = "l", lwd = 2)
> lines(quantindustlow, col = "black", type = "l", lty = 3)
> lines(quantindusthigh, col = "black", type = "l", lty = 3)
>
> This sounds like a fairly simple problem, but I cannot find any answer in the R-help archives.

Look at the help for lines(): the standard call is

  lines(x, y, col = "black", type = "l", lty = 3)

and you have omitted x. See ?xy.coords for what happens then. I think the reason you did not find this in the archives is that this is a rare misreading (or non-reading) of the help pages.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK, Fax: +44 1865 272595
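A minimal sketch of the point, with invented data: when the plot's x axis is dates, lines() must be given matching x coordinates; otherwise the y values are plotted against the default index 1:n and can land entirely outside the visible x range.

```r
x <- as.Date("2007-01-01") + 0:99     # date axis
y <- cumsum(rnorm(100))
plot(x, y, type = "l")
lines(x, y + 1, lty = 3)   # x supplied: drawn on the date axis as intended
lines(y + 2, lty = 3)      # x omitted: plotted against 1:100, off the date axis
```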
[R] plotting series of matrices on a single plot.
Dear experts,

I have in all 14 matrices which stand for gene expression divergence and 14 matrices which stand for gene sequence divergence. I have tried joining them by using the concatenation function:

SequenceDivergence <- c(X1,X2,X3,X4,X5,X6,X7,X8,X9,X10,X11,X12,X13,X14)
ExpressionDivergence <- c(Y1,Y2,Y3,Y4,Y5,Y6,Y7,Y8,Y9,Y10,Y11,Y12,Y13,Y14)

where X1, X2, ..., X14 are the expression matrices containing r-values and Y1, Y2, ..., Y14 are the ones with patristic distances.

Now I want to plot SequenceDivergence vs. ExpressionDivergence. I tried doing that using

plot(SequenceDivergence, ExpressionDivergence)

but then I get the error

Error in as.double.default(x) : (list) object cannot be coerced to 'double'

Note: the diagonal values for X1 were neglected as they seem to give some bias in my results. The code to derive X1, X2, X3, ... and Y1, Y2, Y3, ... is:

EDant <- read.table("C:/ant.txt", sep="\t")
ED1 <- as.matrix(EDant)
X1 <- lapply(1:ncol(ED15), function(a) ED15[-a, a])  # to neglect diagonal values

I am attaching a couple of matrices for your reference. ANT.txt is X1 after deleting the diagonal values; similarly, ANTEXP.TXT is Y1, apetella.txt is X2 and apetellaexp.txt is Y2. I tried reading the R manual and other sources but haven't cracked it yet. Can anyone please help me with this?

Thanks very much.
Yours sincerely,
Urmi Trivedi.
Re: [R] Goodman-Kruskal tau
On Wed, 1 Aug 2007, Upasna Sharma [EMAIL PROTECTED] wrote:

> From: Upasna Sharma [EMAIL PROTECTED]
> Subject: [R] Goodman Kruskal's tau
> I need to know which package in R calculates the Goodman-Kruskal tau statistic for nominal data. Also, is there any implementation of multiple classification analysis (Andrews et al. 1973) in R? Any information on this would be greatly appreciated.

As far as I know, the Goodman-Kruskal tau has not yet been implemented in any package, but I've coded it as a function for my own research purposes following Liebetrau 1983; see the code attached below (save it and load it into R with the source() command). You simply provide the function with a matrix 'x' with the nominal data, and get the values of the G-K tau as well as the significance values and asymptotic standard errors, calculated in both directions, i.e. GK.tau(dat=x) or GK.tau(x), e.g.

> source("GK.tau.R")
> x
     [,1] [,2] [,3]
[1,]   30   10    1
[2,]    2   20    5
[3,]    1   10   15
> GK.tau(x)
     tau.RC    tau.CR     p.tau.RC     p.tau.CR  var.tau.RC ASE.tau.RC
1 0.3522439 0.3219675 2.002024e-13 3.065436e-12 0.004584542 0.06770924
  ASE.tau.CR
1 0.07004566

GK.tau <- function(dat)
{
  N <- sum(dat)
  dat.rows <- nrow(dat)
  dat.cols <- ncol(dat)
  max.col <- sum.col <- L.col <- matrix(, dat.cols)
  max.row <- sum.row <- L.row <- matrix(, dat.rows)
  for (i in 1:dat.cols) {
    sum.col[i] <- sum(dat[, i])
    max.col[i] <- max(dat[, i])
  }
  for (i in 1:dat.rows) {
    sum.row[i] <- sum(dat[i, ])
    max.row[i] <- max(dat[i, ])
  }
  max.row.margin <- max(apply(dat, 1, sum))
  max.col.margin <- max(apply(dat, 2, sum))

  # Goodman-Kruskal tau
  # Tau Column|Row
  n.err.unconditional <- N^2
  for (i in 1:dat.rows)
    n.err.unconditional <- n.err.unconditional - N*sum(dat[i, ]^2/sum.row[i])
  n.err.conditional <- N^2 - sum(sum.col^2)
  tau.CR <- 1 - (n.err.unconditional/n.err.conditional)
  v <- n.err.unconditional/(N^2)
  d <- n.err.conditional/(N^2)
  f <- d*(v + 1) - 2*v
  var.tau.CR <- 0
  for (i in 1:dat.rows)
    for (j in 1:dat.cols)
      var.tau.CR <- var.tau.CR +
        dat[i,j]*(-2*v*(sum.col[j]/N) + d*((2*dat[i,j]/sum.row[i]) - sum((dat[i,]/sum.row[i])^2)) - f)^2/(N^2*d^4)
  ASE.tau.CR <- sqrt(var.tau.CR)
  z.tau.CR <- tau.CR/ASE.tau.CR
  # Chi-squared approximation for H0 according to Margolin & Light 1974,
  # see also Liebetrau 1983
  U.tau.CR <- (N - 1)*(dat.cols - 1)*tau.CR
  p.tau.CR <- pchisq(U.tau.CR, df = (dat.rows - 1)*(dat.cols - 1), lower = FALSE)

  # Tau Row|Column
  n.err.unconditional <- N^2
  for (j in 1:dat.cols)
    n.err.unconditional <- n.err.unconditional - N*sum(dat[, j]^2/sum.col[j])
  n.err.conditional <- N^2 - sum(sum.row^2)
  tau.RC <- 1 - (n.err.unconditional/n.err.conditional)
  v <- n.err.unconditional/(N^2)
  d <- n.err.conditional/(N^2)
  f <- d*(v + 1) - 2*v
  var.tau.RC <- 0
  for (i in 1:dat.rows)
    for (j in 1:dat.cols)
      var.tau.RC <- var.tau.RC +
        dat[i,j]*(-2*v*(sum.row[i]/N) + d*((2*dat[i,j]/sum.col[j]) - sum((dat[,j]/sum.col[j])^2)) - f)^2/(N^2*d^4)
  ASE.tau.RC <- sqrt(var.tau.RC)
  z.tau.RC <- tau.RC/ASE.tau.RC
  # Chi-squared approximation for H0 according to Margolin & Light 1974,
  # see also Liebetrau 1983
  U.tau.RC <- (N - 1)*(dat.rows - 1)*tau.RC
  p.tau.RC <- pchisq(U.tau.RC, df = (dat.rows - 1)*(dat.cols - 1), lower = FALSE)

  results <- data.frame(tau.RC, tau.CR, p.tau.RC, p.tau.CR,
                        var.tau.RC, ASE.tau.RC, ASE.tau.CR)
  return(results)
}

Hope this helps.

-Antti

--
==
Antti Arppe - Master of Science (Engineering)
Researcher, doctoral student (Linguistics)
E-mail: [EMAIL PROTECTED]
WWW: http://www.ling.helsinki.fi/~aarppe
[R] R and Excel differences in calculation of the F distribution function
R-help,

I'm trying to work out the density function by doing:

> df(5.22245, 3, 22)
[1] 0.005896862
> df(15.20675, 6, 4)
[1] 0.001223825

In Excel the result is:

=FDIST(5.22245,3,22)  gives  0.0071060464
=FDIST(15.20675,6,4)  gives  0.011406

From my point of view the differences in the second case are substantial. Can anyone give me a hint on this?

Thanks in advance

Luis Ridao Cruz
Faroese Fisheries Laboratory
Nóatún 1, P.O. Box 3051
FR-110 Tórshavn
Faroe Islands
Tel: (+298) 353900; Tel (direct): (+298) 353912
Mob.: (+298) 580800; Fax: (+298) 353901
E-mail: [EMAIL PROTECTED]
[R] warning in library(GRASS)
Hi,

With your help I've found out that library(GRASS) is responsible for several warnings I'm obtaining. I've got a loop (~100 runs), and in every iteration I carry out several operations over a sequence of raster maps (12) read from the GRASS GIS. For every map and iteration I always use RGRASS.temp as the R workspace name for the imported raster files, no matter the name of the source GRASS raster map, e.g.:

RGRASS.temp <- rast.get(G,
    rlist = paste(A2.grassname, it, ".Bmaxind.", ib, sep=""),
    catlabels = FALSE)

As I always overwrite the same R object, I would hope that the number of calls to rast.get() has no influence on a warning like "Too many open raster files", which I'm getting after the fourth loop. Am I misinterpreting something about the way R objects are stored? Does every call to rast.get() generate a new internal copy of RGRASS.temp? If so, is there a way to solve this problem?

Thanks again and best wishes,
Javier

--
Javier García-Pintado
Institute of Earth Sciences Jaume Almera (CSIC)
Lluis Sole Sabaris s/n, 08028 Barcelona
Phone: +34 934095410
Fax: +34 934110012
e-mail: [EMAIL PROTECTED]
[R] Classifly problems
Hello,

I am trying to explore a classification with GGobi. I am trying to generate additional data according to the model so I can draw the decision boundaries on the GGobi plot. The problem is that I always get the same error:

Error in predict.lda(model, data): wrong number of variables

even though I know that I used the same number of variables for the model generation (6) and for the additional data generation (6 also). I paste the code I am using:

library(MASS)
Tumor <- c(rep("MM",20), rep("GBM",18), rep("LGG",17))
data.lda <- lda(data, Tumor)
data.ld <- predict(data.lda)
data.ldd <- data.frame(data.ld$x, data.ld$class)
library(rggobi)
data.g <- ggobi(data.ldd)
library(classifly)
generate_classification_data(data.lda, data, method="grid", n=10)

Could you help me?

Best regards,
Dani
Re: [R] R and Excel differences in calculation of the F distribution function
I think you'll have to compare FDIST to pf() and not df().

Best regards
Frede Aakmann Tøgersen
Scientist
UNIVERSITY OF AARHUS
Faculty of Agricultural Sciences
Dept. of Genetics and Biotechnology
Blichers Allé 20, P.O. BOX 50
DK-8830 Tjele
Phone: +45 8999 1900
Direct: +45 8999 1878
E-mail: [EMAIL PROTECTED]
Web: http://www.agrsci.org

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On behalf of Luis Ridao Cruz
Sent: 7 August 2007 12:27
To: r-help@stat.math.ethz.ch
Subject: [R] R and Excel differences in calculation of the F distribution function
Re: [R] R and Excel differences in calculation of the F distribution function
On 07-Aug-07 10:27:06, Luis Ridao Cruz wrote:
> R-help,
> I'm trying to work out the density function by doing:
>
> df(5.22245, 3, 22)
> [1] 0.005896862
> df(15.20675, 6, 4)
> [1] 0.001223825
>
> In Excel the result is:
> =FDIST(5.22245,3,22)  gives  0.0071060464
> =FDIST(15.20675,6,4)  gives  0.011406
>
> From my point of view the differences in the second case are substantial. Can anyone give me a hint on this? Thanks in advance

It would seem that Excel is giving you the upper tail of the F-distribution (cumulative distribution, not density), i.e. what R would calculate as

pf(5.22245, 3, 22, lower.tail=FALSE)
[1] 0.007106046
pf(15.20675, 6, 4, lower.tail=FALSE)
[1] 0.0114

so in effect using pf rather than df. If you want the density function (corresponding to df) from Excel, then that's not something I know how to do (nor want to know ...).

Best wishes,
Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 07-Aug-07  Time: 11:47:52
-- XFMail --
[R] Robust Standard Errors for lme object
Hi,

I have fitted a linear mixed model using the lme function but now wish to calculate robust standard errors for my model. I have been able to find several functions which calculate robust s.e.'s for lm objects, but have not been able to find a function which calculates robust s.e.'s for lme objects. Would anyone know of a function that will allow me to do this?

Many thanks,
Lucy
[R] Functions for autoregressive regression models (mix between time series and regression models)?
Hello everybody,

I have a question about autoregressive regression models. Let Y[1], ..., Y[n] be a time series. Given the model:

Y[t] = phi[1]*Y[t-1] + phi[2]*Y[t-2] + ... + phi[p]*Y[t-p] + x_t^T*beta + u_t,

where x_t = (x[1t], x[2t], ..., x[mt]), beta = (beta[1], ..., beta[m]) and u_t ~ (0,1), I want to estimate the coefficients phi and beta. Are there any functions or packages in R for such autoregressive regression models, with the usual summaries? I do not mean the function ar.

Example: I have the data

working.time  <- rnorm(100)  # Y
vacation      <- rnorm(100)  # x1
bank.holidays <- rnorm(100)  # x2
illness       <- rnorm(100)  # x3
education     <- rnorm(100)  # x4

Now I want to analyse:

Y[t] = phi[1]*Y[t-1] + phi[2]*Y[t-2] + ... + phi[p]*Y[t-p] + beta1*vacation_t + beta2*bank.holidays_t + beta3*illness_t + beta4*education_t + u_t

Has anyone an idea? I would be more than glad if so. Thank you VERY much in advance.

Kind regards from the eastern part of Berlin,
Maja
Re: [R] Classifly problems
Have you tried using classifly directly? library(classifly); classifly(Tumor ~ ., my.data.set, lda) generate_classification_data is an internal function, and you are passing it the wrong arguments. Hadley On 8/7/07, Dani Valverde [EMAIL PROTECTED] wrote: Hello, I am trying to explore a classification with GGobi. I am trying to generate additional data according to the model so I can draw the decision boundaries on the GGobi plot. The problem is that I always get the same error: Error in predict.lda(model, data): wrong number of variables, even though I know that I used the same number of variables for the model generation (6) and for the additional data generation (6 also). I paste the code I am using: library(MASS) Tumor <- c(rep("MM",20), rep("GBM",18), rep("LGG",17)) data.lda <- lda(data, Tumor) data.ld <- predict(data.lda) data.ldd <- data.frame(data.ld$x, data.ld$class) library(rggobi) data.g <- ggobi(data.ldd) library(classifly) generate_classification_data(data.lda, data, method="grid", n=10) Could you help me? Best regards, Dani [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Functions for autoregressive Regressionmodels (Mix between times series and Regression Models) ?
arima (see ?arima) supports explanatory variables in the xreg= argument. On 8/7/07, Maja Schröter [EMAIL PROTECTED] wrote: Hello everybody, I've a question about autoregressive regression models. Let Y[1],...,Y[n] be a time series. Given the model: Y[t] = phi[1]*Y[t-1] + phi[2]*Y[t-2] + ... + phi[p]*Y[t-p] + x_t^T*beta + u_t, where x_t = (x[1t], x[2t], ..., x[mt]) and beta = (beta[1], ..., beta[m]) and u_t ~ (0,1), I want to estimate the coefficients phi and beta. Are there any functions or packages in R for autoregressive regression models with suitable summaries? I don't mean the function ar. Example: I have the data working.time <- rnorm(100) # Y vacation <- rnorm(100) # x1 bank.holidays <- rnorm(100) # x2 illness <- rnorm(100) # x3 education <- rnorm(100) # x4 Now I want to analyse: Y[t] = phi[1]*Y[t-1] + phi[2]*Y[t-2] + ... + phi[p]*Y[t-p] + beta1*vacation_t + beta2*bank.holidays_t + beta3*illness_t + beta4*education_t + u_t. Has anyone an idea? I would be more than glad if so. Thank you VERY much in advance. Kind regards from the Eastern part of Berlin, Maja -- __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
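[Editor's sketch of the xreg= suggestion, using simulated data; the variable names and coefficients are invented for illustration. Note that arima() fits a regression with ARMA errors, a closely related but not identical parameterisation to the lagged-Y model written in the question:]

```r
set.seed(1)
n <- 100
vacation      <- rnorm(n)
bank.holidays <- rnorm(n)
illness       <- rnorm(n)
education     <- rnorm(n)
X <- cbind(vacation, bank.holidays, illness, education)
# Response: AR(2) disturbance plus a linear regression part
y <- as.numeric(arima.sim(list(ar = c(0.5, 0.2)), n = n)) +
     drop(X %*% c(1, -0.5, 0.3, 0.8))
fit <- arima(y, order = c(2, 0, 0), xreg = X)
coef(fit)  # ar1, ar2, intercept, plus the four regression coefficients
```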
[R] Number of days in each month
Hi R-users, What is the best way to achieve a table which contains all days and months between the years 2007-2020? I would like to calculate the number of days in each month within those years (into a data frame). Regards, Lauri [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] problems saving jpg files
Hi, I have a batch routine that does PCA on a series of files and saves the results as a csv file, and the respective graphs as pdf and jpg. While the pdf's are fine, the jpg files have a light grey background no matter what color I set the bg parameter to. I am running this on a PC with 4 GB RAM - if this makes any difference. My command is as follows: dev.print(jpeg, file=filejpg, width=1024, height=768, quality = 100, bg = "white") I would appreciate it if you have any explanations, and yes, I did read the help files this time ;-))) Thanks, Monica _ [[trailing spam removed]] [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Function for trim blanks from a string(s)?
Stripping either or both ends can be achieved by the somewhat tortuous but fairly general stripspace <- function(x) sub("^\\s*([^ ]*.*[^ ])\\s*$", "\\1", x) which uses a replacement buffer to keep the useful part of the string. Prof Brian Ripley [EMAIL PROTECTED] 06/08/2007 21:23:49 I am sure Marc knows that ?sub has examples of trimming trailing space and whitespace in various styles. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
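[Editor's sketch, not from the thread: for everyday trimming, a pair of anchored substitutions (one per end) is a standard idiom that avoids the single-capture pattern's trouble with empty and one-character strings:]

```r
trim <- function(x) gsub("^\\s+|\\s+$", "", x)  # strip leading and trailing whitespace
trim("   hello world   ")    # "hello world"
trim(c(" a ", "", "  "))     # "a" "" ""
```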
Re: [R] Number of days in each month
On Tue, Aug 07, 2007 at 03:53:28PM +0300, Lauri Nikkinen wrote: What is the best way to achieve a table which contains all days and months between years 2007-2020? I would like to calculate number of days in each month within those years (to data frame). Here you go -- just replace length.out=3 by 14*12 for your fourteen years: as.POSIXlt(seq(as.Date("2007-02-01"), by="month", length.out=3)-1)$mday [1] 31 28 31 A sequence of months at the first of the next month, shifted back a day, gives the last day of that month -- converted to POSIXlt, from which we extract the day-of-the-month. Hth, Dirk -- Three out of two people have difficulties with fractions. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Number of days in each month
diff(seq(as.Date("2007-01-01"), as.Date("2021-01-01"), by = "month")) Time differences in days [1] 31 28 31 30 31 30 31 31 30 31 30 31 31 29 31 30 31 30 31 31 30 31 30 31 31 [26] 28 31 30 31 30 31 31 30 31 30 31 31 28 31 30 31 30 31 31 30 31 30 31 31 28 [51] 31 30 31 30 31 31 30 31 30 31 31 29 31 30 31 30 31 31 30 31 30 31 31 28 31 [76] 30 31 30 31 31 30 31 30 31 31 28 31 30 31 30 31 31 30 31 30 31 31 28 31 30 [101] 31 30 31 31 30 31 30 31 31 29 31 30 31 30 31 31 30 31 30 31 31 28 31 30 31 [126] 30 31 31 30 31 30 31 31 28 31 30 31 30 31 31 30 31 30 31 31 28 31 30 31 30 [151] 31 31 30 31 30 31 31 29 31 30 31 30 31 31 30 31 30 31 On 8/7/07, Lauri Nikkinen [EMAIL PROTECTED] wrote: Hi R-users, What is the best way to achieve a table which contains all days and months between years 2007-2020? I would like to calculate number of days in each month within those years (to data frame). Regards, Lauri [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Number of days in each month
Just one refinement if you want it nicely formatted: ts(diff(seq(as.Date("2007-01-01"), as.Date("2021-01-01"), by = "month")), start = c(2007, 01), freq = 12) Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec 2007 31 28 31 30 31 30 31 31 30 31 30 31 2008 31 29 31 30 31 30 31 31 30 31 30 31 2009 31 28 31 30 31 30 31 31 30 31 30 31 2010 31 28 31 30 31 30 31 31 30 31 30 31 2011 31 28 31 30 31 30 31 31 30 31 30 31 2012 31 29 31 30 31 30 31 31 30 31 30 31 2013 31 28 31 30 31 30 31 31 30 31 30 31 2014 31 28 31 30 31 30 31 31 30 31 30 31 2015 31 28 31 30 31 30 31 31 30 31 30 31 2016 31 29 31 30 31 30 31 31 30 31 30 31 2017 31 28 31 30 31 30 31 31 30 31 30 31 2018 31 28 31 30 31 30 31 31 30 31 30 31 2019 31 28 31 30 31 30 31 31 30 31 30 31 2020 31 29 31 30 31 30 31 31 30 31 30 31 On 8/7/07, Gabor Grothendieck [EMAIL PROTECTED] wrote: diff(seq(as.Date("2007-01-01"), as.Date("2021-01-01"), by = "month")) [quoted output trimmed] On 8/7/07, Lauri Nikkinen [EMAIL PROTECTED] wrote: Hi R-users, What is the best way to achieve a table which contains all days and months between years 2007-2020? I would like to calculate number of days in each month within those years (to data frame). Regards, Lauri [[alternative HTML version deleted]]
__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Number of days in each month
Lauri Nikkinen wrote: Hi R-users, What is the best way to achieve a table which contains all days and months between years 2007-2020? I would like to calculate number of days in each month within those years (to data frame). Regards, Lauri How about this? X <- seq(ISOdate(2007,1,1), ISOdate(2020,12,31), by=60*60*24) mytab <- table(substring(X, 3, 4), substring(X, 6, 7)) mytab 01 02 03 04 05 06 07 08 09 10 11 12 07 31 28 31 30 31 30 31 31 30 31 30 31 08 31 29 31 30 31 30 31 31 30 31 30 31 09 31 28 31 30 31 30 31 31 30 31 30 31 10 31 28 31 30 31 30 31 31 30 31 30 31 11 31 28 31 30 31 30 31 31 30 31 30 31 12 31 29 31 30 31 30 31 31 30 31 30 31 13 31 28 31 30 31 30 31 31 30 31 30 31 14 31 28 31 30 31 30 31 31 30 31 30 31 15 31 28 31 30 31 30 31 31 30 31 30 31 16 31 29 31 30 31 30 31 31 30 31 30 31 17 31 28 31 30 31 30 31 31 30 31 30 31 18 31 28 31 30 31 30 31 31 30 31 30 31 19 31 28 31 30 31 30 31 31 30 31 30 31 20 31 29 31 30 31 30 31 31 30 31 30 31 head(as.data.frame(mytab)) Var1 Var2 Freq 1 07 01 31 2 08 01 31 3 09 01 31 4 10 01 31 5 11 01 31 6 12 01 31 -- Chuck Cleland, Ph.D. NDRI, Inc. 71 West 23rd Street, 8th floor New York, NY 10010 tel: (212) 845-4495 (Tu, Th) tel: (732) 512-0171 (M, W, F) fax: (917) 438-0894 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Error in as.double.default(x) : (list) object cannot be coerced to 'double'
Dear experts, I have in all 14 matrices which stand for gene expression divergence and 14 matrices which stand for gene sequence divergence. I have tried joining them by using the concatenation function, giving SequenceDivergence <- c(X1,X2,X3,X4,X5,X6,X7,X8,X9,X10,X11,X12,X13,X14) ExpressionDivergence <- c(Y1,Y2,Y3,Y4,Y5,Y6,Y7,Y8,Y9,Y10,Y11,Y12,Y13,Y14) where X1,X2..X14 are the expression matrices containing r-values and Y1,Y2..Y14 are the ones with patristic distances. Now, I want to plot SequenceDivergence vs. ExpressionDivergence. Tried doing that using plot(SequenceDivergence, ExpressionDivergence) But then getting the error Error in as.double.default(x) : (list) object cannot be coerced to 'double' Note: The diagonal values for X1 were neglected as they seem to give some bias in my results. Code to derive X1,X2,X3.. and Y1,Y2,Y3 is given as: EDant <- read.table("C:/ant.txt", sep="\t") ED1 <- as.matrix(EDant) X1 <- lapply(1:ncol(ED15), function(a) ED15[-a, a]) # to neglect diagonal values I am attaching a couple of matrices for your reference. ANT.txt is X1 after deleting the diagonal values; similarly, ANTEXP.TXT is Y1, apetella.txt is X2 and apetellaexp.txt is Y2. I tried reading the R manual and other sources but haven't cracked it yet. Can anyone please help me regarding that? Thanks very much. Yours sincerely, Urmi Trivedi.
[Attachment content (raw matrix values) and e-mail provider advertisement omitted.]
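[Editor's note, not from the thread: the error in the question arises because lapply() returns a list, and plot() cannot coerce a list to numeric. A minimal reproduction and one possible fix, flattening with unlist(), using small made-up matrices:]

```r
m1 <- matrix(rnorm(9), 3); m2 <- matrix(rnorm(9), 3)
X1 <- lapply(1:ncol(m1), function(a) m1[-a, a])  # list of columns, diagonal dropped
Y1 <- lapply(1:ncol(m2), function(a) m2[-a, a])
# plot(X1, Y1)  # would fail: (list) object cannot be coerced to 'double'
plot(unlist(X1), unlist(Y1))  # flatten both lists to numeric vectors first
```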
[R] logistic models validation for categorical data
Dear fellow R-users, Thanks for your help on the AIC. I have yet another question about logistic models. To build the model I used several packages (MASS, nlme, binom), just to make sure I had all the tools at my disposal. Now I would like to validate the model using a cross-validation tool. In a normal logistic case you get an error table giving the number of cases classified under 0/1 by the model, and a comparison with the real classification of the test sample. table 1 glm 0 1 FALSE 8 14 TRUE 0 8 Apparently this is not the case when you have categorical data. Then you get an output for every interval of your predictor variable, and this is compared to the frequency of this variable, so it is very hard to interpret. table 2 glm 0 2 3 4 7 FALSE 1 0 0 1 1 TRUE 1 2 1 0 0 One of the problems is not only that I can't interpret table 2, but also that it does not give reference to the number of test samples, which is clear in table 1 (30 test samples). Kind regards, Tom. Tom Willems CODA-CERVA-VAR Department of Virology Epizootic Diseases Section Groeselenberg 99 B-1180 Ukkel Tel.: +32 2 3790522 Fax: +32 2 3790666 E-mail: [EMAIL PROTECTED] www.var.fgov.be Disclaimer: click here [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Robust Standard Errors for lme object
Lucy: Why are you interested in robust standard errors from lme? Typically, robust standard errors are sought when there is model misspecification due to ignoring some covariance among units within a group. But a mixed model is designed to directly account for covariances among units within a group, such that the standard errors more adequately represent the true sampling variance of the parameters. So the lme standard errors are robust in the sense that they are presumably correct if you have your model correctly specified. It would do better service to explain your issue a little more to get better help. Harold -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Lucy Radford Sent: Tuesday, August 07, 2007 8:17 AM To: r-help@stat.math.ethz.ch Subject: [R] Robust Standard Errors for lme object Hi I have fitted a linear mixed model using the lme function but now wish to calculate robust standard errors for my model. I have been able to find several functions which calculate robust s.e for lm objects but have not been able to find a function which calculates robust s.e for lme objects. Would anyone know of a function that will allow me to do this. Many Thanks Lucy __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Functions for autoregressive Regressionmodels (Mix between times series and Regression Models) ?
On Tue, 7 Aug 2007, Gabor Grothendieck wrote: arima (see ?arima) supports explanatory variables in the xreg= argument. In addition to arima(), the function dynlm() in package dynlm might be useful if you want to fit the model by OLS. See also package dyn for a similar approach. Z On 8/7/07, Maja Schröter [EMAIL PROTECTED] wrote: Hello everybody, I've a question about autoregressive regression models. Let Y[1],...,Y[n] be a time series. Given the model: Y[t] = phi[1]*Y[t-1] + phi[2]*Y[t-2] + ... + phi[p]*Y[t-p] + x_t^T*beta + u_t, where x_t = (x[1t], x[2t], ..., x[mt]) and beta = (beta[1], ..., beta[m]) and u_t ~ (0,1), I want to estimate the coefficients phi and beta. Are there any functions or packages in R for autoregressive regression models with suitable summaries? I don't mean the function ar. Example: I have the data working.time <- rnorm(100) # Y vacation <- rnorm(100) # x1 bank.holidays <- rnorm(100) # x2 illness <- rnorm(100) # x3 education <- rnorm(100) # x4 Now I want to analyse: Y[t] = phi[1]*Y[t-1] + phi[2]*Y[t-2] + ... + phi[p]*Y[t-p] + beta1*vacation_t + beta2*bank.holidays_t + beta3*illness_t + beta4*education_t + u_t. Has anyone an idea? I would be more than glad if so. Thank you VERY much in advance. Kind regards from the Eastern part of Berlin, Maja -- __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
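[Editor's sketch of the dynlm suggestion — fitting the lagged-dependent-variable regression by OLS, with L() generating the lags. Assumes the dynlm package is installed; data are simulated and variable names invented:]

```r
library(dynlm)
set.seed(1)
n <- 100
vacation      <- ts(rnorm(n))
bank.holidays <- ts(rnorm(n))
y <- arima.sim(list(ar = 0.5), n = n) + 1.5 * vacation - 0.7 * bank.holidays
# Y[t] regressed on its own first two lags plus the explanatory variables:
fit <- dynlm(y ~ L(y, 1) + L(y, 2) + vacation + bank.holidays)
summary(fit)
```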
[R] Phi parameter in corAR(1) in NLMEs
Hello, I have been fitting nonlinear models with random effects using nlme with the corCAR1 correlation structure, as I have unequally spaced observations per subject and serially correlated errors. My question is regarding the Phi parameter. Example of output: Correlation Structure: Continuous AR(1) Formula: ~day | subject Parameter estimate(s): Phi 0.8475842 After reading the Pinheiro and Bates (2000) and Jones (1993) texts, I am uncertain whether the estimate of Phi given in the output is: - the correlation between two observations one time unit apart, i.e. phi^(t2-t1) = phi^1 = 0.8475842 - or the parameter in the exponential term, exp(-phi*(t2-t1)), i.e. the correlation between two observations one unit apart is exp(-0.8475842*(t2-t1)) = exp(-0.8475842*1) = 0.4284 - or if there is another interpretation altogether. Thank-you very much for any clarification on this, I appreciate your help! Kerrie Nelson, Biostatistics Unit, MGH The information transmitted in this electronic communication...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
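[Editor's note: no reply appears in this digest, but one way to settle the question empirically is to build the correlation structure directly and inspect the implied correlation matrix — a sketch; the day values are arbitrary:]

```r
library(nlme)
dd <- data.frame(day = c(0, 1, 3), subject = factor(1))
cs <- Initialize(corCAR1(0.8475842, form = ~ day | subject), data = dd)
corMatrix(cs)
# Compare the off-diagonal entries with Phi^1 and Phi^3 versus exp(-Phi*1) and
# exp(-Phi*3) to see which parameterisation corCAR1 actually uses.
```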
Re: [R] Catch errors
Seth and Stephen, Thanks a lot for the explanation and help! I really appreciate it. Gang On Aug 6, 2007, at 6:41 PM, Stephen Tucker wrote: That's because tag <- 1 is evaluated in a local environment (of the function) - once function(err) exits, the information is lost. Try R's <<- operator: tag <- 0 tryCatch(fit.lme <- lme(Beta ~ Trust*Sex*Freq, random = ~1|Subj, Model), error=function(err) tag <<- 1) On Aug 6, 2007, at 6:55 PM, Seth Falcon wrote: Gang Chen [EMAIL PROTECTED] writes: I wanted something like this: tag <- 0; tryCatch(fit.lme <- lme(Beta ~ Trust*Sex*Freq, random = ~1|Subj, Model), error=function(err) tag <- 1); but it seems not to work because 'tag' does not get the value 1 when an error occurs. How can I make it work? You can use '<<-' to assign to the parent frame like this: tag <- 0 tryCatch({ print("inside tryCatch") print(paste("tag", tag)) stop("forcing an error") }, error=function(e) { print("caught an error") tag <<- 1 }) But you can also use the return value of tryCatch as a return code if that is what tag is. Like this: fail <- TRUE tag <- tryCatch({ print("inside tryCatch") if (fail) stop("forcing an error") 0 }, error=function(e) { print("caught an error") 1 }) tag + seth -- Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center BioC: http://bioconductor.org/ Blog: http://userprimary.net/user/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Contrast testing with lme
I'm trying to run a contrast with lme, but could not get it to work: anova(lme(Beta ~ Trust*Sex*Freq, random = ~1|Subj, Model), L=c("TrustT:Sex:Freq"=1, "TrustU:Sex:Freq"=-1)) Error in anova.lme(lme(Beta ~ Trust * Sex * Freq, random = ~1 | Subj, : Effects TrustT:Sex:Freq, TrustU:Sex:Freq not matched What is missing? Here is more information: summary(lme(Beta ~ Trust*Sex*Freq, random = ~1|Subj, Model)) Linear mixed-effects model fit by REML Data: Model AIC BIC logLik -825.4663 -791.4348 426.7331 Random effects: Formula: ~1 | Subj (Intercept) Residual StdDev: 0.001144573 0.001167392 Fixed effects: Beta ~ Trust * Sex * Freq Value Std.Error DF t-value p-value (Intercept) 0.0001090007 0.0005780194 77 0.1885762 0.8509 TrustU 0.0014426378 0.0005836959 77 2.4715572 0.0157 SexM 0.0008230359 0.0005836959 77 1.4100423 0.1626 FreqLo 0.0001998191 0.0005836959 77 0.3423343 0.7330 FreqNo 0.0004900107 0.0005836959 77 0.8394965 0.4038 TrustU:SexM -0.0012598266 0.0008254707 77 -1.5261918 0.1311 TrustU:FreqLo -0.0012383346 0.0008254707 77 -1.5001558 0.1377 TrustU:FreqNo -0.0009141543 0.0008254707 77 -1.1074341 0.2716 SexM:FreqLo -0.0008469211 0.0008254707 77 -1.0259857 0.3081 SexM:FreqNo 0.0006361012 0.0008254707 77 0.7705922 0.4433 TrustU:SexM:FreqLo 0.0013272173 0.0011673918 77 1.1369082 0.2591 TrustU:SexM:FreqNo 0.0006241524 0.0011673918 77 0.5346555 0.5944 ... anova(lme(Beta ~ Trust*Sex*Freq, random = ~1|Subj, Model)) numDF denDF F-value p-value (Intercept) 1 77 4.813938 0.0312 Trust 1 77 3.113293 0.0816 Sex 1 77 3.535774 0.0638 Freq 2 77 6.083832 0.0035 Trust:Sex 1 77 1.634858 0.2049 Trust:Freq 2 77 0.678558 0.5104 Sex:Freq 2 77 2.165059 0.1217 Trust:Sex:Freq 2 77 0.647042 0.5264 Thanks, Gang __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Interaction factor and numeric variable versus separate regressions
Dear list members, I have problems interpreting the coefficients from an lm model involving the interaction of a numeric and a factor variable, compared to separate lm models for each level of the factor variable. ## data: y1 <- rnorm(20) + 6.8 y2 <- rnorm(20) + (1:20*1.7 + 1) y3 <- rnorm(20) + (1:20*6.7 + 3.7) y <- c(y1,y2,y3) x <- rep(1:20,3) f <- gl(3,20, labels=paste("lev", 1:3, sep="")) d <- data.frame(x=x, y=y, f=f) ## plot # xyplot(y~x|f) ## lm model with interaction summary(lm(y~x:f, data=d)) Call: lm(formula = y ~ x:f, data = d) Residuals: Min 1Q Median 3Q Max -2.8109 -0.8302 0.2542 0.6737 3.5383 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 3.68799 0.41045 8.985 1.91e-12 *** x:flev1 0.20885 0.04145 5.039 5.21e-06 *** x:flev2 1.49670 0.04145 36.109 < 2e-16 *** x:flev3 6.70815 0.04145 161.838 < 2e-16 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 1.53 on 56 degrees of freedom Multiple R-Squared: 0.9984, Adjusted R-squared: 0.9984 F-statistic: 1.191e+04 on 3 and 56 DF, p-value: < 2.2e-16 ## separate lm fits lapply(by(d, d$f, function(x) lm(y ~ x, data=x)), coef) $lev1 (Intercept) x 6.77022860 -0.01667528 $lev2 (Intercept) x 1.019078 1.691982 $lev3 (Intercept) x 3.274656 6.738396 Can anybody give me a hint why the coefficients for the slopes (especially for lev1) are so different, and how the coefficients from the lm model with interaction are related to the separate fits? Thanks, Sven __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Robust Standard Errors for lme object
On Tue, 7 Aug 2007, Doran, Harold wrote: Lucy: Why are you interested in robust standard errors from lme? Typically, robust standard errors are sought when there is model misspecification due to ignoring some covariance among units within a group. But, a mixed model is designed to directly account for covariances among units within a group such that the standard errors more adequately represent the true sampling variance of the parameters. I think it's a perfectly reasonable thing to want, but it is not easy to provide because of the generality of lme(). For example, models with crossed effects need special handling, and possibly so do time-series or spatial covariance models with large numbers of observations per group. I imagine that misspecification of the variance, rather than the correlation, would be the main concern, just as it is with independent observations. Of course the model-robust variances would only be useful if the sample size of top-level units was large enough, and if the variance components were not of any direct interest. So, the lme standard errors are robust in a sense that they are presumably correct if you have your model correctly specified. To paraphrase the Hitchhiker's Guide: This must be some definition of the word 'robust' that I was not previously aware of. :) -thomas Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Positioning text in top left corner of plot
Simple question: how can you position text in the top left-hand corner of a plot? I am plotting multiple plots using par(mfrow=c(2,3)) and all I want to do is label these plots a), b), c) etc. I have been fiddling around with both text and mtext but without much luck. text is fine, but each plot has a different scale on the axes, and so this makes it problematic. What is the best way to do this? Many thanks Dan -- ** Daniel Brewer, Ph.D. Institute of Cancer Research Email: [EMAIL PROTECTED] ** The Institute of Cancer Research: Royal Cancer Hospital, a charitable Company Limited by Guarantee, Registered in England under Company No. 534147 with its Registered Office at 123 Old Brompton Road, London SW7 3RP. This e-mail message is confidential and for use by the addre...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] saving output
I have a question about how to save the output of a logistic regression model in some format (delimited, preferably) so that I can manipulate it to make nice tables in Excel. I have tried print, save, and write; none seems to do what I want. Does anyone have any ideas? Lynn D. Disney, Ph.D., J.D., M.P.H. Research Analyst SEIU 1199 1771 E. 30th Street Cleveland, Ohio 44114 Telephone: (216) 566-0117 Email: [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Positioning text in top left corner of plot
You can call par('usr') to find the current coordinates, then use the text function (the pos and offset arguments may be helpful as well) The cnvrt.coords function from the TeachingDemos package may also be of help for finding coordinates for the text function. Hope this helps, -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare [EMAIL PROTECTED] (801) 408-8111 -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Daniel Brewer Sent: Tuesday, August 07, 2007 9:09 AM To: r-help@stat.math.ethz.ch Subject: [R] Positioning text in top left corner of plot Simple question how can you position text in the top left hand corner of a plot? I am plotting multiple plots using par(mfrow=c(2,3)) and all I want to do is label these plots a), b), c) etc. I have been fiddling around with both text and mtext but without much luck. text is fine but each plot has a different scale on the axis and so this makes it problematic. What is the best way to do this? Many thanks Dan -- ** Daniel Brewer, Ph.D. Institute of Cancer Research Email: [EMAIL PROTECTED] ** The Institute of Cancer Research: Royal Cancer Hospital, a charitable Company Limited by Guarantee, Registered in England under Company No. 534147 with its Registered Office at 123 Old Brompton Road, London SW7 3RP. This e-mail message is confidential and for use by the\ ...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
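[Editor's sketch of the par('usr') approach: query the current panel's user coordinates after each plot, so the label lands in the top-left corner whatever the axis scale is:]

```r
par(mfrow = c(2, 3))
labs <- paste(letters[1:6], ")", sep = "")
for (i in 1:6) {
  plot(rnorm(10), rnorm(10, sd = i))   # every panel has a different scale
  usr <- par("usr")                    # c(x1, x2, y1, y2) for the current panel
  text(usr[1], usr[4], labs[i], adj = c(-0.5, 1.5))  # inset from the corner
}
```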
Re: [R] Interaction factor and numeric variable versus separate regressions
In the single model all three levels share the same intercept, which means that the slope must change to accommodate it, whereas in the three separate models they each have their own intercept. Try looking at it graphically and note how the black dotted lines are all forced to go through the same intercept, i.e. the same point on the y axis, whereas the red dashed lines are each able to fit their portion of the data using both the intercept and the slope.

y.lm <- lm(y ~ x:f, data = d)
plot(y ~ x, d, col = as.numeric(d$f), xlim = c(-5, 20))
for(i in 1:3) {
  abline(a = coef(y.lm)[1], b = coef(y.lm)[1+i], lty = "dotted")
  abline(lm(y ~ x, d[as.numeric(d$f) == i,]), col = "red", lty = "dashed")
}
grid()

On 8/7/07, Sven Garbade [EMAIL PROTECTED] wrote: Dear list members, I have problems interpreting the coefficients from an lm model involving the interaction of a numeric and a factor variable compared to separate lm models for each level of the factor variable.

## data:
y1 <- rnorm(20) + 6.8
y2 <- rnorm(20) + (1:20*1.7 + 1)
y3 <- rnorm(20) + (1:20*6.7 + 3.7)
y <- c(y1,y2,y3)
x <- rep(1:20,3)
f <- gl(3,20, labels=paste("lev", 1:3, sep=""))
d <- data.frame(x=x, y=y, f=f)
## plot
# xyplot(y~x|f)
## lm model with interaction
summary(lm(y~x:f, data=d))

Call:
lm(formula = y ~ x:f, data = d)

Residuals:
    Min      1Q  Median      3Q     Max
-2.8109 -0.8302  0.2542  0.6737  3.5383

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  3.68799    0.41045   8.985 1.91e-12 ***
x:flev1      0.20885    0.04145   5.039 5.21e-06 ***
x:flev2      1.49670    0.04145  36.109  < 2e-16 ***
x:flev3      6.70815    0.04145 161.838  < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.53 on 56 degrees of freedom
Multiple R-Squared: 0.9984, Adjusted R-squared: 0.9984
F-statistic: 1.191e+04 on 3 and 56 DF, p-value: < 2.2e-16

## separate lm fits
lapply(by(d, d$f, function(x) lm(y ~ x, data=x)), coef)

$lev1
(Intercept)           x
 6.77022860 -0.01667528

$lev2
(Intercept)           x
   1.019078    1.691982

$lev3
(Intercept)           x
   3.274656    6.738396

Can anybody give me a hint why the coefficients for the slopes (especially for lev1) are so different, and how the coefficients from the lm model with interaction are related to the separate fits? Thanks, Sven

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] saving output
It's not so straightforward with the manipulation, but if you're familiar with LaTeX, I'd recommend the xtable package. It makes very readable regression summaries. Kyle H. Ambert Department of Behavioral Neuroscience Oregon Health & Science University On 8/7/07, Lynn Disney [EMAIL PROTECTED] wrote: I have a question about how to save the output of a logistic regression model in some format (delimited, preferably) so that I can manipulate it to make nice tables in Excel. [...] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
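A short sketch of the xtable route, assuming the xtable package is installed (the model and the output file name are made up):

```r
## Sketch: turn a regression summary into a LaTeX table with xtable
## (assumes install.packages("xtable") has been run)
library(xtable)

fit <- lm(dist ~ speed, data = cars)        # any model with a summary method
tab <- xtable(summary(fit))                 # coefficient table as an xtable object
print(tab, file = "fit_table.tex")          # write LaTeX source for \input{}
```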
[R] array
Hello, I have some files generated from microarray experiments. I used scan() to read the files and assigned each file to a unique name with many rows and columns. Now I want to create an array (ArrayA) with unique names, so that I can use ArrayA[1,2][[6]] to refer to the data in each file. Are there any packages available for arrays of arrays? Thanks! __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
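R's idiomatic container for a collection of per-file matrices is a named list rather than an array of arrays; names then give the per-file access the poster wants. A sketch with made-up stand-in data:

```r
## Hypothetical stand-ins for the matrices scan() would produce per file
files <- list(chip1 = matrix(rnorm(6), 2, 3),
              chip2 = matrix(rnorm(6), 2, 3))

files$chip1             # whole matrix for one file
files[["chip2"]][1, 2]  # one element of another file's data
```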
Re: [R] saving output
See ?sink -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40" S 49° 16' 22" O On 07/08/07, Lynn Disney [EMAIL PROTECTED] wrote: I have a question about how to save the output of a logistic regression model in some format (delimited, preferably) so that I can manipulate it to make nice tables in Excel. [...] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
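A minimal sketch of the sink() approach (the model and file name are arbitrary; note the output is plain text, not delimited):

```r
fit <- glm(vs ~ mpg, data = mtcars, family = binomial)

sink("model_output.txt")   # divert console output to a file
print(summary(fit))
sink()                     # restore normal console output
```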
Re: [R] Positioning text in top left corner of plot
maybe this is what you want?

plot(rnorm(10))
legend("topleft", "A)", bty = "n")

? b On Aug 7, 2007, at 11:08 AM, Daniel Brewer wrote: Simple question: how can you position text in the top left hand corner of a plot? [...] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Robust Standard Errors for lme object
-Original Message- From: Thomas Lumley [mailto:[EMAIL PROTECTED] Sent: Tuesday, August 07, 2007 11:06 AM To: Doran, Harold Cc: Lucy Radford; r-help@stat.math.ethz.ch Subject: Re: [R] Robust Standard Errors for lme object On Tue, 7 Aug 2007, Doran, Harold wrote: Lucy: Why are you interested in robust standard errors from lme? Typically, robust standard errors are sought when there is model misspecification due to ignoring some covariance among units within a group. But, a mixed model is designed to directly account for covariances among units within a group such that the standard errors more adequately represent the true sampling variance of the parameters. I think it's a perfectly reasonable thing to want, but it is not easy to provide because of the generality of lme(). For example, models with crossed effects need special handling, and possibly so do time-series or spatial covariance models with large numbers of observations per group. I still don't understand why we might want them. If there were some correlation among units within a group, then we would not have independent observations. That is, we have less information from a cluster sample than from a simple random sample because each individual does not provide unique information. Hence, if one were to use an OLS regression when there is clustering, then the likelihood function is misspecified. Now the MLEs from this regression may still retain some pragmatic interest (they *may* still be consistent), but the sampling variances of the fixed effects would be incorrect (most likely too small). So, one may obtain robust standard errors to account for model misspecification in this scenario. On the other hand, if the researcher knows that there may be a violation of independence because of clustering, then one may be motivated to rely on a mixed model, in which case the likelihood is not misspecified since the posterior views the observations as conditionally independent given the population distribution. 
Therefore, the sampling variances of the fixed effects are correct. Why get robust standard errors when the likelihood function is correctly specified? To make this concrete, assume it were an educational testing example where each student is tested and students are grouped into classrooms. Hence we have N total students clustered within K groups. Instruction happens at the group level and students within a classroom are presumably not providing independent information. Now, if I were to do an OLS regression with test scores as the outcome and some variable x as the independent variable, this model would be misspecified given the clustering and the standard errors would not be correct. But, if I were to use lmer and accounted for this clustering, then the standard errors of the variable x would be accurate. Why would I go and get robust standard errors in this case? Another way to put it is this. The OLS standard errors are

Se = (X'V^{-1}X)^{-1}

where V is a diagonal matrix and all elements along the diagonal are equal. This same equation is used for the standard errors in a mixed model, but V is now a block-diagonal matrix where the off-diagonals are the covariances among individuals within the same group. Doug Bates will kill me for using GLS notation since it doesn't jive with lmer. I imagine that misspecification of the variance, rather than the correlation, would be the main concern, just as it is with independent observations. Of course the model-robust variances would only be useful if the sample size of top-level units was large enough, and if the variance components were not of any direct interest. So, the lme standard errors are robust in a sense that they are presumably correct if you have your model correctly specified. To paraphrase the Hitchhiker's Guide: This must be some definition of the word 'robust' that I was not previously aware of. :) -thomas Thomas Lumley Assoc. 
Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Interaction factor and numeric variable versus separate
On 07-Aug-07 15:34:13, Gabor Grothendieck wrote: In the single model all three levels share the same intercept which means that the slope must change to accommodate it whereas in the three separate models they each have their own intercept.

I think this arose because of the formulation of the model with interaction as:

summary(lm(y ~ x:f, data=d))

If it had been formulated as

summary(lm(y ~ x*f, data=d))

there would be three separate intercepts, and three different slopes (and the differences would be the same as the differences for the separate models). Ted.

[...]

E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 Date: 07-Aug-07 Time: 17:01:33 -- XFMail -- __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
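Ted's point can be checked numerically: with y ~ x*f, each level's intercept and slope recovered from the coefficients (baseline plus the level's offsets, under the default treatment contrasts) match the separate per-level fits exactly. A sketch using the thread's simulated data:

```r
set.seed(1)
y <- c(rnorm(20) + 6.8, rnorm(20) + (1:20*1.7 + 1), rnorm(20) + (1:20*6.7 + 3.7))
d <- data.frame(x = rep(1:20, 3), y = y,
                f = gl(3, 20, labels = paste("lev", 1:3, sep = "")))

cf  <- coef(lm(y ~ x*f, data = d))                          # full interaction model
sep <- lapply(by(d, d$f, function(g) lm(y ~ x, data = g)), coef)  # separate fits

## level-2 intercept and slope implied by the interaction model...
int2   <- cf["(Intercept)"] + cf["flev2"]
slope2 <- cf["x"] + cf["x:flev2"]
## ...agree with the separate per-level fit
all.equal(unname(c(int2, slope2)), unname(sep$lev2))
```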
Re: [R] Interaction factor and numeric variable versus separate
Also check this post https://stat.ethz.ch/pipermail/r-help/2007-May/132866.html for a number of formulations. On 8/7/07, Ted Harding [EMAIL PROTECTED] wrote: I think this arose because of the formulation of the model with interaction as summary(lm(y ~ x:f, data=d)). If it had been formulated as summary(lm(y ~ x*f, data=d)), there would be three separate intercepts and three different slopes (and the differences would be the same as the differences for the separate models). [...] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Interaction factor and numeric variable versus separate regressions
These are not the same model. You want x*f, and then you will find the differences in intercepts and slopes from group 1 as the coefficients. Remember too that the combined model pools error variances and the separate models have a separate error variance for each group. To understand model formulae, study Bill Venables' exposition in chapter 6 of MASS. On Tue, 7 Aug 2007, Sven Garbade wrote: Can anybody give me a hint why the coefficients for the slopes (especially for lev1) are so different and how the coefficients from the lm model with interaction are related to the separate fits? [...] Thanks, Sven

-- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK. Tel: +44 1865 272861 (self), +44 1865 272866 (PA). Fax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Naming Lists
On Tue, 7 Aug 2007, Tom.O wrote: Hi, I'm pretty new to R and I have run into a problem. How do I name all the levels in this list?

One way:

MyList <- mapply(function(x) mapply(function(y) mapply(function(z)
              paste(unlist(x), unlist(y), unlist(z)),
            Lev3, SIMPLIFY=FALSE),
          Lev2, SIMPLIFY=FALSE),
        Lev1, SIMPLIFY=FALSE)

However, for many purposes you might better use an array:

MyArray.dimnames <- list(Lev1, Lev2, Lev3)
combos <- rev(expand.grid(Lev3, Lev2, Lev1))
elements <- do.call(paste, combos)
MyArray <- array(elements, dim=sapply(MyArray.dimnames, length),
                 dimnames=MyArray.dimnames)
MyArray['A1','B1','C2']

[From the original question:]

Lev1 <- c("A1","A2")
Lev2 <- c("B1","B2")
Lev3 <- c("C1","C2")
MyList <- lapply(Lev1, function(x){ lapply(Lev2, function(y){
  lapply(Lev3, function(z){ paste(unlist(x), unlist(y), unlist(z)) })})})

I would like to name the different levels in the list with the different inputs, so that the first level is named by the inputs in Lev1, the second level by the inputs in Lev2, and so on. I would then like to use this call: MyList$A1$B1$C1 "A1 B1 C1" Regards Tom -- View this message in context: http://www.nabble.com/Naming-Lists-tf4228968.html#a12030717 Sent from the R help mailing list archive at Nabble.com.

Charles C. Berry (858) 534-2098, Dept of Family/Preventive Medicine, UC San Diego, La Jolla, San Diego 92093-0901. E mailto:[EMAIL PROTECTED] http://famprevmed.ucsd.edu/faculty/cberry/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
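Another way to get the names while keeping the original lapply() structure is to wrap each level in setNames() (a minimal sketch):

```r
Lev1 <- c("A1", "A2"); Lev2 <- c("B1", "B2"); Lev3 <- c("C1", "C2")

## Same nested-lapply construction, with names attached at every level
MyList <- setNames(lapply(Lev1, function(x)
  setNames(lapply(Lev2, function(y)
    setNames(lapply(Lev3, function(z)
      paste(x, y, z)), Lev3)), Lev2)), Lev1)

MyList$A1$B1$C1   # "A1 B1 C1"
```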
Re: [R] Spatial sampling problem
Hello SK, I suggest you try the RandomFields package. It has specific functions for simulating random fields with an array of different algorithms and distributions. The GaussRF function in particular generates Gaussian random fields. Julian Julian M. Burgos Fisheries Acoustics Research Lab School of Aquatic and Fishery Science University of Washington 1122 NE Boat Street Seattle, WA 98105 Phone: 206-221-6864 sk wrote: Hi All, I am new in R and trying to simulate a random normal 2D field with a mean trend, say north-south. My domain is a 10x10 grid and I am trying to use mvnorm but do not know how to specify the domain and the mean field. I would appreciate any help. Cheers, SK __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
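If spatial correlation is not needed and only a north-south mean trend on a 10x10 grid matters, base R is enough; for genuinely correlated fields, the RandomFields machinery Julian mentions is the better tool. A sketch with an arbitrary trend range:

```r
nr <- 10; nc <- 10
## mean increases linearly from south (row 1) to north (row 10) -- range 0..5 is arbitrary
trend <- matrix(rep(seq(0, 5, length.out = nr), nc), nr, nc)
## iid normal noise around the trend surface
field <- trend + matrix(rnorm(nr * nc, sd = 1), nr, nc)
```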
[R] XL files
Dear All, I am new to R. I need to read XLS files, parse the data, and convert them into txt format for further calculation. If anyone has code for that, please send me the code or suggestions. Thanking you, Animesh __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
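One dependency-free route, assuming the spreadsheets can first be exported from Excel as CSV: read each with read.csv() and write tab-delimited text with write.table(). The file names below are hypothetical, and the CSV is created inline just so the sketch is self-contained. (The gdata package's read.xls() can read .xls files directly, but it needs Perl installed.)

```r
## Stand-in for a sheet exported from Excel as CSV (file names hypothetical)
write.csv(data.frame(id = 1:3, value = c(2.5, 3.1, 4.7)),
          "input.csv", row.names = FALSE)

dat <- read.csv("input.csv")                 # parse the exported sheet
write.table(dat, "output.txt", sep = "\t",   # tab-delimited text for further work
            row.names = FALSE, quote = FALSE)
```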
[R] installing RMySql-0.6.0
Hi, I am trying to install RMySQL 0.6.0 to work with WinXP, MySQL 5 and R 2.5.1. There is no binary for Windows on CRAN, so I grabbed RMySQL-0.6.0.tar.gz and tried to compile it. I get the following error:

* checking for file 'RMySQL/DESCRIPTION' ... OK
* preparing 'RMySQL':
* checking DESCRIPTION meta-information ...
'sh' is not recognized as an internal or external command.
OK
* cleaning src
* removing junk files

Where is it specified to use sh commands? I went through the installation instructions and could not find references to it. Also, do I need to compile the src/*.c files separately? I assume Rcmd build should take care of everything, as the individual src/*.c files don't compile without mysql.h and unistd.h. Thanks, __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] installing RMySql-0.6.0
Actually there is an RMYSQL windows binary in the CRAN Extras area but it crashes on my XP machine when I try to use it. I have had more luck with the Windows binary in the Bioconductor Extras area (which seems to work except for some warnings that you can ignore): http://www.bioconductor.org/packages/2.0/extra/bin/windows/contrib/2.5/RMySQL_0.6-0.zip and also another private build by one user that works without the warnings. Note that there is an r-sig-db list for this type of question: https://stat.ethz.ch/mailman/listinfo/r-sig-db On 8/7/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: Hi, I am trying to install RMySql-.0.6.0 to work with WinXP, MySQL 5 and R 2.51. there is no binary for windows on cran-r, so i grabbed RMySql-0.6.0.tar.gz and tried to compile it. I get the following error: * checking for file 'RMySQL/DESCRIPTION' ... OK * preparing 'RMySQL': * checking DESCRIPTION meta-information ... 'sh' is not recognized as an internal or external command. OK * cleaning src * removing junk files where is it specified to use .sh commands? i went through the installation instructions and could not find references to it. also do i need to compile the src/*.c files separately? i assume the rcmd build should take of everything as the individual src/.*c files don't compile without mysql.h and unistd.h. thanks, __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] lm( ) for log10-transformed data
Dear help-list, I would like to perform a linear regression on the log10 of the two vectors ov.mag.min and res.600nm. The slope and intercept of the regression I use to plot a wider range of ov.mag.min in a double log plot. For a linear regression on only two points, wouldn't I expect the results for two.points.min to match res.600nm pretty exactly? It does not seem to be the case here. I manually calculated the slope and intercept and got values of -0.71 and 1.5 that seem to work alright. What is it that I am missing here? Thanks already for your help, Kim

ov.mag.min <- c(50, 1000)
res.600nm <- c(2.0, 0.231)
lm.line.min <- lm( log(ov.mag.min,10) ~ log(res.600nm,10) )
intercept.min <- lm.line.min$coefficients[1]
slope.min <- lm.line.min$coefficients[2]
two.points.min <- log(ov.mag.min,10)
round(res.600nm,2) == round(10^(two.points.min*slope.min+intercept.min),2)
round(res.600nm,2)
round(10^(two.points.min*-0.71+1.5),2)

__ Kim Milferstedt University of Illinois at Urbana-Champaign Department of Civil and Environmental Engineering 4125 Newmark Civil Engineering Laboratory 205 North Mathews Avenue MC-250 Urbana, IL 61801 USA phone: (001) 217 333-9663 fax: (001) 217 333-6968 email: [EMAIL PROTECTED] http://cee.uiuc.edu/research/morgenroth __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] input data file
Hello, I am new to R. I used scan() to read data from tab-delimited files. I want to save all my data-reading commands (multiple scan() calls) in another file, and use it like the infile statement in SAS or \input{} in LaTeX. Thanks! __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
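The R analogue of SAS's infile/%include or LaTeX's \input is source(): put the scan()/read commands in their own script and source() it. A self-contained sketch with hypothetical file names (normally the script would be written by hand, and the data would come from a real file rather than a textConnection):

```r
## Write a tiny script of data-reading commands to disk
writeLines('x <- scan(textConnection("1 2 3"))', "read_data.R")

source("read_data.R")   # runs the commands as if typed at the prompt
x                       # the object created by the sourced script
```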
[R] how to include bar values in a barplot?
How do I include bar values in a barplot (or other R graphics, where this could be applicable)? To make sure I am clear, I am attaching a barplot created with OpenOffice.org which has the values written on top of each bar. -- Donatas Glodenis http://dg.lapas.info __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to include bar values in a barplot?
Donatas G. wrote: How do I include bar values in a barplot (or other R graphics, where this could be applicable)? To make sure I am clear I am attaching a barplot created with OpenOffice.org which has barplot values written on top of each barplot. This causes an optical illusion in which the viewer adds some of the heights of the numbers to the heights of the bars. And in general dot charts are greatly preferred to bar plots, not the least because of their reduced ink:information ratio. -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
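For completeness, the usual base-graphics recipe: capture the bar midpoints that barplot() returns and write the values with text() (Frank's caveat about the resulting optical illusion noted; the data are made up):

```r
vals <- c(23, 45, 12, 38)                             # made-up data
mids <- barplot(vals, ylim = c(0, max(vals) * 1.15))  # leave headroom for labels
text(mids, vals, labels = vals, pos = 3)              # pos = 3: just above each bar
```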
[R] lmer() : crossed-random-effects specification
Dear all, I want to estimate a crossed-random-effects model (i.e., measurements, students, schools) where students migrate between schools over time. I'm interested in the fixed effects of SES, age and their interaction on read (reading achievement) while accounting for the sample design. Based on a previous post, I'm specifying my model as:

fm1 <- lmer2(read ~ SES*age + (1 | schlid:idl) + (1 | schlid), hamburg, control=list(gradient = FALSE, niterEM = 0))

where my data is hamburg, and idl and schlid are the student and school ids, respectively. (1) Is this the specification I want to estimate to obtain those effects while accounting for the design? I'm not sure about the grouping variables specification. If not, how should I specify my model? I reviewed Bates (2007), "Linear mixed model implementation", and learned how to detect whether my design is nested or crossed, but not how to specify my model once I know it is crossed, as in my case. If the previous specification is correct, (2) why do I get this error message?

Error in lmerFactorList(formula, fr$mf, !is.null(family)): number of levels in grouping factor(s) 'idl:schlid' is too large
In addition: Warning messages:
1: numerical expression has 30113 elements: only the first used in: idl:schlid
2: numerical expression has 30113 elements: only the first used in: idl:schlid

My design consists of 14,047 students, 200 schools and 33,011 obs. I could, however, run this model:

fm2 <- lmer2(read ~ SES*age + (1|idl) + (1|schlid), hamburg)

But I guess it does not account for the crossed design. Does it? I'm not a statistician and am using lmer() and R for the first time today. In other words, I sincerely apologize for the very naïve question. But I would really like to estimate this model soundly, and I can't with the software I am familiar with. Any advice or references are very much appreciated. 
All the best, Daniel __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] small sample techniques
If my sample size is small, is there a particular switch option that I need to use with t.test so that it calculates the t ratio correctly? Here is a dummy example: α = 0.05; mean pain reduction for A = 27 and B = 31, and SDs are SDA = 9, SDB = 12.

drgA.p <- rnorm(5, 27, 9); drgB.p <- rnorm(5, 31, 12)
t.test(drgA.p, drgB.p) # what do I need to give as an additional parameter here?

I can do it manually but was looking for a switch option that I need to specify for t.test. Thanks ../Murli
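A sketch of the usual answer (not from the thread itself): t.test() already handles small samples; by default it computes the Welch t-test with unequal variances, and the classical pooled small-sample t-test is obtained with var.equal = TRUE.

```r
# Sketch: t.test() needs no special small-sample switch. The default is the
# Welch test (unequal variances); var.equal = TRUE gives the pooled t-test.
set.seed(1)
drgA.p <- rnorm(5, mean = 27, sd = 9)
drgB.p <- rnorm(5, mean = 31, sd = 12)

welch  <- t.test(drgA.p, drgB.p)                    # Welch (default)
pooled <- t.test(drgA.p, drgB.p, var.equal = TRUE)  # pooled variance

welch$parameter   # fractional df from the Welch-Satterthwaite approximation
pooled$parameter  # df = n1 + n2 - 2 = 8
```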
Re: [R] lm( ) for log10-transformed data
On 8/7/2007 2:58 PM, Kim Milferstedt wrote: Dear help-list, I would like to perform a linear regression on the log10 of the two vectors ov.mag.min and res.600nm. The slope and intercept of the regression I use to plot a wider range of ov.mag.min in a double-log plot. For a linear regression on only two points, wouldn't I expect the results for two.points.min to match res.600nm pretty exactly? That does not seem to be the case here. I manually calculated the slope and intercept and got values of -0.71 and 1.5 that seem to work all right. What is it that I am missing here? Thanks already for your help, Kim

ov.mag.min <- c(50, 1000)
res.600nm <- c(2.0, 0.231)
lm.line.min <- lm( log(ov.mag.min, 10) ~ log(res.600nm, 10) )
intercept.min <- lm.line.min$coefficients[1]
slope.min <- lm.line.min$coefficients[2]
two.points.min <- log(ov.mag.min, 10)
round(res.600nm, 2) == round(10^(two.points.min*slope.min + intercept.min), 2)
round(res.600nm, 2)
round(10^(two.points.min*-0.71 + 1.5), 2)

The convention in the lm() function is to express a model as response ~ predictor. So you should expect

round(ov.mag.min, 2) == round(10^(log(res.600nm, 10)*slope.min + intercept.min), 2)

not the other way around, and this does evaluate to two TRUE values. Duncan Murdoch
Re: [R] array
Hi, Tiandao Li wrote: Hello, I have some files generated from microarray experiments. I used scan() to read the files, and assigned each file to a unique name with many rows and columns. Now I want to create an array (ArrayA) with unique names, so I can use ArrayA[1,2][[6]] to refer to the data in each file. Is there any package available for an array of arrays?

If all of your initial arrays consist only of rows and columns (i.e., matrices) and have the same number of rows and columns, you can store them conveniently in a three-dimensional array:

mat1a <- matrix(1:10, ncol=2)
mat2a <- matrix(11:20, ncol=2)
mat1a
mat2a
arr3D <- array(numeric(0), dim=c(nrow(mat1a), ncol(mat1a), 2))
arr3D[,,1] <- mat1a
arr3D[,,2] <- mat2a
arr3D
arr3D[,,2]

If your initial arrays (assuming again that you have matrices) have varying numbers of rows and columns, I would suggest using lists:

mat1b <- matrix(1:10, ncol=2)
mat2b <- matrix(101:133, ncol=3)
mat1b
mat2b
list3D <- list(numberone=mat1b, numbertwo=mat2b)
list3D
list3D[[1]]
list3D[[2]]
list3D[["numberone"]]
list3D[["numbertwo"]]

I hope this helps. Best, Roland
[R] How do I generate a grid of data in [0,1]?
Hello, my objective is to produce a data.frame that represents a grid of data points evenly spaced as closely together as possible on a graph with domain and range both being [0,1]. I tried to get a smaller example working first and I tried the following code:

sink(file="Rsampledata")
a <- 0
while(a < 1) {data <- data.frame(a, seq(0, 1, 0.1)); print(data); a <- a + 0.1}

My output was eleven separate data frames, one per value of a; the first was:

    a seq.0..1..0.1.
1   0            0.0
2   0            0.1
...
11  0            1.0

followed by the corresponding data frames for a = 0.1, 0.2, ..., 1.0. This does produce the kind of data points that I will need; however, I need all these data points in one single data.frame instead of 11 because of further formatting I must do with the data. Any suggestions?
Thank you, Elysia
Re: [R] How do I generate a grid of data in [0,1]?
Try:

data <- expand.grid( a= seq(0,1,.1), b= seq(0,1,.1) )

Hope this helps, -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare [EMAIL PROTECTED] (801) 408-8111

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of esh Sent: Tuesday, August 07, 2007 1:59 PM To: r-help@stat.math.ethz.ch Subject: [R] How do I generate a grid of data in [0,1]? [original question quoted in full; see the message above]
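A quick check (a sketch, not from the thread) that expand.grid() does exactly what the question asks for, in one data.frame:

```r
# expand.grid() builds all 11 x 11 combinations of the two sequences
# as a single data.frame, replacing the while-loop entirely.
grid <- expand.grid(a = seq(0, 1, 0.1), b = seq(0, 1, 0.1))
dim(grid)      # 121 rows, 2 columns
head(grid, 3)  # first few (a, b) pairs
```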
[R] backslash c
How do I get output with \color{blue}, i.e. with only one backslash?

> "\color{blue}"
[1] "color{blue}"
Warning messages:
1: '\c' is an unrecognized escape in a character string
2: unrecognized escape removed from "\color{blue}"
> "\\color{blue}"
[1] "\\color{blue}"

Any help greatly appreciated. Best regards, Mikkel

> sessionInfo()
R version 2.5.1 (2007-06-27) i386-pc-mingw32
locale: LC_COLLATE=English_Ireland.1252;LC_CTYPE=English_Ireland.1252;LC_MONETARY=English_Ireland.1252;LC_NUMERIC=C;LC_TIME=English_Ireland.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base
Re: [R] backslash c
Perhaps:

cat("\\color{blue}\n")

-- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O

On 07/08/07, Mikkel Grum [EMAIL PROTECTED] wrote: [original question quoted in full; see the message above]
Re: [R] backslash c
Mikkel, As RSiteSearch("backslash") would have shown you, this is a FAQ. Your second attempt is correct. To understand why, see R FAQ 7.37, "Why does backslash behave strangely inside strings?" Chuck

On Tue, 7 Aug 2007, Mikkel Grum wrote: [original question quoted in full; see the message above]

Charles C. Berry (858) 534-2098 Dept of Family/Preventive Medicine E mailto:[EMAIL PROTECTED] UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901
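A short illustration of the FAQ's point (a sketch, not from the thread): "\\" in R source code is a single backslash character; print() shows the escaped source form, while cat() writes the actual characters.

```r
# The string below contains exactly one backslash, even though the source
# spells it with two.
s <- "\\color{blue}"
nchar(s)      # 12 characters: one backslash plus "color{blue}"
print(s)      # shows the escaped form with the doubled backslash
cat(s, "\n")  # writes \color{blue} with a single backslash
```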
Re: [R] how to include bar values in a barplot?
On Tuesday 07 August 2007 22:09:52 Donatas G. wrote: How do I include bar values in a barplot (or other R graphics, where this could be applicable)? To make sure I am clear I am attaching a barplot created with OpenOffice.org which has barplot values written on top of each barplot. Here is the barplot mentioned above: http://dg.lapas.info/wp-content/barplot-with-values.jpg it appears that this list does not allow attachments... -- Donatas Glodenis http://dg.lapas.info
Re: [R] how to include bar values in a barplot?
See: http://finzi.psych.upenn.edu/R/Rhelp02a/archive/56852.html

On 8/7/07, Donatas G. [EMAIL PROTECTED] wrote: [message quoted in full; see above]
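The approach in the linked archive post boils down to a two-step base-graphics idiom (sketched here with made-up heights): barplot() returns the bar midpoints, and text() places the values above the bars.

```r
# Sketch: barplot() invisibly returns the x-positions of the bar centers;
# text(..., pos = 3) then writes each value just above its bar.
heights <- c(23, 17, 31, 12)
mids <- barplot(heights, ylim = c(0, max(heights) * 1.1))
text(mids, heights, labels = heights, pos = 3)
```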
[R] array
Hello, I have 30 GenePix tab-delimited files generated from 10 microarray experiments, each with 3 replicates. I used scan() to read the files, and assigned each file to a unique name (such as A1, A2, A3 for experiment A, and B1, B2, and B3 for experiment B) with 1000 rows and 56 columns. Now I want to create an array (ArrayA) with these file names,

ArrayA
     [,1] [,2] [,3] ... [,10]
[1,] A1   B1   C1
[2,] A2   B2   C2
[3,] A3   B3   C3

so I can use ArrayA[1,2][[6]] to refer to the data in column 6 of experiment B, replicate 1. Is there any package available for an array of arrays? Thanks!
Re: [R] how to include bar values in a barplot?
Generally adding the numbers to a graph accomplishes 2 things: 1) it acts as an admission that your graph is a failure, and 2) it converts the graph into a poorly laid out table (with a colorful and distracting background). In general it is better to find an appropriate graph that does convey the information that is intended, or if a table is more appropriate, then replace it with a well laid out table (or both). Remember that the role of tables is to look up specific values and the role of graphs is to give a good overview. The books by William Cleveland and Tufte have a lot of good advice on these issues. Before asking how to get R to produce a graph that looks like one from a spreadsheet, you should study: http://www.burns-stat.com/pages/Tutor/spreadsheet_addiction.html and some of the links from there. You may also want to run the following in R:

library(fortunes)
fortune(120)

In general I like OpenOffice; my one main complaint is that when faced with the decision between doing something right or the same way as Microsoft, they have not always made the right decision. Hope this gives you something to think about, -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare [EMAIL PROTECTED] (801) 408-8111

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Donatas G. Sent: Tuesday, August 07, 2007 1:10 PM To: r-help@stat.math.ethz.ch Subject: [R] how to include bar values in a barplot? [original question quoted in full; see the messages above]
Re: [R] how to include bar values in a barplot?
On Tuesday 07 August 2007 22:09:52 Donatas G. wrote: How do I include bar values in a barplot (or other R graphics, where this could be applicable)? To make sure I am clear I am attaching a barplot created with OpenOffice.org which has barplot values written on top of each barplot. After more than two hours search I finally found a solution: http://tolstoy.newcastle.edu.au/R/help/06/05/27286.html and http://tolstoy.newcastle.edu.au/R/help/05/09/12936.html However, what about percentage barcharts, such as this: http://dg.lapas.info/wp-content/barplot-lytys-G07_x_mean.png And maybe somebody knows how to get the legend of the bars in the barchart (again, see the example above)? -- Donatas Glodenis http://dg.lapas.info
[R] lmer() - crossed random effects specification
Dear all, I want to estimate a crossed-random-effects model (i.e., measurements, students, schools) where students migrate between schools over time. I'm interested in the fixed effects of SES, age and their interaction on read (reading achievement) while accounting for the sample design. Based on a previous post, I'm specifying my model as:

fm1 <- lmer2(read ~ SES*age + (1 | schlid:idl) + (1 | schlid), hamburg, control=list(gradient = FALSE, niterEM = 0))

where my data is hamburg and idl and schlid are the student and school ids, respectively. (1) Is this the specification I want to estimate to obtain those effects while accounting for...? I'm not sure about the grouping variables specification. If not, how should I specify my model? I reviewed Bates (2007), "Linear mixed model implementation", and learned how to detect whether my design is nested or crossed, but not how to specify my model once I know it is crossed, as in my case. If the previous specification is correct, (2) why do I get this error message?

Error in lmerFactorList(formula, fr$mf, !is.null(family)): number of levels in grouping factor(s) 'idl:schlid' is too large
In addition: Warning messages:
1: numerical expression has 30113 elements: only the first used in: idl:schlid
2: numerical expression has 30113 elements: only the first used in: idl:schlid

My design consists of 14,047 students, 200 schools and 33,011 obs. I could, however, run this model:

fm2 <- lmer2(read ~ SES*age + (1|idl) + (1|schlid), hamburg)

But I guess it does not account for the crossed design. Does it? I'm not a statistician and am using lmer() and R for the first time today. In other words, I sincerely apologize for the very naïve question. But I would really like to estimate this model soundly and I can't with the software I am familiar with. Any advice or references are very much appreciated.
All the best, Daniel
Re: [R] lmer() : crossed-random-effects specification
On 8/7/07, Daniel Caro [EMAIL PROTECTED] wrote: [original question quoted in full; see the message above] ... I could, however, run this model:

fm2 <- lmer2(read ~ SES*age + (1|idl) + (1|schlid), hamburg)

But I guess it does not account for the crossed design. Does it?

Yes, it does account for the crossed design.
Re: [R] Mixture of Normals with Large Data
I wasn't aware of this literature, thanks for the references.

On 8/5/07, RAVI VARADHAN [EMAIL PROTECTED] wrote: Another possibility is to use data squashing methods. Relevant papers are: (1) DuMouchel et al. (1999), (2) Madigan et al. (2002), and (3) Owen (1999). Ravi. -- Ravi Varadhan, Ph.D., Assistant Professor, Division of Geriatric Medicine and Gerontology, School of Medicine, Johns Hopkins University. Ph. (410) 502-2619, email: [EMAIL PROTECTED]

----- Original Message ----- From: Charles C. Berry [EMAIL PROTECTED] Date: Saturday, August 4, 2007 8:01 pm Subject: Re: [R] Mixture of Normals with Large Data To: [EMAIL PROTECTED] Cc: r-help@stat.math.ethz.ch

On Sat, 4 Aug 2007, Tim Victor wrote: All: I am trying to fit a mixture of 2 normals with 110 million observations. I am running R 2.5.1 on a box with 1gb RAM running 32-bit windows and I continue to run out of memory. Does anyone have any suggestions.

If the first few million observations can be regarded as a SRS of the rest, then just use them. Or read in blocks of a convenient size and sample some observations from each block. You can repeat this process a few times to see if the results are sufficiently accurate. Otherwise, read in blocks of a convenient size (perhaps 1 million observations at a time), quantize the data to a manageable number of intervals - maybe a few thousand - and tabulate it. Add the counts over all the blocks. Then use mle() to fit a multinomial likelihood whose probabilities are the masses associated with each bin under a mixture of normals law. Chuck

Thanks so much, Tim

Charles C. Berry (858) 534-2098, Dept of Family/Preventive Medicine, UC San Diego, La Jolla, San Diego 92093-0901
[R] S4 methods: unable to find an inherited method
Hello all, I consider myself pretty new to the whole OO based programming so I'm sorry if I'm doing something stupid.

xml <- read.metlin(url)
Error in function (classes, fdef, mtable) : unable to find an inherited method for function "read.metlin", for signature "url"

read.metlin
standardGeneric for "read.metlin" defined from package ".GlobalEnv"
function (xml, ...) standardGeneric("read.metlin")
environment: 0x83a8ae4
Methods may be defined for arguments: xml

url
description: "http://metlin.scripps.edu/download/MSMS_test.XML"  class: "url"  mode: "r"  text: "text"  opened: "closed"  can read: "yes"  can write: "no"

I defined my methods as:

if (!isGeneric("read.metlin"))
    setGeneric("read.metlin", function(xml) standardGeneric("read.metlin"))

setMethod("read.metlin", "xcmsRaw", function(xml) {
    # Parsing the METLIN XML File
    reading <- readLines(xml)
    # do rest of script
})

Any help as to why I'm getting the inherited method error would be great. Cheers, Paul
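The error message itself points at the cause: `url(...)` returns a connection of class "url", but the only method defined is for signature "xcmsRaw", so dispatch finds nothing. A minimal sketch of one fix (keeping the generic name from the post, but outside any package; the "character" method here is an illustrative assumption, not the poster's code):

```r
# Sketch: S4 dispatch matches on the class of the argument. Defining a
# method for "character" lets the generic accept a URL or file path given
# as a plain string and read it with readLines().
library(methods)

setGeneric("read.metlin", function(xml, ...) standardGeneric("read.metlin"))

setMethod("read.metlin", "character", function(xml, ...) {
  readLines(xml)  # xml is a URL or file path as a string
})
```

Calling `read.metlin("http://...")` then dispatches on "character"; a separate method for signature "url" would be needed to pass an already-opened connection.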
Re: [R] Mixture of Normals with Large Data
Why would anyone want to fit a mixture of normals with 110 million observations?? Any questions about the distribution that you would care to ask can be answered directly from the data. Of course, any test of normality (or anything else) would be rejected. More to the point, the data are certainly not a random sample of anything. There will be all kinds of systematic nonrandom structure in them. This is clearly a situation where the researcher needs to think more carefully about the substantive questions of interest and how the data may shed light on them, instead of arbitrarily and perhaps reflexively throwing some silly statistical methodology at them. Bert Gunter Genentech Nonclinical Statistics

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Tim Victor Sent: Tuesday, August 07, 2007 3:02 PM To: r-help@stat.math.ethz.ch Subject: Re: [R] Mixture of Normals with Large Data [earlier messages in this thread quoted in full; see above]
Re: [R] Mixture of Normals with Large Data
Have you considered the situation of wanting to characterize probability densities of prevalence estimates based on a complex random sample of some large population?

On 8/7/07, Bert Gunter [EMAIL PROTECTED] wrote: [message quoted in full; see above]
Re: [R] Mixture of Normals with Large Data
Have you considered the situation of wanting to characterize probability densities of prevalence estimates based on a complex random sample of some large population?

No -- and I stand by my statement. The empirical distribution of the data themselves is the best characterization of the density. You and others are free to disagree.

-- Bert

On 8/7/07, Bert Gunter [EMAIL PROTECTED] wrote:

Why would anyone want to fit a mixture of normals with 110 million observations? Any questions about the distribution that you would care to ask can be answered directly from the data. Of course, any test of normality (or anything else) would be rejected. More to the point, the data are certainly not a random sample of anything. There will be all kinds of systematic nonrandom structure in them. This is clearly a situation where the researcher needs to think more carefully about the substantive questions of interest and how the data may shed light on them, instead of arbitrarily and perhaps reflexively throwing some silly statistical methodology at them.

Bert Gunter
Genentech Nonclinical Statistics
[R] R2WinBUGS results not different with different runs
Hi All, I don't know if anyone else has noticed the same thing, but with 2 subsequent runs of the same syntax I am getting exactly the same results. I was expecting that results would differ slightly, say in the 4th or 5th decimal place. Is this a peculiarity of R2WinBUGS? Does it have something to do with the seed value? Isn't the seed value reset every time I restart WinBUGS?

Thanks, Toby
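[Archive note: as far as I can tell, R2WinBUGS drives WinBUGS with a script that starts the sampler from the same state on every launch unless you supply a different seed (see the bugs.seed argument of bugs() - check your version's help page, as this is an assumption). The R-side behaviour is the same and easy to demonstrate:]

```r
# identical seeds give identical draws; a different seed gives a
# different run - so identical results imply the seed was not varied
set.seed(1); a <- rnorm(3)
set.seed(1); b <- rnorm(3)
identical(a, b)    # TRUE: same seed, same stream

set.seed(2); c2 <- rnorm(3)
identical(a, c2)   # FALSE: different seed, different stream
```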
[R] Conditional subsetting
Hello- Upon searching the email archives and reading documentation I haven't found what I'm looking for. I have a dataset with a time/date stamp on a series of geographic locations (lat/lon) associated with the movements of animals. On some days there are numerous locations and on some days there is only one location. Associated with each location is an indication that the quality of the location is either good, poor, or unknown.

For each animal, I want to extract only one location for each day, and I want the location extracted to be nearest to 12pm on any given day, and of the highest quality possible. So the order of priority for my extraction is: 1. only one location per animal each day; 2. the location selected is as close to 12pm as possible; 3. the selected location comes from the best locations available (i.e. take from the pool of good locations first, or select from poor locations if a good location isn't available, or unknown if nothing else is available). The aspect of this task that has me particularly stumped is how to select only one location for each day, and for that location to be as close to 12pm as possible.

An example of my dataset follows:

DeployID Date.Time LocationQuality Latitude Longitude
STM05-1 28/02/2005 17:35 Good -35.562 177.158
STM05-1 28/02/2005 19:44 Good -35.487 177.129
STM05-1 28/02/2005 23:01 Unknown -35.399 177.064
STM05-1 01/03/2005 07:28 Unknown -34.978 177.268
STM05-1 01/03/2005 18:06 Poor -34.799 177.027
STM05-1 01/03/2005 18:47 Poor -34.85 177.059
STM05-2 28/02/2005 12:49 Good -35.928 177.328
STM05-2 28/02/2005 21:23 Poor -35.926 177.314

Many thanks for your input. I'm using R 2.5.1 on Windows XP.

Cheers,
Tim Sippel (MSc)
School of Biological Sciences, Auckland University
Private Bag 92019, Auckland 1020, New Zealand
+64-9-373-7599 ext. 84589 (work), +64-9-373-7668 (Fax), +64-21-593-001 (mobile)
Re: [R] input data file
If you are going to read it back into R, then use 'save'; if it is input to another application, consider 'write.csv'. I assume that when you say save all my data files you really mean save all my R objects.

On 8/7/07, Tiandao Li [EMAIL PROTECTED] wrote:

Hello, I am new to R. I used scan() to read data from tab-delimited files. I want to save all my data files (multiple scan()) in another file, and use it like an infile statement in SAS or \input{tex.file} in latex. Thanks!

-- Jim Holtman, Cincinnati, OH, +1 513 646 9390
What is the problem you are trying to solve?
Re: [R] how to convert decimal date to its equivalent date format(YYYY.mm.dd.hr.min.sec)
Is this what you want?

x <- scan(textConnection("1979.00
1979.020833
1979.041667
1979.062500"), what = 0)

# get the year, then determine the number of seconds in that year so
# the decimal part of the year can be used
x.year <- floor(x)
# fraction of the year
x.frac <- x - x.year
# number of seconds in each year
x.sec.yr <- unclass(ISOdate(x.year + 1, 1, 1, 0, 0, 0)) -
            unclass(ISOdate(x.year, 1, 1, 0, 0, 0))
# now get the actual time
x.actual <- ISOdate(x.year, 1, 1, 0, 0, 0) + x.frac * x.sec.yr
x.actual
[1] "1979-01-01 00:00:00 GMT" "1979-01-08 14:29:49 GMT" "1979-01-16 05:00:10 GMT"
[4] "1979-01-23 19:30:00 GMT"

On 8/7/07, Yogesh Tiwari [EMAIL PROTECTED] wrote:

Hello R Users, how do I convert a decimal date to a date of the form YYYY.mm.dd.hr.min.sec? For example, I have decimal dates in one column, and want to convert and write them as equivalent dates (YYYY.mm.dd.hr.min.sec) in the next six columns:

1979.00
1979.020833
1979.041667
1979.062500

Is it possible in R? Kindly help.

Regards, Yogesh

-- Dr. Yogesh K. Tiwari, Scientist, Indian Institute of Tropical Meteorology, Homi Bhabha Road, Pashan, Pune-411008, INDIA
Phone: 0091-99 2273 9513 (Cell); 0091-20-258 93 600 (O) (Ext. 250); Fax: 0091-20-258 93 825

-- Jim Holtman, Cincinnati, OH, +1 513 646 9390
What is the problem you are trying to solve?
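[Archive note: the same approach wrapped in a small helper, for reuse. The function name decimal2date is made up for illustration; ISOdate defaults to GMT, and leap years are handled because the year length is computed per year.]

```r
# convert a decimal year (e.g. 1979.020833) to a POSIXct date-time
decimal2date <- function(x) {
  yr    <- floor(x)
  start <- ISOdate(yr, 1, 1, 0, 0, 0)        # Jan 1 of the year, GMT
  secs  <- as.numeric(ISOdate(yr + 1, 1, 1, 0, 0, 0)) - as.numeric(start)
  start + (x - yr) * secs                    # add the fractional year
}

decimal2date(c(1979.00, 1979.020833))
```

The second value comes out as 1979-01-08 14:29:49 GMT, matching the output above.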
Re: [R] Conditional subsetting
Assuming *an* animal can only be in one location at any one time, I don't understand how, once you have selected the location nearest 12pm, you can select based on the type of location. Do you mean to select on location before time, i.e. from the best location visited that day which was closest to 12pm?

Perhaps for condition 2, for an animal: which.min(time - 1200) (replace 1200 with a properly defined timestamp representing 12:00pm).

Ross Darnell

-----Original Message-----
From: [EMAIL PROTECTED] on behalf of Tim Sippel
Sent: Wed 08-Aug-07 9:37 AM
To: r-help@stat.math.ethz.ch
Subject: [R] Conditional subsetting
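[Archive note: Ross's which.min() idea written out with a real time class; taking the absolute difference is what makes "nearest to noon" symmetric on both sides. The timestamps here are three of the rows from the example data.]

```r
# pick the row whose timestamp is closest to noon on a given day
tt <- as.POSIXct(c("2005-02-28 17:35", "2005-02-28 19:44",
                   "2005-02-28 23:01"), tz = "GMT")
noon <- as.POSIXct("2005-02-28 12:00", tz = "GMT")

# index of the observation nearest to noon (here the 17:35 fix)
idx <- which.min(abs(as.numeric(difftime(tt, noon, units = "mins"))))
idx   # 1
```

Applied per animal per day (e.g. inside tapply() or by()), this gives condition 2; quality ordering can then break ties or pre-filter the pool.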
Re: [R] small sample techniques
Hi Nair, If the two populations are normal the t-test gives you the exact result whatever the sample size (the sample size affects the number of degrees of freedom). When the populations are not normal and the sample size is large it is still OK to use the t-test (because of the Central Limit Theorem), but this is not necessarily true for small sample sizes. You could use simulation to find the relevant probabilities.

--- Nair, Murlidharan T [EMAIL PROTECTED] wrote:

If my sample size is small, is there a particular switch option that I need to use with t.test so that it calculates the t ratio correctly? Here is a dummy example: alpha = 0.05; mean pain reduction for A = 27, B = 31; and the SDs are SDA = 9, SDB = 12.

drgA.p <- rnorm(5, 27, 9); drgB.p <- rnorm(5, 31, 12)
t.test(drgA.p, drgB.p) # what do I need to give as additional parameter here?

I can do it manually but was looking for a switch option that I need to specify for t.test. Thanks ../Murli
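[Archive note: concretely, t.test() needs no special small-sample switch; the default is the Welch test with fractional degrees of freedom, and var.equal = TRUE gives the classical pooled two-sample test with df = n1 + n2 - 2. A sketch with the dummy numbers from the question:]

```r
# small samples drawn with the question's means/SDs (seed is arbitrary)
set.seed(66)
drgA.p <- rnorm(5, 27, 9)
drgB.p <- rnorm(5, 31, 12)

welch  <- t.test(drgA.p, drgB.p)                   # Welch, the default
pooled <- t.test(drgA.p, drgB.p, var.equal = TRUE) # classical pooled test

pooled$parameter   # df = 8, i.e. n1 + n2 - 2 for two samples of size 5
```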
Re: [R] Conditional subsetting
Thanks Ross. Yes, upon your remark it seems that maybe I need to select first for the best available location, and then by the time stamp nearest to 12pm. It seems like tapply(), aggregate() or aggregate.ts() might be applicable; I'm working through trying different functions and have yet to find what works. Suggestions on which function to start with would be appreciated.

Tim Sippel (MSc)
Re: [R] Conditional subsetting
I have tried the following to extract my subset, and no luck yet. Any ideas?

subset.df <- subset(df, subset = c(DeployID, Date.Time,
    LocationQuality == ordered(c("Good", "Poor", "Unknown")),
    Latitude, Longitude),
    select = min(abs(Date.Time - 12:00)))

I'm not sure if my 'ordered' argument will accomplish my desired result of first selecting for Good locations, etc.

Thank you,
Tim Sippel (MSc)
Re: [R] input data file
In the first part of myfile.R, I used scan() 100 times to read data from 100 different tab-delimited files. I want to save this part to another data file, so I won't accidentally make mistakes, and I want to re-use/input it like an infile statement in SAS or \input{file.tex} in latex. I don't want to copy/paste 100 scan() calls every time I need to read the same data. Thanks!
Re: [R] Conditional subsetting
Read the data in and transform the Date, Time and Quality columns so that Date and Time are the respective chron classes (see the R News 4/1 Help Desk article) and Quality is a factor such that lower levels are better. Then sort the data frame by Date, Quality and time from noon. That would actually be the default ordering if the shown factor levels are the only ones, since they naturally fall into that order, but we have explicitly ordered them in case your extension to other levels no longer obeys that property. Finally take the first row from each Date group.

Lines <- "DeployID Date.Time LocationQuality Latitude Longitude
STM05-1 28/02/2005 17:35 Good -35.562 177.158
STM05-1 28/02/2005 19:44 Good -35.487 177.129
STM05-1 28/02/2005 23:01 Unknown -35.399 177.064
STM05-1 01/03/2005 07:28 Unknown -34.978 177.268
STM05-1 01/03/2005 18:06 Poor -34.799 177.027
STM05-1 01/03/2005 18:47 Poor -34.85 177.059
STM05-2 28/02/2005 12:49 Good -35.928 177.328
STM05-2 28/02/2005 21:23 Poor -35.926 177.314"

library(chron)
# in the next line replace textConnection(Lines) with "myfile.dat"
DF <- read.table(textConnection(Lines), skip = 1, as.is = TRUE,
    col.names = c("Id", "Date", "Time", "Quality", "Lat", "Long"))
DF <- transform(DF,
    Date = chron(Date, format = "d/m/y"),
    Time = times(paste(Time, "00", sep = ":")),
    Quality = factor(Quality, levels = c("Good", "Poor", "Unknown")))
o <- order(DF$Date, as.numeric(DF$Quality), abs(DF$Time - times("12:00:00")))
DF <- DF[o, ]
DF[tapply(row.names(DF), DF$Date, head, 1), ]

# The last line above could alternately be written like this:
do.call(rbind, by(DF, DF$Date, head, 1))
Re: [R] input data file
I would hope that you don't have 100 'scan' statements; you should just have a loop that uses a vector of file names to read the data. Are you reading the data into separate objects? If so, have you considered reading the 100 files into a 'list', so that you have a single object with all of your data? That is easy to save with the 'save' function, and you can quickly retrieve it with 'load'.

file.names <- c('file1', ..., 'file100')
input.list <- list()
for (i in file.names) {
    input.list[[i]] <- scan(i, what = "")
}

You can then save(input.list, file = 'save.Rdata') and access the data from the individual files with input.list[['file33']].

-- Jim Holtman, Cincinnati, OH, +1 513 646 9390
What is the problem you are trying to solve?
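[Archive note: the save/load round trip Jim describes, as a runnable sketch. The object and file names here are illustrative; note that load() restores the object under its original name.]

```r
# a list standing in for 'input.list' built from scan() calls above
input.list <- list(expA = 1:3, expB = letters[1:2])

f <- tempfile(fileext = ".Rdata")   # illustrative path
save(input.list, file = f)          # write the whole list to one file

rm(input.list)                      # simulate a fresh session
load(f)                             # restores 'input.list' by name

names(input.list)                   # "expA" "expB"
```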
[R] Binary Search
Hi! R is an amazing piece of software, and with so many libraries it can do almost anything... but I was very surprised that a standard binary search function seems not to exist. I can find other, much more complex search routines (optimize, uniroot, nlm) in the standard no-extra-packages-loaded version of R, but not this simple alternative. I searched and found an implementation inside the package genetics, but that requires loading an awful lot of other code to get to this simple function. Have I missed something? Perhaps I just didn't find what is already in there. Can anyone point me to what I'm after? If not, perhaps the developers might consider making it more readily available? Cheers, Matthew
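[Archive note: base R's findInterval() performs a binary search over a sorted vector, which covers the common uses; a hand-rolled exact-match search is only a few lines. The bsearch() name below is made up for illustration.]

```r
x <- c(2, 4, 4, 7, 9)          # must be sorted

# findInterval() binary-searches for the last element <= the query
findInterval(5, x)             # 3: x[3] = 4 is the last element <= 5

# a plain binary search for an exact match in a sorted vector
bsearch <- function(v, target) {
  lo <- 1L; hi <- length(v)
  while (lo <= hi) {
    mid <- (lo + hi) %/% 2L
    if (v[mid] == target) return(mid)
    if (v[mid] < target) lo <- mid + 1L else hi <- mid - 1L
  }
  NA_integer_                  # not found
}

bsearch(x, 7)                  # 4
bsearch(x, 6)                  # NA
```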
Re: [R] input data file
I thought of loop at first. My data were generated from 32 microarray experiments, each had 3 replicates, 96 files in total. I named the files based on different conditions or time series, and I really won't want to name them after numbers. It will make me confused later when I need to refer/compare them. On Tue, 7 Aug 2007, jim holtman wrote: I would hope that you don't have 100 'scan' statements; you should just have a loop that is using a set of file names in a vector to read the data. Are you reading the data into separate objects? If so, have you considered reading the 100 files into a 'list' so that you have a single object with all of your data? This is then easy to save with the 'save' function and then you can quickly retrieve it with the 'load' statement. file.names - c('file1', ..., 'file100') input.list - list() for (i in file.names){ input.list[[i]] - scan(i, what=) } You can then 'save(input.list, file='save.Rdata')'. You can access the data from the individual files with: input.list[['file33']] On 8/7/07, Tiandao Li [EMAIL PROTECTED] wrote: In the first part of myfile.R, I used scan() 100 times to read data from 100 different tab-delimited files. I want to save this part to another data file, so I won't accidently make mistakes, and I want to re-use/input it like infile statement in SAS or \input(file.tex} in latex. Don't want to copy/paste 100 scan() every time I need to read the same data. Thanks! On Tue, 7 Aug 2007, jim holtman wrote: If you are going to read it back into R, then use 'save'; if it is input to another applicaiton, consider 'write.csv'. I assume that when you say save all my data files you really mean save all my R objects. On 8/7/07, Tiandao Li [EMAIL PROTECTED] wrote: Hello, I am new to R. I used scan() to read data from tab-delimited files. I want to save all my data files (multiple scan()) in another file, and use it like infile statement in SAS or \input{tex.file} in latex. Thanks! 
-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
Re: [R] input data file
You don't have to name them after numbers. What I sent was just an example of a character vector with file names. If you have all the files in a directory, then you can set the loop to read in all the files (or selected ones, based on a pattern match). If you are copy/pasting the 'scan' command, then you must somehow be changing the file name that is being read and the R object that you are storing the values in. You can use list.files(pattern=..) to select a list of file names. This is much easier than copy/paste. On 8/8/07, Tiandao Li [EMAIL PROTECTED] wrote: I thought of a loop at first. My data were generated from 32 microarray experiments, each with 3 replicates, 96 files in total. I named the files based on different conditions or time series, and I really don't want to name them after numbers; it would confuse me later when I need to refer to/compare them. On Tue, 7 Aug 2007, jim holtman wrote: I would hope that you don't have 100 'scan' statements; you should just have a loop that uses a set of file names in a vector to read the data. Are you reading the data into separate objects? If so, have you considered reading the 100 files into a 'list' so that you have a single object with all of your data? This is then easy to save with the 'save' function, and then you can quickly retrieve it with the 'load' statement.

file.names <- c('file1', ..., 'file100')
input.list <- list()
for (i in file.names) {
    input.list[[i]] <- scan(i, what = "")
}

You can then save(input.list, file = 'save.Rdata'). You can access the data from the individual files with: input.list[['file33']] On 8/7/07, Tiandao Li [EMAIL PROTECTED] wrote: In the first part of myfile.R, I used scan() 100 times to read data from 100 different tab-delimited files. I want to save this part to another data file, so I won't accidentally make mistakes, and I want to re-use/input it like the infile statement in SAS or \input{file.tex} in LaTeX. I don't want to copy/paste 100 scan() calls every time I need to read the same data.
Thanks! On Tue, 7 Aug 2007, jim holtman wrote: If you are going to read it back into R, then use 'save'; if it is input to another application, consider 'write.csv'. I assume that when you say "save all my data files" you really mean "save all my R objects". On 8/7/07, Tiandao Li [EMAIL PROTECTED] wrote: Hello, I am new to R. I used scan() to read data from tab-delimited files. I want to save all my data files (multiple scan()) in another file, and use it like the infile statement in SAS or \input{tex.file} in LaTeX. Thanks! -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
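[Editor's note: the loop-free pattern Jim describes can be sketched as follows. The directory and file names here are toy stand-ins created for the demo; point list.files() at your own directory of 96 microarray files and adjust the pattern.]

```r
# Collect file names with list.files(), read each with scan() into one
# named list, then save() the whole list as a single .Rdata file.
dir <- file.path(tempdir(), "demo-scans")
dir.create(dir, showWarnings = FALSE)
for (f in c("cond1_rep1.txt", "cond1_rep2.txt"))   # toy files for the demo
  writeLines(c("1.2", "3.4"), file.path(dir, f))

file.names <- list.files(dir, pattern = "\\.txt$", full.names = TRUE)
input.list <- lapply(file.names, scan, what = "", quiet = TRUE)
names(input.list) <- basename(file.names)          # keep descriptive names
save(input.list, file = file.path(dir, "save.Rdata"))
# later: load(file.path(dir, "save.Rdata")); input.list[["cond1_rep1.txt"]]
```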
Re: [R] R2WinBUGS results not different with different runs
toby909 at gmail.com writes: Is this a specialty of R2WinBUGS? Does it have something to do with the seed value? Isn't the seed value reset every time I restart WinBUGS? I always have the same seed if I start WinBUGS multiple times. Gregor
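[Editor's note: WinBUGS starts from the same default RNG state on every launch, which is why repeated runs reproduce each other. If your version of R2WinBUGS exposes it, the bugs.seed argument to bugs() changes the WinBUGS seed per call. The snippet below is an untested sketch: data, inits, parameters, and model.txt are placeholders, and running it requires a working WinBUGS installation.]

```r
# Hypothetical sketch -- vary bugs.seed to get different chains across runs.
library(R2WinBUGS)
sim1 <- bugs(data, inits, parameters, "model.txt",
             n.chains = 3, n.iter = 2000, bugs.seed = 1)
sim2 <- bugs(data, inits, parameters, "model.txt",
             n.chains = 3, n.iter = 2000, bugs.seed = 2)  # different draws
```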
[R] Define t-distribution using data, and ghyp package
Dear R users, I am fitting a t-distribution to some data, then selecting randomly from the distribution. I am using the brand-new ghyp package, which seems designed to do this. Firstly, is this approach appropriate, or are there alternatives I should also consider? I have a small dataset; the hyperbolic distribution is also used when heavy tails are expected, but I understand the t-distribution is designed for understanding variance when data are limited. I have 2 questions specific to ghyp; as this package is new, perhaps my limited knowledge is compounded by some teething issues: 1. How should I set the argument nu in the function fit.tuv()? It appears to be a starting value. The preliminary draft help file says nu is defined as -2*lambda, but lambda is not defined. Experimenting with Data1 below, nu=2.5 gives no error messages and hist() looks vaguely reasonable, although Dist1 lists Converge=FALSE. nu=2 gives "Error in FUN(newX[, i], ...) : If lambda < 0: chi must be > 0 and psi must be = 0! lambda = -1; chi = 0; psi = 0." nu=3 gives "Warning message: NaNs produced in: sqrt(diag(hess))", and again Dist1 gives Converge=FALSE, but also the resulting hist() is scrambled. Looking at Data2, again nu=2.5 gives no error messages, and Dist2 lists Converge=TRUE. Interestingly, Dist2 lists nu=18.2927194, but if I set nu=18 in fit.tuv() I get "Warning message: NaNs produced in: sqrt(diag(hess))" and Converge=FALSE (!). 2. What is the significance/consequence of Converge=FALSE? How can I achieve Converge=TRUE, apart from collecting more data? 3. In fit.tuv() I have experimented with setting the argument symmetric=TRUE, but why is the resulting distribution always "Asymmetric Student-t"?
Code:

library(ghyp)
Data1 <- c(37.07000, 46.94609, 38.19270, 41.98090, 36.45126, 45.25217, 39.07771, 39.35987)
Dist1 <- fit.tuv(Data1, nu = 2.5, opt.pars = c(nu = TRUE, mu = TRUE, sigma = TRUE), symmetric = TRUE)
hist(Dist1, data = Data1, gaussian = TRUE, log.hist = FALSE, ylim = c(0, 0.5),
     ghyp.col = 1, ghyp.lwd = 1, ghyp.lty = "solid", col = 1, nclass = 30)
Data2 <- c(0.07, 2.68, 2.00, 0.27, 0.39, 1.17, 1.34, 4.34, 3.36, 3.61, 1.28)
Dist2 <- fit.tuv(Data2, nu = 2.5, opt.pars = c(nu = TRUE, mu = TRUE, sigma = TRUE), symmetric = TRUE)
hist(Dist2, data = Data2, gaussian = TRUE, log.hist = FALSE, ylim = c(0, 1),
     ghyp.col = 1, ghyp.lwd = 1, ghyp.lty = "solid", col = 1, nclass = 30)

Thanks in advance, Scott.
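[Editor's note: as a cross-check under the symmetric-t assumption, a location-scale Student-t can also be fitted with MASS::fitdistr() from the recommended MASS package and then sampled with rt(). The starting values and bounds below are assumptions for this small sample, not recipes; the optimizer may still warn or need tuning.]

```r
# Fit a location-scale t (parameters m, s, df) to Data2, then draw from it:
# a t variate scaled by s and shifted by m has the fitted distribution.
library(MASS)
Data2 <- c(0.07, 2.68, 2.00, 0.27, 0.39, 1.17, 1.34, 4.34, 3.36, 3.61, 1.28)
fit <- fitdistr(Data2, "t",
                start = list(m = mean(Data2), s = sd(Data2), df = 5),
                lower = c(-Inf, 0.001, 1))
draws <- fit$estimate["m"] + fit$estimate["s"] * rt(1000, fit$estimate["df"])
```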
Re: [R] Naming Lists
Thanks, I have considered using an array but I don't believe it's possible. My function isn't really paste(); it's a custom function that takes the three different parameters. The result, however, is a matrix whose size depends on the inputs, and I believe that's hard to fit into an array. But if you have any suggestions for how I could improve the code, feel free to pass them over; I would be glad for anything. //Tom Charles C. Berry wrote: On Tue, 7 Aug 2007, Tom.O wrote: Hi, I'm pretty new to R and I have run into a problem. How do I name all the levels in this list? One way:

MyList <- mapply(function(x)
    mapply(function(y)
        mapply(function(z) paste(unlist(x), unlist(y), unlist(z)),
               Lev3, SIMPLIFY = FALSE),
        Lev2, SIMPLIFY = FALSE),
    Lev1, SIMPLIFY = FALSE)

However, for many purposes you might better use an array:

MyArray.dimnames <- list(Lev1, Lev2, Lev3)
combos <- rev(expand.grid(Lev3, Lev2, Lev1))
elements <- do.call(paste, combos)
MyArray <- array(elements, dim = sapply(MyArray.dimnames, length),
                 dimnames = MyArray.dimnames)
MyArray['A1', 'B1', 'C2']

Lev1 <- c("A1", "A2")
Lev2 <- c("B1", "B2")
Lev3 <- c("C1", "C2")
MyList <- lapply(Lev1, function(x) {
    lapply(Lev2, function(y) {
        lapply(Lev3, function(z) {
            paste(unlist(x), unlist(y), unlist(z))
        })})})

I would like to name the different levels in the list with the different inputs, so that the first level is named by the inputs in Lev1, the second level by the inputs in Lev2, and so on. I would then like to use this call: MyList$A1$B1$C1 giving "A1 B1 C1". Regards Tom Charles C.
Berry (858) 534-2098 Dept of Family/Preventive Medicine E mailto:[EMAIL PROTECTED] UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901
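[Editor's note: the naming the poster asks for can also be done directly with base R's setNames(), attaching the level labels as list names at every depth so that MyList$A1$B1$C1 works. A minimal sketch using the thread's own Lev1/Lev2/Lev3 example:]

```r
# Build the nested list with names at every level via setNames().
Lev1 <- c("A1", "A2"); Lev2 <- c("B1", "B2"); Lev3 <- c("C1", "C2")
MyList <- setNames(lapply(Lev1, function(x)
  setNames(lapply(Lev2, function(y)
    setNames(lapply(Lev3, function(z)
      paste(x, y, z)), Lev3)), Lev2)), Lev1)

MyList$A1$B1$C1  # "A1 B1 C1"
```

The inner paste() is a stand-in; any custom function of (x, y, z), including one returning a matrix, drops in unchanged, since list elements need not be scalars.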
Re: [R] Conditional subsetting
Another way to do this is to use SQLite select statements via the sqldf package. First, let us assume that the alphabetic order of the Quality levels reflects their desirability as well. In the example data this is true, but if it's not true in the whole data set, where there may be additional levels of Quality, then it would need to be recoded first. The first select statement sorts the data and the second one just takes the last row in each group. (It seems that when using the group by clause it takes the last row of each group, but I could not actually find this documented, so beware.) Note that the Time and Date names are changed by the underlying software, which is why we must refer to them as Time__1 and Date__1.

Lines <- "DeployID Date.Time LocationQuality Latitude Longitude
STM05-1 28/02/2005 17:35 Good -35.562 177.158
STM05-1 28/02/2005 19:44 Good -35.487 177.129
STM05-1 28/02/2005 23:01 Unknown -35.399 177.064
STM05-1 01/03/2005 07:28 Unknown -34.978 177.268
STM05-1 01/03/2005 18:06 Poor -34.799 177.027
STM05-1 01/03/2005 18:47 Poor -34.85 177.059
STM05-2 28/02/2005 12:49 Good -35.928 177.328
STM05-2 28/02/2005 21:23 Poor -35.926 177.314"

# in next line replace textConnection(Lines) with "myfile.dat"
library(sqldf)
DF <- read.table(textConnection(Lines), skip = 1, as.is = TRUE,
                 col.names = c("Id", "Date", "Time", "Quality", "Lat", "Long"))
DFo <- sqldf("select * from DF order by
    substr(Date__1, 7, 4) || substr(Date__1, 4, 2) || substr(Date__1, 1, 2) DESC,
    Quality DESC,
    abs(substr(Time__1, 1, 2) + substr(Time__1, 4, 2)/60 - 12) DESC")
sqldf("select * from DFo group by Date__1")

# Here is a second different way to do it.
# Another way to do it, also using sqldf, is via nested selects like this, using the same DF as above:

sqldf("select * from DF u where
    abs(substr(Time__1, 1, 2) + substr(Time__1, 4, 2)/60 - 12) =
        (select min(abs(substr(Time__1, 1, 2) + substr(Time__1, 4, 2)/60 - 12))
         from DF x
         where Quality = (select min(Quality) from DF y where x.Date__1 = y.Date__1)
           and x.Date__1 = u.Date__1)")

On 8/7/07, Gabor Grothendieck [EMAIL PROTECTED] wrote: Read the data in and transform the Date, Time and Quality columns so that Date and Time are the respective chron classes (see the R News 4/1 Help Desk article) and Quality is a factor such that lower levels are better. Then sort the data frame by Date, Quality and time from noon. That would actually be the default ordering if the shown factor levels are the only ones, since they naturally fall into that order, but we have explicitly ordered them in case your extension to other levels no longer obeys that property. Finally, take the first row from each Date group.

Lines <- "DeployID Date.Time LocationQuality Latitude Longitude
STM05-1 28/02/2005 17:35 Good -35.562 177.158
STM05-1 28/02/2005 19:44 Good -35.487 177.129
STM05-1 28/02/2005 23:01 Unknown -35.399 177.064
STM05-1 01/03/2005 07:28 Unknown -34.978 177.268
STM05-1 01/03/2005 18:06 Poor -34.799 177.027
STM05-1 01/03/2005 18:47 Poor -34.85 177.059
STM05-2 28/02/2005 12:49 Good -35.928 177.328
STM05-2 28/02/2005 21:23 Poor -35.926 177.314"

library(chron)
# in next line replace textConnection(Lines) with "myfile.dat"
DF <- read.table(textConnection(Lines), skip = 1, as.is = TRUE,
                 col.names = c("Id", "Date", "Time", "Quality", "Lat", "Long"))
DF <- transform(DF,
    Date = chron(Date, format = "d/m/y"),
    Time = times(paste(Time, "00", sep = ":")),
    Quality = factor(Quality, levels = c("Good", "Poor", "Unknown")))
o <- order(DF$Date, as.numeric(DF$Quality), abs(DF$Time - times("12:00:00")))
DF <- DF[o, ]
DF[tapply(row.names(DF), DF$Date, head, 1), ]
# The last line above could alternately be written like this:
do.call(rbind, by(DF,
DF$Date, head, 1))

On 8/7/07, Tim Sippel [EMAIL PROTECTED] wrote: Thanks Ross. Yes, upon your remark it seems that maybe I need to select first for the best available location, and then by the time stamp nearest to 12pm. It seems like tapply(), aggregate() or aggregate.ts() might be applicable, but I'm working through trying different functions and have yet to find what works. Suggestions on which function to start with would be appreciated. Tim Sippel (MSc) School of Biological Sciences Auckland University Private Bag 92019 Auckland 1020 New Zealand +64-9-373-7599 ext. 84589 (work) +64-9-373-7668 (Fax) +64-21-593-001 (mobile) _ From: Ross Darnell [mailto:[EMAIL PROTECTED]] Sent: 08 August 2007 12:43 To: Tim Sippel; r-help@stat.math.ethz.ch Subject: RE: [R] Conditional subsetting Assuming *A* animal can only be in one location at any one time, I don't understand how, once you have selected the location nearest 12pm, you can select based on the type of location. Do you mean to select on location before time, i.e. from the best location visited that day which was closest to 12pm? Perhaps for condition 2 for a
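[Editor's note: the "best quality first, then nearest noon" pick that both answers implement can also be sketched in plain base R with split()/lapply(). The data frame below is toy stand-in data with decimal-hour times, not the poster's full deployment file; the Quality factor's level order runs best to worst.]

```r
# Toy data: three fixes over two days.
DF <- data.frame(
  Date    = as.Date(c("2005-02-28", "2005-02-28", "2005-03-01")),
  Time    = c(17 + 35/60, 19 + 44/60, 7 + 28/60),
  Quality = factor(c("Good", "Unknown", "Poor"),
                   levels = c("Good", "Poor", "Unknown")),
  Lat     = c(-35.562, -35.399, -34.978)
)

# Order each day's rows by quality rank, then distance from noon; keep row 1.
pick <- function(d) d[order(as.integer(d$Quality), abs(d$Time - 12)), ][1, , drop = FALSE]
best <- do.call(rbind, lapply(split(DF, DF$Date), pick))
best  # one row per day: the Good fix on 28/02, the lone Poor fix on 01/03
```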