Re: [R] Plotting in x and y Graph Using 1st Quadrant Only
plot(1:5) seems to suffice. More generally, plot(x, y, xlim = c(0, max(x)), ylim = c(0, min(y))) if I understand your request. Michael On Jul 27, 2012, at 5:58 PM, Matilda E. Gogos matildaelizabe...@gmail.com wrote: How can I graph in x and y using the 1st quadrant, using the cartesian plane, in r? I use a mac 10.6.8. Will I be able to print this out? -- Matilda Gogos matildaelizabe...@gmail.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] limit of detection (LOD) by logistic regression
Dear all, I am trying to apply the logistic regression to determine the limit of detection (LOD) of a molecular biology assay, the polymerase chain reaction (PCR). The aim of the procedure is to identify the value (variable dilution) that determine a 95% probability of success, that is positive/total=0.95. The procedure I have implemented seemed to work looking at the figure obtained from the sample set 1; however the figure obtained from the sample set 2 shows that interpolation is not correct. Admittedly, these are not the best sample sets to determine the LOD -there are too many 100% successes - but the regression should still work. Does anybody have some tip to solve this incongruence? Many thanks, Luigi Marongiu, MSc ### SAMPLE SET No. 1 ### define data labs-c(dilution, total, positive) p.1-c(10, 20, 19) p.2-c(100, 20, 20) p.3-c(1000, 20, 20) p.4-c(1, 20, 20) p.5-c(10, 20, 20) p.6-c(100, 20, 20) ### create data frame from matrix my.mat-matrix(c(p.1, p.2, p.3, p.4, p.5, p.6), nrow=6, byrow=TRUE) dimnames(my.mat)-list(c(1:6),labs) my.data-as.data.frame(my.mat) attach(my.data) my.data ### define frequency data Y-cbind(positive, total-positive) ### fit model LOGIT model-glm(Y ~ dilution, family=binomial(link=logit), data=my.data) ### plot data and regression line plot(dilution, positive/total, pch=16, ylim=c(0,1), log = x, xaxt=n, xlim=c(1, 100), main=Positive amplification by PCR, ylab=Fraction amplified, xlab=expression(paste(Standard dilutions (c/,mu,l ### add x axis (William Dunlap, personal communication) xlim - par(usr)[1:2] lab.x - seq(ceiling(xlim[1]), floor(xlim[2])) axis(side=1, at=10^lab.x, lab=as.expression(lapply(lab.x, function(x)bquote(10^.(x) lines(dilution, fitted(model), lty=1) ### set significance level and logit # define significance level-0.95 # set logit value logit.LOD-log(level/(1-level)) ### set function to determine LOD (Vito Ricci, personal communication) LOD-function(mod) as.vector((logit.LOD-coef(mod)[1])/coef(mod)[2]) ### computation of the LOD value x.LOD-LOD(model) ### compute S.E. # set variables co-model$coef model.sum-summary(model) SE.co-model.sum$coef[,2] COV.co-model.sum$cov.scaled[1,2] # compute SE SE.LOD-abs(x.LOD) * sqrt( (SE.co[1]/co[1])^2 + (SE.co[2]/co[2])^2 - 2*(COV.co/(co[1]*co[2])) ) log.SE.LOD-log10(SE.LOD) # define boundaries bottom-x.LOD-(1.96*log.SE.LOD) ceiling-x.LOD+(1.96*log.SE.LOD) ### REPORT # mean value x.LOD # bottom bottom # ceiling ceiling ### plot LOD lines(c(0.1, x.LOD), c(0.95, 0.95), lty=2) lines(c(x.LOD, x.LOD), c(0.95, 0), lty=2) text(x.LOD, 0.02, labels = round(x.LOD, digits = 2), pos=4, offset = 0.3, cex = 0.7) text(1, 0.93, label=95%, pos=4, offset = 0.3, cex = 0.7) lines(c(bottom, ceiling), c(0,0), lty=1, lend=butt, lwd=0.7) points(x.LOD, 0, pch=21, cex=0.8, bg=white) detach(my.data) ### ### ### SAMPLE SET No. 2 ### define data labs-c(dilution, total, positive) p.1-c(10, 10, 4) p.2-c(100, 10, 10) p.3-c(1000, 10, 10) p.4-c(1, 10, 10) p.5-c(10, 10, 10) p.6-c(100, 10, 10) ### create data frame from matrix my.mat-matrix(c(p.1, p.2, p.3, p.4, p.5, p.6), nrow=6, byrow=TRUE) dimnames(my.mat)-list(c(1:6),labs) my.data-as.data.frame(my.mat) attach(my.data) my.data ### define frequency data Y-cbind(positive, total-positive) ### fit model LOGIT model-glm(Y ~ dilution, family=binomial(link=logit), data=my.data) ### plot data and regression line plot(dilution, positive/total, pch=16, ylim=c(0,1), log = x, xaxt=n, xlim=c(1, 100), main=Positive amplification by PCR, ylab=Fraction amplified, xlab=expression(paste(Standard dilutions (c/,mu,l ### add x axis (William Dunlap, personal communication) xlim - par(usr)[1:2] lab.x - seq(ceiling(xlim[1]), floor(xlim[2])) axis(side=1, at=10^lab.x, lab=as.expression(lapply(lab.x, function(x)bquote(10^.(x) lines(dilution, fitted(model), lty=1) ### set significance level and logit # define significance level-0.95 #
Re: [R] Error accessing Vegan package
In relation to where I was trying to access the zip file it was here: http://cran.r-project.org/web/bin/windows/contrib/r-release/vegan_2.0-4.zip I used this method as my work computer does not allow me to directly install packages. Instead I have to install through the local zip file option. I had no trouble with accessing the package via http://probability.ca/cran/bin/windows/contrib/r-release/vegan_2.0-4.zip that Ben Bolker suggested. I was also able to access Vegan via the Install packages in R option, albeit only when at home.. Bob __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Creating sparse matrix of type dgCMatrix directly
I want to create a sparse matrix of type dgCMatrix using the Matrix package (and the matrix must be of this type even if other more compact representations may exist). I do library(Matrix) m1-Matrix(rep(1,4),nrow=2,ncol=2,sparse=T) m1 2 x 2 sparse Matrix of class dsCMatrix [1,] 1 1 [2,] 1 1 To convert m1, I do as(m1, dgCMatrix) 2 x 2 sparse Matrix of class dgCMatrix [1,] 1 1 [2,] 1 1 Is it possible to construct a dgCMatrix directly i.e. without going through the additional as() step above? Best regards Søren __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error accessing Vegan package
On 12-07-28 06:49 AM, Bob Green wrote: In relation to where I was trying to access the zip file it was here: http://cran.r-project.org/web/bin/windows/contrib/r-release/vegan_2.0-4.zip I think you had the wrong URL ... ? I was able to find it at http://cran.r-project.org/bin/windows/contrib/r-release/vegan_2.0-4.zip (/web deleted) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] metafor package, proportions: single groups wrt to a categorical dependent variable
At 01:44 28/07/2012, Dushanthi Pinnaduwage wrote: Dear all, I am using R version 2.15.0 and 'metafor' package version 1.6-0. Can this version of the package handle proportions from a categorical dependent variable for single studies?If so how do I set up my dataframe for the raw data from different studies? Also how do I give inputs, specially xi, mi (or ni) to the function escalc()? Thanks,Dushanthi You do not give us very much to go on here. When you say your outcome is proportions do you mean your outcome has more than two values and you are generating (or have been provided with) several proportions for each study or do you mean it is a binary variable? If the latter do you in fact have the numerators and denominators or just the proportion? It would also help if you showed us the code you have tried and the error message you got (if any) or told us the discrepancy between the output you obtained and that which you expected. [[alternative HTML version deleted]] Michael Dewey i...@aghmed.fsnet.co.uk http://www.aghmed.fsnet.co.uk/home.html __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] A question regarding hypergeometric test
Dear all, There is a simple question regarding gene set enrichment analysis. Say, we have a simple denominator and numerator, therefore hypergeometric test looks like: p=phyper(white-1,total white,total black,drawn). However, there is a question regarding database size. Say, my denominator (total genes on array) is equal to 1. However, database (say GO database) harbor only 8000 from this 1. The question is should I subtract genes from all values in phyper that do not fall into the database? By other words: original function ie: phyper(50,200,9800,500). subtract genes that didn't fall into database for example: phyper(50,180,7700,400). Should I correct my gene lists with database records? Which way is correct? Thank you in advance for the replies. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] metafor package, proportions: single groups wrt to a categorical dependent variable
At 15:47 28/07/2012, Dushanthi Pinnaduwage wrote: Dear Michael, Thanks very much for your quick response. My outcome is not binary. Outcome has more than 2 values (levels) and I have counts for the levels. I have not written a code yet. Well as the name of the key function in metafor (rma.uni) suggests metafor at present does univariate meta-analysis. If you can reduce the outcome to a single measure of effect (perhaps using proportional odds or continuation ratio) then you can enter that with its standard error using the yi and sei parameters to rma.uni. If the nature of your outcome does not allow that then you have a multivariate meta-analysis. In that case you could try mvmeta (available from CRAN) I would like to know if escalc() can handle categorical outcome data (more than 2 proportions). If so, how can I input the counts of each level of the categorical outcome to this function. Hope it is little clear now. Dushanthi You do not give us very much to go on here. When you say your outcome is proportions do you mean your outcome has more than two values and you are generating (or have been provided with) several proportions for each study or do you mean it is a binary variable? If the latter do you in fact have the numerators and denominators or just the proportion? It would also help if you showed us the code you have tried and the error message you got (if any) or told us the discrepancy between the output you obtained and that which you expected. Date: Sat, 28 Jul 2012 14:25:08 +0100 To: dushan...@bell.net; r-help@r-project.org From: i...@aghmed.fsnet.co.uk Subject: Re: [R] metafor package, proportions: single groups wrt to a categorical dependent variable At 01:44 28/07/2012, Dushanthi Pinnaduwage wrote: Dear all, I am using R version 2.15.0 and 'metafor' package version 1.6-0. Can this version of the package handle proportions from a categorical dependent variable for single studies?If so how do I set up my dataframe for the raw data from different studies? Also how do I give inputs, specially xi, mi (or ni) to the function escalc()? Thanks,Dushanthi You do not give us very much to go on here. When you say your outcome is proportions do you mean your outcome has more than two values and you are generating (or have been provided with) several proportions for each study or do you mean it is a binary variable? If the latter do you in fact have the numerators and denominators or just the proportion? It would also help if you showed us the code you have tried and the error message you got (if any) or told us the discrepancy between the output you obtained and that which you expected. [[alternative HTML version deleted]] Michael Dewey i...@aghmed.fsnet.co.uk http://www.aghmed.fsnet.co.uk/home.html Michael Dewey i...@aghmed.fsnet.co.uk http://www.aghmed.fsnet.co.uk/home.html __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] network plot problem
Is it possible to write a numbers of link next to lines of the network? For exmple, I have 3 trials that have studied two drugs in network. So, now it seems all links are of same weight. 2012/7/27 Vlatka Matkovic Puljic v.matkovic.pul...@gmail.com Thank you! It works fine now! :) 2012/7/27 Rui Barradas ruipbarra...@sapo.pt Hello, As for the directed = FALSE, it works with me: g - graph(given, directed = FALSE) As for the label, it's easy. Create an attribute 'name' and then set the label to that attribute. For instance, using the alphabet's letters, first uppercase. V(g)$name - c(LETTERS, letters)[V(g)] # or any other names V(g)$label - V(g)$name Keep 'name' and 'label' separate, like this you may change the label if needed but the name will still be there. Then the plot functions recognise the label attribute. Hope this helps, Rui Barradas Em 27-07-2012 11:37, Vlatka Matkovic Puljic escreveu: Thank you Rui. With matrix works better. I got plot I have expected to have. I want to be undirected. But directed = FALSE or as.undirected(graph) are not working? Where I'd gone wrong? Shouldn't it be possible to add names instead of numbers of nods with vertex.label= ? 2012/7/26 Rui Barradas ruipbarra...@sapo.pt Hello, I don't see the problem. given - scan(text= 15 2 10 4 10 4 10 4 13 4 13 4 15 4 18 4 11 5 2 6 7 6 7 6 7 6 12 6 15 6 15 6 19 6 22 6 24 6 6 7 5 12 5 12 7 12 11 12 13 12 13 12 13 12 13 12 16 12 17 12 23 12 23 12 23 12 23 12 6 13 12 13 6 14 6 15 9 15 12 15 13 15 17 16 16 17 1 18 12 18 23 18 2 19 6 19 24 19 21 22 3 25 5 26 6 27 7 27 15 29 20 30 25 31 28 31 8 32 6 33 14 33 22 34 ) mat - matrix(given, ncol=2, byrow=TRUE) g - graph(given) # Or, not run (note the transpose) #g - graph(t(mat)) # The edges are exactly what is given E(g) V(g)$label - V(g) g$layout - layout.fruchterman.reingold plot(g, edge.arrow.size=0.5, edge.loop.angle=1, edge.curved=FALSE) So my guess is you've messed up the graph creation. Also, don't post datasets like that, use dput(), lik this: dput(given) c(15, 2, 10, 4, 10, 4, 10, 4, 13, 4, 13, 4, 15, 4, 18, 4, 11, 5, 2, 6, 7, 6, 7, 6, 7, 6, 12, 6, 15, 6, 15, 6, 19, 6, 22, 6, 24, 6, 6, 7, 5, 12, 5, 12, 7, 12, 11, 12, 13, 12, 13, 12, 13, 12, 13, 12, 16, 12, 17, 12, 23, 12, 23, 12, 23, 12, 23, 12, 6, 13, 12, 13, 6, 14, 6, 15, 9, 15, 12, 15, 13, 15, 17, 16, 16, 17, 1, 18, 12, 18, 23, 18, 2, 19, 6, 19, 24, 19, 21, 22, 3, 25, 5, 26, 6, 27, 7, 27, 15, 29, 20, 30, 25, 31, 28, 31, 8, 32, 6, 33, 14, 33, 22, 34) Now all anyone has to do is to copy that output and paste it into an R session. (Try it with the matrix to see the result) Hope this helps, Rui Barradas Em 25-07-2012 16:27, Vlatka Matkovic Puljic escreveu: **Hi, I wanted to create a network of drugs that are being studied together. So, I created a file as a graph. But plot of my network is not corect! It looks like to me that something else is lying behind this network links created. Could someone help me with this? Thanx! a-read.graph(file=file.choose(), format=edgelist) plot.igraph(a) File (number present ID of drug): 15 2 10 4 10 4 10 4 13 4 13 4 15 4 18 4 11 5 2 6 7 6 7 6 7 6 12 6 15 6 15 6 19 6 22 6 24 6 6 7 5 12 5 12 7 12 11 12 13 12 13 12 13 12 13 12 16 12 17 12 23 12 23 12 23 12 23 12 6 13 12 13 6 14 6 15 9 15 12 15 13 15 17 16 16 17 1 18 12 18 23 18 2 19 6 19 24 19 21 22 3 25 5 26 6 27 7 27 15 29 20 30 25 31 28 31 8 32 6 33 14 33 22 34 -- ** Vlatka Matkovic Puljic gsm: +32.474.894953 -- ** Vlatka Matkovic Puljic gsm: +32.474.894953 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] selecting a subset of files to be processed
Dear R People: I am using a Linux system in which I have about 3000 files. I would like to randomly select about 45 of those files to be processed in R. Could I make the selection in R or should I do it in Linux, please? This is with R-2.15.1. Thanks, erin -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] network plot problem
Hello, It's more or less the same it was with vertices. If the edge labels represent weights, then create an appropriate attribute and set the labels' values to that attribute's values: E(g)$weight - something # keep the weights in a separate attribute E(g)$label - E(g)$weight # set when needed The plot function now uses these edge labels. Rui Barradas Em 28-07-2012 17:44, Vlatka Matkovic Puljic escreveu: Is it possible to write a numbers of link next to lines of the network? For exmple, I have 3 trials that have studied two drugs in network. So, now it seems all links are of same weight. 2012/7/27 Vlatka Matkovic Puljic v.matkovic.pul...@gmail.com Thank you! It works fine now! :) 2012/7/27 Rui Barradas ruipbarra...@sapo.pt Hello, As for the directed = FALSE, it works with me: g - graph(given, directed = FALSE) As for the label, it's easy. Create an attribute 'name' and then set the label to that attribute. For instance, using the alphabet's letters, first uppercase. V(g)$name - c(LETTERS, letters)[V(g)] # or any other names V(g)$label - V(g)$name Keep 'name' and 'label' separate, like this you may change the label if needed but the name will still be there. Then the plot functions recognise the label attribute. Hope this helps, Rui Barradas Em 27-07-2012 11:37, Vlatka Matkovic Puljic escreveu: Thank you Rui. With matrix works better. I got plot I have expected to have. I want to be undirected. But directed = FALSE or as.undirected(graph) are not working? Where I'd gone wrong? Shouldn't it be possible to add names instead of numbers of nods with vertex.label= ? 2012/7/26 Rui Barradas ruipbarra...@sapo.pt Hello, I don't see the problem. given - scan(text= 15 2 10 4 10 4 10 4 13 4 13 4 15 4 18 4 11 5 2 6 7 6 7 6 7 6 12 6 15 6 15 6 19 6 22 6 24 6 6 7 5 12 5 12 7 12 11 12 13 12 13 12 13 12 13 12 16 12 17 12 23 12 23 12 23 12 23 12 6 13 12 13 6 14 6 15 9 15 12 15 13 15 17 16 16 17 1 18 12 18 23 18 2 19 6 19 24 19 21 22 3 25 5 26 6 27 7 27 15 29 20 30 25 31 28 31 8 32 6 33 14 33 22 34 ) mat - matrix(given, ncol=2, byrow=TRUE) g - graph(given) # Or, not run (note the transpose) #g - graph(t(mat)) # The edges are exactly what is given E(g) V(g)$label - V(g) g$layout - layout.fruchterman.reingold plot(g, edge.arrow.size=0.5, edge.loop.angle=1, edge.curved=FALSE) So my guess is you've messed up the graph creation. Also, don't post datasets like that, use dput(), lik this: dput(given) c(15, 2, 10, 4, 10, 4, 10, 4, 13, 4, 13, 4, 15, 4, 18, 4, 11, 5, 2, 6, 7, 6, 7, 6, 7, 6, 12, 6, 15, 6, 15, 6, 19, 6, 22, 6, 24, 6, 6, 7, 5, 12, 5, 12, 7, 12, 11, 12, 13, 12, 13, 12, 13, 12, 13, 12, 16, 12, 17, 12, 23, 12, 23, 12, 23, 12, 23, 12, 6, 13, 12, 13, 6, 14, 6, 15, 9, 15, 12, 15, 13, 15, 17, 16, 16, 17, 1, 18, 12, 18, 23, 18, 2, 19, 6, 19, 24, 19, 21, 22, 3, 25, 5, 26, 6, 27, 7, 27, 15, 29, 20, 30, 25, 31, 28, 31, 8, 32, 6, 33, 14, 33, 22, 34) Now all anyone has to do is to copy that output and paste it into an R session. (Try it with the matrix to see the result) Hope this helps, Rui Barradas Em 25-07-2012 16:27, Vlatka Matkovic Puljic escreveu: **Hi, I wanted to create a network of drugs that are being studied together. So, I created a file as a graph. But plot of my network is not corect! It looks like to me that something else is lying behind this network links created. Could someone help me with this? Thanx! a-read.graph(file=file.choose(), format=edgelist) plot.igraph(a) File (number present ID of drug): 15 2 10 4 10 4 10 4 13 4 13 4 15 4 18 4 11 5 2 6 7 6 7 6 7 6 12 6 15 6 15 6 19 6 22 6 24 6 6 7 5 12 5 12 7 12 11 12 13 12 13 12 13 12 13 12 16 12 17 12 23 12 23 12 23 12 23 12 6 13 12 13 6 14 6 15 9 15 12 15 13 15 17 16 16 17 1 18 12 18 23 18 2 19 6 19 24 19 21 22 3 25 5 26 6 27 7 27 15 29 20 30 25 31 28 31 8 32 6 33 14 33 22 34 -- ** Vlatka Matkovic Puljic gsm: +32.474.894953 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] selecting a subset of files to be processed
Hi Erin, It is not difficult to imagine doing it either in R or via the shell. If they are all in the same directory, I would tend towards R, just because you can easily set the seed and keep that information so you can reproduce your random selection. If the wd is in the directory with the files, from R just: set.seed(10) toprocess - sample(list.files(), size = 45) Cheers, Josh On Sat, Jul 28, 2012 at 10:49 AM, Erin Hodgess erinm.hodg...@gmail.com wrote: Dear R People: I am using a Linux system in which I have about 3000 files. I would like to randomly select about 45 of those files to be processed in R. Could I make the selection in R or should I do it in Linux, please? This is with R-2.15.1. Thanks, erin -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology Programmer Analyst II, Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] selecting a subset of files to be processed
Hello, If the files are to be processed in R select a random sample in R. Using list.files() you can assign a character vector with the filenames of interest and then sample from that vector. ?list.files filenames - list.files(path, pattern) rand.sampl - sample(filenames, 45) Hope this helps, Rui Barradas Em 28-07-2012 18:49, Erin Hodgess escreveu: Dear R People: I am using a Linux system in which I have about 3000 files. I would like to randomly select about 45 of those files to be processed in R. Could I make the selection in R or should I do it in Linux, please? This is with R-2.15.1. Thanks, erin __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] selecting a subset of files to be processed
And, in addition to the tip from Rui (and similar from Joshua) below, I would advise that there is one good reason not to try doing it in pure Linux. The only source (that I know of) in Linux itself for random numbers can be tapped by something like cat /dev/random filename /dev/random stores noise generated by the timings of system events (keyboard presses, mouse-clicks, disk accesses, interrupts, etc.) after subjecting them to a high-entropy stirring process. See: man random It yields them in the form of random bytes (each of 8 random 0/1 bits) and you would have to devise some means of coverting those onto a form suitable for accessing a directory listing at random. Not a pretty task! There is also the command 'rand' available in the openSSL toolkit, but that still outputs the results in the same format as /dev/random. If you really want to do this outside R, the I would suggest writing a little C program (to be run from the Linux command line). C can do its own random number generation, with results returned as real (double), and then apply these to select at random from the contents of a file generated by something like ls filesdir filelist.txt and output the random selection. Ted. On 28-Jul-2012 18:00:38 Rui Barradas wrote: Hello, If the files are to be processed in R select a random sample in R. Using list.files() you can assign a character vector with the filenames of interest and then sample from that vector. ?list.files filenames - list.files(path, pattern) rand.sampl - sample(filenames, 45) Hope this helps, Rui Barradas Em 28-07-2012 18:49, Erin Hodgess escreveu: Dear R People: I am using a Linux system in which I have about 3000 files. I would like to randomly select about 45 of those files to be processed in R. Could I make the selection in R or should I do it in Linux, please? This is with R-2.15.1. Thanks, erin __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. - E-Mail: (Ted Harding) ted.hard...@wlandres.net Date: 28-Jul-2012 Time: 19:32:26 This message was sent by XFMail __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Creating sparse matrix of type dgCMatrix directly
On Sat, Jul 28, 2012 at 7:26 AM, Søren Højsgaard sor...@math.aau.dk wrote: I want to create a sparse matrix of type dgCMatrix using the Matrix package (and the matrix must be of this type even if other more compact representations may exist). I do library(Matrix) m1-Matrix(rep(1,4),nrow=2,ncol=2,sparse=T) m1 2 x 2 sparse Matrix of class dsCMatrix [1,] 1 1 [2,] 1 1 To convert m1, I do as(m1, dgCMatrix) 2 x 2 sparse Matrix of class dgCMatrix [1,] 1 1 [2,] 1 1 Is it possible to construct a dgCMatrix directly i.e. without going through the additional as() step above? The Matrix function is a high-level function that is designed to produce a compact and informative representation of its arguments. That's why it checks for symmetry, triangularity, etc. To decide how to bypass these checks it would be useful to know what you are starting with. Will it be a dense matrix or a triplet representation or ...? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to extract unique indices for time series Data?
Dear all, I am into following coding: library(quantmod) Data1 - get(getSymbols(TCS.NS, from = as.Date(2008-01-01), return.class = zoo))[,4] Warning messages: 1: In zoo(cd, order.by = index(x), ...) : some methods for “zoo” objects do not work if the index entries in ‘order.by’ are not unique 2: In zoo(rval, index(x)[i]) : some methods for “zoo” objects do not work if the index entries in ‘order.by’ are not unique Here, I guess there are some duplicated dates-index. Is there any function available to automatically extract unique indices ??? Thanks and regards, __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to extract unique indices for time series Data?
?unique ?duplicated -- Bert On Sat, Jul 28, 2012 at 12:57 PM, Christofer Bogaso bogaso.christo...@gmail.com wrote: Dear all, I am into following coding: library(quantmod) Data1 - get(getSymbols(TCS.NS, from = as.Date(2008-01-01), return.class = zoo))[,4] Warning messages: 1: In zoo(cd, order.by = index(x), ...) : some methods for “zoo” objects do not work if the index entries in ‘order.by’ are not unique 2: In zoo(rval, index(x)[i]) : some methods for “zoo” objects do not work if the index entries in ‘order.by’ are not unique Here, I guess there are some duplicated dates-index. Is there any function available to automatically extract unique indices ??? Thanks and regards, __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to extract unique indices for time series Data?
On Sat, Jul 28, 2012 at 3:57 PM, Christofer Bogaso bogaso.christo...@gmail.com wrote: Dear all, I am into following coding: library(quantmod) Data1 - get(getSymbols(TCS.NS, from = as.Date(2008-01-01), return.class = zoo))[,4] Warning messages: 1: In zoo(cd, order.by = index(x), ...) : some methods for “zoo” objects do not work if the index entries in ‘order.by’ are not unique 2: In zoo(rval, index(x)[i]) : some methods for “zoo” objects do not work if the index entries in ‘order.by’ are not unique Here, I guess there are some duplicated dates-index. Is there any function available to automatically extract unique indices ??? Try this: library(quantmod) Data1 - Cl(getSymbols(TCS.NS, from = as.Date(2008-01-01), auto.assign = FALSE)) dup - duplicated(time(Data1), fromLast = TRUE) z - as.zoo(Data1[!dup]) -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] minor axis ticks in trellis graphics?
Dear Martin, Mea culpa! I screwed up. I was answering (not well, at that) a different question and somehow managed to make it a response to your request. Here's what may be a solution to your problem. (see also this post by ilai: https://stat.ethz.ch/pipermail/r-help/attachments/20120718/7133b673/attachment.pl) We define appropriate locations for the x-ticks to go with suitably defined labels and corresponding tick lengths. ## data: d - data.frame(x = 1:11, y = rnorm(11)) ## Where to put ticks and labels myat - 1:11 mylab - head(c(rbind(LETTERS[1:6], )), -1) # putting letters at every second tick ## lengths of ticks mytck - head(c(rbind(rep(1,6), .5)), -1) ## myat, mylab, mytck should all be same length ## plot xyplot(y ~ x, data = d, xlab=, scales = list(x = list( at = myat, labels = mylab)), par.settings = list( axis.components = list( bottom = list(tck = mytck ## Try it with different myat etc to see how it works myat - seq(1, 11, by = 2/3) mylab - head(c(rbind(LETTERS[1:6], , )), -2) mytck - head(c(rbind(rep(2,6), .5, .5)), -2) myat - seq(1, 11, by = 1/2) mylab - head(c(rbind(LETTERS[1:6], , , )), -3) mytck - head(c(rbind(rep(2,6), .5, 1, .5)), -3) The trick is in constructing suitable vectors myat, mylab, and mytck which you can do any way you like but I would check that the lengths are equal (recycling works fine for simple cases but always seems to trip me up in more complicated cases). Peter Ehlers On 2012-07-26 07:02, Martin Ivanov wrote: Dear Peter, Thank You very much for your suggestion. However, it seems to me to make a completely new notation to the x axis. While I just need to add minor ticks to the axis, that is smaller ticks inbetween the basic ticks. The other option would be to say which ticks to be small and which big, but I have no idea how. Any suggestions will be appreciated. Best regards, Martin embarassing junk from P. Ehlers cut __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] puzzling classical Mahalanobis distances from covMcd() {robustbase}
The values should probably be labeled initial instead of raw which is how they are labeled in the source. The Details section of manual indicates that the first step is to identify a subset of the original data between .5 and 1 whose covariance matrix has the lowest possible determinant. The next paragraph: The raw MCD estimate of location is then the average of these h points, whereas the raw MCD estimate of scatter is their covariance matrix, multiplied by a consistency factor and a finite sample correction factor (to make it consistent at the normal model and unbiased at small samples). Following your example: set.seed(42) x - matrix(rnorm(10*3), ncol = 3) xmeans - colMeans(x) Sx - cov(x) D2rb - covMcd(x) D2rb$raw.weights [1] 0 1 1 1 1 1 0 1 0 1 == Note that the raw weights eliminate obs 1, 7, and 9 xmeans; D2rb$raw.center [1] 0.5472968 -0.1634567 -0.1780795== Compare original means [1] 0.08172336 -0.03067387 -0.23956925 and raw means colMeans(x[as.logical(D2rb$raw.weights),]) == means with 1, 7, and 9 eliminated [1] 0.08172336 -0.03067387 -0.23956925 == This matches D2rb$raw.center So the raw values are taken for a subset, h, which includes observations 2, 3, 4, 5, 6, 8, and 10. Given that the raw.center and raw.cov are based on a subset of the original data, the mahalanobis distances will not be the same either. -- David L Carlson Associate Professor of Anthropology Texas AM University College Station, TX 77843-4352 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Fraser D. Neiman Sent: Friday, July 27, 2012 7:16 AM To: r-help@r-project.org Subject: [R] puzzling classical Mahalanobis distances from covMcd() {robustbase} Greetings, I am puzzled about why the _classical_ Mahalanobis distances that I get using the {stats} mahalanobis() function do not match the distances I get from the {robustbase} covMcd() function. Here is an example: x - matrix(rnorm(10*3), ncol = 3) #here is the {stats} result: Sx - cov(x) D2 - mahalanobis(x, colMeans(x), Sx) D2 [1] 1.5135795 1.3761046 1.0367444 1.8111585 4.3038621 5.3195918 3.2798665 5.7559301 [9] 2.2172150 0.3859475 #here is the {robustbase} result Library(robustbase) D2rb- covMcd(x) D2rb$raw.mah [1] 0.7737193 1.1177445 0.7290794 0.6275703 3.5517622 6.0334350 1.0582663 5.7169250 [9] 0.9420184 0.4210470 According to the help file for covMcd{robustbase} raw.mah mahalanobis distances of the observations based on the raw estimate of the location and scatter. So I think the second set of numbers should match the first. But they do not. What am I missing here? Thanks, Fraser __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Creating sparse matrix of type dgCMatrix directly
-Original Message- From: dmba...@gmail.com [mailto:dmba...@gmail.com] On Behalf Of Douglas Bates Sent: 28. juli 2012 20:36 To: Søren Højsgaard Cc: r-help@r-project.org Subject: Re: [R] Creating sparse matrix of type dgCMatrix directly On Sat, Jul 28, 2012 at 7:26 AM, Søren Højsgaard sor...@math.aau.dk wrote: I want to create a sparse matrix of type dgCMatrix using the Matrix package (and the matrix must be of this type even if other more compact representations may exist). I do library(Matrix) m1-Matrix(rep(1,4),nrow=2,ncol=2,sparse=T) m1 2 x 2 sparse Matrix of class dsCMatrix [1,] 1 1 [2,] 1 1 To convert m1, I do as(m1, dgCMatrix) 2 x 2 sparse Matrix of class dgCMatrix [1,] 1 1 [2,] 1 1 Is it possible to construct a dgCMatrix directly i.e. without going through the additional as() step above? The Matrix function is a high-level function that is designed to produce a compact and informative representation of its arguments. That's why it checks for symmetry, triangularity, etc. To decide how to bypass these checks it would be useful to know what you are starting with. Will it be a dense matrix or a triplet representation or ...? [] Dear Doug, [] I use sparse matrices to represent graphs so the matrices contain zeros and non-zeros (in most cases ones). I need the matrices to be of the dgCMatrix type because these matrices can be used as sparse matrices in the RcppEigen package. For now I have created my own dgCMatrix() function doing what I've sketched above, but there is - as you also indicate - an overhead in the checks in the Matrix() function, and also an overhead in using as(). Regards Søren __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to extract unique indices for time series Data?
Here, I guess there are some duplicated dates-index. Is there any function available to automatically extract unique indices ??? ?zooreg should do what you want, if I'm understanding the question properly. -- Sent from my mobile device Envoyait de mon portable [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] use R remotely (ssh / shell / emacs)
Hi, I have a laptop and a desktop PC. Now I was wondering, if it is possible to create a setup in which you can use the laptop (where ever you are) to remotely access the PC, open an R-instance and let the PC do the heavy computation. I have no idea about remote control, wake-on-lan, ssh and all these things. I just wanted to know if someone here on the list is actually working with such a setup and if, once all is configured, it works reliable. how much work it is to set it up with an editor like emacs? And for which buzzwords should I search the internet? sorry if that is the wrong mailing-list for this kind of questions, I just hope to get some input on the topic. thanks! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] readRDS, In as.double.xts(fishReport$count) : NAs introduced by coercion
Hello, I looked in the R-help but could not find an archive addressing the following. I would like to convert a character to numeric after reading a file with RDS extension. After using as.numeric, I checked if it is numeric. It was not converted. Please help. Here is my code Report - readRDS(file=RDS/Report.RDS) Report[1:2,] dive_id date time species count size 2008-08-06 08:49:00 108/06/2008 8:49:00 S. OYT15 6 2008-08-06 08:49:00 108/06/2008 8:49:00 S. atrovirens 1 23 site depth level TRANSECT VIS_M TEMP_C swell_URSKI 2008-08-06 08:49:00 Hopkins 15 B 1 3.5 13.9 1.0686708 2008-08-06 08:49:00 Hopkins 15 B 1 3.5 13.9 1.0686708 Report$count-as.numeric(Report$count) Warning message: In as.double.xts(fishReport$count) : NAs introduced by coercion is.numeric(Report$count) [1] FALSE Thank you, Y [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] use R remotely (ssh / shell / emacs)
Unix-ish operating systems often have X-Windows installed for GUI programs. In this environment, it is very common to log in remotely and run GUI programs on the remote computer that open windows on your local screen. It is also possible to run a vnc or Remote Terminal service/daemon on the remote computer so that you can see a copy of the GUI screen of the remote computer on your local computer. (RT is much more common on Microsoft Windows remote computers, vnc is more operating-system agnostic.) R runs fine either way, but each approach has its own quirks and advantages. Setting such software up is definitely off topic here. --- Jeff NewmillerThe . . Go Live... DCN:jdnew...@dcn.davis.ca.usBasics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/BatteriesO.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --- Sent from my phone. Please excuse my brevity. Martin Batholdy batho...@googlemail.com wrote: Hi, I have a laptop and a desktop PC. Now I was wondering, if it is possible to create a setup in which you can use the laptop (where ever you are) to remotely access the PC, open an R-instance and let the PC do the heavy computation. I have no idea about remote control, wake-on-lan, ssh and all these things. I just wanted to know if someone here on the list is actually working with such a setup and if, once all is configured, it works reliable. how much work it is to set it up with an editor like emacs? And for which buzzwords should I search the internet? sorry if that is the wrong mailing-list for this kind of questions, I just hope to get some input on the topic. thanks! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] selecting a subset of files to be processed
Thanks so much! On Sat, Jul 28, 2012 at 1:32 PM, Ted Harding ted.hard...@wlandres.net wrote: And, in addition to the tip from Rui (and similar from Joshua) below, I would advise that there is one good reason not to try doing it in pure Linux. The only source (that I know of) in Linux itself for random numbers can be tapped by something like cat /dev/random filename /dev/random stores noise generated by the timings of system events (keyboard presses, mouse-clicks, disk accesses, interrupts, etc.) after subjecting them to a high-entropy stirring process. See: man random It yields them in the form of random bytes (each of 8 random 0/1 bits) and you would have to devise some means of coverting those onto a form suitable for accessing a directory listing at random. Not a pretty task! There is also the command 'rand' available in the openSSL toolkit, but that still outputs the results in the same format as /dev/random. If you really want to do this outside R, the I would suggest writing a little C program (to be run from the Linux command line). C can do its own random number generation, with results returned as real (double), and then apply these to select at random from the contents of a file generated by something like ls filesdir filelist.txt and output the random selection. Ted. On 28-Jul-2012 18:00:38 Rui Barradas wrote: Hello, If the files are to be processed in R select a random sample in R. Using list.files() you can assign a character vector with the filenames of interest and then sample from that vector. ?list.files filenames - list.files(path, pattern) rand.sampl - sample(filenames, 45) Hope this helps, Rui Barradas Em 28-07-2012 18:49, Erin Hodgess escreveu: Dear R People: I am using a Linux system in which I have about 3000 files. I would like to randomly select about 45 of those files to be processed in R. Could I make the selection in R or should I do it in Linux, please? This is with R-2.15.1. Thanks, erin __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. - E-Mail: (Ted Harding) ted.hard...@wlandres.net Date: 28-Jul-2012 Time: 19:32:26 This message was sent by XFMail - -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] use R remotely (ssh / shell / emacs)
I don't know if using R over ssh/remotely is off-topic, but the user needs to provide more information. I am not knowledgeable about wakeup-on-lan (isn't this set in the BIOS?) but I think it would be helpful to know OS's, etc. How much more powerful is the desktop than the laptop? If not, does it make sense to clutter b/w? Also, what does the user envisage doing in the R simulations? I am assuming that this will be run in the batch mode otherwise there does not seem to be much point. Ranjan On Sat, 28 Jul 2012 19:46:57 -0700 Jeff Newmiller jdnew...@dcn.davis.ca.us wrote: Unix-ish operating systems often have X-Windows installed for GUI programs. In this environment, it is very common to log in remotely and run GUI programs on the remote computer that open windows on your local screen. It is also possible to run a vnc or Remote Terminal service/daemon on the remote computer so that you can see a copy of the GUI screen of the remote computer on your local computer. (RT is much more common on Microsoft Windows remote computers, vnc is more operating-system agnostic.) R runs fine either way, but each approach has its own quirks and advantages. Setting such software up is definitely off topic here. --- Jeff NewmillerThe . . Go Live... DCN:jdnew...@dcn.davis.ca.usBasics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/BatteriesO.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --- Sent from my phone. Please excuse my brevity. Martin Batholdy batho...@googlemail.com wrote: Hi, I have a laptop and a desktop PC. Now I was wondering, if it is possible to create a setup in which you can use the laptop (where ever you are) to remotely access the PC, open an R-instance and let the PC do the heavy computation. I have no idea about remote control, wake-on-lan, ssh and all these things. I just wanted to know if someone here on the list is actually working with such a setup and if, once all is configured, it works reliable. how much work it is to set it up with an editor like emacs? And for which buzzwords should I search the internet? sorry if that is the wrong mailing-list for this kind of questions, I just hope to get some input on the topic. thanks! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Important Notice: This mailbox is ignored: e-mails are set to be deleted on receipt. For those needing to send personal or professional e-mail, please use appropriate addresses. FREE 3D EARTH SCREENSAVER - Watch the Earth right on your desktop! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] quantreg Wald-Test
Dear all, I know that my question is somewhat special but I tried several times to solve the problems on my own but I am unfortunately not able to compute the following test statistic using the quantreg package. Well, here we go, I appreciate every little comment or help as I really do not know how to tell R what I want it to do^^ My situation is as follows: I have a data set containing a (dependent) vector Y and the regressor X. My aim is to check whether the two variables do not granger-cause each other in quantiles. I started to compute via quantreg for a single tau:= q: rq(Y_t~Y_(t-1)+Y_(t-2)+...+X_(t-1)+X_(t-2)+...,tau=q) This gives me the quantile regression coefficients. Now I want to check whether all the coefficients of X are equal to zero (for this specific tau). Can I do this by applying rq.anova ? I have already asked a similiar question but I am not sure if anova is really calculating this for me.. Currently I am calculating fitunrestricted=rq(Y_t~Y_(t-1)+Y_(t-2)+...+X_(t-1)+X_(t-2)+...,tau=q) fitrestrited=rq(Y_t~Y_(t-1)+Y_(t-2)+...,tau=q) anova(fitrestricted,fitunrestricted) If this is correct can you tell me how the test value is calculated in this case, or in other words: My next step is going to replicate this procedure for a whole range of quantiles (say for tau in [a,b]). To apply a sup-Wald-test I am wondering if it is correct to choose the maximum of the different test values and to simulate the critical values by using the data tabulated in Andrees(1993) (or simulate the vectors of independent Brownian Motions)...please feel free to comment, I am really looking forward to your help! Thank you very much Cheers Stefan -- View this message in context: http://r.789695.n4.nabble.com/quantreg-Wald-Test-tp4638198.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problem with a regression - Dataset Workinghours
I'm a student. I'm working on a research using the statistical program R 2.15.1. Here's my problem: how i can do a regression considering only values over a certain limit? For example, considering the dataset Workinghour of the Ecdat package, is possible to build a predictive model that express the probability that a wife works more than 8 hours per day? The dataset includes 3382 observation on the number of hours spent working by wifes per year in USA. hoursday=hours/240 index-which(hoursday=8) hoursday[index] As you see, I'm able to extract the values that in 'hoursday' (which is hours/240 working days in one year) are 8,0 but obviously i can't do a regression cause the extracted data are a subset of the entire dataset (955 observations), while the other variables, like age, occupation, income, etc. are still complete(3382). So i can't do: lm = lm(hoursday[index] ~ income+age+education+unemp+child5+child13+child17+nonwhite+owned+mortgage+occupation) In fact R gives me: Error in model.frame.default(formula = hoursday[index] ~ income, drop.unused.levels = TRUE) : variable lengths differ (found for 'income'). Can you help me? Thank you. Giorgio [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] producing a graph with glm poisson distributed respons count data and categorical independant variables
Hi, I do have replicates, but not many. The offsets are my the number of replicates actually, i misled myself by thinking i could add the counts of the replicates all up and then go further with the offset function. I decided to split up my experiments in order to interpret them, they were actually from the beginning split up experiments. Because i thought i could see if the bees reacted in a same manner to the different experiments i thought i could analyse it with an interaction factor to reveal whether this was true. But i want now want to analyse them apart so i can draw conclusions from the both experiments apart. I did the following with the counts from p1, including the replicates so i have only the type of field-margin as a variable, as i was only interested in this from the beginning: mengsel count C 39 C 38 A 79 A 96 A 278 D 15 D 15 B 322 B 449 B 262 a.data.p1-read.table(a.data.p1.txt,header=TRUE,sep=) a.data.p1 fit.sat.a.p1-glm(count~mengsel,data=a.data.p1,family=poisson) anova(fit.sat.a.p1,test=Chisq) fit.main.a.p1-glm(count~1,data=a.data.p1,family=poisson) anova(fit.sat.a.p1,fit.main.a.p1,test=Chisq) extractAIC(fit.sat.a.p1) extractAIC(fit.main.a.p1) summary(fit.sat.a.p1) This tells me that the saturated model is better explaning then the other so: Call: glm(formula = count ~ mengsel, family = poisson, data = a.data.p1) Deviance Residuals: Min 1Q Median 3Q Max -6.4531 -3.7799 -0.0404 0.0603 9.2385 Coefficients: Estimate Std. Error z value Pr(|z|) (Intercept) 5.017280.04698 106.79 2e-16 *** mengselB 0.824330.05635 14.63 2e-16 *** mengselC-1.366620.12327 -11.09 2e-16 *** mengselD-2.309230.18852 -12.25 2e-16 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 (Dispersion parameter for poisson family taken to be 1) Null deviance: 1385.58 on 9 degrees of freedom Residual deviance: 202.01 on 6 degrees of freedom AIC: 273.15 Number of Fisher Scoring iterations: 5 How can i produce a graph for this? I am worried that i still do not have enough replicates to actually draw descent conclusions or conclusions at all...in my second experiment i do not have replicates for 2 out of four types of field margins, for the other two i only have 2 replicates. this being the counts from the second experiment: C 3 p2 C 1 p2 A 90 p2 A 29 p2 D 0 p2 B 157 p2 thanks, babs -- View this message in context: http://r.789695.n4.nabble.com/producing-a-graph-with-glm-poisson-distributed-respons-count-data-and-categorical-independant-variabs-tp4638110p4638192.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] using save() to work with objects that exceed memory capacity
Context: I'm relatively new to R and am working with very large datasets. General problem: If working on a dataset requires that I produce more than two objects of roughly the size of the dataset, R quickly uses up its available memory and slows to a virtual halt. My tentative solution: To save and remove objects as they're created, and load them when I need them. To do this I'm trying to automatically generate file names derived from these objects, and use these in save(). My specific question to the list: How do I capture the string that names an object I want to save, in such a way that I can use it in a function that calls save()? For example, suppose I create a matrix and then save it follows: mat-matrix(1:9,3,3) save(mat, file=matfile) Then I get a file of the kind I'd like: the command 'load(matfile)' retrieves the correct matrix, with the original name 'mat'. Further, if I instead save it this way: objectname-mat save(list=ls(pattern=objectname), file=matfile) then I get the same positive result. But now suppose I create a function saveobj - function(objectname,objectfile) + { + save(list=ls(pattern=objectname),file=objectfile); + return()}; Then if I now try to save 'mat' by matname-mat saveobj(matname,matfile) I do not get the same result; namely, the command 'load(mat)' retrieves no objects. Why is this? I'd be grateful for any help on either my specific questions, or suggestions of a better ways to address the issue of limited memory. Thanks, David Romano [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] garch external regressors
Hi everyone, i have a terribly stupid question, but i hope someone could reply me! when i'm fitting a garch model and i want to insert in the variance model some external regressors as temporal dummy or lags of other variables and so on, which timeline should follow the time series that i select? in other words, if the dependent variable is at time t, the external regressors in the ugarchspec formula should be at time t or t-1? thanks a lot sara -- View this message in context: http://r.789695.n4.nabble.com/garch-external-regressors-tp4638190.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] metafor package, proportions: single groups wrt to a categorical dependent variable
Dear Michael, Thanks very much for your quick response. My outcome is not binary. Outcome has more than 2 values (levels) and I have counts for the levels. I have not written a code yet. I would like to know if escalc() can handle categorical outcome data (more than 2 proportions). If so, how can I input the counts of each level of the categorical outcome to this function. Hope it is little clear now. Dushanthi You do not give us very much to go on here. When you say your outcome is proportions do you mean your outcome has more than two values and you are generating (or have been provided with) several proportions for each study or do you mean it is a binary variable? If the latter do you in fact have the numerators and denominators or just the proportion? It would also help if you showed us the code you have tried and the error message you got (if any) or told us the discrepancy between the output you obtained and that which you expected. Date: Sat, 28 Jul 2012 14:25:08 +0100 To: dushan...@bell.net; r-help@r-project.org From: i...@aghmed.fsnet.co.uk Subject: Re: [R] metafor package, proportions: single groups wrt to a categorical dependent variable At 01:44 28/07/2012, Dushanthi Pinnaduwage wrote: Dear all, I am using R version 2.15.0 and 'metafor' package version 1.6-0. Can this version of the package handle proportions from a categorical dependent variable for single studies?If so how do I set up my dataframe for the raw data from different studies? Also how do I give inputs, specially xi, mi (or ni) to the function escalc()? Thanks,Dushanthi You do not give us very much to go on here. When you say your outcome is proportions do you mean your outcome has more than two values and you are generating (or have been provided with) several proportions for each study or do you mean it is a binary variable? If the latter do you in fact have the numerators and denominators or just the proportion? It would also help if you showed us the code you have tried and the error message you got (if any) or told us the discrepancy between the output you obtained and that which you expected. [[alternative HTML version deleted]] Michael Dewey i...@aghmed.fsnet.co.uk http://www.aghmed.fsnet.co.uk/home.html [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] low resolution word map
Hi, I'd like to have a low resolution word map in R. The maps package has this option, but if I use the argument, the map looses sense: Russia and Australia get empty etc library(maps) m=map(col=skyblue,fill=TRUE,plot=TRUE,resolution=10) length(m$x) If I drop the fill=TRUE, the effect of resolutaion=1 is lost, ie no change. Is there any other package or could I use the resolution argument differently? Thanks, Thomas __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lm without intercept
I've just picked up R (been using Matlab, Eviews etc) and I'm having the same issue. Running reg=lm(ticker1~ticker2) gives R^2=50% while running reg=lm(ticker1~0+ticker2) gives R^2=99%!! The charts suggest the fit is worse not better and indeed Eviews/Excel/Matlab all say R^2=15% with intercept=0. How come R calculates a totally different value?! Call: lm(formula = ticker1 ~ ticker2) Residuals: Min 1Q Median 3Q Max -0.22441 -0.03380 0.01099 0.04891 0.16688 Coefficients: Estimate Std. Error t value Pr(|t|) (Intercept) 1.570620.08187 19.18 2e-16 *** ticker2 0.617220.02699 22.87 2e-16 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 0.07754 on 530 degrees of freedom Multiple R-squared: 0.4967, Adjusted R-squared: 0.4958 F-statistic: 523.1 on 1 and 530 DF, p-value: 2.2e-16 Call: lm(formula = ticker1 ~ 0 + ticker2) Residuals: Min1QMedian3Q Max -0.270785 -0.069280 -0.007945 0.087340 0.268786 Coefficients: Estimate Std. Error t value Pr(|t|) ticker2 1.134508 0.001441 787.2 2e-16 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 0.1008 on 531 degrees of freedom Multiple R-squared: 0.9991, Adjusted R-squared: 0.9991 F-statistic: 6.197e+05 on 1 and 531 DF, p-value: 2.2e-16 Jan private wrote Hi, thanks for your help. I'm beginning to understand things better. If you plotted your data, you would realize that whether you fit the 'best' least squares model or one with a zero intercept, the fit is not going to be very good Do the data cluster tightly around the dashed line? No, and that is why I asked the question. The plotted fit doesn't look any better with or without intercept, so I was surprised that the R-value etc. indicated an excellent regression (which I now understood is the wrong interpretation). One of the references you googled suggests that intercepts should never be omitted. Is this true even if I know that the physical reality behind the numbers suggests an intercept of zero? Thanks, Jan __ R-help@ mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://r.789695.n4.nabble.com/lm-without-intercept-tp3312429p4638204.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Mixed-model with paired design covariates
Dear all, I make habitat suitability models for animal species. The purpose of my research is to investigate the accuracy of different models. I clearly have a nested design: - accuracy_measure - response variable - 2 model types (model_type) - fixed effect - 230 species (species) - random effect - 10 replicates/species (replicate) - random effect - 10 subreplicate/replicate - observation So I have: 10*10*230 observations/model, identified as speciesID_replicate_subreplicate (species_ID ranging 1:230, replicate 1:10 and subreplicate 1:10) One could think about such mixed-effect model: my.model-lme(fixed = accuracy_measure~model_type, random = ~1|species/replicate) I do not expand here into model simplification nor if it is best to use lme or lmer, YET... but here are more conceptual questions 1) my replicates subreplicates are paired in the sense that they come from the same split of the data. As an example for species X, the 20 observations of replicate1 of model A and B (10 for A and 10 for B) are linked by a same data split which is likely to influence my accuracy measure. In the same way the 2 observations replicate1_subreplicate2 of model A and B (1 for A and 1 for B) are also linked. Is there a way to introduce such pairing in a mixed effect model? 2) several continuous covariates, attributes of species (number of points for modelization, size in km2 of the range) may influence the measures of accuracy and I may be interested in investigating those effects. How could I include a covariate in such model? How should I strucutre it given that there are species covariates (high order of nesting)? I hope that my questions are relevant! Thank you very much in advance for your help! Nathan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Appending the Column names
Hi Freinds, I have two data frames X,Y. I want to append both the data frames into one, along with the columns names from both the data frames (it should look like Z). X: SummaryG Y R Acc 1212 13 Bcc 1114 15 Ccc 1315 16 Y: SummaryG Y R Acc 1011 12 Bcc 1312 11 Ccc 1116 20 Result -- Z: Summary G Y R Acc 1212 13 Bcc 1114 15 Ccc 1315 16 Summary G Y R Acc 1011 12 Bcc 1312 11 Ccc 1116 20 Can anyone help me on this. Thanks in Advance. Thanks, Namit. -- View this message in context: http://r.789695.n4.nabble.com/Appending-the-Column-names-tp4638207.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] [rjson] toJSON serializes double as int
I have this object: obj - list(test=1.0) obj $test [1] 1 toJSON(obj) [1] {\test\:1} How do force the serialization of obj$test to be 1.0 instead of 1? I.e. the output should be: toJSON(obj) [1] {\test\:1.0} [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] low resolution word map
Hi, I'd like to have a low resolution word map in R. The maps package has this option, but if I use the argument, the map looses sense: Russia and Australia get empty etc library(maps) m=map(col=skyblue,fill=TRUE,plot=TRUE,resolution=10) length(m$x) If I drop the fill=TRUE, the effect of resolutaion=1 is lost, ie no change. Is there any other package or could I use the resolution argument differently? Thanks, Thomas __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Beta-Binomial Regression in R
Hi All: I am trying to generate Beta-Binomial data with regressors using R. I have used the following code to generate Beta-Binomial data. Now I want to add a covariate to the equation. I would then like to use the simulated data to run the Beta-Binomial model with covariates on it. Appreciate any help. set.seed(111) k-20 n-60 x-NULL p-rbeta(k,3,3)# so that the mean nausea rate is alpha/(alpha+beta) for(i in 1:k) x-cbind(x,rbinom(300,n,p[i])) Thanks Anamika [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Coloring Counties in a State Map
I am mapping U.S. state counties in individual states maps, e.g. A map of Montana, with county lines. How do I fill specific counties with a different color? -- View this message in context: http://r.789695.n4.nabble.com/Coloring-Counties-in-a-State-Map-tp4638218.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lm without intercept
Hi, R actually uses a different formula for calculating the R square depending on whether the intercept is in the model or not. You may also find this discussion helpful: http://stats.stackexchange.com/questions/7948/when-is-it-ok-to-remove-the-intercept-in-lm/ If you conceptualize R^2 as the squared correlation between the oberserved and fitted values, it is easy to get: summary(m0 - lm(mpg ~ 0 + disp, data = mtcars)) summary(m1 - lm(mpg ~ disp, data = mtcars)) cor(mtcars$mpg, fitted(m0))^2 cor(mtcars$mpg, fitted(m1))^2 but that is not how R calculates R^2. Cheers, Josh On Sat, Jul 28, 2012 at 10:40 AM, citynorman citynor...@hotmail.com wrote: I've just picked up R (been using Matlab, Eviews etc) and I'm having the same issue. Running reg=lm(ticker1~ticker2) gives R^2=50% while running reg=lm(ticker1~0+ticker2) gives R^2=99%!! The charts suggest the fit is worse not better and indeed Eviews/Excel/Matlab all say R^2=15% with intercept=0. How come R calculates a totally different value?! Call: lm(formula = ticker1 ~ ticker2) Residuals: Min 1Q Median 3Q Max -0.22441 -0.03380 0.01099 0.04891 0.16688 Coefficients: Estimate Std. Error t value Pr(|t|) (Intercept) 1.570620.08187 19.18 2e-16 *** ticker2 0.617220.02699 22.87 2e-16 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 0.07754 on 530 degrees of freedom Multiple R-squared: 0.4967, Adjusted R-squared: 0.4958 F-statistic: 523.1 on 1 and 530 DF, p-value: 2.2e-16 Call: lm(formula = ticker1 ~ 0 + ticker2) Residuals: Min1QMedian3Q Max -0.270785 -0.069280 -0.007945 0.087340 0.268786 Coefficients: Estimate Std. Error t value Pr(|t|) ticker2 1.134508 0.001441 787.2 2e-16 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 0.1008 on 531 degrees of freedom Multiple R-squared: 0.9991, Adjusted R-squared: 0.9991 F-statistic: 6.197e+05 on 1 and 531 DF, p-value: 2.2e-16 Jan private wrote Hi, thanks for your help. I'm beginning to understand things better. If you plotted your data, you would realize that whether you fit the 'best' least squares model or one with a zero intercept, the fit is not going to be very good Do the data cluster tightly around the dashed line? No, and that is why I asked the question. The plotted fit doesn't look any better with or without intercept, so I was surprised that the R-value etc. indicated an excellent regression (which I now understood is the wrong interpretation). One of the references you googled suggests that intercepts should never be omitted. Is this true even if I know that the physical reality behind the numbers suggests an intercept of zero? Thanks, Jan __ R-help@ mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://r.789695.n4.nabble.com/lm-without-intercept-tp3312429p4638204.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology Programmer Analyst II, Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.