Re: [R] Change the text size of the title in a legend of a R plot.
Hi Victor,

looking at the code for legend(), it looks like the same 'cex' value is used for the legend text and for the title. Here is a trick though: draw the legend twice, with different cex values, omitting the title in one call and the text in the other:

  plot(1)
  legend("topright", c("a","b"), title=" ")
  legend("topright", c(" "," "), title="Legend", cex=0.6, bty='n', title.adj=0.15)

A bit of a hack, but it works. If you want the title larger, it will probably not fit the box, which you can omit by setting bty='n' (as in the second line).

good luck,
Remko

--
Remko Duursma
Research Lecturer
Centre for Plants and the Environment
University of Western Sydney
Hawkesbury Campus
Richmond NSW 2753
Mobile: +61 (0)422 096908
www.remkoduursma.com

On Fri, Apr 29, 2011 at 3:15 PM, Steven McKinney smckin...@bccrc.ca wrote:
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Victor Gabillon
> Sent: April-28-11 8:22 PM
> To: r-help@r-project.org
> Subject: [R] Change the text size of the title in a legend of a R plot.
>
>> Hello,
>> Is it possible to change the text size of the title in a legend of an R plot? I tried to set the title.cex argument directly, but it does not seem to work. Trying:
>>
>>   Horizo <- c(1,2,6,10,20)
>>   legtext <- paste(Horizo,sep="")
>>   legend("topleft", legend=legtext,col=col,text.col=col,lwd=lwd, lty=lty,cex=1.1,ncol=3,title = "Horizons",title.col ="black",title.cex=1.4)
>>
>> gives the following error (translated from French):
>>
>>   Error in legend("topleft", legend = legtext, col = col, text.col = col, : unused argument(s) (title.cex = 1.4)
>>
>> saying that the title.cex argument has been ignored. Thank you for helping.
>> Victor
>
> I haven't found any cex argument that works for just the legend title, but you can get some modification of the title with the expression argument:
>
>   legend("topleft", legend=legtext,col=col,text.col=col,lwd=lwd, lty=lty,cex=1.1,ncol=3, title = expression(bold("Horizons")),title.col="black")
>
> Does that help? Otherwise, you can of course figure out which functions do the legend plotting, then copy and modify those to get a title cex in place.
>
> Steve McKinney

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
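Remko's two-pass trick can be written out as a small, self-contained sketch. The labels, the cex value and the off-screen `pdf(NULL)` device are illustrative assumptions; only `legend()` arguments that appear in the thread are used.

```r
# Sketch of the "draw the legend twice" workaround described above.
pdf(NULL)   # assumption: off-screen device so the sketch runs without a display
plot(1)

# Pass 1: the entries at default size, with a blank title reserving the title row.
lg_body <- legend("topright", legend = c("a", "b"), title = " ")

# Pass 2: blank entries, only the title, drawn at a different cex and without a box.
lg_title <- legend("topright", legend = c(" ", " "), title = "Legend",
                   cex = 0.6, bty = "n", title.adj = 0.15)

invisible(dev.off())
```

Both calls return (invisibly) the legend geometry, which can help when nudging title.adj by hand. Newer R releases appear to accept a title.cex argument in legend() directly; check ?legend for your version before relying on it.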
[R] Still confused about classes
Hi,

I'm still confused about how to find out what methods are defined for a given class. For example, I know that

  today <- Sys.Date()

will produce an object of type Date. But I'm not sure what I can do with Date objects or how I can find out. ?Date refers me to the Date documentation page. But it doesn't tell me how, for example, to extract the current year from a date object. I tried

  year(today)
  Error: could not find function "year"

Is there some other function that does the job? I want a function f such that f(today) will return 2011. Perhaps there is no such function. But in general I don't have any confidence that I would know how to find it if it existed, or that I would know how to assure myself that there was no such function.

Thanks.
-- Russ
Re: [R] Still confused about classes
Hi Russ,

One tool that might help could be ?methods and ?showMethods. For example:

  ## for S3
  methods(class = "Date")
  ## for S4
  showMethods(classes = "Date")

Regarding getting the actual year, I would use (though there may be better ways):

  format.Date(as.Date("2010-01-01"), format = "%Y")

HTH,
Josh

On Thu, Apr 28, 2011 at 11:05 PM, Russ Abbott russ.abb...@gmail.com wrote:
> Hi, I'm still confused about how to find out what methods are defined for a given class. [...]

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
Re: [R] Still confused about classes
See ?months. methods(class = "Date") would have got you there.

(Date is not an S4 class, so people should not be setting S4 methods on it without very good reason, and there are none in R itself.)

On Thu, 28 Apr 2011, Joshua Wiley wrote:
> Hi Russ,
> One tool that might help could be ?methods and ?showMethods. For example:
>   ## for S3
>   methods(class = "Date")
>   ## for S4
>   showMethods(classes = "Date")
> regarding getting the actual year, I would use (though there may be better ways):
>   format.Date(as.Date("2010-01-01"), format = "%Y")

Call format(), not its methods, or use strftime() directly.

> HTH, Josh
>
> On Thu, Apr 28, 2011 at 11:05 PM, Russ Abbott russ.abb...@gmail.com wrote:
>> Hi, I'm still confused about how to find out what methods are defined for a given class. [...]

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
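The advice above (call the format() generic rather than format.Date, or use strftime()) can be sketched as follows. The helper name `year_of` is hypothetical and only there for illustration.

```r
# Extract the calendar year from a Date by calling the format() generic,
# which dispatches to the Date method; "%Y" is the four-digit year.
year_of <- function(d) {
  as.integer(format(d, "%Y"))
}

today <- Sys.Date()
year_of(today)                      # the current year as an integer
year_of(as.Date("2011-04-29"))      # 2011

# strftime() gives the same string without the integer conversion
strftime(as.Date("2011-04-29"), "%Y")
```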
[R] bootstrapping problem
I want to classify bipolar neurons in human cochleas and have data of the following structure:

     Vol_Nuc  Vol_Soma
 1    186.23    731.96
 2    204.58   4370.96
 3    539.98   7344.86
 4    477.71   6939.28
 5    421.22   5588.53
 6    276.61   1017.05
 7    392.28   6392.32
 8    424.43   6190.13
 9    256.41   3850.51
10    249.17   3118.14
11    276.97   3037.29
12    295.30   3703.76
13    314.43   5265.97
14    301.15   5781.73

I already worked with Matlab (I'm not a programmer) and created nice colour-coded dendrograms and also made some verifications of them. I have now started working with R and bootstrapped the data with a library named pvclust. It worked and R computed ... here is the code:

  library(pvclust)
  data = data.frame(Vol_Nuk=c(186.23,204.58,539.98,477.71,421.22,276.61,392.28,424.43,256.41,249.17,276.97,295.3,314.43,301.15),
                    Vol_Soma=c(731.96,4370.96,7344.86,6939.28,5588.53,1017.05,6392.32,6190.13,3850.51,3118.14,3037.29,3703.76,5265.97,5781.73))
  plot(data)
  result <- pvclust(data,nboot=100)
  plot(result)

It also does not work with the following commands:

  cluster.bootstrap <- pvclust(Raw, nboot=1000, method.dist="abscor")
  plot(cluster.bootstrap)
  pvrect(cluster.bootstrap)

I always get the following problem:

  mistake in plot.hclust(x$hclust, main = main, sub = sub, xlab = xlab, col = col, : invalid input for Dendrogram

Does anyone have an idea what's wrong? Thanks a lot!!

-- View this message in context: http://r.789695.n4.nabble.com/bootstrapping-problem-tp3483068p3483068.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Change the text size of the title in a legend of a R plot.
On 04/29/2011 05:21 AM, Victor Gabillon wrote:
>   Horizo <- c(1,2,6,10,20)
>   legtext <- paste(Horizo,sep="")
>   legend("topleft", legend=legtext,col=col,text.col=col,lwd=lwd, lty=lty,cex=1.1,ncol=3,title = "Horizons",title.col ="black",title.cex=1.4)

I am not sure, but the manual for legend() seems to be incorrect (or at least misleading). There is no title.cex argument for legend (even though the help page mentions it). Either you set cex > 1, but this will resize the labels as well, or you modify the code of legend as follows. Change the following (near the end of the code):

  text2(left + w/2, top - ymax, labels = title, adj = c(0.5, 0), cex = cex, col = title.col)

to:

  text2(left + w/2, top - ymax, labels = title, adj = c(0.5, 0), cex = title.cex, col = title.col)

and add title.cex to the arguments of legend. It's probably easiest if you copy the code of legend and save the modified version under a different function name.

I am not sure whom to contact about correcting the documentation of legend(). Perhaps I am even wrong, but I could not find any reference to title.cex in the code.

HTH
Jannis
Re: [R] EM vs Bayesian
Maybe it is better to ask this question on http://stats.stackexchange.com/. The question is not R specific.

mario

On 24-Apr-11 16:16, Jim Silverton wrote:
> Hello,
> Is there any literature out there that says that EM is better/worse than a Bayesian model when it comes to differentiating a univariate mixture of normal distributions?

--
Ing. Mario Valle
Data Analysis and Visualization Group | http://www.cscs.ch/~mvalle
Swiss National Supercomputing Centre (CSCS) | Tel: +41 (91) 610.82.60
v. Cantonale Galleria 2, 6928 Manno, Switzerland | Fax: +41 (91) 610.82.82
[R] matrix evaluation using if function
Hi All,

I am trying to create a function which evaluates whether the values (which are equal to one) of a matrix are the same as their mirror values. Consider the following matrix:

  n <- matrix(cbind(c(0,1,1),c(1,0,0),c(0,1,0)),3,3)
  colnames(n) <- cbind("A","B","C"); rownames(n) <- cbind("A","B","C")
  n
    A B C
  A 0 1 0
  B 1 0 1
  C 1 0 0

Hence, since n[2,1] and n[1,2] are 1 and the same, the function should return the name of the row of n[2,1]. I used the following function:

  for (i in length(rownames(n))) {
    for (j in length(colnames(n))) {
      if (n[i,j]==n[j,i]) { rownames(n)[[i]] -> output } else {}
    }
  }
  output
  NULL

The right answer would have been B, though. I simply do not see my mistake. I am very grateful for suggestions.

Thank you.
Re: [R] Still confused about classes
The function for getting the year from a date is available in package lubridate (as well as many other convenient functions to work with dates).

More generally, finding all methods for a given class may be a little tricky. If "all" means everything you have installed and currently attached to your search path, then

  methods(class="Date")

will do it (for S3 classes). (But note, from ?methods: "The functions listed are those which _are named like methods_ and may not actually be methods (known exceptions are discarded in the code).")

The result depends on which packages you have loaded: in my currently open R session, methods(class="Date") lists 36 possible methods, but after library(zoo) I get two more (as.yearmon.Date and as.yearqtr.Date).

Regards,
Kenn

On Fri, Apr 29, 2011 at 9:05 AM, Russ Abbott russ.abb...@gmail.com wrote:
> Hi, I'm still confused about how to find out what methods are defined for a given class. [...]
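As a rough illustration of the discovery route described in this thread, the sketch below lists the S3 methods registered for class "Date" in the current session and then tries a few of them. The exact list depends on the R version and on which packages are attached, so no particular count is assumed.

```r
# List S3 methods that exist for class "Date" in this session.
m <- methods(class = "Date")
method_names <- as.character(m)   # plain character vector of method names
head(method_names)

# A few of the generics that turn up this way can then be called directly.
d <- as.Date("2011-04-29")
months(d)          # month name (locale dependent)
weekdays(d)        # day name (locale dependent)
format(d, "%Y")    # "2011"
```

Reading the help page of each generic that appears in the list (for example ?months or ?format.Date) is usually the quickest way to see what it does for dates.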
Re: [R] using lme4 with three nested random effects
Dear Ben,

Are site, transect and plot factors? And do they have unique IDs? You could try this:

  rws30.UL$site <- factor(rws30.UL$site)
  rws30.UL$transect <- interaction(rws30.UL$site, rws30.UL$transect, drop = TRUE)
  rws30.UL$plot <- interaction(rws30.UL$site, rws30.UL$transect, rws30.UL$plot, drop = TRUE)
  modelincrBS <- glmer(l.ru.ba.incr~shigo.av+pre.f.crwn.length+bark.thick.bh+Date+slope.pos.num+dens.T+dbh+leaf.area+can.pos.num
                       +(1|site/transect/plot), data=rws30.UL, family=gaussian, na.action=na.omit)

Or:

  modelincrBS <- glmer(l.ru.ba.incr~shigo.av+pre.f.crwn.length+bark.thick.bh+Date+slope.pos.num+dens.T+dbh+leaf.area+can.pos.num
                       +(1|site) + (1|transect) + (1|plot), data=rws30.UL, family=gaussian, na.action=na.omit)

Best regards,
Thierry

ir. Thierry Onkelinx
Research Institute for Nature and Forest (Instituut voor natuur- en bosonderzoek)
team Biometrics & Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Benjamin Caldwell
> Sent: Friday 29 April 2011 0:37
> To: r-help
> Subject: [R] using lme4 with three nested random effects
>
> Hi all,
> I'm trying to fit models for data with three levels of nested random effects: site/transect/plot. For example,
>
>   modelincrBS <- glmer(l.ru.ba.incr~shigo.av+pre.f.crwn.length+bark.thick.bh+Date+slope.pos.num+dens.T+dbh+leaf.area+can.pos.num
>                        +(1|site/transect/plot), data=rws30.UL, family=gaussian, na.action=na.omit)
>
> but I get the following error:
>
>   Error: length(f1) == length(f2) is not TRUE
>   In addition: Warning messages:
>   1: In plot:(transect:site) : numerical expression has 92 elements: only the first used
>   2: In plot:(transect:site) : numerical expression has 92 elements: only the first used
>
> The formulation works for two nested effects (e.g. 1|site/transect). I can get it to run in lme:
>
>   modelincrBS <- lme(l.ru.ba.incr~shigo.av+pre.f.crwn.length+bark.thick.bh+Date+slope.pos.num+dens.T+dbh+leaf.area+can.pos.num,
>                      data=rws30.UL, random=(~1|site/transect/plot), na.action=na.omit)
>
> but I can't specify a distribution family in that package. Any help much appreciated.
>
> Ben Caldwell
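The point of the interaction() calls above is to give every transect and plot an ID that is unique across the whole data set. The toy design below is an assumption made up for illustration (it is not Ben's data); it reuses the same plot and transect labels inside each site, as field data often do, and needs only base R.

```r
# Toy nested design: plot labels 1..2 repeat in every transect,
# and transect labels "A".."B" repeat in every site.
d <- expand.grid(plot = 1:2, transect = c("A", "B"), site = c("s1", "s2"))

d$site        <- factor(d$site)
d$transect_id <- interaction(d$site, d$transect, drop = TRUE)
d$plot_id     <- interaction(d$site, d$transect, d$plot, drop = TRUE)

nlevels(d$site)         # 2 sites
nlevels(d$transect_id)  # 4 distinct transects (2 per site)
nlevels(d$plot_id)      # 8 distinct plots (2 per transect)
```

With IDs built this way, the grouping factors are unambiguous, and they are explicit factors rather than numeric codes, which is what the "numerical expression has 92 elements" warning in the original post hints at.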
[R] Reference variables by string in for loop
Dear R Users,

I am trying to get the following to work better:

  namevec <- c("one", "two", "three")
  for (name in namevec) {
    namedf <- eval(parse(text=paste(name, "_df", sep="")))
    ...
    ...
  }

The rationale behind it is that I created variables with the names one_df, two_df and three_df earlier in the same script, and I want to reference them inside the for loop. Is there a more elegant way to do this?

Best Regards,
Michael Bach
[R] plot several histograms with same y-axes scaling using hist()
Dear all

Problem: hist()-function, scale = "percent"

I want to generate histograms for changing underlying data. In order to make them comparable, I want to fix the y-axis (vertical axis) to, e.g., 0%, 10%, 20%, 30%, as well as to fix the spacing. So the y-axis in each histogram should be identical. Currently, I have 100 histograms and the y-axis scale changes in each. Here is my code:

  Hist(na.exclude(AA3), breaks=50, col="seashell3", scale="percent", xlim=c(-1, 1), xlab="Bewertungsfehler", ylab="Haeufigkeit (in %)", main="KBV", border="white")

I tried ylim=c(…), but unfortunately it does not work.

Thanks for your help in advance!

Regards,
Hans

-- View this message in context: http://r.789695.n4.nabble.com/plot-several-histograms-with-same-y-axes-scaling-using-hist-tp3483376p3483376.html
Re: [R] Reference variables by string in for loop
On Fri, Apr 29, 2011 at 1:03 PM, Michael Bach pha...@gmail.com wrote:
> Dear R Users,
> I am trying to get the following to work better:
>
>   namevec <- c("one", "two", "three")
>   for (name in namevec) {
>     namedf <- eval(parse(text=paste(name, "_df", sep="")))
>     ...
>   }
>
> [...] Is there a more elegant way to do this?

Yes, one elegant way to do it would be using a named list instead of separate variables.

  X <- list()
  X$one_df <- something            # 'something' stands for your data
  X[["two_df"]] <- something_else
  NAME <- "one_df"
  X[[NAME]]
  NAME <- "two_df"
  X[[NAME]]
  # etc.
  # the for loop could then be:
  for(name in names(X)) ...
  # or
  for(element in X) ...

Another way (not elegant, but better and shorter than the eval-parse way) is to use get. See ?get

Best regards,
Kenn
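Both suggestions above can be tried on toy data. In the sketch below the three small data frames and the summaries computed in the loops are assumptions chosen only so the example runs end to end.

```r
# Toy stand-ins for the data frames created earlier in the script
one_df   <- data.frame(x = 1:3)
two_df   <- data.frame(x = 4:6)
three_df <- data.frame(x = 7:9)

# Option 1: keep the data frames in a named list and index it by string
X <- list(one_df = one_df, two_df = two_df, three_df = three_df)
row_counts <- integer(0)
for (name in names(X)) {
  row_counts[name] <- nrow(X[[name]])
}

# Option 2: look up an existing variable by its name with get()
namevec <- c("one", "two", "three")
sums <- numeric(0)
for (name in namevec) {
  namedf <- get(paste(name, "_df", sep = ""))
  sums[name] <- sum(namedf$x)
}

row_counts
sums
```

The list version keeps related objects together and makes the loop independent of what else happens to be defined in the workspace, which is why it tends to be preferred in scripts.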
Re: [R] Reference variables by string in for loop
Hi Michael.

This is a classic :-)

  ObjectsOfInterest <- list(one_df, two_df, three_df)
  for(namedf in ObjectsOfInterest){...}

or probably even better

  sapply(ObjectsOfInterest, function(namedf){...})

hth.

Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36
-- Do Not Disapprove

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Michael Bach
> Sent: Friday 29 April 2011 12:03
> To: r-help@r-project.org
> Subject: [R] Reference variables by string in for loop
>
> Dear R Users, I am trying to get the following to work better: [...]
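A runnable version of the sapply() idea might look like the following. The toy data frames, the list names and the column mean computed inside the function are illustrative assumptions.

```r
# Toy stand-ins for one_df, two_df and three_df
one_df   <- data.frame(x = 1:3)
two_df   <- data.frame(x = 4:6)
three_df <- data.frame(x = 7:9)

# Naming the list elements carries the names through to the result
ObjectsOfInterest <- list(one = one_df, two = two_df, three = three_df)

# Apply the same function to every data frame; sapply() simplifies
# the per-element results into a named numeric vector
means <- sapply(ObjectsOfInterest, function(namedf) mean(namedf$x))
means
```

If the function returns something more complex than a single number, lapply() returns the results as a list without attempting to simplify them.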
[R] is there a way/library for generating colorful noise in R ??
I would like to generate some noisy time series. I know that it is possible to classify noise by looking at the exponent (beta) of the relationship between the spectrum of the time series and the frequencies (i.e. spectrum ~ frequency ^ beta). Is there a way to generate white (beta=0), pink (beta=-1), brown (beta=-2), blue (beta=1) and violet (beta=2) noise in R?

Thanks.
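One possible approach, offered only as a sketch and not as an established package routine, is to shape the spectrum of white noise so that power scales as frequency^beta and then transform back. The function name `colored_noise` is hypothetical; it relies on base R's fft() only, drops the zero-frequency term, and rescales the result to unit standard deviation.

```r
# Sketch: spectral shaping of Gaussian white noise so that power ~ f^beta.
colored_noise <- function(n, beta) {
  stopifnot(n >= 4)
  white <- rnorm(n)
  spec  <- fft(white)

  # Frequency of each FFT bin, folded so that bins k and n-k share a value;
  # this keeps the scaled spectrum conjugate-symmetric (real output).
  k    <- seq_len(n) - 1
  freq <- pmin(k, n - k) / n

  # Power ~ f^beta means amplitude ~ f^(beta/2); the zero-frequency bin
  # is set to 0, which removes the mean and avoids 0^negative.
  amp <- numeric(n)
  amp[freq > 0] <- freq[freq > 0]^(beta / 2)

  x <- Re(fft(spec * amp, inverse = TRUE)) / n
  x / sd(x)
}

set.seed(1)
brown  <- colored_noise(1024, -2)
pink   <- colored_noise(1024, -1)
white  <- colored_noise(1024,  0)
violet <- colored_noise(1024,  2)
```

A quick sanity check is to regress log(spec.pgram(x)$spec) on log frequency and compare the slope with beta; the estimate is noisy for short series.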
[R] Summer student internship placement at University of York / YCCSA / SEI (paid)
Dear R-lings,

I did not know which list to post to: this is a studentship, so not really a job, and it did not fit the r-sig-jobs list, and it is about developing an extension package interfaced with R. I hope I did not upset anyone. If so, apologies.

The Centre for Complex Systems Analysis at the University of York (YCCSA) in the UK, in collaboration with the Stockholm Environment Institute, is looking for a highly motivated student in Computer Science, Applied Mathematics, Applied Statistics or related fields for a 10-week paid student internship over summer 2011, starting in July, to collaborate in the development of an R package.

The student will participate in research projects to develop prototypes for toolkits for statistical predictions of diversity and dissimilarity and the generation of spatial landscapes, with applications in the biological and environmental sciences. We require excellent development skills and experience in CUDA/OpenCL, and a strong foundation in Computing, Statistics / Applied Mathematics and Computer Graphics. We need an excellent problem solver, able to innovate, find solutions and work independently.

For further information on the project please contact ct...@york.ac.uk or go to http://www.york.ac.u...2011/201107.pdf
For further information on the studentship programme please look at http://www.york.ac.u...olarships.html.

Please send your application no later than 13 May to scholarsh...@yccsa.org as one single PDF document including:

1. Your CV (max 2 pages)
2. A brief personal statement (max 1 page) including:
   * Which project(s) you are interested in (as many as you like but in preference order)
   * Your reasons for applying
   * Your academic interest
   * Your future aspirations
3. A full written academic reference (not just contact details). Your application will not be accepted without this reference (max 1 page).

Best,
--
Corrado Topi
Stockholm Environment Institute
Mob: +44 (0) 7769 601784
Tel: +44 (0) 1904 322893
Skype: corrado-eeos
Website: sei-international.org
University of York
York YO10 5DD UK
Fax: +44 (0) 1904 322898
EMAIL DISCLAIMER: http://www.york.ac.uk/docs/disclaimer/email.htm
Re: [R] plot several histograms with same y-axes scaling using hist()
On Fri, Apr 29, 2011 at 03:35:41AM -0700, hck wrote:
> Problem: hist()-function, scale = "percent"
> [...]
>   Hist(na.exclude(AA3), breaks=50, col="seashell3", scale="percent", xlim=c(-1, 1), xlab="Bewertungsfehler", ylab="Haeufigkeit (in %)", main="KBV", border="white")

Before anyone can really help, you'll need to let us know where your Hist() function came from. hist() from package graphics does not have a scale parameter and honours ylim without a problem.

cu
Philipp

--
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
Maximus-von-Imhof-Forum 3
85354 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/
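To illustrate the point that base hist() honours ylim, here is a minimal sketch that draws several histograms on a common percentage axis. The helper name `pct_hist`, the simulated samples and the 0 to 30 percent range are assumptions for illustration; the percentage scale is obtained by rescaling the counts of the histogram object before plotting it.

```r
# Sketch: identical breaks and identical y-axis for every histogram.
pct_hist <- function(x, breaks, ylim, ...) {
  h <- hist(x, breaks = breaks, plot = FALSE)    # compute bins only
  h$counts <- 100 * h$counts / sum(h$counts)     # bar heights as percentages
  plot(h, freq = TRUE, ylim = ylim, ylab = "Percent", ...)
  invisible(h)
}

pdf(NULL)   # assumption: off-screen device so the sketch runs anywhere
set.seed(42)
samples <- list(a = runif(200, -1, 1),           # stand-ins for the real data,
                b = 2 * rbeta(200, 5, 5) - 1)    # both bounded in [-1, 1]
breaks  <- seq(-1, 1, length.out = 51)           # 50 equal-width bins

res <- lapply(names(samples), function(nm) {
  pct_hist(samples[[nm]], breaks, ylim = c(0, 30), main = nm,
           col = "seashell3", border = "white")
})
invisible(dev.off())
```

Fixing the breaks as an explicit vector matters as much as fixing ylim: a single number passed as breaks is only a suggestion, so the bin width could otherwise differ between plots.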
Re: [R] plot several histograms with same y-axes scaling using hist()
Thanks for the note: indeed, the function is hist(), not Hist() with a capital letter. I use the standard R hist() function, in lower case only. Nevertheless, ylim does not work as it is supposed to.

-- View this message in context: http://r.789695.n4.nabble.com/plot-several-histograms-with-same-y-axes-scaling-using-hist-tp3483376p3483479.html
Re: [R] Putting x-axis in opposite order
On 04/29/2011 04:09 AM, Bogaso Christofer wrote:
> Hi all, please consider this plot:
>
>   xx <- seq(4, 0.01, by = -0.04)
>   yy <- rnorm(xx)
>   plot(xx, yy, type="l")
>
> Here you see my original 'xx' was in decreasing order; however, R puts it in increasing order. I understand that in any plot the x and y axes grow in increasing order, but I am wondering whether I can manipulate this to suit my particular problem, so that the numbers displayed on the x-axis would be in the given order.

Hi Bogaso,
If all else fails, have a look at rev.axis in the plotrix package.

Jim
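Before reaching for an add-on package, it may be worth trying base graphics alone: plot() accepts an xlim whose first value is larger than the second, which draws the axis in decreasing order. This is offered as an alternative sketch, not as what Jim proposed; the off-screen device is an assumption so the example runs anywhere.

```r
pdf(NULL)   # assumption: off-screen device
xx <- seq(4, 0.01, by = -0.04)
yy <- rnorm(length(xx))

# Reversed limits: the x-axis runs from max(xx) on the left to min(xx) on the right
plot(xx, yy, type = "l", xlim = rev(range(xx)))

usr <- par("usr")   # user coordinates: c(x_left, x_right, y_bottom, y_top)
invisible(dev.off())
```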
Re: [R] Reference variables by string in for loop
Nick Sabbe nick.sa...@ugent.be writes:
>   ObjectsOfInterest <- list(one_df, two_df, three_df)
>   for(namedf in ObjectsOfInterest){...}

I see. This is also more readable and traceable for others.

> or probably even better
>   sapply(ObjectsOfInterest, function(namedf){...})

I like this one for its functional style.

> hth.

It did, thanks.

Kind Regards,
Michael Bach
Re: [R] Reference variables by string in for loop
Kenn Konstabel lebats...@gmail.com writes:
> Another way (not elegant but better and shorter than the eval-parse way) is to use get. ?get

This one is handy for interactive use, thanks for the hint.

Kind Regards,
Michael Bach
Re: [R] matrix evaluation using if function
On Apr 29, 2011, at 4:27 AM, ivan wrote:
> Hi All, I am trying to create a function which evaluates whether the values (which are equal to one) of a matrix are the same as their mirror values. [...]
> Hence, since n[2,1] and n[1,2] are 1 and the same, the function should return the name of the row of n[2,1]. [...]
> The right answer would have been B, though.

Can you explain why A would not be an equally good answer to satisfy your problem set up?

  which(n == t(n) & col(n) != row(n), arr.ind=TRUE)
    row col
  B   2   1
  A   1   2

  rownames(which(n == t(n) & col(n) != row(n), arr.ind=TRUE))
  [1] "B" "A"

  # Which would seem to be the correct answer, but
  # this adds an additional constraint and also ensures no diagonal elements
  rownames(which(n == t(n) & col(n) != row(n) & lower.tri(n), arr.ind=TRUE))
  [1] "B"

> I simply do not see my mistake.

I would rather program a problem correctly than hash through errors in loop logic.

--
David Winsemius, MD
West Hartford, CT
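For completeness, here is a self-contained sketch that rebuilds the matrix from the question and compares the vectorised test with a loop. It adds an explicit `n == 1` condition, which is an assumption based on the question's wording ("values which are equal to one"). The loop uses seq_len(), since `for (i in length(x))` visits only the last index rather than every index.

```r
n <- matrix(cbind(c(0, 1, 1), c(1, 0, 0), c(0, 1, 0)), 3, 3)
dimnames(n) <- list(c("A", "B", "C"), c("A", "B", "C"))

# Vectorised: cells equal to 1 whose mirror cell is also 1,
# looking only below the diagonal so each pair is reported once.
idx    <- which(n == 1 & n == t(n) & lower.tri(n), arr.ind = TRUE)
output <- rownames(n)[idx[, "row"]]
output        # "B"

# Loop equivalent, iterating over every index with seq_len()
output_loop <- character(0)
for (i in seq_len(nrow(n))) {
  for (j in seq_len(i - 1)) {          # columns left of the diagonal
    if (n[i, j] == 1 && n[j, i] == 1) {
      output_loop <- c(output_loop, rownames(n)[i])
    }
  }
}
output_loop   # "B"
```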
Re: [R] plot several histograms with same y-axes scaling using hist()
On 04/29/2011 08:35 PM, hck wrote:
> Dear all
> Problem: hist()-function, scale = "percent"
> I want to generate histograms for changing underlying data. [...]
> I tried the ylim=c(…), but unfortunately it does not work.

Hi Hans,
The barp function in plotrix can plot histograms (see the last example on the help page) and may be flexible enough to do what you want.

Jim
Re: [R] Nomograms from rms' fastbw output objects
Hi Rob,

fastbw does not try to produce a full fit object. You have to re-run the fit manually based on what you (sometimes dangerously) learn from fastbw. If I can find a way to add a 'formula' component to the fastbw result then you could do something like lrm(fastbw(fit)$formula, ...).

Frank

Rob James wrote:

> There is both a technical and a theoretical element to my question...
>
> Should I be able to use the outputs which arise from the fastbw function as inputs to nomogram()? I seem to be failing at this: I obtain a "subscript out of range" error.
>
> That I can't do this may speak to technical failings, but I suspect it is because Prof Harrell thinks/knows it injudicious. However, I can't invent a reason why nomograms should be restricted to the full models, if the purpose of fastbw is to generate parsimonious models with appropriate standard errors.
>
> I'd welcome comments on either the technical or the theoretical issues.
>
> Many thanks in advance,
>
> Rob James

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: http://r.789695.n4.nabble.com/Nomograms-from-rms-fastbw-output-objects-tp3482669p3483607.html
Re: [R] Element by Element addition of the columns of a Matrix
... is the apply function what you are looking for?

    A = matrix(1,2,4)
    apply(A,1,sum)

HTH
Pete

--
View this message in context: http://r.789695.n4.nabble.com/Element-by-Element-addition-of-the-columns-of-a-Matrix-tp3483545p3483628.html
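For the row-wise sum Pete shows, base R also offers rowSums(). The sketch below assumes that "element by element addition of the columns" means one sum per row, and compares three equivalent ways of getting it:

```r
# 2 x 4 matrix of ones, as in the reply above
A <- matrix(1, 2, 4)

# apply() over rows (MARGIN = 1) sums across the columns of each row
by_apply <- apply(A, 1, sum)

# rowSums() returns the same values and is usually faster on large matrices
by_rowsums <- rowSums(A)

# Adding the columns explicitly, one after another, also gives the same result
by_reduce <- Reduce(`+`, lapply(seq_len(ncol(A)), function(j) A[, j]))
```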
[R] abline outside of plot region
Hi R people.

I ran into this problem: I created a plot with errbars, like this:

    errbar(x=c(1,2,3,4), y=c(2,1,3,3), yminus=c(1.5,0.5,2.5,2.5), yplus=c(2.5,1.5,3.5,3.5))

Next, I wanted to accentuate some x value with an abline, like this:

    abline(v=2)

In one of my R sessions (which admittedly I have had open for quite a while now), the abline draws outside of the plotting region of errbars (up to the edge of my plotting window at least).

I tested for the cause by opening another (clean) session of the same version of R (2.13) and running the same set of commands. In this session, I do not see this behavior.

Conclusion: I must have changed some graphical parameter in my original session, but I don't know which one. Do you?

As an addendum: I also want to add a few specific axis ticks besides the standard ones in my graph. I used axis for this, and it works. I set col.ticks to match the color of my abline (in the non-simplified code), and this works too, but unfortunately the label below the tick is not in this color, and I could not find a parameter for this in axis.

Suggestions for either?

Note: I'm on Windows 7 with R 2.13.

Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be/
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36
-- Do Not Disapprove
[R] 3-way contingency table
Hi,

I have a large data frame with many columns. A short example is given below:

    dataH
         host ms01 ms31 ms33 ms34
    1  cattle    4   20    9    6
    2   sheep    4    3    4    5
    3  cattle    4    3    4    5
    4  cattle    4    3    4    5
    5   sheep    4    3    5    5
    6    goat    4    3    4    5
    7   sheep    4    3    5    5
    8    goat    4    3    4    5
    9    goat    4    3    4    5
    10 cattle    4    3    4    5

Now I want to determine the frequencies of every unique value in every column depending on the host column.

It is quite easy to determine the total frequencies with the following command:

    dataH2 <- dataH[,c(2,3,4,5)]
    table(as.matrix(dataH2), colnames(dataH2)[col(dataH2)], useNA="ifany")

         ms01 ms31 ms33 ms34
      3     0    9    0    0
      4    10    0    7    0
      5     0    0    2    9
      6     0    0    0    1
      9     0    0    1    0
      20    0    1    0    0

But I cannot manage to make it depend on the host. I tried

    xtabs(cbind(ms01, ms31, ms33, ms34) ~ ., dataH)

and many other ways, but without success.

I can get it for each column individually with

    with(dataH, table(host, ms33))
            ms33
    host     4 5 9
      cattle 3 0 1
      deer   0 0 0
      goat   3 0 0
      human  0 0 0
      sheep  1 2 0
      tick   0 0 0

But I do not want to repeat the command for every column. I need a single table which can be plotted as a balloon plot, for instance.

Does anybody know how to achieve this?

--
Kind regards,
Mathias
[R] question of VECM restricted regression
Dear Colleague,

I am trying to figure out how to use R to do OLS restricted VECM regression. However, there is some notation I cannot understand. Please tell me what 'ect', 'sd' and 'LRM.dl1' are in the following example:

    # OLS restricted VECM regression
    data(denmark)
    sjd <- denmark[, c("LRM", "LRY", "IBO", "IDE")]
    sjd.vecm <- ca.jo(sjd, ecdet = "const", type="eigen", K=2, spec="longrun", season=4)
    sjd.vecm.rls <- cajorls(sjd.vecm, r=1)
    summary(sjd.vecm.rls$rlm)
    sjd.vecm.rls$beta

    Response LRM.d :

    Call:
    lm(formula = substitute(LRM.d), data = data.mat)

    Residuals:
          Min        1Q    Median        3Q       Max
    -0.027598 -0.012836 -0.003395  0.015523  0.056034

    Coefficients:
             Estimate Std. Error t value Pr(>|t|)
    ect1    -0.212955   0.064354  -3.309  0.00185 **
    sd1     -0.057653   0.010269  -5.614 1.16e-06 ***
    sd2     -0.016305   0.009177  -1.777  0.08238 .
    sd3     -0.040859   0.008767  -4.660 2.82e-05 ***
    LRM.dl1  0.049816   0.191992   0.259  0.79646
    LRY.dl1  0.075717   0.157902   0.480  0.63389
    IBO.dl1 -1.148954   0.372745  -3.082  0.00350 **
    IDE.dl1  0.227094   0.546271   0.416  0.67959

    sjd.vecm.rls$beta
                ect1
    LRM.l2  1.00
    LRY.l2 -1.032949
    IBO.l2  5.206919
    IDE.l2 -4.215879

Many thanks,
Meilan
[R] Plot multiple ctrees in the same figure
Dear all:

Is there a way to plot two conditional inference trees (party package, ctree) in a figure specified by layout? My attempts failed, as plot.party seemed to take over the layout functionality and forced a single ctree plot to be displayed.

A brief (non-reproducible) example together with the intended behavior follows below. I hope I am not missing something obvious.

My system: R 2.12.2 on a Windows machine with party 0.9-1 and partykit 0.1-0.

Thanks.
Tudor

    # CREATE ctrees
    ...
    layout(matrix(c(1,2,0,2), 2, 2, byrow=TRUE), widths=c(1,2), heights=c(1,2))
    plot(ctree1)   # plot first ctree
    plot(ctree2)   # plot second ctree
    ...

--
View this message in context: http://r.789695.n4.nabble.com/Plot-multiple-ctrees-in-the-same-figure-tp3483231p3483231.html
Re: [R] Change the text size of the title in a legend of a R plot.
Thanks everyone for the help.

I ended up copying and pasting the legend function from the R source files. I changed it so that title.cex is not set to cex by default and so that title.cex can be given as a parameter. It works fine for me. Note that if you make the title too big it extends beyond the box, as the box was not designed for a big title.

Thanks again!!

Victor

On 29/04/2011 10:03, Jannis wrote:
> On 04/29/2011 05:21 AM, Victor Gabillon wrote:
>>     Horizo <- c(1,2,6,10,20)
>>     legtext <- paste(Horizo, sep="")
>>     legend("topleft", legend=legtext, col=col, text.col=col, lwd=lwd, lty=lty, cex=1.1, ncol=3, title = "Horizons", title.col = "black", title.cex=1.4)
>
> I am not sure, but the manual for legend seems to be incorrect (or at least misleading). There is no title.cex argument for legend (even though the help page mentions it).
>
> Either you set cex > 1, but this will resize the labels as well, or you modify the code of legend as follows. Change the following (near the end of the code):
>
>     text2(left + w/2, top - ymax, labels = title, adj = c(0.5, 0), cex = cex, col = title.col)
>
> to:
>
>     text2(left + w/2, top - ymax, labels = title, adj = c(0.5, 0), cex = title.cex, col = title.col)
>
> and add title.cex to the arguments of legend. It's probably easiest if you copy the code of legend and save the modified version under a different function name.
>
> I am not sure whom to contact about correcting the documentation of legend(). Perhaps I am even wrong, but I could not find any reference to title.cex in the code.
>
> HTH
> Jannis
[R] replace non numeric with NA
Hello,

I have a sample data frame which looks like this:

      day      od month
    1   1     0.1     2
    2   3 #VALUE!     1
    3   5     0.4    12
    4   7     0.8    10
    5  11       -     3
    6  14       s     7
    7  18      --    12
    8  27      19     7

Now I wish to filter out all the non-numeric values and replace them with NA. The data frame is actually huge and the non-numeric entries vary from - to a string to absolutely anything!!!

Can anyone please help?

Thank you,

Warm Regards,
Nandini
Re: [R] Problem installing package sp in R 2.13.0
The rgdal package is not a dependency of sp, only suggested. In addition, you are trying to install source packages, but should (probably) be installing binaries, with type="mac.binary.leopard" the most likely. If you need the OSX rgdal binary, make sure that CRAN extras is on your repository path (see ?setRepositories), for example with setRepositories(ind=1:2).

If you really want to install source packages under OSX, be sure to read up on this at http://cran.r-project.org/bin/macosx/, looking for the FAQ and the links to tools. If you can manage with binary packages, stay with them.

Roger

Arnaud Catherine wrote:

> Hi,
>
> I am having trouble installing package sp in R (2.13.0) on Mac OSX. I have tried installing the package using the GUI and the function install.packages, but it didn't work.
>
> Here is the error message I get:
>
>     also installing the dependency 'rgdal'
>
>     trying URL 'http://cran.univ-lyon1.fr/src/contrib/rgdal_0.6-33.tar.gz'
>     Content type 'application/x-gzip' length 1422992 bytes (1.4 Mb)
>     opened URL
>     ==
>     downloaded 1.4 Mb
>
>     trying URL 'http://cran.univ-lyon1.fr/src/contrib/sp_0.9-80.tar.gz'
>     Content type 'application/x-gzip' length 738569 bytes (721 Kb)
>     opened URL
>     ==
>     downloaded 721 Kb
>
>     * installing *source* package 'sp' ...
>     ** libs
>     *** arch - i386
>     sh: make: command not found
>     ERROR: compilation failed for package 'sp'
>     * removing '/Library/Frameworks/R.framework/Versions/2.13/Resources/library/sp'
>     ERROR: dependency 'sp' is not available for package 'rgdal'
>
>     The downloaded packages are in
>     '/private/var/folders/8P/8P9oV0FHFI83GKIm2cPUOk+++TM/-Tmp-/RtmppsxaRa/downloaded_packages'
>     * removing '/Library/Frameworks/R.framework/Versions/2.13/Resources/library/rgdal'
>
> Any help would be much appreciated!
>
> Best regards.
>
> Dr. Arnaud CATHERINE
> Postdoctoral researcher
> UMR 7245 CNRS/MNHN Molécules de Communication et Adaptation des Micro-organismes
> Equipe Cyanobactéries, Cyanotoxines et Environnement
> Muséum National d'Histoire Naturelle
> 12, rue Buffon, Case 39
> 75231 Paris Cedex 05
> Tel: +33 (0)1 40 79 31 79
> Fax: +33 (0)1 40 79 35 94
> Email: arno...@mnhn.fr
> Website of the Muséum National d'Histoire Naturelle: http://www.mnhn.fr

-----
Roger Bivand
Economic Geography Section
Department of Economics
Norwegian School of Economics and Business Administration
Helleveien 30
N-5045 Bergen, Norway
--
View this message in context: http://r.789695.n4.nabble.com/Problem-installing-package-sp-in-R-2-13-0-tp3481107p3483392.html
Re: [R] How to define specially nested functions
Hi, Jerome and Phil,

Thank you for your solutions. I have studied your code carefully, but I have further questions (I suspect the simple lines of code may not do the real job I am about to describe; please forgive my shallowness!).

I guess I over-simplified my question. Basically, I need such a function as the integrand for estimating an expectation by Monte Carlo methods. Please allow me to state the problem in more detail:

I have to define a function for Monte Carlo computation of a conditional expectation and solve for the argument at which the expectation equals a pre-specified value. Say the integrand is f(x,y,z), where x and z are deterministic and y is random and follows a distribution F. I feed x=x0 to f, sample y from F, evaluate f(x0,y,z), and use the Monte Carlo method to get the expectation, which is then a function of z only, say E(z). Finally, I solve for z such that E(z) = 0.5, for example.

The function f itself is very complicated and has high-dimensional vectors as arguments, except z, which is a real number. I am new to R and unexpectedly ran into this symbolic limitation of R just as I had almost finished programming all the major computations in R. I am skilled in Matlab and Mathematica (where this is very easy to do), but as I am now in statistics I would like to continue in R unless it really cannot do this (in which case I will have to recode in Mathematica).

Any further help is much appreciated!

Best regards,
-Chee

From: Jerome Asselin
Sent: Friday, April 29, 2011 12:25 AM
To: Chee Chen
Cc: R -Help
Subject: Re: [R] How to define specially nested functions

On Thu, 2011-04-28 at 23:08 -0400, Chee Chen wrote:
> Dear All,
> I would like to define a function f(x,y,z) with three arguments x,y,z, such that, given values for x,y, f(x,y,z) is still a function of z and I am still allowed to find the root in terms of z when x,y are given. For example: f(x,y,z) = x+y + (x^2-z); given x=1,y=3, f(1,3,z) = 1+3+1-z is a function of z, and then I can use R to find the root z=5.
> Thank you.
> -Chee

Interesting exercise. I've got this function, which I think does what you're asking.

    f <- function(x,y,z)
    {
      fcall <- match.call()
      fargs <- NULL
      if(fcall$x == "x") fargs <- c(fargs, "x")
      if(fcall$y == "y") fargs <- c(fargs, "y")
      if(fcall$z == "z") fargs <- c(fargs, "z")
      ffunargs <- as.list(fargs)
      names(ffunargs) <- fargs
      argslist <- list(fcall)
      ffun <- append(argslist, substitute( x+y + (x^2-z) ), after=0)[[1]]
      as.function(append(ffunargs, ffun))
    }

This yields:

    f(3, 2, z)
    function (z = z)
    3 + 2 + (3^2 - z)
    <environment: 0x132fdb8>

    f(3, 2, z)(3)
    [1] 11

I haven't figured out how to get rid of the default argument value shown here as 'z = z'. That doesn't prevent it from working, but it's less pretty. If you find a better way, let me know.

HTH,
Jerome
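The workflow Chee describes does not need symbolic manipulation: an R closure can fix x (and the sampled y values) and return an ordinary function of z, which uniroot() can then solve numerically. The sketch below is illustrative only. The names make_f and mc_expectation, the logistic integrand, and the standard normal F are assumptions standing in for the poster's far more complicated f.

```r
# Part 1: the toy example from the thread, f(x, y, z) = x + y + (x^2 - z).
# make_f() fixes x and y and returns a function of z alone (a closure).
make_f <- function(x, y) {
  function(z) x + y + (x^2 - z)
}
g <- make_f(1, 3)                                   # g(z) = 5 - z
root <- uniroot(g, interval = c(-100, 100))$root    # numerically close to 5

# Part 2: Monte Carlo expectation as a function of z.
# Hypothetical integrand: f(x, y, z) = plogis(x + y - z), with y ~ N(0, 1).
# The y draws are generated once and reused for every z, so E(z) is a
# deterministic, smooth function of z that a root finder can handle.
mc_expectation <- function(x0, y_draws) {
  function(z) mean(plogis(x0 + y_draws - z))
}

set.seed(42)
y_draws <- rnorm(10000)
E <- mc_expectation(x0 = 1, y_draws = y_draws)

# Solve E(z) = 0.5; by symmetry the answer should be near z = 1 here
z_star <- uniroot(function(z) E(z) - 0.5, interval = c(-10, 10))$root
```

Reusing one fixed set of draws (common random numbers) matters: if new y values were sampled on every call, E(z) would be noisy and uniroot() could fail to converge.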
Re: [R] Questions about lrm, validate, pentrace
Yes I would select that as the final model.

The difference you saw is caused by different treatment of penalization of factor variables, related to the use of the sum of squared differences between the estimate at one category and the average over all categories. I think that as long as you code it one way consistently and pick the penalty using that coding you are OK. But if the coefficients of the non-factor variables depend on how the binary predictor is coded, there is a bit more concern.

Frank

細田弘吉 wrote:

> Thank you for your quick reply, Prof. Harrell.
>
> According to your advice, I ran pentrace using a very wide range.
>
>     pentrace.x6factor <- pentrace(x6factor.lrm, seq(0, 100, by=0.5))
>     plot(pentrace.x6factor)
>
> I attached this figure. Then,
>
>     pentrace.x6factor <- pentrace(x6factor.lrm, seq(0, 10, by=0.05))
>
> It seems reasonable that the best penalty is 2.55.
>
>     x6factor.lrm.pen <- update(x6factor.lrm, penalty=2.55)
>     cbind(coef(x6factor.lrm), coef(x6factor.lrm.pen), abs(coef(x6factor.lrm)-coef(x6factor.lrm.pen)))
>                          [,1]        [,2]        [,3]
>     Intercept     -4.32434556 -3.86816460 0.456180958
>     stenosis      -0.01496757 -0.01091755 0.004050025
>     T1             3.04248257  2.42443034 0.618052225
>     T2            -0.75335619 -0.57194342 0.181412767
>     procedure     -1.20847252 -0.82589263 0.382579892
>     ClinicalScore  0.37623189  0.30524628 0.070985611
>
>     validate(x6factor.lrm, bw=F, B=200)
>               index.orig training    test optimism index.corrected   n
>     Dxy           0.6324   0.6849  0.5955   0.0894          0.5430 200
>     R2            0.3668   0.4220  0.3231   0.0989          0.2679 200
>     Intercept     0.0000   0.0000 -0.1924   0.1924         -0.1924 200
>     Slope         1.0000   1.0000  0.7796   0.2204          0.7796 200
>     Emax          0.0000   0.0000  0.0915   0.0915          0.0915 200
>     D             0.2716   0.3229  0.2339   0.0890          0.1826 200
>     U            -0.0192  -0.0192  0.0243  -0.0436          0.0243 200
>     Q             0.2908   0.3422  0.2096   0.1325          0.1582 200
>     B             0.1272   0.1171  0.1357  -0.0186          0.1457 200
>     g             1.6328   1.9879  1.4940   0.4939          1.1389 200
>     gp            0.2367   0.2502  0.2216   0.0286          0.2080 200
>
>     validate(x6factor.lrm.pen, bw=F, B=200)
>               index.orig training    test optimism index.corrected   n
>     Dxy           0.6375   0.6857  0.6024   0.0833          0.5542 200
>     R2            0.3145   0.3488  0.3267   0.0221          0.2924 200
>     Intercept     0.0000   0.0000  0.0882  -0.0882          0.0882 200
>     Slope         1.0000   1.0000  1.0923  -0.0923          1.0923 200
>     Emax          0.0000   0.0000  0.0340   0.0340          0.0340 200
>     D             0.2612   0.2571  0.2370   0.0201          0.2411 200
>     U            -0.0192  -0.0192 -0.0047  -0.0145         -0.0047 200
>     Q             0.2805   0.2763  0.2417   0.0346          0.2458 200
>     B             0.1292   0.1224  0.1355  -0.0132          0.1423 200
>     g             1.2704   1.3917  1.5019  -0.1102          1.3805 200
>     gp            0.2020   0.2091  0.2229  -0.0138          0.2158 200
>
> In the penalized model (x6factor.lrm.pen), the apparent Dxy is 0.64 and the bias-corrected Dxy is 0.55. The maximum absolute error is estimated to be 0.034, smaller than in the non-penalized model (0.0915 in x6factor.lrm). The changes in slope and intercept are substantially reduced in the penalized model. I think overfitting is improved at least to some extent. Should I select this as the final model?
>
> I have one more question. The procedure variable was defined as a 0/1 value in the previous mail. For graphical reasons, I redefined it as a treat1/treat2 value. Then the best penalty value changed from 3.05 to 2.55. I guess the change from numeric to factor caused this reduction in penalty. Which setup should I select?
>
> I appreciate your help in advance.
>
> --
> KH
>
> (11/04/26 0:21), Frank Harrell wrote:
>> You've done a lot of good work on this. Yes I would say you have moderate overfitting with the first model. The only thing that saved you from having severe overfitting is that there seems to be a signal present [I am assuming this model is truly pre-specified and was not developed at all by looking at patterns of responses Y.]
>>
>> The use of backwards stepdown demonstrated much worse overfitting. This is in line with what we know about the damage of stepwise selection methods that do not incorporate shrinkage. I would throw away the stepwise regression model. You'll find that the model selected is entirely arbitrary. And you can't use the selected variables in any re-fit of the model, i.e., you can't use lrm pretending that the two remaining variables were pre-specified. Stepwise regression methods only seem to help. When assessed properly we see that is an illusion.
>>
>> You are using penalization properly but you did not print the full table of penalties vs. effective AIC. We don't have faith that
[R] threshold matrix
Dear all,

I have quite a big matrix which I would like to threshold. If the value is below the threshold the cell should be zero, and if the value is over the threshold the cell should be one.

One really simple way to do that is to have a nested loop and check cell by cell. The problem is that this seems to be really time-consuming and inefficient.

What do you suggest I try?

I would like to thank you in advance for your help.

Best Regards
Alex
Re: [R] replace non numeric with NA
Hi Nandini,

On 4/29/2011 6:45 AM, Nandini B wrote:
> Hello,
>
> I have a sample data frame which looks like this:
>
>       day      od month
>     1   1     0.1     2
>     2   3 #VALUE!     1
>     3   5     0.4    12
>     4   7     0.8    10
>     5  11       -     3
>     6  14       s     7
>     7  18      --    12
>     8  27      19     7

    x <- data.frame(day=1:8, od = c(0.1,"#VALUE!",0.4,0.8,"-","s","--",19), month = c(2,1,12,10,3,7,12,7))
    x
      day      od month
    1   1     0.1     2
    2   2 #VALUE!     1
    3   3     0.4    12
    4   4     0.8    10
    5   5       -     3
    6   6       s     7
    7   7      --    12
    8   8      19     7

    x$od <- as.numeric(as.character(x$od))
    Warning message:
    NAs introduced by coercion

    x
      day   od month
    1   1  0.1     2
    2   2   NA     1
    3   3  0.4    12
    4   4  0.8    10
    5   5   NA     3
    6   6   NA     7
    7   7   NA    12
    8   8 19.0     7

Best,

Jim

> Now I wish to filter out all the non-numeric values and replace them with NA. The data frame is actually huge and the non-numeric entries vary from - to a string to absolutely anything!!!
>
> Can anyone please help?
>
> Thank you,
>
> Warm Regards,
> Nandini

--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826

**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues
Re: [R] 3-way contingency table
On Apr 29, 2011, at 6:47 AM, Mathias Walter wrote:

> Hi,
>
> I have a large data frame with many columns. A short example is given below:
>
>     dataH
>          host ms01 ms31 ms33 ms34
>     1  cattle    4   20    9    6
>     2   sheep    4    3    4    5
>     3  cattle    4    3    4    5
>     4  cattle    4    3    4    5
>     5   sheep    4    3    5    5
>     6    goat    4    3    4    5
>     7   sheep    4    3    5    5
>     8    goat    4    3    4    5
>     9    goat    4    3    4    5
>     10 cattle    4    3    4    5
>
> Now I want to determine the frequencies of every unique value in every column depending on the host column.
>
> It is quite easy to determine the total frequencies with the following command:
>
>     dataH2 <- dataH[,c(2,3,4,5)]
>     table(as.matrix(dataH2), colnames(dataH2)[col(dataH2)], useNA="ifany")
>
>          ms01 ms31 ms33 ms34
>       3     0    9    0    0
>       4    10    0    7    0
>       5     0    0    2    9
>       6     0    0    0    1
>       9     0    0    1    0
>       20    0    1    0    0
>
> But I cannot manage to make it depend on the host. I tried
>
>     xtabs(cbind(ms01, ms31, ms33, ms34) ~ ., dataH)
>
> and many other ways, but without success.
>
> I can get it for each column individually with
>
>     with(dataH, table(host, ms33))
>             ms33
>     host     4 5 9
>       cattle 3 0 1
>       deer   0 0 0
>       goat   3 0 0
>       human  0 0 0
>       sheep  1 2 0
>       tick   0 0 0
>
> But I do not want to repeat the command for every column. I need a single table which can be plotted as a balloon plot, for instance.

You have obviously not given us the full data from which your correct answer was drawn, but see if this is going in the right direction:

    require(reshape)
    dataHm <- melt(dataH)
    Using host as id variables
    xtabs(~host+value+variable, dataHm)
    , , variable = ms01

            value
    host     3 4 5 6 9 20
      cattle 0 4 0 0 0  0
      goat   0 3 0 0 0  0
      sheep  0 3 0 0 0  0

    , , variable = ms31

            value
    host     3 4 5 6 9 20
      cattle 3 0 0 0 0  1
      goat   3 0 0 0 0  0
      sheep  3 0 0 0 0  0

    , , variable = ms33

            value
    host     3 4 5 6 9 20
      cattle 0 3 0 0 1  0
      goat   0 3 0 0 0  0
      sheep  0 1 2 0 0  0

    , , variable = ms34

            value
    host     3 4 5 6 9 20
      cattle 0 0 3 1 0  0
      goat   0 0 3 0 0  0
      sheep  0 0 3 0 0  0

> Does anybody know how to achieve this?
>
> --
> Kind regards,
> Mathias

David Winsemius, MD
West Hartford, CT
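The same host by value by marker table can also be built without the reshape package, by stacking the marker columns by hand. This is a sketch in base R only; the ten rows are the example rows from the question, and the column names of the intermediate data frame (variable, value) are chosen here to mirror melt() rather than taken from any package.

```r
dataH <- data.frame(
  host = c("cattle", "sheep", "cattle", "cattle", "sheep",
           "goat", "sheep", "goat", "goat", "cattle"),
  ms01 = rep(4, 10),
  ms31 = c(20, rep(3, 9)),
  ms33 = c(9, 4, 4, 4, 5, 4, 5, 4, 4, 4),
  ms34 = c(6, rep(5, 9))
)

markers <- c("ms01", "ms31", "ms33", "ms34")

# Long format: one row per (host, marker, value) observation.
# unlist() concatenates the marker columns in order, which matches
# rep(markers, each = nrow(dataH)).
long <- data.frame(
  host     = rep(dataH$host, times = length(markers)),
  variable = rep(markers, each = nrow(dataH)),
  value    = unlist(dataH[markers], use.names = FALSE)
)

# Three-way contingency table: host x value x marker
tab3 <- xtabs(~ host + value + variable, data = long)

# Flattened to a single two-dimensional table, one row per marker/host pair
flat <- ftable(tab3, row.vars = c("variable", "host"))
```

The flattened table is one candidate input for a balloon-style plot, since every marker/host combination becomes a row and every observed value a column.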
Re: [R] replace non numeric with NA
On 29/04/2011 6:45 AM, Nandini B wrote:
> Hello,
>
> I have a sample data frame which looks like this:
>
>       day      od month
>     1   1     0.1     2
>     2   3 #VALUE!     1
>     3   5     0.4    12
>     4   7     0.8    10
>     5  11       -     3
>     6  14       s     7
>     7  18      --    12
>     8  27      19     7
>
> Now I wish to filter out all the non-numeric values and replace them with NA. The data frame is actually huge and the non-numeric entries vary from - to a string to absolutely anything!!!
>
> Can anyone please help?

You don't tell us the types of the columns, so I'll assume they are factors. If so, call as.numeric(as.character()) on each of them to convert the number-like values to numbers and the others to NA. For example,

    df$day <- as.numeric(as.character(df$day))

Duncan Murdoch
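For a data frame with many columns, the conversion suggested in the replies can be applied to every column at once. A minimal sketch, assuming all columns are meant to be numeric; the helper name to_numeric is made up for illustration:

```r
# Example data from the thread; stringsAsFactors = TRUE reproduces the
# factor columns that read.table() created by default at the time
x <- data.frame(day   = 1:8,
                od    = c("0.1", "#VALUE!", "0.4", "0.8", "-", "s", "--", "19"),
                month = c(2, 1, 12, 10, 3, 7, 12, 7),
                stringsAsFactors = TRUE)

to_numeric <- function(col) {
  # as.character() first, so a factor is converted via its labels and not
  # via its internal integer codes; anything non-numeric becomes NA.
  # suppressWarnings() hides the expected "NAs introduced by coercion".
  suppressWarnings(as.numeric(as.character(col)))
}

# x[] <- keeps x a data frame while replacing every column
x[] <- lapply(x, to_numeric)
```

Columns that legitimately hold text would be wiped out by this, so in practice the lapply() call should be restricted to the columns known to be numeric.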
Re: [R] Speed up plotting to MSWindows graphics window
If you are plotting that many data points, you might want to look at 'hexbin' as a way of aggregating the values into a different presentation. It is especially nice if you are doing a scatter plot with a lot of data points and trying to make sense of it.

On Wed, Apr 27, 2011 at 5:16 AM, Jonathan Gabris jonat...@k-m-p.nl wrote:
> Hello,
>
> I am working on a project analysing the performance of motor vehicles through messages logged over a CAN bus.
>
> I am using R 2.12 on Windows XP and 7.
>
> I am currently plotting the data in R, overlaying 5 or more plots of data logged at 1 kHz (using plot.ts() and par(new = TRUE)). The aim is to be able to pan, zoom in and out, and get values from the plotted graph using a custom Qt interface that is used as a front end to R.exe (all this works). The plot is drawn by R directly to the Windows graphics device.
>
> The data is imported from a .csv file (typically around 100 MB) into a matrix (timestamp, message ID, byte0, byte1, ..., byte7). I then separate this matrix into several by message ID (dimensions are on the order of 8 columns, 10^6 rows).
>
> The panning is done by redrawing the plots, shifted by a small amount, so as to view a window of data from a second to a minute long that can travel the length of the logged data.
>
> My problem is that the redrawing of the plots whilst panning is too slow when dealing with this much data, i.e. I can see the last graphs being drawn to the screen in the half-second following the view change. I need a fluid change from one view to the next.
>
> My question is this: are there ways to speed up the plotting on the MS Windows display? By reducing plotted point densities to *sensible* values? Using something other than plot.ts()? Is the lattice package faster? I don't need publication-quality plots; they can be rougher...
>
> I have tried:
> - Using matrices instead of data frames (works for calculations but not enough for plots)
> - Increasing the max usable memory (max-mem-size) (no change)
> - Increasing the size of the pointer protection stack (max-ppsize) (no change)
> - Deleting the unnecessary leftover matrices (no change)
> - I can't use lines() instead of plot() because of the very different scales (rpm-1, flags -1to3)
>
> I am going to do some resampling of the logged data to reduce the vector sizes (removal of *less* important data and use of window.ts()).
>
> But I am currently running out of ideas... So if somebody could point out something, I would be grateful.
>
> Thanks,
>
> Jonathan Gabris

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
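One way to reduce the plotted point density to "sensible" values, as Jonathan asks, is to keep only the minimum and maximum of each horizontal bin, so that spikes survive the thinning. A minimal sketch in base R; the helper name decimate_minmax and the simulated 1 kHz signal are assumptions for illustration, not part of the original code:

```r
decimate_minmax <- function(t, y, n_bins = 1000) {
  # assign consecutive samples to n_bins bins of (nearly) equal size
  bin <- cut(seq_along(y), breaks = n_bins, labels = FALSE)
  lo    <- tapply(y, bin, min)
  hi    <- tapply(y, bin, max)
  t_mid <- tapply(t, bin, mean)
  # interleave min and max so each bin contributes one vertical stroke
  list(t = rep(as.numeric(t_mid), each = 2),
       y = as.numeric(rbind(lo, hi)))
}

set.seed(1)
t <- seq(0, 100, by = 0.001)                 # 100 s sampled at 1 kHz
y <- sin(t) + rnorm(length(t), sd = 0.1)     # simulated signal
d <- decimate_minmax(t, y, n_bins = 1000)

# plot(d$t, d$y, type = "l") now draws 2,000 points instead of about 100,000
```

Choosing n_bins close to the pixel width of the plotting window is usually enough, since more points than pixels cannot be distinguished on screen anyway.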
Re: [R] abline outside of plot region
On 2011-04-29 06:14, Nick Sabbe wrote:
> Hi R people.
>
> I ran into this problem: I created a plot with errbars, like this:
>
>     errbar(x=c(1,2,3,4), y=c(2,1,3,3), yminus=c(1.5,0.5,2.5,2.5), yplus=c(2.5,1.5,3.5,3.5))
>
> Next, I wanted to accentuate some x value with an abline, like this:
>
>     abline(v=2)
>
> In one of my R sessions (which admittedly I have had open for quite a while now), the abline draws outside of the plotting region of errbars (up to the edge of my plotting window at least).
>
> I tested for the cause by opening another (clean) session of the same version of R (2.13) and running the same set of commands. In this session, I do not see this behavior.
>
> Conclusion: I must have changed some graphical parameter in my original session, but I don't know which one. Do you?
>
> As an addendum: I also want to add a few specific axis ticks besides the standard ones in my graph. I used axis for this, and it works. I set col.ticks to match the color of my abline (in the non-simplified code), and this works too, but unfortunately the label below the tick is not in this color, and I could not find a parameter for this in axis.
>
> Suggestions for either?
>
> Note: I'm on Windows 7 with R 2.13.

    plot(1:4, xaxt='n')
    axis(1, at=2:3, lab=c('a', 'b'), col.ticks=3, col.axis=2, lwd=0, lwd.ticks=1)
    par(xpd = TRUE)
    abline(v = 4)

Peter Ehlers

> Nick Sabbe
> --
> ping: nick.sa...@ugent.be
> link: http://biomath.ugent.be/
> wink: A1.056, Coupure Links 653, 9000 Gent
> ring: 09/264.59.36
> -- Do Not Disapprove
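Peter's snippet reproduces the symptom: with par(xpd = TRUE), drawing is no longer clipped to the plot region. A short sketch of how the setting can be inspected and put back to its default on the current device follows. It uses a null PDF device so that it runs without opening a window; the variable names are illustrative.

```r
pdf(NULL)                    # off-screen device; no file is written
plot(1:4)

par(xpd = TRUE)              # the state of the long-running session:
was_xpd <- par("xpd")        # abline() may now draw outside the plot region

par(xpd = FALSE)             # default: clip all drawing to the plot region
now_xpd <- par("xpd")
abline(v = 2)                # stays inside the plot region again

invisible(dev.off())
```

Graphical parameters are stored per device, so closing the affected graphics window with dev.off() and plotting again also restores the defaults.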
Re: [R] threshold matrix
On Apr 29, 2011, at 9:37 AM, Alaios wrote:

> Dear all,
> I have quite a big matrix which I would like to threshold. If the value is below the threshold the cell should be zero, and if the value is over the threshold the cell should be one.

    M2 <- M
    M2[M < thresh] <- 0
    M2[M >= thresh] <- 1

or perhaps simply:

    M2 <- as.numeric( M[] > thresh )

> One really simple way to do that is to have a nested loop and check cell by cell. The problem is that this seems to be really time-consuming and inefficient.
>
> What do you suggest I try?

--
David Winsemius, MD
West Hartford, CT
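One detail worth knowing about the one-line version: as.numeric() returns a plain vector, so the matrix dimensions are lost. The sketch below, on a small random matrix invented for illustration, shows two vectorised variants that keep the result a matrix:

```r
set.seed(1)
M <- matrix(runif(12), nrow = 3)
thresh <- 0.5

# The comparison is vectorised over the whole matrix; adding 0 turns
# TRUE/FALSE into 1/0 and keeps the dim attribute
M2 <- (M > thresh) + 0

# Equivalent: change the storage mode of the logical matrix in place
M3 <- M > thresh
storage.mode(M3) <- "numeric"

# For comparison: as.numeric() gives the same values but drops the dimensions
v <- as.numeric(M > thresh)
```

Both variants avoid the nested loop entirely and run in a single pass over the data.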
Re: [R] matrix evaluation using if function
David Winsemius wrote:

> On Apr 29, 2011, at 4:27 AM, ivan wrote:
>
>> Hi All, I am trying to create a function which evaluates whether the values (which are equal to one) of a matrix are the same as their mirror values. Consider the following matrix:
>>
>>     n <- matrix(cbind(c(0,1,1),c(1,0,0),c(0,1,0)),3,3)
>>     colnames(n) <- cbind("A","B","C"); rownames(n) <- cbind("A","B","C")
>>     n
>>       A B C
>>     A 0 1 0
>>     B 1 0 1
>>     C 1 0 0
>>
>> Hence, since n[2,1] and n[1,2] are 1 and the same, the function should return the name of the row of n[2,1]. I used the following function:
>>
>>     for (i in length(rownames(n))) {
>>       for (j in length(colnames(n))) {
>>         if (n[i,j] == n[j,i]) { rownames(n)[[i]] -> output } else {}
>>       }
>>     }
>>     output
>>     NULL
>>
>> The right answer would have been B, though.
>
> Can you explain why A would not be an equally good answer to satisfy your problem set up?
>
>     which(n == t(n) & col(n) != row(n), arr.ind=TRUE)
>       row col
>     B   2   1
>     A   1   2
>     rownames(which(n == t(n) & col(n) != row(n), arr.ind=TRUE))
>     [1] "B" "A"
>
>     # Which would seem to be the correct answer, but
>     # this adds an additional constraint and also ensures no diagonal elements
>     rownames(which(n == t(n) & col(n) != row(n) & lower.tri(n), arr.ind=TRUE))
>     [1] "B"

Wouldn't this do it too (since the diagonal is set to FALSE by lower.tri)?

    rownames(which(n == t(n) & lower.tri(n), arr.ind=TRUE))

Berend
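Two points from this exchange can be sketched in a few lines. First, `for (i in length(x))` visits only the last index, whereas `seq_len(length(x))` visits all of them. Second, the expressions above test `n == t(n)`, which is also TRUE for mirrored zeros; the helper below (mirror_ones is a hypothetical name, not something from the thread) additionally requires both entries to equal one, which is one reading of the original question.

```r
# Same matrix as in the question, filled column by column.
n <- matrix(c(0, 1, 1,  1, 0, 0,  0, 1, 0), 3, 3,
            dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# length() gives a single number, so the first loop runs exactly once.
visited_len <- c()
for (i in length(rownames(n))) visited_len <- c(visited_len, i)
visited_seq <- c()
for (i in seq_len(nrow(n))) visited_seq <- c(visited_seq, i)

# Rows whose below-diagonal entry and its mirror entry are both 1.
mirror_ones <- function(m) {
  hits <- which(m == 1 & t(m) == 1 & lower.tri(m), arr.ind = TRUE)
  rownames(m)[hits[, "row"]]
}

result <- mirror_ones(n)
print(result)
```

Using lower.tri() reports each mirrored pair once, by the row of its below-diagonal member.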
Re: [R] how to generate a normal distribution with mean=1, min=0.2, max=0.8
Well, but the original poster also refers to 0.2 and 0.8 as "expected min and max", in which case we are back to a joke...

Giovanni

On Thu, 2011-04-28 at 13:06 -0400, David Winsemius wrote:

> On Apr 28, 2011, at 12:09 PM, Ravi Varadhan wrote:
>
>> Surely you must be joking, Mr. Jianfeng.
>
> Perhaps not joking and perhaps not with correct statistical specification. A truncated Normal could be simulated with:
>
>     set.seed(567)
>     x <- rnorm(n=5, m=1, sd=1)
>     xtrunc <- x[x >= 0.2 & x <= 0.8]
>     require(logspline)
>     plot(logspline(xtrunc, lbound=0.2, ubound=0.8, nknots=7))
>
> -- David.
>
>> --- Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology, School of Medicine, Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu
>>
>> -----Original Message-----
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Mao Jianfeng
>> Sent: Thursday, April 28, 2011 12:02 PM
>> To: r-help@r-project.org
>> Subject: [R] how to generate a normal distribution with mean=1, min=0.2, max=0.8
>>
>> Dear all, This is a simple probability problem. I want to know how to generate a normal distribution with mean=1, min=0.2 and max=0.8. I know how to generate a normal distribution with mean = 1 and sd = 1 and with 500 data points:
>>
>>     rnorm(n=500, m=1, sd=1)
>>
>> But I am confused about how to generate a normal distribution with expected min and max. I expect to hear your directions. Thanks in advance. Best, Jian-Feng
>
> David Winsemius, MD West Hartford, CT

-- Giovanni Petris gpet...@uark.edu Associate Professor, Department of Mathematical Sciences, University of Arkansas - Fayetteville, AR 72701 Ph: (479) 575-6324, 575-8630 (fax) http://definetti.uark.edu/~gpetris/
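The rejection approach quoted above discards every draw outside [0.2, 0.8], so the number of values kept is random. An alternative, sketched here under the assumption that a truncated normal is what was wanted, draws uniformly between the CDF values of the two bounds and maps back through the quantile function. The name rtruncnorm_sketch is hypothetical and is not a function from the thread.

```r
# Inverse-CDF sampling from a normal restricted to [lower, upper].
rtruncnorm_sketch <- function(n, mean = 0, sd = 1, lower = -Inf, upper = Inf) {
  p_lo <- pnorm(lower, mean, sd)        # probability mass below the lower bound
  p_hi <- pnorm(upper, mean, sd)        # probability mass below the upper bound
  qnorm(runif(n, p_lo, p_hi), mean, sd) # map uniform draws back to the normal scale
}

set.seed(567)
x <- rtruncnorm_sketch(500, mean = 1, sd = 1, lower = 0.2, upper = 0.8)
print(range(x))
print(mean(x))
```

Because the interval [0.2, 0.8] lies entirely below 1, the sample mean cannot be 1; mean = 1 here is only a parameter of the parent normal, which is the inconsistency the replies are pointing at.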
Re: [R] threshold matrix
Thanks a lot. I finally used

    M2 <- M
    M2[M < thresh] <- 0
    M2[M >= thresh] <- 1

as I noticed that this one line

    M2 <- as.numeric( M[] >= thresh )

vectorizes my matrix.

One more question: I have two matrices that only differ slightly. What would be the easiest way to compare them and find the cells that are not the same?

Best Regards
Alex
Re: [R] Using Java methods in R
How do I obtain a strictly rectangular type-double array (converted to an R 2-dimensional array) from a Java class? I can obtain a 1-dimensional type-double array (vector) or a scalar, but I cannot figure out the two-dimensional case from the instructions. Is .jevalArray also involved? My simple Java test class and R test code follow:

    import java.lang.reflect.Array;

    public class RJavTest {
        public static void main(String[] args) {
            RJavTest rJavTest = new RJavTest();
        }
        public final static String conStg = "testString";
        public final static double con0dbl = 10001;
        public final static double[] con1Arr = new double[]
            { 10001, 10002, 10003, 10004, 10005, 10006 };
        public final static double[][] con2Arr = new double[][] {
            { 10001, 10002, 10003, 10004 },
            { 20001, 20002, 20003, 20004 },
            { 30001, 30002, 30003, 30004 }
        };
        public final static String retConStg() { return(conStg); }
        public final static double retCon0dbl() { return(con0dbl); }
        public final static double[] retCon1Arr() { return(con1Arr); }
        public final static double[][] retCon2Arr() { return(con2Arr); }
    }

    library(rJava)
    .jinit()
    .jaddClassPath("C:/ad/j")
    print(.jclassPath())
    rJavaTst <- .jnew("RJavTest")
    conn1Arr <- .jfield(rJavaTst, sig="[D", "con1Arr")
    print(conn1Arr)
    print(conn1Arr[2])
    conn1ArrRet <- .jcall(rJavaTst, returnSig="[D", "retCon1Arr")
    print(conn1ArrRet)
    print(conn1ArrRet[2])
    conn0dbl <- .jfield(rJavaTst, sig="D", "con0dbl")
    print(conn0dbl)
    ## The above works, but not the following
    conn2Arr <- .jfield(rJavaTst, sig="[[D", "con2Arr")
    print(conn2Arr[2])
    print(conn2Arr[2,3])
    print(conn2Arr)
    arj34Ret <- .jcall(rJavaTst, returnSig="[[D", "arReturnTEST")
    print(arj34Ret)

The latter 2-dim stuff doesn't work.
Re: [R] threshold matrix
On Fri, Apr 29, 2011 at 07:44:59AM -0700, Alaios wrote:

> Thanks a lot. I finally used
>
>     M2 <- M
>     M2[M < thresh] <- 0
>     M2[M >= thresh] <- 1
>
> as I noticed that this one line
>
>     M2 <- as.numeric( M[] >= thresh )
>
> vectorizes my matrix.

Hi. This may be avoided, for example

    M2 <- M
    M2[, ] <- as.numeric(M >= thresh)

or

    array(as.numeric(M >= thresh), dim=dim(M))

> One more question: I have two matrices that only differ slightly. What would be the easiest way to compare them and find the cells that are not the same?

If A and B are matrices of the same dimension, then

    A == B

is a logical matrix with TRUE entries for positions where A and B match exactly.

    abs(A - B) <= eps

is a logical matrix with TRUE entries for positions where A and B differ at most by eps. If you want to get only one logical result, then use

    all(A == B)

for exact equality and

    all(abs(A - B) <= eps)

for approximate equality of all entries. See also ?all.equal, which uses the relative error, not the absolute difference.

Hope this helps.

Petr Savicky.
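The suggestions above can be tried on a small example. This is only a sketch: the matrix, the threshold 0.5, and the tolerances are made-up values for illustration. Multiplying a logical matrix by 1 is one more way to get 0/1 values while keeping the dimensions.

```r
set.seed(1)
M <- matrix(runif(12), nrow = 3)
thresh <- 0.5

# A comparison keeps the matrix shape; multiplying by 1 turns TRUE/FALSE into 1/0.
M2 <- (M >= thresh) * 1

# Two matrices that differ slightly in a single cell.
A <- M
B <- M
B[2, 3] <- B[2, 3] + 1e-3
eps <- 1e-6

exact_same   <- all(A == B)              # exact equality of all cells
within_eps   <- all(abs(A - B) <= eps)   # absolute tolerance
roughly_same <- isTRUE(all.equal(A, B, tolerance = 1e-2))  # relative tolerance

print(dim(M2))
print(c(exact_same, within_eps, roughly_same))
```

all.equal() returns a character description rather than FALSE when the objects differ, which is why it is wrapped in isTRUE().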
[R] Specify custom par(mfrow()) layout for defined plot()
Dear R Users,

I am doing stats::decompose() on 4 different time series. When I issue

    csdA <- decompose(tsA)
    plot(csdA)

I get a summary plot for the observed, trend, seasonal and random components of the decomposed time series tsA. As I understand it, the object returned by decompose() has its own plot method, in which a 4 x 1 layout (mfrow = c(4, 1) or similar) is set up. Now suppose I wanted to wrap those four 4 x 1 plots into my own 2 x 2 layout. How could I achieve this? Is there a general way to handle these cases? Something like a meta par(mfrow)?

Best Regards,
Michael Bach
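No reply to this question appears in this part of the archive. One workaround, offered as a sketch and not as an established answer, is to bypass the plot method and draw each component into a single grid with one column per series. The function name plot_decomp_grid is hypothetical; the component names trend, seasonal and random are those of the object returned by decompose().

```r
# Draw observed/trend/seasonal/random for several series in one grid,
# one column per series. `series` is a named list of ts objects.
plot_decomp_grid <- function(series) {
  decomps <- lapply(series, decompose)
  # Four rows, filled column by column; small margins so the panels fit.
  op <- par(mfcol = c(4, length(series)), mar = c(2, 4, 2, 1))
  on.exit(par(op), add = TRUE)   # restore the layout when the function exits
  for (nm in names(series)) {
    d <- decomps[[nm]]
    plot(series[[nm]], ylab = "observed", main = nm)
    plot(d$trend,      ylab = "trend")
    plot(d$seasonal,   ylab = "seasonal")
    plot(d$random,     ylab = "random")
  }
  invisible(decomps)
}
```

For the four series in the question, the call would be something like plot_decomp_grid(list(A = tsA, B = tsB, C = tsC, D = tsD)), where only tsA is a name taken from the post and the rest are placeholders.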
Re: [R] Still confused about classes
Thanks, to all. I didn't know about either methods() or the package lubridate, which seems like a very nice Date package.

-- Russ

On Fri, Apr 29, 2011 at 1:35 AM, Kenn Konstabel lebats...@gmail.com wrote:

> The function for getting the year from a date is there in package lubridate (as well as many other convenient functions to work with dates).
>
> More generally, finding all methods for a given class may be a little tricky. If "all" means everything you have installed and currently attached to your search path, then methods(class="Date") will do it (for S3 classes). (But: "The functions listed are those which _are named like methods_ and may not actually be methods (known exceptions are discarded in the code).")
>
> The result depends on which packages you have loaded: in my currently open R session, methods(class="Date") lists 36 possible methods, but after library(zoo) I get two more (as.yearmon.Date and as.yearqtr.Date).
>
> Regards, Kenn
>
> On Fri, Apr 29, 2011 at 9:05 AM, Russ Abbott russ.abb...@gmail.com wrote:
>
>> Hi, I'm still confused about how to find out what methods are defined for a given class. For example, I know that
>>
>>     today <- Sys.Date()
>>
>> will produce an object of type Date. But I'm not sure what I can do with Date objects or how I can find out. ?Date refers me to the Date documentation page, but it doesn't tell me how, for example, to extract the current year from a Date object. I tried
>>
>>     year(today)
>>     Error: could not find function "year"
>>
>> Is there some other function that does the job? I want a function f such that f(today) will return 2011. Perhaps there is no such function. But in general I don't have any confidence that I would know how to find it if it existed, or that I would know how to assure myself that there was no such function. Thanks.
>>
>> -- Russ
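For the specific case of the year, base R can do this without an extra package. The sketch below uses a fixed date in place of Sys.Date() so that the result is reproducible; the variable names are illustrative only.

```r
d <- as.Date("2011-04-29")

# format() has a Date method; "%Y" is the four-digit year.
year_num <- as.integer(format(d, "%Y"))

# POSIXlt stores the year as an offset from 1900.
year_lt <- as.POSIXlt(d)$year + 1900

# S3 methods currently visible for class "Date".
date_methods <- methods(class = "Date")

print(year_num)
print(year_lt)
print(length(date_methods))
```

The length of date_methods depends on which packages are attached, as Kenn notes above.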
[R] read.csv fails to read a CSV file from google docs
Hello all,

I wish to use read.csv to read a Google Docs spreadsheet. I tried the following code:

    data_url <- "http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv"
    read.csv(data_url)

which results in the following error:

    Error in file(file, "rt") : cannot open the connection

I'm on Windows 7, and the code was tried on R 2.12 and 2.13. I remember trying this a few months ago and it worked fine. Any suggestion what might be causing this or how to solve it?

Thanks.

Contact Details:
Contact me: tal.gal...@gmail.com | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)
Re: [R] replace non numeric with NA
Thanks a lot Jim, this is perfect!!

Thank you,
Nandini Badarinarayan

Date: Fri, 29 Apr 2011 09:49:26 -0400
From: jmac...@med.umich.edu
To: nandini...@hotmail.com
CC: r-help@r-project.org
Subject: Re: [R] replace non numeric with NA

> Hi Nandini,
>
> On 4/29/2011 6:45 AM, Nandini B wrote:
>
>> Hello, I have a sample data frame which looks like this:
>>
>>       day      od month
>>     1   1     0.1     2
>>     2   3 #VALUE!     1
>>     3   5     0.4    12
>>     4   7     0.8    10
>>     5  11       -     3
>>     6  14       s     7
>>     7  18      --    12
>>     8  27      19     7
>
>     x <- data.frame(day=1:8, od = c(0.1,"#VALUE!",0.4,0.8,"-","s","--",19), month = c(2,1,12,10,3,7,12,7))
>     x
>       day      od month
>     1   1     0.1     2
>     2   2 #VALUE!     1
>     3   3     0.4    12
>     4   4     0.8    10
>     5   5       -     3
>     6   6       s     7
>     7   7      --    12
>     8   8      19     7
>     x$od <- as.numeric(as.character(x$od))
>     Warning message:
>     NAs introduced by coercion
>     x
>       day   od month
>     1   1  0.1     2
>     2   2   NA     1
>     3   3  0.4    12
>     4   4  0.8    10
>     5   5   NA     3
>     6   6   NA     7
>     7   7   NA    12
>     8   8 19.0     7
>
> Best, Jim
>
>> Now I wish to filter all the non-numeric values and replace them with NA. The data frame is actually huge and the non-numeric entries vary from "-" to a string to absolutely anything! Can anyone please help?
>>
>> Thank you, Warm Regards, Nandini
>
> -- James W. MacDonald, M.S. Biostatistician, Douglas Lab, University of Michigan Department of Human Genetics, 5912 Buhl, 1241 E. Catherine St., Ann Arbor MI 48109-5618 734-615-7826
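Jim's approach can be wrapped in a small helper when many columns need the same treatment. This is a sketch: to_num is a hypothetical name, and suppressWarnings() is used only because the coercion warning is expected here.

```r
x <- data.frame(day = 1:8,
                od = c(0.1, "#VALUE!", 0.4, 0.8, "-", "s", "--", 19),
                month = c(2, 1, 12, 10, 3, 7, 12, 7),
                stringsAsFactors = FALSE)

# Convert via character so that factor columns are handled too;
# anything that does not parse as a number becomes NA.
to_num <- function(v) suppressWarnings(as.numeric(as.character(v)))

x$od <- to_num(x$od)
print(x)
```

One caveat: strings such as "1e3" or "Inf" do parse as numbers, so they are kept and not replaced with NA.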
Re: [R] setting options only inside functions
The Python solution does not extend, at least not cleanly, to things like dev on / dev off or to Hadley's locale example. In any case, if I am reading the Python source correctly on how they handle user interrupts, this solution has the same non-robustness to user interrupts that Bill's initial solution had.

As a basis I believe what we need is a mechanism that handles a setup, an action, and a cleanup, with setup and cleanup occurring with interrupts disabled and the action with interrupts enabled. Scheme's dynamic-wind is similar, though I don't believe the Scheme standard addresses interrupts, and we don't need to worry about continuations, but some of the issues are similar. Probably we would want two flavors: one in which the action has to be a function that takes as a single argument the result produced by the setup code, and one in which the action can be an argument expression that is then evaluated at the appropriate place by lazy evaluation. This can be done at the R level except for the controlling of interrupts (and possibly other asynchronous stuff); that would need a new pair of primitives (suspendInterrupts/enableInterrupts or something like that). There is something in the Haskell literature on this that I looked at a while back; probably time to have another look.

On Thu, 28 Apr 2011, Jonathan Daily wrote:

> I would also love to see this implemented in R, as my current solution to the issue of doing tons of open/close, dev/dev.off, etc. is to use snippets in my IDE, and in the end I feel like it is a hack job. A pythonic "with" function would also solve most of the situations where I have had to use awkward try or tryCatch calls. I would be willing to help with this project, even if it is just testing.
>
> On Wed, Apr 27, 2011 at 5:43 PM, Barry Rowlingson b.rowling...@lancaster.ac.uk wrote:
>
>>> but it's a little clumsy, because with_connection(file("myfile.txt"), {do stuff...}) isn't very useful because you have no way to reference the connection that you're using. Ruby's blocks have arguments which would require big changes to R's syntax. One option would be to use pronouns:
>>
>> Looking very much like python 'with' statements: http://effbot.org/zone/python-with-statement.htm
>> Implemented via the 'with' statement, which can operate on anything that has an __enter__ and an __exit__ method. Very neat.
>>
>> Barry

-- Luke Tierney
Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa, Department of Statistics and Actuarial Science
241 Schaeffer Hall, Iowa City, IA 52242
Phone: 319-335-3386 Fax: 319-335-3017
email: l...@stat.uiowa.edu WWW: http://www.stat.uiowa.edu
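The first flavor described above (setup, then an action that receives the setup result, then cleanup) can be sketched at the R level with on.exit(). The name with_setup is hypothetical, and this sketch does nothing about the interrupt problem that is the main point of the message: an interrupt arriving between setup and the on.exit() registration could still skip the cleanup.

```r
# setup:   function() returning a resource (or the state to restore)
# action:  function(resource) doing the work
# cleanup: function(resource) releasing or restoring
with_setup <- function(setup, action, cleanup) {
  resource <- setup()
  on.exit(cleanup(resource), add = TRUE)  # runs on normal return and on error
  action(resource)
}

# Example: change an option only for the duration of the action.
old_digits <- getOption("digits")
res <- with_setup(
  setup   = function() options(digits = 3),  # options() returns the old values
  action  = function(old) format(pi),
  cleanup = function(old) options(old)
)
print(res)
print(identical(getOption("digits"), old_digits))
```

The same function covers the connection case raised in the thread, for example with setup = function() file(path, "r") and cleanup = close, where path is whatever file is being read.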
Re: [R] threshold matrix
On Apr 29, 2011, at 10:44 AM, Alaios wrote:

> Thanks a lot. I finally used
>
>     M2 <- M
>     M2[M < thresh] <- 0
>     M2[M >= thresh] <- 1
>
> as I noticed that this one line
>
>     M2 <- as.numeric( M[] >= thresh )
>
> vectorizes my matrix.
>
> One more question: I have two matrices that only differ slightly. What would be the easiest way to compare them and find the cells that are not the same?

    M[!M==N]
    N[!M==N]

> Best Regards
> Alex

David Winsemius, MD West Hartford, CT
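The two expressions above return the differing values but not where they are. A small sketch, with made-up matrices, of getting the row and column positions as well:

```r
M <- matrix(1:12, nrow = 3)
N <- M
N[2, 3] <- 0L    # change two cells so the matrices differ
N[3, 1] <- 99L

# Row/column positions of the cells that differ.
idx <- which(M != N, arr.ind = TRUE)

# Positions side by side with the value from each matrix.
diffs <- cbind(idx, M = M[idx], N = N[idx])
print(diffs)
```

If either matrix contains NA, `M != N` is NA at those cells and which() skips them, so missing values need separate handling.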
Re: [R] setting options only inside functions
In Python, opening a connection using "with" allows for a temporary assignment using "as". So:

    with file("/path/to/file") as con:
        permanent_object = function(con)

would provide the return value of function(con) globally, but close con. If function(con) causes an error, con is still closed.

I agree with your description of what the function would need to do. Would it make sense to make it generic and define default methods for different setups? E.g. using the current with/within when it is a data.frame/environment, evaluating it when it is a function, etc.

On Fri, Apr 29, 2011 at 12:34 PM, luke-tier...@uiowa.edu wrote:

> The Python solution does not extend, at least not cleanly, to things like dev on / dev off or to Hadley's locale example. [...]

-- === Jon Daily Technician === #!/usr/bin/env outside # It's great, trust me.
Re: [R] how to generate a normal distribution with mean=1, min=0.2, max=0.8
On Fri, 29 Apr 2011, Giovanni Petris wrote:

> Well, but the original poster also refers to 0.2 and 0.8 as "expected min and max", in which case we are back to a joke...

Well, he is a lot better with English than I am with Mandarin. He seemed to like the truncated normal answers, so we'll let those be his answers.

It is possible to choose parameters for a normal distribution with 500 observations such that the expected value of the maximum is .8 and the expected value of the minimum is .2. Obviously, the mean would be .5, not 1, but what would the variance then have to be to provide the correct expected max and min? That's another legitimate question.

Mike
Re: [R] read.csv fails to read a CSV file from google docs
On Apr 29, 2011, at 11:19 AM, Tal Galili wrote:

> Hello all, I wish to use read.csv to read a Google Docs spreadsheet. I tried the following code:
>
>     data_url <- "http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv"
>     read.csv(data_url)
>
> which results in the following error:
>
>     Error in file(file, "rt") : cannot open the connection
>
> I'm on Windows 7, and the code was tried on R 2.12 and 2.13. I remember trying this a few months ago and it worked fine.

I am always amused at such claims. Occasionally they are correct, but more often a crucial step has been omitted. In this case you have at a minimum embedded line-feeds in your URL string and have not established a connection, so it could not possibly have succeeded as presented. But now it's time to admit I do not know why it is not succeeding when I correct those flaws.

    closeAllConnections()
    data_url <- url("http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv")
    read.csv(data_url)
    Error in open.connection(file, "rt") : cannot open the connection

    closeAllConnections()
    dd <- read.csv(con <- url("http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv"))
    Error in open.connection(file, "rt") : cannot open the connection

So, I guess I'm not reading the help pages for `url` and `read.csv` as well as I thought I was.

> Any suggestion what might be causing this or how to solve it?

-- David Winsemius, MD West Hartford, CT
Re: [R] sub-matrix block size
Dear Rxperts,

Can Jordan decomposition of submatrices be useful to determine the size of sub-blocks? (http://en.wikipedia.org/wiki/Jordan_normal_form) Thanks for the ideas/suggestions.

I have another similar situation, where at least one of the off-diagonal elements of the lower-triangle submatrices (as mentioned in the previous example) may be zero, and based on visual inspection the block sizes of those square submatrices should be the same as in the previous example. How do I resolve this one?

    m1 <- structure(c(1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), .Dim = c(11L, 11L))

Also, in the vector below, is there a simple way to separate out contiguous blocks (for identification purposes)? Please see the inserted 0 in the vector below to identify the next block:

    rowSums(m) + colSums(m) - 1
    [1] 2 2 1 -1 3 3 3 0 3 3 3 -1
    # the elements in this vector are the TRUE sizes of submatrices
    # (zero is inserted to separate contiguous blocks of same size)

Regards, Santosh

On Wed, Apr 27, 2011 at 6:41 AM, Santosh santosh2...@gmail.com wrote:

> Thanks, David! That is another interesting perspective on the (sub/super) diagonal story! For now I was looking only at block sizes of lower-triangle submatrices, as Dennis suggested. Regards, Santosh
>
> On Wed, Apr 27, 2011 at 5:57 AM, David Winsemius dwinsem...@comcast.net wrote:
>
>> On Apr 27, 2011, at 12:07 AM, Dennis Murphy wrote:
>>
>>> Hi: Maybe this can help get you started. Reading your data into a matrix m,
>>>
>>>     m <- structure(c(1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), .Dim = c(11L, 11L))
>>>     rowSums(m) + colSums(m) - 1
>>>     [1] 2 2 1 -1 3 3 3 3 3 3 -1
>>>
>>> The pair of 2's = a 2 x 2 block, 1 = a 1 x 1 matrix with value 1, -1 = a 1 x 1 matrix with entry 0, a triplet of 3's = a 3 x 3 subblock, etc. You should be able to figure out the rows and columns for each submatrix from the indices of the vector above; the values provide an indication of matrix size as well as position.
>>
>> If we are in the stage of providing potentially useful but incomplete ideas, this would be my notion. Use the row and col functions with [ to locate non-zero elements in the diagonal and subdiagonal. (My matrix was named `mm`.)
>>
>> Diagonal:
>>
>>     mm[row(mm)==col(mm)]
>>     [1] 1 1 1 0 1 1 1 1 1 1 0
>>
>> First subdiagonal:
>>
>>     mm[row(mm)==col(mm)+1]
>>     [1] 0 0 0 0 0 0 0 0 0 0
>>
>> First superdiagonal:
>>
>>     mm[row(mm)==col(mm)-1]
>>     [1] 1 0 0 0 1 1 0 1 1 0
>>
>> Perhaps a combination of the two? It seems as though the rowSums/colSums approach might be insensitive to whether triangular blocks were sub- or super-diagonal:
>>
>>     rowSums(mm) + colSums(mm) - 1
>>     [1] 2 2 1 -1 3 3 3 3 3 3 -1
>>     mm[1,2] <- 0
>>     mm[2,1] <- 1
>>     rowSums(mm) + colSums(mm) - 1
>>     [1] 2 2 1 -1 3 3 3 3 3 3 -1
>>
>>> HTH, Dennis
>>>
>>> On Tue, Apr 26, 2011 at 5:13 PM, Santosh santosh2...@gmail.com wrote:
>>>
>>>> Dear Rxperts,
>>>>
>>>> Below is a small vector of zero and non-zero values. I was wondering if there is an efficient way to get the block sizes of submatrices of a big matrix similar to the one shown below? Diagonal elements can be zero too. Rows with only a diagonal element may be considered as a unit block size.
>>>>
>>>>     c(1,0,0,0,0,0,0,0,0,0,0,
>>>>       1,1,0,0,0,0,0,0,0,0,0,
>>>>       0,0,1,0,0,0,0,0,0,0,0,
>>>>       0,0,0,0,0,0,0,0,0,0,0,
>>>>       0,0,0,0,1,0,0,0,0,0,0,
>>>>       0,0,0,0,1,1,0,0,0,0,0,
>>>>       0,0,0,0,1,1,1,0,0,0,0,
>>>>       0,0,0,0,0,0,0,1,0,0,0,
>>>>       0,0,0,0,0,0,0,1,1,0,0,
>>>>       0,0,0,0,0,0,0,1,1,1,0,
>>>>       0,0,0,0,0,0,0,0,0,0,0)
>>>>
>>>> Thanks much! Santosh
>>
>> David Winsemius, MD West Hartford, CT
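For the question about separating contiguous blocks of the same size, one possible sketch is to walk along the vector and jump ahead by each block's size. It assumes, as in Dennis's first example, that every position inside a block carries that block's size and that values below 1 mark a 1 x 1 block. It therefore does not handle the second matrix m1, where a missing off-diagonal entry changes the sums. The function name block_sizes is hypothetical.

```r
# v: result of rowSums(m) + colSums(m) - 1 for a fully filled block structure
block_sizes <- function(v) {
  sizes <- numeric(0)
  i <- 1
  while (i <= length(v)) {
    s <- max(v[i], 1)       # values below 1 (e.g. -1) mean a 1 x 1 block
    sizes <- c(sizes, s)
    i <- i + s              # skip to the first row of the next block
  }
  sizes
}

v <- c(2, 2, 1, -1, 3, 3, 3, 3, 3, 3, -1)
sizes  <- block_sizes(v)
starts <- cumsum(c(1, head(sizes, -1)))   # first row of each block
print(sizes)
print(starts)
```

Because the walk advances by the block size, two adjacent 3 x 3 blocks are reported separately without needing a marker value inserted between them.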
Re: [R] how to generate a normal distribution with mean=1, min=0.2, max=0.8
On Apr 29, 2011, at 1:29 PM, Mike Miller wrote:

> On Fri, 29 Apr 2011, Giovanni Petris wrote:
>
>> Well, but the original poster also refers to 0.2 and 0.8 as "expected min and max", in which case we are back to a joke...
>
> Well, he is a lot better with English than I am with Mandarin. He seemed to like the truncated normal answers, so we'll let those be his answers.
>
> It is possible to choose parameters for a normal distribution with 500 observations such that the expected value of the maximum is .8 and the expected value of the minimum is .2. Obviously, the mean would be .5, not 1, but what would the variance then have to be to provide the correct expected max and min? That's another legitimate question.

You would need to specify an N, since the expected first and last order statistics would decrease/increase with increasing N.

-- David.

> Mike

David Winsemius, MD West Hartford, CT
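Mike's question can be answered numerically once N is fixed. The sketch below assumes N = 500 and mean 0.5, as in his message, and uses the standard density of the maximum of n independent standard normals, n * dnorm(x) * pnorm(x)^(n - 1). The function name expected_max_std is hypothetical.

```r
# Expected maximum of n independent standard normal draws,
# by numerical integration of x times the density of the maximum.
expected_max_std <- function(n) {
  integrand <- function(x) x * n * dnorm(x) * pnorm(x)^(n - 1)
  integrate(integrand, lower = -Inf, upper = Inf)$value
}

n   <- 500
e_n <- expected_max_std(n)

# With mean 0.5, E[max] = 0.5 + sd * e_n = 0.8, so:
sd_needed <- (0.8 - 0.5) / e_n

print(e_n)
print(sd_needed)
```

By symmetry the expected minimum is then 0.5 - sd_needed * e_n = 0.2. The value of e_n grows with n, which is the dependence on N that David points out.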
[R] The bin/R file - hardcoded paths
Hello,

I notice that e.g. /home/sguha/lib64 is hard-coded into the bin/R file. I installed R as

  ./configure --prefix=$HOME ...

What I need to do is ship the entire R distribution to remote nodes and run R there. The files are shipped to ephemeral directories, so I don't know the path ahead of time. Setting R_HOME doesn't change things either. So I guess one can't run R on a system unless it has been installed there?

1. I can't install R on the compute nodes using ./configure
2. All nodes do have the same architecture
3. I would like to stick to the 'shipping' approach.

Thanks
Saptarshi
Re: [R] read.csv fails to read a CSV file from google docs
-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of David Winsemius Sent: Friday, April 29, 2011 10:36 AM To: Tal Galili Cc: r-help@r-project.org Subject: Re: [R] read.csv fails to read a CSV file from google docs On Apr 29, 2011, at 11:19 AM, Tal Galili wrote: Hello all, I wish to use read.csv to read a google doc spreadsheet. I try using the following code: data_url - http://spreadsheets0.google.com/spreadsheet/pub?hl=enhl=enke y=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REEsingle=truegid =0output=csv read.csv(data_url) Which results in the following error: Error in file(file, rt) : cannot open the connection With S+ I get: S+ download.file(http://spreadsheets0.google.com/spreadsheet/pub?hl=enhl= enkey=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REEsingle=truegid=0ou tput=csv, destfile=e:/temp/splus) Problem in download.file(http://spreadsheets0.google.com/spreadsheet/pu..: Could not get url: un supported protocol, libcurl was built with SSL disabled, https: not supported! and with cygwin's wget I get E:\temp\jnkwget http://spreadsheets0.google.com/spreadsheet/pub?hl=enhl=enkey=0AgMhDT Vek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REEsingle=truegid=0outpu t=csv --2011-04-29 11:00:10-- http://spreadsheets0.google.com/spreadsheet/pub?hl=enhl=enkey=0AgMhDTV ek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REEsingle=truegid= 0output=csv Resolving spreadsheets0.google.com... 74.125.224.73, 74.125.224.71, 74.125.224.64, ... Connecting to spreadsheets0.google.com|74.125.224.73|:80... connected. HTTP request sent, awaiting response... 302 Moved Temporarily Location: https://spreadsheets0.google.com/spreadsheet/pub?hl=enhl=enkey=0AgMhDT Vek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REEsingle=truegid=0output=csv [ following] --2011-04-29 11:00:11-- https://spreadsheets0.google.com/spreadsheet/pub?hl=enhl=enkey=0AgMhDT Vek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REEsingle=truegid =0output=csv Connecting to spreadsheets0.google.com|74.125.224.73|:443... connected. 
ERROR: cannot verify spreadsheets0.google.com's certificate, issued by `/C=US/O=Google Inc/CN=Google Internet Authority': Unable to locally verify the issuer's authority. To connect to spreadsheets0.google.com insecurely, use `--no-check-certificate'. Unable to establish SSL connection.

so I suspect that the SSL/certificate business may also be the problem when using R to get the document. The R error message is not very illuminating.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> > I'm on windows 7. And the code was tried on R 2.12 and 2.13. I remember trying this a few months ago and it worked fine.
>
> I am always amused at such claims. Occasionally they are correct, but more often a crucial step has been omitted. In this case you have at a minimum embedded line-feeds in your URL string and have not established a connection, so it could not possibly have succeeded as presented. But now it's time to admit I do not know why it is not succeeding when I correct those flaws.
>
>   closeAllConnections()
>   data_url <- url("http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv")
>   read.csv(data_url)
>   Error in open.connection(file, "rt") : cannot open the connection
>
>   closeAllConnections()
>   dd <- read.csv(con <- url("http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv"))
>   Error in open.connection(file, "rt") : cannot open the connection
>
> So, I guess I'm not reading the help pages for `url` and `read.csv` as well as I thought I was.
>
> > Any suggestion what might be causing this or how to solve it?
>
> -- David Winsemius, MD West Hartford, CT
Re: [R] read.csv fails to read a CSV file from google docs
Thanks David for fixing the early issues.

The reason for the failure is that the response from the Web server is to redirect the requester to another page, specifically

  https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv

Note that this is https, not http, and the built-in URL reading facilities in R don't support https.

One way to see this is to look at the headers in your browser (e.g. Live HTTP Headers), or to use curl, or the RCurl package:

  tt = getForm("http://spreadsheets0.google.com/spreadsheet/pub", hl = "en", key = "0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE", single = "true", gid = "0", output = "csv", .opts = list(followlocation = TRUE, verbose = TRUE))

The verbose option shows the entire dialog, and tt contains the text of the CSV document.

  read.csv(textConnection(tt))

then yields the data frame.

  D.

On 4/29/11 10:36 AM, David Winsemius wrote:

> On Apr 29, 2011, at 11:19 AM, Tal Galili wrote:
>
>> Hello all,
>> I wish to use read.csv to read a google doc spreadsheet. I try using the following code:
>>
>>   data_url <- "http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv"
>>   read.csv(data_url)
>>
>> Which results in the following error:
>>
>>   Error in file(file, "rt") : cannot open the connection
>>
>> I'm on windows 7. And the code was tried on R 2.12 and 2.13. I remember trying this a few months ago and it worked fine.
>
> I am always amused at such claims. Occasionally they are correct, but more often a crucial step has been omitted. In this case you have at a minimum embedded line-feeds in your URL string and have not established a connection, so it could not possibly have succeeded as presented. But now it's time to admit I do not know why it is not succeeding when I correct those flaws.
>
>   closeAllConnections()
>   data_url <- url("http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv")
>   read.csv(data_url)
>   Error in open.connection(file, "rt") : cannot open the connection
>
>   closeAllConnections()
>   dd <- read.csv(con <- url("http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv"))
>   Error in open.connection(file, "rt") : cannot open the connection
>
> So, I guess I'm not reading the help pages for `url` and `read.csv` as well as I thought I was.
>
>> Any suggestion what might be causing this or how to solve it?
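The two steps described in this reply (fetch the text while following the redirect, then parse it from memory) could be wrapped roughly as follows. This is a sketch, not a tested recipe for the Google URL: the helper names csv_text_to_df and fetch_csv_rcurl are made up here, and it assumes RCurl is installed and can negotiate https on your machine.

```r
# Parse CSV text that is already in memory, for example the body of an
# HTTP response. (Hypothetical helper name.)
csv_text_to_df <- function(txt) {
  con <- textConnection(txt)
  on.exit(close(con))
  read.csv(con)
}

# Hypothetical helper: fetch a URL with RCurl, following redirects such as
# the 302 from http to https, then parse the returned text as CSV.
fetch_csv_rcurl <- function(u) {
  if (!requireNamespace("RCurl", quietly = TRUE)) {
    stop("the RCurl package is required")
  }
  txt <- RCurl::getURL(u, followlocation = TRUE)
  csv_text_to_df(txt)
}

# Offline demonstration of the parsing step only
demo <- csv_text_to_df("a,b\n1,2\n3,4\n")
```

Only the parsing step is exercised above; fetch_csv_rcurl() needs network access and is left uncalled.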
Re: [R] read.csv fails to read a CSV file from google docs
On Fri, Apr 29, 2011 at 06:19:24PM +0300, Tal Galili wrote: data_url - http://spreadsheets0.google.com/spreadsheet/pub?hl=enhl=enkey=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REEsingle=truegid=0output=csv read.csv(data_url) Error in file(file, rt) : cannot open the connection I get the same error (R 2.11.1, Debian LINUX) and don't have a solution. But I did some tests and found the origin of the problem I can download the file from google with wget but get some interesting ´information in the process: $ wget -v 'http://spreadsheets0.google.com/spreadsheet/pub?hl=enhl=enkey=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REEsingle=truegid=0output=csv' --2011-04-29 20:07:40-- http://spreadsheets0.google.com/spreadsheet/pub?hl=enhl=enkey=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REEsingle=truegid=0output=csv Resolving spreadsheets0.google.com... 209.85.148.139, 209.85.148.113, 209.85.148.138, ... Connecting to spreadsheets0.google.com|209.85.148.139|:80... connected. HTTP request sent, awaiting response... 302 Moved Temporarily Location: https://spreadsheets0.google.com/spreadsheet/pub?hl=enhl=enkey=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REEsingle=truegid=0output=csv [following] --2011-04-29 20:07:41-- https://spreadsheets0.google.com/spreadsheet/pub?hl=enhl=enkey=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REEsingle=truegid=0output=csv Connecting to spreadsheets0.google.com|209.85.148.139|:443... connected. HTTP request sent, awaiting response... 200 OK Length: unspecified [text/plain] Saving to: “pub?hl=enhl=enkey=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REEsingle=truegid=0output=csv.1” [ = ] 41 --.-K/s in 0s 2011-04-29 20:07:42 (342 KB/s) - “pub?hl=enhl=enkey=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REEsingle=truegid=0output=csv.1” saved [41] The message that caught my attention was the http redirection: 302 Moved Temporarily. 
If you try again with the new URL you get this:

  > read.csv(url("https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&g"))
  Error in open.connection(file, "rt") : cannot open the connection
  In addition: Warning message:
  In open.connection(file, "rt") : unsupported URL scheme

?url told me:

  Note that 'https://' connections are not supported.

Case closed, problem unsolved...

Dirty workaround: use system() and wget, or whatever command is available on Windows for this.

cu
Philipp

--
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
Maximus-von-Imhof-Forum 3
85354 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/
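A slightly less dirty variant of that workaround is to let download.file() call the external program and then read the saved file. The sketch below assumes a wget binary is on the PATH; the helper name fetch_csv_external is made up for illustration.

```r
# Hypothetical helper: download to a temporary file with an external tool,
# then read the file as CSV. method = "wget" asks download.file() to shell
# out to wget, which can follow the redirect to https.
fetch_csv_external <- function(u, method = "wget") {
  tmp <- tempfile(fileext = ".csv")
  on.exit(unlink(tmp))
  status <- download.file(u, destfile = tmp, method = method, quiet = TRUE)
  if (status != 0) stop("download failed with status ", status)
  read.csv(tmp)
}
```

Whether wget accepts the server certificate is a separate matter, as Bill Dunlap's log in this thread shows.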
[R] mlogit package, Error in X[omitlines, ] - NA : subscript out of bounds
I am using the mlogit package and have run into a data problem for which I can't find any clue in the R archives. The code below shows everything up to the error:

  #---
  mydata <- data.frame(dependent,x,y,z)
  mydata$dependent <- as.factor(mydata$dependent)
  mldata <- mlogit.data(mydata, varying=NULL, choice="dependent", shape="wide")
  summary(mlogit.1 <- mlogit(dependent~1|x+y+z, data = mldata, reflevel="0"))
  Error in X[omitlines, ] <- NA : subscript out of bounds
  #---

Could anybody kindly give me a tip on how I could solve this problem?

Thank you
yong
[R] logistic regression with glm: cooks distance and dfbetas are different compared to SPSS output
Hi there,

My problem is that I am not able to reproduce the SPSS residual statistics (dfbeta and Cook's distance) with a simple binary logistic regression model obtained in R via the glm function. I tried the following:

  fit <- glm(y ~ x1 + x2 + x3, data, family=binomial)
  cooks.distance(fit)
  dfbetas(fit)

When I compare the returned values with the values that I get in SPSS, they are different, although the same model is calculated (the coefficients are the same, etc.). It seems that SPSS and R use different formulas for cooks.distance and dfbetas. Unfortunately I could not find out what the difference in the calculation is, or how I could get R to calculate the same statistics that SPSS uses. Or is this an unknown SPSS bug?

Greetings
Jürgen
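One way to narrow this down is to reproduce R's own numbers by hand and then compare each ingredient with the formulas in the SPSS documentation. The sketch below uses simulated data and makes no claim about what SPSS actually computes; the last line is only a hypothetical variant for comparison.

```r
# Simulated stand-in for the poster's data (all values made up)
set.seed(42)
n <- 100
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(0.5 * dat$x1 - 0.3 * dat$x2))

fit <- glm(y ~ x1 + x2 + x3, data = dat, family = binomial)

# R offers both raw and standardized coefficient changes; a mismatch with
# another package can come simply from comparing one against the other.
raw_change <- dfbeta(fit)    # approximate change in each coefficient per case
std_change <- dfbetas(fit)   # the same changes on a standardized scale

# Cook's distance as R computes it for a glm: Pearson residuals, hat values,
# the dispersion (1 for binomial) and the number of coefficients p.
pr  <- residuals(fit, type = "pearson")
h   <- hatvalues(fit)
p   <- fit$rank
phi <- summary(fit)$dispersion
cook_manual <- (pr / (1 - h))^2 * h / (phi * p)

# Hypothetical variant without the division by p, for comparison only
cook_unscaled <- cook_manual * p
```

If one of these variants matches the SPSS column, the difference is a matter of definition rather than a bug.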
Re: [R] replace non numeric with NA
Thanks a lot Duncan, this is what I was looking for!!

Thank you,
Nandini

Date: Fri, 29 Apr 2011 09:53:06 -0400
From: murdoch.dun...@gmail.com
To: nandini...@hotmail.com
CC: r-help@r-project.org
Subject: Re: [R] replace non numeric with NA

On 29/04/2011 6:45 AM, Nandini B wrote:

> Hello,
>
> I have a sample data frame which looks like this:
>
>   day od month
>   1 1 0.12
>   2 3 #VALUE! 1
>   3 5 0.4 12
>   4 7 0.8 10
>   5 11 - 3
>   6 14 s 7
>   7 18 -- 12
>   8 27 197
>
> Now I wish to filter all the non-numeric values and replace them with NA. The data frame is actually huge and the non-numeric characters vary from - to a string to absolutely anything!!! Can anyone please help?

You don't tell us the types of the columns, so I'll assume they are factors. If so, call as.numeric(as.character()) on each of them to convert the number-like values to numbers, the others to NA. For example,

  df$day <- as.numeric(as.character(df$day))

Duncan Murdoch
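For a huge data frame the same conversion can be applied to every column in one go. A minimal sketch, using made-up data loosely modelled on the post (the helper name to_numeric is hypothetical):

```r
# Made-up example data; columns are factors, as Duncan assumes
df <- data.frame(
  day   = c("1", "3", "5", "7", "11", "14", "18", "27"),
  od    = c("0.12", "#VALUE!", "0.4", "0.8", "-", "s", "--", "197"),
  month = c("1", "1", "12", "10", "3", "7", "12", "x"),
  stringsAsFactors = TRUE
)

# Convert via character so factor codes are not used by mistake; anything
# that does not parse as a number becomes NA. The coercion warning is
# expected here, so it is suppressed.
to_numeric <- function(x) suppressWarnings(as.numeric(as.character(x)))

# df[] <- keeps the data.frame structure while replacing every column
df[] <- lapply(df, to_numeric)
```

If only some columns need converting, index them on both sides, e.g. df[cols] <- lapply(df[cols], to_numeric).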
[R] Regression Summary for a List
Hi,

I am trying to run a regression on two matrices with 10 columns. I have been able to run the regression with the following code:

  fit=list()
  for(i in 1:10) { fit[[i]]=lm(monret[,i]~janret[,i]) }

However, I can't get the regression to spit out more than the coefficients (summary(fit) does not work). I really need the full summary for each of the 10 regressions, including the R-squared values. I'm sure there's a simple way to do this; I just can't seem to figure it out. Thanks.

-Ryan
[R] Trying to get RWeka/Snowball to work
Hi!

I was trying to install RWeka to be able to use SnowballStemmer in a Mac OS X 10.6.7 environment, but couldn't do it. I get error messages after:

  > library(RWeka); install(Snowball);
  > ## Test the supplied vocabulary for the default stemmer ('porter'):
  > source <- readLines(system.file("words", "porter", "voc.txt",
  +                                 package = "Snowball"))
  > result <- SnowballStemmer(source)
  Error in .jnew(name) : java.lang.InternalError: Can't start the AWT because Java was started on the first thread. Make sure StartOnFirstThread is not specified in your application's Info.plist or on the command line
  > target <- readLines(system.file("words", "porter", "output.txt",
  +                                 package = "Snowball"))
  > ## Any differences?
  > any(result != target)
  Error: object 'result' not found
  Trying to add database driver (JDBC): RmiJdbc.RJDriver - Warning, not in CLASSPATH?
  Trying to add database driver (JDBC): jdbc.idbDriver - Warning, not in CLASSPATH?
  Trying to add database driver (JDBC): org.gjt.mm.mysql.Driver - Warning, not in CLASSPATH?
  Trying to add database driver (JDBC): com.mckoi.JDBCDriver - Warning, not in CLASSPATH?
  Trying to add database driver (JDBC): org.hsqldb.jdbcDriver - Warning, not in CLASSPATH?

Well, after searching around, I decided to take the matter into my own hands. Not ideal, but it fits my small purpose for now; I will possibly expand it later: http://holme.se/stem/

:)
Peter

--
+47 920 42 782
[R] Problem with qualitative variables in anova
Hi,

I am a newbie in R programming and I need some help. I have two columns: the first has 1000 values Y/N/U and the other has f/m. Like this:

  q7 sex
  ==
  U  m
  U  f
  U  m
  N  f

I want to do a one-way ANOVA, parametric and non-parametric, but I have some problems. Code:

  frameq7 <- data.frame(q7,sex)
  frameq7
  r <- aov(q7 ~ sex, data = frameq7)
  summary(r)

I get:

  Error in storage.mode(y) <- "double" : invalid to change the storage mode of a factor
  In addition: Warning message:
  In model.response(mf, "numeric") : using type="numeric" with a factor response will be ignored

Could you please help me get this right? And finally, how can I present this analysis? With a boxplot?

Thanks a lot

--
View this message in context: http://r.789695.n4.nabble.com/Problem-with-qualitative-variables-in-anova-tp3483845p3483845.html
Sent from the R help mailing list archive at Nabble.com.
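The error arises because aov() needs a numeric response, while q7 is a factor. When both variables are categorical, one common alternative is a contingency table with a chi-squared test of independence. That is a different analysis from ANOVA, so whether it answers the poster's question is an assumption. A sketch on simulated answers:

```r
# Simulated stand-in for the poster's two columns (values are made up)
set.seed(7)
frameq7 <- data.frame(
  q7  = factor(sample(c("Y", "N", "U"), 1000, replace = TRUE)),
  sex = factor(sample(c("f", "m"), 1000, replace = TRUE))
)

# Cross-tabulate the two categorical variables and test for association
tab <- table(frameq7$q7, frameq7$sex)
res <- chisq.test(tab)

# Column proportions: the share of each answer within each sex
prop <- prop.table(tab, margin = 2)

# A grouped bar chart suits this kind of data better than a boxplot:
# barplot(prop, beside = TRUE, legend.text = TRUE)
```

res$p.value holds the p-value, and res$expected the expected counts used to judge whether the chi-squared approximation is reasonable.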
Re: [R] Questions about lrm, validate, pentrace
(11/04/29 22:09), Frank Harrell wrote:

> Yes I would select that as the final model.

Thank you for your comment. I am able to be confident about my model now.

> The difference you saw is caused by different treatment of penalization of factor variables, related to the use of the sum of squared differences between the estimate at one category and the average over all categories. I think that as long as you code it one way consistently and pick the penalty using that coding you are OK. But if the coefficients of the non-factor variables depend on how the binary predictor is coded, there is a bit more concern.

A lot of previous studies have demonstrated that poor outcome is more frequent in treat2 than in treat1. So, I coded treat1 as 0 and treat2 as 1 in the first mail. Then, I came back to the original coding of treat1 and treat2 in the newer mail. According to your answer, I guess I am OK. :-)

Prof Harrell, your book (Regression Modeling Strategies) and many kind comments helped me a lot. Thank you very much again.

-- KH

> Frank
>
> 細田弘吉 wrote:
>
>> Thank you for your quick reply, Prof. Harrell. According to your advice, I ran pentrace using a very wide range.
>>
>>   pentrace.x6factor <- pentrace(x6factor.lrm, seq(0, 100, by=0.5))
>>   plot(pentrace.x6factor)
>>
>> I attached this figure. Then,
>>
>>   pentrace.x6factor <- pentrace(x6factor.lrm, seq(0, 10, by=0.05))
>>
>> It seems reasonable that the best penalty is 2.55.
  x6factor.lrm.pen <- update(x6factor.lrm, penalty=2.55)
  cbind(coef(x6factor.lrm), coef(x6factor.lrm.pen), abs(coef(x6factor.lrm)-coef(x6factor.lrm.pen)))

                       [,1]        [,2]        [,3]
  Intercept     -4.32434556 -3.86816460 0.456180958
  stenosis      -0.01496757 -0.01091755 0.004050025
  T1             3.04248257  2.42443034 0.618052225
  T2            -0.75335619 -0.57194342 0.181412767
  procedure     -1.20847252 -0.82589263 0.382579892
  ClinicalScore  0.37623189  0.30524628 0.070985611

  validate(x6factor.lrm, bw=F, B=200)

            index.orig training    test optimism index.corrected   n
  Dxy           0.6324   0.6849  0.5955   0.0894          0.5430 200
  R2            0.3668   0.4220  0.3231   0.0989          0.2679 200
  Intercept     0.0000   0.0000 -0.1924   0.1924         -0.1924 200
  Slope         1.0000   1.0000  0.7796   0.2204          0.7796 200
  Emax          0.0000   0.0000  0.0915   0.0915          0.0915 200
  D             0.2716   0.3229  0.2339   0.0890          0.1826 200
  U            -0.0192  -0.0192  0.0243  -0.0436          0.0243 200
  Q             0.2908   0.3422  0.2096   0.1325          0.1582 200
  B             0.1272   0.1171  0.1357  -0.0186          0.1457 200
  g             1.6328   1.9879  1.4940   0.4939          1.1389 200
  gp            0.2367   0.2502  0.2216   0.0286          0.2080 200

  validate(x6factor.lrm.pen, bw=F, B=200)

            index.orig training    test optimism index.corrected   n
  Dxy           0.6375   0.6857  0.6024   0.0833          0.5542 200
  R2            0.3145   0.3488  0.3267   0.0221          0.2924 200
  Intercept     0.0000   0.0000  0.0882  -0.0882          0.0882 200
  Slope         1.0000   1.0000  1.0923  -0.0923          1.0923 200
  Emax          0.0000   0.0000  0.0340   0.0340          0.0340 200
  D             0.2612   0.2571  0.2370   0.0201          0.2411 200
  U            -0.0192  -0.0192 -0.0047  -0.0145         -0.0047 200
  Q             0.2805   0.2763  0.2417   0.0346          0.2458 200
  B             0.1292   0.1224  0.1355  -0.0132          0.1423 200
  g             1.2704   1.3917  1.5019  -0.1102          1.3805 200
  gp            0.2020   0.2091  0.2229  -0.0138          0.2158 200

In the penalized model (x6factor.lrm.pen), the apparent Dxy is 0.64, and bias-corrected Dxy is 0.55. The maximum absolute error is estimated to be 0.034, smaller than in the non-penalized model (0.0915 in x6factor.lrm). The changes in slope and intercept are substantially reduced in the penalized model. I think overfitting is improved at least to some extent. Should I select this as a final model?

I have one more question. The procedure variable was defined as 0/1 value in the previous mail.
For some graphical reason, I redefined it as a treat1/treat2 value. Then, the best penalty value changed from 3.05 to 2.55. I guess the change from numeric to factor caused this reduction in penalty. Which setup should I select? I appreciate your help in advance.

-- KH

(11/04/26 0:21), Frank Harrell wrote:

You've done a lot of good work on this. Yes I would say you have moderate overfitting with the first model. The only thing that saved you from having severe overfitting is that there seems to be a signal present [I am assuming this model is truly pre-specified and was not developed at all by looking at patterns of responses Y.] The use of backwards stepdown demonstrated much worse overfitting. This is in line with what we know about the damage of stepwise selection methods that do not incorporate shrinkage. I would throw away the stepwise
[R] importing and filtering time series data
Folks,

I'm new to R and would like to use it to analyze web server performance data. I collect the data in this CSV format:

  1304083104.41,Y,668.856249809
  1304083104.41,Y,348.143193007

The first column is a seconds.microseconds timestamp. Rows with N instead of Y need to be skipped. The last column has the same format as the first, except that it is the request duration (latency).

I would like to calculate the average number of requests per second, the mean latency, the variance, and the 5th and 95th percentiles. What is the best way to accomplish this, starting with importing the time series?

Thanks, Joel

--
for hire: mac osx device driver ninja, kernel extensions and usb drivers
http://wagerlabs.com | @wagerlabs | http://www.linkedin.com/in/joelreymont
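A minimal base-R sketch of one way to do this. The first two rows come from the post; the other two, the column names, and the choice to measure requests per second over the whole observed time span are assumptions made for illustration.

```r
# Inline sample; in practice use read.csv("yourfile.csv", header = FALSE, ...)
raw <- "1304083104.41,Y,668.856249809
1304083104.41,Y,348.143193007
1304083105.10,N,120.5
1304083106.75,Y,501.25"

log_df <- read.csv(text = raw, header = FALSE,
                   col.names = c("timestamp", "ok", "latency"),
                   colClasses = c("numeric", "character", "numeric"))

# Skip the rows flagged N
kept <- log_df[log_df$ok == "Y", ]

# Average requests per second over the observed time span
elapsed <- diff(range(kept$timestamp))
req_per_sec <- nrow(kept) / elapsed

# Requests in each whole second, in case the distribution over time matters
per_second <- table(floor(kept$timestamp))

# Latency summary: mean, variance, 5th and 95th percentiles
stats <- c(mean = mean(kept$latency),
           var  = var(kept$latency),
           quantile(kept$latency, probs = c(0.05, 0.95)))
```

If all kept rows share one timestamp, elapsed is zero and req_per_sec is not meaningful; mean(per_second) is then the safer figure.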
Re: [R] fisher exact for 2x2 table
After I shared comments from the forum yesterday with the biostatistician, he indicated this:

Fisher's exact test is the non-parametric analog for the Chi-square test for 2x2 comparisons. A version (or extension) of the Fisher's Exact test, known as the Freeman-Halton test, applies to comparisons for tables greater than 2x2. SAS can calculate both statistics using the following instructions:

  proc freq; tables a * b / fisher;

Do people here still stand by the position that Fisher's exact test can be used for RxC contingency tables?

Sorry to bother you all so much; it is just important for a paper I am writing and planning to submit soon. (I have a 4x2 table, but it does not meet the expected-frequency requirements for chi-squared.)

I gather people here have suggested that R implements the following, which unfortunately are not easily available at my library, but whose titles at least indicate that the test is extended to RxC tables:

Mehta CR, Patel NR. A network algorithm for performing Fisher's exact test in r x c contingency tables. Journal of the American Statistical Association 1983;78:427-34.

Mehta CR, Patel NR. Algorithm 643: FEXACT: A FORTRAN subroutine for Fisher's exact test on unordered r x c contingency tables. ACM Transactions on Mathematical Software 1986;12:154-61.

The only reason I ask again is that he is exceptionally clear on this point.

Thanks again,
-Rob

viostorm wrote:

> Thank you all very kindly for your help.
>
> -Rob
>
> Robert Schutt III, MD, MCS
> Resident - Department of Internal Medicine
> University of Virginia, Charlottesville, Virginia

--
View this message in context: http://r.789695.n4.nabble.com/fisher-exact-for-2x2-table-tp3481979p3484009.html
Sent from the R help mailing list archive at Nabble.com.
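For reference, R's fisher.test() accepts tables larger than 2x2 directly, and its help page cites the Mehta and Patel FEXACT code for that case. A sketch on a made-up 4x2 table (the counts are hypothetical and not the poster's data):

```r
# Hypothetical 4x2 table with small counts
tab <- matrix(c(3, 1,
                2, 4,
                1, 5,
                0, 6),
              nrow = 4, byrow = TRUE,
              dimnames = list(group = paste0("g", 1:4),
                              outcome = c("yes", "no")))

# Exact test on the r x c table; for tables larger than 2x2 the result
# carries a p-value but no odds-ratio estimate.
ft <- fisher.test(tab)

# Expected counts, to show why the chi-squared approximation is doubtful.
# chisq.test() warns about small expected counts, hence suppressWarnings().
expected <- suppressWarnings(chisq.test(tab))$expected
```

ft$p.value is the exact p-value for the whole table. For much larger tables the workspace argument of fisher.test() may need to be increased, or simulate.p.value = TRUE used instead.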
Re: [R] Convert filogenetic tree to binary matrix
Hi Ben,

Thank you for your help. I asked the same question on the r-sig-phylo mailing list. Liam Revell gave the following solution:

  temp <- prop.part(tree)
  X <- matrix(0, nrow=length(tree$tip), ncol=length(temp), dimnames=list(tree$tip.label, tree$node.label))
  for(i in 1:ncol(X)) X[temp[[i]],i] <- 1

Vanderlei

--
View this message in context: http://r.789695.n4.nabble.com/Convert-filogenetic-tree-to-binary-matrix-tp3478961p3484371.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Regression Summary for a List
Ryan -

summary expects an lm object, and fit is a list. So you need to use something like

  lapply(fit,summary)

to pass each list element to the summary function.

- Phil Spector
  Statistical Computing Facility
  Department of Statistics
  UC Berkeley
  spec...@stat.berkeley.edu

On Fri, 29 Apr 2011, Ryan J. McGuigan wrote:

> Hi,
>
> I am trying to run a regression on two matrices with 10 columns. I have been able to run the regression with the following code:
>
>   fit=list()
>   for(i in 1:10) { fit[[i]]=lm(monret[,i]~janret[,i]) }
>
> However, I can't get the regression to spit out more than the coefficients (summary(fit) does not work). I really need the full summary for each of the 10 regressions, including the R-squared values. I'm sure there's a simple way to do this; I just can't seem to figure it out. Thanks.
>
> -Ryan
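Building on that suggestion, the pieces Ryan asks for can be pulled out of the list of summaries. The matrices below are simulated stand-ins, since monret and janret are not shown in the thread.

```r
# Simulated stand-ins for the poster's two 10-column matrices
set.seed(123)
janret <- matrix(rnorm(60 * 10), ncol = 10)
monret <- 0.5 * janret + matrix(rnorm(60 * 10), ncol = 10)

# One regression per column, as in the original loop
fit <- lapply(1:10, function(i) lm(monret[, i] ~ janret[, i]))

# Full summaries, one per model
summaries <- lapply(fit, summary)

# R-squared for each regression, as a plain numeric vector
r_squared <- sapply(summaries, function(s) s$r.squared)

# Slope row of each coefficient table (estimate, std. error, t, p-value),
# stacked into a 10 x 4 matrix
slopes <- t(sapply(summaries, function(s) coef(s)[2, ]))
```

summaries[[3]] prints the usual summary for the third regression.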
[R] Conditonal Rank
Suppose I have data such as

  tmp <- data.frame(score = c(1,2,3,4, 4,3,2,1), trial = gl(2,4), Gender = gl(2,2,8, labels=c('M', 'F')))

Now I would like to compute a rank on the variable score conditional on trial and gender. I could do

  res <- with(tmp, tapply(score, list(Gender, trial), rank))
  res[,1]
  res[,2]

and then finagle a way to create a new variable in the data frame tmp that has these ranks associated with the correct rows. But perhaps there is a better way. Any suggestions?
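One base-R option, offered as a sketch rather than the only way, is ave(), which applies a function within groups and returns the results in the original row order:

```r
tmp <- data.frame(score = c(1, 2, 3, 4, 4, 3, 2, 1),
                  trial = gl(2, 4),
                  Gender = gl(2, 2, 8, labels = c("M", "F")))

# Rank score within each trial-by-Gender cell; ave() lines the ranks up
# with the rows they came from, so no re-matching is needed.
tmp$rank <- ave(tmp$score, tmp$trial, tmp$Gender, FUN = rank)
```

Note that FUN must be passed by name, because ave() treats unnamed arguments after the first as grouping variables.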
Re: [R] using lme4 with three nested random effects
Thierry,

The first suggestion worked. Thank you very much.

Ben Caldwell
University of California, Berkeley
137 Mulford Hall #3114
Berkeley, CA 94720
Office 223 Mulford Hall
(510)859-3358

On Fri, Apr 29, 2011 at 1:52 AM, ONKELINX, Thierry thierry.onkel...@inbo.be wrote:

> Dear Ben,
>
> Are site, transect and plot factors? And do they have unique IDs? You could try this:
>
>   rws30.UL$site <- factor(rws30.UL$site)
>   rws30.UL$transect <- interaction(rws30.UL$site, rws30.UL$transect, drop = TRUE)
>   rws30.UL$plot <- interaction(rws30.UL$site, rws30.UL$transect, rws30.UL$plot, drop = TRUE)
>   modelincrBS <- glmer(l.ru.ba.incr~shigo.av+pre.f.crwn.length+bark.thick.bh+Date+slope.pos.num+dens.T+dbh+leaf.area+can.pos.num +(1|site/transect/plot), data=rws30.UL, family=gaussian, na.action=na.omit)
>
> Or
>
>   modelincrBS <- glmer(l.ru.ba.incr~shigo.av+pre.f.crwn.length+bark.thick.bh+Date+slope.pos.num+dens.T+dbh+leaf.area+can.pos.num +(1|site) + (1|transect) + (1|plot), data=rws30.UL, family=gaussian, na.action=na.omit)
>
> Best regards,
> Thierry
>
> ir. Thierry Onkelinx
> Research Institute for Nature and Forest
> team Biometrics & Quality Assurance
> Gaverstraat 4
> 9500 Geraardsbergen
> Belgium
> tel. + 32 54/436 185
> thierry.onkel...@inbo.be
> www.inbo.be
>
> To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
> The plural of anecdote is not data. ~ Roger Brinner
> The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey
>
>> -----Original message-----
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of Benjamin Caldwell
>> Sent: Friday 29 April 2011 0:37
>> To: r-help
>> Subject: [R] using lme4 with three nested random effects
>>
>> Hi all,
>>
>> I'm trying to fit models for data with three levels of nested random effects: site/transect/plot. For example,
>>
>>   modelincrBS <- glmer(l.ru.ba.incr~shigo.av+pre.f.crwn.length+bark.thick.bh+Date+slope.pos.num+dens.T+dbh+leaf.area+can.pos.num +(1|site/transect/plot), data=rws30.UL, family=gaussian, na.action=na.omit)
>>
>> but I get the following error:
>>
>>   Error: length(f1) == length(f2) is not TRUE
>>   In addition: Warning messages:
>>   1: In plot:(transect:site) : numerical expression has 92 elements: only the first used
>>   2: In plot:(transect:site) : numerical expression has 92 elements: only the first used
>>
>> The formulation works for two nested effects (e.g. 1|site/transect). I can get it to run in lme:
>>
>>   modelincrBS <- lme(l.ru.ba.incr~shigo.av+pre.f.crwn.length+bark.thick.bh+Date+slope.pos.num+dens.T+dbh+leaf.area+can.pos.num, data=rws30.UL, random=(~1|site/transect/plot), na.action=na.omit)
>>
>> but I can't specify a distribution family in that package. Any help much appreciated.
>>
>> Ben Caldwell
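What Thierry's first suggestion does to the grouping variables can be seen without lme4. In the small made-up design below (2 sites, 2 transects per site, 3 plots per transect; the column names transect_id and plot_id are hypothetical), the transect and plot labels repeat across sites until interaction() makes them unique.

```r
# Made-up design: the labels 1..2 and 1..3 are reused within every site
d <- expand.grid(plot = 1:3, transect = 1:2, site = c("A", "B"))

# Make every grouping variable a factor with globally unique levels
d$site        <- factor(d$site)
d$transect_id <- interaction(d$site, d$transect, drop = TRUE)
d$plot_id     <- interaction(d$site, d$transect, d$plot, drop = TRUE)

n_site     <- nlevels(d$site)          # sites
n_transect <- nlevels(d$transect_id)   # transects, unique across sites
n_plot     <- nlevels(d$plot_id)       # plots, unique across transects
```

The warnings in the original post ("numerical expression has 92 elements") suggest that at least one of the grouping columns was numeric rather than a factor, which is what the factor() and interaction() calls fix.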
Re: [R] Analysis and graphics by groups
Hello,

This is my first post to this mailing list and I hope it is enough to justify calling for help. In case it's not, sorry.

I'm trying to do analysis and graphics using a factor as a criterion to split the data, and to do the analysis/graphics for each subset. Right now I'm trying to fit and plot the following logistic model, according to a third variable named Cerca:

  dm_fit_T <- nls(nDMTRBgm2~(K/(1+((K-nDMTRBgm2.T.1)/nDMTRBgm2.T.1)*exp(-r))),perieph,start=list(K=3,r=0.2),trace=T)

I've found a function called gapply which seems to be what I need, but it doesn't seem to work. This is the call I've used:

  gapply(perieph,FUN=nls(nDMTRBgm2~(K/(1+((K-nDMTRBgm2.T.1)/nDMTRBgm2.T.1)*exp(-r))),perieph,start=list(K=3,r=0.2),trace=T),groups=Cerca)

But I get this error message:

  Error in get(as.character(FUN), mode = "function", envir = envir) : object 'FUN' of mode 'function' was not found

Can you help me make this non-linear regression by groups work?

Also, once the regression works, I need to fit a line to the nDMTRBgm2~nDMTRBgm2.T.1 data using the same model. I've used plotfit to do that with one nlm data set. Is it possible to plot each group's trend line and data with different colours/symbols in the same graphic?

Thank you,
Cristiano

--
Cristiano Yuji Sasada Sato
Doctoral student
Graduate Program in Ecology and Evolution - IBRAG / UERJ
Laboratory of River and Stream Ecology
Department of Ecology - Universidade do Estado do Rio de Janeiro
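The error message says that FUN received the result of an nls() call rather than a function. One base-R way around it, sketched here on simulated data with shortened, made-up column names (prev and now in place of nDMTRBgm2.T.1 and nDMTRBgm2), is to split the data by group and wrap the fit in a function:

```r
# Simulated stand-in for the poster's data (parameters are made up)
set.seed(11)
sim_group <- function(K, r, n = 40) {
  prev <- runif(n, 0.2, 2.5)
  data.frame(prev = prev,
             now = K / (1 + ((K - prev) / prev) * exp(-r)) + rnorm(n, sd = 0.02))
}
perieph <- rbind(cbind(sim_group(3.0, 0.2), Cerca = "in"),
                 cbind(sim_group(3.5, 0.4), Cerca = "out"))

# The fit is wrapped in a function, so it runs once per subset. tryCatch()
# returns NULL for a group where nls() fails instead of stopping the loop.
fit_one <- function(d) {
  tryCatch(
    nls(now ~ K / (1 + ((K - prev) / prev) * exp(-r)),
        data = d, start = list(K = 3, r = 0.2)),
    error = function(e) NULL)
}

fits <- lapply(split(perieph, perieph$Cerca), fit_one)
estimates <- lapply(fits, function(f) if (is.null(f)) NULL else coef(f))

# Plotting sketch, one colour per group (left commented out):
# plot(now ~ prev, data = perieph, col = as.integer(factor(Cerca)))
# xs <- seq(0.2, 2.5, length.out = 100)
# for (i in seq_along(fits)) if (!is.null(fits[[i]]))
#   lines(xs, predict(fits[[i]], newdata = data.frame(prev = xs)), col = i)
```

The same pattern works with the original column names; only the formula and the data frame change.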
[R] RCurl and postForm()
Hi everybody, I think that I am missing something fundamental in how strings are passed from a postForm() call in R to the curl or libcurl functions underneath. For example, I can do the following using curl from the command line: $ curl -d Archbishop Huxley http://www.datasciencetoolkit.org/text2people; [{gender:u,first_name:,title:archbishop,surnames:Huxley,start_index:0,end_index:17,matched_string:Archbishop Huxley}] Trying the same thing, or what I *think* is the same thing (obvious not) in R (Mac OS 10.6.7, R 2.13.0) produces: library(RCurl) Loading required package: bitops api - http://www.datasciencetoolkit.org/text2people; postForm(api, a=Archbishop Huxley) [1] [{\gender\:\u\,\first_name\:\\,\title\:\archbishop\,\surnames\:\Huxley\,\start_index\:44,\end_index\:61,\matched_string\:\Archbishop Huxley\},{\gender\:\u\,\first_name\:\\,\title\:\archbishop\,\surnames\:\Huxley\,\start_index\:88,\end_index\:105,\matched_string\:\Archbishop Huxley\}] attr(,Content-Type) charset text/html utf-8 I can match the result given on the DSTK API's website by using system(), but doesn't seem like the R-like way of doing something. system(curl -d 'Archbishop Huxley' 'http://www.datasciencetoolkit.org/text2people') 158 141 141 141 0[{gender:u,first_name:,title:archbishop,surnames:Huxley,start_index:0,end_index:17,matched_string:Archbishop Huxley}]17599 72 --:--:-- --:--:-- --:--:-- 670 If you want to see some additional information related to this question, I posted on StackOverflow a few days ago: http://stackoverflow.com/questions/5797688/post-request-using-rcurl I am working on this R wrapper for the data science toolkit as a way of illustrating how to make an R package for the Denver RUG and ran into this problem. Any help to this problem will be greatly appreciated by the Denver RUG! 
Cheers, Ryan
[R] using R in C#
Hi, I've been able to use R from C# on my machine by following the steps at http://www.codeproject.com/KB/cs/RtoCSharp.aspx. This works locally, i.e. when R is running on my own box. I was wondering if it's possible to change the setup so that I can connect to another machine that is running R (and has rscproxy installed). That way many people could use R from C# without first installing it on their own boxes, provided it's installed on another box that they can connect to. Thanks, Akhil
[R] Bigining with a Program of SVR
Hi: I'm starting research on Support Vector Regression. I want to obtain a model that predicts a property A from a set of properties B, C, D, ... This problem is very common, for example in QSAR models. I'd like to know of some examples and packages that could help me with this. I know about caret and e1071, but I don't know whether these packages can work with continuous variables. Thanks in advance -- View this message in context: http://r.789695.n4.nabble.com/Bigining-with-a-Program-of-SVR-tp3484476p3484476.html Sent from the R help mailing list archive at Nabble.com.
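As a rough illustration of the question above: e1071's svm() performs regression when the response is numeric, so continuous variables are fine. The sketch below uses simulated data; the column names A, B, C, D follow the post, while the functional form and all values are invented:

```r
# Sketch only: simulated data with placeholder properties A, B, C, D.
set.seed(42)
n <- 200
dat <- data.frame(B = runif(n), C = runif(n), D = runif(n))
dat$A <- 2 * dat$B - dat$C^2 + sin(3 * dat$D) + rnorm(n, sd = 0.1)

train <- dat[1:150, ]
test  <- dat[151:200, ]

if (requireNamespace("e1071", quietly = TRUE)) {
  # With a numeric response, svm() defaults to eps-regression (SVR).
  fit  <- e1071::svm(A ~ B + C + D, data = train, kernel = "radial")
  pred <- predict(fit, newdata = test)
  rmse <- sqrt(mean((pred - test$A)^2))   # hold-out error
  print(rmse)
}
```

The cost, gamma and epsilon parameters are left at their defaults here; in practice they usually need tuning, for example by cross-validation.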
Re: [R] Using Java methods in R
-- snip -- It clogs up my email, takes a long time to delete, and is hard to be selective enough to not delete some of my other important email. -- snip -- If you don't care about contributing to the R listserve community, it's hard to imagine why that community should care about you. Some people (not me) seem to use nabble [ http://www.nabble.com/ ] to monitor the list. See R under what is cool . Another option is to set up rules in your email client to direct your mail to an appropriate folders or if you use gmail I guess we would say to label you R listserve email. You can search mail archives for a topic of interest with the R command line command RSiteSearch(). To learn more type ?RSiteSearch For fun I put rJava rectangular arrays into this search engine (having no idea what that means) and one of the things that came out was: http://finzi.psych.upenn.edu/R/library/rJava/html/jrectRef-class.html Hopefully, this or one of the other things can be useful to you. Finally for the third time, try joining/looking at: stats-rosuda-devel: http://mailman.rz.uni-augsburg.de/mailman/listinfo/stats-rosuda-devel or the archive: http://mailman.rz.uni-augsburg.de/pipermail/stats-rosuda-devel/ -- Robert W. Baer, Ph.D. Professor of Physiology Kirksville College of Osteopathic Medicine A. T. Still University of Health Sciences 800 W. Jefferson St. Kirksville, MO 63501 660-626-2322 FAX 660-626-2965 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to generate a normal distribution with mean=1, min=0.2, max=0.8
On Fri, 29 Apr 2011, David Winsemius wrote: On Apr 29, 2011, at 1:29 PM, Mike Miller wrote: On Fri, 29 Apr 2011, Giovanni Petris wrote: Well, but the original poster also refers to 0.2 and 0.8 as expected min and max, in which case we are back to a joke... Well, he is a lot better with English than I am with Mandarin. He seemed to like the truncated normal answers, so we'll let those be his answers. It is possible to choose parameters for a normal distribution with 500 observations such that the expected value of the maximum is .8 and the expected value of the minimum is .2. Obviously, the mean would be .5, not 1, but what would the variance then have to be to provide the correct expected max and min? That's another legitimate question. You would need to specify an N since the expected first and last order statistic would decrease/increase with increasing N. Right -- I chose N=500, as did the OP. I think the order statistics for the normal are pretty complex, but it wouldn't be hard to use the density for order statistics for the uniform to compute the appropriate values for a standard normal, then rescale. http://en.wikipedia.org/wiki/Order_statistic#The_order_statistics_of_the_uniform_distribution You'd have to multiply the beta density times the inverse normal cdf and get the weighted average for a set of points. It doesn't sound terribly difficult but I don't want to do it! ;-) Mike __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
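The calculation Mike describes can be sketched numerically. The code below integrates directly over the density of the maximum of n standard normal draws instead of going through the uniform order statistics, which is an equivalent route; N = 500 and the targets 0.2 and 0.8 come from the thread, while the variable names are made up:

```r
n <- 500  # sample size used in the thread

# Density of the maximum of n iid N(0,1) draws: n * phi(x) * Phi(x)^(n-1)
e_max <- integrate(function(x) x * n * dnorm(x) * pnorm(x)^(n - 1),
                   lower = -Inf, upper = Inf)$value

# By symmetry E[min] = -E[max], so mean 0.5 and sd = 0.3 / e_max give
# an expected maximum of 0.8 and an expected minimum of 0.2.
sigma <- (0.8 - 0.5) / e_max
c(e_max = e_max, sigma = sigma)

# Rough simulation check of the expected extremes.
set.seed(1)
sims <- replicate(2000, range(rnorm(n, mean = 0.5, sd = sigma)))
rowMeans(sims)   # average minimum and average maximum over the simulations
```

Note that this only fixes the expected extremes; individual samples will still fall outside [0.2, 0.8] about half the time at each end.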
Re: [R] Problem with qualitative variables in anova
Are you working on the same homework problem as user494766? http://stackoverflow.com/questions/5835605/1-way-anova-in-r-help

-- David.

On Apr 29, 2011, at 10:43 AM, katerinaaa wrote:

Hi, I am a newbie in R programming and I need some help. I have two columns: the first has 1000 values Y/N/U and the other has f/m. Like this:

q7 sex
U m
U f
U m
N f

I want to do a one-way ANOVA, parametric and non-parametric, but I have some problems. Code:

frameq7 <- data.frame(q7,sex)
frameq7
r <- aov(q7 ~ sex, data = frameq7)
summary(r)

I get:

Error in storage.mode(y) <- "double" : invalid to change the storage mode of a factor
In addition: Warning message: In model.response(mf, "numeric") : using type="numeric" with a factor response will be ignored

Could you please help me make it right? And finally, how can I present this analysis? With a boxplot? Thanks a lot

-- View this message in context: http://r.789695.n4.nabble.com/Problem-with-qualitative-variables-in-anova-tp3483845p3483845.html Sent from the R help mailing list archive at Nabble.com.

David Winsemius, MD West Hartford, CT
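The error in the quoted post comes from giving aov() a factor as the response; aov() needs a numeric response. When both variables are categorical, a test of independence on the cross-tabulation is the usual alternative. A small sketch, assuming invented data (the values of q7 and sex below are random, not the poster's):

```r
# Invented example data: q7 takes Y/N/U, sex takes f/m.
set.seed(7)
q7  <- factor(sample(c("Y", "N", "U"), 1000, replace = TRUE))
sex <- factor(sample(c("f", "m"), 1000, replace = TRUE))

tab <- table(q7, sex)      # 3 x 2 table of counts
print(tab)

res <- chisq.test(tab)     # test of independence between q7 and sex
print(res)

# A mosaic plot, rather than a boxplot, is one way to display two factors.
mosaicplot(tab, main = "q7 by sex")
```

Whether this is the analysis the assignment actually asks for is a separate question that the thread does not settle.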
Re: [R] Speed up code with for() loop
Barth sent me a very good code and I modified it a bit. Have a look: Error-rnorm(1000, mean=0, sd=0.05) estimate-(log(1+0.10)+Error) DCF_korrigiert-(1/(exp(1/(exp(0.5*(-estimate)^2/(0.05^2))*sqrt(2*pi/(0.05^2 ))*(1-pnorm(0,((-estimate)/(0.05^2)),sqrt(1/(0.05^2))-1)) DCF_verzerrt-(1/(exp(estimate)-1)) S - 1000 # total sample size D - 1 # number of subsamples Subset - 1 # number in each subsample Select - matrix(sample(S,D*Subset,replace=TRUE),nrow=Subset,ncol=D) DCF_korrigiert_select - matrix(DCF_korrigiert[Select],nrow=Subset,ncol=D) Delta_ln -(log(colMeans(DCF_korrigiert_select, na.rm=T)/(1/0.10))) The only problem I discovered is that R cannot handle more than 2.147.483.647 integers, thus the cells in the matrix are bounded by this condition. (R shows the max by typing: .Machine$integer.max). And if you want to safe the workspace, the file with 10.000 times 10.000 becomes round 2 GB. Compared to the original of just 300 MB. So I cannot perform my previous bootstrap with 1.000.000 times 100.000. But nevertheless 10.000 times 10.000 seems to be sufficiently; I have to say its amazing, how fast the idea works. Has anybody a suggestion how to make it work for the 1.000.000 times 100.000 bootstrap??? -- View this message in context: http://r.789695.n4.nabble.com/Speed-up-code-with-for-loop-tp3481680p3484548.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
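One way to get past the size limit described above is to avoid building the full index matrix and to process the subsamples in blocks, keeping only one mean per subsample. This is a sketch under assumptions: x is a placeholder for DCF_korrigiert, the sizes are kept small so it runs quickly, and the variable chunk is introduced here only for illustration:

```r
set.seed(123)
S <- 1000                          # size of the original sample
x <- rnorm(S, mean = 10, sd = 1)   # placeholder for DCF_korrigiert

D      <- 2000   # number of bootstrap subsamples (kept small here)
Subset <- 500    # draws per subsample
chunk  <- 250    # subsamples processed per block

boot_means <- numeric(D)
for (start in seq(1, D, by = chunk)) {
  idx <- start:min(start + chunk - 1, D)
  # Only a Subset x length(idx) index matrix exists at any one time.
  sel <- matrix(sample(S, Subset * length(idx), replace = TRUE),
                nrow = Subset)
  boot_means[idx] <- colMeans(matrix(x[sel], nrow = Subset), na.rm = TRUE)
}
Delta_ln <- log(boot_means / (1 / 0.10))
summary(Delta_ln)
```

Memory use is then governed by Subset times chunk rather than Subset times D, at the cost of a loop over blocks; the run time for very large D still grows linearly.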
Re: [R] fisher exact for 2x2 table
Rob: Fisher's exact test is conceptually possible for any r x c contingency table problem and uses the observed multinomial table probability as the test statistic. Other tests for r x c contingency tables use a different test statistic (Chi-squared, likelihood ratio, Zelterman's). It is possible that the probabilities for any of these procedures may differ slightly for the same table configuration even if the probabilities for each test are calculated by enumerating all possible permutations (hypergeometric) under the null hypothesis. See Mielke and Berry 2007 (Permutation Methods: A distance function approach) Chps 6 and7. Mielke has provided efficient Fortran algorithms for enumerating the exact probabilities for 2x2, 3x2, 4x2, 5x2, 6x2 ,3x3,and even 2x2x2 tables for Fisher's exact and Chi-square statistics. I don't remember whether Cyrus Meta's algorithms for Fisher's exact can do more.But the important point to keep in mind is that it is possible to use different statistics for evaluating the same null hypothesis for r x c tables (Fisher's exact uses one form, Chi-square uses another, etc.) and the probabilities can be computed by exact enumeration of all permutations (what people expect Fisher's exact to do but also possible for Chi-square statistic) or by some approximation (asymptotic distribution, Monte Carlo resampling). The complete enumeration of test statistics under the null becomes computationally intractable for large dimension r x c problems whether using the observed table probability (like Fisher's exact) as a test statistic or other like Chi-square statistic. So in short, yes you can use Fisher's exact on your 4 x 2 problem, and the result might differ from using a Chi-square statistic even if you compute the P-value for the Chi-square test by complete enumeration. 
Note that the minimum expected cell size for the Chi-square test is related to whether the Chi-square distributional approximation (an asymptotic argument) for evaluating the Chi-square statistic will be reasonable and is irrelevant if you calculate your probabilities by exact enumeration of all permutations. Brian Brian S. Cade, PhD U. S. Geological Survey Fort Collins Science Center 2150 Centre Ave., Bldg. C Fort Collins, CO 80526-8818 email: brian_c...@usgs.gov tel: 970 226-9326 From: viostorm rob.sch...@gmail.com To: r-help@r-project.org Date: 04/29/2011 01:23 PM Subject: Re: [R] fisher exact for 2x2 table Sent by: r-help-boun...@r-project.org After I shared comments form the forum yesterday with the biostatistician he indicated this: Fisher's exact test is the non-parametric analog for the Chi-square test for 2x2 comparisons. A version (or extension) of the Fisher's Exact test, known as the Freeman-Halton test applies to comparisons for tables greater than 2x2. SAS can calculate both statistics using the following instructions. proc freq; tables a * b / fisher; Do people here still stand by position fisher exact test can be used for RxC contingency tables ? Sorry to both you all so much it is just important for a paper I am writing and planning to submit soon. ( I have a 4x2 table but does not meet expected frequencies requirements for chi-squared.) I guess people here have suggested R implements, the following, which unfortunately are unavailable at least easily at my library but at least by the titles indicates it is extending it to RxC Mehta CR, Patel NR. A network algorithm for performing Fisher's exact test in r c contingency tables. Journal of the American Statistical Association 1983;78:427-34. Mehta CR, Patel NR. Algorithm 643: FEXACT: A FORTRAN subroutine for Fisher's exact test on unordered r x c contingency tables. ACM Transactions on Mathematical Software 1986;12:154-61. The only reason I ask again is he is exceptionally clear on this point. 
Thanks again, -Rob viostorm wrote: Thank you all very kindly for your help. -Rob Robert Schutt III, MD, MCS Resident - Department of Internal Medicine University of Virginia, Charlottesville, Virginia -- View this message in context: http://r.789695.n4.nabble.com/fisher-exact-for-2x2-table-tp3481979p3484009.html Sent from the R help mailing list archive at Nabble.com.
[R] For loop and sqldf
Hi list, Can anyone tell me why the following does not work? Thanks a lot! Your help is very much appreciated.

DF = data.frame(read.table(textConnection("B C D E F G
8025 1995 0 4 1 2
8025 1997 1 1 3 4
8026 1995 0 7 0 0
8026 1996 1 2 3 0
8026 1997 1 2 3 1
8026 1998 6 0 0 4
8026 1999 3 7 0 3
8027 1997 1 2 3 9
8027 1998 1 2 3 1
8027 1999 6 0 0 2
8028 1999 3 7 0 0
8029 1995 0 2 3 3
8029 1998 1 2 3 2
8029 1999 6 0 0 1"),head=TRUE,stringsAsFactors=FALSE))

list <- sort(unique(DF$C))
for (t in 1:length(list)) {
year = as.character(list[t])
data[year] <- sqldf('select * from DF where C = [year]')
}

I am trying to split up the data.frame into 5 new ones, one for every year.

-- View this message in context: http://r.789695.n4.nabble.com/For-loop-and-sqldf-tp3484559p3484559.html Sent from the R help mailing list archive at Nabble.com.
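Two things probably go wrong in the loop above: the SQL text is a literal string, so [year] is never replaced by the value of year, and data[year] assigns into an object that was never created as a list. For the stated goal, one data frame per year, base R's split() may be all that is needed. A sketch using the data from the post (by_year is a name chosen here):

```r
DF <- read.table(text = "B C D E F G
8025 1995 0 4 1 2
8025 1997 1 1 3 4
8026 1995 0 7 0 0
8026 1996 1 2 3 0
8026 1997 1 2 3 1
8026 1998 6 0 0 4
8026 1999 3 7 0 3
8027 1997 1 2 3 9
8027 1998 1 2 3 1
8027 1999 6 0 0 2
8028 1999 3 7 0 0
8029 1995 0 2 3 3
8029 1998 1 2 3 2
8029 1999 6 0 0 1", header = TRUE, stringsAsFactors = FALSE)

# split() returns a named list with one data frame per value of C (the year).
by_year <- split(DF, DF$C)
names(by_year)
by_year[["1997"]]

# If sqldf is preferred, the year has to be pasted into the query text,
# for example with sprintf("select * from DF where C = %s", year).
```

Keeping the pieces in a named list also avoids creating five separate variables, and lapply() can then be used to process each year in turn.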
[R] --mem-vsize in R
Hi, I am calculating pairwise correlation coefficients for a matrix of 234 X 3. I am getting the following error:

Error in cbind(as.vector(row(cl)), as.vector(col(cl)), as.vector(cl)) : allocMatrix: too many elements specified
In addition: There were 50 or more warnings (use warnings() to see the first 50)

The function used is:

corGraphPearson = function(cData, COR) # COR is the threshold: 0.5, 0.7, etc.
{
cl = unname(cor(cData, use="pairwise.complete.obs", method="pearson"))
result = cbind(as.vector(row(cl)),as.vector(col(cl)),as.vector(cl))
result = result[result[,1] != result[,2],]
corm = result
# remove low cor pairs
corm = corm[abs(corm[,3]) >= COR, ]
# the network
net <- network(corm, directed = F)
}

I am running this on a cluster with 4 machines with 24 GB memory each. How should I start R so that I make maximum use of the available memory? Or how can I overcome this issue?

-- View this message in context: http://r.789695.n4.nabble.com/mem-vsize-in-R-tp3484541p3484541.html Sent from the R help mailing list archive at Nabble.com.
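The cbind() call in the function above builds three vectors as long as the full correlation matrix before any filtering happens. A lighter variant, sketched here on invented data (edge_list is a hypothetical name), filters first and keeps only the pairs above the threshold:

```r
set.seed(99)
# Placeholder data: 50 observations of 6 variables, two of them correlated.
cData <- matrix(rnorm(50 * 6), nrow = 50)
cData[, 2] <- cData[, 1] + rnorm(50, sd = 0.3)

edge_list <- function(cData, COR) {
  cl <- unname(cor(cData, use = "pairwise.complete.obs", method = "pearson"))
  # Keep only the upper triangle (each pair once) at or above the threshold.
  keep <- which(upper.tri(cl) & abs(cl) >= COR, arr.ind = TRUE)
  cbind(keep, cor = cl[keep])
}

edges <- edge_list(cData, COR = 0.7)
edges
```

This does not shrink the correlation matrix itself, which still has to fit in memory; it only avoids the three full-length copies and the duplicated (i, j) / (j, i) pairs.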
Re: [R] Problem with qualitative variables in anova
Yes, I also posted in the other forum because I didn't get an answer here. Thanks for your reply -- View this message in context: http://r.789695.n4.nabble.com/Problem-with-qualitative-variables-in-anova-tp3483845p3484599.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Conditonal Rank
Hi:

Does this work?

library(plyr)
ddply(tmp, .(trial, Gender), transform, rankscore = rank(score))
  score trial Gender rankscore
1     1     1      M         1
2     2     1      M         2
3     3     1      F         1
4     4     1      F         2
5     4     2      M         2
6     3     2      M         1
7     2     2      F         2
8     1     2      F         1

Alternatively, you could get the 'wide form' with

aggregate(score ~ trial + Gender, data = tmp, FUN = rank)
  trial Gender score.1 score.2
1     1      M       1       2
2     2      M       2       1
3     1      F       1       2
4     2      F       2       1

HTH, Dennis

On Fri, Apr 29, 2011 at 12:26 PM, Doran, Harold hdo...@air.org wrote:

Suppose I have data such as

tmp <- data.frame(score = c(1,2,3,4, 4,3,2,1), trial = gl(2,4), Gender = gl(2,2,8, labels=c('M', 'F')))

Now I would like to compute a rank on the variable score conditional on trial and gender. I could do

res <- with(tmp, tapply(score, list(Gender, trial), rank))
res[,1]
res[,2]

and then finagle a way to create a new variable in the dataframe tmp that has these ranks associated with the correct rows. But, perhaps there is a better way. Any suggestions?
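For completeness, the same conditional rank can also be obtained in base R with ave(), which is not mentioned in the thread but needs no extra package. A sketch using the tmp data frame from the quoted question:

```r
tmp <- data.frame(score = c(1, 2, 3, 4, 4, 3, 2, 1),
                  trial = gl(2, 4),
                  Gender = gl(2, 2, 8, labels = c("M", "F")))

# ave() applies FUN within each trial x Gender cell and returns the results
# in the original row order, so they can be stored as a new column directly.
tmp$rankscore <- ave(tmp$score, tmp$trial, tmp$Gender, FUN = rank)
tmp
```

Because the result comes back aligned with the original rows, no re-merging step is needed, which is the part the original question wanted to avoid.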
Re: [R] fisher exact for 2x2 table
Rob-- Your biostatistician has not disagreed with the rest of us about anything except for his preferred name for the test. He wants to call it the Freeman-Halton test, some people call it the Fisher-Freeman-Halton test, but most people call it the Fisher Exact test -- all are the same test. When he was adamant you could not do 2x2, what he was being adamant about was the name you should use when referring to the test for tables larger than 2x2. Why he was doing that, I don't know, but I think it is silly -- he confused you and the rest of us. He goes on to tell you that to get the Freeman-Halton test in SAS, you use tables a * b / fisher. In other words, SAS calls the test Fisher instead of calling it Freeman-Halton. R also calls it Fisher and not Freeman-Halton. I'm like R and SAS and unlike your biostatistician, but to each his own. You say that he is exceptionally clear on this point, which may be true, but what is the point? The point is that he prefers a different *name* for the test than the rest of us. Everyone agrees on the math/stat. Mike -- Michael B. Miller, Ph.D. Minnesota Center for Twin and Family Research Department of Psychology University of Minnesota On Fri, 29 Apr 2011, viostorm wrote: After I shared comments form the forum yesterday with the biostatistician he indicated this: Fisher's exact test is the non-parametric analog for the Chi-square test for 2x2 comparisons. A version (or extension) of the Fisher's Exact test, known as the Freeman-Halton test applies to comparisons for tables greater than 2x2. SAS can calculate both statistics using the following instructions. proc freq; tables a * b / fisher; Do people here still stand by position fisher exact test can be used for RxC contingency tables ? Sorry to both you all so much it is just important for a paper I am writing and planning to submit soon. ( I have a 4x2 table but does not meet expected frequencies requirements for chi-squared.) 
I guess people here have suggested R implements, the following, which unfortunately are unavailable at least easily at my library but at least by the titles indicates it is extending it to RxC Mehta CR, Patel NR. A network algorithm for performing Fisher's exact test in r c contingency tables. Journal of the American Statistical Association 1983;78:427-34. Mehta CR, Patel NR. Algorithm 643: FEXACT: A FORTRAN subroutine for Fisher's exact test on unordered r x c contingency tables. ACM Transactions on Mathematical Software 1986;12:154-61. The only reason I ask again is he is exceptionally clear on this point. Thanks again, -Rob __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
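To make the point concrete, R's fisher.test() accepts tables larger than 2x2 directly. The sketch below uses an invented 4 x 2 table with small counts (not the poster's data) and also shows the chi-square statistic with a Monte Carlo p-value, one of the alternatives mentioned earlier in the thread:

```r
# Invented 4 x 2 table with small counts; not the poster's data.
tab <- matrix(c(3, 1,
                2, 5,
                1, 6,
                4, 2),
              nrow = 4, byrow = TRUE,
              dimnames = list(group = c("g1", "g2", "g3", "g4"),
                              outcome = c("yes", "no")))

ft <- fisher.test(tab)     # exact test for an r x c table
ft$p.value

# Chi-square statistic with a Monte Carlo p-value instead of the
# asymptotic chi-square approximation.
set.seed(2011)
ct <- chisq.test(tab, simulate.p.value = TRUE, B = 5000)
ct$p.value
```

The two p-values need not agree exactly, since the tests order the possible tables by different statistics.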
Re: [R] fisher exact for 2x2 table
On 29 April 2011 08:43, viostorm rob.sch...@gmail.com wrote:

After I shared comments from the forum yesterday with the biostatistician he indicated this: Fisher's exact test is the non-parametric analog for the Chi-square test for 2x2 comparisons. A version (or extension) of the Fisher's Exact test, known as the Freeman-Halton test applies to comparisons for tables greater than 2x2. SAS can calculate both statistics using the following instructions. proc freq; tables a * b / fisher;

SAS documentation says: Fisher's exact test was extended to general R×C tables by Freeman and Halton (1951), and this test is *also* known as the Freeman-Halton test.

Emphasis mine.

Jeremy

-- Jeremy Miles Psychology Research Methods Wiki: www.researchmethodsinpsychology.com
Re: [R] Kolmogorov-Smirnov test
The general idea of the KS test (and others) can be applied to discrete data, but the implementation in R assumes continuous data (does not have the needed adjustments to deal with ties). The chi-square and other tests suffer from the same problems in your case. In all cases the null hypothesis is that the data comes from the stated distribution (poisson in your case), failing to reject the null hypothesis does not prove that the data comes from that distribution, only shows that we cannot disprove that it comes from that distribution. With large sample sizes, your data could come from a true distribution that for all practical purposes is equivalent to the poisson, but due to slight rounding or other errors has probabilities slightly different for some values (a difference that no one would reasonably care about), but these tests can show a significant difference. Usually it is better to just show that your data and the theoretical distribution are close enough to each other rather than depending on a formal test. The plots and diagnostics in the vcd package are a good choice here, you could also use the KS test statistic (ignoring the p-value and warnings) as another measure, but plot the empirical and theoretical distributions to see what the value means and how close they are. Another option is the vis.test function in TeachingDemos, it lets you plot data from the theoretical distribution and the actual data, then see if you can visually tell the difference. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of m.marcinmichal Sent: Thursday, April 28, 2011 3:54 PM To: r-help@r-project.org Subject: Re: [R] Kolmogorov-Smirnov test Hi, thanks for response. The Kolmogorov-Smirnov test is designed for distributions on continuous variable, not discrete like the poisson. 
That is why you are getting some of your warnings. I read in Fitting distributions whith R Vito Ricci page 19 that: ... Kolmogorov-Smirnov test is used to decide if a sample comes from a population with a specific distribution. I can be applied both for discrete (count) data and continuous binned (even if some Authors do not agree on this point) and both for continuous variables but in page 16 i read that ... while the Kolmogorov-Smirnov and Anderson-Darling tests are restricted to continuous distribution and i was little confused, but try this test to my discrete data. Generally in first step, I try fit my data to discret or continuous distribution (task: find distribution for emirical data). Question, Can I approximate my discret data by the continuous distribution? I know that sometmies we can poisson distribution approxime by the normal distribution. But what happen if I use another distribution like log normall or gama? I done another three tests - chi square test. But this tests return three another results. Suppose that we have the same data i.e vectorSentence. Test: 1. One param - fitdistr(vectorSentence, poisson) chisq.test(table(vectorSentence), p = dpois(1:9, lambda=param[[1]][1]), rescale.p = TRUE) X-squared = 272.8958, df = 8, p-value 2.2e-16 2. Two library(vcd) gf - goodfit(vectorSentence, type=poisson, method=MinChisq) summary(gf) X^2 df P( X^2) Pearson 404.3607 8 2.186332e-82 3. Three fdistc - fitdist(vectorSentence, pois) g-gofstat(fdistc, print.test = TRUE) Chi-squared statistic: 535.344 Degree of freedom of the Chi-squared distribution: 8 Chi-squared p-value: 1.824112e-110 Question which results is correct? I know that I can reject null hipotesis: data don't come from poisson distribution. But which result is correct? For another side I trying to accomplish another problem: 1. Suppose that we have a reference data (dr) from some process (pr) which save in vectorSentence. 2. 
Suppose that we have two other data samples d1, d2 from two other processes p1, p2. 3. We know that all the data are discrete. Task: One: check whether data d1, d2 are equal to the reference data (dr); this is not a problem. I use a cdf, a histogram, other measures, a chi-square test, etc. But can I use Kolmogorov-Smirnov to test the cumulative distribution function hypothesis, i.e. F(d1) = F(d), for my data? Two: find the distribution of dr, discrete or, if possible, continuous. Best, Marcin M. -- View this message in context: http://r.789695.n4.nabble.com/Kolmogorov-Smirnov-test-tp3479506p3482349.html Sent from the R help mailing list archive at Nabble.com.
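Greg's suggestion to compare the empirical and theoretical distributions directly, using the KS statistic only as a descriptive distance, can be sketched in base R. The counts below are simulated stand-ins for vectorSentence (deliberately not exactly Poisson), and the variable names are invented:

```r
set.seed(10)
# Placeholder counts standing in for vectorSentence (values are invented).
x <- rpois(500, lambda = 3) + 1

lambda_hat <- mean(x)                       # ML estimate of the Poisson mean
vals <- 0:max(x)
obs  <- as.vector(table(factor(x, levels = vals))) / length(x)
expd <- dpois(vals, lambda_hat)

# Largest gap between the empirical and fitted CDFs: the KS statistic,
# used here only as a descriptive distance, not for its p-value.
D <- max(abs(cumsum(obs) - ppois(vals, lambda_hat)))
D

# Visual comparison of observed and fitted relative frequencies.
barplot(rbind(observed = obs, poisson = expd), beside = TRUE,
        names.arg = vals, legend.text = TRUE)
```

How large a gap is acceptable is a judgment call that depends on the application; the plot is usually more informative than the single number.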
[R] strange fluctuations in system.time with kernapply
Hello expeRts, here is something which strikes me as kind of odd and I would like to ask for some enlightenment: First let's do this: tkern - kernel(modified.daniell, c(5,5)) test - rep(1,100) system.time(kernapply(test,tkern)) User System verstrichen 1.100 0.040 1.136 That was easy. Now this: test - rep(1,110) system.time(kernapply(test,tkern)) User System verstrichen 1.400.021.43 Still fine. Now this: test - rep(1,111) system.time(kernapply(test,tkern)) User System verstrichen 1.390 0.020 1.409 Ok, by now it seems boring. But wait: test - rep(1,1110300) system.time(kernapply(test,tkern)) User System verstrichen 12.270 0.030 12.319 There is a sudden - and repeatable! - jump in the time needed to execute kernapply. At least from a naive point of view there should not be much difference between applying a kernel to a vector 111 or 1110300 entries long. But maybe there is some limit here? So I tried this: test - rep(1,1110400) system.time(kernapply(test,tkern)) User System verstrichen 1.960.011.97 which doesn't fit into the pattern. But the best thing is still to come. When I try this test - rep(1,1110308) system.time(kernapply(test,tkern)) then the computer starts to run and does so for longer than 15 minutes until when I normally kill the process. As noted above this behaviour is repeatable and occurs every time I issue these commands. I really would like to know if there is some magic to the number 1110308 I'm not aware of. Last but not least, here is my sessionInfo() R version 2.10.1 (2009-12-14) x86_64-pc-linux-gnu locale: [1] LC_CTYPE=de_DE.utf8 LC_NUMERIC=C [3] LC_TIME=de_DE.utf8LC_COLLATE=de_DE.utf8 [5] LC_MONETARY=C LC_MESSAGES=de_DE.utf8 [7] LC_PAPER=de_DE.utf8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=de_DE.utf8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base loaded via a namespace (and not attached): [1] tools_2.10.1 Thank you, Alex -- Dipl.-Phys. 
Alexander Senger
Humboldt-Universitaet zu Berlin
AG Quantenoptik und Metrologie
Hausvogteiplatz 5-7
10117 Berlin, Germany
Tel : +49 30 2093 4941
Fax : +49 30 2093 4718
Email : sen...@physik.hu-berlin.de
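One possible explanation, which is not confirmed anywhere in this thread: for a plain vector, kernapply() computes the convolution with fft(), and fft() is fast only when the length factors into small primes. A length with a large prime factor can then be dramatically slower than its neighbours. The sketch below checks the factorization of the lengths quoted in the post; largest_prime_factor is a helper written for this illustration:

```r
# Largest prime factor of n by trial division (fine for lengths of this size).
largest_prime_factor <- function(n) {
  p <- 2
  largest <- 1
  while (p * p <= n) {
    while (n %% p == 0) {
      largest <- p
      n <- n / p
    }
    p <- p + 1
  }
  if (n > 1) largest <- n
  largest
}

lens <- c(1110300, 1110308, 1110400)   # lengths quoted in the post
sapply(lens, largest_prime_factor)

# nextn() gives the next length that is a product of 2s, 3s and 5s.
nextn(1110308)
```

If the slow lengths turn out to have large prime factors, trimming or padding the series to a highly composite length is one thing to try, bearing in mind that padding changes the values near the padded end.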
Re: [R] read.csv fails to read a CSV file from google docs
Hello Duncan,

Thank you for having a look at this. I tried the code you provided, but it failed at the getForm stage. Running this:

> tt = getForm("http://spreadsheets0.google.com/spreadsheet/pub",
+   hl = "en", key = "0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE",
+   single = "true", gid = 0,
+   output = "csv",
+   .opts = list(followlocation = TRUE, verbose = TRUE))

resulted in the following error:

Error in curlPerform(url = url, headerfunction = header$update, curl = curl, :
  SSL certificate problem, verify that the CA cert is OK. Details:
error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed

Did I miss some step?

Contact Details:
Contact me: tal.gal...@gmail.com | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)

On Fri, Apr 29, 2011 at 9:18 PM, Duncan Temple Lang dun...@wald.ucdavis.edu wrote:
> Thanks David for fixing the early issues.
>
> The reason for the failure is that the response from the Web server is to redirect the requester to another page, specifically
>
>   https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
>
> Note that this is https, not http, and the built-in URL reading facilities in R don't support https.
>
> One way to see this is to look at the headers in your browser (e.g. Live HTTP Headers), or to use curl, or the RCurl package:
>
>   tt = getForm("http://spreadsheets0.google.com/spreadsheet/pub",
>                hl = "en", key = "0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE",
>                single = "true", gid = 0,
>                output = "csv",
>                .opts = list(followlocation = TRUE, verbose = TRUE))
>
> The verbose option shows the entire dialog, and tt contains the text of the CSV document.
>
>   read.csv(textConnection(tt))
>
> then yields the data frame.
>
> D.
>
> On 4/29/11 10:36 AM, David Winsemius wrote:
>> On Apr 29, 2011, at 11:19 AM, Tal Galili wrote:
>>> Hello all,
>>> I wish to use read.csv to read a google doc spreadsheet.
>>> I try using the following code:
>>>
>>>   data_url <- "http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv"
>>>   read.csv(data_url)
>>>
>>> Which results in the following error:
>>>
>>>   Error in file(file, "rt") : cannot open the connection
>>>
>>> I'm on Windows 7, and the code was tried on R 2.12 and 2.13.
>>> I remember trying this a few months ago and it worked fine.
>>
>> I am always amused at such claims. Occasionally they are correct, but more often a crucial step has been omitted. In this case you have at a minimum embedded line-feeds in your URL string and have not established a connection, so it could not possibly have succeeded as presented.
>>
>> But now it's time to admit I do not know why it is not succeeding when I correct those flaws.
>>
>>   > closeAllConnections()
>>   > data_url <- url("http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv")
>>   > read.csv(data_url)
>>   Error in open.connection(file, "rt") : cannot open the connection
>>
>>   > closeAllConnections()
>>   > dd <- read.csv(con <- url("http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv"))
>>   Error in open.connection(file, "rt") : cannot open the connection
>>
>> So, I guess I'm not reading the help pages for `url` and `read.csv` as well as I thought I was.
>>
>>> Any suggestion what might be causing this or how to solve it?
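The two-step approach Duncan describes (fetch the document as text, then parse that text) can be sketched roughly as below. This is only a sketch under assumptions: `parse_csv_text` and `fetch_sheet` are hypothetical helper names, the sample CSV text is made up for illustration, and `fetch_sheet` is defined but not called because it needs network access and the RCurl package.

```r
# Parse CSV that is already held in a character string,
# in the spirit of read.csv(textConnection(tt)).
parse_csv_text <- function(txt) {
  con <- textConnection(txt)
  on.exit(close(con))  # make sure the connection is released
  read.csv(con, stringsAsFactors = FALSE)
}

# Hypothetical helper: fetch a published sheet with RCurl and parse it.
# Not called here (needs network access and the RCurl package).
fetch_sheet <- function(key) {
  tt <- RCurl::getForm("http://spreadsheets0.google.com/spreadsheet/pub",
                       hl = "en", key = key, single = "true", gid = "0",
                       output = "csv",
                       .opts = list(followlocation = TRUE))
  parse_csv_text(tt)
}

# Made-up text standing in for the downloaded document
sample_txt <- "id,value\n1,10\n2,20\n"
d <- parse_csv_text(sample_txt)
print(d)
```

Keeping the parsing step separate from the download makes it possible to check the parsing on a small string before worrying about redirects or certificates.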
Re: [R] setting options only inside functions
-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of luke-tier...@uiowa.edu
Sent: Friday, April 29, 2011 9:35 AM
To: Jonathan Daily
Cc: r-help@r-project.org; Hadley Wickham; Barry Rowlingson
Subject: Re: [R] setting options only inside functions

> The Python solution does not extend, at least not cleanly, to things like dev on/dev off or to Hadley's locale example. In any case, if I am reading the Python source correctly on how they handle user interrupts, this solution has the same non-robustness to user interrupts that Bill's initial solution had.
>
> As a basis I believe what we need is a mechanism that handles a setup, an action, and a cleanup, with setup and cleanup occurring with interrupts disabled and the action with interrupts enabled. Scheme's dynamic wind is similar, though I don't believe the Scheme standard addresses interrupts and we don't need to worry about continuations, but some of the issues are similar. Probably we would want two flavors: one in which the action has to be a function that takes as a single argument the result produced by the setup code, and one in which the action can be an argument expression that is then evaluated at the appropriate place by lazy evaluation.
>
> This can be done at the R level except for the controlling of interrupts (and possibly other asynchronous stuff); that would need a new pair of primitives (suspendInterrupts/enableInterrupts or something like that).
>
> There is something in the Haskell literature on this that I have looked at a while back; probably time to have another look.

Luke,

A similar problem is that if optionList contains an illegal option, then options(optionList) will commit changes to .Options as it works its way down the optionList until it hits the illegal option, when it throws an error. Then the following on.exit is never called (it wouldn't have the output of options(optionList) to work on if it were called) and the settings from optionList made so far stick around forever. E.g.,

> withOptions <- function(optionList, expr) {
+   oldOpt <- options(optionList)
+   on.exit(options(oldOpt))
+   expr
+ }
> getOption("height")
NULL
> getOption("width")
[1] 80
> withOptions(list(height=10, width=-2), 666)
Error in options(optionList) :
  invalid 'width' parameter, allowed 10...1
> getOption("height")
[1] 10
> getOption("width")
[1] 80

I haven't checked to see if par() works in the same way; it does in S+.

An ignoreInterrupts(expr) function would not help in that case. Making options() (and par()) atomic operations would help, but that may be a lot of work. options() might also warn but not change .Options if there were an attempt to set an illegal option.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> On Thu, 28 Apr 2011, Jonathan Daily wrote:
>> I would also love to see this implemented in R, as my current solution to the issue of doing tons of open/close, dev/dev.off, etc. is to use snippets in my IDE, and in the end I feel like it is a hack job. A pythonic "with" function would also solve most of the situations where I have had to use awkward try or tryCatch calls.
>>
>> I would be willing to help with this project, even if it is just testing.
>>
>> On Wed, Apr 27, 2011 at 5:43 PM, Barry Rowlingson b.rowling...@lancaster.ac.uk wrote:
>>>> but it's a little clumsy, because
>>>>
>>>>   with_connection(file("myfile.txt"), {do stuff...})
>>>>
>>>> isn't very useful because you have no way to reference the connection that you're using. Ruby's blocks have arguments which would require big changes to R's syntax. One option would be to use pronouns:
>>>
>>> Looking very much like python 'with' statements:
>>>
>>>   http://effbot.org/zone/python-with-statement.htm
>>>
>>> Implemented via the 'with' statement, which can operate on anything that has an __enter__ and an __exit__ method. Very neat.
>>>
>>> Barry
>
> --
> Luke Tierney
> Statistics and Actuarial Science
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa                  Phone: 319-335-3386
> Department of Statistics and        Fax:   319-335-3017
>    Actuarial Science
> 241 Schaeffer Hall                  email: l...@stat.uiowa.edu
> Iowa City, IA 52242                 WWW:   http://www.stat.uiowa.edu
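One way to work around the partial-commit problem Bill describes, without any change to options() itself, is to take the snapshot and register the cleanup before touching anything. The following is a minimal sketch, not a proposal from the thread: `with_options` is a hypothetical name, and it does nothing about the interrupt issue Luke raises (an interrupt arriving between the snapshot and the on.exit registration is still unprotected).

```r
with_options <- function(option_list, expr) {
  nms <- names(option_list)
  # Snapshot the current values first; options that are unset come back as NULL
  old <- setNames(lapply(nms, getOption), nms)
  # Register the restore BEFORE changing anything, so that a failure
  # partway through options() still triggers it
  on.exit(options(old), add = TRUE)
  options(option_list)
  expr  # the promise is forced here, i.e. under the new settings
}

# Successful use: digits is 3 inside the call only
inside <- with_options(list(digits = 3), getOption("digits"))

# Failing use, modelled on the example above: width = -2 is rejected,
# but the earlier 'height' change is rolled back
h_before <- getOption("height")
w_before <- getOption("width")
res <- try(with_options(list(height = 10, width = -2), 666), silent = TRUE)
h_after <- getOption("height")
w_after <- getOption("width")
```

The design choice is simply ordering: because the old values are read with getOption() rather than taken from the return value of options(), the cleanup does not depend on options() having completed.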
[R] Use nparcomp function from nparcomp library to run post hoc
Dear list, I tried to use the nparcomp to run some post hoc non-parametric comparison and got and error. Error in uniroot(pfct, interval = interval) : f() values at end points not of opposite sign Appreciate any comments. the command line: nparcomp(Ulceration~Group,data=test,type='Dunnett',control='Non-treated') Jun === data as follows structure(list(Group = c(Duoderm, Duoderm, Duoderm, Duoderm, Duoderm, Duoderm, Duoderm, Duoderm, Duoderm, Duoderm, Duoderm, Duoderm, Duoderm, Duoderm, Duoderm, Duoderm, Duoderm, Duoderm, Duoderm, Duoderm, Duoderm, Duoderm, Duoderm, Duoderm, Fibrase, Fibrase, Fibrase, Fibrase, Fibrase, Fibrase, Fibrase, Fibrase, Fibrase, Fibrase, Fibrase, Fibrase, Fibrase, Fibrase, Fibrase, Fibrase, Fibrase, Fibrase, Fibrase, Fibrase, Fibrase, Fibrase, Fibrase, Fibrase, Kollagenase, Kollagenase, Kollagenase, Kollagenase, Kollagenase, Kollagenase, Kollagenase, Kollagenase, Kollagenase, Kollagenase, Kollagenase, Kollagenase, Kollagenase, Kollagenase, Kollagenase, Kollagenase, Kollagenase, Kollagenase, Kollagenase, Kollagenase, Kollagenase, Kollagenase, Kollagenase, Kollagenase, Non-treated, Non-treated, Non-treated, Non-treated, Non-treated, Non-treated, Non-treated, Non-treated, Non-treated, Non-treated, Non-treated, Non-treated, Non-treated, Non-treated, Non-treated, Non-treated, Non-treated, Non-treated, Non-treated, Non-treated, Non-treated, Non-treated, Non-treated, Non-treated, Stimulen, Stimulen, Stimulen, Stimulen, Stimulen, Stimulen, Stimulen, Stimulen, Stimulen, Stimulen, Stimulen, Stimulen, Stimulen, Stimulen, Stimulen, Stimulen, Stimulen, Stimulen, Stimulen, Stimulen, Stimulen, Stimulen, Stimulen, Stimulen, Vehicle, Vehicle, Vehicle, Vehicle, Vehicle, Vehicle, Vehicle, Vehicle, Vehicle, Vehicle, Vehicle, Vehicle, Vehicle, Vehicle, Vehicle, Vehicle, Vehicle, Vehicle, Vehicle, Vehicle, Vehicle, Vehicle, Vehicle, Vehicle ), Ulceration = c(5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5), Inflamation = c(3, 4, 3, 2, 3, 3, 4, 4, 2, 2, 3, 3, 3, 3, 3, 3, 3, 4, 3, 3, 3, 4, 3, 3, 2, 3, 3, 4, 3, 3, 2, 4, 4, 4, 4, 4, 4, 4, 3, 3, 4, 3, 5, 3, 3, 4, 4, 3, 3, 2, 4, 2, 3, 3, 4, 3, 4, 3, 3, 4, 3, 4, 2, 3, 3, 4, 2, 3, 4, 3, 2, 3, 3, 3, 2, 3, 2, 2, 2, 2, 4, 3, 2, 3, 3, 4, 3, 3, 4, 3, 4, 2, 4, 3, 4, 2, 4, 3, 4, 3, 2, 2, 2, 2, 3, 2, 3, 2, 4, 3, 2, 4, 4, 4, 2, 2, 3, 3, 2, 4, 3, 2, 3, 2, 2, 2, 4, 2, 3, 2, 3, 2, 3, 3, 3, 4, 3, 3, 4, 4, 2, 3, 2, 3), Fibroplasia = c(4, 4, 4, 4, 4, 3, 4, 4, 4, 3, 4, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 3, 4, 2, 4, 4, 3, 2, 4, 4, 4, 4, 4, 4, 3, 3, 3, 4, 3, 3, 3, 4, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 3, 4, 3, 3, 3, 4, 4, 3, 4, 4, 4, 3, 4, 4, 4, 4, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 3, 4, 4, 4, 3, 4, 4, 3, 4, 3, 2, 3, 3, 4, 3, 3, 4, 4, 3, 3, 3, 4, 3, 3, 4, 4, 4, 4, 4, 4, 4, 3, 4, 4, 4, 3, 3, 4, 4, 4, 4, 4, 3, 4, 3, 4, 4, 4, 4, 4, 4, 4, 4), Fibrosis.and.Adexnal.Atrophy = c(4, 4, 4, 3, 4, 4, 4, 4, 4, 3, 4, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 3, 4, 3, 4, 4, 3, 3, 4, 4, 4, 4, 4, 4, 4, 3, 3, 4, 3, 4, 4, 4, 3, 4, 3, 3, 3, 3, 3, 4, 4, 4, 3, 4, 4, 4, 4, 4, 4, 3, 3, 4, 4, 3, 3, 4, 4, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 3, 4, 4, 4, 4, 4, 3, 4, 4, 4, 3, 4, 3, 4, 3, 4, 4, 3, 4, 3, 2, 3, 3, 4, 4, 3, 4, 4, 3, 3, 3, 4, 3, 3, 4, 4, 4, 4, 4, 4, 4, 3, 4, 4, 4, 4, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4), Inflammation = c(2, 2, 2, 1, 1, 1, 2, 3, 1, 2, 1, 1, 1, 1, 2, 1, 2, 2, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 2, NA, 1, 1, 1, 2, 2, 2, 1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, NA, 2, 1, 1, 1, 1, 2, 1, 1, 
1, 1, 1, 1, 1, 2, 1, 2, 1, 2, 2, 1, 1, 2, 1, 2, 2, 1, 1, 2, 1, 1, 1, 2, 1, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 2, 2), Fibroplasia.1 = c(4, 4, 4, 4, 4, 3, 4, 4, 4, 3, 4, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 4, 3, NA, 4, 4, 4, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 3, 4, 3, 4, 4, 4, 4, 3, 3, 4, 4, 4, 3, 4, 3, 4, 4, 3, 4, 4, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 3, 3, 3, NA, 4, 4, 4, 4, 3, 4, 3, 3, 3, 3, 3, 4, 2, 4, 3, 4, 4, 3, 4, 4, 2, 3, 2, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 4, 3, 3, 4, 4, 4, 4, 3, 3, 4, 3, 3, 4, 4, 4, 4, 3, 4, 4), Fibrosis = c(3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, NA, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, NA, 3, 3, 3, 3, 3, 2, 2, 3, 3, 3, 3, 3, 2, 3, 3, 3,
Re: [R] For loop and sqldf
Hi:

Try

  split(DF, DF$C)

Does that work?

Dennis

On Fri, Apr 29, 2011 at 1:27 PM, mathijsdevaan mathijsdev...@gmail.com wrote:
> Hi list,
>
> Can anyone tell me why the following does not work? Thanks a lot! Your help is very much appreciated.
>
> DF = data.frame(read.table(textConnection(" B C D E F G
> 8025 1995 0 4 1 2
> 8025 1997 1 1 3 4
> 8026 1995 0 7 0 0
> 8026 1996 1 2 3 0
> 8026 1997 1 2 3 1
> 8026 1998 6 0 0 4
> 8026 1999 3 7 0 3
> 8027 1997 1 2 3 9
> 8027 1998 1 2 3 1
> 8027 1999 6 0 0 2
> 8028 1999 3 7 0 0
> 8029 1995 0 2 3 3
> 8029 1998 1 2 3 2
> 8029 1999 6 0 0 1"),head=TRUE,stringsAsFactors=FALSE))
>
> list <- sort(unique(DF$C))
> for (t in 1:length(list)) {
>   year = as.character(list[t])
>   data[year] <- sqldf('select * from DF where C = [year]')
> }
>
> I am trying to split up the data.frame into 5 new ones, one for every year.
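To make Dennis's suggestion concrete, here is a small sketch using the data from the question. The name `by_year` is just an illustrative choice; the result is a named list with one data frame per distinct value of C.

```r
# Data from the original question
DF <- read.table(text = "B C D E F G
8025 1995 0 4 1 2
8025 1997 1 1 3 4
8026 1995 0 7 0 0
8026 1996 1 2 3 0
8026 1997 1 2 3 1
8026 1998 6 0 0 4
8026 1999 3 7 0 3
8027 1997 1 2 3 9
8027 1998 1 2 3 1
8027 1999 6 0 0 2
8028 1999 3 7 0 0
8029 1995 0 2 3 3
8029 1998 1 2 3 2
8029 1999 6 0 0 1", header = TRUE)

# One data frame per year, stored in a list named by year
by_year <- split(DF, DF$C)

names(by_year)           # the years present in column C
by_year[["1995"]]        # all rows (and all columns) for 1995
```

Individual years are then available as `by_year[["1997"]]` and so on, and `lapply(by_year, nrow)` gives the number of rows per year.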
Re: [R] read.csv fails to read a CSV file from google docs
Hi Tal,

You can add

  ssl.verifypeer = FALSE

to the .opts list so that the certificate is simply accepted.

Alternatively, you can tell libcurl where to find the certification authority file containing signatures. This can be done via the cainfo option, e.g.

  cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl"),

Often such a collection of certificates is installed with the ssl library.

D.

On 4/29/11 2:42 PM, Tal Galili wrote:
> Hello Duncan,
> Thank you for having a look at this.
>
> I tried the code you provided, but it failed at the getForm stage. Running this:
>
> > tt = getForm("http://spreadsheets0.google.com/spreadsheet/pub",
> +   hl = "en", key = "0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE",
> +   single = "true", gid = 0,
> +   output = "csv",
> +   .opts = list(followlocation = TRUE, verbose = TRUE))
>
> resulted in the following error:
>
> Error in curlPerform(url = url, headerfunction = header$update, curl = curl, :
>   SSL certificate problem, verify that the CA cert is OK. Details:
> error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
>
> Did I miss some step?
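The two alternatives Duncan mentions could be collected in a small helper that builds the `.opts` list. This is a sketch only: `ssl_opts` is a hypothetical name, and the helper merely assembles options; nothing is downloaded here. Note that turning off peer verification accepts any certificate, so pointing libcurl at a CA bundle is generally the safer of the two.

```r
# Build an .opts list for RCurl::getForm().
# verify = FALSE corresponds to ssl.verifypeer = FALSE (certificate accepted unchecked);
# verify = TRUE points libcurl at the CA bundle shipped with RCurl via cainfo.
ssl_opts <- function(verify = TRUE) {
  if (!verify) {
    return(list(ssl.verifypeer = FALSE, followlocation = TRUE))
  }
  # system.file() returns "" if the file or the package cannot be found
  ca <- system.file("CurlSSL", "cacert.pem", package = "RCurl")
  list(cainfo = ca, followlocation = TRUE)
}

opts <- ssl_opts(verify = TRUE)
str(opts)
```

The resulting list would be passed as `.opts = opts` in the getForm() call quoted above.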
Re: [R] RCurl and postForm()
Hi Ryan,

postForm() is using a different style (or specifically Content-Type) of submitting the form than the curl -d command. Switching to style = 'POST' uses the same type, but at a quick guess, the parameter name 'a' is causing confusion and the result is the empty JSON array, "[]".

A quick workaround is to use curlPerform() directly rather than postForm():

  r = dynCurlReader()
  curlPerform(postfields = 'Archbishop Huxley',
              url = 'http://www.datasciencetoolkit.org/text2people',
              verbose = TRUE, post = 1L, writefunction = r$update)
  r$value()

This yields

  [1] "[{\"gender\":\"u\",\"first_name\":\"\",\"title\":\"archbishop\",\"surnames\":\"Huxley\",\"start_index\":0,\"end_index\":17,\"matched_string\":\"Archbishop Huxley\"}]"

and you can use fromJSON() to transform it into data in R.

D.

On 4/29/11 12:14 PM, Elmore, Ryan wrote:
> Hi everybody,
>
> I think that I am missing something fundamental in how strings are passed from a postForm() call in R to the curl or libcurl functions underneath. For example, I can do the following using curl from the command line:
>
>   $ curl -d "Archbishop Huxley" "http://www.datasciencetoolkit.org/text2people"
>   [{"gender":"u","first_name":"","title":"archbishop","surnames":"Huxley","start_index":0,"end_index":17,"matched_string":"Archbishop Huxley"}]
>
> Trying the same thing, or what I *think* is the same thing (obviously not), in R (Mac OS 10.6.7, R 2.13.0) produces:
>
>   > library(RCurl)
>   Loading required package: bitops
>   > api <- "http://www.datasciencetoolkit.org/text2people"
>   > postForm(api, a="Archbishop Huxley")
>   [1] "[{\"gender\":\"u\",\"first_name\":\"\",\"title\":\"archbishop\",\"surnames\":\"Huxley\",\"start_index\":44,\"end_index\":61,\"matched_string\":\"Archbishop Huxley\"},{\"gender\":\"u\",\"first_name\":\"\",\"title\":\"archbishop\",\"surnames\":\"Huxley\",\"start_index\":88,\"end_index\":105,\"matched_string\":\"Archbishop Huxley\"}]"
>   attr(,"Content-Type")
>                    charset
>   "text/html"      "utf-8"
>
> I can match the result given on the DSTK API's website by using system(), but that doesn't seem like the R-like way of doing something.
>
>   > system("curl -d 'Archbishop Huxley' 'http://www.datasciencetoolkit.org/text2people'")
>   158 141 141 141 0[{"gender":"u","first_name":"","title":"archbishop","surnames":"Huxley","start_index":0,"end_index":17,"matched_string":"Archbishop Huxley"}]17599 72 --:--:-- --:--:-- --:--:-- 670
>
> If you want to see some additional information related to this question, I posted on StackOverflow a few days ago:
> http://stackoverflow.com/questions/5797688/post-request-using-rcurl
>
> I am working on this R wrapper for the Data Science Toolkit as a way of illustrating how to make an R package for the Denver RUG and ran into this problem. Any help with this problem will be greatly appreciated by the Denver RUG!
>
> Cheers,
> Ryan
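Duncan's workaround could be wrapped in a function along the following lines. This is a sketch under assumptions: `text2people` is a hypothetical wrapper name, it is defined but not called here (it needs the RCurl package and network access), and the parsing step uses jsonlite::fromJSON, which is a substitution on my part since the message does not say which package's fromJSON() is meant. The sample JSON is the response quoted in the thread.

```r
# Hypothetical wrapper around the curlPerform() call shown above.
# Not executed here: it requires RCurl and a reachable server.
text2people <- function(text,
                        url = "http://www.datasciencetoolkit.org/text2people") {
  r <- RCurl::dynCurlReader()
  RCurl::curlPerform(postfields = text, url = url,
                     post = 1L, writefunction = r$update)
  r$value()
}

# Response text quoted in the thread, used in place of a live call
sample_json <- '[{"gender":"u","first_name":"","title":"archbishop","surnames":"Huxley","start_index":0,"end_index":17,"matched_string":"Archbishop Huxley"}]'

# Assumption: jsonlite is used for parsing; skipped if it is not installed
if (requireNamespace("jsonlite", quietly = TRUE)) {
  people <- jsonlite::fromJSON(sample_json)
  print(people$surnames)
}
```

With jsonlite, an array of objects like this comes back as a data frame with one row per match, which is a convenient shape for a wrapper package to return.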
Re: [R] Use nparcomp function from nparcomp library to run post hoc
Hi:

Is this the function nparcomp() in the nparcomp package or the one from the mutoss package? When using functions from packages, it is useful to indicate the package name. I'm assuming you're using the nparcomp package, because your code worked for me when that package was loaded:

> library(nparcomp)
Loading required package: multcomp
Loading required package: mvtnorm
Loading required package: survival
Loading required package: splines
> nparcomp(Ulceration~Group,data=test,type='Dunnett',control='Non-treated')

 Nonparametric Multiple Comparison Procedure based on relative contrast effects , Type of Contrast : Dunnett
 NOTE:
 *---Weight Matrix--*
 - Weight matrix for choosen contrast based on all-pairs comparisons

 *---Analysis of relative effects---*
 - Simultaneous Confidence Intervals for relative effects p(i,j) with confidence level 0.95
 - Method = Multivariate Delta-Method (Logit)
 - p-Values for H_0: p(i,j)=1/2

 *Interpretation*
 p(a,b) > 1/2 : b tends to be larger than a

 *--Mult.Distribution---*
 - Equicoordinate Quantile
 - Global p-Value
 *--*

$weight.matrix
snipped for brevity - all zeros

$Data.Info
       Sample Size
1      Duoderm   24
2      Fibrase   24
3  Kollagenase   24
4  Non-treated   24
5     Stimulen   24
6      Vehicle   24

$Analysis.of.relative.effects
                  Comparison rel.effect confidence.interval t.value p.value.adjusted p.value.unadjusted
1     p(Non-treated,Duoderm)        0.5   [ 0.499 ; 0.501 ]       0                1                  1
2     p(Non-treated,Fibrase)        0.5   [ 0.499 ; 0.501 ]       0                1                  1
3 p(Non-treated,Kollagenase)        0.5   [ 0.499 ; 0.501 ]       0                1                  1
4    p(Non-treated,Stimulen)        0.5   [ 0.499 ; 0.501 ]       0                1                  1
5     p(Non-treated,Vehicle)        0.5   [ 0.499 ; 0.501 ]       0                1                  1

$Mult.Distribution
  Quantile p.Value.global
1 2.568766              1

$Correlation
[1] NA

A graphic also appears indicating zero effect, which is what one would expect since Ulceration = 5 for every observation in the data frame.

> sessionInfo()
R version 2.13.0 (2011-04-13)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] splines   stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] nparcomp_1.0-1  multcomp_1.2-5  survival_2.36-9 mvtnorm_0.9-999
[5] sos_1.3-0       brew_1.0-6      plyr_1.5.2

loaded via a namespace (and not attached):
[1] tcltk_2.13.0 tools_2.13.0

Check your version of R and the nparcomp package against this. If you have an older version of R or nparcomp, perhaps an upgrade is sufficient to fix the problem.

HTH,
Dennis

On Fri, Apr 29, 2011 at 2:49 PM, Jun Shen jun.shen...@gmail.com wrote:
> Dear list,
>
> I tried to use nparcomp to run some post hoc non-parametric comparisons and got an error.
>
> Error in uniroot(pfct, interval = interval) :
>   f() values at end points not of opposite sign
>
> Appreciate any comments.
>
> the command line:
> nparcomp(Ulceration~Group,data=test,type='Dunnett',control='Non-treated')
>
> Jun
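For readers unfamiliar with the uniroot() message in the original question: uniroot() searches for a sign change of a function over an interval, and it stops with exactly this error when the function has the same sign at both end points. The sketch below reproduces the message with a function that has no real root, and then shows a quick check for the degenerate situation Dennis points out (a response that is constant across all observations). Whether the constant response is what triggered the error inside the older nparcomp version is an assumption, not something the thread establishes; `test_small` is a made-up miniature stand-in for the poster's data.

```r
# x^2 + 1 is positive everywhere, so there is no sign change on [-1, 1]
f <- function(x) x^2 + 1
msg <- tryCatch(uniroot(f, interval = c(-1, 1)),
                error = function(e) conditionMessage(e))
print(msg)

# Made-up miniature version of the data: every Ulceration score is 5
test_small <- data.frame(
  Group = rep(c("Non-treated", "Vehicle"), each = 3),
  Ulceration = rep(5, 6)
)

# A response with a single distinct value carries no information
# about group differences
is_constant <- length(unique(test_small$Ulceration)) == 1
print(is_constant)
```

Running a check like the last two lines on the response before calling a comparison procedure makes it clear up front that no group effect can be estimated from that variable.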
Re: [R] Use nparcomp function from nparcomp library to run post hoc
Hi, Dennis,

Thanks for the reply. I upgraded to R 2.13.0. Then, when I tried to load the package with library(nparcomp), I got an error:

Error: package 'mvtnorm' is not installed for 'arch=i386'

What does that mean? Thanks.

Jun

On Fri, Apr 29, 2011 at 5:49 PM, Dennis Murphy djmu...@gmail.com wrote:
> Check your version of R and the nparcomp package against this. If you have an older version of R or nparcomp, perhaps an upgrade is sufficient to fix the problem.
>
> HTH,
> Dennis
Re: [R] For loop and sqldf
On Apr 29, 2011, at 4:27 PM, mathijsdevaan wrote:
> Hi list,
>
> Can anyone tell me why the following does not work? Thanks a lot! Your help is very much appreciated.
>
> DF = data.frame(read.table(textConnection("B C D E F G
> 8025 1995 0 4 1 2
> 8025 1997 1 1 3 4
> 8026 1995 0 7 0 0
> 8026 1996 1 2 3 0
> 8026 1997 1 2 3 1
> 8026 1998 6 0 0 4
> 8026 1999 3 7 0 3
> 8027 1997 1 2 3 9
> 8027 1998 1 2 3 1
> 8027 1999 6 0 0 2
> 8028 1999 3 7 0 0
> 8029 1995 0 2 3 3
> 8029 1998 1 2 3 2
> 8029 1999 6 0 0 1"),head=TRUE,stringsAsFactors=FALSE))
>
> list <- sort(unique(DF$C)) ; require(sqldf); data <- list()  # added inits
> for (t in 1:length(list)) {
>   year = as.character(list[t])
>   data[year] <- sqldf('select * from DF where C = [year]')

I see you have already gotten a workable answer, but thought you might want to see if this would work:

    data[year] <- sqldf(paste('select * from DF where C = ', year, sep="") )

Two changes: let `year` get evaluated, and don't put `year` in brackets.

> }

> data
$`1995`
[1] 8025 8026 8029

$`1996`
[1] 8026

$`1997`
[1] 8025 8026 8027

$`1998`
[1] 8026 8027 8029

$`1999`
[1] 8026 8027 8028 8029

> I am trying to split up the data.frame into 5 new ones, one for every year.

--
David Winsemius, MD
West Hartford, CT
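David's two changes amount to building the SQL text in R before handing it to sqldf. A sketch of that idea follows, with a few assumptions stated up front: the variable names are illustrative, sprintf() is used in place of paste() purely as a matter of taste, and the sqldf step is skipped when the package is not installed. One further observation: the printed output above contains only the values of column B for each year, because assigning a data frame with single brackets (`data[year] <-`) stores just its first column; using `[[` keeps the whole data frame.

```r
DF <- read.table(text = "B C D E F G
8025 1995 0 4 1 2
8025 1997 1 1 3 4
8026 1995 0 7 0 0
8026 1996 1 2 3 0
8026 1997 1 2 3 1
8026 1998 6 0 0 4
8026 1999 3 7 0 3
8027 1997 1 2 3 9
8027 1998 1 2 3 1
8027 1999 6 0 0 2
8028 1999 3 7 0 0
8029 1995 0 2 3 3
8029 1998 1 2 3 2
8029 1999 6 0 0 1", header = TRUE)

years <- sort(unique(DF$C))

# Build one query string per year; the year is substituted by R,
# so SQL never sees the name `year`
queries <- sprintf("select * from DF where C = %d", as.integer(years))
names(queries) <- years
print(queries[["1995"]])

# Run the queries only if sqldf is available
if (requireNamespace("sqldf", quietly = TRUE)) {
  library(sqldf)
  data <- list()
  for (y in names(queries)) {
    data[[y]] <- sqldf(queries[[y]])  # [[ ]] stores the whole data frame
  }
}
```

For a column that holds text rather than numbers, the substituted value would also need to be quoted inside the SQL string.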