[R] object of class madata
Hi,

I am using the maanova package to run an ANOVA, but I am getting the errors shown below. Please help me with this.

TGR=read.madata(rmaexpr.dat,designfile=design.dat)
Reading one color array. Otherwise change arrayType='twoColor' then read the data again
Warning messages:
1: In read.madata(rmaexpr.dat, designfile = design.dat) :
  Assume that the first column is probeid. If you have probeid specify it, otherwise set 'probeid=0' then read the data again
2: In read.madata(rmaexpr.dat, designfile = design.dat) :
  Assume that intensity value is saved from the second column. Otherwise provide 'intensity' (first column storing intensity) information, and read the data again

fit.fix=fitmaanova(TGR,formula=sample,random=~1)
Error in x$terms : object of type 'closure' is not subsettable

fit.fix=fitmaanova(TGR,formula=sample,random=~1)
Error in fitmaanova(TGR, formula = sample, random = ~1) :
  The first input variable is not an object of class madata.

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Three-way Panel Data Analysis
Dear R users,

I have panel data on the amount of money spent by travellers from 8 origin countries in 4 destinations. I would like to carry out an analysis across destinations, origins and time. However, it seems to me that the plm package can only estimate two-way panel models (data indexed by a two-dimensional array). Any suggestions would be greatly appreciated.

Thank you.

Best regards,
Danice
Re: [R] ed50
-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Dipa Hari
Sent: Monday, 12 July 2010 22:19
To: r-help@r-project.org
Subject: [R] ed50

I am using a semiparametric model:

library(mgcv)
sm1=gam(y~x1+s(x2),family=binomial, f)

How should I find the standard error of ED50 for the above model?

ED50 = (-sm1$coef[1] - f(x2)) / sm1$coef[2]

where f(x2) is the estimated value of the nonparametric term.

Thanks

Two ways:
1) Re-parameterise the model so that ED50 is an explicit parameter of the model, or
2) use a Taylor series approximation (aka the delta method) with the standard errors of coef[1] and coef[2] and their correlation.

HTH

Dr. Rubén Roa-Ureta
AZTI - Tecnalia / Marine Research Unit
Txatxarramendi Ugartea z/g
48395 Sukarrieta (Bizkaia)
SPAIN
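The second suggestion can be sketched for a simpler, fully parametric case. The example below is only an illustration under stated assumptions: it uses an ordinary logistic glm() on simulated data (not the gam() fit from the question), takes ED50 = -b0/b1, and applies the delta method with the covariance matrix returned by vcov(). The data, the variable names and the helper delta_se_ratio() are made up for this sketch. For the semiparametric model, the uncertainty of the smooth term would also have to enter the calculation.

```r
# Delta-method standard error for ED50 = -b0/b1 in a logistic regression.
# Simulated data; every name below is illustrative, not from the original post.
set.seed(1)
dose <- rep(seq(0, 10, by = 0.5), each = 20)
resp <- rbinom(length(dose), size = 1, prob = plogis(-3 + 0.6 * dose))

fit <- glm(resp ~ dose, family = binomial)

delta_se_ratio <- function(b, V) {
  # g(b0, b1) = -b0 / b1, so the gradient is (-1/b1, b0/b1^2)
  grad <- c(-1 / b[2], b[1] / b[2]^2)
  sqrt(drop(t(grad) %*% V %*% grad))
}

b <- unname(coef(fit))   # b[1] = intercept, b[2] = slope
V <- vcov(fit)           # estimated covariance matrix of the coefficients
ed50 <- -b[1] / b[2]
se_ed50 <- delta_se_ratio(b, V)
c(ED50 = ed50, SE = se_ed50)
```

The gradient form is used rather than a hand-expanded variance formula so that the same helper pattern extends to functions of more than two coefficients.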
[R] Equivalent of SAS's FIRST. And LAST. Variable in R?
Hi all,

I'm just wondering if there is an equivalent of SAS's FIRST. and LAST. variables in R? For example, suppose this is a snapshot of the data:

  ClientCode CaseCode       open      close Important
1         37       28 2003-07-08 2003-09-02         1
2         37      310 2003-11-01 2004-09-10         1
3         37     1562 2007-04-03 2007-07-27         1
4         38       29 2003-02-28 2007-09-05         1
5         38      599 2004-07-14 2007-10-31         1

For each client, if a case is important (Important = 1), I want to check its close date against the next open date for the same client. E.g. for Client 37, I want to get the difference between Case 310's open date and Case 28's close date, as well as find out whether Case 310 is Important.

I know how to get this using SAS's DATA step (with a BY statement and the FIRST. variable), but I'm having trouble finding the equivalent in R. Any suggestions will be greatly appreciated!

Cheers,
Kevin

Kevin Wang
Senior Adviser, Government Advisory Services
KPMG, 10 Shelley Street, Sydney NSW 2000, Australia
Re: [R] Equivalent of SAS's FIRST. And LAST. Variable in R?
Hi,

I expect there are better ways of doing this than the following, but you can probably get the same sort of thing using split():

bb = matrix(c(1,1,1,2,2,2,3,3,3,3,10,11,12,43,23,14,52,52,12,23),ncol=2,byrow=FALSE)
aa = split(bb,bb[,1])
sapply(aa,function(temp) {cc = matrix(temp,ncol=ncol(bb));cc[nrow(cc),2] - cc[1,2]})

Here the first column of bb is the ClientCode, and the sapply() call gives, for each client, the difference between the last and the first value in the second column of bb.

Martyn

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Wang, Kevin (SYD)
Sent: 13 July 2010 07:52
To: r-help@r-project.org
Cc: Wang, Kevin (SYD); wang.ke...@gmail.com
Subject: [R] Equivalent of SAS's FIRST. And LAST. Variable in R?

[...]
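A different way to get FIRST./LAST. style flags, sketched here in base R with duplicated() rather than split(), is shown below. The data frame is the snapshot from the question typed in by hand; the column names first, last, gap_days and next_important are invented for this illustration, and the approach assumes the rows can be sorted by client and open date.

```r
# Snapshot from the question, typed in by hand.
cases <- data.frame(
  ClientCode = c(37, 37, 37, 38, 38),
  CaseCode   = c(28, 310, 1562, 29, 599),
  open       = as.Date(c("2003-07-08", "2003-11-01", "2007-04-03",
                         "2003-02-28", "2004-07-14")),
  close      = as.Date(c("2003-09-02", "2004-09-10", "2007-07-27",
                         "2007-09-05", "2007-10-31")),
  Important  = c(1, 1, 1, 1, 1)
)

# Sort so that "first" and "last" are well defined within each client.
cases <- cases[order(cases$ClientCode, cases$open), ]

# Analogues of FIRST.ClientCode and LAST.ClientCode.
cases$first <- !duplicated(cases$ClientCode)
cases$last  <- !duplicated(cases$ClientCode, fromLast = TRUE)

# Look one row ahead, then blank the look-ahead on each client's last row
# so that values never leak from one client to the next.
next_open      <- c(cases$open[-1], NA)
next_important <- c(cases$Important[-1], NA)
next_open[cases$last]      <- NA
next_important[cases$last] <- NA

cases$gap_days       <- as.numeric(next_open - cases$close)  # days from close to next open
cases$next_important <- next_important
cases
```

The shift-by-one idea mirrors what a SAS DATA step does with a BY statement: the flags mark group boundaries, and the look-ahead is only kept inside a group.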
[R] modifying vector elements
Hello R-users,

I have a very large vector (a) containing elements consisting of numbers and letters, for example:

a
 [1] 1.11.2a   1.11.2d   1.11.2e   1.11.2f   1.11.2x1  1.11.2x1b
 [7] 1.11.2x2  1.11.2x2a 1.11.2x2b 1.11.2x3  1.11.2x4  1.11.2x4a
[13] 1.11.2x5  1.11.3a   1.11.3b   1.11.3x1  1.11.3x1b 1.11.3x1c
[19] 1.11.3x1d 1.12.1x1  1.12.1x1b 1.12.1x2  1.12.1x3  1.12.1x3b
[25] 1.12.1x4  1.3.1x1   1.3.4x1   1.3.6a    1.3.6c    1.3.6x1
[31] 1.3.6x1a  1.3.6x1b  1.3.6x1c  1.3.6x1d  1.3.6x1e  1.3.6x2
[37] 1.3.7a    1.3.7b    1.3.7c    1.3.7d    1.5.4a    1.5.4b
[43] 1.5.4x1a  1.5.4x1b  1.5.4x2   1.5.4x3   1.5.4x6   1.5.5b
[49] 1.5.6a    1.5.6b    1.5.6x1   1.5.6x2   1.5.7a    1.5.7b
[55] 1.5.7x1   1.5.7x2   1.7.1b    1.7.1c    1.7.1e    1.7.1f
[61] 1.7.1g    1.7.1i    1.7.1j    1.7.1k    1.7.1x1   1.7.1x2
[67] 1.7.1x3   1.7.1x4   1.7.1x5   1.7.2b    1.7.2x1   1.9.1a

How can I remove from each record everything that comes after the number following the second dot? E.g.: 1.11.2a becomes 1.11.2, 1.12.1x4 becomes 1.12.1, 1.9.1a becomes 1.9.1, and so forth.

Thanks
Lorenzo

Lorenzo Cattarino
PhD Candidate (Confirmed)
Landscape Ecology and Conservation Group
Centre for Spatial Environmental Research
School of Geography, Planning and Environmental Management
The University of Queensland
Brisbane, Queensland, 4072, Australia
Telephone 61-7-3365 4370, Mobile 0410884610
Email l.cattar...@uq.edu.au
Internet http://www.gpem.uq.edu.au/cser
Re: [R] modifying vector elements
Here is one way:

a_clip <- sub("^([0-9]+\\.[0-9]+\\.[0-9]+).*$", "\\1", a)

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Lorenzo Cattarino
Sent: Tuesday, 13 July 2010 5:20 PM
To: r-help@r-project.org
Subject: [R] modifying vector elements

[...]
Re: [R] modifying vector elements
On Tue, Jul 13, 2010 at 3:20 AM, Lorenzo Cattarino l.cattar...@uq.edu.au wrote:

Hello R-users, I have a very large vector (a) containing elements consisting of numbers and letters, for example:
a
[1] 1.11.2a 1.11.2d 1.11.2e 1.11.2f 1.11.2x1 1.11.2x1b
[...]
How can I remove from each record everything that comes after the number following the second dot? E.g.: 1.11.2a becomes 1.11.2, 1.12.1x4 becomes 1.12.1, 1.9.1a becomes 1.9.1, and so forth.

If they are all of the form shown, then the question is equivalent to removing the first alphabetic character, [[:alpha:]], and everything thereafter (.*), which is just this:

sub("[[:alpha:]].*", "", a)
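The two regular-expression approaches from this thread can be compared on a handful of the values from the question. This is a small check only; the five-element vector is a hand-picked subset, and the names keep_prefix and drop_suffix are made up for the comparison.

```r
# A few of the values from the question.
a <- c("1.11.2a", "1.11.2x1b", "1.12.1x4", "1.3.6x1e", "1.9.1a")

# Approach 1: capture the leading "digits.digits.digits" and discard the rest.
keep_prefix <- sub("^([0-9]+\\.[0-9]+\\.[0-9]+).*$", "\\1", a)

# Approach 2: delete the first letter and everything after it.
drop_suffix <- sub("[[:alpha:]].*", "", a)

keep_prefix
identical(keep_prefix, drop_suffix)
```

The two agree whenever the unwanted suffix starts with a letter. If a suffix could start with some other character, only the first approach would still cut at the right place.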
Re: [R] Matrix Column Names
"DN" == David Neu da...@davidneu.com on Mon, 12 Jul 2010 18:15:04 -0400 writes:

    DN> Hi, Is there a way to create a matrix in which the
    DN> column names are not checked to see if they are valid
    DN> variable names?

Why do you need that if you are really using a matrix, not a data frame?

    DN> I'm looking for something similar to the check.names
    DN> argument to data.frame.

That's a good idea. The relevant code inside data.frame() is simply

    if (check.names) vnames <- make.names(vnames, unique = TRUE)

    DN> If so, would such an approach work for the sparse matrix
    DN> classes in the Matrix package?

Using function make.names(), yes of course.
{but I'm still puzzled *why* you need this. If you only want somewhat *short* names, I'd rather use
    vnames <- abbreviate(vnames, 8)
 or variations of that such as
    vnames <- abbreviate(vnames, 8, method="both.sides")
 or also
    vnames <- abbreviate(vnames, 8, strict=FALSE)
}

    DN> Many thanks!

you're welcome!
Martin Maechler, ETH Zurich
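A short base-R illustration of the point made in the reply: column names of a plain matrix are stored as given, while data.frame() with its default check.names = TRUE passes them through make.names(). The example names "my var" and "2nd-col" are invented for the demonstration.

```r
m <- matrix(1:4, ncol = 2)
colnames(m) <- c("my var", "2nd-col")   # not syntactically valid R names
colnames(m)                             # a matrix keeps them unchanged

# data.frame() with check.names = TRUE (the default) rewrites them ...
df_names <- names(data.frame(m))

# ... which is what make.names() does when called directly.
fixed <- make.names(colnames(m), unique = TRUE)

# Opting out keeps the original names in the data frame as well.
raw_names <- names(data.frame(m, check.names = FALSE))
```

So the checking is a property of data.frame(), not of matrices, and make.names() is available on its own when valid names are wanted for a matrix.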
Re: [R] How to mean, min lists and numbers
On 07/13/2010 01:10 AM, g...@ucalgary.ca wrote:

I would like to sum/mean/min a list of lists and numbers to return the related lists.

-1+2*c(1,1,0)+2+c(-1,10,-1)

returns c(2,13,0), but

sum(1,2*c(1,1,0),2,c(-1,10,-1))

returns 15, not a list. Using the suggestion of Gabor Grothendieck,

Reduce('+',list(-1,2*c(1,1,0),2,c(-1,10,-1)))

returns what we want, c(2,13,0). However, it seems that this approach does not work for mean/min. So, how can I take the mean/min of a list of lists and numbers so that a list is returned? Thanks,

Hi James,
If you really have a list, and not a vector as in your example, look at the rapply function in the base package.

Jim
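For the elementwise mean and minimum asked about here, one possible sketch stays with the Reduce() idea from the question and adds pmin(); it does not use rapply(), which is the function suggested in the reply. The variable names are illustrative, and the sketch assumes every term is either a single number or a vector of the common length.

```r
terms <- list(-1, 2 * c(1, 1, 0), 2, c(-1, 10, -1))

# Elementwise sum, as in the question.
total <- Reduce(`+`, terms)

# Elementwise mean: the elementwise sum divided by the number of terms.
avg <- total / length(terms)

# Elementwise minimum: pmin() recycles the single numbers just as `+` does.
lowest <- do.call(pmin, terms)

total
avg
lowest
```

min() itself collapses everything to one number, which is why it cannot be dropped into Reduce() for this purpose; pmin() and pmax() are the elementwise counterparts.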
[R] Regarding R -installation
Dear R-help team members,

I am Venkatesh, a student at the University of Hyderabad. While installing R from the specified servers, I encountered the following problem. Please help me with this; I need R for my project. Thank you.

Problem: Cannot access installation media
http://download.opensuse.org/repositories/devel:languages:R:patched/openSUSE_11.2 (Medium 1).
Check whether the server is accessible
Download (curl) error for 'http://download.opensuse.org/repositories/devel:languages:R:patched/openSUSE_11.2/repodata/repomd.xml':
Error code: Connection failed
Error message: couldn't connect to host

Yours truly,
B. Venkatesh
University of Hyderabad, India
[R] Regarding accesing R- Repositories at servers
Dear R-help team,

I am Venkatesh, a student at the University of Hyderabad, India. I am unable to access the R repositories at your specified servers; I get an error to the effect that the media could not be accessed. Can you please help me with this? I look forward to your reply. Thank you.

With best wishes and regards,
B. Venkatesh
University of Hyderabad, India
Re: [R] How can i draw a graph with high and low data points
I have 5 columns: Trial.Group, Mean, Standard Deviation, Upper percentile, Lower percentile. For example:

Trial.Group: 41 subjects: 3 to 4 yrs-Male
Mean: 444
SD: 25
Upper: 494
Lower: 393

All the data is like that, and I wish to recreate this Excel table:
http://r.789695.n4.nabble.com/file/n2287158/untitled.GIF

The problem with my code is that it doesn't put Trial.Group on the x axis.

Thanks for the help
Re: [R] Question about food sampling analysis
Hi Sarah,

We regularly undertake work in the food sector and have developed many custom-built solutions. To be more specific, the statistics we employ are those of sensory analysis, and we regularly use the SensoMineR package in R.

Regards,
Richard Weeks
Mango Solutions
Mail: rwe...@mango-solutions.com

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Sarah Henderson
Sent: 12 July 2010 22:42
To: R List
Subject: [R] Question about food sampling analysis

Greetings to all, and my apologies for a question that is mostly about statistics and secondarily about R. I have just started a new job that (this week, apparently) requires statistical knowledge beyond my training (as an epidemiologist).

The problem:
- We have 57 food production facilities in three categories
- Samples of 4-6 different foods were tested for listeria at each facility
- I need to describe the presence of listeria in food (1) overall and (2) by facility category.

I know that samples within each facility cannot be treated as independent, so I need an approach that accounts for (1) clustering within facilities and (2) the different number of samples taken at each facility. If someone could kindly point me towards the right type of analysis for this and/or its associated R functions/packages, I would greatly appreciate it.

Many thanks,
Sarah
[R] Barplots
Hi R,

I am examining the mean returns 10 days before and 10 days after an event. I now have several events and the corresponding pre- and post-event 10-day mean returns, something like this:

  Pre_Start  Pre_End    Pre_Mean     Pre_SD      Post_Start Post_End   Post_Mean     Post_SD
1 2002-02-22 2002-03-08  0.004968027 0.017443954 2002-03-12 2002-03-25  0.0004099697 0.012529438
2 2002-04-25 2002-05-08 -0.006371706 0.011008257 2002-05-10 2002-05-23 -0.0022429404 0.007736497
3 2002-07-24 2002-08-06  0.005083225 0.015508255 2002-08-08 2002-08-21  0.0048237816 0.008116529
4 2002-07-24 2002-08-06  0.005083225 0.015508255 2002-08-08 2002-08-21  0.0048237816 0.008116529
5 2003-01-08 2003-01-21  0.004439480 0.012310963 2003-01-23 2003-02-05 -0.0064620002 0.012731789

I obtained a barplot using the code below:

layout(matrix(c(1,1,2,2),ncol=2,byrow=T))
barplot(rnorm(10),main="Pre_Event_Returns",col="red")
barplot(rnorm(10),main="Post_Event_Returns",col="blue")

However, I would like to know if it is possible to merge the two barplots, i.e. to produce a single barplot that includes both the pre- and post-event returns.

Any suggestions would be appreciated.

Ravi
[R] Odp: in continuation with the earlier R puzzle
Hi

r-help-boun...@r-project.org wrote on 12.07.2010 16:09:30:

When I just run a for loop it works. But if I am going to run a for loop every time for large vectors I might as well use C or any other language. The reason R is powerful is because it can handle large vectors without each element being manipulated? Please let me know where I am wrong.

for(i in 1:length(news1o)){
  if(news1o[i]>s2o[i])
    s[i]<-1
  else
    s[i]<--1
}

Think in R, not in C. Why use loops when you can work on the whole object directly? It is like drinking beer from snifters: it is possible, but using pints is preferable and more convenient.

news1o>s2o

gives you a logical vector of the same length, and you can use it directly for further selection or computation. You can treat FALSE as 0 and TRUE as 1 and use it as a numeric vector, so

x<-runif(10)
y<-runif(10)
c(-1,1)[(x>y)+1]

selects -1 when FALSE and 1 when TRUE. Or you can use it in a mathematical operation directly:

(x>y)*2-1

Regards
Petr
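The loop and the vectorised forms from this message can be checked against each other. The sketch below assumes the comparison in the original loop is "greater than" (the operator did not survive in the archive) and uses simulated vectors; s_loop, s_index, s_arith and s_ifelse are names chosen for this illustration, and ifelse() is an additional base-R option not mentioned in the thread.

```r
set.seed(42)
x <- runif(10)
y <- runif(10)

# Element-by-element version, as in the question.
s_loop <- numeric(length(x))
for (i in seq_along(x)) {
  if (x[i] > y[i]) s_loop[i] <- 1 else s_loop[i] <- -1
}

# Vectorised versions: the comparison yields a logical vector,
# which behaves as 0/1 in arithmetic and indexing.
s_index  <- c(-1, 1)[(x > y) + 1]   # index 1 when FALSE, 2 when TRUE
s_arith  <- (x > y) * 2 - 1         # 0/1 mapped to -1/1
s_ifelse <- ifelse(x > y, 1, -1)

all(s_loop == s_index, s_loop == s_arith, s_loop == s_ifelse)
```

All three vectorised forms avoid the explicit loop; which one reads best is a matter of taste.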
[R] Odp: Barplots
Hi

r-help-boun...@r-project.org wrote on 13.07.2010 12:18:21:

[...]
I obtained a barplot using the below
layout(matrix(c(1,1,2,2),ncol=2,byrow=T))
barplot(rnorm(10),main="Pre_Event_Returns",col="red")
barplot(rnorm(10),main="Post_Event_Returns",col="blue")

Maybe

barplot(rbind(rnorm(10), rnorm(10)), main="Event_Returns", col=c("red","blue"), beside=T)

plus an appropriate legend.

Regards
Petr

However I would like to know if it is possible to do the following: merge the two barplots, i.e. a single barplot which will include both the pre and post event returns. Any suggestions would be appreciated
Ravi
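A slightly fuller version of the grouped barplot, including the legend, might look like the sketch below. The returns are simulated stand-ins for the Pre_Mean and Post_Mean columns, and the event labels are invented; only barplot() and legend() from base graphics are used.

```r
set.seed(1)
pre  <- rnorm(5, sd = 0.005)   # stand-in for Pre_Mean
post <- rnorm(5, sd = 0.005)   # stand-in for Post_Mean

# One row per series, one column per event.
returns <- rbind(Pre = pre, Post = post)
colnames(returns) <- paste("Event", 1:5)

# beside = TRUE draws the two rows as side-by-side bars within each event.
mids <- barplot(returns, beside = TRUE, col = c("red", "blue"),
                main = "Event_Returns", ylab = "Mean return")
legend("topright", legend = rownames(returns), fill = c("red", "blue"))
```

barplot() returns the bar midpoints (here a 2 x 5 matrix), which can be reused to add error bars from the Pre_SD and Post_SD columns.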
Re: [R] Fast string comparison
I see. I did not get this performance because I did not compare the arrays directly but ran seemingly expensive for-loops to do it iteratively... :(

R.

On Tue, Jul 13, 2010 at 1:42 AM, Hadley Wickham had...@rice.edu wrote:

strings <- replicate(1e5, paste(sample(letters, 100, rep = T), collapse = ""))
system.time(strings[-1] == strings[-1e5])
#    user  system elapsed
#   0.016   0.000   0.017

So it takes ~1/100 of a second to do ~100,000 string comparisons. You need to provide a reproducible example that illustrates why you think string comparisons are slow.

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

On Tue, Jul 13, 2010 at 6:52 AM, Ralf B ralf.bie...@gmail.com wrote:

I am asking this question because string comparison in R seems to be awfully slow (based on profiling results) and I wonder if perhaps '==' alone is not the best one can do. I did not ask for anything particular and I don't think I need to provide a self-contained source example for the question. So, to re-phrase my question, are there more (runtime) effective ways to find out if two strings (about 100-150 characters long) are equal?

Ralf

On Sun, Jul 11, 2010 at 2:37 PM, Sharpie ch...@sharpsteen.net wrote:

Ralf B wrote:
What is the fastest way to compare two strings in R? Ralf

Which way is not fast enough? In other words, are you asking this question because profiling showed one of R's string comparison operations is causing a massive bottleneck in your code? If so, which one and how are you using it?

-Charlie

Charlie Sharpsteen
Undergraduate -- Environmental Resources Engineering
Humboldt State University
[R] Modify the plotting parameters for Vennerable obj.
Dear List,

I would like to modify the settings for plotting a Vennerable object, but I don't know how, so if anyone has an idea I would be really grateful.

best,
Fabian

Some R code to illustrate my problem:

library(Vennerable)
ven <- compute.Venn(Venn(SetNames=c("A", "B"), Weight=c(0,111,106, 26)))
# now my problem is that whenever I plot the object, the plot appears in a box,
# and for cosmetic reasons I would like to get rid of that.
plot(ven)

sessionInfo()
R version 2.10.0 (2009-10-26)
i386-apple-darwin8.11.1

locale:
[1] C

attached base packages:
[1] grid stats graphics grDevices utils datasets methods
[8] base

other attached packages:
[1] Vennerable_2.0 RColorBrewer_1.0-2 lattice_0.18-3 RBGL_1.22.0
[5] graph_1.24.0 ggplot2_0.8.5 digest_0.4.2 reshape_0.8.3
[9] plyr_0.1.9 proto_0.3-8 limma_3.2.1

loaded via a namespace (and not attached):
[1] tools_2.10.0
Re: [R] in continuation with the earlier R puzzle
Hi

I do not use any of the mentioned libraries, so I cannot answer directly. I would try debug(expr.frame) to see at what point the error is thrown. I have no idea why you got the error. Try to evaluate the code in pieces, e.g. check what the result of

list(MAf=quote(SMA(Last, 20)), MAs=quote(SMA(Last, 50)))

is, and look for differences between the results obtained from the spx data and from the nifty data.

Regards
Petr

Raghu r.raghura...@gmail.com wrote on 13.07.2010 13:17:42:

[...]
[R] Zoo - bug ???
Hi folks,

I am confused about whether the following is a bug or expected behaviour. Here is the explanation:

a <- zoo(c(NA,1:9),1:10)

Now if I do

rollapply(a,FUN=mean,width=3,align="right")

I get

 3  4  5  6  7  8  9 10
NA NA NA NA NA NA NA NA

But I shouldn't be getting NA, right? I.e. for index 10 I should get (1/3)*(9+8+7).

Similarly

rollapply(a,FUN=mean,width=3)
 2  3  4  5  6  7  8  9
NA NA NA NA NA NA NA NA

Zoo version:

installed.packages()["zoo","Version"]
[1] 1.6-3

My machine details:

sessionInfo()
R version 2.10.1 (2009-12-14)
i386-pc-intel32

locale:
[1] LC_COLLATE=English_India.1252 LC_CTYPE=English_India.1252 LC_MONETARY=English_India.1252 LC_NUMERIC=C
[5] LC_TIME=English_India.1252

attached base packages:
[1] stats graphics grDevices datasets utils methods base

other attached packages:
[1] zoo_1.6-3 rcom_2.2-1 rscproxy_1.3-1 Revobase_3.2.0

loaded via a namespace (and not attached):
[1] grid_2.10.1 lattice_0.18-3 tools_2.10.1
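One possible explanation, offered as an assumption rather than a confirmed diagnosis of zoo 1.6-3: if the rolling mean is computed from cumulative sums for speed, a single NA at the start makes every cumulative sum NA, so every window comes out NA. A window-by-window computation only loses the windows that actually contain the NA. The base-R sketch below illustrates the difference without using zoo; roll_cumsum and roll_window are names invented here.

```r
x <- c(NA, 1:9)
k <- 3
n <- length(x)

# Rolling mean from cumulative sums: quick, but one NA poisons every later sum.
cs <- cumsum(x)
roll_cumsum <- (cs[k:n] - c(0, cs[1:(n - k)])) / k

# Rolling mean window by window: only windows that contain the NA are NA.
roll_window <- sapply(k:n, function(i) mean(x[(i - k + 1):i]))

roll_cumsum
roll_window
```

If this is what happens in zoo, wrapping the function, for instance FUN = function(z) mean(z), or handling the NA beforehand, would be the thing to try; the help pages for rollmean and rollapply are the place to confirm how NAs are treated.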
Re: [R] in continuation with the earlier R puzzle
Many many thanks to all of you. The beer cleared the air of doubts!

Please look at the following lines of code, taken from the example in the tradesys documentation. When I run the example with the data.frame spx it works fine, but when I use another data.frame (here nifty) it fails. I can see that the column named Last has 3637 rows, and that a 20d MA and a 50d MA of it have 3618 and 3588 rows respectively. Why does expr.frame fail for one data.frame and not for the other? I have given str() for both below for your kind perusal.

    library(tradesys)
    library(TTR)
    x = nifty[, c("Open", "Last")]
    d <- expr.frame(x, list(MAf = quote(SMA(Last, 20)), MAs = quote(SMA(Last, 50))))
    Error in data.frame(c(1000, 1001.53, 987.17, 976.28, 960.32, 951.93, 949.29, :
      arguments imply differing number of rows: 3637, 3618, 3588

    str(nifty)
    'data.frame': 3637 obs. of 6 variables:
     $ Date..GMT.: Factor w/ 3637 levels "01/01/1996","01/01/1997",..: 321 687 807 929 1052 1172 1537 1650 1764 1886 ...
     $ Open      : num 1000 1002 987 976 960 ...
     $ High      : num 1000 1002 987 976 960 ...
     $ Low       : num 1000 989 977 963 952 ...
     $ Last      : num 1000 989 978 964 953 ...
     $ Date      : num 321 687 807 929 1052 ...

    str(spx)
    'data.frame': 14940 obs. of 5 variables:
     $ Open  : num 16.7 16.9 16.9 17 17.1 ...
     $ High  : num 16.7 16.9 16.9 17 17.1 ...
     $ Low   : num 16.7 16.9 16.9 17 17.1 ...
     $ Close : num 16.7 16.9 16.9 17 17.1 ...
     $ Volume: num 126 189 255 201 252 216 263 297 333 146 ...

Thanks
Raghu

On Tue, Jul 13, 2010 at 12:01 PM, Petr PIKAL petr.pi...@precheza.cz wrote:

> Hi
>
> r-help-boun...@r-project.org wrote on 12.07.2010 16:09:30:
>
>> When I just run a for loop it works. But if I am going to run a for loop every time for large vectors I might as well use C or any other language. The reason R is powerful is because it can handle large vectors without each element being manipulated? Please let me know where I am wrong.
>>
>>     for(i in 1:length(news1o)){
>>     + if(news1o[i]>s2o[i])
>>     + s[i]<-1
>>     + else
>>     + s[i]<--1
>>     + }
>
> Think in R, not in C. Why use loops when you can use the whole object directly? It is like drinking beer from snifters. It is possible, but using pints is preferable and more convenient.
>
> news1o>s2o gives you a logical vector of the same length, and you can use it directly for further selection or computation. You can treat FALSE as 0 and TRUE as 1 and use it as a numeric vector, so
>
>     x<-runif(10)
>     y<-runif(10)
>     c(-1,1)[(x>y)+1]
>
> selects -1 when FALSE and 1 when TRUE. Or you can use it in a mathematical operation directly:
>
>     (x>y)*2-1
>
> Regards
> Petr

--
'Raghu'
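A minimal sketch of the point made in the quoted reply, assuming the comparison operator lost from the archived text was `>`. It checks that the loop and the two vectorised forms give the same answer; the seed and the names s_loop, s_index and s_arith are illustrative and not from the thread.

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
x <- runif(10)
y <- runif(10)

# Loop version, element by element, as in the original question
s_loop <- numeric(length(x))
for (i in seq_along(x)) {
  if (x[i] > y[i]) s_loop[i] <- 1 else s_loop[i] <- -1
}

# Vectorised versions from the reply: index into c(-1, 1), or use arithmetic
s_index <- c(-1, 1)[(x > y) + 1]  # FALSE + 1 = 1 picks -1, TRUE + 1 = 2 picks 1
s_arith <- (x > y) * 2 - 1        # FALSE -> -1, TRUE -> 1

# All three approaches agree
stopifnot(identical(s_loop, s_index), identical(s_loop, s_arith))
```

ifelse(x > y, 1, -1) is another common way to write the same thing.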
[R] Three-way Panel Data Analysis
Dear Danice,

as far as I know, three-way panels are not considered in the econometrics literature (two dimensions make things complicated enough already). They are also not implemented in plm. You might find support for more elaborate nesting structures in the nlme and lme4 packages.

Yet, as the empirical question is not clear from what you say, you might as well want to use separate regressions, destination/origin/time dummies (possibly interacted with coefficients) and so on.

Best wishes,
Giovanni

--- original message
Message: 111
Date: Tue, 13 Jul 2010 07:14:06 +0100
From: danice ng danice...@gmail.com
To: r-help@r-project.org
Subject: [R] Three-way Panel Data Analysis
Message-ID: aanlktime7a8oydgotqr-qw0mauw8iqn7rvknc9csx...@mail.gmail.com
Content-Type: text/plain

> Dear R users,
>
> I have panel data on the amount of money spent by travellers from 8 origin countries in 4 destinations. I would like to carry out analysis for destinations, origins and time. However, it seems to me that the package plm can only estimate two-way panel data (indexed by a two-dimensional array). Any suggestions would be greatly appreciated. Thank you.
>
> Best regards,
> Danice

--
Giovanni Millo
Research Dept., Assicurazioni Generali SpA
Via Machiavelli 4, 34132 Trieste (Italy)
tel. +39 040 671184
fax +39 040 671160
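The "destination/origin/time dummies" suggestion can be sketched with plain lm(). This is only an illustration under stated assumptions: the data are simulated placeholders, not the poster's data, and the column names (origin, destination, year, spend) and the five-year span are made up for the example.

```r
set.seed(42)  # arbitrary seed for the simulated placeholder data

# Balanced toy panel: 8 origins x 4 destinations x 5 years (all hypothetical)
panel <- expand.grid(origin      = paste0("O", 1:8),
                     destination = paste0("D", 1:4),
                     year        = 2001:2005)
panel$spend <- 100 + rnorm(nrow(panel), sd = 10)  # simulated spending

# One set of dummies per dimension; factor() turns each index into dummies
fit <- lm(spend ~ factor(origin) + factor(destination) + factor(year),
          data = panel)

# Intercept + 7 origin + 3 destination + 4 year dummies
length(coef(fit))
```

Whether additive dummies are adequate depends on the empirical question, as the reply points out; interactions or the mixed-model packages mentioned above may be more appropriate.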
[R] StartsWith over vector of Strings?
Given vectors of strings of arbitrary length

    content <- c("abc", "def")
    searchset <- c("a", "abc", "abcdef", "d", "def", "defghi")

is it possible to determine the content string set that matches the searchset in the sense of 'startswith'? This would be a vector of all strings in content that start with any of the strings in the searchset. In the little example here, this would be:

    result <- c("abc", "abc", "def", "def")

Best,
Ralf
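One possible way to get the result described in the question, sketched with substr() and nchar() from base R. The helper layout (an lapply over searchset) is an assumption about what is wanted: one copy of a content string for every searchset entry that is a prefix of it, in searchset order.

```r
content   <- c("abc", "def")
searchset <- c("a", "abc", "abcdef", "d", "def", "defghi")

# For each search string s, keep the content strings whose first nchar(s)
# characters equal s; entries with no match contribute nothing.
result <- unlist(lapply(searchset, function(s) {
  content[substr(content, 1, nchar(s)) == s]
}))

result
```

For the example above this yields "abc" "abc" "def" "def". With thousands of strings the lapply is still a single pass over searchset, with each step vectorised over content.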
Re: [R] Regarding R -installation
On Tue, 2010-07-13 at 13:39 +0530, venkatesh bandaru wrote:

> Dear R-help Team Members,
>
> I am Venkatesh, a student of the University of Hyderabad. While installing R from the specified servers, I encountered the following problem. Please help me with it; I need this to do my project. Thanking you.
>
> *Problem*: Cannot access installation media
> http://download.opensuse.org/repositories/devel:languages:R:patched/openSUSE_11.2
> (Medium 1). Check whether the server is accessible
>
> Download (curl) error for '
> http://download.opensuse.org/repositories/devel:languages:R:patched/openSUSE_11.2/repodata/repomd.xml
> ':
> Error code: Connection failed
> Error message: couldn't connect to host

Those are the package repositories for your distribution of Linux (openSUSE), and have nothing to do with R, the R Foundation and its CRAN servers, AFAIK.

If you want to download the R sources, try:

http://cran.r-project.org/mirrors.html

Choose one near you and then look at the R Sources and R Binaries entries in the menu on the left.

As for the problem with openSUSE, you might need to try their help forums.

HTH

G

> yours truly,
> B.venkatesh,
> University of Hyderabad
> India.

--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Re: [R] Zoo - bug ???
On Tue, 2010-07-13 at 17:13 +0530, sayan dasgupta wrote:

> Hi folks,
>
> I am confused whether the following is a bug or it is fine. Here is the explanation:
>
>     a <- zoo(c(NA, 1:9), 1:10)
>
> Now if I do rollapply(a, FUN = mean, width = 3, align = "right")

mean() has an argument na.rm which defaults to FALSE. As such, if NAs are in the computation the mean is undefined and the answer will be NA. If you pass na.rm = TRUE to rollapply, mean ignores the NA and works on the remaining values:

    rollapply(a, FUN = mean, width = 3, align = "right", na.rm = TRUE)
      3   4   5   6   7   8   9  10
    1.5 2.0 3.0 4.0 5.0 6.0 7.0 8.0

HTH

G

> I get
>
>     rollapply(a, FUN = mean, width = 3, align = "right")
>      3  4  5  6  7  8  9 10
>     NA NA NA NA NA NA NA NA
>
> But I shouldn't be getting NA, right? I.e. for index 10 I should get (1/3)*(9+8+7).
>
> Similarly,
>
>     rollapply(a, FUN = mean, width = 3)
>      2  3  4  5  6  7  8  9
>     NA NA NA NA NA NA NA NA
>
> Zoo version:
>
>     installed.packages()["zoo", "Version"]
>     [1] "1.6-3"

--
Dr. Gavin Simpson
Re: [R] print.trellis draw.in - plaintext (gmail mishap)
That helped. I continued to have issues with draw.in=vplayout(2,2)$name (I guess I still don't understand its use), but the following positions the plot on the grid where I want it.

    grid.newpage()
    pushViewport(viewport(layout=grid.layout(2,2)))
    vp <- vplayout(2,2)
    pushViewport(vp)
    print(p, newpage=FALSE)
    upViewport()

The reason the ggplot instances just worked is that the ggplot2 package is doing the viewport traversal for me in ggplot2::print.ggplot.

Thanks for the help!
Mark

On 07/12/2010 07:29 PM, Felix Andrews wrote:

> The problem is that you have not pushed your viewport, so it doesn't exist in the plot. (You only pushed the layout viewport.)
>
>     grid.ls(viewports = TRUE)
>     ROOT
>       GRID.VP.82
>
> Try this:
>
>     vp <- vplayout(2,2)
>     pushViewport(vp)
>     upViewport()
>     grid.ls(viewports = TRUE)
>     #ROOT
>     #  GRID.VP.82
>     #    GRID.VP.86
>     print(p, newpage = FALSE, draw.in = vp$name)
>
> -Felix
>
> On 13 July 2010 01:22, Mark Connolly wmcon...@ncsu.edu wrote:
>
>>     require(grid)
>>     require(lattice)
>>     fred = data.frame(x=1:5, y=runif(5))
>>     vplayout <- function (x,y) viewport(layout.pos.row=x, layout.pos.col=y)
>>     grid.newpage()
>>     pushViewport(viewport(layout=grid.layout(2,2)))
>>     p = xyplot(y~x, fred)
>>     print(p, newpage=FALSE, draw.in=vplayout(2,2)$name)
>>
>> On Mon, Jul 12, 2010 at 8:58 AM, Felix Andrews fe...@nfrac.org wrote:
>>
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
>>>
>>> Yes, please, reproducible code.
>>>
>>> On 10 July 2010 00:49, Mark Connolly wmcon...@ncsu.edu wrote:
>>>
>>>> I am attempting to plot a trellis object on a grid.
>>>>
>>>>     vplayout = viewport(layout.pos.row=x, layout.pos.col=y)
>>>>     grid.newpage()
>>>>     pushViewport(viewport(layout=grid.layout(2,2)))
>>>>     g1 = ggplot() ...
>>>>     g2 = ggplot() ...
>>>>     g3 = ggplot() ...
>>>>     p = xyplot() ...
>>>>
>>>>     # works as expected
>>>>     print(g1, vp=vplayout(1,1))
>>>>     print(g2, vp=vplayout(1,2))
>>>>     print(g3, vp=vplayout(2,1))
>>>>
>>>>     # does not work
>>>>     print(p, newpage=FALSE, draw.in=vplayout(2,2)$name)
>>>>     Error in grid.Call.graphics("L_downviewport", name$name, strict) :
>>>>       Viewport 'GRID.VP.112' was not found
>>>>
>>>> What am I doing wrong?
>>>>
>>>> Thanks!
>
> --
> Felix Andrews / 安福立
> http://www.neurofractal.org/felix/
Re: [R] Zoo - bug ???
On Tue, Jul 13, 2010 at 5:27 PM, Gavin Simpson gavin.simp...@ucl.ac.uk wrote:

> On Tue, 2010-07-13 at 17:13 +0530, sayan dasgupta wrote:
>
>> Hi folks,
>>
>> I am confused whether the following is a bug or it is fine. Here is the explanation:
>>
>>     a <- zoo(c(NA, 1:9), 1:10)
>>
>> Now if I do rollapply(a, FUN = mean, width = 3, align = "right")
>
> mean() has an argument na.rm which defaults to FALSE. As such, if NAs are in the computation the mean is undefined and the answer will be NA. If you pass na.rm = TRUE to rollapply, mean ignores the NA and works on the remaining values:
>
>     rollapply(a, FUN = mean, width = 3, align = "right", na.rm = TRUE)
>       3   4   5   6   7   8   9  10
>     1.5 2.0 3.0 4.0 5.0 6.0 7.0 8.0

This is fine, but the problem is that, logically, when you are doing rollapply only the first 2 values should be NA. For index 10, as I mentioned, the result should be the mean of 9, 8 and 7, and there is no NA among them. So it should not return NA.

> HTH
>
> G
>
>> I get
>>
>>     rollapply(a, FUN = mean, width = 3, align = "right")
>>      3  4  5  6  7  8  9 10
>>     NA NA NA NA NA NA NA NA
>>
>> But I shouldn't be getting NA, right? I.e. for index 10 I should get (1/3)*(9+8+7).
>>
>> Similarly,
>>
>>     rollapply(a, FUN = mean, width = 3)
>>      2  3  4  5  6  7  8  9
>>     NA NA NA NA NA NA NA NA
[R] Define package-wide character constants
Dear list!

I develop a package for R and wonder how I can best define package-wide constants (both character strings and named vectors of strings) which are used throughout different classes and methods. I'm new to R and wonder if there is some kind of “best practice” that I just haven't read of yet.

My main programming language is Java, so if that helps anyone to understand my thinking: I mean values that I would normally put into a class like Constants.java as “public static final” variables, or into a .properties file.

A concrete example: I deal with XML files, both parsing and encoding. Right now I have several classes representing documents which I handle, and in each of the encoding methods there is a character string for the schema location. If I want to change that location then I have to change it several times (neglecting search and replace), but I'd rather have a single point of change for that.

I've read documentation on environments, <<- and the assign function, BUT am still not sure how to approach my problem BEST. assign could most probably do the job, but do I use/create a certain environment (like myConstants)? Or should I just use my package's environment?

Thanks for any experiences, (alternative) ideas, or pointers at existing discussions!

Regards,
Daniel
[R] Substring function?
Hi all,

I would like to detect all strings in the vector 'content' that contain the strings from the vector 'search'. Here is a code example:

    content <- data.frame(urls=c(
      "http://www.google.com/search?source=ighl=enrlz==q=stuffaq=faqi=g10aql=oq=gs_rfai=CrrIS3",
      "http://search.yahoo.com/search;_ylt=Atvki9MVpnxuEcPmXLEWgMqbvZx4?p=stufftoggle=1"))
    search <- data.frame(signatures=c("http://www.google.com/search"))
    subset(content, search$signatures %in% content$urls)

I do not get the row I expect; the result is empty:

    [1] urls
    <0 rows> (or 0-length row.names)

What I would like to achieve is the return of "http://www.google.com/search?source=ighl=enrlz==q=stuffaq=faqi=g10aql=oq=gs_rfai=CrrIS3". Is that possible?

In practice I would like to run this over 1000s of strings in 'content' and 100s of strings in 'search'. Could I run into performance issues with this approach and, if so, are there better ways?

Best,
Ralf
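A hedged sketch of one way to do the substring test asked about above. %in% tests whole-string equality, which is why the subset comes back empty; grepl() with fixed = TRUE tests for a literal substring instead. The shortened URLs below are simplified placeholders, not the poster's exact strings, and the variable name keep is illustrative.

```r
# Simplified stand-ins for the poster's data (hypothetical, shortened URLs)
content <- data.frame(urls = c("http://www.google.com/search?q=stuff",
                               "http://search.yahoo.com/search?p=stuff"),
                      stringsAsFactors = FALSE)
search  <- data.frame(signatures = "http://www.google.com/search",
                      stringsAsFactors = FALSE)

# One logical vector per signature (literal match, no regex interpretation),
# combined with OR so a URL is kept if it contains any signature.
keep <- Reduce(`|`, lapply(search$signatures, function(sig) {
  grepl(sig, content$urls, fixed = TRUE)
}))

subset(content, keep)
```

Each grepl() call is vectorised over all URLs, so the work is one pass over 'content' per signature; whether that is fast enough for the sizes mentioned would need to be measured.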
Re: [R] Define package-wide character constants
On 13/07/2010 8:20 AM, Daniel Nüst wrote:

> Dear list!
>
> I develop a package for R and wonder how I can best define package-wide constants (both character strings and named vectors of strings) which are used throughout different classes and methods. I'm new to R and wonder if there is some kind of “best practice” that I just haven't read of yet.
>
> My main programming language is Java, so if that helps anyone to understand my thinking: I mean values that I would normally put into a class like Constants.java as “public static final” variables, or into a .properties file.
>
> A concrete example: I deal with XML files, both parsing and encoding. Right now I have several classes representing documents which I handle, and in each of the encoding methods there is a character string for the schema location. If I want to change that location then I have to change it several times (neglecting search and replace), but I'd rather have a single point of change for that.
>
> I've read documentation on environments, <<- and the assign function, BUT am still not sure how to approach my problem BEST. assign could most probably do the job, but do I use/create a certain environment (like myConstants)? Or should I just use my package's environment?
>
> Thanks for any experiences, (alternative) ideas, or pointers at existing discussions!

If they are constants, you won't need to use <<- or assign() to change them: just define them at top level in one of the source files for your package. For example,

    errorMsg <- "You have made an error."

Then you can refer to them in functions in your package, e.g.

    foo <- function() {
      stop(errorMsg)
    }

If you use a NAMESPACE file you can limit the visibility of these objects to your package by not exporting them.

Duncan Murdoch
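Applied to the schema-location example from the question, the advice above might look like the sketch below. All names here (SCHEMA_LOCATION, XML_NAMESPACES, encodeDocument) and the example.org URL are hypothetical placeholders, not part of any real package.

```r
# Would live at top level in one source file of the package, e.g. a
# hypothetical R/constants.R; defined once, used everywhere.
SCHEMA_LOCATION <- "http://example.org/schema.xsd"            # placeholder URL
XML_NAMESPACES  <- c(xsi = "http://www.w3.org/2001/XMLSchema-instance")

# Any function in the package can refer to the constants by name,
# so the schema location has a single point of change.
encodeDocument <- function(body) {
  paste0('<doc xmlns:xsi="', XML_NAMESPACES[["xsi"]], '" ',
         'xsi:schemaLocation="', SCHEMA_LOCATION, '">',
         body, "</doc>")
}

encodeDocument("hello")
```

Leaving the two constants out of the export list in NAMESPACE keeps them internal to the package, as described in the reply.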
Re: [R] Zoo - bug ???
On Tue, 2010-07-13 at 17:41 +0530, sayan dasgupta wrote:

> On Tue, Jul 13, 2010 at 5:27 PM, Gavin Simpson gavin.simp...@ucl.ac.uk wrote:
>
>> mean() has an argument na.rm which defaults to FALSE. As such, if NAs are in the computation the mean is undefined and the answer will be NA. If you pass na.rm = TRUE to rollapply, mean ignores the NA and works on the remaining values:
>>
>>     rollapply(a, FUN = mean, width = 3, align = "right", na.rm = TRUE)
>>       3   4   5   6   7   8   9  10
>>     1.5 2.0 3.0 4.0 5.0 6.0 7.0 8.0
>
> This is fine, but the problem is that, logically, when you are doing rollapply only the first 2 values should be NA. For index 10, as I mentioned, the result should be the mean of 9, 8 and 7, and there is no NA among them. So it should not return NA.

Indeed, there seems to be something odd happening here. Consider:

    rollapply(a, FUN = mean, width = 3)
     2  3  4  5  6  7  8  9
    NA NA NA NA NA NA NA NA

    rollapply(a, FUN = mean, width = 3, na.rm = FALSE)
     2  3  4  5  6  7  8  9
    NA  2  3  4  5  6  7  8

    rollapply(a, FUN = mean, width = 3, na.rm = TRUE)
      2   3   4   5   6   7   8   9
    1.5 2.0 3.0 4.0 5.0 6.0 7.0 8.0

If you debug zoo:::rollapply.zoo, the first call gets passed off to rollmean early on in the code, whilst the second (na.rm = FALSE) is handled by rollapply itself.

And I see why this is happening. If ... contains anything, then the code will not enter the switch statement which passes off control to functions like rollmean() (in this case). This explains the difference between the first call and the second call with na.rm = FALSE.

And of course, this is mentioned on ?rollapply. Must read the help!!!

So, as rollmean doesn't accept an na.rm argument or pass it on, you need to do

    rollapply(a, FUN = mean, width = 3, na.rm = FALSE)

This is not a bug, as ?rollapply tells you what it does and passes you to ?rollmean, which states that it doesn't work for inputs with NAs. To get the behaviour you want, though, you have to use the somewhat odd workaround of forcing computation via rollapply by providing an extra argument, even a gibberish one, e.g.:

    rollapply(a, FUN = mean, width = 3, foo = 1)

will work.

HTH

G

--
Dr. Gavin Simpson
Re: [R] Zoo - bug ???
On Tue, Jul 13, 2010 at 7:43 AM, sayan dasgupta kitt...@gmail.com wrote:

> Hi folks,
>
> I am confused whether the following is a bug or it is fine. Here is the explanation:
>
>     a <- zoo(c(NA, 1:9), 1:10)
>
> Now if I do rollapply(a, FUN = mean, width = 3, align = "right") I get
>
>     rollapply(a, FUN = mean, width = 3, align = "right")
>      3  4  5  6  7  8  9 10
>     NA NA NA NA NA NA NA NA
>
> But I shouldn't be getting NA, right? I.e. for index 10 I should get (1/3)*(9+8+7).
>
> Similarly,
>
>     rollapply(a, FUN = mean, width = 3)
>      2  3  4  5  6  7  8  9
>     NA NA NA NA NA NA NA NA

This is documented behavior (thanks to Gavin for pointing this out), but I agree that it is undesirable and we will consider how to address this. In the meantime use rollapply(a, 3, mean) so that it does not use rollmean or, if you want NAs removed when doing the mean calculation, use na.rm = TRUE as Gavin suggested.
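The values the original poster expected can be checked without zoo. The sketch below uses embed() and rowMeans() from base R on the same numbers; it is only a cross-check of the arithmetic discussed in this thread, not a description of how rollapply or rollmean work internally. The names w, roll_strict and roll_narm are illustrative.

```r
a <- c(NA, 1:9)          # same values as the zoo series in the thread

# embed() builds one row per complete window of width 3
w <- embed(a, 3)

roll_strict <- rowMeans(w)                # NA only where a window contains NA
roll_narm   <- rowMeans(w, na.rm = TRUE)  # NA dropped inside each window

# Label with the right-aligned index (3..10), as in align = "right"
names(roll_strict) <- names(roll_narm) <- 3:10

roll_strict   # NA for the first window, then 2, 3, ..., 8
roll_narm     # 1.5 for the first window, then 2, 3, ..., 8
```

Only the first window (NA, 1, 2) is affected by the missing value; the window ending at index 10 is (7, 8, 9) with mean 8, which matches the poster's (1/3)*(9+8+7).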
Re: [R] Continuing on with a loop when there's a failure
On Jul 13, 2010, at 8:47 AM, Josh B wrote:

> Thanks again, David.
>
> ...but, alas, I still can't get it to work! Here's what I'm trying now:
>
>     for (i in 1:2) {
>       mod.poly3 <- try(lrm(x[,i] ~ pol(x1, 3) + pol(x2, 3), data=x))
>       results[1,i] <- anova(mod.poly3)[1,3]
>     }

You need to do some programming. You did not get an error from the lrm but rather from the anova call, because you tried to give the results of the try function to anova without first checking to see if an error had occurred.

-- David.

> Here's what happens (from the console):
>
>     Error in fitter(X, Y, penalty.matrix = penalty.matrix, tol = tol, weights = weights, :
>       NA/NaN/Inf in foreign function call (arg 1)
>     Error in UseMethod("anova") :
>       no applicable method for 'anova' applied to an object of class "try-error"
>
> ...so I still can't make my results matrix. Could I ask you for some specific code to make this work? I'm not that familiar with the syntax for try or tryCatch, and the help files for them are pretty bad, in my humble opinion.
>
> I should clarify that I actually don't care about the failed runs per se. I just want R to keep going in spite of them and give me my results matrix.
>
> From: David Winsemius dwinsem...@comcast.net
> To: Josh B josh...@yahoo.com
> Cc: R Help r-help@r-project.org
> Sent: Mon, July 12, 2010 8:09:03 PM
> Subject: Re: [R] Continuing on with a loop when there's a failure
>
> On Jul 12, 2010, at 6:18 PM, Josh B wrote:
>
>> Hi R sages,
>>
>> Here is my latest problem. Consider the following toy example:
>>
>>     x <- read.table(textConnection("y1 y2 y3 x1 x2
>>     indv.1 bagels donuts bagels 4 6
>>     indv.2 donuts donuts donuts 5 1
>>     indv.3 donuts donuts donuts 1 10
>>     indv.4 donuts donuts donuts 10 9
>>     indv.5 bagels donuts bagels 0 2
>>     indv.6 bagels donuts bagels 2 9
>>     indv.7 bagels donuts bagels 8 5
>>     indv.8 bagels donuts bagels 4 1
>>     indv.9 donuts donuts donuts 3 3
>>     indv.10 bagels donuts bagels 5 9
>>     indv.11 bagels donuts bagels 9 10
>>     indv.12 bagels donuts bagels 3 1
>>     indv.13 donuts donuts donuts 7 10
>>     indv.14 bagels donuts bagels 2 10
>>     indv.15 bagels donuts bagels 9 6"), header = TRUE)
>>
>> I want to fit a logistic regression of y1 on x1 and x2. Then I want to run a logistic regression of y2 on x1 and x2. Then I want to run a logistic regression of y3 on x1 and x2. In reality I have many more Y columns than simply y1, y2, and y3, so I must design a loop.
>>
>> Notice that y2 is invariant and thus it will fail. In reality, some y columns will fail for much more subtle reasons. Simply screening my data to eliminate invariant columns will not eliminate the problem.
>>
>> What I want to do is output a piece of the results from each run of the loop to a matrix. I want the loop to try each of my y columns, and not give up and stop running simply because a particular y column is bad. I want it to give me NA or something similar in my results matrix for the bad y columns, but I want it to keep going and give me good data for the good y columns.
>>
>> For instance:
>>
>>     results <- matrix(nrow = 1, ncol = 3)
>>     colnames(results) <- c("y1", "y2", "y3")
>>     for (i in 1:2) {
>>       mod.poly3 <- lrm(x[,i] ~ pol(x1, 3) + pol(x2, 3), data=x)
>>       results[1,i] <- anova(mod.poly3)[1,3]
>>     }
>>
>> If I run this code, it gives up when fitting y2 because y2 is bad. It doesn't even try to fit y3. Here's what my console shows:
>>
>>     results
>>                 y1 y2 y3
>>     [1,] 0.6976063 NA NA
>>
>> As you can see, it gave up before fitting y3, which would have worked. How do I force my code to keep going through the loop, despite the rotten apples it encounters along the way?
>
> ?try
>
> http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-I-capture-or-ignore-errors-in-a-long-simulation_003f
>
> (Doesn't only apply to simulations.)
>
>> Exact code that gets the job done is what I am interested in. I am a post-doc -- I am not taking any classes. I promise this is not a homework assignment!
>
> --
> David Winsemius, MD
> West Hartford, CT

David Winsemius, MD
West Hartford, CT
Re: [R] Continuing on with a loop when there's a failure
On Jul 13, 2010, at 9:04 AM, David Winsemius wrote:

> On Jul 13, 2010, at 8:47 AM, Josh B wrote:
>
>> Thanks again, David.
>>
>> ...but, alas, I still can't get it to work!

(BTW, it did work.)

>> Here's what I'm trying now:
>>
>>     for (i in 1:2) {
>>       mod.poly3 <- try(lrm(x[,i] ~ pol(x1, 3) + pol(x2, 3), data=x))
>>       results[1,i] <- anova(mod.poly3)[1,3]
>>     }
>
> You need to do some programming.

(Or I suppose you could wrap both the lrm and the anova calls in try.)

> You did not get an error from the lrm but rather from the anova call, because you tried to give the results of the try function to anova without first checking to see if an error had occurred.
>
> -- David.
>
>> Here's what happens (from the console):
>>
>>     Error in fitter(X, Y, penalty.matrix = penalty.matrix, tol = tol, weights = weights, :
>>       NA/NaN/Inf in foreign function call (arg 1)
>>     Error in UseMethod("anova") :
>>       no applicable method for 'anova' applied to an object of class "try-error"
>>
>> ...so I still can't make my results matrix. Could I ask you for some specific code to make this work? I'm not that familiar with the syntax for try or tryCatch, and the help files for them are pretty bad, in my humble opinion.
>>
>> I should clarify that I actually don't care about the failed runs per se. I just want R to keep going in spite of them and give me my results matrix.

David Winsemius, MD
West Hartford, CT
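A minimal sketch of the pattern described in this thread: check the result of try() before using it, or let tryCatch() return NA on failure. To keep it self-contained it does not call lrm(); fit_stat() is a hypothetical stand-in that fails on an invariant response, and the data are just the first six rows of the poster's toy example. In real use the body of fit_stat() would be the lrm() and anova() calls.

```r
# First six rows of the toy data from the thread (y1, y2, y3, x1 only)
x <- data.frame(
  y1 = c("bagels", "donuts", "donuts", "donuts", "bagels", "bagels"),
  y2 = rep("donuts", 6),
  y3 = c("bagels", "donuts", "donuts", "donuts", "bagels", "bagels"),
  x1 = c(4, 5, 1, 10, 0, 2),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for "fit a model and extract one number":
# it stops with an error when the response has only one level.
fit_stat <- function(y, x1) {
  if (length(unique(y)) < 2) stop("response is invariant")
  mean(x1[y == "bagels"]) - mean(x1[y == "donuts"])
}

ycols   <- c("y1", "y2", "y3")
results <- matrix(NA_real_, nrow = 1, ncol = 3, dimnames = list(NULL, ycols))

# Variant 1: tryCatch() turns any error into NA and the loop carries on
for (nm in ycols) {
  results[1, nm] <- tryCatch(fit_stat(x[[nm]], x$x1),
                             error = function(e) NA_real_)
}

# Variant 2: try(), then test for the "try-error" class before using the value
results2 <- results
results2[] <- NA_real_
for (nm in ycols) {
  val <- try(fit_stat(x[[nm]], x$x1), silent = TRUE)
  if (!inherits(val, "try-error")) results2[1, nm] <- val
}

results
```

In both variants the y2 column stays NA while y1 and y3 are filled in, which is the behaviour asked for in the original question.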
[R] Generate groups with random size but given total sample size
Dear list,

I am currently doing some simulation studies where I want to compare different scenarios. In particular, two scenarios should be compared: 10.000 cases in 100 groups with 100 cases per group, and 10.000 cases in 100 groups with random group size (ranging from 5 to 500).

The first part is no problem:

    id <- seq(1, 10000)
    group <- sort(rep(seq(1, 100), 100))

But I don't get along with the second scenario. Using sample does give me 100 groups with random sizes, but generates more than 10.000 cases:

    set.seed(13)
    sum(sample(5:500, 100))
    [1] 24583

Another way could be generating one sample at a time and summing the cases. But this would end up in trial and error to hit the 10.000 cases. Maybe it would break rules of probability, too.

I'm convinced that there should be another (and even better) way to handle this problem in R... :-)

Best regards,
Arne Schulz
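One possible approach, sketched under assumptions: give every group the minimum size, then distribute the remaining cases with a multinomial draw over random weights, rejecting any draw that breaks the upper bound. The function name random_group_sizes and the choice of uniform weights are made up for this illustration; the weights determine how spread out the sizes are, so a different weight distribution may suit the simulation better.

```r
random_group_sizes <- function(n_groups = 100, total = 10000,
                               min_size = 5, max_size = 500) {
  # The target must be reachable within the bounds
  stopifnot(n_groups * min_size <= total, total <= n_groups * max_size)
  repeat {
    w     <- runif(n_groups)                       # random group weights
    extra <- as.vector(rmultinom(1, total - n_groups * min_size, prob = w))
    sizes <- min_size + extra                      # every group gets >= min_size
    if (all(sizes <= max_size)) return(sizes)      # otherwise draw again
  }
}

set.seed(13)                                       # arbitrary seed
sizes <- random_group_sizes()
group <- rep(seq_len(100), times = sizes)          # group id for each case
id    <- seq_along(group)

sum(sizes)     # exactly 10000 by construction
range(sizes)
```

The total is exact because the multinomial draw always allocates all remaining cases; the sizes are not uniform on 5 to 500, which is unavoidable once the sum is fixed at 10.000 for 100 groups.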
Re: [R] Custom nonlinear self starting function w/ 2 covariates
Dear all,

I finally found the way to do it. nlme accepts simpler functions than selfStart:

    # Defining my function
    Myfun <- function(x1, x2, Tmax, Topt, B, E) {
      (((Tmax - x1) / (Tmax - Topt)) * (x1 / Topt)^(Topt / (Tmax - Topt))) *
        exp(-exp(B * (log(x2) - log(abs(E)))))
    }

    # Calling nlme
    nlmefit3 <- nlme(y ~ Myfun(x1, x2, Tmax, Topt, B, E),
                     data,
                     fixed = Tmax + Topt + B + E ~ 1,
                     random = Tmax + Topt + B + E ~ 1,
                     start = list(fixed = c(Topt = 25.206, Tmax = 36.085, B = -0.825, E = 6.435)))

Unfortunately, with nlmer I'm stuck with the error message "gradient attribute of evaluated model must be a numeric matrix", but it's good that it works with nlme.

Sebastien Guyader wrote:
> Hello,
>
> I'm trying to fit a nonlinear model in which the biological response variable (ratio of germinated fungus spores) depends on 2 covariates (temperature and time). The response to temperature is modeled by a kind of beta function with 2 parameters (optimal and maximum temperatures) and the time function is a 2-parameter Weibull. Fits with nls or gnls work, but I need to do mixed-effects modeling. It seems that nlme or nlmer need self-starting functions, but so far I can't find a way to code a selfStart function with 2 x covariates. Is it just impossible? Is there another way?
>
> Thanks
Re: [R] Fast string comparison
On Tue, 2010-07-13 at 01:42 -0400, Hadley Wickham wrote:
> strings <- replicate(1e5, paste(sample(letters, 100, rep = T), collapse = ""))
> system.time(strings[-1] == strings[-1e5])
> #  user  system elapsed
> # 0.016   0.000   0.017
>
> So it takes ~1/100 of a second to do ~100,000 string comparisons. You need to provide a reproducible example that illustrates why you think string comparisons are slow.

Here's a vectorized alternative to '==' for strings, with minimal argument checking or result conversion. I haven't looked at the corresponding R source code; it may be similar:

    library(inline)
    code <- "
    SEXP ans;
    int i, len, *cans;
    if(!isString(s1) || !isString(s2))
        error(\"invalid arguments\");
    len = length(s1)>length(s2)?length(s2):length(s1);
    PROTECT(ans = allocVector(INTSXP, len));
    cans = INTEGER(ans);
    for(i = 0; i < len; i++)
        cans[i] = strcmp(CHAR(STRING_ELT(s1,i)),
                         CHAR(STRING_ELT(s2,i)));
    UNPROTECT(1);
    return ans;
    "
    sig <- signature(s1 = "character", s2 = "character")
    strcmp <- cfunction(sig, code)

    system.time(strings[-1] == strings[-1e5])
       user  system elapsed
      0.036   0.000   0.035
    system.time(strcmp(strings[-1], strings[-1e5]))
       user  system elapsed
      0.032   0.000   0.034

That's pretty fast, though I seem to be working with a slower system than Hadley. It's hard to see how this could be improved, except maybe by caching results of string comparisons.

-Matt

> Hadley
>
> On Tue, Jul 13, 2010 at 6:52 AM, Ralf B ralf.bie...@gmail.com wrote:
>> I am asking this question because string comparison in R seems to be awfully slow (based on profiling results) and I wonder if perhaps '==' alone is not the best one can do. I did not ask for anything particular and I don't think I need to provide a self-contained source example for the question. So, to re-phrase my question: are there more (runtime) effective ways to find out if two strings (about 100-150 characters long) are equal?
>>
>> Ralf
>>
>> On Sun, Jul 11, 2010 at 2:37 PM, Sharpie ch...@sharpsteen.net wrote:
>>> Ralf B wrote:
>>>> What is the fastest way to compare two strings in R?
>>>>
>>>> Ralf
>>>
>>> Which way is not fast enough? In other words, are you asking this question because profiling showed one of R's string comparison operations is causing a massive bottleneck in your code? If so, which one and how are you using it?
>>>
>>> -Charlie
>>>
>>> Charlie Sharpsteen
>>> Undergraduate -- Environmental Resources Engineering
>>> Humboldt State University

-- Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina
http://biostatmatt.com
Re: [R] Substring function?
Well, %in% really checks whether an element is in a set; it is not a substring operator. To get the result you want, try

    content[grepl(search$signatures, content$urls), ]

For multiple search strings you could try

    sapply(search$signatures, grepl, x = content$urls)

Nikhil Kaza
Asst. Professor, City and Regional Planning
University of North Carolina
nikhil.l...@gmail.com

On Jul 13, 2010, at 8:22 AM, Ralf B wrote:
> Hi all,
>
> I would like to detect all strings in the vector 'content' that contain the strings from the vector 'search'. Here is a code example:
>
>     content <- data.frame(urls = c(
>       "http://www.google.com/search?source=ighl=enrlz==q=stuffaq=faqi=g10aql=oq=gs_rfai=CrrIS3",
>       "http://search.yahoo.com/search;_ylt=Atvki9MVpnxuEcPmXLEWgMqbvZx4?p=stufftoggle=1"))
>     search <- data.frame(signatures = c("http://www.google.com/search"))
>     subset(content, search$signatures %in% content$urls)
>
> I am getting an empty result:
>
>     [1] urls
>     0 rows (or 0-length row.names)
>
> What I would like to achieve is the return of http://www.google.com/search?source=ighl=enrlz==q=stuffaq=faqi=g10aql=oq=gs_rfai=CrrIS3 . Is that possible? In practice I would like to run this over 1000s of strings in 'content' and 100s of strings in 'search'. Could I run into performance issues with this approach and, if so, are there better ways?
>
> Best,
> Ralf
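The substring matching described above can be sketched as follows. The URLs here are shortened stand-ins rather than the ones from the original post, and fixed = TRUE is an addition not in the reply: it makes grepl() treat each signature as a literal string instead of a regular expression, which matters for URLs containing characters such as "?" or ".".

```r
# Shortened, made-up URLs for illustration only
content <- data.frame(
  urls = c("http://www.google.com/search?q=stuff",
           "http://search.yahoo.com/search?p=stuff"),
  stringsAsFactors = FALSE
)
search <- data.frame(
  signatures = c("http://www.google.com/search"),
  stringsAsFactors = FALSE
)

# One logical vector per signature, combined with OR:
# TRUE where a URL contains at least one signature
per_signature <- lapply(search$signatures, grepl,
                        x = content$urls, fixed = TRUE)
hits <- Reduce(`|`, per_signature)

matched <- content[hits, , drop = FALSE]
```

Using lapply() with Reduce() rather than sapply() keeps the result a plain logical vector regardless of how many URLs or signatures there are. With thousands of URLs and hundreds of signatures this is a few hundred vectorised grepl() calls, which is unlikely to be a bottleneck, though that is an expectation rather than a measurement.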
[R] RODBC and Excel 2010 xlsx
Hi List,

I just want to know whether this issue is specific to my setup or a general one caused by the new MS Office release. I'm using 32-bit R 2.11.1 on Windows 7 x64 with MS Office 2010 x64 installed. I can import .xls files normally (the same way I did with 32-bit Excel 2007). But the function odbcConnectExcel2007 can no longer import .xlsx files now that I have the new version of Office. It gives the following warning messages, which make importing through sqlFetch impossible:

    Warning messages:
    1: In odbcDriverConnect(con, tabQuote = c("[", "]"), ...) :
      [RODBC] ERROR: state IM002, code 0, message [Microsoft][ODBC Driver Manager] Nome da fonte de dados não encontrado e nenhum driver padrão especificado (Source name not found and no default driver specified)
    2: In odbcDriverConnect(con, tabQuote = c("[", "]"), ...) :
      ODBC connection failed

For now I'm working around it by converting my .xlsx files to .xls. The question is simple: is this an expected issue that will be worked out, like the one when the xlsx format was first released, or do I have a specific problem with one of my system components (drivers)?

Thank you very much for the attention.

Rodrigo Aluizio
Re: [R] SAS Proc summary/means as a R function
Thanks Richard and Erik,

I hate to buy the book and not find the solution to the following:

    proc.means <- function() {
      deparse(match.call()[-1])
    }
    proc.means(this is a sentence)
    unexpected symbol in proc means(this is)

One possible solution would be to 'peek' into the memory buffer that holds the function arguments.

It is easy to replicate the 'dataset' output for many SAS procs (i.e. transpose, freq, summary, means...). I am not interested in 'report writing in R'. The hard part is parsing the SAS syntax. I wish R had a drop-down to Perl:

    perl on;
      some perl code
    perl off;

also

    sas on;
      some SAS code
    sas off;

The purpose of parmbuff is to turn off R's scanning and resolution of function arguments and just provide the bare text between '(' and ')' in the function call. This is a very powerful construct. A function would provide something like

    sas.on( )
[R] Regarding installation from ROracle
I am using Windows XP and R 2.10. I tried to install the ROracle package and got the following error:

    This application has failed to start because orasql9.dll was not found. Re-installing the application may fix this problem

I have already installed the dependency package DBI. Please help me.
Re: [R] Accessing files on password-protected FTP sites
Thanks for the tip. From the link you posted:

| You can embed the user id and password into the URL. For example:
|
| http://userid:passw...@www.anywhere.com/
| ftp://userid:passw...@ftp.anywhere.com/

I'm still having issues, though. I am trying to fetch some csv files from a storage site used by my company, and I've tried the read.csv and download.file commands. These are the error messages that pop up:

    read.csv("ftp://userid:passw...@ftp.anywhere.com/data.csv")
    Error in file(file, "rt") : cannot open the connection

    download.file("ftp://userid:passw...@ftp.anywhere.com/data.csv", "C:/data.csv")
    trying URL 'ftp://userid:passw...@ftp.anywhere.com/data.csv'
    Error in download.file("ftp://userid:passw...@ftp.anywhere.com/data.csv", :
      cannot open URL 'ftp://userid:passw...@ftp.anywhere.com/data.csv'

Am I leaving out any important options from these commands that would allow me to access the site if I included them? When I type the URL into Firefox the same way I have entered it into R, I get the files I need. But for my particular project, I am going to have to automate the process.

Obviously these are not my real user ID, password, or website name. In case it is relevant, I am trying to access files that store information on the positions in my company's stock portfolio; these files are stored on our brokerage firm's website.
[R] SARIMA model
Dear All,

Could someone please advise me on the appropriate package for fitting a SARIMA model?

Thanks,
Fir
Re: [R] Continuing on with a loop when there's a failure
Thanks again, David.

...but, alas, I still can't get it to work! Here's what I'm trying now:

    for (i in 1:2) {
      mod.poly3 <- try(lrm(x[,i] ~ pol(x1, 3) + pol(x2, 3), data=x))
      results[1,i] <- anova(mod.poly3)[1,3]
    }

Here's what happens (from the console):

    Error in fitter(X, Y, penalty.matrix = penalty.matrix, tol = tol, weights = weights, :
      NA/NaN/Inf in foreign function call (arg 1)
    Error in UseMethod("anova") :
      no applicable method for 'anova' applied to an object of class "try-error"

...so I still can't make my results matrix. Could I ask you for some specific code to make this work? I'm not that familiar with the syntax for try or tryCatch, and the help files for them are pretty bad, in my humble opinion.

I should clarify that I actually don't care about the failed runs per se. I just want R to keep going in spite of them and give me my results matrix.

From: David Winsemius dwinsem...@comcast.net
Cc: R Help r-help@r-project.org
Sent: Mon, July 12, 2010 8:09:03 PM
Subject: Re: [R] Continuing on with a loop when there's a failure

On Jul 12, 2010, at 6:18 PM, Josh B wrote:
> Hi R sages,
>
> Here is my latest problem. Consider the following toy example:
>
>     x <- read.table(textConnection("y1 y2 y3 x1 x2
>     indv.1 bagels donuts bagels 4 6
>     indv.2 donuts donuts donuts 5 1
>     indv.3 donuts donuts donuts 1 10
>     indv.4 donuts donuts donuts 10 9
>     indv.5 bagels donuts bagels 0 2
>     indv.6 bagels donuts bagels 2 9
>     indv.7 bagels donuts bagels 8 5
>     indv.8 bagels donuts bagels 4 1
>     indv.9 donuts donuts donuts 3 3
>     indv.10 bagels donuts bagels 5 9
>     indv.11 bagels donuts bagels 9 10
>     indv.12 bagels donuts bagels 3 1
>     indv.13 donuts donuts donuts 7 10
>     indv.14 bagels donuts bagels 2 10
>     indv.15 bagels donuts bagels 9 6"), header = TRUE)
>
> I want to fit a logistic regression of y1 on x1 and x2. Then I want to run a logistic regression of y2 on x1 and x2. Then I want to run a logistic regression of y3 on x1 and x2. In reality I have many more Y columns than simply y1, y2, and y3, so I must design a loop.
>
> Notice that y2 is invariant and thus it will fail. In reality, some y columns will fail for much more subtle reasons. Simply screening my data to eliminate invariant columns will not eliminate the problem.
>
> What I want to do is output a piece of the results from each run of the loop to a matrix. I want the loop to try each of my y columns, and not give up and stop running simply because a particular y column is bad. I want it to give me NA or something similar in my results matrix for the bad y columns, but I want it to keep going and give me good data for the good y columns. For instance:
>
>     results <- matrix(nrow = 1, ncol = 3)
>     colnames(results) <- c("y1", "y2", "y3")
>     for (i in 1:2) {
>       mod.poly3 <- lrm(x[,i] ~ pol(x1, 3) + pol(x2, 3), data=x)
>       results[1,i] <- anova(mod.poly3)[1,3]
>     }
>
> If I run this code, it gives up when fitting y2 because y2 is bad. It doesn't even try to fit y3. Here's what my console shows:
>
>     results
>                 y1 y2 y3
>     [1,] 0.6976063 NA NA
>
> As you can see, it gave up before fitting y3, which would have worked. How do I force my code to keep going through the loop, despite the rotten apples it encounters along the way?

?try

http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-I-capture-or-ignore-errors-in-a-long-simulation_003f

(Doesn't only apply to simulations.)

> Exact code that gets the job done is what I am interested in. I am a post-doc -- I am not taking any classes. I promise this is not a homework assignment!

-- David Winsemius, MD West Hartford, CT
Re: [R] Continuing on with a loop when there's a failure
In my opinion the try and tryCatch commands are written and documented rather poorly. Thus I am not sure what to program exactly.

For instance, I could query mod.poly3 and use an if/then statement to proceed, but querying mod.poly3 is weird. Here's the output when it fails:

    mod.poly3 <- try(lrm(x[,2] ~ pol(x1, 3) + pol(x2, 3), data=x))
    Error in fitter(X, Y, penalty.matrix = penalty.matrix, tol = tol, weights = weights, :
      NA/NaN/Inf in foreign function call (arg 1)
    mod.poly3
    [1] "Error in fitter(X, Y, penalty.matrix = penalty.matrix, tol = tol, weights = weights, : \n NA/NaN/Inf in foreign function call (arg 1)\n"
    attr(,"class")
    [1] "try-error"

...and here's the output when it succeeds:

    mod.poly3 <- try(lrm(x[,1] ~ pol(x1, 3) + pol(x2, 3), data=x))
    mod.poly3

    Logistic Regression Model

    lrm(formula = x[, 1] ~ pol(x1, 3) + pol(x2, 3), data = x)

    Frequencies of Responses
    bagels donuts
        10      5

     Obs  Max Deriv  Model L.R.  d.f.       P     C
      15      4e-04        3.37     6  0.7616  0.76

     Dxy  Gamma  Tau-a     R2  Brier      g
    0.52   0.52  0.248  0.279  0.183  1.411

     gr     gp
    4.1  0.261

                   Coef     S.E.  Wald Z       P
    Intercept  -5.68583  5.23295   -1.09  0.2772
    x1          1.87020  2.14635    0.87  0.3836
    x1^2       -0.42494  0.48286   -0.88  0.3788
    x1^3        0.02845  0.03120    0.91  0.3618
    x2          3.49560  3.54796    0.99  0.3245
    x2^2       -0.94888  0.82067   -1.16  0.2476
    x2^3        0.06362  0.05098    1.25  0.2121

...so what exactly would I query to design my if/then statement?

From: David Winsemius dwinsem...@comcast.net
To: David Winsemius dwinsem...@comcast.net
Sent: Tue, July 13, 2010 9:09:04 AM
Subject: Re: [R] Continuing on with a loop when there's a failure

On Jul 13, 2010, at 9:04 AM, David Winsemius wrote:

On Jul 13, 2010, at 8:47 AM, Josh B wrote:

Thanks again, David. [[elided Yahoo spam]]

(BTW, it did work.)

Here's what I'm trying now:

    for (i in 1:2) {
      mod.poly3 <- try(lrm(x[,i] ~ pol(x1, 3) + pol(x2, 3), data=x))
      results[1,i] <- anova(mod.poly3)[1,3]
    }

You need to do some programming. (Or I suppose you could wrap both the lrm and the anova calls in try.)

You did not get an error from the lrm but rather from the anova call, because you tried to give the results of the try function to anova without first checking to see if an error had occurred.

--David.

[snip]
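The check David describes, testing what try() returned before passing it on, can be sketched as below. This is an illustration under stated assumptions, not the code from the thread: glm() from base R stands in for lrm() so that the example runs without extra packages, fit_stat() is a made-up helper, the explicit stop() for a one-level response is added so the failure is deterministic, and the residual deviance stands in for the anova statistic.

```r
# Toy data from the thread (y2 is invariant)
x <- data.frame(
  y1 = c("bagels", "donuts", "donuts", "donuts", "bagels",
         "bagels", "bagels", "bagels", "donuts", "bagels",
         "bagels", "bagels", "donuts", "bagels", "bagels"),
  y2 = rep("donuts", 15),
  x1 = c(4, 5, 1, 10, 0, 2, 8, 4, 3, 5, 9, 3, 7, 2, 9),
  x2 = c(6, 1, 10, 9, 2, 9, 5, 1, 3, 9, 10, 1, 10, 10, 6),
  stringsAsFactors = FALSE
)
x$y3 <- x$y1

# Hypothetical helper: fit one response and return a single number.
# glm() is a stand-in for lrm(); the statistic returned is the deviance.
fit_stat <- function(y, data) {
  y <- factor(y)
  if (nlevels(y) < 2) stop("response has only one level")
  fit <- glm(y ~ x1 + x2, family = binomial, data = data)
  deviance(fit)
}

ycols <- c("y1", "y2", "y3")
results <- matrix(NA_real_, nrow = 1, ncol = length(ycols),
                  dimnames = list(NULL, ycols))

for (j in ycols) {
  res <- try(fit_stat(x[[j]], x), silent = TRUE)
  # try() returns an object of class "try-error" on failure;
  # only store the value when the fit succeeded, otherwise keep NA
  if (!inherits(res, "try-error")) results[1, j] <- res
}
results
```

The key line is the inherits(res, "try-error") test, which answers the "what exactly would I query" question: the loop carries on past y2, leaves NA in its column, and still fills in y3.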
Re: [R] RODBC and Excel 2010 xlsx
On Tue, Jul 13, 2010 at 9:31 AM, Rodrigo Aluizio r.alui...@gmail.com wrote:
> Hi List,
>
> I just want to know whether this issue is specific to my setup or a general one caused by the new MS Office release. I'm using 32-bit R 2.11.1 on Windows 7 x64 with MS Office 2010 x64 installed. I can import .xls files normally (the same way I did with 32-bit Excel 2007). But the function odbcConnectExcel2007 can no longer import .xlsx files now that I have the new version of Office. It gives the following warning messages, which make importing through sqlFetch impossible:
>
>     Warning messages:
>     1: In odbcDriverConnect(con, tabQuote = c("[", "]"), ...) :
>       [RODBC] ERROR: state IM002, code 0, message [Microsoft][ODBC Driver Manager] Nome da fonte de dados não encontrado e nenhum driver padrão especificado (Source name not found and no default driver specified)
>     2: In odbcDriverConnect(con, tabQuote = c("[", "]"), ...) :
>       ODBC connection failed
>
> For now I'm working around it by converting my .xlsx files to .xls. The question is simple: is this an expected issue that will be worked out, like the one when the xlsx format was first released, or do I have a specific problem with one of my system components (drivers)?
>
> Thank you very much for the attention.

I suspect it's a driver issue, but you might want to look over the variety of methods for reading in Excel spreadsheets here:

http://rwiki.sciviews.org/doku.php?id=tips:data-io:ms_windowss=excel
Re: [R] interpretation of svm models with the e1071 package
Hi,

On Sat, Jul 10, 2010 at 12:35 AM, Noah Silverman n...@smartmediacorp.com wrote:
> Steve,
>
> Couldn't he also just use the decision.value property to see the equivalent of t(x) %*% b for each row?

I don't follow what you're saying. What is this the equivalent of? What's b here? The bias/offset?

-- Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
Re: [R] Fast string comparison
Hi Matt,

I think there are some confounding factors in your results.

system.time(strcmp(strings[-1], strings[-1e5])) also includes the time required to perform both subscripting operations (strings[-1] and strings[-1e5]), which actually takes some time.

Also, you do have a bit of overhead due to the use of STRING_ELT and the write barrier. I've included below a version that uses R internals so that you get the fast (but you have to understand the risks, etc ...) version of STRING_ELT, using the plugin system of inline.

    library(inline)
    code <- "
    SEXP ans;
    int i, len, *cans;
    if(!isString(s1) || !isString(s2))
        error(\"invalid arguments\");
    len = length(s1)>length(s2)?length(s2):length(s1);
    PROTECT(ans = allocVector(INTSXP, len));
    cans = INTEGER(ans);
    for(i = 0; i < len; i++)
        cans[i] = strcmp(CHAR(STRING_ELT(s1,i)),
                         CHAR(STRING_ELT(s2,i)));
    UNPROTECT(1);
    return ans;
    "
    sig <- signature(s1 = "character", s2 = "character")
    strcmp <- cfunction(sig, code)

    strings <- replicate(1e5, paste(sample(letters, 100, rep = T), collapse = ""))
    lhs <- strings[-1]
    rhs <- strings[-1e5]
    system.time(lhs == rhs)
    system.time(strcmp(lhs, rhs))

    settings <- getPlugin("default")
    settings$includes <- paste("#define USE_RINTERNALS", settings$includes, collapse = "\n")
    code2 <- code   # same C source as above, compiled with USE_RINTERNALS defined
    strcmp2 <- cxxfunction(sig, code2, settings = settings)
    system.time(strcmp2(lhs, rhs))

I get:

    $ Rscript strings.R
    Loading required package: methods
       user  system elapsed
      0.002   0.000   0.002
       user  system elapsed
      0.004   0.000   0.005
       user  system elapsed
      0.003   0.000   0.003

Romain

On 13/07/10 15:24, Matt Shotwell wrote:

[snip]

-- Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://bit.ly/bc8jNi : Rcpp 0.8.4
|- http://bit.ly/dz0RlX : bibtex 0.2-1
`- http://bit.ly/a5CK2h : Les estivales 2010
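Romain's point about subscripting being included in the timings can be checked in plain R, without any compiled code. The sketch below uses a smaller vector (1e4 strings rather than 1e5) so it runs quickly; the variable names are chosen for this illustration, and no timings are quoted because they depend on the machine.

```r
set.seed(1)
n <- 1e4
strings <- replicate(n, paste(sample(letters, 100, replace = TRUE),
                              collapse = ""))

# Subscripting done once, outside the timed expression
lhs <- strings[-1]
rhs <- strings[-n]

# Timing that includes building both subsets
t_with_subset  <- system.time(eq1 <- strings[-1] == strings[-n])

# Timing of the comparison alone
t_compare_only <- system.time(eq2 <- lhs == rhs)

# Both approaches give the same answer; only the timed work differs
stopifnot(identical(eq1, eq2))
print(rbind(with_subset = t_with_subset, compare_only = t_compare_only))
```

For timings this small a single system.time() call is noisy, so repeating each expression many times and comparing totals gives a fairer picture.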
[R] Extract Clusters from Biclust Object - writeBiclusterResults
Update: Solution

Dear all,

just in case someone has the same question: I found the function writeBiclusterResults. It writes the results for all modules together into one file, containing the gene names and array/experiment names. It does not contain the values, however, so you have to extract these yourself from the original data file.
[R] Discretizing a normal distribution to predefined bands
Dear useRs,

I am facing the following problem in R and hope you can help me. I want to discretize a normal distribution into 4 predefined bands. The bands are 1, 2, 10 and 20. In order to maintain the symmetric shape and the mean of the density, I need to cut off all negative values and the corresponding part on the positive axis, and allocate the mass taken away proportionally over the remaining support.

I tried the function discretize from the actuar package, but I am not sure I defined the step and range properly.

I am sorry for the maybe trivial question, and thanks in advance for any help!

mary
Re: [R] interpretation of svm models with the e1071 package
Hi,

On Mon, Jul 12, 2010 at 4:55 AM, manuel.martin manuel.mar...@orleans.inra.fr wrote:
> On 07/10/2010 04:11 AM, Steve Lianoglou wrote:
>> On Fri, Jul 9, 2010 at 12:15 PM, manuel.martin manuel.mar...@orleans.inra.fr wrote:
>> <snip>
>>> Dear all,
>>>
>>> after having calibrated a svm model through the svm() command of the e1071 package, is there a way to i) represent the modeled relationships between the y and X variables (response variable vs. predictors)?
>>
>> Can you explain a bit more ... how do you want them represented?
>
> I was thinking of a simple ŷ = fi(Xi) plot, fi resulting from the fitted svm model. Xi is the predictor, among the whole set of predictors X, whose relationship with the response one wishes to see. For boosted regression trees, which I am more familiar with, this fi function is estimated by averaging the effects of all predictors but Xi, and plotting how ŷ varies as Xi does.

I still think you might be able to get some mileage out of calculating your W vector and looking at the values in each of its coordinates/bins.

One problem with trying to produce the plot you are after is that, depending on the choice of kernel used for your SVM, rigging up such an fi(Xi) plot might not be as straightforward as you might think, since kernels can manipulate your feature space in fun ways.

There's some literature out there about how to extract meaning/features from an SVM model. Perhaps you can search through some of that to help get some ideas.

-- Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
[R] Batch file export
Dear all,

I have code that generates data vectors within R. For example, assume:

    z <- rlnorm(1000, meanlog = 0, sdlog = 1)

Every time a vector has been generated I would like to export it to a csv file. So my idea is something like this:

    for (i in 1:100) {
      z <- rlnorm(1000, meanlog = 0, sdlog = 1)
      write.csv(z, "c:/z_i.csv")
    }

where z_i.csv is a filename that is related to the run (e.g. z_001.csv, z_002.csv, ...).

Could anyone please advise me on the most convenient way of doing this?

Thanks very much in advance,
Michael
Re: [R] SAS Proc summary/means as a R function
On 13/07/2010 8:39 AM, Roger Deangelis wrote:

Thanks Richard and Erik, I hate to buy the book and not find the solution to the following:

    proc.means <- function() { deparse(match.call()[-1]) }
    proc.means(this is a sentence)
    unexpected symbol in "proc.means(this is"

One possible solution would be to 'peek' into the memory buffer that holds the function arguments. It is easy to replicate the 'dataset' output for many SAS procs (i.e. transpose, freq, summary, means...). I am not interested in 'report writing in R'. The hard part is parsing the SAS syntax. I wish R had a drop down to PERL.

    perl on;
    some perl code
    perl off;

It would not be hard to write something like that. The syntax would be

    perl("some perl code")

where the function is something like

    perl <- function(code) {
      f <- tempfile()
      writeLines(code, f)
      system(paste("perl", f))
    }

You do need to watch out for escapes in the text, or be careful about what quotes you use, e.g.

    > perl('
    + print "Hello World\n";
    + ')
    Hello World

Similarly for SAS, but I don't know how you tell SAS to process a file.

Duncan Murdoch

also

    sas on;
    some SAS code
    sas off;

The purpose of parmbuff is to turn off R's scanning and resolution of function arguments and just provide the bare text between '(' and ')' in the function call. This is a very powerful construct. A function would provide something like sas.on( )
Re: [R] Batch file export
    write.csv(z, paste("c:/z_", i, ".csv", sep=''))

You will have to modify this to prepend 0s.

Nikhil Kaza
Asst. Professor, City and Regional Planning
University of North Carolina
nikhil.l...@gmail.com

On Jul 13, 2010, at 10:03 AM, Michael Haenlein wrote:

Dear all, I have code that generates data vectors within R. For example, assume:

    z <- rlnorm(1000, meanlog = 0, sdlog = 1)

Every time a vector has been generated I would like to export it into a csv file. So my idea is something as follows:

    for (i in 1:100) {
      z <- rlnorm(1000, meanlog = 0, sdlog = 1)
      write.csv(z, "c:/z_i.csv")
    }

where z_i.csv is a filename that is related to the run (e.g. z_001.csv, z_002.csv, ...). Could anyone please advise me on the most convenient way of doing this?

Thanks very much in advance,
Michael
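One way to get the zero-padded names asked for (z_001.csv, z_002.csv, ...) is sprintf() with a %03d format. A minimal sketch, with two assumptions that are not in the thread: tempdir() stands in for the "c:/" directory, and the loop runs only 3 times to keep it short.

```r
out_dir <- tempdir()   # assumption: stand-in for the "c:/" directory in the question

for (i in 1:3) {       # the question uses 1:100
  z <- rlnorm(1000, meanlog = 0, sdlog = 1)
  file_name <- sprintf("z_%03d.csv", i)   # %03d pads the run number with zeros to width 3
  write.csv(z, file.path(out_dir, file_name), row.names = FALSE)
}

list.files(out_dir, pattern = "^z_[0-9]{3}\\.csv$")
```

formatC(i, width = 3, flag = "0") combined with paste() would be an alternative that stays closer to the paste() answer above.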
Re: [R] RODBC and Excel 2010 xlsx
On Tue, 13 Jul 2010, Rodrigo Aluizio wrote:

Hi List, just to know if the issue is only a problem of mine or if it is a general issue due to the new MS Office pack. I'm using R 2.11.1 32 bits in a

It's a Microsoft muddle, covered in the RODBC manual for the next release. Note that R-sig-db is the right list for questions about RODBC, not R-help.

Simply, if you have MS Office 2007/2010 installed, you can only have its ODBC drivers of the same architecture installed. Nothing to do with RODBC nor R, and something MS omits to mention on the download page for the drivers.

Also, the ODBC data sources manager for x64 Windows tells you only about 64-bit drivers and DSNs. There is a different manager for 32-bit, but it is rather hidden.

The simplest thing to do is to use 64-bit R.

I found this out the other way round: I have 32-bit Office installed and could not install the 64-bit ODBC drivers.

The pre-release of RODBC at http://www.stats.ox.ac.uk/pub/R/RODBC_1.3-2.tar.gz is only available as a source package, but you can unpack it and read the updated manual.

Windows 7 x64 with the MS office 2010 x64 installed. I can import .xls files normally (the same way I did with my Excel 2007 32 bits). But the function odbcConnectExcel2007 isn't able to import .xlsx files now that I have the new version of the Office package. It gives me the following warning messages, which make importing through sqlFetch impossible:

    Warning messages:
    1: In odbcDriverConnect(con, tabQuote = c("[", "]"), ...) :
      [RODBC] ERROR: state IM002, code 0, message [Microsoft][ODBC Driver Manager] Nome da fonte de dados não encontrado e nenhum driver padrão especificado
      (Source name not found and no default driver specified)
    2: In odbcDriverConnect(con, tabQuote = c("[", "]"), ...) :
      ODBC connection failed

I'm obviously bypassing it by converting my .xlsx files to .xls. Well, the question is simple. Is this an expected issue, like the one when the xlsx format was released, that will be worked out, or am I having a specific problem with one of my system components (drivers)?

Thank you very much for the attention.
Rodrigo Aluizio

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] How to select the column header with \Sexpr{}
Hi Felipe,

The problem has nothing to do with Sweave or \Sexpr. The problem is that by the time you call \Sexpr, report is a matrix, and you cannot access the column names of a matrix with names(). You need to use colnames() or convert the matrix to a data.frame.

Perhaps a true useR can write R code in a Sweave file without checking it, but for mere mortals it is best to evaluate the R code in an interactive session to make sure it works before asking Sweave to insert it into your .tex file. If you had tried to evaluate names(report)[1] in an interactive session you would have discovered your problem immediately.

Best,
Ista

On Tue, Jul 13, 2010 at 4:15 AM, Felipe Carrillo mazatlanmex...@yahoo.com wrote:

I had tried that earlier and it didn't work either; I probably have \Sexpr in the wrong place. See example. Column one header gets blank:

    \documentclass[11pt]{article}
    \usepackage{longtable,verbatim,ctable}
    \usepackage{longtable,pdflscape}
    \usepackage{fmtcount,hyperref}
    \usepackage{fullpage}
    \title{United States}
    \begin{document}
    \setkeys{Gin}{width=1\textwidth}
    \maketitle
    <<echo=F,results=hide>>=
    report <- structure(list(Date = c("3/12/2010", "3/13/2010", "3/14/2010", "3/15/2010"),
      Run1 = c("33 (119 ? 119)", "n (0 ? 0)", "893 (110 ? 146)", "140 (111 ? 150)"),
      Run2 = c("33 (71 ? 71)", "n (0 ? 0)", "337 (67 ? 74)", "140 (68 ? 84)"),
      Run3 = c("890 (32 ? 47)", "n (0 ? 0)", "10,602 (32 ? 52)", "2,635 (34 ? 66)"),
      Run4 = c("0 ( ? )", "n (0 ? 0)", "0 ( ? )", "0 ( ? )"),
      Run4 = c("0 ( ? )", "n (0 ? 0)", "0 ( ? )", "0 ( ? )")),
      .Names = c("ID_Date", "Run1", "Run2", "Run3", "Run4", "Run5"),
      row.names = c(NA, 4L), class = "data.frame")
    require(stringr)
    report <- t(apply(report, 1, function(x) {str_replace(x, "\\?", "-")}))
    #report
    #latex(report,file="")
    @
    \begin{landscape}
    \begin{table}[!tbp]
    \begin{center}
    \begin{tabular}{ll}\hline\hline
    \multicolumn{1}{c}{\Sexpr{names(report)[1]}}& # Using \Sexpr here
    \multicolumn{1}{c}{Run1}&
    \multicolumn{1}{c}{Run2}&
    \multicolumn{1}{c}{Run3}&
    \multicolumn{1}{c}{Run4}&
    \multicolumn{1}{c}{Run5}\tabularnewline
    \hline
    1&3/12/2010&33 (119 ? 119)&33 (71 ? 71)&890 (32 ? 47)&0 ( ? )&0 ( ? )\tabularnewline
    2&3/13/2010&n (0 ? 0)&n (0 ? 0)&n (0 ? 0)&n (0 ? 0)&n (0 ? 0)\tabularnewline
    3&3/14/2010&893 (110 ? 146)&337 (67 ? 74)&10,602 (32 ? 52)&0 ( ? )&0 ( ? )\tabularnewline
    4&3/15/2010&140 (111 ? 150)&140 (68 ? 84)&2,635 (34 ? 66)&0 ( ? )&0 ( ? )\tabularnewline
    \hline
    \end{tabular}
    \end{center}
    \end{table}
    \end{landscape}
    \end{document}

Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish & Wildlife Service
California, USA

----- Original Message ----
From: David Winsemius dwinsem...@comcast.net
To: Felipe Carrillo mazatlanmex...@yahoo.com
Cc: Duncan Murdoch murdoch.dun...@gmail.com; r-h...@stat.math.ethz.ch
Sent: Mon, July 12, 2010 3:14:49 PM
Subject: Re: [R] How to select the column header with \Sexpr{}

On Jul 12, 2010, at 5:45 PM, Felipe Carrillo wrote:

Thanks for the quick reply Duncan. I don't think I have explained myself well. I have a dataset named report and my column headers are run1, run2, run3, run4 and so on. I know how to access the data below those columns with \Sexpr{report[1,1]}, \Sexpr{report[1,2]} and so on, but I can't access my column headers with \Sexpr{} because I can't find the way to reference run1, run2, run3 and run4. Sorry if I am not explaining myself really well.

Wouldn't this just be:

    \Sexpr{names(report)} # ?

or perhaps you want specific items in that vector? \Sexpr{names(report)[1]}, \Sexpr{names(report)[2]}, etc.

--David.

----- Original Message ----
From: Duncan Murdoch murdoch.dun...@gmail.com
To: Felipe Carrillo mazatlanmex...@yahoo.com
Cc: r-h...@stat.math.ethz.ch
Sent: Mon, July 12, 2010 2:18:15 PM
Subject: Re: [R] How to select the column header with \Sexpr{}

On 12/07/2010 5:10 PM, Felipe Carrillo wrote:

Hi: Since I work with a few different fish runs, my column headers change every time I start a new year. I have been using \Sexpr{} for my rows and columns and now I am trying to use it with my report column headers. \Sexpr{1,1} is row 1 column 1; what can I use for headers? I tried \Sexpr{0,1} but Sweave didn't like it. Thanks in advance for any hints.

\Sexpr takes an R expression, and inserts the first element of the result into your text. Using just "0,1" (not including the quotes) is not a valid R expression. You need to use paste() or some other function to construct the label you want to put in place, e.g. \Sexpr{paste(0,1,sep=",")} will give you "0,1".

Duncan Murdoch

David Winsemius, MD
West Hartford, CT
Re: [R] Continuing on with a loop when there's a failure
On Jul 13, 2010, at 9:24 AM, Josh B wrote:

In my opinion the try and tryCatch commands are written and documented rather poorly. Thus I am not sure what to program exactly.

Didn't you see the silent parameter? It seems to be documented fairly clearly to me. The testing of try at the console is not going to be very illuminating, since it really only has value within a function that you want to continue despite an error. try() WILL provide that facility but _you_ need to decide what you do with the information it returns, which in the case of its use with the default silent=FALSE is just the error message itself.

For instance, I could query mod.poly3 and use an if/then statement to proceed,

So why didn't you? A good result would be signaled by:

    "lrm" %in% class(mod.poly3)

-- David.

but querying mod.poly3 is weird. For instance, here's the output when it fails:

    > mod.poly3 <- try(lrm(x[,2] ~ pol(x1, 3) + pol(x2, 3), data=x))
    Error in fitter(X, Y, penalty.matrix = penalty.matrix, tol = tol, weights = weights, :
      NA/NaN/Inf in foreign function call (arg 1)
    > mod.poly3
    [1] "Error in fitter(X, Y, penalty.matrix = penalty.matrix, tol = tol, weights = weights, : \n NA/NaN/Inf in foreign function call (arg 1)\n"
    attr(,"class")
    [1] "try-error"

...and here's the output when it succeeds:

    > mod.poly3 <- try(lrm(x[,1] ~ pol(x1, 3) + pol(x2, 3), data=x))
    > mod.poly3

    Logistic Regression Model

    lrm(formula = x[, 1] ~ pol(x1, 3) + pol(x2, 3), data = x)

    Frequencies of Responses
    bagels donuts
        10      5

      Obs  Max Deriv  Model L.R.  d.f.       P      C
       15      4e-04        3.37     6  0.7616   0.76
      Dxy      Gamma       Tau-a    R2   Brier      g
     0.52       0.52       0.248 0.279   0.183  1.411
       gr         gp
      4.1      0.261

                  Coef    S.E.  Wald Z       P
    Intercept -5.68583 5.23295   -1.09  0.2772
    x1         1.87020 2.14635    0.87  0.3836
    x1^2      -0.42494 0.48286   -0.88  0.3788
    x1^3       0.02845 0.03120    0.91  0.3618
    x2         3.49560 3.54796    0.99  0.3245
    x2^2      -0.94888 0.82067   -1.16  0.2476
    x2^3       0.06362 0.05098    1.25  0.2121

...so what exactly would I query to design my if/then statement?

From: David Winsemius dwinsem...@comcast.net
To: David Winsemius dwinsem...@comcast.net
Cc: Josh B josh...@yahoo.com; R Help r-help@r-project.org
Sent: Tue, July 13, 2010 9:09:04 AM
Subject: Re: [R] Continuing on with a loop when there's a failure

On Jul 13, 2010, at 9:04 AM, David Winsemius wrote:
On Jul 13, 2010, at 8:47 AM, Josh B wrote:

Thanks again, David. ...but, alas, I still can't get it to work!

(BTW, it did work.)

Here's what I'm trying now:

    for (i in 1:2) {
      mod.poly3 <- try(lrm(x[,i] ~ pol(x1, 3) + pol(x2, 3), data=x))
      results[1,i] <- anova(mod.poly3)[1,3]
    }

You need to do some programming. (Or I suppose you could wrap both the lrm and the anova calls in try.)

You did not get an error from the lrm but rather from the anova call because you tried to give the results of the try function to anova without first checking to see if an error had occurred.

--David.

Here's what happens (from the console):

    Error in fitter(X, Y, penalty.matrix = penalty.matrix, tol = tol, weights = weights, :
      NA/NaN/Inf in foreign function call (arg 1)
    Error in UseMethod("anova") :
      no applicable method for 'anova' applied to an object of class "try-error"

...so I still can't make my results matrix. Could I ask you for some specific code to make this work? I'm not that familiar with the syntax for try or tryCatch, and the help files for them are pretty bad, in my humble opinion. I should clarify that I actually don't care about the failed runs per se. I just want R to keep going in spite of them and give me my results matrix.

From: David Winsemius dwinsem...@comcast.net
To: Josh B josh...@yahoo.com
Cc: R Help r-help@r-project.org
Sent: Mon, July 12, 2010 8:09:03 PM
Subject: Re: [R] Continuing on with a loop when there's a failure

On Jul 12, 2010, at 6:18 PM, Josh B wrote:

Hi R sages, Here is my latest problem. Consider the following toy example:

    x <- read.table(textConnection("y1 y2 y3 x1 x2
    indv.1 bagels donuts bagels 4 6
    indv.2 donuts donuts donuts 5 1
    indv.3 donuts donuts donuts 1 10
    indv.4 donuts donuts donuts 10 9
    indv.5 bagels donuts bagels 0 2
    indv.6 bagels donuts bagels 2 9
    indv.7 bagels donuts bagels 8 5
    indv.8 bagels donuts bagels 4 1
    indv.9 donuts donuts donuts 3 3
    indv.10 bagels donuts bagels 5 9
    indv.11 bagels donuts bagels 9 10
    indv.12 bagels donuts bagels 3 1
    indv.13 donuts donuts donuts 7 10
    indv.14 bagels donuts bagels 2 10
    indv.15 bagels donuts bagels 9 6"), header = TRUE)

I want to fit a logistic regression of y1 on x1 and x2. Then I want to run a logistic regression of y2 on x1 and x2. Then I want to run a logistic regression of y3 on x1 and x2. In reality I have many more Y
Re: [R] Continuing on with a loop when there's a failure
On Jul 13, 2010, at 10:26 AM, David Winsemius wrote:
On Jul 13, 2010, at 9:24 AM, Josh B wrote:

In my opinion the try and tryCatch commands are written and documented rather poorly. Thus I am not sure what to program exactly.

Didn't you see the silent parameter? It seems to be documented fairly clearly to me. The testing of try at the console is not going to be very illuminating, since it really only has value within a function that you want to continue despite an error. try() WILL provide that facility but _you_ need to decide what you do with the information it returns, which in the case of its use with the default silent=FALSE is just the error message itself.

For instance, I could query mod.poly3 and use an if/then statement to proceed,

So why didn't you? A good result would be signaled by:

rather:

    "lrm" %in% class(mod.poly3)

-- David.

but querying mod.poly3 is weird. For instance, here's the output when it fails:

    > mod.poly3 <- try(lrm(x[,2] ~ pol(x1, 3) + pol(x2, 3), data=x))
    Error in fitter(X, Y, penalty.matrix = penalty.matrix, tol = tol, weights = weights, :
      NA/NaN/Inf in foreign function call (arg 1)
    > mod.poly3
    [1] "Error in fitter(X, Y, penalty.matrix = penalty.matrix, tol = tol, weights = weights, : \n NA/NaN/Inf in foreign function call (arg 1)\n"
    attr(,"class")
    [1] "try-error"

...and here's the output when it succeeds:

    > mod.poly3 <- try(lrm(x[,1] ~ pol(x1, 3) + pol(x2, 3), data=x))
    > mod.poly3

    Logistic Regression Model

    lrm(formula = x[, 1] ~ pol(x1, 3) + pol(x2, 3), data = x)

    Frequencies of Responses
    bagels donuts
        10      5

      Obs  Max Deriv  Model L.R.  d.f.       P      C
       15      4e-04        3.37     6  0.7616   0.76
      Dxy      Gamma       Tau-a    R2   Brier      g
     0.52       0.52       0.248 0.279   0.183  1.411
       gr         gp
      4.1      0.261

                  Coef    S.E.  Wald Z       P
    Intercept -5.68583 5.23295   -1.09  0.2772
    x1         1.87020 2.14635    0.87  0.3836
    x1^2      -0.42494 0.48286   -0.88  0.3788
    x1^3       0.02845 0.03120    0.91  0.3618
    x2         3.49560 3.54796    0.99  0.3245
    x2^2      -0.94888 0.82067   -1.16  0.2476
    x2^3       0.06362 0.05098    1.25  0.2121

...so what exactly would I query to design my if/then statement?
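To make the advice in this thread concrete: the usual pattern is to wrap the fit in try(), test the result for class "try-error", and skip the downstream step when the fit failed. A minimal sketch follows. fit_one() is a hypothetical stand-in for the lrm() call (it is made to fail for i == 2 purely for illustration), so the example runs without the package that provides lrm.

```r
# Hypothetical stand-in for a model fit that fails for some inputs.
fit_one <- function(i) {
  if (i == 2) stop("NA/NaN/Inf in foreign function call (arg 1)")
  i * 10
}

results <- rep(NA_real_, 3)    # failed runs simply stay NA

for (i in 1:3) {
  fit <- try(fit_one(i), silent = TRUE)    # silent = TRUE suppresses the error message
  if (inherits(fit, "try-error")) next     # fit failed: move on to the next i
  results[i] <- fit                        # only reached for successful fits
}

results
```

inherits(fit, "try-error") is the counterpart of the "lrm" %in% class(mod.poly3) check suggested above: one tests for failure, the other for success. In the original loop the guarded step would be the anova() call.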
[R] how to extract information from anova results
Hi,

I have used the instruction aov in the following manner:

    res <- aov(qwe ~ asd)

When I type res I get:

    Call:
       aov(formula = qwe ~ asd)

    Terms:
                          asd Residuals
    Sum of Squares  0.0708704 0.5255957
    Deg. of Freedom         1         8

    Residual standard error: 0.2563191
    Estimated effects may be unbalanced

I need to access the value of the Sum of Squares (i.e. I want another variable to be equal to it, e.g. myvar <- Sum.of.Squares). I tried names(res) to see which values are accessible, but I couldn't find the Sum of Squares. I had a similar problem when I tried to access the p.value, which can be readily SEEN using summary(res). In general, is there an easy way to access the values generated by an R function?

Thank you,
LBA
Re: [R] how to extract information from anova results
On Jul 13, 2010, at 10:35 AM, Luis Borda de Agua wrote:

Hi, I have used the instruction aov in the following manner:

    res <- aov(qwe ~ asd)

When I type res I get:

    Call:
       aov(formula = qwe ~ asd)

    Terms:
                          asd Residuals
    Sum of Squares  0.0708704 0.5255957
    Deg. of Freedom         1         8

    Residual standard error: 0.2563191
    Estimated effects may be unbalanced

I need to access the value of the Sum of Squares (i.e. I want another variable to be equal to it, e.g. myvar <- Sum.of.Squares). I tried names(res) to see which values are accessible, but I couldn't find the Sum of Squares. I had a similar problem when I tried to access the p.value, which can be readily SEEN using summary(res). In general, is there an easy way to access the values generated by an R function?

When you typed res, the interpreter determined that it was of class aov and dispatched it to the print method for objects of that class. The list of print methods is accessed with:

    methods(print)

and it's a long list. print.aov is asterisked, so you either need to look at the function with:

    getAnywhere(print.aov)

or perhaps more directly assign summary(res) to an object and access its SS values.

-- David.

Thank you, LBA

David Winsemius, MD
West Hartford, CT
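As a rough illustration of the last suggestion (assign summary(res) to an object and pull the values out of it), here is a small sketch. The data y and g are made up for the example and are not the poster's qwe and asd; str() is the general-purpose tool for seeing what an object actually contains.

```r
# Made-up data: two groups of three observations.
y <- c(1, 2, 3, 4, 5, 6)
g <- factor(rep(c("a", "b"), each = 3))
res <- aov(y ~ g)

tab <- summary(res)[[1]]     # summary() of an aov fit is a list; take its first table
str(tab)                     # lists the columns: Df, Sum Sq, Mean Sq, F value, Pr(>F)

ss <- tab[["Sum Sq"]]        # sums of squares: the term g first, residuals second
p  <- tab[["Pr(>F)"]][1]     # p-value for the term g

myvar <- ss[1]               # e.g. the sum of squares for g
```

Extracting columns by name with [[ avoids depending on the exact row labels of the summary table.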
Re: [R] How to select the column header with \Sexpr{}
Thanks Ista:

I see your point; then I should extract the column names when the dataset is first read, because it is a data frame:

    report <- structure(list(Date = c("3/12/2010", "3/13/2010", "3/14/2010", "3/15/2010"),
      Run1 = c("33 (119 ? 119)", "n (0 ? 0)", "893 (110 ? 146)", "140 (111 ? 150)"),
      Run2 = c("33 (71 ? 71)", "n (0 ? 0)", "337 (67 ? 74)", "140 (68 ? 84)"),
      Run3 = c("890 (32 ? 47)", "n (0 ? 0)", "10,602 (32 ? 52)", "2,635 (34 ? 66)"),
      Run4 = c("0 ( ? )", "n (0 ? 0)", "0 ( ? )", "0 ( ? )"),
      Run4 = c("0 ( ? )", "n (0 ? 0)", "0 ( ? )", "0 ( ? )")),
      .Names = c("ID_Date", "Run1", "Run2", "Run3", "Run4", "Run5"),
      row.names = c(NA, 4L), class = "data.frame")

    > str(report)
    'data.frame': 4 obs. of 6 variables:
     $ ID_Date: chr "3/12/2010" "3/13/2010" "3/14/2010" "3/15/2010"
     $ Run1   : chr "33 (119 ? 119)" "n (0 ? 0)" "893 (110 ? 146)" "140 (111 ? 150)"
     $ Run2   : chr "33 (71 ? 71)" "n (0 ? 0)" "337 (67 ? 74)" "140 (68 ? 84)"
     $ Run3   : chr "890 (32 ? 47)" "n (0 ? 0)" "10,602 (32 ? 52)" "2,635 (34 ? 66)"
     $ Run4   : chr "0 ( ? )" "n (0 ? 0)" "0 ( ? )" "0 ( ? )"
     $ Run5   : chr "0 ( ? )" "n (0 ? 0)" "0 ( ? )" "0 ( ? )"
    > names(report)[1] # I can extract the column name here
    [1] "Date"

But after I use stringr to convert the character '?' to '-', 'report' is not a data frame anymore and names() returns NULL when I try to extract the column names. I was not aware that names() inside \Sexpr{} only works on data frames; thanks for your help.

----- Original Message ----
From: Ista Zahn iz...@psych.rochester.edu
To: Felipe Carrillo mazatlanmex...@yahoo.com
Cc: David Winsemius dwinsem...@comcast.net; r-h...@stat.math.ethz.ch
Sent: Tue, July 13, 2010 7:13:39 AM
Subject: Re: [R] How to select the column header with \Sexpr{}

Hi Felipe, The problem has nothing to do with Sweave or \Sexpr. The problem is that by the time you call \Sexpr, report is a matrix, and you cannot access the column names of a matrix with names(). You need to use colnames() or convert the matrix to a data.frame.
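The names() versus colnames() point can be checked in a plain R session before going anywhere near Sweave. The sketch below uses a cut-down, two-column version of the report data (an assumption made to keep it short) and base sub() in place of stringr's str_replace(). It also shows one way to do the replacement column by column so that report stays a data frame and names() keeps working.

```r
# Cut-down version of the data frame from the thread (two columns, two rows).
report <- data.frame(ID_Date = c("3/12/2010", "3/13/2010"),
                     Run1    = c("33 (119 ? 119)", "n (0 ? 0)"),
                     stringsAsFactors = FALSE)

names(report)[1]          # names() works on a data frame

m <- as.matrix(report)    # a matrix, as in the thread after t(apply(...))
names(m)                  # NULL: a matrix has no names attribute
colnames(m)[1]            # colnames() is the accessor that works for a matrix

# Replacing '?' with '-' one column at a time keeps the data frame intact.
report[] <- lapply(report, function(col) sub("?", "-", col, fixed = TRUE))
names(report)[1]          # still available after the replacement
```

In the Sweave file this would correspond to writing \Sexpr{colnames(report)[1]} for the matrix, or leaving names(report)[1] as it is once report remains a data frame.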
Re: [R] Substring function?
The high-level concept you need is called Regular Expressions. R supports these through several functions; see ?regex.

Ralf B wrote:

Hi all, I would like to detect all strings in the vector 'content' that contain the strings from the vector 'search'. Here is a code example:

    content <- data.frame(urls=c(
      "http://www.google.com/search?source=ighl=enrlz==q=stuffaq=faqi=g10aql=oq=gs_rfai=CrrIS3",
      "http://search.yahoo.com/search;_ylt=Atvki9MVpnxuEcPmXLEWgMqbvZx4?p=stufftoggle=1") )
    search <- data.frame(signatures=c("http://www.google.com/search"))
    subset(content, search$signatures %in% content$urls)

I am getting an empty result:

    [1] urls
    <0 rows> (or 0-length row.names)

What I would like to achieve is the return of "http://www.google.com/search?source=ighl=enrlz==q=stuffaq=faqi=g10aql=oq=gs_rfai=CrrIS3". Is that possible? In practice I would like to run this over 1000s of strings in 'content' and 100s of strings in 'search'. Could I run into performance issues with this approach and, if so, are there better ways?

Best,
Ralf
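For the specific case in the question, where each search string should be found literally inside a URL, grepl() with fixed = TRUE may be enough, and it avoids having to escape regex metacharacters such as '?' and '.'. A small sketch, using simplified illustrative URLs rather than the ones from the post:

```r
# Simplified, illustrative URLs (not the exact strings from the post).
content <- c("http://www.google.com/search?q=stuff",
             "http://search.yahoo.com/search?p=stuff")
search  <- c("http://www.google.com/search")

# One logical vector per search string, combined with element-wise OR:
# TRUE where a URL contains at least one of the search strings.
hit <- Reduce(`|`, lapply(search, grepl, x = content, fixed = TRUE))

content[hit]
```

%in% tests for exact equality of whole strings, which is why the subset() call in the question returned no rows. With thousands of URLs and hundreds of signatures this does one vectorised grepl() call per signature, which should not be a performance concern at that scale.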
Re: [R] how to extract information from anova results
I think the easiest way is to call anova() on your aov class object. For instance:

    y <- 1:10
    x <- runif(10)
    my.aov <- aov(y ~ x)
    anova(my.aov)["Residuals", "Sum Sq"]
    anova(my.aov)["x", "Pr(>F)"]

You can also extract these values from a call to summary(my.aov), but that output is a list (even for an ANOVA with a single error stratum), so you'd have to add [[1]] to select the first element of the list (or, if there were more than one, whichever you wanted).

    summary(my.aov)[[1]]["Residuals", "Sum Sq"]

Cheers,
Josh

On Tue, Jul 13, 2010 at 7:35 AM, Luis Borda de Agua lba...@gmail.com wrote:

Hi, I have used the instruction aov in the following manner:

    res <- aov(qwe ~ asd)

When I type res I get:

    Call:
       aov(formula = qwe ~ asd)

    Terms:
                          asd Residuals
    Sum of Squares  0.0708704 0.5255957
    Deg. of Freedom         1         8

    Residual standard error: 0.2563191
    Estimated effects may be unbalanced

I need to access the value of the Sum of Squares (i.e. I want another variable to be equal to it, e.g. myvar <- Sum.of.Squares). I tried names(res) to see which values are accessible, but I couldn't find the Sum of Squares. I had a similar problem when I tried to access the p.value, which can be readily SEEN using summary(res). In general, is there an easy way to access the values generated by an R function?

Thank you, LBA

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
[R] MplusAutomation
R list-

i have begun using the MplusAutomation package while piloting a large-scale simulation (~200,000 replications). since the package takes advantage of the DOS batch mode available in Mplus, each replication starts and activates a new instance of a command prompt window. this effectively locks me out of my computer for the duration of the simulation.

my question is this: can anyone suggest how i might pass the quiet command to the DOS program? is there a way to generally specify this from R? or any specific recommendations/experience with this package?

thanks,
--matthew

matthew m. gushta
american institutes for research
202.403.5079
Re: [R] Regarding R -installation
Dear Nuncio,

My internet is connected properly and I am running YaST as a superuser. I am getting the following error:

    Problem: Cannot access installation media http://download.opensuse.org/repositories/devel:languages:R:patched/openSUSE_11.2 (Medium 1).
    Check whether the server is accessible
    Download (curl) error for 'http://download.opensuse.org/repositories/devel:languages:R:patched/openSUSE_11.2/repodata/repomd.xml':
    Error code: Connection failed
    Error message: couldn't connect to host

Yours truly,
B. Venkatesh,
University of Hyderabad
India.
Re: [R] isdst warning when rounding a range of time data: fix or suppress?
Dear Clay, dear list,

I face the same problem when rounding POSIXct objects. Have you (or has anybody) found an explanation in the meantime, or a way to work around this issue?

Example:

  opt <- options(digits.secs=3)
  ts1 <- as.POSIXct(c("2006-11-01 09:00:00.03", "2006-11-01 09:00:01", "2006-11-01 09:00:01.0245", "2006-11-01 09:00:01.11", "2006-11-01 09:00:03"), tz="GMT")
  ra1 <- seq(2,6,1)
  data <- data.frame(ts1,ra1)
  data$lo1 <- data$ts1==round.POSIXt(data$ts1,"secs")
  data

Even though all results are correct in this example, is there a chance that incorrect results are returned?

Thanks,
Phil

Clay Heaton wrote:
> Hi,
>
> I'm working with time series data. The values are every 5 seconds and each series can last up to 4-5 days. To generate the x-axis labels, I'm doing the following:
>
>   # Variable for displaying hours on the x-axis
>   rtime <- as.POSIXct(round(range(timedata), "hours"))
>   # Variable for displaying days on the x-axis
>   stime <- as.POSIXct(round(range(timedata), "days"))
>   # Plot the hours on the x-axis
>   axis.POSIXct(1, at=seq(rtime[1], rtime[2], by="hour"), format="%H", cex.axis=.6, lwd=0, lwd.ticks=1, hadj=0.2, las=2, tck=-0.02)
>   # Plot the days on the x-axis
>   axis.POSIXct(1, at=seq(stime[1], stime[2], by="day"), format="%A", cex.axis=.7, line=1, lty=0, padj=-1.4)
>
> The data generated and the plots look fine. R issues a warning on the round() function when rtime is set, though. It looks like this:
>
>   > round(range(cgmtime), "hours")
>   [1] "2003-11-04 14:00:00 EST" "2003-11-07 11:00:00 EST"
>   Warning message:
>   In if (isdst == -1) { :
>     the condition has length > 1 and only the first element will be used
>
> Am I approaching this incorrectly? Is there another way to achieve the same result without the warning? Or is there a way I can suppress the warning?
>
> Thanks in advance,
> Clay
-- View this message in context: http://r.789695.n4.nabble.com/isdst-warning-when-rounding-a-range-of-time-data-fix-or-suppress-tp1680540p2287574.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
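On the "fix or suppress" part of the question, a minimal sketch follows. It assumes a reasonably recent R, where rounding a POSIXct vector is done element by element; the sample timestamps and the names rng_hours and ticks are made up for illustration. Wrapping only the single round() call in suppressWarnings() is one way to silence the message without hiding warnings from the rest of the script.

```r
# Hypothetical timestamps standing in for the poster's 'timedata'
timedata <- as.POSIXct(c("2003-11-04 14:20:05",
                         "2003-11-05 03:00:00",
                         "2003-11-07 10:40:10"), tz = "GMT")

# Round the range to whole hours; the unit is passed positionally.
# suppressWarnings() is scoped to this one call only.
rng_hours <- suppressWarnings(as.POSIXct(round(range(timedata), "hours")))

# Hourly tick positions between the rounded end points
ticks <- seq(rng_hours[1], rng_hours[2], by = "hour")

format(rng_hours, "%Y-%m-%d %H:%M", tz = "GMT")
length(ticks)
```

Comparing the rounded values with a manual calculation on a few known timestamps, as done here, is a cheap way to check that the results are correct whether or not the warning appears.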
Re: [R] Robust regression error: Too many singular resamples
> You could try rlm in the MASS package; it doesn't use the resampling step.

That seems to do the trick. Thank you!

-- View this message in context: http://r.789695.n4.nabble.com/Robust-regression-error-Too-many-singular-resamples-tp2286585p2287468.html Sent from the R help mailing list archive at Nabble.com.
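For readers who have not used it, a minimal sketch of the rlm() suggestion follows. The poster's data are not shown in the thread, so the built-in stackloss data set is used here purely as a stand-in. rlm() fits an M-estimator by iterated re-weighted least squares, so no resampling is involved.

```r
library(MASS)  # rlm() lives in MASS, which ships with R

# Ordinary least squares and a robust M-estimator fit on a stand-in data set
fit_ols <- lm(stack.loss ~ ., data = stackloss)
fit_rlm <- rlm(stack.loss ~ ., data = stackloss)

# Compare the coefficient estimates side by side
round(cbind(ols = coef(fit_ols), rlm = coef(fit_rlm)), 3)
```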
Re: [R] distributing a value for a given month across the number of weeks in that month
Actually, I realized that my task was a bit more complicated, as I have different (let's call them) Markets and the dates repeat themselves across markets. The original code from Gabor gives an error because the dates repeat themselves and apparently zoo cannot handle that. So I had to program a way around it (below). It works. However, I am wondering if there is a shorter/more elegant way of doing it?

Thank you!
Dimitri

  ### My original data frame is a bit more complicated - dates repeat themselves for 2 markets:
  monthly <- data.frame(month=c(20100301,20100401,20100501,20100301,20100401,20100501), monthly.value=c(100,200,300,10,20,30), market=c("Market A","Market A","Market A","Market B","Market B","Market B"))
  monthly$month <- as.character(monthly$month)
  monthly$month <- as.Date(monthly$month,"%Y%m%d")
  (monthly)

  library(zoo)
  # pull in development version of na.locf.zoo
  source("http://r-forge.r-project.org/scm/viewvc.php/*checkout*/pkg/zoo/R/na.locf.R?revision=725&root=zoo")

  # convert to zoo
  my.z.list <- NULL
  for(i in 1:length(levels(monthly$market))){
    my.frame <- monthly[monthly$market %in% levels(monthly$market)[i],1:2]
    my.z.list[[i]] <- with(my.frame, zoo(monthly.value, month))
  }

  # get sequence of all dates and from that get mondays
  all.dates <- seq(start(my.z.list[[1]]), as.Date(as.yearmon(end(my.z.list[[1]])), frac = 1), by = "day")
  mondays <- all.dates[weekdays(all.dates) == "Monday"]
  (mondays)

  # use na.locf to fill in mondays and ave to distribute them
  weekly <- NULL
  for(i in 1:length(levels(monthly$market))){
    weekly[[i]] <- na.locf(my.z.list[[i]], xout = mondays)
    weekly[[i]][] <- ave(weekly[[i]], as.yearmon(mondays), FUN = function(x) x[1]/length(x))
  }
  (weekly)

  ### Creating a data frame with markets stacked on top of each other - like in the original monthly data frame:
  for(i in 1:length(weekly)){
    weekly[[i]] <- as.data.frame(weekly[[i]])
    weekly[[i]]$week <- row.names(weekly[[i]])
    names(weekly[[i]])[1] <- "weekly.value"
    weekly[[i]]$market <- levels(monthly$market)[i]
  }
  weekly.data <- do.call(rbind,weekly)

That's
it.
Dimitri

On Fri, Jul 9, 2010 at 10:22 AM, Gabor Grothendieck ggrothendi...@gmail.com wrote:
> On Fri, Jul 9, 2010 at 9:35 AM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote:
>> Hello!
>> Any hint would be greatly appreciated. I have a data frame that contains (a) monthly dates and (b) a value that corresponds to each month - see the data frame monthly below:
>>
>>   monthly <- data.frame(month=c(20100301,20100401,20100501), monthly.value=c(100,200,300))
>>   monthly$month <- as.character(monthly$month)
>>   monthly$month <- as.Date(monthly$month,"%Y%m%d")
>>   (monthly)
>>
>> I need to split each month into weeks, e.g., weeks that start on Monday (it could as well be Sunday - it does not really matter) and distribute the monthly value evenly across weeks. So, if a month has 5 Mondays, then the monthly value should be divided by 5, but if a month has only 4 Mondays, then the monthly value should be divided by 4.
>>
>> The output I need is like this:
>>
>>   week        weekly.value
>>   2010-03-01  20
>>   2010-03-08  20
>>   2010-03-15  20
>>   2010-03-22  20
>>   2010-03-29  20
>>   2010-04-05  50
>>   2010-04-12  50
>>   2010-04-19  50
>>   2010-04-26  50
>>   2010-05-03  60
>>   2010-05-10  60
>>   2010-05-17  60
>>   2010-05-24  60
>>   2010-05-31  60
>
> There is new functionality in na.locf in the development version of zoo that makes it particularly convenient to do this. First create a zoo object z from monthly and get a vector of all the mondays. Then use na.locf to place the monthly value in each monday and ave to distribute them out.
>   library(zoo)
>   # pull in development version of na.locf.zoo
>   source("http://r-forge.r-project.org/scm/viewvc.php/*checkout*/pkg/zoo/R/na.locf.R?revision=725&root=zoo")
>
>   # convert to zoo
>   z <- with(monthly, zoo(monthly.value, month))
>
>   # get sequence of all dates and from that get mondays
>   all.dates <- seq(start(z), as.Date(as.yearmon(end(z)), frac = 1), by = "day")
>   mondays <- all.dates[weekdays(all.dates) == "Monday"]
>
>   # use na.locf to fill in mondays and ave to distribute them
>   weeks <- na.locf(z, xout = mondays)
>   weeks[] <- ave(weeks, as.yearmon(mondays), FUN = function(x) x[1]/length(x))
>
>   # show output in a few different formats
>   weeks
>   as.data.frame(weeks)
>   data.frame(Monday = as.Date(time(weeks)), value = weeks)
>   data.frame(Monday = as.Date(time(weeks)), value = weeks, row.names = NULL)
>   plot(weeks)
>
> The output looks like this:
>
>   > weeks
>   2010-03-01 2010-03-08 2010-03-15 2010-03-22 2010-03-29 2010-04-05 2010-04-12
>           20         20         20         20         20         50         50
>   2010-04-19 2010-04-26 2010-05-03 2010-05-10 2010-05-17 2010-05-24 2010-05-31
>           50         50         60         60         60         60         60
>
>   > as.data.frame(weeks)
>              weeks
>   2010-03-01    20
>   2010-03-08    20
>   2010-03-15    20
>   2010-03-22    20
>   2010-03-29    20
>   2010-04-05    50
>   2010-04-12    50
>   2010-04-19    50
>   2010-04-26    50
>   2010-05-03    60
>   2010-05-10    60
>   2010-05-17    60
>   2010-05-24    60
>   2010-05-31    60
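For comparison, the same monthly-to-weekly split can be sketched in base R alone, without zoo. This is not the approach used in the thread, only an illustration of the arithmetic: count the Mondays in each month and divide the monthly value by that count. The helper names (last_day, key, n_per_month) are assumptions, and format(x, "%u") is used so that the Monday test does not depend on the locale.

```r
monthly <- data.frame(
  month         = as.Date(c("2010-03-01", "2010-04-01", "2010-05-01")),
  monthly.value = c(100, 200, 300)
)

# All days from the first month start to the last day of the final month
last_day  <- seq(max(monthly$month), by = "month", length.out = 2)[2] - 1
all.dates <- seq(min(monthly$month), last_day, by = "day")

# Mondays only ("%u" gives 1 for Monday regardless of locale)
mondays <- all.dates[format(all.dates, "%u") == "1"]

# Year-month key for each Monday, and the number of Mondays per month
key         <- format(mondays, "%Y-%m")
n_per_month <- table(key)

# Look up each Monday's monthly value and divide by that month's Monday count
value  <- monthly$monthly.value[match(key, format(monthly$month, "%Y-%m"))]
weekly <- data.frame(week = mondays,
                     weekly.value = value / as.vector(n_per_month[key]))
weekly
```

With several markets, the same steps could be applied per market (for example with split() and lapply()) and the pieces stacked with rbind().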
Re: [R] Accessing files on password-protected FTP sites
Is it possible to download data from password-protected ftp sites? I saw another thread with instructions for uploading files using RCurl, but I could not find information for downloading them in the RCurl documentation. Did you try the ?getURL function in RCurl? See the `Test the passwords` example in the examples on the help page... __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
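A minimal sketch of the getURL() suggestion follows, assuming the RCurl package is installed. The host, path, user name and password shown in the comments are placeholders, and the wrapper names make_userpwd and fetch_ftp_text are made up for this example; only getURL() and its userpwd option come from RCurl.

```r
# libcurl expects credentials as a single "user:password" string
make_userpwd <- function(user, password) paste(user, password, sep = ":")

# Hypothetical wrapper: download a text file from a password-protected server
fetch_ftp_text <- function(url, user, password) {
  if (!requireNamespace("RCurl", quietly = TRUE)) {
    stop("the RCurl package is needed for this sketch")
  }
  RCurl::getURL(url, userpwd = make_userpwd(user, password))
}

# Placeholder usage (not run; the server and credentials do not exist):
# txt <- fetch_ftp_text("ftp://ftp.example.org/pub/data.csv", "myuser", "mypassword")
# dat <- read.csv(text = txt)
```

The downloaded text comes back as one character string, which is why the usage note passes it to read.csv() through the text argument.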
Re: [R] What is the degrees of freedom in an nlme model
If the curves are sufficiently close to sine (cosine) curves and you know the period, then this can be restructured as a linear model and you can avoid all the complexities that come with non-linear models. Further, from your description, it does not sound like you really gain much from using the mixed effects vs. just fixed effects, so this could be reduced to a simple use of lm and anova. Hope this helps, -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of Jun Shen Sent: Monday, July 12, 2010 4:57 PM To: Bert Gunter Cc: R-help Subject: Re: [R] What is the degrees of freedom in an nlme model Hi, Bert, Thanks for your thoughtful explanation. I think the problem is quite over my head and maybe I should leave how for experts :) The situation is I have a group of sigmoid curves (let's say, they are supposed to be the same) but occasionally you will see a few curves kind of different. So how do we say they are actually different or not from the majority curves in a statistical way? The original idea was proposed by Monson and Rodbard in 1978 (Am J Physiol. 1978 Aug;235(2):E97-102). The paper is freely available. The idea is to fit the curves individually and obtain the residual sum of squares and then fit the curves altogether somehow constraining some parameters and then you have another residual sum of squares. Then you can do a F-test. In my case, I wonder if I can use a mixed-effect modeling to do the simultaneous fitting job. Now you see, the problem is the degrees of freedom. Based on your explanation, it seems no reliable calculation of df for nonlinear models. However I can still see the df reported in nlme or nls models. Now I am not even sure if I should use them. Another thing I observed is even I added more random effects to the nlme model, the denominator df did not seem to change. 
Is it correct? Thanks again. Jun On Mon, Jul 12, 2010 at 4:00 PM, Bert Gunter gunter.ber...@gene.com wrote: Jun: Short answer: There is no such thing as df for a nonlinear model (whether or not mixed effects). Longer answer: df is the dimension of the null space when the data are projected on the linear subspace of the model matrix of a **linear model ** . So, strictly speaking, no linear model, no df. HOWEVER... nonlinear models are usually (always??) fit by successive linear approximations, and approximate df are obtained from these approximating subspaces. However, the problem with this is that there is no guarantee that the relevant residual distributions are sufficiently chisq with the approximate df to give reasonable answers. In fact, lots of people much smarter than I have spent lots of time trying to figure out what sorts of approximations one should use to get trustworthy results. The thing is, in nonlinear models, it can DEPEND on the exact form of the model -- indeed, that's what distinguishes nonlinear models from linear ones! So this turns out to be really hard and afaik these smart people don't agree on what should be done. To see what one of the smartest people have to say about this, search the archives for Doug Bates's comments on this w.r.t. lmer (he won't compute such distributions nor provide P values because he doesn't know how to do it reliably. Doug -- please correct me if I have it wrong). A stock way to extricate oneself from this dilemma is: bootstrap! Unfortunately, this is also probably too facile: for one thing, with a nondiagonal covariance matrix (as in mixed effects models), how do you resample to preserve the covariance structure? I believe this is an area of active research in the time series literature, for example. For another, this may be too computationally demanding to be practicable due to convergence issues. Bottom line: there may be no good way to do what you want. 
Note to experts: Please view this post as an invitation to correct my errors and provide authoritative info. Cheers to all, Bert Bert Gunter Genentech Nonclinical Biostatistics -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of Jun Shen Sent: Monday, July 12, 2010 12:34 PM To: R-help Subject: [R] What is the degrees of freedom in an nlme model Dear all, I want to do a F test, which involves calculation of the degrees of freedom for the residuals. Now say, I have a nlme object mod.nlme. I have two questions 1.How do I extract the degrees of freedom? 2.How is this degrees of freedom calculated in an nlme model? Thanks. Jun Shen Some sample code and data = mod.nlme-nlme(RESP~E0+(Emax-
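Greg Snow's suggestion in this thread (treat curves of known period as a linear model) can be sketched as follows. The simulated data, the period of 24 and all variable names are assumptions for illustration; the point is that a sinusoid with known period is linear in its sine and cosine coefficients, so lm() applies and the residual degrees of freedom are well defined.

```r
set.seed(42)
period <- 24                       # assumed known
tt <- seq(0, 72, by = 0.5)
y  <- 10 + 3 * sin(2 * pi * tt / period) +
      1.5 * cos(2 * pi * tt / period) + rnorm(length(tt), sd = 0.2)

# Known period: the sine and cosine terms are ordinary regressors
d <- data.frame(y  = y,
                s1 = sin(2 * pi * tt / period),
                c1 = cos(2 * pi * tt / period))
fit <- lm(y ~ s1 + c1, data = d)

# A*sin(wt) + B*cos(wt) = R*sin(wt + phi) with R = sqrt(A^2 + B^2), phi = atan2(B, A)
amp   <- sqrt(sum(coef(fit)[c("s1", "c1")]^2))
phase <- atan2(coef(fit)["c1"], coef(fit)["s1"])

anova(fit)          # the usual F tests apply
df.residual(fit)    # n minus 3 estimated coefficients
```

This only covers the sinusoidal case Greg describes; it does not address the sigmoid curves of the original question.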
Re: [R] Fast string comparison
Good idea Romain, there is quite a bit of type testing in the function versions of STRING_ELT and CHAR, not to mention the function call overhead. Since the types are checked explicitly, I believe this function is safe. All together now...

  > system.time(strings[-1] == strings[-1e5])
     user  system elapsed
    0.032   0.000   0.035
  > system.time(strcmp(strings[-1], strings[-1e5]))
     user  system elapsed
    0.032   0.000   0.034
  > system.time(strcmp2(strings[-1], strings[-1e5]))
     user  system elapsed
    0.024   0.000   0.026
  > system.time(lhs==rhs)
     user  system elapsed
    0.012   0.000   0.013
  > system.time(strcmp(lhs, rhs))
     user  system elapsed
    0.012   0.000   0.011
  > system.time(strcmp2(lhs, rhs))
     user  system elapsed
    0.004   0.000   0.004

It looks like you can squeeze out more speed using the macro versions of STRING_ELT and CHAR.

On Tue, 2010-07-13 at 09:48 -0400, Romain Francois wrote:
> Hi Matt,
>
> I think there are some confounding factors in your results.
>
> system.time(strcmp(strings[-1], strings[-1e5])) would also include the time required to perform both subscripting operations (strings[-1] and strings[-1e5]), which actually takes some time.
>
> Also, you do have a bit of overhead due to the use of STRING_ELT and the write barrier.
>
> I've included below a version that uses R internals so that you get the fast (but you have to understand the risks, etc ...) version of STRING_ELT, using the plugin system of inline.
>   library(inline)
>   code <- "
>     SEXP ans;
>     int i, len, *cans;
>     if(!isString(s1) || !isString(s2))
>       error(\"invalid arguments\");
>     len = length(s1)>length(s2)?length(s2):length(s1);
>     PROTECT(ans = allocVector(INTSXP, len));
>     cans = INTEGER(ans);
>     for(i = 0; i < len; i++)
>       cans[i] = strcmp(CHAR(STRING_ELT(s1,i)),\
>                        CHAR(STRING_ELT(s2,i)));
>     UNPROTECT(1);
>     return ans;
>   "
>   sig <- signature(s1="character", s2="character")
>   strcmp <- cfunction(sig, code)
>
>   strings <- replicate(1e5, paste(sample(letters, 100, rep = T), collapse = ""))
>   lhs <- strings[-1]
>   rhs <- strings[-1e5]
>   system.time( lhs == rhs )
>   system.time(strcmp( lhs, rhs) )
>
>   library(inline)
>   settings <- getPlugin( "default" )
>   settings$includes <- paste( "#define USE_RINTERNALS", settings$includes, collapse = "\n" )
>   code2 <- "
>     SEXP ans;
>     int i, len, *cans;
>     if(!isString(s1) || !isString(s2))
>       error(\"invalid arguments\");
>     len = length(s1)>length(s2)?length(s2):length(s1);
>     PROTECT(ans = allocVector(INTSXP, len));
>     cans = INTEGER(ans);
>     for(i = 0; i < len; i++)
>       cans[i] = strcmp(CHAR(STRING_ELT(s1,i)),\
>                        CHAR(STRING_ELT(s2,i)));
>     UNPROTECT(1);
>     return ans;
>   "
>   sig <- signature(s1="character", s2="character" )
>   strcmp2 <- cxxfunction(sig, code2, settings = settings)
>   system.time(strcmp2( lhs, rhs) )
>
> I get:
>
>   $ Rscript strings.R
>   Loading required package: methods
>      user  system elapsed
>     0.002   0.000   0.002
>      user  system elapsed
>     0.004   0.000   0.005
>      user  system elapsed
>     0.003   0.000   0.003
>
> Romain
>
> On 13/07/10 15:24, Matt Shotwell wrote:
>> On Tue, 2010-07-13 at 01:42 -0400, Hadley Wickham wrote:
>>>   strings <- replicate(1e5, paste(sample(letters, 100, rep = T), collapse = ""))
>>>   system.time(strings[-1] == strings[-1e5])
>>>   #  user  system elapsed
>>>   # 0.016   0.000   0.017
>>>
>>> So it takes ~1/100 of a second to do ~100,000 string comparisons. You need to provide a reproducible example that illustrates why you think string comparisons are slow.
>>
>> Here's a vectorized alternative to '==' for strings, with minimal argument checking or result conversion.
>> I haven't looked at the corresponding R source code, it may be similar:
>>
>>   library(inline)
>>   code <- "
>>     SEXP ans;
>>     int i, len, *cans;
>>     if(!isString(s1) || !isString(s2))
>>       error(\"invalid arguments\");
>>     len = length(s1)>length(s2)?length(s2):length(s1);
>>     PROTECT(ans = allocVector(INTSXP, len));
>>     cans = INTEGER(ans);
>>     for(i = 0; i < len; i++)
>>       cans[i] = strcmp(CHAR(STRING_ELT(s1,i)),\
>>                        CHAR(STRING_ELT(s2,i)));
>>     UNPROTECT(1);
>>     return ans;
>>   "
>>   sig <- signature(s1="character", s2="character")
>>   strcmp <- cfunction(sig, code)
>>
>>   > system.time(strings[-1] == strings[-1e5])
>>      user  system elapsed
>>     0.036   0.000   0.035
>>   > system.time(strcmp(strings[-1], strings[-1e5]))
>>      user  system elapsed
>>     0.032   0.000   0.034
>>
>> That's pretty fast, though I seem to be working with a slower system than Hadley. It's hard to see how this could be improved, except maybe by caching results of string comparisons.
>>
>> -Matt
>>
>>> Hadley
>>>
>>> On Tue, Jul 13, 2010 at 6:52 AM, Ralf B ralf.bie...@gmail.com wrote:
>>>> I am asking this question because String comparison in R seems to be awfully slow (based on
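Romain's point about subscripting being included in the timing can be checked without any compiled code. The sketch below is an illustration only: it uses a smaller vector than in the thread (1e4 strings, an arbitrary choice) and repeats each expression 20 times so the timings are less noisy. No particular timings are claimed, since they depend on the machine.

```r
set.seed(1)
n <- 1e4
strings <- replicate(n, paste(sample(letters, 100, replace = TRUE), collapse = ""))

# Subscript once, outside the timed region
lhs <- strings[-1]
rhs <- strings[-n]

# Timing that includes the two subscripting operations on every repetition
t_with_subset  <- system.time(for (k in 1:20) strings[-1] == strings[-n])

# Timing of the comparison alone
t_compare_only <- system.time(for (k in 1:20) lhs == rhs)

res <- lhs == rhs
rbind(with_subset = t_with_subset[1:3], compare_only = t_compare_only[1:3])
```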
Re: [R] How to select the column header with \Sexpr{}
Hi Felipe, See in line below. On Tue, Jul 13, 2010 at 11:04 AM, Felipe Carrillo mazatlanmex...@yahoo.com wrote: Thanks Izta: I see your point, then I should extract the column names when the dataset is first read because is a dataframe: That might work, but it's definitely not how I would do it. report - structure(list(Date = c(3/12/2010, 3/13/2010, 3/14/2010, 3/15/2010), Run1 = c(33 (119 ? 119), n (0 ? 0), 893 (110 ? 146), 140 (111 ? 150)), Run2 = c(33 (71 ? 71), n (0 ? 0), 337 (67 ? 74), 140 (68 ? 84)), Run3 = c(890 (32 ? 47), n (0 ? 0), 10,602 (32 ? 52), 2,635 (34 ? 66)), Run4 = c(0 ( ? ), n (0 ? 0), 0 ( ? ), 0 ( ? )), Run4 = c(0 ( ? ), n (0 ? 0), 0 ( ? ), 0 ( ? ))), .Names = c(ID_Date, Run1, Run2, Run3, Run4, Run5), row.names = c(NA, 4L), class = data.frame) str(report) 'data.frame': 4 obs. of 6 variables: $ ID_Date: chr 3/12/2010 3/13/2010 3/14/2010 3/15/2010 $ Run1 : chr 33 (119 ? 119) n (0 ? 0) 893 (110 ? 146) 140 (111 ? 150) $ Run2 : chr 33 (71 ? 71) n (0 ? 0) 337 (67 ? 74) 140 (68 ? 84) $ Run3 : chr 890 (32 ? 47) n (0 ? 0) 10,602 (32 ? 52) 2,635 (34 ? 66) $ Run4 : chr 0 ( ? ) n (0 ? 0) 0 ( ? ) 0 ( ? ) $ Run5 : chr 0 ( ? ) n (0 ? 0) 0 ( ? ) 0 ( ? ) names(report)[1] # I can extract the column name here [1] Date But after I use 'stringr to convert the character '?' to '-' 'report' is not a dataframe anymore and returns a NULL when trying to extract the column names. No, it will not report NULL when extracting _column names_. Try colnames(report). It will report NULL when trying to extract the _names_ using names(report), because matrices have colnames and rownames but not names. I was not aware that \Sexpr{} only work on dataframes, thanks for your help. The problem is _not with \Sexpr_. The problem is that you are asking for the names() of a matrix, which do not exist in R. 
You can use colnames() like this \Sexpr{colnames(report)[1]} or you can convert report to a data.frame and use names, like this \Sexpr{names(as.data.frame(report))[1]} HTH, Ista - Original Message From: Ista Zahn iz...@psych.rochester.edu To: Felipe Carrillo mazatlanmex...@yahoo.com Cc: David Winsemius dwinsem...@comcast.net; r-h...@stat.math.ethz.ch Sent: Tue, July 13, 2010 7:13:39 AM Subject: Re: [R] How to select the column header with \Sexpr{} Hi Felipe, The problem has nothing to do with Sweave or \Sexpr. The problem is that by the time you call \Sexpr report is a matrix, and you cannot access the column names of a matrix with names(). You need to use colnames() or convert the matrix to a data.frame. Perhaps a true useR can write R code in a Sweave file without checking it, but for mere mortals it is best to evaluate the R code in an interactive session to make sure it works before asking Sweave to insert it into your .tex file. If you had tried to evaluate names(report)[1] in an interactive session you would have discovered your problem immediately. Best, Ista On Tue, Jul 13, 2010 at 4:15 AM, Felipe Carrillo mazatlanmex...@yahoo.com wrote: I had tried that earlier and didn't work either, I probably have \Sexpr in the wrong place. See example: Column one header gets blank: \documentclass[11pt]{article} \usepackage{longtable,verbatim,ctable} \usepackage{longtable,pdflscape} \usepackage{fmtcount,hyperref} \usepackage{fullpage} \title{United States} \begin{document} \setkeys{Gin}{width=1\textwidth} \maketitle echo=F,results=hide= report - structure(list(Date = c(3/12/2010, 3/13/2010, 3/14/2010, 3/15/2010), Run1 = c(33 (119 ? 119), n (0 ? 0), 893 (110 ? 146), 140 (111 ? 150)), Run2 = c(33 (71 ? 71), n (0 ? 0), 337 (67 ? 74), 140 (68 ? 84)), Run3 = c(890 (32 ? 47), n (0 ? 0), 10,602 (32 ? 52), 2,635 (34 ? 66)), Run4 = c(0 ( ? ), n (0 ? 0), 0 ( ? ), 0 ( ? )), Run4 = c(0 ( ? ), n (0 ? 0), 0 ( ? ), 0 ( ? 
))), .Names = c(ID_Date, Run1, Run2, Run3, Run4, Run5), row.names = c(NA, 4L), class = data.frame) require(stringr) report - t(apply(report, 1, function(x) {str_replace(x, \\?, -)})) #report #latex(report,file=) @ \begin{landscape} \begin{table}[!tbp] \begin{center} \begin{tabular}{ll}\hline\hline \multicolumn{1}{c}{\Sexpr{names(report)[1]}} # Using \Sexpr here \multicolumn{1}{c}{Run1} \multicolumn{1}{c}{Run2} \multicolumn{1}{c}{Run3} \multicolumn{1}{c}{Run4} \multicolumn{1}{c}{Run5}\tabularnewline \hline 13/12/201033 (119 ? 119)33 (71 ? 71)890 (32 ? 47)0 ( ? )0 ( ? )\tabularnewline 23/13/2010n (0 ? 0)n (0 ? 0)n (0 ? 0)n (0 ? 0)n (0 ? 0)\tabularnewline 33/14/2010893 (110 ? 146)337 (67 ? 74)10,602 (32 ? 52)0 ( ? )0 ( ? )\tabularnewline 43/15/2010140 (111 ? 150)140 (68 ? 84)2,635 (34 ? 66)0 ( ? )0 ( ? )\tabularnewline \hline \end{tabular} \end{center} \end{table} \end{landscape} \end{document} Felipe D. Carrillo Supervisory Fishery Biologist Department of the Interior US Fish Wildlife Service
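Ista's explanation (names() works on data frames, colnames() on matrices) can be verified interactively with a small stand-in for report. The two-column data frame below is made up for brevity; as.matrix() is used in place of the t(apply(...)) step only to obtain a character matrix.

```r
# Small stand-in for the 'report' data frame in the thread
report_df <- data.frame(ID_Date = c("3/12/2010", "3/13/2010"),
                        Run1    = c("33 (119 ? 119)", "n (0 ? 0)"),
                        stringsAsFactors = FALSE)

names(report_df)[1]      # works: a data frame is a list of columns

# Once the object is a matrix, names() no longer returns column names
report_mat <- as.matrix(report_df)
names(report_mat)        # NULL
colnames(report_mat)[1]  # works on a matrix

# Converting back to a data frame makes names() usable again
names(as.data.frame(report_mat))[1]
```

Evaluating these lines in an interactive session before placing the expression inside \Sexpr{} shows immediately which accessor returns the header.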
Re: [R] How can i draw a graph with high and low data points
There are several functions in several packages for plotting intervals that will give you plots much better than the excel one. The RSiteSearch function or the sos package may help you find those. But it is also easy to create such plots using just a few lines of R code and base graphics. Read the help pages for plot.default (look at the ylim argument) and the segments function (the order and seq functions may also be of use). -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of Nathaniel Saxe Sent: Tuesday, July 13, 2010 2:36 AM To: r-help@r-project.org Subject: Re: [R] How can i draw a graph with high and low data points I have 5 columns- Trial.Group, Mean, Standard Deviation, Upper percentile, Lower percentile. Trial.Group 41 subjects: 3 to 4 yrs-Male Mean 444 SD 25 upper 494 lower 393 and all the data is like that. and i wish to recreate this excel table. http://r.789695.n4.nabble.com/file/n2287158/untitled.GIF untitled.GIF problem with my code- doesn't put Trial.Group on the x axis Thanks for the help -- View this message in context: http://r.789695.n4.nabble.com/How-can-i- draw-a-graph-with-high-and-low-data-points-tp2282524p2287158.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
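The base graphics route Greg describes (plot with ylim, then segments) might look like the sketch below. Only the first row of data comes from the thread; the other two rows are invented so the plot has more than one interval, and the function name plot_intervals is an assumption. The group labels are put on the x axis with axis(), which addresses the problem the poster mentions.

```r
d <- data.frame(
  Trial.Group = c("3 to 4 yrs-Male",      # values from the thread
                  "3 to 4 yrs-Female",    # hypothetical
                  "5 to 6 yrs-Male"),     # hypothetical
  Mean  = c(444, 430, 470),
  lower = c(393, 385, 420),
  upper = c(494, 480, 525)
)

plot_intervals <- function(d) {
  x <- seq_len(nrow(d))
  # Points for the means; ylim is widened to cover every interval
  plot(x, d$Mean, ylim = range(d$lower, d$upper),
       xlim = c(0.5, nrow(d) + 0.5), xaxt = "n", pch = 19,
       xlab = "Trial group", ylab = "Value")
  # One vertical segment per group, from lower to upper percentile
  segments(x, d$lower, x, d$upper)
  # Group names as x-axis labels
  axis(1, at = x, labels = d$Trial.Group)
  invisible(x)
}

plot_intervals(d)
```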
Re: [R] StartsWith over vector of Strings?
content[na.omit(pmatch(searchset, content,,TRUE))]

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Ralf B
> Sent: Tuesday, July 13, 2010 5:47 AM
> To: r-help@r-project.org
> Subject: [R] StartsWith over vector of Strings?
>
> Given vectors of strings of arbitrary length
>
>   content <- c("abc", "def")
>   searchset <- c("a", "abc", "abcdef", "d", "def", "defghi")
>
> Is it possible to determine the content string set that matches the searchset in the sense of 'startswith'? This would be a vector of all strings in content that start with any of the strings in the searchset. In the little example here, this would be:
>
>   result <- c("abc", "abc", "def", "def")
>
> Best,
> Ralf
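For comparison, the sketch below runs the pmatch() one-liner next to an alternative built on startsWith(), which exists in base R from version 3.3.0 and therefore was not available when this thread was written. The variable names are assumptions. One difference worth knowing: pmatch() returns NA when a search string partially matches more than one element of content, whereas the startsWith() version returns every match.

```r
content   <- c("abc", "def")
searchset <- c("a", "abc", "abcdef", "d", "def", "defghi")

# The answer from the thread, with the arguments named
via_pmatch <- content[na.omit(pmatch(searchset, content, duplicates.ok = TRUE))]

# Alternative: logical matrix, hits[i, j] is TRUE when content[i] starts with searchset[j]
hits <- outer(content, searchset, startsWith)
via_startsWith <- content[row(hits)[hits]]

via_pmatch
via_startsWith
```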
Re: [R] What is the degrees of freedom in an nlme model
... Aug;235(2):E97-102). The paper is freely available. The idea is to fit the curves individually and obtain the residual sum of squares and then fit the curves altogether somehow constraining some parameters and then you have another residual sum of squares. Then you can do a F-test. -- No you can't. This paper was apparently written by someone who doesn't sufficiently understand the statistical issues. This is not uncommon -- even papers in statistical journals sometimes get it wrong. In my case, I wonder if I can use a mixed-effect modeling to do the simultaneous fitting job. -- You need to consult with your local statistician. This forum is not the appropriate venue for difficult statistical questions that require intimate familiarity with the data and an understanding of the scientific questions of interest. -- Bert Gunter Genentech Nonclinical Statistics __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Generate groups with random size but given total sample size
For one definition of random:

  ss <- rexp(100)
  ss <- ss/sum(ss)
  ss <- 5 + round( ss*9500 )
  cnt <- 0
  while( ( d <- sum(ss) - 10000 ) != 0 ) {
    tmpid <- sample.int(100,1)
    ss[tmpid] <- ss[tmpid] - d
    ss[ ss > 500 ] <- 500
    ss[ ss < 5 ] <- 5
    cnt <- cnt + 1
    if (cnt > 100) {
      cat('problems finding a solution, stopping after 100 iterations\n')
      break
    }
  }
  group <- rep( 1:100, ss )

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Arne Schulz
> Sent: Tuesday, July 13, 2010 7:10 AM
> To: r-help@r-project.org
> Subject: [R] Generate groups with random size but given total sample size
>
> Dear list,
> I am currently doing some simulation studies where I want to compare different scenarios. In particular, two scenarios should be compared: 10.000 cases in 100 groups with 100 cases per group, and 10.000 cases in 100 groups with random group size (ranging from 5 to 500).
>
> The first part is no problem:
>
>   id <- seq(1,10000)
>   group <- sort(rep(seq(1,100),100))
>
> But I don't get along with the second scenario. Using sample does give me 100 groups with random sizes, but generates more than 10.000 cases:
>
>   > set.seed(13)
>   > sum(sample(5:500, 100))
>   [1] 24583
>
> Another way could be generating one sample at a time and summing the cases. But this would end up in trial and error to fit the 10.000 cases. Maybe it would break rules of probability, too.
>
> I'm convinced that there should be another (and even better) way to handle this problem in R... :-)
>
> Best regards,
> Arne Schulz
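A different definition of random, sketched here as an alternative and not as the method from the reply above: give every group its minimum of 5 cases, split the remaining 9500 with a single multinomial draw (so the total is exact by construction), and redraw if any group exceeds 500. The function name random_group_sizes and its arguments are assumptions for illustration.

```r
random_group_sizes <- function(n_groups = 100, total = 10000,
                               min_size = 5, max_size = 500, max_tries = 1000) {
  stopifnot(n_groups * min_size <= total, n_groups * max_size >= total)
  for (k in seq_len(max_tries)) {
    p <- rexp(n_groups)   # random, unequal group weights
    # Everyone gets min_size; the rest is allocated in one multinomial draw
    sizes <- min_size + as.vector(rmultinom(1, total - n_groups * min_size, p))
    if (all(sizes <= max_size)) return(sizes)   # accept only if the cap holds
  }
  stop("no admissible allocation found in ", max_tries, " tries")
}

set.seed(13)
ss    <- random_group_sizes()
group <- rep(seq_along(ss), ss)

sum(ss)      # total number of cases
range(ss)    # smallest and largest group
```

The accept/redraw step changes the distribution of sizes slightly compared with an uncapped draw, which may or may not matter for a given simulation.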
Re: [R] MplusAutomation
Create a shortcut that targets your batch file, edit its properties so that it opens minimized, and call the shortcut from R rather than calling the batch file directly.

Gushta, Matthew mgus...@air.org wrote:
> R list-
>
> i have begun using the MplusAutomation while piloting a large-scale simulation (~200,000 replications). since the package takes advantage of the DOS batch mode available in Mplus, each replication starts and activates a new instance of a command prompt window. this effectively locks me out of my computer for the duration of the simulation.
>
> my question is this: can anyone suggest how i might pass the quiet command to the DOS program? is there a way to generally specify this from R? or any specific recommendations/experience with this package?
>
> thanks,
> --matthew
>
> matthew m. gushta
> american institutes for research
> 202.403.5079

---
Jeff Newmiller  DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.
Re: [R] distributing a value for a given month across the number of weeks in that month
On Tue, Jul 13, 2010 at 11:19 AM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote:
> Actually, I realized that my task was a bit more complicated, as I have different (let's call them) Markets and the dates repeat themselves across markets. The original code from Gabor gives an error because the dates repeat themselves and apparently zoo cannot handle that. So I had to program a way around it (below). It works. However, I am wondering if there is a shorter/more elegant way of doing it?
> Thank you!
> Dimitri
>
>   ### My original data frame is a bit more complicated - dates repeat themselves for 2 markets:
>   monthly <- data.frame(month=c(20100301,20100401,20100501,20100301,20100401,20100501), monthly.value=c(100,200,300,10,20,30), market=c("Market A","Market A","Market A","Market B","Market B","Market B"))
>   monthly$month <- as.character(monthly$month)
>   monthly$month <- as.Date(monthly$month,"%Y%m%d")
>   (monthly)

Assuming the dates for each market are the same, we split them into a zoo object with one market per column and, following the same approach as last time, we use by in place of ave. The lines marked ## are the same as last time. Be sure you are using zoo 1.6-4 from CRAN, since this makes use of the na.locf features added in that version.

  > z <- read.zoo(monthly, split = "market")
  > all.dates <- seq(start(z), as.Date(as.yearmon(end(z)), frac = 1), by = "day") ##
  > mondays <- all.dates[weekdays(all.dates) == "Monday"] ##
  > weeks <- na.locf(z, xout = mondays) ##
  > do.call(rbind, by(weeks, as.yearmon(mondays),
  +   function(x) zoo(x/nrow(x), rownames(x))))
             Market.A Market.B
  2010-03-01       20        2
  2010-03-08       20        2
  2010-03-15       20        2
  2010-03-22       20        2
  2010-03-29       20        2
  2010-04-05       50        5
  2010-04-12       50        5
  2010-04-19       50        5
  2010-04-26       50        5
  2010-05-03       60        6
  2010-05-10       60        6
  2010-05-17       60        6
  2010-05-24       60        6
  2010-05-31       60        6
[R] Time Variable and Historical Interest Rates
Guys,

I wrote to the finance mailing list earlier with my questions but was directed here. Sorry for the repeat.

---
library(quantmod)
now <- Sys.time()
midnight <- strptime() # I want to make this a static variable equal to 12:00:00 am, but I don't know what to put here. I keep getting NA for everything I try.
if (now == midnight) {
  getFX("EUR/USD", from = Sys.Date() - 1, to = Sys.Date() - 1)
  write.table(EURUSD, "~Documents/stat arb/project/eurusd.csv", append = TRUE, row.names = FALSE, col.names = FALSE)
}
---

Also, append is ignored when I use write.csv, so I had to resort to write.table. Is this always the case?

As for the historical interest rates, thank you all very much for providing me with the information (Finance mailing list). I used the fImport package and called fredSeries to download DPRIME data for the same time frame as the currency data I have (thank you, Mr. Gallon). But that is only data for the US. What about other countries?

I was talking to a professor and he said that there is a way to read data from a website into R if you know the URL. Would this help in getting the interest rates of other countries? (I believe the function is aptly named url.) Could someone provide an example, please?

All help is very much appreciated.

Sincerely,
Aaditya Nanduri
Re: [R] Calculate confidence interval of the mean based on ANOVA
Paul wrote:
> I am trying to recreate an analysis that has been done by another group (in SAS, I believe). I'm stuck on one part, I think because my stats knowledge is lacking, and while it's OT, I'm hoping someone here can help. Given this dataframe; <snip>

Well, that will teach me to read the question! The previous analysis stated (quite clearly) that they calculated confidence intervals using (number of runs - 1) degrees of freedom, so doing my t quantile over 5 df instead of 17 produced the right answer.

Paul.
[R] hclust information in a table
I am trying to show the result of a cluster analysis (hclust, method = "ward") in a table with the following information:

first column: height
second column: number of clusters
third column: clustering information

0,041 | 20 | (3)-(5)
0,111 | 19 | (6)-(11)
0,211 | 18 | (3,5)-(9)
0,402 | 17 | (6,11)-(16)
...

Is there any function or code to do this?

--
Kind regards
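One possible way to build such a table, offered only as a sketch: an hclust object stores the merge order in $merge and the merge heights in $height, so the three columns can be assembled from those. The example data (21 rows of the built-in USArrests set), the helper side() and the names members, info and tab are assumptions made for the illustration. Current R versions call the method "ward.D" rather than "ward".

```r
# Example data: 21 standardised rows, so the first merge leaves 20 clusters.
x  <- scale(USArrests[1:21, ])
hc <- hclust(dist(x), method = "ward.D")

n <- nrow(hc$merge) + 1            # number of observations

# In hc$merge a negative entry is a single observation and a positive entry
# refers to the cluster that was formed at that earlier step.
members <- vector("list", n - 1)
side <- function(k) if (k < 0) -k else members[[k]]

info <- character(n - 1)
for (i in seq_len(n - 1)) {
  a <- side(hc$merge[i, 1])
  b <- side(hc$merge[i, 2])
  members[[i]] <- c(a, b)          # remember the members of the new cluster
  info[i] <- sprintf("(%s)-(%s)", paste(a, collapse = ","), paste(b, collapse = ","))
}

# After merge i there are n - i clusters left.
tab <- data.frame(height = hc$height, clusters = (n - 1):1, clustering = info)
print(tab)
```

The observation numbers in the third column are row positions in the input; hc$labels could be substituted if names are preferred.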
Re: [R] distributing a value for a given month across the number of weeks in that month
Thank you very much, Gabor!
Dimitri

On Tue, Jul 13, 2010 at 12:25 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote:
> On Tue, Jul 13, 2010 at 11:19 AM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote:
> > Actually, I realized that my task was a bit more complicated, as I have different (let's call them) Markets and the dates repeat themselves across markets. The original code from Gabor gives an error because the dates repeat themselves and apparently zoo cannot handle that. So I had to program a way around it (below). It works. However, I am wondering if there is a shorter/more elegant way of doing it?
> > Thank you!
> > Dimitri
> >
> > ### My original data frame is a bit more complicated - dates repeat themselves for 2 markets:
> > monthly <- data.frame(month = c(20100301, 20100401, 20100501, 20100301, 20100401, 20100501), monthly.value = c(100, 200, 300, 10, 20, 30), market = c("Market A", "Market A", "Market A", "Market B", "Market B", "Market B"))
> > monthly$month <- as.character(monthly$month)
> > monthly$month <- as.Date(monthly$month, "%Y%m%d")
> > (monthly)
>
> Assuming the dates for each market are the same, we split them into a zoo object with one market per column and, following the same approach as last time, use by in place of ave. The lines marked ## are the same as last time. Be sure you are using zoo 1.6-4 from CRAN, since this makes use of the na.locf features added in that version.
>
>   z <- read.zoo(monthly, split = "market")
>   all.dates <- seq(start(z), as.Date(as.yearmon(end(z)), frac = 1), by = "day") ##
>   mondays <- all.dates[weekdays(all.dates) == "Monday"] ##
>   weeks <- na.locf(z, xout = mondays) ##
>   do.call(rbind, by(weeks, as.yearmon(mondays),
>       function(x) zoo(x/nrow(x), rownames(x))))
>
>              Market.A Market.B
>   2010-03-01       20        2
>   2010-03-08       20        2
>   2010-03-15       20        2
>   2010-03-22       20        2
>   2010-03-29       20        2
>   2010-04-05       50        5
>   2010-04-12       50        5
>   2010-04-19       50        5
>   2010-04-26       50        5
>   2010-05-03       60        6
>   2010-05-10       60        6
>   2010-05-17       60        6
>   2010-05-24       60        6
>   2010-05-31       60        6

--
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com
Re: [R] SAS Proc summary/means as a R function
Hello,

are you trying to parse SAS code (or lightly modified SAS code) and run it in R? Then you are right: the hard part is parsing the code. I don't believe that's possible without a custom parser, and even then it's really hard to parse all the SAS sub-languages correctly: data step, macro code and macro variables, IML, SAS procedures, etc.

On Tuesday 13 July 2010 02:39:22 pm Roger Deangelis wrote:
> Thanks Richard and Erik,
>
> I hate to buy the book and not find the solution to the following:
>
> proc.means <- function() { deparse(match.call()[-1]) }
> proc.means(this is a sentence)
> unexpected symbol in "proc.means(this is"
>
> One possible solution would be to 'peek' into the memory buffer that holds the function arguments.
>
> It is easy to replicate the 'dataset' output for many SAS procs (i.e. transpose, freq, summary, means...). I am not interested in 'report writing in R'.
>
> The hard part is parsing the SAS syntax. I wish R had a drop-down to Perl:
>
> perl on; some perl code perl off;
>
> also
>
> sas on; some SAS code sas off;
>
> The purpose of parmbuff is to turn off R's scanning and resolution of function arguments and just provide the bare text between '(' and ')' in the function call. This is a very powerful construct. A function would provide something like sas.on( )

--
Friedrich Schuster
Dompfaffenweg 6
69123 Heidelberg
[R] Wrap column headers caption
Hi:

Using this data frame with quite long column headers, how can I wrap the text so that the columns are narrower? I was trying to use strwrap without success. Thanks

reportDF <- structure(list(
  IDDate = c("3/12/2010", "3/13/2010", "3/14/2010", "3/15/2010"),
  FirstRunoftheYear = c("33 (119 ? 119)", "n (0 ? 0)", "893 (110 ? 146)", "140 (111 ? 150)"),
  SecondRunoftheYear = c("33 (71 ? 71)", "n (0 ? 0)", "337 (67 ? 74)", "140 (68 ? 84)"),
  ThirdRunoftheYear = c("890 (32 ? 47)", "n (0 ? 0)", "10,602 (32 ? 52)", "2,635 (34 ? 66)"),
  FourthRunoftheYear = c("0 ( ? )", "n (0 ? 0)", "0 ( ? )", "0 ( ? )"),
  LastRunoftheYear = c("0 ( ? )", "n (0 ? 0)", "0 ( ? )", "0 ( ? )")),
  .Names = c("IDDate", "First Run of the Year", "Second Run of the Year", "Third Run of the Year", "Fourth Run of the Year", "Last Run of the Year"),
  row.names = c(NA, 4L), class = "data.frame")

Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish & Wildlife Service
California, USA
Re: [R] Calculate confidence interval of the mean based on ANOVA
N_runs - 1 seems a bit of an odd df to choose to calculate the CI for a mean. To answer your question, I think that t.test() is the easiest way to get a CI in R. That said, you can use the MS_residuals from the ANOVA to take advantage of the variance calculated within groups and pooled. Something like:

foo <- structure(list(
  OBS = structure(1:18, .Label = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26", "27", "28", "29", "30", "31", "32", "33", "34", "35", "36", "37", "38", "39", "40", "41", "42", "43", "44", "45", "46", "47", "48", "49", "50", "51", "52", "53", "54"), class = "factor"),
  NOM = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("0.05", "0.1", "1"), class = "factor"),
  RUN = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L), .Label = c("1", "2", "3", "4", "5", "6"), class = "factor"),
  CALC = c(0.04989, 0.04872, 0.04544, 0.05645, 0.06516, 0.0622, 0.04868, 0.05006, 0.04746, 0.05574, 0.04442, 0.04742, 0.05508, 0.0593, 0.04898, 0.06373, 0.05537, 0.04674)),
  .Names = c("OBS", "NOM", "RUN", "CALC"), row.names = c(NA, 18L), class = "data.frame")

foo.aov <- aov(CALC ~ RUN, data = foo)
sdpooled.calc <- sqrt(anova(foo.aov)["Residuals", "Mean Sq"])
mcalc <- mean(foo$CALC)
ncalc <- length(foo$CALC)
t.crit <- qt(p = .05/2, df = 12, lower.tail = FALSE)

# then, if memory serves, the CI for the mean is
mcalc - ((t.crit * sdpooled.calc)/sqrt(ncalc))
mcalc + ((t.crit * sdpooled.calc)/sqrt(ncalc))

# rm(foo, foo.aov, sdpooled.calc, mcalc, ncalc, t.crit)

By the way, it helps if you send plain-text emails rather than HTML.

Best regards,
Josh

On Tue, Jul 13, 2010 at 10:14 AM, Paul p...@paulhurley.co.uk wrote:
> Paul wrote:
> > I am trying to recreate an analysis that has been done by another group (in SAS, I believe). I'm stuck on one part, I think because my stats knowledge is lacking, and while it's OT, I'm hoping someone here can help. Given this dataframe; <snip>
>
> Well, that will teach me to read the question! The previous analysis stated (quite clearly) that they calculated confidence intervals using (number of runs - 1) degrees of freedom, so doing my t quantile over 5 df instead of 17 produced the right answer.
>
> Paul.

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
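To make the role of the degrees of freedom concrete, here is a small self-contained sketch built on the CALC values posted in this thread. It follows the pooled-SD formula from Josh's message and only varies the df passed to qt(); the helper ci.for.df() and the variable names are hypothetical, and the sketch makes no claim about how the original SAS analysis computed its standard error.

```r
# CALC values from the thread: six runs with three observations each.
calc <- c(0.04989, 0.04872, 0.04544, 0.05645, 0.06516, 0.0622,
          0.04868, 0.05006, 0.04746, 0.05574, 0.04442, 0.04742,
          0.05508, 0.0593, 0.04898, 0.06373, 0.05537, 0.04674)
run  <- factor(rep(1:6, each = 3))

fit      <- aov(calc ~ run)
df.resid <- df.residual(fit)                          # 18 observations - 6 runs
sd.pool  <- sqrt(sum(residuals(fit)^2) / df.resid)    # sqrt of the residual mean square

m <- mean(calc)
n <- length(calc)

# Two-sided 95% interval for a given df; only the t quantile changes.
ci.for.df <- function(df) m + c(-1, 1) * qt(0.975, df) * sd.pool / sqrt(n)

# Rows: runs - 1, residual df, observations - 1.
ci <- t(sapply(c(df5 = 5, df12 = 12, df17 = 17), ci.for.df))
colnames(ci) <- c("lower", "upper")
print(ci)
```

The smaller the df, the larger the t quantile and therefore the wider the interval, which is why switching from 17 to 5 df changes the result noticeably.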
Re: [R] SAS Proc summary/means as a R function
What is the original intent? The bandwidth:productivity ratio is not looking encouraging for this problem.

Frank

On 07/13/2010 12:38 PM, schuster wrote:
> Hello,
>
> are you trying to parse SAS code (or lightly modified SAS code) and run it in R? Then you are right: the hard part is parsing the code. I don't believe that's possible without a custom parser, and even then it's really hard to parse all the SAS sub-languages correctly: data step, macro code and macro variables, IML, SAS procedures, etc.
>
> On Tuesday 13 July 2010 02:39:22 pm Roger Deangelis wrote:
> > Thanks Richard and Erik,
> >
> > I hate to buy the book and not find the solution to the following:
> >
> > proc.means <- function() { deparse(match.call()[-1]) }
> > proc.means(this is a sentence)
> > unexpected symbol in "proc.means(this is"
> >
> > One possible solution would be to 'peek' into the memory buffer that holds the function arguments.
> >
> > It is easy to replicate the 'dataset' output for many SAS procs (i.e. transpose, freq, summary, means...). I am not interested in 'report writing in R'.
> >
> > The hard part is parsing the SAS syntax. I wish R had a drop-down to Perl:
> >
> > perl on; some perl code perl off;
> >
> > also
> >
> > sas on; some SAS code sas off;
> >
> > The purpose of parmbuff is to turn off R's scanning and resolution of function arguments and just provide the bare text between '(' and ')' in the function call. This is a very powerful construct. A function would provide something like sas.on( )

--
Frank E Harrell Jr
Professor and Chairman, School of Medicine
Department of Biostatistics
Vanderbilt University
Re: [R] Wrap column headers caption
You can try this:

sapply(names(reportDF), toString, width = 10)
abbreviate(names(reportDF))

On Tue, Jul 13, 2010 at 2:43 PM, Felipe Carrillo mazatlanmex...@yahoo.com wrote:
> Hi:
>
> Using this data frame with quite long column headers, how can I wrap the text so that the columns are narrower? I was trying to use strwrap without success. Thanks
>
> reportDF <- structure(list(
>   IDDate = c("3/12/2010", "3/13/2010", "3/14/2010", "3/15/2010"),
>   FirstRunoftheYear = c("33 (119 ? 119)", "n (0 ? 0)", "893 (110 ? 146)", "140 (111 ? 150)"),
>   SecondRunoftheYear = c("33 (71 ? 71)", "n (0 ? 0)", "337 (67 ? 74)", "140 (68 ? 84)"),
>   ThirdRunoftheYear = c("890 (32 ? 47)", "n (0 ? 0)", "10,602 (32 ? 52)", "2,635 (34 ? 66)"),
>   FourthRunoftheYear = c("0 ( ? )", "n (0 ? 0)", "0 ( ? )", "0 ( ? )"),
>   LastRunoftheYear = c("0 ( ? )", "n (0 ? 0)", "0 ( ? )", "0 ( ? )")),
>   .Names = c("IDDate", "First Run of the Year", "Second Run of the Year", "Third Run of the Year", "Fourth Run of the Year", "Last Run of the Year"),
>   row.names = c(NA, 4L), class = "data.frame")
>
> Felipe D. Carrillo
> Supervisory Fishery Biologist
> Department of the Interior
> US Fish & Wildlife Service
> California, USA

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
Re: [R] [R-pkgs] New package "list" for analyzing list survey experiments
I agree that 'list' is a terrible package name, but only secondarily because it is a data type. The primary problem is that it is so generic as to be almost totally uninformative about what the package does.

For some reason package writers seem to prefer maximally uninformative names for their packages. To take some examples of recently announced packages, can anyone guess what packages 'FDTH', 'rtv', or 'lavaan' do? Why the aversion to informative names along the lines of 'Freq_dist_and_histogram', 'RandomTimeVariables', and 'Latent_Variable_Analysis', respectively?

R. Raubertas

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jeffrey J. Hallman
Sent: Monday, July 12, 2010 10:09 AM
To: r-h...@stat.math.ethz.ch
Subject: Re: [R] [R-pkgs] New package "list" for analyzing list survey experiments

> I know nothing about your package, but "list" is a terrible name for it, as "list" is also the name of a data type in R.
>
> --
> Jeff
Re: [R] Wrap column headers caption
> Using this data frame with quite long column headers, how can I wrap the text so that the columns are narrower? I was trying to use strwrap without success. Thanks
>
> reportDF <- structure(list(
>   IDDate = c("3/12/2010", "3/13/2010", "3/14/2010", "3/15/2010"),
>   FirstRunoftheYear = c("33 (119 ? 119)", "n (0 ? 0)", "893 (110 ? 146)", "140 (111 ? 150)"),
>   SecondRunoftheYear = c("33 (71 ? 71)", "n (0 ? 0)", "337 (67 ? 74)", "140 (68 ? 84)"),
>   ThirdRunoftheYear = c("890 (32 ? 47)", "n (0 ? 0)", "10,602 (32 ? 52)", "2,635 (34 ? 66)"),
>   FourthRunoftheYear = c("0 ( ? )", "n (0 ? 0)", "0 ( ? )", "0 ( ? )"),
>   LastRunoftheYear = c("0 ( ? )", "n (0 ? 0)", "0 ( ? )", "0 ( ? )")),
>   .Names = c("IDDate", "First Run of the Year", "Second Run of the Year", "Third Run of the Year", "Fourth Run of the Year", "Last Run of the Year"),
>   row.names = c(NA, 4L), class = "data.frame")

I could be wrong here, but I don't think there's a way to do that with print.data.frame as it is currently defined. You might find the print.gap argument of some use; it ultimately gets passed down to print.default and will affect the output.

I can't think of a way to do this myself; hopefully someone else will have an idea.
Re: [R] StartsWith over vector of Strings?
When running the combined code with your suggested line:

content <- data.frame(urls = c(
  "http://www.google.com/search?source=ighl=enrlz==q=stuffaq=faqi=g10aql=oq=gs_rfai=CrrIS3VU8TJqcMJHuzASm9qyBBgAAAKoEBU_QsmVh",
  "http://search.yahoo.com/search;_ylt=Atvki9MVpnxuEcPmXLEWgMqbvZx4?p=stufftoggle=1cop=mssei=UTF-8fr=yfp-t-701"))
searchset <- data.frame(signatures = c("http://www.google.com/search"))
content[na.omit(pmatch(searchset, content$urls))]
print(content)

I am getting both URLs as results, but would in fact expect only the first URL. Am I overlooking something?

Ralf

On Tue, Jul 13, 2010 at 12:03 PM, Greg Snow greg.s...@imail.org wrote:
> content[na.omit(pmatch(searchset, content,,TRUE))]
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.s...@imail.org
> 801.408.8111
>
> > -----Original Message-----
> > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Ralf B
> > Sent: Tuesday, July 13, 2010 5:47 AM
> > To: r-help@r-project.org
> > Subject: [R] StartsWith over vector of Strings?
> >
> > Given vectors of strings of arbitrary length
> >
> > content <- c("abc", "def")
> > searchset <- c("a", "abc", "abcdef", "d", "def", "defghi")
> >
> > Is it possible to determine the content String set that matches the searchset in the sense of 'startswith'? This would be a vector of all strings in content that start with the string of any of the strings in the searchset. In the little example here, this would be:
> >
> > result <- c("abc", "abc", "def", "def")
> >
> > Best,
> > Ralf
Re: [R] StartsWith over vector of Strings?
My solution was based on using vectors (which were your original example); now you are using data frames. The actual result is NA, and then you just print content again (which my code never modified), so you are going to see the full content data frame. Try:

content[na.omit(pmatch(searchset$signatures, content$urls)),]

then look at all the pieces (starting from the inside out) to see what is happening at each step and understand what is going on.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

> -----Original Message-----
> From: Ralf B [mailto:ralf.bie...@gmail.com]
> Sent: Tuesday, July 13, 2010 11:57 AM
> To: Greg Snow
> Cc: r-help@r-project.org
> Subject: Re: [R] StartsWith over vector of Strings?
>
> When running the combined code with your suggested line:
>
> content <- data.frame(urls = c(
>   "http://www.google.com/search?source=ighl=enrlz==q=stuffaq=faqi=g10aql=oq=gs_rfai=CrrIS3VU8TJqcMJHuzASm9qyBBgAAAKoEBU_QsmVh",
>   "http://search.yahoo.com/search;_ylt=Atvki9MVpnxuEcPmXLEWgMqbvZx4?p=stufftoggle=1cop=mssei=UTF-8fr=yfp-t-701"))
> searchset <- data.frame(signatures = c("http://www.google.com/search"))
> content[na.omit(pmatch(searchset, content$urls))]
> print(content)
>
> I am getting both URLs as results, but would in fact expect only the first URL. Am I overlooking something?
>
> Ralf
>
> On Tue, Jul 13, 2010 at 12:03 PM, Greg Snow greg.s...@imail.org wrote:
> > content[na.omit(pmatch(searchset, content,,TRUE))]
> >
> > > -----Original Message-----
> > > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Ralf B
> > > Sent: Tuesday, July 13, 2010 5:47 AM
> > > To: r-help@r-project.org
> > > Subject: [R] StartsWith over vector of Strings?
> > >
> > > Given vectors of strings of arbitrary length
> > >
> > > content <- c("abc", "def")
> > > searchset <- c("a", "abc", "abcdef", "d", "def", "defghi")
> > >
> > > Is it possible to determine the content String set that matches the searchset in the sense of 'startswith'? This would be a vector of all strings in content that start with the string of any of the strings in the searchset. In the little example here, this would be:
> > >
> > > result <- c("abc", "abc", "def", "def")
> > >
> > > Best,
> > > Ralf
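As a complement to the pmatch() suggestion, the 'startswith' result from the original question can also be sketched with base R's startsWith(), which is available in R 3.3.0 and later (so it postdates this thread). This is an alternative technique, not the code from the messages above; the name hits is made up for the illustration.

```r
content   <- c("abc", "def")
searchset <- c("a", "abc", "abcdef", "d", "def", "defghi")

# hits[i, j] is TRUE when content[i] starts with searchset[j].
# outer() hands both arguments over as equally long vectors, and
# startsWith() compares them element by element.
hits <- outer(content, searchset, startsWith)

# One copy of each content string per prefix it matches.
result <- rep(content, times = rowSums(hits))
print(result)
```

If each content string should appear at most once, content[rowSums(hits) > 0] gives that instead.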
[R] RJSONIO install problem
Hi everybody,

I am trying to install RJSONIO from source on Mac OS X 10.5.8. I used the Package Installer. The message and sessionInfo are attached below. Can someone help me understand the error message and maybe give a hint towards solving the problem?

Thanks in advance,
Christiaan

Message:

The downloaded packages are in
  /private/var/folders/ub/ubvWLUkKHf8WAywv5rmtcE+++TI/-Tmp-/RtmpZflYon/downloaded_packages
* installing *source* package RJSONIO ...
** libs
*** arch - i386
gcc -arch i386 -std=gnu99 -I/Library/Frameworks/R.framework/Resources/include -I/Library/Frameworks/R.framework/Resources/include/i386 -I/usr/local/include -fPIC -g -O2 -c ConvertUTF.c -o ConvertUTF.o
i686-apple-darwin9-gcc-4.0.1: installation problem, cannot exec 'as': No such file or directory
make: *** [ConvertUTF.o] Error 1
ERROR: compilation failed for package RJSONIO
* removing /Library/Frameworks/R.framework/Versions/2.11/Resources/library/RJSONIO

sessionInfo is:

R version 2.11.1 (2010-05-31)
x86_64-apple-darwin9.8.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base
[R] Boxplot: Scale outliers
Hello!

I am trying to scale the outliers in a boxplot. I am passing

pars = list(boxwex = 0.1, staplewex = 0.1, outwex = 0.1)

to the boxplot command. The boxes are scaled correctly, but the circles (outliers) are not scaled at all, and are thus pretty big compared to the boxes scaled with 0.1. Am I missing something?

Thanks in advance!
Robert
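A possible explanation, sketched below under the assumption that the circles are ordinary plotting symbols: according to the bxp() documentation, outwex scales the width of outlier lines, while the size of outlier points is controlled by outcex. The simulated data and the value 0.4 are made up for the illustration.

```r
# Simulated data with two obvious outliers in group "b".
set.seed(1)
x <- c(rnorm(50), 6, -7)
g <- factor(rep(c("a", "b"), each = 26))

# outcex shrinks the outlier symbols; boxwex and staplewex narrow the boxes and staples.
b <- boxplot(x ~ g,
             pars = list(boxwex = 0.1, staplewex = 0.1, outwex = 0.1, outcex = 0.4))

# The points drawn as outliers are returned in b$out.
print(b$out)
```

Other outlier parameters such as outpch and outcol can be passed in the same list.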
Re: [R] Time Variable and Historical Interest Rates
On Tue, Jul 13, 2010 at 9:54 AM, Aaditya Nanduri aaditya.nand...@gmail.com wrote:
> Guys,
>
> I wrote to the finance mailing list earlier with my questions but was directed here. Sorry for the repeat.
>
> ---
> library(quantmod)
> now <- Sys.time()
> midnight <- strptime() # I want to make this a static variable equal to 12:00:00 am, but I don't know what to put here. I keep getting NA for everything I try.

The key to what I did was format(). I am turning the output of Sys.time() into something that can be compared to the character vector 'midnight'. Also, I would use 24-hour time.

# assign midnight and now
midnight <- "00:00:00"
now <- format(Sys.time(), format = "%H:%M:%S")
# look at the structure of midnight and now
str(midnight)
str(now)
# print to screen
midnight
now

> if (now == midnight) {

This test seems prone to failure. There is only a one-second period during which 'now' must be assigned, or the test will fail.

>   getFX("EUR/USD", from = Sys.Date() - 1, to = Sys.Date() - 1)
>   write.table(EURUSD, "~Documents/stat arb/project/eurusd.csv", append = TRUE, row.names = FALSE, col.names = FALSE)
> }
> ---
>
> Also, append is ignored when I use write.csv, so I had to resort to write.table. Is this always the case?

write.csv() is a convenience wrapper for write.table(). This is also clearly stated in the documentation, ?write.csv:

"These [write.csv, write.csv2] wrappers are deliberately inflexible: they are designed to ensure that the correct conventions are used to write a valid file. Attempts to change 'append', 'col.names', 'sep', 'dec' or 'qmethod' are ignored, with a warning."

So yes, it is always the case. If you want to use write.table() to make a comma-separated file, you might consider adding the argument sep = "," to your write.table() call.

> As for the historical interest rates, thank you all very much for providing me with the information (Finance mailing list). I used the fImport package and called fredSeries to download DPRIME data for the same time frame as the currency data I have (thank you, Mr. Gallon). But that is only data for the US. What about other countries?
>
> I was talking to a professor and he said that there is a way to read data from a website into R if you know the URL. Would this help in getting the interest rates of other countries? (I believe the function is aptly named url.) Could someone provide an example, please?

I imagine it would help if websites provide different countries' interest rates in a convenient file. In fact, in general you would not even have to use url(). Here is an example. On my website I have a tab-delimited data file. I can access it from R by:

read.table(file = "http://www.joshuawiley.com/psyc211/Psyc211-hw1-part1.txt", header = TRUE, sep = "\t")

It is also possible to enter user names and passwords into the URL. This general pattern also works for ftp sites. For secure http (https), I only know how to access them through R on Windows.

Cheers,
Josh

> All help is very much appreciated.
>
> Sincerely,
> Aaditya Nanduri

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
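The two points from the reply above (comparing clock time as text, and appending comma-separated rows with write.table()) can be tried out in a small self-contained sketch. The temporary file and the placeholder rows are made up for the illustration and stand in for the downloaded EUR/USD data.

```r
# Compare the current clock time, formatted as text, with a fixed string.
midnight    <- "00:00:00"
now         <- format(Sys.time(), format = "%H:%M:%S")
is.midnight <- identical(now, midnight)    # TRUE only during that single second

# write.table() honours append = TRUE; sep = "," produces comma-separated rows.
out  <- tempfile(fileext = ".csv")         # hypothetical output file
row1 <- data.frame(day = "day 1", value = 1)   # placeholder rows, not real rates
row2 <- data.frame(day = "day 2", value = 2)
write.table(row1, out, sep = ",", append = TRUE, row.names = FALSE, col.names = FALSE)
write.table(row2, out, sep = ",", append = TRUE, row.names = FALSE, col.names = FALSE)

# Both rows are in the file, one per line.
print(readLines(out))
```

For a job that should run once a day, a scheduler such as cron is usually more robust than waiting for the exact second inside R.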
[R] working out main effect variance when different parameterization is used and interaction term exists
Dear all,

Apologies if this question is a bit theoretical, and for the longish email.

I am meta-analyzing the coefficients and standard errors from multiple studies where the raw data are not available. Each study analyst runs a model that includes an interaction term between, say, sex and smoking, plus age. Here is an illustrative example for one study:

set.seed(1066)
status <- rbinom(1000, 1, 0.2)
males <- rbinom(1000, 1, 0.6)
smoke <- rbinom(1000, 1, 0.3)
age <- runif(1000, min = 20, max = 80)

coef(summary(f1 <- glm(status ~ males*smoke + age, family = binomial)))
#                 Estimate  Std. Error    z value     Pr(>|z|)
# (Intercept) -1.520399871 0.284464584 -5.3447774 9.052825e-08
# males        0.213851446 0.201717381  1.0601538 2.890746e-01
# smoke       -0.123103049 0.292346483 -0.4210861 6.736922e-01
# age         -0.001056007 0.004612947 -0.2289223 8.189293e-01
# males:smoke  0.283775173 0.362821438  0.7821345 4.341355e-01

Now, unfortunately, some analysts coded sex as females instead of males. Using the same dataset, I get the following output with females:

females <- 1 - males
coef(summary(f1 <- glm(status ~ females*smoke + age, family = binomial)))
#                   Estimate   Std. Error    z value     Pr(>|z|)
# (Intercept)   -1.306548425 0.262573162* -4.9759405 6.493160e-07
# females       -0.213851446 0.201717381* -1.0601538 2.890746e-01
# smoke          0.160672124 0.214923130*  0.7475795 4.547138e-01
# age           -0.001056007 0.004612947  -0.2289223 8.189293e-01
# females:smoke -0.283775173 0.362821438  -0.7821345 4.341355e-01

I have worked out algebraically (and numerically) the following:

Beta(females) = -Beta(males)
Var(females) = Var(males)
Beta(females:smoke) = -Beta(males:smoke)
Var(females:smoke) = Var(males:smoke)
Beta(smoke | fit1) = Beta(smoke | fit2) + Beta(females:smoke) = 0.160672124 - 0.283775173 = -0.1231030

How can I calculate Var(smoke | fit1) from Var(smoke | fit2)? I tried to derive this algebraically but ended up with a covariance term which I could not solve.

If I could cleverly convert Var(smoke | fit2) to Var(smoke | fit1), then I could avoid going back to each analyst. This particular analysis is only one of many hundreds we run, and it would be annoying to ask each analyst to use the same parameterisation.

Any suggestions are much appreciated. Many thanks in advance.

Regards,
Adai
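The covariance term mentioned above can be checked numerically. Since Beta(smoke | fit1) is the sum of two coefficients from the females fit, the usual variance-of-a-sum identity gives Var(smoke | fit1) = Var(smoke | fit2) + Var(females:smoke | fit2) + 2 Cov(smoke, females:smoke | fit2). The sketch below re-creates the simulated example from the message and reads the covariance from vcov(); the names fit1, fit2, V2 and var.smoke.fit1 are chosen for the illustration. It suggests that the standard errors alone are not sufficient and that the covariance (or the full variance-covariance matrix) would be needed from each analyst.

```r
# Simulated data as in the message above.
set.seed(1066)
status  <- rbinom(1000, 1, 0.2)
males   <- rbinom(1000, 1, 0.6)
smoke   <- rbinom(1000, 1, 0.3)
age     <- runif(1000, min = 20, max = 80)
females <- 1 - males

fit1 <- glm(status ~ males * smoke + age, family = binomial)     # males coding
fit2 <- glm(status ~ females * smoke + age, family = binomial)   # females coding

# Variance of (smoke + females:smoke) in the females parameterisation.
V2 <- vcov(fit2)
var.smoke.fit1 <- V2["smoke", "smoke"] +
                  V2["females:smoke", "females:smoke"] +
                  2 * V2["smoke", "females:smoke"]

# Standard error derived from fit2 next to the one reported directly by fit1.
print(c(derived = sqrt(var.smoke.fit1),
        direct  = sqrt(vcov(fit1)["smoke", "smoke"])))
```

The same identity applies to any coefficient that is a linear combination of coefficients from the other parameterisation.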
Re: [R] [R-pkgs] New package "list" for analyzing list survey experiments
Raubertas, Richard wrote:
> I agree that 'list' is a terrible package name, but only secondarily because it is a data type. The primary problem is that it is so generic as to be almost totally uninformative about what the package does.
>
> For some reason package writers seem to prefer maximally uninformative names for their packages. To take some examples of recently announced packages, can anyone guess what packages 'FDTH', 'rtv', or 'lavaan' do? Why the aversion to informative names along the lines of 'Freq_dist_and_histogram', 'RandomTimeVariables', and 'Latent_Variable_Analysis', respectively?

I'm sure it's part tradition...

ls cat rm cp mv su
[R] Need help on index for time series object
Dear all,

Please forgive me if this is a duplicate post; my previous mail perhaps didn't reach the list...

Let's say I have the following time series:

library(zoo)
dat1 <- zooreg(rnorm(10), start = as.Date("2010-01-01"), frequency = 1)
dat1[c(3, 7, 8)] = NA
dat1
 2010-01-01  2010-01-02  2010-01-03  2010-01-04  2010-01-05
 0.31244288 -2.49383257          NA  0.38975582 -1.23040380
 2010-01-06  2010-01-07  2010-01-08  2010-01-09  2010-01-10
-0.09697926          NA          NA -0.63171455  0.15867246

Now I want to get the indices of the non-NA elements of dat1. That is, I want to get a vector like:

1, 2, 4, 5, 6, 9, 10

Given a time series like dat1, is there any straightforward approach to get that?

Thanks and regards,
Re: [R] Need help on index for time series object
Megh,

I don't know whether this is the best way, but it works:

    seq(1,length(dat1))[!is.na(dat1)]
    [1]  1  2  4  5  6  9 10

Jonathan

On Tue, Jul 13, 2010 at 1:58 PM, Megh Dal megh700...@yahoo.com wrote:
> Now I want to get the indices of the non-NA elements of dat1. That is, I want to get a vector like: 1,2,4,5,6,9,10
> Given a time series vector like dat1, is there any straightforward approach to get that?
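A minimal sketch of the same idea, assuming the zoo package is installed: which() applied to the underlying data returns the integer positions directly, and index() gives the matching dates if those are wanted instead. The use of coredata(), the fixed seed, and the variable names pos and non_na_dates are illustrative choices, not something proposed in the thread.

```r
library(zoo)

set.seed(1)  # illustrative seed so the example is repeatable
dat1 <- zooreg(rnorm(10), start = as.Date("2010-01-01"), frequency = 1)
dat1[c(3, 7, 8)] <- NA

# Integer positions of the non-NA observations.
# coredata() strips the time index, leaving a plain numeric vector.
pos <- which(!is.na(coredata(dat1)))
pos

# The corresponding dates, if those are needed rather than positions.
non_na_dates <- index(dat1)[pos]
non_na_dates
```

Working on coredata() keeps the result a plain integer vector, independent of how a given zoo version handles is.na() on the series object itself.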
[R] export tables to excel files on multiple sheets with titles for each table
Hello R-users,

Checking the archives, I recently came across this topic: export tables to Excel files (http://r.789695.n4.nabble.com/export-tables-to-Excel-files-td1565679.html#a1565679), where the following interesting references were proposed:

http://rwiki.sciviews.org/doku.php?id=tips:data-io:ms_windows
http://www.r-bloggers.com/export-data-frames-to-multi-worksheet-excel-file-2/

My problem is a small extension of what was discussed there, and although I have a solution, I am looking for something more elegant. I want to export multiple data frames (on multiple sheets), but I also want each of them to have its own title, which should be written to Excel as well. The packages/functions that I have checked cannot accommodate a title written on the sheet along with the actual data frame of interest.

I can do something similar to what I need, but without writing the data frames to multiple sheets:

    # head(USArrests) and head(iris) written with an accompanying title, one below the other
    write.excel <- function(tab, ...) {
      zz <- file("example.dat", "a+b")
      cat("TITLE extra line", file = zz, sep = "\n")
      write.table(tab, file = zz, row.names = FALSE, sep = "\t")
      close(zz)
    }
    write.excel(head(USArrests))
    write.excel(head(iris))

Any suggestion on how to export the same information to two separate sheets, while also keeping a title for each of them, is highly appreciated, as I have been searching for a good solution for some time.

Thank you very much and have a great day ahead!

Eugen Pircalabelu
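One possible approach, sketched under assumptions: the openxlsx package (not mentioned in the thread, and assumed to be installed) lets a title and a data frame be written to different start rows of the same worksheet. The helper name write_titled_sheets, the sheet names, the titles, and the output path are all hypothetical and chosen only for illustration.

```r
library(openxlsx)

# Hypothetical helper: one worksheet per table, title in cell A1,
# table starting on row 3.
write_titled_sheets <- function(tables, titles, file) {
  stopifnot(length(tables) == length(titles))
  wb <- createWorkbook()
  for (i in seq_along(tables)) {
    sheet <- paste0("Sheet", i)
    addWorksheet(wb, sheet)
    # Title goes in the first row of the sheet.
    writeData(wb, sheet, titles[[i]], startRow = 1, startCol = 1)
    # Leave a blank row, then write the data frame with its row names.
    writeData(wb, sheet, tables[[i]], startRow = 3, startCol = 1,
              rowNames = TRUE)
  }
  saveWorkbook(wb, file, overwrite = TRUE)
  invisible(file)
}

out <- write_titled_sheets(
  tables = list(head(USArrests), head(iris)),
  titles = c("US arrests (first rows)", "Iris (first rows)"),
  file   = file.path(tempdir(), "example.xlsx")
)
```

Building the workbook in memory and saving once avoids reopening the file for each table, which is what the append-mode text file above has to do.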