[R] R Help
Has anyone programmed the Nonparametric Canonical Correlation method in R? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] time series problem: time points don't match
Gabor: That is not the ideal solution, but it definitely works to provide me with the easier alternative. Thanks for the reply! -- View this message in context: http://n4.nabble.com/time-series-problem-time-points-don-t-match-tp1748387p1748706.html Sent from the R help mailing list archive at Nabble.com.
[R] Exporting Nuopt from splus to R
Hi all, Thanks for the wonderful forum with all the valuable help and comments here. I have been an S-PLUS user for the past 7 to 8 years and am now considering changing over to R. I have been doing a lot of reading, and one of the main reasons is that R is open source, with all the wonderful things that come with that. My question, though, is: is it possible to port any of the functions or libraries that come with S-PLUS to R? For my specific situation, on the Windows platform, if there is a compiled s.dll, is there a way to get it working in R? I would think that if it is a function or source file it can probably be rewritten without much difficulty in R, but what about the compiled code? I am not a systems programmer, so I don't know much about compiling or undoing that. From my understanding it is going to be difficult; is that understanding right? Thanks
[R] ODD and EVEN numbers
Excuse me Carl Withoft! For your information, this is not my homework. I'm just helping my friend with a part of her R code. And every time I ask a question here, it's just a SMALL PART of the 2-page program that I am doing. And for your information, the answers that I get, I still think about how to make use of them. It does not mean that when I get answers, I use them immediately without thinking! And you have no right to tell me that, because I don't remember you answering any of my questions. IF YOU DON'T KNOW THE ANSWERS TO MY QUESTIONS, just keep quiet, and let the smart guys share their thoughts.
Re: [R] ODD and EVEN numbers
Just to give you a hint for the future: if you ask Google for "odd, even, R" you get a message from 2003 as the second match: --- Dave Caccace wrote: Hi, I'm trying to create a function, jim(p), which varies depending on whether the value of p is odd or even. I was trying to use the if function, but I can't work out a formula to determine whether p is odd or even. Thanks, Dave

    if (p %% 2) "odd" else "even"

Uwe Ligges -- (Hi Uwe!) My guess is, using so many capitals in your e-mail has turned away about 1000 helpful souls from your future posts. Maybe reading the posting guide and a one-minute attempt to solve the problem yourself by googling would be appropriate? Think for a moment: Google would have given an answer (the answer!) in 1 minute. You wrote an e-mail to quite a few thousand subscribers. That took more than a minute on your side. And how many hours of reading time did it take from your readers? Seasonal greetings Detlef
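Uwe's one-liner can be wrapped in a small helper; a minimal sketch (the function name `parity` is my own, not from the thread):

```r
# Classify an integer as "odd" or "even" using the modulo operator:
# p %% 2 is 1 (truthy) for odd numbers and 0 (falsy) for even ones.
parity <- function(p) {
  if (p %% 2) "odd" else "even"
}

parity(7)   # "odd"
parity(10)  # "even"

# Vectorised variant for a whole vector at once:
ifelse(1:6 %% 2 == 0, "even", "odd")
```

The vectorised `ifelse` form is usually what you want inside a larger script, since `if` only inspects a single value.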
[R] Biplot for PCA using labdsv package
Hi everyone, I am doing PCA with the labdsv package. I was trying to create a biplot in order to see the arrows for my variables. However, when I run the script for this graph, the console keeps saying:

    Error in nrow(y) : element 1 is empty; the part of the args list of 'dim' being evaluated was: (x)

Could someone please tell me what this means and what I am doing wrong? I will really appreciate any suggestions and help. Thanks, Dilys
[R] Odp: Using a string as a variable name - revisited
Hi, Without some insight into foo, list or counts it is impossible to say what is wrong.

    mat <- matrix(1:12, 3, 4)
    colnames(mat) <- letters[1:4]
    DF <- as.data.frame(mat)
    fac <- factor(names(DF))
    fac
    [1] a b c d
    Levels: a b c d
    ff <- fac[3]
    ff
    [1] c
    Levels: a b c d
    DF[[ff]]
    [1] 7 8 9
    ff <- fac[1]
    DF[[ff]]
    [1] 1 2 3

As you can see, with DF as a data frame and ff extracted from a vector of names as a factor, everything seems to work OK. When anybody except you wants to do

    foo <- list$taxon[match(5, list$item)]

he gets

    Error in list$taxon : object of type 'builtin' is not subsettable

So provide a toy example, or at least the structure of the objects you use, and you will probably get a solution. Regards Petr

r-help-boun...@r-project.org wrote on 01.04.2010 23:24:55: I would like to revisit a problem that was discussed previously (see quoted discussion below). I am trying to do the same thing, using a string to indicate a column with the same name. I am making foo a string taken from a list of names. It matches the row where item = 5, and picks the corresponding taxon:

    foo <- list$taxon[match(5, list$item)]

Let's say this returns foo as "Aulacoseira_islandica". I have another matrix, counts, with column headers corresponding to the taxon list. But when I try to access the data in the Aulacoseira_islandica column, it instead uses the data from another column. For instance...

    columndata <- counts[[foo]]

...returns the data from the wrong column. What it seems to be doing is converting the text "Aulacoseira_islandica" to a number (25, for some reason) and reading the count data from column number 25, instead of from the column labelled Aulacoseira_islandica. If I try...

    columndata <- counts$Aulacoseira_islandica

...it works fine. Any thoughts? -Euan NRRI-University of Minnesota Duluth __ Jason Horn-2 Oct 20, 2006; 06:28pm [R] Using a string as a variable name Is it possible to use a string as a variable name?
For example:

    foo <- "var1"
    frame$foo  # frame is a data frame with a column titled var1

This does not work, unfortunately. Am I just missing the correct syntax to make this work? - Jason __ Oct 20, 2006; 06:30pm Re: [R] Using a string as a variable name

    frame[[foo]]

On 10/20/06, Jason Horn [hidden email] wrote: Is it possible to use a string as a variable name? [previous message quoted] -- Jim Holtman Cincinnati, OH +1 513 646 9390
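Jim's `frame[[foo]]` answer is the key: `$` never substitutes a variable's value, while `[[` evaluates its argument. A minimal self-contained illustration (toy data, not from the thread):

```r
# Access a data-frame column whose name is held in a string.
frame <- data.frame(var1 = c(10, 20, 30), var2 = c(1, 2, 3))
foo <- "var1"

frame$foo     # NULL: $ looks for a column literally named "foo"
frame[[foo]]  # c(10, 20, 30): [[ uses the string stored in foo
```

Note Euan's factor pitfall: if `foo` is a *factor* rather than a character string, `counts[[foo]]` may coerce it to its underlying integer level code (e.g. 25) and select column 25 by position; `counts[[as.character(foo)]]` avoids that.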
Re: [R] roccomp
The ROCR package has methods to compute AUC and related measures. You might want to check it out. Ravi
[R] Derivative of a smooth function
Dear All, I've been searching for appropriate code to compute the rate of change and the curvature of a nonparametric regression model which was fitted as a smooth function, but unfortunately have not managed to do it. I presume that such characteristics of a smooth curve can be determined by the first and second derivative operators. The following is the example of fitting a nonparametric regression model via a smoothing spline from the help file in R:

    ###
    attach(cars)
    plot(speed, dist, main = "data(cars) & smoothing splines")
    cars.spl <- smooth.spline(speed, dist)
    lines(cars.spl, col = "blue")
    lines(smooth.spline(speed, dist, df = 10), lty = 2, col = "red")
    legend(5, 120, c(paste("default [C.V.] => df =", round(cars.spl$df, 1)),
                     "s( * , df = 10)"),
           col = c("blue", "red"), lty = 1:2, bg = 'bisque')
    detach()
    ###

Could someone please advise me on the appropriate way to determine such derivatives for curves fitted by the function above? I would like to thank you in advance. Cheers Fir
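One route, sketched below: `predict()` for smooth.spline fits accepts a `deriv` argument, so the first and second derivatives can be evaluated on a grid and combined into a curvature estimate (the curvature formula is standard calculus, not from the post):

```r
# Rate of change and curvature of a smoothing-spline fit,
# using predict()'s `deriv` argument (stats package, base R).
data(cars)
cars.spl <- smooth.spline(cars$speed, cars$dist)

grid <- seq(min(cars$speed), max(cars$speed), length.out = 100)
d1 <- predict(cars.spl, grid, deriv = 1)  # first derivative: rate of change
d2 <- predict(cars.spl, grid, deriv = 2)  # second derivative

# Curvature of the curve y(x): kappa = y'' / (1 + y'^2)^(3/2)
kappa <- d2$y / (1 + d1$y^2)^(3/2)
```

Derivatives of a spline are exact for the fitted piecewise polynomial, but remember they inherit (and amplify) the noise sensitivity of the fit, so a heavily smoothed fit gives more stable derivatives.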
[R] How to save a model in DB and retrieve It
I'm wondering how to save an object (models like lm, loess, etc.) in a DB, to retrieve and use it afterwards. An example:

    library(RODBC)
    wind_ms <- abs(rnorm(24*30)*4 + 8)
    air_kgm3 <- rnorm(24*30, 0.1)*0.1 + 1.1
    wind_dg <- rnorm(24*30) * 360/7
    ms <- c(0:25)
    kw_mm92 <- c(0, 0, 0, 20, 94, 205, 391, 645, 979, 1375, 1795, 2000, 2040)
    kw_mm92 <- c(kw_mm92, rep(2050, length(ms) - length(kw_mm92)))
    modelspline <- splinefun(ms, kw_mm92)
    kw <- abs(modelspline(wind_ms) - (wind_dg)*2 + (air_kgm3 - 1.15)*300 +
              rnorm(length(wind_ms))*10)
    # plot(wind_ms, kw)
    windDat <- data.frame(kw, wind_ms, air_kgm3, wind_dg)
    windDat[windDat$wind_ms < 3, 'kw'] <- 0
    model <- loess(kw ~ wind_ms + air_kgm3 + wind_dg, data = windDat,
                   enp.target = 10*5*3)  # , span = 0.1
    modX <- serialize(model, connection = NULL, ascii = TRUE)
    Channel <- odbcConnect("someSysDSN; UID=aUid; PWD=aPwd")
    sqlQuery(Channel, paste("INSERT INTO GRT.GeneratorsModels ([cGeneratorID], [tModel]) VALUES (1, ",
             paste("'", gsub("'", "''", rawToChar(modX)), "'", sep = ""), ")", sep = ""))

    # Up to this point it works correctly;
    # in the DB I have the modX variable.
    # The problem arises retrieving the data, with a 64 kb limit:
    strQ <- "SELECT CONVERT(varchar(max), tModel) AS tModel FROM GRT.GeneratorsModels WHERE (cGeneratorID = 1)"
    x <- sqlQuery(Channel, strQ, stringsAsFactors = F, believeNRows = FALSE)  # read error

The code above works for simpler models that have a shorter representation in the variable modX. Any advice on how to store and retrieve this kind of object? Thanks Daniele ORS Srl Via Agostino Morando 1/3 12060 Roddi (Cn) - Italy Tel. +39 0173 620211 Fax. +39 0173 620299 / +39 0173 433111 Web Site www.ors.it
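The round trip itself is pure base R; only the DB transport is the fragile part. A minimal sketch of the serialize/unserialize cycle, with the database calls elided (the 64 kb problem lives entirely in how the text column is read back, not in R's serialization):

```r
# Round-trip a fitted model through its ASCII serialization,
# as it would be stored in a DB text column (DB access omitted).
model <- loess(dist ~ speed, data = cars)

# serialize() with connection = NULL returns a raw vector;
# ascii = TRUE makes it safe to store as text.
txt <- rawToChar(serialize(model, connection = NULL, ascii = TRUE))

# ... INSERT txt into the DB, later SELECT it back in full ...

restored <- unserialize(charToRaw(txt))
all.equal(predict(model), predict(restored))  # TRUE
```

If the retrieved string is truncated at 64 kb, `unserialize` will fail however the model was written, so the fix has to be on the SQL side (e.g. ensuring the driver returns the full varchar(max) value, or splitting the text across several rows and pasting them back together in R); these are suggestions, not tested against your schema.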
[R] Hat matrix and MSEP
Dear all, I have a 100 x 5 data matrix and a 100 x 1 response vector. I have calculated the hat matrix and the diagonal of the matrix. Now I want to know whether the diagonals say something about future prediction (mean square error of prediction). Can I get the variance explained for future data? Thanks a lot -- Linda Garcia
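For context, the hat-matrix diagonals (leverages) do enter the prediction-variance formula: for a linear model, Var(ŷᵢ) = σ²hᵢ at the training points, and the leverages sum to the number of coefficients. A short base-R sketch with simulated data matching the 100 x 5 setup (the data here are random, for illustration only):

```r
# Leverages (hat-matrix diagonal) for a 100 x 5 design.
set.seed(1)
X <- matrix(rnorm(100 * 5), 100, 5)
y <- rnorm(100)
fit <- lm(y ~ X)

h <- hatvalues(fit)   # diagonal of H = X(X'X)^{-1}X'
sum(h)                # equals 6: 5 slopes + intercept

# Points with large h have high leverage; their fitted values
# have variance sigma^2 * h_i, so leverages flag observations
# where prediction is least stable.
```

Leverages alone do not give an out-of-sample MSEP; for that, cross-validation (e.g. the PRESS statistic, which uses residuals scaled by 1 - hᵢ) is the usual route.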
[R] R: Generative Topographic Map
I am running GTM on the same data space points but changing the number of latent space points, the number of basis functions and the parameter sigma. I found a combination of such parameters that works fine. On the other hand, on page 7 of the paper The Generative Topographic Mapping by Swensen, Bishop, and Williams, it is stated: "There is no over-fitting if the number of sample points is increased, since the number of degrees of freedom in the model is controlled by the mapping function y(x;W)." However, since you have translated the code from MatLab to R, I am pretty sure you know what is the cause of the following messages, which routines generate them and under which circumstances. Once I have these details clear, I can possibly try to avoid the event that causes them:

    gtm_trn: Warning -- M-Step matrix singular, using pinv.\n
    1: In chol.default(A, pivot = TRUE) : matrix not positive definite
    2: In gtm_trn(T, FI, W, lambda, 1, b, 2, quiet = FALSE, minSing = 0.01) : Using 40 out of 40 eigenvalues

Thank you so much. Maura

-Original message- From: Ondrej Such [mailto:ondrej.s...@gmail.com] Sent: Thu 01/04/2010 17.07 To: mau...@alice.it Subject: Re: Generative Topographic Map

Hello Maura, Thank you for your email. Marcus Svensen, one of the method's authors, works at Microsoft and can give you more insight into how it works. Email marcu...@microsoft.com, home page http://research.microsoft.com/en-us/um/people/markussv/. I also found his Ph.D. thesis very insightful. I believe it is never useful to have as many points in latent space as there are data points. And with 2000 points I can't imagine going beyond latent dimension 3 or 4 in applying GTM. I've heard that GTM can be useful mostly in situations when the data follow a relatively smooth manifold. Best, --Ondrej

2010/4/1 mau...@alice.it Thank you. I figured that out myself last night. I always forget that read.table does not actually read data into a matrix.
The GTM MatLab toolbox comes with a nice guide to using the package, which might well become an R vignette. Anyway, I got the singular matrix warnings myself and do not know whether I should be concerned about them or not. Moreover, I do not know how to avoid them. I will go through some other experiments, keeping the data space samples and dimensionality fixed and changing some of the input parameters. I stress that our goal is NOT visualization. We do not know the intrinsic dimensionality of the data space samples, therefore we can only proceed by trial and error; that is, we vary the dimensionality of the embedding space. In this experiment the dimensionality of the data space is 7, so we start out projecting our original data to a 1D embedding space, then we try a 2D embedding space, ..., all the way up to a 6D embedding space. Since we do not know the intrinsic dimensionality of the original data, we need a method to evaluate the reliability of the projection. To assess that, we reconstruct the data back from the embedding to the data space, and there we calculate the RMSD between the original data and the reconstructed data. Basically, using RMSD, we need as many reconstructed points as the original number. Such a requirement is achieved by choosing as many points in the latent space as in the data space. Can such a choice be the cause of the matrix singularity? Furthermore, is the number of basis functions related to the number of latent space points somehow? Unluckily, even the GTM MatLab documentation does not explicitly provide any clear criteria about the choice of parameters and their dependence, if any. Thank you, Maura

-Original message- From: Ondrej Such [mailto:ondrej.s...@gmail.com] Sent: Thu 01/04/2010 11.16 To: mau...@alice.it Subject: Re: Generative Topographic Map

Hello, the problem that's tripping the package is that T is a data.frame and not a matrix.
Simply replacing

    T <- read.table("DHA_TNH.txt")

with

    T <- as.matrix(read.table("DHA_TNH.txt"))

makes the code run (though warnings about singular matrices remain; I'm not sure to what degree that is worrisome). I'd be curious as to how you'd suggest improving the documentation. Hope this helps, --Ondrej

2010/3/31 mau...@alice.it I tried to use the R version of the package. I noticed the original MatLab package is much better documented. I had a look at the R demo code gtm_demo and found that the variable Y is used in advance of being created. I wrote my own few lines as follows:

    inDir <- "C:/Documents and Settings/Monville/Alanine Dipeptide/DBP1/DHA"
    setwd(inDir)
    T <- read.table("DHA_TNH.txt")
    L <- 3
    X <- matrix(nrow = nrow(T), ncol = 3, byrow = TRUE)
    MU <- matrix(nrow = round(nrow(T)/5), ncol = L)
    for (i in 1:ncol(X)) { for (j in 1:nrow(X)) { X[j, i] <- RANDU() } }
    for (i in 1:ncol(MU)) { for (j in 1:nrow(MU)) { MU[j, i] <- RANDU() } }
    sigma <- 1
    FI <- gtm_gbf(MU, sigma, X)
    W <-
[R] plot area: secondary y-axis does not display well
Dear useRs, I'm having a slight problem with plotting on 2 axes. While the following code works all right on screen, the saved output does not turn out as desired, i.e. the secondary y-axis does not display fully. Just run the code and look at the image output. Suggestions please... thanks, Muhammad

    ---
    rm(list = ls())
    x <- 1:100
    y <- 200:300
    par(mar = c(5, 5, 5, 7) + 0.1)  # inner margin
    par(oma = c(3, 3, 3, 7))        # outer margin
    png("image.png")
    plot(x, cex = 0.5, type = "l", lty = 2, pch = 3, xlab = "year", ylab = "x-axis", las = 1, col = "blue")
    par(new = TRUE)
    plot(y, cex = 0.5, type = "l", lty = 2, pch = 3, xlab = "", ylab = "", las = 1, axes = FALSE, ylim = c(0, 500), col = "red")
    axis(4, las = 1)
    mtext("y-axis", side = 4, line = 3)
    legend("topleft", col = c("blue", "red"), lty = 2, legend = c("x", "y"), bty = "n")
    box("figure", col = "red")
    box("plot", col = "blue")
    dev.off()
[R] What should I do regarding DLL attempted to change... warning ?
Hi all, The call to library(rJava) results in the following warning message:

    Warning message: In inDL(x, as.logical(local), as.logical(now), ...) :
      DLL attempted to change FPU control word from 8001f to 9001f

After some searching I found the following explanation: "R expects all calls to DLLs (including the initializing call) to leave the FPU control word unchanged. Many run-time libraries reset the FPU control word during initialization; this will cause problems in R, and will result in a warning message like DLL attempted to change FPU control word from 8001f to 9001f. The value 8001f that gets reported is in the format expected by the C library routine _controlfp; the raw value that is used in the FPU register is 037F." I also found a few old discussions that explain (for a package developer) how to avoid this. The question is: should I, as a useR, do anything about this warning message? I use WinXP; here is my sessionInfo():

    R version 2.10.1 (2009-12-14) i386-pc-mingw32
    locale:
    [1] LC_COLLATE=English_United States.1252
    [2] LC_CTYPE=English_United States.1252
    [3] LC_MONETARY=English_United States.1252
    [4] LC_NUMERIC=C
    [5] LC_TIME=English_United States.1252
    attached base packages:
    [1] stats graphics grDevices utils datasets methods base
    other attached packages:
    [1] rJava_0.8-3

Thanks, Tal Contact Details:--- Contact me: tal.gal...@gmail.com | 972-52-7275845 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)
[R] Odp: plot area: secondary y-axis does not display well
Hi, r-help-boun...@r-project.org wrote on 02.04.2010 12:12:02: [question about the secondary y-axis quoted above] The png device does not know about your margin settings: it was called **after** the calls to par(), so those settings applied to a different device. Put

    par(mar = c(5, 5, 5, 7) + 0.1)  # inner margin
    par(oma = c(3, 3, 3, 7))        # outer margin

after the call to png(). Regards Petr
Re: [R] time series problem: time points don't match
On Thu, Apr 1, 2010 at 7:08 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote: Perhaps something like this:

    library(zoo)
    library(chron)

    # read in data
    Lines1 <- "date time level temp
    2009/10/01 00:01:52.0 2.8797 18.401
    2009/10/01 00:16:52.0 2.8769 18.382
    2009/10/01 00:31:52.0 2.8708 18.309
    2009/10/01 00:46:52.0 2.8728 18.285
    2009/10/01 01:01:52.0 2.8716 18.245
    2009/10/01 01:16:52.0 2.8710 18.190"

    Lines2 <- "date time level temp
    2009/10/01 00:11:06.0 2.9507 18.673
    2009/10/01 00:26:06.0 2.9473 18.630
    2009/10/01 00:41:06.0 2.9470 18.593
    2009/10/01 00:56:06.0 2.9471 18.562
    2009/10/01 01:11:06.0 2.9451 18.518
    2009/10/01 01:26:06.0 2.9471 18.480"

    DF1 <- read.table(textConnection(Lines1), header = TRUE, as.is = TRUE)
    DF2 <- read.table(textConnection(Lines2), header = TRUE, as.is = TRUE)
    z1 <- zoo(DF1[3:4], chron(DF1[,1], DF1[,2], format = c("Y/M/D", "H:M:S")))
    z2 <- zoo(DF2[3:4], chron(DF2[,1], DF2[,2], format = c("Y/M/D", "H:M:S")))

    # process inputs z1 and z2,
    # aggregating into 15 minute intervals and merging
    z1a <- aggregate(z1, trunc(time(z1), "00:15:00"), tail, n = 1)
    z2a <- aggregate(z2, trunc(time(z2), "00:25:00"), tail, n = 1)

The last line should have been:

    z2a <- aggregate(z2, trunc(time(z2), "00:15:00"), tail, n = 1)

    z <- merge(z1a, z2a)

On Thu, Apr 1, 2010 at 1:35 PM, Brad Patrick Schneid bpsch...@gmail.com wrote: Hi, I have a time series problem that I would like some help with if you have the time. I have data from many sites that look like this:

    Site.1
    date time level temp
    2009/10/01 00:01:52.0 2.8797 18.401
    2009/10/01 00:16:52.0 2.8769 18.382
    2009/10/01 00:31:52.0 2.8708 18.309
    2009/10/01 00:46:52.0 2.8728 18.285
    2009/10/01 01:01:52.0 2.8716 18.245
    2009/10/01 01:16:52.0 2.8710 18.190

    Site.2
    date time level temp
    2009/10/01 00:11:06.0 2.9507 18.673
    2009/10/01 00:26:06.0 2.9473 18.630
    2009/10/01 00:41:06.0 2.9470 18.593
    2009/10/01 00:56:06.0 2.9471 18.562
    2009/10/01 01:11:06.0 2.9451 18.518
    2009/10/01 01:26:06.0 2.9471 18.480

As you can see, the times do not match up.
What I would like to do is merge these two data sets to the nearest time stamp by creating a new time between the two, something like this:

    date new.time        level.1 temp.1  level.2 temp.2
    2009/10/01 00:01:52.0 2.8797 18.401  NA      NA
    2009/10/01 00:13:59.0 2.8769 18.382  2.9507  18.673
    2009/10/01 00:28:59.0 2.8708 18.309  2.9473  18.630
    2009/10/01 00:43:59.0 2.8728 18.285  2.9470  18.593
    2009/10/01 00:59:59.0 2.8716 18.245  2.9471  18.562
    2009/10/01 01:13:59.0 2.8710 18.190  2.9451  18.518
    2009/10/01 01:26:06.0 NA     NA      2.9471  18.480

Note that the sites may not match in the number of observations, and a return of NA would be necessary, though deleting that time point altogether for both sites would be preferred. A possibly easier alternative would be a way to assign generic times to each observation according to the time interval, so that the first observation of each day would have time = 00:00:00 and each consecutive one would be 15 minutes later. Thanks for any suggestions. Brad
Re: [R] Odp: plot area: secondary y-axis does not display well
Thanks Ivan, Jim and Petr. The output turns out as desired after I've taken your suggestions. Muhammad
Re: [R] What should I do regarding DLL attempted to change... warning ?
On 02/04/2010 7:01 AM, Tal Galili wrote: Hi all, the call library(rJava) results in the warning "DLL attempted to change FPU control word from 8001f to 9001f". [...] The question is, should I, as a useR, do anything about this warning message?

It's a bug in the rJava package, so you should report it to the maintainer of that package.

Duncan Murdoch
Re: [R] What should I do regarding DLL attempted to change... warning ?
Thanks Duncan and Romain, I'll go and do that. With regards, Tal Contact Details:--- Contact me: tal.gal...@gmail.com | 972-52-7275845 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English) --

On Fri, Apr 2, 2010 at 2:10 PM, Duncan Murdoch murd...@stats.uwo.ca wrote: It's a bug in the rJava package, so you should report it to the maintainer of that package.
[R] what is the significance of RSq in earth function??
Hello, I'm using the earth function for multivariate adaptive regression splines. What is the significance of RSq in the earth function? Following is the code; the printed value is of RSq:

    tr.wage <- sample(1:nrow(HCMwage), 0.8 * nrow(HCMwage))
    tst.wage <- (1:nrow(HCMwage))[-tr.wage]
    HCMwageModel <- earth(V2 ~ V3 + V4 + V5 + V6 + V7 + V8 + V9 + V10 + V11 + V12 + W,
                          data = HCMwage[tr.wage, ])
    prdHCMwage <- predict(HCMwageModel, newdata = HCMwage[tst.wage, ])
    wg2HCM <- HCMwage$V2[tst.wage]
    RwageHCM <- (1 - sum((wg2HCM - prdHCMwage)^2) / sum((wg2HCM - mean(wg2HCM))^2))
    print(RwageHCM)
    [1] 0.3204129

Thanks and Regards, Vibha
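The quantity being computed is the usual R-squared, 1 - RSS/TSS; computed on held-out data (as above) it is an out-of-sample R² and is typically lower than the in-sample value a model reports. A base-R sketch with toy data (HCMwage is not available here):

```r
# R-squared = 1 - RSS/TSS, as a reusable helper.
rsq <- function(obs, pred) {
  1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)
}

set.seed(42)
x <- runif(100)
y <- 2 * x + rnorm(100, sd = 0.1)
fit <- lm(y ~ x)

# On the training data this matches summary(fit)$r.squared exactly:
rsq(y, fitted(fit))
```

On test data the same formula can even go negative if the model predicts worse than the test-set mean, which is one reason in-sample and out-of-sample values should not be compared directly.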
[R] Restricting optimisation algorithm's parameter space in GNLM
Hello, I have a problem. I am using the nlme library to fit a non-linear model. There is a linear component to the model with a couple of parameter values that can only be positive (the coefficients are embedded in a sqrt). When I try to fit the model to data, the search algorithm tries a negative value for one of these parameters to see whether it produces an optimal fit. When it does so, it crashes, because the expression cannot take a negative value inside the sqrt function. QUESTION: How do I restrict the optimisation algorithm's parameter space so it does not search negative values when using GNLM? Are there other libraries that fit non-linear models and allow one to control the parameter space the search algorithm is restricted to? Any help would be appreciated. Thanks Dom
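One common workaround, independent of the particular fitting library, is to reparameterize: fit the logs of the positive parameters, so the optimiser searches an unconstrained space while the model only ever sees positive values. A minimal sketch with nls() and a made-up model (the data, model form, and parameter names are all assumptions for illustration, not the poster's model):

```r
## Hedged sketch: keep coefficients positive by fitting la = log(a),
## lb = log(b); exp() is always > 0, so sqrt() never sees a negative.
set.seed(1)
x <- seq(1, 10, length.out = 50)
y <- sqrt(2 + 3 * x) + rnorm(50, sd = 0.1)   # true a = 2, b = 3

fit <- nls(y ~ sqrt(exp(la) + exp(lb) * x),
           start = list(la = 0, lb = 0))
exp(coef(fit))   # back-transform to the original a, b scale
```

The same trick works inside nlme/gnls model formulas; alternatively, optimisers such as optim(method = "L-BFGS-B") or nlminb() accept explicit box constraints via lower/upper bounds.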
Re: [R] plot area: secondary y-axis does not display well
Hi Muhammad, The problem is that you set the par() options before creating your png. I've tried, and it works if you do this:

... # x and y
png("image.png")
par(mar=c(5,5,5,7)+0.1) # inner margin
par(oma=c(3,3,3,7))     # outer margin
... # rest of your code

HTH, Ivan

On 4/2/2010 12:12, Muhammad Rahiz wrote: Dear useRs, I'm having a slight problem with plotting on 2 axes. While the following code works alright on screen, the saved output does not turn out as desired, i.e. the secondary y-axis does not display fully. Just run the code and look at the image output. Suggestions please... thanks, Muhammad

rm(list=ls())
x <- 1:100
y <- 200:300
par(mar=c(5,5,5,7)+0.1) # inner margin
par(oma=c(3,3,3,7))     # outer margin
png("image.png")
plot(x,cex=0.5,type="l",lty=2,pch=3,xlab="year",ylab="x-axis",las=1,col="blue")
par(new=TRUE)
plot(y,cex=0.5,type="l",lty=2,pch=3,xlab="",ylab="",las=1,axes=FALSE,ylim=c(0,500),col="red")
axis(4,las=1)
mtext("y-axis",side=4,line=3)
legend("topleft",col=c("blue","red"),lty=2,legend=c("x","y"),bty="n")
box("figure",col="red")
box("plot",col="blue")
dev.off()

-- Ivan CALANDRA PhD Student University of Hamburg Biozentrum Grindel und Zoologisches Museum Abt. Säugetiere Martin-Luther-King-Platz 3 D-20146 Hamburg, GERMANY +49(0)40 42838 6231 ivan.calan...@uni-hamburg.de http://www.for771.uni-bonn.de http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php
[R] timeseries plot
Hello, I am using the plot() function to plot a time series. It takes a time-series object as an argument, but I want to plot predicted data together with the training set, to compare them. Is there any function available? Vibha
Re: [R] What should I do regarding DLL attempted to change... warning ?
On Fri, 2 Apr 2010, Duncan Murdoch wrote: On 02/04/2010 7:01 AM, Tal Galili wrote: Hi all, The call to: library(rJava) results in the following warning message: Warning message: In inDL(x, as.logical(local), as.logical(now), ...) : DLL attempted to change FPU control word from 8001f to 9001f After some searching I found the following explanation: "R expects all calls to DLLs (including the initializing call) to leave the FPU control word unchanged. Many run-time libraries reset the FPU control word during initialization; this will cause problems in R, and will result in a warning message like 'DLL attempted to change FPU control word from 8001f to 9001f'. The value 8001f that gets reported is in the format expected by the C library routine _controlfp; the raw value that is used in the FPU register is 037F." Along with a few old discussions that explain (for a package developer) how to avoid this. The question is: should I, as a useR, do anything about this warning message? It's a bug in the rJava package, so you should report it to the maintainer of that package. I suspect it is much more likely to be in the Java installation being linked to, so you should make sure that is fully updated and report its version to the maintainer if the problem persists. It does not do this for me with Sun Java 1.6.0u18.
Duncan Murdoch I use WinXP; here is my sessionInfo(): R version 2.10.1 (2009-12-14) i386-pc-mingw32 locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] rJava_0.8-3 Thanks, Tal Contact Details:--- Contact me: tal.gal...@gmail.com | 972-52-7275845 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English) -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) +44 1865 272866 (PA) 1 South Parks Road, Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] [R-SIG-Finance] Derivative of a smooth function
Please keep in mind this question has absolutely nothing to do with finance, and therefore needs to instead be directed to R-help. Thanks in advance for keeping the R-finance list on topic. Jeff On Fri, Apr 2, 2010 at 3:36 AM, FMH kagba2...@yahoo.com wrote: Dear All, I've been searching for appropriate code to compute the rate of change and the curvature of a nonparametric regression model which was denoted by a smooth function, but unfortunately haven't managed to do it. I presume that such characteristics of a smooth curve can be determined by the first and second derivative operators. The following is the example of fitting a nonparametric regression model via a smoothing spline function from the Help file in R.

attach(cars)
plot(speed, dist, main = "data(cars) & smoothing splines")
cars.spl <- smooth.spline(speed, dist)
lines(cars.spl, col = "blue")
lines(smooth.spline(speed, dist, df=10), lty=2, col = "red")
legend(5,120,c(paste("default [C.V.] => df =",round(cars.spl$df,1)),"s( * , df = 10)"), col = c("blue","red"), lty = 1:2, bg='bisque')
detach()

Could someone please advise me on the appropriate way to determine such derivatives of the curves which were fitted by the function above; I would like to thank you in advance. Cheers Fir ___ r-sig-fina...@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-finance -- Subscriber-posting only. If you want to post, subscribe first. -- Also note that this is not the r-help list where general R questions should go. -- Jeffrey Ryan jeffrey.r...@insightalgo.com ia: insight algorithmics www.insightalgo.com
Re: [R] Derivative of a smooth function
Please learn how to use RSiteSearch before posting questions to the list: RSiteSearch("derivative smooth function") This should have provided you with plenty of solutions. Ravi. Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu - Original Message - From: FMH kagba2...@yahoo.com Date: Friday, April 2, 2010 4:39 am Subject: [R] Derivative of a smooth function To: r-help@r-project.org Cc: r-sig-fina...@stat.math.ethz.ch Dear All, I've been searching for appropriate code to compute the rate of change and the curvature of a nonparametric regression model which was denoted by a smooth function, but unfortunately haven't managed to do it. I presume that such characteristics of a smooth curve can be determined by the first and second derivative operators. The following is the example of fitting a nonparametric regression model via a smoothing spline function from the Help file in R.

attach(cars)
plot(speed, dist, main = "data(cars) & smoothing splines")
cars.spl <- smooth.spline(speed, dist)
lines(cars.spl, col = "blue")
lines(smooth.spline(speed, dist, df=10), lty=2, col = "red")
legend(5,120,c(paste("default [C.V.] => df =",round(cars.spl$df,1)),"s( * , df = 10)"), col = c("blue","red"), lty = 1:2, bg='bisque')
detach()

Could someone please advise me on the appropriate way to determine such derivatives of the curves which were fitted by the function above; I would like to thank you in advance. Cheers Fir
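For completeness, one of the solutions such a search turns up: predict.smooth.spline() accepts a `deriv` argument, so derivatives of the fitted spline can be evaluated directly. A short sketch continuing the cars example from the question (the grid of evaluation points is my own choice):

```r
## Hedged sketch: first and second derivatives of a fitted smoothing
## spline via the `deriv` argument of predict.smooth.spline().
attach(cars)
cars.spl <- smooth.spline(speed, dist)
grid <- seq(min(speed), max(speed), length.out = 100)
d1 <- predict(cars.spl, grid, deriv = 1)   # rate of change
d2 <- predict(cars.spl, grid, deriv = 2)   # enters the curvature formula
plot(d1$x, d1$y, type = "l",
     xlab = "speed", ylab = "first derivative of fitted spline")
detach()
```

The curvature of the curve itself, if that is what is wanted rather than the second derivative, would then be d2$y / (1 + d1$y^2)^(3/2).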
[R] build Mac distribution for R package
Dear R users, can somebody give me some suggestions about how to build a Mac distribution of my R package on my own Mac OS? Thanks -- Wenjun
Re: [R] timeseries plot
Here are a few ways. Try this:

set.seed(123)
TS <- ts(1:25 + rnorm(25))
tt <- time(TS)
tt.pred <- end(tt)[1] + 1:10
both <- ts(c(TS, predict(lm(TS ~ tt), list(tt = tt.pred))))
ts.plot(both, TS, gpars = list(type = "o", col = 2:1, pch = 20))

and read ?ts, ?start, ?ts.plot, and next time please provide some sample data using dput. See the last line of every message and the posting guide. On Fri, Apr 2, 2010 at 8:14 AM, vibha patel vibhapatel...@gmail.com wrote: Hello, I am using the plot() function to plot a time series. It takes a time-series object as an argument, but I want to plot predicted data together with the training set, to compare them. Is there any function available? Vibha
[R] Plots don't update with xlab, etc. What am I doing wrong.
Hi, I've been struggling with this problem for the last few days and finally discovered it's happening at a very fundamental level. Going through Stephen Turner's tutorial on ggplot2, I entered these base graphics commands:

with(diamonds, plot(carat, price))
with(diamonds, plot(carat, price), xlab="Weight in Carats", ylab="Price in USD", main="Diamonds are expensive!")

The first command works as expected and draws the plot with labels carat and price and no title. The second command makes R redraw the plot (I can see it clear and redraw), but it's identical to the first! What am I doing wrong? Marsh Feldman
Re: [R] build Mac distribution for R package
On Apr 2, 2010, at 9:26 AM, wenjun zheng wrote: Dear R users, can somebody give me some suggestions about how to build a Mac distribution on my own Mac OS? It appears you have not read the most basic background information yet: http://cran.r-project.org/doc/manuals/R-admin.pdf -- David. David Winsemius, MD West Hartford, CT
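For orientation before reading the manual: the usual workflow (a sketch; "mypkg" and the version number are placeholders, not anything from the thread) is to build a source tarball and then a platform binary from it:

```sh
# Hedged sketch of typical package-building commands on Mac OS X.
R CMD build mypkg                        # -> mypkg_1.0.tar.gz (source package)
R CMD check mypkg_1.0.tar.gz             # run the standard checks
R CMD INSTALL --build mypkg_1.0.tar.gz   # -> binary package for this platform
```

The R Installation and Administration manual linked above covers the toolchain requirements (compilers, etc.) that these commands assume.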
Re: [R] Plots don't update with xlab, etc. What am I doing wrong.
On Apr 2, 2010, at 9:56 AM, Marshall Feldman wrote: Hi, I've been struggling with this problem for the last few days and finally discovered it's happening at a very fundamental level. Going through Stephen Turner's tutorial on ggplot2, I entered these base graphics commands:

with(diamonds, plot(carat, price))
with(diamonds, plot(carat, price), xlab="Weight in Carats", ylab="Price in USD", main="Diamonds are expensive!")

Remove the extraneous ")". -- David. The first command works as expected and draws the plot with labels carat and price and no title. The second command makes R redraw the plot (I can see it clear and redraw), but it's identical to the first! What am I doing wrong? Marsh Feldman David Winsemius, MD West Hartford, CT
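To spell out David's one-line fix: in the original second command the ")" after `price` closes the plot() call, so xlab, ylab, and main are passed to with() (which silently ignores them) rather than to plot(). The corrected call, keeping the labels inside plot():

```r
## Corrected call; `diamonds` comes from the ggplot2 package.
library(ggplot2)
with(diamonds, plot(carat, price,
                    xlab = "Weight in Carats",
                    ylab = "Price in USD",
                    main = "Diamonds are expensive!"))
```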
[R] (no subject)
I'm using the rpart function for creating regression trees. Now, how do I measure the fitness of a regression tree? Thanks and Regards, Vibha

I read R-help as a digest so often come late to a discussion. Let me start by being the first to directly answer the question:

fit <- rpart(time ~ age + ph.ecog, data = lung)
summary(fit)
Call: rpart(formula = time ~ age + ph.ecog, data = lung)
n= 228
          CP nsplit rel error    xerror      xstd
1 0.0351          0 1.000      1.009949 0.1137819
2 0.01459053      1 0.9648333  1.049636 0.1282259
3 0.01324335      3 0.9356523  1.090562 0.1301632
4 0.0100          7 0.8810284  1.063609 0.1298557
Node number 1: 228 observations, complexity param=0.0351
  mean=305.2325, MSE=44176.93
  left son=2 (51 obs) right son=3 (177 obs)
  Primary splits: ...

The relative error ("rel error") and cross-validated relative error ("xerror") columns above, for a regression tree, are equal to 1-R^2. In this case none of the splits are useful; even the naive non-cross-validated improvement for the first split isn't much (R^2 < .04). Now to the larger debate. I do not find trees as useless as Frank does (does anyone?). I like to use them for initial data exploration, in the same fashion as a scatterplot. But I fight the same battle that he does with some colleagues and customers: they are so very easy to interpret that the results are often severely over-interpreted, sometimes to the point that the tree did more harm than good. All forward stepwise procedures are unstable. Particularly with rich data sets, such as I see each day in the medical field, there are multiple overlapping/correlated predictors. Small changes in the data will completely change the order of a forward stepwise regression. Anyone who puts faith in the ORDER of inclusion as a measure of worth is like a flag in a fitful breeze. A bigger problem with rpart is that users consistently ignore the xerror column above, and print out (and believe) bigger trees than they should. Once the xerror bottoms out you are almost certainly looking at random noise. Since the xerror curve often has a long flat bottom, the 1SE rule is better (anything within 1 SE of the minimum is a tie; use the smallest of a set of tied models). Terry Therneau
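The 1SE rule described above can be applied programmatically from the fit's cptable. A hedged sketch (the exact row selected depends on the cross-validation randomness, so results vary run to run):

```r
## Hedged sketch of the 1SE rule for choosing a tree size with rpart.
library(rpart)
library(survival)    # for the `lung` data set used in the reply
set.seed(1)
fit <- rpart(time ~ age + ph.ecog, data = lung)
cp  <- fit$cptable   # columns: CP, nsplit, "rel error", xerror, xstd

## Smallest xerror plus one standard error defines the "tie" threshold:
i.min     <- which.min(cp[, "xerror"])
threshold <- cp[i.min, "xerror"] + cp[i.min, "xstd"]

## Rows are ordered from smallest to largest tree, so the first row
## within the threshold is the smallest tied model:
best.cp <- cp[which(cp[, "xerror"] <= threshold)[1], "CP"]
pruned  <- prune(fit, cp = best.cp)
```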
Re: [R] Summing data based on certain conditions
Dear all, Thanks for the contributions so far. I've had a look at these and the closest I've come to solving it is the following:

data_ave <- ave(data$rammday, by=c(data$month, data$year))
Warning messages:
1: In split.default(x, g) : data length is not a multiple of split variable
2: In split.default(seq_along(x), f, drop = drop, ...) : data length is not a multiple of split variable

I'm slightly confused by the warning messages, as the data lengths do appear the same:

dim(data)
[1] 1073 6
length(data$year)
[1] 1073
length(data$month)
[1] 1073

Maybe the approach I'm taking is wrong. Any suggestions would be gratefully received. Many thanks, Steve

Date: Wed, 31 Mar 2010 23:31:25 +0200 From: stephan.kola...@gmx.de To: smurray...@hotmail.com CC: r-help@r-project.org Subject: Re: [R] Summing data based on certain conditions ?by may also be helpful. Stephan

Steve Murray wrote: Dear all, I have a dataset of 1073 rows, the first 15 of which look as follows:

data[1:15,]
        date year month day rammday thmmday
1   3/8/1988 1988     3   8    1.43    0.94
2  3/15/1988 1988     3  15    2.86    0.66
3  3/22/1988 1988     3  22    5.06    3.43
4  3/29/1988 1988     3  29   18.76   10.93
5   4/5/1988 1988     4   5    4.49    2.70
6  4/12/1988 1988     4  12    8.57    4.59
7  4/16/1988 1988     4  16   31.18   22.18
8  4/19/1988 1988     4  19   19.67   12.33
9  4/26/1988 1988     4  26    3.14    1.79
10  5/3/1988 1988     5   3   11.51    6.33
11 5/10/1988 1988     5  10    5.64    2.89
12 5/17/1988 1988     5  17   37.46   20.89
13 5/24/1988 1988     5  24    9.86    9.81
14 5/31/1988 1988     5  31   13.00    8.63
15  6/7/1988 1988     6   7    0.43    0.00

I am looking for a way to create monthly totals of rammday (rainfall in mm/day; column 5) by doing the following: for each case where the month value and the year are the same (e.g. 3 and 1988, in the first four rows), find the mean of the corresponding rammday values and then multiply by the number of days in that month (i.e. 31 in this case). Note however that the number of month values in each case isn't always the same (e.g. in this subset of data, there are 4 values for month 3, 5 for month 4 and 5 for month 5). Also the months will of course recycle for the following years, so it's not simply a case of finding a monthly total for *all* the 3s in the whole dataset, just those associated with each year in turn. How would I go about doing this in R? Any help will be gratefully received. Many thanks, Steve
[R] Cross-validation for parameter selection (glm/logit)
If my aim is to select a good subset of parameters for my final logit model built using glm(), what is the best way to cross-validate the results so that they are reliable? Let's say that I have a large dataset of thousands of observations. I split this data into two groups, one that I use for training and another for validation. First I use the training set to build a model, and then stepAIC() with a forward-backward search. BUT, if I base my parameter selection purely on this result, I suppose it will be somewhat skewed due to the one-time data split (I use only one training dataset). What is the correct way to perform this variable selection? And are there readily available packages for this? Similarly, when I have my final parameter set, how should I go about making the final assessment of the model's predictive ability? CV? What package? Thank you in advance, Jay
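One standard answer to the single-split worry is k-fold cross-validation: repeat the fit/validate cycle over k folds and average, so no single split drives the conclusion. A minimal hand-rolled sketch (the data frame, formula, and 0.5 cutoff are placeholders for illustration; the boot package's cv.glm() offers a packaged alternative):

```r
## Hedged sketch: 10-fold cross-validated misclassification error
## for a logit model, instead of a single train/validate split.
set.seed(42)
dat <- data.frame(y  = rbinom(1000, 1, 0.5),
                  x1 = rnorm(1000), x2 = rnorm(1000))
k <- 10
folds <- sample(rep(1:k, length.out = nrow(dat)))  # random fold labels
err <- numeric(k)
for (i in 1:k) {
  fit <- glm(y ~ x1 + x2, family = binomial, data = dat[folds != i, ])
  p   <- predict(fit, newdata = dat[folds == i, ], type = "response")
  err[i] <- mean((p > 0.5) != dat$y[folds == i])
}
mean(err)   # cross-validated error estimate
```

Note that to get an honest estimate, the stepAIC() selection itself must be re-run inside each fold; selecting variables once on all the data and then cross-validating only the final model is optimistically biased.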
[R] POSIX primer
I have not used POSIX classes previously and now have a need to use them. I have sports data with times for some athletes after different events. I need to perform some simple analyses using the times. I think I've figured out how to do this; I just want to confirm with others who have more experience that this is indeed the correct approach. If not, please suggest a more appropriate way. Suppose I have times for two athletes after event 1:

times <- c('14:15', '16:45')

Now, I use strptime() as follows:

x <- strptime(times, "%M:%S")
x
[1] "2010-04-02 00:14:15" "2010-04-02 00:16:45"
class(x)
[1] "POSIXt"  "POSIXlt"

Now, I want the average time across all athletes as well as the min and max, so I do:

mean(x); min(x); max(x)
[1] "2010-04-02 00:15:30 EDT"
[1] "2010-04-02 00:14:15 EDT"
[1] "2010-04-02 00:16:45 EDT"

Now, I want to rank-order the athletes:

rank(x)
Error in if (xi == xj) 0L else if (xi > xj) 1L else -1L :
  missing value where TRUE/FALSE needed

But I can rank-order the following:

rank(times)
[1] 1 2

I don't need the date in the object x, but I can't figure out how to remove it. Nonetheless, it doesn't seem to affect anything. Is this the right approach for using time variables and performing some computations on them, or is there another approach I should look at? Thanks, Harold

sessionInfo() R version 2.10.0 (2009-10-26) i386-pc-mingw32 locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] lme4_0.999375-32 Matrix_0.999375-31 lattice_0.17-26 loaded via a namespace (and not attached): [1] grid_2.10.0 tools_2.10.0
Re: [R] Summing data based on certain conditions
Dear Steve, Multiplying the mean by the number of observations is essentially the same as summing the numbers. Have a look at the plyr package.

library(plyr)
ddply(data, c("month", "year"), function(x){
  c(MeanMultiplied = mean(x$rammday) * nrow(x),
    Sum = sum(x$rammday))
})

ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek team Biometrie & Kwaliteitszorg Gaverstraat 4 9500 Geraardsbergen Belgium Research Institute for Nature and Forest team Biometrics & Quality Assurance tel. +32 54/436 185 thierry.onkel...@inbo.be www.inbo.be To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey

-----Original message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of Steve Murray Sent: Friday, 2 April 2010 16:37 To: stephan.kola...@gmx.de; gunter.ber...@gene.com CC: r-help@r-project.org Subject: Re: [R] Summing data based on certain conditions Dear all, Thanks for the contributions so far. I've had a look at these and the closest I've come to solving it is the following:

data_ave <- ave(data$rammday, by=c(data$month, data$year))
Warning messages:
1: In split.default(x, g) : data length is not a multiple of split variable
2: In split.default(seq_along(x), f, drop = drop, ...) : data length is not a multiple of split variable

I'm slightly confused by the warning messages, as the data lengths do appear the same. Maybe the approach I'm taking is wrong. Any suggestions would be gratefully received. Many thanks, Steve
Re: [R] POSIX primer
On Fri, Apr 2, 2010 at 10:49 AM, Doran, Harold hdo...@air.org wrote: I have not used POSIX classes previously and now have a need to use them. I have sports data with times for some athletes... The main reason to use POSIXct is if you need time zones. If you don't, then you might be better off with chron. See R News 4/1.

library(chron)
tt <- times(c('00:14:15', '00:16:45'))
summary(tt)
     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
 00:14:15 00:14:52 00:15:30 00:15:30 00:16:08 00:16:45
min(tt); max(tt); mean(tt)
[1] 00:14:15
[1] 00:16:45
[1] 00:15:30
rank(tt)
[1] 1 2
order(tt)
[1] 1 2
sort(tt)
[1] 00:14:15 00:16:45
tt[order(tt)]
[1] 00:14:15 00:16:45
[R] Merge failure using zoo package
Readers, Please refer to the attached example data files. It seems that the merge function fails for the latter section of the data set. Command terminal output:

library(chron)
library(zoo)
x <- read.zoo("test1.csv", header=TRUE, sep=",", FUN=times)
y <- read.zoo("test2.csv", header=TRUE, sep=",", FUN=times)
z <- na.approx(merge(x[,2], y[,2]), time(z1))
z
            x[, 2]    y[, 2]
01:01:01 0.5418645 0.1755847
01:01:30 0.3486081 0.2068249
01:01:42 0.4808362 0.2380651
01:02:00 0.6130642 0.4983712
01:02:23 0.3140116 0.7586773
01:19:00 0.8545863 0.8927112
01:24:00 0.965     0.1490374

To overcome this behaviour the files test3 and test4 were created by removing data that had been merged previously. Command terminal output below:

x <- read.zoo("test3.csv", header=TRUE, sep=",", FUN=times)
y <- read.zoo("test4.csv", header=TRUE, sep=",", FUN=times)
z <- na.approx(merge(x[,2], y[,2]), time(z1))
z
            x[, 2]    y[, 2]
01:03:06 0.4827475 0.7350236
01:03:30 0.6951390 0.8376028
01:03:50 0.5798283 0.9401821
01:04:00 0.4645176 0.8330635
01:04:30 0.6167257 0.7259450
01:19:00 0.8545863 0.8927112
01:24:00 0.965     0.1490374

The only way to obtain a more complete merge of the data sets is to manually create new files where previously merged data is removed, and then put all the merged data into a new file. Surely this package should merge the data sets completely? yours, rh...@conference.jabber.org r251 mandriva2008
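For reference, zoo's merge does perform a full outer join by default (all time points from both series are kept, with NAs filled in where one series has no observation), which suggests the truncation above comes from the inputs or the na.approx() step rather than merge() itself. A self-contained sketch with made-up series (my own toy data, not the poster's CSV files):

```r
## Hedged sketch: merge() on zoo objects keeps every index value from
## both series by default (all = TRUE), padding gaps with NA.
library(zoo)
x <- zoo(c(0.1, 0.2, 0.3, 0.4), c(1, 2, 3, 5))
y <- zoo(c(1.1, 1.2, 1.3, 1.4), c(2, 3, 4, 6))
m <- merge(x, y)                # outer join; indices 1:6 minus none dropped
na.approx(m, na.rm = FALSE)     # interpolate interior NAs, keep the ends
```

Posting the actual CSV files inline (or a dput() of the zoo objects) would let readers reproduce the reported truncation.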
Re: [R] Summing data based on certain conditions
Steve - Take a closer look at the help page for ave(), especially the ... argument. Try

data_ave <- ave(data$rammday, data$month, data$year, FUN=mean)

(Assuming you want to calculate the mean -- your example didn't specify a function.) - Phil Spector Statistical Computing Facility Department of Statistics UC Berkeley spec...@stat.berkeley.edu On Fri, 2 Apr 2010, Steve Murray wrote: Dear all, Thanks for the contributions so far. I've had a look at these and the closest I've come to solving it is the following:

data_ave <- ave(data$rammday, by=c(data$month, data$year))
Warning messages:
1: In split.default(x, g) : data length is not a multiple of split variable
2: In split.default(seq_along(x), f, drop = drop, ...) : data length is not a multiple of split variable

I'm slightly confused by the warning message, as the data lengths do appear the same:

dim(data)
[1] 1073 6
length(data$year)
[1] 1073
length(data$month)
[1] 1073

Maybe the approach I'm taking is wrong. Any suggestions would be gratefully received. Many thanks, Steve Date: Wed, 31 Mar 2010 23:31:25 +0200 From: stephan.kola...@gmx.de To: smurray...@hotmail.com CC: r-help@r-project.org Subject: Re: [R] Summing data based on certain conditions ?by may also be helpful. 
Stephan Steve Murray schrieb: Dear all, I have a dataset of 1073 rows, the first 15 of which look as follows:

data[1:15,]
   date      year month day rammday thmmday
1  3/8/1988  1988  3     8    1.43    0.94
2  3/15/1988 1988  3    15    2.86    0.66
3  3/22/1988 1988  3    22    5.06    3.43
4  3/29/1988 1988  3    29   18.76   10.93
5  4/5/1988  1988  4     5    4.49    2.70
6  4/12/1988 1988  4    12    8.57    4.59
7  4/16/1988 1988  4    16   31.18   22.18
8  4/19/1988 1988  4    19   19.67   12.33
9  4/26/1988 1988  4    26    3.14    1.79
10 5/3/1988  1988  5     3   11.51    6.33
11 5/10/1988 1988  5    10    5.64    2.89
12 5/17/1988 1988  5    17   37.46   20.89
13 5/24/1988 1988  5    24    9.86    9.81
14 5/31/1988 1988  5    31   13.00    8.63
15 6/7/1988  1988  6     7    0.43    0.00

I am looking for a way by which I can create monthly totals of rammday (rainfall in mm/day; column 5) by doing the following: For each case where the month value and the year are the same (e.g. 3 and 1988, in the first four rows), find the mean of the corresponding rammday values and then multiply by the number of days in that month (i.e. 31 in this case). Note however that the number of month values in each case isn't always the same (e.g. in this subset of data, there are 4 values for month 3, 5 for month 4 and 5 for month 5). Also the months will of course recycle for the following years, so it's not simply a case of finding a monthly total for *all* the 3s in the whole dataset, just those associated with each year in turn. How would I go about doing this in R? Any help will be gratefully received. Many thanks, Steve __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
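Pulling Phil's and Stephan's suggestions together, here is a minimal base-R sketch of the calculation Steve describes. The data frame below is a made-up subset in the shape of his example, and the days-in-month lookup ignores leap years:

```r
# Toy data in the shape of Steve's example (values taken from his first rows)
data <- data.frame(
  year    = c(1988, 1988, 1988, 1988, 1988),
  month   = c(3, 3, 3, 4, 4),
  rammday = c(1.43, 2.86, 5.06, 4.49, 8.57)
)

# ave() takes grouping variables as extra arguments, not via a by= argument
monthly_mean <- ave(data$rammday, data$month, data$year, FUN = mean)

# Monthly total = mean daily rainfall * days in that month (leap years ignored)
days_in_month <- c(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)
data$monthly_total <- monthly_mean * days_in_month[data$month]
```

Because ave() returns a vector as long as the data, every row of a given month/year group carries the same monthly total.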
Re: [R] POSIX primer
Beautiful. Thank you. -Original Message- From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] Sent: Friday, April 02, 2010 10:59 AM To: Doran, Harold Cc: r-help@r-project.org Subject: Re: [R] POSIX primer On Fri, Apr 2, 2010 at 10:49 AM, Doran, Harold hdo...@air.org wrote: I have not used POSIX classes previously and now have a need to use them. I have sports data with times of some athletes The main reason to use POSIXct is if you need time zones. If you don't then you might be better off with chron. See R News 4/1.

library(chron)
tt <- times(c('00:14:15', '00:16:45'))
summary(tt)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
00:14:15 00:14:52 00:15:30 00:15:30 00:16:08 00:16:45
min(tt); max(tt); mean(tt)
[1] 00:14:15
[1] 00:16:45
[1] 00:15:30
rank(tt)
[1] 1 2
order(tt)
[1] 1 2
sort(tt)
[1] 00:14:15 00:16:45
tt[order(tt)]
[1] 00:14:15 00:16:45

after different events. I need to perform some simple analyses using the times. I think I've figured out how to do this. I just want to confirm with others who have more experience that this is indeed the correct approach. If not, please suggest a more appropriate way. Suppose I have times for two athletes after event 1.

times <- c('14:15', '16:45')

Now, I use strptime() as follows

x <- strptime(times, "%M:%S")
x
[1] 2010-04-02 00:14:15 2010-04-02 00:16:45
class(x)
[1] POSIXt POSIXlt

Now, I want the average time across all athletes as well as the min and max, so I do:

mean(x); min(x); max(x)
[1] 2010-04-02 00:15:30 EDT
[1] 2010-04-02 00:14:15 EDT
[1] 2010-04-02 00:16:45 EDT

Now, I want to rank order the athletes:

rank(x)
Error in if (xi == xj) 0L else if (xi > xj) 1L else -1L : missing value where TRUE/FALSE needed

But, I can rank order the following.

rank(times)
[1] 1 2

I don't need the date in the object x, but I can't figure out how to remove it. Nonetheless, it doesn't seem to affect anything. x [1] 2010-04-02 00:14:15 2010-04-02 00:16:45 Is this the right approach for using time variables and performing some computations on them? 
Or, is there another approach I should look at? Thanks, Harold sessionInfo() R version 2.10.0 (2009-10-26) i386-pc-mingw32 locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] lme4_0.999375-32 Matrix_0.999375-31 lattice_0.17-26 loaded via a namespace (and not attached): [1] grid_2.10.0 tools_2.10.0 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
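To make Gabor's point concrete for the ranking problem Harold hit: a short sketch with chron (assumes the chron package is installed; the third time is invented to make the ordering visible):

```r
library(chron)  # provides the times class Gabor recommends

# Three athletes' times; the third value is invented for illustration
tt <- times(c("00:14:15", "00:16:45", "00:15:10"))

rank(tt)   # works directly, unlike rank() on the POSIXlt object in the post
mean(tt)   # summary statistics also work, with no spurious date attached
sort(tt)
```

Since a times object is just a fraction of a day with a print method, all the usual numeric operations (rank, order, mean, diff) apply without the date baggage that POSIXlt carries along.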
Re: [R] roccomp
Thank you, Ravi, I have looked at that package, but I don't see any method to compare two ROC curves. I believe the method used by roccomp is based on DeLong. JoAnn From: Ravi Kulkarni [via R] [ml-node+1748903-1261333028-216...@n4.nabble.com] Sent: Friday, April 02, 2010 2:57 AM To: Alvarez, Joann Marie Subject: Re: roccomp The ROCR package has methods to compute AUC and related methods. You might want to check it out. Ravi -- View this message in context: http://n4.nabble.com/roccomp-tp1748818p1749257.html Sent from the R help mailing list archive at Nabble.com. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
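For an R-side equivalent of Stata's roccomp: the pROC package (not mentioned in the thread, so treat this as a pointer rather than a confirmed answer) implements DeLong's test for comparing two correlated ROC curves. A sketch with invented data:

```r
library(pROC)  # assumed installed; provides roc() and roc.test()

# Invented outcome and two competing markers measured on the same subjects
outcome <- c(0, 0, 0, 0, 1, 1, 1, 1)
marker1 <- c(0.1, 0.3, 0.2, 0.4, 0.6, 0.7, 0.8, 0.5)
marker2 <- c(0.2, 0.1, 0.4, 0.3, 0.9, 0.5, 0.7, 0.6)

roc1 <- roc(outcome, marker1)
roc2 <- roc(outcome, marker2)

# Paired DeLong test for the difference between the two AUCs
roc.test(roc1, roc2, method = "delong")
```

Because both markers are measured on the same subjects, the paired form of the test is the appropriate one, which roc.test uses by default for ROC curves built on the same response.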
Re: [R] Merge failure using zoo package
Data files test1, ...2, ...3, ...4 respectively. time1,dataset1 01:01:00,0.73512097 01:01:30,0.34860813 01:02:00,0.61306418 01:02:30,0.01495898 01:03:00,0.27035612 01:03:30,0.69513898 01:04:00,0.46451758 01:04:30,0.61672569 01:05:00,0.82496122 01:05:30,0.34766154 01:06:00,0.69618714 01:06:30,0.39035214 01:07:00,0.01680143 01:07:30,0.28576967 01:08:00,0.01205416 01:08:30,0.89637254 01:09:00,0.63147653 01:09:30,0.01522139 01:10:00,0.27661960 01:10:30,0.50974124 01:11:00,0.68141977 01:11:30,0.90725854 01:12:00,0.83823443 01:12:30,0.53360241 01:13:00,0.17769196 01:13:30,0.83438616 01:14:00,0.67248807 01:14:30,0.09991933 01:15:00,0.03334966 01:15:30,0.93292355 01:16:00,0.15990837 01:16:30,0.05354050 01:17:00,0.55281203 01:17:30,0.37845690 01:18:00,0.89051365 01:18:30,0.16674292 01:19:00,0.85458626 01:19:30,0.19278550 01:20:00,0.73240405 01:20:30,0.16417524 01:21:00,0.73878212 01:21:30,0.51790118 01:22:00,0.83076438 01:22:30,0.4704 01:23:00,0.02108640 01:23:30,0.82911053 01:24:00,0.9646 01:24:30,0.14493657 01:25:00,0.84422332 01:25:30,0.41589974 01:26:00,0.67606367 01:26:30,0.00606434 01:27:00,0.59951991 01:27:30,0.43949260 01:28:00,0.66297385 01:28:30,0.33131298 01:29:00,0.06102041 01:29:30,0.84722118 01:30:00,0.46841491 01:30:30,0.34200755 01:31:00,0.87386578 01:31:30,0.70737403 01:32:00,0.23978781 01:32:30,0.11787278 01:33:00,0.14679814 01:33:30,0.65217063 01:34:00,0.81355908 01:34:30,0.31583482 01:35:00,0.92167666 01:35:30,0.55931271 01:36:00,0.13641271 01:36:30,0.35048575 01:37:00,0.17243584 01:37:30,0.93645686 01:38:00,0.85356548 01:38:30,0.61399352 01:39:00,0.05910707 01:39:30,0.01721605 01:40:00,0.94845557 01:40:30,0.48117810 01:41:00,0.34752402 01:41:30,0.59295472 01:42:00,0.64267429 01:42:30,0.57859933 01:43:00,0.00201441 01:43:30,0.32530995 01:44:00,0.25474645 01:44:30,0.93187534 01:45:00,0.99361033 01:45:30,0.16591641 time2,dataset2 01:01:01,0.17558467 01:01:42,0.23806514 01:02:23,0.75867726 01:03:06,0.73502357 01:03:50,0.94018206 01:04:35,0.61882643 
01:05:21,0.68417492 01:06:08,0.05744461 01:06:55,0.33344394 01:07:44,0.68752593 01:08:33,0.17270469 01:09:23,0.81522124 01:10:03,0.68304352 01:10:43,0.38774082 01:11:23,0.84176890 01:12:04,0.0936 01:12:44,0.13431965 01:13:25,0.92210721 01:14:06,0.33630635 01:14:47,0.56690294 01:15:29,0.09870816 01:16:11,0.77864105 01:16:53,0.61803441 01:17:35,0.09133728 01:18:17,0.08925487 01:19:00,0.89271117 01:19:42,0.56605742 01:20:25,0.98520534 01:21:08,0.66104843 01:21:51,0.96948589 01:22:34,0.05692690 01:23:17,0.71887456 01:24:00,0.14903741 01:24:43,0.86569445 01:25:26,0.27923513 01:26:09,0.98365033 01:26:53,0.08308399 01:27:36,0.87071027 01:28:19,0.26475705 01:29:03,0.76409811 01:29:47,0.59563256 01:30:31,0.23995054 01:31:14,0.00951054 01:31:59,0.21367270 time1,dataset1 01:02:30,0.01495898 01:03:00,0.27035612 01:03:30,0.69513898 01:04:00,0.46451758 01:04:30,0.61672569 01:05:00,0.82496122 01:05:30,0.34766154 01:06:00,0.69618714 01:06:30,0.39035214 01:07:00,0.01680143 01:07:30,0.28576967 01:08:00,0.01205416 01:08:30,0.89637254 01:09:00,0.63147653 01:09:30,0.01522139 01:10:00,0.27661960 01:10:30,0.50974124 01:11:00,0.68141977 01:11:30,0.90725854 01:12:00,0.83823443 01:12:30,0.53360241 01:13:00,0.17769196 01:13:30,0.83438616 01:14:00,0.67248807 01:14:30,0.09991933 01:15:00,0.03334966 01:15:30,0.93292355 01:16:00,0.15990837 01:16:30,0.05354050 01:17:00,0.55281203 01:17:30,0.37845690 01:18:00,0.89051365 01:18:30,0.16674292 01:19:00,0.85458626 01:19:30,0.19278550 01:20:00,0.73240405 01:20:30,0.16417524 01:21:00,0.73878212 01:21:30,0.51790118 01:22:00,0.83076438 01:22:30,0.4704 01:23:00,0.02108640 01:23:30,0.82911053 01:24:00,0.9646 01:24:30,0.14493657 01:25:00,0.84422332 01:25:30,0.41589974 01:26:00,0.67606367 01:26:30,0.00606434 01:27:00,0.59951991 01:27:30,0.43949260 01:28:00,0.66297385 01:28:30,0.33131298 01:29:00,0.06102041 01:29:30,0.84722118 01:30:00,0.46841491 01:30:30,0.34200755 01:31:00,0.87386578 01:31:30,0.70737403 01:32:00,0.23978781 01:32:30,0.11787278 
01:33:00,0.14679814 01:33:30,0.65217063 01:34:00,0.81355908 01:34:30,0.31583482 01:35:00,0.92167666 01:35:30,0.55931271 01:36:00,0.13641271 01:36:30,0.35048575 01:37:00,0.17243584 01:37:30,0.93645686 01:38:00,0.85356548 01:38:30,0.61399352 01:39:00,0.05910707 01:39:30,0.01721605 01:40:00,0.94845557 01:40:30,0.48117810 01:41:00,0.34752402 01:41:30,0.59295472 01:42:00,0.64267429 01:42:30,0.57859933 01:43:00,0.00201441 01:43:30,0.32530995 01:44:00,0.25474645 01:44:30,0.93187534 01:45:00,0.99361033 01:45:30,0.16591641 time2,dataset2 01:03:06,0.73502357 01:03:50,0.94018206 01:04:35,0.61882643 01:05:21,0.68417492 01:06:08,0.05744461 01:06:55,0.33344394 01:07:44,0.68752593 01:08:33,0.17270469 01:09:23,0.81522124 01:10:03,0.68304352 01:10:43,0.38774082 01:11:23,0.84176890 01:12:04,0.0936 01:12:44,0.13431965 01:13:25,0.92210721 01:14:06,0.33630635 01:14:47,0.56690294 01:15:29,0.09870816 01:16:11,0.77864105 01:16:53,0.61803441 01:17:35,0.09133728 01:18:17,0.08925487 01:19:00,0.89271117
Re: [R] Cross-validation for parameter selection (glm/logit)
Jay Unless I have misunderstood some statistical subtleties, you can use the AIC in place of actual cross-validation, as the AIC is asymptotically equivalent to leave-one-out cross-validation under MLE. Joe Stone, M. (1977). An asymptotic equivalence of choice of model by cross-validation and Akaike's criterion. Journal of the Royal Statistical Society, Series B (Methodological), 39, 44-47. Abstract: A logarithmic assessment of the performance of a predicting density is found to lead to asymptotic equivalence of choice of model by cross-validation and Akaike's criterion, when maximum likelihood estimation is used within each model. Jay josip.2...@gmail.com Sent by: r-help-boun...@r-project.org 04/02/2010 09:14 AM To r-help@r-project.org cc Subject [R] Cross-validation for parameter selection (glm/logit) If my aim is to select a good subset of parameters for my final logit model built using glm(), what is the best way to cross-validate the results so that they are reliable? Let's say that I have a large dataset of 1000's of observations. I split this data into two groups, one that I use for training and another for validation. First I use the training set to build a model, and then stepAIC() with a forward-backward search. BUT, if I base my parameter selection purely on this result, I suppose it will be somewhat skewed due to the one-time data split (I use only 1 training dataset). What is the correct way to perform this variable selection? And are there readily available packages for this? Similarly, when I have my final parameter set, how should I go about making the final assessment of the model's predictive ability? CV? What package? Thank you in advance, Jay __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
[[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
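For the cross-validation half of Jay's question, the boot package (shipped with R) provides cv.glm(). A sketch with simulated data, shown only to illustrate the mechanics and not as a recommendation on the variable-selection question itself:

```r
library(boot)  # ships with R; provides cv.glm()

# Simulated logit data for illustration only
set.seed(1)
dat <- data.frame(x = rnorm(200))
dat$y <- rbinom(200, 1, plogis(dat$x))

fit <- glm(y ~ x, data = dat, family = binomial)

# 0/1 misclassification cost; K = 10 gives 10-fold cross-validation
cost <- function(obs, pred) mean(abs(obs - pred) > 0.5)
cv10 <- cv.glm(dat, fit, cost = cost, K = 10)
cv10$delta[1]  # cross-validated misclassification estimate

AIC(fit)       # for comparison with Joe's asymptotic-equivalence point
```

Note that if stepAIC() has already been used to pick variables, an honest error estimate requires repeating the selection inside each fold rather than cross-validating only the final model.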
Re: [R] Summing data based on certain conditions
On Apr 2, 2010, at 10:36 AM, Steve Murray wrote: Dear all, Thanks for the contributions so far. I've had a look at these and the closest I've come to solving it is the following: data_ave <- ave(data$rammday, by=c(data$month, data$year)) Warning messages: 1: In split.default(x, g) : data length is not a multiple of split variable 2: In split.default(seq_along(x), f, drop = drop, ...) : data length is not a multiple of split variable I'm slightly confused by the warning message, as the data lengths do appear the same: dim(data) [1] 1073 6 length(data$year) [1] 1073 length(data$month) [1] 1073 All true, no doubt, but did you look at length(c(data$month, data$year)) # ?? -- David. Maybe the approach I'm taking is wrong. Any suggestions would be gratefully received. Many thanks, Steve Date: Wed, 31 Mar 2010 23:31:25 +0200 From: stephan.kola...@gmx.de To: smurray...@hotmail.com CC: r-help@r-project.org Subject: Re: [R] Summing data based on certain conditions ?by may also be helpful. Stephan Steve Murray schrieb: Dear all, I have a dataset of 1073 rows, the first 15 of which look as follows: data[1:15,] date year month day rammday thmmday 1 3/8/1988 1988 3 8 1.43 0.94 2 3/15/1988 1988 3 15 2.86 0.66 3 3/22/1988 1988 3 22 5.06 3.43 4 3/29/1988 1988 3 29 18.76 10.93 5 4/5/1988 1988 4 5 4.49 2.70 6 4/12/1988 1988 4 12 8.57 4.59 7 4/16/1988 1988 4 16 31.18 22.18 8 4/19/1988 1988 4 19 19.67 12.33 9 4/26/1988 1988 4 26 3.14 1.79 10 5/3/1988 1988 5 3 11.51 6.33 11 5/10/1988 1988 5 10 5.64 2.89 12 5/17/1988 1988 5 17 37.46 20.89 13 5/24/1988 1988 5 24 9.86 9.81 14 5/31/1988 1988 5 31 13.00 8.63 15 6/7/1988 1988 6 7 0.43 0.00 I am looking for a way by which I can create monthly totals of rammday (rainfall in mm/day; column 5) by doing the following: For each case where the month value and the year are the same (e.g. 3 and 1988, in the first four rows), find the mean of the corresponding rammday values and then multiply by the number of days in that month (i.e. 31 in this case). 
Note however that the number of month values in each case isn't always the same (e.g. in this subset of data, there are 4 values for month 3, 5 for month 4 and 5 for month 5). Also the months will of course recycle for the following years, so it's not simply a case of finding a monthly total for *all* the 3s in the whole dataset, just those associated with each year in turn. How would I go about doing this in R? Any help will be gratefully received. Many thanks, Steve David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
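David's hint, spelled out: c() concatenates the two grouping vectors end to end, so the "group" passed via by= is twice as long as the data, which is exactly what the split.default warning complains about. A tiny demonstration, together with the grouping forms that do work:

```r
# Three rows of illustrative grouping data
month <- c(3, 3, 4)
year  <- c(1988, 1988, 1988)

length(c(month, year))            # 6 -- concatenation, not cross-classification
length(interaction(month, year))  # 3 -- one group label per row

# Either of these groups correctly with ave():
#   ave(x, month, year, FUN = mean)
#   ave(x, interaction(month, year), FUN = mean)
```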
Re: [R] Merge failure using zoo package
The files only have one data column. What is the meaning of x[,2], etc.? What is z1? Please provide reproducible code and data all in a single file using this style so it's clear what is what. Also please cut down the size of your data to the smallest size that will still illustrate the problem.

Lines1 <- "a,b
1,2
3,4"
library(zoo)
library(chron)
z1 <- read.zoo(textConnection(Lines1), header = TRUE, sep = ",", FUN = ...)

etc. On Fri, Apr 2, 2010 at 10:55 AM, e-letter inp...@gmail.com wrote: Readers, Please refer to attached example data files. It seems that the merge function fails for the latter section of the data set. Command terminal output:

library(chron)
library(zoo)
x <- read.zoo("test1.csv", header=TRUE, sep=",", FUN=times)
y <- read.zoo("test2.csv", header=TRUE, sep=",", FUN=times)
z <- na.approx(merge(x[,2], y[,2]), time(z1))
z
            x[, 2]    y[, 2]
01:01:01 0.5418645 0.1755847
01:01:30 0.3486081 0.2068249
01:01:42 0.4808362 0.2380651
01:02:00 0.6130642 0.4983712
01:02:23 0.3140116 0.7586773
01:19:00 0.8545863 0.8927112
01:24:00 0.965     0.1490374

To overcome this behaviour the files test3 and test4 were created by removing data that had been merged previously. Command terminal output below:

x <- read.zoo("test3.csv", header=TRUE, sep=",", FUN=times)
y <- read.zoo("test4.csv", header=TRUE, sep=",", FUN=times)
z <- na.approx(merge(x[,2], y[,2]), time(z1))
z
            x[, 2]    y[, 2]
01:03:06 0.4827475 0.7350236
01:03:30 0.6951390 0.8376028
01:03:50 0.5798283 0.9401821
01:04:00 0.4645176 0.8330635
01:04:30 0.6167257 0.7259450
01:19:00 0.8545863 0.8927112
01:24:00 0.965     0.1490374

The only way to obtain a more complete merge of the data sets is to manually create new files where previously merged data is removed and then put all the merged data into a new file. Surely this package should merge the data sets completely? 
yours, rh...@conference.jabber.org r251 mandriva2008 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
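A filled-in version of the skeleton Gabor shows, as a sketch of the self-contained form he is asking for (values invented; assumes the zoo and chron packages are installed):

```r
library(zoo)
library(chron)

# Two tiny series inline, so the whole example pastes into one session
Lines1 <- "time1,dataset1
01:01:00,0.73
01:01:30,0.34"

Lines2 <- "time2,dataset2
01:01:01,0.17
01:01:42,0.23"

z1 <- read.zoo(textConnection(Lines1), header = TRUE, sep = ",", FUN = times)
z2 <- read.zoo(textConnection(Lines2), header = TRUE, sep = ",", FUN = times)

m <- merge(z1, z2)           # union of both time indexes, NA where a series is absent
na.approx(m, na.rm = FALSE)  # linear interpolation of interior NAs
```

Leading and trailing NAs cannot be interpolated, which may explain why a merged result appears "incomplete" at the ends of the overlap; na.rm = FALSE keeps those rows visible instead of dropping them.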
Re: [R] Exporting Nuopt from splus to R
Jp2010 mandans_p at yahoo.com writes: From my understanding it is going to be difficult, is that my understanding right.? Probably impossible ... TIBCO, or whoever owns S-PLUS now (I don't pay much attention, so it's hard for me to keep track) does try to achieve as much R-compatibility as possible: see http://csan.insightful.com/doc/spluspackages.pdf . If your use of NUOPT is mission-critical, or if the price of S-PLUS is not prohibitive, it might make sense to keep using S-PLUS to some extent. If you can't do that, take a look at http://cran.r-project.org/web/views/Optimization.html to see if R can do the things you are using NUOPT for. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Merge failure using zoo package
On 02/04/2010, Gabor Grothendieck ggrothendi...@gmail.com wrote: The files only have one data column. What is the meaning of x[,2], etc.? What is z1? I only want to merge one column from one file with one column from another file. With x[,2], I am trying to select the column of data. Please provide reproducible code and data all in a single file using this style so it's clear what is what. Also please cut down the size of your data to the smallest size that will still illustrate the problem. See other posting; each file that I used is separated by an empty line. This error seems to occur with the data set size as shown in the files. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R abrupt exit
Dear Lists: I recently ran into quite an annoying problem while running R on Ubuntu 9.10. When running the program, the system suddenly exits from the R session with the following warnings: # OMP: Hint: This may cause performance degradation and correctness issues. Set environment variable KMP_DUPLICATE_LIB_OK=TRUE to ignore this problem and force the program to continue anyway. Please note that the use of KMP_DUPLICATE_LIB_OK is unsupported and using it may cause undefined behavior. For more information, please contact Intel(R) Premier Support. Aborted ## I have to restart R again, and all the calculations are lost. According to the warnings, I set the environment to true; no good. I reinstalled the R program again; no good either. I googled the problem; it seems that there has been no R-help discussion on the topic so far. Any suggestions that the whole OS may have conflicts? I have only one copy of the following file on my computer system: /usr/lib/R/lib/libguide.so Thanks. jacob # options(KMP_DUPLICATE_LIB_OK) $KMP_DUPLICATE_LIB_OK [1] TRUE sessionInfo() R version 2.10.1 (2009-12-14) x86_64-pc-linux-gnu locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] datasets grDevices splines graphics stats utils methods [8] base other attached packages: [1] Design_2.3-0 Hmisc_3.7-0 survival_2.35-8 [4] GEOquery_2.11.3 RCurl_1.4-0 bitops_1.0-4.1 [7] affy_1.24.2 Biobase_2.6.1 preprocessCore_1.8.0 [10] R.methodsS3_1.0.3 loaded via a namespace (and not attached): [1] affyio_1.14.0 cluster_1.12.1 grid_2.10.1 lattice_0.18-3 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] mcmcglmm starting value example
Dear Ping, It is not possible to pass starting values for the fixed effects. It doesn't make much sense to give starting values for the fixed effects because they can be Gibbs sampled in a single pass conditional on the latent variables and the (co)variance components - after a single iteration they would forget their starting values. The cut points in an ordinal model are a different matter. At the moment I do not allow user defined starting values for the cut points, but agree that it may be useful. I am about to release an update that allows spline fitting in MCMCglmm. If it is starting values for the cut points you're really after I can add that in before I release? Cheers, Jarrod On 29 Mar 2010, at 22:06, ping chen wrote: Hi R-users: Can anyone give an example of giving starting values for MCMCglmm? I can't find any anywhere. I have 1 random effect (physicians, and there are 50 of them) and family=ordinal. How can I specify starting values for my fixed effects? It doesn't seem to have the option to do so. Thanks, Ping -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Merge failure using zoo package
The code does not run with the files. I need the requested information, namely a single file containing code and data that I can just copy into a session without editing and see the result you see. On Fri, Apr 2, 2010 at 11:27 AM, e-letter inp...@gmail.com wrote: On 02/04/2010, Gabor Grothendieck ggrothendi...@gmail.com wrote: The files only have one data column. What is the meaning of x[,2], etc.? What is z1? I only want to merge one column from one file with one column from another file. With x[,2], I am trying to select the column of data. Please provide reproducible code and data all in a single file using this style so it's clear what is what. Also please cut down the size of your data to the smallest size that will still illustrate the problem. See other posting; each file that I used is separated by an empty line. This error seems to occur with the data set size as shown in the files. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Exporting Nuopt from splus to R
On 02.04.2010 01:16, Jp2010 wrote: Hi all, Thanks for the wonderful forum with all the valuable help and comments here. I have been a splus user for the past 7 to 8 years and now crossing the mind of changing over to R. Have been doing a lot of reading and one of the main reasons is being an open source and the wonderful things that comes with that. My question is though, is it possible to export any of the function or librarys that come with splus to R.? For my specific situation. Windows platform, if there is a compiled s.dll is there a way we can get this working in R. I would think if it s function or source file it probably can be written without much difficulty in R. But what about the compiled data. I am not a system programmer so don't know much about compiling/ undoing that. From my understanding it is going to be difficult, is that my understanding right.? Thanks If you are talking about an already compiled dll, it is not possible to get it working. You need to recompile the sources and create a new library. Uwe Ligges __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Merge failure using zoo package
On 02/04/2010, Gabor Grothendieck ggrothendi...@gmail.com wrote: The code does not run with the files. I need the requested information, namely a single file containing code and data that I can just copy into a session without editing and see the result you see. I don't understand how I can combine the four csv files into a single file of data and terminal commands. Anyway, further terminal output. The following also occurs with correction of the commands. The data merge is incomplete (to 1:27:30); data set 1 ends at time 1:45:30; data set 2 at 1:31:59.

library(chron)
library(zoo)
z1 <- read.zoo("test1.csv", header=TRUE, sep=",", FUN=times)
z2 <- read.zoo("test2.csv", header=TRUE, sep=",", FUN=times)
z3 <- na.approx(merge(z1[,2], z2[,2]), time(z1))
z3

z1[, 2] z2[, 2] 01:01:01 0.54186455 0.17558467 01:01:30 0.34860813 0.20682491 01:01:42 0.48083615 0.23806514 01:02:00 0.61306418 0.49837120 01:02:23 0.31401158 0.75867726 01:02:30 0.01495898 0.75079270 01:03:00 0.27035612 0.74290813 01:03:06 0.48274755 0.73502357 01:03:30 0.69513898 0.83760282 01:03:50 0.57982828 0.94018206 01:04:00 0.46451758 0.83306352 01:04:30 0.61672569 0.72594497 01:04:35 0.72084346 0.61882643 01:05:00 0.82496122 0.65150068 01:05:21 0.58631138 0.68417492 01:05:30 0.34766154 0.47526482 01:06:00 0.69618714 0.26635471 01:06:08 0.54326964 0.05744461 01:06:30 0.39035214 0.19544428 01:06:55 0.20357679 0.33344394 01:07:00 0.01680143 0.45147127 01:07:30 0.28576967 0.56949860 01:07:44 0.14891191 0.68752593 01:08:00 0.01205416 0.51591885 01:08:30 0.89637254 0.34431177 01:08:33 0.76392454 0.17270469 01:09:00 0.63147653 0.49396296 01:09:23 0.32334896 0.81522124 01:09:30 0.01522139 0.77116200 01:10:00 0.27661960 0.72710276 01:10:03 0.39318042 0.68304352 01:10:30 0.50974124 0.53539217 01:10:43 0.59558051 0.38774082 01:11:00 0.68141977 0.61475486 01:11:23 0.79433915 0.84176890 01:11:30 0.90725854 0.59232742 01:12:00 0.83823443 0.34288594 01:12:04 0.68591842 0.0936 01:12:30 0.53360241 0.11388206 01:12:44 0.35564718 0.13431965 01:13:00 
0.17769196 0.52821343 01:13:25 0.50603906 0.92210721 01:13:30 0.83438616 0.72684026 01:14:00 0.67248807 0.53157330 01:14:06 0.38620370 0.33630635 01:14:30 0.09991933 0.45160464 01:14:47 0.06663450 0.56690294 01:15:00 0.03334966 0.33280555 01:15:29 0.48313660 0.09870816 01:15:30 0.93292355 0.32535246 01:16:00 0.15990837 0.55199675 01:16:11 0.10672443 0.77864105 01:16:30 0.05354050 0.69833773 01:16:53 0.30317627 0.61803441 01:17:00 0.55281203 0.44246870 01:17:30 0.37845690 0.26690299 01:17:35 0.63448528 0.09133728 01:18:00 0.89051365 0.09029608 01:18:17 0.52862829 0.08925487 01:18:30 0.16674292 0.49098302 01:19:00 0.85458626 0.89271117 01:19:30 0.19278550 0.72938430 01:19:42 0.46259477 0.56605742 01:20:00 0.73240405 0.77563138 01:20:25 0.44828965 0.98520534 01:20:30 0.16417524 0.87715304 01:21:00 0.73878212 0.76910073 01:21:08 0.62834165 0.66104843 01:21:30 0.51790118 0.81526716 01:21:51 0.67433278 0.96948589 01:22:00 0.83076438 0.66529956 01:22:30 0.4704 0.36111323 01:22:34 0.24832072 0.05692690 01:23:00 0.02108640 0.38790073 01:23:17 0.42509847 0.71887456 01:23:30 0.82911053 0.43395599 01:24:00 0.9646 0.14903741 01:24:30 0.14493657 0.50736593 01:24:43 0.49457995 0.86569445 01:25:00 0.84422332 0.57246479 01:25:26 0.63006153 0.27923513 01:25:30 0.41589974 0.51404020 01:26:00 0.67606367 0.74884526 01:26:09 0.34106401 0.98365033 01:26:30 0.00606434 0.53336716 01:26:53 0.30279212 0.08308399 01:27:00 0.59951991 0.34562608 01:27:30 0.43949260 0.60816818 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R abrupt exit
Google leads to some discussion on the Intel Software Network: http://software.intel.com/en-us/forums/showthread.php?t=64585 Be warned: I haven't read the discussion. -Peter Ehlers On 2010-04-02 9:30, jacob wrote: Dear Lists: I recently ran into quite an annoying problem while running R on Ubuntu 9.10. While the program is running, the system suddenly exits from the R session with the following warning: ## OMP: Hint: This may cause performance degradation and correctness issues. Set environment variable KMP_DUPLICATE_LIB_OK=TRUE to ignore this problem and force the program to continue anyway. Please note that the use of KMP_DUPLICATE_LIB_OK is unsupported and using it may cause undefined behavior. For more information, please contact Intel(R) Premier Support. Aborted ## I have to restart R, and all the calculations are lost. Following the warning, I set the environment variable to true; no good. I reinstalled R; no good either. I googled the problem, and it seems there has been no R-help discussion of the topic so far. Any suggestions? Could the whole OS have a conflict? I have only one copy of the following file on my computer system: /usr/lib/R/lib/libguide.so Thanks. 
jacob

# options("KMP_DUPLICATE_LIB_OK")
$KMP_DUPLICATE_LIB_OK
[1] TRUE

sessionInfo()
R version 2.10.1 (2009-12-14)
x86_64-pc-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] datasets  grDevices splines   graphics  stats     utils     methods
[8] base

other attached packages:
 [1] Design_2.3-0         Hmisc_3.7-0          survival_2.35-8
 [4] GEOquery_2.11.3      RCurl_1.4-0          bitops_1.0-4.1
 [7] affy_1.24.2          Biobase_2.6.1        preprocessCore_1.8.0
[10] R.methodsS3_1.0.3

loaded via a namespace (and not attached):
[1] affyio_1.14.0  cluster_1.12.1 grid_2.10.1    lattice_0.18-3

-- Peter Ehlers University of Calgary __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
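A side note on the check above: options() reads R options, not environment variables, so an options() call does not show (or change) what the OMP runtime sees. A hedged sketch of inspecting and setting the variable from within R; note that Sys.setenv() only affects the current session and processes it starts, so to affect a library loaded at startup the variable usually has to be exported in the shell before launching R:

```r
## Environment variables are read with Sys.getenv(), not options()
Sys.getenv("KMP_DUPLICATE_LIB_OK")           # "" if unset
Sys.setenv(KMP_DUPLICATE_LIB_OK = "TRUE")    # set for this session only
Sys.getenv("KMP_DUPLICATE_LIB_OK")
```

For the runtime to see it before any threaded library loads, set it in the shell instead, e.g. `export KMP_DUPLICATE_LIB_OK=TRUE` before starting R.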
Re: [R] Merge failure using zoo package
Below is the format that was requested. This has the data followed by the corrected code at the end. There are several things that were wrong: 1. z1[,2] is wrong since z1 is a vector, not a 2d matrix. Ditto for z2. Ideally zoo would have given an error message but in any case its wrong. It should be just z1. 2. The merge works ok but the second argument to na.approx should be xout=time(z1). xout= was missing. 3. there were a number of bugs removed from na.approx in the devel version of zoo. I don't think they impact this but if you have any problems with na.approx then uncomment the source statement in the code below. It will bring in the development version of na.approx into your workspace. Lines1 - time1,dataset1 01:01:00,0.73512097 01:01:30,0.34860813 01:02:00,0.61306418 01:02:30,0.01495898 01:03:00,0.27035612 01:03:30,0.69513898 01:04:00,0.46451758 01:04:30,0.61672569 01:05:00,0.82496122 01:05:30,0.34766154 01:06:00,0.69618714 01:06:30,0.39035214 01:07:00,0.01680143 01:07:30,0.28576967 01:08:00,0.01205416 01:08:30,0.89637254 01:09:00,0.63147653 01:09:30,0.01522139 01:10:00,0.27661960 01:10:30,0.50974124 01:11:00,0.68141977 01:11:30,0.90725854 01:12:00,0.83823443 01:12:30,0.53360241 01:13:00,0.17769196 01:13:30,0.83438616 01:14:00,0.67248807 01:14:30,0.09991933 01:15:00,0.03334966 01:15:30,0.93292355 01:16:00,0.15990837 01:16:30,0.05354050 01:17:00,0.55281203 01:17:30,0.37845690 01:18:00,0.89051365 01:18:30,0.16674292 01:19:00,0.85458626 01:19:30,0.19278550 01:20:00,0.73240405 01:20:30,0.16417524 01:21:00,0.73878212 01:21:30,0.51790118 01:22:00,0.83076438 01:22:30,0.4704 01:23:00,0.02108640 01:23:30,0.82911053 01:24:00,0.9646 01:24:30,0.14493657 01:25:00,0.84422332 01:25:30,0.41589974 01:26:00,0.67606367 01:26:30,0.00606434 01:27:00,0.59951991 01:27:30,0.43949260 01:28:00,0.66297385 01:28:30,0.33131298 01:29:00,0.06102041 01:29:30,0.84722118 01:30:00,0.46841491 01:30:30,0.34200755 01:31:00,0.87386578 01:31:30,0.70737403 01:32:00,0.23978781 01:32:30,0.11787278 
01:33:00,0.14679814 01:33:30,0.65217063 01:34:00,0.81355908 01:34:30,0.31583482 01:35:00,0.92167666 01:35:30,0.55931271 01:36:00,0.13641271 01:36:30,0.35048575 01:37:00,0.17243584 01:37:30,0.93645686 01:38:00,0.85356548 01:38:30,0.61399352 01:39:00,0.05910707 01:39:30,0.01721605 01:40:00,0.94845557 01:40:30,0.48117810 01:41:00,0.34752402 01:41:30,0.59295472 01:42:00,0.64267429 01:42:30,0.57859933 01:43:00,0.00201441 01:43:30,0.32530995 01:44:00,0.25474645 01:44:30,0.93187534 01:45:00,0.99361033 01:45:30,0.16591641 Lines2 - time2,dataset2 01:01:01,0.17558467 01:01:42,0.23806514 01:02:23,0.75867726 01:03:06,0.73502357 01:03:50,0.94018206 01:04:35,0.61882643 01:05:21,0.68417492 01:06:08,0.05744461 01:06:55,0.33344394 01:07:44,0.68752593 01:08:33,0.17270469 01:09:23,0.81522124 01:10:03,0.68304352 01:10:43,0.38774082 01:11:23,0.84176890 01:12:04,0.0936 01:12:44,0.13431965 01:13:25,0.92210721 01:14:06,0.33630635 01:14:47,0.56690294 01:15:29,0.09870816 01:16:11,0.77864105 01:16:53,0.61803441 01:17:35,0.09133728 01:18:17,0.08925487 01:19:00,0.89271117 01:19:42,0.56605742 01:20:25,0.98520534 01:21:08,0.66104843 01:21:51,0.96948589 01:22:34,0.05692690 01:23:17,0.71887456 01:24:00,0.14903741 01:24:43,0.86569445 01:25:26,0.27923513 01:26:09,0.98365033 01:26:53,0.08308399 01:27:36,0.87071027 01:28:19,0.26475705 01:29:03,0.76409811 01:29:47,0.59563256 01:30:31,0.23995054 01:31:14,0.00951054 01:31:59,0.21367270 Lines3 - time1,dataset1 01:02:30,0.01495898 01:03:00,0.27035612 01:03:30,0.69513898 01:04:00,0.46451758 01:04:30,0.61672569 01:05:00,0.82496122 01:05:30,0.34766154 01:06:00,0.69618714 01:06:30,0.39035214 01:07:00,0.01680143 01:07:30,0.28576967 01:08:00,0.01205416 01:08:30,0.89637254 01:09:00,0.63147653 01:09:30,0.01522139 01:10:00,0.27661960 01:10:30,0.50974124 01:11:00,0.68141977 01:11:30,0.90725854 01:12:00,0.83823443 01:12:30,0.53360241 01:13:00,0.17769196 01:13:30,0.83438616 01:14:00,0.67248807 01:14:30,0.09991933 01:15:00,0.03334966 01:15:30,0.93292355 
01:16:00,0.15990837 01:16:30,0.05354050 01:17:00,0.55281203 01:17:30,0.37845690 01:18:00,0.89051365 01:18:30,0.16674292 01:19:00,0.85458626 01:19:30,0.19278550 01:20:00,0.73240405 01:20:30,0.16417524 01:21:00,0.73878212 01:21:30,0.51790118 01:22:00,0.83076438 01:22:30,0.4704 01:23:00,0.02108640 01:23:30,0.82911053 01:24:00,0.9646 01:24:30,0.14493657 01:25:00,0.84422332 01:25:30,0.41589974 01:26:00,0.67606367 01:26:30,0.00606434 01:27:00,0.59951991 01:27:30,0.43949260 01:28:00,0.66297385 01:28:30,0.33131298 01:29:00,0.06102041 01:29:30,0.84722118 01:30:00,0.46841491 01:30:30,0.34200755 01:31:00,0.87386578 01:31:30,0.70737403 01:32:00,0.23978781 01:32:30,0.11787278 01:33:00,0.14679814 01:33:30,0.65217063 01:34:00,0.81355908 01:34:30,0.31583482 01:35:00,0.92167666 01:35:30,0.55931271 01:36:00,0.13641271 01:36:30,0.35048575 01:37:00,0.17243584 01:37:30,0.93645686 01:38:00,0.85356548 01:38:30,0.61399352 01:39:00,0.05910707 01:39:30,0.01721605 01:40:00,0.94845557
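Gabor's two corrections above (use the series itself rather than z1[,2], and pass the target times via the xout= argument of na.approx) can be sketched on a toy pair of series. The values and times below are made up for illustration, not the poster's data:

```r
library(zoo)

z1 <- zoo(c(1, 2, 4), c(10, 20, 40))  # regular series: values at times 10, 20, 40
z2 <- zoo(c(5, 7),    c(15, 35))      # irregular series observed at other times

m <- merge(z1, z2)                    # union of both indexes, gaps filled with NA
na.approx(m, xout = time(z1))         # interpolate, keeping only z1's time points
```

na.approx() does linear interpolation and, by default, does not extrapolate past the last observation of a series, which is consistent with the merged result stopping before the end of the longer data set.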
[R] angles
Hi R users, I would like to construct a sort of hybrid vector/scatter plot. My data is in the following format: a 3-column x,y,z data frame in which every row is a separate data point. The x and y columns are coordinates, and the z column contains orientation data (range 0-180 degrees, with East=0, North=90). I need each x,y point to show the alignment in z. Hence my 'vectors' would simply be lines with their mid-point at x,y and without arrow-heads. R's normal vector plot requires a pair of x,y coords for the start and end of each vector, whereas I just have an orientation. Any ideas? Cheers! Tom -- View this message in context: http://n4.nabble.com/angles-tp1749321p1749321.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Merge failure using zoo package
-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of e-letter Sent: Friday, April 02, 2010 9:20 AM To: Gabor Grothendieck Cc: r-help@r-project.org Subject: Re: [R] Merge failure using zoo package

On 02/04/2010, Gabor Grothendieck ggrothendi...@gmail.com wrote: The code does not run with the files. I need the requested information, namely a single file containing code and data and that I can just copy into a session without editing and see the result you see.

I don't understand how I can combine the four csv files into a single file of data and terminal commands?

One way is to use a call to textConnection() instead of a file name. E.g., if you show a file called data.txt containing the lines

VarA VarB
1 2
3 4

and you read that into R with

data <- read.table("data.txt", header=TRUE)

then the R-helper needs to copy the file contents into an editor, save the file under the appropriate name, then copy the command into an R session. However, you can replace the file and the original read.table command with one command that an R-helper can paste into an R session:

data <- read.table(header=TRUE, textConnection("
VarA VarB
1 2
3 4"))

Another approach is to use dput(data) to print the dataset and stick a 'data <-' on the front of what was printed. E.g., with the above 'data' you can do

dput(data)
structure(list(VarA = c(1L, 3L), VarB = c(2L, 4L)), .Names = c("VarA",
"VarB"), class = "data.frame", row.names = c(NA, -2L))

and send R-help the command

data <- structure(list(VarA = c(1L, 3L), VarB = c(2L, 4L)), .Names = c("VarA",
"VarB"), class = "data.frame", row.names = c(NA, -2L))

You may have to insert some line breaks in sensible positions so that Outlook or Exchange doesn't break lines in nonsensical positions. Again, the R-helper can copy and paste that code into an R session and come up with a dataset identical to yours.

(I've seen copy-n-pastable code in R-help that starts with remove(list=ls()). The suggestion to remove all objects really puts off R-helpers.)

Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com
Re: [R] Adding regression lines to each factor on a plot when using ANCOVA
This is a nice example; thanks for providing it in this form. I tried to trim it down to show fewer groups, but ran into the following errors that I can't understand:

## keep species 1:6
dataset <- subset(dataset, species < 7)
Warning message:
In Ops.factor(species, 7) : '<' not meaningful for factors

## OK, just subset the rows of dataset to keep species 1:6
dataset <- dataset[1:20, ]
ancova(logBeak ~ logMass * species, data=dataset)
Error in `contrasts<-`(`*tmp*`, value = contr.treatment) :
  contrasts can be applied only to factors with 2 or more levels
ancova(logBeak ~ logMass + species, data=dataset)
Error in `contrasts<-`(`*tmp*`, value = contr.treatment) :
  contrasts can be applied only to factors with 2 or more levels

-Michael

RICHARD M. HEIBERGER wrote:

## Steve,
## please use the ancova function in the HH package.
install.packages("HH")
library(HH)
## windows.options(record=TRUE)
windows.options(record=TRUE)

# hypothetical data
beak.lgth <- c(2.3,4.2,2.7,3.4,4.2,4.8,1.9,2.2,1.7,2.5,15,16.5,14.7,9.6,8.5,9.1,
               9.4,17.7,15.6,14,6.8,8.5,9.4,10.5,10.9,11.2,11.5,19,17.2,18.9,
               19.5,19.9,12.6,12.1,12.9,14.1,12.5,15,14.8,4.3,5.7,2.4,3.5,2.9)
mass <- c(45.9,47.1,47.6,17.2,17.9,17.7,44.9,44.8,45.3,44.9,39,39.7,41.2,
          84.8,79.2,78.3,82.8,102.8,107.2,104.1,51.7,45.5,50.6,27.5,26.6,
          27.5,26.9,25.4,23.7,21.7,22.2,23.8,46.9,51.5,49.4,33.4,33.1,33.2,
          34.7,39.3,41.7,40.5,42.7,41.8)

## Make species into a factor
species <- factor(c(1,1,1,2,2,2,3,3,3,3,4,4,4,5,5,5,5,6,6,6,7,7,7,
                    8,8,8,8,9,9,9,9,9,10,10,10,11,11,11,11,12,12,12,12,12))

## then construct a data.frame with the three variables and the log transforms
dataset <- data.frame(species, beak.lgth, mass,
                      logBeak=log10(beak.lgth), logMass=log10(mass))

## default is 7 colors, we need 12
trellis.par.set("superpose.line",
                Rows(trellis.par.get("superpose.line"), c(1:6, 1:6)))
trellis.par.set("superpose.symbol",
                Rows(trellis.par.get("superpose.symbol"), c(1:6, 1:6)))

ancova(logBeak ~ logMass * species, data=dataset)
ancova(logBeak ~ logMass + species, data=dataset)
ancova(logBeak ~ logMass, groups=species, data=dataset)
ancova(logBeak ~ species, x=logMass, data=dataset)
bwplot(logBeak ~ species, data=dataset)

## Rich

-- Michael Friendly Email: friendly AT yorku DOT ca Professor, Psychology Dept. York University Voice: 416 736-5115 x66249 Fax: 416 736-5814 4700 Keele Street http://www.math.yorku.ca/SCS/friendly.html Toronto, ONT M3J 1P3 CANADA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
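Regarding Michael's two errors above: '<' is not defined for factors, and subsetting a factor keeps all of its (now partly empty) levels, which is what trips up contr.treatment. One possible workaround, sketched under the assumption that the species levels are the strings "1" through "12" as in Rich's example:

```r
## '<' is not meaningful for factors, so convert the labels back to numbers first
keep <- as.numeric(as.character(dataset$species)) < 7
dataset6 <- dataset[keep, ]

## subsetting keeps all 12 levels; re-factor so contrasts see only the 6 present
dataset6$species <- factor(dataset6$species)
```

After re-factoring, model formulas such as logBeak ~ logMass * species should no longer complain about factors with fewer than 2 levels.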
Re: [R] angles
Look at the my.symbols function in the TeachingDemos package. You can get line segments by using the ms.arrows function and setting the length argument to 0 (or you can make your own plotting function by copying ms.arrows and replacing the call to arrows with a call to segments). Hope this helps,

-- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Tom_R Sent: Friday, April 02, 2010 10:13 AM To: r-help@r-project.org Subject: [R] angles

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
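If TeachingDemos is not to hand, the same idea works in base graphics: compute each segment's endpoints from the mid-point and the orientation angle. A sketch assuming z is in degrees measured counterclockwise from East, as Tom describes; the function name plot_sticks and the example data are made up for illustration:

```r
plot_sticks <- function(x, y, z, len = 1, ...) {
  theta <- z * pi / 180                 # degrees -> radians
  dx <- cos(theta) * len / 2            # half-offsets from the mid-point
  dy <- sin(theta) * len / 2
  segments(x - dx, y - dy, x + dx, y + dy, ...)
}

set.seed(1)
x <- runif(20); y <- runif(20); z <- runif(20, 0, 180)
plot(x, y, asp = 1)                     # asp = 1 so angles are not distorted
plot_sticks(x, y, z, len = 0.05)
```

Because each line is centered on (x, y) with no arrow-head, a 0-degree and a 180-degree orientation draw the same stick, matching the 0-180 range of the data.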
Re: [R] How to save a model in DB and retrieve It
Look at the serialize function; it may accomplish what you want.

-- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Daniele Amberti Sent: Friday, April 02, 2010 2:37 AM To: r-help@r-project.org; r-sig...@stat.math.ethz.ch Subject: [R] How to save a model in DB and retrieve It

I'm wondering how to save an object (models like lm, loess, etc.) in a DB to retrieve and use it afterwards; an example:

library(RODBC)   # for odbcConnect/sqlQuery below

wind_ms <- abs(rnorm(24*30)*4 + 8)
air_kgm3 <- rnorm(24*30, 0.1)*0.1 + 1.1
wind_dg <- rnorm(24*30) * 360/7
ms <- c(0:25)
kw_mm92 <- c(0,0,0,20,94,205,391,645,979,1375,1795,2000,2040)
kw_mm92 <- c(kw_mm92, rep(2050, length(ms)-length(kw_mm92)))
modelspline <- splinefun(ms, kw_mm92)
kw <- abs(modelspline(wind_ms) - (wind_dg)*2 + (air_kgm3 - 1.15)*300 +
          rnorm(length(wind_ms))*10)
#plot(wind_ms, kw)
windDat <- data.frame(kw, wind_ms, air_kgm3, wind_dg)
windDat[windDat$wind_ms < 3, 'kw'] <- 0
model <- loess(kw ~ wind_ms + air_kgm3 + wind_dg, data = windDat,
               enp.target = 10*5*3) #, span = 0.1)
modX <- serialize(model, connection = NULL, ascii = T)
Channel <- odbcConnect("someSysDSN; UID=aUid; PWD=aPwd")
sqlQuery(Channel, paste(
  "INSERT INTO GRT.GeneratorsModels ([cGeneratorID], [tModel]) VALUES (1, ",
  paste("'", gsub("'", "''", rawToChar(modX)), "'", sep = ""),
  ")", sep = ""))

# Up to this point it is working correctly;
# in the DB I have the modX variable.
# Problems arise retrieving the data (64 kb limit):
strQ <- "SELECT CONVERT(varchar(max), tModel) AS tModel
         FROM GRT.GeneratorsModels WHERE (cGeneratorID = 1)"
x <- sqlQuery(Channel, strQ, stringsAsFactors = F, believeNRows = FALSE)
x <- sqlQuery(Channel, strQ, stringsAsFactors = F, believeNRows = FALSE) # read error

The above code works for simpler models that have a shorter representation in the variable modX. Any advice on how to store and retrieve these kinds of objects?

Thanks
Daniele

ORS Srl Via Agostino Morando 1/3 12060 Roddi (Cn) - Italy Tel. +39 0173 620211 Fax. +39 0173 620299 / +39 0173 433111 Web Site www.ors.it --- Any unauthorized use of this message and its attachments is forbidden and may constitute a criminal offence. If you have received this message in error, we would be grateful if you destroyed it together with any attachments. Opinions, conclusions or other information reported in this e-mail that do not relate to the activities and/or corporate mission of O.R.S. Srl are to be understood as not attributable to the company, nor binding on it in any way. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
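As Greg suggests, serialize()/unserialize() round-trip an R model object through a raw vector, which can be stored in a binary column (BLOB/varbinary(max)) instead of text, sidestepping the varchar truncation above. A minimal sketch of the round trip without the database layer (the cars data set and lm model here are illustrative stand-ins, and nothing below touches ODBC):

```r
## fit a small model and serialize it to a raw vector
fit <- lm(dist ~ speed, data = cars)
raw <- serialize(fit, connection = NULL)   # raw bytes suitable for a BLOB column

## ... write `raw` to the database here, read it back later ...

fit2 <- unserialize(raw)                   # reconstruct the model object
predict(fit2, newdata = data.frame(speed = 10))
```

Keeping the payload binary (ascii = FALSE, the default) is both more compact than the ascii = T text form and avoids quoting/escaping issues in SQL string literals.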
Re: [R] Merge failure using zoo package
On Fri, Apr 2, 2010 at 1:01 PM, William Dunlap wdun...@tibco.com wrote:

> -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of e-letter Sent: Friday, April 02, 2010 9:20 AM To: Gabor Grothendieck Cc: r-help@r-project.org Subject: Re: [R] Merge failure using zoo package
>
> On 02/04/2010, Gabor Grothendieck ggrothendi...@gmail.com wrote: The code does not run with the files. I need the requested information, namely a single file containing code and data and that I can just copy into a session without editing and see the result you see.
>
> I don't understand how I can combine the four csv files into a single file of data and terminal commands?
>
> One way is to use a call to textConnection() instead of a file name. E.g., if you show a file called data.txt containing the lines
>
> VarA VarB
> 1 2
> 3 4
>
> and you read that into R with
>
> data <- read.table("data.txt", header=TRUE)
>
> then the R-helper needs to copy the file contents into an editor, save the file under the appropriate name, then copy the command into an R session. However, you can replace the file and the original read.table command with one command that an R-helper can paste into an R session:
>
> data <- read.table(header=TRUE, textConnection("
> VarA VarB
> 1 2
> 3 4"))

I personally rarely use this form since it makes it harder to transition to the case where you have a file name. I generally prefer the textConnection form.

> Another approach is to use dput(data) to print the dataset and stick a 'data <-' on the front of what was printed. E.g., with the above 'data' you can do
>
> dput(data)
> structure(list(VarA = c(1L, 3L), VarB = c(2L, 4L)), .Names = c("VarA",
> "VarB"), class = "data.frame", row.names = c(NA, -2L))
>
> and send R-help the command
>
> data <- structure(list(VarA = c(1L, 3L), VarB = c(2L, 4L)), .Names = c("VarA",
> "VarB"), class = "data.frame", row.names = c(NA, -2L))

This form is convenient but in some cases it may leave one wondering how the data came about. As long as that is not in question then dput is really nice.

> You may have to insert some line breaks in sensible positions so that Outlook or Exchange doesn't break lines in nonsensical positions. Again, the R-helper can copy and paste that code into an R session and come up with a dataset identical to yours.
>
> (I've seen copy-n-pastable code in R-help that starts with remove(list=ls()). The suggestion to remove all objects really puts off R-helpers.)

Yes, that is unacceptable code to post. It can really cause horrible problems for readers. I personally never use code like this.

> Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com
Re: [R] Trouble loading package
On 28.03.2010 21:20, Peter Ehlers wrote: I haven't seen an answer to this yet. Your problem may stem from having defined a variable T. I can replicate your error messages with:

T <- "hello"
library(RMark)

So methinks that this probably indicates that there may be a problem with using T for TRUE (when will R users finally stop doing that???). And sure enough, after loading RMark (with no T in my workspace), I find that the authors of RMark have replaced base R's .First.lib with their own version which contains the line:

info <- strsplit(library(help = pkgname, character.only = T)$info[[1]], "\\:[ ]+")

Note to RMark authors (and others): get used to using TRUE and FALSE. The few characters saved by using T/F are not worth it!

Let me add: Please note that such code cannot pass R CMD check. In other words, the authors either ignored the error or never checked their package. Such a package could not be shipped through CRAN, for example. Please use the package checks. They are really useful. Best, Uwe Ligges

-Peter Ehlers On 2010-03-26 15:40, Glenn E Stauffer wrote: I am trying to load a package called RMark, but when I run library(RMark) I get the following:

library(RMark)
Error in !character.only : invalid argument type
Error in library(RMark) : .First.lib failed for 'RMark'

When I try to load RMark from the packages menu, I get:

local({pkg <- select.list(sort(.packages(all.available = TRUE)))
+ if(nchar(pkg)) library(pkg, character.only=TRUE)})
Error in !character.only : invalid argument type
Error in library(pkg, character.only = TRUE) : .First.lib failed for 'RMark'

Any ideas what is causing this error? My OS is Windows XP, and my R version is R 2.10.1. Thanks, Glenn * Glenn E. Stauffer Graduate Research Assistant Department of Ecology Montana State University Bozeman, MT 59717 406-994-5677 gestauf...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
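A small demonstration of why Peter's T <- "hello" breaks packages that rely on T: T and F are ordinary variables in the base package bound to TRUE and FALSE, so they can be masked by user objects, while TRUE and FALSE are reserved words and cannot:

```r
T <- "hello"          # masks the built-in binding T -> TRUE
identical(T, TRUE)    # FALSE: code passing character.only = T now gets "hello"

## TRUE <- "hello"    # error: TRUE is reserved and cannot be assigned to

rm(T)                 # removing the mask restores the base binding
identical(T, TRUE)    # TRUE again
```

This is exactly the failure mode in RMark's .First.lib: character.only = T silently became character.only = "hello", producing the "invalid argument type" error.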
Re: [R] R package checking error.
The error message 'F used instead of FALSE' is pretty clear to me ...: Use FALSE rather than F in your code.

Uwe Ligges

On 30.03.2010 07:36, Dong H. Oh wrote: Dear useRs, I am trying to build my package (nonparaeff), which deals with some models of data envelopment analysis. The build worked well, but checking complains when it tests the examples. The zipped nonparaeff.Rcheck is attached. Following is the log.

---
arecibo:tmp arecibo$ R CMD build nonparaeff/
* checking for file 'nonparaeff/DESCRIPTION' ... OK
* preparing 'nonparaeff':
* checking DESCRIPTION meta-information ... OK
* checking whether 'INDEX' is up-to-date ... NO
* use '--force' to overwrite the existing 'INDEX'
* removing junk files
* checking for LF line-endings in source and make files
* checking for empty or unneeded directories
* building 'nonparaeff_0.5-1.tar.gz'
arecibo:tmp arecibo$ R CMD check nonparaeff_0.5-1.tar.gz
* checking for working pdflatex ... OK
* using log directory '/Users/arecibo/tmp/nonparaeff.Rcheck'
* using R version 2.10.0 (2009-10-26)
* using session charset: UTF-8
* checking for file 'nonparaeff/DESCRIPTION' ... OK
* this is package 'nonparaeff' version '0.5-1'
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking for executable files ... OK
* checking whether package 'nonparaeff' can be installed ... WARNING
Found the following significant warnings:
  Warning: package 'geometry' was built under R version 2.10.1
See '/Users/arecibo/tmp/nonparaeff.Rcheck/00install.out' for details.
* checking package directory ... OK
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for non-ASCII characters ... OK
* checking R files for syntax errors ... OK
* checking whether the package can be loaded ... OK
* checking whether the package can be loaded with stated dependencies ... OK
* checking for unstated dependencies in R code ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking R code for possible problems ... NOTE
Found possibly global 'T' or 'F' in the following function:
  ar.dual.dea
* checking Rd files ... NOTE
prepare_Rd: ar.dual.dea.Rd:51: Dropping empty section \seealso
* checking Rd metadata ... OK
* checking Rd cross-references ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* checking examples ... ERROR
Running examples in 'nonparaeff-Ex.R' failed.
The error most likely occurred in:

### * ar.dual.dea
flush(stderr()); flush(stdout())
### Name: ar.dual.dea
### Title: Assurance Region Data Envelopment Aanlysis (AR-DEA)
### Aliases: ar.dual.dea
### Keywords: Data Envelopment Analysis
### ** Examples
## AR constraint of 0.25 <= v2/v1 <= 1.
library(Hmisc)
library(lpSolve)
ar.dat <- data.frame(y = c(1, 1, 1, 1, 1, 1),
+                    x1 = c(2, 3, 6, 3, 6, 6),
+                    x2 = c(5, 3, 1, 8, 4, 2))
(re <- ar.dual.dea(ar.dat, noutput = 1, orientation = 1, rts = 1,
+                  ar.l = matrix(c(0, 0, 0.25, -1, -1, 1), nrow = 2, ncol = 3),
+                  ar.r = c(0, 0), ar.dir = c("<=", "<=")))
Error in ar.dual.dea(ar.dat, noutput = 1, orientation = 1, rts = 1, ar.l = matrix(c(0, :
  F used instead of FALSE
Execution halted
---

Following is sessionInfo():

R version 2.10.0 (2009-10-26)
x86_64-apple-darwin9.8.0

locale:
[1] C/UTF-8/C/C/C/C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] lmtest_0.9-26 zoo_1.5-8     gdata_2.6.1   lpSolve_5.6.4 xtable_1.5-5
[6] MASS_7.3-3

loaded via a namespace (and not attached):
[1] grid_2.10.0     gtools_2.6.1    lattice_0.17-26

Thank you for your time and consideration. 
Best regards, Dong-hyun Oh __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Adding regression lines to each factor on a plot when using ANCOVA
Michael and others,

Here is my complete ancova example: http://astro.ocis.temple.edu/~rmh/HH/hotdog.pdf

This example, especially in Figure 6, places them in the context of a Cartesian product of models, with the intercept having two levels and the slope having three levels. It is based on the ancova chapter from my book:

Heiberger, Richard M., and Burt Holland (2004). Statistical Analysis and Data Display: An Intermediate Course with Examples in S-Plus, R, and SAS. Springer-Verlag, New York. http://springeronline.com/0-387-40270-5

and in essentially this form was used in my chapter:

Heiberger, Richard M., and Burt Holland (2008). "Structured Sets of Graphs." Chapter III.6 (pp. 415-445) in Handbook of Computational Statistics on Data Visualization, edited by Chun-houh Chen, Antony Unwin, and Wolfgang Härdle. Springer-Verlag, Berlin.

Rich
[R] All sub-summands of a vector
Hello,

I'd like to take all possible sub-summands of a vector in the quickest and most efficient way possible. By sub-summands I mean: for each sub-vector, take its sum. Which is to say, if I had the vector x <- 1:4, I'd want the sum of x[1], x[2], etc., then the sum of x[1:2], x[2:3], etc., and so on. The result would be:

1 2 3 4
3 5 7
6 9
10

I can do this with for loops (code below) but for long vectors (10^6 elements) looping takes more time than I'd like. Any suggestions?

Thanks very much in advance--
Andy

# calculate sums of all sub-vectors...
x <- 1:4
sub.vect <- vector("list", 4)
for(t in 1:4) {
  maxi <- 4 - t + 1
  this.sub <- numeric(maxi)
  for(i in 1:maxi) {
    this.sub[i] <- sum(x[i:(i+t-1)])
  }
  sub.vect[[t]] <- this.sub
}
sub.vect
[R] Selecting the first row based on a factor
Hello there,

I have a situation where I would like to select the first row of a particular factor for a data frame (data example below). That is, I would like to select the first entry when factor1 == "A", then the first row when factor1 == "B", etc. I have thousands of entries so I need some general way of doing this. I have a minimal example that should illustrate what I am trying to do. I am using R version 2.9.2, ESS version 5.4 and Ubuntu 9.04. Thanks so much in advance!

Sam

# Minimal example
x <- rnorm(100)
y <- rnorm(100)
xy <- data.frame(x, y)
xy$factor1 <- c("A", "B", "C", "D")
xy$factor2 <- c("a", "b")
xy <- xy[order(xy$factor1), ]  # This simply orders the data to look more like the actual data I am working with

# I am trying to use this approach, but I am not sure that I am selecting the
# correct row, and the output temp is a total mess.
temp <- with(xy, unlist(lapply(split(xy, list(factor1 = factor1, factor2 = factor2)),
                               function(x) x[1, ])))

xy
              x            y factor1 factor2
1   0.700042585 -2.481633101       A       a  # I would like to select this row
5   1.402677849 -0.691143942       A       a
9   0.188287765 -1.723823157       A       a
13  0.714946028  0.715361315       A       a
17  0.690177271 -0.112394002       A       a
21  0.333101579 -0.316285321       A       a
25  0.439505793 -3.356415326       A       a
89 -1.001153334 -0.739440288       A       a
93  0.135509539  0.949943380       A       a
97 -1.730936150  0.356133105       A       a
2  -0.399355582 -0.843874548       B       b  # Then I would like to select this row, etc.
6   1.285958969  0.958501988       B       b
10  0.495795836 -0.805012667       B       b
14  0.512486789 -0.968247016       B       b
18 -1.189627025  0.455278250       B       b

--
Sam Albers
Geography Program
University of Northern British Columbia
University Way
Prince George, British Columbia
Canada, V2N 4Z9
phone: 250 960-6777
Re: [R] All sub-summands of a vector
Hi Andy,

Take a look at the rollapply function in the zoo package.

R> require(zoo)
Loading required package: zoo
R> x <- 1:4
R> rollapply(zoo(x), 1, sum)
1 2 3 4
1 2 3 4
R> rollapply(zoo(x), 2, sum)
1 2 3
3 5 7
R> rollapply(zoo(x), 3, sum)
2 3
6 9
R> rollapply(zoo(x), 4, sum)
 2
10
R> # all at once
R> sapply(1:4, function(r) rollapply(zoo(x), r, sum))

HTH,
Jorge

On Fri, Apr 2, 2010 at 2:24 PM, Andy Rominger wrote:
> Hello, I'd like to take all possible sub-summands of a vector in the
> quickest and most efficient way possible. [...]
Re: [R] Selecting the first row based on a factor
Hello,

Sam Albers wrote:
> I have a situation where I would like to select the first row of a
> particular factor for a data frame (data example below). [...]

Does

xy[!duplicated(xy$factor1), ]

do what you want?
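The one-liner above also generalizes to several grouping columns, because duplicated() accepts a data frame and compares whole rows. A minimal sketch of that extension (the variable names here are hypothetical, not from the thread):

```r
## First row per combination of two grouping columns, via duplicated()
## applied to a data frame of the grouping columns.
df <- data.frame(g1 = c("A", "A", "B", "B"),
                 g2 = c("a", "b", "a", "a"),
                 v  = 1:4)
first.rows <- df[!duplicated(df[c("g1", "g2")]), ]
first.rows$v   # one row per (g1, g2) combination
```

Rows are kept the first time each (g1, g2) pair appears, so the fourth row (a repeat of ("B", "a")) is dropped.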
Re: [R] All sub-summands of a vector
There is also rollmean in the zoo package, which might be slightly faster since it's optimized for that operation:

k * rollmean(x, k)

e.g.

R> 2 * rollmean(1:4, 2)
[1] 3 5 7

will give a rolling sum. runmean in the caTools package is even faster.

On Fri, Apr 2, 2010 at 2:31 PM, Jorge Ivan Velez jorgeivanve...@gmail.com wrote:
> Hi Andy,
> Take a look at the rollapply function in the zoo package. [...]
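Beyond the rolling functions suggested in the thread, all width-k window sums can also be obtained from a single cumulative sum, since sum(x[i:j]) == cs[j] - cs[i-1] with cs = cumsum(x). A minimal sketch of this alternative (not from the thread, just a common trick):

```r
## All width-k window sums of x via cumsum: one O(n) pass per width.
x <- 1:4
cs <- c(0, cumsum(x))                  # cs[i + 1] == sum(x[1:i])
win.sums <- function(k) {
  n <- length(x)
  cs[(k + 1):(n + 1)] - cs[1:(n - k + 1)]
}
win.sums(2)                            # 3 5 7
all.sums <- lapply(seq_along(x), win.sums)  # same shape as sub.vect
```

For a vector of 10^6 elements this avoids per-window summation entirely, at the cost of the usual floating-point caveat that long running sums can accumulate rounding error.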
[R] compare two fingerprint images
Hello,

I want to compare two fingerprint images. How would you do this with R? Is there a function for cross-correlation of images?

Thanks

--
Juan Antonio Gil Pascual
Prof. Titular de Métodos de Investigación en Educación
correo: j...@edu.uned.es
web: www.uned.es/personal/jgil
U.N.E.D. Fac. de Educación, Dpto. MIDE I
Pº Senda del Rey, 7, desp. 122, 28040 MADRID
Tel. 91 398 72 79  Fax 91 398 72 88
Re: [R] Selecting the first row based on a factor
Thanks!

On Fri, Apr 2, 2010 at 11:35 AM, Erik Iverson er...@ccbr.umn.edu wrote:
> Does
>
> xy[!duplicated(xy$factor1), ]

This most definitely works. What a beautifully elegant solution. Thanks!

> do what you want?

--
Sam Albers
Geography Program
University of Northern British Columbia
University Way
Prince George, British Columbia
Canada, V2N 4Z9
phone: 250 960-6777
Re: [R] lineplot.CI in sciplot: option ci.fun can't be changed?
For now, just change fun(x) to median(x) (or whatever) in your ci.fun() below, e.g.

lineplot.CI(x.factor = dose, response = len, data = ToothGrowth,
            ci.fun = function(x) c(mean(x) - 2*se(x), mean(x) + 2*se(x)))

Otherwise, maybe the list members could help with a solution. An example that illustrates the problem:

ex.fn <- function(x, fun = mean, fun2 = function(x) fun(x) + sd(x)) {
  list(fun = fun(x), fun2 = fun2(x))
}
data <- rnorm(10)
ex.fn(data)                                 # works
ex.fn(data, fun = median)                   # works
ex.fn(data, fun2 = function(x) fun(x) + 3)  # error: fun(x) not found

On Fri, 2010-04-02 at 17:36 +0000, Tao Shi wrote:
> hi List and Manuel,
>
> I have encountered the following problem with the function lineplot.CI.
> I'm running R 2.10.1, sciplot 1.0-7 on Win XP. It seems like it's a
> scoping issue, but I couldn't figure it out. Thanks! [...]

--
http://mutualism.williams.edu
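The scoping rule behind this is that a default argument is evaluated inside the function's own frame, where the other arguments (like fun) are visible, while a user-supplied argument is a closure over the caller's environment, where they are not. A minimal sketch of that rule (function and variable names are hypothetical):

```r
## Default vs. user-supplied arguments see different environments.
show.frames <- function(x, f = mean, g = function(x) f(x)) g(x)

# Default g is created in show.frames()'s frame, so it can see f:
r1 <- show.frames(1:4)          # mean(1:4)

# A user-supplied g is created in the caller's frame, where no f exists:
r2 <- tryCatch(show.frames(1:4, g = function(x) f(x)),
               error = function(e) "f not found here")
```

This is why ci.fun = function(x) c(fun(x) - ..., ...) fails in lineplot.CI: the anonymous function is evaluated in the user's workspace, not inside the function where fun lives.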
Re: [R] Sharing levels across multiple factor vectors
Ah, I finally figured it out. I had asked: "In both of those cases, why is the [] needed?" It's because, on the left-hand side of an assignment, the bracket operator attempts to preserve the class and dimensions of the object it's subsetting. (Or at least, that's true when the object is a data frame.)

--
View this message in context: http://n4.nabble.com/Sharing-levels-across-multiple-factor-vectors-tp1747714p1749502.html
Sent from the R help mailing list archive at Nabble.com.
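The difference can be seen directly: plain assignment replaces the whole object, while assignment to df[] fills the existing data frame in place, preserving its class. A small sketch (the data here are hypothetical):

```r
## df[] <- ... preserves the data.frame class; df <- ... does not.
df <- data.frame(a = c("x", "y"), b = c("y", "z"), stringsAsFactors = FALSE)
lv <- c("x", "y", "z")

df2 <- lapply(df, factor, levels = lv)   # plain assignment: result is a list
df[] <- lapply(df, factor, levels = lv)  # bracket assignment: still a data.frame

class(df2)   # "list"
class(df)    # "data.frame", columns now factors sharing the same levels
```

This is the standard idiom for imposing a common set of factor levels across all columns of a data frame.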
[R] lineplot.CI in sciplot: option ci.fun can't be changed?
hi List and Manuel,

I have encountered the following problem with the function lineplot.CI. I'm running R 2.10.1, sciplot 1.0-7 on Win XP. It seems like it's a scoping issue, but I couldn't figure it out. Thanks!

...Tao

lineplot.CI(x.factor = dose, response = len, data = ToothGrowth)                ## fine
lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, fun = median)  ## fine
lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, fun = mean)    ## fine
lineplot.CI(x.factor = dose, response = len, data = ToothGrowth,
            ci.fun = function(x) c(fun(x) - 2*se(x), fun(x) + 2*se(x)))         ## failed!
Error in FUN(X[[1L]], ...) : could not find function "fun"

R> debug(lineplot.CI)
R> lineplot.CI(x.factor = dose, response = len, data = ToothGrowth,
+              ci.fun = function(x) c(fun(x) - 2*se(x), fun(x) + 2*se(x)))
Browse[2]> debug: mn.data <- tapply(response, groups, fun)
Browse[2]> debug: CI.data <- tapply(response, groups, ci.fun)
Browse[2]> fun
function (x) mean(x, na.rm = TRUE)
<environment: 0x07178640>
Browse[2]> ci.fun
function(x) c(fun(x) - 2*se(x), fun(x) + 2*se(x))
Browse[2]> debug(ci.fun)
Browse[2]> debugging in: FUN(X[[1L]], ...)
debug: c(fun(x) - 2 * se(x), fun(x) + 2 * se(x))
Browse[3]> Error in FUN(X[[1L]], ...) : could not find function "fun"

R> undebug(lineplot.CI)
R> lineplot.CI(x.factor = dose, response = len, data = ToothGrowth,
+              ci.fun = function(x) c(fun(x) - se(x), fun(x) + se(x)))
Error in FUN(X[[1L]], ...) : could not find function "fun"
R> lineplot.CI(x.factor = dose, response = len, data = ToothGrowth,
+              fun = function(x) mean(x, na.rm = TRUE),
+              ci.fun = function(x) c(fun(x) - se(x), fun(x) + se(x)))
Error in FUN(X[[1L]], ...) : could not find function "fun"
R> lineplot.CI(x.factor = dose, response = len, data = ToothGrowth,
+              fun = function(x) median(x, na.rm = TRUE),
+              ci.fun = function(x) c(fun(x) - se(x), fun(x) + se(x)))
Error in FUN(X[[1L]], ...) : could not find function "fun"
[R] panel data
Hello,

I have an unbalanced panel data set that looks like:

ID,YEAR,HEIGHT
Tom,2007,65
Tom,2008,66
Mary,2007,45
Mary,2008,50
Harry,2007,62
Harry,2008,62
James,2007,68
Jack,2007,70
Jordan,2008,72

That is, James, Jack, and Jordan are each missing a YEAR. Is there any command that will fill in the missing YEAR so that the end result is balanced and looks like:

ID,YEAR,HEIGHT
Tom,2007,65
Tom,2008,66
Mary,2007,45
Mary,2008,50
Harry,2007,62
Harry,2008,62
James,2007,68
James,2008,NA
Jack,2007,70
Jack,2008,NA
Jordan,2007,NA
Jordan,2008,72

Thank you.

Geoff
--
Geoffrey Smith
Visiting Assistant Professor
Department of Finance
W. P. Carey School of Business
Arizona State University
Re: [R] All sub-summands of a vector
Great, thanks for your help. I tried:

x <- 1:1
y <- lapply(1:1, function(t) { t * runmean(x, t, alg = "fast", endrule = "trim") })

and it worked in about 90 sec.

Thanks again,
Andy

On Fri, Apr 2, 2010 at 3:43 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote:
> There is also rollmean in the zoo package, which might be slightly faster
> since it's optimized for that operation. [...] runmean in the caTools
> package is even faster.
[R] mouse-clicking on xy-plot
Hi,

I seem to recall coming across a function that allowed one to mouse-click on an xy-plot and obtain x and y coordinates. Can anyone remind me of its name?

Thanks,
Nuno
[R] Simple plot of values and error bars: Is there an existing function for this
In an OpenOffice.org forum someone asked if it was possible to plot some raw data and then add a line for the confidence interval. Example at http://www.graphpad.com/help/Prism5/scatter%20-%20grouped.png

While it may be possible to do this in OOo's spreadsheet program, it looks nasty (both to do and in the results). I can do this in R, but I'm not good enough to produce a fairly seamless function to handle it. This must be a fairly common plot, so I was wondering if anyone can point me to an existing function for it? I suspect if I knew a bit more about ggplot2 I could do it there.

Thanks
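No existing function was named in the thread; as one possible starting point, here is a minimal base-graphics sketch (data and layout are hypothetical) that plots the raw points per group, a mean line, and mean ± 2*SE error bars drawn with arrows():

```r
## Raw data per group, plus group means and mean +/- 2*SE error bars.
set.seed(1)
g <- rep(1:3, each = 10)               # three hypothetical groups
y <- rnorm(30, mean = g)               # hypothetical measurements

m  <- tapply(y, g, mean)
se <- tapply(y, g, sd) / sqrt(tapply(y, g, length))

plot(jitter(g, 0.3), y, xaxt = "n", xlab = "group", ylab = "value")
axis(1, at = 1:3)
segments(1:3 - 0.2, m, 1:3 + 0.2, m, lwd = 2)   # mean line per group
arrows(1:3, m - 2 * se, 1:3, m + 2 * se,
       angle = 90, code = 3, length = 0.05)     # capped error bars
```

In ggplot2 the same idea would use geom_point() plus stat_summary(), but the base version above needs no extra packages.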
Re: [R] mouse-clicking on xy-plot
Are you thinking of ?identify ?

Nuno Prista wrote:
> Hi, I seem to recall coming across a function that allowed one to
> mouse-click on an xy-plot and obtain x and y coordinates. Can anyone
> remind me of its name? [...]
[R] my mail to post
nmola...@gmail.com
--
Att: Nicolás Molano
Re: [R] mouse-clicking on xy-plot
You can use identify() to obtain coordinates from plotted points, but if you want arbitrary coordinates you can use locator():

plot(1:10)
loc <- locator(n = 3)
str(loc)
List of 2
 $ x: num [1:3] 2.3 5.4 8.29
 $ y: num [1:3] 6.15 8.33 2.6
points(loc$x, loc$y, col = 2)

Walmes.
--
Walmes Zeviani
Master in Statistics and Agricultural Experimentation
walmeszevi...@hotmail.com, Lavras - MG, Brasil

--
View this message in context: http://n4.nabble.com/mouse-clicking-on-xy-plot-tp1749562p1749586.html
Sent from the R help mailing list archive at Nabble.com.
[R] optim doesnt work with my function
# Hello, I have created this function, but optim doesn't maximize it; it just
# returns the value at the inits.
W <- function(l) {
  w <- rep(0, dim(D)[1])
  for (i in 1:dim(D)[1]) {
    w[i] <- PAitk(D[i, ], D[-i, ], l)
  }
  return(prod(w))
}

# D is a matrix with entries in {0,1}; l is a vector with length(l) == dim(D)[2].
# PAitk is another function, defined as:
PAitk <- function(y, D, lambda) {
  o <- rep(0, dim(D)[1])
  for (i in 1:dim(D)[1]) {
    o[i] <- Aitk(lambda, y, D[i, ])
  }
  return(sum(o) / dim(D)[1])
}

# with the same restriction on l, and
Aitk <- function(l, x, y) {
  prod((l^(1 - abs(x - y))) * ((1 - l)^abs(x - y)))
}

# with the same restriction on l. I want to maximize W in this way:
optim(rep(.75, 5), W, method = "L-BFGS-B",
      lower = rep(0.50001, 5), upper = rep(0., 5),
      control = list(fnscale = -1))

# But as I said, it just returns W's value at the inits rep(.75, 5), or
# whatever you put in. I am grateful for any help you can offer.

--
View this message in context: http://n4.nabble.com/optim-doesnt-work-with-my-function-tp1749591p1749591.html
Sent from the R help mailing list archive at Nabble.com.
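One common reason optim appears stuck on an objective like this is that prod(w) over many terms underflows to 0 (or to a numerically flat region), so the finite-difference gradient is zero and L-BFGS-B never moves from the starting values. The usual remedy is to maximize the sum of logs instead of the product. The sketch below shows the idea on a deliberately simple stand-in likelihood (the data and functions here are hypothetical, not the poster's W):

```r
## Maximizing a product of many terms: work on the log scale.
## Hypothetical Bernoulli data standing in for the poster's matrix D.
set.seed(42)
x <- rbinom(2000, 1, 0.7)

lik    <- function(l) prod(l^x * (1 - l)^(1 - x))   # underflows to 0
loglik <- function(l) sum(x) * log(l) + sum(1 - x) * log(1 - l)

lik(0.7)   # 0: the product scale is numerically useless with 2000 terms
r <- optim(0.5, loglik, method = "L-BFGS-B",
           lower = 1e-6, upper = 1 - 1e-6,
           control = list(fnscale = -1))
r$par      # close to mean(x), the analytic maximizer
```

For the poster's W, the analogous change is to return sum(log(w)) rather than prod(w) (the maximizer is unchanged because log is monotone), though the inner average in PAitk would need a log-sum-exp style treatment if its terms also underflow.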
Re: [R] panel data
On Apr 2, 2010, at 3:39 PM, Geoffrey Smith wrote:
> Hello, I have an unbalanced panel data set ... Is there any command that
> will fill in the missing YEAR such that the end result will be balanced? [...]

It's not one command, but it's an approach ... assumes you have data in a data frame named ftbl:

fexp <- expand.grid(ID = unique(ftbl$ID), YEAR = unique(ftbl$YEAR))
merge(fexp, ftbl, all = TRUE)
       ID YEAR HEIGHT
1   Harry 2007     62
2   Harry 2008     62
3    Jack 2007     70
4    Jack 2008     NA
5   James 2007     68
6   James 2008     NA
7  Jordan 2007     NA
8  Jordan 2008     72
9    Mary 2007     45
10   Mary 2008     50
11    Tom 2007     65
12    Tom 2008     66

--
David Winsemius, MD
West Hartford, CT
[R] model reparameterization
==
y <- c(100, 200, 300, 400, 500)
treatment <- c(1, 2, 3, 3, 4)
block <- c(1, 1, 2, 3, 3)
summary(lm(y ~ as.factor(treatment) + as.factor(block)))
==

The aim is to find a model that can estimate the comparison of treatment 1 with 2, and of treatment 3 with 4. I have tried all the possible relevelings:

===
relevel(as.factor(block), ref = 1); relevel(as.factor(treatment), ref = 1)
relevel(as.factor(block), ref = 1); relevel(as.factor(treatment), ref = 2)
relevel(as.factor(block), ref = 1); relevel(as.factor(treatment), ref = 3)
relevel(as.factor(block), ref = 1); relevel(as.factor(treatment), ref = 4)
relevel(as.factor(block), ref = 2); relevel(as.factor(treatment), ref = 1)
relevel(as.factor(block), ref = 2); relevel(as.factor(treatment), ref = 2)
relevel(as.factor(block), ref = 2); relevel(as.factor(treatment), ref = 3)
relevel(as.factor(block), ref = 2); relevel(as.factor(treatment), ref = 4)
relevel(as.factor(block), ref = 3); relevel(as.factor(treatment), ref = 1)
relevel(as.factor(block), ref = 3); relevel(as.factor(treatment), ref = 2)
relevel(as.factor(block), ref = 3); relevel(as.factor(treatment), ref = 3)
relevel(as.factor(block), ref = 3); relevel(as.factor(treatment), ref = 4)

each followed by a line of

summary(lm(y ~ as.factor(treatment) + as.factor(block)))

but I seem to always get NaNs. Am I doing something wrong? What model should I use then?

Thanks!
casper

--
View this message in context: http://n4.nabble.com/model-reparameterization-tp1749621p1749621.html
Sent from the R help mailing list archive at Nabble.com.
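With these five observations the design is rank-deficient (treatment and block effects are confounded), so no releveling can help; some coefficients will always be inestimable. Given adequate data, the two comparisons can be requested directly through a custom contrast matrix rather than by releveling. A sketch on hypothetical balanced data (the data and effect sizes below are invented for illustration):

```r
## Custom treatment contrasts asking directly for "1 vs 2" and "3 vs 4".
## Hypothetical balanced data, unlike the confounded 5-point example.
set.seed(1)
trt <- factor(rep(1:4, each = 5))
y   <- c(10, 12, 20, 21)[trt] + rnorm(20, sd = 0.1)

cm <- cbind("1vs2"   = c(1,  1, -1, -1) * c(1, -1, 0, 0),
            "3vs4"   = c(0,  0,  1, -1),
            "12vs34" = c(1,  1, -1, -1))
cm[, "1vs2"] <- c(1, -1, 0, 0)           # compare level 1 with level 2
contrasts(trt) <- cm
fit <- lm(y ~ trt)
coef(fit)   # trt1vs2 and trt3vs4 are (half of) the requested mean differences
```

Because the contrast columns are mutually orthogonal here, each coefficient equals half the corresponding difference of group means, e.g. coef "trt1vs2" is (mean1 - mean2)/2.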
Re: [R] [R-sig-DB] How to save a model in DB and retrieve It
A very simple option, since you're only looking to efficiently store and retrieve, is something like a key-value store. There is a new rredis (redis) package on CRAN, as well as the RBerkeley (Oracle Berkeley DB) package. RBerkeley is as simple as db_put() and db_get() calls, where you specify a key and serialize/unserialize the object before and after.

Caveat to RBerkeley: it is only functional on *nix until someone contributes a Windows version, or insight on what I need to do to make that work (the issue is that Berkeley DB can't easily be compiled using the R version of mingw). The package code is likely to work on Windows if you can manage to get the db headers/libs installed with the R toolchain.

HTH
Jeff

On Fri, Apr 2, 2010 at 3:37 AM, Daniele Amberti daniele.ambe...@ors.it wrote:
> I'm wondering how to save an object (models like lm, loess, etc.) in a DB
> to retrieve and use it afterwards. An example:
>
> wind_ms <- abs(rnorm(24*30)*4 + 8)
> air_kgm3 <- rnorm(24*30, 0.1)*0.1 + 1.1
> wind_dg <- rnorm(24*30) * 360/7
> ms <- c(0:25)
> kw_mm92 <- c(0, 0, 0, 20, 94, 205, 391, 645, 979, 1375, 1795, 2000, 2040)
> kw_mm92 <- c(kw_mm92, rep(2050, length(ms) - length(kw_mm92)))
> modelspline <- splinefun(ms, kw_mm92)
> kw <- abs(modelspline(wind_ms) - (wind_dg)*2 + (air_kgm3 - 1.15)*300 +
>           rnorm(length(wind_ms))*10)
> # plot(wind_ms, kw)
> windDat <- data.frame(kw, wind_ms, air_kgm3, wind_dg)
> windDat[windDat$wind_ms < 3, 'kw'] <- 0
> model <- loess(kw ~ wind_ms + air_kgm3 + wind_dg, data = windDat,
>                enp.target = 10*5*3)  # , span = 0.1
> modX <- serialize(model, connection = NULL, ascii = T)
>
> Channel <- odbcConnect("someSysDSN; UID=aUid; PWD=aPwd")
> sqlQuery(Channel, paste(
>   "INSERT INTO GRT.GeneratorsModels ([cGeneratorID], [tModel]) VALUES (1,",
>   paste("'", gsub("'", "''", rawToChar(modX)), "'", sep = ""), ")", sep = ""))
>
> # Up to this point it is working correctly; in the DB I have the modX variable.
> # The problem arises when retrieving the data, with the 64kb limit:
> strQ <- "SELECT CONVERT(varchar(max), tModel) AS tModel
>          FROM GRT.GeneratorsModels WHERE (cGeneratorID = 1)"
> x <- sqlQuery(Channel, strQ, stringsAsFactors = F, believeNRows = FALSE)
> x <- sqlQuery(Channel, strQ, stringsAsFactors = F, believeNRows = FALSE)  # read error
>
> The above code works for simpler models that have a shorter representation
> in the modX variable. Any advice on how to store and retrieve this kind of
> object?
>
> Thanks
> Daniele
>
> ORS Srl, Via Agostino Morando 1/3, 12060 Roddi (Cn) - Italy
> Tel. +39 0173 620211 -- www.ors.it

--
Jeffrey Ryan
jeffrey.r...@insightalgo.com
ia: insight algorithmics
www.insightalgo.com
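Whatever the backing store (redis, Berkeley DB, or an RDBMS blob column), the R side of the pattern is the same serialize()/unserialize() round trip. A minimal sketch with no database involved:

```r
## Serialize a fitted model to raw bytes and restore it. The same raw
## vector would be the value stored under a key in redis/Berkeley DB,
## or in a binary (not varchar) blob column in an RDBMS.
fit <- lm(dist ~ speed, data = cars)

bytes <- serialize(fit, connection = NULL)  # raw vector, suitable for a blob
fit2  <- unserialize(bytes)

# The restored object predicts identically to the original
identical(predict(fit, cars), predict(fit2, cars))
```

Storing the raw bytes in a binary column, rather than converting through character as in the quoted code, also sidesteps quoting/escaping and varchar length limits.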
Re: [R] Adding regression lines to each factor on a plot when using ANCOVA
On 2010-04-02 11:07, Michael Friendly wrote: This is a nice example; thanks for providing it in this form. I tried to trim it down to show fewer groups, but ran into the following errors that I can't understand:

## keep species 1:6
dataset <- subset(dataset, species < 7)
Warning message: In Ops.factor(species, 7) : '<' not meaningful for factors

You could use: as.numeric(as.character(species)) < 7 (I usually keep both a numeric and a factor version of the variable when I foresee doing something like this. Then you can just use the numeric version with '<'.)

## OK, just subset the rows of dataset to keep species 1:6
dataset <- dataset[1:20,]
ancova(logBeak ~ logMass * species, data=dataset)
Error in `contrasts<-`(`*tmp*`, value = contr.treatment) : contrasts can be applied only to factors with 2 or more levels

I don't get this error (in R 2.11.0 devel), but the lattice display doesn't work. I get packet errors which go away after I make sure that the new 'species' has only 6 levels: dataset$species <- factor(dataset$species) I suspect that you may have to do the same. -Peter

ancova(logBeak ~ logMass + species, data=dataset)
Error in `contrasts<-`(`*tmp*`, value = contr.treatment) : contrasts can be applied only to factors with 2 or more levels

-Michael RICHARD M. HEIBERGER wrote: ## Steve, ## please use the ancova function in the HH package.
install.packages("HH")
library(HH)
## windows.options(record=TRUE)
windows.options(record=TRUE)
# hypothetical data
beak.lgth <- c(2.3,4.2,2.7,3.4,4.2,4.8,1.9,2.2,1.7,2.5,15,16.5,14.7,9.6,8.5,9.1,
               9.4,17.7,15.6,14,6.8,8.5,9.4,10.5,10.9,11.2,11.5,19,17.2,18.9,
               19.5,19.9,12.6,12.1,12.9,14.1,12.5,15,14.8,4.3,5.7,2.4,3.5,2.9)
mass <- c(45.9,47.1,47.6,17.2,17.9,17.7,44.9,44.8,45.3,44.9,39,39.7,41.2,
          84.8,79.2,78.3,82.8,102.8,107.2,104.1,51.7,45.5,50.6,27.5,26.6,
          27.5,26.9,25.4,23.7,21.7,22.2,23.8,46.9,51.5,49.4,33.4,33.1,33.2,
          34.7,39.3,41.7,40.5,42.7,41.8)
## Make species into a factor
species <- factor(c(1,1,1,2,2,2,3,3,3,3,4,4,4,5,5,5,5,6,6,6,7,7,7,
                    8,8,8,8,9,9,9,9,9,10,10,10,11,11,11,11,12,12,12,12,12))
## then construct a data.frame with the three variables and the log transforms
dataset <- data.frame(species, beak.lgth, mass, logBeak=log10(beak.lgth), logMass=log10(mass))
## default is 7 colors, we need 12
trellis.par.set("superpose.line", Rows(trellis.par.get("superpose.line"), c(1:6, 1:6)))
trellis.par.set("superpose.symbol", Rows(trellis.par.get("superpose.symbol"), c(1:6, 1:6)))
ancova(logBeak ~ logMass * species, data=dataset)
ancova(logBeak ~ logMass + species, data=dataset)
ancova(logBeak ~ logMass, groups=species, data=dataset)
ancova(logBeak ~ species, x=logMass, data=dataset)
bwplot(logBeak ~ species, data=dataset)
## Rich [[alternative HTML version deleted]] -- Peter Ehlers University of Calgary __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
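The unused-levels issue behind the errors above can be seen without any package (hypothetical toy factor, not the bird data): subsetting a factor drops rows but keeps the old level set, which is what trips up model matrices and lattice packets until factor() is reapplied, as the thread recommends:

```r
f <- factor(c(1, 1, 2, 2, 3))
g <- f[f != "3"]      # the rows for level "3" are gone ...
levels(g)             # ... but "3" is still a level: "1" "2" "3"

g2 <- factor(g)       # reapplying factor() keeps only levels actually present
levels(g2)            # "1" "2"
```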
[R] Extracting 'SOME' values from a linear model.
I have a regression model that I have used on some data, and to compare its accuracy against some other models I have been extracting the same 10% of the real data to perform a sum of squares calculation. This is what I have tried, but it gives me the '0.9*length(t)' fitted value and the 'length(t)' fitted value that I want, but it doesn't give me those in between.

my.lm
my.lm$fitted[c(0.9*length(t), length(t))]

Help please Thanks -- View this message in context: http://n4.nabble.com/Extracting-SOME-values-from-a-linear-model-tp1749643p1749643.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Using GIS data in R
On 2/04/2010, at 4:37 AM, Scott Duke-Sylvester wrote: I have a simple problem: I need to load an ESRI shapefile of US states and check whether or not a set of points are within the boundaries of these states. I have the shapefile, I have the coordinates, but I'm having a great deal of difficulty bringing the two together. The problem is that the various GIS packages for R do not play well with each other: sp, shapefiles, maptools, etc. all use different data structures. Can someone suggest a simple set of commands that will work together to: 1) load the shapefile data; 2) allow me to test whether a (lng,lat) coordinate pair is inside or outside the polygons defined in the shapefile. You may get some mileage out of looking at Adrian Baddeley's vignette ``Handling shapefiles in the spatstat package'' (available at the entry for spatstat under contributed extension packages on CRAN). For item 2) you may find the inside.owin() function in spatstat useful. cheers, Rolf Turner ## Attention:\ This e-mail message is privileged and confid...{{dropped:9}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
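For item 2), inside.owin() in spatstat (or point.in.polygon() in sp) is the practical route; purely for intuition, the even-odd ray-casting test such functions are based on can be sketched in a few lines of base R (toy unit square, not real state polygons):

```r
# Even-odd (ray casting) point-in-polygon test.
# (x, y): the query point; px, py: polygon vertex coordinates in order.
point_in_poly <- function(x, y, px, py) {
  n <- length(px)
  inside <- FALSE
  j <- n
  for (i in seq_len(n)) {
    # Does edge (j, i) straddle the horizontal line through y,
    # and does it cross to the right of x?
    if (((py[i] > y) != (py[j] > y)) &&
        (x < (px[j] - px[i]) * (y - py[i]) / (py[j] - py[i]) + px[i]))
      inside <- !inside
    j <- i
  }
  inside
}

point_in_poly(0.5, 0.5, c(0, 1, 1, 0), c(0, 0, 1, 1))  # TRUE: inside the unit square
point_in_poly(2, 2, c(0, 1, 1, 0), c(0, 0, 1, 1))      # FALSE: outside
```

For real state boundaries with holes and multiple rings, use the package functions rather than this sketch.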
Re: [R] Extracting 'SOME' values from a linear model.
On Apr 2, 2010, at 5:14 PM, HouseBandit wrote: I have a regression model that I have used on some data and to look at its accuracy compared to some other models I have been extracting the same 10% of the real data to perform a sum of squares calculation. This is what I have tried, but it gives me the '0.9*length(t)' fitted value and the 'length(t)' fitted value that I want, but it doesn't give me those in between.

my.lm
my.lm$fitted[c(0.9*length(t), length(t))]

That would index exactly 2 numbers. What was your goal? Try just this at your console: c(0.9*length(t), length(t)) You may want to look at: ?sample -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Extracting 'SOME' values from a linear model.
My goal is to return the selected fitted values and then perform a sum of squares calculation with them. I have looked at 'list' etc. but can't return anything. It's either all of the fitted values or just the first and last of the subset that I need. Cheers -- View this message in context: http://n4.nabble.com/Extracting-SOME-values-from-a-linear-model-tp1749659p1749659.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Cross-validation for parameter selection (glm/logit)
Hi, On Fri, Apr 2, 2010 at 9:14 AM, Jay josip.2...@gmail.com wrote: If my aim is to select a good subset of parameters for my final logit model built using glm(), what is the best way to cross-validate the results so that they are reliable? Let's say that I have a large dataset of 1000's of observations. I split this data into two groups, one that I use for training and another for validation. First I use the training set to build a model, and then stepAIC() with a forward-backward search. BUT, if I base my parameter selection purely on this result, I suppose it will be somewhat skewed due to the one-time data split (I use only 1 training dataset). Another approach would be to use penalized regression models. The glmnet package has lasso and elastic-net models for both logistic and normal regression models. Intuitively: in addition to minimizing (say) the squared loss, the model has to pay some cost (lambda) for including a non-zero parameter in your model, which in turn provides sparse models. You can use CV to fine-tune the value for lambda. If you're not familiar with these penalized models, the glmnet package has a few references to get you started. -steve -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
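On the one-time-split worry raised above: a K-fold assignment that uses every observation for validation exactly once is a few lines of base R (the sizes and the commented glm call are made-up placeholders; cv.glmnet in the glmnet package wraps the same idea when tuning lambda):

```r
set.seed(42)
n <- 1000
K <- 10
folds <- sample(rep(seq_len(K), length.out = n))  # random fold label per row

# For each k: fit on the other folds, evaluate on fold k.
for (k in seq_len(K)) {
  held_out <- which(folds == k)
  train    <- which(folds != k)
  # fit <- glm(y ~ ., data = dat[train, ], family = binomial)  # hypothetical
  # ...score fit on dat[held_out, ] and accumulate the error...
}
table(folds)  # each fold holds n/K = 100 rows
```

Averaging the held-out error over all K folds gives a less split-dependent estimate than a single train/validation partition; repeating the whole procedure with fresh fold labels reduces the dependence further.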
[R] vector length help using prcomp
Hi, I am doing PCA using prcomp, and when I try to get predicted values for the different PCs the number of data points is always one less than in my original data set. This is a problem because it prevents me from doing any post-hoc analysis, due to the fact that my dependent variables are one entry longer than my PCs. I have checked for missing data to see if it is omitting any, but it is not. It seems like it is always omitting the first data point, because the output for the predicted PC values always starts at 2, not 1. Other than that the results of the analysis make sense and it appears to be working correctly. If anyone has any idea why this may be happening I would appreciate some help. This is the script I am using:

chemPR1 <- prcomp(~ ANC + color + CA + pH + TP + volume + maxdepth + meandepth + elevation + surface + shoreline + littoral, center = TRUE, scale=TRUE, scores=TRUE, cor=TRUE)
PC1 <- (predict(ALSC1)[,1])

Thanks. Jason -- View this message in context: http://n4.nabble.com/vector-length-help-using-prcomp-tp1749669p1749669.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
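One likely explanation (an assumption, since the poster's data aren't shown): the prcomp formula interface applies na.action (na.omit by default) and silently drops any row with a missing predictor, so the score matrix comes back shorter than the raw data and its row labels start past 1. A toy demonstration:

```r
set.seed(1)
d <- data.frame(a = rnorm(10), b = rnorm(10))
d$a[1] <- NA                 # a single missing value in row 1

p <- prcomp(~ a + b, data = d, center = TRUE, scale. = TRUE)
nrow(predict(p))             # 9, not 10: row 1 was silently dropped

# Align any dependent variable the same way before post-hoc analysis:
ok <- complete.cases(d)
sum(ok)                      # 9 -- the rows prcomp actually used
```

So the fix is to subset the dependent variables with complete.cases() on the same predictor columns before pairing them with the PC scores.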
Re: [R] Cross-validation for parameter selection (glm/logit)
Inline below: Bert Gunter Genentech Nonclinical Statistics -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Steve Lianoglou Sent: Friday, April 02, 2010 2:34 PM To: Jay Cc: r-help@r-project.org Subject: Re: [R] Cross-validation for parameter selection (glm/logit) Hi, On Fri, Apr 2, 2010 at 9:14 AM, Jay josip.2...@gmail.com wrote: If my aim is to select a good subset of parameters for my final logit model built using glm(). -- Define "good". What is the best way to cross-validate the -- Define "best". results so that they are reliable? -- Define "reliable". Answers depend on what you mean by these terms. I suggest you consult a statistician to work with you. These are huge issues for which you would profit by some guidance. Cheers, Bert Let's say that I have a large dataset of 1000's of observations. I split this data into two groups, one that I use for training and another for validation. First I use the training set to build a model, and then stepAIC() with a forward-backward search. BUT, if I base my parameter selection purely on this result, I suppose it will be somewhat skewed due to the one-time data split (I use only 1 training dataset). Another approach would be to use penalized regression models. The glmnet package has lasso and elastic-net models for both logistic and normal regression models. Intuitively: in addition to minimizing (say) the squared loss, the model has to pay some cost (lambda) for including a non-zero parameter in your model, which in turn provides sparse models. You can use CV to fine-tune the value for lambda. If you're not familiar with these penalized models, the glmnet package has a few references to get you started.
-steve -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Extracting 'SOME' values from a linear model.
On Apr 2, 2010, at 5:32 PM, HouseBandit wrote: my goal is to return the selected fitted values ... Which were never really selected. ... and then perform a sum of squares calculation with them. I have looked at 'list' etc. but can't return anything. It's either all of the fitted values or just the first and last of the subset that I need. A) In the future, don't delete the email train. B) Try this code and see if you can get value out of it:

vec <- 1:100
vec[(length(vec)*0.9):length(vec)]
[1] 90 91 92 93 94 95 96 97 98 99 100

Mind you, this is just a guess at what you wanted, because your original posting seemed unclear as to your goal, at least to my reading. -- David. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Arellano -- how to generate data for y given t
See topic -- View this message in context: http://n4.nabble.com/Arellano-how-to-generate-data-for-y-given-t-tp1749656p1749656.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lineplot.CI in sciplot: option ci.fun can't be changed?
Thanks, Manuel! Subject: Re: lineplot.CI in sciplot: option ci.fun can't be changed? From: manuel.a.mora...@williams.edu To: shi...@hotmail.com CC: r-help@r-project.org; mmora...@williams.edu Date: Fri, 2 Apr 2010 14:22:33 -0400 For now, just change fun(x) to median(x) (or whatever) in your ci.fun() below. E.g. lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, ci.fun= function(x) c(mean(x)-2*se(x), mean(x)+2*se(x))) Otherwise, maybe the list members could help with a solution. An example that illustrates the problem:

ex.fn <- function(x, fun = mean, fun2 = function(x) fun(x)+sd(x)) {
  list(fun=fun(x), fun2=fun2(x))
}
data <- rnorm(10)
ex.fn(data) #works
ex.fn(data, fun=median) #works
ex.fn(data, fun2=function(x) fun(x)+3) #error with fun(x) not found

On Fri, 2010-04-02 at 17:36, Tao Shi wrote: hi List and Manuel, I have encountered the following problem with the function lineplot.CI. I'm running R 2.10.1, sciplot 1.0-7 on Win XP. It seems like it's a scoping issue, but I couldn't figure it out. Thanks! ...Tao

lineplot.CI(x.factor = dose, response = len, data = ToothGrowth) ## fine
lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, fun=median) ## fine
lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, fun=mean) ## fine
lineplot.CI(x.factor = dose, response = len, data = ToothGrow[[elided Hotmail spam]]
Error in FUN(X[[1L]], ...) : could not find function fun

debug(lineplot.CI)
lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, ci.fun= function(x) c(fun(x)-2*se(x), fun(x)+2*se(x)))
Browse[2]> debug: mn.data <- tapply(response, groups, fun)
Browse[2]> debug: CI.data <- tapply(response, groups, ci.fun)
Browse[2]> fun
function (x) mean(x, na.rm = TRUE)
Browse[2]> ci.fun
function(x) c(fun(x)-2*se(x), fun(x)+2*se(x))
Browse[2]> debug(ci.fun)
Browse[2]> fun
function (x) mean(x, na.rm = TRUE)
Browse[2]> debugging in: FUN(X[[1L]], ...)
debug: c(fun(x) - 2 * se(x), fun(x) + 2 * se(x))
Browse[3]> Error in FUN(X[[1L]], ...)
: could not find function fun

undebug(lineplot.CI)
lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, ci.fun= function(x) c(fun(x)-se(x), fun(x)+se(x)))
Error in FUN(X[[1L]], ...) : could not find function fun
lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, fun = function(x) mean(x, na.rm=TRUE), ci.fun= function(x) c(fun(x)-se(x), fun(x)+se(x)))
Error in FUN(X[[1L]], ...) : could not find function fun
lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, fun = function(x) median(x, na.rm=TRUE), ci.fun= function(x) c(fun(x)-se(x), fun(x)+se(x)))
Error in FUN(X[[1L]], ...) : could not find function fun

-- http://mutualism.williams.edu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
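A workaround sketch for the scoping problem above (not a change to sciplot itself): a user-supplied ci.fun is a closure over the environment where it was defined, so the `fun` it refers to must be visible there, not inside lineplot.CI. Binding your own copy of the statistic and closing over it avoids the lookup failure:

```r
# Mimic the failure: the user-supplied lambda refers to `fun`, which
# exists only inside the callee, not where the lambda was defined.
ex.fn <- function(x, fun = mean, fun2 = function(x) fun(x) + sd(x)) {
  list(fun = fun(x), fun2 = fun2(x))
}

res <- try(ex.fn(1:10, fun2 = function(x) fun(x) + 3), silent = TRUE)
inherits(res, "try-error")   # TRUE: `fun` is not found in the caller's env

# Workaround: bind the statistic yourself and close over that binding.
myfun <- median
ok <- ex.fn(1:10, fun = myfun, fun2 = function(x) myfun(x) + 3)
ok$fun2                      # median(1:10) + 3 = 8.5
```

The default fun2 works because default arguments are evaluated inside the function, where `fun` exists; only arguments supplied from outside are evaluated in the caller's environment.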
Re: [R] Extracting 'SOME' values from a linear model.
David Winsemius wrote: On Apr 2, 2010, at 5:32 PM, HouseBandit wrote: my goal is to return the selected fitted values ... Which were never really selected. ... and then perform a sum of squares calculation with them. I have looked at 'list' etc. but can't return anything. It's either all of the fitted values or just the first and last of the subset that I need. A) In the future, don't delete the email train. B) Try this code and see if you can get value out of it:

vec <- 1:100
vec[(length(vec)*0.9):length(vec)]
[1] 90 91 92 93 94 95 96 97 98 99 100

Mind you, this is just a guess at what you wanted, because your original posting seemed unclear as to your goal, at least to my reading. -- David. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Hi, I had just tried something similar and got it working.

my.lm
my.lm.fit <- my.lm$fitted
my.lm.fit[(0.9*length(t)):length(t)]

Thanks for your quick replies though. Cheers -- View this message in context: http://n4.nabble.com/Extracting-SOME-values-from-a-linear-model-tp1749643p1749705.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
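Putting the whole thread together as one runnable sketch on toy data (toy names and sizes; the only subtle point is rounding 0.9*length() to an integer before using it as a subscript):

```r
set.seed(7)
t <- 1:100                 # same variable name the thread uses
y <- 2 * t + rnorm(100)
my.lm <- lm(y ~ t)

idx <- ceiling(0.9 * length(t)):length(t)         # indices of the last 10%
rss_tail <- sum((y[idx] - fitted(my.lm)[idx])^2)  # sum of squares on that slice
length(idx)                # 11 of the 100 fitted values
```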
[R] tetrachoric correlations
Hi, Is there any R library/package that calculates tetrachoric correlations from given marginals and Pearson correlations among ordinal variables? Inputs to polychor function in polycor package are either contingency tables or ordinal data themselves. I am looking for something that takes marginal distributions and Pearson correlation as inputs. For example, Y1=(1,2,3) with P(Y1=1)=0.3, P(Y1=2)=0.5, P(Y1=3)=0.2 and Y2=(1,2) with P(Y2=1)=0.6, P(Y2=2)=0.4, and corr(Y1,Y2)=0.5 (Pearson correlation among ordinal variables) How do I calculate the tetrachoric correlation here? Thanks, Hakan Demirtas [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
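For what it's worth, no function I know of takes exactly those inputs. Under the usual latent-normal model (an assumption: each ordinal variable is a discretized standard normal), the marginals give the thresholds directly; finding the latent correlation that reproduces the stated Pearson correlation then requires bivariate normal rectangle probabilities (e.g. pmvnorm in the mvtnorm package) inside a root finder. The threshold step is plain base R:

```r
# Thresholds that discretize a standard normal into the stated marginals.
p1 <- c(0.3, 0.5, 0.2)   # P(Y1 = 1), P(Y1 = 2), P(Y1 = 3)
p2 <- c(0.6, 0.4)        # P(Y2 = 1), P(Y2 = 2)

thresholds <- function(p) qnorm(cumsum(p)[-length(p)])
t1 <- thresholds(p1)     # two cut points for the 3-level variable
t2 <- thresholds(p2)     # one cut point for the 2-level variable
round(t1, 3)  # -0.524  0.842
round(t2, 3)  #  0.253
```

Given these cut points, the polychoric correlation is the rho for which the implied cell probabilities (and hence the implied Pearson correlation of the ordinal scores) match the stated 0.5.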
[R] compare multiple values with vector and return vector
Dear all, I have a vector, and for each element I want to check whether it is equal to any element of another vector. I want a vector of logical values, with the length of the first one, as the return value. In R this would be:

x <- 1:10
sapply(x, function(y){any(y==c(2,3,4))})
[1] FALSE TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE

It works pretty smoothly, but I have the feeling there's a less complicated way of doing it. My code should be readable by programmers who are not really familiar with R, but I hate to use for-loops as I have pretty huge datasets. Anybody an idea? Thank you in advance. Cheers Joris -- Joris Meys Statistical Consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control Coupure Links 653 B-9000 Gent tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] compare multiple values with vector and return vector
On 3/04/2010, at 11:35 AM, Joris Meys wrote: Dear all, I have a vector, and for each element I want to check whether it is equal to any element of another vector. I want a vector of logical values, with the length of the first one, as the return value. In R this would be:

x <- 1:10
sapply(x, function(y){any(y==c(2,3,4))})
[1] FALSE TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE

It works pretty smoothly, but I have the feeling there's a less complicated way of doing it. My code should be readable by programmers who are not really familiar with R, but I hate to use for-loops as I have pretty huge datasets. Anybody an idea? Thank you in advance. ?"%in%" cheers, Rolf Turner ## Attention: This e-mail message is privileged and confidential. If you are not the intended recipient please delete the message and notify the sender. Any views or opinions presented are solely those of the author. This e-mail has been scanned and cleared by MailMarshal www.marshalsoftware.com ## __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
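Spelling out Rolf's pointer: %in% is vectorized (it is built on match()), returns a logical vector the length of its left operand, and replaces the sapply() one-liner:

```r
x <- 1:10
x %in% c(2, 3, 4)
# [1] FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE
```

It is also fast on large vectors, since match() hashes the lookup table instead of scanning it once per element, which addresses the huge-datasets concern.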
[R] Problems with PDF/Latex when building a package
Dear R People: I'm building a package on an Ubuntu Karmic Koala 9.10 system and am getting the following errors: * checking PDF version of manual ... WARNING LaTeX errors when creating PDF version. This typically indicates Rd problems. LaTeX errors found: ! Font T1/ptm/m/n/10=ptmr8t at 10.0pt not loadable: Metric (TFM) file not found. to be read again relax l.7 \begin{document} ! Font T1/ptm/m/n/24.88=ptmr8t at 24.88pt not loadable: Metric (TFM) file not found. to be read again relax l.8 \chapter*{} ! Font T1/ptm/bx/n/24.88=ptmb8t at 24.88pt not loadable: Metric (TFM) file not found. to be read again relax l.8 \chapter*{} ! Font \T1/ptm/b/n/24.88=nullfont not loadable: Metric (TFM) file not found. to be read again \relax l.8 \chapter*{} ! Font T1/ptm/bx/n/10=ptmb8t at 10.0pt not loadable: Metric (TFM) file not found. to be read again relax l.10 {\textbf{\huge Package `RcmdrPlugin.epack'} ! Font \T1/ptm/b/n/10=nullfont not loadable: Metric (TFM) file not found. to be read again \relax l.10 {\textbf{\huge Package `RcmdrPlugin.epack'} } ! Font T1/ptm/bx/n/20.74=ptmb8t at 20.74pt not loadable: Metric (TFM) file not found. to be read again relax l.10 {\textbf{\huge Package `RcmdrPlugin.epack'} ! Font \T1/ptm/b/n/20.74=nullfont not loadable: Metric (TFM) file not found. to be read again \relax l.10 {\textbf{\huge Package `RcmdrPlugin.epack'} } ! Font T1/ptm/m/n/12=ptmr8t at 12.0pt not loadable: Metric (TFM) file not found. to be read again relax l.11 \par\bigskip{\large ! Font T1/pcr/m/n/10=pcrr8t at 10.0pt not loadable: Metric (TFM) file not found. to be read again relax l.19 ...AsIs{Erin hodgess\email{hodge...@uhd.edu}} ! Font T1/ptm/m/n/14.4=ptmr8t at 14.4pt not loadable: Metric (TFM) file not found. to be read again relax l.33 ...r Plug-In}{RcmdrepackPlugin.Rdash.package} ! Font T1/ptm/bx/n/14.4=ptmb8t at 14.4pt not loadable: Metric (TFM) file not found. to be read again relax l.33 ...r Plug-In}{RcmdrepackPlugin.Rdash.package} ! Font \T1/ptm/b/n/14.4=nullfont not loadable: Metric (TFM) file not found. to be read again \relax l.33 ...r Plug-In}{RcmdrepackPlugin.Rdash.package} ! Font T1/phv/m/n/14.4=phvr8t at 14.4pt not loadable: Metric (TFM) file not found. to be read again relax l.33 ...r Plug-In}{RcmdrepackPlugin.Rdash.package} ! Font T1/ptm/m/it/10=ptmri8t at 10.0pt not loadable: Metric (TFM) file not found. to be read again relax l.33 ...r Plug-In}{RcmdrepackPlugin.Rdash.package} ! Font T1/ptm/m/sl/10=ptmro8t at 10.0pt not loadable: Metric (TFM) file not found. to be read again relax l.62 \end{document} ! Font T1/phv/m/n/10=phvr8t at 10.0pt not loadable: Metric (TFM) file not found. to be read again relax l.62 \end{document} * checking PDF version of manual without index ... ERROR e...@ubuntu:~$ When I run it on another system, it runs fine. Has anyone run into this lately, please? I got the pdflatex via sudo apt-get install pdflatex. Thanks, Erin -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.