Re: [R] optim() not finding optimal values
Derek, As a general strategy, and as an alternative to parscale when using optim, you can just estimate the logarithm of your parameters. So in optim the par argument would contain the logarithm of your parameters, whereas in the model itself you would write exp(par) in the place of the parameter. The purpose of this is to bring all parameters to a similar scale to aid the numerical algorithm in finding the optimum over several dimensions. Due to the functional invariance property of maximum likelihood estimates your transformed pars back to the original scale are also the MLEs of the pars in your model. If you were using ADMB you'd get the standard error of the pars in the original scale simply by declaring them sd_report number class. With optim, you would get the standard error of pars in the original scale post-hoc by using Taylor series (a.k.a. Delta method) which in this case is very simple since the transformation is just the exponential. In relation to your model/data combination, since you have only 17 years of data and just one series of cpue, and this is a rather common case, you may want to give the choice to set B0=K, i.e. equilibrium conditions at the start, in your function, to reduce the dimensionality of your profile likelihood approximation thus helping the optimizer. HTH Dr. Rubén Roa-Ureta AZTI - Tecnalia / Marine Research Unit Txatxarramendi Ugartea z/g 48395 Sukarrieta (Bizkaia) SPAIN -Mensaje original- De: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] En nombre de Derek Ogle Enviado el: sábado, 26 de junio de 2010 22:28 Para: R (r-help@R-project.org) Asunto: [R] optim() not finding optimal values I am trying to use optim() to minimize a sum-of-squared deviations function based upon four parameters. The basic function is defined as ... SPsse - function(par,B,CPE,SSE.only=TRUE) { n - length(B) # get number of years of data B0 - par[B0]# isolate B0 parameter K - par[K] # isolate K parameter q - par[q] # isolate q parameter r - par[r] # isolate r parameter predB - numeric(n) predB[1] - B0 for (i in 2:n) predB[i] - predB[i-1]+r*predB[i-1]*(1-predB[i-1]/K)-B[i-1] predCPE - q*predB sse - sum((CPE-predCPE)^2) if (SSE.only) sse else list(sse=sse,predB=predB,predCPE=predCPE) } My call to optim() looks like this # the data d - data.frame(catch= c(9,113300,155860,181128,198584,198395,139040,109969,71896 ,59314,62300,65343,76990,88606,118016,108250,108674), cpe=c(109.1,112.4,110.5,99.1,84.5,95.7,74.1,70.2,63.1,66.4,60. 5,89.9,117.0,93.0,116.6,90.0,105.1)) pars - c(80,100,0.0001,0.17) # put all parameters into one vector names(pars) - c(B0,K,q,r) # name the parameters ( SPoptim - optim(pars,SPsse,B=d$catch,CPE=d$cpe) )# run optim() This produces parameter estimates, however, that are not at the minimum value of the SPsse function. For example, these parameter estimates produce a smaller SPsse, parsbox - c(732506,1160771,0.0001484,0.4049) names(parsbox) - c(B0,K,q,r) ( res2 - SPsse(parsbox,d$catch,d$cpe,SSE.only=FALSE) ) Setting the starting values near the parameters shown in parsbox even resulted in a movement away from (to a larger SSE) those parameter values. ( SPoptim2 - optim(parsbox,SPsse,B=d$catch,CPE=d$cpe) )# run optim() This issue most likely has to do with my lack of understanding of optimization routines but I'm thinking that it may have to do with the optimization method used, tolerance levels in the optim algorithm, or the shape of the surface being minimized. Ultimately I was hoping to provide an alternative method to fisheries biologists who use Excel's solver routine. If anyone can offer any help or insight into my problem here I would be greatly appreciative. Thank you in advance. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] fatal error: unable to restore saved data
Hi r-help-boun...@r-project.org napsal dne 26.06.2010 00:11:33: Albert - The message refers to a file specifically called .RData. Files with subscripts of .rdata are not related. You can see your current working directory by typing getwd() at the R prompt. I'm not sure where rattle enters into the picture. If .Rdata file was saved when rattle was installed and used for some operation you can get this sort of error. Just install rattle package to R 2.11.1 library and everything shall be OK. Regards Petr - Phil Spector Statistical Computing Facility Department of Statistics UC Berkeley spec...@stat.berkeley.edu On Fri, 25 Jun 2010, Albert Lee, Ph.D. wrote: I just installed the R 2.11.1 version on my computer and I encountered a fatal error: Unable to restore saved data in .RData and kick me out of R right away. I still can run 2.10.2. There is no package called rattle I checked various posts regarding this error. I still can't get it to work. I removed two files that had .rdata extension and still does not work. Any suggestion? Please advise. How do you check your current working directory? Albert Confidentiality Notice: This communication, and any file...{{dropped:12}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Euclidean Distance Matrix Analysis (EDMA) in R?
basicly I am using standart shape package in R, no need additional code for analysis http://cran.r-project.org/web/packages/shapes/index.html http://cran.r-project.org/web/packages/shapes/shapes.pdf anf the main reference is Statistical Shape Analysis (http://www.amazon.com/Statistical-Shape-Analysis-I-Dryden/dp/0471958166) -- View this message in context: http://r.789695.n4.nabble.com/Euclidean-Distance-Matrix-Analysis-EDMA-in-R-tp2266797p2270578.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] package(pls) - extracting explained Y-variance
Christian Jebsen jeb...@rz.uni-leipzig.de writes: Dear R-help users, I'd like to use the R-package pls and want to extract the explained Y-variance to identify the important (PLS-) principal components in my model, related to the y-data. For explained X-variance there is a function: explvar(). If I understand it right, the summary() function gives an overview, where the y-variance is shown, but I can't extract it for plotting. If you look at the summary function (summary.mvr), you will see that it uses the R2 function for this: yve - 100 * drop(R2(object, estimate = train, intercept = FALSE)$val) (For cross-validated or test set validated models, it uses RMSEP.) -- Bjørn-Helge Mevik __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Forecast Package in R: auto.arima function
Hey, I have a few doubts with regard to the usage of the auto.arima function from the forecast package in R. *Background:* I have a set of about 50 time-series for which I would like to estimate the best autroregressive model. (I want to estimate the coefficients and order of p). Each of the series is non-stationary and are also have a non-normal distribution. The data is non-seasonal. My objective is to group these 50 odd time-series into 6-7 groups and apply the same auto-regressive model.(Essentially want a best fit auto-regressive model for each of the groups). For a single time-series if I apply: fit-auto.arima(series1,d=NA,D=0,max.p=6,max.q=0,max.order=6,stationary=F,ic=c(aic),trace=T,allowdrift=F) will the differencing be done internally and the final coefficients for the AR parameters be outputted by the coef(fit) function? Or do I have to make the series stationary before I apply the auto.arima function? I.e if finally my result is something like Coefficients: ar1 ar2 intercept 0.1561 -0.4495 635.1266 s.e. 0.2076 0.196722.0342 for the above command. will my model be yt=0.1561*yt-1 -.4495*yt-2 + 635.1266 only? Kindly shed some light on the above issues. Also if I want my final model to be a composite one involving expoential smoothing and autro-regressive terms what is the best mode of action? Thanks and regards, Kishan -- A. Phani Kishan 3rd Year B.Tech Dept. of Computer Science Engineering IIT MADRAS Ph: +919962363545 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Note on PCA (not directly with R)
Dear all, I am looking for some interactive study materials on Principal component analysis. Basically I would like to know what we are actually doing with PCA? What is happening within the dataset at the time of doing PCA. Probably a 3-dimensional interactive explanation would be best for me. I have gone through some online materials specially Wikipedia etc, however what I need a movable explanation to understand that. Any suggestion please? Thanks for your time [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Plotrix Trick
Yes, I believe I did something along your lines. See the code snippet at the end of the email which sorts everything out as far as I am concerned. Cheers Lorenzo # library(Cairo) library(plotrix) set.seed(1234) myseq - abs(rnorm(25)) myseq[20] - 100 #introduce outlier! A - matrix(myseq, ncol=5,nrow=5) pre - c(A,B,C,D,E) CairoPDF(test_color_scale_log-fixed.pdf) oldpar-par( mar=c(7,4,4,6) , cex.axis=1.4,cex.lab=1.6,cex.main=1.6) testcol-color.gradient(c(0.2,1),c(0.2,0.5),c(0,0),nslices=5) my_min_mat - min(A) col.labels-c(formatC(min(A),format=f,digits=1),formatC(max(A),format=f,digits=1)) color2D.matplot(log(A),main=Title,c(0.2,1),c(0.2,0.5),c(0,0), xlab = , ylab=, show.legend=FALSE, show.values=0,vcol=black,vcex=1, axes=FALSE) axis(1,at=c(0.5,1.5,2.5,3.5,4.5),labels=pre) axis(2,at=rev(c(0.5,1.5,2.5,3.5,4.5)),labels=pre, las=1 ) color.legend(5.3,1,5.8,3,col.labels,testcol,align=rb,gradient=y) for(i in 1:5) { for(j in 1:5) { text(i-0.5,j-0.5,formatC(A[6-j,i],format=f,digits=1),col=white) } } box() par(oldpar) dev.off() On Thu, 2010-06-24 at 12:13 -0400, David Winsemius wrote: On Jun 24, 2010, at 11:38 AM, Lorenzo Isella wrote: Dear Hrishi, I am almost there, thanks. The only small problem left is to convince also the colorbar to plot the values I want. Consider the small snippet at the end of the email: colors and numbers inside the cells are OK, but the legend shows the extremes of the log transformed data instead of the original one. Any suggestions? Why don't you emulate the seconde example of the help page for color2D.matplot and leave out the show legend and then add one after the plotting has been done? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Multiple Treatments Meta-analysis (MTM)
Dear R Users, Is there a package that performs multiple-treatment meta-analysis in R? Also, most of the articles I read that apply MTM are interested in combining direct and indirect effects. I am interested in adjusting for multiple comparisons within the same study (multiple comparison probelm - I think). For example, I am doing a meta-analysis where some RCTs have 3 arms (A, B and control) and both A - Contol and B - Control contribute to the same meta-analysis. Will MTM be able to handle this situation? I will appreciate a good reference that explains the concepts and principles behind MTM (not just an application with a statement and a table of results). I am aware of the other recommended, yet basic, methods for approaching this situation. I am interested in what MTM has to offer. Thanks for your time and efforts. Regards, Pryseley _ Hotmail: Free, trusted and rich email service. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Using if statement on function
Hello everybody, I'm trying to use a if-statment on a function. For a better understanding I want to present a small example: FUN=mean # could also be median,sd or any other function if (FUN == mean) plot(...) if (FUN == median) plot(...) ... This doesn't work, because FUN is a function. I've already tried to coerce the type of FUN with as.character( ), but that's also not possible. I'm stuck with this task and it is absolutely necessary to give FUN the class of a function. I'm looking forward for any hints, clues or solutions for my problem. So much thanks in advance Etienne __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Exponential Smoothing: Forecast package
Hey, I am using the ets() function in the forecast package to find out the best fit parameters for my time-series. I have about 50 sets of time series data. I'm currently using the function as follows: ets(x,model=AZZ,opt.crit=mse) As to my observation about 5-10 of them have been identified by ets to have a trend and an alpha, beta values have been thrown up - which have been same in all these cases. When I read up online it came up as a Brown's double exponential smoothing as opposed to Holt's exponential smoothing (where alpha and beta differ). I am guessing this is happening as AIC/AICc/BIC select a model based on accuracy as well as a weight on number of parameters (1 in case of brown's, 2 in case of holt's). Now if I want to see results of the best parameters from the Holt's method, how should I go about it? And is there any study comparing the accuracy of brown's double exponential model versus holt's exponential model? Thanks in advance, Phani -- A. Phani Kishan 3rd Year B.Tech Dept. of Computer Science Engineering IIT MADRAS Ph: +919962363545 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Using if statement on function
If I understand the problem properly, you want something like this: function(FUN, ...) { FunName - deparse(substitute(FUN)) if(FunName == mean) { ... } else if(FunName == median) { ... } } Using 'switch' is an alternative to 'if'. On 28/06/2010 10:50, Etienne Stockhausen wrote: Hello everybody, I'm trying to use a if-statment on a function. For a better understanding I want to present a small example: FUN=mean # could also be median,sd or any other function if (FUN == mean) plot(...) if (FUN == median) plot(...) ... This doesn't work, because FUN is a function. I've already tried to coerce the type of FUN with as.character( ), but that's also not possible. I'm stuck with this task and it is absolutely necessary to give FUN the class of a function. I'm looking forward for any hints, clues or solutions for my problem. So much thanks in advance Etienne __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Patrick Burns pbu...@pburns.seanet.com http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno') __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Using if statement on function
You could also consider isTRUE(all.equal(FUN, mean)) isTRUE(all.equal(mean, mean)) [1] TRUE isTRUE(all.equal(mean, median)) [1] FALSE HTH Keith J Patrick Burns pbu...@pburns.seanet.com wrote in message news:4c28777f.1040...@pburns.seanet.com... If I understand the problem properly, you want something like this: function(FUN, ...) { FunName - deparse(substitute(FUN)) if(FunName == mean) { ... } else if(FunName == median) { ... } } Using 'switch' is an alternative to 'if'. On 28/06/2010 10:50, Etienne Stockhausen wrote: Hello everybody, I'm trying to use a if-statment on a function. For a better understanding I want to present a small example: FUN=mean # could also be median,sd or any other function if (FUN == mean) plot(...) if (FUN == median) plot(...) ... This doesn't work, because FUN is a function. I've already tried to coerce the type of FUN with as.character( ), but that's also not possible. I'm stuck with this task and it is absolutely necessary to give FUN the class of a function. I'm looking forward for any hints, clues or solutions for my problem. So much thanks in advance Etienne __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Patrick Burns pbu...@pburns.seanet.com http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno') __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Offscreen rendering in RGL?
On 27 Jun 2010, at 22:19, Duncan Murdoch wrote: On 27/06/2010 12:58 PM, Matthew Neilson wrote: Hi there, I've written a script for reading 3D simulation data into R, rendering it using RGL, and then saving the resulting plot using the snapshot3d() function. The results are fantastic! However, whenever RGL plots anything it automatically brings the viewing window into focus. Since I'm producing a large number of plots in a loop, my machine becomes almost unusable for the duration of the script. When producing 2D plots in R (i.e. not using RGL), I can easily call the pdf() function before each plot (and then close it with the dev.off() function) so that the plot is written directly to a file, thus bypassing the display. This allows me to set scripts running in the background, so that I can get on with other things. ;) Is there a way of forcing RGL to draw to an invisible (virtual, or buffered?) display that can then be saved using the snapshot3d() function? rgl can't do that, but perhaps your OS can, e.g. you set up an X11 server that doesn't display anything on your screen. I don't know if that's possible or not. You can avoid bringing the window to the top by setting the top argument to FALSE when you call snapshot3d, but what I found when doing this on many systems was that I got a snapshot showing the overlapping window, not just the contents of the rgl window. What happens on your system will depend on your graphics driver. You might also be able to tell rgl (via r3dDefaults) to open the window mostly off your screen. I don't know if you'll get a useful snapshot from it. Thanks for the suggestions! I tried launching a vnc server on (for example) display :4, and then using Sys.setenv(DISPLAY=:4) at the beginning of my script. Unfortunately, vncserver doesn't support GLX, so upon calling open3d() I'm presented with an error message and the R session terminates. Setting top=FALSE in each RGL-related statement does indeed prevent the window from being brought to the top, but as you said this results in unpredictable output -- in my case, the plot only takes up about a quarter of the plotting window and I'm left with masses of white-space. Fortunately, your final suggestion suits my needs perfectly! I simply need to set $windowRect such that the top/right corner of the RGL device is at the extreme bottom/left corner of my display. This frees up 99% of my display, and I don't see any flickering/redrawing (because the only visible portion of the window is the right edge of its title-bar). Note that at least some of the window must remain on- screen -- if $windowRect is set such that the entire window is off- screen, the windowing system becomes messed up. Thanks again for the help, it's much appreciated! -Matt __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] sampling one random frame from each unique trial?
Hi: Try this: do.call(rbind, lapply(split(h, h$file), function(x) x[sample(1:nrow(x), 1), ])) My test returns file time_pred distance_1 distance_2 12.03.08_ins_odo_01 12.03.08_ins_odo_01 210 19.003 18.023 12.03.08_ins_odo_02 12.03.08_ins_odo_0290 13.668 12.950 12.03.08_ins_odo_03 12.03.08_ins_odo_03 120 21.220 26.370 12.03.08_ins_odo_07 12.03.08_ins_odo_07 180 16.301 19.976 distance_3 12.03.08_ins_odo_01 14.666 12.03.08_ins_odo_02 13.506 12.03.08_ins_odo_03 23.962 12.03.08_ins_odo_07 25.309 The function does the following: (1) Splits the data frame into a list, where each component of the list is a sub-data frame. (2) Applies the (anonymous) sampling function to each list component (lapply) (3) Combines the individual outputs together using the rbind function (do.call) Since this is the raison d'etre of the plyr package, one can also use library(plyr) ddply(d, 'file', function(x) x[sample(1:nrow(x), 1), ]) file time_pred distance_1 distance_2 distance_3 1 12.03.08_ins_odo_01 270 15.694 9.285 4.135 2 12.03.08_ins_odo_02 270 17.252 18.235 18.661 3 12.03.08_ins_odo_03 240 18.117 19.111 19.870 4 12.03.08_ins_odo_0790 19.790 23.276 18.678 (Your results may vary, but you do get one row per file as output.) HTH, Dennis On Sun, Jun 27, 2010 at 6:16 PM, Kristiina Hurme kristiina.hu...@uconn.eduwrote: hello everyone. please bear with me if this is very easy... I have a data set with many trials, and frames within each trial. I would like to pull out one random frame from each trial. here is an example. i have 4 unique trials (file), and various frames within each (time_pred). I would like to randomly sample 4 rows, but 1 from each trial (file). this sample data is called h file time_pred distance_1 distance_2 distance_3 1 12.03.08_ins_odo_01 210 19.003 18.023 14.666 2 12.03.08_ins_odo_01 240 23.905 20.087 17.266 3 12.03.08_ins_odo_01 270 15.694 9.285 4.135 4 12.03.08_ins_odo_02 0 22.142 16.061 14.776 5 12.03.08_ins_odo_0230 2.968 12.533 19.696 6 12.03.08_ins_odo_0260 6.175 17.701 20.198 7 12.03.08_ins_odo_0290 13.668 12.950 13.506 8 12.03.08_ins_odo_02 120 7.098 17.817 22.878 9 12.03.08_ins_odo_02 270 17.252 18.235 18.661 10 12.03.08_ins_odo_02 300 7.967 15.944 8.130 11 12.03.08_ins_odo_0390 18.724 17.931 21.148 12 12.03.08_ins_odo_03 120 21.220 26.370 23.962 13 12.03.08_ins_odo_03 150 21.225 24.376 20.194 14 12.03.08_ins_odo_03 180 22.298 24.119 24.606 15 12.03.08_ins_odo_03 210 8.413 14.464 15.219 16 12.03.08_ins_odo_03 240 18.117 19.111 19.870 17 12.03.08_ins_odo_0760 24.063 25.779 24.800 18 12.03.08_ins_odo_0790 19.790 23.276 18.678 19 12.03.08_ins_odo_07 120 15.617 23.707 19.545 20 12.03.08_ins_odo_07 150 24.818 22.373 24.515 21 12.03.08_ins_odo_07 180 16.301 19.976 25.309 22 12.03.08_ins_odo_07 210 23.843 24.772 26.025 23 12.03.08_ins_odo_07 240 9.029 15.125 20.139 24 12.03.08_ins_odo_07 270 6.533 22.833 23.618 here is my code so far... random -for(i in unique(file)){h[sample(1:24,1),]} random but this only gives me one sample... and if I try to exclude naming it as random, then nothing comes up. i'm confused and very new to R. please help! many many thanks! kristiina -- View this message in context: http://r.789695.n4.nabble.com/sampling-one-random-frame-from-each-unique-trial-tp2270396p2270396.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Using if statement on function
On 28/06/2010 5:50 AM, Etienne Stockhausen wrote: Hello everybody, I'm trying to use a if-statment on a function. For a better understanding I want to present a small example: FUN=mean # could also be median,sd or any other function if (FUN == mean) plot(...) if (FUN == median) plot(...) ... This doesn't work, because FUN is a function. I've already tried to coerce the type of FUN with as.character( ), but that's also not possible. I'm stuck with this task and it is absolutely necessary to give FUN the class of a function. I'm looking forward for any hints, clues or solutions for my problem. You should use identical() to compare two functions: FUN - mean identical(FUN, mean) [1] TRUE identical(FUN, median) [1] FALSE all.equal() is probably sufficient for your needs, but it will ignore some small differences. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Axes intercept
I have a plot where the values of the y axis go from a positive number to a negative number and I want the x axis to intercept at zero rather than at the bottom of the y axis, regardless of its value. Can anyone help me to do this? Thanks in advance Vivien Vivien Kent MSc Oxon PhD candidate Evolutionary Anthropology Research Group Department of Anthropology Durham University Dawson Building South Road Durham DH1 3LE United Kingdom Email 1: v.t.k...@durham.ac.uk Email 2: vivien.k...@gmail.com Web: www.graduatejunction.com/academic/vivien_kent [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Seasonality - Centered MA vs. Holt-Winters
Hello, I asked this question on the r-finance list server and didn't get a reply. Thought I would try here to. I am trying to deseasonalize some financial time series data and I wanted some feedback on the best methods for doing this. I found two Centered Moving Average and Holt-Winters. Which is better and/or more appropriate for financial time series data in your opinion? I understand that Holt-Winters is a type of exponential smoothing and I have a complete description of how to calculate Centered Moving Averages (seasonal indices), but no discussion on which is more appropriate for certain situations. Any information on either (or other methods that you feel are better) would be great! Thanks! -- Ben Nachtrieb M.S.F. Investments, University of Denver Beta Gamma Sigma member [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Popularity of R, SAS, SPSS, Stata...
Muenchen, Robert A (Bob) wrote: -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Muenchen, Robert A (Bob) Sent: Friday, June 25, 2010 3:08 PM To: Joris Meys; Dario Solari Cc: r-help@r-project.org Subject: Re: [R] Popularity of R, SAS, SPSS, Stata... I had taken the opposite tack with Google Trends by subtracting keywords like: SAS -shoes -airlines -sonar... but never got as good results as that beautiful X code for search. When you see the end-of-semester panic bumps in traffic, you know you're nailing it! I have to eat those words already. The R code for search that showed a peak every December did not have quotes around it, so it was searching for those three words not the complete phrase. When you add the quotes, the peaks vanish. Once you go the phrase route, you gain precision but end up with zero counts on various phrases. I avoided that by combining them with + to get enough to plot. The resulting graph shows SAS dominant until mid-2006 when SPSS takes the top position, followed by R, SAS, Stata in order: http://www.google.com/insights/search/#q=%22r%20code%20for%22%2B%22r%20m anual%22%2B%22r%20tutorial%22%2B%22r%20graph%22%2C%22sas%20code%20for%22 %2B%22sas%20manual%22%2B%22sas%20tutorial%22%2B%22sas%20graph%22%2C%22sp ss%20code%20for%22%2B%22spss%20manual%22%2B%22spss%20tutorial%22%2B%22sp ss%20graph%22%2C%22stata%20code%20for%22%2B%22stata%20manual%22%2B%22sta ta%20tutorial%22%2B%22stata%20graph%22%2C%22s-plus%20code%20for%22%2B%22 s-plus%20manual%22%2Bs-plus%20tutorial%22%2B%22s-plus%20graph%22cmpt=q This might be a good one to add to http://r4stats.com/popularity It is also something that people considering a position with a CRO or Pharma might want to look at (at least for clinical trials). Looks to me like the outsourcing of stats jobs continues... Best, Jim Bob I see that there's a car, the R Code Mustang, that adding for gets rid of. Thanks for getting me back on a topic that I had given up on! Bob -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Joris Meys Sent: Thursday, June 24, 2010 7:56 PM To: Dario Solari Cc: r-help@r-project.org Subject: Re: [R] Popularity of R, SAS, SPSS, Stata... Nice idea, but quite sensitive to search terms, if you compare your result on ... code with ... code for: http://www.google.com/insights/search/#q=r%20code%20for%2Csas%20code%2 0 f or%2Cspss%20code%20forcmpt=q On Thu, Jun 24, 2010 at 10:48 PM, Dario Solari dario.sol...@gmail.com wrote: First: excuse for my english My opinion: a useful font for measuring popoularity can be Google Insights for Search - http://www.google.com/insights/search/# Every person using a software like R, SAS, SPSS needs first to learn it. So probably he make a web-search for a manual, a tutorial, a guide. One can measure the share of this kind of serach query. This kind of results can be useful to determine trends of popularity. Example 1: R tutorial/manual/guide, SAS tutorial/manual/guide, SPSS tutorial/manual/guide http://www.google.com/insights/search/#q=%22r%20tutorial%22%2B%22r%20m a n ual%22%2B%22r%20guide%22%2B%22r%20vignette%22%2C%22spss%20tutorial%22% 2 B %22spss%20manual%22%2B%22spss%20guide%22%2C%22sas%20tutorial%22%2B%22s a s %20manual%22%2B%22sas%20guide%22cmpt=q Example 2: R software, SAS software, SPSS software http://www.google.com/insights/search/#q=%22r%20software%22%2C%22spss% 2 0 software%22%2C%22sas%20software%22cmpt=q Example 3: R code, SAS code, SPSS code http://www.google.com/insights/search/#q=%22r%20code%22%2C%22spss%20co d e %22%2C%22sas%20code%22cmpt=q Example 4: R graph, SAS graph, SPSS graph http://www.google.com/insights/search/#q=%22r%20graph%22%2C%22spss%20g r a ph%22%2C%22sas%20graph%22cmpt=q Example 5: R regression, SAS regression, SPSS regression http://www.google.com/insights/search/#q=%22r%20regression%22%2C%22sps s % 20regression%22%2C%22sas%20regression%22cmpt=q Some example are cross-software (learning needs - Example1), other can be biased by the tarditional use of that software (in SPSS usually you don't manipulate graph, i think) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide
Re: [R] optim() not finding optimal values
Ruben, Transforming the parameters is also a good idea, but the obvious caveat is that the transformation must be feasible. The log-transformation is only feasible for positive parameter domain. This happens to be the case for OP's problem. In fact, the log-transform does better than ratio scaling in terms of improving rate of convergence for this particular model/data combination. SPsse.log - function(par,B,CPE,SSE.only=TRUE) { n - length(B) # get number of years of data par - tan(par) # log-transformation B0 - par[B0]# isolate B0 parameter K - par[K] # isolate K parameter q - par[q] # isolate q parameter r - par[r] # isolate r parameter predB - numeric(n) predB[1] - B0 for (i in 2:n) predB[i] - predB[i-1]+r*predB[i-1]*(1-predB[i-1]/K)-B[i-1] predCPE - q*predB sse - sum((CPE-predCPE)^2) if (SSE.only) sse else list(sse=sse,predB=predB,predCPE=predCPE) } SPoptim3 - optim(log(parsbox),SPsse.log,B=d$catch,CPE=d$cpe, method=BFGS) SPoptim3 $par B0 K q r 13.5035475 13.9634098 -8.8142230 -0.9030033 $value [1] 1619.481 $counts function gradient 546 $convergence [1] 0 $message NULL exp(SPoptim3$par) # transforming to original scale B0Kqr 7.320086e+05 1.159396e+06 1.486044e-04 4.053505e-01 I don't think there is a need to set B0=K since convergence is pretty good with scaling and/or with log-transformation. Best, Ravi. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Rubén Roa Sent: Monday, June 28, 2010 2:24 AM To: Derek Ogle; R (r-help@R-project.org) Subject: Re: [R] optim() not finding optimal values Derek, As a general strategy, and as an alternative to parscale when using optim, you can just estimate the logarithm of your parameters. So in optim the par argument would contain the logarithm of your parameters, whereas in the model itself you would write exp(par) in the place of the parameter. The purpose of this is to bring all parameters to a similar scale to aid the numerical algorithm in finding the optimum over several dimensions. Due to the functional invariance property of maximum likelihood estimates your transformed pars back to the original scale are also the MLEs of the pars in your model. If you were using ADMB you'd get the standard error of the pars in the original scale simply by declaring them sd_report number class. With optim, you would get the standard error of pars in the original scale post-hoc by using Taylor series (a.k.a. Delta method) which in this case is very simple since the transformation is just the exponential. In relation to your model/data combination, since you have only 17 years of data and just one series of cpue, and this is a rather common case, you may want to give the choice to set B0=K, i.e. equilibrium conditions at the start, in your function, to reduce the dimensionality of your profile likelihood approximation thus helping the optimizer. HTH Dr. Rubén Roa-Ureta AZTI - Tecnalia / Marine Research Unit Txatxarramendi Ugartea z/g 48395 Sukarrieta (Bizkaia) SPAIN -Mensaje original- De: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] En nombre de Derek Ogle Enviado el: sábado, 26 de junio de 2010 22:28 Para: R (r-help@R-project.org) Asunto: [R] optim() not finding optimal values I am trying to use optim() to minimize a sum-of-squared deviations function based upon four parameters. The basic function is defined as ... SPsse - function(par,B,CPE,SSE.only=TRUE) { n - length(B) # get number of years of data B0 - par[B0]# isolate B0 parameter K - par[K] # isolate K parameter q - par[q] # isolate q parameter r - par[r] # isolate r parameter predB - numeric(n) predB[1] - B0 for (i in 2:n) predB[i] - predB[i-1]+r*predB[i-1]*(1-predB[i-1]/K)-B[i-1] predCPE - q*predB sse - sum((CPE-predCPE)^2) if (SSE.only) sse else list(sse=sse,predB=predB,predCPE=predCPE) } My call to optim() looks like this # the data d - data.frame(catch= c(9,113300,155860,181128,198584,198395,139040,109969,71896 ,59314,62300,65343,76990,88606,118016,108250,108674), cpe=c(109.1,112.4,110.5,99.1,84.5,95.7,74.1,70.2,63.1,66.4,60. 5,89.9,117.0,93.0,116.6,90.0,105.1)) pars - c(80,100,0.0001,0.17) # put all parameters into one vector names(pars) - c(B0,K,q,r) # name the parameters ( SPoptim - optim(pars,SPsse,B=d$catch,CPE=d$cpe) )# run optim() This produces
[R] (New) Popularity of R, SAS, SPSS, Stata...
Greeting Listserv Readers, At http://r4stats.com/popularity I have added plots, data, and/or discussion of: 1. Scholarly impact of each package across the years 2. The number of subscribers to some of the listservs 3. How popular each package is among Google searches across the years 4. Survey results from a Rexer Analytics poll 5. Survey results from a KDnuggests poll 6. A rudimentary analysis of the software skills that employers are seeking Thanks very much to all the folks who helped on this project including: John Fox, Marc Schwartz, Duncan Murdoch, Martin Weiss, John (Jiangtang) HU, Andre Wielki, Kjetil Halvorsen, Dario Solari, Joris Meys, Keo Ormsby, Karl Rexer, and Gregory Piatetsky-Shapiro. If anyone can think of other angles, please let me know. Cheers, Bob = Bob Muenchen (pronounced Min'-chen), Manager Research Computing Support Voice: (865) 974-5230 Email: muenc...@utk.edu Web: http://oit.utk.edu/research, News: http://oit.utk.edu/research/news.php Feedback: http://oit.utk.edu/feedback/ = __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] optim() not finding optimal values
Oops, I my previous email, the second line in the `SPsse.log' function should have been: par - exp(par) # log-transformation Ravi. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Ravi Varadhan Sent: Monday, June 28, 2010 9:48 AM To: 'Rubén Roa'; 'Derek Ogle'; 'R' Subject: Re: [R] optim() not finding optimal values Ruben, Transforming the parameters is also a good idea, but the obvious caveat is that the transformation must be feasible. The log-transformation is only feasible for positive parameter domain. This happens to be the case for OP's problem. In fact, the log-transform does better than ratio scaling in terms of improving rate of convergence for this particular model/data combination. SPsse.log - function(par,B,CPE,SSE.only=TRUE) { n - length(B) # get number of years of data par - tan(par) # log-transformation B0 - par[B0]# isolate B0 parameter K - par[K] # isolate K parameter q - par[q] # isolate q parameter r - par[r] # isolate r parameter predB - numeric(n) predB[1] - B0 for (i in 2:n) predB[i] - predB[i-1]+r*predB[i-1]*(1-predB[i-1]/K)-B[i-1] predCPE - q*predB sse - sum((CPE-predCPE)^2) if (SSE.only) sse else list(sse=sse,predB=predB,predCPE=predCPE) } SPoptim3 - optim(log(parsbox),SPsse.log,B=d$catch,CPE=d$cpe, method=BFGS) SPoptim3 $par B0 K q r 13.5035475 13.9634098 -8.8142230 -0.9030033 $value [1] 1619.481 $counts function gradient 546 $convergence [1] 0 $message NULL exp(SPoptim3$par) # transforming to original scale B0Kqr 7.320086e+05 1.159396e+06 1.486044e-04 4.053505e-01 I don't think there is a need to set B0=K since convergence is pretty good with scaling and/or with log-transformation. Best, Ravi. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Rubén Roa Sent: Monday, June 28, 2010 2:24 AM To: Derek Ogle; R (r-help@R-project.org) Subject: Re: [R] optim() not finding optimal values Derek, As a general strategy, and as an alternative to parscale when using optim, you can just estimate the logarithm of your parameters. So in optim the par argument would contain the logarithm of your parameters, whereas in the model itself you would write exp(par) in the place of the parameter. The purpose of this is to bring all parameters to a similar scale to aid the numerical algorithm in finding the optimum over several dimensions. Due to the functional invariance property of maximum likelihood estimates your transformed pars back to the original scale are also the MLEs of the pars in your model. If you were using ADMB you'd get the standard error of the pars in the original scale simply by declaring them sd_report number class. With optim, you would get the standard error of pars in the original scale post-hoc by using Taylor series (a.k.a. Delta method) which in this case is very simple since the transformation is just the exponential. In relation to your model/data combination, since you have only 17 years of data and just one series of cpue, and this is a rather common case, you may want to give the choice to set B0=K, i.e. equilibrium conditions at the start, in your function, to reduce the dimensionality of your profile likelihood approximation thus helping the optimizer. HTH Dr. Rubén Roa-Ureta AZTI - Tecnalia / Marine Research Unit Txatxarramendi Ugartea z/g 48395 Sukarrieta (Bizkaia) SPAIN -Mensaje original- De: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] En nombre de Derek Ogle Enviado el: sábado, 26 de junio de 2010 22:28 Para: R (r-help@R-project.org) Asunto: [R] optim() not finding optimal values I am trying to use optim() to minimize a sum-of-squared deviations function based upon four parameters. The basic function is defined as ... SPsse - function(par,B,CPE,SSE.only=TRUE) { n - length(B) # get number of years of data B0 - par[B0]# isolate B0 parameter K - par[K] # isolate K parameter q - par[q] # isolate q parameter r - par[r] # isolate r parameter predB - numeric(n) predB[1] - B0 for (i in 2:n) predB[i] - predB[i-1]+r*predB[i-1]*(1-predB[i-1]/K)-B[i-1] predCPE - q*predB sse - sum((CPE-predCPE)^2) if (SSE.only) sse else list(sse=sse,predB=predB,predCPE=predCPE) } My call to optim() looks like this # the data d - data.frame(catch= c(9,113300,155860,181128,198584,198395,139040,109969,71896
Re: [R] Averaging half hourly data to hourly
Hi Jenny, Date: Thu, 24 Jun 2010 09:45:10 +0100 From: Jennifer Wright j.wright...@sms.ed.ac.uk To: r-help@r-project.org Subject: [R] Averaging half hourly data to hourly Message-ID: 4c231b16.1080...@sms.ed.ac.uk Content-Type: text/plain; charset=ISO-8859-1; format=flowed Hi all, I have some time-series data in half hourly time steps and I want to be able to either average or sum the two half hours together into hours. Any help greatly appreciated. Thanks, Jenny -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. See period.apply and endpoints in the xts package: require(xts) (x - .xts(1:10,(1:10)*1800)) [,1] 1969-12-31 18:30:001 1969-12-31 19:00:002 1969-12-31 19:30:003 1969-12-31 20:00:004 1969-12-31 20:30:005 1969-12-31 21:00:006 1969-12-31 21:30:007 1969-12-31 22:00:008 1969-12-31 22:30:009 1969-12-31 23:00:00 10 period.apply(x, endpoints(x,hours), sum) [,1] 1969-12-31 18:30:001 1969-12-31 19:30:005 1969-12-31 20:30:009 1969-12-31 21:30:00 13 1969-12-31 22:30:00 17 1969-12-31 23:00:00 10 period.apply(x, endpoints(x,hours), mean) [,1] 1969-12-31 18:30:00 1.0 1969-12-31 19:30:00 2.5 1969-12-31 20:30:00 4.5 1969-12-31 21:30:00 6.5 1969-12-31 22:30:00 8.5 1969-12-31 23:00:00 10.0 Best, -- Joshua Ulrich FOSS Trading: www.fosstrading.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Forecast Package: MAPE as criteria
Hey, The ets function of the forecast package has options to chose only between mse, amse,lik and nmse as the criteria for model selection. However I want MAPE to be the criteria. Is there any implementation of this already? Else if I have to write a function for myself, which is the best way to go about it? Thank, Phani -- A. Phani Kishan 3rd Year B.Tech Dept. of Computer Science Engineering IIT MADRAS Ph: +919962363545 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] (New) Popularity of R, SAS, SPSS, Stata...
Dear Robert, I've tried to acces that link, but to no prevail. Seems the server r4stats.com is down, as he doesn't respond. This link got me to the site : http://sites.google.com/site/r4statistics/popularity Cheers Joris On Mon, Jun 28, 2010 at 3:52 PM, Muenchen, Robert A (Bob) muenc...@utk.edu wrote: Greeting Listserv Readers, At http://r4stats.com/popularity I have added plots, data, and/or discussion of: 1. Scholarly impact of each package across the years 2. The number of subscribers to some of the listservs 3. How popular each package is among Google searches across the years 4. Survey results from a Rexer Analytics poll 5. Survey results from a KDnuggests poll 6. A rudimentary analysis of the software skills that employers are seeking Thanks very much to all the folks who helped on this project including: John Fox, Marc Schwartz, Duncan Murdoch, Martin Weiss, John (Jiangtang) HU, Andre Wielki, Kjetil Halvorsen, Dario Solari, Joris Meys, Keo Ormsby, Karl Rexer, and Gregory Piatetsky-Shapiro. If anyone can think of other angles, please let me know. Cheers, Bob = Bob Muenchen (pronounced Min'-chen), Manager Research Computing Support Voice: (865) 974-5230 Email: muenc...@utk.edu Web: http://oit.utk.edu/research, News: http://oit.utk.edu/research/news.php Feedback: http://oit.utk.edu/feedback/ = __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] mhplot error with test example: ylim not found
It seems to me that gap::mhtplot needs a fix. You might want to contact the maintainer (cc'd). In the meantime, you should be able to place an object ylim in your workspace before calling the function: ylim - c(0, 10) mhtplot(test, ylim = c(0, 10)) Of course, you could also just fixt the function (it's short and the fix is easy). -Peter Ehlers On 2010-06-23 8:47, vaneet wrote: Hello all, I am trying to make a genome association plot for p-values related to SNPs and was fortunate to find that R contains a package that produces Manhattan plots which is what's preferred for my current project. The function mhtplot() is found in the 'gap' package which I installed in R 2.11.1 on Windows. I thought I'd test out the function first with the examples they give in the documentation: # foo example test- matrix(c(1,1,4,1,1,6,1,10,3,2,1,5,2,2,6,2,4,8),byrow=TRUE,6) mhtplot(test) mhtplot(test,logscale=F) # fake example with Affy500k data affy-c(40220, 41400, 33801, 32334, 32056, 31470, 25835, 27457, 22864, 28501, 26273, 24954, 19188, 15721, 14356, 15309, 11281, 14881, 6399, 12400, 7125, 6207) CM- cumsum(affy) n.markers- sum(affy) n.chr- length(affy) test- data.frame(chr=rep(1:n.chr,affy),pos=1:n.markers,p=runif(n.markers)) # to reduce size of the plot # bitmap(mhtplot.bmp,res=72*5) oldpar- par() par(las=2,cex=0.6) colors- rep(c(blue,green),11) mhtplot(test,colors=colors,pch=19,bg=colors) title(A simulated example according to EPIC-Norfolk QCed SNPs) When I run either of the examples the following results: Error in mhtplot() : object 'ylim' not found Even though I'm not sure why it would require this parameter, when I tried to fill in the parameter (ylim = c(0,10)) the same error results. I am not sure how to get around this. Vaneet __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Axes intercept
- Original Message - From: KENT V.T. v.t.k...@durham.ac.uk To: r-help@r-project.org Sent: Monday, June 28, 2010 8:28 AM Subject: [R] Axes intercept I have a plot where the values of the y axis go from a positive number to a negative number and I want the x axis to intercept at zero rather than at the bottom of the y axis, regardless of its value. Can anyone help me to do this? ?plot ?axis As a toy example: y=rnorm(100) x=abs(y) plot(x, y, axes=FALSE) axis(1, pos=0) axis(2) box() Thanks in advance Vivien Vivien Kent MSc Oxon PhD candidate Evolutionary Anthropology Research Group Department of Anthropology Durham University Dawson Building South Road Durham DH1 3LE United Kingdom Email 1: v.t.k...@durham.ac.uk Email 2: vivien.k...@gmail.com Web: www.graduatejunction.com/academic/vivien_kent [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] linear predicted values of the index function in an ordered probit model
Hello, currently I am estimating an ordered probit model with the function polr (MASS package). Is there a simple way to obtain values for the prediction of the index function ($X*\hat{\beta}$)? (E..g. in the GLM function there is the linear.prediction value for this purpose). If not, is there another function / package where this feature is implemented? Thank you very much for your answer in advance! Best, Martin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Stacked Histogram, multiple lines for dates of news stories?
Dear colleagues, I have extracted the dates of several news stories from a newspaper data base to chart coverage trends of an issue over time. They are in a data frame that looks just like one generated by the reproducible code below. I can already generate a histogram of the dates with various intervals (months, quarters, weeks years) using hist.Date. However, there are two other things I'd like to do. First, I'd like to either create a stacked histogram so that one could see whether one newspaper really pushed coverage of an issue at a certain point while others then followed later on in time. Second, or alternatively, I would like to do a line graph of the same data for the different papers to represent the same trends. I guess what I'm finding challenging is that I don't have counts of the number of stories on each day or in each week or in each month; I just have the dates themselves. The date.Hist command was very useful in turning those into bins, but I'd like to push it a bit further and to a stacked histogram or a multiple line chart. Can anyone suggest a way to go about doing this? I should say, I played around in Hadley Wickham's ggplot package and looked at his website, and there is a way to render multiple lines here: http://had.co.nz/ggplot2/scale_date.html but it was not clear to me how to plot just the dates or an index of the dates as I don't have a value for the y axis, other than the number of times a story was published in that time frame. Regardless, I hope someone can suggest something. Yours, Simon J. Kiss test=sample(1:3, 50, replace=TRUE) test=as.factor(test) levels(test)=c(Star, Globe and Mail, Post) test2=ISOdatetime(sample(2004:2009, 50, replace=TRUE), sample(1:12, size=50, replace=TRUE), sample(1:30, 50, replace=TRUE), 0,0,0) test2=as.Date(test2) test_df=data.frame(test, test2) * Simon J. Kiss, PhD SSHRC and DAAD Post-Doctoral Fellow John F. Kennedy Institute of North America Studies Free University of Berlin Lansstraße 7-9 14195 Berlin, Germany Cell: +49 (0)1525-300-2812, Web: http://www.jfki.fu-berlin.de/index.html __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Ways to work with R and Postgres
On Mon, Jun 28, 2010 at 9:55 AM, 顾小波 guxiaobo1...@gmail.com wrote: Hi Gabor, The package dependency path is RpgSQL- RJDBC - rJava, but it seems this is not a Windows 64bit rJava package. Regards. Xiaobo.Gu Make sure you are using a version that supports 64 bit Windows. See the rJava news file: http://www.rforge.net/rJava/news.html For followups on rJava the following list is the one to use: http://mailman.rz.uni-augsburg.de/mailman/listinfo/stats-rosuda-devel __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] fatal error: unable to restore saved data
I had the same problem sometime back and could not settled it out in factory condition. Then onwards I run R from command prompt and it works property. A little bit cumbersome work for me as double clicking on desktop icon doesn't work. Arun, -- View this message in context: http://r.789695.n4.nabble.com/fatal-error-unable-to-restore-saved-data-tp2268985p2270844.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Ways to work with R and Postgres
Hi Gabor, The package dependency path is RpgSQL- RJDBC - rJava, but it seems this is not a Windows 64bit rJava package. Regards. Xiaobo.Gu -Original Message- From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] Sent: Sunday, June 27, 2010 12:51 PM To: 顾小波 Cc: r-help@r-project.org Subject: Re: [R] Ways to work with R and Postgres 2010/6/27 顾小波 guxiaobo1...@gmail.com: Hi, I post this message to the general r-help list hoping anyone within a wider range have suggestions: There are three ways to integration R and postgres, especially on 64bit Microsoft windows Platform, 1. via RODBC package, which has 32 bit and 64 bit version for windows 2. via RPostgres interface, which only has 32bit version currently 3. via plr for Greenplum, which only supports a few kinds of functionality, and supports only specific versions of R. Do you have any idea about the advantages and disadvantages of each, and the differences among them There is also the RpgSQL package. In addition the sqldf package uses RpgSQL. sqldf by default uses SQLite but if the RpgSQL package is loaded then it defaults to PostgreSQL. Here BOD Is a built in R data.frame: library(sqldf) Loading required package: DBI Loading required package: RSQLite Loading required package: RSQLite.extfuns Loading required package: gsubfn Loading required package: proto Loading required package: chron library(RpgSQL) Loading required package: RJDBC BOD Time demand 118.3 22 10.3 33 19.0 44 16.0 55 15.6 67 19.8 sqldf('select regr_slope(demand, Time) slope, + regr_intercept(demand, Time) intercept, + corr(demand, Time) corr from BOD') Loading required package: tcltk Loading Tcl/Tk interface ... done slope intercept corr 1 1.721429 8.521429 0.8030693 coef(lm(demand ~ Time, BOD)); cor(BOD$Time, BOD$demand) (Intercept)Time 8.5214291.721429 [1] 0.8030693 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] distance matrix?
I have a vector 0 to 10 and want to create a matrix with the differences between the numbers in it for instance: 0 1 2 3 4 5 6 7 8 9 10 0 0 1 2 3 4 5 6 7 8 9 10 1 1 0 1 2 3 4 5 6 789 2 3 4 5 6 7 8 9 10 Etc etc. So that the matrix is filled with the differences between in absolute value so there are no negatives. Any ideas? Thanks -- View this message in context: http://r.789695.n4.nabble.com/distance-matrix-tp2270722p2270722.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Export Results
Thanks for your sugestions. But when I do wdGet(T) I have de next message. wdGet(T) Error in if (!(tmp[[ActiveDocument]][[Name]] == filename)) tmp$Open(paste(path, : argument is of length zero What is happen? Thaks for all -- View this message in context: http://r.789695.n4.nabble.com/Export-Results-tp2268622p2270745.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Creating quarterly data
Dear R Experts, I have data in the following format x1 x2 time 2 4 1 3 1 2 4 6 3 1 4 4 5 6 5 8 5 6 . . . . . . . . . 1 5 399 3 4 400 Time represents each month which simply has a number from 1 to 400 (i.e. the data covers 400 months). I would like to somehow create two new variables which counts x1 and x2 as quarterly data in stead of monthly. Like this: x1 x1 Quarter 10 12 1 12 15 2 . . . . . . . . . 23 21 100 Can any one tell me how this can be done in R? With kind regards, Thomas Jensen __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Subtraction loop
I have a data frame with 2 columns, one for day and one for average. The day starts at 97 all the way to 279. I want to subtract day 98 average- day 97 average, then day99 average -day 98 average and so on down my list, creating another column with the subtracted results. I have: Day DailyAverage 1970.6076782 2980.7121478 3990.8059347 4 1000.9545806 5 1011.0589791 6 1021.1335981 7 1031.2676922 8 1041.4429847 9 1051.6477266 10 1061.7970784 etc. with values to day 279 I'm guessing I need to create some kind of loop for this, but I am unsure as to how to do this. Any help is greatly appreciated. Thanks, Emilija __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] mathematical expression in varnames of lattice parallel plot
How can I insert mathematical expressions for variable names in a lattice parallel plot? I tried to implement mathematical expressions in varnames, however, without success. For example, neither parallel(~iris[1:4] | Species, iris, varnames=c(P[Width], Petal[length], alpha[Width], Sepal[Length])) nor parallel(~iris[1:4] | Species, iris, varnames=c(expression(P[Width]), expression(Petal[length]), expression(alpha[Width]), expression(Sepal[Length]))) result in the desired labeling of the different axes of the parallel plot. Thank you very much. Regards, Doris -- --- Dr.-Ing. Doris Brockmann Post-doctoral researcher INRA - Laboratoire de Biotechnologie de l'Environnement (LBE) Avenue des Etangs, Narbonne, F-11100, France Tel : (33)(0)4 68 42 51 88 (standard : 51 51) Fax : (33)(0)4 68 42 51 60 email: brock...@supagro.inra.fr http://www.montpellier.inra.fr/narbonne __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Axes intercept
try: plot(..., axes=FALSE) axis(1, pos=0) On Mon, Jun 28, 2010 at 9:28 AM, KENT V.T. v.t.k...@durham.ac.uk wrote: I have a plot where the values of the y axis go from a positive number to a negative number and I want the x axis to intercept at zero rather than at the bottom of the y axis, regardless of its value. Can anyone help me to do this? Thanks in advance Vivien Vivien Kent MSc Oxon PhD candidate Evolutionary Anthropology Research Group Department of Anthropology Durham University Dawson Building South Road Durham DH1 3LE United Kingdom Email 1: v.t.k...@durham.ac.uk Email 2: vivien.k...@gmail.com Web: www.graduatejunction.com/academic/vivien_kent [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Subtraction loop
Hello, Check the function diff() it can do it for you, no need for a loop. Cheers, Jonas Mandel ecvet...@uwaterloo.ca a écrit : I have a data frame with 2 columns, one for day and one for average. The day starts at 97 all the way to 279. I want to subtract day 98 average- day 97 average, then day99 average -day 98 average and so on down my list, creating another column with the subtracted results. I have: Day DailyAverage 1970.6076782 2980.7121478 3990.8059347 4 1000.9545806 5 1011.0589791 6 1021.1335981 7 1031.2676922 8 1041.4429847 9 1051.6477266 10 1061.7970784 etc. with values to day 279 I'm guessing I need to create some kind of loop for this, but I am unsure as to how to do this. Any help is greatly appreciated. Thanks, Emilija __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] integration of two normal density
Inline Below Bert Gunter Genentech Nonclinical Biostatistics -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of bill.venab...@csiro.au Sent: Friday, June 25, 2010 10:53 PM To: carrieands...@gmail.com; R-help@r-project.org Subject: Re: [R] integration of two normal density Your intuition is wrong and R is right. Why should the product of two probability density functions be a normalized pdf also? -- as is trivially seen with two uniforms on [0,2], with pdf= 1/2, product = 1/4 on [0,2] . -- Bert -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Carrie Li Sent: Saturday, 26 June 2010 1:28 PM To: r-help Subject: [R] integration of two normal density Hello everyone, I have a question about integration of two density function Intuitively, I think the value after integration should be 1, but they are not. Am I missing something here ? t - function(y){dnorm(y, mean=3)*dnorm(y/2, mean=1.5)} integrate(t, -Inf, Inf) 0.3568248 with absolute error 4.9e-06 Also, is there any R function or package could do multivariate integration ? Thanks for any suggestions! Carrie [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Basic question - more efficient method than loop?
I'm guessing there's a more efficient way to do the following using the index features of R. Appreciate any thoughts for (i in 1:nrow(dbs1)){ if(dbs1$Payor[i] %in% Payor.Group.Medicaid) dbs1$Payor.Group[i] = Medicaid if(dbs1$Payor[i] %in% Payor.Group.Medicare) dbs1$Payor.Group[i] = Medicare if(dbs1$Payor[i] %in% Payor.Group.Commercial) dbs1$Payor.Group[i] = Commercial if(dbs1$Payor[i] %in% Payor.Group.Workers.Comp) dbs1$Payor.Group[i] = Workers Comp if(dbs1$Payor[i] %in% Payor.Group.Self.Pay) dbs1$Payor.Group[i] = Self Pay if(dbs1$Adm.Source[i] %in% Adm.Source.Group.Newborn) dbs1$Adm.Source.Group[i] = Newborn if(dbs1$Adm.Source[i] %in% Adm.Source.Group.ED) dbs1$Adm.Source.Group[i] = ED if(dbs1$Adm.Source[i] %in% Adm.Source.Group.Routine) dbs1$Adm.Source.Group[i] = Routine if(dbs1$Adm.Source[i] %in% Adm.Source.Group.Transfer) dbs1$Adm.Source.Group[i] = Transfer } -- View this message in context: http://r.789695.n4.nabble.com/Basic-question-more-efficient-method-than-loop-tp2271096p2271096.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] optim() not finding optimal values
If you are going to make this program available for general use you want to take every precaution to make it bulletproof. This is a fairly informative data set. The model will undoubtedly be used on far less informative data. While the model looks pretty simple it is very challenging from a numerical point of view. I took a moment to code it up in AD Model Builder. The true minimum is 1619.480495 So I think Ravi has finally arrived pretty close to the answer. One way of judging the difficulty of a model is to look at the eigenvalues of the Hessian at the minimum. They are 3.122884668e-09 1.410866202e-08 1866282.520 1.330233652e+13 so the condition number is around 1.e+21. One begins to see why these models are challenging. The model as formulated represents the state of the art in fisheries models circa 1985. A lot of progress has been made since that time. Using B_t for the biomass and C_t for the catch the equation in the code is B_{t+1} = B_t + r *B_t*(1-B_t/K) -C_t (1) First notice that for (1) to make sense the following conditions must be satisfied B_t 0 for all t r 0 K0 Strictly speaking it is not necessary that B_t=K but if B_tK and r is large then B_{t+1} could be 0. So formulation (1) gives Murphys law a good chance. How to fix it. Notice that (1) is really a rough approximation to the solution of a differential equation B'(t) = r *B(t)*(1-B(t)/K) -C (2) where in (2) C is a constant catch rate. To fix (1) we use a semi-implicit differencing scheme. Because it is useful to allow smaller step sizes than one we denote them by d. B_{t+d} = B_t + d* r *B_t*(1-B_{t+d}/K) -d*C_t*B_{t+d}/B_t (1) The idea is that the quantity 1-x with x0 will be replaced by 1/(1+x). Expanding 2 and solving for B_{t+d} yields B_{t+d} = (1+d*r) B_t / (1+d*r*B_t/K +d*C_t/B_t) (3) So long as r0, K0 C_t0 then starting from an initial value B_0 0 ensures that B_t 0 for all t0. We can let d=1/nsteps where nsteps is the number of steps in the approximate integration for each year which can be increased until the solution is judged to be close enough to the exact solution from (2) Notice that in (3) as C_t -- infinity B_{t+d} -- 0 So that you can never catch more fish than you have. I coded up this version of the model in AD Model Builder and fit it to the data. It is now much more resistant to bad starting values for the parameters etc. If anyone wants the tpl file for the model in ADMB they can contact me off list. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Creating quarterly data
One approach: d - data.frame(x1=c(2,3,4,1,5,8), x2=c(4,1,6,4,6,5), time=1:6) d$quarter - (d$time-1) %/% 4 # Or whatever your logic is aggregate(cbind(x1,x2) ~ quarter, data = d, sum) # quarter x1 x2 # 1 0 10 15 # 2 1 13 11 Hope this helps Allan On 28/06/10 13:23, Thomas Jensen wrote: Dear R Experts, I have data in the following format x1x2time 241 312 463 144 565 856 ... ... ... 15399 34400 Time represents each month which simply has a number from 1 to 400 (i.e. the data covers 400 months). I would like to somehow create two new variables which counts x1 and x2 as quarterly data in stead of monthly. Like this: x1x1Quarter 10121 12152 ... ... ... 2321100 Can any one tell me how this can be done in R? With kind regards, Thomas Jensen __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Basic question - more efficient method than loop?
On 28/06/10 16:46, GL wrote: I'm guessing there's a more efficient way to do the following using the index features of R. Appreciate any thoughts for (i in 1:nrow(dbs1)){ if(dbs1$Payor[i] %in% Payor.Group.Medicaid) dbs1$Payor.Group[i] = Medicaid Try something like dbs1$Payor.Group[dbs1$Payor %in% Payor.Group.Medicaid]- Medicaid etc. to get started. Hope this helps. Allan [...] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Basic question - more efficient method than loop?
Perfect. Thanks! -- View this message in context: http://r.789695.n4.nabble.com/Basic-question-more-efficient-method-than-loop-tp2271096p2271153.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Basic question - more efficient method than loop?
GL pfl...@shands.ufl.edu [Mon, Jun 28, 2010 at 05:46:13PM CEST]: I'm guessing there's a more efficient way to do the following using the index features of R. Appreciate any thoughts 1st thought: ifelse() for (i in 1:nrow(dbs1)){ if(dbs1$Payor[i] %in% Payor.Group.Medicaid) dbs1$Payor.Group[i] = Medicaid within(dbs1, Payor.Group - ifelse(Payor %in% Payor.Group.Medicaid, Medicaid, ifelse( and so on )) 2nd thought: library(car); ?recode 3rd thought (untested and contrary to the spirit of R): lst - list(Medicare, Commercial, Workers.Comp, etc. ); codePayor - function(lst) { if (length(lst) == 0) else ifelse(dbs1$Payor %in% eval(parse(paste(Payor.Group, lst[[1]], sep=.))), lst[[1]], codePayor(lst[-1])) } dbs1$Payor.Group - codePayor(lst) -- Johannes Hüsing There is something fascinating about science. One gets such wholesale returns of conjecture mailto:johan...@huesing.name from such a trifling investment of fact. http://derwisch.wikidot.com (Mark Twain, Life on the Mississippi) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Basic question - more efficient method than loop?
On Mon, 28 Jun 2010, GL wrote: I'm guessing there's a more efficient way to do the following using the index features of R. Appreciate any thoughts Use the levels( dbs1$Payor.Group ) - new.levels idiom after dbs1$Payor.Group - factor( dbs1$Payor ) See ?levels and example( levels ) Note the 'combine some levels' example. HTH, Chuck for (i in 1:nrow(dbs1)){ if(dbs1$Payor[i] %in% Payor.Group.Medicaid) dbs1$Payor.Group[i] = Medicaid if(dbs1$Payor[i] %in% Payor.Group.Medicare) dbs1$Payor.Group[i] = Medicare if(dbs1$Payor[i] %in% Payor.Group.Commercial) dbs1$Payor.Group[i] = Commercial if(dbs1$Payor[i] %in% Payor.Group.Workers.Comp) dbs1$Payor.Group[i] = Workers Comp if(dbs1$Payor[i] %in% Payor.Group.Self.Pay) dbs1$Payor.Group[i] = Self Pay if(dbs1$Adm.Source[i] %in% Adm.Source.Group.Newborn) dbs1$Adm.Source.Group[i] = Newborn if(dbs1$Adm.Source[i] %in% Adm.Source.Group.ED) dbs1$Adm.Source.Group[i] = ED if(dbs1$Adm.Source[i] %in% Adm.Source.Group.Routine) dbs1$Adm.Source.Group[i] = Routine if(dbs1$Adm.Source[i] %in% Adm.Source.Group.Transfer) dbs1$Adm.Source.Group[i] = Transfer } -- View this message in context: http://r.789695.n4.nabble.com/Basic-question-more-efficient-method-than-loop-tp2271096p2271096.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Charles C. Berry(858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Basic question - more efficient method than loop?
Johannes Huesing johan...@huesing.name [Mon, Jun 28, 2010 at 06:31:20PM CEST]: [...] eval(parse(paste(Payor.Group, lst[[1]], sep=.))), eval(parse(text=paste(Payor.Group, lst[[1]], sep=.))), -- Johannes Hüsing There is something fascinating about science. One gets such wholesale returns of conjecture mailto:johan...@huesing.name from such a trifling investment of fact. http://derwisch.wikidot.com (Mark Twain, Life on the Mississippi) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] integration of two normal density
Isn't it equally trivial to demonstrate that the product of two pdfs _may_ be a normalized pdf? For example, the uniform (0,1) pdf: f(x) = 1 for x in (0, 1), and 0 otherwise Hence, g(x) = f(x)*f(x) = 1 for x in (0, 1), and 0 otherwise _is_ a normalized pdf. But this is a little silly. Rather than memorize answers to questions like is the product of pdfs also a pdf?, we ought to be confident in the properties of pdfs (i.e. not the answers, but the means to arrive at answers). On Mon, 2010-06-28 at 11:42 -0400, Bert Gunter wrote: Inline Below Bert Gunter Genentech Nonclinical Biostatistics -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of bill.venab...@csiro.au Sent: Friday, June 25, 2010 10:53 PM To: carrieands...@gmail.com; R-help@r-project.org Subject: Re: [R] integration of two normal density Your intuition is wrong and R is right. Why should the product of two probability density functions be a normalized pdf also? -- as is trivially seen with two uniforms on [0,2], with pdf= 1/2, product = 1/4 on [0,2] . -- Bert -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Carrie Li Sent: Saturday, 26 June 2010 1:28 PM To: r-help Subject: [R] integration of two normal density Hello everyone, I have a question about integration of two density function Intuitively, I think the value after integration should be 1, but they are not. Am I missing something here ? t - function(y){dnorm(y, mean=3)*dnorm(y/2, mean=1.5)} integrate(t, -Inf, Inf) 0.3568248 with absolute error 4.9e-06 Also, is there any R function or package could do multivariate integration ? Thanks for any suggestions! Carrie [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Matthew S. Shotwell Graduate Student Division of Biostatistics and Epidemiology Medical University of South Carolina http://biostatmatt.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Stacked Histogram, multiple lines for dates of news stories?
Hi Simon, Here are two ways to do that with ggplot: qplot(test2, data = test_df, geom = freqpoly, colour = test, binwidth = 30, drop = F) qplot(test2, data = test_df, geom = bar, fill = test, binwidth = 30) binwidth is in days. If you want to bin by other intervals (like months), I'd recommend doing so before plotting. Hadley On Mon, Jun 28, 2010 at 10:04 AM, Simon Kiss sjk...@gmail.com wrote: Dear colleagues, I have extracted the dates of several news stories from a newspaper data base to chart coverage trends of an issue over time. They are in a data frame that looks just like one generated by the reproducible code below. I can already generate a histogram of the dates with various intervals (months, quarters, weeks years) using hist.Date. However, there are two other things I'd like to do. First, I'd like to either create a stacked histogram so that one could see whether one newspaper really pushed coverage of an issue at a certain point while others then followed later on in time. Second, or alternatively, I would like to do a line graph of the same data for the different papers to represent the same trends. I guess what I'm finding challenging is that I don't have counts of the number of stories on each day or in each week or in each month; I just have the dates themselves. The date.Hist command was very useful in turning those into bins, but I'd like to push it a bit further and to a stacked histogram or a multiple line chart. Can anyone suggest a way to go about doing this? I should say, I played around in Hadley Wickham's ggplot package and looked at his website, and there is a way to render multiple lines here: http://had.co.nz/ggplot2/scale_date.html but it was not clear to me how to plot just the dates or an index of the dates as I don't have a value for the y axis, other than the number of times a story was published in that time frame. Regardless, I hope someone can suggest something. Yours, Simon J. Kiss test=sample(1:3, 50, replace=TRUE) test=as.factor(test) levels(test)=c(Star, Globe and Mail, Post) test2=ISOdatetime(sample(2004:2009, 50, replace=TRUE), sample(1:12, size=50, replace=TRUE), sample(1:30, 50, replace=TRUE), 0,0,0) test2=as.Date(test2) test_df=data.frame(test, test2) * Simon J. Kiss, PhD SSHRC and DAAD Post-Doctoral Fellow John F. Kennedy Institute of North America Studies Free University of Berlin Lansstraße 7-9 14195 Berlin, Germany Cell: +49 (0)1525-300-2812, Web: http://www.jfki.fu-berlin.de/index.html __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Calculating Summaries for each level of a Categorical variable
The problem is that tapply is expecting a vector for the first argument, your first argument is a list or data frame, so the length that it sees is the number of list elements (columns of the data frame). You need to either pass a single vector, or use functions like aggregate or the plyr package to work on all the columns in a data frame. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of RaoulD Sent: Saturday, June 26, 2010 10:47 PM To: r-help@r-project.org Subject: Re: [R] Calculating Summaries for each level of a Categorical variable Hi Corey, Thanks so much for this. However, I get this error for tapply - Error in tapply(RT, RT$R, fun=WA): arguments must have same length. Any idea how to get around this? Thanks again, Raoul -- View this message in context: http://r.789695.n4.nabble.com/Calculating-Summaries-for-each-level-of- a-Categorical-variable-tp2269349p2269815.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] several common sub-axes within multiple plot area
How about: #my example: dev.new() layout( rbind( c(1,2), c(7,7), c(3,4), c(8,8), c(5,6), c(9,9) ), heights=c(10,1,10,1,10,1) ) #Graph 1: plot(rnorm(20), rnorm(20), xlab = Results 1 (Int), ylab = Variable A, main = Factor X) #Graph 2: plot(rnorm(20), rnorm(20), xlab = Results 1 (Int), ylab = Variable A, main = Factor Y) #Graph 3: plot(rnorm(20), rnorm(20), xlab = Results 2 (Int), ylab = Variable B) #Graph 4: plot(rnorm(20), rnorm(20), xlab = Results 2 (Int), ylab = Variable B) #Graph 5: plot(rnorm(20), rnorm(20), xlab = Results 3 (Int), ylab = Variable C) #Graph 6: plot(rnorm(20), rnorm(20), xlab = Results 3 (Int), ylab = Variable C) par(mar=rep(0,4)) plot.new() text( .5, .5, Results 1 (Int), font=2, cex=1.5 ) plot.new() text( .5, .5, Results 2 (Int), font=2, cex=1.5 ) plot.new() text( .5, .5, Results 3 (Int), font=2, cex=1.5 ) -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of Karl Brand Sent: Saturday, June 26, 2010 7:21 AM To: r-help@r-project.org Subject: [R] several common sub-axes within multiple plot area Dear List, I'd really appreciate tip's or code demonstrating how i can achieve some common axis labels integrated into a multiple plot. In my example (below), i'm trying to achieve: -a single Results 1 (Int) centered btwn row 1 and row 2; -a single Results 2 (Int) centered btwn row 2 and row 3; and, -a single Results 3 (Int) centered at the bottom, ie., below row 3. I played with mtext() and par(oma=... per this post- https://stat.ethz.ch/pipermail/r-help/2004-October/059453.html But have so far failed to achieve my goal. Can i succeed with something combined with the 'high level' plot() function? Or do i need to get specific with some low level commands (help!)? With big thanks in advance for any suggestions/examples. cheers, Karl #my example: dev.new() plot.new() par(mfrow=c(3,2)) #Graph 1: plot(rnorm(20), rnorm(20), xlab = Results 1 (Int), ylab = Variable A, main = Factor X) #Graph 2: plot(rnorm(20), rnorm(20), xlab = Results 1 (Int), ylab = Variable A, main = Factor Y) #Graph 3: plot(rnorm(20), rnorm(20), xlab = Results 2 (Int), ylab = Variable B) #Graph 4: plot(rnorm(20), rnorm(20), xlab = Results 2 (Int), ylab = Variable B) #Graph 5: plot(rnorm(20), rnorm(20), xlab = Results 3 (Int), ylab = Variable C) #Graph 6: plot(rnorm(20), rnorm(20), xlab = Results 3 (Int), ylab = Variable C) -- Karl Brand Department of Genetics Erasmus MC Dr Molewaterplein 50 3015 GE Rotterdam T +31 (0)10 704 3457 |F +31 (0)10 704 4743 |M +31 (0)642 777 268 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Basic question - more efficient method than loop?
1) Create a table with two columns: payor and payor.group. 2) Merge that table with your original data Hadley On Mon, Jun 28, 2010 at 10:46 AM, GL pfl...@shands.ufl.edu wrote: I'm guessing there's a more efficient way to do the following using the index features of R. Appreciate any thoughts for (i in 1:nrow(dbs1)){ if(dbs1$Payor[i] %in% Payor.Group.Medicaid) dbs1$Payor.Group[i] = Medicaid if(dbs1$Payor[i] %in% Payor.Group.Medicare) dbs1$Payor.Group[i] = Medicare if(dbs1$Payor[i] %in% Payor.Group.Commercial) dbs1$Payor.Group[i] = Commercial if(dbs1$Payor[i] %in% Payor.Group.Workers.Comp) dbs1$Payor.Group[i] = Workers Comp if(dbs1$Payor[i] %in% Payor.Group.Self.Pay) dbs1$Payor.Group[i] = Self Pay if(dbs1$Adm.Source[i] %in% Adm.Source.Group.Newborn) dbs1$Adm.Source.Group[i] = Newborn if(dbs1$Adm.Source[i] %in% Adm.Source.Group.ED) dbs1$Adm.Source.Group[i] = ED if(dbs1$Adm.Source[i] %in% Adm.Source.Group.Routine) dbs1$Adm.Source.Group[i] = Routine if(dbs1$Adm.Source[i] %in% Adm.Source.Group.Transfer) dbs1$Adm.Source.Group[i] = Transfer } -- View this message in context: http://r.789695.n4.nabble.com/Basic-question-more-efficient-method-than-loop-tp2271096p2271096.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] data frame row statistics (mean)?
Hello, I am trying to calculate the mean value of each row in a data frame (d), I am having troubles and getting errors using the code I have written. Below is a brief example of the code, any thought or suggestions would be great. Thank you for your time, Doug # Example Code: d - data.frame(st1=c(1,2,3,4), st2=c(2,5,6,7), st3=c(5,5,NA,7), st4=c(6,5,7,8)) avg - rep(NA,length(d[,1])) for (i in 1:length(d[,1])) { avg[i] = mean(d[i,1:4], na.rm=TRUE) } # Final Output wanted. st1 st2 st3 st4 avg 1 1 2 5 6 3.50 2 2 5 5 5 4.25 3 3 6 NA 7 5.33 4 4 7 7 8 6.50 -- - Douglas M. Hultstrand, MS Senior Hydrometeorologist Metstat, Inc. Windsor, Colorado voice: 720.771.5840 email: dmhul...@metstat.com web: http://www.metstat.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] data frame row statistics (mean)?
Doug - Try d$avg = apply(d,1,mean,na.rm=TRUE) d st1 st2 st3 st4 avg 1 1 2 5 6 3.50 2 2 5 5 5 4.25 3 3 6 NA 7 5.33 4 4 7 7 8 6.50 (If you must use a loop, calculate mean(as.numeric(d[i,1:4])) Take a look at mean(d[1,1:4]) to see why your program doesn't work properly.) - Phil Spector Statistical Computing Facility Department of Statistics UC Berkeley spec...@stat.berkeley.edu On Mon, 28 Jun 2010, Douglas M. Hultstrand wrote: Hello, I am trying to calculate the mean value of each row in a data frame (d), I am having troubles and getting errors using the code I have written. Below is a brief example of the code, any thought or suggestions would be great. Thank you for your time, Doug # Example Code: d - data.frame(st1=c(1,2,3,4), st2=c(2,5,6,7), st3=c(5,5,NA,7), st4=c(6,5,7,8)) avg - rep(NA,length(d[,1])) for (i in 1:length(d[,1])) { avg[i] = mean(d[i,1:4], na.rm=TRUE) } # Final Output wanted. st1 st2 st3 st4 avg 1 1 2 5 6 3.50 2 2 5 5 5 4.25 3 3 6 NA 7 5.33 4 4 7 7 8 6.50 -- - Douglas M. Hultstrand, MS Senior Hydrometeorologist Metstat, Inc. Windsor, Colorado voice: 720.771.5840 email: dmhul...@metstat.com web: http://www.metstat.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] data frame row statistics (mean)?
Douglas M. Hultstrand wrote: Hello, I am trying to calculate the mean value of each row in a data frame (d), I am having troubles and getting errors using the code I have written. Below is a brief example of the code, any thought or suggestions would be great. Thank you for your time, Doug # Example Code: d - data.frame(st1=c(1,2,3,4), st2=c(2,5,6,7), st3=c(5,5,NA,7), st4=c(6,5,7,8)) avg - rep(NA,length(d[,1])) for (i in 1:length(d[,1])) { avg[i] = mean(d[i,1:4], na.rm=TRUE) } # Final Output wanted. st1 st2 st3 st4 avg 1 1 2 5 6 3.50 2 2 5 5 5 4.25 3 3 6 NA 7 5.33 4 4 7 7 8 6.50 d$avg - rowMeans(d, na.rm = TRUE) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] data frame row statistics (mean)?
Hello Doug, I just wanted to add that a faster way to initialize a vector is: avg - vector(numeric, nrow(d)) Also you might like nrow(d) over length(d[ , 1]) if the number of rows is what you are after. Its sister function is ncol() . Best regards, Josh On Mon, Jun 28, 2010 at 11:37 AM, Douglas M. Hultstrand dmhul...@metstat.com wrote: Hello, I am trying to calculate the mean value of each row in a data frame (d), I am having troubles and getting errors using the code I have written. Below is a brief example of the code, any thought or suggestions would be great. Thank you for your time, Doug # Example Code: d - data.frame(st1=c(1,2,3,4), st2=c(2,5,6,7), st3=c(5,5,NA,7), st4=c(6,5,7,8)) avg - rep(NA,length(d[,1])) for (i in 1:length(d[,1])) { avg[i] = mean(d[i,1:4], na.rm=TRUE) } # Final Output wanted. st1 st2 st3 st4 avg 1 1 2 5 6 3.50 2 2 5 5 5 4.25 3 3 6 NA 7 5.33 4 4 7 7 8 6.50 -- - Douglas M. Hultstrand, MS Senior Hydrometeorologist Metstat, Inc. Windsor, Colorado voice: 720.771.5840 email: dmhul...@metstat.com web: http://www.metstat.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Exponential Smoothing: Forecast package
Hi Phani, to get the best Holt's model, I would simply wrap a suitable function calling ets() within optim() and optimize for alpha and beta - the values given by ets() without constraints would probably be good starting values, but you had better start the optimization with a variety of starting values to make sure you don't end up in a local minimum. I know of no comparison just between Holt and Brown - but you could use the above methods and the M3 Competition dataset (in Mcomp) to look how the two methods compare on a (more or less) benchmark dataset. HTH Stephan phani kishan schrieb: Hey, I am using the ets() function in the forecast package to find out the best fit parameters for my time-series. I have about 50 sets of time series data. I'm currently using the function as follows: ets(x,model=AZZ,opt.crit=mse) As to my observation about 5-10 of them have been identified by ets to have a trend and an alpha, beta values have been thrown up - which have been same in all these cases. When I read up online it came up as a Brown's double exponential smoothing as opposed to Holt's exponential smoothing (where alpha and beta differ). I am guessing this is happening as AIC/AICc/BIC select a model based on accuracy as well as a weight on number of parameters (1 in case of brown's, 2 in case of holt's). Now if I want to see results of the best parameters from the Holt's method, how should I go about it? And is there any study comparing the accuracy of brown's double exponential model versus holt's exponential model? Thanks in advance, Phani __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Creating quarterly data
Thanks, Allan, that did the trick :) Best, Thomas On Jun 28, 2010, at 6:13 PM, Allan Engelhardt wrote: One approach: d - data.frame(x1=c(2,3,4,1,5,8), x2=c(4,1,6,4,6,5), time=1:6) d$quarter - (d$time-1) %/% 4 # Or whatever your logic is aggregate(cbind(x1,x2) ~ quarter, data = d, sum) # quarter x1 x2 # 1 0 10 15 # 2 1 13 11 Hope this helps Allan On 28/06/10 13:23, Thomas Jensen wrote: Dear R Experts, I have data in the following format x1x2time 241 312 463 144 565 856 ... ... ... 15399 34400 Time represents each month which simply has a number from 1 to 400 (i.e. the data covers 400 months). I would like to somehow create two new variables which counts x1 and x2 as quarterly data in stead of monthly. Like this: x1x1Quarter 10121 12152 ... ... ... 2321100 Can any one tell me how this can be done in R? With kind regards, Thomas Jensen __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] mhplot error with test example: ylim not found
Many thanks Peter. I have uploaded 1.0-23 which should have this fixed. Jing Hua -Original Message- From: Peter Ehlers [mailto:ehl...@ucalgary.ca] Sent: 28 June 2010 15:47 To: vaneet Cc: r-help@r-project.org; Jing Hua Zhao Subject: Re: [R] mhplot error with test example: ylim not found It seems to me that gap::mhtplot needs a fix. You might want to contact the maintainer (cc'd). In the meantime, you should be able to place an object ylim in your workspace before calling the function: ylim - c(0, 10) mhtplot(test, ylim = c(0, 10)) Of course, you could also just fixt the function (it's short and the fix is easy). -Peter Ehlers On 2010-06-23 8:47, vaneet wrote: Hello all, I am trying to make a genome association plot for p-values related to SNPs and was fortunate to find that R contains a package that produces Manhattan plots which is what's preferred for my current project. The function mhtplot() is found in the 'gap' package which I installed in R 2.11.1 on Windows. I thought I'd test out the function first with the examples they give in the documentation: # foo example test- matrix(c(1,1,4,1,1,6,1,10,3,2,1,5,2,2,6,2,4,8),byrow=TRUE,6) mhtplot(test) mhtplot(test,logscale=F) # fake example with Affy500k data affy-c(40220, 41400, 33801, 32334, 32056, 31470, 25835, 27457, 22864, 28501, 26273, 24954, 19188, 15721, 14356, 15309, 11281, 14881, 6399, 12400, 7125, 6207) CM- cumsum(affy) n.markers- sum(affy) n.chr- length(affy) test- data.frame(chr=rep(1:n.chr,affy),pos=1:n.markers,p=runif(n.markers)) # to reduce size of the plot # bitmap(mhtplot.bmp,res=72*5) oldpar- par() par(las=2,cex=0.6) colors- rep(c(blue,green),11) mhtplot(test,colors=colors,pch=19,bg=colors) title(A simulated example according to EPIC-Norfolk QCed SNPs) When I run either of the examples the following results: Error in mhtplot() : object 'ylim' not found Even though I'm not sure why it would require this parameter, when I tried to fill in the parameter (ylim = c(0,10)) the same error results. I am not sure how to get around this. Vaneet __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Lattice and Beamer
Two things I think are some of the best developments in statistics and production are the lattice package and the beamer class for presentation in Latex. One thing I have not become very good at is properly sizing my visuals to look good in a presentation. For instance, I have the following code that creates a nice plot (sorry, cannot provide reproducible data). bwplot(testedgrade~person_measure|gender + ethnicity, pfile, layout=c(2,5), main = 'Distribution of Person Measure by Grade\n Conditional on Gender and Ethnicity (Math)', xlab = 'Grade') Now inside my latex document using the beamer class for presentation I have the following \begin{frame} \frametitle{Distribution of Person Parameters by Grade Conditional on Gender and Ethnicity} \begin{figure}[htb] \centering \fbox{\includegraphics[scale=.3]{personGenEthn.pdf}} \end{figure} \end{frame} I use the scale argument here. I do this totally randomly. I first start with scale=.5. Then, I create the document and look at it. If it seems to fit, I keep it. If it's too big, I resize it until it looks good. There must certainly be a much better way to size these for specific use with latex presentations. Any thoughts? Harold [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] mhplot error with test example: ylim not found
Thank you, everyone. The function works fine now. Vaneet -Original Message- From: Jing Hua Zhao [mailto:jinghua.z...@mrc-epid.cam.ac.uk] Sent: Monday, June 28, 2010 2:46 PM To: Peter Ehlers; Lotay, Vaneet Cc: r-help@r-project.org Subject: RE: [R] mhplot error with test example: ylim not found Many thanks Peter. I have uploaded 1.0-23 which should have this fixed. Jing Hua -Original Message- From: Peter Ehlers [mailto:ehl...@ucalgary.ca] Sent: 28 June 2010 15:47 To: vaneet Cc: r-help@r-project.org; Jing Hua Zhao Subject: Re: [R] mhplot error with test example: ylim not found It seems to me that gap::mhtplot needs a fix. You might want to contact the maintainer (cc'd). In the meantime, you should be able to place an object ylim in your workspace before calling the function: ylim - c(0, 10) mhtplot(test, ylim = c(0, 10)) Of course, you could also just fixt the function (it's short and the fix is easy). -Peter Ehlers On 2010-06-23 8:47, vaneet wrote: Hello all, I am trying to make a genome association plot for p-values related to SNPs and was fortunate to find that R contains a package that produces Manhattan plots which is what's preferred for my current project. The function mhtplot() is found in the 'gap' package which I installed in R 2.11.1 on Windows. I thought I'd test out the function first with the examples they give in the documentation: # foo example test- matrix(c(1,1,4,1,1,6,1,10,3,2,1,5,2,2,6,2,4,8),byrow=TRUE,6) mhtplot(test) mhtplot(test,logscale=F) # fake example with Affy500k data affy-c(40220, 41400, 33801, 32334, 32056, 31470, 25835, 27457, 22864, 28501, 26273, 24954, 19188, 15721, 14356, 15309, 11281, 14881, 6399, 12400, 7125, 6207) CM- cumsum(affy) n.markers- sum(affy) n.chr- length(affy) test- data.frame(chr=rep(1:n.chr,affy),pos=1:n.markers,p=runif(n.markers)) # to reduce size of the plot # bitmap(mhtplot.bmp,res=72*5) oldpar- par() par(las=2,cex=0.6) colors- rep(c(blue,green),11) mhtplot(test,colors=colors,pch=19,bg=colors) title(A simulated example according to EPIC-Norfolk QCed SNPs) When I run either of the examples the following results: Error in mhtplot() : object 'ylim' not found Even though I'm not sure why it would require this parameter, when I tried to fill in the parameter (ylim = c(0,10)) the same error results. I am not sure how to get around this. Vaneet __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Lattice and Beamer
Harold, I usually just specify a width=x instead of a scale. The height is then automatically scaled to maintain the aspect ratio and you get the right size for the presentation regardless of the size of the original. Christian Raschke m.CR Am Jun 28, 2010 um 12:28 schrieb Doran, Harold hdo...@air.org: Two things I think are some of the best developments in statistics and production are the lattice package and the beamer class for presentation in Latex. One thing I have not become very good at is properly sizing my visuals to look good in a presentation. For instance, I have the following code that creates a nice plot (sorry, cannot provide reproducible data). bwplot(testedgrade~person_measure|gender + ethnicity, pfile, layout=c (2,5), main = 'Distribution of Person Measure by Grade\n Conditional on Gender and Ethnicity (Math)', xlab = 'Grade') Now inside my latex document using the beamer class for presentation I have the following \begin{frame} \frametitle{Distribution of Person Parameters by Grade Conditional on Gender and Ethnicity} \begin{figure}[htb] \centering \fbox{\includegraphics[scale=.3]{personGenEthn.pdf}} \end{figure} \end{frame} I use the scale argument here. I do this totally randomly. I first start with scale=.5. Then, I create the document and look at it. If it seems to fit, I keep it. If it's too big, I resize it until it looks good. There must certainly be a much better way to size these for specific use with latex presentations. Any thoughts? Harold [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to run Bibtex with pdfLatex in StatEt/MikTex on Windows ?
Hello, I'm running R2.10, Eclipse, StatEt and MikTex 2.8 to create Sweave documents, and everything seems to work great, until today... I was trying to add citations from a Bibtex file, but I just got [?] citations. However, if I open the .tex file that StatEt created in MikTex and run the latex+bibtex+pdflatex command, the citations are present. Does anyone know how to either configure StatEt to run bibtex or to somehow fix this ? Thanks Paul. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] plotmeans
Hello, I am using library(gplots) to do something like data(state) x1 - state.area/1 x2 - x1+round((rnorm(length(state.area),3,3))) plotmeans(x1 ~ state.region) Is it possible to plot x2 to x1 in the same graph, something like: linesmeans(x2 ~ state.region) Best wishes, Cheba [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Lattice and Beamer
On Mon, Jun 28, 2010 at 12:28 PM, Doran, Harold hdo...@air.org wrote: Two things I think are some of the best developments in statistics and production are the lattice package and the beamer class for presentation in Latex. One thing I have not become very good at is properly sizing my visuals to look good in a presentation. For instance, I have the following code that creates a nice plot (sorry, cannot provide reproducible data). bwplot(testedgrade~person_measure|gender + ethnicity, pfile, layout=c(2,5), main = 'Distribution of Person Measure by Grade\n Conditional on Gender and Ethnicity (Math)', xlab = 'Grade') Now inside my latex document using the beamer class for presentation I have the following \begin{frame} \frametitle{Distribution of Person Parameters by Grade Conditional on Gender and Ethnicity} \begin{figure}[htb] \centering \fbox{\includegraphics[scale=.3]{personGenEthn.pdf}} \end{figure} \end{frame} I use the scale argument here. I do this totally randomly. I first start with scale=.5. Then, I create the document and look at it. If it seems to fit, I keep it. If it's too big, I resize it until it looks good. There must certainly be a much better way to size these for specific use with latex presentations. Any thoughts? I think we have had this discussion before and I have tried to convince you to use Sweave with beamer and lattice.:-) A big advantage of Sweave is that you have the code that the generates the figures in the LaTeX file and you don't allow the possibility of losing track of PDF files containing the latest versions of figures. In my preamble I have some lines like \SweaveOpts{engine=R,eps=FALSE,pdf=TRUE,width=10,height=6.5,strip.white=all} \SweaveOpts{include=TRUE} \setkeys{Gin}{width=\textwidth} Setting the default height and width of the PDF figure and the inclusion width=\textwidth provides a default scaling that looks good to me. If I want a shorter figure that allows for text on the slide then I set the height to a smaller value. A full height version looks like \begin{frame} \frametitle{Plot of inverse canonical link for the Bernoulli distribution} BernoulliinvLink,fig=TRUE,echo=FALSE= linkinv - function(eta) 1/(1+exp(-eta)) eta - seq(-7,7,len = 701) print(xyplot(linkinv(eta) ~ eta, type = c(g,l), xlab = expression(eta), ylab = expression(mu == frac(1,1+exp(-eta) @ \end{frame} Harold [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] several common sub-axes within multiple plot area
Cheers Greg, That's really simple. That's excellent. Thank you. Sincere thanks for the education. The more i learn, the more i like getting it done with R. karl On 6/28/2010 7:32 PM, Greg Snow wrote: How about: #my example: dev.new() layout( rbind( c(1,2), c(7,7), c(3,4), c(8,8), c(5,6), c(9,9) ), heights=c(10,1,10,1,10,1) ) #Graph 1: plot(rnorm(20), rnorm(20), xlab = Results 1 (Int), ylab = Variable A, main = Factor X) #Graph 2: plot(rnorm(20), rnorm(20), xlab = Results 1 (Int), ylab = Variable A, main = Factor Y) #Graph 3: plot(rnorm(20), rnorm(20), xlab = Results 2 (Int), ylab = Variable B) #Graph 4: plot(rnorm(20), rnorm(20), xlab = Results 2 (Int), ylab = Variable B) #Graph 5: plot(rnorm(20), rnorm(20), xlab = Results 3 (Int), ylab = Variable C) #Graph 6: plot(rnorm(20), rnorm(20), xlab = Results 3 (Int), ylab = Variable C) par(mar=rep(0,4)) plot.new() text( .5, .5, Results 1 (Int), font=2, cex=1.5 ) plot.new() text( .5, .5, Results 2 (Int), font=2, cex=1.5 ) plot.new() text( .5, .5, Results 3 (Int), font=2, cex=1.5 ) -- Karl Brand Department of Genetics Erasmus MC Dr Molewaterplein 50 3015 GE Rotterdam T +31 (0)10 704 3457 |F +31 (0)10 704 4743 |M +31 (0)642 777 268 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] advice on package devel with external libs
hi all - i'm working on an R package that makes use of my own shared library written in C. but i also am making use of another C-written library. (my package is for facilitating biological namespace translations via online (i.e. up-to-date) biological databases.) problem is, the library i'm using is not a standard library (i.e. i doubt it will be installed on most users' machines). i also don't think too many users will be particularly adept in installing a shared library. for users with a sysadmin, it can be done easily enough, but on local installations i fear most will be incapable of properly installing/ locating the library so my code can link to it during compile time. (in case anyone was wondering, the library in question is a lightweight JSON parser... yes i know there are existing R packages for this, but they are *very* slow for large JSON object coding/ encoding.) how have folks dealt with this in the past with R packages? i've thought about wrapping the other library itself as a separate R package which basically does nothing on installation other than compile and put the libraries a predictable location... but this seems rather silly (and may violate the JSON parser package's license). thanks for any input on this, -murat __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Zoo series to a date time stamp that is regular
NOTE: I will provide data if necessary, but I didn't want clutter everyones mailbox All: I have a time series with level and temperature data for 11 sites for each of three bases. I will have to do this more than once is what I am saying here. OK, The time series are zoo objects with index values in chron format. The problem is that the date and times should be at even 15 min intervals, but because of operator error (me) they are on 15 min intervals for some of the data and on 15+1 min intervals for some and 15+1.56 for some of the same site. I would like to create a regular 15min time series from this and need to know if I can just create an empty time series with the entire date time I would like and then aggregate. I am not sure how merge or aggregate would act in this particular situation. I want to snap readings to the nearest 15min interval. kindest regards, -- Stephen Sefick | Auburn University | | Department of Biological Sciences | | 331 Funchess Hall | | Auburn, Alabama | | 36849| |___| | sas0...@auburn.edu | | http://www.auburn.edu/~sas0025 | |___| Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Not-In operator
I would like to use grep to return all the lines of a data frame that do not contain the letters HD. I have tried the ^ inside brackets as well as !. The data frame is one column consisting of spaces,numbers, and letters with several thousand rows. Thank you! Colton [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Not-In operator
Colton - Have you looked at the invert= argument of grep()? (In regular expressions, ^ means beginning of string, and ! has no special meaning.) - Phil Spector Statistical Computing Facility Department of Statistics UC Berkeley spec...@stat.berkeley.edu On Mon, 28 Jun 2010, Colton Hunt wrote: I would like to use grep to return all the lines of a data frame that do not contain the letters HD. I have tried the ^ inside brackets as well as !. The data frame is one column consisting of spaces,numbers, and letters with several thousand rows. Thank you! Colton [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Not-In operator
Giving a reproducible example would likely lead to a solution quickly. Colton Hunt wrote: I would like to use grep to return all the lines of a data frame that do not contain the letters HD. I have tried the ^ inside brackets as well as !. The data frame is one column consisting of spaces,numbers, and letters with several thousand rows. Thank you! Colton [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Export Results
Do you have an open word file ? Contact Details:--- Contact me: tal.gal...@gmail.com | 972-52-7275845 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English) -- On Mon, Jun 28, 2010 at 1:38 PM, Pedro Mota Veiga motave...@net.sapo.ptwrote: Thanks for your sugestions. But when I do wdGet(T) I have de next message. wdGet(T) Error in if (!(tmp[[ActiveDocument]][[Name]] == filename)) tmp$Open(paste(path, : argument is of length zero What is happen? Thaks for all -- View this message in context: http://r.789695.n4.nabble.com/Export-Results-tp2268622p2270745.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Zoo series to a date time stamp that is regular
On Mon, Jun 28, 2010 at 4:42 PM, stephen sefick ssef...@gmail.com wrote: NOTE: I will provide data if necessary, but I didn't want clutter everyones mailbox All: I have a time series with level and temperature data for 11 sites for each of three bases. I will have to do this more than once is what I am saying here. OK, The time series are zoo objects with index values in chron format. The problem is that the date and times should be at even 15 min intervals, but because of operator error (me) they are on 15 min intervals for some of the data and on 15+1 min intervals for some and 15+1.56 for some of the same site. I would like to create a regular 15min time series from this and need to know if I can just create an empty time series with the entire date time I would like and then aggregate. I am not sure how merge or aggregate would act in this particular situation. I want to snap readings to the nearest 15min interval. kindest regards, In the development version of zoo na.locf has an xout argument (modeled after ?approx). If you add an xout = g argument where g is the desired grid then it will fill the grid with NAs, fill the NAs in the usual na.locf way and then drop any points not on the grid. Here is an example taken the development version of the zoo faq which uses 10 minutes. Its just two lines. The first sets up the grid and the second is the just-mentioned na.locf call. # load development version of na.locf library(zoo) library(chron) source(http://r-forge.r-project.org/scm/viewvc.php/*checkout*/pkg/zoo/R/na.locf.R?revision=725root=zoo;) # read data into z Lines - Time,Value 2009-10-09 5:00:00,210 2009-10-09 5:05:00,207 2009-10-09 5:17:00,250 2009-10-09 5:30:00,193 2009-10-09 5:41:00,205 2009-10-09 6:00:00,185 library(chron) z - read.zoo(textConnection(Lines), FUN = as.chron, sep = ,, header = TRUE) # create 10 minute grid, g, and align to it g - seq(start(z), end(z), by = times(00:10:00)) na.locf(z, xout = g) At the expense of a slightly more complex call, its also possible to calculate it with the current CRAN version of zoo if you use na.approx in place of the na.locf line above as shown: na.approx(z, xout = g, method = constant, rule = 2) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] advice on package devel with external libs
Some ideas, 1. Wrap the library as an R package, as you said, and check for the library at configure time (i.e. with autoconf or custom script). But if you do, it would be great to provide an R-level API so that we can all use it. This is the strategy of the 'cairo', 'RGtk', 'rgl', and 'gsl' packages. Also, maybe try and collaborate with the developers of the 'rjson' package to improve it. 2. If the library is appropriately licensed, and truly 'lightweight', simply add its sources to your package. R core does this with zlib and a few other libraries. However, this puts the burden on you to maintain code written by others. There are several JSON parsers with very liberal licenses (www.json.org), and some are tiny. -Matt On Mon, 2010-06-28 at 16:10 -0400, Murat Tasan wrote: hi all - i'm working on an R package that makes use of my own shared library written in C. but i also am making use of another C-written library. (my package is for facilitating biological namespace translations via online (i.e. up-to-date) biological databases.) problem is, the library i'm using is not a standard library (i.e. i doubt it will be installed on most users' machines). i also don't think too many users will be particularly adept in installing a shared library. for users with a sysadmin, it can be done easily enough, but on local installations i fear most will be incapable of properly installing/ locating the library so my code can link to it during compile time. (in case anyone was wondering, the library in question is a lightweight JSON parser... yes i know there are existing R packages for this, but they are *very* slow for large JSON object coding/ encoding.) how have folks dealt with this in the past with R packages? i've thought about wrapping the other library itself as a separate R package which basically does nothing on installation other than compile and put the libraries a predictable location... but this seems rather silly (and may violate the JSON parser package's license). thanks for any input on this, -murat __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Matthew S. Shotwell Graduate Student Division of Biostatistics and Epidemiology Medical University of South Carolina http://biostatmatt.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] distance matrix?
x - 0:10 y - t(replicate(11, 0:10)) abs(sweep(y, 1, x)) Hope this helps. On Mon, Jun 28, 2010 at 5:21 AM, clips10 m.mcquil...@lancaster.ac.uk wrote: I have a vector 0 to 10 and want to create a matrix with the differences between the numbers in it for instance: 0 1 2 3 4 5 6 7 8 9 10 0 0 1 2 3 4 5 6 7 8 9 10 1 1 0 1 2 3 4 5 6 7 8 9 2 3 4 5 6 7 8 9 10 Etc etc. So that the matrix is filled with the differences between in absolute value so there are no negatives. Any ideas? Thanks -- View this message in context: http://r.789695.n4.nabble.com/distance-matrix-tp2270722p2270722.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- John A. Ramey, M.S. Ph.D. Candidate Department of Statistics Baylor University http://www.ramhiser.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Zoo series to a date time stamp that is regular
Gabor, This is very close, but it interpolates values that do not exist in the original series. Is there a way to just snap the series to a grid without interpolating? -- Stephen Sefick | Auburn University | | Department of Biological Sciences | | 331 Funchess Hall | | Auburn, Alabama | | 36849| |___| | sas0...@auburn.edu | | http://www.auburn.edu/~sas0025 | |___| Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plotmeans
Try adding par(new=TRUE) after plotting the first plot and then just plot the second one. You have to make sure that both use the same y axis but I will leave it to you to find out how ;-) (I would fix the y limits of both plots...) HTH Jannis cheba meier schrieb: Hello, I am using library(gplots) to do something like data(state) x1 - state.area/1 x2 - x1+round((rnorm(length(state.area),3,3))) plotmeans(x1 ~ state.region) Is it possible to plot x2 to x1 in the same graph, something like: linesmeans(x2 ~ state.region) Best wishes, Cheba [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Zoo series to a date time stamp that is regular
On Mon, Jun 28, 2010 at 5:52 PM, stephen sefick ssef...@gmail.com wrote: Gabor, This is very close, but it interpolates values that do not exist in the original series. Is there a way to just snap the series to a grid without interpolating? Just round up or down the times with trunc. Using z from my prior post this rounds up to the next 10 minute boundary. (If there are multiple data values in a 10 minute interval it takes the last value.) min10 - times(00:10:00) halfsec - times(00:00:01)/2 aggregate(z, trunc(time(z) + as.numeric(min10 - halfsec), min10), function(x) tail(x, 1)) If there is only data value in each 10 minute interval this will round it up. If there are multiple data values in an interval it it takes the last one. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Zoo series to a date time stamp that is regular
#z is my raggidy zoo series min15 - times(00:15:00) trunc(index(z), min15) This looks like what I want I am just truncating the index to the nearest 15 min interval. a quick check with length confirms that they are both of the same length. I am just checking to make sure that I am not missing something. Thank you very much for your help. Stephen On Mon, Jun 28, 2010 at 5:05 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote: On Mon, Jun 28, 2010 at 5:52 PM, stephen sefick ssef...@gmail.com wrote: Gabor, This is very close, but it interpolates values that do not exist in the original series. Is there a way to just snap the series to a grid without interpolating? Just round up or down the times with trunc. Using z from my prior post this rounds up to the next 10 minute boundary. (If there are multiple data values in a 10 minute interval it takes the last value.) min10 - times(00:10:00) halfsec - times(00:00:01)/2 aggregate(z, trunc(time(z) + as.numeric(min10 - halfsec), min10), function(x) tail(x, 1)) If there is only data value in each 10 minute interval this will round it up. If there are multiple data values in an interval it it takes the last one. -- Stephen Sefick | Auburn University | | Department of Biological Sciences | | 331 Funchess Hall | | Auburn, Alabama | | 36849| |___| | sas0...@auburn.edu | | http://www.auburn.edu/~sas0025 | |___| Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Zoo series to a date time stamp that is regular
On Mon, Jun 28, 2010 at 6:36 PM, stephen sefick ssef...@gmail.com wrote: #z is my raggidy zoo series min15 - times(00:15:00) trunc(index(z), min15) This looks like what I want I am just truncating the index to the nearest 15 min interval. a quick check with length confirms that they are both of the same length. I am just checking to make sure that I am not missing something. Thank you very much for your help. Yes, that truncates. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Zoo series to a date time stamp that is regular
Now there are duplicates. I am having a really hard time with this. I want to keep the index the same if it lies on a 15min interval and round up or down to the closest interval. On Mon, Jun 28, 2010 at 5:40 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote: On Mon, Jun 28, 2010 at 6:36 PM, stephen sefick ssef...@gmail.com wrote: #z is my raggidy zoo series min15 - times(00:15:00) trunc(index(z), min15) This looks like what I want I am just truncating the index to the nearest 15 min interval. a quick check with length confirms that they are both of the same length. I am just checking to make sure that I am not missing something. Thank you very much for your help. Yes, that truncates. -- Stephen Sefick | Auburn University | | Department of Biological Sciences | | 331 Funchess Hall | | Auburn, Alabama | | 36849| |___| | sas0...@auburn.edu | | http://www.auburn.edu/~sas0025 | |___| Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] distance matrix?
abs(outer(1:10, 1:10, FUN=-)) [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [1,]012345678 9 [2,]101234567 8 [3,]210123456 7 [4,]321012345 6 [5,]432101234 5 [6,]543210123 4 [7,]654321012 3 [8,]765432101 2 [9,]876543210 1 [10,]987654321 0 Kjetil On Mon, Jun 28, 2010 at 5:44 PM, John Ramey johnra...@gmail.com wrote: x - 0:10 y - t(replicate(11, 0:10)) abs(sweep(y, 1, x)) Hope this helps. On Mon, Jun 28, 2010 at 5:21 AM, clips10 m.mcquil...@lancaster.ac.uk wrote: I have a vector 0 to 10 and want to create a matrix with the differences between the numbers in it for instance: 0 1 2 3 4 5 6 7 8 9 10 0 0 1 2 3 4 5 6 7 8 9 10 1 1 0 1 2 3 4 5 6 7 8 9 2 3 4 5 6 7 8 9 10 Etc etc. So that the matrix is filled with the differences between in absolute value so there are no negatives. Any ideas? Thanks -- View this message in context: http://r.789695.n4.nabble.com/distance-matrix-tp2270722p2270722.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- John A. Ramey, M.S. Ph.D. Candidate Department of Statistics Baylor University http://www.ramhiser.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Zoo series to a date time stamp that is regular
On Mon, Jun 28, 2010 at 6:47 PM, stephen sefick ssef...@gmail.com wrote: Now there are duplicates. I am having a really hard time with this. I want to keep the index the same if it lies on a 15min interval and round up or down to the closest interval. If you have duplicates in an interval then use aggregate as shown in my prior post. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Zoo series to a date time stamp that is regular
There are NA most likely. Will aggregate pull the value out? Again, thanks for all of the help. Stephen On Mon, Jun 28, 2010 at 6:04 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote: On Mon, Jun 28, 2010 at 6:47 PM, stephen sefick ssef...@gmail.com wrote: Now there are duplicates. I am having a really hard time with this. I want to keep the index the same if it lies on a 15min interval and round up or down to the closest interval. If you have duplicates in an interval then use aggregate as shown in my prior post. -- Stephen Sefick | Auburn University | | Department of Biological Sciences | | 331 Funchess Hall | | Auburn, Alabama | | 36849| |___| | sas0...@auburn.edu | | http://www.auburn.edu/~sas0025 | |___| Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Identify and extract a whole word of variable length using regular expressions
Hi everybody, I'm quite weak with regular expression, and I need some help... I have strings of the type a [1,] ppe46 Rv3018c MT3098/MT3101 MTV012.32c [2,] ppe16 Rv1135c MT1168 [3,] ppe21 Rv1548c MT1599 MTCY48.17 [4,] ppe12 Rv0755c MT0779 [5,] PE_PGRS51 Rv3367 [etc..for several hundreds] I want have instead only: [1,] Rv3018c [2,] Rv1135c [3,] Rv1548c [4,] Rv0755c [5,] Rv3367 Besides these examples, the only thing I know for sure is that the magic substrings I want to extract are entire word all starting by Rv. So Rvx, preceded and followed by a space, and of a variable length. I don't have any other infos. Do you know how to pick them? I checked for their presence using grep, and \\Rv*\\ expression, I tried with some string functions from Hmisc, or in the other way, by substituting with empty strings everything except the Rv word, but I didn't achieve that much... Could you please give me some suggestions? Thanks a lot, Giulio _ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Zoo series to a date time stamp that is regular
On Mon, Jun 28, 2010 at 7:08 PM, stephen sefick ssef...@gmail.com wrote: There are NA most likely. Will aggregate pull the value out? Again, thanks for all of the help. Stephen I assume you mean NAs in the data. They are handled by the function you give to aggregate so just make sure you use something appropriate. z - zoo(c(1, NA, 3, 4), 0:3) aggregate(z, time(z) %/% 2, function(x) mean(x, na.rm = TRUE)) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Identify and extract a whole word of variable length using regular expressions
Giulio - This sub('^.* ?(Rv[^ ]*) ?.*$','\\1',a) [1] Rv3018c Rv1135c Rv1548c Rv0755c Rv3367 seems to do what you want. - Phil Spector Statistical Computing Facility Department of Statistics UC Berkeley spec...@stat.berkeley.edu On Mon, 28 Jun 2010, Giulio Di Giovanni wrote: Hi everybody, I'm quite weak with regular expression, and I need some help... I have strings of the type a [1,] ppe46 Rv3018c MT3098/MT3101 MTV012.32c [2,] ppe16 Rv1135c MT1168 [3,] ppe21 Rv1548c MT1599 MTCY48.17 [4,] ppe12 Rv0755c MT0779 [5,] PE_PGRS51 Rv3367 [etc..for several hundreds] I want have instead only: [1,] Rv3018c [2,] Rv1135c [3,] Rv1548c [4,] Rv0755c [5,] Rv3367 Besides these examples, the only thing I know for sure is that the magic substrings I want to extract are entire word all starting by Rv. So Rvx, preceded and followed by a space, and of a variable length. I don't have any other infos. Do you know how to pick them? I checked for their presence using grep, and \\Rv*\\ expression, I tried with some string functions from Hmisc, or in the other way, by substituting with empty strings everything except the Rv word, but I didn't achieve that much... Could you please give me some suggestions? Thanks a lot, Giulio _ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Identify and extract a whole word of variable length using regular expressions
On Mon, Jun 28, 2010 at 7:17 PM, Giulio Di Giovanni perimessagg...@hotmail.com wrote: Hi everybody, I'm quite weak with regular expression, and I need some help... I have strings of the type a [1,] ppe46 Rv3018c MT3098/MT3101 MTV012.32c [2,] ppe16 Rv1135c MT1168 [3,] ppe21 Rv1548c MT1599 MTCY48.17 [4,] ppe12 Rv0755c MT0779 [5,] PE_PGRS51 Rv3367 [etc..for several hundreds] I want have instead only: [1,] Rv3018c [2,] Rv1135c [3,] Rv1548c [4,] Rv0755c [5,] Rv3367 Besides these examples, the only thing I know for sure is that the magic substrings I want to extract are entire word all starting by Rv. So Rvx, preceded and followed by a space, and of a variable length. I don't have any other infos. Do you know how to pick them? I checked for their presence using grep, and \\Rv*\\ expression, I tried with some string functions from Hmisc, or in the other way, by substituting with empty strings everything except the Rv word, but I didn't achieve that much... Could you please give me some suggestions? You can use strapply in gsubfn to pick out strings by content. The regular expression says match a word bound followed by R followed by v followed by 0 or more non-spaces: library(gsubfn) strapply(a, \\bRv\\S*, c, perl = TRUE, simplify = TRUE) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Stacking several vectors from the list
Hi everybody, I'm working on the very messy data, I have tried to clean it up in SAS and SAS/IML but there is not enough info on how to handle certain things in SAS so I have turned to R. The thing itself should be rather simple, so i was wondering if someone could help me out. The original .csv has ([1] 7138 6338 ) dimensions with funds with the corresponding dates and observations for each date for around 10 years and 4000+ funds, meaning in COL5 has the next fund's name and so on. COL1 COL2 COL3 COL4 HBNNF US Equity DateEQY_SH_OUT PX_VOLUME #NAME? #N/A N/A 135000 7/7/2008#N/A N/A 105000 7/17/2008 #N/A N/A 59 7/22/2008 #N/A N/A 4 so in R this .csv is somehow read as list (using typeof) and not as dataframe, and a lot of stuff like regexpr searches in the whole file do not work or behave strangely. I want to stack the fund data, and create a long dataset with a fund name, date, eqy_sh_out and px_volume, with fund name present for each date. That should look like this, Fund_name DateEQY_SH_OUT PX_VOLUME HBNNF US Equity 7/7/2008#N/A N/A105000 HBNNF US Equity 7/17/2008 #N/A N/A59 HBNNF US Equity 7/22/2008 #N/A N/A4 HBNNF US Equity 7/24/2008 #N/A N/A3000 HBNNF US Equity 7/31/2008 #N/A N/A1000 HBNNF US Equity 8/20/2008 #N/A N/A1000 HBNNF US Equity 8/26/2008 #N/A N/A2000 HBNNF US Equity 8/27/2008 #N/A N/A2000 HBNNF US Equity 9/2/2008#N/A N/A5000 HND CN Equity 1/17/2008 #N/A N/A28000 HND CN Equity 1/18/2008 #N/A N/A25000 HND CN Equity 1/21/2008 #N/A N/A5000 HND CN Equity 1/22/2008 #N/A N/A101000 HND CN Equity 1/23/2008 #N/A N/A122000 Any way to accomplish this? Should be an easy way, but i have never worked with lists and somehow it doesn't read as a dataframe with strange results. small_raw[1,1] [1] HBNNF US Equity Levels: 0.26 0.46 COL1 HBNNF US Equity grep(Equity,as.character(small_raw)) integer(0) small_raw[[1]] [1] HBNNF US Equity [5] [9] [13] [17] [21] [25] [29] [33] [37] [41] [45] [49] [53] [57] [61] [65] [69] [73] [77] [81] [85] [89] [93] [97] 0.460.46 [101] 0.460.26 [105] 0.260.26 [109] 0.260.26 [113] 0.260.26 [117] 0.260.26 [121] 0.260.26 [125] 0.260.26 [129] 0.260.26 [133] 0.260.26 [137] 0.260.26 [141] 0.260.26 [145] 0.260.26 [149] 0.26
[R] ask a question about list in R project
my list al is as below: mylist=list(c(2,3),5,7) mylist [[1]] [1] 2 3 [[2]] [1] 5 [[3]] [1] 7 How could I get the following FOUR lists: First one [[1]] [1] 3 [[2]] [1] 5 [[3]] [1] 7 Second one [[1]] [1] 2 [[2]] [1] 5 [[3]] [1] 7 Third One [[1]] [1] 2 3 [[2]] [1] 7 Last one [[1]] [1] 2 3 [[2]] [1] 5 Do I have to use 'for' loops? Please give me sone suggestions! Thank you all! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Stacking several vectors from the list
On Mon, Jun 28, 2010 at 7:30 PM, astar...@uci.edu wrote: Hi everybody, I'm working on the very messy data, I have tried to clean it up in SAS and SAS/IML but there is not enough info on how to handle certain things in SAS so I have turned to R. The thing itself should be rather simple, so i was wondering if someone could help me out. The original .csv has ([1] 7138 6338 ) dimensions with funds with the corresponding dates and observations for each date for around 10 years and 4000+ funds, meaning in COL5 has the next fund's name and so on. COL1 COL2 COL3 COL4 HBNNF US Equity Date EQY_SH_OUT PX_VOLUME #NAME? #N/A N/A 135000 7/7/2008 #N/A N/A 105000 7/17/2008 #N/A N/A 59 7/22/2008 #N/A N/A 4 so in R this .csv is somehow read as list (using typeof) and not as dataframe, and a lot of stuff like regexpr searches in the The typeof of a data.frame is list so you do have a data frame -- not a list. Perhaps the problem is that you do not want factor columns but want character columns instead. Use read.csv(..., as.is = TRUE) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Stacking several vectors from the list
On Mon, Jun 28, 2010 at 7:40 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote: On Mon, Jun 28, 2010 at 7:30 PM, astar...@uci.edu wrote: Hi everybody, I'm working on the very messy data, I have tried to clean it up in SAS and SAS/IML but there is not enough info on how to handle certain things in SAS so I have turned to R. The thing itself should be rather simple, so i was wondering if someone could help me out. The original .csv has ([1] 7138 6338 ) dimensions with funds with the corresponding dates and observations for each date for around 10 years and 4000+ funds, meaning in COL5 has the next fund's name and so on. COL1 COL2 COL3 COL4 HBNNF US Equity Date EQY_SH_OUT PX_VOLUME #NAME? #N/A N/A 135000 7/7/2008 #N/A N/A 105000 7/17/2008 #N/A N/A 59 7/22/2008 #N/A N/A 4 so in R this .csv is somehow read as list (using typeof) and not as dataframe, and a lot of stuff like regexpr searches in the The typeof of a data.frame is list so you do have a data frame -- not a list. Perhaps the problem is that you do not want factor columns but want character columns instead. Use read.csv(..., as.is = TRUE) Just to be clear a data frame is a list so not a list means not just a list -- its also a data frame. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Model validation and penalization with rms package
I’ve been using Frank Harrell’s rms package to do bootstrap model validation. Is it the case that the optimum penalization may still give a model which is substantially overfitted? I calculated corrected R^2, optimism in R^2, and corrected slope for various penalties for a simple example: x1 - rnorm(45) x2 - rnorm(45) x3 - rnorm(45) y - x1 + 2*x2 + rnorm(45,0,3) ols0 - ols(y ~ x1 + x2 + x3, x=TRUE, y=TRUE) corrected.Rsq - rep(0,60) optimism.Rsq - rep(0,60) corrected.slope - rep(0,60) for (pen in 1:60) { olspen - ols(y ~ x1 + x2 + x3, penalty=pen, x=TRUE, y=TRUE) val - validate(olspen, B=200) corrected.Rsq[pen] - val[R-square, index.corrected] optimism.Rsq[pen] - val[R-square, optimism] corrected.slope[pen] - val[Slope, index.corrected] } plot(corrected.Rsq) x11(); plot(optimism.Rsq) x11(); plot(corrected.slope) p - pentrace(ols0, penalty=1:60) ols9 - ols(y ~ x1 + x2 + x3, penalty=9, x=TRUE, y=TRUE) validate(ols9, B=200) index.orig training test optimism index.corrected n R-square0.2523404 0.2820734 0.1911878 0.09088563 0.1614548 200 MSE 7.8497722 7.3525300 8.4918212 -1.13929116 8.9890634 200 Intercept 0.000 0.000 -0.1353572 0.13535717 -0.1353572 200 Slope 1.000 1.000 1.1707137 -0.17071372 1.1707137 200 pentrace tells me that of the penalties 1, 2,..., 60, corrected AIC is maximised by a penalty of 9. This is consistent with the corrected R^2 plot, which shows a maximum somewhere around 10. However, a penalty of 9 still gives an R^2 optimism of 0.09 (training R^2=0.28, test R^2=0.19), suggesting overfitting. Do we just have to live with this R^2 optimism? It can be decreased by taking a bigger penalty, but then the corrected R^2 is reduced. Also, a penalty of 9 gives a corrected slope of about 1.17 (corrected slope of 1 is achieved with a penalty of about 1 or 2). Thanks for any help/advice you can give. Mark -- Mark Seeto Statistician National Acoustic Laboratories A Division of Australian Hearing __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Need help for SVM code for microarray classification
Hi I am Aadhithya I am trying to write a code to classify microarray data (AML and ALL) using SVM in R my code goes like this : library(e1071) train-read.table(Z:/Documents/train.txt,header=T); test-read.table(Z:/Documents/test.txt,header=T); cl - c(c(rep(ALL,10), rep(AML,10))); model- svm(train,cl); pred - predict(model,t(test)); table(pred,t(cl)); But I am not able to run it its giving me error . I will be really grateful if someone can help me.Thanks in advance -- View this message in context: http://r.789695.n4.nabble.com/Need-help-for-SVM-code-for-microarray-classification-tp2271652p2271652.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to delete rows based on replicate values in one column with some extra calcuation
Hi, folks, Please let me address the problem by the following codes: first=c('u','b','e','k','j','c','u','f','c','e') second=c('usa','Brazil','England','Korea','Japan','China','usa','France','China','England') third=1:10 data=data.frame(first,second,third) ## You may understand values in the first column are the unique codes for those in the second column. So 'u' is only for usa. Replicate values appear the same rows for the first and second columns. ### Now I want to delete replicate rows with the same values in first (sceond) rows and sum up values in the third column for the same values. mm=melt(data,id='first') sum=cast(mm,first~variable,sum) ### This does not work. But the expected dataframe is like this: 1 uthird 8 2 bthird 2 3 ethird 13 4 kthird 4 5 jthird 5 6 cthird 15 8 fthird 8 Thanks in advance. Yi [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Matrix operations
Hello I have a quick question. I need to compute matrix in R, like A - t(X) %*% solve(V) %*% X, where X is a vector and V is a matrix This code works, but now i want to optimize it. I have try: A - crossprod(X, solve(V)) %*% X Is there another, better way? WBR Dima [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to draw multi group plot?
As the attachement,I wanna draw multi group plot. But I can only use : plot(x,y...) points(...) It's a heavy work to use these command if there're too many groups to be drawn because I have to use point() for many times. I wanna know wheter there's command which can draw the multigroup plot directly? Thanks My best. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ask a question about list in R project
There was a recent fortune suggestion along the lines of any simple English sentence can probably be satisfied with a simple set of R functions without loops. In this case you appear to be forgetting the simple English sentence part of that formulation. -- David. On Jun 28, 2010, at 7:37 PM, song song wrote: my list al is as below: mylist=list(c(2,3),5,7) mylist [[1]] [1] 2 3 [[2]] [1] 5 [[3]] [1] 7 How could I get the following FOUR lists: First one [[1]] [1] 3 [[2]] [1] 5 [[3]] [1] 7 Second one [[1]] [1] 2 [[2]] [1] 5 [[3]] [1] 7 Third One [[1]] [1] 2 3 [[2]] [1] 7 Last one [[1]] [1] 2 3 [[2]] [1] 5 Do I have to use 'for' loops? Please give me sone suggestions! Thank you all! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Matrix operations
Since X is a vector, then A - sum(X, solve(V, X)) is probably slightly better here. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Dmitrij Kudriavcev Sent: Tuesday, 29 June 2010 12:29 PM To: r-help@r-project.org Subject: [R] Matrix operations Hello I have a quick question. I need to compute matrix in R, like A - t(X) %*% solve(V) %*% X, where X is a vector and V is a matrix This code works, but now i want to optimize it. I have try: A - crossprod(X, solve(V)) %*% X Is there another, better way? WBR Dima [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Matrix operations
Oops. Try A - sum(X * solve(V, X)) (too fast!) -Original Message- From: Venables, Bill (CMIS, Cleveland) Sent: Tuesday, 29 June 2010 1:05 PM To: 'Dmitrij Kudriavcev'; 'r-help@r-project.org' Subject: RE: [R] Matrix operations Since X is a vector, then A - sum(X, solve(V, X)) is probably slightly better here. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Dmitrij Kudriavcev Sent: Tuesday, 29 June 2010 12:29 PM To: r-help@r-project.org Subject: [R] Matrix operations Hello I have a quick question. I need to compute matrix in R, like A - t(X) %*% solve(V) %*% X, where X is a vector and V is a matrix This code works, but now i want to optimize it. I have try: A - crossprod(X, solve(V)) %*% X Is there another, better way? WBR Dima [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.