Re: [R] R-help Digest, Vol 123, Issue 30
Hi all,

Are there any R packages that include circular stats similar to Oriana
(http://www.kovcomp.co.uk/oriana/newver4.html)? I am interested in
looking at annual patterns of bat activity, where the data will have
date/times and a relative abundance value for each date. I would like a
circular plot whose circumference axis is the 12 months of the year,
with relative abundance as the plotted value, and likely drawn with
ggplot2 so that colour can be set to species.

Thanks,
Bruce

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Construct plot combination using grid without plotting and retrieving an object?
Hi

On 05/24/13 20:24, Johannes Graumann wrote:
> Hi,
>
> I'm currently combining multiple plots using something along the lines
> of the following pseudo-code:
>
>     library(grid)
>     grid.newpage()
>     tmpLayout <- grid.layout(nrow=4, ncol=2)
>     pushViewport(viewport(layout = tmpLayout))
>
> and then proceeding with filling the viewports ... works fine, but for
> packaging of functions I would really prefer if I could assemble all of
> this in an object which in the end would be callable with print. I'm
> envisioning something along the lines of what I can do with ggplot2:
> return a plot as a ggplot object and plot it later rather than as I
> assemble it.
>
> Is that possible with a complex grid figure?
>
> Thanks for any pointers.
>
> Joh

You can work off-screen with grobs and gTrees and vpTrees, for example ...

    library(grid)
    vplay <- viewport(layout=grid.layout(2, 2), name="vplay")
    vp.1.1 <- viewport(layout.pos.col=1, layout.pos.row=1, name="vp.1.1")
    vp.2.2 <- viewport(layout.pos.col=2, layout.pos.row=2, name="vp.2.2")
    x <- gTree(childrenvp=vpTree(vplay, vpList(vp.1.1, vp.2.2)),
               children=gList(
                   rectGrob(vp="vplay::vp.1.1", gp=gpar(fill="grey")),
                   textGrob(1, vp="vplay::vp.1.1"),
                   rectGrob(vp="vplay::vp.2.2", gp=gpar(fill="grey")),
                   textGrob(2, vp="vplay::vp.2.2")))
    grid.newpage()
    grid.draw(x)

... is that the sort of thing you mean?

Paul

--
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
p...@stat.auckland.ac.nz
http://www.stat.auckland.ac.nz/~paul/
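The "callable with print" part of the question can be sketched on top of the gTree approach above. This is only an illustration, not something proposed in the thread: the function name make_figure, the class name "myfigure" and the viewport names are all made up here. It relies on the cl argument of gTree() to attach an extra S3 class, so that a print method can do the drawing later.

```r
library(grid)

# Hypothetical constructor: assembles the figure off-screen and returns it.
make_figure <- function() {
  lay  <- viewport(layout = grid.layout(2, 2), name = "lay")
  cell <- function(row, col) {
    viewport(layout.pos.row = row, layout.pos.col = col,
             name = paste0("cell", row, col))
  }
  gTree(
    childrenvp = vpTree(lay, vpList(cell(1, 1), cell(2, 2))),
    children = gList(
      rectGrob(vp = vpPath("lay", "cell11"), gp = gpar(fill = "grey")),
      textGrob("1", vp = vpPath("lay", "cell11")),
      rectGrob(vp = vpPath("lay", "cell22"), gp = gpar(fill = "grey")),
      textGrob("2", vp = vpPath("lay", "cell22"))
    ),
    cl = "myfigure"   # extra S3 class so that print() can dispatch on it
  )
}

# The print method is the only place where anything is drawn.
print.myfigure <- function(x, ...) {
  grid.newpage()
  grid.draw(x)
  invisible(x)
}

fig <- make_figure()   # nothing has been drawn at this point
```

Calling print(fig), or typing fig at the prompt, then draws the figure on the current device, much as a ggplot object is drawn when printed.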
Re: [R] 3d interactive video using the rgl package
Hi Duncan,

Thanks a lot for your response, that was very helpful. I've managed to
get my head around the JavaScript code produced by the writeWebGL
function: I now have a 4d interactive animation that can be played in a
web browser. Let me know if you're interested in seeing it and I'll send
it to you by email.

Thanks again for your kind help.

Regards,
Xavier Hoenner

From: Duncan Murdoch [murdoch.dun...@gmail.com]
Sent: Thursday, 16 May 2013 11:52 PM
To: Xavier Hoenner
Cc: r-help@r-project.org
Subject: Re: [R] 3d interactive video using the rgl package

> On 16/05/2013 4:06 AM, Xavier Hoenner wrote:
>> Hi all,
>>
>> I've been using the 'rgl' package to visualise in 3d the water
>> temperature recorded by a glider deployed off the coast of Australia
>> (see snapshot attached). Using the writeWebGL function, I'm able to
>> produce an html file of the scene with which I can then interact
>> (e.g. zoom in/out, rotate) in my web browser.
>>
>> In R, I have created another scene that includes a loop plotting the
>> movements of the glider with the time. Is it possible to export that
>> whole animation with the writeWebGL function? I've only managed to
>> export the scene once all the points of my loop have been plotted,
>> and the movie3d() function is not really a good option for me as I
>> would like to be able to interact with my 3d animation in my web
>> browser.
>>
>> Thanks in anticipation for your help.
>>
>> Xavier
>
> No, that's not currently supported. You could probably do it using
> Javascript in the web page produced by writeWebGL. I'm not sure whether
> it could be done entirely using the template argument, or whether you'd
> need to manually edit the writeWebGL output.
>
> If you put together something like this, please let me know. I'd like
> to see it.
>
> Duncan Murdoch

Dr. Xavier Hoenner
eMII Project Officer, Integrated Marine Observing System (IMOS)
University of Tasmania, Sandy Bay campus, Maths Building, room 355,
Private Bag 21, Hobart, TAS 7001
Tel: +61 3 6226 1752; Mob: +61 411 271 462; Fax: +61 3 6226 8575
Email: xavier.hoen...@utas.edu.au, URL: http://imos.org.au/emii.html
[R] configure ddply() to avoid reordering of '.variables'
Hello,

I'm using ddply() in plyr and I notice that it has the habit of
re-ordering the levels of the '.variables' by which the splitting is
done. I'm concerned about correctly retrieving the original ordering.
Consider:

    require(plyr)
    x <- iris[ order(iris$Species, decreasing=T), ]
    head(x)
    #    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
    #101          6.3         3.3          6.0         2.5 virginica
    #102          5.8         2.7          5.1         1.9 virginica
    #103          7.1         3.0          5.9         2.1 virginica
    #104          6.3         2.9          5.6         1.8 virginica
    #105          6.5         3.0          5.8         2.2 virginica
    #106          7.6         3.0          6.6         2.1 virginica
    xa <- ddply(x, .(Species), function(x)
        {data.frame(Sepal.Length=x$Sepal.Length,
                    mean.adj=(x$Sepal.Length - mean(x$Sepal.Length)))})
    #  |==| 100%
    ##notice how the ordering of Species is different
    ##from that in the input data frame
    head(xa)
    #  Species Sepal.Length mean.adj
    #1  setosa          5.1    0.094
    #2  setosa          4.9   -0.106
    #3  setosa          4.7   -0.306
    #4  setosa          4.6   -0.406
    #5  setosa          5.0   -0.006
    #6  setosa          5.4    0.394
    all.equal(xa$Species, x$Species)
    #[1] "100 string mismatches"
    all.equal(xa[ order(xa$Species, decreasing=T), ]$Species, x$Species)
    #[1] TRUE
    all.equal(xa$Sepal.Length, x$Sepal.Length)
    #[1] "Mean relative difference: 0.2785"
    all.equal(xa[ order(xa$Species, decreasing=T), ]$Sepal.Length,
              x$Sepal.Length)
    #[1] TRUE

In my real data, should I be concerned that simply reordering by the
'.variables' variable wouldn't necessarily restore the original ordering
as in the input data frame? Is it possible to instruct ddply() to avoid
re-ordering the supplied '.variables' variable?

Regards,
Liviu

--
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail
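For this particular computation (a per-group adjustment that returns one value per input row), the ordering question can be sidestepped without plyr. The following is a minimal sketch using base R's ave(), offered as an alternative rather than as a way to configure ddply(): ave() returns its results in the positions of the input rows, so nothing needs to be re-sorted afterwards.

```r
# Rows deliberately ordered differently from the factor levels
x <- iris[order(iris$Species, decreasing = TRUE), ]

# ave() applies FUN within each group and puts each result back at the
# position of the row it came from, so the row order of x is untouched.
x$mean.adj <- ave(x$Sepal.Length, x$Species,
                  FUN = function(v) v - mean(v))

head(x[, c("Species", "Sepal.Length", "mean.adj")])
```

This only covers functions that return a vector as long as the group; summaries that change the number of rows still need a split-apply-combine tool.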
[R] Indexing within by statement - different coloured lines in abline wanted..
Dear R-list

I'm trying to get each regression line, plotted using abline, to be of a
different colour as the following code illustrates. I'm hoping there is
a simple indexing solution. Many thanks.

    ## code from here
    colours=c("black","red","blue","green","pink")
    Mean=500;Sd=10;NosSites=5;Xaxis=seq(1,5,1)
    SlopeCoefficient=5;Site=(gl(NosSites,length(Xaxis),labels=1:NosSites))
    Predictor=rep(Xaxis,NosSites)
    InterceptAdjustment=rnorm(n=NosSites,mean=Xaxis,sd=50)
    RandomIntercept=rep(InterceptAdjustment,each=length(Xaxis))
    PreResponse=rnorm(n=length(Predictor),
                      mean=Mean+SlopeCoefficient*1:length(Xaxis),sd=Sd)
    Response1=PreResponse+RandomIntercept
    #create data frame
    Data2=data.frame(Site,Predictor,Mean,SlopeCoefficient,RandomIntercept,Response1)
    Data1=data.frame(Site=Data2$Site,Predictor=Data2$Predictor,Response1=Data2$Response1)
    #plotting
    var=as.numeric(levels(Data1$Site))
    par(mfrow=c(1,3))
    # note: MN and MX are not defined anywhere in the posted code
    plot(Response1~Predictor,data=Data1,xlim=c(min(Xaxis),max(Xaxis)),ylim=c(MN,MX),
         pch=as.numeric(Site),main="Raw data with linear regressions by Site")
    by(Data1,Data1$Site,function(Site){
        par(new=T)
        abline(lm(Response1~Predictor,data=Site),col=colours[])#index in here.
    })

The Scottish Association for Marine Science (SAMS) is registered in
Scotland as a Company Limited by Guarantee (SC009292) and is a
registered charity (9206). SAMS has an actively trading wholly owned
subsidiary company: SAMS Research Services Ltd a Limited Company
(SC224404). All Companies in the group are registered in Scotland and
share a registered office at Scottish Marine Institute, Oban Argyll PA37
1QA. The content of this message may contain personal views which are
not the views of SAMS unless specifically stated. Please note that all
email traffic is monitored for purposes of security and spam filtering.
As such individual emails may be examined in more detail.
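One way to get an index for the colours is to loop over the site levels explicitly instead of using by(), so that the loop counter doubles as the colour index. This is a minimal sketch with a small simulated data frame (the name d and its values are made up for illustration; it is not the poster's data):

```r
set.seed(1)                                   # illustrative data only
d <- data.frame(Site = gl(5, 5), Predictor = rep(1:5, 5))
d$Response1 <- 500 + 5 * d$Predictor + rnorm(25, sd = 10)
colours <- c("black", "red", "blue", "green", "pink")

plot(Response1 ~ Predictor, data = d, pch = as.numeric(d$Site))

site_levels <- levels(d$Site)
for (i in seq_along(site_levels)) {
  # fit one regression per site and draw it in the i-th colour
  fit <- lm(Response1 ~ Predictor, data = d[d$Site == site_levels[i], ])
  abline(fit, col = colours[i])
}
```

Because abline() adds to the existing plot, no par(new=T) is needed inside the loop.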
[R] Classification of Multivariate Time Series
Dear All,

Apologies for not posting a code snippet, but I really need a pointer
about a methodology to look at my data and possibly some R package which
can ease my task.

I am given a set consisting of several multivariate noisy time series,
let's call it {A}. Each A_i in {A}, in turn, consists of several
numerical time series. Then I have another set of shorter time series
{B}. Now, for every B_j in {B}, I need to determine the time series A_i
where most likely B_j comes from (A_i is not just a subset of B_j). In
other words, I need to determine the distance between A_i and B_j.

I was thinking about the Mahalanobis distance described here:
http://en.wikipedia.org/wiki/Mahalanobis_distance

However, I have several questions in my head:

1) With the Mahalanobis distance, do I lose the info about the time
   structure of the data? I am not just comparing some distributions,
   but some time series, and the ordering of the data is important.

2) Even if the use of the Mahalanobis distance was appropriate, it
   involves the calculation of a covariance matrix and a mean. Should I
   average A_i or B_j (or a subset of B_j having the same length as
   A_i)? And should I use a correlation matrix based on A_i or B_j?

Any suggestion is welcome.

Lorenzo
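Question 1 can be checked numerically. The sketch below uses simulated matrices (A, B and the helper d2 are made up for illustration) and assumes one particular way of applying the distance, namely comparing the mean of B_j with the mean and covariance of A_i. Under that assumption, shuffling the rows of B leaves the distance unchanged, which shows that the time ordering plays no role in it.

```r
set.seed(42)                             # simulated data, illustration only
A <- matrix(rnorm(200 * 3), ncol = 3)    # reference series, 3 channels
B <- matrix(rnorm(50 * 3),  ncol = 3)    # shorter series to be classified

# Distance of the mean of b from the mean of A, scaled by A's covariance.
# Note that stats::mahalanobis() returns the squared distance.
d2 <- function(b) mahalanobis(colMeans(b), colMeans(A), cov(A))

B_shuffled <- B[sample(nrow(B)), ]       # destroy the time ordering of B
all.equal(d2(B), d2(B_shuffled))         # the two distances agree
```

A measure that should respect ordering has to be built on something order-dependent, such as lagged values or an alignment between the series.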
[R] Time Series prediction
Hello,

I would like to use a parametric TS model and predictor as a benchmark
to compare against other ML methods I'm employing. I currently build a
simple model, e.g. ARIMA, using the convenient auto.arima function like
this:

    library(forecast)
    df <- read.table("/Users/bravegag/data/myts.dat")
    # btw my test data doesn't have seasonality but periodicity so the value
    # 2 is arbitrarily set, using a freq of yearly or 1 would make unhappy some
    # R ts functions
    tsdata <- ts(df$signal, freq=2)
    arimamodel <- auto.arima(tsdata, max.p=15, max.q=10, stationary=FALSE,
                             ic="bic", stepwise=TRUE, seasonal=FALSE,
                             parallel=FALSE, num.cores=4, trace=TRUE,
                             allowdrift=TRUE)
    arimapred <- forecast.Arima(arimamodel, h=20)
    plot(arimapred)

The problem is that the forecast.Arima function is apparently doing a
free run, i.e. it uses the forecast(t+1) value as input to compute
forecast(t+2). I'm instead interested in a prediction mode where it
always uses the observed tsdata(t+1) value to predict forecast(t+2), the
observed tsdata(t+2) to predict forecast(t+3), and so on.

Can anyone please advise how to achieve this?

TIA,
Best regards,
Giovanni
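The "always use the observed value" mode described above amounts to a sequence of one-step-ahead forecasts with the coefficients held fixed. The sketch below shows the idea with base R's stats::arima() on a simulated AR(1) series; the series, the training length and the model order are all assumptions made for illustration, and it is not the forecast package's own mechanism. (The forecast package's Arima() also has a model argument meant for re-applying a fitted model to new data, which may be more convenient; check its documentation.)

```r
set.seed(123)                                   # simulated series, illustration only
y <- as.numeric(arima.sim(list(ar = 0.7), n = 220))
n_train <- 200
fit <- arima(y[1:n_train], order = c(1, 0, 0))  # estimate once on the training part

one_step <- numeric(length(y) - n_train)
for (i in seq_along(one_step)) {
  t_end <- n_train + i - 1
  # Same coefficients, more observed data: with every parameter supplied
  # through 'fixed', nothing is re-estimated.
  refit <- arima(y[1:t_end], order = c(1, 0, 0), fixed = coef(fit),
                 transform.pars = FALSE)
  one_step[i] <- predict(refit, n.ahead = 1)$pred[1]
}

head(cbind(observed = y[(n_train + 1):length(y)], forecast = one_step))
```

Each forecast for time t+1 is conditioned on the observed values up to t, so forecast errors do not feed into later forecasts.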
Re: [R] Classification of Multivariate Time Series
Did you have a look at Dynamic Time Warping and the dtw package?

Best,
E.

On Mon, May 27, 2013 at 01:34:42PM +0200, Lorenzo Isella wrote:
> [snip]
Re: [R] Indexing within by statement - different coloured lines in abline wanted..
Slightly different approach, but will this do what you want?

    library(ggplot2)
    ggplot(Data1, aes(Predictor, Response1, colour = Site)) +
        geom_smooth(method = "lm", se = FALSE) +
        ggtitle("Raw data with linear regressions by Site")

John Kane
Kingston ON Canada

-Original Message-
From: tom.wild...@sams.ac.uk
Sent: Mon, 27 May 2013 10:39:58 +
To: r-help@r-project.org
Subject: [R] Indexing within by statement - different coloured lines in abline wanted..

> [snip]
[R] metaMDS with large dataset produces 'insufficient data' warning
Greetings everyone,

I am running MDS on a very large dataset (12 x 25071 - 12 model runs
with 25071 output values each), and also on a very much reduced version
of the dataset (randomly select 1000 of the 25071 output values). I
would like to look at similarities/dissimilarities between the 12 model
runs. When I use metaMDS on the full dataset, I get a warning message:

    Warning message:
    In metaMDS(MDSdata, distance = "bray", k = 2, autotransform = FALSE) :
      Stress is (nearly) zero - you may have insufficient data

I don't think I have insufficient data, with 12 x 25071 data points, and
when I reduce the dataset to only 1000 values per model run (so only 12
x 1000) I don't get this warning (though the final stress is now only
just below 0.2 - my desired value).

Is this warning because I have insufficient data? Or is it because of
the nature of a large dataset? I can supply a dataset in .txt format by
email, if that would be helpful.

Thanks for your help,
Raeanne
Re: [R] Indexing within by statement - different coloured lines in abline wanted..
    abline(lm(Response1~Predictor,data=Site),col=colours[as.numeric(Site[1,1])])

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Tom Wilding
Sent: Monday, 27 May 2013 12:40
To: r-help@r-project.org
Subject: [R] Indexing within by statement - different coloured lines in abline wanted..

> [snip]
Re: [R] Classification of Multivariate Time Series
Look at:

    "State-Space Discrimination and Clustering of Atmospheric Time
    Series Data Based on Kullback Information Measures"
    Thomas Bengtsson

If you Google the topic, there are a host of other papers too, but this
one meshes with existing state-space methods.

-Roy

On May 27, 2013, at 4:34 AM, Lorenzo Isella <lorenzo.ise...@gmail.com> wrote:
> [snip]

**
"The contents of this message do not reflect any position of the U.S.
Government or NOAA."
**
Roy Mendelssohn
Supervisory Operations Research Analyst
NOAA/NMFS
Environmental Research Division
Southwest Fisheries Science Center
1352 Lighthouse Avenue
Pacific Grove, CA 93950-2097

e-mail: roy.mendelss...@noaa.gov (Note new e-mail address)
voice: (831)-648-9029
fax: (831)-648-8440
www: http://www.pfeg.noaa.gov/

"Old age and treachery will overcome youth and skill."
"From those who have been given much, much will be expected"
"the arc of the moral universe is long, but it bends toward justice" -MLK Jr.
Re: [R] configure ddply() to avoid reordering of '.variables'
Maybe this helps:

    levels(x$Species)
    #[1] "setosa"     "versicolor" "virginica"
    x$Species <- factor(x$Species, levels=unique(x$Species))
    xa <- ddply(x, .(Species), function(x)
        {data.frame(Sepal.Length=x$Sepal.Length,
                    mean.adj=(x$Sepal.Length - mean(x$Sepal.Length)))})
    head(xa)
    #    Species Sepal.Length mean.adj
    #1 virginica          6.3   -0.288
    #2 virginica          5.8   -0.788
    #3 virginica          7.1    0.512
    #4 virginica          6.3   -0.288
    #5 virginica          6.5   -0.088
    #6 virginica          7.6    1.012

A.K.

- Original Message -
From: Liviu Andronic <landronim...@gmail.com>
To: "r-help@r-project.org Help" <r-help@r-project.org>
Cc:
Sent: Monday, May 27, 2013 4:47 AM
Subject: [R] configure ddply() to avoid reordering of '.variables'

> [snip]
[R] Question about subsetting S4 object in ROCR
Dear list

I'm testing a predictor and I produced nice performance plots with the
ROCR package, using the 3 standard commands:

    pred <- prediction(predictions, labels)
    perf <- performance(pred, measure = "tpr", x.measure = "fpr")
    plot(perf, col=rainbow(10))

The pred object and the perf object are S4, with the following slots:

    An object of class "performance"
    Slot "x.name":
    [1] "False positive rate"
    Slot "y.name":
    [1] "True positive rate"
    Slot "alpha.name":
    [1] "Cutoff"
    Slot "x.values":
    [[1]]
     [1] 0.00 0.00 0.05 0.10 0.10 0.10 0.10 0.10 0.15 0.15 0.15 0.20 0.25 0.25 0.25
    [16] 0.25 0.25 0.30 0.35 0.35 0.35 0.40 0.40 0.45 0.50 0.50 0.55 0.55 0.60 0.65
    [31] 0.65 0.70 0.70 0.75 0.80 0.85 0.90 0.90 0.95 1.00 1.00
    Slot "y.values":
    [[1]]
     [1] 0.00 0.05 0.05 0.05 0.10 0.15 0.20 0.25 0.25 0.30 0.35 0.35 0.35 0.40 0.45
    [16] 0.50 0.55 0.55 0.55 0.60 0.65 0.65 0.70 0.70 0.70 0.75 0.75 0.80 0.80 0.80
    [31] 0.85 0.85 0.90 0.90 0.90 0.90 0.90 0.95 0.95 0.95 1.00
    Slot "alpha.values":
    [[1]]
     [1]   Inf 33309 32968 31688 31648 31355 31122 31047 30777 30589 30460 30395
    [13] 30305 30159 29841 29101 28734 28657 28393 28196 27740 27662 27373 27078
    [25] 26763 26303 25573 25416 25364 25357 24993 23834 23789 23616 22357 20669
    [37] 20092 18720 18136 17323 16665

Now I'd like to make a plot (and also compute the AUC) only of the area
corresponding to 0.80 y.values and 0.40 x.values. In your experience, is
it possible to subset the perf object to the aforementioned values?

Thanks
Guido
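Since the slots shown above are plain lists of numeric vectors, they can be extracted with @ and subset like any other vector. The sketch below is an illustration only: it uses a short made-up curve so that it runs without ROCR, the helper partial_auc is hypothetical, and it reads the request as "false positive rate up to 0.40", which is an assumption because the comparison signs did not survive in the post. (ROCR's "auc" measure also documents an fpr.stop option for a partial area; see its help page.)

```r
# With a real ROCR object the two vectors would come from
#   fpr <- perf@x.values[[1]]; tpr <- perf@y.values[[1]]
# A short made-up curve is used here instead.
fpr <- c(0.0, 0.1, 0.2, 0.4, 0.6, 1.0)
tpr <- c(0.0, 0.4, 0.6, 0.8, 0.9, 1.0)

# Trapezoidal area under the curve, using only points with fpr <= max_fpr.
# No interpolation is done at the cutoff itself.
partial_auc <- function(fpr, tpr, max_fpr = 1) {
  keep <- fpr <= max_fpr
  x <- fpr[keep]
  y <- tpr[keep]
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

partial_auc(fpr, tpr, max_fpr = 0.4)

# Plot only the selected part of the curve
sel <- fpr <= 0.4
plot(fpr[sel], tpr[sel], type = "l",
     xlab = "False positive rate", ylab = "True positive rate")
```

A condition on the true positive rate can be added to the logical index in the same way.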
Re: [R] configure ddply() to avoid reordering of '.variables'
Also, you can check:
http://stackoverflow.com/questions/7235421/how-to-ddply-without-sorting

    keeping.order <- function(data, fn, ...) {
      col <- ".sortColumn"
      data[,col] <- 1:nrow(data)
      out <- fn(data, ...)
      if (!col %in% colnames(out)) stop("Ordering column not preserved by function")
      out <- out[order(out[,col]),]
      out[,col] <- NULL
      out
    }

    x <- iris[ order(iris$Species, decreasing=T), ]
    xa <- ddply(x,.(Species),mutate,mean.adj=Sepal.Length-mean(Sepal.Length))[-c(2:4)]
    xa1 <- keeping.order(x,ddply,.(Species),mutate,mean.adj=Sepal.Length-mean(Sepal.Length))[-c(2:4)]
    head(xa1)
    #    Sepal.Length   Species mean.adj
    #101          6.3 virginica   -0.288
    #102          5.8 virginica   -0.788
    #103          7.1 virginica    0.512
    #104          6.3 virginica   -0.288
    #105          6.5 virginica   -0.088
    #106          7.6 virginica    1.012

A.K.

- Original Message -
From: arun <smartpink...@yahoo.com>
To: Liviu Andronic <landronim...@gmail.com>
Cc: R help <r-help@r-project.org>
Sent: Monday, May 27, 2013 10:06 AM
Subject: Re: [R] configure ddply() to avoid reordering of '.variables'

> [snip]
[R] updating observations in lm
dear R experts---I would like to update OLS regressions with new observations on the front of the data, and delete some old observations from the rear. my goal is to have a flexible moving-window regression, with a minimum number of observations and a maximum number of observations. I can keep (X' X) and (X' y), and add or subtract observations from these two quantities myself, and then use crossprod. strucchange does recursive residuals, which is closely related, but it is not designed for such flexible movable windows, nor primarily designed to produce standard errors of coefficients. before I get started on this, I just wanted to inquire whether someone has already written such a function. regards, /iaw Ivo Welch (ivo.we...@gmail.com)
[R] MLE for probit regression. How to avoid p=1 or p=0
Dear all: I am writing the following small function for a probit likelihood. As indicated, in order to avoid p=1 or p=0, I defined some precisions. I feel however, that there might be a better way to do this. Any help is greatly appreciated.

##set limits to avoid px=0 or px=1
precision1 <- 0.99
precision0 <- 0.01
logpost <- function(par, data){
  px <- pnorm(b0 + b1x)
  # to avoid px=1 or px=0
  px[px > precision1] <- precision1
  px[px < precision0] <- precision0
  loga <- sum( y*log(px)+(1-y)*log(1-px) )
  loga
}

Best, Keramat Nourijelyani, PhD Associate Professor of Biostatistics Tehran University of Medical Sciences http://tums.ac.ir/faculties/nourij
Re: [R] MLE for probit regression. How to avoid p=1 or p=0
Hello, You write a function of two arguments, 'par' and 'data' and do not use them in the body of the function. Furthermore, what are b0, b1x and y? Also, take a look at ?.Machine. In particular, couldn't you use
precision0 <- .Machine$double.eps
precision1 <- 1 - .Machine$double.eps
instead of 0.01 and 0.99? Hope this helps, Rui Barradas

On 27-05-2013 16:21, knouri wrote: Dear all: I am writing the following small function for a probit likelihood. As indicated, in order to avoid p=1 or p=0, I defined some precisions. I feel however, that there might be a better way to do this. Any help is greatly appreciated.
##set limits to avoid px=0 or px=1
precision1 <- 0.99
precision0 <- 0.01
logpost <- function(par, data){
  px <- pnorm(b0 + b1x)
  # to avoid px=1 or px=0
  px[px > precision1] <- precision1
  px[px < precision0] <- precision0
  loga <- sum( y*log(px)+(1-y)*log(1-px) )
  loga
}
Best, Keramat Nourijelyani, PhD Associate Professor of Biostatistics Tehran University of Medical Sciences http://tums.ac.ir/faculties/nourij
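One way to sidestep the clipping question altogether is to never form px and log() it, but to ask pnorm() for the log-probabilities directly. The sketch below is not the original poster's function: the argument names (par, X, y), the design-matrix layout and the simulated data are all assumptions made for illustration.

```r
# Sketch: probit log-likelihood computed on the log scale, so no p = 0 / p = 1 clipping.
# Assumptions: par = coefficient vector, X = design matrix with intercept column, y in {0, 1}.
probit_loglik <- function(par, X, y) {
  eta <- as.vector(X %*% par)
  # log(Phi(eta)) and log(1 - Phi(eta)) straight from pnorm()
  sum(y * pnorm(eta, log.p = TRUE) +
      (1 - y) * pnorm(eta, lower.tail = FALSE, log.p = TRUE))
}

# Simulated data (hypothetical) to exercise the function
set.seed(1)
n <- 200
x <- rnorm(n)
X <- cbind(1, x)
y <- rbinom(n, 1, pnorm(-0.5 + 1.2 * x))

# Maximise (fnscale = -1) and compare with glm()'s probit fit
fit <- optim(c(0, 0), probit_loglik, X = X, y = y,
             method = "BFGS", control = list(fnscale = -1))
ref <- glm(y ~ x, family = binomial(link = "probit"))
rbind(optim = fit$par, glm = coef(ref))
```

The log.p and lower.tail arguments of pnorm() keep the terms finite even for extreme linear predictors, where log(1 - pnorm(eta)) would return -Inf.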
[R] Assistant
Dear Sir/Ma, I am Adelabu A.A., one of the R users from Nigeria. I have a data set of claims paid and premiums for individual life-insurance policyholders, but it is not in triangle form. How can I run a stochastic chain ladder in R on it? Please help.
[R] How sum all possible combinations of rows, given 4 matrices
Hello all, I have 4 matrices with 3 columns each (different number of rows though). I want to find a function that returns all possible 3-place vectors corresponding to the sum by columns of picking one row from matrix 1, one from matrix 2, one from matrix 3, and one from matrix 4. So basically, all possible ways of picking one row from each matrix and then sum their columns to obtain a 3-place vector. Is there a way to use expand.grid and reduce to obtain this result? Or am I on the wrong track? Thank you, Bruno PS: I believe I have given all relevant info. I apologize in advance if my question is ill-posed or ambiguous.
[R] Stop on fail using data manipulation
Hello, I have a data set with test results for multiple devices (rows). I also have an index (column) that stores the first failing test for each device. I need to remove the results for all the tests that come after the first failing test.

Example of a data table:
Device,first_failing_test,test1,test2,test3,test4,test5
1,test2,1,2,3,4,5
2,test4,2,3,4,5,6
3,test1,3,4,5,6,7

New table:
Device,first_failing_test,test1,test2,test3,test4,test5
1,test2,1,2,na,na,na
2,test4,2,3,4,5,na
3,test1,3,4,5,na,na

Ideally I need to pass the first table as an argument to a function and get back the second table. Any idea how this can be done in R? Thanks
Re: [R] R-help Digest, Vol 123, Issue 30
On 05/27/2013 10:28 AM, Neotropical bat risk assessments wrote: Hi all are there any R packages that include circular stats similar to Oriana (http://www.kovcomp.co.uk/oriana/newver4.html)? I am interested in looking at annual patterns of bat activity where data will have date/times and relative abundance values for each Date. I would like to have a circular plot with the circumference axis the 12 months of the year and then a value of relative abundance and likely with ggplot2 this can be set to color= species. Tnx Bruce

Hi Bruce, Here is a possibility:
library(plotrix)
batact <- matrix(c(sin(seq(0,1.833*pi,length=12))+2+rnorm(36)/4),
  nrow=3,byrow=TRUE)
batpos <- seq(0,1.833*pi,length=12)
radial.plot(batact,batpos,rp.type="ps",main="Bat activity by month",
  line.col=2:4,radial.lim=0:4,label.pos=batpos,labels=month.abb,
  point.symbols=16:18,point.col=2:4,label.prop=1.1,start=pi/2,
  clockwise=TRUE)
legend(-3.5,0.5,paste("Species",1:3),pch=16:18,col=2:4)
Jim
Re: [R] updating observations in lm
Ivo: 1. You should not be fitting linear models as you describe. For why not and how they should be fit, consult a suitable text on numerical methods (e.g. Givens and Hoeting). 2. In R, I suggest using lm() and ?update, feeding update() data modified as you like. This is, after all, the reason for update(). -- Bert

On Mon, May 27, 2013 at 8:12 AM, ivo welch ivo.we...@anderson.ucla.edu wrote: dear R experts---I would like to update OLS regressions with new observations on the front of the data, and delete some old observations from the rear. my goal is to have a flexible moving-window regression, with a minimum number of observations and a maximum number of observations. I can keep (X' X) and (X' y), and add or subtract observations from these two quantities myself, and then use crossprod. strucchange does recursive residuals, which is closely related, but it is not designed for such flexible movable windows, nor primarily designed to produce standard errors of coefficients. before I get started on this, I just wanted to inquire whether someone has already written such a function. regards, /iaw Ivo Welch (ivo.we...@gmail.com)

-- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
Re: [R] How sum all possible combinations of rows, given 4 matrices
Homework? We don't do homework here. -- Bert

On Mon, May 27, 2013 at 8:24 AM, Estigarribia, Bruno estig...@email.unc.edu wrote: Hello all, I have 4 matrices with 3 columns each (different number of rows though). I want to find a function that returns all possible 3-place vectors corresponding to the sum by columns of picking one row from matrix 1, one from matrix 2, one from matrix 3, and one from matrix 4. So basically, all possible ways of picking one row from each matrix and then sum their columns to obtain a 3-place vector. Is there a way to use expand.grid and reduce to obtain this result? Or am I on the wrong track? Thank you, Bruno PS: I believe I have given all relevant info. I apologize in advance if my question is ill-posed or ambiguous.

-- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
Re: [R] How sum all possible combinations of rows, given 4 matrices
I expect the answer to involve manipulating indices. But why do you need to do this? This looks suspiciously like homework, and there is a no-homework policy on this list (see the Posting Guide).
--- Jeff Newmiller DCN: jdnew...@dcn.davis.ca.us Research Engineer (Solar/Batteries/Software/Embedded Controllers) --- Sent from my phone. Please excuse my brevity.

Estigarribia, Bruno estig...@email.unc.edu wrote: Hello all, I have 4 matrices with 3 columns each (different number of rows though). I want to find a function that returns all possible 3-place vectors corresponding to the sum by columns of picking one row from matrix 1, one from matrix 2, one from matrix 3, and one from matrix 4. So basically, all possible ways of picking one row from each matrix and then sum their columns to obtain a 3-place vector. Is there a way to use expand.grid and reduce to obtain this result? Or am I on the wrong track? Thank you, Bruno PS: I believe I have given all relevant info. I apologize in advance if my question is ill-posed or ambiguous.
Re: [R] updating observations in lm
hi bert---thanks for the answer. my particular problem is well conditioned [stock returns] and speed is very important. about 4 years ago, I asked for speedier alternatives to lm (and you helped me on this one, too), and then checked into the speed/accuracy tradeoff. http://r.789695.n4.nabble.com/very-fast-OLS-regression-td884832.html . for the particular problem I had, solve(crossprod(x),crossprod(x,y)) worked reasonably well. moreover, it is easy to debug, being so simple. it was faster than lm() by a factor 5.. (for a more generic library use, it would be nice to have a warning flag when this algorithm fails, in which case it would fall back on a more robust algorithm or at least emit a warning. I wonder how much it would cost to check the condition of the matrix before deciding on the algorithm.) I looked at update(), but its documentation seems to refer to updating models, not observations. even if it did, given the speed of lm(), I don't think it will be that useful. regards, /iaw Ivo Welch (ivo.we...@gmail.com) On Mon, May 27, 2013 at 9:26 AM, Bert Gunter gunter.ber...@gene.com wrote: Ivo: 1. You should not be fitting linear models as you describe. For why not and how they should be fit, consult a suitable text on numerical methods (e.g. Givens and Hoeting). 2. In R, I suggest using lm() and ?update, feeding update() data modified as you like. This is, after all, the reason for update(). -- Bert On Mon, May 27, 2013 at 8:12 AM, ivo welch ivo.we...@anderson.ucla.edu wrote: dear R experts---I would like to update OLS regressions with new observations on the front of the data, and delete some old observations from the rear. my goal is to have a flexible moving-window regression, with a minimum number of observations and a maximum number of observations. I can keep (X' X) and (X' y), and add or subtract observations from these two quantities myself, and then use crossprod. 
strucchange does recursive residuals, which is closely related, but it is not designed for such flexible movable windows, nor primarily designed to produce standard errors of coefficients. before I get started on this, I just wanted to inquire whether someone has already written such a function. regards, /iaw Ivo Welch (ivo.we...@gmail.com)

-- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
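The bookkeeping Ivo describes (keep X'X and X'y, add rows at the front, subtract rows at the rear) can be sketched in a few lines. The function names new_state, update_state and coef_se are hypothetical, not from any package, and the sketch assumes a well-conditioned problem, as stated in the thread; it uses the normal equations and so inherits the numerical caveats Bert raises.

```r
# Sketch: moving-window OLS via running cross-products (hypothetical helper names).
new_state <- function(k) list(XtX = matrix(0, k, k), Xty = numeric(k), yty = 0, n = 0)

# sign = 1 adds the rows in X/y to the window, sign = -1 removes them again.
update_state <- function(s, X, y, sign = 1) {
  X <- matrix(X, ncol = length(s$Xty))
  s$XtX <- s$XtX + sign * crossprod(X)
  s$Xty <- s$Xty + sign * drop(crossprod(X, y))
  s$yty <- s$yty + sign * sum(y^2)
  s$n   <- s$n + sign * nrow(X)
  s
}

# Coefficients and standard errors from the accumulated quantities only.
coef_se <- function(s) {
  b      <- solve(s$XtX, s$Xty)
  rss    <- s$yty - sum(b * s$Xty)          # y'y - b'X'y
  sigma2 <- rss / (s$n - length(b))
  list(coef = b, se = sqrt(diag(solve(s$XtX)) * sigma2))
}

set.seed(42)
X <- cbind(1, rnorm(100))
y <- drop(X %*% c(0.5, 2) + rnorm(100))

s <- update_state(new_state(2), X[1:50, ], y[1:50])   # initial window: rows 1..50
s <- update_state(s, X[51:60, ], y[51:60])            # add 10 new observations
s <- update_state(s, X[1:10, ], y[1:10], sign = -1)   # drop the 10 oldest
est <- coef_se(s)

# Reference fit on the same window (rows 11..60)
ref <- summary(lm(y[11:60] ~ X[11:60, 2]))$coefficients
```

Tracking y'y alongside X'X and X'y is what makes the residual variance, and hence the standard errors, available without revisiting the raw data.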
Re: [R] Stop on fail using data manipulation
I have a doubt about your New table, especially the 3rd row: since the test fails after test1, I guess 4,5 should be NA.

dat1 <- read.table(text="
Device,first_failing_test,test1,test2,test3,test4,test5
1,test2,1,2,3,4,5
2,test4,2,3,4,5,6
3,test1,3,4,5,6,7
",sep=",",header=TRUE,stringsAsFactors=FALSE)
res <- do.call(rbind,lapply(seq_len(nrow(dat1)),function(i) {
  indx <- colnames(dat1[i,])[-c(1:2)] %in% dat1[i,2]
  indx1 <- indx[-length(indx)]
  dat1[i,-c(1:2)][as.logical(cumsum(c(FALSE,indx1)))] <- NA
  dat1[i,]
}))
res
#  Device first_failing_test test1 test2 test3 test4 test5
#1      1              test2     1     2    NA    NA    NA
#2      2              test4     2     3     4     5    NA
#3      3              test1     3    NA    NA    NA    NA
A.K.

- Original Message -
From: Ala' Jaouni ajao...@gmail.com
To: r-help@r-project.org
Cc:
Sent: Monday, May 27, 2013 10:40 AM
Subject: [R] Stop on fail using data manipulation

Hello, I have a data set with test results for multiple devices (rows). I also have an index (column) that stores the first failing test for each device. I need to remove the results for all the tests that come after the first failing test.
Example of a data table:
Device,first_failing_test,test1,test2,test3,test4,test5
1,test2,1,2,3,4,5
2,test4,2,3,4,5,6
3,test1,3,4,5,6,7
New table:
Device,first_failing_test,test1,test2,test3,test4,test5
1,test2,1,2,na,na,na
2,test4,2,3,4,5,na
3,test1,3,4,5,na,na
Ideally I need to pass the first table as an argument to a function and get back the second table. Any idea how this can be done in R? Thanks
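Since the original poster asked for a function that takes the first table and returns the second, the same masking can be wrapped up without a per-row loop. This is a sketch, not A.K.'s code: the function name mask_after_first_fail and the id_cols argument are made up for illustration, and it assumes the test columns come after the identifier columns in run order.

```r
# Sketch: set every test after the first failing one to NA (hypothetical helper).
# Assumptions: the first `id_cols` columns are identifiers, the rest are tests in run order,
# and the column `first_failing_test` holds the name of a test column.
mask_after_first_fail <- function(dat, id_cols = 2) {
  tests    <- names(dat)[-seq_len(id_cols)]
  fail_pos <- match(dat$first_failing_test, tests)
  # TRUE where the column index lies beyond the first failing test of that row
  after <- outer(fail_pos, seq_along(tests), "<")
  after[is.na(after)] <- FALSE      # rows without a recorded failure are left alone
  vals <- dat[tests]
  vals[after] <- NA
  dat[tests] <- vals
  dat
}

dat1 <- read.table(text = "
Device,first_failing_test,test1,test2,test3,test4,test5
1,test2,1,2,3,4,5
2,test4,2,3,4,5,6
3,test1,3,4,5,6,7
", sep = ",", header = TRUE, stringsAsFactors = FALSE)

res <- mask_after_first_fail(dat1)
res
```

The result of the failing test itself is kept; only the later columns are blanked, which matches rows 1 and 2 of the poster's expected table.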
Re: [R] updating observations in lm
?lm.fit ## may be useful to you then. Have you tried it? -- Bert On Mon, May 27, 2013 at 9:52 AM, ivo welch ivo.we...@gmail.com wrote: hi bert---thanks for the answer. my particular problem is well conditioned [stock returns] and speed is very important. about 4 years ago, I asked for speedier alternatives to lm (and you helped me on this one, too), and then checked into the speed/accuracy tradeoff. http://r.789695.n4.nabble.com/very-fast-OLS-regression-td884832.html . for the particular problem I had, solve(crossprod(x),crossprod(x,y)) worked reasonably well. moreover, it is easy to debug, being so simple. it was faster than lm() by a factor 5.. (for a more generic library use, it would be nice to have a warning flag when this algorithm fails, in which case it would fall back on a more robust algorithm or at least emit a warning. I wonder how much it would cost to check the condition of the matrix before deciding on the algorithm.) I looked at update(), but its documentation seems to refer to updating models, not observations. even if it did, given the speed of lm(), I don't think it will be that useful. regards, /iaw Ivo Welch (ivo.we...@gmail.com) On Mon, May 27, 2013 at 9:26 AM, Bert Gunter gunter.ber...@gene.com wrote: Ivo: 1. You should not be fitting linear models as you describe. For why not and how they should be fit, consult a suitable text on numerical methods (e.g. Givens and Hoeting). 2. In R, I suggest using lm() and ?update, feeding update() data modified as you like. This is, after all, the reason for update(). -- Bert On Mon, May 27, 2013 at 8:12 AM, ivo welch ivo.we...@anderson.ucla.edu wrote: dear R experts---I would like to update OLS regressions with new observations on the front of the data, and delete some old observations from the rear. my goal is to have a flexible moving-window regression, with a minimum number of observations and a maximum number of observations. 
I can keep (X' X) and (X' y), and add or subtract observations from these two quantities myself, and then use crossprod. strucchange does recursive residuals, which is closely related, but it is not designed for such flexible movable windows, nor primarily designed to produce standard errors of coefficients. before I get started on this, I just wanted to inquire whether someone has already written such a function. regards, /iaw Ivo Welch (ivo.we...@gmail.com)

-- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
Re: [R] Question about subsetting S4 object in ROCR
On 27.05.2013 16:18, Guido Leoni wrote: Dear list I'm testing a predictor and I produced nice performance plots with ROCR package utilizing the 3 standard command
pred <- prediction(predictions, labels)
perf <- performance(pred, measure = "tpr", x.measure = "fpr")
plot(perf, col=rainbow(10))
The pred object and the perf object are S4 with the following slots
An object of class "performance"
Slot "x.name": [1] "False positive rate"
Slot "y.name": [1] "True positive rate"
Slot "alpha.name": [1] "Cutoff"
Slot "x.values": [[1]]
 [1] 0.00 0.00 0.05 0.10 0.10 0.10 0.10 0.10 0.15 0.15 0.15 0.20 0.25 0.25 0.25 0.25 0.25 0.30 0.35 0.35 0.35 0.40 0.40 0.45 0.50 0.50 0.55 0.55 0.60
[30] 0.65 0.65 0.70 0.70 0.75 0.80 0.85 0.90 0.90 0.95 1.00 1.00
Slot "y.values": [[1]]
 [1] 0.00 0.05 0.05 0.05 0.10 0.15 0.20 0.25 0.25 0.30 0.35 0.35 0.35 0.40 0.45 0.50 0.55 0.55 0.55 0.60 0.65 0.65 0.70 0.70 0.70 0.75 0.75 0.80 0.80
[30] 0.80 0.85 0.85 0.90 0.90 0.90 0.90 0.90 0.95 0.95 0.95 1.00
Slot "alpha.values": [[1]]
 [1] Inf 33309 32968 31688 31648 31355 31122 31047 30777 30589 30460 30395 30305 30159 29841 29101 28734 28657 28393 28196 27740 27662 27373 27078
[25] 26763 26303 25573 25416 25364 25357 24993 23834 23789 23616 22357 20669 20092 18720 18136 17323 16665
Now i'd like to make a plot (and also compute the AUC) only of the area corresponding to 0.80 y.values and 0.40 x.values. According to your experience is it possible to subset the perf object to the afore mentioned values?

But x=0.4 and y=0.8 is just a point, so I don't get which plot and area you are talking about now? Best, Uwe Ligges

Thanks Guido
Re: [R] How sum all possible combinations of rows, given 4 matrices
Hi, Not sure if this is what you expected:
set.seed(24)
mat1 <- matrix(sample(1:20,3*4,replace=TRUE),ncol=3)
set.seed(28)
mat2 <- matrix(sample(1:25,3*6,replace=TRUE),ncol=3)
set.seed(30)
mat3 <- matrix(sample(1:35,3*8,replace=TRUE),ncol=3)
set.seed(35)
mat4 <- matrix(sample(1:40,3*10,replace=TRUE),ncol=3)
dat1 <- expand.grid(seq(dim(mat1)[1]),seq(dim(mat2)[1]),seq(dim(mat3)[1]),seq(dim(mat4)[1]))
vec1 <- paste0("mat",1:4)
matNew <- do.call(cbind,lapply(seq_len(ncol(dat1)),function(i) get(vec1[i])[dat1[,i],]))
colnames(matNew) <- (seq(12)-1)%%3+1
datNew <- data.frame(matNew)
res <- sapply(split(colnames(datNew),gsub("\\..*","",colnames(datNew))),function(x) rowSums(datNew[,x]))
dim(res)
#[1] 1920 3
head(res)
#     X1 X2 X3
#[1,] 46 63 70
#[2,] 45 68 59
#[3,] 55 55 66
#[4,] 51 65 61
#[5,] 48 84 75
#[6,] 47 89 64
A.K.

- Original Message -
From: Estigarribia, Bruno estig...@email.unc.edu
To: r-help@R-project.org r-help@r-project.org
Cc:
Sent: Monday, May 27, 2013 11:24 AM
Subject: [R] How sum all possible combinations of rows, given 4 matrices

Hello all, I have 4 matrices with 3 columns each (different number of rows though). I want to find a function that returns all possible 3-place vectors corresponding to the sum by columns of picking one row from matrix 1, one from matrix 2, one from matrix 3, and one from matrix 4. So basically, all possible ways of picking one row from each matrix and then sum their columns to obtain a 3-place vector. Is there a way to use expand.grid and reduce to obtain this result? Or am I on the wrong track? Thank you, Bruno PS: I believe I have given all relevant info. I apologize in advance if my question is ill-posed or ambiguous.
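The expand.grid plus Reduce route that Bruno asked about can also be written fairly compactly. The sketch below is an alternative to the code in the reply above, not a restatement of it; the function name sum_row_combinations and the small example matrices are made up for illustration.

```r
# Sketch: all column-wise sums over one row picked from each matrix (hypothetical helper).
# Assumption: every matrix passed in has the same number of columns.
sum_row_combinations <- function(...) {
  mats <- list(...)
  # One column of row indices per matrix; the first matrix varies fastest.
  idx <- expand.grid(lapply(mats, function(m) seq_len(nrow(m))))
  # Pull the indexed rows out of each matrix, then add the matrices element-wise.
  Reduce(`+`, Map(function(m, i) m[i, , drop = FALSE], mats, idx))
}

m1 <- matrix(1:6,   ncol = 3)   # 2 rows
m2 <- matrix(1:9,   ncol = 3)   # 3 rows
m3 <- matrix(10:21, ncol = 3)   # 4 rows
out <- sum_row_combinations(m1, m2, m3)
dim(out)
```

Each row of the result corresponds to one row of the index grid, so the grid can be kept alongside it if the originating rows need to be traced back.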
[R] How I can rearrange columns in data.frame?
Hi R-User, I am wondering how I can rearrange columns in a table in R. I do have very big data set (4500 columns). I have given an example of the data set.
dput(dat)
structure(list(preV1001A1b = c(0.59, 0.3, 0.78, 0.43), preV15A1b = c(0.57, 0.62, 0.51, 0.95), preV2032A1b = c(0.4, 0.8, 0.24, 0.34), preV2035A1b = c(0.95, 0.67, 0.81, 0.8), preV59A1b = c(0.05, 0.57, 0.03, 0.5)), .Names = c("preV1001A1b", "preV15A1b", "preV2032A1b", "preV2035A1b", "preV59A1b"), class = "data.frame", row.names = c(NA, -4L))
I wanted to make like this
dput(dat1)
structure(list(preV15A1b = c(0.57, 0.62, 0.51, 0.95), preV59A1b = c(0.05, 0.57, 0.03, 0.5), preV1001A1b = c(0.59, 0.3, 0.78, 0.43), preV2032A1b = c(0.4, 0.8, 0.24, 0.34), preV2035A1b = c(0.95, 0.67, 0.81, 0.8)), .Names = c("preV15A1b", "preV59A1b", "preV1001A1b", "preV2032A1b", "preV2035A1b"), class = "data.frame", row.names = c(NA, -4L))
Any suggestions. KG
Re: [R] How I can rearrange columns in data.frame?
On 27-05-2013, at 20:17, Kristi Glover kristi.glo...@hotmail.com wrote: Hi R-User, I am wondering how I can rearrange columns in a table in R. I do have very big data set (4500 columns). I have given an example of the data set.
dput(dat)
structure(list(preV1001A1b = c(0.59, 0.3, 0.78, 0.43), preV15A1b = c(0.57, 0.62, 0.51, 0.95), preV2032A1b = c(0.4, 0.8, 0.24, 0.34), preV2035A1b = c(0.95, 0.67, 0.81, 0.8), preV59A1b = c(0.05, 0.57, 0.03, 0.5)), .Names = c("preV1001A1b", "preV15A1b", "preV2032A1b", "preV2035A1b", "preV59A1b"), class = "data.frame", row.names = c(NA, -4L))
I wanted to make like this
dput(dat1)
structure(list(preV15A1b = c(0.57, 0.62, 0.51, 0.95), preV59A1b = c(0.05, 0.57, 0.03, 0.5), preV1001A1b = c(0.59, 0.3, 0.78, 0.43), preV2032A1b = c(0.4, 0.8, 0.24, 0.34), preV2035A1b = c(0.95, 0.67, 0.81, 0.8)), .Names = c("preV15A1b", "preV59A1b", "preV1001A1b", "preV2032A1b", "preV2035A1b"), class = "data.frame", row.names = c(NA, -4L))
Any suggestions. KG

dat2 <- dat[,c("preV15A1b", "preV59A1b", "preV1001A1b", "preV2032A1b", "preV2035A1b")]
identical(dat1,dat2)
or something like this:
dat3.cols <- match(c("preV15A1b", "preV59A1b", "preV1001A1b", "preV2032A1b", "preV2035A1b"), names(dat))
dat3 <- dat[,dat3.cols]
identical(dat3,dat2)
A general solution will depend on the new ordering of your columns. Berend
Re: [R] How I can rearrange columns in data.frame?
On May 27, 2013, at 20:17 , Kristi Glover wrote: Hi R-User, I am wondering how I can rearrange columns in a table in R. I do have very big data set (4500 columns). I have given an example of the data set.
dput(dat)
structure(list(preV1001A1b = c(0.59, 0.3, 0.78, 0.43), preV15A1b = c(0.57, 0.62, 0.51, 0.95), preV2032A1b = c(0.4, 0.8, 0.24, 0.34), preV2035A1b = c(0.95, 0.67, 0.81, 0.8), preV59A1b = c(0.05, 0.57, 0.03, 0.5)), .Names = c("preV1001A1b", "preV15A1b", "preV2032A1b", "preV2035A1b", "preV59A1b"), class = "data.frame", row.names = c(NA, -4L))
I wanted to make like this
dput(dat1)
structure(list(preV15A1b = c(0.57, 0.62, 0.51, 0.95), preV59A1b = c(0.05, 0.57, 0.03, 0.5), preV1001A1b = c(0.59, 0.3, 0.78, 0.43), preV2032A1b = c(0.4, 0.8, 0.24, 0.34), preV2035A1b = c(0.95, 0.67, 0.81, 0.8)), .Names = c("preV15A1b", "preV59A1b", "preV1001A1b", "preV2032A1b", "preV2035A1b"), class = "data.frame", row.names = c(NA, -4L))
Any suggestions. KG

Is there a particular logic to that ordering? Otherwise, the obvious way is
nm <- c("preV15A1b", "preV59A1b", "preV1001A1b", "preV2032A1b", "preV2035A1b")
dat1 <- dat[nm]
Or, maybe you are looking for something like this?
o <- order(as.numeric(sub("preV([0-9]*)A1b", "\\1", names(dat))))
(dat1 <- dat[o])
-- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com
Re: [R] How I can rearrange columns in data.frame?
Hi, Try this:
dat2 <- dat[order(as.numeric(gsub("preV(\\d+).*","\\1",colnames(dat))))]
dat2
#  preV15A1b preV59A1b preV1001A1b preV2032A1b preV2035A1b
#1      0.57      0.05        0.59        0.40        0.95
#2      0.62      0.57        0.30        0.80        0.67
#3      0.51      0.03        0.78        0.24        0.81
#4      0.95      0.50        0.43        0.34        0.80
identical(dat1,dat2)
#[1] TRUE
A.K.

Hi R-User, I am wondering how I can rearrange columns in a table in R. I do have very big data set (4500 columns). I have given an example of the data set.
dput(dat)
structure(list(preV1001A1b = c(0.59, 0.3, 0.78, 0.43), preV15A1b = c(0.57, 0.62, 0.51, 0.95), preV2032A1b = c(0.4, 0.8, 0.24, 0.34), preV2035A1b = c(0.95, 0.67, 0.81, 0.8), preV59A1b = c(0.05, 0.57, 0.03, 0.5)), .Names = c("preV1001A1b", "preV15A1b", "preV2032A1b", "preV2035A1b", "preV59A1b"), class = "data.frame", row.names = c(NA, -4L))
I wanted to make like this
dput(dat1)
structure(list(preV15A1b = c(0.57, 0.62, 0.51, 0.95), preV59A1b = c(0.05, 0.57, 0.03, 0.5), preV1001A1b = c(0.59, 0.3, 0.78, 0.43), preV2032A1b = c(0.4, 0.8, 0.24, 0.34), preV2035A1b = c(0.95, 0.67, 0.81, 0.8)), .Names = c("preV15A1b", "preV59A1b", "preV1001A1b", "preV2032A1b", "preV2035A1b"), class = "data.frame", row.names = c(NA, -4L))
Any suggestions. KG
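For readers who want to try the numeric-order idea from this thread end to end, here is a self-contained version. It is a sketch that assumes, as the example suggests but the poster never confirmed, that the desired order is the numeric order of the digits following "preV"; the variable names num and dat_sorted are made up for illustration.

```r
# Sketch: reorder columns by the number embedded in their names.
# Assumption: every column name looks like "preV<digits>..." and numeric order is wanted.
dat <- data.frame(
  preV1001A1b = c(0.59, 0.30, 0.78, 0.43),
  preV15A1b   = c(0.57, 0.62, 0.51, 0.95),
  preV2032A1b = c(0.40, 0.80, 0.24, 0.34),
  preV2035A1b = c(0.95, 0.67, 0.81, 0.80),
  preV59A1b   = c(0.05, 0.57, 0.03, 0.50)
)

# Extract the digits after "preV" and sort on them as numbers, not as text.
num <- as.numeric(sub("^preV([0-9]+).*$", "\\1", names(dat)))
dat_sorted <- dat[order(num)]
names(dat_sorted)
```

Sorting the names as plain strings would put "preV1001A1b" before "preV15A1b", which is why the digits are converted with as.numeric() first.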
Re: [R] updating observations in lm
Look at the biglm package. It does 2 of the 3 things that you asked for: Construct an initial lm fit and add a new block of data to update that fit. It does not remove data, but you may be able to look at the code and figure out a way to modify it to do the final piece.

On Mon, May 27, 2013 at 9:12 AM, ivo welch ivo.we...@anderson.ucla.edu wrote: dear R experts---I would like to update OLS regressions with new observations on the front of the data, and delete some old observations from the rear. my goal is to have a flexible moving-window regression, with a minimum number of observations and a maximum number of observations. I can keep (X' X) and (X' y), and add or subtract observations from these two quantities myself, and then use crossprod. strucchange does recursive residuals, which is closely related, but it is not designed for such flexible movable windows, nor primarily designed to produce standard errors of coefficients. before I get started on this, I just wanted to inquire whether someone has already written such a function. regards, /iaw Ivo Welch (ivo.we...@gmail.com)

-- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com
Re: [R] updating observations in lm
The essential trick here is the Sherman-Morrison-Woodbury formula. My quantreg package has a lm.fit.recursive function that implements a Fortran version for adding observations, but, like biglm, it does not remove observations at the other end either.

Roger Koenker
rkoen...@illinois.edu

On May 27, 2013, at 2:07 PM, Greg Snow wrote:

> Look at the biglm package. It does 2 of the 3 things that you asked for: Construct an initial lm fit and add a new block of data to update that fit. It does not remove data, but you may be able to look at the code and figure out a way to modify it to do the final piece.
>
> On Mon, May 27, 2013 at 9:12 AM, ivo welch ivo.we...@anderson.ucla.edu wrote:
>
> > dear R experts---I would like to update OLS regressions with new observations on the front of the data, and delete some old observations from the rear. my goal is to have a flexible moving-window regression, with a minimum number of observations and a maximum number of observations. I can keep (X' X) and (X' y), and add or subtract observations from these two quantities myself, and then use crossprod.
> >
> > strucchange does recursive residuals, which is closely related, but it is not designed for such flexible movable windows, nor primarily designed to produce standard errors of coefficients.
> >
> > before I get started on this, I just wanted to inquire whether someone has already written such a function.
> >
> > regards,
> > /iaw
> > Ivo Welch (ivo.we...@gmail.com)
>
> --
> Gregory (Greg) L. Snow Ph.D.
> 538...@gmail.com
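For reference, a small sketch of the rank-one case of the Sherman-Morrison formula applied to (X'X)^-1. This is an illustration under stated assumptions, not the code in quantreg's lm.fit.recursive: the function name `sm_update` and the simulated data are hypothetical. With A = X'X and a single row x, (A + s x x')^-1 = A^-1 - s (A^-1 x)(A^-1 x)' / (1 + s x' A^-1 x), where s = +1 adds the row and s = -1 removes it.

```r
# Hypothetical helper: update the inverse of X'X when one row x is added
# (sign = 1) or removed (sign = -1). Assumes Ainv is symmetric.
sm_update <- function(Ainv, x, sign = 1) {
  Ax    <- Ainv %*% x                           # A^-1 x, a column vector
  denom <- 1 + sign * drop(crossprod(x, Ax))    # 1 + s * x' A^-1 x
  Ainv - sign * (Ax %*% t(Ax)) / denom
}

# Simulated design matrix (an assumption made for the example).
set.seed(2)
X <- cbind(1, rnorm(25), rnorm(25))

Ainv <- solve(crossprod(X[1:20, ]))         # window = rows 1..20
Ainv <- sm_update(Ainv, X[21, ], sign = 1)  # add row 21
Ainv <- sm_update(Ainv, X[1, ], sign = -1)  # remove row 1

direct <- solve(crossprod(X[2:21, ]))       # recomputed from scratch
max(abs(Ainv - direct))                     # should be close to zero
```

Removing a row is only safe while the denominator stays well away from zero, that is, while the row's leverage in the current window is below 1.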
Re: [R] updating observations in lm
On 27-05-2013, at 17:12, ivo welch ivo.we...@anderson.ucla.edu wrote:

> dear R experts---I would like to update OLS regressions with new observations on the front of the data, and delete some old observations from the rear. my goal is to have a flexible moving-window regression, with a minimum number of observations and a maximum number of observations. I can keep (X' X) and (X' y), and add or subtract observations from these two quantities myself, and then use crossprod.
>
> strucchange does recursive residuals, which is closely related, but it is not designed for such flexible movable windows, nor primarily designed to produce standard errors of coefficients.
>
> before I get started on this, I just wanted to inquire whether someone has already written such a function.

For regression one would use a QR decomposition. There is an open-source Fortran library, qrupdate (http://sourceforge.net/projects/qrupdate/), that can update an unpivoted QR decomposition for the case of deleting rows/columns and inserting rows/columns. It could be used to make an R package, which could then be used for doing a moving-window regression. Quite a lot of work.

Berend
Re: [R] choose the lines
Hi,

Try this:

    dat1 <- read.csv("dat7.csv", header=TRUE, stringsAsFactors=FALSE, sep="\t")
    dat.bru <- dat1[!is.na(dat1$evnmt_brutal),]

    fun1 <- function(dat){
      # work patient by patient so rows of different patients are never paired
      lst1 <- split(dat, dat$patient_id)
      # keep rows from the first evnmt_brutal==0 onwards
      lst2 <- lapply(lst1, function(x) x[cumsum(x$evnmt_brutal==0)>0,])
      # drop patients whose remaining rows are all 1 or all 0
      lst3 <- lapply(lst2, function(x) x[!(all(x$evnmt_brutal==1)|all(x$evnmt_brutal==0)),])
      lst4 <- lapply(lst3, function(x) {
        vect.brutal=c()
        for(line in which(x$evnmt_brutal==1)){
          if(x$evnmt_brutal[line-1]==0){
            vect.brutal=c(vect.brutal,line)
          }
        }
        # keep each selected row together with the row before it
        vect.brutal1 <- sort(c(vect.brutal,vect.brutal-1))
        x[vect.brutal1,]
      })
      res <- do.call(rbind,lst4)
      row.names(res) <- 1:nrow(res)
      res
    }

    head(fun1(dat.bru),10)
    #    X patient_id number responsed_at  t basdai_d evnmt_brutal
    #1  14          2     13   2011-08-07 13    0.900            0
    #2  15          2     14   2011-09-11 14   -0.800            1
    #3  22          3      2   2010-06-29  1   -0.800            0
    #4  23          3      3   2010-08-05  2    0.000            1
    #5  24          3      4   2010-09-05  3    1.200            0
    #6  25          3      5   2010-10-13  4    1.925            1
    #7  26          3      6   2010-11-15  5   -2.525            0
    #8  27          3      7   2010-12-18  6   -0.200            1
    #9  53          5      9   2011-02-13  8    0.000            0
    #10 54          5     10   2011-03-19  9   -1.200            1

A.K.

___
From: GUANGUAN LUO guanguan...@gmail.com
To: arun smartpink...@yahoo.com
Sent: Monday, May 27, 2013 8:48 AM
Subject: choose the lines

> Hello, Arun,
> in this data, I want to choose every line with the variable evnmt_brutal==1 and the preceding line (line-1) with evnmt_brutal==0. I had done this:
>
>     res.bru <- dat7[!is.na(dat7$evnmt_brutal),]
>     vect.brutal=c()
>     for(line in which(res.bru$evnmt_brutal==1)){
>       if(res.r$evnmt_brutal[line-1]==0){
>         vect.brutal=c(vect.brutal,line)}
>     }
>     vect.brutal
>
> but now I think it's not correct, because there can be situations just like this:
>
>     Patient_id  evnmt_brutal
>     1           ...
>     1           ...
>     1           0
>     2           1
>     2           ...
>     2           ...
>
> Here I would have chosen lines from two different patients, so that is not correct. Do you know how I can change it a little and get the correct lines within each patient?
> Thank you so much.
> GG
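A vectorised variant of the same idea, given only as a sketch: it is not the code from the reply above, the function name `pick_transitions` and the toy data are hypothetical, and it assumes the rows are already sorted by patient and time with no NA in evnmt_brutal. A row is selected when it has evnmt_brutal==1, the row before it has evnmt_brutal==0, and both rows belong to the same patient; the preceding row is kept as well.

```r
# Hypothetical helper: keep each 0 -> 1 transition (both rows) within a patient.
pick_transitions <- function(d) {
  n <- nrow(d)
  # TRUE when the previous row belongs to the same patient
  prev_same <- c(FALSE, d$patient_id[-1] == d$patient_id[-n])
  # TRUE when the previous row has evnmt_brutal == 0
  prev_zero <- c(FALSE, d$evnmt_brutal[-n] == 0)
  hit <- which(d$evnmt_brutal == 1 & prev_same & prev_zero)
  d[sort(unique(c(hit - 1, hit))), ]
}

# Toy data (an assumption made for the example): row 4 has evnmt_brutal == 1
# and follows a 0, but that 0 belongs to another patient, so it is not selected.
toy <- data.frame(patient_id   = c(1, 1, 1, 2, 2, 2),
                  evnmt_brutal = c(1, 0, 0, 1, 0, 1))
res <- pick_transitions(toy)
res
```

On the toy data only the last two rows of patient 2 form a 0 to 1 transition within one patient, so those are the rows returned.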
Re: [R] updating observations in lm
Gentleman's AS 274 algorithm allows weights, so adding an obs with a weight of -1 would do the trick of removing obs, too. This may be a good job for Hadley Wickham's C code interface.

On May 27, 2013 12:47 PM, Berend Hasselman b...@xs4all.nl wrote:

> On 27-05-2013, at 17:12, ivo welch ivo.we...@anderson.ucla.edu wrote:
>
> > dear R experts---I would like to update OLS regressions with new observations on the front of the data, and delete some old observations from the rear. my goal is to have a flexible moving-window regression, with a minimum number of observations and a maximum number of observations. I can keep (X' X) and (X' y), and add or subtract observations from these two quantities myself, and then use crossprod.
> >
> > strucchange does recursive residuals, which is closely related, but it is not designed for such flexible movable windows, nor primarily designed to produce standard errors of coefficients.
> >
> > before I get started on this, I just wanted to inquire whether someone has already written such a function.
>
> For regression one would use a QR decomposition. There is an opensource Fortran library qrupdate (http://sourceforge.net/projects/qrupdate/) that can update an unpivoted QR decomposition for the case of deleting rows/columns and inserting rows/columns. It could be used to make an R package, which could be used for doing a moving window regression. Quite a lot of work.
>
> Berend
Re: [R] updating observations in lm
On 27-05-2013, at 21:57, ivo welch ivo.we...@gmail.com wrote:

> Gentleman's AS 274 algorithm allows weights, so adding an obs with a weight of -1 would do the trick of removing obs, too. This may be a good job for Hadley Wickham's C code interface.

Searching for "Gentlemans as 274 algorithm" with Google turned up this:

http://jblevins.org/mirror/amiller/

where there is Fortran code for an updated AS 274 algorithm. I can't judge whether this is suitable for deleting observations.

Berend
[R] Bayes Logit and Cholesky Decomposition
I am trying to use the BayesLogit package and I keep getting this error message:

    chol2inv(chol(P1.j)) :
      error in evaluating the argument 'x' in selecting a method for function 'chol2inv':
      Error in chol.default(P1.j) : the leading minor of order 5 is not positive definite

I can't see why this would be so, because the prior variance matrix that I feed in is a diagonal matrix, so it is definitely positive definite.

Tjun Kiat
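A generic way to reproduce and inspect this kind of error, offered as a sketch only. The matrix `P` below is a made-up 2x2 example and is not the P1.j from the package; the message does not show how P1.j is built, so it is an assumption that it may differ from the prior variance matrix that was supplied. The point is that chol() fails on any symmetric matrix with a non-positive eigenvalue, and that the eigenvalues show how far the matrix is from being positive definite.

```r
# Hypothetical example: symmetric, but with one negative eigenvalue.
P <- matrix(c(1, 2,
              2, 1), nrow = 2, byrow = TRUE)

# chol() stops with an error for such a matrix; capture the message instead.
res <- tryCatch(chol(P), error = function(e) conditionMessage(e))
res

# The eigenvalues show why: all must be > 0 for positive definiteness.
ev    <- eigen(P, symmetric = TRUE, only.values = TRUE)$values
is_pd <- all(ev > 0)
ev
is_pd
```

Applying the same checks (plus `isSymmetric()` and a look for NA or infinite entries) to the matrix actually passed to chol() would show whether the problem lies in the prior or in some other quantity.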
Re: [R] Assistant
On 05/28/2013 12:22 AM, Adelabu Ahmmed wrote:

> Dear Sir/Ma,
> I am Adelabu A.A., one of the R users from Nigeria. I have a data set of claims paid and premiums for individual life-insurance policyholders, but not in triangle form. How can I run a stochastic chain ladder in R on it? Please help.

Hi Ahmmed,
This is a very specific question. You might find answers by contacting an actuarial forum, e.g.

http://www.actuary.com/actuarial-discussion-forum/
http://www.actuarialoutpost.com/actuarial_discussion_forum/

Jim
[R] Data reshaping
Hello again,

let's say I have the following data frame:

    Dat <- data.frame(c(rep(c("A", "B"), each = 4), "C", "C", "C"),
                      c(rep(1:4, 2), 1, 2, 3), 11:21)
    colnames(Dat) <- c("X1", "X2", "X3")
    Dat
       X1 X2 X3
    1   A  1 11
    2   A  2 12
    3   A  3 13
    4   A  4 14
    5   B  1 15
    6   B  2 16
    7   B  3 17
    8   B  4 18
    9   C  1 19
    10  C  2 20
    11  C  3 21

Now I want to put that data frame in the following form:

    Dat1 <- rbind(c(11,12,13,14), c(15,16,17,18), c(19,20,21, NA))
    colnames(Dat1) <- c(1,2,3,4)
    rownames(Dat1) <- c("A", "B", "C")
    Dat1
       1  2  3  4
    A 11 12 13 14
    B 15 16 17 18
    C 19 20 21 NA

Basically, 'Dat' is the melted form of 'Dat1'.

Can somebody point me to any R function for doing that?

Thanks for your help.
Re: [R] Data reshaping
    res1 <- xtabs(X3~X1+X2, data=Dat)
    res1
    #   X2
    #X1   1  2  3  4
    #  A 11 12 13 14
    #  B 15 16 17 18
    #  C 19 20 21  0     # xtabs() fills the empty C/4 cell with 0

    library(reshape2)
    dcast(Dat, X1~X2, value.var="X3")
    #  X1  1  2  3  4
    #1  A 11 12 13 14
    #2  B 15 16 17 18
    #3  C 19 20 21 NA    # dcast() leaves the empty cell as NA

A.K.

> Hello again,
>
> let say I have following data-frame:
>
>     Dat <- data.frame(c(rep(c("A", "B"), each = 4), "C", "C", "C"),
>                       c(rep(1:4, 2), 1, 2, 3), 11:21)
>     colnames(Dat) <- c("X1", "X2", "X3")
>     Dat
>        X1 X2 X3
>     1   A  1 11
>     2   A  2 12
>     3   A  3 13
>     4   A  4 14
>     5   B  1 15
>     6   B  2 16
>     7   B  3 17
>     8   B  4 18
>     9   C  1 19
>     10  C  2 20
>     11  C  3 21
>
> Now I want to put that data-frame in the following form:
>
>     Dat1 <- rbind(c(11,12,13,14), c(15,16,17,18), c(19,20,21, NA))
>     colnames(Dat1) <- c(1,2,3,4)
>     rownames(Dat1) <- c("A", "B", "C")
>     Dat1
>        1  2  3  4
>     A 11 12 13 14
>     B 15 16 17 18
>     C 19 20 21 NA
>
> Basically, 'Dat' is the melted form of 'Dat1'
>
> Can somebody point me any R function for doing that?
>
> Thanks for your help.
Re: [R] Plot histograms in a loop
Hi,

Try either:

    set.seed(28)
    stats1 <- as.data.frame(matrix(rnorm(5*1), ncol=5))

    # open one PDF device, then draw every histogram / QQ plot pair into it
    pdf(paste("test", 1, ".pdf", sep=""))
    par(mfrow=c(2,1))
    lst1 <- lapply(names(stats1), function(i) {
      hist(stats1[,i], 100, col="lightblue", main=paste0("Histogram of ", i), xlab=i)
      qqnorm(stats1[,i])
    })
    dev.off()

    #or
    pdf(paste("test1", 1, ".pdf", sep=""))
    par(mfrow=c(2,1))
    for(colName in names(stats1)){
      hist(stats1[,colName], 100, col="lightblue", xlab=colName, main=paste0("Histogram of ", colName))
      qqnorm(stats1[,colName])
    }
    dev.off()

A.K.

> I have a dataset with more than 50 columns, and I need to check the distribution of each variable. The idea was to plot histograms and qq plots for each of them and check if the distribution is normal. I tried something like this:
>
>     for(colName in names(stats)){
>       pdf(paste("test",1,".pdf",sep=""))
>       hist(stats$get(colName)) 100, col="lightblue")
>       qqnorm(stats$get(colName))
>     }
>     dev.off()
>
> but that doesn't work. It would be great if I could also manage to store all of them in one file, which is what I think this code should do...
>
> Thanks,
[R] fitting grid-based models
Hello!

I'm interested in fitting parameters (to data) in a grid-based (individual) model. If I understood well, the simecol library has the fitOdeModel function, but it is only suited to odeModels (differential equations). Alternatively, the FME package has several functions able to perform this procedure, but all the examples are for differential equation models. It is mentioned that such functions could also be used for other kinds of models. Could they thus be used for grid-based ones? If so, could they be used even if the model does not follow the simecol syntax?

Thanks for your help!

Javier

--
## Javier Rodríguez Pérez
Dep. Biología de Organismos y Sistemas
Unidad Mixta de Investigación en Biodiversidad
Universidad de Oviedo
Valentin Andrés Álvarez s/n, Oviedo 33006, Spain
Re: [R] Data reshaping
    library(reshape2)
    dcast(Dat, X1 ~ X2, value.var = "X3")
      X1  1  2  3  4
    1  A 11 12 13 14
    2  B 15 16 17 18
    3  C 19 20 21 NA

or use ?reshape

HTH

Duncan

Duncan Mackay
Department of Agronomy and Soil Science
University of New England
Armidale NSW 2351
Email: home: mac...@northnet.com.au

At 10:37 28/05/2013, you wrote:

> Hello again,
>
> let say I have following data-frame:
>
>     Dat <- data.frame(c(rep(c("A", "B"), each = 4), "C", "C", "C"),
>                       c(rep(1:4, 2), 1, 2, 3), 11:21)
>     colnames(Dat) <- c("X1", "X2", "X3")
>     Dat
>        X1 X2 X3
>     1   A  1 11
>     2   A  2 12
>     3   A  3 13
>     4   A  4 14
>     5   B  1 15
>     6   B  2 16
>     7   B  3 17
>     8   B  4 18
>     9   C  1 19
>     10  C  2 20
>     11  C  3 21
>
> Now I want to put that data-frame in the following form:
>
>     Dat1 <- rbind(c(11,12,13,14), c(15,16,17,18), c(19,20,21, NA))
>     colnames(Dat1) <- c(1,2,3,4)
>     rownames(Dat1) <- c("A", "B", "C")
>     Dat1
>        1  2  3  4
>     A 11 12 13 14
>     B 15 16 17 18
>     C 19 20 21 NA
>
> Basically, 'Dat' is the melted form of 'Dat1'
>
> Can somebody point me any R function for doing that?
>
> Thanks for your help.
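For the base R route via ?reshape, a minimal sketch using the data from the question. The column names in the result (X3.1 to X3.4) are what reshape() generates by default; the object name `wide` is just a label chosen for the example.

```r
# Data from the question, with the column names set directly.
Dat <- data.frame(X1 = c(rep(c("A", "B"), each = 4), "C", "C", "C"),
                  X2 = c(rep(1:4, 2), 1, 2, 3),
                  X3 = 11:21)

# One row per X1, one column of X3 values per level of X2.
wide <- reshape(Dat, idvar = "X1", timevar = "X2", direction = "wide")
wide
```

The missing C/4 combination comes out as NA, as in the dcast() result, and the row names are those of the first row of each X1 group in `Dat`.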
[R] p values of polr
Hi all:

As to the polr {MASS} function, how do I find the p values of every parameter?

From the example in the R help:

    house.plr <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
    summary(house.plr)

How do I find the p values of house.plr?

Many thanks.

Best.
Re: [R] p values of polr
On May 27, 2013, at 7:59 PM, meng wrote:

> Hi all:
> As to the polr {MASS} function, how to find out p values of every parameter?
>
> From the example of R help:
>
>     house.plr <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
>     summary(house.plr)
>
> How to find out the p values of house.plr?

Getting p-values from t-statistics should be fairly straightforward:

    summary(house.plr)$coefficients

--
David Winsemius, MD
Alameda, CA, USA
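One common way to carry that out, sketched under the assumption that the "t value" column can be treated as approximately standard normal in large samples (an approximation, not something polr itself reports). The names `ctab`, `p` and `ptab` are just labels for the example; Hess = TRUE is set so that summary() does not have to re-fit the model.

```r
library(MASS)

# Same model as in the question, keeping the Hessian for summary().
house.plr <- polr(Sat ~ Infl + Type + Cont, weights = Freq,
                  data = housing, Hess = TRUE)

# Coefficient table: columns "Value", "Std. Error", "t value".
ctab <- coef(summary(house.plr))

# Two-sided p-values from the normal approximation to the t values.
p    <- 2 * pnorm(abs(ctab[, "t value"]), lower.tail = FALSE)
ptab <- cbind(ctab, "p value" = p)
ptab
```

The table covers both the regression coefficients and the intercepts (cut points); p-values for the cut points are rarely of interest in themselves.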