[R] Combining two or more KML files in R
I wanted to post an exchange with Roger Bivand on combining KML files. I had hoped to combine multiple KML files into a single SpatialPolygonsDataFrame. I used readOGR to bring the files into R; had I worked with .shp files, I might have used readShapePoly with a unique IDvar, followed by spRbind. It turns out I needed to review the spChFIDs methods in the sp package. Roger said, 'they let you assign row.names to the geometries and the data slot data.frame consistently.'

Zack

-- View this message in context: http://r.789695.n4.nabble.com/Combining-two-or-more-KML-files-in-R-tp4378333p4378333.html Sent from the R help mailing list archive at Nabble.com.
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
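Roger's pointer can be sketched roughly as follows. This is a minimal illustration and not the code from the exchange: two tiny squares stand in for layers read from separate KML files, make_square and the "a_0"/"b_0" IDs are made-up names, and the sp package is assumed to be installed.

```r
library(sp)

# Hypothetical helper: a unit square as a one-feature SpatialPolygonsDataFrame.
# Every layer deliberately gets the same feature ID "0", as can happen when
# separate files are read in one at a time.
make_square <- function(x0, label) {
  coords <- cbind(c(x0, x0 + 1, x0 + 1, x0, x0), c(0, 0, 1, 1, 0))
  sp_poly <- SpatialPolygons(list(Polygons(list(Polygon(coords)), ID = "0")))
  SpatialPolygonsDataFrame(sp_poly, data.frame(name = label, row.names = "0"))
}

a <- make_square(0, "first")
b <- make_square(2, "second")

# Give each layer unique feature IDs; spChFIDs changes the geometry IDs and
# the row.names of the data slot together.
a <- spChFIDs(a, "a_0")
b <- spChFIDs(b, "b_0")

# With unique IDs the layers can be stacked into one object
combined <- rbind(a, b)
row.names(combined@data)
```

With real files, each layer would first get IDs built from something unique to its file (for example the file name) before the rbind step.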
Re: [R] Constraint on one of parameters.
Thanks for your suggestion. I did read the manual, but it seems those examples set bounds for every parameter. I have no idea how to set a bound for only one parameter (in my case, only for theta[21]). I tried adding method='L-BFGS-B', lower=c(rep(inf,20),-1), upper=c(rep(inf,20),1), but got the error: object 'inf' not found. I thought I could set the unconstrained parameters to (-inf, inf), but the manual says that with the L-BFGS-B method the estimated parameters should be finite numbers. My constraint on theta[21] is -1 < theta[21] < 1. For the unconstrained parameters, how do I free them? Thank you very much.

On Fri, Feb 10, 2012 at 1:26 AM, Rubén Roa r...@azti.es wrote:
> Read optimx's help. There are 'method', 'upper', 'lower' arguments that'll let you put bounds on pars.
> HTH
> Rubén
>
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of FU-WEN LIANG
> Sent: Thursday, 09 February 2012 23:56
> To: r-help@r-project.org
> Subject: [R] Constraint on one of parameters.
>
> Dear all, I have a function to optimize for a set of parameters and want to set a constraint on only one parameter. Here is my function. What I want to do is estimate the parameters of a bivariate normal distribution where the correlation has to be between -1 and 1. Would you please advise how to revise it?
ex=function(s,prob,theta1,theta,xa,xb,xc,xd,t,delta) {
  expo1= exp(s[3]*xa+s[4]*xb+s[5]*xc+s[6]*xd)
  expo2= exp(s[9]*xa+s[10]*xb+s[11]*xc+s[12]*xd)
  expo3= exp(s[15]*xa+s[16]*xb+s[17]*xc+s[18]*xd)
  expo4= exp(s[21]*xa+s[22]*xb+s[23]*xc+s[24]*xd)
  expo5= exp(s[27]*xa+s[28]*xb+s[29]*xc+s[30]*xd)

  nume1=prob[1]*(s[2]^-s[1]*s[1]*t^(s[1]-1)*expo1)^delta*exp(-s[2]^-s[1]*t^s[1]*expo1)*
    theta1[1]^xa*(1-theta1[1])^(1-xa)*theta1[2]^xb*(1-theta1[2])^(1-xb)*(1+theta1[11]*(xa-theta1[1])*(xb-theta1[2])/sqrt(theta1[1]*(1-theta1[1]))/sqrt(theta1[2]*(1-theta1[2])))/
    (2*pi*theta[2]*theta[4]*sqrt(1-theta[21]^2))*exp(-2*(1-theta[21]^2))^(-1)*((xc-theta[1])^2/theta[2]^2+(xd-theta[3])^2/theta[4]^2-2*theta[21]^2*(xc-theta[1])*(xd-theta[3])/(theta[2]*theta[4]))

  nume2=prob[2]*(s[8]^-s[7]*s[7]*t^(s[7]-1)*expo2)^delta*exp(-s[8]^-s[7]*t^s[7]*expo2)*
    theta1[3]^xa*(1-theta1[3])^(1-xa)*theta1[4]^xb*(1-theta1[4])^(1-xb)*(1+theta1[11]*(xa-theta1[3])*(xb-theta1[4])/sqrt(theta1[3]*(1-theta1[3]))/sqrt(theta1[4]*(1-theta1[4])))/
    (2*pi*theta[6]*theta[8]*sqrt(1-theta[21]^2))*exp(-2*(1-theta[21]^2))^(-1)*((xc-theta[5])^2/theta[6]^2+(xd-theta[7])^2/theta[8]^2-2*theta[21]^2*(xc-theta[5])*(xd-theta[7])/(theta[6]*theta[8]))

  nume3=prob[3]*(s[14]^-s[13]*s[13]*t^(s[13]-1)*expo3)^delta*exp(-s[14]^-s[13]*t^s[13]*expo3)*
    theta1[5]^xa*(1-theta1[5])^(1-xa)*theta1[6]^xb*(1-theta1[6])^(1-xb)*(1+theta1[11]*(xa-theta1[5])*(xb-theta1[6])/sqrt(theta1[5]*(1-theta1[5]))/sqrt(theta1[6]*(1-theta1[6])))/
    (2*pi*theta[10]*theta[12]*sqrt(1-theta[21]^2))*exp(-2*(1-theta[21]^2))^(-1)*((xc-theta[9])^2/theta[10]^2+(xd-theta[11])^2/theta[12]^2-2*theta[21]^2*(xc-theta[9])*(xd-theta[11])/(theta[10]*theta[12]))

  nume4=prob[4]*(s[20]^-s[19]*s[19]*t^(s[19]-1)*expo4)^delta*exp(-s[20]^-s[19]*t^s[19]*expo4)*
    theta1[7]^xa*(1-theta1[7])^(1-xa)*theta1[8]^xb*(1-theta1[8])^(1-xb)*(1+theta1[11]*(xa-theta1[7])*(xb-theta1[8])/sqrt(theta1[7]*(1-theta1[7]))/sqrt(theta1[8]*(1-theta1[8])))/
    (2*pi*theta[14]*theta[16]*sqrt(1-theta[21]^2))*exp(-2*(1-theta[21]^2))^(-1)*((xc-theta[13])^2/theta[14]^2+(xd-theta[15])^2/theta[16]^2-2*theta[21]^2*(xc-theta[13])*(xd-theta[15])/(theta[14]*theta[16]))

  nume5=prob[5]*(s[26]^-s[25]*s[25]*t^(s[25]-1)*expo5)^delta*exp(-s[26]^-s[25]*t^s[25]*expo5)*
    theta1[9]^xa*(1-theta1[9])^(1-xa)*theta1[10]^xb*(1-theta1[10])^(1-xb)*(1+theta1[11]*(xa-theta1[9])*(xb-theta1[10])/sqrt(theta1[9]*(1-theta1[9]))/sqrt(theta1[10]*(1-theta1[10])))/
    (2*pi*theta[18]*theta[20]*sqrt(1-theta[21]^2))*exp(-2*(1-theta[21]^2))^(-1)*((xc-theta[17])^2/theta[18]^2+(xd-theta[19])^2/theta[20]^2-2*theta[21]^2*(xc-theta[17])*(xd-theta[19])/(theta[18]*theta[20]))

  denom=nume1+nume2+nume3+nume4+nume5
  Ep1=nume1/denom
  Ep2=nume2/denom
  Ep3=nume3/denom
  Ep4=nume4/denom
  Ep5=nume5/denom

  elogld=
    sum(Ep1*(-log(2*pi*theta[2]*theta[4]*sqrt(1-theta[21]^2))-(2*(1-theta[21]^2))^(-1)*((xc-theta[1])^2/theta[2]^2+(xd-theta[3])^2/theta[4]^2-2*theta[21]^2*(xc-theta[1])*(xd-theta[3])/(theta[2]*theta[4])))) +
    sum(Ep2*(-log(2*pi*theta[6]*theta[8]*sqrt(1-theta[21]^2))-(2*(1-theta[21]^2))^(-1)*((xc-theta[5])^2/theta[6]^2+(xd-theta[7])^2/theta[8]^2-2*theta[21]^2*(xc-theta[5])*(xd-theta[7])/(theta[6]*theta[8])))) +
    sum(Ep3*(-log(2*pi*theta[10]*theta[12]*sqrt(1-theta[21]^2))-(2*(1-theta[21]^2))^(-1)*((xc-theta[9])^2/theta[10]^2+(xd-theta[11])^2/theta[12]^2-2*theta[21]^2*(xc-theta[9])*(xd-theta[11])/(theta[10]*theta[12])))) +
Re: [R] Schwefel Function Optimization
Vartanian, Ara <aravart at indiana.edu> writes:

> All,
>
> I am looking for an optimization library that does well on something as chaotic as the Schwefel function:
>
> schwefel <- function(x) sum(-x * sin(sqrt(abs(x))))
>
> With these guys, not much luck:
>
> optim(c(1,1), schwefel)$value
> [1] -7.890603
> optim(c(1,1), schwefel, method="SANN", control=list(maxit=1))$value
> [1] -28.02825
> optim(c(1,1), schwefel, lower=c(-500,-500), upper=c(500,500), method="L-BFGS-B")$value
> [1] -7.890603
> optim(c(1,1), schwefel, method="BFGS")$value
> [1] -7.890603
> optim(c(1,1), schwefel, method="CG")$value
> [1] -7.890603

Why is it necessary over and over again to point to the Optimization Task View? This is a question about a global optimization problem, and the task view tells you to look at packages like 'nloptr' with specialized routines, or to use one of the packages with evolutionary algorithms, such as 'DEoptim' or 'pso'.

library(DEoptim)
schwefel <- function(x) sum(-x * sin(sqrt(abs(x))))
de <- DEoptim(schwefel, lower = c(-500,-500), upper = c(500,500),
              control = list(trace = FALSE))
de$optim$bestmem
#     par1     par2
# 420.9687 420.9687
de$optim$bestval
# [1] -837.9658

> All trapped in local minima. I get the right answer when I pick a starting point that's close:
>
> optim(c(400,400), schwefel, lower=c(-500,-500), upper=c(500,500), method="L-BFGS-B")$value
> [1] -837.9658
>
> Of course I can always roll my own:
>
> r <- vector()
> for(i in 1:1000) {
>   x <- runif(2, -500,500)
>   m <- optim(x, schwefel, lower=c(-500,-500), upper=c(500,500), method="L-BFGS-B")
>   r <- rbind(r, c(m$par, m$value))
> }
>
> And this does fine. I'm just wondering if this is the right approach, or if there is some other package that wraps this kind of multi-start up so that the user doesn't have to think about it.
>
> Best,
> Ara
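The "roll my own" multi-start loop from the question can be wrapped in a small function. The sketch below uses only base R; multistart, n_starts and the seed are made-up names and values for illustration, and this is not a replacement for the dedicated global optimizers named in the reply.

```r
schwefel <- function(x) sum(-x * sin(sqrt(abs(x))))

# Hypothetical helper: run L-BFGS-B from n_starts random points inside the
# box and keep the best local optimum found.
multistart <- function(fn, lower, upper, n_starts = 500) {
  best <- NULL
  for (i in seq_len(n_starts)) {
    start <- runif(length(lower), lower, upper)
    fit <- optim(start, fn, method = "L-BFGS-B", lower = lower, upper = upper)
    if (is.null(best) || fit$value < best$value) best <- fit
  }
  best
}

set.seed(42)
best <- multistart(schwefel, lower = c(-500, -500), upper = c(500, 500))
best$par
best$value
```

Keeping only the running best avoids growing a result matrix with rbind inside the loop. The number of starts needed grows quickly with the dimension, which is one reason the evolutionary packages may be the better choice for larger problems.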
Re: [R] Find interval between numbers in list or vector
Thanks, it is exactly the function that I needed.

-- View this message in context: http://r.789695.n4.nabble.com/Find-interval-between-numbers-in-list-or-vector-tp4376115p4378473.html
Re: [R] Constraint on one of parameters.
On Fri, Feb 10, 2012 at 11:40:57PM -0600, FU-WEN LIANG wrote:
> Thanks for your suggestion. I did read the manual but it seems those examples set boundaries for every parameter. I have no idea how to set bound for only one parameter (in my case, only for theta[21]). I tried adding method='L-BFGS-B',lower=c(rep(inf,20),-1),upper=c(rep(inf,20),1), but got this error object 'inf' not found. I thought for those unconstrained

Hi. Names in R are case sensitive. Infinity is Inf, not inf.

> parameters I can set them as (-inf, inf), but the manual says using L-BFGS-B method, the estimated parameters should be finite numbers. My constraint on theta[21] is -1 < theta[21] < 1. For those unconstrained parameters, how do I free them?

If the bounds cannot be -Inf and Inf, try -1e308 and 1e308. These are close to the limits of the range of numeric values.

Hope this helps.

Petr Savicky.
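Bounding a single parameter can be sketched with base R's optim alone. The objective below is a toy stand-in for the real likelihood (target and objective are made-up names); note that the lower bounds of the free parameters are -Inf, not Inf.

```r
# Toy objective standing in for the real likelihood: the unconstrained
# optimum is at c(5, -3, 2), so the third parameter wants to leave [-1, 1].
target <- c(5, -3, 2)
objective <- function(p) sum((p - target)^2)

fit <- optim(
  par    = c(0, 0, 0),
  fn     = objective,
  method = "L-BFGS-B",
  lower  = c(-Inf, -Inf, -1),  # only the last parameter is bounded
  upper  = c( Inf,  Inf,  1)
)
fit$par
```

For the 21-parameter case the same pattern would be lower = c(rep(-Inf, 20), -1) and upper = c(rep(Inf, 20), 1). Because the likelihood divides by sqrt(1 - theta[21]^2), bounds slightly inside the interval (say -0.999 and 0.999, an arbitrary choice) may be safer than exactly -1 and 1.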
[R] how to plot a nice legend?
I'd like to plot a legend in my diagram. The diagram will be included in a TikZ LaTeX document later. I tried the legend() function, but
- it cannot find a good place by itself where the legend fits, and playing around with coordinates and scaling consumes a lot of time
- the standard settings for the text need adjustment (line spacing is quite large, and so on)

Is there an alternative to legend()? Is it possible to place the legend() outside of the plot area?

Kind regards,
-- Jonas Stein n...@jonasstein.de
Re: [R] fixed effects with clustered standard errors
Dear Giovanni,

I recalled the procedure and here is the output:

Error: cannot allocate vector of size 39.2 Mb
In addition: Warning messages:
1: In structure(list(message = as.character(message), call = call), : Reached total allocation of 12279Mb: see help(memory.size)
2: In structure(list(message = as.character(message), call = call), : Reached total allocation of 12279Mb: see help(memory.size)
3: In structure(list(message = as.character(message), call = call), : Reached total allocation of 12279Mb: see help(memory.size)
4: In structure(list(message = as.character(message), call = call), : Reached total allocation of 12279Mb: see help(memory.size)
traceback()
No traceback available

I had a similar problem with another stats package, but in a different context: when I tried to fit a fixed effects logit model. The software did not converge because, as you rightly guessed at the beginning of this thread, the number of points for some individuals is too high. Might this be the source of the error here? I recall that the median number of points in my database is quite low at 10, but I have individuals with more than 2000, 5000 or even 50 000 points! What do you think?

Many thanks
Best,

On 9 February 2012 10:31, John L caribou...@gmx.fr wrote:
> Dear Giovanni,
> Many thanks for your interesting suggestions. Your guess is indeed right, I only use the 'within' fixed effects specification. I will soon send to this list all the additional information you requested in order to understand what might cause this problem, but I would say as a first guess that the inefficiency is (probably?) due to individuals with too many datapoints: the median number of points is 10, but I have some individuals with more than 1000, 5000 or even 80 000 points overall! So basically my dataset is probably too strange, as you suggested, compared to the standard panel dataset in social sciences... To be continued... ;-)
> Many thanks again
> Best,
>
> On 8 February 2012 18:55, Millo Giovanni [via R] ml-node+s789695n4370302...@n4.nabble.com wrote:
> > Dear John, interesting. There must be a bottleneck somewhere, which possibly went unnoticed because econometricians seldom use so many data points. In fact 'plm' wasn't designed to handle only 700 Megs of data at a time; but we're happy to investigate in this direction too. E.g., I was aware of some efficiency problems if effect="twoways" but I seem to understand that you are using effect="individual"? -- which takes me to the main point.
> >
> > I understand that enclosing the data for a reproducible report, as requested by the posting guide, is awkward for such a big dataset. Yet it would be of great help if you at least produced:
> > - an output of your procedure, in order to see what goes wrong and where
> > - the output of traceback() called immediately after you got the error (idem)
> > and possibly gave it a try with lm() applied to the very same formula and data, maybe inside a system.time( ... ) statement. Otherwise the information you provide is way too scant to even make an educated guess. For example, it isn't clear whether the problem is related to plm() or to vcovHC.plm etc.
> >
> > As far as simple demeaning is concerned, you might try the following code, which really does only that. Be aware that **standard errors are biased** etc. etc.; this is not meant to be a proper function but just a computational test for your data and a quick demonstration of demeaning. 'plm()' is far more structured, for a number of reasons. Please execute it inside system.time() again.

# test function for within model, BIASED SEs !!
##
## example:
## data(Produc, package="plm")
## mod <- FEmod(log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp, index=Produc$state, data=Produc)
## summary(mod)
##
## compare with:
## library(plm)
## example(plm)

demean <- function(x, index, lambda=1, na.rm=FALSE) {
  as.vector(x - lambda*tapply(x, index, mean, na.rm=na.rm)[unclass(as.factor(index))])
}

FEmod <- function(formula, index, data=ls()) {
  ## fit a model without intercept in any case
  formula <- as.formula(paste(deparse(formula(formula)), "-1", sep=""))
  X <- model.matrix(formula, data=data)
  y <- model.response(model.frame(formula, data=data))
  ## reduce index accordingly
  names(index) <- row.names(data)
  ind <- index[which(names(index) %in% row.names(X))]
  ## within transf.
  MX <- matrix(NA, ncol=dim(X)[[2]], nrow=dim(X)[[1]])
  for(i in 1:dim(X)[[2]]) {
    MX[,i] <- demean(X[,i], index=ind, lambda=1)
  }
  My <- demean(y, index=ind, lambda=1)
  ## estimate within model
  femod <- lm(My~MX-1)
  return(femod)
}
### end test function

Best, Giovanni

### original message #
-- Message: 28 Date: Tue, 07 Feb 2012 15:35:07 +0100 From: [hidden email] To: [hidden email] Subject: [R] fixed effects with clustered standard errors Message-ID: [hidden email]
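The demeaning idea in the test function can be checked on a small simulated panel. The sketch below is an illustration under made-up data (id, alpha, x and y are simulated here, not taken from the thread): the slope from the demeaned regression should match the slope from a regression with one dummy per individual.

```r
set.seed(1)
n_id  <- 5
id    <- rep(seq_len(n_id), each = 20)
alpha <- rnorm(n_id)[id]             # individual effects
x     <- rnorm(length(id)) + alpha   # regressor correlated with the effects
y     <- 2 * x + alpha + rnorm(length(id), sd = 0.1)

# Within transformation: subtract each individual's mean
x_within <- x - ave(x, id)
y_within <- y - ave(y, id)

fit_within <- lm(y_within ~ x_within - 1)
fit_dummy  <- lm(y ~ x + factor(id))   # least-squares dummy variables

c(within  = unname(coef(fit_within)),
  dummies = unname(coef(fit_dummy)["x"]))
```

The point estimates agree, but as the message warns, the standard errors reported by lm() for the demeaned fit use the wrong degrees of freedom, so only the coefficients should be compared.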
Re: [R] how to plot a nice legend?
On 12-02-10 3:45 PM, Jonas Stein wrote:
> I'd like to plot a legend in my diagram. The diagram will be included in a TikZ LaTeX document later. I tried the legend() function, but
> - it cannot find a good place by itself where the legend fits, and playing around with coordinates and scaling consumes a lot of time
> - the standard settings for the text need adjustment (line spacing is quite large, and so on)
>
> Is there an alternative to legend()? Is it possible to place the legend() outside of the plot area?
>
> Kind regards,

There are various alternatives available; you can also write your own by modifying the standard one. Generally there are lots of possibilities for customizing within the standard one; e.g. y.intersp will affect the line spacing, and using a negative value for inset (together with xpd=NA) will allow the legend to be moved outside the plot.

Duncan Murdoch
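Duncan's hints can be tried with a few lines of base graphics. This is only a sketch: the margin width, the inset of -0.35 and the spacing of 0.8 are guesses that will probably need tuning for a given device size and label length.

```r
# Leave room in the right margin for the legend
op <- par(mar = c(5, 4, 4, 9))

plot(1:10, (1:10)^2, type = "l", col = "blue", xlab = "x", ylab = "y")
lines(1:10, 10 * (1:10), col = "red")

lg <- legend("topright",
             legend    = c("quadratic", "linear"),
             col       = c("blue", "red"),
             lty       = 1,
             inset     = c(-0.35, 0),  # negative inset pushes the box to the right
             xpd       = NA,           # allow drawing outside the plot region
             y.intersp = 0.8,          # tighter line spacing
             bty       = "n")

par(op)
```

legend() returns the position of the box it drew (here stored in lg), which can help when adjusting the inset by hand.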
[R] updating one's own package
Win7 x64, R 2.14.1

Dear list,

I would like to get into the habit of creating a package for each project. That way I can package my project-specific functions and data together and load or unload them at will. It will also allow easy navigation through all the functions I have available for each project, since I can then use ?myfunction and immediately see the details of that function (especially useful when revisiting a project after several months).

I have been successful in constructing a package using package.skeleton. My question is how to easily update the package when I write an additional function or rewrite an existing one. Do I need to build the package fully again, or is there an incremental update option somewhere? I cannot find this anywhere; maybe I am overlooking something?

Thanks,
Peter
Re: [R] best option for big 3D arrays?
On 12-02-10 9:12 AM, Djordje Bajic wrote:
> Hi all,
> I am trying to fill a 904x904x904 array, but at some point of the loop R states that the 5.5Gb sized vector is too big to allocate. I have looked at packages such as bigmemory, but I need help to decide which is the best way to store such an object. It would be perfect to store it in this cube form (for indexing and computation purposes). If that is not possible, maybe the best is to store the 904 matrices separately and read them individually when needed? I have never dealt with such a big dataset, so any help will be appreciated.
> (R+ESS, Debian 64bit, 4Gb RAM, 4core)

I'd really recommend getting more RAM, so you can have the whole thing loaded in memory. 16 Gb would be nice, but even 8Gb should make a substantial difference. It's going to be too big to store as an array since arrays have a limit of 2^31-1 entries, but you could store it as a list of matrices, e.g.

x <- vector("list", 904)
for (i in 1:904) x[[i]] <- matrix(0, 904,904)

and then refer to entry i,j,k as x[[i]][j,k].

Duncan Murdoch
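The list-of-matrices layout can be tried at a small size first. In the sketch below n = 4 is a stand-in for 904 so that it runs anywhere; the last lines are my own size arithmetic for the full problem, assuming 8 bytes per double, and are not taken from the thread.

```r
n <- 4  # small stand-in for 904

# One n x n matrix per first index, as suggested above
x <- vector("list", n)
for (i in seq_len(n)) x[[i]] <- matrix(0, n, n)

# Entry (i, j, k) lives at x[[i]][j, k]
x[[2]][3, 4] <- 7
x[[2]][3, 4]

# Size arithmetic for the full problem, with 8 bytes per double
entries <- 904^3
gib     <- entries * 8 / 2^30
c(entries = entries, GiB = gib)
```

By this arithmetic the full cube has 738,763,264 entries, which is below 2^31 - 1, so on a machine with enough RAM a single array may also be possible; either layout needs roughly the 5.5 Gb quoted in the question.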
Re: [R] Lattice 3d coordinate transformation
On Fri, Feb 10, 2012 at 12:43 AM, ilai ke...@math.montana.edu wrote:
> Hello List!
> I asked this before (with no solution), but maybe this time... I'm trying to project a surface to the XY plane under a 3d cloud using lattice. I can project contour lines following the code for fig 13.7 in Deepayan Sarkar's "Lattice: Multivariate Data Visualization with R", but it fails when I try to color them in using panel.levelplot. ?utilities.3d says there may be some bugs, and I think ltransform3dto3d() is not precise (where did I hear that?), but is this really the source of my problem? Is there a (simple?) workaround, maybe using 3d.wire but projecting it to XY? How? Please, any insight may be useful.

I don't think this will be that simple. panel.levelplot() essentially draws a bunch of colored rectangles. For a 3D projection, each of these will become a (four-sided) polygon. You need to compute the coordinates of those polygons, figure out their fill colors (possibly using ?level.colors) and then draw them.

-Deepayan

> Thanks in advance,
> Elai.
>
> A working example:

## data d and predicted surf:
set.seed(1113)
d <- data.frame(x=runif(30),y=runif(30),g=gl(2,15))
d$z <- with(d,rnorm(30,3*asin(x^2)-5*y^as.integer(g),.1))
d$z <- d$z+min(d$z)^2
surf <- by(d,d$g,function(D){
  fit <- lm(z~poly(x,2)*poly(y,2),data=D)
  outer(seq(0,1,l=10),seq(0,1,l=10),function(x,y,...)
        predict(fit,data.frame(x=x,y=y)))
})
##
# This works to get contours:
require(lattice)
cloud(z~x+y|g,data=d,layout=c(2,1), type='h', lwd=3, par.box=list(lty=0),
      scales=list(z=list(arrows=F,tck=0)),
      panel.3d.cloud = function(x, y, z,rot.mat, distance, zlim.scaled, nlevels=20,...){
        add.line <- trellis.par.get("add.line")
        clines <- contourLines(surf[[packet.number()]],nlevels = nlevels)
        for (ll in clines) {
          m <- ltransform3dto3d(rbind(ll$x-.5, ll$y-.5, zlim.scaled[1]), rot.mat, distance)
          panel.lines(m[1,], m[2,], col = add.line$col, lty = add.line$lty, lwd = add.line$lwd)
        }
        panel.3dscatter(x, y, z, rot.mat, distance, zlim.scaled = zlim.scaled, ...)
      }
)

# But using levelplot:
panel.3d.levels <- function(x, y, z,rot.mat, distance, zlim.scaled,...)
{
  zz <- surf[[packet.number()]]
  n <- nrow(zz)
  s <- seq(-.5,.5,l=n)
  m <- ltransform3dto3d(rbind(rep(s,n),rep(s,each=n),zlim.scaled[1]), rot.mat, distance)
  panel.levelplot(m[1,],m[2,],zz,1:n^2,col.regions=heat.colors(20))
  panel.3dscatter(x, y, z, rot.mat, distance, zlim.scaled = zlim.scaled, ...)
}
cloud(z~x+y|g,data=d,layout=c(2,1), type='h', panel.3d.cloud = panel.3d.levels,
      scales=list(z=list(arrows=F,tck=0)),par.box=list(lty=0),lwd=3)

# I also tried to fill between contours but can't figure out what to do with
# the edges and how to incorporate the x,y limits to 1st and nth levels.
panel.3d.contour <- function(x, y, z,rot.mat, distance,xlim,ylim, zlim.scaled,nlevels=20,...)
{
  add.line <- trellis.par.get("add.line")
  zz <- surf[[packet.number()]]
  clines <- contourLines(zz,nlevels = nlevels)
  colreg <- heat.colors(max(unlist(lapply(clines,function(ll) ll$level))))
  for (i in 2:length(clines)) {
    ll <- clines[[i]]
    ll0 <- clines[[i-1]]
    m <- ltransform3dto3d(rbind(ll$x-.5, ll$y-.5, zlim.scaled[1]), rot.mat, distance)
    m0 <- ltransform3dto3d(rbind(ll0$x-.5, ll0$y-.5, zlim.scaled[1]), rot.mat, distance)
    xvec <- c(m0[1,],m[1,ncol(m):1])
    yvec <- c(m0[2,],m[2,ncol(m):1])
    panel.polygon(xvec,yvec,col=colreg[ll$level],border='transparent')
    panel.lines(m[1,], m[2,], col = add.line$col, lty = add.line$lty, lwd = add.line$lwd)
  }
  panel.3dscatter(x, y, z, rot.mat, distance, zlim.scaled = zlim.scaled, ...)
}
cloud(z~x+y|g,data=d,layout=c(2,1), type='h', panel.3d.cloud = panel.3d.contour,
      scales=list(z=list(arrows=F,tck=0)),par.box=list(lty=0),lwd=3)
#
[R] How to solve long tick labels (axis.text.x)
Hi, all,

I am a newbie to R. I am currently trying to learn from this example:
http://learnr.wordpress.com/2009/03/17/ggplot2-barplots/

After making this drawing

c <- b + facet_grid(Region ~ .) + opts(legend.position = "none")

I want to make the axis.text.x (I don't want to confuse it with the axis labels, so I write axis.text.x, or simply tick labels) horizontal, and I think I can do that. However, how can I put the text on two rows, like "1820 -" on the first row and "30" on the second row?

I am also trying to make another graph where the axis.text.x does not follow any pattern, let's say: american handsome guys, italian ladies, smart japanese, etc. How can I wrap those tick labels in ggplot?

I have tried to follow the wrappers from
http://stackoverflow.com/questions/5574157/r-ggplot2-can-i-make-the-facet-strip-text-wrap-around/
https://stat.ethz.ch/pipermail/r-help/2005-April/069496.html
but I just failed again and again. Hope some genius can help. Thanks.

vd

-- View this message in context: http://r.789695.n4.nabble.com/How-to-solve-long-tick-labels-axis-text-x-tp4378760p4378760.html
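One way to approach the wrapping part, sketched here with base R only: insert newlines into the label strings before plotting. wrap_labels and the width of 10 are made-up choices for illustration, and this is not the wrapper from the linked posts.

```r
labels <- c("american handsome guys", "italian ladies", "smart japanese")

# Hypothetical helper: break each label into lines shorter than `width`
# characters and join the pieces with newlines.
wrap_labels <- function(x, width = 10) {
  vapply(x, function(s) paste(strwrap(s, width = width), collapse = "\n"),
         character(1), USE.NAMES = FALSE)
}

wrapped <- wrap_labels(labels)
cat(wrapped, sep = "\n---\n")

# Base graphics draws the embedded newlines as separate rows
barplot(c(3, 5, 2), names.arg = wrapped)
```

In ggplot2 the same wrapped strings could presumably be used as the levels of the x variable, since newlines in labels are drawn as line breaks there too; that part is an assumption and is not tested here.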
[R] Detect numerical series
Hello,

I am struggling with detecting successive digits in a numerical series vector. Here is an example:

vec <- c(1, 15, 26, 29, 30, 31, 37, 40, 41)

I want to be able to detect 29, 30, 31 and 40, 41. Then, I would like to delete the successive digits from the vector:

1, 15, 26, 29, 37, 40

Cheers

-- View this message in context: http://r.789695.n4.nabble.com/Detect-numerical-series-tp4379088p4379088.html
Re: [R] Detect numerical series
On Feb 11, 2012, at 10:01 AM, syrvn wrote:
> Hello,
> I am struggling with detecting successive digits in a numerical series vector. Here is an example:
>
> vec <- c(1, 15, 26, 29, 30, 31, 37, 40, 41)
>
> I want to be able to detect 29, 30, 31 and 40, 41. Then, I would like to delete the successive digits from the vector:
>
> 1, 15, 26, 29, 37, 40

vec[ c(TRUE, !diff(vec) == 1) ]
#[1]  1 15 26 29 37 40

> Cheers
> -- View this message in context: http://r.789695.n4.nabble.com/Detect-numerical-series-tp4379088p4379088.html

David Winsemius, MD
West Hartford, CT
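The one-liner above handles the deletion step. For the detection step, a rough sketch along the same lines labels each run of consecutive values with cumsum; run_id, runs, series and firsts are made-up names for illustration.

```r
vec <- c(1, 15, 26, 29, 30, 31, 37, 40, 41)

# A new run starts wherever the step from the previous element is not 1
run_id <- cumsum(c(TRUE, diff(vec) != 1))
runs   <- split(vec, run_id)

# Runs with more than one element are the consecutive series
series <- unname(runs[lengths(runs) > 1])
series

# Keeping only the first element of every run reproduces the deletion
firsts <- vec[!duplicated(run_id)]
firsts
```

Here series holds the two detected runs (29, 30, 31 and 40, 41), and firsts is the same vector that the diff-based subsetting returns.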
Re: [R] Detect numerical series
That's great code. Thanks a lot!

-- View this message in context: http://r.789695.n4.nabble.com/Detect-numerical-series-tp4379088p4379133.html
Re: [R] naiveBayes: slow predict, weird results
We don't have the data, but my guess is that you want to have some factors in your data that were integers when you tried the code below.

Uwe Ligges

On 10.02.2012 03:43, Sam Steingold wrote:
> I did this:
>
> nb <- naiveBayes(users, platform)
> pl <- predict(nb,users)
> nrow(users) == 314781
> ncol(users) == 109
>
> 1. naiveBayes() was quite fast (~20 seconds), while predict() was slow (tens of minutes). Why?
>
> 2. The predict results were completely off the mark (quite the opposite of the expected overfitting). Suffice it to show the tables:
>
> pl:
>    android blackberry       ipad     iphone         lg      linux        mac
>          3          5         11         14     312723          5         11
>     mobile      nokia    samsung    symbian    unknown    windows
>       1864         17         16        112          0          0
>
> platform:
>    android blackberry       ipad     iphone         lg      linux        mac
>      18013       1221       2647       1328          4       2936      34336
>     mobile      nokia    samsung    symbian    unknown    windows
>         18         88         39        103       2660     251388
>
> i.e., nb classified nearly everything as lg, while in the actual data lg is virtually nonexistent.
>
> 3. When I print nb, I see A-priori probabilities (which are what I expected) and Conditional probabilities, which are confusing because there are only two of them, e.g.:
>
> android    0.048464998 0.43946764
> blackberry 0.001638002 0.04045564
> ipad       0.322251606 1.84940588
> iphone     0.030873494 0.23250250
> lg         0.0         0.
> linux      0.023501362 0.34698919
> mac        0.082653774 1.22535027
> mobile     0.0         0.
> nokia      0.0         0.
> samsung    0.0         0.
> symbian    0.0         0.
> unknown    0.003759398 0.08219078
> windows    0.021158528 0.32916970
>
> The predictors are integers. Is the first column for the 0 predictors and the second for all non-0? Is there a way to ask naiveBayes to differentiate between non-0 values?
>
> thanks!
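Uwe's guess can be illustrated with a small recoding step. As far as I know, e1071's naiveBayes treats numeric predictors as Gaussian and prints a mean and a standard deviation per class, which would explain the two columns; treat that reading as an assumption. The data below are made up, and the zero/nonzero split is just one possible way to turn counts into factors.

```r
# Made-up stand-in for the real data: integer count predictors
users <- data.frame(visits = c(0L, 2L, 0L, 5L, 1L, 0L),
                    clicks = c(1L, 0L, 0L, 3L, 0L, 2L))
platform <- factor(c("mac", "windows", "windows", "ipad", "mac", "windows"))

# Recode each predictor as "zero" / "nonzero" so it is treated as categorical
users_f <- as.data.frame(lapply(users, function(col) {
  factor(ifelse(col == 0, "zero", "nonzero"), levels = c("zero", "nonzero"))
}))
str(users_f)

# With e1071 installed, the fit would then look like this (not run here):
# nb <- e1071::naiveBayes(users_f, platform, laplace = 1)
# table(predicted = predict(nb, users_f), actual = platform)
```

If distinguishing between non-zero values matters, cut() with a few count bins could replace the ifelse() step.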
Re: [R] Need to aggregate large dataset by week...
If I understand what you want, here are two possible ways to approach the problem. One uses aggregate and one uses the reshape package to melt and cast the data into the form you want. To use reshape you need to install the reshape package.

Assuming your dataset is named xx:

aggregate(xx, by=list(xx$week), mean)

library(reshape)
mm <- melt(xx, id=c("week"))
cast(mm, week ~ variable, mean)

John Kane
Kingston ON Canada

-----Original Message-----
From: revda...@gmail.com
Sent: Fri, 10 Feb 2012 04:55:44 -0800 (PST)
To: r-help@r-project.org
Subject: [R] Need to aggregate large dataset by week...

> Hi all,
> I have a large dataset with ~8600 observations that I want to compress to weekly means. There are 9 variables (columns), and I have already added a week column with 51 weeks. I have been looking at the functions aggregate, tapply, apply, etc., and I am just not savvy enough with R to figure this out on my own, though I'm sure it's fairly easy. I also have the dates (month/day/year) for all of the observations, but I figured just having a week column may be easier. If someone wanted to show me how to organize this data using a date function and aggregating by month, that would be useful too!
>
> Here's an example of the data set, with only 5 of the variables and 10 of 8600 obs.:
>
>    week rainfall windspeed  winddir temp oakdepth
> 1     1   0.2000   0.89000 245.9200 1.15     4.40
> 2     1   0.       0.84000 292.8800 1.19     5.30
> 3     1   0.2000   0.74000 258.5400 1.36     6.00
> 4     1   0.       0.93000   3.7000 1.43     4.40
> 5     1   0.2000   0.69000  37.8200 1.56     5.20
> 6     1   0.       0.8      17.2900 1.69     4.40
> 7     1   0.2000   0.7      28.7300 1.88     5.00
> 8     1   0.2000   1.12000 294.3700 1.93     6.00
> 9     1   0.       1.21000 274.9700 1.80     4.40
> 10    1   0.       1.31000 279.2400 1.86     5.80
>
> ...so after about 170 observations it changes to week 2, and so on.
>
> I've tried something like this, but it's only one variable's mean, and I would rather have rows = weeks and columns = the different variables.
>
> tapply(metdata$rainfall,metdata$week,FUN=mean)
>           1           2           3           4           5           6
> 0.080952381 0.101190476 0.379761905 0.179761905 0.0         0.295238095
>           7           8           9          10          11          12
> 0.146428571 0.015476190 0.16389     0.098809524 0.065476190 0.215476190
>
> Hope this is enough information and that I'm not just re-asking an old question. Thanks so much in advance for any help.
>
> -- View this message in context: http://r.789695.n4.nabble.com/Need-to-aggregate-large-dataset-by-week-tp4376154p4376154.html
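The aggregate approach, and the by-month variant the question also asks about, can be sketched with base R on a small made-up data set. The column names follow the example above; the dates, values and the "%m/%d/%Y" format are assumptions for illustration.

```r
set.seed(1)
xx <- data.frame(
  date      = rep(c("01/02/2011", "01/09/2011", "02/06/2011"), each = 4),
  week      = rep(1:3, each = 4),
  rainfall  = round(runif(12, 0, 0.4), 2),
  windspeed = round(runif(12, 0.5, 1.5), 2),
  temp      = round(runif(12, 1, 2), 2)
)

# Weekly means: one row per week, one column per variable
weekly <- aggregate(cbind(rainfall, windspeed, temp) ~ week, data = xx, FUN = mean)
weekly

# Monthly means from month/day/year date strings
xx$month <- format(as.Date(xx$date, format = "%m/%d/%Y"), "%Y-%m")
monthly <- aggregate(cbind(rainfall, windspeed, temp) ~ month, data = xx, FUN = mean)
monthly
```

The formula interface keeps the grouping column out of the averaged columns. Note that it drops rows with missing values by default, which may or may not be wanted for the real data.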
[R] object not found - Can not figure out why I get this error: Error in NROW(yCoordinatesOfLines) : object 'low' not found
Hi,

I have been using R for over a year now and I am a very happy user. Thank you for making this happen. This is my first question to this list.

I am trying to add some functions to quantmod that would let me draw arbitrary lines and text and make sure they are redrawn. I have created the following function:

require(quantmod)

# Add horizontal line to graph produced by quantmod::chart_Series()
add_HorizontalLine <- function(yCoordinatesOfLines, on=1, ...) {
  lenv <- new.env()
  lenv$add_horizontalline <- function(x, yCoordinatesOfLines, ...) {
    xdata <- x$Env$xdata
    xsubset <- x$Env$xsubset
    x0coords <- rep(1, NROW(yCoordinatesOfLines))
    x1coords <- rep(NROW(xdata[xsubset]), NROW(yCoordinatesOfLines))
    if ((NROW(x0coords) > 0) && (NROW(x1coords) > 0)) {
      segments(x0coords, yCoordinatesOfLines, x1coords, yCoordinatesOfLines, ...)
      #abline(h=yCoordinatesOfLines, ...)
    }
  }
  mapply(function(name, value) {assign(name,value,envir=lenv)},
         names(list(yCoordinatesOfLines=yCoordinatesOfLines,...)),
         list(yCoordinatesOfLines=yCoordinatesOfLines,...))
  exp <- parse(text=gsub("list","add_horizontalline",
               as.expression(substitute(list(x=current.chob(),
                 yCoordinatesOfLines=yCoordinatesOfLines, ...)))),
               srcfile=NULL)
  plot_object <- current.chob()
  lenv$xdata <- plot_object$Env$xdata
  #plot_object$set_frame(sign(on)*abs(on)+1L)
  plot_object$set_frame(2*on)
  plot_object$add(exp,env=c(lenv, plot_object$Env),expr=TRUE)
  plot_object
}

# Short test function that uses add_HorizontalLine
test <- function(series, low=20, high=80) {
  chart_Series(SPX, subset="2012")
  add_TA(RSI(Cl(SPX)))
  plot(add_HorizontalLine(c(low, high), on=2, col=c('green', 'red'), lwd=2))
}

# Actual test
SPX <- getSymbols("^GSPC", from="2000-01-01", auto.assign=FALSE)
dev.new()
test(SPX)

This gives me the following error:

> test(SPX)
Error in NROW(yCoordinatesOfLines) : object 'low' not found

What am I doing wrong here? Any hints are highly appreciated. The funniest thing is that this was working and I somehow broke it...

Best, Samo
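One plausible reading of the error, offered as a guess rather than a confirmed diagnosis: substitute() captures the argument as the unevaluated expression c(low, high), and that expression is later evaluated in an environment where 'low' and 'high' (local to test()) are not visible. The base-R sketch below reproduces that mechanism in isolation. The helper names make_call(), make_call_forced() and caller() are hypothetical and are not part of quantmod.

```r
# Hypothetical helpers illustrating lazy capture versus forced evaluation.
make_call <- function(y) {
  # substitute() returns the unevaluated argument expression, e.g. c(low, high)
  substitute(length(y))
}

make_call_forced <- function(y) {
  # Evaluate the argument first, then splice the resulting value into the call
  y_value <- y
  bquote(length(.(y_value)))
}

caller <- function(low = 20, high = 80) {
  list(lazy   = make_call(c(low, high)),
       forced = make_call_forced(c(low, high)))
}

calls <- caller()
print(calls$lazy)

# An environment that can see base functions but not caller()'s locals
env <- new.env(parent = baseenv())

lazy_result <- tryCatch(eval(calls$lazy, env),
                        error = function(e) conditionMessage(e))
forced_result <- eval(calls$forced, env)

print(lazy_result)    # an error message mentioning 'low'
print(forced_result)  # the length of the already-evaluated vector
```

If this is indeed the cause, evaluating yCoordinatesOfLines before building the expression (as make_call_forced() does with its argument) would be one way to avoid the lookup of 'low' at redraw time.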
Re: [R] passing an extra argument to an S3 generic
You are setting a new class (inflmlm) at the end of mlm.influence. Remove that second to last line and enjoy your new S3 method. I'm not sure, but I think it is just the new class inflmlm applied to inf in the formals of hatvalues.mlm confused the dispatch mechanism. You would think the error message will call the offending class not numeric double but that's above my pay grade... You could probably put back the inflmlm class assignment with an explicit call to UseMethod in hatvalues.mlm ? Cheers On Fri, Feb 10, 2012 at 2:35 PM, Michael Friendly frien...@yorku.ca wrote: On 2/10/2012 4:09 PM, Henrik Bengtsson wrote: So people may prefer to do the following: hatvalues.mlm- function(model, m=1, infl, ...) { if (missing(infl)) { infl- mlm.influence(model, m=m, do.coef=FALSE); } hat- infl$H m- infl$m names(hat)- if(m==1) infl$subsets else apply(infl$subsets,1, paste, collapse=',') hat } Thanks; I tried exactly that, but I still can't pass m=2 to the mlm method through the generic hatvalues(Rohwer.mod) 1 2 3 4 5 6 7 8 0.16700926 0.21845327 0.14173469 0.07314341 0.56821462 0.15432157 0.04530969 0.17661104 9 10 11 12 13 14 15 16 0.05131298 0.45161152 0.14542776 0.17050399 0.10374592 0.12649927 0.33246744 0.33183461 17 18 19 20 21 22 23 24 0.17320579 0.26353864 0.29835817 0.07880597 0.14023750 0.19380286 0.04455330 0.20641708 25 26 27 28 29 30 31 32 0.15712604 0.15333879 0.36726467 0.11189754 0.30426999 0.08655434 0.08921878 0.07320950 hatvalues(Rohwer.mod, m=2) Error in UseMethod(hatvalues) : no applicable method for 'hatvalues' applied to an object of class c('double', 'numeric') ## This works: hatvalues.mlm(Rohwer.mod, m=2) ... output snipped hatvalues function (model, ...) UseMethod(hatvalues) bytecode: 0x021339e4 environment: namespace:stats -Michael -- Michael Friendly Email: friendly AT yorku DOT ca Professor, Psychology Dept. 
York University Voice: 416 736-5115 x66249 Fax: 416 736-5814 4700 Keele Street Web: http://www.datavis.ca Toronto, ONT M3J 1P3 CANADA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Derive pattern from vector
Hello,

consider the following vector 'chars':

chars <- c("A", "B", "C", "C", "D", "E", "E", "E", "F", "F", "F")

I need to convert 'chars' into the following pattern:

1, 2, 3, 3, 4, 5, 5, 5, 6, 7, 8

Duplicates get the same number; otherwise the numbers increase. For the char 'F', however, the numbers should always increase. Is that possible in R?

I used the following code:

chars <- c('A', 'B', 'C', 'C', 'D', 'E', 'E', 'E', 'F', 'F', 'F')
chars_dup <- duplicated(chars)
cumsum(!chars_dup)
[1] 1 2 3 3 4 5 5 5 6 6 6

But I do not know how to treat 'F' in the way described above.

Regards
Re: [R] Derive pattern from vector
On Sat, Feb 11, 2012 at 09:11:12AM -0800, syrvn wrote:

Hello, consider the following vector 'chars': chars <- c("A", "B", "C", "C", "D", "E", "E", "E", "F", "F", "F") I need to convert 'chars' into the following pattern: 1, 2, 3, 3, 4, 5, 5, 5, 6, 7, 8 Duplicates get the same number; otherwise the numbers increase. For the char 'F', however, the numbers should always increase. Is that possible in R? I used the following code: chars <- c('A', 'B', 'C', 'C', 'D', 'E', 'E', 'E', 'F', 'F', 'F'); chars_dup <- duplicated(chars); cumsum(!chars_dup) gives [1] 1 2 3 3 4 5 5 5 6 6 6. But I do not know how to treat 'F' in the way described above.

Try this

non_dup <- !duplicated(chars) | chars == 'F'
cumsum(non_dup)
[1] 1 2 3 3 4 5 5 5 6 7 8

HTH.

Petr Savicky.
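A small sketch of the same idea, written so that the set of "always increasing" values is a parameter. The name always_new is an assumption introduced here for illustration. The second part notes a caveat: cumsum(!duplicated(x)) only numbers values consistently when equal values are adjacent, as they are in the posted vector.

```r
chars <- c("A", "B", "C", "C", "D", "E", "E", "E", "F", "F", "F")

# always_new is a hypothetical name: values that get a fresh number every time
always_new <- "F"

# TRUE wherever a new number should start
starts_new <- !duplicated(chars) | chars %in% always_new
ids <- cumsum(starts_new)
print(ids)

# Caveat: with non-adjacent repeats the cumsum approach reuses the most
# recent number, whereas match() against the unique values does not.
scattered <- c("A", "B", "A")
by_cumsum <- cumsum(!duplicated(scattered))
by_match  <- match(scattered, unique(scattered))
print(by_cumsum)
print(by_match)
```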
Re: [R] Derive pattern from vector
Fantastic. Thanks for that chunk of code. Works great! :)
[R] Embed R code in online database
Dear R-List!

I would like to embed R code in an online database, e.g. a Google spreadsheet, in such a way that users can add data to the database and R's calculations are updated automatically and, for example, written back to the spreadsheet. Maybe even graphs could be updated online? Is there a way to implement this?

Many thanks! J.
[R] Counting occurences of variables in a dataframe
Hi everybody,

I have a large dataframe similar to this one:

knames <- c('ab', 'aa', 'ac', 'ad', 'ab', 'ac', 'aa', 'ad', 'ae', 'af')
kdate <- as.Date(c('20111001', '20111102', '20101001', '20100315', '20101201', '20110105', '20101001', '20110504', '20110603', '20110201'), format="%Y%m%d")
kdata <- data.frame(knames, kdate)

I would like to add a new variable to the dataframe counting the occurrences of the different values in knames in their order of appearance (according to the date as indicated in kdate). The solution should be a variable with the values 2,2,1,1,1,2,1,2,1,1. I could do it with a loop, but there must be a more elegant way to do this.

Thanks! Best, Kai
Re: [R] Rpart and splitting criteria
On 10.02.2012 10:37, MYRIAM TABASSO wrote: Dear All, I have questions about the function rpart to construct a regression tree in R code. My problem is how to change the splitting criteria. In the rpart we have : parms=list(split=..) , I ask you if in this command is it possible to use an another splitting criterion to substitute the default criteria( gini or information)? No. Uwe Ligges Does someone can help me ? Thank you, Myriam Tabasso [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] updating one's own package
On 11.02.2012 13:49, Wet Bell Diver wrote: Win7 x64, R2.14.1 Dear list, I would like to get into the habbit of creating a package for each project. That way I can package my project-specific functions and data together and load or unload that at will. Also, it will allow easy navigation through all the functions I will have available for each project, since I can then use ?myfunction and immediately see the details of that function (esp. useful when revisiting a project after several months). Anyway, I have been successful in constructing a package, using the package.skeleton. My question is how to easily update the package when I write an additional function or rewrite an existing function. Do I then need to again build the package fully, or is there an incremental update option somewhere? This I can not find anywhere, maybe I am overlooking something? You can use package.skeleton with force = FALSE in order not to overwrite existing files. Anyway, I typically edit the files of the package directly, without using any helper functions. And if you add a new function, you can use prompt() to prepare a corresponding Rd file. Uwe Ligges Thanks, Peter __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to properly build model matrices
On 09.02.2012 22:39, Yang Zhang wrote: I always bump into a few (very minor) problems when building model matrices with e.g.: train = model.matrix(label~., read.csv('train.csv')) target = model.matrix(label~., read.csv('target.csv')) (1) The two may have different factor levels, yielding different matrices. I usually first rbind the data frames together to meld the factors, and then split them apart and matrixify them. You can preprocess the data and explicitly define the levels for factor variables in your data.frames. (2) The target set that I'm predicting on typically doesn't have labels. I usually manually append dummy labels to the target data frame. R cannot know labels if you do not provide any. (3) I almost always remove the Intercept from the model matrices, since it seems to always be redundant (I usually use caret). Then change your model formula to: label ~ . - 1. But note the interpretation changes and it is *not* redundant in general. Uwe Ligges None of these is a big deal at all, but I'm just curious if I'm missing something simple in how I'm doing things. Thanks. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
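The three points above can be sketched together on toy data. Everything below is illustrative: the data frames stand in for the contents of train.csv and target.csv, and the column names colour and size are assumptions. The sketch fixes the factor levels explicitly, uses a one-sided formula for the unlabeled target set, and drops the intercept with "- 1".

```r
# Toy stand-ins for train.csv and target.csv (hypothetical columns)
train <- data.frame(label  = c(1, 0, 1),
                    colour = c("red", "blue", "red"),
                    size   = c(1.5, 2.0, 0.5),
                    stringsAsFactors = FALSE)
target <- data.frame(colour = c("green", "blue"),
                     size   = c(1.0, 3.0),
                     stringsAsFactors = FALSE)

# (1) Define one level set up front so both matrices get the same columns
colour_levels <- sort(unique(c(train$colour, target$colour)))
train$colour  <- factor(train$colour,  levels = colour_levels)
target$colour <- factor(target$colour, levels = colour_levels)

# (2) The target set has no label, so use a one-sided formula there
# (3) "- 1" removes the intercept column
x_train  <- model.matrix(label ~ . - 1, data = train)
x_target <- model.matrix(~ . - 1, data = target)

print(colnames(x_train))
print(colnames(x_target))
```

Note that without an intercept the first factor is expanded into a full set of indicator columns, which is the change of interpretation mentioned in the reply.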
Re: [R] Bug with memory allocation when loading Rdata files iteratively?
On 10.02.2012 01:56, Janko Thyson wrote: Dear list, when iterating over a set of Rdata files that are loaded, analyzed and then removed from memory again, I experience a *significant* increase in an R process' memory consumption (killing the process eventually). It just seems like removing the object via |rm()| and firing |gc()| do not have any effect, so the memory consumption of each loaded R object cumulates until there's no more memory left :-/ Possibly, this is also related to XML package functionality (mainly |htmlTreeParse| and |getNodeSet|), but I also experience the described behavior when simply iteratively loading and removing Rdata files. Please provide a reproducible example. If you manage to produce one with XML only, report to its maintainer. If you manage to provide one without XML, report to R-devel. But please try with recent versions of XML and R (both unstated in your message). Uwe Ligges I've put together a little example that illustrates the memory ballooning mentioned above which you can find here: http://stackoverflow.com/questions/9220849/significant-memory-issue-in-r-when-iteratively-loading-rdata-files-killing-the Is this a bug? Any chance of working around this? Thanks a lot and best regards, Janko [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Counting occurences of variables in a dataframe
Hello Kai This looks like a fun question. Here is my solution, I'd be curious to see solutions by other people here. It can also be tweaked in various ways, and easily put into a function (actually, if you do it - please put it back online :) ) The only thing that might require some work is the rearranging of the columns. Cheers, Tal ## # Loading the functions ## # Making sure we can source code from github source( http://www.r-statistics.com/wp-content/uploads/2012/01/source_https.r.txt;) # This is based on code first discussed here: ## http://www.r-statistics.com/2012/01/printing-nested-tables-in-r-bridging-between-the-reshape-and-tables-packages/ # Reading in the function for using merge that reserves order source_https( https://raw.github.com/talgalili/R-code-snippets/master/merge.data.frame.r;) ## # Make Data knames -c('ab', 'aa', 'ac', 'ad', 'ab', 'ac', 'aa', 'ad','ae', 'af') kdate - as.Date( c('20111001', '2002', '20101001', '20100315', '20101201', '20110105', '20101001', '20110504', '20110603', '20110201'), format=%Y%m%d) kdata - data.frame (knames, kdate) kdata$kdate - as.character(kdata$kdate) ## # Calculate counts tmp - data.frame(table(kdata$kdate)) colnames(tmp)[1] - kdate tmp[,1] - as.character(tmp[,1]) # Based on this: # http://www.r-statistics.com/2012/01/merging-two-data-frame-objects-while-preserving-the-rows-order/ merge.data.frame(kdata ,tmp ,keep_order = x) ### Solution: kdate knames Freq 9 2011-10-01 ab1 10 2011-11-02 aa1 2 2010-10-01 ac2 1 2010-03-15 ad1 4 2010-12-01 ab1 5 2011-01-05 ac1 3 2010-10-01 aa2 7 2011-05-04 ad1 8 2011-06-03 ae1 6 2011-02-01 af1 Contact Details:--- Contact me: tal.gal...@gmail.com | 972-52-7275845 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English) -- On Sat, Feb 11, 2012 at 8:17 PM, Kai Mx govo...@gmail.com wrote: Hi everybody, I have a large dataframe similar to this one: knames -c('ab', 'aa', 'ac', 'ad', 'ab', 'ac', 'aa', 'ad','ae', 'af') kdate - as.Date( 
c('20111001', '2002', '20101001', '20100315', '20101201', '20110105', '20101001', '20110504', '20110603', '20110201'), format=%Y%m%d) kdata - data.frame (knames, kdate) I would like to add a new variable to the dataframe counting the occurrences of different values in knames in their order of appearance (according to the date as in indicated in kdate). The solution should be a variable with the values 2,2,1,1,1,2,1,2,1,1. I could do it with a loop, but there must be a more elegant way to this. Thanks! Best, Kai [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] debug in a loop
On 12-02-10 12:48 PM, Justin Haynes wrote: You can add if(is.na(tab[i])) browser() or if(is.na(tab[i])) break see inline You can also do this temporarily. Supposing that you used source(foo.R) to enter a function with that code in it, and you want the check on line 10, you'd enter setBreakpoint(foo.R#10, tracer=quote(if(is.na(tab[i])) browser())) Duncan Murdoch On Fri, Feb 10, 2012 at 7:22 AM, ikuzarraz...@hotmail.fr wrote: Hi, I'd like to debug in a loop (using debug() and browser() etc but not print() ). I'am looking for the first occurence of NA. For instance: tab = c(1:300) tab[250] = NA len = length(tab) for (i in 1:len){ if(i != len){ if(is.na(tab[i])) browser() tab[i] = tab[i]+tab[i+1] } } I do not want to do Browse[2] n for each step ... I'd like to declare a browser() in the loop with a condition. But how to write stop running when you encounter NA ? Thanks for your help -- View this message in context: http://r.789695.n4.nabble.com/debug-in-a-loop-tp4376563p4376563.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
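A non-interactive sketch of the conditional stop suggested above. Instead of calling browser(), it records the index and leaves the loop, so it can be run as a script. The variable first_na is a name introduced here for illustration. In an interactive session the break could be swapped for browser().

```r
tab <- c(1:300)
tab[250] <- NA
len <- length(tab)

# Without a loop, the position of the first NA can be read off directly
first_na_direct <- which(is.na(tab))[1]

first_na <- NA_integer_   # hypothetical name for the recorded index
for (i in seq_len(len - 1)) {
  if (is.na(tab[i])) {
    first_na <- i
    break               # in an interactive session: browser()
  }
  tab[i] <- tab[i] + tab[i + 1]
}

print(first_na_direct)
print(first_na)
```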
Re: [R] Counting occurences of variables in a dataframe
On Sat, Feb 11, 2012 at 07:17:54PM +0100, Kai Mx wrote:

Hi everybody, I have a large dataframe similar to this one:

knames <- c('ab', 'aa', 'ac', 'ad', 'ab', 'ac', 'aa', 'ad', 'ae', 'af')
kdate <- as.Date(c('20111001', '20111102', '20101001', '20100315', '20101201', '20110105', '20101001', '20110504', '20110603', '20110201'), format="%Y%m%d")
kdata <- data.frame(knames, kdate)

I would like to add a new variable to the dataframe counting the occurrences of the different values in knames in their order of appearance (according to the date as indicated in kdate). The solution should be a variable with the values 2,2,1,1,1,2,1,2,1,1. I could do it with a loop, but there must be a more elegant way to do this.

Hi.

Is the first 2 in the new variable due to the fact that the name is ab, and ab at row 5 has an older date? If so, then try the following

ind <- order(kdata$kdate)
f <- function(x) seq.int(along.with=x)
kdata$x <- ave(1:nrow(kdata), kdata$knames[ind], FUN=f)[order(ind)]

   knames      kdate x
1      ab 2011-10-01 2
2      aa 2011-11-02 2
3      ac 2010-10-01 1
4      ad 2010-03-15 1
5      ab 2010-12-01 1
6      ac 2011-01-05 2
7      aa 2010-10-01 1
8      ad 2011-05-04 2
9      ae 2011-06-03 1
10     af 2011-02-01 1

kdata$knames[ind] orders the names by increasing date. ave(...)[order(ind)] reorders the output of ave() to the original order.

Hope this helps.

Petr Savicky.
Re: [R] colnames documentation
On 10.02.2012 04:53, R. Michael Weylandt wrote: Consider the following in R 2.14.1 (seems to still be the case in Rdevel): x- matrix(1:9, 3) colnames(x) # NULL as expected colnames(x, do.NULL = TRUE) # NULL -- since we didn't change the default colnames(x, do.NULL = FALSE) # col1 col2 col3 This doesn't really seem to square with the documentation which reads: do.NULL: logical. Should this create names if they are ‘NULL’? The details section expounds and says: If ‘do.NULL’ is ‘FALSE’, a character vector (of length ‘NROW(x)’ or ‘NCOL(x)’) is returned in any case, prepending ‘prefix’ to simple numbers, if there are no dimnames or the corresponding component of the dimnames is ‘NULL’. But I have to admit that I don't really get it. (The interpretation of the docs; I understand the functionality) Could someone enlighten me? Given what the details section says (and the behavior of the function is), I'd expect something more like: do.NULL: logical. Is NULL an acceptable return value? If FALSE, column names derived from prefix are returned. Changed to \item{do.NULL}{logical. If \code{FALSE} and names are \code{NULL}, names are created.} Michael PS -- In my searching, I think the link to the svn on the developer page (http://developer.r-project.org/) is wrong: clicking it takes one to what appears to be the same page: am I incorrect in assuming it should link to http://svn.r-project.org/R for the current svn? I think you followed the link to the svn sources of that developer page (rather than the software R). Uwe __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
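For readers who want to see the behaviour under discussion, here is a short sketch using only base R. It shows that do.NULL = FALSE matters only when the names are NULL, and that the generated names come from the prefix argument.

```r
x <- matrix(1:9, 3)

no_names  <- colnames(x)                    # NULL: the matrix has no dimnames
generated <- colnames(x, do.NULL = FALSE)   # names built from the default prefix
custom    <- colnames(x, do.NULL = FALSE, prefix = "V")

# Once real names exist, do.NULL = FALSE returns them unchanged
colnames(x) <- c("a", "b", "c")
existing <- colnames(x, do.NULL = FALSE)

print(generated)
print(custom)
print(existing)
```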
Re: [R] Package ICSNP
On 10.02.2012 04:44, David Winsemius wrote: On Feb 9, 2012, at 9:44 PM, David Winsemius wrote: On Feb 9, 2012, at 5:37 PM, Miho Morimoto wrote: install.packages(ICSNP) --- Please select a CRAN mirror for use in this session --- Warning: unable to access index for repository http://www.stats.ox.ac.uk/pub/RWin/bin/windows/contrib/2.4 We are not even close to version 2.4. Or maybe my memory is too short. I read this as being in the future but now that I think about it further v2.4 was probably in 2004. Actually, R-2.4.0 was released in October 2006. ICSNP appeared in 2007 on CRAN. Hence we never generated a binary for an R version that was already unsupported at that time. That means the OP should really upgrade R! Maybe in another 15 years? Furthermore the directory for version 2.14 does not have the target package. The CRAN on has. What made you choose this repository over one of the standard CRAN repos? (The repository at www.stat.ox.ac.uk has a rather specialized reason for existence.) This is a standard repository under Windows (in addition to ordinary CRAN) and contacted by default. The Warning comes from the fact that the repository for the outdated R version was removed in the meantime - I guess the OP has not changed the default, hence R also looked into CRAN but did not find the package anywhere. Best, Uwe Ligges Warning in download.packages(pkgs, destdir = tmpd, available = available, : no package 'ICSNP' at the repositories Miho David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Getting codebook data into R
Hi Eric - after seeing the difficulty of inputting this kind of data into R I decided to use your method. It was rather painless using PSPP to do what I wanted - however, how do I now create an SPSS file and then use the memisc package to read it in? -- View this message in context: http://r.789695.n4.nabble.com/Getting-codebook-data-into-R-tp4374331p4379433.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] AMOVA error: 'bin' must be numeric or a factor
Hi!

I am trying to analyse my data using amova (http://www.oga-lab.net/RGM2/func.php?rd_id=pegas:amova).

My input to R is a DNA sequence file in fasta format:

dna <- read.dna("XX.fasta", format="fasta") # left other options as default
d <- dist.dna(dna, model="raw")
g <- read.table("XXX.design")

Load necessary libraries:

library(pegas)
Loading required package: adegenet
Loading required package: MASS
Loading required package: ade4

Running amova:

amova(d ~ g, nperm = 100)
Error in FUN(X[[1L]], ...) : 'bin' must be numeric or a factor

How can I solve this 'bin' problem? I think it might be a problem with the g variable. In the example they type

g <- factor(c(rep("A", 7), rep("B", 8)))

I cannot find any information about what this c(rep) does. Does anyone know of a proper manual for the amova function?

My input file for g looks like this:

sequenceA 1
sequenceB 1
sequenceC 1
. .
. .
sequenceD 5
. .
sequenceE 9
sequenceF 9

where sequenceA is in group 1, sequenceD is in group 5 and so on...

If I type is.factor(g) I get FALSE.

I have also checked that d (a matrix file) is numeric, which should be correct: is.numeric(d) gives TRUE.

Any help will be very much appreciated!

Cheers, Hanne
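A base-R sketch of the likely mismatch, under stated assumptions: read.table() returns a data frame, not a factor, so the grouping column has to be extracted and converted before it can play the role that g has in the quoted example. The file contents below are inlined as text, the column names sequence and group are made up for illustration, and pegas itself is not called. It is also assumed, not verified here, that the rows of the design file are in the same order as the sequences used to compute d.

```r
# Inlined stand-in for the two-column design file (hypothetical contents)
design_text <- "sequenceA 1
sequenceB 1
sequenceC 5
sequenceD 9"

g_df <- read.table(text = design_text, col.names = c("sequence", "group"))

df_is_factor <- is.factor(g_df)   # a data frame is not a factor

# Extract the grouping column and turn it into a factor
g <- factor(g_df$group)
print(levels(g))

# What the quoted example does: rep() repeats a value, c() concatenates,
# factor() converts the resulting character vector into group labels
example_g <- factor(c(rep("A", 7), rep("B", 8)))
print(table(example_g))
```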
Re: [R] Counting occurences of variables in a dataframe
On Feb 11, 2012, at 1:17 PM, Kai Mx wrote:

Hi everybody, I have a large dataframe similar to this one:

knames <- c('ab', 'aa', 'ac', 'ad', 'ab', 'ac', 'aa', 'ad', 'ae', 'af')
kdate <- as.Date(c('20111001', '20111102', '20101001', '20100315', '20101201', '20110105', '20101001', '20110504', '20110603', '20110201'), format="%Y%m%d")
kdata <- data.frame(knames, kdate)

ave(unclass(kdate), knames, FUN=order)
[1] 2 2 1 1 1 2 1 2 1 1

That was actually not using the dataframe values but you could also do this:

kdata$ord <- with(kdata, ave(unclass(kdate), knames, FUN=order))
kdata
   knames      kdate ord
1      ab 2011-10-01   2
2      aa 2011-11-02   2
3      ac 2010-10-01   1
4      ad 2010-03-15   1
5      ab 2010-12-01   1
6      ac 2011-01-05   2
7      aa 2010-10-01   1
8      ad 2011-05-04   2
9      ae 2011-06-03   1
10     af 2011-02-01   1

I would like to add a new variable to the dataframe counting the occurrences of the different values in knames in their order of appearance (according to the date as indicated in kdate). The solution should be a variable with the values 2,2,1,1,1,2,1,2,1,1. I could do it with a loop, but there must be a more elegant way to do this. Thanks! Best, Kai

David Winsemius, MD
West Hartford, CT
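A variation on the ave() approach, offered as a sketch rather than a correction of the reply: it swaps order() for rank(). In the posted data no name occurs more than twice, and for groups of one or two elements order() and rank() give the same result. With three or more dates per name they can differ, because order() returns the positions that would sort the vector while rank() returns each element's place in the sorted sequence. The second of the ten dates is taken as '20111102', matching the 2011-11-02 shown in the printed output.

```r
knames <- c("ab", "aa", "ac", "ad", "ab", "ac", "aa", "ad", "ae", "af")
kdate <- as.Date(c("20111001", "20111102", "20101001", "20100315",
                   "20101201", "20110105", "20101001", "20110504",
                   "20110603", "20110201"), format = "%Y%m%d")
kdata <- data.frame(knames, kdate)

# Within each name, number the rows by increasing date
kdata$ord <- ave(as.numeric(kdata$kdate), kdata$knames, FUN = rank)
print(kdata)

# Why rank() rather than order() once a group has three members
d <- c(30, 10, 20)
d_order <- order(d)   # positions that sort d
d_rank  <- rank(d)    # place of each element in sorted order
print(d_order)
print(d_rank)
```

If dates can repeat within a name, rank(x, ties.method = "first") would keep the result as distinct whole numbers.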
[R] obtaining a true/false vector with combination of strsplit, length, unlist,
Hi,

A pared down version of the dataset I'm working with:

edm <- read.table(textConnection("WELLID X_GRID Y_GRID LAYER ROW COLUMN SPECIES CALCULATED OBSERVED
w301_3 4428. 1389 2 6 18 1 3558 6490.
w304_12 4836. 6627 2 27 20 1 3509 3228.
02_10_12080 3.6125E+04 13875 1 56 145 1 2774 -999.0
02_10_12080 3.6125E+04 13875 1 56 145 1 2774 -999.0
02_10_12081 3.6375E+04 13875 1 56 146 1 3493 -999.0
02_10_12092 3.9125E+04 13875 1 56 157 1 4736 -999.0
w305_12 2962. 7326 2 30 12 1 4575 5899."), header=T)
closeAllConnections()

I'm having a hard time coming up with the R code that would produce a TRUE/FALSE vector based on whether the first column of the data.frame edm splits into 2 or 3 pieces. To show what I mean going row-by-row, I could do the following:

length(strsplit(as.character(edm$WELLID),"_")[[1]])==3
[1] FALSE
length(strsplit(as.character(edm$WELLID),"_")[[2]])==3
[1] FALSE
length(strsplit(as.character(edm$WELLID),"_")[[3]])==3
[1] TRUE
length(strsplit(as.character(edm$WELLID),"_")[[4]])==3
[1] TRUE
length(strsplit(as.character(edm$WELLID),"_")[[5]])==3
[1] TRUE
length(strsplit(as.character(edm$WELLID),"_")[[6]])==3
[1] TRUE
length(strsplit(as.character(edm$WELLID),"_")[[7]])==3
[1] FALSE

I've fumbled around trying to come up with a line of R code that would create a vector that looks like:

FALSE FALSE TRUE TRUE TRUE TRUE FALSE

The final goal is to use this vector to create two new data.frames. The first would contain all the rows of edm in which the first column has a length of 2 when split using a "_" character. The second would contain all the rows in which the first column has a length of 3 when split using a "_" character.

Thanks, Eric
Re: [R] obtaining a true/false vector with combination of strsplit, length, unlist,
You are so very close:

sapply(edm[,1], function(x) length(strsplit(as.character(x), "_")[[1]]) == 3)
[1] FALSE FALSE TRUE TRUE TRUE TRUE FALSE

Thanks for providing a small reproducible example. dput() tends to work better for that than textConnection(), because many email clients add arbitrary newlines, messing up the text formatting.

Sarah

On Sat, Feb 11, 2012 at 4:51 PM, emorway emor...@usgs.gov wrote:

edm <- read.table(textConnection("WELLID X_GRID Y_GRID LAYER ROW COLUMN SPECIES CALCULATED OBSERVED
w301_3 4428. 1389 2 6 18 1 3558 6490.
w304_12 4836. 6627 2 27 20 1 3509 3228.
02_10_12080 3.6125E+04 13875 1 56 145 1 2774 -999.0
02_10_12080 3.6125E+04 13875 1 56 145 1 2774 -999.0
02_10_12081 3.6375E+04 13875 1 56 146 1 3493 -999.0
02_10_12092 3.9125E+04 13875 1 56 157 1 4736 -999.0
w305_12 2962. 7326 2 30 12 1 4575 5899."), header=T)
closeAllConnections()

--
Sarah Goslee
http://www.functionaldiversity.org
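A sketch that carries the one-liner through to the stated goal of two data frames. It splits the whole column once and counts the pieces per row. The data frame below keeps only the WELLID and LAYER columns of the posted example, and the names n_parts, edm_two and edm_three are introduced here for illustration.

```r
# Reduced version of the posted data: only two of the original columns
edm <- data.frame(
  WELLID = c("w301_3", "w304_12", "02_10_12080", "02_10_12080",
             "02_10_12081", "02_10_12092", "w305_12"),
  LAYER  = c(2, 2, 1, 1, 1, 1, 2),
  stringsAsFactors = FALSE
)

# Split every ID on "_" in one call, then count the pieces of each
pieces  <- strsplit(as.character(edm$WELLID), "_", fixed = TRUE)
n_parts <- sapply(pieces, length)

three_parts <- n_parts == 3
print(three_parts)

# Subset the rows by the number of pieces
edm_two   <- edm[n_parts == 2, ]
edm_three <- edm[n_parts == 3, ]
print(edm_two)
print(edm_three)
```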
Re: [R] obtaining a true/false vector with combination of strsplit, length, unlist,
It sounds like the problem boils down to counting the number of _s in the WELLID variable, and seeing if there are two: nchar(gsub('[^_]','',edm$WELLID)) == 2 [1] FALSE FALSE TRUE TRUE TRUE TRUE FALSE - Phil Spector Statistical Computing Facility Department of Statistics UC Berkeley spec...@stat.berkeley.edu On Sat, 11 Feb 2012, emorway wrote: Hi, A pared down version of the dataset I'm working with: edm-read.table(textConnection(WELLIDX_GRID Y_GRID LAYER ROW COLUMN SPECIES CALCULATED OBSERVED w301_3 4428. 1389 2 6 18 1 3558 6490. w304_12 4836. 6627 2 27 20 1 3509 3228. 02_10_120803.6125E+04 13875 1 56145 1 2774 -999.0 02_10_120803.6125E+04 13875 1 56145 1 2774 -999.0 02_10_120813.6375E+04 13875 1 56146 1 3493 -999.0 02_10_120923.9125E+04 13875 1 56157 1 4736 -999.0 w305_12 2962. 7326 2 30 12 1 4575 5899.),header=T) closeAllConnections() I'm having a hard time coming up with the R code that would produce a TRUE/FALSE vector based on whether or not the first column of the data.frame edm has a length of 2 or 3? To show what I mean going row-by-row, I could do the following: length(strsplit(as.character(edm$WELLID),_)[[1]])==3 [1] FALSE length(strsplit(as.character(edm$WELLID),_)[[2]])==3 [1] FALSE length(strsplit(as.character(edm$WELLID),_)[[3]])==3 [1] TRUE length(strsplit(as.character(edm$WELLID),_)[[4]])==3 [1] TRUE length(strsplit(as.character(edm$WELLID),_)[[5]])==3 [1] TRUE length(strsplit(as.character(edm$WELLID),_)[[6]])==3 [1] TRUE length(strsplit(as.character(edm$WELLID),_)[[7]])==3 [1] FALSE I've fumbled around trying to come up with a line of R code that would create a vector that looks like: FALSE FALSE TRUE TRUE TRUE TRUE FALSE The final goal is to use this vector to create two new data.frames, where, for example, the first contains all the rows of edm in which the first column has a length of 2 when split using a _ character. The second data.frame would contain all the rows in which the first column has a length of 3 when split using a _ character. 
Thanks, Eric -- View this message in context: http://r.789695.n4.nabble.com/obtaining-a-true-false-vector-with-combination-of-strsplit-length-unlist-tp4380050p4380050.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
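For readers who want the whole workflow in one place, here is a minimal sketch of the two equivalent tests discussed in this thread (counting underscores, and counting the pieces returned by strsplit), followed by the split into two data frames. Only the WELLID column is reproduced; the object names is3, edm2 and edm3 are made up for illustration.

```r
# Only the ID column matters for the test, so the other columns are omitted.
edm <- data.frame(
  WELLID = c("w301_3", "w304_12", "02_10_12080", "02_10_12080",
             "02_10_12081", "02_10_12092", "w305_12"),
  stringsAsFactors = FALSE
)

# Approach 1: count the pieces produced by splitting on "_".
n_parts <- sapply(strsplit(edm$WELLID, "_", fixed = TRUE), length)
is3 <- n_parts == 3

# Approach 2: count the underscores themselves (two underscores = three pieces).
is3_alt <- nchar(gsub("[^_]", "", edm$WELLID)) == 2

# Use the logical vector to build the two data frames.
edm2 <- edm[!is3, , drop = FALSE]  # IDs with 2 pieces
edm3 <- edm[is3, , drop = FALSE]   # IDs with 3 pieces
```

Both approaches are vectorised, so no row-by-row indexing with [[i]] is needed.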
Re: [R] colnames documentation
Thanks Uwe.

Michael

2012/2/11 Uwe Ligges lig...@statistik.tu-dortmund.de:

On 10.02.2012 04:53, R. Michael Weylandt wrote:

Consider the following in R 2.14.1 (seems to still be the case in Rdevel):

x <- matrix(1:9, 3)
colnames(x) # NULL as expected
colnames(x, do.NULL = TRUE) # NULL -- since we didn't change the default
colnames(x, do.NULL = FALSE) # col1 col2 col3

This doesn't really seem to square with the documentation, which reads:

do.NULL: logical. Should this create names if they are ‘NULL’?

The details section expounds and says:

If ‘do.NULL’ is ‘FALSE’, a character vector (of length ‘NROW(x)’ or ‘NCOL(x)’) is returned in any case, prepending ‘prefix’ to simple numbers, if there are no dimnames or the corresponding component of the dimnames is ‘NULL’.

But I have to admit that I don't really get it. (The interpretation of the docs; I understand the functionality.) Could someone enlighten me? Given what the details section says (and what the behavior of the function is), I'd expect something more like:

do.NULL: logical. Is NULL an acceptable return value? If FALSE, column names derived from prefix are returned.

Changed to

\item{do.NULL}{logical. If \code{FALSE} and names are \code{NULL}, names are created.}

Michael

PS -- In my searching, I think the link to the svn on the developer page (http://developer.r-project.org/) is wrong: clicking it takes one to what appears to be the same page. Am I incorrect in assuming it should link to http://svn.r-project.org/R for the current svn?

I think you followed the link to the svn sources of that developer page (rather than the software R).

Uwe
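The behaviour under discussion can be checked directly. This is a small sketch of what colnames() returns for a matrix without dimnames; the prefix "V" is just an illustrative choice.

```r
x <- matrix(1:9, 3)

# With the default do.NULL = TRUE, a missing set of names is reported as NULL.
no_names <- colnames(x)

# With do.NULL = FALSE, names are created from 'prefix' plus the column index.
made_names <- colnames(x, do.NULL = FALSE)
made_names_v <- colnames(x, do.NULL = FALSE, prefix = "V")

# Note that x itself is not modified; only the return value changes.
still_null <- is.null(colnames(x))
```

The created names are only returned, never stored, which is consistent with the revised wording "If FALSE and names are NULL, names are created."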
[R] How to see a R function's code
I was wondering how I can actually see what's inside a function, say, the density of the t distribution, dt()? I know that for some functions I can type the function name inside R and the code will be displayed. But for dt(), I get

> dt
function (x, df, ncp, log = FALSE)
{
    if (missing(ncp))
        .Internal(dt(x, df, log))
    else .Internal(dnt(x, df, ncp, log))
}
<environment: namespace:stats>

I am curious because I am doing rejection sampling and want to find a bigger distribution.
Re: [R] Getting codebook data into R
-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of barny Sent: Saturday, February 11, 2012 10:04 AM To: r-help@r-project.org Subject: Re: [R] Getting codebook data into R Hi Eric - after seeing the difficulty of inputting this kind of data into R I decided to use your method. It was rather painless using PSPP to do what I wanted - however, how do I now create an SPSS file and then use the memisc package to read it in? There is SPSS code for reading the files on the codebook page http://www.cdc.gov/nchs/nsfg/nsfg_2006_2010_puf.htm#codebooks hope this is helpful, Dan Daniel Nordlund Bothell, WA USA
Re: [R] How to see a R function's code
Hi,

Section 2 of the R Internals manual gives you some information. Assuming you have the source code, path_to_R/src/main/names.c holds the lookup table. I am pretty sure that dt is one of the do_math* group (maybe math2??), so arithmetic.c may be useful. These are all text files, so you can search in the source, but as these are pretty low-level functions, I would expect it to take some time and effort to find and understand the code you want. Someone else on the list may know an easier way, or know straight where to go for your particular purpose.

Cheers, Josh

On Sat, Feb 11, 2012 at 2:19 PM, Colstat cols...@gmail.com wrote:

I was wondering how do I actually see what's inside a function, say, density of t distribution, dt()? I know for some, I can type the function name inside R and the code will be displayed. But for dt(), I get

> dt
function (x, df, ncp, log = FALSE)
{
    if (missing(ncp))
        .Internal(dt(x, df, log))
    else .Internal(dnt(x, df, ncp, log))
}
<environment: namespace:stats>

I am curious because I am doing rejection sampling and want to find a bigger distribution.

-- Joshua Wiley Ph.D. Student, Health Psychology Programmer Analyst II, Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/
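As a rough sketch of the distinction drawn above: the R-level wrapper can be inspected from the prompt, but the numerical work happens in compiled code, so only the source tree shows it (as far as I know, the t density lives in src/nmath/dt.c). For rejection sampling the C code is often not needed, because the central t density has a standard closed form; the helper dt_manual below is a hypothetical name that compares that formula against dt().

```r
# Inspect the R-level wrapper; its body only dispatches to compiled code.
f <- stats::dt
body_txt <- paste(deparse(body(f)), collapse = " ")
uses_compiled <- grepl("\\.Internal|\\.Call|\\.External", body_txt)

# Closed-form density of the central t distribution with 'df' degrees of
# freedom, written out by hand (lgamma is used for numerical stability).
dt_manual <- function(x, df) {
  exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi) *
    (1 + x^2 / df)^(-(df + 1) / 2)
}

x <- seq(-4, 4, by = 0.5)
max_diff <- max(abs(dt_manual(x, df = 5) - dt(x, df = 5)))
```

If max_diff is negligible, the hand-written formula can stand in for dt() when reasoning about an envelope density.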
Re: [R] multiple histograms from a dataframe
On 11 February 2012 at 02:33, David Winsemius dwinsem...@comcast.net wrote:

On Feb 10, 2012, at 7:05 PM, Adel ESSAFI wrote:

Hi list, I need some help drawing some histograms. I have a dataframe with, say, columns X Y Z T. I want to draw a histogram Z-T for each value of the pair (X, Y). When I use this syntax

library(lattice)
histogram(law[,3] ~ law[,66] | law[,1] )

Perhaps (but untested in the absence of data):

histogram( Z ~ T | interaction(X, Y) , data=dfrmname )

Thanks, that helped a lot. Now I have another problem: I want to draw several (two) figures together. The par(new=T) directive does not recognize the plot produced by the lattice library. When I tried:

xyplot(law[,66] ~ law[,3]| interaction(law[,1],law[,2]),type='l')
par(new=T)
Warning message: In par(new = T) : calling par(new=TRUE) with no plot
xyplot(law[,67] ~ law[,3]| interaction(law[,1],law[,2]),type='l')

the second xyplot() draws a new figure. What can I do to draw two figures together using lattice? Thanks

it draws multiple histograms but by selecting distinct values of law[,1]. The deal is to make the same thing but for a couple of columns. Thanks in advance for help, Adel

-- David Winsemius, MD West Hartford, CT

-- PhD candidate in Computer Science Address 3 avenue lamine, cité ezzahra, Sousse 4000 Tunisia tel: +216 97 246 706 (+33640302046 until 15/6) fax: +216 71 391 166
Re: [R] Detect numerical series
This was actually answered a couple of times on StackOverflow. Someone and I independently wrote up the following function, stolen directly from the source for rle().

# extended version of rle to find all sorts of sequences
# if incr=0, this is rle
seqle <- function (x, incr=1) {
    if (!is.vector(x) && !is.list(x))
        stop("'x' must be an atomic vector")
    n <- length(x)
    if (n == 0L)
        return(structure(list(lengths = integer(), values = x), class = "rle"))
    y <- x[-1L] != x[-n] + incr
    i <- c(which(y | is.na(y)), n)
    structure(list(lengths = diff(c(0L, i)), values = x[i]), class = "rle")
}

quote From: David Winsemius dwinsemius_at_comcast.net Date: Sat, 11 Feb 2012 10:08:17 -0500

On Feb 11, 2012, at 10:01 AM, syrvn wrote:

Hello, I am struggling with detecting successive digits in a numerical series vector. Here is an example:

vec <- c(1, 15, 26, 29, 30, 31, 37, 40, 41)

I want to be able to detect 29, 30, 31 and 40, 41. Then, I would like to delete the successive digits from the vector: 1, 15, 26, 29, 37, 40

vec[ c(TRUE, !diff(vec) == 1) ]
#[1] 1 15 26 29 37 40

-- Sent from my Cray XK6 Quidvis recte factum, quamvis humile, praeclarum.
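To make the relationship between the two answers concrete, here is a small sketch that applies seqle() to the example vector and recovers the same result as the diff() one-liner. Note that this version of seqle() stores the last value of each run in values, so the run starts are computed from values and lengths; the names runs, starts and via_diff are illustrative.

```r
# seqle() as posted in this thread: run-length encoding of arithmetic runs.
seqle <- function(x, incr = 1) {
  if (!is.vector(x) && !is.list(x))
    stop("'x' must be an atomic vector")
  n <- length(x)
  if (n == 0L)
    return(structure(list(lengths = integer(), values = x), class = "rle"))
  y <- x[-1L] != x[-n] + incr
  i <- c(which(y | is.na(y)), n)
  structure(list(lengths = diff(c(0L, i)), values = x[i]), class = "rle")
}

vec <- c(1, 15, 26, 29, 30, 31, 37, 40, 41)
runs <- seqle(vec)

# 'values' holds the LAST element of each run; step back to the first one.
starts <- runs$values - (runs$lengths - 1)

# The one-liner from the quoted reply keeps the first element of each run.
via_diff <- vec[c(TRUE, diff(vec) != 1)]
```

Runs longer than one element (here 29:31 and 40:41) can be picked out with runs$lengths > 1.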
[R] Using igraph: community membership of components built by decompose.graph()
Hi everyone! I would appreciate help with using decompose.graph(), the community detection functions from igraph, and lapply(). I have an igraph object G with vertex attribute label and edge attribute weight. I want to calculate community memberships using different functions from igraph; for simplicity let it be walktrap.community. This graph is not connected, which is why I decided to decompose it into connected components, run walktrap.community on each component, and afterwards add a community membership vertex attribute to the original graph G. Currently I am doing the following:

comps <- decompose.graph(G,min.vertices=2)
communities <- lapply(comps,walktrap.community)

At this point I get stuck, since I get a list object whose structure I cannot figure out. The documentation on decompose.graph says only that it returns a list object, and when I use lapply on the result I get completely confused by the output. Moreover, the communities are numbered from 0 in each component, and I don't know how to supply the weights parameter to the walktrap.community function. If it were not for the components, I would have done the following:

wt <- walktrap.community(G, modularity=TRUE, weights=E(G)$weight)
wmemb <- community.to.membership(G, wt$merges, steps=which.max(wt$modularity)-1)
V(G)$walktrap <- wmemb$membership

Could anyone please help me solve this issue? Or provide some information/links which could help? Thanks and best wishes, Natalia
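One possible way to approach the question above, sketched under assumptions: it uses the function names of more recent igraph releases (decompose, cluster_walktrap, membership) rather than the 2012 names in the post, it relies on a vertex name attribute to map component vertices back to G, and the toy graph and all variable names are invented for illustration. In recent releases membership ids start at 1 within each component, so an offset keeps ids from different components distinct.

```r
library(igraph)

# Toy weighted graph with two connected components (a-b-c and d-e-f).
edges <- data.frame(
  from   = c("a", "b", "c", "d", "e"),
  to     = c("b", "c", "a", "e", "f"),
  weight = c(1, 2, 1, 3, 1)
)
G <- graph_from_data_frame(edges, directed = FALSE)

# Each list element is itself an igraph object that keeps its attributes.
comps <- decompose(G, min.vertices = 2)

memb_all <- rep(NA_integer_, vcount(G))
names(memb_all) <- V(G)$name
offset <- 0L
for (g in comps) {
  # Weights are taken from the component's own edge attribute.
  wt <- cluster_walktrap(g, weights = E(g)$weight)
  memb <- as.integer(membership(wt))
  # Map the component's vertices back to G by name, shifting the ids.
  memb_all[V(g)$name] <- memb + offset
  offset <- offset + max(memb)
}
V(G)$walktrap <- unname(memb_all)
```

Vertices in components smaller than min.vertices are left as NA, which makes them easy to spot afterwards.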
Re: [R] Install the rugarch-package
I have a problem installing rugarch, too. I use R 2.14.1 on Mac OS X 10.7.3. When I tried to load rugarch, I got the following error message:

Loading required package: Rcpp
Loading required package: RcppArmadillo
Loading required package: numDeriv
Loading required package: chron
Loading required package: Rsolnp
Loading required package: truncnorm
Rsolnp (version 1.11) initialized.
Package rugarch (1.0-7) loaded. To cite, see citation("rugarch")
Error in as.environment(pos) : no item called "newtable" on the search list
In addition: Warning message: In objects(newtable, all.names = TRUE) : ‘newtable’ converted to character string
Error: package/namespace load failed for ‘rugarch’

I have tried removing all related packages and reinstalling them, but the error still exists. I would appreciate it if someone could help me resolve this issue. -- View this message in context: http://r.789695.n4.nabble.com/Install-the-rugarch-package-tp3911903p4380077.html Sent from the R help mailing list archive at Nabble.com.
[R] Get identical results for parallel and sequential?
Hi All, I have a question about R parallel computing by using snowfall. How can I set the seeds on parallel workers to get the same result as sequential mode? For example: sfSapply(c(1,1),rnorm) [1] 1.823082 -2.222052 rnorm(2) [1] -0.5179967 -1.0807196 How to get the identical result? Thanks. Libo -- View this message in context: http://r.789695.n4.nabble.com/Get-identical-results-for-parallel-and-sequential-tp4380110p4380110.html Sent from the R help mailing list archive at Nabble.com.
[R] R Parallel question
Hi All, I have a question about R parallel computing by using snowfall. How can I set the seeds on parallel workers to get the same result as sequential mode? For example: sfSapply(c(1,1),rnorm) [1] 1.823082 -2.222052 rnorm(2) [1] -0.5179967 -1.0807196 How to get the identical result? Thanks. Libo Sun Graduate Student, Department of Statistics, Colorado State University Fort Collins, CO -- View this message in context: http://r.789695.n4.nabble.com/R-Parallel-question-tp4380098p4380098.html Sent from the R help mailing list archive at Nabble.com.
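One way to get identical numbers in both modes, sketched with the base parallel package instead of snowfall (a substitution on my part, since parallel ships with R; sfSapply should behave analogously, but that is an assumption). The idea is to tie the seed to the task rather than to the worker, so the result no longer depends on which process runs which task. The function name draw_one and the seed offset 1000 are arbitrary.

```r
library(parallel)

# Each task sets its own seed, so the draw depends only on the task index.
draw_one <- function(i) {
  set.seed(1000 + i)
  rnorm(1)
}

# Sequential run (note: this also resets the RNG state of the main session).
seq_res <- sapply(1:2, draw_one)

# Parallel run on two local worker processes.
cl <- makeCluster(2)
par_res <- parSapply(cl, 1:2, draw_one)
stopCluster(cl)

same <- identical(seq_res, par_res)
```

If only reproducibility across parallel runs is needed (not equality with the sequential run), clusterSetRNGStream(cl, iseed) from the same package is the more usual tool.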
[R] Reading data from a worksheet on the Internet
Dear R-users, I have to read data from a worksheet that is available on the Internet. I have been doing this by copying the worksheet from the browser, but I would like to be able to copy the data automatically using the url command. When I use the url command, however, the result is the source code, that is, HTML. I see that the data I need is in the source code, but before thinking about reading the data from the HTML I wonder if there is a package or another way to extract these data, since reading from the code will demand a lot of work and may not be very accurate. Below is the page from which I am trying to export the data:

dados <- url("http://www.mar.mil.br/dhn/chm/meteo/prev/dados/pnboia/sc1201_arquivos/sheet002.htm", "r")

I look forward to any help. Thanks in advance, Nilza Barros
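As a rough illustration of what extracting a table from HTML source involves, here is a base-R sketch that works on a tiny inline HTML string; the table contents are invented, and the regular expressions assume simple, well-formed tr/td markup on a single line. For a real page, the text would first be read with something like readLines() on the URL, and a proper HTML parser (for example readHTMLTable() in the XML package) is likely to be far more robust than this.

```r
# Invented stand-in for the page source; a real page would be read from the URL.
html <- paste0(
  "<table>",
  "<tr><td>Date</td><td>Temp</td></tr>",
  "<tr><td>2012-02-10</td><td>24.1</td></tr>",
  "<tr><td>2012-02-11</td><td>23.8</td></tr>",
  "</table>"
)

# Pull out each table row, then each cell within the row, and strip the tags.
rows <- regmatches(html, gregexpr("<tr[^>]*>.*?</tr>", html, perl = TRUE))[[1]]
cells <- lapply(rows, function(r) {
  td <- regmatches(r, gregexpr("<td[^>]*>.*?</td>", r, perl = TRUE))[[1]]
  gsub("<[^>]+>", "", td)
})

# First row is the header; the remaining rows become the data.
tab <- as.data.frame(do.call(rbind, cells[-1]), stringsAsFactors = FALSE)
names(tab) <- cells[[1]]
tab$Temp <- as.numeric(tab$Temp)
```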
Re: [R] Lattice 3d coordinate transformation
Thank you Deepayan, your answer put me on the path to SOLVED !!! Actually passing projected corners to panel.rect was the first thing I tried, but couldn't get it to work. However, panel.3dpolygon in latticeExtra did the trick. I'm posting it here for completion. require(lattice) ; require(latticeExtra) set.seed(1113) d - data.frame(x=runif(30),y=runif(30),g=gl(2,15)) d$z - with(d,rnorm(30,3*asin(x^2)-5*y^as.integer(g),.1)) d$z - d$z+min(d$z)^2 surf - by(d,d$g,function(D){ fit - lm(z~poly(x,2)*poly(y,2),data=D) outer(seq(0,1,l=10),seq(0,1,l=10),function(x,y,...) predict(fit,data.frame(x=x,y=y))) }) panel.3d.surf - function(x, y, z, rot.mat, distance, zlim.scaled, ...){ zz - surf[[packet.number()]] ; n - nrow(zz) lp - level.colors(zz, at = do.breaks(range(zz), 20), col.regions = heat.colors(20)) s - seq(-.5,.5,l=n) ; cntrds - expand.grid(s,s) ; index - 0 apply(cntrds,1,function(i){ index - index+1 xx - i[1]+c(-.5,-.5,.5,.5)/(n-1) ; yy - i[2]+c(-.5,.5,.5,-.5)/(n-1) panel.3dpolygon(xx,yy, zlim.scaled[1], rot.mat, distance, border=lp[index], col=lp[index],...) }) panel.3dscatter(x, y, z, rot.mat, distance, zlim.scaled = zlim.scaled, ...) } cloud(z~x+y|g,data=d,layout=c(2,1), type='h', panel.3d.cloud = panel.3d.surf, zoom = 1,screen=list(z= 21,y=0,x=-60),aspect = c(1,1), panel.aspect = 1, scales=list(z=list(arrows=F,tck=0),x=list(distance=.75)), par.box=list(lwd=NA),lwd=3) ## Beautiful ! On Sat, Feb 11, 2012 at 6:00 AM, Deepayan Sarkar deepayan.sar...@gmail.com wrote: On Fri, Feb 10, 2012 at 12:43 AM, ilai ke...@math.montana.edu wrote: Hello List! I asked this before (with no solution), but maybe this time... I'm trying to project a surface to the XY under a 3d cloud using lattice. I can project contour lines following the code for fig 13.7 in Deepayan Sarkar's Lattice, Multivariate Data Visualization with R, but it fails when I try to color them in using panel.levelplot. 
?utilities.3d says there may be some bugs, and I think ltransform3dto3d() is not precise (where did I hear that?), but is this really the source of my problem? Is there a (simple?) workaround, maybe using 3d.wire but projecting it to XY? How? Please, any insight may be useful. I don't think this will be that simple. panel.levelplot() essentially draws a bunch of colored rectangles. For a 3D projection, each of these will become (four-sided) polygons. You need to compute the coordinates of those polygons, figure out their fill colors (possibly using ?level.colors) and then draw them. -Deepayan Thanks in advance, Elai. A working example: ## data d and predicted surf: set.seed(1113) d - data.frame(x=runif(30),y=runif(30),g=gl(2,15)) d$z - with(d,rnorm(30,3*asin(x^2)-5*y^as.integer(g),.1)) d$z - d$z+min(d$z)^2 surf - by(d,d$g,function(D){ fit - lm(z~poly(x,2)*poly(y,2),data=D) outer(seq(0,1,l=10),seq(0,1,l=10),function(x,y,...) predict(fit,data.frame(x=x,y=y))) }) ## # This works to get contours: require(lattice) cloud(z~x+y|g,data=d,layout=c(2,1), type='h', lwd=3, par.box=list(lty=0), scales=list(z=list(arrows=F,tck=0)), panel.3d.cloud = function(x, y, z,rot.mat, distance, zlim.scaled, nlevels=20,...){ add.line - trellis.par.get(add.line) clines - contourLines(surf[[packet.number()]],nlevels = nlevels) for (ll in clines) { m - ltransform3dto3d(rbind(ll$x-.5, ll$y-.5, zlim.scaled[1]), rot.mat, distance) panel.lines(m[1,], m[2,], col = add.line$col, lty = add.line$lty, lwd = add.line$lwd) } panel.3dscatter(x, y, z, rot.mat, distance, zlim.scaled = zlim.scaled, ...) } ) # But using levelplot: panel.3d.levels - function(x, y, z,rot.mat, distance, zlim.scaled,...) { zz - surf[[packet.number()]] n - nrow(zz) s - seq(-.5,.5,l=n) m - ltransform3dto3d(rbind(rep(s,n),rep(s,each=n),zlim.scaled[1]), rot.mat, distance) panel.levelplot(m[1,],m[2,],zz,1:n^2,col.regions=heat.colors(20)) panel.3dscatter(x, y, z, rot.mat, distance, zlim.scaled = zlim.scaled, ...) 
} cloud(z~x+y|g,data=d,layout=c(2,1), type='h', panel.3d.cloud = panel.3d.levels, scales=list(z=list(arrows=F,tck=0)),par.box=list(lty=0),lwd=3) # I also tried to fill between contours but can't figure out what to do with the edges and how to incorporate the x,y limits to 1st and nth levels. panel.3d.contour - function(x, y, z,rot.mat, distance,xlim,ylim, zlim.scaled,nlevels=20,...) { add.line - trellis.par.get(add.line) zz - surf[[packet.number()]] clines - contourLines(zz,nlevels = nlevels) colreg - heat.colors(max(unlist(lapply(clines,function(ll) ll$level for (i in 2:length(clines)) { ll - clines[[i]] ll0 - clines[[i-1]] m - ltransform3dto3d(rbind(ll$x-.5, ll$y-.5, zlim.scaled[1]), rot.mat, distance) m0 -
Re: [R] how to plot a nice legend?
There are various alternatives available; you can also write your own, by modifying the standard one. Generally there are lots of possibilities for customizing within the standard one; e.g. y.intersp will affect the line spacing, using a negative value for inset (together with xpd=NA) will allow the legend to be moved outside the plot.

I tried without success:

plot(1:10)
legend(1,3, legend=c("one", "two"), inset=-1, xpd=NA)

The legend is still placed inside the plot at point (1,3). What could I have done wrong?

Can I include a legend like this in a standard plot like plot(1:10) too? http://www.r-bloggers.com/wp-content/uploads/2011/03/heatmap.png

kind regards, -- Jonas Stein n...@jonasstein.de
Re: [R] how to plot a nice legend?
Hi,

On Sat, Feb 11, 2012 at 8:07 PM, Jonas Stein n...@jonasstein.de wrote:

There are various alternatives available; you can also write your own, by modifying the standard one. Generally there are lots of possibilities for customizing within the standard one; e.g. y.intersp will affect the line spacing, using a negative value for inset (together with xpd=NA) will allow the legend to be moved outside the plot.

I tried without success:

plot(1:10)
legend(1,3, legend=c("one", "two"), inset=-1, xpd=NA)

The legend is still placed inside the plot at point (1,3). What could I have done wrong?

Wrong? Nothing. You told R to put the legend at c(1,3), so it did. If you want it elsewhere you need to specify that.

legend(-1,3, legend=c("one", "two"), inset=-1, xpd=NA)

maybe, or some other location?

Can I include a legend like this in a standard plot like plot(1:10) too? http://www.r-bloggers.com/wp-content/uploads/2011/03/heatmap.png

Yes. What part of that do you want to duplicate? You can specify colors, symbols, labels, etc. in legend(). Also, please link to the original blog post, not just the figure, so that the author gets some credit and we can see the code used.

Sarah

kind regards, -- Jonas Stein n...@jonasstein.de

-- Sarah Goslee http://www.functionaldiversity.org
Re: [R] how to plot a nice legend?
Wrong? Nothing. You told R to put the legend at c(1,3), so it did. If you want it elsewhere you need to specify that.

legend(-1,3, legend=c("one", "two"), inset=-1, xpd=NA)

maybe, or some other location?

OK, that works fine. Now I understand how to use it. If I create several plots, it would be nice if all legends had the same distance to the plot, even for plots with different scaling. Can the legend be placed vertically centered, 1 cm to the right of the plot area?

Can I include a legend like this in a standard plot like plot(1:10) too? http://www.r-bloggers.com/wp-content/uploads/2011/03/heatmap.png

Yes. What part of that do you want to duplicate?

The coloured squares. For the reader who got to this article and had the same question: I found another nice solution for a colour legend just a minute ago: http://www.r-bloggers.com/rethinking-loess-for-binomial-response-pitch-fx-strike-zone-maps/

You can specify colors, symbols, labels, etc. in legend().

Can I even invent my own symbols?

Also, please link to the original blog post, not just the figure, so that the author gets some credit and we can see the code used.

Sure: http://www.r-bloggers.com/ggheat-a-ggplot2-style-heatmap-function/

kind regards, -- Jonas Stein n...@jonasstein.de
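A minimal sketch of one way to place a legend vertically centered and 1 cm to the right of the plot region, independent of the axis scaling. It relies on two documented details of base graphics: inset only takes effect when the position is given by keyword (which is why it was ignored with explicit coordinates above), and xinch() converts inches to user x-coordinates. The margin size and colours are arbitrary choices.

```r
# Widen the right margin so there is room for the legend outside the plot.
op <- par(mar = c(5, 4, 4, 8) + 0.1)
plot(1:10)

usr <- par("usr")                   # plot region limits in user coordinates
x_pos <- usr[2] + xinch(1 / 2.54)   # 1 cm to the right of the plot region
y_pos <- mean(usr[3:4])             # vertical centre of the plot region

# xjust = 0 anchors the legend's left edge at x_pos, yjust = 0.5 centres it
# on y_pos; xpd = NA allows drawing outside the plot region. 'fill' draws
# coloured squares next to the labels.
lg <- legend(x_pos, y_pos, legend = c("one", "two"),
             fill = c("red", "blue"),
             xjust = 0, yjust = 0.5, xpd = NA, bty = "n")
par(op)
```

Because the position is derived from par("usr") after each plot() call, the same lines give the same physical offset for plots with different axis ranges.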
Re: [R] multiple histograms from a dataframe
On Feb 11, 2012, at 6:25 PM, Adel ESSAFI wrote:

On 11 February 2012 at 02:33, David Winsemius dwinsem...@comcast.net wrote:

On Feb 10, 2012, at 7:05 PM, Adel ESSAFI wrote:

Hi list, I need some help drawing some histograms. I have a dataframe with, say, columns X Y Z T. I want to draw a histogram Z-T for each value of the pair (X, Y). When I use this syntax

library(lattice)
histogram(law[,3] ~ law[,66] | law[,1] )

Perhaps (but untested in the absence of data):

histogram( Z ~ T | interaction(X, Y) , data=dfrmname )

Thanks, that helped a lot. Now I have another problem: I want to draw several (two) figures together. The par(new=T) directive does not recognize the plot produced by the lattice library.

par() is for base graphics. xyplot is part of lattice, which uses grid graphics.

When I tried:

xyplot(law[,66] ~ law[,3]| interaction(law[,1],law[,2]),type='l')
par(new=T)
Warning message: In par(new = T) : calling par(new=TRUE) with no plot
xyplot(law[,67] ~ law[,3]| interaction(law[,1],law[,2]),type='l')

the second xyplot() draws a new figure. What can I do to draw two figures together using lattice?

You need to describe what you mean by "together". It is possible that the groups parameter is what you want, but that's just a guess. It's also possible that the formula operator + will give you what you desire. Perhaps:

xyplot( law[,67] + law[,66] ~ law[,3]| interaction(law[,1],law[,2]),type='l')

it draws multiple histograms but by selecting distinct values of law[,1]. The deal is to make the same thing but for a couple of columns

That doesn't make any sense to me. But then I do apologize for the English language. It's horribly complex and syntactically a mess.

Thanks in advance for help, Adel

-- PhD candidate in Computer Science Address 3 avenue lamine, cité ezzahra, Sousse 4000 Tunisia tel: +216 97 246 706 (+33640302046 until 15/6) fax: +216 71 391 166

David Winsemius, MD West Hartford, CT
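The suggestion to use + on the left-hand side of the formula can be tried on made-up data. In this sketch the data frame law, its columns (X, Y, t, z1, z2) and the random values are all invented stand-ins for the poster's data; lattice's extended formula interface overlays z1 and z2 as two groups within each panel, with one panel per (X, Y) combination.

```r
library(lattice)

set.seed(1)
# Invented data: two conditioning factors and two response columns.
law <- data.frame(
  X = gl(2, 20, labels = c("x1", "x2")),
  Y = gl(2, 10, 40, labels = c("y1", "y2")),
  t = rep(1:10, 4)
)
law$z1 <- law$t + rnorm(40)
law$z2 <- 2 * law$t + rnorm(40)

# z1 + z2 on the left-hand side overlays both series in every panel.
p <- xyplot(z1 + z2 ~ t | interaction(X, Y), data = law,
            type = "l", auto.key = TRUE)
n_panels <- nlevels(interaction(law$X, law$Y))
print(p)  # lattice objects must be printed to be drawn inside scripts
```

Unlike par(new = TRUE), which only affects base graphics, this keeps everything inside a single lattice call.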
Re: [R] Counting occurences of variables in a dataframe
On Sat, Feb 11, 2012 at 04:05:25PM -0500, David Winsemius wrote:

On Feb 11, 2012, at 1:17 PM, Kai Mx wrote:

Hi everybody, I have a large dataframe similar to this one:

knames <- c('ab', 'aa', 'ac', 'ad', 'ab', 'ac', 'aa', 'ad','ae', 'af')
kdate <- as.Date( c('20111001', '20111102', '20101001', '20100315', '20101201', '20110105', '20101001', '20110504', '20110603', '20110201'), format="%Y%m%d")
kdata <- data.frame (knames, kdate)

ave(unclass(kdate), knames, FUN=order )
[1] 2 2 1 1 1 2 1 2 1 1

That was actually not using the dataframe values but you could also do this:

kdata$ord <- with(kdata, ave(unclass(kdate), knames, FUN=order ))
kdata
   knames      kdate ord
1      ab 2011-10-01   2
2      aa 2011-11-02   2
3      ac 2010-10-01   1
4      ad 2010-03-15   1
5      ab 2010-12-01   1
6      ac 2011-01-05   2
7      aa 2010-10-01   1
8      ad 2011-05-04   2
9      ae 2011-06-03   1
10     af 2011-02-01   1

Hi. This is a good solution if there are at most two occurrences of each name. If there are more occurrences, then the function order should be replaced by rank. Replacing name aa at row 2 by ab, we get

knames <- c('ab', 'ab', 'ac', 'ad', 'ab', 'ac', 'aa', 'ad','ae', 'af')
kdate <- as.Date( c('20111001', '20111102', '20101001', '20100315', '20101201', '20110105', '20101001', '20110504', '20110603', '20110201'), format="%Y%m%d")
kdata <- data.frame (knames, kdate)
kdata$ord <- with(kdata, ave(unclass(kdate), knames, FUN=order))
kdata$rank <- with(kdata, ave(unclass(kdate), knames, FUN=rank))
kdata
   knames      kdate ord rank
1      ab 2011-10-01   3    2
2      ab 2011-11-02   1    3
3      ac 2010-10-01   1    1
4      ad 2010-03-15   1    1
5      ab 2010-12-01   2    1
6      ac 2011-01-05   2    2
7      aa 2010-10-01   1    1
8      ad 2011-05-04   2    2
9      ae 2011-06-03   1    1
10     af 2011-02-01   1    1

The names ab occur in date order row 5, row 1, row 2, so row 1 should get index 2 and row 2 index 3, which is what rank gives and order does not. If some of the dates repeat, then rank() by default computes the average index. In this case, the following function f() may be used:

knames <- c('ab', 'ab', 'ac', 'ad', 'ab', 'ac', 'aa', 'ad','ae', 'af')
kdate <- as.Date( c('20111001', '20111001', '20101001', '20100315', '20101201', '20110105', '20101001', '20110504', '20110603', '20110201'), format="%Y%m%d")
kdata <- data.frame (knames, kdate)
kdata$rank <- with(kdata, ave(unclass(kdate), knames, FUN=rank))
f <- function(x) rank(x, ties.method="first")
kdata$f <- with(kdata, ave(unclass(kdate), knames, FUN=f))
kdata
   knames      kdate rank f
1      ab 2011-10-01  2.5 2
2      ab 2011-10-01  2.5 3
3      ac 2010-10-01  1.0 1
4      ad 2010-03-15  1.0 1
5      ab 2010-12-01  1.0 1
6      ac 2011-01-05  2.0 2
7      aa 2010-10-01  1.0 1
8      ad 2011-05-04  2.0 2
9      ae 2011-06-03  1.0 1
10     af 2011-02-01  1.0 1

Hope this helps. Petr Savicky.
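The difference between order and rank is easy to miss, because the two coincide for groups of one or two elements. A compact sketch on a smaller, invented data set (the vectors g and d and the result names are illustrative):

```r
g <- c("ab", "ab", "ac", "ab", "ac")
d <- as.Date(c("2011-10-01", "2011-11-02", "2010-10-01",
               "2010-12-01", "2011-01-05"))

# order() answers "which element comes 1st, 2nd, ...?" (a permutation),
# rank() answers "what position does each element take?" (the index wanted).
ord <- ave(as.numeric(d), g, FUN = order)
rnk <- ave(as.numeric(d), g, FUN = rank)

# With tied dates, ties.method = "first" gives distinct integer positions.
d_tied <- d
d_tied[2] <- d_tied[1]
rnk_first <- ave(as.numeric(d_tied), g,
                 FUN = function(x) rank(x, ties.method = "first"))
```

For the three "ab" rows the dates sort as row 4, row 1, row 2, so rank gives 2, 3, 1 for rows 1, 2 and 4, while order gives 3, 1, 2.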