Re: [R] sorting subsetting a data.frame
On Mar 6, 2011, at 7:34 PM, David Winsemius wrote: On Mar 6, 2011, at 6:05 PM, Liviu Andronic wrote: On Sun, Mar 6, 2011 at 11:53 PM, Liviu Andronic landronim...@gmail.com wrote: On Sun, Mar 6, 2011 at 11:49 PM, Liviu Andronic landronim...@gmail.com wrote: Dear all This may be obvious, but I cannot get it working. I'm trying to subset sort a data frame in one go. x - iris x$Species1 - as.character(x$Species) ##subsetting alone works fine with(x, x[Sepal.Length==6.7,]) ##sorting alone works fine with(x, x[order(Sepal.Length, rev(sort(Species1))),]) ##gets subsetted, but not sorted as expected with(x, x[(Sepal.Length==6.7) order(Sepal.Length, rev(sort(Species1))),]) ##gets subsetted, but sorts very strangely xa - with(x, x[Sepal.Length==6.7,]); with(xa, xa[order(Sepal.Length, rev(sort(Species1))),]) xa - with(x, x[Sepal.Length==6.7,]); with(xa, xa[order(rev(sort(Species1))),]) And of course I found the culprit after sending the e-mail: wrong call order. The following does what I want, although it's a bit messy: xa - with(x, x[Sepal.Length==6.7,]); with(xa, xa[rev(order(sort(Species1))),]) xa - subset(x, Sepal.Length==6.7); with(xa, xa[rev(order((sort(Species1,]) But it still doesn't work on my data! Any ideas for a different approach? subset(x[order(x$Species1), ], Sepal.Length==6.7 ) Slight modification of the order/sort first subset second strategy: xa - x[order(x$Species1), ][x$Sepal.Length==6.7, ] If you really wanted to select cases first, and still keep it a one- liner there is the option of: (y - x[x$Sepal.Length==6.7, ])[order(y$Species1), ] Liviu Regards Liviu I've checked The R Inferno, Quick-R and several other places with no obvious solution. Any ideas? Regards Liviu -- Do you know how to read? http://www.alienetworks.com/srtest.cfm http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader Do you know how to write? http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail -- Do you know how to read? http://www.alienetworks.com/srtest.cfm http://goodies.xfce.org/projects/applications/xfce4-dict#speed- reader Do you know how to write? http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail -- Do you know how to read? http://www.alienetworks.com/srtest.cfm http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader Do you know how to write? http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Replace for loop when vector calling itself
Hi, I am missing something obvious. Need to create vector as: (0, i-1 + TheoP(i) - TheoP(i-1), repeat) Where i is the index position in the vector and i[1] is always 0. Found myself having to use a For Loop because I could not get sapply working. Any suggestions ? delta - function(x) { start = index[x] end = index[x+1] - 1 iTheo = start:end len = length(iTheo) theoP = as.numeric(TheoBA$Bid[iTheo]) d = vector(mode = numeric, length= len) d[1] = 0 if (len1) for (i in 2:len) { d[i] = d[i-1] + theoP[i] - theoP[i-1] } return(d) } Thanks, Chris -- View this message in context: http://r.789695.n4.nabble.com/Replace-for-loop-when-vector-calling-itself-tp3338383p3338383.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Preferred way to create bubble plots?
I have to create a number of bubble plots, and am wondering what methods folks prefer for this task. I've been experimenting with the symbols() function, with text() to provide plot labels. Any opinions on the relative merits of this method versus others? One criterion would be the ability to fine-tune the placement of text labels. I would like to use lattice, but haven't found a way to make it work for this purpose. Thanks in advance. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Zero Inflated Distributions
Any help on this would be appreciated. Thank you. -- View this message in context: http://r.789695.n4.nabble.com/Zero-Inflated-Distributions-tp3334861p3338344.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Probabilities outside [0, 1] using Support Vector Machines (SVM) in e1071
predict.svm only returns probabilities for model types (Model$type) less than 2, which I guess are classification models (?). In any case, the probabilities are returned as an attribute which your result clearly lacks. Trivial example: model- svm(Species ~ ., data = iris[-1,], probability=TRUE) ( p- predict(model, newdata=iris[1,], probability=TRUE) ) 1 setosa attr(,probabilities) setosa versicolor virginica 1 0.9796166 0.01147175 0.008911663 Levels: setosa versicolor virginica Hope this helps a little. Allan On 04/03/11 18:54, Adam B. Smith wrote: Hi All, I'm attempting to use eps-regression or nu-regression SVM to compute probabilities but the predict function applied to an svm model object returns values outside [0, 1]: Variable Data looks like: Present X02 X03 X05 X06 X07 X13 X14 X15 X18 1 0 1634 48 2245.469 -1122.0750 3367.544 11105.013 2017.306 40 23227 2 0 1402 40 2611.519 -811.2500 3422.769 10499.425 1800.475 40 13822 3 0 1379 40 2576.150 -842.8500 3419.000 10166.237 2328.756 37 14200 4 0 1869 51 2645.794 -982.2938 3628.088 9610.037 1699.656 43 20762 ... and bgEnv looks similar: X02 X03 X05 X06 X07 X13 X14 X15 X18 1 1001 39 2521.406 -38.0875 2559.494 48507.312 3925.7563 63 20616 2 1587 39 3148.056 -895.0187 4043.075 5937.669 910.9062 55 15156 3 1610 40 4116.918 172.6812 3944.237 2287.431 196.0312 51 2739 4 1495 43 3678.381 236.9250 3441.456 3298.625 23.9875 86 281 5 1564 43 3010.988 -623.6063 3634.594 3416.350 819.6375 34 3848 ... modelFormula- as.formula(Present ~ X02 + X03 + X05 + X06 + X07 + X13 + X14 + X15 + X18) Model- svm( modelFormula, data=Data, gamma=0.25, cost=4, nu=0.10, kernel='radial', scale=TRUE, type='nu-regression', na.action=na.omit, probability=TRUE ) bgPreds- predict( Model, newdata=bgEnv, type='nu-regression', probability=TRUE ) bgPreds looks like: 11 12 13 14 15 16 17 18 0.54675813 0.37587560 0.39526542 0.67043587 -0.03079247 0.16696996 0.04714134 0.06989950 19 20 0.07615735 0.14923408 Notice the negative value. I can also get values1. I had thought argument probability=TRUE would give probabilities. Any help would be greatly appreciated! Adam Adam B. Smith University of California, Berkeley __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] rowSums - am I getting something wrong?
I am trying to construct a data set with some sequences for example: a = seq(0,1,0.1) m = matrix(nrow = 1331, ncol = 3) m[,1] = rep(a,121) m[,2] = rep(a,11,each = 11) m[,3] = rep(a,1,each = 121) I realize that there may be better ways of doing this, but this approach demonstrates the problem I'm having. I then want to get the sum of the rows and delete any row with a sum of greater than 1. But have a problem with rows containing any combination of the values 0.6, 0.3 and 0.1 as the sum of these is clearly 1, but a request for which rows have a sum greater than 1 will return rows with these values. Row 161 is the first row containing these values: [161,] 0.6 0.3 0.1 which(rowSum(m)1) [53] 119 120 121 132 142 143 152 153 154 161 162 As far as I can tell this only affects combinations of 0.6, 0.3 and 0.1 (though I haven't checked every value in the matrix) If I try the following: q=rowSums(m) which(q1) [53] 119 120 121 132 142 143 152 153 154 161 162 But if I add and subtract 1 from this: q=q+1 q=q-1 which(q1) [53] 119 120 121 132 142 143 152 153 154 162 What exactly is going on here? I don't have the problem with other combinations (eg 0.7, 0.2, 0.1). I assume that there is something about the data format that I don't understand, but if I make a data frame of the matrix I found the same effect. Any help would be great Tom message may contain confidential information. If you are not the designated recipient, please notify the sender immediately, and delete the original and any copies. Any use of the message by you is prohibited. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] wireframe() display a graph with two colors, not a gradient.
On Sun, Feb 27, 2011 at 1:29 AM, James Platt james-pl...@hotmail.co.uk wrote: Hi all, I'm quite new to wireframe, essentially what I want to do is display a graph, and z-values 1 would be yellow and those 1 would be blue. This is a bit of my data. 0.334643563 0.350913807 0.383652307 0.370325283 0.38779016 0.42387392 0.39861579 0.418389687 0.460692165 0.43888516 0.468015843 0.520560489 0.499544084 0.535099422 0.60982153 0.569888047 0.634351734 0.717646392 0.717202578 0.810887467 0.935152498 0.901982916 1.044895388 1.228306176 1.12856184 1.314210456 1.462055626 1.377314404 1.540372345 1.6206216 1.494044604 1.618244219 1.631295797 data.m = as.matrix(read.table(/Users/James/Desktop/c.txt, sep='\t')) library(lattice) wireframe(data.m,aspect = c(0.3), shade=TRUE, screen = list(z = 0, x = -45), light.source = c(0,0,10), distance = 0.2) i know I probably need to use the col.regions or level.colors argument, but I'm not sure what I actually need to supply to the arguments to get it to work. All the examples I've seen also have a gradient of color between two or more colors, I want to color half the graph with z values 1 yellow, and 1 blue. Something like wireframe(volcano, drape = TRUE, at = c(90, 150, 200)) perhaps? Note that color may not vary within a quadrilateral, so your boundary will be jagged to some extent. -Deepayan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] rowSums - am I getting something wrong?
Hi Tom, That's once again the floating point number issue: see FAQ 7.31. Look at this: sum(m[161,]) [1] 1 sum(m[161,])==1 [1] FALSE sum(m[161,])-1 [1] 2.220446e-16 So 0.6+0.3+0.1 is indeed greater than 1 Try this instead: round(sum(m[161,]))==1 [1] TRUE HTH, Ivan Le 3/7/2011 08:08, thomas.salve...@syngenta.com a écrit : I am trying to construct a data set with some sequences for example: a = seq(0,1,0.1) m = matrix(nrow = 1331, ncol = 3) m[,1] = rep(a,121) m[,2] = rep(a,11,each = 11) m[,3] = rep(a,1,each = 121) I realize that there may be better ways of doing this, but this approach demonstrates the problem I'm having. I then want to get the sum of the rows and delete any row with a sum of greater than 1. But have a problem with rows containing any combination of the values 0.6, 0.3 and 0.1 as the sum of these is clearly 1, but a request for which rows have a sum greater than 1 will return rows with these values. Row 161 is the first row containing these values: [161,] 0.6 0.3 0.1 which(rowSum(m)1) [53] 119 120 121 132 142 143 152 153 154 161 162 As far as I can tell this only affects combinations of 0.6, 0.3 and 0.1 (though I haven't checked every value in the matrix) If I try the following: q=rowSums(m) which(q1) [53] 119 120 121 132 142 143 152 153 154 161 162 But if I add and subtract 1 from this: q=q+1 q=q-1 which(q1) [53] 119 120 121 132 142 143 152 153 154 162 What exactly is going on here? I don't have the problem with other combinations (eg 0.7, 0.2, 0.1). I assume that there is something about the data format that I don't understand, but if I make a data frame of the matrix I found the same effect. Any help would be great Tom message may contain confidential information. If you are not the designated recipient, please notify the sender immediately, and delete the original and any copies. Any use of the message by you is prohibited. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Ivan CALANDRA PhD Student University of Hamburg Biozentrum Grindel und Zoologisches Museum Abt. Säugetiere Martin-Luther-King-Platz 3 D-20146 Hamburg, GERMANY +49(0)40 42838 6231 ivan.calan...@uni-hamburg.de ** http://www.for771.uni-bonn.de http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Same color key for multiple lattice contour plots
On Mon, Feb 21, 2011 at 5:29 AM, joepvanderzanden joep_vd_zan...@hotmail.com wrote: Hi all, I'm trying to make multiple lattice contour plots which have the same color key, to allow good comparisons. However, I run into some problems when fitting the plots to the color key. Basically my strategy to tackle this problem was: 1) define a color key for all plots; 2) calculate the variable range for each plot; 3) calculate the range of colors from the color key that correspond with the variable range in each plot. This works fine for one plot but gives problems for an other. The code is shown below. Any ideas of how to solve this would be very welcome! I think you are misunderstanding what 'cuts' does. SInce you are plotting on the same x-y grid, why not do it together in a single plot, as is more natural? If I understand your attempts, that's what you seem to be trying to do anyway in a roundabout way. For example, x.marginal - seq(min(x), max(x), length.out = 50) y.marginal - seq(min(y), max(y), length.out = 50) xy.marginal - list(x = x.marginal, y = y.marginal) zz1 - loess((z1) ~ x * y) zz2 - loess((z2) ~ x * y) grid - expand.grid(xy.marginal) grid[, fit1] - c(predict(zz1, grid)) grid[, fit2] - c(predict(zz2, grid)) contourplot(fit1 + fit2 ~ x * y, data = grid, layout = c(1, 2), as.table = TRUE, at = deltaseq, region = TRUE, col.regions = rev(rainbow(n = numbcolstotal - 2)), colorkey = list(at = deltaseq, labels=list(at=deltaseq)), xlab=distance to river (m), ylab=lambda (/d), strip = strip.custom(factor.levels = c(Mean Z, L Med Clay, Grass, Mean Z, L Med Clay, Grass))) -Deepayan Thanks in advance, Joep van der Zanden MSc student Hydrology and Water Quality Wageningen University, The Netherlands / University of Sydney, Australia # variable to plot: groundwater heads for 2 situations (L Med Clay Grass; S Clay Loam Trees): z2 and z1 # x: range of distances to river # y: range of lambda values ## define input data x - vector(mode=numeric, length=64) for(i in 1:8){x[((i-1)*8+1):(i*8)]- c(0,25,65,140,240,390,590,840)} y - vector(mode=numeric, length=64) for(i in 1:8){y[((i-1)*8+1):(i*8)]- c(.1,.2,.3,.4,.5,.6,.7,.8)[i]} #Groundwater heads L Med Clay Grass: z1-c(-2.50,-2.505624,-2.518618,-2.545173,-2.573554,-2.602523,-2.618828,-2.626615, -2.50,-2.605805,-2.753376,-2.975772,-3.176528,-3.384182,-3.522155,-3.600701, -2.50,-2.665630,-2.899457,-3.264895,-3.622645,-4.027824,-4.344277,-4.552091, -2.50,-2.681250,-2.937271,-3.338646,-3.736416,-4.193448,-4.559524,-4.805040, -2.50,-2.704198,-2.995056,-3.458425,-3.932201,-4.493279,-4.964342,-5.291613, -2.50,-2.714743,-3.021087,-3.511101,-4.015649,-4.616914,-5.126386,-5.482788, -2.50,-2.714741,-3.021320,-3.512125,-4.018124,-4.621621,-5.133251,-5.491187, -2.50,-2.714734,-3.021383,-3.512441,-4.018981,-4.623628,-5.137014,-5.496542) #Groundwater heads S Clay Loam Trees: z2-c(-3.98,-3.097126,-3.239426,-3.473582,-3.721368,-4.020646,-4.277272,-4.457723, -3.000100,-3.107293,-3.263420,-3.518309,-3.786358,-4.109567,-4.386844,-4.582058, -3.000100,-3.110692,-3.271133,-3.531996,-3.805475,-4.134907,-4.417481,-4.616469, -3.000100,-3.110448,-3.270550,-3.531059,-3.804453,-4.134001,-4.416834,-4.616037, -3.000100,-3.111242,-3.272157,-3.533380,-3.806871,-4.136071,-4.418223,-4.616824, -3.000100,-3.110467,-3.270084,-3.528994,-3.799932,-4.125979,-4.405361,-4.601977, -3.000100,-3.110983,-3.271263,-3.531276,-3.803556,-4.131419,-4.412594,-4.610585, -3.000100,-3.111398,-3.272309,-3.533291,-3.806408,-4.135115,-4.416826,-4.615104) ##step 1 define color key bynumb - .2 # delta deltaseq - seq(-5.8,-2,by=bynumb) # define color key numbcolstotal - length(deltaseq)-1 # number of colors in color key #define plot for L Med Clay Grass: x.marginal - seq(min(x), max(x), length.out = 50) y.marginal - seq(min(y), max(y), length.out = 50) xy.marginal - list(x = x.marginal, y = y.marginal) zz - loess((z1) ~ x * y) grid - expand.grid(xy.marginal) grid[, fit] - c(predict(zz, grid)) # step 2 calculate the range of groundwater heads for this plot: rangehere-seq(floor(min(z1)/bynumb)*bynumb,ceiling(max(z1)/bynumb)*bynumb,by=bynumb) # rangehere appears to be seq(-5.6, -2.4, by = -.2, length = 17) # step 3 calculate the range of colors that fit to the range of groundwater heads colorrange-round(seq(((rangehere[1]-deltaseq[1])/bynumb)+1,((max(rangehere)-deltaseq[1])/bynumb),1)) # colorrange is 1 to 16 -- should be correct. plot1 - contourplot(fit~x*y, data = grid,cuts = length(colorrange)-2,region=TRUE, col.regions=rev(rainbow(n=numbcolstotal))[colorrange], colorkey=list(col=rev(rainbow(n=numbcolstotal)),at=deltaseq,labels=list(at=deltaseq)), xlab=distance to river (m), ylab=lambda (/d), main=Mean Z, L Med Clay, Grass) ## do the
Re: [R] sorting subsetting a data.frame
On Mon, Mar 7, 2011 at 1:38 AM, David Winsemius dwinsem...@comcast.net wrote: subset(x[order(x$Species1), ], Sepal.Length==6.7 ) Thank you all for the suggestions. Now I can do exactly what I wanted. Regards Liviu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] transaction list transformation to use rpart.
On 06/03/11 22:34, John Dennison wrote: [...] from data like Customer-ID | Item-ID cust1 | 2 cust1 | 3 cust1 | 5 cust2 | 5 cust2 | 3 cust3 | 2 ... #read in data to a sparse binary transaction matrix txn = read.transactions(file=tranaction_list.txt, rm.duplicates= TRUE, format=single,sep=|,cols =c(1,2)); #tranaction matrix to matrix a-as(txn, matrix) #matrix to data.frame b-as.data.frame(a) I end up with a data.frame like: X X.1 X.2 X.3 X.4 X.5 ... cust1 01 101 cust2 00 101 cust3 01 000 ... However the as.data.frame(a) transforms the matrix into a numeric data.frame so when I implement the rpart algorithm it automatically returns a regression classification tree. I am not sure your approach with rpart is going to give you what you are looking for, but on to your R question: [...] I can't successfully transform the data.frame to a factor. i tried: b_factor-as.factor(b) Error in sort.list(y) : 'x' must be atomic for 'sort.list' Have you called 'sort' on a list? You need to do each column individually, i.e. b_factor$X.1 - as.factor(b$X.1) or str( as.data.frame(lapply(b, as.factor)) ) 'data.frame':4 obs. of 4 variables: $ X.2 : Factor w/ 2 levels 0,1: 2 1 2 1 $ X.3 : Factor w/ 2 levels 0,1: 2 2 1 1 $ X.5 : Factor w/ 2 levels 0,1: 2 2 1 1 $ X.Item.ID: Factor w/ 2 levels 0,1: 1 1 1 2 Also have a look at as(txn, data.frame) for a different format that may (with some clean up) be easier to use. as(txn, data.frame) transactionID items 1 cust1{ 2, 3, 5} 2 cust2 { 3, 5} 3 cust3{ 2} 4 Customer-ID { Item-ID} Hope this helps a little. Allan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Difference between the S-plus influence and R empinf functions
Hello everyone ! I am currently trying to convert a program from S-plus to R, and I am having some trouble with the S-plus function called influence(data, statistic,...). This function aims to calculate empirical influence values and related quantities, and is part of the Resample library that I cannot find for R. However, 2 similar functions are available in R: - the lm.influence(model, ...) function, - the empinf(data, statistic,...) function. I didn't manage to use the lm.influence() function correctly, because it needs a linear model as input (lm, glm), and what I have as input is a function (I don't know well R/S-plus languages, so I may be mistaken, but I believe lm.influence() is not what I should use for my problem ...?) I have tried to use the R empinf() instead but I am stucked with a problem concerning the input argument group that I cannot translate in R. Here is a copy of the S-plus influence() help concerning this argument: group : vector of length equal to the number of observations in data, for stratified sampling or multiple-sample problems. Sampling is done separately for each group (determined by unique values of this vector). If data is a data frame, this may be a variable in the data frame, or expression involving such variables. empinf() accepts an argument called strata but it doesn't seem to correspond to group. Below is a sample test showing my problem: testinflu = function(data, weights) {sum(data[,1]*weights) } mydata - cbind(c(1,2,3,4,5), c(1,1,1,1,0)) # In S-plus : testinflu(data=mydata, weights=rep(1,length(mydata[,1]))) 15 # In R: testinflu(data=mydata, weights=rep(1,length(mydata[,1]))) 15 # In S-plus : influence(data = mydata, statistic=testinflu)$L testinflu [1,] -2.00e+000 [2,] -1.00e+000 [3,] -1.776357e-013 [4,] 1.00e+000 [5,] 2.00e+000 # In R : empinf(data = mydata, statistic=testinflu) [1] -2.00e+00 -1.00e+00 2.220446e-12 1.00e+00 2.00e+00 # == OK # In S-plus : influence(data = mydata, statistic=testinflu, group = mydata[, 2])$L testinflu [1,] -1.2 [2,] -0.4 [3,] 0.4 [4,] 1.2 [5,] 0.0 # In R: empinf(data = mydata, statistic=testinflu, strata = mydata[, 2]) [1] -1.5 -0.5 0.5 1.5 0.0 # == NOT OK So I have a few questions: - has anyone already experienced the same kind of problem with the influence function ? - is it possible to mimic the use of the group argument in empinf() ? I have looked for answers on the web but couldn't find anythings really helpful, so if someone has an idea I would really appreciate it !! :) Thanks, Fred ps : sorry for my broken English ... [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Associating the day of week to a daily xts object
I have the following xts objetct temp str(temp) An ‘xts’ object from 2010-12-26 to 2011-03-05 containing: Data: num [1:70, 1] 2.95 0.852 -0.139 1.347 2.485 ... - attr(*, dimnames)=List of 2 ..$ : NULL ..$ : chr t_n Indexed by objects of class: [POSIXct,POSIXt] TZ: GMT xts Attributes: NULL temp t_n 2010-12-26 2.950 2010-12-27 0.8520833 2010-12-28 -0.1390625 ... I would like to associate another column with the day of week in the form of 1=Mon, 2=Tue, ..., 7=Sun in order to obtain: newtemp t_n dow 2010-12-26 2.9507 2010-12-27 0.85208331 2010-12-28 -0.13906252 .. How could make this in the shortest (and elegant?) way? Ciao from Rome Vittorio __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] clustersets comparison
Hi, I have produced some clustersets with the same program and different parameters and want to compare them. They contain a few thousand clusters each with a few million elements in total. After googling around, I couldn't find much of relevance, so I am asking here. Is there any package in R that is suited to this comparisons? I want to have an idea of how the clusters grow/shrink/merge/split, how the elements move around, etc. Cheers, Albert. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Fwd: Fwd: r.dll
Thank you very much. -- Forwarded message -- From: Uwe Ligges lig...@statistik.tu-dortmund.de Date: 2011/3/5 Subject: Re: [R] Fwd: r.dll To: wesley mathew wesleycmat...@gmail.com Cc: r-help@r-project.org On 04.03.2011 19:40, wesley mathew wrote: Dear All I downloaded R-2.12.2.tar file, but I could not find R.dll file there. Do u mean the Windows binary distribution is R-2.12.2.tar No, it is the source distribution. 1. Go to CRAN. 2. Click on Windows 3. Click on base 4. Click on Previous releases (since you want the outdated R-2.12.1) 5. Click on R-2.12.1 5. Click on Download R 2.12.1 for Windows Uwe Ligges or another file? Can you help me to find R.dll of version 2.12.1 Kind regards Wesley -- Forwarded message -- From: Uwe Liggeslig...@statistik.tu-dortmund.de Date: 2011/3/4 Subject: Re: [R] r.dll To: wesley mathewwesleycmat...@gmail.com Cc: r-help@r-project.org On 04.03.2011 18:15, wesley mathew wrote: Dear all I have some problem to execute jri package. R.dll file has to copped to jri directory for the execution of jar file in eclips. But R.dll file is not available in the R version 2.12.1 . It is, at least in the Windows binary distribution. Uwe Ligges Is there any chance to get this file. Thanks in advanced Kind regards W. Mathew [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html http://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- W. Mathew [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem in installing and starting Rattle
CHECK FOR CONFLICTS IN YOUR PATH !!! I had a related problem when trying to use library RGtk2 for the first time. My problem was that when loading the library R was looking for the file zlib1.dll but couldn't find the procedure to launch RGtk2. I was getting an Entry Point not found error from Rgui.exe. The reason was because I had another package in my PATH environment variable (C:\program files\Intel\WiFi\bin) which had a CONFLICTING version of the zlib1.dll - and it was looking in this file and not the zlib1.dll which came with GTK+. Removed this conflict from the PATH and all was ok. -- View this message in context: http://r.789695.n4.nabble.com/Problem-in-installing-and-starting-Rattle-tp3042502p3338807.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Associating the day of week to a daily xts object
Victor, The weekdays function will return the days of the week (as a character vector of names) that a given vector of dates (Date or POSIXct) fall on. These can then be converted into numbers using a look-up table/vector. See below for an example using the sample_matrix data included with the xts package. ## require(xts) data(sample_matrix) dates - as.POSIXct(rownames(sample_matrix), format = %Y-%m-%d) dayLookup - 1:7 names(dayLookup) - c(Mon, Tue, Wed, Thu, Fri, Sat, Sun) datesDays - dayLookup[weekdays(dates, abbreviate = TRUE)] print(datesDays) ## From here, you can just add the datesDays vector as an additional column to the original data, e.g. xts(cbind(sample_matrix, dow =datesDays), order.by = dates). HTH, Francisco On Mon, Mar 7, 2011 at 9:05 AM, Victor vdem...@gmail.com wrote: I have the following xts objetct temp str(temp) An xts object from 2010-12-26 to 2011-03-05 containing: Data: num [1:70, 1] 2.95 0.852 -0.139 1.347 2.485 ... - attr(*, dimnames)=List of 2 ..$ : NULL ..$ : chr t_n Indexed by objects of class: [POSIXct,POSIXt] TZ: GMT xts Attributes: NULL temp t_n 2010-12-26 2.950 2010-12-27 0.8520833 2010-12-28 -0.1390625 ... I would like to associate another column with the day of week in the form of 1=Mon, 2=Tue, ..., 7=Sun in order to obtain: newtemp t_n dow 2010-12-26 2.9507 2010-12-27 0.85208331 2010-12-28 -0.13906252 .. How could make this in the shortest (and elegant?) way? Ciao from Rome Vittorio __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Array Help
Hi, I have two 3 D arrays. Both are of this form array_1- array[n,n,k] array_2-array[m,m,k] Lets say n=83 and m=80 Since nm. I would like to add rows and columns to array_2 to make them equal. I want to keep the size of the third dimension fixed i.e.. k. i.e. if (nrow(array_1)nrow(array_2)) { array_2[m:n,m:n,]- 10^6 } But this doesn't work. I tried abind and rbind but it didn't work either. Any help/tips will be appreciated! Thanks, Usman [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Parsing question, partly comma separated partly underscore separated string
On Sun, Mar 6, 2011 at 10:13 PM, Eric Fail eric.f...@gmx.com wrote: Dear R-list, I have a partly comma separated partly underscore separated string that I am trying to parse into R. Furthermore I have a bunch of them, and they are quite long. I have now spent most of my Sunday trying to figure this out and thought I would try the list to see if someone here would be able to get me started. My data structure looks like this, (in a example.txt file) Subject ID,ExperimentName,2010-04-23,32:34:23,Version 0.4, 640 by 960 pixels, On Device M, M, 3.2.4,zz_373_462_488_...@9z.svg,592,820,3.35,zz_032_288_436_...@9z.svg,332,878,3.66,zz_384_204_433_...@9z.svg,334,824,3.28,zz_365_575_683_...@9z.svg,598,878,3.50,zz_005_480_239_...@9z.svg,630,856,8.03,zz_030_423_394_...@9z.svg,98,846,4.09,zz_033_596_398_...@9z.svg,636,902,3.28,zz_263_064_320_...@9z.svg,570,894,1.26,bl...@9z.svg,322,842,32.96,zz_004_088_403_...@9z.svg,606,908,3.32,zz_703_546_434_...@9z.svg,624,934,2.58,zz_712_348_543_...@9z.svg,20,828,5.36,zz_005_48_239_...@9z.svg,580,830,4.36,zz_310_444_623_...@9z.svg,586,806,0.08,zz_030_423_394_...@9z.svg,350,854,3.84,zz_340_382_539_...@9z.svg,570,894,1.26,bl...@9z.svg,542,840,4.44,zz_345_230_662_...@9z.svg,632,844,2.47,zz_006_335_309_...@9z.svg,96,930,3.63,zz_782_346_746_...@9z.svg,306,850,2.58,zz_334_200_333_...@9z.svg,304,842,3.34,zz_383_506_726_...@9z.svg,622,884,3.84,zz_294_360_448_...@9z.svg,90,858,3.56,zz_334_335_473_...@9z.svg,570,894,1.26,bl...@9z.svg,320,852,4.04, (end of example.txt file) The above is approximate 5% of the length of a full file, and then I got about 100 of them. Please note that the strings end with a comma. I am trying to parse it into something like this ID ImgNam BLOCK RUN Tx Ty Treatment x y Y Subject ID 373 1 1 462 488 TRT 592 820 3.35 Subject ID 32 1 2 288 436 CON 332 878 3.66 Subject ID 384 1 3 204 433 TRT 334 824 3.28 Subject ID 365 1 4 575 683 TRT 598 878 3.5 Subject ID 5 1 5 480 239 CON 630 856 8.03 Subject ID 30 1 6 423 394 CON 98 846 4.09 Subject ID 33 1 7 596 398 CON 636 902 3.28 Subject ID 263 1 8 64 320 TRT 570 894 1.26 Subject ID 4 2 1 88 403 CON 606 908 3.32 Subject ID 703 2 2 546 434 CON 624 934 2.58 Subject ID 712 2 3 348 543 CON 20 828 5.36 Subject ID 5 2 4 48 239 CON 580 830 4.36 Subject ID 310 2 5 444 623 TRT 586 806 0.08 Subject ID 30 2 6 423 394 CON 350 854 3.84 Subject ID 340 2 7 382 539 TRT 570 894 1.26 Subject ID 345 3 1 230 662 TRT 632 844 2.47 Subject ID 6 3 2 335 309 CON 96 930 3.63 Subject ID 782 3 3 346 746 TRT 306 850 2.58 Subject ID 334 3 4 200 333 TRT 304 842 3.34 Subject ID 383 3 5 506 726 TRT 622 884 3.84 Subject ID 294 3 6 360 448 TRT 90 858 3.56 Subject ID 334 3 7 335 473 TRT 570 894 1.26 I could do it in Excel, but it would take me a week--and it would be stupid--if someone could please help me get started I would very much appreciate it. It would not only benefit me, but my colleagues would see the benefit of R and the R-list in particular. Try this. We split the line by ZZ_ giving s and remove the junk after the word BLOCK giving s2. Then we remove @9z.svg giving s3 and convert each _ to , giving s4. We then read it into a data frame using comma as the separator, calculate the block and run columns, remove one junk column and assign column names. Line - Subject ID,ExperimentName,2010-04-23,32:34:23,Version 0.4, 640 by 960 pixels, On Device M, M, 3.2.4,zz_373_462_488_...@9z.svg,592,820,3.35,zz_032_288_436_...@9z.svg,332,878,3.66,zz_384_204_433_...@9z.svg,334,824,3.28,zz_365_575_683_...@9z.svg,598,878,3.50,zz_005_480_239_...@9z.svg,630,856,8.03,zz_030_423_394_...@9z.svg,98,846,4.09,zz_033_596_398_...@9z.svg,636,902,3.28,zz_263_064_320_...@9z.svg,570,894,1.26,bl...@9z.svg,322,842,32.96,zz_004_088_403_...@9z.svg,606,908,3.32,zz_703_546_434_...@9z.svg,624,934,2.58,zz_712_348_543_...@9z.svg,20,828,5.36,zz_005_48_239_...@9z.svg,580,830,4.36,zz_310_444_623_...@9z.svg,586,806,0.08,zz_030_423_394_...@9z.svg,350,854,3.84,zz_340_382_539_...@9z.svg,570,894,1.26,bl...@9z.svg,542,840,4.44,zz_345_230_662_...@9z.svg,632,844,2.47,zz_006_335_309_...@9z.svg,96,930,3.63,zz_782_346_746_...@9z.svg,306,850,2.58,zz_334_200_333_...@9z.svg,304,842,3.34,zz_383_506_726_...@9z.svg,622,884,3.84,zz_294_360_448_...@9z.svg,90,858,3.56,zz_334_335_473_...@9z.svg,570,894,1.26,bl...@9z.svg,320,852,4.04, s - strsplit(Line, ZZ_)[[1]] s2 - sub(BLOCK.*, BLOCK, s) s3 - sub(@9z.svg, , s2) s4 - gsub(_, ,, s3) DF - read.table(textConnection(s4), skip = 1, sep = ,, as.is = TRUE) DF$block - head(cumsum(c(, DF$V8) == BLOCK)+1, -1) DF$run - ave(DF$block, DF$block, FUN = seq_along) DF$V8 - NULL names(DF) - c(IngNam, Tx, Ty, Treatment, x, y, Y, BLOCK, RUN) DF IngNam Tx Ty Treatment x yY BLOCK RUN 1 373 462 488 TRT 592 820 3.35 1 1 2 32 288 436 CON 332 878 3.66 1 2 3 384 204 433 TRT 334 824 3.28 1 3 4 365 575 683 TRT 598 878 3.50 1 4 5
Re: [R] Replace for loop when vector calling itself
On Mar 7, 2011, at 12:34 AM, rivercode wrote: Hi, I am missing something obvious. Need to create vector as: (0, i-1 + TheoP(i) - TheoP(i-1), repeat) Where i is the index position in the vector and i[1] is always 0. I think your prototype is not agreeing with the code below. Is i suppose to be the index (as suggested above) or the prior term (as implied below)? Found myself having to use a For Loop because I could not get sapply working. Any suggestions ? Assuming the code below, you construct the first three or four values by hand I think you will find that the intermediate values of TheoP will have alternating signs. term2 = 2-1 + TheoP(2) - TheoP(1) term3 = 3-1 + TheoP(3) - (2-1 + TheoP(2) - TheoP(1)) term4 = 4-1 + TheoP(4) - (3-1 + TheoP(3) - (2-1 + TheoP(2) - TheoP(1)) ) The answer to the first question will determine how you proceed. If the index is being used then there are two series of cumulative sums and perhaps you can construct an expression that can be fed to the cumsum function. The diff function is also available and if the index version is correct, then it might even be as simple as c(0, ((1:len)-1)+diff(TheoP) ) So clarify what is intended. -- David. delta - function(x) { start = index[x] end = index[x+1] - 1 iTheo = start:end len = length(iTheo) theoP = as.numeric(TheoBA$Bid[iTheo]) d = vector(mode = numeric, length= len) d[1] = 0 if (len1) for (i in 2:len) { d[i] = d[i-1] + theoP[i] - theoP[i-1] } return(d) } Thanks, Chris -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Preferred way to create bubble plots?
On Mar 7, 2011, at 12:11 AM, Al Roark wrote: I have to create a number of bubble plots, and am wondering what methods folks prefer for this task. I've been experimenting with the symbols() function, with text() to provide plot labels. Any opinions on the relative merits of this method versus others? One criterion would be the ability to fine-tune the placement of text labels. I would like to use lattice, but haven't found a way to make it work for this purpose. The xYplot function in Hmisc is based on xyplot and is used to create bubble plots. There are also bubble plot functions in gstat and ps packages. Thanks in advance. [[alternative HTML version deleted]] David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] null model for a single species?
Dear List members, I would like to test whether an observed occupancy of lakes in a landscape has occurred randomly (by chance) or not. How can I do that? The problem is that it concerns only a single species and I would like to use binary data only. At first I thought of generating null models and test the observed occupancy against the randomly generated one. However, this needs more than one species... Any hints are highly appreciated! Best regards Johannes Penner Museum für Naturkunde (AG Herpetologie) Invalidenstrasse 43 D-10115 Berlin Tel: +49 (0)30 2093 8708 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Conflict between gam::gam and mgcv::gam
I am trying to compare and contrast the smoothing in the {mgcv} version of gam vs. the {gam} version of gam but I get a strange side effects when I try to alternate calls to these routines, even though I detach and unload namespaces. Specifically when I start up R the following code runs successfully until the last line i.e. plot(g4,se=TRUE) when I get Error in dim(data) - dim : attempt to set an attribute on NULL. On subsequent attempts to plot any gam::gam object I get the same result. The only way I can get gam:gam working again is to close down R and restart it. x=runif(1:20) y=runif(1:20) library(gam) g1= gam::gam( y ~ s(x,5) ) plot(g1,se=TRUE) detach(package:gam,unload=TRUE) detach(package:splines,unload=TRUE) library(gam) g2= gam::gam( y ~ s(x,5) ) plot(g2,se=TRUE) detach(package:gam,unload=TRUE) detach(package:splines,unload=TRUE) library(mgcv) g3 = mgcv::gam( y ~ s(x) ) plot(g3) detach(package:mgcv,unload=TRUE) library(gam) g4= gam::gam( y ~ s(x,5) ) plot(g4,se=TRUE) Julian Shaw Notice: This e-mail message is intended only for the named recipient(s) above. It may contain confidential information that is proprietary or privileged. If you are not the intended recipient, you are hereby notified that any use, dissemination, distribution or copying of this e-mail and any attachment(s) is strictly prohibited. Please immediately notify the sender by replying to this e-mail and delete the message and any attachment(s) from your system. Permal archives and reviews outgoing and incoming e-mail. It may be produced at the request of regulators or in connection with civil litigation. Permal accepts no liability for any errors or omissions arising as a result of transmission. Any proposals, offers or other potential terms described or referred to in this message are subject to contract and shall not be binding on any member of the Permal Group, or any affiliate thereof, unless and until documented in a written agreement executed by all necessary parties by their duly a! uthorized representative(s). Past performance is not indicative of future results. This material is not an offer or a solicitation to subscribe for any Permal fund. Permal Investment Management Services Limited is authorized and regulated by the Financial Services Authority in the UK and is regulated by the Dubai Financial Services Authority. Permal (Singapore) Pte. Limited, Company Registration Number 200711607C, is regulated by the Monetary Authority of Singapore. Permal (Japan) K.K., Registration Number: Director General of Kanto Local Finance Bureau (KLFB) (FlEA) No. 2335, is authorized by the KLFB in Japan as a non-discretionary investment adviser. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] XYPLOT - GROUPING WITH TWO CATEGORICAL VARIABLES
Hi! I have a dataframe like this: dat=data.frame(Age=c(rep(30,8),rep(40,8),rep(50,8)),Period=rep(seq(2005,2008,1),3),Rate=c(seq(1,8,1),seq(9,16,1),seq(17,24,1)),Sex=rep(c(rep(0,4),rep(1,4)),3))attach(dat)dat Age Period Rate Sex1 30 2005 1 02 30 2006 2 03 30 2007 3 04 30 2008 4 05 30 2005 5 16 30 2006 6 17 30 2007 7 18 30 2008 8 19 40 2005 9 010 40 2006 10 011 40 2007 11 012 40 2008 12 013 40 2005 13 114 40 2006 14 115 40 2007 15 116 40 2008 16 117 50 2005 17 018 50 2006 18 019 50 2007 19 020 50 2008 20 021 50 2005 21 122 50 2006 22 123 50 2007 23 124 50 2008 24 1 And I can do these separated graphs by sex: xyplot(Rate ~ Period, data=subset(dat,Sex==0), groups = Age, type = b, auto.key = list(cex=0.8,border=TRUE,size=3,cex.title=1,space = right, title=Age, points = FALSE , lines = TRUE), xlab = Period, ylab = Rate,ylim=c(0,25), main = Mortality Rates for 10 inhabitants - Men, scales=list(tck=c(1,0)) )xyplot(Rate ~ Period, data=subset(dat,Sex==1), groups = Age, type = b, auto.key = list(cex=0.8,border=TRUE,size=3,cex.title=1,space = right, title=Age, points = FALSE , lines = TRUE), xlab = Period, ylab = Rate,ylim=c(0,25), main = Mortality Rates for 10 inhabitants - Women, scales=list(tck=c(1,0)) ) BUT I WANT THEM IN THE SAME GRAPH, IN THE SAME PANEL, TO BE COMPARED, FOR EXAMPLE WITH THE SAME COLOUR FOR THE SAME AGE GROUPS,AND A COMPLETE LINE FOR SEX=0 AND A DOTTED LINE FOR SEX=1 I TRIED DIFFERENT THINGS BUT NOTHING WORKED FOR ME. I USED PANEL OPTIONS, OR XY.SUPERPOSE, BUT I DON`T KNOW HOW TO USE THOSE THINGS PROPERLY. THANK YOU!!! MARCOS (ARGENTINA) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] a numeric problem
### An numeric problem in R ###I have two matrix one is## A - matrix(c(21.97844, 250.1960, 2752.033, 29675.88, 316318.4, 3349550, 35336827, 24.89267, 261.4211, 2691.009, 27796.02, 288738.7, 3011839, 31498784, 21.80384, 232.3765, 2460.495, 25992.77, 274001.6, 2883756, 30318645, 39.85801, 392.2341, 3971.349, 40814.22, 423126.2, 4409388, 46090850, 25.55343, 235.5315, 2343.281, 23987.10, 248557.3, 2590821, 27089162, 16.63100, 195.4592, 2200.264, 24006.66, 257499.5, 2736200, 28922851, 23.20258, 262.6086, 2824.971, 29973.61, 316369.1, 3331009, 35025965 ),7,7,byrow=T) the other one is## S-matrix(c(356.57205, 32.55943, 120.28162, 95.74285, 45.013261, 271.557635, 183.9009, 32.55943, 299.68103, 81.95644, 246.95280, 96.023270, 19.383732, 153.4544, 120.28162, 81.95644, 277.09354, 85.42180, 138.751600, 138.972234, 140.0991, 95.74285, 246.95280, 85.42180, 527.78444, 182.417527, 24.946245, 129.1490, 45.01326, 96.02327, 138.75160, 182.41753, 328.414655, 1.035543, 63.2730, 271.55764, 19.38373, 138.97223, 24.94625, 1.035543, 322.635669, 158.5317, 183.90092, 153.45443, 140.09909, 129.14899, 63.273005, 158.531662, 256.1531 ),7,7,byrow=T) both of these two matrix are non-singular so their inverse should be exists## while KK - t(A)%*%solve(S)%*%A det(KK) ###12553.48 which is not 0, but solve() function refuse to work now. ## So I try to use ginv() instead### ginv(KK) ### It is expected that A%*%ginv(t(A)%*%solve(S)%*%A)%*%t(A) ### should equal to solve(S)## solve(S) ### but it is not!!! ### So I am wondering in such situation, do you have any suggestion? ### I have tried the argument tol = sqrt(.Machine$double.eps)) in ginv(), but it won't help in ### more larger determinant matrix. -- View this message in context: http://r.789695.n4.nabble.com/a-numeric-problem-tp3338989p3338989.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] attr question
On 07/03/2011 12:11 AM, Erin Hodgess wrote: Dear R People: When I want to produce a small sample confidence interval using t.test, I get the following: t.test(buzz$var1, conf.level=.98)$conf.int [1] 2.239337 4.260663 attr(,conf.level) [1] 0.98 How do I keep the attr statement from printing, please? I'm sure it's something really simple. The as.numeric() function strips attributes, so as.numeric(t.test(buzz$var1, conf.level=.98)$conf.int) should work. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] a numeric problem
On 07/03/2011 8:46 AM, baicaidoufu wrote: ### An numeric problem in R ###I have two matrix one is## A- matrix(c(21.97844, 250.1960, 2752.033, 29675.88, 316318.4, 3349550, 35336827, 24.89267, 261.4211, 2691.009, 27796.02, 288738.7, 3011839, 31498784, 21.80384, 232.3765, 2460.495, 25992.77, 274001.6, 2883756, 30318645, 39.85801, 392.2341, 3971.349, 40814.22, 423126.2, 4409388, 46090850, 25.55343, 235.5315, 2343.281, 23987.10, 248557.3, 2590821, 27089162, 16.63100, 195.4592, 2200.264, 24006.66, 257499.5, 2736200, 28922851, 23.20258, 262.6086, 2824.971, 29973.61, 316369.1, 3331009, 35025965 ),7,7,byrow=T) the other one is## S-matrix(c(356.57205, 32.55943, 120.28162, 95.74285, 45.013261, 271.557635, 183.9009, 32.55943, 299.68103, 81.95644, 246.95280, 96.023270, 19.383732, 153.4544, 120.28162, 81.95644, 277.09354, 85.42180, 138.751600, 138.972234, 140.0991, 95.74285, 246.95280, 85.42180, 527.78444, 182.417527, 24.946245, 129.1490, 45.01326, 96.02327, 138.75160, 182.41753, 328.414655, 1.035543, 63.2730, 271.55764, 19.38373, 138.97223, 24.94625, 1.035543, 322.635669, 158.5317, 183.90092, 153.45443, 140.09909, 129.14899, 63.273005, 158.531662, 256.1531 ),7,7,byrow=T) both of these two matrix are non-singular so their inverse should be exists## while KK- t(A)%*%solve(S)%*%A det(KK) ###12553.48 which is not 0, but solve() function refuse to work now. Look at the eigenvalues of KK: they range from 10^(-7) to 10^12. Its condition number (according to the rcond definition) is about 10^(-21). If you could invert it, you would just be seeing noise from rounding error. So the best advice I can give is: don't do that. But if you really don't care about the quality of your results, you could use solve(A) %*% S %*% t(solve(A)) to avoid the error messages. If you multiply that by KK you'll get something that's a long way from an identity matrix. Duncan Murdoch ## So I try to use ginv() instead### ginv(KK) ### It is expected that A%*%ginv(t(A)%*%solve(S)%*%A)%*%t(A) ### should equal to solve(S)## solve(S) ### but it is not!!! ### So I am wondering in such situation, do you have any suggestion? ### I have tried the argument tol = sqrt(.Machine$double.eps)) in ginv(), but it won't help in ### more larger determinant matrix. -- View this message in context: http://r.789695.n4.nabble.com/a-numeric-problem-tp3338989p3338989.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] XYPLOT - GROUPING WITH TWO CATEGORICAL VARIABLES
On Mar 7, 2011, at 8:37 AM, Marcos Prunello wrote: Hi! I have a dataframe like this: dat = data .frame (Age=c(rep(30,8),rep(40,8),rep(50,8)),Period=rep(seq(2005,2008,1), 3 ),Rate =c(seq(1,8,1),seq(9,16,1),seq(17,24,1)),Sex=rep(c(rep(0,4),rep(1,4)), 3))attach(dat)dat Age Period Rate Sex1 30 20051 02 30 20062 03 30 20073 04 30 20084 05 30 20055 16 30 20066 17 30 20077 18 30 20088 19 40 20059 010 40 2006 10 011 40 2007 11 012 40 2008 12 013 40 2005 13 114 40 2006 14 115 40 2007 15 116 40 2008 16 117 50 2005 17 018 50 2006 18 019 50 2007 19 020 50 2008 20 021 50 2005 21 122 50 2006 22 123 50 2007 23 124 50 2008 24 1 And I can do these separated graphs by sex: xyplot(Rate ~ Period, data=subset(dat,Sex==0), groups = Age, type = b, auto.key = list(cex=0.8,border=TRUE,size=3,cex.title=1,space = right, title=Age, points = FALSE , lines = TRUE), xlab = Period, ylab = Rate,ylim=c(0,25), main = Mortality Rates for 10 inhabitants - Men, scales=list(tck=c(1,0)) )xyplot(Rate ~ Period, data=subset(dat,Sex==1), groups = Age, type = b, auto.key = list(cex=0.8,border=TRUE,size=3,cex.title=1,space = right, title=Age, points = FALSE , lines = TRUE), xlab = Period, ylab = Rate,ylim=c(0,25), main = Mortality Rates for 10 inhabitants - Women, scales=list(tck=c(1,0)) ) BUT I WANT THEM IN THE SAME GRAPH, IN THE SAME PANEL, TO BE COMPARED, FOR EXAMPLE WITH THE SAME COLOUR FOR THE SAME AGE GROUPS,AND A COMPLETE LINE FOR SEX=0 AND A DOTTED LINE FOR SEX=1 I TRIED DIFFERENT THINGS BUT NOTHING WORKED FOR ME. I USED PANEL OPTIONS, OR XY.SUPERPOSE, BUT I DON`T KNOW HOW TO USE THOSE THINGS PROPERLY. You may need to adjust your netiquette. All caps is considered shouting (not to mention more difficult to read). Try: xyplot(Rate ~ Period, data=dat, groups = interaction(Age, Sex), type = b, auto.key = list(cex=0.8,border=TRUE,size=3,cex.title=1, space = right, title=Age.Sex, points = FALSE , lines = TRUE), xlab = Period, ylab = Rate,ylim=c(0,25), main = Mortality Rates for 10 inhabitants - Women Men, scales=list(tck=c(1,0)) ) THANK YOU!!! MARCOS (ARGENTINA) [[alternative HTML version deleted]] You should also learn to post in plain text. That way you linefeeds won't disappear. -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] rowSums - am I getting something wrong?
Hi Thomas, Several of us explained this in different ways just last week, so you might search the archive. Floating point numbers are an approximate representation of real numbers. Things that can be expressed exactly in powers of 10 can't be expressed exactly in powers of 2. So the sum 0.6+0.3+0.1 is NOT clearly 1.0. You can use signif and round to overcome this a = seq(0,1,0.1) a [1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 a[7]-0.6 [1] 1.110223e-16 1-(a[4]+a[7]+a[2]) [1] -2.220446e-16 b = rev(seq(1,0,-0.1)) b [1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 a-b [1] 0.00e+00 2.775558e-17 5.551115e-17 1.110223e-16 1.110223e-16 [6] 0.00e+00 1.110223e-16 1.110223e-16 0.00e+00 0.00e+00 [11] 0.00e+00 round(a-b,10) [1] 0 0 0 0 0 0 0 0 0 0 0 round(a,10)-round(b,10) [1] 0 0 0 0 0 0 0 0 0 0 0 The first commandment of floating point programming is THOU SHALT NOT TEST WHETHER TWO FP NUMBERS ARE EQUAL HTH Rex -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of thomas.salve...@syngenta.com Sent: Monday, March 07, 2011 2:09 AM To: r-help@r-project.org Subject: [R] rowSums - am I getting something wrong? I am trying to construct a data set with some sequences for example: a = seq(0,1,0.1) m = matrix(nrow = 1331, ncol = 3) m[,1] = rep(a,121) m[,2] = rep(a,11,each = 11) m[,3] = rep(a,1,each = 121) I realize that there may be better ways of doing this, but this approach demonstrates the problem I'm having. I then want to get the sum of the rows and delete any row with a sum of greater than 1. But have a problem with rows containing any combination of the values 0.6, 0.3 and 0.1 as the sum of these is clearly 1, but a request for which rows have a sum greater than 1 will return rows with these values. Row 161 is the first row containing these values: [161,] 0.6 0.3 0.1 which(rowSum(m)1) [53] 119 120 121 132 142 143 152 153 154 161 162 As far as I can tell this only affects combinations of 0.6, 0.3 and 0.1 (though I haven't checked every value in the matrix) If I try the following: q=rowSums(m) which(q1) [53] 119 120 121 132 142 143 152 153 154 161 162 But if I add and subtract 1 from this: q=q+1 q=q-1 which(q1) [53] 119 120 121 132 142 143 152 153 154 162 What exactly is going on here? I don't have the problem with other combinations (eg 0.7, 0.2, 0.1). I assume that there is something about the data format that I don't understand, but if I make a data frame of the matrix I found the same effect. Any help would be great Tom message may contain confidential information. If you are not the designated recipient, please notify the sender immediately, and delete the original and any copies. Any use of the message by you is prohibited. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. message may contain confidential information. If you are not the designated recipient, please notify the sender immediately, and delete the original and any copies. Any use of the message by you is prohibited. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Re transaction list transformation to use rpart.
However the as.data.frame(a) transforms the matrix into a numeric data.frame so when I implement the rpart algorithm it automatically returns a regression classification tree. Look at help(rpart). The program uses the type of the y variable to GUESS at what you want for the method argument. It often guesses wrong. Simply add method=class as an argument to your call. Terry Therneau __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] null model for a single species?
Date: Mon, 7 Mar 2011 14:15:49 +0100 From: johannes.pen...@mfn-berlin.de To: r-help@r-project.org Subject: [R] null model for a single species? Dear List members, I would like to test whether an observed occupancy of lakes in a landscape has occurred randomly (by chance) or not. How can I do that? The problem is that it concerns only a single species and I would like to use binary data only. At first I thought of generating null models and test the observed occupancy against the randomly generated one. However, this needs more than one species... [[elided Hotmail spam]] Try a specialized list for eco or whatever applies. I'm not sure it would be obvious to many people here what you are trying to test or why you need another speicies. There are probably well known approaches that are just assumed among specialists. To refer to something in a set of obervations as an effect of an identifiable cause as opposed to statstical fluctuation presumbly requires some notion of the features of each so you could make a testable null hypothesis. Even noise has a cause, it is just not something you care about. Best regards Johannes Penner __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Teaching R: To quote, or not to quote?
Hi All, When I teach an intro workshop on R, I've been minimizing quote confusion by always using quotes around package names in function calls. For example: install.packages(Hmisc) update.packages(Hmisc) library(Hmisc) citation(Hmisc) search() # displays package names in quotes detach(packages:Hmisc) # just as search displayed it all look consistent with quotes. They're optional, of course, with library and detach and I tell them that. But for beginners, it's hard to remember when they don't need quotes. This perspective continues with function names in help: help(mean) ?mean help(if) ?if which avoids the fact that some important topics like control-flow words (e.g. help(if) ) generate error messages without the quotes. For help, the quotes make the string a topic instead of a name, but that doesn't seem to block it from finding function names in quotes. I'm about to go to press with the second edition of R for SAS and SPSS Users I'm wondering if there's a downside to this. No other books I've seen use library(package) or help(function) consistently. Is there a reason I should avoid it? Thanks, Bob = Bob Muenchen (pronounced Min'-chen), Manager Research Computing Support Voice: (865) 974-5230 Email: muenc...@utk.edu Web: http://oit.utk.edu/research, News: http://oit.utk.edu/research/news.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Teaching R: To quote, or not to quote?
On 07/03/2011 9:52 AM, Muenchen, Robert A (Bob) wrote: Hi All, When I teach an intro workshop on R, I've been minimizing quote confusion by always using quotes around package names in function calls. For example: install.packages(Hmisc) update.packages(Hmisc) library(Hmisc) citation(Hmisc) search() # displays package names in quotes detach(packages:Hmisc) # just as search displayed it all look consistent with quotes. They're optional, of course, with library and detach and I tell them that. But for beginners, it's hard to remember when they don't need quotes. This perspective continues with function names in help: help(mean) ?mean help(if) ?if which avoids the fact that some important topics like control-flow words (e.g. help(if) ) generate error messages without the quotes. For help, the quotes make the string a topic instead of a name, but that doesn't seem to block it from finding function names in quotes. I'm about to go to press with the second edition of R for SAS and SPSS Users I'm wondering if there's a downside to this. No other books I've seen use library(package) or help(function) consistently. Is there a reason I should avoid it? The only reasons I can think to avoid that recommendation is that people might find typing unnecessary quotes to be irritating and they might be confused when they see unquoted usage elsewhere. Those aren't particularly strong reasons... Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Array Help
abind works well for this example. a1 - array(1:18, dim=c(3,3,2)) a2 - array(101:150, dim=c(5,5,2)) a1b - abind(a1, array(400, dim=c(2,3,2)), along=1) a1c - abind(a1b, array(500, dim=c(5,2,2)), along=2) dim(a1c) a1 a2 a1c On Mon, Mar 7, 2011 at 6:54 AM, Usman Munir usman.muni...@googlemail.comwrote: Hi, I have two 3 D arrays. Both are of this form array_1- array[n,n,k] array_2-array[m,m,k] Lets say n=83 and m=80 Since nm. I would like to add rows and columns to array_2 to make them equal. I want to keep the size of the third dimension fixed i.e.. k. i.e. if (nrow(array_1)nrow(array_2)) { array_2[m:n,m:n,]- 10^6 } But this doesn't work. I tried abind and rbind but it didn't work either. Any help/tips will be appreciated! Thanks, Usman [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Sweave with scan()-ed data
In an Sweave slide, I want to use sem::read.moments() and sem::specify.model(), which work by using scan() to read the following lines, up to the first blank line. However, Sweave throws an error: Sweave(sem-thurstone.Rnw) Writing to file sem-thurstone.tex Processing code chunks ... 1 : term hide (label=arrests-setup) 2 : echo term hide (label=thurstone-data) Error: chunk 2 (label=thurstone-data) Error in sem-thurstone.Rnw:43:12: unexpected numeric constant 42: .828 43: .776 .779 ^ Is there some switch or option for Sweave that will make this work? Below is the slide in question in an executable example: \documentclass[dvipsnames,pdflatex,compress,beamer]{beamer} \usepackage{Sweave} \definecolor{Sinput}{rgb}{1,0,0} \definecolor{Scode}{rgb}{0,0,0.56} \definecolor{Soutput}{rgb}{0,0,1} \DefineVerbatimEnvironment{Sinput}{Verbatim}{formatcom={\color{Sinput}},fontsize=\footnotesize,baselinestretch=0.9} \DefineVerbatimEnvironment{Soutput}{Verbatim}{formatcom={\color{Soutput}},fontsize=\footnotesize,baselinestretch=0.85} \DefineVerbatimEnvironment{Scode}{Verbatim}{formatcom={\color{Scode}},fontsize=\small} \begin{document} \SweaveOpts{engine=R,height=6,width=6,results=hide,fig=FALSE,echo=TRUE} \SweaveOpts{prefix.string=fig/sem} \section{sem package: Second-order CFA, Thurstone data} \begin{frame}[fragile] \frametitle{sem package: Second-order CFA, Thurstone data} \framesubtitle{Data} Data on 9 ability variables: thurstone-data, echo=TRUE= R.thur - read.moments(diag=FALSE, names=c('Sentences','Vocabulary', 'Sent.Completion','First.Letters','4.Letter.Words','Suffixes', 'Letter.Series','Pedigrees', 'Letter.Group')) .828 .776 .779 .439 .493.46 .432 .464.425 .674 .447 .489.443 .59.541 .447 .432.401 .381.402 .288 .541 .537.534 .35.367 .32 .555 .38 .358.359 .424.446 .325 .598 .452 @ \end{frame} \end{document} -- Michael Friendly Email: friendly AT yorku DOT ca Professor, Psychology Dept. York University Voice: 416 736-5115 x66249 Fax: 416 736-5814 4700 Keele StreetWeb: http://www.datavis.ca Toronto, ONT M3J 1P3 CANADA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Sweave with scan()-ed data
On 07/03/2011 10:21 AM, Michael Friendly wrote: In an Sweave slide, I want to use sem::read.moments() and sem::specify.model(), which work by using scan() to read the following lines, up to the first blank line. However, Sweave throws an error: Sweave(sem-thurstone.Rnw) Writing to file sem-thurstone.tex Processing code chunks ... 1 : term hide (label=arrests-setup) 2 : echo term hide (label=thurstone-data) Error: chunk 2 (label=thurstone-data) Error in sem-thurstone.Rnw:43:12: unexpected numeric constant 42: .828 43: .776 .779 ^ Is there some switch or option for Sweave that will make this work? I don't think so. The way Sweave works is not to pipe the code chunks into a console-like evaluator, it's to parse the whole code chunk, then evaluate the expressions one by one. So you can probably fake the behaviour by telling read.moments to read from somewhere else and showing different code than you really executed on the slide, but I don't think there's a way to honestly do what you want. You might be able to automate this, i.e. to write code that source()'s a file and echos the right tex code to make it look as though it was entered at the command line, but it would be messy. Duncan Murdoch Below is the slide in question in an executable example: \documentclass[dvipsnames,pdflatex,compress,beamer]{beamer} \usepackage{Sweave} \definecolor{Sinput}{rgb}{1,0,0} \definecolor{Scode}{rgb}{0,0,0.56} \definecolor{Soutput}{rgb}{0,0,1} \DefineVerbatimEnvironment{Sinput}{Verbatim}{formatcom={\color{Sinput}},fontsize=\footnotesize,baselinestretch=0.9} \DefineVerbatimEnvironment{Soutput}{Verbatim}{formatcom={\color{Soutput}},fontsize=\footnotesize,baselinestretch=0.85} \DefineVerbatimEnvironment{Scode}{Verbatim}{formatcom={\color{Scode}},fontsize=\small} \begin{document} \SweaveOpts{engine=R,height=6,width=6,results=hide,fig=FALSE,echo=TRUE} \SweaveOpts{prefix.string=fig/sem} \section{sem package: Second-order CFA, Thurstone data} \begin{frame}[fragile] \frametitle{sem package: Second-order CFA, Thurstone data} \framesubtitle{Data} Data on 9 ability variables: thurstone-data, echo=TRUE= R.thur- read.moments(diag=FALSE, names=c('Sentences','Vocabulary', 'Sent.Completion','First.Letters','4.Letter.Words','Suffixes', 'Letter.Series','Pedigrees', 'Letter.Group')) .828 .776 .779 .439 .493.46 .432 .464.425 .674 .447 .489.443 .59.541 .447 .432.401 .381.402 .288 .541 .537.534 .35.367 .32 .555 .38 .358.359 .424.446 .325 .598 .452 @ \end{frame} \end{document} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] attr question
One way would be to wrap it in as.vector() as.vector( t.test(rnorm(5),rnorm(5))$conf.int ) [1] -0.9718231 1.2267976 -Don On 3/6/11 9:11 PM, Erin Hodgess erinm.hodg...@gmail.com wrote: Dear R People: When I want to produce a small sample confidence interval using t.test, I get the following: t.test(buzz$var1, conf.level=.98)$conf.int [1] 2.239337 4.260663 attr(,conf.level) [1] 0.98 How do I keep the attr statement from printing, please? I'm sure it's something really simple. Thanks, Sincerely, Erin -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] attr question
On Mar 7, 2011, at 10:44 AM, MacQueen, Don wrote: One way would be to wrap it in as.vector() as.vector( t.test(rnorm(5),rnorm(5))$conf.int ) [1] -0.9718231 1.2267976 Or even c(): c( t.test(rnorm(5),rnorm(5))$conf.int ) [1] -1.055843 1.742806 -Don On 3/6/11 9:11 PM, Erin Hodgess erinm.hodg...@gmail.com wrote: Dear R People: When I want to produce a small sample confidence interval using t.test, I get the following: t.test(buzz$var1, conf.level=.98)$conf.int [1] 2.239337 4.260663 attr(,conf.level) [1] 0.98 How do I keep the attr statement from printing, please? I'm sure it's something really simple. Thanks, Sincerely, Erin -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] species projected in a ordiplot
Dear all, I'm performing a detrended correspondence analysis on vascular plant community data (296 species), and I have a question on the species scores projected in the ordination diagram. When I run a ordiplot all species are projected in the output graph, but I'd like to restrict the number of species plotted in the final graph. Some species are so rare in the data, that no relevant information is provided on their ecological preferences, and other species might not characterize well the analysis, then I'd like to select these species with highest fit state. Any help would be much appreciated. Giving thanks in advance Elisa Elisa Santi Department of Environmental Sciences University of Siena Italy __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Preferred way to create bubble plots?
I'm not sure if this exactly what you need but it is a good introduction. and a great website to boot. http://flowingdata.com/2010/11/23/how-to-make-bubble-charts/ John [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Sweave with scan()-ed data
At 10:21 AM -0500 3/7/11, Michael Friendly wrote: In an Sweave slide, I want to use sem::read.moments() and sem::specify.model(), which work by using scan() to read the following lines, up to the first blank line. However, Sweave throws an error: Sweave(sem-thurstone.Rnw) Writing to file sem-thurstone.tex Processing code chunks ... 1 : term hide (label=arrests-setup) 2 : echo term hide (label=thurstone-data) Error: chunk 2 (label=thurstone-data) Error in sem-thurstone.Rnw:43:12: unexpected numeric constant 42: .828 43: .776 .779 ^ Is there some switch or option for Sweave that will make this work? Below is the slide in question in an executable example: \documentclass[dvipsnames,pdflatex,compress,beamer]{beamer} \usepackage{Sweave} \definecolor{Sinput}{rgb}{1,0,0} \definecolor{Scode}{rgb}{0,0,0.56} \definecolor{Soutput}{rgb}{0,0,1} \DefineVerbatimEnvironment{Sinput}{Verbatim}{formatcom={\color{Sinput}},fontsize=\footnotesize,baselinestretch=0.9} \DefineVerbatimEnvironment{Soutput}{Verbatim}{formatcom={\color{Soutput}},fontsize=\footnotesize,baselinestretch=0.85} \DefineVerbatimEnvironment{Scode}{Verbatim}{formatcom={\color{Scode}},fontsize=\small} \begin{document} \SweaveOpts{engine=R,height=6,width=6,results=hide,fig=FALSE,echo=TRUE} \SweaveOpts{prefix.string=fig/sem} \section{sem package: Second-order CFA, Thurstone data} \begin{frame}[fragile] \frametitle{sem package: Second-order CFA, Thurstone data} \framesubtitle{Data} Data on 9 ability variables: thurstone-data, echo=TRUE= R.thur - read.moments(diag=FALSE, names=c('Sentences','Vocabulary', 'Sent.Completion','First.Letters','4.Letter.Words','Suffixes', 'Letter.Series','Pedigrees', 'Letter.Group')) .828 .776 .779 .439 .493.46 .432 .464.425 .674 .447 .489.443 .59.541 .447 .432.401 .381.402 .288 .541 .537.534 .35.367 .32 .555 .38 .358.359 .424.446 .325 .598 .452 @ \end{frame} \end{document} At 10:31 AM -0500 3/7/11, Duncan Murdoch wrote: On 07/03/2011 10:21 AM, Michael Friendly wrote: In an Sweave slide, I want to use sem::read.moments() and sem::specify.model(), which work by using scan() to read the following lines, up to the first blank ...snip... 42: .828 43: .776 .779 ^ Is there some switch or option for Sweave that will make this work? I don't think so. The way Sweave works is not to pipe the code chunks into a console-like evaluator, it's to parse the whole code chunk, then evaluate the expressions one by one. So you can probably fake the behaviour by telling read.moments to read from somewhere else and showing different code than you really executed on the slide, but I don't think there's a way to honestly do what you want. You might be able to automate this, i.e. to write code that source()'s a file and echos the right tex code to make it look as though it was entered at the command line, but it would be messy. Duncan Murdoch Below is the slide in question in an executable example: \documentclass[dvipsnames,pdflatex,compress,beamer]{beamer} \usepackage{Sweave} ...snip... .541 .537.534 .35.367 .32 .555 .38 .358.359 .424.446 .325 .598 .452 @ \end{frame} \end{document} Not a Sweave solution, but that data set is available in psych: library(psych) data(bifactor) Thurstone Bill __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Replace for loop when vector calling itself
Hope this clarifies my Q. Creating a vector where each element is (except the first which is 0) is: the previous element + a calculation from another vector theoP[i] - theoP[i-1] I could not figure out how to do this without a for loop, as the vector had to reference itself for the next element...I am missing something obvious, but not too sure what. d = vector(mode = numeric, length= len) d[1] = 0 if (len1) for (i in 2:len) { d[i] = d[i-1] + theoP[i] - theoP[i-1] } Thanks, Chris Hi, I am missing something obvious. Need to create vector as: (0, i-1 + TheoP(i) - TheoP(i-1), repeat) Where i is the index position in the vector and i[1] is always 0. I think your prototype is not agreeing with the code below. Is i suppose to be the index (as suggested above) or the prior term (as implied below)? Found myself having to use a For Loop because I could not get sapply working. Any suggestions ? Assuming the code below, you construct the first three or four values by hand I think you will find that the intermediate values of TheoP will have alternating signs. term2 = 2-1 + TheoP(2) - TheoP(1) term3 = 3-1 + TheoP(3) - (2-1 + TheoP(2) - TheoP(1)) term4 = 4-1 + TheoP(4) - (3-1 + TheoP(3) - (2-1 + TheoP(2) - TheoP(1)) ) The answer to the first question will determine how you proceed. If the index is being used then there are two series of cumulative sums and perhaps you can construct an expression that can be fed to the cumsum function. The diff function is also available and if the index version is correct, then it might even be as simple as c(0, ((1:len)-1)+diff(TheoP) ) So clarify what is intended. -- David. delta - function(x) { start = index[x] end = index[x+1] - 1 iTheo = start:end len = length(iTheo) theoP = as.numeric(TheoBA$Bid[iTheo]) d = vector(mode = numeric, length= len) d[1] = 0 if (len1) for (i in 2:len) { d[i] = d[i-1] + theoP[i] - theoP[i-1] } return(d) } Thanks, Chris -- -- View this message in context: http://r.789695.n4.nabble.com/Replace-for-loop-when-vector-calling-itself-tp3338383p3339351.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Replace for loop when vector calling itself
On Mar 7, 2011, at 11:12 AM, rivercode wrote: Hope this clarifies my Q. Creating a vector where each element is (except the first which is 0) is: the previous element + a calculation from another vector theoP[i] - theoP[i-1] I could not figure out how to do this without a for loop, as the vector had to reference itself for the next element...I am missing something obvious, but not too sure what. d = vector(mode = numeric, length= len) d[1] = 0 if (len1) for (i in 2:len) { d[i] = d[i-1] + theoP[i] - theoP[i-1] } so why not: c(0, cumsum(diff(theoP)) ) theoP - sample(1:10, 15, replace=TRUE); len=length(theoP); d = vector(mode = numeric, length= len) d[1] = 0 if (len1) for (i in 2:len) { d[i] = d[i-1] + theoP[i] - theoP[i-1] } d [1] 0 1 -6 -5 -4 0 1 0 1 -4 3 -5 -5 1 -3 c(0, cumsum(diff(theoP)) ) [1] 0 1 -6 -5 -4 0 1 0 1 -4 3 -5 -5 1 -3 all.equal(d, c(0, cumsum(diff(theoP)) ) ) [1] TRUE -- David. Thanks, Chris Hi, I am missing something obvious. Need to create vector as: (0, i-1 + TheoP(i) - TheoP(i-1), repeat) Where i is the index position in the vector and i[1] is always 0. I think your prototype is not agreeing with the code below. Is i suppose to be the index (as suggested above) or the prior term (as implied below)? Found myself having to use a For Loop because I could not get sapply working. Any suggestions ? Assuming the code below, you construct the first three or four values by hand I think you will find that the intermediate values of TheoP will have alternating signs. term2 = 2-1 + TheoP(2) - TheoP(1) term3 = 3-1 + TheoP(3) - (2-1 + TheoP(2) - TheoP(1)) term4 = 4-1 + TheoP(4) - (3-1 + TheoP(3) - (2-1 + TheoP(2) - TheoP(1)) ) The answer to the first question will determine how you proceed. If the index is being used then there are two series of cumulative sums and perhaps you can construct an expression that can be fed to the cumsum function. The diff function is also available and if the index version is correct, then it might even be as simple as c(0, ((1:len)-1)+diff(TheoP) ) So clarify what is intended. -- David. delta - function(x) { start = index[x] end = index[x+1] - 1 iTheo = start:end len = length(iTheo) theoP = as.numeric(TheoBA$Bid[iTheo]) d = vector(mode = numeric, length= len) d[1] = 0 if (len1) for (i in 2:len) { d[i] = d[i-1] + theoP[i] - theoP[i-1] } return(d) } Thanks, Chris -- -- View this message in context: http://r.789695.n4.nabble.com/Replace-for-loop-when-vector-calling-itself-tp3338383p3339351.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Teaching R: To quote, or not to quote?
On 07.03.2011 16:17, Duncan Murdoch wrote: On 07/03/2011 9:52 AM, Muenchen, Robert A (Bob) wrote: Hi All, When I teach an intro workshop on R, I've been minimizing quote confusion by always using quotes around package names in function calls. For example: install.packages(Hmisc) update.packages(Hmisc) library(Hmisc) citation(Hmisc) search() # displays package names in quotes detach(packages:Hmisc) # just as search displayed it all look consistent with quotes. They're optional, of course, with library and detach and I tell them that. But for beginners, it's hard to remember when they don't need quotes. This perspective continues with function names in help: help(mean) ?mean help(if) ?if which avoids the fact that some important topics like control-flow words (e.g. help(if) ) generate error messages without the quotes. For help, the quotes make the string a topic instead of a name, but that doesn't seem to block it from finding function names in quotes. I'm about to go to press with the second edition of R for SAS and SPSS Users I'm wondering if there's a downside to this. No other books I've seen use library(package) or help(function) consistently. I hope at least my German / Japanese one does (but I'd need to parse it myself). As you said, it is more consistent to use quotes here and over the years, I tried to change all my course material to show versions with quotes only. Uwe Is there a reason I should avoid it? The only reasons I can think to avoid that recommendation is that people might find typing unnecessary quotes to be irritating and they might be confused when they see unquoted usage elsewhere. Those aren't particularly strong reasons... Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Replace for loop when vector calling itself
Look at the filter() function, which can do recursive and convolutional filtering. cumsum() and diff(), respectively, are special cases of recursive and convolutional filtering and cumsum() may be enough in your case. Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of rivercode Sent: Monday, March 07, 2011 8:12 AM To: r-help@r-project.org Subject: Re: [R] Replace for loop when vector calling itself Hope this clarifies my Q. Creating a vector where each element is (except the first which is 0) is: the previous element + a calculation from another vector theoP[i] - theoP[i-1] I could not figure out how to do this without a for loop, as the vector had to reference itself for the next element...I am missing something obvious, but not too sure what. d = vector(mode = numeric, length= len) d[1] = 0 if (len1) for (i in 2:len) { d[i] = d[i-1] + theoP[i] - theoP[i-1] } Thanks, Chris Hi, I am missing something obvious. Need to create vector as: (0, i-1 + TheoP(i) - TheoP(i-1), repeat) Where i is the index position in the vector and i[1] is always 0. I think your prototype is not agreeing with the code below. Is i suppose to be the index (as suggested above) or the prior term (as implied below)? Found myself having to use a For Loop because I could not get sapply working. Any suggestions ? Assuming the code below, you construct the first three or four values by hand I think you will find that the intermediate values of TheoP will have alternating signs. term2 = 2-1 + TheoP(2) - TheoP(1) term3 = 3-1 + TheoP(3) - (2-1 + TheoP(2) - TheoP(1)) term4 = 4-1 + TheoP(4) - (3-1 + TheoP(3) - (2-1 + TheoP(2) - TheoP(1)) ) The answer to the first question will determine how you proceed. If the index is being used then there are two series of cumulative sums and perhaps you can construct an expression that can be fed to the cumsum function. The diff function is also available and if the index version is correct, then it might even be as simple as c(0, ((1:len)-1)+diff(TheoP) ) So clarify what is intended. -- David. delta - function(x) { start = index[x] end = index[x+1] - 1 iTheo = start:end len = length(iTheo) theoP = as.numeric(TheoBA$Bid[iTheo]) d = vector(mode = numeric, length= len) d[1] = 0 if (len1) for (i in 2:len) { d[i] = d[i-1] + theoP[i] - theoP[i-1] } return(d) } Thanks, Chris -- -- View this message in context: http://r.789695.n4.nabble.com/Replace-for-loop-when-vector-cal ling-itself-tp3338383p3339351.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Replace for loop when vector calling itself
Isn't c(0, cumsum(diff(theoP)) ) just theoP - theoP[1] ? -- David -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of David Winsemius Sent: Monday, March 07, 2011 10:52 AM To: rivercode Cc: r-help@r-project.org Subject: Re: [R] Replace for loop when vector calling itself On Mar 7, 2011, at 11:12 AM, rivercode wrote: Hope this clarifies my Q. Creating a vector where each element is (except the first which is 0) is: the previous element + a calculation from another vector theoP[i] - theoP[i-1] I could not figure out how to do this without a for loop, as the vector had to reference itself for the next element...I am missing something obvious, but not too sure what. d = vector(mode = numeric, length= len) d[1] = 0 if (len1) for (i in 2:len) { d[i] = d[i-1] + theoP[i] - theoP[i-1] } so why not: c(0, cumsum(diff(theoP)) ) theoP - sample(1:10, 15, replace=TRUE); len=length(theoP); d = vector(mode = numeric, length= len) d[1] = 0 if (len1) for (i in 2:len) { d[i] = d[i-1] + theoP[i] - theoP[i-1] } d [1] 0 1 -6 -5 -4 0 1 0 1 -4 3 -5 -5 1 -3 c(0, cumsum(diff(theoP)) ) [1] 0 1 -6 -5 -4 0 1 0 1 -4 3 -5 -5 1 -3 all.equal(d, c(0, cumsum(diff(theoP)) ) ) [1] TRUE -- David. Thanks, Chris Hi, I am missing something obvious. Need to create vector as: (0, i-1 + TheoP(i) - TheoP(i-1), repeat) Where i is the index position in the vector and i[1] is always 0. I think your prototype is not agreeing with the code below. Is i suppose to be the index (as suggested above) or the prior term (as implied below)? Found myself having to use a For Loop because I could not get sapply working. Any suggestions ? Assuming the code below, you construct the first three or four values by hand I think you will find that the intermediate values of TheoP will have alternating signs. term2 = 2-1 + TheoP(2) - TheoP(1) term3 = 3-1 + TheoP(3) - (2-1 + TheoP(2) - TheoP(1)) term4 = 4-1 + TheoP(4) - (3-1 + TheoP(3) - (2-1 + TheoP(2) - TheoP(1)) ) The answer to the first question will determine how you proceed. If the index is being used then there are two series of cumulative sums and perhaps you can construct an expression that can be fed to the cumsum function. The diff function is also available and if the index version is correct, then it might even be as simple as c(0, ((1:len)-1)+diff(TheoP) ) So clarify what is intended. -- David. delta - function(x) { start = index[x] end = index[x+1] - 1 iTheo = start:end len = length(iTheo) theoP = as.numeric(TheoBA$Bid[iTheo]) d = vector(mode = numeric, length= len) d[1] = 0 if (len1) for (i in 2:len) { d[i] = d[i-1] + theoP[i] - theoP[i-1] } return(d) } Thanks, Chris -- -- View this message in context: http://r.789695.n4.nabble.com/Replace-for-loop-when-vector-calling-itself-tp3338383p3339351.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. This e-mail and any materials attached hereto, including, without limitation, all content hereof and thereof (collectively, XR Content) are confidential and proprietary to XR Trading, LLC (XR) and/or its affiliates, and are protected by intellectual property laws. Without the prior written consent of XR, the XR Content may not (i) be disclosed to any third party or (ii) be reproduced or otherwise used by anyone other than current employees of XR or its affiliates, on behalf of XR or its affiliates. THE XR CONTENT IS PROVIDED AS IS, WITHOUT REPRESENTATIONS OR WARRANTIES OF ANY KIND. TO THE MAXIMUM EXTENT PERMISSIBLE UNDER APPLICABLE LAW, XR HEREBY DISCLAIMS ANY AND ALL WARRANTIES, EXPRESS AND IMPLIED, RELATING TO THE XR CONTENT, AND NEITHER XR NOR ANY OF ITS AFFILIATES SHALL IN ANY EVENT BE LIABLE FOR ANY DAMAGES OF ANY NATURE WHATSOEVER, INCLUDING, BUT NOT LIMITED TO, DIRECT, INDIRECT, CONSEQUENTIAL, SPECIAL AND PUNITIVE DAMAGES, LOSS OF PROFITS AND TRADING LOSSES, RESULTING FROM ANY PERSON'S USE OR RELIANCE UPON, OR INABILITY TO USE, ANY XR CONTENT, EVEN IF XR IS ADVISED OF THE POSSIBILITY OF SUCH DAMAGES OR IF SUCH DAMAGES WERE FORESEEABLE. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented,
[R] png inside loop
hello list! I'm sorry, I just stumbled over this strange behaviour (at least I am not able to explain the behaviour, therefore I assume it to be a strange behaviour): attach(water) # I know, this is not recommended names(water[3:10]) [1] temp pH DO BOD COD no3 no2 po4 for (i in names(water)[3:10]){ fname-paste(Henni/GFX/fem,i,.png,sep=) mname-paste(Henni/GFX/mal,i,.png,sep=) png(fname,1000,1000) xyplot(N_female~eval(parse(text=i)) |group,xlab=i,ylab=Abundance) graphics.off() png(mname,1000,1000) xyplot(N_male~eval(parse(text=i)) |group,xlab=i,ylab=Abundance) graphics.off() } well, to anyone's surprise, there are no plots in the folder. the loop finishes (i, fname and mname have values assigned) and executing png(fname,1000,1000) xyplot(N_female~eval(parse(text=i)) |group,xlab=i,ylab=Abundance) graphics.off() does produce a png in the folder. I assume this to be caused by the png function, since removing the graphics.off() and playing with dev.off() and the likes did not help. anyone ideas?? am I missing the obvious?? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] png inside loop
Hi, I myself do not use lattice plots, but I think your problem is in FAQ 7.22: you didn't print() your plots. See the R FAQ for more details on it. HTH, Ivan Le 3/7/2011 18:28, Sacha Viquerat a écrit : hello list! I'm sorry, I just stumbled over this strange behaviour (at least I am not able to explain the behaviour, therefore I assume it to be a strange behaviour): attach(water) # I know, this is not recommended names(water[3:10]) [1] temp pH DO BOD COD no3 no2 po4 for (i in names(water)[3:10]){ fname-paste(Henni/GFX/fem,i,.png,sep=) mname-paste(Henni/GFX/mal,i,.png,sep=) png(fname,1000,1000) xyplot(N_female~eval(parse(text=i)) |group,xlab=i,ylab=Abundance) graphics.off() png(mname,1000,1000) xyplot(N_male~eval(parse(text=i)) |group,xlab=i,ylab=Abundance) graphics.off() } well, to anyone's surprise, there are no plots in the folder. the loop finishes (i, fname and mname have values assigned) and executing png(fname,1000,1000) xyplot(N_female~eval(parse(text=i)) |group,xlab=i,ylab=Abundance) graphics.off() does produce a png in the folder. I assume this to be caused by the png function, since removing the graphics.off() and playing with dev.off() and the likes did not help. anyone ideas?? am I missing the obvious?? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Ivan CALANDRA PhD Student University of Hamburg Biozentrum Grindel und Zoologisches Museum Abt. Säugetiere Martin-Luther-King-Platz 3 D-20146 Hamburg, GERMANY +49(0)40 42838 6231 ivan.calan...@uni-hamburg.de ** http://www.for771.uni-bonn.de http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] png inside loop
Try: print(xyplot(N_female~eval(parse(text=i)) |group,xlab=i,ylab=Abundance)) Steve Riley, PharmD, PhD Clinical Pharmacology Specialty Care Medicines Development Group Pfizer Inc. 50 Pequot Ave MS-6025-B2110 New London, CT 06320 Email: steve.ri...@pfizer.com Phone: (860) 732-1796 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Sacha Viquerat Sent: Monday, March 07, 2011 12:28 PM To: r-help@r-project.org Subject: [R] png inside loop hello list! I'm sorry, I just stumbled over this strange behaviour (at least I am not able to explain the behaviour, therefore I assume it to be a strange behaviour): attach(water) # I know, this is not recommended names(water[3:10]) [1] temp pH DO BOD COD no3 no2 po4 for (i in names(water)[3:10]){ fname-paste(Henni/GFX/fem,i,.png,sep=) mname-paste(Henni/GFX/mal,i,.png,sep=) png(fname,1000,1000) xyplot(N_female~eval(parse(text=i)) |group,xlab=i,ylab=Abundance) graphics.off() png(mname,1000,1000) xyplot(N_male~eval(parse(text=i)) |group,xlab=i,ylab=Abundance) graphics.off() } well, to anyone's surprise, there are no plots in the folder. the loop finishes (i, fname and mname have values assigned) and executing png(fname,1000,1000) xyplot(N_female~eval(parse(text=i)) |group,xlab=i,ylab=Abundance) graphics.off() does produce a png in the folder. I assume this to be caused by the png function, since removing the graphics.off() and playing with dev.off() and the likes did not help. anyone ideas?? am I missing the obvious?? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] png inside loop
Dear Sacha, On Mon, Mar 7, 2011 at 9:28 AM, Sacha Viquerat tweedi...@web.de wrote: hello list! I'm sorry, I just stumbled over this strange behaviour (at least I am not able to explain the behaviour, therefore I assume it to be a strange behaviour): attach(water) # I know, this is not recommended not only not recommended, not needed. xyplot() has a data argument you could have used. names(water[3:10]) [1] temp pH DO BOD COD no3 no2 po4 for (i in names(water)[3:10]){ fname-paste(Henni/GFX/fem,i,.png,sep=) mname-paste(Henni/GFX/mal,i,.png,sep=) png(fname,1000,1000) xyplot(N_female~eval(parse(text=i)) |group,xlab=i,ylab=Abundance) graphics.off() png(mname,1000,1000) xyplot(N_male~eval(parse(text=i)) |group,xlab=i,ylab=Abundance) graphics.off() } well, to anyone's surprise, there are no plots in the folder. the loop finishes (i, fname and mname have values assigned) and executing png(fname,1000,1000) xyplot(N_female~eval(parse(text=i)) |group,xlab=i,ylab=Abundance) graphics.off() does produce a png in the folder. I assume this to be caused by the png No. function, since removing the graphics.off() and playing with dev.off() and I always thought using dev.off() was the normal way, but it may just be *my* normal way. the likes did not help. anyone ideas?? am I missing the obvious?? It is obvious only insofar as it is noted in the FAQ. Here is a link: http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-do-lattice_002ftrellis-graphics-not-work_003f short answer, wrap your call in print() like so: print(xyplot(yourargument)) HTH, Josh __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Replace for loop when vector calling itself
On Mar 7, 2011, at 12:27 PM, David Reiner wrote: Isn't c(0, cumsum(diff(theoP)) ) just theoP - theoP[1] ? I think it just might be. There is no random or systematic factor that modifies that alternating sign in the series. -- other David. -- David -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org ] On Behalf Of David Winsemius Sent: Monday, March 07, 2011 10:52 AM To: rivercode Cc: r-help@r-project.org Subject: Re: [R] Replace for loop when vector calling itself On Mar 7, 2011, at 11:12 AM, rivercode wrote: Hope this clarifies my Q. Creating a vector where each element is (except the first which is 0) is: the previous element + a calculation from another vector theoP[i] - theoP[i-1] I could not figure out how to do this without a for loop, as the vector had to reference itself for the next element...I am missing something obvious, but not too sure what. d = vector(mode = numeric, length= len) d[1] = 0 if (len1) for (i in 2:len) { d[i] = d[i-1] + theoP[i] - theoP[i-1] } so why not: c(0, cumsum(diff(theoP)) ) theoP - sample(1:10, 15, replace=TRUE); len=length(theoP); d = vector(mode = numeric, length= len) d[1] = 0 if (len1) for (i in 2:len) { d[i] = d[i-1] + theoP[i] - theoP[i-1] } d [1] 0 1 -6 -5 -4 0 1 0 1 -4 3 -5 -5 1 -3 c(0, cumsum(diff(theoP)) ) [1] 0 1 -6 -5 -4 0 1 0 1 -4 3 -5 -5 1 -3 all.equal(d, c(0, cumsum(diff(theoP)) ) ) [1] TRUE -- David. Thanks, Chris Hi, I am missing something obvious. Need to create vector as: (0, i-1 + TheoP(i) - TheoP(i-1), repeat) Where i is the index position in the vector and i[1] is always 0. I think your prototype is not agreeing with the code below. Is i suppose to be the index (as suggested above) or the prior term (as implied below)? Found myself having to use a For Loop because I could not get sapply working. Any suggestions ? Assuming the code below, you construct the first three or four values by hand I think you will find that the intermediate values of TheoP will have alternating signs. term2 = 2-1 + TheoP(2) - TheoP(1) term3 = 3-1 + TheoP(3) - (2-1 + TheoP(2) - TheoP(1)) term4 = 4-1 + TheoP(4) - (3-1 + TheoP(3) - (2-1 + TheoP(2) - TheoP(1)) ) The answer to the first question will determine how you proceed. If the index is being used then there are two series of cumulative sums and perhaps you can construct an expression that can be fed to the cumsum function. The diff function is also available and if the index version is correct, then it might even be as simple as c(0, ((1:len)-1)+diff(TheoP) ) So clarify what is intended. -- David. delta - function(x) { start = index[x] end = index[x+1] - 1 iTheo = start:end len = length(iTheo) theoP = as.numeric(TheoBA$Bid[iTheo]) d = vector(mode = numeric, length= len) d[1] = 0 if (len1) for (i in 2:len) { d[i] = d[i-1] + theoP[i] - theoP[i-1] } return(d) } Thanks, Chris -- -- View this message in context: http://r.789695.n4.nabble.com/Replace-for-loop-when-vector-calling-itself-tp3338383p3339351.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. This e-mail and any materials attached hereto, including, without limitation, all content hereof and thereof (collectively, XR Content) are confidential and proprietary to XR Trading, LLC (XR) and/or its affiliates, and are protected by intellectual property laws. Without the prior written consent of XR, the XR Content may not (i) be disclosed to any third party or (ii) be reproduced or otherwise used by anyone other than current employees of XR or its affiliates, on behalf of XR or its affiliates. THE XR CONTENT IS PROVIDED AS IS, WITHOUT REPRESENTATIONS OR WARRANTIES OF ANY KIND. TO THE MAXIMUM EXTENT PERMISSIBLE UNDER APPLICABLE LAW, XR HEREBY DISCLAIMS ANY AND ALL WARRANTIES, EXPRESS AND IMPLIED, RELATING TO THE XR CONTENT, AND NEITHER XR NOR ANY OF ITS AFFILIATES SHALL IN ANY EVENT BE LIABLE FOR ANY DAMAGES OF ANY NATURE WHATSOEVER, INCLUDING, BUT NOT LIMITED TO, DIRECT, INDIRECT, CONSEQUENTIAL, SPECIAL AND PUNITIVE DAMAGES, LOSS OF PROFITS AND TRADING LOSSES, RESULTING FROM ANY PERSON'S USE OR RELIANCE UPON, OR INABILITY TO USE, ANY XR CONTENT, EVEN IF XR IS ADVISED OF THE POSSIBILITY OF SUCH DAMAGES OR IF SUCH DAMAGES WERE FORESEEABLE. David Winsemius, MD West Hartford, CT
[R] Risk differences with survey package
I'm trying to use the survey package to calculate a risk difference with confidence interval for binge drinking between sexes. Variables are X_RFBING2 (Yes, No) and SEX. Both are factors. I can get the group prevalences easily enough with result - svyby(~X_RFBING2, ~SEX, la04.svy, svymean, na.rm = TRUE) and then extract components from the svyby object with SE() and coef() to do the computations. This gives the correct results, but I'd like to set this up as a contrast and am having difficulty. What is the best way to do it? Thanks! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] png inside loop
You need to print lattice objects. See the FAQ. -- David. On Mar 7, 2011, at 12:28 PM, Sacha Viquerat wrote: hello list! I'm sorry, I just stumbled over this strange behaviour (at least I am not able to explain the behaviour, therefore I assume it to be a strange behaviour): attach(water) # I know, this is not recommended names(water[3:10]) [1] temp pH DO BOD COD no3 no2 po4 for (i in names(water)[3:10]){ fname-paste(Henni/GFX/fem,i,.png,sep=) mname-paste(Henni/GFX/mal,i,.png,sep=) png(fname,1000,1000) xyplot(N_female~eval(parse(text=i)) |group,xlab=i,ylab=Abundance) graphics.off() png(mname,1000,1000) xyplot(N_male~eval(parse(text=i)) |group,xlab=i,ylab=Abundance) graphics.off() } well, to anyone's surprise, there are no plots in the folder. the loop finishes (i, fname and mname have values assigned) and executing png(fname,1000,1000) xyplot(N_female~eval(parse(text=i)) |group,xlab=i,ylab=Abundance) graphics.off() does produce a png in the folder. I assume this to be caused by the png function, since removing the graphics.off() and playing with dev.off() and the likes did not help. anyone ideas?? am I missing the obvious?? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] re-arranging data to create an image (or heatmap)
It turned out that rearranging the data was indeed the key to get the image I want. The way I do it now is this: tt - read.csv(file=file.csv, header=T, sep=,, dec=., stringsAsFactors=F) names(tt) - c('time','abs') dat - with(tt, table(time, abs)) image(dat,col=rainbow(256)) I'm now modifying the rainbow color palette to match my needs. Also, the X-axis and Y-axis display values between 0 and 1, so I want to change them to show the actual time and abs values. -- View this message in context: http://r.789695.n4.nabble.com/re-arranging-data-to-create-an-image-or-heatmap-tp3327986p3339498.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Help with uniform distribution
Hi I have the table data below Simula.Capital-data.frame(Week=c(0:52), Production=0) Simula.Capital$Production-round(c(120,rnorm(52, mean = 100, sd = 25)), 0) weeks=3 i-1; Sell-NULL; Maximo-NULL for(i in seq(along= Simula.Capital$Production)){ Maximo-Simula.Capital[2,Production] Sell-c(Sell,(round(runif(weeks, min=0, max=Maximo), 0))) i=i+1 } Vendas.final-matrix(Sell, ncol=weeks) cbind(Simula.Capital,Vendas.final) I want the production sell in n weeks. I use 3 weeks by example. And the selling follows a uniform distribution. 1. How can i split the production in n weeks. Please be aware that the sum of 3 weeks selling must be equal to the production. I always sell all production. 2. At the end of 52 weeks I don´t have stock. 3. I need a column that sum the selling by week. Thanks a lot Bala -- View this message in context: http://r.789695.n4.nabble.com/Help-with-uniform-distribution-tp3339560p3339560.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] CDF of Sample Quantile
Just to tie up this thread, I wanted to report my solution: When (n-1)p is an integer, there is a closed form solution: pbinom(j-1,n,...) When it is not an integer, its fairly easy to approximate the solution by interpolating between the closed-form solutions: fitting log(1 - probability from closed form solution) on an orthogonal polynomial in n. This is a _very_ fast and fairly accurate solution. Thanks to all who offered their help... On Thu, Feb 17, 2011 at 11:11 PM, Bentley Coffey bentleygcof...@gmail.comwrote: Duncan, I'm not sure how I missed your message. Sorry. What you describe is what I do when (n-1)p is an integer so that R computes the sample quantile using a single order statistic. My later posting has that exact binomial expression in there as a special case. When (n-1)p is not an integer then R uses a weighted average of 2 order statistics, in which case I'm left with my standing problem... On Mon, Feb 14, 2011 at 2:26 PM, Duncan Murdoch murdoch.dun...@gmail.comwrote: On 14/02/2011 9:58 AM, Bentley Coffey wrote: I need to calculate the probability that a sample quantile will exceed a threshold given the size of the iid sample and the parameters describing the distribution of each observation (normal, in my case). I can compute the probability with brute force simulation: simulate a size N sample, apply R's quantile() function on it, compare it to the threshold, replicate this MANY times, and count the number of times the sample quantile exceeded the threshold (dividing by the total number of replications yields the probability of interest). The problem is that the number of replications required to get sufficient precision (3 digits say) is so HUGE that this takes FOREVER. I have to perform this task so much in my script (searching over the sample size and repeated for several different distribution parameters) that it takes too many hours to run. I've searched for pre-existing code to do this in R and haven't found anything. Perhaps I'm missing something. Is anyone aware of an R function to compute this probability? I've tried writing my own code using the fact that R's quantile() function is a linear combination of 2 order statistics. Basically, I wrote down the mathematical form of the joint pdf for the 2 order statistics (a function of the sample size and the distribution parameters) then performed a pseudo-Monte Carlo integration (i.e. using Halton Draws rather than R's random draws) over the region where the sample quantile exceeds the threshold. In theory, this should work and it takes about 1000 times fewer clock cycles to compute than the Brute Force approach. My problem is that there is a significant discrepancy between the results using Brute Force and using this more efficient approach that I have coded up. I believe that the problem is numerical error but it could be some programming bug; regardless, I have been unable to locate the source of this problem and have spent over 20 hours trying to identify it this weekend. Please, somebody help!!! So, again, my question: is there code in R for quickly evaluating the CDF of a Sample Quantile given the sample size and the parameters governing the distribution of each iid point in the sample? I think the answer to your question is no, but I think it's the wrong question. Suppose Xm:n is the mth sample quantile in a sample of size n, and you want to calculate P(Xm:n x). Let X be a draw from the underlying distribution, and suppose P(X x) = p. Then the event Xm:n x is the same as the event that out of n independent draws of X, at least n-m+1 are bigger than x: a binomial probability. And R can calculate binomial probabilities using pbinom(). Duncan Murdoch [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] linear mixed model with nested factors
Hi R-help. I am trying to run a linear mixed model with nested factors with either lme or lmer and I am having no luck obtaining the same results as Minitab. Here is Minitab's code: MTB GLM 'count' = site year replicate(site year) site*year; SUBC Random 'year' 'replicate'; Can you tell me how to code this in R? The settings are typeII, Tukey, 95%confidence interval, but I know how to set those in R. I have basic R skills and I've worked with this on and off for several days, and have also consulted some people with basic R experience, but nobody can get it to work as it seems to be a bit complex. I'd prefer to use R rather than Minitab for the research project I'm working on, but the time spent on this problem has gotten to be too much. Thanks in advance. Sincerely, David student - Norway __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] linear mixed model with nested factors
David: It is unlikely you will get a helpful response to this. Instead, you will improve your chances of a good response if you do three things: 1) Provide a mathematical description of the model you are trying to estimate 2) Provide a description of the data you have 3) Provide some code or any efforts you have made with the lmer function so far. For example, for (1) you might provide something like, I want a model as y_ji = mu + beta(x) + u_j + e_ij where x is ... For (2) You can show your data structure as: str(myData) And for (3) just indicate what you have done so far. For example, I have tried lmer(response ~ covariate + (1|covariate), myData) -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of David Dudek Sent: Monday, March 07, 2011 1:29 PM To: r-help@r-project.org Subject: [R] linear mixed model with nested factors Hi R-help. I am trying to run a linear mixed model with nested factors with either lme or lmer and I am having no luck obtaining the same results as Minitab. Here is Minitab's code: MTB GLM 'count' = site year replicate(site year) site*year; SUBC Random 'year' 'replicate'; Can you tell me how to code this in R? The settings are typeII, Tukey, 95%confidence interval, but I know how to set those in R. I have basic R skills and I've worked with this on and off for several days, and have also consulted some people with basic R experience, but nobody can get it to work as it seems to be a bit complex. I'd prefer to use R rather than Minitab for the research project I'm working on, but the time spent on this problem has gotten to be too much. Thanks in advance. Sincerely, David student - Norway __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] linear mixed model with nested factors
Think about it. You have asked for help from the R list but have posted no R code, only Minitab code. That means in order to answer, the helpeR must know Minitab. Well, some may, but there's certainly no reason to expect so on an R list. Don't you therefore think it might be wiser to post a careful description of your study design and the model you wish to fit along with any R code attempting to accomplish this, as the posting guide asks? A minimal reproducible example might also help you get a response. Again, as the posting guide asks. FWIW, if your response variable is a count of some sort, you will probably need glmer in the lme4 package to fit the data, as lme only fits continuous, approximately Gaussian data. But of course, with no details, this is just a guess. -- Bert On Mon, Mar 7, 2011 at 10:29 AM, David Dudek david.du...@student.umb.no wrote: Hi R-help. I am trying to run a linear mixed model with nested factors with either lme or lmer and I am having no luck obtaining the same results as Minitab. Here is Minitab's code: MTB GLM 'count' = site year replicate(site year) site*year; SUBC Random 'year' 'replicate'; Can you tell me how to code this in R? The settings are typeII, Tukey, 95%confidence interval, but I know how to set those in R. I have basic R skills and I've worked with this on and off for several days, and have also consulted some people with basic R experience, but nobody can get it to work as it seems to be a bit complex. I'd prefer to use R rather than Minitab for the research project I'm working on, but the time spent on this problem has gotten to be too much. Thanks in advance. Sincerely, David student - Norway __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Bert Gunter Genentech Nonclinical Biostatistics __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Risk differences with survey package
On Tue, Mar 8, 2011 at 6:50 AM, michael.laviole...@dhhs.state.nh.us wrote: I'm trying to use the survey package to calculate a risk difference with confidence interval for binge drinking between sexes. Variables are X_RFBING2 (Yes, No) and SEX. Both are factors. I can get the group prevalences easily enough with result - svyby(~X_RFBING2, ~SEX, la04.svy, svymean, na.rm = TRUE) and then extract components from the svyby object with SE() and coef() to do the computations. This gives the correct results, but I'd like to set this up as a contrast and am having difficulty. What is the best way to do it? The easiest way is to use svyglm() -- a risk difference is what you get in a linear model For example, using the built-in dclus2 data set riskmodel-svyglm(as.numeric(comp.imp)~sch.wide, design=dclus2) riskmodel 2 - level Cluster Sampling design With (40, 126) clusters. svydesign(id = ~dnum + snum, fpc = ~fpc1 + fpc2, data = apiclus2) Call: svyglm(as.numeric(comp.imp) ~ sch.wide, design = dclus2) Coefficients: (Intercept) sch.wideYes 1.1068 0.7743 Degrees of Freedom: 125 Total (i.e. Null); 38 Residual Null Deviance: 27.02 Residual Deviance: 12.9 AIC: 124.8 confint(riskmodel) 2.5 %97.5 % (Intercept) 0.9484637 1.2651862 sch.wideYes 0.6232830 0.9253461 You can't extract the results from the output of svyby(), in general, because svyby() doesn't know how to compute the covariances between estimates in the groups -- after all, the function input to svyby() can be any function including one you wrote. These estimates won't typically be independent in a complex design, in contrast to the situation in a simple random sample. With a replicate-weights design you can use svyby() with the covmat=TRUE option to return the covariance matrix, and then compute contrasts. This only works if the function returns the replicate estimates (which all the built-in functions do), allowing the covariance to be computed. groups-svyby(~comp.imp, ~sch.wide, design=rclus1, svymean, covmat=TRUE) svycontrast(groups, quote(`Yes:comp.impYes`-`No:comp.impYes`)) nlcon SE contrast 0.83125 0.0209 -thomas -- Thomas Lumley Professor of Biostatistics University of Auckland __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Monte carlo help
The way I see it is that you have a non-homogeneous poisson process to describe the way the students arrive and that you're missing the service time of your tutors. The way you are modeling the arrival of students is *really *bad. At most, only a single student can arrive each hour so to solve the model you've created you need to have one tutor available each hour. Check out the wikipedia articles on poisson process and non-homogeneous poisson process (the first is simpler to model but the second is the situation you are modeling). As for service time, presumably you'll have tutors of different levels (algebra 1 vs analysis 1) and it is not necessarily the case that your analysis tutors can do algebra faster but that they command a higher hourly rate for their expertise. In modeling you'll want to look at minimizing cost (both direct for paying tutors and indirect for excessive wait time of your customers) Hope this helps. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] generate 3 distinct random samples without replacement
Hello: I wonder if I could get a little help with random sampling in R. I have a vector of length 7375. I would like to draw 3 distinct random samples, each of length 100 without replacement. I have tried the following: d1 - 1:7375 set.seed(7) i - sample(d1, 100, replace=F) s1 - sort(d1[i]) s1 d2 - d1[-i] set.seed(77) j - sample(d2, 100, replace=F) s2 - sort(d2[j]) s2 d3 - d2[-j] set.seed(777) k - sample(d3, 100, replace=F) s3 - sort(d3[k]) s3 D - data.frame(a=s1,b=s2,c=s3) However, s2 is only 97 elements long, and s3, only 96 long. I would appreciate any suggestions on a better approach. I'm also curious to know why my second and third samples are less than 100 elements in length. Thanks for your time and consideration, Cesar A. Hincapié, DC, MHSc Research Fellow, Division of Health Care and Outcomes Research, Toronto Western Research Institute PhD Candidate in Epidemiology, Dalla Lana School of Public Health, University of Toronto e. cesar.hinca...@utoronto.ca [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] read.ssd() from foreign package
Thank you Peter. SAS Universal Viewer can open both SAS datasets. And if I do the following in SAS, it will print out the dataset: libname x 'C:\SASdata'; proc print data=x.a; run; Here are what is in the log files: 1. For the one that doesn't work: tmp-read.ssd(C:\\SASdata, a,sascmd=C:/Program Files/SAS/SASFoundation/9.2/sas.exe) SAS failed. The log file will be file3d6c4ae1.log in the current directory Warning message: In read.ssd(C:\\SASdata, a, sascmd = C:/Program Files/SAS/SASFoundation/9.2/sas.exe) : SAS return code was 1 The content of log file file3d6c4ae1.log: NOTE: SAS initialization used: real time 0.90 seconds cpu time0.31 seconds 1 option validvarname = v6;libname src2rd 'C:\SASdata'; NOTE: Libref SRC2RD was successfully assigned as follows: Engine:V9 Physical Name: C:\SASdata 2 libname rd xport 'C:\DOCUME~1\yiz01\LOCALS~1\Temp\Rtmptznwrt\file678418be'; NOTE: Libref RD was successfully assigned as follows: Engine:XPORT Physical Name: C:\DOCUME~1\yiz01\LOCALS~1\Temp\Rtmptznwrt\file678418be 3 proc copy in=src2rd out=rd; 4 select a ; NOTE: Copying SRC2RD.A to RD.A (memtype=DATA). NOTE: There were 3347 observations read from the data set SRC2RD.A. NOTE: The data set RD.A has 3347 observations and 52 variables. WARNING: Labels exceeding length 40 are not supported by engine XPORT and are being truncated. NOTE: PROCEDURE COPY used (Total process time): real time 2.75 seconds cpu time0.09 seconds NOTE: SAS Institute Inc., SAS Campus Drive, Cary, NC USA 27513-2414 NOTE: The SAS System used: real time 4.00 seconds cpu time0.43 seconds It only has an warning about the label exceeding length 40, but has no error message. And it even appears that it has read in 3347 observations and 52 variables. 2. For the one that worked: tmp-read.ssd(C:\\SASdata, b,sascmd=C:/Program Files/SAS/SASFoundation/9.2/sas.exe) The content of the log file is: NOTE: SAS initialization used: real time 0.59 seconds cpu time0.17 seconds 1 option validvarname = v6;libname src2rd 'C:\SASdata'; NOTE: Libref SRC2RD was successfully assigned as follows: Engine:V9 Physical Name: C:\SASdata 2 libname rd xport 'C:\DOCUME~1\yiz01\LOCALS~1\Temp\Rtmptznwrt\file72ae2cd6'; NOTE: Libref RD was successfully assigned as follows: Engine:XPORT Physical Name: C:\DOCUME~1\yiz01\LOCALS~1\Temp\Rtmptznwrt\file72ae2cd6 3 proc copy in=src2rd out=rd; 4 select b ; NOTE: Copying SRC2RD.B to RD.B (memtype=DATA). NOTE: Data file SRC2RD.B.DATA is in a format that is native to another host, or the file encoding does not match the session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce performance. NOTE: The variable name study_eye has been truncated to study_ey. NOTE: The variable study_ey now has a label set to study_eye. NOTE: The variable name OCSHEVA_dec has been truncated to OCSHEVA_. WARNING: Engine XPORT does not support SORTEDBY operations. SORTEDBY information cannot be copied. NOTE: There were 3574 observations read from the data set SRC2RD.B. NOTE: The data set RD.B has 3574 observations and 62 variables. NOTE: PROCEDURE COPY used (Total process time): real time 1.81 seconds cpu time0.42 seconds NOTE: SAS Institute Inc., SAS Campus Drive, Cary, NC USA 27513-2414 NOTE: The SAS System used: real time 2.40 seconds cpu time0.59 seconds Thank you! John From: peter dalgaard pda...@gmail.com Cc: r-help@r-project.org Sent: Sun, March 6, 2011 1:51:51 AM Subject: Re: [R] read.ssd() from foreign package On Mar 6, 2011, at 09:34 , array chip wrote: Hi, I am encountering a confusing problem when I tried to use read.ssd to read SAS datasets. For one SAS dataset a.sas7bdat, it did not work; while for another SAS dataset b.sas7bdat it worked: tmp-read.ssd(C:\\SASdata, a,sascmd=C:/Program Files/SAS/SASFoundation/9.2/sas.exe) SAS failed. SAS program at C:\DOCUME~1\yiz01\LOCALS~1\Temp\RtmpVjJa6m\file12384509.sas The log file will be file12384509.log in the current directory Warning message: In read.ssd(C:\\SASdata, a, sascmd = C:/Program Files/SAS/SASFoundation/9.2/sas.exe) : SAS return code was 1 tmp-read.ssd(C:\\SASdata, b,sascmd=C:/Program Files/SAS/SASFoundation/9.2/sas.exe) The attached log files are also attached. Nope... Presumably, your mailer encoded them as non-text and the mailing list software scrubbed them. Try inlining them. Not much we can do without an error message to go on. (I gather, by the way, that even SAS itself has trouble reading SAS files these days due to
Re: [R] advice on classes/methods/extending classes
Hi Erin, The Chambers book and web page Josh mentioned provide a good description of the S4 object system. Another introduction to S4 is Robert Gentleman's S4 Classes in 15 pages, more or less ( http://www.stat.auckland.ac.nz/S-Workshop/Gentleman/S4Objects.pdf). The help pages for ?setClass, ?setGeneric, ?setMethod, ?Classes, and ?Methods also have more details. The S3 object system is described in detail in a chapter of Statistical Models in S by Chambers and Hastie (although I think some of the details are different in R), the R language manual ( http://cran.case.edu/doc/manuals/R-lang.html#Object_002doriented-programming), and the help pages for ?UseMethod, ?class, and ?methods. Chambers and Hastie have an example of designing a class like the standard factor class and extending it to be an ordered factor. The first vignette in Charlotte Maia's ofp package ( http://cran.case.edu/web/packages/ofp/index.html, http://cran.case.edu/web/packages/ofp/vignettes/ofp1.pdf) has a brief example of extending class structures in S3 with a classic OO example of point, circle, and colouredcircle. The focus of the vignette is on the enhancements provided by the ofp package, but the first example only uses S3 without any enhancements. There is also the new S4 reference class system. I think that is only documented in the ?setRefClass help page. John -- John Oleynick Johnson Johnson Pharmaceutical Research Development, Non-Clinical Statistics University of Medicine and Dentistry of New Jersey, Ph.D. Candidate joley...@gmail.com On Fri, Mar 4, 2011 at 2:02 AM, Joshua Wiley jwiley.ps...@gmail.com wrote: Hi Erin, One good option would be the official manual: http://cran.r-project.org/doc/manuals/R-exts.html It depends to an extent, I think, on what types of methods you would like to work with and use. FWIW, I have and really enjoy both S Programming by Venables Ripley (mostly S3 methods and general programming, though I am sure that poor summary does not do it justice) and Software for Data Analysis by John Chambers (S4 methods). HTH, Josh On Thu, Mar 3, 2011 at 9:47 PM, Erin Hodgess erinm.hodg...@gmail.com wrote: Dear R People: What is the best way to learn about classes, methods, extending classes, and namespaces, please? I know a bit about classes, but would like to learn much more. Thanks in advance for any advice! Sincerely, Erin -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] generate 3 distinct random samples without replacement
would this work? s - sample(d1, 300, F) D - data.frame(a = s[1:100], b = s[101:200], c = s[201:300]) -- Jonathan P. Daily Technician - USGS Leetown Science Center 11649 Leetown Road Kearneysville WV, 25430 (304) 724-4480 Is the room still a room when its empty? Does the room, the thing itself have purpose? Or do we, what's the word... imbue it. - Jubal Early, Firefly r-help-boun...@r-project.org wrote on 03/07/2011 02:17:19 PM: [image removed] [R] generate 3 distinct random samples without replacement Cesar Hincapié to: r-help 03/07/2011 03:06 PM Sent by: r-help-boun...@r-project.org Hello: I wonder if I could get a little help with random sampling in R. I have a vector of length 7375. I would like to draw 3 distinct random samples, each of length 100 without replacement. I have tried the following: d1 - 1:7375 set.seed(7) i - sample(d1, 100, replace=F) s1 - sort(d1[i]) s1 d2 - d1[-i] set.seed(77) j - sample(d2, 100, replace=F) s2 - sort(d2[j]) s2 d3 - d2[-j] set.seed(777) k - sample(d3, 100, replace=F) s3 - sort(d3[k]) s3 D - data.frame(a=s1,b=s2,c=s3) However, s2 is only 97 elements long, and s3, only 96 long. I would appreciate any suggestions on a better approach. I'm also curious to know why my second and third samples are less than 100 elements in length. Thanks for your time and consideration, Cesar A. Hincapié, DC, MHSc Research Fellow, Division of Health Care and Outcomes Research, Toronto Western Research Institute PhD Candidate in Epidemiology, Dalla Lana School of Public Health, University of Toronto e. cesar.hinca...@utoronto.ca [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] generate 3 distinct random samples without replacement
Cesar, your indexing is wrong: On Mon, Mar 7, 2011 at 2:17 PM, Cesar Hincapié cesar.hinca...@utoronto.ca wrote: Hello: I wonder if I could get a little help with random sampling in R. I have a vector of length 7375. I would like to draw 3 distinct random samples, each of length 100 without replacement. I have tried the following: d1 - 1:7375 set.seed(7) i - sample(d1, 100, replace=F) s1 - sort(d1[i]) s1 d1 is a continuous vector of integers, 1 thru 7375 and of length 7375 d2 - d1[-i] but you've taken out 100 of those numbers, so d2 is now of length 7275 and has gaps in the sequence. set.seed(77) j - sample(d2, 100, replace=F) s2 - sort(d2[j]) s2 j is a sample *of the values* and those values are no longer the indices of the vector d2 You need instead j - sample(1:length(d2), 100, replace=FALSE) s2 - sort(d2[j]) Some of the value in j no longer exist in d2 as indices. 7375 could be selected, but since d2 only has 7275 elements d2[7375] doesn't return anything (actually NA). Same for your third sample, only the indices are even less like the elements of the vector because you've removed another random set of values. Sarah -- Sarah Goslee http://www.functionaldiversity.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] use caret to rank predictors by random forest model
Hi, I'm using package caret to rank predictors using random forest model and draw predictors importance plot. I used below commands: rf.fit-randomForest(x,y,ntree=500,importance=TRUE) ## x is matrix whose columns are predictors, y is a binary resonse vector ## Then I got the ranked predictors by ranking rf1$importance[,MeanDecreaseAccuracy] ## Then draw the importance plot varImpPlot(rf.fit) As you can see, all the functions I used are directly from the package randomForest, instead of from caret. so I'm wondering if the package caret has some functions who can do the above ranking and ploting. In fact, I tried functions train, varImp and plot from package caret, the random forest model that built by train can not be input correctly to varImp, which gave error message like subscripts out of bounds. Also function plot doesn't work neither. So I'm wondering if anybody has encountered the same problem before, and could shed some light on this. I would really appreciate your help. Thanks, Xiaoqi __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] use caret to rank predictors by random forest model
It would help if you provided the code that you used for the caret functions. The most likely issues is not using importance = TRUE in the call to train() I believe that I've only implemented code for plotting the varImp objects resulting from train() (eg. there is plot.varImp.train but not plot.varImp). Max On Mon, Mar 7, 2011 at 3:27 PM, Xiaoqi Cui x...@mtu.edu wrote: Hi, I'm using package caret to rank predictors using random forest model and draw predictors importance plot. I used below commands: rf.fit-randomForest(x,y,ntree=500,importance=TRUE) ## x is matrix whose columns are predictors, y is a binary resonse vector ## Then I got the ranked predictors by ranking rf1$importance[,MeanDecreaseAccuracy] ## Then draw the importance plot varImpPlot(rf.fit) As you can see, all the functions I used are directly from the package randomForest, instead of from caret. so I'm wondering if the package caret has some functions who can do the above ranking and ploting. In fact, I tried functions train, varImp and plot from package caret, the random forest model that built by train can not be input correctly to varImp, which gave error message like subscripts out of bounds. Also function plot doesn't work neither. So I'm wondering if anybody has encountered the same problem before, and could shed some light on this. I would really appreciate your help. Thanks, Xiaoqi __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Max __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] generate 3 distinct random samples without replacement
On 07/03/2011 2:17 PM, Cesar Hincapié wrote: Hello: I wonder if I could get a little help with random sampling in R. I have a vector of length 7375. I would like to draw 3 distinct random samples, each of length 100 without replacement. I have tried the following: d1- 1:7375 set.seed(7) i- sample(d1, 100, replace=F) s1- sort(d1[i]) s1 d2- d1[-i] set.seed(77) j- sample(d2, 100, replace=F) s2- sort(d2[j]) s2 d3- d2[-j] set.seed(777) k- sample(d3, 100, replace=F) s3- sort(d3[k]) s3 D- data.frame(a=s1,b=s2,c=s3) However, s2 is only 97 elements long, and s3, only 96 long. I would appreciate any suggestions on a better approach. I'm also curious to know why my second and third samples are less than 100 elements in length. If you want 3 non-overlapping, non-repeating samples of 100, why not draw one sample of 300, and take 3 subsets of it? The reason you were finding shorter samples is because you were using j and k as indices into vectors d2 and d3 that didn't have enough elements, and then you sorted the result, losing the NAs. For example, d2 - 1:10 d2[10:12] sort(d2[10:12]) See ?sort for an explanation of how to keep NA values when you sort. Duncan Murdoch Thanks for your time and consideration, Cesar A. Hincapié, DC, MHSc Research Fellow, Division of Health Care and Outcomes Research, Toronto Western Research Institute PhD Candidate in Epidemiology, Dalla Lana School of Public Health, University of Toronto e. cesar.hinca...@utoronto.ca [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] survest() for cph() in Design package
Hi, I am trying to run a conditional logistic model on a nested case-control study using cph() and then estimate survival based on the model. The data came from Prof Bryan Langholz website where he also has the SAS code to this, so I am trying to replicate the SAS results. The data attached. Basically, the variables are: rstime: risk set age rsentry: fake entry time, just before rstime setno: risk set identifier cc: lung cancer death (0/1) cr500: independent variable 1 (0/1), cumulative radon 500WLM smoke25: independent variable 2 (0/1), 1/2+ pack/dat at age 25 logw: weight to be used as an offset in the model rstime2: indicator variable to indicate whether age is below 50, 50-69 or =70, used as strata in the model The SAS code is (if this helps to address my question/problem) proc phreg data=uminers; model rstime*cc(0)=cr500 smoke25/ entry=rsentry offset=logw rl; baseline out=ch covariates=covs cumhaz=cumhaz stdcumhaz=secumhaz lowercumhaz=lci uppercumhaz=uci/ nomean; strata rstime2; run; My R code is: library(survival) library(Design) uminers-read.table(uminers.txt,sep='\t',header=T,row.names=NULL) fit - cph(Surv(rsentry,rstime,cc)~cr500+smoke25+strat(rstime2) +offset(logw), uminers, x=T, y=T) fit Cox Proportional Hazards Model cph(formula = Surv(rsentry, rstime, cc) ~ cr500 + smoke25 + strat(rstime2) + offset(logw), data = uminers, x = T, y = T) Obs Events Model L.R. d.f. P 774258 89.98 2 0 ScoreScore P R2 82.61 0 0.211 Status Stratum No Event Event rstime2=1 10050 rstime2=2 348 174 rstime2=3 6834 coef se(coef)zp cr500 1.4910.184 8.08 6.66e-16 smoke25 0.4610.182 2.53 1.16e-02 These results are the same as what SAS reported! Next, I want to estimate the survival for stratum age 50-69 (rstime2=2), so I used the following, this is where I got the error message: survest(fit, expand.grid(cr500=c(0,1),smoke25=c(0,1),rstime2=2, logw=0), times=unique(uminers$rstime[uminers$rstime2=='2']), conf.int=.95) Error in factor(rep(1:nstrat, scount), labels = names(fit$strata)) : invalid labels; length 3 should be 1 or 2 Anyone has any suggestions what went wrong? Thanks! John setno rstime cc rsentry cr500 smoke25 logwrstime2 1 34.51 34.4999 1 1 6.315358002 1 1 34.50 34.4999 1 0 6.315358002 1 1 34.50 34.4999 0 1 6.315358002 1 2 36.42 1 36.4199 1 1 6.404125922 1 2 36.42 0 36.4199 0 0 6.404125922 1 2 36.42 0 36.4199 1 1 6.404125922 1 3 38.17 1 38.1699 1 0 6.470283375 1 3 38.17 0 38.1699 0 1 6.470283375 1 3 38.17 0 38.1699 0 1 6.470283375 1 4 39.92 1 39.9199 1 1 6.532334292 1 4 39.92 0 39.9199 0 0 6.532334292 1 4 39.92 0 39.9199 1 0 6.532334292 1 5 40 1 39. 1 1 6.532334292 1 5 40 0 39. 0 1 6.532334292 1 5 40 0 39. 1 1 6.532334292 1 6 40.08 1 40.0799 0 1 6.537657315 1 6 40.08 0 40.0799 1 1 6.537657315 1 6 40.08 0 40.0799 1 1 6.537657315 1 7 41.92 1 41.9199 1 1 6.582486713 1 7 41.92 0 41.9199 1 1 6.582486713 1 7 41.92 0 41.9199 0 0 6.582486713 1 8 42.08 1 42.0799 1 0 6.592130875 1 8 42.08 0 42.0799 1 1 6.592130875 1 8 42.08 0 42.0799 0 1 6.592130875 1 9 42.081 1 42.0809 1 1 6.593044534 1 9 42.081 0 42.0809 0 0 6.593044534 1 9 42.081 0 42.0809 1 0 6.593044534 1 10 42.42 1 42.4199 1 1 6.599870499 1 10 42.42 0 42.4199 1 1 6.599870499 1 10 42.42 0 42.4199 0 0 6.599870499 1 11 42.51 42.4999 0 1 6.598509029 1 11 42.50 42.4999 1 1 6.598509029 1 11 42.50 42.4999 0 1 6.598509029 1 12 42.92 1 42.9199 1 1 6.598054793 1 12 42.92 0 42.9199 1 1 6.598054793 1 12 42.92 0 42.9199 1 1 6.598054793 1 13 43.75 1 43.7499 1 1 6.612041035 1 13 43.75 0 43.7499 0 1 6.612041035 1 13 43.75 0 43.7499 0 1 6.612041035 1 14 44.33 1
[R] How to reference a package in academical paper
Dear, I am now writing more formal academical paper, and would like to reference an R package. Do you have any recommendation how to do it? Taking for instance the RODBC package as an example, how would the reference look like? http://cran.r-project.org/web/packages/RODBC/index.html Thank you Jan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Teaching R: To quote, or not to quote?
Duncan Murdoch murdoch.dun...@gmail.com 03/07/11 3:17 PM On 07/03/2011 9:52 AM, Muenchen, Robert A (Bob) wrote: Hi All, When I teach an intro workshop on R, I've been minimizing quote confusion by always using quotes around package names in function calls. ... I'm wondering if there's a downside to this. The only reasons I can think to avoid that recommendation is that people might find typing unnecessary quotes to be irritating and they might be confused when they see unquoted usage elsewhere. Those aren't particularly strong reasons... Agreed, especially for library and the like where . But from a user perspective most of the advantage of ?mean and ??mean over help(mean) is the reduction in typing - especially the utility of things like only typing the ? in ?rnorm(q=23) after Error in rnorm(q = 23) : unused argument(s) (q = 23). That is so useful that I'd say it's worth reinforcing by using ? without quotes where possible. *** This email and any attachments are confidential. Any use...{{dropped:8}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to reference a package in academical paper
http://www.iiap.res.in/astrostat/School07/R/html/utils/html/citation.html On Mon, Mar 7, 2011 at 4:12 PM, Jan Hornych jh.horn...@gmail.com wrote: Dear, I am now writing more formal academical paper, and would like to reference an R package. Do you have any recommendation how to do it? Taking for instance the RODBC package as an example, how would the reference look like? http://cran.r-project.org/web/packages/RODBC/index.html Thank you Jan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to reference a package in academical paper
Hi Jan, R citation('RODBC') To cite package RODBC in publications use: Brian Ripley and and from 1999 to Oct 2002 Michael Lapsley (2010). RODBC: ODBC Database Access. R package version 1.3-2. http://CRAN.R-project.org/package=RODBC A BibTeX entry for LaTeX users is @Manual{, title = {RODBC: ODBC Database Access}, author = {Brian Ripley and and from 1999 to Oct 2002 Michael Lapsley}, year = {2010}, note = {R package version 1.3-2}, url = {http://CRAN.R-project.org/package=RODBC}, } ATTENTION: This citation information has been auto-generated from the package DESCRIPTION file and may need manual editing, see help(citation) . * * On Mon, Mar 7, 2011 at 4:12 PM, Jan Hornych wrote: Dear, I am now writing more formal academical paper, and would like to reference an R package. Do you have any recommendation how to do it? Taking for instance the RODBC package as an example, how would the reference look like? http://cran.r-project.org/web/packages/RODBC/index.html Thank you Jan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] More appropriate optimization routine?
Hello! I have 2 variables - predictor pred and response variable DV: pred-c(439635.053, 222925.718, 668434.755, 194242.330, 5786.321, 115537.344, 100835.368, 7133.206, 159058.286, 4079991.629, 3380078.060, 2661279.136, 2698324.478, 1245213.965, 1901815.503, 1517019.451, 1396857.736, 1034030.988, 432249.574, 342329.325, 1831335.792, 2209578.859, 1641709.754, 1329308.669, 1251794.367, 731368.430, 1705626.983, 673535.171, 242519.280, 57251.998, 5728.821, 2054514.244, 301954.819, 773955.355, 735497.506, 347355.976, 1678175.153, 133082.395, 591326.289, 30866.182, 27235.846, 118372.342, 71590.969, 84813.299, 366146.153, 1391725.205, 763199.746, 1216661.202, 263878.157, 930832.769, 261270.130, 589303.561, 455137.946, 954655.201, 873434.054) (pred) DV-c(0.55351297,0.27616943,0.58134926,0.33887159,0.03092546,0.14928061, 0.11836759,0.01719463,0.03258188,1.81205587,2.86657699,2.49491195, 3.09727230,1.95648776,2.28106268,1.78978179,1.74003678,1.22520393, 0.54245878,0.41483039,1.08731493,2.19581289,1.60516129,1.30723431, 1.41822649,1.31530539,2.02406576,1.22211412,0.52055790,0.12975522, 0.01416903,0.61043485,0.44141748,0.64327070,0.53607039,0.32603820, 1.77261016,0.42035756,0.37853917,0.12342486,0.06607710,0.02383682, 0.08421590,0.09255332,0.23644909,1.67921092,1.26864432,1.38654574, 1.29833020,1.76873555,0.93363677,1.01857658,0.81359775,2.14758239,2.41583852) (DV) Both pred and DV above are time series (observed across 55 months). The relationship between them is pre-specified. In this relationship, the (predicted) DV at time t is a specific function of itself at time t-1, of pred at time t, and of 2 scalars - a and b. I have to find optimal a and b that would ensure the best fit between the observed DV and the predicted DV. Below is the function I have to optimize: my.function - function(param){ a-param[1] b-param[2] DV_pred - rep(0,length(pred)) for(i in 2:length(pred)){ DV_pred[i] - 1 - ( (1 - DV_pred[i-1] * a) / (exp(pred[i] * b)) ) } DV_pred[1]-DV[1] correl - cor(DV,DV_pred) return(correl) } a has to be between 0.001 and 0.75 b has to be positive. I apologize if it's a simple question for statisticians but I am not a mathematician/statistician by training. I didn't think optim would work here. The only thing I could think of is genetic optimization, for example in rgenoud below. However, I don't think I could use it for 2 reasons: (1) Solutions do not seem stable and depend on starting values, set.seed, and domains chosen; (2) It takes too long - I have to do the task I described above about 1,200 times and it would take me forever. Could someone maybe provide a pointer: what would be a faster/more efficient optimization approach for such a function as mine? Thanks a lot! library(rgenoud) genoud.opt-genoud(my.function, nvars=2, max=TRUE, pop.size=1, hard.generation.limit=TRUE, max.generations=20, wait.generations=5, starting.values=c(0.5,1), Domains=matrix(c(0.001,0.75,0.0001,2),ncol=2,byrow=T), boundary.enforcement=2) -- Dimitri Liakhovitski Ninah Consulting www.ninah.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] generate 3 distinct random samples without replacement
Cesar, I think your basic misconception is that you believe 'sample' returns a list of indices into the original vector. It does not; it returns actual elements of the vector: sample(runif(100),3) [1] 0.4492988 0.0336069 0.6948440 I'm not sure why you keep resetting the seed, but if it's important, replace d2-d1[-i] with d2- setdiff(d1,i) Otherwise Duncan's suggestion is must nicer: s = sample(d1,300,replace=FALSE) s1 = sort(s[1:100]) s2 = sort(s[101:200]) s3 = sort(s[201:300]) If what you actually need are indices into the original vector, replace d1 with length(d1). (When you say 'distinct', I'm assuming you mean 'disjoint'.) -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Duncan Murdoch Sent: Monday, March 07, 2011 3:52 PM To: Cesar Hincapié Cc: r-help@r-project.org Subject: Re: [R] generate 3 distinct random samples without replacement On 07/03/2011 2:17 PM, Cesar Hincapié wrote: Hello: I wonder if I could get a little help with random sampling in R. I have a vector of length 7375. I would like to draw 3 distinct random samples, each of length 100 without replacement. I have tried the following: d1- 1:7375 set.seed(7) i- sample(d1, 100, replace=F) s1- sort(d1[i]) s1 d2- d1[-i] set.seed(77) j- sample(d2, 100, replace=F) s2- sort(d2[j]) s2 d3- d2[-j] set.seed(777) k- sample(d3, 100, replace=F) s3- sort(d3[k]) s3 D- data.frame(a=s1,b=s2,c=s3) However, s2 is only 97 elements long, and s3, only 96 long. I would appreciate any suggestions on a better approach. I'm also curious to know why my second and third samples are less than 100 elements in length. If you want 3 non-overlapping, non-repeating samples of 100, why not draw one sample of 300, and take 3 subsets of it? The reason you were finding shorter samples is because you were using j and k as indices into vectors d2 and d3 that didn't have enough elements, and then you sorted the result, losing the NAs. For example, d2 - 1:10 d2[10:12] sort(d2[10:12]) See ?sort for an explanation of how to keep NA values when you sort. Duncan Murdoch Thanks for your time and consideration, Cesar A. Hincapié, DC, MHSc Research Fellow, Division of Health Care and Outcomes Research, Toronto Western Research Institute PhD Candidate in Epidemiology, Dalla Lana School of Public Health, University of Toronto e. cesar.hinca...@utoronto.ca [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. message may contain confidential information. If you are not the designated recipient, please notify the sender immediately, and delete the original and any copies. Any use of the message by you is prohibited. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] XYPLOT - GROUPING WITH TWO CATEGORICAL VARIABLES
David, David, David...you forgot the solid and dashed lines :) It's OK, it gives me an excuse to look at this from a slightly different angle. [The intersection() function is a *really* good trick, BTW - thanks for the reminder.] Back to the OP. Let's re-read the data: dat=data.frame(Age = rep(c(30, 40, 50), each = 8), Period = rep(2005:2008 ,6), Rate = 1:24, Sex = rep(rep(c('F', 'M'), each = 4), 3) dat$Period - factor(dat$Period, levels = c(2005:2008)) # Go one step further and create a Sex * Age interaction factor by hand, using the paste() function: dat$SexAge - with(dat, paste(Sex, Age, sep = ' ')) # Now use it in the xyplot... xyplot(Rate ~ Period, data = dat, groups = SexAge, type = 'b', lty = rep(c('solid', 'dashed'), each = 3), col = rep(1:3, 2), col.lines = rep(1:3, 2), key = list(space = 'right', title = 'Sex-Age', cex.title = 1.1, text = list(levels(dat$SexAge), cex = 0.8), lines = list(lty = rep(c('solid', 'dashed'), each = 3), col = rep(1:3, 2)) ), xlab = Period, ylab = Rate,ylim=c(0,25), main = Mortality Rates per 10 inhabitants - Women Men, scales=list(tck=c(1,0)) ) Dashed lines tend to show up better than do dotted lines. If you really want the dotted lines instead, substitute 'dotted' for 'dashed'. This way isn't any better than what David showed you, but as there are multiple ways to do things in R, creation of a new factor in the data to handle age-gender combinations is one method that allows you to exercise more control over the set of values produced, particularly if you intend to produce a legend. If you want different colors, create a vector of length 3, say c('red', 'green', 'blue') [hoping your viewer isn't color blind] and substitute it for 1:3 in the col and col.lines statements in the xyplot() call, as well as in the col argument for lines = in the key() statement. ...BTW, I echo David's editorial comments and would suggest that you read the Posting Guide..carefully. (See the bottom of this message for the link.) Dennis On Mon, Mar 7, 2011 at 6:18 AM, David Winsemius dwinsem...@comcast.netwrote: On Mar 7, 2011, at 8:37 AM, Marcos Prunello wrote: Hi! I have a dataframe like this: dat = data .frame (Age=c(rep(30,8),rep(40,8),rep(50,8)),Period=rep(seq(2005,2008,1),3),Rate=c(seq(1,8,1),seq(9,16,1),seq(17,24,1)),Sex=rep(c(rep(0,4),rep(1,4)),3))attach(dat)dat Age Period Rate Sex1 30 20051 02 30 20062 03 30 20073 04 30 20084 05 30 20055 16 30 20066 17 30 20077 18 30 20088 19 40 20059 010 40 2006 10 011 40 2007 11 012 40 2008 12 013 40 2005 13 114 40 2006 14 115 40 2007 15 116 40 2008 16 117 50 2005 17 018 50 2006 18 019 50 2007 19 020 50 2008 20 021 50 2005 21 122 50 2006 22 123 50 2007 23 124 50 2008 24 1 And I can do these separated graphs by sex: xyplot(Rate ~ Period, data=subset(dat,Sex==0), groups = Age, type = b, auto.key = list(cex=0.8,border=TRUE,size=3,cex.title=1,space = right, title=Age, points = FALSE , lines = TRUE), xlab = Period, ylab = Rate,ylim=c(0,25), main = Mortality Rates for 10 inhabitants - Men, scales=list(tck=c(1,0)) )xyplot(Rate ~ Period, data=subset(dat,Sex==1), groups = Age, type = b, auto.key = list(cex=0.8,border=TRUE,size=3,cex.title=1,space = right, title=Age, points = FALSE , lines = TRUE), xlab = Period, ylab = Rate,ylim=c(0,25), main = Mortality Rates for 10 inhabitants - Women, scales=list(tck=c(1,0)) ) BUT I WANT THEM IN THE SAME GRAPH, IN THE SAME PANEL, TO BE COMPARED, FOR EXAMPLE WITH THE SAME COLOUR FOR THE SAME AGE GROUPS,AND A COMPLETE LINE FOR SEX=0 AND A DOTTED LINE FOR SEX=1 I TRIED DIFFERENT THINGS BUT NOTHING WORKED FOR ME. I USED PANEL OPTIONS, OR XY.SUPERPOSE, BUT I DON`T KNOW HOW TO USE THOSE THINGS PROPERLY. You may need to adjust your netiquette. All caps is considered shouting (not to mention more difficult to read). Try: xyplot(Rate ~ Period, data=dat, groups = interaction(Age, Sex), type = b, auto.key = list(cex=0.8,border=TRUE,size=3,cex.title=1, space = right, title=Age.Sex, points = FALSE , lines = TRUE), xlab = Period, ylab = Rate,ylim=c(0,25), main = Mortality Rates for 10 inhabitants - Women Men, scales=list(tck=c(1,0)) ) THANK YOU!!! MARCOS (ARGENTINA) [[alternative HTML version deleted]] You should also learn to post in plain text. That way you linefeeds won't disappear. -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list
[R] postscript cuts off left margin
I am using the following commands: postscript(file=test.eps,paper=special,width=6,height=6,horizontal=FALSE) # fake data x - c(12,13,14) y - c(41,42,43) plot(x,y,type=n,xlab=expression(paste(log ,nu[peak], [Hz],sep=)),ylab=expression(paste(log ,L[peak], [,ergs, ,s^-1,],sep=)),ylim=c(41,48),yaxt=n,xaxt=n,xlim=c(12,18),cex.lab=1.3) axis(2,at=c(41,42,43,44,45,46,47,48),labels=expression(41,42,43,44,45,46,47,48),las=1,cex.axis=1.3) axis(1,at=c(12,13,14,15,16,17),labels=expression(12,13,14,15,16,17),cex.axis=1.3) dev.off() The generated output cuts off the left-hand margin. Unfortunately you cannot fix this after the fact with bounding box as it is already set at llx = 0. I have looked through the help for postscript output in R, but I don't see anything which allows me to set the margins. The page size (6x6) is chosen because it yields a figure which has nice proportions for my paper. Any help appreciated. Eileen -- Eileen Meyer Physics Astronomy MS-108 Rice University 6100 Main St. Houston, TX 77005 www.interfolio.com/portfolio/eileenmeyer __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] postscript cuts off left margin
I found the answer, sorry for not waiting longer before asking. For anyone reading the archives, inserting par(mar=c(5,4,4,2)+0.5) should alleviate the problem (default is +0.1). In general, help(par) is a good thing to check for graphical issues. On 03/07/2011 04:53 PM, Eileen Meyer wrote: I am using the following commands: postscript(file=test.eps,paper=special,width=6,height=6,horizontal=FALSE) # fake data x - c(12,13,14) y - c(41,42,43) plot(x,y,type=n,xlab=expression(paste(log ,nu[peak], [Hz],sep=)),ylab=expression(paste(log ,L[peak], [,ergs, ,s^-1,],sep=)),ylim=c(41,48),yaxt=n,xaxt=n,xlim=c(12,18),cex.lab=1.3) axis(2,at=c(41,42,43,44,45,46,47,48),labels=expression(41,42,43,44,45,46,47,48),las=1,cex.axis=1.3) axis(1,at=c(12,13,14,15,16,17),labels=expression(12,13,14,15,16,17),cex.axis=1.3) dev.off() The generated output cuts off the left-hand margin. Unfortunately you cannot fix this after the fact with bounding box as it is already set at llx = 0. I have looked through the help for postscript output in R, but I don't see anything which allows me to set the margins. The page size (6x6) is chosen because it yields a figure which has nice proportions for my paper. Any help appreciated. Eileen -- Eileen Meyer Physics Astronomy MS-108 Rice University 6100 Main St. Houston, TX 77005 www.interfolio.com/portfolio/eileenmeyer __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] generate 3 distinct random samples without replacement
Thank you all for your helpful comments and suggestions. Both proper indexing and subsetting a random sample of 300 work well. Best wishes, Cesar On 2011-03-07, at 5:31 PM, rex.dw...@syngenta.com rex.dw...@syngenta.com wrote: Cesar, I think your basic misconception is that you believe 'sample' returns a list of indices into the original vector. It does not; it returns actual elements of the vector: sample(runif(100),3) [1] 0.4492988 0.0336069 0.6948440 I'm not sure why you keep resetting the seed, but if it's important, replace d2-d1[-i] with d2- setdiff(d1,i) Otherwise Duncan's suggestion is must nicer: s = sample(d1,300,replace=FALSE) s1 = sort(s[1:100]) s2 = sort(s[101:200]) s3 = sort(s[201:300]) If what you actually need are indices into the original vector, replace d1 with length(d1). (When you say 'distinct', I'm assuming you mean 'disjoint'.) -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Duncan Murdoch Sent: Monday, March 07, 2011 3:52 PM To: Cesar Hincapié Cc: r-help@r-project.org Subject: Re: [R] generate 3 distinct random samples without replacement On 07/03/2011 2:17 PM, Cesar Hincapié wrote: Hello: I wonder if I could get a little help with random sampling in R. I have a vector of length 7375. I would like to draw 3 distinct random samples, each of length 100 without replacement. I have tried the following: d1- 1:7375 set.seed(7) i- sample(d1, 100, replace=F) s1- sort(d1[i]) s1 d2- d1[-i] set.seed(77) j- sample(d2, 100, replace=F) s2- sort(d2[j]) s2 d3- d2[-j] set.seed(777) k- sample(d3, 100, replace=F) s3- sort(d3[k]) s3 D- data.frame(a=s1,b=s2,c=s3) However, s2 is only 97 elements long, and s3, only 96 long. I would appreciate any suggestions on a better approach. I'm also curious to know why my second and third samples are less than 100 elements in length. If you want 3 non-overlapping, non-repeating samples of 100, why not draw one sample of 300, and take 3 subsets of it? The reason you were finding shorter samples is because you were using j and k as indices into vectors d2 and d3 that didn't have enough elements, and then you sorted the result, losing the NAs. For example, d2 - 1:10 d2[10:12] sort(d2[10:12]) See ?sort for an explanation of how to keep NA values when you sort. Duncan Murdoch Thanks for your time and consideration, Cesar A. Hincapié, DC, MHSc Research Fellow, Division of Health Care and Outcomes Research, Toronto Western Research Institute PhD Candidate in Epidemiology, Dalla Lana School of Public Health, University of Toronto e. cesar.hinca...@utoronto.ca [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. message may contain confidential information. If you are not the designated recipient, please notify the sender immediately, and delete the original and any copies. Any use of the message by you is prohibited. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] attr question
Erin You could use as.vector(t.test(buzz$var1, conf.level=.98)$conf.int) Bill Venables. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Erin Hodgess Sent: Monday, 7 March 2011 3:12 PM To: R help Subject: [R] attr question Dear R People: When I want to produce a small sample confidence interval using t.test, I get the following: t.test(buzz$var1, conf.level=.98)$conf.int [1] 2.239337 4.260663 attr(,conf.level) [1] 0.98 How do I keep the attr statement from printing, please? I'm sure it's something really simple. Thanks, Sincerely, Erin -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] compiling r with icc
I am using, http://www.rd.dnc.ac.jp/~otsu/lecture/RwithMKL.html to compile R 2.10 . Everything compiles fine but I was wondering if there are any more optimizations I can do with the flags. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] linear mixed model with nested factors
David Dudek david.dudek at student.umb.no writes: Hi R-help. I am trying to run a linear mixed model with nested factors with either lme or lmer and I am having no luck obtaining the same results as Minitab. Here is Minitab's code: MTB GLM 'count' = site year replicate(site year) site*year; SUBC Random 'year' 'replicate'; Can you tell me how to code this in R? Guessing: lmer(count~site+(site|year)+(1|replicate:site:year),data=...) This assumes (1) you're willing to treat your data as normally distributed (see previous comments) (2) you really have multiple samples within each value of 'replicate' (otherwise 'replicate' will be confounded with your residual error) (3) every site is measured in more than one year (most or all years if you want decent power) I'm having a bit of a hard time figuring out what you want to do with 'replicate'. Are replicates coded uniquely, or within sites? If you have multiple samples per replicate, and they are strictly nested, it will probably be easiest to take the averages for each replicate -- then your model would reduce to avgcount~site+(site|year), and the residual error would be the among-replicate within site-year variation. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] More appropriate optimization routine?
Dimitri Liakhovitski dimitri.liakhovitski at gmail.com writes: I have 2 variables - predictor pred and response variable DV: pred-c(439635.053, 222925.718, 668434.755, 194242.330, 5786.321, 115537.344, 100835.368, 7133.206, 159058.286, 4079991.629, 3380078.060, 2661279.136, 2698324.478, 1245213.965, 1901815.503, 1517019.451, 1396857.736, 1034030.988, 432249.574, 342329.325, 1831335.792, 2209578.859, 1641709.754, 1329308.669, 1251794.367, 731368.430, 1705626.983, 673535.171, 242519.280, 57251.998, 5728.821, 2054514.244, 301954.819, 773955.355, 735497.506, 347355.976, 1678175.153, 133082.395, 591326.289, 30866.182, 27235.846, 118372.342, 71590.969, 84813.299, 366146.153, 1391725.205, 763199.746, 1216661.202, 263878.157, 930832.769, 261270.130, 589303.561, 455137.946, 954655.201, 873434.054) (pred) DV-c(0.55351297,0.27616943,0.58134926,0.33887159,0.03092546,0.14928061, 0.11836759,0.01719463,0.03258188,1.81205587,2.86657699,2.49491195, 3.09727230,1.95648776,2.28106268,1.78978179,1.74003678,1.22520393, 0.54245878,0.41483039,1.08731493,2.19581289,1.60516129,1.30723431, 1.41822649,1.31530539,2.02406576,1.22211412,0.52055790,0.12975522, 0.01416903,0.61043485,0.44141748,0.64327070,0.53607039,0.32603820, 1.77261016,0.42035756,0.37853917,0.12342486,0.06607710,0.02383682, 0.08421590,0.09255332,0.23644909,1.67921092,1.26864432,1.38654574, 1.29833020,1.76873555,0.93363677,1.01857658,0.81359775,2.14758239,2.41583852) (DV) Both pred and DV above are time series (observed across 55 months). The relationship between them is pre-specified. In this relationship, the (predicted) DV at time t is a specific function of itself at time t-1, of pred at time t, and of 2 scalars - a and b. I have to find optimal a and b that would ensure the best fit between the observed DV and the predicted DV. Below is the function I have to optimize: my.function - function(param){ a-param[1] b-param[2] DV_pred - rep(0,length(pred)) for(i in 2:length(pred)){ DV_pred[i] - 1 - ( (1 - DV_pred[i-1] * a) / (exp(pred[i] * b)) ) } DV_pred[1]-DV[1] correl - cor(DV,DV_pred) return(correl) } a has to be between 0.001 and 0.75 b has to be positive. Rather than worry about optimization routines, I think you need to think more carefully about what your objective function is actually doing. I played around with this a bit and got something reasonable. You only have two parameters, so it shouldn't be too hard to figure out what's going on by appropriate exploration. matplot(cbind(pred,DV),log=y) ## split objective function into two parts, one that computes ## the predicted value ... calc_DV_pred - function(a,b) { DV_pred - rep(0,length(pred)) DV_pred[1]-DV[1] for(i in 2:length(pred)){ DV_pred[i] - 1 - ( (1 - DV_pred[i-1] * a) / (exp(pred[i] * b)) ) } DV_pred } ## ... and the other (wrapper) to compute the correlation ## I switched to estimating b on a logarithmic scale my.function - function(param){ a-param[1] b-exp(param[2]) correl - cor(DV,calc_DV_pred(a,b)) return(correl) } ## try out the function for various values my.function(c(0.5,-5)) my.function(c(0.5,-6)) my.function(c(0.5,-9)) my.function(c(0.5,1.1)) my.function(c(0.5,1.2)) ## try to fit opt1 - optim(fn=my.function,par=c(a=0.5,b=-9), method=L-BFGS-B, lower=c(0.001,-17), upper=c(0.75,Inf), control=list(fnscale=-1)) ## look at objective function surface library(emdbook) cc - curve3d(my.function(c(x,y)),xlim=c(0.001,0.75),ylim=c(-18,1), n=c(50,50),sys3d=contour) cc2 - curve3d(my.function(c(x,y)),xlim=c(0.001,0.75),ylim=c(-16,-12), n=c(50,50),sys3d=contour) points(opt1$par[1],opt1$par[2]) DV_pred - calc_DV_pred(opt1$par[1],exp(opt1$par[2])) matplot(cbind(pred,DV,DV_pred),log=y,type=l,col=c(1,2,4)) In hindsight, my initial difficulty (and possibly yours as well) was starting with a value of b that was much too large. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] still a problem remainingRE: Data lebals xylattice plot: RE: displaying label meeting condition (i.e. significant, i..e p value less than 005) in plot function
Hi: This works only because all of the observations you want to label are in one panel - it is not a general solution. I used the layer() function from the latticeExtra package for this: library(lattice) library(latticeExtra) xyplot(p ~ xvar|chr, data=dataf, panel = function(x, y, ...) { panel.xyplot(x, y, ...) panel.abline(h=0.01, col=red) }) + layer(with(pvals, panel.text(xvar, p, lab = name)), packets = 5) If you try this on the subset where p 0.05 rather than p 0.01, you'll see what I mean about the solution not being general. I believe a more general solution will require you to use subscripts inside your panel function for panel.text() to distribute the text labels properly across panels, because the chr in your pvals data frame needs to match the packet number (= panel number in this case). Dennis On Mon, Mar 7, 2011 at 5:26 PM, Umesh Rosyara rosyar...@gmail.com wrote: Hi Lattice Users I have been working to fix this problem, still I am not able to solve fully. I could label those names that have pvalue less than 0.01 but still the label appears in all compoent plots eventhough those who do have the pvalue ! How can I implement it successuflly to grouped data like mine. You help is highly appreciated. #my data name - c(paste (M, 1:1000, sep = )) xvar - seq(1, 1, 10) chr - c(rep(1,200),rep(2,200), rep(3,200), rep(4,200), rep(5,200)) set.seed(134) p - rnorm(1000, 0.15,0.05) dataf - data.frame(name,xvar, chr, p) dataf$chr - as.factor(dataf$chr) # lattice plot: As far as I can go now ! little progress but final push required ! require(lattice) pvals - dataf[dataf$p 0.01,] p1 - pvals$p n1 - pvals$name xv1 - pvals$xvar xyplot(p ~ xvar|chr, data=dataf, panel = function(x, y) { panel.xyplot(x, y) panel.abline(h=0.01, col=red) panel.text(xv1, p1, n1, col=green2) }) Thank you in advance. Best Regards Umesh R ** ** ** -- *From:* Bert Gunter [mailto:gunter.ber...@gene.com] *Sent:* Sunday, March 06, 2011 10:50 AM *To:* Umesh Rosyara *Cc:* Jorge Ivan Velez; Dennis Murphy; sarah.gos...@gmail.com; R mailing list *Subject:* Re: [R] Data lebals xylattice plot: RE: displaying label meeting condition (i.e. significant, i..e p value less than 005) in plot function This is easy to do by specifying xyplot's panel function. Assuming only one panel -- otherwise you need to pass the subscripts arguments to choose the values belonging to the panel -- somethings like: xyplot(y~x, pvals = pvals,..., ## pvals is your vector of small p values with e.g. NA's elsewhere panel = function(x,y, pvals,...) { panel.xyplot(...) panel.text((x,y, pvals,...) } ) This is obviously just a sketch and will not work as written. So please read the Help page on xyplot carefully and perhaps also Deepayan's book on trellis graphics -- there are also undoubtedly online resources: search on trellis graphics tutorial or some such. This is not hard, but there are some details that you will need to master,especially regarding argument passing. Another alternative is to use the layer() function in the latticeExtra package instead. Consult the documentation there for details. Cheers, Bert On Sun, Mar 6, 2011 at 5:17 AM, Umesh Rosyara rosyar...@gmail.com wrote: Dear Jorge, Dennis, Sarah and R-experts. Thank for helping me. As you mentioned it is difficult apply in lattice in this situation. Unless, there is a possibility, I would try to use lattice. The major reason toward this is- my ultimate solution might be better of in lattice as I have a classificatory variable to make similar graph for each caterogory in the lattice graph. Lattice cleates nice stacked xyplots. p ~ xvar | chr # require plots by the factor variable chr # with a classificatory variable name - c(paste (M, 1:1000, sep = )) xvar - seq(1, 1, 10) chr - c(rep(1,200),rep(2,200), rep(3,200), rep(4,200), rep(5,200)) set.seed(134) p - rnorm(1000, 0.15,0.05) dataf - data.frame(name,xvar, chr, p) dataf$chr - as.factor(dataf$chr) # lattice plot: As far as I can go now ! require(lattice) xyplot(pval ~ xvar1|chr, dataf) Best Regards Umesh R _ From: Jorge Ivan Velez [mailto:jorgeivanve...@gmail.comjorgeivanve...@gmail.com ] Sent: Sunday, March 06, 2011 12:22 AM To: Umesh Rosyara Cc: R mailing list Subject: Re: [R] displaying label meeting condition (i.e. significant, i..e p value less than 005) in plot function Hi Umesh, You can try something along the lines of: d - dataf[dataf$p 0.05, ] # p 0.05 with(d, plot(xvar, p, col = 'white')) with(d, text(xvar, p, name, cex = .7)) HTH, Jorge On Sat, Mar 5, 2011 at 12:29 PM, Umesh Rosyara wrote: Dear R users, Here is my problem: # example data name - c(paste (M, 1:1000, sep = )) xvar - seq(1,
[R] beamer overlays with Sweave?
This may be asking too much, but I'm wondering if anyone has a solution (even a hack) for creating multiple (overlay) plots in an Sweave file and post-processing the overlays in beamer appropriately. For example, suppose I have a series of figure blocks in my .Rnw file: plot1,fig=TRUE= [stuff] @ plot2,fig=TRUE= [stuff] @ plot3,fig=TRUE= [stuff] @ These three blocks create three figures that I want to have appear as a series of overlays in the final PDF file. Sweave outputs the following LaTeX code: \includegraphics{plot1} \includegraphics{plot2} \includegraphics{plot3} and I need to turn it into \only1{\includegraphics{plot1}} \only2{\includegraphics{plot2}} \only3{\includegraphics{plot3}} I have few enough of these that I've been modifying them by hand (couldn't easily come up with the appropriate sed/awk incantation); it works, but it's annoying and error-prone. I could spend some more time hacking it, but I wondered if anyone else had already solved this ... (Brief googling of beamer+Sweave+overlay didn't find an obvious answer, but I might have missed something ...) thanks Ben Bolker __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Parsing question, partly comma separated partly underscore separated string
Thanks to Gabor Grothendieck and Dennis Murphy I can now solve first part of my problem and already impress my colleagues with the R-program below (I know it could be written in a smarter way, but I am learning). It reads my partly comma separated partly underscore separated string and cleans it up in a very need way. Regardless of my inability to write tight code I moved on to the second part of my quest, to put it all in to a loop to be able to loop over my approximately 100 .txt files in /usr2/username/data/ I got started with list.files() and my loop is more or less working, but I got stuck on the last cbind part. Is there a friendly R-hacker out there that would be willing to take a look at my loop below*2? Thanks, Eric ### #### ## The answer to the first part of my question ## #### ### Line - readLines(file(/usr2/efail/data/example.txt)) s - strsplit(Line, ZZ_)[[1]] s2 - sub(BLOCK.*, BLOCK, s) s3 - sub(@9z.svg, , s2) s4 - gsub(_, ,, s3) s5 - read.table(textConnection(s4[1]), sep = ,) DF - read.table(textConnection(s4), skip = 1, sep = ,, as.is = TRUE) DF$block - head(cumsum(c(, DF$V8) == BLOCK)+1, -1) DF$run - ave(DF$block, DF$block, FUN = seq_along) DF$V8 - NULL names(DF) - c(IngNam, Tx, Ty, Treatment, x, y, Y, BLOCK, RUN) DF$ID - s5$V1 DF # ## ## ## The PARTLY WORKING loop## ## ## # fname - list.files(/usr2/efail/data,pattern=.txt, full.names = TRUE, recursive =TRUE, ignore.case = TRUE) for (sp in 1:length(fname)) { Line - readLines(file(fname[sp])) s - strsplit(Line, ZZ_)[[1]] s2 - sub(BLOCK.*, BLOCK, s) s3 - sub(@9z.svg, , s2) s4 - gsub(_, ,, s3) s5 - read.table(textConnection(s4[1]), sep = ,) DF - read.table(textConnection(s4), skip = 1, sep = ,, as.is = TRUE) DF$block - head(cumsum(c(, DF$V8) == BLOCK)+1, -1) DF$run - ave(DF$block, DF$block, FUN = seq_along) DF$V8 - NULL names(DF) - c(IngNam, Tx, Ty, Treatment, x, y, Y, BLOCK, RUN) DF$ID - s5$V1 FINAL.DF - cbind(DF… ## This is where I got stuck. } On Mon, Mar 7, 2011 at 8:18 AM, Gabor Grothendieck ggrothendi...@gmail.com wrote: On Sun, Mar 6, 2011 at 10:13 PM, Eric Fail eric.f...@gmx.com wrote: Dear R-list, I have a partly comma separated partly underscore separated string that I am trying to parse into R. Furthermore I have a bunch of them, and they are quite long. I have now spent most of my Sunday trying to figure this out and thought I would try the list to see if someone here would be able to get me started. My data structure looks like this, (in a example.txt file) Subject ID,ExperimentName,2010-04-23,32:34:23,Version 0.4, 640 by 960 pixels, On Device M, M, 3.2.4,zz_373_462_488_...@9z.svg,592,820,3.35,zz_032_288_436_...@9z.svg,332,878,3.66,zz_384_204_433_...@9z.svg,334,824,3.28,zz_365_575_683_...@9z.svg,598,878,3.50,zz_005_480_239_...@9z.svg,630,856,8.03,zz_030_423_394_...@9z.svg,98,846,4.09,zz_033_596_398_...@9z.svg,636,902,3.28,zz_263_064_320_...@9z.svg,570,894,1.26,bl...@9z.svg,322,842,32.96,zz_004_088_403_...@9z.svg,606,908,3.32,zz_703_546_434_...@9z.svg,624,934,2.58,zz_712_348_543_...@9z.svg,20,828,5.36,zz_005_48_239_...@9z.svg,580,830,4.36,zz_310_444_623_...@9z.svg,586,806,0.08,zz_030_423_394_...@9z.svg,350,854,3.84,zz_340_382_539_...@9z.svg,570,894,1.26,bl...@9z.svg,542,840,4.44,zz_345_230_662_...@9z.svg,632,844,2.47,zz_006_335_309_...@9z.svg,96,930,3.63,zz_782_346_746_...@9z.svg,306,850,2.58,zz_334_200_333_...@9z.svg,304,842,3.34,zz_383_506_726_...@9z.svg,622,884,3.84,zz_294_360_448_...@9z.svg,90,858,3.56,zz_334_335_473_...@9z.svg,570,894,1.26,bl...@9z.svg,320,852,4.04, (end of example.txt file) The above is approximate 5% of the length of a full file, and then I got about 100 of them. Please note that the strings end with a comma. I am trying to parse it into something like this ID ImgNam BLOCK RUN Tx Ty Treatment x y Y Subject ID 373 1 1 462 488 TRT 592 820 3.35 Subject ID 32 1 2 288 436 CON 332 878 3.66 Subject ID 384 1 3 204 433 TRT 334 824 3.28 Subject ID 365 1 4 575 683 TRT 598 878 3.5 Subject ID 5 1 5 480 239 CON 630 856 8.03 Subject ID 30 1 6 423 394 CON 98 846 4.09 Subject ID 33 1 7 596 398 CON 636 902 3.28 Subject ID 263 1 8 64 320 TRT 570 894 1.26 Subject ID 4 2 1 88 403 CON 606 908 3.32 Subject ID 703 2 2 546 434 CON 624 934 2.58 Subject ID 712 2 3 348 543 CON 20 828 5.36 Subject ID 5 2 4 48 239 CON 580 830 4.36 Subject ID 310 2 5 444 623 TRT 586 806 0.08 Subject ID 30 2 6 423 394 CON 350 854 3.84 Subject ID 340 2 7 382 539 TRT 570 894 1.26 Subject ID 345 3 1 230 662 TRT 632 844 2.47 Subject ID 6 3 2 335 309 CON 96 930 3.63 Subject ID 782 3 3 346
[R] extract rows with unique values from data.frame
Hello! I have the data frame pop: xloc yloc gonad indEneW Agent 123 20 516.74 1 0.02 20.21 0.25 223 20 1143.20 1 0.02 20.21 0.50 321 19 250.00 1 0.02 20.21 0.25 422 15 251.98 1 0.02 18.69 0.25 524 18 598.08 1 0.02 18.69 0.25 And I need to extract the number of rows that have unique (xloc, yloc) values. For example I need 2 unique cells (xloc,yloc) so the number of rows to extract would be 3 as follows: xloc yloc gonad indEneW Agent 123 20 516.74 1 0.02 20.21 0.25 223 20 1143.20 1 0.02 20.21 0.50 321 19 250.00 1 0.02 20.21 0.25 Thanks for any help! Nico __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] ok to use glht() when interaction is NOT significant?
Hi, let's say I have a simple ANOVA model with 2 factors A (level A1 and A2) and B (level B1 and B2) and their interaction: aov(y~A*B, data=dat) It turns out that the interaction term is not significant (e.g. P value = 0.2), but if I used glht() to compare A1 vs. A2 within each level of B, I found that the comparison is not significant when B=B1, but is very significant (P0.01) when B=B2. My question is whether it's legal to do this post-hoc comparison when the interaction is NOT significant? Can I still make the claim that there is a significant difference between A1 and A2 when B=B2? Thanks John [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] extract rows with unique values from data.frame
Here is possibly one method (if I have understood you correctly): con - textConnection( + xloc yloc gonad indEneW Agent + 123 20 516.74 1 0.02 20.21 0.25 + 223 20 1143.20 1 0.02 20.21 0.50 + 321 19 250.00 1 0.02 20.21 0.25 + 422 15 251.98 1 0.02 18.69 0.25 + 524 18 598.08 1 0.02 18.69 0.25 + ) pop - read.table(con, header = TRUE) close(con) i - with(pop, cumsum(!duplicated(cbind(xloc, yloc k - 2 ## how many do you want? no - min(which(i == k)) pop[1:no, ] xloc yloc gonad ind Ene W Agent 1 23 20 516.74 1 0.02 20.21 0.25 2 23 20 1143.20 1 0.02 20.21 0.50 3 21 19 250.00 1 0.02 20.21 0.25 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Nicolas Gutierrez Sent: Tuesday, 8 March 2011 1:52 PM To: R help Subject: [R] extract rows with unique values from data.frame Hello! I have the data frame pop: xloc yloc gonad indEneW Agent 123 20 516.74 1 0.02 20.21 0.25 223 20 1143.20 1 0.02 20.21 0.50 321 19 250.00 1 0.02 20.21 0.25 422 15 251.98 1 0.02 18.69 0.25 524 18 598.08 1 0.02 18.69 0.25 And I need to extract the number of rows that have unique (xloc, yloc) values. For example I need 2 unique cells (xloc,yloc) so the number of rows to extract would be 3 as follows: xloc yloc gonad indEneW Agent 123 20 516.74 1 0.02 20.21 0.25 223 20 1143.20 1 0.02 20.21 0.50 321 19 250.00 1 0.02 20.21 0.25 Thanks for any help! Nico __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] extract rows with unique values from data.frame
Dear Bill.. .great, thanks for your quick response. Cheers Nico On 3/7/2011 8:16 PM, bill.venab...@csiro.au wrote: i- with(pop, cumsum(!duplicated(cbind(xloc, yloc k- 2 ## how many do you want? no- min(which(i == k)) pop[1:no, ] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] still a problem remainingRE: Data lebals xylattice plot: RE: displaying label meeting condition (i.e. significant, i..e p value less than 005) in plot function
As I believe I already told you in my original reply, you have to make use of the subscripts argument in the panel function to subscript the P values etc. vector to be plotted in each panel. Something like: (untested) panel = function(x, y,subscripts,...) { panel.xyplot(x, y,...) panel.abline(h=0.01, col=red) panel.text(xv1[subscripts], p1[subscripts], n1[subscripts], col=green2) } Also,in future, please send plain text email, as requested in the guide. Your message was in an annoying blue font in my gmail reader. Cheers, Bert On Mon, Mar 7, 2011 at 5:26 PM, Umesh Rosyara rosyar...@gmail.com wrote: Hi Lattice Users I have been working to fix this problem, still I am not able to solve fully. I could label those names that have pvalue less than 0.01 but still the label appears in all compoent plots eventhough those who do have the pvalue ! How can I implement it successuflly to grouped data like mine. You help is highly appreciated. #my data name - c(paste (M, 1:1000, sep = )) xvar - seq(1, 1, 10) chr - c(rep(1,200),rep(2,200), rep(3,200), rep(4,200), rep(5,200)) set.seed(134) p - rnorm(1000, 0.15,0.05) dataf - data.frame(name,xvar, chr, p) dataf$chr - as.factor(dataf$chr) # lattice plot: As far as I can go now ! little progress but final push required ! require(lattice) pvals - dataf[dataf$p 0.01,] p1 - pvals$p n1 - pvals$name xv1 - pvals$xvar xyplot(p ~ xvar|chr, data=dataf, panel = function(x, y) { panel.xyplot(x, y) panel.abline(h=0.01, col=red) panel.text(xv1, p1, n1, col=green2) }) Thank you in advance. Best Regards Umesh R From: Bert Gunter [mailto:gunter.ber...@gene.com] Sent: Sunday, March 06, 2011 10:50 AM To: Umesh Rosyara Cc: Jorge Ivan Velez; Dennis Murphy; sarah.gos...@gmail.com; R mailing list Subject: Re: [R] Data lebals xylattice plot: RE: displaying label meeting condition (i.e. significant, i..e p value less than 005) in plot function This is easy to do by specifying xyplot's panel function. Assuming only one panel -- otherwise you need to pass the subscripts arguments to choose the values belonging to the panel -- somethings like: xyplot(y~x, pvals = pvals,..., ## pvals is your vector of small p values with e.g. NA's elsewhere panel = function(x,y, pvals,...) { panel.xyplot(...) panel.text((x,y, pvals,...) } ) This is obviously just a sketch and will not work as written. So please read the Help page on xyplot carefully and perhaps also Deepayan's book on trellis graphics -- there are also undoubtedly online resources: search on trellis graphics tutorial or some such. This is not hard, but there are some details that you will need to master,especially regarding argument passing. Another alternative is to use the layer() function in the latticeExtra package instead. Consult the documentation there for details. Cheers, Bert On Sun, Mar 6, 2011 at 5:17 AM, Umesh Rosyara rosyar...@gmail.com wrote: Dear Jorge, Dennis, Sarah and R-experts. Thank for helping me. As you mentioned it is difficult apply in lattice in this situation. Unless, there is a possibility, I would try to use lattice. The major reason toward this is- my ultimate solution might be better of in lattice as I have a classificatory variable to make similar graph for each caterogory in the lattice graph. Lattice cleates nice stacked xyplots. p ~ xvar | chr # require plots by the factor variable chr # with a classificatory variable name - c(paste (M, 1:1000, sep = )) xvar - seq(1, 1, 10) chr - c(rep(1,200),rep(2,200), rep(3,200), rep(4,200), rep(5,200)) set.seed(134) p - rnorm(1000, 0.15,0.05) dataf - data.frame(name,xvar, chr, p) dataf$chr - as.factor(dataf$chr) # lattice plot: As far as I can go now ! require(lattice) xyplot(pval ~ xvar1|chr, dataf) Best Regards Umesh R _ From: Jorge Ivan Velez [mailto:jorgeivanve...@gmail.com] Sent: Sunday, March 06, 2011 12:22 AM To: Umesh Rosyara Cc: R mailing list Subject: Re: [R] displaying label meeting condition (i.e. significant, i..e p value less than 005) in plot function Hi Umesh, You can try something along the lines of: d - dataf[dataf$p 0.05, ] # p 0.05 with(d, plot(xvar, p, col = 'white')) with(d, text(xvar, p, name, cex = .7)) HTH, Jorge On Sat, Mar 5, 2011 at 12:29 PM, Umesh Rosyara wrote: Dear R users, Here is my problem: # example data name - c(paste (M, 1:1000, sep = )) xvar - seq(1, 1, 10) set.seed(134) p - rnorm(1000, 0.15,0.05) dataf - data.frame(name,xvar, p) plot (dataf$xvar,p) abline(h=0.05) # I can know which observation number is less than 0.05 which (dataf$p 0.05) [1] 12 20 80 269 272 338 366 368 397 403 432 453 494 543 592 691 723 789 811 [20] 854 891 931 955 I want to display (label) corresponding names on the plot above: means that
Re: [R] Replace for loop when vector calling itself
Hi, Thanks for your replies. In summary: 1. Replace code with c(0, cumsum(diff(theoP)) ). This is indeed correct and I had not realized it !! d = vector(mode = numeric, length= len) d[1] = 0 if (len1) for (i in 2:len) { d[i] = d[i-1] + theoP[i] - theoP[i-1] } 2. How to create recursive vector, eg. vector = previous vector * new_data ... suggested to look at the filter() function. Thanks for your replies. Cheers, Chris -- View this message in context: http://r.789695.n4.nabble.com/Replace-for-loop-when-vector-calling-itself-tp3338383p3339938.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] MCMC_glm models
Hello, I am trying to run multiple glm models for a dataset and need some help First, i generated a matrix of abundance for 1 populations based on the mean and variance of my dataset X - replicate(1, rpois(50, 9.244655)) and entered the years as row names Y - c(1960:2009) rownames(X)-Y Now my issue is that I want to run a glm on each of those columns. However I cannot just run glm(X~Y) Error: (subscript) logical subscript too long I know I can run individual column glms X_glm - glm(X[,1]~Y) but would rather not do that 1 times. Any suggestions? Thank you for any help. Nick. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] This is supposed to predict a time series?!
Hello, I just ran the predict.StructTS function using the AirPassengers data and got a ridiculous result. Here's what I ended up with: http://24.210.155.111/PredictWhat!.pdf Who wrote this? Am I seriously supposed to think this function would accurately predict a time series? -AnalogKid __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] what controls the limits of x axis in hclust
Hi all, I followed the example in http://www.statmethods.net/advstats/cluster.html and apply it to one of my own dataset, I got this tiny problem with the boreders, http://r.789695.n4.nabble.com/file/n3340238/temp.png The red rectangle are 'outside' the plot, so I want to know how do I control the 'xlim' in plot(hclust) Thanks. casper -- View this message in context: http://r.789695.n4.nabble.com/what-controls-the-limits-of-x-axis-in-hclust-tp3340238p3340238.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Rmpi fails to install
I try to install Rmpi as root with install.packages(Rmpi). It fails with: ... checking for x86_64-pc-linux-gnu-gcc -std=gnu99 option to accept ISO C89... none needed I am here /usr and it is OpenMPI Trying to find mpi.h ... Found in /usr/include Trying to find libmpi.so or libmpich.a ... Found libmpi in /usr/lib checking for openpty in -lutil... yes checking for main in -lpthread... yes configure: creating ./config.status config.status: creating src/Makevars ** libs x86_64-pc-linux-gnu-gcc -std=gnu99 -I/usr/lib64/R/include -DPACKAGE_NAME=\\ -DPACKAGE_TARNAME=\\ -DPACKAGE_VERSION=\\ -DPACKAGE_STRING=\\ -DPACKAGE_BUGREPORT=\\ -I/usr/include -DMPI2 -DOPENMPI -I/usr/local/include-fpic -march=nocona -O2 -pipe -fomit-frame-pointer -c RegQuery.c -o RegQuery.o x86_64-pc-linux-gnu-gcc -std=gnu99 -I/usr/lib64/R/include -DPACKAGE_NAME=\\ -DPACKAGE_TARNAME=\\ -DPACKAGE_VERSION=\\ -DPACKAGE_STRING=\\ -DPACKAGE_BUGREPORT=\\ -I/usr/include -DMPI2 -DOPENMPI -I/usr/local/include-fpic -march=nocona -O2 -pipe -fomit-frame-pointer -c Rmpi.c -o Rmpi.o x86_64-pc-linux-gnu-gcc -std=gnu99 -I/usr/lib64/R/include -DPACKAGE_NAME=\\ -DPACKAGE_TARNAME=\\ -DPACKAGE_VERSION=\\ -DPACKAGE_STRING=\\ -DPACKAGE_BUGREPORT=\\ -I/usr/include -DMPI2 -DOPENMPI -I/usr/local/include-fpic -march=nocona -O2 -pipe -fomit-frame-pointer -c conversion.c -o conversion.o x86_64-pc-linux-gnu-gcc -std=gnu99 -I/usr/lib64/R/include -DPACKAGE_NAME=\\ -DPACKAGE_TARNAME=\\ -DPACKAGE_VERSION=\\ -DPACKAGE_STRING=\\ -DPACKAGE_BUGREPORT=\\ -I/usr/include -DMPI2 -DOPENMPI -I/usr/local/include-fpic -march=nocona -O2 -pipe -fomit-frame-pointer -c internal.c -o internal.o x86_64-pc-linux-gnu-gcc -std=gnu99 -shared -Wl,-O1 -Wl,--as-needed -o Rmpi.so RegQuery.o Rmpi.o conversion.o internal.o -L/usr/lib -lmpi -lutil -lpthread -L/usr/lib64/R/lib -lR installing to /usr/lib64/R/library/Rmpi/libs ** R ** demo ** inst ** preparing package for lazy loading ** help *** installing help indices ** building package indices ... ** testing if installed package can be loaded /usr/lib64/R/bin/exec/R: symbol lookup error: /usr/lib64/openmpi/mca_paffinity_linux.so: undefined symbol: mca_base_param_reg_int The downloaded packages are in ?/tmp/RtmpLvvl9R/downloaded_packages? Updating HTML index of packages in '.Library' Warning message: In install.packages(Rmpi) : installation of package 'Rmpi' had non-zero exit status The system is a gentoo system with openmpi-1.5.1. I am thankful for any hint. Juergen __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] still a problem remainingRE: Data lebals xylattice plot: RE: displaying label meeting condition (i.e. significant, i..e p value less than 005) in plot function
Hi Lattice Users I have been working to fix this problem, still I am not able to solve fully. I could label those names that have pvalue less than 0.01 but still the label appears in all compoent plots eventhough those who do have the pvalue ! How can I implement it successuflly to grouped data like mine. You help is highly appreciated. #my data name - c(paste (M, 1:1000, sep = )) xvar - seq(1, 1, 10) chr - c(rep(1,200),rep(2,200), rep(3,200), rep(4,200), rep(5,200)) set.seed(134) p - rnorm(1000, 0.15,0.05) dataf - data.frame(name,xvar, chr, p) dataf$chr - as.factor(dataf$chr) # lattice plot: As far as I can go now ! little progress but final push required ! require(lattice) pvals - dataf[dataf$p 0.01,] p1 - pvals$p n1 - pvals$name xv1 - pvals$xvar xyplot(p ~ xvar|chr, data=dataf, panel = function(x, y) { panel.xyplot(x, y) panel.abline(h=0.01, col=red) panel.text(xv1, p1, n1, col=green2) }) Thank you in advance. Best Regards Umesh R _ From: Bert Gunter [mailto:gunter.ber...@gene.com] Sent: Sunday, March 06, 2011 10:50 AM To: Umesh Rosyara Cc: Jorge Ivan Velez; Dennis Murphy; sarah.gos...@gmail.com; R mailing list Subject: Re: [R] Data lebals xylattice plot: RE: displaying label meeting condition (i.e. significant, i..e p value less than 005) in plot function This is easy to do by specifying xyplot's panel function. Assuming only one panel -- otherwise you need to pass the subscripts arguments to choose the values belonging to the panel -- somethings like: xyplot(y~x, pvals = pvals,..., ## pvals is your vector of small p values with e.g. NA's elsewhere panel = function(x,y, pvals,...) { panel.xyplot(...) panel.text((x,y, pvals,...) } ) This is obviously just a sketch and will not work as written. So please read the Help page on xyplot carefully and perhaps also Deepayan's book on trellis graphics -- there are also undoubtedly online resources: search on trellis graphics tutorial or some such. This is not hard, but there are some details that you will need to master,especially regarding argument passing. Another alternative is to use the layer() function in the latticeExtra package instead. Consult the documentation there for details. Cheers, Bert On Sun, Mar 6, 2011 at 5:17 AM, Umesh Rosyara rosyar...@gmail.com wrote: Dear Jorge, Dennis, Sarah and R-experts. Thank for helping me. As you mentioned it is difficult apply in lattice in this situation. Unless, there is a possibility, I would try to use lattice. The major reason toward this is- my ultimate solution might be better of in lattice as I have a classificatory variable to make similar graph for each caterogory in the lattice graph. Lattice cleates nice stacked xyplots. p ~ xvar | chr # require plots by the factor variable chr # with a classificatory variable name - c(paste (M, 1:1000, sep = )) xvar - seq(1, 1, 10) chr - c(rep(1,200),rep(2,200), rep(3,200), rep(4,200), rep(5,200)) set.seed(134) p - rnorm(1000, 0.15,0.05) dataf - data.frame(name,xvar, chr, p) dataf$chr - as.factor(dataf$chr) # lattice plot: As far as I can go now ! require(lattice) xyplot(pval ~ xvar1|chr, dataf) Best Regards Umesh R _ From: Jorge Ivan Velez [mailto:jorgeivanve...@gmail.com] Sent: Sunday, March 06, 2011 12:22 AM To: Umesh Rosyara Cc: R mailing list Subject: Re: [R] displaying label meeting condition (i.e. significant, i..e p value less than 005) in plot function Hi Umesh, You can try something along the lines of: d - dataf[dataf$p 0.05, ] # p 0.05 with(d, plot(xvar, p, col = 'white')) with(d, text(xvar, p, name, cex = .7)) HTH, Jorge On Sat, Mar 5, 2011 at 12:29 PM, Umesh Rosyara wrote: Dear R users, Here is my problem: # example data name - c(paste (M, 1:1000, sep = )) xvar - seq(1, 1, 10) set.seed(134) p - rnorm(1000, 0.15,0.05) dataf - data.frame(name,xvar, p) plot (dataf$xvar,p) abline(h=0.05) # I can know which observation number is less than 0.05 which (dataf$p 0.05) [1] 12 20 80 269 272 338 366 368 397 403 432 453 494 543 592 691 723 789 811 [20] 854 891 931 955 I want to display (label) corresponding names on the plot above: means that 12th observation M12, 20th observation M20 and so on. Please note that I have names not in numerical sequience (rather different names), just provided for this example to create dataset easily. Thanks in advance Umesh R [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. _ No virus found in this message. Checked by AVG - www.avg.com [[alternative HTML version deleted]] __
Re: [R] ok to use glht() when interaction is NOT significant?
Inline below On Mon, Mar 7, 2011 at 8:08 PM, array chip arrayprof...@yahoo.com wrote: Hi, let's say I have a simple ANOVA model with 2 factors A (level A1 and A2) and B (level B1 and B2) and their interaction: aov(y~A*B, data=dat) It turns out that the interaction term is not significant (e.g. P value = 0.2), but if I used glht() to compare A1 vs. A2 within each level of B, I found that the comparison is not significant when B=B1, but is very significant (P0.01) when B=B2. My question is whether it's legal to do this post-hoc comparison when the interaction is NOT significant? Can I still make the claim that there is a significant difference between A1 and A2 when B=B2? (I am serious here). Don't know what legal means. Why do you want to make the claim? When does it **ever** mean anything scientifically meaningful to make it? What is the **scientific** question of interest? Are the data unbalanced? Have you plotted the data to tell you what's going on? Warning: I come from the school (maybe I'm the only student...) that believes all such formal post hoc comparisons are pointless, silly, wastes of effort. Note the word: formal -- that is pretending the P values mean anything, For exploratory purposes, which can certainly include producing P values as well as graphs, such post hoc comparisons might lead to great science. It's the formal part that I reject and that you seem to be hung up on. Note also: If you're a Bayesian and can put priors on everything, you can spit out posteriors and Bayes factors to your heart's content. Really! -- no need to sweat multiplicity even. Of course, I speak here only as an observer, having never actually inhaled myself.* Cheers, Bert *Apologies to all non-US and younger readers. This is a smart-aleck reference to an infamous dumb remark from a recent famous, smart former U.S. president. Google never inhaled for details. Thanks John [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Bert Gunter Genentech Nonclinical Biostatistics 467-7374 http://devo.gene.com/groups/devo/depts/ncb/home.shtml __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] SEM error
Dear All, I am new for R and SEM. I try to fit the model with Y (ordinal outcome), X (4 categorical data), M1-M3 (continuous), and 2 covariates (Agesex) as a diagram. library(polycor) model.ly -specify.model() 1: x - m1, gam11, NA 2: x - m2, gam12, NA 3: x - m3, gam13, NA 4: age - m1, gam14, NA 5: age - m2, gam15, NA 6: age - m3, gam16, NA 7: sex - m1, gam17, NA 8: sex - m2, gam18, NA 9: sex - m3, gam19, NA 10: x - y, gam20, NA 11: m1 - y, gam21, NA 12: m2 - y, gam22, NA 13: m3 - y, gam23, NA 14: age - y, gam24, NA 15: sex - y, gam25, NA, 16: m1 -m1, psi11, NA 17: m2 - m2, psi12, NA 18: m3 - m3, psi13, NA 19: m1 - m2, psi21, NA 20: m1 -m3, psi22, NA 21: m2 - m3, psi23, NA 22: Y - Y, psi24, NA, hcor -function(ly) hetcor(ly, std.err=FALSE)$correlations R.ly -hcor(ly) sem.ly - sem(model.ly, R.ly, N=174) Error in sem.default(ram=ram, S=S, N=N,……) The model has negative degree of freedom = -12 First, I do not know what the mistake is. Second, is this correctly modeling my diagram? Any suggestions would be appreciated. Thank you, Kesinee http://r.789695.n4.nabble.com/file/n3340497/path.gif.png -- View this message in context: http://r.789695.n4.nabble.com/SEM-error-tp3340497p3340497.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.