Re: [R] Constrained Regression
Hello everyone, I have 3 variables Y, X1 and X2. Each variables lies between 0 and 1. I want to do a constrained regression such that a0 and (1-a) 0 for the model: Y = a*X1 + (1-a)*X2 I tried the help on the constrained regression in R but I concede that it was not helpful. Any help is greatly appreciated -- Thanks, Jim. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] make many barplot into one plot
Dear R users I would like to group my barplot graph (see example on the R help link). The proposed R code, adding individual bars to the plot, looks really overwhelming. My specific dataset just consists of five groups and three different levels within each groups (the individual bars). The .txt file is read as matrix (horizontal: group, vertical: levels). The R trellis barchart (function group=) is an easy function, but unfortunately the upper plot part look much different from other graphs. I would therefore prefer barplot to stansdardize my plots within the manuscript. It would be very helpful for me to know if anyone else has worked on the barplot group function. Thanks Sibylle http://onertipaday.blogspot.com/2007/05/make-many-barplot-into-one-plot.html R code from the link ## I have 4 tables like this:satu - array(c(5,15,20,68,29,54,84,119), dim=c(2,4), dimnames=list(c(Negative, Positive), c(Black, Brown, Red, Blond)))dua - array(c(50,105,30,8,29,25,84,9), dim=c(2,4), dimnames=list(c(Negative, Positive), c(Black, Brown, Red, Blond)))tiga - array(c(9,16,26,68,12,4,84,12), dim=c(2,4), dimnames=list(c(Negative, Positive), c(Black, Brown, Red, Blond)))empat - array(c(25,13,50,78,19,34,84,101), dim=c(2,4), dimnames=list(c(Negative, Positive), c(Black, Brown, Red, Blond)))# rbind() the tables togetherTAB - rbind(satu, dua, tiga, empat)# Do the barplot and save the bar midpointsmp - barplot(TAB, beside = TRUE, axisnames = FALSE)# Add the individual bar labelsmtext(1, at = mp, text = c(N, P),line = 0, cex = 0.5)# Get the midpoints of each sequential pair of bars# within each of the four groupsat - t(sapply(seq(1, nrow(TAB), by = 2),function(x) colMeans(mp[c(x, x+1), ])))# Add the group labels ! for each pairmtext(1, at = at, text = rep(c(satu, dua, tiga, empat), 4),line = 1, cex = 0.75)# Add the color labels for each groupmtext(1, at = colMeans(mp), text = c(Black, Brown, Red, Blond), line = 2) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Need help with lmer model specification syntax for nested mixed model
I haven't been able to fully make sense of the conflicting online information about whether and how to specify nesting structure for a nested, mixed model. I'll describe my experiment and hopefully somebody who knows lme4 well can help. We're measuring the fluorescence intensity of brain slices from frogs that have undergone various treatments. We want to use multcomp to look for differences between treatments, while accounting for the variance introduced by the random effects of brain and slice. There are a few measurements per slice, several slices per brain, and several brains per treatment. In the data file, the numbering for slices starts over from 1 for each brain, and the numbering for brains starts over from 1 for each treatment. In other words: Treatment is a fixed effect, brain is a random effect nested in treatment, and slice is a random effect nested in brain. As I understood the documentation, this is the correct specification: log(Intensity) ~ Treatment + (1|Brain) + (1|Slice) However, I don't see how lmer understands the correct nesting structure from that. How does it know brain isn't crossed with treatment? Here are two other things I tried, and each gave different results: log(Intensity) ~ Treatment + (1|Slice/Brain/Treatment) log(Intensity) ~ Treatment + (1|Brain/Treatment) + (1|Slice/Brain) I'm not sure why these things give different results, or which one (if any) is right. Can anyone help? Thanks! -- View this message in context: http://r.789695.n4.nabble.com/Need-help-with-lmer-model-specification-syntax-for-nested-mixed-model-tp3020895p3020895.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to control in order of groups in xyplot
Hi guys, I used the following R code to generate one plot library(lattice) xyplot(Y~X1|as.factor(X2)*as.factor(X3), groups = as.factor(X4), data=mydata) Both X2 and X3 have three values. X4 has two values. I got 3x3 grids and in each grid there are two curves about y~x1 for the two X4 values. I am quite happy with the plot except that I need a different layout of the 3x3 layout. For example, X2={m=2, m=5, and m=10} and it plots with the order m=10, m=2, and m=5. Is there any way I can control the order of the groups in the whole plot? Thanks a lot, --Jerry [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] biglm: how it handles large data set?
I am trying to figure out why 'biglm' can handle large data set... According to the R document - biglm creates a linear model object that uses only p^2 memory for p variables. It can be updated with more data using update. This allows linear regression on data sets larger than memory. After reading the source code below, I still could not figure out how 'update' implements the algorithm... Thanks for any light shed upon this ... biglm::biglm function (formula, data, weights = NULL, sandwich = FALSE) { tt - terms(formula) if (!is.null(weights)) { if (!inherits(weights, formula)) stop(`weights' must be a formula) w - model.frame(weights, data)[[1]] } else w - NULL mf - model.frame(tt, data) mm - model.matrix(tt, mf) qr - bigqr.init(NCOL(mm)) qr - update(qr, mm, model.response(mf), w) rval - list(call = sys.call(), qr = qr, assign = attr(mm, assign), terms = tt, n = NROW(mm), names = colnames(mm), weights = weights) if (sandwich) { p - ncol(mm) n - nrow(mm) xyqr - bigqr.init(p * (p + 1)) xx - matrix(nrow = n, ncol = p * (p + 1)) xx[, 1:p] - mm * model.response(mf) for (i in 1:p) xx[, p * i + (1:p)] - mm * mm[, i] xyqr - update(xyqr, xx, rep(0, n), w * w) rval$sandwich - list(xy = xyqr) } rval$df.resid - rval$n - length(qr$D) class(rval) - biglm rval } environment: namespace:biglm --- -- View this message in context: http://r.789695.n4.nabble.com/biglm-how-it-handles-large-data-set-tp3020890p3020890.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to control in order of groups in xyplot
A working minimal example would have been better, see FAQ. But I think you are looking for the following: X2 - factor(c(m=2, m=5, m=10), levels=c(m=2, m=5, m=10)) Here levels are ordered in your way. There might be other solutions for this ordering problem. Hope it helps, Rainer On 31.10.2010 06:55 (UTC+1), Jie Liu wrote: Hi guys, I used the following R code to generate one plot library(lattice) xyplot(Y~X1|as.factor(X2)*as.factor(X3), groups = as.factor(X4), data=mydata) Both X2 and X3 have three values. X4 has two values. I got 3x3 grids and in each grid there are two curves about y~x1 for the two X4 values. I am quite happy with the plot except that I need a different layout of the 3x3 layout. For example, X2={m=2, m=5, and m=10} and it plots with the order m=10, m=2, and m=5. Is there any way I can control the order of the groups in the whole plot? Thanks a lot, --Jerry __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] for loop
Hi Dennis, Thank you for your extensive explanations. Yes, I guess I did not explain what I would like to do. Basically I would like to conduct a linear regression for each of 15 classes. Your answers gave me new Perspective on how R works. Thanks again for the help, m From: Dennis Murphy [mailto:djmu...@gmail.com] Sent: Sunday, October 31, 2010 4:40 AM To: Matevž PavliÄ Cc: r-help@r-project.org Subject: Re: [R] for loop Hi: If your objective is to make 15 plots, one for each level of razred, then you don't need to make 15 individual data frames first. The lattice and ggplot2 packages allow conditioning plots. You haven't mentioned what types of plots you're interested in getting, but if it's something simple like a scatterplot of y vs. x for each level of razred, it's not that hard to do. Let's fake some data: d - data.frame(razred = rep(LETTERS[1:15], each = 10), x = rep(1:10, 15), y = rep(2 + 0.5 * 1:10, 15) + rnorm(150)) d has 15 levels of razred with 10 observations at each level. razred is a factor, the other variables are either integer or numeric. Produce scatterplots of y vs. x for each level of razred, using both the lattice and ggplot2 packages: library(lattice) # each plot adds a new feature - run one plot at a time. xyplot(y ~ x | razred, data = d, type = c('p', 'r')) xyplot(y ~ x | razred, data = d, type = c('p', 'r'), layout = c(3, 5)) xyplot(y ~ x | razred, data = d, type = c('p', 'r'), layout = c(3, 5), as.table = TRUE) library(ggplot2) ggplot(d, aes(x, y)) + geom_point() + geom_smooth(method = 'lm') + facet_wrap( ~ razred, ncol = 3) ggplot(d, aes(x, y)) + geom_point() + geom_smooth(method = 'lm', se = FALSE) + facet_wrap( ~ razred, ncol = 3) If instead you want something like a scatterplot matrix for each data subset defined by level of razred, then maybe something like this (?): # add a new variable to the data frame # splom is the scatterplot matrix function in lattice d$z1 - rnorm(150) splom(~ d[, -1] | razred, data = d, layout = c(2, 2, 4)) Just guessing here since you didn't make your objective explicit. It's entirely possible that you can conduct a significant part of your data analysis without having to split the data into subsets. Several summary functions, for example, can compute a number of summary functions by group with a one-line call. Here are a couple of examples, one using aggregate() from the base package and another using function ddply() from the plyr package: aggregate(y ~ razred, data = d, FUN = mean) razredy 1 A 4.816841 2 B 4.520804 3 C 5.196329 4 D 4.615575 5 E 3.982240 6 F 4.466559 7 G 4.938669 8 H 4.539541 9 I 4.354991 10 J 4.573654 11 K 4.450624 12 L 5.138087 13 M 4.93 14 N 4.879493 15 O 5.087452 library(plyr) ddply(d, 'razred', summarise, mx = mean(x), my = mean(y), mz1 = mean(z1)) razred mx my mz1 1 A 5.5 4.816841 -0.01745305 2 B 5.5 4.520804 0.24724069 3 C 5.5 5.196329 0.18717750 4 D 5.5 4.615575 0.18885590 5 E 5.5 3.982240 -0.91284339 6 F 5.5 4.466559 0.36479266 7 G 5.5 4.938669 -0.36359562 8 H 5.5 4.539541 0.06061162 9 I 5.5 4.354991 0.05138409 10 J 5.5 4.573654 0.31160018 11 K 5.5 4.450624 0.17458712 12 L 5.5 5.138087 -0.26482357 13 M 5.5 4.93 -0.39194953 14 N 5.5 4.879493 0.33154075 15 O 5.5 5.087452 0.32816931 There are a number of functions and packages that will do this sort of thing quite well - I'll mention doBy, data.table, Hmisc and sqldf as excellent options, noting that there are other packages and functions in the apply family that can perform groupwise processing seamlessly. The point of mentioning this is so that you don't automatically think you have to split the data in myriad ways before you can process a function. The good folks that designed this language, and the many people who have contributed code to the R project, are pretty smart, and have devised fairly simple ways to process data, even if it's large. Of course, it's always possible that splitting is necessary; if you're willing to be a little more forthcoming about your analysis goals, you might get a better targeted response.. HTH, Dennis On Sat, Oct 30, 2010 at 12:00 PM, Matevž PavliÄ matevz.pav...@gi-zrmk.si wrote: Just one more thing... I get a list with 15 data.frames : List of 15 $ 1:'data.frame': 7 obs. of 9 variables: ..$ vrtina : Factor w/ 6 levels T1A-1,T1A-2,..: 1 1 2 2 5 5 5 ..$ globina.meritve: num [1:7] 7.6 8.5 10.4 17.4 12.5 15.5 16.5 ..$ E0 : num [1:7] 4109 2533 491 810 2374 ... ..$ Eur1 : num [1:7] 6194 4713 605 1473 NA ... ..$ Eur2 : num [1:7] 3665 7216 266 4794 7387 ... ..$ Eur3 : num [1:7] 3221 3545 920 3347 6768 ... ..$ H : num [1:7] 8 5.9 5.9 6.9 9.3
Re: [R] R-help Digest, Vol 92, Issue 31
On 31.10.2010 05:22, Neyra Peña wrote: Hi, I'd like to unsubscribe from the list. Thanks Neyra Have you ever read the first lines (those I still cite below) of the message you got that already tells you how to unsubscribe? Uwe Ligges De: r-help-requ...@r-project.orgr-help-requ...@r-project.org Para: r-help@r-project.org Enviado: sáb, octubre 30, 2010 5:30:07 AM Asunto: R-help Digest, Vol 92, Issue 31 Send R-help mailing list submissions to r-help@r-project.org To subscribe or unsubscribe via the World Wide Web, visit https://stat.ethz.ch/mailman/listinfo/r-help [SNIP] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Downloading Package Sources
Dear Group, any idea how I can download the source code for all packages in Windows 7? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] return value for grep
Ulrich wrote: Hi, is it possible to easily change the return value for the grep function for cases where there is no match, for example the value 0 or No instead of integer (0) )? It sounds like you might want grepl (which returns a vector of TRUE and FALSE values) rather than grep (which returns a vector of indices or matches). Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R VBA
Many thanks for the feedback the mailing list reference - will do Obviously I am feeling very stupid now, not sure I missed that =0 sorry for that silly question Rgds, Julien On Oct 30, 2010, at 8:56 PM, RICHARD M. HEIBERGER wrote: Look at the macro RInterface.PutArrayFromVBA documented in the RExcel help file. Please send followup questions on RExcel and statconnDCOM to the mailing list at statconnDCOM, rcom, and RExcel server related issues rco...@mailman.csd.univie.ac.at Rich On Sat, Oct 30, 2010 at 1:44 PM, julien cuisinier j_cuisin...@hotmail.com wrote: Hi List, Is there any way to pass on directly VBA variable content into a R variable? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] For loop
Thanks for the suggestions. I add some more detail to clarify: h=matrix(nrow=1,ncol=22) In my data h is: (0.25 0.25 0 0 0 0 -0.25 -0.25 -0.25 -0.25 -0.5 -0.5 0 0.25 0.25 0.25 0.25 0.25 0.25 0.25 0.25 0.25) xx-seq(0,1,0.5) v=matrix(nrow=20001,ncol=22) vv=matrix(nrow=20001,ncol=22) for (y in 1:20001) { v[y,22]=h[1,22] } vv[20001,22]=v[20001,22] for(k in 21:1) { for(j in 20001:2) { vv[j-1,k+1]=min(xx[j-1]*v[j-1,k+1],vv[j,k+1]) v[j,k]=h[1,k]+vv[j-1,k+1] } vv[20001,k]=v[20001,k] } The idea of using Rcpp seems to be good. Having never used this package will give a look. Somewhere I read an example like this: for (i in 1:R) { res[i]-f() NULL } where f() is a function, but I don't how can I trasform my code: vv[j-1,k+1]=min(xx[j-1]*v[j-1,k+1],vv[j,k+1]) v[j,k]=h[1,k]+vv[j-1,k+1] in a function [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to control in order of groups in xyplot
On 2010-10-31 01:19, Rainer Hurling wrote: A working minimal example would have been better, see FAQ. But I think you are looking for the following: X2- factor(c(m=2, m=5, m=10), levels=c(m=2, m=5, m=10)) Here levels are ordered in your way. There might be other solutions for this ordering problem. That would be the best way for this (common) situation since the ordering is not likely to be wanted in any other way. For playing with different orderings, lattice provides the index.cond argument described in the '...' part of help(xyplot). -Peter Ehlers Hope it helps, Rainer On 31.10.2010 06:55 (UTC+1), Jie Liu wrote: Hi guys, I used the following R code to generate one plot library(lattice) xyplot(Y~X1|as.factor(X2)*as.factor(X3), groups = as.factor(X4), data=mydata) Both X2 and X3 have three values. X4 has two values. I got 3x3 grids and in each grid there are two curves about y~x1 for the two X4 values. I am quite happy with the plot except that I need a different layout of the 3x3 layout. For example, X2={m=2, m=5, and m=10} and it plots with the order m=10, m=2, and m=5. Is there any way I can control the order of the groups in the whole plot? Thanks a lot, --Jerry __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Constrained Regression
What is the stochastic mechanism that generates the data? In other words, what distribution is Y, conditioned on X1 and X2, supposed to be? You can also get `a' if you do not wanrt to specify the probability mechanism by just doing a least squares fitting, but then making inferences can be a bit tricky. Ravi. Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu - Original Message - From: Jim Silverton jim.silver...@gmail.com Date: Sunday, October 31, 2010 2:45 am Subject: Re: [R] Constrained Regression To: r-help@r-project.org Hello everyone, I have 3 variables Y, X1 and X2. Each variables lies between 0 and 1. I want to do a constrained regression such that a0 and (1-a) 0 for the model: Y = a*X1 + (1-a)*X2 I tried the help on the constrained regression in R but I concede that it was not helpful. Any help is greatly appreciated -- Thanks, Jim. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list PLEASE do read the posting guide and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Randomly split a sample in two equal subsamples
Dear all, I would like to randomly split a sample in two equally large subsamples. The sample data is stored as a matrix with each row representing an individual and each column representing some variable (e.g., name, age, sex, etc.); the first row contains the names of the variables; the first column contains the individual number (1:n, for n individuals); the number of individuals is even (so, the overall number of rows is odd). I found similar threads (like random subset), but I don't know how to apply the information from them in my case. Could somebody help me a little bit? Thanks in advance! Yoan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] extracting named vector from dataframe
Suppose df is a dataframe with one named row of numeric observations. I want to coerce df into a named vector. as.vector does not work as I expected: as.vector(df) returns the original dataframe, while as.vector(df,mode=numeric) returns an unnamed vector of NAs. This works: v - as.numeric(as.matrix(df)); names(v) - names(df); I just wanted check if there was a better/more natural way of doing this? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Constrained Regression
On Oct 31, 2010, at 2:44 AM, Jim Silverton wrote: Hello everyone, I have 3 variables Y, X1 and X2. Each variables lies between 0 and 1. I want to do a constrained regression such that a0 and (1-a) 0 for the model: Y = a*X1 + (1-a)*X2 It would not accomplish the constraint that a 0 but you could accomplish the other constraint within an lm fit: X3 - X1-X2 lm(Y ~ X3 + offset(X2) ) Since beta1 is for the model Y ~ 1 + beta1(X1- X2) + 1*X2) Y ~ intercept + beta1*X1 + (1 -beta1)*X2 ... so beta1 is a. In the case beta 0 then I suppose a would be assigned 0. This might be accomplished within an iterative calculation framework by a large penalization for negative values. In a reply (1) to a question by Carlos Alzola in 2008 on rhalp, Berwin Turlach offered a solution to a similar problem ( sum(coef) == 1 AND coef non-negative). Modifying his code to incorporate the above strategy (and choosing two variables for which parameter values might be inside the constraint boundaries) we get: library(MASS) ## to access the Boston data designmat - model.matrix(medv~I(age-lstat) +offset(lstat), data=Boston) Dmat -crossprod(designmat, designmat); dvec - crossprod(designmat, Boston$medv) Amat - cbind(1, diag(NROW(Dmat))) bvec - c(1,rep(0,NROW(Dmat))) meq - 1 library(quadprog) res - solve.QP(Dmat, dvec, Amat, bvec, meq) zapsmall(res$solution) [1] 0.686547 0.313453 Turlach specifically advised against any interpretation of this particular result which was only contructed to demonstrate the mathematical mechanics. I tried the help on the constrained regression in R but I concede that it was not helpful. I must not have that package installed because I got nothing that appeared to be useful with ??constrained regression . David Winsemius, MD West Hartford, CT 1) http://finzi.psych.upenn.edu/Rhelp10/2008-March/155990.html __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] extracting named vector from dataframe
Hi, I think you want ?unlist d = data.frame(x=1, y=2, z=3) v = unlist(d) is(v) [1] numeric vector HTH, baptiste On 31 October 2010 16:54, James Hirschorn james.hirsch...@hotmail.com wrote: Suppose df is a dataframe with one named row of numeric observations. I want to coerce df into a named vector. as.vector does not work as I expected: as.vector(df) returns the original dataframe, while as.vector(df,mode=numeric) returns an unnamed vector of NAs. This works: v - as.numeric(as.matrix(df)); names(v) - names(df); I just wanted check if there was a better/more natural way of doing this? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] extracting named vector from dataframe
On Oct 31, 2010, at 11:54 AM, James Hirschorn wrote: Suppose df is a dataframe with one named row of numeric observations. I want to coerce df into a named vector. I don't think you understand the structure of dataframes. They are named lists of component columns. The names you are attributing to the rows are not attached to the observations but rather are column names. So that row is not in any sense a named vector. If you created a dataframe with the first column a named vector its names would become the rownames. as.vector does not work as I expected: as.vector(df) returns the original dataframe, while as.vector(df,mode=numeric) returns an unnamed vector of NAs. This works: v - as.numeric(as.matrix(df)); names(v) - names(df); Right. You are now assigning the column names to the elements in a row, but in some ways that is an unnatural act, and not something that would be expected to work in the general case where a row might be a diverse set of types and even different classes. Your as.matrix operation coerced all of the values to be of one type. I just wanted check if there was a better/more natural way of doing this? -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Randomly split a sample in two equal subsamples
Hi Yoan, Please try ?sample. Suppose you have 1:n ids of total observations where n is even, you want to randomly split it into two subsamples, the following code should work. n - 20 one.sample - sort(sample(1:n, n/2)) another.sample - (1:n)[-one.sample] Good luck. Wu - A R learner. -- View this message in context: http://r.789695.n4.nabble.com/Randomly-split-a-sample-in-two-equal-subsamples-tp3021140p3021257.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Need help with lmer model specification syntax for nested mixed model
On Sun, Oct 31, 2010 at 2:35 AM, Carabiniero ja...@troutnut.com wrote: I haven't been able to fully make sense of the conflicting online information about whether and how to specify nesting structure for a nested, mixed model. I'll describe my experiment and hopefully somebody who knows lme4 well can help. We're measuring the fluorescence intensity of brain slices from frogs that have undergone various treatments. We want to use multcomp to look for differences between treatments, while accounting for the variance introduced by the random effects of brain and slice. There are a few measurements per slice, several slices per brain, and several brains per treatment. In the data file, the numbering for slices starts over from 1 for each brain, and the numbering for brains starts over from 1 for each treatment. This is what I call implicit nesting in the definition of the variables. My general recommendation is to create new variables that reflect the actual structure of the data, as in mydata - within(mydata, { ubrain - factor(Treatment:Brain) uslice - factor(Treatment:Brain:Slice) } then define the model in terms of these factors, ubrain and uslice, that have the desirable property that each distinct brain has a distinct label. In other words: Treatment is a fixed effect, brain is a random effect nested in treatment, and slice is a random effect nested in brain. As I understood the documentation, this is the correct specification: log(Intensity) ~ Treatment + (1|Brain) + (1|Slice) That will work with ubrain and uslice instead of the implicitly nested Brain and Slice. However, I don't see how lmer understands the correct nesting structure from that. How does it know brain isn't crossed with treatment? lmer can determine the crossed or nested structure from the data whenever the data reflect the structure. Implicitly nested factors don't reflect the structure of the data and rely on external information to augment the data given. The computational methods used in lmer don't depend on whether the grouping factors for the random effects are nested or not. However they do require that the grouping factors are well-defined. Here are two other things I tried, and each gave different results: log(Intensity) ~ Treatment + (1|Slice/Brain/Treatment) log(Intensity) ~ Treatment + (1|Brain/Treatment) + (1|Slice/Brain) I'm not sure why these things give different results, or which one (if any) is right. Can anyone help? I have taken the liberty of cc:ing the R-SIG-Mixed-Models mailing list on this reply and suggest that any follow-ups be on that list. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Constrained Regression
Have you tried the 'sos' package? install.packages('sos') # if not already installed library(sos) cr - ???'constrained regression' # found 149 matches summary(cr) # in 69 packages cr # opens a table in a browser listing all 169 matches with links to the help pages However, I agree with Ravi Varadhan: I'd want to understand the physical mechanism generating the data. If each is, for example, a proportion, then I'd want to use logistic regression, possible after some approximate logistic transformation of X1 and X2 that prevents logit(X) from going to +/-Inf. This is a different model, but it achieves the need to avoid predictions of Y going outside the range (0, 1). Spencer On 10/31/2010 9:01 AM, David Winsemius wrote: On Oct 31, 2010, at 2:44 AM, Jim Silverton wrote: Hello everyone, I have 3 variables Y, X1 and X2. Each variables lies between 0 and 1. I want to do a constrained regression such that a0 and (1-a) 0 for the model: Y = a*X1 + (1-a)*X2 It would not accomplish the constraint that a 0 but you could accomplish the other constraint within an lm fit: X3 - X1-X2 lm(Y ~ X3 + offset(X2) ) Since beta1 is for the model Y ~ 1 + beta1(X1- X2) + 1*X2) Y ~ intercept + beta1*X1 + (1 -beta1)*X2 ... so beta1 is a. In the case beta 0 then I suppose a would be assigned 0. This might be accomplished within an iterative calculation framework by a large penalization for negative values. In a reply (1) to a question by Carlos Alzola in 2008 on rhalp, Berwin Turlach offered a solution to a similar problem ( sum(coef) == 1 AND coef non-negative). Modifying his code to incorporate the above strategy (and choosing two variables for which parameter values might be inside the constraint boundaries) we get: library(MASS) ## to access the Boston data designmat - model.matrix(medv~I(age-lstat) +offset(lstat), data=Boston) Dmat -crossprod(designmat, designmat); dvec - crossprod(designmat, Boston$medv) Amat - cbind(1, diag(NROW(Dmat))) bvec - c(1,rep(0,NROW(Dmat))) meq - 1 library(quadprog) res - solve.QP(Dmat, dvec, Amat, bvec, meq) zapsmall(res$solution) [1] 0.686547 0.313453 Turlach specifically advised against any interpretation of this particular result which was only contructed to demonstrate the mathematical mechanics. I tried the help on the constrained regression in R but I concede that it was not helpful. I must not have that package installed because I got nothing that appeared to be useful with ??constrained regression . David Winsemius, MD West Hartford, CT 1) http://finzi.psych.upenn.edu/Rhelp10/2008-March/155990.html __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Spencer Graves, PE, PhD President and Chief Operating Officer Structure Inspection and Monitoring, Inc. 751 Emerson Ct. San José, CA 95126 ph: 408-655-4567 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] doubt in climate variability analysis in R! - code
I am sorry, i think the link was broken..! here is the correct one!!! http://www.4shared.com/file/4zV0g3JR/RF_80-85.html [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] complicated graphic -- persp+map
Hello I'm trying to render in 3D what I usually plot by image(), or image.plot() from the library fields, followed by a map(world,add=TRUE) type of command. More concretely, I have a field of temperature values, for a given geographic area, and I would like to plot a 3D surface whose x and y axes consist of longitude and latitude values and whose height and color-coding correspond to the temperature values but also -- and here is the problem -- superimpose a map of the region. persp() can easily do the color-coded surface but I cannot figure out how (if at all possible) to overlay geographic boundaries, in this specific case a simple map of the land area outlines. Thank you in advance for any help Claudia -- Claudia Tebaldi Research Scientist, Climate Central http://www.climatecentral.org (303) 775 5365 c (Colorado area code) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] make many barplot into one plot
hierobarp or barNest from {plotrix} may do this more neatly. 2010/10/31 Sibylle Stöckli sibylle.stoec...@gmx.ch Dear R users I would like to group my barplot graph (see example on the R help link). The proposed R code, adding individual bars to the plot, looks really overwhelming. My specific dataset just consists of five groups and three different levels within each groups (the individual bars). The .txt file is read as matrix (horizontal: group, vertical: levels). The R trellis barchart (function group=) is an easy function, but unfortunately the upper plot part look much different from other graphs. I would therefore prefer barplot to stansdardize my plots within the manuscript. It would be very helpful for me to know if anyone else has worked on the barplot group function. Thanks Sibylle http://onertipaday.blogspot.com/2007/05/make-many-barplot-into-one-plot.html R code from the link ## I have 4 tables like this:satu - array(c(5,15,20,68,29,54,84,119), dim=c(2,4), dimnames=list(c(Negative, Positive), c(Black, Brown, Red, Blond)))dua - array(c(50,105,30,8,29,25,84,9), dim=c(2,4), dimnames=list(c(Negative, Positive), c(Black, Brown, Red, Blond)))tiga - array(c(9,16,26,68,12,4,84,12), dim=c(2,4), dimnames=list(c(Negative, Positive), c(Black, Brown, Red, Blond)))empat - array(c(25,13,50,78,19,34,84,101), dim=c(2,4), dimnames=list(c(Negative, Positive), c(Black, Brown, Red, Blond)))# rbind() the tables togetherTAB - rbind(satu, dua, tiga, empat)# Do the barplot and save the bar midpointsmp - barplot(TAB, beside = TRUE, axisnames = FALSE)# Add the individual bar labelsmtext(1, at = mp, text = c(N, P),line = 0, cex = 0.5)# Get the midpoints of each sequential pair of bars# within each of the four groupsat - t(sapply(seq(1, nrow(TAB), by = 2),function(x) colMeans(mp[c(x, x+1), ])))# Add the group labels ! for each pairmtext(1, at = at, text = rep(c(satu, dua, tiga, empat), 4),line = 1, cex = 0.75)# Add the color labels for each groupmtext(1, at = colMeans(mp), text = c(Black, Brown, Red, Blond), line = 2) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Constrained Regression
On Oct 31, 2010, at 12:54 PM, Spencer Graves wrote: Have you tried the 'sos' package? I have, and I am taking this opportunity to load it with my .Rprofile to make it more accessible. It works very well. Very clean display. I also have constructed a variant of RSiteSearch that I find more useful than the current defaults. rhelpSearch - function(string, restrict = c(Rhelp10, Rhelp08, Rhelp02, functions ), matchesPerPage = 100, ...) { RSiteSearch(string=string, restrict = restrict, matchesPerPage = matchesPerPage, ...)} install.packages('sos') # if not already installed library(sos) cr - ???'constrained regression' # found 149 matches summary(cr) # in 69 packages cr # opens a table in a browser listing all 169 matches with links to the help pages However, I agree with Ravi Varadhan: I'd want to understand the physical mechanism generating the data. If each is, for example, a proportion, then I'd want to use logistic regression, possible after some approximate logistic transformation of X1 and X2 that prevents logit(X) from going to +/-Inf. This is a different model, but it achieves the need to avoid predictions of Y going outside the range (0, 1). No argument. I defer to both of your greater experiences in such problems and your interest in educating us less knowledgeable users. I also need to amend my suggested strategy in situations where a linear model _might_ be appropriate, since I think the inclusion of the surrogate variable in the solve.QP setup is very probably creating confusion. After reconsideration I think one should keep the two approaches separate. These are two approaches to the non-intercept versions of the model that yield the same estimate (but only because the constraints do not get invoked ): lm(medv~I(age-lstat) +offset(lstat) -1, data=Boston) Call: lm(formula = medv ~ I(age - lstat) + offset(lstat) - 1, data = Boston) Coefficients: I(age - lstat) 0.1163 library(MASS) ## to access the Boston data designmat - model.matrix(medv~age+lstat-1, data=Boston) Dmat -crossprod(designmat, designmat); dvec - crossprod(designmat, + Boston$medv) Amat - cbind(1, diag(NROW(Dmat))); bvec - c(1,rep(0,NROW(Dmat))); meq - 1 library(quadprog); res - solve.QP(Dmat, dvec, Amat, bvec, meq) zapsmall(res$solution) # zapsmall not really needed in this instance [1] 0.1163065 0.8836935 -- David. Spencer On 10/31/2010 9:01 AM, David Winsemius wrote: On Oct 31, 2010, at 2:44 AM, Jim Silverton wrote: Hello everyone, I have 3 variables Y, X1 and X2. Each variables lies between 0 and 1. I want to do a constrained regression such that a0 and (1-a) 0 for the model: Y = a*X1 + (1-a)*X2 It would not accomplish the constraint that a 0 but you could accomplish the other constraint within an lm fit: X3 - X1-X2 lm(Y ~ X3 + offset(X2) ) Since beta1 is for the model Y ~ 1 + beta1(X1- X2) + 1*X2) Y ~ intercept + beta1*X1 + (1 -beta1)*X2 ... so beta1 is a. In the case beta 0 then I suppose a would be assigned 0. This might be accomplished within an iterative calculation framework by a large penalization for negative values. In a reply (1) to a question by Carlos Alzola in 2008 on rhalp, Berwin Turlach offered a solution to a similar problem ( sum(coef) == 1 AND coef non- negative). Modifying his code to incorporate the above strategy (and choosing two variables for which parameter values might be inside the constraint boundaries) we get: library(MASS) ## to access the Boston data designmat - model.matrix(medv~I(age-lstat) +offset(lstat), data=Boston) Dmat -crossprod(designmat, designmat); dvec - crossprod(designmat, Boston$medv) Amat - cbind(1, diag(NROW(Dmat))) bvec - c(1,rep(0,NROW(Dmat))) meq - 1 library(quadprog) res - solve.QP(Dmat, dvec, Amat, bvec, meq) zapsmall(res$solution) [1] 0.686547 0.313453 Turlach specifically advised against any interpretation of this particular result which was only contructed to demonstrate the mathematical mechanics. I tried the help on the constrained regression in R but I concede that it was not helpful. I must not have that package installed because I got nothing that appeared to be useful with ??constrained regression . David Winsemius, MD West Hartford, CT 1) http://finzi.psych.upenn.edu/Rhelp10/2008-March/155990.html __ David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Randomly split a sample in two equal subsamples
Thanks, but I just don't know how to translate that to a dataset with rows and columns. Initially, I was thinking about something like that: # Create some data: a - c(10,20,15,43,76,41,25,46) b - factor(c(m, w, m, w, m, w, m, w)) c - c(2,5,8,3,6,1,5,6) number - c(1:8) myframe - data.frame(a,b,c, number) # Randomly sample a subset of numbers: v1 - sample(number, 4, replace=FALSE) v2 - number[-v1] # Use the subset command like this: firsthalf - subset(myframe, number=v1) Of course, the last line doesn't work. Is this generally a wrong approach, or is just my writing wrong? -- View this message in context: http://r.789695.n4.nabble.com/Randomly-split-a-sample-in-two-equal-subsamples-tp3021140p3021339.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Randomly split a sample in two equal subsamples
firsthalf - myframe[v1,] or firsthalf - subset(myframe, number %in% v1) - A R learner. -- View this message in context: http://r.789695.n4.nabble.com/Randomly-split-a-sample-in-two-equal-subsamples-tp3021140p3021353.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] extracting named vector from dataframe
Of course you are right that this would not be appropriate in general, but what I'm doing--which as Baptiste explained can be done much more nicely with unlist()---seems reasonable in my context: The dataframe has a computed statistic for each input, but I need a vector so that I can do operations such as sort(). -Original Message- From: David Winsemius [mailto:dwinsem...@comcast.net] Sent: Sunday, October 31, 2010 12:24 PM To: James Hirschorn Cc: R-help@r-project.org Subject: Re: [R] extracting named vector from dataframe On Oct 31, 2010, at 11:54 AM, James Hirschorn wrote: Suppose df is a dataframe with one named row of numeric observations. I want to coerce df into a named vector. I don't think you understand the structure of dataframes. They are named lists of component columns. The names you are attributing to the rows are not attached to the observations but rather are column names. So that row is not in any sense a named vector. If you created a dataframe with the first column a named vector its names would become the rownames. as.vector does not work as I expected: as.vector(df) returns the original dataframe, while as.vector(df,mode=numeric) returns an unnamed vector of NAs. This works: v - as.numeric(as.matrix(df)); names(v) - names(df); Right. You are now assigning the column names to the elements in a row, but in some ways that is an unnatural act, and not something that would be expected to work in the general case where a row might be a diverse set of types and even different classes. Your as.matrix operation coerced all of the values to be of one type. I just wanted check if there was a better/more natural way of doing this? -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Downloading Package Sources
On 31.10.2010 11:30, Santosh Srinivas wrote: Dear Group, any idea how I can download the source code for all packages in Windows 7? Either apply wget on yourCRANmirror/src/contrib/ or - choose CRAN mirror - select CRAN as the only repository (unless yoiu want other packages as well) - ask R to download.packages(available.packages(type=source)[,1], type=source, destdir=.) Maybe you want to redefine the filters. for available.packages. See its help page for details. Uwe Ligges __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Questions about Probit Analysis
Dear All, I have some questions about probit regressions. I saw a nice introduction at http://bit.ly/bU9xL5 and I mainly have two questions. (1) The first is almost about data manipulation. Consider the following snippet ## mydata - read.csv(url(http://www.ats.ucla.edu/stat/r/dae/binary.csv;)) names(mydata) - c(outcome,x1,x2,x3) myprobit - glm(mydata$outcome~mydata$x1+mydata$x2+as.factor(mydata$x3), family=binomial(link=probit)) print(summary(myprobit)) #Now assume I can make a regression only on x1 myprobit2 - glm(mydata$outcome~mydata$x1, family=binomial(link=probit)) print(summary(myprobit2)) #express in terms of counts md - t(table(mydata$outcome, mydata$x1)) # create new dataframe mydatanew - data.frame(as.numeric(row.names(md))) names(mydatanew) - c(x1) mydatanew$successes -as.numeric(md[ ,2]) mydatanew$failures -as.numeric(md[ ,1]) where first I carry out a logit regression of the binary outcome (i.e. taking only 0/1 as values) on 3 regressors, then I simply regress the outcome on the x1 variable. Finally, I generate the data frame mydatanew (see some of its entries below) mydatanew x1 successes failures 1 220 01 2 300 12 3 340 13 4 360 04 5 380 08 [...] where for every value of x1 I count the number of 0 and 1 outcomes (namely number of failures and number of successes). This is equivalent to having a full list of x1 values with an associated 0/1 outcome (I have simply counted them) hence it is all the info I need to again perform a logit regression of the binary outcome on x1, but the data format is now different. How can I actually feed R with mydatanew to perform again a logistic regression on x1 only? (2) This is a bit more conceptual. Let us say that you have a set of products A,B,C,D,E,F. Each product has a list of features: x_A for product A, x_B for B etc... Each customer has its own set of parameters (age, sex, income etc..) I call x_cust. Finally, the customer is confronted with two products (e.g. A and D; combinations may vary, I call each combination of two products a scenario) and asked which one he would like to buy. Bottom line: your data are in the format 1 x_A x_cust 0 x_D x_cust meaning that a certain customer chose product A against product D; similarly 1 x_C x_cust 0 x_B x_cust would mean that the customer choosing between C and B finally selected C. Every customer needs to choose a product in a variety of different scenarios. How would you analyze this kind of data? Is there any way I can express, in my probit analysis, the fact that my binary outcome (but this product or not) arises always from the comparison of two products only (customers are never given a choice between more than two products in a given scenario). Or should I simply run my logistic regression on my 0/1 outcome without any extra worry (like in the snippet above)? Many thanks Lorenzo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Randomly split a sample in two equal subsamples
Thanks, it works! -- View this message in context: http://r.789695.n4.nabble.com/Randomly-split-a-sample-in-two-equal-subsamples-tp3021140p3021365.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] extracting named vector from dataframe
On Sun, Oct 31, 2010 at 12:11 PM, baptiste auguie baptiste.aug...@googlemail.com wrote: Hi, I think you want ?unlist d = data.frame(x=1, y=2, z=3) v = unlist(d) is(v) [1] numeric vector Here are a few other possibilities too: drop(as.matrix(d)) do.call(c, d) sapply(d, identity) -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] latex on summary.formula from Hmisc doesn't remove N
Hi All, I'm trying to get a summary table of my datasets, in which I want the mean of different groups calculated and presented in a table. I want this table to easily be exported to LaTeX. However, I'm not able to remove the N in this summary table. It gives the following error: Error in 1:lt : argument of length 0 What have I tried: tmp-summary.formula(traveltime ~ guidanceID + gender, data=event2, fun=mean, method='cross', prn=FALSE) latex(tmp, file=, title=,cdec=0, prn = FALSE, booktabs=TRUE) Error in 1:lt : argument of length 0 Without prn = FALSE or with prn = TRUE, the latex function works, but gives the unwanted result with the number of values. Using: print(tmp, prn=FALSE) gives the proper table, but this cannot be convert to a LaTeX table. Am I doing something wrong here? Or is it a bug in Hmisc? Any ideas? Thijs __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] transfer string to expression
Dear all: when I use parse() there is some problems. Below is an example: b0-1 b1-1 x-1 str2expr-function(x){eval(parse(text=x))} test1-b0+b1*sqrt(x) test2-b0+b1 str2expr(test1) str2expr(test2) it can work well for test2 but not for test1. Could you tell me how to fix this problem or is there other more stable method to transfer an string to expression? best regards Elong [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] transfer string to expression
On 31/10/2010 3:22 PM, Yilong Zhang wrote: Dear all: when I use parse() there is some problems. Below is an example: b0-1 b1-1 x-1 str2expr-function(x){eval(parse(text=x))} test1-b0+b1*sqrt(x) test2-b0+b1 str2expr(test1) str2expr(test2) it can work well for test2 but not for test1. Could you tell me how to fix this problem or is there other more stable method to transfer an string to expression? You are evaluating those expressions in the evaluation frame of your function, where x is the expression. Just specify the environment in which you want to evaluate them, e.g. str2expr-function(x){eval(parse(text=x), envir=parent.frame() )} Duncan Murdoch best regards Elong [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] transfer string to expression
On Oct 31, 2010, at 3:22 PM, Yilong Zhang wrote: Dear all: when I use parse() there is some problems. Below is an example: b0-1 b1-1 x-1 str2expr-function(x){eval(parse(text=x))} test1-b0+b1*sqrt(x) test2-b0+b1 str2expr(test1) str2expr(test2) I don't think the scoping rules are up to the task of keeping the x's distinct: b0-1 b1-1 xx-1 str2expr-function(x){eval(parse(text=x))} test1-b0+b1*sqrt(xx) test2-b0+b1 str2expr(test1) [1] 2 str2expr(test2) [1] 2 it can work well for test2 but not for test1. Could you tell me how to fix this problem or is there other more stable method to transfer an string to expression? I generally try to keep the variables distinct so my brain can handle the various levels. So I would probably not have used x in that manner. I think it made the interpreter attempt to take the sqrt of b0+b1*sqrt(x), which gave you the Error in sqrt(x) : Non-numeric argument to mathematical function And apropos is this: fortune(If the answer is parse()) If the answer is parse() you should usually rethink the question. -- Thomas Lumley R-help (February 2005) -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] complicated graphic -- persp+map
On 31/10/2010 1:29 PM, claudia tebaldi wrote: Hello I'm trying to render in 3D what I usually plot by image(), or image.plot() from the library fields, followed by a map(world,add=TRUE) type of command. More concretely, I have a field of temperature values, for a given geographic area, and I would like to plot a 3D surface whose x and y axes consist of longitude and latitude values and whose height and color-coding correspond to the temperature values but also -- and here is the problem -- superimpose a map of the region. persp() can easily do the color-coded surface but I cannot figure out how (if at all possible) to overlay geographic boundaries, in this specific case a simple map of the land area outlines. This is possible in rgl; see example(persp3d). (This is working in R 2.11.1, but not in the CRAN build of R 2.12.0 at the moment. If you build it yourself it works there.) Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] transfer string to expression
Hi Duncan: I'm curious about the environment setting. ?eval says: If envir is not specified, then the default is parent.frame() (the environment where the call to eval was made). So what's the difference between set envir=parent.frame() or not? Thank you. Wu - A R learner. -- View this message in context: http://r.789695.n4.nabble.com/transfer-string-to-expression-tp3021453p3021487.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] transfer string to expression
On 31/10/2010 4:47 PM, Wu Gong wrote: Hi Duncan: I'm curious about the environment setting. ?eval says: If envir is not specified, then the default is parent.frame() (the environment where the call to eval was made). So what's the difference between set envir=parent.frame() or not? If you pass parent.frame() as an argument, it will be one level up: it's the parent of the function that calls eval. If you leave it at the default, it's the parent of eval, i.e. the function that calls it. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] doubt in climate variability analysis in R! - code
Ok I downloaded it and showed you how to get your data out. How to read it into a raster brick, how to plot the data, how to get the mean rainfall of every day.lots more you can do. there is a bad bit of data in the last time step. check my blog. In the future what you should do is write code to emulate your problem. for example, in your problem you had created a ncdf file with a 3D matrix of 65,69,2192. You should just do a subset of that, show the code to create a ncdf with random numbers in it. creating working code that emulates your problem is key if you want help. Off list for the rest. On Sun, Oct 31, 2010 at 10:21 AM, govin...@msu.edu wrote: I am sorry, i think the link was broken..! here is the correct one!!! http://www.4shared.com/file/4zV0g3JR/RF_80-85.html [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] For loop
Hi, Using the code below I got almost a 60% speedup. Obviously this is largely dependent on there being relatively fewer columns than rows. HTH, Josh # ## Create data h - structure(c(0.25, 0.25, 0, 0, 0, 0, -0.25, -0.25, -0.25, -0.25, -0.5, -0.5, 0, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25), .Dim = c(1L, 22L), .Dimnames = list(NULL, NULL)) xx - seq(0, 1, 0.5) v - matrix(nrow=20001,ncol=22) vv - matrix(nrow=20001,ncol=22) ## Create duplicates for comparison h.2 - h; xx.2 - xx; v.2 - v; vv.2 - vv ## Old system.time(for (y in 1:20001) {v[y, 22] - h[1, 22]}) ## New system.time(v.2[, 22] - h.2[1, 22]) ## no speedup possible here vv[20001,22] - v[20001,22] vv.2[20001,22] - v.2[20001,22] ## Old system.time( for(k in 21:1) { for(j in 20001:2) { vv[j-1, k + 1] - min(xx[j-1] * v[j-1,k+1], vv[j,k+1]) v[j, k] - h[1, k] + vv[j-1, k+1] } vv[20001, k] - v[20001, k] }) # user system elapsed # 20.733 0.044 25.869 ## New system.time(for(k in 21:1) { tmp - xx.2 * v.2[, k+1] for(j in 20001:2) { vv.2[j-1, k + 1] - min(tmp[j-1], vv.2[j, k+1]) } v.2[-1, k] - h.2[1, k] + vv.2[-20001, k+1] vv.2[20001, k] - v.2[20001, k] }) # user system elapsed # 9.184 0.012 10.500 ## Check that new is identical to old identical(vv, vv.2) identical(v, v.2) On Sun, Oct 31, 2010 at 3:58 AM, Astabenefica astabenef...@hotmail.it wrote: Thanks for the suggestions. I add some more detail to clarify: h=matrix(nrow=1,ncol=22) In my data h is: (0.25 0.25 0 0 0 0 -0.25 -0.25 -0.25 -0.25 -0.5 -0.5 0 0.25 0.25 0.25 0.25 0.25 0.25 0.25 0.25 0.25) xx-seq(0,1,0.5) v=matrix(nrow=20001,ncol=22) vv=matrix(nrow=20001,ncol=22) for (y in 1:20001) { v[y,22]=h[1,22] } vv[20001,22]=v[20001,22] for(k in 21:1) { for(j in 20001:2) { vv[j-1,k+1]=min(xx[j-1]*v[j-1,k+1],vv[j,k+1]) v[j,k]=h[1,k]+vv[j-1,k+1] } vv[20001,k]=v[20001,k] } The idea of using Rcpp seems to be good. Having never used this package will give a look. Somewhere I read an example like this: for (i in 1:R) { res[i]-f() NULL } where f() is a function, but I don't how can I trasform my code: vv[j-1,k+1]=min(xx[j-1]*v[j-1,k+1],vv[j,k+1]) v[j,k]=h[1,k]+vv[j-1,k+1] in a function [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] For loop
Actually, I'm able to gain another second by limiting -/+s in the subscripts (maybe just an artifact? I ran it a couple of times to check, but still). system.time(for(k in 22:2) { tmp - xx.2 * v.2[, k] for(j in 2:1) { vv.2[j, k] - min(tmp[j], vv.2[j+1, k]) } v.2[-1, k-1] - h.2[1, k-1] + vv.2[-20001, k] vv.2[20001, k-1] - v.2[20001, k-1] }) On Sun, Oct 31, 2010 at 2:39 PM, Joshua Wiley jwiley.ps...@gmail.com wrote: Hi, Using the code below I got almost a 60% speedup. Obviously this is largely dependent on there being relatively fewer columns than rows. HTH, Josh # ## Create data h - structure(c(0.25, 0.25, 0, 0, 0, 0, -0.25, -0.25, -0.25, -0.25, -0.5, -0.5, 0, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25), .Dim = c(1L, 22L), .Dimnames = list(NULL, NULL)) xx - seq(0, 1, 0.5) v - matrix(nrow=20001,ncol=22) vv - matrix(nrow=20001,ncol=22) ## Create duplicates for comparison h.2 - h; xx.2 - xx; v.2 - v; vv.2 - vv ## Old system.time(for (y in 1:20001) {v[y, 22] - h[1, 22]}) ## New system.time(v.2[, 22] - h.2[1, 22]) ## no speedup possible here vv[20001,22] - v[20001,22] vv.2[20001,22] - v.2[20001,22] ## Old system.time( for(k in 21:1) { for(j in 20001:2) { vv[j-1, k + 1] - min(xx[j-1] * v[j-1,k+1], vv[j,k+1]) v[j, k] - h[1, k] + vv[j-1, k+1] } vv[20001, k] - v[20001, k] }) # user system elapsed # 20.733 0.044 25.869 ## New system.time(for(k in 21:1) { tmp - xx.2 * v.2[, k+1] for(j in 20001:2) { vv.2[j-1, k + 1] - min(tmp[j-1], vv.2[j, k+1]) } v.2[-1, k] - h.2[1, k] + vv.2[-20001, k+1] vv.2[20001, k] - v.2[20001, k] }) # user system elapsed # 9.184 0.012 10.500 ## Check that new is identical to old identical(vv, vv.2) identical(v, v.2) On Sun, Oct 31, 2010 at 3:58 AM, Astabenefica astabenef...@hotmail.it wrote: Thanks for the suggestions. I add some more detail to clarify: h=matrix(nrow=1,ncol=22) In my data h is: (0.25 0.25 0 0 0 0 -0.25 -0.25 -0.25 -0.25 -0.5 -0.5 0 0.25 0.25 0.25 0.25 0.25 0.25 0.25 0.25 0.25) xx-seq(0,1,0.5) v=matrix(nrow=20001,ncol=22) vv=matrix(nrow=20001,ncol=22) for (y in 1:20001) { v[y,22]=h[1,22] } vv[20001,22]=v[20001,22] for(k in 21:1) { for(j in 20001:2) { vv[j-1,k+1]=min(xx[j-1]*v[j-1,k+1],vv[j,k+1]) v[j,k]=h[1,k]+vv[j-1,k+1] } vv[20001,k]=v[20001,k] } The idea of using Rcpp seems to be good. Having never used this package will give a look. Somewhere I read an example like this: for (i in 1:R) { res[i]-f() NULL } where f() is a function, but I don't how can I trasform my code: vv[j-1,k+1]=min(xx[j-1]*v[j-1,k+1],vv[j,k+1]) v[j,k]=h[1,k]+vv[j-1,k+1] in a function [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] parallel for loop
Hi all, Just following on from a previous thread (for loop). Is there a parallel 'for' loop like matlab (parfor maybe?). I know there was a Nvidia GPU version for blas somewhere. But is there a CPU or a GPU version of the for loop? Thanks, Sachin p.s. sorry about the corporate notice below: cant get rid of it. Don't have other mail client access at the office :( --- Please consider the environment before printing this email --- Allianz - Best General Insurance Company of the Year 2010* Allianz - General Insurance Company of the Year 2009+ * Australian Banking and Finance Insurance Awards + Australia and New Zealand Insurance Industry Awards This email and any attachments has been sent by Allianz Australia Insurance Limited (ABN 15 000 122 850) and is intended solely for the addressee. It is confidential, may contain personal information and may be subject to legal professional privilege. Unauthorised use is strictly prohibited and may be unlawful. If you have received this by mistake, confidentiality and any legal privilege are not waived or lost and we ask that you contact the sender and delete and destroy this and any other copies. In relation to any legal use you may make of the contents of this email, you must ensure that you comply with the Privacy Act (Cth) 1988 and you should note that the contents may be subject to copyright and therefore may not be reproduced, communicated or adapted without the express consent of the owner of the copyright. Allianz will not be liable in connection with any data corruption, interruption, delay, computer virus or unauthorised access or amendment to the contents of this email. If this email is a commercial electronic message and you would prefer not to receive further commercial electronic messages from Allianz, please forward a copy of this email to unsubscr...@allianz.com.au with the word unsubscribe in the subject header. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] parallel for loop
Take a look at the parallel computing section of: http://cran.r-project.org/web/views/HighPerformanceComputing.html specifically the line concerning the foreach package. Jeff. On Sun, Oct 31, 2010 at 6:38 PM, sachinthaka.abeyward...@allianz.com.au wrote: Hi all, Just following on from a previous thread (for loop). Is there a parallel 'for' loop like matlab (parfor maybe?). I know there was a Nvidia GPU version for blas somewhere. But is there a CPU or a GPU version of the for loop? Thanks, Sachin p.s. sorry about the corporate notice below: cant get rid of it. Don't have other mail client access at the office :( --- Please consider the environment before printing this email --- Allianz - Best General Insurance Company of the Year 2010* Allianz - General Insurance Company of the Year 2009+ * Australian Banking and Finance Insurance Awards + Australia and New Zealand Insurance Industry Awards This email and any attachments has been sent by Allianz Australia Insurance Limited (ABN 15 000 122 850) and is intended solely for the addressee. It is confidential, may contain personal information and may be subject to legal professional privilege. Unauthorised use is strictly prohibited and may be unlawful. If you have received this by mistake, confidentiality and any legal privilege are not waived or lost and we ask that you contact the sender and delete and destroy this and any other copies. In relation to any legal use you may make of the contents of this email, you must ensure that you comply with the Privacy Act (Cth) 1988 and you should note that the contents may be subject to copyright and therefore may not be reproduced, communicated or adapted without the express consent of the owner of the copyright. Allianz will not be liable in connection with any data corruption, interruption, delay, computer virus or unauthorised access or amendment to the contents of this email. If this email is a commercial electronic message and you would prefer not to receive further commercial electronic messages from Allianz, please forward a copy of this email to unsubscr...@allianz.com.au with the word unsubscribe in the subject header. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Rserve error
Hi all, I'm trying to run Rserve on windows RGui. It installs successfully but when I use Rserve() to invoke the service it shows following error: The program can't start because R.dll is missing from your computer. Try reinstalling the program to fix this problem. I even tried reinstalling R but it still shows the same error. Thanks, Anand [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] how to save this result in a vector
HI, Dear R community, I have the following codes to calculate the commulative coverage. I want to save the output in a vector, How to do this? test-seq(10, 342, by=2) #cover is a vector cover_per-function (cover) { for (i in min(cover):max(cover)) {print(100*sum(ifelse(cover = i, 1, 0))/length(cover))} } result-cover_per(test) result NULL Can anyone help me this this? -- Sincerely, Changbin -- [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to save this result in a vector
Hi Changbin, The yek is that you need to save the results of your for loop rather than just printing it. It also may be possible to vectorize your for loop which might simplify things and speed them up, but I did not look at that. Here is one way to save the results, see inline comments for more details. Cheers, Josh ## test-seq(10, 342, by=2) #cover is a vector cover_per - function(cover) { ## create a vector to store the results of your for loop output - vector(numeric, length(min(cover):max(cover))) for (i in min(cover):max(cover)) { ## rather than print()ing the output, assign it to an object output[i] - 100*sum(ifelse(cover = i, 1, 0))/length(cover) } ## have the return value from the function be ## the object 'output' return(output) } ## here the results will be printed to the screen ## (the results are just the 'output' object) cover_per(test) ## if you assign it to another object, nothing is printed ## but you can easily print 'result' if you want result - cover_per(test) ## On Sun, Oct 31, 2010 at 5:35 PM, Changbin Du changb...@gmail.com wrote: HI, Dear R community, I have the following codes to calculate the commulative coverage. I want to save the output in a vector, How to do this? test-seq(10, 342, by=2) #cover is a vector cover_per-function (cover) { for (i in min(cover):max(cover)) {print(100*sum(ifelse(cover = i, 1, 0))/length(cover))} } result-cover_per(test) result NULL Can anyone help me this this? -- Sincerely, Changbin -- [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to save this result in a vector
On Oct 31, 2010, at 8:35 PM, Changbin Du wrote: HI, Dear R community, I have the following codes to calculate the commulative coverage. Not sure exactly what you mean by this. My guess is implemented below. I want to save the output in a vector, How to do this? test-seq(10, 342, by=2) #cover is a vector cover_per-function (cover) { for (i in min(cover):max(cover)) Using either for (i in cover) { ...} or for (i in seq_along(cover) ) {...} would be more typical. {print(100*sum(ifelse(cover = i, 1, 0))/length(cover))} } result-cover_per(test) Are you looking for cumsum? test-seq(10, 34, by=2) 100*cumsum(test)/sum(test) [1] 3.496503 7.692308 12.587413 18.181818 24.475524 31.468531 39.160839 [8] 47.552448 56.643357 66.433566 76.923077 88.111888 100.00 print(100*cumsum(test)/sum(test), digits=2) [1] 3.5 7.7 12.6 18.2 24.5 31.5 39.2 47.6 56.6 66.4 76.9 88.1 100.0 -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to save this result in a vector
On Sun, Oct 31, 2010 at 5:44 PM, Joshua Wiley jwiley.ps...@gmail.com wrote: snip #cover is a vector cover_per - function(cover) { ## create a vector to store the results of your for loop output - vector(numeric, length(min(cover):max(cover))) for (i in min(cover):max(cover)) { ## rather than print()ing the output, assign it to an object output[i] - 100*sum(ifelse(cover = i, 1, 0))/length(cover) } ## have the return value from the function be ## the object 'output' return(output) } I did not catch that i was not necessarily starting at 1 going sequentially up, so that would have to be done manually (or use cumsum per David rather than the function you wrote). cover_per2 - function(cover) { output - vector(numeric, length(min(cover):max(cover))) j - 1 for (i in min(cover):max(cover)) { output[j] - 100*sum(ifelse(cover = i, 1, 0))/length(cover) j - j + 1 } return(output) } Josh __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to save this result in a vector
HI, David, Juan, and Joshua, Thanks so much for your help! The following codes works. Appreciated! test-seq(10, 342, by=2) #data is a vector cover_per-function (data) { output-vector(numeric,length(min(data):max(data))) for (i in min(data):max(data)) { output[i]-(100*sum(ifelse(data = i, 1, 0))/length(data)) } return(output) } result-cover_per(test) #*** #data is a vector cover_per-function (data) { output-numeric(0) for (i in min(data):max(data)) { x-(100*sum(ifelse(data = i, 1, 0))/length(data)) output-c(output, x) } return(output) } result-cover_per(test) On Sun, Oct 31, 2010 at 5:46 PM, David Winsemius dwinsem...@comcast.netwrote: On Oct 31, 2010, at 8:35 PM, Changbin Du wrote: HI, Dear R community, I have the following codes to calculate the commulative coverage. Not sure exactly what you mean by this. My guess is implemented below. I want to save the output in a vector, How to do this? test-seq(10, 342, by=2) #cover is a vector cover_per-function (cover) { for (i in min(cover):max(cover)) Using either for (i in cover) { ...} or for (i in seq_along(cover) ) {...} would be more typical. {print(100*sum(ifelse(cover = i, 1, 0))/length(cover))} } result-cover_per(test) Are you looking for cumsum? test-seq(10, 34, by=2) 100*cumsum(test)/sum(test) [1] 3.496503 7.692308 12.587413 18.181818 24.475524 31.468531 39.160839 [8] 47.552448 56.643357 66.433566 76.923077 88.111888 100.00 print(100*cumsum(test)/sum(test), digits=2) [1] 3.5 7.7 12.6 18.2 24.5 31.5 39.2 47.6 56.6 66.4 76.9 88.1 100.0 -- David Winsemius, MD West Hartford, CT -- Sincerely, Changbin -- Changbin Du DOE Joint Genome Institute [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to save this result in a vector
Thanks Joshua! Yes, i is not going up sequentially by 1, as i here is the raw number of reads for each DNA base. Thanks so much for the great help! On Sun, Oct 31, 2010 at 6:03 PM, Joshua Wiley jwiley.ps...@gmail.comwrote: On Sun, Oct 31, 2010 at 5:44 PM, Joshua Wiley jwiley.ps...@gmail.com wrote: snip #cover is a vector cover_per - function(cover) { ## create a vector to store the results of your for loop output - vector(numeric, length(min(cover):max(cover))) for (i in min(cover):max(cover)) { ## rather than print()ing the output, assign it to an object output[i] - 100*sum(ifelse(cover = i, 1, 0))/length(cover) } ## have the return value from the function be ## the object 'output' return(output) } I did not catch that i was not necessarily starting at 1 going sequentially up, so that would have to be done manually (or use cumsum per David rather than the function you wrote). cover_per2 - function(cover) { output - vector(numeric, length(min(cover):max(cover))) j - 1 for (i in min(cover):max(cover)) { output[j] - 100*sum(ifelse(cover = i, 1, 0))/length(cover) j - j + 1 } return(output) } Josh -- Sincerely, Changbin -- Changbin Du DOE Joint Genome Institute Bldg 400 Rm 457 2800 Mitchell Dr Walnut Creet, CA 94598 Phone: 925-927-2856 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Constrained Regression
I thought I would 'add' some meat to the problem I sent. This is all I know (1) f = a*X1 + (1-a)*X2 (2) I know n values of f and X1 which happens to be probabilities (3) I know nothing about X2 except that it also lies in (0,1) (4) X1 is the probability under the null (fisher's exact test) and X2 is the alternative. But I have no idea what the alternative to fisher's exact test can be.. (5) I need to estimate a (which is supposed to be a proportion). (6) I was thinking about imputing values from (0,1) using a beta as the values of X2. Any help is greatly appreciated. Jim... On Sun, Oct 31, 2010 at 1:44 PM, David Winsemius dwinsem...@comcast.netwrote: On Oct 31, 2010, at 12:54 PM, Spencer Graves wrote: Have you tried the 'sos' package? I have, and I am taking this opportunity to load it with my .Rprofile to make it more accessible. It works very well. Very clean display. I also have constructed a variant of RSiteSearch that I find more useful than the current defaults. rhelpSearch - function(string, restrict = c(Rhelp10, Rhelp08, Rhelp02, functions ), matchesPerPage = 100, ...) { RSiteSearch(string=string, restrict = restrict, matchesPerPage = matchesPerPage, ...)} install.packages('sos') # if not already installed library(sos) cr - ???'constrained regression' # found 149 matches summary(cr) # in 69 packages cr # opens a table in a browser listing all 169 matches with links to the help pages However, I agree with Ravi Varadhan: I'd want to understand the physical mechanism generating the data. If each is, for example, a proportion, then I'd want to use logistic regression, possible after some approximate logistic transformation of X1 and X2 that prevents logit(X) from going to +/-Inf. This is a different model, but it achieves the need to avoid predictions of Y going outside the range (0, 1). No argument. I defer to both of your greater experiences in such problems and your interest in educating us less knowledgeable users. I also need to amend my suggested strategy in situations where a linear model _might_ be appropriate, since I think the inclusion of the surrogate variable in the solve.QP setup is very probably creating confusion. After reconsideration I think one should keep the two approaches separate. These are two approaches to the non-intercept versions of the model that yield the same estimate (but only because the constraints do not get invoked ): lm(medv~I(age-lstat) +offset(lstat) -1, data=Boston) Call: lm(formula = medv ~ I(age - lstat) + offset(lstat) - 1, data = Boston) Coefficients: I(age - lstat) 0.1163 library(MASS) ## to access the Boston data designmat - model.matrix(medv~age+lstat-1, data=Boston) Dmat -crossprod(designmat, designmat); dvec - crossprod(designmat, + Boston$medv) Amat - cbind(1, diag(NROW(Dmat))); bvec - c(1,rep(0,NROW(Dmat))); meq - 1 library(quadprog); res - solve.QP(Dmat, dvec, Amat, bvec, meq) zapsmall(res$solution) # zapsmall not really needed in this instance [1] 0.1163065 0.8836935 -- David. Spencer On 10/31/2010 9:01 AM, David Winsemius wrote: On Oct 31, 2010, at 2:44 AM, Jim Silverton wrote: Hello everyone, I have 3 variables Y, X1 and X2. Each variables lies between 0 and 1. I want to do a constrained regression such that a0 and (1-a) 0 for the model: Y = a*X1 + (1-a)*X2 It would not accomplish the constraint that a 0 but you could accomplish the other constraint within an lm fit: X3 - X1-X2 lm(Y ~ X3 + offset(X2) ) Since beta1 is for the model Y ~ 1 + beta1(X1- X2) + 1*X2) Y ~ intercept + beta1*X1 + (1 -beta1)*X2 ... so beta1 is a. In the case beta 0 then I suppose a would be assigned 0. This might be accomplished within an iterative calculation framework by a large penalization for negative values. In a reply (1) to a question by Carlos Alzola in 2008 on rhalp, Berwin Turlach offered a solution to a similar problem ( sum(coef) == 1 AND coef non-negative). Modifying his code to incorporate the above strategy (and choosing two variables for which parameter values might be inside the constraint boundaries) we get: library(MASS) ## to access the Boston data designmat - model.matrix(medv~I(age-lstat) +offset(lstat), data=Boston) Dmat -crossprod(designmat, designmat); dvec - crossprod(designmat, Boston$medv) Amat - cbind(1, diag(NROW(Dmat))) bvec - c(1,rep(0,NROW(Dmat))) meq - 1 library(quadprog) res - solve.QP(Dmat, dvec, Amat, bvec, meq) zapsmall(res$solution) [1] 0.686547 0.313453 Turlach specifically advised against any interpretation of this particular result which was only contructed to demonstrate the mathematical mechanics. I tried the help on the constrained regression in R but I concede that it was not helpful. I must not have that package installed because I got
[R] Mean and individual growth curve trajectories
I'm trying to understand how to plot individual growth curve trajectories, with the overall mean trajectory superimposed (preferably in a slightly thicker line, maybe in black) over the individual trajectories. Using the sleepstudy data in lme4, here is the code I have so far: library(lme4) library(lattice) xyplot(Reaction ~ Days, data = sleepstudy, group = Subject, type = 'l') This plot produces the individual growth curves nicely, but I'd like to be able to plot the mean for each day (averaged over subjects) on top of this graph. Many thanks in advance for any help/suggestions. John -- View this message in context: http://r.789695.n4.nabble.com/Mean-and-individual-growth-curve-trajectories-tp3021672p3021672.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Constrained Regression
in line On 10/31/2010 6:26 PM, Jim Silverton wrote: I thought I would 'add' some meat to the problem I sent. This is all I know (1) f = a*X1 + (1-a)*X2 How do you know f = a*X1 + (1-a)*X2? Why does this relationship make more sense than, e.g., log(f/(1-f)) = a*X1 + (1-a)*X2? (2) I know n values of f and X1 which happens to be probabilities What do you mean that f and X1 are probabilities? Where do they come from? Are the the results of binomial experiments like tosses of a biased coin? (3) I know nothing about X2 except that it also lies in (0,1) (4) X1 is the probability under the null (fisher's exact test) and X2 is the alternative. But I have no idea what the alternative to fisher's exact test can be.. How do you compute X1? Do you compute it from data? If yes, then it's more like an estimate of a probability rather than a probability itself. Ditto for X2. The preferred method for estimating almost anything whenever feasible is to write a likelihood function, i.e., the probability of data as a function of unknown parameters, then estimate those parameters to maximize the likelihood. Even most nonparametric procedures can be expressed this way for an appropriately chosen probability model. In most situations, I prefer to eliminate constraints by transformations, replacing f by ph = log(f/(1-f), X1 and X2 by xi1 = log(X1/(1-X1)) and X2 by xi2 = log(X2/(1-X2)), a by alpha = log(a/(1-a)), then write a relationship between these unconstrained variables and try to estimate alpha by maximum likelihood. Hope this helps. Spencer (5) I need to estimate a (which is supposed to be a proportion). (6) I was thinking about imputing values from (0,1) using a beta as the values of X2. Any help is greatly appreciated. Jim... On Sun, Oct 31, 2010 at 1:44 PM, David Winsemiusdwinsem...@comcast.netwrote: On Oct 31, 2010, at 12:54 PM, Spencer Graves wrote: Have you tried the 'sos' package? I have, and I am taking this opportunity to load it with my .Rprofile to make it more accessible. It works very well. Very clean display. I also have constructed a variant of RSiteSearch that I find more useful than the current defaults. rhelpSearch- function(string, restrict = c(Rhelp10, Rhelp08, Rhelp02, functions ), matchesPerPage = 100, ...) { RSiteSearch(string=string, restrict = restrict, matchesPerPage = matchesPerPage, ...)} install.packages('sos') # if not already installed library(sos) cr- ???'constrained regression' # found 149 matches summary(cr) # in 69 packages cr # opens a table in a browser listing all 169 matches with links to the help pages However, I agree with Ravi Varadhan: I'd want to understand the physical mechanism generating the data. If each is, for example, a proportion, then I'd want to use logistic regression, possible after some approximate logistic transformation of X1 and X2 that prevents logit(X) from going to +/-Inf. This is a different model, but it achieves the need to avoid predictions of Y going outside the range (0, 1). No argument. I defer to both of your greater experiences in such problems and your interest in educating us less knowledgeable users. I also need to amend my suggested strategy in situations where a linear model _might_ be appropriate, since I think the inclusion of the surrogate variable in the solve.QP setup is very probably creating confusion. After reconsideration I think one should keep the two approaches separate. These are two approaches to the non-intercept versions of the model that yield the same estimate (but only because the constraints do not get invoked ): lm(medv~I(age-lstat) +offset(lstat) -1, data=Boston) Call: lm(formula = medv ~ I(age - lstat) + offset(lstat) - 1, data = Boston) Coefficients: I(age - lstat) 0.1163 library(MASS) ## to access the Boston data designmat- model.matrix(medv~age+lstat-1, data=Boston) Dmat-crossprod(designmat, designmat); dvec- crossprod(designmat, + Boston$medv) Amat- cbind(1, diag(NROW(Dmat))); bvec- c(1,rep(0,NROW(Dmat))); meq- 1 library(quadprog); res- solve.QP(Dmat, dvec, Amat, bvec, meq) zapsmall(res$solution) # zapsmall not really needed in this instance [1] 0.1163065 0.8836935 -- David. Spencer On 10/31/2010 9:01 AM, David Winsemius wrote: On Oct 31, 2010, at 2:44 AM, Jim Silverton wrote: Hello everyone, I have 3 variables Y, X1 and X2. Each variables lies between 0 and 1. I want to do a constrained regression such that a0 and (1-a)0 for the model: Y = a*X1 + (1-a)*X2 It would not accomplish the constraint that a 0 but you could accomplish the other constraint within an lm fit: X3- X1-X2 lm(Y ~ X3 + offset(X2) ) Since beta1 is for the model Y ~ 1 + beta1(X1- X2) + 1*X2) Y ~ intercept + beta1*X1 + (1 -beta1)*X2 ... so beta1 is a. In the case beta 0 then I suppose a would be assigned 0. This
Re: [R] Mean and individual growth curve trajectories
jlwoodard john.woodard at wayne.edu writes: I'm trying to understand how to plot individual growth curve trajectories, with the overall mean trajectory superimposed (preferably in a slightly thicker line, maybe in black) over the individual trajectories. Using the sleepstudy data in lme4, here is the code I have so far: library(lme4) library(lattice) xyplot(Reaction ~ Days, data = sleepstudy, group = Subject, type = 'l') This plot produces the individual growth curves nicely, but I'd like to be able to plot the mean for each day (averaged over subjects) on top of this graph. Is this what you want? xyplot(Reaction ~ Days, data = sleepstudy, group = Subject, type = 'l', panel=function(...){ panel.xyplot(...) panel.average(...,fun=mean,horizontal=FALSE,col='red',lwd=3) } ) and have you considered: xyplot(Reaction ~ Days, data = sleepstudy, group = Subject, type = 'l', panel=function(...){ panel.xyplot(...) panel.loess(...,fun=mean,horizontal=FALSE,col='red',lwd=3) } ) for a smoother curve? Hope it helps, Michael Bibo Queensland Health __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] combining plots (curve + Plot functions)
Hello, What I really want to do is to add a rejection region in the form of a long rectangle to a density plot I have drawn. I am getting 2 plots. How can I add rectangle to first plot? see code below. First section works fine. It just is not quite what I want. # NORMAL DISTRIBUTION PLOT OF RAW DATA WITH UPPER CRITICAL LEVEL - ok xcrit=144.1 # *** single-sample Upper one-tailed hypothesis Test, z statistic *** cord.x - c(xcrit,seq(xcrit,200,0.01),200) cord.y - c(0,dnorm(seq(xcrit,200,0.01),140,15),0) curve(dnorm(x,140,15),xlim=c(80,200),main='Normal PDF',ylab=Probability) polygon(cord.x,cord.y,col='orange') # NORMAL DISTRIBUTION PLOT OF RAW DATA WITH UPPER CRITICAL LEVEL - 2 plots? xcrit=144.1 # *** single-sample Upper one-tailed hypothesis Test, z statistic *** curve(dnorm(x,140,15),xlim=c(80,200),main='Normal PDF',ylab=Probability) plot(c(144.1, 200), c(0, .03), type= n) Thanks. Mary A. Marion __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Mean and individual growth curve trajectories
Thank you so much, Michael. This solution is just what I was looking for. Many thanks! John -- View this message in context: http://r.789695.n4.nabble.com/Mean-and-individual-growth-curve-trajectories-tp3021672p3021746.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.