Re: [R] fill map with gradient: package?
Thanks Paul. I'm still struggling with some beginner's issues on the PostScript import (Windows troubles with installing Ghostscript), but once I've resolved them I'm sure I can use your example code, which looks great to me. Thanks a lot, Thomas

2009/11/4 Paul Murrell p.murr...@auckland.ac.nz:

Hi

Thomas Steiner wrote:

Hi, I'd like to fill an existing svg (or png) map with gradient colors. In detail: the file http://commons.wikimedia.org/wiki/File:Karte_%C3%96sterreich_Bundesl%C3%A4nder.svg should be filled with the population density data from this table: http://de.wikipedia.org/wiki/%C3%96sterreich#Verwaltungsgliederung choosing the right color saturation (or whatever). The final result should be something like this map: http://commons.wikimedia.org/wiki/Image:Bevoelkerungsdichte_-_Oesterreich.svg Is there a package or so for these two tasks (filling and color-density plotting)?

The 'grImport' package can help with getting the SVG into R (see http://www.jstatsoft.org/v30/i04). The first step is to convert the SVG to PostScript (I used Inkscape - you can play around with how the text comes across, but I'm going to ignore that and concentrate on the map regions). Having done that, the following code loads the map into R and draws it ...

library(grImport)
PostScriptTrace("Austria-Map-withFonts.ps", charpath=FALSE)
map <- readPicture("Austria-Map-withFonts.ps.xml")
grid.picture(map)

... (the orientation may be 90 degrees out and you may get some warnings about character encodings; the former is easy to fix [see below] and the latter can just be ignored for now because we are ignoring the text). The next code shows the breakdown of the map into separate paths ...

grid.newpage()
picturePaths(map)

... from which we can see that the regions are the first 10 paths ...

grid.newpage()
grid.picture(map[1:10], use.gc=FALSE)

At this point, you can use grImport to draw the regions with different fill colours, or you can just extract the x,y coordinates of the regions and go it alone.
The following code takes the latter path, setting up 10 different colours and drawing each region using grid.polygon(). The orientation is fixed by pushing a rotated viewport first ...

colours <- hcl(240, 60, seq(30, 80, length=10))
grid.newpage()
pushViewport(viewport(angle=-90), grImport:::pictureVP(map[1:10]))
mapply(function(p, col) {
    grid.polygon(p$x, p$y, default.units="native",
                 gp=gpar(fill=col))
}, regions, colours)

Hope that helps.

Paul

Thanks for your help, Thomas

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
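Paul's example uses an arbitrary sequence of lightness values. To map the actual population densities onto the fills, one sketch (my addition, not from Paul's post; the density values are illustrative placeholders, not the real figures from the Wikipedia table) is:

```r
# Hypothetical densities for the 10 regions, in the same order as the paths
density <- c(157, 97, 100, 20, 62, 60, 89, 268, 700, 4000)
# Map density to lightness: the densest region gets the darkest fill
lum <- 90 - 60 * (density - min(density)) / (max(density) - min(density))
colours <- hcl(h = 240, c = 60, l = lum)
```

The resulting 'colours' vector can then be handed to the mapply() call in place of the fixed seq() palette.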
Re: [R] after PCA, the pc values are so large, wrong?
OK, I understand what you mean; maybe PLS is better for my aim, but I have tried that too, also with bad results. The main question for me is how to select a smaller set of variables from the independents to fit the dependent. A GA may be a good way, but I have not learned it well.

Ben Bolker wrote:

bbslover dluthm at yeah.net writes:

[snip]

the fit result below:

Call:
lm(formula = y ~ x1 + x2 + x3, data = pc)

Residuals:
     Min       1Q   Median       3Q      Max
-1.29638 -0.47622  0.01059  0.49268  1.69335

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  5.613e+00  8.143e-02  68.932  < 2e-16 ***
x1          -3.089e-05  5.150e-06  -5.998 8.58e-08 ***
x2          -4.095e-05  3.448e-05  -1.188    0.239
x3          -8.106e-05  6.412e-05  -1.264    0.210
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.691 on 68 degrees of freedom
Multiple R-squared: 0.3644, Adjusted R-squared: 0.3364
F-statistic: 12.99 on 3 and 68 DF,  p-value: 8.368e-07

x2 and x3 are not significant. In principle, after PCA, the PCs should be significant, but with my data they are not - why?

Why is it necessary that the first few principal components should have significant relationships with some other response variable? The strength, and weakness, of PCA is that it is calculated *without regard* to a response variable, so it does not constitute data snooping ... I may of course have misinterpreted your question, but at a quick look, I don't see anything obviously wrong here.
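The huge PC scores mentioned in the subject line usually come from running PCA on unscaled variables, so one thing worth checking (my suggestion, not from Ben's reply) is whether standardising first brings the scores down to a sensible range before regressing on them:

```r
# Principal component regression on standardised predictors (simulated data)
set.seed(1)
X <- matrix(rnorm(72 * 5), ncol = 5)
X[, 1] <- X[, 1] * 1000            # one variable on a much larger scale
y <- rnorm(72)
pca <- prcomp(X, center = TRUE, scale. = TRUE)  # scale. = TRUE is the key
pc <- data.frame(y = y, pca$x[, 1:3])           # scores now of moderate size
fit <- lm(y ~ PC1 + PC2 + PC3, data = pc)
summary(fit)
```

With scale. = FALSE the first PC essentially reproduces the inflated variable, which is one way to end up with very large PC values.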
[R] plot.window arguments being ignored?
Hi, Why, when I run the script below, is my y-axis range command being ignored? I.e., the y axis graphed consistently fits the line specified in the plot command, but never fits the line I'm trying to add in the subsequent lines command. This is my first R script, so if I'm doing anything else wacky that jumps out, please let me know. I'm an advanced Stata user, but R is new to me. The script is designed to simulate a binary-choice decision field theory example with internally-controlled stop time, as per Neural Networks 19 (2006) 1047-1058. Thanks in advance!! -Dan

#-
# Starting preference state; two choices
preference <- c(0,0)
# Preference threshold; when one of the preference states exceeds this threshold, deliberation stops
theta <- 2
# Contrast matrix
contrast <- array(c(1, -1, -1, 1), dim=c(2,2))
# Value matrix; three possible states of the world
value <- array(c(1, 0, 0, 1, .5, .5), dim=c(2,3))
# Feedback matrix
feedback <- array(c(1, .5, .5, 1), dim=c(2,2))
# Time
t <- array(0, dim=c(1,1))
# Arrays to keep track of history for graphing
time_for_graph <- array(t, dim=c(1,1))
preference_for_graph <- array(preference, dim=c(1,2))
# Moving through time until preference crosses the threshold theta
while (preference[1] < theta && preference[2] < theta) {
  # stochastic weights for this time period
  weight <- rnorm(3, 0, 1)
  # updating preference
  preference <- feedback %*% preference + contrast %*% value %*% weight
  # updating history
  t <- t + 1
  time_for_graph <- rbind(time_for_graph, t)
  preference_for_graph <- rbind(preference_for_graph, array(preference, dim=c(1,2)))
  # updating graph ranges
  ry <- range(preference_for_graph)
  rx <- range(time_for_graph)
  # updating graph
  plot(time_for_graph[,1], preference_for_graph[,1], type="b",
       plot.window(rx, ry), xlab="Time", ylab="Preference")
  lines(time_for_graph[,1], preference_for_graph[,2], type="b", col="red", ylim=ry)
}
#-
Re: [R] lme4 and incomplete block design
On Sunday, 08 November 2009 at 00:05 +0100, Christian Lerch wrote:

Dear list members, I am trying to simulate an incomplete block design in which every participant receives 3 out of 4 possible treatments. The outcome is binary. Assigning a binary outcome to the BIB or PBIB dataset of the SASmixed package gives the appropriate output. With the code below, fixed treatment estimates are not given for each of the 4 possible treatments; instead a kind of summary measure(?) for 'treatment' is given.

block <- rep(1:24, each=3)
treatment <- c(1, 2, 3, 2, 1, 4, 3, 4, 1, 4, 3, 2, 1, 3, 4, 2, 4, 3, 3, 1, 2, 4,
               2, 1, 1, 4, 2, 2, 3, 1, 3, 2, 4, 4, 1, 3, 1, 2, 3, 2, 1, 4, 3, 4,
               1, 4, 3, 2, 1, 3, 4, 2, 4, 3, 3, 1, 2, 4, 2, 1, 1, 4, 2, 2, 3, 1,
               3, 2, 4, 4, 1, 3)
outcome <- c(0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0,
             1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0,
             0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0,
             0, 0, 0, 0, 1, 0)
data <- data.frame(block, treatment, outcome)
lmer(outcome ~ treatment + (1|block), family=binomial, data=data)

Is this a problem with the recovery of interblock estimates?

No...

Are there special rules for how incomplete block designs should look to enable this recovery?

Neither...
Compare:

> library(lme4)
Loading required package: Matrix
Loading required package: lattice
> summary(lmer(outcome ~ treatment + (1|block), family=binomial,
+   data=data.frame(block <- rep(1:24, each=3),
+     treatment <- c(1, 2, 3, 2, 1, 4, 3, 4, 1, 4, 3, 2, 1, 3, 4, 2, 4,
+                    3, 3, 1, 2, 4, 2, 1, 1, 4, 2, 2, 3, 1, 3, 2, 4, 4,
+                    1, 3, 1, 2, 3, 2, 1, 4, 3, 4, 1, 4, 3, 2, 1, 3, 4,
+                    2, 4, 3, 3, 1, 2, 4, 2, 1, 1, 4, 2, 2, 3, 1, 3, 2,
+                    4, 4, 1, 3),
+     outcome <- c(0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0,
+                  0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0,
+                  0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1,
+                  0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0))
+ ))
Generalized linear mixed model fit by the Laplace approximation
Formula: outcome ~ treatment + (1 | block)
   Data: data.frame(block <- rep(1:24, each = 3), treatment <- c(1, 2, 3, 2, 1, 4, 3, 4, 1, 4, 3, 2, 1, 3, 4, 2, 4, 3, 3, 1, 2, 4, 2, 1, 1, 4, 2, 2, 3, 1, 3, 2, 4, 4, 1, 3, 1, 2, 3, 2, 1, 4, 3, 4, 1, 4, 3, 2, 1, 3, 4, 2, 4, 3, 3, 1, 2, 4, 2, 1, 1, 4, 2, 2, 3, 1, 3, 2, 4, 4, 1, 3), outcome <- c(0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0))
   AIC   BIC logLik deviance
 86.28  93.1 -40.14    80.28
Random effects:
 Groups Name        Variance Std.Dev.
 block  (Intercept) 0.60425  0.77734
Number of obs: 72, groups: block, 24

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.27783    0.71767  -1.780    0.075 .
treatment    0.01162    0.25571   0.045    0.964
---
Signif.
codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
          (Intr)
treatment -0.892

with:

> summary(lmer(outcome ~ treatment + (1|block), family=binomial,
+   data=data.frame(block <- rep(1:24, each=3),
+     treatment <- factor(c(1, 2, 3, 2, 1, 4, 3, 4, 1, 4, 3, 2, 1, 3, 4,
+                           2, 4, 3, 3, 1, 2, 4, 2, 1, 1, 4, 2, 2, 3, 1,
+                           3, 2, 4, 4, 1, 3, 1, 2, 3, 2, 1, 4, 3, 4, 1,
+                           4, 3, 2, 1, 3, 4, 2, 4, 3, 3, 1, 2, 4, 2, 1,
+                           1, 4, 2, 2, 3, 1, 3, 2, 4, 4, 1, 3)),
+     outcome <- c(0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0,
+                  0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0,
+                  0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1,
+                  0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0))
+ ))
Generalized linear mixed model fit by the Laplace approximation
Formula: outcome ~ treatment + (1 | block)
   Data: data.frame(block <- rep(1:24, each = 3), treatment <- factor(c(1, 2, 3, 2, 1, 4, 3, 4, 1, 4, 3, 2, 1, 3, 4, 2, 4, 3, 3, 1, 2, 4, 2, 1, 1, 4, 2, 2, 3, 1, 3, 2, 4, 4, 1, 3, 1, 2, 3, 2, 1, 4, 3, 4, 1, 4, 3, 2, 1, 3, 4, 2, 4, 3, 3, 1, 2, 4, 2, 1, 1, 4, 2, 2, 3, 1, 3, 2, 4, 4, 1, 3)), outcome <- c(0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0,
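The difference between the two calls comes down to how 'treatment' enters the model matrix; a minimal illustration (my addition) of why the numeric coding yields a single coefficient while the factor coding yields one per level:

```r
treatment_num <- c(1, 2, 3, 4)
treatment_fac <- factor(treatment_num)
# numeric covariate -> intercept + one slope
ncol(model.matrix(~ treatment_num))  # 2
# factor -> intercept + 3 treatment contrasts
ncol(model.matrix(~ treatment_fac))  # 4
```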
Re: [R] influence.measures(stats): hatvalues(model, ...)
Sigmund Freud wrote:

Hello: I am trying to understand the method 'hatvalues(...)', which returns something similar to the diagonals of the plain vanilla hat matrix [X(X'X)^(-1)X'], but not quite. A Fortran programmer I am not, but tracing through the code it looks like perhaps some sort of correction based on the notion of 'leave-one-out' variance is being applied.

I can't see what the problem is. Using the LifeCycleSavings example from ?influence.measures:

lm.SR <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)
X <- model.matrix(lm.SR)
H <- X %*% solve(t(X) %*% X) %*% t(X)
hats1 <- diag(H)
hats2 <- hatvalues(lm.SR)
all.equal(hats1, hats2)
#[1] TRUE

Whatever the difference, in simulations 'hatvalues' appears to perform much better in the context of identifying outliers using Cook's Distance than the diagonals of the plain vanilla hat matrix. (As in http://en.wikipedia.org/wiki/Cook's_distance). I would prefer to understand a little more when using this method. I have downloaded the freely available references cited in the help and am in the process of digesting them. If someone with knowledge could offer a pointer on the most efficient way to get at why 'hatvalues' does what it does, that would be great.

In a nutshell, hatvalues are a measure of how unusual a point is in predictor space, i.e. to what extent it sticks out in one or more of the X-dimensions.

-Peter Ehlers

Thanks, Jean Yarrington
Independent consultant.
[R] Counting non-empty levels of a factor
Hi everyone, I'm struggling with a little problem for a while, and I'm wondering if anyone could help... I have a dataset (from the retailing industry) that indicates which brands are present in a panel of 500 stores:

store , brand
1 , B1
1 , B2
1 , B3
2 , B1
2 , B3
3 , B2
3 , B3
3 , B4

I would like to know how many brands are present in each store. I tried:

result <- aggregate(MyData$brand, by=list(MyData$store), nlevels)

but I got:

  Group.1 x
1       1 4
2       2 4
3       3 4

which is not exactly the result I expected. I would like to get something like:

  Group.1 x
1       1 3
2       2 2
3       3 3

Looking around, I found I can delete empty levels of a factor using:

problem.factor <- problem.factor[, drop=TRUE]

But this solution isn't handy for me, as I have many stores and would have to make a subset of my data for each store before dropping empty levels. I can't just count the lines for each store (N) either, because the same brand can appear several times in each store (several products for the same brand, and/or several weeks of observation). I used to do this calculation in SAS with:

proc freq data = MyData noprint ;
by store ;
tables brand / out = result ;
run ;

(the cool thing was I got a dataset I could merge with MyData). Any idea for doing that in R? Thanks in advance, Kind Regards, Sylvain Willart, PhD Marketing, IAE Lille, France
Re: [R] influence.measures(stats): hatvalues(model, ...)
Not sure what you mean.

yi <- c(2,3,2,4,3,6)
xi <- c(1,4,3,2,4,5)
res <- lm(yi ~ xi)
hatvalues(res)
X <- cbind(1, xi)
diag( X %*% solve(t(X) %*% X) %*% t(X) )

Same result.

Best,

--
Wolfgang Viechtbauer                      http://www.wvbauer.com/
Department of Methodology and Statistics  Tel: +31 (0)43 388-2277
School for Public Health and Primary Care Office Location:
Maastricht University, P.O. Box 616       Room B2.01 (second floor)
6200 MD Maastricht, The Netherlands       Debyeplein 1 (Randwyck)

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of Sigmund Freud [ss_freud...@yahoo.com]
Sent: Sunday, November 08, 2009 8:14 AM
To: r-help@r-project.org
Subject: [R] influence.measures(stats): hatvalues(model, ...)

Hello: I am trying to understand the method 'hatvalues(...)', which returns something similar to the diagonals of the plain vanilla hat matrix [X(X'X)^(-1)X'], but not quite. A Fortran programmer I am not, but tracing through the code it looks like perhaps some sort of correction based on the notion of 'leave-one-out' variance is being applied. Whatever the difference, in simulations 'hatvalues' appears to perform much better in the context of identifying outliers using Cook's Distance than the diagonals of the plain vanilla hat matrix. (As in http://en.wikipedia.org/wiki/Cook's_distance). I would prefer to understand a little more when using this method. I have downloaded the freely available references cited in the help and am in the process of digesting them. If someone with knowledge could offer a pointer on the most efficient way to get at why 'hatvalues' does what it does, that would be great. Thanks, Jean Yarrington
Independent consultant.
Re: [R] Counting non-empty levels of a factor
On Nov 8, 2009, at 8:38 AM, sylvain willart wrote:

Hi everyone, I'm struggling with a little problem for a while, and I'm wondering if anyone could help... I have a dataset (from the retailing industry) that indicates which brands are present in a panel of 500 stores:

store , brand
1 , B1
1 , B2
1 , B3
2 , B1
2 , B3
3 , B2
3 , B3
3 , B4

I would like to know how many brands are present in each store. I tried:

result <- aggregate(MyData$brand, by=list(MyData$store), nlevels)

but I got:

  Group.1 x
1       1 4
2       2 4
3       3 4

which is not exactly the result I expected. I would like to get something like:

  Group.1 x
1       1 3
2       2 2
3       3 3

Try:

result <- aggregate(MyData$brand, by=list(MyData$store), length)

Quick, easy and generalizes to other situations. The factor levels got carried along identically, but length counts the number of elements in the list returned by tapply.

Looking around, I found I can delete empty levels of a factor using:

problem.factor <- problem.factor[, drop=TRUE]

If you reapply the function factor, you get the same result. So you could have done this:

result <- aggregate(MyData$brand, by=list(MyData$store), function(x) nlevels(factor(x)))
> result
  Group.1 x
1       1 3
2       2 2
3       3 3

But this solution isn't handy for me, as I have many stores and would have to make a subset of my data for each store before dropping empty levels. I can't just count the lines for each store (N) either, because the same brand can appear several times in each store (several products for the same brand, and/or several weeks of observation). I used to do this calculation in SAS with:

proc freq data = MyData noprint ;
by store ;
tables brand / out = result ;
run ;

(the cool thing was I got a dataset I could merge with MyData). Any idea for doing that in R?
Thanks in advance, Kind Regards, Sylvain Willart, PhD Marketing, IAE Lille, France

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
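A table()-based variant (my addition, not from the thread) gives both the per-store brand counts and a merge-able frequency table similar to what SAS's proc freq produced:

```r
# Example data as in the original posting
MyData <- data.frame(store = c(1, 1, 1, 2, 2, 3, 3, 3),
                     brand = c("B1", "B2", "B3", "B1", "B3", "B2", "B3", "B4"))
tab <- table(MyData$store, MyData$brand)
# distinct brands per store = number of non-zero cells in each row
nbrands <- rowSums(tab > 0)
nbrands            # store 1: 3, store 2: 2, store 3: 3
# a store/brand frequency data frame, merge-able like SAS's out=result
result <- as.data.frame(tab, responseName = "Freq")
```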
Re: [R] plot.window arguments being ignored?
Dan, with base graphics, the plot command determines the ylim; lines can only add lines/points to an existing graph, it cannot modify the existing ylim. You have two choices. 1) Before proceeding to the graph, determine which of the two data sets has the greater range and call plot with that range; then the subsequent lines command will fit. 2) Use a different graphics system such as lattice or ggplot2. In these packages, the entire graph is computed in the background, then plotted explicitly, so as you build up the graph by adding things to it, the routines automatically resize as necessary. Both lattice and ggplot2 have books and very nice web sites where you can find an example like yours.

Bryan

Bryan Hanson
Acting Chair
Professor of Chemistry & Biochemistry
DePauw University, Greencastle IN USA

On 11/7/09 6:28 PM, AR_user dweitzenf...@gmail.com wrote:

Hi, Why, when I run the script below, is my y-axis range command being ignored? I.e., the y axis graphed consistently fits the line specified in the plot command, but never fits the line I'm trying to add in the subsequent lines command. This is my first R script, so if I'm doing anything else wacky that jumps out, please let me know. I'm an advanced Stata user, but R is new to me. The script is designed to simulate a binary-choice decision field theory example with internally-controlled stop time, as per Neural Networks 19 (2006) 1047-1058. Thanks in advance!!
-Dan

#-
# Starting preference state; two choices
preference <- c(0,0)
# Preference threshold; when one of the preference states exceeds this threshold, deliberation stops
theta <- 2
# Contrast matrix
contrast <- array(c(1, -1, -1, 1), dim=c(2,2))
# Value matrix; three possible states of the world
value <- array(c(1, 0, 0, 1, .5, .5), dim=c(2,3))
# Feedback matrix
feedback <- array(c(1, .5, .5, 1), dim=c(2,2))
# Time
t <- array(0, dim=c(1,1))
# Arrays to keep track of history for graphing
time_for_graph <- array(t, dim=c(1,1))
preference_for_graph <- array(preference, dim=c(1,2))
# Moving through time until preference crosses the threshold theta
while (preference[1] < theta && preference[2] < theta) {
  # stochastic weights for this time period
  weight <- rnorm(3, 0, 1)
  # updating preference
  preference <- feedback %*% preference + contrast %*% value %*% weight
  # updating history
  t <- t + 1
  time_for_graph <- rbind(time_for_graph, t)
  preference_for_graph <- rbind(preference_for_graph, array(preference, dim=c(1,2)))
  # updating graph ranges
  ry <- range(preference_for_graph)
  rx <- range(time_for_graph)
  # updating graph
  plot(time_for_graph[,1], preference_for_graph[,1], type="b",
       plot.window(rx, ry), xlab="Time", ylab="Preference")
  lines(time_for_graph[,1], preference_for_graph[,2], type="b", col="red", ylim=ry)
}
#-
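Bryan's first choice can be sketched like this (my illustration, not his code), as a replacement for the two plotting calls inside Dan's loop:

```r
# Dummy history matrices standing in for the ones built up in the loop
time_for_graph <- matrix(0:5, ncol = 1)
preference_for_graph <- cbind(c(0, .3, .8, 1.2, 1.7, 2.1),
                              c(0, -.2, -.5, -.4, -.1, .2))
# Pre-compute the combined range of BOTH series, then pass it to plot()
# via ylim= ; a plot.window() call inside plot()'s argument list has no effect.
ry <- range(preference_for_graph)        # covers both columns
rx <- range(time_for_graph)
plot(time_for_graph[, 1], preference_for_graph[, 1], type = "b",
     xlim = rx, ylim = ry, xlab = "Time", ylab = "Preference")
lines(time_for_graph[, 1], preference_for_graph[, 2], type = "b", col = "red")
```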
Re: [R] Counting non-empty levels of a factor
On Nov 8, 2009, at 9:11 AM, David Winsemius wrote:

On Nov 8, 2009, at 8:38 AM, sylvain willart wrote:

Hi everyone, I'm struggling with a little problem for a while, and I'm wondering if anyone could help... I have a dataset (from the retailing industry) that indicates which brands are present in a panel of 500 stores:

store , brand
1 , B1
1 , B2
1 , B3
2 , B1
2 , B3
3 , B2
3 , B3
3 , B4

I would like to know how many brands are present in each store. I tried:

result <- aggregate(MyData$brand, by=list(MyData$store), nlevels)

but I got:

  Group.1 x
1       1 4
2       2 4
3       3 4

which is not exactly the result I expected. I would like to get something like:

  Group.1 x
1       1 3
2       2 2
3       3 3

Try:

result <- aggregate(MyData$brand, by=list(MyData$store), length)

Quick, easy and generalizes to other situations. The factor levels got carried along identically, but length counts the number of elements in the list returned by tapply.

Which may not have been what you asked for, as this demonstrates. You probably want the second solution:

mydata2 <- rbind(MyData, MyData)
result <- aggregate(mydata2$brand, by=list(mydata2$store), length)
> result
  Group.1 x
1       1 6
2       2 4
3       3 6

result <- aggregate(mydata2$brand, by=list(mydata2$store), function(x) nlevels(factor(x)))
> result
  Group.1 x
1       1 3
2       2 2
3       3 3

Looking around, I found I can delete empty levels of a factor using:

problem.factor <- problem.factor[, drop=TRUE]

If you reapply the function factor, you get the same result.
So you could have done this:

result <- aggregate(MyData$brand, by=list(MyData$store), function(x) nlevels(factor(x)))
> result
  Group.1 x
1       1 3
2       2 2
3       3 3

But this solution isn't handy for me, as I have many stores and would have to make a subset of my data for each store before dropping empty levels. I can't just count the lines for each store (N) either, because the same brand can appear several times in each store (several products for the same brand, and/or several weeks of observation). I used to do this calculation in SAS with:

proc freq data = MyData noprint ;
by store ;
tables brand / out = result ;
run ;

(the cool thing was I got a dataset I could merge with MyData). Any idea for doing that in R? Thanks in advance, Kind Regards, Sylvain Willart, PhD Marketing, IAE Lille, France

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
[R] Windows 7 editor - I can't make RWinEdt work
Good morning. I just got a new computer with Windows 7. R works fine, but the editor I am used to using, RWinEdt, does not. I did find one blog post on how to get RWinEdt to work in Windows 7, but I could not get those instructions to work either. Is there a patch for RWinEdt? If not, is there another good R editor that works under Windows 7? I tried RSiteSearch with various combinations of "Windows 7" and "editor" and so on, but found nothing. I also tried googling these terms.

Thanks,
Peter

Peter L. Flom, PhD
Statistical Consultant
Website: www DOT peterflomconsulting DOT com
Writing: http://www.associatedcontent.com/user/582880/peter_flom.html
Twitter: @peterflom
[R] look up and Missing
Hi R users, Assume that I have a data frame 'temp' with several variables (v1, v2, v3, v4, v5, ...):

v1  v2 v3 v4 v5
 1   2  3 36  5
 2 420  2 -9  5
43   6  2  1 34

1. I want to look at the entire rows where v2 equals -9, like 2 -9 5 43. I wrote:

K <- list(if(temp$v2)==-9))

but it gave me FALSE FALSE FALSE FALSE FALSE, which is not correct.

2. I want to assign those values as missing if v2 equals -9 (i.e., I want to exclude them from the analysis). How do I do it in R? Thanks in advance
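A minimal sketch of the usual approach to both questions (my suggestion, not from a listed reply; the example values are illustrative, since the posted table is garbled):

```r
# Illustrative data frame; -9 marks missing values in v2
temp <- data.frame(v1 = c(1, 2, 43), v2 = c(2, -9, 6),
                   v3 = c(3, 5, 2),  v4 = c(36, 43, 1), v5 = c(5, 6, 34))
# 1. the entire rows where v2 equals -9 (logical indexing, not if())
temp[temp$v2 == -9, ]
# 2. recode -9 as NA so it is excluded from analyses
temp$v2[temp$v2 == -9] <- NA
mean(temp$v2, na.rm = TRUE)   # most functions then need na.rm=TRUE
```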
Re: [R] Windows 7 editor - I can't make RWinEdt work
Aha, what is that blog post, and what does not work for you? I haven't got any report so far and do not have Windows 7 easily available yet.

Best,
Uwe Ligges

Peter Flom wrote:

Good morning. I just got a new computer with Windows 7. R works fine, but the editor I am used to using, RWinEdt, does not. I did find one blog post on how to get RWinEdt to work in Windows 7, but I could not get those instructions to work either. Is there a patch for RWinEdt? If not, is there another good R editor that works under Windows 7? I tried RSiteSearch with various combinations of "Windows 7" and "editor" and so on, but found nothing. I also tried googling these terms.

Thanks,
Peter

Peter L. Flom, PhD
Statistical Consultant
Website: www DOT peterflomconsulting DOT com
Writing: http://www.associatedcontent.com/user/582880/peter_flom.html
Twitter: @peterflom
[R] how to remove one of any two indices with a correlation greater than 0.90 in a matrix (or data.frame)
My code below is not right:

rm(list=ls())
# define data.frame
a=c(1,2,3,5,6); b=c(1,2,3,4,7); c=c(1,2,3,4,8)
d=c(1,2,3,5,1); e=c(1,2,3,5,7)
data.f=data.frame(a,b,c,d,e)
# backup data.f
origin.data <- data.f
# get correlation matrix
cor.matrix <- cor(origin.data)
# backup correlation matrix
origin.cor <- cor.matrix
# get dim
dim.cor <- dim(origin.cor)
# perform loop
n <- 0
for (i in 1:(dim.cor[1]-1)) {
  for (j in (i+1):(dim.cor[2])) {
    if (cor.matrix[i,j] >= 0.95) {
      data.f <- data.f[, -(i-n)]
      n <- 1
      break
    }
  }
}
origin.cor
origin.data
data.f
cor(data.f)

How should I write the code to do this? And is there a simpler way?
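One way to get the intended behaviour (my sketch, not a fix of the exact loop above; the caret package's findCorrelation() offers a similar, more refined tool) is to recompute the correlation matrix after every removal and drop one variable of the currently strongest pair until nothing exceeds the cutoff:

```r
# Greedily drop one variable from each pair with |cor| above the cutoff
drop_correlated <- function(df, cutoff = 0.90) {
  repeat {
    cm <- abs(cor(df))
    diag(cm) <- 0                       # ignore self-correlations
    if (max(cm) <= cutoff) break
    worst <- which(cm == max(cm), arr.ind = TRUE)[1, 1]
    df <- df[, -worst, drop = FALSE]    # remove one variable of the pair
  }
  df
}
a <- c(1,2,3,5,6); b <- c(1,2,3,4,7); c <- c(1,2,3,4,8)
d <- c(1,2,3,5,1); e <- c(1,2,3,5,7)
reduced <- drop_correlated(data.frame(a, b, c, d, e), cutoff = 0.90)
names(reduced)   # the surviving, mutually less-correlated variables
```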
Re: [R] save an object by dynamicly created name
Hi Henrik, I am using your saveObject/loadObject to handle over 1000 matrices. It worked beautifully. Because I need to load those matrices often for evaluating a few functions on them, and those matrices do not all fit in memory at once, is there a way to speed up the loading part? I tried saving all the binary files to /dev/shm (the shared memory section in Linux), but the speed of loadObject on /dev/shm remains the same as on disk. Thanks, Hao

-----Original Message-----
From: henrik.bengts...@gmail.com [mailto:henrik.bengts...@gmail.com] On Behalf Of Henrik Bengtsson
Sent: Monday, November 02, 2009 12:34 AM
To: David Winsemius
Cc: r-help@r-project.org; jeffc
Subject: Re: [R] save an object by dynamicly created name

On Sun, Nov 1, 2009 at 9:18 PM, David Winsemius dwinsem...@comcast.net wrote:

On Nov 1, 2009, at 11:28 PM, Henrik Bengtsson wrote:

On Sun, Nov 1, 2009 at 7:48 PM, David Winsemius dwinsem...@comcast.net wrote:

On Nov 1, 2009, at 10:16 PM, Henrik Bengtsson wrote:

path <- "data";
dir.create(path);
for (i in 1:10) {
  m <- i:5;
  filename <- sprintf("m%02d.Rbin", i);
  pathname <- file.path(path, filename);
  save(m, file=pathname);
}

That would result in each of the ten files containing an object with the same name == m. (Also, on my system R data files have type Rdta.) So I thought what was requested might have been a slight mod:

path <- "~/";
dir.create(path);
for (i in 1:10) {
  assign(paste("m", i, sep=""), i:5)
  filename <- sprintf("m%02d.Rdta", i)
  pathname <- file.path(path, filename)
  obj <- get(paste("m", i, sep=""))
  save(obj, file=pathname)
}

Then a more convenient solution is to use saveObject() and loadObject() of R.utils. saveObject() does not save the name of the saved object.

The OP asked for this outcome: I would like to save m as m1, m2, m3 ..., to file /home/data/m1, /home/data/m2, /home/data/m3, ...

If you want to save multiple objects, then wrap them up in a list.

I agree that a list would make sense if it were to be stored in one file, although it was not what was requested.
That comment was not for the OP, but for saveObject()/loadObject() in general.

But wouldn't that require assign()-ing a name before list()-wrapping?

Nope, the whole point of using saveObject()/loadObject() is to save the objects/values without the names that you happen to choose in the current session, and to avoid overwriting existing ones in your next session. My example could also have been:

library(R.utils);
saveObject(list(a=1, b=LETTERS, c=Sys.time()), file="foo.Rbin");
y <- loadObject("foo.Rbin");
z <- loadObject("foo.Rbin");
stopifnot(identical(y, z));

If you really want to attach the elements of the saved list, do:

attachLocally(loadObject("foo.Rbin"));
> str(a)
 num 1
> str(b)
 chr [1:26] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" ...
> str(c)
 POSIXct[1:1], format: "2009-11-01 21:30:41"

I suppose we ought to mention that the use of assign to create a variable is a FAQ ... 7.21?

Yep, I have now referred to it a sufficient number of times to refer to it by number. http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-I-turn-a-string-into-a-variable_003f

My personal take on assign() and get() is that if you find yourself using them (at this level), there is a good chance there exists a better solution that you should use instead. My $.02

/H

--
David

loadObject() does not assign variables, but instead returns them. Example:

library(R.utils);
x <- list(a=1, b=LETTERS, c=Sys.time());
saveObject(x, file="foo.Rbin");
y <- loadObject("foo.Rbin");
stopifnot(identical(x, y));

So, for the original example, I'd recommend:

library(R.utils);
path <- "data";
mkdirs(path);
for (i in 1:10) {
  m <- i:5;
  filename <- sprintf("m%02d.Rbin", i);
  saveObject(m, file=filename, path=path);
}

and loading the objects back as:

for (i in 1:10) {
  filename <- sprintf("m%02d.Rbin", i);
  m <- loadObject(filename, path=path);
  print(m);
}

/Henrik

--
David.

/H

On Sun, Nov 1, 2009 at 6:53 PM, jeffc h...@andrew.cmu.edu wrote:

Hi, I would like to save a few dynamically created objects to disk.
The following is the basic flow of the code segment: for(i in 1:10) { m <- i:5; save(m, file = ...) ## ??? } To distinguish the different objects to be saved, I would like to save m as m1, m2, m3 ..., to file /home/data/m1, /home/data/m2, /home/data/m3, ... I tried a couple of methods for translating between object names and strings (below) but couldn't get it to work. https://stat.ethz.ch/pipermail/r-help/2008-November/178965.html http://tolstoy.newcastle.edu.au/R/help/04/08/2673.html Any suggestions would be appreciated. thanks Hao -- View this message in context: http://old.nabble.com/save-an-object-by-dynamicly-created-name-tp26155437p26155437.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide
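[Editor's note: the following sketch is not part of the thread. It shows the same name-free idea in current base R, using saveRDS()/readRDS(), which, like saveObject()/loadObject(), store a single value without its session name.]

```r
## Save ten objects to numbered files without ever using assign()/get().
## saveRDS() stores one value; readRDS() returns it for you to assign.
path <- file.path(tempdir(), "data")
dir.create(path, showWarnings = FALSE)

for (i in 1:10) {
  m <- i:5
  saveRDS(m, file = file.path(path, sprintf("m%02d.rds", i)))
}

## Load them back into a list instead of assign()-ing m1, m2, ...
ms <- lapply(1:10, function(i)
  readRDS(file.path(path, sprintf("m%02d.rds", i))))
stopifnot(identical(ms[[3]], 3:5))
```

Collecting the results in a list sidesteps assign() and get() entirely, which is the point both repliers make.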
Re: [R] Counting non-empty levels of a factor
Thanks a lot for those solutions, Both are working great, and they do slightly different (but both very interesting) things, Moreover, I learned about the length() function ... one more to add to my personal cheat sheet Kind Regards 2009/11/8 David Winsemius dwinsem...@comcast.net: On Nov 8, 2009, at 9:11 AM, David Winsemius wrote: On Nov 8, 2009, at 8:38 AM, sylvain willart wrote: Hi everyone, I'm struggling with a little problem for a while, and I'm wondering if anyone could help... I have a dataset (from the retailing industry) that indicates which brands are present in a panel of 500 stores, store , brand 1 , B1 1 , B2 1 , B3 2 , B1 2 , B3 3 , B2 3 , B3 3 , B4 I would like to know how many brands are present in each store, I tried: result <- aggregate(MyData$brand, by=list(MyData$store), nlevels) but I got: Group.1 x 1 , 4 2 , 4 3 , 4 which is not exactly the result I expected I would like to get sthg like: Group.1 x 1 , 3 2 , 2 3 , 3 Try: result <- aggregate(MyData$brand, by=list(MyData$store), length) Quick, easy and generalizes to other situations. The factor levels got carried along identically, but length counts the number of elements in the list returned by tapply. Which may not have been what you asked for, as this would demonstrate. You probably want the second solution: mydata2 <- rbind(MyData, MyData) result <- aggregate(mydata2$brand, by=list(mydata2$store), length) result Group.1 x 1 1 6 2 2 4 3 3 6 result <- aggregate(mydata2$brand, by=list(mydata2$store), function(x) nlevels(factor(x))) result Group.1 x 1 1 3 2 2 2 3 3 3 Looking around, I found I can delete empty levels of a factor using: problem.factor <- problem.factor[, drop=TRUE] If you reapply the function factor, you get the same result. 
So you could have done this: result <- aggregate(MyData$brand, by=list(MyData$store), function(x) nlevels(factor(x))) result Group.1 x 1 1 3 2 2 2 3 3 3 But this solution isn't handy for me as I have many stores and would have to make a subset of my data for each store before dropping empty factor levels. I can't count the lines for each store (N) either, because the same brand can appear several times in each store (several products for the same brand, and/or several weeks of observation) I used to do this calculation in SAS with: proc freq data = MyData noprint ; by store ; tables brand / out = result ; run ; (the cool thing was I got a database I can merge with MyData) any idea for doing that in R ? Thanks in advance, Kind Regards, Sylvain Willart, PhD Marketing, IAE Lille, France __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT
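[Editor's note: a third route, not shown in the thread, combines the two suggestions and also reproduces the handy SAS "out =" dataset via merge(). A minimal self-contained sketch, with MyData reconstructed from the example above:]

```r
## Count distinct brands per store without touching factor levels,
## then merge the counts back onto the raw data (like SAS proc freq's out=).
MyData <- data.frame(store = c(1, 1, 1, 2, 2, 3, 3, 3),
                     brand = c("B1", "B2", "B3", "B1", "B3", "B2", "B3", "B4"))

counts <- aggregate(brand ~ store, data = MyData,
                    FUN = function(x) length(unique(x)))
names(counts)[2] <- "n.brands"
counts
##   store n.brands
## 1     1        3
## 2     2        2
## 3     3        3

MyData2 <- merge(MyData, counts, by = "store")  # counts alongside each row
```

length(unique(x)) counts the brands actually present, so duplicate observations of the same brand do not inflate the result.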
[R] reference on contr.helmert and typo on its help page.
I'm wondering which textbook discussed the various contrast matrices mentioned in the help page of 'contr.helmert'. Could somebody let me know? BTW, in R version 2.9.1, there is a typo on the help page of 'contr.helmert' ('cont.helmert' should be 'contr.helmert'). __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Strange Conflicts with Debian Repositories
Dear All, I have just installed a fresh Debian testing (squeeze) on my system (amd64 architecture). I am experiencing some really strange problems in updating my system whenever I have both the R repository and the multimedia repository available. This is my source.list (when I disable the multimedia repository) ~$ cat /etc/apt/sources.list deb http://ftp.ch.debian.org/debian/ testing main contrib non-free deb-src http://ftp.ch.debian.org/debian/ testing main non-free contrib deb http://security.debian.org/ testing/updates main contrib non-free deb-src http://security.debian.org/ testing/updates main contrib non-free #deb http://www.debian-multimedia.org testing main #deb http://favorite-cran-mirror/bin/linux/debian lenny-cran/ #deb-src http://favorite-cran-mirror/bin/linux/debian lenny-cran/ # Add R-repository deb http://cran.ch.r-project.org/bin/linux/debian lenny-cran/ deb-src http://cran.ch.r-project.org/debian/bin/linux/debian lenny-cran/ This way I have no trouble to update/upgrade my system (only, I do not see the sources of the R packages, but this is really a minor issue for now) ~$ sudo apt-get update Get:1 http://security.debian.org testing/updates Release.gpg [835B] Ign http://security.debian.org testing/updates/main Translation-en_US Ign http://security.debian.org testing/updates/contrib Translation-en_US Get:2 http://ftp.ch.debian.org testing Release.gpg [835B] Ign http://ftp.ch.debian.org testing/main Translation-en_US Ign http://ftp.ch.debian.org testing/contrib Translation-en_US Ign http://security.debian.org testing/updates/non-free Translation-en_US Hit http://security.debian.org testing/updates Release Ign http://ftp.ch.debian.org testing/non-free Translation-en_US Hit http://ftp.ch.debian.org testing Release Ign http://security.debian.org testing/updates/main Packages/DiffIndex Hit http://ftp.ch.debian.org testing/main Packages/DiffIndex Ign http://security.debian.org testing/updates/contrib Packages/DiffIndex Ign http://security.debian.org 
testing/updates/non-free Packages/DiffIndex Ign http://security.debian.org testing/updates/main Sources/DiffIndex Ign http://security.debian.org testing/updates/contrib Sources/DiffIndex Ign http://security.debian.org testing/updates/non-free Sources/DiffIndex Get:3 http://cran.ch.r-project.org lenny-cran/ Release.gpg [197B] Ign http://cran.ch.r-project.org lenny-cran/ Translation-en_US Ign http://cran.ch.r-project.org lenny-cran/ Release.gpg Hit http://ftp.ch.debian.org testing/contrib Packages/DiffIndex Ign http://ftp.ch.debian.org testing/non-free Packages/DiffIndex Hit http://ftp.ch.debian.org testing/main Sources/DiffIndex Ign http://ftp.ch.debian.org testing/non-free Sources/DiffIndex Hit http://ftp.ch.debian.org testing/contrib Sources/DiffIndex Hit http://security.debian.org testing/updates/main Packages Hit http://security.debian.org testing/updates/contrib Packages Hit http://security.debian.org testing/updates/non-free Packages Hit http://security.debian.org testing/updates/main Sources Hit http://security.debian.org testing/updates/contrib Sources Get:4 http://cran.ch.r-project.org lenny-cran/ Release [1,288B] Hit http://ftp.ch.debian.org testing/non-free Packages Hit http://ftp.ch.debian.org testing/non-free Sources Hit http://security.debian.org testing/updates/non-free Sources Ign http://cran.ch.r-project.org lenny-cran/ Release Ign http://cran.ch.r-project.org lenny-cran/ Packages Ign http://cran.ch.r-project.org lenny-cran/ Sources Ign http://cran.ch.r-project.org lenny-cran/ Packages Ign http://cran.ch.r-project.org lenny-cran/ Sources Get:5 http://cran.ch.r-project.org lenny-cran/ Packages [11.4kB] Err http://cran.ch.r-project.org lenny-cran/ Sources 404 Not Found Fetched 14.5kB in 3s (3,851B/s) W: Failed to fetch http://cran.ch.r-project.org/debian/bin/linux/debian/lenny-cran/Sources.gz 404 Not Found E: Some index files failed to download, they have been ignored, or old ones used instead. However (and this is really odd...) 
if I uncomment the multimedia repository this is what happens $ sudo apt-get update Err http://www.debian-multimedia.org testing Release.gpg Could not resolve 'www.debian-multimedia.org' Err http://security.debian.org testing/updates Release.gpg Could not resolve 'security.debian.org' Err http://security.debian.org testing/updates/main Translation-en_US Could not resolve 'security.debian.org' Err http://www.debian-multimedia.org testing/main Translation-en_US Could not resolve 'www.debian-multimedia.org' Err http://security.debian.org testing/updates/contrib Translation-en_US Could not resolve 'security.debian.org' Err http://security.debian.org testing/updates/non-free Translation-en_US Could not resolve 'security.debian.org' Err http://cran.ch.r-project.org lenny-cran/ Release.gpg Could not resolve 'cran.ch.r-project.org' Err http://cran.ch.r-project.org lenny-cran/ Translation-en_US Could not resolve 'cran.ch.r-project.org' Err http://cran.ch.r-project.org lenny-cran/
[R] Scheffe test
Dear all, Please help me with the R code which compute SCHEFFE TEST Thanking you in advance Kind regards Mangalani Peter Makananisa Statistical Analyst South African Revenue Service (SARS) Segmentation and Research : Data Modelling Tel: +2712 422 7357 Cell: +2782 456 4669 Fax:+2712 422 6579 Please Note: This email and its contents are subject to our email legal notice which can be viewed at http://www.sars.gov.za/Email_Disclaimer.pdf [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Scheffe test
Dear Prof, Please help me with the R code which compute SCHEFFE TEST Thanking you in advance Kind regards Mangalani Peter Makananisa Statistical Analyst South African Revenue Service (SARS) Segmentation and Research : Data Modelling Tel: +2712 422 7357 Cell: +2782 456 4669 Fax:+2712 422 6579 Please Note: This email and its contents are subject to our email legal notice which can be viewed at http://www.sars.gov.za/Email_Disclaimer.pdf [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Recall: Scheffe test
Mangalani Peter Makananisa would like to recall the message, Scheffe test. Please Note: This email and its contents are subject to our email legal notice which can be viewed at http://www.sars.gov.za/Email_Disclaimer.pdf [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] look up and Missing
Here is how to find out which rows contain -9, and then you can do with it as you please:

x
  v1 v2 v3 v4 v5
1  1  2  3  3  6
2  5  2  4  2  0
3  2 -9  5  4  3
4  6  2  1  3  4
which(apply(x, 1, function(.row) any(.row == -9)))
[1] 3

On Sun, Nov 8, 2009 at 10:23 AM, Ashta sewa...@gmail.com wrote: HI R-Users Assume that I have a data frame 'temp' with several variables (v1,v2,v3,v4,v5.). v1 v2 v3 v4 v5 1 2 3 3 6 5 2 4 2 0 2 -9 5 4 3 6 2 1 3 4 1, I want to look at the entire row values of when v2 =-9 like 2 -9 5 4 3 I wrote K- list(if(temp$v2)==-9)) I wrote the like this but it gave me which is not correct. False false false false false 2. I want assign that values as missing if v2 = -9. (ie., I want exclude from the analysis How do I do it in R? Thanks in advance __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reference on contr.helmert and typo on its help page.
On Nov 8, 2009, at 11:03 AM, Peng Yu wrote: I'm wondering which textbook discussed the various contrast matrices mentioned in the help page of 'contr.helmert'. Could somebody let me know? My version of Modern Applied Statistics in S (aka MASS) deals with it in enough detail for a person with background to understand. At one point I asked Ripley whether later editions of MASS expanded the coverage of that topic but my memory is that he did not reply. Coming from a perspective that emphasized regression, I found it rather terse (2 pages), so there might be one of the more recently published texts that spends more time developing the concepts from basics. Statistical Models in S also covers the topic (5 pages) in an early chapter, again at a level that assumes prior training in ANOVA models. BTW, in R version 2.9.1, there is a typo on the help page of 'contr.helmert' ('cont.helmert' should be 'contr.helmert'). __ -- David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Strange Conflicts with Debian Repositories
On 8 November 2009 at 17:05, Lorenzo Isella wrote: | I am experiencing some really strange problems in updating my system | whenever I have both the R repository and the multimedia repository | available. [...] | deb http://cran.ch.r-project.org/bin/linux/debian lenny-cran/ This URL doesn't exist / doesn't resolve. So don't use it. This one seems to work http://stat.ethz.ch/CRAN/bin/linux/debian/lenny-cran/ so try that instead. Likewise for the sources. Also note the trailing slash. This question, by the way, would be much more appropriate for the r-sig-debian list which is dedicated to Debian / Ubuntu questions. Dirk -- Three out of two people have difficulties with fractions. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
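[Editor's note: for reference, the working sources.list entries would look something like the following. This is an illustrative sketch; the stat.ethz.ch mirror is the one Dirk suggests, and any CRAN mirror carrying the Debian packages should do.]

```
# /etc/apt/sources.list -- example CRAN entries (mirror is illustrative)
deb     http://stat.ethz.ch/CRAN/bin/linux/debian lenny-cran/
deb-src http://stat.ethz.ch/CRAN/bin/linux/debian lenny-cran/
```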
Re: [R] look up and Missing
On Nov 8, 2009, at 10:23 AM, Ashta wrote: HI R-Users Assume that I have a data frame 'temp' with several variables (v1,v2,v3,v4,v5.). v1 v2 v3 v4 v5 1 2 3 3 6 5 2 4 2 0 2 -9 5 4 3 6 2 1 3 4 1, I want to look at the entire row values of when v2 =-9 like 2 -9 5 4 3 I wrote K- list(if(temp$v2)==-9)) if would be the wrong R function to use. It's mostly for program control. And where did the 3 come from? You were working with the column temp$v2. Oh, you wanted a row rather than the column, v2? So how were you going to select that row? Perhaps: K <- temp[temp$v2 == -9, ] K I wrote the like this but it gave me which is not correct. False false false false false I could not get your code to produce this. I got: Error: unexpected '==' in K- list(if(temp$v2)== 2. I want assign that values as missing if v2 = -9. (ie., I want exclude from the analysis How do I do it in R? Your request is not well specified, at least to my reading, because I could not tell whether you wanted the re-assignment to occur in temp (and that was after I came down on the row side of the question of whether you wanted a row or a column). The following assumes you wanted the row in question (created above) modified outside of temp. 
Re: [R] look up and Missing
try this: temp.new <- temp[temp$v2 != -9, ] temp.new I hope it helps. Best, Dimitris Ashta wrote: HI R-Users Assume that I have a data frame 'temp' with several variables (v1,v2,v3,v4,v5.). v1 v2 v3 v4 v5 1 2 3 3 6 5 2 4 2 0 2 -9 5 4 3 6 2 1 3 4 1, I want to look at the entire row values of when v2 =-9 like 2 -9 5 4 3 I wrote K- list(if(temp$v2)==-9)) I wrote the like this but it gave me which is not correct. False false false false false 2. I want assign that values as missing if v2 = -9. (ie., I want exclude from the analysis How do I do it in R? Thanks in advance __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Dimitris Rizopoulos Assistant Professor Department of Biostatistics Erasmus University Medical Center Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands Tel: +31/(0)10/7043478 Fax: +31/(0)10/7043014 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] look up and Missing
On Nov 8, 2009, at 11:08 AM, David Winsemius wrote: On Nov 8, 2009, at 10:23 AM, Ashta wrote: HI R-Users Assume that I have a data frame 'temp' with several variables (v1,v2,v3,v4,v5.). v1 v2 v3 v4 v5 1 2 3 3 6 5 2 4 2 0 2 -9 5 4 3 6 2 1 3 4 1, I want to look at the entire row values of when v2 =-9 like 2 -9 5 4 3 I wrote K- list(if(temp$v2)==-9)) A further thought, that might be more useful if you were intending to supply a portion of a dataframe to an analytical function, would be the subset function: t2 <- subset(temp, v2 != -9) E.g.: lm(v1 ~ v2 + v3, data = subset(temp, v2 != -9)) if would be the wrong R function to use. It's mostly for program control. And where did the 3 come from? You were working with the column temp$v2. Oh, you wanted a row rather than the column, v2? So how were you going to select that row? Perhaps: K <- temp[temp$v2 == -9, ] K I wrote the like this but it gave me which is not correct. False false false false false I could not get your code to produce this. I got: Error: unexpected '==' in K- list(if(temp$v2)== 2. I want assign that values as missing if v2 = -9. (ie., I want exclude from the analysis How do I do it in R? Your request is not well specified, at least to my reading, because I could not tell whether you wanted the re-assignment to occur in temp (and that was after I came down on the row side of the question of whether you wanted a row or a column). The following assumes you wanted the row in question (created above) modified outside of temp. 
is.na(K) <- K == -9 K v1 v2 v3 v4 v5 3 2 NA 5 4 3 If you had used ifelse you would have gotten close, but the data type would have been a list, which may not have been what you expected: K <- ifelse(K == -9, NA, K) K [[1]] [1] 2 [[2]] [1] NA [[3]] [1] 5 [[4]] [1] 4 [[5]] [1] 3 -- David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] look up and Missing
I'm not quite sure I understood the second question, but does this work? subset(temp, v2 == -9) subset(temp, v2 != -9) --- On Sun, 11/8/09, Ashta sewa...@gmail.com wrote: From: Ashta sewa...@gmail.com Subject: [R] look up and Missing To: r-h...@stat.math.ethz.ch Received: Sunday, November 8, 2009, 10:23 AM HI R-Users Assume that I have a data frame 'temp' with several variables (v1,v2,v3,v4,v5.). v1 v2 v3 v4 v5 1 2 3 3 6 5 2 4 2 0 2 -9 5 4 3 6 2 1 3 4 1, I want to look at the entire row values of when v2 =-9 like 2 -9 5 4 3 I wrote K- list(if(temp$v2)==-9)) I wrote the like this but it gave me which is not correct. False false false false false 2. I want assign that values as missing if v2 = -9. (ie., I want exclude from the analysis How do I do it in R? Thanks in advance __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
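[Editor's note: the two requests in this thread (inspect rows carrying the -9 sentinel, then treat -9 as missing) can be combined in a couple of lines. A minimal sketch, with the example data frame rebuilt from the post:]

```r
## Flag the rows containing the -9 sentinel, then recode every -9 to NA.
temp <- data.frame(v1 = c(1, 5, 2, 6), v2 = c(2, 2, -9, 2),
                   v3 = c(3, 4, 5, 1), v4 = c(3, 2, 4, 3),
                   v5 = c(6, 0, 3, 4))

temp[temp$v2 == -9, ]         # the row(s) where v2 is -9
temp[temp == -9] <- NA        # recode across the whole data frame
stopifnot(is.na(temp$v2[3]))  # the sentinel is now a proper NA
```

Once the sentinels are NA, modeling functions handle them via their na.action argument, so no manual subsetting is needed.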
Re: [R] Strange Conflicts with Debian Repositories
On Sun, Nov 08, 2009 at 10:31:12AM -0600, Dirk Eddelbuettel wrote: On 8 November 2009 at 17:05, Lorenzo Isella wrote: | I am experiencing some really strange problems in updating my system | whenever I have both the R repository and the multimedia repository | available. [...] | deb http://cran.ch.r-project.org/bin/linux/debian lenny-cran/ This URLs doesn't exist / doesn't resolve. So don't use it. This one seems to work http://stat.ethz.ch/CRAN/bin/linux/debian/lenny-cran/ Sorry, I meant http://stat.ethz.ch/CRAN/bin/linux/debian/ lenny-cran/ which works for me across the pond as well. Dirk -- Three out of two people have difficulties with fractions. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] rd doc truncated with R 2.10.0
Hi, I am routinely compiling a package and since I have moved to R 2.10.0, it truncates some section text in the doc: With the following section in the rd file: \details{ The function calls gpsbabel via the system. The gpsbabel program must be present and on the user's PATH for the function to work see http://www.gpsbabel.org/. The function has been tested on the following Garmin GPS devices: Etrex Summit, Etrex Vista Cx and GPSmap 60CSx. } ...compiling under R 2.9.2 (rcmd build --binary --auto-zip pgirmess) I get this Details: The function calls gpsbabel via the system. The gpsbabel program must be present and on the user's PATH for the function to work, see http://www.gpsbabel.org/. The function has been tested on the following Garmin GPS devices: Etrex Summit, Etrex Vista Cx and GPSmap 60CSx. and compiling now under R 2.10.0 Details: The function has been tested on the following Garmin GPS devices: Etrex Summit, Etrex Vista Cx and GPSmap 60CSx. The function calls gpsbabel via the system. The gpsbabel program must be presen Has anyone an explanation and a workaround ? Best, Patrick __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Counting non-empty levels of a factor
With xx as your sample data, will this work? See ?addmargins jj <- table(xx) addmargins(jj, 2) # or for both margins addmargins(jj, c(1,2)) or apply(jj, 1, sum) --- On Sun, 11/8/09, sylvain willart sylvain.will...@gmail.com wrote: From: sylvain willart sylvain.will...@gmail.com Subject: [R] Counting non-empty levels of a factor To: r-help@r-project.org Received: Sunday, November 8, 2009, 8:38 AM Hi everyone, I'm struggling with a little problem for a while, and I'm wondering if anyone could help... I have a dataset (from the retailing industry) that indicates which brands are present in a panel of 500 stores, store , brand 1 , B1 1 , B2 1 , B3 2 , B1 2 , B3 3 , B2 3 , B3 3 , B4 I would like to know how many brands are present in each store, I tried: result <- aggregate(MyData$brand, by=list(MyData$store), nlevels) but I got: Group.1 x 1 , 4 2 , 4 3 , 4 which is not exactly the result I expected I would like to get sthg like: Group.1 x 1 , 3 2 , 2 3 , 3 Looking around, I found I can delete empty levels of a factor using: problem.factor <- problem.factor[, drop=TRUE] But this solution isn't handy for me as I have many stores and would have to make a subset of my data for each store before dropping empty factor levels. I can't count the lines for each store (N) either, because the same brand can appear several times in each store (several products for the same brand, and/or several weeks of observation) I used to do this calculation in SAS with: proc freq data = MyData noprint ; by store ; tables brand / out = result ; run ; (the cool thing was I got a database I can merge with MyData) any idea for doing that in R ? Thanks in advance, Kind Regards, Sylvain Willart, PhD Marketing, IAE Lille, France __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
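[Editor's note: to make the addmargins() suggestion concrete, here is a self-contained sketch on data shaped like Sylvain's. Note that addmargins() sums counts, whereas counting non-empty cells per row answers the original question.]

```r
## Store-by-brand contingency table, then two ways to summarize it.
xx <- data.frame(store = c(1, 1, 1, 2, 2, 3, 3, 3),
                 brand = c("B1", "B2", "B3", "B1", "B3", "B2", "B3", "B4"))

jj <- table(xx)          # rows = stores, columns = brands
addmargins(jj, 2)        # appends a Sum column: total observations per store
rowSums(jj > 0)          # distinct brands actually present per store
```

rowSums(jj > 0) counts the non-empty cells in each row, so repeated observations of a brand in the same store are only counted once.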
Re: [R] reference on contr.helmert and typo on its help page.
On Sun, Nov 8, 2009 at 11:59 AM, Peng Yu pengyu...@gmail.com wrote: On Sun, Nov 8, 2009 at 10:11 AM, Duncan Murdoch murd...@stats.uwo.ca wrote: On 08/11/2009 11:03 AM, Peng Yu wrote: I'm wondering which textbook discussed the various contrast matrices mentioned in the help page of 'contr.helmert'. Could somebody let me know? Doesn't the reference on that page discuss them? It does explain what the functions are. But I need a more basic and complete reference. For example, I want to understand what 'Helmert parametrization' (on page 33 of 'Statistical Models in S') is. Just google for: Helmert contrasts __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reference on contr.helmert and typo on its help page.
Gabor Grothendieck wrote: On Sun, Nov 8, 2009 at 11:59 AM, Peng Yu pengyu...@gmail.com wrote: On Sun, Nov 8, 2009 at 10:11 AM, Duncan Murdoch murd...@stats.uwo.ca wrote: On 08/11/2009 11:03 AM, Peng Yu wrote: I'm wondering which textbook discussed the various contrast matrices mentioned in the help page of 'contr.helmert'. Could somebody let me know? Doesn't the reference on that page discuss them? It does explain what the functions are. But I need a more basic and complete reference. For example, I want to understand what 'Helmert parametrization' (on page 33 of 'Statistical Models in S') is. Just google for: Helmert contrasts Or,

contr.helmert(5)
  [,1] [,2] [,3] [,4]
1   -1   -1   -1   -1
2    1   -1   -1   -1
3    0    2   -1   -1
4    0    0    3   -1
5    0    0    0    4
MASS::fractions(MASS::ginv(contr.helmert(5)))
     [,1]  [,2]  [,3]  [,4]  [,5]
[1,] -1/2   1/2     0     0     0
[2,] -1/6  -1/6   1/3     0     0
[3,] -1/12 -1/12 -1/12   1/4    0
[4,] -1/20 -1/20 -1/20 -1/20  1/5

and apply brains. I.e., except for a slightly odd multiplier, the parameters represent the difference between each level and the average of the preceding levels. -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - (p.dalga...@biostat.ku.dk) FAX: (+45) 35327907 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
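[Editor's note: the interpretation above can be checked numerically. In a balanced one-way fit with Helmert contrasts, coefficient k equals the difference between the mean of level k+1 and the average of the preceding level means, divided by k+1 (the "slightly odd multiplier"). A hedged sketch with simulated data:]

```r
## Verify the Helmert interpretation on a 3-group example.
set.seed(1)
g <- gl(3, 10)                              # 3 groups, 10 obs each
y <- rnorm(30) + rep(c(0, 1, 3), each = 10)

fit <- lm(y ~ g, contrasts = list(g = "contr.helmert"))
mns <- tapply(y, g, mean)

b1 <- (mns[2] - mns[1]) / 2                 # level 2 vs level 1, scaled
b2 <- (mns[3] - mean(mns[1:2])) / 3         # level 3 vs mean of 1 and 2, scaled
stopifnot(all.equal(unname(coef(fit)[2:3]), unname(c(b1, b2))))
```

The divisors 2 and 3 match the 1/2 and 1/3 entries visible in the generalized inverse that Peter Dalgaard shows above.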
Re: [R] rd doc truncated with R 2.10.0
On 08/11/2009 12:07 PM, Patrick Giraudoux wrote: Hi, I am routinely compiling a package and since I have moved to R 2.10.0, it truncates some section text in the doc: With the following section in the rd file: \details{ The function calls gpsbabel via the system. The gpsbabel program must be present and on the user's PATH for the function to work see http://www.gpsbabel.org/. The function has been tested on the following Garmin GPS devices: Etrex Summit, Etrex Vista Cx and GPSmap 60CSx. } ...compiling under R 2.9.2 (rcmd build --binary --auto-zip pgirmess) I get this Details: The function calls gpsbabel via the system. The gpsbabel program must be present and on the user's PATH for the function to work, see http://www.gpsbabel.org/. The function has been tested on the following Garmin GPS devices: Etrex Summit, Etrex Vista Cx and GPSmap 60CSx. and compiling now under R 2.10.0 Details: The function has been tested on the following Garmin GPS devices: Etrex Summit, Etrex Vista Cx and GPSmap 60CSx. The function calls gpsbabel via the system. The gpsbabel program must be presen Has anyone an explanation and a workaround ? You will need to make the complete file available to us to diagnose this. Is it in pgirmess 1.4.0? Which topic? Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] linear trend line and a quadratic trend line.
Eric Fail wrote: Dear list users How is it possible to visualise both a linear trend line and a quadratic trend line on a plot of two variables? Here is my almost-working example. data(Duncan) attach(Duncan) plot(prestige ~ income) abline(lm(prestige ~ income), col=2, lwd=2) Now I would like to add yet another trend line, but this time a quadratic one. So I have two trend lines. One linear trend line and a quadratic trend line. A trend line as if I had taken I(income^2) into the equation. I know I can make two models and compare them using anova, but for pedagogical reasons I would like to visualise it. I know the trick from my past as an SPSS user, but I would like to do this in R as well. See it in SPSS http://www.childrens-mercy.org/stats/weblog2006/QuadraticRegression.asp There's no precooked function that I am aware of, but the generic way is like rg <- range(income) N <- 200 x <- seq(rg[1], rg[2], length.out = N) pred <- predict(lm(prestige ~ income + I(income^2)), newdata = data.frame(income = x)) lines(x, pred) as usual, like means that if you can't be bothered with making your example reproducible, I can't be bothered with testing the code! Well, actually, I found the Duncan data in library(car), so I did in fact test... -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - (p.dalga...@biostat.ku.dk) FAX: (+45) 35327907 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
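[Editor's note: the same recipe, written as a self-contained sketch on the built-in 'cars' data so it runs without the 'car' package: one scatterplot, one linear trend line, one quadratic trend curve.]

```r
## Linear and quadratic trend on one plot, using built-in data.
plot(dist ~ speed, data = cars)
abline(lm(dist ~ speed, data = cars), col = 2, lwd = 2)   # linear trend

fit2 <- lm(dist ~ speed + I(speed^2), data = cars)        # quadratic fit
x <- seq(min(cars$speed), max(cars$speed), length.out = 200)
lines(x, predict(fit2, newdata = data.frame(speed = x)), col = 4, lwd = 2)
```

Predicting on a fine grid and drawing with lines() is the generic pattern Peter describes; it works unchanged for any polynomial or smooth fit.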
Re: [R] rd doc truncated with R 2.10.0
Duncan Murdoch wrote: On 08/11/2009 12:07 PM, Patrick Giraudoux wrote: Hi, I am routinely compiling a package and since I have moved to R 2.10.0, it truncates some section texts in the doc. With the following section in the Rd file: \details{ The function calls gpsbabel via the system. The gpsbabel program must be present and on the user's PATH for the function to work, see http://www.gpsbabel.org/. The function has been tested on the following Garmin GPS devices: Etrex Summit, Etrex Vista Cx and GPSmap 60CSx. } ...compiling under R 2.9.2 (rcmd build --binary --auto-zip pgirmess) I get this: Details: The function calls gpsbabel via the system. The gpsbabel program must be present and on the user's PATH for the function to work, see http://www.gpsbabel.org/. The function has been tested on the following Garmin GPS devices: Etrex Summit, Etrex Vista Cx and GPSmap 60CSx. and compiling now under R 2.10.0: Details: The function has been tested on the following Garmin GPS devices: Etrex Summit, Etrex Vista Cx and GPSmap 60CSx. The function calls gpsbabel via the system. The gpsbabel program must be presen Has anyone an explanation and a workaround? You will need to make the complete file available to us to diagnose this. Is it in pgirmess 1.4.0? Which topic? Duncan Murdoch OK. Will send it offlist. Patrick
Re: [R] How to create multiple command windows of R on Windows?
Thanks a lot for your explanation. That actually makes sense. On Sat, Nov 7, 2009 at 3:54 PM, Duncan Murdoch murd...@stats.uwo.ca wrote: On 07/11/2009 6:31 PM, Ivan wrote: Well, the problem is that I want those console to be from the same session (i.e., they share same objects etc.). I do not think multiple instances of R could do it ... No, there's only one console in each instance. That's pretty fundamental: there is one master loop reading, evaluating and printing, and the console shows you what's going on there. You can write your own front end to send things to the console (and somehow decide what to do when it's busy), but don't expect R to ever have more than one console. Duncan Murdoch 2009/11/7 Uwe Ligges lig...@statistik.tu-dortmund.de You can start as many instances of R as you like (except for memory restrictions, perhaps), hence I do not get your point. Uwe Ligges Ivan wrote: Thanks for that. I seem to be able to only get one R console here though. Actually that is my question: how to get a different R console? On Fri, Nov 6, 2009 at 4:10 PM, guohao.hu...@gmail.com wrote: It's easy. You can execute R commands under different ``command consoles'' in Windows. regards Huang, Guo-Hao -- From: Ivan skylark2...@gmail.com Sent: Saturday, November 07, 2009 8:01 AM To: r-help@r-project.org Subject: [R] How to create multiple command windows of R on Windows? Hi there, I am using R on Windows here. And I got a rather stupid question: how can I create another R console? It would be nice to have multiple R consoles so that I can separate different types of commands I use. Thank you, Ivan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
[R] Extracting matched expressions
Hi all, Is there a tool in base R to extract matched expressions from a regular expression? I.e., given the regular expression "(.*?) (.*?) ([ehtr]{5})", is there a way to extract the character vector c("one", "two", "three") from the string "one two three"? Thanks, Hadley -- http://had.co.nz/
[R] Models for Discrete Choice in R
Hi, I would like to fit logit models for ordered data, such as those suggested by Greene (2003), p. 736. Does anyone suggest a package in R for that? By the way, my dependent variable is ordinal and my independent variables are ratio/interval. Thanks, Iuri. Greene, W. H. Econometric Analysis. Upper Saddle River, NJ: Prentice Hall, 2003
[R] MCMC gradually slows down
Hello, I have written a simple Metropolis-Hastings MCMC algorithm for a binomial parameter: MHastings = function(n,p0,d){ theta = c() theta[1] = p0 t = 1 while(t <= n){ phi = log(theta[t]/(1-theta[t])) phisim = phi + rnorm(1,0,d) thetasim = exp(phisim)/(1+exp(phisim)) r = (thetasim)^4*(1-thetasim)^8/(theta[t]^4*(1-theta[t])^8) if(runif(1,0,1) < r){ theta[t+1] = thetasim } else { theta[t+1] = theta[t] } t = t+1 if(t%%1000==0) print(t) # diagnostic } data.frame(theta) } The problem is that it gradually slows down. It is very fast in the beginning, but slows down and gets very slow as you reach about 5 iterations, and I need to do plenty more. I know there are more fancy MCMC routines available, but I am really just interested in getting this to work. Thank you for your help, Jens Malmros
[R] negative log likelihood
I have two related variables, each with 16 points (x and y). I am given the variance and the y-intercept. I know how to create a regression line and find the residuals, but here is my problem. I have to make a loop that uses the seq() function, so that it changes the slope value of the y = mx + B equation, ranging from 0 to 5 in increments of 0.01. The loop also needs to calculate the negative log likelihood at each slope value and determine the lowest one. I know that R can compute the best regression line by using lm(y~x), but I need to see if that value matches the loop's result. -- View this message in context: http://old.nabble.com/negative-log-likelihood-tp26256881p26256881.html Sent from the R help mailing list archive at Nabble.com.
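A minimal sketch of such a loop (with made-up x, y, intercept, and variance; substitute the values you were given):

```r
x <- 1:16                       # hypothetical data
y <- 2.5 * x + 3                # exact line here, so the best slope is known
B <- 3; s2 <- 1                 # given intercept and variance (assumed values)
negll <- function(m) {          # negative log likelihood at slope m
  r <- y - (m * x + B)          # residuals for this candidate slope
  -sum(dnorm(r, mean = 0, sd = sqrt(s2), log = TRUE))
}
slopes <- seq(0, 5, by = 0.01)  # slope grid, 0 to 5 in steps of 0.01
nll <- sapply(slopes, negll)
slopes[which.min(nll)]          # minimising slope; compare coef(lm(y ~ x))[2]
```

With real (noisy) data, the minimising slope should agree with lm(y ~ x) to within the grid resolution of 0.01.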
[R] Turn dates into age
I've got a big column of dates (some fields don't have a date, so they have NA instead) that I have converted into date format as so... dates <- as.character(data[,"date_commissioned"]) # converted dates to characters dates[1:10] [1] "19910101" "19860101" "19910101" "19860101" "19910101" "19910101" "19910101" "19910101" "19910101" "19910101" dateObs <- as.Date(dates, format="%Y%m%d") dateObs[1:10] [1] "1991-01-01" "1986-01-01" "1991-01-01" "1986-01-01" "1991-01-01" "1991-01-01" "1991-01-01" "1991-01-01" "1991-01-01" "1991-01-01" Now I need to turn the dates into years, how do I do it? I'm not worried about fractions of years, whole years would do. -- View this message in context: http://old.nabble.com/Turn-dates-into-age-tp26256656p26256656.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] lme4 and incomplete block design
Many thanks, Bill and Emmanuel! Christian Emmanuel Charpentier wrote: On Sunday 08 November 2009 at 00:05 +0100, Christian Lerch wrote: Dear list members, I try to simulate an incomplete block design in which every participant receives 3 out of 4 possible treatments. The outcome is binary. Assigning a binary outcome to the BIB or PBIB dataset of the package SASmixed gives the appropriate output. With the code below, fixed treatment estimates are not given for each of the 4 possible treatments; instead a kind of summary measure(?) for 'treatment' is given. block <- rep(1:24,each=3) treatment <- c(1, 2, 3, 2, 1, 4, 3, 4, 1, 4, 3, 2, 1, 3, 4, 2, 4, 3, 3, 1, 2, 4, 2, 1, 1, 4, 2, 2, 3, 1, 3, 2, 4, 4, 1, 3, 1, 2, 3, 2, 1, 4, 3, 4, 1, 4, 3, 2, 1, 3, 4, 2, 4, 3, 3, 1, 2, 4, 2, 1, 1, 4, 2, 2, 3, 1, 3, 2, 4, 4, 1, 3) outcome <- c(0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0) data <- data.frame(block,treatment,outcome) lmer(outcome~treatment +(1|block), family=binomial, data=data) Is this a problem with the recovery of interblock estimates? No... Are there special rules for how incomplete block designs should look in order to enable this recovery? Neither...
Compare : library(lme4) Le chargement a nécessité le package : Matrix Le chargement a nécessité le package : lattice summary(lmer(outcome~treatment +(1|block), family=binomial, + data=data.frame(block-rep(1:24,each=3), +treatment-c(1, 2, 3, 2, 1, 4, 3, 4, 1, 4, 3, 2, 1, 3, 4, 2, 4, + 3, 3, 1, 2, 4, 2, 1, 1, 4, 2, 2, 3, 1, 3, 2, 4, 4, + 1, 3, 1, 2, 3, 2, 1, 4, 3, 4, 1, 4, 3, 2, 1, 3, 4, + 2, 4, 3, 3, 1, 2, 4, 2, 1, 1, 4, 2, 2, 3, 1, 3, 2, + 4, 4, 1, 3), +outcome-c(0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, + 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, + 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, + 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0)) + )) Generalized linear mixed model fit by the Laplace approximation Formula: outcome ~ treatment + (1 | block) Data: data.frame(block - rep(1:24, each = 3), treatment - c(1, 2, 3, 2, 1, 4, 3, 4, 1, 4, 3, 2, 1, 3, 4, 2, 4, 3, 3, 1, 2, 4, 2, 1, 1, 4, 2, 2, 3, 1, 3, 2, 4, 4, 1, 3, 1, 2, 3, 2, 1, 4, 3, 4, 1, 4, 3, 2, 1, 3, 4, 2, 4, 3, 3, 1, 2, 4, 2, 1, 1, 4, 2, 2, 3, 1, 3, 2, 4, 4, 1, 3), outcome - c(0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0)) AIC BIC logLik deviance 86.28 93.1 -40.1480.28 Random effects: Groups NameVariance Std.Dev. block (Intercept) 0.60425 0.77734 Number of obs: 72, groups: block, 24 Fixed effects: Estimate Std. Error z value Pr(|z|) (Intercept) -1.277830.71767 -1.7800.075 . treatment0.011620.25571 0.0450.964 --- Signif. 
codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Correlation of Fixed Effects: (Intr) treatment -0.892 with : summary(lmer(outcome~treatment +(1|block), family=binomial, + data=data.frame(block-rep(1:24,each=3), +treatment-factor(c(1, 2, 3, 2, 1, 4, 3, 4, 1, 4, 3, 2, 1, 3, 4, +2, 4, 3, 3, 1, 2, 4, 2, 1, 1, 4, 2, 2, 3, 1, +3, 2, 4, 4, 1, 3, 1, 2, 3, 2, 1, 4, 3, 4, 1, +4, 3, 2, 1, 3, 4, 2, 4, 3, 3, 1, 2, 4, 2, 1, +1, 4, 2, 2, 3, 1, 3, 2, 4, 4, 1, 3)), +outcome-c(0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, + 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, + 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, + 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0)) + )) Generalized linear mixed model fit by the Laplace approximation Formula: outcome ~ treatment + (1 | block) Data: data.frame(block - rep(1:24, each = 3), treatment - factor(c(1, 2, 3, 2, 1, 4, 3, 4, 1, 4, 3, 2, 1, 3, 4, 2, 4, 3, 3, 1, 2, 4, 2, 1, 1, 4, 2, 2, 3, 1, 3, 2, 4, 4, 1, 3, 1, 2, 3, 2, 1, 4, 3, 4, 1, 4, 3, 2, 1, 3, 4, 2, 4, 3, 3, 1, 2, 4, 2, 1, 1, 4, 2, 2, 3, 1, 3, 2, 4, 4, 1, 3)), outcome - c(0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0,
Re: [R] Turn dates into age
What is the frame of reference to determine the age? Check out 'difftime'. On Sun, Nov 8, 2009 at 1:50 PM, frenchcr frenc...@btinternet.com wrote: I've got a big column of dates (some fields don't have a date, so they have NA instead) that I have converted into date format as so... dates <- as.character(data[,"date_commissioned"]) # converted dates to characters dates[1:10] [1] "19910101" "19860101" "19910101" "19860101" "19910101" "19910101" "19910101" "19910101" "19910101" "19910101" dateObs <- as.Date(dates, format="%Y%m%d") dateObs[1:10] [1] "1991-01-01" "1986-01-01" "1991-01-01" "1986-01-01" "1991-01-01" "1991-01-01" "1991-01-01" "1991-01-01" "1991-01-01" "1991-01-01" Now I need to turn the dates into AGE, how do I do it? I'm not worried about fractions of years, whole years would do. -- View this message in context: http://old.nabble.com/Turn-dates-into-age-tp26256656p26256656.html Sent from the R help mailing list archive at Nabble.com. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
Re: [R] correlated binary data and overall probability
To answer my own question: the package mvtBinaryEP was helpful! Christian Lerch wrote: Dear All, I try to simulate correlated binary data for a clinical research project. Unfortunately, I do not come to grips with bindata(). Consider corr <- data.frame(ID=as.factor(rep(c(1:10), each=5)), task=as.factor(rep(c(1:5),10))) [this format might be more appropriate:] corr2 <- data.frame(ID=as.factor(rep(c(1:10),5)), tablet=as.factor(rep(c(1:5),each=10))) Now, I want to add one column 'outcome' for the binary response: * within each 'task', the probability of success (1) is fixed (say rbinom(x,1,0.7)) * within each 'ID', the outcomes are correlated (say 0.8) How can I generate this column 'outcome' with the given properties? Many thanks for hints or even code! Regards, Christian
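For the archives, one way to get such outcomes from the bindata package itself is rmvbin(), which draws correlated binary vectors from given marginal probabilities and a binary correlation matrix (a sketch only; argument names from memory, check ?rmvbin):

```r
library(bindata)                  # CRAN package for correlated binary draws
# One row per ID, one column per task: P(success) = 0.7 everywhere,
# pairwise correlation 0.8 between the 5 outcomes of the same ID.
bc <- matrix(0.8, 5, 5)
diag(bc) <- 1                     # within-ID binary correlation matrix
out <- rmvbin(10, margprob = rep(0.7, 5), bincorr = bc)
corr$outcome <- as.vector(t(out)) # matches the ID/task layout of 'corr' above
```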
Re: [R] Extracting matched expressions
Is this what you want? x <- ' one two three ' y <- sub(".*?([^[:space:]]+)[[:space:]]+([^[:space:]]+)[[:space:]]+([ehrt]{5}).*", + "\\1 \\2 \\3", x, perl=TRUE) unlist(strsplit(y, ' ')) [1] "one" "two" "three" On Sun, Nov 8, 2009 at 1:51 PM, Hadley Wickham had...@rice.edu wrote: Hi all, Is there a tool in base R to extract matched expressions from a regular expression? I.e., given the regular expression "(.*?) (.*?) ([ehtr]{5})", is there a way to extract the character vector c("one", "two", "three") from the string "one two three"? Thanks, Hadley -- http://had.co.nz/ -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
Re: [R] MCMC gradually slows down
First of all, allocate 'theta' to be the final size you need. Every time through your loop you are extending it by one, meaning you are spending a lot of time copying the data each time. Do something like: theta <- numeric(n) and then see how fast it works. On Sun, Nov 8, 2009 at 2:11 PM, Jens Malmros jens.malm...@gmail.com wrote: Hello, I have written a simple Metropolis-Hastings MCMC algorithm for a binomial parameter: MHastings = function(n,p0,d){ theta = c() theta[1] = p0 t = 1 while(t <= n){ phi = log(theta[t]/(1-theta[t])) phisim = phi + rnorm(1,0,d) thetasim = exp(phisim)/(1+exp(phisim)) r = (thetasim)^4*(1-thetasim)^8/(theta[t]^4*(1-theta[t])^8) if(runif(1,0,1) < r){ theta[t+1] = thetasim } else { theta[t+1] = theta[t] } t = t+1 if(t%%1000==0) print(t) # diagnostic } data.frame(theta) } The problem is that it gradually slows down. It is very fast in the beginning, but slows down and gets very slow as you reach about 5 iterations, and I need to do plenty more. I know there are more fancy MCMC routines available, but I am really just interested in getting this to work. Thank you for your help, Jens Malmros -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
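The cost of growing a vector one element at a time is easy to see in isolation; a small comparison (timings are machine-dependent):

```r
grow <- function(n) {      # extends the vector by one each iteration
  x <- c()
  for (i in 1:n) x[i] <- i
  x
}
prealloc <- function(n) {  # allocates the full vector once up front
  x <- numeric(n)
  for (i in 1:n) x[i] <- i
  x
}
system.time(grow(1e5))     # each extension copies the whole vector so far
system.time(prealloc(1e5)) # roughly linear in n
identical(grow(100), prealloc(100))  # same result either way
```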
Re: [R] MCMC gradually slows down
Try this: instead of theta = c() use theta <- rep(NA, 50) or however many iterations you want the algorithm to run. Best, -- Wolfgang Viechtbauer http://www.wvbauer.com/ Department of Methodology and Statistics Tel: +31 (0)43 388-2277 School for Public Health and Primary Care Office Location: Maastricht University, P.O. Box 616 Room B2.01 (second floor) 6200 MD Maastricht, The Netherlands Debyeplein 1 (Randwyck) From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of Jens Malmros [jens.malm...@gmail.com] Sent: Sunday, November 08, 2009 8:11 PM To: r-help@r-project.org Subject: [R] MCMC gradually slows down Hello, I have written a simple Metropolis-Hastings MCMC algorithm for a binomial parameter: MHastings = function(n,p0,d){ theta = c() theta[1] = p0 t = 1 while(t <= n){ phi = log(theta[t]/(1-theta[t])) phisim = phi + rnorm(1,0,d) thetasim = exp(phisim)/(1+exp(phisim)) r = (thetasim)^4*(1-thetasim)^8/(theta[t]^4*(1-theta[t])^8) if(runif(1,0,1) < r){ theta[t+1] = thetasim } else { theta[t+1] = theta[t] } t = t+1 if(t%%1000==0) print(t) # diagnostic } data.frame(theta) } The problem is that it gradually slows down. It is very fast in the beginning, but slows down and gets very slow as you reach about 5 iterations, and I need to do plenty more. I know there are more fancy MCMC routines available, but I am really just interested in getting this to work. Thank you for your help, Jens Malmros
Re: [R] Extracting matched expressions
strapply in the gsubfn package can do that. It applies the indicated function, here just c, to the back references from the pattern match and then simplifies the result using simplify. (If you omit simplify here it would give a one-element list, like strsplit does.) library(gsubfn) pat <- "(.*?) (.*?) ([ehtr]{5})" strapply("one two three", pat, c, simplify = c) See the home page at: http://gsubfn.googlecode.com On Sun, Nov 8, 2009 at 1:51 PM, Hadley Wickham had...@rice.edu wrote: Hi all, Is there a tool in base R to extract matched expressions from a regular expression? I.e., given the regular expression "(.*?) (.*?) ([ehtr]{5})", is there a way to extract the character vector c("one", "two", "three") from the string "one two three"? Thanks, Hadley -- http://had.co.nz/
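(A note for later readers: since R 2.14.0 this can also be done in base R with regexec() and regmatches(), which return the parenthesized sub-matches directly:)

```r
x <- "one two three"
m <- regexec("(.*?) (.*?) ([ehtr]{5})", x)  # match with capture groups
regmatches(x, m)[[1]]       # full match first, then the three groups
regmatches(x, m)[[1]][-1]   # drop the full match: c("one", "two", "three")
```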
Re: [R] Turn dates into age
As Jim has noted, if the dates you have below are an 'end date', you need to define the time0 or start date for each to calculate the intervals. On the other hand, are the dates you have below the start dates, and you need to calculate the time to today? In the latter case, see ?Sys.Date to get the current system date from your computer. Generically speaking you can use the following: (EndDate - StartDate) / 365.25 where EndDate and StartDate are of class Date. This will give you the time interval in years plus any fraction. You can then use round(), which will give you a whole year with typical rounding up or down to the nearest whole integer. You can use floor(), which will give you the nearest whole integer less than the result, or basically a rounded-down result. Keep in mind that the above calculation does not honor a calendar year, but is an approximation. If you want to calculate age in years as we typically think of it using the calendar, you can use the following, where DOB is the Date of Birth (Start Date) and Date2 is the End Date: # Calculate Age in Years # DOB: Class Date # Date2: Class Date Calc_Age <- function(DOB, Date2) { if (length(DOB) != length(Date2)) stop("length(DOB) != length(Date2)") if (!inherits(DOB, "Date") | !inherits(Date2, "Date")) stop("Both DOB and Date2 must be Date class objects") start <- as.POSIXlt(DOB) end <- as.POSIXlt(Date2) Age <- end$year - start$year ifelse((end$mon < start$mon) | ((end$mon == start$mon) & (end$mday < start$mday)), Age - 1, Age) } HTH, Marc Schwartz On Nov 8, 2009, at 1:30 PM, jim holtman wrote: What is the frame of reference to determine the age? Check out 'difftime'. On Sun, Nov 8, 2009 at 1:50 PM, frenchcr frenc...@btinternet.com wrote: I've got a big column of dates (some fields don't have a date, so they have NA instead) that I have converted into date format as so...
dates <- as.character(data[,"date_commissioned"]) # converted dates to characters dates[1:10] [1] "19910101" "19860101" "19910101" "19860101" "19910101" "19910101" "19910101" "19910101" "19910101" "19910101" dateObs <- as.Date(dates, format="%Y%m%d") dateObs[1:10] [1] "1991-01-01" "1986-01-01" "1991-01-01" "1986-01-01" "1991-01-01" "1991-01-01" "1991-01-01" "1991-01-01" "1991-01-01" "1991-01-01" Now I need to turn the dates into AGE, how do I do it? I'm not worried about fractions of years, whole years would do.
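A quick usage sketch for the Calc_Age() function above, with arbitrary dates and a fixed reference date (assuming the function body survived the mail formatting intact):

```r
dob <- as.Date(c("1991-01-01", "1986-12-31"))  # arbitrary birth dates
ref <- rep(as.Date("2009-11-08"), 2)           # a fixed reference date
Calc_Age(dob, ref)  # 18 and 22: the second birthday hasn't occurred yet in 2009
```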
[R] a question in coxph
Hi there, I tried to fit a penalized spline for a continuous risk factor in recurrent events data. I found that I can include both pspline and frailty terms in coxph. So I use code like fit1 <- coxph(Surv(start, end, event) ~ pspline(age, df=0) + male + white + frailty(id), data.all) It yields results for both the age spline and the frailty variance. Next I tried to draw the plot for the age effect. I used the termplot function: termplot(fit1, terms=c(1), se=TRUE) However, it gives me some error message: Error in dimnames(pred) <- list(dimnames(x)[[1]], termname) : length of 'dimnames' [2] not equal to array extent I don't know why. However, if I delete the frailty option and use only fit2 <- coxph(Surv(start, end, event) ~ pspline(age, df=0) + male + white, data.all) I can generate the plot by termplot(fit2, terms=c(1), se=TRUE) So it seems to me that adding the frailty() option makes the termplot function fail to work? Do you know how to draw the plot for my function fit1? Thanks a lot! Lei At 09:53 AM 7/20/2009, you wrote: I would be willing to chair the session. Terry Therneau
Re: [R] Models for Discrete Choice in R
On Sunday 08 November 2009 at 17:07 -0200, Iuri Gavronski wrote: Hi, I would like to fit logit models for ordered data, such as those suggested by Greene (2003), p. 736. Does anyone suggest a package in R for that? Look up the polr function in package MASS (and read the relevant pages in VR4 and some quoted references...) or the slightly more sophisticated (larger range of models) lrm function in F. Harrell's Design (now rms) package (but be aware that Design is a huge beast which carries its own computing universe, based on Harrell's (strong) view of what a regression analysis should be: reading his book is, IMHO, necessary to understand his choices and agree (or disagree) with them). If you have a multilevel model (a.k.a. one random-effect grouping), the repolr package aims at that, but I've been unable to use it recently (numerical exceptions). By the way, my dependent variable is ordinal and my independent variables are ratio/interval. Numeric? Then maybe some recoding/transformation is in order... in which case Design/rms might or might not be useful. HTH, Emmanuel Charpentier
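A minimal polr() sketch along those lines (the data here are invented purely to show the call):

```r
library(MASS)
d <- data.frame(
  y = ordered(c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 2, 3)),  # ordinal response
  x = c(1.1, 0.8, 2.1, 2.0, 2.3, 1.9, 3.4,
        3.5, 3.1, 4.0, 1.0, 2.2)                       # interval predictor
)
fit <- polr(y ~ x, data = d, Hess = TRUE)  # proportional-odds logistic model
summary(fit)                               # slope for x plus the two cutpoints
```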
Re: [R] MCMC gradually slows down
Thanks a lot, this works. jim holtman wrote: First of all, allocate 'theta' to be the final size you need. Every time through your loop you are extending it by one, meaning you are spending a lot of time copying the data each time. Do something like: theta <- numeric(n) and then see how fast it works. On Sun, Nov 8, 2009 at 2:11 PM, Jens Malmros jens.malm...@gmail.com wrote: Hello, I have written a simple Metropolis-Hastings MCMC algorithm for a binomial parameter: MHastings = function(n,p0,d){ theta = c() theta[1] = p0 t = 1 while(t <= n){ phi = log(theta[t]/(1-theta[t])) phisim = phi + rnorm(1,0,d) thetasim = exp(phisim)/(1+exp(phisim)) r = (thetasim)^4*(1-thetasim)^8/(theta[t]^4*(1-theta[t])^8) if(runif(1,0,1) < r){ theta[t+1] = thetasim } else { theta[t+1] = theta[t] } t = t+1 if(t%%1000==0) print(t) # diagnostic } data.frame(theta) } The problem is that it gradually slows down. It is very fast in the beginning, but slows down and gets very slow as you reach about 5 iterations, and I need to do plenty more. I know there are more fancy MCMC routines available, but I am really just interested in getting this to work. Thank you for your help, Jens Malmros
Re: [R] Turn dates into age
It sure does, thank you! Will this work for you? x <- c('19910101', '19950302', '20010502') today <- Sys.Date() x.date <- as.Date(x, format="%Y%m%d") round(as.vector(difftime(today, x.date, units='day') / 365.25)) [1] 19 15 9 On Sun, Nov 8, 2009 at 2:44 PM, frenc...@btinternet.com wrote: Hi Jim, Thanks for the quick reply... not sure what you mean by frame of reference (only been using R for 4 days)... to clarify, I need to turn my dates from 1999-10-01 into 1999, then I subtract 2009 - 1999 to get an age of 10. The column I'm working on has 312,000 rows and some have NA in them, as we have no dates for that item. To recap, the column is just a bunch of dates with some fields empty; I want to change the column from date of commission to age of asset. Cheers, Chris. jholtman wrote: What is the frame of reference to determine the age? Check out 'difftime'. On Sun, Nov 8, 2009 at 1:50 PM, frenchcr frenc...@btinternet.com wrote: I've got a big column of dates (some fields don't have a date, so they have NA instead) that I have converted into date format as so... dates <- as.character(data[,"date_commissioned"]) # converted dates to characters dates[1:10] [1] "19910101" "19860101" "19910101" "19860101" "19910101" "19910101" "19910101" "19910101" "19910101" "19910101" dateObs <- as.Date(dates, format="%Y%m%d") dateObs[1:10] [1] "1991-01-01" "1986-01-01" "1991-01-01" "1986-01-01" "1991-01-01" "1991-01-01" "1991-01-01" "1991-01-01" "1991-01-01" "1991-01-01" Now I need to turn the dates into AGE, how do I do it? I'm not worried about fractions of years, whole years would do. -- View this message in context: http://old.nabble.com/Turn-dates-into-age-tp26256656p26256656.html Sent from the R help mailing list archive at Nabble.com.
-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? -- View this message in context: http://old.nabble.com/Turn-dates-into-age-tp26256656p26257419.html Sent from the R help mailing list archive at Nabble.com.
[R] Obtaining midpoints of class intervals produced by cut and table
Hello list: I am using cut and table to obtain a frequency table from a numeric sample vector. The idea is to calculate the mean and standard deviation on grouped data. However, I can't extract the midpoints of the class intervals, which seem to be strings treated as factors. How do I extract the midpoints? Thanks, jose loreto
Re: [R] Turn dates into age
Why do you use 365.25? dates <- as.character(data[,"date_commissioned"]) # convert dates to characters #dates[1:10] #[1] "19910101" "19860101" "19910101" "19860101" "19910101" "19910101" "19910101" "19910101" "19910101" "19910101" dateObs <- as.Date(dates, format="%Y%m%d") #dateObs[1:10] #[1] "1991-01-01" "1986-01-01" "1991-01-01" "1986-01-01" "1991-01-01" "1991-01-01" "1991-01-01" "1991-01-01" "1991-01-01" "1991-01-01" today <- Sys.Date() x.date <- as.Date(dateObs, format="%Y%m%d") AGE <- round(as.vector(difftime(today, x.date, units='day') / 365.25)) frenchcr wrote: It sure does, thank you! Will this work for you? x <- c('19910101', '19950302', '20010502') today <- Sys.Date() x.date <- as.Date(x, format="%Y%m%d") round(as.vector(difftime(today, x.date, units='day') / 365.25)) [1] 19 15 9 On Sun, Nov 8, 2009 at 2:44 PM, frenc...@btinternet.com wrote: Hi Jim, Thanks for the quick reply... not sure what you mean by frame of reference (only been using R for 4 days)... to clarify, I need to turn my dates from 1999-10-01 into 1999, then I subtract 2009 - 1999 to get an age of 10. The column I'm working on has 312,000 rows and some have NA in them, as we have no dates for that item. To recap, the column is just a bunch of dates with some fields empty; I want to change the column from date of commission to age of asset. Cheers, Chris. jholtman wrote: What is the frame of reference to determine the age? Check out 'difftime'. On Sun, Nov 8, 2009 at 1:50 PM, frenchcr frenc...@btinternet.com wrote: I've got a big column of dates (some fields don't have a date, so they have NA instead) that I have converted into date format as so... dates <- as.character(data[,"date_commissioned"]) # converted dates to characters dates[1:10] [1] "19910101" "19860101" "19910101" "19860101" "19910101" "19910101" "19910101" "19910101" "19910101" "19910101" dateObs <- as.Date(dates, format="%Y%m%d") dateObs[1:10] [1] "1991-01-01" "1986-01-01" "1991-01-01" "1986-01-01" "1991-01-01" "1991-01-01" "1991-01-01" "1991-01-01" "1991-01-01" "1991-01-01" Now I need to turn the dates into AGE, how do I do it?
I'm not worried about fractions of years; whole years would do. -- View this message in context: http://old.nabble.com/Turn-dates-into-age-tp26256656p26256656.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? -- View this message in context: http://old.nabble.com/Turn-dates-into-age-tp26256656p26257435.html Sent from the R help mailing list archive at Nabble.com.
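Since only whole years are wanted here, the division by 365.25 can also be avoided entirely by comparing calendar components. A minimal sketch (the function name and test dates are illustrative; NA dates pass through as NA):

```r
# Whole-year age from a vector of Dates; an illustrative sketch.
ages_in_years <- function(from, to = Sys.Date()) {
  from_lt <- as.POSIXlt(from)
  to_lt   <- as.POSIXlt(to)
  age <- to_lt$year - from_lt$year
  # Subtract one year when this year's anniversary has not yet occurred
  before <- (to_lt$mon < from_lt$mon) |
            (to_lt$mon == from_lt$mon & to_lt$mday < from_lt$mday)
  age - ifelse(before, 1, 0)
}

ages_in_years(as.Date(c("19910101", NA, "19860101"), format = "%Y%m%d"),
              to = as.Date("2009-11-08"))
# [1] 18 NA 23
```

Unlike rounding days/365.25, this never rounds an age up before the anniversary has passed.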
Re: [R] Obtaining midpoints of class intervals produced by cut and table
Hi, Maybe something like this (inspired by ?cut),

cut2num <- function(f) {
  labs <- levels(f)
  d <- data.frame(lower = as.numeric(sub("\\((.+),.*", "\\1", labs)),
                  upper = as.numeric(sub("[^,]*,([^]]*)\\]", "\\1", labs)))
  d$midpoints <- rowMeans(d)
  d
}

a <- c(1, 2, 3, 4, 5, 2, 3, 4, 5, 6, 7)
cut2num(cut(a, 3))

HTH, baptiste 2009/11/8 jose romero jlauren...@yahoo.com: Hello list: I am using cut and table to obtain a frequency table from a numeric sample vector. The idea is to calculate mean and standard deviation on grouped data. However, I can't extract the midpoints of the class intervals, which seem to be strings treated as factors. How do I extract the midpoints? Thanks, jose loreto [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Obtaining midpoints of class intervals produced by cut and table
jose romero wrote: Hello list: I am using cut and table to obtain a frequency table from a numeric sample vector. The idea is to calculate mean and standard deviation on grouped data. However, I can't extract the midpoints of the class intervals, which seem to be strings treated as factors. How do I extract the midpoints? You might consider starting with hist() instead. Otherwise, how about

mid <- brk[-1] - diff(brk)/2

where brk is the breakpoints used in cut(). -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - (p.dalga...@biostat.ku.dk) FAX: (+45) 35327907 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
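Peter's one-liner works when you keep the breakpoints yourself rather than parsing them back out of the factor labels. A small sketch of the grouped mean and standard deviation the original poster asked for (the data vector and the choice of breaks are illustrative):

```r
a   <- c(1, 2, 3, 4, 5, 2, 3, 4, 5, 6, 7)
brk <- pretty(range(a), n = 3)               # breakpoints to pass to cut()
f   <- cut(a, breaks = brk, include.lowest = TRUE)

mid <- brk[-1] - diff(brk) / 2               # interval midpoints
n   <- as.vector(table(f))                   # class frequencies

# Grouped-data estimates: weight each midpoint by its class frequency
grouped.mean <- sum(n * mid) / sum(n)
grouped.sd   <- sqrt(sum(n * (mid - grouped.mean)^2) / (sum(n) - 1))
```

Keeping brk around sidesteps the string-parsing problem entirely, since the labels produced by cut() are only a display of the breaks.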
Re: [R] Obtaining midpoints of class intervals produced by cut and table
Try this:

library(gsubfn)
demo("gsubfn-cut")

and note the strapply call that converts the levels from cut to a matrix. On Sun, Nov 8, 2009 at 4:08 PM, jose romero jlauren...@yahoo.com wrote: Hello list: I am using cut and table to obtain a frequency table from a numeric sample vector. The idea is to calculate mean and standard deviation on grouped data. However, I can't extract the midpoints of the class intervals, which seem to be strings treated as factors. How do I extract the midpoints? Thanks, jose loreto [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reference on contr.helmert and typo on its help page.
On Sun, Nov 8, 2009 at 11:28 AM, Peter Dalgaard p.dalga...@biostat.ku.dk wrote: Gabor Grothendieck wrote: On Sun, Nov 8, 2009 at 11:59 AM, Peng Yu pengyu...@gmail.com wrote: On Sun, Nov 8, 2009 at 10:11 AM, Duncan Murdoch murd...@stats.uwo.ca wrote: On 08/11/2009 11:03 AM, Peng Yu wrote: I'm wondering which textbook discussed the various contrast matrices mentioned in the help page of 'contr.helmert'. Could somebody let me know? Doesn't the reference on that page discuss them? It does explain what the functions are. But I need a more basic and complete reference. For example, I want to understand what the 'Helmert parametrization' (on page 33 of 'Statistical Models in S') is. Just google for: Helmert contrasts Or,

> contr.helmert(5)
  [,1] [,2] [,3] [,4]
1   -1   -1   -1   -1
2    1   -1   -1   -1
3    0    2   -1   -1
4    0    0    3   -1
5    0    0    0    4
> MASS::fractions(MASS::ginv(contr.helmert(5)))
      [,1]  [,2]  [,3]  [,4] [,5]
[1,]  -1/2   1/2     0     0    0
[2,]  -1/6  -1/6   1/3     0    0
[3,] -1/12 -1/12 -1/12   1/4    0
[4,] -1/20 -1/20 -1/20 -1/20  1/5

and apply brains. I.e., except for a slightly odd multiplier, the parameters represent the difference between each level and the average of the preceding levels. I realized that my questions are what a contrast matrix is and how it is related to hypothesis testing. For a given hypothesis, how do I get the corresponding contrast matrix in a systematic way? There are some online materials, but they are all diffuse. I have also read the book Applied Linear Regression Models, which doesn't give a complete description of all the aspects of contrasts and contrast matrices. But I would want a textbook that gives a complete description, so that I don't have to look around for other materials. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
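Peter's interpretation (each Helmert parameter compares a level with the mean of the preceding levels, divided by the level's index) can be checked numerically on a balanced design. A small sketch with an illustrative three-level factor:

```r
g <- factor(rep(c("a", "b", "c"), each = 4))
y <- c(1, 2, 3, 4,  5, 6, 7, 8,  9, 10, 11, 12)

fit   <- lm(y ~ C(g, contr.helmert))
means <- tapply(y, g, mean)              # group means: 2.5, 6.5, 10.5

# First Helmert coefficient: (mean(b) - mean(a)) / 2
(means["b"] - means["a"]) / 2                      # 2
# Second: (mean(c) - mean of a and b) / 3
(means["c"] - mean(means[c("a", "b")])) / 3        # 2

coef(fit)   # intercept = grand mean (6.5), then the two Helmert parameters
```

With a balanced design the fitted Helmert coefficients match these hand computations exactly, which is exactly the "slightly odd multiplier" Peter describes.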
Re: [R] Turn dates into age
On Nov 8, 2009, at 3:11 PM, frenchcr wrote: why do you use 365.25? As opposed to what? -- David

David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Turn dates into age
To clarify: let's turn a date into an age. Given 05/29/1971 in mm/dd/yyyy format, what is the year difference between then and today? This would be the age requested, counting from 05/29/1971. Thanks, Jim

David Winsemius wrote: On Nov 8, 2009, at 3:11 PM, frenchcr wrote: why do you use 365.25? As opposed to what? -- David

[[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] ordered factor and unordered factor
I don't understand under what circumstances an ordered factor rather than an unordered factor should be used. Could somebody give me some examples? What are the implications of ordered vs. unordered factors? Could somebody recommend a textbook to me? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
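The practical differences can be seen in a short experiment: comparison operators are only defined for ordered factors, and model fitting uses treatment contrasts for unordered factors but polynomial contrasts for ordered ones. A minimal sketch (the factor levels are illustrative):

```r
sizes <- c("small", "large", "medium", "small")

f1 <- factor(sizes, levels = c("small", "medium", "large"))                  # unordered
f2 <- factor(sizes, levels = c("small", "medium", "large"), ordered = TRUE)  # ordered

f2[1] < f2[2]     # TRUE: '<' respects the level ordering for ordered factors
# f1[1] < f1[2]   # would return NA with a warning: '<' not meaningful for factors

contrasts(f1)     # treatment contrasts (dummy coding) by default
contrasts(f2)     # orthogonal polynomial contrasts (.L, .Q) by default
```

So an ordered factor is appropriate when the levels have a natural ordering (e.g. small < medium < large) that you want comparisons and model contrasts to respect.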
Re: [R] Simple 2-Way Anova issue in R
If I've understood correctly, you have cell sizes of 1. This is not enough: ANOVA compares within-group variance to between-group variance, and your within-group variances are zero. You need more data, or to collapse some cells. Jeremy 2009/11/8 znd zackdaughe...@mail.com: Hello, I'm new to R and have been following many guides including the two-way anova (http://www.personality-project.org/r/r.anova.html). Using that walkthrough, including the supplied data, I do str(data.ex2) and receive the appropriate types of data as follows:

str(data.ex2)
'data.frame': 16 obs. of 4 variables:
 $ Observation: int 1 2 3 4 5 6 7 8 9 10 ...
 $ Gender     : Factor w/ 2 levels "f","m": 2 2 2 2 2 2 2 2 1 1 ...
 $ Dosage     : Factor w/ 2 levels "a","b": 1 1 1 1 2 2 2 2 1 1 ...
 $ Alertness  : int 8 12 13 12 6 7 23 14 15 12 ...

aov.ex2 = aov(Alertness ~ Gender * Dosage, data = data.ex2)
summary(aov.ex2)

Outputs:

              Df  Sum Sq Mean Sq F value Pr(>F)
Gender         1  76.562  76.562  2.9518 0.1115
Dosage         1   5.062   5.062  0.1952 0.6665
Gender:Dosage  1   0.063   0.063  0.0024 0.9617
Residuals     12 311.250  25.938

However, when I go to use my data that I made in csv format, I have to tell R to interpret my factors, which are year and depth, as factors...

datafilename = "C:/Rclass/hmwk1pt2.csv"
data.ex2 = read.csv(datafilename, header = TRUE)
data.ex2$Year  <- as.factor(data.ex2$Year)
data.ex2$Depth <- as.factor(data.ex2$Depth)
data.ex2
str(data.ex2)

This outputs what I would expect:

str(data.ex2)
'data.frame': 12 obs. of 4 variables:
 $ Year      : Factor w/ 3 levels "1999","2000",..: 1 1 1 1 2 2 2 2 3 3 ...
 $ Depth     : Factor w/ 4 levels "10","15","20",..: 1 2 3 4 1 2 3 4 1 2 ...
 $ Replicate1: num 14.3 15.1 16.7 17.3 16.3 17.4 18.6 20.9 22.9 23.9 ...
 $ Replicate2: num 14.7 15.6 16.9 17.9 16.4 17.2 19.6 21.3 22.7 23.3 ...

But something is causing my anova not to carry through... this is what I have.
ANOVA = aov(Replicate1 ~ Year * Depth, data = data.ex2)
summary(ANOVA)

which outputs:

summary(ANOVA)
            Df  Sum Sq Mean Sq
Year         2 143.607  71.803
Depth        3  17.323   5.774
Year:Depth   6   2.587   0.431

There are no F value or Pr(>F) columns. I also can't boxplot this correctly; again following the example at that website above, they have:

boxplot(Alertness ~ Dosage * Gender, data = data.ex2)

which outputs: http://old.nabble.com/file/p26258684/87o3uicpf6dt4kkdyvfv.jpeg My code is:

boxplot(Replicate1 ~ Year * Depth, data = data.ex2)

which outputs: http://old.nabble.com/file/p26258684/gik02vyhvvbmcvw3ia2h.jpeg This is incorrect; it's crossing my factors, but I thought that when I did str() on my data it recognized Year and Depth as factors, not numbers or integers. My csv file is: http://old.nabble.com/file/p26258684/hmwk1pt2.csv hmwk1pt2.csv Any help on what is going on would be greatly appreciated, because I need to perform one-way, two-way, nested, and factorial anovas, but I first need to solve this problem before I can continue. -- View this message in context: http://old.nabble.com/Simple-2-Way-Anova-issue-in-R-tp26258684p26258684.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jeremy Miles Psychology Research Methods Wiki: www.researchmethodsinpsychology.com
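Jeremy's diagnosis (one observation per Year x Depth cell) can be reproduced directly: with a single replicate the full interaction model uses up every degree of freedom, so aov() has no residual error term and cannot print F or p values. Dropping the interaction, or adding replicates, restores them. A small sketch with made-up data:

```r
# 3 years x 4 depths, one observation per cell (12 rows), as in the post
d <- expand.grid(Year  = factor(c(1999, 2000, 2001)),
                 Depth = factor(c(10, 15, 20, 25)))
set.seed(1)
d$Biomass <- rnorm(nrow(d), mean = 15)

# Full interaction model: 3*4 = 12 parameters for 12 observations,
# so residual df = 0 and no F tests are possible
df.residual(aov(Biomass ~ Year * Depth, data = d))   # 0

# Additive model leaves 6 residual df, so summary() shows F and Pr(>F)
summary(aov(Biomass ~ Year + Depth, data = d))
```

The same logic explains why the walkthrough's 16-observation example (cell size 4) does print F values while the 12-cell, single-replicate data do not.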
Re: [R] Turn dates into age
> Sys.Date()
[1] "2009-11-08"
> as.Date("05/29/1971", format = "%m/%d/%Y")
[1] "1971-05-29"
> as.numeric((Sys.Date() - as.Date("05/29/1971", format = "%m/%d/%Y")) / 365.25)
[1] 38.44764

or perhaps more clearly:

EndDate   <- Sys.Date()
StartDate <- as.Date("05/29/1971", format = "%m/%d/%Y")
as.numeric((EndDate - StartDate) / 365.25)
[1] 38.44764

We coerce to numeric here to return a standard numeric value, rather than the result being of class "difftime" with a units attribute of 'days':

> str((Sys.Date() - as.Date("05/29/1971", format = "%m/%d/%Y")) / 365.25)
Class 'difftime'  atomic [1:1] 38.4
  ..- attr(*, "units")= chr "days"

HTH, Marc Schwartz

On Nov 8, 2009, at 5:22 PM, Jim Burke wrote: To clarify: let's turn a date into an age. Given 05/29/1971 in mm/dd/yyyy format, what is the year difference between then and today? This would be the age requested. Thanks, Jim

[[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Normal distribution
A normality check within R can be done with the functions available in the nortest package, which provides several normality tests. To install it, type install.packages('nortest'). Afterwards, you should consider running ks.test() only if the mu and sigma parameters are known (these stand for the population mean and standard deviation) - and that's only applicable if your data are gathered from the whole population. Therefore I recommend the lillie.test() function, which is the Lilliefors modification of the Kolmogorov-Smirnov statistic. It's applicable both to data gathered from a sample and to population data. You can run ks.test(x, "pnorm"), but don't worry if you get several ties - these occur due to rounding of values, or if your data come from a discrete probability distribution... You can also try the shapiro.test() function if your sample has fewer than 50 responses (the Shapiro-Wilk test for small samples), or ad.test() for the Anderson-Darling normality test. You should revise these statistical procedures in the official literature, but there's also a lot of info on Wikipedia about the stated statistical techniques. If the absolute value of skewness is larger than 1.96 * the standard error of skewness, your distribution differs significantly from normal. The same holds for kurtosis. The value 1.96 implies a p-value lower than .05, and 2.58 lower than .01. Skewness is computed with the skew() function, and kurtosis with the kurtosi() function (both in the psych package). The standard error of skewness is calculated as se.sk <- sqrt(6/length(x)) and the standard error of kurtosis as se.ku <- sqrt(24/length(x)). If the mean is not located in the middle of the range, this can also indicate a violation of normality. I strongly recommend reading the official help for the nortest package, and consulting the official statistical literature! P.S.
shapiro.test() is located in the stats package, but running it on a large sample (N > 50) is not quite applicable; hence use lillie.test() for those purposes, or ks.test(x, "pnorm") - where the x argument is the variable whose normality you're about to check, and "pnorm" stands for the normal distribution function. Noela Sánchez-2 wrote: Hi, I am dealing with how to check in R if some data that I have belong to a normal distribution or not. I am not interested in obtaining the theoretical frequencies. I am only interested in determining (by means of a test such as Kolmogorov's, or whatever) whether my data are normal or not. But I have tried with ks.test() and I have not got it. -- Noela Grupo de Recursos Marinos y Pesquerías Universidad de A Coruña [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://old.nabble.com/Normal-distribution-tp25702570p26259120.html Sent from the R help mailing list archive at Nabble.com.
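The advice above, as a runnable sketch (nortest must be installed; the sample x and its parameters are purely illustrative):

```r
# install.packages("nortest")   # once
library(nortest)

set.seed(42)
x <- rnorm(200, mean = 10, sd = 2)   # illustrative sample

shapiro.test(x)   # Shapiro-Wilk (stats package); aimed at smaller samples
lillie.test(x)    # Lilliefors-corrected Kolmogorov-Smirnov (nortest)
ad.test(x)        # Anderson-Darling (nortest)

# Plain KS test only when mu and sigma are known a priori,
# not estimated from the same data:
ks.test(x, "pnorm", mean = 10, sd = 2)
```

Note that passing sample-estimated mean and sd to ks.test() invalidates its p-value; that is precisely the problem the Lilliefors correction addresses.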
[R] Express e in R?
How do you express "e", the base of the natural log, in R? -- View this message in context: http://old.nabble.com/Express-%22e%22-in-R--tp26258503p26258503.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Express e in R?
On 08/11/2009 5:03 PM, Crab wrote: How do you express "e", the base of the natural log, in R?

e <- exp(1)

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Express e in R?
exp(1) #[1] 2.718282 On Nov 8, 2009, at 5:03 PM, Crab wrote: How do you express e the base of the natural log in R. -- David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reference on contr.helmert and typo on its help page.
Dear Peng Yu, Perhaps you're referring to my text, Applied Linear Regression Analysis and Generalized Linear Models, since I seem to recall that you sent me a number of questions about it. See Section 9.1.2 on linear contrasts for the answer to your question. I hope this helps, John

John Fox Senator William McMaster Professor of Social Statistics Department of Sociology McMaster University Hamilton, Ontario, Canada web: socserv.mcmaster.ca/jfox

-----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Peng Yu Sent: November-08-09 4:52 PM To: r-h...@stat.math.ethz.ch Subject: Re: [R] reference on contr.helmert and typo on its help page.

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Outputing multilple subsets
Hi R users, I hope to divide the original dataset into several subsets and output these multiple datasets. But errors appeared in my loops. See example.

##
a <- c(1:10)
b <- c(rep(1, 3), rep(2, 3), rep(3, 4))
c <- data.frame(a, b)   # c is the example data
num <- c(unique(b))

# I hope to get the subsets c_1.csv, c_2.csv and c_3.csv
# Errors
for (i in num) {
  c_num <- c[c$b == num, ]
  write.csv(c_num, file = "c:/c_num.csv")
}

Warning messages:
1: In c$b == num : longer object length is not a multiple of shorter object length
2: In c$b == num : longer object length is not a multiple of shorter object length
3: In c$b == num : longer object length is not a multiple of shorter object length

I think the problem should be file = "c:/c_num.csv"; has anybody ever met this problem? Thanks very much. - Jane Chang Queen's [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
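The warnings come from comparing c$b against the whole vector num instead of the loop variable i, and the fixed string "c:/c_num.csv" makes every pass overwrite the same file. A corrected sketch (the output directory is illustrative):

```r
a   <- 1:10
b   <- c(rep(1, 3), rep(2, 3), rep(3, 4))
dat <- data.frame(a, b)   # avoid 'c' as a name: it masks base::c

for (i in unique(dat$b)) {
  subset_i <- dat[dat$b == i, ]   # compare against the loop variable, not the whole vector
  # paste0() builds a distinct file name per subset: c_1.csv, c_2.csv, c_3.csv
  write.csv(subset_i, file = paste0("c:/c_", i, ".csv"), row.names = FALSE)
}
```

split(dat, dat$b) would give the same subsets as a list in one step, which can then be written out with a short loop or lapply().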
Re: [R] Simple 2-Way Anova issue in R
Ah, I believe I constructed my *.csv wrong, in that I only had 1 observation within groups whereas I needed at least 2. Originally I had:

Year Depth Biomass1 Biomass2
1999    10     14.3     14.7
1999    15     14.7     15.6
etc.

but I switched this to:

Year Depth Biomass
1999    10    14.3
1999    10    14.7
1999    15    15.1
1999    15    15.6

I believe this may be the appropriate way to collapse my biomass data in order to perform a 2-way anova. If not, feel free to comment, and thanks for the help. -- View this message in context: http://old.nabble.com/Simple-2-Way-Anova-issue-in-R-tp26258684p26259649.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
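The switch from two replicate columns to one stacked Biomass column can also be done in R instead of editing the csv by hand; a small sketch using the wide layout described above (the column names are assumed from the post):

```r
# Two rows of the wide layout, for illustration
wide <- data.frame(Year     = factor(c(1999, 1999)),
                   Depth    = factor(c(10, 15)),
                   Biomass1 = c(14.3, 15.1),
                   Biomass2 = c(14.7, 15.6))

# Stack Biomass1/Biomass2 into one Biomass column, one row per replicate
long <- reshape(wide, direction = "long",
                varying = c("Biomass1", "Biomass2"),
                v.names = "Biomass", timevar = "Replicate")

long[order(long$Year, long$Depth), c("Year", "Depth", "Replicate", "Biomass")]
```

The resulting long format gives two observations per Year x Depth cell, which is what aov(Biomass ~ Year * Depth, data = long) needs to estimate a residual error term.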
Re: [R] reference on contr.helmert and typo on its help page.
Dear John, I did read Section 9.1.2 and various other textbooks before posting my questions, but each reference uses slightly different notation and terminology. I get confused and would like a description that summarizes everything, so that I don't have to refer to many different resources. May I ask a few questions on the section in your textbook? Which variable in Section 9.1.2 is the matrix of contrasts mentioned in the help page of 'contr.helmert'? Which matrix of contrasts in R corresponds to dummy regression? With different R formulas, e.g. y ~ x vs. y ~ x - 1, $X_F$ (mentioned on page 189) is different and hence $\beta_F$ (mentioned in eq. 9.3) will be different. So my understanding is that the matrix of contrasts should depend on the formula, but according to the help page of contr.helmert it does not. Regards, Peng On Sun, Nov 8, 2009 at 6:17 PM, John Fox j...@mcmaster.ca wrote: Dear Peng Yu, Perhaps you're referring to my text, Applied Linear Regression Analysis and Generalized Linear Models, since I seem to recall that you sent me a number of questions about it. See Section 9.1.2 on linear contrasts for the answer to your question. I hope this helps, John John Fox Senator William McMaster Professor of Social Statistics Department of Sociology McMaster University Hamilton, Ontario, Canada web: socserv.mcmaster.ca/jfox -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Peng Yu Sent: November-08-09 4:52 PM To: r-h...@stat.math.ethz.ch Subject: Re: [R] reference on contr.helmert and typo on its help page. On Sun, Nov 8, 2009 at 11:28 AM, Peter Dalgaard p.dalga...@biostat.ku.dk wrote: Gabor Grothendieck wrote: On Sun, Nov 8, 2009 at 11:59 AM, Peng Yu pengyu...@gmail.com wrote: On Sun, Nov 8, 2009 at 10:11 AM, Duncan Murdoch murd...@stats.uwo.ca wrote: On 08/11/2009 11:03 AM, Peng Yu wrote: I'm wondering which textbook discusses the various contrast matrices mentioned in the help page of 'contr.helmert'.
Could somebody let me know? Doesn't the reference on that page discuss them? It does explain what the functions are, but I need a more basic and complete reference. For example, I want to understand what the 'Helmert parametrization' (on page 33 of 'Statistical Models in S') is. Just google for: Helmert contrasts. Or,

> contr.helmert(5)
  [,1] [,2] [,3] [,4]
1   -1   -1   -1   -1
2    1   -1   -1   -1
3    0    2   -1   -1
4    0    0    3   -1
5    0    0    0    4
> MASS::fractions(MASS::ginv(contr.helmert(5)))
     [,1]  [,2]  [,3]  [,4]  [,5]
[1,] -1/2   1/2     0     0     0
[2,] -1/6  -1/6   1/3     0     0
[3,] -1/12 -1/12 -1/12   1/4    0
[4,] -1/20 -1/20 -1/20 -1/20  1/5

and apply brains. I.e., except for a slightly odd multiplier, the parameters represent the difference between each level and the average of the preceding levels. I realized that my questions are what a contrast matrix is and how it is related to hypothesis testing. For a given hypothesis, how does one obtain the corresponding contrast matrix in a systematic way? There are some online materials, but they are scattered. I have also read the book Applied Linear Regression Models, which doesn't give a complete description of all the aspects of contrasts and contrast matrices. I would like a textbook that gives a complete description, so that I don't have to look around for other materials.
Re: [R] Simple 2-Way Anova issue in R
> x
  Year Depth Biomass1 Biomass2
1 1999    10     14.3     14.7
2 1999    15     14.7     15.6
> require(reshape)
> melt(x, id=c('Year','Depth'))
  Year Depth variable value
1 1999    10 Biomass1  14.3
2 1999    15 Biomass1  14.7
3 1999    10 Biomass2  14.7
4 1999    15 Biomass2  15.6

On Sun, Nov 8, 2009 at 7:38 PM, znd zackdaughe...@mail.com wrote: Ah, I believe I constructed my *.csv wrong in that I only had 1 observation within groups, whereas I needed at least 2. Originally I had: Year Depth Biomass1 Biomass2 / 1999 10 14.3 14.7 / 1999 15 14.7 15.6 etc. but I switched this to: Year Depth Biomass / 1999 10 14.3 / 1999 10 14.7 / 1999 15 15.1 / 1999 15 15.6. I believe this may be the appropriate way to collapse my biomass data in order to perform a 2-way ANOVA. If not, feel free to comment, and thanks for the help. -- View this message in context: http://old.nabble.com/Simple-2-Way-Anova-issue-in-R-tp26258684p26259649.html Sent from the R help mailing list archive at Nabble.com. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
Re: [R] Outputting multiple subsets
Have you considered using split? -Ista On Sun, Nov 8, 2009 at 7:23 PM, rusers.sh rusers...@gmail.com wrote: Hi R users, I hope to divide the original dataset into several subsets and output these multiple datasets, but errors appeared in my loop. See example:

a <- c(1:10)
b <- c(rep(1, 3), rep(2, 3), rep(3, 4))
c <- data.frame(a, b)   # c is the example data
num <- c(unique(b))
# I hope to get the subsets c_1.csv, c_2.csv and c_3.csv
# Errors
for (i in num) {
  c_num <- c[c$b == num, ]
  write.csv(c_num, file = "c:/c_num.csv")
}

Warning messages: 1: In c$b == num : longer object length is not a multiple of shorter object length 2: In c$b == num : longer object length is not a multiple of shorter object length 3: In c$b == num : longer object length is not a multiple of shorter object length. I think the problem may be in file = "c:/c_num.csv"; has anybody ever met this problem? Thanks very much. - Jane Chang Queen's -- Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org
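To flesh out the split() suggestion, here is a minimal sketch against the poster's example data (the output file paths and the row.names choice are illustrative, not part of the original thread):

```r
# Rebuild the poster's example data
d <- data.frame(a = 1:10, b = c(rep(1, 3), rep(2, 3), rep(3, 4)))

# split() returns a list with one data frame per level of b,
# named "1", "2", "3"
pieces <- split(d, d$b)

# Write each piece to its own csv file
for (nm in names(pieces)) {
  write.csv(pieces[[nm]],
            file = paste("c_", nm, ".csv", sep = ""),
            row.names = FALSE)
}
```

This avoids the comparison bug entirely, because split() does the grouping in one step.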
Re: [R] Models for Discrete Choice in R
Emmanuel Charpentier wrote: On Sunday, 8 November 2009 at 17:07 -0200, Iuri Gavronski wrote: Hi, I would like to fit logit models for ordered data, such as those suggested by Greene (2003), p. 736. Does anyone suggest any package in R for that? Look up the polr function in package MASS (and read the relevant pages in VR4 and some quoted references...), or the slightly more sophisticated (larger range of models) lrm function in F. Harrell's Design (now rms) package (but be aware that Design is a huge beast which carries its own computing universe, based on Harrell's (strong) view of what a regression analysis should be: reading his book is, IMHO, necessary to understand his choices and agree (or disagree) with them). If you have a multilevel model (a.k.a. one random-effect grouping), the repolr package aims at that, but I've been unable to use it recently (numerical exceptions). By the way, my dependent variable is ordinal and my independent variables are ratio/interval. Numeric? Then maybe some recoding/transformation is in order ... in which case Design/rms might or might not be useful. I'm not clear on what recoding or transformation is needed for an ordinal dependent variable and ratio/interval independent variables, nor why rms/Design would not be useful. Frank HTH, Emmanuel Charpentier -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University
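For readers wanting a concrete starting point with the MASS::polr suggestion, here is a minimal sketch of a proportional-odds (ordered logit) fit; the data are simulated and the variable names are made up for illustration:

```r
library(MASS)

# Simulated data: an ordinal response with 3 levels driven by
# two numeric (interval-scale) predictors
set.seed(1)
d <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
latent <- d$x1 + 0.5 * d$x2 + rnorm(200)
d$y <- cut(latent, breaks = 3, ordered_result = TRUE)

# Proportional-odds logistic regression; the response must be an
# ordered factor
fit <- polr(y ~ x1 + x2, data = d, method = "logistic")
summary(fit)
```

Swapping method = "probit" gives the ordered probit model also discussed in Greene.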
Re: [R] reference on contr.helmert and typo on its help page.
Dear Peng, I'm tempted to try to get an entry in the fortunes package but will instead try to answer your questions directly: -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Peng Yu Sent: November-08-09 7:41 PM To: r-h...@stat.math.ethz.ch Subject: Re: [R] reference on contr.helmert and typo on its help page. Dear John, I did read Section 9.1.2 and various other textbooks before posting my questions, but each reference uses slightly different notation and terminology. I get confused and would like a description that summarizes everything, so that I don't have to refer to many different resources. May I ask a few questions on the section in your textbook? Which variable in Section 9.1.2 is the matrix of contrasts mentioned in the help page of 'contr.helmert'? Which matrix of contrasts in R corresponds to dummy regression? With different R formulas, e.g. y ~ x vs. y ~ x - 1, $X_F$ (mentioned on page 189) is different and hence $\beta_F$ (mentioned in eq. 9.3) will be different. So my understanding is that the matrix of contrasts should depend on the formula, but according to the help page of contr.helmert it does not. If the model is simply y ~ A, for the factor A, then cbind(1, contrasts(A)) is what I call X_B, the row-basis of the model matrix. As I explain in the section that you read, the level means are mu = X_B beta, and thus beta = X_B^-1 mu; the hypotheses tested by the contrasts set these parameters to 0. Moreover, if, as in Helmert contrasts, the columns of X_B are orthogonal, then so are the rows of X_B^-1, and the latter are simply rescalings of the former. That allows one conveniently to code the hypotheses directly in X_B; all this is also explained in that section of my book, and is essentially what Peter D. told you. In R, contr.treatment and contr.SAS provide dummy-variable (0/1) coding of regressors, differing only in the selection of the reference level.
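To make the X_B construction above concrete, here is a small sketch following the quoted notation (the choice of a 4-level factor is arbitrary):

```r
library(MASS)

# Row basis of the model matrix for a 4-level factor with Helmert
# contrasts: an intercept column plus the contrast columns
X_B <- cbind(1, contr.helmert(4))

# Rows of the inverse (after the first, which recovers the grand mean)
# are rescalings of the Helmert columns: each one contrasts a level
# with the average of the preceding levels
fractions(solve(X_B))
```

Because the Helmert columns are mutually orthogonal, the rows of solve(X_B) line up with the columns of X_B up to a scale factor, which is exactly the convenience described above.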
John

Regards, Peng On Sun, Nov 8, 2009 at 6:17 PM, John Fox j...@mcmaster.ca wrote: Dear Peng Yu, Perhaps you're referring to my text, Applied Linear Regression Analysis and Generalized Linear Models, since I seem to recall that you sent me a number of questions about it. See Section 9.1.2 on linear contrasts for the answer to your question. I hope this helps, John John Fox Senator William McMaster Professor of Social Statistics Department of Sociology McMaster University Hamilton, Ontario, Canada web: socserv.mcmaster.ca/jfox -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Peng Yu Sent: November-08-09 4:52 PM To: r-h...@stat.math.ethz.ch Subject: Re: [R] reference on contr.helmert and typo on its help page. On Sun, Nov 8, 2009 at 11:28 AM, Peter Dalgaard p.dalga...@biostat.ku.dk wrote: Gabor Grothendieck wrote: On Sun, Nov 8, 2009 at 11:59 AM, Peng Yu pengyu...@gmail.com wrote: On Sun, Nov 8, 2009 at 10:11 AM, Duncan Murdoch murd...@stats.uwo.ca wrote: On 08/11/2009 11:03 AM, Peng Yu wrote: I'm wondering which textbook discusses the various contrast matrices mentioned in the help page of 'contr.helmert'. Could somebody let me know? Doesn't the reference on that page discuss them? It does explain what the functions are, but I need a more basic and complete reference. For example, I want to understand what the 'Helmert parametrization' (on page 33 of 'Statistical Models in S') is. Just google for: Helmert contrasts. Or,

> contr.helmert(5)
  [,1] [,2] [,3] [,4]
1   -1   -1   -1   -1
2    1   -1   -1   -1
3    0    2   -1   -1
4    0    0    3   -1
5    0    0    0    4
> MASS::fractions(MASS::ginv(contr.helmert(5)))
     [,1]  [,2]  [,3]  [,4]  [,5]
[1,] -1/2   1/2     0     0     0
[2,] -1/6  -1/6   1/3     0     0
[3,] -1/12 -1/12 -1/12   1/4    0
[4,] -1/20 -1/20 -1/20 -1/20  1/5

and apply brains. I.e., except for a slightly odd multiplier, the parameters represent the difference between each level and the average of the preceding levels.
I realized that my questions are what a contrast matrix is and how it is related to hypothesis testing. For a given hypothesis, how does one obtain the corresponding contrast matrix in a systematic way? There are some online materials, but they are scattered. I have also read the book Applied Linear Regression Models, which doesn't give a complete description of all the aspects of contrasts and contrast matrices. I would like a textbook that gives a complete description, so that I don't have to look around for other materials.
[R] Bug in gplots::heatmap.2 symbreaks arg default (?)
Hi all, I think the code to calculate the default value of the symbreaks argument in the gplots::heatmap.2 function is wrong. The documentation says symbreaks defaults to TRUE if any negative values are detected in the data passed in. The relevant code in the parameter list of this function definition (gplots 2.7.3) is this:

symbreaks = min(x < 0, na.rm = TRUE) || scale != "none"

when I'm pretty sure it should be:

symbreaks = min(x, na.rm = TRUE) < 0 || scale != "none"

Likewise for the symkey parameter. Right? Thanks, -steve -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact
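A quick console check illustrates the reported difference (the vector here is made up; it just needs one negative value among positives):

```r
x <- c(-1, 2, 3)

# As reported: min() of a logical vector is 1 (TRUE) only when ALL
# comparisons are TRUE, so this is FALSE despite the negative value
min(x < 0, na.rm = TRUE)        # 0, i.e. FALSE

# The proposed fix: TRUE as soon as any value is negative
min(x, na.rm = TRUE) < 0        # TRUE

# Arguably the clearest spelling of "any negative values":
any(x < 0, na.rm = TRUE)        # TRUE
```

So the reported expression matches the documented behavior only when every value in x is negative, which supports the bug report.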
[R] Forward selection peculiarity
Hi all, I'm using the mle.stepwise function in the wle package to run forward selection. My statistical background tells me that the first variable selected should be the one with the highest correlation with the response; however, that's not the case. The two highest correlations with the response are similar (0.86 and 0.89), yet the forward selection selects the variable with 0.86 correlation first and then in the next step chooses the variable with correlation 0.89. Any thoughts on this? Is this some type of bug in the function or package? Thanks -- View this message in context: http://old.nabble.com/Forward-selection-peculiarity-tp26260670p26260670.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Outputting multiple subsets
On Nov 8, 2009, at 7:23 PM, rusers.sh wrote:

for (i in num) {
  c_num <- c[c$b == num, ]
  write.csv(c_num, file = "c:/c_num.csv")
}

Warning messages: 1: In c$b == num : longer object length is not a multiple of shorter object length

This is because you're comparing column b to the entire vector of numbers (num), not the current number in the iteration (i). The first line of the loop should be c_num <- c[c$b == i, ]. From a style point of view, I'd use n as my variable, since i is too commonly used as an integer index. Also, you will be overwriting the same file, called c_num.csv, on each iteration. You should try something more like:

for (n in num) {
  c.n <- c[c$b == n, ]
  write.csv(c.n, file = paste("c:/c_", n, ".csv", sep = ""))
}

I hope that helps. Cheers, Johann Hibschman
[R] How to change the default colors in levelplot()?
Dear R communities, may I seek your advice on how to change the default colors in levelplot(), e.g. from the default pink and light blue to, e.g., red and green? The levelplot function has one argument being panel (which is actually panel.levelplot), but I am not sure where to put the commands that alter the color. For example, I type p1 <- levelplot(my.mat, colorkey = FALSE); how could I include col.regions of panel.levelplot? And is it right to use col.regions? I am using R 2.8.1 in Ubuntu. Many thanks and have a good day!
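In case it helps, here is a minimal sketch: col.regions can be passed directly to levelplot() itself, without touching the panel function. The matrix and the palette endpoints below are just an example:

```r
library(lattice)

set.seed(1)
my.mat <- matrix(rnorm(100), nrow = 10)

# A 100-colour ramp running green -> white -> red
pal <- colorRampPalette(c("green", "white", "red"))(100)

# col.regions supplies the colours used for the levels
p1 <- levelplot(my.mat, colorkey = FALSE, col.regions = pal)
print(p1)
```

levelplot() interpolates the supplied colours across the range of the data, so a longer palette vector gives a smoother gradient.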
[R] Hand-crafting an .RData file
Hello, I frequently have to export a large quantity of data from some source (for example, a database, or a hand-written perl script) and then read it into R. This occasionally takes a lot of time; I'm usually using read.table(filename, comment.char = "", quote = "") to read the data once it is written to disk. However, I *know* that the program that generates the data is more or less just calling printf in a for loop to create the csv or tab-delimited file, writing, then having R parse it, which is pretty inefficient. Instead, I am interested in figuring out how to write the data in .RData format so that I can load() it instead of read.table() it. Trolling the internet, however, has not turned up anything about the specification for an .RData file. Could somebody link me to a specification or some information that would instruct me on how to construct a .RData file (either compressed or uncompressed)? Also, I am open to other suggestions of how to get load()-like efficiency in some other way. Many thanks, Adam D. I. Kramer
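For reference, if the generating program can be run from R (or the import only has to be slow once), a minimal sketch of R's native formats looks like this; the file names are illustrative. save()/load() restore objects under their original names, while saveRDS()/readRDS() serialize a single object and let the caller choose the name on read. The on-disk serialization format itself is described in the "R Internals" manual:

```r
# Some example data standing in for the exported table
x <- data.frame(id = 1:3, value = c(1.5, 2.5, 3.5))

# Workspace-style .RData file: load() restores the object as 'x'
save(x, file = "mydata.RData")
rm(x)
load("mydata.RData")   # 'x' exists again

# Single-object serialization: the caller binds the result to any name
saveRDS(x, "x.rds")
y <- readRDS("x.rds")
identical(x, y)        # TRUE
```

So one pragmatic workaround is to pay the read.table() cost once and re-save in native format for all subsequent sessions.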